Data, the lifeblood of the digital age, is information collected and stored for analysis. Its importance lies in its ability to inform decision-making, drive innovation, and uncover valuable insights across industries. Data professionals, skilled in data collection, analysis, and interpretation, are pivotal in harnessing it. They use specialized skills and tools to extract meaningful patterns, trends, and predictions from vast datasets, which supports informed business strategies, improves operational efficiency, and fuels organizational growth. In essence, data professionals are the architects of data-driven success, bridging the gap between raw information and actionable intelligence in today's data-centric world.
Data professionals play a vital role in the field, whatever job title they hold. Their responsibilities include managing data (ensuring it is collected, stored, and maintained securely), analyzing it to derive insights that help organizations track their business and improve performance, supporting data-driven decisions, and driving innovation so that organizations can improve efficiency, enhance customer experiences, increase competitiveness, and improve bottom-line results.
To learn about patterns in the working lives of data professionals, a survey was conducted that asked a number of simple, basic questions. All of the questions were option-based.
The dataset was obtained from a GitHub source as raw data in a CSV file. It consists of a single table with 631 rows and 28 columns.
On inspection, the raw data required considerable cleaning, reshaping, and reordering. The table has the following fields:
Unique ID
Date Taken (America/ New_York)
Time Taken (America/ New_York)
Browser
OS
City
Country
Referrer
Time Spent
Q1 - Which Title Best Fits your Current Role?
Q2 - Did you switch careers into Data?
Q3 - Current Yearly Salary (in USD)
Q4 - What Industry do you work in?
Q5 - Favorite Programming Language
Q6 - How Happy are you in your Current Position with the following? (Salary)
Q6 - How Happy are you in your Current Position with the following? (Work/Life Balance)
Q6 - How Happy are you in your Current Position with the following? (Coworkers)
Q6 - How Happy are you in your Current Position with the following? (Management)
Q6 - How Happy are you in your Current Position with the following? (Upward Mobility)
Q6 - How Happy are you in your Current Position with the following? (Learning New Things)
Q7 - How difficult was it for you to break into Data?
Q8 - If you were to look for a new job today, what would be the most important thing to you?
Q9 - Male/Female?
Q10 - Current Age
Q11 - Which Country do you live in?
Q12 - Highest Level of Education
Q13 - Ethnicity
I cleaned the data in several steps: removing columns, splitting columns, replacing values and, most importantly, adding custom columns where needed. Looking through the data, I found that some columns held nothing but blank values, apparently because no respondent answered them, so they were only taking up space and it was better to exclude them from the table. As the image shows, the five columns from Browser to Referrer have no values and contribute nothing to the analysis, so they were removed.
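The original step was done in the Power Query editor. As a rough equivalent, the same idea can be sketched in plain Python: a column is dropped only when every row leaves it blank. The function name `drop_blank_columns` and the sample rows are hypothetical and are not taken from the survey file.

```python
def drop_blank_columns(rows):
    """Return the rows without any column that is blank in every row."""
    if not rows:
        return []
    # Keep a column if at least one row has a non-blank value for it.
    keep = [
        col for col in rows[0]
        if any((row.get(col) or "").strip() for row in rows)
    ]
    return [{col: row.get(col) for col in keep} for row in rows]


# Hypothetical sample rows, for illustration only.
sample = [
    {"Unique ID": "a1", "Browser": "", "OS": "", "Q9 - Male/Female?": "Male"},
    {"Unique ID": "b2", "Browser": "", "OS": "", "Q9 - Male/Female?": "Female"},
]
cleaned = drop_blank_columns(sample)
print(list(cleaned[0]))  # ['Unique ID', 'Q9 - Male/Female?']
```

Checking every row, rather than only the first few, avoids discarding a column that is merely sparse.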
Among the questionnaire columns, Q1, Q4, Q5, and Q11 contained inconsistent values. In "Q5 - Favorite Programming Language", for example, many respondents chose "Other" and then typed in a language that was not among the given options. To remove this clutter, I used Split Column by delimiter, with the custom delimiter "(" applied at the left-most occurrence. This produced one clean column containing only the listed options. The second column, which held nulls and the free-text "Other" entries, was removed to avoid distraction. The same process was applied to "Q1 - Which Title Best Fits your Current Role?", "Q4 - What Industry do you work in?", and "Q11 - Which Country do you live in?".
In "Q3 - Current Yearly Salary (in USD)", the values were text ranges such as "0-40k" rather than numbers, so they could not be used directly in calculations. To resolve this, I decided to take the average of the lower and upper bounds of each range. I first duplicated the column, then split it, replaced values, and removed the extra column containing text.
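To illustrate the split at the left-most "(" described for Q1, Q4, Q5, and Q11, here is a minimal Python sketch. The helper name `split_at_leftmost` and the sample answer strings are assumptions made for illustration. The actual work was done with Power Query's Split Column feature, which does not trim whitespace on its own.

```python
def split_at_leftmost(value, delimiter="("):
    """Split value at the first occurrence of delimiter.

    Returns (head, tail). tail is None when the delimiter is absent,
    which mirrors the null that appears in the second column.
    """
    head, sep, tail = value.partition(delimiter)
    return head.strip(), (tail.strip() if sep else None)


# Hypothetical answers, for illustration only.
print(split_at_leftmost("Other (Please Specify):SQL"))  # ('Other', 'Please Specify):SQL')
print(split_at_leftmost("Python"))                      # ('Python', None)
```

Keeping only the first element of each pair gives the tidy option column, and the second element corresponds to the column that was removed.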
The columns after splitting and duplicating:
At this point the split columns were still unclear and held inconsistent values. To combine them into a single column showing the average salary, I replaced "k" and "-" with nothing, which removed those characters, and removed the right-most column, "Q3. Current yearly salary (in USD).copy 3", from the table.
After removing the column and replacing the values, the remaining columns contained only positive numbers, so their average could be calculated easily.
To calculate the average, a custom column was created using DAX.
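The salary steps above (split the range, strip "k" and "-", average the two numbers) can be sketched in Python as follows. This is only an illustration of the arithmetic, not the DAX column used in the project. The function name `salary_midpoint` is hypothetical, the result is expressed in thousands of USD, and labels that are not a low-high range are rejected because the text does not say how they were handled.

```python
def salary_midpoint(label):
    """Return the midpoint, in thousands of USD, of a range such as '0-40k'."""
    low_text, sep, high_text = label.partition("-")
    if not sep:
        # Labels without a "-" are not covered by this sketch.
        raise ValueError(f"not a low-high range: {label!r}")
    # Remove the "k" suffix so both bounds parse as numbers.
    low = float(low_text.replace("k", "").strip())
    high = float(high_text.replace("k", "").strip())
    return (low + high) / 2


print(salary_midpoint("0-40k"))  # 20.0
```

For the "0-40k" range mentioned above, the bounds are 0 and 40, so the value stored for that respondent would be 20 (thousand USD).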
The metrics above convey a good deal of information, as their titles indicate.
Data professionals ensure that data is collected, stored, and maintained securely. By presenting data and data-driven insights, they make it easier to inform strategic decisions.