Data Professionals Survey Breakdown

About this project

Data, the lifeblood of the digital age, is information collected and stored for analysis. Its importance lies in its ability to inform decision-making, drive innovation, and uncover valuable insights across industries. Data professionals, skilled in data collection, analysis, and interpretation, are pivotal in harnessing that power. They use specialized skills and tools to extract meaningful patterns, trends, and predictions from vast datasets, which supports informed business strategies, improves operational efficiency, and fuels organizational growth. In essence, data professionals are the architects of data-driven success, bridging the gap between raw information and actionable intelligence in today's data-centric world.

Data professionals play a vital role in the field of data, whatever job title they hold. Their responsibilities include managing data (ensuring it is collected, stored, and maintained securely), analyzing data to derive insights that help organizations track their business and improve performance, supporting decisions with data, and driving innovation so that organizations can improve efficiency, enhance customer experiences, increase competitiveness, and improve bottom-line results.

To understand patterns in the working lives of data professionals, a survey was conducted using simple, basic questions. All of the questions were multiple choice.

The dataset was obtained from a GitHub source as raw data in a CSV file. On inspection, it needed substantial cleaning. It consists of a single table with 631 rows and 28 columns.

Objectives:

  • Understanding the demographic distribution of data professionals across different countries.
  • Identifying the most common job titles and skills within the data profession.
  • Analyzing the preferences for programming languages among data professionals.
  • Assessing the level of difficulty in entering the data profession.
  • Evaluating the satisfaction levels regarding work-life balance and salary among data professionals.

ETL (Extract, Transform and Load)

The dataset was raw and required some cleaning, reshaping and reordering. The table has the following fields:

Unique ID

Email

Date Taken (America/New_York)

Time Taken (America/New_York)

Browser

OS

City

Country

Referrer

Time Spent

Q1 - Which Title Best Fits your Current Role?

Q2 - Did you switch careers into Data?

Q3 - Current Yearly Salary (in USD)

Q4 - What Industry do you work in?

Q5 - Favorite Programming Language

Q6 - How Happy are you in your Current Position with the following? (Salary)

Q6 - How Happy are you in your Current Position with the following? (Work/Life Balance)

Q6 - How Happy are you in your Current Position with the following? (Coworkers)

Q6 - How Happy are you in your Current Position with the following? (Management)

Q6 - How Happy are you in your Current Position with the following? (Upward Mobility)

Q6 - How Happy are you in your Current Position with the following? (Learning New Things)

Q7 - How difficult was it for you to break into Data?

Q8 - If you were to look for a new job today, what would be the most important thing to you?

Q9 - Male/Female?

Q10 - Current Age

Q11 - Which Country do you live in?

Q12 - Highest Level of Education

Q13 - Ethnicity

Data Cleaning

I cleaned the data by removing columns, splitting columns, replacing values and, most importantly, adding custom columns where needed. Looking through the data, I found some fields that held only blank values, apparently because no one had answered them. It was better to exclude them from the table because they only took up space. About five columns, from Browser to Referrer, have no values and contribute nothing to the analysis, so they were removed.
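The column removal above was done in Power Query. As a rough illustration of the same idea outside Power BI, here is a minimal Python sketch that drops every column that is blank in all rows. The helper name `drop_blank_columns` and the sample rows are hypothetical and are not taken from the survey file.

```python
import csv
import io

def drop_blank_columns(rows):
    """Return rows (list of dicts) without the columns that are blank in every row."""
    if not rows:
        return []
    # A column is kept if at least one row has a non-empty value in it.
    keep = [col for col in rows[0] if any((row[col] or "").strip() for row in rows)]
    return [{col: row[col] for col in keep} for row in rows]

# Tiny made-up sample standing in for the survey CSV (not real survey data).
sample_csv = io.StringIO(
    "Unique ID,Browser,OS,Country\n"
    "a1,,,\n"
    "a2,,,\n"
)
rows = list(csv.DictReader(sample_csv))
cleaned = drop_blank_columns(rows)
print(list(cleaned[0].keys()))
```

Checking every row before dropping a column avoids discarding a field that is merely sparse rather than completely empty.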

Among the questionnaire columns, Q1, Q4, Q5 and Q11 contained inconsistent values. In "Q5 - Favorite Programming Language", for example, many respondents chose the option "Other" and then typed in a language that was not among the given options. To clear up this clutter I used "Split Column" with "By Delimiter", set the custom delimiter to "(" and split at the left-most delimiter. This left one clean column containing only the listed options. The second column, which held nulls and the free-text "Other" entries, was removed to avoid distraction. The same process was applied to "Q1 - Which Title Best Fits your Current Role?", "Q4 - What Industry do you work in?" and "Q11 - Which Country do you live in?".

In column "Q3 - Current Yearly Salary (in USD)" the values were not numeric. They were ranges written as text, such as "0-40k", which cannot be used directly in calculations. To resolve this, I decided to compute the average of the two ends of each range. I first duplicated the column, then split it, replaced values, and removed the extra column containing text.
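The split at the left-most "(" used for Q1, Q4, Q5 and Q11 can be sketched in Python as follows. This is only an illustration of the technique, not the Power Query step itself. The function name `split_at_leftmost` and the example answer strings are assumptions, and the exact wording of the "Other" option in the survey file may differ.

```python
def split_at_leftmost(value, delimiter="("):
    """Split value at the first occurrence of delimiter, like a left-most split."""
    head, sep, tail = value.partition(delimiter)
    # When the delimiter is absent, the second part is None (a null in the table).
    return head.strip(), (tail if sep else None)

# Hypothetical answer strings; the exact wording in the survey file may differ.
answers = ["Python", "Other (Please Specify):SQL"]
for answer in answers:
    choice, extra = split_at_leftmost(answer)
    print(choice, "|", extra)
```

Keeping only the first part collapses every free-text answer into a single "Other" category, which is what makes the column usable in a chart.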

After splitting and duplicating the column

At this point the columns were still unclear and their records still held text values. To be able to combine them into one column showing the average salary, I replaced "k" and "-" with nothing and removed the right-most column, "Q3. Current yearly salary (in USD).copy 3", from the table.

After removing the column and replacing the values, the remaining columns held plain numeric values, so their average could be calculated easily.

To calculate the average, a custom column was created using DAX.
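The project computes this average with a DAX custom column, which is not reproduced here. As a hedged sketch of the same arithmetic in Python, the hypothetical function `salary_midpoint` below strips the "k" suffix, splits on "-", and averages the two bounds. It assumes every usable value has the form "low-high", as in "0-40k". How any other form is handled is an assumption of this sketch.

```python
def salary_midpoint(salary_range):
    """Average the two bounds of a range such as "0-40k" (result in thousands of USD)."""
    low_text, sep, high_text = salary_range.partition("-")
    if not sep:
        # Assumption: anything that is not a "low-high" range is left unresolved.
        return None
    # Strip the "k" suffix so each bound is a plain number.
    low = float(low_text.replace("k", "").strip())
    high = float(high_text.replace("k", "").strip())
    return (low + high) / 2

print(salary_midpoint("0-40k"))  # prints 20.0
```

The result stays in thousands of USD, matching the cleaned columns where the "k" was removed rather than multiplied out.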

Key Metrics and Visualizations:

  • Average Salary by Job titles
  • Average age of survey takers
  • Satisfaction for Work life Balance
  • Salary Satisfaction
  • Favorite Programming Languages
  • Level of Difficulty to break into Data
  • Country of Voters

As their titles suggest, the metrics above convey a lot of information.
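A metric such as "Average Salary by Job titles" is a grouped average. In the report this is handled by the Power BI visual. The Python sketch below only illustrates the calculation. The function `average_by_group`, the field names `title` and `avg_salary`, and the sample rows are all hypothetical and are not survey results.

```python
from collections import defaultdict

def average_by_group(records, group_key, value_key):
    """Return {group: mean of value_key} over a list of dict records."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for record in records:
        value = record.get(value_key)
        if value is None:
            continue  # skip rows without a usable value
        totals[record[group_key]][0] += value
        totals[record[group_key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Made-up rows for illustration only; these are not survey results.
records = [
    {"title": "Data Analyst", "avg_salary": 20.0},
    {"title": "Data Analyst", "avg_salary": 53.0},
    {"title": "Data Engineer", "avg_salary": 53.0},
]
print(average_by_group(records, "title", "avg_salary"))
```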

Business Impact:

  • Data Management
  • Data Analysis
  • Decision making and Innovation

Data professionals ensure that data is collected, stored and maintained securely. Presenting data and data-driven insights makes it easier to reach informed strategic decisions.

Additional project images

DAX for Custom Column
Queries Applied for Data Cleaning
Q3 Uncleaned
Reordered the column