Project goal:
The objective of this Microsoft Excel project was to create a scenario where data insights can be easily viewed by Bike Sales business stakeholders across their regional locations. To achieve this, a survey was conducted across Europe (300 participants), North America (508 participants), and the Pacific region (192 participants).
Data source:
Source of data set was found from this hyperlink:
https://github.com/AlexTheAnalyst/Excel-Tutorial/blob/main/Excel%20Project%20Dataset.xlsx
Business insights from data set:
The main interest of the Bike Sales stakeholders was to identify the most profitable region, in order to increase Bike Sales revenue in that specific region. Bike Sales operations were located in Europe, North America, and the Pacific.
A data analysis was conducted to explore the correlation between bike purchases and demographic factors such as marital status, education background, gender, and home ownership across different regions. The survey study categorized the participants into three age groups: Adolescents (25-30 years old), Middle-aged adults (31-54 years old), and Older adults (55-89 years old).
Business insight:
Data-driven analysis reveals that the Pacific Region boasts the highest profit margins in Bike Sales, making it an attractive investment opportunity for Bike Sales stakeholders seeking to capitalize on future industry growth.
Exploration of Six Key Bike Sales Questions:
By analyzing survey data from participants in each region and asking six key questions, the Pacific Region was identified as the most profitable for Bike Sales stakeholders to make data-informed decisions about revenue and investment opportunities.
1.) Which region has the highest average income per Bike purchase?
Married females who own homes and hold management career occupations in the Pacific Region have an average income of $90,000.
2.) In which region do customers commute the farthest by bike?
Customers in the Pacific region are biking distances ranging from 5 to more than 10 miles.
3.) In which region is the average income of homeowners the highest?
Home ownership is highest in the Pacific region where females have an average income of $70,000.
4.) Which age group of customers commutes the farthest by bike in a specific region?
In the Pacific region, middle-aged females bike a distance of more than 10 miles.
5.) Which age group of customers commutes the most frequently by bike?
In Europe, middle-aged females frequently bike commute a distance of up to 1 mile.
6.) In which region are career occupations paid above the average salary?
In both North America and the Pacific region, there are customers who purchase bikes and have above-average incomes. These individuals typically hold bachelor’s or graduate degrees and have careers in management or professional occupations.
Import data set and data tool used:
Imported the data set into Microsoft Excel for data analysis and data visualization purposes.
Data set details:
Raw data set contained 13,351 entries.
After importing the Excel file, I removed entries of data duplicates before any data analysis was performed.
By removing the duplicate entries, there was a total of 13,013 of data entries ready for data analysis.
The following column names were found in the data set:
- ID
- Marital Status
- Gender
- Income
- Children
- Education
- Occupation
- Home Owner
- Cars
- Commute Distance
- Region
- Age
- Purchased Bike
Data Cleaning methods:
Overview of data cleaning methods that were utilized:
- Created three new worksheets:
- Working Sheet – Modify data
- Note: Editing raw data worksheet is strongly not recommended.
- Data industry best practice is to backup and keep raw data unedited.
- Pivot Tables & Charts
- Dashboard
Data cleaning methods that were performed within Working Sheet – Modify data
- Applied Find and Replace Excel search feature to the following columns:
- Marital Status column: M and S data values were changed to Married and Single data values.
- Gender column: M and F data values were changed to Male and Female data values.
- Adjusted Currency column:
- decreased decimal places by two so that data values were a whole number. Example $40,000 is used as a data value instead of the data value of $40,000.00
- New Age column was created and named:
- Age brackets column was created to simplify the data analysis and data visualization process.
- Note: Data visualization and data analysis progresses more efficiently when age data values are compacted by means of age group classification rather than individual age data points. For example, the data visualization of 20,000 individual age values looks very messy and data analysis results are very hard to understand. By using age brackets, such as mentioned above, data visualization of age group classification provides more accurate data analysis and clear data visualization representation.
- The column labeled ‘Age Brackets’ contained data values that were organized by age range.
- Adolescent: 25 to 30 years old
- Middle Age: 31 to 54 years old
- Old: 55 to 89 years old
- Created a nested IF statement within excel formula bar that utilized the age group classification:
- =IF(L2>54,"Old",IF(L2>=31,"Middle Age",IF(L2<31,"Adolescent","Invalid")))
The following methods were used in the Pivot Tables & Charts worksheet:
- Four Pivot Tables were created and named:
- PivotTable1
- PivotTable2
- Two filters were applied:
- PivotTable3
- PivotTable4
- Four Charts were created and named based on data values from each respective pivot table:
- Average Income Per Purchase
- Customer Commute
- Distance range: 0 to 10 + miles
- Two filters were applied:
- Customer Age Brackets
- Adolescent: 25 to 30 years old
- Middle Age: 31 to 54 years old
- Old: 55 to 89 years old
- Bikes Sold by Occupation (with Car Ownership data)
Creation of Excel Dashboard elements:
As a result, the four images displayed below are the data analysis chart graphs that will be used towards the creation of the Bike Sales Dashboard.
The following methods were used in the Dashboard worksheet:
- Creation of the dashboard header
- Alignment and text insertion
- Merged and centered for text input
- text input: Bike Sales Dashboard
- Design
- background color
- font color
- Bike Sales Dashboard layout design beneath the dashboard header
- Group all the charts together in an organized manner.
- Insertion and utilization of multiple slicer tools that interact directly with all chart graphs
- Multiple slicer tools were all organized to the left side of the chart graphs and below the dashboard header.
- The functionality of the slicer tool is to filter every chart all at once with the column parameters that are found in the Working Sheet – Modify data.
- For example, one slicer tool can be used to filter all chart graphs by gender attribute where the data value is male or female. An alternative method can be also used to display both genders simultaneously.
- Another example is where two or more slicers can be used to visually display on all corresponding chart graphs where the gender of male also has the data attribute of marital status of single.
- All four images displayed above were combined along with functionality of multiple slicers that were linked to each chart graph within the Bike Sales Dashboard.
As a result of combining all dashboard elements together, the image below displays the effect of when the Bikes Sales Dashboard is fully organized and complete.
Areas for data set improvements:
- No year column entry that was in the raw data set
- Filtering by year would be very useful for data visualization purposes.
- Data set collection on the frequency of customers commuting by bike
- Are commuters traveling throughout the week? Or only a few days during the week?
- Ask customers in a survey if electric bikes would be favorable for their commuting distance
- If customer surveys indicate a preference for electric bikes, this business insight could potentially increase profits by making them available.
Conclusion and thankfulness for viewing data project:
Please feel free to reach out to me on LinkedIn if you have any comments or questions.
Lastly, thank you very much for viewing this Excel data project.