Introduction
For the Maven Music Challenge, I have been provided with a Spotify user's complete music streaming history data from 2013 to 2024, including timestamps, track, artist, album names, and reasons for playing and ending each track.
Challenge Objective
Every December, millions of Spotify users look forward to their Spotify Wrapped – a personalized recap showcasing their listening habits over the past year.
For the Maven Music Challenge, my task is to create my own version of Spotify Wrapped, by using the sample dataset provided. My requirement is to create a Dashboard that shows a breakdown of the user’s listening habits and streaming patterns. The final Report or Dashboard should be a visualization on the Summary of Key Insights, such as:
The user’s Most Streamed Tracks by Total Minutes Streamed.
The user’s Most Streamed Artists and Albums by Total Minutes Streamed.
The Trend of Tracks or Music Streaming by Month, Day and Time Period,
The percentage of streaming from different platforms.
The Reason for Starting and Ending Tracks.
Layout and Visuals for a One-Page Spotify Music Streaming Dashboard
Layout
1. KPIs (Top Section of the Dashboard)
2. Slicers (Top Section of the Dashboard)
3. Visuals/Charts (Main Section of the Dashboard)
Metrics / Key Performance Indicators (KPIs).
I used Card Visuals for the following KPIs:
· Count of Tracks: Total number of unique tracks streamed.
· Count of Albums: Total number of unique albums streamed.
· Count of Artists: Total number of unique artists streamed.
· Total Minutes Streamed: Sum of total minutes streamed.
Slicers
I added slicers for key filters to allow users to drill down into the data:
· Date Range: A Tile slicer to filter the data by date range (e.g., from 2013 to 2024, January to December).
· Platform: Drop-down slicer to filter by platform (e.g., Android, iOS, Windows, Web Player, etc.).
· Week Day: Drop-down slicer to filter by Day, (Monday to Sunday).
· Time Period: Drop-down slicer to filter by Time Period in a Day, (Morning, Afternoon, Evening, Dawn).
· Shuffled and Skipped: Tile slicer to show details on tracks that was shuffled and skipped.
Visuals/Charts
1. Stacked Bar Chart (Top 10 Tracks by Minutes Streamed):
· This will show the most streamed tracks based on total minutes streamed.
2. Stacked Bar Chart (Top 10 Artists by Minutes Streamed):
· Display the top artists based on total minutes streamed.
3. Line Chart (Monthly Streaming Trends):
· Show the trend of total minutes streamed over the months from 2013 to 2024.
4. Donut Chart (Platform Distribution):
· Show the percentage of streams from different platforms for quick insights on platform usage.
5. Stacked Bar Chart (Streaming Reason Breakdown):
· Display the reasons for starting and ending tracks (e.g., user action, autoplay, etc.).
Data Preparation
I began the data preparation process by loading the data from the source CSV file into Power Query Editor in Power BI Desktop.
I extracted the TIME values from the Timestamp column that has values for Date and Time. I split the column into two with different columns for Date and Time. I created a custom column for TIME PERIOD using an M Language formula. I now have values in the TIME COLUMN showing data for Morning, Afternoon, Evening, Night and Dawn streaming.
Secondly, I created a new Custom Column; Minutes Streamed Column, by DIVIDING the Milliseconds Column by 60000 to get values for Minutes Streamed for each track.
I CLEANED all column to remove all unprintable values. I TRIMED all columns to remove all trailing empty spaces before and at the end of each value.
I replaced BLANK VALUES in all columns with appropriate value to represent the content for that particular column.
The Data Model
I created 5 DIMENSION TABLES from the main FACT TABLE (Spotify History). I have 5 new Dimension Tables that contain unique values in each column:
URI+Tracks+Albums+Artists (Spotify URI, Track Name, Album Name, Artist Name)
Time Period
Platform
Reason Started
Reason Ended
Dimension Date (Unique dates based on the dates in the Fact Table)
I connected the Dimension Tables to the Fact Table and established the Relationship between the tables. A data Model is needed to establish a clear relationship between the 5 Dimension Tables and the Fact Table. The data model will help establish how the URI for each track relates to values in other columns in the dataset.
Tools Used:
I used Power BI Desktop for this project. Power Query was used for data cleansing and data transformation. The Data Model was created in the Model View and the Dashboard was created in the Report View.
DAX was used to calculate the TOTAL MINUTES STREAMED.
Key Insights from the Data:
My analysis of the music streaming history in the last 12 years revealed the following insights:
Count of Artists: 4112
Count of Albums: 7893
Count of Tracks: 13656
Total Minutes Streamed: 320,487.85
Favorite Track is: Ode To The Mets
Favorite Artist is: The Beatles
October is the highest streaming month with over 3.3K minutes streamed
Android is the No.1 Streaming Platform with 90.95% of streaming happening on this platform.
The reason why most Tracks Started and Ended was because of Trackdone