Project Name:
Maven Slopes Challenge
Description:
This project is about building a one-page dashboard that will help skiers find their ideal destination from nearly 500 ski resorts around the world. My aim was to build an exploratory and interactive dashboard with the end users (skiers) in mind.
The Dataset:
The dataset (provided by Maven Analytics) contains 2 tables in CSV format:
- The Resorts table contains information on 499 ski resorts around the world, including their location, slopes, lifts, prices, and ski season.
- The Snow table contains supplemental data on the surface of the earth covered by snow for each month in 2022, by latitude & longitude.
- Tags: Travel, Weather, Geospatial
- Data Structure: Multiple Table
- Number of Records: 499
- Number of Fields: 25
Approach:
The idea I had was to build a webpage-like dashboard (more like a Ski Guide website) since the requirement was to help skiers find their ideal destination.
Step 1: Data Cleaning & Aggregation
- CountBlank function was used in MS Excel to ascertain missing data (if any).
- Cleaning of Resort Names, removing noises (?, /, ---, etc.) using functions like find & replace, text-to-columns, concatenate, trim, etc.
- For "Snow" file, Date was split into Month and Year, whilst date elements like 12 were changed to December, 1 to January, 2 to February, etc. using TEXT function in MS Excel.
Step 2: Data Manipulation
- Bands were created for fields like Price, Highest Point, Lowest Point, Beginner Slope, Intermediate Slope, Difficult Slope, Longest Run, and Lift Capacity.
- This is to enable me create slicers for a filtered page.
- At first, months in the Snow file were used to determine another field called Season (winter/summer).
Step 3: ETL
Step 4: Data Transformation and Field Categorization in Power Query
Step 5: Data Modelling
Step 6: DAX creation
Step 7: Analyses and Dashboard (Webpage) Design
Challenges:
Major challenge was in trying to extract months from seasons (which were presented as month ranges) as well as incorporating supplemental data into the data model. I did a few collaboration on LinkedIn among the data community and I was availed the matrix file for Season Dates, which I used to create a new file - "Season".
Key Insights:
- There are free-entry resorts in every continent except South America.
- Cheapest resort (€14) is in Gudauri, Georgia. It's the only transcontinental resort at the intersection of Eastern Europe and Western Asia.
- Priciest resort (€141) is Beaver Creek in USA.
- Cheapest resorts are in Europe and Asia, with most in Europe.
- Top 5 countries with highest number of resorts are Austria, France, USA, Switzerland, and Italy.
- Europe has 72% of total number of resorts, followed by North America with 20%.
- 99% of resorts are child-friendly; 76% of resorts have snow parks; only 41% of resorts support night skiing, whilst only 6% have facilities for summer skiing.
- Top resorts for Beginners and Intermediate Skiers are in Europe (France), whilst the top resort for Professional Skiers is in USA.
- Lowest and Highest Peak resorts are in North America; lowest peak resort (Le Massif - 36 meters) is in Canada, and highest peak resort (Breckenridge - 3,914 meters) is in USA.
- Resorts with the longest run (in kilometers) are all in Europe.
Impact:
I sincerely trust that potential skiers would not need to spend more than 5 minutes on the dashboard to find their ideal destination(s) - once they have been able to clarify their budget, preferred destination, travel month(s)/season, and preferred amenities, among others.