Objective 1: Profile and QA the Data
Read in the Airbnb listings data, calculate basic profiling metrics, change column datatypes as necessary, and filter down to only Paris listings.
- Cast the date columns as datetime
- Filter the data down to rows where the city is Paris, and keep only the columns ‘host_since’, ‘neighbourhood’, ‘city’, ‘accommodates’, and ‘price’
- QA the Paris listings data: check for missing values and calculate the minimum, maximum, and average for each numeric field
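The steps above can be sketched in pandas as follows. This is a minimal illustration, not the project's actual notebook code: the tiny `sample` frame stands in for the full listings file (which would normally be loaded with `pd.read_csv`), and the helper names `load_paris_listings` and `qa_summary` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the full listings file; values are made up.
sample = pd.DataFrame({
    "name": ["Studio A", "Loft B", "Flat C", "Room D"],
    "host_since": ["2014-03-01", "2016-07-15", None, "2015-01-20"],
    "neighbourhood": ["Elysee", "Louvre", "Elysee", "Manhattan"],
    "city": ["Paris", "Paris", "Paris", "New York"],
    "accommodates": [2, 4, 2, 3],
    "price": [150, 120, 200, 300],
})

KEEP_COLUMNS = ["host_since", "neighbourhood", "city", "accommodates", "price"]


def load_paris_listings(listings: pd.DataFrame) -> pd.DataFrame:
    """Cast host_since to datetime, keep Paris rows and the five columns of interest."""
    listings = listings.copy()
    listings["host_since"] = pd.to_datetime(listings["host_since"])
    return listings.loc[listings["city"] == "Paris", KEEP_COLUMNS]


def qa_summary(paris_listings: pd.DataFrame):
    """Return missing-value counts per column and min/max/mean of the numeric fields."""
    missing = paris_listings.isna().sum()
    stats = paris_listings[["accommodates", "price"]].agg(["min", "max", "mean"])
    return missing, stats


paris_listings = load_paris_listings(sample)
missing, stats = qa_summary(paris_listings)
print(missing)
print(stats)
```

Missing `host_since` values become `NaT` after the cast, so they show up in the missing-value count rather than raising an error.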
Objective 2: Prepare the Data for Visualization
Produce DataFrames that will be used in visualizations by aggregating and manipulating the listings data in several ways.
- Create a table named paris_listings_neighbourhood that groups Paris listings by 'neighbourhood' and calculates the mean price (sorted low to high)
- Create a table named paris_listings_accomodations: filter down to the most expensive neighborhood, group by the ‘accommodates’ column, and calculate the mean price for each value of ‘accommodates’ (sorted low to high)
- Create a table called paris_listings_over_time grouped by the ‘host_since’ year, and calculate the average price and count of rows representing the number of new hosts.
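One possible way to build the three tables is sketched below. The small `paris_listings` frame is invented sample data, and the output column names `new_hosts` and `avg_price` are assumptions chosen for readability; the table names follow the spelling used in the list above.

```python
import pandas as pd

# Invented sample data standing in for the filtered Paris listings.
paris_listings = pd.DataFrame({
    "host_since": pd.to_datetime(
        ["2014-03-01", "2014-09-10", "2015-01-20", "2016-07-15", "2016-11-02"]
    ),
    "neighbourhood": ["Elysee", "Louvre", "Elysee", "Louvre", "Elysee"],
    "city": ["Paris"] * 5,
    "accommodates": [2, 2, 4, 4, 2],
    "price": [150, 100, 300, 140, 170],
})

# Mean price per neighbourhood, sorted low to high.
paris_listings_neighbourhood = (
    paris_listings.groupby("neighbourhood")
    .agg({"price": "mean"})
    .sort_values("price")
)

# After an ascending sort, the last index label is the most expensive neighbourhood.
most_expensive = paris_listings_neighbourhood.index[-1]

# Mean price per 'accommodates' value within that neighbourhood, sorted low to high.
paris_listings_accomodations = (
    paris_listings[paris_listings["neighbourhood"] == most_expensive]
    .groupby("accommodates")
    .agg({"price": "mean"})
    .sort_values("price")
)

# Row count (used as the number of new hosts) and mean price per host_since year.
paris_listings_over_time = (
    paris_listings.groupby(paris_listings["host_since"].dt.year)
    .agg(new_hosts=("neighbourhood", "count"), avg_price=("price", "mean"))
)

print(paris_listings_over_time)
```

Grouping on `host_since.dt.year` keeps the year as a plain number in the index, which is convenient for plotting on an x-axis later.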
Objective 3: Visualize the Data and Summarize the Findings
- Build visuals to show the number of new hosts by year, overall average price by year and neighborhood, and average price for various listings in Paris' most expensive neighborhood.
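As a rough sketch of how such visuals might be built with matplotlib: the function names, the dual-axis layout, and the sample tables below are all assumptions made for illustration, not the charts from the original analysis.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_new_hosts_and_price(over_time: pd.DataFrame):
    """Line chart with new hosts on the left axis and average price on the right axis."""
    fig, ax_hosts = plt.subplots()
    ax_hosts.plot(over_time.index, over_time["new_hosts"], color="tab:blue")
    ax_hosts.set_xlabel("Year host joined")
    ax_hosts.set_ylabel("New hosts")
    ax_price = ax_hosts.twinx()  # second y-axis sharing the same x-axis
    ax_price.plot(over_time.index, over_time["avg_price"], color="tab:orange")
    ax_price.set_ylabel("Average price")
    ax_hosts.set_title("New hosts and average price by year")
    return fig


def plot_mean_price_bars(table: pd.DataFrame, title: str):
    """Horizontal bar chart of a table with a 'price' column, one bar per index label."""
    fig, ax = plt.subplots()
    ax.barh(table.index.astype(str), table["price"])
    ax.set_xlabel("Average price")
    ax.set_title(title)
    return fig


# Invented sample tables shaped like the aggregated DataFrames from Objective 2.
over_time = pd.DataFrame(
    {"new_hosts": [2, 1, 2], "avg_price": [125.0, 300.0, 155.0]},
    index=pd.Index([2014, 2015, 2016], name="host_since"),
)
by_neighbourhood = pd.DataFrame(
    {"price": [120.0, 206.7]},
    index=pd.Index(["Louvre", "Elysee"], name="neighbourhood"),
)

fig_time = plot_new_hosts_and_price(over_time)
fig_bars = plot_mean_price_bars(by_neighbourhood, "Average price by neighbourhood")
```

The same bar-chart helper can be reused for the table of mean prices by ‘accommodates’ in the most expensive neighborhood. Outside a notebook, `plt.show()` displays the figures.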
Based on your findings, what insights do you have about the impact of the 2015 regulations on new hosts and prices?
The data suggest that the 2015 regulations caused the number of new Airbnb hosts to drop and the prices of existing listings to go back up.
Bonus:
I added an interactive map with Plotly that shows each available Airbnb listing, along with its name and neighborhood.