Google Capstone Project

About this project

Scenario

I am a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, my team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, my team will design a new marketing strategy to convert casual riders into annual members.

Background

Cyclistic has three pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. The marketing director believes there is a very good chance to convert casual riders into members.

Ask

Three questions will guide the future marketing program:

  1. How do annual members and casual riders use Cyclistic bikes differently?

  2. Why would casual riders buy Cyclistic annual memberships?

  3. How can Cyclistic use digital media to influence casual riders to become members?

I was assigned the first question: "How do annual members and casual riders use Cyclistic bikes differently?"

Prepare

(Although Cyclistic is a fictitious company, the data provided is real and has been made available by Motivate International Inc. under this license.) I analyzed the historical trip data for the 12 months of 2022. The data is reliable and free of any bias. Cyclistic has stored the data in a database that is open to the public for analytic purposes. I downloaded the data and stored both the original and a working copy on my computer. I then saved the files as .xls files so I could work with them in Excel.

Process

Once the files were saved in Excel, I combined the 12 months of data into one Excel workbook in which each month's dataset has its own tab. Here is a summary of my cleaning:

ride_id

Using =LEN, I checked the length of each ride_id, which should have been 16 characters long. I removed any rows where it was longer or shorter.
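
The same length check can be sketched outside Excel. This is only a minimal illustration in Python, not the steps used in this project; the sample ride_id values and the helper name has_valid_ride_id are made up for the example.

```python
EXPECTED_RIDE_ID_LENGTH = 16  # length stated in the cleaning notes


def has_valid_ride_id(ride_id: str) -> bool:
    """Return True when ride_id is exactly 16 characters long (the =LEN check)."""
    return len(ride_id) == EXPECTED_RIDE_ID_LENGTH


# Made-up sample values, for illustration only.
sample_ids = ["A1B2C3D4E5F60718", "SHORT123", "A1B2C3D4E5F607189999"]

# Keep only the IDs that pass the length check.
kept_ids = [rid for rid in sample_ids if has_valid_ride_id(rid)]
print(kept_ids)
```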

rideable_type

According to the data collection, classic_bike and docked_bike are the same thing, so I renamed the values to reflect that.

started_at/ended_at

I used these columns to calculate the length of each ride in a new tab, then removed any ride longer than a day or shorter than a minute. I also created another tab using =WEEKDAY to extract the day of the week on which each ride took place.
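
For readers who prefer code, the ride-length and day-of-week steps might look roughly like this in Python. It is a sketch under assumptions: the timestamp format, the function names, and the choice to keep rides of exactly one minute or exactly one day are mine, not taken from the workbook. The weekday numbering follows Excel's default =WEEKDAY convention (1 = Sunday through 7 = Saturday).

```python
from datetime import datetime, timedelta

# Assumed timestamp layout for started_at / ended_at.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def ride_length(started_at: str, ended_at: str) -> timedelta:
    """Length of a ride, computed as ended_at minus started_at."""
    start = datetime.strptime(started_at, TIMESTAMP_FORMAT)
    end = datetime.strptime(ended_at, TIMESTAMP_FORMAT)
    return end - start


def is_plausible_ride(length: timedelta) -> bool:
    """Keep rides of at least one minute and at most one day (boundaries assumed)."""
    return timedelta(minutes=1) <= length <= timedelta(days=1)


def excel_weekday(started_at: str) -> int:
    """Day of week numbered like Excel's default =WEEKDAY: 1 = Sunday ... 7 = Saturday."""
    start = datetime.strptime(started_at, TIMESTAMP_FORMAT)
    # isoweekday() gives Monday = 1 ... Sunday = 7, so shift it to Excel's numbering.
    return start.isoweekday() % 7 + 1


# Made-up example ride on Monday, 4 July 2022.
length = ride_length("2022-07-04 08:00:00", "2022-07-04 08:12:30")
print(length, is_plausible_ride(length), excel_weekday("2022-07-04 08:00:00"))
```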

start_station_name/end_station_name

I checked that nulls in these two columns occurred only for electric_bike, as that is the only bike type that has its own bike lock and does not need to be returned to a station. I also trimmed any excess spaces before and after the station names.
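
The null check and the trimming can be expressed in a few lines of Python. This is a hedged sketch rather than the method used here: the station names and rows are invented, and the helper names clean_station_name and missing_station_counts are hypothetical.

```python
from collections import Counter


def clean_station_name(name):
    """Trim leading/trailing spaces; treat empty or missing values as None."""
    if name is None:
        return None
    trimmed = name.strip()
    return trimmed or None


def missing_station_counts(rides):
    """Count rides with a missing start or end station, grouped by rideable_type."""
    counts = Counter()
    for ride in rides:
        start = clean_station_name(ride.get("start_station_name"))
        end = clean_station_name(ride.get("end_station_name"))
        if start is None or end is None:
            counts[ride["rideable_type"]] += 1
    return counts


# Invented rows, for illustration only.
sample_rides = [
    {"rideable_type": "classic_bike",
     "start_station_name": " Example St & Sample Ave ",
     "end_station_name": "Demo Park"},
    {"rideable_type": "electric_bike",
     "start_station_name": "Demo Park",
     "end_station_name": None},
    {"rideable_type": "electric_bike",
     "start_station_name": "Demo Park",
     "end_station_name": "Example St & Sample Ave"},
]

missing = missing_station_counts(sample_rides)
# The check passes when electric_bike is the only type with missing stations.
only_electric_missing = set(missing) <= {"electric_bike"}
print(missing, only_electric_missing)
```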

start_station_id/end_station_id

These columns had many inconsistencies in their string lengths and did not add any value to my analysis, so I removed them.

start_lat/end_lat & start_lng/end_lng

If any of these were null, I removed the row, because a missing coordinate suggested that the bike was not returned properly or that something else went wrong with the ride.

member_casual

I checked this column to make sure the only values were member or casual.
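
The last two checks (no null coordinates, and only member or casual in member_casual) could be sketched in Python as follows. The rows and coordinate values are made up, and the helper names are hypothetical; treat it as an illustration, not the procedure used in Excel.

```python
VALID_RIDER_TYPES = {"member", "casual"}
COORDINATE_COLUMNS = ("start_lat", "start_lng", "end_lat", "end_lng")


def has_all_coordinates(row: dict) -> bool:
    """True when none of the four coordinate columns is null or empty."""
    return all(row.get(col) not in (None, "") for col in COORDINATE_COLUMNS)


def unexpected_rider_types(rows) -> set:
    """Any member_casual values other than 'member' or 'casual'."""
    return {row["member_casual"] for row in rows} - VALID_RIDER_TYPES


# Made-up rows; the coordinates are rough placeholders.
sample_rows = [
    {"member_casual": "member", "start_lat": 41.89, "start_lng": -87.61,
     "end_lat": 41.88, "end_lng": -87.62},
    {"member_casual": "casual", "start_lat": 41.89, "start_lng": -87.61,
     "end_lat": None, "end_lng": None},
]

# Drop rows with any missing coordinate, then validate the rider types.
kept_rows = [row for row in sample_rows if has_all_coordinates(row)]
print(len(kept_rows), unexpected_rider_types(sample_rows))
```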

Analyze

The number of rows exceeded the roughly one million (1,048,576) that Excel allows per worksheet, so to make sure my analysis was as accurate as possible I moved over to SQL using BigQuery. Once there, I used UNION ALL to combine the data for all 12 months of 2022.

I used BigQuery to explore the data further and then analyzed the different ways members and casual riders were using the bikes. I then used Tableau to showcase that analysis.
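
To show what the UNION ALL step does, here is a small sketch that uses Python's built-in sqlite3 module as a local stand-in for BigQuery. The table names (trips_2022_01, trips_2022_02), the reduced set of columns, and the rows are all assumptions for illustration; the real query would list all 12 monthly tables.

```python
import sqlite3

# In-memory database standing in for the BigQuery dataset (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical monthly tables with a reduced set of columns.
for table in ("trips_2022_01", "trips_2022_02"):
    conn.execute(
        f"CREATE TABLE {table} (ride_id TEXT, rideable_type TEXT, member_casual TEXT)"
    )

# Made-up rows, for illustration only.
conn.execute(
    "INSERT INTO trips_2022_01 VALUES ('AAAAAAAAAAAAAAAA', 'classic_bike', 'member')"
)
conn.executemany(
    "INSERT INTO trips_2022_02 VALUES (?, ?, ?)",
    [
        ("BBBBBBBBBBBBBBBB", "electric_bike", "casual"),
        ("CCCCCCCCCCCCCCCC", "classic_bike", "member"),
    ],
)

# UNION ALL stacks the monthly tables and keeps every row, including duplicates.
combined_query = """
    SELECT ride_id, rideable_type, member_casual FROM trips_2022_01
    UNION ALL
    SELECT ride_id, rideable_type, member_casual FROM trips_2022_02
"""
combined_rows = conn.execute(combined_query).fetchall()
print(len(combined_rows))
```

UNION ALL is used rather than UNION because it does not remove duplicate rows, so every ride in every month is kept.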

Insights

To answer the original question, "How do annual members and casual riders use Cyclistic bikes differently?", I first looked at the types of bikes members and casual riders preferred. The graph above shows that members are more likely to choose a classic bike, whereas casual riders lean more towards the electric bike. This is most likely because more members use Cyclistic to get to and from work, which is highlighted by the next two graphs. Members ride more consistently all year, but especially during the months with fairer weather. Casual riders follow the same trend on a smaller scale, spiking in the summer months. You can also see that member rides spike at 8am and 5pm and fall off before and after. This suggests that most members are using the bikes to get to and from work, while casual riders are using the bikes for leisure.
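
The time-of-day comparison comes down to counting rides per rider type and start hour. A minimal Python sketch of that grouping is below; the rows are invented and are not results from this project, and the function name rides_by_hour and the timestamp format are assumptions.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout for started_at.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def rides_by_hour(rides):
    """Count rides for each (member_casual, start hour) pair."""
    counts = Counter()
    for ride in rides:
        hour = datetime.strptime(ride["started_at"], TIMESTAMP_FORMAT).hour
        counts[(ride["member_casual"], hour)] += 1
    return counts


# Invented rows, for illustration only.
sample_rides = [
    {"member_casual": "member", "started_at": "2022-07-04 08:05:00"},
    {"member_casual": "member", "started_at": "2022-07-04 08:40:00"},
    {"member_casual": "member", "started_at": "2022-07-04 17:10:00"},
    {"member_casual": "casual", "started_at": "2022-07-04 14:20:00"},
]

hourly = rides_by_hour(sample_rides)
print(hourly[("member", 8)], hourly[("member", 17)], hourly[("casual", 14)])
```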

The next thing I analyzed was the length of rides. This supported the idea that casual riders ride more for pleasure and not necessarily with a specific destination in mind, as members do. The days of the week also show a clear pattern: members ride most on Monday-Friday, while casual rides spike on Friday-Sunday.

Both of these insights strongly support our hypothesis, and we can conclude that members are typically commuters using Cyclistic’s bikes for short trips on weekdays to get to work, while casual riders typically use Cyclistic’s bikes on the weekend for leisure.
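
The ride-length comparison is an average grouped by rider type. The sketch below shows the calculation in Python with made-up durations; the numbers are placeholders rather than figures from this analysis, and the ride_minutes field and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_minutes_by_rider_type(rides):
    """Average ride length in minutes for each member_casual value."""
    durations = defaultdict(list)
    for ride in rides:
        durations[ride["member_casual"]].append(ride["ride_minutes"])
    return {rider: mean(values) for rider, values in durations.items()}


# Placeholder durations, for illustration only.
sample_rides = [
    {"member_casual": "member", "ride_minutes": 10},
    {"member_casual": "member", "ride_minutes": 14},
    {"member_casual": "casual", "ride_minutes": 20},
    {"member_casual": "casual", "ride_minutes": 26},
]

averages = average_minutes_by_rider_type(sample_rides)
print(averages)
```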

The next two visualizations show where in Chicago people start and end their rides. One thing the visualizations cannot show is that more people dock their bikes at each station than the maps suggest: some rides are not recorded at the exact longitude and latitude of the station, so they do not register as being at that station specifically. The table shows the more accurate numbers.

The starting stations for members and casual riders tell fairly different stories. Most casual riders start along the beach, which suggests the bikes are being used by tourists as well as by people riding for leisure. Members, on the other hand, start further inland, in historic districts and near a university.

The ending stations below tell the same story as the starting stations.

Conclusion/Recommendations

In summary,

Members:

  • Tend to use Cyclistic to get to and from work (rides peak at 8am and 5pm)
  • Start and end trips in the city near offices and universities
  • Ride for 10-15 minutes on average

Casual riders:

  • Tend to use Cyclistic for sightseeing, and are most likely tourists (rides spike in the summer months)
  • Start and end trips along the beach
  • Ride for 20-25 minutes on average (about 2x members)

Based on the data, there are some casual riders who cannot be converted (tourists, sightseers). However, I believe the data supports the possibility that other casual riders could be converted to members. Below are some recommendations.

  1. Cyclistic could offer new subscriptions targeted at those who use the service mainly in the summer months. Instead of the all-or-nothing annual plan currently in place, it could offer a 3 or 6 month plan at a discounted rate.

  2. If a new subscription is not an option, I would suggest advertising heavily during the summer months. I would especially focus on advertising at major tourist locations and at the top 5 starting and top 5 ending locations for casual riders.

Thank you!
