Analysis of Cyclistic Bike Share Riders

About this project

Introduction

This project focuses on analyzing the usage patterns of Cyclistic bike share riders, specifically the two categories of riders: casual riders and annual members. The objective is to understand how these two groups differ in their usage of Cyclistic bikes. The ultimate goal is to leverage this insight to create a marketing strategy that will help convert casual riders into annual members.

To gain a comprehensive understanding of the data and to compare different data analysis tools, this project was replicated using two different tools: Excel and R. Each tool offers unique strengths and weaknesses, providing valuable insights into their respective capabilities for this project.

Excel Analysis

The project commenced by clearly defining the business task, which was to identify the differential usage patterns between annual members and casual riders of Cyclistic bikes. The datasets were downloaded from the Motivate International Inc website and stored locally. The dataset consisted of 12 CSV files, each representing a month of the year.

The datasets were imported into Excel using the import tool, creating separate worksheets for each file. The data inspection phase involved verifying the consistency of column names and data formats across the 12 files. The column names were found to be consistent, but the data formatting in the May and June 2022 files required conversion to date format.

Using Power Query, the 12 files were combined and transformed, incorporating additional relevant columns for analysis. One such column was "ride_length," representing the duration of each ride, calculated as the difference between the "started_at" and "ended_at" columns. The "month" and "started_day" columns were also added, derived from the "started_at" column.

Following this initial exploration and cleanup, the data was ready for analysis.

Using pivot tables, the average ride length for both casual riders and annual members was calculated, revealing that casual riders tend to have longer ride durations compared to annual members. Additionally, the number of rides for each category was computed and grouped by days of the week and months of the year.

The analysis of days of the week showed that casual riders prefer riding on weekends, while annual members exhibit higher ride counts on weekdays. Regarding months of the year, both casual riders and annual members tend to ride the most during the summer season and the least during winter. Furthermore, the analysis revealed that casual riders have a higher preference for electric bikes, while annual members prefer classic bikes.

These calculations effectively highlight the distinct usage patterns of annual members and casual riders of Cyclistic bikes.

R Programming Analysis

The R programming analysis began by installing and loading several packages required for data handling, processing, and transformation, including janitor, skimr, here, and tidyverse. Once the packages were loaded, the 12 data files were read in using the readr package, which is part of the tidyverse.

The next step involved combining the 12 files into a single dataset using the rbind function. Consistency checks were performed on variable names and data formats to ensure accurate merging. Inconsistencies in data formatting were resolved by converting the problematic columns, such as "started_at" and "ended_at," to a proper date-time type using the base R as.POSIXct function.
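The convert-then-combine step can be sketched in base R as follows. This is a minimal illustration, not the project's actual script: the two small data frames stand in for monthly files, the "month/day/year" text layout in the first one is an assumed example of a formatting inconsistency, and the helper name to_time is hypothetical.

```r
# Two tiny stand-ins for monthly files (the project itself used 12 CSVs).
# Assumption: one file stores timestamps as text in a different layout.
may <- data.frame(
  ride_id    = c("A1", "A2"),
  started_at = c("5/1/2022 08:00", "5/1/2022 09:30"),
  ended_at   = c("5/1/2022 08:20", "5/1/2022 10:00"),
  stringsAsFactors = FALSE
)
july <- data.frame(
  ride_id    = "B1",
  started_at = "2022-07-04 12:00:00",
  ended_at   = "2022-07-04 12:45:00",
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse both timestamp columns with the format
# that matches the file they came from.
to_time <- function(df, fmt) {
  df$started_at <- as.POSIXct(df$started_at, format = fmt, tz = "UTC")
  df$ended_at   <- as.POSIXct(df$ended_at,   format = fmt, tz = "UTC")
  df
}
may  <- to_time(may,  "%m/%d/%Y %H:%M")
july <- to_time(july, "%Y-%m-%d %H:%M:%S")

# Column names must agree before the files are stacked row-wise.
stopifnot(identical(names(may), names(july)))
all_trips <- rbind(may, july)
str(all_trips)
```

Converting each file before calling rbind keeps the column types identical across files, so the combined "started_at" and "ended_at" columns end up as a single date-time type rather than text.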

Following the data merging process, the variables essential to the analysis were extracted using the select function from the dplyr package. The skim_without_charts function from the skimr package was then used to obtain a summary of the dataset, including data types, the total number of rows (approximately 5.8 million), and information about missing values.

New variables deemed relevant to the analysis were added to the dataframe using the mutate function. These included "start_day," "month," and "ride_length."
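A minimal sketch of how these three variables might be derived with mutate is shown below. The two sample rides are made up for illustration, and expressing "ride_length" in minutes is an assumption chosen to match the averages reported later.

```r
library(dplyr)

# Made-up sample rides; the real data frame holds millions of rows.
trips <- data.frame(
  started_at = as.POSIXct(c("2022-07-02 10:00:00", "2022-07-04 08:15:00"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2022-07-02 10:30:00", "2022-07-04 08:27:00"), tz = "UTC")
)

trips <- trips %>%
  mutate(
    # Duration of each ride in minutes (assumed unit).
    ride_length = as.numeric(difftime(ended_at, started_at, units = "mins")),
    # Day-of-week name; the text returned depends on the session locale.
    start_day   = weekdays(started_at),
    # Full month name, also locale-dependent.
    month       = format(started_at, "%B")
  )

print(trips)
```

Wrapping difftime in as.numeric stores the duration as a plain number, which keeps later averages and filters simple.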

The analysis commenced by calculating the average ride length for each category of riders. The results revealed that casual riders have an average ride length of 28.49 minutes, while annual members have an average ride length of 12.50 minutes. This indicates that casual riders tend to take longer rides compared to annual members.
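The grouped average could be computed along these lines. This is a sketch on made-up numbers, so its output does not reproduce the figures above; the column name member_casual for the rider category is an assumption.

```r
library(dplyr)

# Toy data: ride lengths in minutes for each rider category.
# member_casual is an assumed column name for the rider type.
trips <- data.frame(
  member_casual = c("casual", "casual", "member", "member"),
  ride_length   = c(40, 20, 10, 14)
)

avg_length <- trips %>%
  group_by(member_casual) %>%
  summarise(
    mean_ride_length = mean(ride_length),  # average minutes per ride
    n_rides          = n()                 # number of rides in the group
  )

print(avg_length)
```

Reporting the ride count next to the mean makes it easier to judge how much data each average rests on.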

Further analysis focused on the total number of rides per day of the week and month of the year, grouped by member type. The findings showed that casual riders have a higher ride count on weekends, suggesting a preference for leisure or recreational biking. In contrast, annual members exhibit consistent ride patterns throughout the week, indicating more frequent and regular bike usage.
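One possible way to produce the per-day counts is sketched below, again on invented rows. The column names member_casual and start_day are assumptions, and the same pattern would apply to the "month" column.

```r
library(dplyr)

# Fixed weekday order so results sort Monday..Sunday, not alphabetically.
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

# Invented sample rides.
trips <- data.frame(
  member_casual = c("casual", "casual", "casual", "member", "member", "member"),
  start_day     = c("Saturday", "Sunday", "Saturday", "Monday", "Tuesday", "Saturday")
)

rides_by_day <- trips %>%
  mutate(start_day = factor(start_day, levels = day_levels)) %>%
  count(member_casual, start_day, name = "n_rides")

print(rides_by_day)
```

Turning the day names into an ordered factor is a small design choice that pays off when the counts are later tabulated or plotted in calendar order.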

The analysis also examined the rideable types preferred by each category of riders. The results indicated that casual riders have a higher proportion of rides on electric bikes, while annual members predominantly opt for classic bikes. One possible explanation is that casual riders value the convenience and ease of electric bikes, while annual members prefer the reliability and familiarity of classic bikes.
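The share of each bike type within a rider category might be calculated as in the following sketch. The rows are invented, and the column names member_casual and rideable_type, along with the values "electric_bike" and "classic_bike," are assumptions about how the data is labeled.

```r
library(dplyr)

# Invented sample rides; labels and column names are assumptions.
trips <- data.frame(
  member_casual = c("casual", "casual", "casual", "casual",
                    "member", "member", "member", "member"),
  rideable_type = c("electric_bike", "electric_bike", "electric_bike", "classic_bike",
                    "classic_bike", "classic_bike", "classic_bike", "electric_bike")
)

type_share <- trips %>%
  count(member_casual, rideable_type, name = "n_rides") %>%
  group_by(member_casual) %>%
  # Share of each bike type within its own rider category.
  mutate(share = n_rides / sum(n_rides)) %>%
  ungroup()

print(type_share)
```

Comparing shares rather than raw counts matters here because the two rider categories take very different numbers of rides overall.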

Conclusion

In conclusion, this project successfully analyzed the usage patterns of Cyclistic bike share riders, specifically focusing on the distinctions between casual riders and annual members. The analysis highlighted significant differences in ride length, ride count by day of the week and month, and preferred rideable types.

The insights gained from this analysis can be leveraged to create a targeted marketing strategy aimed at converting casual riders into annual members. For instance, because casual riders have a higher ride count on weekends, Cyclistic can create weekend-specific promotions and events to attract them. Guided group rides, themed biking tours, or special weekend-only discounts could encourage more casual riders to explore the benefits of annual membership. By understanding the unique preferences and behaviors of these rider categories, Cyclistic can tailor its promotions and offerings to attract and retain more annual members.

This project not only achieved its objectives but also provided valuable insights into the capabilities of, and differences between, Excel and R as data analysis tools. Both tools proved effective in handling and analyzing the data. R showcased its strengths in data cleaning, while Excel demonstrated its power in performing calculations using pivot tables.

Overall, this project exemplifies strong data analysis skills, showcasing the ability to extract actionable insights and inform strategic decision-making.

Additional Notes

Cyclistic is a fictional bike share company, and this project serves as a simplified example of how tools like Excel and R can be used to analyze data. In a real-world scenario, an analysis of this nature would typically involve more extensive data transformation and analysis techniques, covering a broader range of criteria and metrics that highlight the differences in usage patterns between casual riders and annual members.
