Bike Share Data Analysis Case Study

3 min readMay 2, 2021

The following is a data analysis case study that I completed for a Data Analysis Certificate Capstone project. Our completion of this project was optional and we had a few options on what case study to choose or come up with our own. I chose one of the sample case studies included in the program.

We were given access to the data, some tips on how to proceed through each step of the data analysis workflow and then were expected to share it via an online portfolio of some kind. This is what I decided to use as the delivery mechanism for the online portfolio. My presentation of the case study is linked at the bottom but between here and there are the steps I took given my starting point was a collection of raw CSVs. Enjoy!

Ask

Determine how users of the bike share program differ based on casual riders vs those with memberships

Prepare

For this data analysis we will use the data located here. We will use the quarterly data from the last few quarters of 2019 and the first quarter of 2021.

The data in these files contains information about the rides using the bike share service including data points such as:

start time and end time
start location and end location
member or non-member

Process

Since the spreadsheets were already fairly large I decided to process the CSVs with R rather than using a spreadsheet tool. I knew once I got the data into R and made some initial analysis I’d need to export it into a database so I could run queries on it. I am sure R could have done everything I needed but my background is stronger in SQL than it is in R in terms of gathering statistics. Once I figured out some basic insights about the users I knew Tableau was going to be my tool for making those visualizations.

In order to process the data I took the following steps:

Determine the column names across the four quarterly files
Rename any column inconsistencies to make merging the dataframes possible
Mutate a few of the columns using as.character
Combine the dataframes from all of the four quarter dataframes
Removed columns that were not in use for all the files such as lat, long, gender, birthyear, and a few others
Consolidate the labels in the membership column such that member and Subscriber are a single label of ‘member’. Do the same for Customer -> ‘casual’
Added some calculated columns based on the date field so we can easily access month, year, day of the week
Calculated the ride length based on start and end time of the ride
Dropped any rows that had a start location of ‘HQ QR’ or that had a negative ride length

Analyze

Once I had the data in a single dataframe I ran some basic analysis in R for statistics. After that I exported it to a CSV file such that I could analyze it further with SQL and Tableau.

In SQL I loaded it into a local Postgres DB I did the following checks:

What were the top 10 most popular stations for members vs casual riders?
Count the total number of rides for each rider type and then compare it with the number of times that the start and end location were the same
How many rides per month across the two rider types are there and what are the most popular months?
Determine what month has the highest percentage of casual riders?
Find the average ride length for each rider type. Compare that with the median to see what seems more reasonable as there were some really long rides in the less popular months

After using SQL to get some basic analysis I imported the combined CSV into Tableau to do some further exploration and more visualizations to see what else the data was telling me.

In the presentation below you can see the final visualizations that I came up with after testing out various possibilities.

Act

Finally I built a presentation of my findings using Google Slides. Here is the end result with my top three recommendations.

References

Link to the data
Link to a copy of the R script I used in my processing and analysis phases
Link to a copy of my SQL exploration script

Bike Share Data Analysis Case Study

Ask

Prepare

Process

Analyze

Share

Act

References

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Jon Kelley

No responses yet