Capital Bikeshare

Predicting the number of casual and registered bike rentals

October 2021 - December 2021

Bike rental services have seen rising demand over the past year. As a leader in the industry, Capital Bikeshare in DC wants to understand the factors that affect demand for bike sharing, since they matter not only for the company’s benefit but also for traffic patterns, environmental impact, and health and wellness.

Dataset

To help build the best possible models for predicting the number of rentals, Capital Bikeshare provided rental data in the form of 752 random one-hour observations. Each observation consists of 11 variables. Two of them, Casual and Registered, are the variables to be predicted: the number of casual users and the number of registered users.

Theory and Analysis

Using R, I started with exploratory analysis, building a correlation matrix to check for strong correlations between the variables. Combining information from the data set itself with the results of the exploratory analysis, I identified the variables to use in the model. From there, I ran regression analyses on different combinations of variables and arrived at two models that satisfy the regression assumptions and hold up under accuracy testing.
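As a rough illustration of the correlation step, here is a minimal sketch in Python (the project itself was done in R). The column names are assumed from the variables mentioned in this write-up, and the numbers are made up for illustration; they are not taken from the Bikeshare data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every pair of column names to its Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up toy values, NOT real Bikeshare observations.
toy = {
    "feelslike": [5.0, 12.0, 18.0, 24.0, 30.0],
    "casual": [3.0, 10.0, 25.0, 40.0, 60.0],
    "registered": [20.0, 55.0, 90.0, 130.0, 160.0],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Pairs with a correlation close to 1 or -1 are the ones worth a closer look, either as promising predictors or, when two predictors are strongly correlated with each other, as candidates to drop.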

Results

Based on the model and the interpretation of its formula, feels-like temperature, day of the week (Wednesday to Friday), weather (clear or cloudy), and time of day (7 AM-10 AM and 1 PM-8 PM) are good predictors of the number of registered Bikeshare users. A high feels-like temperature on a Wednesday, Thursday, or Friday with clear or cloudy weather, during 7 AM-10 AM or 1 PM-8 PM, greatly increases the predicted number of registered users.
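To make these factors concrete, the sketch below (in Python, not the R used for the project) turns one hourly observation into the kind of numeric inputs a linear regression could use. It is a simplification under stated assumptions: the function names, the grouping of each factor into a single 0/1 indicator, the inclusive hour bounds, and the "Rain" label in the second example are all hypothetical, and the actual model may encode these factors differently.

```python
# Factor levels named in the results above.
PEAK_DAYS = {"Wednesday", "Thursday", "Friday"}
GOOD_WEATHER = {"clear", "cloudy"}


def in_peak_hours(hour):
    """True for 7 AM-10 AM or 1 PM-8 PM on a 24-hour clock.

    Treating both ranges as inclusive is an assumption.
    """
    return 7 <= hour <= 10 or 13 <= hour <= 20


def encode(feelslike, weekday, weather, hour):
    """Return [feelslike, peak-day flag, good-weather flag, peak-hour flag]."""
    return [
        float(feelslike),
        1.0 if weekday in PEAK_DAYS else 0.0,
        1.0 if weather.lower() in GOOD_WEATHER else 0.0,
        1.0 if in_peak_hours(hour) else 0.0,
    ]


busy = encode(28.0, "Thursday", "Clear", 8)   # warm weekday morning
quiet = encode(28.0, "Sunday", "Rain", 3)     # same temperature, off-peak
print(busy)
print(quiet)
```

In a fitted linear model, each flag that equals 1 adds its coefficient to the prediction, which is why an hour where all three flags are set and the temperature is high yields the largest predicted number of registered users.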

What I Learned

Technical Tools

This project was my first exposure to R. I learned the basics of R programming, built my first linear regression models for prediction, and tested their accuracy.

For more details on this project, please visit Bikeshare