Rome Airbnb Reviews

What does it take for a Roman Airbnb owner to have a highly-rated property?

October 2022 - December 2022

Percentage of Superhost vs Non-Superhost

While I was looking for a place to stay for Thanksgiving on Airbnb, the “Superhost” status caught my attention. I asked myself: how does one achieve such status? What does it take for a host to become a “Superhost”? Instead of looking up the criteria on the website itself, I thought it would be an interesting topic for my team and me to explore for our Data Mining class project.

Details of the project: Rome Airbnb.

Dataset

Having all visited Rome in the past, our team decided to focus on Airbnb data for that city, and we retrieved the data from this website. We then cleaned the data set, transforming and combining variables so that only the relevant ones remained and problems like multicollinearity were eliminated.
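One common way to spot multicollinearity is to compute pairwise correlations between numeric variables and flag pairs that move almost in lockstep. The sketch below is a minimal illustration in Python (the project itself used R). The column names, the toy values, and the 0.9 threshold are all assumptions made for the example, not the team's actual data or cutoff.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical toy listings; the values are invented for illustration only.
listings = {
    "bedrooms": [1, 2, 3, 4, 5],
    "accommodates": [2, 4, 6, 8, 10],  # exactly twice the bedrooms column
    "price": [80, 60, 150, 90, 120],
}

flagged = correlated_pairs(listings)
print(flagged)
```

A flagged pair is a candidate for combining into one variable or dropping one of the two, which is the kind of adjustment described above.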

Theory and Analysis

Using R and variables such as Price, Host Response Time, Neighborhood, Number of Bedrooms, and Distance to Airports, our team approached both a classification problem and a regression problem with modeling techniques such as Neural Network, KNN, a single Classification Tree, and Random Forest. The purpose of the regression analysis is to predict the rating score of a listing based on what the listing offers. With the classification analysis, our team also wants to learn how a host can become a “Superhost”.
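To give a feel for how one of these techniques works, here is a minimal KNN classification sketch in Python (the project itself was done in R, so this is not the team's code). It scales the features, finds the k closest training listings, and takes a majority vote. The three features, the toy rows, and k = 3 are assumptions chosen purely for illustration.

```python
from collections import Counter
from math import dist


def min_max_scaler(rows):
    """Build a function that rescales each column to [0, 1] using the training rows."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]

    def scale(row):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return scale


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to the query (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical listings: [price per night, host response time in hours, bedrooms].
train_rows = [
    [60, 1, 1],
    [75, 1, 2],
    [90, 2, 2],
    [120, 24, 1],
    [200, 48, 3],
    [150, 24, 2],
]
is_superhost = [True, True, True, False, False, False]  # invented labels

scale = min_max_scaler(train_rows)
scaled_train = [scale(row) for row in train_rows]

new_listing = [80, 1, 2]
prediction = knn_predict(scaled_train, is_superhost, scale(new_listing), k=3)
print(prediction)
```

Scaling matters here because price spans a much wider numeric range than bedrooms, and an unscaled distance would be dominated by price alone.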

Results

After running the models, our team chose the two best models: one to predict the “Superhost” status and one to predict the rating score. To test them, our team created the following scenario:

Guido is the great-grandson of a famous real estate developer in Rome. He has listed several accommodations on Airbnb over the last couple of years. Guido wants to predict whether a future listing will earn him the coveted Superhost status and how high its rating would go. According to his Airbnb data, he has 5 current listings, has uploaded a profile picture, and has had his identity verified by Airbnb. The listing in question costs 4,500 euros a night, is 12.2 miles away from Leonardo da Vinci Airport, accommodates 5 people across 3 bedrooms, is equipped with a washer, and has no parking. Guido also takes the time to greet every guest.

The single classification tree predicts that Guido would not be a Superhost.

The TreeNet model predicts a rating of 89/100 for Guido’s listing, which is equivalent to 4.47/5.

What I learned

Technical Tools

I learned how to run different classification and regression models in R: neural networks, naive Bayes, logistic regression, KNN, a single classification tree, bagged trees, random forest, and TreeNet. I also had a chance to assess each model to find out which performs best and to use those models to predict the outcomes of different scenarios.
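Assessing models usually means scoring each one on held-out data with a metric suited to the task. The Python sketch below (again, the project itself used R) uses accuracy for the classification models and root mean squared error (RMSE) for the regression models. These two metrics, the model names, and every number are assumptions for illustration; the write-up does not say which metrics the team actually used.

```python
from math import sqrt


def accuracy(actual, predicted):
    """Share of predictions that match the true labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between true and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical held-out labels and predictions from two placeholder classifiers.
actual_superhost = [True, False, False, True, False]
classifier_preds = {
    "model_a": [True, False, True, True, False],
    "model_b": [True, False, False, False, True],
}

# Hypothetical held-out ratings (0-100 scale) and predictions from two placeholder regressors.
actual_rating = [90, 80, 95, 70]
regressor_preds = {
    "model_a": [88, 83, 94, 75],
    "model_b": [91, 80, 93, 71],
}

# Higher accuracy is better; lower RMSE is better.
best_classifier = max(
    classifier_preds, key=lambda m: accuracy(actual_superhost, classifier_preds[m])
)
best_regressor = min(
    regressor_preds, key=lambda m: rmse(actual_rating, regressor_preds[m])
)
print(best_classifier, best_regressor)
```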

For more details on this project, please visit Rome Airbnb.

Overview

For this project, I was in charge of cleaning the data set, which consists of 280,000 data points and over 30 variables. After that, I ran multiple regression and classification models (Neural Network, KNN, Naive Bayes, Logistic Regression, a single Classification Tree, Bagged Tree, Random Forest, TreeNet) to predict a listing’s rating and whether or not a host is a “Superhost”. Finally, I created a scenario to test the best models our team chose.

Contribution