During the 12-week Metis Data Science Bootcamp, I built a recommendation system for my final project.
The goal of the project was to find the best hotel recommendations for each user.
I had 60,000 reviews of 1,000 hotels from 7,000 users. All of the reviews came from the TripAdvisor website.
TripAdvisor is a travel website that gathers reviews from its users.
Each hotel has its own rating and reviews, and each user can leave a review and a rating from 1 to 5 for a hotel they stayed in.
For the main goal I used user-based collaborative filtering.
How does it work?
We have users, hotels, and the ratings that users left for hotels.
Users who rated the same hotels are identified as similar.
Each user then gets recommendations from similar users for the hotels they haven’t rated yet.
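The idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the toy ratings, the choice of cosine similarity over co-rated hotels, and the names `cosine_similarity` and `predict_rating` are all assumptions made for the example.

```python
from math import sqrt

# Toy ratings: user -> {hotel: rating on a 1-5 scale}. All names are made up.
ratings = {
    "alice": {"Hotel A": 5, "Hotel B": 4, "Hotel C": 1},
    "bob":   {"Hotel A": 5, "Hotel B": 5, "Hotel D": 4},
    "carol": {"Hotel A": 1, "Hotel C": 5, "Hotel D": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the hotels both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[h] * v[h] for h in common)
    norm_u = sqrt(sum(u[h] ** 2 for h in common))
    norm_v = sqrt(sum(v[h] ** 2 for h in common))
    return dot / (norm_u * norm_v)

def predict_rating(user, hotel, ratings):
    """Similarity-weighted average of other users' ratings for the hotel."""
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, other_ratings in ratings.items():
        if other == user or hotel not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[hotel]
        similarity_sum += sim
    # No similar user rated this hotel: no prediction is possible.
    return weighted_sum / similarity_sum if similarity_sum > 0 else None

# Alice has not rated Hotel D; Bob (very similar) and Carol (less similar) have.
prediction = predict_rating("alice", "Hotel D", ratings)
print(round(prediction, 2))
```

In this toy data Alice's tastes are much closer to Bob's than to Carol's, so the predicted rating for Hotel D is pulled toward Bob's rating of 4 rather than Carol's rating of 2.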
Top 10 recommendations
To find the top 10 recommendations, I followed these steps.
First, I created a train subset and an anti-train subset from the data.
The training set contains every user-hotel pair that has a review in the system, along with the rating for that pair.
The anti-train set contains all possible user-hotel pairs that are not in the training set.
After creating the two subsets, I built my algorithm using the training set.
Then I ran the anti-train set through the algorithm and got rating predictions for every pair in it.
Finally, I sorted the predictions to get the top 10 recommendations for each user.
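The steps above can be sketched as follows. This is a hedged illustration on toy data: the `predict` function here is only a stand-in (it returns the hotel's mean training rating) so that the example runs on its own, and the names `anti_train` and `top_n` are hypothetical. In the real project the predictions come from the collaborative filtering model.

```python
from collections import defaultdict
from itertools import product

# Training set: (user, hotel, rating) triples. Toy data.
train = [
    ("u1", "h1", 5), ("u1", "h2", 3),
    ("u2", "h1", 4), ("u2", "h3", 2),
    ("u3", "h2", 4), ("u3", "h3", 5),
]

users = sorted({u for u, _, _ in train})
hotels = sorted({h for _, h, _ in train})
rated = {(u, h) for u, h, _ in train}

# Anti-train set: every user-hotel pair with no rating in the training set.
anti_train = [(u, h) for u, h in product(users, hotels) if (u, h) not in rated]

def predict(user, hotel):
    """Stand-in predictor (assumption): the hotel's mean training rating."""
    values = [r for _, h, r in train if h == hotel]
    return sum(values) / len(values)

def top_n(pairs, n=10):
    """Predict a rating for every pair, then keep the n best hotels per user."""
    by_user = defaultdict(list)
    for user, hotel in pairs:
        by_user[user].append((predict(user, hotel), hotel))
    return {
        user: [hotel for _, hotel in sorted(preds, reverse=True)[:n]]
        for user, preds in by_user.items()
    }

recommendations = top_n(anti_train, n=10)
print(recommendations)
```

With only three hotels each user has a single unrated hotel, so each list has one entry; on the full data set the same sort-and-slice step yields ten hotels per user.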
How do we evaluate the recommendations? How do we decide whether the system works well?
For evaluation, I randomly split the data into training and test subsets.
I built the algorithm on the training set.
For each user-hotel pair in the test set, the model predicted a rating, and I compared the predicted rating with the real one.
As the evaluation metric I used mean absolute error (MAE), which measures the average absolute difference between predicted and real ratings. For my problem I got an MAE of 0.68, which means that on average the predicted rating is off from the real rating by 0.68 in either direction.
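As a small worked example of the metric, here is a sketch of a random split and an MAE calculation. The helper names, the 80/20 split, the seed, and the sample ratings are assumptions for illustration, not values from the project.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and cut them into a training part and a test part."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]

def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all test pairs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Ten toy (user, hotel, rating) rows split 80/20.
rows = [(f"u{i}", f"h{i % 3}", 1 + i % 5) for i in range(10)]
train_rows, test_rows = train_test_split(rows)

# Made-up real and predicted ratings for four test pairs.
actual = [5, 3, 4, 2]
predicted = [4.5, 3.5, 3.0, 2.5]
mae = mean_absolute_error(actual, predicted)
print(mae)  # 0.625
```

The individual errors are 0.5, 0.5, 1.0, and 0.5, so the mean is 2.5 / 4 = 0.625. An MAE of 0.68 on a 1 to 5 scale reads the same way.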
On the slide you can see my web app, which I will show you later.
OK, this works if the user has left a review. What if it’s a new user?
In this case, the user can choose a hotel they like from a list and get recommendations of the most similar hotels.
For this problem I used the k-nearest neighbors algorithm.
We look at the users who liked the target hotel and at the other hotels they liked, and we find the k nearest hotels. In the example, k is 5.
For the cold start problem, when both the user and the hotel are new and have no reviews, the system simply returns the top 10 most popular hotels.
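One way to picture the hotel-similarity search is the sketch below. It assumes each hotel is represented by the ratings users gave it and that "nearest" means highest cosine similarity, with missing ratings treated as zero; the distance measure, the toy data, and the function names are assumptions rather than details from the project.

```python
from math import sqrt

# Toy ratings: user -> {hotel: rating}.
ratings = {
    "u1": {"h1": 5, "h2": 5, "h3": 1},
    "u2": {"h1": 4, "h2": 5, "h4": 2},
    "u3": {"h1": 1, "h3": 5, "h4": 4},
}

def hotel_vectors(ratings):
    """Flip the data: hotel -> {user: rating}."""
    vectors = {}
    for user, user_ratings in ratings.items():
        for hotel, rating in user_ratings.items():
            vectors.setdefault(hotel, {})[user] = rating
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors (missing entries count as 0)."""
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_hotels(target, ratings, k=5):
    """Return the k hotels whose rating vectors are closest to the target's."""
    vectors = hotel_vectors(ratings)
    scores = [
        (cosine(vectors[target], vec), hotel)
        for hotel, vec in vectors.items()
        if hotel != target
    ]
    return [hotel for _, hotel in sorted(scores, reverse=True)[:k]]

print(nearest_hotels("h1", ratings, k=2))  # ['h2', 'h4']
```

Here h2 comes first because the two users who rated h1 highly also rated h2 highly.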
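A popularity fallback like this can be very small. The sketch below assumes that "popular" means the most reviews, with the mean rating as a tie-breaker; that definition and the function name `most_popular` are assumptions, since popularity could be defined in other ways.

```python
from collections import defaultdict

def most_popular(reviews, n=10):
    """Rank hotels by review count, breaking ties by mean rating."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for _, hotel, rating in reviews:
        counts[hotel] += 1
        totals[hotel] += rating
    ranked = sorted(
        counts,
        key=lambda h: (counts[h], totals[h] / counts[h]),
        reverse=True,
    )
    return ranked[:n]

# Toy (user, hotel, rating) reviews.
reviews = [
    ("u1", "h1", 5), ("u2", "h1", 4), ("u3", "h1", 3),
    ("u1", "h2", 5),
    ("u2", "h3", 3), ("u3", "h3", 2),
]
popular = most_popular(reviews, n=2)
print(popular)  # ['h1', 'h3']
```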
You can check my web app: here
I created a Streamlit web app to demonstrate how my recommendation system works. If you are a new user, you can search by hotel similarity: just choose a hotel you like and the app shows you the 10 most similar hotels. If you are an existing user, you can choose your name from the list and the system gives you the top 10 recommendations based on your previous reviews. For each hotel you can see its name, a picture, and a link to it.
To summarize my work:
- I implemented a collaborative filtering recommendation system and produced the top 10 recommendations for each user
- I resolved the cold start problem
- I created a web app to demonstrate how my recommendation system works
What is the possible impact of my project?
This project can help hotels sell more and make their clients happier. Because clients get exactly what they want, they are going to buy more often.
So hotel companies sell more and clients are happier: a win-win situation!