Recommender systems are a popular topic in e-commerce. Several frameworks and engines make it easier to implement one. One of these is LightFM, a capable Python framework.
This project is part of a 4-person team project. Each person worked on a specific package; mine was LightFM.
You can see our final report in Final_Report_Group8.pdf.
The goal is to implement a recommender system with LightFM on three Kaggle datasets. The three datasets and their related files are:
Restaurant dataset files:
- collabrative_restaurant.py (main file)
- geoplaces2.json
- rating_final_U1011.json
- userprofile.json
- resturant_json.py
Book dataset files:
- book.py (main file)
- test_train.py
- book_wrap_vs_BPR.py
- book_eval_K_OS.py
- mainbookup_lim1.json (10,000 rows picked at random from the main data)
Clothing dataset files:
- size_test&trainWRAPvsBPR.py
- size_test&trainWRAP.py
- random_pick.py
- renttherunway_lim.json (10,000 rows picked at random from the main data)
- sizeRs.py (main file)
I used Conda in PyCharm and installed LightFM from conda-forge with one of the following commands:
conda install -c conda-forge lightfm
conda install -c conda-forge/label/gcc7 lightfm
conda install -c conda-forge/label/cf201901 lightfm
conda install -c conda-forge/label/cf202003 lightfm
One of the most important challenges is how to feed the data to the package. First, read the JSON files and create a LightFM dataset:
import json
from lightfm.data import Dataset

# Load the ratings, the user profiles, and the restaurant (item) records.
with open('rating_final_U1011.json') as f:
    data = json.load(f)
with open('userprofile.json') as ff:
    data_User = json.load(ff)
with open('geoplaces2.json') as df:
    data_item = json.load(df)

dataset = Dataset()
Then fit the dataset with your data:
dataset.fit(
    (x['userID'] for x in data),
    (x['placeID'] for x in data),
    user_features=(x['budget'] for x in data_User),
    item_features=(x['price'] for x in data_item),
)
Now it's possible to create the matrices:
# User-item interaction matrix (and the matching weights matrix).
(interactions, weights) = dataset.build_interactions((x['userID'], x['placeID']) for x in data)
print(repr(interactions))
# User feature matrix, built from each user's budget.
user_interactions = dataset.build_user_features((x['userID'], [x['budget']]) for x in data_User)
print(repr(user_interactions))
# Item feature matrix, built from each place's price level.
item_interactions = dataset.build_item_features((x['placeID'], [x['price']]) for x in data_item)
print(repr(item_interactions))
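To make the matrix-building step more concrete, here is a minimal sketch in plain Python of the idea behind it: each distinct user and item id gets an integer index, and every rating marks one cell of a users-by-items matrix. It is a simplified illustration, not LightFM's internal code. The records and the `build_index` helper are hypothetical, and a dense list of lists stands in for the sparse matrix that LightFM returns.

```python
# Hypothetical records shaped like the entries of rating_final_U1011.json.
ratings = [
    {"userID": "U1001", "placeID": 135085},
    {"userID": "U1002", "placeID": 132825},
    {"userID": "U1001", "placeID": 132825},
]


def build_index(values):
    """Assign each distinct value an integer index in first-seen order."""
    index = {}
    for value in values:
        index.setdefault(value, len(index))
    return index


user_index = build_index(r["userID"] for r in ratings)
item_index = build_index(r["placeID"] for r in ratings)

# Dense stand-in for the sparse interactions matrix:
# rows are users, columns are items, 1 marks an observed interaction.
matrix = [[0] * len(item_index) for _ in user_index]
for r in ratings:
    matrix[user_index[r["userID"]]][item_index[r["placeID"]]] = 1

print(user_index)
print(item_index)
print(matrix)
```

The `repr` of the real `interactions` object reports the same kind of shape (number of users by number of items), which is a quick way to check that the ids were mapped as expected.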
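LightFM is a model-based package: define the model, then fit it:
from lightfm import LightFM

alpha = 1e-05
epochs = 70
num_components = 32
model = LightFM(no_components=num_components,
                loss='warp',
                learning_schedule='adadelta',
                user_alpha=alpha,
                item_alpha=alpha)
# Assumed fit call (the original shows only the definition); it uses the matrices built earlier.
model.fit(interactions, user_features=user_interactions, item_features=item_interactions, epochs=epochs)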
To test and validate the model, split the data into training and test sets, as in test_train.py.
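The contents of test_train.py are not reproduced here, so the following is only a sketch of the general idea: shuffle the (user, item) pairs and hold out a fraction for testing. The `split_interactions` helper, the 20% test fraction, and the sample pairs are assumptions made for illustration. LightFM also provides `lightfm.cross_validation.random_train_test_split`, which does a comparable split directly on the interactions matrix.

```python
import random


def split_interactions(pairs, test_fraction=0.2, seed=42):
    """Shuffle (user, item) pairs and hold out a fraction for testing."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 20 made-up (user, item) pairs, for illustration only.
pairs = [(f"U{u}", f"P{p}") for u in range(5) for p in range(4)]
train, test = split_interactions(pairs, test_fraction=0.2)

print(len(train), len(test))
```

Keeping the two sets disjoint matters: if a test pair also appears in the training data, the evaluation overstates how well the model ranks unseen items.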
Testing learning_schedule adadelta vs. adagrad on the clothing dataset:
[image: Screenshot (994)]
Testing loss WARP vs. BPR on the clothing dataset:
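Comparisons like the ones above need a ranking metric to score each configuration. The metric used in the original scripts is not stated here. The sketch below assumes precision at k, a common choice that LightFM also exposes as `lightfm.evaluation.precision_at_k`. The function, the item ids, and the two rankings are made up to show the calculation and are not results from this project.

```python
def precision_at_k(ranked_items, relevant_items, k=5):
    """Fraction of the top-k ranked items that appear in the held-out set."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


# Hypothetical held-out items for one user and two hypothetical model rankings.
held_out = {"P3", "P7"}
ranking_a = ["P3", "P1", "P7", "P9", "P2"]
ranking_b = ["P1", "P9", "P3", "P4", "P2"]

score_a = precision_at_k(ranking_a, held_out, k=5)
score_b = precision_at_k(ranking_b, held_out, k=5)
print(score_a, score_b)
```

In practice the score is averaged over all test users, and the same train/test split should be used for both configurations so that the numbers are comparable.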
http://www2.informatik.uni-freiburg.de/~cziegler/BX/
https://making.lyst.com/lightfm/docs/home.html
https://github.com/lyst/lightfm
Reach out to me at ghazalze@yahoo.com.
Thanks @alirezaomidi