I'm working on implementing the stochastic gradient descent algorithm for recommender systems (see Funk) using sparse matrices with Scipy.
This is what a first basic implementation looks like:
N = self.model.shape[0]  # number of users
M = self.model.shape[1]  # number of items
self.p = np.random.rand(N, K)
self.q = np.random.rand(M, K)
rows, cols = self.model.nonzero()
for step in xrange(steps):
    for u, i in zip(rows, cols):
        e = self.model - np.dot(self.p, self.q.T)  # calculate error for gradient
        p_temp = learning_rate * (e[u, i] * self.q[i, :] - regularization * self.p[u, :])
        self.q[i, :] += learning_rate * (e[u, i] * self.p[u, :] - regularization * self.q[i, :])
        self.p[u, :] += p_temp
I have two questions, and I'm hoping some seasoned Python coders / recommender-systems experts here can provide me with some insight:
1) How often should the error e be recomputed? It is not clear from Funk's and Koren's papers whether it should be once per rating or once per factor. That is, should I take it out of the inner loop, leave it where it is, or move it even deeper (inside the iteration over the k factors)?
2) Unfortunately, my code is pretty slow: about 20 minutes for ONE iteration over a 100k-ratings dataset, which means computing the factorization over the whole dataset would take about 10 hours. I suspect this is due to the for loop over the sparse matrix. I've tried expressing the q and p updates using fancy indexing, but since I'm still pretty new to scipy and numpy, I couldn't figure out a better way to do it. Do you have any pointers on how I could avoid iterating over the rows and columns of the sparse matrix explicitly?
Which matrix is the scipy sparse matrix? – insanely_sin, Nov 15, 2018 at 19:52
2 Answers
There may be other optimizations available, but this one alone should go a long way...
When you do e = self.model - np.dot(self.p, self.q.T), the resulting e is a dense matrix the same size as your model. You later only use e[u, i] in your loop, so you are throwing away what are probably millions of computed values. If you simply replace that line with:

e = self.model[u, i] - np.dot(self.p[u, :], self.q[i, :])

and then replace e[u, i] with e, you will spare yourself a huge amount of both computations and memory accesses, which should boost performance by a lot.
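Put together, the revised loop might look like the following sketch, written as a standalone function (sgd_step is a hypothetical name, and the hyperparameter defaults are placeholders):

```python
import numpy as np

def sgd_step(model, p, q, learning_rate=0.01, regularization=0.02):
    """One pass over the observed ratings, computing the error per rating."""
    rows, cols = model.nonzero()
    for u, i in zip(rows, cols):
        # Scalar error for this single rating -- no dense N x M matrix is built
        e = model[u, i] - np.dot(p[u, :], q[i, :])
        p_grad = learning_rate * (e * q[i, :] - regularization * p[u, :])
        q[i, :] += learning_rate * (e * p[u, :] - regularization * q[i, :])
        p[u, :] += p_grad
    return p, q
```

The update order mirrors the question's code: p_grad is computed from the old values, q is updated using the old p, and only then is p updated.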
Thank you. That's a great improvement. I also made some other improvements: I transposed q from the very beginning and accessed the column directly, instead of fetching and transposing the row, and I folded the learning rate into the error, because it was being applied three times. All in all, I managed to get one step down to only about 10 seconds now! Do you think it's worth optimising further with Cython? – Ana Todor, Nov 20, 2013 at 16:21
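For reference, those two tweaks might look like this sketch (sgd_pass is a hypothetical name; the ratings are assumed to be unpacked into coordinate arrays, and q is stored transposed as qT with shape K x M):

```python
import numpy as np

def sgd_pass(rows, cols, vals, p, qT, learning_rate=0.01, regularization=0.02):
    """One SGD pass with q stored transposed and the learning rate folded in."""
    lr_reg = learning_rate * regularization
    for u, i, r in zip(rows, cols, vals):
        # Fold the learning rate into the error once, instead of applying it three times
        e = learning_rate * (r - np.dot(p[u, :], qT[:, i]))
        p_grad = e * qT[:, i] - lr_reg * p[u, :]
        qT[:, i] += e * p[u, :] - lr_reg * qT[:, i]
        p[u, :] += p_grad
```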
Yes, it almost certainly is... The problem is that you are updating your p and q vectors as you progress over the non-zero entries of model. This makes it very hard to get rid of the for loops, because p and q are different for every entry of model. If you could delay updating p and q until the full model has been processed, you would get different results, but you could speed things up a lot without Cython or C. With your algorithm, I cannot think of any way around that... – Jaime, Nov 20, 2013 at 16:33
Thank you for the insight. Time to learn myself some Cython then! I'll be tackling the Netflix dataset soon, so I need all the speed I can get. I'm going to mark this as answered then. – Ana Todor, Nov 20, 2013 at 16:45
Besides the recommendation that Jaime offered, it would be very easy to get performance improvements by using Numba.
There is a tutorial on Numba and matrix factorization here. When Numba works, it requires much less effort than rewriting your code in Cython. In a nutshell, you need to import numba:
from numba.decorators import jit
And then add a decorator to your matrix factorization function, by simply putting @jit on the line before the method definition.
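A self-contained sketch of that approach (sgd_epoch is a hypothetical name; in current Numba versions the import is simply from numba import jit, and Numba works best on plain functions over NumPy arrays, so the sparse ratings are assumed to be unpacked into coordinate arrays first — the try/except just lets the sketch run even where Numba is not installed):

```python
import numpy as np

try:
    from numba import jit  # current spelling of the import
except ImportError:        # fall back to plain Python if Numba is absent
    def jit(*args, **kwargs):
        return args[0] if args and callable(args[0]) else (lambda f: f)

@jit(nopython=True)
def sgd_epoch(rows, cols, vals, p, q, learning_rate, regularization):
    """One compiled pass of SGD over the ratings in coordinate form."""
    for idx in range(rows.shape[0]):
        u = rows[idx]
        i = cols[idx]
        e = vals[idx] - np.dot(p[u], q[i])
        p_grad = learning_rate * (e * q[i] - regularization * p[u])
        q[i] += learning_rate * (e * p[u] - regularization * q[i])
        p[u] += p_grad
```

The first call pays a one-time compilation cost; subsequent epochs run at close to native speed.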