SKLearn NMF Vs Custom NMF

0 votes

I am trying to build a recommendation system using non-negative matrix factorization (NMF). Using scikit-learn's NMF as the model, I fit my data, which gives a certain loss (i.e., reconstruction error). Then I generate recommendations for new data using the inverse_transform method.

Now I do the same using another model I built in TensorFlow. The reconstruction error after training is close to the one obtained with sklearn earlier. However, neither the latent factors nor the final recommendations are similar between the two models.

One difference between the 2 approaches that I am aware of: in sklearn I am using the Coordinate Descent solver, whereas in TensorFlow I am using the AdamOptimizer, which is based on Gradient Descent. Everything else seems to be the same:

  1. The loss function used is the Frobenius norm
  2. No regularization in either case
  3. Tested on the same data using the same number of latent dimensions

Relevant code that I am using:

1. scikit-learn approach:

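from sklearn.decomposition import NMF

# Coordinate descent solver, Frobenius loss, no regularization (alpha=0.0)
model = NMF(alpha=0.0, init='random', l1_ratio=0.0, max_iter=200,
            n_components=2, random_state=0, shuffle=False, solver='cd',
            tol=0.0001, verbose=0)
model.fit(data)
# Project the data onto the latent factors, then map back to the input space
result = model.inverse_transform(model.transform(data))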

2. TensorFlow approach:

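import tensorflow as tf  # TensorFlow 1.x API

# Non-negative factors: non-negative random init, clipped at zero after each update
w = tf.get_variable('w', initializer=tf.abs(tf.random_normal((data.shape[0], 2))),
                    constraint=lambda p: tf.maximum(0., p))
h = tf.get_variable('h', initializer=tf.abs(tf.random_normal((2, data.shape[1]))),
                    constraint=lambda p: tf.maximum(0., p))
# Frobenius norm of the residual; x is the input tensor (not shown in this snippet)
loss = tf.sqrt(tf.reduce_sum(tf.squared_difference(x, tf.matmul(w, h))))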
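
To compare the two models on equal footing, the same loss can be computed outside either framework. Below is a minimal NumPy sketch of the Frobenius norm of the residual, which is the quantity the TensorFlow loss above computes. The function name `frobenius_error` and the toy matrices are hypothetical and only serve as an illustration. As far as I know, scikit-learn exposes the same quantity for a fitted model as `model.reconstruction_err_` when the Frobenius loss is used.

```python
import numpy as np

def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - W @ H."""
    return np.sqrt(np.sum((X - W @ H) ** 2))

# Hypothetical toy matrices, only to exercise the function
rng = np.random.default_rng(0)
X = rng.random((4, 3))
W = rng.random((4, 2))
H = rng.random((2, 3))
err = frobenius_error(X, W, H)
```

Feeding the factors from both models into the same function removes any doubt about whether the two reported losses are defined the same way.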

My question: if the recommendations generated by these 2 approaches do not match, how can I determine which are the right ones? For my use case, sklearn's NMF gives good results, but the TensorFlow implementation does not. How can I achieve the same results with my custom implementation?

May 9, 2018 in Python by aryya
• 7,460 points
1,780 views

1 answer to this question.

0 votes
The choice of optimizer has a big impact on the quality of training. Some very simple models (I'm thinking of GloVe, for example) work with some optimizers and not at all with others. To answer your questions:

How can I determine which are the right ones?

Evaluation is as important as the design of your model, and just as hard. You can run both models on several available datasets and score them with the same metrics. You could also A/B test them in a real application to estimate how relevant the recommendations are.
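
As a rough illustration of scoring both models with the same metrics, here is a small NumPy sketch of two common offline measures: RMSE on held-out entries and precision at k. The function names, the threshold, and the toy ratings are all hypothetical; which metric is appropriate depends on your use case.

```python
import numpy as np

def heldout_rmse(X_true, X_pred, heldout_mask):
    """RMSE computed only on entries hidden from the model during training."""
    diff = (X_true - X_pred)[heldout_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def precision_at_k(true_scores, pred_scores, k, threshold):
    """Fraction of the k highest-predicted items whose true score is >= threshold."""
    top_k = np.argsort(pred_scores)[::-1][:k]
    return float(np.mean(true_scores[top_k] >= threshold))

# Hypothetical example: one user's true ratings and two models' predictions
true_scores = np.array([5.0, 1.0, 4.0, 2.0, 3.0])
pred_a = np.array([4.5, 1.5, 4.0, 2.5, 3.0])  # stand-in for model A's reconstruction
pred_b = np.array([2.0, 4.0, 1.0, 5.0, 3.0])  # stand-in for model B's reconstruction
heldout = np.array([True, False, True, False, False])

p_a = precision_at_k(true_scores, pred_a, k=2, threshold=4.0)
p_b = precision_at_k(true_scores, pred_b, k=2, threshold=4.0)
rmse_a = heldout_rmse(true_scores, pred_a, heldout)
```

The model that scores better on data it did not see during training is the one to prefer, regardless of which optimizer produced it.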

How can I achieve the same using my custom implementation?

First, try to find a coordinate descent optimizer for TensorFlow and make sure every step you implemented is exactly the same as in scikit-learn. Then, if you can't reproduce the results, try different solutions (why not try a simple gradient descent optimizer first?) and take advantage of the modularity that TensorFlow offers!
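
To make the "simple gradient descent" suggestion concrete, here is a minimal sketch of projected gradient descent for NMF in plain NumPy. This is not your TensorFlow model and not scikit-learn's coordinate descent solver; it is only a small baseline to compare against. The function name `nmf_projected_gd`, the learning rate, and the iteration count are assumptions chosen for small toy matrices, and larger or differently scaled data would likely need a smaller step size.

```python
import numpy as np

def nmf_projected_gd(X, n_components=2, lr=0.01, n_iter=5000, seed=0):
    """Factor X ~ W @ H with W, H >= 0 by projected gradient descent.

    Minimizes 0.5 * ||X - W @ H||_F^2. After every gradient step the
    factors are clipped at zero, which projects them back onto the
    non-negative orthant.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = np.abs(rng.normal(size=(n_rows, n_components)))
    H = np.abs(rng.normal(size=(n_components, n_cols)))
    for _ in range(n_iter):
        R = W @ H - X          # residual of the current reconstruction
        grad_W = R @ H.T       # gradient of the loss with respect to W
        grad_H = W.T @ R       # gradient of the loss with respect to H
        W = np.maximum(0.0, W - lr * grad_W)
        H = np.maximum(0.0, H - lr * grad_H)
    return W, H

# Hypothetical toy data: an exactly rank-2 non-negative matrix
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf_projected_gd(X)
error = np.linalg.norm(X - W @ H)
```

If a baseline like this gets close to scikit-learn's reconstruction error on your data while the TensorFlow model does not, the problem is more likely in the TensorFlow training loop than in the choice of optimizer.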

Finally, if the recommendations from your implementation are that bad, I suspect there is an error in it. Try comparing it against some existing implementations.
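
When you compare against another implementation, keep in mind that different latent factors are not a sign of a bug on their own, because an NMF solution is not unique: rescaling or reordering the latent dimensions gives different factors with exactly the same product. The sketch below shows this with hypothetical toy matrices, so it makes more sense to compare the reconstructions W @ H (and the recommendations derived from them) than the factors themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((5, 2))
H = rng.random((2, 4))

# Rescale the two latent dimensions and swap their order.
D = np.diag([2.0, 0.5])       # positive scaling keeps both factors non-negative
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])    # permutation of the latent dimensions
W2 = W @ D @ P
H2 = P.T @ np.linalg.inv(D) @ H

same_reconstruction = np.allclose(W @ H, W2 @ H2)
same_factors = np.allclose(W, W2)
```

Here `same_reconstruction` holds while `same_factors` does not, even though both factor pairs are valid non-negative factorizations of the same matrix.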
answered May 9, 2018 by charlie_brown
• 7,720 points
