Hello, I wanted to open this up as a discussion before making an issue out of it. If you run the code below in Google Colab, my input space does not seem to be transformed, unless I am doing something wrong.
```python
import metric_learn
import numpy as np
import pandas as pd
from sklearn.datasets import make_friedman1

lmnn = metric_learn.LMNN()


def friedman_np_to_df(X, y):
    return pd.DataFrame(X, columns=['x0', 'x1', 'x2', 'x3', 'x4']), pd.Series(y)


# Make training set. We don't care about y, so call it NA.
X_train, NA1 = make_friedman1(n_samples=1000, n_features=5, random_state=1)
X_train, NA1 = friedman_np_to_df(X_train, NA1)

# Categorize training set based on x0
domain_list = []
for i in range(len(X_train)):
    if X_train.iloc[i]['x0'] < 0.6:
        domain_list.append(1)
    else:
        domain_list.append(0)
X_train['domain'] = domain_list

# Restrict the training set to domain == 1 (x0 < 0.6),
# but also add n out-of-domain samples
n = 10
out_of_domain = X_train[X_train['domain'] == 0][:n]
X_train = X_train[X_train['domain'] == 1]
X_train = pd.concat([out_of_domain, X_train])
y_train = X_train['domain'].copy()
X_train = X_train.drop(columns=['domain'])

# Make testing set with a different random_state
X_test, NA2 = make_friedman1(n_samples=1000, n_features=5, random_state=3)
X_test, NA2 = friedman_np_to_df(X_test, NA2)

# Categorize testing set based on x0
domain_list = []
for i in range(len(X_test)):
    if X_test.iloc[i]['x0'] < 0.6:
        domain_list.append(1)
    else:
        domain_list.append(0)
X_test['domain'] = domain_list
y_test = X_test['domain'].copy()
X_test = X_test.drop(columns=['domain'])

lmnn.fit(np.array(X_train), np.array(y_train))

# Transform our input space
X_lmnn = lmnn.transform(np.array(X_test))
```
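As an aside, the row-by-row labeling loop can be written as one vectorized pandas expression. This is a sketch on a tiny made-up frame (`X` stands in for `X_train`), not the Friedman data itself:

```python
import pandas as pd

# Hypothetical mini-frame standing in for X_train
X = pd.DataFrame({'x0': [0.2, 0.7, 0.5]})

# 1 where x0 < 0.6, else 0 -- equivalent to the explicit loop
X['domain'] = (X['x0'] < 0.6).astype(int)

print(X['domain'].tolist())  # [1, 0, 1]
```

The boolean mask `X['x0'] < 0.6` is cast to integers in one shot, which avoids the per-row `iloc` lookups.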
Then in a new cell run the following. You can see they are the same:
```python
X_lmnn[:5], X_test[:5]
```
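Rather than eyeballing the first five rows, a numerical comparison makes the "nothing changed" claim precise. This is a generic numpy sketch, with `a` and `b` standing in for `np.array(X_test)` and `X_lmnn`:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])  # stand-in for np.array(X_test)
b = a.copy()                            # stand-in for X_lmnn

# True if the "transformed" array equals the input
# (up to floating-point tolerance), i.e. the transform did nothing
print(np.allclose(a, b))  # True
```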
@angelotc thanks for reporting this! Which version of metric-learn are you using?
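For reference, here is an environment-agnostic way to check the installed version from the standard library (`importlib.metadata` is available from Python 3.8):

```python
from importlib.metadata import version, PackageNotFoundError


def installed_version(pkg):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None


# Prints e.g. "0.6.2", or None if metric-learn is not installed
print(installed_version("metric-learn"))
```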
Running your script on Colab with the latest metric-learn version from pip, I did not obtain the same `X_lmnn` and `X_test`. (Note that I got the same result, i.e. different arrays, at least on those first 5 rows, with the LMNN PR #309.)
```python
X_lmnn[:5], X_test[:5]
```
```
(array([[ 2.48105611,  0.90078992, -0.12757663,  0.82172106,  1.26216829],
        [ 4.23874062, -0.04583753,  0.29213554, -0.02094142,  0.68556641],
        [ 0.31345599,  0.586104  ,  0.04540811,  0.42523969,  0.72904908],
        [ 2.88944521, -0.1862227 ,  0.34355608,  0.29715756,  0.51748561],
        [ 1.37856497,  0.9168011 ,  0.07280986,  0.203493  ,  0.66506381]]),
          x0        x1        x2        x3        x4
 0  0.550798  0.708148  0.290905  0.510828  0.892947
 1  0.896293  0.125585  0.207243  0.051467  0.440810
 2  0.029876  0.456833  0.649144  0.278487  0.676255
 3  0.590863  0.023982  0.558854  0.259252  0.415101
 4  0.283525  0.693138  0.440454  0.156868  0.544649)
```
Also, with the `verbose` flag enabled, iterations are happening and the algorithm appears to be training (the objective value improves).