Potential bug? LMNN not transforming my input space #318

Unanswered
angelotc asked this question in Q&A

Hello, I wanted to open this up for discussion before making an issue out of it. If you run this code in Google Colab, LMNN does not seem to transform my input space, unless I am doing something wrong.

import metric_learn
import numpy as np
import pandas as pd
from sklearn.datasets import make_friedman1
lmnn = metric_learn.LMNN()
# fit the data!
def friedman_np_to_df(X, y):
    return pd.DataFrame(X, columns=['x0', 'x1', 'x2', 'x3', 'x4']), pd.Series(y)
# Make training set. We don't care about Y so call it NA.
X_train, NA1 = make_friedman1(n_samples=1000, n_features=5, random_state = 1) 
X_train, NA1 = friedman_np_to_df(X_train,NA1)
#categorize training set based off of x0
domain_list = []
for i in range(len(X_train)):
    if X_train.iloc[i]['x0'] < 0.6:
        domain_list.append(1)
    else:
        domain_list.append(0)
X_train['domain'] = domain_list
# Set training set to where domain == 1 (x0 < 0.6), but also add n (here 10) out-of-domain samples (X_train['domain'] == 0)
n = 10
out_of_domain = X_train[X_train['domain'] == 0][:n]
X_train = X_train[X_train['domain']==1]
X_train = pd.concat([out_of_domain, X_train])
y_train = X_train.copy()
X_train = X_train.drop(columns = ['domain'])
y_train = y_train['domain']
# Make testing set with a different random_state
X_test, NA2 = make_friedman1(n_samples=1000, n_features=5, random_state = 3)
X_test, NA2 = friedman_np_to_df(X_test,NA2)
#categorize testing set based off of x0
domain_list = []
for i in range(len(X_test)):
    if X_test.iloc[i]['x0'] < 0.6:
        domain_list.append(1)
    else:
        domain_list.append(0)
X_test['domain'] = domain_list
y_test = X_test['domain'].copy()
X_test = X_test.drop(columns = ['domain'])
lmnn.fit(np.array(X_train), np.array(y_train))
# transform our input space
X_lmnn = lmnn.transform(np.array(X_test))

Then in a new cell run the following. You can see they are the same:

X_lmnn[:5], X_test[:5]
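
Comparing the first rows by eye can be misleading, so a numerical check may help here. The following is only a minimal sketch: `transform_changed_input` is a hypothetical helper (not part of metric-learn), and the toy matrix `L` merely stands in for a learned linear map. With the script above, one would call it as `transform_changed_input(X_test, X_lmnn)`.

```python
import numpy as np


def transform_changed_input(X_before, X_after, atol=1e-8):
    """Return True if X_after differs from X_before beyond a small tolerance.

    Hypothetical helper for illustration; accepts arrays or DataFrames.
    """
    before = np.asarray(X_before, dtype=float)
    after = np.asarray(X_after, dtype=float)
    if before.shape != after.shape:
        # Different shapes mean the data was certainly changed.
        return True
    return not np.allclose(before, after, atol=atol)


# Toy stand-ins for X_test and a learned linear map (assumed values).
X = np.array([[0.5, 0.7], [0.9, 0.1]])
L = np.array([[2.0, 0.0], [0.5, 1.0]])

print(transform_changed_input(X, X))        # identical data
print(transform_changed_input(X, X @ L.T))  # linearly transformed data
```

If the helper returns False for the real `X_test` and `X_lmnn`, the fitted transformation is effectively the identity on this data.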

Replies: 1 comment


@angelotc thanks for reporting this! What version of metric-learn are you using?
Running your script on Colab with the latest metric-learn version from pip, I did not get identical X_lmnn and X_test. Note that I also got different arrays, at least for these first 5 rows, with the LMNN PR #309:

X_lmnn[:5], X_test[:5]
(array([[ 2.48105611,  0.90078992, -0.12757663,  0.82172106,  1.26216829],
        [ 4.23874062, -0.04583753,  0.29213554, -0.02094142,  0.68556641],
        [ 0.31345599,  0.586104  ,  0.04540811,  0.42523969,  0.72904908],
        [ 2.88944521, -0.1862227 ,  0.34355608,  0.29715756,  0.51748561],
        [ 1.37856497,  0.9168011 ,  0.07280986,  0.203493  ,  0.66506381]]),
          x0        x1        x2        x3        x4
 0  0.550798  0.708148  0.290905  0.510828  0.892947
 1  0.896293  0.125585  0.207243  0.051467  0.440810
 2  0.029876  0.456833  0.649144  0.278487  0.676255
 3  0.590863  0.023982  0.558854  0.259252  0.415101
 4  0.283525  0.693138  0.440454  0.156868  0.544649)

Also, with the "verbose" flag enabled, I can see that iterations are happening and the algorithm seems to be training (the objective value improves).
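
For the version question, one way to report the installed release is sketched below. It relies only on the standard library (`importlib.metadata`, Python 3.8+); `installed_version` is a hypothetical helper name, and the sketch assumes the package was installed under the distribution name `metric-learn`, as it is on pip.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when metric-learn is not installed.
print(installed_version("metric-learn"))
```

Running this in the same Colab session as the script shows which release actually produced the results.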
