
I am using Scikit-learn to train a classification model. I have both discrete and continuous features in my training data.

I want to do feature selection using mutual information.

Features 1, 2 and 3 are discrete. To this end, I try the code below:

mutual_info_classif(x, y, discrete_features=[1, 2, 3])

but it does not work; it gives me the error:

 ValueError: could not convert string to float: 'INT'
asked Nov 25, 2018 at 17:34
  • I have applied the code that W.P. McNeill proposed in stackoverflow.com/q/43643278, but it did not work. Commented Nov 25, 2018 at 17:42
  • We need more information in order to be able to help you. It might be useful if you copy a simplified example of your code. Commented Nov 25, 2018 at 18:14
  • This is my code: from sklearn.feature_selection import mutual_info_classif; res_M_train = mutual_info_classif(data_train, Y_train, discrete_features=[1, 2, 3]). Thank you. Commented Nov 25, 2018 at 18:23
  • My data is like this: [0.983874,tcp,http,FIN,10,8,816,1172,17.278635,62,252,5976.375,8342.53125,2,2,109.319333,124.932859,5929.211713,192.590406,255,794167371,1624757001,255,0.206572,0.108393,0.098179,82,147,1,184,2,1,1,1,1,2,0,0,1,1,3,0,]. As you can see, my first three features are categorical, and I want to calculate the mutual information of each feature: from sklearn.feature_selection import mutual_info_classif; res_M_train = mutual_info_classif(data_train, Y_train, discrete_features=[1, 2, 3]) Commented Nov 25, 2018 at 18:28

3 Answers


A simple example with the mutual information classifier:

import numpy as np
from sklearn.feature_selection import mutual_info_classif
X = np.array([[0, 0, 0],
              [1, 1, 0],
              [2, 0, 1],
              [2, 0, 1],
              [2, 0, 1]])
y = np.array([0, 1, 2, 2, 1])
mutual_info_classif(X, y, discrete_features=True)
# result: array([0.67301167, 0.22314355, 0.39575279])
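
For mixed data like the asker's, note that discrete_features also accepts column indices or a boolean mask instead of True, so only some columns are treated as discrete. A minimal sketch with invented data, assuming column 0 is continuous and columns 1 and 2 are discrete:

import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.RandomState(0)
# column 0: continuous; columns 1 and 2: discrete integer codes
X = np.column_stack([rng.rand(100),
                     rng.randint(0, 3, 100),
                     rng.randint(0, 2, 100)])
y = rng.randint(0, 2, 100)

# flag only the discrete columns; continuous ones use the k-NN estimator
mutual_info_classif(X, y, discrete_features=[1, 2], random_state=0)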
answered Nov 25, 2018 at 18:28

3 Comments

But I have mixed features, like this: X = np.array([[0, 'a', 0], [1, 'b', 0], [2, 'c', 1], [2, 'd', 1], [2, 'a', 1]])
This is a row from my data: [8e-06,"udp","-","INT",2,0,1762,0,125000.0003,254,0,881000000.0,0.0,0,0,0.008,0.0,0.0,0.0,0,0,0,0,0.0,0.0,0.0,881,0,0,0,2,2,1,1,1,2,0,0,0,1,2,0]. It seems that the three string features ("udp", "-", "INT") are causing the problem.
If you're using categories and you have string information, take a look at get_dummies; see the sketch below.
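
A minimal sketch of that suggestion, assuming the string columns are nominal: pd.get_dummies one-hot encodes them into 0/1 columns, which mutual_info_classif can then treat as discrete. The frame and column names below are invented for illustration:

import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# toy frame mimicking the asker's data: one numeric and three string columns
df = pd.DataFrame({'dur': [0.9, 0.1, 0.5, 0.2, 0.7, 0.3],
                   'proto': ['tcp', 'udp', 'tcp', 'udp', 'tcp', 'udp'],
                   'service': ['http', '-', 'http', 'dns', '-', 'dns'],
                   'state': ['FIN', 'INT', 'FIN', 'INT', 'FIN', 'INT']})
y = [0, 1, 0, 1, 0, 1]

# one-hot encode the string columns; numeric columns pass through untouched
X = pd.get_dummies(df, columns=['proto', 'service', 'state'])

# every dummy column is discrete (0/1); 'dur' stays continuous
mask = [col != 'dur' for col in X.columns]
mutual_info_classif(X, y, discrete_features=mask, random_state=0)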

mutual_info_classif can only take numeric data, so you need to label-encode the categorical features and then run the same code:

from sklearn.preprocessing import LabelEncoder

x1 = x.apply(LabelEncoder().fit_transform)

Then run the exact same call you were running:

mutual_info_classif(x1, y, discrete_features=[1, 2, 3])
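
A runnable version of that recipe as a sketch, on a toy frame invented for illustration. Note that applying LabelEncoder to every column, as in the one-liner above, would also turn continuous columns into integer ranks, so here only the string columns are encoded:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_selection import mutual_info_classif

x = pd.DataFrame({'dur': [0.9, 0.1, 0.5, 0.2, 0.7, 0.3],
                  'proto': ['tcp', 'udp', 'tcp', 'udp', 'tcp', 'udp'],
                  'service': ['http', '-', 'http', 'dns', '-', 'dns'],
                  'state': ['FIN', 'INT', 'FIN', 'INT', 'FIN', 'INT']})
y = [0, 1, 0, 1, 0, 1]

x1 = x.copy()
for col in ['proto', 'service', 'state']:
    # each string column becomes arbitrary integer codes
    x1[col] = LabelEncoder().fit_transform(x1[col])

mutual_info_classif(x1, y, discrete_features=[1, 2, 3], random_state=0)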
answered Mar 30, 2020 at 1:26

2 Comments

Careful with that, @Jatin; referring to sklearn's docs: "This transformer should be used to encode target values, i.e. y, and not the input X." So maybe for this case it is a better option to use OrdinalEncoder (see the sketch below).
@rmoret Does it matter for calculating mutual information? "Not limited to real-valued random variables and linear dependence like the correlation coefficient, MI is more general and determines how different the joint distribution of the pair (X, Y) is from the product of the marginal distributions of X and Y. MI is the expected value of the pointwise mutual information (PMI)." (Mutual Information) Since we only care about shared information, ordering should not matter?
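
A compact sketch of the OrdinalEncoder variant suggested above, on the same kind of invented toy frame as earlier; unlike LabelEncoder, OrdinalEncoder is designed for feature matrices and encodes a 2-D block of columns in one call:

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
from sklearn.feature_selection import mutual_info_classif

x = pd.DataFrame({'dur': [0.9, 0.1, 0.5, 0.2, 0.7, 0.3],
                  'proto': ['tcp', 'udp', 'tcp', 'udp', 'tcp', 'udp'],
                  'service': ['http', '-', 'http', 'dns', '-', 'dns'],
                  'state': ['FIN', 'INT', 'FIN', 'INT', 'FIN', 'INT']})
y = [0, 1, 0, 1, 0, 1]

cat_cols = ['proto', 'service', 'state']
x1 = x.copy()
# encode all categorical columns at once; continuous 'dur' is untouched
x1[cat_cols] = OrdinalEncoder().fit_transform(x[cat_cols])

mutual_info_classif(x1, y, discrete_features=[1, 2, 3], random_state=0)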

There is a difference between 'discrete' and 'categorical'. In this case, the function demands that the data be numerical. You may be able to use a label encoder if you have ordinal features; otherwise you would have to one-hot encode the nominal features. You can use pd.get_dummies for this purpose.

answered Feb 9, 2020 at 5:41

1 Comment

Same here: does it matter whether you have ordinal features for calculating mutual information? "Not limited to real-valued random variables and linear dependence like the correlation coefficient, MI is more general and determines how different the joint distribution of the pair (X, Y) is from the product of the marginal distributions of X and Y. MI is the expected value of the pointwise mutual information (PMI)." (Mutual Information) Since we only care about shared information, ordering should not matter?
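
On that question, a quick empirical check is possible. When a column is flagged in discrete_features, the mutual information is computed from category counts, so permuting the integer codes should leave the score unchanged; the data and permutation below are arbitrary, chosen only for illustration:

import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.RandomState(0)
x = rng.randint(0, 4, size=(200, 1))   # one categorical feature with codes 0..3
y = (x[:, 0] % 2 == 0).astype(int)     # target depends on the category

perm = np.array([2, 0, 3, 1])          # arbitrary relabeling of the codes
x_perm = perm[x]

print(mutual_info_classif(x, y, discrete_features=True))
print(mutual_info_classif(x_perm, y, discrete_features=True))
# identical scores: MI is invariant to relabeling of discrete codes

If the same column were instead treated as continuous (discrete_features=False), the k-NN estimator would use distances between the codes, and then the relabeling could change the estimate.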
