Multi-label embedding classification

class skmultilearn.embedding.EmbeddingClassifier(embedder, regressor, classifier, regressor_per_dimension=False, require_dense=None)

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase

Embedding-based classifier

Implements the general scheme presented in "LNEMLC: label network embeddings for multi-label classification". The classifier embeds the label space with the embedder, trains either one multi-variate regressor or a set of single-variate regressors to predict embeddings for unseen cases, and trains a base classifier to predict labels from the input features together with the embeddings.
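The data flow of this scheme can be sketched with numpy alone. This is an illustrative sketch, not the library's implementation: the SVD projection, the least-squares regressor and the 1-nearest-neighbour classifier are stand-ins chosen for brevity, and the toy data and the helper `add_bias` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative only): 40 samples, 5 features, 4 labels.
X = rng.normal(size=(40, 5))
Y = (X @ rng.normal(size=(5, 4)) > 0).astype(int)  # binary indicator matrix
X_train, X_test = X[:30], X[30:]
Y_train = Y[:30]

def add_bias(A):
    """Append a constant column so least squares can fit an intercept."""
    return np.hstack([A, np.ones((A.shape[0], 1))])

# 1. Embed the label space (stand-in embedder): project the centered
#    label rows onto the top two right singular vectors.
Y_mean = Y_train.mean(axis=0)
_, _, Vt = np.linalg.svd(Y_train - Y_mean, full_matrices=False)
E_train = (Y_train - Y_mean) @ Vt[:2].T          # shape (30, 2)

# 2. Train a regressor from input features to embeddings (stand-in:
#    least squares) and use it to embed the unseen cases.
B, *_ = np.linalg.lstsq(add_bias(X_train), E_train, rcond=None)
E_test = add_bias(X_test) @ B                    # shape (10, 2)

# 3. Classify on features concatenated with embeddings (stand-in:
#    1-nearest neighbour that copies the neighbour's label row).
Z_train = np.hstack([X_train, E_train])
Z_test = np.hstack([X_test, E_test])
dists = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
predictions = Y_train[dists.argmin(axis=1)]      # shape (10, 4)
```

The key point is that label embeddings are only available for training rows, so test rows must get theirs from the regressor before the classifier can be applied.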

Parameters:

embedder (BaseEstimator) – the embedder used to embed the label space
regressor (BaseEstimator) – the base regressor used to predict embeddings from input features
classifier (BaseEstimator) – the base classifier used to predict labels from input features and embeddings
regressor_per_dimension (bool) – whether to train one joint multi-variate regressor (False) or one single-variate regressor per embedding dimension (True)
require_dense ([bool, bool], optional) – whether the base classifier requires dense representations of the input feature matrix and of the classes/labels matrix in fit/predict.
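The difference between the two regressor_per_dimension settings can be sketched as follows. This is a minimal numpy illustration under assumed names (`fit_embedding_regressors` and `predict_embedding` are hypothetical helpers, and least squares stands in for the base regressor); it does not reproduce the library's code.

```python
import numpy as np

def fit_embedding_regressors(X, E, per_dimension=False):
    """Fit least-squares weights mapping features X to embeddings E.

    Returns a list holding one weight matrix (joint regressor) or one
    weight matrix per embedding dimension (per-dimension regressors).
    """
    if per_dimension:
        targets = [E[:, [j]] for j in range(E.shape[1])]  # one column each
    else:
        targets = [E]                                     # all columns at once
    return [np.linalg.lstsq(X, t, rcond=None)[0] for t in targets]

def predict_embedding(X, weights):
    """Stack the outputs of all regressors into one embedding matrix."""
    return np.hstack([X @ w for w in weights])

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # illustrative features
E = rng.normal(size=(20, 4))   # illustrative 4-dimensional embeddings

joint = fit_embedding_regressors(X, E, per_dimension=False)
split = fit_embedding_regressors(X, E, per_dimension=True)
```

Here `joint` holds a single model and `split` holds four. For ordinary least squares both settings give the same predictions, because the problem decouples by output column; for regressors that share structure across outputs, such as tree ensembles, the two settings can produce different results.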

n_regressors_

number of trained regressors

Type:	int

partition_

list of lists of label indexes, used to index the output space matrix, set in _generate_partition() via fit()

Type:	List[List[int]], shape=(model_count_,)

classifiers_

list of classifiers trained per partition, set in fit()

Type:	List[BaseEstimator], shape=(model_count_,)

If you use this classifier, please cite the relevant embedding method paper and the label network embedding for multi-label classification paper:

@article{zhang2007ml,
 title={ML-KNN: A lazy learning approach to multi-label learning},
 author={Zhang, Min-Ling and Zhou, Zhi-Hua},
 journal={Pattern recognition},
 volume={40},
 number={7},
 pages={2038--2048},
 year={2007},
 publisher={Elsevier}
}

Example

An example use case for EmbeddingClassifier:

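# X_train, y_train and X_test are assumed to be defined already:
# feature matrices and a binary label indicator matrix, as described under fit().
from skmultilearn.embedding import SKLearnEmbedder, EmbeddingClassifier
from sklearn.manifold import SpectralEmbedding
from sklearn.ensemble import RandomForestRegressor
from skmultilearn.adapt import MLkNN

clf = EmbeddingClassifier(
    SKLearnEmbedder(SpectralEmbedding(n_components=10)),  # embeds the label space
    RandomForestRegressor(n_estimators=10),               # predicts embeddings from features
    MLkNN(k=5)                                            # predicts labels from features and embeddings
)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)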

fit(X, y)

Fits classifier to training data

Parameters:
X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
y (array_like, `numpy.matrix` or `scipy.sparse` matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns:	fitted instance of self
Return type:	self
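The binary indicator matrix expected for y can be built from per-sample label lists. The following is a small numpy sketch; `to_indicator_matrix` and the example label sets are hypothetical and not part of the library.

```python
import numpy as np

def to_indicator_matrix(label_sets, n_labels):
    """Turn per-sample lists of label indexes into a {0, 1} matrix
    of shape (n_samples, n_labels)."""
    y = np.zeros((len(label_sets), n_labels), dtype=int)
    for row, labels in enumerate(label_sets):
        for label in labels:
            y[row, label] = 1
    return y

# Four samples; the third one carries no labels at all.
label_sets = [[0, 2], [1], [], [0, 1, 3]]
y_train = to_indicator_matrix(label_sets, n_labels=4)
# y_train:
# [[1 0 1 0]
#  [0 1 0 0]
#  [0 0 0 0]
#  [1 1 0 1]]
```

Each row corresponds to one sample and each column to one label, which matches the shape=(n_samples, n_labels) requirement stated above.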

predict(X)

Predict labels for X

Parameters:	X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
Returns:	binary indicator matrix with label assignments
Return type:	`scipy.sparse` matrix of {0, 1}, shape=(n_samples, n_labels)

predict_proba(X)

Predict probabilities of label assignments for X

Parameters:	X (array_like, `numpy.matrix` or `scipy.sparse` matrix, shape=(n_samples, n_features)) – input feature matrix
Returns:	matrix with label assignment probabilities
Return type:	`scipy.sparse` matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
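A probability matrix of this shape can be turned into label assignments by thresholding. The sketch below assumes a dense array (a sparse result would first be converted with `.toarray()`) and a cut-off of 0.5; `probabilities_to_labels` is a hypothetical helper, and this is not necessarily how predict() derives its output.

```python
import numpy as np

def probabilities_to_labels(proba, threshold=0.5):
    """Assign a label wherever its probability reaches the threshold."""
    return (np.asarray(proba) >= threshold).astype(int)

# Illustrative probabilities for 2 samples and 3 labels.
proba = np.array([[0.9, 0.2, 0.5],
                  [0.1, 0.7, 0.4]])
labels = probabilities_to_labels(proba)
# labels:
# [[1 0 1]
#  [0 1 0]]
```

Choosing a threshold other than 0.5 trades precision against recall per label, which is one reason to work with probabilities rather than the hard assignments.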

Cite US!

If you use scikit-multilearn in your research and publish it, please consider citing us; it helps us get funding to improve the library. The paper is available on arXiv; to cite it, use the BibTeX entry below.


 
 @ARTICLE{2017arXiv170201460S,
 author = {{Szyma{\'n}ski}, P. and {Kajdanowicz}, T.},
 title = "{A scikit-based Python environment for performing multi-label classification}",
 journal = {ArXiv e-prints},
 archivePrefix = "arXiv",
 eprint = {1702.01460},
 primaryClass = "cs.LG",
 keywords = {Computer Science - Learning, Computer Science - Mathematical Software},
 year = 2017,
 month = feb,
 }
