class skmultilearn.problem_transform.BinaryRelevance(classifier=None, require_dense=None)

Bases: skmultilearn.base.problem_transformation.ProblemTransformationBase
Performs classification per label.

Transforms a multi-label classification problem with L labels into L separate single-label binary classification problems, each using the same base classifier provided in the constructor. The prediction output is the union of the per-label classifiers' predictions.
Parameters:
    classifier (BaseEstimator) – scikit-learn compatible base classifier
    require_dense ([bool, bool], optional) – whether the base classifier requires dense representations for the input feature and label matrices in fit/predict
partition_
    List of lists of label indexes, used to index the output space matrix; set in _generate_partition() via fit().
    Type: List[List[int]], shape=(model_count_,)
classifiers_
    List of classifiers trained per partition; set in fit().
    Type: List[BaseEstimator], shape=(model_count_,)
Notes

Note: This is one of the most basic approaches to multi-label classification; it ignores relationships between labels.
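The decomposition itself can be pictured in a few lines. The sketch below is illustrative only, not the library's implementation; the helper names br_fit and br_predict are hypothetical, and it assumes a dense numpy {0, 1} indicator matrix y, ignoring the sparse-matrix handling that BinaryRelevance performs:

import numpy as np
from sklearn.base import clone

def br_fit(base_classifier, X, y):
    # One independent binary problem per label column.
    return [clone(base_classifier).fit(X, y[:, i]) for i in range(y.shape[1])]

def br_predict(classifiers, X):
    # Union of the per-label predictions, stacked back into an indicator matrix.
    return np.column_stack([c.predict(X) for c in classifiers])

Because each classifier sees only its own label column, any correlation between labels is invisible to the model, which is exactly the limitation noted above.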
Examples
An example use case for Binary Relevance classification with an sklearn.svm.SVC base classifier, which supports sparse input:
from skmultilearn.problem_transform import BinaryRelevance
from sklearn.svm import SVC

# initialize Binary Relevance multi-label classifier
# with an SVM classifier
# SVM in scikit only supports the X matrix in sparse representation
classifier = BinaryRelevance(
    classifier=SVC(),
    require_dense=[False, True]
)

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)
Another way to use this classifier is to select the best scenario from a set of single-label classifiers used with Binary Relevance; this can be done using cross-validated grid search. In the example below, the model with the highest accuracy is selected from either an sklearn.naive_bayes.MultinomialNB or an sklearn.svm.SVC base classifier, along with the best parameters for that base classifier.
from skmultilearn.problem_transform import BinaryRelevance
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

parameters = [
    {
        'classifier': [MultinomialNB()],
        'classifier__alpha': [0.7, 1.0],
    },
    {
        'classifier': [SVC()],
        'classifier__kernel': ['rbf', 'linear'],
    },
]

clf = GridSearchCV(BinaryRelevance(), parameters, scoring='accuracy')
clf.fit(x, y)

print(clf.best_params_, clf.best_score_)
# result:
#
# {
#     'classifier': SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#         decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
#         max_iter=-1, probability=False, random_state=None, shrinking=True,
#         tol=0.001, verbose=False),
#     'classifier__kernel': 'linear'
# } 0.17
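As a follow-up sketch (assuming refit=True, GridSearchCV's default), the winning configuration is refitted on the full data and can be used directly:

# The search object exposes the refitted best model.
best_model = clf.best_estimator_
predictions = best_model.predict(x)  # or any held-out feature matrix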
fit(X, y)

Fits classifier to training data.
Parameters:
    X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
    y (array_like, numpy.matrix or scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)) – binary indicator matrix with label assignments
Returns: fitted instance of self
Return type: self
Notes

Note: Input matrices are converted to sparse format internally if a numpy representation is passed.
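For illustration, a hedged sketch of this behavior on toy data (the shapes and values here are arbitrary; classifier is the Binary Relevance instance from the first example):

import numpy as np
import scipy.sparse as sp

X_dense = np.random.rand(8, 4)   # dense feature matrix
y = np.array([[0, 1, 1],         # {0, 1} indicator matrix; each label
              [1, 0, 1],         # column contains both classes so the
              [1, 1, 0],         # per-label binary problems are valid
              [0, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [1, 1, 1],
              [0, 0, 0]])

# Equivalent calls: the dense matrix is converted to sparse internally.
classifier.fit(X_dense, y)
classifier.fit(sp.csr_matrix(X_dense), y)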
predict(X)

Predict labels for X.

Parameters:
    X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: binary indicator matrix with label assignments
Return type: scipy.sparse matrix of {0, 1}, shape=(n_samples, n_labels)
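Because the return value is sparse, a common follow-up (a small sketch; predictions is the output of the first example) is to densify it for inspection:

# scipy.sparse matrices support toarray() for a dense numpy view.
dense_predictions = predictions.toarray()  # shape (n_samples, n_labels)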
predict_proba(X)

Predict probabilities of label assignments for X.

Parameters:
    X (array_like, numpy.matrix or scipy.sparse matrix, shape=(n_samples, n_features)) – input feature matrix
Returns: matrix with label assignment probabilities
Return type: scipy.sparse matrix of float in [0.0, 1.0], shape=(n_samples, n_labels)
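A hedged usage sketch: thresholding the probabilities into hard assignments. Note that the base classifier must itself expose predict_proba (e.g. SVC(probability=True) or MultinomialNB); the 0.5 cut-off below is an arbitrary illustrative choice:

probabilities = classifier.predict_proba(X_test)
# Densify and threshold per label to obtain a {0, 1} indicator matrix.
hard_labels = (probabilities.toarray() > 0.5).astype(int)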