SequentialFeatureSelector
- class sklearn.feature_selection.SequentialFeatureSelector(estimator, *, n_features_to_select='auto', tol=None, direction='forward', scoring=None, cv=5, n_jobs=None)
- Transformer that performs Sequential Feature Selection.
- This Sequential Feature Selector adds (forward selection) or removes (backward selection) features to form a feature subset in a greedy fashion. At each stage, this estimator chooses the best feature to add or remove based on the cross-validation score of an estimator. In the case of unsupervised learning, this Sequential Feature Selector looks only at the features (X), not the desired outputs (y).
- Read more in the User Guide.
- Added in version 0.24.
- Parameters:
- estimator : estimator instance
- An unfitted estimator.
- n_features_to_select : "auto", int or float, default="auto"
- If "auto", the behaviour depends on the tol parameter:
- if tol is not None, then features are added or removed until the score change between two consecutive steps falls below tol.
- otherwise, half of the features are selected.
 - If integer, the parameter is the absolute number of features to select. If float between 0 and 1, it is the fraction of features to select.
- Added in version 1.1: The option "auto" was added in version 1.1.
- Changed in version 1.3: The default changed from "warn" to "auto" in 1.3.
- tol : float, default=None
- If the score is not incremented by at least tol between two consecutive feature additions or removals, stop adding or removing.
- tol can be negative when removing features using direction="backward". tol is required to be strictly positive when doing forward selection. It can be useful to reduce the number of features at the cost of a small decrease in the score.
- tol is enabled only when n_features_to_select is "auto".
- Added in version 1.1.
- direction : {'forward', 'backward'}, default='forward'
- Whether to perform forward selection or backward selection.
- scoring : str or callable, default=None
- Scoring method to use for cross-validation. Options:
- str: see String name scorers for options.
- callable: a scorer callable object (e.g., function) with signature scorer(estimator, X, y) that returns a single value. See Callable scorers for details.
- None: the estimator's default evaluation criterion is used.
 
- cv : int, cross-validation generator or an iterable, default=5
- Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 5-fold cross validation,
- integer, to specify the number of folds in a (Stratified)KFold,
- An iterable yielding (train, test) splits as arrays of indices.
 - For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used. These splitters are instantiated with shuffle=False so the splits will be the same across calls.
- Refer to the User Guide for the various cross-validation strategies that can be used here.
- n_jobs : int, default=None
- Number of jobs to run in parallel. When evaluating a new feature to add or remove, the cross-validation procedure is parallel over the folds. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
 
- Attributes:
- n_features_in_ : int
- Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.
- Added in version 0.24.
- feature_names_in_ : ndarray of shape (n_features_in_,)
- Names of features seen during fit. Defined only when X has feature names that are all strings.
- Added in version 1.0.
- n_features_to_select_ : int
- The number of features that were selected.
- support_ : ndarray of shape (n_features,), dtype=bool
- The mask of selected features.
 
 - See also
- GenericUnivariateSelect
- Univariate feature selector with configurable strategy. 
- RFE
- Recursive feature elimination based on importance weights. 
- RFECV
- Recursive feature elimination based on importance weights, with automatic selection of the number of features. 
- SelectFromModel
- Feature selection based on thresholds of importance weights. 
 - Examples
>>> from sklearn.feature_selection import SequentialFeatureSelector
>>> from sklearn.neighbors import KNeighborsClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> sfs = SequentialFeatureSelector(knn, n_features_to_select=3)
>>> sfs.fit(X, y)
SequentialFeatureSelector(estimator=KNeighborsClassifier(n_neighbors=3),
                          n_features_to_select=3)
>>> sfs.get_support()
array([ True, False,  True,  True])
>>> sfs.transform(X).shape
(150, 3)
 - fit(X, y=None, **params)
- Learn the features to select from X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
- Training vectors, where n_samples is the number of samples and n_features is the number of predictors.
- y : array-like of shape (n_samples,), default=None
- Target values. This parameter may be ignored for unsupervised learning.
- **params : dict, default=None
- Parameters to be passed to the underlying estimator, cv and scorer objects.
- Added in version 1.6: Only available if enable_metadata_routing=True, which can be set by using sklearn.set_config(enable_metadata_routing=True). See Metadata Routing User Guide for more details.
 
- Returns:
- self : object
- Returns the instance itself. 
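A call to fit on a regression problem might look like the sketch below. It assumes the diabetes dataset and a Ridge regressor, neither of which appears in this page; the scoring string, the shuffled KFold splitter, and the choice of four features are illustrative assumptions.

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

X, y = load_diabetes(return_X_y=True)  # 10 numeric features

# An explicit splitter is passed through cv; shuffling with a fixed seed
# keeps the folds reproducible across calls.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

selector = SequentialFeatureSelector(
    Ridge(alpha=1.0),
    n_features_to_select=4,
    direction="backward",              # start from all features and remove
    scoring="neg_mean_squared_error",  # string scorer name
    cv=cv,
)

# fit returns the selector itself, so calls can be chained.
fitted = selector.fit(X, y)
print(fitted is selector)
print(selector.get_support())
```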
 
 
 - fit_transform(X, y=None, **fit_params)
- Fit to data, then transform it.
- Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
- Input samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None
- Target values (None for unsupervised transformations).
- **fit_params : dict
- Additional fit parameters. 
 
- Returns:
- X_new : ndarray of shape (n_samples, n_features_new)
- Transformed array. 
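As a small sketch of fit_transform, assuming the iris data and a KNeighborsClassifier, the returned array holds the selected columns of X in their original order:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3), n_features_to_select=2
)

# Fit the selector and reduce X in one call.
X_new = sfs.fit_transform(X, y)
print(X_new.shape)  # (150, 2)

# The result matches indexing X with the boolean support mask.
same = np.array_equal(X_new, X[:, sfs.get_support()])
print(same)  # True
```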
 
 
 - get_feature_names_out(input_features=None)
- Mask feature names according to selected features.
- Parameters:
- input_features : array-like of str or None, default=None
- Input features.
- If input_features is None, then feature_names_in_ is used as the input feature names. If feature_names_in_ is not defined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].
- If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.
 
 
- Returns:
- feature_names_out : ndarray of str objects
- Transformed feature names. 
 
 
 - get_metadata_routing()
- Get metadata routing of this object.
- Please check the User Guide on how the routing mechanism works.
- Added in version 1.6.
- Returns:
- routing : MetadataRouter
- A MetadataRouter encapsulating routing information.
 
 
 - get_params(deep=True)
- Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
- If True, will return the parameters for this estimator and contained subobjects that are estimators. 
 
- Returns:
- params : dict
- Parameter names mapped to their values. 
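The effect of deep can be seen in a short sketch, assuming a KNeighborsClassifier as the wrapped estimator: with deep=True the parameters of the wrapped estimator appear under the estimator__ prefix.

```python
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=3)
sfs = SequentialFeatureSelector(knn, n_features_to_select=2)

# deep=True also lists parameters of the contained estimator.
deep_params = sfs.get_params(deep=True)
print(deep_params["estimator__n_neighbors"])  # 3

# deep=False lists only the selector's own constructor parameters.
shallow_params = sfs.get_params(deep=False)
print("estimator__n_neighbors" in shallow_params)  # False
print(shallow_params["n_features_to_select"])  # 2
```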
 
 
 - get_support(indices=False)
- Get a mask, or integer index, of the features selected.
- Parameters:
- indices : bool, default=False
- If True, the return value will be an array of integers, rather than a boolean mask. 
 
- Returns:
- support : array
- An index that selects the retained features from a feature vector. If indices is False, this is a boolean array of shape [# input features], in which an element is True iff its corresponding feature is selected for retention. If indices is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
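The two return forms carry the same information, as this sketch shows (assuming the iris data and a KNeighborsClassifier):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3), n_features_to_select=2
).fit(X, y)

mask = sfs.get_support()              # boolean, one entry per input feature
idx = sfs.get_support(indices=True)   # integer positions of kept features
print(mask, idx)

# The integer form lists the positions where the mask is True.
print(np.array_equal(np.flatnonzero(mask), idx))  # True
```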
 
 
 - inverse_transform(X)
- Reverse the transformation operation.
- Parameters:
- X : array of shape [n_samples, n_selected_features]
- The input samples. 
 
- Returns:
- X_original : array of shape [n_samples, n_original_features]
- X with columns of zeros inserted where features would have been removed by transform.
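A round trip through transform and inverse_transform can be sketched as follows, assuming the iris data and a KNeighborsClassifier. The dropped columns come back as zeros, so the original values in those columns are not recovered.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3), n_features_to_select=2
).fit(X, y)

X_sel = sfs.transform(X)               # shape (150, 2)
X_back = sfs.inverse_transform(X_sel)  # shape (150, 4)
mask = sfs.get_support()

print(X_back.shape)
# Selected columns are restored; removed columns are filled with zeros.
print(np.array_equal(X_back[:, mask], X[:, mask]))  # True
print(np.all(X_back[:, ~mask] == 0))                # True
```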
 
 
 - set_output(*, transform=None)
- Set output container.
- See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform : {"default", "pandas", "polars"}, default=None
- Configure output of transform and fit_transform.
- "default": Default output format of a transformer
- "pandas": DataFrame output
- "polars": Polars output
- None: Transform configuration is unchanged
 - Added in version 1.4: "polars" option was added.
 
- Returns:
- self : estimator instance
- Estimator instance. 
 
 
 - set_params(**params)
- Set the parameters of this estimator.
- The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
- Estimator parameters. 
 
- Returns:
- self : estimator instance
- Estimator instance. 
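The <component>__<parameter> form can be illustrated with a selector nested in a Pipeline. This is a sketch only; the step names "sfs" and "clf" are hypothetical labels chosen for the example.

```python
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Step names "sfs" and "clf" are arbitrary labels for this example.
pipe = Pipeline(
    [
        ("sfs", SequentialFeatureSelector(KNeighborsClassifier(n_neighbors=3))),
        ("clf", KNeighborsClassifier()),
    ]
)

# Double underscores walk down the nesting: pipeline step, then the
# selector's own parameter, then a parameter of the wrapped estimator.
pipe.set_params(sfs__n_features_to_select=2, sfs__estimator__n_neighbors=5)

sfs = pipe.named_steps["sfs"]
print(sfs.n_features_to_select, sfs.estimator.n_neighbors)  # 2 5

# Called directly on the selector, set_params returns the selector itself.
returned = sfs.set_params(direction="backward")
print(returned is sfs)  # True
```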
 
 
 
Gallery examples
Model-based and sequential feature selection
Release Highlights for scikit-learn 0.24