Displaying Pipelines#

By default, a pipeline is displayed in a Jupyter Notebook as an HTML diagram, which corresponds to set_config(display='diagram'). To deactivate the HTML representation and show plain text instead, use set_config(display='text').
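
The display option is part of scikit-learn's global configuration, so it can be read back with get_config. A minimal sketch, assuming a scikit-learn version whose configuration exposes the "display" key (the variable names are illustrative only):

```python
from sklearn import get_config, set_config

# Remember the current setting so it can be restored afterwards.
previous = get_config()["display"]

# Switch to the plain-text representation and read the setting back.
set_config(display="text")
current = get_config()["display"]

# Restore whatever was configured before.
set_config(display=previous)
restored = get_config()["display"]
```

Saving and restoring the previous value avoids changing the display mode for unrelated code that runs later in the same session.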

To see the details of a step in the diagram, click on that step.

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

Displaying a Pipeline with a Preprocessing Step and Classifier#

This section constructs a Pipeline with a preprocessing step, StandardScaler, and classifier, LogisticRegression, and displays its visual representation.

from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)

The diagram is shown when display='diagram', which is the default.

set_config(display="diagram")
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('preprocessing', StandardScaler()),
 ('classifier', LogisticRegression())])
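
Outside a notebook, the same diagram can be produced as an HTML string and saved to a file. A minimal sketch, assuming sklearn.utils.estimator_html_repr is importable in the installed version; the temporary directory and the file name pipeline.html are placeholders, not part of the original example:

```python
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# Build the HTML representation of the pipeline as a string.
html = estimator_html_repr(pipe)

# Write it to a file; the directory and file name are placeholders.
out_path = Path(tempfile.mkdtemp()) / "pipeline.html"
out_path.write_text(html, encoding="utf-8")
```

Opening the resulting file in a browser should show the same clickable diagram as a notebook cell would.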

To view the plain-text representation of the pipeline, set display='text'.

set_config(display="text")
pipe

Pipeline(steps=[('preprocessing', StandardScaler()),
 ('classifier', LogisticRegression())])

Restore the default display:

set_config(display="diagram")
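
When the text display is only needed temporarily, a scoped change avoids having to restore the setting by hand. A minimal sketch using sklearn.config_context, assuming it accepts the display option in the installed version:

```python
from sklearn import config_context, get_config

before = get_config()["display"]

# The setting only applies inside the `with` block.
with config_context(display="text"):
    inside = get_config()["display"]

# On exit, the previous value is restored automatically.
after = get_config()["display"]
```

This is an alternative to calling set_config twice; the surrounding session keeps whatever display mode it had before.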

Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier#

This section constructs a Pipeline with multiple preprocessing steps, PolynomialFeatures and StandardScaler, and a classifier step, LogisticRegression, and displays its visual representation.

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('standard_scaler', StandardScaler()),
 ('polynomial', PolynomialFeatures(degree=3)),
 ('classifier', LogisticRegression(C=2.0))])
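
The details that the diagram reveals on click can also be read programmatically. A minimal sketch of inspecting the steps and their parameters through named_steps and get_params (the variable names are illustrative only):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

pipe = Pipeline(
    [
        ("standard_scaler", StandardScaler()),
        ("polynomial", PolynomialFeatures(degree=3)),
        ("classifier", LogisticRegression(C=2.0)),
    ]
)

# Step names, in the order they are applied.
step_names = [name for name, _ in pipe.steps]

# Access a step by name and read one of its parameters.
degree = pipe.named_steps["polynomial"].degree

# Nested parameters use the "<step>__<parameter>" naming scheme.
C = pipe.get_params()["classifier__C"]
```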

Displaying a Pipeline with Dimensionality Reduction and Classifier#

This section constructs a Pipeline with a dimensionality reduction step, PCA, and a classifier, SVC, and displays its visual representation.

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('reduce_dim', PCA(n_components=4)),
 ('classifier', SVC(kernel='linear'))])
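
The pipeline displayed here has not been fitted. A minimal sketch of fitting it, assuming the iris dataset as toy data (it is not used in the original example); iris has four features, so n_components=4 keeps every component:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy data chosen for illustration: 150 samples with 4 features each.
X, y = load_iris(return_X_y=True)

pipe = Pipeline(
    [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
)

# Fit PCA and then the classifier on the PCA-transformed data.
pipe.fit(X, y)

# Accuracy on the training data, and the number of components PCA kept.
accuracy = pipe.score(X, y)
n_kept = pipe.named_steps["reduce_dim"].n_components_
```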

Displaying a Complex Pipeline Chaining a Column Transformer#

This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('columntransformer',
 ColumnTransformer(transformers=[('categorical',
 Pipeline(steps=[('imputation_constant',
 SimpleImputer(fill_value='missing',
 strategy='constant')),
 ('onehot',
 OneHotEncoder(handle_unknown='ignore'))]),
 ['state', 'gender']),
 ('numerical',
 Pipeline(steps=[('imputation_mean',
 SimpleImputer()),
 ('scaler',
 StandardScaler())]),
 ['age', 'weight'])])),
 ('logisticregression', LogisticRegression(max_iter=500))])
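
Because the ColumnTransformer selects columns by name, the pipeline expects a pandas DataFrame with the columns state, gender, age, and weight. A minimal sketch of fitting it, assuming pandas is installed; the data values and labels below are invented purely for illustration:

```python
import numpy as np
import pandas as pd

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        ("imputation_constant", SimpleImputer(fill_value="missing", strategy="constant")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))

# Invented toy data with a few missing entries in each column.
X = pd.DataFrame(
    {
        "state": ["NY", "CA", np.nan, "TX", "CA", "NY"],
        "gender": ["F", "M", "M", np.nan, "F", "M"],
        "age": [25.0, 32.0, np.nan, 41.0, 29.0, 52.0],
        "weight": [60.0, 82.5, 75.0, np.nan, 58.0, 90.0],
    }
)
y = [0, 1, 1, 0, 0, 1]

# Fit the preprocessing and the classifier in one call, then predict.
pipe.fit(X, y)
preds = pipe.predict(X)

# Output of the preprocessing step alone: one row per input sample.
X_transformed = pipe.named_steps["columntransformer"].transform(X)
```

The step names columntransformer and logisticregression are the ones make_pipeline generates automatically, as shown in the text representation of the pipeline.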
