This example demonstrates the set_output API, which configures transformers to
output pandas DataFrames. The output type can be set per estimator by calling
the set_output method, or globally by calling set_config(transform_output="pandas").
For details, see SLEP018.
First, we load the iris dataset as a DataFrame to demonstrate the set_output API.
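A loading step along these lines would produce the DataFrame. This is a sketch: the as_frame and return_X_y options are part of load_iris, while the train/test split and its random_state are assumptions made here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# as_frame=True returns X as a pandas DataFrame and y as a pandas Series.
X, y = load_iris(as_frame=True, return_X_y=True)

# Assumed split for illustration; stratify keeps class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Column names travel with the DataFrame.
column_names = list(X_train.columns)
```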
Each transformer in the pipeline is configured to return DataFrames. This
means that the final logistic regression step is fitted on a DataFrame and stores
the feature names of its input in feature_names_in_.
With the global configuration, all transformers output DataFrames. This allows us to
easily plot the logistic regression coefficients with the corresponding feature names.
When configuring the output type with config_context, the
configuration in effect when transform or fit_transform is
called is what counts. Setting it only when you construct or fit
the transformer has no effect.
with config_context(transform_output="pandas"):
    # the output of transform will be a pandas DataFrame
    X_test_scaled = scaler.transform(X_test[num_cols])
X_test_scaled.head()
          age      fare
629  0.628306 -0.063210
688 -0.057984 -0.515704
439  1.314596  0.566624
664 -0.675645 -0.512279
669 -0.744274 -0.496950
Outside of the context manager, the output will be a NumPy array.