Visualization of the prediction error of a regression model.
This tool can display "residuals vs predicted" or "actual vs predicted"
using scatter plots to qualitatively assess the behavior of a regressor,
preferably on held-out data points.
See the details in the docstrings of from_estimator or from_predictions to
create a visualizer. All parameters are stored as attributes.
For general information regarding scikit-learn visualization tools, read
more in the Visualization Guide.
For details regarding interpreting these plots, refer to the
Model Evaluation Guide.
Added in version 1.2.
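As a rough illustration of what the two plot types show, the sketch below computes the point coordinates with NumPy alone. The arrays are made-up values, and the residual is assumed to be the observed value minus the predicted value. It does not call scikit-learn.

```python
import numpy as np

# Made-up targets and predictions, for illustration only.
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.5, 7.0, 11.0])

# Residual = observed value minus predicted value (assumed sign convention).
residuals = y_true - y_pred

# "actual vs predicted": x = predicted value, y = observed value.
actual_vs_predicted = np.column_stack([y_pred, y_true])

# "residuals vs predicted": x = predicted value, y = residual.
residual_vs_predicted = np.column_stack([y_pred, residuals])

print(residuals)
```

A perfect regressor would put every point on the diagonal in the first plot and on the zero line in the second.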
Parameters:
y_true : ndarray of shape (n_samples,)
True values.
y_pred : ndarray of shape (n_samples,)
Prediction values.
Attributes:
line_ : matplotlib Artist
Optimal line representing y_true == y_pred. Therefore, it is a
diagonal line for kind="actual_vs_predicted" and a horizontal line for
kind="residual_vs_predicted".
errors_lines_ : matplotlib Artist or None
Residual lines. If with_errors=False, then it is set to None.
classmethod from_estimator(estimator, X, y, *, kind='residual_vs_predicted', subsample=1000, random_state=None, ax=None, scatter_kwargs=None, line_kwargs=None)
Plot the prediction error given a regressor and some data.
For general information regarding scikit-learn visualization tools,
read more in the Visualization Guide.
For details regarding interpreting these plots, refer to the
Model Evaluation Guide.
Added in version 1.2.
Parameters:
estimator : estimator instance
Fitted regressor or a fitted Pipeline
in which the last estimator is a regressor.
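X : {array-like, sparse matrix} of shape (n_samples, n_features)
Input values.
y : array-like of shape (n_samples,)
Target values.
kind : {"actual_vs_predicted", "residual_vs_predicted"}, default="residual_vs_predicted"
The type of plot to draw: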
"actual_vs_predicted" draws the observed values (y-axis) vs.
the predicted values (x-axis).
"residual_vs_predicted" draws the residuals, i.e. difference
between observed and predicted values, (y-axis) vs. the predicted
values (x-axis).
subsample : float, int or None, default=1_000
Subsampling applied to the points shown on the scatter plot. If float,
it should be between 0 and 1 and represents the proportion of the
original dataset. If int, it represents the number of samples
to display on the scatter plot. If None, no subsampling will be
applied. By default, 1000 samples or fewer will be displayed.
random_state : int or RandomState, default=None
Controls the randomness when subsample is not None.
See Glossary for details.
ax : matplotlib axes, default=None
Axes object to plot on. If None, a new figure and axes are
created.
scatter_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.scatter
call.
line_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.plot
call to draw the optimal line.
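classmethod from_predictions(y_true, y_pred, *, kind='residual_vs_predicted', subsample=1000, random_state=None, ax=None, scatter_kwargs=None, line_kwargs=None)
Plot the prediction error given the true and predicted targets.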
For general information regarding scikit-learn visualization tools,
read more in the Visualization Guide.
For details regarding interpreting these plots, refer to the
Model Evaluation Guide.
Parameters:
y_true : array-like of shape (n_samples,)
True target values.
y_pred : array-like of shape (n_samples,)
Predicted target values.
kind : {"actual_vs_predicted", "residual_vs_predicted"}, default="residual_vs_predicted"
The type of plot to draw:
"actual_vs_predicted" draws the observed values (y-axis) vs.
the predicted values (x-axis).
"residual_vs_predicted" draws the residuals, i.e. difference
between observed and predicted values, (y-axis) vs. the predicted
values (x-axis).
subsample : float, int or None, default=1_000
Subsampling applied to the points shown on the scatter plot. If float,
it should be between 0 and 1 and represents the proportion of the
original dataset. If int, it represents the number of samples
to display on the scatter plot. If None, no subsampling will be
applied. By default, 1000 samples or fewer will be displayed.
random_state : int or RandomState, default=None
Controls the randomness when subsample is not None.
See Glossary for details.
ax : matplotlib axes, default=None
Axes object to plot on. If None, a new figure and axes are
created.
scatter_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.scatter
call.
line_kwargs : dict, default=None
Dictionary with keywords passed to the matplotlib.pyplot.plot
call to draw the optimal line.