Gaussian Mixture Model Selection
This example shows that model selection can be performed with Gaussian Mixture Models (GMM) using information-theoretic criteria. Model selection concerns both the covariance type and the number of components in the model.
In this case, both the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) provide the right result, but we only demonstrate the latter, as BIC is better suited to identifying the true model among a set of candidates. Unlike Bayesian procedures, such inferences are prior-free.
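For reference (standard textbook definitions, not spelled out in the original example), both criteria trade goodness of fit against model complexity: for a model with $k$ free parameters fitted to $n$ samples with maximized likelihood $\hat{L}$,

$$\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln(n) - 2\ln\hat{L}.$$

Since $\ln(n) > 2$ whenever $n > e^{2} \approx 7.4$, BIC penalizes extra parameters more heavily than AIC, which is why it is better at recovering the true, more parsimonious model.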
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
Data generation
We generate two components (each one containing n_samples) by randomly sampling the standard normal distribution as returned by numpy.random.randn. One component is kept spherical yet shifted and re-scaled. The other one is deformed to have a more general covariance matrix.
import numpy as np

n_samples = 500
np.random.seed(0)
C = np.array([[0.0, -0.1], [1.7, 0.4]])
component_1 = np.dot(np.random.randn(n_samples, 2), C)  # general
component_2 = 0.7 * np.random.randn(n_samples, 2) + np.array([-4, 1])  # spherical

X = np.concatenate([component_1, component_2])
We can visualize the different components:
import matplotlib.pyplot as plt

plt.scatter(component_1[:, 0], component_1[:, 1], s=0.8)
plt.scatter(component_2[:, 0], component_2[:, 1], s=0.8)
plt.title("Gaussian Mixture components")
plt.axis("equal")
plt.show()
Model training and selection
We vary the number of components from 1 to 6 and the type of covariance parameters to use:
"full"
: each component has its own general covariance matrix."tied"
: all components share the same general covariance matrix."diag"
: each component has its own diagonal covariance matrix."spherical"
: each component has its own single variance.
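To make the parameterizations concrete, here is a small sketch (not part of the original example) that fits a two-component mixture with each covariance type on the X generated above and prints the shape of the estimated covariances_ array:

from sklearn.mixture import GaussianMixture

# Fit a two-component mixture per covariance type and inspect how many
# covariance parameters are estimated (X is the dataset generated above).
for covariance_type in ["full", "tied", "diag", "spherical"]:
    gmm = GaussianMixture(
        n_components=2, covariance_type=covariance_type, random_state=0
    ).fit(X)
    print(covariance_type, gmm.covariances_.shape)
# full      -> (2, 2, 2): one 2x2 matrix per component
# tied      -> (2, 2)   : a single 2x2 matrix shared by all components
# diag      -> (2, 2)   : one vector of 2 per-feature variances per component
# spherical -> (2,)     : a single variance per component

More flexible parameterizations can fit the data better but require more parameters, which is exactly the trade-off the BIC penalizes.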
We score the different models and keep the best model (the lowest BIC). This is done by using GridSearchCV and a user-defined score function which returns the negative BIC score, as GridSearchCV is designed to maximize a score (maximizing the negative BIC is equivalent to minimizing the BIC). The best set of parameters and the best estimator are stored in best_params_ and best_estimator_, respectively.
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import GridSearchCV


def gmm_bic_score(estimator, X):
    """Callable to pass to GridSearchCV that will use the BIC score."""
    # Make it negative since GridSearchCV expects a score to maximize
    return -estimator.bic(X)


param_grid = {
    "n_components": range(1, 7),
    "covariance_type": ["spherical", "tied", "diag", "full"],
}
grid_search = GridSearchCV(
    GaussianMixture(), param_grid=param_grid, scoring=gmm_bic_score
)
grid_search.fit(X)
GridSearchCV(estimator=GaussianMixture(),
             param_grid={'covariance_type': ['spherical', 'tied', 'diag', 'full'],
                         'n_components': range(1, 7)},
             scoring=<function gmm_bic_score at 0x7f489c450ee0>)

The rendered HTML representation of the fitted search additionally shows the selected best estimator, GaussianMixture(n_components=2).
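The selected configuration can also be read back programmatically through the standard GridSearchCV attributes; a minimal sketch:

# Winning hyperparameters and the best model, refitted on the whole dataset.
print(grid_search.best_params_)  # e.g. {'covariance_type': 'full', 'n_components': 2}
best_gmm = grid_search.best_estimator_
print(best_gmm.means_)  # one estimated mean vector per selected component

Note that best_score_ holds the mean cross-validated value of the (negative) BIC score defined above, so larger is better.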