Module decomposition (2.2.0)

Matrix Decomposition models. This module is styled after Scikit-Learn's decomposition module: https://scikit-learn.org/stable/modules/decomposition.html.

Classes

MatrixFactorization

MatrixFactorization(
 *,
 feedback_type: typing.Literal["explicit", "implicit"] = "explicit",
 num_factors: int,
 user_col: str,
 item_col: str,
 rating_col: str = "rating",
 l2_reg: float = 1.0
)

Matrix Factorization (MF).

Examples:

>>> import bigframes.pandas as bpd
>>> from bigframes.ml.decomposition import MatrixFactorization
>>> bpd.options.display.progress_bar = None
>>> X = bpd.DataFrame({
... "row": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
... "column": [0,1] * 7,
... "value": [1, 1, 2, 1, 3, 1.2, 4, 1, 5, 0.8, 6, 1, 2, 3],
... })
>>> model = MatrixFactorization(feedback_type='explicit', num_factors=6, user_col='row', item_col='column', rating_col='value', l2_reg=2.06)
>>> W = model.fit(X)
Parameters
Name Description
feedback_type 'explicit' or 'implicit', default 'explicit'

Specifies the feedback type for the model. The feedback type determines the algorithm that is used during training.

num_factors int

Specifies the number of latent factors to use. The signature above declares no default for this argument, so a value must be passed.

user_col str

The user column name.

item_col str

The item column name.

rating_col str, default "rating"

The rating column name.

l2_reg float, default 1.0

The L2 regularization strength, given as a floating-point value.
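To make the roles of num_factors and l2_reg concrete, the sketch below uses the textbook formulation of explicit-feedback matrix factorization: each user and item gets a vector of num_factors latent factors, a predicted rating is their dot product, and training minimizes squared error plus an L2 penalty on the factors. This is an illustrative assumption, not the objective BigQuery ML is documented to use on this page. The names predict_rating and regularized_loss are hypothetical helpers, not part of bigframes.

```python
# Hypothetical helpers illustrating the textbook objective; not bigframes API.

def predict_rating(user_vec, item_vec):
    """Predicted rating is the dot product of two latent-factor vectors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def regularized_loss(ratings, user_factors, item_factors, l2_reg):
    """Squared error over observed ratings plus an L2 penalty on all factors.

    ratings: list of (user, item, rating) triples.
    user_factors / item_factors: dicts mapping an id to a list of floats.
    """
    squared_error = sum(
        (rating - predict_rating(user_factors[user], item_factors[item])) ** 2
        for user, item, rating in ratings
    )
    all_vectors = list(user_factors.values()) + list(item_factors.values())
    penalty = sum(w * w for vec in all_vectors for w in vec)
    return squared_error + l2_reg * penalty


# Two latent factors per user and item (the analogue of num_factors=2).
user_factors = {0: [1.0, 0.5]}
item_factors = {0: [1.0, 0.0], 1: [0.0, 2.0]}
ratings = [(0, 0, 1.0), (0, 1, 1.0)]

loss = regularized_loss(ratings, user_factors, item_factors, l2_reg=2.0)
print(loss)
```

In this toy case both ratings are reproduced exactly, so the whole loss comes from the penalty term. A larger l2_reg pushes the factors toward zero at the cost of fitting the observed ratings less closely.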

PCA

PCA(
 n_components: typing.Optional[typing.Union[int, float]] = None,
 *,
 svd_solver: typing.Literal["full", "randomized", "auto"] = "auto"
)

Principal component analysis (PCA).

Examples:

>>> import bigframes.pandas as bpd
>>> from bigframes.ml.decomposition import PCA
>>> bpd.options.display.progress_bar = None
>>> X = bpd.DataFrame({"feat0": [-1, -2, -3, 1, 2, 3], "feat1": [-1, -1, -2, 1, 1, 2]})
>>> pca = PCA(n_components=2).fit(X)
>>> pca.predict(X) # doctest:+SKIP
 principal_component_1 principal_component_2
0 -0.755243 0.157628
1 -1.05405 -0.141179
2 -1.809292 0.016449
3 0.755243 -0.157628
4 1.05405 0.141179
5 1.809292 -0.016449
<BLANKLINE>
[6 rows x 2 columns]
>>> pca.explained_variance_ratio_ # doctest:+SKIP
 principal_component_id explained_variance_ratio
0 1 0.00901
1 0 0.99099
<BLANKLINE>
[2 rows x 2 columns]
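The ratios in the explained_variance_ratio_ output above can be checked by hand. For two features that are standardized before decomposition, the principal components are the eigenvectors of the 2x2 correlation matrix, whose eigenvalues are 1 + |r| and 1 - |r| for Pearson correlation r. Standardization is an assumption here (this page does not state it), although the result is consistent with the values shown. The function name explained_variance_ratios_2d is a hypothetical helper, not part of bigframes.

```python
import math


def explained_variance_ratios_2d(xs, ys):
    """Explained variance ratios for two standardized features.

    Assumes the features are standardized first, so the eigenvalues of the
    correlation matrix [[1, r], [r, 1]] are 1 + |r| and 1 - |r|.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    # The two eigenvalues sum to 2, so each ratio is eigenvalue / 2.
    return (1 + abs(r)) / 2, (1 - abs(r)) / 2


feat0 = [-1, -2, -3, 1, 2, 3]
feat1 = [-1, -1, -2, 1, 1, 2]
first, second = explained_variance_ratios_2d(feat0, feat1)
print(round(first, 5), round(second, 5))
```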
Parameters
Name Description
n_components int, float or None, default None

Number of components to keep. If n_components is not set, all components are kept: n_components = min(n_samples, n_features). If 0 < n_components < 1, the number of components is chosen so that the fraction of variance explained is greater than the fraction specified by n_components.

svd_solver "full", "randomized" or "auto", default "auto"

The solver to use to calculate the principal components. Details: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-pca#pca_solver.
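The fractional n_components rule described above can be sketched as a cumulative sum over the explained variance ratios, taken from largest to smallest, that stops once the total exceeds the requested fraction. This is a minimal illustration of the rule as worded on this page, not the BigQuery ML implementation. The function name select_num_components is hypothetical.

```python
def select_num_components(explained_variance_ratios, threshold):
    """Smallest number of components whose cumulative ratio exceeds threshold.

    explained_variance_ratios: one ratio per component, in any order.
    threshold: a fraction with 0 < threshold < 1.
    """
    cumulative = 0.0
    ordered = sorted(explained_variance_ratios, reverse=True)
    for count, ratio in enumerate(ordered, start=1):
        cumulative += ratio
        if cumulative > threshold:
            return count
    # Fall back to keeping every component if the threshold is never exceeded.
    return len(ordered)


# Ratios taken from the PCA example on this page.
ratios = [0.00901, 0.99099]
print(select_num_components(ratios, 0.95))
print(select_num_components(ratios, 0.995))
```

With the example's ratios, a threshold of 0.95 is already exceeded by the first component alone, while 0.995 requires both.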


Last updated 2025-10-27 UTC.