The ML.IMPUTER function

This document describes the ML.IMPUTER function, which lets you replace NULL values in a string or numerical expression. You can replace NULL values with the most frequently used value for string expressions, or the mean or median value for numerical expressions.

When used in the TRANSFORM clause, the values calculated during training for mean, median, and most frequently used value are automatically used in prediction.

You can use this function with models that support manual feature preprocessing. For more information, see the following documents:

Syntax

ML.IMPUTER(expression, strategy) OVER()

Arguments

ML.IMPUTER takes the following arguments:

  • expression: the numerical or STRING expression to impute.
  • strategy: a STRING value that specifies how to replace NULL values. Valid values are as follows:
    • mean: the mean of expression. You can only use this value with numerical expressions.
    • median: the median of expression. You can only use this value with numerical expressions.
    • most_frequent: the most frequent value in expression.

Output

ML.IMPUTER returns a FLOAT64 (for numerical expressions) or STRING (for string expressions) value that contains the replacement for the NULL value.

Examples

Example 1

The following example imputes numerical expressions:

SELECTf,ML.IMPUTER(f,'mean')OVER()ASoutput
FROM
UNNEST([NULL,-3,-3,-3,1,2,3,4,5])ASf
ORDERBYf;

The output looks similar to the following:

+------+--------+
| f | output |
+------+--------+
| NULL | 0.75 |
| -3 | -3.0 |
| -3 | -3.0 |
| -3 | -3.0 |
| 1 | 1.0 |
| 2 | 2.0 |
| 3 | 3.0 |
| 4 | 4.0 |
| 5 | 5.0 |
+------+--------+

Example 2

The following example imputes string expressions:

SELECTf,ML.IMPUTER(f,'most_frequent')OVER()ASoutput
FROM
UNNEST([NULL,NULL,NULL,NULL,'a','a','b','b','c','c','c'])ASf
ORDERBYf;

The output looks similar to the following:

+------+--------+
| f | output |
+------+--------+
| NULL | c |
| NULL | c |
| NULL | c |
| NULL | c |
| a | a |
| a | a |
| b | b |
| b | b |
| c | c |
| c | c |
| c | c |
+------+--------+

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025年10月24日 UTC.