The ML.FEATURES_AT_TIME function

This document describes the ML.FEATURES_AT_TIME function, which lets you use a point-in-time cutoff for all entities when retrieving features, because features can have time dependencies if they include time-sensitive data. To avoid data leakage, use point-in-time features when training models and running inference.

Use this function to use the same point-in-time cutoff for all entities when retrieving features. Use the ML.ENTITY_FEATURES_AT_TIME function to retrieve features from multiple points in time for multiple entities.

Syntax

ML.FEATURES_AT_TIME(
 { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) }
 [, TIME => TIMESTAMP][, NUM_ROWS => INT64][, IGNORE_FEATURE_NULLS => BOOL])

Arguments

ML.FEATURES_AT_TIME takes the following arguments:

  • PROJECT_ID: the project that contains the table.
  • DATASET: the BigQuery dataset that contains the table.
  • TABLE_NAME: is the name of the BigQuery table that contains the feature data. The feature table must contain the following columns:

    • entity_id: a STRING column that contains the ID of the entity related to the features.
    • One or more feature columns.
    • feature_timestamp: a TIMESTAMP column that identifies when the feature data was last updated.

    The column names are case-insensitive. For example, you can use a column named Entity_ID instead of entity_id.

    The feature table must be in wide format, with one column for each feature.

  • QUERY_STATEMENT: a STRING value that specifies a GoogleSQL query that returns the feature data. This query must return the same columns as the TABLE_NAME argument. See GoogleSQL query syntax for the supported SQL syntax of the QUERY_STATEMENT clause.

  • TIME: a TIMESTAMP value that specifies the point in time to use as a cutoff for feature data. Only rows where the value in the feature_timestamp column is equal to or earlier than the TIME value are returned. Defaults to the value of the CURRENT_TIMESTAMP function.

  • NUM_ROWS: an INT64 value that specifies the number of rows to return for each entity ID. Defaults to 1.

  • IGNORE_FEATURE_NULLS: a BOOL value that indicates whether to replace a NULL value in a feature column with the feature column value from the row for the same entity that immediately precedes it in time. For example, for the following feature table:

    +-----------+------+------+--------------------------+
    | entity_id | f1 | f2 | feature_timestamp |
    +-----------+------+------+--------------------------+
    | '2' | 5.0 | 8.0 | '2022-06-10 09:00:00+00' |
    +-----------+------+------+--------------------------+
    | '2' | 2.0 | 4.0 | '2022-06-10 12:00:00+00' |
    +-----------+------+------+--------------------------+
    | '2' | 7.0 | NULL | '2022-06-11 10:00:00+00' |
    +-----------+------+------+--------------------------+
    

    Running this query:

    SELECT*
    FROM
    ML.FEATURES_AT_TIME(
    TABLEmydataset.feature_table,
    time=>'2022-06-11 10:00:00+00',
    num_rows=>1,
    ignore_feature_nulls=>TRUE);

    Results in the following output, where the f2 value from the row for entity ID 2 that is timestamped '2022年06月10日 12:00:00+00' is substituted for the NULL value in the row timestamped '2022年06月11日 10:00:00+00':

    +-----------+------+------+--------------------------+
    | entity_id | f1 | f2 | feature_timestamp |
    +-----------+------+------+--------------------------+
    | '2' | 7.0 | 4.0 | '2022-06-11 10:00:00+00' |
    +-----------+------+------+--------------------------+
    

    If there is no available replacement value, for example, where there is no earlier row for that entity ID, a NULL value is returned.

    Defaults to FALSE.

Output

The ML.FEATURES_AT_TIME function returns the input table rows that meet the point-in-time cutoff criteria, with the feature_timestamp column showing the timestamp that was input in the time argument.

Examples

Example 1

This example shows a how to retrain a model using only features that were created or updated before 2023年01月01日 12:00:00+00:

CREATEORREPLACE
`mydataset.mymodel`OPTIONS(WARM_START=TRUE)
AS
SELECT*EXCEPT(feature_timestamp,entity_id)
FROM
ML.FEATURES_AT_TIME(
TABLE`mydataset.feature_table`,
time=>'2023-01-01 12:00:00+00',
num_rows=>1,
ignore_feature_nulls=>TRUE);

Example 2

This example shows how to get predictions from a model based on features that were created or updated before 2023年01月01日 12:00:00+00:

SELECT
*
FROM
ML.PREDICT(
MODEL`mydataset.mymodel`,
(
SELECT*EXCEPT(feature_timestamp,entity_id)
FROM
ML.FEATURES_AT_TIME(
TABLE`mydataset.feature_table`,
time=>'2023-01-01 12:00:00+00',
num_rows=>1,
ignore_feature_nulls=>TRUE)
)
);

Example 3

This is a contrived example that you can use to see the output of the function:

WITH
feature_tableAS(
SELECT*FROMUNNEST(
ARRAY<STRUCT<entity_idSTRING,f_1FLOAT64,f_2FLOAT64,feature_timestampTIMESTAMP>>[
('id1',1.0,1.0,TIMESTAMP'2022-06-10 12:00:00+00'),
('id2',12.0,24.0,TIMESTAMP'2022-06-11 12:00:00+00'),
('id1',11.0,NULL,TIMESTAMP'2022-06-11 12:00:00+00'),
('id1',6.0,12.0,TIMESTAMP'2022-06-11 10:00:00+00'),
('id2',2.0,4.0,TIMESTAMP'2022-06-10 12:00:00+00'),
('id2',7.0,NULL,TIMESTAMP'2022-06-11 10:00:00+00')])
)
SELECT*
FROM
ML.FEATURES_AT_TIME(
TABLEfeature_table,
time=>'2022-06-12 10:00:00+00',
num_rows=>1,
ignore_feature_nulls=>TRUE);

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025年12月09日 UTC.