Module: tft


Init module for TF.Transform.

Modules

coders module: Module level imports for tensorflow_transform.coders.

experimental module: Module level imports for tensorflow_transform.experimental.

Classes

class DatasetMetadata: Metadata about a dataset used for the "instance dict" format.

class TFTransformOutput: A wrapper around the output of tf.Transform.

class TransformFeaturesLayer: A Keras layer for applying a tf.Transform output to input layers.
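
TFTransformOutput and TransformFeaturesLayer are typically used together at training or serving time to re-apply the transformations produced by an earlier tf.Transform run. A minimal sketch, assuming a hypothetical output directory and feature names:

import tensorflow as tf
import tensorflow_transform as tft

# Directory written by a previously run tf.Transform pipeline (hypothetical path).
tf_transform_output = tft.TFTransformOutput("/tmp/transform_output")

# The layer wraps the saved transform graph so it can be applied to raw features.
transform_layer = tf_transform_output.transform_features_layer()

raw_features = {
    "text": tf.constant(["hello world"]),
    "age": tf.constant([42.0]),
}
transformed_features = transform_layer(raw_features)  # dict of transformed Tensors

The raw feature names and dtypes must match the schema the pipeline was analyzed with.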

Functions

annotate_asset(...): Creates mapping between user-defined keys and SavedModel assets.

apply_buckets(...): Returns a bucketized column, with a bucket index assigned to each input.

apply_buckets_with_interpolation(...): Interpolates within the provided buckets and then normalizes to 0 to 1.
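
apply_buckets pairs naturally with the quantiles analyzer listed later in this section. A minimal preprocessing_fn sketch, in which the "price" feature and the parameter values are assumptions:

import tensorflow_transform as tft

def preprocessing_fn(inputs):
  price = inputs["price"]
  # tft.quantiles is an analyzer: it computes approximate bucket boundaries
  # over the whole dataset; epsilon bounds the approximation error.
  boundaries = tft.quantiles(price, num_buckets=10, epsilon=0.01)
  return {
      "price_bucket": tft.apply_buckets(price, boundaries),
      # Same boundaries, but interpolated and normalized into [0, 1].
      "price_smoothed": tft.apply_buckets_with_interpolation(price, boundaries),
  }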

apply_pyfunc(...): Applies a python function to some Tensors.

apply_vocabulary(...): Maps x to a vocabulary specified by the deferred tensor.

bag_of_words(...): Computes a bag of "words" based on the specified ngram configuration.

bucketize(...): Returns a bucketized column, with a bucket index assigned to each input.

bucketize_per_key(...): Returns a bucketized column, with a bucket index assigned to each input, computed per key.

compute_and_apply_vocabulary(...): Generates a vocabulary for x and maps it to an integer with this vocab.
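
A minimal sketch of compute_and_apply_vocabulary inside a preprocessing_fn; the "color" feature and the parameter values shown are assumptions:

import tensorflow_transform as tft

def preprocessing_fn(inputs):
  return {
      "color_id": tft.compute_and_apply_vocabulary(
          inputs["color"],
          top_k=1000,          # keep only the 1000 most frequent terms
          num_oov_buckets=1),  # bucket everything else into a single OOV id
  }

Out-of-vocabulary values map into the OOV bucket(s) if configured, otherwise to the default value.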

count_per_key(...): Computes the count of each element of a Tensor.

covariance(...): Computes the covariance matrix over the whole dataset.

deduplicate_tensor_per_row(...): Deduplicates each row (0-th dimension) of the provided tensor.

estimated_probability_density(...): Computes an approximate probability density at each x, given the bins.

get_analyze_input_columns(...): Returns columns that are required inputs of AnalyzeDataset.

get_num_buckets_for_transformed_feature(...): Provides the number of buckets for a transformed feature if annotated.

get_transform_input_columns(...): Returns columns that are required inputs of TransformDataset.

hash_strings(...): Hashes strings into buckets.
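
hash_strings is a cheaper alternative to computing a vocabulary when hash collisions are acceptable. A minimal sketch, where the "user_id" feature and the bucket count are assumptions:

import tensorflow_transform as tft

def preprocessing_fn(inputs):
  return {
      # Deterministically hash each string into one of 1000 buckets.
      "user_id_hashed": tft.hash_strings(inputs["user_id"], hash_buckets=1000),
  }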

histogram(...): Computes a histogram over x, given the bin boundaries or bin count.

make_and_track_object(...): Keeps track of the object created by invoking trackable_factory_callable.
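
make_and_track_object is used in native TF2 preprocessing functions so that trackable objects (for example tf.lookup tables) are created once and serialized with the transform graph. A minimal sketch, where the "size" feature and the table contents are assumptions:

import tensorflow as tf
import tensorflow_transform as tft

def preprocessing_fn(inputs):
  # The factory callable is invoked once; the returned trackable is tracked
  # so it ends up in the exported transform SavedModel.
  table = tft.make_and_track_object(
      lambda: tf.lookup.StaticHashTable(
          tf.lookup.KeyValueTensorInitializer(
              keys=["small", "medium", "large"],
              values=[0, 1, 2],
              key_dtype=tf.string,
              value_dtype=tf.int64),
          default_value=-1))
  return {"size_id": table.lookup(inputs["size"])}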

max(...): Computes the maximum of the values of x over the whole dataset.

mean(...): Computes the mean of the values of a Tensor over the whole dataset.

min(...): Computes the minimum of the values of x over the whole dataset.

ngrams(...): Creates a SparseTensor of n-grams.
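
A minimal ngrams sketch; the "text" feature is an assumption, and whitespace tokenization with tf.strings.split is just one way to produce the SparseTensor of tokens the function expects:

import tensorflow as tf
import tensorflow_transform as tft

def preprocessing_fn(inputs):
  # Split each string into tokens, then convert the RaggedTensor to the
  # SparseTensor form tft.ngrams operates on.
  tokens = tf.strings.split(inputs["text"]).to_sparse()
  return {
      # Unigrams and bigrams, joined with a space separator.
      "text_ngrams": tft.ngrams(tokens, ngram_range=(1, 2), separator=" "),
  }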

pca(...): Computes PCA on the dataset using biased covariance.

quantiles(...): Computes the quantile boundaries of a Tensor over the whole dataset.

scale_by_min_max(...): Scale a numerical column into the range [output_min, output_max].

scale_by_min_max_per_key(...): Scale a numerical column into a predefined range on a per-key basis.

scale_to_0_1(...): Returns a column which is the input column scaled to have range [0,1].

scale_to_0_1_per_key(...): Returns a column which is the input column scaled to have range [0,1], computed per key.

scale_to_gaussian(...): Returns an (approximately) normal column with mean 0 and variance 1.

scale_to_z_score(...): Returns a standardized column with mean 0 and variance 1.

scale_to_z_score_per_key(...): Returns a standardized column with mean 0 and variance 1, grouped per key.
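
The scaling functions above are full-pass operations: the minimum/maximum, mean, and variance they need are computed over the whole dataset during analysis. A minimal sketch combining a few of them, with hypothetical feature names:

import tensorflow_transform as tft

def preprocessing_fn(inputs):
  return {
      "age_01": tft.scale_to_0_1(inputs["age"]),
      "income_z": tft.scale_to_z_score(inputs["income"]),
      # Standardize income separately within each country.
      "income_z_per_country": tft.scale_to_z_score_per_key(
          inputs["income"], key=inputs["country"]),
  }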

segment_indices(...): Returns a Tensor of indices within each segment.

size(...): Computes the total size of instances in a Tensor over the whole dataset.

sparse_tensor_left_align(...): Re-arranges a tf.SparseTensor and returns a left-aligned version of it.

sparse_tensor_to_dense_with_shape(...): Converts a SparseTensor into a dense tensor and sets its shape.

sum(...): Computes the sum of the values of a Tensor over the whole dataset.

tfidf(...): Maps the terms in x to their term frequency * inverse document frequency.
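
tfidf expects integerized terms, so it is usually preceded by compute_and_apply_vocabulary. A minimal sketch in which the "text" feature, the vocabulary size, and the tokenization are assumptions; the second argument is the total number of term ids, here top_k plus the single OOV bucket:

import tensorflow as tf
import tensorflow_transform as tft

VOCAB_SIZE = 10000  # assumed vocabulary size

def preprocessing_fn(inputs):
  tokens = tf.strings.split(inputs["text"]).to_sparse()
  term_ids = tft.compute_and_apply_vocabulary(
      tokens, top_k=VOCAB_SIZE, num_oov_buckets=1)
  # Returns two SparseTensors: the term indices and their tf*idf weights.
  term_index, tfidf_weight = tft.tfidf(term_ids, VOCAB_SIZE + 1)
  return {"tfidf_index": term_index, "tfidf_weight": tfidf_weight}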

tukey_h_params(...): Computes the h parameters of the values of a Tensor over the dataset.

tukey_location(...): Computes the location of the values of a Tensor over the whole dataset.

tukey_scale(...): Computes the scale of the values of a Tensor over the whole dataset.

var(...): Computes the variance of the values of a Tensor over the whole dataset.

vocabulary(...): Computes the unique values of x over the whole dataset.
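
vocabulary is the analyzer half of compute_and_apply_vocabulary: it writes a vocabulary file during analysis and returns a deferred tensor holding its path, which apply_vocabulary (listed above) then consumes. A minimal sketch, where the "city" feature and the filename are assumptions:

import tensorflow_transform as tft

def preprocessing_fn(inputs):
  city_vocab = tft.vocabulary(inputs["city"], vocab_filename="city_vocab")
  return {
      "city_id": tft.apply_vocabulary(inputs["city"], city_vocab),
  }

Keeping the vocabulary handle separate is useful when the same vocabulary should be applied to several features or looked up later via TFTransformOutput.vocabulary_file_by_name.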

word_count(...): Finds the token count of each document/row.

Other Members

__version__ '1.16.0'
