FeatureExtraction[examples]
generates a FeatureExtractorFunction[…] trained from the examples given.
FeatureExtraction[examples,spec]
uses the specified feature extractor method spec.
FeatureExtraction[examples,spec,props]
gives the feature extraction properties specified by props.
FeatureExtraction
Details and Options
- FeatureExtraction is typically used to define a function that processes raw data into usable features (e.g. for training a machine learning algorithm).
- FeatureExtraction can be used on many types of data, including numerical data, text, sounds, images, graphs and time series, as well as combinations of these.
- Possible values of examples are:
-
    {example1,…}    a list of training examples
    None            no training examples
- Each examplei can be a single data element, a list of data elements or an association of data elements.
- Possible values for spec include:
-
    extractor                use the specified extractor method
    part->extractor          apply the extractor to the specified example part
    {part1->extractor1,…}    specify extractors for specific parts
- Possible feature extractor methods extractor include:
-
    Automatic                    automatic extraction
    Identity                     give data unchanged
    "ConformedData"              conformed images, colors, dates, etc.
    "NumericVector"              numeric vector from any data
    "name"                       a named extractor method
    f                            apply function f to each example
    {extractor1,extractor2,…}    use a sequence of extractors in turn
- Possible forms of part are:
-
    All                   all parts of each example
    i                     i-th part of each example
    {i1,i2,…}             parts i1, i2, … of each example
    "key"                 part with the specified key in each example
    {"key1","key2",…}     parts with names "keyi" in each example
- When explicitly specifying parts, any unmentioned parts are dropped when extracting features.
- FeatureExtraction[examples] is equivalent to FeatureExtraction[examples,Automatic], which is typically equivalent to FeatureExtraction[examples,"NumericVector"].
- The "NumericVector" method will typically convert examples to numeric vectors, impute missing data and reduce the dimension using DimensionReduction.
- Feature extractor methods specific to a single data type are applied only to data elements of a compatible type. Other data elements are returned unchanged.
- Not all specific feature extractors are available when examples is None.
- The specific extractors are:
- Numeric data:
-
    "DiscretizedVector"         discretized numerical data
    "DimensionReducedVector"    reduced-dimension numeric vectors
    "MissingImputed"            data with missing values imputed
    "StandardizedVector"        numeric data processed with Standardize
- Nominal data:
-
    "IndicatorVector"    nominal data "one-hot encoded" with indicator vectors
    "IntegerVector"      nominal data encoded with integers
- Text:
-
    "LowerCasedText"         text with each character lowercase
    "SegmentedCharacters"    text segmented into characters
    "SegmentedWords"         text segmented into words
    "SentenceVector"         semantic vector from a text
    "TFIDF"                  term frequency-inverse document frequency vector
    "WordVectors"            sequence of semantic vectors from a text (English only)
- Images:
-
    "FaceFeatures"     semantic vector from an image of a human face
    "ImageFeatures"    semantic vector from an image
    "PixelVector"      vector of pixel values from an image
- Audio objects:
-
    "AudioFeatures"           sequence of semantic vectors from an audio object
    "AudioFeatureVector"      semantic vector from an audio object
    "LPC"                     audio linear prediction coefficients
    "MelSpectrogram"          audio spectrogram with logarithmic frequency bins
    "MFCC"                    sequence of audio mel-frequency cepstral coefficient vectors
    "SpeakerFeatures"         sequence of semantic speaker vectors
    "SpeakerFeatureVector"    semantic vector for a speaker
    "Spectrogram"             audio spectrogram
- Video objects:
-
    "VideoFeatures"         sequence of semantic vectors from a video object
    "VideoFeatureVector"    semantic vector from a video object
- Graphs:
-
    "GraphFeatures"    numeric vector summarizing graph properties
- Molecules:
-
    "AtomPairs"                       Boolean vector from pairs of atoms and the path lengths between them
    "MoleculeExtendedConnectivity"    Boolean vector from enumerated molecule subgraphs
    "MoleculeFeatures"                numeric vector summarizing molecule properties
    "MoleculeTopologicalFeatures"     Boolean vector from circular atom neighborhoods
- In FeatureExtraction[examples,spec,props], props can be a single property or a list of properties. Possible properties include:
-
    "ExtractorFunction"     FeatureExtractorFunction[…] (default)
    "ExtractedFeatures"     examples after feature extraction
    "ReconstructedData"     examples after extraction and inverse extraction
    "FeatureDistance"       FeatureDistance[…] generated from the extractor
- The "ExtractedFeatures" and "ReconstructedData" properties are not available when examples is None.
- The "ReconstructedData" property can be computed only when every specified extractor is invertible.
- The following options can be given:
-
    RandomSeeding    1234    what seeding of pseudorandom generators should be done internally
- Possible settings for RandomSeeding include:
-
    Automatic    automatically reseed every time the function is called
    Inherited    use externally seeded random numbers
    seed         use an explicit integer or string as a seed
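The calling forms described above can be sketched as follows. This is a minimal illustration, not an excerpt from the original page: the dataset and the seed value 42 are made up, and the result of requesting a list of properties is assumed to be a list in the same order.

```wolfram
(* Illustrative data: four {numeric, nominal} examples *)
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Request the extractor and the extracted training features together *)
results = FeatureExtraction[data, "StandardizedVector",
   {"ExtractorFunction", "ExtractedFeatures"}]

(* Fix the internal seeding with an explicit integer seed *)
fe = FeatureExtraction[data, Automatic, RandomSeeding -> 42]
```

Because the default setting is RandomSeeding -> 1234, repeated calls without the option should already be reproducible; an explicit seed is only needed to vary or pin the random choices.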
Examples
Basic Examples (3)
Train a FeatureExtractorFunction on a simple dataset:
Extract features from a new example:
Extract features from a list of examples:
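A minimal sketch of the three steps above, using a small made-up dataset of {numeric, nominal} pairs. The outputs are numeric vectors whose exact values depend on the training data, so none are shown here.

```wolfram
(* Train on four examples, each a {numeric, nominal} pair *)
fe = FeatureExtraction[{{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}}]

(* Extract features from one new example *)
fe[{1.9, "B"}]

(* Extract features from a list of new examples *)
fe[{{1.3, "A"}, {4.3, "B"}}]
```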
Train a feature extractor on a dataset of images:
Use the feature extractor on the training set:
Specify a specific extractor:
Scope (32)
Input Shape (9)
Train a feature extractor on a list of examples with a single feature:
Extract features from a new example:
Extract features from multiple new examples:
Train a feature extractor on a list of examples with multiple features:
Extract features from multiple new examples:
Train a feature extractor on a mixed-type dataset:
Extract features from a new example:
Train a feature extractor from a list of associations:
Extract features from a new example:
Extract features from multiple new examples:
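The association-based workflow above might look like the following sketch. The keys "age" and "gender" and all values are hypothetical and chosen only for illustration.

```wolfram
(* Each training example is an association of named data elements *)
data = {
   <|"age" -> 23, "gender" -> "male"|>,
   <|"age" -> 45, "gender" -> "female"|>,
   <|"age" -> 31, "gender" -> "female"|>,
   <|"age" -> 52, "gender" -> "male"|>};

fe = FeatureExtraction[data]

(* One new example *)
fe[<|"age" -> 36, "gender" -> "male"|>]

(* Several new examples *)
fe[{<|"age" -> 28, "gender" -> "female"|>, <|"age" -> 60, "gender" -> "male"|>}]
```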
Train a feature extractor from data given as feature lists:
Train a feature extractor from a Tabular:
Train a feature extractor from a Dataset:
Train a feature extractor from a dataset that contains missing values:
Define a feature extractor that requires no training:
Apply it on some text:
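A possible sketch of an extractor defined without training data, passing None as the examples. The choice of "LowerCasedText" is an assumption made here; the original example may use a different method, and not every method is available without examples.

```wolfram
(* No training examples: the extractor is defined by the method alone *)
fe = FeatureExtraction[None, "LowerCasedText"]

(* Apply it to a string *)
fe["Hello World"]
```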
Extractor Specifications (10)
Specify the feature extractor "SentenceVector" on a single textual feature:
Apply it on some text:
Train a feature extractor using the "StandardizedVector" method:
Extract features from a new example:
Since this feature extractor is invertible, the FeatureExtractorFunction property "OriginalData" can be used to perform the inverse extraction:
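The "StandardizedVector" steps above can be sketched as follows, with made-up numeric data. The sketch assumes that the "OriginalData" property is requested by passing it as a second argument to the FeatureExtractorFunction.

```wolfram
(* Train on single-feature numeric examples *)
fe = FeatureExtraction[{1.4, 1.5, 2.3, 5.4}, "StandardizedVector"]

(* Extract features from a new example *)
features = fe[2.]

(* Invert the extraction to recover a value in the original scale *)
fe[features, "OriginalData"]
```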
Train a feature extractor on text using the "TFIDF" method followed by the "DimensionReducedVector" method:
Extract features on the training set:
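A short sketch of chaining two methods, with four made-up sentences as training texts. The list {"TFIDF", "DimensionReducedVector"} applies the extractors in turn, as described for the {extractor1,extractor2,…} form.

```wolfram
texts = {"the cat is grey", "my cat is fast", "this dog is scary", "the big dog"};

(* TFIDF vectors first, then dimension reduction on those vectors *)
fe = FeatureExtraction[texts, {"TFIDF", "DimensionReducedVector"}]

(* Apply the trained extractor to the training set *)
fe[texts]
```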
Train a feature extractor on texts and images using the text-only "TFIDF" method:
Features will only be extracted from the text part:
Specify the feature extraction on multiple features by position:
Use the feature extractor on new features:
A list of two items will be assumed to be a single input of two features:
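Specifying extractors by position might look like this sketch, which uses a made-up dataset and pairs part 1 with "StandardizedVector" and part 2 with "IndicatorVector". Both pairings are illustrative choices, not taken from the original example.

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Part 1 is standardized; part 2 is one-hot encoded *)
fe = FeatureExtraction[data, {1 -> "StandardizedVector", 2 -> "IndicatorVector"}]

(* A list of examples *)
fe[{{1.9, "B"}, {3.2, "A"}}]

(* A list of two items is read as one example with two features *)
fe[{1.9, "B"}]
```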
Train a feature extractor with the "IndicatorVector" method on only the second nominal variable:
The first nominal variable is dropped:
Use the Identity extractor method to copy the first variable:
The first variable is copied:
A variable can be copied multiple times:
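The effect of mentioning only some parts can be sketched as below, with hypothetical two-column nominal data. In the first extractor only part 2 is mentioned, so part 1 should be dropped; in the second, Identity is used to pass part 1 through unchanged.

```wolfram
data = {{"A", "x"}, {"B", "y"}, {"A", "y"}, {"B", "x"}};

(* Only the second variable is mentioned, so the first is dropped *)
feSecond = FeatureExtraction[data, {2 -> "IndicatorVector"}]
feSecond[{"A", "x"}]

(* Identity copies the first variable alongside the encoded second one *)
feBoth = FeatureExtraction[data, {1 -> Identity, 2 -> "IndicatorVector"}]
feBoth[{"A", "x"}]
```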
Specify the feature extraction on multiple features by key:
Use the feature extractor on new features:
Using the feature extractor on a list will assume the same ordering of features as originally specified:
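A sketch of the key-based form, again with hypothetical keys "age" and "gender". The last call passes a plain list, which is assumed to follow the key order used in the specification.

```wolfram
data = {
   <|"age" -> 23, "gender" -> "male"|>,
   <|"age" -> 45, "gender" -> "female"|>,
   <|"age" -> 31, "gender" -> "female"|>,
   <|"age" -> 52, "gender" -> "male"|>};

(* Assign one extractor per key *)
fe = FeatureExtraction[data,
   {"age" -> "StandardizedVector", "gender" -> "IndicatorVector"}]

(* New example given as an association *)
fe[<|"age" -> 36, "gender" -> "male"|>]

(* New example given as a list, in the order age, gender *)
fe[{36, "male"}]
```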
Generate a feature extractor using a custom function:
Apply the extractor on the training set:
Chain the custom extractor with the "StandardizedVector" method:
Conform data prior to processing:
Reduce the dimensionality of the output:
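The first steps of the custom-function example can be sketched as follows. StringLength is used here as an illustrative function f; it is an assumption of this sketch, not necessarily the function used in the original example. The conforming and dimension-reduction steps are not shown.

```wolfram
texts = {"the cat is grey", "my cat is fast", "this dog is scary", "the big dog"};

(* A custom function is applied to each example *)
feLength = FeatureExtraction[texts, StringLength]
feLength[texts]

(* Chain the custom function with standardization *)
feChained = FeatureExtraction[texts, {StringLength, "StandardizedVector"}]
feChained[texts]
```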
Feature Types (10)
Create a feature extractor for textual data using the "SentenceVector" extractor with no training:
Input type is inferred from the specified extractor. Use the feature extractor on some examples:
Create a feature extractor for examples with implicit textual and image features:
Features will be extracted from both parts:
Train a feature extractor on textual data:
Train a feature extractor with the "IndicatorVector" method on nominal variables:
Train a feature extractor to compute term frequency-inverse document frequency vectors from texts:
The term frequency-inverse document frequency matrix of the training set can be computed as a SparseArray:
Visualize the matrix:
The "TFIDF" method can also be used on tokenized data (nominal bags):
Train a feature extractor on a list of DateObject instances:
Extract features from a new DateObject:
A string date can also be given:
Train a feature extractor on a list of Graph instances:
Extract features from a new graph:
Train a feature extractor on a list of TimeSeries instances:
Train a feature extractor on Molecule data:
Train a feature extractor on a list of Audio instances:
Information (3)
Get Information from a trained FeatureExtractorFunction:
Find the available properties:
Get information about the input and output types:
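A hedged sketch of querying a trained extractor with Information. Only the general summary and the "Properties" query are shown; the specific property names for input and output types are not reproduced here and should be read from the list that "Properties" returns.

```wolfram
fe = FeatureExtraction[{{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}}];

(* Summary of the trained extractor *)
Information[fe]

(* List the property names that can be queried *)
Information[fe, "Properties"]
```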
Options (4)
FeatureNames (2)
Train a feature extractor and give a name to each feature:
Use the association format to extract features from a new example:
The list format can still be used:
Use FeatureNames to set up names and refer to them in FeatureExtraction[examples,{spec1->ext1,…}]:
Extract features on a new example using the names to specify the features:
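The FeatureNames usage above might look like the following sketch. The names "size" and "class" and the data are hypothetical.

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Name each feature at training time *)
fe = FeatureExtraction[data, FeatureNames -> {"size", "class"}]

(* New example in association format, then in list format *)
fe[<|"size" -> 1.9, "class" -> "B"|>]
fe[{1.9, "B"}]

(* Refer to the names when assigning extractors to parts *)
feNamed = FeatureExtraction[data,
   {"size" -> "StandardizedVector", "class" -> "IndicatorVector"},
   FeatureNames -> {"size", "class"}]
feNamed[<|"size" -> 1.9, "class" -> "B"|>]
```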
FeatureTypes (2)
Train a feature extractor with the "IndicatorVector" method on a simple dataset:
The first feature has been interpreted as numerical. Since the "IndicatorVector" method only acts on nominal features, the first feature is unchanged:
Use FeatureTypes to enforce the interpretation of the first feature as nominal:
Now both features are encoded as indicator vectors:
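A sketch of the FeatureTypes comparison above, using made-up data whose first feature is an integer. The list form of the FeatureTypes setting is assumed here.

```wolfram
data = {{1, "A"}, {2, "B"}, {3, "A"}, {1, "B"}};

(* Without FeatureTypes, the first feature is treated as numerical *)
feDefault = FeatureExtraction[data, "IndicatorVector"]
feDefault[{2, "B"}]

(* Force both features to be interpreted as nominal *)
feNominal = FeatureExtraction[data, "IndicatorVector",
   FeatureTypes -> {"Nominal", "Nominal"}]
feNominal[{2, "B"}]
```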
Creating a feature extractor with no training infers the expected data type from the specific extractor:
Specifying the feature type will override the assumption:
Apply to named features:
Applications (3)
Image Search (1)
Construct a dataset of dog images:
Train an extractor function from this dataset:
Generate a NearestFunction on the extracted features of the dataset:
Using the NearestFunction, construct a function that displays the nearest image of the dataset:
Use this function on images that are not in the dataset:
This feature extractor function can also be used to delete image pairs that are too similar:
Text Search (1)
Load the text of Alice in Wonderland:
Split the text into sentences:
Train a feature extractor on these sentences:
Generate a NearestFunction with the sentences' features:
Using the NearestFunction, construct a function that displays the nearest sentence in Alice in Wonderland:
Use this function with a few queries:
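The text search steps above can be sketched as follows. The automatic extractor is used because the original method is not shown; nearestSentence is a hypothetical helper name, and the query string is made up.

```wolfram
(* Load the text and split it into sentences *)
text = ExampleData[{"Text", "AliceInWonderland"}];
sentences = TextSentences[text];

(* Train an extractor on the sentences *)
fe = FeatureExtraction[sentences];

(* Map each feature vector back to its sentence *)
nf = Nearest[fe[sentences] -> sentences];

(* Return the sentence whose features are closest to those of the query *)
nearestSentence[query_String] := First[nf[fe[query]]]

nearestSentence["the queen was angry"]
```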
Imputation (1)
Load the "MNIST" dataset from ExampleData and keep the images:
Convert images to numerical data and separate the dataset into a training set and a test set:
The dimension of the dataset is 784:
Create a feature extractor using the "MissingImputed" method:
Replace some values of a test-set vector by Missing[] and visualize it:
Impute missing values using the FeatureExtractorFunction[…]:
Visualize the original image, the image with missing values, and the imputed image:
Properties & Relations (4)
Train a feature extractor from data with named features:
Unrecognized keys will be ignored:
FeatureExtraction[…,"ExtractedFeatures"] is equivalent to FeatureExtract[…]:
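A sketch of the two equivalent calls, with made-up data. Both are expected to return the extracted features of the training examples; no output is shown because the values depend on the training.

```wolfram
data = {{1.4, "A"}, {1.5, "A"}, {2.3, "B"}, {5.4, "B"}};

(* Ask FeatureExtraction directly for the extracted features *)
viaProperty = FeatureExtraction[data, Automatic, "ExtractedFeatures"]

(* The same request through FeatureExtract *)
viaExtract = FeatureExtract[data]
```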
The "FeatureDistance" property is equivalent to using FeatureDistance on the extractor:
Compute the FeatureExtractorFunction first:
Construct a feature distance for this feature extractor:
The two distance functions are identical:
Creating a FeatureExtractorFunction on some training data creates a feature space representing those features:
Using different training data can result in a differently sized feature space:
Creating the same extractor with no data will result in an untrained function that consistently gives the same results in the same feature space:
Possible Issues (7)
Training an extractor on anonymous data will use automatic feature names:
Using custom names when applying the function will give a feature missing error:
Feature names can be specified at training time:
Check the feature names of a FeatureExtractorFunction:
The custom name can now be used:
The FeatureExtraction property "ReconstructedData" can be used to obtain the data after extraction and reconstruction:
Some feature extractors can only perform an approximation of the inverse extraction:
Some feature extractors cannot be inverted:
The property "ReconstructedData" cannot be used without training data:
Some extractors can be created without needing data:
Others require examples to initialize them:
Similarly, not all properties are supported:
Extractors that do not match the data type are ignored:
The input type is "Nominal", so the "LowerCasedText" extractor leaves the input unchanged:
Similarly, forcing the input to "Text" will cause the "IndicatorVector" to be ignored:
The "ConformedData" extractor requires additional information to operate in a data-free context:
Specifying the FeatureTypes explicitly:
The feature type can also be implicitly inferred from subsequent extractors:
The automatic feature extraction often applies a dimension reduction step:
Explicit feature extractors do not include dimension reduction and typically result in longer vectors:
Use the "DimensionReducedVector" to add a dimension reduction step:
Dimension reduction must be trained on the available features and therefore cannot be applied when no data is provided:
History
Introduced in 2016 (11.0) | Updated in 2017 (11.2) ▪ 2020 (12.1) ▪ 2020 (12.2) ▪ 2021 (12.3) ▪ 2025 (14.3)
Wolfram Research (2016), FeatureExtraction, Wolfram Language function, https://reference.wolfram.com/language/ref/FeatureExtraction.html (updated 2025).