"RandomForest" (Machine Learning Method)
Details & Suboptions
- Random forest is an ensemble learning method for classification and regression that operates by constructing a multitude of decision trees. The forest prediction is the most common class among the tree predictions (classification) or the mean of the tree predictions (regression). Each decision tree is trained on a random subset of the training set (bootstrap aggregating, also called bagging) and uses only a random subset of the features.
- The following suboptions can be given: "DistributionSmoothing", "FeatureFraction", "LeafSize" and "TreeNumber".
- "FeatureFraction", "LeafSize" and "DistributionSmoothing" can be used to control overfitting.
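The bagging procedure described above can be illustrated outside the Wolfram Language with a minimal Python sketch. This is not how the "RandomForest" method is implemented: each tree is reduced to a single-split stump for brevity, and the names `train_stump`, `train_forest`, `predict`, `tree_number` and `feature_fraction`, as well as the toy data, are hypothetical and only echo the suboption names.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature_ids):
    """Find the one-split rule (feature, threshold) with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No valid split (all sampled values identical): predict the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]

def train_forest(rows, labels, tree_number=25, feature_fraction=0.5, seed=0):
    """Train each stump on a bootstrap sample and a random subset of the features."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    k = max(1, round(feature_fraction * d))
    forest = []
    for _ in range(tree_number):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), k)             # random feature subset
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Classification: return the most common class among the tree predictions."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features, two classes.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.1), (6.0, 5.5), (6.5, 6.2), (7.0, 5.9)]
labels = ["A", "A", "A", "B", "B", "B"]
forest = train_forest(rows, labels)
print(predict(forest, (0.5, 0.5)), predict(forest, (8.0, 8.0)))
```

For regression, the vote in `predict` would be replaced by the mean of the tree outputs.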
Examples
Basic Examples (3)
Train a predictor on labeled examples:
Obtain information about the predictor:
Predict a new example:
Train a classifier function on labeled examples:
Plot the probability that the class of an example is "A" or "B" as a function of the feature and compare them:
Train a predictor function on labeled data:
Compare the data with the predicted values and look at the standard deviation:
Options (6)
"DistributionSmoothing" (2)
Train a classifier using the "DistributionSmoothing" suboption:
Use the "Titanic" training set to train a classifier with the default value of "DistributionSmoothing":
Train a second classifier using a large "DistributionSmoothing":
Compare the probabilities for examples from a test set:
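One way to see why a large smoothing value changes the predicted probabilities is the Python sketch below. It assumes additive (Laplace-style) smoothing of the class counts in a leaf; the source does not state the formula that "DistributionSmoothing" actually uses, so this is an illustration of the general idea only. The function name `smoothed_probabilities` and the leaf counts are hypothetical.

```python
def smoothed_probabilities(counts, smoothing):
    """Add `smoothing` pseudo-counts to every class, then normalize (assumed formula)."""
    total = sum(counts.values()) + smoothing * len(counts)
    return {label: (n + smoothing) / total for label, n in counts.items()}

# Hypothetical leaf holding 9 examples of one class and 1 of the other.
leaf_counts = {"survived": 9, "died": 1}

mild = smoothed_probabilities(leaf_counts, 0.5)      # small smoothing value
strong = smoothed_probabilities(leaf_counts, 100.0)  # large smoothing value

print(mild)
print(strong)
```

Under this assumption, a large smoothing value pulls the class probabilities toward the uniform distribution, so the classifier reports less confident probabilities.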
"FeatureFraction" (2)
Train a predictor on high-dimensional data using the "FeatureFraction" suboption:
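In the "RandomForest" method, a well-chosen "FeatureFraction" helps limit overfitting: a smaller fraction makes the trees less correlated with one another, while a fraction that is too small leaves each tree with too few informative features.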
Use the "Titanic" training set to train two classifiers with different values of "FeatureFraction":
Compare the accuracy of these classifiers on both the test set and the training set:
"LeafSize" (1)
Use the "Titanic" training set to train two classifiers with different values of "LeafSize":
Compare the size of the corresponding forests:
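The link between leaf size and forest size can be sketched in Python with a toy model. It assumes that the leaf-size setting bounds how many training examples a leaf may hold, which the source does not spell out, and it splits at the median instead of searching for a good split. The function `count_leaves` is hypothetical.

```python
def count_leaves(values, leaf_size):
    """Split at the median until every leaf holds at most `leaf_size` examples."""
    if len(values) <= leaf_size:
        return 1
    mid = len(values) // 2
    return count_leaves(values[:mid], leaf_size) + count_leaves(values[mid:], leaf_size)

examples = list(range(100))          # 100 hypothetical training examples
small_leaves = count_leaves(examples, 1)
large_leaves = count_leaves(examples, 25)
print(small_leaves, large_leaves)
```

In this toy model, allowing more examples per leaf yields far fewer leaves per tree, and therefore a smaller forest.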
"TreeNumber" (1)
Use the "Mushroom" training set to train two classifiers with different values of "TreeNumber":
Look at the training time of these classifiers:
See Also
Classify Predict ClassifierFunction PredictorFunction ClassifierMeasurements PredictorMeasurements SequencePredict ClusterClassify
Methods: DecisionTree LinearRegression LogisticRegression GaussianProcess GradientBoostedTrees Markov NaiveBayes NearestNeighbors NeuralNetwork SupportVectorMachine