

class RandomForestClassifierModel

Create a ‘new’ instance of a Random Forest Classifier model.

Random Forest [R54] is a supervised ensemble learning algorithm that can be used for binary and multi-class classification. The Random Forest Classifier model is initialized, trained on columns of a frame, used to predict the labels of observations in a frame, and tested by comparing the predicted labels against the true labels. This model runs the MLlib implementation of Random Forest [R55]. During training, the decision trees are trained in parallel. During prediction, each tree’s prediction is counted as a vote for one class, and the predicted label is the class that receives the most votes. During testing, the labels of the observations are predicted and compared against the true labels using the built-in binary and multi-class Classification Metrics.

Footnotes

[R54] https://en.wikipedia.org/wiki/Random_forest
[R55] https://spark.apache.org/docs/1.3.0/mllib-ensembles.html
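
The voting and testing steps described above can be illustrated with a minimal, library-independent sketch. It is not the MLlib implementation: the helper names `majority_vote` and `accuracy`, the sample votes, and the tie-breaking rule (first label encountered wins) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label that receives the most votes.

    Assumption: ties go to the label encountered first; the real
    tie-breaking rule used by MLlib is not stated in this document.
    """
    counts = Counter(tree_predictions)
    # most_common(1) returns a one-element list of (label, count)
    return counts.most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of observations whose predicted label equals the true label."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / float(len(actual))


# Hypothetical data: each inner list holds the votes of three trees
# for a single observation.
votes_per_observation = [
    [1, 0, 1],
    [0, 0, 1],
    [2, 2, 1],
]
predicted_labels = [majority_vote(votes) for votes in votes_per_observation]
true_labels = [1, 0, 1]
score = accuracy(predicted_labels, true_labels)
```

Accuracy is only one of the possible classification metrics; it is used here because it is the simplest comparison of predicted and true labels.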

Attributes

name Set or get the name of the model object.

Methods

__init__(self[, name, _info]) Create a ‘new’ instance of a Random Forest Classifier model.
predict(self, frame[, observation_columns]) [ALPHA] Predict the labels for the data points.
publish(self) [BETA] Creates a scoring engine tar file.
test(self, frame, label_column[, observation_columns]) [ALPHA] Predict test frame labels and return metrics.
train(self, frame, label_column, observation_columns[, num_classes, ...]) [ALPHA] Build a Random Forest Classifier model.
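
As a rough sketch of how the methods listed above might be combined, the helper below calls `train`, `predict`, and `test` in order, using only the positional arguments shown in the method summaries. The function name `run_random_forest_workflow` is hypothetical, and the sketch assumes that `model` is an already constructed RandomForestClassifierModel and that both frames already exist; the exact return types of `predict` and `test` are not specified in this section.

```python
def run_random_forest_workflow(model, train_frame, test_frame,
                               label_column, observation_columns):
    """Train, predict, and test with an existing model object.

    Hypothetical helper: it relies only on the signatures shown in the
    method summary and makes no assumption about the return types.
    """
    # train(frame, label_column, observation_columns[, num_classes, ...])
    model.train(train_frame, label_column, observation_columns)

    # predict(frame[, observation_columns])
    predictions = model.predict(test_frame, observation_columns)

    # test(frame, label_column[, observation_columns])
    metrics = model.test(test_frame, label_column, observation_columns)

    return predictions, metrics
```

Optional arguments such as `num_classes` are omitted here; they would be passed to `train` when the defaults are not suitable.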

__init__(self, name=None)

Create a ‘new’ instance of a Random Forest Classifier model.

Parameters:

name : unicode (default=None)

User supplied name.

Returns:

: RandomForestClassifierModel

A new instance of the Random Forest Classifier model.
