RandomForestClassifierModel train

train(self, frame, label_column, observation_columns, num_classes=2, num_trees=1, impurity='gini', max_depth=4, max_bins=100, seed=-686633978, categorical_features_info=None, feature_subset_category=None)

[ALPHA] Build a Random Forest Classifier model.

Parameters:


frame : Frame

A frame to train the model on.

label_column : unicode

Column name containing the label for each observation.

observation_columns : list

Column(s) containing the observations.

num_classes : int32 (default=2)

Number of classes for classification.

num_trees : int32 (default=1)

Number of trees in the random forest.

impurity : unicode (default=gini)

Criterion used for information gain calculation. Supported values: "gini" or "entropy".
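As an illustration of what the two impurity criteria measure, the following is a minimal sketch in plain Python, not the library's implementation. The function names gini, entropy, and information_gain are hypothetical, and the base-2 logarithm used for entropy is an assumption, since this page does not state which base is used.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits (base 2 is an assumption): -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right, impurity=gini):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# A split that separates the two classes perfectly.
parent = [0, 0, 1, 1]
gain_gini = information_gain(parent, [0, 0], [1, 1], impurity=gini)
gain_entropy = information_gain(parent, [0, 0], [1, 1], impurity=entropy)
print(gain_gini, gain_entropy)
```

Both criteria reward splits that produce purer child nodes; they differ only in how impurity is scored.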

max_depth : int32 (default=4)

Maximum depth of the tree.

max_bins : int32 (default=100)

Maximum number of bins used for splitting features.

seed : int32 (default=-686633978)

Random seed for bootstrapping and choosing feature subsets.
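To show why a fixed seed matters, here is a small sketch of a seeded bootstrap draw. It uses Python's standard random module rather than the library's own generator, and bootstrap_indices is a hypothetical helper, so the indices it produces will not match what train selects internally.

```python
import random


def bootstrap_indices(num_rows, seed):
    """Draw num_rows row indices with replacement from a seeded generator."""
    rng = random.Random(seed)
    return [rng.randrange(num_rows) for _ in range(num_rows)]


# Repeating the draw with the documented default seed gives the same sample.
first = bootstrap_indices(10, seed=-686633978)
second = bootstrap_indices(10, seed=-686633978)
print(first == second)
```

Passing the same seed on every run is what makes a trained forest reproducible; changing it yields a different set of bootstrap samples.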

categorical_features_info : dict (default=None)

Map storing the arity of categorical features.

feature_subset_category : unicode (default=None)

Number of features to consider for splits at each node. Supported values: "auto", "all", "sqrt", "log2", "onethird".
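This page does not spell out how each feature_subset_category value translates into a feature count. The sketch below assumes the rules used by Spark MLlib's random forest for a classifier ("auto" means all features for a single tree and the square root otherwise); that mapping is an assumption, and features_per_split is a hypothetical helper, not part of this API.

```python
import math


def features_per_split(category, num_features, num_trees=1):
    """Map a feature_subset_category value to a feature count (assumed rules)."""
    if category == "auto":
        # Assumption: one tree uses every feature, a classifier forest uses sqrt.
        category = "all" if num_trees == 1 else "sqrt"
    if category == "all":
        return num_features
    if category == "sqrt":
        return int(math.ceil(math.sqrt(num_features)))
    if category == "log2":
        return max(1, int(math.ceil(math.log2(num_features))))
    if category == "onethird":
        return int(math.ceil(num_features / 3.0))
    raise ValueError("unsupported feature_subset_category: %r" % category)


for name in ("auto", "all", "sqrt", "log2", "onethird"):
    print(name, features_per_split(name, num_features=10, num_trees=5))
```

Smaller subsets decorrelate the trees at the cost of weaker individual splits, which is why the choice only matters when num_trees is greater than 1.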

Returns:

dict

Values of the Random Forest Classifier model object storing:

the list of observation columns on which the model was trained,

the column name containing the labels of the observations,

the number of classes,

the number of decision trees in the random forest,

the number of nodes in the random forest,

the map storing arity of categorical features,

the criterion used for information gain calculation,

the maximum depth of the tree,

the maximum number of bins used for splitting features,

the random seed used for bootstrapping and choosing feature subsets.

Creates a Random Forest Classifier model using the observation columns and the label column.
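Before calling train, it can help to check the arguments against the supported values listed above. The following is a minimal sketch: make_train_args is a hypothetical helper and the column names "class", "feat1", and "feat2" are made up for illustration. The defaults mirror the documented signature, and the final call appears only as a comment because it needs a live frame and model object.

```python
SUPPORTED_IMPURITY = ("gini", "entropy")
SUPPORTED_SUBSETS = ("auto", "all", "sqrt", "log2", "onethird")


def make_train_args(label_column, observation_columns, num_classes=2,
                    num_trees=1, impurity="gini", max_depth=4, max_bins=100,
                    feature_subset_category=None):
    """Validate arguments against the documented values and return them as a dict."""
    if not observation_columns:
        raise ValueError("observation_columns must name at least one column")
    if impurity not in SUPPORTED_IMPURITY:
        raise ValueError("impurity must be one of %s" % (SUPPORTED_IMPURITY,))
    if (feature_subset_category is not None
            and feature_subset_category not in SUPPORTED_SUBSETS):
        raise ValueError("feature_subset_category must be one of %s"
                         % (SUPPORTED_SUBSETS,))
    return {
        "label_column": label_column,
        "observation_columns": list(observation_columns),
        "num_classes": num_classes,
        "num_trees": num_trees,
        "impurity": impurity,
        "max_depth": max_depth,
        "max_bins": max_bins,
        "feature_subset_category": feature_subset_category,
    }


# Hypothetical column names; replace them with the columns of your own frame.
args = make_train_args("class", ["feat1", "feat2"], num_trees=10,
                       impurity="entropy", feature_subset_category="sqrt")
# Given a frame and a RandomForestClassifierModel instance named model,
# the call would then be: model.train(frame, **args)
print(args)
```

Validating on the client side surfaces a misspelled option before any work is submitted for training.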
