GmmModel train


train(self, frame, observation_columns, column_scalings, k=2, max_iterations=100, convergence_tol=0.01, seed=4729768646873607665)

Trains a Gaussian mixture model (GMM) on the given frame.

Parameters:

frame : Frame

A frame to train the model on.

observation_columns : list

Columns containing the observations.

column_scalings : list

Column scalings for each of the observation columns. The scaling value is multiplied by the corresponding value in the observation column.
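As a rough illustration of what per-column scaling means, the sketch below multiplies each observation column by its scaling using NumPy. It is not the library's implementation. The array contents and scaling values are made up for the example.

```python
import numpy as np

# Hypothetical observations: 3 rows, 2 observation columns (illustrative values).
observations = np.array([[1.0, 10.0],
                         [2.0, 20.0],
                         [3.0, 30.0]])

# One scaling per observation column, in the same order as the columns.
column_scalings = np.array([1.0, 0.5])

# Broadcasting multiplies every value in column j by column_scalings[j].
scaled = observations * column_scalings
print(scaled)
```

A scaling of 1.0 leaves a column unchanged, so a list of ones is the natural choice when no rescaling is wanted.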

k : int32 (default=2)

Desired number of clusters. Default is 2.

max_iterations : int32 (default=100)

Maximum number of iterations for which the algorithm should run. Default is 100.

convergence_tol : float64 (default=0.01)

Largest change in log-likelihood at which convergence is considered to have occurred.
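To show how an iteration cap and a log-likelihood tolerance typically interact, here is a minimal one-dimensional EM sketch in NumPy. It is an assumption-laden illustration, not the algorithm this model runs internally. The function name fit_gmm_1d, the initialization scheme, and the sample data are all hypothetical.

```python
import numpy as np

def fit_gmm_1d(x, k=2, max_iterations=100, convergence_tol=0.01, seed=0):
    """Fit a 1-D Gaussian mixture with EM (illustrative sketch only).

    Returns (weights, means, variances, iterations_run).
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    weights = np.full(k, 1.0 / k)
    means = rng.choice(x, size=k, replace=False)   # assumed initialization
    variances = np.full(k, np.var(x) + 1e-6)
    prev_ll = -np.inf
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        # E-step: weighted density of every point under every component.
        diff = x[:, None] - means
        dens = weights * np.exp(-0.5 * diff ** 2 / variances) / np.sqrt(2 * np.pi * variances)
        total = dens.sum(axis=1)
        log_likelihood = np.log(total).sum()
        # Stop once the log-likelihood changes by no more than the tolerance.
        if abs(log_likelihood - prev_ll) <= convergence_tol:
            break
        prev_ll = log_likelihood
        resp = dens / total[:, None]
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances, iteration

# Synthetic data: two well-separated groups (made up for the example).
data_rng = np.random.default_rng(1)
x = np.concatenate([data_rng.normal(0.0, 1.0, 200), data_rng.normal(10.0, 1.0, 200)])
weights, means, variances, iterations = fit_gmm_1d(x)
print(iterations, weights, means)
```

A smaller tolerance generally means more iterations before the loop stops, and max_iterations bounds the run if the tolerance is never met.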

seed : int64 (default=4729768646873607665)

Random seed

Returns:

: dict

A dictionary with the following fields:

cluster_size
: dict

Maps each key, a string of the form ‘Cluster:Id’, to the number of elements in cluster number ‘Id’.

gaussians
: dict

Stores the ‘mu’ and ‘sigma’ of the multivariate Gaussian (normal) distribution for each mixture component.

During training, the ‘k’ cluster centers are computed.
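The ‘Cluster:Id’ keys of cluster_size can be turned into integer ids with a small helper. The sketch below assumes only the key format described above. The helper name cluster_sizes_by_id and the sample dictionary are hypothetical, and the counts are not real training output.

```python
def cluster_sizes_by_id(train_output):
    """Map integer cluster ids to element counts from a 'cluster_size' dict."""
    sizes = {}
    for key, count in train_output["cluster_size"].items():
        # Keys are assumed to look like 'Cluster:0', 'Cluster:1', ...
        _, cluster_id = key.split(":", 1)
        sizes[int(cluster_id)] = count
    return sizes

# Hypothetical return value, for illustration only.
example_output = {"cluster_size": {"Cluster:0": 4, "Cluster:1": 6}}
sizes = cluster_sizes_by_id(example_output)
print(sizes)
```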

Examples

See here for examples.