DaalKMeansModel
class DaalKMeansModel
Entity DaalKMeansModel

Attributes
    last_read_date
        Read-only property - Last time this model’s data was accessed.
    name
        Set or get the name of the model object.
    status
        Read-only property - Current model life cycle status.

Methods
    __init__(self[, name, _info])
        [BETA] Create a ‘new’ instance of a DAAL k-means model.
    predict(self, frame[, observation_columns, label_column])
        [BETA] Predict the cluster assignments for the data points.
    publish(self)
        [BETA] Creates a tar file that will be used as input to the scoring engine.
    train(self, frame, observation_columns[, column_scalings, k, max_iterations, ...])
        [ALPHA] Creates DAAL KMeans Model from train frame.
__init__(self, name=None)
[BETA] Create a ‘new’ instance of a DAAL k-means model.

Parameters:
    name : unicode (default=None)
        User-supplied name.

Returns:
    : Model
        A new instance of DaalKMeansModel.

k-means [R1] is an unsupervised algorithm that partitions the data into ‘k’ clusters. Each observation belongs to exactly one cluster: the one with the nearest mean. The k-means model is initialized, trained on columns of a frame, and then used to predict cluster assignments for a frame.

This model runs the DAAL implementation of k-means [R2]. The k-means clustering algorithm computes centroids using Lloyd's method [R3].

Footnotes

[R1] https://en.wikipedia.org/wiki/K-means_clustering
[R2] https://software.intel.com/en-us/daal
[R3] https://en.wikipedia.org/wiki/Lloyd%27s_algorithm

Examples

Consider the following frame, ‘frame’, containing two columns. The model below is trained and tested on this sample data set.

>>> frame.inspect()
[#]  data   name
===================
[0]    2.0  ab
[1]    1.0  cd
[2]    7.0  ef
[3]    1.0  gh
[4]    9.0  ij
[5]    2.0  kl
[6]    0.0  mn
[7]    6.0  op
[8]    5.0  qr
[9]  120.0  outlier

>>> model = ta.DaalKMeansModel()
[===Job Progress===]
>>> train_output = model.train(frame, ["data"], k=2, max_iterations=20)
[===Job Progress===]
>>> train_output
{u'centroids': {u'Cluster:0': [120.0], u'Cluster:1': [3.6666666666666665]}, u'cluster_size': {u'Cluster:0': 1, u'Cluster:1': 9}}
>>> predicted_frame = model.predict(frame, ["data"])
[===Job Progress===]
>>> predicted_frame.inspect()
[#]  data   name     distance_from_cluster_0  distance_from_cluster_1  predicted_cluster
========================================================================================
[0]    2.0  ab                       13924.0            2.77777777778                  1
[1]    1.0  cd                       14161.0            7.11111111111                  1
[2]    7.0  ef                       12769.0            11.1111111111                  1
[3]    1.0  gh                       14161.0            7.11111111111                  1
[4]    9.0  ij                       12321.0            28.4444444444                  1
[5]    2.0  kl                       13924.0            2.77777777778                  1
[6]    0.0  mn                       14400.0            13.4444444444                  1
[7]    6.0  op                       12996.0            5.44444444444                  1
[8]    5.0  qr                       13225.0            1.77777777778                  1
[9]  120.0  outlier                      0.0            13533.4444444                  0
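
The centroids and distance columns in the example can be checked by hand. What follows is a minimal pure-Python sketch of Lloyd's method on the one-dimensional `data` column; it is not the DAAL implementation. The helper names (`squared_distance`, `assign`, `lloyd_1d`) and the starting centroids `[120.0, 0.0]` are assumptions made for illustration, since this page does not say how DAAL seeds its centroids. The printed distance values are consistent with squared Euclidean distances, for instance (2.0 - 120.0)^2 = 13924.0, so the sketch uses that measure.

```python
# Illustrative sketch only: plain Lloyd iteration on 1-D data, not DAAL's code.

def squared_distance(x, c):
    """Squared Euclidean distance between a 1-D point and a centroid."""
    return (x - c) ** 2


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(x, centroids[j]))
        for x in points
    ]


def lloyd_1d(points, initial_centroids, max_iterations=20):
    """Alternate assignment and mean-update steps until the centroids stop moving."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [x for x, label in zip(points, labels) if label == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# The "data" column from the example frame.
data = [2.0, 1.0, 7.0, 1.0, 9.0, 2.0, 0.0, 6.0, 5.0, 120.0]

# Assumed starting centroids (hypothetical; chosen so cluster 0 is the outlier).
centroids, labels = lloyd_1d(data, [120.0, 0.0])

print(centroids)
print(labels)
for x, label in zip(data, labels):
    # One row per observation: value, squared distance to each centroid, cluster.
    print(x, [squared_distance(x, c) for c in centroids], label)
```

With these starting points the sketch settles on one centroid at 120.0 and one at the mean of the other nine values, which agrees with the cluster sizes of 1 and 9 reported by train. A different seeding could swap the cluster labels or converge to a different partition.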