

class KMeansModel

Create a ‘new’ instance of a k-means model.

k-means [R3] is an unsupervised algorithm that partitions data into ‘k’ clusters. Each observation belongs to exactly one cluster: the one whose mean (its center) is nearest. A k-means model is initialized, trained on columns of a frame, and then used to predict cluster assignments for a frame. This model runs the MLlib implementation of k-means [R4] with added features: during training it computes the number of elements in each cluster, and during predict it computes the distance of each observation from its own cluster center and from every other cluster center.

footnotes

[R3] https://en.wikipedia.org/wiki/K-means_clustering
[R4] https://spark.apache.org/docs/1.3.0/mllib-clustering.html#k-means
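
The quantities named in the description above (the nearest-center assignment, the distance from an observation to every center, and the per-cluster element counts) can be illustrated with a minimal pure-Python sketch. It is not the MLlib or trustedanalytics implementation. The helper names `nearest_center` and `cluster_sizes`, the use of Euclidean distance, and the sample centers and points are all assumptions made for illustration, and the centers are fixed by hand rather than learned.

```python
import math


def nearest_center(point, centers):
    """Return (index of the nearest center, distances to every center).

    Hypothetical helper: Euclidean distance is assumed here.
    """
    distances = [
        math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))
        for center in centers
    ]
    return distances.index(min(distances)), distances


def cluster_sizes(points, centers):
    """Count how many points are assigned to each center."""
    sizes = [0] * len(centers)
    for point in points:
        index, _ = nearest_center(point, centers)
        sizes[index] += 1
    return sizes


# Illustrative data only: two hand-picked centers and three observations.
centers = [(0.0, 0.0), (10.0, 10.0)]
points = [(3.0, 4.0), (9.0, 9.0), (0.0, 2.0)]

assigned, distances = nearest_center(points[0], centers)
sizes = cluster_sizes(points, centers)
```

Here `assigned` is the cluster index for the first observation, `distances` holds its distance to each center (the kind of per-center distance the description says predict computes), and `sizes` plays the role of the per-cluster element counts computed during training.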

Attributes

name Set or get the name of the model object.

Methods

__init__(self[, name, _info]) Create a ‘new’ instance of a k-means model.
predict(self, frame[, observation_columns]) [BETA] Predict the cluster assignments for the data points.
publish(self) [BETA] Creates a scoring engine tar file.
train(self, frame, observation_columns, column_scalings[, k, max_iterations, ...]) [BETA] Creates a k-means model by training on a frame.

__init__(self, name=None)

Create a ‘new’ instance of a k-means model.

Parameters:

name : unicode (default=None)

Name for the model.

Returns:

: KMeansModel

A new instance of a k-means model.
