PrincipalComponentsModel new


__init__(self, name=None)

Create a new instance of a Principal Components model.

Parameters:

name : unicode (default=None)

User supplied name.

Returns:

: PrincipalComponentsModel

A new instance of the Principal Components model.

Principal component analysis [R51] is a statistical procedure that converts possibly correlated features into linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This implementation computes the principal components by Singular Value Decomposition [R52] of the data, and gives the user the option to mean center the data first. A Principal Components model is used in three steps: it is initialized, trained by specifying the observation columns of the frame and the number of components, and then used to predict principal components. The implementation builds on the MLlib Singular Value Decomposition [R53], with two additional features: 1) mean centering of the data during train and predict, and 2) computation of the t-squared index during prediction.
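The train and predict steps described above can be illustrated with a minimal NumPy sketch. This is not the trustedanalytics or MLlib implementation: the function names `train_pca`, `predict_pca` and `t_squared_index`, the dictionary layout of the model, and the choice of Hotelling's T-squared (squared scores scaled by each component's training variance) as the t-squared index are all assumptions made for illustration.

```python
import numpy as np


def train_pca(X, k, mean_centered=True):
    """Hypothetical helper: fit k principal components by SVD."""
    X = np.asarray(X, dtype=float)
    # Optionally subtract the column means before decomposing.
    mean = X.mean(axis=0) if mean_centered else np.zeros(X.shape[1])
    # Thin SVD: (X - mean) = U @ diag(s) @ Vt, singular values sorted descending.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return {
        "mean": mean,
        "singular_values": s[:k],
        "components": vt[:k].T,  # shape (n_columns, k)
        "n_rows": X.shape[0],
    }


def predict_pca(model, X):
    """Hypothetical helper: project rows onto the trained components."""
    X = np.asarray(X, dtype=float)
    # The same mean used in training is applied at prediction time.
    return (X - model["mean"]) @ model["components"]


def t_squared_index(model, scores):
    """Assumed definition: Hotelling's T-squared per row.

    Each squared score is divided by the variance of its component in
    the training data, s**2 / (n_rows - 1). Requires n_rows > 1 and
    non-zero singular values.
    """
    variances = model["singular_values"] ** 2 / (model["n_rows"] - 1)
    return np.sum(scores ** 2 / variances, axis=1)


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    data = rng.rand(10, 3)
    model = train_pca(data, k=2)
    scores = predict_pca(model, data)
    print(scores.shape)
    print(t_squared_index(model, scores))
```

On the training data, the columns of `scores` are uncorrelated with one another, which is the defining property of principal components. Storing the mean in the model keeps train and predict consistent when mean centering is enabled.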

footnotes

[R51] https://en.wikipedia.org/wiki/Principal_component_analysis
[R52] https://en.wikipedia.org/wiki/Singular_value_decomposition
[R53] https://spark.apache.org/docs/1.3.0/mllib-dimensionality-reduction.html