Models ArxModel
class ArxModel
Entity ArxModel
Attributes
last_read_date    Read-only property - Last time this model’s data was accessed.
name    Set or get the name of the model object.
status    Read-only property - Current model life cycle status.
Methods
__init__(self[, name, _info])    [ALPHA] Create a ‘new’ instance of an AutoRegressive Exogenous model.
predict(self, frame, timeseries_column, x_columns)    [ALPHA] New frame with a column of predicted y values.
publish(self)    [ALPHA] Creates a tar file that will be used as input to the scoring engine.
train(self, frame, timeseries_column, x_columns, y_max_lag, x_max_lag[, ...])    [ALPHA] Creates an AutoregressionX (ARX) model from the training frame.
__init__(self, name=None)
[ALPHA] Create a ‘new’ instance of an AutoRegressive Exogenous model.
Parameters: name : unicode (default=None)
User-supplied name.
Returns: : Model
A new instance of ArxModel
Examples
Consider the following model trained and tested on the sample data set in frame ‘frame’. The frame has a snippet of air quality data from:
https://archive.ics.uci.edu/ml/datasets/Air+Quality.
Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
>>> frame.inspect()
[#]  Date        Time               CO_GT  PT08_S1_CO  NMHC_GT        C6H6_GT
=============================================================================
[0]  10/03/2004  18.00.00   2.59999990463        1360      150  11.8999996185
[1]  10/03/2004  19.00.00             2.0        1292      112  9.39999961853
[2]  10/03/2004  20.00.00   2.20000004768        1402       88            9.0
[3]  10/03/2004  21.00.00   2.20000004768        1376       80  9.19999980927
[4]  10/03/2004  22.00.00   1.60000002384        1272       51            6.5
[5]  10/03/2004  23.00.00   1.20000004768        1197       38  4.69999980927
[6]  11/03/2004  00.00.00   1.20000004768        1185       31  3.59999990463
[7]  11/03/2004  01.00.00             1.0        1136       31  3.29999995232
[8]  11/03/2004  02.00.00  0.899999976158        1094       24  2.29999995232
[9]  11/03/2004  03.00.00  0.600000023842        1010       19  1.70000004768

[#]  PT08_S2_NMHC  NOx_GT  PT08_S3_NOx  NO2_GT  PT08_S4_NO2  PT08_S5_O3
=======================================================================
[0]          1046     166         1056     113         1692        1268
[1]           955     103         1174      92         1559         972
[2]           939     131         1140     114         1555        1074
[3]           948     172         1092     122         1584        1203
[4]           836     131         1205     116         1490        1110
[5]           750      89         1337      96         1393         949
[6]           690      62         1462      77         1333         733
[7]           672      62         1453      76         1333         730
[8]           609      45         1579      60         1276         620
[9]           561    -200         1705    -200         1235         501

[#]           Temp             RH              AH
=================================================
[0]  13.6000003815  48.9000015259  0.757799983025
[1]  13.3000001907  47.7000007629  0.725499987602
[2]  11.8999996185           54.0  0.750199973583
[3]           11.0           60.0    0.7867000103
[4]  11.1999998093  59.5999984741  0.788800001144
[5]  11.1999998093  59.2000007629  0.784799993038
[6]  11.3000001907  56.7999992371   0.76029998064
[7]  10.6999998093           60.0  0.770200014114
[8]  10.6999998093  59.7000007629  0.764800012112
[9]  10.3000001907  60.2000007629  0.751699984074
>>> model = ta.ArxModel()
[===Job Progress===]
We will be using the column “Temp” (temperature in Celsius) as our time series value:
>>> y_column = "Temp"
The sensor values will be used as our exogenous variables:
>>> x_columns = ['CO_GT','PT08_S1_CO','NMHC_GT','C6H6_GT','PT08_S2_NMHC','NOx_GT','PT08_S3_NOx','NO2_GT','PT08_S4_NO2','PT08_S5_O3']
>>> train_output = model.train(frame, y_column, x_columns, 0, 0, True)
[===Job Progress===]
>>> train_output
{u'c': 0.0, u'coefficients': [0.005567992923907625, -0.010969068059453009, 0.012556586798371176, -0.39792503380811506, 0.04289162879826746, -0.012253952164677924, 0.01192148525581035, 0.014100699808650077, -0.021091473795935345, 0.007622676727420039]}
>>> predicted_frame = model.predict(frame, y_column, x_columns)
[===Job Progress===]
>>> predicted_frame.column_names
[u'Date', u'Time', u'CO_GT', u'PT08_S1_CO', u'NMHC_GT', u'C6H6_GT', u'PT08_S2_NMHC', u'NOx_GT', u'PT08_S3_NOx', u'NO2_GT', u'PT08_S4_NO2', u'PT08_S5_O3', u'Temp', u'RH', u'AH', u'predicted_y']
>>> predicted_frame.inspect(columns=("Temp","predicted_y"))
[#]           Temp    predicted_y
=================================
[0]  13.6000003815   13.236459938
[1]  13.3000001907  13.0250130899
[2]  11.8999996185  11.4147282294
[3]           11.0  11.3157457822
[4]  11.1999998093  11.3982074883
[5]  11.1999998093  11.7079198051
[6]  11.3000001907  10.7879916472
[7]  10.6999998093   10.527428478
[8]  10.6999998093  10.4439615476
[9]  10.3000001907   10.276662138
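Because the model above was trained with both lag arguments set to 0, the predictions can be checked by hand. The sketch below assumes that a lag-0 ARX model reduces to the intercept 'c' plus a weighted sum of the exogenous values, and that the final True argument to train disables the intercept (consistent with 'c' being 0.0). Both points are inferences from the output shown here, not documented behaviour. The helper name predict_no_lag is hypothetical and is not part of the toolkit.

```python
# Intercept and coefficients copied from train_output above.
c = 0.0
coefficients = [
    0.005567992923907625, -0.010969068059453009, 0.012556586798371176,
    -0.39792503380811506, 0.04289162879826746, -0.012253952164677924,
    0.01192148525581035, 0.014100699808650077, -0.021091473795935345,
    0.007622676727420039,
]

# Exogenous values for rows [0]-[2] of the sample frame, in x_columns order.
x_rows = [
    [2.59999990463, 1360, 150, 11.8999996185, 1046, 166, 1056, 113, 1692, 1268],
    [2.0, 1292, 112, 9.39999961853, 955, 103, 1174, 92, 1559, 972],
    [2.20000004768, 1402, 88, 9.0, 939, 131, 1140, 114, 1555, 1074],
]

# predicted_y for the same rows, copied from predicted_frame.inspect() above.
reported = [13.236459938, 13.0250130899, 11.4147282294]


def predict_no_lag(x_row, weights, intercept=0.0):
    """Assumed lag-0 form: intercept plus the dot product of weights and x."""
    return intercept + sum(w * x for w, x in zip(weights, x_row))


manual = [predict_no_lag(row, coefficients, c) for row in x_rows]
for index, (ours, theirs) in enumerate(zip(manual, reported)):
    print(index, ours, theirs, abs(ours - theirs))
```

If the assumption holds, the differences printed in the last column should be small, limited by the rounding of the displayed frame values.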
>>> model.publish()
[===Job Progress===]
Take the path to the published model and run it in the Scoring Engine:
>>> import requests
>>> headers = {'Content-type': 'application/json', 'Accept': 'application/json,text/plain'}
Send a GET request to retrieve the metadata about the model:
>>> r = requests.get('http://mymodel.demotrustedanalytics.com/v2/metadata')
>>> r.text
u'{"model_details":{"model_type":"ARX Model","model_class":"com.cloudera.sparkts.models.ARXModel","model_reader":"org.trustedanalytics.atk.scoring.models.ARXModelReaderPlugin","custom_values":{}},"input":[{"name":"y","value":"Array[Double]"},{"name":"x_values","value":"Array[Double]"}],"output":[{"name":"y","value":"Array[Double]"},{"name":"x_values","value":"Array[Double]"},{"name":"score","value":"Array[Double]"}]}'
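The metadata response is plain JSON, so the expected request and response fields can be read programmatically. The following is a minimal sketch that parses the response body shown above with the standard json module. The text is pasted in as a literal so the snippet runs without a live scoring engine; with a real response, r.json() from requests should give the same structure.

```python
import json

# Response body copied from the metadata request above.
metadata_text = (
    '{"model_details":{"model_type":"ARX Model",'
    '"model_class":"com.cloudera.sparkts.models.ARXModel",'
    '"model_reader":"org.trustedanalytics.atk.scoring.models.ARXModelReaderPlugin",'
    '"custom_values":{}},'
    '"input":[{"name":"y","value":"Array[Double]"},'
    '{"name":"x_values","value":"Array[Double]"}],'
    '"output":[{"name":"y","value":"Array[Double]"},'
    '{"name":"x_values","value":"Array[Double]"},'
    '{"name":"score","value":"Array[Double]"}]}'
)

metadata = json.loads(metadata_text)

# Field names the scoring engine expects in a record and returns in a result.
input_names = [field["name"] for field in metadata["input"]]
output_names = [field["name"] for field in metadata["output"]]

print(metadata["model_details"]["model_type"])
print(input_names)
print(output_names)
```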
The ARX model only supports version 2 of the scoring engine. In the following example, we are using the ARX model that was trained and published in the example above. To keep things simple, we just send the first three rows of ‘y’ values and the corresponding ‘x_values’.
>>> r = requests.post('http://mymodel.demotrustedanalytics.com/v2/score',json={"records":[{"y":[13.6000003815,13.3000001907,11.8999996185],"x_values":[2.6,2.0,2.2,1360,1292,1402,150,112,88,11.9,9.4,9.0,1046,955,939,166,103,131,1056,1174,1140,113,92,114,1692,1559,1555,1268,972,1074]}]})
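The 'x_values' array in the request above appears to be laid out column by column: the three CO_GT values come first, then the three PT08_S1_CO values, and so on in x_columns order. That layout is inferred from comparing the request with the sample frame, not from a documented specification. Under that assumption, a record can be assembled from row-oriented data as sketched below; the helper name build_arx_record is hypothetical.

```python
# 'Temp' values and exogenous values for the first three rows, one list per
# row in x_columns order, using the rounded values sent in the request above.
y_values = [13.6000003815, 13.3000001907, 11.8999996185]
x_rows = [
    [2.6, 1360, 150, 11.9, 1046, 166, 1056, 113, 1692, 1268],
    [2.0, 1292, 112, 9.4, 955, 103, 1174, 92, 1559, 972],
    [2.2, 1402, 88, 9.0, 939, 131, 1140, 114, 1555, 1074],
]


def build_arx_record(y_values, x_rows):
    """Flatten row-oriented x data into the assumed column-major layout."""
    # zip(*x_rows) yields one tuple per column; flatten the columns in order.
    x_values = [value for column in zip(*x_rows) for value in column]
    return {"y": list(y_values), "x_values": x_values}


payload = {"records": [build_arx_record(y_values, x_rows)]}
print(payload["records"][0]["x_values"])
```

The resulting payload dictionary has the same shape as the json argument passed to requests.post above.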
The ‘score’ value contains an array of predicted y values.
>>> r.text
u'{"data":[{"y":[13.6000003815,13.3000001907,11.8999996185],"x_values":[13.6000003815,13.3000001907,11.8999996185],"x_values":[2.6,2.0,2.2,1360,1292,1402,150,112,88,11.9,9.4,9.0,1046,955,939,166,103,131,1056,1174,1140,113,92,114,1692,1559,1555,1268,972,1074],"score":[13.2364599379956,13.02501308994565,11.414728229443007]}]}'
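As a sanity check, the scores returned by the scoring engine can be compared with the 'predicted_y' values that model.predict produced for the same three rows. The sketch below uses a literal trimmed to the 'score' field of the response above (the other fields are omitted for brevity), so it runs without a live scoring engine.

```python
import json

# Response above, trimmed to the 'score' field only.
response_text = (
    '{"data":[{"score":'
    '[13.2364599379956,13.02501308994565,11.414728229443007]}]}'
)

# predicted_y for rows [0]-[2], copied from predicted_frame.inspect() above.
predicted_y = [13.236459938, 13.0250130899, 11.4147282294]

scores = json.loads(response_text)["data"][0]["score"]

# Largest absolute gap between the scoring engine and model.predict.
max_diff = max(abs(s - p) for s, p in zip(scores, predicted_y))
print(scores)
print(max_diff)
```

The gap is limited only by the number of digits that inspect() displays, which suggests the published model and the in-toolkit model produce the same predictions for these rows.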