Commands frame:/label_propagation
Label Propagation on Gaussian Random Fields.
POST /v1/commands/
GET /v1/commands/:id
Request
Route
POST /v1/commands/
Body
name: frame:/label_propagation

arguments:
    frame : Frame
    src_col_name : unicode
    dest_col_name : unicode
    weight_col_name : unicode
    src_label_col_name : unicode
    result_col_name : unicode (default=None)
    max_iterations : int32 (default=None)
    convergence_threshold : float32 (default=None)
    alpha : float32 (default=None)
Headers
Authorization: test_api_key_1
Content-type: application/json
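For illustration, the command can be submitted with any HTTP client. Below is a minimal sketch using Python's requests library; the base URL and the frame-reference format in the arguments are assumptions for illustration, not part of this specification:

```python
import requests

BASE_URL = "http://localhost:9099"  # assumed host; substitute your server

# Build the command payload from the argument table above.
# The frame-reference format is a hypothetical placeholder.
payload = {
    "name": "frame:/label_propagation",
    "arguments": {
        "frame": "ia://frame/7",          # hypothetical frame URI
        "src_col_name": "source",
        "dest_col_name": "destination",
        "weight_col_name": "weight",
        "src_label_col_name": "input_labels",
        "result_col_name": "resultLabels",
        "max_iterations": 10,
        "convergence_threshold": 0.001,
        "alpha": 0.1,
    },
}

response = requests.post(
    BASE_URL + "/v1/commands/",
    json=payload,
    headers={
        "Authorization": "test_api_key_1",
        "Content-type": "application/json",
    },
)
print(response.status_code, response.json())
```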
Description
Label Propagation on Gaussian Random Fields.
This algorithm is presented in X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, CMU, 2002.
Label Propagation (LP)
LP is a message passing technique for imputing or smoothing labels in partially-labeled datasets. Labels are propagated from labeled data to unlabeled data along a graph encoding similarity relationships among data points. The labels of known data can be probabilistic; in other words, a known point can be represented with fuzzy labels such as 90% label 0 and 10% label 1. The inverse distance between data points is represented by edge weights, with closer points having a higher weight (a stronger influence on posterior estimates) than points farther away. LP has been used for many problems, particularly those involving a similarity measure between data points. Our implementation is based on Zhu and Ghahramani’s 2002 paper, Learning from labeled and unlabeled data.
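As a concrete illustration of the weighting described above, the sketch below derives edge weights from pairwise distances, so that closer points carry more influence; inverse distance is one common choice, not one mandated by this API:

```python
def inverse_distance_weights(distances):
    """Map pairwise distances to edge weights: closer points
    (smaller distance) receive larger weights."""
    return {edge: 1.0 / d for edge, d in distances.items() if d > 0}

# Hypothetical distances between data points a, b, c:
distances = {("a", "b"): 0.5, ("a", "c"): 2.0}
print(inverse_distance_weights(distances))
# {('a', 'b'): 2.0, ('a', 'c'): 0.5} -- b influences a more than c does
```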
The Label Propagation Algorithm
In LP, all nodes start with a prior distribution of states and the initial messages vertices pass to their neighbors are simply their prior beliefs. If certain observations have states that are known deterministically, they can be given a prior probability of 100% for their true state and 0% for all others. Unknown observations should be given uninformative priors.
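For example, with two possible labels, a deterministically known observation receives a one-hot prior, a probabilistically known one receives fuzzy probabilities, and an unknown one receives a uniform (uninformative) prior:

```python
# Priors over two labels: [P(label 0), P(label 1)].
priors = {
    "known_0": [1.0, 0.0],   # state known deterministically: 100% label 0
    "fuzzy":   [0.9, 0.1],   # probabilistic label: 90% label 0, 10% label 1
    "unknown": [0.5, 0.5],   # uninformative prior
}
```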
Each node, $i$, receives messages from its neighbors and updates its beliefs by taking a weighted average of its current beliefs and a weighted average of the messages received from its neighbors. The updated beliefs for node $i$ are:

$$B_i^{\text{new}} = \lambda\, B_i^{\text{old}} + (1 - \lambda) \sum_{j \in N(i)} w_{ij}\, B_j^{\text{old}}$$

where $w_{ij}$ is the weight between nodes $i$ and $j$, normalized such that the sum of all weights to a node's neighbors is 1, and $\lambda$ is a learning parameter. If $\lambda$ is greater than zero, updated probabilities will be anchored in the direction of prior beliefs. The final distribution of state probabilities will also tend to be biased in the direction of the distribution of initial beliefs.
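A minimal sketch of this update in plain Python, assuming beliefs maps each node to a probability vector (initialized to the priors), neighbors maps each node to its adjacency list, weights holds the normalized edge weights, and lam plays the role of λ (presumably the alpha argument above); all names are illustrative, not the service's internals:

```python
def update_beliefs(node, beliefs, neighbors, weights, lam):
    """One LP update for `node`: lam * current belief plus
    (1 - lam) * weighted average of neighbor beliefs."""
    current = beliefs[node]
    n_labels = len(current)
    neighbor_avg = [0.0] * n_labels
    for nbr in neighbors[node]:
        w = weights[(node, nbr)]  # normalized: sums to 1 over neighbors
        for k in range(n_labels):
            neighbor_avg[k] += w * beliefs[nbr][k]
    return [lam * current[k] + (1.0 - lam) * neighbor_avg[k]
            for k in range(n_labels)]
```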
For the first iteration of updates, nodes’ previous beliefs are equal to the priors; in each subsequent iteration, previous beliefs are equal to their beliefs as of the last iteration.
All beliefs for every node will be updated in this fashion, including known observations, unless anchor_threshold is set. The anchor_threshold parameter specifies a probability threshold above which beliefs should no longer be updated. Hence, with an anchor_threshold of 0.99, observations with states known with 100% certainty will not be updated by this algorithm.
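The anchoring rule amounts to a guard evaluated before each node's update; a minimal sketch under the same assumptions as above:

```python
def should_update(belief, anchor_threshold):
    """Skip the update once any state's probability exceeds the threshold."""
    if anchor_threshold is None:
        return True
    return max(belief) <= anchor_threshold

# With anchor_threshold=0.99, a node believed at [1.0, 0.0] is frozen:
print(should_update([1.0, 0.0], 0.99))   # False -- no longer updated
print(should_update([0.7, 0.3], 0.99))   # True  -- still updated
```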
This process of updating and message passing continues until the convergence criterion is met or the maximum number of supersteps is reached. A node is said to converge when the total change in its cost function between iterations falls below the convergence threshold.
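One plausible form for the per-node cost, consistent with the λ-weighted update rule above, is the following; it is an assumption for illustration rather than a statement of the service's internals, with quadratic penalties for disagreement with neighbors (weighted by 1 − λ) and with the node's prior $P_i$ (weighted by λ):

$$\mathrm{cost}_i = \sum_{j \in N(i)} w_{ij} \left[ (1 - \lambda)\, \lVert B_j - B_i \rVert^2 + \lambda\, \lVert B_i - P_i \rVert^2 \right]$$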
Convergence is a local phenomenon; not all nodes converge at the same time, and it is possible for some (or even most) nodes to converge while others do not. The algorithm requires all nodes to converge before declaring global convergence; if that condition is not met, it continues up to the maximum number of supersteps.
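Putting the pieces together, below is a sketch of the superstep loop under the same assumptions (including the assumed cost form above), reusing update_beliefs from the earlier sketch; local convergence is tracked per node, and early stopping requires global convergence:

```python
def node_cost(node, beliefs, priors, neighbors, weights, lam):
    """Per-node cost under the (assumed) quadratic form above."""
    cost = 0.0
    for nbr in neighbors[node]:
        w = weights[(node, nbr)]
        for k in range(len(beliefs[node])):
            cost += w * ((1.0 - lam) * (beliefs[nbr][k] - beliefs[node][k]) ** 2
                         + lam * (beliefs[node][k] - priors[node][k]) ** 2)
    return cost

def run_label_propagation(priors, neighbors, weights, lam,
                          convergence_threshold, max_iterations):
    """Run supersteps until every node converges (global convergence)
    or max_iterations is reached."""
    beliefs = {n: list(p) for n, p in priors.items()}  # iteration 0: priors
    prev_cost = {n: float("inf") for n in priors}
    for _ in range(max_iterations):
        new_beliefs = {n: update_beliefs(n, beliefs, neighbors, weights, lam)
                       for n in priors}
        converged = []
        for n in priors:
            cost = node_cost(n, new_beliefs, priors, neighbors, weights, lam)
            converged.append(abs(prev_cost[n] - cost) < convergence_threshold)
            prev_cost[n] = cost
        beliefs = new_beliefs
        if all(converged):  # global convergence requires every node
            break
    return beliefs
```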
Response
Status
200 OK
Body
Returns information about the command. The response body is identical to that returned by GET /v1/commands/:id, described below.
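Because commands run asynchronously, a client typically polls GET /v1/commands/:id until the command reports completion. Below is a minimal sketch, assuming a boolean complete field in the response body; that field name is an assumption for illustration, not part of this specification:

```python
import time
import requests

def wait_for_command(base_url, command_id, api_key, poll_seconds=1.0):
    """Poll GET /v1/commands/:id until the command finishes.
    The 'complete' field name is a hypothetical placeholder."""
    url = f"{base_url}/v1/commands/{command_id}"
    headers = {"Authorization": api_key}
    while True:
        info = requests.get(url, headers=headers).json()
        if info.get("complete"):   # hypothetical completion flag
            return info
        time.sleep(poll_seconds)
```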