LdaModel predict


predict(self, document)

[BETA] Predict conditional probabilities of topics given a document.

Parameters:

document : list

Document whose topics are to be predicted.

Returns:

dict

Dictionary containing the predicted topics. It has three keys:

topics_given_doc : list of doubles
List of conditional probabilities of topics given the document.
new_words_count : int
Count of words in the test document that are not present in the training set.
new_words_percentage : double
Percentage of words in the test document that are new.

Predicts the conditional probabilities of topics given a document, using a trained Latent Dirichlet Allocation model. The input document is represented as a list of strings.
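The new-word fields in the returned dictionary can be read as a simple count and ratio. The sketch below is an illustration only: `new_word_stats` and the `vocabulary` set are hypothetical names, not part of the LdaModel API, and it assumes that every occurrence of an unseen word is counted and that the percentage is taken over the total number of words in the document. The library may compute these values differently.

```python
def new_word_stats(document, training_vocabulary):
    """Count words in `document` that are absent from `training_vocabulary`.

    Hypothetical helper for illustration; not part of the LdaModel API.
    """
    # Assumption: each occurrence of an unseen word counts once.
    new_words_count = sum(1 for word in document if word not in training_vocabulary)
    # Guard against an empty document to avoid division by zero.
    new_words_percentage = 100.0 * new_words_count / len(document) if document else 0.0
    return new_words_count, new_words_percentage


# Assumed training vocabulary, chosen only for this illustration.
vocabulary = {'harry', 'secrets', 'magic', 'chamber'}
count, percentage = new_word_stats(['harry', 'magic', 'harry', 'chamber', 'test'], vocabulary)
print(count, percentage)  # 1 20.0
```

Here one of the five words ('test') is missing from the assumed vocabulary, which gives a count of 1 and a percentage of 20.0.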

Examples

Given a frame with document, word, and word-count columns, train a model and predict the topics of a new document:

>>> my_model = ta.LdaModel()
>>> results = my_model.train(frame, 'doc_id', 'word_id', 'word_count', max_iterations=3, num_topics=2)
>>> prediction = my_model.predict(['harry', 'secrets', 'magic', 'harry', 'chamber' 'test'])

The variable prediction is a dictionary with three keys:

>>> topics_given_doc = prediction['topics_given_doc']
>>> new_words_percentage = prediction['new_words_percentage']
>>> new_words_count = prediction['new_words_count']
>>> print(prediction)

{u'topics_given_doc': [0.04150190747884333, 0.7584980925211566], u'new_words_percentage': 20.0, u'new_words_count': 1}
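A common next step is to pick the most probable topic from the returned dictionary. The following is a minimal sketch that works on a plain dictionary copied from the output shown here, so it runs without a trained model. The name `best_topic` is introduced only for illustration, and treating the list position as the topic index is an assumption about how topics are ordered.

```python
# Prediction dictionary copied from the example output.
prediction = {
    'topics_given_doc': [0.04150190747884333, 0.7584980925211566],
    'new_words_percentage': 20.0,
    'new_words_count': 1,
}

probabilities = prediction['topics_given_doc']

# Position of the largest probability in the list (assumed to be the topic index).
best_topic = max(range(len(probabilities)), key=probabilities.__getitem__)

print(best_topic, probabilities[best_topic])
```

With the values shown, the second entry (position 1) has the highest probability.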