Using the N highest-probability tokens for each topic, calculate the exclusivity of each topic.

topic_exclusivity(topic_model, top_n_tokens = 10, excl_weight = 0.5)

Arguments

topic_model

a fitted topic model object from one of the following: tm-class

top_n_tokens

an integer indicating the number of top words to consider; the default is 10

excl_weight

a numeric between 0 and 1 indicating the weight to place on exclusivity versus frequency in the calculation; the default is 0.5

Value

A vector of exclusivity values with length equal to the number of topics in the fitted model
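
As a rough illustration of how a per-topic score of this kind can be computed, the sketch below follows the FREX idea from the reference cited on this page: each token's exclusivity and frequency are converted to within-topic ranks, combined with a weighted harmonic mean, and summed over the top tokens of each topic. It is a minimal sketch on a made-up topic-word matrix, not necessarily the code this function runs; `frex_sketch` and the toy `beta` values are hypothetical names and data introduced only for this example.

```r
# Toy topic-word probabilities: 2 topics (rows) x 5 tokens (columns).
# These values are invented for illustration only.
beta <- rbind(
  c(0.40, 0.30, 0.15, 0.10, 0.05),
  c(0.05, 0.30, 0.10, 0.15, 0.40)
)
colnames(beta) <- c("court", "police", "year", "bush", "soviet")

# Hypothetical helper: a FREX-style exclusivity score per topic.
frex_sketch <- function(beta, top_n_tokens = 3, excl_weight = 0.5) {
  # Exclusivity: the share of each token's total probability held by each topic
  excl <- sweep(beta, 2, colSums(beta), "/")

  # Within-topic ranks scaled to (0, 1], an approximation to the empirical CDF
  excl_rank <- t(apply(excl, 1, rank)) / ncol(beta)
  freq_rank <- t(apply(beta, 1, rank)) / ncol(beta)

  # Weighted harmonic mean of the exclusivity and frequency ranks
  frex <- 1 / (excl_weight / excl_rank + (1 - excl_weight) / freq_rank)

  # Sum the scores of the top_n_tokens most probable tokens in each topic
  sapply(seq_len(nrow(beta)), function(k) {
    top <- order(beta[k, ], decreasing = TRUE)[seq_len(top_n_tokens)]
    sum(frex[k, top])
  })
}

scores <- frex_sketch(beta, top_n_tokens = 3, excl_weight = 0.5)
scores
```

Under these assumptions, setting `excl_weight` closer to 1 makes the score depend more on how exclusive the top tokens are to a topic, and setting it closer to 0 makes it depend more on how frequent they are within the topic. Because each token contributes at most 1, a topic's score in this sketch cannot exceed `top_n_tokens`.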

References

Bischof, Jonathan, and Edoardo Airoldi. 2012. "Summarizing topical content with word frequency and exclusivity." In Proceedings of the 29th International Conference on Machine Learning (ICML-12), eds. John Langford and Joelle Pineau. New York, NY: Omnipress, 201–208.

See also

Examples


# Using the example from the LDA function
library(topicmodels)
data("AssociatedPress", package = "topicmodels")
lda <- LDA(AssociatedPress[1:20,], control = list(alpha = 0.1), k = 2)
topic_exclusivity(lda)
#> [1] 9.694711 9.674710