Introduction to Topic Modeling with LDA and More


Topic models are a family of statistical models that estimate the distribution of abstract concepts (topics) across a collection of documents. Topic modeling has grown rapidly in popularity over the last several years, and one model, Latent Dirichlet Allocation (LDA), is especially widely used. Tommy Jones will describe a range of topic modeling algorithms and where each fits in the topic modeling taxonomy. He will then focus on LDA, explaining how to tune its parameters and giving tips for building better LDA models. Finally, Tommy will present several open statistical questions in topic modeling, particularly for LDA. Examples include LDA’s inconsistency, how sample selection affects estimates, and how best to present results. Researchers have begun to tackle some of these issues, but others remain open. Even so, LDA and other topic models are becoming invaluable tools for researchers in many disciplines.

Nov 12, 2014 6:30 PM