Topic Modeling Bibliography

Bibliometrics

Cross-language

Evaluation

Implementations

Inference

NLP

Networks

Non-parametric

Scalability

Social media

Temporal

Theory

User interface

Vision

Where to start


Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, Eric P. Xing. Mixed Membership Stochastic Blockmodels. JMLR (9) 2008 pp. 1981-2014.

Networks

[BibTeX]

@article{airoldi2008mixed,
 author={Edoardo M. Airoldi and David M. Blei and Stephen E. Fienberg and Eric P. Xing},
 title={Mixed Membership Stochastic Blockmodels},
 journal={JMLR},
 year={2008},
 volume={9},
 pages={1981-2014},
}

Loulwah AlSumait, Daniel Barbará, James Gentle, Carlotta Domeniconi. Topic Significance Ranking of LDA Generative Models. ECML (2009).

Evaluation

[BibTeX]

@inproceedings{alsumait2009topic,
 author={Loulwah AlSumait and Daniel Barbará and James Gentle and Carlotta Domeniconi},
 title={Topic Significance Ranking of LDA Generative Models},
 booktitle={ECML},
 year={2009},
 url={http://www.springerlink.com/content/v3jth868647716kg/},
}

David Andrzejewski, Anne Mulhern, Ben Liblit, Xiaojin Zhu. Statistical Debugging using Latent Topic Models. ECML (2007).

[BibTeX]

@inproceedings{andrzejewski2007statistical,
 author={David Andrzejewski and Anne Mulhern and Ben Liblit and Xiaojin Zhu},
 title={Statistical Debugging using Latent Topic Models},
 booktitle={ECML},
 year={2007},
}

David Andrzejewski, Xiaojin Zhu, Mark Craven. Incorporating domain knowledge into topic modeling via Dirichlet Forest priors. ICML (2009).

[BibTeX]

@inproceedings{andrzejewski2009incorporating,
 author={David Andrzejewski and Xiaojin Zhu and Mark Craven},
 title={Incorporating domain knowledge into topic modeling via Dirichlet Forest priors},
 booktitle={ICML},
 year={2009},
 pages={25-32},
}

David Andrzejewski, Xiaojin Zhu, Mark Craven, Ben Recht. A Framework for Incorporating General Domain Knowledge into Latent Dirichlet Allocation using First-Order Logic. IJCAI (2011).

[BibTeX]

@inproceedings{andrzejewski2011framework,
 author={David Andrzejewski and Xiaojin Zhu and Mark Craven and Ben Recht},
 title={A Framework for Incorporating General Domain Knowledge into Latent Dirichlet Allocation using First-Order Logic},
 booktitle={IJCAI},
 year={2011},
}

Arthur Asuncion, Padhraic Smyth, Max Welling. Asynchronous Distributed Learning of Topic Models. NIPS (2008).

Scalability

[BibTeX]

@inproceedings{asuncion2008distributed,
 author={Arthur Asuncion and Padhraic Smyth and Max Welling},
 title={Asynchronous Distributed Learning of Topic Models},
 booktitle={NIPS},
 year={2008},
 pages={81-88},
 url={http://www.ics.uci.edu/~asuncion/pubs/NIPS_08.pdf},
}

Arthur Asuncion, Max Welling, Padhraic Smyth, Yee-Whye Teh. On Smoothing and Inference for Topic Models. UAI (2009).

Inference

[BibTeX]

@inproceedings{asuncion2009smoothing,
 author={Arthur Asuncion and Max Welling and Padhraic Smyth and Yee-Whye Teh},
 title={On Smoothing and Inference for Topic Models},
 booktitle={UAI},
 year={2009},
 url={http://www.ics.uci.edu/~asuncion/pubs/UAI_09.pdf},
}

A dense but excellent review of inference in topic models. Introduces CVB0, a method for collapsed variational inference surprisingly similar to Gibbs sampling.
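
The similarity to Gibbs sampling can be seen in a small sketch. The per-token CVB0 update below uses the same ratio of counts as the collapsed Gibbs conditional, but it stores a soft distribution over topics for each token instead of sampling a single topic. This is a minimal illustration, not code from the paper: the function name `cvb0_lda`, the default hyperparameters, and the random initialization are assumptions made for the example.

```python
import random

def cvb0_lda(docs, num_topics, vocab_size, alpha=0.1, eta=0.01, iters=50, seed=0):
    """Illustrative CVB0 for LDA. docs is a list of lists of word ids.

    Returns expected document-topic and topic-word counts.
    """
    rng = random.Random(seed)
    # gamma[d][i] is the variational distribution over topics for token i of doc d.
    gamma = []
    for doc in docs:
        doc_gamma = []
        for _ in doc:
            weights = [rng.random() + 1e-3 for _ in range(num_topics)]
            total = sum(weights)
            doc_gamma.append([w / total for w in weights])
        gamma.append(doc_gamma)

    # Expected counts implied by the current gamma values.
    n_dk = [[0.0] * num_topics for _ in docs]
    n_kw = [[0.0] * vocab_size for _ in range(num_topics)]
    n_k = [0.0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            for k in range(num_topics):
                g = gamma[d][i][k]
                n_dk[d][k] += g
                n_kw[k][w] += g
                n_k[k] += g

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = gamma[d][i]
                new = []
                for k in range(num_topics):
                    # Subtract this token's own contribution, as in collapsed Gibbs.
                    wk = n_kw[k][w] - old[k]
                    nk = n_k[k] - old[k]
                    dk = n_dk[d][k] - old[k]
                    new.append((wk + eta) / (nk + vocab_size * eta) * (dk + alpha))
                total = sum(new)
                new = [v / total for v in new]
                # Deterministic update: move soft counts instead of resampling.
                for k in range(num_topics):
                    delta = new[k] - old[k]
                    n_dk[d][k] += delta
                    n_kw[k][w] += delta
                    n_k[k] += delta
                gamma[d][i] = new
    return n_dk, n_kw
```

Calling `cvb0_lda([[0, 0, 1, 1], [2, 3, 2, 3]], num_topics=2, vocab_size=4)` returns soft counts whose rows sum to the document lengths. Replacing the normalization step with a draw from `new` would turn the loop into a collapsed Gibbs sampler.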

David Blei, Michael Jordan. Modeling Annotated Data. SIGIR (2003).

[BibTeX]

@inproceedings{blei2003modeling,
 author={David Blei and Michael Jordan},
 title={Modeling Annotated Data},
 booktitle={SIGIR},
 year={2003},
}

This paper introduces CorrLDA for data that consists of text and images, where image "topics" are chosen only from topics that are assigned to the text in the same document.

David M. Blei. lda-c. (2003).

Implementations

[BibTeX]

@misc{blei-lda-c,
 author={David M. Blei},
 title={lda-c},
 year={2003},
 url={http://www.cs.princeton.edu/~blei/lda-c/},
}

lda-c implements LDA with variational inference in C.

David M. Blei, Andrew Ng, Michael Jordan. Latent Dirichlet allocation. JMLR (3) 2003 pp. 993-1022.

[BibTeX]

@article{blei2003latent,
 author={David M. Blei and Andrew Ng and Michael Jordan},
 title={Latent Dirichlet allocation},
 journal={JMLR},
 year={2003},
 volume={3},
 pages={993-1022},
}

David M. Blei, Thomas Griffiths, Michael Jordan, Joshua Tenenbaum. Hierarchical topic models and the nested Chinese restaurant process. NIPS (2003).

Non-parametric

[BibTeX]

@inproceedings{blei2003hierarchical,
 author={David M. Blei and Thomas Griffiths and Michael Jordan and Joshua Tenenbaum},
 title={Hierarchical topic models and the nested Chinese restaurant process},
 booktitle={NIPS},
 year={2003},
 url={http://books.nips.cc/papers/files/nips16/NIPS2003_AA03.pdf},
}

Introduces hLDA, which models topics in a tree. Each document is generated by topics along a single path through the tree.
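
The single-path idea can be sketched with the nested Chinese restaurant process named in the title: at each level a document follows an existing branch with probability proportional to the number of earlier documents that took it, or opens a new branch with probability proportional to a concentration parameter. The code below is only an illustrative sketch of that prior over paths. The names `Node`, `sample_path`, and `gamma` are assumptions for this example, and the topics attached to nodes and the posterior inference are omitted.

```python
import random

class Node:
    """A tree node; in hLDA each node would also carry a topic."""
    def __init__(self):
        self.count = 0       # documents whose path passes through this node
        self.children = []

def sample_path(root, depth, gamma, rng):
    """Draw a root-to-leaf path with `depth` nodes and update the counts."""
    path = [root]
    root.count += 1
    node = root
    for _ in range(depth - 1):
        # Existing child c: weight c.count; new child: weight gamma.
        total = sum(c.count for c in node.children) + gamma
        r = rng.random() * total
        chosen = None
        for child in node.children:
            r -= child.count
            if r < 0:
                chosen = child
                break
        if chosen is None:
            chosen = Node()
            node.children.append(chosen)
        chosen.count += 1
        path.append(chosen)
        node = chosen
    return path

rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
```

Every path starts at the shared root, so all documents share the root's topic, while documents that diverge lower in the tree share only the more general topics.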

David M. Blei, Thomas L. Griffiths, Michael I. Jordan. The nested Chinese restaurant process and hierarchical topic models. (2007).

Non-parametric

[BibTeX][Abstract]

@misc{blei2007nested,
 author={David M. Blei and Thomas L. Griffiths and Michael I. Jordan},
 title={The nested Chinese restaurant process and hierarchical topic models},
 year={2007},
 url={http://arxiv.org/abs/0710.0845},
}

We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a nonparametric Bayesian model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on several collections of scientific abstracts. This model exemplifies a recent trend in statistical machine learning: the use of nonparametric Bayesian methods to infer distributions on flexible data structures.

This is a longer version of Blei et al. (2003), listed above, which extends that paper's hLDA model to trees of unlimited depth.

David M. Blei, John D. Lafferty. Dynamic Topic Models. ICML (2006).

Temporal

[BibTeX]

@inproceedings{blei2006dynamic,
 author={David M. Blei and John D. Lafferty},
 title={Dynamic Topic Models},
 booktitle={ICML},
 year={2006},
 url={http://portal.acm.org/citation.cfm?id=1143859},
}

David M. Blei, John D. Lafferty. A Correlated Topic model of Science. AAS (1) 2007 pp. 17-35.

[BibTeX]

@article{blei2007correlated,
 author={David M. Blei and John D. Lafferty},
 title={A Correlated Topic model of Science},
 journal={AAS},
 year={2007},
 volume={1},
 number={1},
 pages={17-35},
}

David M. Blei, Jon D. McAuliffe. Supervised Topic Models. NIPS (2007).

[BibTeX]

@inproceedings{blei2007supervised,
 author={David M. Blei and Jon D. McAuliffe},
 title={Supervised Topic Models},
 booktitle={NIPS},
 year={2007},
 url={http://books.nips.cc/papers/files/nips20/NIPS2007_0893.pdf},
}

David M. Blei. Introduction to Probabilistic Topic Models. Communications of the ACM (2011).

Where to start

[BibTeX]

@article{blei2011introduction,
 author={David M. Blei},
 title={Introduction to Probabilistic Topic Models},
 journal={Communications of the ACM},
 year={2011},
 url={http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf},
}

A high-level overview of probabilistic topic models.

Brad Block. Collapsed variational HDP. (2011).

Implementations

[BibTeX]

@misc{block2011cvhdp,
 author={Brad Block},
 title={Collapsed variational HDP},
 year={2011},
 url={http://www.bradblock.com/tm-0.1.tar.gz},
}

This library contains Java source and class files implementing the Latent Dirichlet Allocation (single-threaded collapsed Gibbs sampling) and Hierarchical Dirichlet Process (multi-threaded collapsed variational inference) topic models. The models can be accessed through the command-line or through a simple Java API. Also included is a subset of the 20 Newsgroup dataset and results of experiments done on the dataset to confirm the correct operation and investigate some properties of the topic models. No third-party scientific libraries are required and all needed special functions are implemented and included.

Jordan Boyd-Graber, David M. Blei, Xiaojin Zhu. A Topic Model for Word Sense Disambiguation. EMNLP (2007).

NLP

[BibTeX]

@inproceedings{boydgraber2007topic,
 author={Jordan Boyd-Graber and David M. Blei and Xiaojin Zhu},
 title={A Topic Model for Word Sense Disambiguation},
 booktitle={EMNLP},
 year={2007},
}

Jordan Boyd-Graber, David M. Blei. PUTOP: Turning Predominant Senses into a Topic Model for WSD. SEMEVAL (2007).

NLP

[BibTeX]

@inproceedings{boydgraber2007turning,
 author={Jordan Boyd-Graber and David M. Blei},
 title={PUTOP: Turning Predominant Senses into a Topic Model for WSD},
 booktitle={SEMEVAL},
 year={2007},
}

Jordan Boyd-Graber, David M. Blei. Syntactic Topic Models. NIPS (2008).

NLP

[BibTeX]

@inproceedings{boydgraber2008syntactic,
 author={Jordan Boyd-Graber and David M. Blei},
 title={Syntactic Topic Models},
 booktitle={NIPS},
 year={2008},
 url={http://books.nips.cc/papers/files/nips21/NIPS2008_0319.pdf},
}

Jordan Boyd-Graber, David M. Blei. Multilingual Topic Models for Unaligned Text. UAI (2009).

Cross-language

[BibTeX]

@inproceedings{boydgraber2009multilingual,
 author={Jordan Boyd-Graber and David M. Blei},
 title={Multilingual Topic Models for Unaligned Text},
 booktitle={UAI},
 year={2009},
}

David A. Broniatowski, Christopher L. Magee. Analysis of Social Dynamics on FDA Panels Using Social Networks Extracted From Meeting Transcripts. SocCom (2010).

Networks

[BibTeX]

@inproceedings{broniatowskimagee2010,
 author={David A. Broniatowski and Christopher L. Magee},
 title={Analysis of Social Dynamics on FDA Panels Using Social Networks Extracted From Meeting Transcripts},
 booktitle={SocCom},
 year={2010},
 url={http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5591237&tag=1},
}

A method for analyzing group decision making based on the Author-Topic Model.

David A. Broniatowski, Christopher L. Magee. Towards A Computational Analysis of Status and Leadership Styles on FDA Panels. SBP (2011).

Networks, Temporal

[BibTeX]

@inproceedings{broniatowskimagee2011,
 author={David A. Broniatowski and Christopher L. Magee},
 title={Towards A Computational Analysis of Status and Leadership Styles on FDA Panels},
 booktitle={SBP},
 year={2011},
 url={http://www.springerlink.com/content/w655v786lp583660/},
}

Incorporates temporal information to generate directed graphs based on topic models.

Wray L. Buntine. Discrete Component Analysis. (2009).

Implementations

[BibTeX]

@misc{buntine-dca,
 author={Wray L. Buntine},
 title={Discrete Component Analysis},
 year={2009},
 url={http://www.nicta.com.au/people/buntinew/discrete_component_analysis},
}

C implementation of LDA and multinomial PCA.

Wray L. Buntine, Aleks Jakulin. Discrete Component Analysis. SLSFS (2005).

[BibTeX]

@inproceedings{buntine2005discrete,
 author={Wray L. Buntine and Aleks Jakulin},
 title={Discrete Component Analysis},
 booktitle={SLSFS},
 year={2005},
 pages={1-33},
}

Wray L. Buntine. Estimating Likelihoods for Topic Models. Asian Conference on Machine Learning (2009).

Evaluation

[BibTeX]

@inproceedings{buntine2009estimating,
 author={Wray L. Buntine},
 title={Estimating Likelihoods for Topic Models},
 booktitle={Asian Conference on Machine Learning},
 year={2009},
 url={http://www.nicta.com.au/__data/assets/pdf_file/0019/20746/sdca-0202.pdf},
}

Provides improved versions of some of the methods in Wallach et al. (2009) for calculating held-out probability.

Wray L. Buntine, Swapnil Mishra. Experiments with Non-parametric Topic Models. (2014).

Non-parametric

[BibTeX]

@misc{buntine2014experiments,
 author={Wray L. Buntine and Swapnil Mishra},
 title={Experiments with Non-parametric Topic Models},
 year={2014},
 url={http://dl.acm.org/citation.cfm?id=2623691},
}

Non-parametric implementations of bursty models. The authors find that using fixed numbers of topics but optimizing hyperparameters provides a good approximation of a non-parametric HDP model.

Jun Fu Cai, Wee Sun Lee, Yee Whye Teh. NUS-ML: Improving Word Sense Disambiguation Using Topic Features. SEMEVAL (2007).

NLP

[BibTeX]

@inproceedings{cai2007nus,
 author={Jun Fu Cai and Wee Sun Lee and Yee Whye Teh},
 title={NUS-ML: Improving Word Sense Disambiguation Using Topic Features},
 booktitle={SEMEVAL},
 year={2007},
}

Jonathan Chang. R package 'lda'. (2011).

Implementations

[BibTeX]

@misc{r-lda,
 author={Jonathan Chang},
 title={R package 'lda'},
 year={2011},
 url={http://cran.r-project.org/web/packages/lda/},
}

This package implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions are also included.

Jonathan Chang, David Blei. Relational Topic Models for Document Networks. AIStats (2009).

Networks

[BibTeX]

@inproceedings{chang2009relational,
 author={Jonathan Chang and David Blei},
 title={Relational Topic Models for Document Networks},
 booktitle={AIStats},
 year={2009},
}

Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyvers. Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model. NIPS (2006).

[BibTeX]

@inproceedings{chemudugunta2006modeling,
 author={Chaitanya Chemudugunta and Padhraic Smyth and Mark Steyvers},
 title={Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model},
 booktitle={NIPS},
 year={2006},
 url={http://www.datalab.uci.edu/papers/special_words_NIPS06.pdf},
}

This paper has two interesting extensions to LDA that account for the power-law distribution of word frequencies in real documents. First, a general "background" distribution represents common words. Second, a "special words" model allows each document to have some unique words.

Changyou Chen, Lan Du, Wray Buntine. Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process. ECML-PKDD (2011).

Non-parametric

[BibTeX]

@inproceedings{chen2011sampling,
 author={Changyou Chen and Lan Du and Wray Buntine},
 title={Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process},
 booktitle={ECML-PKDD},
 year={2011},
 url={http://www.nicta.com.au/pub?doc=4806},
}

A simple hierarchical Pitman-Yor LDA sampler that does not record "table" assignments. Perplexity is sometimes far superior to other methods.

Jonathan Chang, Jordan Boyd-Graber, Chong Wang, Sean Gerrish, David M. Blei. Reading Tea Leaves: How Humans Interpret Topic Models. NIPS (2009).

Evaluation

[BibTeX]

@inproceedings{chang2009reading,
 author={Jonathan Chang and Jordan Boyd-Graber and Chong Wang and Sean Gerrish and David M. Blei},
 title={Reading Tea Leaves: How Humans Interpret Topic Models},
 booktitle={NIPS},
 year={2009},
 url={http://books.nips.cc/papers/files/nips22/NIPS2009_0125.pdf},
}

Pradipto Das, Rohini Srihari, Yun Fu. Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives. (2011).

[BibTeX]

@inproceedings{das2011simultaneous,
 author={Pradipto Das and Rohini Srihari and Yun Fu},
 title={Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives},
 booktitle={},
 year={2011},
 url={http://www.acsu.buffalo.edu/~pdas3/research/papers/CIKM/pdasCIKM11.pdf},
}

Rajarshi Das, Manzil Zaheer, Chris Dyer. Gaussian LDA for Topic Models with Word Embeddings. (2015).

[BibTeX]

@inproceedings{das2015gaussian,
 author={Rajarshi Das and Manzil Zaheer and Chris Dyer},
 title={Gaussian LDA for Topic Models with Word Embeddings},
 booktitle={},
 year={2015},
 url={http://rajarshd.github.io/papers/acl2015.pdf},
}

Hal Daumé III. Markov Random Topic Fields. (2009).

[BibTeX]

@inproceedings{daume2009markov,
 author={Hal Daumé III},
 title={Markov Random Topic Fields},
 booktitle={},
 year={2009},
}

Andrew M. Dai, Amos J. Storkey. The Grouped Author-Topic Model for Unsupervised Entity Resolution. ICANN (2011).

[BibTeX]

@inproceedings{dai2011grouped,
 author={Andrew M. Dai and Amos J. Storkey},
 title={The Grouped Author-Topic Model for Unsupervised Entity Resolution},
 booktitle={ICANN},
 year={2011},
}

Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman. Indexing by Latent Semantic Analysis. JASIS (41) 1990 pp. 391-407.

[BibTeX]

@article{deerwester1990indexing,
 author={Scott Deerwester and Susan T. Dumais and George W. Furnas and Thomas K. Landauer and Richard Harshman},
 title={Indexing by Latent Semantic Analysis},
 journal={JASIS},
 year={1990},
 volume={41},
 number={6},
 pages={391-407},
}

Laura Dietz, Steffen Bickel, Tobias Scheffer. Unsupervised prediction of citation influences. ICML (2007).

Networks, Bibliometrics

[BibTeX]

@inproceedings{dietz2007unsupervised,
 author={Laura Dietz and Steffen Bickel and Tobias Scheffer},
 title={Unsupervised prediction of citation influences},
 booktitle={ICML},
 year={2007},
}

Chris Ding, Tao Li, Wei Peng. On the Equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing. Computational Statistics and Data Analysis (52) 2008 pp. 3913-3927.

Theory

[BibTeX]

@article{ding2008equivalence,
 author={Chris Ding and Tao Li and Wei Peng},
 title={On the Equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing},
 journal={Computational Statistics and Data Analysis},
 year={2008},
 volume={52},
 pages={3913-3927},
}

Gabriel Doyle, Charles Elkan. Accounting for Burstiness in Topic Models. ICML (2009).

[BibTeX]

@inproceedings{doyle2009accounting,
 author={Gabriel Doyle and Charles Elkan},
 title={Accounting for Burstiness in Topic Models},
 booktitle={ICML},
 year={2009},
 url={http://www.cs.utah.edu/~hal/tmp/icml/papers/162.pdf},
}

Replaces each topic's standard multinomial distribution over words with a Dirichlet compound multinomial (DCM).
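
To illustrate why a DCM captures burstiness, the sketch below computes the probability of a word sequence under a DCM (the Polya urn form, without the multinomial coefficient) and the probability that a word repeats once it has been seen. The function names and the example parameters are assumptions for illustration; this is the DCM on its own, not the full model from the paper.

```python
from math import lgamma, exp

def dcm_sequence_log_prob(counts, alpha):
    """Log probability of one specific word sequence with these counts under DCM(alpha)."""
    a_sum = sum(alpha)
    n = sum(counts)
    lp = lgamma(a_sum) - lgamma(a_sum + n)
    for c, a in zip(counts, alpha):
        lp += lgamma(a + c) - lgamma(a)
    return lp

def repeat_probability(alpha, w):
    """P(second word is w | first word is w) under DCM(alpha)."""
    once = [1 if i == w else 0 for i in range(len(alpha))]
    twice = [2 if i == w else 0 for i in range(len(alpha))]
    return exp(dcm_sequence_log_prob(twice, alpha) - dcm_sequence_log_prob(once, alpha))

# With two equally likely words, a plain multinomial repeats a word half the time.
# A DCM with small parameters repeats it far more often.
bursty = repeat_probability([0.1, 0.1], 0)
mild = repeat_probability([10.0, 10.0], 0)
```

Here `bursty` equals (0.1 + 1) / (0.2 + 1), about 0.92, while `mild` equals 11 / 21, about 0.52. Smaller DCM parameters make a word that has appeared once much more likely to appear again.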

Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, Eric P. Xing. A Latent Variable Model for Geographic Lexical Variation. EMNLP (2010).

[BibTeX]

@inproceedings{eisenstein2010latent,
 author={Jacob Eisenstein and Brendan O'Connor and Noah A. Smith and Eric P. Xing},
 title={A Latent Variable Model for Geographic Lexical Variation},
 booktitle={EMNLP},
 year={2010},
 url={http://www.cc.gatech.edu/~jeisenst/papers/emnlp2010.pdf},
}

The widely-reported Twitter dialects paper. Topics combine a word distribution with a bivariate normal over latitude and longitude.

Jacob Eisenstein, Amr Ahmed, Eric P. Xing. Sparse Additive Generative Models of Text. ICML (2011).

[BibTeX]

@inproceedings{eisenstein2011sparse,
 author={Jacob Eisenstein and Amr Ahmed and Eric P. Xing},
 title={Sparse Additive Generative Models of Text},
 booktitle={ICML},
 year={2011},
 url={http://www.cc.gatech.edu/~jeisenst/papers/icml2011.pdf},
}

Presents a new generative model of text, based on the principle of sparse deviation from a background word distribution. This approach proves effective in supervised, unsupervised, and latent variable settings.

Elena Erosheva, Stephen Fienberg, John Lafferty. Mixed Membership Models of Scientific Publications. PNAS (101) 2004 pp. 5220-5227.

Bibliometrics

[BibTeX]

@article{erosheva2004mixed,
 author={Elena Erosheva and Stephen Fienberg and John Lafferty},
 title={Mixed Membership Models of Scientific Publications},
 journal={PNAS},
 year={2004},
 volume={101},
 number={Suppl. 1},
 pages={5220-5227},
}

James R. Foulds, L. Boyles, C. DuBois, Padhraic Smyth, Max Welling. Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation. KDD (2013).

Inference

[BibTeX]

@inproceedings{foulds2013stochastic,
 author={James R. Foulds and L. Boyles and C. DuBois and Padhraic Smyth and Max Welling},
 title={Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation},
 booktitle={KDD},
 year={2013},
}

Radim Řehůřek. gensim. (2009).

Implementations

[BibTeX]

@misc{gensim,
 author={Radim Řehůřek},
 title={gensim},
 year={2009},
 url={http://nlp.fi.muni.cz/projekty/gensim/},
}

Python package for topic modelling that includes distributed and online implementations of variational LDA.

Sean Gerrish, David M. Blei. A language-based approach to measuring scholarly impact. ICML (2010).

Bibliometrics

[BibTeX]

@inproceedings{gerrish2010language,
 author={Sean Gerrish and David M. Blei},
 title={A language-based approach to measuring scholarly impact},
 booktitle={ICML},
 year={2010},
 url={http://www.cs.princeton.edu/~blei/papers/GerrishBlei2010.pdf},
}

Mark Girolami, Ata Kabán. On an equivalence between pLSI and LDA. SIGIR (2003).

Theory

[BibTeX]

@inproceedings{girolami2003on,
 author={Mark Girolami and Ata Kabán},
 title={On an equivalence between pLSI and LDA},
 booktitle={SIGIR},
 year={2003},
 pages={433-434},
}

Andre Gohr, Myra Spiliopoulou, Alexander Hinneburg. Visually Summarizing the Evolution of Documents under a Social Tag. KDIR (2010).

Temporal

[BibTeX]

@inproceedings{gohr2010visually,
 author={Andre Gohr and Myra Spiliopoulou and Alexander Hinneburg},
 title={Visually Summarizing the Evolution of Documents under a Social Tag},
 booktitle={KDIR},
 year={2010},
 url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/kdir2010_TT.pdf},
}

Andre Gohr, Alexander Hinneburg, Rene Schult, Myra Spiliopoulou. Topic Evolution in a Stream of Documents. SDM (2009).

Temporal

[BibTeX]

@inproceedings{gohr2009,
 author={Andre Gohr and Alexander Hinneburg and Rene Schult and Myra Spiliopoulou},
 title={Topic Evolution in a Stream of Documents},
 booktitle={SDM},
 year={2009},
 pages={859-870},
 url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/sdm09_APLSA.pdf},
}

Thomas L. Griffiths, Mark Steyvers. Finding Scientific Topics. PNAS (101) 2004 pp. 5228-5235.

[BibTeX]

@article{griffiths04finding,
 author={Thomas L. Griffiths and Mark Steyvers},
 title={Finding Scientific Topics},
 journal={PNAS},
 year={2004},
 volume={101},
 number={suppl. 1},
 pages={5228-5235},
}

Thomas L. Griffiths, Mark Steyvers, David M. Blei, Joshua B. Tenenbaum. Integrating Topics and Syntax. NIPS (2004).

NLP

[BibTeX]

@incollection{griffiths2004integrating,
 author={Thomas L. Griffiths and Mark Steyvers and David M. Blei and Joshua B. Tenenbaum},
 editor={},
 title={Integrating Topics and Syntax},
 booktitle={NIPS},
 year={2004},
 pages={537-544},
 url={http://books.nips.cc/papers/files/nips17/NIPS2004_0642.pdf},
}

David Hall, Daniel Jurafsky, Christopher D. Manning. Studying the History of Ideas Using Topic Models. EMNLP (2008).

Bibliometrics

[BibTeX]

@inproceedings{hall2008studying,
 author={David Hall and Daniel Jurafsky and Christopher D. Manning},
 title={Studying the History of Ideas Using Topic Models},
 booktitle={EMNLP},
 year={2008},
 pages={363-371},
}

Gregor Heinrich. Parameter Estimation for Text Analysis. (2004).

Inference

[BibTeX][Abstract]

@techreport{heinrich2004parameter,
 author={Gregor Heinrich},
 title={Parameter Estimation for Text Analysis},
 year={2004},
 url={http://www.arbylon.net/publications/text-est.pdf},
}

Presents parameter estimation methods common with discrete probability distributions, which is of particular interest in text modeling. Starting with maximum likelihood, a posteriori and Bayesian estimation, central concepts like conjugate distributions and Bayesian networks are reviewed. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail with a full derivation of an approximate inference algorithm based on Gibbs sampling, including a discussion of Dirichlet hyperparameter estimation.

Gregor Heinrich. A generic approach to topic models. ECML/PKDD (2009).

Scalability

[BibTeX]

@inproceedings{heinrich2009generic,
 author={Gregor Heinrich},
 title={A generic approach to topic models},
 booktitle={ECML/PKDD},
 year={2009},
 url={http://arbylon.net/publications/mixnet-gibbs.pdf},
}

Gregor Heinrich. Infinite LDA. (2011).

Implementations, Non-parametric

[BibTeX]

@misc{heinrich2011infinite,
 author={Gregor Heinrich},
 title={Infinite LDA},
 year={2011},
 url={http://arbylon.net/projects/knowceans-ilda/knowceans-ilda.zip},
}

A simple implementation of a non-parametric model, where the number of topics is not fixed in advance. Uses Teh's direct assignment method for HDP.

Alexander Hinneburg, Hans-Henning Gabriel, Andre Gohr. Bayesian Folding-In with Dirichlet Kernels for PLSI. ICDM (2007).

Theory

[BibTeX]

@inproceedings{hinneburg2007bayesian,
 author={Alexander Hinneburg and Hans-Henning Gabriel and Andre Gohr},
 title={Bayesian Folding-In with Dirichlet Kernels for PLSI},
 booktitle={ICDM},
 year={2007},
 pages={499-504},
 url={http://users.informatik.uni-halle.de/~hinnebur/PS_Files/blsi_icdm07.pdf},
}

Thomas Hofmann. Probabilistic latent semantic analysis. UAI (1999).

[BibTeX]

@inproceedings{hofmann1999plsa,
 author={Thomas Hofmann},
 title={Probabilistic latent semantic analysis},
 booktitle={UAI},
 year={1999},
}

Matthew Hoffman, David M. Blei, Francis Bach. Online Learning for Latent Dirichlet Allocation. NIPS (2010).

[BibTeX]

@inproceedings{hoffman2010online,
 author={Matthew Hoffman and David M. Blei and Francis Bach},
 title={Online Learning for Latent Dirichlet Allocation},
 booktitle={NIPS},
 year={2010},
}

Jagadeesh Jagarlamudi, Hal Daumé III. Extracting Multilingual Topics from Unaligned Comparable Corpora. (2010).

Cross-language

[BibTeX]

@inproceedings{jagarlamudi2010extracting,
 author={Jagadeesh Jagarlamudi and Hal Daumé III},
 title={Extracting Multilingual Topics from Unaligned Comparable Corpora},
 booktitle={},
 year={2010},
 url={http://dx.doi.org/10.1007/978-3-642-12275-0_39},
 pages={444--456},
}

Mark Johnson. PCFGs, Topic Models, Adaptor Grammars, and Learning Topical Collocations and the Structure of Proper Names. (2010).

NLP

[BibTeX]

@inproceedings{johnson2010pcfgs,
 author={Mark Johnson},
 title={PCFGs, Topic Models, Adaptor Grammars, and Learning Topical Collocations and the Structure of Proper Names},
 booktitle={},
 year={2010},
}

Jyri J. Kivinen, Erik B. Sudderth, Michael I. Jordan. Learning Multiscale Representations of Natural Scenes Using Dirichlet Processes. ICCV (2007).

Non-parametric, Vision

[BibTeX][Abstract]

@inproceedings{kivinen2007learning,
 author={Jyri J. Kivinen and Erik B. Sudderth and Michael I. Jordan},
 title={Learning Multiscale Representations of Natural Scenes Using Dirichlet Processes},
 booktitle={ICCV},
 year={2007},
 url={http://www.cs.berkeley.edu/~jordan/papers/kivinen-sudderth-jordan-iccv07.pdf},
}

We develop nonparametric Bayesian models for multiscale representations of images depicting natural scene categories. Individual features or wavelet coefficients are marginally described by Dirichlet process (DP) mixtures, yielding the heavy-tailed marginal distributions characteristic of natural images. Dependencies between features are then captured with a hidden Markov tree, and Markov chain Monte Carlo methods used to learn models whose latent state space grows in complexity as more images are observed. By truncating the potentially infinite set of hidden states, we are able to exploit efficient belief propagation methods when learning these hierarchical Dirichlet process hidden Markov trees (HDP-HMTs) from data. We show that our generative models capture interesting qualitative structure in natural scenes, and more accurately categorize novel images than models which ignore spatial relationships among features.

The paper introduces a blocked Gibbs sampler for learning a nonparametric Bayesian topic model whose topic assignments are coupled with a tree-structured graphical model.

Simon Lacoste-Julien, Fei Sha, Michael I. Jordan. DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification. NIPS (2008).

[BibTeX]

@inproceedings{lacoste2008disclda,
 author={Simon Lacoste-Julien and Fei Sha and Michael I. Jordan},
 title={DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification},
 booktitle={NIPS},
 year={2008},
 url={http://books.nips.cc/papers/files/nips21/NIPS2008_0993.pdf},
}

Thomas K. Landauer, Susan T. Dumais. Solutions to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review (104) 1997.

[BibTeX]

@article{landauer1997solutions,
 author={Thomas K. Landauer and Susan T. Dumais},
 title={Solutions to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge},
 journal={Psychological Review},
 year={1997},
 volume={104},
}

John Langford. Vowpal Wabbit. (2011).

Implementations

[BibTeX]

@misc{vowpalwabbit,
 author={John Langford},
 title={Vowpal Wabbit},
 year={2011},
 url={https://github.com/JohnLangford/vowpal_wabbit/wiki},
}

VW includes an implementation of Hoffman et al.'s online variational LDA.

Wei Li, David Blei, Andrew McCallum. Nonparametric Bayes Pachinko Allocation. (2007).

Non-parametric

[BibTeX]

@techreport{li2007nonparametric,
 author={Wei Li and David Blei and Andrew McCallum},
 title={Nonparametric Bayes Pachinko Allocation},
 year={2007},
}

Wei-Hao Lin, Eric P. Xing, Alexander Hauptmann. A Joint Topic and Perspective Model for Ideological Discourse. ECML PKDD (2008).

[BibTeX]

@inproceedings{lin2008joint,
 author={Wei-Hao Lin and Eric P. Xing and Alexander Hauptmann},
 title={A Joint Topic and Perspective Model for Ideological Discourse},
 booktitle={ECML PKDD},
 year={2008},
 pages={17-32},
 url={http://portal.acm.org/citation.cfm?id=1431999.1432002},
}

Rasmus Madsen, David Kauchak, Charles Elkan. Modeling Word Burstiness Using the Dirichlet Distribution. ICML (2005).

[BibTeX]

@inproceedings{madsen2005modeling,
 author={Rasmus Madsen and David Kauchak and Charles Elkan},
 title={Modeling Word Burstiness Using the Dirichlet Distribution},
 booktitle={ICML},
 year={2005},
}

Andrew Kachites McCallum. MALLET: A Machine Learning for Language Toolkit. (2002).

Implementations

[BibTeX]

@misc{mallet,
 author={Andrew Kachites McCallum},
 title={MALLET: A Machine Learning for Language Toolkit},
 year={2002},
 url={http://mallet.cs.umass.edu},
}

Implements Gibbs sampling for LDA in Java using fast sampling methods from Yao et al. MALLET also includes support for data preprocessing, classification, and sequence tagging.

Andrew McCallum, Andrés Corrada-Emmanuel, Xuerui Wang. Topic and Role Discovery in Social Networks. IJCAI (2005).

Networks

[BibTeX]

@inproceedings{mccallum2005topic,
 author={Andrew McCallum and Andrés Corrada-Emmanuel and Xuerui Wang},
 title={Topic and Role Discovery in Social Networks},
 booktitle={IJCAI},
 year={2005},
}

Rishabh Mehrotra, Scott Sanner, Wray Buntine, Lexing Xie. Improving LDA Topic Models for Microblogs via Tweet Pooling and Automatic Labeling. SIGIR (2013).

Social media

[BibTeX]

@inproceedings{mehrotra2013improving,
 author={Rishabh Mehrotra and Scott Sanner and Wray Buntine and Lexing Xie},
 title={Improving LDA Topic Models for Microblogs via Tweet Pooling and Automatic Labeling},
 booktitle={SIGIR},
 year={2013},
}

Merging tweets based on hashtags and imputed hashtags improves topic modeling.
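
The pooling step can be sketched as a preprocessing pass that merges tweets sharing a hashtag into one pseudo-document before any topic model is trained. This is a rough illustration only: the function name, the regular expression, and the choice to return untagged tweets separately are assumptions for the example, and the imputation of hashtags for untagged tweets is not shown.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")

def pool_by_hashtag(tweets):
    """Group tweet texts into one pseudo-document per hashtag.

    Returns (pooled, unpooled): pooled maps a lowercased hashtag to the
    concatenated text of every tweet that uses it; unpooled lists tweets
    with no hashtag, which are left as they are in this sketch.
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        tags = {t.lower() for t in HASHTAG.findall(text)}
        if not tags:
            unpooled.append(text)
        # A tweet with several hashtags is added to each matching pool.
        for tag in tags:
            pools[tag].append(text)
    pooled = {tag: " ".join(texts) for tag, texts in pools.items()}
    return pooled, unpooled

pooled, unpooled = pool_by_hashtag([
    "New sampler released #LDA",
    "Slides on inference #lda #nlp",
    "Lunch was great",
])
```

The pooled pseudo-documents are longer than individual tweets, which gives a topic model more word co-occurrence evidence per document.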

Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, ChengXiang Zhai. Topic sentiment mixture: modeling facets and opinions in weblogs. WWW (2007).

[BibTeX]

@inproceedings{mei2007topic,
 author={Qiaozhu Mei and Xu Ling and Matthew Wondra and Hang Su and ChengXiang Zhai},
 title={Topic sentiment mixture: modeling facets and opinions in weblogs},
 booktitle={WWW},
 year={2007},
}

Qiaozhu Mei, Xuehua Shen, ChengXiang Zhai. Automatic labeling of multinomial topic models. KDD (2007).

User interface

[BibTeX]

@inproceedings{mei2007automatic,
 author={Qiaozhu Mei and Xuehua Shen and ChengXiang Zhai},
 title={Automatic labeling of multinomial topic models},
 booktitle={KDD},
 year={2007},
 pages={490-499},
}

Qiaozhu Mei, Deng Cai, Duo Zhang, ChengXiang Zhai. Topic modeling with network regularization. WWW (2008).

Networks

[BibTeX][Abstract]

@inproceedings{mei2008topic,
 author={Qiaozhu Mei and Deng Cai and Duo Zhang and ChengXiang Zhai},
 title={Topic modeling with network regularization},
 booktitle={WWW},
 year={2008},
 url={http://portal.acm.org/citation.cfm?id=1367512},
}

In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic model with a harmonic regularizer based on a graph structure in the data. The proposed method bridges topic modeling and social network analysis, which leverages the power of both statistical topic models and discrete regularization. The output of this model well summarizes topics in text, maps a topic on the network, and discovers topical communities. With concrete selection of a topic model and a graph-based regularizer, our model can be applied to text mining problems such as author-topic analysis, community discovery, and spatial text mining. Empirical experiments on two different genres of data show that our approach is effective, which improves text-oriented methods as well as network-oriented methods. The proposed model is general; it can be applied to any text collections with a mixture of topics and an associated network structure.

David Mimno, Andrew McCallum. Expertise Modeling for Matching Papers with Reviewers. KDD (2007).

[BibTeX]

@inproceedings{mimno2007expertise,
 author={David Mimno and Andrew McCallum},
 title={Expertise Modeling for Matching Papers with Reviewers},
 booktitle={KDD},
 year={2007},
}

David Mimno, Andrew McCallum. Mining a digital library for influential authors. JCDL (2007).

Bibliometrics

[BibTeX]

@inproceedings{mimno2007mining,
 author={David Mimno and Andrew McCallum},
 title={Mining a digital library for influential authors},
 booktitle={JCDL},
 year={2007},
}

David Mimno, Wei Li, Andrew McCallum. Mixtures of Hierarchical Topics with Pachinko Allocation. ICML (2007).

[BibTeX]

@inproceedings{mimno2007hierarchical,
 author={David Mimno and Wei Li and Andrew McCallum},
 title={Mixtures of Hierarchical Topics with Pachinko Allocation},
 booktitle={ICML},
 year={2007},
}

David Mimno, Andrew McCallum. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. UAI (2008).

[BibTeX]

@inproceedings{mimno2008dmr,
 author={David Mimno and Andrew McCallum},
 title={Topic models conditioned on arbitrary features with Dirichlet-multinomial regression},
 booktitle={UAI},
 year={2008},
 url={http://www.cs.umass.edu/~mimno/papers/dmr-uai.pdf},
}

Per-document Dirichlet priors over topic distributions are generated using a log-linear combination of observed document features and learned feature-topic parameters. Implemented in MALLET.
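
A minimal sketch of the log-linear prior described above, assuming each document has a feature vector x_d and each topic a parameter vector lambda_t, so that alpha[d][t] = exp(x_d . lambda_t). The function name `dmr_alpha` and the example values are hypothetical, and learning the parameters is not shown.

```python
import math

def dmr_alpha(features, lam):
    """Per-document Dirichlet parameters: alpha[d][t] = exp(x_d . lambda_t).

    features: list of D feature vectors, each of length F
    lam:      list of T parameter vectors, each of length F (hypothetical values)
    """
    return [
        [math.exp(sum(x * l for x, l in zip(x_d, lam_t))) for lam_t in lam]
        for x_d in features
    ]

# Two documents, two features (a constant intercept plus one indicator).
alpha = dmr_alpha(
    features=[[1.0, 0.0], [1.0, 1.0]],
    lam=[[0.0, 0.0], [0.0, math.log(2.0)]],
)
```

In this toy setting the indicator feature doubles the prior weight of the second topic for the second document and leaves the first document's prior symmetric.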

David Mimno, Hanna Wallach, Andrew McCallum. Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors. NIPS Workshop on Analyzing Graphs (2008).

Networks

[BibTeX]

@inproceedings{mimno2008gibbs,
 author={David Mimno and Hanna Wallach and Andrew McCallum},
 title={Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors},
 booktitle={NIPS Workshop on Analyzing Graphs},
 year={2008},
 url={http://www.cs.umass.edu/~mimno/papers/sampledlgstnorm.pdf},
}

Introduces an auxiliary-variable method for Gibbs sampling in non-conjugate topic models.

David Mimno, Hanna Wallach, Jason Naradowsky, David A. Smith, Andrew McCallum. Polylingual Topic Models. EMNLP (2009).

Cross-language

[BibTeX]

@inproceedings{mimno2009polylingual,
 author={David Mimno and Hanna Wallach and Jason Naradowsky and David A. Smith and Andrew McCallum},
 title={Polylingual Topic Models},
 booktitle={EMNLP},
 year={2009},
 url={http://www.cs.umass.edu/~mimno/papers/mimno2009polylingual.pdf},
}

David Mimno. Reconstructing Pompeian Households. UAI (2011).

Cross-language

[BibTeX]

@inproceedings{mimno2011reconstructing,
 author={David Mimno},
 title={Reconstructing Pompeian Households},
 booktitle={UAI},
 year={2011},
 url={http://www.cs.princeton.edu/~mimno/papers/pompeii.pdf},
}

David Mimno, Hanna Wallach, Edmund Talley, Miriam Leenders, Andrew McCallum. Optimizing Semantic Coherence in Topic Models. EMNLP (2011).

Evaluation

[BibTeX]

@inproceedings{mimno2011optimizing,
 author={David Mimno and Hanna Wallach and Edmund Talley and Miriam Leenders and Andrew McCallum},
 title={Optimizing Semantic Coherence in Topic Models},
 booktitle={EMNLP},
 year={2011},
 url={http://www.cs.princeton.edu/~mimno/papers/mimno-semantic-emnlp.pdf},
}

A simple, automated metric that uses only information contained in the training documents has strong ability to predict human judgments of topic coherence.
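
The metric can be sketched in a few lines of Python. The version below assumes the usual statement of the score, a sum over ordered pairs of a topic's top words of log((D(v_m, v_l) + 1) / D(v_l)), where D counts training documents containing the given words. The function name and the input layout are illustrative choices, not the authors' code.

```python
import math

def topic_coherence(top_words, documents):
    """Sum over ordered word pairs of log((D(v_m, v_l) + 1) / D(v_l)).

    top_words: a topic's most probable words, most probable first
    documents: iterable of token lists (the training corpus)
    Assumes every top word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + 1) / doc_freq(top_words[l]))
    return score
```

Scores closer to zero indicate that a topic's top words tend to co-occur in the same documents.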

David Mimno, David Blei. Bayesian Checking for Topic Models. EMNLP (2011).

Evaluation

[BibTeX]

@inproceedings{mimno2011bayesian,
 author={David Mimno and David Blei},
 title={Bayesian Checking for Topic Models},
 booktitle={EMNLP},
 year={2011},
 url={http://www.cs.princeton.edu/~mimno/papers/mimno-ppcs-emnlp.pdf},
}

Posterior predictive checks are useful in detecting lack of fit in topic models and identifying which metadata-enriched models might be useful.

Indraneel Mukherjee, David Blei. Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation. NIPS (2008).

Inference

[BibTeX]

@inproceedings{mukherjee2008relative,
 author={Indraneel Mukherjee and David Blei},
 title={Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation},
 booktitle={NIPS},
 year={2008},
 url={http://books.nips.cc/papers/files/nips21/NIPS2008_0434.pdf},
}

Claudiu Musat, Julien Velcin, Stefan Trausan-Matu, Marian-Andrei Rizoiu. Improving Topic Evaluation Using Conceptual Knowledge. IJCAI (2011).

Evaluation

[BibTeX]

@inproceedings{musat2011improving,
 author={Claudiu Musat and Julien Velcin and Stefan Trausan-Matu and Marian-Andrei Rizoiu},
 title={Improving Topic Evaluation Using Conceptual Knowledge},
 booktitle={IJCAI},
 year={2011},
}

Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William Cohen. Joint Latent Topic Models for Text and Citations. KDD (2008).

Networks

[BibTeX]

@inproceedings{nallapati2008joint,
 author={Ramesh Nallapati and Amr Ahmed and Eric P. Xing and William Cohen},
 title={Joint Latent Topic Models for Text and Citations},
 booktitle={KDD},
 year={2008},
 url={http://portal.acm.org/citation.cfm?id=1401957},
 pages={542--550},
}

This is one of the first papers to address joint topic models of text and hyperlinks. Used as a baseline in the more recent Relational Topic Models. (R.N.)

Ramesh Nallapati, William Cohen, Susan Ditmore, John Lafferty, Kin Ung. Multi-scale Topic Tomography. KDD (2007).

Temporal

[BibTeX]

@inproceedings{nallapati2007multiscale,
 author={Ramesh Nallapati and William Cohen and Susan Ditmore and John Lafferty and Kin Ung},
 title={Multi-scale Topic Tomography},
 booktitle={KDD},
 year={2007},
 url={http://portal.acm.org/citation.cfm?id=1281249},
 pages={520--529},
}

Models variation of topic content with time at various scales of resolution. A novel variant of dynamic topic models that uses the Poisson distribution for word generation, and wavelets. (R.N.)

Ramesh Nallapati, William Cohen, John Lafferty. Parallelized Variational EM for Latent Dirichlet Allocation: An experimental evaluation of speed and scalability. ICDM workshop on high performance data mining (2007).

Scalability

[BibTeX]

@inproceedings{nallapati2007parallelized,
 author={Ramesh Nallapati and William Cohen and John Lafferty},
 title={Parallelized Variational EM for Latent Dirichlet Allocation: An experimental evaluation of speed and scalability},
 booktitle={ICDM workshop on high performance data mining},
 year={2007},
 url={http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4178&rep=rep1&type=pdf},
}

Early paper on parallel implementations of variational EM for LDA. (R.N.)

Ramesh Nallapati. multithreaded lda-c. (2010).

Implementations

[BibTeX]

@misc{nallapati2010multi-lda-c,
 author={Ramesh Nallapati},
 title={multithreaded lda-c},
 year={2010},
 url={https://sites.google.com/site/rameshnallapati/software},
}

Multithreaded extension of David Blei's LDA implementation in C. Speeds up the computation by orders of magnitude, depending on the number of processors.

David Newman, Chaitanya Chemudugunta, Padhraic Smyth. Statistical entity-topic models. KDD (2006).

[BibTeX]

@inproceedings{newman2006statistical,
 author={David Newman and Chaitanya Chemudugunta and Padhraic Smyth},
 title={Statistical entity-topic models},
 booktitle={KDD},
 year={2006},
}

D. Newman, S. Block. Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper. JASIST 2006.

[BibTeX]

@article{newman2005probabilistic,
 author={D. Newman and S. Block},
 title={Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper},
 journal={JASIST},
 year={2006},
}

David Newman, Jey Han Lau, Karl Grieser, Timothy Baldwin. Automatic Evaluation of Topic Coherence. NAACL (2010).

Evaluation

[BibTeX]

@inproceedings{newman2010automatic,
 author={David Newman and Jey Han Lau and Karl Grieser and Timothy Baldwin},
 title={Automatic Evaluation of Topic Coherence},
 booktitle={NAACL},
 year={2010},
}

Xiaochuan Ni, Jian-Tao Sun, Jian Hu, Zheng Chen. Mining Multilingual Topics from Wikipedia. WWW (2009).

Cross-language

[BibTeX][Abstract]

@inproceedings{ni2009multilingual,
 author={Xiaochuan Ni and Jian-Tao Sun and Jian Hu and Zheng Chen},
 title={Mining Multilingual Topics from Wikipedia},
 booktitle={WWW},
 year={2009},
 url={http://www2009.eprints.org/158/},
}

In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages. Based on the observation that one Wikipedia concept may be described by articles in different languages, we adapt existing topic modeling algorithm for mining multilingual topics from this knowledge base. The extracted "universal" topics have multiple types of representations, with each type corresponding to one language. Accordingly, new documents of different languages can be represented in a space using a group of universal topics, which makes various multilingual Web applications feasible.

Xuan-Hieu Phan, Cam-Tu Nguyen. GibbsLDA++. (2007).

Implementations

[BibTeX]

@misc{gibbslda++,
 author={Xuan-Hieu Phan and Cam-Tu Nguyen},
 title={GibbsLDA++},
 year={2007},
 url={http://gibbslda.sourceforge.net},
}

C/C++ implementation of LDA with Gibbs sampling.

Jukka Perkiö, Wray L. Buntine, Sami Perttu. Exploring Independent Trends in a Topic-Based Search Engine. Web Intelligence (2004).

[BibTeX]

@inproceedings{perkio2004exploring,
 author={Jukka Perkiö and Wray L. Buntine and Sami Perttu},
 title={Exploring Independent Trends in a Topic-Based Search Engine},
 booktitle={Web Intelligence},
 year={2004},
 pages={664-668},
}

Matthew Purver, Konrad Körding, Thomas L. Griffiths, Joshua Tenenbaum. Unsupervised Topic Modelling for Multi-Party Spoken Discourse. ACL (2006).

[BibTeX]

@inproceedings{purver2006unsupervised,
 author={Matthew Purver and Konrad Körding and Thomas L. Griffiths and Joshua Tenenbaum},
 title={Unsupervised Topic Modelling for Multi-Party Spoken Discourse},
 booktitle={ACL},
 year={2006},
 url={http://web.mit.edu/cocosci/Papers/purver-et-al06acl.pdf},
}

Daniel Ramage, Evan Rosen. Stanford Topic Modeling Toolbox. (2009).

Implementations

[BibTeX]

@misc{ramage-tmt,
 author={Daniel Ramage and Evan Rosen},
 title={Stanford Topic Modeling Toolbox},
 year={2009},
 url={http://nlp.stanford.edu/software/tmt/tmt-0.3/},
}

Scala implementation of LDA and LabeledLDA. Input and output integration with Excel.

Daniel Ramage, David Hall, Ramesh Nallapati, Christopher D. Manning. Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-Labeled Corpora. EMNLP (2009).

[BibTeX]

@inproceedings{ramage2009labeled,
 author={Daniel Ramage and David Hall and Ramesh Nallapati and Christopher D. Manning},
 title={Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-Labeled Corpora},
 booktitle={EMNLP},
 year={2009},
}

Daniel Ramage, Susan Dumais, Dan Liebling. Characterizing Microblogs with Topic Models. ICWSM (2010).

[BibTeX]

@inproceedings{ramage2010characterizing,
 author={Daniel Ramage and Susan Dumais and Dan Liebling},
 title={Characterizing Microblogs with Topic Models},
 booktitle={ICWSM},
 year={2010},
 url={http://www.stanford.edu/~dramage/papers/twitter-icwsm10.pdf},
}

Joseph Reisinger, Austin Waters, Brian Silverthorn, Raymond J. Mooney. Spherical Topic Models. ICML (2010).

[BibTeX][Abstract]

@inproceedings{reisinger2010spherical,
 author={Joseph Reisinger and Austin Waters and Brian Silverthorn and Raymond J. Mooney},
 title={Spherical Topic Models},
 booktitle={ICML},
 year={2010},
 url={http://www.cs.utexas.edu/users/ml/papers/reisinger.icml10.pdf},
}

We introduce the Spherical Admixture Model (SAM), a Bayesian topic model for arbitrary L2 normalized data. SAM maintains the same hierarchical structure as Latent Dirichlet Allocation (LDA), but models documents as points on a high-dimensional spherical manifold, allowing a natural likelihood parameterization in terms of cosine distance. Furthermore, SAM can model word absence/presence at the document level, and unlike previous models can assign explicit negative weight to topic terms. Performance is evaluated empirically, both through human ratings of topic quality and through diverse classification tasks from natural language processing and computer vision. In these experiments, SAM consistently outperforms existing models.

Michal Rosen-Zvi, Tom Griffiths, Mark Steyvers, Padhraic Smyth. The Author-Topic Model for Authors and Documents. UAI (2004).

[BibTeX]

@inproceedings{rosenzvi2004author,
 author={Michal Rosen-Zvi and Tom Griffiths and Mark Steyvers and Padhraic Smyth},
 title={The Author-Topic Model for Authors and Documents},
 booktitle={UAI},
 year={2004},
}

Ruslan Salakhutdinov, Geoffrey Hinton. Replicated Softmax: an Undirected Topic Model. NIPS (2009).

[BibTeX]

@inproceedings{salakhutdinov2009replicated,
 author={Ruslan Salakhutdinov and Geoffrey Hinton},
 title={Replicated Softmax: an Undirected Topic Model},
 booktitle={NIPS},
 year={2009},
 url={http://books.nips.cc/papers/files/nips22/NIPS2009_0817.pdf},
}

Carson Sievert, Kenneth E. Shirley. LDAvis: A method for visualizing and interpreting topics. Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces (2014).

[BibTeX]

@inproceedings{sievert2014ldavis,
 author={Carson Sievert and Kenneth E. Shirley},
 title={LDAvis: A method for visualizing and interpreting topics},
 booktitle={Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces},
 year={2014},
 url={https://github.com/cpsievert/LDAvis},
}

Shravan Narayanamurthy. Yahoo! LDA. (2011).

Implementations

[BibTeX]

@misc{YahooLDA,
 author={Shravan Narayanamurthy},
 title={Yahoo! LDA},
 year={2011},
 url={https://github.com/shravanmn/Yahoo_LDA/wiki},
}

Y!LDA implements a fast, sampling-based, distributed algorithm. See Smola and Narayanamurthy for details.

Alexander Smola, Shravan Narayanamurthy. An Architecture for Parallel Topic Models. VLDB (2010).

Scalability

[BibTeX]

@inproceedings{smola2010architecture,
 author={Alexander Smola and Shravan Narayanamurthy},
 title={An Architecture for Parallel Topic Models},
 booktitle={VLDB},
 year={2010},
}

Mark Steyvers, Tom Griffiths. Matlab Topic Modeling Toolbox. (2005).

Implementations

[BibTeX]

@misc{steyvers-tmtb,
 author={Mark Steyvers and Tom Griffiths},
 title={Matlab Topic Modeling Toolbox},
 year={2005},
 url={http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm},
}

Implements LDA, Author-Topic, HMM-LDA, LDA-COL. Tools for 2D visualization.

Mark Steyvers, Tom Griffiths. Probabilistic Topic Models. In Landauer, T., McNamara, D., Dennis, S., Kintsch, W., Latent Semantic Analysis: A Road to Meaning. (2006).

Where to start

[BibTeX]

@incollection{steyvers2006probabilistic,
 author={Mark Steyvers and Tom Griffiths},
 editor={Landauer, T. and McNamara, D. and Dennis, S. and Kintsch, W.},
 title={Probabilistic Topic Models},
 booktitle={Latent Semantic Analysis: A Road to Meaning.},
 year={2006},
 publisher={Lawrence Erlbaum},
 url={http://cocosci.berkeley.edu/tom/papers/SteyversGriffiths.pdf},
}

A good introduction to topic modeling.

Claudio Taranto, Nicola Di Mauro, Floriana Esposito. rsLDA: a Bayesian Hierarchical Model for Relational Learning. ICDKE (2011).

[BibTeX]

@inproceedings{taranto2011rslda,
 author={Claudio Taranto and Nicola Di Mauro and Floriana Esposito},
 title={rsLDA: a Bayesian Hierarchical Model for Relational Learning},
 booktitle={ICDKE},
 year={2011},
 url={http://www.di.uniba.it/~ndm/publications/files/taranto11icdke.pdf},
}

Yee-Whye Teh, David Newman, Max Welling. A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation. NIPS (2006).

Inference

[BibTeX]

@inproceedings{teh2006collapsed,
 author={Yee-Whye Teh and David Newman and Max Welling},
 title={A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation},
 booktitle={NIPS},
 year={2006},
 url={http://books.nips.cc/papers/files/nips19/NIPS2006_0511.pdf},
}

Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, David M. Blei. Hierarchical Dirichlet Processes. JASA (101) 2006.

Non-parametric

[BibTeX]

@article{teh2006hierarchical,
 author={Yee Whye Teh and Michael I. Jordan and Matthew J. Beal and David M. Blei},
 title={Hierarchical Dirichlet Processes},
 journal={JASA},
 year={2006},
 url={http://dx.doi.org/10.1198/016214506000000302},
 volume={101},
}

Kristina Toutanova, Mark Johnson. A Bayesian LDA-based model for semi-supervised part-of-speech tagging. NIPS (2007).

NLP

[BibTeX]

@inproceedings{toutanova2007bayesian,
 author={Kristina Toutanova and Mark Johnson},
 title={A Bayesian LDA-based model for semi-supervised part-of-speech tagging},
 booktitle={NIPS},
 year={2007},
 pages={1521-1528},
 url={http://books.nips.cc/papers/files/nips20/NIPS2007_0964.pdf},
}

Hanna M. Wallach. Topic modeling: beyond bag-of-words. ICML (2006).

[BibTeX]

@inproceedings{wallach2006beyond,
 author={Hanna M. Wallach},
 title={Topic modeling: beyond bag-of-words},
 booktitle={ICML},
 year={2006},
}

Hanna Wallach, Iain Murray, Ruslan Salakhutdinov, David Mimno. Evaluation Methods for Topic Models. ICML (2009).

Evaluation

[BibTeX]

@inproceedings{wallach2009evaluation,
 author={Hanna Wallach and Iain Murray and Ruslan Salakhutdinov and David Mimno},
 title={Evaluation Methods for Topic Models},
 booktitle={ICML},
 year={2009},
 url={http://www.cs.umass.edu/~mimno/papers/wallach09evaluation.pdf},
}

Commonly used methods for estimating the probability of held-out words may be unstable. This paper presents more accurate methods.

Hanna Wallach, David Mimno, Andrew McCallum. Rethinking LDA: Why priors matter. NIPS (2009).

Theory

[BibTeX]

@inproceedings{wallach2009rethinking,
 author={Hanna Wallach and David Mimno and Andrew McCallum},
 title={Rethinking LDA: Why priors matter},
 booktitle={NIPS},
 year={2009},
 url={http://www.cs.umass.edu/~mimno/papers/NIPS2009_0929.pdf},
}

The use of an asymmetric Dirichlet prior on per-document topic distributions reduces sensitivity to very common words (e.g., stopwords and near-stopwords) and makes topic assignments more stable as the number of topics grows.

Chang Wang, Sridhar Mahadevan. Multiscale Analysis of Document Corpora Based on Diffusion Models. IJCAI (2009).

[BibTeX]

@inproceedings{wang2009multiscale,
 author={Chang Wang and Sridhar Mahadevan},
 title={Multiscale Analysis of Document Corpora Based on Diffusion Models},
 booktitle={IJCAI},
 year={2009},
 url={http://www.cs.umass.edu/~chwang/papers/IJCAI-2009-TD.pdf},
}

Chang Wang, James Fan, Aditya Kalyanpur, David Gondek. Relation Extraction with Relation Topics. EMNLP (2011).

[BibTeX]

@inproceedings{wang2011relation,
 author={Chang Wang and James Fan and Aditya Kalyanpur and David Gondek},
 title={Relation Extraction with Relation Topics},
 booktitle={EMNLP},
 year={2011},
 url={http://www-all.cs.umass.edu/~chwang/EMNLP-2011.pdf},
}

Xuerui Wang, Natasha Mohanty, Andrew McCallum. Group and Topic Discovery from Relations and Their Attributes. NIPS (2005).

Networks

[BibTeX]

@inproceedings{wang2005group,
 author={Xuerui Wang and Natasha Mohanty and Andrew McCallum},
 title={Group and Topic Discovery from Relations and Their Attributes},
 booktitle={NIPS},
 year={2005},
 url={http://books.nips.cc/papers/files/nips18/NIPS2005_0819.pdf},
}

Xuerui Wang, Andrew McCallum. Topics Over Time: a non-Markov continuous-time model of topical trends. KDD (2006).

Temporal

[BibTeX]

@inproceedings{wang2006topics,
 author={Xuerui Wang and Andrew McCallum},
 title={Topics Over Time: a non-Markov continuous-time model of topical trends},
 booktitle={KDD},
 year={2006},
}

Chong Wang, David M. Blei, David Heckerman. Continuous Time Dynamic Topic Models. UAI (2008).

Temporal

[BibTeX][Abstract]

@inproceedings{wang2008continuous,
 author={Chong Wang and David M. Blei and David Heckerman},
 title={Continuous Time Dynamic Topic Models},
 booktitle={UAI},
 year={2008},
 url={http://uai2008.cs.helsinki.fi/UAI_camera_ready/wang.pdf},
}

In this paper, we develop the continuous time dynamic topic model (cDTM). The cDTM is a dynamic topic model that uses Brownian motion to model the latent topics through a sequential collection of documents, where a "topic" is a pattern of word use that we expect to evolve over the course of the collection. We derive an efficient variational approximate inference algorithm that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points. In contrast to the cDTM, the original discrete-time dynamic topic model (dDTM) requires that time be discretized. Moreover, the complexity of variational inference for the dDTM grows quickly as time granularity increases, a drawback which limits fine-grained discretization. We demonstrate the cDTM on two news corpora, reporting both predictive perplexity and the novel task of time stamp prediction.

Chong Wang, David Blei, Fei-Fei Li. Simultaneous Image Classification and Annotation. CVPR (2009).

Vision

[BibTeX]

@inproceedings{wang2009simulataneous,
 author={Chong Wang and David Blei and Fei-Fei Li},
 title={Simultaneous Image Classification and Annotation},
 booktitle={CVPR},
 year={2009},
}

Yi Wang. Distributed Gibbs Sampling of Latent Dirichlet Allocation: The Gritty Details. (2011).

Where to start

[BibTeX]

@misc{wang2011gritty,
 author={Yi Wang},
 title={Distributed Gibbs Sampling of Latent Dirichlet Allocation: The Gritty Details},
 year={2011},
 url={http://dbgroup.cs.tsinghua.edu.cn/wangyi/lda/lda.pdf},
}

A thorough introduction for those wanting to understand the mathematical basics of topic models.

Wei Li, Andrew McCallum. Pachinko allocation: DAG-structured mixture models of topic correlations. ICML (2006).

[BibTeX]

@inproceedings{wei2006pachinko,
 author={Wei Li and Andrew McCallum},
 title={Pachinko allocation: DAG-structured mixture models of topic correlations},
 booktitle={ICML},
 year={2006},
}

Xing Wei, Bruce Croft. LDA-based document models for ad-hoc retrieval. SIGIR (2006).

[BibTeX]

@inproceedings{wei2006lda,
 author={Xing Wei and Bruce Croft},
 title={LDA-based document models for ad-hoc retrieval},
 booktitle={SIGIR},
 year={2006},
}

Feng Yan, Ningyi Xu, Yuan Qi. Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units. NIPS (2009).

[BibTeX]

@inproceedings{yan2009parallel,
 author={Feng Yan and Ningyi Xu and Yuan Qi},
 title={Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units},
 booktitle={NIPS},
 year={2009},
 url={http://books.nips.cc/papers/files/nips22/NIPS2009_0546.pdf},
}

In addition to dividing the corpus between processors, this work divides the vocabulary into the same number of partitions, such that each processor works on both its own documents and its own words at each epoch. This increases the number of epochs, but drastically reduces the possibility of incorrect samples.
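
One way to picture the partitioning is as a rotating schedule over a P x P grid of (document block, word block) cells. The sketch below assumes a simple diagonal rotation, in which processor p keeps document block p and moves to word block (p + e) mod P in epoch e. That rotation rule and the function name are assumptions for illustration and are not taken from the paper.

```python
def partition_schedule(num_procs):
    """For each epoch, list (processor, doc_block, word_block) assignments.

    Processor p always owns doc block p; in epoch e it samples only tokens
    whose word falls in word block (p + e) % num_procs, so no two processors
    touch the same word-topic counts in the same epoch.
    """
    return [
        [(p, p, (p + e) % num_procs) for p in range(num_procs)]
        for e in range(num_procs)
    ]

schedule = partition_schedule(3)
```

After P epochs every (document block, word block) cell has been visited exactly once, which is why a full pass over the corpus takes more epochs than document-only partitioning.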

Shuang-Hong Yang, Steven P. Crain, Hongyuan Zha. Bridging the language gap: topic adaptation for documents with different technicality. AIStats (2011).

[BibTeX]

@inproceedings{yang2011bridging,
 author={Shuang-Hong Yang and Steven P. Crain and Hongyuan Zha},
 title={Bridging the language gap: topic adaptation for documents with different technicality},
 booktitle={AIStats},
 year={2011},
 url={http://jmlr.csail.mit.edu/proceedings/papers/v15/yang11b/yang11b.pdf},
}

Limin Yao, David Mimno, Andrew McCallum. Efficient Methods for Topic Model Inference on Streaming Document Collections. KDD (2009).

Scalability

[BibTeX]

@inproceedings{yao2009efficient,
 author={Limin Yao and David Mimno and Andrew McCallum},
 title={Efficient Methods for Topic Model Inference on Streaming Document Collections},
 booktitle={KDD},
 year={2009},
 url={http://www.cs.umass.edu/~mimno/papers/fast-topic-model.pdf},
}

Explores methods for inferring topic distributions for new documents given a trained model. This paper includes the SparseLDA algorithm and data structure, which can dramatically improve time and memory performance in Gibbs sampling.
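
The core of SparseLDA is to split the unnormalized Gibbs sampling mass for a token into three buckets, two of which involve only nonzero counts. The sketch below assumes the standard collapsed-Gibbs weight (alpha_t + n_{t|d}) (beta + n_{w|t}) / (beta V + n_t). It loops densely over topics for clarity, whereas a real implementation would visit only nonzero entries and cache the smoothing bucket. All names are illustrative.

```python
def sparse_lda_buckets(alpha, beta, vocab_size, doc_topic, word_topic, topic_total):
    """Split the unnormalised sampling mass for one token into three buckets.

    alpha:       list of T Dirichlet hyperparameters
    doc_topic:   n_{t|d}, topic counts in the current document (mostly zero)
    word_topic:  n_{w|t}, counts of the current word in each topic (mostly zero)
    topic_total: n_t, total tokens assigned to each topic
    """
    s = r = q = 0.0
    for t in range(len(alpha)):
        denom = beta * vocab_size + topic_total[t]
        s += alpha[t] * beta / denom                            # smoothing only
        r += doc_topic[t] * beta / denom                        # document-topic
        q += (alpha[t] + doc_topic[t]) * word_topic[t] / denom  # topic-word
    return s, r, q
```

Because most of the mass usually falls in the topic-word bucket, a sampler can often pick a topic after inspecting only the few topics in which the current word actually occurs.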

Jianwen Zhang, Yangqiu Song, Changshui Zhang, Shixia Liu. Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora. KDD (2010).

Non-parametric, Temporal

[BibTeX]

@inproceedings{zhang2010evolutionary,
 author={Jianwen Zhang and Yangqiu Song and Changshui Zhang and Shixia Liu},
 title={Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora},
 booktitle={KDD},
 year={2010},
 url={http://research.microsoft.com/en-us/um/people/shliu/p1079-zhang.pdf},
}

Bing Zhao, Eric P. Xing. BiTAM: Bilingual Topic AdMixture Models for Word Alignment. ACL (2006).

Cross-language

[BibTeX]

@inproceedings{zhao2006bitam,
 author={Bing Zhao and Eric P. Xing},
 title={BiTAM: Bilingual Topic AdMixture Models for Word Alignment},
 booktitle={ACL},
 year={2006},
 url={http://www.aclweb.org/anthology/P/P06/P06-2124},
}

Bing Zhao, Eric P. Xing. HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation. NIPS (2007).

Cross-language

[BibTeX]

@inproceedings{zhao2006hmbitam,
 author={Bing Zhao and Eric P. Xing},
 title={HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation},
 booktitle={NIPS},
 year={2007},
 url={http://books.nips.cc/papers/files/nips20/NIPS2007_0188.pdf},
}

Jun Zhu, Amr Ahmed, Eric P. Xing. MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification. ICML (2009).

[BibTeX]

@inproceedings{zhu2009medlda,
 author={Jun Zhu and Amr Ahmed and Eric P. Xing},
 title={MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification},
 booktitle={ICML},
 year={2009},
}

Jun Zhu, Eric P. Xing. Conditional Topic Random Fields. ICML (2010).

[BibTeX]

@inproceedings{zhu2010conditional,
 author={Jun Zhu and Eric P. Xing},
 title={Conditional Topic Random Fields},
 booktitle={ICML},
 year={2010},
}

Xiaojin Zhu, David M. Blei, John Lafferty. TagLDA: Bringing document structure knowledge into topic models. (2006).

[BibTeX][Abstract]

@techreport{zhu2006taglda,
 author={Xiaojin Zhu and David M. Blei and John Lafferty},
 title={TagLDA: Bringing document structure knowledge into topic models},
 year={2006},
 institution={University of Wisconsin, Madison},
 number={TR-1553},
}

Latent Dirichlet Allocation models a document by a mixture of topics, where each topic itself is typically modeled by a unigram word distribution. Documents however often have known structures, and the same topic can exhibit different word distributions under different parts of the structure. We extend latent Dirichlet allocation model by replacing the unigram word distributions with a factored representation conditioned on both the topic and the structure. In the resultant model each topic is equivalent to a set of unigrams, reflecting the structure a word is in. The proposed model is more flexible in modeling the corpus. The factored representation prevents combinatorial explosion and leads to efficient parameterization. We derive the variational optimization algorithm for the new model. The model shows improved perplexity on text and image data, but no significant accuracy improvement when used for classification.