MedLDA

MedLDA: maximum margin supervised topic models. A supervised topic model can use side information such as ratings or labels associated with documents or images to discover more predictive low-dimensional topical representations of the data. However, existing supervised topic models predominantly employ likelihood-driven objective functions for learning and inference, leaving the popular and potentially powerful max-margin principle unexploited for seeking predictive representations of data and more discriminative topic bases for the corpus. In this paper, we propose the maximum entropy discrimination latent Dirichlet allocation (MedLDA) model, which integrates the mechanism behind max-margin prediction models (e.g., SVMs) with the mechanism behind hierarchical Bayesian topic models (e.g., LDA) under a unified constrained optimization framework, and yields latent topical representations that are more discriminative and more suitable for prediction tasks such as document classification or regression. The principle underlying the MedLDA formalism is quite general and can be applied for joint max-margin and maximum likelihood learning of directed or undirected topic models when supervised side information is available. Efficient variational methods for posterior inference and parameter estimation are derived, and extensive empirical studies on several real data sets are also provided. Our experimental results demonstrate qualitatively and quantitatively that MedLDA can: 1) discover sparse and highly discriminative topical representations; 2) achieve state-of-the-art prediction performance; and 3) be more efficient than existing supervised topic models, especially for classification.
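The "unified constrained optimization framework" mentioned above can be sketched schematically. The following is an illustrative formulation for the classification case, with symbols chosen here for exposition (they need not match the paper's exact notation): a variational bound on the LDA likelihood is minimized jointly with SVM-style slack penalties, subject to max-margin constraints on expected discriminant scores.

```latex
\min_{q,\;\boldsymbol{\xi}\ge 0}\quad
  \mathcal{L}(q) \;+\; C \sum_{d=1}^{D} \xi_d
\qquad\text{s.t.}\quad
  \mathbb{E}_q\!\left[\eta^{\top}\Delta f_d(y)\right]
  \;\ge\; \Delta\ell_d(y) - \xi_d,
  \quad \forall d,\; \forall y \ne y_d,
```

where \(\mathcal{L}(q)\) denotes a variational upper bound on the negative log-likelihood of the underlying topic model, \(\Delta f_d(y)\) the difference in (expected) feature vectors between the true label \(y_d\) and an alternative label \(y\), \(\Delta\ell_d(y)\) a label-dependent loss, and \(C\) a regularization constant trading off likelihood fit against margin violations. Solving this coupled problem is what pushes the learned topic representations toward discriminability, rather than training a classifier post hoc on unsupervised topics.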


References in zbMATH (referenced in 13 articles)


  1. Wang, Feifei; Zhang, Junni L.; Li, Yichao; Deng, Ke; Liu, Jun S.: Bayesian text classification and summarization via a class-specified topic model (2021)
  2. Wang, Wei; Guo, Bing; Shen, Yan; Yang, Han; Chen, Yaosen; Suo, Xinhua: Robust supervised topic models under label noise (2021)
  3. He, Jia; Du, Changying; Zhuang, Fuzhen; Yin, Xin; He, Qing; Long, Guoping: Online Bayesian max-margin subspace learning for multi-view classification and regression (2020)
  4. Magnusson, Måns; Jonsson, Leif; Villani, Mattias: DOLDA: a regularized supervised topic model for high-dimensional multi-class regression (2020)
  5. Xuan, Junyu; Lu, Jie; Zhang, Guangquan: Cooperative hierarchical Dirichlet processes: superposition vs. maximization (2019)
  6. Kim, Dongwoo; Oh, Alice: Hierarchical Dirichlet scaling process (2017)
  7. Liu, Chenghao; Jin, Tao; Hoi, Steven C. H.; Zhao, Peilin; Sun, Jianling: Collaborative topic regression for online recommender systems: an online and Bayesian approach (2017)
  8. Zhang, Xuefeng; Chen, Bo; Liu, Hongwei; Zuo, Lei; Feng, Bo: Infinite max-margin factor analysis via data augmentation (2016)
  9. Zhou, Mingyuan; Cong, Yulai; Chen, Bo: Augmentable gamma belief networks (2016)
  10. Koyejo, Oluwasanmi; Lee, Cheng; Ghosh, Joydeep: A constrained matrix-variate Gaussian process for transposable data (2014)
  11. Zhu, Jun; Chen, Ning; Perkins, Hugh; Zhang, Bo: Gibbs max-margin topic models with data augmentation (2014)
  12. Rubin, Timothy N.; Chambers, America; Smyth, Padhraic; Steyvers, Mark: Statistical topic models for multi-label document classification (2012)
  13. Zhu, Jun; Ahmed, Amr; Xing, Eric P.: MedLDA: maximum margin supervised topic models (2012)