nips nips2013 nips2013-301 nips2013-301-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lei Shi
Abstract: The sparse additive model for text modeling involves log-sum-exp computations whose cost becomes prohibitive at large scale. Moreover, its assumption of an equal background across all classes/topics may be too strong. This paper proposes an extension, the sparse additive model with low rank background (SAM-LRB), together with a simple yet efficient estimation procedure. In particular, employing a double majorization bound, we lower-bound the log-likelihood by a quadratic function free of log-sum-exp terms. The low rank and sparsity constraints are then naturally expressed by nuclear-norm and ℓ1-norm regularizers. Interestingly, we find that the resulting optimization problem for SAM-LRB takes the same form as Robust PCA. Consequently, the parameters of supervised SAM-LRB can be learned efficiently with an existing accelerated proximal gradient algorithm for Robust PCA. Beyond the supervised case, we extend SAM-LRB to unsupervised and multifaceted scenarios. Experiments on three real datasets demonstrate the effectiveness and efficiency of SAM-LRB compared with several state-of-the-art models.
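The abstract's key computational claim is that, once the double majorization bound replaces the log-sum-exp terms with a quadratic, SAM-LRB estimation reduces to a Robust PCA problem: a low-rank background plus sparse class-specific deviations, solvable by accelerated proximal gradient [2, 20, 28]. The following is a minimal Python sketch of the two proximal operators such a solver alternates between; the function names, the plain (non-accelerated) step, and the toy objective are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def svt(M, tau):
    # Singular value thresholding: proximal operator of tau * nuclear norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft_threshold(M, tau):
    # Entrywise soft thresholding: proximal operator of tau * l1 norm.
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def prox_step(D, B, S, step, lam_nuc, lam_l1):
    # One proximal-gradient step for
    #   min_{B,S} 0.5*||D - B - S||_F^2 + lam_nuc*||B||_* + lam_l1*||S||_1,
    # with B the low-rank background and S the sparse deviations.
    # An accelerated (FISTA-style [2]) solver adds momentum between such steps.
    R = B + S - D                  # gradient of the quadratic term w.r.t. B and S
    B_new = svt(B - step * R, step * lam_nuc)
    S_new = soft_threshold(S - step * R, step * lam_l1)
    return B_new, S_new

# Toy usage: recover a rank-1 background plus sparse spikes.
rng = np.random.default_rng(0)
D = np.outer(rng.normal(size=50), rng.normal(size=40))
D[rng.random(D.shape) < 0.05] += 5.0
B, S = np.zeros_like(D), np.zeros_like(D)
for _ in range(200):
    B, S = prox_step(D, B, S, step=0.5, lam_nuc=1.0, lam_l1=0.1)

The step size 0.5 is safe here because the joint gradient of the quadratic term is 2-Lipschitz; an accelerated variant applies the same two operators at an extrapolated point.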
[1] A. Ahmed and E. P. Xing. Staying informed: supervised and semi-supervised multi-view topical analysis of ideological perspective. In Proc. EMNLP, pages 1140–1150, 2010.
[2] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
[3] D. Blei and J. McAuliffe. Supervised topic models. In Advances in NIPS, pages 121–128, 2008.
[4] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.
[5] D. Böhning. Multinomial logistic regression algorithm. Annals of the Institute of Statistical Mathematics, 44(1):197–200, 1992.
[6] G. Bouchard. Efficient bounds for the softmax function, applications to inference in hybrid models. In Workshop for Approximate Bayesian Inference in Continuous/Hybrid Systems at NIPS’07, 2007.
[7] X. Chen, Q. Lin, S. Kim, J. G. Carbonell, and E. P. Xing. Smoothing proximal gradient method for general structured sparse regression. The Annals of Applied Statistics, 6(2):719–752, 2012.
[8] X. Ding, L. He, and L. Carin. Bayesian robust principal component analysis. IEEE Trans. Image Processing, 20(12):3419–3430, 2011.
[9] J. Eckstein. Augmented Lagrangian and alternating direction methods for convex optimization: A tutorial and some illustrative computational results. RUTCOR Research Report RRR 32-2012, 2012.
[10] J. Eisenstein, A. Ahmed, and E. P. Xing. Sparse additive generative models of text. In Proc. ICML, 2011.
[11] J. Eisenstein and E. P. Xing. The CMU 2008 political blog corpus. Technical report, Carnegie Mellon University, School of Computer Science, Machine Learning Department, 2010.
[12] M. A. T. Figueiredo. Adaptive sparseness using Jeffreys prior. In Advances in NIPS, pages 679–704, 2002.
[13] M. R. Gormley, M. Dredze, B. Van Durme, and J. Eisner. Shared components topic models. In Proc. NAACL-HLT, pages 783–792, 2012.
[14] L. Hong, A. Ahmed, S. Gurumurthy, A. J. Smola, and K. Tsioutsiouliklis. Discovering geographical topics in the Twitter stream. In Proc. 21st WWW, pages 769–778, 2012.
[15] T. Jaakkola and M. I. Jordan. A variational approach to Bayesian logistic regression models and their extensions. In Proc. AISTATS, 1996.
[16] M. Jaggi and M. Sulovský. A simple algorithm for nuclear norm regularized problems. In Proc. ICML, pages 471–478, 2010.
[17] Y. Jiang and A. Saxena. Discovering different types of topics: Factored topic models. In Proc. IJCAI, 2013.
[18] A. Joulin, F. Bach, and J. Ponce. Efficient optimization for discriminative latent class models. In Advances in NIPS, pages 1045–1053, 2010.
[19] J. D. Lafferty and D. M. Blei. Correlated topic models. In Advances in NIPS, pages 147–155, 2006.
[20] Z. Lin, A. Ganesh, J. Wright, L. Wu, M. Chen, and Y. Ma. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. UIUC Technical Report UILU-ENG-09-2214, August 2009.
[21] Q. Mei, X. Ling, M. Wondra, H. Su, and C. X. Zhai. Topic sentiment mixture: modeling facets and opinions in weblogs. In Proc. WWW, 2007.
[22] T. P. Minka. Estimating a Dirichlet distribution. Technical report, Massachusetts Institute of Technology, 2003.
[23] M. Paul and R. Girju. A two-dimensional topic-aspect model for discovering multi-faceted topics. In Proc. AAAI, 2010.
[24] E. Richard, P.-A. Savalle, and N. Vayatis. Estimation of simultaneously sparse and low rank matrices. In Proc. ICML, pages 1351–1358, 2012.
[25] Y. Sim, N. A. Smith, and D. A. Smith. Discovering factions in the computational linguistics community. In ACL Workshop on Rediscovering 50 Years of Discoveries, 2012.
[26] C. Wang and D. Blei. Decoupling sparsity and smoothness in the discrete hierarchical Dirichlet process. In Advances in NIPS, pages 1982–1989, 2009.
[27] C. Wang and D. M. Blei. Variational inference in nonconjugate models. To appear in JMLR.
[28] J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma. Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Advances in NIPS, pages 2080–2088, 2009.
[29] J. Yang and X. Yuan. Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization. Math. Comp., 82:301–329, 2013.
[30] J. Zhu, A. Ahmed, and E. P. Xing. MedLDA: maximum margin supervised topic models. JMLR, 13:2237–2278, 2012.