73 ICCV 2013: Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification

Source: pdf

Author: Mandar Dixit, Nikhil Rasiwasia, Nuno Vasconcelos

Abstract: An extension of latent Dirichlet allocation (LDA), denoted class-specific-simplex LDA (css-LDA), is proposed for image classification. An analysis of the supervised LDA models currently used for this task shows that the impact of class information on the topics discovered by these models is generally very weak. This implies that the discovered topics are driven by general image regularities, rather than the semantic regularities of interest for classification. To address this, we introduce a model that induces supervision in topic discovery, while retaining the original flexibility of LDA to account for unanticipated structures of interest. The proposed css-LDA is an LDA model with class supervision at the level of image features. In css-LDA, topics are discovered per class, i.e., a single set of topics shared across classes is replaced by multiple class-specific topic sets. This model can be used for generative classification using the Bayes decision rule, or extended to discriminative classification with support vector machines (SVMs). A css-LDA model can endow an image with a vector of class- and topic-specific count statistics that are similar to the bag-of-words (BoW) histogram. SVM-based discriminants can be learned for classes in the space of these histograms. The effectiveness of the css-LDA model in both generative and discriminative classification frameworks is demonstrated through an extensive experimental evaluation, involving multiple benchmark datasets, where it is shown to outperform all existing LDA-based image classification approaches.
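The two uses described in the abstract, generative classification with the Bayes decision rule and a class- and topic-specific count vector, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified stand-in for css-LDA in which each class has its own topic set and a fixed topic mixture (a point estimate in place of Dirichlet-distributed topic proportions). The class names, the parameter values, and the function names `log_likelihood`, `classify`, and `class_topic_histogram` are all hypothetical.

```python
import math

# Hypothetical toy parameters (not from the paper): 2 classes, 2 topics per
# class, and a vocabulary of 4 visual words indexed 0..3.
# topics[y][k][v] = P(visual word v | topic k of class y)
topics = {
    "beach": [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]],
    "forest": [[0.1, 0.1, 0.7, 0.1], [0.1, 0.1, 0.1, 0.7]],
}
# mix[y][k] = P(topic k | class y). Assumption: a fixed point estimate stands
# in for the Dirichlet-distributed topic proportions of a full LDA model.
mix = {"beach": [0.5, 0.5], "forest": [0.5, 0.5]}
# prior[y] = P(class y)
prior = {"beach": 0.5, "forest": 0.5}


def log_likelihood(words, y):
    """Return log P(words | class y) under the class-specific topic mixture."""
    total = 0.0
    for w in words:
        p = sum(pk * topic[w] for pk, topic in zip(mix[y], topics[y]))
        total += math.log(p)
    return total


def classify(words):
    """Bayes decision rule: pick the class maximizing log P(words|y) + log P(y)."""
    return max(prior, key=lambda y: log_likelihood(words, y) + math.log(prior[y]))


def class_topic_histogram(words):
    """Soft word counts per (class, topic) pair, concatenated over classes.

    Each word is split across the topics of a class in proportion to the
    posterior P(topic | word, class), so each class block sums to len(words).
    """
    hist = []
    for y in topics:
        counts = [0.0] * len(topics[y])
        for w in words:
            joint = [pk * topic[w] for pk, topic in zip(mix[y], topics[y])]
            z = sum(joint)
            for k, j in enumerate(joint):
                counts[k] += j / z
        hist.extend(counts)
    return hist


image = [0, 0, 1]  # an image as a list of visual-word indices
label = classify(image)
features = class_topic_histogram(image)
```

In a discriminative setup of the kind the abstract mentions, a vector such as `features` would be the input to an SVM in place of the plain BoW histogram. How the paper actually computes these statistics is not specified here, so the soft-count construction above should be read as one plausible interpretation.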

reference text

[1] D. Blei and J. McAuliffe. Supervised topic models. NIPS, 20:121–128, 2008. 1, 2

[2] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3:993–1022, 2003. 1, 2, 3, 4, 6, 7

[3] A. Bosch, A. Zisserman, and X. Muñoz. Scene classification using a hybrid generative/discriminative approach. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 30(4):712–727, 2008. 1, 7

[4] W. Buntine. Operations for learning with graphical models. Arxiv preprint cs/9412102, 1994. 3

[5] R. Cinbis, J. Verbeek, and C. Schmid. Image categorization using Fisher kernels of non-iid image models. In IEEE CVPR, 2012. 6

[6] G. Csurka, C. Dance, L. Fan, and C. Bray. Visual categorization with bags of keypoints. Workshop on Statistical Learning in Computer Vision, ECCV, 1:1–22, 2004. 1, 2, 6

[7] P. Duygulu, K. Barnard, N. Freitas, and D. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In ECCV, 2002. 7

[8] L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In IEEE CVPR, 2005. 1, 2, 4, 7, 8

[9] T. Hofmann. Probabilistic latent semantic indexing. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 50–57. ACM, 1999. 1

[10] J. Kruskal. Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29(2):115–129, 1964. 7

[11] S. Lacoste-Julien, F. Sha, and M. Jordan. DiscLDA: Discriminative learning for dimensionality reduction and classification. NIPS, 21, 2008. 1

[12] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In IEEE CVPR, 2006. 1, 6, 7

[13] D. Putthividhya, H. Attias, and S. Nagarajan. Supervised topic model for automatic image annotation. In IEEE ICASSP, 2010. 1

[14] P. Quelhas, F. Monay, J. Odobez, D. Gatica-Perez, and T. Tuytelaars. A thousand words in a scene. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 29(9):1575–1589, 2007. 1, 7

[15] D. Ramage, D. Hall, R. Nallapati, and C. Manning. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 248–256. Association for Computational Linguistics, 2009. 1, 3

[16] J. Rennie. Improving Multi-class Text Classification with Naive Bayes. PhD thesis, Massachusetts Institute of Technology, 2001. 1

[17] J. Sivic and A. Zisserman. Video Google: a text retrieval approach to object matching in videos. In IEEE ICCV, 2003. 2

[18] M. Steyvers and T. Griffiths. Probabilistic topic models. Handbook of latent semantic analysis, 427(7):424–440, 2007. 3

[19] C. Wang, D. Blei, and L. Fei-Fei. Simultaneous image classification and annotation. In IEEE CVPR, 2009. 1, 2, 4, 7, 8

[20] Y. Wang, P. Sabzmeydani, and G. Mori. Semi-latent Dirichlet allocation: A hierarchical model for human action recognition. Human Motion Understanding, Modeling, Capture and Animation, pages 240–254, 2007. 1, 3

[21] J. Winn, A. Criminisi, and T. Minka. Object categorization by learned universal visual dictionary. In IEEE ICCV, 2005. 1

[22] J. Zhu, A. Ahmed, and E. Xing. MedLDA: maximum margin supervised topic models for regression and classification. In ICML, 2009. 1, 8