
Learning Parts-Based Representations of Data (JMLR, 2006)


Source: pdf

Author: David A. Ross, Richard S. Zemel

Abstract: Many perceptual models and theories hinge on treating objects as a collection of constituent parts. When applying these approaches to data, a fundamental problem arises: how can we determine what are the parts? We attack this problem using learning, proposing a form of generative latent factor model, in which each data dimension is allowed to select a different factor or part as its explanation. This approach permits a range of variations that posit different models for the appearance of a part. Here we provide the details for two such models: a discrete and a continuous one. Further, we show that this latent factor model can be extended hierarchically to account for correlations between the appearances of different parts. This permits modeling of data consisting of multiple categories, and learning these categories simultaneously with the parts when they are unobserved. Experiments demonstrate the ability to learn parts-based representations, and categories, of facial images and user-preference data.

Keywords: parts, unsupervised learning, latent factor models, collaborative filtering, hierarchical learning
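The generative idea in the abstract, where each data dimension selects one factor (part) as its explanation, can be illustrated with a small sampling sketch. This is only a toy under stated assumptions, not the authors' parameterization: the function name `sample_parts_model`, the uniform selection probabilities, the randomly drawn means, and the Gaussian noise are all hypothetical choices made for illustration of the discrete-appearance case.

```python
import random


def sample_parts_model(num_dims, num_factors, num_states, noise_std=0.1, seed=0):
    """Draw one data vector from a toy parts-based latent factor model."""
    rng = random.Random(seed)

    # Assumed parameters (hypothetical): uniform selection probabilities of
    # each dimension over factors, and one mean per (factor, state, dimension).
    select_probs = [[1.0 / num_factors] * num_factors for _ in range(num_dims)]
    means = [[[rng.uniform(0.0, 1.0) for _ in range(num_dims)]
              for _ in range(num_states)]
             for _ in range(num_factors)]

    # Each factor (part) picks one discrete appearance state.
    states = [rng.randrange(num_states) for _ in range(num_factors)]

    # Each data dimension selects exactly one factor as its explanation.
    chosen = [rng.choices(range(num_factors), weights=select_probs[d])[0]
              for d in range(num_dims)]

    # The value of a dimension is generated from the current state of the
    # factor that the dimension selected, plus Gaussian noise.
    x = [rng.gauss(means[chosen[d]][states[chosen[d]]][d], noise_std)
         for d in range(num_dims)]
    return x, chosen, states


if __name__ == "__main__":
    x, chosen, states = sample_parts_model(num_dims=6, num_factors=3, num_states=2)
    print("factor chosen per dimension:", chosen)
    print("state per factor:", states)
```

Dimensions that select the same factor share that factor's state, which is what lets a group of dimensions behave as one part. Learning would fit the selection probabilities and appearance parameters from data rather than drawing them at random as done here.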


reference text

C. Andrieu, N. de Freitas, A. Doucet, and M.I. Jordan. An introduction to MCMC for machine learning. Machine Learning, 50:5–43, 2003.
M.J. Beal and Z. Ghahramani. The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. In Bayesian Statistics 7, 2003.
I. Biederman. Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2):115–147, 1987.
D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet Allocation. In S. Becker, T. Dietterich, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14. MIT Press, Cambridge, MA, 2002.
C. Boutilier, R.S. Zemel, and B. Marlin. Active collaborative filtering. In Proceedings of the 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03), pages 98–106. Morgan Kaufmann Publishers, 2003.
M. Brand. Structure learning in conditional probability models via an entropic prior and parameter extinction. Neural Computation, 11(5):1155–1182, 1999.
Y. Cheng and G.M. Church. Biclustering of Expression Data. In Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB), 2000.
D. Donoho and V. Stodden. When does non-negative matrix factorization give a correct decomposition into parts? In Sebastian Thrun, Lawrence Saul, and Bernhard Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.
L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In Proc. CVPR, 2005.
R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman. Learning object categories from Google’s image search. In Proceedings of the 2005 IEEE International Conference on Computer Vision, 2005.
B.J. Frey, N. Jojic, and A. Kannan. Learning appearance and transparency manifolds of occluded objects in layers. In Proc. CVPR, 2003.
B.J. Frey and N. Jojic. Transformation-invariant clustering using the EM algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(1):1–17, 2003.
Z. Ghahramani. Factorial learning and the EM algorithm. In G. Tesauro, D.S. Touretzky, and T.K. Leen, editors, Advances in Neural Information Processing Systems 7. MIT Press, Cambridge, MA, 1995.
Z. Ghahramani and G.E. Hinton. The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, University of Toronto, 1996.
B. Heisele, T. Poggio, and M. Pontil. Face detection in still gray images. A.I. Memo 1687, Massachusetts Institute of Technology, May 2000.
G. Hinton and R.S. Zemel. Autoencoders, minimum description length, and Helmholtz free energy. In G. Tesauro, J.D. Cowan, and J. Alspector, editors, Advances in Neural Information Processing Systems 6. Morgan Kaufmann Publishers, San Mateo, CA, 1994.
G.E. Hinton, Z. Ghahramani, and Y.W. Teh. Learning to parse images. In S.A. Solla, T.K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12. MIT Press, Cambridge, MA, 2000.
T. Hofmann. Probabilistic latent semantic analysis. In Proc. of Uncertainty in Artificial Intelligence, UAI’99, Stockholm, 1999.
R.A. Jacobs, M.I. Jordan, S.J. Nowlan, and G.E. Hinton. Adaptive mixtures of local experts. Neural Computation, 3(1):79–87, 1991.
N. Jojic and Y. Caspi. Capturing image structure with probabilistic index maps. In Proc. CVPR, 2004.
N. Jojic and B.J. Frey. Learning flexible sprites in video layers. In Proc. CVPR, 2001.
D.D. Lee and H.S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788–791, October 1999.
D.D. Lee and H.S. Seung. Algorithms for non-negative matrix factorization. In T.K. Leen, T. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13. MIT Press, Cambridge, MA, 2001.
S. Li, X. Hou, and H. Zhang. Learning spatially localized, parts-based representation. In Proc. CVPR, 2001.
D.G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
B. Marlin. Collaborative filtering: A machine learning perspective. Master’s thesis, University of Toronto, 2004.
A.M. Martinez and R. Benavente. The AR face database. Technical Report 24, CVC, 1998.
B.W. Mel. Think positive to find parts. Nature, 401:759–760, October 1999.
B. Mirkin. Mathematical Classification and Clustering. Kluwer Academic Publishers, 1996.
MIT-CBCL. CBCL face database #1. MIT Center for Biological and Computational Learning, 2000. http://cbcl.mit.edu/.
A. Mohan, C. Papageorgiou, and T. Poggio. Example-based object detection in images by components. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(4):349–361, 2001.
R.M. Neal and G.E. Hinton. A new view of the EM algorithm that justifies incremental and other algorithms. In M.I. Jordan, editor, Learning and Inference in Graphical Models. Kluwer Academic Publishers, 1998.
D.A. Ross and R.S. Zemel. Multiple cause vector quantization. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15. MIT Press, Cambridge, MA, 2003.
S. Roweis. EM algorithms for PCA and SPCA. In Michael I. Jordan, Michael J. Kearns, and Sara A. Solla, editors, Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, MA, 1997.
J. Sivic, B. Russell, A. Efros, A. Zisserman, and W. Freeman. Discovering object categories in image collections. Technical Report A.I. Memo 2005-005, Massachusetts Institute of Technology, 2005.
E.B. Sudderth, A. Torralba, W.T. Freeman, and A.S. Willsky. Learning hierarchical models of scenes, objects, and parts. In Proceedings of the 2005 IEEE International Conference on Computer Vision, 2005.
Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Sharing clusters among related groups: Hierarchical Dirichlet processes. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems 17. MIT Press, Cambridge, MA, 2005.
M.E. Tipping and C.M. Bishop. Mixtures of probabilistic principal component analysers. Neural Computation, 11(2):443–482, 1999.
M. Weber, W. Einhäuser, M. Welling, and P. Perona. Viewpoint-invariant learning and detection of human heads. In IEEE International Conference on Automatic Face and Gesture Recognition, 2000.
C. Williams and N. Adams. DTs: Dynamic trees. In M.J. Kearns, S.A. Solla, and D.A. Cohn, editors, Advances in Neural Information Processing Systems 11. MIT Press, Cambridge, MA, 1999.
J. Winn and N. Jojic. LOCUS: Learning object classes with unsupervised segmentation. In Proc. IEEE Intl. Conf. on Computer Vision (ICCV), 2005.
R.S. Zemel. A Minimum Description Length Framework for Unsupervised Learning. PhD thesis, Dept. of Computer Science, University of Toronto, Toronto, Canada, 1993.