nips nips2013 nips2013-343 nips2013-343-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu
Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches.
[1] S.-C. Zhu and D. Mumford, “A stochastic grammar of images,” Found. Trends Comput. Graph. Vis., vol. 2, no. 4, pp. 259–362, 2006.
[2] Y. Jin and S. Geman, “Context and hierarchy in a probabilistic image model,” in CVPR, 2006.
[3] Y. Zhao and S. C. Zhu, “Image parsing with stochastic scene grammar,” in NIPS, 2011.
[4] Y. A. Ivanov and A. F. Bobick, “Recognition of visual activities and interactions by stochastic parsing,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 22, no. 8, pp. 852–872, 2000.
[5] M. S. Ryoo and J. K. Aggarwal, “Recognition of composite human activities through context-free grammar based representation,” in CVPR, 2006.
[6] Z. Zhang, T. Tan, and K. Huang, “An extended grammar system for learning and recognizing complex visual events,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 2, pp. 240–255, Feb. 2011.
[7] M. Pei, Y. Jia, and S.-C. Zhu, “Parsing video events with goal inference and intent prediction,” in ICCV, 2011.
[8] C. D. Manning and H. Schütze, Foundations of statistical natural language processing. Cambridge, MA, USA: MIT Press, 1999.
[9] P. Liang, M. I. Jordan, and D. Klein, “Probabilistic grammars and hierarchical dirichlet processes,” The handbook of applied Bayesian analysis, 2009.
[10] H. Poon and P. Domingos, “Sum-product networks: A new deep architecture,” in Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI), 2011.
[11] J. K. Baker, “Trainable grammars for speech recognition,” in Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, 1979.
[12] D. Klein and C. D. Manning, “Corpus-based induction of syntactic structure: Models of dependency and constituency,” in Proceedings of ACL, 2004.
[13] S. Wang, Y. Wang, and S.-C. Zhu, “Hierarchical space tiling for scene modeling,” in Computer Vision – ACCV 2012. Springer, 2013, pp. 796–810.
[14] A. Stolcke and S. M. Omohundro, “Inducing probabilistic grammars by Bayesian model merging,” in ICGI, 1994, pp. 106–118.
[15] Z. Solan, D. Horn, E. Ruppin, and S. Edelman, “Unsupervised learning of natural languages,” Proc. Natl. Acad. Sci., vol. 102, no. 33, pp. 11629–11634, August 2005.
[16] K. Tu and V. Honavar, “Unsupervised learning of probabilistic context-free grammar using iterative biclustering,” in Proceedings of 9th International Colloquium on Grammatical Inference (ICGI 2008), ser. LNCS 5278, 2008.
[17] Z. Si and S. Zhu, “Learning and-or templates for object modeling and recognition,” IEEE Trans on Pattern Analysis and Machine Intelligence, 2013.
[18] Z. Si, M. Pei, B. Yao, and S.-C. Zhu, “Unsupervised learning of event and-or grammar and semantics from video,” in ICCV, 2011.
[19] J. F. Allen, “Towards a general theory of action and time,” Artificial intelligence, vol. 23, no. 2, pp. 123–154, 1984.
[20] V. I. Spitkovsky, H. Alshawi, D. Jurafsky, and C. D. Manning, “Viterbi training improves unsupervised dependency parsing,” in Proceedings of the Fourteenth Conference on Computational Natural Language Learning, ser. CoNLL ’10, 2010.
[21] K. Tu and V. Honavar, “Unambiguity regularization for unsupervised learning of probabilistic grammars,” in Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL 2012), 2012.
[22] S. C. Madeira and A. L. Oliveira, “Biclustering algorithms for biological data analysis: A survey,” IEEE/ACM Trans. on Comp. Biol. and Bioinformatics, vol. 1, no. 1, pp. 24–45, 2004.
[23] P. Wei, N. Zheng, Y. Zhao, and S.-C. Zhu, “Concurrent action detection with structural prediction,” in Proc. Intl Conference on Computer Vision (ICCV), 2013.
[24] A. Barbu, M. Pavlovskaia, and S. Zhu, “Rates for inductive learning of compositional models,” in AAAI Workshop on Learning Rich Representations from Low-Level Sensors (RepLearning), 2013.