nips nips2000 nips2000-103 nips2000-103-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Milind R. Naphade, Igor Kozintsev, Thomas S. Huang
Abstract: We propose a novel probabilistic framework for semantic video indexing. We define probabilistic multimedia objects (multijects) to map low-level media features to high-level semantic labels. A graphical network of such multijects (multinet) captures scene context by discovering intra-frame as well as inter-frame dependency relations between the concepts. The main contribution is a novel application of a factor graph framework to model this network. We model relations between semantic concepts in terms of their co-occurrence as well as the temporal dependencies between these concepts within video shots. Using the sum-product algorithm [1] for approximate or exact inference in these factor graph multinets, we attempt to correct errors made during isolated concept detection by forcing high-level constraints. This results in a significant improvement in the overall detection performance. 1
[1] F. Kschischang, B. Frey, and H .-A. Loeliger,
[2] D. Zhong and S. F. Chang,
[3] M. Naphade, T. Kristjansson, B. Frey, and T . S. Huang,
[4] S. F. Chang, W. Chen, and H. Sundaram,
[5] M. Naphade, R. Mehrotra, A. M. Ferman, J. Warnick, T. S. Huang, and A. M. Tekalp,
[6] R. Jain, R. Kasturi, and B. Schunck, Machine Vision. MIT Press and McGraw-Hill, 1995.
[7] A. K. Jain and A. Vailaya,
[8] S. Dudani, K. Breeding, and R. McGhee,
[9] M. R. Naphade and T. S. Huang,
[10] M. R. Naphade and T. S. Huang,