nips nips2013 nips2013-343 knowledge-graph by maker-knowledge-mining

343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars


Source: pdf

Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. [sent-2, score-0.551]

2 We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. [sent-3, score-0.582]

3 In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches. [sent-5, score-1.013]

4 1 Introduction: Stochastic grammars are traditionally used to represent natural language syntax and semantics, but they have also been extended to model other types of data like images [1, 2, 3] and events [4, 5, 6, 7]. [sent-6, score-0.603]

5 It has been shown that stochastic grammars are powerful models of patterns that combine compositionality (i. [sent-7, score-0.554]

6 Stochastic grammars can be used to parse data samples into their compositional structures, which help solve tasks like classification, annotation and segmentation in a unified way. [sent-12, score-0.495]

7 We study stochastic grammars in the form of stochastic And-Or grammars [1], which are an extension of stochastic grammars in natural language processing [8, 9] and are closely related to sum-product networks [10]. [sent-13, score-1.537]

8 Stochastic And-Or grammars have been used to model spatial structures of objects and scenes [1, 3] as well as temporal structures of actions and events [7]. [sent-14, score-0.53]

9 Manual specification of a stochastic grammar is typically very difficult and therefore machine learning approaches are often employed to automatically induce unknown stochastic grammars from data. [sent-15, score-1.289]

10 In this paper we study unsupervised learning of stochastic And-Or grammars in which the training data are unannotated (e. [sent-16, score-0.601]

11 The learning of a stochastic grammar involves two parts: learning the grammar rules (i. [sent-19, score-1.573]

12 One strategy in unsupervised learning of stochastic grammars is to manually specify a fixed grammar structure (in most cases, the full set of valid grammar rules) and try to optimize the parameters of the grammar. [sent-24, score-2.029]

13 , [11, 12]) as well as some approaches of learning image grammars [10, 13] adopt this strategy. [sent-27, score-0.498]

14 The main problem of this strategy is that in some scenarios the full set of valid grammar rules is too large for practical learning and inference, while manual specification of a compact grammar structure is challenging. [sent-28, score-1.548]

15 For example, in an image grammar the number of possible grammar rules to decompose an image patch is exponential in the size of the patch; previous approaches restrict the valid ways of decomposing an image patch (e. [sent-29, score-1.69]

16 Our approach extends the previous work on structure learning of natural language grammars [14, 15, 16], while improving upon the recent work on structure learning of And-Or grammars of images [17] and events [18]. [sent-33, score-1.034]

17 Starting from a trivial initial grammar, our approach iteratively inserts new fragments into the grammar to optimize its posterior probability. [sent-34, score-0.951]

18 • We present a formalization of stochastic And-Or grammars that is agnostic to the types of atomic patterns and their compositions. [sent-37, score-0.741]

19 • We learn compositions and reconfigurations modeled in the grammar in a unified manner that is more efficient and robust to data scarcity and ambiguity than previous approaches. [sent-43, score-0.862]

20 • We empirically evaluated our approach in learning event grammars and image grammars and it achieved comparable or better performance than previous approaches. [sent-44, score-1.013]

21 2 Stochastic And-Or Grammars: Stochastic And-Or grammars were first proposed to model images [1] and later adapted to model events [7]. [sent-45, score-0.541]

22 Here we provide a unified definition of stochastic And-Or grammars that is agnostic to the type of the data being modeled. [sent-46, score-0.523]

23 We restrict ourselves to the context-free subclass of stochastic And-Or grammars, which can be seen as an extension of stochastic context-free grammars in formal language theory [8] as well as an extension of decomposable sum-product networks [10]. [sent-47, score-0.594]

24 A stochastic context-free And-Or grammar is defined as a 5-tuple ⟨Σ, N, S, R, P⟩. [sent-48, score-0.788]

25 The set of grammar rules R is divided into two disjoint sets: And-rules and Or-rules. [sent-50, score-0.785]

26–27 It takes the form of A → a1 a2 … an, where A ∈ N AND is a nonterminal And-node and a1 a2 … an is a set of terminal or nonterminal nodes representing the sub-patterns. [sent-55, score-0.173] [sent-58, score-0.338]

28 A set of relations are specified between the sub-patterns and between the nonterminal node A and the sub-patterns, which configure how these sub-patterns form the composite pattern represented by A. [sent-59, score-0.302]

29 Note that one can specify different types of relations in different And-rules, which allows multiple types of compositions to be modeled in the same grammar. [sent-61, score-0.207]

30 It takes the form of O → a, where O ∈ N OR is a nonterminal Or-node, and a is either a terminal or a nonterminal node representing a possible configuration. [sent-63, score-0.537]

31 A stochastic And-Or grammar defines generative processes of valid entities, i. [sent-69, score-0.788]
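
To make the 5-tuple definition and the generative process concrete, here is a minimal Python sketch (not the authors' code; the class and method names and the sequence-based representation are illustrative, and the relations attached to And-rules are omitted for brevity):

```python
import random

class Grammar:
    """Toy stochastic context-free And-Or grammar <Sigma, N, S, R, P>.
    And-rules map an And-node to a fixed list of children; Or-rules map an
    Or-node to alternative children with probabilities."""

    def __init__(self, terminals, and_rules, or_rules, start):
        self.terminals = set(terminals)   # Sigma: atomic patterns
        self.and_rules = and_rules        # {And-node: [child, ...]}
        self.or_rules = or_rules          # {Or-node: [(child, prob), ...]}
        self.start = start                # start symbol S

    def sample(self, node=None):
        """Top-down generation: rewrite nodes until only terminals remain."""
        node = self.start if node is None else node
        if node in self.terminals:
            return [node]
        if node in self.and_rules:        # And-node: compose all sub-patterns
            return [t for child in self.and_rules[node] for t in self.sample(child)]
        children, probs = zip(*self.or_rules[node])   # Or-node: pick one alternative
        return self.sample(random.choices(children, weights=probs, k=1)[0])

# A toy grammar: S -> A; A -> O1 O2; O1 -> a | b; O2 -> c | d
g = Grammar(terminals={"a", "b", "c", "d"},
            and_rules={"A": ["O1", "O2"]},
            or_rules={"S": [("A", 1.0)],
                      "O1": [("a", 0.7), ("b", 0.3)],
                      "O2": [("c", 0.5), ("d", 0.5)]},
            start="S")
print(g.sample())   # e.g. ['a', 'c']
```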

32 (b) Iteration 1: learning a grammar fragment rooted at N1. [sent-81, score-0.746]

33 (c) Iteration 2: learning a grammar fragment rooted at N2. [sent-82, score-0.938]

34 Unsupervised Structure Learning, Problem Definition: In unsupervised learning of stochastic And-Or grammars, we aim to learn a grammar from a set of unannotated i.i.d. training samples. [sent-84, score-0.842]

35 By adopting a sparsity prior that penalizes the size of the grammar, we hope to learn a compact grammar with good generalizability. [sent-91, score-0.746]

36 Viterbi likelihood has been empirically shown to lead to better grammar learning results [20, 10] and can be interpreted as combining the standard likelihood with an unambiguity bias [21]. [sent-93, score-0.829]

37 Algorithm Framework: We first define an initial grammar that generates the exact set of training samples. [sent-95, score-0.792]

38 Specifically, for each training sample xi ∈ X, there is an Or-rule S → Ai in the initial grammar where S is the start symbol and Ai is an And-node, and the probability of the rule is 1/|X| where |X| is the number of training samples; for each xi there is also an And-rule Ai → ai1 ai2 … [sent-96, score-0.855]

39 n) are the terminal nodes representing the set of atomic patterns contained in sample xi , and a set of relations are specified between these terminal nodes such that they compose sample xi . [sent-102, score-0.551]

40 This initial grammar leads to the maximal likelihood on the training data but has a very small prior probability because of its large size. [sent-104, score-0.817]
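
A sketch of the initial-grammar construction described above, reusing the illustrative Grammar class from the earlier snippet; the relations between terminal nodes are again ignored, and build_initial_grammar is a made-up name:

```python
def build_initial_grammar(samples):
    """Trivial initial grammar: an Or-rule S -> A_i with probability 1/|X| and an
    And-rule A_i -> a_i1 ... a_in reproducing each training sample x_i exactly."""
    terminals = {a for x in samples for a in x}
    or_rules = {"S": [(f"A{i}", 1.0 / len(samples)) for i in range(len(samples))]}
    and_rules = {f"A{i}": list(x) for i, x in enumerate(samples)}
    return Grammar(terminals, and_rules, or_rules, start="S")

# Example: three event sequences made of atomic patterns
init = build_initial_grammar([["a1", "a2", "a3"], ["a1", "a4"], ["a2", "a3", "a4"]])
```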

41 … nonterminal nodes until the entity contains only terminal nodes (atomic patterns). [sent-105, score-0.409]

42 Table 1 gives a few examples of stochastic context-free And-Or grammars that model different types of data. [sent-106, score-0.529]

43 Starting from the initial grammar, we introduce new intermediate nonterminal nodes between the terminal nodes and the top-level nonterminal nodes in an iterative bottom-up fashion to generalize the grammar and increase its posterior probability. [sent-107, score-1.397]

44 There are typically multiple candidate grammar fragments that can be added at each iteration, and we employ greedy search or beam search to explore the search space and maximize the posterior probability of the grammar. [sent-110, score-1.072]

45 We also restrict the types of grammar fragments that can be added in order to reduce the number of candidate grammar fragments, which will be discussed in the next subsection. [sent-111, score-1.683]

46 The algorithm terminates when no more grammar fragment can be found that increases the posterior probability of the grammar. [sent-112, score-0.961]
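
The overall search loop can then be sketched as follows; propose_fragments, posterior_gain, and apply_fragment are placeholders for procedures the excerpts describe only at a high level (candidate And-Or fragment generation, posterior-gain scoring, fragment insertion), and the paper's beam-search variant would keep several grammars rather than one:

```python
def learn_grammar(samples, propose_fragments, posterior_gain, apply_fragment):
    """Greedy bottom-up structure learning: start from the trivial initial grammar
    and repeatedly insert the And-Or fragment with the highest posterior gain."""
    grammar = build_initial_grammar(samples)
    while True:
        candidates = propose_fragments(grammar, samples)   # restricted fragment types
        if not candidates:
            return grammar
        best = max(candidates, key=lambda f: posterior_gain(grammar, f, samples))
        if posterior_gain(grammar, best, samples) <= 0:    # no fragment raises the posterior
            return grammar
        grammar = apply_fragment(grammar, best)            # insert fragment, update rules
```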

47 And-Or Fragments: In each iteration of our learning algorithm framework, we search for a new grammar fragment and add it to the grammar. [sent-114, score-0.949]

48 There are many different types of grammar fragments, the choice of which greatly influences the efficiency and accuracy of the learning algorithm. [sent-115, score-0.774]

49 The two simplest types of grammar fragments are And-fragments and Or-fragments. [sent-116, score-0.937]

50 While these two types of fragments are simple and intuitive, they both have important disadvantages if they are searched for separately in the learning algorithm. [sent-128, score-0.191]

51 Instead of And-fragments and Or-fragments, we propose to search for And-Or fragments in the learning algorithm. [sent-131, score-0.193]

52 An And-Or fragment contains a new And-node A, a set of new Or-nodes O1, O2, …, On … [sent-132, score-0.173]

53 Such an And-Or fragment can generate ∏_{i=1}^{n} m_i configurations of existing nodes. [sent-145, score-0.189]
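
As a quick check of that count: an And-node with n = 2 child Or-nodes offering m1 = 3 and m2 = 2 alternatives covers 3 × 2 = 6 configurations; the snippet below (illustrative values only) simply enumerates them.

```python
from itertools import product
from math import prod

or_alternatives = [["a", "b", "c"], ["x", "y"]]    # m1 = 3, m2 = 2
configurations = list(product(*or_alternatives))   # every (a_1j1, a_2j2) combination
assert len(configurations) == prod(len(alts) for alts in or_alternatives)   # 3 * 2 = 6
print(configurations)
```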

54 To perform greedy search or beam search, in each iteration of our learning algorithm we need to find the And-Or fragments that lead to the highest gain in the posterior probability of the grammar. [sent-149, score-0.305]

55 Computing the posterior gain by re-parsing the training samples can be very time-consuming if the training set or the grammar is large. [sent-150, score-0.937]

56 Fortunately, we show that by assuming grammar unambiguity the posterior gain of adding an And-Or fragment can be formulated based on a set of sufficient statistics of the training data and is efficient to compute. [sent-151, score-1.079]

57 Since the posterior probability is proportional to the product of the likelihood and the prior probability, the posterior gain is equal to the product of the likelihood gain and the prior gain, which we formulate separately below. [sent-152, score-0.212]
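
In symbols (a sketch that only restates Bayes' rule; G is the current grammar, G' the grammar with the fragment inserted, and X the training data):

```latex
\frac{P(G' \mid X)}{P(G \mid X)}
  \;=\;
  \underbrace{\frac{P(G')}{P(G)}}_{\text{prior gain}}
  \times
  \underbrace{\frac{P(X \mid G')}{P(X \mid G)}}_{\text{likelihood gain}}
```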

58 (b) The n-gram tensor of the And-Or fragment based on the training data (here n = 3). [sent-155, score-0.219]

59 (c) The context matrix of the And-Or fragment based on the training data. [sent-156, score-0.234]

60 … n) is an existing terminal or nonterminal node … [sent-164, score-0.364]

61 First, since the grammar now generates the And-node A first, which then generates a1j1 a2j2 … anjn … [sent-167, score-0.746]

62 Putting these two types of changes to the likelihood together, we can formulate the likelihood gain of adding the And-Or fragment as follows (see the supplementary material for the full derivation). [sent-177, score-0.29]

63 The first is the coherence of the n-gram tensor of the And-Or fragment (which tabulates the number of times each configuration covered by the And-Or fragment appears in the training samples, as illustrated in Figure 2(b)). [sent-180, score-0.427]

64 These two factors provide a surrogate measure of how much the training data support the context-freeness within the And-Or fragment and the context-freeness of the And-Or fragment against its context respectively. [sent-182, score-0.407]
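
A rough sketch of how these two statistics could be tabulated for a candidate fragment over sequence data, assuming the n child Or-nodes cover contiguous windows and taking the context to be just the neighboring symbols; the paper's actual bookkeeping (over Viterbi parses and richer contexts) is more involved, and the function names are made up:

```python
from collections import Counter
from itertools import product

def ngram_tensor(samples, or_alternatives):
    """Count how often each configuration covered by the fragment appears
    as a contiguous window in the training samples (cf. Figure 2(b))."""
    n = len(or_alternatives)
    covered = set(product(*or_alternatives))
    counts = Counter()
    for x in samples:
        for i in range(len(x) - n + 1):
            window = tuple(x[i:i + n])
            if window in covered:
                counts[window] += 1
    return counts

def context_matrix(samples, or_alternatives):
    """Count the contexts (here: preceding and following symbol) in which
    each covered configuration appears (cf. Figure 2(c))."""
    n = len(or_alternatives)
    covered = set(product(*or_alternatives))
    counts = Counter()
    for x in samples:
        for i in range(len(x) - n + 1):
            window = tuple(x[i:i + n])
            if window in covered:
                context = (x[i - 1] if i > 0 else None,
                           x[i + n] if i + n < len(x) else None)
                counts[window, context] += 1
    return counts
```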

65 The prior probability of the grammar is determined by the grammar size. [sent-186, score-1.492]
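
One common way to write such a size-based sparsity prior (an assumption here; the exact form and the definition of the size term are in the paper's supplementary material) is exponential in the grammar size, so that the prior gain depends only on the change in size:

```latex
P(G) \;\propto\; e^{-\alpha \, \|G\|}
\qquad\Longrightarrow\qquad
\frac{P(G')}{P(G)} \;=\; e^{-\alpha \,(\|G'\| - \|G\|)}
```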

66 It can be seen that the prior gain penalizes And-Or fragments that have a large size but only cover a small number of configurations in the training data. [sent-195, score-0.248]

67 See the supplementary material for the pseudocode of the complete algorithm of grammar learning. [sent-199, score-0.746]

68 The algorithm runs reasonably fast: our prototype implementation can finish running within a few minutes on a desktop with 5000 training samples each containing more than 10 atomic patterns. [sent-200, score-0.195]

69 Experiments, Learning Event Grammars: We applied our approach to learn event grammars from human activity data. [sent-202, score-0.515]

70 The atomic actions and their start/end time are annotated in each video, as shown in Figure 4. [sent-206, score-0.185]

71 We employed three different methods to apply our grammar learning approach on these two datasets. [sent-209, score-0.746]

72 For each frame of a video in the dataset, we construct a binary vector that indicates which of the atomic actions are observed in this frame. [sent-211, score-0.198]

73 Our learning approach is applied on the ID sequences, where each terminal node represents an ID and each And-node specifies the temporal “followed-by” relation between its child nodes. [sent-227, score-0.226]
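
A sketch of the first method's preprocessing as described here, with the added assumption that each distinct binary vector is mapped to an integer ID and that consecutive identical frames are collapsed (the excerpts do not spell out the exact ID scheme); the action names are illustrative:

```python
def frames_to_id_sequence(frame_annotations, atomic_actions):
    """frame_annotations: one set of observed atomic actions per video frame.
    Returns the ID sequence and the mapping from binary vectors to IDs."""
    vec_to_id, ids = {}, []
    for frame in frame_annotations:
        vec = tuple(int(a in frame) for a in atomic_actions)   # binary indicator vector
        if vec not in vec_to_id:
            vec_to_id[vec] = len(vec_to_id)                    # assign a fresh ID
        if not ids or ids[-1] != vec_to_id[vec]:               # collapse repeated frames (assumption)
            ids.append(vec_to_id[vec])
    return ids, vec_to_id

actions = ["stand", "bend", "throw-trash"]                     # illustrative atomic actions
frames = [{"stand"}, {"stand"}, {"bend"}, {"bend", "throw-trash"}]
print(frames_to_id_sequence(frames, actions))                  # ([0, 1, 2], {...})
```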

74 Each terminal node now represents an occurrence of an atomic action. [sent-229, score-0.341]

75 In this way, the resulting And-Or grammar is directly grounded to the observed atomic actions and is therefore more flexible and expressive than the grammar learned from IDs as in the first method. [sent-231, score-1.693]

76 We slightly modified our algorithm to accommodate this issue: right before the algorithm terminates, we change the top-level And-nodes in the grammar to Or-nodes, which removes any temporal relation between the learned events in each training sample and renders them independent of each other. [sent-235, score-0.896]

77 We used 55 samples of each dataset as the training set and evaluated the learned grammars on the remaining 6 samples. [sent-237, score-0.58]

78 On each testing sample, the events identified by the learned grammars were compared against manual annotations. [sent-238, score-0.545]

79 The third method leads to the best overall performance, which implies the advantage of grounding the grammar to atomic actions and simultaneously learning different relations. [sent-244, score-0.928]

80 Learning Image Grammars: We first tested our approach in learning image grammars from a synthetic dataset of animal face sketches [24]. [sent-247, score-0.521]

81 … nodes to represent the atomic sketches in the images and set the relations in And-rules to represent relative positions between image patches. [sent-256, score-0.32]

82 We evaluated the learned grammars against the true grammar. [sent-259, score-0.493]

83 We estimated the precision and recall of the sets of images generated from the learned grammars versus the true grammar, from which we computed the F-measure. [sent-260, score-0.54]

84 We also estimated the KL-divergence of the probability distributions defined by the learned grammars from that of the true grammar. [sent-261, score-0.493]
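
A sketch of the set-based part of this evaluation: sample image sets from the learned and the true grammar (as hashable canonical encodings), then combine precision and recall into an F-measure; estimating the KL-divergence would additionally require the probabilities both grammars assign to sampled images.

```python
def f_measure(learned_set, true_set):
    """Precision/recall of the images generated by the learned grammar against
    those generated by the true grammar, combined into F1."""
    overlap = len(learned_set & true_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(learned_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)
```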

85 We compared our approach with the image grammar learning approach proposed in [17]. [sent-262, score-0.785]

86 We followed the method described in [17] to quantize the images and learn the atomic patterns, which become the terminal nodes of the grammar. [sent-267, score-0.343]

87 Figure 8 shows some images from the dataset, the quantization examples and the atomic patterns learned. [sent-268, score-0.214]

88 Since the true grammar is unknown, we evaluated the learned grammars by measuring their perplexity (the reciprocal of the geometric mean probability of a sample from a testing set). [sent-270, score-1.273]

89 We ran 10-fold cross-validation on the dataset: learning an image grammar from each training set and then evaluating its perplexity on the testing set. [sent-271, score-0.865]

90 Before estimating the perplexity, the probability distribution represented by each learned grammar was smoothed to avoid zero probability on the testing images. [sent-272, score-0.78]
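
Perplexity as defined above can be computed in log space for numerical stability; a minimal sketch, where grammar_prob stands for the smoothed probability the learned grammar assigns to a test sample:

```python
import math

def perplexity(test_samples, grammar_prob):
    """Reciprocal of the geometric mean probability of the test samples."""
    log_probs = [math.log(grammar_prob(x)) for x in test_samples]
    return math.exp(-sum(log_probs) / len(log_probs))
```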

91 5 Conclusion: We have presented a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and have proposed an unsupervised approach to learning the structures as well as the parameters of such grammars. [sent-275, score-0.582]

92 Our approach optimizes the posterior probability of the grammar and induces compositions and reconfigurations in a unified manner. [sent-276, score-0.867]

93 Our experiments in learning event grammars and image grammars show satisfactory performance of our approach. [sent-277, score-1.013]

94 Aggarwal, “Recognition of composite human activities through context-free grammar based representation,” in CVPR, 2006. [sent-310, score-0.762]

95 Huang, “An extended grammar system for learning and recognizing complex visual events,” IEEE Trans. [sent-314, score-0.746]

96 Klein, “Probabilistic grammars and hierarchical dirichlet processes,” The handbook of applied Bayesian analysis, 2009. [sent-337, score-0.459]

97 Baker, “Trainable grammars for speech recognition,” in Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, 1979. [sent-343, score-0.459]

98 Omohundro, “Inducing probabilistic grammars by Bayesian model merging,” in ICGI, 1994, pp. [sent-358, score-0.459]

99 Honavar, “Unsupervised learning of probabilistic context-free grammar using iterative biclustering,” in Proceedings of 9th International Colloquium on Grammatical Inference (ICGI 2008), ser. [sent-374, score-0.746]

100 Zhu, “Unsupervised learning of event and-or grammar and semantics from video,” in ICCV, 2011. [sent-384, score-0.802]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('grammar', 0.746), ('grammars', 0.459), ('nonterminal', 0.173), ('fragment', 0.173), ('fragments', 0.163), ('atomic', 0.131), ('terminal', 0.116), ('compositions', 0.079), ('node', 0.075), ('recon', 0.058), ('rdi', 0.057), ('event', 0.056), ('relations', 0.054), ('aij', 0.051), ('nodes', 0.049), ('viterbi', 0.047), ('images', 0.047), ('gurations', 0.047), ('training', 0.046), ('guration', 0.045), ('stochastic', 0.042), ('posterior', 0.042), ('image', 0.039), ('gain', 0.039), ('rules', 0.039), ('cm', 0.039), ('zhu', 0.038), ('gt', 0.038), ('actions', 0.036), ('unsupervised', 0.036), ('patterns', 0.036), ('events', 0.035), ('relation', 0.035), ('learned', 0.034), ('language', 0.034), ('perplexity', 0.034), ('anjn', 0.033), ('trash', 0.033), ('unambiguity', 0.033), ('uni', 0.032), ('beam', 0.031), ('durations', 0.031), ('purity', 0.031), ('video', 0.031), ('parsing', 0.031), ('search', 0.03), ('types', 0.028), ('oi', 0.027), ('stand', 0.026), ('likelihood', 0.025), ('con', 0.024), ('dataset', 0.023), ('formalization', 0.023), ('entity', 0.022), ('id', 0.022), ('rd', 0.022), ('aimi', 0.022), ('andor', 0.022), ('bend', 0.022), ('fmeasure', 0.022), ('gurability', 0.022), ('icgi', 0.022), ('ids', 0.022), ('pavlovskaia', 0.022), ('pei', 0.022), ('throw', 0.022), ('agnostic', 0.022), ('patch', 0.021), ('action', 0.021), ('manning', 0.02), ('tu', 0.02), ('occurrence', 0.019), ('coherence', 0.019), ('scarcity', 0.019), ('biclustering', 0.019), ('honavar', 0.019), ('rooted', 0.019), ('reductions', 0.018), ('samples', 0.018), ('sz', 0.018), ('annotated', 0.018), ('subtypes', 0.018), ('unannotated', 0.018), ('parse', 0.018), ('modeled', 0.018), ('tensors', 0.018), ('manual', 0.017), ('symbol', 0.017), ('compositionality', 0.017), ('decomposable', 0.017), ('mi', 0.016), ('covered', 0.016), ('activities', 0.016), ('scarce', 0.016), ('standing', 0.016), ('context', 0.015), ('third', 0.015), ('klein', 0.015), ('quantized', 0.015), ('cf', 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars

Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches. 1

2 0.046248991 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies

Author: Yangqing Jia, Joshua T. Abbott, Joseph Austerweil, Thomas Griffiths, Trevor Darrell

Abstract: Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children. We compare the performance of our system to several baseline algorithms, and show a significant advantage results from combining visual classifiers with the ability to identify an appropriate level of abstraction using Bayesian generalization. 1

3 0.044490226 263 nips-2013-Reasoning With Neural Tensor Networks for Knowledge Base Completion

Author: Richard Socher, Danqi Chen, Christopher D. Manning, Andrew Ng

Abstract: Knowledge bases are an important resource for question answering and other tasks but often suffer from incompleteness and lack of ability to reason over their discrete entities and relationships. In this paper we introduce an expressive neural tensor network suitable for reasoning over relationships between two entities. Previous work represented entities as either discrete atomic units or with a single entity vector representation. We show that performance can be improved when entities are represented as an average of their constituting word vectors. This allows sharing of statistical strength between, for instance, facts involving the “Sumatran tiger” and “Bengal tiger.” Lastly, we demonstrate that all models improve when these word vectors are initialized with vectors learned from unsupervised large corpora. We assess the model by considering the problem of predicting additional true relations between entities given a subset of the knowledge base. Our model outperforms previous models and can classify unseen relationships in WordNet and FreeBase with an accuracy of 86.2% and 90.0%, respectively. 1

4 0.042949263 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs

Author: Vikash Mansinghka, Tejas D. Kulkarni, Yura N. Perov, Josh Tenenbaum

Abstract: The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs (GPGP) consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer’s output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood. Representations and algorithms from computer graphics are used as the deterministic backbone for highly approximate and stochastic generative models. This formulation combines probabilistic programming, computer graphics, and approximate Bayesian computation, and depends only on generalpurpose, automatic inference techniques. We describe two applications: reading sequences of degraded and adversarially obscured characters, and inferring 3D road models from vehicle-mounted camera images. Each of the probabilistic graphics programs we present relies on under 20 lines of probabilistic code, and yields accurate, approximately Bayesian inferences about real-world images. 1

5 0.040740248 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking

Author: Carl Doersch, Abhinav Gupta, Alexei A. Efros

Abstract: Recent work on mid-level visual representations aims to capture information at the level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches [5, 6, 12, 23] have been proposed to discover mid-level visual elements, that are both 1) representative, i.e., frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the the well-known and well-studied mean-shift algorithm [2, 1, 4, 8]. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset of [5]. We also evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset. 1

6 0.040381469 206 nips-2013-Near-Optimal Entrywise Sampling for Data Matrices

7 0.040191438 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning

8 0.036362134 196 nips-2013-Modeling Overlapping Communities with Node Popularities

9 0.035877865 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer

10 0.035843153 226 nips-2013-One-shot learning by inverting a compositional causal process

11 0.032349426 166 nips-2013-Learning invariant representations and applications to face verification

12 0.031803433 84 nips-2013-Deep Neural Networks for Object Detection

13 0.031386416 288 nips-2013-Scalable Influence Estimation in Continuous-Time Diffusion Networks

14 0.031166732 66 nips-2013-Computing the Stationary Distribution Locally

15 0.03082817 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors

16 0.030410556 50 nips-2013-Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search

17 0.030322812 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding

18 0.030120576 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach

19 0.030076437 81 nips-2013-DeViSE: A Deep Visual-Semantic Embedding Model

20 0.029977493 108 nips-2013-Error-Minimizing Estimates and Universal Entry-Wise Error Bounds for Low-Rank Matrix Completion


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.088), (1, 0.019), (2, -0.04), (3, -0.023), (4, 0.041), (5, -0.021), (6, 0.019), (7, -0.014), (8, 0.01), (9, 0.009), (10, -0.012), (11, -0.01), (12, 0.054), (13, -0.007), (14, -0.029), (15, -0.003), (16, 0.01), (17, -0.021), (18, -0.016), (19, 0.005), (20, 0.021), (21, -0.015), (22, -0.055), (23, -0.022), (24, -0.003), (25, 0.072), (26, 0.061), (27, -0.031), (28, 0.001), (29, -0.003), (30, -0.003), (31, -0.022), (32, -0.007), (33, 0.014), (34, -0.006), (35, 0.004), (36, -0.052), (37, 0.049), (38, -0.024), (39, -0.008), (40, 0.011), (41, 0.007), (42, 0.027), (43, -0.04), (44, 0.001), (45, -0.027), (46, 0.045), (47, 0.04), (48, 0.054), (49, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.86943823 343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars

Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches. 1

2 0.57262224 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs

Author: Vikash Mansinghka, Tejas D. Kulkarni, Yura N. Perov, Josh Tenenbaum

Abstract: The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs (GPGP) consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer’s output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood. Representations and algorithms from computer graphics are used as the deterministic backbone for highly approximate and stochastic generative models. This formulation combines probabilistic programming, computer graphics, and approximate Bayesian computation, and depends only on generalpurpose, automatic inference techniques. We describe two applications: reading sequences of degraded and adversarially obscured characters, and inferring 3D road models from vehicle-mounted camera images. Each of the probabilistic graphics programs we present relies on under 20 lines of probabilistic code, and yields accurate, approximately Bayesian inferences about real-world images. 1

3 0.56565577 138 nips-2013-Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation

Author: Vibhav Vineet, Carsten Rother, Philip Torr

Abstract: Many methods have been proposed to solve the problems of recovering intrinsic scene properties such as shape, reflectance and illumination from a single image, and object class segmentation separately. While these two problems are mutually informative, in the past not many papers have addressed this topic. In this work we explore such joint estimation of intrinsic scene properties recovered from an image, together with the estimation of the objects and attributes present in the scene. In this way, our unified framework is able to capture the correlations between intrinsic properties (reflectance, shape, illumination), objects (table, tv-monitor), and materials (wooden, plastic) in a given scene. For example, our model is able to enforce the condition that if a set of pixels take same object label, e.g. table, most likely those pixels would receive similar reflectance values. We cast the problem in an energy minimization framework and demonstrate the qualitative and quantitative improvement in the overall accuracy on the NYU and Pascal datasets. 1

4 0.55903018 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning

Author: Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, Greg Zelinsky

Abstract: Visual clutter, the perception of an image as being crowded and disordered, affects aspects of our lives ranging from object detection to aesthetics, yet relatively little effort has been made to model this important and ubiquitous percept. Our approach models clutter as the number of proto-objects segmented from an image, with proto-objects defined as groupings of superpixels that are similar in intensity, color, and gradient orientation features. We introduce a novel parametric method of clustering superpixels by modeling mixture of Weibulls on Earth Mover’s Distance statistics, then taking the normalized number of proto-objects following partitioning as our estimate of clutter perception. We validated this model using a new 90-image dataset of real world scenes rank ordered by human raters for clutter, and showed that our method not only predicted clutter extremely well (Spearman’s ρ = 0.8038, p < 0.001), but also outperformed all existing clutter perception models and even a behavioral object segmentation ground truth. We conclude that the number of proto-objects in an image affects clutter perception more than the number of objects or features. 1

5 0.54905617 84 nips-2013-Deep Neural Networks for Object Detection

Author: Christian Szegedy, Alexander Toshev, Dumitru Erhan

Abstract: Deep Neural Networks (DNNs) have recently shown outstanding performance on image classification tasks [14]. In this paper we go one step further and address the problem of object detection using DNNs, that is not only classifying but also precisely localizing objects of various classes. We present a simple and yet powerful formulation of object detection as a regression problem to object bounding box masks. We define a multi-scale inference procedure which is able to produce high-resolution object detections at a low cost by a few network applications. State-of-the-art performance of the approach is shown on Pascal VOC. 1

6 0.54521102 226 nips-2013-One-shot learning by inverting a compositional causal process

7 0.51929438 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies

8 0.51669556 119 nips-2013-Fast Template Evaluation with Vector Quantization

9 0.51001447 163 nips-2013-Learning a Deep Compact Image Representation for Visual Tracking

10 0.49563226 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer

11 0.48939052 166 nips-2013-Learning invariant representations and applications to face verification

12 0.48409826 104 nips-2013-Efficient Online Inference for Bayesian Nonparametric Relational Models

13 0.47629929 213 nips-2013-Nonparametric Multi-group Membership Model for Dynamic Networks

14 0.47159758 335 nips-2013-Transfer Learning in a Transductive Setting

15 0.46845973 196 nips-2013-Modeling Overlapping Communities with Node Popularities

16 0.46048555 288 nips-2013-Scalable Influence Estimation in Continuous-Time Diffusion Networks

17 0.45152971 21 nips-2013-Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths

18 0.45052892 110 nips-2013-Estimating the Unseen: Improved Estimators for Entropy and other Properties

19 0.44885087 189 nips-2013-Message Passing Inference with Chemical Reaction Networks

20 0.44826353 329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(16, 0.021), (22, 0.327), (33, 0.115), (34, 0.092), (41, 0.029), (49, 0.031), (56, 0.074), (70, 0.045), (85, 0.043), (89, 0.021), (93, 0.043), (95, 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93105888 71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Author: Viliam Lisy, Vojta Kovarik, Marc Lanctot, Branislav Bosansky

Abstract: unkown-abstract

same-paper 2 0.7286731 343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars

Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches. 1

3 0.5751279 236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables

Author: Zhuo Wang, Alan Stocker, Daniel Lee

Abstract: In many neural systems, information about stimulus variables is often represented in a distributed manner by means of a population code. It is generally assumed that the responses of the neural population are tuned to the stimulus statistics, and most prior work has investigated the optimal tuning characteristics of one or a small number of stimulus variables. In this work, we investigate the optimal tuning for diffeomorphic representations of high-dimensional stimuli. We analytically derive the solution that minimizes the L2 reconstruction loss. We compared our solution with other well-known criteria such as maximal mutual information. Our solution suggests that the optimal weights do not necessarily decorrelate the inputs, and the optimal nonlinearity differs from the conventional equalization solution. Results illustrating these optimal representations are shown for some input distributions that may be relevant for understanding the coding of perceptual pathways. 1

4 0.4850111 22 nips-2013-Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization

Author: Nataliya Shapovalova, Michalis Raptis, Leonid Sigal, Greg Mori

Abstract: We propose a weakly-supervised structured learning approach for recognition and spatio-temporal localization of actions in video. As part of the proposed approach, we develop a generalization of the Max-Path search algorithm which allows us to efficiently search over a structured space of multiple spatio-temporal paths while also incorporating context information into the model. Instead of using spatial annotations in the form of bounding boxes to guide the latent model during training, we utilize human gaze data in the form of a weak supervisory signal. This is achieved by incorporating eye gaze, along with the classification, into the structured loss within the latent SVM learning framework. Experiments on a challenging benchmark dataset, UCF-Sports, show that our model is more accurate, in terms of classification, and achieves state-of-the-art results in localization. In addition, our model can produce top-down saliency maps conditioned on the classification label and localized latent paths. 1

5 0.48363337 5 nips-2013-A Deep Architecture for Matching Short Texts

Author: Zhengdong Lu, Hang Li

Abstract: Many machine learning problems can be interpreted as learning for matching two types of objects (e.g., images and captions, users and products, queries and documents, etc.). The matching level of two objects is usually measured as the inner product in a certain feature space, while the modeling effort focuses on mapping of objects from the original space to the feature space. This schema, although proven successful on a range of matching tasks, is insufficient for capturing the rich structure in the matching process of more complicated objects. In this paper, we propose a new deep architecture to more effectively model the complicated matching relations between two objects from heterogeneous domains. More specifically, we apply this model to matching tasks in natural language, e.g., finding sensible responses for a tweet, or relevant answers to a given question. This new architecture naturally combines the localness and hierarchy intrinsic to the natural language problems, and therefore greatly improves upon the state-of-the-art models. 1

6 0.48306164 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding

7 0.48208296 173 nips-2013-Least Informative Dimensions

8 0.48179653 201 nips-2013-Multi-Task Bayesian Optimization

9 0.48150608 64 nips-2013-Compete to Compute

10 0.48148242 285 nips-2013-Robust Transfer Principal Component Analysis with Rank Constraints

11 0.48074329 45 nips-2013-BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables

12 0.48071843 49 nips-2013-Bayesian Inference and Online Experimental Design for Mapping Neural Microcircuits

13 0.48038903 251 nips-2013-Predicting Parameters in Deep Learning

14 0.47943145 286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles

15 0.47924209 294 nips-2013-Similarity Component Analysis

16 0.47917727 333 nips-2013-Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

17 0.47884345 350 nips-2013-Wavelets on Graphs via Deep Learning

18 0.47877958 121 nips-2013-Firing rate predictions in optimal balanced networks

19 0.4786129 99 nips-2013-Dropout Training as Adaptive Regularization

20 0.47843409 275 nips-2013-Reservoir Boosting : Between Online and Offline Ensemble Learning