nips nips2012 nips2012-344 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell
Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Timely Object Recognition Sergey Karayev UC Berkeley Tobias Baumgartner RWTH Aachen University Mario Fritz MPI for Informatics Trevor Darrell UC Berkeley Abstract In a large visual multi-class detection framework, the timeliness of results can be crucial. [sent-1, score-0.419]
2 Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. [sent-2, score-0.488]
3 Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. [sent-3, score-0.641]
4 In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. [sent-4, score-0.269]
5 Experiments are conducted on the PASCAL VOC object detection dataset. [sent-7, score-0.446]
6 If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. [sent-8, score-0.306]
7 Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. [sent-10, score-0.447]
8 In large-scale detection systems, such as image search, results need to be obtained quickly per image as the number of items to process is constantly growing. [sent-13, score-0.454]
9 The detection strategy to maximize profit in such an environment has to exploit every inter-object context signal available to it, because there is not enough time to run detection for all classes. [sent-17, score-0.635]
10 What matters in the real world is timeliness, and either not all images can be processed or not all classes can be evaluated in a detection task. [sent-18, score-0.339]
11 Taking the task of object detection, we propose a new timeliness measure of performance vs. time. [sent-21, score-0.299]
12 We present a method that treats different detectors and classifiers as black boxes, and uses reinforcement learning to learn a dynamic policy for selecting actions to achieve the highest performance under this evaluation. [sent-23, score-0.683]
13 Specifically, we run scene context and object class detectors over the whole image sequentially, using the results of detection obtained so far to select the next actions. [sent-24, score-0.881]
14 Evaluating on the PASCAL [sentence cut off by Figure 1 and by interleaved text, apparently from the cited deformable part-model detector paper; that garbled span is omitted here]. [sent-25, score-0.276]
[Figure 1 layout: a timeline running from the setup time Ts to the deadline Td, with the scene-context action agist followed by detector actions adet1, adet2, adet3 (e.g. bicycle and person detectors) for classes C1, C2, C3.]
23 Figure 1: A sample trace of our method. [sent-97, score-0.268]
24 At each time step beginning at t = 0, potential actions are considered according to their predicted value, and the maximizing action is picked. [sent-98, score-0.511]
25 Different actions return different observations: a detector returns a list of detections, while a scene context action simply returns its computed feature. [sent-100, score-0.898]
26 The final evaluation of a detection episode is the area under the AP vs. time curve. [sent-102, score-0.431]
27 The value of an action is the expected result of final evaluation if the action is taken and the policy continues to be followed, which allows actions without an immediate benefit to be scheduled. [sent-104, score-1.11]
28 Each object is labeled with exactly one category label k ∈ {1, . . . , K}. [sent-107, score-0.229]
29 The multi-class, multi-label classification problem asks whether I contains at least one object of class k. [sent-111, score-0.226]
30 The detection problem is to output a list of bounding boxes (sub-images defined by four coordinates), each with a real-valued confidence that it encloses a single instance of an object of class k, for each k. [sent-116, score-0.656]
31 The answer for a single class k is given by an algorithm detect(I, k), which outputs a list of sub-image bounding boxes B and their associated confidences. [sent-117, score-0.255]
32 A common measure of a correct detection is the PASCAL overlap: two bounding boxes are considered to match if they have the same class label and the ratio of their intersection to their union is at least 1/2. [sent-121, score-0.424]
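As an illustration, a minimal Python sketch of this overlap test; the function name and the (x1, y1, x2, y2) box format are our own conventions, not taken from the paper's code.

    def pascal_match(box_a, box_b, label_a, label_b, threshold=0.5):
        # PASCAL overlap criterion: same class label and intersection-over-union >= 1/2.
        if label_a != label_b:
            return False
        # Intersection rectangle (empty if the boxes do not overlap)
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return union > 0 and inter / union >= threshold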
33 To highlight the hierarchical structure of these problems, we note that the confidences for each subimage b ∈ B may be given by classify(b, k), and, more saliently for our setup, a correct answer to the detection problem also answers the classification problem. [sent-122, score-0.311]
34 1 Related Work Object detection The best recent performance has come from detectors that use gradient-based features to represent objects as either a collection of local patches or as object-sized windows [2, 3]. [sent-126, score-0.445]
35 Using context The most common source of context for detection is the scene or other non-detector cues; the most common scene-level feature is the GIST [6] of the image. [sent-131, score-0.53]
36 Inter-object context has also been shown to improve detection [7]. [sent-133, score-0.336]
37 In a standard evaluation setup, inter-object context plays a role only in post-filtering, once all detectors have been run. [sent-134, score-0.263]
38 A critical summary of the main approaches to using context for object and scene recognition is given in [8]. [sent-136, score-0.391]
39 Efficiency through cascades An early success in efficient object detection of a single class uses simple, fast features to build up a cascade of classifiers, which then considers image regions in a sliding window regime [10]. [sent-138, score-0.834]
40 A recent application to the problem of visual detection picks features with maximum value of information in a Hough-voting framework [12]. [sent-142, score-0.341]
41 In contrast, we learn policies that take actions without any immediate reward. [sent-145, score-0.295]
42 3 Multi-class Recognition Policy Our goal is a multi-class recognition policy π that takes an image I and outputs a list of multi-class detection results by running detector and global scene actions sequentially. [sent-146, score-1.233]
43 The policy repeatedly selects an action ai ∈ A, executes it, receiving observations oi , and then selects the next action. [sent-147, score-0.774]
44 The set of actions A can include both classifiers and detectors: anything that would be useful for inferring the contents of the image. [sent-148, score-0.271]
45 Each action ai has an expected cost c(ai ) of execution. [sent-149, score-0.337]
46 We take the empirical approach: every executed action advances t, the time into the episode, by its runtime. [sent-151, score-0.309]
47 As shown in Figure 1, the system is given two times: the setup time Ts and deadline Td . [sent-152, score-0.257]
48 We evaluate policies by this more robust metric and not simply by the final performance at the deadline time, for the same reason that Average Precision is used instead of a fixed Precision vs. Recall operating point. [sent-155, score-0.239]
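A rough sketch of how such a timeliness measure could be computed, assuming a step-shaped AP-vs-time curve whose value changes only when an action finishes; the normalization by the interval length and the function and variable names are our own assumptions, not from the paper.

    import numpy as np

    def timeliness(finish_times, mean_aps, t_s, t_d):
        # AP-vs-time is a step function: 0 before the first action finishes, then
        # mean_aps[i] from finish_times[i] until the next action finishes (or t_d).
        # finish_times is assumed sorted and contained in [t_s, t_d].
        times = np.concatenate(([t_s], finish_times, [t_d]))
        aps = np.concatenate(([0.0], mean_aps))
        return float(np.sum(aps * np.diff(times)) / (t_d - t_s))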
49 1 Sequential Execution An open-loop policy, such as the common classifier cascade [10], takes actions in a sequence that does not depend on observations received from previous actions. [sent-158, score-0.315]
50 Additionally, the state records that an action ai has been taken by adding it to the initially empty set O and recording the resulting observations oi . [sent-165, score-0.472]
51 The state also keeps track of the time into the episode t, and the setup and deadline times Ts , Td . [sent-167, score-0.372]
52 A recognition episode takes an image I and proceeds from the initial state s0 and action a0 to the next pair (s1 , a1 ), and so on until (sJ , aJ ), where J is the last step of the process with t ≤ Td . [sent-168, score-0.573]
53 At that point, the policy is terminated, and a new episode can begin on a new image. [sent-169, score-0.377]
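A hedged sketch of such an episode loop. The action objects (with a name and a run(image, state) method returning (observations, runtime)) and the q_value(state, action) scorer are hypothetical stand-ins for the detectors, the scene-context action, and the learned value function described below.

    def run_episode(image, actions, q_value, t_s, t_d):
        # One recognition episode: starting from an empty observation set, repeatedly
        # pick the highest-value remaining action, execute it, fold its observations
        # into the state, and advance the clock by its runtime until the deadline.
        state = {'observations': {}, 'taken': set(), 't': t_s, 't_s': t_s, 't_d': t_d}
        trace = []
        while state['t'] <= t_d:
            remaining = [a for a in actions if a.name not in state['taken']]
            if not remaining:
                break
            action = max(remaining, key=lambda a: q_value(state, a))
            observations, runtime = action.run(image, state)  # detections, or a GIST feature
            state['taken'].add(action.name)
            state['observations'][action.name] = observations
            state['t'] += runtime
            trace.append((action.name, state['t'], observations))
        return trace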
54 The specific actions we consider in the following exposition are detector actions adeti , where deti is a detector for class Ci , and a scene-level context action agist , which updates the probabilities of all classes. [sent-170, score-1.385]
55 Although we avoid this in the exposition, note that our system easily handles multiple detector actions per class. [sent-171, score-0.447]
56 2 Selecting actions As our goal is to pick actions dynamically, we want a function Q(s, a) : S × A → R, where S is the space of all possible states, to assign a value to a potential action a ∈ A given the current state s of the decision process. [sent-173, score-0.719]
57 We featurize the state-action pair and assume linear structure: Qπ (s, a) = θπ φ(s, a) (2) The policy’s performance at time t is determined by all detections that are part of the set of observations oj at the last state sj before t. [sent-175, score-0.457]
58 Recall that detector actions return lists of detection hypotheses. [sent-176, score-0.679]
59 Time evaluation of an episode is a function eval(h, Ts , Td ) of the history of execution h = s0 , s1 , . . . , sJ . [sent-178, score-0.289]
60 Note from Figure 3b that this evaluation function is additive per action, as each action a generates observations that may raise or lower the mean AP of the results so far (∆ap) and takes a certain time (∆t). [sent-184, score-0.414]
61 We can accordingly represent the final evaluation eval(h, Ts , Td ) in terms of individual action rewards: eval(h, Ts , Td ) = Σ_{j=0}^{J} R(sj , aj ). [sent-185, score-0.331]
62 Specifically, as shown in Figure 3b, we define the reward of an action a as R(sj , a) = ∆ap · (tj^T − ∆t/2) (3), where tj^T is the time left until Td at state sj , and ∆t and ∆ap are the time taken and the AP change produced by the action a. [sent-186, score-0.887]
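A one-line sketch of this reward, with delta_ap, delta_t and time_left standing in for ∆ap, ∆t and tj^T; the reconstruction of eq. (3) above is assumed.

    def action_reward(delta_ap, delta_t, time_left):
        # Eq. (3): the AP change is credited for the time remaining until the deadline
        # (time_left = tj^T), less half the action's own runtime delta_t, since the
        # gain only becomes available while/after the action runs.
        return delta_ap * (time_left - 0.5 * delta_t)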
63 3 Learning the policy The expected value of the final evaluation can be written recursively in terms of the value function: Qπ (sj , a) = Esj+1 [R(sj , a) + γQπ (sj+1 , π(sj+1 ))] (4) where γ ∈ [0, 1] is the discount value. [sent-189, score-0.322]
64 While we can’t directly compute the expectation in (4), we can sample it by running actual episodes to gather ⟨s, a, r, s′⟩ samples, where r is the reward obtained by taking action a in state s, and s′ is the following state. [sent-195, score-0.42]
65 We then learn the optimal policy by repeatedly gathering samples with the current policy, minimizing the error between the discounted reward to the end of the episode as predicted by our current Q(sj , a) and the actual values gathered, and updating the policy with the resulting weights. [sent-196, score-0.707]
66 To ensure sufficient exploration of the state space, we implement ε-greedy action selection during training: with a probability that decreases with each training iteration, a random action is selected instead of following the policy. [sent-197, score-0.591]
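A minimal sketch of such ε-greedy selection; the featurize helper for φ(s, a) is hypothetical (a block featurization like the one sketched below).

    import random
    import numpy as np

    def epsilon_greedy(state, remaining_actions, theta, featurize, epsilon):
        # With probability epsilon pick a random action (exploration); otherwise follow
        # the current policy's argmax over the linear action values (exploitation).
        if random.random() < epsilon:
            return random.choice(remaining_actions)
        return max(remaining_actions, key=lambda a: float(np.dot(theta, featurize(state, a))))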
67 We run 15 iterations of accumulating samples by running 350 episodes, starting with a baseline policy which will be described in section 4, and cross-validating the regularization parameter at each iteration. [sent-201, score-0.267]
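Putting the pieces together, a hedged sketch of this training loop. The rollout helper (one ε-greedy episode returning (state, action, reward) tuples), the featurize helper and its dim attribute are assumptions; the cross-validated ridge regression stands in for the regularized regression with cross-validated regularization mentioned above, and the zero initialization stands in for the paper's baseline starting policy.

    import numpy as np
    from sklearn.linear_model import RidgeCV

    def learn_policy(images, actions, featurize, rollout, gamma=1.0,
                     n_iterations=15, episodes_per_iter=350):
        theta = np.zeros(featurize.dim)  # the paper starts from a baseline policy instead
        for it in range(n_iterations):
            epsilon = 1.0 / (it + 1)  # exploration probability decays over iterations
            feats, targets = [], []
            for idx in np.random.randint(len(images), size=episodes_per_iter):
                trace = rollout(images[idx], actions, theta, epsilon)
                rewards = [r for (_, _, r) in trace]
                for j, (state, action, _) in enumerate(trace):
                    # regression target: discounted reward from step j to the episode's end
                    ret = sum(gamma ** (k - j) * rewards[k] for k in range(j, len(rewards)))
                    feats.append(featurize(state, action))
                    targets.append(ret)
            # refit the linear Q-function with cross-validated ridge regression
            model = RidgeCV(alphas=np.logspace(-3, 3, 7)).fit(np.array(feats), np.array(targets))
            theta = model.coef_
        return theta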
68 4 Feature representation Our policy is at its base determined by a linear function of the features of the state: π(s) = argmax θπ φ(s, ai ). [sent-205, score-0.369]
69 Among the per-action features are H(CK | o) and the prior probability of the class that corresponds to the detector of action a (omitted for the scene-context action). [sent-212, score-0.533]
70 To formulate learning the policy as a single regression problem, we represent the features in block form, where φ(s, a) is a vector of size F |A|, with all values set to 0 except for the F -sized block corresponding to a. [sent-222, score-0.308]
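A small sketch of this block representation; the helper name and the choice of per-action features passed in are illustrative.

    import numpy as np

    def block_featurize(per_action_features, action_index, n_actions):
        # phi(s, a): a vector of size F * |A| that is zero except for the F-sized block
        # belonging to action a, which holds the features computed for that action
        # (e.g. class prior/posterior, entropy, time into the episode).
        F = len(per_action_features)
        phi = np.zeros(F * n_actions)
        phi[action_index * F:(action_index + 1) * F] = per_action_features
        return phi

The policy then scores each available action with θπ · φ(s, a) and takes the argmax, as in the linear form of eq. (2).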
71 Note that in the greedy learning case, this action is learned to never be taken, but it is shown to be useful in the reinforcement learning case. [sent-225, score-0.419]
72 This allows the action-value function to learn correlations between presence of different classes, and so the policy can look for the most probable classes given the observations. [sent-231, score-0.305]
73 As an illustration, we visualize the learned weights on these features in Figure 2, reshaped such that each row shows the weights learned for an action, with the top row representing the scene context action and the next 20 rows corresponding to the PASCAL VOC class detector actions. [sent-243, score-0.805]
74 Evaluation We evaluate our system on the multi-class, multi-label detection task, as previously described. [sent-244, score-0.3]
75 We evaluate on a popular detection challenge task: the PASCAL VOC 2007 dataset [1]. [sent-245, score-0.266]
76 We learn weights on the training and validation sets, and run our policy on all images in the testing set. [sent-247, score-0.339]
77 For the detector actions, we use one-vs-all cascaded deformable part-model detectors on a HOG featurization of the image [21], with linear classification of the list of detections as described in the previous section. [sent-252, score-0.702]
78 There are 20 classes in the PASCAL challenge task, so there are 20 detector actions. [sent-253, score-0.249]
79 Running a detector on a PASCAL image takes about 1 second. [sent-254, score-0.305]
80 In the first one, the start time is immediate and execution is cut off at 20 seconds, which is enough time to run all actions. [sent-256, score-0.257]
81 [Figure 3 residue: (a) the AP vs. time curves of the evaluated policies between Ts and Td; (b) an illustration of the per-action reward ∆ap · (tj^T − ∆t/2) of eq. (3).] [sent-261, score-0.303]
82 We establish the first baseline for our system by selecting actions randomly at each step. [sent-265, score-0.236]
83 As shown in Figure 3a, the Random policy results in a roughly linear gain of AP vs. time. [sent-266, score-0.267]
84 This is expected: the detectors are capable of obtaining a certain level of performance; if half the detectors are run, the expected performance level is half of the maximum level. [sent-268, score-0.276]
85 To establish an upper bound on performance, we plot the Oracle policy, obtained by re-ordering the actions at the end of each detection episode in the order of AP gains they produced. [sent-269, score-0.578]
86 In Figure 3a, we can see that due to the dataset bias, the fixed-order policy performs well at first, as the person class is disproportionately likely to be in the image, but is significantly overtaken by our model as execution goes on and more rare classes have to be detected. [sent-275, score-0.532]
87 Visualizing the learned weights in Figure 2, we note that the GIST action is learned to never be taken in the greedy (γ = 0) setting, but is learned to be taken with a higher value of γ. [sent-282, score-0.38]
88 It is additionally informative to consider the action trajectories of different policies in Figure 4. [sent-283, score-0.367]
89 Figure 4: Visualizing the action trajectories of different policies. [sent-284, score-0.276]
90 We see that the Random policy selects actions and obtains rewards randomly, while the Oracle policy obtains all rewards in the first few actions. [sent-286, score-0.965]
91 The Fixed Order policy selects actions in a static optimal order. [sent-287, score-0.506]
92 Our policy does not stick to a static order but selects actions dynamically to maximize the rewards obtained early on. [sent-288, score-0.558]
93 5 Conclusion We presented a method for learning “closed-loop” policies for multi-class object recognition, given existing object detectors and classifiers and a metric to optimize. [sent-306, score-0.557]
94 The method learns the optimal policy using reinforcement learning, by observing execution traces in training. [sent-307, score-0.506]
95 If detection on an image is cut off after only half the detectors have been run, our method does 66% better than a random ordering, and 14% better than an intelligent baseline. [sent-308, score-0.498]
96 In particular, our method learns to take actions with no immediate reward in order to improve the overall performance of the system. [sent-309, score-0.339]
97 Here, we derive it for the novel detection AP vs. time evaluation. [sent-311, score-0.266]
98 Although computation devoted to scheduling actions is less significant than the computation due to running the actions, the next research direction is to explicitly consider this decision-making cost; the same goes for feature computation costs. [sent-313, score-0.274]
99 Additionally, it is interesting to consider actions defined not just by object category but also by spatial region. [sent-314, score-0.431]
100 Rapid object detection using a boosted cascade of simple features. [sent-348, score-0.509]
wordName wordTfidf (topN-words)
[('action', 0.276), ('policy', 0.267), ('detection', 0.266), ('ap', 0.253), ('detector', 0.211), ('gist', 0.207), ('actions', 0.202), ('object', 0.18), ('deadline', 0.147), ('detections', 0.141), ('detectors', 0.138), ('td', 0.136), ('pascal', 0.126), ('execution', 0.124), ('agist', 0.119), ('timeliness', 0.119), ('sj', 0.117), ('episode', 0.11), ('voc', 0.101), ('lter', 0.1), ('image', 0.094), ('ck', 0.093), ('ci', 0.093), ('scene', 0.087), ('score', 0.078), ('reinforcement', 0.076), ('rl', 0.072), ('boxes', 0.07), ('context', 0.07), ('contents', 0.069), ('greedy', 0.067), ('deformable', 0.066), ('mrf', 0.064), ('reward', 0.063), ('cascade', 0.063), ('ai', 0.061), ('policies', 0.059), ('deformation', 0.058), ('lters', 0.057), ('person', 0.057), ('evaluation', 0.055), ('ers', 0.055), ('ts', 0.055), ('bicycle', 0.055), ('recognition', 0.054), ('star', 0.054), ('fixed', 0.054), ('classi', 0.053), ('list', 0.052), ('rewards', 0.052), ('hog', 0.051), ('window', 0.051), ('tj', 0.05), ('exhaustively', 0.05), ('observations', 0.05), ('category', 0.049), ('root', 0.048), ('cascades', 0.048), ('adeti', 0.048), ('eval', 0.048), ('oi', 0.046), ('class', 0.046), ('parts', 0.046), ('answer', 0.045), ('sliding', 0.045), ('obtains', 0.044), ('curve', 0.043), ('setup', 0.043), ('episodes', 0.042), ('timely', 0.042), ('lsvm', 0.042), ('sideways', 0.042), ('bounding', 0.042), ('part', 0.041), ('features', 0.041), ('traces', 0.039), ('state', 0.039), ('deva', 0.039), ('subwindows', 0.039), ('classes', 0.038), ('selects', 0.037), ('weights', 0.037), ('feature', 0.037), ('er', 0.037), ('oj', 0.036), ('images', 0.035), ('advertising', 0.035), ('scheduling', 0.035), ('varun', 0.035), ('position', 0.034), ('cvpr', 0.034), ('visual', 0.034), ('immediate', 0.034), ('oracle', 0.034), ('system', 0.034), ('time', 0.033), ('triggs', 0.033), ('labeling', 0.033), ('start', 0.033), ('additionally', 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 344 nips-2012-Timely Object Recognition
Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell
Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1
2 0.27038479 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
Author: Du Tran, Junsong Yuan
Abstract: Structured output learning has been successfully applied to object localization, where the mapping between an image and an object bounding box can be well captured. Its extension to action localization in videos, however, is much more challenging, because we need to predict the locations of the action patterns both spatially and temporally, i.e., identifying a sequence of bounding boxes that track the action in video. The problem becomes intractable due to the exponentially large size of the structured video space where actions could occur. We propose a novel structured learning approach for spatio-temporal action localization. The mapping between a video and a spatio-temporal action trajectory is learned. The intractable inference and learning problems are addressed by leveraging an efficient Max-Path search method, thus making it feasible to optimize the model over the whole structured space. Experiments on two challenging benchmark datasets show that our proposed method outperforms the state-of-the-art methods. 1
3 0.25104764 38 nips-2012-Algorithms for Learning Markov Field Policies
Author: Abdeslam Boularias, Jan R. Peters, Oliver B. Kroemer
Abstract: We use a graphical model for representing policies in Markov Decision Processes. This new representation can easily incorporate domain knowledge in the form of a state similarity graph that loosely indicates which states are supposed to have similar optimal actions. A bias is then introduced into the policy search process by sampling policies from a distribution that assigns high probabilities to policies that agree with the provided state similarity graph, i.e. smoother policies. This distribution corresponds to a Markov Random Field. We also present forward and inverse reinforcement learning algorithms for learning such policy distributions. We illustrate the advantage of the proposed approach on two problems: cart-balancing with swing-up, and teaching a robot to grasp unknown objects. 1
4 0.2492345 173 nips-2012-Learned Prioritization for Trading Off Accuracy and Speed
Author: Jiarong Jiang, Adam Teichert, Jason Eisner, Hal Daume
Abstract: Users want inference to be both fast and accurate, but quality often comes at the cost of speed. The field has experimented with approximate inference algorithms that make different speed-accuracy tradeoffs (for particular problems and datasets). We aim to explore this space automatically, focusing here on the case of agenda-based syntactic parsing [12]. Unfortunately, off-the-shelf reinforcement learning techniques fail to learn good policies: the state space is simply too large to explore naively. An attempt to counteract this by applying imitation learning algorithms also fails: the “teacher” follows a far better policy than anything in our learner’s policy space, free of the speed-accuracy tradeoff that arises when oracle information is unavailable, and thus largely insensitive to the known reward functfion. We propose a hybrid reinforcement/apprenticeship learning algorithm that learns to speed up an initial policy, trading off accuracy for speed according to various settings of a speed term in the loss function. 1
5 0.24188614 303 nips-2012-Searching for objects driven by context
Author: Bogdan Alexe, Nicolas Heess, Yee W. Teh, Vittorio Ferrari
Abstract: The dominant visual search paradigm for object class detection is sliding windows. Although simple and effective, it is also wasteful, unnatural and rigidly hardwired. We propose strategies to search for objects which intelligently explore the space of windows by making sequential observations at locations decided based on previous observations. Our strategies adapt to the class being searched and to the content of a particular test image, exploiting context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set. In addition to being more elegant than sliding windows, we demonstrate experimentally on the PASCAL VOC 2010 dataset that our strategies evaluate two orders of magnitude fewer windows while achieving higher object detection performance. 1
6 0.22044422 311 nips-2012-Shifting Weights: Adapting Object Detectors from Image to Video
7 0.20665459 160 nips-2012-Imitation Learning by Coaching
8 0.19839442 162 nips-2012-Inverse Reinforcement Learning through Structured Classification
9 0.19241923 255 nips-2012-On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes
10 0.19183123 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
11 0.18876754 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries
12 0.17466502 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
13 0.16561876 348 nips-2012-Tractable Objectives for Robust Policy Optimization
14 0.16355281 201 nips-2012-Localizing 3D cuboids in single-view images
15 0.16218752 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity
16 0.15929006 40 nips-2012-Analyzing 3D Objects in Cluttered Images
17 0.15777157 88 nips-2012-Cost-Sensitive Exploration in Bayesian Reinforcement Learning
18 0.14741832 122 nips-2012-Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress
19 0.14664666 168 nips-2012-Kernel Latent SVM for Visual Recognition
20 0.14486471 353 nips-2012-Transferring Expectations in Model-based Reinforcement Learning
topicId topicWeight
[(0, 0.29), (1, -0.347), (2, -0.271), (3, -0.064), (4, 0.121), (5, -0.121), (6, 0.009), (7, -0.108), (8, 0.023), (9, -0.032), (10, -0.077), (11, 0.02), (12, 0.138), (13, -0.104), (14, 0.136), (15, 0.14), (16, 0.005), (17, -0.045), (18, -0.063), (19, 0.034), (20, 0.004), (21, -0.014), (22, -0.025), (23, -0.011), (24, -0.038), (25, -0.04), (26, 0.007), (27, 0.066), (28, -0.01), (29, -0.021), (30, 0.017), (31, 0.03), (32, -0.004), (33, 0.042), (34, -0.076), (35, 0.031), (36, 0.007), (37, 0.001), (38, -0.041), (39, -0.005), (40, 0.044), (41, -0.003), (42, 0.014), (43, 0.008), (44, -0.008), (45, -0.074), (46, 0.016), (47, 0.042), (48, 0.0), (49, 0.014)]
simIndex simValue paperId paperTitle
same-paper 1 0.96419525 344 nips-2012-Timely Object Recognition
Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell
Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1
2 0.85720581 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
Author: Du Tran, Junsong Yuan
Abstract: Structured output learning has been successfully applied to object localization, where the mapping between an image and an object bounding box can be well captured. Its extension to action localization in videos, however, is much more challenging, because we need to predict the locations of the action patterns both spatially and temporally, i.e., identifying a sequence of bounding boxes that track the action in video. The problem becomes intractable due to the exponentially large size of the structured video space where actions could occur. We propose a novel structured learning approach for spatio-temporal action localization. The mapping between a video and a spatio-temporal action trajectory is learned. The intractable inference and learning problems are addressed by leveraging an efficient Max-Path search method, thus making it feasible to optimize the model over the whole structured space. Experiments on two challenging benchmark datasets show that our proposed method outperforms the state-of-the-art methods. 1
3 0.75746429 303 nips-2012-Searching for objects driven by context
Author: Bogdan Alexe, Nicolas Heess, Yee W. Teh, Vittorio Ferrari
Abstract: The dominant visual search paradigm for object class detection is sliding windows. Although simple and effective, it is also wasteful, unnatural and rigidly hardwired. We propose strategies to search for objects which intelligently explore the space of windows by making sequential observations at locations decided based on previous observations. Our strategies adapt to the class being searched and to the content of a particular test image, exploiting context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set. In addition to being more elegant than sliding windows, we demonstrate experimentally on the PASCAL VOC 2010 dataset that our strategies evaluate two orders of magnitude fewer windows while achieving higher object detection performance. 1
4 0.72610581 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
Author: Sanja Fidler, Sven Dickinson, Raquel Urtasun
Abstract: This paper addresses the problem of category-level 3D object detection. Given a monocular image, our aim is to localize the objects in 3D by enclosing them with tight oriented 3D bounding boxes. We propose a novel approach that extends the well-acclaimed deformable part-based model [1] to reason in 3D. Our model represents an object class as a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation induced by viewpoint. Our model reasons about face visibility patters called aspects. We train the cuboid model jointly and discriminatively and share weights across all aspects to attain efficiency. Inference then entails sliding and rotating the box in 3D and scoring object hypotheses. While for inference we discretize the search space, the variables are continuous in our model. We demonstrate the effectiveness of our approach in indoor and outdoor scenarios, and show that our approach significantly outperforms the stateof-the-art in both 2D [1] and 3D object detection [2]. 1
5 0.70698893 201 nips-2012-Localizing 3D cuboids in single-view images
Author: Jianxiong Xiao, Bryan Russell, Antonio Torralba
Abstract: In this paper we seek to detect rectangular cuboids and localize their corners in uncalibrated single-view images depicting everyday scenes. In contrast to recent approaches that rely on detecting vanishing points of the scene and grouping line segments to form cuboids, we build a discriminative parts-based detector that models the appearance of the cuboid corners and internal edges while enforcing consistency to a 3D cuboid model. Our model copes with different 3D viewpoints and aspect ratios and is able to detect cuboids across many different object categories. We introduce a database of images with cuboid annotations that spans a variety of indoor and outdoor scenes and show qualitative and quantitative results on our collected database. Our model out-performs baseline detectors that use 2D constraints alone on the task of localizing cuboid corners. 1
6 0.6808024 311 nips-2012-Shifting Weights: Adapting Object Detectors from Image to Video
7 0.67987764 173 nips-2012-Learned Prioritization for Trading Off Accuracy and Speed
8 0.67888016 350 nips-2012-Trajectory-Based Short-Sighted Probabilistic Planning
9 0.65680176 38 nips-2012-Algorithms for Learning Markov Field Policies
10 0.64886087 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition
11 0.63987327 353 nips-2012-Transferring Expectations in Model-based Reinforcement Learning
12 0.63650793 88 nips-2012-Cost-Sensitive Exploration in Bayesian Reinforcement Learning
13 0.63304734 160 nips-2012-Imitation Learning by Coaching
14 0.62728065 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity
15 0.61390108 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
16 0.60244828 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries
17 0.5948087 40 nips-2012-Analyzing 3D Objects in Cluttered Images
18 0.5895201 289 nips-2012-Recognizing Activities by Attribute Dynamics
19 0.5798161 162 nips-2012-Inverse Reinforcement Learning through Structured Classification
20 0.56871271 31 nips-2012-Action-Model Based Multi-agent Plan Recognition
topicId topicWeight
[(0, 0.023), (21, 0.04), (38, 0.108), (42, 0.031), (54, 0.305), (55, 0.013), (74, 0.118), (76, 0.157), (80, 0.081), (92, 0.038)]
simIndex simValue paperId paperTitle
1 0.94692653 331 nips-2012-Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Author: Zahra Zamani, Scott Sanner, Pascal Poupart, Kristian Kersting
Abstract: Point-based value iteration (PBVI) methods have proven extremely effective for finding (approximately) optimal dynamic programming solutions to partiallyobservable Markov decision processes (POMDPs) when a set of initial belief states is known. However, no PBVI work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key insight is that while there may be an infinite number of observations, there are only a finite number of continuous observation partitionings that are relevant for optimal decision-making when a finite, fixed set of reachable belief states is considered. To this end, we make two important contributions: (1) we show how previous exact symbolic dynamic programming solutions for continuous state MDPs can be generalized to continuous state POMDPs with discrete observations, and (2) we show how recently developed symbolic integration methods allow this solution to be extended to PBVI for continuous state and observation POMDPs with potentially correlated, multivariate continuous observation spaces. 1
2 0.89826995 177 nips-2012-Learning Invariant Representations of Molecules for Atomization Energy Prediction
Author: Grégoire Montavon, Katja Hansen, Siamac Fazli, Matthias Rupp, Franziska Biegler, Andreas Ziehe, Alexandre Tkatchenko, Anatole V. Lilienfeld, Klaus-Robert Müller
Abstract: The accurate prediction of molecular energetics in chemical compound space is a crucial ingredient for rational compound design. The inherently graph-like, non-vectorial nature of molecular data gives rise to a unique and difficult machine learning problem. In this paper, we adopt a learning-from-scratch approach where quantum-mechanical molecular energies are predicted directly from the raw molecular geometry. The study suggests a benefit from setting flexible priors and enforcing invariance stochastically rather than structurally. Our results improve the state-of-the-art by a factor of almost three, bringing statistical methods one step closer to chemical accuracy. 1
same-paper 3 0.87960213 344 nips-2012-Timely Object Recognition
Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell
Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1
4 0.86641544 115 nips-2012-Efficient high dimensional maximum entropy modeling via symmetric partition functions
Author: Paul Vernaza, Drew Bagnell
Abstract: Maximum entropy (MaxEnt) modeling is a popular choice for sequence analysis in applications such as natural language processing, where the sequences are embedded in discrete, tractably-sized spaces. We consider the problem of applying MaxEnt to distributions over paths in continuous spaces of high dimensionality— a problem for which inference is generally intractable. Our main contribution is to show that this intractability can be avoided as long as the constrained features possess a certain kind of low dimensional structure. In this case, we show that the associated partition function is symmetric and that this symmetry can be exploited to compute the partition function efficiently in a compressed form. Empirical results are given showing an application of our method to learning models of high-dimensional human motion capture data. 1
5 0.85370636 70 nips-2012-Clustering by Nonnegative Matrix Factorization Using Graph Random Walk
Author: Zhirong Yang, Tele Hao, Onur Dikmen, Xi Chen, Erkki Oja
Abstract: Nonnegative Matrix Factorization (NMF) is a promising relaxation technique for clustering analysis. However, conventional NMF methods that directly approximate the pairwise similarities using the least square error often yield mediocre performance for data in curved manifolds because they can capture only the immediate similarities between data samples. Here we propose a new NMF clustering method which replaces the approximated matrix with its smoothed version using random walk. Our method can thus accommodate farther relationships between data samples. Furthermore, we introduce a novel regularization in the proposed objective function in order to improve over spectral clustering. The new learning objective is optimized by a multiplicative Majorization-Minimization algorithm with a scalable implementation for learning the factorizing matrix. Extensive experimental results on real-world datasets show that our method has strong performance in terms of cluster purity. 1
6 0.84333593 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data
7 0.75500411 173 nips-2012-Learned Prioritization for Trading Off Accuracy and Speed
8 0.74902803 88 nips-2012-Cost-Sensitive Exploration in Bayesian Reinforcement Learning
9 0.72996587 108 nips-2012-Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
10 0.72500074 38 nips-2012-Algorithms for Learning Markov Field Policies
11 0.72460383 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
12 0.72160679 162 nips-2012-Inverse Reinforcement Learning through Structured Classification
13 0.71676475 259 nips-2012-Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
14 0.71441442 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries
15 0.71004367 245 nips-2012-Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions
16 0.70437628 353 nips-2012-Transferring Expectations in Model-based Reinforcement Learning
17 0.70196313 348 nips-2012-Tractable Objectives for Robust Policy Optimization
18 0.69697714 303 nips-2012-Searching for objects driven by context
19 0.69661093 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model
20 0.69545293 160 nips-2012-Imitation Learning by Coaching