nips nips2013 nips2013-323 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tuan A. Nguyen, Subbarao Kambhampati, Minh Do
Abstract: Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with Probabilistic-FF planner. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. [sent-7, score-0.93]
2 While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. [sent-8, score-0.678]
3 In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. [sent-9, score-0.6]
4 In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. [sent-10, score-1.415]
5 We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with Probabilistic-FF planner. [sent-11, score-0.635]
6 1 Introduction In the past several years, significant strides have been made in scaling up plan synthesis techniques. [sent-12, score-0.427]
7 The incompleteness in such cases arises because domain writers do not have the full knowledge of the domain physics. [sent-20, score-0.59]
8 For example, although there exist efforts [1, 26] that attempt to either learn models from scratch or revise existing ones, their operation is contingent on the availability of successful plan traces, or access to execution experience. [sent-23, score-0.497]
9 There is thus a critical need for planning technology that can get by with partially specified domain models, and yet generate plans that are “robust” in the sense that they are likely to execute successfully in the real world. [sent-24, score-0.612]
10 This paper addresses the problem of formalizing the notion of plan robustness with respect to an incomplete domain model, and connects the problem of generating a robust plan under such model to conformant probabilistic planning [15, 11, 2, 4]. [sent-25, score-1.74]
11 In our framework, these annotations consist of allowing actions to have possible preconditions and effects (in addition to the standard necessary preconditions and effects). [sent-27, score-0.905]
12 The modeler can express this partial knowledge about the domain by annotating the action with a statement representing the possible precondition that balls should be light. [sent-32, score-0.56]
13 Incomplete domain models with such possible preconditions and effects implicitly define an exponential set of complete domain models, with the semantics that the real domain model is guaranteed to be one of these. [sent-33, score-1.056]
14 The robustness of a plan can now be formalized in terms of the cumulative probability mass of the complete domain models under which it succeeds. [sent-34, score-0.817]
15 We propose an approach that compiles the problem of finding robust plans into the conformant probabilistic planning problem. [sent-35, score-0.635]
16 We then present empirical results showing interesting relation between aspects such as the amount domain incompleteness, solving time and plan quality. [sent-36, score-0.578]
17 • Possible add (delete) effect set Add(a) ⊆ F \ Add(a) (Del(a) ⊆ F \ Del(a)) contains propositions that the action a might add (delete, respectively) after its execution. [sent-44, score-0.7]
18 Possible preconditions and effects whose likelihood of realization is not given are assumed to have weights of 1 . [sent-46, score-0.447]
19 1 Given an incomplete domain model D, we define its completion set D as the set of complete domain models whose actions have all the necessary preconditions, adds and deletes, and a subset of the possible preconditions, possible adds and possible deletes. [sent-48, score-0.684]
20 Since any subset of P re(a), Add(a) and Del(a) can be realized as preconditions and effects of action a, there are exponentially large number of possible complete domain models Di ∈ D = {D1 , D2 , . [sent-49, score-0.977]
21 For each complete model Di , we denote the corresponding sets of realized preconditions and effects for each action a as P rei (a), Addi (a) and Deli (a); equivalently, its complete sets of preconditions and effects are P re(a) ∪ P rei (a), Add(a) ∪ Addi (a) and Del(a) ∪ Deli (a). [sent-53, score-1.31]
22 2 incomplete models, failure of actions should be expected, and the plan needs to be “robustified” against them during synthesis. [sent-57, score-0.662]
23 The GES facilitates this by ensuring that the plan as a whole does not have to fail if an individual action fails (without it, failing actions doom the plan and thus cannot be supplanted). [sent-58, score-1.128]
24 A planning problem with incomplete domain D is P = D, I, G where I ⊆ F is the set of propositions that are true in the initial state (and all the remaining are false), and G is the set of goal propositions. [sent-66, score-0.681]
25 An action sequence π is considered a valid plan for P if π solves the problem in at least one completion of D . [sent-67, score-0.617]
26 Modeling assumptions underlying our formulation: From the modeling point of view, the possible precondition and effect sets can be modeled at either the grounded action or action schema level (and thus applicable to all grounded actions sharing the same action schema). [sent-72, score-0.976]
27 From a practical point of view, however, incompleteness annotations at ground Figure 1: Decription of incomplete action schema level hugely increase the burden on domain pick-up in Gripper domain. [sent-73, score-0.964]
28 Since possible preconditions and effects can be represented as random variables, they can in principle be modeled using graphical models such as Makov Logic Networks and Bayesian Networks [14]. [sent-76, score-0.442]
29 We therefore assume that the possible preconditions and effects are uncorrelated, thus can be realized independently (both within each action schema and across different ones). [sent-78, score-0.819]
30 Those two possible preconditions and effects can be realized independently, resulting in four possible candidate complete domains (assuming all other action schemas in the domain are completely described). [sent-86, score-1.049]
31 3 A Robustness Measure for Plans The robustness of a plan π for the problem P = D, I, G is defined as the cumulative probability mass of the completions of D under which π succeeds (in achieving the goals). [sent-87, score-0.627]
32 The robustness of π is defined as follows: def R(π, P : D, I, G ) ≡ Pr(Di ) (2) Di ∈ D ,γ(π,I,Di )| =G It is easy to see that if R(π, P) > 0, then π is a valid plan for P. [sent-89, score-0.563]
33 3 Example: Figure 2 shows an example with an incomplete domain model D = F, A with F = {p1 , p2 , p3 } and A = {a1 , a2 } and a solution plan π = a1 , a2 for the problem P = D, I = {p2 }, G = {p3 } . [sent-91, score-0.729]
34 Circles with solid and dash boundary respecinstance, p1 and p3 are realized as precondition tively are propositions that are known to be T and and add effect of a1 and a2 , whereas p1 is not might be F when the plan executes (see more in a delete effect of action a2 . [sent-95, score-1.254]
35 The 3 robustness value of the plan is R(π) = 4 if Pr(Di ) is the uniform distribution. [sent-99, score-0.563]
36 Note that under the standard non-generous execution semantics (non-GES) where action failure causes plan failure, the plan π would be mistakenly considered failing to achieve G in the first two complete models, since a2 is prevented from execution. [sent-109, score-1.16]
37 1 A Spectrum of Robust Planning Problems Given this set up, we can now talk about a spectrum of problems related to planning under incomplete domain models: Robustness Assessment (RA): Given a plan π for the problem P, assess the robustness of π. [sent-111, score-1.108]
38 Maximally Robust Plan Generation (RG∗ ): Given a problem P, generate the maximally robust plan π ∗ . [sent-112, score-0.466]
39 Generating Plan with Desired Level of Robustness (RGρ ): Given a problem P and a robustness threshold ρ (0 < ρ ≤ 1), generate a plan π with robustness greater than or equal to ρ. [sent-113, score-0.721]
40 Cost-sensitive Robust Plan Generation (RG∗ ): Given a problem P and a cost bound c, generate a c plan π of maximal robustness subject to cost bound c (where the cost of a plan π is defined as the cumulative costs of the actions in π). [sent-114, score-1.074]
41 Incremental Robustification (RIc ): Given a plan π for the problem P, improve the robustness of π, subject to a cost budget c. [sent-115, score-0.563]
42 The problem of assessing plan robustness with the uniform distribution of candidate complete models is #P -complete. [sent-119, score-0.676]
43 For plan synthesis problems, we can talk about either generating a maximally robust plan, RG∗ , or finding a plan with a robustness value above the given threshold, RGρ . [sent-120, score-1.117]
44 Often, increasing robustness involves using additional (or costlier) actions to support the desired goals, and thus comes at the expense of increased plan cost. [sent-124, score-0.669]
45 We can also talk about cost-constrained robust plan generation problem RG∗ . [sent-125, score-0.495]
46 Finally, in c practice, we are often interested in increasing the robustness of a given plan (either during iterative search, or during mixed-initiative planning). [sent-126, score-0.563]
47 In the next section, we will focus on the problem of synthesizing plans with at least a robustness value ρ. [sent-128, score-0.469]
48 4 Synthesizing Robust Plans Given a planning problem P with an incomplete domain D, the ultimate goal is to synthesize a plan having a desired level of robustness, or one with maximal robustness value. [sent-129, score-1.127]
49 In this section, we will show that the problem of generating plan with at least ρ robustness (0 < ρ ≤ 1), can be compiled into an equivalent conformant probabilistic planning problem. [sent-130, score-0.956]
50 The most robust plan can then be found with a sequence of increasing threshold values. [sent-131, score-0.466]
51 1 Conformant Probabilistic Planning Following the formalism in [4], a domain in conformant probabilistic planning (CPP) is a tuple D ′ = F ′ , A′ , where F ′ and A′ are the sets of propositions and probabilistic actions, respectively. [sent-133, score-0.659]
52 Each action a′ ∈ A′ is specified by a set of preconditions P re(a′ ) ⊆ F ′ and conditional effects E(a′ ). [sent-135, score-0.631]
53 , a′ ) is a solution plan for P ′ if a′ is applicable in the belief state bi n 1 i (assuming b1 ≡ bI ), which results in bi+1 (1 ≤ i ≤ n), and it achieves all goal propositions with at least ρ′ probability. [sent-143, score-0.634]
54 2 Compilation Given an incomplete domain model D = F, A and a planning problem P = D, I, G , we now describe a compilation that translates the problem of synthesizing a solution plan π for P such that R(π, P) ≥ ρ to a CPP problem P ′ . [sent-145, score-1.023]
55 At a high level, the realization of possible preconditions p ∈ P re(a) and effects q ∈ Add(a), r ∈ Del(a) of an action a ∈ A can be understood as add del being determined by the truth values of hidden propositions ppre , qa and ra that are certain a (i. [sent-146, score-1.506]
56 Specifically, the applicability of the action in a state s ⊆ F depends on possible preconditions p that are realized (i. [sent-149, score-0.632]
57 Similarly, the values of q and r are affected by a in the resulting state only if add del they are realized as add and delete effects of the action (i. [sent-152, score-1.241]
58 There are |P re(a)|+|Add(a)|+|Del(a)| totally 2 realizations of the action a, and all of them should be considered simultaneously in checking the applicability of the action and in defining corresponding resulting states. [sent-155, score-0.424]
59 With those observations, we use multiple conditional effects to compile away incomplete knowledge on preconditions and effects of the action a. [sent-156, score-0.932]
60 Each conditional effect corresponds to one realization of the action, and can be fired only if p = T whenever ppre = T, and adding (removing) an effect q (r) a del add into (from) the resulting state depending on the values of qa (ra , respectively) in the realization. [sent-157, score-0.784]
61 5 add del For each action a ∈ A, we introduce new propositions ppre , qa , ra and their negations nppre , a a add del nqa , nra for each p ∈ P re(a), q ∈ Add(a) and r ∈ Del(a) to determine whether they are realized as preconditions and effects of a in the real domain. [sent-161, score-2.19]
62 Each action a′ ∈ A′ is made from one action a ∈ A such that P re(a′ ) = ∅, and E(a′ ) consists of 2|P re(a)|+|Add(a)|+|Del(a)| conditional effects e. [sent-163, score-0.551]
63 It also shows that a plan for P with at least ρ robustness can be obtained directly from solutions of the compiled problem P ′ . [sent-175, score-0.592]
64 3 Experimental Results In this section, we discuss the results of the compilation with Probabilistic-FF (PFF) on variants of the Logistics and Satellite domains, where domain incompleteness is modeled on the preconditions and effects of actions (respectively). [sent-186, score-0.98]
65 Our purpose here is to observe and explain how plan length and synthesizing time vary with the amount of domain incompleteness and the robustness threshold. [sent-187, score-1.044]
66 The source of incompleteness in this domain comes from the assumption that each pair of robots R1j and R2j (1 ≤ j ≤ m) are made by the same manufacturer Mj , both therefore might fail to load a heavy container. [sent-198, score-0.586]
67 5 The actions loading containers onto trucks using robots made 3 These propositions are introduced once, and re-used for all actions sharing the same schema with a. [sent-199, score-0.626]
68 5 The uncorrelated incompleteness assumption applies for possible preconditions of action schemas specified for different manufacturers. [sent-202, score-0.774]
69 , the action schema load-truck-with-robots-of-M1 using robots of manufacturer M1 ), therefore, have a possible precondition requiring that containers should not be heavy. [sent-206, score-0.625]
70 For this domain, a plan to ship a container to another city involves a step of loading it onto the truck, which can be done by a robot (after moving it from the airport to the downtown). [sent-210, score-0.511]
71 Plans can be made more robust by using additional robots of different manufacturer after moving them into the downtown areas, with the cost of increasing plan length. [sent-211, score-0.665]
72 For each specific value of ρ and m, we re- Figure 4: The results of generating robust plans in Satellite port l/t where l is the length of plan domain. [sent-320, score-0.75]
73 Cases in which no plan is found within the time limit are denoted by “–”, and those where it is provable that no plan with the desired robustness exists are denoted by “⊥”. [sent-322, score-0.968]
74 As the results indicate, for a fixed amount of domain incompleteness (represented by m), the solution plans in both domains tend to be longer with higher robustness threshold ρ, and the time to synthesize plans also increases. [sent-323, score-1.154]
75 For instance, in Logistics with m = 5, the plan returned has 48 actions if ρ = 0. [sent-324, score-0.511]
76 However, robots of all five manufacturers are used in a plan when ρ = 0. [sent-330, score-0.525]
77 The relaxation employed by PFF that ignores all but one condition in effects of actions, while enables an upper-bound computation for plan robustness, is probably too strong and causes unnecessary increasing in plan length. [sent-332, score-0.937]
78 Also as we would expect, when the amount of domain incompleteness (i. [sent-333, score-0.417]
79 , m) increases, it takes longer time to synthesize plans satisfying a fixed robustness value ρ. [sent-335, score-0.453]
80 7 seconds to synthesize a 37-length plan when m = 5, whereas it is only 94. [sent-338, score-0.453]
81 7 where no plan is found 7 within the time limit when m = 5, although a plan with robustness of 0. [sent-341, score-0.968]
82 5 Related Work There are currently very few research efforts in automated planning literature that explicitly consider incompletely specified domain models. [sent-344, score-0.411]
83 To our best knowledge, Garland and Lesh [7] were the first discussing incomplete actions and generating robust plans under incomplete domain models. [sent-345, score-0.926]
84 Their notion of plan robustness, however, only has tenuous heuristic connections with the likelihood of successful execution of plans. [sent-346, score-0.471]
85 Weber and Bryce [24] consider a model similar to ours but assume a non-GES formulation during plan synthesis—the plan fails if any action’s preconditions are not satisfied. [sent-347, score-1.102]
86 Indeed, their method for generating robust plans relies on the propagation of “reasons” for failure of each action, assuming that every action before it successfully executes. [sent-349, score-0.557]
87 Morwood and Bryce [16] studied the problem of robustness assessment for the same incompleteness formulation in temporal planning domains, where plan robustness is defined as the number of complete models under which temporal constraints are consistent. [sent-351, score-1.269]
88 The work by Fox et al [6] also explores robustness of plans, but their focus is on temporal plans under unforeseen execution-time variations rather than on incompletely specified domains. [sent-352, score-0.451]
89 action models) and the notion of plans (secure/conformant plans v. [sent-356, score-0.729]
90 Our work can also be categorized as one particular instance of the general model-lite planning problem, as defined in [13], in which the author points out a large class of applications where handling incomplete models is unavoidable due to the difficulty in getting a complete model. [sent-359, score-0.424]
91 [1, 26]) that attempt to either learn models from scratch or revise existing ones, given the access to successful plan traces or execution experience, which can then be used to solve new planning problems. [sent-362, score-0.714]
92 Though not directly addressing formulation like ours, the work on k-fault plans for non-deterministic planning [12] focused on reducing the “faults” in plan execution. [sent-368, score-0.844]
93 The semantics of the possible preconditions/effects in our incomplete domain models fundamentally differs from nondeterministic and stochastic effects (c. [sent-370, score-0.511]
94 6 Conclusion and Future Work In this paper, we motivated the need for synthesizing robust plans under incomplete domain models. [sent-380, score-0.696]
95 We introduced annotations for expressing domain incompleteness, formalized the notion of plan robustness, and showed an approach to compile the problem of generating robust plans into conformant probabilistic planning. [sent-381, score-1.192]
96 We presented empirical results showing interesting relation between aspects such as the amount of domain incompleteness, solving time and plan quality. [sent-382, score-0.578]
97 We are working on a direct approach reasoning on correctness constraints of plan prefixes and partial relaxed plans, constrasting it with our compilation method. [sent-383, score-0.443]
98 We also plan to take successful plan traces as a second type of additional inputs for generating robust plans. [sent-384, score-0.933]
99 Model-lite planning for the web age masses: The challenges of planning with incomplete and evolving domain models. [sent-456, score-0.708]
100 Learning action models from plan examples using weighted max-sat. [sent-522, score-0.64]
wordName wordTfidf (topN-words)
[('plan', 0.405), ('del', 0.366), ('preconditions', 0.292), ('plans', 0.247), ('incompleteness', 0.244), ('action', 0.212), ('planning', 0.192), ('domain', 0.173), ('add', 0.163), ('robustness', 0.158), ('incomplete', 0.151), ('propositions', 0.129), ('effects', 0.127), ('satellite', 0.118), ('di', 0.117), ('actions', 0.106), ('conformant', 0.105), ('precondition', 0.105), ('schema', 0.096), ('realized', 0.092), ('wa', 0.091), ('annotations', 0.088), ('delete', 0.082), ('containers', 0.079), ('ppre', 0.079), ('robots', 0.067), ('downtown', 0.066), ('gripper', 0.066), ('manufacturer', 0.066), ('synthesizing', 0.064), ('ra', 0.064), ('re', 0.061), ('robust', 0.061), ('complete', 0.058), ('instruments', 0.057), ('logistics', 0.055), ('rg', 0.053), ('cons', 0.053), ('manufacturers', 0.053), ('writer', 0.053), ('pre', 0.049), ('synthesize', 0.048), ('fnew', 0.046), ('modeler', 0.046), ('incompletely', 0.046), ('qa', 0.046), ('execution', 0.043), ('loading', 0.043), ('airport', 0.04), ('addi', 0.039), ('bryce', 0.039), ('garland', 0.039), ('nppre', 0.039), ('truck', 0.038), ('compilation', 0.038), ('mj', 0.038), ('semantics', 0.037), ('generating', 0.037), ('domains', 0.037), ('heavy', 0.036), ('state', 0.036), ('completions', 0.035), ('deli', 0.035), ('bi', 0.033), ('effect', 0.033), ('candidate', 0.032), ('imprecise', 0.032), ('assessment', 0.031), ('belief', 0.031), ('probabilistic', 0.03), ('succeeds', 0.029), ('talk', 0.029), ('compiled', 0.029), ('agents', 0.028), ('realization', 0.028), ('cpp', 0.026), ('ges', 0.026), ('gmytrasiewicz', 0.026), ('kambhampati', 0.026), ('kushmerick', 0.026), ('lesh', 0.026), ('morwood', 0.026), ('nasa', 0.026), ('nqa', 0.026), ('nra', 0.026), ('pff', 0.026), ('rei', 0.026), ('revise', 0.026), ('schemas', 0.026), ('traces', 0.025), ('pr', 0.025), ('balls', 0.024), ('container', 0.023), ('eiter', 0.023), ('compile', 0.023), ('robusti', 0.023), ('models', 0.023), ('decision', 0.023), ('notion', 0.023), ('synthesis', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999899 323 nips-2013-Synthesizing Robust Plans under Incomplete Domain Models
Author: Tuan A. Nguyen, Subbarao Kambhampati, Minh Do
Abstract: Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with Probabilistic-FF planner. 1
2 0.14353095 32 nips-2013-Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Author: Gunnar Kedenburg, Raphael Fonteneau, Remi Munos
Abstract: This paper addresses the problem of online planning in Markov decision processes using a randomized simulator, under a budget constraint. We propose a new algorithm which is based on the construction of a forest of planning trees, where each tree corresponds to a random realization of the stochastic environment. The trees are constructed using a “safe” optimistic planning strategy combining the optimistic principle (in order to explore the most promising part of the search space first) with a safety principle (which guarantees a certain amount of uniform exploration). In the decision-making step of the algorithm, the individual trees are aggregated and an immediate action is recommended. We provide a finite-sample analysis and discuss the trade-off between the principles of optimism and safety. We also report numerical results on a benchmark problem. Our algorithm performs as well as state-of-the-art optimistic planning algorithms, and better than a related algorithm which additionally assumes the knowledge of all transition distributions. 1
3 0.10748051 79 nips-2013-DESPOT: Online POMDP Planning with Regularization
Author: Adhiraj Somani, Nan Ye, David Hsu, Wee Sun Lee
Abstract: POMDPs provide a principled framework for planning under uncertainty, but are computationally intractable, due to the “curse of dimensionality” and the “curse of history”. This paper presents an online POMDP algorithm that alleviates these difficulties by focusing the search on a set of randomly sampled scenarios. A Determinized Sparse Partially Observable Tree (DESPOT) compactly captures the execution of all policies on these scenarios. Our Regularized DESPOT (R-DESPOT) algorithm searches the DESPOT for a policy, while optimally balancing the size of the policy and its estimated value obtained under the sampled scenarios. We give an output-sensitive performance bound for all policies derived from a DESPOT, and show that R-DESPOT works well if a small optimal policy exists. We also give an anytime algorithm that approximates R-DESPOT. Experiments show strong results, compared with two of the fastest online POMDP algorithms. Source code along with experimental settings are available at http://bigbird.comp. nus.edu.sg/pmwiki/farm/appl/. 1
4 0.10694778 50 nips-2013-Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search
Author: Aijun Bai, Feng Wu, Xiaoping Chen
Abstract: Monte-Carlo tree search (MCTS) has been drawing great interest in recent years for planning and learning under uncertainty. One of the key challenges is the trade-off between exploration and exploitation. To address this, we present a novel approach for MCTS using Bayesian mixture modeling and inference based Thompson sampling and apply it to the problem of online planning in MDPs. Our algorithm, named Dirichlet-NormalGamma MCTS (DNG-MCTS), models the uncertainty of the accumulated reward for actions in the search tree as a mixture of Normal distributions. We perform inferences on the mixture in Bayesian settings by choosing conjugate priors in the form of combinations of Dirichlet and NormalGamma distributions and select the best action at each decision node using Thompson sampling. Experimental results confirm that our algorithm advances the state-of-the-art UCT approach with better values on several benchmark problems. 1
5 0.10256067 276 nips-2013-Reshaping Visual Datasets for Domain Adaptation
Author: Boqing Gong, Kristen Grauman, Fei Sha
Abstract: In visual recognition problems, the common data distribution mismatches between training and testing make domain adaptation essential. However, image data is difficult to manually divide into the discrete domains required by adaptation algorithms, and the standard practice of equating datasets with domains is a weak proxy for all the real conditions that alter the statistics in complex ways (lighting, pose, background, resolution, etc.) We propose an approach to automatically discover latent domains in image or video datasets. Our formulation imposes two key properties on domains: maximum distinctiveness and maximum learnability. By maximum distinctiveness, we require the underlying distributions of the identified domains to be different from each other to the maximum extent; by maximum learnability, we ensure that a strong discriminative model can be learned from the domain. We devise a nonparametric formulation and efficient optimization procedure that can successfully discover domains among both training and test data. We extensively evaluate our approach on object recognition and human activity recognition tasks. 1
6 0.092629611 106 nips-2013-Eluder Dimension and the Sample Complexity of Optimistic Exploration
7 0.086251698 322 nips-2013-Symbolic Opportunistic Policy Iteration for Factored-Action MDPs
8 0.081486382 22 nips-2013-Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
9 0.081008077 103 nips-2013-Efficient Exploration and Value Function Generalization in Deterministic Systems
10 0.074859381 248 nips-2013-Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
11 0.065601103 347 nips-2013-Variational Planning for Graph-based MDPs
12 0.059090689 16 nips-2013-A message-passing algorithm for multi-agent trajectory planning
13 0.058432944 158 nips-2013-Learning Multiple Models via Regularized Weighting
14 0.057306029 211 nips-2013-Non-Linear Domain Adaptation with Boosting
15 0.056505945 150 nips-2013-Learning Adaptive Value of Information for Structured Prediction
16 0.056291819 125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence
17 0.056192197 250 nips-2013-Policy Shaping: Integrating Human Feedback with Reinforcement Learning
18 0.052265812 273 nips-2013-Reinforcement Learning in Robust Markov Decision Processes
19 0.052162368 208 nips-2013-Neural representation of action sequences: how far can a simple snippet-matching model take us?
20 0.05111682 21 nips-2013-Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths
topicId topicWeight
[(0, 0.116), (1, -0.073), (2, -0.049), (3, 0.013), (4, 0.016), (5, -0.037), (6, 0.028), (7, -0.019), (8, -0.067), (9, 0.043), (10, -0.025), (11, -0.007), (12, 0.091), (13, -0.002), (14, -0.04), (15, 0.112), (16, 0.03), (17, -0.002), (18, 0.059), (19, -0.115), (20, -0.127), (21, 0.09), (22, -0.032), (23, 0.029), (24, 0.002), (25, 0.002), (26, -0.036), (27, -0.097), (28, -0.044), (29, 0.049), (30, 0.029), (31, -0.056), (32, -0.029), (33, 0.071), (34, 0.079), (35, -0.024), (36, -0.076), (37, 0.113), (38, 0.059), (39, 0.051), (40, -0.081), (41, -0.003), (42, -0.018), (43, -0.016), (44, -0.004), (45, -0.052), (46, -0.06), (47, -0.061), (48, -0.112), (49, 0.006)]
simIndex simValue paperId paperTitle
same-paper 1 0.97511244 323 nips-2013-Synthesizing Robust Plans under Incomplete Domain Models
Author: Tuan A. Nguyen, Subbarao Kambhampati, Minh Do
Abstract: Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with Probabilistic-FF planner. 1
2 0.68139511 32 nips-2013-Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Author: Gunnar Kedenburg, Raphael Fonteneau, Remi Munos
Abstract: This paper addresses the problem of online planning in Markov decision processes using a randomized simulator, under a budget constraint. We propose a new algorithm which is based on the construction of a forest of planning trees, where each tree corresponds to a random realization of the stochastic environment. The trees are constructed using a “safe” optimistic planning strategy combining the optimistic principle (in order to explore the most promising part of the search space first) with a safety principle (which guarantees a certain amount of uniform exploration). In the decision-making step of the algorithm, the individual trees are aggregated and an immediate action is recommended. We provide a finite-sample analysis and discuss the trade-off between the principles of optimism and safety. We also report numerical results on a benchmark problem. Our algorithm performs as well as state-of-the-art optimistic planning algorithms, and better than a related algorithm which additionally assumes the knowledge of all transition distributions. 1
3 0.59604597 103 nips-2013-Efficient Exploration and Value Function Generalization in Deterministic Systems
Author: Zheng Wen, Benjamin Van Roy
Abstract: We consider the problem of reinforcement learning over episodes of a finitehorizon deterministic system and as a solution propose optimistic constraint propagation (OCP), an algorithm designed to synthesize efficient exploration and value function generalization. We establish that when the true value function Q⇤ lies within the hypothesis class Q, OCP selects optimal actions over all but at most dimE [Q] episodes, where dimE denotes the eluder dimension. We establish further efficiency and asymptotic performance guarantees that apply even if Q⇤ does not lie in Q, for the special case where Q is the span of pre-specified indicator functions over disjoint sets. 1
4 0.55768853 22 nips-2013-Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
Author: Nataliya Shapovalova, Michalis Raptis, Leonid Sigal, Greg Mori
Abstract: We propose a weakly-supervised structured learning approach for recognition and spatio-temporal localization of actions in video. As part of the proposed approach, we develop a generalization of the Max-Path search algorithm which allows us to efficiently search over a structured space of multiple spatio-temporal paths while also incorporating context information into the model. Instead of using spatial annotations in the form of bounding boxes to guide the latent model during training, we utilize human gaze data in the form of a weak supervisory signal. This is achieved by incorporating eye gaze, along with the classification, into the structured loss within the latent SVM learning framework. Experiments on a challenging benchmark dataset, UCF-Sports, show that our model is more accurate, in terms of classification, and achieves state-of-the-art results in localization. In addition, our model can produce top-down saliency maps conditioned on the classification label and localized latent paths. 1
5 0.55409765 276 nips-2013-Reshaping Visual Datasets for Domain Adaptation
Author: Boqing Gong, Kristen Grauman, Fei Sha
Abstract: In visual recognition problems, the common data distribution mismatches between training and testing make domain adaptation essential. However, image data is difficult to manually divide into the discrete domains required by adaptation algorithms, and the standard practice of equating datasets with domains is a weak proxy for all the real conditions that alter the statistics in complex ways (lighting, pose, background, resolution, etc.) We propose an approach to automatically discover latent domains in image or video datasets. Our formulation imposes two key properties on domains: maximum distinctiveness and maximum learnability. By maximum distinctiveness, we require the underlying distributions of the identified domains to be different from each other to the maximum extent; by maximum learnability, we ensure that a strong discriminative model can be learned from the domain. We devise a nonparametric formulation and efficient optimization procedure that can successfully discover domains among both training and test data. We extensively evaluate our approach on object recognition and human activity recognition tasks. 1
6 0.54080355 50 nips-2013-Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search
7 0.50771838 21 nips-2013-Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths
8 0.50535339 248 nips-2013-Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
9 0.50471431 250 nips-2013-Policy Shaping: Integrating Human Feedback with Reinforcement Learning
10 0.48962691 337 nips-2013-Transportability from Multiple Environments with Limited Experiments
11 0.4389357 322 nips-2013-Symbolic Opportunistic Policy Iteration for Factored-Action MDPs
12 0.43515387 106 nips-2013-Eluder Dimension and the Sample Complexity of Optimistic Exploration
13 0.42061016 69 nips-2013-Context-sensitive active sensing in humans
14 0.41382033 79 nips-2013-DESPOT: Online POMDP Planning with Regularization
15 0.41303787 1 nips-2013-(More) Efficient Reinforcement Learning via Posterior Sampling
16 0.40425855 270 nips-2013-Regret based Robust Solutions for Uncertain Markov Decision Processes
17 0.40206629 347 nips-2013-Variational Planning for Graph-based MDPs
18 0.38139418 208 nips-2013-Neural representation of action sequences: how far can a simple snippet-matching model take us?
19 0.37787625 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents
20 0.37527564 273 nips-2013-Reinforcement Learning in Robust Markov Decision Processes
topicId topicWeight
[(16, 0.016), (33, 0.074), (34, 0.073), (41, 0.018), (49, 0.517), (56, 0.091), (70, 0.025), (85, 0.024), (89, 0.02), (93, 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.88208872 323 nips-2013-Synthesizing Robust Plans under Incomplete Domain Models
Author: Tuan A. Nguyen, Subbarao Kambhampati, Minh Do
Abstract: Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with Probabilistic-FF planner. 1
2 0.85365081 6 nips-2013-A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data
Author: Jasper Snoek, Richard Zemel, Ryan P. Adams
Abstract: Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials. However, the most common neural point process models, the Poisson process and the gamma renewal process, do not capture interactions and correlations that are critical to modeling populations of neurons. We develop a novel model based on a determinantal point process over latent embeddings of neurons that effectively captures and helps visualize complex inhibitory and competitive interaction. We show that this model is a natural extension of the popular generalized linear model to sets of interacting neurons. The model is extended to incorporate gain control or divisive normalization, and the modulation of neural spiking based on periodic phenomena. Applied to neural spike recordings from the rat hippocampus, we see that the model captures inhibitory relationships, a dichotomy of classes of neurons, and a periodic modulation by the theta rhythm known to be present in the data. 1
3 0.80849326 274 nips-2013-Relevance Topic Model for Unstructured Social Group Activity Recognition
Author: Fang Zhao, Yongzhen Huang, Liang Wang, Tieniu Tan
Abstract: Unstructured social group activity recognition in web videos is a challenging task due to 1) the semantic gap between class labels and low-level visual features and 2) the lack of labeled training data. To tackle this problem, we propose a “relevance topic model” for jointly learning meaningful mid-level representations upon bagof-words (BoW) video representations and a classifier with sparse weights. In our approach, sparse Bayesian learning is incorporated into an undirected topic model (i.e., Replicated Softmax) to discover topics which are relevant to video classes and suitable for prediction. Rectified linear units are utilized to increase the expressive power of topics so as to explain better video data containing complex contents and make variational inference tractable for the proposed model. An efficient variational EM algorithm is presented for model parameter estimation and inference. Experimental results on the Unstructured Social Activity Attribute dataset show that our model achieves state of the art performance and outperforms other supervised topic model in terms of classification accuracy, particularly in the case of a very small number of labeled training videos. 1
4 0.77912652 266 nips-2013-Recurrent linear models of simultaneously-recorded neural populations
Author: Marius Pachitariu, Biljana Petreska, Maneesh Sahani
Abstract: Population neural recordings with long-range temporal structure are often best understood in terms of a common underlying low-dimensional dynamical process. Advances in recording technology provide access to an ever-larger fraction of the population, but the standard computational approaches available to identify the collective dynamics scale poorly with the size of the dataset. We describe a new, scalable approach to discovering low-dimensional dynamics that underlie simultaneously recorded spike trains from a neural population. We formulate the Recurrent Linear Model (RLM) by generalising the Kalman-filter-based likelihood calculation for latent linear dynamical systems to incorporate a generalised-linear observation process. We show that RLMs describe motor-cortical population data better than either directly-coupled generalised-linear models or latent linear dynamical system models with generalised-linear observations. We also introduce the cascaded generalised-linear model (CGLM) to capture low-dimensional instantaneous correlations in neural populations. The CGLM describes the cortical recordings better than either Ising or Gaussian models and, like the RLM, can be fit exactly and quickly. The CGLM can also be seen as a generalisation of a lowrank Gaussian model, in this case factor analysis. The computational tractability of the RLM and CGLM allow both to scale to very high-dimensional neural data. 1
5 0.73190975 131 nips-2013-Geometric optimisation on positive definite matrices for elliptically contoured distributions
Author: Suvrit Sra, Reshad Hosseini
Abstract: Hermitian positive definite (hpd) matrices recur throughout machine learning, statistics, and optimisation. This paper develops (conic) geometric optimisation on the cone of hpd matrices, which allows us to globally optimise a large class of nonconvex functions of hpd matrices. Specifically, we first use the Riemannian manifold structure of the hpd cone for studying functions that are nonconvex in the Euclidean sense but are geodesically convex (g-convex), hence globally optimisable. We then go beyond g-convexity, and exploit the conic geometry of hpd matrices to identify another class of functions that remain amenable to global optimisation without requiring g-convexity. We present key results that help recognise g-convexity and also the additional structure alluded to above. We illustrate our ideas by applying them to likelihood maximisation for a broad family of elliptically contoured distributions: for this maximisation, we derive novel, parameter free fixed-point algorithms. To our knowledge, ours are the most general results on geometric optimisation of hpd matrices known so far. Experiments show that advantages of using our fixed-point algorithms. 1
6 0.71540672 70 nips-2013-Contrastive Learning Using Spectral Methods
7 0.68454719 221 nips-2013-On the Expressive Power of Restricted Boltzmann Machines
8 0.61774319 303 nips-2013-Sparse Overlapping Sets Lasso for Multitask Learning and its Application to fMRI Analysis
9 0.59805691 345 nips-2013-Variance Reduction for Stochastic Gradient Optimization
10 0.57935691 121 nips-2013-Firing rate predictions in optimal balanced networks
11 0.51952124 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking
12 0.48315066 246 nips-2013-Perfect Associative Learning with Spike-Timing-Dependent Plasticity
13 0.47473803 141 nips-2013-Inferring neural population dynamics from multiple partial recordings of the same neural circuit
14 0.46401563 64 nips-2013-Compete to Compute
15 0.4306376 205 nips-2013-Multisensory Encoding, Decoding, and Identification
16 0.42925176 148 nips-2013-Latent Maximum Margin Clustering
17 0.42830381 236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables
19 0.42711499 174 nips-2013-Lexical and Hierarchical Topic Regression
20 0.42473584 208 nips-2013-Neural representation of action sequences: how far can a simple snippet-matching model take us?