nips nips2003 nips2003-2 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Maxim Likhachev, Geoffrey J. Gordon, Sebastian Thrun
Abstract: In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover. 1
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. [sent-5, score-0.547]
2 While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. [sent-8, score-0.585]
3 In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover. [sent-9, score-0.409]
4 1 Introduction Optimal search is often infeasible for real world problems, as we are given a limited amount of time for deliberation and want to find the best solution given the time provided. [sent-10, score-0.233]
5 In these conditions anytime algorithms [9, 2] prove to be useful as they usually find a first, possibly highly suboptimal, solution very fast and then continually work on improving the solution until allocated time expires. [sent-11, score-0.361]
6 Unfortunately, they can rarely provide bounds on the sub-optimality of their solutions unless the cost of an optimal solution is already known. [sent-12, score-0.169]
7 A* search with inflated heuristics (actual heuristic values are multiplied by an inflation factor ε > 1) is sub-optimal but proves to be fast for many domains [1, 5, 8] and also provides a bound on the sub-optimality, namely, the factor ε by which the heuristic is inflated [7]. [sent-16, score-0.495]
8 To construct an anytime algorithm with sub-optimality bounds one could run a succession of these A* searches with decreasing inflation factors. [sent-17, score-0.454]
9 This approach has control over the sub-optimality bound, but wastes a lot of computation since each search iteration duplicates most of the efforts of the previous searches. [sent-19, score-0.198]
10 One could try to employ incremental heuristic searches (e.g., [4]). [sent-20, score-0.201]
11 However, the sub-optimality bounds for each search iteration would then no longer be guaranteed. [sent-22, score-0.220]
12 To this end we propose the ARA* (Anytime Repairing A*) algorithm, which is an efficient anytime heuristic search that also runs A* with inflated heuristics in succession but reuses search efforts from previous executions in such a way that the sub-optimality bounds are still satisfied. [sent-23, score-0.869]
13 An evaluation of ARA* on a simulated robot kinematic arm with six degrees of freedom shows up to 6-fold speedup over the succession of A* searches. [sent-26, score-0.384]
14 We also demonstrate ARA* on the problem of planning a path for a mobile robot that takes into account the robot’s dynamics. [sent-27, score-0.245]
15 The only other anytime heuristic search known to us is Anytime A*, described in [8]. [sent-28, score-0.478]
16 That is, h(s) ≤ c(s, s′) + h(s′) for any successor s′ of s if s ≠ sgoal, and h(s) = 0 if s = sgoal. [sent-37, score-0.449]
17 Consistency, in its turn, guarantees that the heuristic is admissible: h(s) is never larger than the true cost of reaching the goal from s. [sent-39, score-0.196]
18 Inflating the heuristic (that is, using ε ∗ h(s) for ε > 1) often results in far fewer state expansions and consequently faster searches. [sent-40, score-0.236]
19 However, inflating the heuristic may also violate the admissibility property, and as a result, a solution is no longer guaranteed to be optimal. [sent-41, score-0.221]
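The consistency condition above can be checked mechanically on a small graph. Below is a minimal sketch, not part of the paper; the encoding of the graph as (s, s′, cost) triples and all names are our own assumptions:

```python
def is_consistent(h, edges, goal):
    """Check heuristic consistency: h(goal) == 0 and, for every directed
    edge (s, s2) with cost c, h(s) <= c + h(s2).
    `h` maps states to heuristic values; `edges` is a hypothetical list of
    (s, s2, c) triples, not the paper's notation."""
    if h[goal] != 0:
        return False
    return all(h[s] <= c + h[s2] for s, s2, c in edges)

# A consistent heuristic on a chain a -> b -> g with unit edge costs:
edges = [("a", "b", 1), ("b", "g", 1)]
h = {"a": 2, "b": 1, "g": 0}
# Inflating by 2 breaks consistency at a: 2*h(a) = 4 > 1 + 2*h(b) = 3.
h_inflated = {s: 2 * v for s, v in h.items()}
```

This illustrates exactly the warning in the text: a heuristic that is consistent (and hence admissible) can stop being so once it is inflated by ε > 1.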
20 A* maintains two functions from states to real numbers: g(s) is the cost of the current path from the start node to s (it is assumed to be ∞ if no path to s has been found yet), and f(s) = g(s) + ε ∗ h(s) is an estimate of the total distance from start to goal going through s. [sent-43, score-0.287]
21 The OPEN queue is sorted by f(s), so that A* always expands next the state that appears to be on the shortest path from start to goal. [sent-45, score-0.191]
22 A* initializes the OPEN list with the start state, sstart (line 02). [sent-46, score-0.228]
23 For ε > 1 a solution can be sub-optimal, but the sub-optimality is bounded by a factor of ε: the length of the found solution is no larger than ε times the length of the optimal solution [7]. [sent-59, score-0.189]
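A* with an inflated heuristic can be written in a few lines. The sketch below is ours, not the paper's pseudocode (Figure 1); `neighbors`, `cost`, and `h` are assumed caller-supplied callables:

```python
import heapq

def weighted_a_star(start, goal, neighbors, cost, h, eps=2.0):
    """A* with an inflated heuristic f(s) = g(s) + eps*h(s).
    With a consistent h, the returned cost is at most eps times optimal."""
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(eps * h(start), start)]  # OPEN, sorted by f(s)
    closed = set()
    while open_heap:
        _, s = heapq.heappop(open_heap)
        if s == goal:
            # Stop as soon as the goal is about to be expanded.
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return g[goal], path[::-1]
        if s in closed:   # skip stale queue entries
            continue
        closed.add(s)
        for s2 in neighbors(s):
            new_g = g[s] + cost(s, s2)
            if new_g < g.get(s2, float("inf")):
                g[s2] = new_g
                parent[s2] = s
                heapq.heappush(open_heap, (new_g + eps * h(s2), s2))
    return float("inf"), None
```

On a small 4-connected grid with the paper's heuristic (the larger of the x and y distances to the goal), this returns a path whose cost is guaranteed to be within a factor of eps of optimal.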
24 The left three columns in Figure 2 show the operation of the A* algorithm with a heuristic inflated by ε = 2. [sent-60, score-0.13]
25 The heuristic is the larger of the x and y distances from the cell to the goal. [sent-66, score-0.135]
26 (A* can stop search as soon as it is about to expand a goal state, without actually expanding it.) [sent-68, score-0.317]
27 The A* searches with inflated heuristics expand substantially fewer cells than A* with ε = 1, but their solution is sub-optimal. [sent-71, score-0.269]
28 As a result, after each search a solution is guaranteed to be within a factor ε of optimal. [sent-74, score-0.233]
29 Running A* search from scratch every time we decrease ε, however, would be very expensive. [sent-75, score-0.207]
30 We will now explain how ARA* reuses the results of the previous searches to save computation. [sent-76, score-0.131]
31 A state is called locally inconsistent every time its g-value is decreased (line 09, Figure 1) and until the next time the state is expanded. [sent-80, score-0.303]
32 That is, suppose that state s is the best predecessor for some state s′: that is, g(s′) = min_{s′′∈pred(s′)}(g(s′′) + c(s′′, s′)) = g(s) + c(s, s′). [sent-81, score-0.220]
33 Given this definition of local inconsistency it is clear that the OPEN list consists of exactly all locally inconsistent states: every time a g-value is lowered the state is inserted into OPEN, and every time a state is expanded it is removed from OPEN until the next time its g-value is lowered. [sent-88, score-0.551]
34 A* with a consistent heuristic is guaranteed not to expand any state more than once. [sent-90, score-0.244]
35 Setting > 1, however, may violate consistency, and as a result A* search may re-expand states multiple times. [sent-91, score-0.241]
36 It turns out that if we restrict each state to be expanded no more than once, then the sub-optimality bound of ε still holds. [sent-92, score-0.249]
37 To implement this restriction we check any state whose g-value is lowered and insert it into OPEN only if it has not been previously expanded (line 10, Figure 3). [sent-93, score-0.268]
38 The set of expanded states is maintained in the CLOSED variable. [sent-94, score-0.189]
39 In fact, it will only contain the locally inconsistent states that have not yet been expanded. [sent-96, score-0.225]
40 It is important, however, to keep track of all the locally inconsistent states as they will be the starting points for inconsistency propagation in the future search iterations. [sent-97, score-0.436]
41 We do this by maintaining the set INCONS of all the locally inconsistent states that are not in OPEN (lines 12-13, Figure 3). [sent-98, score-0.225]
42 Thus, the union of INCONS and OPEN is exactly the set of all locally inconsistent states, and can be used as a starting point for inconsistency propagation before each new search iteration. [sent-99, score-0.361]
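The OPEN/INCONS bookkeeping just described can be sketched as follows. This is a hedged illustration using plain Python sets with names of our own choosing; the actual ARA* pseudocode (Figure 3) maintains OPEN as a priority queue:

```python
def on_g_lowered(s, new_g, g, open_set, closed, incons):
    """Called whenever a state's g-value is lowered, which makes it locally
    inconsistent. If s has not yet been expanded in this iteration (not in
    CLOSED), it goes into OPEN for (re)expansion; otherwise it is parked in
    INCONS so its inconsistency can be propagated in the next iteration.
    Invariant: open_set | incons == the set of locally inconsistent states."""
    g[s] = new_g
    if s not in closed:
        open_set.add(s)
    else:
        incons.add(s)
```

Before each new search iteration, OPEN ∪ INCONS is then exactly the starting set for inconsistency propagation, as the text states.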
43 Since the ImprovePath function reuses search efforts from the previous executions, sgoal may never become locally inconsistent and thus may never be inserted into OPEN. [sent-101, score-0.65]
44 It also allows us to avoid expanding sgoal as well as possibly some other states with the same f -value. [sent-105, score-0.304]
45 Instead, the fvalue(s) function is called to compute and return the f -values only for the states in OPEN and sgoal . [sent-107, score-0.285]
46 3 ARA*: Iterative Execution of Searches We now introduce the main function of ARA* (right column in Figure 3) which performs a series of search iterations. [sent-109, score-0.169]
47 As suggested in [8] a sub-optimality bound can also be computed as the ratio between g(sgoal ), which gives an upper bound on the cost of an optimal solution, and the minimum un-weighted f -value of a locally inconsistent state, which gives a lower bound on the cost of an optimal solution. [sent-114, score-0.497]
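The bound from [8] can be computed in a few lines from the locally inconsistent states. A sketch under our own naming assumptions (container names and the callable `h` are not from the paper's pseudocode):

```python
def actual_bound(eps, g_goal, g, h, open_set, incons):
    """Sub-optimality bound: the minimum of eps and g(s_goal) divided by a
    lower bound on the optimal solution cost, where the lower bound is the
    minimum un-weighted f-value g(s) + h(s) over all locally inconsistent
    states (OPEN union INCONS)."""
    inconsistent = open_set | incons
    if not inconsistent:
        return 1.0  # search has converged; the solution is provably optimal
    lower = min(g[s] + h(s) for s in inconsistent)
    return min(eps, g_goal / lower) if lower > 0 else eps
```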
48 At first, one may also think of using this actual sub-optimality bound in deciding how to decrease ε between search iterations (e. [sent-118, score-0.255]
49 The reason is that a small decrease in ε often results in the improvement of the solution, despite the fact that the actual sub-optimality bound of the previous solution was already substantially less than the value of ε. [sent-122, score-0.195]
50 ) Within each execution of the ImprovePath function we mainly save computation by not re-expanding the states which were locally consistent and whose g-values were already correct before the call to ImprovePath (Theorem 2 states this more precisely). [sent-125, score-0.31]
51 States that are locally inconsistent at the end of an iteration are shown with an asterisk. [sent-127, score-0.171]
52 This is in contrast to 15 cells expanded by A* search with the same ε. [sent-131, score-0.297]
53 Only 2 cells are expanded more than once across all three calls to the ImprovePath function. [sent-137, score-0.19]
54 Even a single optimal search from scratch expands 20 cells. [sent-138, score-0.248]
55 We use g ∗ (s) to denote the cost of an optimal path from sstart to s. [sent-142, score-0.291]
56 Let us also define a greedy path from sstart to s as a path that is computed by tracing it backward as follows: start at s, and at any state si pick a state si−1 = arg min_{s′∈pred(si)}(g(s′) + c(s′, si)) until si−1 = sstart. [sent-143, score-0.734]
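The backward greedy trace just defined can be written directly. A minimal sketch; `pred` and `c` are assumed caller-supplied callables mirroring the paper's pred(s) and c(s′, s):

```python
def greedy_path(s, s_start, g, pred, c):
    """Trace a greedy path backward from s to s_start: at each state s_i,
    pick the predecessor minimizing g(s') + c(s', s_i), then reverse the
    resulting sequence so it runs from s_start to s."""
    path = [s]
    while s != s_start:
        s = min(pred(s), key=lambda p: g[p] + c(p, s))
        path.append(s)
    return path[::-1]
```

By Theorem 1, whenever ImprovePath exits, the cost of this greedy path from sstart to s is no larger than g(s).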
57 Theorem 1 Whenever the ImprovePath function exits, for any state s with f(s) ≤ min_{s′∈OPEN}(f(s′)), we have g∗(s) ≤ g(s) ≤ ε ∗ g∗(s), and the cost of a greedy path from sstart to s is no larger than g(s). [sent-144, score-0.413]
58 The correctness of ARA* follows from this theorem: each execution of the ImprovePath function terminates when f(sgoal) is no larger than the minimum f-value in OPEN, which means that the greedy path from start to goal that we have found is within a factor ε of optimal. [sent-145, score-0.172]
59 Since ε is decreased before each iteration, and it, in its turn, is an upper bound on the actual sub-optimality, ARA* gradually decreases the sub-optimality bound and finds new solutions to satisfy the bound. [sent-146, score-0.183]
60 Theorem 2 Within each call to ImprovePath() a state is expanded at most once and only if it was locally inconsistent before the call to ImprovePath() or its g-value was lowered during the current execution of ImprovePath(). [sent-147, score-0.482]
61 The second theorem formalizes where the computational savings for ARA* search come from. [sent-148, score-0.146]
62 Unlike A* search with an inflated heuristic, each search iteration in ARA* is guaranteed not to expand states more than once. [sent-149, score-0.453]
63 The base of the arm is fixed, and the task is to move its end-effector to the goal while navigating around obstacles (indicated by grey rectangles). [sent-153, score-0.166]
64 ) We discretize the workspace into 50 by 50 cells and compute a distance from each cell to the cell containing the goal while taking into account that some cells are occupied by obstacles. [sent-157, score-0.14]
65 In order for the heuristic not to overestimate true costs, joint angles are discretized so as to never move the end-effector by more than one cell in a single action. [sent-159, score-0.183]
66 The resulting state-space is over 3 billion states for a 6 DOF robot arm and over 10^26 states for a 20 DOF robot arm, and memory for states is allocated on demand. [sent-160, score-0.646]
67 (a) 6D arm trajectory for ε = 3 (b) uniform costs (c) non-uniform costs (d) both Anytime A* and A* (e) ARA* (f) non-uniform costs after 90 secs, cost=682, ε=15. [sent-161, score-0.263]
68 Bottom row: 20D robot arm experiments (the trajectories shown are downsampled by 6). [sent-164, score-0.252]
69 Figure 4a shows the planned trajectory of the robot arm after the initial search of ARA* with ε = 3. [sent-166, score-0.434]
70 (By comparison, a search for an optimal trajectory is infeasible as it runs out of memory very quickly. [sent-170, score-0.234]
71 ) The plot in Figure 4b shows that ARA* improves both the quality of the solution and the bound on its sub-optimality faster and in a more gradual manner than either a succession of A* searches or Anytime A* [8]. [sent-171, score-0.281]
72 To evaluate the expense of the anytime property of ARA* we also ran ARA* and an optimal A* search in a slightly simpler environment (for the optimal search to be feasible). [sent-178, score-0.601]
73 3 mins (2,202,666 states expanded) to find an optimal solution, while ARA* required about 5. [sent-180, score-0.186]
74 5 mins (2,207,178 states expanded) to decrease ε in steps of 0. [sent-181, score-0.194]
75 As a result of the non-uniform costs our heuristic becomes less informative, and so search is much more expensive. [sent-185, score-0.301]
76 220 for the succession of A* searches and 223 for Anytime A*) and a better sub-optimality bound (3. [sent-188, score-0.235]
77 46 for both the succession of A* searches and Anytime A*). [sent-191, score-0.166]
78 7 secs, while the succession of A* searches reaches the same bound after 12. [sent-195, score-0.253]
79 4 minutes (about 140-fold speedup by ARA*) and Anytime A* reaches it after over 4 million expansions and 8. [sent-197, score-0.132]
80 While Figure 4 shows execution time, the comparison of states expanded (not shown) is almost identical. [sent-200, score-0.244]
81 0) Figure 5: outdoor robot navigation experiment (the cross shows the position of the robot) seven times before a first solution was found. [sent-204, score-0.261]
82 Figures 4d-f show the results of experiments done on a 20 DOF robot arm, with actions that have non-uniform costs. [sent-205, score-0.151]
83 Figures 4d and 4e show that in 90 seconds of planning the cost of the trajectory found by ARA* and the sub-optimality bound it can guarantee are substantially smaller than for the other algorithms. [sent-207, score-0.234]
84 For example, the trajectory in Figure 4d contains more steps and also makes one extra change in the angle of the third joint from the base of the arm (despite the fact that changing lower joint angles is very expensive) in comparison to the trajectory in Figure 4e. [sent-208, score-0.193]
85 To make the results from different environments comparable we normalize the bound by dividing it by the maximum of the best bounds that the algorithms achieve before they run out of memory. [sent-212, score-0.148]
86 High dimensionality and large environments result in very large state-spaces for the planner and make it computationally infeasible for the robot to plan optimally every time it discovers new obstacles or modelling errors. [sent-220, score-0.377]
87 To solve this problem we built a two-level planner: a 4D planner that uses ARA*, and a fast 2D (x, y) planner that uses A* search and whose results serve as the heuristic for the 4D planner. [sent-221, score-0.369]
88 1 To interleave search with the execution of the best plan so far, we perform the 4D search backward. [sent-222, score-0.409]
89 That is, the start of the search, sstart , is the actual goal state of the robot, while the goal of the search, sgoal , is the current state of the robot. [sent-223, score-0.587]
90 Thus, sstart does not change as the robot moves and the search tree remains valid in between search iterations. [sent-224, score-0.616]
91 Since heuristics estimate the distances to sgoal (the robot position) we have to recompute them during the reorder operation (line 10’, Figure 3). [sent-225, score-0.396]
92 In Figure 5 we show the robot we used for navigation and a 3D laser scan [3] constructed by the robot of the environment we tested our system in. [sent-226, score-0.349]
93 The 2D plan (Figure 5c) makes sharp 45-degree turns when going around the obstacles, requiring the robot to come to complete stops. [sent-238, score-0.213]
94 The optimal 4D plan results in a wider turn, and the velocity of the robot remains high throughout the whole trajectory. [sent-239, score-0.245]
95 As a result, the robot that runs ARA* can start executing its plan much earlier. [sent-244, score-0.241]
96 A robot running the optimal 4D planner would still be near the beginning of its path 25 seconds after receiving a goal location (Figure 5d). [sent-245, score-0.308]
97 In contrast, in the same amount of time the robot running ARA* has advanced much further (Figure 5f), and its plan by now has converged to optimal (ε has decreased to 1). [sent-246, score-0.266]
98 4 Conclusions We have presented the first anytime heuristic search that works by continually decreasing a sub-optimality bound on its solution and finding new solutions that satisfy the bound on the way. [sent-247, score-0.73]
99 It executes a series of searches with decreasing sub-optimality bounds, and each search tries to reuse as much as possible of the results from previous searches. [sent-248, score-0.293]
100 The experiments show that our algorithm is much more efficient than any of the previous anytime searches, and can successfully solve large robotic planning problems. [sent-249, score-0.288]
wordName wordTfidf (topN-words)
[('ara', 0.705), ('improvepath', 0.297), ('anytime', 0.219), ('sgoal', 0.21), ('sstart', 0.173), ('robot', 0.151), ('search', 0.146), ('open', 0.134), ('expanded', 0.114), ('heuristic', 0.113), ('arm', 0.101), ('fvalue', 0.099), ('ated', 0.097), ('inconsistent', 0.094), ('mins', 0.088), ('searches', 0.088), ('incons', 0.087), ('succession', 0.078), ('states', 0.075), ('bound', 0.069), ('secs', 0.069), ('state', 0.066), ('inconsistency', 0.065), ('plan', 0.062), ('expansions', 0.057), ('locally', 0.056), ('execution', 0.055), ('planner', 0.055), ('insert', 0.049), ('ation', 0.049), ('expands', 0.049), ('path', 0.048), ('planning', 0.046), ('environments', 0.046), ('solution', 0.046), ('expand', 0.043), ('obstacles', 0.043), ('outdoor', 0.043), ('reuses', 0.043), ('costs', 0.042), ('decrease', 0.04), ('dof', 0.039), ('lowered', 0.039), ('calls', 0.039), ('cost', 0.038), ('cells', 0.037), ('pred', 0.037), ('decreasing', 0.036), ('trajectory', 0.036), ('heuristics', 0.035), ('speedup', 0.034), ('bounds', 0.033), ('continually', 0.032), ('optimal', 0.032), ('efforts', 0.031), ('successor', 0.029), ('call', 0.029), ('start', 0.028), ('closed', 0.027), ('list', 0.027), ('environment', 0.026), ('ating', 0.025), ('discretitized', 0.025), ('executions', 0.025), ('likhachev', 0.025), ('maxim', 0.025), ('publish', 0.025), ('suboptimality', 0.025), ('successors', 0.025), ('inserted', 0.024), ('decreases', 0.024), ('million', 0.023), ('robotic', 0.023), ('series', 0.023), ('never', 0.023), ('goal', 0.022), ('si', 0.022), ('guaranteed', 0.022), ('cell', 0.022), ('deliberation', 0.021), ('repetitively', 0.021), ('scratch', 0.021), ('decreased', 0.021), ('navigation', 0.021), ('soon', 0.021), ('iteration', 0.021), ('angle', 0.02), ('substantially', 0.02), ('already', 0.02), ('infeasible', 0.02), ('kinematic', 0.02), ('violate', 0.02), ('longer', 0.02), ('factor', 0.019), ('expanding', 0.019), ('allocated', 0.018), ('overhead', 0.018), ('reaches', 0.018), ('columns', 0.017), 
('episodes', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 2 nips-2003-ARA*: Anytime A* with Provable Bounds on Sub-Optimality
Author: Maxim Likhachev, Geoffrey J. Gordon, Sebastian Thrun
Abstract: In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover. 1
2 0.099581569 36 nips-2003-Auction Mechanism Design for Multi-Robot Coordination
Author: Curt Bererton, Geoffrey J. Gordon, Sebastian Thrun
Abstract: The design of cooperative multi-robot systems is a highly active research area in robotics. Two lines of research in particular have generated interest: the solution of large, weakly coupled MDPs, and the design and implementation of market architectures. We propose a new algorithm which joins together these two lines of research. For a class of coupled MDPs, our algorithm automatically designs a market architecture which causes a decentralized multi-robot system to converge to a consistent policy. We can show that this policy is the same as the one which would be produced by a particular centralized planning algorithm. We demonstrate the new algorithm on three simulation examples: multi-robot towing, multi-robot path planning with a limited fuel resource, and coordinating behaviors in a game of paint ball. 1
3 0.0738988 33 nips-2003-Approximate Planning in POMDPs with Macro-Actions
Author: Georgios Theocharous, Leslie P. Kaelbling
Abstract: Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, we can learn POMDP policies faster, and we can do information gathering more efficiently.
4 0.072980098 29 nips-2003-Applying Metric-Trees to Belief-Point POMDPs
Author: Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
Abstract: Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algorithms do not exploit this property. This paper presents a new metric-tree algorithm which can be used in the context of POMDP planning to sort belief points spatially, and then perform fast value function updates over groups of points. We present results showing that this approach can reduce computation in point-based POMDP algorithms for a wide range of problems. 1
5 0.070607573 21 nips-2003-An Autonomous Robotic System for Mapping Abandoned Mines
Author: David Ferguson, Aaron Morris, Dirk Hähnel, Christopher Baker, Zachary Omohundro, Carlos Reverte, Scott Thayer, Charles Whittaker, William Whittaker, Wolfram Burgard, Sebastian Thrun
Abstract: We present the software architecture of a robotic system for mapping abandoned mines. The software is capable of acquiring consistent 2D maps of large mines with many cycles, represented as Markov random fields. 3D C-space maps are acquired from local 3D range scans, which are used to identify navigable paths using A* search. Our system has been deployed in three abandoned mines, two of which are inaccessible to people, where it has acquired maps of unprecedented detail and accuracy. 1
6 0.059587952 62 nips-2003-Envelope-based Planning in Relational MDPs
7 0.054569114 34 nips-2003-Approximate Policy Iteration with a Policy Language Bias
8 0.051390637 158 nips-2003-Policy Search by Dynamic Programming
9 0.046301268 43 nips-2003-Bounded Invariance and the Formation of Place Fields
10 0.043641571 70 nips-2003-Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis
11 0.036073487 136 nips-2003-New Algorithms for Efficient High Dimensional Non-parametric Classification
12 0.034087699 24 nips-2003-An Iterative Improvement Procedure for Hierarchical Clustering
13 0.033596523 42 nips-2003-Bounded Finite State Controllers
14 0.0330569 68 nips-2003-Eye Movements for Reward Maximization
15 0.03120362 91 nips-2003-Inferring State Sequences for Non-linear Systems with Embedded Hidden Markov Models
16 0.03080637 94 nips-2003-Information Maximization in Noisy Channels : A Variational Approach
17 0.029799631 78 nips-2003-Gaussian Processes in Reinforcement Learning
18 0.029782733 28 nips-2003-Application of SVMs for Colour Classification and Collision Detection with AIBO Robots
19 0.029737573 8 nips-2003-A Holistic Approach to Compositional Semantics: a connectionist model and robot experiments
20 0.029282305 116 nips-2003-Linear Program Approximations for Factored Continuous-State Markov Decision Processes
topicId topicWeight
[(0, -0.104), (1, 0.077), (2, -0.03), (3, 0.034), (4, -0.011), (5, -0.041), (6, -0.061), (7, -0.023), (8, 0.058), (9, 0.03), (10, 0.022), (11, -0.095), (12, 0.031), (13, 0.097), (14, 0.029), (15, 0.054), (16, 0.036), (17, -0.019), (18, 0.038), (19, 0.041), (20, -0.011), (21, -0.079), (22, -0.04), (23, -0.054), (24, 0.061), (25, -0.008), (26, -0.115), (27, 0.069), (28, -0.034), (29, 0.063), (30, 0.043), (31, -0.085), (32, 0.017), (33, 0.0), (34, 0.144), (35, -0.07), (36, -0.136), (37, -0.005), (38, -0.025), (39, 0.119), (40, -0.029), (41, -0.024), (42, 0.024), (43, 0.025), (44, 0.063), (45, -0.041), (46, -0.105), (47, 0.026), (48, 0.066), (49, 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.93547332 2 nips-2003-ARA*: Anytime A* with Provable Bounds on Sub-Optimality
Author: Maxim Likhachev, Geoffrey J. Gordon, Sebastian Thrun
Abstract: In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover. 1
2 0.8575505 21 nips-2003-An Autonomous Robotic System for Mapping Abandoned Mines
Author: David Ferguson, Aaron Morris, Dirk Hähnel, Christopher Baker, Zachary Omohundro, Carlos Reverte, Scott Thayer, Charles Whittaker, William Whittaker, Wolfram Burgard, Sebastian Thrun
Abstract: We present the software architecture of a robotic system for mapping abandoned mines. The software is capable of acquiring consistent 2D maps of large mines with many cycles, represented as Markov random fields. 3D C-space maps are acquired from local 3D range scans, which are used to identify navigable paths using A* search. Our system has been deployed in three abandoned mines, two of which are inaccessible to people, where it has acquired maps of unprecedented detail and accuracy. 1
3 0.82117432 36 nips-2003-Auction Mechanism Design for Multi-Robot Coordination
Author: Curt Bererton, Geoffrey J. Gordon, Sebastian Thrun
Abstract: The design of cooperative multi-robot systems is a highly active research area in robotics. Two lines of research in particular have generated interest: the solution of large, weakly coupled MDPs, and the design and implementation of market architectures. We propose a new algorithm which joins together these two lines of research. For a class of coupled MDPs, our algorithm automatically designs a market architecture which causes a decentralized multi-robot system to converge to a consistent policy. We can show that this policy is the same as the one which would be produced by a particular centralized planning algorithm. We demonstrate the new algorithm on three simulation examples: multi-robot towing, multi-robot path planning with a limited fuel resource, and coordinating behaviors in a game of paint ball. 1
4 0.43502855 8 nips-2003-A Holistic Approach to Compositional Semantics: a connectionist model and robot experiments
Author: Yuuya Sugita, Jun Tani
Abstract: We present a novel connectionist model for acquiring the semantics of a simple language through the behavioral experiences of a real robot. We focus on the “compositionality” of semantics, a fundamental characteristic of human language, which is the ability to understand the meaning of a sentence as a combination of the meanings of words. We also pay much attention to the “embodiment” of a robot, which means that the robot should acquire semantics which matches its body, or sensory-motor system. The essential claim is that an embodied compositional semantic representation can be self-organized from generalized correspondences between sentences and behavioral patterns. This claim is examined and confirmed through simple experiments in which a robot generates corresponding behaviors from unlearned sentences by analogy with the correspondences between learned sentences and behaviors. 1
5 0.42836305 6 nips-2003-A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters
Author: Daniel B. Neill, Andrew W. Moore
Abstract: Given an N ×N grid of squares, where each square has a count and an underlying population, our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, can be used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff’s spatial scan statistic DK to find the most significant spatial disease cluster. A naive approach to finding the maximum density region requires O(N 3 ) time, and is generally computationally infeasible. We present a novel algorithm which partitions the grid into overlapping regions, bounds the maximum score of subregions contained in each region, and prunes regions which cannot contain the maximum density region. For sufficiently dense regions, this method finds the maximum density region in optimal O(N 2 ) time, in practice resulting in significant (10-200x) speedups. 1
6 0.40956965 43 nips-2003-Bounded Invariance and the Formation of Place Fields
7 0.37532818 33 nips-2003-Approximate Planning in POMDPs with Macro-Actions
8 0.36906338 62 nips-2003-Envelope-based Planning in Relational MDPs
9 0.34071058 44 nips-2003-Can We Learn to Beat the Best Stock
10 0.34057346 29 nips-2003-Applying Metric-Trees to Belief-Point POMDPs
11 0.33018985 28 nips-2003-Application of SVMs for Colour Classification and Collision Detection with AIBO Robots
12 0.28688937 34 nips-2003-Approximate Policy Iteration with a Policy Language Bias
13 0.28347495 158 nips-2003-Policy Search by Dynamic Programming
14 0.28118491 116 nips-2003-Linear Program Approximations for Factored Continuous-State Markov Decision Processes
15 0.27768311 61 nips-2003-Entrainment of Silicon Central Pattern Generators for Legged Locomotory Control
16 0.27023721 196 nips-2003-Wormholes Improve Contrastive Divergence
17 0.26497331 136 nips-2003-New Algorithms for Efficient High Dimensional Non-parametric Classification
18 0.26092643 110 nips-2003-Learning a World Model and Planning with a Self-Organizing, Dynamic Neural System
19 0.24710135 97 nips-2003-Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian
20 0.24489839 181 nips-2003-Statistical Debugging of Sampled Programs
topicId topicWeight
[(0, 0.048), (11, 0.013), (29, 0.016), (30, 0.034), (35, 0.059), (53, 0.051), (71, 0.035), (76, 0.028), (82, 0.012), (85, 0.536), (91, 0.06)]
simIndex simValue paperId paperTitle
1 0.97986192 95 nips-2003-Insights from Machine Learning Applied to Human Visual Classification
Author: Felix A. Wichmann, Arnulf B. Graf
Abstract: We attempt to understand visual classification in humans using both psychophysical and machine learning techniques. Frontal views of human faces were used for a gender classification task. Human subjects classified the faces and their gender judgment, reaction time and confidence rating were recorded. Several hyperplane learning algorithms were used on the same classification task using the Principal Components of the texture and shape representation of the faces. The classification performance of the learning algorithms was estimated using the face database with the true gender of the faces as labels, and also with the gender estimated by the subjects. We then correlated the human responses to the distance of the stimuli to the separating hyperplane of the learning algorithms. Our results suggest that human classification can be modeled by some hyperplane algorithms in the feature space we used. For classification, the brain needs more processing for stimuli close to that hyperplane than for those further away.
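The quantity being correlated with reaction times is the signed distance of each stimulus to a learned separating hyperplane. A minimal sketch, using a simple perceptron as a stand-in for the hyperplane learners in the paper (the paper itself uses several algorithms, e.g. SVMs):

```python
import numpy as np

def hyperplane_distances(X, y, epochs=100, lr=0.1):
    """Fit a separating hyperplane with the perceptron rule and return
    each stimulus's signed distance to it. Labels y must be in {-1, +1}.
    Stimuli with small |distance| are the 'hard' ones near the boundary."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified: nudge the hyperplane
                w += lr * yi * xi
                b += lr * yi
    # normalize so the result is a geometric distance, not just a margin
    return (X @ w + b) / np.linalg.norm(w)
```

One could then correlate `np.abs(hyperplane_distances(X, y))` against per-stimulus reaction times, in the spirit of the analysis described above.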
2 0.97508186 193 nips-2003-Variational Linear Response
Author: Manfred Opper, Ole Winther
Abstract: A general linear response method for deriving improved estimates of correlations in the variational Bayes framework is presented. Three applications are given, and it is discussed how linear response can be used as a general principle for improving mean field approximations.
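The linear response principle behind this abstract is the standard identity relating parameter sensitivities to correlations; for an exponential-family model with parameters θ coupled to variables s it reads (this statement of the identity is a textbook result, supplied here for context rather than taken from the paper):

```latex
\frac{\partial \langle s_i \rangle}{\partial \theta_j}
  = \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle
```

Differentiating a variational (mean-field) estimate of the means with respect to the parameters therefore yields correlation estimates that go beyond the factorized approximation, which otherwise sets them to zero.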
3 0.97095305 136 nips-2003-New Algorithms for Efficient High Dimensional Non-parametric Classification
Author: Ting Liu, Andrew W. Moore, Alexander Gray
Abstract: This paper is about non-approximate acceleration of high dimensional nonparametric operations such as k nearest neighbor classifiers and the prediction phase of Support Vector Machine classifiers. We attempt to exploit the fact that even if we want exact answers to nonparametric queries, we usually do not need to explicitly find the datapoints close to the query, but merely need to ask questions about the properties of that set of datapoints. This offers a small amount of computational leeway, and we investigate how much that leeway can be exploited. For clarity, this paper concentrates on pure k-NN classification and the prediction phase of SVMs. We introduce new ball tree algorithms that on real-world datasets give accelerations of 2-fold up to 100-fold compared against highly optimized traditional ball-tree-based k-NN. These results include datasets with up to 10^6 dimensions and 10^5 records, and show non-trivial speedups while giving exact answers.
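The observation above is that k-NN classification only needs the majority label among the k nearest points, not the points themselves. A brute-force sketch of the query being accelerated makes this explicit (this is the baseline, not the paper's ball-tree algorithms, which prune whole regions once the majority label is determined):

```python
import numpy as np

def knn_classify(X_train, y_train, x, k=3):
    """Exact k-NN classification by brute force. Note the answer depends
    only on the label counts among the k nearest neighbors: the identities
    of the neighbors themselves are never needed, which is the leeway the
    ball-tree methods exploit."""
    d2 = ((X_train - x) ** 2).sum(axis=1)   # squared distances to the query
    nearest = np.argpartition(d2, k)[:k]    # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]        # majority vote (ties: lowest label)
```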
same-paper 4 0.94216651 2 nips-2003-ARA*: Anytime A* with Provable Bounds on Sub-Optimality
Author: Maxim Likhachev, Geoffrey J. Gordon, Sebastian Thrun
Abstract: In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover.
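The core mechanism is weighted A*, where the heuristic is inflated by a factor ε ≥ 1 so the solution cost is provably within a factor ε of optimal, run inside an anytime loop that decreases ε. A minimal sketch of that loop (note this simplified version reruns the search from scratch at each ε; the key contribution of ARA* is reusing the previous search effort, which is omitted here):

```python
import heapq

def weighted_astar(start, goal, neighbors, h, eps):
    """A* with inflated priority f = g + eps*h. For an admissible h, the
    returned cost is at most eps times the optimal cost."""
    g = {start: 0}
    frontier = [(eps * h(start), start)]
    while frontier:
        _, s = heapq.heappop(frontier)
        if s == goal:
            return g[goal]
        for t, c in neighbors(s):
            if t not in g or g[s] + c < g[t]:
                g[t] = g[s] + c
                heapq.heappush(frontier, (g[t] + eps * h(t), t))
    return None  # goal unreachable

def anytime_search(start, goal, neighbors, h, eps0=3.0, step=0.5):
    """Anytime loop in the spirit of ARA*: find a suboptimal solution fast
    with a loose bound, then tighten the bound while time remains. Yields
    (bound, solution cost) after each iteration; at eps = 1.0 the solution
    is provably optimal."""
    eps = eps0
    while eps >= 1.0:
        yield eps, weighted_astar(start, goal, neighbors, h, eps)
        eps -= step
```

In a real deadline-driven planner the caller would consume this generator until the allotted time expires, keeping the best (bound, cost) pair seen so far.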
5 0.88940334 64 nips-2003-Estimating Internal Variables and Parameters of a Learning Agent by a Particle Filter
Author: Kazuyuki Samejima, Kenji Doya, Yasumasa Ueda, Minoru Kimura
Abstract: When we model higher order functions, such as learning and memory, we face the difficulty of comparing neural activities with hidden variables that depend on the history of sensory and motor signals and the dynamics of the network. Here, we propose a novel method for estimating hidden variables of a learning agent, such as connection weights, from sequences of observable variables. Bayesian estimation is a method to estimate the posterior probability of hidden variables from an observable data sequence using a dynamic model of hidden and observable variables. In this paper, we apply a particle filter to estimating internal parameters and metaparameters of a reinforcement learning model. We verified the effectiveness of the method using both artificial data and real animal behavioral data.
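The estimation machinery is a standard bootstrap particle filter: propagate particles through a dynamics model, weight them by the observation likelihood, then resample. A generic sketch on a toy problem (a 1-D random walk observed with Gaussian noise, standing in for the agent's internal variables; the dynamics and noise models here are illustrative assumptions, not the paper's reinforcement-learning model):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=1000,
                    proc_std=0.1, obs_std=0.5):
    """Bootstrap particle filter tracking the posterior over a hidden
    1-D state from noisy observations. Returns the posterior mean
    estimate after each observation."""
    particles = rng.normal(0.0, 1.0, n_particles)  # prior sample
    estimates = []
    for y in observations:
        # 1. propagate each particle through the dynamics model
        particles = particles + rng.normal(0.0, proc_std, n_particles)
        # 2. weight particles by the Gaussian observation likelihood
        w = np.exp(-0.5 * ((y - particles) / obs_std) ** 2)
        w /= w.sum()
        estimates.append(np.sum(w * particles))     # posterior mean
        # 3. resample particles in proportion to their weights
        particles = rng.choice(particles, n_particles, p=w)
    return np.array(estimates)
```

In the paper's setting the hidden state would include the agent's connection weights and metaparameters (e.g. a learning rate), with the reinforcement-learning update rule playing the role of the dynamics model in step 1.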
6 0.84279698 124 nips-2003-Max-Margin Markov Networks
7 0.79126203 192 nips-2003-Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes
8 0.74602127 28 nips-2003-Application of SVMs for Colour Classification and Collision Detection with AIBO Robots
9 0.70806563 3 nips-2003-AUC Optimization vs. Error Rate Minimization
10 0.70482504 134 nips-2003-Near-Minimax Optimal Classification with Dyadic Classification Trees
11 0.67720497 109 nips-2003-Learning a Rare Event Detection Cascade by Direct Feature Selection
12 0.66472179 50 nips-2003-Denoising and Untangling Graphs Using Degree Priors
13 0.65222758 20 nips-2003-All learning is Local: Multi-agent Learning in Global Reward Games
14 0.63195622 57 nips-2003-Dynamical Modeling with Kernels for Nonlinear Time Series Prediction
15 0.61843723 148 nips-2003-Online Passive-Aggressive Algorithms
16 0.61133909 147 nips-2003-Online Learning via Global Feedback for Phrase Recognition
17 0.60714549 52 nips-2003-Different Cortico-Basal Ganglia Loops Specialize in Reward Prediction at Different Time Scales
18 0.60677552 41 nips-2003-Boosting versus Covering
19 0.60666639 101 nips-2003-Large Margin Classifiers: Convex Loss, Low Noise, and Convergence Rates
20 0.59868222 91 nips-2003-Inferring State Sequences for Non-linear Systems with Embedded Hidden Markov Models