nips nips2007 nips2007-171 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Victoria Manfredi, Jim Kurose
Abstract: We address the problem of adaptive sensor control in dynamic resource-constrained sensor networks. We focus on a meteorological sensing network comprising radars that can perform sector scanning rather than always scanning 360°. We compare three sector scanning strategies. The sit-and-spin strategy always scans 360°. The limited lookahead strategy additionally uses the expected environmental state K decision epochs in the future, as predicted from Kalman filters, in its decision-making. The full lookahead strategy uses all expected future states by casting the problem as a Markov decision process and using reinforcement learning to estimate the optimal scan strategy. We show that the main benefits of using a lookahead strategy are when there are multiple meteorological phenomena in the environment, and when the maximum radius of any phenomenon is sufficiently smaller than the radius of the radars. We also show that there is a trade-off between the average quality with which a phenomenon is scanned and the number of decision epochs before which a phenomenon is rescanned.
Reference: text
sentIndex sentText sentNum sentScore
1 We focus on a meteorological sensing network comprising radars that can perform sector scanning rather than always scanning 360°. [sent-4, score-1.068]
2 The full lookahead strategy uses all expected future states by casting the problem as a Markov decision process and using reinforcement learning to estimate the optimal scan strategy. [sent-8, score-0.677]
3 We show that the main benefits of using a lookahead strategy are when there are multiple meteorological phenomena in the environment, and when the maximum radius of any phenomenon is sufficiently smaller than the radius of the radars. [sent-9, score-0.784]
4 We also show that there is a trade-off between the average quality with which a phenomenon is scanned and the number of decision epochs before which a phenomenon is rescanned. [sent-10, score-0.619]
5 1 Introduction. Traditionally, meteorological radars, such as the National Weather Service NEXRAD system, are tasked to always scan 360 degrees. [sent-11, score-0.574]
6 Since not all meteorological phenomena can be observed all of the time with the highest degree of fidelity, the radars must decide how best to perform scanning. [sent-13, score-0.543]
7 While we focus on the problem of how to perform sector scanning in such an adaptive meteorological sensing network, it is an instance of the larger class of problems of adaptive sensor control in dynamic resource-constrained sensor networks. [sent-14, score-0.779]
8 Given the ability of a network of radars to perform sector scanning, how should scanning be adapted at each decision epoch? [sent-15, score-0.623]
9 In this work we examine three methods for adapting the radar scan strategy. [sent-18, score-0.828]
10 Finally, the full lookahead strategy has an infinite horizon: it uses all expected future states by casting the problem as a Markov decision process and using reinforcement learning to estimate the optimal scan strategy. [sent-22, score-0.677]
11 We first introduce the meteorological radar control problem and show how to constrain the problem so that it is amenable to reinforcement learning methods. [sent-25, score-0.816]
12 We then identify conditions under which the computational cost of an infinite horizon radar scan strategy such as reinforcement learning is necessary. [sent-26, score-0.976]
13 With respect to the radar meteorological application, we show that the main benefits of considering expected future states are when there are multiple meteorological phenomena in the environment, and when the maximum radius of any phenomenon is sufficiently smaller than the radius of the radars. [sent-27, score-1.354]
14 We also show that there is a trade-off between the average quality with which a phenomenon is scanned and the number of decision epochs before which a phenomenon is rescanned. [sent-28, score-0.619]
15 In contrast to other work on radar control (see Section 5), we focus on tracking meteorological phenomena and the time frame over which to evaluate control decisions. [sent-30, score-0.93]
16 Section 5 reviews related work on control and resource allocation in radar and sensor networks. [sent-35, score-0.546]
17 Each radar has a set of scan actions from which it chooses. [sent-40, score-0.85]
18 In the simplest case, a radar scan action determines the size of the sector to scan, the start angle, the end angle, and the angle of elevation. [sent-41, score-1.039]
19 An effective scanning strategy must balance scanning small sectors (thus implicitly not scanning other sectors), to ensure that phenomena are correctly identified, with scanning a variety of sectors, to ensure that no phenomena are missed. [sent-44, score-1.197]
20 The radar control problem is that of dynamically choosing the scan strategy of the radars over time to maximize quality while minimizing inter-scan time. [sent-49, score-1.241]
21 3 Scan Strategies. We define a radar configuration to be the start and end angles of the sector to be scanned by an individual radar for a fixed interval of time. [sent-50, score-1.252]
22 We define a scan action to be a set of radar configurations (one configuration for each radar in the meteorological sensing network). [sent-51, score-1.624]
23 We define a scan strategy to be an algorithm for choosing scan actions. [sent-52, score-0.749]
24 In Section 3.1 we define the quality function associated with different radar configurations, [sent-54, score-0.623]
25 and in Section 3.2 we define the quality functions associated with different scan strategies. [sent-55, score-0.481]
26 3.1 Quality Function. The quality function associated with a given scan action was proposed by radar meteorologists in [5] and has two components. [sent-57, score-0.995]
27 There is a quality component U_p associated with scanning a particular phenomenon p. [sent-58, score-0.468]
28 There is also a quality component U_s associated with scanning a sector, which is independent of any phenomena in that sector. [sent-59, score-0.464]
29 Let s_r be the radar configuration for a single radar r and let S_r be the scan action under consideration. [sent-60, score-1.555]
30 U_p(p, s_r) is the quality obtained for scanning phenomenon p using a specific radar r and radar configuration s_r. [sent-87, score-1.864]
31 F_c captures the effect on quality due to the percentage of the phenomenon covered; to usefully scan a phenomenon, at least 95% of the phenomenon must be scanned. [sent-89, score-0.715]
32 F_w captures the effect of radar rotation speed on quality; as rotation speed is reduced, quality increases. [sent-90, score-0.623]
33 F_d captures the effects of the distance from the radar to the geometrical center of the phenomenon on quality; the further away the radar center is from the phenomenon being scanned, the more degraded will be the scan quality due to attenuation. [sent-91, score-1.727]
34 Due to the F_w function, the quality function U_p(p, s_r) outputs the same quality for scan angles of 181° to 360°. [sent-92, score-0.845]
35 For example, suppose a storm cell is about to move into a high-quality multi-doppler region (i. [sent-97, score-0.558]
36 By considering future expected states, a lookahead strategy can anticipate this event and have all radars focused on the storm cell when it enters the multi-doppler region, rather than expending resources (with little “reward”) to scan the storm cell just before it enters this region. [sent-100, score-1.823]
37 The A matrix reflects that storm cells typically move to the north-east. [sent-110, score-0.508]
38 We use the first location measurement of a storm cell, y_0, augmented with the observed velocity, as the initial state x_0. [sent-115, score-0.658]
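The multi-step prediction that a limited lookahead strategy relies on can be illustrated with the prediction half of a Kalman filter. The following is a minimal sketch, not the paper's model: it assumes a constant-velocity state [x, y, vx, vy], a hypothetical time step dt, and hypothetical covariances P and Q; the paper's actual A matrix is not reproduced here.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def constant_velocity_matrix(dt):
    """Assumed transition matrix for a state [x, y, vx, vy] (illustrative only)."""
    return [[1.0, 0.0, dt, 0.0],
            [0.0, 1.0, 0.0, dt],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def predict(x, p, a, q, steps):
    """Apply the Kalman prediction step `steps` times with no measurement update.

    x: state mean, p: state covariance, a: transition matrix, q: process noise.
    """
    for _ in range(steps):
        x = [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]
        apa = mat_mul(mat_mul(a, p), transpose(a))
        p = [[apa[i][j] + q[i][j] for j in range(len(q))] for i in range(len(q))]
    return x, p


# Initial state: first measured location augmented with the observed velocity.
x0 = [10.0, 20.0, 2.0, 1.0]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
x2, p2 = predict(x0, identity, constant_velocity_matrix(0.5), zeros, steps=2)
```

Here x2 is the expected state two decision epochs ahead, and p2 grows with each step because no measurement corrects the prediction.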
39 The optimal set of radar configurations is then S*_{r,1} = argmax_{S_{r,1}} U_K(S_{r,1} | T_r). [sent-123, score-0.485]
40 To account for the decay of quality for unscanned sectors and phenomena, and to consider the possibility of new phenomena appearing, we restrict S_r to be those scan actions that ensure that every sector has been scanned at least once in the last T_r decision epochs. [sent-124, score-1.027]
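The rescan restriction described above can be sketched as a filter over candidate actions. This is an illustrative single-radar version under stated assumptions: sectors are identified by quadrant index 0 to 3, `ages` holds the number of epochs since each quadrant was last scanned (0 means scanned in the most recent epoch), and "within the last T_r epochs" is read as every age staying below T_r after the action. The function names are hypothetical.

```python
def next_ages(ages, scanned):
    """Ages after one epoch: scanned quadrants reset to 0, the rest grow by 1."""
    return tuple(0 if q in scanned else age + 1 for q, age in enumerate(ages))


def feasible_actions(ages, actions, t_r):
    """Keep only actions after which every quadrant was scanned in the last t_r epochs."""
    return [a for a in actions if max(next_ages(ages, a)) < t_r]


ages = (0, 1, 2, 3)
candidates = [frozenset({0}), frozenset({3}),
              frozenset({0, 1, 2, 3}), frozenset({2, 3})]
allowed = feasible_actions(ages, candidates, t_r=4)
```

In this example, scanning only quadrant 0 is rejected because quadrant 3 would then have gone four epochs without a scan.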
41 We formulate the radar control problem as a Markov decision process (MDP) and use reinforcement learning to obtain a lookahead scan strategy as follows. [sent-127, score-1.163]
42 While a POMDP (partially observable MDP) could be used to model the environmental uncertainty, due to the cost of solving a POMDP with a large state space [9], we choose to formulate the radar control problem as an MDP with quality (or uncertainty) variables as in an augmented MDP [6]. [sent-128, score-0.755]
43 The state is a function of the observed number of storms, the observed x, y velocity of each storm, and the observed dimensions of each storm cell given by x, y center of mass and radius. [sent-130, score-0.739]
44 u_p is the quality U_p(·) with which each storm cell was observed, and u_s is the current quality U_s(·) of each 90° subsector, starting at 0°, 90°, 180°, or 270°. [sent-133, score-0.834]
45 This is the set of radar configurations for a given decision epoch. [sent-135, score-0.553]
46 We restrict each radar to scanning subsectors that are a multiple of 90°, starting at 0°, 90°, 180°, or 270°. [sent-136, score-0.698]
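The size of the resulting action space can be checked by enumeration. The sketch below assumes each configuration is represented as the set of 90° quadrants it covers (identified by start angle), so that a 360° scan counts once regardless of its nominal start angle; the representation and function names are illustrative choices, not taken from the paper.

```python
from itertools import product

QUADRANT_STARTS = (0, 90, 180, 270)
SECTOR_WIDTHS = (90, 180, 270, 360)


def radar_configurations():
    """Enumerate distinct sectors for one radar as sets of quadrant start angles."""
    configs = set()
    for start in QUADRANT_STARTS:
        for width in SECTOR_WIDTHS:
            quadrants = frozenset((start + offset) % 360
                                  for offset in range(0, width, 90))
            configs.add(quadrants)
    # Sort for a stable order: smaller sectors first, then by start angle.
    return sorted(configs, key=lambda c: (len(c), sorted(c)))


def scan_actions(num_radars):
    """A scan action assigns one configuration to every radar in the network."""
    return list(product(radar_configurations(), repeat=num_radars))


single_radar = radar_configurations()
two_radars = scan_actions(2)
```

Under this representation a single radar has 13 distinct configurations (four each of 90°, 180°, and 270°, plus one full scan), and the joint action count grows as 13 to the power of the number of radars.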
47 The transition function T (S × A × S) → [0, 1] encodes the observed environment dynamics: specifically the appearance, disappearance, and movement of storm cells and their associated attributes. [sent-138, score-0.588]
48 For meteorological radar control, the next state really is a function of not just the current state but also the action executed in the current state. [sent-139, score-0.811]
49 For instance, if a radar scans 180 degrees rather than 360 degrees, then any new storm cells that appear in the unscanned areas will not be observed. [sent-140, score-1.079]
50 Thus, the new storm cells that will be observed will depend on the scanning action of the radar. [sent-141, score-0.773]
51 The cost function C(S, A, S) → R encodes the goals of the radar sensing network. [sent-142, score-0.557]
52 C is a function of the error between the true state and the observed state, whether all storms have been observed, and a penalty term for not rescanning a storm within T_r decision epochs. [sent-143, score-0.751]
53 The quality with which a storm is observed determines the difference between the observed and true values of its attributes. [sent-145, score-0.67]
54 We use linear Sarsa(λ) [15] as the reinforcement learning algorithm to solve the MDP for the radar control problem. [sent-146, score-0.585]
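For readers unfamiliar with the algorithm, one update of linear Sarsa(λ) with binary features (such as tile-coding features) can be sketched as follows. This is a generic textbook-style version with accumulating eligibility traces, not the authors' implementation; treating the reward as the negative of the cost, and all parameter values, are assumptions made for illustration.

```python
def q_value(theta, active):
    """Linear action value: sum of the weights of the active binary features."""
    return sum(theta[i] for i in active)


def sarsa_lambda_step(theta, traces, active, reward, next_active,
                      alpha, gamma, lam):
    """Apply one in-place Sarsa(lambda) update and return the TD error.

    active / next_active: indices of features active for (s, a) and (s', a').
    """
    # Decay all traces, then accumulate on the features of the current pair.
    for i in range(len(traces)):
        traces[i] *= gamma * lam
    for i in active:
        traces[i] += 1.0
    # Temporal-difference error for the on-policy transition.
    delta = reward + gamma * q_value(theta, next_active) - q_value(theta, active)
    # Move every weight in proportion to its eligibility trace.
    for i in range(len(theta)):
        theta[i] += alpha * delta * traces[i]
    return delta


theta = [0.0, 0.0, 0.0, 0.0]
traces = [0.0, 0.0, 0.0, 0.0]
delta = sarsa_lambda_step(theta, traces, active=[0, 2], reward=-1.0,
                          next_active=[1], alpha=0.5, gamma=0.9, lam=0.8)
```

With all weights starting at zero, the first update simply pushes the weights of the two active features toward the observed reward.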
55 A new storm cell can appear anywhere within the rectangle and a maximum number of cells can be present on any decision epoch. [sent-152, score-0.675]
56 When the (x, y) center of a storm cell is no longer within range of any radar, the cell is removed from the environment. [sent-153, score-0.665]
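The removal rule above amounts to a distance test against every radar. A minimal sketch, assuming planar (x, y) coordinates in km and hypothetical radar positions and radius (none of these values come from the paper):

```python
import math


def in_range_of_any_radar(center, radar_positions, radar_radius_km):
    """True if the cell center lies within the footprint of at least one radar."""
    return any(math.hypot(center[0] - rx, center[1] - ry) <= radar_radius_km
               for rx, ry in radar_positions)


def prune_cells(cell_centers, radar_positions, radar_radius_km):
    """Drop cells whose center has left the range of all radars."""
    return [c for c in cell_centers
            if in_range_of_any_radar(c, radar_positions, radar_radius_km)]


radars = [(0.0, 0.0), (30.0, 0.0)]  # hypothetical radar locations
cells = [(5.0, 5.0), (15.0, 0.0), (30.0, 9.0)]
remaining = prune_cells(cells, radars, radar_radius_km=10.0)
```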
57 We derive the maximum storm cell radius from [11], which uses 2. [sent-155, score-0.636]
58 We then permit a storm cell’s radius to range from 1 to 4 km. [sent-157, score-0.55]
59 To determine the range of storm cell velocities, we use 39 real storm cell tracks obtained from meteorologists. [sent-158, score-1.116]
60 To obtain a storm cell’s (x, y) velocity, we then sample the appropriate Gaussian distribution. [sent-171, score-0.472]
61 To simulate the environment transitions we use a stochastic model of rainfall in which storm cell arrivals are modeled using a spatio-temporal Poisson process, see [11, 1]. [sent-172, score-0.615]
62 To determine the number of new storm cells to add during a decision epoch, we sample a Poisson random variable with rate ληδaδt with λ = 0. [sent-173, score-0.576]
63 From the radar setup we have δa = 90 · 60 km², and from the 30-second decision epoch we have δt = 0. [sent-176, score-0.605]
64 New storm cells are uniformly randomly distributed in the 90 km × 60 km region and we uniformly randomly choose new storm cell attributes from their range of values. [sent-178, score-1.102]
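The arrival process described above can be sketched in a few lines. The Poisson rate is left as a parameter because its component values are truncated in this extract; the sampler uses Knuth's multiplication method, and the dictionary layout and function names are illustrative assumptions. Only position and radius are sampled here, since the velocity ranges are not given in the extract.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count using Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, running_product = 0, rng.random()
    while running_product > limit:
        count += 1
        running_product *= rng.random()
    return count


def new_storm_cells(rate, rng, width_km=90.0, height_km=60.0,
                    min_radius_km=1.0, max_radius_km=4.0):
    """Sample the cells arriving in one decision epoch, placed uniformly."""
    return [{"x": rng.uniform(0.0, width_km),
             "y": rng.uniform(0.0, height_km),
             "radius": rng.uniform(min_radius_km, max_radius_km)}
            for _ in range(sample_poisson(rate, rng))]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
arrivals = new_storm_cells(rate=3.0, rng=rng)  # rate value is hypothetical
```

Knuth's method is adequate for the small rates typical of a single short epoch; for large rates a different sampler would be preferable.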
65 The following simplified radar model determines how well the radars observe the true environmental state under a given set of radar configurations. [sent-180, score-1.235]
66 If a storm cell p is scanned using a set of radar configurations S_r, the location, velocity, and radius attributes are observed as a function of the U_p(p, S_r) quality defined in Section 3. [sent-181, score-1.421]
67 Since u depends on the decision epoch t, for the k-step look-ahead scan strategy we also use σ_t = (1 − u_t) V^max / ρ to compute the measurement error covariance matrix, R, in our Kalman filter. [sent-185, score-0.575]
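The noise formula above can be made concrete with a short sketch. It assumes that the quality u lies in [0, 1], that V^max is the largest value an attribute can take, and that R is diagonal with the squared standard deviations on its diagonal; the last point in particular is an assumption, since the extract does not say how R is assembled.

```python
def measurement_sigma(u, v_max, rho):
    """Standard deviation of the observation noise: (1 - u) * v_max / rho."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("quality u is assumed to lie in [0, 1]")
    return (1.0 - u) * v_max / rho


def measurement_covariance(sigmas):
    """Assumed diagonal R with one variance per observed attribute."""
    n = len(sigmas)
    return [[sigmas[i] ** 2 if i == j else 0.0 for j in range(n)]
            for i in range(n)]


perfect = measurement_sigma(1.0, v_max=4.0, rho=0.5)   # no noise at full quality
unseen = measurement_sigma(0.0, v_max=4.0, rho=0.5)    # maximal noise at quality 0
r_matrix = measurement_covariance([2.0, 3.0])
```

The formula makes the noise shrink linearly to zero as scan quality approaches 1, which is how scan decisions feed back into the filter's confidence.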
68 We assume that any unobserved storm cell has been observed with quality 0, hence u = 0. [sent-187, score-0.719]
69 If a storm cell is not seen within T_r = 4 decision epochs, a penalty of P_r = 200 is given. [sent-191, score-0.723]
70 Using the value 200 ensures that if a storm cell has not been rescanned within the appropriate amount of time, this part of the cost function will dominate. [sent-192, score-0.607]
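The penalty term can be sketched in isolation. The version below covers only the rescan penalty, not the full cost function, and it assumes that "not seen within T_r epochs" means more than T_r epochs have elapsed since the cell's last scan; the function name and input format are hypothetical.

```python
def rescan_penalty(epochs_since_scan, t_r=4, p_r=200.0):
    """Sum a fixed penalty p_r for every storm cell overdue for a rescan.

    epochs_since_scan: one entry per storm cell, counting decision epochs
    since that cell was last scanned.
    """
    return sum(p_r for age in epochs_since_scan if age > t_r)


penalty = rescan_penalty([0, 4, 5, 9])  # two cells are overdue
```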
71 We distinguish the true environmental state known only to the simulator from the observed environmental state used by the scan strategies for several reasons. [sent-193, score-0.592]
72 Although radars provide measurements about meteorological phenomena, the true attributes of the phenomena are unknown. [sent-194, score-0.57]
73 Poor overlap in a dual-Doppler area, scanning a subsector too quickly or slowly, or being unable to obtain a sufficient number of elevation scans will degrade the quality of the measurements. [sent-195, score-0.481]
74 Additionally, when a radar scans a subsector, it obtains more accurate estimates of the phenomena in that subsector than if it had scanned a full 360°, but less accurate estimates of the phenomena outside the subsector. [sent-197, score-0.928]
75 0; for the (x, y) velocity, phenomenon confidence, and radar sector confidence tilings, we use a granularity of 0. [sent-212, score-0.781]
76 Figure 2(b) shows the average difference in scan quality between the learned Sarsa(λ) strategy and sit-and-spin and 2-step strategies. [sent-217, score-0.544]
77 Examining the learned strategy showed that when there was at most one storm with observation noise 1/ρ = 0. [sent-225, score-0.552]
78 Figure 2(c) shows the average difference in cost between the learned Sarsa(λ) scan strategy and the sit-and-spin and 2-step strategies for a 30 km radar radius. [sent-228, score-1.018]
79 Note that for the sit-and-spin CDF, P [X ≤ 1] is not 1; due to noise, for example, the measured location of a storm cell may be (expected) outside any radar footprint and consequently the storm cell will not be observed. [sent-231, score-1.614]
80 We hypothesize that this trade-off occurs because increasing the size of the scan sectors ensures that inter-scan time is minimized, but decreases the scan quality. [sent-234, score-0.755]
81 Other results (not shown, see [7]) examine the average difference in quality between the 1-step and 2-step strategies for 10 km and 30 km radar radii. [sent-235, score-0.773]
82 We hypothesize that this is a consequence of the maximum storm cell radius, 4 km, relative to the 10 km radar radius. [sent-237, score-1.1]
83 With a 30 km radius and at most eight storm cells, the 2-step quality is about 0. [sent-238, score-0.732]
84 Now recall that Figure 2(b) shows that with a 30 km radius and at most four storm cells, the 2-step quality is as much as 0. [sent-241, score-0.732]
85 Instead, the primary value of reinforcement learning for the radar control problem is balancing multiple conflicting goals, i. [sent-245, score-0.585]
86 Implementing the learned reinforcement learning scan strategy in a real meteorological radar network requires addressing the differences between the offline environment in which the learned strategy is trained, and the online environment in which the strategy is deployed. [sent-248, score-1.426]
87 [Figure 2 legend: 2-step − Sarsa and sit-and-spin − Sarsa, each for max 1 storm and max 4 storms.] [sent-256, score-1.293]
88 [Figure 2 panel title: Max # of Storms = 4, Radar Radius = 30 km; same legend as above.] [sent-270, score-1.293]
89 [Figure 2(d) x-axis: number of decision epochs between storm scans.] Figure 2: Comparing the scan strategies based on quality, cost, and inter-scan time. [sent-296, score-1.089]
90 With respect to radar control, [4] examines the problem of using agile radars on airplanes to detect and track ground targets. [sent-303, score-0.702]
91 They show that lookahead scan strategies for radar tracking of a ground target outperform myopic strategies. [sent-304, score-1.044]
92 [16] examines where to target radar beams and which waveform to use for electronically steered phased array radars. [sent-307, score-0.498]
93 They then choose the scan mode for each target that has both the longest revisit time for scanning a target and error covariance below a threshold. [sent-309, score-0.556]
94 6 Conclusions and Future Work. In this work we compared the performance of myopic and lookahead scan strategies in the context of the meteorological radar control problem. [sent-311, score-1.282]
95 We showed that the main benefits of using a lookahead strategy are when there are multiple meteorological phenomena in the environment, and when the maximum radius of any phenomenon is sufficiently smaller than the radius of the radars. [sent-312, score-0.784]
96 We also showed that there is a trade-off between the average quality with which a phenomenon is scanned and the number of decision epochs before which a phenomenon is rescanned. [sent-313, score-0.619]
97 Overall, considering only scan quality, a simple lookahead strategy is sufficient. [sent-314, score-0.51]
98 We would also like to incorporate more radar and meteorological information into the transition, measurement, and cost functions. [sent-317, score-0.737]
99 Improved radar sensitivity through limited sector scanning: The DCAS approach. [sent-331, score-0.651]
100 Comparison of myopic and lookahead scan strategies for meteorological radars. [sent-363, score-0.761]
wordName wordTfidf (topN-words)
[('radar', 0.485), ('storm', 0.472), ('scan', 0.343), ('meteorological', 0.231), ('sr', 0.213), ('scanning', 0.213), ('radars', 0.176), ('sector', 0.166), ('sarsa', 0.158), ('quality', 0.138), ('storms', 0.12), ('phenomenon', 0.117), ('phenomena', 0.113), ('lookahead', 0.104), ('scanned', 0.103), ('cell', 0.086), ('radius', 0.078), ('epochs', 0.076), ('scans', 0.068), ('decision', 0.068), ('reinforcement', 0.064), ('strategy', 0.063), ('strategies', 0.062), ('velocity', 0.058), ('environment', 0.057), ('sectors', 0.056), ('epoch', 0.052), ('sensing', 0.051), ('subsector', 0.046), ('fw', 0.045), ('km', 0.044), ('environmental', 0.042), ('np', 0.042), ('sitandspin', 0.037), ('control', 0.036), ('cells', 0.036), ('kalman', 0.036), ('attributes', 0.036), ('tr', 0.035), ('fd', 0.034), ('gurations', 0.033), ('state', 0.033), ('mdp', 0.032), ('measurement', 0.031), ('fc', 0.031), ('tilings', 0.029), ('tracking', 0.029), ('action', 0.029), ('additionally', 0.029), ('agile', 0.028), ('manfredi', 0.028), ('rescanned', 0.028), ('con', 0.026), ('guration', 0.026), ('sensor', 0.025), ('attribute', 0.025), ('latitude', 0.024), ('longitude', 0.024), ('observed', 0.023), ('actions', 0.022), ('future', 0.021), ('penalty', 0.021), ('cost', 0.021), ('center', 0.021), ('myopic', 0.021), ('donovan', 0.018), ('kurose', 0.018), ('mclaughlin', 0.018), ('unscanned', 0.018), ('ahead', 0.018), ('tunable', 0.018), ('pomdp', 0.018), ('comprising', 0.018), ('max', 0.018), ('pm', 0.017), ('noise', 0.017), ('angle', 0.016), ('hmax', 0.016), ('weather', 0.016), ('elevation', 0.016), ('sit', 0.016), ('adaptive', 0.016), ('ti', 0.016), ('robotics', 0.015), ('true', 0.014), ('targets', 0.014), ('spin', 0.014), ('casting', 0.014), ('location', 0.013), ('angles', 0.013), ('plus', 0.013), ('granularity', 0.013), ('amherst', 0.013), ('episode', 0.013), ('helicopter', 0.013), ('waveform', 0.013), ('tile', 0.013), ('rectangle', 0.013), ('hypothesize', 0.013), ('track', 0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 171 nips-2007-Scan Strategies for Meteorological Radars
2 0.087867759 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
Author: Alessandro Lazaric, Marcello Restelli, Andrea Bonarini
Abstract: Learning in real-world domains often requires to deal with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithms to continuous state problems, the same techniques can be hardly extended to continuous action spaces, where, besides the computation of a good approximation of the value function, a fast method for the identification of the highest-valued action is needed. In this paper, we propose a novel actor-critic approach in which the policy of the actor is estimated through sequential Monte Carlo methods. The importance sampling step is performed on the basis of the values learned by the critic, while the resampling step modifies the actor’s policy. The proposed approach has been empirically compared to other learning algorithms into several domains; in this paper, we report results obtained in a control problem consisting of steering a boat across a river. 1
3 0.067486726 56 nips-2007-Configuration Estimates Improve Pedestrian Finding
Author: Duan Tran, David A. Forsyth
Abstract: Fair discriminative pedestrian finders are now available. In fact, these pedestrian finders make most errors on pedestrians in configurations that are uncommon in the training data, for example, mounting a bicycle. This is undesirable. However, the human configuration can itself be estimated discriminatively using structure learning. We demonstrate a pedestrian finder which first finds the most likely human pose in the window using a discriminative procedure trained with structure learning on a small dataset. We then present features (local histogram of oriented gradient and local PCA of gradient) based on that configuration to an SVM classifier. We show, using the INRIA Person dataset, that estimates of configuration significantly improve the accuracy of a discriminative pedestrian finder. 1
4 0.047813095 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
Author: Omer Bobrowski, Ron Meir, Shy Shoham, Yonina Eldar
Abstract: It is becoming increasingly evident that organisms acting in uncertain dynamical environments often employ exact or approximate Bayesian statistical calculations in order to continuously estimate the environmental state, integrate information from multiple sensory modalities, form predictions and choose actions. What is less clear is how these putative computations are implemented by cortical neural networks. An additional level of complexity is introduced because these networks observe the world through spike trains received from primary sensory afferents, rather than directly. A recent line of research has described mechanisms by which such computations can be implemented using a network of neurons whose activity directly represents a probability distribution across the possible “world states”. Much of this work, however, uses various approximations, which severely restrict the domain of applicability of these implementations. Here we make use of rigorous mathematical results from the theory of continuous time point process filtering, and show how optimal real-time state estimation and prediction may be implemented in a general setting using linear neural networks. We demonstrate the applicability of the approach with several examples, and relate the required network properties to the statistical nature of the environment, thereby quantifying the compatibility of a given network with its environment. 1
5 0.046553526 148 nips-2007-Online Linear Regression and Its Application to Model-Based Reinforcement Learning
Author: Alexander L. Strehl, Michael L. Littman
Abstract: We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernalized) linearly parameterized dynamics. This result builds on Kearns and Singh’s work that provides a provably efficient algorithm for finite state MDPs. Our approach is not restricted to the linear setting, and is applicable to other classes of continuous MDPs.
6 0.043068245 191 nips-2007-Temporal Difference Updating without a Learning Rate
7 0.03996475 30 nips-2007-Bayes-Adaptive POMDPs
8 0.03559722 100 nips-2007-Hippocampal Contributions to Control: The Third Way
9 0.033654585 162 nips-2007-Random Sampling of States in Dynamic Programming
10 0.033408247 215 nips-2007-What makes some POMDP problems easy to approximate?
11 0.031354971 140 nips-2007-Neural characterization in partially observed populations of spiking neurons
12 0.029894296 55 nips-2007-Computing Robust Counter-Strategies
13 0.029862916 163 nips-2007-Receding Horizon Differential Dynamic Programming
14 0.029709609 154 nips-2007-Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression
15 0.029334631 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
16 0.029292649 204 nips-2007-Theoretical Analysis of Heuristic Search Methods for Online POMDPs
17 0.029120211 124 nips-2007-Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
18 0.029055936 27 nips-2007-Anytime Induction of Cost-sensitive Trees
19 0.028295673 116 nips-2007-Learning the structure of manifolds using random projections
20 0.02785047 98 nips-2007-Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion
topicId topicWeight
[(0, -0.086), (1, -0.057), (2, 0.042), (3, -0.023), (4, -0.021), (5, 0.055), (6, -0.002), (7, 0.033), (8, 0.009), (9, -0.012), (10, -0.028), (11, 0.017), (12, 0.026), (13, -0.028), (14, 0.008), (15, 0.025), (16, -0.017), (17, 0.01), (18, 0.071), (19, 0.017), (20, 0.012), (21, 0.02), (22, 0.006), (23, 0.036), (24, -0.046), (25, -0.011), (26, -0.024), (27, 0.014), (28, 0.0), (29, 0.007), (30, -0.042), (31, -0.013), (32, 0.025), (33, 0.043), (34, 0.023), (35, 0.042), (36, -0.057), (37, 0.012), (38, 0.062), (39, -0.033), (40, 0.04), (41, -0.011), (42, 0.054), (43, 0.074), (44, 0.16), (45, 0.132), (46, -0.094), (47, -0.073), (48, 0.008), (49, -0.103)]
simIndex simValue paperId paperTitle
same-paper 1 0.93218911 171 nips-2007-Scan Strategies for Meteorological Radars
2 0.52900708 56 nips-2007-Configuration Estimates Improve Pedestrian Finding
Author: Duan Tran, David A. Forsyth
Abstract: Fair discriminative pedestrian finders are now available. In fact, these pedestrian finders make most errors on pedestrians in configurations that are uncommon in the training data, for example, mounting a bicycle. This is undesirable. However, the human configuration can itself be estimated discriminatively using structure learning. We demonstrate a pedestrian finder which first finds the most likely human pose in the window using a discriminative procedure trained with structure learning on a small dataset. We then present features (local histogram of oriented gradient and local PCA of gradient) based on that configuration to an SVM classifier. We show, using the INRIA Person dataset, that estimates of configuration significantly improve the accuracy of a discriminative pedestrian finder. 1
3 0.44069654 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
Author: Alessandro Lazaric, Marcello Restelli, Andrea Bonarini
Abstract: Learning in real-world domains often requires to deal with continuous state and action spaces. Although many solutions have been proposed to apply Reinforcement Learning algorithms to continuous state problems, the same techniques can be hardly extended to continuous action spaces, where, besides the computation of a good approximation of the value function, a fast method for the identification of the highest-valued action is needed. In this paper, we propose a novel actor-critic approach in which the policy of the actor is estimated through sequential Monte Carlo methods. The importance sampling step is performed on the basis of the values learned by the critic, while the resampling step modifies the actor’s policy. The proposed approach has been empirically compared to other learning algorithms into several domains; in this paper, we report results obtained in a control problem consisting of steering a boat across a river. 1
4 0.42400029 59 nips-2007-Continuous Time Particle Filtering for fMRI
Author: Lawrence Murray, Amos J. Storkey
Abstract: We construct a biologically motivated stochastic differential model of the neural and hemodynamic activity underlying the observed Blood Oxygen Level Dependent (BOLD) signal in Functional Magnetic Resonance Imaging (fMRI). The model poses a difficult parameter estimation problem, both theoretically due to the nonlinearity and divergence of the differential system, and computationally due to its time and space complexity. We adapt a particle filter and smoother to the task, and discuss some of the practical approaches used to tackle the difficulties, including use of sparse matrices and parallelisation. Results demonstrate the tractability of the approach in its application to an effective connectivity study. 1
5 0.41924438 52 nips-2007-Competition Adds Complexity
Author: Judy Goldsmith, Martin Mundhenk
Abstract: It is known that determining whether a DEC-POMDP, namely, a cooperative partially observable stochastic game (POSG), has a cooperative strategy with positive expected reward is complete for NEXP. It was not known until now how cooperation affected that complexity. We show that, for competitive POSGs, the complexity of determining whether one team has a positive-expected-reward strategy is complete for NEXP^NP.
6 0.40960354 50 nips-2007-Combined discriminative and generative articulated pose and non-rigid shape estimation
7 0.38699535 191 nips-2007-Temporal Difference Updating without a Learning Rate
8 0.38623798 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
9 0.35206753 167 nips-2007-Regulator Discovery from Gene Expression Time Series of Malaria Parasites: a Hierachical Approach
10 0.34941399 203 nips-2007-The rat as particle filter
11 0.33902651 89 nips-2007-Feature Selection Methods for Improving Protein Structure Prediction with Rosetta
12 0.33751979 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
13 0.32936972 124 nips-2007-Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
14 0.32148531 91 nips-2007-Fitted Q-iteration in continuous action-space MDPs
15 0.31041288 113 nips-2007-Learning Visual Attributes
16 0.30841053 31 nips-2007-Bayesian Agglomerative Clustering with Coalescents
17 0.30623838 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events
18 0.3015835 193 nips-2007-The Distribution Family of Similarity Distances
19 0.29438546 30 nips-2007-Bayes-Adaptive POMDPs
20 0.27908167 137 nips-2007-Multiple-Instance Pruning For Learning Efficient Cascade Detectors
topicId topicWeight
[(5, 0.05), (13, 0.029), (16, 0.021), (18, 0.022), (21, 0.045), (31, 0.029), (34, 0.053), (35, 0.023), (47, 0.069), (66, 0.41), (83, 0.062), (85, 0.014), (87, 0.01), (90, 0.042)]
simIndex simValue paperId paperTitle
same-paper 1 0.73001832 171 nips-2007-Scan Strategies for Meteorological Radars
Author: Victoria Manfredi, Jim Kurose
Abstract: We address the problem of adaptive sensor control in dynamic resource-constrained sensor networks. We focus on a meteorological sensing network comprising radars that can perform sector scanning rather than always scanning 360°. We compare three sector scanning strategies. The sit-and-spin strategy always scans 360°. The limited lookahead strategy additionally uses the expected environmental state K decision epochs in the future, as predicted from Kalman filters, in its decision-making. The full lookahead strategy uses all expected future states by casting the problem as a Markov decision process and using reinforcement learning to estimate the optimal scan strategy. We show that the main benefits of using a lookahead strategy are when there are multiple meteorological phenomena in the environment, and when the maximum radius of any phenomenon is sufficiently smaller than the radius of the radars. We also show that there is a trade-off between the average quality with which a phenomenon is scanned and the number of decision epochs before which a phenomenon is rescanned.
2 0.72823346 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events
Author: Robert Peters, Laurent Itti
Abstract: Current computational models of bottom-up and top-down components of attention are predictive of eye movements across a range of stimuli and of simple, fixed visual tasks (such as visual search for a target among distractors). However, to date there exists no computational framework which can reliably mimic human gaze behavior in more complex environments and tasks, such as driving a vehicle through traffic. Here, we develop a hybrid computational/behavioral framework, combining simple models for bottom-up salience and top-down relevance, and looking for changes in the predictive power of these components at different critical event times during 4.7 hours (500,000 video frames) of observers playing car racing and flight combat video games. This approach is motivated by our observation that the predictive strengths of the salience and relevance models exhibit reliable temporal signatures during critical event windows in the task sequence; for example, when the game player directly engages an enemy plane in a flight combat game, the predictive strength of the salience model increases significantly, while that of the relevance model decreases significantly. Our new framework combines these temporal signatures to implement several event detectors. Critically, we find that an event detector based on fused behavioral and stimulus information (in the form of the model’s predictive strength) is much stronger than detectors based on behavioral information alone (eye position) or image information alone (model prediction maps). This approach to event detection, based on eye tracking combined with computational models applied to the visual input, may have useful applications as a less-invasive alternative to other event detection approaches based on neural signatures derived from EEG or fMRI recordings.
3 0.29922414 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
Author: Omer Bobrowski, Ron Meir, Shy Shoham, Yonina Eldar
Abstract: It is becoming increasingly evident that organisms acting in uncertain dynamical environments often employ exact or approximate Bayesian statistical calculations in order to continuously estimate the environmental state, integrate information from multiple sensory modalities, form predictions and choose actions. What is less clear is how these putative computations are implemented by cortical neural networks. An additional level of complexity is introduced because these networks observe the world through spike trains received from primary sensory afferents, rather than directly. A recent line of research has described mechanisms by which such computations can be implemented using a network of neurons whose activity directly represents a probability distribution across the possible “world states”. Much of this work, however, uses various approximations, which severely restrict the domain of applicability of these implementations. Here we make use of rigorous mathematical results from the theory of continuous time point process filtering, and show how optimal real-time state estimation and prediction may be implemented in a general setting using linear neural networks. We demonstrate the applicability of the approach with several examples, and relate the required network properties to the statistical nature of the environment, thereby quantifying the compatibility of a given network with its environment.
4 0.29779762 126 nips-2007-McRank: Learning to Rank Using Multiple Classification and Gradient Boosting
Author: Ping Li, Qiang Wu, Christopher J. Burges
Abstract: We cast the ranking problem as (1) multiple classification (“Mc”) (2) multiple ordinal classification, which lead to computationally tractable learning algorithms for relevance ranking in Web search. We consider the DCG criterion (discounted cumulative gain), a standard quality measure in information retrieval. Our approach is motivated by the fact that perfect classifications result in perfect DCG scores and the DCG errors are bounded by classification errors. We propose using the Expected Relevance to convert class probabilities into ranking scores. The class probabilities are learned using a gradient boosting tree algorithm. Evaluations on large-scale datasets show that our approach can improve LambdaRank [5] and the regressions-based ranker [6], in terms of the (normalized) DCG scores. An efficient implementation of the boosting tree algorithm is also presented.
5 0.29635948 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
Author: Matthias Bethge, Philipp Berens
Abstract: Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data: the model parameters can be derived in closed form and sampling is easy. Therefore, our NearMaxEnt approach can serve as a tool for testing predictions from a pairwise maximum entropy model not only for low-dimensional marginals, but also for high dimensional measurements of more than thousand units. We demonstrate its usefulness by studying natural images with dichotomized pixel intensities. Our results indicate that the statistics of such higher-dimensional measurements exhibit additional structure that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics surprisingly well up to the limit of dimensionality where estimation of the full joint distribution is feasible.
6 0.29203761 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
7 0.29156312 86 nips-2007-Exponential Family Predictive Representations of State
8 0.29119241 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
9 0.29022011 63 nips-2007-Convex Relaxations of Latent Variable Training
10 0.28988251 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
11 0.28702351 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
12 0.28699374 44 nips-2007-Catching Up Faster in Bayesian Model Selection and Model Averaging
13 0.28679743 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
14 0.2862666 100 nips-2007-Hippocampal Contributions to Control: The Third Way
15 0.28619865 174 nips-2007-Selecting Observations against Adversarial Objectives
16 0.28570315 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
17 0.28407866 140 nips-2007-Neural characterization in partially observed populations of spiking neurons
18 0.28376234 158 nips-2007-Probabilistic Matrix Factorization
19 0.28295064 115 nips-2007-Learning the 2-D Topology of Images
20 0.28283307 177 nips-2007-Simplified Rules and Theoretical Analysis for Information Bottleneck Optimization and PCA with Spiking Neurons