nips nips2005 nips2005-143 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, Yann LeCun
Abstract: We describe a vision-based obstacle avoidance system for off-road mobile robots. The system is trained from end to end to map raw input images to steering angles. It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. The robot is a 50cm off-road truck, with two forward-pointing wireless color cameras. A remote computer processes the video and controls the robot via radio. The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s.
Reference: text
sentIndex sentText sentNum sentScore
1 We describe a vision-based obstacle avoidance system for off-road mobile robots. [sent-4, score-0.457]
2 The system is trained from end to end to map raw input images to steering angles. [sent-5, score-0.868]
3 It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. [sent-6, score-1.357]
4 The robot is a 50cm off-road truck, with two forward-pointing wireless color cameras. [sent-7, score-0.316]
5 A remote computer processes the video and controls the robot via radio. [sent-8, score-0.463]
6 The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. [sent-9, score-0.282]
7 The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s. [sent-10, score-0.555]
8 Building a fully autonomous off-road vehicle that can reliably navigate and avoid obstacles at high speed is a major challenge for robotics, and a new domain of application for machine learning research. [sent-12, score-0.55]
9 The last few years have seen considerable progress toward that goal, particularly in areas such as mapping the environment from active range sensors and stereo cameras [11, 7], simultaneously navigating and building maps [6, 15], and classifying obstacle types. [sent-13, score-0.648]
10 Among the various sub-problems of off-road vehicle navigation, obstacle detection and avoidance is a subject of prime importance. [sent-14, score-0.572]
11 Avoiding obstacles by relying solely on camera input requires solving a highly complex vision problem. [sent-20, score-0.367]
12 A time-honored approach is to derive range maps from multiple images through multiple cameras or through motion [6, 5]. [sent-21, score-0.331]
13 Deriving steering angles to avoid obstacles from the range maps is a simple matter. [sent-22, score-0.983]
14 A large number of techniques have been proposed in the literature to construct range maps from stereo images. [sent-23, score-0.27]
15 2 End-To-End Learning for Obstacle Avoidance In general, computing depth from stereo images is an ill-posed problem, but the depth map is only a means to an end. [sent-27, score-0.302]
16 Ultimately, the output of an obstacle avoidance system is a set of possible steering angles that direct the robot toward traversable regions. [sent-28, score-1.342]
17 Our approach is to view the entire problem of mapping input stereo images to possible steering angles as a single indivisible task to be learned from end to end. [sent-29, score-0.903]
18 Our learning system takes raw color images from two forward-pointing cameras mounted on the robot, and maps them to a set of possible steering angles through a single trained function. [sent-30, score-1.219]
19 The training data was collected by recording the actions of a human driver together with the video data. [sent-31, score-0.436]
20 The human driver remotely drives the robot straight ahead until the robot encounters a non-traversable obstacle. [sent-32, score-0.85]
21 The human driver then avoids the obstacle by steering the robot in the appropriate direction. [sent-33, score-1.25]
22 The learning system takes a single pair of heavily-subsampled images from the two cameras, and is trained to predict the steering angle produced by the human driver at that time. [sent-35, score-1.103]
23 The learning architecture is a 6-layer convolutional network [9]. [sent-36, score-0.209]
24 The network takes the left and right 149×58 color images and produces two outputs. [sent-37, score-0.213]
25 A large value on the first output is interpreted as a left steering command while a large value on the second output indicates a right steering command. [sent-38, score-1.204]
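As a concrete illustration of this decoding, here is a minimal sketch; the threshold value and the tie-breaking rule are assumptions, since the paper does not specify how the two output values are converted into a discrete command.

```python
# Hypothetical decoding of the network's two outputs into a steering command.
# The threshold and tie-breaking rule are illustrative assumptions only.
def decode_steering(left_out: float, right_out: float,
                    threshold: float = 0.5) -> str:
    if left_out > threshold and left_out >= right_out:
        return "left"
    if right_out > threshold and right_out > left_out:
        return "right"
    return "straight"
```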
26 Each layer in a convolutional network can be viewed as a set of trainable, shift-invariant linear filters with local support, followed by a point-wise non-linear saturation function. [sent-39, score-0.326]
27 The main differences with ALVINN are: (1) our system uses stereo cameras; (2) it is trained for off-road obstacle avoidance rather than road following; (3) our trainable system uses a convolutional network rather than a traditional fully-connected neural net. [sent-43, score-0.777]
28 Convolutional nets are particularly well suited for our task because local feature detectors that combine inputs from the left and right images can be useful for estimating distances to obstacles (possibly by estimating disparities). [sent-46, score-0.48]
29 The key advantage of the approach is that the entire function from raw pixels to steering angles is trained from data, which completely eliminates the need for feature design and selection, geometry, camera calibration, and hand-tuning of parameters. [sent-48, score-0.891]
30 Another potential benefit of a pure learning-based approach is that the system may use other cues than stereo disparity to detect obstacles, possibly alleviating the short-sightedness of methods based purely on stereo matching. [sent-51, score-0.477]
31 3 Vehicle Hardware We built a small and light-weight vehicle which can be carried by a single person so as to facilitate data collection and testing in a wide variety of environments. [sent-52, score-0.249]
32 Using a small, rugged and low-cost robot allowed us to drive at relatively high speed without fear of causing damage to people, property or the robot itself. [sent-53, score-0.541]
33 Therefore, the robot has no significant on-board computing power. [sent-55, score-0.256]
34 A wireless link is used to transmit video and sensor readings to the remote computer. [sent-57, score-0.239]
35 Throttle and steering controls are sent from the computer to the robot through a regular radio control channel. [sent-58, score-0.823]
36 The robot chassis was built around a customized 1/10-th scale remote-controlled, electric-powered, four-wheel-drive truck which was roughly 50cm in length. [sent-59, score-0.297]
37 The typical speed of the robot during data collection and testing sessions was roughly 2 meters per second. [sent-60, score-0.286]
38 Two forward-pointing low-cost 1/3-inch CCD cameras were mounted 110mm apart behind a clear lexan window. [sent-61, score-0.189]
39 A pair of 900MHz analog video transmitters was used to send the camera outputs to the remote computer. [sent-64, score-0.316]
40 The analog video links were subject to high signal noise, color shifts, frequent interference, and occasional video drop-outs. [sent-65, score-0.238]
41 Figure 1: Left: The robot is a modified 50 cm-long truck platform controlled by a remote computer. [sent-70, score-0.436]
42 4 Data Collection During a data collection session, the human operator wears video goggles fed with the video signal from one of the robot’s cameras (no stereo), and controls the robot through a joystick connected to the PC. [sent-73, score-0.743]
43 During each run, the PC records the output of the two video cameras at 15 frames per second, together with the steering angle and throttle setting from the operator. [sent-74, score-1.042]
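To make the logged signals concrete, the following is a hedged sketch of what one recorded sample might contain; the field names and types are hypothetical, not the authors' actual on-disk format.

```python
# Hypothetical record for one logged sample during data collection.
# Field names are illustrative; the paper does not specify a storage format.
from dataclasses import dataclass
import numpy as np

@dataclass
class LoggedFrame:
    left_image: np.ndarray    # raw frame from the left camera
    right_image: np.ndarray   # raw frame from the right camera
    steering_angle: float     # operator's joystick steering at capture time
    throttle: float           # operator's throttle setting
    timestamp: float          # frames arrive at roughly 15 Hz
```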
44 It was necessary for the human driver to adopt a consistent obstacle avoidance behaviour. [sent-76, score-0.572]
45 To ensure this, the human driver was to drive the vehicle straight ahead whenever no obstacle was present within a threatening distance. [sent-77, score-0.727]
46 Whenever the robot approached an obstacle, the human driver had to steer left or right so as to avoid the obstacle. [sent-78, score-0.575]
47 Even though great care was taken in collecting the highest quality training data, there were a number of imperfections in the training data that could not be avoided: (a) The small-form-factor, low-cost cameras presented significant differences in their default settings. [sent-85, score-0.313]
48 In particular, the white balance settings of the two cameras were somewhat different; (b) To maximize image quality, the automatic gain control and automatic exposure were activated. [sent-86, score-0.188]
49 In particular, the AGC adjustments seem to react at different speeds and amplitudes; (c) Because of AGC, driving into the sunlight caused the images to become very dark and obstacles to become hard to detect; (d) The wireless video connection caused dropouts and distortions of some frames. [sent-88, score-0.598]
50 An example is shown in Figure 1; (e) The cameras were mounted rigidly on the vehicle and were exposed to vibration, despite the suspension. [sent-90, score-0.408]
51 Segments during which the robot was driven into position in preparation for a run were edited out. [sent-96, score-0.256]
52 The architecture of convolutional nets is somewhat inspired by the structure of biological visual systems. [sent-105, score-0.212]
53 The input to the convolutional net consists of 6 planes of size 149×58 pixels. [sent-107, score-0.282]
54 The six planes respectively contain the Y, U and V components for the left camera and the right camera. [sent-108, score-0.23]
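A minimal sketch of assembling the six input planes, assuming each camera frame has already been converted to YUV and subsampled to the 149×58 resolution stated above:

```python
# Illustrative assembly of the 6-plane network input from the two YUV frames.
# Assumes each frame is a (3, 149, 58) array in Y, U, V order.
import numpy as np

def make_input(left_yuv: np.ndarray, right_yuv: np.ndarray) -> np.ndarray:
    assert left_yuv.shape == right_yuv.shape == (3, 149, 58)
    return np.concatenate([left_yuv, right_yuv], axis=0)  # shape (6, 149, 58)
```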
55 Each layer in a convolutional net is composed of units organized in planes called feature maps. [sent-111, score-0.5]
56 Each unit in a feature map takes inputs from a small neighborhood within the feature maps of the previous layer. [sent-112, score-0.276]
57 Therefore, each feature map can be seen as convolving the feature maps of the previous layer with small-size kernels, and passing the sum of those convolutions through sigmoid functions. [sent-116, score-0.316]
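In symbols, this is the standard convolutional-layer formulation: writing $x_j$ for feature map $j$, $S_j$ for the subset of previous-layer maps connected to it, $k_{ij}$ for the trainable kernels, $b_j$ for a trainable bias, $*$ for 2D convolution, and $\sigma$ for a sigmoid such as $\tanh$,

$$x_j \;=\; \sigma\Big(b_j + \sum_{i \in S_j} k_{ij} * x_i\Big).$$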
58 The first layer contains 6 feature maps of size 147×56 connected to various combinations of the input maps through 3×3 kernels. [sent-118, score-0.409]
59 The first feature map is connected to the YUV planes of the left image, the second feature map to the YUV planes of the right image, and the other 4 feature maps to all 6 input planes. [sent-119, score-0.65]
60 Those 4 feature maps are binocular, and can learn filters that compare the location of features in the left and right images. [sent-120, score-0.232]
61 The next layer is an averaging/subsampling layer of size 49×14 whose purpose is to reduce the spatial resolution of the feature maps so as to build invariances to small geometric distortions of the input. [sent-122, score-0.46]
62 The 3-rd layer contains 24 feature maps of size 45×12. [sent-124, score-0.279]
63 Each feature map is connected to various subsets of maps in the previous layer through a total of 96 kernels of size 5×3. [sent-125, score-0.361]
64 The 4-th layer is an averaging/subsampling layer of size 9×4 with 5×3 subsampling ratios. [sent-126, score-0.282]
65 The 5-th layer contains 100 feature maps of size 1×1 connected to the 4-th layer through 2400 kernels of size 9×4 (full connection). [sent-127, score-0.426]
66 Finally, the output layer contains two units fully-connected to the 100 units in the 5-th layer. [sent-128, score-0.195]
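Putting the layer descriptions above together, here is a minimal sketch of the architecture in a modern framework (PyTorch). This is not the authors' original implementation: the sparse connection tables of layers 1 and 3 are approximated with full connectivity, the subsampling ratios of the second layer are inferred from the stated map sizes, and tanh stands in for the saturation function.

```python
# Hedged sketch of the 6-layer convolutional net described above.
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 6, kernel_size=(3, 3)),    # 6 YUV planes -> 6 maps of 147x56
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=(3, 4)),        # averaging/subsampling to 49x14
            nn.Conv2d(6, 24, kernel_size=(5, 3)),    # 24 maps of 45x12
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=(5, 3)),        # 5x3 subsampling to 9x4
            nn.Conv2d(24, 100, kernel_size=(9, 4)),  # full connection -> 100 maps of 1x1
            nn.Tanh(),
        )
        self.output = nn.Linear(100, 2)               # two units: left / right

    def forward(self, x):                             # x: (N, 6, 149, 58)
        return self.output(self.features(x).flatten(1))

net = SteeringNet()
out = net(torch.randn(1, 6, 149, 58))                 # -> shape (1, 2)
```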
67 The bottom half of Figure 2 shows the states of the six layers of the convolutional net. [sent-132, score-0.214]
68 The network as shown runs in about 60ms per image pair on the remote computer. [sent-135, score-0.209]
69 The reason is that the steering is fairly smooth in time (with long, stable periods), so the current rate of turn is an excellent predictor of the next desired steering angle. [sent-140, score-1.134]
70 Hence, a system trained with multiple frames would merely predict a steering angle equal to the current rate of turn as observed through the camera. [sent-142, score-0.927]
71 6 Results Two performance measurements were recorded: the average loss and the percentage of “correctly classified” steering angles. [sent-147, score-0.567]
72 The percentage of correctly classified steering angles measures the number of times the predicted steering angle, quantized into three bins (left, straight, right), agrees with the steering angle provided by the human driver. [sent-149, score-1.959]
73 Since the thresholds for deciding whether an angle counted as left, center, or right were somewhat arbitrary, the percentages cannot be interpreted in absolute terms, but merely as a relative figure of merit for comparing runs and architectures. [sent-150, score-0.219]
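A sketch of this three-bin measure follows; the threshold value below is illustrative only, since the paper notes the actual bin boundaries were somewhat arbitrary.

```python
# Sketch of the 3-bin "correctly classified" metric. THRESHOLD is a
# hypothetical bin boundary; the paper's actual thresholds are not given.
THRESHOLD = 0.1

def to_bin(angle: float) -> str:
    if angle < -THRESHOLD:
        return "left"
    if angle > THRESHOLD:
        return "right"
    return "straight"

def bin_accuracy(predicted_angles, human_angles):
    pairs = list(zip(predicted_angles, human_angles))
    return sum(to_bin(p) == to_bin(h) for p, h in pairs) / len(pairs)
```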
74 Figure 2: Internal state of the convolutional net for two sample frames. [sent-151, score-0.202]
75 The light-blue bars below show the steering angle produced by the system. [sent-153, score-0.715]
76 The bottom halves show the state of the layers of the network, where each column is a layer (the penultimate layer is not shown). [sent-154, score-0.274]
77 With 95,000 training image pairs, training took 18 epochs through the training set. [sent-157, score-0.193]
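A hedged sketch of the supervised training loop, reusing the SteeringNet module sketched earlier; the optimizer, learning rate, batch size, and mean-squared-error loss are assumptions, not taken from the paper.

```python
# Minimal supervised training loop for the end-to-end system. Optimizer,
# loss, and hyperparameters here are illustrative assumptions.
import torch
from torch.utils.data import DataLoader

def train(net, dataset, epochs=18, lr=1e-3, batch_size=32):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(net.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for images, targets in loader:  # images: (B, 6, 149, 58); targets: (B, 2)
            optimizer.zero_grad()
            loss = loss_fn(net(images), targets)
            loss.backward()
            optimizer.step()
```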
78 An error rate of 35.8% doesn't mean that the vehicle crashes into obstacles 35.8% of the time, but merely that the prediction of the system was in a different bin than that of the human driver for 35.8% of the frames. [sent-168, score-0.449] [sent-169, score-0.185]
80 The seemingly high error rate is not an accurate reflection of the actual effectiveness of the robot in the field. [sent-171, score-0.256]
81 First, there may be several legitimate steering angles for a given image pair: turning left or right around an obstacle may both be valid options, but our performance measure would record one of those options as incorrect. [sent-173, score-0.989]
82 In addition, many illegitimate errors are recorded when the system starts turning at a different time than the human driver, or when the precise values of the steering angles are different enough to be in different bins, but close enough to cause the robot to avoid the obstacle. [sent-174, score-1.128]
83 Figure 3 shows the steering angle produced by the system and the steering angle provided by the human driver for 8000 frames from the test set. [sent-176, score-1.786]
84 It is clear from the plot that only a small number of obstacles would not have been avoided by the robot. [sent-177, score-0.265]
85 This figure shows that the system can detect obstacles and predict appropriate steering angles in the presence of back-lighting and with wild differences between the automatic gain settings of the left and right cameras. [sent-186, score-1.058]
86 They are snapshots of video clips recorded from the vehicle’s cameras while the vehicle was driving itself autonomously. [sent-188, score-0.653]
87 Each picture also shows the steering angle produced by the system for that particular input. [sent-190, score-1.603] [sent-192, score-0.788]
88 Figure 3: The steering angle produced by the system (black) compared to the steering angle provided by the human operator (red line) for 8000 frames from the test set. Very few obstacles would not have been avoided by the system. [sent-191, score-0.265]
90 7 Conclusion We have demonstrated the applicability of end-to-end learning methods to the task of obstacle avoidance for off-road robots. [sent-193, score-0.323]
91 A 6-layer convolutional network was trained with massive amounts of data to emulate the obstacle avoidance behavior of a human driver. [sent-194, score-0.657]
92 Its main design advantage is that it is trained from raw pixels to directly produce steering angles. [sent-197, score-0.663]
93 Furthermore, the method gets around the need to design and select an appropriate set of feature detectors, as well as the need to design robust and fast stereo algorithms. [sent-199, score-0.232]
94 The construction of a fully autonomous driving system for ground robots will require several other components besides the purely-reactive obstacle detection and avoidance system described here. [sent-200, score-0.677]
95 The black bar beneath each image indicates the steering angle produced by the system. [sent-234, score-0.752]
96 Top row: four successive snapshots showing the robot navigating through a narrow passageway between a trailer, a backhoe, and some construction material. [sent-235, score-0.32]
97 Bottom row, left: narrow obstacles such as table legs and poles (left), and solid obstacles such as fences (center-left) are easily detected and avoided. [sent-236, score-0.46]
98 One scenario where the vehicle occasionally made wrong decisions is when the sun is in the field of view: the system seems to systematically drive towards the sun whenever the sun is low on the horizon (right). [sent-238, score-0.429]
99 Stereo vision and navigation in buildings for mobile robots. [sent-255, score-0.192]
100 Knowledge-based training of artificial neural networks for autonomous robot driving. [sent-302, score-0.372]
wordName wordTfidf (topN-words)
[('steering', 0.567), ('robot', 0.256), ('obstacles', 0.23), ('vehicle', 0.219), ('driver', 0.183), ('obstacle', 0.178), ('convolutional', 0.174), ('stereo', 0.17), ('cameras', 0.151), ('avoidance', 0.145), ('layer', 0.117), ('angle', 0.106), ('video', 0.105), ('navigation', 0.102), ('remote', 0.102), ('maps', 0.1), ('angles', 0.086), ('camera', 0.08), ('images', 0.08), ('planes', 0.08), ('frames', 0.076), ('system', 0.073), ('human', 0.066), ('autonomous', 0.064), ('lighting', 0.064), ('snapshots', 0.064), ('feature', 0.062), ('robots', 0.061), ('mobile', 0.061), ('trained', 0.059), ('collecting', 0.058), ('alvinn', 0.055), ('morganville', 0.055), ('yuv', 0.055), ('sun', 0.054), ('driving', 0.053), ('map', 0.052), ('training', 0.052), ('straight', 0.052), ('turning', 0.051), ('lecun', 0.051), ('sensors', 0.049), ('automation', 0.049), ('subsampling', 0.048), ('trainable', 0.048), ('yann', 0.048), ('merely', 0.046), ('robotics', 0.046), ('produced', 0.042), ('truck', 0.041), ('layers', 0.04), ('units', 0.039), ('mounted', 0.038), ('nets', 0.038), ('left', 0.038), ('raw', 0.037), ('agc', 0.037), ('athlon', 0.037), ('controled', 0.037), ('dropouts', 0.037), ('leon', 0.037), ('matthies', 0.037), ('navigate', 0.037), ('orr', 0.037), ('remotely', 0.037), ('throttle', 0.037), ('traversible', 0.037), ('weather', 0.037), ('image', 0.037), ('network', 0.035), ('avoided', 0.035), ('runs', 0.035), ('nj', 0.034), ('frame', 0.033), ('right', 0.032), ('wireless', 0.032), ('diversity', 0.032), ('resolution', 0.032), ('detect', 0.032), ('clips', 0.032), ('urs', 0.032), ('disparity', 0.032), ('distortions', 0.032), ('technologies', 0.031), ('collection', 0.03), ('connected', 0.03), ('considerably', 0.03), ('detection', 0.03), ('collected', 0.03), ('lters', 0.029), ('drive', 0.029), ('vision', 0.029), ('nder', 0.029), ('vehicles', 0.029), ('adjustments', 0.029), ('recorded', 0.029), ('outputs', 0.029), ('relying', 0.028), ('color', 0.028), ('net', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 143 nips-2005-Off-Road Obstacle Avoidance through End-to-End Learning
Author: Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, Yann LeCun
Abstract: We describe a vision-based obstacle avoidance system for off-road mobile robots. The system is trained from end to end to map raw input images to steering angles. It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. The robot is a 50cm off-road truck, with two forward-pointing wireless color cameras. A remote computer processes the video and controls the robot via radio. The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s.
2 0.14296819 23 nips-2005-An Application of Markov Random Fields to Range Sensing
Author: James Diebel, Sebastian Thrun
Abstract: This paper describes a highly successful application of MRFs to the problem of generating high-resolution range images. A new generation of range sensors combines the capture of low-resolution range images with the acquisition of registered high-resolution camera images. The MRF in this paper exploits the fact that discontinuities in range and coloring tend to co-align. This enables it to generate high-resolution, low-noise range images by integrating regular camera images into the range data. We show that by using such an MRF, we can substantially improve over existing range imaging technology. 1
3 0.11174693 115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation
Author: Aaron Shon, Keith Grochow, Aaron Hertzmann, Rajesh P. Rao
Abstract: We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogeneous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data. 1
4 0.10133816 108 nips-2005-Layered Dynamic Textures
Author: Antoni B. Chan, Nuno Vasconcelos
Abstract: A dynamic texture is a video model that treats a video as a sample from a spatio-temporal stochastic process, specifically a linear dynamical system. One problem associated with the dynamic texture is that it cannot model video where there are multiple regions of distinct motion. In this work, we introduce the layered dynamic texture model, which addresses this problem. We also introduce a variant of the model, and present the EM algorithm for learning each of the models. Finally, we demonstrate the efficacy of the proposed model for the tasks of segmentation and synthesis of video.
5 0.095903337 73 nips-2005-Fast biped walking with a reflexive controller and real-time policy searching
Author: Tao Geng, Bernd Porr, Florentin Wörgötter
Abstract: In this paper, we present our design and experiments of a planar biped robot (“RunBot”) under pure reflexive neuronal control. The goal of this study is to combine neuronal mechanisms with biomechanics to obtain very fast speed and the on-line learning of circuit parameters. Our controller is built with biologically inspired sensor- and motor-neuron models, including local reflexes and not employing any kind of position or trajectory-tracking control algorithm. Instead, this reflexive controller allows RunBot to exploit its own natural dynamics during critical stages of its walking gait cycle. To our knowledge, this is the first time that dynamic biped walking is achieved using only a pure reflexive controller. In addition, this structure allows using a policy gradient reinforcement learning algorithm to tune the parameters of the reflexive controller in real-time during walking. This way RunBot can reach a relative speed of 3.5 leg-lengths per second after a few minutes of online learning, which is faster than that of any other biped robot, and is also comparable to the fastest relative speed of human walking. In addition, the stability domain of stable walking is quite large supporting this design strategy. 1
6 0.090681918 110 nips-2005-Learning Depth from Single Monocular Images
7 0.079943068 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
8 0.074478082 97 nips-2005-Inferring Motor Programs from Images of Handwritten Digits
9 0.066214554 36 nips-2005-Bayesian models of human action understanding
10 0.06596902 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects
11 0.064334169 170 nips-2005-Scaling Laws in Natural Scenes and the Inference of 3D Shape
12 0.063877694 151 nips-2005-Pattern Recognition from One Example by Chopping
13 0.062384289 131 nips-2005-Multiple Instance Boosting for Object Detection
14 0.05710233 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks
15 0.05650083 45 nips-2005-Conditional Visual Tracking in Kernel Space
16 0.055848084 194 nips-2005-Top-Down Control of Visual Attention: A Rational Account
17 0.053214896 169 nips-2005-Saliency Based on Information Maximization
18 0.045364149 25 nips-2005-An aVLSI Cricket Ear Model
19 0.044629622 176 nips-2005-Silicon growth cones map silicon retina
20 0.04439351 57 nips-2005-Distance Metric Learning for Large Margin Nearest Neighbor Classification
topicId topicWeight
[(0, 0.16), (1, -0.046), (2, 0.026), (3, 0.184), (4, -0.045), (5, 0.072), (6, 0.036), (7, 0.046), (8, 0.134), (9, -0.072), (10, -0.017), (11, 0.13), (12, -0.031), (13, 0.062), (14, 0.081), (15, -0.046), (16, 0.072), (17, 0.023), (18, -0.044), (19, 0.017), (20, 0.099), (21, -0.045), (22, 0.054), (23, -0.022), (24, -0.167), (25, 0.038), (26, -0.048), (27, 0.034), (28, -0.018), (29, -0.018), (30, 0.091), (31, -0.028), (32, -0.085), (33, 0.053), (34, -0.043), (35, 0.071), (36, -0.069), (37, -0.031), (38, 0.02), (39, 0.043), (40, -0.067), (41, -0.029), (42, -0.029), (43, 0.032), (44, 0.028), (45, -0.032), (46, -0.188), (47, -0.05), (48, 0.051), (49, 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 0.93477392 143 nips-2005-Off-Road Obstacle Avoidance through End-to-End Learning
Author: Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, Yann LeCun
Abstract: We describe a vision-based obstacle avoidance system for off-road mobile robots. The system is trained from end to end to map raw input images to steering angles. It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. The robot is a 50cm off-road truck, with two forward-pointing wireless color cameras. A remote computer processes the video and controls the robot via radio. The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s.
2 0.6592266 23 nips-2005-An Application of Markov Random Fields to Range Sensing
Author: James Diebel, Sebastian Thrun
Abstract: This paper describes a highly successful application of MRFs to the problem of generating high-resolution range images. A new generation of range sensors combines the capture of low-resolution range images with the acquisition of registered high-resolution camera images. The MRF in this paper exploits the fact that discontinuities in range and coloring tend to co-align. This enables it to generate high-resolution, low-noise range images by integrating regular camera images into the range data. We show that by using such an MRF, we can substantially improve over existing range imaging technology. 1
3 0.59682035 108 nips-2005-Layered Dynamic Textures
Author: Antoni B. Chan, Nuno Vasconcelos
Abstract: A dynamic texture is a video model that treats a video as a sample from a spatio-temporal stochastic process, specifically a linear dynamical system. One problem associated with the dynamic texture is that it cannot model video where there are multiple regions of distinct motion. In this work, we introduce the layered dynamic texture model, which addresses this problem. We also introduce a variant of the model, and present the EM algorithm for learning each of the models. Finally, we demonstrate the efficacy of the proposed model for the tasks of segmentation and synthesis of video.
4 0.57920521 73 nips-2005-Fast biped walking with a reflexive controller and real-time policy searching
Author: Tao Geng, Bernd Porr, Florentin Wörgötter
Abstract: In this paper, we present our design and experiments of a planar biped robot (“RunBot”) under pure reflexive neuronal control. The goal of this study is to combine neuronal mechanisms with biomechanics to obtain very fast speed and the on-line learning of circuit parameters. Our controller is built with biologically inspired sensor- and motor-neuron models, including local reflexes and not employing any kind of position or trajectory-tracking control algorithm. Instead, this reflexive controller allows RunBot to exploit its own natural dynamics during critical stages of its walking gait cycle. To our knowledge, this is the first time that dynamic biped walking is achieved using only a pure reflexive controller. In addition, this structure allows using a policy gradient reinforcement learning algorithm to tune the parameters of the reflexive controller in real-time during walking. This way RunBot can reach a relative speed of 3.5 leg-lengths per second after a few minutes of online learning, which is faster than that of any other biped robot, and is also comparable to the fastest relative speed of human walking. In addition, the stability domain of stable walking is quite large supporting this design strategy. 1
5 0.55364579 110 nips-2005-Learning Depth from Single Monocular Images
Author: Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng
Abstract: We consider the task of depth estimation from a single monocular image. We take a supervised learning approach to this problem, in which we begin by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps. Then, we apply supervised learning to predict the depthmap as a function of the image. Depth estimation is a challenging problem, since local features alone are insufficient to estimate depth at a point, and one needs to consider the global context of the image. Our model uses a discriminatively-trained Markov Random Field (MRF) that incorporates multiscale local- and global-image features, and models both depths at individual points as well as the relation between depths at different points. We show that, even on unstructured scenes, our algorithm is frequently able to recover fairly accurate depthmaps. 1
6 0.46946901 170 nips-2005-Scaling Laws in Natural Scenes and the Inference of 3D Shape
7 0.45589173 115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation
8 0.44514406 97 nips-2005-Inferring Motor Programs from Images of Handwritten Digits
9 0.38597053 62 nips-2005-Efficient Estimation of OOMs
11 0.35503206 34 nips-2005-Bayesian Surprise Attracts Human Attention
12 0.33338293 162 nips-2005-Rate Distortion Codes in Sensor Networks: A System-level Analysis
13 0.32512674 176 nips-2005-Silicon growth cones map silicon retina
14 0.31864578 169 nips-2005-Saliency Based on Information Maximization
15 0.31241459 6 nips-2005-A Connectionist Model for Constructive Modal Reasoning
16 0.3003723 151 nips-2005-Pattern Recognition from One Example by Chopping
17 0.29767999 136 nips-2005-Noise and the two-thirds power Law
18 0.29617286 94 nips-2005-Identifying Distributed Object Representations in Human Extrastriate Visual Cortex
19 0.29478455 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
20 0.284468 93 nips-2005-Ideal Observers for Detecting Motion: Correspondence Noise
topicId topicWeight
[(3, 0.029), (10, 0.048), (26, 0.01), (27, 0.024), (31, 0.046), (34, 0.503), (39, 0.016), (55, 0.026), (57, 0.014), (69, 0.055), (73, 0.028), (88, 0.066), (91, 0.025)]
simIndex simValue paperId paperTitle
1 0.99194503 2 nips-2005-A Bayes Rule for Density Matrices
Author: Manfred K. Warmuth
Abstract: The classical Bayes rule computes the posterior model probability from the prior probability and the data likelihood. We generalize this rule to the case when the prior is a density matrix (symmetric positive definite and trace one) and the data likelihood a covariance matrix. The classical Bayes rule is retained as the special case when the matrices are diagonal. In the classical setting, the calculation of the probability of the data is an expected likelihood, where the expectation is over the prior distribution. In the generalized setting, this is replaced by an expected variance calculation where the variance is computed along the eigenvectors of the prior density matrix and the expectation is over the eigenvalues of the density matrix (which form a probability vector). The variance along any direction is determined by the covariance matrix. Curiously enough, this expected variance calculation is a quantum measurement where the co-variance matrix specifies the instrument and the prior density matrix the mixture state of the particle. We motivate both the classical and the generalized Bayes rule with a minimum relative entropy principle, where the Kullback-Leibler version gives the classical Bayes rule and Umegaki’s quantum relative entropy the new Bayes rule for density matrices. 1
2 0.98313582 12 nips-2005-A PAC-Bayes approach to the Set Covering Machine
Author: François Laviolette, Mario Marchand, Mohak Shah
Abstract: We design a new learning algorithm for the Set Covering Machine from a PAC-Bayes perspective and propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non trivial margin-sparsity trade-off. 1
3 0.9807198 42 nips-2005-Combining Graph Laplacians for Semi--Supervised Learning
Author: Andreas Argyriou, Mark Herbster, Massimiliano Pontil
Abstract: A foundational problem in semi-supervised learning is the construction of a graph underlying the data. We propose to use a method which optimally combines a number of differently constructed graphs. For each of these graphs we associate a basic graph kernel. We then compute an optimal combined kernel. This kernel solves an extended regularization problem which requires a joint minimization over both the data and the set of graph kernels. We present encouraging results on different OCR tasks where the optimal combined kernel is computed from graphs constructed with a variety of distance functions and the ‘k’ in nearest neighbors. 1
4 0.97892827 10 nips-2005-A General and Efficient Multiple Kernel Learning Algorithm
Author: Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer
Abstract: While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm helps for automatic model selection, improving the interpretability of the learning result, and works for hundreds of thousands of examples or hundreds of kernels to be combined. 1
same-paper 5 0.97193974 143 nips-2005-Off-Road Obstacle Avoidance through End-to-End Learning
Author: Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, Yann LeCun
Abstract: We describe a vision-based obstacle avoidance system for off-road mobile robots. The system is trained from end to end to map raw input images to steering angles. It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. The robot is a 50cm off-road truck, with two forward-pointing wireless color cameras. A remote computer processes the video and controls the robot via radio. The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s.
6 0.83648837 44 nips-2005-Computing the Solution Path for the Regularized Support Vector Regression
7 0.82361686 57 nips-2005-Distance Metric Learning for Large Margin Nearest Neighbor Classification
8 0.80974585 114 nips-2005-Learning Rankings via Convex Hull Separation
9 0.79851979 190 nips-2005-The Curse of Highly Variable Functions for Local Kernel Machines
10 0.79718035 105 nips-2005-Large-Scale Multiclass Transduction
11 0.79076588 77 nips-2005-From Lasso regression to Feature vector machine
12 0.78808987 123 nips-2005-Maximum Margin Semi-Supervised Learning for Structured Variables
13 0.78748322 31 nips-2005-Asymptotics of Gaussian Regularized Least Squares
14 0.78281045 184 nips-2005-Structured Prediction via the Extragradient Method
15 0.76524997 50 nips-2005-Convex Neural Networks
16 0.7595157 138 nips-2005-Non-Local Manifold Parzen Windows
17 0.75675613 58 nips-2005-Divergences, surrogate loss functions and experimental design
18 0.75646698 154 nips-2005-Preconditioner Approximations for Probabilistic Graphical Models
19 0.75633597 47 nips-2005-Consistency of one-class SVM and related algorithms
20 0.7540319 195 nips-2005-Transfer learning for text classification