nips2006-170: reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Andrew Y. Ng
Abstract: We consider the problem of grasping novel objects, specifically, ones being seen for the first time through vision. We present a learning algorithm that neither requires, nor tries to build, a 3-d model of the object. Instead, it predicts, directly as a function of the images, a point at which to grasp the object. Our algorithm is trained via supervised learning, using synthetic images for the training set. We demonstrate on a robotic manipulation platform that this approach successfully grasps a wide variety of objects, such as wine glasses, duct tape, markers, a translucent box, jugs, knife-cutters, cellphones, keys, screwdrivers, staplers, toothbrushes, a thick coil of wire, a strangely shaped power horn, and others, none of which were seen in the training set.
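The core idea in the abstract is a direct mapping from image features to a grasp point, learned from synthetic labeled images, with no 3-d model in the loop. The sketch below illustrates that idea and is not the authors' implementation: it assumes a sliding-window logistic classifier over toy patch features, where `PATCH`, `patch_features`, and the random training data are illustrative placeholders (the paper uses a much richer feature set and probabilistic model).

```python
# Minimal sketch (assumed, simplified): score small image patches with a
# classifier trained on labeled examples, and return the center of the
# highest-scoring patch as the predicted 2-d grasp point.
import numpy as np

PATCH = 16  # illustrative patch size in pixels


def patch_features(patch: np.ndarray) -> np.ndarray:
    """Toy texture features: mean, spread, gradient energy, plus a bias term."""
    gy, gx = np.gradient(patch.astype(float))
    return np.array([patch.mean(), patch.std(), (gx**2 + gy**2).mean(), 1.0])


def train_logistic(X: np.ndarray, y: np.ndarray, steps: int = 500,
                   lr: float = 0.1) -> np.ndarray:
    """Plain gradient-descent logistic regression on labeled feature vectors."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w


def predict_grasp_point(image: np.ndarray, w: np.ndarray) -> tuple:
    """Slide a window over the image; return the center of the best patch."""
    best, best_rc = -np.inf, (0, 0)
    for r in range(0, image.shape[0] - PATCH, PATCH // 2):
        for c in range(0, image.shape[1] - PATCH, PATCH // 2):
            score = patch_features(image[r:r + PATCH, c:c + PATCH]) @ w
            if score > best:
                best, best_rc = score, (r + PATCH // 2, c + PATCH // 2)
    return best_rc


if __name__ == "__main__":
    # Stand-in for the synthetic training set: random features with a
    # label that depends on one feature dimension.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = (X[:, 2] > 0).astype(float)
    w = train_logistic(X, y)
    print(predict_grasp_point(rng.normal(size=(64, 64)), w))
```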
[1] A. Bicchi and V. Kumar. Robotic grasping and contact: a review. In ICRA, 2000.
[2] T. G. R. Bower, J. M. Broughton, and M. K. Moore. Demonstration of intention in the reaching behaviour of neonate humans. Nature, 228:679–681, 1970.
[3] A. S. Glassner. An Introduction to Ray Tracing. Morgan Kaufmann Publishers, Inc., San Francisco, 1989.
[4] K. Hsiao and T. Lozano-Perez. Imitation learning of whole-body grasps. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2006.
[5] M. T. Mason and J. K. Salisbury. Manipulator grasping and pushing operations. In Robot Hands and the Mechanics of Manipulation. The MIT Press, Cambridge, MA, 1985.
[6] J. Michels, A. Saxena, and A. Y. Ng. High speed obstacle avoidance using monocular vision and reinforcement learning. In ICML, 2005.
[7] A. T. Miller et al. Automatic grasp planning using shape primitives. In ICRA, 2003.
[8] R. Pelossof et al. An SVM learning approach to robotic grasping. In ICRA, 2004.
[9] J. H. Piater. Learning visual features to predict hand orientations. In ICML Workshop on Machine Learning of Spatial Knowledge, 2002.
[10] R. Platt, A. H. Fagg, and R. Grupen. Reusing schematic grasping policies. In IEEE-RAS International Conference on Humanoid Robots, Tsukuba, Japan, 2005.
[11] A. Saxena, S. H. Chung, and A. Y. Ng. Learning depth from single monocular images. In NIPS 18, 2005.
[12] A. Saxena, J. Driemeyer, J. Kearns, C. Osondu, and A. Y. Ng. Learning to grasp novel objects using vision. In 10th International Symposium of Experimental Robotics (ISER), 2006.
[13] A. Saxena, J. Schulte, and A. Y. Ng. Depth estimation using monocular and stereo cues. In 20th International Joint Conference on Artificial Intelligence (IJCAI), 2007.
[14] H. Schneiderman and T. Kanade. Probabilistic modeling of local appearance and spatial relationships for object recognition. In CVPR, 1998.
[15] T. Shin-ichi and M. Satoshi. Living and working with robots. Nipponia, 2000.
Footnote 9: To improve performance, we also used depth-based features. Specifically, we applied our texture-based features to the depth image obtained from a stereo camera, and appended them to the feature vector used in classification. We also appended some hand-labeled real examples of dishwasher images to the training set, to prevent the algorithm from identifying grasping points on background clutter, such as dishwasher prongs.
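The footnote describes a simple feature-level fusion: the same texture filters are run on the stereo depth image, and their outputs are concatenated onto the existing feature vector. Below is a minimal sketch of that concatenation; `texture_features` is a toy stand-in for the paper's actual filter bank, and the aligned intensity/depth patches are assumed inputs.

```python
# Hedged sketch of the depth-based feature augmentation from the footnote:
# run the same texture filters on both modalities and concatenate.
import numpy as np


def texture_features(patch: np.ndarray) -> np.ndarray:
    """Toy texture statistics: mean, spread, and gradient-energy moments."""
    gy, gx = np.gradient(patch.astype(float))
    energy = gx**2 + gy**2
    return np.array([patch.mean(), patch.std(), energy.mean(), energy.std()])


def combined_features(image_patch: np.ndarray,
                      depth_patch: np.ndarray) -> np.ndarray:
    """Apply the texture features to the intensity patch and to the aligned
    depth patch from the stereo camera, then append the depth-based outputs
    to the image-based feature vector."""
    return np.concatenate([texture_features(image_patch),
                           texture_features(depth_patch)])
```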