nips nips2005 nips2005-115 knowledge-graph by maker-knowledge-mining

115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation


Source: pdf

Author: Aaron Shon, Keith Grochow, Aaron Hertzmann, Rajesh P. Rao

Abstract: We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogeneous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The observation spaces are linked via a single, reduced-dimensionality latent variable space. [sent-9, score-0.68]

2 We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. [sent-10, score-0.319]

3 We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. [sent-11, score-0.571]

4 We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data. [sent-12, score-1.27]

5 For example, predicting a 3D object’s appearance given corresponding poses of another, related object relies on learning a parameterization common to both objects. [sent-15, score-0.313]

6 In imitation learning, one agent, such as a robot, learns to perform a task by observing another agent, for example, a human instructor. [sent-17, score-0.351]

7 Recently, Lawrence proposed the Gaussian process latent variable model (GPLVM) [4] as a new technique for nonlinear dimensionality reduction and data visualization [13, 10]. [sent-20, score-0.697]

8 An extension of this model, the scaled GPLVM (SGPLVM), has been used successfully for dimensionality reduction on human motion capture data for motion synthesis and visualization [1]. [sent-21, score-0.431]

9 In this paper, we propose a generalization of the GPLVM model that can handle multiple observation spaces, where each set of observations is parameterized by a different set of kernel parameters. [sent-22, score-0.289]

10 Observations are linked via a single, reduced-dimensionality latent variable space. [sent-23, score-0.519]

11 Our goal is to find correspondences on testing data, given a limited set of corresponding training data from two observation spaces. [sent-25, score-0.406]

12 Such an algorithm can be used in a variety of applications, such as inferring a novel view of an object given a corresponding view of a different object and estimating the kinematic parameters for a humanoid robot given a human pose. [sent-26, score-0.873]

13 First, finding latent representations for correlated, high-dimensional sets of observations requires non-linear mappings, so linear CCA is not viable. [sent-28, score-0.558]

14 Fourth, probabilistic models provide an estimate of uncertainty in classification or interpolating between data; this is especially useful in applications such as robotic imitation where estimates of uncertainty can be used to decide whether a robot should attempt a particular pose or not. [sent-31, score-0.674]

15 A latent space X maps to two (or more) observation spaces Y, Z using nonlinear kernels, and “inverse” Gaussian processes map back from observations to latent coordinates. [sent-35, score-1.324]

16 Synthesis employs a map from latent coordinates to observations, while recognition employs an inverse mapping. [sent-36, score-0.737]

17 The first is an image dataset containing corresponding views of two different objects. [sent-38, score-0.261]

18 The challenge is to predict corresponding views of the second object given novel views of the first based on a limited training set of corresponding object views. [sent-39, score-0.804]

19 The second dataset consists of human poses derived from motion capture data and corresponding kinematic poses from a humanoid robot. [sent-40, score-0.843]

20 The challenge is to estimate the kinematic parameters for robot pose, given a potentially novel pose from human motion capture, thereby allowing robotic imitation of human poses. [sent-41, score-1.182]

21 Our results indicate that the model generalizes well when only limited training correspondences are available, and that the model remains robust when testing data is noisy. [sent-42, score-0.374]

22 2 Latent Structure Model The goal of our model is to find a shared latent variable parameterization in a space X that relates corresponding pairs of observations from two (or more) different spaces Y, Z. [sent-43, score-0.85]

23 The observation spaces might be very dissimilar, despite the observations sharing a common structure or parameterization. [sent-44, score-0.26]

24 The latent variable space then characterizes the common pose space. [sent-46, score-0.705]

25 These observations are drawn so that the first observation $y_1$ corresponds to the observation $z_1$, observation $y_2$ corresponds to observation $z_2$, etc. [sent-49, score-0.425]

26 We initialize a matrix of latent points X by averaging the top $D_X$ principal components of Y, Z. [sent-52, score-0.517]
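
The initialization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `pca_scores` and `init_latent`, the random stand-in data, and the choice to average PCA scores directly are all assumptions. Principal components are only defined up to sign, so a practical implementation would probably need to align the two projections before averaging.

```python
import numpy as np

def pca_scores(data, n_components):
    """Project mean-centered rows of `data` onto the top principal directions."""
    centered = data - data.mean(axis=0)
    # Thin SVD of the centered data; columns of u scaled by s are the PCA scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def init_latent(Y, Z, d_x=2):
    """Average the top d_x principal-component scores of the two observation sets."""
    return 0.5 * (pca_scores(Y, d_x) + pca_scores(Z, d_x))

# Hypothetical stand-in data: 72 corresponding observations in two spaces.
rng = np.random.default_rng(0)
Y = rng.normal(size=(72, 50))
Z = rng.normal(size=(72, 30))
X0 = init_latent(Y, Z)  # one 2D latent point per corresponding pair
```

Row i of `X0` is the initial latent point shared by the i-th rows of `Y` and `Z`; these coordinates would then be refined during training.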

27 We assume that each latent point $x_i$ generates a pair of observations $y_i$, $z_i$ via a nonlinear function parameterized by a kernel matrix. [sent-56, score-0.695]

28 A conjugate gradient solver adjusts model parameters and latent coordinates to maximize Eq. [sent-63, score-0.713]

29 Given a trained SGPLVM, we would like to infer the parameters in one observation space given parameters in the other (e. [sent-65, score-0.239]

30 First, we determine the most likely latent coordinate x given the observation y using $\arg\max_x L_X(x, y)$. [sent-69, score-0.601]

31 Once the correct latent coordinate x has been inferred for a given y, the model uses the trained SGPLVM to predict the corresponding observation z. [sent-72, score-0.748]
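
The two-step inference above (find a latent point for y, then predict z from it) can be sketched on toy data. Everything here is an assumption made for illustration: the `LatentGP` class, the unit-variance RBF kernel with fixed `gamma`, the form of the objective in `neg_log_lik` (a GPLVM-style negative log-likelihood, so the argmax in the text becomes an argmin), and the search over a fixed candidate set, which stands in for a gradient-based optimizer.

```python
import numpy as np

def rbf(A, B, gamma=10.0):
    """Unit-variance RBF kernel between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * gamma * sq)

class LatentGP:
    """GP mapping latent points X to one observation space (hypothetical helper)."""
    def __init__(self, X, obs, noise=1e-2):
        self.X, self.mean, self.noise = X, obs.mean(axis=0), noise
        self.K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
        self.A = self.K_inv @ (obs - self.mean)

    def predict(self, x):
        k = rbf(self.X, x[None, :])[:, 0]
        mean = self.mean + k @ self.A
        var = 1.0 + self.noise - k @ self.K_inv @ k
        return mean, var

def neg_log_lik(gp, x, y):
    """Assumed objective: data fit, variance penalty, and a unit Gaussian prior on x."""
    mean, var = gp.predict(x)
    return np.sum((y - mean) ** 2) / (2 * var) + 0.5 * y.size * np.log(var) + 0.5 * x @ x

def infer_and_transfer(gp_y, gp_z, y, candidates):
    """Pick the best-scoring candidate latent point for y, then predict z from it."""
    scores = [neg_log_lik(gp_y, x, y) for x in candidates]
    x_best = candidates[int(np.argmin(scores))]
    return x_best, gp_z.predict(x_best)[0]

# Toy cyclic "poses": 24 latent points on a circle, two made-up observation spaces.
t = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
X = np.stack([np.cos(t), np.sin(t)], axis=1)
Y = np.stack([np.cos(t), np.sin(t), np.cos(2 * t)], axis=1)
Z = 2.0 * np.stack([np.sin(t), np.cos(t)], axis=1)

gp_y, gp_z = LatentGP(X, Y), LatentGP(X, Z)
x_best, z_pred = infer_and_transfer(gp_y, gp_z, Y[5], candidates=X)
```

Restricting the search to training latents keeps the sketch dependency-free. A continuous optimizer started from an inverse-GP estimate would let the model interpolate between training poses.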

32 3 Results We first demonstrate how our model can be used to synthesize new views of an object, character or scene from known views of another object, character or scene, given a common latent variable model. [sent-73, score-1.11]

33 For ease of visualization, we used 2D latent spaces for all results shown here. [sent-74, score-0.558]

34 The model was applied to image pairs depicting corresponding views of 3D objects. [sent-75, score-0.303]

35 Different views show the objects1 rotated at varying degrees out of the camera plane. [sent-76, score-0.242]

36 1(b) shows how the model extrapolates to novel datasets given a limited set of training correspondences. [sent-86, score-0.287]

37 We trained the model using 72 corresponding views of two different objects, a coffee cup and a toy truck. [sent-87, score-0.648]

38 Fixing the latent coordinates learned during training, we then selected 8 views of a third object (a toy car). [sent-88, score-1.049]

39 We selected latent points corresponding to those views, and learned kernel parameters for the 8 images. [sent-89, score-0.729]

40 Empirically, priors on kernel parameters are critical for acceptable performance, particularly when only limited data are available such as the 8 different poses for the toy car. [sent-90, score-0.464]

41 In this case, we used the kernel parameters learned for the cup and toy truck (based on 72 different poses) to impose a Gaussian prior on the kernel parameters for the car (replacing P (θ) in Eqn. [sent-91, score-0.891]

42 4): $-\log P(\theta_{car}) = -\log P_{GP} + (\theta_{car} - \theta_\mu)^T \Gamma_\theta^{-1} (\theta_{car} - \theta_\mu)$ (14), where $\theta_{car}$, $\theta_\mu$, $\Gamma_\theta^{-1}$ are respectively the kernel parameters for the car, the mean kernel parameters for previously learned kernels (for the cup and truck), and the inverse covariance matrix for learned kernel parameters. [sent-92, score-0.683]

43 $\theta_\mu$, $\Gamma_\theta^{-1}$ in this case are derived from only two samples, but nonetheless successfully constrain the kernel parameters for the car so the model functions on the limited set of 8 example poses. [sent-93, score-0.367]
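
The quadratic prior term of Eq. 14 can be illustrated with a short sketch. The parameter values below are made up, and the function name `gaussian_prior_penalty` is hypothetical. A full covariance estimated from only two samples is rank-deficient, so this sketch assumes a diagonal covariance with a small jitter; the source does not say how the inverse covariance was actually obtained.

```python
import numpy as np

def gaussian_prior_penalty(theta, theta_mu, gamma_inv):
    """Quadratic term (theta - mu)^T Gamma^{-1} (theta - mu) from Eq. 14."""
    diff = theta - theta_mu
    return float(diff @ gamma_inv @ diff)

# Made-up kernel parameters for the two previously learned objects.
theta_cup = np.array([1.2, 8.0, 0.05])
theta_truck = np.array([0.9, 10.0, 0.07])
samples = np.stack([theta_cup, theta_truck])

theta_mu = samples.mean(axis=0)
# Assumption: diagonal covariance plus jitter, since two samples cannot
# support an invertible full covariance matrix in three dimensions.
variances = samples.var(axis=0, ddof=1) + 1e-6
gamma_inv = np.diag(1.0 / variances)

penalty_at_mean = gaussian_prior_penalty(theta_mu, theta_mu, gamma_inv)
penalty_at_cup = gaussian_prior_penalty(theta_cup, theta_mu, gamma_inv)
```

During fitting, this penalty would be added to the GP negative log-likelihood for the car, pulling its kernel parameters toward those learned for the cup and truck.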

44 To test the model’s robustness to noise and missing data, we randomly selected 10 latent coordinates corresponding to a subset of learned cup and truck image pairs. [sent-94, score-1.124]

45 We then added varying displacements to the latent coordinates and synthesized the corresponding novel views for all 3 observation spaces. [sent-95, score-1.173]

46 45 (all 72 latent coordinates lie on the interval [-0. [sent-97, score-0.633]

47 1(b), with images for the cup and truck in the first two rows. [sent-103, score-0.452]

48 Latent coordinates in regions of low model likelihood generate images that appear blurry or noisy. [sent-104, score-0.376]

49 More interestingly, despite the small number of images used for the car, the model correctly matches the orientation of the car to the synthesized images of the cup and truck. [sent-105, score-0.703]

50 Thus, the model can synthesize reasonable correspondences (given a latent point) even if the number of training examples used to learn kernel parameters is small. [sent-106, score-1.001]

51 Using the latent space and kernel parameters learned for Fig. [sent-109, score-0.71]

52 1, we present 72 views of the coffee cup with varying amounts of additive, zero-mean white noise, and determine the fraction of the 72 poses correctly classified by the model. [sent-110, score-0.74]

53 The model estimates the pose using 1-nearest-neighbor classification of the latent coordinates x learned during training: $\arg\max_{x'} k(x, x')$ (15). The recognition performance degrades gracefully with increasing noise power. [sent-111, score-0.956]
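
A minimal sketch of the nearest-neighbor rule in Eq. 15, assuming an RBF kernel: the pose label is the index of the training latent point with the largest kernel value against the inferred point. The kernel width `gamma`, the circular toy layout, and the names `rbf_to_training` and `classify_pose` are illustrative assumptions.

```python
import numpy as np

def rbf_to_training(x, X_train, gamma=10.0):
    """RBF kernel values k(x, x') between one latent point and all training latents."""
    return np.exp(-0.5 * gamma * np.sum((X_train - x) ** 2, axis=1))

def classify_pose(x, X_train):
    """Return the index of the training latent point maximizing k(x, x')."""
    return int(np.argmax(rbf_to_training(x, X_train)))

# Toy layout: 72 poses arranged on a circle in a 2D latent space.
angles = np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False)
X_train = 0.4 * np.stack([np.cos(angles), np.sin(angles)], axis=1)

# A latent point inferred from a noisy image lands slightly off the training point.
x_query = X_train[10] + np.array([0.005, -0.005])
pose_index = classify_pose(x_query, X_train)
```

For an RBF kernel, maximizing k(x, x') is equivalent to minimizing Euclidean distance in latent space, so this reduces to ordinary 1-nearest-neighbor lookup.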

54 2 also plots sample images from one pose of the cup at several different noise levels. [sent-113, score-0.499]

55 For two of the noise levels, we show the “denoised” cup image selected using the nearest-neighbor 1 http://www1. [sent-114, score-0.268]

56 [Figure 1 residue: panels (a) and (b); panel (b) label "Displacement from latent coordinate".] [sent-118, score-0.485]

57 Figure 1: Pose synthesis for multiple objects using shared structure: (a) Graphical model for our shared structure latent variable model. [sent-130, score-0.802]

58 The latent space X maps to two (or more) observation spaces Y, Z using a nonlinear kernel. [sent-131, score-0.766]

59 “Inverse” Gaussian process kernels map back from observations to latent coordinates. [sent-132, score-0.558]

60 (b) The model learns pose correspondences for images of the coffee cup and toy truck (Y and Z) by fitting kernel parameters and a 2-dimensional latent variable space. [sent-133, score-1.655]

61 After learning the latent coordinates for the cup and truck, we fit kernel parameters for a novel object (the toy car). [sent-134, score-1.229]

62 Unlike the cup and truck, where 72 pairs of views were used to fit kernel parameters and latent coordinates, only 8 views were used to fit kernel parameters for the car. [sent-135, score-1.343]

63 The model is robust to noise in the latent coordinates; numbers above each column represent the amount of noise added to the latent coordinates used to synthesize the images. [sent-136, score-1.375]

64 Even at points where the model is uncertain (indicated by the rightmost results in the Y and Z rows), the learned kernel extrapolates the correct view of the toy car (the “novel” row). [sent-137, score-0.533]

65 3 illustrates the ability of the model to synthesize novel views of one object given a novel view of a different object. [sent-141, score-0.717]

66 A limited set of corresponding poses (24 of 72 total) of a cat figurine and a mug were used to train the GP model. [sent-142, score-0.533]

67 The remaining 48 poses of the mug were then used as testing data. [sent-143, score-0.401]

68 For each snapshot of the mug, we inferred a latent point using the “inverse” Gaussian process model and used the learned model to synthesize what the cat figurine should look like in the same pose. [sent-144, score-0.954]

69 3: the “Test” rows show novel images of the mug, the “Inferred” rows show the model’s best estimate for the cat figurine, and the “Actual” rows show the ground truth. [sent-146, score-0.445]

70 Although the images for some poses are blurry and the model fails to synthesize the correct image for pose 44, the model nevertheless manages to capture fine detail on most of the images. [sent-147, score-0.842]

71 Arrows indicate the path in latent space formed by the training images. [sent-150, score-0.573]

72 The dashed line indicates latent points inferred from testing images of the mug. [sent-151, score-0.751]

73 Numbered latent coordinates correspond to the synthesized images at left. [sent-152, score-0.815]

74 The latent space shows structure: latent points for similar poses are grouped together, and tend to move along a smooth curve in latent space, with coordinates for the final pose lying close to coordinates for the first pose (as desired for a cyclic image sequence). [sent-153, score-2.36]

75 The bar graph at lower right compares model certainty for the numbered latent coordinates; higher bars indicate greater model certainty. [sent-154, score-0.663]

76 4 shows an application of our framework to the problem of robotic imitation of human actions. [sent-157, score-0.425]

77 We trained our model on a dataset containing human poses (acquired with a Vicon motion capture system) and corresponding poses of a Fujitsu HOAP-2 humanoid robot. [sent-158, score-0.828]

78 Fraction of images recognized is plotted on the Y axis and standard deviation of white noise is plotted on the X axis. [sent-164, score-0.263]

79 One pose of the cup (of 72 total) is plotted for various noise levels (see text for details). [sent-165, score-0.425]

80 “Denoised” images obtained from nearest-neighbor classification and the corresponding images for the Z space (the toy truck) are also shown. [sent-166, score-0.383]

81 After training on 43 roughly matching poses (only linear time scaling applied to align training poses), we tested the model by presenting a set of 123 human motion capture poses (which includes the original training set). [sent-168, score-0.835]

82 4 (inset panels, human and robot skeletons), the model was able to correctly infer appropriate robot kinematic parameters given a range of novel human poses. [sent-172, score-0.965]

83 These inferred parameters were used in conjunction with a simple controller to instantiate the pose in the humanoid robot (see photos in the inset panels). [sent-173, score-0.644]

84 Our method differs from the latent variable method proposed in [14] by using Gaussian process regression. [sent-178, score-0.519]

85 Our framework learns mappings between each observation space and a latent space, rather than mapping directly between the observation spaces. [sent-181, score-0.77]

86 An intermediate mapping to a latent space is also more economical in ... [sent-183, score-0.555]

87 The system uses an inverse Gaussian process model to infer a 2D latent point for each of the 48 novel mug views, then synthesizes a corresponding view of the cat figurine. [sent-196, score-1.047]

88 At left we plot the novel testing mug images given to the system (“test”), the synthesized cat images (“inferred”), and the actual views of the cat figurine from the database (“actual”). [sent-197, score-1.043]

89 At upper right we plot the model uncertainty in the latent space. [sent-198, score-0.553]

90 The 24 latent coordinates from the training data are plotted as arrows, while the 48 novel latent points are plotted as crosses on a dashed line. [sent-199, score-1.397]

91 At lower right we show model certainty for the cat figurine data ($1/\sigma_Z^2(x)$) for each testing latent point x. [sent-200, score-0.74]

92 Note the low certainty for the blurry inferred images labeled 8, 14, and 26. [sent-201, score-0.327]
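
The certainty measure in the caption above is the reciprocal of the GP predictive variance at a latent point. A small sketch under simplifying assumptions (unit-variance RBF kernel, fixed noise level, toy latent points on a circle, hypothetical function names) shows how certainty drops away from the training latents:

```python
import numpy as np

def rbf(A, B, gamma=10.0):
    """Unit-variance RBF kernel between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * gamma * sq)

def certainty(X_train, x, noise=1e-2):
    """Return 1 / sigma^2(x), the GP predictive precision at latent point x."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    k = rbf(X_train, x[None, :])[:, 0]
    var = 1.0 + noise - k @ np.linalg.solve(K, k)
    return 1.0 / var

# Toy training latents on a circle, mimicking a cyclic pose sequence.
angles = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
X_train = np.stack([np.cos(angles), np.sin(angles)], axis=1)

near = certainty(X_train, X_train[3])          # at a training latent point
far = certainty(X_train, np.array([3.0, 3.0]))  # far from all training data
```

In the imitation setting, a threshold on this value is one plausible way to decide whether a synthesized pose is trustworthy enough to attempt.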

93 This modality-independent space is analogous to the latent variable space in our model. [sent-206, score-0.601]

94 Our model does not directly address the “correspondence problem” in imitation [7], where correspondences between an agent and a teacher are established through some form of unsupervised feature matching. [sent-207, score-0.439]

95 However, it is reasonable to assume that imitation by a robot of human activity could involve some initial, explicit correspondence matching based on simultaneity. [sent-208, score-0.559]

96 Thus, to bootstrap its database of corresponding data points, a robot could invite a human to take turns playing out motor sequences. [sent-210, score-0.366]

97 Initially, the human would imitate the robot’s actions and the robot could use this data to learn correspondences using our GP model; later, the robot could check and if necessary, refine its learned model by attempting to imitate the human’s actions. [sent-211, score-0.926]

98 Figure 4: Learning shared latent structure for robotic imitation of human actions: The plot in the center shows the latent training points (red circles) and model precision $1/\sigma_Z^2$ for the robot model (grayscale plot), with examples of recovered latent points for testing data (blue diamonds). [sent-239, score-2.464]

99 Inset panels show the pose of the human motion capture skeleton, the simulated robot skeleton, and the humanoid robot for each example latent point. [sent-241, score-1.436]

100 The model correctly infers robot poses from the human walking data (inset panels). [sent-242, score-0.597]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('latent', 0.485), ('robot', 0.221), ('poses', 0.217), ('views', 0.204), ('cup', 0.202), ('imitation', 0.195), ('correspondences', 0.162), ('car', 0.162), ('coordinates', 0.148), ('pose', 0.145), ('mug', 0.142), ('synthesize', 0.141), ('gplvm', 0.141), ('truck', 0.135), ('gurine', 0.122), ('novel', 0.118), ('human', 0.117), ('images', 0.115), ('robotic', 0.113), ('fy', 0.113), ('cat', 0.107), ('humanoid', 0.106), ('sgplvm', 0.088), ('coffee', 0.088), ('observation', 0.088), ('kernel', 0.086), ('dy', 0.085), ('toy', 0.084), ('inferred', 0.077), ('shared', 0.074), ('observations', 0.073), ('spaces', 0.073), ('blurry', 0.071), ('cca', 0.071), ('gps', 0.069), ('object', 0.068), ('synthesized', 0.067), ('dz', 0.067), ('synthesis', 0.067), ('inverse', 0.065), ('certainty', 0.064), ('ky', 0.064), ('fz', 0.061), ('pgp', 0.061), ('motion', 0.061), ('learned', 0.06), ('kinematic', 0.057), ('inset', 0.057), ('nonlinear', 0.051), ('visualization', 0.049), ('training', 0.047), ('skeleton', 0.045), ('freedom', 0.043), ('testing', 0.042), ('model', 0.042), ('space', 0.041), ('gaussian', 0.041), ('plotted', 0.041), ('extrapolates', 0.041), ('kz', 0.041), ('lz', 0.041), ('agent', 0.04), ('panels', 0.04), ('capture', 0.04), ('limited', 0.039), ('gp', 0.039), ('learns', 0.039), ('grayscale', 0.039), ('recognition', 0.039), ('degrees', 0.038), ('parameters', 0.038), ('noise', 0.037), ('dimensionality', 0.036), ('displacements', 0.035), ('grochow', 0.035), ('hertzmann', 0.035), ('imitate', 0.035), ('ly', 0.035), ('rows', 0.035), ('paired', 0.035), ('infer', 0.034), ('variable', 0.034), ('actions', 0.033), ('shon', 0.032), ('denoised', 0.032), ('points', 0.032), ('toronto', 0.031), ('numbered', 0.03), ('lx', 0.03), ('image', 0.029), ('white', 0.029), ('mapping', 0.029), ('corresponding', 0.028), ('maps', 0.028), ('coordinate', 0.028), ('aaron', 0.027), ('correspondence', 0.026), ('plot', 0.026), ('view', 0.026), ('structure', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999994 115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation

Author: Aaron Shon, Keith Grochow, Aaron Hertzmann, Rajesh P. Rao

Abstract: We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogeneous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data.

2 0.33195642 80 nips-2005-Gaussian Process Dynamical Models

Author: Jack Wang, Aaron Hertzmann, David J. Fleet

Abstract: This paper introduces Gaussian Process Dynamical Models (GPDM) for nonlinear time series analysis. A GPDM comprises a low-dimensional latent space with associated dynamics, and a map from the latent space to an observation space. We marginalize out the model parameters in closed-form, using Gaussian Process (GP) priors for both the dynamics and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach on human motion capture data in which each pose is 62-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces. Webpage: http://www.dgp.toronto.edu/~jmwang/gpdm/

3 0.16339253 87 nips-2005-Goal-Based Imitation as Probabilistic Inference over Graphical Models

Author: Deepak Verma, Rajesh P. Rao

Abstract: Humans are extremely adept at learning new skills by imitating the actions of others. A progression of imitative abilities has been observed in children, ranging from imitation of simple body movements to goal-based imitation based on inferring intent. In this paper, we show that the problem of goal-based imitation can be formulated as one of inferring goals and selecting actions using a learned probabilistic graphical model of the environment. We first describe algorithms for planning actions to achieve a goal state using probabilistic inference. We then describe how planning can be used to bootstrap the learning of goal-dependent policies by utilizing feedback from the environment. The resulting graphical model is then shown to be powerful enough to allow goal-based imitation. Using a simple maze navigation task, we illustrate how an agent can infer the goals of an observed teacher and imitate the teacher even when the goals are uncertain and the demonstration is incomplete.

4 0.15593319 98 nips-2005-Infinite latent feature models and the Indian buffet process

Author: Zoubin Ghahramani, Thomas L. Griffiths

Abstract: We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features. We identify a simple generative process that results in the same distribution over equivalence classes, which we call the Indian buffet process. We illustrate the use of this distribution as a prior in an infinite latent feature model, deriving a Markov chain Monte Carlo algorithm for inference in this model and applying the algorithm to an image dataset.

5 0.14467013 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

Author: Afsheen Afshar, Gopal Santhanam, Stephen I. Ryu, Maneesh Sahani, Byron M. Yu, Krishna V. Shenoy

Abstract: Spiking activity from neurophysiological experiments often exhibits dynamics beyond that driven by external stimulation, presumably reflecting the extensive recurrence of neural circuitry. Characterizing these dynamics may reveal important features of neural computation, particularly during internally-driven cognitive operations. For example, the activity of premotor cortex (PMd) neurons during an instructed delay period separating movement-target specification and a movement-initiation cue is believed to be involved in motor planning. We show that the dynamics underlying this activity can be captured by a low-dimensional non-linear dynamical systems model, with underlying recurrent structure and stochastic point-process output. We present and validate latent variable methods that simultaneously estimate the system parameters and the trial-by-trial dynamical trajectories. These methods are applied to characterize the dynamics in PMd data recorded from a chronically-implanted 96-electrode array while monkeys perform delayed-reach tasks.

6 0.12812719 113 nips-2005-Learning Multiple Related Tasks using Latent Independent Component Analysis

7 0.12753591 60 nips-2005-Dynamic Social Network Analysis using Latent Space Models

8 0.11174693 143 nips-2005-Off-Road Obstacle Avoidance through End-to-End Learning

9 0.1080896 45 nips-2005-Conditional Visual Tracking in Kernel Space

10 0.10662714 202 nips-2005-Variational EM Algorithms for Non-Gaussian Latent Variable Models

11 0.093762808 30 nips-2005-Assessing Approximations for Gaussian Process Classification

12 0.089274354 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection

13 0.087176345 179 nips-2005-Sparse Gaussian Processes using Pseudo-inputs

14 0.083285026 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects

15 0.081942163 52 nips-2005-Correlated Topic Models

16 0.074608073 151 nips-2005-Pattern Recognition from One Example by Chopping

17 0.074311905 97 nips-2005-Inferring Motor Programs from Images of Handwritten Digits

18 0.072699927 16 nips-2005-A matching pursuit approach to sparse Gaussian process regression

19 0.070923492 196 nips-2005-Two view learning: SVM-2K, Theory and Practice

20 0.070246562 73 nips-2005-Fast biped walking with a reflexive controller and real-time policy searching


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.234), (1, 0.016), (2, 0.057), (3, 0.199), (4, 0.058), (5, -0.242), (6, 0.273), (7, -0.001), (8, 0.06), (9, 0.037), (10, -0.003), (11, 0.17), (12, -0.207), (13, 0.051), (14, 0.061), (15, 0.038), (16, 0.033), (17, -0.124), (18, 0.079), (19, 0.115), (20, 0.146), (21, -0.017), (22, 0.085), (23, 0.008), (24, -0.063), (25, 0.01), (26, -0.045), (27, -0.142), (28, 0.106), (29, -0.111), (30, -0.087), (31, -0.009), (32, -0.019), (33, -0.016), (34, -0.09), (35, 0.095), (36, -0.072), (37, -0.003), (38, 0.065), (39, -0.029), (40, -0.043), (41, -0.001), (42, -0.017), (43, 0.059), (44, 0.057), (45, -0.132), (46, -0.03), (47, 0.004), (48, 0.103), (49, 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96550018 115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation

Author: Aaron Shon, Keith Grochow, Aaron Hertzmann, Rajesh P. Rao

Abstract: We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogeneous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data.

2 0.85474986 80 nips-2005-Gaussian Process Dynamical Models

Author: Jack Wang, Aaron Hertzmann, David J. Fleet

Abstract: This paper introduces Gaussian Process Dynamical Models (GPDM) for nonlinear time series analysis. A GPDM comprises a low-dimensional latent space with associated dynamics, and a map from the latent space to an observation space. We marginalize out the model parameters in closed-form, using Gaussian Process (GP) priors for both the dynamics and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach on human motion capture data in which each pose is 62-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces. Webpage: http://www.dgp.toronto.edu/~jmwang/gpdm/

3 0.54456812 98 nips-2005-Infinite latent feature models and the Indian buffet process

Author: Zoubin Ghahramani, Thomas L. Griffiths

Abstract: We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features. We identify a simple generative process that results in the same distribution over equivalence classes, which we call the Indian buffet process. We illustrate the use of this distribution as a prior in an infinite latent feature model, deriving a Markov chain Monte Carlo algorithm for inference in this model and applying the algorithm to an image dataset.

4 0.52241892 35 nips-2005-Bayesian model learning in human visual perception

Author: Gergő Orbán, Jozsef Fiser, Richard N. Aslin, Máté Lengyel

Abstract: Humans make optimal perceptual decisions in noisy and ambiguous conditions. Computations underlying such optimal behavior have been shown to rely on probabilistic inference according to generative models whose structure is usually taken to be known a priori. We argue that Bayesian model selection is ideal for inferring similar and even more complex model structures from experience. We find in experiments that humans learn subtle statistical properties of visual scenes in a completely unsupervised manner. We show that these findings are well captured by Bayesian model learning within a class of models that seek to explain observed variables by independent hidden causes.

5 0.50952566 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

Author: Afsheen Afshar, Gopal Santhanam, Stephen I. Ryu, Maneesh Sahani, Byron M. Yu, Krishna V. Shenoy

Abstract: Spiking activity from neurophysiological experiments often exhibits dynamics beyond that driven by external stimulation, presumably reflecting the extensive recurrence of neural circuitry. Characterizing these dynamics may reveal important features of neural computation, particularly during internally-driven cognitive operations. For example, the activity of premotor cortex (PMd) neurons during an instructed delay period separating movement-target specification and a movement-initiation cue is believed to be involved in motor planning. We show that the dynamics underlying this activity can be captured by a low-dimensional non-linear dynamical systems model, with underlying recurrent structure and stochastic point-process output. We present and validate latent variable methods that simultaneously estimate the system parameters and the trial-by-trial dynamical trajectories. These methods are applied to characterize the dynamics in PMd data recorded from a chronically-implanted 96-electrode array while monkeys perform delayed-reach tasks.

6 0.49338549 60 nips-2005-Dynamic Social Network Analysis using Latent Space Models

7 0.48665971 143 nips-2005-Off-Road Obstacle Avoidance through End-to-End Learning

8 0.42631042 73 nips-2005-Fast biped walking with a reflexive controller and real-time policy searching

9 0.42067346 45 nips-2005-Conditional Visual Tracking in Kernel Space

10 0.37637603 113 nips-2005-Learning Multiple Related Tasks using Latent Independent Component Analysis

11 0.36255169 55 nips-2005-Describing Visual Scenes using Transformed Dirichlet Processes

12 0.33646762 151 nips-2005-Pattern Recognition from One Example by Chopping

13 0.32755104 87 nips-2005-Goal-Based Imitation as Probabilistic Inference over Graphical Models

14 0.31257683 11 nips-2005-A Hierarchical Compositional System for Rapid Object Detection

15 0.30475059 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects

16 0.30277696 93 nips-2005-Ideal Observers for Detecting Motion: Correspondence Noise

17 0.29654431 81 nips-2005-Gaussian Processes for Multiuser Detection in CDMA receivers

18 0.29472449 108 nips-2005-Layered Dynamic Textures

19 0.28829721 68 nips-2005-Factorial Switching Kalman Filters for Condition Monitoring in Neonatal Intensive Care

20 0.28624323 63 nips-2005-Efficient Unsupervised Learning for Localization and Detection in Object Categories


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.027), (10, 0.016), (18, 0.012), (27, 0.055), (31, 0.021), (34, 0.068), (39, 0.015), (55, 0.011), (69, 0.472), (73, 0.024), (88, 0.15), (91, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98043579 40 nips-2005-CMOL CrossNets: Possible Neuromorphic Nanoelectronic Circuits

Author: Jung Hoon Lee, Xiaolong Ma, Konstantin K. Likharev

Abstract: Hybrid “CMOL” integrated circuits, combining a CMOS subsystem with nanowire crossbars and simple two-terminal nanodevices, promise to extend the exponential Moore-Law development of microelectronics into the sub-10-nm range. We are developing neuromorphic network (“CrossNet”) architectures for this future technology, in which neural cell bodies are implemented in CMOS, nanowires are used as axons and dendrites, while nanodevices (bistable latching switches) are used as elementary synapses. We have shown how CrossNets may be trained to perform pattern recovery and classification despite the limitations imposed by the CMOL hardware. Preliminary estimates have shown that CMOL CrossNets may be extremely dense (~10^7 cells per cm^2) and operate approximately a million times faster than biological neural networks, at manageable power consumption. In Conclusion, we discuss in brief possible short-term and long-term applications of the emerging technology.
1 Introduction: CMOL Circuits
Recent results [1, 2] indicate that the current VLSI paradigm based on CMOS technology can hardly be extended beyond the 10-nm frontier: in this range the sensitivity of parameters (most importantly, the gate voltage threshold) of silicon field-effect transistors to inevitable fabrication spreads grows exponentially. This sensitivity will probably send the costs of fabrication facilities skyrocketing, and may lead to the end of Moore’s Law some time during the next decade. There is a growing consensus that the impending Moore’s Law crisis may be preempted by a radical paradigm shift from purely CMOS technology to hybrid CMOS/nanodevice circuits, e.g., those of the “CMOL” variety (Fig. 1). Such circuits (see, e.g., Ref. 3 for their recent review) would combine a level of advanced CMOS devices fabricated by lithographic patterning, and a two-layer nanowire crossbar formed, e.g., by nanoimprint, with nanowires connected by simple, similar, two-terminal nanodevices at each crosspoint.
For such devices, molecular single-electron latching switches [4] are presently the leading candidates, in particular because they may be fabricated using the self-assembled monolayer (SAM) technique, which has already given reproducible results for simpler molecular devices [5].
Fig. 1. CMOL circuit: (a) schematic side view, and (b) top-view zoom-in on several adjacent interface pins. (For clarity, only two adjacent nanodevices are shown.)
In order to overcome the CMOS/nanodevice interface problems pertinent to earlier proposals of hybrid circuits [6], in CMOL the interface is provided by pins that are distributed all over the circuit area, on the top of the CMOS stack. This allows the use of advanced nanowire patterning techniques (like nanoimprint) which do not have nanoscale accuracy of layer alignment [3]. The vital feature of this interface is the tilt, by angle α = arcsin(Fnano/βFCMOS), of the nanowire crossbar relative to the square arrays of interface pins (Fig. 1b). Here Fnano is the nanowiring half-pitch, FCMOS is the half-pitch of the CMOS subsystem, and β is a dimensionless factor larger than 1 that depends on the CMOS cell complexity. Figure 1b shows that this tilt allows the CMOS subsystem to address each nanodevice even if Fnano << βFCMOS. By now, it has been shown that CMOL circuits can combine high performance with high defect tolerance (which is necessary for any circuit using nanodevices) for several digital applications. In particular, CMOL circuits with defect rates below a few percent would enable terabit-scale memories [7], while the performance of FPGA-like CMOL circuits may be several hundred times above that of purely CMOS FPGAs (implemented with the same FCMOS), at acceptable power dissipation and defect tolerance above 20% [8].
In addition, the very structure of CMOL circuits makes them uniquely suitable for the implementation of more complex, mixed-signal information processing systems, including ultradense and ultrafast neuromorphic networks. The objective of this paper is to describe in brief the current status of our work on the development of so-called Distributed Crossbar Networks (“CrossNets”) that could provide high performance despite the limitations imposed by CMOL hardware. A more detailed description of our earlier results may be found in Ref. 9.
2 Synapses
The central device of CrossNet is a two-terminal latching switch [3, 4] (Fig. 2a), which is a combination of two single-electron devices, a transistor and a trap [3]. The device may be naturally implemented as a single organic molecule (Fig. 2b). Qualitatively, the device operates as follows: if voltage V = Vj – Vk applied between the external electrodes (in CMOL, nanowires) is low, the trap island has no net electric charge, and the single-electron transistor is closed. If voltage V approaches a certain threshold value V+ > 0, an additional electron is inserted into the trap island, and its field lifts the Coulomb blockade of the single-electron transistor, thus connecting the nanowires. The switch state may be reset (e.g., wires disconnected) by applying a lower voltage V < V- < V+. Due to the random character of single-electron tunneling [2], the quantitative description of the switch is by necessity probabilistic: actually, V determines only the rates Γ↑↓ of device switching between its ON and OFF states. The rates, in turn, determine the dynamics of probability p to have the transistor opened (i.e. wires connected):
$$\frac{dp}{dt} = \Gamma_{\uparrow}(1 - p) - \Gamma_{\downarrow}\,p. \tag{1}$$
The theory of single-electron tunneling [2] shows that, in a good approximation, the rates may be presented as
$$\Gamma_{\uparrow\downarrow} = \Gamma_0 \exp\{\pm e(V - S)/k_B T\}, \tag{2}$$
where Γ0 and S are constants depending on physical parameters of the latching switches. Note that despite the random character of switching, the strong nonlinearity of Eq. (2) makes it possible to limit the degree of the device “fuzziness”.
Fig. 2. (a) Schematics and (b) possible molecular implementation of the two-terminal single-electron latching switch.
3 CrossNets
Figure 3a shows the generic structure of a CrossNet. CMOS-implemented somatic cells (within the Fire Rate model, just nonlinear differential amplifiers, see Fig. 3b,c) apply their output voltages to “axonic” nanowires. If the latching switch, working as an elementary synapse, on the crosspoint of an axonic wire with the perpendicular “dendritic” wire is open, some current flows into the latter wire, charging it. Since such currents are injected into each dendritic wire through several (many) open synapses, their addition provides a natural passive analog summation of signals from the corresponding somas, typical for all neural networks. Examining Fig. 3a, please note the open-circuit terminations of axonic and dendritic lines at the borders of the somatic cells; due to these terminations the somas do not communicate directly (but only via synapses). The network shown in Fig. 3 is evidently feedforward; recurrent networks are achieved in the evident way by doubling the number of synapses and nanowires per somatic cell (Fig. 3c). Moreover, using dual-rail (bipolar) representation of the signal, and hence doubling the number of nanowires and elementary synapses once again, one gets a CrossNet with somas coupled by compact 4-switch groups [9]. Using Eqs. (1) and (2), it is straightforward to show that the average synaptic weight wjk of the group obeys the “quasi-Hebbian” rule:
$$\frac{d w_{jk}}{dt} = -4\Gamma_0 \sinh(\gamma S)\,\sinh(\gamma V_j)\,\sinh(\gamma V_k). \tag{3}$$
Fig. 3. (a) Generic structure of the simplest (feedforward, non-Hebbian) CrossNet. Red lines show “axonic”, and blue lines “dendritic” nanowires. Gray squares are interfaces between nanowires and CMOS-based somas (b, c). Signs show the dendrite input polarities. Green circles denote molecular latching switches forming elementary synapses. Bold red and blue points are open-circuit terminations of the nanowires, which do not allow somas to interact in bypass of synapses.
In the simplest cases (e.g., quasi-Hopfield networks with finite connectivity), the tri-level synaptic weights of the generic CrossNets are quite satisfactory, leading to just a very modest (~30%) network capacity loss. However, some applications (in particular, pattern classification) may require a larger number of weight quantization levels L (e.g., L ≈ 30 for a 1% fidelity [9]). This may be achieved by using compact square arrays (e.g., 4×4) of latching switches (Fig. 4). Various species of CrossNets [9] differ also by the way the somatic cells are distributed around the synaptic field. Figure 5 shows feedforward versions of two CrossNet types most explored so far: the so-called FlossBar and InBar. The former network is more natural for the implementation of multilayered perceptrons (MLP), while the latter system is preferable for recurrent network implementations and also allows a simpler CMOS design of somatic cells. The most important advantage of CrossNets over the hardware neural networks suggested earlier is that these networks can achieve enormous density combined with large cell connectivity M >> 1 in quasi-2D electronic circuits.
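The switching statistics of Eqs. (1) and (2) in Section 2 can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes constant applied voltage (so the rates are constant and Eq. (1) has a closed-form solution), assumes the upper sign in Eq. (2) belongs to the OFF-to-ON rate, and uses hypothetical function and parameter names (`switching_rates`, `p_on`, `kt_over_e`).

```python
import math


def switching_rates(v, s, gamma0, kt_over_e):
    """Eq. (2): rates Gamma0 * exp(+/- e(V - S) / kB T).

    kt_over_e is the thermal voltage kB*T/e in volts (about 0.0259 V at 300 K).
    Assumption: the '+' sign is the OFF->ON rate, the '-' sign the ON->OFF rate.
    """
    x = (v - s) / kt_over_e
    return gamma0 * math.exp(x), gamma0 * math.exp(-x)


def p_on(t, p0, rate_up, rate_down):
    """Closed-form solution of Eq. (1), dp/dt = rate_up*(1 - p) - rate_down*p,
    for constant rates and initial ON probability p0."""
    total = rate_up + rate_down
    p_inf = rate_up / total  # stationary ON probability
    return p_inf + (p0 - p_inf) * math.exp(-total * t)


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    up, down = switching_rates(v=0.6, s=0.5, gamma0=1.0, kt_over_e=0.0259)
    for t in (0.0, 0.01, 0.1):
        print(t, p_on(t, p0=0.0, rate_up=up, rate_down=down))
```

Because the rates depend exponentially on V − S, a bias only a few thermal voltages above S already drives the stationary ON probability close to 1, which is the sense in which the nonlinearity of Eq. (2) limits the device "fuzziness".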
4 CrossNet training
CrossNet training faces several hardware-imposed challenges: (i) The synaptic weight contribution provided by the elementary latching switch is binary, so that for most applications the multi-switch synapses (Fig. 4) are necessary. (ii) The only way to adjust any particular synaptic weight is to turn ON or OFF the corresponding latching switch(es). This is only possible by applying a certain voltage V = Vj – Vk between the two corresponding nanowires. During this procedure, other nanodevices attached to the same wires should not be disturbed. (iii) As stated above, synapse state switching is a statistical process, so that the degree of its “fuzziness” should be carefully controlled.
Fig. 4. Composite synapse for providing L = 2n^2 + 1 discrete levels of the weight in (a) operation and (b) weight adjustment modes. The dark-gray rectangles are resistive metallic strips at soma/nanowire interfaces.
Fig. 5. Two main CrossNet species: (a) FlossBar and (b) InBar, in the generic (feedforward, non-Hebbian, ternary-weight) case for the connectivity parameter M = 9. Only the nanowires and nanodevices coupling one cell (indicated with red dashed lines) to M post-synaptic cells (blue dashed lines) are shown; actually all the cells are similarly coupled.
We have shown that these challenges may be met using (at least) the following training methods [9]: (i) Synaptic weight import. This procedure starts with the training of a homomorphic “precursor” artificial neural network with continuous synaptic weights wjk, implemented in software, using one of the established methods (e.g., error backpropagation). Then the synaptic weights wjk are transferred to the CrossNet, with some “clipping” (rounding) due to the binary nature of elementary synaptic weights. To accomplish the transfer, pairs of somatic cells are sequentially selected via CMOS-level wiring.
Using the flexibility of CMOS circuitry, these cells are reconfigured to apply external voltages ±VW to the axonic and dendritic nanowires leading to a particular synapse, while all other nanowires are grounded. The voltage level VW is selected so that it does not switch the synapses attached to only one of the selected nanowires, while voltage 2VW applied to the synapse at the crosspoint of the selected wires is sufficient for its reliable switching. (In the composite synapses with quasi-continuous weights (Fig. 4), only a part of the corresponding switches is turned ON or OFF.) (ii) Error backpropagation. The synaptic weight import procedure is straightforward when wjk may be simply calculated, e.g., for the Hopfield-type networks. However, for very large CrossNets used, e.g., as pattern classifiers, the precursor network training may take an impracticably long time. In this case the direct training of a CrossNet may become necessary. We have developed two methods of such training, both based on “Hebbian” synapses consisting of 4 elementary synapses (latching switches) whose average weight dynamics obeys Eq. (3). This quasi-Hebbian rule may be used to implement the backpropagation algorithm either using a periodic time-multiplexing [9] or in a continuous fashion, using the simultaneous propagation of signals and errors along the same dual-rail channels. As a result, presently we may state that CrossNets may be taught to perform virtually all major functions demonstrated earlier with the usual neural networks, including the corrupted pattern restoration in the recurrent quasi-Hopfield mode and pattern classification in the feedforward MLP mode [11].
5 CrossNet performance estimates
The significance of this result may be only appreciated in the context of unparalleled physical parameters of CMOL CrossNets. The only fundamental limitation on the half-pitch Fnano (Fig. 1) comes from quantum-mechanical tunneling between nanowires.
If the wires are separated by vacuum, the corresponding specific leakage conductance becomes uncomfortably large (~10^-12 Ω^-1 m^-1) only at Fnano = 1.5 nm; however, since realistic insulation materials (SiO2, etc.) provide somewhat lower tunnel barriers, let us use a more conservative value Fnano = 3 nm. Note that this value corresponds to 10^12 elementary synapses per cm^2, so that for 4M = 10^4 and n = 4 the areal density of neural cells is close to 2×10^7 cm^-2. Both numbers are higher than those for the human cerebral cortex, despite the fact that the quasi-2D CMOL circuits have to compete with quasi-3D cerebral cortex. With the typical specific capacitance of 3×10^-10 F/m = 0.3 aF/nm, this gives nanowire capacitance C0 ≈ 1 aF per working elementary synapse, because the corresponding segment has length 4Fnano. The CrossNet operation speed is determined mostly by the time constant τ0 of dendrite nanowire capacitance recharging through resistances of open nanodevices. Since both the relevant conductance and capacitance increase similarly with M and n, τ0 ≈ R0C0. The possibilities of reduction of R0, and hence τ0, are limited mostly by acceptable power dissipation per unit area, which is close to Vs^2/[(2Fnano)^2 R0]. For room-temperature operation, the voltage scale V0 ≈ Vt should be of the order of at least 30 kBT/e ≈ 1 V to avoid thermally-induced errors [9]. With our number for Fnano, and a relatively high but acceptable power consumption of 100 W/cm^2, we get R0 ≈ 10^10 Ω (which is a very realistic value for single-molecule single-electron devices like one shown in Fig. 3). With this number, τ0 is as small as ~10 ns. This means that the CrossNet speed may be approximately six orders of magnitude (!) higher than that of the biological neural networks. Even scaling R0 up by a factor of 100 to bring power consumption to a more comfortable level of 1 W/cm^2 would still leave us at least a four-orders-of-magnitude speed advantage.
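The order-of-magnitude estimates of Section 5 can be re-derived with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check under stated assumptions: the input values are those quoted in the text, the function names are hypothetical, one elementary synapse is assumed to occupy an area (2Fnano)^2, and each cell is assumed to use 4M composite synapses of n×n switches (an interpretation of "4M = 10^4 and n = 4", not something the text spells out).

```python
# Values quoted in Section 5 (SI units).
F_NANO_M = 3e-9   # nanowire half-pitch, 3 nm
R0_OHM = 1e10     # resistance of an open nanodevice
C0_F = 1e-18      # nanowire capacitance per working synapse, ~1 aF
V_VOLT = 1.0      # voltage scale, ~30 kB*T/e at room temperature


def time_constant(r0, c0):
    """tau0 ~ R0 * C0, in seconds."""
    return r0 * c0


def synapse_density_per_cm2(f_nano):
    """Elementary synapses per cm^2, assuming one per (2*Fnano)^2 of area."""
    pitch_cm = 2 * f_nano * 100  # metres -> centimetres
    return 1.0 / pitch_cm ** 2


def cell_density_per_cm2(f_nano, four_m, n):
    """Cells per cm^2, assuming 4M composite synapses of n*n switches per cell."""
    return synapse_density_per_cm2(f_nano) / (four_m * n * n)


def power_density_w_per_cm2(v, f_nano, r0):
    """V^2 / ((2*Fnano)^2 * R0), converted from W/m^2 to W/cm^2."""
    return v ** 2 / ((2 * f_nano) ** 2 * r0) / 1e4


if __name__ == "__main__":
    print("tau0 [s]:", time_constant(R0_OHM, C0_F))
    print("synapses per cm^2:", synapse_density_per_cm2(F_NANO_M))
    print("cells per cm^2:", cell_density_per_cm2(F_NANO_M, 1e4, 4))
    print("power [W/cm^2]:", power_density_w_per_cm2(V_VOLT, F_NANO_M, R0_OHM))
```

With these inputs the results land within a small factor of the figures quoted in the text (about 10 ns, 10^12 synapses per cm^2, 2×10^7 cells per cm^2, and power of the order of 100 W/cm^2), which is the level of agreement expected from rounded estimates.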
6 Discussion: Possible applications
These estimates make us believe that CMOL CrossNet chips may revolutionize the neuromorphic network applications. Let us start with the example of relatively small (1-cm^2-scale) chips used for recognition of a face in a crowd [11]. The most difficult feature of such recognition is the search for face location, i.e. optimal placement of a face on the image relative to the panel providing input for the processing network. The enormous density and speed of CMOL hardware make it possible to time-and-space multiplex this task (Fig. 6). In this approach, the full image (say, formed by CMOS photodetectors on the same chip) is divided into P rectangular panels of h×w pixels, corresponding to the expected size and approximate shape of a single face. A CMOS-implemented communication channel passes input data from each panel to the corresponding CMOL neural network, providing its shift in time, say using the TV scanning pattern (red line in Fig. 6). The standard methods of image classification require the network to have just a few hidden layers, so that the time interval Δt necessary for each mapping position may be so short that the total pattern recognition time T = hwΔt may be acceptable even for online face recognition.
Fig. 6. Scan mapping of the input image on CMOL CrossNet inputs. Red lines show the possible time sequence of image pixels sent to a certain input of the network processing the image from the upper-left panel of the pattern.
Indeed, let us consider a 4-Megapixel image partitioned into 4K 32×32-pixel panels (h = w = 32). This panel will require an MLP net with several (say, four) layers with 1K cells each in order to compare the panel image with ~10^3 stored faces.
With the feasible 4-nm nanowire half-pitch, and 65-level synapses (sufficient for better than 99% fidelity [9]), each interlayer crossbar would require chip area about (4K×64 nm)^2 = 64×64 μm^2, fitting 4×4K of them on a ~0.6 cm^2 chip. (The CMOS somatic-layer and communication-system overheads are negligible.) With the acceptable power consumption of the order of 10 W/cm^2, the input-to-output signal propagation in such a network will take only about 50 ns, so that Δt may be of the order of 100 ns and the total time T = hwΔt of processing one frame of the order of 100 microseconds, much shorter than the typical TV frame time of ~10 milliseconds. The remaining two-orders-of-magnitude time gap may be used, for example, for double-checking the results via stopping the scan mapping (Fig. 6) at the most promising position. (For this, a simple feedback from the recognition output to the mapping communication system is necessary.) It is instructive to compare the estimated CMOL chip speed with that of the implementation of a similar parallel network ensemble on a CMOS signal processor (say, also combined on the same chip with an array of CMOS photodetectors). Even assuming an extremely high performance of 30 billion additions/multiplications per second, we would need ~4×4K×1K×(4K)^2/(30×10^9) ≈ 10^4 seconds ≈ 3 hours per frame, evidently incompatible with the online image stream processing. Let us finish with a brief (and much more speculative) discussion of possible long-term prospects of CMOL CrossNets. Eventually, large-scale (~30×30 cm^2) CMOL circuits may become available. According to the estimates given in the previous section, the integration scale of such a system (in terms of both neural cells and synapses) will be comparable with that of the human cerebral cortex.
Equipped with a set of broadband sensor/actuator interfaces, such a (necessarily hierarchical) system may be capable, after a period of initial supervised training, of further self-training in the process of interaction with its environment, with a speed several orders of magnitude higher than that of its biological prototypes. Needless to say, the successful development of such self-developing systems would have a major impact not only on all information technologies, but also on society as a whole.
Acknowledgments
This work has been supported in part by the AFOSR, MARCO (via FENA Center), and NSF. Valuable contributions made by Simon Fölling, Özgür Türel and Ibrahim Muckra, as well as useful discussions with P. Adams, J. Barhen, D. Hammerstrom, V. Protopopescu, T. Sejnowski, and D. Strukov are gratefully acknowledged.
References
[1] Frank, D. J. et al. (2001) Device scaling limits of Si MOSFETs and their application dependencies. Proc. IEEE 89(3): 259-288.
[2] Likharev, K. K. (2003) Electronics below 10 nm, in J. Greer et al. (eds.), Nano and Giga Challenges in Microelectronics, pp. 27-68. Amsterdam: Elsevier.
[3] Likharev, K. K. and Strukov, D. B. (2005) CMOL: Devices, circuits, and architectures, in G. Cuniberti et al. (eds.), Introducing Molecular Electronics, Ch. 16. Springer, Berlin.
[4] Fölling, S., Türel, Ö. & Likharev, K. K. (2001) Single-electron latching switches as nanoscale synapses, in Proc. of the 2001 Int. Joint Conf. on Neural Networks, pp. 216-221. Mount Royal, NJ: Int. Neural Network Society.
[5] Wang, W. et al. (2003) Mechanism of electron conduction in self-assembled alkanethiol monolayer devices. Phys. Rev. B 68(3): 035416 1-8.
[6] Stan M. et al. (2003) Molecular electronics: From devices and interconnect to circuits and architecture, Proc. IEEE 91(11): 1940-1957.
[7] Strukov, D. B. & Likharev, K. K. (2005) Prospects for terabit-scale nanoelectronic memories. Nanotechnology 16(1): 137-148.
[8] Strukov, D. B. & Likharev, K. K. (2005) CMOL FPGA: A reconfigurable architecture for hybrid digital circuits with two-terminal nanodevices. Nanotechnology 16(6): 888-900.
[9] Türel, Ö. et al. (2004) Neuromorphic architectures for nanoelectronic circuits, Int. J. of Circuit Theory and Appl. 32(5): 277-302.
[10] See, e.g., Hertz J. et al. (1991) Introduction to the Theory of Neural Computation. Cambridge, MA: Perseus.
[11] Lee, J. H. & Likharev, K. K. (2005) CrossNets as pattern classifiers. Lecture Notes in Computer Sciences 3575: 434-441.

2 0.97541815 18 nips-2005-Active Learning For Identifying Function Threshold Boundaries

Author: Brent Bryan, Robert C. Nichol, Christopher R. Genovese, Jeff Schneider, Christopher J. Miller, Larry Wasserman

Abstract: We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rate, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 − α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.

3 0.9705705 6 nips-2005-A Connectionist Model for Constructive Modal Reasoning

Author: Artur Garcez, Luis C. Lamb, Dov M. Gabbay

Abstract: We present a new connectionist model for constructive, intuitionistic modal reasoning. We use ensembles of neural networks to represent intuitionistic modal theories, and show that for each intuitionistic modal program there exists a corresponding neural network ensemble that computes the program. This provides a massively parallel model for intuitionistic modal reasoning, and sets the scene for integrated reasoning, knowledge representation, and learning of intuitionistic theories in neural networks, since the networks in the ensemble can be trained by examples using standard neural learning algorithms.

4 0.95138288 180 nips-2005-Spectral Bounds for Sparse PCA: Exact and Greedy Algorithms

Author: Baback Moghaddam, Yair Weiss, Shai Avidan

Abstract: Sparse PCA seeks approximate sparse “eigenvectors” whose projections capture the maximal variance of data. As a cardinality-constrained and non-convex optimization problem, it is NP-hard and is encountered in a wide range of applied fields, from bio-informatics to finance. Recent progress has focused mainly on continuous approximation and convex relaxation of the hard cardinality constraint. In contrast, we consider an alternative discrete spectral formulation based on variational eigenvalue bounds and provide an effective greedy strategy as well as provably optimal solutions using branch-and-bound search. Moreover, the exact methodology used reveals a simple renormalization step that improves approximate solutions obtained by any continuous method. The resulting performance gain of discrete algorithms is demonstrated on real-world benchmark data and in extensive Monte Carlo evaluation trials.

same-paper 5 0.94683838 115 nips-2005-Learning Shared Latent Structure for Image Synthesis and Robotic Imitation

Author: Aaron Shon, Keith Grochow, Aaron Hertzmann, Rajesh P. Rao

Abstract: We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogeneous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithm’s ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data.

6 0.73783439 200 nips-2005-Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery

7 0.65332514 181 nips-2005-Spiking Inputs to a Winner-take-all Network

8 0.63053191 99 nips-2005-Integrate-and-Fire models with adaptation are good enough

9 0.62513864 169 nips-2005-Saliency Based on Information Maximization

10 0.62398785 179 nips-2005-Sparse Gaussian Processes using Pseudo-inputs

11 0.62061477 149 nips-2005-Optimal cue selection strategy

12 0.61814344 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

13 0.61626089 21 nips-2005-An Alternative Infinite Mixture Of Gaussian Process Experts

14 0.61309552 72 nips-2005-Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation

15 0.61252016 45 nips-2005-Conditional Visual Tracking in Kernel Space

16 0.61013621 93 nips-2005-Ideal Observers for Detecting Motion: Correspondence Noise

17 0.6058929 16 nips-2005-A matching pursuit approach to sparse Gaussian process regression

18 0.60304689 109 nips-2005-Learning Cue-Invariant Visual Responses

19 0.59463811 68 nips-2005-Factorial Switching Kalman Filters for Condition Monitoring in Neonatal Intensive Care

20 0.59137261 136 nips-2005-Noise and the two-thirds power Law