nips nips2010 nips2010-20 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Shuang Wu, Xuming He, Hongjing Lu, Alan L. Yuille
Abstract: The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes. Previous work has assumed that two different mechanisms are involved in processing these two types of motion. In this paper, we propose a hierarchical model as a unified framework for modeling both short-range and long-range motion perception. Our model consists of two key components: a data likelihood that proposes multiple motion hypotheses using nonlinear matching, and a hierarchical prior that imposes slowness and spatial smoothness constraints on the motion field at multiple scales. We tested our model on two types of stimuli, random dot kinematograms and multiple-aperture stimuli, both commonly used in human vision research. We demonstrate that the hierarchical model adequately accounts for human performance in psychophysical experiments.
Reference: text
sentIndex sentText sentNum sentScore
1 A unified model of short-range and long-range motion perception Shuang Wu Department of Statistics UCLA Los Angeles , CA 90095 shuangw@stat. [sent-1, score-0.765]
2 edu Abstract The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes. [sent-8, score-0.847]
3 In this paper, we propose a hierarchical model as a unified framework for modeling both short-range and long-range motion perception. [sent-10, score-0.755]
4 Our model consists of two key components: a data likelihood that proposes multiple motion hypotheses using nonlinear matching, and a hierarchical prior that imposes slowness and spatial smoothness constraints on the motion field at multiple scales. [sent-11, score-1.719]
5 We tested our model on two types of stimuli, random dot kinematograms and multiple-aperture stimuli, both commonly used in human vision research. [sent-12, score-0.308]
6 As illustrated by the motion sequence depicted in Figure 1, humans readily perceive the baseball player’s body movements and the faster-moving baseball simultaneously. [sent-15, score-0.928]
7 Separate motion systems have been proposed to explain human perception in scenarios like this example. [sent-19, score-0.83]
8 In particular, Braddick [1] proposed that there is a short-range motion system which is responsible for perceiving movements with relatively small displacements (e. [sent-20, score-0.829]
9 , the player’s movement), and a long-range motion system which perceives motion with large displacements (e. [sent-22, score-1.406]
10 Lu and Sperling [2] have further argued for the existence of three motion systems in human vision. [sent-25, score-0.734]
11 The first- and second-order systems conduct motion analysis on luminance and texture information respectively, while the third-order system uses a feature-tracking strategy. [sent-26, score-0.694]
12 In the baseball example, the first-order motion system would be used to perceive the player’s movements, but the third-order system would be required for perceiving the faster motion of the baseball. [sent-27, score-1.595]
13 Short-range motion and first-order motion appear to apply to the same class of phenomena, and can be modeled using computational theories that are based on motion energy or related techniques. [sent-28, score-2.113]
14 However, long-range motion and third-order Figure 1: Left panel: Short-range and long-range motion: two frames from a baseball sequence where the ball moves with much faster speed than the other objects. [sent-29, score-0.788]
15 Each node represents motion at a different location and scale. [sent-31, score-0.716]
16 A child node can have multiple parents, and the prior constraints on motion are expressed by parent-child interactions. [sent-32, score-0.81]
17 motion employ qualitatively different computational strategies involving tracking features over time, which may require attention-driven processes. [sent-33, score-0.669]
18 In contrast to these previous multi-system theories [2, 3], we develop a unified single-system framework to account for these phenomena of human motion perception. [sent-34, score-0.756]
19 We model motion estimation as an inference problem which uses flexible prior assumptions about motion flows and statistical models for quantifying the uncertainty in motion measurement. [sent-35, score-2.063]
20 First, the prior model is defined over a hierarchical graph, see Figure 1, where the nodes of the graph represent the motion at different scales. [sent-37, score-0.912]
21 Such a representation makes it possible to define motion priors and contextual effects at a range of different scales, and so differs from other models of motion perception based on motion priors [5, 6]. [sent-39, score-2.126]
22 This model connects lower-level nodes to multiple coarser-level nodes, resulting in a loopy graph structure, which imposes a more flexible prior than tree-structured models (e.g. [sent-40, score-0.245]
23 We define a probability distribution on this graph using potentials defined over the graph cliques to capture spatial smoothness constraints [10] at different scales and slowness constraints [5, 11, 12, 13]. [sent-42, score-0.3]
24 , the likelihood term allows many possible motions) which is resolved in our model by imposing the hierarchical motion prior. [sent-46, score-0.779]
25 Instead we use a bottom-up compositional/hierarchical approach where local hypotheses about the motion are combined to form hypotheses for larger regions of the image. [sent-48, score-0.727]
26 We tested our model using two types of stimuli commonly used in human vision research. [sent-50, score-0.28]
27 The first stimulus type is random dot kinematograms (RDKs), where some of the dots (the signal) move coherently with large displacements, whereas other dots (the noise) move randomly. [sent-51, score-0.594]
28 RDKs are one of the most important stimuli used in both physiological and psychophysical studies of motion perception. [sent-52, score-0.88]
29 For example, electrophysiological studies have used RDKs to analyze the neuronal basis of motion perception, identifying a functional link between the activity of motion-selective neurons and behavioral judgments of motion perception [15]. [sent-53, score-1.434]
30 Psychophysical studies have used RDKs to measure the sensitivity of the human visual system for perceiving coherent motion, and also to infer how motion information is integrated to perceive global motion under different viewing conditions [16]. [sent-54, score-1.698]
31 We used two-frame RDKs as an example of a long-range motion stimulus. [sent-55, score-0.669]
32 The second stimulus type is moving gratings or plaids. [sent-56, score-0.282]
33 For example, when randomly orientated lines or grating elements drift behind apertures, the perceived direction of motion is heavily biased by the orientation of the lines/gratings, as well as by the shape and contrast of the apertures [17, 18, 19]. [sent-58, score-0.873]
34 Multiple-aperture stimuli have also recently been used to study coherent motion perception with short-range motion stimuli [20, 21]. [sent-59, score-1.768]
35 For both types of stimuli we compared the model predictions with human performance across various experimental conditions. [sent-60, score-0.25]
36 2 Hierarchical Model for Motion Estimation Our hierarchical model represents a motion field using a graph G = (V, E), which has L + 1 hierarchical levels, i. [sent-61, score-0.891]
37 The edges E of the graph connect nodes at each level of the hierarchy to nodes in the neighboring levels. [sent-84, score-0.35]
38 Specifically, edges connect node $\nu^l(i, j)$ at level $l$ to a set of child nodes $Ch^l(i, j) = \{\nu^{l-1}(i', j')\}$ at level $l - 1$ satisfying $2i - d \le i' \le 2i + d$, $2j - d \le j' \le 2j + d$. [sent-85, score-0.317]
39 To apply the model to motion estimation, we define a state variable $u^l(i, j)$ at each node to represent the motion, and connect the 0th-level nodes to two consecutive image frames, $D = (I_t(x), I_{t+1}(x))$. [sent-89, score-1.284]
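The child-set constraint above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `child_indices`, the rectangular lattice size, and the clipping at the lattice border are assumptions introduced here.

```python
def child_indices(i, j, d, width, height):
    """Return the level-(l-1) sites (i', j') that are children of node (i, j).

    Implements 2i - d <= i' <= 2i + d and 2j - d <= j' <= 2j + d.
    Clipping to a width x height child lattice is an assumption of this sketch.
    """
    return [
        (ci, cj)
        for ci in range(max(0, 2 * i - d), min(width - 1, 2 * i + d) + 1)
        for cj in range(max(0, 2 * j - d), min(height - 1, 2 * j + d) + 1)
    ]
```

With d = 1, site (3, 3) is returned for both parent (1, 1) and parent (2, 2), which illustrates how a child node can end up with multiple parents.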
40 The problem of motion estimation is to estimate the 2D motion field u(x) at time t for every pixel site x from input D. [sent-90, score-1.369]
41 For simplicity, we use $u^l_i$ to denote the motion instead of $u^l(i, j)$ in the following sections. [sent-91, score-1.453]
42 This robust norm helps deal with the measurement noise that often occurs at motion boundaries and prevents over-smoothing at the higher levels. [sent-95, score-0.693]
43 The second term imposes a slowness prior on the motion, weighted by the coefficient α. [sent-100, score-0.821]
44 These similarity scores at x give confidence values for different local motion hypotheses: higher similarity means the motion is more likely, while lower similarity means it is less likely. [sent-102, score-1.338]
45 2) The Hierarchical Prior $\{E^l_u\}$: We define a hierarchical prior on the slowness and spatial smoothness of motion fields. [sent-103, score-0.984]
46 The first term of this prior is expressed by energy terms between nodes at different levels of the hierarchy and enforces a smoothness preference for their states u – that the motion of a child node is similar to the motion of its parent. [sent-104, score-1.828]
47 This imposes weak smoothness on the motion field and allows abrupt changes at motion boundaries. [sent-106, score-1.464]
48 The second term is an L1 norm of motion velocities that encourages slowness. [sent-107, score-0.669]
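As a rough illustration of the two prior terms just described, the sketch below adds a parent-child smoothness penalty to an L1 slowness penalty. The exact functional form is not recoverable from the text above, so the L1 smoothness norm, the weights `beta` and `gamma`, and the choice to apply the slowness term to the child node are all assumptions of this example.

```python
def l1_norm(v):
    """L1 norm of a 2D motion vector (vx, vy)."""
    return abs(v[0]) + abs(v[1])


def prior_energy(u_child, u_parent, beta=1.0, gamma=0.1):
    """Hypothetical parent-child prior energy for one edge of the hierarchy.

    beta  weights the smoothness term (child motion close to parent motion).
    gamma weights the slowness term (small child velocity).
    Both weights and the L1 smoothness norm are assumptions of this sketch.
    """
    diff = (u_child[0] - u_parent[0], u_child[1] - u_parent[1])
    return beta * l1_norm(diff) + gamma * l1_norm(u_child)
```

Because the smoothness cost grows only linearly with the parent-child difference, a large disagreement is penalized less severely than under a quadratic cost, which is one common way to tolerate abrupt changes at motion boundaries.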
49 Note that our hierarchical smoothness prior differs from conventional smoothness constraints, e. [sent-116, score-0.316]
50 , [10], because they impose smoothness ’sideways’ between neighboring pixels at the same resolution level, which requires that the motion is similar between neighboring sites at the pixel level only. [sent-118, score-0.939]
51 By contrast, we impose smoothness by requiring that child nodes have similar motions to their parent nodes. [sent-121, score-0.321]
52 2 Motion Estimation We estimate the motion field by computing the most probable motion $\hat{U} = \arg\max_U P(U|D)$, where $P(U|D)$ was defined as a Gibbs distribution in equation (1). [sent-124, score-1.338]
53 Performing inference on this model is challenging since the energy is defined over a hierarchical graph structure with many closed loops, the state variables U are continuous-valued, and the energy function is non-convex. [sent-125, score-0.331]
54 Our strategy is to convert this into a discrete optimization problem by quantizing the motion state space. [sent-126, score-0.669]
55 For example, we estimate the motion at an integer-valued resolution if the accuracy is sufficient for certain experimental settings. [sent-127, score-0.669]
56 We first approximate the hierarchical graph with a tree-structured model by making multiple copies of child nodes such that each child node has a single parent (see [23]). [sent-133, score-0.347]
57 More specifically, we compute an approximate energy function $\tilde{E}(U)$ recursively by exploiting the tree structure: $\tilde{E}(u^{l+1}_i) = \sum_{j \in Ch^{l+1}(i)} \min_{u^l_j} [E^l_u(u^{l+1}_i, u^l_j) + \tilde{E}(u^l_j)]$, where $\tilde{E}(u^0_i)$ at the bottom level is the data energy $E_d(u^0_i; D)$. [sent-135, score-1.027]
58 Given the top-level motion ($\hat{u}^L_i$), we then compute the optimal motion configuration for the other levels using the following top-down procedure. [sent-138, score-1.376]
59 We minimize the following energy function recursively for each node: $\hat{u}^l_j = \arg\min_{u^l_j} [\sum_{i \in Pa^l(j)} E^l_u(\hat{u}^{l+1}_i; u^l_j) + \tilde{E}(u^l_j)]$, where $Pa^l(j)$ is the set of parents of level-$l$ node $j$. [sent-140, score-1.352]
60 In the top-down pass, spatial smoothness is imposed by the motion estimates at higher levels, which provide contextual information to disambiguate the motion estimated at lower levels. [sent-141, score-1.533]
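The two passes can be sketched for a small discretized problem as follows. This is a simplified illustration, not the authors' implementation: motions are one-dimensional integers, the pairwise energy is passed in as a plain function, and the top-level estimate is taken to be the minimizer of the top-level approximate energy, which is an assumption here. All function and variable names are hypothetical.

```python
def bottom_up(data_energy, children, states, pair_energy):
    """Compute approximate energies E[l][i][state] level by level.

    data_energy: list over level-0 nodes of {state: energy} tables.
    children[l][i]: level-l child indices of node i at level l + 1.
    """
    E = [data_energy]
    for level_children in children:
        prev = E[-1]
        level = []
        for child_ids in level_children:
            # Sum over children of the best child state given the parent state s.
            level.append({
                s: sum(min(pair_energy(s, t) + prev[j][t] for t in states)
                       for j in child_ids)
                for s in states
            })
        E.append(level)
    return E


def top_down(E, children, states, pair_energy):
    """Pick the top-level states, then refine lower levels given their parents."""
    top = len(E) - 1
    est = [None] * len(E)
    est[top] = [min(states, key=lambda s: table[s]) for table in E[top]]
    for l in range(top - 1, -1, -1):
        parents = [[] for _ in E[l]]
        for i, child_ids in enumerate(children[l]):
            for j in child_ids:
                parents[j].append(i)
        est[l] = [
            min(states, key=lambda t, j=j: sum(pair_energy(est[l + 1][i], t)
                                               for i in parents[j]) + E[l][j][t])
            for j in range(len(E[l]))
        ]
    return est


# Toy example: two bottom-level nodes sharing one parent, L1 pairwise energy.
states = (-1, 0, 1)
data = [{-1: 3.0, 0: 2.0, 1: 0.0}, {-1: 0.5, 0: 3.0, 1: 0.6}]
children = [[[0, 1]]]
E = bottom_up(data, children, states, lambda s, t: abs(s - t))
estimate = top_down(E, children, states, lambda s, t: abs(s - t))
```

In the toy example the second bottom-level node slightly prefers motion -1 on its own data energy, but the parent's estimate pulls it to +1, which is the kind of contextual disambiguation the top-down pass is described as providing.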
61 1 The stimuli and simulation procedures Random dot kinematogram (RDK) stimuli consist of two image frames with N dots in each frame [1, 16, 6]. [sent-146, score-0.67]
62 The difficulty of perceiving coherent motion in RDK stimuli is due to the large correspondence uncertainty introduced by the noise dots, as shown in the rightmost panel of Figure (3). [sent-152, score-1.279]
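A two-frame RDK of the kind described above could be generated along the following lines. This is a sketch under stated assumptions: the square field of side `size`, horizontal signal motion with wrap-around, and redrawing noise dots at uniformly random positions in the second frame are choices made for this example, not details taken from the text.

```python
import random


def make_rdk(n_dots, coherence, shift, size=100.0, seed=0):
    """Generate a two-frame random dot kinematogram.

    The first round(coherence * n_dots) dots are signal dots and move
    horizontally by `shift`; the remaining noise dots are redrawn at random.
    """
    rng = random.Random(seed)
    frame1 = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_dots)]
    n_signal = round(coherence * n_dots)
    frame2 = []
    for k, (x, y) in enumerate(frame1):
        if k < n_signal:
            frame2.append(((x + shift) % size, y))  # coherent displacement
        else:
            frame2.append((rng.uniform(0, size), rng.uniform(0, size)))  # noise
    return frame1, frame2, n_signal
```

Since the dots carry no identity, every dot in the first frame has many candidate matches in the second, which is the correspondence uncertainty referred to above.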
63 Figure 3: The left three panels show coherent stimuli with N = 20, C = 0. [sent-153, score-0.264]
64 The arrows show the motion of those dots which are moving coherently. [sent-158, score-0.83]
65 Correspondence noise is illustrated by the rightmost panel showing that a dot in the first frame has many candidate matches in the second frame. [sent-159, score-0.317]
66 Barlow and Tripathy [16] used RDK stimuli to investigate how dot density can affect human performance in a global motion discrimination task. [sent-160, score-1.037]
67 They found that human performance (measured by the coherence threshold) varies little with dot density. [sent-161, score-0.33]
68 We tested our model on the same task to judge Figure 4: Estimated motion fields for random dot kinematograms. [sent-162, score-0.797]
69 First row: 50 dots in the RDK stimulus; Second row: 100 dots in the RDK stimulus; Column-wise, coherence ratio C = 0. [sent-163, score-0.454]
70 The arrows indicate the motion estimated for each dot. [sent-168, score-0.691]
71 the global motion direction using RDK motion stimuli as the input images. [sent-169, score-1.495]
72 We applied our model to estimate motion fields and used the average velocity to indicate the global motion direction (to the left or to the right). [sent-170, score-1.392]
73 The number of dots N took the values 40, 80, 100, 200, 400, and 800, corresponding to a wide range of dot densities. [sent-172, score-0.256]
74 The model performance was computed for each coherence ratio to fit psychometric functions and to find the coherence threshold at which model performance can reach 75% accuracy. [sent-173, score-0.354]
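One simple way to read a 75% threshold off measured accuracies is sketched below. Note that it uses linear interpolation between neighbouring coherence levels instead of fitting a parametric psychometric function as described above; the function name and the example numbers in the usage note are illustrative only.

```python
def coherence_threshold(coherences, accuracies, criterion=0.75):
    """Interpolate the coherence at which accuracy first reaches `criterion`.

    `coherences` must be sorted in increasing order. Returns None when the
    criterion is never reached. Linear interpolation is a stand-in for a
    fitted psychometric function.
    """
    points = list(zip(coherences, accuracies))
    if points and points[0][1] >= criterion:
        return points[0][0]
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 < criterion <= a1:
            return c0 + (criterion - a0) * (c1 - c0) / (a1 - a0)
    return None
```

For instance, accuracies of 0.5, 0.7 and 0.9 at coherence ratios 0.1, 0.2 and 0.4 give a threshold of about 0.25.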
75 2 The Results Figure (4) shows examples of the estimated motion field for various values of dot number N and coherence ratio C. [sent-175, score-0.993]
76 The model outputs provide visually coherent motion estimates when the coherence ratio was greater than 0. [sent-176, score-0.941]
77 With the increase of coherence ratio, the estimated motion flow appears to be more coherent. [sent-178, score-0.828]
78 The coherence threshold, using the criterion of 75% accuracy, showed that model performance varied little with the increase of dot density, which is consistent with human performance reported in psychophysical experiments [16, 6]. [sent-181, score-0.387]
79 Two types of elements were used in our simulations: (i) drifting sine-wave gratings with random orientation, and (ii) plaids, which include two gratings with orthogonal orientations. [sent-184, score-0.501]
80 For the CN signal gratings, the motion vi was set to a fixed value v. [sent-190, score-0.692]
81 Figure 5: Left panel: Figure 2 in [16] showing that the coherence ratio threshold varies very little with dot density. [sent-201, score-0.302]
82 The plaid elements combine two gratings with orthogonal orientations (each grating has the same speed but can have a different motion direction). [sent-208, score-0.986]
83 Figure 7: Left two panels: Estimated motion fields of grating and plaid stimuli. [sent-227, score-0.904]
84 Rightmost panel: Psychometric functions of grating and plaid stimuli. [sent-228, score-0.288]
85 2 Simulation procedures and results The left two panels in Figure (7) show the estimated motion fields for the two types of stimulus we studied with the same coherence ratios 0. [sent-230, score-0.995]
86 Plaid stimuli produce a more coherent estimated motion field than grating stimuli, which is understandable. [sent-232, score-1.027]
87 We tested our model in an 8-direction discrimination task for estimating global motion direction [20]. [sent-234, score-0.723]
88 We ran 300 trials for each stimulus type, and used the direction of the average motion to predict the global motion direction. [sent-236, score-1.495]
89 the number of times our model predicted the correct motion direction from 8 alternatives – was calculated at different coherence ratio levels. [sent-239, score-0.876]
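The direction readout and accuracy measure described above can be sketched as follows. The 8 alternatives are assumed here to be equally spaced at 45 degree steps starting from rightward motion; that layout, and the function names, are assumptions of this example.

```python
import math


def predict_direction(flow, n_directions=8):
    """Index of the alternative closest to the direction of the mean motion.

    `flow` is a list of (vx, vy) estimates. Index 0 is rightward and indices
    increase counter-clockwise (an assumed convention).
    """
    mean_x = sum(vx for vx, _ in flow) / len(flow)
    mean_y = sum(vy for _, vy in flow) / len(flow)
    angle = math.atan2(mean_y, mean_x) % (2 * math.pi)
    step = 2 * math.pi / n_directions
    return round(angle / step) % n_directions


def accuracy(predictions, targets):
    """Fraction of trials on which the predicted direction matches the target."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)
```

Averaging the vectors before taking the angle means that large, consistent motion estimates dominate the prediction, while randomly oriented estimates tend to cancel.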
90 This difference between gratings and plaids is shown in the rightmost panel of Figure (7), where the psychometric function of plaid stimuli is always above that of grating stimuli, indicating better performance. [sent-240, score-0.846]
91 It differs from traditional motion energy models because it does not use spatiotemporal filtering. [sent-243, score-0.776]
92 Note that it was shown in [6] that motion energy models are not well suited to the long-range motion stimuli studied in this paper. [sent-244, score-1.576]
93 The local ambiguities of motion are resolved by a novel hierarchical prior which combines slowness and smoothness at a range of different scales. [sent-245, score-1.014]
94 Our model accounts well for human perception of both short-range and long-range motion using the two standard stimulus types (RDKs and gratings). [sent-246, score-0.964]
95 It also has the computational motivation of being able to represent prior knowledge about motion at different scales and to allow efficient computation. [sent-248, score-0.698]
96 Three-systems theory of human visual motion perception: review and update. [sent-260, score-0.785]
97 Cortical dynamics of visual motion perception: short-range and long-range apparent motion. [sent-319, score-0.744]
98 The influence of terminators on motion integration across space. [sent-395, score-0.669]
99 Adaptive pooling of visual motion signals by the human visual system revealed with a novel multi-element stimulus. [sent-409, score-0.861]
100 A comparison of global motion perception using a multiple-aperture stimulus. [sent-415, score-0.786]
wordName wordTfidf (topN-words)
[('motion', 0.669), ('ul', 0.392), ('gratings', 0.158), ('stimuli', 0.154), ('rdk', 0.143), ('dots', 0.14), ('coherence', 0.137), ('plaids', 0.13), ('dot', 0.128), ('rdks', 0.107), ('grating', 0.105), ('stimulus', 0.103), ('perception', 0.096), ('smoothness', 0.089), ('eu', 0.087), ('baseball', 0.086), ('slowness', 0.086), ('hierarchical', 0.086), ('energy', 0.084), ('panel', 0.081), ('nodes', 0.078), ('coherent', 0.077), ('human', 0.065), ('child', 0.065), ('motions', 0.064), ('perceiving', 0.063), ('perceive', 0.058), ('psychophysical', 0.057), ('sin', 0.055), ('ucla', 0.054), ('kinematograms', 0.054), ('plaid', 0.054), ('level', 0.051), ('displacement', 0.051), ('visual', 0.051), ('graph', 0.05), ('node', 0.047), ('chl', 0.047), ('rightmost', 0.045), ('player', 0.044), ('angeles', 0.044), ('pp', 0.044), ('displacements', 0.043), ('psychometric', 0.043), ('copies', 0.042), ('optical', 0.041), ('los', 0.04), ('lu', 0.04), ('frame', 0.039), ('levels', 0.038), ('imposes', 0.037), ('ratio', 0.037), ('neighboring', 0.037), ('apertures', 0.036), ('yuille', 0.035), ('eld', 0.035), ('lattice', 0.034), ('frames', 0.033), ('direction', 0.033), ('panels', 0.033), ('ambiguities', 0.031), ('hongjing', 0.031), ('sideways', 0.031), ('hierarchy', 0.031), ('pixel', 0.031), ('types', 0.031), ('orientation', 0.03), ('vision', 0.03), ('hypotheses', 0.029), ('movements', 0.029), ('barlow', 0.029), ('coherently', 0.029), ('prior', 0.029), ('cn', 0.029), ('enforces', 0.029), ('ambiguous', 0.027), ('inference', 0.027), ('cos', 0.026), ('uni', 0.026), ('correspondence', 0.026), ('spatial', 0.025), ('impose', 0.025), ('connect', 0.025), ('system', 0.025), ('drifting', 0.024), ('resolved', 0.024), ('apparent', 0.024), ('noise', 0.024), ('recursively', 0.024), ('enables', 0.024), ('vi', 0.023), ('differs', 0.023), ('estimated', 0.022), ('image', 0.022), ('theories', 0.022), ('parents', 0.021), ('global', 0.021), ('estimates', 0.021), ('moved', 0.021), ('moving', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 20 nips-2010-A unified model of short-range and long-range motion perception
2 0.5753926 98 nips-2010-Functional form of motion priors in human motion perception
Author: Hongjing Lu, Tungyou Lin, Alan Lee, Luminita Vese, Alan L. Yuille
Abstract: It has been speculated that the human motion system combines noisy measurements with prior expectations in an optimal, or rational, manner. The basic goal of our work is to discover experimentally which prior distribution is used. More specifically, we seek to infer the functional form of the motion prior from the performance of human subjects on motion estimation tasks. We restricted ourselves to priors which combine three terms for motion slowness, first-order smoothness, and second-order smoothness. We focused on two functional forms for prior distributions: L2-norm and L1-norm regularization corresponding to the Gaussian and Laplace distributions respectively. In our first experimental session we estimate the weights of the three terms for each functional form to maximize the fit to human performance. We then measured human performance for motion tasks and found that we obtained better fit for the L1-norm (Laplace) than for the L2-norm (Gaussian). We note that the L1-norm is also a better fit to the statistics of motion in natural environments. In addition, we found large weights for the second-order smoothness term, indicating the importance of high-order smoothness compared to slowness and lower-order smoothness. To validate our results further, we used the best fit models using the L1-norm to predict human performance in a second session with different experimental setups. Our results showed excellent agreement between human performance and model prediction – ranging from 3% to 8% for five human subjects over ten experimental conditions – and give further support that the human visual system uses an L1-norm (Laplace) prior.
3 0.23476093 187 nips-2010-Occlusion Detection and Motion Estimation with Convex Optimization
Author: Alper Ayvaci, Michalis Raptis, Stefano Soatto
Abstract: We tackle the problem of simultaneously detecting occlusions and estimating optical flow. We show that, under standard assumptions of Lambertian reflection and static illumination, the task can be posed as a convex minimization problem. Therefore, the solution, computed using efficient algorithms, is guaranteed to be globally optimal, for any number of independently moving objects, and any number of occlusion layers. We test the proposed algorithm on benchmark datasets, expanded to enable evaluation of occlusion detection performance.
4 0.1951317 141 nips-2010-Layered image motion with explicit occlusions, temporal consistency, and depth ordering
Author: Deqing Sun, Erik B. Sudderth, Michael J. Black
Abstract: Layered models are a powerful way of describing natural scenes containing smooth surfaces that may overlap and occlude each other. For image motion estimation, such models have a long history but have not achieved the wide use or accuracy of non-layered methods. We present a new probabilistic model of optical flow in layers that addresses many of the shortcomings of previous approaches. In particular, we define a probabilistic graphical model that explicitly captures: 1) occlusions and disocclusions; 2) depth ordering of the layers; 3) temporal consistency of the layer segmentation. Additionally the optical flow in each layer is modeled by a combination of a parametric model and a smooth deviation based on an MRF with a robust spatial prior; the resulting model allows roughness in layers. Finally, a key contribution is the formulation of the layers using an image-dependent hidden field prior based on recent models for static scene segmentation. The method achieves state-of-the-art results on the Middlebury benchmark and produces meaningful scene segmentations as well as detected occlusion regions.
5 0.10308354 170 nips-2010-Moreau-Yosida Regularization for Grouped Tree Structure Learning
Author: Jun Liu, Jieping Ye
Abstract: We consider the tree structured group Lasso where the structure over the features can be represented as a tree with leaf nodes as features and internal nodes as clusters of the features. The structured regularization with a pre-defined tree structure is based on a group-Lasso penalty, where one group is defined for each node in the tree. Such a regularization can help uncover the structured sparsity, which is desirable for applications with some meaningful tree structures on the features. However, the tree structured group Lasso is challenging to solve due to the complex regularization. In this paper, we develop an efficient algorithm for the tree structured group Lasso. One of the key steps in the proposed algorithm is to solve the Moreau-Yosida regularization associated with the grouped tree structure. The main technical contributions of this paper include (1) we show that the associated Moreau-Yosida regularization admits an analytical solution, and (2) we develop an efficient algorithm for determining the effective interval for the regularization parameter. Our experimental results on the AR and JAFFE face data sets demonstrate the efficiency and effectiveness of the proposed algorithm.
6 0.10023946 246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
7 0.069272757 119 nips-2010-Implicit encoding of prior probabilities in optimal neural populations
8 0.067617252 245 nips-2010-Space-Variant Single-Image Blind Deconvolution for Removing Camera Shake
9 0.06564635 167 nips-2010-Mixture of time-warped trajectory models for movement decoding
10 0.063979879 127 nips-2010-Inferring Stimulus Selectivity from the Spatial Structure of Neural Network Dynamics
11 0.062238172 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data
12 0.061536592 95 nips-2010-Feature Transitions with Saccadic Search: Size, Color, and Orientation Are Not Alike
13 0.061164383 214 nips-2010-Probabilistic Belief Revision with Structural Constraints
14 0.060963765 149 nips-2010-Learning To Count Objects in Images
15 0.053414594 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
16 0.053396881 17 nips-2010-A biologically plausible network for the computation of orientation dominance
17 0.052057847 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models
18 0.051660635 161 nips-2010-Linear readout from a neural population with partial correlation data
19 0.050750423 150 nips-2010-Learning concept graphs from text with stick-breaking priors
20 0.05067623 268 nips-2010-The Neural Costs of Optimal Control
topicId topicWeight
[(0, 0.146), (1, 0.056), (2, -0.191), (3, 0.063), (4, -0.033), (5, -0.153), (6, -0.059), (7, 0.059), (8, 0.017), (9, 0.051), (10, 0.12), (11, -0.448), (12, -0.157), (13, 0.203), (14, 0.143), (15, 0.219), (16, -0.255), (17, 0.091), (18, -0.059), (19, -0.012), (20, -0.166), (21, -0.111), (22, 0.0), (23, -0.064), (24, 0.014), (25, 0.044), (26, 0.047), (27, 0.045), (28, 0.019), (29, -0.03), (30, 0.014), (31, -0.005), (32, -0.038), (33, 0.049), (34, 0.119), (35, -0.026), (36, 0.002), (37, 0.026), (38, -0.028), (39, -0.055), (40, -0.014), (41, 0.012), (42, 0.089), (43, 0.069), (44, -0.001), (45, 0.062), (46, -0.003), (47, 0.109), (48, 0.001), (49, -0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.98984832 20 nips-2010-A unified model of short-range and long-range motion perception
2 0.9600895 98 nips-2010-Functional form of motion priors in human motion perception
Author: Hongjing Lu, Tungyou Lin, Alan Lee, Luminita Vese, Alan L. Yuille
Abstract: It has been speculated that the human motion system combines noisy measurements with prior expectations in an optimal, or rational, manner. The basic goal of our work is to discover experimentally which prior distribution is used. More specifically, we seek to infer the functional form of the motion prior from the performance of human subjects on motion estimation tasks. We restricted ourselves to priors which combine three terms for motion slowness, first-order smoothness, and second-order smoothness. We focused on two functional forms for prior distributions: L2-norm and L1-norm regularization corresponding to the Gaussian and Laplace distributions respectively. In our first experimental session we estimate the weights of the three terms for each functional form to maximize the fit to human performance. We then measured human performance for motion tasks and found that we obtained better fit for the L1-norm (Laplace) than for the L2-norm (Gaussian). We note that the L1-norm is also a better fit to the statistics of motion in natural environments. In addition, we found large weights for the second-order smoothness term, indicating the importance of high-order smoothness compared to slowness and lower-order smoothness. To validate our results further, we used the best fit models using the L1-norm to predict human performance in a second session with different experimental setups. Our results showed excellent agreement between human performance and model prediction – ranging from 3% to 8% for five human subjects over ten experimental conditions – and give further support that the human visual system uses an L1-norm (Laplace) prior.
3 0.70582891 187 nips-2010-Occlusion Detection and Motion Estimation with Convex Optimization
Author: Alper Ayvaci, Michalis Raptis, Stefano Soatto
Abstract: We tackle the problem of simultaneously detecting occlusions and estimating optical flow. We show that, under standard assumptions of Lambertian reflection and static illumination, the task can be posed as a convex minimization problem. Therefore, the solution, computed using efficient algorithms, is guaranteed to be globally optimal, for any number of independently moving objects, and any number of occlusion layers. We test the proposed algorithm on benchmark datasets, expanded to enable evaluation of occlusion detection performance.
4 0.61940712 141 nips-2010-Layered image motion with explicit occlusions, temporal consistency, and depth ordering
Author: Deqing Sun, Erik B. Sudderth, Michael J. Black
Abstract: Layered models are a powerful way of describing natural scenes containing smooth surfaces that may overlap and occlude each other. For image motion estimation, such models have a long history but have not achieved the wide use or accuracy of non-layered methods. We present a new probabilistic model of optical flow in layers that addresses many of the shortcomings of previous approaches. In particular, we define a probabilistic graphical model that explicitly captures: 1) occlusions and disocclusions; 2) depth ordering of the layers; 3) temporal consistency of the layer segmentation. Additionally, the optical flow in each layer is modeled by a combination of a parametric model and a smooth deviation based on an MRF with a robust spatial prior; the resulting model allows roughness in layers. Finally, a key contribution is the formulation of the layers using an image-dependent hidden field prior based on recent models for static scene segmentation. The method achieves state-of-the-art results on the Middlebury benchmark and produces meaningful scene segmentations as well as detected occlusion regions.
5 0.37299198 245 nips-2010-Space-Variant Single-Image Blind Deconvolution for Removing Camera Shake
Author: Stefan Harmeling, Michael Hirsch, Bernhard Schölkopf
Abstract: Modelling camera shake as a space-invariant convolution simplifies the problem of removing camera shake, but often insufficiently models actual motion blur, such as that due to camera rotation and movements outside the sensor plane or when objects in the scene have different distances to the camera. In an effort to address these limitations, (i) we introduce a taxonomy of camera shakes, (ii) we build on a recently introduced framework for space-variant filtering by Hirsch et al. and a fast algorithm for single image blind deconvolution for space-invariant filters by Cho and Lee to construct a method for blind deconvolution in the case of space-variant blur, and (iii) we present an experimental setup for evaluation that allows us to take images with real camera shake while at the same time recording the space-variant point spread function corresponding to that blur. Finally, we demonstrate that our method is able to deblur images degraded by spatially-varying blur originating from real camera shake, even without using additional motion sensor information.
6 0.36658713 3 nips-2010-A Bayesian Framework for Figure-Ground Interpretation
7 0.3604261 214 nips-2010-Probabilistic Belief Revision with Structural Constraints
8 0.33886099 246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
9 0.27817312 121 nips-2010-Improving Human Judgments by Decontaminating Sequential Dependencies
10 0.26086795 95 nips-2010-Feature Transitions with Saccadic Search: Size, Color, and Orientation Are Not Alike
11 0.25094745 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
12 0.24814999 167 nips-2010-Mixture of time-warped trajectory models for movement decoding
13 0.20821257 81 nips-2010-Evaluating neuronal codes for inference using Fisher information
14 0.20195536 155 nips-2010-Learning the context of a category
15 0.19990595 161 nips-2010-Linear readout from a neural population with partial correlation data
16 0.19976166 262 nips-2010-Switched Latent Force Models for Movement Segmentation
17 0.19771448 107 nips-2010-Global seismic monitoring as probabilistic inference
18 0.18717226 111 nips-2010-Hallucinations in Charles Bonnet Syndrome Induced by Homeostasis: a Deep Boltzmann Machine Model
19 0.18470922 221 nips-2010-Random Projections for $k$-means Clustering
20 0.17951727 257 nips-2010-Structured Determinantal Point Processes
topicId topicWeight
[(13, 0.031), (17, 0.021), (27, 0.133), (30, 0.054), (35, 0.051), (43, 0.252), (45, 0.19), (50, 0.05), (52, 0.027), (60, 0.015), (77, 0.053), (78, 0.016), (90, 0.028)]
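The topic weights above form a sparse vector: only topics with non-negligible weight are listed as (topicId, topicWeight) pairs. As a minimal sketch, the Python below parses such a list and compares two papers' topic vectors. The listing does not state how simValue is computed, so the use of cosine similarity here is an assumption, and the function names `parse_topic_weights` and `cosine_similarity` are hypothetical.

```python
import ast
import math


def parse_topic_weights(text):
    """Parse a string like '[(13, 0.031), (17, 0.021)]' into {topic_id: weight}."""
    pairs = ast.literal_eval(text)
    return {int(topic_id): float(weight) for topic_id, weight in pairs}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts.

    Assumption: the page does not say which measure it uses;
    cosine similarity is only one plausible choice.
    """
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Topic weights copied from the listing for this paper.
this_paper = parse_topic_weights(
    "[(13, 0.031), (17, 0.021), (27, 0.133), (30, 0.054), (35, 0.051), "
    "(43, 0.252), (45, 0.19), (50, 0.05), (52, 0.027), (60, 0.015), "
    "(77, 0.053), (78, 0.016), (90, 0.028)]"
)

# Hypothetical second paper, invented purely to illustrate the call.
other_paper = {43: 0.30, 45: 0.10, 99: 0.20}

similarity = cosine_similarity(this_paper, other_paper)
```

Storing only the listed topics in a dict keeps the comparison proportional to the number of non-zero weights rather than the total number of topics.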
simIndex simValue paperId paperTitle
1 0.83397353 286 nips-2010-Word Features for Latent Dirichlet Allocation
Author: James Petterson, Wray Buntine, Shravan M. Narayanamurthy, Tibério S. Caetano, Alex J. Smola
Abstract: We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as the ability to leverage thesauri and dictionaries in order to boost topic cohesion within and across languages. We present experiments on multi-language topic synchronisation where dictionary information is used to bias corresponding words towards similar topics. Results indicate that our model substantially improves topic cohesion when compared to the standard LDA model.
same-paper 2 0.80234635 20 nips-2010-A unified model of short-range and long-range motion perception
Author: Shuang Wu, Xuming He, Hongjing Lu, Alan L. Yuille
Abstract: The human vision system is able to effortlessly perceive both short-range and long-range motion patterns in complex dynamic scenes. Previous work has assumed that two different mechanisms are involved in processing these two types of motion. In this paper, we propose a hierarchical model as a unified framework for modeling both short-range and long-range motion perception. Our model consists of two key components: a data likelihood that proposes multiple motion hypotheses using nonlinear matching, and a hierarchical prior that imposes slowness and spatial smoothness constraints on the motion field at multiple scales. We tested our model on two types of stimuli, random dot kinematograms and multiple-aperture stimuli, both commonly used in human vision research. We demonstrate that the hierarchical model adequately accounts for human performance in psychophysical experiments.
3 0.72909391 98 nips-2010-Functional form of motion priors in human motion perception
Author: Hongjing Lu, Tungyou Lin, Alan Lee, Luminita Vese, Alan L. Yuille
Abstract: It has been speculated that the human motion system combines noisy measurements with prior expectations in an optimal, or rational, manner. The basic goal of our work is to discover experimentally which prior distribution is used. More specifically, we seek to infer the functional form of the motion prior from the performance of human subjects on motion estimation tasks. We restricted ourselves to priors which combine three terms for motion slowness, first-order smoothness, and second-order smoothness. We focused on two functional forms for prior distributions: L2-norm and L1-norm regularization corresponding to the Gaussian and Laplace distributions respectively. In our first experimental session we estimate the weights of the three terms for each functional form to maximize the fit to human performance. We then measured human performance for motion tasks and found that we obtained a better fit for the L1-norm (Laplace) than for the L2-norm (Gaussian). We note that the L1-norm is also a better fit to the statistics of motion in natural environments. In addition, we found large weights for the second-order smoothness term, indicating the importance of high-order smoothness compared to slowness and lower-order smoothness. To validate our results further, we used the best fit models using the L1-norm to predict human performance in a second session with different experimental setups. Our results showed excellent agreement between human performance and model prediction, ranging from 3% to 8% for five human subjects over ten experimental conditions, and give further support that the human visual system uses an L1-norm (Laplace) prior.
4 0.70266467 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models
Author: Ryan Kelly, Matthew Smith, Robert Kass, Tai S. Lee
Abstract: Activity of a neuron, even in the early sensory areas, is not simply a function of its local receptive field or tuning properties, but depends on global context of the stimulus, as well as the neural context. This suggests the activity of the surrounding neurons and global brain states can exert considerable influence on the activity of a neuron. In this paper we implemented an L1 regularized point process model to assess the contribution of multiple factors to the firing rate of many individual units recorded simultaneously from V1 with a 96-electrode “Utah” array. We found that the spikes of surrounding neurons indeed provide strong predictions of a neuron’s response, in addition to the neuron’s receptive field transfer function. We also found that the same spikes could be accounted for with the local field potentials, a surrogate measure of global network states. This work shows that accounting for network fluctuations can improve estimates of single trial firing rate and stimulus-response transfer functions.
5 0.70050156 161 nips-2010-Linear readout from a neural population with partial correlation data
Author: Adrien Wohrer, Ranulfo Romo, Christian K. Machens
Abstract: How much information does a neural population convey about a stimulus? Answers to this question are known to strongly depend on the correlation of response variability in neural populations. These noise correlations, however, are essentially immeasurable as the number of parameters in a noise correlation matrix grows quadratically with population size. Here, we suggest bypassing this problem by imposing a parametric model on a noise correlation matrix. Our basic assumption is that noise correlations arise due to common inputs between neurons. On average, noise correlations will therefore reflect signal correlations, which can be measured in neural populations. We suggest an explicit parametric dependency between signal and noise correlations. We show how this dependency can be used to "fill the gaps" in noise correlation matrices using an iterative application of the Wishart distribution over positive definite matrices. We apply our method to data from the primary somatosensory cortex of monkeys performing a two-alternative forced choice task. We compare the discrimination thresholds read out from the population of recorded neurons with the discrimination threshold of the monkey and show that our method predicts different results than simpler, average schemes of noise correlations.
6 0.6922496 268 nips-2010-The Neural Costs of Optimal Control
7 0.69208527 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts
8 0.69196361 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior
9 0.69186449 81 nips-2010-Evaluating neuronal codes for inference using Fisher information
10 0.68767565 17 nips-2010-A biologically plausible network for the computation of orientation dominance
11 0.6863063 97 nips-2010-Functional Geometry Alignment and Localization of Brain Areas
12 0.68454981 194 nips-2010-Online Learning for Latent Dirichlet Allocation
13 0.68415612 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
14 0.68248844 123 nips-2010-Individualized ROI Optimization via Maximization of Group-wise Consistency of Structural and Functional Profiles
15 0.6813615 6 nips-2010-A Discriminative Latent Model of Image Region and Object Tag Correspondence
16 0.68030179 55 nips-2010-Cross Species Expression Analysis using a Dirichlet Process Mixture Model with Latent Matchings
17 0.6773361 56 nips-2010-Deciphering subsampled data: adaptive compressive sampling as a principle of brain communication
18 0.67682779 39 nips-2010-Bayesian Action-Graph Games
19 0.67556781 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
20 0.67516237 266 nips-2010-The Maximal Causes of Natural Scenes are Edge Filters