nips nips2010 nips2010-3 knowledge-graph by maker-knowledge-mining

3 nips-2010-A Bayesian Framework for Figure-Ground Interpretation


Source: pdf

Author: Vicky Froyen, Jacob Feldman, Manish Singh

Abstract: Figure/ground assignment, in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing, but its underlying computational mechanisms are poorly understood. Figural assignment (often referred to as border ownership) can vary along a contour, suggesting a spatially distributed process whereby local and global cues are combined to yield local estimates of border ownership. In this paper we model figure/ground estimation in a Bayesian belief network, attempting to capture the propagation of border ownership across the image as local cues (contour curvature and T-junctions) interact with more global cues to yield a figure/ground assignment. Our network includes as a nonlocal factor skeletal (medial axis) structure, under the hypothesis that medial structure “draws” border ownership so that borders are owned by the skeletal hypothesis that best explains them. We also briefly present a psychophysical experiment in which we measured local border ownership along a contour at various distances from an inducing cue (a T-junction). Both the human subjects and the network show similar patterns of performance, converging rapidly to a similar pattern of spatial variation in border ownership along contours. Figure/ground assignment (further referred to as f/g), in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing. A number of factors are known to affect f/g assignment, including region size [9], convexity [7, 16], and symmetry [1, 7, 11]. Figural assignment (often referred to as border ownership, under the assumption that the figural side “owns” the border) is usually studied globally, meaning that entire surfaces and their enclosing boundaries are assumed to receive a globally consistent figural status. But recent psychophysical findings [8] have suggested that border ownership can vary locally along a boundary, even leading to a globally inconsistent figure/ground assignment—broadly consistent with electrophysiological evidence showing local coding for border ownership in area V2 as early as 68 msec after image onset [20]. This suggests a spatially distributed and potentially competitive process of figural assignment [15], in which adjacent surfaces compete to own their common boundary, with figural status propagating across the image as this competition proceeds. But both the principles and computational mechanisms underlying this process are poorly understood. ∗ V.F. was supported by a Fullbright Honorary fellowship and by the Rutgers NSF IGERT program in Perceptual Science, NSF DGE 0549115, J.F. by NIH R01 EY15888, and M.S. by NSF CCF-0541185 1 In this paper we consider how border ownership might propagate over both space and time—that is, across the image as well as over the progression of computation. Following Weiss et al. [18] we adopt a Bayesian belief network architecture, with nodes along boundaries representing estimated border ownership, and connections arranged so that both neighboring nodes and nonlocal integrating nodes combine to influence local estimates of border ownership. Our model is novel in two particular respects: (a) we combine both local and global influences on border ownership in an integrated and principled way; and (b) we include as a nonlocal factor skeletal (medial axis) influences on f/g assignment. Skeletal structure has not been previously considered as a factor on border ownership, but its relevance follows from a model [4] in which shapes are conceived of as generated by or “grown” from an internal skeleton, with the consequence that their boundaries are perceptually “owned” by the skeletal side. We also briey present a psychophysical experiment in which we measured local border ownership along a contour, at several distances from a strong local f/g inducing cue, and at several time delays after the onset of the cue. The results show measurable spatial differences in judged border ownership, with judgments varying with distance from the inducer; but no temporal effect, with essentially asymptotic judgments even after very brief exposures. Both results are consistent with the behavior of the network, which converges quickly to an asymptotic but spatially nonuniform f/g assignment. 1 The Model The Network. For simplicity, we take an edge map as input for the model, assuming that edges and T-junctions have already been detected. From this edge map we then create a Bayesian belief network consisting of four hierarchical levels. At the input level the model receives evidence E from the image, consisting of local contour curvature and T-junctions. The nodes for this level are placed at equidistant locations along the contour. At the first level the model estimates local border ownership. The border ownership, or B-nodes at this level are at the same locations as the E-nodes, but are connected to their nearest neighbors, and are the parent of the E-node at their location. (As a simplifying assumption, such connections are broken at T-junctions in such a way that the occluded contour is disconnected from the occluder.) The highest level has skeletal nodes, S, whose positions are defined by the circumcenters of the Delaunay triangulation on all the E-nodes, creating a coarse medial axis skeleton [13]. Because of the structure of the Delaunay, each S-node is connected to exactly three E-nodes from which they receive information about the position and the local tangent of the contour. In the current state of the model the S-nodes are “passive”, meaning their posteriors are computed before the model is initiated. Between the S nodes and the B nodes are the grouping nodes G. They have the same positions as the S-nodes and the same Delaunay connections, but to B-nodes that have the same image positions as the E-nodes. They will integrate information from distant B-nodes, applying an interiority cue that is influenced by the local strength of skeletal axes as computed by the S-nodes (Fig. 1). Although this is a multiply connected network, we have found that given reasonable parameters the model converges to intuitive posteriors for a variety of shapes (see below). Updating. Our goal is to compute the posterior p(Bi |I), where I is the whole image. Bi is a binary variable coding for the local direction of border ownership, that is, the side that owns the border. In order for border ownership estimates to be influenced by image structure elsewhere in the image, information has to propagate throughout the network. To achieve this propagation, we use standard equations for node updating [14, 12]. However while to all other connections being directed, connections at the B-node level are undirected, causing each node to be child and parent node at the same time. Considering only the B-node level, a node Bi is only separated from the rest of the network by its two neighbors. Hence the Markovian property applies, in that Bi only needs to get iterative information from its neighbors to eventually compute p(Bi |I). So considering the whole network, at each iteration t, Bi receives information from both its child, Ei and from its parents—that is neigbouring nodes (Bi+1 and Bi−1 )—as well as all grouping nodes connected to it (Gj , ..., Gm ). The latter encode for interiority versus exteriority, interiority meaning that the B-node’s estimated gural direction points towards the G-node in question, exteriority meaning that it points away. Integrating all this information creates a multidimensional likelihood function: p(Bi |Bi−1 , Bi+1 , Gj , ..., Gm ). Because of its complexity we choose to approximate it (assuming all nodes are marginally independent of each other when conditioned on Bi ) by 2 Figure 1: Basic network structure of the model. Both skeletal (S-nodes) and border-ownerhsip nodes (B-nodes) get evidence from E-nodes, though different types. S-nodes receive mere positional information, while B-nodes receive information about local curvature and the presence of T-junctions. Because of the structure of the Delaunay triangulation S-nodes and G-nodes (grouping nodes) always get input from exactly three nodes, respectively E and B-nodes. The gray color depicts the fact that this part of the network is computed before the model is initiated and does not thereafter interact with the dynamics of the model. m p(Bi |Pj , ..., Pm ) ∝ p(Bi |Pj ) (1) j where the Pj ’s are the parents of Bi . Given this, at each iteration, each node Bi performs the following computation: Bel(Bi ) ← cλ(Bi )π(Bi )α(Bi )β(Bi ) (2) where conceptually λ stands for bottom-up information, π for top down information and α and β for information received from within the same level. More formally, λ(Bi ) ← p(E|Bi ) (3) m π(Bi ) ← p(Bi |Gj )πGj (Bi ) j (4) Gj and analogously to equation 4 for α(Bi ) and β(Bi ), which compute information coming from Bi−1 and Bi+1 respectively. For these πBi−1 (Bi ), πBi+1 (Bi ), and πGj (Bi ): πGj (Bi ) ← c π(G) λBk (Gj ) (5) k=i πBi−1 (Bi ) ← c β(Bi−1 )λ(Bi−1 )π(Bi−1 ) 3 (6) and πBi+1 (Bi ) is analogous to πBi−1 (Bi ), with c and c being normalization constants. Finally for the G-nodes: Bel(Gi ) ← cλ(Gi )π(Gi ) λ(Gi ) ← (7) λBj (Gi ) (8) j m λBj (Gi ) ← λ(Bj )p(Bi |Gj )[α(Bj )β(Bj ) Bj p(Bi |Gk )πGk (Bi )] (9) k=i Gk The posteriors of the S-nodes are used to compute the π(Gi ). This posterior computes how well the S-node at each position explains the contour—that is, how well it accounts for the cues flowing from the E-nodes it is connected to. Each Delaunay connection between S- and E-nodes can be seen as a rib that sprouts from the skeleton. More specifically each rib sprouts in a direction that is normal (perpendicular) to the tangent of the contour at the E-node plus a random error φi chosen independently for each rib from a von Mises distribution centered on zero, i.e. φi ∼ V (0, κS ) with spread parameter κS [4]. The rib lengths are drawn from an exponential decreasing density function p(ρi ) ∝ e−λS ρi [4]. We can now express how well this node “explains” the three E-nodes it is connected to via the probability that this S-node deserves to be a skeletal node or not, p(S = true|E1 , E2 , E3 ) ∝ p(ρi )p(φi ) (10) i with S = true depicting that this S-node deserves to be a skeletal node. From this we then compute the prior π(Gi ) in such a way that good (high posterior) skeletal nodes induce a high interiority bias, hence a stronger tendency to induce figural status. Conversely, bad (low posterior) skeletal nodes create a prior close to indifferent (uniform) and thus have less (or no) influence on figural status. Likelihood functions Finally we need to express the likelihood function necessary for the updating rules described above. The first two likelihood functions are part of p(Ei |Bi ), one for each of the local cues. The first one, reflecting local curvature, gives the probability of the orientations of the two vectors inherent to Ei (α1 and α2 ) given both direction of figure (θ) encoded in Bi as a von Mises density centered on θ, i.e. αi ∼ V (θ, κEB ). The second likelihood function, reflecting the presence of a T-junction, simply assumes a fixed likelihood when a T-junction is present—that is p(T-junction = true|Bi ) = θT , where Bi places the direction of figure in the direction of the occluder. This likelihood function is only in effect when a T-junction is present, replacing the curvature cue at that node. The third likelihood function serves to keep consistency between nodes of the first level. This function p(Bi |Bi−1 ) or p(Bi |Bi+1 ) is used to compute α(B) and β(B) and is defined 2x2 conditional probability matrix with a single free parameter, θBB (the probability that figural direction at both B-nodes are the same). A fourth and final likelihood function p(Bi |Gj ) serves to propagate information between level one and two. This likelihood function is 2x2 conditional probability matrix matrix with one free parameter, θBG . In this case θBG encodes the probability that the figural direction of the B-node is in the direction of the exterior or interior preference of the G-node. In total this brings us to six free parameters in the model: κS , λS , κEB , θT , θBB , and θBG . 2 Basic Simulations To evaluate the performance of the model, we first tested it on several basic stimulus configurations in which the desired outcome is intuitively clear: a convex shape, a concave shape, a pair of overlapping shapes, and a pair of non-overlapping shapes (Fig. 2,3). The convex shape is the simplest in that curvature never changes sign. The concave shape includes a region with oppositely signed curvature. (The shape is naturally described as predominantly positively curved with a region of negative curvature, i.e. a concavity. But note that it can also be interpreted as predominantly negatively curved “window” with a region of positive curvature, although this is not the intuitive interpretation.) 4 The overlapping pair of shapes consists of two convex shapes with one partly occluding the other, creating a competition between the two shapes for the ownership of the common borderline. Finally the non-overlapping shapes comprise two simple convex shapes that do not touch—again setting up a competition for ownership of the two inner boundaries (i.e. between each shape and the ground space between them). Fig. 2 shows the network structures for each of these four cases. Figure 2: Network structure for the four shape categories (left to right: convex, concave, overlapping, non-overlapping shapes). Blue depict the locations of the B-nodes (and also the E-nodes), the red connections are the connections between B-nodes, the green connections are connections between B-nodes and G-nodes, and the G-nodes (and also the S-nodes) go from orange to dark red. This colour code depicts low (orange) to high (dark red) probability that this is a skeletal node, and hence the strength of the interiority cue. Running our model with hand-estimated parameter values yields highly intuitive posteriors (Fig. 3), an essential “sanity check” to ensure that the network approximates human judgments in simple cases. For the convex shape the model assigns figure to the interior just as one would expect even based solely on local curvature (Fig. 3A). In the concave figure (Fig. 3B), estimated border ownership begins to reverse inside the deep concavity. This may seem surprising, but actually closely matches empirical results obtained when local border ownership is probed psychophysically inside a similarly deep concavity, i.e. a “negative part” in which f/g seems to partly reverse [8]. For the overlapping shapes posteriors were also intuitive, with the occluding shape interpreted as in front and owning the common border (Fig. 3C). Finally, for the two non-overlapping shapes the model computed border-ownership just as one would expect if each shape were run separately, with each shape treated as figural along its entire boundary (Fig. 3D). That is, even though there is skeletal structure in the ground-region between the two shapes (see Fig. 2D), its posterior is weak compared to the skeletal structure inside the shapes, which thus loses the competition to own the boundary between them. For all these configurations, the model not only converged to intuitive estimates but did so rapidly (Fig. 4), always in fewer cycles than would be expected by pure lateral propagation, niterations < Nnodes [18] (with these parameters, typically about five times faster). Figure 3: Posteriors after convergence for the four shape categories (left to right: convex, concave, overlapping, non-overlapping). Arrows indicate estimated border ownership, with direction pointing to the perceived figural side, and length proportional to the magnitude of the posterior. All four simulations used the same parameters. 5 Figure 4: Convergence of the model for the basic shape categories. The vertical lines represent the point of convergence for each of the three shape categories. The posterior change is calculated as |p(Bi = 1|I)t − p(Bi = 1|I)t−1 | at each iteration. 3 Comparison to human data Beyond the simple cases reviewed above, we wished to submit our network to a more fine-grained comparison with human data. To this end we compared its performance to that of human subjects in an experiment we conducted (to be presented in more detail in a future paper). Briefly, our experiment involved finding evidence for propagation of f/g signals across the image. Subjects were first shown a stimulus in which the f/g configuration was globally and locally unambiguous and consistent: a smaller rectangle partly occluding a larger one (Fig. 5A), meaning that the smaller (front) one owns the common border. Then this configuration was perturbed by adding two bars, of which one induced a local f/g reversal—making it now appear locally that the larger rectangle owned the border (Fig. 5B). (The other bar in the display does not alter f/g interpretation, but was included to control for the attentional affects of introducing a bar in the image.) The inducing bar creates T-junctions that serve as strong local f/g cues, in this case tending to reverse the prior global interpretation of the figure. We then measured subjective border ownership along the central contour at various distances from the inducing bar, and at different times after the onset of the bar (25ms, 100ms and 250ms). We measured border ownership locally using a method introduced in [8] in which a local motion probe is introduced at a point on the boundary between two color regions of different colors, and the subject is asked which color appeared to move. Because the figural side “owns” the border, the response reflects perceived figural status. The goal of the experiment was to actually measure the progression of the influence of the inducing T-junction as it (hypothetically) propagated along the boundary. Briefly, we found no evidence of temporal differences, meaning that f/g judgments were essentially constant over time, suggesting rapid convergence of local f/g assignment. (This is consistent with the very rapid convergence of our network, which would suggest a lack of measurable temporal differences except at much shorter time scales than we measured.) But we did find a progressive reduction of f/g reversal with increasing distance from the inducer—that is, the influence of the T-junction decayed with distance. Mean responses aggregated over subjects (shortest delay only) are shown in Fig. 6. In order to run our model on this stimulus (which has a much more complex structure than the simple figures tested above) we had to make some adjustments. We removed the bars from the edge map, leaving only the T-junctions as underlying cues. This was a necessary first step because our model is not yet able to cope with skeletons that are split up by occluders. (The larger rectangle’s skeleton has been split up by the lower bar.) In this way all contours except those created by the bars were used to create the network (Fig. 7). Given this network we ran the model using hand-picked parameters that 6 Figure 5: Stimuli used in the experiment. A. Initial stimulus with locally and globally consistent and unambiguous f/g. B. Subsequently bars were added of which one (the top bar in this case) created a local reversal of f/g. C. Positions at which local f/g judgments of subjects were probed. Figure 6: Results from our experiment aggregated for all 7 subjects (shortest delay only) are shown in red. The x-axis shows distance from the inducing bar at which f/g judgment was probed. The y-axis shows the proportion of trials on which subjects judged the smaller rectangle to own the boundary. As can be seen, the further from the T-junction, the lower the f/g reversal. The fitted model (green curve) shows very similar pattern. Horizontal black line indicates chance performance (ambiguous f/g). gave us the best possible qualitative similarity to the human data. The parameters used never entailed total elimination of the influence of any likelihood function (κS = 16, λS = .025, κEB = .5, θT = .9, θBB = .9, and θBG = .6). As can be seen in Fig. 6 the border-ownership estimates at the locations where we had data show compelling similarities to human judgments. Furthermore along the entire contour the model converged to intuitive border-ownership estimates (Fig. 7) very rapidly (within 36 iterations). The fact that our model yielded intuitive estimates for the current network in which not all contours were completed shows another strength of our model. Because our model included grouping nodes, it did not require contours to be amodally completed [6] in order for information to propagate. 4 Conclusion In this paper we proposed a model rooted in Bayesian belief networks to compute figure/ground. The model uses both local and global cues, combined in a principled way, to achieve a stable and apparently psychologically reasonable estimate of border ownership. Local cues included local curvature and T-junctions, both well-established cues to f/g. Global cues included skeletal structure, 7 Figure 7: (left) Node structure for the experimental stimulus. (right) The model’s local borderownership estimates after convergence. a novel cue motivated by the idea that strongly axial shapes tend to be figural and thus own their boundaries. We successfully tested this model on both simple displays, in which it gave intuitive results, and on a more complex experimental stimulus, in which it gave a close match to the pattern of f/g propagation found in our subjects. Specifically, the model, like the human subjects rapidly converged to a stable local f/g interpretation. Our model’s structure shows several interesting parallels to properties of neural coding of border ownership in visual cortex. Some cortical cells (end-stopped cells) appear to code for local curvature [3] and T-junctions [5]. The B-nodes in our model could be seen as corresponding to cells that code for border ownership [20]. Furthermore, some authors [2] have suggested that recurrent feedback loops between border ownership cells in V2 and cells in V4 (corresponding to G-nodes in our model) play a role in the rapid computation of border ownership. The very rapid convergence we observed in our model likewise appears to be due to the connections between B-nodes and G-nodes. Finally scale-invariant shape representations (such as, speculatively, those based on skeletons) are thought to be present in higher cortical regions such as IT [17], which project down to earlier areas in ways that are not yet understood. A number of parallels to past models of f/g should be mentioned. Weiss [18] pioneered the application of belief networks to the f/g problem, though their network only considered a more restricted set of local cues and no global ones, such that information only propagated along the contour. Furthermore it has not been systematically compared to human judgments. Kogo et al. [10] proposed an exponential decay of f/g signals as they spread throughout the image. Our model has a similar decay for information going through the G-nodes, though it is also influenced by an angular factor defined by the position of the skeletal node. Like the model by Li Zhaoping [19], our model includes horizontal propagation between B-nodes, analogous to border-ownership cells in her model. A neurophysiological model by Craft et al. [2] defines grouping cells coding for an interiority preference that decays with the size of the receptive fields of these grouping cells. Our model takes this a step further by including shape (skeletal) structure as a factor in interiority estimates, rather than simply size of receptive fields (which is similar to the rib lengths in our model). Currently, our use of skeletons as shape representations is still limited to medial axis skeletons and surfaces that are not split up by occluders. Our future goals including integrating skeletons in a more robust way following the probabilistic account suggested by Feldman and Singh [4]. Eventually, we hope to fully integrate skeleton computation with f/g computation so that the more general problem of shape and surface estimation can be approached in a coherent and unified fashion. 8 References [1] P. Bahnsen. Eine untersuchung uber symmetrie und assymmetrie bei visuellen wahrnehmungen. Zeitschrift fur psychology, 108:129–154, 1928. [2] E. Craft, H. Sch¨ tze, E. Niebur, and R. von der Heydt. A neural model of figure-ground u organization. Journal of Neurophysiology, 97:4310–4326, 2007. [3] A. Dobbins, S. W. Zucker, and M. S. Cyander. Endstopping and curvature. Vision Research, 29:1371–1387, 1989. [4] J. Feldman and M. Singh. Bayesian estimation of the shape skeleton. Proceedings of the National Academy of Sciences, 103:18014–18019, 2006. [5] B. Heider, V. Meskenaite, and E. Peterhans. Anatomy and physiology of a neural mechanism defining depth order and contrast polarity at illusory contours. European Journal of Neuroscience, 12:4117–4130, 2000. [6] G. Kanizsa. Organization inVision. New York: Praeger, 1979. [7] G. Kanizsa and W. Gerbino. Vision and Artifact, chapter Convexity and symmetry in figureground organisation, pages 25–32. New York: Springer, 1976. [8] S. Kim and J. Feldman. Globally inconsistent figure/ground relations induced by a negative part. Journal of Vision, 9:1534–7362, 2009. [9] K. Koffka. Principles of Gestalt Psychology. Lund Humphries, London, 1935. [10] N. Kogo, C. Strecha, L. Van Gool, and J. Wagemans. Surface construction by a 2-d differentiation-integration process: a neurocomputational model for perceived border ownership, depth, and lightness in kanizsa figures. Psychological Review, 117:406–439, 2010. [11] B. Machielsen, M. Pauwels, and J. Wagemans. The role of vertical mirror-symmetry in visual shape detection. Journal of Vision, 9:1–11, 2009. [12] K. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: an empirical study. Proceedings of Uncertainty in AI, pages 467–475, 1999. [13] R. L. Ogniewicz and O. K¨ bler. Hierarchic Voronoi skeletons. Pattern Recognition, 28:343– u 359, 1995. [14] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988. [15] M. A. Peterson and E. Skow. Inhibitory competition between shape properties in figureground perception. Journal of Experimental Psychology: Human Perception and Performance, 34:251–267, 2008. [16] K. A. Stevens and A. Brookes. The concave cusp as a determiner of figure-ground. Perception, 17:35–42, 1988. [17] K. Tanaka, H. Saito, Y. Fukada, and M. Moriya. Coding visual images of object in the inferotemporal cortex of the macaque monkey. Journal of Neurophysiology, 66:170–189, 1991. [18] Y. Weiss. Interpreting images by propagating Bayesian beliefs. Adv. in Neural Information Processing Systems, 9:908915, 1997. [19] L. Zhaoping. Border ownership from intracortical interactions in visual area V2. Neuron, 47(1):143–153, Jul 2005. [20] H. Zhou, H. S. Friedman, and R. von der Heydt. Coding of border ownerschip in monkey visual cortex. The Journal of Neuroscience, 20:6594–6611, 2000. 9

Reference: text


Summary: the most important sentenses genereted by tfidf model

sentIndex sentText sentNum sentScore

1 edu Abstract Figure/ground assignment, in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing, but its underlying computational mechanisms are poorly understood. [sent-10, score-0.136]

2 Figural assignment (often referred to as border ownership) can vary along a contour, suggesting a spatially distributed process whereby local and global cues are combined to yield local estimates of border ownership. [sent-11, score-1.322]

3 In this paper we model figure/ground estimation in a Bayesian belief network, attempting to capture the propagation of border ownership across the image as local cues (contour curvature and T-junctions) interact with more global cues to yield a figure/ground assignment. [sent-12, score-1.331]

4 Our network includes as a nonlocal factor skeletal (medial axis) structure, under the hypothesis that medial structure “draws” border ownership so that borders are owned by the skeletal hypothesis that best explains them. [sent-13, score-1.654]

5 We also briefly present a psychophysical experiment in which we measured local border ownership along a contour at various distances from an inducing cue (a T-junction). [sent-14, score-1.254]

6 Both the human subjects and the network show similar patterns of performance, converging rapidly to a similar pattern of spatial variation in border ownership along contours. [sent-15, score-1.09]

7 Figure/ground assignment (further referred to as f/g), in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing. [sent-16, score-0.181]

8 Figural assignment (often referred to as border ownership, under the assumption that the figural side “owns” the border) is usually studied globally, meaning that entire surfaces and their enclosing boundaries are assumed to receive a globally consistent figural status. [sent-18, score-0.722]

9 This suggests a spatially distributed and potentially competitive process of figural assignment [15], in which adjacent surfaces compete to own their common boundary, with figural status propagating across the image as this competition proceeds. [sent-20, score-0.141]

10 by NSF CCF-0541185 1 In this paper we consider how border ownership might propagate over both space and time—that is, across the image as well as over the progression of computation. [sent-28, score-0.967]

11 [18] we adopt a Bayesian belief network architecture, with nodes along boundaries representing estimated border ownership, and connections arranged so that both neighboring nodes and nonlocal integrating nodes combine to influence local estimates of border ownership. [sent-30, score-1.548]

12 Our model is novel in two particular respects: (a) we combine both local and global influences on border ownership in an integrated and principled way; and (b) we include as a nonlocal factor skeletal (medial axis) influences on f/g assignment. [sent-31, score-1.285]

13 We also briey present a psychophysical experiment in which we measured local border ownership along a contour, at several distances from a strong local f/g inducing cue, and at several time delays after the onset of the cue. [sent-33, score-1.2]

14 The results show measurable spatial differences in judged border ownership, with judgments varying with distance from the inducer; but no temporal effect, with essentially asymptotic judgments even after very brief exposures. [sent-34, score-0.634]

15 At the input level the model receives evidence E from the image, consisting of local contour curvature and T-junctions. [sent-39, score-0.281]

16 The nodes for this level are placed at equidistant locations along the contour. [sent-40, score-0.106]

17 At the first level the model estimates local border ownership. [sent-41, score-0.592]

18 The border ownership, or B-nodes at this level are at the same locations as the E-nodes, but are connected to their nearest neighbors, and are the parent of the E-node at their location. [sent-42, score-0.517]

19 (As a simplifying assumption, such connections are broken at T-junctions in such a way that the occluded contour is disconnected from the occluder. [sent-43, score-0.16]

20 ) The highest level has skeletal nodes, S, whose positions are defined by the circumcenters of the Delaunay triangulation on all the E-nodes, creating a coarse medial axis skeleton [13]. [sent-44, score-0.459]

21 Because of the structure of the Delaunay, each S-node is connected to exactly three E-nodes from which they receive information about the position and the local tangent of the contour. [sent-45, score-0.131]

22 Between the S nodes and the B nodes are the grouping nodes G. [sent-47, score-0.264]

23 They will integrate information from distant B-nodes, applying an interiority cue that is influenced by the local strength of skeletal axes as computed by the S-nodes (Fig. [sent-49, score-0.522]

24 Although this is a multiply connected network, we have found that given reasonable parameters the model converges to intuitive posteriors for a variety of shapes (see below). [sent-51, score-0.249]

25 Bi is a binary variable coding for the local direction of border ownership, that is, the side that owns the border. [sent-54, score-0.7]

26 In order for border ownership estimates to be influenced by image structure elsewhere in the image, information has to propagate throughout the network. [sent-55, score-0.97]

27 However while to all other connections being directed, connections at the B-node level are undirected, causing each node to be child and parent node at the same time. [sent-57, score-0.194]

28 So considering the whole network, at each iteration t, Bi receives information from both its child, Ei and from its parents—that is neigbouring nodes (Bi+1 and Bi−1 )—as well as all grouping nodes connected to it (Gj , . [sent-60, score-0.22]

29 The latter encode for interiority versus exteriority, interiority meaning that the B-node’s estimated gural direction points towards the G-node in question, exteriority meaning that it points away. [sent-64, score-0.661]

30 Because of its complexity we choose to approximate it (assuming all nodes are marginally independent of each other when conditioned on Bi ) by 2 Figure 1: Basic network structure of the model. [sent-69, score-0.131]

31 Both skeletal (S-nodes) and border-ownerhsip nodes (B-nodes) get evidence from E-nodes, though different types. [sent-70, score-0.348]

32 S-nodes receive mere positional information, while B-nodes receive information about local curvature and the presence of T-junctions. [sent-71, score-0.245]

33 This posterior computes how well the S-node at each position explains the contour—that is, how well it accounts for the cues flowing from the E-nodes it is connected to. [sent-82, score-0.121]

34 Each Delaunay connection between S- and E-nodes can be seen as a rib that sprouts from the skeleton. [sent-83, score-0.122]

35 More specifically each rib sprouts in a direction that is normal (perpendicular) to the tangent of the contour at the E-node plus a random error φi chosen independently for each rib from a von Mises distribution centered on zero, i. [sent-84, score-0.378]

36 We can now express how well this node “explains” the three E-nodes it is connected to via the probability that this S-node deserves to be a skeletal node or not, p(S = true|E1 , E2 , E3 ) ∝ p(ρi )p(φi ) (10) i with S = true depicting that this S-node deserves to be a skeletal node. [sent-88, score-0.718]

37 From this we then compute the prior π(Gi ) in such a way that good (high posterior) skeletal nodes induce a high interiority bias, hence a stronger tendency to induce figural status. [sent-89, score-0.47]

38 Conversely, bad (low posterior) skeletal nodes create a prior close to indifferent (uniform) and thus have less (or no) influence on figural status. [sent-90, score-0.348]

39 The first one, reflecting local curvature, gives the probability of the orientations of the two vectors inherent to Ei (α1 and α2 ) given both direction of figure (θ) encoded in Bi as a von Mises density centered on θ, i. [sent-93, score-0.134]

40 The second likelihood function, reflecting the presence of a T-junction, simply assumes a fixed likelihood when a T-junction is present—that is p(T-junction = true|Bi ) = θT , where Bi places the direction of figure in the direction of the occluder. [sent-96, score-0.136]

41 This likelihood function is only in effect when a T-junction is present, replacing the curvature cue at that node. [sent-97, score-0.195]

42 The third likelihood function serves to keep consistency between nodes of the first level. [sent-98, score-0.104]

43 2 Basic Simulations To evaluate the performance of the model, we first tested it on several basic stimulus configurations in which the desired outcome is intuitively clear: a convex shape, a concave shape, a pair of overlapping shapes, and a pair of non-overlapping shapes (Fig. [sent-104, score-0.236]

44 The convex shape is the simplest in that curvature never changes sign. [sent-106, score-0.21]

45 The concave shape includes a region with oppositely signed curvature. [sent-107, score-0.144]

46 (The shape is naturally described as predominantly positively curved with a region of negative curvature, i. [sent-108, score-0.165]

47 But note that it can also be interpreted as predominantly negatively curved “window” with a region of positive curvature, although this is not the intuitive interpretation. [sent-111, score-0.11]

48 ) 4 The overlapping pair of shapes consists of two convex shapes with one partly occluding the other, creating a competition between the two shapes for the ownership of the common borderline. [sent-112, score-0.933]

49 Finally the non-overlapping shapes comprise two simple convex shapes that do not touch—again setting up a competition for ownership of the two inner boundaries (i. [sent-113, score-0.737]

50 between each shape and the ground space between them). [sent-115, score-0.102]

51 Figure 2: Network structure for the four shape categories (left to right: convex, concave, overlapping, non-overlapping shapes). [sent-118, score-0.102]

52 Blue depict the locations of the B-nodes (and also the E-nodes), the red connections are the connections between B-nodes, the green connections are connections between B-nodes and G-nodes, and the G-nodes (and also the S-nodes) go from orange to dark red. [sent-119, score-0.224]

53 This colour code depicts low (orange) to high (dark red) probability that this is a skeletal node, and hence the strength of the interiority cue. [sent-120, score-0.398]

54 3), an essential “sanity check” to ensure that the network approximates human judgments in simple cases. [sent-122, score-0.15]

55 For the convex shape the model assigns figure to the interior just as one would expect even based solely on local curvature (Fig. [sent-123, score-0.279]

56 3B), estimated border ownership begins to reverse inside the deep concavity. [sent-126, score-0.898]

57 This may seem surprising, but actually closely matches empirical results obtained when local border ownership is probed psychophysically inside a similarly deep concavity, i. [sent-127, score-0.967]

58 For the overlapping shapes posteriors were also intuitive, with the occluding shape interpreted as in front and owning the common border (Fig. [sent-130, score-0.839]

59 Finally, for the two non-overlapping shapes the model computed border-ownership just as one would expect if each shape were run separately, with each shape treated as figural along its entire boundary (Fig. [sent-132, score-0.39]

60 That is, even though there is skeletal structure in the ground-region between the two shapes (see Fig. [sent-134, score-0.4]

61 2D), its posterior is weak compared to the skeletal structure inside the shapes, which thus loses the competition to own the boundary between them. [sent-135, score-0.353]

62 Figure 3: Posteriors after convergence for the four shape categories (left to right: convex, concave, overlapping, non-overlapping). [sent-138, score-0.102]

63 Arrows indicate estimated border ownership, with direction pointing to the perceived figural side, and length proportional to the magnitude of the posterior. [sent-139, score-0.559]

64 5 Figure 4: Convergence of the model for the basic shape categories. [sent-141, score-0.102]

65 The vertical lines represent the point of convergence for each of the three shape categories. [sent-142, score-0.102]

66 3 Comparison to human data Beyond the simple cases reviewed above, we wished to submit our network to a more fine-grained comparison with human data. [sent-144, score-0.127]

67 Subjects were first shown a stimulus in which the f/g configuration was globally and locally unambiguous and consistent: a smaller rectangle partly occluding a larger one (Fig. [sent-147, score-0.256]

68 5A), meaning that the smaller (front) one owns the common border. [sent-148, score-0.103]

69 Then this configuration was perturbed by adding two bars, of which one induced a local f/g reversal—making it now appear locally that the larger rectangle owned the border (Fig. [sent-149, score-0.677]

70 (The other bar in the display does not alter f/g interpretation, but was included to control for the attentional affects of introducing a bar in the image. [sent-151, score-0.116]

71 ) The inducing bar creates T-junctions that serve as strong local f/g cues, in this case tending to reverse the prior global interpretation of the figure. [sent-152, score-0.188]

72 We then measured subjective border ownership along the central contour at various distances from the inducing bar, and at different times after the onset of the bar (25ms, 100ms and 250ms). [sent-153, score-1.191]

73 We measured border ownership locally using a method introduced in [8] in which a local motion probe is introduced at a point on the boundary between two color regions of different colors, and the subject is asked which color appeared to move. [sent-154, score-1.026]

74 The goal of the experiment was to actually measure the progression of the influence of the inducing T-junction as it (hypothetically) propagated along the boundary. [sent-156, score-0.126]

75 Briefly, we found no evidence of temporal differences, meaning that f/g judgments were essentially constant over time, suggesting rapid convergence of local f/g assignment. [sent-157, score-0.196]

76 Initial stimulus with locally and globally consistent and unambiguous f/g. [sent-170, score-0.145]

77 Subsequently bars were added of which one (the top bar in this case) created a local reversal of f/g. [sent-172, score-0.167]

78 Positions at which local f/g judgments of subjects were probed. [sent-174, score-0.191]

79 The x-axis shows distance from the inducing bar at which f/g judgment was probed. [sent-176, score-0.119]

80 The y-axis shows the proportion of trials on which subjects judged the smaller rectangle to own the boundary. [sent-177, score-0.138]

81 Furthermore along the entire contour the model converged to intuitive border-ownership estimates (Fig. [sent-190, score-0.219]

82 The fact that our model yielded intuitive estimates for the current network in which not all contours were completed shows another strength of our model. [sent-192, score-0.176]

83 The model uses both local and global cues, combined in a principled way, to achieve a stable and apparently psychologically reasonable estimate of border ownership. [sent-195, score-0.558]

84 Local cues included local curvature and T-junctions, both well-established cues to f/g. [sent-196, score-0.363]

85 Global cues included skeletal structure, 7 Figure 7: (left) Node structure for the experimental stimulus. [sent-197, score-0.369]

86 a novel cue motivated by the idea that strongly axial shapes tend to be figural and thus own their boundaries. [sent-199, score-0.179]

87 Specifically, the model, like the human subjects rapidly converged to a stable local f/g interpretation. [sent-201, score-0.168]

88 Our model’s structure shows several interesting parallels to properties of neural coding of border ownership in visual cortex. [sent-202, score-0.971]

89 Some cortical cells (end-stopped cells) appear to code for local curvature [3] and T-junctions [5]. [sent-203, score-0.225]

90 The B-nodes in our model could be seen as corresponding to cells that code for border ownership [20]. [sent-204, score-0.946]

91 Furthermore, some authors [2] have suggested that recurrent feedback loops between border ownership cells in V2 and cells in V4 (corresponding to G-nodes in our model) play a role in the rapid computation of border ownership. [sent-205, score-1.52]

92 Weiss [18] pioneered the application of belief networks to the f/g problem, though their network only considered a more restricted set of local cues and no global ones, such that information only propagated along the contour. [sent-209, score-0.284]

93 Our model has a similar decay for information going through the G-nodes, though it is also influenced by an angular factor defined by the position of the skeletal node. [sent-213, score-0.276]

94 [2] defines grouping cells coding for an interiority preference that decays with the size of the receptive fields of these grouping cells. [sent-216, score-0.302]

95 Our model takes this a step further by including shape (skeletal) structure as a factor in interiority estimates, rather than simply size of receptive fields (which is similar to the rib lengths in our model). [sent-217, score-0.311]

96 Currently, our use of skeletons as shape representations is still limited to medial axis skeletons and surfaces that are not split up by occluders. [sent-218, score-0.408]

97 Eventually, we hope to fully integrate skeleton computation with f/g computation so that the more general problem of shape and surface estimation can be approached in a coherent and unified fashion. [sent-220, score-0.172]

98 Surface construction by a 2-d differentiation-integration process: a neurocomputational model for perceived border ownership, depth, and lightness in kanizsa figures. [sent-274, score-0.558]

99 The role of vertical mirror-symmetry in visual shape detection. [sent-280, score-0.139]

100 Border ownership from intracortical interactions in visual area V2. [sent-325, score-0.446]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('border', 0.489), ('ownership', 0.409), ('bi', 0.343), ('gural', 0.28), ('skeletal', 0.276), ('shapes', 0.124), ('interiority', 0.122), ('curvature', 0.108), ('contour', 0.104), ('shape', 0.102), ('gj', 0.098), ('cues', 0.093), ('rib', 0.087), ('skeletons', 0.087), ('delaunay', 0.077), ('nodes', 0.072), ('owns', 0.07), ('skeleton', 0.07), ('local', 0.069), ('subjects', 0.065), ('gi', 0.063), ('inducing', 0.061), ('network', 0.059), ('bar', 0.058), ('judgments', 0.057), ('medial', 0.057), ('bg', 0.056), ('rutgers', 0.056), ('connections', 0.056), ('cue', 0.055), ('posteriors', 0.05), ('bj', 0.05), ('competition', 0.049), ('grouping', 0.048), ('cells', 0.048), ('surfaces', 0.047), ('intuitive', 0.047), ('owned', 0.046), ('assignment', 0.045), ('globally', 0.043), ('eb', 0.042), ('nonlocal', 0.042), ('concave', 0.042), ('rectangle', 0.042), ('node', 0.041), ('propagation', 0.041), ('feldman', 0.04), ('occluding', 0.04), ('reversal', 0.04), ('piscataway', 0.038), ('propagate', 0.038), ('rapid', 0.037), ('visual', 0.037), ('stimulus', 0.036), ('contours', 0.036), ('bb', 0.036), ('onset', 0.036), ('coding', 0.036), ('direction', 0.036), ('craft', 0.035), ('curved', 0.035), ('exteriority', 0.035), ('figural', 0.035), ('gureground', 0.035), ('inducer', 0.035), ('kanizsa', 0.035), ('kogo', 0.035), ('leuven', 0.035), ('manish', 0.035), ('sprouts', 0.035), ('unambiguous', 0.035), ('overlapping', 0.034), ('perceived', 0.034), ('estimates', 0.034), ('receive', 0.034), ('along', 0.034), ('human', 0.034), ('meaning', 0.033), ('psychophysical', 0.033), ('likelihood', 0.032), ('boundaries', 0.031), ('locally', 0.031), ('bel', 0.031), ('farther', 0.031), ('judged', 0.031), ('nearer', 0.031), ('progression', 0.031), ('gk', 0.029), ('pj', 0.029), ('von', 0.029), ('partly', 0.029), ('belief', 0.029), ('axis', 0.028), ('connected', 0.028), ('deserves', 0.028), ('mises', 0.028), ('predominantly', 0.028), ('triangulation', 0.028), ('uenced', 0.028), ('boundary', 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999917 3 nips-2010-A Bayesian Framework for Figure-Ground Interpretation

Author: Vicky Froyen, Jacob Feldman, Manish Singh

Abstract: Figure/ground assignment, in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing, but its underlying computational mechanisms are poorly understood. Figural assignment (often referred to as border ownership) can vary along a contour, suggesting a spatially distributed process whereby local and global cues are combined to yield local estimates of border ownership. In this paper we model figure/ground estimation in a Bayesian belief network, attempting to capture the propagation of border ownership across the image as local cues (contour curvature and T-junctions) interact with more global cues to yield a figure/ground assignment. Our network includes as a nonlocal factor skeletal (medial axis) structure, under the hypothesis that medial structure “draws” border ownership so that borders are owned by the skeletal hypothesis that best explains them. We also briefly present a psychophysical experiment in which we measured local border ownership along a contour at various distances from an inducing cue (a T-junction). Both the human subjects and the network show similar patterns of performance, converging rapidly to a similar pattern of spatial variation in border ownership along contours. Figure/ground assignment (further referred to as f/g), in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing. A number of factors are known to affect f/g assignment, including region size [9], convexity [7, 16], and symmetry [1, 7, 11]. Figural assignment (often referred to as border ownership, under the assumption that the figural side “owns” the border) is usually studied globally, meaning that entire surfaces and their enclosing boundaries are assumed to receive a globally consistent figural status. But recent psychophysical findings [8] have suggested that border ownership can vary locally along a boundary, even leading to a globally inconsistent figure/ground assignment—broadly consistent with electrophysiological evidence showing local coding for border ownership in area V2 as early as 68 msec after image onset [20]. This suggests a spatially distributed and potentially competitive process of figural assignment [15], in which adjacent surfaces compete to own their common boundary, with figural status propagating across the image as this competition proceeds. But both the principles and computational mechanisms underlying this process are poorly understood. ∗ V.F. was supported by a Fullbright Honorary fellowship and by the Rutgers NSF IGERT program in Perceptual Science, NSF DGE 0549115, J.F. by NIH R01 EY15888, and M.S. by NSF CCF-0541185 1 In this paper we consider how border ownership might propagate over both space and time—that is, across the image as well as over the progression of computation. Following Weiss et al. [18] we adopt a Bayesian belief network architecture, with nodes along boundaries representing estimated border ownership, and connections arranged so that both neighboring nodes and nonlocal integrating nodes combine to influence local estimates of border ownership. Our model is novel in two particular respects: (a) we combine both local and global influences on border ownership in an integrated and principled way; and (b) we include as a nonlocal factor skeletal (medial axis) influences on f/g assignment. Skeletal structure has not been previously considered as a factor on border ownership, but its relevance follows from a model [4] in which shapes are conceived of as generated by or “grown” from an internal skeleton, with the consequence that their boundaries are perceptually “owned” by the skeletal side. We also briey present a psychophysical experiment in which we measured local border ownership along a contour, at several distances from a strong local f/g inducing cue, and at several time delays after the onset of the cue. The results show measurable spatial differences in judged border ownership, with judgments varying with distance from the inducer; but no temporal effect, with essentially asymptotic judgments even after very brief exposures. Both results are consistent with the behavior of the network, which converges quickly to an asymptotic but spatially nonuniform f/g assignment. 1 The Model The Network. For simplicity, we take an edge map as input for the model, assuming that edges and T-junctions have already been detected. From this edge map we then create a Bayesian belief network consisting of four hierarchical levels. At the input level the model receives evidence E from the image, consisting of local contour curvature and T-junctions. The nodes for this level are placed at equidistant locations along the contour. At the first level the model estimates local border ownership. The border ownership, or B-nodes at this level are at the same locations as the E-nodes, but are connected to their nearest neighbors, and are the parent of the E-node at their location. (As a simplifying assumption, such connections are broken at T-junctions in such a way that the occluded contour is disconnected from the occluder.) The highest level has skeletal nodes, S, whose positions are defined by the circumcenters of the Delaunay triangulation on all the E-nodes, creating a coarse medial axis skeleton [13]. Because of the structure of the Delaunay, each S-node is connected to exactly three E-nodes from which they receive information about the position and the local tangent of the contour. In the current state of the model the S-nodes are “passive”, meaning their posteriors are computed before the model is initiated. Between the S nodes and the B nodes are the grouping nodes G. They have the same positions as the S-nodes and the same Delaunay connections, but to B-nodes that have the same image positions as the E-nodes. They will integrate information from distant B-nodes, applying an interiority cue that is influenced by the local strength of skeletal axes as computed by the S-nodes (Fig. 1). Although this is a multiply connected network, we have found that given reasonable parameters the model converges to intuitive posteriors for a variety of shapes (see below). Updating. Our goal is to compute the posterior p(Bi |I), where I is the whole image. Bi is a binary variable coding for the local direction of border ownership, that is, the side that owns the border. In order for border ownership estimates to be influenced by image structure elsewhere in the image, information has to propagate throughout the network. To achieve this propagation, we use standard equations for node updating [14, 12]. However while to all other connections being directed, connections at the B-node level are undirected, causing each node to be child and parent node at the same time. Considering only the B-node level, a node Bi is only separated from the rest of the network by its two neighbors. Hence the Markovian property applies, in that Bi only needs to get iterative information from its neighbors to eventually compute p(Bi |I). So considering the whole network, at each iteration t, Bi receives information from both its child, Ei and from its parents—that is neigbouring nodes (Bi+1 and Bi−1 )—as well as all grouping nodes connected to it (Gj , ..., Gm ). The latter encode for interiority versus exteriority, interiority meaning that the B-node’s estimated gural direction points towards the G-node in question, exteriority meaning that it points away. Integrating all this information creates a multidimensional likelihood function: p(Bi |Bi−1 , Bi+1 , Gj , ..., Gm ). Because of its complexity we choose to approximate it (assuming all nodes are marginally independent of each other when conditioned on Bi ) by 2 Figure 1: Basic network structure of the model. Both skeletal (S-nodes) and border-ownerhsip nodes (B-nodes) get evidence from E-nodes, though different types. S-nodes receive mere positional information, while B-nodes receive information about local curvature and the presence of T-junctions. Because of the structure of the Delaunay triangulation S-nodes and G-nodes (grouping nodes) always get input from exactly three nodes, respectively E and B-nodes. The gray color depicts the fact that this part of the network is computed before the model is initiated and does not thereafter interact with the dynamics of the model. m p(Bi |Pj , ..., Pm ) ∝ p(Bi |Pj ) (1) j where the Pj ’s are the parents of Bi . Given this, at each iteration, each node Bi performs the following computation: Bel(Bi ) ← cλ(Bi )π(Bi )α(Bi )β(Bi ) (2) where conceptually λ stands for bottom-up information, π for top down information and α and β for information received from within the same level. More formally, λ(Bi ) ← p(E|Bi ) (3) m π(Bi ) ← p(Bi |Gj )πGj (Bi ) j (4) Gj and analogously to equation 4 for α(Bi ) and β(Bi ), which compute information coming from Bi−1 and Bi+1 respectively. For these πBi−1 (Bi ), πBi+1 (Bi ), and πGj (Bi ): πGj (Bi ) ← c π(G) λBk (Gj ) (5) k=i πBi−1 (Bi ) ← c β(Bi−1 )λ(Bi−1 )π(Bi−1 ) 3 (6) and πBi+1 (Bi ) is analogous to πBi−1 (Bi ), with c and c being normalization constants. Finally for the G-nodes: Bel(Gi ) ← cλ(Gi )π(Gi ) λ(Gi ) ← (7) λBj (Gi ) (8) j m λBj (Gi ) ← λ(Bj )p(Bi |Gj )[α(Bj )β(Bj ) Bj p(Bi |Gk )πGk (Bi )] (9) k=i Gk The posteriors of the S-nodes are used to compute the π(Gi ). This posterior computes how well the S-node at each position explains the contour—that is, how well it accounts for the cues flowing from the E-nodes it is connected to. Each Delaunay connection between S- and E-nodes can be seen as a rib that sprouts from the skeleton. More specifically each rib sprouts in a direction that is normal (perpendicular) to the tangent of the contour at the E-node plus a random error φi chosen independently for each rib from a von Mises distribution centered on zero, i.e. φi ∼ V (0, κS ) with spread parameter κS [4]. The rib lengths are drawn from an exponential decreasing density function p(ρi ) ∝ e−λS ρi [4]. We can now express how well this node “explains” the three E-nodes it is connected to via the probability that this S-node deserves to be a skeletal node or not, p(S = true|E1 , E2 , E3 ) ∝ p(ρi )p(φi ) (10) i with S = true depicting that this S-node deserves to be a skeletal node. From this we then compute the prior π(Gi ) in such a way that good (high posterior) skeletal nodes induce a high interiority bias, hence a stronger tendency to induce figural status. Conversely, bad (low posterior) skeletal nodes create a prior close to indifferent (uniform) and thus have less (or no) influence on figural status. Likelihood functions Finally we need to express the likelihood function necessary for the updating rules described above. The first two likelihood functions are part of p(Ei |Bi ), one for each of the local cues. The first one, reflecting local curvature, gives the probability of the orientations of the two vectors inherent to Ei (α1 and α2 ) given both direction of figure (θ) encoded in Bi as a von Mises density centered on θ, i.e. αi ∼ V (θ, κEB ). The second likelihood function, reflecting the presence of a T-junction, simply assumes a fixed likelihood when a T-junction is present—that is p(T-junction = true|Bi ) = θT , where Bi places the direction of figure in the direction of the occluder. This likelihood function is only in effect when a T-junction is present, replacing the curvature cue at that node. The third likelihood function serves to keep consistency between nodes of the first level. This function p(Bi |Bi−1 ) or p(Bi |Bi+1 ) is used to compute α(B) and β(B) and is defined 2x2 conditional probability matrix with a single free parameter, θBB (the probability that figural direction at both B-nodes are the same). A fourth and final likelihood function p(Bi |Gj ) serves to propagate information between level one and two. This likelihood function is 2x2 conditional probability matrix matrix with one free parameter, θBG . In this case θBG encodes the probability that the figural direction of the B-node is in the direction of the exterior or interior preference of the G-node. In total this brings us to six free parameters in the model: κS , λS , κEB , θT , θBB , and θBG . 2 Basic Simulations To evaluate the performance of the model, we first tested it on several basic stimulus configurations in which the desired outcome is intuitively clear: a convex shape, a concave shape, a pair of overlapping shapes, and a pair of non-overlapping shapes (Fig. 2,3). The convex shape is the simplest in that curvature never changes sign. The concave shape includes a region with oppositely signed curvature. (The shape is naturally described as predominantly positively curved with a region of negative curvature, i.e. a concavity. But note that it can also be interpreted as predominantly negatively curved “window” with a region of positive curvature, although this is not the intuitive interpretation.) 4 The overlapping pair of shapes consists of two convex shapes with one partly occluding the other, creating a competition between the two shapes for the ownership of the common borderline. Finally the non-overlapping shapes comprise two simple convex shapes that do not touch—again setting up a competition for ownership of the two inner boundaries (i.e. between each shape and the ground space between them). Fig. 2 shows the network structures for each of these four cases. Figure 2: Network structure for the four shape categories (left to right: convex, concave, overlapping, non-overlapping shapes). Blue depict the locations of the B-nodes (and also the E-nodes), the red connections are the connections between B-nodes, the green connections are connections between B-nodes and G-nodes, and the G-nodes (and also the S-nodes) go from orange to dark red. This colour code depicts low (orange) to high (dark red) probability that this is a skeletal node, and hence the strength of the interiority cue. Running our model with hand-estimated parameter values yields highly intuitive posteriors (Fig. 3), an essential “sanity check” to ensure that the network approximates human judgments in simple cases. For the convex shape the model assigns figure to the interior just as one would expect even based solely on local curvature (Fig. 3A). In the concave figure (Fig. 3B), estimated border ownership begins to reverse inside the deep concavity. This may seem surprising, but actually closely matches empirical results obtained when local border ownership is probed psychophysically inside a similarly deep concavity, i.e. a “negative part” in which f/g seems to partly reverse [8]. For the overlapping shapes posteriors were also intuitive, with the occluding shape interpreted as in front and owning the common border (Fig. 3C). Finally, for the two non-overlapping shapes the model computed border-ownership just as one would expect if each shape were run separately, with each shape treated as figural along its entire boundary (Fig. 3D). That is, even though there is skeletal structure in the ground-region between the two shapes (see Fig. 2D), its posterior is weak compared to the skeletal structure inside the shapes, which thus loses the competition to own the boundary between them. For all these configurations, the model not only converged to intuitive estimates but did so rapidly (Fig. 4), always in fewer cycles than would be expected by pure lateral propagation, niterations < Nnodes [18] (with these parameters, typically about five times faster). Figure 3: Posteriors after convergence for the four shape categories (left to right: convex, concave, overlapping, non-overlapping). Arrows indicate estimated border ownership, with direction pointing to the perceived figural side, and length proportional to the magnitude of the posterior. All four simulations used the same parameters. 5 Figure 4: Convergence of the model for the basic shape categories. The vertical lines represent the point of convergence for each of the three shape categories. The posterior change is calculated as |p(Bi = 1|I)t − p(Bi = 1|I)t−1 | at each iteration. 3 Comparison to human data Beyond the simple cases reviewed above, we wished to submit our network to a more fine-grained comparison with human data. To this end we compared its performance to that of human subjects in an experiment we conducted (to be presented in more detail in a future paper). Briefly, our experiment involved finding evidence for propagation of f/g signals across the image. Subjects were first shown a stimulus in which the f/g configuration was globally and locally unambiguous and consistent: a smaller rectangle partly occluding a larger one (Fig. 5A), meaning that the smaller (front) one owns the common border. Then this configuration was perturbed by adding two bars, of which one induced a local f/g reversal—making it now appear locally that the larger rectangle owned the border (Fig. 5B). (The other bar in the display does not alter f/g interpretation, but was included to control for the attentional affects of introducing a bar in the image.) The inducing bar creates T-junctions that serve as strong local f/g cues, in this case tending to reverse the prior global interpretation of the figure. We then measured subjective border ownership along the central contour at various distances from the inducing bar, and at different times after the onset of the bar (25ms, 100ms and 250ms). We measured border ownership locally using a method introduced in [8] in which a local motion probe is introduced at a point on the boundary between two color regions of different colors, and the subject is asked which color appeared to move. Because the figural side “owns” the border, the response reflects perceived figural status. The goal of the experiment was to actually measure the progression of the influence of the inducing T-junction as it (hypothetically) propagated along the boundary. Briefly, we found no evidence of temporal differences, meaning that f/g judgments were essentially constant over time, suggesting rapid convergence of local f/g assignment. (This is consistent with the very rapid convergence of our network, which would suggest a lack of measurable temporal differences except at much shorter time scales than we measured.) But we did find a progressive reduction of f/g reversal with increasing distance from the inducer—that is, the influence of the T-junction decayed with distance. Mean responses aggregated over subjects (shortest delay only) are shown in Fig. 6. In order to run our model on this stimulus (which has a much more complex structure than the simple figures tested above) we had to make some adjustments. We removed the bars from the edge map, leaving only the T-junctions as underlying cues. This was a necessary first step because our model is not yet able to cope with skeletons that are split up by occluders. (The larger rectangle’s skeleton has been split up by the lower bar.) In this way all contours except those created by the bars were used to create the network (Fig. 7). Given this network we ran the model using hand-picked parameters that 6 Figure 5: Stimuli used in the experiment. A. Initial stimulus with locally and globally consistent and unambiguous f/g. B. Subsequently bars were added of which one (the top bar in this case) created a local reversal of f/g. C. Positions at which local f/g judgments of subjects were probed. Figure 6: Results from our experiment aggregated for all 7 subjects (shortest delay only) are shown in red. The x-axis shows distance from the inducing bar at which f/g judgment was probed. The y-axis shows the proportion of trials on which subjects judged the smaller rectangle to own the boundary. As can be seen, the further from the T-junction, the lower the f/g reversal. The fitted model (green curve) shows very similar pattern. Horizontal black line indicates chance performance (ambiguous f/g). gave us the best possible qualitative similarity to the human data. The parameters used never entailed total elimination of the influence of any likelihood function (κS = 16, λS = .025, κEB = .5, θT = .9, θBB = .9, and θBG = .6). As can be seen in Fig. 6 the border-ownership estimates at the locations where we had data show compelling similarities to human judgments. Furthermore along the entire contour the model converged to intuitive border-ownership estimates (Fig. 7) very rapidly (within 36 iterations). The fact that our model yielded intuitive estimates for the current network in which not all contours were completed shows another strength of our model. Because our model included grouping nodes, it did not require contours to be amodally completed [6] in order for information to propagate. 4 Conclusion In this paper we proposed a model rooted in Bayesian belief networks to compute figure/ground. The model uses both local and global cues, combined in a principled way, to achieve a stable and apparently psychologically reasonable estimate of border ownership. Local cues included local curvature and T-junctions, both well-established cues to f/g. Global cues included skeletal structure, 7 Figure 7: (left) Node structure for the experimental stimulus. (right) The model’s local borderownership estimates after convergence. a novel cue motivated by the idea that strongly axial shapes tend to be figural and thus own their boundaries. We successfully tested this model on both simple displays, in which it gave intuitive results, and on a more complex experimental stimulus, in which it gave a close match to the pattern of f/g propagation found in our subjects. Specifically, the model, like the human subjects rapidly converged to a stable local f/g interpretation. Our model’s structure shows several interesting parallels to properties of neural coding of border ownership in visual cortex. Some cortical cells (end-stopped cells) appear to code for local curvature [3] and T-junctions [5]. The B-nodes in our model could be seen as corresponding to cells that code for border ownership [20]. Furthermore, some authors [2] have suggested that recurrent feedback loops between border ownership cells in V2 and cells in V4 (corresponding to G-nodes in our model) play a role in the rapid computation of border ownership. The very rapid convergence we observed in our model likewise appears to be due to the connections between B-nodes and G-nodes. Finally scale-invariant shape representations (such as, speculatively, those based on skeletons) are thought to be present in higher cortical regions such as IT [17], which project down to earlier areas in ways that are not yet understood. A number of parallels to past models of f/g should be mentioned. Weiss [18] pioneered the application of belief networks to the f/g problem, though their network only considered a more restricted set of local cues and no global ones, such that information only propagated along the contour. Furthermore it has not been systematically compared to human judgments. Kogo et al. [10] proposed an exponential decay of f/g signals as they spread throughout the image. Our model has a similar decay for information going through the G-nodes, though it is also influenced by an angular factor defined by the position of the skeletal node. Like the model by Li Zhaoping [19], our model includes horizontal propagation between B-nodes, analogous to border-ownership cells in her model. A neurophysiological model by Craft et al. [2] defines grouping cells coding for an interiority preference that decays with the size of the receptive fields of these grouping cells. Our model takes this a step further by including shape (skeletal) structure as a factor in interiority estimates, rather than simply size of receptive fields (which is similar to the rib lengths in our model). Currently, our use of skeletons as shape representations is still limited to medial axis skeletons and surfaces that are not split up by occluders. Our future goals including integrating skeletons in a more robust way following the probabilistic account suggested by Feldman and Singh [4]. Eventually, we hope to fully integrate skeleton computation with f/g computation so that the more general problem of shape and surface estimation can be approached in a coherent and unified fashion. 8 References [1] P. Bahnsen. Eine untersuchung uber symmetrie und assymmetrie bei visuellen wahrnehmungen. Zeitschrift fur psychology, 108:129–154, 1928. [2] E. Craft, H. Sch¨ tze, E. Niebur, and R. von der Heydt. A neural model of figure-ground u organization. Journal of Neurophysiology, 97:4310–4326, 2007. [3] A. Dobbins, S. W. Zucker, and M. S. Cyander. Endstopping and curvature. Vision Research, 29:1371–1387, 1989. [4] J. Feldman and M. Singh. Bayesian estimation of the shape skeleton. Proceedings of the National Academy of Sciences, 103:18014–18019, 2006. [5] B. Heider, V. Meskenaite, and E. Peterhans. Anatomy and physiology of a neural mechanism defining depth order and contrast polarity at illusory contours. European Journal of Neuroscience, 12:4117–4130, 2000. [6] G. Kanizsa. Organization inVision. New York: Praeger, 1979. [7] G. Kanizsa and W. Gerbino. Vision and Artifact, chapter Convexity and symmetry in figureground organisation, pages 25–32. New York: Springer, 1976. [8] S. Kim and J. Feldman. Globally inconsistent figure/ground relations induced by a negative part. Journal of Vision, 9:1534–7362, 2009. [9] K. Koffka. Principles of Gestalt Psychology. Lund Humphries, London, 1935. [10] N. Kogo, C. Strecha, L. Van Gool, and J. Wagemans. Surface construction by a 2-d differentiation-integration process: a neurocomputational model for perceived border ownership, depth, and lightness in kanizsa figures. Psychological Review, 117:406–439, 2010. [11] B. Machielsen, M. Pauwels, and J. Wagemans. The role of vertical mirror-symmetry in visual shape detection. Journal of Vision, 9:1–11, 2009. [12] K. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: an empirical study. Proceedings of Uncertainty in AI, pages 467–475, 1999. [13] R. L. Ogniewicz and O. K¨ bler. Hierarchic Voronoi skeletons. Pattern Recognition, 28:343– u 359, 1995. [14] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988. [15] M. A. Peterson and E. Skow. Inhibitory competition between shape properties in figureground perception. Journal of Experimental Psychology: Human Perception and Performance, 34:251–267, 2008. [16] K. A. Stevens and A. Brookes. The concave cusp as a determiner of figure-ground. Perception, 17:35–42, 1988. [17] K. Tanaka, H. Saito, Y. Fukada, and M. Moriya. Coding visual images of object in the inferotemporal cortex of the macaque monkey. Journal of Neurophysiology, 66:170–189, 1991. [18] Y. Weiss. Interpreting images by propagating Bayesian beliefs. Adv. in Neural Information Processing Systems, 9:908915, 1997. [19] L. Zhaoping. Border ownership from intracortical interactions in visual area V2. Neuron, 47(1):143–153, Jul 2005. [20] H. Zhou, H. S. Friedman, and R. von der Heydt. Coding of border ownerschip in monkey visual cortex. The Journal of Neuroscience, 20:6594–6611, 2000. 9

2 0.091367014 224 nips-2010-Regularized estimation of image statistics by Score Matching

Author: Diederik P. Kingma, Yann L. Cun

Abstract: Score Matching is a recently-proposed criterion for training high-dimensional density models for which maximum likelihood training is intractable. It has been applied to learning natural image statistics but has so-far been limited to simple models due to the difficulty of differentiating the loss with respect to the model parameters. We show how this differentiation can be automated with an extended version of the double-backpropagation algorithm. In addition, we introduce a regularization term for the Score Matching loss that enables its use for a broader range of problem by suppressing instabilities that occur with finite training sample sizes and quantized input values. Results are reported for image denoising and super-resolution.

3 0.074050382 63 nips-2010-Distributed Dual Averaging In Networks

Author: Alekh Agarwal, Martin J. Wainwright, John C. Duchi

Abstract: The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. We develop and analyze distributed algorithms based on dual averaging of subgradients, and provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis clearly separates the convergence of the optimization algorithm itself from the effects of communication constraints arising from the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and simulations for various networks. 1

4 0.060980193 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio

Author: Fuxin Li, Cristian Sminchisescu

Abstract: We propose an approach to multiple-instance learning that reformulates the problem as a convex optimization on the likelihood ratio between the positive and the negative class for each training instance. This is casted as joint estimation of both a likelihood ratio predictor and the target (likelihood ratio variable) for instances. Theoretically, we prove a quantitative relationship between the risk estimated under the 0-1 classification loss, and under a loss function for likelihood ratio. It is shown that likelihood ratio estimation is generally a good surrogate for the 0-1 loss, and separates positive and negative instances well. The likelihood ratio estimates provide a ranking of instances within a bag and are used as input features to learn a linear classifier on bags of instances. Instance-level classification is achieved from the bag-level predictions and the individual likelihood ratios. Experiments on synthetic and real datasets demonstrate the competitiveness of the approach.

5 0.05305019 170 nips-2010-Moreau-Yosida Regularization for Grouped Tree Structure Learning

Author: Jun Liu, Jieping Ye

Abstract: We consider the tree structured group Lasso where the structure over the features can be represented as a tree with leaf nodes as features and internal nodes as clusters of the features. The structured regularization with a pre-defined tree structure is based on a group-Lasso penalty, where one group is defined for each node in the tree. Such a regularization can help uncover the structured sparsity, which is desirable for applications with some meaningful tree structures on the features. However, the tree structured group Lasso is challenging to solve due to the complex regularization. In this paper, we develop an efficient algorithm for the tree structured group Lasso. One of the key steps in the proposed algorithm is to solve the Moreau-Yosida regularization associated with the grouped tree structure. The main technical contributions of this paper include (1) we show that the associated Moreau-Yosida regularization admits an analytical solution, and (2) we develop an efficient algorithm for determining the effective interval for the regularization parameter. Our experimental results on the AR and JAFFE face data sets demonstrate the efficiency and effectiveness of the proposed algorithm.

6 0.052311722 98 nips-2010-Functional form of motion priors in human motion perception

7 0.05002477 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning

8 0.047840852 241 nips-2010-Size Matters: Metric Visual Search Constraints from Monocular Metadata

9 0.047837771 149 nips-2010-Learning To Count Objects in Images

10 0.046403404 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits

11 0.045797829 20 nips-2010-A unified model of short-range and long-range motion perception

12 0.045725804 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data

13 0.045229841 59 nips-2010-Deep Coding Network

14 0.045079689 119 nips-2010-Implicit encoding of prior probabilities in optimal neural populations

15 0.044271503 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models

16 0.043952771 36 nips-2010-Avoiding False Positive in Multi-Instance Learning

17 0.043779757 127 nips-2010-Inferring Stimulus Selectivity from the Spatial Structure of Neural Network Dynamics

18 0.042527691 76 nips-2010-Energy Disaggregation via Discriminative Sparse Coding

19 0.041524962 268 nips-2010-The Neural Costs of Optimal Control

20 0.041154101 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.123), (1, 0.046), (2, -0.076), (3, 0.026), (4, -0.016), (5, -0.025), (6, -0.012), (7, 0.011), (8, 0.002), (9, 0.015), (10, -0.023), (11, -0.048), (12, -0.06), (13, -0.001), (14, -0.004), (15, -0.024), (16, -0.054), (17, -0.033), (18, -0.024), (19, -0.005), (20, 0.028), (21, -0.061), (22, 0.051), (23, -0.014), (24, 0.016), (25, 0.058), (26, 0.038), (27, -0.025), (28, -0.014), (29, 0.005), (30, -0.006), (31, 0.024), (32, 0.059), (33, -0.091), (34, -0.004), (35, -0.069), (36, -0.036), (37, -0.028), (38, -0.052), (39, -0.115), (40, 0.075), (41, 0.002), (42, 0.073), (43, 0.07), (44, 0.04), (45, 0.0), (46, 0.021), (47, -0.09), (48, 0.007), (49, -0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94460469 3 nips-2010-A Bayesian Framework for Figure-Ground Interpretation

Author: Vicky Froyen, Jacob Feldman, Manish Singh

Abstract: Figure/ground assignment, in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing, but its underlying computational mechanisms are poorly understood. Figural assignment (often referred to as border ownership) can vary along a contour, suggesting a spatially distributed process whereby local and global cues are combined to yield local estimates of border ownership. In this paper we model figure/ground estimation in a Bayesian belief network, attempting to capture the propagation of border ownership across the image as local cues (contour curvature and T-junctions) interact with more global cues to yield a figure/ground assignment. Our network includes as a nonlocal factor skeletal (medial axis) structure, under the hypothesis that medial structure “draws” border ownership so that borders are owned by the skeletal hypothesis that best explains them. We also briefly present a psychophysical experiment in which we measured local border ownership along a contour at various distances from an inducing cue (a T-junction). Both the human subjects and the network show similar patterns of performance, converging rapidly to a similar pattern of spatial variation in border ownership along contours. Figure/ground assignment (further referred to as f/g), in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing. A number of factors are known to affect f/g assignment, including region size [9], convexity [7, 16], and symmetry [1, 7, 11]. Figural assignment (often referred to as border ownership, under the assumption that the figural side “owns” the border) is usually studied globally, meaning that entire surfaces and their enclosing boundaries are assumed to receive a globally consistent figural status. But recent psychophysical findings [8] have suggested that border ownership can vary locally along a boundary, even leading to a globally inconsistent figure/ground assignment—broadly consistent with electrophysiological evidence showing local coding for border ownership in area V2 as early as 68 msec after image onset [20]. This suggests a spatially distributed and potentially competitive process of figural assignment [15], in which adjacent surfaces compete to own their common boundary, with figural status propagating across the image as this competition proceeds. But both the principles and computational mechanisms underlying this process are poorly understood. ∗ V.F. was supported by a Fullbright Honorary fellowship and by the Rutgers NSF IGERT program in Perceptual Science, NSF DGE 0549115, J.F. by NIH R01 EY15888, and M.S. by NSF CCF-0541185 1 In this paper we consider how border ownership might propagate over both space and time—that is, across the image as well as over the progression of computation. Following Weiss et al. [18] we adopt a Bayesian belief network architecture, with nodes along boundaries representing estimated border ownership, and connections arranged so that both neighboring nodes and nonlocal integrating nodes combine to influence local estimates of border ownership. Our model is novel in two particular respects: (a) we combine both local and global influences on border ownership in an integrated and principled way; and (b) we include as a nonlocal factor skeletal (medial axis) influences on f/g assignment. Skeletal structure has not been previously considered as a factor on border ownership, but its relevance follows from a model [4] in which shapes are conceived of as generated by or “grown” from an internal skeleton, with the consequence that their boundaries are perceptually “owned” by the skeletal side. We also briey present a psychophysical experiment in which we measured local border ownership along a contour, at several distances from a strong local f/g inducing cue, and at several time delays after the onset of the cue. The results show measurable spatial differences in judged border ownership, with judgments varying with distance from the inducer; but no temporal effect, with essentially asymptotic judgments even after very brief exposures. Both results are consistent with the behavior of the network, which converges quickly to an asymptotic but spatially nonuniform f/g assignment. 1 The Model The Network. For simplicity, we take an edge map as input for the model, assuming that edges and T-junctions have already been detected. From this edge map we then create a Bayesian belief network consisting of four hierarchical levels. At the input level the model receives evidence E from the image, consisting of local contour curvature and T-junctions. The nodes for this level are placed at equidistant locations along the contour. At the first level the model estimates local border ownership. The border ownership, or B-nodes at this level are at the same locations as the E-nodes, but are connected to their nearest neighbors, and are the parent of the E-node at their location. (As a simplifying assumption, such connections are broken at T-junctions in such a way that the occluded contour is disconnected from the occluder.) The highest level has skeletal nodes, S, whose positions are defined by the circumcenters of the Delaunay triangulation on all the E-nodes, creating a coarse medial axis skeleton [13]. Because of the structure of the Delaunay, each S-node is connected to exactly three E-nodes from which they receive information about the position and the local tangent of the contour. In the current state of the model the S-nodes are “passive”, meaning their posteriors are computed before the model is initiated. Between the S nodes and the B nodes are the grouping nodes G. They have the same positions as the S-nodes and the same Delaunay connections, but to B-nodes that have the same image positions as the E-nodes. They will integrate information from distant B-nodes, applying an interiority cue that is influenced by the local strength of skeletal axes as computed by the S-nodes (Fig. 1). Although this is a multiply connected network, we have found that given reasonable parameters the model converges to intuitive posteriors for a variety of shapes (see below). Updating. Our goal is to compute the posterior p(Bi |I), where I is the whole image. Bi is a binary variable coding for the local direction of border ownership, that is, the side that owns the border. In order for border ownership estimates to be influenced by image structure elsewhere in the image, information has to propagate throughout the network. To achieve this propagation, we use standard equations for node updating [14, 12]. However while to all other connections being directed, connections at the B-node level are undirected, causing each node to be child and parent node at the same time. Considering only the B-node level, a node Bi is only separated from the rest of the network by its two neighbors. Hence the Markovian property applies, in that Bi only needs to get iterative information from its neighbors to eventually compute p(Bi |I). So considering the whole network, at each iteration t, Bi receives information from both its child, Ei and from its parents—that is neigbouring nodes (Bi+1 and Bi−1 )—as well as all grouping nodes connected to it (Gj , ..., Gm ). The latter encode for interiority versus exteriority, interiority meaning that the B-node’s estimated gural direction points towards the G-node in question, exteriority meaning that it points away. Integrating all this information creates a multidimensional likelihood function: p(Bi |Bi−1 , Bi+1 , Gj , ..., Gm ). Because of its complexity we choose to approximate it (assuming all nodes are marginally independent of each other when conditioned on Bi ) by 2 Figure 1: Basic network structure of the model. Both skeletal (S-nodes) and border-ownerhsip nodes (B-nodes) get evidence from E-nodes, though different types. S-nodes receive mere positional information, while B-nodes receive information about local curvature and the presence of T-junctions. Because of the structure of the Delaunay triangulation S-nodes and G-nodes (grouping nodes) always get input from exactly three nodes, respectively E and B-nodes. The gray color depicts the fact that this part of the network is computed before the model is initiated and does not thereafter interact with the dynamics of the model. m p(Bi |Pj , ..., Pm ) ∝ p(Bi |Pj ) (1) j where the Pj ’s are the parents of Bi . Given this, at each iteration, each node Bi performs the following computation: Bel(Bi ) ← cλ(Bi )π(Bi )α(Bi )β(Bi ) (2) where conceptually λ stands for bottom-up information, π for top down information and α and β for information received from within the same level. More formally, λ(Bi ) ← p(E|Bi ) (3) m π(Bi ) ← p(Bi |Gj )πGj (Bi ) j (4) Gj and analogously to equation 4 for α(Bi ) and β(Bi ), which compute information coming from Bi−1 and Bi+1 respectively. For these πBi−1 (Bi ), πBi+1 (Bi ), and πGj (Bi ): πGj (Bi ) ← c π(G) λBk (Gj ) (5) k=i πBi−1 (Bi ) ← c β(Bi−1 )λ(Bi−1 )π(Bi−1 ) 3 (6) and πBi+1 (Bi ) is analogous to πBi−1 (Bi ), with c and c being normalization constants. Finally for the G-nodes: Bel(Gi ) ← cλ(Gi )π(Gi ) λ(Gi ) ← (7) λBj (Gi ) (8) j m λBj (Gi ) ← λ(Bj )p(Bi |Gj )[α(Bj )β(Bj ) Bj p(Bi |Gk )πGk (Bi )] (9) k=i Gk The posteriors of the S-nodes are used to compute the π(Gi ). This posterior computes how well the S-node at each position explains the contour—that is, how well it accounts for the cues flowing from the E-nodes it is connected to. Each Delaunay connection between S- and E-nodes can be seen as a rib that sprouts from the skeleton. More specifically each rib sprouts in a direction that is normal (perpendicular) to the tangent of the contour at the E-node plus a random error φi chosen independently for each rib from a von Mises distribution centered on zero, i.e. φi ∼ V (0, κS ) with spread parameter κS [4]. The rib lengths are drawn from an exponential decreasing density function p(ρi ) ∝ e−λS ρi [4]. We can now express how well this node “explains” the three E-nodes it is connected to via the probability that this S-node deserves to be a skeletal node or not, p(S = true|E1 , E2 , E3 ) ∝ p(ρi )p(φi ) (10) i with S = true depicting that this S-node deserves to be a skeletal node. From this we then compute the prior π(Gi ) in such a way that good (high posterior) skeletal nodes induce a high interiority bias, hence a stronger tendency to induce figural status. Conversely, bad (low posterior) skeletal nodes create a prior close to indifferent (uniform) and thus have less (or no) influence on figural status. Likelihood functions Finally we need to express the likelihood function necessary for the updating rules described above. The first two likelihood functions are part of p(Ei |Bi ), one for each of the local cues. The first one, reflecting local curvature, gives the probability of the orientations of the two vectors inherent to Ei (α1 and α2 ) given both direction of figure (θ) encoded in Bi as a von Mises density centered on θ, i.e. αi ∼ V (θ, κEB ). The second likelihood function, reflecting the presence of a T-junction, simply assumes a fixed likelihood when a T-junction is present—that is p(T-junction = true|Bi ) = θT , where Bi places the direction of figure in the direction of the occluder. This likelihood function is only in effect when a T-junction is present, replacing the curvature cue at that node. The third likelihood function serves to keep consistency between nodes of the first level. This function p(Bi |Bi−1 ) or p(Bi |Bi+1 ) is used to compute α(B) and β(B) and is defined 2x2 conditional probability matrix with a single free parameter, θBB (the probability that figural direction at both B-nodes are the same). A fourth and final likelihood function p(Bi |Gj ) serves to propagate information between level one and two. This likelihood function is 2x2 conditional probability matrix matrix with one free parameter, θBG . In this case θBG encodes the probability that the figural direction of the B-node is in the direction of the exterior or interior preference of the G-node. In total this brings us to six free parameters in the model: κS , λS , κEB , θT , θBB , and θBG . 2 Basic Simulations To evaluate the performance of the model, we first tested it on several basic stimulus configurations in which the desired outcome is intuitively clear: a convex shape, a concave shape, a pair of overlapping shapes, and a pair of non-overlapping shapes (Fig. 2,3). The convex shape is the simplest in that curvature never changes sign. The concave shape includes a region with oppositely signed curvature. (The shape is naturally described as predominantly positively curved with a region of negative curvature, i.e. a concavity. But note that it can also be interpreted as predominantly negatively curved “window” with a region of positive curvature, although this is not the intuitive interpretation.) 4 The overlapping pair of shapes consists of two convex shapes with one partly occluding the other, creating a competition between the two shapes for the ownership of the common borderline. Finally the non-overlapping shapes comprise two simple convex shapes that do not touch—again setting up a competition for ownership of the two inner boundaries (i.e. between each shape and the ground space between them). Fig. 2 shows the network structures for each of these four cases. Figure 2: Network structure for the four shape categories (left to right: convex, concave, overlapping, non-overlapping shapes). Blue depict the locations of the B-nodes (and also the E-nodes), the red connections are the connections between B-nodes, the green connections are connections between B-nodes and G-nodes, and the G-nodes (and also the S-nodes) go from orange to dark red. This colour code depicts low (orange) to high (dark red) probability that this is a skeletal node, and hence the strength of the interiority cue. Running our model with hand-estimated parameter values yields highly intuitive posteriors (Fig. 3), an essential “sanity check” to ensure that the network approximates human judgments in simple cases. For the convex shape the model assigns figure to the interior just as one would expect even based solely on local curvature (Fig. 3A). In the concave figure (Fig. 3B), estimated border ownership begins to reverse inside the deep concavity. This may seem surprising, but actually closely matches empirical results obtained when local border ownership is probed psychophysically inside a similarly deep concavity, i.e. a “negative part” in which f/g seems to partly reverse [8]. For the overlapping shapes posteriors were also intuitive, with the occluding shape interpreted as in front and owning the common border (Fig. 3C). Finally, for the two non-overlapping shapes the model computed border-ownership just as one would expect if each shape were run separately, with each shape treated as figural along its entire boundary (Fig. 3D). That is, even though there is skeletal structure in the ground-region between the two shapes (see Fig. 2D), its posterior is weak compared to the skeletal structure inside the shapes, which thus loses the competition to own the boundary between them. For all these configurations, the model not only converged to intuitive estimates but did so rapidly (Fig. 4), always in fewer cycles than would be expected by pure lateral propagation, niterations < Nnodes [18] (with these parameters, typically about five times faster). Figure 3: Posteriors after convergence for the four shape categories (left to right: convex, concave, overlapping, non-overlapping). Arrows indicate estimated border ownership, with direction pointing to the perceived figural side, and length proportional to the magnitude of the posterior. All four simulations used the same parameters. 5 Figure 4: Convergence of the model for the basic shape categories. The vertical lines represent the point of convergence for each of the three shape categories. The posterior change is calculated as |p(Bi = 1|I)t − p(Bi = 1|I)t−1 | at each iteration. 3 Comparison to human data Beyond the simple cases reviewed above, we wished to submit our network to a more fine-grained comparison with human data. To this end we compared its performance to that of human subjects in an experiment we conducted (to be presented in more detail in a future paper). Briefly, our experiment involved finding evidence for propagation of f/g signals across the image. Subjects were first shown a stimulus in which the f/g configuration was globally and locally unambiguous and consistent: a smaller rectangle partly occluding a larger one (Fig. 5A), meaning that the smaller (front) one owns the common border. Then this configuration was perturbed by adding two bars, of which one induced a local f/g reversal—making it now appear locally that the larger rectangle owned the border (Fig. 5B). (The other bar in the display does not alter f/g interpretation, but was included to control for the attentional affects of introducing a bar in the image.) The inducing bar creates T-junctions that serve as strong local f/g cues, in this case tending to reverse the prior global interpretation of the figure. We then measured subjective border ownership along the central contour at various distances from the inducing bar, and at different times after the onset of the bar (25ms, 100ms and 250ms). We measured border ownership locally using a method introduced in [8] in which a local motion probe is introduced at a point on the boundary between two color regions of different colors, and the subject is asked which color appeared to move. Because the figural side “owns” the border, the response reflects perceived figural status. The goal of the experiment was to actually measure the progression of the influence of the inducing T-junction as it (hypothetically) propagated along the boundary. Briefly, we found no evidence of temporal differences, meaning that f/g judgments were essentially constant over time, suggesting rapid convergence of local f/g assignment. (This is consistent with the very rapid convergence of our network, which would suggest a lack of measurable temporal differences except at much shorter time scales than we measured.) But we did find a progressive reduction of f/g reversal with increasing distance from the inducer—that is, the influence of the T-junction decayed with distance. Mean responses aggregated over subjects (shortest delay only) are shown in Fig. 6. In order to run our model on this stimulus (which has a much more complex structure than the simple figures tested above) we had to make some adjustments. We removed the bars from the edge map, leaving only the T-junctions as underlying cues. This was a necessary first step because our model is not yet able to cope with skeletons that are split up by occluders. (The larger rectangle’s skeleton has been split up by the lower bar.) In this way all contours except those created by the bars were used to create the network (Fig. 7). Given this network we ran the model using hand-picked parameters that 6 Figure 5: Stimuli used in the experiment. A. Initial stimulus with locally and globally consistent and unambiguous f/g. B. Subsequently bars were added of which one (the top bar in this case) created a local reversal of f/g. C. Positions at which local f/g judgments of subjects were probed. Figure 6: Results from our experiment aggregated for all 7 subjects (shortest delay only) are shown in red. The x-axis shows distance from the inducing bar at which f/g judgment was probed. The y-axis shows the proportion of trials on which subjects judged the smaller rectangle to own the boundary. As can be seen, the further from the T-junction, the lower the f/g reversal. The fitted model (green curve) shows very similar pattern. Horizontal black line indicates chance performance (ambiguous f/g). gave us the best possible qualitative similarity to the human data. The parameters used never entailed total elimination of the influence of any likelihood function (κS = 16, λS = .025, κEB = .5, θT = .9, θBB = .9, and θBG = .6). As can be seen in Fig. 6 the border-ownership estimates at the locations where we had data show compelling similarities to human judgments. Furthermore along the entire contour the model converged to intuitive border-ownership estimates (Fig. 7) very rapidly (within 36 iterations). The fact that our model yielded intuitive estimates for the current network in which not all contours were completed shows another strength of our model. Because our model included grouping nodes, it did not require contours to be amodally completed [6] in order for information to propagate. 4 Conclusion In this paper we proposed a model rooted in Bayesian belief networks to compute figure/ground. The model uses both local and global cues, combined in a principled way, to achieve a stable and apparently psychologically reasonable estimate of border ownership. Local cues included local curvature and T-junctions, both well-established cues to f/g. Global cues included skeletal structure, 7 Figure 7: (left) Node structure for the experimental stimulus. (right) The model’s local borderownership estimates after convergence. a novel cue motivated by the idea that strongly axial shapes tend to be figural and thus own their boundaries. We successfully tested this model on both simple displays, in which it gave intuitive results, and on a more complex experimental stimulus, in which it gave a close match to the pattern of f/g propagation found in our subjects. Specifically, the model, like the human subjects rapidly converged to a stable local f/g interpretation. Our model’s structure shows several interesting parallels to properties of neural coding of border ownership in visual cortex. Some cortical cells (end-stopped cells) appear to code for local curvature [3] and T-junctions [5]. The B-nodes in our model could be seen as corresponding to cells that code for border ownership [20]. Furthermore, some authors [2] have suggested that recurrent feedback loops between border ownership cells in V2 and cells in V4 (corresponding to G-nodes in our model) play a role in the rapid computation of border ownership. The very rapid convergence we observed in our model likewise appears to be due to the connections between B-nodes and G-nodes. Finally scale-invariant shape representations (such as, speculatively, those based on skeletons) are thought to be present in higher cortical regions such as IT [17], which project down to earlier areas in ways that are not yet understood. A number of parallels to past models of f/g should be mentioned. Weiss [18] pioneered the application of belief networks to the f/g problem, though their network only considered a more restricted set of local cues and no global ones, such that information only propagated along the contour. Furthermore it has not been systematically compared to human judgments. Kogo et al. [10] proposed an exponential decay of f/g signals as they spread throughout the image. Our model has a similar decay for information going through the G-nodes, though it is also influenced by an angular factor defined by the position of the skeletal node. Like the model by Li Zhaoping [19], our model includes horizontal propagation between B-nodes, analogous to border-ownership cells in her model. A neurophysiological model by Craft et al. [2] defines grouping cells coding for an interiority preference that decays with the size of the receptive fields of these grouping cells. Our model takes this a step further by including shape (skeletal) structure as a factor in interiority estimates, rather than simply size of receptive fields (which is similar to the rib lengths in our model). Currently, our use of skeletons as shape representations is still limited to medial axis skeletons and surfaces that are not split up by occluders. Our future goals including integrating skeletons in a more robust way following the probabilistic account suggested by Feldman and Singh [4]. Eventually, we hope to fully integrate skeleton computation with f/g computation so that the more general problem of shape and surface estimation can be approached in a coherent and unified fashion. 8 References [1] P. Bahnsen. Eine untersuchung uber symmetrie und assymmetrie bei visuellen wahrnehmungen. Zeitschrift fur psychology, 108:129–154, 1928. [2] E. Craft, H. Sch¨ tze, E. Niebur, and R. von der Heydt. A neural model of figure-ground u organization. Journal of Neurophysiology, 97:4310–4326, 2007. [3] A. Dobbins, S. W. Zucker, and M. S. Cyander. Endstopping and curvature. Vision Research, 29:1371–1387, 1989. [4] J. Feldman and M. Singh. Bayesian estimation of the shape skeleton. Proceedings of the National Academy of Sciences, 103:18014–18019, 2006. [5] B. Heider, V. Meskenaite, and E. Peterhans. Anatomy and physiology of a neural mechanism defining depth order and contrast polarity at illusory contours. European Journal of Neuroscience, 12:4117–4130, 2000. [6] G. Kanizsa. Organization inVision. New York: Praeger, 1979. [7] G. Kanizsa and W. Gerbino. Vision and Artifact, chapter Convexity and symmetry in figureground organisation, pages 25–32. New York: Springer, 1976. [8] S. Kim and J. Feldman. Globally inconsistent figure/ground relations induced by a negative part. Journal of Vision, 9:1534–7362, 2009. [9] K. Koffka. Principles of Gestalt Psychology. Lund Humphries, London, 1935. [10] N. Kogo, C. Strecha, L. Van Gool, and J. Wagemans. Surface construction by a 2-d differentiation-integration process: a neurocomputational model for perceived border ownership, depth, and lightness in kanizsa figures. Psychological Review, 117:406–439, 2010. [11] B. Machielsen, M. Pauwels, and J. Wagemans. The role of vertical mirror-symmetry in visual shape detection. Journal of Vision, 9:1–11, 2009. [12] K. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: an empirical study. Proceedings of Uncertainty in AI, pages 467–475, 1999. [13] R. L. Ogniewicz and O. K¨ bler. Hierarchic Voronoi skeletons. Pattern Recognition, 28:343– u 359, 1995. [14] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988. [15] M. A. Peterson and E. Skow. Inhibitory competition between shape properties in figureground perception. Journal of Experimental Psychology: Human Perception and Performance, 34:251–267, 2008. [16] K. A. Stevens and A. Brookes. The concave cusp as a determiner of figure-ground. Perception, 17:35–42, 1988. [17] K. Tanaka, H. Saito, Y. Fukada, and M. Moriya. Coding visual images of object in the inferotemporal cortex of the macaque monkey. Journal of Neurophysiology, 66:170–189, 1991. [18] Y. Weiss. Interpreting images by propagating Bayesian beliefs. Adv. in Neural Information Processing Systems, 9:908915, 1997. [19] L. Zhaoping. Border ownership from intracortical interactions in visual area V2. Neuron, 47(1):143–153, Jul 2005. [20] H. Zhou, H. S. Friedman, and R. von der Heydt. Coding of border ownerschip in monkey visual cortex. The Journal of Neuroscience, 20:6594–6611, 2000. 9

2 0.57364964 224 nips-2010-Regularized estimation of image statistics by Score Matching

Author: Diederik P. Kingma, Yann L. Cun

Abstract: Score Matching is a recently-proposed criterion for training high-dimensional density models for which maximum likelihood training is intractable. It has been applied to learning natural image statistics but has so-far been limited to simple models due to the difficulty of differentiating the loss with respect to the model parameters. We show how this differentiation can be automated with an extended version of the double-backpropagation algorithm. In addition, we introduce a regularization term for the Score Matching loss that enables its use for a broader range of problem by suppressing instabilities that occur with finite training sample sizes and quantized input values. Results are reported for image denoising and super-resolution.

3 0.50831819 95 nips-2010-Feature Transitions with Saccadic Search: Size, Color, and Orientation Are Not Alike

Author: Stella X. Yu

Abstract: Size, color, and orientation have long been considered elementary features whose attributes are extracted in parallel and available to guide the deployment of attention. If each is processed in the same fashion with simply a different set of local detectors, one would expect similar search behaviours on localizing an equivalent flickering change among identically laid out disks. We analyze feature transitions associated with saccadic search and find out that size, color, and orientation are not alike in dynamic attribute processing over time. The Markovian feature transition is attractive for size, repulsive for color, and largely reversible for orientation. 1

4 0.49893355 84 nips-2010-Exact inference and learning for cumulative distribution functions on loopy graphs

Author: Nebojsa Jojic, Chris Meek, Jim C. Huang

Abstract: Many problem domains including climatology and epidemiology require models that can capture both heavy-tailed statistics and local dependencies. Specifying such distributions using graphical models for probability density functions (PDFs) generally lead to intractable inference and learning. Cumulative distribution networks (CDNs) provide a means to tractably specify multivariate heavy-tailed models as a product of cumulative distribution functions (CDFs). Existing algorithms for inference and learning in CDNs are limited to those with tree-structured (nonloopy) graphs. In this paper, we develop inference and learning algorithms for CDNs with arbitrary topology. Our approach to inference and learning relies on recursively decomposing the computation of mixed derivatives based on a junction trees over the cumulative distribution functions. We demonstrate that our systematic approach to utilizing the sparsity represented by the junction tree yields signiďŹ cant performance improvements over the general symbolic differentiation programs Mathematica and D*. Using two real-world datasets, we demonstrate that non-tree structured (loopy) CDNs are able to provide signiďŹ cantly better ďŹ ts to the data as compared to tree-structured and unstructured CDNs and other heavy-tailed multivariate distributions such as the multivariate copula and logistic models.

5 0.4966194 63 nips-2010-Distributed Dual Averaging In Networks

Author: Alekh Agarwal, Martin J. Wainwright, John C. Duchi

Abstract: The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. We develop and analyze distributed algorithms based on dual averaging of subgradients, and provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis clearly separates the convergence of the optimization algorithm itself from the effects of communication constraints arising from the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and simulations for various networks. 1

6 0.4748984 17 nips-2010-A biologically plausible network for the computation of orientation dominance

7 0.46004307 107 nips-2010-Global seismic monitoring as probabilistic inference

8 0.45501187 1 nips-2010-(RF)^2 -- Random Forest Random Field

9 0.44631362 214 nips-2010-Probabilistic Belief Revision with Structural Constraints

10 0.44329163 190 nips-2010-On the Convexity of Latent Social Network Inference

11 0.43260041 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits

12 0.42791143 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio

13 0.41689804 36 nips-2010-Avoiding False Positive in Multi-Instance Learning

14 0.41340083 234 nips-2010-Segmentation as Maximum-Weight Independent Set

15 0.40653947 170 nips-2010-Moreau-Yosida Regularization for Grouped Tree Structure Learning

16 0.397432 266 nips-2010-The Maximal Causes of Natural Scenes are Edge Filters

17 0.39636815 81 nips-2010-Evaluating neuronal codes for inference using Fisher information

18 0.38039064 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs

19 0.37937805 129 nips-2010-Inter-time segment information sharing for non-homogeneous dynamic Bayesian networks

20 0.37624341 119 nips-2010-Implicit encoding of prior probabilities in optimal neural populations


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.027), (27, 0.121), (30, 0.044), (35, 0.024), (45, 0.212), (50, 0.034), (52, 0.025), (57, 0.297), (60, 0.028), (77, 0.061), (90, 0.041)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.8231619 3 nips-2010-A Bayesian Framework for Figure-Ground Interpretation

Author: Vicky Froyen, Jacob Feldman, Manish Singh

Abstract: Figure/ground assignment, in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing, but its underlying computational mechanisms are poorly understood. Figural assignment (often referred to as border ownership) can vary along a contour, suggesting a spatially distributed process whereby local and global cues are combined to yield local estimates of border ownership. In this paper we model figure/ground estimation in a Bayesian belief network, attempting to capture the propagation of border ownership across the image as local cues (contour curvature and T-junctions) interact with more global cues to yield a figure/ground assignment. Our network includes as a nonlocal factor skeletal (medial axis) structure, under the hypothesis that medial structure “draws” border ownership so that borders are owned by the skeletal hypothesis that best explains them. We also briefly present a psychophysical experiment in which we measured local border ownership along a contour at various distances from an inducing cue (a T-junction). Both the human subjects and the network show similar patterns of performance, converging rapidly to a similar pattern of spatial variation in border ownership along contours. Figure/ground assignment (further referred to as f/g), in which the visual image is divided into nearer (figural) and farther (ground) surfaces, is an essential step in visual processing. A number of factors are known to affect f/g assignment, including region size [9], convexity [7, 16], and symmetry [1, 7, 11]. Figural assignment (often referred to as border ownership, under the assumption that the figural side “owns” the border) is usually studied globally, meaning that entire surfaces and their enclosing boundaries are assumed to receive a globally consistent figural status. But recent psychophysical findings [8] have suggested that border ownership can vary locally along a boundary, even leading to a globally inconsistent figure/ground assignment—broadly consistent with electrophysiological evidence showing local coding for border ownership in area V2 as early as 68 msec after image onset [20]. This suggests a spatially distributed and potentially competitive process of figural assignment [15], in which adjacent surfaces compete to own their common boundary, with figural status propagating across the image as this competition proceeds. But both the principles and computational mechanisms underlying this process are poorly understood. ∗ V.F. was supported by a Fullbright Honorary fellowship and by the Rutgers NSF IGERT program in Perceptual Science, NSF DGE 0549115, J.F. by NIH R01 EY15888, and M.S. by NSF CCF-0541185 1 In this paper we consider how border ownership might propagate over both space and time—that is, across the image as well as over the progression of computation. Following Weiss et al. [18] we adopt a Bayesian belief network architecture, with nodes along boundaries representing estimated border ownership, and connections arranged so that both neighboring nodes and nonlocal integrating nodes combine to influence local estimates of border ownership. Our model is novel in two particular respects: (a) we combine both local and global influences on border ownership in an integrated and principled way; and (b) we include as a nonlocal factor skeletal (medial axis) influences on f/g assignment. Skeletal structure has not been previously considered as a factor on border ownership, but its relevance follows from a model [4] in which shapes are conceived of as generated by or “grown” from an internal skeleton, with the consequence that their boundaries are perceptually “owned” by the skeletal side. We also briey present a psychophysical experiment in which we measured local border ownership along a contour, at several distances from a strong local f/g inducing cue, and at several time delays after the onset of the cue. The results show measurable spatial differences in judged border ownership, with judgments varying with distance from the inducer; but no temporal effect, with essentially asymptotic judgments even after very brief exposures. Both results are consistent with the behavior of the network, which converges quickly to an asymptotic but spatially nonuniform f/g assignment. 1 The Model The Network. For simplicity, we take an edge map as input for the model, assuming that edges and T-junctions have already been detected. From this edge map we then create a Bayesian belief network consisting of four hierarchical levels. At the input level the model receives evidence E from the image, consisting of local contour curvature and T-junctions. The nodes for this level are placed at equidistant locations along the contour. At the first level the model estimates local border ownership. The border ownership, or B-nodes at this level are at the same locations as the E-nodes, but are connected to their nearest neighbors, and are the parent of the E-node at their location. (As a simplifying assumption, such connections are broken at T-junctions in such a way that the occluded contour is disconnected from the occluder.) The highest level has skeletal nodes, S, whose positions are defined by the circumcenters of the Delaunay triangulation on all the E-nodes, creating a coarse medial axis skeleton [13]. Because of the structure of the Delaunay, each S-node is connected to exactly three E-nodes from which they receive information about the position and the local tangent of the contour. In the current state of the model the S-nodes are “passive”, meaning their posteriors are computed before the model is initiated. Between the S nodes and the B nodes are the grouping nodes G. They have the same positions as the S-nodes and the same Delaunay connections, but to B-nodes that have the same image positions as the E-nodes. They will integrate information from distant B-nodes, applying an interiority cue that is influenced by the local strength of skeletal axes as computed by the S-nodes (Fig. 1). Although this is a multiply connected network, we have found that given reasonable parameters the model converges to intuitive posteriors for a variety of shapes (see below). Updating. Our goal is to compute the posterior p(Bi |I), where I is the whole image. Bi is a binary variable coding for the local direction of border ownership, that is, the side that owns the border. In order for border ownership estimates to be influenced by image structure elsewhere in the image, information has to propagate throughout the network. To achieve this propagation, we use standard equations for node updating [14, 12]. However while to all other connections being directed, connections at the B-node level are undirected, causing each node to be child and parent node at the same time. Considering only the B-node level, a node Bi is only separated from the rest of the network by its two neighbors. Hence the Markovian property applies, in that Bi only needs to get iterative information from its neighbors to eventually compute p(Bi |I). So considering the whole network, at each iteration t, Bi receives information from both its child, Ei and from its parents—that is neigbouring nodes (Bi+1 and Bi−1 )—as well as all grouping nodes connected to it (Gj , ..., Gm ). The latter encode for interiority versus exteriority, interiority meaning that the B-node’s estimated gural direction points towards the G-node in question, exteriority meaning that it points away. Integrating all this information creates a multidimensional likelihood function: p(Bi |Bi−1 , Bi+1 , Gj , ..., Gm ). Because of its complexity we choose to approximate it (assuming all nodes are marginally independent of each other when conditioned on Bi ) by 2 Figure 1: Basic network structure of the model. Both skeletal (S-nodes) and border-ownerhsip nodes (B-nodes) get evidence from E-nodes, though different types. S-nodes receive mere positional information, while B-nodes receive information about local curvature and the presence of T-junctions. Because of the structure of the Delaunay triangulation S-nodes and G-nodes (grouping nodes) always get input from exactly three nodes, respectively E and B-nodes. The gray color depicts the fact that this part of the network is computed before the model is initiated and does not thereafter interact with the dynamics of the model. m p(Bi |Pj , ..., Pm ) ∝ p(Bi |Pj ) (1) j where the Pj ’s are the parents of Bi . Given this, at each iteration, each node Bi performs the following computation: Bel(Bi ) ← cλ(Bi )π(Bi )α(Bi )β(Bi ) (2) where conceptually λ stands for bottom-up information, π for top down information and α and β for information received from within the same level. More formally, λ(Bi ) ← p(E|Bi ) (3) m π(Bi ) ← p(Bi |Gj )πGj (Bi ) j (4) Gj and analogously to equation 4 for α(Bi ) and β(Bi ), which compute information coming from Bi−1 and Bi+1 respectively. For these πBi−1 (Bi ), πBi+1 (Bi ), and πGj (Bi ): πGj (Bi ) ← c π(G) λBk (Gj ) (5) k=i πBi−1 (Bi ) ← c β(Bi−1 )λ(Bi−1 )π(Bi−1 ) 3 (6) and πBi+1 (Bi ) is analogous to πBi−1 (Bi ), with c and c being normalization constants. Finally for the G-nodes: Bel(Gi ) ← cλ(Gi )π(Gi ) λ(Gi ) ← (7) λBj (Gi ) (8) j m λBj (Gi ) ← λ(Bj )p(Bi |Gj )[α(Bj )β(Bj ) Bj p(Bi |Gk )πGk (Bi )] (9) k=i Gk The posteriors of the S-nodes are used to compute the π(Gi ). This posterior computes how well the S-node at each position explains the contour—that is, how well it accounts for the cues flowing from the E-nodes it is connected to. Each Delaunay connection between S- and E-nodes can be seen as a rib that sprouts from the skeleton. More specifically each rib sprouts in a direction that is normal (perpendicular) to the tangent of the contour at the E-node plus a random error φi chosen independently for each rib from a von Mises distribution centered on zero, i.e. φi ∼ V (0, κS ) with spread parameter κS [4]. The rib lengths are drawn from an exponential decreasing density function p(ρi ) ∝ e−λS ρi [4]. We can now express how well this node “explains” the three E-nodes it is connected to via the probability that this S-node deserves to be a skeletal node or not, p(S = true|E1 , E2 , E3 ) ∝ p(ρi )p(φi ) (10) i with S = true depicting that this S-node deserves to be a skeletal node. From this we then compute the prior π(Gi ) in such a way that good (high posterior) skeletal nodes induce a high interiority bias, hence a stronger tendency to induce figural status. Conversely, bad (low posterior) skeletal nodes create a prior close to indifferent (uniform) and thus have less (or no) influence on figural status. Likelihood functions Finally we need to express the likelihood function necessary for the updating rules described above. The first two likelihood functions are part of p(Ei |Bi ), one for each of the local cues. The first one, reflecting local curvature, gives the probability of the orientations of the two vectors inherent to Ei (α1 and α2 ) given both direction of figure (θ) encoded in Bi as a von Mises density centered on θ, i.e. αi ∼ V (θ, κEB ). The second likelihood function, reflecting the presence of a T-junction, simply assumes a fixed likelihood when a T-junction is present—that is p(T-junction = true|Bi ) = θT , where Bi places the direction of figure in the direction of the occluder. This likelihood function is only in effect when a T-junction is present, replacing the curvature cue at that node. The third likelihood function serves to keep consistency between nodes of the first level. This function p(Bi |Bi−1 ) or p(Bi |Bi+1 ) is used to compute α(B) and β(B) and is defined 2x2 conditional probability matrix with a single free parameter, θBB (the probability that figural direction at both B-nodes are the same). A fourth and final likelihood function p(Bi |Gj ) serves to propagate information between level one and two. This likelihood function is 2x2 conditional probability matrix matrix with one free parameter, θBG . In this case θBG encodes the probability that the figural direction of the B-node is in the direction of the exterior or interior preference of the G-node. In total this brings us to six free parameters in the model: κS , λS , κEB , θT , θBB , and θBG . 2 Basic Simulations To evaluate the performance of the model, we first tested it on several basic stimulus configurations in which the desired outcome is intuitively clear: a convex shape, a concave shape, a pair of overlapping shapes, and a pair of non-overlapping shapes (Fig. 2,3). The convex shape is the simplest in that curvature never changes sign. The concave shape includes a region with oppositely signed curvature. (The shape is naturally described as predominantly positively curved with a region of negative curvature, i.e. a concavity. But note that it can also be interpreted as predominantly negatively curved “window” with a region of positive curvature, although this is not the intuitive interpretation.) 4 The overlapping pair of shapes consists of two convex shapes with one partly occluding the other, creating a competition between the two shapes for the ownership of the common borderline. Finally the non-overlapping shapes comprise two simple convex shapes that do not touch—again setting up a competition for ownership of the two inner boundaries (i.e. between each shape and the ground space between them). Fig. 2 shows the network structures for each of these four cases. Figure 2: Network structure for the four shape categories (left to right: convex, concave, overlapping, non-overlapping shapes). Blue depict the locations of the B-nodes (and also the E-nodes), the red connections are the connections between B-nodes, the green connections are connections between B-nodes and G-nodes, and the G-nodes (and also the S-nodes) go from orange to dark red. This colour code depicts low (orange) to high (dark red) probability that this is a skeletal node, and hence the strength of the interiority cue. Running our model with hand-estimated parameter values yields highly intuitive posteriors (Fig. 3), an essential “sanity check” to ensure that the network approximates human judgments in simple cases. For the convex shape the model assigns figure to the interior just as one would expect even based solely on local curvature (Fig. 3A). In the concave figure (Fig. 3B), estimated border ownership begins to reverse inside the deep concavity. This may seem surprising, but actually closely matches empirical results obtained when local border ownership is probed psychophysically inside a similarly deep concavity, i.e. a “negative part” in which f/g seems to partly reverse [8]. For the overlapping shapes posteriors were also intuitive, with the occluding shape interpreted as in front and owning the common border (Fig. 3C). Finally, for the two non-overlapping shapes the model computed border-ownership just as one would expect if each shape were run separately, with each shape treated as figural along its entire boundary (Fig. 3D). That is, even though there is skeletal structure in the ground-region between the two shapes (see Fig. 2D), its posterior is weak compared to the skeletal structure inside the shapes, which thus loses the competition to own the boundary between them. For all these configurations, the model not only converged to intuitive estimates but did so rapidly (Fig. 4), always in fewer cycles than would be expected by pure lateral propagation, niterations < Nnodes [18] (with these parameters, typically about five times faster). Figure 3: Posteriors after convergence for the four shape categories (left to right: convex, concave, overlapping, non-overlapping). Arrows indicate estimated border ownership, with direction pointing to the perceived figural side, and length proportional to the magnitude of the posterior. All four simulations used the same parameters. 5 Figure 4: Convergence of the model for the basic shape categories. The vertical lines represent the point of convergence for each of the three shape categories. The posterior change is calculated as |p(Bi = 1|I)t − p(Bi = 1|I)t−1 | at each iteration. 3 Comparison to human data Beyond the simple cases reviewed above, we wished to submit our network to a more fine-grained comparison with human data. To this end we compared its performance to that of human subjects in an experiment we conducted (to be presented in more detail in a future paper). Briefly, our experiment involved finding evidence for propagation of f/g signals across the image. Subjects were first shown a stimulus in which the f/g configuration was globally and locally unambiguous and consistent: a smaller rectangle partly occluding a larger one (Fig. 5A), meaning that the smaller (front) one owns the common border. Then this configuration was perturbed by adding two bars, of which one induced a local f/g reversal—making it now appear locally that the larger rectangle owned the border (Fig. 5B). (The other bar in the display does not alter f/g interpretation, but was included to control for the attentional affects of introducing a bar in the image.) The inducing bar creates T-junctions that serve as strong local f/g cues, in this case tending to reverse the prior global interpretation of the figure. We then measured subjective border ownership along the central contour at various distances from the inducing bar, and at different times after the onset of the bar (25ms, 100ms and 250ms). We measured border ownership locally using a method introduced in [8] in which a local motion probe is introduced at a point on the boundary between two color regions of different colors, and the subject is asked which color appeared to move. Because the figural side “owns” the border, the response reflects perceived figural status. The goal of the experiment was to actually measure the progression of the influence of the inducing T-junction as it (hypothetically) propagated along the boundary. Briefly, we found no evidence of temporal differences, meaning that f/g judgments were essentially constant over time, suggesting rapid convergence of local f/g assignment. (This is consistent with the very rapid convergence of our network, which would suggest a lack of measurable temporal differences except at much shorter time scales than we measured.) But we did find a progressive reduction of f/g reversal with increasing distance from the inducer—that is, the influence of the T-junction decayed with distance. Mean responses aggregated over subjects (shortest delay only) are shown in Fig. 6. In order to run our model on this stimulus (which has a much more complex structure than the simple figures tested above) we had to make some adjustments. We removed the bars from the edge map, leaving only the T-junctions as underlying cues. This was a necessary first step because our model is not yet able to cope with skeletons that are split up by occluders. (The larger rectangle’s skeleton has been split up by the lower bar.) In this way all contours except those created by the bars were used to create the network (Fig. 7). Given this network we ran the model using hand-picked parameters that 6 Figure 5: Stimuli used in the experiment. A. Initial stimulus with locally and globally consistent and unambiguous f/g. B. Subsequently bars were added of which one (the top bar in this case) created a local reversal of f/g. C. Positions at which local f/g judgments of subjects were probed. Figure 6: Results from our experiment aggregated for all 7 subjects (shortest delay only) are shown in red. The x-axis shows distance from the inducing bar at which f/g judgment was probed. The y-axis shows the proportion of trials on which subjects judged the smaller rectangle to own the boundary. As can be seen, the further from the T-junction, the lower the f/g reversal. The fitted model (green curve) shows very similar pattern. Horizontal black line indicates chance performance (ambiguous f/g). gave us the best possible qualitative similarity to the human data. The parameters used never entailed total elimination of the influence of any likelihood function (κS = 16, λS = .025, κEB = .5, θT = .9, θBB = .9, and θBG = .6). As can be seen in Fig. 6 the border-ownership estimates at the locations where we had data show compelling similarities to human judgments. Furthermore along the entire contour the model converged to intuitive border-ownership estimates (Fig. 7) very rapidly (within 36 iterations). The fact that our model yielded intuitive estimates for the current network in which not all contours were completed shows another strength of our model. Because our model included grouping nodes, it did not require contours to be amodally completed [6] in order for information to propagate. 4 Conclusion In this paper we proposed a model rooted in Bayesian belief networks to compute figure/ground. The model uses both local and global cues, combined in a principled way, to achieve a stable and apparently psychologically reasonable estimate of border ownership. Local cues included local curvature and T-junctions, both well-established cues to f/g. Global cues included skeletal structure, 7 Figure 7: (left) Node structure for the experimental stimulus. (right) The model’s local borderownership estimates after convergence. a novel cue motivated by the idea that strongly axial shapes tend to be figural and thus own their boundaries. We successfully tested this model on both simple displays, in which it gave intuitive results, and on a more complex experimental stimulus, in which it gave a close match to the pattern of f/g propagation found in our subjects. Specifically, the model, like the human subjects rapidly converged to a stable local f/g interpretation. Our model’s structure shows several interesting parallels to properties of neural coding of border ownership in visual cortex. Some cortical cells (end-stopped cells) appear to code for local curvature [3] and T-junctions [5]. The B-nodes in our model could be seen as corresponding to cells that code for border ownership [20]. Furthermore, some authors [2] have suggested that recurrent feedback loops between border ownership cells in V2 and cells in V4 (corresponding to G-nodes in our model) play a role in the rapid computation of border ownership. The very rapid convergence we observed in our model likewise appears to be due to the connections between B-nodes and G-nodes. Finally scale-invariant shape representations (such as, speculatively, those based on skeletons) are thought to be present in higher cortical regions such as IT [17], which project down to earlier areas in ways that are not yet understood. A number of parallels to past models of f/g should be mentioned. Weiss [18] pioneered the application of belief networks to the f/g problem, though their network only considered a more restricted set of local cues and no global ones, such that information only propagated along the contour. Furthermore it has not been systematically compared to human judgments. Kogo et al. [10] proposed an exponential decay of f/g signals as they spread throughout the image. Our model has a similar decay for information going through the G-nodes, though it is also influenced by an angular factor defined by the position of the skeletal node. Like the model by Li Zhaoping [19], our model includes horizontal propagation between B-nodes, analogous to border-ownership cells in her model. A neurophysiological model by Craft et al. [2] defines grouping cells coding for an interiority preference that decays with the size of the receptive fields of these grouping cells. Our model takes this a step further by including shape (skeletal) structure as a factor in interiority estimates, rather than simply size of receptive fields (which is similar to the rib lengths in our model). Currently, our use of skeletons as shape representations is still limited to medial axis skeletons and surfaces that are not split up by occluders. Our future goals including integrating skeletons in a more robust way following the probabilistic account suggested by Feldman and Singh [4]. Eventually, we hope to fully integrate skeleton computation with f/g computation so that the more general problem of shape and surface estimation can be approached in a coherent and unified fashion. 8 References [1] P. Bahnsen. Eine untersuchung uber symmetrie und assymmetrie bei visuellen wahrnehmungen. Zeitschrift fur psychology, 108:129–154, 1928. [2] E. Craft, H. Sch¨ tze, E. Niebur, and R. von der Heydt. A neural model of figure-ground u organization. Journal of Neurophysiology, 97:4310–4326, 2007. [3] A. Dobbins, S. W. Zucker, and M. S. Cyander. Endstopping and curvature. Vision Research, 29:1371–1387, 1989. [4] J. Feldman and M. Singh. Bayesian estimation of the shape skeleton. Proceedings of the National Academy of Sciences, 103:18014–18019, 2006. [5] B. Heider, V. Meskenaite, and E. Peterhans. Anatomy and physiology of a neural mechanism defining depth order and contrast polarity at illusory contours. European Journal of Neuroscience, 12:4117–4130, 2000. [6] G. Kanizsa. Organization inVision. New York: Praeger, 1979. [7] G. Kanizsa and W. Gerbino. Vision and Artifact, chapter Convexity and symmetry in figureground organisation, pages 25–32. New York: Springer, 1976. [8] S. Kim and J. Feldman. Globally inconsistent figure/ground relations induced by a negative part. Journal of Vision, 9:1534–7362, 2009. [9] K. Koffka. Principles of Gestalt Psychology. Lund Humphries, London, 1935. [10] N. Kogo, C. Strecha, L. Van Gool, and J. Wagemans. Surface construction by a 2-d differentiation-integration process: a neurocomputational model for perceived border ownership, depth, and lightness in kanizsa figures. Psychological Review, 117:406–439, 2010. [11] B. Machielsen, M. Pauwels, and J. Wagemans. The role of vertical mirror-symmetry in visual shape detection. Journal of Vision, 9:1–11, 2009. [12] K. Murphy, Y. Weiss, and M.I. Jordan. Loopy belief propagation for approximate inference: an empirical study. Proceedings of Uncertainty in AI, pages 467–475, 1999. [13] R. L. Ogniewicz and O. K¨ bler. Hierarchic Voronoi skeletons. Pattern Recognition, 28:343– u 359, 1995. [14] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988. [15] M. A. Peterson and E. Skow. Inhibitory competition between shape properties in figureground perception. Journal of Experimental Psychology: Human Perception and Performance, 34:251–267, 2008. [16] K. A. Stevens and A. Brookes. The concave cusp as a determiner of figure-ground. Perception, 17:35–42, 1988. [17] K. Tanaka, H. Saito, Y. Fukada, and M. Moriya. Coding visual images of object in the inferotemporal cortex of the macaque monkey. Journal of Neurophysiology, 66:170–189, 1991. [18] Y. Weiss. Interpreting images by propagating Bayesian beliefs. Adv. in Neural Information Processing Systems, 9:908915, 1997. [19] L. Zhaoping. Border ownership from intracortical interactions in visual area V2. Neuron, 47(1):143–153, Jul 2005. [20] H. Zhou, H. S. Friedman, and R. von der Heydt. Coding of border ownerschip in monkey visual cortex. The Journal of Neuroscience, 20:6594–6611, 2000. 9

2 0.78225529 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts

Author: Shaul Druckmann, Dmitri B. Chklovskii

Abstract: A striking aspect of cortical neural networks is the divergence of a relatively small number of input channels from the peripheral sensory apparatus into a large number of cortical neurons, an over-complete representation strategy. Cortical neurons are then connected by a sparse network of lateral synapses. Here we propose that such architecture may increase the persistence of the representation of an incoming stimulus, or a percept. We demonstrate that for a family of networks in which the receptive field of each neuron is re-expressed by its outgoing connections, a represented percept can remain constant despite changing activity. We term this choice of connectivity REceptive FIeld REcombination (REFIRE) networks. The sparse REFIRE network may serve as a high-dimensional integrator and a biologically plausible model of the local cortical circuit. 1

3 0.69175512 12 nips-2010-A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups

Author: Sofia Mosci, Silvia Villa, Alessandro Verri, Lorenzo Rosasco

Abstract: We deal with the problem of variable selection when variables must be selected group-wise, with possibly overlapping groups defined a priori. In particular we propose a new optimization procedure for solving the regularized algorithm presented in [12], where the group lasso penalty is generalized to overlapping groups of variables. While in [12] the proposed implementation requires explicit replication of the variables belonging to more than one group, our iterative procedure is based on a combination of proximal methods in the primal space and projected Newton method in a reduced dual space, corresponding to the active groups. This procedure provides a scalable alternative with no need for data duplication, and allows to deal with high dimensional problems without pre-processing for dimensionality reduction. The computational advantages of our scheme with respect to state-of-the-art algorithms using data duplication are shown empirically with numerical simulations. 1

4 0.65697896 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models

Author: Ryan Kelly, Matthew Smith, Robert Kass, Tai S. Lee

Abstract: Activity of a neuron, even in the early sensory areas, is not simply a function of its local receptive field or tuning properties, but depends on global context of the stimulus, as well as the neural context. This suggests the activity of the surrounding neurons and global brain states can exert considerable influence on the activity of a neuron. In this paper we implemented an L1 regularized point process model to assess the contribution of multiple factors to the firing rate of many individual units recorded simultaneously from V1 with a 96-electrode “Utah” array. We found that the spikes of surrounding neurons indeed provide strong predictions of a neuron’s response, in addition to the neuron’s receptive field transfer function. We also found that the same spikes could be accounted for with the local field potentials, a surrogate measure of global network states. This work shows that accounting for network fluctuations can improve estimates of single trial firing rate and stimulus-response transfer functions. 1

5 0.65666449 98 nips-2010-Functional form of motion priors in human motion perception

Author: Hongjing Lu, Tungyou Lin, Alan Lee, Luminita Vese, Alan L. Yuille

Abstract: It has been speculated that the human motion system combines noisy measurements with prior expectations in an optimal, or rational, manner. The basic goal of our work is to discover experimentally which prior distribution is used. More specifically, we seek to infer the functional form of the motion prior from the performance of human subjects on motion estimation tasks. We restricted ourselves to priors which combine three terms for motion slowness, first-order smoothness, and second-order smoothness. We focused on two functional forms for prior distributions: L2-norm and L1-norm regularization corresponding to the Gaussian and Laplace distributions respectively. In our first experimental session we estimate the weights of the three terms for each functional form to maximize the fit to human performance. We then measured human performance for motion tasks and found that we obtained better fit for the L1-norm (Laplace) than for the L2-norm (Gaussian). We note that the L1-norm is also a better fit to the statistics of motion in natural environments. In addition, we found large weights for the second-order smoothness term, indicating the importance of high-order smoothness compared to slowness and lower-order smoothness. To validate our results further, we used the best fit models using the L1-norm to predict human performance in a second session with different experimental setups. Our results showed excellent agreement between human performance and model prediction – ranging from 3% to 8% for five human subjects over ten experimental conditions – and give further support that the human visual system uses an L1-norm (Laplace) prior.

6 0.65363741 17 nips-2010-A biologically plausible network for the computation of orientation dominance

7 0.65272522 268 nips-2010-The Neural Costs of Optimal Control

8 0.65120828 161 nips-2010-Linear readout from a neural population with partial correlation data

9 0.65063131 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior

10 0.64627832 6 nips-2010-A Discriminative Latent Model of Image Region and Object Tag Correspondence

11 0.64298981 194 nips-2010-Online Learning for Latent Dirichlet Allocation

12 0.64080298 81 nips-2010-Evaluating neuronal codes for inference using Fisher information

13 0.64078093 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing

14 0.64001215 20 nips-2010-A unified model of short-range and long-range motion perception

15 0.63678038 277 nips-2010-Two-Layer Generalization Analysis for Ranking Using Rademacher Average

16 0.6367349 55 nips-2010-Cross Species Expression Analysis using a Dirichlet Process Mixture Model with Latent Matchings

17 0.63656354 155 nips-2010-Learning the context of a category

18 0.63597637 150 nips-2010-Learning concept graphs from text with stick-breaking priors

19 0.63497168 123 nips-2010-Individualized ROI Optimization via Maximization of Group-wise Consistency of Structural and Functional Profiles

20 0.63465703 249 nips-2010-Spatial and anatomical regularization of SVM for brain image analysis