Author: Daniel M. Roy, Yee W. Teh
Abstract: We describe a novel class of distributions, called Mondrian processes, which can be interpreted as probability distributions over kd-tree data structures. Mondrian processes are multidimensional generalizations of Poisson processes and this connection allows us to construct multidimensional generalizations of the stick-breaking process described by Sethuraman (1994), recovering the Dirichlet process in one dimension. After introducing the Aldous-Hoover representation for jointly and separately exchangeable arrays, we show how the process can be used as a nonparametric prior distribution in Bayesian models of relational data.
1 Mondrian processes are multidimensional generalizations of Poisson processes and this connection allows us to construct multidimensional generalizations of the stick-breaking process described by Sethuraman (1994), recovering the Dirichlet process in one dimension. [sent-7, score-0.27]
2 After introducing the Aldous-Hoover representation for jointly and separately exchangeable arrays, we show how the process can be used as a nonparametric prior distribution in Bayesian models of relational data. [sent-8, score-0.379]
3 A common Bayesian approach in the one-dimensional setting is to assume there is cluster structure and use a mixture model with a prior distribution over partitions of the objects in X. [sent-15, score-0.151]
4 A similar approach for relational data would naïvely require a prior distribution on partitions of the product space X × Y = {(x, y) | x ∈ X, y ∈ Y}. [sent-16, score-0.29]
5 , by placing a Chinese restaurant process (CRP) prior on partitions of X × Y . [sent-19, score-0.177]
6 An unsatisfactory implication of this choice is that the distribution on partitions of (Ri,j ) is exchangeable, i. [sent-20, score-0.121]
7 Stochastic block models2 place prior distributions on partitions of X and Y separately, which can be interpreted as inducing a distribution on partitions of the product space by considering the product of the partitions. [sent-23, score-0.364]
8 By arranging the rows and columns of (Ri,j ) so that clustered objects have adjacent indices, such partitions look like regular grids (Figure 1. [sent-24, score-0.19]
9 (2007) generate random partitions which are not constrained to be regular grids (Figure 1. [sent-28, score-0.139]
10 Motivated by the need for a consistent distribution on partitions of product spaces with more structure than classic block models, we define a class of nonparametric distributions we have named Mondrian processes after Piet Mondrian and his abstract grid-based paintings. [sent-30, score-0.26]
11 Mondrian processes are random partitions on product spaces not constrained to be regular grids. [sent-31, score-0.2]
12 We begin by introducing the notion of partially exchangeable arrays by Aldous (1981) and Hoover (1979), a generalization of exchangeability on sequences appropriate for modeling relational data. [sent-34, score-0.391]
13 We then define the Mondrian process, highlight a few of its elegant properties, and describe two nonparametric models for relational data that use the Mondrian process as a prior on partitions. [sent-42, score-0.215]
14 is an exchangeable sequence, then there exists a random parameter θ such that the sequence is conditionally iid given θ: $p(x_1, \ldots, x_n) = \int p_\theta(\theta) \prod_{i=1}^{n} p_x(x_i \mid \theta)\, d\theta$ (1) [sent-47, score-0.207]
15 That is, exchangeable sequences arise as a mixture of iid sequences, where the mixing distribution is $p_\theta(\theta)$. [sent-50, score-0.187]
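The mixture-of-iid form in (1) can be illustrated with a small simulation. This is a minimal sketch, assuming a Beta prior on θ and Bernoulli observations; both distributional choices and the function name `sample_exchangeable_sequence` are illustrative assumptions, not taken from the paper.

```python
import random


def sample_exchangeable_sequence(n, a=1.0, b=1.0, rng=None):
    """Draw theta ~ Beta(a, b), then x_1..x_n iid Bernoulli(theta).

    The Beta/Bernoulli pair is an assumption chosen for illustration.
    """
    if rng is None:
        rng = random.Random()
    theta = rng.betavariate(a, b)
    return [1 if rng.random() < theta else 0 for _ in range(n)]


# Exchangeability implies P(x1=1, x2=0) equals P(x1=0, x2=1) marginally,
# even though x1 and x2 are not independent once theta is integrated out.
rng = random.Random(0)
draws = [tuple(sample_exchangeable_sequence(2, rng=rng)) for _ in range(20000)]
freq_10 = draws.count((1, 0)) / len(draws)
freq_01 = draws.count((0, 1)) / len(draws)
print(freq_10, freq_01)
```

The two printed frequencies should be close to each other, which is the permutation invariance that the representation guarantees.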
16 In this section we describe notions of exchangeability for relational data originally proposed by Aldous (1981) and Hoover (1979) in the context of exchangeable arrays. [sent-52, score-0.341]
17 Kallenberg (2005) significantly expanded on the concept, and Diaconis and Janson (2007) showed a strong correspondence between such exchangeable relations and a notion of limits on graph structures (Lovász and Szegedy, 2006). [sent-53, score-0.229]
18 We say that R is separately exchangeable if its distribution is invariant to separate permutations on its rows and columns. [sent-60, score-0.164]
19 Aldous (1981) and Hoover (1979) showed that separately exchangeable relations can always be represented in the following way: each object i (and j) has a latent representation ξi (ηj ) drawn iid from some distribution pξ (pη ); independently let θ be an additional random parameter. [sent-62, score-0.252]
20 We say R is jointly exchangeable if it is invariant to jointly permuting rows and columns; that is, for each n ≥ 1 and each permutation $\pi \in S_n$ we have $p(R_{1:n,1:n}) = p(R_{\pi(1:n),\pi(1:n)})$ (4). Such jointly exchangeable relations also have a form similar to (3). [sent-67, score-0.393]
21 The first impression from (5) is that joint exchangeability implies a more restricted functional form than separately exchangeable (3). [sent-69, score-0.206]
22 In fact, the reverse holds—(5) means that the latent representations of row i and column i need not be independent, and that Ri,j and Rj,i need not be conditionally independent given the row and column representations, while (3) assumes independence of both. [sent-70, score-0.122]
23 The above Aldous-Hoover representation serves as the theoretical foundation for hierarchical Bayesian modeling of exchangeable relational data, just as de Finetti’s representation serves as a foundation for the modeling of exchangeable sequences. [sent-74, score-0.463]
24 , 2006) induce regular partitions on the product space, introducing structure where the data do not support it. [sent-81, score-0.173]
25 (2) Axis-aligned partitions, like those produced by annotated hierarchies and the Mondrian process, provide (a posteriori) resolution only where it is needed. [sent-82, score-0.113]
26 (4) We can visualize the sequential hierarchical process by spreading the cuts out over time. [sent-84, score-0.155]
27 (5) Mondrian process with beta Lévy measure, $\mu(dx) = x^{-1}\,dx$ on $[0, 1]^2$. [sent-86, score-0.115]
28 3 The Mondrian Process The Mondrian process can be expressed as a recursive generative process that randomly makes axis-aligned cuts, partitioning the underlying product space in a hierarchical fashion akin to decision trees or kd-trees. [sent-89, score-0.176]
29 The implication of consistency is that we can extend the Mondrian process to infinite spaces and use it as a nonparametric prior for modeling exchangeable relational data. [sent-91, score-0.399]
30 1 The one dimensional case The simplest space to introduce the Mondrian process is the unit interval [0, 1]. [sent-93, score-0.11]
31 Each cut costs a random amount, eventually exhausting the budget and resulting in a finite partition m of the unit interval. [sent-95, score-0.211]
32 The cost, EI , to cut an interval I is exponentially distributed with inverse mean given by the length of the interval. [sent-96, score-0.139]
33 If the remaining budget λ′ = λ − E_I is negative, we make no cuts and the process returns the trivial partition m = {[0, 1]}. [sent-99, score-0.225]
34 Otherwise, we make a cut uniformly at random, splitting the unit interval into two subintervals A and B. [sent-100, score-0.167]
35 The process recurses independently on A and B, with independent budgets λ′ (the budget remaining after the cut), producing partitions m_A and m_B, which are then combined into a partition m = m_A ∪ m_B of [0, 1]. [sent-101, score-0.247]
36 The resulting cuts can be shown to be a Poisson (point) process. [sent-102, score-0.099]
37 Unlike the standard description of the Poisson process, the cuts in this “break and branch” process are organized in a hierarchy. [sent-103, score-0.155]
38 As the Poisson process is a fundamental building block for random measures such as the Dirichlet process (DP), we will later exploit this relationship to build various multidimensional generalizations. [sent-104, score-0.193]
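The one-dimensional "break and branch" recursion described above can be sketched as follows. This is a minimal sketch, not the authors' implementation; the function name `mondrian_1d` and its arguments are hypothetical, and the zero-length guard is an added safety check.

```python
import random


def mondrian_1d(budget, lo=0.0, hi=1.0, rng=None):
    """Return the sorted cut points of a 1D Mondrian sample on (lo, hi)."""
    if rng is None:
        rng = random.Random()
    length = hi - lo
    if length <= 0.0:  # degenerate interval: nothing left to cut
        return []
    # Cost of a cut is exponential with rate equal to the interval length.
    remaining = budget - rng.expovariate(length)
    if remaining < 0.0:  # budget exhausted: trivial partition
        return []
    x = rng.uniform(lo, hi)  # cut location, uniform on the interval
    # Recurse independently on both sides with the remaining budget.
    return (mondrian_1d(remaining, lo, x, rng)
            + [x]
            + mondrian_1d(remaining, x, hi, rng))


cuts = mondrian_1d(3.0, rng=random.Random(1))
print(cuts)
```

Because the cuts form a Poisson process, the number of cuts on the unit interval should average roughly the budget λ over many draws, which gives a quick sanity check on the recursion.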
39 2 Generalizations to higher dimensions and trees We begin in two dimensions by describing the generative process for a Mondrian process m ∼ MP(λ, (a, A), (b, B)) on the rectangle (a, A)×(b, B). [sent-106, score-0.142]
40 If the budget λ′ remaining after paying the cost of the cut is negative, the process halts and returns the trivial partition {(a, A) × (b, B)}. [sent-108, score-0.126]
41 Otherwise, an axis-aligned cut is made uniformly at random along the combined lengths of (a, A) and (b, B); that is, the cut lies along a particular dimension with probability proportional to its length, and is drawn uniformly within that interval. [sent-109, score-0.226]
42 , a cut x ∈ (a, A) splits the interval into (a, x) and (x, A). [sent-114, score-0.139]
43 Like the one-dimensional special case, the λ parameter controls the number of cuts, with the process more likely to cut rectangles with large perimeters. [sent-117, score-0.169]
44 In higher dimensions, the cost E to make an additional cut is exponentially distributed with rate given by the sum over all dimensions of the interval lengths. [sent-119, score-0.139]
45 Similarly, the cut point is chosen uniformly at random from all intervals, splitting only that interval in the recursion. [sent-120, score-0.139]
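The higher-dimensional generative process can be sketched in the same style. This is a minimal sketch under stated assumptions: cuts are uniform within each interval (the Lebesgue case described above), the tree is stored as nested tuples, and the names `sample_mondrian` and `count_leaves` are hypothetical.

```python
import random


def sample_mondrian(budget, box, rng):
    """Sample a Mondrian on box = ((a, A), (b, B), ...).

    Returns ("leaf", box) or ("cut", d, x, remaining, below, above).
    """
    lengths = [hi - lo for lo, hi in box]
    total = sum(lengths)
    if total <= 0.0:
        return ("leaf", box)
    # Cost is exponential with rate equal to the sum of interval lengths.
    remaining = budget - rng.expovariate(total)
    if remaining < 0.0:
        return ("leaf", box)
    # Pick a dimension with probability proportional to its length,
    # then a cut point uniformly within that interval.
    d = rng.choices(range(len(box)), weights=lengths)[0]
    lo, hi = box[d]
    x = rng.uniform(lo, hi)
    below = box[:d] + ((lo, x),) + box[d + 1:]
    above = box[:d] + ((x, hi),) + box[d + 1:]
    return ("cut", d, x, remaining,
            sample_mondrian(remaining, below, rng),
            sample_mondrian(remaining, above, rng))


def count_leaves(tree):
    """Number of blocks in the partition represented by the tree."""
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[4]) + count_leaves(tree[5])


unit_square = ((0.0, 1.0), (0.0, 1.0))
tree = sample_mondrian(1.0, unit_square, random.Random(0))
print(count_leaves(tree))
```

Only the cut dimension is split in the recursion; every other interval of the box is passed down unchanged to both children.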
46 Like non-homogeneous Poisson processes, the cut point need not [...] (Footnote 3: In this paper we shall always mean infinite exchangeability when we state exchangeability.) [sent-121, score-0.155]
47 The key property of intervals that the Mondrian process relies upon is that any point cuts the space into one-dimensional, simply connected pieces. [sent-125, score-0.175]
48 Trees also have this property: a cut along an edge splits a tree into two trees. [sent-126, score-0.145]
49 We denote a Mondrian process m with rate λ on a product of one-dimensional, simply-connected domains Θ1 ×···×ΘD by m ∼ MP(λ, Θ1 , . [sent-127, score-0.112]
50 Thus a draw from the Mondrian process is either a trivial partition or a tuple m = ⟨d, x, λ′, m<, m>⟩, representing a cut at x along the d'th dimension Θd, with nested Mondrians m< and m> on either side of the cut. [sent-138, score-0.261]
51 Therefore, m is itself a tree of axis-aligned cuts (a kd-tree data structure), with the leaves of the tree forming the partition of the original product space. [sent-139, score-0.267]
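Viewing a draw as a kd-tree suggests a simple point-location routine: descend the tree, comparing one coordinate per cut, until a leaf block is reached. The sketch below is illustrative only; the `Node` class, its field names, and the hand-built example tree are assumptions, not the paper's notation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[Tuple[float, float], ...]


@dataclass
class Node:
    """A Mondrian node: a leaf if dim is None, otherwise a cut."""
    box: Box
    dim: Optional[int] = None       # dimension d of the cut
    cut: Optional[float] = None     # cut location x
    below: Optional["Node"] = None  # nested Mondrian on the side below x
    above: Optional["Node"] = None  # nested Mondrian on the side above x


def find_block(node, point):
    """Return the leaf box containing point, as in a kd-tree lookup."""
    while node.dim is not None:
        node = node.below if point[node.dim] < node.cut else node.above
    return node.box


# Hand-built example: cut the unit square at x = 0.4, then cut the
# right-hand piece at y = 0.5.
right = Node(
    box=((0.4, 1.0), (0.0, 1.0)), dim=1, cut=0.5,
    below=Node(box=((0.4, 1.0), (0.0, 0.5))),
    above=Node(box=((0.4, 1.0), (0.5, 1.0))),
)
root = Node(
    box=((0.0, 1.0), (0.0, 1.0)), dim=0, cut=0.4,
    below=Node(box=((0.0, 0.4), (0.0, 1.0))),
    above=right,
)
print(find_block(root, (0.7, 0.2)))
```

The lookup cost is proportional to the depth of the tree rather than the number of blocks, which is the usual reason for keeping the hierarchy instead of a flat list of rectangles.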
52 Conditional Independencies: The generative process for the Mondrian produces a tree of cuts, where each subtree is itself a draw from a Mondrian. [sent-140, score-0.11]
53 Consistency: The Mondrian process satisfies an important self-consistency property: given a draw from a Mondrian on some domain, the partition on any subdomain has the same distribution as if we sampled a Mondrian process directly on that subdomain. [sent-144, score-0.225]
54 , ΦD ) of m to Φ1 × · · · × ΦD is the subtree of cuts within Φ1 × · · · × ΦD . [sent-152, score-0.099]
55 We define restrictions inductively: If there are no cuts in m, i. [sent-153, score-0.099]
56 A consequence of this consistency property is that we can now use the Daniell-Kolmogorov extension theorem to extend the Mondrian process to σ-finite domains (those that can be written as a countable union of finite domains). [sent-191, score-0.098]
57 For example, from a Mondrian process on products of intervals, we can construct a Mondrian process on all of RD . [sent-192, score-0.112]
58 Note that if the domains have infinite measure, the tree of cuts will be infinitely deep with no root and infinitely many leaves (being the infinite partition of the product space). [sent-193, score-0.257]
59 However, the restriction of the tree to any given finite subdomain will be finite with a root (with probability one). [sent-194, score-0.1]
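Restricting a tree of cuts to a subdomain can be sketched recursively. The full inductive definition is elided in the text above, so the three cases below (no cuts, cut inside the subdomain, cut outside it) are one reading of "the subtree of cuts within" the subdomain, stated as an assumption. The tuple layout and the name `restrict` are hypothetical, and budgets are omitted for brevity.

```python
def restrict(tree, sub):
    """Restrict a Mondrian tree to the sub-box sub (a tuple of intervals).

    tree is ("leaf", box) or ("cut", d, x, below, above).
    """
    if tree[0] == "leaf":
        # No cuts: the restriction is the trivial partition of sub.
        return ("leaf", sub)
    _, d, x, below, above = tree
    lo, hi = sub[d]
    if x <= lo:
        # Cut lies below the subdomain: only the upper side overlaps it.
        return restrict(above, sub)
    if x >= hi:
        # Cut lies above the subdomain: only the lower side overlaps it.
        return restrict(below, sub)
    # Cut falls inside the subdomain: keep it and restrict both sides.
    sub_below = sub[:d] + ((lo, x),) + sub[d + 1:]
    sub_above = sub[:d] + ((x, hi),) + sub[d + 1:]
    return ("cut", d, x, restrict(below, sub_below), restrict(above, sub_above))


# Example: unit square cut at x = 0.4, right-hand piece cut at y = 0.5.
example = (
    "cut", 0, 0.4,
    ("leaf", ((0.0, 0.4), (0.0, 1.0))),
    ("cut", 1, 0.5,
     ("leaf", ((0.4, 1.0), (0.0, 0.5))),
     ("leaf", ((0.4, 1.0), (0.5, 1.0)))),
)
restricted = restrict(example, ((0.5, 1.0), (0.0, 1.0)))
print(restricted)
```

In the example the root cut at 0.4 falls outside the subdomain and is dropped, so the cut at y = 0.5 becomes the root of the restricted tree.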
60 But since µ1 is non-atomic, µ1 ({y}) = 0 thus ρ will not have any cuts in the first domain (with probability 1). [sent-202, score-0.099]
61 Figure 2: Modeling a Mondrian with a Mondrian: A posterior sample given relational data created from an actual Mondrian painting. [sent-211, score-0.135]
62 These synthetic data were generated by fitting a regular 6 × 7 point array over the painting (6 row objects, 7 column objects), and using the blocks in the painting to determine the block structure of these 42 relations. [sent-214, score-0.221]
63 We then sampled 18 relational arrays with this block structure. [sent-215, score-0.239]
64 The colors are for visual effect only as the partitions are contiguous rectangles. [sent-217, score-0.143]
65 Each point represents a relation Ri,j ; each row of points are the relations (Ri,· ) for an object ξi , and similarly for columns. [sent-219, score-0.117]
66 (4) Induced partition on the (discrete) relational array, matching the painting. [sent-221, score-0.205]
67 (5) Partitioned and permuted relational data showing block structure. [sent-222, score-0.189]
68 To do so, we have to consider three possibilities: when m contains no cuts, when the first cut of m is in ρ, and when the first cut of m is above ρ. [sent-228, score-0.226]
69 Fortunately, the probabilities of each of these events can be computed easily, and doing so amounts to drawing an exponential sample $E \sim \mathrm{Exp}\big(\sum_d \mu_d(\Theta_d \setminus \Phi_d)\big)$ and comparing it against the diminished rate after the first cut in ρ. [sent-229, score-0.113]
70 if ρ has no cuts then λ′ ← 0 else ⟨d′, x′, λ′, ρ′<, ρ′>⟩ ← ρ. [sent-240, score-0.099]
71 As an example, the expected number of slices along each dimension of (0, A) × (0, B) is λA and λB, while the expected total number of partitions is (1 + λA)(1 + λB). [sent-266, score-0.121]
72 Interestingly, this is also the expected number of partitions in a biclustering model where we first have two independent Poisson processes with rate λ partition (0, A) and (0, B), and then form the product partition of (0, A) × (0, B). [sent-267, score-0.322]
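The biclustering count quoted above follows from a short calculation. The notation $N_A$, $N_B$ for the numbers of cut points on $(0, A)$ and $(0, B)$ is introduced here for illustration and is not from the paper.

```latex
\begin{align*}
N_A &\sim \mathrm{Poisson}(\lambda A), \qquad
N_B \sim \mathrm{Poisson}(\lambda B), \qquad
N_A \perp N_B, \\
\mathbb{E}\big[(1 + N_A)(1 + N_B)\big]
  &= \big(1 + \mathbb{E}[N_A]\big)\big(1 + \mathbb{E}[N_B]\big)
   = (1 + \lambda A)(1 + \lambda B).
\end{align*}
```

Here $N$ cut points split an interval into $1 + N$ pieces, and the product partition has one block per pair of pieces; independence lets the expectation factor.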
73 The colors are for visual effect only as the partitions are contiguous rectangles. [sent-270, score-0.143]
74 Note that partitions are not necessarily contiguous; we use color to identify partitions. [sent-317, score-0.121]
75 The partition structure is related to the annotated hierarchies model (Roy et al. [sent-318, score-0.127]
76 (3) A sequence of cuts; each cut separates a subtree. [sent-321, score-0.113]
77 5 Relational Modeling To illustrate how the Mondrian process can be used to model relational data, we describe two nonparametric block models for exchangeable relations. [sent-323, score-0.433]
78 Recall the Aldous-Hoover representation (θ, ξi, ηj, pR) for exchangeable arrays. [sent-325, score-0.164]
79 Using a Mondrian process with beta Lévy measure $\mu(dx) = \alpha x^{-1}\,dx$, we first sample a random partition of the unit square into blocks and assign each block a probability: $M \sim \mathrm{MP}(\lambda, [0, 1], [0, 1])$ (6), $\phi_S \mid M \sim \mathrm{Beta}(a_0, a_1)\ \forall S \in M$ (7). [sent-326, score-0.287]
80 Here (6) slices up the unit square into blocks and (7) gives each block S a probability $\phi_S$. The pair (M, φ) plays the role of θ in the Aldous-Hoover representation. [sent-327, score-0.102]
81 [Equations (8) and (9): a shared x coordinate for each row and a shared y coordinate for each column.] Let $S_{ij}$ be the block S ∈ M such that $(\xi_i, \eta_j) \in S$. [sent-335, score-0.105]
82 $\phi_{S_{ij}}$ (10). This model clusters together relations whose $(\xi_i, \eta_j)$ pairs fall in the same block of the Mondrian partition and models each cluster with a beta-binomial likelihood model. [sent-342, score-0.135]
83 By mirroring the Aldous-Hoover representation, we guarantee that R is exchangeable and that there is no order dependence. [sent-343, score-0.164]
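The generative model on the unit square can be sketched end to end. This is a simplified sketch, not the model as specified: cuts are drawn uniformly rather than from the beta Lévy measure, row and column coordinates are assumed uniform on [0, 1], relations are assumed Bernoulli given the block probability, and all names (`mondrian_blocks`, `sample_relation`, `budget`) are hypothetical.

```python
import random


def mondrian_blocks(budget, box, rng):
    """Leaf rectangles of a Mondrian sample on box (uniform cuts assumed)."""
    lengths = [hi - lo for lo, hi in box]
    total = sum(lengths)
    if total <= 0.0:
        return [box]
    remaining = budget - rng.expovariate(total)
    if remaining < 0.0:
        return [box]
    d = rng.choices(range(len(box)), weights=lengths)[0]
    lo, hi = box[d]
    x = rng.uniform(lo, hi)
    below = box[:d] + ((lo, x),) + box[d + 1:]
    above = box[:d] + ((x, hi),) + box[d + 1:]
    return (mondrian_blocks(remaining, below, rng)
            + mondrian_blocks(remaining, above, rng))


def sample_relation(n_rows, n_cols, budget=2.0, a0=1.0, a1=1.0, seed=0):
    """Sample (blocks, phi, R) from the block model sketched above."""
    rng = random.Random(seed)
    blocks = mondrian_blocks(budget, ((0.0, 1.0), (0.0, 1.0)), rng)
    phi = [rng.betavariate(a0, a1) for _ in blocks]  # one probability per block
    xi = [rng.random() for _ in range(n_rows)]       # row coordinates
    eta = [rng.random() for _ in range(n_cols)]      # column coordinates

    def block_index(x, y):
        # Linear scan for the block containing (x, y).
        for k, ((x0, x1), (y0, y1)) in enumerate(blocks):
            if x0 <= x <= x1 and y0 <= y <= y1:
                return k
        raise ValueError("point outside the unit square")

    R = [[1 if rng.random() < phi[block_index(x, y)] else 0 for y in eta]
         for x in xi]
    return blocks, phi, R


blocks, phi, R = sample_relation(6, 7, seed=1)
for row in R:
    print(row)
```

Because rows and columns enter only through iid coordinates, permuting the rows or the columns of R leaves its distribution unchanged, which is the exchangeability property the construction is meant to guarantee.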
84 , 2006), where rows and columns are first clustered using a CRP prior, then each relation Rij is conditionally independent from others given the clusters that row i and column j belong to. [sent-346, score-0.116]
85 (6) with $M \sim \mathrm{MP}(\lambda, [0, 1]) \times \mathrm{MP}(\lambda, [0, 1])$ (11), a product of partitions of unit intervals, then we recover the same marginal distribution over relations as the IRM/IHRM. [sent-348, score-0.268]
86 To see this, recall that a Mondrian process in one-dimension produces a partition whose cut points follow a Poisson point process. [sent-349, score-0.239]
87 , partitions) induced by a Poisson point process on [0, 1] with the beta Lévy measure have the same distribution as those in the stick-breaking construction of the DP. [sent-353, score-0.115]
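For reference, the stick-breaking construction of the DP weights can be sketched as below. This is the standard construction with Beta(1, α) break fractions; truncating to a finite number of sticks and the name `stick_breaking` are assumptions made for illustration.

```python
import random


def stick_breaking(alpha, n_sticks, rng=None):
    """Truncated stick-breaking weights for a DP with concentration alpha.

    Returns (weights, leftover), where leftover is the unbroken remainder.
    """
    if rng is None:
        rng = random.Random()
    weights = []
    remaining = 1.0
    for _ in range(n_sticks):
        frac = rng.betavariate(1.0, alpha)  # fraction broken off the stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    return weights, remaining


weights, leftover = stick_breaking(2.0, 10, rng=random.Random(0))
print(weights, leftover)
```

The weights and the leftover piece always sum to one, so the weights can be read as the lengths of the blocks of a partition of the unit interval.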
88 We can also construct an exchangeable variant of the Annotated Hierarchies model (a hierarchical block model) by moving from the unit square to a product of random trees drawn from Kingman’s coalescent prior (Kingman, 1982a). [sent-358, score-0.33]
89 [Equations (12)-(14): for each dimension, sample a tree; partition the cross product of trees; each block S gets a probability $\phi_S$.] Let $S_{ij}$ be the subset S ∈ M where leaves (i, j) fall in S. [sent-367, score-0.22]
90 Kingman shows that the partition on the leaves of a coalescent tree when its edges are cut by a Poisson point process is the same as that of a DP (Figure 4). [sent-376, score-0.271]
91 Therefore, the partition structure along every row and column is marginally the same as a DP. [sent-377, score-0.121]
92 Both the unit square and product of random trees models give DP distributed partitions on each row and column, but they have different inductive biases. [sent-378, score-0.261]
93 Figure 2 shows a sample after 1500 iterations (starting from a random initialization) where the partition on the array is exactly recovered. [sent-383, score-0.104]
94 We next analyzed the classic Countries data set from the network analysis literature (Wasserman and Faust, 1994), which reports trade in 1984 between 24 countries in food and live animals; crude materials; minerals and fuels; basic manufactured goods; and exchange of diplomats. [sent-387, score-0.136]
95 Given three relations (friends, works-with, and gives-orders-to), the maximum a posteriori Mondrian process partitions the relations into homogeneous blocks. [sent-395, score-0.307]
96 7 Discussion While the Mondrian process has many elegant properties, much more work is required to determine its usefulness for relational modeling. [sent-398, score-0.191]
97 We are currently investigating improved MCMC sampling schemes for the Mondrian process, as well as working to develop a combinatorial representation of the distribution on partitions induced by the Mondrian process. [sent-400, score-0.121]
98 The axis-aligned partitions of $[0, 1]^n$ produced by the Mondrian process have been studied extensively in combinatorics and computational geometry, where they are known as guillotine partitions. [sent-402, score-0.209]
99 Guillotine partitions have wide ranging applications including circuit design, approximation algorithms and computer graphics. [sent-403, score-0.121]
100 Learning systems of concepts with an infinite relational model. [sent-456, score-0.135]
[(6, 0.037), (7, 0.044), (12, 0.027), (28, 0.118), (57, 0.539), (59, 0.011), (63, 0.01), (77, 0.037), (83, 0.03)]
simIndex simValue paperId paperTitle
1 0.94139302 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing
Author: Leo Zhu, Yuanhao Chen, Yuan Lin, Chenxi Lin, Alan L. Yuille
Abstract: Language and image understanding are two major goals of artificial intelligence which can both be conceptually formulated in terms of parsing the input signal into a hierarchical representation. Natural language researchers have made great progress by exploiting the 1D structure of language to design efficient polynomial-time parsing algorithms. By contrast, the two-dimensional nature of images makes it much harder to design efficient image parsers and the form of the hierarchical representations is also unclear. Attempts to adapt representations and algorithms from natural language have only been partially successful. In this paper, we propose a Hierarchical Image Model (HIM) for 2D image parsing which outputs image segmentation and object recognition. This HIM is represented by recursive segmentation and recognition templates in multiple layers and has advantages for representation, inference, and learning. Firstly, the HIM has a coarse-to-fine representation which is capable of capturing long-range dependency and exploiting different levels of contextual information. Secondly, the structure of the HIM allows us to design a rapid inference algorithm, based on dynamic programming, which enables us to parse the image rapidly in polynomial time. Thirdly, we can learn the HIM efficiently in a discriminative manner from a labeled dataset. We demonstrate that HIM outperforms other state-of-the-art methods by evaluation on the challenging public MSRC image dataset. Finally, we sketch how the HIM architecture can be extended to model more complex image phenomena.
2 0.93274349 233 nips-2008-The Gaussian Process Density Sampler
Author: Iain Murray, David MacKay, Ryan P. Adams
Abstract: We present the Gaussian Process Density Sampler (GPDS), an exchangeable generative model for use in nonparametric Bayesian density estimation. Samples drawn from the GPDS are consistent with exact, independent samples from a fixed density function that is a transformation of a function drawn from a Gaussian process prior. Our formulation allows us to infer an unknown density from data using Markov chain Monte Carlo, which gives samples from the posterior distribution over density functions and from the predictive distribution on data space. We can also infer the hyperparameters of the Gaussian process. We compare this density modeling technique to several existing techniques on a toy problem and a skull-reconstruction task.
3 0.92303562 148 nips-2008-Natural Image Denoising with Convolutional Networks
Author: Viren Jain, Sebastian Seung
Abstract: We present an approach to low-level vision that combines two main ideas: the use of convolutional networks as an image processing architecture and an unsupervised learning procedure that synthesizes training samples from specific noise models. We demonstrate this approach on the challenging problem of natural image denoising. Using a test set with a hundred natural images, we find that convolutional networks provide comparable and in some cases superior performance to state of the art wavelet and Markov random field (MRF) methods. Moreover, we find that a convolutional network offers similar performance in the blind denoising setting as compared to other techniques in the non-blind setting. We also show how convolutional networks are mathematically related to MRF approaches by presenting a mean field theory for an MRF specially designed for image denoising. Although these approaches are related, convolutional networks avoid computational difficulties in MRF approaches that arise from probabilistic learning and inference. This makes it possible to learn image processing architectures that have a high degree of representational power (we train models with over 15,000 parameters), but whose computational expense is significantly less than that associated with inference in MRF approaches with even hundreds of parameters. 1 Background Low-level image processing tasks include edge detection, interpolation, and deconvolution. These tasks are useful both in themselves, and as a front-end for high-level visual tasks like object recognition. This paper focuses on the task of denoising, defined as the recovery of an underlying image from an observation that has been subjected to Gaussian noise. One approach to image denoising is to transform an image from pixel intensities into another representation where statistical regularities are more easily captured. 
For example, the Gaussian scale mixture (GSM) model introduced by Portilla and colleagues is based on a multiscale wavelet decomposition that provides an effective description of local image statistics [1, 2]. Another approach is to try and capture statistical regularities of pixel intensities directly using Markov random fields (MRFs) to define a prior over the image space. Initial work used hand-designed settings of the parameters, but recently there has been increasing success in learning the parameters of such models from databases of natural images [3, 4, 5, 6, 7, 8]. Prior models can be used for tasks such as image denoising by augmenting the prior with a noise model. Alternatively, an MRF can be used to model the probability distribution of the clean image conditioned on the noisy image. This conditional random field (CRF) approach is said to be discriminative, in contrast to the generative MRF approach. Several researchers have shown that the CRF approach can outperform generative learning on various image restoration and labeling tasks [9, 10]. CRFs have recently been applied to the problem of image denoising as well [5]. The present work is most closely related to the CRF approach. Indeed, certain special cases of convolutional networks can be seen as performing maximum likelihood inference on a CRF [11]. The advantage of the convolutional network approach is that it avoids a general difficulty with applying MRF-based methods to image analysis: the computational expense associated with both parameter estimation and inference in probabilistic models. For example, naive methods of learning MRF-based models involve calculation of the partition function, a normalization factor that is generally intractable for realistic models and image dimensions.
As a result, a great deal of research has been devoted to approximate MRF learning and inference techniques that meliorate computational difficulties, generally at the cost of either representational power or theoretical guarantees [12, 13]. Convolutional networks largely avoid these difficulties by posing the computational task within the statistical framework of regression rather than density estimation. Regression is a more tractable computation and therefore permits models with greater representational power than methods based on density estimation. This claim will be argued for with empirical results on the denoising problem, as well as mathematical connections between MRF and convolutional network approaches. 2 Convolutional Networks Convolutional networks have been extensively applied to visual object recognition using architectures that accept an image as input and, through alternating layers of convolution and subsampling, produce one or more output values that are thresholded to yield binary predictions regarding object identity [14, 15]. In contrast, we study networks that accept an image as input and produce an entire image as output. Previous work has used such architectures to produce images with binary targets in image restoration problems for specialized microscopy data [11, 16]. Here we show that similar architectures can also be used to produce images with the analog fluctuations found in the intensity distributions of natural images. Network Dynamics and Architecture A convolutional network is an alternating sequence of linear filtering and nonlinear transformation operations. The input and output layers include one or more images, while intermediate layers contain “hidden
same-paper 4 0.91316605 236 nips-2008-The Mondrian Process
Author: Daniel M. Roy, Yee W. Teh
Abstract: We describe a novel class of distributions, called Mondrian processes, which can be interpreted as probability distributions over kd-tree data structures. Mondrian processes are multidimensional generalizations of Poisson processes and this connection allows us to construct multidimensional generalizations of the stick-breaking process described by Sethuraman (1994), recovering the Dirichlet process in one dimension. After introducing the Aldous-Hoover representation for jointly and separately exchangeable arrays, we show how the process can be used as a nonparametric prior distribution in Bayesian models of relational data.
5 0.8971864 80 nips-2008-Extended Grassmann Kernels for Subspace-Based Learning
Author: Jihun Hamm, Daniel D. Lee
Abstract: Subspace-based learning problems involve data whose elements are linear subspaces of a vector space. To handle such data structures, Grassmann kernels have been proposed and used previously. In this paper, we analyze the relationship between Grassmann kernels and probabilistic similarity measures. Firstly, we show that the KL distance in the limit yields the Projection kernel on the Grassmann manifold, whereas the Bhattacharyya kernel becomes trivial in the limit and is suboptimal for subspace-based problems. Secondly, based on our analysis of the KL distance, we propose extensions of the Projection kernel which can be extended to the set of affine as well as scaled subspaces. We demonstrate the advantages of these extended kernels for classification and recognition tasks with Support Vector Machines and Kernel Discriminant Analysis using synthetic and real image databases.
6 0.79942435 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction
7 0.77297717 27 nips-2008-Artificial Olfactory Brain for Mixture Identification
8 0.72032815 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes
9 0.6792503 158 nips-2008-Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks
10 0.64111954 234 nips-2008-The Infinite Factorial Hidden Markov Model
11 0.62273139 35 nips-2008-Bayesian Synchronous Grammar Induction
12 0.61863112 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data
13 0.60728401 200 nips-2008-Robust Kernel Principal Component Analysis
14 0.59440053 232 nips-2008-The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction
15 0.59029174 66 nips-2008-Dynamic visual attention: searching for coding length increments
16 0.58611917 192 nips-2008-Reducing statistical dependencies in natural signals using radial Gaussianization
17 0.58561397 127 nips-2008-Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction
18 0.58122778 197 nips-2008-Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation
19 0.57708031 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding
20 0.57163596 229 nips-2008-Syntactic Topic Models