nips nips2008 nips2008-208 knowledge-graph by maker-knowledge-mining

208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes


Source: pdf

Author: Erik B. Sudderth, Michael I. Jordan

Abstract: We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. [sent-7, score-0.6]

2 Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. [sent-8, score-0.444]

3 This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. [sent-9, score-0.265]

4 Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. [sent-10, score-0.547]

5 Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. [sent-11, score-0.399]

6 We would like to build systems which can automatically discover the visual categories (e.g. [sent-13, score-0.186]

7 In simple cases, topic models can be used to cluster local textural elements, coarsely representing categories via a bag of visual features [1, 2]. [sent-17, score-0.259]

8 One approach to modeling additional spatial dependence begins by precomputing one, or several, segmentations of each input image [4–6]. [sent-19, score-0.308]

9 Markov random fields (MRFs) have been used to segment images into one of several known object classes [7, 8], but these approaches require manual segmentations to train category-specific appearance models. [sent-21, score-0.508]

10 In this paper, we instead develop a statistical framework for the unsupervised discovery and segmentation of visual object categories. [sent-22, score-0.34]

11 Using color and texture cues, our method simultaneously groups dense features into spatially coherent segments, and refines these partitions using shared appearance models. [sent-25, score-0.514]

12 This extends the cosegmentation framework [9], which matches two views of a single object instance, to simultaneously segment multiple object categories across a large image database. [sent-26, score-0.627]

13 This generalization of the Dirichlet process (DP) leads to heavier-tailed, power law distributions for the frequencies of observed objects or topics. [sent-29, score-0.229]

14 [Sec.] 2 demonstrates that PY priors closely match the true distributions of natural segment sizes, and the frequencies with which object categories are observed. [sent-31, score-0.545]

15 Importantly, this approach coherently models uncertainty in the number of object categories and instances. [sent-34, score-0.192]

16 [Figure 1 plot residue: histogram axis ticks and the label “Number of Segments per Image”; panels (a)–(d).] Figure 1: Validation of stick-breaking priors for the statistics of human segmentations of the forest (top) and insidecity (bottom) scene categories. [sent-53, score-0.368]

17 (c) Number of segments occupying varying proportions of the image area, on a log-log scale. [sent-59, score-0.437]

18 (d) Counts of segments of size at least 5,000 pixels in 256 × 256 images of natural scenes. [sent-60, score-0.29]
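
As a concrete illustration of how the Fig. 1 statistics can be computed, the sketch below (not the authors' code; the label-image format and the 5,000-pixel cutoff follow the caption above) tallies segment counts and area proportions from an integer label image.

```python
# Hedged sketch: segment-count and segment-size statistics as in Fig. 1,
# computed from a label image whose integer values are segment ids.
import numpy as np

def segment_stats(label_img, min_pixels=5000):
    _, counts = np.unique(label_img, return_counts=True)
    n_segments = counts.size                              # segments per image
    n_large = int((counts >= min_pixels).sum())           # panel (d) statistic
    area_props = np.sort(counts / label_img.size)[::-1]   # panel (c) proportions
    return n_segments, n_large, area_props

# Toy usage with a random (not spatially coherent) 256 x 256 labeling:
labels = np.random.default_rng(0).integers(0, 6, size=(256, 256))
print(segment_stats(labels)[:2])
```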

19 [In Sec.] 4, we use thresholded Gaussian processes to link assignments of features to regions, and thereby produce smooth, coherent segments. [sent-62, score-0.309]

20 Simulations show that our use of continuous latent variables captures long-range dependencies neglected by MRFs, including intervening contour cues derived from image boundaries [13]. [sent-63, score-0.229]

21 Furthermore, our formulation naturally leads to an efficient variational learning algorithm, which automatically searches over segmentations of varying resolution. [sent-64, score-0.23]

22 [Sec.] 5 concludes by demonstrating accurate segmentation of complex images, and discovery of appearance patterns shared across natural scenes. [sent-66, score-0.393]

23 2 Statistics of Natural Scene Categories: To better understand the statistical relationships underlying natural scenes, we analyze manual segmentations of Oliva and Torralba’s eight categories [3]. [sent-67, score-0.258]

24 A non-expert user partitioned each image into a variable number of polygonal segments corresponding to distinctive objects or scene elements (see Fig. [sent-68, score-0.474]

25 Each segment has a semantic text label, allowing study of object co-occurrence frequencies across related scenes. [sent-70, score-0.346]

26 There are over 29,000 segments in the collection of 2,688 images. [sent-71, score-0.203]

27 1 Stick Breaking and Pitman–Yor Processes: The relative frequencies of different object categories, as well as the image areas they occupy, can be statistically modeled via distributions on potentially infinite partitions. [sent-73, score-0.36]

28 When γa > 0, E[wk ] decreases with k, and the resulting partition frequencies follow heavier-tailed, power law distributions. [sent-81, score-0.191]

29 While the sequences of beta variables underlying PY processes lead to infinite partitions, only the random, finite subset K_ε = {k | ϕ_k > ε} of components has probability greater than any threshold ε. [sent-82, score-0.209]
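
A minimal sketch of this stick-breaking construction (an illustration, not the paper's code; the hyperparameter values are arbitrary): sampling truncated PY(γ_a, γ_b) frequencies and extracting the finite subset K_ε that exceeds a threshold.

```python
# Truncated Pitman-Yor stick breaking: pi_k = w_k * prod_{l<k}(1 - w_l),
# with w_k ~ Beta(1 - gamma_a, gamma_b + k * gamma_a); gamma_a = 0 recovers the DP.
import numpy as np

def py_stick_breaking(gamma_a, gamma_b, trunc=1000, rng=None):
    rng = np.random.default_rng(rng)
    k = np.arange(1, trunc + 1)
    w = rng.beta(1.0 - gamma_a, gamma_b + k * gamma_a)            # stick proportions
    stick_left = np.concatenate(([1.0], np.cumprod(1.0 - w)[:-1]))
    return w * stick_left                                          # frequencies pi_k

phi = py_stick_breaking(gamma_a=0.5, gamma_b=1.0, rng=0)
eps = 1e-3
K_eps = np.flatnonzero(phi > eps)    # the random, finite subset {k | phi_k > eps}
print(K_eps.size, np.round(phi[:5], 3))
```

Setting γ_a > 0 makes the sorted frequencies decay as a power law rather than geometrically, which is the property validated against human segmentations in Fig. 1.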

30 2 Object Label Frequencies: Pitman–Yor processes have been previously used to model the well-known power law behavior of text sequences [15, 16]. [sent-89, score-0.183]

31 Intuitively, the labels assigned to segments in the natural scene database have similar properties: some (like sky, trees, and building) occur frequently, while others (rainbow, lichen, scaffolding, obelisk, etc.) occur only rarely. [sent-90, score-0.321]

32 [Fig.] 1(b) plots the observed frequencies with which unique text labels, sorted from most to least frequent, occur in two scene categories. [sent-93, score-0.21]

33 By varying PY hyperparameters, we also capture interesting differences among scene types: urban, man-made environments have many more unique objects than natural ones. [sent-99, score-0.225]

34 3 Segment Counts and Size Distributions: We have also used the natural scene database to quantitatively validate PY priors for image partitions [17]. [sent-101, score-0.4]

35 [As shown in Fig.] 1(d), PY priors also model uncertainty in the number of segments at various resolutions. [sent-106, score-0.257]

36 While power laws are often used simply as a descriptive summary of observed statistics, PY processes provide a consistent generative model which we use to develop effective segmentation algorithms. [sent-107, score-0.305]

37 We do not claim that PY processes are the only valid prior for image areas; for example, log-normal distributions have similar properties, and may also provide a good model [18]. [sent-108, score-0.233]

38 However, PY priors lead to efficient variational inference algorithms, avoiding the costly MCMC search required by other segmentation methods with region size priors [18, 19]. [sent-109, score-0.417]

39 We first describe a “bag of features” model [1, 2] capturing prior knowledge about region counts and sizes, and then extend it to model spatially coherent shapes in Sec. [sent-111, score-0.196]

40 1 Hierarchical Pitman–Yor Processes: Each image is first divided into roughly 1,000 superpixels [18] using a variant of the normalized cuts spectral clustering algorithm [13]. [sent-117, score-0.264]

41 Superpixel i in image j is then represented by histograms x_ji = (x_ji^t, x_ji^c) indicating its texture x_ji^t and color x_ji^c. [sent-120, score-0.413]
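
A hedged sketch of this feature pipeline (the paper uses a normalized-cuts variant for superpixels and its own texture and color quantization; here skimage's SLIC, a Sobel gradient, and Lab lightness bins are stand-in assumptions):

```python
# Hedged sketch of per-superpixel features: SLIC superpixels stand in for the
# paper's normalized-cuts variant, a Sobel gradient stands in for its texture
# descriptors, and Lab lightness stands in for its color features.
import numpy as np
from skimage.color import rgb2gray, rgb2lab
from skimage.filters import sobel
from skimage.segmentation import slic

def superpixel_histograms(image, n_superpixels=1000, w_t=32, w_c=64):
    sp = slic(image, n_segments=n_superpixels, compactness=10, start_label=0)
    texture = np.digitize(sobel(rgb2gray(image)), np.linspace(0, 0.5, w_t - 1))
    color = np.digitize(rgb2lab(image)[..., 0], np.linspace(0, 100, w_c - 1))
    histograms = []
    for i in range(sp.max() + 1):
        mask = sp == i
        x_t = np.bincount(texture[mask], minlength=w_t)  # texture histogram x_ji^t
        x_c = np.bincount(color[mask], minlength=w_c)    # color histogram x_ji^c
        histograms.append((x_t, x_c))
    return sp, histograms
```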

42 Figure 2 contains a directed graphical model summarizing our HPY model for collections of local image features. [sent-121, score-0.56]

43 Each of the potentially infinite set of global object categories occurs with frequency ϕk , where ϕ ∼ GEM(γa , γb ) as motivated in Sec. [sent-122, score-0.226]

44 Each category k also has an associated appearance model θ_k = (θ_k^t, θ_k^c), where θ_k^t and θ_k^c parameterize multinomial distributions on the W_t texture and W_c color bins, respectively. [sent-125, score-0.22]
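
For concreteness, a minimal helper (illustrative only, not the paper's code) evaluating a superpixel's log-likelihood under category k's multinomial appearance model, ignoring the constant multinomial coefficient:

```python
# theta_t, theta_c: probability vectors over W_t texture and W_c color bins;
# x_t, x_c: the superpixel's count histograms x_ji = (x_ji^t, x_ji^c).
import numpy as np

def appearance_loglik(x_t, x_c, theta_t, theta_c):
    return float(x_t @ np.log(theta_t) + x_c @ np.log(theta_c))
```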

45 Consider a dataset containing J images of related scenes, each of which is allocated an infinite set of potential segments or regions. [sent-127, score-0.257]

46 3, region t occupies a random proportion π_jt of the area in image j, where π_j ∼ GEM(α_a, α_b). [sent-130, score-0.285]

47 Each region is also associated with a particular global object category kjt ∼ ϕ. [sent-131, score-0.221]

48 [Figure 2 plot residue: graphical-model nodes k_jt, w_k, v_jt and probability-density axes.] [sent-133, score-0.24]

49 Left: Directed graphical model in which global category frequencies ϕ ∼ GEM(γ) are constructed from stick-breaking proportions w_k ∼ Beta(1 − γ_a, γ_b + kγ_a), as in Eq. [sent-166, score-0.269]

50 Similarly, v_jt ∼ Beta(1 − α_a, α_b + tα_a) define region areas π_j ∼ GEM(α) for image j. [sent-168, score-0.311]

51 Upper right: Beta distributions from which stick proportions wk are sampled for three different PY processes: k = 1 (blue), k = 10 (red), k = 20 (green). [sent-171, score-0.252]

52 2 Variational Learning for HPY Mixture Models: To allow efficient learning of HPY model parameters from large image databases, we have developed a mean field variational method which combines and extends previous approaches for DP mixtures [21, 22] and finite topic models. [sent-177, score-0.234]

53 We truncate the variational posterior [21] by setting q(vjT = 1) = 1 for each image or group, and q(wK = 1) = 1 for the shared global clusters. [sent-180, score-0.323]

54 Multinomial assignments q(k_jt | κ_jt), q(t_ji | τ_ji), and beta stick proportions q(w_k | ω_k), q(v_jt | ν_jt) then have closed-form update equations. [sent-181, score-0.34]
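
One standard ingredient of these closed-form updates is the expectation of the log-stick terms under the variational Beta factors; a hedged sketch follows (the parameter names ω1, ω2 are assumptions; the truncation q(w_K = 1) = 1 is handled by the caller).

```python
import numpy as np
from scipy.special import digamma

def expected_log_pi(omega1, omega2):
    """E_q[log pi_k] under q(w_k) = Beta(omega1[k], omega2[k]) stick factors."""
    total = digamma(omega1 + omega2)
    e_log_w = digamma(omega1) - total          # E[log w_k]
    e_log_1mw = digamma(omega2) - total        # E[log (1 - w_k)]
    # E[log pi_k] = E[log w_k] + sum_{l < k} E[log (1 - w_l)]
    return e_log_w + np.concatenate(([0.0], np.cumsum(e_log_1mw)[:-1]))
```

These expectations feed the multinomial updates for q(k_jt) and q(t_ji); the cluster-sorting step described in the next sentence amounts to reordering the ω parameters by aggregate assignment mass.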

55 To avoid bias, we sort the current sets of image segments and global categories in order of decreasing aggregate assignment probability after each iteration [22]. [sent-182, score-0.182]

56 4 Segmentation with Spatially Dependent Pitman–Yor Processes: We now generalize the HPY image segmentation model of Fig. [sent-183, score-0.332]

57 For simplicity, we consider a single-image model in which features xi are assigned to regions by indicator variables zi , and each segment k has its own appearance parameters θk (see Fig. [sent-185, score-0.365]

58 1 Coupling Assignments using Thresholded Gaussian Processes: Consider a generative model which partitions data into two clusters via assignments z_i ∈ {0, 1} sampled such that P[z_i = 1] = v. [sent-191, score-0.229]

59 z_i = 1 if u_i < Φ^{-1}(v), and z_i = 0 otherwise, where u_i ∼ N(0, 1). We adapt this idea to PY processes using the stick-breaking representation of Eq. [sent-194, score-0.186]

60 In particular, we note that if z_i ∼ π where π_k = v_k ∏_{ℓ=1}^{k−1} (1 − v_ℓ), a simple induction argument shows that v_k = P[z_i = k | z_i ≠ 1, …, z_i ≠ k − 1]. [sent-196, score-0.362]

61 The stick-breaking proportion vk is thus the conditional probability of choosing cluster k, given that clusters with indexes ℓ < k have been rejected. [sent-200, score-0.232]
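
This conditional-probability identity is easy to verify numerically; the sketch below (illustrative, with arbitrary PY hyperparameters) checks that P[z = k | z ≥ k] matches v_k under a truncated stick-breaking prior:

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.beta(1.0 - 0.3, 1.0 + 0.3 * np.arange(1, 21))       # truncated PY sticks
pi = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
pi = np.append(pi, 1.0 - pi.sum())                           # catch-all remainder
z = rng.choice(pi.size, size=200_000, p=pi)
k = 3                                                        # 0-indexed cluster
print(np.mean(z[z >= k] == k), v[k])                         # both approximately v_k
```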

62 [Figure 3 diagram residue: ordered Gaussian process layers u_k, threshold nodes v_k, and assignment nodes z_i, x_i; the interrupted sentence (“Combining …”) continues at sentence 66.] Figure 3: A nonparametric Bayesian approach to image segmentation in which thresholded Gaussian processes generate spatially dependent Pitman–Yor processes. [sent-201, score-0.765]

63 Left: Directed graphical model in which expected segment areas π ∼ GEM(α) are constructed from stick-breaking proportions v_k ∼ Beta(1 − α_a, α_b + kα_a). [sent-202, score-0.38]

64 Zero-mean Gaussian processes (u_ki ∼ N(0, 1)) are cut by thresholds Φ^{-1}(v_k) to produce segment assignments z_i, and thereby features x_i. [sent-203, score-0.489]

65 Right: Three randomly sampled image partitions (columns), where assignments (bottom, color-coded) are determined by the first of the ordered Gaussian processes u_k to cross Φ^{-1}(v_k). [sent-204, score-0.394]

66 [Combining this construction with Eq.] (4), we can generate samples z_i ∼ π as follows: z_i = min{k | u_ki < Φ^{-1}(v_k)}, where u_ki ∼ N(0, 1) and u_ki ⊥ u_ℓi for k ≠ ℓ. (5) As illustrated in Fig. [sent-206, score-0.493]
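
A hedged sketch of Eq. (5) on a small pixel grid (illustrative only: the squared-exponential covariance, length-scale, and truncation level are assumptions; the paper's PY-Edge variant further modulates the covariance with boundary cues):

```python
import numpy as np
from scipy.stats import norm

def dependent_py_partition(h=32, w=32, alpha_a=0.1, alpha_b=2.0, K=10,
                           ell=8.0, rng=None):
    rng = np.random.default_rng(rng)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.c_[ys.ravel(), xs.ravel()].astype(float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    cov = np.exp(-d2 / (2.0 * ell ** 2)) + 1e-6 * np.eye(h * w)
    L = np.linalg.cholesky(cov)
    v = rng.beta(1.0 - alpha_a, alpha_b + alpha_a * np.arange(1, K + 1))
    thresh = norm.ppf(v)                           # Phi^{-1}(v_k)
    u = L @ rng.standard_normal((h * w, K))        # K independent GP layers
    below = u < thresh                             # does site i accept layer k?
    below[:, -1] = True                            # truncation: last layer catches all
    return below.argmax(axis=1).reshape(h, w)      # first accepted layer per site

partition = dependent_py_partition(rng=0)
```

Because nearby sites share each u_k field, they tend to cross the same threshold first, yielding the spatially contiguous, ordered (layer-like) segments described above; as the grid grows relative to the length-scale, segment proportions approach GEM(α_a, α_b).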

67 Intuitively, the ordering of segments underlying this dependent PY model is analogous to layered appearance models [23], in which foreground layers occlude those that are farther from the camera. [sent-213, score-0.332]

68 To retain the power law prior on segment sizes justified in Sec. [sent-214, score-0.271]

69 3, we transform priors on stick proportions v_k ∼ Beta(1 − α_a, α_b + kα_a) into corresponding random thresholds: p(v̄_k | α) = N(v̄_k | 0, 1) · Beta(Φ(v̄_k) | 1 − α_a, α_b + kα_a), where v̄_k = Φ^{-1}(v_k). (6) Fig. [sent-216, score-0.445]

70 As the number of features N becomes large relative to the GP covariance length-scale, the proportion assigned to segment k approaches πk , where π ∼ GEM(αa , αb ) as desired. [sent-218, score-0.273]

71 Figure 4: Five samples from each of four prior models for image partitions (color coded). [sent-230, score-0.228]

72 [24] proposed a generalized spatial Dirichlet process which links assignments via thresholded GPs, as in Sec. [sent-239, score-0.192]

73 However, their focus is on modeling spatial random effects for prediction tasks, as opposed to the segmentation tasks which motivate our generalization to PY processes. [sent-242, score-0.231]

74 Moreover, their basic Gibbs sampler takes 12 hours on a toy dataset with 2,000 observations; our variational method jointly segments 200 scenes in comparable time. [sent-244, score-0.362]

75 This produces a field of smoothly varying multinomial distributions π̌_i, from which segment assignments are independently sampled as z_i ∼ π̌_i. [sent-246, score-0.353]

76 Moreover, its bias towards partitions with K segments of similar size is a poor fit for natural scenes. [sent-249, score-0.316]

77 A previous nonparametric image segmentation method defined its prior as a normalized product of a DP sample π ∼ GEM(0, α) and a nearest neighbor MRF with Potts potentials [28]. [sent-250, score-0.418]

78 Due to the phase transition which occurs with increasing potential strength, Potts models assign low probability to realistic image partitions [29]. [sent-252, score-0.228]

79 5 Results: Figure 5 shows segmentation results for images from the scene categories considered in Sec. [sent-255, score-0.435]

80 We compare the bag of features PY model (PY-BOF), dependent PY with distance-based squared exponential covariance (PY-Dist), and dependent PY with covariance that incorporates intervening contour cues (PY-Edge) based on the Pb detector [20]. [sent-257, score-0.267]
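
To make the PY-Edge idea concrete, below is a heavily hedged sketch of one way to fold intervening-contour (Pb) cues into a GP covariance between superpixel centroids; the paper's exact functional form may differ, and the max-over-line rule and parameter values here are assumptions.

```python
# Attenuate the GP covariance between superpixels by the strongest intervening
# boundary (Pb) sampled along the line between their centroids. Note the result
# may need a jitter or PSD projection before use as a GP covariance.
import numpy as np

def edge_modulated_cov(centers, pb, ell=40.0, n_samples=16):
    """centers: (N, 2) superpixel centroids; pb: HxW boundary strengths in [0, 1]."""
    n = len(centers)
    d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    cov = np.exp(-d2 / (2.0 * ell ** 2))
    ts = np.linspace(0.0, 1.0, n_samples)
    for i in range(n):
        for j in range(i + 1, n):
            line = centers[i] + ts[:, None] * (centers[j] - centers[i])
            rr = np.clip(line[:, 0].astype(int), 0, pb.shape[0] - 1)
            cc = np.clip(line[:, 1].astype(int), 0, pb.shape[1] - 1)
            w = 1.0 - pb[rr, cc].max()          # strong boundary -> low covariance
            cov[i, j] = cov[j, i] = cov[i, j] * w
    return cov
```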

81 [In Figs.] 5 and 6, we independently segment each image, without sharing appearance models or supervised training. [sent-261, score-0.261]

82 We compare our results to the normalized cuts spectral clustering method with varying numbers of segments (NCut(K)), and a high-quality affinity function based on color, texture, and intervening contour cues [13]. [sent-262, score-0.431]

83 To quantitatively evaluate results, we measure overlap with held-out human segments via the Rand index [30]. [sent-265, score-0.203]
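
The Rand index [30] used for this evaluation is a standard quantity; below is a self-contained implementation from pairwise-agreement counts (assuming non-negative integer labels), included for reference rather than taken from the paper's code.

```python
import numpy as np
from scipy.special import comb

def rand_index(a, b):
    """Rand index between two labelings of the same pixels."""
    a, b = np.ravel(a), np.ravel(b)
    n = a.size
    table = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(table, (a, b), 1)                     # contingency counts n_ij
    same_both = comb(table, 2).sum()                # pairs together in both
    same_a = comb(table.sum(axis=1), 2).sum()
    same_b = comb(table.sum(axis=0), 2).sum()
    total = comb(n, 2)
    return float((total + 2 * same_both - same_a - same_b) / total)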

84 We have also experimented with our hierarchical PY extension, in which color and texture distributions are shared between images. [sent-268, score-0.241]

85 [As shown in Fig.] 7, many of the inferred global visual categories align reasonably with semantic categories (e.g. [sent-270, score-0.301]

86 6 Discussion: We have developed a nonparametric framework for image segmentation which uses thresholded Gaussian processes to produce spatially coupled Pitman–Yor processes. [sent-273, score-0.611]

87 This approach produces empirically justified power law priors for region areas and object frequencies, and allows visual appearance… [interrupted by the Figure 5 caption; continues at sentence 91]. Figure 5: Segmentation results for two images (rows) from each of the coast, mountain, and tallbuilding scene categories. [sent-274, score-0.56]

88 From left to right, columns show LabelMe human segments, image with boundaries inferred by PY-Edge, and segments for PY-Edge, PY-Dist, PY-BOF, NCut(3), NCut(4), and NCut(6). [sent-275, score-0.351]

89 [Figure 6 plot residue: Rand-index y-axis ticks; x-axis “Number of Normalized Cuts Regions”; panel (d).] Figure 6: Quantitative comparison of segmentation results to human segments, using the Rand index. [sent-311, score-0.184]

90 We plot the performance of NCut(K) versus the number of segments K, compared to the variable resolution segmentations of PY-Edge, PY-Dist, and PY-BOF. [sent-314, score-0.316]

91 …appearance models to be flexibly shared among natural scenes, and leads to efficient variational inference algorithms which automatically search over segmentations of varying resolution. [sent-317, score-0.318]

92 We believe this provides a promising starting point for discovery of shape-based visual appearance models, as well as weakly supervised nonparametric learning in other, non-visual application domains. [sent-318, score-0.214]

93 Acknowledgments We thank Charless Fowlkes and David Martin for the Pb boundary estimation and segmentation code, Antonio Torralba for helpful conversations, and Sra. [sent-319, score-0.184]

94 Figure 7: Most significant segments associated with each of three shared, global visual categories (rows) for hierarchical PY-Edge models trained with 200 images of mountain (left) or tallbuilding (right) scenes. [sent-343, score-0.611]

95 Spatially coherent latent topic model for concurrent object segmentation and classification. [sent-347, score-0.307]

96 Using multiple segmentations to discover objects and their extent in image collections. [sent-358, score-0.33]

97 Cosegmentation of image pairs by histogram matching: Incorporating a global constraint into MRFs. [sent-383, score-0.182]

98 Shared segmentation of natural scenes using dependent Pitman-Yor processes. [sent-447, score-0.331]

99 Learning to detect natural image boundaries using local brightness, color, and texture cues. [sent-469, score-0.258]

100 The Ising/Potts model is not well suited to segmentation tasks. [sent-540, score-0.184]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('py', 0.486), ('pitman', 0.256), ('yor', 0.239), ('segments', 0.203), ('segmentation', 0.184), ('segment', 0.173), ('hpy', 0.171), ('gem', 0.153), ('image', 0.148), ('dp', 0.133), ('uki', 0.119), ('segmentations', 0.113), ('vk', 0.113), ('categories', 0.112), ('stick', 0.11), ('rand', 0.109), ('ncut', 0.104), ('ji', 0.103), ('covar', 0.102), ('beta', 0.094), ('frequencies', 0.093), ('appearance', 0.088), ('wk', 0.087), ('variational', 0.086), ('tji', 0.085), ('vjt', 0.085), ('scene', 0.085), ('processes', 0.085), ('assignments', 0.081), ('cuts', 0.08), ('object', 0.08), ('partitions', 0.08), ('spatially', 0.08), ('texture', 0.077), ('scenes', 0.073), ('dirichlet', 0.071), ('jt', 0.068), ('insidecity', 0.068), ('kjt', 0.068), ('tallbuilding', 0.068), ('zi', 0.068), ('bag', 0.068), ('breaking', 0.068), ('thresholded', 0.064), ('proportion', 0.064), ('law', 0.062), ('proportions', 0.055), ('indexes', 0.055), ('shared', 0.055), ('color', 0.055), ('priors', 0.054), ('hierarchical', 0.054), ('images', 0.054), ('superpixel', 0.051), ('xji', 0.051), ('potts', 0.051), ('nonparametric', 0.05), ('gps', 0.05), ('forest', 0.048), ('spatial', 0.047), ('cvpr', 0.047), ('thresholds', 0.046), ('zji', 0.045), ('coherent', 0.043), ('visual', 0.043), ('labelme', 0.043), ('mountain', 0.043), ('dependent', 0.041), ('xc', 0.041), ('intervening', 0.041), ('iccv', 0.041), ('cues', 0.04), ('areas', 0.039), ('region', 0.039), ('objects', 0.038), ('environments', 0.038), ('mrf', 0.037), ('features', 0.036), ('sky', 0.036), ('power', 0.036), ('normalized', 0.036), ('ki', 0.035), ('fowlkes', 0.035), ('pb', 0.035), ('global', 0.034), ('cosegmentation', 0.034), ('duan', 0.034), ('foliage', 0.034), ('mountains', 0.034), ('occupies', 0.034), ('counts', 0.034), ('natural', 0.033), ('sudderth', 0.033), ('discovery', 0.033), ('ui', 0.033), ('volume', 0.032), ('sorted', 0.032), ('varying', 0.031), ('discover', 0.031), ('threshold', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999982 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes

Author: Erik B. Sudderth, Michael I. Jordan

Abstract: We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. 1

2 0.33812177 91 nips-2008-Generative and Discriminative Learning with Unknown Labeling Bias

Author: Steven J. Phillips, Miroslav Dudík

Abstract: We apply robust Bayesian decision theory to improve both generative and discriminative learners under bias in class proportions in labeled training data, when the true class proportions are unknown. For the generative case, we derive an entropybased weighting that maximizes expected log likelihood under the worst-case true class proportions. For the discriminative case, we derive a multinomial logistic model that minimizes worst-case conditional log loss. We apply our theory to the modeling of species geographic distributions from presence data, an extreme case of labeling bias since there is no absence data. On a benchmark dataset, we find that entropy-based weighting offers an improvement over constant estimates of class proportions, consistently reducing log loss on unbiased test data. 1

3 0.19901776 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing

Author: Leo Zhu, Yuanhao Chen, Yuan Lin, Chenxi Lin, Alan L. Yuille

Abstract: Language and image understanding are two major goals of artificial intelligence which can both be conceptually formulated in terms of parsing the input signal into a hierarchical representation. Natural language researchers have made great progress by exploiting the 1D structure of language to design efficient polynomialtime parsing algorithms. By contrast, the two-dimensional nature of images makes it much harder to design efficient image parsers and the form of the hierarchical representations is also unclear. Attempts to adapt representations and algorithms from natural language have only been partially successful. In this paper, we propose a Hierarchical Image Model (HIM) for 2D image parsing which outputs image segmentation and object recognition. This HIM is represented by recursive segmentation and recognition templates in multiple layers and has advantages for representation, inference, and learning. Firstly, the HIM has a coarse-to-fine representation which is capable of capturing long-range dependency and exploiting different levels of contextual information. Secondly, the structure of the HIM allows us to design a rapid inference algorithm, based on dynamic programming, which enables us to parse the image rapidly in polynomial time. Thirdly, we can learn the HIM efficiently in a discriminative manner from a labeled dataset. We demonstrate that HIM outperforms other state-of-the-art methods by evaluation on the challenging public MSRC image dataset. Finally, we sketch how the HIM architecture can be extended to model more complex image phenomena. 1

4 0.17356896 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding

Author: Geremy Heitz, Stephen Gould, Ashutosh Saxena, Daphne Koller

Abstract: One of the original goals of computer vision was to fully understand a natural scene. This requires solving several sub-problems simultaneously, including object detection, region labeling, and geometric reasoning. The last few decades have seen great progress in tackling each of these problems in isolation. Only recently have researchers returned to the difficult task of considering them jointly. In this work, we consider learning a set of related models in such that they both solve their own problem and help each other. We develop a framework called Cascaded Classification Models (CCM), where repeated instantiations of these classifiers are coupled by their input/output variables in a cascade that improves performance at each level. Our method requires only a limited “black box” interface with the models, allowing us to use very sophisticated, state-of-the-art classifiers without having to look under the hood. We demonstrate the effectiveness of our method on a large set of natural images by combining the subtasks of scene categorization, object detection, multiclass image segmentation, and 3d reconstruction. 1

5 0.17030019 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data

Author: Xuming He, Richard S. Zemel

Abstract: Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain. We explore a hybrid model framework for utilizing partially labeled data that integrates a generative topic model for image appearance with discriminative label prediction. We propose three alternative formulations for imposing a spatial smoothness prior on the image labels. Tests of the new models and some baseline approaches on three real image datasets demonstrate the effectiveness of incorporating the latent structure. 1

6 0.16979124 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping

7 0.13549814 6 nips-2008-A ``Shape Aware'' Model for semi-supervised Learning of Objects and its Context

8 0.11609368 142 nips-2008-Multi-Level Active Prediction of Useful Image Annotations for Recognition

9 0.091732971 26 nips-2008-Analyzing human feature learning as nonparametric Bayesian inference

10 0.08795289 207 nips-2008-Shape-Based Object Localization for Descriptive Classification

11 0.086867064 118 nips-2008-Learning Transformational Invariants from Natural Movies

12 0.086385369 148 nips-2008-Natural Image Denoising with Convolutional Networks

13 0.082329735 246 nips-2008-Unsupervised Learning of Visual Sense Models for Polysemous Words

14 0.075905874 107 nips-2008-Influence of graph construction on graph-based clustering measures

15 0.073982388 95 nips-2008-Grouping Contours Via a Related Image

16 0.072158962 154 nips-2008-Nonparametric Bayesian Learning of Switching Linear Dynamical Systems

17 0.065700755 111 nips-2008-Kernel Change-point Analysis

18 0.063446455 130 nips-2008-MCBoost: Multiple Classifier Boosting for Perceptual Co-clustering of Images and Visual Features

19 0.06344153 119 nips-2008-Learning a discriminative hidden part model for human action recognition

20 0.063095219 229 nips-2008-Syntactic Topic Models


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.213), (1, -0.134), (2, 0.145), (3, -0.204), (4, -0.028), (5, -0.008), (6, -0.08), (7, -0.069), (8, -0.012), (9, -0.007), (10, -0.052), (11, -0.029), (12, 0.171), (13, -0.199), (14, -0.004), (15, 0.012), (16, -0.047), (17, 0.088), (18, -0.055), (19, -0.03), (20, -0.006), (21, 0.127), (22, 0.071), (23, 0.046), (24, -0.003), (25, -0.052), (26, 0.099), (27, 0.158), (28, 0.077), (29, -0.029), (30, -0.056), (31, -0.311), (32, -0.041), (33, 0.194), (34, -0.014), (35, -0.029), (36, 0.138), (37, -0.038), (38, 0.128), (39, -0.128), (40, 0.145), (41, -0.01), (42, 0.094), (43, 0.01), (44, 0.03), (45, 0.14), (46, 0.004), (47, -0.097), (48, 0.072), (49, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94700432 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes

Author: Erik B. Sudderth, Michael I. Jordan

Abstract: We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. 1

2 0.72420442 91 nips-2008-Generative and Discriminative Learning with Unknown Labeling Bias

Author: Steven J. Phillips, Miroslav Dudík

Abstract: We apply robust Bayesian decision theory to improve both generative and discriminative learners under bias in class proportions in labeled training data, when the true class proportions are unknown. For the generative case, we derive an entropybased weighting that maximizes expected log likelihood under the worst-case true class proportions. For the discriminative case, we derive a multinomial logistic model that minimizes worst-case conditional log loss. We apply our theory to the modeling of species geographic distributions from presence data, an extreme case of labeling bias since there is no absence data. On a benchmark dataset, we find that entropy-based weighting offers an improvement over constant estimates of class proportions, consistently reducing log loss on unbiased test data. 1

3 0.55515122 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing

Author: Leo Zhu, Yuanhao Chen, Yuan Lin, Chenxi Lin, Alan L. Yuille

Abstract: Language and image understanding are two major goals of artificial intelligence which can both be conceptually formulated in terms of parsing the input signal into a hierarchical representation. Natural language researchers have made great progress by exploiting the 1D structure of language to design efficient polynomialtime parsing algorithms. By contrast, the two-dimensional nature of images makes it much harder to design efficient image parsers and the form of the hierarchical representations is also unclear. Attempts to adapt representations and algorithms from natural language have only been partially successful. In this paper, we propose a Hierarchical Image Model (HIM) for 2D image parsing which outputs image segmentation and object recognition. This HIM is represented by recursive segmentation and recognition templates in multiple layers and has advantages for representation, inference, and learning. Firstly, the HIM has a coarse-to-fine representation which is capable of capturing long-range dependency and exploiting different levels of contextual information. Secondly, the structure of the HIM allows us to design a rapid inference algorithm, based on dynamic programming, which enables us to parse the image rapidly in polynomial time. Thirdly, we can learn the HIM efficiently in a discriminative manner from a labeled dataset. We demonstrate that HIM outperforms other state-of-the-art methods by evaluation on the challenging public MSRC image dataset. Finally, we sketch how the HIM architecture can be extended to model more complex image phenomena. 1

4 0.51192838 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping

Author: Longin J. Latecki, Chengen Lu, Marc Sobel, Xiang Bai

Abstract: We introduce a new interpretation of multiscale random fields (MSRFs) that admits efficient optimization in the framework of regular (single level) random fields (RFs). It is based on a new operator, called append, that combines sets of random variables (RVs) to single RVs. We assume that a MSRF can be decomposed into disjoint trees that link RVs at different pyramid levels. The append operator is then applied to map RVs in each tree structure to a single RV. We demonstrate the usefulness of the proposed approach on a challenging task involving grouping contours of target shapes in images. It provides a natural representation of multiscale contour models, which is needed in order to cope with unstable contour decompositions. The append operator allows us to find optimal image segment labels using the classical framework of relaxation labeling. Alternative methods like Markov Chain Monte Carlo (MCMC) could also be used.

5 0.48576131 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data

Author: Xuming He, Richard S. Zemel

Abstract: Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain. We explore a hybrid model framework for utilizing partially labeled data that integrates a generative topic model for image appearance with discriminative label prediction. We propose three alternative formulations for imposing a spatial smoothness prior on the image labels. Tests of the new models and some baseline approaches on three real image datasets demonstrate the effectiveness of incorporating the latent structure. 1

6 0.47806287 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding

7 0.4212974 236 nips-2008-The Mondrian Process

8 0.41933039 148 nips-2008-Natural Image Denoising with Convolutional Networks

9 0.40753719 111 nips-2008-Kernel Change-point Analysis

10 0.38219973 6 nips-2008-A ``Shape Aware'' Model for semi-supervised Learning of Objects and its Context

11 0.37360767 41 nips-2008-Breaking Audio CAPTCHAs

12 0.35499606 153 nips-2008-Nonlinear causal discovery with additive noise models

13 0.3443802 176 nips-2008-Partially Observed Maximum Entropy Discrimination Markov Networks

14 0.32322344 207 nips-2008-Shape-Based Object Localization for Descriptive Classification

15 0.31259814 154 nips-2008-Nonparametric Bayesian Learning of Switching Linear Dynamical Systems

16 0.30350342 23 nips-2008-An ideal observer model of infant object perception

17 0.29201266 95 nips-2008-Grouping Contours Via a Related Image

18 0.28810209 142 nips-2008-Multi-Level Active Prediction of Useful Image Annotations for Recognition

19 0.28480652 119 nips-2008-Learning a discriminative hidden part model for human action recognition

20 0.2734274 169 nips-2008-Online Models for Content Optimization


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(6, 0.053), (7, 0.057), (12, 0.046), (15, 0.016), (28, 0.122), (35, 0.022), (55, 0.246), (57, 0.152), (59, 0.019), (63, 0.037), (71, 0.018), (77, 0.054), (81, 0.011), (83, 0.059)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82228041 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes

Author: Erik B. Sudderth, Michael I. Jordan

Abstract: We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. 1

2 0.65628636 27 nips-2008-Artificial Olfactory Brain for Mixture Identification

Author: Mehmet K. Muezzinoglu, Alexander Vergara, Ramon Huerta, Thomas Nowotny, Nikolai Rulkov, Henry Abarbanel, Allen Selverston, Mikhail Rabinovich

Abstract: The odor transduction process has a large time constant and is susceptible to various types of noise. Therefore, the olfactory code at the sensor/receptor level is in general a slow and highly variable indicator of the input odor in both natural and artificial situations. Insects overcome this problem by using a neuronal device in their Antennal Lobe (AL), which transforms the identity code of olfactory receptors to a spatio-temporal code. This transformation improves the decision of the Mushroom Bodies (MBs), the subsequent classifier, in both speed and accuracy. Here we propose a rate model based on two intrinsic mechanisms in the insect AL, namely integration and inhibition. Then we present a MB classifier model that resembles the sparse and random structure of insect MB. A local Hebbian learning procedure governs the plasticity in the model. These formulations not only help to understand the signal conditioning and classification methods of insect olfactory systems, but also can be leveraged in synthetic problems. Among them, we consider here the discrimination of odor mixtures from pure odors. We show on a set of records from metal-oxide gas sensors that the cascade of these two new models facilitates fast and accurate discrimination of even highly imbalanced mixtures from pure odors. 1

3 0.65072322 80 nips-2008-Extended Grassmann Kernels for Subspace-Based Learning

Author: Jihun Hamm, Daniel D. Lee

Abstract: Subspace-based learning problems involve data whose elements are linear subspaces of a vector space. To handle such data structures, Grassmann kernels have been proposed and used previously. In this paper, we analyze the relationship between Grassmann kernels and probabilistic similarity measures. Firstly, we show that the KL distance in the limit yields the Projection kernel on the Grassmann manifold, whereas the Bhattacharyya kernel becomes trivial in the limit and is suboptimal for subspace-based problems. Secondly, based on our analysis of the KL distance, we propose extensions of the Projection kernel which can be extended to the set of affine as well as scaled subspaces. We demonstrate the advantages of these extended kernels for classification and recognition tasks with Support Vector Machines and Kernel Discriminant Analysis using synthetic and real image databases. 1

4 0.65067977 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction

Author: Jing Xu, Thomas L. Griffiths

Abstract: Many human interactions involve pieces of information being passed from one person to another, raising the question of how this process of information transmission is affected by the capacities of the agents involved. In the 1930s, Sir Frederic Bartlett explored the influence of memory biases in “serial reproduction” of information, in which one person’s reconstruction of a stimulus from memory becomes the stimulus seen by the next person. These experiments were done using relatively uncontrolled stimuli such as pictures and stories, but suggested that serial reproduction would transform information in a way that reflected the biases inherent in memory. We formally analyze serial reproduction using a Bayesian model of reconstruction from memory, giving a general result characterizing the effect of memory biases on information transmission. We then test the predictions of this account in two experiments using simple one-dimensional stimuli. Our results provide theoretical and empirical justification for the idea that serial reproduction reflects memory biases. 1

5 0.64556986 236 nips-2008-The Mondrian Process

Author: Daniel M. Roy, Yee W. Teh

Abstract: We describe a novel class of distributions, called Mondrian processes, which can be interpreted as probability distributions over kd-tree data structures. Mondrian processes are multidimensional generalizations of Poisson processes and this connection allows us to construct multidimensional generalizations of the stickbreaking process described by Sethuraman (1994), recovering the Dirichlet process in one dimension. After introducing the Aldous-Hoover representation for jointly and separately exchangeable arrays, we show how the process can be used as a nonparametric prior distribution in Bayesian models of relational data. 1

6 0.64184451 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing

7 0.64126039 148 nips-2008-Natural Image Denoising with Convolutional Networks

8 0.63126731 233 nips-2008-The Gaussian Process Density Sampler

9 0.6302281 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data

10 0.626616 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding

11 0.62236708 158 nips-2008-Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks

12 0.62221682 200 nips-2008-Robust Kernel Principal Component Analysis

13 0.61354196 95 nips-2008-Grouping Contours Via a Related Image

14 0.61313421 66 nips-2008-Dynamic visual attention: searching for coding length increments

15 0.61257195 234 nips-2008-The Infinite Factorial Hidden Markov Model

16 0.61136663 192 nips-2008-Reducing statistical dependencies in natural signals using radial Gaussianization

17 0.60981667 197 nips-2008-Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation

18 0.60512882 246 nips-2008-Unsupervised Learning of Visual Sense Models for Polysemous Words

19 0.60508406 118 nips-2008-Learning Transformational Invariants from Natural Movies

20 0.60403973 62 nips-2008-Differentiable Sparse Coding