nips nips2012 nips2012-146 knowledge-graph by maker-knowledge-mining

146 nips-2012-Graphical Gaussian Vector for Image Categorization


Source: pdf

Author: Tatsuya Harada, Yasuo Kuniyoshi

Abstract: This paper proposes a novel image representation called a Graphical Gaussian Vector (GGV), which is a counterpart of the codebook and local feature matching approaches. We model the distribution of local features as a Gaussian Markov Random Field (GMRF) which can efficiently represent the spatial relationship among local features. Using concepts of information geometry, proper parameters and a metric from the GMRF can be obtained. Then we define a new image feature by embedding the proper metric into the parameters, which can be directly applied to scalable linear classifiers. We show that the GGV obtains better performance over the state-of-the-art methods in the standard object recognition datasets and comparable performance in the scene dataset. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 This paper proposes a novel image representation called a Graphical Gaussian Vector (GGV), which is a counterpart of the codebook and local feature matching approaches. [sent-11, score-0.327]

2 We model the distribution of local features as a Gaussian Markov Random Field (GMRF) which can efficiently represent the spatial relationship among local features. [sent-12, score-0.379]

3 Using concepts of information geometry, proper parameters and a metric from the GMRF can be obtained. [sent-13, score-0.111]

4 Then we define a new image feature by embedding the proper metric into the parameters, which can be directly applied to scalable linear classifiers. [sent-14, score-0.292]

5 The Bag of Words (BoW) [7] is the de facto standard image feature for image categorization. [sent-16, score-0.3]

6 In a BoW, each local feature is assigned to the nearest codeword and an image is represented by a histogram of the quantized features. [sent-17, score-0.293]

7 While it is well established that using a large number of codewords improves classification performance, the drawback is that assigning local features to the nearest codeword is computationally expensive. [sent-19, score-0.228]

8 To overcome this problem, some studies have proposed building an efficient image representation with a smaller number of codewords [22], [24]. [sent-20, score-0.155]

9 Finding an explicit correspondence between local features is another way of categorizing images using a BoW [4], [12], [26], and this approach has been improved by representing a spatial layout of local features as a graph [11], [2], [16], [8]. [sent-21, score-0.569]

10 Explicit correspondences between features have an advantage over a BoW as information loss in the vector quantization can be avoided. [sent-22, score-0.077]

11 Therefore, the aim of our research is to build an efficient image representation without using codewords or explicit correspondences between local features, while still achieving high classification accuracy. [sent-24, score-0.244]

12 Since having a spatial layout of local features is important for an image to have semantic meaning, it is natural that embedding spatial information into an image feature improves classification performance [18], [5], [14], [17]. [sent-25, score-0.742]

13 Several approaches take advantage of this fact, ranging from local (e. [sent-26, score-0.105]

14 Meanwhile, we will focus on the spatial layout of local features, which is the mid-level of spatial information. [sent-31, score-0.351]

15 In this paper, we model an image as a graph representing the spatial layout of local features and define a new image feature based on this graph, where a proper metric is embedded into the feature. [sent-32, score-0.735]

16 Specifically, we model an image as a Gaussian Markov Random Field (GMRF) whose nodes correspond to local features and consider the GMRF parameters as the image feature. [sent-34, score-0.428]

17 Although the GMRF is commonly used for image segmentation, it is rarely used in modern image categorization pipelines despite being an effective way of modeling the spatial layout. [sent-35, score-0.36]

18 In order to extract the repre1 sentative feature vector from the GMRF, the choice of coordinates for the parameters and the metric between them needs to be carefully made. [sent-36, score-0.137]

19 We define the proper coordinates and the metric from an information geometry standpoint [1] and derive an optimal feature vector. [sent-37, score-0.166]

20 The contributions of this study are summarized as follows: 1) A novel and efficient image feature is developed by utilizing the GMRF as a tool for object categorization. [sent-39, score-0.198]

21 3) Using standard image categorization benchmarks, we demonstrate that the proposed feature performs better than the state-of-the-art methods, even though it is not based on mainstream modules (such as codebooks and correspondence between local features). [sent-41, score-0.331]

22 To the best of our knowledge, this is the first image feature for object categorization that utilizes the expectation parameters of the GMRF with its Fisher information metric, and achieves a level of accuracy comparable to that of the codebook and local feature matching approaches. [sent-42, score-0.537]

23 Figure 1: Overview of image feature extraction based on a multivariate GMRF. [sent-45, score-0.192]

24 Initially, local features {x_i ∈ R^d}_{i=1}^M are extracted using a dense sampling strategy (Fig. [sent-47, score-0.182]

25 We then use a multivariate GMRF to model the spatial relationships among local features (Fig. [sent-49, score-0.298]

26 The GMRF is represented as a graph G(V, E), whose vertices V and edges E correspond to local features and the dependent relationships between those features, respectively. [sent-51, score-0.241]

27 Let the vector x be a concatenation of local features in V and let ξ_j be a parameter of the GMRF of an image I_j; then the image I_j can be represented by a probability distribution p(x; ξ_j) of the GMRF (Fig. [sent-52, score-0.408]

28 We consider the parameter ξj of the GMRF to be a feature vector of the image Ij (Fig. [sent-54, score-0.168]

29 However, because the space spanned by parameters of a probability distribution is not a Euclidean space, we have to be very careful when choosing parameters for the probability distribution and the metric among them. [sent-58, score-0.123]

30 We make use of concepts from information geometry [1] and extract proper parameters and a metric from the GMRF. [sent-59, score-0.131]

31 Finally, we define the new image feature by embedding the metric into the extracted parameters to build an image categorization system with a scalable linear classifier. [sent-60, score-0.438]
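
Because a GGV is a fixed-length vector, it drops straight into any scalable linear classifier. The snippet below is a minimal usage sketch (not from the paper): random vectors stand in for extracted GGVs, and scikit-learn's LinearSVC plays the role of the linear classifier; the dimensionality 6896 is simply what the dimensionality formula given later yields for d = 32 and n = 5.

```python
# Hypothetical usage sketch: GGV features fed to a scalable linear classifier.
# Random vectors stand in for real GGVs; LinearSVC is from scikit-learn.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_images, ggv_dim = 200, 6896                 # 6896 = GGV dimension for d = 32, n = 5
X = rng.standard_normal((n_images, ggv_dim))  # stand-in for extracted GGVs
y = rng.integers(0, 10, size=n_images)        # stand-in class labels

clf = LinearSVC(C=1.0)                        # any scalable linear model would do
clf.fit(X, y)
print(clf.predict(X[:5]))
```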

32 Given M local features {x_i ∈ R^d}_{i=1}^M, the aim is to model a probability distribution of the local features representing the spatial layout of the image using the multivariate GMRF G = (V, E). [sent-63, score-0.655]

33 First, a vector x is built by concatenating the local features corresponding to the vertices V of the GMRF. [sent-64, score-0.235]

34 Letting {x_i}_{i=1}^n be the local features that we are focusing on, we obtain the concatenated vector as x = (x_1 · · · x_n) (e. [sent-65, score-0.182]

35 Note that the dimensionality of x is nd and does not depend on the number of local features M , the image size, or the aspect ratio. [sent-69, score-0.378]

36 However, since all results valid for a scalar local feature are also valid for a multivariate local feature, in this section we consider the dimensionality of the local features to be 1 (d = 1) for simplicity. [sent-70, score-0.554]

37 The expectation parameters are obtained as [15]: η_i = μ_i, η_ii = P_ii + μ_i^2, η_jk = P_jk + μ_j μ_k, (i ∈ V, {j, k} ∈ E). (4) [sent-77, score-0.083]

38 The natural and expectation parameters can be transformed into each other [1]. [sent-78, score-0.083]

39 If we take the natural parameters or the expectation parameters as a coordinate system for an exponential family, a flat structure can be realized [1]. [sent-81, score-0.124]

40 Those spaces are similar to a Euclidean space, but we need to be careful that the spaces spanned by the natural or expectation parameters are different from a Euclidean space, as the metrics vary for different parameters. [sent-84, score-0.104]

41 To summarize this section, the natural and expectation parameters are similar and interchangeable through the FIMs. [sent-89, score-0.083]

42 Although it does not matter whether we choose natural or expectation parameters, we use expectation parameters (Eq. [sent-91, score-0.146]

43 (4)) as feature vectors because they can be calculated directly from the mean and covariance of local features. [sent-92, score-0.183]
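
As a concrete illustration of Eq. (4) in the scalar (d = 1) case discussed above, the sketch below assembles the expectation-parameter vector from a sample mean μ and covariance P, keeping second-order cross terms only for the edges {j, k} ∈ E. This is our own reconstruction under the stated assumptions, not the authors' code.

```python
# Sketch of Eq. (4): expectation parameters from the mean and covariance (d = 1).
# mu: mean of the local features at the n vertices; P: their covariance matrix;
# edges: the GMRF edge set {j, k}. All names are illustrative.
import numpy as np

def expectation_parameters(mu, P, edges):
    second = P + np.outer(mu, mu)                  # P + mu mu^T
    eta = list(mu)                                 # eta_i  = mu_i
    eta += [second[i, i] for i in range(len(mu))]  # eta_ii = P_ii + mu_i^2
    eta += [second[j, k] for j, k in edges]        # eta_jk = P_jk + mu_j mu_k
    return np.array(eta)

# Toy example: n = 3 scalar vertices in a chain 0-1-2.
mu = np.array([0.2, -0.1, 0.4])
P = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.2, 0.1],
              [0.0, 0.1, 0.9]])
print(expectation_parameters(mu, P, edges=[(0, 1), (1, 2)]))  # length 3 + 3 + 2 = 8
```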

44 In this section, we describe the calculation of the expectation parameters of the multivariate GMRF. [sent-95, score-0.124]

45 While a graph having more neighbors is obviously able to represent richer spatial information, the compact structure is preferable for efficiency. [sent-101, score-0.119]

46 2(c), which represents the vertical and horizontal relationships among local features, and Fig. [sent-103, score-0.138]

47 Figure 2: Structures of the GMRF. [sent-105, score-0.432]

48 Next, we show a method for estimating the expectation parameters of each image. [sent-106, score-0.083]

49 (4) in a multivariate case can be determined by calculating the local auto-correlations of local features. [sent-108, score-0.234]

50 Let x(r_k) ∈ R^d be the local feature at a reference point r_k and let a_i and a_j be the displacement vectors, which are defined by the structure of the GMRF. [sent-112, score-0.252]

51 Then, the local auto-correlation matrices are obtained as: C_{i,j} = (1/N_J) Σ_{k∈J} x(r_k + a_i) x(r_k + a_j)^T, where N_J is the number of local features in the image region J. [sent-113, score-0.4]
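
A plausible way to accumulate these auto-correlations from a densely sampled feature map is sketched below; it assumes the local features sit on a regular grid with displacements given in grid steps, and it is an illustration rather than the authors' implementation.

```python
# Sketch: one local auto-correlation matrix C_{i,j} from a dense feature grid.
# feats has shape (H, W, d): one d-dimensional local feature per grid point.
import numpy as np

def local_autocorrelation(feats, a_i, a_j):
    H, W, d = feats.shape
    (dy_i, dx_i), (dy_j, dx_j) = a_i, a_j
    # keep only reference points r_k for which both shifted points stay in the image
    ys = range(max(0, -dy_i, -dy_j), min(H, H - dy_i, H - dy_j))
    xs = range(max(0, -dx_i, -dx_j), min(W, W - dx_i, W - dx_j))
    C, count = np.zeros((d, d)), 0
    for y in ys:
        for x in xs:
            xi = feats[y + dy_i, x + dx_i]
            xj = feats[y + dy_j, x + dx_j]
            C += np.outer(xi, xj)          # x(r_k + a_i) x(r_k + a_j)^T
            count += 1
    return C / max(count, 1)               # average over the N_J reference points
```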

52 Let x_k = (x(r_k)^T x(r_k + a_1)^T x(r_k + a_2)^T)^T be a vector concatenating the local features in the vertices at the reference point r_k; then P + μμ^T is calculated to be: P + μμ^T = (1/N_J) Σ_{k∈J} x_k x_k^T = [C_{0,0} C_{0,1} C_{0,2}; C_{1,0} C_{1,1} C_{1,2}; C_{2,0} C_{2,1} C_{2,2}]. (5) [sent-115, score-0.312]

53 The expectation parameters of the GMRF depicted in Fig. [sent-116, score-0.083]

54 Note that C_{1,2} is omitted, because there is no edge between the vertices at r_k + a_1 and r_k + a_2. [sent-118, score-0.14]

55 The dimensionality of η is nd + n(d + 1)d/2 + (n − 1)d^2, where d is the dimensionality of the local feature. [sent-121, score-0.271]
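
For a quick sanity check of this formula (our own arithmetic, not figures quoted from the paper), the settings used in the experiments give feature sizes in the low thousands before any spatial pyramid is applied:

```python
# Dimensionality of eta: nd + n(d+1)d/2 + (n-1)d^2 (illustrative check only).
def ggv_dim(n, d):
    return n * d + n * (d + 1) * d // 2 + (n - 1) * d * d

print(ggv_dim(3, 32))  #  96 + 1584 + 2048 = 3728  (n = 3, d = 32)
print(ggv_dim(5, 32))  # 160 + 2640 + 4096 = 6896  (n = 5, d = 32)
```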

56 2(e)), if we have enough local features, the means {μ_i}_{i=0}^{n−1} and covariance matrices {C_{i,i}}_{i=0}^{n−1} of local features in the region J become the vector μ_0 and the matrix C_{0,0}, respectively. [sent-124, score-0.287]

57 We now derive a metric between the expectation parameters [1]. [sent-138, score-0.145]

58 Thus, the FIM is a proper metric for the feature vectors (the expectation parameters) obtained from the GMRF. [sent-144, score-0.209]

59 First, we build the concatenated vector as x = (x_1 · · · x_n), where each x_i corresponds to the local feature of vertex i. [sent-149, score-0.16]

60 The dimensionality of the local features is 2 (x = (x_1, x_2), y = (y_1, y_2), z = (z_1, z_2)). [sent-151, score-0.265]

61 A vector concatenating the local features in V is v = (x_1, x_2, y_1, y_2, z_1, z_2). [sent-152, score-0.203]

62 Using μ and J, the Fisher information matrix of the full Gaussian family can be calculated as in (b), whose rows and columns correspond to the elements of the expectation parameters. [sent-154, score-0.086]

63 In order to embed the proper metric into the expectation parameters, we multiply G*(η_c)^{1/2} by η: ζ = ( F*_{G,G}(η_c) − F*_{G,\G}(η_c) F*_{\G,\G}(η_c)^{−1} F*_{\G,G}(η_c) )^{1/2} η. [sent-169, score-0.174]
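
The bracketed term is the Schur complement of the full-family FIM with respect to the parameters dropped from the graph, i.e. G*(η_c). Below is a hedged NumPy sketch of this whitening step; it assumes the FIM F has already been estimated (e.g. from randomly sampled local features, as the next sentence notes) and that its rows and columns are ordered so the retained and dropped parameters can be indexed separately. scipy.linalg.sqrtm supplies the matrix square root.

```python
# Sketch: zeta = (G*(eta_c))^{1/2} eta via the Schur complement of the FIM.
# F: full Fisher information matrix; idx_G / idx_rest: indices of the expectation
# parameters kept in the graph G and of the remaining ones; eta_G: GMRF parameters.
import numpy as np
from scipy.linalg import sqrtm

def embed_metric(F, eta_G, idx_G, idx_rest):
    F_GG = F[np.ix_(idx_G, idx_G)]
    F_Gr = F[np.ix_(idx_G, idx_rest)]
    F_rr = F[np.ix_(idx_rest, idx_rest)]
    G_star = F_GG - F_Gr @ np.linalg.solve(F_rr, F_Gr.T)   # Schur complement G*(eta_c)
    return np.real(sqrtm(G_star)) @ eta_G                   # zeta = G*^{1/2} eta
```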

64 In practice, since using all the training data to estimate the FIM is infeasible, we use a subset of local features randomly sampled from the training data. [sent-176, score-0.182]

65 Input: An image region J and the Fisher information matrix of the GMRF, G*(η_c). Output: GGV ζ. [sent-179, score-0.113]

66 1. Calculate local auto-correlations of local features: μ_i = (1/N_J) Σ_{k∈J} x(r_k + a_i), C_{i,j} = (1/N_J) Σ_{k∈J} x(r_k + a_i) x(r_k + a_j)^T. [sent-180, score-0.269]

67 2. Embed the Fisher information metric into the expectation parameters: ζ = (G*(η_c))^{1/2} η. We tested our method on the standard object and scene datasets (Caltech101, Caltech256, and 15-Scenes). [sent-182, score-0.178]
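
Putting the two steps together, a self-contained sketch of GGV extraction for one image region is given below. It fixes a simple three-vertex structure (reference point plus right and lower neighbours, with no edge between the two neighbours), vectorizes the auto-correlation computation sketched earlier, and leaves the metric embedding optional; all conventions (grid layout, displacement units, parameter ordering) are our assumptions, not the paper's exact implementation.

```python
# Illustrative two-step GGV extraction for an n = 3 GMRF structure.
import numpy as np

def extract_ggv(feats, G_star_sqrt=None):
    """feats: (H, W, d) dense local features; G_star_sqrt: optional (G*(eta_c))^{1/2}."""
    H, W, d = feats.shape
    disp = [(0, 0), (0, 1), (1, 0)]    # displacement vectors a_0, a_1, a_2 (row, col)
    edges = [(0, 1), (0, 2)]           # no (1, 2) edge, so C_{1,2} is omitted
    # One shifted view of the feature map per vertex, over shared reference points.
    views = [feats[dy:dy + H - 1, dx:dx + W - 1].reshape(-1, d) for dy, dx in disp]
    N = views[0].shape[0]              # number of reference points N_J
    mus = [v.mean(axis=0) for v in views]
    corr = {(i, j): views[i].T @ views[j] / N   # C_{i,j} = (1/N_J) sum x_i x_j^T
            for i in range(3) for j in range(3)}
    # Step 1: assemble the expectation parameters (multivariate analogue of Eq. (4)).
    eta = []
    for i in range(3):
        eta.extend(mus[i])                               # first-order terms
    for i in range(3):
        eta.extend(corr[(i, i)][np.triu_indices(d)])     # per-vertex second moments
    for i, j in edges:
        eta.extend(corr[(i, j)].ravel())                 # cross terms for graph edges only
    eta = np.asarray(eta)
    # Step 2: embed the Fisher information metric, zeta = (G*(eta_c))^{1/2} eta.
    return eta if G_star_sqrt is None else G_star_sqrt @ eta

feats = np.random.default_rng(0).standard_normal((40, 60, 8))   # toy dense features
print(extract_ggv(feats).shape)   # nd + n(d+1)d/2 + (n-1)d^2 = 24 + 108 + 128 = (260,)
```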

68 2(c) (GGV, n = 3), which models a horizontal and vertical spatial layout of the local features. [sent-190, score-0.292]

69 2(d) (GGV, n = 5), which adds diagonal spatial layouts of the features to Fig. [sent-192, score-0.169]

70 To embed the global spatial information, we used the spatial pyramid representation with a 1 × 1 + 2 × 2 + 3 × 3 pyramid structure. [sent-197, score-0.293]
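
One plausible reading of this pyramid (again our sketch, not necessarily the paper's exact pooling scheme) is to extract a GGV per pyramid cell and concatenate the results; the snippet below reuses the extract_ggv sketch shown above.

```python
# Sketch: spatial pyramid of GGVs with a 1x1 + 2x2 + 3x3 grid of cells.
import numpy as np

def spatial_pyramid_ggv(feats, levels=(1, 2, 3)):
    H, W, _ = feats.shape
    parts = []
    for g in levels:                                   # g x g grid of cells
        ys = np.linspace(0, H, g + 1, dtype=int)
        xs = np.linspace(0, W, g + 1, dtype=int)
        for r in range(g):
            for c in range(g):
                cell = feats[ys[r]:ys[r + 1], xs[c]:xs[c + 1]]
                parts.append(extract_ggv(cell))        # per-cell GGV (sketch above)
    return np.concatenate(parts)                       # 1 + 4 + 9 = 14 cell descriptors

feats = np.random.default_rng(1).standard_normal((40, 60, 8))
print(spatial_pyramid_ggv(feats).shape)                # 14 * 260 = (3640,)
```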

71 Table 1: The relationships between GLC, LAC, GG, and GGV in terms of spatial information and Fisher information metrics. [sent-198, score-0.092]

72 Method (Spatial information / Fisher information metric): GLC (–/–), LAC (√/–), GG (–/√), GGV (proposed) (√/√). In the second experiment, we compared GGVs with the Improved Fisher kernel (IFK) [24], [25], which is the best image representation available at the time of writing. [sent-199, score-0.175]

73 In this experiment, we used the spatial pyramid representation with a 1 × 1 + 2 × 2 + 3 × 1 structure. [sent-200, score-0.128]

74 For all datasets, SIFT features were densely sampled and described over 16 × 16 patches. [sent-203, score-0.077]

75 As the aforementioned features depend on the dimensionality of the local feature, we reduced its dimensionality using PCA and compared performance as a function of its new dimensionality. [sent-205, score-0.348]

76 Before comparing GGVs with the baselines, we evaluate the sensitivity to the sampling step of the local features. [sent-212, score-0.105]

77 The sampling step is one of the important parameters of GGV, because GGV calculates auto-correlations of the neighboring local features. [sent-213, score-0.125]

78 In this preliminary experiment, we fix the number of vertices to 5 (n = 5) and the dimensionality of the local feature to 32. [sent-214, score-0.275]

79 Therefore, in the following experiments, we use a 6-pixel sampling step for local feature extraction. [sent-224, score-0.186]

80 Figure 4 (left) shows the classification accuracies as a function of the dimensionality of the local features. [sent-225, score-0.225]

81 A large dimensionality yielded better performance, and the proposed method (GGV) outperformed the other methods (GLC, LAC, and GG). [sent-226, score-0.099]

82 By comparing GGV with LAC, and GG with GLC, it is clear that embedding the Fisher information metric improved the classification accuracy significantly. [sent-227, score-0.095]

83 By comparing GGV with GG, as well as LAC with GLC, it can also be seen that embedding the spatial layout of local features also improved the accuracy. [sent-228, score-0.369]

84 4 (right) shows the classification accuracy as a function of the dimensionality of the image features which are converted from the results shown in Fig. [sent-238, score-0.273]

85 We see that GGVs achieved higher accuracy for a lower dimensionality of image features. [sent-240, score-0.196]

86 3 % when the dimensionality of the local feature is 32 and the number of vertices is 5. [sent-246, score-0.275]

87 5 (center) and (right) show comparisons of the L2 normalized GGVs and IFKs using the Caltech256 dataset with respect to the dimensionality of local features and image features, respectively. [sent-256, score-0.378]

88 It is known that using multi-scale local features improves classification accuracies (e. [sent-264, score-0.219]

89 In the second experiment, the results with respect to the dimensionality of local features and image features are shown in Figs. [sent-280, score-0.455]

90 As the leading method, the spatial Fisher kernel [17] reported the highest score (88. [sent-285, score-0.092]

91 In this paper, we proposed an efficient image feature called a Graphical Gaussian Vector, which uses neither a codebook nor local feature matching. [sent-288, score-0.377]

92 In the proposed method, spatial information about local features and the Fisher information metric are embedded into a feature by modeling the image as the Gaussian Markov Random Field (GMRF). [sent-289, score-0.52]

93 The proposed image feature calculates the expectation parameters of the GMRF simply and effectively while maintaining a high classification rate. [sent-291, score-0.267]

94 Improving local descriptors by embedding global and local spatial information. [sent-385, score-0.373]

95 Asymmetric region-to-image matching for comparing images with generic object categories. [sent-394, score-0.075]

96 Modeling spatial layout with fisher vectors for image categorization. [sent-400, score-0.267]

97 Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. [sent-406, score-0.096]

98 Global gaussian approach for scene categorization using information geometry. [sent-418, score-0.091]

99 Linear spatial pyramid matching using sparse coding for image classification. [sent-469, score-0.262]

100 Image classification using super-vector coding of local image descriptors. [sent-477, score-0.218]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ggv', 0.683), ('ifk', 0.41), ('gmrf', 0.283), ('fim', 0.174), ('glc', 0.161), ('lac', 0.149), ('gg', 0.141), ('ggvs', 0.124), ('image', 0.113), ('local', 0.105), ('mgmrf', 0.099), ('spatial', 0.092), ('dimensionality', 0.083), ('fisher', 0.079), ('features', 0.077), ('classification', 0.075), ('pq', 0.067), ('expectation', 0.063), ('layout', 0.062), ('harada', 0.062), ('metric', 0.062), ('feature', 0.055), ('fg', 0.054), ('rk', 0.054), ('bow', 0.047), ('sift', 0.046), ('cvpr', 0.043), ('norm', 0.042), ('categorization', 0.042), ('classi', 0.04), ('accuracies', 0.037), ('ifks', 0.037), ('jpr', 0.037), ('nakayama', 0.037), ('pyramid', 0.036), ('pp', 0.035), ('calculation', 0.035), ('nj', 0.034), ('embedding', 0.033), ('codebook', 0.033), ('ij', 0.033), ('vertices', 0.032), ('auto', 0.03), ('object', 0.03), ('scored', 0.029), ('proper', 0.029), ('field', 0.028), ('graph', 0.027), ('perronnin', 0.027), ('rr', 0.027), ('gaussian', 0.026), ('eccv', 0.026), ('codewords', 0.026), ('pixels', 0.026), ('center', 0.025), ('csurka', 0.025), ('fgg', 0.025), ('hongo', 0.025), ('jkp', 0.025), ('jpi', 0.025), ('multivariate', 0.024), ('images', 0.024), ('calculated', 0.023), ('scene', 0.023), ('tokyo', 0.023), ('fifteen', 0.022), ('dance', 0.022), ('spanned', 0.021), ('matching', 0.021), ('ai', 0.021), ('concatenating', 0.021), ('descriptors', 0.021), ('graphical', 0.021), ('coordinate', 0.021), ('jij', 0.02), ('nchez', 0.02), ('submatrices', 0.02), ('codeword', 0.02), ('vocabularies', 0.02), ('embed', 0.02), ('correlation', 0.02), ('fergus', 0.02), ('parameters', 0.02), ('geometry', 0.02), ('jk', 0.019), ('facto', 0.019), ('gij', 0.018), ('visual', 0.018), ('cation', 0.018), ('calculations', 0.017), ('llc', 0.017), ('sher', 0.017), ('gmms', 0.017), ('aj', 0.017), ('global', 0.017), ('vertical', 0.017), ('horizontal', 0.016), ('rate', 0.016), ('bags', 0.016), ('proposed', 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 146 nips-2012-Graphical Gaussian Vector for Image Categorization

Author: Tatsuya Harada, Yasuo Kuniyoshi

Abstract: This paper proposes a novel image representation called a Graphical Gaussian Vector (GGV), which is a counterpart of the codebook and local feature matching approaches. We model the distribution of local features as a Gaussian Markov Random Field (GMRF) which can efficiently represent the spatial relationship among local features. Using concepts of information geometry, proper parameters and a metric from the GMRF can be obtained. Then we define a new image feature by embedding the proper metric into the parameters, which can be directly applied to scalable linear classifiers. We show that the GGV obtains better performance over the state-of-the-art methods in the standard object recognition datasets and comparable performance in the scene dataset. 1

2 0.066536166 92 nips-2012-Deep Representations and Codes for Image Auto-Annotation

Author: Ryan Kiros, Csaba Szepesvári

Abstract: The task of image auto-annotation, namely assigning a set of relevant tags to an image, is challenging due to the size and variability of tag vocabularies. Consequently, most existing algorithms focus on tag assignment and fix an often large number of hand-crafted features to describe image characteristics. In this paper we introduce a hierarchical model for learning representations of standard sized color images from the pixel level, removing the need for engineered feature representations and subsequent feature selection for annotation. We benchmark our model on the STL-10 recognition dataset, achieving state-of-the-art performance. When our features are combined with TagProp (Guillaumin et al.), we compete with or outperform existing annotation approaches that use over a dozen distinct handcrafted image descriptors. Furthermore, using 256-bit codes and Hamming distance for training TagProp, we exchange only a small reduction in performance for efficient storage and fast comparisons. Self-taught learning is used in all of our experiments and deeper architectures always outperform shallow ones. 1

3 0.0623735 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

Author: Tomasz Trzcinski, Mario Christoudias, Vincent Lepetit, Pascal Fua

Abstract: In this paper we apply boosting to learn complex non-linear local visual feature representations, drawing inspiration from its successful application to visual object detection. The main goal of local feature descriptors is to distinctively represent a salient image region while remaining invariant to viewpoint and illumination changes. This representation can be improved using machine learning, however, past approaches have been mostly limited to learning linear feature mappings in either the original input or a kernelized input feature space. While kernelized methods have proven somewhat effective for learning non-linear local feature descriptors, they rely heavily on the choice of an appropriate kernel function whose selection is often difficult and non-intuitive. We propose to use the boosting-trick to obtain a non-linear mapping of the input to a high-dimensional feature space. The non-linear feature mapping obtained with the boosting-trick is highly intuitive. We employ gradient-based weak learners resulting in a learned descriptor that closely resembles the well-known SIFT. As demonstrated in our experiments, the resulting descriptor can be learned directly from intensity patches achieving state-of-the-art performance. 1

4 0.057768479 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

Author: Shulin Yang, Liefeng Bo, Jue Wang, Linda G. Shapiro

Abstract: Fine-grained recognition refers to a subordinate level of recognition, such as recognizing different species of animals and plants. It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape and structure shared cross different categories, and the differences are in the details of object parts. We suggest that the key to identifying the fine-grained differences lies in finding the right alignment of image regions that contain the same object parts. We propose a template model for the purpose, which captures common shape patterns of object parts, as well as the cooccurrence relation of the shape patterns. Once the image regions are aligned, extracted features are used for classification. Learning of the template model is efficient, and the recognition results we achieve significantly outperform the stateof-the-art algorithms. 1

5 0.05553114 265 nips-2012-Parametric Local Metric Learning for Nearest Neighbor Classification

Author: Jun Wang, Alexandros Kalousis, Adam Woznica

Abstract: We study the problem of learning local metrics for nearest neighbor classification. Most previous works on local metric learning learn a number of local unrelated metrics. While this ”independence” approach delivers an increased flexibility its downside is the considerable risk of overfitting. We present a new parametric local metric learning method in which we learn a smooth metric matrix function over the data manifold. Using an approximation error bound of the metric matrix function we learn local metrics as linear combinations of basis metrics defined on anchor points over different regions of the instance space. We constrain the metric matrix function by imposing on the linear combinations manifold regularization which makes the learned metric matrix function vary smoothly along the geodesics of the data manifold. Our metric learning method has excellent performance both in terms of predictive power and scalability. We experimented with several largescale classification problems, tens of thousands of instances, and compared it with several state of the art metric learning methods, both global and local, as well as to SVM with automatic kernel selection, all of which it outperforms in a significant manner. 1

6 0.052864932 210 nips-2012-Memorability of Image Regions

7 0.052804664 9 nips-2012-A Geometric take on Metric Learning

8 0.049674407 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity

9 0.049561922 344 nips-2012-Timely Object Recognition

10 0.048088271 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

11 0.047144454 7 nips-2012-A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

12 0.046541356 197 nips-2012-Learning with Recursive Perceptual Representations

13 0.044354141 202 nips-2012-Locally Uniform Comparison Image Descriptor

14 0.044016685 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection

15 0.042947885 307 nips-2012-Semi-Crowdsourced Clustering: Generalizing Crowd Labeling by Robust Distance Metric Learning

16 0.042480208 147 nips-2012-Graphical Models via Generalized Linear Models

17 0.041899312 318 nips-2012-Sparse Approximate Manifolds for Differential Geometric MCMC

18 0.041338395 200 nips-2012-Local Supervised Learning through Space Partitioning

19 0.040891059 361 nips-2012-Volume Regularization for Binary Classification

20 0.038681429 40 nips-2012-Analyzing 3D Objects in Cluttered Images


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.11), (1, 0.033), (2, -0.072), (3, -0.044), (4, 0.068), (5, -0.032), (6, -0.004), (7, -0.007), (8, 0.016), (9, -0.022), (10, 0.014), (11, -0.013), (12, 0.032), (13, -0.006), (14, -0.011), (15, 0.026), (16, 0.003), (17, -0.018), (18, 0.032), (19, 0.012), (20, 0.014), (21, -0.074), (22, -0.001), (23, -0.007), (24, 0.003), (25, -0.006), (26, -0.002), (27, 0.024), (28, 0.004), (29, 0.026), (30, -0.043), (31, 0.036), (32, 0.001), (33, -0.065), (34, 0.046), (35, 0.061), (36, 0.021), (37, 0.002), (38, 0.006), (39, 0.004), (40, 0.061), (41, -0.035), (42, -0.026), (43, -0.035), (44, 0.068), (45, -0.0), (46, 0.033), (47, 0.006), (48, 0.114), (49, -0.013)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.89260143 146 nips-2012-Graphical Gaussian Vector for Image Categorization

Author: Tatsuya Harada, Yasuo Kuniyoshi

Abstract: This paper proposes a novel image representation called a Graphical Gaussian Vector (GGV), which is a counterpart of the codebook and local feature matching approaches. We model the distribution of local features as a Gaussian Markov Random Field (GMRF) which can efficiently represent the spatial relationship among local features. Using concepts of information geometry, proper parameters and a metric from the GMRF can be obtained. Then we define a new image feature by embedding the proper metric into the parameters, which can be directly applied to scalable linear classifiers. We show that the GGV obtains better performance over the state-of-the-art methods in the standard object recognition datasets and comparable performance in the scene dataset. 1

2 0.75956106 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

Author: Tomasz Trzcinski, Mario Christoudias, Vincent Lepetit, Pascal Fua

Abstract: In this paper we apply boosting to learn complex non-linear local visual feature representations, drawing inspiration from its successful application to visual object detection. The main goal of local feature descriptors is to distinctively represent a salient image region while remaining invariant to viewpoint and illumination changes. This representation can be improved using machine learning, however, past approaches have been mostly limited to learning linear feature mappings in either the original input or a kernelized input feature space. While kernelized methods have proven somewhat effective for learning non-linear local feature descriptors, they rely heavily on the choice of an appropriate kernel function whose selection is often difficult and non-intuitive. We propose to use the boosting-trick to obtain a non-linear mapping of the input to a high-dimensional feature space. The non-linear feature mapping obtained with the boosting-trick is highly intuitive. We employ gradient-based weak learners resulting in a learned descriptor that closely resembles the well-known SIFT. As demonstrated in our experiments, the resulting descriptor can be learned directly from intensity patches achieving state-of-the-art performance. 1

3 0.71313286 210 nips-2012-Memorability of Image Regions

Author: Aditya Khosla, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Abstract: While long term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Recent works have shown that image memorability is an intrinsic property of an image that can be reliably estimated using state-of-the-art image features and machine learning algorithms. However, the class of features and image information that is forgotten has not been explored yet. In this work, we propose a probabilistic framework that models how and which local regions from an image may be forgotten using a data-driven approach that combines local and global images features. The model automatically discovers memorability maps of individual images without any human annotation. We incorporate multiple image region attributes in our algorithm, leading to improved memorability prediction of images as compared to previous works. 1

4 0.6880545 202 nips-2012-Locally Uniform Comparison Image Descriptor

Author: Andrew Ziegler, Eric Christiansen, David Kriegman, Serge J. Belongie

Abstract: Keypoint matching between pairs of images using popular descriptors like SIFT or a faster variant called SURF is at the heart of many computer vision algorithms including recognition, mosaicing, and structure from motion. However, SIFT and SURF do not perform well for real-time or mobile applications. As an alternative very fast binary descriptors like BRIEF and related methods use pairwise comparisons of pixel intensities in an image patch. We present an analysis of BRIEF and related approaches revealing that they are hashing schemes on the ordinal correlation metric Kendall’s tau. Here, we introduce Locally Uniform Comparison Image Descriptor (LUCID), a simple description method based on linear time permutation distances between the ordering of RGB values of two image patches. LUCID is computable in linear time with respect to the number of pixels and does not require floating point computation. 1

5 0.63672328 92 nips-2012-Deep Representations and Codes for Image Auto-Annotation

Author: Ryan Kiros, Csaba Szepesvári

Abstract: The task of image auto-annotation, namely assigning a set of relevant tags to an image, is challenging due to the size and variability of tag vocabularies. Consequently, most existing algorithms focus on tag assignment and fix an often large number of hand-crafted features to describe image characteristics. In this paper we introduce a hierarchical model for learning representations of standard sized color images from the pixel level, removing the need for engineered feature representations and subsequent feature selection for annotation. We benchmark our model on the STL-10 recognition dataset, achieving state-of-the-art performance. When our features are combined with TagProp (Guillaumin et al.), we compete with or outperform existing annotation approaches that use over a dozen distinct handcrafted image descriptors. Furthermore, using 256-bit codes and Hamming distance for training TagProp, we exchange only a small reduction in performance for efficient storage and fast comparisons. Self-taught learning is used in all of our experiments and deeper architectures always outperform shallow ones. 1

6 0.63470578 185 nips-2012-Learning about Canonical Views from Internet Image Collections

7 0.60015821 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

8 0.59755796 91 nips-2012-Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images

9 0.58495557 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity

10 0.5638113 159 nips-2012-Image Denoising and Inpainting with Deep Neural Networks

11 0.55179274 87 nips-2012-Convolutional-Recursive Deep Learning for 3D Object Classification

12 0.54665929 158 nips-2012-ImageNet Classification with Deep Convolutional Neural Networks

13 0.54056036 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection

14 0.53765863 8 nips-2012-A Generative Model for Parts-based Object Segmentation

15 0.52721435 303 nips-2012-Searching for objects driven by context

16 0.50343001 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

17 0.49765283 28 nips-2012-A systematic approach to extracting semantic information from functional MRI data

18 0.49114338 193 nips-2012-Learning to Align from Scratch

19 0.48680705 235 nips-2012-Natural Images, Gaussian Mixtures and Dead Leaves

20 0.46835163 306 nips-2012-Semantic Kernel Forests from Multiple Taxonomies


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.024), (21, 0.02), (38, 0.098), (42, 0.05), (54, 0.026), (55, 0.032), (74, 0.101), (76, 0.108), (80, 0.045), (83, 0.333), (92, 0.042)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.68450755 146 nips-2012-Graphical Gaussian Vector for Image Categorization

Author: Tatsuya Harada, Yasuo Kuniyoshi

Abstract: This paper proposes a novel image representation called a Graphical Gaussian Vector (GGV), which is a counterpart of the codebook and local feature matching approaches. We model the distribution of local features as a Gaussian Markov Random Field (GMRF) which can efficiently represent the spatial relationship among local features. Using concepts of information geometry, proper parameters and a metric from the GMRF can be obtained. Then we define a new image feature by embedding the proper metric into the parameters, which can be directly applied to scalable linear classifiers. We show that the GGV obtains better performance over the state-of-the-art methods in the standard object recognition datasets and comparable performance in the scene dataset. 1

2 0.56367505 335 nips-2012-The Bethe Partition Function of Log-supermodular Graphical Models

Author: Nicholas Ruozzi

Abstract: Sudderth, Wainwright, and Willsky conjectured that the Bethe approximation corresponding to any fixed point of the belief propagation algorithm over an attractive, pairwise binary graphical model provides a lower bound on the true partition function. In this work, we resolve this conjecture in the affirmative by demonstrating that, for any graphical model with binary variables whose potential functions (not necessarily pairwise) are all log-supermodular, the Bethe partition function always lower bounds the true partition function. The proof of this result follows from a new variant of the “four functions” theorem that may be of independent interest. 1

3 0.51339018 121 nips-2012-Expectation Propagation in Gaussian Process Dynamical Systems

Author: Marc Deisenroth, Shakir Mohamed

Abstract: Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an analysis. We present a new approximate message-passing algorithm for Bayesian state estimation and inference in Gaussian process dynamical systems, a nonparametric probabilistic generalization of commonly used state-space models. We derive our message-passing algorithm using Expectation Propagation and provide a unifying perspective on message passing in general state-space models. We show that existing Gaussian filters and smoothers appear as special cases within our inference framework, and that these existing approaches can be improved upon using iterated message passing. Using both synthetic and real-world data, we demonstrate that iterated message passing can improve inference in a wide range of tasks in Bayesian state estimation, thus leading to improved predictions and more effective decision making. 1

4 0.49584591 13 nips-2012-A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

Author: Pedro Ortega, Jordi Grau-moya, Tim Genewein, David Balduzzi, Daniel Braun

Abstract: We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly model the distribution over extrema. To this end, we devise a non-parametric conjugate prior based on a kernel regressor. The resulting posterior distribution directly captures the uncertainty over the maximum of the unknown function. Given t observations of the function, the posterior can be evaluated efficiently in time O(t2 ) up to a multiplicative constant. Finally, we show how to apply our model to optimize a noisy, non-convex, high-dimensional objective function.

5 0.49133945 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

Author: Tomasz Trzcinski, Mario Christoudias, Vincent Lepetit, Pascal Fua

Abstract: In this paper we apply boosting to learn complex non-linear local visual feature representations, drawing inspiration from its successful application to visual object detection. The main goal of local feature descriptors is to distinctively represent a salient image region while remaining invariant to viewpoint and illumination changes. This representation can be improved using machine learning, however, past approaches have been mostly limited to learning linear feature mappings in either the original input or a kernelized input feature space. While kernelized methods have proven somewhat effective for learning non-linear local feature descriptors, they rely heavily on the choice of an appropriate kernel function whose selection is often difficult and non-intuitive. We propose to use the boosting-trick to obtain a non-linear mapping of the input to a high-dimensional feature space. The non-linear feature mapping obtained with the boosting-trick is highly intuitive. We employ gradient-based weak learners resulting in a learned descriptor that closely resembles the well-known SIFT. As demonstrated in our experiments, the resulting descriptor can be learned directly from intensity patches achieving state-of-the-art performance. 1

6 0.48942924 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries

7 0.48799741 210 nips-2012-Memorability of Image Regions

8 0.48656294 235 nips-2012-Natural Images, Gaussian Mixtures and Dead Leaves

9 0.4837245 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

10 0.48280749 201 nips-2012-Localizing 3D cuboids in single-view images

11 0.48232588 185 nips-2012-Learning about Canonical Views from Internet Image Collections

12 0.48124915 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

13 0.48122543 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

14 0.48038319 339 nips-2012-The Time-Marginalized Coalescent Prior for Hierarchical Clustering

15 0.47925988 8 nips-2012-A Generative Model for Parts-based Object Segmentation

16 0.47900385 303 nips-2012-Searching for objects driven by context

17 0.47894847 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection

18 0.47831589 260 nips-2012-Online Sum-Product Computation Over Trees

19 0.47744319 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

20 0.47608832 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization