cvpr cvpr2013 cvpr2013-369 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Laurent Sifre, Stéphane Mallat
Abstract: An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. State-of-the-art classification results are obtained over texture databases with uncontrolled viewing conditions.
Reference: text
sentIndex sentText sentNum sentScore
1 Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination Laurent Sifre CMAP, Ecole Polytechnique 91128 Palaiseau Abstract An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. [sent-1, score-0.327]
2 A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. [sent-2, score-1.001]
3 It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. [sent-3, score-0.632]
4 Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. [sent-4, score-0.915]
5 Hierarchical cascades of invariants [2, 3] have been studied to build affine invariant image representations. [sent-9, score-0.648]
6 Learning is not necessary to build affine invariants with a hierarchical cascade, but its mathematical and algorithmic implementation raises difficulties. [sent-11, score-0.434]
7 How can we factorize affine invariants into simpler invariants computed over smaller subgroups? [sent-12, score-0.755]
8 How can we compute stable and informative invariants over any given group? [sent-13, score-0.416]
9 This paper shows that stable and informative affine invariant representations can be obtained with a scattering operator defined on the translation, rotation and scaling groups. [sent-14, score-1.147]
10 It is implemented by a deep convolution network with wavelet filters and modulus non-linearities. [sent-15, score-0.51]
11 Section 2 shows that within image patches, translation and rotation invariants must be computed together to retain joint information on spatial positions and orientations. [sent-17, score-0.537]
12 A joint scattering invariant on the roto-translation group requires building a wavelet transform on this non-commutative group, which involves a different type of convolution described in Section 3. [sent-18, score-1.196]
13 A scaling invariant may however be computed separately with a scale-space averaging across image patches. [sent-19, score-0.306]
14 Thanks to wavelet localization, scattering transforms compute invariants which are stable to deformations [9]. [sent-21, score-1.428]
15 As a result, small shearing and deformations are linearized in the scattering domain. [sent-22, score-0.861]
16 Section 6 shows that scattering representations give state-of-the-art texture classification results on KTH-TIPS [17], UIUC [18] and UMD [19] databases. [sent-26, score-0.694]
17 1 analyzes the construction of affine invariants as a cascade of separable or joint invariants on smaller groups. [sent-33, score-0.97]
18 The hierarchical architecture of affine invariant scattering representations is described in Section 2. [sent-34, score-0.948]
19 Separable Versus Joint Invariants The affine group can be written as a product of the translation, rotation, scaling and shearing groups. [sent-38, score-0.321]
20 Affine invariant representations can be computed as a separable product of invariants on each of these smaller subgroups. [sent-39, score-0.593]
21 However, we show that such separable invariant products may lose important information. [sent-40, score-0.309]
22 This is why two-dimensional translation invariant representations are not computed by cascading invariants to horizontal and vertical translations. [sent-71, score-0.575]
23 A separable product of translation and rotation invariant operators can represent the relative positions of the vertical patterns, and the relative positions of the horizontal patterns, up to global translations. [sent-74, score-0.398]
24 Such a separable invariant thus cannot discriminate the two textures of Figure 1. [sent-77, score-0.311]
25 Figure 1: The left and right textures are not discriminated by a separable invariant along rotations and translations, but can be discriminated by a joint roto-translation invariant. [sent-78, score-0.426]
26 Computing a joint invariant between rotations and translations also means taking into account the joint relative positions and orientations of image structures, so that the textures of Figure 1 can be discriminated. [sent-80, score-0.471]
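The argument above can be checked on a toy example. The sketch below is not a scattering transform. It assumes a simplified commutative setting in which a texture is a 0/1 map `f[u][t]` over a cyclic 1D position `u` and a cyclic orientation index `t`, and it uses autocorrelations as stand-in invariants. The names `texture`, `separable_invariant` and `joint_invariant` are illustrative and do not come from the paper.

```python
def texture(n, k, atoms):
    """Build an n-by-k map with a 1.0 at each (position, orientation) atom."""
    f = [[0.0] * k for _ in range(n)]
    for u, t in atoms:
        f[u][t] = 1.0
    return f


def separable_invariant(f):
    """Autocorrelation along positions within each orientation channel,
    then summed over orientations (channels never interact)."""
    n, k = len(f), len(f[0])
    return [sum(f[u][t] * f[(u + tau) % n][t]
                for u in range(n) for t in range(k))
            for tau in range(n)]


def joint_invariant(f):
    """Autocorrelation over joint (position, orientation) shifts."""
    n, k = len(f), len(f[0])
    return [[sum(f[u][t] * f[(u + tau) % n][(t + d) % k]
                 for u in range(n) for t in range(k))
             for d in range(k)]
            for tau in range(n)]


# Two toy textures: same atoms per orientation, different relative placement.
a = texture(8, 2, [(0, 0), (1, 1)])
b = texture(8, 2, [(0, 0), (3, 1)])

same_separable = separable_invariant(a) == separable_invariant(b)
same_joint = joint_invariant(a) == joint_invariant(b)
print(same_separable, same_joint)
```

In this toy setting the separable descriptor cannot tell the two maps apart, because it never compares atoms of different orientations, while the joint descriptor records their relative offset.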
27 Section 3 introduces a roto-translation scattering operator, which is computed by cascading wavelet transforms on the rototranslation group. [sent-81, score-0.966]
28 Calculating joint invariants on large non-commutative groups may however become very complex. [sent-82, score-0.366]
29 As a result of this strong correlation across scales, one can use a separable invariant along scales, with little loss of discriminative information. [sent-87, score-0.272]
30 Hierarchical Architecture We now explain how to build an affine invariant representation, with a hierarchical architecture. [sent-90, score-0.271]
31 We separate variabilities of potentially large amplitudes such as translations, rotations and scaling, from smaller amplitude variabilities, but which may belong to much higher dimensional groups such as shearing and general diffeomorphisms. [sent-91, score-0.255]
32 Most image representations build localized invariants over small image patches, for example with SIFT descriptors [15]. [sent-94, score-0.321]
33 These invariant coefficients are then aggregated into more invariant global image descriptors, for example with bag of words [10] or multiple layers of deep neural networks [4, 5]. [sent-95, score-0.429]
34 We follow a similar strategy by first computing invariants over image patches and then aggregating them at the global image scale. [sent-96, score-0.361]
35 Figure 2: An affine invariant scattering is computed by applying a roto-translation scattering on image patches, a logarithmic non-linearity and a global space-scale averaging. [sent-100, score-1.579]
36 Invariants to small shearing and deformations are computed with linear projectors optimized by a supervised classifier. [sent-101, score-0.276]
37 This is done by calculating a scattering invariant on the joint roto-translation group. [sent-103, score-0.843]
38 A logarithmic non-linearity is first applied to invariant scattering coefficients to linearize their power law behavior across scales. [sent-105, score-0.894]
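A short numerical sketch of why a logarithm linearizes a power law across scales. The constants `C` and `ALPHA` are hypothetical and chosen only for illustration; the model `c_j = C * 2**(ALPHA * j)` is an assumed idealization of a coefficient that follows a power law in the scale `2**j`.

```python
import math

C, ALPHA = 3.0, 0.7  # hypothetical amplitude and power-law exponent

# Idealized coefficients following a power law across scales 2**j.
coeffs = [C * 2 ** (ALPHA * j) for j in range(6)]

# After the logarithm, the dependence on the scale index j is affine:
# log2(c_j) = log2(C) + ALPHA * j.
logs = [math.log2(c) for c in coeffs]
slopes = [logs[j + 1] - logs[j] for j in range(len(logs) - 1)]
print(slopes)
```

Under this assumption, a change of scale only shifts the index `j`, so it adds the same constant to every log coefficient. An average along scales can then absorb it.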
39 A scattering transform was proved to be stable to deformations [9]. [sent-108, score-0.818]
40 Indeed, it is computed with a cascade of wavelet transforms which are stable to deformations, because wavelets are both regular and localized. [sent-109, score-0.46]
41 A small image deformation thus produces a small modification of its scattering representation. [sent-110, score-0.667]
42 The stability of scattering transforms to deformations guarantees that small shearing and deformations can be approximated by a linear operator in the space of scattering coefficients. [sent-111, score-1.724]
43 Enforcing invariance to all deformations would remove too much information because deformations involve too many degrees of freedom. [sent-113, score-0.283]
44 Since invariants are implemented by a linear projector, their optimization involves the optimization of a linear operator. [sent-115, score-0.349]
45 Roto-Translation Patch Scattering This section introduces roto-translation scattering operators which are stable to deformations. [sent-119, score-0.747]
46 Invariant-Covariant Wavelet Cascade A scattering operator [9] computes an invariant image representation relatively to the action of a group, by applying a cascade of invariant and covariant operators calculated with wavelet convolutions and modulus operators. [sent-122, score-1.715]
47 Convolutions on a group appear naturally because they define all linear operators which are covariant to the action of a group. [sent-123, score-0.188]
48 3 introduce a wavelet modulus operator $\tilde W_m$, which transforms $U_m x$ into the average $S_m x$ and a new layer $U_{m+1} x$ of wavelet amplitude coefficients: $\tilde W_m U_m x = (S_m x, U_{m+1} x)$. [sent-161, score-0.735]
49 wavelet rototranslation convolutions and thus remains covariant to rototranslations. [sent-163, score-0.341]
50 Iterating on this wavelet modulus transform outputs multiple layers of scattering… For $m = 0$, we initialize $U_0 x = x$. [sent-164, score-0.387]
51 … next sections for $1 \le m \le 3$. Note that the resulting scattering representation $Sx$ is also c… [sent-170, score-0.64]
52 Figure 3: A scattering representation is calculated with a cascade of wavelet-modulus operators $\tilde W_m$. Each outputs invariant scattering coefficients $S_m x$ (here $S_0 x$, $S_1 x$, $S_2 x$) and a next layer of covariant wavelet modulus coefficients $U_{m+1} x$, which is further transformed. [sent-175, score-2.134]
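The cascade structure of Figure 3 can be sketched in a few lines. This is a minimal 1D toy, not the paper's operators: `toy_wavelet_modulus` is a hypothetical stand-in for a wavelet-modulus operator, using a global mean as the invariant output and the modulus of cyclic differences as the covariant next layer.

```python
def toy_wavelet_modulus(u):
    """Toy stand-in for one wavelet-modulus operator.

    Returns (S, U_next): an invariant average of u and the modulus of
    cyclic first differences, which stays covariant to cyclic shifts.
    """
    n = len(u)
    s = sum(u) / n
    u_next = [abs(u[i] - u[(i + 1) % n]) for i in range(n)]
    return s, u_next


def cascade(x, operators):
    """Apply operators in sequence; collect the invariant output of each."""
    u, outputs = list(x), []
    for w in operators:
        s, u = w(u)      # (S_m x, U_{m+1} x) from U_m x
        outputs.append(s)
    return outputs


x = [1.0, 3.0, 2.0, 6.0]
S = cascade(x, [toy_wavelet_modulus] * 3)
print(S)
```

Each stage emits one invariant output and passes a covariant layer on to the next stage, so a cyclic shift of `x` leaves every output unchanged in this toy.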
53 First Layer With Spatial Wavelets This section defines the first wavelet modulus operator which computes S0x and U1x from the input image x. [sent-179, score-0.476]
54 Locally invariant translation and rotation coefficients are first computed by averaging the image $x$ with a rotation invariant low pass filter $\phi_J(u) = 2^{-2J}\phi(2^{-J}u)$: $S_0 x(u) = x \star \phi_J(u) = \sum_v x(v)\,\phi_J(u - v)$. [sent-180, score-0.559]
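A minimal 1D sketch of this averaging step, assuming a unit Gaussian for $\phi$ and cyclic boundary conditions (neither is specified in the text above). In 1D the normalization is $2^{-J}$ instead of $2^{-2J}$. The helper names and the signal length `N` are illustrative.

```python
import math

N = 128  # length of the cyclic 1D signal (illustrative)


def phi_J(J):
    """Periodized 1D low-pass filter phi_J(n) = 2^-J * phi(2^-J * n)."""
    def phi(t):
        return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
    return [2.0 ** -J * phi(2.0 ** -J * (n if n < N // 2 else n - N))
            for n in range(N)]


def average(x, J):
    """S0 x(u) = sum_v x(v) * phi_J(u - v), with cyclic indexing."""
    h = phi_J(J)
    return [sum(x[v] * h[(u - v) % N] for v in range(N)) for u in range(N)]


def relative_change(x, shift, J):
    """Relative l2 change of S0 x when x is cyclically shifted."""
    a = average(x, J)
    b = average(x[-shift:] + x[:-shift], J)
    num = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return num / math.sqrt(sum(p * p for p in a))


x = [0.0] * N
x[N // 2] = 1.0  # single impulse

small_J = relative_change(x, 1, 1)  # averaging over about 2 samples
large_J = relative_change(x, 1, 4)  # averaging over about 16 samples
print(small_J, large_J)
```

A one-sample shift changes the averaged signal much less when the averaging scale $2^J$ is large, which is the sense in which $S_0 x$ is nearly invariant to translations smaller than $2^J$.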
55 The averaged image $S_0 x$ is nearly invariant to rotations and translations up to $2^J$ pixels, but it has lost the high frequencies of $x$. [sent-183, score-0.383]
56 These high frequencies are recovered by convolution with high pass wavelet filters. [sent-184, score-0.358]
57 To obtain rotation covariant coefficients, we rotate a wavelet $\psi$ by several angles $\theta$ and dilate it by $2^j$: $\psi_{\theta,j}(u) = 2^{-2j}\,\psi(2^{-j} r_{-\theta} u)$. [sent-185, score-0.363]
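The rotation and dilation formula can be written directly as code. The mother wavelet below is a hypothetical Gabor-like function with illustrative parameters `xi` and `sigma`; it is an assumption for this sketch, not the filter used by the authors.

```python
import cmath
import math


def mother(u1, u2, xi=3 * math.pi / 4, sigma=0.85):
    """Hypothetical Gabor-like mother wavelet oscillating along the first axis."""
    envelope = math.exp(-(u1 * u1 + u2 * u2) / (2 * sigma ** 2))
    return envelope * cmath.exp(1j * xi * u1)


def psi(theta, j, u1, u2):
    """psi_{theta,j}(u) = 2^{-2j} * mother(2^{-j} * r_{-theta} u)."""
    c, s = math.cos(theta), math.sin(theta)
    # r_{-theta} u: rotate the coordinates by -theta.
    v1, v2 = c * u1 + s * u2, -s * u1 + c * u2
    scale = 2.0 ** (-j)
    return scale ** 2 * mother(scale * v1, scale * v2)


# The wavelet at angle pi/2 and scale 2^1, evaluated at the rotated and
# dilated point (0, 2), matches the mother wavelet at (1, 0) up to 2^{-2}.
value = psi(math.pi / 2, 1, 0.0, 2.0)
reference = 0.25 * mother(1.0, 0.0)
print(abs(value - reference))
```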
58 Removing this phase with a modulus operator yields a regular envelope which is more insensitive to translations: $U_1 x(p_1) = |x \star \psi_{\theta_1,j_1}(u)|$ with $p_1 = (u, \theta_1, j_1)$. [sent-191, score-0.229]
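A 1D sketch of the effect of the modulus, under stated assumptions: a cyclic signal of length `N`, a Gabor filter with illustrative parameters `XI` and `SIGMA`, and a pure cosine input. None of these choices come from the paper. The real part of the filter response oscillates with position, while its modulus is an almost flat envelope.

```python
import cmath
import math

N = 64                       # cyclic signal length (illustrative)
XI = 2 * math.pi * 8 / N     # centre frequency of the filter and of the input
SIGMA = 4.0                  # width of the Gaussian window


def gabor(n):
    """Complex Gabor filter: Gaussian window times a complex exponential."""
    return math.exp(-n * n / (2 * SIGMA ** 2)) * cmath.exp(1j * XI * n)


# Periodized filter: index m in [0, N) stands for offset m or m - N.
psi = [gabor(m if m < N // 2 else m - N) for m in range(N)]


def cyclic_conv(x, h):
    """(x * h)(u) = sum_v x(v) h(u - v) with cyclic indexing."""
    return [sum(x[v] * h[(u - v) % N] for v in range(N)) for u in range(N)]


x = [math.cos(XI * n) for n in range(N)]
w = cyclic_conv(x, psi)

real_part = [z.real for z in w]
modulus = [abs(z) for z in w]
real_swing = max(real_part) - min(real_part)
modulus_swing = max(modulus) - min(modulus)
print(real_swing, modulus_swing)
```

Shifting `x` by a few samples moves the oscillation of the real part but leaves the envelope essentially unchanged, which is the behaviour the sentence above describes.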
59 The non-linear operator $\tilde W_1$ is contractive because the wavelet transform $W_1$ is contractive and a modulus is also contractive. [sent-198, score-0.443]
60 Deeper Layer With Roto-Translation Wavelets. The second wavelet modulus operator $\tilde W_2$ computes the average $S_1 x$ of $U_1 x$ on the roto-translation group, together with the next layer of wavelet modulus coefficients $U_2 x$. [sent-207, score-0.902]
61 The invariant part of $U_1$ is computed with an averaging over the spatial and angle variables. [sent-212, score-0.236]
62 The high frequencies lost by this averaging are recovered through roto-translation convolutions with separable wavelets. [sent-220, score-0.348]
63 Roto-translation wavelets are computed with three separable products. [sent-221, score-0.251]
64 …nts $U_3 x$ with a roto-translation convolution of $U_2 x(g, p_2)$ with the wavelets (13, 14, 15). [sent-245, score-0.234]
65 The output of a second order roto-translation scattering representation is a vector of coefficients: $Sx = (S_0 x(u), S_1 x(p_1), S_2 x(p_2))$. [sent-248, score-0.64]
66 If the wavelets are rotated along $K$ angles $\theta$, then one can verify that $S_2 x$ has … coefficients. [sent-254, score-0.214]
67 ariance of Log Scattering Roto-translation scattering is computed over image patches of size $2^J$. [sent-265, score-0.68]
68 A joint scale-rotation-translation invariant must therefore be applied to the scattering representation of each patch vector. [sent-267, score-0.843]
69 One could recover the high frequencies lost by this averaging and compute a new layer of invariant through convolutions on the joint scale-rotation-translation group. [sent-269, score-0.476]
70 However, adding this supplementary information does not improve texture classification, so this last invariant is limited to a global scale-space averaging. [sent-270, score-0.212]
71 The roto-translation scattering representation of all patches at a scale $2^J$ is given by $Sx = (x \star \phi_J(u),\ U_1 x \circledast \Phi_J(p_1),\ U_2 x \circledast \Phi_J(p_2))$. [sent-271, score-0.68]
72 To estimate an accurate average from a uniform sampling of the variables $j_1$ and $j_2$, it is necessary to bound uniformly the variations of scattering coefficients as a function of $j_1$ and $j_2$. [sent-287, score-0.64]
73 A joint scaling, rotation and translation invariant is computed with a scale-space averaging of $\log Sx_i$ along the scale and spatial indices $(i, u)$: $\sum_{i,u} \log Sx_i(u, \cdot)$. [sent-290, score-0.321]
74 It requires computing twice as many scattering coefficients, at scales $2^{j_1/2}$ and $2^{j_2/2}$. [sent-295, score-0.74]
75 on the joint scale-rotation-translation group, and define a new scattering cascade. [sent-301, score-0.685]
76 … patches of size $2^J = 2^5 = 32$ with $K = 8$ wavelet orientations. [sent-304, score-0.214]
77 Deformation Invariant Projectors Shearing and image deformations are typically of smaller amplitudes than translations, rotations and scaling. [sent-308, score-0.195]
78 A scattering transform is stable and hence linearizes small deformations. [sent-309, score-0.728]
79 A set of small image deformations thus produces scattering coefficients which belong to an affine space. [sent-310, score-0.946]
80 Linear projectors which are orthogonal to this affine space are invariant to these small deformations. [sent-311, score-0.326]
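The role of such projectors can be illustrated with a small sketch. It assumes each class is summarized by a centroid and an orthonormal set of "deformation" directions, both given by hand here; in practice they would have to be estimated from training data. The class names, vectors and helper functions are hypothetical.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def residual(x, mu, basis):
    """Part of x - mu orthogonal to span(basis); basis is assumed orthonormal."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    for v in basis:
        c = dot(d, v)
        d = [di - c * vi for di, vi in zip(d, v)]
    return d


def classify(x, classes):
    """Pick the class whose affine space (centroid + span) is closest to x."""
    def squared_distance(name):
        mu, basis = classes[name]
        r = residual(x, mu, basis)
        return dot(r, r)
    return min(classes, key=squared_distance)


# Hypothetical classes: (centroid, orthonormal deformation directions).
classes = {
    "A": ([0.0, 0.0, 0.0], [[1.0, 0.0, 0.0]]),
    "B": ([0.0, 5.0, 0.0], [[0.0, 0.0, 1.0]]),
}

x = [0.0, 0.2, 0.1]
x_deformed = [3.0, 0.2, 0.1]  # x moved along class A's deformation direction

print(classify(x, classes), classify(x_deformed, classes))
```

Moving a sample along a class's deformation direction leaves its residual for that class unchanged, so the decision is invariant to that variability without discarding the other coordinates.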
81 These invariants can be adapted to each signal class by optimizing a linear kernel at the supervised classification stage. [sent-312, score-0.321]
82 Texture Classification Experiments This section gives scattering classification results on KTH-TIPS [17], UIUC [10, 18] and UMD [19] texture datasets, and comparison with state of the art algorithms. [sent-332, score-0.694]
83 Most state of the art algorithms use separable invariants to define translation and rotation invariant algorithms, and thus lose joint information on positions and orientations. [sent-334, score-0.827]
84 This is the case of [10] where rotation invariance is obtained through histograms along concentric circles, as well as Log Gaussian Cox processes (COX) [11] and Basic Image Features (BIF) [12] which use rotation invariant patch descriptors calculated from small filter responses. [sent-335, score-0.305]
85 Wavelet Multifractal Spectrum (WMFS) [13] computes wavelet descriptors which are averaged in space and rotations, and are similar to first order scattering coefficients S1x. [sent-337, score-0.955]
86 We compare the best published results [10, 11, 12, 13, 14] and scattering invariants on KTH-TIPS (table 1), UIUC (table 2) and UMD (table 3) texture databases. [sent-338, score-1.015]
87 Classification rates are computed with scattering representations implemented with progressively more invariants, and with the PCA classifier of Section 5. [sent-340, score-0.668]
88 The space Vc is thus generated by the D scattering vectors of the training set. [sent-342, score-0.64]
89 Classification rates in Tables 1,2,3 are given for different scattering representations. [sent-344, score-0.64]
90 scatt” correspond to a translation invariant scattering as in [1]. [sent-346, score-0.859]
91 It is computed on image patches of size $2^J$, with a final spatial averaging of all the patch scattering vectors. [sent-347, score-0.758]
92 scatt” replace the translation invariant scattering by the roto-translation scattering of Section 3. [sent-349, score-1.499]
93 The rows “+ scale avg” also reduce the error by computing a separable invariant along scales, with the averaging described in Section 4. [sent-351, score-0.383]
94 The last five rows give results obtained with progressively refined scattering invariants. [sent-363, score-0.673]
95 The roto-translation scattering does not degrade results but each scale processing step provides significant improvements. [sent-366, score-0.64]
96 For both these databases, the roto-translation scattering provides a considerable improvement (from 50% to 77% for UIUC with 5 training) and scale processing steps also improve results. [sent-368, score-0.64]
97 Conclusion This paper introduces a general scattering architecture which computes invariants to translations, rotations, scaling and deformations, while keeping enough discriminative information. [sent-385, score-1.101]
98 It can be interpreted as a deep convolution network, where convolutions are performed along spatial, rotation and scaling variables. [sent-386, score-0.35]
99 This paper concentrates on texture applications, but the invariance properties of this scattering image patch representation can also replace SIFT type features for more complex classification problems. [sent-389, score-0.727]
100 Ji, “A new texture descriptor using multifractal analysis in multi-orientation wavelet pyramid”, Proc. [sent-471, score-0.303]
wordName wordTfidf (topN-words)
[('scattering', 0.64), ('invariants', 0.321), ('wavelet', 0.214), ('modulus', 0.173), ('invariant', 0.158), ('wavelets', 0.137), ('deformations', 0.125), ('separable', 0.114), ('affine', 0.113), ('convolution', 0.097), ('shearing', 0.096), ('covariant', 0.092), ('sx', 0.091), ('convolutions', 0.081), ('translations', 0.08), ('rue', 0.079), ('averaging', 0.078), ('uiuc', 0.078), ('vc', 0.074), ('smx', 0.071), ('sxc', 0.071), ('scaling', 0.07), ('rotations', 0.07), ('coefficients', 0.068), ('translation', 0.061), ('umd', 0.057), ('rotation', 0.057), ('cascade', 0.056), ('operator', 0.056), ('projectors', 0.055), ('operators', 0.054), ('texture', 0.054), ('umx', 0.053), ('amplitude', 0.053), ('stable', 0.053), ('um', 0.047), ('frequencies', 0.047), ('joint', 0.045), ('deep', 0.045), ('transforms', 0.042), ('group', 0.042), ('patches', 0.04), ('uf', 0.039), ('layer', 0.039), ('textures', 0.039), ('gy', 0.038), ('architecture', 0.037), ('lose', 0.037), ('variabilities', 0.036), ('co', 0.036), ('apnudte', 0.035), ('barye', 0.035), ('bry', 0.035), ('cascading', 0.035), ('citti', 0.035), ('dgu', 0.035), ('egff', 0.035), ('eole', 0.035), ('esxc', 0.035), ('fuo', 0.035), ('iaoenndt', 0.035), ('linearizes', 0.035), ('morlet', 0.035), ('multifractal', 0.035), ('ovon', 0.035), ('raging', 0.035), ('roto', 0.035), ('rototranslation', 0.035), ('scatt', 0.035), ('scmoevnodatarifti', 0.035), ('sofx', 0.035), ('woiltuhtio', 0.035), ('xti', 0.035), ('cox', 0.035), ('ua', 0.034), ('networks', 0.034), ('positions', 0.034), ('rows', 0.033), ('computes', 0.033), ('invariance', 0.033), ('uu', 0.032), ('scales', 0.032), ('databases', 0.032), ('pvc', 0.031), ('mallat', 0.031), ('quadrature', 0.031), ('heu', 0.031), ('bgy', 0.031), ('filters', 0.03), ('aun', 0.029), ('jon', 0.029), ('uo', 0.029), ('logarithmic', 0.028), ('lost', 0.028), ('implemented', 0.028), ('hge', 0.027), ('realizations', 0.027), ('gti', 0.027), ('dimen', 0.027), ('deformation', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 369 cvpr-2013-Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination
Author: Laurent Sifre, Stéphane Mallat
Abstract: An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. State-of-the-art classification results are obtained over texture databases with uncontrolled viewing conditions.
Author: Won Hwa Kim, Moo K. Chung, Vikas Singh
Abstract: The analysis of 3-D shape meshes is a fundamental problem in computer vision, graphics, and medical imaging. Frequently, the needs of the application require that our analysis take a multi-resolution view of the shape's local and global topology, and that the solution is consistent across multiple scales. Unfortunately, the preferred mathematical construct which offers this behavior in classical image/signal processing, Wavelets, is no longer applicable in this general setting (data with non-uniform topology). In particular, the traditional definition does not allow writing out an expansion for graphs that do not correspond to the uniformly sampled lattice (e.g., images). In this paper, we adapt recent results in harmonic analysis, to derive Non-Euclidean Wavelets based algorithms for a range of shape analysis problems in vision and medical imaging. We show how descriptors derived from the dual domain representation offer native multi-resolution behavior for characterizing local/global topology around vertices. With only minor modifications, the framework yields a method for extracting interest/key points from shapes, a surprisingly simple algorithm for 3-D shape segmentation (competitive with state of the art), and a method for surface alignment (without landmarks). We give an extensive set of comparison results on a large shape segmentation benchmark and derive a uniqueness theorem for the surface alignment problem.
3 0.11405347 88 cvpr-2013-Compressible Motion Fields
Author: Giuseppe Ottaviano, Pushmeet Kohli
Abstract: Traditional video compression methods obtain a compact representation for image frames by computing coarse motion fields defined on patches of pixels called blocks, in order to compensate for the motion in the scene across frames. This piecewise constant approximation makes the motion field efficiently encodable, but it introduces block artifacts in the warped image frame. In this paper, we address the problem of estimating dense motion fields that, while accurately predicting one frame from a given reference frame by warping it with the field, are also compressible. We introduce a representation for motion fields based on wavelet bases, and approximate the compressibility of their coefficients with a piecewise smooth surrogate function that yields an objective function similar to classical optical flow formulations. We then show how to quantize and encode such coefficients with adaptive precision. We demonstrate the effectiveness of our approach by comparing its performance with a state-of-the-art wavelet video encoder. Experimental results on a number of standard flow and video datasets reveal that our method significantly outperforms both block-based and optical-flow-based motion compensation algorithms.
4 0.11221683 255 cvpr-2013-Learning Separable Filters
Author: Roberto Rigamonti, Amos Sironi, Vincent Lepetit, Pascal Fua
Abstract: Learning filters to produce sparse image representations in terms of overcomplete dictionaries has emerged as a powerful way to create image features for many different purposes. Unfortunately, these filters are usually both numerous and non-separable, making their use computationally expensive. In this paper, we show that such filters can be computed as linear combinations of a smaller number of separable ones, thus greatly reducing the computational complexity at no cost in terms of performance. This makes filter learning approaches practical even for large images or 3D volumes, and we show that we significantly outperform state-of-the-art methods on the linear structure extraction task, in terms of both accuracy and speed. Moreover, our approach is general and can be used on generic filter banks to reduce the complexity of the convolutions.
5 0.069604389 162 cvpr-2013-FasT-Match: Fast Affine Template Matching
Author: Simon Korman, Daniel Reichman, Gilad Tsur, Shai Avidan
Abstract: Fast-Match is a fast algorithm for approximate template matching under 2D affine transformations that minimizes the Sum-of-Absolute-Differences (SAD) error measure. There is a huge number of transformations to consider but we prove that they can be sampled using a density that depends on the smoothness of the image. For each potential transformation, we approximate the SAD error using a sublinear algorithm that randomly examines only a small number of pixels. We further accelerate the algorithm using a branch-and-bound scheme. As images are known to be piecewise smooth, the result is a practical affine template matching algorithm with approximation guarantees, that takes a few seconds to run on a standard machine. We perform several experiments on three different datasets, and report very good results. To the best of our knowledge, this is the first template matching algorithm which is guaranteed to handle arbitrary 2D affine transformations.
6 0.066149361 46 cvpr-2013-Articulated and Restricted Motion Subspaces and Their Signatures
7 0.060724027 392 cvpr-2013-Separable Dictionary Learning
8 0.059926823 146 cvpr-2013-Enriching Texture Analysis with Semantic Data
9 0.057176135 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
10 0.055937033 164 cvpr-2013-Fast Convolutional Sparse Coding
11 0.055742748 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
12 0.052858293 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation
13 0.052465219 130 cvpr-2013-Discriminative Color Descriptors
14 0.051500827 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
15 0.051044684 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning
16 0.050610762 330 cvpr-2013-Photometric Ambient Occlusion
17 0.050131474 105 cvpr-2013-Deep Learning Shape Priors for Object Segmentation
18 0.050120249 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches
20 0.048167214 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
topicId topicWeight
[(0, 0.124), (1, 0.021), (2, -0.031), (3, 0.034), (4, -0.021), (5, -0.007), (6, -0.013), (7, -0.024), (8, -0.028), (9, -0.045), (10, -0.014), (11, 0.028), (12, -0.02), (13, -0.063), (14, 0.072), (15, 0.001), (16, 0.024), (17, 0.008), (18, 0.082), (19, 0.05), (20, 0.0), (21, -0.013), (22, 0.041), (23, -0.037), (24, -0.037), (25, 0.038), (26, -0.007), (27, -0.032), (28, -0.024), (29, -0.056), (30, -0.003), (31, -0.035), (32, 0.055), (33, -0.022), (34, -0.067), (35, 0.052), (36, -0.04), (37, 0.041), (38, -0.017), (39, 0.02), (40, -0.05), (41, -0.008), (42, 0.078), (43, 0.032), (44, -0.069), (45, -0.061), (46, 0.047), (47, 0.032), (48, -0.002), (49, 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.937518 369 cvpr-2013-Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination
Author: Laurent Sifre, Stéphane Mallat
Abstract: An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. State-of-the-art classification results are obtained over texture databases with uncontrolled viewing conditions.
2 0.68669081 164 cvpr-2013-Fast Convolutional Sparse Coding
Author: Hilton Bristow, Anders Eriksson, Simon Lucey
Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.
3 0.68082899 255 cvpr-2013-Learning Separable Filters
Author: Roberto Rigamonti, Amos Sironi, Vincent Lepetit, Pascal Fua
Abstract: Learning filters to produce sparse image representations in terms of overcomplete dictionaries has emerged as a powerful way to create image features for many different purposes. Unfortunately, these filters are usually both numerous and non-separable, making their use computationally expensive. In this paper, we show that such filters can be computed as linear combinations of a smaller number of separable ones, thus greatly reducing the computational complexity at no cost in terms of performance. This makes filter learning approaches practical even for large images or 3D volumes, and we show that we significantly outperform state-of-the-art methods on the linear structure extraction task, in terms of both accuracy and speed. Moreover, our approach is general and can be used on generic filter banks to reduce the complexity of the convolutions.
Author: Won Hwa Kim, Moo K. Chung, Vikas Singh
Abstract: The analysis of 3-D shape meshes is a fundamental problem in computer vision, graphics, and medical imaging. Frequently, the needs of the application require that our analysis take a multi-resolution view of the shape's local and global topology, and that the solution is consistent across multiple scales. Unfortunately, the preferred mathematical construct which offers this behavior in classical image/signal processing, Wavelets, is no longer applicable in this general setting (data with non-uniform topology). In particular, the traditional definition does not allow writing out an expansion for graphs that do not correspond to the uniformly sampled lattice (e.g., images). In this paper, we adapt recent results in harmonic analysis, to derive Non-Euclidean Wavelets based algorithms for a range of shape analysis problems in vision and medical imaging. We show how descriptors derived from the dual domain representation offer native multi-resolution behavior for characterizing local/global topology around vertices. With only minor modifications, the framework yields a method for extracting interest/key points from shapes, a surprisingly simple algorithm for 3-D shape segmentation (competitive with state of the art), and a method for surface alignment (without landmarks). We give an extensive set of comparison results on a large shape segmentation benchmark and derive a uniqueness theorem for the surface alignment problem.
5 0.62741464 391 cvpr-2013-Sensing and Recognizing Surface Textures Using a GelSight Sensor
Author: Rui Li, Edward H. Adelson
Abstract: Sensing surface textures by touch is a valuable capability for robots. Until recently it was difficult to build a compliant sensor with high sensitivity and high resolution. The GelSight sensor is compliant and offers sensitivity and resolution exceeding that of the human fingertips. This opens the possibility of measuring and recognizing highly detailed surface textures. The GelSight sensor, when pressed against a surface, delivers a height map. This can be treated as an image, and processed using the tools of visual texture analysis. We have devised a simple yet effective texture recognition system based on local binary patterns, and enhanced it by the use of a multi-scale pyramid and a Hellinger distance metric. We built a database with 40 classes of tactile textures using materials such as fabric, wood, and sandpaper. Our system can correctly categorize materials from this database with high accuracy. This suggests that the GelSight sensor can be useful for material recognition by robots.
6 0.60603213 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning
7 0.59378719 429 cvpr-2013-The Generalized Laplacian Distance and Its Applications for Visual Matching
9 0.58375716 141 cvpr-2013-Efficient Computation of Shortest Path-Concavity for 3D Meshes
10 0.54414499 35 cvpr-2013-Adaptive Compressed Tomography Sensing
11 0.54401332 427 cvpr-2013-Texture Enhanced Image Denoising via Gradient Histogram Preservation
12 0.53565556 304 cvpr-2013-Multipath Sparse Coding Using Hierarchical Matching Pursuit
13 0.53127211 432 cvpr-2013-Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain
14 0.51998037 404 cvpr-2013-Sparse Quantization for Patch Description
15 0.51750201 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition
16 0.50227106 162 cvpr-2013-FasT-Match: Fast Affine Template Matching
17 0.50124663 316 cvpr-2013-Optical Flow Estimation Using Laplacian Mesh Energy
18 0.49546286 88 cvpr-2013-Compressible Motion Fields
19 0.49189171 97 cvpr-2013-Correspondence-Less Non-rigid Registration of Triangular Surface Meshes
20 0.49004343 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression
topicId topicWeight
[(10, 0.083), (16, 0.039), (18, 0.271), (26, 0.063), (28, 0.015), (29, 0.011), (33, 0.229), (39, 0.01), (57, 0.017), (67, 0.047), (69, 0.045), (80, 0.014), (87, 0.061)]
simIndex simValue paperId paperTitle
same-paper 1 0.80588871 369 cvpr-2013-Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination
Author: Laurent Sifre, Stéphane Mallat
Abstract: An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. State-of-the-art classification results are obtained over texture databases with uncontrolled viewing conditions.
2 0.78725851 452 cvpr-2013-Vantage Feature Frames for Fine-Grained Categorization
Author: Asma Rejeb Sfar, Nozha Boujemaa, Donald Geman
Abstract: We study fine-grained categorization, the task of distinguishing among (sub)categories of the same generic object class (e.g., birds), focusing on determining botanical species (leaves and orchids) from scanned images. The strategy is to focus attention around several vantage points, which is the approach taken by botanists, but using features dedicated to the individual categories. Our implementation of the strategy is based on vantage feature frames, a novel object representation consisting of two components: a set of coordinate systems centered at the most discriminating local viewpoints for the generic object class and a set of category-dependent features computed in these frames. The features are pooled over frames to build the classifier. Categorization then proceeds from coarse-grained (finding the frames) to fine-grained (finding the category), and hence the vantage feature frames must be both detectable and discriminating. The proposed method outperforms state-of-the-art algorithms, in particular those using more distributed representations, on standard databases of leaves.
3 0.72255242 305 cvpr-2013-Non-parametric Filtering for Geometric Detail Extraction and Material Representation
Author: Zicheng Liao, Jason Rock, Yang Wang, David Forsyth
Abstract: Geometric detail is a universal phenomenon in real world objects. It is an important component in object modeling, but not accounted for in current intrinsic image works. In this work, we explore using a non-parametric method to separate geometric detail from intrinsic image components. We further decompose an image as albedo ∗ (coarse-scale shading + shading detail). Our decomposition offers quantitative improvement in albedo recovery and material classification. Our method also enables interesting image editing activities, including bump removal, geometric detail smoothing/enhancement and material transfer.
4 0.71843082 390 cvpr-2013-Semi-supervised Node Splitting for Random Forest Construction
Author: Xiao Liu, Mingli Song, Dacheng Tao, Zicheng Liu, Luming Zhang, Chun Chen, Jiajun Bu
Abstract: Node splitting is an important issue in Random Forest but robust splitting requires a large number of training samples. Existing solutions fail to properly partition the feature space if there are insufficient training data. In this paper, we present semi-supervised splitting to overcome this limitation by splitting nodes with the guidance of both labeled and unlabeled data. In particular, we derive a nonparametric algorithm to obtain an accurate quality measure of splitting by incorporating abundant unlabeled data. To avoid the curse of dimensionality, we project the data points from the original high-dimensional feature space onto a low-dimensional subspace before estimation. A unified optimization framework is proposed to select a coupled pair of subspace and separating hyperplane such that the smoothness of the subspace and the quality of the splitting are guaranteed simultaneously. The proposed algorithm is compared with state-of-the-art supervised and semi-supervised algorithms for typical computer vision applications such as object categorization and image segmentation. Experimental results on publicly available datasets demonstrate the superiority of our method.
5 0.69081742 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
Author: Yi Sun, Xiaogang Wang, Xiaoou Tang
Abstract: We propose a new approach for estimation of the positions of facial keypoints with three-level carefully designed convolutional networks. At each level, the outputs of multiple networks are fused for robust and accurate estimation. Thanks to the deep structures of convolutional networks, global high-level features are extracted over the whole face region at the initialization stage, which help to locate high accuracy keypoints. There are two folds of advantage for this. First, the texture context information over the entire face is utilized to locate each keypoint. Second, since the networks are trained to predict all the keypoints simultaneously, the geometric constraints among keypoints are implicitly encoded. The method therefore can avoid local minimum caused by ambiguity and data corruption in difficult image samples due to occlusions, large pose variations, and extreme lightings. The networks at the following two levels are trained to locally refine initial predictions and their inputs are limited to small regions around the initial predictions. Several network structures critical for accurate and robust facial point detection are investigated. Extensive experiments show that our approach outperforms state-of-the-art methods in both detection accuracy and reliability.
6 0.68906754 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
7 0.68888539 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
8 0.68842655 311 cvpr-2013-Occlusion Patterns for Object Class Detection
9 0.68795562 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
10 0.68793625 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.68767315 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
12 0.68747997 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
13 0.68685287 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
14 0.686535 96 cvpr-2013-Correlation Filters for Object Alignment
15 0.68644667 424 cvpr-2013-Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure
16 0.68614829 35 cvpr-2013-Adaptive Compressed Tomography Sensing
17 0.68592435 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
18 0.68565106 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
19 0.68552846 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
20 0.6854369 79 cvpr-2013-Cartesian K-Means