cvpr cvpr2013 cvpr2013-429 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Elhanan Elboer, Michael Werman, Yacov Hel-Or
Abstract: The graph Laplacian operator, which originated in spectral graph theory, is commonly used for learning applications such as spectral clustering and embedding. In this paper we explore the Laplacian distance, a distance function related to the graph Laplacian, and use it for visual search. We show that previous techniques such as Matching by Tone Mapping (MTM) are particular cases of the Laplacian distance. Generalizing the Laplacian distance results in distance measures which are tolerant to various visual distortions. A novel algorithm based on linear decomposition makes it possible to compute these generalized distances efficiently. The proposed approach is demonstrated for tone mapping invariant, outlier robust and multimodal template matching.
Reference: text
sentIndex sentText sentNum sentScore
1 l Abstract The graph Laplacian operator, which originated in spectral graph theory, is commonly used for learning applications such as spectral clustering and embedding. [sent-5, score-0.154]
2 In this paper we explore the Laplacian distance, a distance function related to the graph Laplacian, and use it for visual search. [sent-6, score-0.099]
3 Generalizing the Laplacian distance results in distance measures which are tolerant to various visual distortions. [sent-8, score-0.178]
4 The proposed approach is demonstrated for tone mapping invariant, outlier robust and multimodal template matching. [sent-10, score-0.573]
5 The graph Laplacian, which is studied in spectral graph theory [3], has been used for machine learning problems such as spectral clustering [13, 10, 15] and dimensionality reduction [1, 11]. [sent-17, score-0.154]
6 The Laplacian distance LD(v, x) is a property of the graph Laplacian that can be interpreted as an asymmetric cross distance between two vectors v, x ∈ RM. [sent-23, score-0.158]
7 The graph edge weights are determined by v, wij = w(vi, vj) ; then the Laplacian distance of x from v is defined as LD(v,x) =21? [sent-24, score-0.241]
8 ijw(vi,vj)(xi− xj)2 (1) The Laplacian distance is small where small squared differences (xi−xj)2 correspond to large weights w(vi, vj). [sent-25, score-0.185]
9 We extend the concept of the Laplacian distance to develop a rich family of distance functions, Generalized Laplacian Distances (GLDs). [sent-31, score-0.118]
10 In this family the squared difference used by LD (Equation 1) is replaced with a general dissimilarity function, fij, depending on (xi, xj) and possibly also (vi , vj). [sent-32, score-0.175]
11 The proposed generalization makes it possible to replace the squared difference with a more robust dissimilarity fij such as the absolute difference |xi − xj | . [sent-33, score-0.343]
12 An even greater advantage can be achieved by changing both the weight function wij and the dissimilarity fij . [sent-34, score-0.261]
13 This distance e−(vi−vj)2), is small for vectors x, v for which xi ≤ xj iff vi ≤ vj ; an illustration is shown in Figure 2. [sent-38, score-0.293]
14 We demonstrate the advantage of GLD by an application to template matching. [sent-39, score-0.235]
15 A template p is sought in a large image which may be distorted by noise, captured in different illumination conditions, or even from different modality (e. [sent-40, score-0.312]
16 Usually the best match to p in the image is found by minimizing a distance 222333 111533 4 2 3 2 10 8 6 1 weights dissimilarity weighted error w(pi,pj) f (qi,qj) w(pi,pj) · f (qi,qj) Figure 1. [sent-44, score-0.197]
17 The weights wij are large (thick lines) where pi and pj (intensity values) are similar. [sent-47, score-0.315]
18 Middle: the dissimilarities fij are defined as (qi − qj )2). [sent-48, score-0.38]
19 Right: the weighted dissimilarity of each pair, wij · fij . [sent-49, score-0.261]
20 An appropriate distance should be tolerant to the possible distortions between the template and the searched image. [sent-52, score-0.39]
21 Matching a template over an image can be performed efficiently by common measures such as the Euclidean distance, the sum of absolute differences (SAD) or the normalized cross correlation (NCC, a. [sent-54, score-0.297]
22 This is an efficient template matching algorithm based on a distance function invariant to an arbitrary tone mapping of the image values. [sent-61, score-0.567]
23 Moreover, by replacing the dissimilarity function f we define in Section 4 GLDabs which is more robust than MTM. [sent-64, score-0.107]
24 GLDabs is computed efficiently using a novel technique based on a linear decomposition of non Euclidean distance functions. [sent-65, score-0.156]
25 Maximizing Csign is equivalent to minimizing GLDsign, a GLD which preserves the order property for the informative pairs in the template (see Figure 2). [sent-68, score-0.261]
26 Csign is invariant to monotonic intensity transformations and robust to outliers. [sent-69, score-0.144]
27 Definition Let G(V, W) be a weighted graph represented by a matrix W containing the edge weights wij for each node 4 2 3 2 10 8 6 1 weiogrdhtesr w (p(ip,pi,jp)j) ; order (qi,qj) w(pwi,peij)g ·h ft e(dq ie,rqrjo,pri,pj) Figure 2. [sent-78, score-0.182]
28 Large weights are assigned to pairs with distant values |pi − pj |. [sent-82, score-0.149]
29 Middle: the orders of the corresponding pairs (qi , qj ). [sent-83, score-0.24]
30 fij is multiplied by the edge weight w(pi , pj ) . [sent-85, score-0.189]
31 , the edge weights wij are determined by a weight function w(vi , vj) (e. [sent-96, score-0.142]
32 The distance of x from v is small where the following property is satisfied: pairs (i, j) with large weights, usually indicating that vi and vj are close, have close values in x (i. [sent-102, score-0.192]
33 For pairs (i, j) with small weights, the proximity of xi and xj is irrelevant. [sent-105, score-0.128]
34 The normalization by D−1 makes the contribution of each vi equal, since the sum of the normalized weights − ? [sent-106, score-0.111]
35 Laplacian distance for template matching Assume a template (pattern) p of M pixels is sought in an image. [sent-110, score-0.61]
36 (a) The template p (32 (d) intensity transformation, additive noise σ GLDabs = 30/255 and 30% outliers (random values 32) and the correct match q∗ . [sent-113, score-0.367]
37 which each node is associated with a pixel, and the weights wij depend on grayscale values pi, pj . [sent-116, score-0.261]
38 We call this approach pattern-based (PB), since the graph weights are determined by the grayscale values of the template p. [sent-121, score-0.363]
39 ijw(qi,qj)(pi− pj)2 (7) In this case the graph weights are determined the grayscale values of q, and the best match q∗ minimizes LD(q, p). [sent-123, score-0.128]
40 Efficiency and weights decomposition The weight function w strongly influences what is considered a good match. [sent-126, score-0.107]
41 Thus, computing LD(p, q) requires 2K + 1 convolutions, or K + 1 convolutions if w is symmetric (U=V ). [sent-164, score-0.105]
42 Thus, K box filters and 3K convolutions are required (or 2k convolutions if the weights are symmetric). [sent-186, score-0.269]
43 proposed a template matching method which is closely related to the Laplacian distance. [sent-195, score-0.265]
44 255]) is divided into K bins, and S(p) ∈ {0, 1}M×K is a matrix that indicates for each pi whether it falls to the kth bin, k = 1. [sent-205, score-0.139]
45 o p where the optimization parameter, β∈RK, is a vector that represents piecewise constant tone mapping. [sent-214, score-0.167]
46 Thus, MTM is × the minimal squared Euclidean distance between q and ˜p = T(p) over all possible piecewise constant tone mappings T(·). [sent-215, score-0.293]
47 2 (12) where nk is the number of pixels pi in the kth bin, and Sk (p) is the kth template slice – i. [sent-220, score-0.468]
48 In [7], MTM is computed efficiently using sparse convolutions [16]. [sent-223, score-0.105]
49 The graph weights wij are determined by the delta weight function: = δ? [sent-228, score-0.233]
50 j Rearranging Equation 12 and plugging in the delta weights and the normalization factors di = nk, we get the following equivalent formulation for MTM: MTM(p,q) =21? [sent-242, score-0.11]
51 This distance is defined by delta weights depending on p and squared differences depending on q/std(q). [sent-244, score-0.294]
52 By replacing p and q in Equations 11–15 it follows also that the W2P variant of MTM is a window-based (WB) Laplacian distance with delta weights depending on q and squared differences depending on p. [sent-245, score-0.364]
53 Generalized Laplacian Distances (GLDs) In the Laplacian distance the dissimilarity between xi and xj is expressed by the squared difference (xi−xj)2 (Equation 5). [sent-247, score-0.306]
54 We generalize the concept of the Laplacian distance by replacing the squared difference with an arbitrary dissimilarity function fij depending on xi and xj, and possibly also on vi and vj . [sent-248, score-0.529]
55 where wij = w(vi,vj) and fij GLD is more general than LD. [sent-252, score-0.182]
56 the absolute difference |xi − xj | that is more robust than the squared difference. [sent-255, score-0.165]
57 Such a decomposition is not available for the dissimilarities fij used by GLDs. [sent-262, score-0.188]
58 In the next sections (4,5) we compute GLDs efficiently by decomposing the dissimilarity function f using the spectral methods described in Section 2. [sent-263, score-0.116]
59 The only change is replacing the squared differences (qi − qj)2 by absolute differences |qi − qj |, both in the nominator and denominator (note that var(q) = 2M12 ? [sent-271, score-0.438]
60 We will show that in both cases only few convolutions with the image are required. [sent-277, score-0.105]
61 Decomposing the delta weights (Equation 13) and the absolute difference (Equation 18), and using the equality di = nk (Equation 14), the nominator of GLDabs (Equation 17) can be rewritten as ? [sent-287, score-0.218]
62 However, sparse convolutions can be used since the template slices Sk (p) are sparse and non overlapping. [sent-309, score-0.432]
63 Additionally, the template size is usually small which means that direct convolution is faster than many (K) FFT based convolutions. [sent-312, score-0.261]
64 2 (21) Here Ur (p) is the result of applying Ur to the template values, while Sk (q) is the kth slice ofthe image (1for pixels qi which fall in the kth bin, and 0 otherwise). [sent-342, score-0.521]
65 Again, KR sparse convolutions are required which are equivalent to R full convolutions. [sent-343, score-0.105]
66 The terms nk (q), denoting the number of pixels in q which fall in the kth bin, are computed by applying a box filter to each of the image slices in O(RN). [sent-344, score-0.137]
67 NCC is invariant to affine illumination change both of p and q, and tolerant to additive Gaussian noise. [sent-351, score-0.165]
68 It is also tolerant to general monotonic intensity transform, as can be seen from the formulation: NCC(p,q) =? [sent-352, score-0.179]
69 On the other hand, NCC is not robust to outlier noise or partial occlusions of the sought template. [sent-355, score-0.138]
70 This is reasonable in the context of template matching since the given template is usually clean and informative, while the image might be distorted. [sent-359, score-0.5]
71 First, it is invariant to monotonic increasing transform of the image values qi. [sent-364, score-0.106]
72 We define the patternbased distance GLDsign by the weights wij = |pi − pj | (25) and dissimilarity function fij = 1 − sign(pi − pj)sign(qi − qj) (26) The dissimilarity fij penalizes opposite orders of (pi, pj ) and (qi, qj). [sent-369, score-0.737]
73 The weights wij imply that pairs with distant values in the template pi,pj are more important than pairs with similar values. [sent-370, score-0.377]
74 i This expression is computed by 2K convolutions (for the summations over piUk [qi] and pjVk [qj]) and 2K box filters (for the summations over Vk [qj] and Uk [qi]). [sent-395, score-0.155]
75 Average execution times for template matching using Euclidean distance, NCC, MTM, Csign, GLDabs and mutual information (MI). [sent-397, score-0.306]
76 Results We present experimental evaluation of the proposed GLD based template matching methods, GLDabs and Csign, compared with other distance and similarity measures commonly used for template matching. [sent-400, score-0.559]
77 Then we compare the performance of the relevant methods1 for different intensity × changes ranging from monotonic affine transformation to multimodal resources. [sent-403, score-0.226]
78 Csign is also efficient but the parameter R (number of eigenvectors) affects the runtime as 2R FFT convolutions are required (we set R = 5). [sent-413, score-0.105]
79 This requires K2 sparse convolutions or, equivalently, K full convolutions. [sent-417, score-0.105]
80 Template matching with intensity transformations and outlier noise. [sent-421, score-0.125]
81 (c) Random non-monotonic tone mapping with increasing amounts of outliers. [sent-428, score-0.218]
82 The time gap is especially significant for small template size, e. [sent-429, score-0.235]
83 We compared the tolerance of various methods to monotonic affine transformation. [sent-435, score-0.125]
84 200 template locations were sampled randomly in an image dataset as well as affine transformation parameters. [sent-436, score-0.279]
85 In the case of distorted images with small additive Gaussian noise (std σ = 15/255) the best performing × method is NCC with a small gap from Csign (graph is provided in the supplementary material). [sent-444, score-0.124]
86 Although NCC is not invariant to Gamma correction, this result can be explained by the pairwise interpretation of NCC (Equation 23) since the signs of the pairs (pi −pj) , (qi qj ) do not change. [sent-445, score-0.265]
87 200 tone mapping functions and informative template locations were sampled randomly and used for three experiments. [sent-450, score-0.479]
88 First, we added small Gaussian noise (σ = 15/255) and examined the performance for different template sizes. [sent-451, score-0.265]
89 An example oftemplate matching with both additive and outlier noise is shown in Figure 3. [sent-459, score-0.153]
90 In all these experiments, faster methods such as NCC and Csign totally fail since they are not designed to overcome non monotonic intensity change. [sent-460, score-0.168]
91 The P2W variant of MTM and the PB variant of GLDabs perform better than the W2P and WB variants (shown in the supplementary material). [sent-461, score-0.143]
92 The reason is that in the above experiments there exists a tone mapping T which maps the template to the correct match (up to noise), but there might be no injective mapping from the mapped image back to the template. [sent-462, score-0.529]
93 Finding a small template in an image from a different source (modality) is a difficult task. [sent-464, score-0.235]
94 Template matching is more challenging since the joint statistics of a small template and candidate windows are less reliable. [sent-466, score-0.265]
95 In this experiment the W2P variant of MTM and the WB variant of GLDabs perform better than P2W and PB (see the supplementary material). [sent-476, score-0.116]
96 Although there is no injective mapping either from the template to the image or vice versa, fitting the candidates to the template (as done by W2P and WB) is more successful than fitting the template to each candidate window (as done by P2W and PB). [sent-477, score-0.818]
97 Intuitively, fitting the template to an incorrect window (e. [sent-478, score-0.272]
98 a constant mapping) is more probable than fitting an incorrect candidate to an informative template where mapping is rarely available. [sent-482, score-0.312]
99 The GLD framework makes it possible to capture different types of visual similarity, which is demonstrated for template matching in challenging conditions such as complicated intensity transformations, outlier noise and multimodal matching. [sent-485, score-0.453]
100 For fast computation of GLDs we developed a novel algorithm based on linear decomposition of distance functions. [sent-486, score-0.107]
wordName wordTfidf (topN-words)
[('gldabs', 0.415), ('mtm', 0.391), ('csign', 0.277), ('laplacian', 0.253), ('qj', 0.24), ('template', 0.235), ('ncc', 0.191), ('gld', 0.184), ('qi', 0.174), ('tone', 0.167), ('glds', 0.123), ('convolutions', 0.105), ('fij', 0.099), ('ur', 0.096), ('pj', 0.09), ('ld', 0.088), ('vk', 0.086), ('wij', 0.083), ('pi', 0.083), ('monotonic', 0.081), ('wb', 0.081), ('vj', 0.081), ('dissimilarity', 0.079), ('uk', 0.078), ('gldsign', 0.077), ('ijw', 0.068), ('squared', 0.067), ('xj', 0.066), ('multimodal', 0.063), ('sk', 0.061), ('tolerant', 0.06), ('weights', 0.059), ('distance', 0.059), ('equation', 0.059), ('tq', 0.057), ('outlier', 0.057), ('kth', 0.056), ('vi', 0.052), ('mapping', 0.051), ('delta', 0.051), ('sought', 0.051), ('non', 0.049), ('pb', 0.049), ('decomposition', 0.048), ('ikv', 0.046), ('rur', 0.046), ('dct', 0.044), ('mi', 0.044), ('affine', 0.044), ('slices', 0.043), ('ij', 0.042), ('variant', 0.042), ('execution', 0.041), ('iku', 0.041), ('dissimilarities', 0.041), ('graph', 0.04), ('sad', 0.04), ('sign', 0.039), ('nk', 0.038), ('intensity', 0.038), ('nominator', 0.038), ('window', 0.037), ('spectral', 0.037), ('israel', 0.037), ('additive', 0.036), ('distortions', 0.036), ('xi', 0.035), ('bins', 0.035), ('generalized', 0.035), ('denominator', 0.033), ('absolute', 0.032), ('supplementary', 0.032), ('argmqin', 0.031), ('piuk', 0.031), ('pjvk', 0.031), ('material', 0.03), ('matching', 0.03), ('bin', 0.03), ('noise', 0.03), ('correlation', 0.03), ('depending', 0.029), ('grayscale', 0.029), ('cos', 0.029), ('svd', 0.029), ('cosine', 0.028), ('gamma', 0.028), ('outliers', 0.028), ('replacing', 0.028), ('proximity', 0.027), ('variants', 0.027), ('xtl', 0.027), ('qtdq', 0.027), ('jerusalem', 0.027), ('informative', 0.026), ('convolution', 0.026), ('lookup', 0.026), ('distorted', 0.026), ('invariant', 0.025), ('summations', 0.025), ('injective', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 429 cvpr-2013-The Generalized Laplacian Distance and Its Applications for Visual Matching
Author: Elhanan Elboer, Michael Werman, Yacov Hel-Or
Abstract: The graph Laplacian operator, which originated in spectral graph theory, is commonly used for learning applications such as spectral clustering and embedding. In this paper we explore the Laplacian distance, a distance function related to the graph Laplacian, and use it for visual search. We show that previous techniques such as Matching by Tone Mapping (MTM) are particular cases of the Laplacian distance. Generalizing the Laplacian distance results in distance measures which are tolerant to various visual distortions. A novel algorithm based on linear decomposition makes it possible to compute these generalized distances efficiently. The proposed approach is demonstrated for tone mapping invariant, outlier robust and multimodal template matching.
2 0.15394616 162 cvpr-2013-FasT-Match: Fast Affine Template Matching
Author: Simon Korman, Daniel Reichman, Gilad Tsur, Shai Avidan
Abstract: Fast-Match is a fast algorithm for approximate template matching under 2D affine transformations that minimizes the Sum-of-Absolute-Differences (SAD) error measure. There is a huge number of transformations to consider but we prove that they can be sampled using a density that depends on the smoothness of the image. For each potential transformation, we approximate the SAD error using a sublinear algorithm that randomly examines only a small number of pixels. We further accelerate the algorithm using a branch-and-bound scheme. As images are known to be piecewise smooth, the result is a practical affine template matching algorithm with approximation guarantees, that takes a few seconds to run on a standard machine. We perform several experiments on three different datasets, and report very good results. To the best of our knowledge, this is the first template matching algorithm which is guaranteed to handle arbitrary 2D affine transformations.
3 0.084020287 316 cvpr-2013-Optical Flow Estimation Using Laplacian Mesh Energy
Author: Wenbin Li, Darren Cosker, Matthew Brown, Rui Tang
Abstract: In this paper we present a novel non-rigid optical flow algorithm for dense image correspondence and non-rigid registration. The algorithm uses a unique Laplacian Mesh Energy term to encourage local smoothness whilst simultaneously preserving non-rigid deformation. Laplacian deformation approaches have become popular in graphics research as they enable mesh deformations to preserve local surface shape. In this work we propose a novel Laplacian Mesh Energy formula to ensure such sensible local deformations between image pairs. We express this wholly within the optical flow optimization, and show its application in a novel coarse-to-fine pyramidal approach. Our algorithm achieves the state-of-the-art performance in all trials on the Garg et al. dataset, and top tier performance on the Middlebury evaluation.
4 0.078942575 180 cvpr-2013-Fully-Connected CRFs with Non-Parametric Pairwise Potential
Author: Neill D.F. Campbell, Kartic Subr, Jan Kautz
Abstract: Conditional Random Fields (CRFs) are used for diverse tasks, ranging from image denoising to object recognition. For images, they are commonly defined as a graph with nodes corresponding to individual pixels and pairwise links that connect nodes to their immediate neighbors. Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. In this paper, we generalize the pairwise terms to a non-linear dissimilarity measure that is not required to be a distance metric. To this end, we propose a density estimation technique to derive conditional pairwise potentials in a nonparametric manner. We then use an efficient embedding technique to estimate an approximate Euclidean feature space for these potentials, in which the pairwise term can still be expressed as a Gaussian kernel. We demonstrate that the use of non-parametric models for the pairwise interactions, conditioned on the input data, greatly increases expressive power whilst maintaining efficient inference.
5 0.070373572 267 cvpr-2013-Least Soft-Threshold Squares Tracking
Author: Dong Wang, Huchuan Lu, Ming-Hsuan Yang
Abstract: In this paper, we propose a generative tracking method based on a novel robust linear regression algorithm. In contrast to existing methods, the proposed Least Soft-thresold Squares (LSS) algorithm models the error term with the Gaussian-Laplacian distribution, which can be solved efficiently. Based on maximum joint likelihood of parameters, we derive a LSS distance to measure the difference between an observation sample and the dictionary. Compared with the distance derived from ordinary least squares methods, the proposed metric is more effective in dealing with outliers. In addition, we present an update scheme to capture the appearance change of the tracked target and ensure that the model is properly updated. Experimental results on several challenging image sequences demonstrate that the proposed tracker achieves more favorable performance than the state-of-the-art methods.
6 0.068732478 23 cvpr-2013-A Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation
7 0.066873305 191 cvpr-2013-Graph-Laplacian PCA: Closed-Form Solution and Robustness
8 0.065124579 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
9 0.063108772 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration
10 0.061009455 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching
11 0.06054138 192 cvpr-2013-Graph Matching with Anchor Nodes: A Learning Approach
12 0.057609588 457 cvpr-2013-Visual Tracking via Locality Sensitive Histograms
13 0.053953901 405 cvpr-2013-Sparse Subspace Denoising for Image Manifolds
14 0.053319298 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
15 0.052674983 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
16 0.052253336 437 cvpr-2013-Towards Fast and Accurate Segmentation
17 0.051628102 109 cvpr-2013-Dense Non-rigid Point-Matching Using Random Projections
18 0.050665721 151 cvpr-2013-Event Retrieval in Large Video Collections with Circulant Temporal Encoding
20 0.045272224 379 cvpr-2013-Scalable Sparse Subspace Clustering
topicId topicWeight
[(0, 0.116), (1, 0.024), (2, -0.014), (3, 0.025), (4, 0.029), (5, -0.006), (6, -0.002), (7, -0.046), (8, -0.03), (9, -0.023), (10, 0.001), (11, 0.019), (12, -0.063), (13, -0.046), (14, 0.007), (15, -0.048), (16, -0.008), (17, 0.003), (18, 0.069), (19, 0.038), (20, -0.056), (21, 0.007), (22, 0.034), (23, -0.014), (24, 0.016), (25, -0.027), (26, -0.032), (27, 0.007), (28, 0.02), (29, 0.013), (30, 0.066), (31, 0.042), (32, 0.034), (33, -0.043), (34, -0.019), (35, 0.014), (36, -0.02), (37, 0.065), (38, 0.013), (39, 0.056), (40, -0.003), (41, -0.017), (42, 0.045), (43, 0.03), (44, 0.015), (45, -0.046), (46, 0.022), (47, 0.085), (48, 0.06), (49, -0.006)]
simIndex simValue paperId paperTitle
same-paper 1 0.92476273 429 cvpr-2013-The Generalized Laplacian Distance and Its Applications for Visual Matching
Author: Elhanan Elboer, Michael Werman, Yacov Hel-Or
Abstract: The graph Laplacian operator, which originated in spectral graph theory, is commonly used for learning applications such as spectral clustering and embedding. In this paper we explore the Laplacian distance, a distance function related to the graph Laplacian, and use it for visual search. We show that previous techniques such as Matching by Tone Mapping (MTM) are particular cases of the Laplacian distance. Generalizing the Laplacian distance results in distance measures which are tolerant to various visual distortions. A novel algorithm based on linear decomposition makes it possible to compute these generalized distances efficiently. The proposed approach is demonstrated for tone mapping invariant, outlier robust and multimodal template matching.
2 0.72807777 162 cvpr-2013-FasT-Match: Fast Affine Template Matching
Author: Simon Korman, Daniel Reichman, Gilad Tsur, Shai Avidan
Abstract: Fast-Match is a fast algorithm for approximate template matching under 2D affine transformations that minimizes the Sum-of-Absolute-Differences (SAD) error measure. There is a huge number of transformations to consider but we prove that they can be sampled using a density that depends on the smoothness of the image. For each potential transformation, we approximate the SAD error using a sublinear algorithm that randomly examines only a small number of pixels. We further accelerate the algorithm using a branch-and-bound scheme. As images are known to be piecewise smooth, the result is a practical affine template matching algorithm with approximation guarantees, that takes a few seconds to run on a standard machine. We perform several experiments on three different datasets, and report very good results. To the best of our knowledge, this is the first template matching algorithm which is guaranteed to handle arbitrary 2D affine transformations.
3 0.68265873 9 cvpr-2013-A Fast Semidefinite Approach to Solving Binary Quadratic Problems
Author: Peng Wang, Chunhua Shen, Anton van_den_Hengel
Abstract: Many computer vision problems can be formulated as binary quadratic programs (BQPs). Two classic relaxation methods are widely used for solving BQPs, namely, spectral methods and semidefinite programming (SDP), each with their own advantages and disadvantages. Spectral relaxation is simple and easy to implement, but its bound is loose. Semidefinite relaxation has a tighter bound, but its computational complexity is high for large scale problems. We present a new SDP formulation for BQPs, with two desirable properties. First, it has a similar relaxation bound to conventional SDP formulations. Second, compared with conventional SDP methods, the new SDP formulation leads to a significantly more efficient and scalable dual optimization approach, which has the same degree of complexity as spectral methods. Extensive experiments on various applications including clustering, image segmentation, co-segmentation and registration demonstrate the usefulness of our SDP formulation for solving large-scale BQPs.
4 0.64290303 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
Author: Jiayi Ma, Ji Zhao, Jinwen Tian, Zhuowen Tu, Alan L. Yuille
Abstract: We present a new point matching algorithm for robust nonrigid registration. The method iteratively recovers the point correspondence and estimates the transformation between two point sets. In the first step of the iteration, feature descriptors such as shape context are used to establish rough correspondence. In the second step, we estimate the transformation using a robust estimator called L2E. This is the main novelty of our approach and it enables us to deal with the noise and outliers which arise in the correspondence step. The transformation is specified in a functional space, more specifically a reproducing kernel Hilbert space. We apply our method to nonrigid sparse image feature correspondence on 2D images and 3D surfaces. Our results quantitatively show that our approach outperforms state-ofthe-art methods, particularly when there are a large number of outliers. Moreover, our method of robustly estimating transformations from correspondences is general and has many other applications.
5 0.61743623 317 cvpr-2013-Optimal Geometric Fitting under the Truncated L2-Norm
Author: Erik Ask, Olof Enqvist, Fredrik Kahl
Abstract: This paper is concerned with model fitting in the presence of noise and outliers. Previously it has been shown that the number of outliers can be minimized with polynomial complexity in the number of measurements. This paper improves on these results in two ways. First, it is shown that for a large class of problems, the statistically more desirable truncated L2-norm can be optimized with the same complexity. Then, with the same methodology, it is shown how to transform multi-model fitting into a purely combinatorial problem—with worst-case complexity that is polynomial in the number of measurements, though exponential in the number of models. We apply our framework to a series of hard registration and stitching problems demonstrating that the approach is not only of theoretical interest. It gives a practical method for simultaneously dealing with measurement noise and large amounts of outliers for fitting problems with lowdimensional models.
7 0.60870981 106 cvpr-2013-Deformable Graph Matching
8 0.60678017 97 cvpr-2013-Correspondence-Less Non-rigid Registration of Triangular Surface Meshes
9 0.59503734 369 cvpr-2013-Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination
10 0.58713096 240 cvpr-2013-Keypoints from Symmetries by Wave Propagation
11 0.58551633 93 cvpr-2013-Constraints as Features
12 0.57786471 51 cvpr-2013-Auxiliary Cuts for General Classes of Higher Order Functionals
13 0.57734692 23 cvpr-2013-A Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation
14 0.57631177 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching
15 0.56970912 191 cvpr-2013-Graph-Laplacian PCA: Closed-Form Solution and Robustness
16 0.56135333 141 cvpr-2013-Efficient Computation of Shortest Path-Concavity for 3D Meshes
17 0.54342157 184 cvpr-2013-Gauging Association Patterns of Chromosome Territories via Chromatic Median
18 0.54297405 109 cvpr-2013-Dense Non-rigid Point-Matching Using Random Projections
19 0.54059345 327 cvpr-2013-Pattern-Driven Colorization of 3D Surfaces
20 0.52600354 52 cvpr-2013-Axially Symmetric 3D Pots Configuration System Using Axis of Symmetry and Break Curve
topicId topicWeight
[(10, 0.09), (16, 0.054), (26, 0.066), (33, 0.23), (67, 0.054), (69, 0.067), (81, 0.265), (87, 0.062)]
simIndex simValue paperId paperTitle
same-paper 1 0.80152333 429 cvpr-2013-The Generalized Laplacian Distance and Its Applications for Visual Matching
Author: Elhanan Elboer, Michael Werman, Yacov Hel-Or
Abstract: The graph Laplacian operator, which originated in spectral graph theory, is commonly used for learning applications such as spectral clustering and embedding. In this paper we explore the Laplacian distance, a distance function related to the graph Laplacian, and use it for visual search. We show that previous techniques such as Matching by Tone Mapping (MTM) are particular cases of the Laplacian distance. Generalizing the Laplacian distance results in distance measures which are tolerant to various visual distortions. A novel algorithm based on linear decomposition makes it possible to compute these generalized distances efficiently. The proposed approach is demonstrated for tone mapping invariant, outlier robust and multimodal template matching.
2 0.75948834 351 cvpr-2013-Recovering Line-Networks in Images by Junction-Point Processes
Author: Dengfeng Chai, Wolfgang Förstner, Florent Lafarge
Abstract: The automatic extraction of line-networks from images is a well-known computer vision issue. Appearance and shape considerations have been deeply explored in the literature to improve accuracy in presence of occlusions, shadows, and a wide variety of irrelevant objects. However most existing works have ignored the structural aspect of the problem. We present an original method which provides structurally-coherent solutions. Contrary to the pixelbased and object-based methods, our result is a graph in which each node represents either a connection or an ending in the line-network. Based on stochastic geometry, we develop a new family of point processes consisting in sampling junction-points in the input image by using a Monte Carlo mechanism. The quality of a configuration is measured by a probability density which takes into account both image consistency and shape priors. Our experiments on a variety of problems illustrate the potential of our approach in terms of accuracy, flexibility and efficiency.
3 0.72619802 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images
Author: Quannan Li, Jiajun Wu, Zhuowen Tu
Abstract: Obtaining effective mid-level representations has become an increasingly important task in computer vision. In this paper, we propose a fully automatic algorithm which harvests visual concepts from a large number of Internet images (more than a quarter of a million) using text-based queries. Existing approaches to visual concept learning from Internet images either rely on strong supervision with detailed manual annotations or learn image-level classifiers only. Here, we take the advantage of having massive wellorganized Google and Bing image data; visual concepts (around 14, 000) are automatically exploited from images using word-based queries. Using the learned visual concepts, we show state-of-the-art performances on a variety of benchmark datasets, which demonstrate the effectiveness of the learned mid-level representations: being able to generalize well to general natural images. Our method shows significant improvement over the competing systems in image classification, including those with strong supervision.
4 0.70646828 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
Author: Bo Zheng, Yibiao Zhao, Joey C. Yu, Katsushi Ikeuchi, Song-Chun Zhu
Abstract: In this paper, we present an approach for scene understanding by reasoning physical stability of objects from point cloud. We utilize a simple observation that, by human design, objects in static scenes should be stable with respect to gravity. This assumption is applicable to all scene categories and poses useful constraints for the plausible interpretations (parses) in scene understanding. Our method consists of two major steps: 1) geometric reasoning: recovering solid 3D volumetric primitives from defective point cloud; and 2) physical reasoning: grouping the unstable primitives to physically stable objects by optimizing the stability and the scene prior. We propose to use a novel disconnectivity graph (DG) to represent the energy landscape and use a Swendsen-Wang Cut (MCMC) method for optimization. In experiments, we demonstrate that the algorithm achieves substantially better performance for i) object segmentation, ii) 3D volumetric recovery of the scene, and iii) better parsing result for scene understanding in comparison to state-of-the-art methods in both public dataset and our own new dataset.
5 0.70417476 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
Author: Yi Sun, Xiaogang Wang, Xiaoou Tang
Abstract: We propose a new approach for estimation of the positions of facial keypoints with three-level carefully designed convolutional networks. At each level, the outputs of multiple networks are fused for robust and accurate estimation. Thanks to the deep structures of convolutional networks, global high-level features are extracted over the whole face region at the initialization stage, which help to locate high accuracy keypoints. There are two folds of advantage for this. First, the texture context information over the entire face is utilized to locate each keypoint. Second, since the networks are trained to predict all the keypoints simultaneously, the geometric constraints among keypoints are implicitly encoded. The method therefore can avoid local minimum caused by ambiguity and data corruption in difficult image samples due to occlusions, large pose variations, and extreme lightings. The networks at the following two levels are trained to locally refine initial predictions and their inputs are limited to small regions around the initial predictions. Several network structures critical for accurate and robust facial point detection are investigated. Extensive experiments show that our approach outperforms state-ofthe-art methods in both detection accuracy and reliability1.
6 0.7033183 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
7 0.70298523 311 cvpr-2013-Occlusion Patterns for Object Class Detection
8 0.7029469 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
9 0.70256579 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
10 0.70217735 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
11 0.70191109 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
12 0.70133477 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment
13 0.70126361 424 cvpr-2013-Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure
14 0.70114982 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
15 0.70078212 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
16 0.70006812 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
17 0.69900602 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
18 0.69900572 96 cvpr-2013-Correlation Filters for Object Alignment
19 0.69890451 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
20 0.69866627 325 cvpr-2013-Part Discovery from Partial Correspondence