iccv iccv2013 iccv2013-365 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Alexandra Gilinsky, Lihi Zelnik Manor
Abstract: Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. In this paper we propose the SIFTpack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the current solutions. SIFTpack can be used to represent SIFTs densely extracted from a single image or sparsely from multiple different images. We show that the SIFTpack representation saves both storage space and run time, for both finding nearest neighbors and for computing all distances between all descriptors. The usefulness of SIFTpack is also demonstrated as an alternative implementation for K-means dictionaries of visual words.
Reference: text
sentIndex sentText sentNum sentScore
1 SIFTpack: a compact representation for efficient SIFT matching Alexandra Gilinsky Technion Israel Institute of Technology s a shkagi l gmai l @ . [sent-1, score-0.053]
2 com Abstract Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. [sent-2, score-0.137]
3 When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. [sent-3, score-0.145]
4 In this paper we propose the SIFTpack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the current solutions. [sent-4, score-0.089]
5 We show that the SIFTpack representation saves both storage space and run time, for both finding nearest neighbors and for computing all distances between all descriptors. [sent-6, score-0.172]
6 Introduction In numerous applications in computer vision a basic building block is the computation of distances between sets of SIFT descriptors [16]. [sent-9, score-0.137]
7 In other cases one needs to compute all the distances between all the SIFTs, e. [sent-13, score-0.058]
8 As the number of descriptors increases, the computation time of these distances becomes a major consideration in the applicability of the algorithm. [sent-16, score-0.128]
9 The second approach reduces the number of descriptors by using sparse sampling of the image [26, 29]. [sent-20, score-0.081]
10 Furthermore, in many applications, the obtained set of descriptors could still be very large, e. [sent-25, score-0.07]
11 Hence, efficient methods for computing the distances are still required. [sent-28, score-0.075]
12 These reduce run-time significantly, however, they are relevant only for the nearestneighbor case and become inefficient when all distances need to be computed. [sent-30, score-0.058]
13 Algorithms for approximate matching across images are also highly efficient [5, 10, 13, 20], but are limited to dense matching across a pair of images and most of them cannot be applied for matching SIFTs. [sent-31, score-0.077]
14 In this paper we propose the SIFTpack: a compact form for storing a set of SIFT descriptors that reduces both storage space and run-time when comparing sets of SIFTs. [sent-32, score-0.192]
15 Our key idea is to exploit redundancies between descriptors to store them efficiently in an image-like structure. [sent-33, score-0.124]
16 Redundancies can occur, for example, when two descriptors are computed over overlapping image regions. [sent-34, score-0.097]
17 In this case the descriptors have a shared part that doesn’t need to be stored twice. [sent-35, score-0.118]
18 The SIFTpack construction identifies such redundancies and hence saves in storage space. [sent-36, score-0.145]
19 We show how SIFTpack can be used to compactly represent descriptors extracted densely from a single image. [sent-37, score-0.11]
20 We further suggest an algorithm for constructing SIFTpacks for descriptors extracted sparsely from one or many images. [sent-38, score-0.132]
21 The key contribution of this work is a significant reduc- tion in run-time when computing distances between sets of SIFTs. [sent-40, score-0.084]
22 Second, since the SIFTs are stored in an image-like form, we can utilize existing efficient algorithms for fast matching between images. [sent-43, score-0.051]
23 We suggest such solutions for both nearest-neighbor matching and all-distances calculation. [sent-44, score-0.046]
24 We start by describing the SIFTpack construction for dense-SIFT and for an arbitrary set of SIFTs in Section 2. [sent-47, score-0.043]
25 Redundancy in overlapping SIFTs: When two SIFT descriptors are computed over overlapping image regions the shared region (marked in gray) could result in common SIFT val- ues in the descriptors (marked in green). [sent-49, score-0.213]
26 The joint values will appear at different entries of the vectors and will be stored twice by the standard approach of keeping SIFTs in an array. [sent-50, score-0.052]
27 The SIFTpack representation The SIFTpack construction that we propose is based on two observations. [sent-55, score-0.044]
28 The first is that two different SIFT descriptors could still have shared components. [sent-56, score-0.089]
29 Storing these shared components twice is redundant and inefficient. [sent-58, score-0.048]
30 We first describe how SIFTpack is constructed for dense-SIFT and then continue to present an algorithm for constructing a SIFTpack for an arbitrary set of SIFTs. [sent-62, score-0.071]
31 aT o1 efficiently satyo,re w dheernese M-S IiFsT th descriptors f[2 d8es] we tsotrarst. [sent-71, score-0.07]
32 Histograms ×tha 4t are shared between descriptors are stored only once in this construction, e. [sent-76, score-0.118]
33 sAtosg gilralumst irsat ceodm minFigure 1v,e rina at shiusb c-roengsiotrnucotifo sniz tehen spatial regions ofneighboring descriptors overlap and the sub-regions are aligned. [sent-84, score-0.088]
34 Without the weighting, two descriptors with overlapping regions will have at least one shared histogram. [sent-87, score-0.116]
35 In fact, all the descriptors including a certain sub-region will share its corresponding gradient histogram. [sent-88, score-0.07]
36 While the standard approach stores this histogram multiple times, once for each × descriptor, we wish to store it only once. [sent-89, score-0.058]
37 To enable a compact and image-like representation we start by reshaping the extracted SIFT descriptors. [sent-90, score-0.044]
38 As illustrated in Figure 2, we reshape each 128 dimensional SIFT vector into a 3D array of dimensions 4 4 8. [sent-91, score-0.045]
39 The key idea behind our construction is that after the reshape step, the shared histograms of two neighboring descriptors correspond to shared “pixels”. [sent-102, score-0.164]
40 Therefore, to store the descriptors compactly, we simply place them next to each other, with overlaps, according to their original position in the image. [sent-103, score-0.091]
41 The SIFTpack storage space is 16 times smaller than that of the conventional approach of saving all SIFTs in an array. [sent-105, score-0.121]
42 In practice, Gaussian weighting is applied to the region over which the descriptor is computed, thus two overlapping SIFTs do not share the exact same entry values. [sent-108, score-0.066]
43 we store only one, averaged version, of all SIFT entries that describe the same spatial bin. [sent-111, score-0.042]
44 We performed this twice, once with the original SIFT descriptors, and once while ×× averaging overlapping descriptors. [sent-115, score-0.053]
45 While the averaging described above alters the normaliztion a bit, this is not significant as matching accuracy is not harmed. [sent-119, score-0.048]
46 SIFTpack for a set of SIFTs Next, we suggest a similar construction for the more popular case where SIFTs are extracted at isolated locations (typically interest points), possibly from multiple different × images, of different scales and orientations. [sent-128, score-0.055]
47 When the number of descriptors is large we expect to find many similarities between part of them, e. [sent-129, score-0.07]
48 However, unlike the dense case, here we cannot tell a-priori which descriptors have corresponding components. [sent-132, score-0.07]
49 To identify these repetitions and enable the construction of a compact SIFTpack we build upon the dictionary optimization framework of [1]. [sent-133, score-0.097]
50 , M we wish to find a SIFTpack S, of size m m 8, that forms a sparsifying dictionary for them. [sent-137, score-0.048]
51 Flow-field puted by SIFT-flow accuracy on Middlebury (a) [15] are consistently and Mikolajczyk data-set (b): As can be seen from both tables, the flow-fields more accurate when applied to descriptors supplementary for more detailed results). [sent-161, score-0.07]
52 This suggests that averaging overlapping packed in SIFTpack, SIFTs com- than the original SIFTs (see reduces noise and hence matching accuracy is improved. [sent-162, score-0.095]
53 When × the SIFTpack represents a set of SIFTs extracted at interest points of multiple different images we have found empirically that best results were obtained when initializing with a SIFTpack constructed from a dense-SIFT of a randomly chosen image, as described in the previous section. [sent-179, score-0.043]
54 It differs from the standard dictionaries in exploiting redundancies between visual words to store the dictionary more compactly. [sent-191, score-0.11]
55 Update the SIFTpack by averaging SIFTs assigned to overlapping SIFTpack entries. [sent-195, score-0.053]
56 The advantage of the gradual resizing is a more compact SIFTpack for the same representation error. [sent-201, score-0.071]
57 On the down side, applying Algorithm 2 with gradual resizing is slower, and hence is not always preferable. [sent-202, score-0.049]
58 Exchanging the distance metric amounts to replacing the update step of Equation (2) by an appropriate calculation for averaging overlapping SIFTs and exchanging stage 1with regards to the new metric. [sent-204, score-0.082]
59 Efficient matching solutions So far we have described algorithms for constructing a space-efficient representation for multiple SIFTs, extracted 778800 (a) (b) Figure 5. [sent-206, score-0.089]
60 SIFTpack for a set of SIFTs: (a) The 8 layers of a SIFTpack constructed of ∼ 50, 000 SIFTs extracted at interest points afrcokm c donifsfetrruecntte images w 5i0th, 0di0f0fer SeIFntT ssca elexstr. [sent-207, score-0.05]
61 (b) dTh aet average representation error (blue) and saving in storage space (red) as a function of the SIFTpack size m. [sent-208, score-0.134]
62 While saving in storage space is a desirable property, our main goal is to obtain a significant reduction in computation time as well. [sent-211, score-0.131]
63 Computing all distances In applications such as image segmentation, cosegmentation and self-similarity estimation one needs to compute all (or multiple) distances between all (or multiple) pairs of descriptors within a given set or across two sets of SIFTs. [sent-215, score-0.208]
64 When the number of descriptors is large, computing all distances naively is highly time consuming: O(M2), where M is the number of descriptors. [sent-216, score-0.156]
65 Storing the descriptors in a SIFTpack enables a more efficient computation since we avoid redundant calculations. [sent-217, score-0.086]
66 In applications where distances are to be computed between the descriptors of a single set, we assign S2 = S1. [sent-221, score-0.128]
67 Our approach for efficiently computing all distances between the descriptors of S1 and S2 is adopted from [20], where it was used to compare image patches. [sent-223, score-0.145]
68 We use the Integral Image [6] to compute the distance between all pairs of descriptors in S1 and S2 that have the same location i,j. [sent-224, score-0.094]
69 To compute distances between descriptors at different locations we loop through all shifts k, l of S2. [sent-225, score-0.141]
70 Our algorithm can be summarized as follows: Algorithm 3 All distances between SIFTs Input: SIFTpacks S1 and S2 for all shifts k, ldo Compute the element-wise square difference Δ = (S1 − S2[k,l])2 Compute t She2 Integral Image F(Δ) , summing the 8 layers. [sent-226, score-0.071]
71 is equal to: F(i, j) + F(i + 3, j + 3) − F(i + 3, j) − F(i, j + 3) endF f(oi,rj Output: Distances between all SIFTs – – – The distance between S1i,j and S2ik,j,l Algorithm 3 loops through all possible shifts k, l, similarly to the naive solution. [sent-227, score-0.042]
72 However, it computes the distances between all pairs of descriptors with a location shift k, l, faster than the standard solution. [sent-228, score-0.171]
73 For this experiment we first extract the dense-SIFT [28] descriptors of images of varying sizes. [sent-231, score-0.079]
74 We then compute the distances between all pairs of SIFTs within a a radius R of each other, using the naive approach. [sent-232, score-0.1]
75 Next, we construct a SIFTpack for each image, using Algorithm 1, and compute the distances between the same pairs of SIFTs using Algorithm 3. [sent-233, score-0.081]
76 For each image we repeated the distances computation 100 times and averaged the results. [sent-236, score-0.069]
77 The above experiment shows the speed-up that can be obtained for applications that require computing distances 778811 Figure 6. [sent-242, score-0.084]
78 Run-time saving by SIFTpack when computing multiple distances: The curves represent the average run-time for computing distances between all pairs of SIFTs of an image, within a radius R of each other. [sent-243, score-0.172]
79 When one needs to compute all distances between two arbitrary sets of SIFTs, the SIFTpack construction needs to be performed via Algorithm 2, which is time consuming on its own. [sent-250, score-0.12]
80 Exact Nearest-Neighbor matching When a single nearest-neighbor is needed, the naive solution is to compute all distances and take the minimum. [sent-254, score-0.109]
81 We propose to use SIFTpacks and Algorithm 3, with the slight modification that only the nearest neighbor is stored for each descriptor. [sent-256, score-0.053]
82 More recently it has been shown that even faster algorithms can be developed when computing ANN densely across images [5, 10, 13, 20]. [sent-266, score-0.046]
83 The dictionary construction process is performed off-line, hence, its computation time is of lesser interest. [sent-278, score-0.065]
84 Run-time saving by SIFTpack for ANN: Computing the ANN using SIFTpack and TreeCANN leads to significantly lower errors and faster run time than both Kd-trees and PatchMatch. [sent-280, score-0.086]
85 First, we wish to evaluate the benefits in storage space and representation error of SIFTpack in comparison with the standard K-means dictionary. [sent-284, score-0.093]
86 Storage space saving: To assess the benefits of SIFTpack in terms of storage space we performed the following ex- ×× × periment. [sent-287, score-0.066]
87 In addition, we also construct a standard dictionary using K-means, setting K such that the same (as much as possible) representation error is obtained. [sent-290, score-0.057]
88 ×As 8 can bSIeF seen, kth aen saving 1in2 8st foor-r age space is tremendous for both setups. [sent-304, score-0.067]
89 When packing dense-SIFTs the space-reduction is more significant since the SIFTpack is more compact due to the gradual resizing during its construction. [sent-305, score-0.058]
90 Space saving by SIFTpack for BoW: The plots present the required storage space of SIFTpacks of varying representation errors, constructed with Algorithm 2, and of corresponding k-means dictionaries (with the same representation error). [sent-307, score-0.188]
91 Run-time saving: Next, we assess the run-time benefits of SIFTpack when constructing the bag-of-words representation for an image. [sent-311, score-0.053]
92 We use as test-bed the SIFTpack and K-means dictionaries constructed in the previous experiment from ∼ 1700 images of 6 different scenes (corresponding pairs w∼ith 17 7t0he0 same representation netrr socre). [sent-313, score-0.076]
93 n Gesiv (ecnor an arbitrary input image we first extract dense-SIFT descriptors and construct the corresponding SIFTpack using Algorithm 1. [sent-314, score-0.092]
94 Next, we find for each descriptor its nearest-neighbor in the k-means dictionary and in the SIFTpack. [sent-315, score-0.057]
95 We should note that as far as exact NN are concerned, the kd-tree algorithm doesn’t outperform the naive one significantly. [sent-320, score-0.045]
96 Conclusions This paper suggested an approach for compactly storing sets of SIFT descriptors. [sent-322, score-0.056]
97 We have shown that the proposed SIFTpack saves not only in storage space, but more importantly, it enables huge reductions in run-time for matching between SIFTs. [sent-323, score-0.094]
98 In particular, we have shown empirically the benefits of using SIFTpack instead of the traditional Bagof-Words dictionary and as an alternative image signature for retrieval purposes. [sent-325, score-0.057]
99 In addi778833 ing the bag-of-words model (constructing the histogram of frequencies) is significantly faster when using SIFTpack to represent the dictionary, instead of the standard k-means dictionary with the same representation error. [sent-327, score-0.077]
100 tion, our framework could be easily extended to other descriptors whose spatial properties are similar to SIFT. [sent-328, score-0.07]
wordName wordTfidf (topN-words)
[('siftpack', 0.882), ('sifts', 0.396), ('siftpacks', 0.071), ('descriptors', 0.07), ('saving', 0.067), ('sift', 0.06), ('distances', 0.058), ('storage', 0.054), ('treecann', 0.045), ('ann', 0.036), ('dictionary', 0.034), ('redundancies', 0.033), ('construction', 0.031), ('storing', 0.03), ('naive', 0.029), ('stored', 0.029), ('constructing', 0.028), ('overlapping', 0.027), ('averaging', 0.026), ('reshape', 0.025), ('descriptor', 0.023), ('dictionaries', 0.022), ('pages', 0.022), ('resizing', 0.022), ('matching', 0.022), ('store', 0.021), ('rezaiifar', 0.02), ('vidpairs', 0.02), ('array', 0.02), ('shared', 0.019), ('patchmatch', 0.019), ('israel', 0.019), ('constructed', 0.019), ('faster', 0.019), ('saves', 0.018), ('gradual', 0.018), ('compact', 0.018), ('layers', 0.018), ('compactly', 0.017), ('computing', 0.017), ('exact', 0.016), ('redundant', 0.016), ('technion', 0.016), ('nn', 0.015), ('exchanging', 0.015), ('omp', 0.015), ('shechtman', 0.015), ('wish', 0.014), ('mikolajczyk', 0.014), ('repetitions', 0.014), ('calculation', 0.014), ('shifts', 0.013), ('representation', 0.013), ('coherency', 0.013), ('extracted', 0.013), ('pairs', 0.013), ('solutions', 0.013), ('twice', 0.013), ('calculations', 0.013), ('doesn', 0.012), ('joulin', 0.012), ('benefits', 0.012), ('arbitrary', 0.012), ('neighbor', 0.012), ('stores', 0.012), ('middlebury', 0.012), ('continue', 0.012), ('pursuit', 0.012), ('nearest', 0.012), ('suggest', 0.011), ('integral', 0.011), ('histogram', 0.011), ('ims', 0.011), ('location', 0.011), ('highly', 0.011), ('alternative', 0.011), ('price', 0.011), ('reduces', 0.011), ('averaged', 0.011), ('initializing', 0.011), ('implications', 0.01), ('reduction', 0.01), ('construct', 0.01), ('consuming', 0.01), ('densely', 0.01), ('resized', 0.01), ('sparsely', 0.01), ('entries', 0.01), ('sets', 0.009), ('siggraph', 0.009), ('experiment', 0.009), ('hence', 0.009), ('asilomar', 0.009), ('lihi', 0.009), ('densesift', 0.009), ('timefrequency', 0.009), ('gutierrez', 0.009), ('ofneighboring', 0.009), ('olonetsky', 0.009), ('ceodm', 0.009)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 365 iccv-2013-SIFTpack: A Compact Representation for Efficient SIFT Matching
Author: Alexandra Gilinsky, Lihi Zelnik Manor
Abstract: Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. In this paper we propose the SIFTpack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the current solutions. SIFTpack can be used to represent SIFTs densely extracted from a single image or sparsely from multiple different images. We show that the SIFTpack representation saves both storage space and run time, for both finding nearest neighbors and for computing all distances between all descriptors. The usefulness of SIFTpack is also demonstrated as an alternative implementation for K-means dictionaries of visual words.
2 0.041295763 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
Author: Hans Lobel, René Vidal, Alvaro Soto
Abstract: Currently, Bag-of-Visual-Words (BoVW) and part-based methods are the most popular approaches for visual recognition. In both cases, a mid-level representation is built on top of low-level image descriptors and top-level classifiers use this mid-level representation to achieve visual recognition. While in current part-based approaches, mid- and top-level representations are usually jointly trained, this is not the usual case for BoVW schemes. A main reason for this is the complex data association problem related to the usual large dictionary size needed by BoVW approaches. As a further observation, typical solutions based on BoVW and part-based representations are usually limited to extensions of binary classification schemes, a strategy that ignores relevant correlations among classes. In this work we propose a novel hierarchical approach to visual recognition based on a BoVW scheme that jointly learns suitable midand top-level representations. Furthermore, using a maxmargin learning framework, the proposed approach directly handles the multiclass case at both levels of abstraction. We test our proposed method using several popular bench- mark datasets. As our main result, we demonstrate that, by coupling learning of mid- and top-level representations, the proposed approach fosters sharing of discriminative visual words among target classes, being able to achieve state-ofthe-art recognition performance using far less visual words than previous approaches.
3 0.038760494 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization com, , tom. . cai@sydney . edu . au , heng@uta .edu make the learning tasks easier to deal with and reduce the computational cost. For example, in image tagging, instead of using the raw pixel-wise features, semi-local or patch- based features, such as SIFT and geometric blur, are usually more desirable to achieve better performance. In practice, finding a set of compact features bases, also referred to as dictionary, with enhanced representative and discriminative power, plays a significant role in building a successful computer vision system. In this paper, we explore this important problem by proposing a novel formulation and its solution for learning Semi-Supervised Robust Dictionary (SSRD), where we examine the challenges in dictionary learning, and seek opportunities to overcome them and improve the dictionary qualities. 1.1. Challenges in Dictionary Learning to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method.
4 0.038247697 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
5 0.033530794 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
Author: Matthijs Douze, Jérôme Revaud, Cordelia Schmid, Hervé Jégou
Abstract: This paper makes two complementary contributions to event retrieval in large collections of videos. First, we propose hyper-pooling strategies that encode the frame descriptors into a representation of the video sequence in a stable manner. Our best choices compare favorably with regular pooling techniques based on k-means quantization. Second, we introduce a technique to improve the ranking. It can be interpreted either as a query expansion method or as a similarity adaptation based on the local context of the query video descriptor. Experiments on public benchmarks show that our methods are complementary and improve event retrieval results, without sacrificing efficiency.
6 0.030501518 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
7 0.030290939 105 iccv-2013-DeepFlow: Large Displacement Optical Flow with Deep Matching
8 0.029483847 74 iccv-2013-Co-segmentation by Composition
10 0.027997617 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
11 0.026862893 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
12 0.02555665 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
13 0.025488568 48 iccv-2013-An Adaptive Descriptor Design for Object Recognition in the Wild
14 0.024785601 450 iccv-2013-What is the Most EfficientWay to Select Nearest Neighbor Candidates for Fast Approximate Nearest Neighbor Search?
15 0.024710201 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation
16 0.024260538 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors
17 0.024087645 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
18 0.023812249 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
19 0.022040138 57 iccv-2013-BOLD Features to Detect Texture-less Objects
20 0.021968435 128 iccv-2013-Dynamic Probabilistic Volumetric Models
topicId topicWeight
[(0, 0.055), (1, 0.012), (2, -0.014), (3, -0.01), (4, -0.029), (5, 0.016), (6, -0.021), (7, -0.005), (8, -0.019), (9, -0.005), (10, 0.023), (11, 0.008), (12, 0.026), (13, -0.008), (14, -0.014), (15, 0.008), (16, 0.007), (17, -0.002), (18, 0.035), (19, 0.008), (20, 0.011), (21, -0.001), (22, 0.002), (23, 0.004), (24, -0.009), (25, -0.005), (26, -0.0), (27, 0.005), (28, -0.029), (29, 0.016), (30, -0.021), (31, 0.031), (32, 0.024), (33, 0.011), (34, -0.004), (35, -0.021), (36, 0.002), (37, -0.049), (38, 0.008), (39, -0.004), (40, 0.01), (41, 0.021), (42, -0.013), (43, -0.014), (44, -0.01), (45, -0.05), (46, 0.01), (47, -0.017), (48, -0.011), (49, 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.83011782 365 iccv-2013-SIFTpack: A Compact Representation for Efficient SIFT Matching
Author: Alexandra Gilinsky, Lihi Zelnik Manor
Abstract: Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. In this paper we propose the SIFTpack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the current solutions. SIFTpack can be used to represent SIFTs densely extracted from a single image or sparsely from multiple different images. We show that the SIFTpack representation saves both storage space and run time, for both finding nearest neighbors and for computing all distances between all descriptors. The usefulness of SIFTpack is also demonstrated as an alternative implementation for K-means dictionaries of visual words.
2 0.61475503 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors
Author: Nakamasa Inoue, Koichi Shinoda
Abstract: Assigning a visual code to a low-level image descriptor, which we call code assignment, is the most computationally expensive part of image classification algorithms based on the bag of visual word (BoW) framework. This paper proposes a fast computation method, Neighbor-toNeighbor (NTN) search, for this code assignment. Based on the fact that image features from an adjacent region are usually similar to each other, this algorithm effectively reduces the cost of calculating the distance between a codeword and a feature vector. This method can be applied not only to a hard codebook constructed by vector quantization (NTN-VQ), but also to a soft codebook, a Gaussian mixture model (NTN-GMM). We evaluated this method on the PASCAL VOC 2007 classification challenge task. NTN-VQ reduced the assignment cost by 77.4% in super-vector coding, and NTN-GMM reduced it by 89.3% in Fisher-vector coding, without any significant degradation in classification performance.
Author: Yannis Avrithis
Abstract: Inspired by the close relation between nearest neighbor search and clustering in high-dimensional spaces as well as the success of one helping to solve the other, we introduce a new paradigm where both problems are solved simultaneously. Our solution is recursive, not in the size of input data but in the number of dimensions. One result is a clustering algorithm that is tuned to small codebooks but does not need all data in memory at the same time and is practically constant in the data size. As a by-product, a tree structure performs either exact or approximate quantization on trained centroids, the latter being not very precise but extremely fast. A lesser contribution is a new indexing scheme for image retrieval that exploits multiple small codebooks to provide an arbitrarily fine partition of the descriptor space. Large scale experiments on public datasets exhibit state of the art performance and remarkable generalization.
Author: Marco San_Biagio, Marco Crocco, Marco Cristani, Samuele Martelli, Vittorio Murino
Abstract: Capturing the essential characteristics of visual objects by considering how their features are inter-related is a recent philosophy of object classification. In this paper, we embed this principle in a novel image descriptor, dubbed Heterogeneous Auto-Similarities of Characteristics (HASC). HASC is applied to heterogeneous dense features maps, encoding linear relations by covariances and nonlinear associations through information-theoretic measures such as mutual information and entropy. In this way, highly complex structural information can be expressed in a compact, scale invariant and robust manner. The effectiveness of HASC is tested on many diverse detection and classification scenarios, considering objects, textures and pedestrians, on widely known benchmarks (Caltech-101, Brodatz, Daimler Multi-Cue). In all the cases, the results obtained with standard classifiers demonstrate the superiority of HASC with respect to the most adopted local feature descriptors nowadays, such as SIFT, HOG, LBP and feature covariances. In addition, HASC sets the state-of-the-art on the Brodatz texture dataset and the Daimler Multi-Cue pedestrian dataset, without exploiting ad-hoc sophisticated classifiers.
5 0.54413152 419 iccv-2013-To Aggregate or Not to aggregate: Selective Match Kernels for Image Search
Author: Giorgos Tolias, Yannis Avrithis, Hervé Jégou
Abstract: This paper considers a family of metrics to compare images based on their local descriptors. It encompasses the VLAD descriptor and matching techniques such as Hamming Embedding. Making the bridge between these approaches leads us to propose a match kernel that takes the best of existing techniques by combining an aggregation procedure with a selective match kernel. Finally, the representation underpinning this kernel is approximated, providing a large scale image search both precise and scalable, as shown by our experiments on several benchmarks.
6 0.54252285 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
7 0.52791625 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology
8 0.52731723 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
9 0.524131 288 iccv-2013-Nested Shape Descriptors
11 0.5178045 189 iccv-2013-HOGgles: Visualizing Object Detection Features
12 0.50854927 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
13 0.50730121 416 iccv-2013-The Interestingness of Images
14 0.50260264 48 iccv-2013-An Adaptive Descriptor Design for Object Recognition in the Wild
15 0.49137804 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
16 0.48971921 388 iccv-2013-Shape Index Descriptors Applied to Texture-Based Galaxy Analysis
18 0.48743397 74 iccv-2013-Co-segmentation by Composition
19 0.48579976 221 iccv-2013-Joint Inverted Indexing
20 0.47957498 312 iccv-2013-Perceptual Fidelity Aware Mean Squared Error
topicId topicWeight
[(2, 0.071), (3, 0.261), (4, 0.013), (13, 0.013), (16, 0.013), (26, 0.069), (27, 0.012), (31, 0.037), (42, 0.076), (48, 0.015), (64, 0.022), (73, 0.026), (89, 0.183), (95, 0.011), (98, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.75656569 365 iccv-2013-SIFTpack: A Compact Representation for Efficient SIFT Matching
Author: Alexandra Gilinsky, Lihi Zelnik Manor
Abstract: Computing distances between large sets of SIFT descriptors is a basic step in numerous algorithms in computer vision. When the number of descriptors is large, as is often the case, computing these distances can be extremely time consuming. In this paper we propose the SIFTpack: a compact way of storing SIFT descriptors, which enables significantly faster calculations between sets of SIFTs than the current solutions. SIFTpack can be used to represent SIFTs densely extracted from a single image or sparsely from multiple different images. We show that the SIFTpack representation saves both storage space and run time, for both finding nearest neighbors and for computing all distances between all descriptors. The usefulness of SIFTpack is also demonstrated as an alternative implementation for K-means dictionaries of visual words.
2 0.73010129 250 iccv-2013-Lifting 3D Manhattan Lines from a Single Image
Author: Srikumar Ramalingam, Matthew Brand
Abstract: We propose a novel and an efficient method for reconstructing the 3D arrangement of lines extracted from a single image, using vanishing points, orthogonal structure, and an optimization procedure that considers all plausible connectivity constraints between lines. Line detection identifies a large number of salient lines that intersect or nearly intersect in an image, but relatively a few of these apparent junctions correspond to real intersections in the 3D scene. We use linear programming (LP) to identify a minimal set of least-violated connectivity constraints that are sufficient to unambiguously reconstruct the 3D lines. In contrast to prior solutions that primarily focused on well-behaved synthetic line drawings with severely restricting assumptions, we develop an algorithm that can work on real images. The algorithm produces line reconstruction by identifying 95% correct connectivity constraints in York Urban database, with a total computation time of 1 second per image.
3 0.72274506 286 iccv-2013-NYC3DCars: A Dataset of 3D Vehicles in Geographic Context
Author: Kevin Matzen, Noah Snavely
Abstract: Geometry and geography can play an important role in recognition tasks in computer vision. To aid in studying connections between geometry and recognition, we introduce NYC3DCars, a rich dataset for vehicle detection in urban scenes built from Internet photos drawn from the wild, focused on densely trafficked areas of New York City. Our dataset is augmented with detailed geometric and geographic information, including full camera poses derived from structure from motion, 3D vehicle annotations, and geographic information from open resources, including road segmentations and directions of travel. NYC3DCars can be used to study new questions about using geometric information in detection tasks, and to explore applications of Internet photos in understanding cities. To demonstrate the utility of our data, we evaluate the use of the geographic information in our dataset to enhance a parts-based detection method, and suggest other avenues for future exploration.
4 0.71347129 141 iccv-2013-Enhanced Continuous Tabu Search for Parameter Estimation in Multiview Geometry
Author: Guoqing Zhou, Qing Wang
Abstract: Optimization using the L∞ norm has been becoming an effective way to solve parameter estimation problems in multiview geometry. But the computational cost increases rapidly with the size of measurement data. Although some strategies have been presented to improve the efficiency of L∞ optimization, it is still an open issue. In the paper, we propose a novel approach under theframework ofenhanced continuous tabu search (ECTS) for generic parameter estimation in multiview geometry. ECTS is an optimization method in the domain of artificial intelligence, which has an interesting ability of covering a wide solution space by promoting the search far away from current solution and consecutively decreasing the possibility of trapping in the local minima. Taking the triangulation as an example, we propose the corresponding ways in the key steps of ECTS, diversification and intensification. We also present theoretical proof to guarantee the global convergence of search with probability one. Experimental results have validated that the ECTS based approach can obtain global optimum efficiently, especially for large scale dimension of parameter. Potentially, the novel ECTS based algorithm can be applied in many applications of multiview geometry.
5 0.67573643 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification
Author: Xudong Cao, David Wipf, Fang Wen, Genquan Duan, Jian Sun
Abstract: Face verification involves determining whether a pair of facial images belongs to the same or different subjects. This problem can prove to be quite challenging in many important applications where labeled training data is scarce, e.g., family album photo organization software. Herein we propose a principled transfer learning approach for merging plentiful source-domain data with limited samples from some target domain of interest to create a classifier that ideally performs nearly as well as if rich target-domain data were present. Based upon a surprisingly simple generative Bayesian model, our approach combines a KL-divergencebased regularizer/prior with a robust likelihood function leading to a scalable implementation via the EM algorithm. As justification for our design choices, we later use principles from convex analysis to recast our algorithm as an equivalent structured rank minimization problem leading to a number of interesting insights related to solution structure and feature-transform invariance. These insights help to both explain the effectiveness of our algorithm as well as elucidate a wide variety of related Bayesian approaches. Experimental testing with challenging datasets validate the utility of the proposed algorithm.
6 0.64895624 404 iccv-2013-Structured Forests for Fast Edge Detection
8 0.64690977 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
9 0.64657468 426 iccv-2013-Training Deformable Part Models with Decorrelated Features
10 0.64653915 351 iccv-2013-Restoring an Image Taken through a Window Covered with Dirt or Rain
11 0.64644647 238 iccv-2013-Learning Graphs to Match
12 0.64604628 327 iccv-2013-Predicting an Object Location Using a Global Image Representation
13 0.64581198 361 iccv-2013-Robust Trajectory Clustering for Motion Segmentation
14 0.64577329 71 iccv-2013-Category-Independent Object-Level Saliency Detection
15 0.64544803 101 iccv-2013-DCSH - Matching Patches in RGBD Images
16 0.6454159 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
17 0.64524841 18 iccv-2013-A Joint Intensity and Depth Co-sparse Analysis Model for Depth Map Super-resolution
18 0.64490914 134 iccv-2013-Efficient Higher-Order Clustering on the Grassmann Manifold
19 0.64417422 128 iccv-2013-Dynamic Probabilistic Volumetric Models
20 0.64398021 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation