iccv iccv2013 iccv2013-294 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jayaguru Panda, Michael S. Brown, C.V. Jawahar
Abstract: Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. In this scenario, there are no restrictions on the size of the visual database. This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. Such solutions have applications in location recognition and product recognition. Mobile instance retrieval requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval index implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they are able to maintain a good Precision for the top K results (PK). We argue that for such offline applications, maintaining a good PK is sufficient. The effectiveness of this approach is demonstrated on several standard databases. A working application designed for a remote historical site is also presented. This application is able to reduce a 50,000-image index structure to 25 MB while providing a precision of 97% for P10 and 100% for P1.
Reference: text
sentIndex sentText sentNum sentScore
1 (Affiliations: 1CVIT, IIIT Hyderabad, India; 2NUS, Singapore.) Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. [sent-4, score-0.882]
2 This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. [sent-6, score-0.957]
3 Mobile instance retrieval requires a significant reduction in the visual index size. [sent-8, score-0.634]
4 To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval index implementation found on desktops or servers. [sent-9, score-0.362]
5 This application is able to reduce a 50,000-image index structure to 25 MB while providing a precision of 97% for P10 and 100% for P1. [sent-14, score-0.365]
6 Introduction Mobile image retrieval allows users to identify visual information about their environment by transmitting image queries to an online image database that has associated annotations (e. [sent-16, score-0.334]
7 This paper examines mobile image retrieval for offline use when a network connection is limited or not available. [sent-20, score-0.657]
8 In this scenario, the entire visual search index must reside on the mobile device itself. [sent-21, score-0.897]
9 Figure 1 shows an example use case where mobile Figure 1. [sent-25, score-0.334]
10 Example use of our method where landmark location can be performed offline by indexing an image collection within a small memory footprint for use on a mobile device. [sent-26, score-0.806]
11 While our targeted instance retrieval does not need to store the images, the entire visual index structure needs to fit on a mobile device, ideally within a small footprint (e. [sent-28, score-1.109]
12 First, while mobile phones have up to 16-32GB of storage, this is mainly in the form of slower flash memory that is an order of magnitude slower than RAM. [sent-32, score-0.475]
13 Second, this small size is in line with common practices for mobile applications; e.g. [sent-34, score-0.374]
14 Contribution We introduce a series of pruning steps that can significantly reduce the index size and incorporate partial geometry in it. [sent-38, score-0.502]
15 These include a simple vocabulary pruning based on word frequency, exploiting a 2-word-phrase based index structure, placing constraints on feature geometry and encoding the relative geometry of visual words. [sent-39, score-0.927]
16 Each step is discussed in detail and a running example using the Oxford building dataset is used to show the effects on the index size and performance. [sent-40, score-0.303]
17 In fact, our collective strategies are able to reduce the visual index up to 60-80× with virtually no change in P10, P5 and P1. [sent-42, score-0.362]
18 Related Work Image recognition and retrieval for mobile devices already has several successful examples. [sent-47, score-0.528]
19 Most of these mobile apps use a client-server model and communicate with a remote server for matching within large image databases [12, 6, 5, 7]. [sent-49, score-0.513]
20 We address this problem in a far more challenging setting, where the entire computation must happen on the mobile device. [sent-51, score-0.334]
21 We target databases at least 10-100 times larger where the mobile device should store a compact search index to instantly recognize the query and annotate it. [sent-54, score-1.053]
22 In order to avoid any time-consuming post-processing, we need the search index to effectively borrow the advantages of geometric verification into the index structure. [sent-55, score-0.745]
23 Variations of this approach have been motivated either towards building a compact and efficient search index or, a geometry induced search framework. [sent-58, score-0.639]
24 Decreasing the memory footprint of the search index has been earlier addressed targeting desktop environments [15, 17, 32]. [sent-59, score-0.743]
25 Identifying a useful subset of training features that are robust and distinctive and using only these for specific retrieval tasks can achieve significant memory savings without loss in matching performance [29]. [sent-60, score-0.335]
26 Usage of Fisher Vectors [22] and vector of locally aggregated descriptors (VLADs) [16] for large-scale image search has been demonstrated with compact image vectors using joint optimization of dimensionality reduction and indexing for precise vector comparisons. [sent-64, score-0.337]
27 A few attempts have introduced novel techniques which can help in indexing phrases with structures like min-hash [8, 34]. [sent-69, score-0.646]
28 Introduction of geometry constraints into the phrase or index definition [21, 34, 35] often helps in improving the recall of rare objects in large databases. [sent-70, score-0.452]
29 However, with introduction of geometry, the search index size grows. [sent-71, score-0.423]
30 In this section, we start by analysing the memory and storage requirements of the instance retrieval solution implemented as an inverted index search with a BoW representation. [sent-78, score-0.886]
31 With quantization, a 128-Byte SIFT representation gets converted to a compact visual word representation of 4 Bytes (or smaller depending on the vocabulary size). [sent-82, score-0.454]
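The quantization step described above can be sketched as a nearest-centroid lookup; the vocabulary size and data here are toy values for illustration, not the paper's 1M-word vocabulary.

```python
import numpy as np

def quantize(descriptors, centroids):
    """Map 128-D SIFT descriptors to visual-word ids (one id per descriptor).

    Each 128-byte descriptor collapses to a single 4-byte integer id,
    which is the source of the memory saving described in the text.
    """
    # Squared Euclidean distance from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1).astype(np.uint32)

rng = np.random.default_rng(0)
centroids = rng.random((16, 128), dtype=np.float32)   # toy 16-word vocabulary
desc = rng.random((5, 128), dtype=np.float32)         # 5 SIFT descriptors
words = quantize(desc, centroids)
print(words.dtype, words.shape)  # uint32 (5,)
```

In a real system the brute-force distance computation is replaced by the tree-structured assignment discussed below.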
32 Database images get represented as a histogram of visual words that are indexed with an appropriate index structure (often an inverted index). [sent-84, score-0.541]
33 They are quantized into visual words, and a set of images having most of these visual words in common, are retrieved from the database. [sent-86, score-0.296]
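The inverted-index lookup just described, where database images sharing the most query words are retrieved, can be sketched as follows. This is a minimal voting scheme; the paper's actual implementation uses TF-IDF weighting rather than raw counts.

```python
from collections import Counter, defaultdict

def build_inverted_index(db_words):
    """db_words: {image_id: iterable of visual-word ids}."""
    inv = defaultdict(set)
    for img, words in db_words.items():
        for w in set(words):
            inv[w].add(img)
    return inv

def query(inv, query_words):
    """Rank database images by the number of query words they share."""
    votes = Counter()
    for w in set(query_words):
        for img in inv.get(w, ()):
            votes[img] += 1
    return [img for img, _ in votes.most_common()]

db = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [7, 8]}
inv = build_inverted_index(db)
# 'a' and 'b' each share two words with the query; 'c' shares none.
print(sorted(query(inv, [2, 3, 9])))  # -> ['a', 'b']
```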
34 A vocabulary tree built with hierarchical k-means (HKM) [20] has logarithmic assignment cost and greatly speeds up the vocabulary assignment, with a small compromise on clustering quality. [sent-93, score-0.496]
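The logarithmic assignment cost of a vocabulary tree can be sketched with a toy tree (branch factor 2, depth 2, so 4 leaf words); a descriptor compares against only k centroids per level, O(k · depth), instead of scanning the whole vocabulary. The 1-D descriptors are purely illustrative.

```python
import numpy as np

class Node:
    def __init__(self, centroids=None, children=(), word_id=None):
        self.centroids = centroids   # (k, d) array at internal nodes
        self.children = list(children)
        self.word_id = word_id       # set only at leaves

def assign(node, desc):
    """Descend to the nearest child at each level; return the leaf word id."""
    while node.children:
        d2 = ((node.centroids - desc) ** 2).sum(axis=1)
        node = node.children[int(d2.argmin())]
    return node.word_id

leaves = [Node(word_id=i) for i in range(4)]
left = Node(np.array([[0.0], [2.0]]), leaves[:2])
right = Node(np.array([[10.0], [12.0]]), leaves[2:])
root = Node(np.array([[1.0], [11.0]]), [left, right])
print(assign(root, np.array([11.5])))  # right branch, centroid 12 -> word 3
```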
35 For many instance retrieval tasks, the vocabulary size can be as high as 1M. [sent-96, score-0.54]
36 This, however, incurs additional decoding costs which may be undesirable on a mobile device. [sent-101, score-0.334]
37 In Table 1, we present the space requirement as well as the retrieval performance of a desktop (BoW-D) implementation of a retrieval system on the Oxford Buildings dataset. [sent-106, score-0.471]
38 Second, as done by other retrieval systems, instead of using SIFT vectors for GV, we match the visual word indices in the corresponding images. [sent-114, score-0.369]
39 A large vocabulary is one of the primary reasons for the large index size; e.g. [sent-118, score-0.511]
40 instance retrieval solutions need large (closer to 1M) vocabulary for finer quantization [23, 24]. [sent-120, score-0.559]
41 We do this by pruning the vocabulary by discarding some of the visual words while indexing. [sent-123, score-0.509]
42 Instead of directly computing a smaller vocabulary, we prune the large vocabulary and define a smaller one. [Table 1 residue: Voc/Index/mAP/P10/P5/P1 entries for BoW-D and BoW-D-GV omitted]
43 Pruning is done by identifying and eliminating visual words that are the least useful for the retrieval task. [sent-145, score-0.4]
We drop (i.e., assign a zero weight) the least-scored visual words and work only with the informative visual words (with weight one). [sent-150, score-0.342]
45 Using such a vocabulary pruning, we reduce the vocabulary size from 1M to 750K, with only a minor reduction in performance as can be seen in Table 1. [sent-151, score-0.63]
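The binary-weight pruning described above can be sketched as follows. The paper only states that the least useful words are discarded; the particular score used here (summed term frequency times IDF) is an assumption, as are the toy data.

```python
import math
from collections import Counter

def prune_vocabulary(db_words, keep_fraction=0.75):
    """Keep the top fraction of words by a summed tf-idf score (weight 1);
    the rest get weight 0 and are dropped from the index."""
    n_images = len(db_words)
    df = Counter()       # document frequency per word
    tf_sum = Counter()   # total occurrences per word
    for words in db_words.values():
        for w, c in Counter(words).items():
            df[w] += 1
            tf_sum[w] += c
    score = {w: tf_sum[w] * math.log(n_images / df[w]) for w in df}
    keep = int(len(score) * keep_fraction)
    return set(sorted(score, key=score.get, reverse=True)[:keep])

db = {"i1": [0, 1, 1], "i2": [0, 2], "i3": [0, 3], "i4": [0, 1]}
kept = prune_vocabulary(db)
print(sorted(kept))  # word 0 appears in every image (idf = 0) and is dropped
```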
46 Results of the pruned vocabulary (BoW-Pr and BoW-PrGV) are comparable to the original. [sent-152, score-0.321]
47 In this work, we first define a phrase as a pair of visual words, and show how that can help in the instance retrieval task. [sent-160, score-0.43]
48 Building a 2-word-phrase index: Each image, represented by BoW quantized features, is processed to obtain phrases by dividing the image into “overlapping” grids and indexing 2-word combinations in each grid. [sent-162, score-0.44]
49 In the final index, only phrases that index more than one image are retained. [sent-163, score-0.607]
50 Such phrases can be represented as pk = {wi, wj} such that wi is in the neighborhood of wj, and indexed in a similar manner to that of visual words. [sent-165, score-0.745]
51 To avoid duplicate 2-word-phrases in the index structure, we constrain wi ≤ wj. [sent-167, score-0.472]
52 pk is indexed in a hash-map with {wi, wj } as a key. [sent-168, score-0.31]
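The 2-word-phrase index just described can be sketched as below. The text uses overlapping grids to define neighborhoods; here a simple radius check stands in for the grid (an assumption), while the wi ≤ wj key convention and the rule of retaining only phrases seen in more than one image follow the text.

```python
from collections import defaultdict

def build_phrase_index(db_features, radius=50):
    """db_features: {image_id: list of (x, y, word_id)} quantized keypoints.

    Pairs of words within `radius` pixels form a phrase key (wi, wj),
    with wi <= wj to avoid duplicates; only phrases that index more
    than one image are kept in the final hash-map.
    """
    inv = defaultdict(set)
    for img, feats in db_features.items():
        for i, (xi, yi, wi) in enumerate(feats):
            for xj, yj, wj in feats[i + 1:]:
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= radius ** 2:
                    inv[(min(wi, wj), max(wi, wj))].add(img)
    return {k: v for k, v in inv.items() if len(v) > 1}

db = {
    "a": [(0, 0, 3), (10, 0, 7)],
    "b": [(5, 5, 7), (8, 5, 3)],
    "c": [(0, 0, 1), (200, 0, 2)],   # too far apart: forms no phrase
}
idx = build_phrase_index(db)
print(idx)  # only phrase (3, 7) survives, seen in images 'a' and 'b'
```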
53 Extracting higher-order phrases increases the computational requirements during query processing. [sent-170, score-0.457]
54 With a vocabulary size of 1M, the number of possible 2-word-phrases could be as large as 10^12; however, this does not happen in practice. [sent-173, score-0.288]
55 We first find a set of phrases that are prominently present in the database. [sent-174, score-0.344]
56 For example, with phrases formed from a vocabulary as small as 500K, one can obtain performance comparable to that of 1M visual words (see Table 1) without any explicit geometric verification. [sent-175, score-0.808]
57 Even in this case, the subset of relevant phrases are computed based on the TF-IDF scores. [sent-176, score-0.344]
58 In row:Base-Phr, one can observe that the vocabulary is pruned to half the original size and the index requirement reduces by almost half to 58MB. [sent-181, score-0.668]
59 In this section, we demonstrate how the geometry can be effectively used in selecting a much smaller subset of phrases for our instance retrieval problem. [sent-188, score-0.668]
60 In general, we constrain the phrases selected based on (i) the scale of keypoints at the visual word location, (ii) the relative placement of the words, and (iii) the spatial validity. [Table 2 residue: Voc/Index/mAP/P10/P5/P1 entries for the phrase-index variants omitted] [sent-189, score-0.554]
61 ScA-Phr, SpA-Phr, Scp-Phr and SV-Phr refer to the 2-word-phrase index methods with scale-aware, space-aware, both scale-and-space-aware, and spatial validity constraints respectively. [sent-200, score-0.376]
62 Scale-aware constraints We define phrases only in a small neighborhood (50 pixels in practice) and assume both the visual words are at the same scale. [sent-207, score-0.555]
63 This also helps in selecting phrases that will be reliable for viewpoint variations. [sent-211, score-0.344]
64 In Table 2, with scale-aware phrases (ScAPhr) the search index is reduced with a further pruned vocabulary, while precision P10, P5 and P1 are preserved. [sent-218, score-0.865]
65 We constrain the geometry of occurrence of two visual words further by encoding the relative placement information qij along with the 2-word-phrase {wi, wj} as shown. [sent-222, score-0.464]
66 In a phrase pk defined by {wi, wj}, where wi ≤ wj, we choose wi as the origin and determine which rectangular quadrant wj lies in (see Figure 3). [sent-225, score-0.701]
67 Depending on the quadrant where wj lies, a two-bit value qij = 0, 1, 2, or 3 is assigned, and the 2-word-phrase key is modified as {pk, qij}, i.e. [sent-226, score-0.34]
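The two-bit quadrant encoding can be sketched as follows. The exact mapping from quadrants to the values 0..3 is defined by the paper's Figure 3, which is not reproduced here, so the bit convention below is an assumption; only the structure (wi as origin, two extra bits appended to the phrase key) follows the text.

```python
def phrase_key_with_quadrant(xi, yi, wi, xj, yj, wj):
    """Return the phrase key (wi, wj, qij) with a two-bit quadrant code.

    With wi <= wj enforced, the keypoint of wi is the origin and qij
    records which rectangular quadrant wj's keypoint falls in.
    Quadrant numbering here is illustrative: bit 0 = (dx < 0),
    bit 1 = (dy < 0); points on an axis fall in the non-negative side.
    """
    if wj < wi:  # enforce wi <= wj, swapping the geometry accordingly
        (xi, yi, wi), (xj, yj, wj) = (xj, yj, wj), (xi, yi, wi)
    dx, dy = xj - xi, yj - yi
    qij = (1 if dx < 0 else 0) | ((1 if dy < 0 else 0) << 1)  # 0..3
    return (wi, wj, qij)

print(phrase_key_with_quadrant(0, 0, 2, 5, 5, 9))  # (2, 9, 0)
```

Because qij is part of the hash-map key, two images must place the word pair in the same relative quadrant to match, which is how the relative geometry is folded into the index without a separate verification pass.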
68 These constraints further prune the vocabulary and reduce the number of phrases in the search index. [sent-231, score-0.835]
69 Using both the scale and space constraints (row:ScpPhr) together, we achieve a search index structure as small as 28MB with a vocabulary pruned to 250K. [sent-233, score-0.744]
70 To identify useful features, each image is used as a query and geometric verification is performed on the top M retrievals based on visual word matching. [sent-238, score-0.354]
71 Hence, the phrases obtained by the combination of space and scale aware constraints are further processed. [sent-247, score-0.416]
72 This brings down the vocabulary size to 160K and the search index is reduced to 16MB. [sent-248, score-0.671]
73 As shown in Table 2 (row:SVPhr), phrases constrained with spatial validity further compromise the mAP over the conventional BoW-D-GV approach but help in significant reduction of index size. [sent-249, score-0.744]
74 This is done by (i) using the scale information while defining phrases and (ii) using spatial constraints that prefer images with similar view angles at the top of the list. [sent-260, score-0.384]
75 By identifying 2-word-phrases that inherit robust geometrical constraints, we are able to further reduce the search index by 3-4 times. [sent-263, score-0.455]
76 The images are marked by architecturally distinct landmarks unlike identical buildings in Oxford-5K, which creates a difference in the vocabulary used for quantization. [sent-301, score-0.444]
77 With a vocabulary size of 1M, we observe that the mAP is affected by our 2-word-phrase indexing technique. [sent-302, score-0.384]
78 However, we were able to reduce the memory footprint significantly, compared to BoW approach with SIFT-based GV (BoW-D-GV). [sent-303, score-0.324]
79 The size of the search index is reduced by 75 times, compared to BoW-D-GV. [sent-325, score-0.423]
80 We reduce the search index size significantly; nevertheless, PK is preserved. [sent-332, score-0.46]
81 Analysis Selecting a phrase or co-occurrence of visual words from a query image to match with a similar co-occurrence in the database brings in spatial constraints during the matching procedure. [sent-335, score-0.401]
82 As the vocabulary size decreases, the memory footprint for the search index also decreases. [sent-373, score-0.958]
83 Mobile App We now discuss an implementation of mobile instance search on a mid-end mobile phone. [sent-379, score-0.846]
84 Our application provides details of the monuments based on the image captured on a mobile phone. [sent-380, score-0.41]
85 The app is implemented for Android devices and designed to allow a tourist to point his mobile at an object and know its details without using a network connection. [sent-381, score-0.497]
86 The search index is built on a dataset with 50,292 medium-resolution (480 × 360) images.
87 Note that the images are never stored on the mobile devices. [sent-387, score-0.334]
88 We extract SIFT features from all images and use hierarchical k-means to build a corresponding visual vocabulary of size 100K. [sent-388, score-0.35]
89 The memory occupied by the vocabulary tree during query processing is 17MB. [sent-389, score-0.502]
90 With our 2-word-phrase indexing technique, we get a compact search index of 25MB. [sent-390, score-0.543]
91 These along with the annotations for the database images are all that we store as application data on the mobile device. [sent-391, score-0.424]
92 The heritage site is marked by architecturally similar monumental structures and towers spread over a large area. [sent-392, score-0.327]
93 We did real-time testing of our app at this site and analyzed how our solution performs in terms of correctness of retrieved annotations, memory usage and the speed of annotation delivery. [sent-393, score-0.394]
94 Conclusion This paper presents an indexing strategy for reducing the memory footprint for instance retrieval along with exploiting the spatial geometry of 2-word-phrases. [sent-403, score-0.707]
95 We successfully demonstrate the usage of 2word-phrases as key in the inverted search index and how this helps in pruning the vocabulary as well as compacting the search index. [sent-405, score-0.958]
96 This facilitates successful application in low-memory, slow-processing mobile environments. [sent-406, score-0.334]
97 Phoneguide: museum guidance supported by on-device object recognition on mobile phones. [sent-499, score-0.334]
98 object recognition from natural features on a mobile phone. [sent-519, score-0.334]
99 Towards low bit rate mobile visual search with multiple-channel coding. [sent-559, score-0.516]
100 Outdoors augmented reality on mobile phone using loxel-based visual feature organization. [sent-625, score-0.441]
wordName wordTfidf (topN-words)
[('phrases', 0.344), ('mobile', 0.334), ('index', 0.263), ('vocabulary', 0.248), ('retrieval', 0.194), ('heritage', 0.157), ('footprint', 0.146), ('memory', 0.141), ('wj', 0.138), ('chandrasekhar', 0.134), ('takacs', 0.134), ('gv', 0.13), ('pk', 0.13), ('search', 0.12), ('query', 0.113), ('words', 0.109), ('buildings', 0.102), ('indexing', 0.096), ('app', 0.094), ('bow', 0.09), ('pruning', 0.09), ('grzeszczuk', 0.086), ('qij', 0.083), ('word', 0.08), ('phrase', 0.077), ('apps', 0.076), ('monumental', 0.076), ('monuments', 0.076), ('reznik', 0.076), ('device', 0.073), ('pruned', 0.073), ('geometry', 0.072), ('wi', 0.071), ('server', 0.069), ('keypoints', 0.068), ('tsai', 0.066), ('gb', 0.066), ('inverted', 0.065), ('precision', 0.065), ('chum', 0.064), ('compact', 0.064), ('retrieved', 0.063), ('ukbench', 0.062), ('visual', 0.062), ('mk', 0.06), ('quantization', 0.059), ('vlads', 0.059), ('offline', 0.058), ('instance', 0.058), ('reduction', 0.057), ('verification', 0.054), ('usage', 0.052), ('store', 0.052), ('architecturally', 0.05), ('mbs', 0.05), ('vle', 0.05), ('oxford', 0.049), ('douze', 0.047), ('compressed', 0.046), ('prune', 0.046), ('geometric', 0.045), ('phone', 0.045), ('reside', 0.045), ('storage', 0.045), ('site', 0.044), ('requirement', 0.044), ('landmarks', 0.044), ('indexed', 0.042), ('send', 0.041), ('hkm', 0.041), ('philbin', 0.041), ('validity', 0.041), ('queries', 0.04), ('rectangular', 0.04), ('constraints', 0.04), ('size', 0.04), ('help', 0.039), ('jegou', 0.039), ('examines', 0.039), ('fisher', 0.039), ('vocabularies', 0.039), ('desktop', 0.039), ('annotations', 0.038), ('reduce', 0.037), ('tourist', 0.037), ('keypoint', 0.037), ('isard', 0.036), ('product', 0.036), ('desire', 0.036), ('iphone', 0.036), ('quadrant', 0.036), ('useless', 0.036), ('identifying', 0.035), ('sift', 0.035), ('databases', 0.034), ('targeting', 0.034), ('ofusing', 0.033), ('network', 0.032), ('aware', 0.032), ('location', 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint
Author: Jayaguru Panda, Michael S. Brown, C.V. Jawahar
Abstract: Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. In this scenario, there are no restrictions on the size of the visual database. This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. Such solutions have applications in location recognition and product recognition. Mobile instance retrieval requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval index implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they are able to maintain a good Precision for the top K results (PK). We argue that for such offline applications, maintaining a good PK is sufficient. The effectiveness of this approach is demonstrated on several standard databases. A working application designed for a remote historical site is also presented. This application is able to reduce a 50,000-image index structure to 25 MB while providing a precision of 97% for P10 and 100% for P1.
2 0.25111279 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
Author: Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian
Abstract: Inverted indexes in image retrieval not only allow fast access to database images but also summarize all knowledge about the database, so that their discriminative capacity largely determines the retrieval performance. In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly San Antonio, TX 78249 . j dl@gmai l com qit ian@cs .ut sa . edu . The query embed two strong cues into the inverted indexes: 1) local invariant features that are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. For an initial set of inverted indexes of local features, we utilize 1000 semantic attributes to filter out isolated images and insert semantically similar images to the initial set. Encoding these two distinct cues together effectively enhances the discriminative capability of inverted indexes. Such co-indexing operations are totally off-line and introduce small computation overhead to online query cause only local features but no semantic attributes are used for query. Experiments and comparisons with recent retrieval methods on 3 datasets, i.e., UKbench, Holidays, Oxford5K, and 1.3 million images from Flickr as distractors, manifest the competitive performance of our method 1.
3 0.20404172 265 iccv-2013-Mining Motion Atoms and Phrases for Complex Action Recognition
Author: Limin Wang, Yu Qiao, Xiaoou Tang
Abstract: This paper proposes motion atom and phrase as a midlevel temporal “part” for representing and classifying complex action. Motion atom is defined as an atomic part of action, and captures the motion information of action video in a short temporal scale. Motion phrase is a temporal composite of multiple motion atoms with an AND/OR structure, which further enhances the discriminative ability of motion atoms by incorporating temporal constraints in a longer scale. Specifically, given a set of weakly labeled action videos, we firstly design a discriminative clustering method to automatically discovera set ofrepresentative motion atoms. Then, based on these motion atoms, we mine effective motion phrases with high discriminative and representativepower. We introduce a bottom-upphrase construction algorithm and a greedy selection method for this mining task. We examine the classification performance of the motion atom and phrase based representation on two complex action datasets: Olympic Sports and UCF50. Experimental results show that our method achieves superior performance over recent published methods on both datasets.
Author: Basura Fernando, Tinne Tuytelaars
Abstract: In this paper we present a new method for object retrieval starting from multiple query images. The use of multiple queries allows for a more expressive formulation of the query object including, e.g., different viewpoints and/or viewing conditions. This, in turn, leads to more diverse and more accurate retrieval results. When no query images are available to the user, they can easily be retrieved from the internet using a standard image search engine. In particular, we propose a new method based on pattern mining. Using the minimal description length principle, we derive the most suitable set of patterns to describe the query object, with patterns corresponding to local feature configurations. This results in apowerful object-specific mid-level image representation. The archive can then be searched efficiently for similar images based on this representation, using a combination of two inverted file systems. Since the patterns already encode local spatial information, good results on several standard image retrieval datasets are obtained even without costly re-ranking based on geometric verification.
5 0.17367655 210 iccv-2013-Image Retrieval Using Textual Cues
Author: Anand Mishra, Karteek Alahari, C.V. Jawahar
Abstract: We present an approach for the text-to-image retrieval problem based on textual content present in images. Given the recent developments in understanding text in images, an appealing approach to address this problem is to localize and recognize the text, and then query the database, as in a text retrieval problem. We show that such an approach, despite being based on state-of-the-artmethods, is insufficient, and propose a method, where we do not rely on an exact localization and recognition pipeline. We take a query-driven search approach, where we find approximate locations of characters in the text query, and then impose spatial constraints to generate a ranked list of images in the database. The retrieval performance is evaluated on public scene text datasets as well as three large datasets, namely IIIT scene text retrieval, Sports-10K and TV series-1M, we introduce.
7 0.14123505 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
8 0.13529819 334 iccv-2013-Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval
9 0.1157417 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
10 0.11521695 419 iccv-2013-To Aggregate or Not to aggregate: Selective Match Kernels for Image Search
11 0.11375709 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation
12 0.099825248 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
13 0.097245887 179 iccv-2013-From Subcategories to Visual Composites: A Multi-level Framework for Object Detection
14 0.095019951 192 iccv-2013-Handwritten Word Spotting with Corrected Attributes
15 0.094197825 3 iccv-2013-3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval
16 0.093872502 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
17 0.092936262 162 iccv-2013-Fast Subspace Search via Grassmannian Based Hashing
18 0.0849429 327 iccv-2013-Predicting an Object Location Using a Global Image Representation
19 0.081776857 72 iccv-2013-Characterizing Layouts of Outdoor Scenes Using Spatial Topic Processes
20 0.080274105 29 iccv-2013-A Scalable Unsupervised Feature Merging Approach to Efficient Dimensionality Reduction of High-Dimensional Visual Data
topicId topicWeight
[(0, 0.188), (1, 0.054), (2, -0.05), (3, -0.09), (4, 0.046), (5, 0.155), (6, 0.028), (7, -0.05), (8, -0.11), (9, 0.045), (10, 0.151), (11, 0.012), (12, 0.076), (13, 0.05), (14, -0.001), (15, 0.002), (16, 0.113), (17, -0.088), (18, 0.102), (19, -0.089), (20, -0.037), (21, -0.037), (22, 0.034), (23, 0.019), (24, -0.03), (25, 0.018), (26, 0.031), (27, -0.021), (28, 0.008), (29, 0.034), (30, 0.025), (31, -0.06), (32, -0.033), (33, 0.051), (34, 0.008), (35, 0.047), (36, -0.027), (37, -0.033), (38, 0.009), (39, 0.064), (40, -0.041), (41, 0.0), (42, -0.072), (43, -0.014), (44, -0.065), (45, 0.053), (46, -0.027), (47, 0.018), (48, 0.13), (49, 0.074)]
simIndex simValue paperId paperTitle
same-paper 1 0.95898491 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint
Author: Jayaguru Panda, Michael S. Brown, C.V. Jawahar
Abstract: Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. In this scenario, there are no restrictions on the size of the visual database. This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. Such solutions have applications in location recognition and product recognition. Mobile instance retrieval requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval index implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they are able to maintain a good Precision for the top K results (PK). We argue that for such offline applications, maintaining a good PK is sufficient. The effectiveness of this approach is demonstrated on several standard databases. A working application designed for a remote historical site is also presented. This application is able to reduce a 50,000-image index structure to 25 MB while providing a precision of 97% for P10 and 100% for P1.
2 0.84899545 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
Author: Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian
Abstract: Inverted indexes in image retrieval not only allow fast access to database images but also summarize all knowledge about the database, so that their discriminative capacity largely determines the retrieval performance. In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly San Antonio, TX 78249 . j dl@gmai l com qit ian@cs .ut sa . edu . The query embed two strong cues into the inverted indexes: 1) local invariant features that are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. For an initial set of inverted indexes of local features, we utilize 1000 semantic attributes to filter out isolated images and insert semantically similar images to the initial set. Encoding these two distinct cues together effectively enhances the discriminative capability of inverted indexes. Such co-indexing operations are totally off-line and introduce small computation overhead to online query cause only local features but no semantic attributes are used for query. Experiments and comparisons with recent retrieval methods on 3 datasets, i.e., UKbench, Holidays, Oxford5K, and 1.3 million images from Flickr as distractors, manifest the competitive performance of our method 1.
3 0.81806314 334 iccv-2013-Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval
Author: Cai-Zhi Zhu, Hervé Jégou, Shin'Ichi Satoh
Abstract: Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the different roles of query and database. This paper first measure the extent of asymmetry on large-scale public datasets reflecting this task. Considering the standard bag-of-words representation, we then propose new asymmetrical dissimilarities accounting for the different inlier ratios associated with query and database images. These asymmetrical measures depend on the query, yet they are compatible with an inverted file structure, without noticeably impacting search efficiency. Our experiments show the benefit of our approach, and show that the visual object retrieval task is better treated asymmetrically, in the spirit of state-of-the-art text retrieval.
4 0.81602508 419 iccv-2013-To Aggregate or Not to aggregate: Selective Match Kernels for Image Search
Author: Giorgos Tolias, Yannis Avrithis, Hervé Jégou
Abstract: This paper considers a family of metrics to compare images based on their local descriptors. It encompasses the VLAD descriptor and matching techniques such as Hamming Embedding. Making the bridge between these approaches leads us to propose a match kernel that takes the best of existing techniques by combining an aggregation procedure with a selective match kernel. Finally, the representation underpinning this kernel is approximated, providing a large scale image search both precise and scalable, as shown by our experiments on several benchmarks.
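The selective-match idea can be illustrated by scoring, per visual word, the cosine between aggregated descriptors and passing it through a selectivity function that suppresses weak matches while boosting strong ones. The threshold `tau` and exponent `alpha` here are illustrative placeholders, not the paper's tuned values.

```python
import math

def selective_match_kernel(q_agg, x_agg, tau=0.0, alpha=3.0):
    """Sum, over visual words shared by query and database image, of a
    selectivity function applied to the cosine of the per-word aggregated
    descriptors (dicts mapping word id -> descriptor vector)."""
    def sigma(u):
        # selectivity: zero out weak per-word similarities, emphasize strong ones
        return u ** alpha if u > tau else 0.0
    def cos(a, b):
        na = math.sqrt(sum(v * v for v in a))
        nb = math.sqrt(sum(v * v for v in b))
        dot = sum(p * r for p, r in zip(a, b))
        return dot / (na * nb) if na and nb else 0.0
    return sum(sigma(cos(q_agg[w], x_agg[w])) for w in q_agg if w in x_agg)
```

With `alpha > 1`, a per-word cosine of 0.5 contributes only 0.125, so accidental weak matches are down-weighted relative to a plain aggregated kernel.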
5 0.77699465 221 iccv-2013-Joint Inverted Indexing
Author: Yan Xia, Kaiming He, Fang Wen, Jian Sun
Abstract: Inverted indexing is a popular non-exhaustive solution to large scale search. An inverted file is built by a quantizer such as k-means or a tree structure. It has been found that multiple inverted files, obtained by multiple independent random quantizers, are able to achieve practically good recall and speed. Instead of computing the multiple quantizers independently, we present a method that creates them jointly. Our method jointly optimizes all codewords in all quantizers. Then it assigns these codewords to the quantizers. In experiments this method shows significant improvement over various existing methods that use multiple independent quantizers. On the one-billion set of SIFT vectors, our method is faster and more accurate than a recent state-of-the-art inverted indexing method.
8 0.76290858 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation
9 0.74339396 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
10 0.72255027 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
11 0.70557821 446 iccv-2013-Visual Semantic Complex Network for Web Images
13 0.63483113 162 iccv-2013-Fast Subspace Search via Grassmannian Based Hashing
14 0.61958909 3 iccv-2013-3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval
15 0.59701765 210 iccv-2013-Image Retrieval Using Textual Cues
16 0.5719949 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors
17 0.57090706 77 iccv-2013-Codemaps - Segment, Classify and Search Objects Locally
18 0.5519892 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
19 0.5507983 192 iccv-2013-Handwritten Word Spotting with Corrected Attributes
20 0.5426656 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
topicId topicWeight
[(2, 0.498), (6, 0.011), (13, 0.012), (26, 0.056), (27, 0.016), (31, 0.036), (42, 0.086), (64, 0.03), (73, 0.019), (89, 0.143), (98, 0.011)]
simIndex simValue paperId paperTitle
1 0.93826175 13 iccv-2013-A General Two-Step Approach to Learning-Based Hashing
Author: Guosheng Lin, Chunhua Shen, David Suter, Anton van_den_Hengel
Abstract: Most existing approaches to hashing apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of the method to respond to the data, and can result in complex optimization problems that are difficult to solve. Here we propose a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. This framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problemspecific hashing methods. Our framework decomposes the hashing learning problem into two steps: hash bit learning and hash function learning based on the learned bits. The first step can typically be formulated as binary quadratic problems, and the second step can be accomplished by training standard binary classifiers. Both problems have been extensively studied in the literature. Our extensive experiments demonstrate that the proposed framework is effective, flexible and outperforms the state-of-the-art.
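The two-step decomposition described above can be sketched end to end: step 1 produces target bits (the real method solves binary quadratic problems from an affinity matrix; this sketch just thresholds each feature at its mean, a loud simplification), and step 2 trains one standard binary classifier per bit, here a plain perceptron.

```python
def step1_bits(X):
    """Step 1 (placeholder): derive target bits in {-1, +1}. The paper
    solves binary quadratic problems; here we merely threshold each
    feature dimension at its mean."""
    means = [sum(col) / len(col) for col in zip(*X)]
    return [[1 if x > m else -1 for x, m in zip(row, means)] for row in X]

def step2_train(X, bits, epochs=20, lr=0.1):
    """Step 2: fit one linear classifier (perceptron) per bit position,
    using the learned bits as binary labels."""
    d, k = len(X[0]), len(bits[0])
    W = [[0.0] * (d + 1) for _ in range(k)]  # last entry of each row = bias
    for _ in range(epochs):
        for x, b in zip(X, bits):
            for j in range(k):
                pred = sum(w * xi for w, xi in zip(W[j], x)) + W[j][-1]
                if pred * b[j] <= 0:  # misclassified -> perceptron update
                    for i in range(d):
                        W[j][i] += lr * b[j] * x[i]
                    W[j][-1] += lr * b[j]
    return W
```

Hashing a new point is then just the sign of each classifier's response, which is what makes the second step "train standard binary classifiers" rather than anything hashing-specific.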
same-paper 2 0.92735511 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint
Author: Jayaguru Panda, Michael S. Brown, C.V. Jawahar
Abstract: Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. In this scenario, there are no restrictions on the size of the visual database. This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. Such solutions have applications in location recognition and product recognition. Mobile instance retrieval requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they are able to maintain a good Precision for the top K results (PK). We argue that for such offline applications, maintaining a good PK is sufficient. The effectiveness of this approach is demonstrated on several standard databases. A working application designed for a remote historical site is also presented. This application is able to reduce a 50,000-image index structure to 25 MB while providing a precision of 97% for P10 and 100% for P1.
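The paper's evaluation argument rests on the difference between mAP and P@K; the latter is simple to state as code. A minimal sketch of the P@K metric the numbers above (P10, P1) refer to:

```python
def precision_at_k(ranked_ids, relevant, k):
    """P@K: fraction of the top-k retrieved ids that are relevant.
    ranked_ids: result list in rank order; relevant: set of ground-truth ids."""
    top = ranked_ids[:k]
    return sum(1 for r in top if r in relevant) / k
```

Unlike mAP, P@K ignores everything below rank K, so an index-reduction step that shuffles the long tail of the ranking can leave P@K intact while lowering mAP, which is exactly the trade-off the abstract argues is acceptable for an on-device application.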
3 0.89465636 446 iccv-2013-Visual Semantic Complex Network for Web Images
Author: Shi Qiu, Xiaogang Wang, Xiaoou Tang
Abstract: This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. These nodes are connected based on the visual and semantic correlations. Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. This paper presents two application examples: content-based image retrieval and image browsing. Experimental results show that the VSCN leads to significant improvement on both the precision of image retrieval (over 200%) and user experience for image browsing.
4 0.88545394 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees
Author: Oisin Mac Aodha, Gabriel J. Brostow
Abstract: Typical approaches to classification treat class labels as disjoint. For each training example, it is assumed that there is only one class label that correctly describes it, and that all other labels are equally bad. We know, however, that good and bad labels are too simplistic in many scenarios, hurting accuracy. In the realm of example-dependent cost-sensitive learning, each label is instead a vector representing a data point’s affinity for each of the classes. At test time, our goal is not to minimize the misclassification rate, but to maximize that affinity. We propose a novel example-dependent cost-sensitive impurity measure for decision trees. Our experiments show that this new impurity measure improves test performance while still retaining the fast test times of standard classification trees. We compare our approach to classification trees and other cost-sensitive methods on three computer vision problems, tracking, descriptor matching, and optical flow, and show improvements in all three domains.
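The example-dependent view can be made concrete with a toy node-risk function: if every example in a tree node carries an affinity vector over classes, a natural surrogate risk is the affinity left unclaimed when the node commits to its single best class. This is an illustrative surrogate, not the paper's actual impurity measure.

```python
def node_risk(affinities):
    """affinities: list of per-example class-affinity vectors (equal length).
    Choose the class maximizing the summed affinity in the node; the risk
    is the average affinity mass assigned to the other classes."""
    k = len(affinities[0])
    totals = [sum(a[c] for a in affinities) for c in range(k)]
    return (sum(totals) - max(totals)) / len(affinities)

def best_split(examples, affinities, feature, threshold):
    """Weighted risk of splitting on feature <= threshold, to be minimized
    when greedily growing the tree."""
    left = [a for x, a in zip(examples, affinities) if x[feature] <= threshold]
    right = [a for x, a in zip(examples, affinities) if x[feature] > threshold]
    n = len(affinities)
    risk = 0.0
    for side in (left, right):
        if side:
            risk += len(side) / n * node_risk(side)
    return risk
```

A pure node (all examples agreeing on the best class) has risk 0, and a maximally conflicted node has high risk, mirroring how Gini or entropy behave for hard labels while respecting per-example affinities.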
5 0.87629044 191 iccv-2013-Handling Uncertain Tags in Visual Recognition
Author: Arash Vahdat, Greg Mori
Abstract: Gathering accurate training data for recognizing a set of attributes or tags on images or videos is a challenge. Obtaining labels via manual effort or from weakly-supervised data typically results in noisy training labels. We develop the FlipSVM, a novel algorithm for handling these noisy, structured labels. The FlipSVM models label noise by “flipping ” labels on training examples. We show empirically that the FlipSVM is effective on images-and-attributes and video tagging datasets.
6 0.8603099 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
7 0.85943031 214 iccv-2013-Improving Graph Matching via Density Maximization
8 0.82928723 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness
9 0.77662647 239 iccv-2013-Learning Hash Codes with Listwise Supervision
10 0.7596308 153 iccv-2013-Face Recognition Using Face Patch Networks
11 0.74345386 83 iccv-2013-Complementary Projection Hashing
12 0.74277562 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies
13 0.74111509 409 iccv-2013-Supervised Binary Hash Code Learning with Jensen Shannon Divergence
14 0.72435009 229 iccv-2013-Large-Scale Video Hashing via Structure Learning
15 0.69026238 248 iccv-2013-Learning to Rank Using Privileged Information
16 0.68800867 313 iccv-2013-Person Re-identification by Salience Matching
17 0.68339097 368 iccv-2013-SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor
18 0.68186319 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
19 0.67990112 73 iccv-2013-Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification
20 0.66603673 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation