iccv iccv2013 iccv2013-446 knowledge-graph by maker-knowledge-mining

446 iccv-2013-Visual Semantic Complex Network for Web Images


Source: pdf

Author: Shi Qiu, Xiaogang Wang, Xiaoou Tang

Abstract: This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. These nodes are connected based on the visual and semantic correlations. Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. This paper presents two application examples: content-based image retrieval and image browsing. Experimental results show that the VSCN leads to significant improvement in both the precision of image retrieval (over 200%) and the user experience for image browsing.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore
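Each extracted sentence below carries a tfidf score (sentScore). A minimal sketch of how such per-sentence scores could be computed; the mining tool's exact weighting scheme is not documented here, so this is an illustrative stand-in:

```python
import math
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the mean tf-idf weight of its terms,
    treating every sentence as a document. An illustrative stand-in
    for the summarizer that produced the sentScore values below."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    # Document frequency: how many sentences contain each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        weights = [tf[t] / len(d) * math.log(n / df[t]) for t in d]
        scores.append(sum(weights) / len(weights) if weights else 0.0)
    return scores
```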

1 This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). [sent-6, score-0.581]

2 The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. [sent-7, score-0.574]

3 These nodes are connected based on the visual and semantic correlations. [sent-8, score-0.315]

4 Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. [sent-9, score-0.48]

5 A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. [sent-10, score-0.486]

6 It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. [sent-11, score-0.342]

7 The enormous and ever-growing amount of images on the web has inspired many important applications related to web image search, browsing, and clustering. [sent-15, score-0.477]

8 Such applications aim to provide users with easier access to web images. [sent-16, score-0.382]

9 This problem is particularly challenging due to the large diversity and complex structures of web images. [sent-18, score-0.302]

10 Most search engines rely on textual information to index web images and measure their relevance. [sent-19, score-0.346]

11 Because of the ambiguous nature of textual descriptions, images indexed by the same keyword may come from irrelevant concepts and exhibit large diversity in visual content. [sent-21, score-0.416]

12 T and V are textual and visual descriptors for a semantic concept. [sent-29, score-0.256]

13 near duplicate, and cannot find relevant images that have the same semantic meaning but moderate difference in visual content. [sent-37, score-0.29]

14 Both of the above approaches only allow users to interact with the huge web image collections at a microscopic level, i.e. [sent-38, score-0.408]

15 exploring images within a very small local region either in the textual or visual feature space, which limits the effective access of web images. [sent-40, score-0.355]

16 We attribute this limitation to the lack of a top-down organization of web images that models their underlying visual and semantic structures. [sent-41, score-0.456]

17 Although efforts have been made to manually organize portions of web images such as ImageNet [6], it is derived from a human-defined ontology that has inherent discrepancies with dynamic web images. [sent-42, score-0.501]

18 The purpose of this work is to automatically discover and model the visual and semantic structures of web image collections, study their properties at a macroscopic level, and demonstrate the use of such structures and properties through concrete applications. [sent-44, score-0.589]

19 To this end, we propose to model web image collections using the Visual Semantic Complex Network (VSCN), an automatically generated graph structure (illustrated in Figure 1) on which images that are relevant in both semantics and visual content are well connected and organized. [sent-45, score-0.428]

20 Our key observation is that images on the web are not distributed randomly, but do tend to form visually and semantically compact clusters. [sent-46, score-0.274]

21 These image clusters can be used as the elementary units for modeling the structures of web image collections. [sent-47, score-0.313]

22 We automatically discover image clusters with both semantic and visual consistency, and treat them as nodes on the graph. [sent-48, score-0.309]

23 We refer to the discovered image clusters as semantic concepts, and associate them with visual and textual descriptors. [sent-49, score-0.298]

24 Semantic concepts are connected with edges based on their visual and semantic correlations. [sent-50, score-0.507]

25 The semantic concepts and their correlations bring structures to web images and allow more accurate modeling of image relevance. [sent-51, score-0.763]

26 Our VSCN currently comprises 33,240 semantic concepts and around 10 million web images. [sent-52, score-0.652]

27 Given more computational resources, this complex network can be readily scaled by including more concepts and more images under each concept. [sent-54, score-0.346]

28 We can better understand web image collections at a macroscopic level by studying the structural properties of the VSCN from the perspective of complex networks [1]. [sent-55, score-0.412]

29 Such properties provide valuable information that opens doors for many important applications such as text or content-based web image retrieval, web image browsing, discovering popular web image topics, and defining image similarities based on structural information [22]. [sent-57, score-0.706]

30 In the second application, a novel visualization scheme is proposed for web image browsing. [sent-65, score-0.279]

31 Users can explore the web image collections by navigating the VSCN without being limited by query keywords. [sent-66, score-0.465]

32 Our study shows that the VSCN exhibits small-world behavior, like many other complex networks: most semantic concepts can reach each other via a short path, which enables efficient image browsing. [sent-67, score-0.478]

33 Modeling Structure of Web Images: ImageNet [6] and Tiny Images [20] both provide large-scale hierarchical structures of images by associating web images (with or without human selection) with nodes in the WordNet ontology. [sent-69, score-0.355]

34 In contrast, our VSCN is automatically generated from the visual and textual contents on the web, making it well-suited for tasks related to web images. [sent-71, score-0.311]

35 Visual Synset [21] and LCSS [14] learn a set of prototypical concepts from web images, but neither of them models the correlations among concepts. [sent-72, score-0.524]

36 Their learned concepts are used independently for image annotation tasks. [sent-73, score-0.253]

37 Our VSCN differs from ImageKB in that we organize the semantic concepts using a complex network, which provides richer information about the structures of web images, as presented in Section 4. [sent-75, score-0.751]

38 This challenge implies the need for a better organization method that well models the structures of web images to improve web-scale CBIR. [sent-80, score-0.294]

39 Image Collection Browsing: An effective browsing scheme is critical for users to access their desired images [12]. [sent-81, score-0.397]

40 A number of browsing schemes organize images on a plane based on visual similarities [13], such that images with higher visual similarities are placed closer. [sent-82, score-0.334]

41 IGroup [23] groups images using surrounding texts and enables users to browse images by semantic clusters. [sent-84, score-0.448]

42 All of these approaches are more suitable for browsing an image collection under one particular query, but not the entire web image collection. [sent-86, score-0.423]

43 Starting with the 2,000 top query keywords of the Bing image search engine, we automatically discover 33,240 semantic concepts, which are compact image clusters with visual and semantic consistency. [sent-93, score-0.939]

44 Our method learns the semantic concepts by discovering keywords that occur frequently in visually similar images. [sent-94, score-0.479]

45 Such concepts have more specific semantic meanings and less visual diversity, and can be viewed as elementary units of web image collections. [sent-101, score-0.686]

46 The learned concepts under query keyword q are denoted as Cq. [sent-102, score-0.475]

47 The concepts learned from different queries form the nodes of the VSCN. [sent-103, score-0.283]

48 We adopt this method because it has been shown to work robustly in measuring the similarity of two short texts (a short text contains a set of keywords) at the semantic level, and because of its efficiency. [sent-110, score-0.287]

49 For a short text x, a set of Google snippets S(x) is obtained from the Google web search. [sent-111, score-0.283]

50 We collect the snippets of the top N search result items, which provide rich semantic information. Algorithm 1 (Concept Discovery through Query Expansion) takes as input a query q, an image collection Iq, and surrounding texts Tq. [sent-113, score-0.324]

51 Use ci as a query input on the Google web search. [sent-122, score-0.476]
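A minimal sketch of the concept-discovery idea behind Algorithm 1: keywords that occur frequently within visually compact image clusters become expanded concepts such as "apple fruit". The clustering method, feature choice, and thresholds below are assumptions, not the paper's exact procedure:

```python
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def discover_concepts(query, features, texts, n_clusters=20, min_support=0.3):
    """Mine expanded concepts for `query` from visually similar clusters.

    features: (n_images, d) array of visual descriptors for the images Iq;
    texts: one surrounding-text token list per image (Tq).
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    concepts = set()
    for k in range(n_clusters):
        members = [texts[i] for i in range(len(texts)) if labels[i] == k]
        if not members:
            continue
        # Keywords that occur frequently inside this visual cluster.
        counts = Counter(w for toks in members for w in set(toks) if w != query)
        for word, c in counts.most_common(3):
            if c / len(members) >= min_support:
                concepts.add(f"{query} {word}")   # e.g. "apple fruit"
    return sorted(concepts)
```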

52 The semantic correlation between ci and cj is SCor = Cosine(ntf(ci), ntf(cj)). (1) [sent-129, score-0.303]
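A sketch of Eq. (1): pool the Google snippets returned for each concept name, build normalized term-frequency (ntf) vectors, and take their cosine. Snippet fetching is left to the caller, since the search interface is outside these excerpts:

```python
import math
from collections import Counter

def ntf(snippets):
    """Normalized term-frequency vector of the pooled snippet text S(x)."""
    tokens = " ".join(snippets).lower().split()
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_correlation(snippets_i, snippets_j):
    # SCor = Cosine(ntf(c_i), ntf(c_j)), Eq. (1).
    return cosine(ntf(snippets_i), ntf(snippets_j))
```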

53 Visual correlation of two concepts is measured by the visual similarity between their corresponding exemplar image sets. [sent-130, score-0.395]

54 For each concept, its exemplar image set consists of the top 300 images retrieved from the search engine by using the concept as query keyword. [sent-131, score-0.464]

55 Concretely, for a concept ci ∈ C, we denote its exemplar image set by Ici. [sent-135, score-0.268]

56 The path connecting two semantically irrelevant concepts “apple laptop” and “disney logo”. [sent-148, score-0.3]

57 We fuse the semantic correlation and the visual correlation by Cor = SCor + VCor. [sent-160, score-0.287]
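The exact VCor formula is truncated in these excerpts; one plausible stand-in, sketched below, is the mean pairwise cosine similarity between the two concepts' exemplar descriptor sets (e.g. the top-300 retrieved images), followed by the stated fusion Cor = SCor + VCor:

```python
import numpy as np

def visual_correlation(exemplars_i, exemplars_j):
    """Mean pairwise cosine similarity between (n_i, d) and (n_j, d)
    exemplar descriptor sets; assumes nonzero descriptor rows."""
    a = exemplars_i / np.linalg.norm(exemplars_i, axis=1, keepdims=True)
    b = exemplars_j / np.linalg.norm(exemplars_j, axis=1, keepdims=True)
    return float((a @ b.T).mean())

def fused_correlation(scor, vcor):
    # Cor = SCor + VCor, as stated above.
    return scor + vcor
```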

58 Complexity: after downloading the images and metadata, our method takes 70 seconds to learn the semantic concepts for one query. [sent-164, score-0.448]

59 The study of these properties not only yields a better understanding of web image collections at a macroscopic level, but also provides valuable information that assists in important tasks including CBIR and image browsing, as presented in Sections 5 and 6. [sent-170, score-0.367]

60 The existence of a dominant connected component and the small separation between its nodes suggest that it is possible to navigate the VSCN by following its edges, which inspires the novel image browsing scheme introduced in Section 6. [sent-192, score-0.328]

61 It is interesting to see how semantically different concepts are connected on the VSCN as exemplified in Figure 4. [sent-194, score-0.325]
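A sketch of the connectivity analysis behind these observations, using networkx and treating the VSCN as undirected; for a 33,240-node graph the average path length would be estimated from sampled BFS sources:

```python
import networkx as nx

def smallworld_stats(G, sample=None):
    """Fraction of nodes in the dominant connected component and the
    average shortest-path length inside it (the small-world check)."""
    U = G.to_undirected()
    H = U.subgraph(max(nx.connected_components(U), key=len))
    frac = H.number_of_nodes() / U.number_of_nodes()
    sources = list(H)[:sample] if sample else list(H)
    lengths = []
    for s in sources:
        dist = nx.single_source_shortest_path_length(H, s)
        lengths.extend(l for t, l in dist.items() if t != s)
    return frac, sum(lengths) / len(lengths)
```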

62 In general, representative and popular concepts that are neighbors of many other concepts have high in-degrees, and form hub structures. [sent-208, score-0.533]

63 They are typically uncommon concepts such as “geodesic dome” and “ant grasshopper”, or failures of concept detection such as “dscn jpg”, which has no semantic meaning. [sent-210, score-0.546]

64 Figure 5 shows part of the VSCN, with concepts of large in-degrees. [sent-211, score-0.253]
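A sketch of the in-degree analysis described above, assuming the VSCN is stored as a directed networkx graph; taking "isolated" to mean zero in-degree is an assumption about the paper's definition:

```python
from collections import Counter
import networkx as nx

def degree_structure(G, n_hubs=20):
    """In-degree distribution, hub concepts, and isolated concepts
    of a directed VSCN-style graph."""
    indeg = dict(G.in_degree())
    distribution = Counter(indeg.values())              # in-degree histogram
    hubs = sorted(indeg, key=indeg.get, reverse=True)[:n_hubs]
    isolated = [n for n, d in indeg.items() if d == 0]  # e.g. "dscn jpg"
    return distribution, hubs, isolated
```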

65 Concept Community The semantic regions observed from Figure 5 suggest the existence of community structures on the VSCN. [sent-215, score-0.263]

66 On the VSCN, it corresponds to a group of closely related semantic concepts, called a concept community. [sent-217, score-0.293]
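The community-detection algorithm is not named in these excerpts; a sketch using greedy modularity maximization as a stand-in for grouping closely correlated concepts:

```python
import networkx as nx

def concept_communities(G, weight="weight"):
    """Partition the concept graph into concept communities."""
    comms = nx.algorithms.community.greedy_modularity_communities(
        G.to_undirected(), weight=weight)
    return [sorted(c) for c in comms]
```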

67 The key idea is to effectively reduce the search space by exploiting the structures of web images encoded in the VSCN. [sent-230, score-0.34]

68 Based on the initial retrieval result, the semantic meaning of the query image is estimated using a small set of relevant semantic concepts on the VSCN (Figure 7 (c) and (d)). [sent-235, score-0.877]

69 Images under these semantic concepts are then gathered to form a re-ranking pool (Figure 7 (e)). [sent-236, score-0.476]

70 A key step of our approach is to estimate the semantic meaning of the query image, which is done at two levels. [sent-243, score-0.372]

71 At the community level, we estimate the query image’s semantic meaning using a set of concept communities discovered in Section 4. [sent-244, score-0.616]

72 As concept communities group similar concepts, estimating the relevant communities is more reliable than estimating individual concepts. [sent-246, score-0.3]

73 Then, at the concept level, a smaller set of relevant concepts is further identified from the previously identified communities. [sent-247, score-0.401]

74 Given a query image Iq, a list of top-ranked images and their distances to Iq are returned by a baseline retrieval algorithm (e.g. [sent-252, score-0.266]

75 The concepts included in these concept communities are aggregated and denoted by C̃.
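A sketch of the community-level estimation: the concepts of the top-ranked images vote for their communities, and the concepts inside the winning communities are aggregated into the candidate set C̃. The function and argument names are illustrative assumptions:

```python
from collections import Counter

def candidate_concepts(top_images, image_concept, concept_community,
                       community_members, n_comm=5):
    """Return the candidate concept set aggregated from the communities
    voted for by the top-ranked images of the baseline retrieval."""
    votes = Counter(concept_community[image_concept[i]] for i in top_images)
    winners = [k for k, _ in votes.most_common(n_comm)]
    cand = set()
    for k in winners:
        cand.update(community_members[k])
    return cand
```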

76 In order to best identify the most relevant concepts out of C̃

77 As this estimation is not sufficiently reliable, we introduce the second source of information, correlations between semantic concepts, to refine the noisy relevance scores. [sent-287, score-0.271]
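The excerpts do not show the refinement formula, but the word statistics for this paper include 'rwr' (random walk with restart), so here is a sketch under that assumption: propagate the noisy relevance scores over the concept-correlation graph and let the converged distribution refine them:

```python
import numpy as np

def refine_scores(W, p0, alpha=0.8, iters=50):
    """Random walk with restart over the concept-correlation matrix W
    (n x n, nonnegative), starting from initial relevance scores p0."""
    T = W / (W.sum(axis=0, keepdims=True) + 1e-12)  # column-stochastic
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(iters):
        p = alpha * (T @ p) + (1 - alpha) * p0      # walk + restart
    return p
```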

78 We rank the semantic concepts according to their probability values in p, and take the top NC to represent the semantic meaning of the query image. [sent-316, score-0.797]

79 Images of the top NC concepts are gathered to form a re-ranking pool, which is matched with the query image. [sent-317, score-0.444]
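A sketch of this re-ranking step: take the top-NC concepts by refined probability, gather their images into a pool, and rank the pool by distance to the query image. The matching feature and distance are assumptions:

```python
import numpy as np

def rerank(query_feat, p, concept_images, n_c=10):
    """p: refined concept scores; concept_images: list of (n_i, d)
    descriptor arrays, one per concept. Returns (concept, distance)
    pairs for the re-ranking pool, closest first."""
    top = np.argsort(p)[::-1][:n_c]                 # top NC concepts
    feats, owners = [], []
    for c in top:
        feats.append(concept_images[c])
        owners.extend([c] * len(concept_images[c]))
    feats = np.vstack(feats)
    d = np.linalg.norm(feats - query_feat, axis=1)  # match against the query
    order = np.argsort(d)
    return [(owners[i], float(d[i])) for i in order]
```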

80 We collect a set of query images to search against images of the VSCN. [sent-330, score-0.258]

81 We submit the names of semantic concepts to Google and obtain the top five images returned. [sent-332, score-0.479]

82 For each query image, the top 100 images are retrieved and are manually labelled as being relevant/irrelevant to the query image. [sent-336, score-0.377]

83 Image Browsing with the VSCN This section presents a new browsing scheme that helps users explore the VSCN and find images of interest. [sent-359, score-0.376]

84 The user starts browsing by entering a query keyword into the system. [sent-360, score-0.473]

85 As shown in Figure 2(e), our scheme allows users to browse two spaces—the query space and the local concept space—each of which only presents a small subgraph of the entire VSCN. [sent-362, score-0.505]

86 A query space visualizes semantic concepts generated by the same query. [sent-363, score-0.591]

87 For example, the query space of “apple” contains concepts such as “apple fruit”, “apple iphone”, “apple pie”, and their corresponding images. [sent-364, score-0.419]
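A sketch of the two browsing spaces: the query space collects concepts generated by the same query (assuming concept names start with the query keyword, as in "apple fruit"), and the local concept space shows a chosen centric concept with its most strongly correlated neighbors. The layout and visualization step is omitted:

```python
def query_space(concepts, query):
    """Concepts expanded from the same query, e.g. "apple fruit",
    "apple iphone", "apple pie" for the query "apple"."""
    return [c for c in concepts if c.split()[0] == query]

def local_concept_space(G, centric, k=8):
    """The centric concept plus its k most correlated VSCN neighbors."""
    nbrs = sorted(G[centric],
                  key=lambda n: G[centric][n].get("weight", 1.0),
                  reverse=True)
    return [centric] + nbrs[:k]
```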

88 In this way, it bridges images of most related concepts and helps users access more images of interest without being limited by their initial queries. [sent-370, score-0.454]

89 In the browsing process, users can freely switch between the two spaces. [sent-371, score-0.33]

90 A user who chooses a particular concept in the query space enters the local concept space, where the chosen concept becomes the centric concept. [sent-372, score-0.584]

91 The final visualization result can effectively deliver the visual and semantic content of the current space. [sent-384, score-0.257]

92 We evaluate our browsing scheme by comparing it with three existing browsing schemes (interfaces): the traditional ranked-list interface, the interface that presents images based on visual similarity [13], and the semantic cluster-based interface [23], as shown in Figure 9. [sent-393, score-0.713]

93 This task is designed to mimic the common scenario in which a user may not know the exact query keyword for an object and starts from another related keyword that he/she is familiar with. [sent-404, score-0.333]

94 We allow users to reformulate query keywords as they need. [sent-405, score-0.354]

95 It indicates that our scheme (UI4)... When users click apple iphone in the query space of apple, the local concept space is shown, with two more neighboring concepts, namely htc diamond and palm pixi. [sent-416, score-0.726]

96 Users can further enter the query space of palm by clicking on the concept of palm pixi. [sent-418, score-0.417]

97 Conclusions This paper has proposed a novel visual semantic complex network to model the complex structures of a web image collection. [sent-429, score-0.578]

98 They not only help us understand the huge web image collection at a macroscopic level, but are also valuable in practical applications. [sent-431, score-0.32]

99 Two exemplar applications show that exploiting the structural information of the VSCN not only substantially improves the precision of CBIR, but also greatly enhances the user experience in web image search and browsing. [sent-432, score-0.414]

100 Igroup: presenting web image search results in semantic clusters. [sent-608, score-0.445]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('vscn', 0.723), ('concepts', 0.253), ('web', 0.227), ('browsing', 0.196), ('semantic', 0.172), ('query', 0.166), ('users', 0.134), ('concept', 0.121), ('apple', 0.118), ('itq', 0.107), ('cbir', 0.101), ('ik', 0.088), ('ci', 0.083), ('communities', 0.076), ('macroscopic', 0.068), ('exemplar', 0.064), ('hasing', 0.062), ('wiik', 0.062), ('nodes', 0.061), ('hashing', 0.057), ('keyword', 0.056), ('relevance', 0.055), ('user', 0.055), ('rwr', 0.055), ('iphone', 0.054), ('palm', 0.054), ('keywords', 0.054), ('retrieval', 0.053), ('textual', 0.05), ('connected', 0.048), ('texts', 0.048), ('collections', 0.047), ('community', 0.047), ('search', 0.046), ('ntf', 0.046), ('correlations', 0.044), ('structures', 0.044), ('mouse', 0.043), ('iq', 0.043), ('google', 0.043), ('clusters', 0.042), ('anova', 0.041), ('interfaces', 0.039), ('cor', 0.039), ('network', 0.039), ('subgraph', 0.037), ('visual', 0.034), ('cq', 0.034), ('snippets', 0.034), ('meaning', 0.034), ('networks', 0.033), ('htc', 0.031), ('igroup', 0.031), ('imagekb', 0.031), ('pixi', 0.031), ('sahami', 0.031), ('simhash', 0.031), ('submit', 0.031), ('complex', 0.031), ('cui', 0.03), ('cuhk', 0.03), ('queries', 0.03), ('visualization', 0.029), ('cj', 0.027), ('hub', 0.027), ('intentsearch', 0.027), ('relevant', 0.027), ('duplicate', 0.027), ('pool', 0.026), ('logo', 0.025), ('navigating', 0.025), ('diamond', 0.025), ('valuable', 0.025), ('gathered', 0.025), ('organize', 0.024), ('surrounding', 0.024), ('browse', 0.024), ('baseline', 0.024), ('semantically', 0.024), ('largest', 0.023), ('scheme', 0.023), ('similarity', 0.023), ('interface', 0.023), ('path', 0.023), ('images', 0.023), ('content', 0.022), ('retrieved', 0.022), ('engine', 0.022), ('cars', 0.022), ('isolated', 0.022), ('short', 0.022), ('bing', 0.022), ('clicking', 0.022), ('jing', 0.022), ('experience', 0.022), ('dk', 0.022), ('correlation', 0.021), ('ap', 0.021), ('wen', 0.021), ('access', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999934 446 iccv-2013-Visual Semantic Complex Network for Web Images

Author: Shi Qiu, Xiaogang Wang, Xiaoou Tang

Abstract: This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. These nodes are connected based on the visual and semantic correlations. Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. This paper presents two application examples: content-based image retrieval and image browsing. Experimental results show that the VSCN leads to significant improvement in both the precision of image retrieval (over 200%) and the user experience for image browsing.

2 0.15307653 4 iccv-2013-ACTIVE: Activity Concept Transitions in Video Event Classification

Author: Chen Sun, Ram Nevatia

Abstract: The goal of high level event classification from videos is to assign a single, high level event label to each query video. Traditional approaches represent each video as a set of low level features and encode it into a fixed length feature vector (e.g. Bag-of-Words), which leaves a big gap between low level visual features and high level events. Our paper tries to address this problem by exploiting activity concept transitions in video events (ACTIVE). A video is treated as a sequence of short clips, all of which are observations corresponding to latent activity concept variables in a Hidden Markov Model (HMM). We propose to apply Fisher Kernel techniques so that the concept transitions over time can be encoded into a compact and fixed length feature vector very efficiently. Our approach can utilize concept annotations from independent datasets, and works well even with a very small number of training samples. Experiments on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset show that our approach performs favorably against the state-of-the-art.

3 0.15020338 266 iccv-2013-Mining Multiple Queries for Image Retrieval: On-the-Fly Learning of an Object-Specific Mid-level Representation

Author: Basura Fernando, Tinne Tuytelaars

Abstract: In this paper we present a new method for object retrieval starting from multiple query images. The use of multiple queries allows for a more expressive formulation of the query object including, e.g., different viewpoints and/or viewing conditions. This, in turn, leads to more diverse and more accurate retrieval results. When no query images are available to the user, they can easily be retrieved from the internet using a standard image search engine. In particular, we propose a new method based on pattern mining. Using the minimal description length principle, we derive the most suitable set of patterns to describe the query object, with patterns corresponding to local feature configurations. This results in a powerful object-specific mid-level image representation. The archive can then be searched efficiently for similar images based on this representation, using a combination of two inverted file systems. Since the patterns already encode local spatial information, good results on several standard image retrieval datasets are obtained even without costly re-ranking based on geometric verification.

4 0.1369299 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval

Author: Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian

Abstract: Inverted indexes in image retrieval not only allow fast access to database images but also summarize all knowledge about the database, so that their discriminative capacity largely determines the retrieval performance. In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. For an initial set of inverted indexes of local features, we utilize 1000 semantic attributes to filter out isolated images and insert semantically similar images to the initial set. Encoding these two distinct cues together effectively enhances the discriminative capability of inverted indexes. Such co-indexing operations are totally off-line and introduce small computation overhead to online query, because only local features but no semantic attributes are used for the query. Experiments and comparisons with recent retrieval methods on 3 datasets, i.e., UKbench, Holidays, Oxford5K, and 1.3 million images from Flickr as distractors, manifest the competitive performance of our method.

5 0.11753099 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search

Author: Dror Aiger, Efi Kokiopoulou, Ehud Rivlin

Abstract: We propose two solutions for both nearest neighbors and range search problems. For the nearest neighbors problem, we propose a c-approximate solution for the restricted version of the decision problem with bounded radius, which is then reduced to the nearest neighbors problem by a known reduction. For range searching we propose a scheme that learns the parameters in a learning stage, adapting them to the case of a set of points with low intrinsic dimension that are embedded in a high dimensional space (a common scenario for image point descriptors). We compare our algorithms to the best known methods for these problems, i.e. LSH, ANN and FLANN. We show analytically and experimentally that we can do better for a moderate approximation factor. Our algorithms are trivial to parallelize. In the experiments conducted, running on a couple of million images, our algorithms show meaningful speed-ups when compared with the above-mentioned methods.

6 0.11606562 162 iccv-2013-Fast Subspace Search via Grassmannian Based Hashing

7 0.091671988 176 iccv-2013-From Large Scale Image Categorization to Entry-Level Categories

8 0.09037669 210 iccv-2013-Image Retrieval Using Textual Cues

9 0.089240529 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies

10 0.08801356 213 iccv-2013-Implied Feedback: Learning Nuances of User Behavior in Image Search

11 0.086650454 334 iccv-2013-Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval

12 0.082807131 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning

13 0.07958094 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection

14 0.075514391 444 iccv-2013-Viewing Real-World Faces in 3D

15 0.074063815 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint

16 0.073606007 3 iccv-2013-3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval

17 0.073591679 229 iccv-2013-Large-Scale Video Hashing via Structure Learning

18 0.069239385 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search

19 0.067339212 246 iccv-2013-Learning the Visual Interpretation of Sentences

20 0.065780774 235 iccv-2013-Learning Coupled Feature Spaces for Cross-Modal Matching


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.132), (1, 0.076), (2, -0.027), (3, -0.095), (4, 0.057), (5, 0.139), (6, 0.011), (7, -0.056), (8, -0.049), (9, 0.047), (10, 0.041), (11, -0.018), (12, 0.018), (13, 0.072), (14, -0.033), (15, 0.017), (16, 0.035), (17, -0.083), (18, 0.044), (19, -0.047), (20, -0.035), (21, -0.035), (22, 0.0), (23, 0.018), (24, -0.018), (25, 0.024), (26, 0.011), (27, -0.023), (28, -0.014), (29, -0.021), (30, 0.005), (31, -0.056), (32, 0.049), (33, -0.022), (34, -0.011), (35, 0.007), (36, -0.035), (37, 0.089), (38, -0.059), (39, 0.013), (40, -0.026), (41, -0.074), (42, 0.029), (43, 0.003), (44, 0.08), (45, -0.03), (46, 0.036), (47, 0.036), (48, 0.002), (49, -0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95112169 446 iccv-2013-Visual Semantic Complex Network for Web Images

Author: Shi Qiu, Xiaogang Wang, Xiaoou Tang

Abstract: This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. These nodes are connected based on the visual and semantic correlations. Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. This paper presents two application examples: content-based image retrieval and image browsing. Experimental results show that the VSCN leads to significant improvement in both the precision of image retrieval (over 200%) and the user experience for image browsing.

2 0.75482297 334 iccv-2013-Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval

Author: Cai-Zhi Zhu, Hervé Jégou, Shin'Ichi Satoh

Abstract: Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the different roles of query and database. This paper first measures the extent of asymmetry on large-scale public datasets reflecting this task. Considering the standard bag-of-words representation, we then propose new asymmetrical dissimilarities accounting for the different inlier ratios associated with query and database images. These asymmetrical measures depend on the query, yet they are compatible with an inverted file structure, without noticeably impacting search efficiency. Our experiments show the benefit of our approach, and show that the visual object retrieval task is better treated asymmetrically, in the spirit of state-of-the-art text retrieval.

3 0.75388557 266 iccv-2013-Mining Multiple Queries for Image Retrieval: On-the-Fly Learning of an Object-Specific Mid-level Representation

Author: Basura Fernando, Tinne Tuytelaars

Abstract: In this paper we present a new method for object retrieval starting from multiple query images. The use of multiple queries allows for a more expressive formulation of the query object including, e.g., different viewpoints and/or viewing conditions. This, in turn, leads to more diverse and more accurate retrieval results. When no query images are available to the user, they can easily be retrieved from the internet using a standard image search engine. In particular, we propose a new method based on pattern mining. Using the minimal description length principle, we derive the most suitable set of patterns to describe the query object, with patterns corresponding to local feature configurations. This results in a powerful object-specific mid-level image representation. The archive can then be searched efficiently for similar images based on this representation, using a combination of two inverted file systems. Since the patterns already encode local spatial information, good results on several standard image retrieval datasets are obtained even without costly re-ranking based on geometric verification.

4 0.73854524 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval

Author: Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian

Abstract: Inverted indexes in image retrieval not only allow fast access to database images but also summarize all knowledge about the database, so that their discriminative capacity largely determines the retrieval performance. In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. For an initial set of inverted indexes of local features, we utilize 1000 semantic attributes to filter out isolated images and insert semantically similar images to the initial set. Encoding these two distinct cues together effectively enhances the discriminative capability of inverted indexes. Such co-indexing operations are totally off-line and introduce small computation overhead to online query, because only local features but no semantic attributes are used for the query. Experiments and comparisons with recent retrieval methods on 3 datasets, i.e., UKbench, Holidays, Oxford5K, and 1.3 million images from Flickr as distractors, manifest the competitive performance of our method.

5 0.73658949 3 iccv-2013-3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval

Author: Yen-Liang Lin, Cheng-Yu Huang, Hao-Jeng Wang, Winston Hsu

Abstract: We propose a 3D sub-query expansion approach for boosting sketch-based multi-view image retrieval. The core idea of our method is to automatically convert two (guided) 2D sketches into an approximated 3D sketch model, and then generate multi-view sketches as expanded sub-queries to improve the retrieval performance. To learn the weights among synthesized views (sub-queries), we present a new multi-query feature to model the similarity between subqueries and dataset images, and formulate it into a convex optimization problem. Our approach shows superior performance compared with the state-of-the-art approach on a public multi-view image dataset. Moreover, we also conduct sensitivity tests to analyze the parameters of our approach based on the gathered user sketches.

6 0.67117423 162 iccv-2013-Fast Subspace Search via Grassmannian Based Hashing

7 0.66242254 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search

8 0.65238309 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint

9 0.64997309 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning

10 0.63303936 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection

11 0.60355705 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation

12 0.59088713 306 iccv-2013-Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items

13 0.58762217 368 iccv-2013-SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor

14 0.56757003 419 iccv-2013-To Aggregate or Not to aggregate: Selective Match Kernels for Image Search

15 0.5638603 221 iccv-2013-Joint Inverted Indexing

16 0.56320816 176 iccv-2013-From Large Scale Image Categorization to Entry-Level Categories

17 0.55174214 235 iccv-2013-Learning Coupled Feature Spaces for Cross-Modal Matching

18 0.54992419 450 iccv-2013-What is the Most EfficientWay to Select Nearest Neighbor Candidates for Fast Approximate Nearest Neighbor Search?

19 0.54011464 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies

20 0.53355378 246 iccv-2013-Learning the Visual Interpretation of Sentences


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.466), (7, 0.014), (12, 0.026), (26, 0.046), (31, 0.032), (42, 0.083), (48, 0.01), (64, 0.035), (73, 0.012), (78, 0.015), (89, 0.147)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.94567448 13 iccv-2013-A General Two-Step Approach to Learning-Based Hashing

Author: Guosheng Lin, Chunhua Shen, David Suter, Anton van_den_Hengel

Abstract: Most existing approaches to hashing apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of the method to respond to the data, and can result in complex optimization problems that are difficult to solve. Here we propose a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. This framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problemspecific hashing methods. Our framework decomposes the hashing learning problem into two steps: hash bit learning and hash function learning based on the learned bits. The first step can typically be formulated as binary quadratic problems, and the second step can be accomplished by training standard binary classifiers. Both problems have been extensively studied in the literature. Our extensive experiments demonstrate that the proposed framework is effective, flexible and outperforms the state-of-the-art.

2 0.93441629 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint

Author: Jayaguru Panda, Michael S. Brown, C.V. Jawahar

Abstract: Existing mobile image instance retrieval applications assume a network-based usage where image features are sent to a server to query an online visual database. In this scenario, there are no restrictions on the size of the visual database. This paper, however, examines how to perform this same task offline, where the entire visual index must reside on the mobile device itself within a small memory footprint. Such solutions have applications in location recognition and product recognition. Mobile instance retrieval requires a significant reduction in the visual index size. To achieve this, we describe a set of strategies that can reduce the visual index up to 60-80× compared to a standard instance retrieval index implementation found on desktops or servers. While our proposed reduction steps affect the overall mean Average Precision (mAP), they are able to maintain a good Precision for the top K results (PK). We argue that for such an offline application, maintaining a good PK is sufficient. The effectiveness of this approach is demonstrated on several standard databases. A working application designed for a remote historical site is also presented. This application is able to reduce a 50,000-image index structure to 25 MBs while providing a precision of 97% for P10 and 100% for P1.

same-paper 3 0.90575874 446 iccv-2013-Visual Semantic Complex Network for Web Images

Author: Shi Qiu, Xiaogang Wang, Xiaoou Tang

Abstract: This paper proposes modeling the complex web image collections with an automatically generated graph structure called visual semantic complex network (VSCN). The nodes on this complex network are clusters of images with both visual and semantic consistency, called semantic concepts. These nodes are connected based on the visual and semantic correlations. Our VSCN with 33,240 concepts is generated from a collection of 10 million web images. A great deal of valuable information on the structures of the web image collections can be revealed by exploring the VSCN, such as the small-world behavior, concept community, in-degree distribution, hubs, and isolated concepts. It not only helps us better understand the web image collections at a macroscopic level, but also has many important practical applications. This paper presents two application examples: content-based image retrieval and image browsing. Experimental results show that the VSCN leads to significant improvement in both the precision of image retrieval (over 200%) and the user experience for image browsing.

4 0.89433026 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees

Author: Oisin Mac Aodha, Gabriel J. Brostow

Abstract: Typical approaches to classification treat class labels as disjoint. For each training example, it is assumed that there is only one class label that correctly describes it, and that all other labels are equally bad. We know, however, that good and bad labels are too simplistic in many scenarios, hurting accuracy. In the realm of example-dependent cost-sensitive learning, each label is instead a vector representing a data point’s affinity for each of the classes. At test time, our goal is not to minimize the misclassification rate, but to maximize that affinity. We propose a novel example-dependent cost-sensitive impurity measure for decision trees. Our experiments show that this new impurity measure improves test performance while still retaining the fast test times of standard classification trees. We compare our approach to classification trees and other cost-sensitive methods on three computer vision problems, tracking, descriptor matching, and optical flow, and show improvements in all three domains.

5 0.88673747 191 iccv-2013-Handling Uncertain Tags in Visual Recognition

Author: Arash Vahdat, Greg Mori

Abstract: Gathering accurate training data for recognizing a set of attributes or tags on images or videos is a challenge. Obtaining labels via manual effort or from weakly-supervised data typically results in noisy training labels. We develop the FlipSVM, a novel algorithm for handling these noisy, structured labels. The FlipSVM models label noise by “flipping” labels on training examples. We show empirically that the FlipSVM is effective on images-and-attributes and video tagging datasets.

6 0.87292373 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition

7 0.87011409 214 iccv-2013-Improving Graph Matching via Density Maximization

8 0.83915329 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

9 0.79125839 239 iccv-2013-Learning Hash Codes with Listwise Supervision

10 0.76992822 153 iccv-2013-Face Recognition Using Face Patch Networks

11 0.76038927 409 iccv-2013-Supervised Binary Hash Code Learning with Jensen Shannon Divergence

12 0.75746328 83 iccv-2013-Complementary Projection Hashing

13 0.75741303 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies

14 0.74291509 229 iccv-2013-Large-Scale Video Hashing via Structure Learning

15 0.70759171 248 iccv-2013-Learning to Rank Using Privileged Information

16 0.70755011 313 iccv-2013-Person Re-identification by Salience Matching

17 0.69890308 368 iccv-2013-SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor

18 0.69069082 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval

19 0.69048041 73 iccv-2013-Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification

20 0.68495989 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation