iccv iccv2013 iccv2013-117 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tobias Weyand, Bastian Leibe
Abstract: Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. This is because they use a flat clustering that summarizes all photos of a building facade in one cluster. We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. HIS is based on the novel Hierarchical Medoid Shift clustering algorithm that performs a continuous mode search over the complete scale space. HMS is completely parameter-free, has the same complexity as Medoid Shift and is easy to parallelize. We evaluate HIS on 800k images of 34 landmarks and show that it can extract an often surprising amount of detail and structure that can be applied, e.g., to provide a mobile user with more detailed information on a landmark or even to extend the landmark’s Wikipedia article.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. [sent-3, score-0.223]
2 We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. [sent-5, score-0.194]
3 Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. [sent-6, score-0.167]
4 , to provide a mobile user with more detailed information on a landmark or even to extend the landmark’s Wikipedia article. [sent-11, score-0.173]
5 Introduction Current landmark recognition approaches, such as [1, 5, 9, 12, 16, 25, 27] or Google Goggles typically only return the name of the photographed building. [sent-13, score-0.153]
6 However, current landmark discovery approaches are not up to this task yet, and it is an open question how such a detailed structure can be automatically discovered in image collections. [sent-20, score-0.24]
7 We propose a novel algorithm for clustering internet photo collections, called Hierarchical Iconoid Shift, that is able to discover objects of interest at any scale. [sent-25, score-0.156]
8 It is based on Hierarchical Medoid Shift, a novel variant of Medoid Shift [19] inspired by scale space theory that tracks local density maxima while continuously increasing the kernel bandwidth. [sent-26, score-0.203]
9 In contrast, our algorithm increases the bandwidth continuously and is thus completely parameter-free. [sent-28, score-0.163]
10 HMS discovers modes at all scales and constructs a dendrogram from their merging behavior. [sent-29, score-0.292]
11 We apply HMS to the task of clustering internet photo collections using the Iconoid Shift framework [25] and show that it discovers the structure of a scene starting at small-scale objects such as individual statues or ornaments up to the full view of the landmark building. [sent-31, score-0.421]
12 Iconoids are organized in a dendrogram structure in which a path from leaf to root represents the hierarchy of iconic views for a particular photo. [sent-33, score-0.25]
13 Many different clustering methods have been applied to discover landmarks in internet photo collections. [sent-38, score-0.227]
14 [5] apply Mean Shift to the geotags of photos at different bandwidths representing city scale and landmark scale. [sent-42, score-0.229]
15 There have been different approaches employing metadata from the web in order to name the discovered landmarks and provide additional information on them. [sent-46, score-0.158]
16 [23] propose a 3D landmark viewer that enables manual annotation of building parts. [sent-47, score-0.153]
17 [17] automate this step by performing web searches for all noun phrases in the Wikipedia article of a landmark and matching the retrieved images against its SfM reconstruction. [sent-48, score-0.272]
18 The large majority of approaches uses user-provided tags from internet photo collections [5, 9, 16, 21, 27]. [sent-50, score-0.167]
19 Our approach is similar to [11] who also adapt the idea of Scale-Space filtering [26] to clustering by increasing kernel scale and tracking maxima of the kernel density of the dataset using a Mean-Shift-like procedure. [sent-55, score-0.286]
20 By weighting points with a kernel with bandwidth β, Mean Shift implicitly searches for maxima of a kernel density. [sent-65, score-0.307]
21 The only change Medoid Shift [19] makes to this procedure is that the kernel window is iteratively shifted to the medoid of the points inside it, which makes it applicable to arbitrary metric spaces. [sent-69, score-0.572]
22 Given a set of points {x_i} and starting at the current medoid y_k, Medoid Shift finds the next medoid y_{k+1} by minimizing y_{k+1} = argmin_{y ∈ {x_i}} … [sent-71, score-0.499]
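The update rule above can be illustrated with a minimal sketch. The objective is cut off in the extracted text, so the sketch assumes the commonly used form of Medoid Shift: the next medoid minimizes the kernel-weighted sum of distances to the points in the current window. The function names, the use of a precomputed distance matrix `D`, and the flat kernel are assumptions made for illustration, not the authors' implementation.

```python
def medoid_shift_step(D, current, beta):
    """One Medoid Shift update on a precomputed n x n distance matrix D.

    Assumption: flat kernel, i.e. points closer than beta to the current
    medoid get equal weight and all others get weight 0.
    """
    n = len(D)
    # Points inside the kernel window around the current medoid y_k.
    window = [i for i in range(n) if D[current][i] < beta]
    # Cost of each candidate y: summed distance to the points in the window.
    costs = [sum(D[y][i] for i in window) for y in range(n)]
    return min(range(n), key=costs.__getitem__)


def medoid_shift(D, seed, beta, max_iter=100):
    """Repeat the update until the medoid stops moving (or max_iter is hit)."""
    y = seed
    for _ in range(max_iter):
        y_next = medoid_shift_step(D, y, beta)
        if y_next == y:
            break
        y = y_next
    return y
```

Because only pairwise distances are used, the same code works for any metric space, which is the property the text attributes to Medoid Shift.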
23 A problem with Mean Shift and Medoid Shift is that the choice of kernel bandwidth depends on the data and application. [sent-76, score-0.188]
24 Often, the data has modes of different scales, even making a single fixed bandwidth unsuitable. [sent-77, score-0.174]
25 One way to address these issues is to use a variable bandwidth kernel [4], which however requires the choice of a suitable density estimator and a function for choosing the bandwidth based on the estimated density. [sent-78, score-0.336]
26 Another idea is to run Mean Shift coarse-to-fine at discrete bandwidth steps β1 , . [sent-79, score-0.159]
27 signal by filtering it with a continuously growing Gaussian kernel and constructing a tree of its merging extrema. [sent-93, score-0.176]
28 Our proposed algorithm allows the application of Scale Space Filtering to arbitrary metric spaces by using Medoid Shift [19] to explicitly track density maxima while continuously increasing the kernel bandwidth. [sent-95, score-0.184]
29 (1) If the kernel has finite support, as is common in Medoid Shift, there is only a finite number of bandwidth steps at which a new data point enters a medoid’s kernel window. [sent-97, score-0.291]
30 (2) If the kernel also has a monotonically decreasing profile, the weighted distances of the data points to their respective modes change between these steps, but their order remains the same, meaning that the density maxima can only change when a new data point enters the kernel window. [sent-98, score-0.298]
31 This allows for continuously growing the kernel by examining only a finite number of discrete steps. [sent-99, score-0.157]
32 The Hierarchical Medoid Shift algorithm proceeds as follows: We start from a seed point at kernel bandwidth 0 and build a priority queue of its nearest neighbors, ordered by their distances to the seed. [sent-100, score-0.359]
33 In each step, we pop an element from the priority queue, increase the kernel bandwidth to its distance from the medoid and compute its distances to all points inside the kernel window. [sent-102, score-0.803]
34 If the medoid has shifted, we update the priority queue with the new neighbor distances. [sent-105, score-0.59]
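The steps described above can be sketched for a single seed as follows. This is a simplified illustration under stated assumptions, not the authors' code: it uses a flat kernel on a precomputed distance matrix `D`, treats the window as inclusive (`<= beta`) so that the popped point enters it, restricts medoid candidates to the window, and only shifts on a strict cost improvement. All names are hypothetical.

```python
import heapq


def window_cost(D, window, y):
    """Summed distance from candidate y to every point in the window."""
    return sum(D[y][i] for i in window)


def hierarchical_medoid_shift(D, seed):
    """Trace one dendrogram branch as a list of (bandwidth, medoid) pairs."""
    n = len(D)
    medoid, beta = seed, 0.0
    branch = [(beta, medoid)]

    def build_queue(m, radius):
        # Neighbors of medoid m that are still outside the kernel window,
        # ordered by their distance to m.
        heap = [(D[m][i], i) for i in range(n) if D[m][i] > radius]
        heapq.heapify(heap)
        return heap

    queue = build_queue(medoid, beta)
    while queue:
        # Grow the bandwidth just enough to admit the next nearest point.
        beta, _ = heapq.heappop(queue)
        shifted = False
        while True:
            window = [i for i in range(n) if D[medoid][i] <= beta]
            best = min(window, key=lambda y: window_cost(D, window, y))
            if window_cost(D, window, best) < window_cost(D, window, medoid):
                medoid, shifted = best, True
            else:
                break
        if shifted:
            # The medoid moved: record it and rebuild the queue around it.
            branch.append((beta, medoid))
            queue = build_queue(medoid, beta)
    return branch
```

Only the distances at which a new point enters the window are visited, which mirrors the observation that a finite number of discrete steps suffices to grow the kernel continuously.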
35 Similar to the evolution of local maxima in scale space [26], medoids corresponding to small maxima will merge to form larger maxima. [sent-120, score-0.152]
36 The resulting convergence sequences therefore form a dendrogram of the density structure of the dataset at all scales. [sent-121, score-0.238]
37 A horizontal slice through this dendrogram yields the set of medoids at a particular scale. [sent-122, score-0.238]
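As a rough illustration of such a slice, the sketch below assumes each branch is stored as a list of (bandwidth, medoid) pairs sorted by increasing bandwidth and starting at bandwidth 0. This representation and the function names are assumptions made for the example, not the data structure used in the paper.

```python
from bisect import bisect_right


def medoid_at_scale(branch, beta):
    """Return the medoid a branch has reached at bandwidth beta."""
    betas = [b for b, _ in branch]
    idx = bisect_right(betas, beta) - 1
    if idx < 0:
        raise ValueError("beta lies below the first bandwidth of the branch")
    return branch[idx][1]


def slice_dendrogram(branches, beta):
    """Set of distinct medoids at bandwidth beta across all branches."""
    return {medoid_at_scale(branch, beta) for branch in branches}
```

Branches that have merged report the same medoid, so the size of the returned set shrinks as the bandwidth grows.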
38 We can imagine each run of the algorithm as tracing out a branch of the dendrogram from bottom to top, where we move upwards each time we increase the kernel bandwidth β and we move horizontally each time we shift (Fig. [sent-125, score-0.692]
39 We keep track of the dendrogram branches in a central data structure, using mutexes to prevent race conditions when accessing them. [sent-127, score-0.244]
40 Hierarchical Iconoid Shift We apply HMS to the task of clustering collections of landmark images using the Iconoid Shift framework [25]. [sent-131, score-0.25]
41 Iconoid Shift defines a metric space over images based on their overlap and uses Medoid Shift to find modes in this space. [sent-133, score-0.16]
42 These modes, called Iconoids, are the photos that have the highest overlap with other photos of the same building or object. [sent-134, score-0.164]
43 Iconoids are typically frontal, centered views in which the landmark fills most of the image [25]. [sent-135, score-0.153]
44 In IS, the support set of an image is the set of images under its kernel window, meaning the images that have a certain minimum overlap with it. [sent-136, score-0.245]
45 Formally, an Iconoid is an image that has minimal homography overlap distance to the images in its support set. [sent-137, score-0.225]
46 In the underlying Medoid Shift algorithm, a hinge function is used as the shadow kernel Φβ, and thus the kernel ϕβ is a step function that cuts off at the bandwidth β, i. [sent-139, score-0.253]
47 the overlap distance threshold of a support set: ϕβ(d) = 1/β if d < β, 0 otherwise. [sent-141, score-0.146]
48 The homography overlap distance is simply one minus the overlap. [sent-150, score-0.154]
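The kernel and distance definitions above can be written out as a small sketch. The exact hinge form `1 - d / beta` for the shadow kernel is an assumption (the text only says a hinge function is used), chosen so that its negative derivative is the step kernel given above. The `support_set` helper and the dictionary of overlaps are hypothetical names for illustration.

```python
def shadow_kernel(d, beta):
    """Hinge shadow kernel (assumed form): 1 at d = 0, falling to 0 at d = beta."""
    return max(0.0, 1.0 - d / beta)


def kernel(d, beta):
    """Step kernel: constant 1/beta inside the window, 0 outside."""
    return 1.0 / beta if d < beta else 0.0


def overlap_distance(overlap):
    """Homography overlap distance: one minus the overlap."""
    return 1.0 - overlap


def support_set(overlaps, beta):
    """Images whose overlap distance to the reference image is below beta.

    overlaps maps an image id to its overlap (in [0, 1]) with the reference.
    """
    return {img for img, ov in overlaps.items() if overlap_distance(ov) < beta}
```

For example, with `beta = 0.6` an image needs an overlap above 0.4 with the reference image to fall inside its support set.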
49 However, because images with low overlap are usually not directly connected in the graph, missing overlaps are determined using the homography overlap propagation (HOP) algorithm [25]. [sent-154, score-0.299]
50 Then, HOP computes the overlap between each pair of images (i, k) by propagating the overlap region of i along the unique path to k in the MST. [sent-156, score-0.201]
51 For example, if (i, j) and (j, k) are edges in the MST, dovl (i, k) is computed by projecting i’s overlap region with j into j using the homography between i and j, intersecting the resulting region with j’s overlap region with k and finally projecting the region into k. [sent-157, score-0.271]
52 Using the known overlap regions, the overlap is then computed using Eq. [sent-158, score-0.184]
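The two-edge propagation example above can be sketched as follows. This is a deliberately simplified illustration: overlap regions are approximated by axis-aligned boxes `(x0, y0, x1, y1)`, whereas real overlap regions are general polygons, and homographies are plain 3x3 nested lists. The function names are hypothetical, and the final overlap formula is not reproduced because the equation is missing from the extracted text.

```python
def project(H, pt):
    """Apply a 3x3 homography H to a 2D point."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def project_box(H, box):
    """Project a box's corners and return their bounding box (approximation)."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs, ys = zip(*(project(H, c) for c in corners))
    return (min(xs), min(ys), max(xs), max(ys))


def intersect(a, b):
    """Intersection of two boxes, or None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def propagate(region_ij, H_ij, region_jk, H_jk):
    """Propagate i's overlap region along the path i -> j -> k.

    region_ij: i's overlap region with j, in i's coordinates.
    region_jk: j's overlap region with k, in j's coordinates.
    Returns the propagated region in k's coordinates, or None if it vanishes.
    """
    in_j = project_box(H_ij, region_ij)
    common = intersect(in_j, region_jk)
    return None if common is None else project_box(H_jk, common)
```

Longer MST paths would repeat the project-and-intersect step once per edge.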
53 Given a seed image, Iconoid Shift explores its support set and builds the local matching graph, computes the pairwise overlaps in the support set as described above, and chooses the images with the highest weighted overlap according to Eq. [sent-160, score-0.321]
54 Since naively plugging in the homography overlap distance into Alg. [sent-169, score-0.154]
55 We define the corona as the images whose overlap distance to the medoid is greater than β, but that match at least one image from the support set (i. [sent-171, score-0.83]
56 The overlaps of the corona images with the medoid define the discrete scale steps for growing the kernel. [sent-175, score-0.808]
57 The corona images are inserted into a priority queue and prioritized by their overlap with the medoid. [sent-176, score-0.412]
58 HIS maintains a minimum spanning tree of all images in the corona and support set. [sent-177, score-0.282]
59 (a,b) The corona image closest to the medoid is added to the support set. [sent-190, score-0.739]
60 The kernel bandwidth is expanded to its distance to the medoid. [sent-191, score-0.188]
61 (d) Images matching the new image are retrieved, inserted into the corona and added to the priority queue. [sent-193, score-0.291]
62 (e,f) If the medoid has shifted, the support set and corona are updated. [sent-194, score-0.721]
63 Overlaps of the new corona images with the seed are computed and the images are added to the priority queue. [sent-195, score-0.339]
64 3a) and increase the kernel bandwidth to the image’s overlap distance from the medoid (Fig. [sent-199, score-0.757]
65 Then, we compute the image’s pairwise distances to all images in the support set by propagating its overlap through the MST using HOP (Fig. [sent-201, score-0.18]
66 Next, we complete the corona by retrieving images matching the newly added image, computing overlaps with the medoid using HOP, and inserting them into the corona and priority queue (Fig. [sent-203, score-1.068]
67 Finally, we check whether the medoid has shifted (Eq. [sent-205, score-0.507]
68 If yes, we alternately update the support set and corona (Fig. [sent-208, score-0.244]
69 We move support set images whose overlap distance from the new medoid is larger than β to the corona. [sent-211, score-0.64]
70 We remove corona images with no match in the support set. [sent-213, score-0.261]
71 While there are images in the corona whose overlap distance to the medoid is smaller than β, we add them to the support set, compute their overlaps with all support set images and retrieve new corona images as above. [sent-215, score-1.144]
72 The algorithm is finished when the corona is empty, i. [sent-216, score-0.19]
73 all images overlapping with the medoid are in the support set. [sent-218, score-0.548]
74 Since adding an image to a support set has linear complexity in the number of images in it, the total complexity of growing a branch is quadratic in the number of images in its final support set. [sent-221, score-0.203]
75 Since we can stop processing a branch if a worker merges into an existing branch, the overall runtime of HIS depends on the branching factor of the dendrogram, Figure 4: Part of a dendrogram from Trinity College before (left) and after (right) simplification. [sent-222, score-0.241]
76 Since these redundant dendrogram branches usually show very little change in perspective, we use a simple but effective scheme to simplify the dendrograms. [sent-230, score-0.244]
77 We descend each dendrogram from the root in a breadth-first fashion and estimate the camera motion for each outgoing edge of the current node. [sent-231, score-0.213]
78 We estimate the camera motion using a scheme similar to [10]: Since each child is in the support set of its parent, we use HOP to compute their overlap region. [sent-235, score-0.163]
79 We then use the change in relative size of the overlap region as an estimate of the zoom and the relative movement of the centroid of the overlap region to estimate the amount of panning or tilt. [sent-236, score-0.184]
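A minimal sketch of this estimate is given below. It assumes the overlap region is available as an axis-aligned box `(x0, y0, x1, y1)` in both the parent and the child image, together with each image's `(width, height)`. The exact normalization and the sign conventions are assumptions for illustration; the text does not specify them.

```python
def relative_area(box, size):
    """Fraction of the image area covered by the box."""
    x0, y0, x1, y1 = box
    w, h = size
    return (x1 - x0) * (y1 - y0) / (w * h)


def relative_centroid(box, size):
    """Box centroid in image-relative coordinates (0..1 on each axis)."""
    x0, y0, x1, y1 = box
    w, h = size
    return ((x0 + x1) / (2 * w), (y0 + y1) / (2 * h))


def camera_motion(box_parent, size_parent, box_child, size_child):
    """Return (zoom, pan, tilt) between a parent and a child view.

    zoom: ratio of the region's relative size in the child to that in the parent.
    pan, tilt: horizontal and vertical shift of the region's relative centroid.
    """
    zoom = relative_area(box_child, size_child) / relative_area(box_parent, size_parent)
    px, py = relative_centroid(box_parent, size_parent)
    cx, cy = relative_centroid(box_child, size_child)
    return zoom, cx - px, cy - py
```

A zoom value far from 1 or a large centroid shift indicates a noticeable change in perspective along that dendrogram edge.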
80 We then perform dendrogram simplification, which reduces dendrogram size by 55. [sent-254, score-0.426]
81 HIS typically produces one dendrogram for each facade or view of a landmark and several smaller ones covering individual details. [sent-258, score-0.389]
82 It can be seen that the details discovered at lower bandwidth levels merge into more global structures as β increases. [sent-261, score-0.243]
83 We perform the matching by querying the existing inverted file indices for each landmark with the detail images from Wikipedia. [sent-279, score-0.233]
84 A Wikipedia image matches an Iconoid if they are related by a homography with at least 15 inliers and their homography overlap distance is less than 0. [sent-280, score-0.235]
85 Figure 5: Example dendrograms showing detail structures automatically discovered by Hierarchical Iconoid Shift. [sent-324, score-0.237]
86 Wikipedia Details is the number of details depicted in each article and WP → Iconoid is the number of Wikipedia images with at least one matching Iconoid. [sent-330, score-0.174]
87 Generally, the number of details discovered depends on the number of images available for the landmark, since a dense enough sampling of the detail space is needed to identify local maxima. [sent-331, score-0.173]
88 Such high-quality labels cannot be found by simply examining frequently occurring terms in photo titles and tags [16, 21, 27], since photographers often do not make an effort to correctly label each landmark detail. [sent-336, score-0.244]
89 8 compares objects discovered by both algorithms (left) to objects only discovered by HIS. [sent-360, score-0.174]
90 The right child of the node showing the gate is not present on Wikipedia, but we can propose a place to insert it into the article based on its parent and sibling in the dendrogram and the estimated camera motion between them. [sent-363, score-0.388]
91 By exploiting links between the HIS dendrograms and the Wikipedia article tree, it is possible to guess the article section a new detail should be added to. [sent-369, score-0.318]
92 9 shows a part of a dendrogram where a figure group in the left part of a gate is present in Wikipedia while the figure group in the right part of the gate is missing. [sent-371, score-0.307]
93 From the adjacent nodes in the dendrogram and the camera motion estimation described above, we know the scene parts that the figure group is related to and where it is located relative to them. [sent-372, score-0.213]
94 In this case, we know that the right figure group is a part of the gate since the gate is its parent in the dendrogram and the camera move associated with the edge is “zoom out”. [sent-373, score-0.326]
95 Using the discovered dendrograms describing the scene structure, as well as the image clusters associated with each node, we can build visualizations of landmark building facades that are far more useful to the user of a mobile visual search application than the name of the main landmark alone. [sent-378, score-0.561]
96 Firstly, using HOP, we can precisely localize even very small details on a large facade since the images in the clusters can help bridge larger scale changes than normal local feature matching can handle. [sent-380, score-0.153]
97 Conclusion In this work, we have proposed Hierarchical Medoid Shift, a new variant of Medoid Shift [19] that, instead of a flat clustering at a single scale, produces a dendrogram of clusters by continuously sweeping over all scales. [sent-383, score-0.328]
98 Based on HMS, we have proposed Hierarchical Iconoid Shift, a new algorithm for clustering internet photo collections that produces a hierarchy of clusters that includes both small details as well as full views of a landmark’s facade. [sent-385, score-0.262]
99 We have demonstrated our algorithm on a set of 34 landmarks with a rich detail structure and shown that it discovers many of the details depicted on Wikipedia and significantly outperforms Iconoid Shift [25] in terms of the number of discovered details. [sent-386, score-0.277]
100 The scene hierarchies produced by HIS could be useful for a large range of applications including landmark description and visual recognition of detail structures. [sent-387, score-0.189]
wordName wordTfidf (topN-words)
[('iconoid', 0.518), ('medoid', 0.477), ('wikipedia', 0.276), ('shift', 0.263), ('dendrogram', 0.213), ('corona', 0.19), ('iconoids', 0.19), ('hms', 0.168), ('landmark', 0.153), ('bandwidth', 0.123), ('dendrograms', 0.114), ('overlap', 0.092), ('discovered', 0.087), ('article', 0.075), ('landmarks', 0.071), ('kernel', 0.065), ('singletons', 0.063), ('homography', 0.062), ('hop', 0.062), ('hierarchical', 0.061), ('photo', 0.061), ('queue', 0.057), ('priority', 0.056), ('support', 0.054), ('maxima', 0.054), ('yk', 0.054), ('modes', 0.051), ('gate', 0.047), ('marco', 0.045), ('wp', 0.042), ('seed', 0.041), ('clustering', 0.041), ('continuously', 0.04), ('collections', 0.039), ('piazza', 0.038), ('westminster', 0.038), ('weyand', 0.038), ('statues', 0.037), ('internet', 0.037), ('overlaps', 0.036), ('photos', 0.036), ('detail', 0.036), ('dame', 0.036), ('notre', 0.036), ('mst', 0.034), ('clusters', 0.034), ('growing', 0.033), ('details', 0.033), ('cathedral', 0.031), ('branches', 0.031), ('shifted', 0.03), ('tags', 0.03), ('san', 0.03), ('covered', 0.029), ('branch', 0.028), ('quack', 0.028), ('discovers', 0.028), ('matching', 0.027), ('density', 0.025), ('dovl', 0.025), ('medoids', 0.025), ('ornaments', 0.025), ('superordinate', 0.025), ('facade', 0.023), ('basilica', 0.022), ('scramsac', 0.022), ('depicted', 0.022), ('finds', 0.022), ('simplification', 0.022), ('tree', 0.021), ('bandwidths', 0.021), ('enters', 0.021), ('seeding', 0.021), ('philbin', 0.021), ('mobile', 0.02), ('plsa', 0.02), ('chapel', 0.02), ('iconic', 0.02), ('discrete', 0.019), ('scale', 0.019), ('parent', 0.019), ('inliers', 0.019), ('mean', 0.019), ('dfg', 0.019), ('collapse', 0.019), ('gammeter', 0.019), ('tourist', 0.019), ('engine', 0.018), ('added', 0.018), ('filtering', 0.017), ('steps', 0.017), ('captions', 0.017), ('images', 0.017), ('leibe', 0.017), ('discover', 0.017), ('distances', 0.017), ('insert', 0.017), ('granularity', 0.017), ('child', 0.017), ('hierarchy', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 117 iccv-2013-Discovering Details and Scene Structure with Hierarchical Iconoid Shift
Author: Tobias Weyand, Bastian Leibe
Abstract: Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. This is because they use a flat clustering that summarizes all photos of a building facade in one cluster. We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. HIS is based on the novel Hierarchical Medoid Shift clustering algorithm that performs a continuous mode search over the complete scale space. HMS is completely parameter-free, has the same complexity as Medoid Shift and is easy to parallelize. We evaluate HIS on 800k images of 34 landmarks and show that it can extract an often surprising amount of detail and structure that can be applied, e.g., to provide a mobile user with more detailed information on a landmark or even to extend the landmark’s Wikipedia article.
2 0.098461889 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
Author: Xiang Yu, Junzhou Huang, Shaoting Zhang, Wang Yan, Dimitris N. Metaxas
Abstract: This paper addresses the problem of facial landmark localization and tracking from a single camera. We present a two-stage cascaded deformable shape model to effectively and efficiently localize facial landmarks with large head pose variations. For face detection, we propose a group sparse learning method to automatically select the most salient facial landmarks. By introducing 3D face shape model, we use procrustes analysis to achieve pose-free facial landmark initialization. For deformation, the first step uses mean-shift local search with constrained local model to rapidly approach the global optimum. The second step uses component-wise active contours to discriminatively refine the subtle shape variation. Our framework can simultaneously handle face detection, pose-free landmark localization and tracking in real time. Extensive experiments are conducted on both laboratory environmental face databases and face-in-the-wild databases. All results demonstrate that our approach has certain advantages over state-of-the-art methods in handling pose variations.
3 0.096795574 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection
Author: Xiaowei Zhao, Shiguang Shan, Xiujuan Chai, Xilin Chen
Abstract: In this paper, we propose a novel cascaded face shape space pruning algorithm for robust facial landmark detection. Through progressively excluding the incorrect candidate shapes, our algorithm can accurately and efficiently achieve the globally optimal shape configuration. Specifically, individual landmark detectors are firstly applied to eliminate wrong candidates for each landmark. Then, the candidate shape space is further pruned by jointly removing incorrect shape configurations. To achieve this purpose, a discriminative structure classifier is designed to assess the candidate shape configurations. Based on the learned discriminative structure classifier, an efficient shape space pruning strategy is proposed to quickly reject most incorrect candidate shapes while preserve the true shape. The proposed algorithm is carefully evaluated on a large set of real world face images. In addition, comparison results on the publicly available BioID and LFW face databases demonstrate that our algorithm outperforms some state-of-the-art algorithms.
4 0.087467313 214 iccv-2013-Improving Graph Matching via Density Maximization
Author: Chao Wang, Lei Wang, Lingqiao Liu
Abstract: Graph matching has been widely used in various applications in computer vision due to its powerful performance. However, it poses three challenges to image sparse feature matching: (1) The combinatorial nature limits the size of the possible matches; (2) It is sensitive to outliers because the objective function prefers more matches; (3) It works poorly when handling many-to-many object correspondences, due to its assumption of one single cluster for each graph. In this paper, we address these problems with a unified framework, Density Maximization. We propose a graph density local estimator (DLE) to measure the quality of matches. Density Maximization aims to maximize the DLE values both locally and globally. The local maximization of DLE finds the clusters of nodes as well as eliminates the outliers. The global maximization of DLE efficiently refines the matches by exploring a much larger matching space. Our Density Maximization is orthogonal to specific graph matching algorithms. Experimental evaluation demonstrates that it significantly boosts the true matches and enables graph matching to handle both outliers and many-to-many object correspondences.
5 0.069710508 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization
Author: Feng Zhou, Jonathan Brandt, Zhe Lin
Abstract: Localizing facial landmarks is a fundamental step in facial image analysis. However, the problem is still challenging due to the large variability in pose and appearance, and the existence of occlusions in real-world face images. In this paper, we present exemplar-based graph matching (EGM), a robust framework for facial landmark localization. Compared to conventional algorithms, EGM has three advantages: (1) an affine-invariant shape constraint is learned online from similar exemplars to better adapt to the test face; (2) the optimal landmark configuration can be directly obtained by solving a graph matching problem with the learned shape constraint; (3) the graph matching problem can be optimized efficiently by linear programming. To our best knowledge, this is the first attempt to apply a graph matching technique for facial landmark localization. Experiments on several challenging datasets demonstrate the advantages of EGM over state-of-the-art methods.
6 0.064260259 147 iccv-2013-Event Recognition in Photo Collections with a Stopwatch HMM
7 0.057568673 404 iccv-2013-Structured Forests for Fast Edge Detection
8 0.05209801 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
9 0.050700895 10 iccv-2013-A Framework for Shape Analysis via Hilbert Space Embedding
10 0.048334099 219 iccv-2013-Internet Based Morphable Model
11 0.045599718 81 iccv-2013-Combining the Right Features for Complex Event Recognition
12 0.044552807 440 iccv-2013-Video Event Understanding Using Natural Language Descriptions
13 0.043107916 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation
14 0.04304127 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint
15 0.042041656 320 iccv-2013-Pose-Configurable Generic Tracking of Elongated Objects
16 0.041981027 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies
17 0.041770257 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
18 0.040890001 435 iccv-2013-Unsupervised Domain Adaptation by Domain Invariant Projection
19 0.040862795 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
20 0.040767618 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
topicId topicWeight
[(0, 0.101), (1, -0.003), (2, -0.015), (3, -0.022), (4, 0.021), (5, 0.023), (6, 0.041), (7, 0.008), (8, 0.001), (9, -0.037), (10, 0.006), (11, 0.003), (12, 0.028), (13, 0.021), (14, -0.016), (15, 0.032), (16, 0.048), (17, -0.012), (18, 0.014), (19, -0.061), (20, -0.012), (21, 0.013), (22, 0.022), (23, 0.074), (24, 0.039), (25, -0.011), (26, -0.005), (27, -0.022), (28, -0.032), (29, -0.026), (30, 0.007), (31, 0.004), (32, -0.022), (33, -0.011), (34, 0.027), (35, 0.046), (36, 0.04), (37, -0.037), (38, -0.002), (39, 0.006), (40, -0.057), (41, -0.045), (42, 0.0), (43, -0.069), (44, -0.016), (45, -0.034), (46, 0.029), (47, 0.005), (48, 0.01), (49, 0.069)]
simIndex simValue paperId paperTitle
same-paper 1 0.8924033 117 iccv-2013-Discovering Details and Scene Structure with Hierarchical Iconoid Shift
Author: Tobias Weyand, Bastian Leibe
Abstract: Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. This is because they use a flat clustering that summarizes all photos of a building facade in one cluster. We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. HIS is based on the novel Hierarchical Medoid Shift clustering algorithm that performs a continuous mode search over the complete scale space. HMS is completely parameter-free, has the same complexity as Medoid Shift and is easy to parallelize. We evaluate HIS on 800k images of 34 landmarks and show that it can extract an often surprising amount of detail and structure that can be applied, e.g., to provide a mobile user with more detailed information on a landmark or even to extend the landmark’s Wikipedia article.
2 0.64222604 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization
Author: Feng Zhou, Jonathan Brandt, Zhe Lin
Abstract: Localizing facial landmarks is a fundamental step in facial image analysis. However, the problem is still challenging due to the large variability in pose and appearance, and the existence of occlusions in real-world face images. In this paper, we present exemplar-based graph matching (EGM), a robust framework for facial landmark localization. Compared to conventional algorithms, EGM has three advantages: (1) an affine-invariant shape constraint is learned online from similar exemplars to better adapt to the test face; (2) the optimal landmark configuration can be directly obtained by solving a graph matching problem with the learned shape constraint; (3) the graph matching problem can be optimized efficiently by linear programming. To our best knowledge, this is the first attempt to apply a graph matching technique for facial landmark localization. Experiments on several challenging datasets demonstrate the advantages of EGM over state-of-the-art methods.
3 0.56011957 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
Author: Xiang Yu, Junzhou Huang, Shaoting Zhang, Wang Yan, Dimitris N. Metaxas
Abstract: This paper addresses the problem of facial landmark localization and tracking from a single camera. We present a two-stage cascaded deformable shape model to effectively and efficiently localize facial landmarks with large head pose variations. For face detection, we propose a group sparse learning method to automatically select the most salient facial landmarks. By introducing 3D face shape model, we use procrustes analysis to achieve pose-free facial landmark initialization. For deformation, the first step uses mean-shift local search with constrained local model to rapidly approach the global optimum. The second step uses component-wise active contours to discriminatively refine the subtle shape variation. Our framework can simultaneously handle face detection, pose-free landmark localization and tracking in real time. Extensive experiments are conducted on both laboratory environmental face databases and face-in-the-wild databases. All results demonstrate that our approach has certain advantages over state-of-the-art methods in handling pose variations.
4 0.55558091 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection
Author: Xiaowei Zhao, Shiguang Shan, Xiujuan Chai, Xilin Chen
Abstract: In this paper, we propose a novel cascaded face shape space pruning algorithm for robust facial landmark detection. Through progressively excluding the incorrect candidate shapes, our algorithm can accurately and efficiently achieve the globally optimal shape configuration. Specifically, individual landmark detectors are first applied to eliminate wrong candidates for each landmark. Then, the candidate shape space is further pruned by jointly removing incorrect shape configurations. To achieve this purpose, a discriminative structure classifier is designed to assess the candidate shape configurations. Based on the learned discriminative structure classifier, an efficient shape space pruning strategy is proposed to quickly reject most incorrect candidate shapes while preserving the true shape. The proposed algorithm is carefully evaluated on a large set of real world face images. In addition, comparison results on the publicly available BioID and LFW face databases demonstrate that our algorithm outperforms some state-of-the-art algorithms.
5 0.52918941 159 iccv-2013-Fast Neighborhood Graph Search Using Cartesian Concatenation
Author: Jing Wang, Jingdong Wang, Gang Zeng, Rui Gan, Shipeng Li, Baining Guo
Abstract: In this paper, we propose a new data structure for approximate nearest neighbor search. This structure augments the neighborhood graph with a bridge graph. We propose to exploit Cartesian concatenation to produce a large set of vectors, called bridge vectors, from several small sets of subvectors. Each bridge vector is connected with a few reference vectors near to it, forming a bridge graph. Our approach finds nearest neighbors by simultaneously traversing the neighborhood graph and the bridge graph in the best-first strategy. The success of our approach stems from two factors: the exact nearest neighbor search over a large number of bridge vectors can be done quickly, and the reference vectors connected to a bridge (reference) vector near the query are also likely to be near the query. Experimental results on searching over large scale datasets (SIFT, GIST and HOG) show that our approach outperforms state-of-the-art ANN search algorithms in terms of efficiency and accuracy. The combination of our approach with the IVFADC system [18] also shows superior performance over the BIGANN dataset of 1 billion SIFT features compared with the best previously published result.
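The Cartesian concatenation step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation; the function name `cartesian_concatenation` and the toy sub-vector sets are hypothetical, and the sketch only shows how a few small sets of sub-vectors yield a much larger set of concatenated "bridge" vectors.

```python
from itertools import product


def cartesian_concatenation(subvector_sets):
    """Return every vector formed by picking one sub-vector from each set
    and concatenating the picks in order (hypothetical helper)."""
    return [
        tuple(x for part in combo for x in part)
        for combo in product(*subvector_sets)
    ]


# Two toy sets of 2-D sub-vectors (assumed values, for illustration only).
set_a = [(0.0, 0.0), (1.0, 1.0)]
set_b = [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]

# 2 * 3 = 6 four-dimensional bridge vectors from only 5 stored sub-vectors.
bridge_vectors = cartesian_concatenation([set_a, set_b])
print(len(bridge_vectors))
```

The number of bridge vectors is the product of the set sizes, while storage grows only with their sum, which is why a small number of sub-vectors can cover a large set of concatenated vectors.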
6 0.51776749 10 iccv-2013-A Framework for Shape Analysis via Hilbert Space Embedding
7 0.50495797 289 iccv-2013-Network Principles for SfM: Disambiguating Repeated Structures with Local Context
8 0.50055104 294 iccv-2013-Offline Mobile Instance Retrieval with a Small Memory Footprint
9 0.49428967 81 iccv-2013-Combining the Right Features for Complex Event Recognition
10 0.4939985 277 iccv-2013-Multi-channel Correlation Filters
11 0.47875527 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification
12 0.47842792 368 iccv-2013-SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor
14 0.47389659 79 iccv-2013-Coherent Object Detection with 3D Geometric Context from a Single Image
15 0.4731732 110 iccv-2013-Detecting Curved Symmetric Parts Using a Deformable Disc Model
16 0.47000676 48 iccv-2013-An Adaptive Descriptor Design for Object Recognition in the Wild
17 0.45759276 214 iccv-2013-Improving Graph Matching via Density Maximization
18 0.45535973 387 iccv-2013-Shape Anchors for Data-Driven Multi-view Reconstruction
19 0.44319871 419 iccv-2013-To Aggregate or Not to aggregate: Selective Match Kernels for Image Search
20 0.43961376 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation
topicId topicWeight
[(2, 0.104), (12, 0.015), (26, 0.072), (31, 0.036), (34, 0.037), (42, 0.087), (45, 0.277), (64, 0.042), (73, 0.023), (89, 0.141), (95, 0.019), (98, 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.75563538 117 iccv-2013-Discovering Details and Scene Structure with Hierarchical Iconoid Shift
Author: Tobias Weyand, Bastian Leibe
Abstract: Current landmark recognition engines are typically aimed at recognizing building-scale landmarks, but miss interesting details like portals, statues or windows. This is because they use a flat clustering that summarizes all photos of a building facade in one cluster. We propose Hierarchical Iconoid Shift, a novel landmark clustering algorithm capable of discovering such details. Instead of just a collection of clusters, the output of HIS is a set of dendrograms describing the detail hierarchy of a landmark. HIS is based on the novel Hierarchical Medoid Shift clustering algorithm that performs a continuous mode search over the complete scale space. HMS is completely parameter-free, has the same complexity as Medoid Shift and is easy to parallelize. We evaluate HIS on 800k images of 34 landmarks and show that it can extract an often surprising amount of detail and structure that can be applied, e.g., to provide a mobile user with more detailed information on a landmark or even to extend the landmark’s Wikipedia article.
2 0.67978269 353 iccv-2013-Revisiting the PnP Problem: A Fast, General and Optimal Solution
Author: Yinqiang Zheng, Yubin Kuang, Shigeki Sugimoto, Kalle Åström, Masatoshi Okutomi
Abstract: In this paper, we revisit the classical perspective-n-point (PnP) problem, and propose the first non-iterative O(n) solution that is fast, generally applicable and globally optimal. Our basic idea is to formulate the PnP problem into a functional minimization problem and retrieve all its stationary points by using the Gröbner basis technique. The novelty lies in a non-unit quaternion representation to parameterize the rotation and a simple but elegant formulation of the PnP problem into an unconstrained optimization problem. Interestingly, the polynomial system arising from its first-order optimality condition assumes two-fold symmetry, a nice property that can be utilized to improve speed and numerical stability of a Gröbner basis solver. Experiment results have demonstrated that, in terms of accuracy, our proposed solution is definitely better than the state-of-the-art O(n) methods, and even comparable with the reprojection error minimization method.
3 0.62782729 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
Author: Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook
Abstract: Detecting visually salient regions in images is one of the fundamental problems in computer vision. We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient.
4 0.61412632 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
Author: Chih-Yuan Yang, Ming-Hsuan Yang
Abstract: The goal of single-image super-resolution is to generate a high-quality high-resolution image based on a given low-resolution input. It is an ill-posed problem which requires exemplars or priors to better reconstruct the missing high-resolution image details. In this paper, we propose to split the feature space into numerous subspaces and collect exemplars to learn priors for each subspace, thereby creating effective mapping functions. The use of split input space facilitates both feasibility of using simple functions for super-resolution, and efficiency of generating high-resolution results. High-quality high-resolution images are reconstructed based on the effective learned priors. Experimental results demonstrate that the proposed algorithm performs efficiently and effectively over state-of-the-art methods.
5 0.60001552 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
Author: Shuran Song, Jianxiong Xiao
Abstract: Despite significant progress, tracking is still considered to be a very challenging task. Recently, the increasing popularity of depth sensors has made it possible to obtain reliable depth easily. This may be a game changer for tracking, since depth can be used to prevent model drift and handle occlusion. We also observe that current tracking algorithms are mostly evaluated on a very small number of videos collected and annotated by different groups. The lack of a reasonable size and consistently constructed benchmark has prevented a persuasive comparison among different algorithms. In this paper, we construct a unified benchmark dataset of 100 RGBD videos with high diversity, propose different kinds of RGBD tracking algorithms using 2D or 3D model, and present a quantitative comparison of various algorithms with RGB or RGBD input. We aim to lay the foundation for further research in both RGB and RGBD tracking, and our benchmark is available at http://tracking.cs.princeton.edu.
6 0.59941041 406 iccv-2013-Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time
7 0.59927726 180 iccv-2013-From Where and How to What We See
8 0.59828848 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
9 0.59811425 248 iccv-2013-Learning to Rank Using Privileged Information
10 0.59730077 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
11 0.59617877 239 iccv-2013-Learning Hash Codes with Listwise Supervision
12 0.5955193 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
13 0.59545654 448 iccv-2013-Weakly Supervised Learning of Image Partitioning Using Decision Trees with Structured Split Criteria
14 0.59499717 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
15 0.5949502 153 iccv-2013-Face Recognition Using Face Patch Networks
16 0.59477758 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
17 0.59474063 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies
18 0.59418029 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
19 0.59315705 202 iccv-2013-How Do You Tell a Blackbird from a Crow?
20 0.5929333 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps