iccv iccv2013 iccv2013-305 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang
Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank lists, which still require exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand users’ searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank-1 re-identification on the VIPeR and i-LIDS datasets.
Reference: text
sentIndex sentText sentNum sentScore
1 Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank lists, which still require exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. [sent-14, score-0.421]
2 Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. [sent-15, score-0.471]
3 In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. [sent-16, score-0.449]
4 Introduction. For person re-identification (re-id), a probe image serves as a query to be compared against a gallery that consists of images of different individuals captured at distributed locations at different times. [sent-21, score-0.801]
5 There are two reasons for such considerations: Visual ambiguities and disparities - In the context of person re-identification, the visual samples are ambiguous, i.e. [sent-32, score-0.266]
6 Off-line learning scalability - The performance of current distance learning based ranking approaches to person re-identification remains low [19, 26, 13, 17, 16, 25], e.g. [sent-38, score-0.271]
7 rank 1 recognition rate ≤ 30% on VIPeR, even with person probe images manually and carefully cropped. [sent-42, score-0.417]
8 Specifically, our method aims to minimise human-in-the-loop effort by one-shot negative feedback selection. [sent-98, score-0.508]
9 That is, a user only needs to select a single strong negative feedback, and optionally a few weak negatives, to trigger an automated refinement of the suboptimal rank list. [sent-99, score-0.631]
10 A strong negative is a highly ranked but confusing match in a machine-generated suboptimal rank list with clear visual dissimilarity to the probe image, whilst a weak negative is a visually similar but wrong match in the same rank list (Fig. [sent-100, score-1.375]
11 We formulate a new visual expansion model that not only synthesises pseudo-samples to complement the sparse negative selection, but also computes a generic mapping of visual change between different camera views. [sent-102, score-0.343]
12 In addition, we introduce an incremental affinity graph construction for propagating sparse belief accumulated from human-in-the-loop negative mining. [sent-103, score-0.368]
13 In essence, the proposed model combines sparse human negative feedback on-the-fly to steer automatic selection of more relevant re-identification features. [sent-104, score-0.547]
14 The method not only achieves 2.6 times the search efficiency of the typical exhaustive search strategy, but also brings about as much as over 30% performance improvement for rank 1 re-identification over current distance metric learning and ranking models. [sent-107, score-0.43]
15 This is based on “one shot” user negative selection only, and evaluated extensively using both the VIPeR and i-LIDS benchmark datasets. [sent-108, score-0.362]
16 Related Work. Post-rank optimisation is relatively unexplored in the person re-identification literature. [sent-110, score-0.286]
17 One related study [12] attempted to refine the rank list, but it does not model enabling a human in the loop to optimise the suboptimal rank list with only sparse feedback, down to one shot. [sent-111, score-0.454]
18 Another related work [18] requires explicit relative feedback in image classifier training to diffuse the label to unlabelled images. [sent-115, score-0.4]
19 Another line of work addresses face recognition in the multimedia domain with feedback for query expansion in continuously tracked faces, a significantly more constrained problem compared to person re-identification by a single image (see Fig. [sent-119, score-0.543]
20 Our negative mining concept is related to human relevance feedback mining in generic image search and retrieval. [sent-123, score-0.668]
21 They are: (1) top-ranked positive images are visually consistent with the probe (no visual ambiguities) [24, 10], (2) those positive images often form the largest cluster [28], or (3) sufficient positive samples can be gathered through text keyword expansion [21]. [sent-128, score-0.566]
22 A true positive person re-id match does not necessarily form a large cluster in the gallery set; on the contrary, it is often sparse. [sent-130, score-0.641]
23 Given a probe image to be matched against an unlabelled gallery set, a ranking function generates a suboptimal rank list of the gallery set according to each gallery image’s likelihood of being a true match of the probe image. [sent-193, score-2.218]
24 All other samples in the gallery space are considered as negatives, which can be divided into two negative types (Fig. [sent-195, score-0.599]
25 2): (1) Strong negatives - highly ranked gallery images that are visually clearly dissimilar to the probe image. [sent-196, score-0.992]
26 (2) Weak negatives - albeit not the true match, these highly ranked negative gallery images are visually similar to the probe image. [sent-198, score-1.214]
27 They could be good candidates for disambiguating visual uncertainties and optimising the initial ranking function. [sent-199, score-0.311]
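As a concrete illustration of this setup, the initial rank list can be produced by sorting gallery images by their distance to the probe; the ℓ1-norm ranker used as a baseline later in the experiments reduces to a few lines. The sketch below assumes features are already extracted as fixed-length vectors, and the function name is ours, not the paper's.

```python
import numpy as np

def initial_rank_list(x_p, X_g):
    """Rank the gallery X_g (n, d) by ell-1 distance to the probe x_p (d,)."""
    dists = np.abs(X_g - x_p).sum(axis=1)   # ell-1 distance to every gallery image
    order = np.argsort(dists)               # gallery indices, best match first
    return order, np.exp(-dists)            # rank list and similarity scores
```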
28 We wish to formulate a model to best exploit human-in-the-loop feedback for post-rank optimisation. [sent-200, score-0.395]
29 Given a probe instance, xp, we assume an initial ranking function finit is available (e.g. [sent-202, score-0.465]
30 If the true match is not within the top N ranked candidates, we wish to learn a post-rank function fpr for rank re-ordering. [sent-216, score-0.259]
31 As illustrated in Fig. 3: (a) A user selects one (any) strong negative from the top N ranked instances, denoted as xs−1. [sent-218, score-0.477]
32 Strong negatives (visually salient) are easier for a user to identify than weak negatives (visually subtle) in an on-the-fly feedback process. [sent-238, score-0.631]
33 We also show in comparative experiments in Section 6 that any performance advantage gained from additional multiple negative feedback over a single one-shot selection is insignificant. (b) For learning the post-rank function, we also require positive sample(s) in addition to the user selected negative sample. [sent-239, score-0.894]
34 To that end, visual expansion is computed to synthesise one or more instances of the probe image (x˜p) in the gallery view (Sec. [sent-240, score-0.895]
35 (c) An affinity graph weighted by an affinity matrix A¯ is constructed to capture the appearance similarities among all the images in the gallery view, including both the original gallery instances and the synthesised probe instances (Sec. [sent-243, score-1.702]
36 (d) This sparse negative information obtained from the user is propagated to its nearby neighbours in the gallery view via the above weighted affinity graph (Sec. [sent-246, score-0.887]
37 Cross-Camera View Visual Expansion. Learning a post-rank function for rank re-ordering requires both labelled negative and positive data. [sent-254, score-0.336]
38 Clearly, a single strong negative selected by the user is insufficient for this purpose. [sent-255, score-0.439]
39 Moreover, owing to potentially large feature inconsistency between different camera views, the probe image itself from the probe camera view cannot be readily used as a positive sample in the gallery view. [sent-260, score-1.019]
40 To resolve this problem, we specifically design a regression forest [4] based visual expansion method. [sent-261, score-0.28]
41 Moreover, its nature as an ensemble of trees allows efficient random permutation of the predictors to synthesise one or more samples that resemble the probe’s appearance as pseudo positive-labelled data in the gallery view. [sent-263, score-0.493]
42 Specifically, the visual variations between a probe and a gallery camera view are accounted for by the multi-output regression forest, with Tr trees, through learning an appearance mapping M : xp → xg ∈ Rd, (1) from a set of paired training instances extracted from cross-camera views (Fig. [sent-264, score-0.891]
43 A synthesised probe instance can then be generated as follows. [sent-266, score-0.991]
44 This process can be repeated to generate more synthesised probe instances if desired. [sent-278, score-0.483]
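The expansion step can be sketched with scikit-learn's multi-output random forest regressor. Subsampling the trees to obtain several distinct pseudo positives is our reading of the "random permutation in the predictors" idea; the paper's exact procedure may differ, and both function names are ours.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_expansion_forest(X_probe, X_gallery, n_trees=100, seed=0):
    """Learn the mapping M: x^p -> x^g from paired cross-camera features.
    X_probe, X_gallery: (m, d) arrays of paired training instances."""
    forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    forest.fit(X_probe, X_gallery)          # multi-output regression, one target per dim
    return forest

def synthesise_probe(forest, x_p, n_samples=3, subset=0.6, seed=0):
    """Synthesise pseudo positives x~p in the gallery view by averaging
    predictions over random subsets of the forest's trees."""
    rng = np.random.default_rng(seed)
    trees = forest.estimators_
    x_p = np.asarray(x_p).reshape(1, -1)
    out = []
    for _ in range(n_samples):
        pick = rng.choice(len(trees), size=max(1, int(subset * len(trees))),
                          replace=False)
        out.append(np.mean([trees[t].predict(x_p) for t in pick], axis=0).ravel())
    return np.vstack(out)                   # (n_samples, d) synthesised instances
```

For instance, `forest = fit_expansion_forest(Xp_train, Xg_train)` followed by `synthesise_probe(forest, x_p, n_samples=3)` yields three slightly different gallery-view versions of the probe.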
45 To that end, we shall describe how to propagate the sparse labelled samples to the large quantity of unlabelled gallery samples, so as to avoid labelling the gallery set exhaustively. [sent-284, score-0.936]
46 This transduction is facilitated by first constructing an affinity graph of the unlabelled gallery set. [sent-285, score-0.824]
47 (3) We then collect the pairwise distances of all gallery instances to construct an affinity matrix At ∈ Rn×n of that tree, with each element Atij given as Atij = exp(−distt(xig, xjg)). [sent-299, score-0.587]
48 (4) Intuitively, we assign affinity = 1 (distance = 0) to samples xig and xjg if they fall into the same leaf node, and a lower affinity otherwise. (Footnote 2: this fraction is typical in random forest bootstrap training [4].) [sent-300, score-0.312]
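A sketch of the clustering-forest affinity. Each tree contributes a binary same-leaf affinity, a common simplification consistent with the exp(−distt) rule above when distt is zero for samples sharing a leaf; it assumes a fitted scikit-learn forest exposing apply(), and the function name is ours.

```python
import numpy as np

def forest_affinity(forest, X_g):
    """Average same-leaf affinity over the trees of a fitted forest.
    X_g: (n, d) gallery features; returns the (n, n) matrix A_bar."""
    leaves = forest.apply(X_g)          # (n, T): leaf index of each sample per tree
    n, T = leaves.shape
    A = np.zeros((n, n))
    for t in range(T):
        A += leaves[:, t][:, None] == leaves[:, t][None, :]  # 1 if same leaf node
    return A / T                        # averaged over the T trees
```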
49 We now consider the case of including synthesised positives in the construction of the affinity graph. [sent-310, score-0.361]
50 Recall that our method is designed to need only a single strong negative to re-order the rank. [sent-311, score-0.275]
51 Nevertheless, a user has the option to select more negatives in more than one round of feedback, if necessary and desired. [sent-312, score-0.429]
52 To maintain a balance in positive-negative data for the post-rank function learning, the model needs to generate an equal number of synthesised positive probe instances {x˜p} as pseudo positive-labelled data in the gallery view. [sent-313, score-1.013]
53 Thus, the number of pseudo positives x˜p can vary depending on the number of negatives selected by a user cumulatively. [sent-314, score-0.39]
54 A more tractable approach is to first build a graph using the gallery data alone without the additional synthesised positives, and then expand it to accommodate the additional synthesised probe instances, as follows. [sent-316, score-1.141]
55 First, we compute the affinity between {x˜p} and all the existing gallery instances {xg}. [sent-317, score-0.519]
56 In particular, since the index of each gallery instance is stored in the leaf nodes… [sent-320, score-0.483]
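The incremental expansion can then be sketched by computing only the new rows and columns from shared leaf membership, leaving the existing gallery-to-gallery block untouched; this is a simplified reading, with all function names ours.

```python
import numpy as np

def expand_affinity(forest, A_gal, X_g, X_new):
    """Append k synthesised positives X_new to an existing (n, n) gallery
    affinity matrix A_gal without rebuilding the gallery-to-gallery block."""
    leaves_g = forest.apply(X_g)        # (n, T) stored leaf assignments
    leaves_n = forest.apply(X_new)      # (k, T) leaves of the new instances
    C = (leaves_n[:, None, :] == leaves_g[None, :, :]).mean(axis=2)  # (k, n) cross block
    S = (leaves_n[:, None, :] == leaves_n[None, :, :]).mean(axis=2)  # (k, k) new block
    return np.vstack([np.hstack([A_gal, C.T]),
                      np.hstack([C, S])])        # (n+k, n+k) expanded matrix
```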
57 Sparse Negative Propagation over Graph. After constructing the affinity graph, we diffuse the sparse negative and synthesised positive information over the graph to all other gallery instances. [sent-354, score-0.986]
58 First, we order the selected negatives and synthesised probe instances into the first l labelled samples L, followed by the remaining u gallery instances treated as unlabelled samples U. [sent-355, score-1.244]
59 Effects of negative accumulation: (a) three-dimensional embedding of gallery images obtained using multi-dimensional scaling after the first round of negative selection, (b) the embedding after the second round. [sent-458, score-0.828]
60 The gallery images are colour coded according to their new ranking score. [sent-459, score-0.561]
61 The shrinking region of bright yellow colour indicates the effectiveness of negative mining in demoting initial false matches. [sent-460, score-0.358]
62 This includes the intrinsic regulariser term, which enforces the relevance labels of nearby gallery instances with respect to the affinity graph to be close. [sent-476, score-0.632]
63 Finally, the estimated relevance of an unlabelled gallery instance xjg to the probe is computed as sjpr = fpr(xjg) = Σi αi A¯ij, with coefficients α = (α1, . . . , αl+u)T. [sent-495, score-0.86]
64 The parameter β balances the influence between the initial ranking and the user feedback selections. [sent-502, score-0.637]
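A sketch of the propagation and blending step. Labelled entries (+1 for synthesised positives, −1 for selected negatives) are diffused over the normalised affinity graph in closed form, in the spirit of manifold ranking, and the result is blended with the initial scores through β; the exact solver and the μ parameter are our assumptions, not the paper's verbatim formulation.

```python
import numpy as np

def propagate(A, y, s_init, beta=0.5, mu=0.99):
    """Diffuse sparse labels y over the affinity graph A and blend the
    result with initial ranking scores s_init through beta.
    A: (n, n) affinity; y: (n,) in {+1, -1, 0}; s_init: (n,) initial scores."""
    d = A.sum(axis=1)
    D = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D @ A @ D                       # symmetrically normalised graph
    n = A.shape[0]
    f = np.linalg.solve(np.eye(n) - mu * S, (1.0 - mu) * y)  # closed-form diffusion
    return beta * s_init + (1.0 - beta) * f   # blend feedback with initial ranking
```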
65 Negative Accumulation. After each round of negative mining, we add new negative selections to the cumulative strong negative set collected from previous rounds (and likewise to the weak negative set if weak negatives were selected). [sent-505, score-1.447]
66 Figure 4 shows an example of the effect of feedback accumulation in two rounds of negative mining. [sent-506, score-0.636]
67 As more negatives are accumulated, the classification boundary is refined, increasing the separation between the true match and other strong negatives. [sent-507, score-0.402]
68 The above negative accumulation is repeated together with the negative mining steps (Sec. [sent-508, score-0.467]
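Putting the pieces together, the round-by-round accumulation can be sketched as the driver loop below, reusing the earlier sketches (synthesise_probe, forest_affinity, expand_affinity, propagate). For simplicity a single forest serves both the cross-view mapping and the affinity graph, whereas the paper distinguishes them; select_negative stands in for the human picking a strong negative.

```python
import numpy as np

def pop_rounds(x_p, X_g, s_init, forest, select_negative, max_rounds=3):
    """Hypothetical driver for successive one-shot feedback rounds."""
    negatives = []                                    # cumulative strong negative set
    scores = s_init.copy()
    for _ in range(max_rounds):
        order = np.argsort(-scores)                   # current rank list, best first
        negatives.append(int(select_negative(order))) # (a) one-shot strong negative
        X_new = synthesise_probe(forest, x_p,
                                 n_samples=len(negatives))   # (b) balance positives
        A = expand_affinity(forest, forest_affinity(forest, X_g),
                            X_g, X_new)               # (c) expanded affinity graph
        y = np.zeros(A.shape[0])
        y[negatives] = -1.0                           # accumulated negatives
        y[len(X_g):] = 1.0                            # synthesised pseudo positives
        s_ext = np.concatenate([s_init, np.ones(len(X_new))])
        scores = propagate(A, y, s_ext)[:len(X_g)]    # (d) propagate, keep gallery
    return np.argsort(-scores)                        # final re-ordered rank list
```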
69 In the test set of each trial, we randomly chose one image from each person to set up the test gallery set and the remaining images were used as probe images. [sent-536, score-0.801]
70 Note that for the i-LIDS dataset, 50 images in the gallery set were insufficient to construct the intrinsic regulariser. [sent-537, score-0.417]
71 They were asked to manually annotate the weak and strong negatives ranked by an off-line ranking model given a set of random probe images. [sent-548, score-0.91]
72 It is evident from Table 1 that the proportions of weak and strong negatives are extremely imbalanced, with the strong negatives significantly outnumbering the weak negatives. [sent-549, score-1.022]
73 Overall, these results suggest that the relatively more salient strong negatives are more likely to be selected by a user during a post-rank feedback selection process. [sent-557, score-0.836]
74 This raises the question of how the POP model performs given a single strong negative feedback (i.e. [sent-558, score-0.595]
75 one-shot) as compared to its performance given multiple weak negatives as feedback. [sent-560, score-0.311]
76 Users were shown rank lists produced by the ℓ1-norm ranking model, and were asked to perform one-shot strong negative selection from the top 15 ranked results. [sent-565, score-0.362]
77 They were allocated a maximum of 3 rank feedback rounds with one strong negative selection each. [sent-566, score-0.799]
78 If the true match cannot be promoted into the top 15 ranks by the model after the maximal 3 rounds of one-shot postrank optimisation, the users were asked to continue with an exhaustive visual search to find the true match. [sent-567, score-0.511]
79 Figure 5 depicts several examples of actual user interactions during the post-rank optimisation process. [sent-570, score-0.287]
80 5(b), when a user selected the first candidate as a strong negative, both the first and second candidates, who were wearing brown jackets, were removed from the top ranks. [sent-573, score-0.28]
81 5(c) shows a failure case where selecting one strong negative is insufficient to resolve the visual ambiguity, since the true match experiences large appearance variation due to viewpoint change. [sent-575, score-0.422]
82 The probe and the true match are highlighted respectively with red and green bounding boxes. [sent-587, score-0.372]
83 The selected strong negative is denoted by a red cross. [sent-589, score-0.304]
84 ℓ1-norm, RankSVM, PRDC, MCC. First we evaluate the benefits of POP on existing ranking based person re-identification methods, using ℓ1-norm, RankSVM, PRDC and MCC as the initial rankers. [sent-607, score-0.271]
85 In each round, the negative selection was performed on the first N ranked images, N = 15 for the VIPeR dataset and N = 10 for the i-LIDS dataset due to its relatively smaller size. [sent-609, score-0.294]
86 We treat the negative selections collected offline from the first behaviour study (Sec. [sent-610, score-0.305]
87 Although the negative selection was performed without a live user in the loop, the experiments still used real feedback from users. [sent-613, score-0.682]
88 Performance versus the number of feedback rounds on VIPeR and i-LIDS. [sent-633, score-0.388]
89 The one-shot experiment depicted an extremely sparse feedback scenario, where only one strong negative within the top N ranked images was selected in a round. [sent-636, score-0.691]
90 The maximum number of strong negatives was set to 5 assuming that the users do not bother to annotate more. [sent-638, score-0.313]
91 With feedback increased to three rounds, the performance improves monotonically and converges. [sent-644, score-0.32]
92 The one-shot negative selection in just one feedback round yields stable and competitive results with no obvious degradation in comparison to the multi-shot, multi-round feedback, indicating the effectiveness of one-shot post-rank optimisation. [sent-649, score-0.615]
93 It uses Euclidean distance to construct the affinity matrix and optimises a ranking function with least-squares regression. [sent-657, score-0.272]
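For contrast with the clustering-forest affinity used by POP, the Euclidean affinity that EMR relies on can be sketched in a few lines; the bandwidth σ is a free parameter we introduce for illustration.

```python
import numpy as np

def euclidean_affinity(X, sigma=1.0):
    """Gaussian affinity over pairwise Euclidean distances, as in EMR."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))
```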
94 The y-axis shows the recognition rate at rank 5 as the number of feedback rounds increases. [sent-672, score-0.32]
95 In addition, we implemented two baseline approaches: (1) a naïve feedback method which simply demotes the strong negatives to the bottom of the ranking list in each round; (2) an SVM approach using the strong negatives and synthesised positive examples for training. [sent-674, score-1.175]
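Baseline (1) amounts to a one-line re-ranking rule; a minimal sketch (function name ours):

```python
import numpy as np

def naive_demote(order, strong_negatives):
    """Move the selected strong negatives to the bottom of the rank list,
    preserving the relative order of all remaining candidates."""
    neg_set = set(strong_negatives)
    kept = [i for i in order if i not in neg_set]
    demoted = [i for i in order if i in neg_set]
    return np.array(kept + demoted)
```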
96 The y-axis shows the recognition rate at rank 5 as the number of feedback rounds increases. [sent-691, score-0.32]
97 NPRF, PRF and the naïve feedback are generally poor in boosting the recognition rate on the VIPeR dataset, suggesting that the use of top-ranked images as positive feedback samples can lead to erroneous post-rank results in a re-identification task. [sent-693, score-0.701]
98 The better performance of POP over EMR suggests the more effective propagation of negatives over the clustering-forest based affinity graph, rather than the Euclidean-based graph. [sent-700, score-0.361]
99 To prepare the baseline without visual expansion, we randomly selected one weak negative image from the top N ranks (N = 15 for VIPeR, 10 for i-LIDS) to pair with the one-shot strong negative. [sent-704, score-0.447]
wordName wordTfidf (topN-words)
[('gallery', 0.384), ('feedback', 0.32), ('probe', 0.283), ('pop', 0.253), ('negatives', 0.226), ('viper', 0.221), ('synthesised', 0.2), ('negative', 0.188), ('optimisation', 0.152), ('ranking', 0.137), ('user', 0.135), ('affinity', 0.135), ('person', 0.134), ('emr', 0.117), ('xjg', 0.113), ('forest', 0.095), ('expansion', 0.089), ('strong', 0.087), ('reidentification', 0.087), ('weak', 0.085), ('rounds', 0.084), ('rank', 0.081), ('exhaustive', 0.08), ('unlabelled', 0.08), ('fpr', 0.077), ('mcc', 0.075), ('nprf', 0.075), ('postrank', 0.075), ('prf', 0.075), ('instances', 0.068), ('round', 0.068), ('optimising', 0.067), ('distt', 0.067), ('prdc', 0.067), ('ranked', 0.067), ('search', 0.066), ('selections', 0.06), ('ranksvm', 0.058), ('list', 0.058), ('rroanukn', 0.056), ('sinit', 0.056), ('suboptimal', 0.055), ('match', 0.055), ('xg', 0.052), ('systematic', 0.049), ('mining', 0.047), ('xig', 0.046), ('initial', 0.045), ('graph', 0.045), ('accumulation', 0.044), ('pseudo', 0.044), ('xw', 0.043), ('xs', 0.042), ('colour', 0.04), ('disparities', 0.04), ('selection', 0.039), ('regression', 0.038), ('aitj', 0.038), ('demoting', 0.038), ('ekenel', 0.038), ('irniantiakl', 0.038), ('spr', 0.038), ('synthesise', 0.038), ('whilst', 0.036), ('loy', 0.035), ('owing', 0.035), ('positive', 0.034), ('true', 0.034), ('invited', 0.033), ('behavioural', 0.033), ('regulariser', 0.033), ('visual', 0.033), ('jp', 0.033), ('studies', 0.033), ('xp', 0.033), ('labelled', 0.033), ('ambiguities', 0.032), ('visually', 0.032), ('leaf', 0.031), ('judgement', 0.031), ('fischer', 0.031), ('behaviour', 0.03), ('candidates', 0.029), ('inghua', 0.029), ('ali', 0.029), ('accommodate', 0.029), ('selected', 0.029), ('shall', 0.028), ('hirzer', 0.028), ('belkin', 0.028), ('interactive', 0.027), ('samples', 0.027), ('study', 0.027), ('positives', 0.026), ('mm', 0.026), ('ranks', 0.025), ('asked', 0.025), ('tre', 0.025), ('resolve', 0.025), ('shot', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation
Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang
Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank lists, which still require exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand users’ searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank-1 re-identification on the VIPeR and i-LIDS datasets.
2 0.32754719 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition
Author: Renliang Weng, Jiwen Lu, Junlin Hu, Gao Yang, Yap-Peng Tan
Abstract: Over the past two decades, a number of face recognition methods have been proposed in the literature. Most of them use holistic face images to recognize people. However, human faces are easily occluded by other objects in many real-world scenarios and we have to recognize the person of interest from his/her partial faces. In this paper, we propose a new partial face recognition approach by using feature set matching, which is able to align partial face patches to holistic gallery faces automatically and is robust to occlusions and illumination changes. Given each gallery image and probe face patch, we first detect keypoints and extract their local features. Then, we propose a Metric Learned Extended Robust Point Matching (MLERPM) method to discriminatively match local feature sets of a pair of gallery and probe samples. Lastly, the similarity of two faces is converted into the distance between two feature sets. Experimental results on three public face databases are presented to show the effectiveness of the proposed approach.
3 0.31920165 213 iccv-2013-Implied Feedback: Learning Nuances of User Behavior in Image Search
Author: Devi Parikh, Kristen Grauman
Abstract: User feedback helps an image search system refine its relevance predictions, tailoring the search towards the user’s preferences. Existing methods simply take feedback at face value: clicking on an image means the user wants things like it; commenting that an image lacks a specific attribute means the user wants things that have it. However, we expect there is actually more information behind the user’s literal feedback. In particular, a user’s (possibly subconscious) search strategy leads him to comment on certain images rather than others, based on how any of the visible candidate images compare to the desired content. For example, he may be more likely to give negative feedback on an irrelevant image that is relatively close to his target, as opposed to bothering with one that is altogether different. We introduce novel features to capitalize on such implied feedback cues, and learn a ranking function that uses them to improve the system’s relevance estimates. We validate the approach with real users searching for shoes, faces, or scenes using two different modes of feedback: binary relevance feedback and relative attributes-based feedback. The results show that retrieval improves significantly when the system accounts for the learned behaviors. We show that the nuances learned are domain-invariant, and useful for both generic user-independent search as well as personalized user-specific search.
4 0.25201881 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
Author: Adriana Kovashka, Kristen Grauman
Abstract: In interactive image search, a user iteratively refines his results by giving feedback on exemplar images. Active selection methods aim to elicit useful feedback, but traditional approaches suffer from expensive selection criteria and cannot predict informativeness reliably due to the imprecision of relevance feedback. To address these drawbacks, we propose to actively select “pivot” exemplars for which feedback in the form of a visual comparison will most reduce the system’s uncertainty. For example, the system might ask, “Is your target image more or less crowded than this image?” Our approach relies on a series of binary search trees in relative attribute space, together with a selection function that predicts the information gain were the user to compare his envisioned target to the next node deeper in a given attribute’s tree. It makes interactive search more efficient than existing strategies—both in terms of the system’s selection time as well as the user’s feedback effort.
5 0.20150399 267 iccv-2013-Model Recommendation with Virtual Probes for Egocentric Hand Detection
Author: Cheng Li, Kris M. Kitani
Abstract: Egocentric cameras can be used to benefit such tasks as analyzing fine motor skills, recognizing gestures and learning about hand-object manipulation. To enable such technology, we believe that the hands must be detected on the pixel level to gain important information about the shape of the hands and fingers. We show that the problem of pixel-wise hand detection can be effectively solved by posing the problem as a model recommendation task. As such, the goal of a recommendation system is to recommend the n-best hand detectors based on the probe set, a small amount of labeled data from the test distribution. This requirement of a probe set is a serious limitation in many applications, such as ego-centric hand detection, where the test distribution may be continually changing. To address this limitation, we propose the use of virtual probes which can be automatically extracted from the test distribution. The key idea is that many features, such as the color distribution or relative performance between two detectors, can be used as a proxy to the probe set. In our experiments we show that the recommendation paradigm is well-equipped to handle complex changes in the appearance of the hands in first-person vision. In particular, we show how our system is able to generalize to new scenarios by testing our model across multiple users.
6 0.20022395 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition
8 0.15620223 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
9 0.15270856 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification
10 0.12902546 313 iccv-2013-Person Re-identification by Salience Matching
11 0.11448504 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
12 0.10313972 52 iccv-2013-Attribute Adaptation for Personalized Image Search
13 0.098623775 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation
14 0.089833632 205 iccv-2013-Human Re-identification by Matching Compositional Template with Cluster Sampling
15 0.083245724 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
16 0.080731593 437 iccv-2013-Unsupervised Random Forest Manifold Alignment for Lipreading
17 0.076291822 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
18 0.075285651 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection
19 0.074328333 178 iccv-2013-From Semi-supervised to Transfer Counting of Crowds
20 0.073573858 338 iccv-2013-Randomized Ensemble Tracking
topicId topicWeight
[(0, 0.176), (1, 0.053), (2, -0.064), (3, -0.112), (4, 0.042), (5, -0.008), (6, 0.05), (7, 0.029), (8, 0.063), (9, 0.081), (10, -0.027), (11, 0.01), (12, 0.01), (13, -0.011), (14, 0.012), (15, -0.022), (16, -0.033), (17, -0.043), (18, 0.006), (19, 0.04), (20, -0.062), (21, -0.177), (22, -0.118), (23, -0.076), (24, 0.151), (25, 0.2), (26, -0.18), (27, 0.244), (28, -0.028), (29, -0.226), (30, -0.015), (31, -0.143), (32, -0.116), (33, 0.101), (34, 0.134), (35, 0.087), (36, -0.115), (37, 0.093), (38, 0.079), (39, 0.025), (40, 0.011), (41, -0.036), (42, 0.044), (43, -0.195), (44, -0.054), (45, -0.073), (46, -0.029), (47, -0.024), (48, 0.069), (49, 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.94700718 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation
Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang
Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank lists, which still require exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand users’ searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank-1 re-identification on the VIPeR and i-LIDS datasets.
2 0.69806874 213 iccv-2013-Implied Feedback: Learning Nuances of User Behavior in Image Search
Author: Devi Parikh, Kristen Grauman
Abstract: User feedback helps an image search system refine its relevance predictions, tailoring the search towards the user’s preferences. Existing methods simply take feedback at face value: clicking on an image means the user wants things like it; commenting that an image lacks a specific attribute means the user wants things that have it. However, we expect there is actually more information behind the user’s literal feedback. In particular, a user’s (possibly subconscious) search strategy leads him to comment on certain images rather than others, based on how any of the visible candidate images compare to the desired content. For example, he may be more likely to give negative feedback on an irrelevant image that is relatively close to his target, as opposed to bothering with one that is altogether different. We introduce novel features to capitalize on such implied feedback cues, and learn a ranking function that uses them to improve the system’s relevance estimates. We validate the approach with real users searching for shoes, faces, or scenes using two different modes of feedback: binary relevance feedback and relative attributes-based feedback. The results show that retrieval improves significantly when the system accounts for the learned behaviors. We show that the nuances learned are domain-invariant, and useful for both generic user-independent search as well as personalized user-specific search.
3 0.64604914 267 iccv-2013-Model Recommendation with Virtual Probes for Egocentric Hand Detection
Author: Cheng Li, Kris M. Kitani
Abstract: Egocentric cameras can be used to benefit such tasks as analyzing fine motor skills, recognizing gestures and learning about hand-object manipulation. To enable such technology, we believe that the hands must be detected on the pixel level to gain important information about the shape of the hands and fingers. We show that the problem of pixel-wise hand detection can be effectively solved by posing the problem as a model recommendation task. As such, the goal of a recommendation system is to recommend the n-best hand detectors based on the probe set, a small amount of labeled data from the test distribution. This requirement of a probe set is a serious limitation in many applications, such as ego-centric hand detection, where the test distribution may be continually changing. To address this limitation, we propose the use of virtual probes which can be automatically extracted from the test distribution. The key idea is that many features, such as the color distribution or relative performance between two detectors, can be used as a proxy to the probe set. In our experiments we show that the recommendation paradigm is well-equipped to handle complex changes in the appearance of the hands in first-person vision. In particular, we show how our system is able to generalize to new scenarios by testing our model across multiple users.
4 0.55007029 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
Author: Adriana Kovashka, Kristen Grauman
Abstract: In interactive image search, a user iteratively refines his results by giving feedback on exemplar images. Active selection methods aim to elicit useful feedback, but traditional approaches suffer from expensive selection criteria and cannot predict informativeness reliably due to the imprecision of relevance feedback. To address these drawbacks, we propose to actively select “pivot” exemplars for which feedback in the form of a visual comparison will most reduce the system’s uncertainty. For example, the system might ask, “Is your target image more or less crowded than this image?” Our approach relies on a series of binary search trees in relative attribute space, together with a selection function that predicts the information gain were the user to compare his envisioned target to the next node deeper in a given attribute’s tree. It makes interactive search more efficient than existing strategies—both in terms of the system’s selection time as well as the user’s feedback effort.
5 0.53904492 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition
Author: Renliang Weng, Jiwen Lu, Junlin Hu, Gao Yang, Yap-Peng Tan
Abstract: Over the past two decades, a number of face recognition methods have been proposed in the literature. Most of them use holistic face images to recognize people. However, human faces are easily occluded by other objects in many real-world scenarios and we have to recognize the person of interest from his/her partial faces. In this paper, we propose a new partial face recognition approach by using feature set matching, which is able to align partial face patches to holistic gallery faces automatically and is robust to occlusions and illumination changes. Given each gallery image and probe face patch, we first detect keypoints and extract their local features. Then, we propose a Metric Learned Extended Robust Point Matching (MLERPM) method to discriminatively match local feature sets of a pair of gallery and probe samples. Lastly, the similarity of two faces is converted into the distance between two feature sets. Experimental results on three public face databases are presented to show the effectiveness of the proposed approach.
6 0.49386752 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition
7 0.47836024 313 iccv-2013-Person Re-identification by Salience Matching
8 0.46462631 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification
9 0.40539694 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
10 0.36054602 178 iccv-2013-From Semi-supervised to Transfer Counting of Crowds
11 0.32805046 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
12 0.32655498 437 iccv-2013-Unsupervised Random Forest Manifold Alignment for Lipreading
13 0.32458103 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
14 0.32038173 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation
15 0.31478092 52 iccv-2013-Attribute Adaptation for Personalized Image Search
16 0.30694425 154 iccv-2013-Face Recognition via Archetype Hull Ranking
17 0.29256278 344 iccv-2013-Recognising Human-Object Interaction via Exemplar Based Modelling
18 0.29161769 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
19 0.28982013 124 iccv-2013-Domain Transfer Support Vector Ranking for Person Re-identification without Target Camera Label Information
20 0.28934348 404 iccv-2013-Structured Forests for Fast Edge Detection
topicId topicWeight
[(2, 0.093), (7, 0.031), (12, 0.345), (26, 0.072), (31, 0.029), (40, 0.014), (42, 0.118), (48, 0.012), (64, 0.035), (73, 0.021), (89, 0.11), (98, 0.015)]
simIndex simValue paperId paperTitle
1 0.85288203 413 iccv-2013-Target-Driven Moire Pattern Synthesis by Phase Modulation
Author: Pei-Hen Tsai, Yung-Yu Chuang
Abstract: This paper investigates an approach for generating two grating images so that the moiré pattern of their superposition resembles the target image. Our method is grounded on the fundamental moiré theorem. By focusing on the visually most dominant (1, −1)-moiré component, we obtain the phase modulation constraint on the phase shifts between the two grating images. For improving the visual appearance of the grating images and the hiding capability of the embedded image, a smoothness term is added to spread information between the two grating images and an appearance phase function is used to add irregular structures into the grating images. The grating images can be printed on transparencies and the hidden image decoding can be performed optically by overlaying them together. The proposed method enables the creation of moiré art and allows visual decoding without computers.
same-paper 2 0.82309395 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation
Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang
Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank lists, which still require exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand users’ searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank-1 re-identification on the VIPeR and i-LIDS datasets.
3 0.7339775 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
Author: Mohamed Elhoseiny, Babak Saleh, Ahmed Elgammal
Abstract: The main question we address in this paper is how to use purely textual description of categories with no training images to learn visual classifiers for these categories. We propose an approach for zero-shot learning of object categories where the description of unseen categories comes in the form of typical text such as an encyclopedia entry, without the need to explicitly defined attributes. We propose and investigate two baseline formulations, based on regression and domain adaptation. Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the classifier parameters for new classes. We applied the proposed approach on two fine-grained categorization datasets, and the results indicate successful classifier prediction.
4 0.64810705 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness
Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool
Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.
5 0.64547426 417 iccv-2013-The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection
Author: Mihai Zanfir, Marius Leordeanu, Cristian Sminchisescu
Abstract: Human action recognition under low observational latency is receiving a growing interest in computer vision due to rapidly developing technologies in human-robot interaction, computer gaming and surveillance. In this paper we propose a fast, simple, yet powerful non-parametric Moving Pose (MP) framework for low-latency human action and activity recognition. Central to our methodology is a moving pose descriptor that considers both pose information as well as differential quantities (speed and acceleration) of the human body joints within a short time window around the current frame. The proposed descriptor is used in conjunction with a modified kNN classifier that considers both the temporal location of a particular frame within the action sequence as well as the discrimination power of its moving pose descriptor compared to other frames in the training set. The resulting method is non-parametric and enables low-latency recognition, one-shot learning, and action detection in difficult unsegmented sequences. Moreover, the framework is real-time, scalable, and outperforms more sophisticated approaches on challenging benchmarks like MSR-Action3D or MSR-DailyActivities3D.
6 0.62301654 367 iccv-2013-SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
7 0.56891859 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve
8 0.56855828 274 iccv-2013-Monte Carlo Tree Search for Scheduling Activity Recognition
9 0.5664196 338 iccv-2013-Randomized Ensemble Tracking
10 0.55282551 428 iccv-2013-Translating Video Content to Natural Language Descriptions
11 0.55198723 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
12 0.55090177 124 iccv-2013-Domain Transfer Support Vector Ranking for Person Re-identification without Target Camera Label Information
13 0.55007654 239 iccv-2013-Learning Hash Codes with Listwise Supervision
14 0.54440963 180 iccv-2013-From Where and How to What We See
16 0.54150546 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
17 0.54079318 440 iccv-2013-Video Event Understanding Using Natural Language Descriptions
18 0.53950673 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection
19 0.53211367 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
20 0.52979344 44 iccv-2013-Adapting Classification Cascades to New Domains