
451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification


Source: pdf

Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang

Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract. Human eyes can recognize person identities based on some small salient regions. [sent-4, score-0.334]

2 Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. [sent-6, score-0.233]

3 In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. [sent-7, score-1.006]

4 First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. [sent-9, score-0.643]

5 Second, we learn human salience in an unsupervised manner. [sent-10, score-0.829]

6 To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. [sent-11, score-1.349]

7 Introduction. Person re-identification handles pedestrian matching and ranking across non-overlapping camera views. [sent-14, score-0.219]

8 It has many important applications in video surveillance, saving much of the human effort spent exhaustively searching for a person in large amounts of video sequences. [sent-15, score-0.269]

9 The same person observed in different camera views often undergoes significant variation in viewpoints, poses, camera settings, illumination, occlusions and background, which usually make intra-personal variations even larger than interpersonal variations as shown in Figure 1. [sent-18, score-0.446]

10 Upper part of the figure shows an example of matching based on dense correspondence and weighting with salience values, and the lower part shows some pairs of images with their salience maps. [sent-23, score-1.659]

11 In person re-identification, viewpoint change and pose variation cause uncontrolled misalignment between images. [sent-29, score-0.483]

12 For example in Figure 1, the central region of image (a1) is a backpack in camera view A, while it becomes an arm in image (b1) in camera view B. [sent-30, score-0.178]

13 In our method, patch matching is applied to tackle the misalignment problem. [sent-32, score-0.306]

14 In addition, based on prior knowledge on pedestrian structures, some constraints are added in patch matching in order to enhance the matching accuracy. [sent-33, score-0.392]

15 With patch matching, we are able to align the blue tilted stripe on the handbag of the lady in the dashed black boxes in Figure 1. [sent-34, score-0.163]

16 However, if they are small in size, salience information is often hidden when computing similarities of images. [sent-36, score-0.716]

17 In this paper, salience means distinct features that 1) are discriminative in making a person standing out from their companions, and 2) are reliable in finding the same person across different views. [sent-37, score-1.244]

18 However, human eyes can easily identify the matching pairs because they have distinct features, e. [sent-39, score-0.155]

19 person (a1 − b1) has a backpack with tilted blue stripes, person (a2 − b2) has a red folder under her arms, and person (a3 − b3) has a red bottle in his hand. [sent-41, score-0.703]

20 These distinct features are discriminative in distinguishing one from others and robust in matching themselves across different camera views. [sent-42, score-0.154]

21 Intuitively, if a body part is salient in one camera view, it is usually also salient in another camera view. [sent-43, score-0.363]

22 Moreover, our computation of salience is based on the comparison with images from a large scale reference dataset rather than a small group of persons. [sent-44, score-0.768]

23 Clothes and trousers are generally considered as the most important regions for person re-identification. [sent-47, score-0.26]

24 Aided by patch matching, these discriminative and reliable features are employed in this paper for person re-identification. [sent-48, score-0.441]

25 First, an unsupervised framework is proposed to extract distinctive features for person re-identification without requiring manually labeled person identities in the training procedure. [sent-50, score-0.535]

26 Second, patch matching is utilized with adjacency constraint for handling the misalignment problem caused by viewpoint change, pose variation and articulation. [sent-51, score-0.59]

27 We show that the constrained patch matching greatly improves person re-identification accuracy because of its flexibility in handling large viewpoint change. [sent-52, score-0.566]

28 Third, human salience is learned in an unsupervised way. [sent-53, score-0.829]

29 Different from general image salience detection methods [4], our salience is especially designed for human matching, and has the following properties. [sent-54, score-1.478]

30 2) Distinct patches are considered as salient only when they are matched and distinct in both camera views. [sent-56, score-0.318]

31 3) Human salience itself is a useful descriptor for pedestrian matching. [sent-57, score-0.805]

32 For example, a person only with salient upper body and a person only with salient lower body must have different identities. [sent-58, score-0.706]

33 [25] formulated person re-identification as a ranking problem, and used ensembled RankSVMs to learn pairwise similarity. [sent-62, score-0.223]

34 Some unsupervised methods have also been developed for person re-identification [10, 21, 22, 19]. [sent-72, score-0.29]

35 Our approach exploits the salience information among person images, and it can be generalized to make use of these features. [sent-81, score-0.939]

36 Our approach differs from them in that patch matching is employed to handle spatial misalignment. [sent-91, score-0.245]

37 [19] used an attribute-based weighting scheme, which shares a similar spirit with our salience in finding the unique and inherent appearance property. [sent-94, score-0.754]

38 Experimental results show that our defined salience is much more effective. [sent-99, score-0.716]

39 Inheriting the characteristics of part-based and region-based approaches, fine-grained methods including optical flow at the pixel level, keypoint feature matching and local patch matching are often better choices for more robust alignment. [sent-102, score-0.356]

40 In our approach, considering moderate resolution of human images captured by far-field surveillance cameras, we adopt the mid-level local patches for matching persons. [sent-103, score-0.194]

41 Different than general patch matching approaches, a simple but effective horizontal constraint is imposed on searching matched patches, which makes patch matching more adaptive in person re-identification. [sent-105, score-0.777]

42 As in the setting for extracting dense color histograms, a dense grid of patches is sampled on each human image. [sent-115, score-0.228]
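The dense sampling step can be sketched as follows. This is a minimal illustration only: the function name `dense_patch_grid`, the patch size, the stride, and the image size are hypothetical values chosen for the example, not parameters stated in the sentences above.

```python
def dense_patch_grid(height, width, patch=10, stride=4):
    """Return the top-left (row, col) corner of every patch on a dense grid.

    `patch` and `stride` are illustrative assumptions; the summary above
    does not state the values used in the paper.
    """
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    # grid[m][n] is the patch at the m-th row and n-th column of the grid
    return [[(r, c) for c in cols] for r in rows]


grid = dense_patch_grid(128, 48)
print(len(grid), "patch rows,", len(grid[0]), "patch columns")
```

Indexing the result as `grid[m][n]` mirrors the (m, n) row and column indexing that the patch features use later in the text.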

43 We divide each patch into 4 × 4 cells, quantize the orientations … [sent-116, score-0.163]

44 In summary, each patch is finally represented by a … [sent-123, score-0.163]

45 dColorSIFT features in a human image are represented as $x^{A,p}_{m,n}$, where $(A, p)$ denotes the p-th image in camera A, and $(m, n)$ denotes the patch centered at the m-th row and the n-th column of image p. [sent-129, score-0.281]

46 All patches in $T^{A,p}(m)$ have the same search set $S$ for patch matching in image $q$ from camera $B$: $S(x^{A,p}_{m,n}, x^{B,q}) = T^{B,q}(m), \ \forall x^{A,p}_{m,n} \in T^{A,p}(m)$, (2) where $x^{B,q}$ represents the collection of all patch features in image $q$ from camera $B$. [sent-134, score-0.643]

47 However, bounding boxes produced by a human detector are not always well aligned, and uncontrolled human pose variations also exist in some conditions. [sent-137, score-0.191]

48 When l is very small, a patch may not find a correct match due to vertical misalignment. [sent-154, score-0.163]

49 When l is set to be very large, a patch in the upper body would find a matched patch on the legs. [sent-155, score-0.431]

50 Thus less relaxed search space cannot well tolerate the spatial variation while more relaxed search space increases the chance of matching different body parts. [sent-156, score-0.273]
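The trade-off in the choice of l can be illustrated with a small sketch of adjacency-constrained search. It assumes patch features are stored as `target[row][col]` lists of floats and uses plain Euclidean distance as the matching cost; the function names and the toy data are hypothetical, and the paper's own similarity score may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adjacency_search(feat, m, target, l=2):
    """Find the patch in `target` closest to `feat`, searching rows m-l..m+l only.

    Returns (distance, row, col) of the best match inside the relaxed row band.
    """
    lo = max(0, m - l)
    hi = min(len(target) - 1, m + l)
    best = None
    for r in range(lo, hi + 1):
        for c, cand in enumerate(target[r]):
            d = euclidean(feat, cand)
            if best is None or d < best[0]:
                best = (d, r, c)
    return best


# Toy target image: 5 patch rows x 2 patch columns of 1-D features.
target = [[[float(r * 10 + c)] for c in range(2)] for r in range(5)]
narrow = adjacency_search([40.0], m=1, target=target, l=1)
wide = adjacency_search([40.0], m=1, target=target, l=4)
print(narrow, wide)
```

With a small l the search stays near row m and may miss the true match; with a large l it can reach a patch several rows away, which in a pedestrian image would often be a different body part.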

51 Generalized patch matching is a very mature technique in computer vision. [sent-159, score-0.245]

52 $\|\cdot\|_2$ is the Euclidean distance between patch features [sent-167, score-0.163]

53 Local patches are densely sampled, and five exemplar patches on different body parts are shown in red boxes. [sent-172, score-0.173]

54 (b) One nearest neighbor from each reference image is returned by adjacency search for each patch on the left, and then N nearest neighbors from N reference images are sorted. [sent-173, score-0.528]

55 Figure 2 shows some visually similar patches returned by the discriminative adjacency constrained search. [sent-177, score-0.222]

56 Unsupervised Salience Learning. With dense correspondence, we learn human salience with unsupervised methods. [sent-179, score-0.887]

57 To apply the KNN distance to person re-identification, we search for the K-nearest neighbors of a test patch in the output set of the dense correspondence. [sent-185, score-0.493]

58 With this strategy, salience is better adapted to re-identification problem. [sent-186, score-0.716]

59 Following the shared goal of abnormality detection and salience detection, we redefine the salient patch in our task as follows: Salience for person re-identification: salient patches are those that possess the uniqueness property among a specific set. [sent-187, score-1.346]

60 After building the dense correspondences between a test image and images in the reference set, the most similar patch in every image of the reference set is returned for each test patch, i. [sent-189, score-0.352]

61 , each test patch $x^{A,p}_{m,n}$ has $N_r$ neighbors in set $X_{nn}(x^{A,p}_{m,n})$, $X_{nn}(x^{A,p}_{m,n}) = \{ x \mid \arg\max_{\hat{x} \in S_{p,q}} s(x^{A,p}_{m,n}, \hat{x}),\ q = 1, 2, \ldots \}$ [sent-191, score-0.187]

62 We apply a scheme similar to [5] to $X_{nn}(x^{A,p}_{m,n})$ of each test patch, and the KNN distance is utilized to define the salience score: $\mathrm{score}_{knn}(x^{A,p}_{m,n}) = D_k(X_{nn}(x^{A,p}_{m,n}))$, (5) where $D_k$ denotes the distance of the k-th nearest neighbor. [sent-198, score-0.772]

63 If the distribution of the reference set reflects the test scenario well, the salient patches can only find a limited number ($k = \alpha N_r$) of visually similar neighbors, as shown in Figure 3(a), and then $\mathrm{score}_{knn}(x^{A,p}_{m,n})$ is expected to be large. [sent-199, score-0.269]

64 Since k depends on the size of the reference set, the defined salience score works well even if the reference size is very large. [sent-201, score-0.848]

65 The goal of salience detection for person re-identification is to identify persons with unique appearance. [sent-203, score-0.992]

66 We assume that if a person has such unique appearance, more than half of the people in the reference set are dissimilar with him/her. [sent-204, score-0.275]
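The KNN salience score described above can be sketched in a few lines. The sketch assumes the input is the list of distances from one test patch to its best match in each of the N_r reference images; the function name is hypothetical, and the default `alpha=0.5` is an assumption taken from the "more than half" remark rather than a value confirmed by the summary.

```python
import math


def knn_salience(nn_distances, alpha=0.5):
    """Distance to the k-th nearest neighbor, with k = alpha * N_r (at least 1).

    `nn_distances` holds one distance per reference image. alpha=0.5 is an
    illustrative default motivated by the 'more than half' assumption.
    """
    n_r = len(nn_distances)
    k = max(1, math.ceil(alpha * n_r))
    return sorted(nn_distances)[k - 1]


# A common-looking patch has many close neighbors in the reference set;
# a distinctive patch has only a few.
common = [0.1, 0.2, 0.1, 0.3, 0.2, 0.9]
rare = [0.1, 0.8, 0.9, 0.7, 0.85, 0.95]
print(knn_salience(common), knn_salience(rare))
```

Because k scales with the reference set size, the same alpha can be reused when the reference set grows, which matches the remark that the score works even for large reference sets.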

67 To seek a more principled method to compute human salience, one-class SVM salience is discussed in Section 4. [sent-206, score-0.762]

68 Our results of unsupervised KNN salience are shown in Figure 4(b) on the ETHZ dataset and 4(c) on the VIPeR dataset. [sent-210, score-0.783]

69 Salience scores are assigned to the center of patches, and the salience map is upsampled for better visualization. [sent-211, score-0.716]

70 Our unsupervised learning method better captures the salient regions. [sent-212, score-0.156]

71 To approximate the KNN salience algorithm (Section 4. [sent-242, score-0.716]

72 1) in a nonparametric form, the salience score is re-defined in terms of the kernel one-class SVM decision function: $\mathrm{score}_{ocsvm}(x^{A,p}_{m,n}) = d(x^{A,p}_{m,n}, x^*)$, $x^* = \arg\max_{x \in X_{nn}(x^{A,p}_{m,n})} f(x)$, (8) where $d$ is the Euclidean distance between patch features. [sent-243, score-0.253]
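The structure of this score (pick the neighbor that maximizes a decision function, then measure the test patch's distance to it) can be sketched without committing to a particular SVM library. In the sketch below the decision function is passed in as a callable; the example uses negative distance to the neighbor centroid purely as a stand-in, which is an assumption for illustration and not a trained one-class SVM.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def decision_salience(feat, neighbors, decision_fn):
    """score = d(feat, x*), where x* maximizes decision_fn over the neighbors."""
    x_star = max(neighbors, key=decision_fn)
    return euclidean(feat, x_star)


# Stand-in decision function (NOT a one-class SVM): larger values for
# neighbors closer to the centroid of the neighbor set.
neighbors = [[0.0, 0.0], [2.0, 2.0], [4.0, 0.0], [6.0, 6.0]]
centroid = [sum(col) / len(neighbors) for col in zip(*neighbors)]


def stand_in_f(x):
    return -euclidean(x, centroid)


score = decision_salience([5.0, 6.0], neighbors, stand_in_f)
print(score)
```

In practice `decision_fn` would be replaced by the decision function of a one-class SVM fitted on the neighbor set; only that callable changes, while the argmax and distance steps stay the same.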

73 Our experiments show very similar results in person re-identification with the two salience detection methods. [sent-244, score-0.939]

74 Matching for re-identification. Dense correspondence and salience described in Sections 3 and 4 are used for person re-identification. [sent-247, score-0.988]

75 Bi-directional Weighted Matching. A bi-directional weighted matching mechanism is designed to incorporate salience information into dense correspondence matching. [sent-250, score-0.93]

76 $q^* = \arg\max_q \mathrm{Sim}(x^{A,p}, x^{B,q})$, (10) where $x^{A,p}$ and $x^{B,q}$ are collections of patch features in two images, i. [sent-261, score-0.163]

77 Intuitively, images of the same person would be more likely to have similar salience distributions than those of different persons. [sent-264, score-0.939]

78 Thus, the difference in salience score can be used as a penalty to the similarity score. [sent-265, score-0.771]

79 In another aspect, large salience scores are used to enhance the similarity score of matched patches. [sent-266, score-0.835]

80 Patches in red boxes are matched in dense correspondence with the guidance of corresponding salience scores in dark blue boxes. [sent-269, score-0.887]

81 where α is a parameter controlling the penalty of salience difference. [sent-270, score-0.716]

82 One can also change the salience score to scoreocsvm in a more principled framework without choosing the parameter k in Eq. [sent-271, score-0.836]

83 These two datasets are the most widely used for evaluation and reflect most of the challenges in real-world person re-identification applications, e. [sent-286, score-0.223]

84 It is one of the most challenging person re-identification datasets, which suffers from significant viewpoint change, pose variation, and illumination difference between two camera views. [sent-297, score-0.429]

85 Each probe image is matched with every gallery image, and the correctly matched rank is obtained. [sent-305, score-0.281]
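The evaluation protocol described here (rank the gallery for each probe, record the rank of the correct match, then accumulate) can be sketched as below. The similarity matrix is toy data invented for the example, and the helper names are hypothetical.

```python
def match_rank(scores, true_index):
    """1-based rank of the true gallery image under descending similarity."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order.index(true_index) + 1


def cmc(ranks, max_rank):
    """Cumulative matching characteristic: fraction of probes matched within rank k."""
    n = len(ranks)
    return [sum(r <= k for r in ranks) / n for k in range(1, max_rank + 1)]


# Toy data: sim[i][j] is the similarity of probe i to gallery image j,
# and the correct match of probe i is gallery image i.
sim = [
    [0.9, 0.1, 0.3],
    [0.4, 0.2, 0.8],
    [0.1, 0.2, 0.7],
]
ranks = [match_rank(row, i) for i, row in enumerate(sim)]
curve = cmc(ranks, max_rank=3)
print(ranks, curve)
```

The first entry of `curve` corresponds to the rank 1 matching rate quoted in the comparisons.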

86 It is observed that our two salience detection based methods (SDC knn and SDC ocsvm) outperform all the three benchmarking approaches. [sent-313, score-0.858]

87 In particular, rank 1 matching rate is around 24% for SDC knn and 25% for SDC ocsvm, versus 20% for SDALF, 15% for LDFV, and 12% for ELF. [sent-314, score-0.257]

88 First, the dense correspondence matching can tolerate a larger extent of pose and appearance variations. [sent-317, score-0.202]

89 Second, we incorporate human salience information to guide dense correspondence. [sent-318, score-0.82]

90 By combining with other descriptors, the rank 1 matching rate of eSDC knn goes to 26. [sent-319, score-0.257]

91 The compared methods include the classical metric learning approaches, such as LMNN [29] and ITML [29], and their variants modified for person re-identification, such as PRDC [29], attribute PRDC (denoted as aPRDC) [19], and PCCA [24]. [sent-324, score-0.245]

92 Each image in probe is matched to every gallery image and the correct matched rank is obtained. [sent-349, score-0.281]

93 #3, our eSDC knn and eSDC ocsvm outperform all other methods. [sent-355, score-0.252]

94 Conclusion. In this work, we propose an unsupervised framework with salience detection for person re-identification. [sent-359, score-1.006]

95 Patch matching is utilized with adjacency constraint for handling the viewpoint and pose variation. [sent-360, score-0.341]

96 Human salience is learned in an unsupervised manner to seek discriminative and reliable patch matching. [sent-362, score-0.934]

97 Experiments show that our unsupervised salience learning approach greatly improves the performance of person re-identification. [sent-363, score-1.006]

98 Exploiting local and global patch rarities for saliency detection. [sent-392, score-0.163]

99 Bicov: a novel image representation for person re-identification and face verification. [sent-512, score-0.223]

100 Local descriptors encoded by fisher vectors for person re-identification. [sent-518, score-0.223]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('salience', 0.716), ('person', 0.223), ('sdc', 0.183), ('patch', 0.163), ('viper', 0.153), ('knn', 0.142), ('cmc', 0.137), ('sdalf', 0.128), ('esdc', 0.124), ('ocsvm', 0.11), ('cumulated', 0.103), ('xnn', 0.103), ('adjacency', 0.097), ('pls', 0.093), ('salient', 0.089), ('matching', 0.082), ('ethz', 0.082), ('cam', 0.073), ('viewpoint', 0.072), ('camera', 0.072), ('unsupervised', 0.067), ('patches', 0.066), ('pedestrian', 0.065), ('matched', 0.064), ('ldfv', 0.062), ('rplm', 0.062), ('scoreknn', 0.062), ('scoreocsvm', 0.062), ('misalignment', 0.061), ('gallery', 0.061), ('probe', 0.059), ('dense', 0.058), ('persons', 0.053), ('reference', 0.052), ('hypersphere', 0.051), ('correspondence', 0.049), ('human', 0.046), ('appoaches', 0.041), ('bicov', 0.041), ('byers', 0.041), ('correpondence', 0.041), ('ebicov', 0.041), ('whsv', 0.041), ('body', 0.041), ('schwartz', 0.041), ('weighting', 0.038), ('uncontrolled', 0.037), ('trousers', 0.037), ('bak', 0.037), ('gheissari', 0.037), ('prdc', 0.037), ('pose', 0.035), ('neighbor', 0.034), ('mscr', 0.034), ('backpack', 0.034), ('prosser', 0.034), ('reidentification', 0.034), ('rank', 0.033), ('pcca', 0.032), ('discriminative', 0.032), ('svm', 0.031), ('gong', 0.031), ('bazzani', 0.03), ('farenzena', 0.03), ('hirzer', 0.03), ('change', 0.03), ('efi', 0.029), ('characteristics', 0.029), ('utilized', 0.029), ('sebastian', 0.028), ('misaligned', 0.028), ('score', 0.028), ('tolerate', 0.027), ('distinct', 0.027), ('supervised', 0.027), ('illumination', 0.027), ('nearest', 0.027), ('similarity', 0.027), ('variations', 0.027), ('returned', 0.027), ('patchmatch', 0.027), ('handling', 0.026), ('cuhk', 0.026), ('barnes', 0.026), ('clothes', 0.025), ('zheng', 0.025), ('search', 0.025), ('variation', 0.025), ('sift', 0.025), ('mechanism', 0.025), ('neighbors', 0.024), ('relaxed', 0.024), ('descriptor', 0.024), ('download', 0.024), ('reliable', 0.023), ('accumulation', 0.023), ('metric', 0.022), ('identity', 0.022), ('identities', 0.022), 
('partial', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification

Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang

Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.

2 0.4881548 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes

Author: Xiaobo Ren, Tony X. Han, Zhihai He

Abstract: We consider video object cut as an ensemble of frame-level background-foreground object classifiers which fuse information across frames and refine their segmentation results in a collaborative and iterative manner. Our approach addresses the challenging issues of modeling of background with dynamic textures and segmentation of foreground objects from cluttered scenes. We construct patch-level bag-of-words background models to effectively capture the background motion and texture dynamics. We propose a foreground salience graph (FSG) to characterize the similarity of an image patch to the bag-of-words background models in the temporal domain and to neighboring image patches in the spatial domain. We incorporate this similarity information into a graph-cut energy minimization framework for foreground object segmentation. The background-foreground classification results at neighboring frames are fused together to construct a foreground probability map to update the graph weights. The resulting object shapes at neighboring frames are also used as constraints to guide the energy minimization process during graph cut. Our extensive experimental results and performance comparisons over a diverse set of challenging videos with dynamic scenes, including the new Change Detection Challenge Dataset, demonstrate that the proposed ensemble video object cut method outperforms various state-of-the-art algorithms.

3 0.18323776 271 cvpr-2013-Locally Aligned Feature Transforms across Views

Author: Wei Li, Xiaogang Wang

Abstract: In this paper, we propose a new approach for matching images observed in different camera views with complex cross-view transforms and apply it to person re-identification. It jointly partitions the image spaces of two camera views into different configurations according to the similarity of cross-view transforms. The visual features of an image pair from different views are first locally aligned by being projected to a common feature space and then matched with softly assigned metrics which are locally optimized. The features optimal for recognizing identities are different from those for clustering cross-view transforms. They are jointly learned by utilizing sparsity-inducing norm and information theoretical regularization. This approach can be generalized to the settings where test images are from new camera views, not the same as those in the training set. Extensive experiments are conducted on public datasets and our own dataset. Comparisons with the state-of-the-art metric learning and person re-identification methods show the superior performance of our approach.

4 0.1786368 270 cvpr-2013-Local Fisher Discriminant Analysis for Pedestrian Re-identification

Author: Sateesh Pedagadi, James Orwell, Sergio Velastin, Boghos Boghossian

Abstract: Metric learning methods for person re-identification estimate a scaling for distances in a vector space that is optimized for picking out observations of the same individual. This paper presents a novel approach to the pedestrian re-identification problem that uses metric learning to improve the state-of-the-art performance on standard public datasets. Very high dimensional features are extracted from the source color image. A first processing stage performs unsupervised PCA dimensionality reduction, constrained to maintain the redundancy in color-space representation. A second stage further reduces the dimensionality, using a Local Fisher Discriminant Analysis defined by a training set. A regularization step is introduced to avoid singular matrices during this stage. The experiments conducted on three publicly available datasets confirm that the proposed method outperforms the state-of-the-art performance, including all other known metric learning methods. Furthermore, the method is an effective way to process observations comprising multiple shots, and is non-iterative: the computation times are relatively modest. Finally, a novel statistic is derived to characterize the Match Characteristic: the normalized entropy reduction can be used to define the 'Proportion of Uncertainty Removed' (PUR). This measure is invariant to test set size and provides an intuitive indication of performance.

5 0.10447728 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification

Author: Zhen Li, Shiyu Chang, Feng Liang, Thomas S. Huang, Liangliang Cao, John R. Smith

Abstract: This paper considers the person verification problem in modern surveillance and video retrieval systems. The problem is to identify whether a pair of face or human body images is about the same person, even if the person is not seen before. Traditional methods usually look for a distance (or similarity) measure between images (e.g., by metric learning algorithms), and make decisions based on a fixed threshold. We show that this is nevertheless insufficient and sub-optimal for the verification problem. This paper proposes to learn a decision function for verification that can be viewed as a joint model of a distance metric and a locally adaptive thresholding rule. We further formulate the inference on our decision function as a second-order large-margin regularization problem, and provide an efficient algorithm in its dual form. We evaluate our algorithm on both human body verification and face verification problems. Our method outperforms not only the classical metric learning algorithms including LMNN and ITML, but also the state-of-the-art in the computer vision community.

6 0.089314401 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches

7 0.088746861 199 cvpr-2013-Harry Potter's Marauder's Map: Localizing and Tracking Multiple Persons-of-Interest by Nonnegative Discretization

8 0.083995059 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression

9 0.083286434 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences

10 0.078798793 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

11 0.077007964 393 cvpr-2013-Separating Signal from Noise Using Patch Recurrence across Scales

12 0.076810397 464 cvpr-2013-What Makes a Patch Distinct?

13 0.07100416 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

14 0.070438057 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

15 0.070195913 378 cvpr-2013-Sampling Strategies for Real-Time Action Recognition

16 0.069051616 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D

17 0.068201132 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

18 0.0662902 299 cvpr-2013-Multi-source Multi-scale Counting in Extremely Dense Crowd Images

19 0.066267386 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models

20 0.066241905 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.176), (1, -0.027), (2, 0.027), (3, -0.0), (4, -0.002), (5, 0.009), (6, 0.004), (7, -0.029), (8, 0.023), (9, -0.053), (10, -0.017), (11, -0.001), (12, 0.079), (13, -0.028), (14, 0.054), (15, -0.039), (16, -0.064), (17, -0.015), (18, 0.031), (19, -0.059), (20, 0.041), (21, 0.151), (22, -0.138), (23, -0.141), (24, -0.073), (25, -0.118), (26, -0.067), (27, 0.039), (28, -0.041), (29, -0.107), (30, -0.074), (31, -0.124), (32, 0.071), (33, -0.001), (34, 0.06), (35, 0.035), (36, -0.01), (37, 0.039), (38, -0.191), (39, -0.175), (40, 0.158), (41, 0.048), (42, 0.244), (43, 0.034), (44, 0.124), (45, -0.045), (46, -0.084), (47, -0.072), (48, 0.021), (49, 0.076)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92056257 451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification

Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang

Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.

2 0.67114031 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes

Author: Xiaobo Ren, Tony X. Han, Zhihai He

Abstract: We consider video object cut as an ensemble of frame-level background-foreground object classifiers which fuse information across frames and refine their segmentation results in a collaborative and iterative manner. Our approach addresses the challenging issues of modeling of background with dynamic textures and segmentation of foreground objects from cluttered scenes. We construct patch-level bag-of-words background models to effectively capture the background motion and texture dynamics. We propose a foreground salience graph (FSG) to characterize the similarity of an image patch to the bag-of-words background models in the temporal domain and to neighboring image patches in the spatial domain. We incorporate this similarity information into a graph-cut energy minimization framework for foreground object segmentation. The background-foreground classification results at neighboring frames are fused together to construct a foreground probability map to update the graph weights. The resulting object shapes at neighboring frames are also used as constraints to guide the energy minimization process during graph cut. Our extensive experimental results and performance comparisons over a diverse set of challenging videos with dynamic scenes, including the new Change Detection Challenge Dataset, demonstrate that the proposed ensemble video object cut method outperforms various state-of-the-art algorithms.

3 0.65128696 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification

Author: Zhen Li, Shiyu Chang, Feng Liang, Thomas S. Huang, Liangliang Cao, John R. Smith

Abstract: This paper considers the person verification problem in modern surveillance and video retrieval systems. The problem is to identify whether a pair of face or human body images is about the same person, even if the person is not seen before. Traditional methods usually look for a distance (or similarity) measure between images (e.g., by metric learning algorithms), and make decisions based on a fixed threshold. We show that this is nevertheless insufficient and sub-optimal for the verification problem. This paper proposes to learn a decision function for verification that can be viewed as a joint model of a distance metric and a locally adaptive thresholding rule. We further formulate the inference on our decision function as a second-order large-margin regularization problem, and provide an efficient algorithm in its dual form. We evaluate our algorithm on both human body verification and face verification problems. Our method outperforms not only the classical metric learning algorithms including LMNN and ITML, but also the state-of-the-art in the computer vision community.

4 0.62406987 271 cvpr-2013-Locally Aligned Feature Transforms across Views

Author: Wei Li, Xiaogang Wang

Abstract: In this paper, we propose a new approach for matching images observed in different camera views with complex cross-view transforms and apply it to person re-identification. It jointly partitions the image spaces of two camera views into different configurations according to the similarity of cross-view transforms. The visual features of an image pair from different views are first locally aligned by being projected to a common feature space and then matched with softly assigned metrics which are locally optimized. The features optimal for recognizing identities are different from those for clustering cross-view transforms. They are jointly learned by utilizing sparsity-inducing norm and information theoretical regularization. This approach can be generalized to the settings where test images are from new camera views, not the same as those in the training set. Extensive experiments are conducted on public datasets and our own dataset. Comparisons with the state-of-the-art metric learning and person re-identification methods show the superior performance of our approach.

5 0.60425228 464 cvpr-2013-What Makes a Patch Distinct?

Author: Ran Margolin, Ayellet Tal, Lihi Zelnik-Manor

Abstract: What makes an object salient? Most previous work asserts that distinctness is the dominating factor. The difference between the various algorithms is in the way they compute distinctness. Some focus on the patterns, others on the colors, and several add high-level cues and priors. We propose a simple, yet powerful, algorithm that integrates these three factors. Our key contribution is a novel and fast approach to compute pattern distinctness. We rely on the inner statistics of the patches in the image for identifying unique patterns. We provide an extensive evaluation and show that our approach outperforms all state-of-the-art methods on the five most commonly-used datasets.

6 0.59029382 393 cvpr-2013-Separating Signal from Noise Using Patch Recurrence across Scales

7 0.56122798 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression

8 0.54453641 270 cvpr-2013-Local Fisher Discriminant Analysis for Pedestrian Re-identification

9 0.53525627 266 cvpr-2013-Learning without Human Scores for Blind Image Quality Assessment

10 0.52098763 169 cvpr-2013-Fast Patch-Based Denoising Using Approximated Patch Geodesic Paths

11 0.46220943 195 cvpr-2013-HDR Deghosting: How to Deal with Saturation?

12 0.45544183 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery

13 0.42551705 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models

14 0.41297501 391 cvpr-2013-Sensing and Recognizing Surface Textures Using a GelSight Sensor

15 0.3992165 177 cvpr-2013-FrameBreak: Dramatic Image Extrapolation by Guided Shift-Maps

16 0.38802892 427 cvpr-2013-Texture Enhanced Image Denoising via Gradient Histogram Preservation

17 0.38429216 404 cvpr-2013-Sparse Quantization for Patch Description

18 0.38222691 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches

19 0.38207579 401 cvpr-2013-Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection

20 0.37440956 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.076), (12, 0.02), (16, 0.087), (26, 0.035), (27, 0.264), (28, 0.01), (33, 0.237), (67, 0.092), (69, 0.035), (87, 0.053)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78336626 451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification

Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang

Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.

2 0.7620337 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning

Author: Rui Yao, Qinfeng Shi, Chunhua Shen, Yanning Zhang, Anton van den Hengel

Abstract: Despite many advances made in the area, deformable targets and partial occlusions continue to represent key problems in visual tracking. Structured learning has shown good results when applied to tracking whole targets, but applying this approach to a part-based target model is complicated by the need to model the relationships between parts, and to avoid lengthy initialisation processes. We thus propose a method which models the unknown parts using latent variables. In doing so we extend the online algorithm pegasos to the structured prediction case (i.e., predicting the location of the bounding boxes) with latent part variables. To better estimate the parts, and to avoid over-fitting caused by the extra model complexity/capacity introduced by the parts, we propose a two-stage training process, based on the primal rather than the dual form. We then show that the method outperforms the state-of-the-art (linear and non-linear kernel) trackers.

3 0.76060897 164 cvpr-2013-Fast Convolutional Sparse Coding

Author: Hilton Bristow, Anders Eriksson, Simon Lucey

Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.

4 0.75067472 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition

Author: Anelia Angelova, Shenghuo Zhu

Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance especially for datasets which are considered particularly hard for recognition, e.g. birds species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.

5 0.7336694 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes

Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li

Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).

6 0.72188002 118 cvpr-2013-Detecting Pulse from Head Motions in Video

7 0.71790206 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition

8 0.71697998 271 cvpr-2013-Locally Aligned Feature Transforms across Views

9 0.71037304 323 cvpr-2013-POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation

10 0.70997602 326 cvpr-2013-Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation

11 0.7093153 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

12 0.7081095 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition

13 0.70803297 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

14 0.70768106 27 cvpr-2013-A Theory of Refractive Photo-Light-Path Triangulation

15 0.70609146 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

16 0.70541286 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

17 0.70450258 438 cvpr-2013-Towards Pose Robust Face Recognition

18 0.70360237 224 cvpr-2013-Information Consensus for Distributed Multi-target Tracking

19 0.70309365 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

20 0.70304072 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation