iccv iccv2013 iccv2013-359 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Junliang Xing, Jin Gao, Bing Li, Weiming Hu, Shuicheng Yan
Abstract: Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via ℓ1-norm minimization, these so-called ℓ1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. The proposed approach has been extensively evaluated on ten challenging video sequences. Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach.
Reference: text
sentIndex sentText sentNum sentScore
1 In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. [sent-12, score-0.457]
2 We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. [sent-14, score-1.034]
3 To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. [sent-15, score-0.9]
4 By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. [sent-16, score-0.932]
5 We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. [sent-17, score-0.686]
6 Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach. [sent-19, score-0.506]
7 Although many algorithms have been proposed in the last decades, object tracking remains a very challenging problem for real-world applications due to difficulties such as background clutter, image noise, illumination changes, object occlusions and fast motions, as shown in Figure 1. [sent-25, score-0.474]
8 (1) Image noise and background clutter (top-left image), (2) illumination changes (top-right image), (3) object occlusions (bottom-left image), and (4) fast object motions (bottom-right image). [sent-31, score-0.336]
9 The tracking results using fixed templates, fully updated templates, update method in [19], [26], [22], [13] and the proposed method in this work are respectively plotted in teal, olive, purple, cyan, blue, green and red colors. [sent-32, score-0.487]
10 successfully applied to the visual tracking problem [19, 20, 5, 28, 13]. [sent-34, score-0.34]
11 The basic assumption in these methods is that the target can be represented as a linear combination of only a few elements in a template set. [sent-35, score-0.439]
12 Denoting a template set of the target with m elements as T = [t1, t2 , . [sent-37, score-0.439]
13 3 for more details), the sparse representation of the candidate using the templates is obtained by solving mcin ? [sent-41, score-0.433]
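The objective in the sentence above is truncated in this extract, so the following is only a minimal sketch of the standard ℓ1 sparse coding problem, min_c 0.5·||y − Tc||² + λ||c||₁, solved with the iterative shrinkage-thresholding algorithm (ISTA). The function names, the 0.5 scaling, the value of λ and the choice of ISTA are assumptions for illustration, not the solver used in the paper.

```python
import math

def soft_threshold(x, tau):
    """Shrink x toward zero by tau (proximal operator of tau * |x|)."""
    return math.copysign(max(abs(x) - tau, 0.0), x)

def sparse_code(y, templates, lam=0.1, iters=200):
    """ISTA sketch for min_c 0.5 * ||y - T c||_2^2 + lam * ||c||_1.

    `templates` is a list of m template vectors (the columns of T),
    each of length n; `y` is the candidate vector of length n.
    """
    m, n = len(templates), len(y)
    # Step size 1/L, with L bounded above by the squared Frobenius norm of T.
    step = 1.0 / max(sum(v * v for t in templates for v in t), 1e-12)
    c = [0.0] * m
    for _ in range(iters):
        # Residual T c - y for the current coefficients.
        residual = [sum(c[j] * templates[j][i] for j in range(m)) - y[i]
                    for i in range(n)]
        # Gradient step on the quadratic term, then soft-thresholding.
        c = [soft_threshold(
                 c[j] - step * sum(templates[j][i] * residual[i] for i in range(n)),
                 step * lam)
             for j in range(m)]
    return c

# Hypothetical toy data: three orthonormal templates, candidate equal to the first.
T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs = sparse_code([1.0, 0.0, 0.0], T, lam=0.1)
```

With orthonormal templates the minimizer is the soft-thresholded candidate, so only the first coefficient is non-zero, which is the "few templates" behaviour the text describes.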
14 Built on this idea, previous works on sparse coding based object tracking have mainly focused on two problems. [sent-49, score-0.481]
15 Supposing that we have already designed the object templates and used them to track the target via a certain optimization procedure, how can we update the object templates effectively and efficiently? [sent-53, score-0.912]
16 ℓ1-trackers usually address the template update problem by adopting some classic methods (e. [sent-57, score-0.46]
17 fixed templates with no updates [28] or incremental update with replacing [19, 27]) or some intuitive strategies [13, 26]. [sent-59, score-0.549]
18 In this paper, we propose to perform the template update problem in the tracking scenario as an online incremental dictionary learning problem. [sent-60, score-1.378]
19 We also explore the temporal nature of the object templates and propose to learn a multi-lifespan dictionary to improve the adaptability and robustness of a tracking algorithm. [sent-61, score-1.262]
20 We apply the online multi-lifespan dictionary learning model into the Bayesian sequential estimation framework and design effective observation models both generatively and discriminatively for a particle filter implementation. [sent-62, score-0.773]
21 Extensive experiments on public benchmark video sequences demonstrate the effectiveness of our online learned templates, and the state-of-the-art tracking performance of the proposed approach. [sent-63, score-0.557]
22 Related Work Object tracking methods can be roughly grouped into two categories: generative and discriminative. [sent-65, score-0.374]
23 Generative methods, which use a descriptive appearance model to represent the object, perform tracking as a searching problem over candidate regions to find the most similar one. [sent-66, score-0.371]
24 Examples of generative tracking methods are eigentracker [6], mean shift tracker [7], and fragment tracker [1]. [sent-67, score-0.624]
25 Discriminative methods formulate object tracking as a binary classification problem and find the location that can best separate the target from the background. [sent-68, score-0.482]
26 Examples of discriminative tracking methods are online boosting tracker [11], ensemble tracker [3], and multi-instance learning tracker [4]. [sent-69, score-0.858]
27 With recent advances in sparse representation, sparse coding based object trackers have proven to be a promising tracking framework [19, 20, 5, 28, 13, 27, 26]. [sent-70, score-0.533]
28 To make the templates directly robust to occlusion, local image patches can be used as the object dictionary [13, 28]. [sent-77, score-0.676]
29 Besides template design and the optimization method, template update is an even more important problem in the sparse coding based tracking framework. [sent-80, score-1.238]
30 Although fixed object template may work well for short video sequences when the target stays nearly unchanged, it may be incompetent for long sequences where the target often undergoes different kinds of changes (see Figure 1). [sent-81, score-0.764]
31 To adapt to appearance changes of the target during tracking, the templates in [19, 20, 5, 26] are updated based on the weights assigned to each template, and the similarities between templates and the current estimation of the target candidate. [sent-82, score-0.898]
32 In [28], the global object templates are kept fixed during tracking to ensure the discriminative power of the model, while the local patch templates are constantly updated to adapt to object changes. [sent-83, score-1.103]
33 In order to incrementally update the templates, a more reasonable strategy can be found in [13], where old templates are given slow update probabilities and the incremental subspace learning algorithm in [22] is employed which restricts the template vectors to be orthogonal. [sent-84, score-1.022]
34 Although all these methods provide different strategies contributing to template update, most of them use some predefined templates and update them intuitively, which may not fully unleash the potential capability of templates. [sent-87, score-0.775]
35 The proposed online learning based template building and updating algorithm, which is both robust and adaptive for tracking, addresses well the problems of object templates in the ? [sent-88, score-0.902]
36 Proposed Approach Given the object tracking results, the main objective of this paper is to automatically learn “good” object templates online, which can, in turn, benefit the ongoing object tracking process with improved robustness and adaptability. [sent-91, score-0.974]
37 Our core idea to achieve this objective is trying not to impose heavy constraints on the template building and [sent-92, score-0.367]
38 We perform template update as an online dictionary learning problem and propose to learn a multi-lifespan dictionary, i. [sent-96, score-0.961]
39 To this end, we formulate this template building and updating problem as online dictionary learning, which automatically updates object templates that can better adapt to the data for tracking. [sent-105, score-1.293]
40 In order to further improve the robustness and adaptability of the learned templates, we explore the temporal property of the learned dictionary and propose to build a dictionary with multiple lifespans to possess distinct temporal properties. [sent-106, score-1.106]
41 Based on the learned multi-lifespan dictionary, we deduce effective observation models both generatively and discriminatively, and deploy them into the Bayesian sequential estimation framework to perform tracking using a particle filter implementation. [sent-107, score-0.771]
42 Our multi-lifespan model, however, mainly aims to ease the contradiction between the adaptivity and robustness of template based object tracking algorithms, and the multi-lifespan dictionaries are fused in parallel in a multi-state particle filter. [sent-111, score-0.958]
43 Tracking as Online Dictionary Learning In sparse coding based tracking algorithms, a target candidate is represented as a linear combination of a few elements from a template set. [sent-115, score-0.909]
44 The building and updating of this template set, therefore, have great impact on the final tracking results. [sent-116, score-0.755]
45 Previous works usually build this template set by directly sampling from the initialization of tracking, and then use some intuitive strategies to update the set during tracking [19, 28, 13]. [sent-117, score-0.826]
46 From a different viewpoint, here we want to automatically learn this template set to make it best adapt to the video data to be tracked. [sent-118, score-0.388]
47 Suppose all the possible target candidates are within a template set Y = {y1, . [sent-120, score-0.484]
48 , yN}, where $y_i \in \mathbb{R}^n$ denotes one n-dimensional sample and N is the template set size, which can be very large, since samples are continually obtained when a new frame has been tracked. [sent-123, score-0.426]
49 A good template set should then have the minimal cost to represent all the elements in this set. [sent-124, score-0.339]
50 ∈Y l(y,D), (3) where $D \in \mathbb{R}^{n \times m}$ is the learned template set, distinguished from the predefined template set T as in previous works, and $l(\cdot)$ is the loss function such that $l(y, D)$ is small if D is “good” at representing the candidate y. [sent-127, score-0.727]
51 In a sparse coding based object tracking framework, the loss function can be naturally modeled as: l(y,D) ? [sent-128, score-0.481]
52 (5) Now, putting everything together, the template learning can be formulated as the following optimization problem, $D^{*} = \arg\min_{D} \frac{1}{|Y|} \sum_{y} \ldots$ [sent-149, score-0.378]
53 In our scenario for template updating in online object tracking, since the target candidates are obtained consecutively, the above two-step iterated procedure must be redesigned to be performed in an online manner to learn the dictionary incrementally. [sent-162, score-1.19]
54 In Algorithm 1, we summarize this redesigned procedure of online dictionary learning for template update. [sent-163, score-0.877]
55 The learning procedure receives the dictionary learned in the previous frame as input, and updates the dictionary incrementally according to the samples collected in the current frame. [sent-164, score-0.971]
56 The introduced variables Ct and Yt are intermediate results associated with the dictionary Dt and are stored for incremental learning. [sent-167, score-0.447]
57 To improve the robustness of the learned dictionary, multiple samples are collected around the tracking result xt in frame It, and M is the parameter to control the explicit number of the collected samples. [sent-169, score-0.702]
58 Note that here we do not impose any constraints on the explicit dictionary format, which can be object templates, image patches or even extracted features. [sent-170, score-0.387]
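Algorithm 1 itself is not recoverable from this extract, so the sketch below only illustrates the general shape of an online dictionary update in the style of the well-known online dictionary learning scheme: sparse-code each new sample, accumulate two sufficient-statistic matrices, then refresh each atom by block coordinate descent. The accumulators `A` and `B` are stand-ins; whether they correspond to the paper's Ct and Yt is an assumption, as are all function names and parameter values.

```python
import math

def soft_threshold(x, tau):
    return math.copysign(max(abs(x) - tau, 0.0), x)

def sparse_code(y, D, lam, iters=100):
    """ISTA for min_c 0.5 * ||y - D c||^2 + lam * ||c||_1 (D is a list of atoms)."""
    m, n = len(D), len(y)
    step = 1.0 / max(sum(v * v for atom in D for v in atom), 1e-12)
    c = [0.0] * m
    for _ in range(iters):
        r = [sum(c[j] * D[j][i] for j in range(m)) - y[i] for i in range(n)]
        c = [soft_threshold(c[j] - step * sum(D[j][i] * r[i] for i in range(n)),
                            step * lam)
             for j in range(m)]
    return c

def online_dictionary_update(D, A, B, samples, lam=0.1):
    """Update dictionary D in place from the samples of one frame.

    A (m x m) accumulates c c^T and B (m rows of length n) accumulates c_j * y.
    Both are hypothetical stand-ins for the stored intermediate variables.
    """
    m, n = len(D), len(D[0])
    for y in samples:
        c = sparse_code(y, D, lam)
        for j in range(m):
            for k in range(m):
                A[j][k] += c[j] * c[k]
            for i in range(n):
                B[j][i] += c[j] * y[i]
    for j in range(m):
        if A[j][j] < 1e-12:
            continue  # atom never used so far; leave it unchanged
        DA_j = [sum(A[k][j] * D[k][i] for k in range(m)) for i in range(n)]
        u = [(B[j][i] - DA_j[i]) / A[j][j] + D[j][i] for i in range(n)]
        norm = max(1.0, math.sqrt(sum(v * v for v in u)))
        D[j] = [v / norm for v in u]  # keep each atom inside the unit ball
    return D, A, B

# Hypothetical toy run: two atoms in R^2, one sample per frame, five frames.
D = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.0, 0.0], [0.0, 0.0]]
B = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(5):
    online_dictionary_update(D, A, B, samples=[[0.6, 0.8]])
```

Only `A`, `B` and `D` need to be carried from frame to frame, which matches the description of a procedure that takes the previous dictionary as input and updates it incrementally from the samples of the current frame.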
59 These two characteristics, however, often contradict each other in many tracking algorithms. [sent-175, score-0.34]
60 5: end for 6: Save dictionary Dt, intermediate variable Ct and Yt. [sent-203, score-0.37]
61 On the contrary, if the template is updated at a slower speed, the tracker is less prone to drift but may not keep up with the changes of the target. [sent-205, score-0.522]
62 s sampled to train the dictionary and completely determine the learned dictionary together with the regularization parameter λ. [sent-214, score-0.739]
63 Multi-lifespan dictionaries provide a very good solution to the contradiction when simultaneously pursuing the adaptability and robustness of the tracker. [sent-219, score-0.334]
64 Line 1: tracking results; Line 2-4: examples of the learned dictionaries at frame 100; Line 5-7: collected negative samples used for the discriminative observation model (see Section 3. [sent-223, score-0.674]
65 Denoting SLD, MLD and LLD respectively as DS, DM and DL, the final online multi-lifespan dictionary learning model (OMDL) is represented as: D∗ = ? [sent-235, score-0.501]
66 of the three lifespan dictionaries learned using the online dictionary learning algorithm. [sent-240, score-0.796]
67 Bayesian Sequential Estimation We deploy the OMDL model into the Bayesian sequential estimation framework, which performs tracking by solving the maximum a posteriori (MAP) problem, $\hat{x}_t = \arg\max p(x_t \mid y_{1:t})$, (9) {y1,. [sent-245, score-0.452]
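To make the MAP estimate over particles concrete, here is a minimal sketch of one particle filter step: propagate, weight by an observation likelihood, take the best particle as the estimate, then resample. The Gaussian random-walk motion model, the multinomial resampling and the `likelihood` callable are assumptions; in the paper the likelihood would come from the observation models built on the learned dictionaries, which this extract does not spell out.

```python
import math
import random

def particle_filter_step(particles, likelihood, motion_std, rng):
    """One predict / weight / estimate / resample cycle.

    particles  : list of state tuples, e.g. (x, y)
    likelihood : hypothetical callable mapping a state to a non-negative score
    motion_std : standard deviation of the assumed random-walk motion model
    rng        : random.Random instance, passed in for reproducibility
    """
    # 1) Predict: diffuse every particle with Gaussian noise.
    moved = [tuple(v + rng.gauss(0.0, motion_std) for v in p) for p in particles]
    # 2) Weight: score each propagated particle against the observation.
    raw = [likelihood(p) for p in moved]
    total = sum(raw)
    if total > 0:
        weights = [w / total for w in raw]
    else:
        weights = [1.0 / len(moved)] * len(moved)  # degenerate case: uniform
    # 3) Estimate: the MAP state is approximated by the best-scoring particle.
    best = max(range(len(moved)), key=raw.__getitem__)
    # 4) Resample: draw the next particle set in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return moved[best], resampled, weights

# Hypothetical usage: a target near (1, 1) and an isotropic Gaussian likelihood.
def toy_likelihood(p):
    return math.exp(-((p[0] - 1.0) ** 2 + (p[1] - 1.0) ** 2))

rng = random.Random(0)
estimate, particles, weights = particle_filter_step(
    [(0.0, 0.0)] * 50, toy_likelihood, motion_std=0.5, rng=rng)
```

Picking the single highest-scoring particle is one common way to approximate the arg max in Eq. (9); a weighted mean of the particles is an equally common alternative.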
68 We use the global object template normalized to 32 × 32 pixels as the training data to learn the multi-lifespan dictionary model. [sent-313, score-0.726]
69 The dictionary numbers for the SLD, MLD and LLD are all set to 20 and incrementally learned with 128, 8 and 1 sample(s) respectively at every frame. [sent-314, score-0.43]
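The per-frame sample counts quoted above can be written down as a small configuration. The sketch below collects that many perturbed copies of the tracked state for each dictionary; the numbers 20, 128, 8 and 1 come from the text, while the Gaussian jitter, its scale and the `(x, y, w, h)` state layout are assumptions, since the extract only says that samples are collected around the tracking result.

```python
import random

# From the text: every dictionary has 20 atoms; samples per frame differ.
ATOMS_PER_DICTIONARY = 20
SAMPLES_PER_FRAME = {"SLD": 128, "MLD": 8, "LLD": 1}

def sample_around(state, count, std, rng):
    """Draw `count` jittered copies of a tracked state (x, y, w, h).

    Gaussian jitter is an assumed perturbation model, used for illustration.
    """
    return [tuple(v + rng.gauss(0.0, std) for v in state) for _ in range(count)]

def collect_training_samples(state, rng, std=1.0):
    """Return one batch of training states per lifespan dictionary."""
    return {name: sample_around(state, count, std, rng)
            for name, count in SAMPLES_PER_FRAME.items()}

# Hypothetical tracking result for the current frame.
batches = collect_training_samples((50.0, 40.0, 32.0, 32.0), random.Random(0))
total_atoms = ATOMS_PER_DICTIONARY * len(SAMPLES_PER_FRAME)
```

Three dictionaries of 20 atoms give 60 templates in total, which agrees with the template count used in the update-method comparison.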
70 We first conduct experiments to compare the tracking results using six different template update methods. [sent-320, score-0.836]
71 Then we evaluate the tracking performance of our approach compared with six state-of-the-art tracking algorithms. [sent-321, score-0.716]
72 Tracking error and precision of seven different methods for template update. [sent-326, score-0.46]
73 In order to concentrate on the template update method and make a fair comparison, the templates in these seven methods are all built from global target appearances and the number of templates is set to 60. [sent-333, score-1.234]
74 The experiments are performed on four challenging image sequences, car11, shaking, faceocc2 and animal (see Figure 1) with the same initial rectangles in the first frame, which cover most challenging situations for template updating. [sent-336, score-0.441]
75 We employ two well-accepted metrics, center location distance and overlap ratio, to respectively evaluate the tracking error and precision of the seven template update methods. [sent-337, score-0.921]
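The two metrics named above can be sketched as follows. Boxes are taken as `(x, y, w, h)` with a top-left origin, and the overlap ratio is computed as intersection over union; both conventions are assumptions, since the extract does not define them.

```python
import math

def center_distance(box_a, box_b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    bx, by = box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0
    return math.hypot(ax - bx, ay - by)

def overlap_ratio(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes, in [0, 1]."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    bottom = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

# Hypothetical tracked box versus ground-truth box.
error = center_distance((0, 0, 2, 2), (3, 4, 2, 2))
precision = overlap_ratio((0, 0, 2, 2), (1, 0, 2, 2))
```

A lower center distance means a smaller tracking error, while a higher overlap ratio means a tighter fit to the ground-truth box.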
76 It is really surprising that the incremental update method in [22], which uses the eigenvector of the target samples as template and updates it incrementally in the eigenspace, performs poorly on the four test sequences and even is no better than the fixed template method and fully update method. [sent-341, score-1.252]
77 The reason behind this may be that forcing the template to be orthotropic cannot well adapt to the challenging tracking situations with non-white image noises, especially when using these templates to perform sparse representation [23]. [sent-342, score-1.092]
78 [13] proposes a modified template update method to better deploy the subspace learning into sparse representation. [sent-344, score-0.615]
79 From Figure 4 it is observed that fixed templates and fully update method may perform well when the target does not change much. [sent-345, score-0.51]
80 But with the accumulation of tracking errors and when the target undergoes great changes, these two methods tend to perform worse than other four methods using incremental template update. [sent-346, score-0.892]
81 Tracking Performance Evaluation We further evaluate the performance of our final tracking approach on 10 video sequences popularly used in previous works [14, 4, 5, 28, 13, 10], including sylv, bike, david, woman, coke11, jumping, and the four sequences used in the first experiment. [sent-352, score-0.442]
82 These ten video sequences together present an even wider range of challenges to a tracking algorithm (see Figure 5). [sent-353, score-0.391]
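83 [Table residue: tracker columns ending in MTT and Ours; sequences sylv, bike, car11, david, woman, animal, coke11, shaking, jumping, faceocc2; numeric entries not recoverable.] [sent-358, score-0.488]
84 [Table residue: tracker columns ending in MTT and Ours; sequences sylv, bike, car11, david, woman, animal, coke11, shaking, jumping, faceocc2; numeric entries not recoverable.] [sent-439, score-0.488]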
85 with six state-of-the-art algorithms, the fragment tracker (Frag) [1], the incremental visual tracking (IVT) algorithm [22], the multi-instance learning (MIL) tracker [4], the visual tracking decomposition (VTD) method [14], the latest ? [sent-516, score-1.082]
86 Tables 2 and 3 list the average tracking errors and precisions for all seven algorithms. [sent-524, score-0.412]
87 The proposed tracking approach, on the whole, performs well against other six algorithms, especially on the sequence sylv, woman, animal, coke11, and jumping, on which some other algorithms may fail to follow the targets but ours can successfully track them until the end of the sequence. [sent-525, score-0.405]
88 The templates learned using our dictionary learning method, on the contrary, can adapt well to the tracking data, especially in the sparse coding based tracking framework. [sent-528, score-1.261]
89 In Figure 5, some example tracking results are drawn to give a more vivid comparison. [sent-529, score-0.34]
90 Speed Analysis and Discussions Our tracking algorithm runs at about 2. [sent-533, score-0.34]
91 The main reason is that our approach does not need to add the trivial templates as those adopted in [19, 20, 5, 27] due to the design of observation model. [sent-540, score-0.345]
92 What is more, currently our learning for template update is [sent-544, score-0.499]
93 Conclusions and Future Work We study the template update problem in the sparse coding based object tracking framework. [sent-559, score-0.941]
94 We formulate the template update problem as online dictionary learning, which makes the template better adapt to the tracking data. [sent-560, score-1.65]
95 We propose to learn a multi-lifespan dictionary to simultaneously ensure adaptability and robustness of the tracker. [sent-561, score-0.56]
96 The online learned multi-lifespan dictionary has been deployed into the Bayesian sequential estimation framework using particle filter to perform tracking. [sent-562, score-0.654]
97 Currently, the lifespan dictionary is only learned from global object templates, and in our future work, we plan to add local image patch based dictionary to further improve the tracking performance. [sent-564, score-1.285]
98 Eigentracking: Robust matching and tracking of articulated objects using a view-based representation. [sent-608, score-0.34]
99 Visual tracking via adaptive structural local sparse appearance model. [sent-655, score-0.392]
100 Sparse representation for face recognition based on discriminative low-rank dictionary learning. [sent-675, score-0.371]
wordName wordTfidf (topN-words)
[('dictionary', 0.345), ('tracking', 0.34), ('template', 0.339), ('templates', 0.289), ('lld', 0.206), ('sld', 0.206), ('yt', 0.187), ('lifespan', 0.164), ('adaptability', 0.164), ('mld', 0.144), ('omdl', 0.144), ('update', 0.121), ('online', 0.117), ('tracker', 0.112), ('xt', 0.107), ('shaking', 0.102), ('target', 0.1), ('dt', 0.099), ('ivt', 0.098), ('mtt', 0.091), ('dictionaries', 0.082), ('vtd', 0.082), ('sylv', 0.08), ('animal', 0.079), ('incremental', 0.077), ('frag', 0.076), ('cluttering', 0.073), ('generatively', 0.073), ('seven', 0.072), ('woman', 0.07), ('particle', 0.067), ('jumping', 0.064), ('deploy', 0.064), ('mcin', 0.061), ('mil', 0.059), ('ct', 0.058), ('observation', 0.056), ('frame', 0.055), ('bike', 0.053), ('sparse', 0.052), ('robustness', 0.051), ('sequences', 0.051), ('adapt', 0.049), ('learned', 0.049), ('updating', 0.048), ('sequential', 0.048), ('coding', 0.047), ('deduce', 0.046), ('candidates', 0.045), ('changes', 0.045), ('noises', 0.043), ('object', 0.042), ('dfnu', 0.041), ('dtn', 0.041), ('dtp', 0.041), ('lifespans', 0.041), ('motions', 0.041), ('david', 0.04), ('bayesian', 0.039), ('learning', 0.039), ('singapore', 0.037), ('rotations', 0.037), ('redesigned', 0.037), ('contradiction', 0.037), ('updates', 0.036), ('six', 0.036), ('incrementally', 0.036), ('undergoes', 0.036), ('ts', 0.036), ('pdf', 0.035), ('speed', 0.035), ('generative', 0.034), ('collected', 0.034), ('samples', 0.032), ('temporal', 0.031), ('candidate', 0.031), ('fixes', 0.03), ('track', 0.029), ('tsp', 0.029), ('tj', 0.028), ('rn', 0.028), ('ghanem', 0.028), ('filter', 0.028), ('building', 0.028), ('tc', 0.028), ('blurs', 0.027), ('occlusions', 0.027), ('updated', 0.026), ('fragment', 0.026), ('tpami', 0.026), ('discriminative', 0.026), ('dc', 0.026), ('strategies', 0.026), ('precision', 0.025), ('dm', 0.025), ('intermediate', 0.025), ('built', 0.024), ('error', 0.024), ('situations', 0.023), ('illumination', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999917 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
Author: Junliang Xing, Jin Gao, Bing Li, Weiming Hu, Shuicheng Yan
Abstract: Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via ℓ1-norm minimization, these so-called ℓ1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. The proposed approach has been extensively evaluated on ten challenging video sequences. Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach.
2 0.4074125 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
Author: Naiyan Wang, Jingdong Wang, Dit-Yan Yeung
Abstract: This paper studies the visual tracking problem in video sequences and presents a novel robust sparse tracker under the particle filter framework. In particular, we propose an online robust non-negative dictionary learning algorithm for updating the object templates so that each learned template can capture a distinctive aspect of the tracked object. Another appealing property of this approach is that it can automatically detect and reject the occlusion and cluttered background in a principled way. In addition, we propose a new particle representation formulation using the Huber loss function. The advantage is that it can yield robust estimation without using trivial templates adopted by previous sparse trackers, leading to faster computation. We also reveal the equivalence between this new formulation and the previous one which uses trivial templates. The proposed tracker is empirically compared with state-of-the-art trackers on some challenging video sequences. Both quantitative and qualitative comparisons show that our proposed tracker is superior and more stable.
3 0.32164782 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
Author: Zhibin Hong, Xue Mei, Danil Prokhorov, Dacheng Tao
Abstract: Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix to two collaborative components which enable a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.
4 0.30857113 121 iccv-2013-Discriminatively Trained Templates for 3D Object Detection: A Real Time Scalable Approach
Author: Reyes Rios-Cabrera, Tinne Tuytelaars
Abstract: In this paper we propose a new method for detecting multiple specific 3D objects in real time. We start from the template-based approach based on the LINE2D/LINEMOD representation introduced recently by Hinterstoisser et al., yet extend it in two ways. First, we propose to learn the templates in a discriminative fashion. We show that this can be done online during the collection of the example images, in just a few milliseconds, and has a big impact on the accuracy of the detector. Second, we propose a scheme based on cascades that speeds up detection. Since detection of an object is fast, new objects can be added with very low cost, making our approach scale well. In our experiments, we easily handle 10-30 3D objects at frame rates above 10fps using a single CPU core. We outperform the state-of-the-art both in terms of speed as well as in terms of accuracy, as validated on 3 different datasets. This holds both when using monocular color images (with LINE2D) and when using RGBD images (with LINEMOD). Moreover, we propose a challenging new dataset made of 12 objects, for future competing methods on monocular color images.
5 0.3069433 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
6 0.29345408 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
7 0.2391564 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
8 0.22487682 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
9 0.21262643 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
11 0.20671348 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
12 0.19837259 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
13 0.19683555 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
14 0.1898606 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
15 0.18916766 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
16 0.18451858 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
17 0.16424662 338 iccv-2013-Randomized Ensemble Tracking
18 0.15977986 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos
19 0.15577406 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms
20 0.15118511 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
topicId topicWeight
[(0, 0.257), (1, 0.05), (2, -0.046), (3, 0.038), (4, -0.249), (5, -0.246), (6, -0.297), (7, 0.137), (8, -0.179), (9, 0.196), (10, -0.058), (11, -0.111), (12, 0.063), (13, 0.138), (14, -0.024), (15, -0.012), (16, 0.112), (17, 0.101), (18, -0.034), (19, -0.138), (20, 0.037), (21, 0.01), (22, -0.045), (23, 0.014), (24, -0.019), (25, -0.02), (26, -0.05), (27, 0.063), (28, -0.028), (29, 0.022), (30, -0.013), (31, 0.035), (32, 0.059), (33, -0.038), (34, 0.01), (35, -0.093), (36, -0.119), (37, -0.001), (38, -0.057), (39, -0.1), (40, -0.099), (41, 0.035), (42, 0.034), (43, -0.02), (44, 0.022), (45, -0.028), (46, -0.028), (47, 0.002), (48, 0.032), (49, -0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.97593135 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
Author: Junliang Xing, Jin Gao, Bing Li, Weiming Hu, Shuicheng Yan
Abstract: Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via ℓ1-norm minimization, these so-called ℓ1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. The proposed approach has been extensively evaluated on ten challenging video sequences. Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach.
2 0.90159231 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
Author: Naiyan Wang, Jingdong Wang, Dit-Yan Yeung
Abstract: This paper studies the visual tracking problem in video sequences and presents a novel robust sparse tracker under the particle filter framework. In particular, we propose an online robust non-negative dictionary learning algorithm for updating the object templates so that each learned template can capture a distinctive aspect of the tracked object. Another appealing property of this approach is that it can automatically detect and reject the occlusion and cluttered background in a principled way. In addition, we propose a new particle representation formulation using the Huber loss function. The advantage is that it can yield robust estimation without using trivial templates adopted by previous sparse trackers, leading to faster computation. We also reveal the equivalence between this new formulation and the previous one which uses trivial templates. The proposed tracker is empirically compared with state-of-the-art trackers on some challenging video sequences. Both quantitative and qualitative comparisons show that our proposed tracker is superior and more stable.
3 0.81888515 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
Author: Zhibin Hong, Xue Mei, Danil Prokhorov, Dacheng Tao
Abstract: Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix to two collaborative components which enable a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.
Author: Yu Pang, Haibin Ling
Abstract: Evaluating visual tracking algorithms, or “trackers” for short, is of great importance in computer vision. However, it is hard to “fairly” compare trackers because many parameters need to be tuned in the experimental configurations. On the other hand, when introducing a new tracker, a recent trend is to validate it by comparing it with several existing ones. Such an evaluation may have subjective biases towards the new tracker which typically performs the best. This is mainly due to the difficulty to optimally tune all its competitors and sometimes the selected testing sequences. By contrast, little subjective bias exists towards the “second best” ones in the contest. This observation inspires us with a novel perspective towards inhibiting subjective bias in evaluating trackers by analyzing the results between the second bests. In particular, we first collect all tracking papers published in major computer vision venues in recent years. From these papers, after filtering out potential biases in various aspects, we create a dataset containing many records of comparison results between various visual trackers. Using these records, we derive performance rankings of the involved trackers by four different methods. The first two methods model the dataset as a graph and then derive the rankings over the graph, one by a rank aggregation algorithm and the other by a PageRank-like solution. The other two methods take the records as generated from sports contests and adopt widely used Elo’s and Glicko’s rating systems to derive the rankings. The experimental results are presented and may serve as a reference for related research.
5 0.62261951 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
Author: Dapeng Chen, Zejian Yuan, Yang Wu, Geng Zhang, Nanning Zheng
Abstract: Representation is a fundamental problem in object tracking. Conventional methods track the target by describing its local or global appearance. In this paper we present that, besides the two paradigms, the composition of local region histograms can also provide diverse and important object cues. We use cells to extract local appearance, and construct complex cells to integrate the information from cells. With different spatial arrangements of cells, complex cells can explore various contextual information at multiple scales, which is important to improve the tracking performance. We also develop a novel template-matching algorithm for object tracking, where the template is composed of temporal varying cells and has two layers to capture the target and background appearance respectively. An adaptive weight is associated with each complex cell to cope with occlusion as well as appearance variation. A fusion weight is associated with each complex cell type to preserve the global distinctiveness. Our algorithm is evaluated on 25 challenging sequences, and the results not only confirm the contribution of each component in our tracking system, but also outperform other competing trackers.
6 0.5975576 395 iccv-2013-Slice Sampling Particle Belief Propagation
7 0.59582734 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
8 0.58786333 121 iccv-2013-Discriminatively Trained Templates for 3D Object Detection: A Real Time Scalable Approach
9 0.58293647 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
10 0.58262116 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
11 0.57972378 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
12 0.57601899 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
13 0.57129431 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation
14 0.52311498 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
15 0.50635177 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
16 0.50190759 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
17 0.49258974 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
18 0.49171469 320 iccv-2013-Pose-Configurable Generic Tracking of Elongated Objects
19 0.4817884 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
20 0.47230941 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
topicId topicWeight
[(2, 0.058), (7, 0.028), (11, 0.152), (26, 0.089), (31, 0.044), (35, 0.013), (40, 0.016), (42, 0.098), (44, 0.028), (48, 0.016), (64, 0.127), (73, 0.049), (89, 0.122), (97, 0.059)]
simIndex simValue paperId paperTitle
same-paper 1 0.82850122 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
Author: Junliang Xing, Jin Gao, Bing Li, Weiming Hu, Shuicheng Yan
Abstract: Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via ℓ1-norm minimization, these so-called ℓ1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. The proposed approach has been extensively evaluated on ten challenging video sequences. Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach.
2 0.79060096 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
Author: Zhibin Hong, Xue Mei, Danil Prokhorov, Dacheng Tao
Abstract: Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix to two collaborative components which enable a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.
3 0.79038024 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation
Author: Seunghoon Hong, Suha Kwak, Bohyung Han
Abstract: We propose a novel offline tracking algorithm based on model-averaged posterior estimation through patch matching across frames. Contrary to existing online and offline tracking methods, our algorithm is not based on temporally-ordered estimates of target state but attempts to select easy-to-track frames first out of the remaining ones without exploiting temporal coherency of target. The posterior of the selected frame is estimated by propagating densities from the already tracked frames in a recursive manner. The density propagation across frames is implemented by an efficient patch matching technique, which is useful for our algorithm since it does not require motion smoothness assumption. Also, we present a hierarchical approach, where a small set of key frames are tracked first and non-key frames are handled by local key frames. Our tracking algorithm is conceptually well-suited for the sequences with abrupt motion, shot changes, and occlusion. We compare our tracking algorithm with existing techniques in real videos with such challenges and illustrate its superior performance qualitatively and quantitatively.
4 0.78341711 215 iccv-2013-Incorporating Cloud Distribution in Sky Representation
Author: Kuan-Chuan Peng, Tsuhan Chen
Abstract: Most sky models only describe the cloudiness of the overall sky by a single category or parameter such as sky index, which does not account for the distribution of the clouds across the sky. To capture variable cloudiness, we extend the concept of sky index to a random field indicating the level of cloudiness of each sky pixel in our proposed sky representation based on the Igawa sky model. We formulate the problem of solving the sky index of every sky pixel as a labeling problem, where an approximate solution can be efficiently found. Experimental results show that our proposed sky model has better expressiveness, stability with respect to variation in camera parameters, and geo-location estimation in outdoor images compared to the uniform sky index model. Potential applications of our proposed sky model include sky image rendering, where sky images can be generated with an arbitrary cloud distribution at any time and any location, previously impossible with traditional sky models.
5 0.77576172 220 iccv-2013-Joint Deep Learning for Pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture. By establishing automatic, mutual interaction among components, the deep model achieves a 9% reduction in the average miss rate compared with the current best-performing pedestrian detection approaches on the largest Caltech benchmark dataset.
6 0.77472693 86 iccv-2013-Concurrent Action Detection with Structural Prediction
7 0.77199852 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
8 0.7696135 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments
9 0.76421714 166 iccv-2013-Finding Actors and Actions in Movies
10 0.76236939 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
11 0.76043403 88 iccv-2013-Constant Time Weighted Median Filtering for Stereo Matching and Beyond
12 0.75948811 37 iccv-2013-Action Recognition and Localization by Hierarchical Space-Time Segments
13 0.75941902 99 iccv-2013-Cross-View Action Recognition over Heterogeneous Feature Spaces
14 0.75863403 338 iccv-2013-Randomized Ensemble Tracking
15 0.75654769 242 iccv-2013-Learning People Detectors for Tracking in Crowded Scenes
16 0.75577807 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
17 0.75226158 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition
18 0.74599874 441 iccv-2013-Video Motion for Every Visible Point
19 0.74361444 180 iccv-2013-From Where and How to What We See
20 0.7412945 320 iccv-2013-Pose-Configurable Generic Tracking of Elongated Objects