iccv iccv2013 iccv2013-217 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kwang Moo Yi, Hawook Jeong, Byeongho Heo, Hyung Jin Chang, Jin Young Choi
Abstract: In this paper we propose an object tracking method in case of inaccurate initializations. To track objects accurately in such situation, the proposed method uses “motion saliency ” and “descriptor saliency ” of local features and performs tracking based on generalized Hough transform (GHT). The proposed motion saliency of a local feature emphasizes features having distinctive motions, compared to the motions which are not from the target object. The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. Through these saliencies, the proposed method tries to “learn and find” the target object rather than looking for what was given at initialization, giving robust results even with inaccurate initializations. Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings with GHT voting, thus is robust against severe occlusions as well. The proposed method is compared against nine other methods, with nine image sequences, and hundred random initializations. The experimental results show that our method outperforms all other compared methods.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper we propose an object tracking method in case of inaccurate initializations. [sent-3, score-0.31]
2 To track objects accurately in such situation, the proposed method uses “motion saliency ” and “descriptor saliency ” of local features and performs tracking based on generalized Hough transform (GHT). [sent-4, score-1.063]
3 The proposed motion saliency of a local feature emphasizes features having distinctive motions, compared to the motions which are not from the target object. [sent-5, score-0.863]
4 The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. [sent-6, score-0.686]
5 Through these saliencies, the proposed method tries to “learn and find” the target object rather than looking for what was given at initialization, giving robust results even with inaccurate initializations. [sent-7, score-0.344]
6 Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings with GHT voting, thus is robust against severe occlusions as well. [sent-8, score-0.591]
7 Introduction Various tracking methods have been developed over the past decade and have proven to be successful for many applications [22]. [sent-12, score-0.183]
8 To track objects accurately, problems such as non-rigid deformations [7, 14, 9], partial occlusions [1, 17], and drifting [20, 4] have been tackled giving promising results. [sent-13, score-0.339]
9 However, the applicability of conventional methods is still limited in actual environments due to their sensitivity to initializations (illustrated in Figure 1). [sent-14, score-0.321]
10 One of the major assumptions many conventional trackers have is that the target object is given rather accurately. [sent-15, score-0.445]
11 Therefore, the trackers are sensitive to how they were initialized at the first frame. [sent-16, score-0.189]
12 Figure 1: Example of tracking in case of occlusions and with clumsy initializations (legend: GT, PROP, OAB, SEMI, BEYOND, HOUGH, MIL, TLD, MST, FRAG, INVIS). [sent-21, score-0.332]
13 A slight difference in initializations for the occCup sequence (left column) leads to different results for the same frame (right column). [sent-22, score-0.293]
14 Also, in actual environments, severe occlusions exist where almost all of the target object is occluded, which not many trackers are capable of dealing with. [sent-25, score-0.568]
15 The performance degradation of conventional trackers under inaccurate initializations is a problem which has not been well addressed yet. [sent-26, score-0.628]
16 There are methods with automatic initialization for trackers, such as the method by Mahadevan and Vasconcelos [16], but still, they do not account for cases in which the automatic initialization fails to be accurate. [sent-27, score-0.582]
17 Performance degradation from inaccurate initializations is closely related to drifting problems. [sent-28, score-0.56]
18 In many tracking methods, such as [7, 1, 10, 11, 20], the model for the target object is constructed using the initial target information given at the first frame. [sent-29, score-0.599]
19 Then, the methods adapt the target model with the tracking results for each frame. [sent-30, score-0.371]
20 During the adaptation, rather than learning about the target object and enhancing the model, noise gets involved in the learning process and the performance of the tracker degrades. [sent-31, score-0.3]
21 This drifting would cause trackers to be more sensitive to initializations, since having more noise from the beginning would cause faster drifting. [sent-33, score-0.416]
22 [13] uses a P-N learning scheme, which incorporates results from both a detector and a tracker to avoid such problems, but still, their work is mainly focused on the problem of trackers drifting. [sent-37, score-0.261]
23 Even without considering drifting problems, a small inaccuracy in initialization can cause serious problems. [sent-38, score-0.288]
24 A major cause of conventional trackers being oversensitive to initializations is that they treat the given initialization as a fixed prior. [sent-39, score-0.649]
25 When we only have a single frame to use, constructing the target model solely based on the initialization is reasonable. [sent-40, score-0.288]
26 However, as more frames or more data are given, we clearly need to figure out what the target object is like, rather than just adapting the model learned from the initialization. [sent-41, score-0.291]
27 With the rough classification result, they again perform Grab-cut [19] to segment the target object. [sent-46, score-0.188]
28 Though their aim was to describe non-rigid deformations well and to reduce drifting effects, their method is closely related to the initialization problem of trackers. [sent-47, score-0.249]
29 In order to achieve good tracking results even with inaccurate initializations, we define two saliencies related to motions and descriptors (explained in detail in Section 2). [sent-50, score-0.644]
30 In other words, rather than just trying to find what was given at initialization, the two saliencies work together to learn the salient characteristics of the target object, which also changes the influence of the initial features. [sent-52, score-0.591]
31 The method in [16] also uses the concept of saliency, but their definition of saliency is a criterion for the selection of features (features such as colors or edges, not to be confused with feature points), similar to [3]. [sent-53, score-0.475]
32 Also, the bottom-up saliency they use for initialization is a center-surround saliency (unlike ours which considers target and non-target rather than center and surround), and does not always guarantee that it highlights the target object. [sent-54, score-1.348]
33 GHT is a powerful method for combining partial estimates into a whole, and it is used in many tracking methods [9, 12, 2]. [sent-56, score-0.309]
34 Since we use GHT voting to combine multiple estimates from local features, our method gives robust results even if some features of the target object become occluded. [sent-58, score-0.421]
35 Furthermore, we learn the model using all local features in the scene and keep local features which move along with the target object, thus giving robust results even when all of the target object is occluded (and in case of severe occlusions). [sent-59, score-0.622]
36 [12] creates supporters with nearby features and uses them to aid tracking in case of severe occlusions. [sent-61, score-0.336]
37 However, the performance of their method relies heavily on the primary tracker being used, which is not always accurate and suffers from initialization problems, whereas our method successfully deals with both problems simultaneously. [sent-62, score-0.202]
38 Then, using the learned feature database (DB), we match each feature to obtain partial solutions for the center of the target object. [sent-69, score-0.38]
39 When collecting the partial results using GHT, in order to deal with inaccurate initializations, each solution is weighted according to the two proposed saliencies (the descriptor saliency and the motion saliency). [sent-71, score-1.05]
40 The feature DB is learned on-the-fly during the tracking process. [sent-72, score-0.223]
41 The DB holds how salient each feature is w.r.t. its descriptor (descriptor saliency) and where the center of the target object would be for each item. [sent-76, score-0.286]
42 The proposed motion saliency is obtained using the learned descriptor saliency of features and the optical flow of the local feature. [sent-77, score-1.014]
43 The motion saliency is designed so that the features showing distinctive motion characteristics of the target object have higher values. [sent-78, score-0.852]
44 With the two saliencies, the salient characteristics of the target object are learned in the model (feature DB). [sent-79, score-0.301]
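To make this concrete, here is a minimal sketch of one way such a motion saliency could be computed. The exact formula is not reproduced in this excerpt, so the use of a descriptor-saliency-weighted background-flow estimate is our assumption, and all names are hypothetical.

```python
import numpy as np

def motion_saliency(flows, desc_saliency):
    """Score each local feature's optical flow by how much it deviates from
    an estimate of the background motion, so that features moving
    distinctively with the target receive higher values.

    flows: (N, 2) array of per-feature optical flow vectors.
    desc_saliency: (N,) array of learned descriptor saliencies zeta in [0, 1].
    """
    # Features with low descriptor saliency are likely background, so weight
    # the background-flow estimate by (1 - zeta).  (Our assumption.)
    bg_weight = 1.0 - desc_saliency
    bg_flow = (flows * bg_weight[:, None]).sum(axis=0) / max(bg_weight.sum(), 1e-6)
    # A feature is motion-salient when its flow differs from the background flow.
    deviation = np.linalg.norm(flows - bg_flow, axis=1)
    return deviation / (deviation.max() + 1e-6)  # normalized to [0, 1]
```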
45 Tracking based on GHT The proposed tracking scheme using GHT starts by building a likelihood map for the center position of the target. Figure 2: Overall scheme of the proposed method. [sent-83, score-0.272]
46 The likelihood map is created through GHT by combining the center estimates (votes) from each local feature point. [sent-85, score-0.201]
47 The votes are weighted according to their saliencies so that salient features are accounted for more. [sent-89, score-0.399]
48 This combining process is referred to as voting in GHT, each partial estimate as a vote, wj as a voting weight, and the resultant likelihood map as the vote map. [sent-95, score-0.583]
49 Since the target object position changes little between frames, we apply temporal low-pass filtering (temporal weighted averaging) to the vote map to take advantage of this fact. [sent-96, score-0.337]
50 Then, with this vote map, we can find the target object position $\hat{\mathbf{x}}$ as $\hat{\mathbf{x}} = (\hat{x}, \hat{y}) = \arg\max_{(x,y)} A(x, y)$. [sent-97, score-0.337]
51 Features are matched and vote to $(x_{c,j}, y_{c,j})$ with weight $w_j$, as in Figure 3. [sent-129, score-0.219]
52 This lets the proposed method vote regardless of scale and rotation changes of the target object. [sent-139, score-0.297]
53 The weight $w_j$ is designed to account for both the motion saliency $\eta_j$ and the descriptor saliency $\zeta^{(t)}_{i_j^*}$. [sent-141, score-0.594]
54 We use the multiplication of the two saliencies, $w_j = \eta_j \zeta^{(t)}_{i_j^*}$ (7), to emphasize features for which both saliency values are high. [sent-142, score-1.23]
55 Details about these saliencies and their effects are presented in Section 2. [sent-144, score-0.33]
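A minimal sketch of this voting scheme, under our reading of the text: each matched feature casts a vote for the object center through its stored offset (scaled and rotated to the feature's current scale and orientation), weighted by w_j = eta_j * zeta; the vote map is temporally smoothed and its argmax gives the target position. The data layout and parameter names are hypothetical.

```python
import numpy as np

def ght_vote(shape, matches, prev_map=None, alpha=0.5):
    """Build the vote map A and return the estimated target center (x, y).

    matches: list of dicts, one per matched feature, with keys:
      pos (x, y), offset (dx, dy) stored in the DB, s (scale ratio),
      theta (rotation), eta (motion saliency), zeta (descriptor saliency).
    """
    A = np.zeros(shape)
    for m in matches:
        # Scale and rotate the stored center offset so the vote is invariant
        # to the feature's current scale/orientation relative to the DB item.
        c, s = np.cos(m["theta"]), np.sin(m["theta"])
        dx, dy = m["offset"]
        xc = m["pos"][0] + m["s"] * (c * dx - s * dy)
        yc = m["pos"][1] + m["s"] * (s * dx + c * dy)
        w = m["eta"] * m["zeta"]  # Eq. (7): emphasize doubly salient features
        ix, iy = int(round(xc)), int(round(yc))
        if 0 <= iy < shape[0] and 0 <= ix < shape[1]:
            A[iy, ix] += w
    if prev_map is not None:  # temporal low-pass filtering of the vote map
        A = alpha * A + (1.0 - alpha) * prev_map
    iy, ix = np.unravel_index(np.argmax(A), A.shape)
    return A, (ix, iy)  # vote map and the argmax position x_hat
```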
56 Descriptor Saliency and Feature DB Update To learn the salient features of the target object w.r.t. [sent-149, score-0.297]
57 the shape of the target object, we define the descriptor saliency. Figure 4: Illustration of the descriptor saliency in action. [sent-152, score-1.23]
58 Detected local features are depicted as circles, with their sizes indicating their descriptor saliency values (larger means higher). [sent-153, score-0.549]
59 (a) Initial voting for the target object position, (b) voting after t frames, and (c) voting in case of occlusion. [sent-155, score-0.594]
60 For each item in F(t), the descriptor saliency ζ(t) is learned to hold how good the partial (voting) results were when using the item. [sent-158, score-0.698]
61 For example, the feature items matched with the features pointing to the center of the cup in Figure 3 would be updated with high saliency values, whereas items matched with a feature pointing in the wrong direction (e.g. [sent-162, score-0.887]
62 the feature point on the hand) would be updated with a low saliency value. [sent-164, score-0.447]
63 As the target object information is provided with a bounding box in the first frame, feature points inside the bounding box are added to F(1) with descriptor saliency ζ = 1, and feature points outside the bounding box are added to F(1) with descriptor saliency ζ = 0 (Figure 4a). [sent-166, score-1.629]
64 Then, we continuously update the descriptor saliencies using the back projection result of each vote on A(t) . [sent-167, score-0.553]
65 If an item was matched, the descriptor saliency ζi(t) of this item is updated accordingly. [sent-179, score-0.653]
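A sketch of how ζ could be initialized and updated follows. The initialization (ζ = 1 inside the box, ζ = 0 outside) follows the text directly, while the update rule, a simple blend toward a per-item vote-agreement score from the back projection, is our assumption, since the exact formula is not legible in this excerpt; all names are ours.

```python
import numpy as np

def init_descriptor_saliency(points, bbox):
    """Initialize zeta for F(1): 1 inside the initial bounding box, else 0."""
    x0, y0, x1, y1 = bbox
    return np.array([1.0 if (x0 <= x <= x1 and y0 <= y <= y1) else 0.0
                     for (x, y) in points])

def update_descriptor_saliency(zeta, vote_agreement, lr=0.1):
    """Blend zeta toward how well each item's vote agreed with the vote map
    A(t), e.g. the normalized map value at the voted position (assumed rule)."""
    return (1.0 - lr) * zeta + lr * np.asarray(vote_agreement)
```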
66 SEMI, BEYOND, MIL, and TLD are mostly targeted at solving drifting issues, and HOUGH is a method targeted at overcoming inaccuracies arising from a bounding box representation of the target. [sent-192, score-0.304]
67 To compare results quantitatively, we used manually annotated bounding boxes of the target object as the ground truth. [sent-202, score-0.321]
68 Two measures were used to evaluate the algorithms: the mean error between the ground truth center point and the tracking result, and the percentage of correctly tracked frames. [sent-203, score-0.401]
69 For correctly tracked frames, we counted a tracking result as correct if its center was inside the ground truth bounding box. [sent-204, score-0.611]
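A minimal sketch of these two measures (function and variable names are ours):

```python
import numpy as np

def evaluate(centers, gt_boxes):
    """Mean center error and % of correctly tracked frames, where a frame is
    correct if the tracked center lies inside the ground-truth bounding box."""
    errors, correct = [], 0
    for (cx, cy), (x0, y0, x1, y1) in zip(centers, gt_boxes):
        gx, gy = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        errors.append(np.hypot(cx - gx, cy - gy))  # Euclidean center error
        correct += int(x0 <= cx <= x1 and y0 <= cy <= y1)
    return float(np.mean(errors)), 100.0 * correct / len(gt_boxes)
```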
70 Figure 5: Example of motion saliency obtained for the woman sequence. [sent-209, score-0.556]
71 Note that in (b), the upper-left motion of the cars results in low motion saliency values in (d), whereas the lower-right motion of the person results in high values. [sent-212, score-0.611]
72 The reason we do not use an overlap criterion is that we use random initializations, which may have little overlap even from the beginning. [sent-214, score-0.322]
73 When the overlap measure is used, results for all trackers become degraded since some initializations would be counted as failures even at the first frame. [sent-216, score-0.577]
74 Still, the relative performance of the trackers remains similar, since this happens equally for all trackers. [sent-217, score-0.189]
75 For trackers able to detect tracking failures, we did not use such frames when computing the mean error. [sent-218, score-0.435]
76 However, such frames are counted as tracking failures in the percentage of correctly tracked frames, for a fair comparison. [sent-219, score-0.444]
77 Tracking with Inaccurate Initializations To validate the robustness of our method against clumsy initializations, we have tested trackers with 100 random initializations. [sent-222, score-0.269]
78 Of the 100, the first initialization is identical to the ground truth, and 20 contain initializations with the center point of the bounding box fixed at the ground truth but with different width and height (sampled uniformly, with a maximum difference of 20% of the original width or height). [sent-223, score-0.678]
79 Another 20 contain initializations with the same width and height as the ground truth, but with a differing center point (sampled uniformly, with a maximum difference of 50% of the width or height of the target object). [sent-224, score-0.749]
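A sketch of this protocol; the first 41 initializations follow the text directly, and we assume (it is not stated in this excerpt) that the remaining 59 perturb both size and center. Boxes are (center_x, center_y, width, height); all names are hypothetical.

```python
import random

def random_initializations(gt, seed=0):
    """Generate 100 initializations: 1 exact GT, 20 with perturbed size
    (up to 20% of width/height), 20 with perturbed center (up to 50% of
    width/height), and the rest combining both (our assumption)."""
    random.seed(seed)
    x, y, w, h = gt
    inits = [gt]
    for _ in range(20):  # same center, width/height off by at most 20%
        inits.append((x, y, w * random.uniform(0.8, 1.2),
                      h * random.uniform(0.8, 1.2)))
    for _ in range(20):  # same size, center off by at most 50% of w/h
        inits.append((x + random.uniform(-0.5, 0.5) * w,
                      y + random.uniform(-0.5, 0.5) * h, w, h))
    while len(inits) < 100:  # remaining: both perturbations (assumed)
        inits.append((x + random.uniform(-0.5, 0.5) * w,
                      y + random.uniform(-0.5, 0.5) * h,
                      w * random.uniform(0.8, 1.2),
                      h * random.uniform(0.8, 1.2)))
    return inits
```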
80 Results for all sequences with all initializations are shown in Table 1. [sent-226, score-0.329]
81 (PROPm and PROPd are the results of our method using only the motion saliency and only the descriptor saliency, respectively.) [sent-227, score-0.579]
82 From Table 1, it can be observed that as initialization overlap decreases, the average performance of trackers generally degrades (lower percentage of correctly tracked frames and larger mean errors). [sent-231, score-0.438]
83 Figure 6: Box plots for % correctly tracked with all initializations (legend: PROP, OAB, SEMI, BEYOND, HOUGH, MIL, TLD, MST, FRAG, INVIS). [sent-230, score-0.245]
84 Occasionally, BEYOND shows the best results in terms of mean error, but the percentage of correctly tracked frames shows that only a limited number of frames were tracked. [sent-242, score-0.286]
85 Note that using only one of the two saliencies degrades performance (PROPm and PROPd). [sent-243, score-0.356]
86 Figure 6 is a box plot demonstrating the performance of trackers against different initializations. [sent-244, score-0.239]
87 In these sequences, occlusions of the target object exist, especially in the coke sequence, where the object is fully occluded on occasions, and in occCup, where the object is occluded even at initialization. [sent-253, score-0.495]
88 Figure 7b is an example of severe occlusion where the target object gets fully occluded. [sent-255, score-0.31]
89 Our method successfully tracks the target object even in such a case, by learning the features of the hand which moves together with the target object. [sent-256, score-0.444]
90 In Figure 7e, the target object is occluded even at initialization. [sent-257, score-0.267]
91 As in Figure 7f, many of the compared methods fail to recognize the cup as the target object. [sent-258, score-0.3]
92 Conclusions A new visual tracking method for tracking objects in case of inaccurate initialization has been proposed. [sent-261, score-0.466]
93 The proposed method uses motion saliency and descriptor saliency of local features and obtains the target position through GHT. [sent-262, score-1.202]
94 The motion saliency of a local feature emphasizes features having distinctive motions, compared to background motions. [sent-263, score-0.631]
95 The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. [sent-264, score-0.686]
96 Through these saliencies, the proposed method learns the distinctive characteristics of the target object in the image sequence. [sent-265, score-0.301]
97 The saliencies combined with GHT allow the tracker to achieve robust performance under clumsy initializations and occlusions. [sent-266, score-0.803]
98 The experimental results demonstrated the robustness of our method to inaccurate initializations and (severe) occlusions, outperforming the other compared methods significantly. [sent-268, score-0.293]
99 (a) coke #2, (b) coke #256, (c) tiger1 #106, (d) tiger2 #239, (e) occCup #2, (f) occCup #21, (g) woman #14, (h) woman #301, (i) occFace #562, (j) sylvester #1093, (k) motocross1 #75, (l) mtn.bike #132. [sent-410, score-0.558]
100 Figure 7: Critical frames for tracking results (legend: GT, PROP, OAB, SEMI, BEYOND, HOUGH, MIL, TLD, MST, FRAG, INVIS). [sent-411, score-0.28]
wordName wordTfidf (topN-words)
[('saliency', 0.407), ('ght', 0.37), ('saliencies', 0.33), ('initializations', 0.293), ('trackers', 0.189), ('target', 0.188), ('tracking', 0.183), ('drifting', 0.149), ('item', 0.123), ('voting', 0.122), ('descriptor', 0.114), ('coke', 0.109), ('vote', 0.109), ('db', 0.109), ('invis', 0.106), ('initialization', 0.1), ('woman', 0.091), ('grabner', 0.087), ('inaccurate', 0.087), ('severe', 0.082), ('clumsy', 0.08), ('occcup', 0.08), ('tracked', 0.073), ('wj', 0.073), ('dj', 0.073), ('cup', 0.072), ('tracker', 0.072), ('nine', 0.072), ('supporters', 0.071), ('occlusions', 0.069), ('tld', 0.068), ('frames', 0.063), ('hough', 0.061), ('center', 0.058), ('motion', 0.058), ('emphasizes', 0.057), ('partial', 0.054), ('mst', 0.054), ('occface', 0.053), ('propd', 0.053), ('propm', 0.053), ('sjdri', 0.053), ('box', 0.05), ('votes', 0.05), ('dinh', 0.047), ('snu', 0.047), ('height', 0.046), ('width', 0.044), ('ij', 0.044), ('percentage', 0.044), ('motions', 0.044), ('seoul', 0.044), ('bounding', 0.043), ('correctly', 0.043), ('estimates', 0.043), ('items', 0.042), ('salient', 0.041), ('sylvester', 0.041), ('distinctive', 0.041), ('feature', 0.04), ('object', 0.04), ('oc', 0.039), ('semi', 0.039), ('frag', 0.039), ('occluded', 0.039), ('surf', 0.039), ('cause', 0.039), ('failures', 0.038), ('track', 0.038), ('matched', 0.037), ('mahadevan', 0.036), ('sequences', 0.036), ('godec', 0.035), ('sj', 0.035), ('bike', 0.034), ('kalal', 0.034), ('jth', 0.032), ('characteristics', 0.032), ('korea', 0.032), ('likelihood', 0.031), ('degradation', 0.031), ('targeted', 0.031), ('mil', 0.03), ('xj', 0.03), ('whereas', 0.03), ('differing', 0.03), ('overlap', 0.029), ('matas', 0.029), ('boosting', 0.029), ('giving', 0.029), ('combining', 0.029), ('features', 0.028), ('conventional', 0.028), ('counted', 0.028), ('performances', 0.028), ('jin', 0.028), ('pointing', 0.027), ('flows', 0.027), ('degrade', 0.026), ('hundred', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
Author: Kwang Moo Yi, Hawook Jeong, Byeongho Heo, Hyung Jin Chang, Jin Young Choi
Abstract: In this paper we propose an object tracking method in case of inaccurate initializations. To track objects accurately in such situation, the proposed method uses “motion saliency ” and “descriptor saliency ” of local features and performs tracking based on generalized Hough transform (GHT). The proposed motion saliency of a local feature emphasizes features having distinctive motions, compared to the motions which are not from the target object. The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. Through these saliencies, the proposed method tries to “learn and find” the target object rather than looking for what was given at initialization, giving robust results even with inaccurate initializations. Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings with GHT voting, thus is robust against severe occlusions as well. The proposed method is compared against nine other methods, with nine image sequences, and hundred random initializations. The experimental results show that our method outperforms all other compared methods.
2 0.33323145 71 iccv-2013-Category-Independent Object-Level Saliency Detection
Author: Yangqing Jia, Mei Han
Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.
3 0.32730779 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
Author: Xiaohui Li, Huchuan Lu, Lihe Zhang, Xiang Ruan, Ming-Hsuan Yang
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.
4 0.27482754 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel
Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.
5 0.27122563 396 iccv-2013-Space-Time Robust Representation for Action Recognition
Author: Nicolas Ballas, Yi Yang, Zhen-Zhong Lan, Bertrand Delezoide, Françoise Prêteux, Alexander Hauptmann
Abstract: We address the problem of action recognition in unconstrained videos. We propose a novel content driven pooling that leverages space-time context while being robust toward global space-time transformations. Being robust to such transformations is of primary importance in unconstrained videos where the action localizations can drastically shift between frames. Our pooling identifies regions of interest using video structural cues estimated by different saliency functions. To combine the different structural information, we introduce an iterative structure learning algorithm, WSVM (weighted SVM), that determines the optimal saliency layout ofan action model through a sparse regularizer. A new optimization method isproposed to solve the WSVM’ highly non-smooth objective function. We evaluate our approach on standard action datasets (KTH, UCF50 and HMDB). Most noticeably, the accuracy of our algorithm reaches 51.8% on the challenging HMDB dataset which outperforms the state-of-the-art of 7.3% relatively.
6 0.25294197 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
7 0.23827551 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
8 0.2192537 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
9 0.20929651 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
10 0.19890858 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms
11 0.19144297 370 iccv-2013-Saliency Detection in Large Point Sets
12 0.19089252 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness
13 0.17951041 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
14 0.16516998 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
15 0.16510355 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
16 0.15578747 369 iccv-2013-Saliency Detection: A Boolean Map Approach
17 0.15278006 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
18 0.14270598 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
19 0.13583642 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
20 0.12753691 338 iccv-2013-Randomized Ensemble Tracking
topicId topicWeight
[(0, 0.21), (1, -0.075), (2, 0.363), (3, -0.13), (4, -0.102), (5, -0.069), (6, -0.037), (7, 0.087), (8, -0.034), (9, 0.188), (10, -0.07), (11, -0.124), (12, 0.06), (13, 0.032), (14, 0.089), (15, -0.12), (16, 0.125), (17, 0.006), (18, -0.068), (19, 0.007), (20, -0.011), (21, 0.003), (22, 0.006), (23, -0.025), (24, 0.024), (25, -0.013), (26, 0.008), (27, 0.003), (28, 0.0), (29, -0.034), (30, -0.03), (31, -0.021), (32, -0.034), (33, 0.013), (34, -0.037), (35, 0.003), (36, -0.022), (37, -0.003), (38, -0.006), (39, 0.039), (40, -0.036), (41, -0.032), (42, -0.01), (43, -0.048), (44, 0.04), (45, 0.022), (46, -0.008), (47, -0.017), (48, 0.032), (49, -0.0)]
simIndex simValue paperId paperTitle
same-paper 1 0.94818586 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
Author: Kwang Moo Yi, Hawook Jeong, Byeongho Heo, Hyung Jin Chang, Jin Young Choi
Abstract: In this paper we propose an object tracking method in case of inaccurate initializations. To track objects accurately in such situation, the proposed method uses “motion saliency ” and “descriptor saliency ” of local features and performs tracking based on generalized Hough transform (GHT). The proposed motion saliency of a local feature emphasizes features having distinctive motions, compared to the motions which are not from the target object. The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. Through these saliencies, the proposed method tries to “learn and find” the target object rather than looking for what was given at initialization, giving robust results even with inaccurate initializations. Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings with GHT voting, thus is robust against severe occlusions as well. The proposed method is compared against nine other methods, with nine image sequences, and hundred random initializations. The experimental results show that our method outperforms all other compared methods.
2 0.76555091 369 iccv-2013-Saliency Detection: A Boolean Map Approach
Author: Jianming Zhang, Stan Sclaroff
Abstract: A novel Boolean Map based Saliency (BMS) model is proposed. An image is characterized by a set of binary images, which are generated by randomly thresholding the image ’s color channels. Based on a Gestalt principle of figure-ground segregation, BMS computes saliency maps by analyzing the topological structure of Boolean maps. BMS is simple to implement and efficient to run. Despite its simplicity, BMS consistently achieves state-of-the-art performance compared with ten leading methods on five eye tracking datasets. Furthermore, BMS is also shown to be advantageous in salient object detection.
3 0.76338685 71 iccv-2013-Category-Independent Object-Level Saliency Detection
Author: Yangqing Jia, Mei Han
Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.
4 0.76331306 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
Author: Xiaohui Li, Huchuan Lu, Lihe Zhang, Xiang Ruan, Ming-Hsuan Yang
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.
5 0.7628997 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel
Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.
6 0.74431169 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness
7 0.74086761 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
8 0.73034173 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
9 0.71611327 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
10 0.68929082 370 iccv-2013-Saliency Detection in Large Point Sets
11 0.66731554 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
12 0.65208769 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms
13 0.63942093 396 iccv-2013-Space-Time Robust Representation for Action Recognition
14 0.57659268 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
15 0.56840938 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
16 0.54926705 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation
17 0.54772252 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
18 0.5410915 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
19 0.53252363 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
20 0.48661944 338 iccv-2013-Randomized Ensemble Tracking
topicId topicWeight
[(2, 0.05), (4, 0.012), (7, 0.018), (26, 0.06), (31, 0.043), (40, 0.063), (42, 0.094), (53, 0.192), (64, 0.111), (73, 0.033), (89, 0.174), (97, 0.04), (98, 0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.81598073 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
Author: Kwang Moo Yi, Hawook Jeong, Byeongho Heo, Hyung Jin Chang, Jin Young Choi
Abstract: In this paper we propose an object tracking method in case of inaccurate initializations. To track objects accurately in such situation, the proposed method uses “motion saliency ” and “descriptor saliency ” of local features and performs tracking based on generalized Hough transform (GHT). The proposed motion saliency of a local feature emphasizes features having distinctive motions, compared to the motions which are not from the target object. The descriptor saliency emphasizes features which are likely to be of the object in terms of its feature descriptors. Through these saliencies, the proposed method tries to “learn and find” the target object rather than looking for what was given at initialization, giving robust results even with inaccurate initializations. Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings with GHT voting, thus is robust against severe occlusions as well. The proposed method is compared against nine other methods, with nine image sequences, and hundred random initializations. The experimental results show that our method outperforms all other compared methods.
2 0.78610963 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
Author: Mehrtash Harandi, Conrad Sanderson, Chunhua Shen, Brian Lovell
Abstract: Recent advances in computer vision and machine learning suggest that a wide range of problems can be addressed more appropriately by considering non-Euclidean geometry. In this paper we explore sparse dictionary learning over the space of linear subspaces, which form Riemannian structures known as Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping, which enables us to devise a closed-form solution for updating a Grassmann dictionary, atom by atom. Furthermore, to handle non-linearity in data, we propose a kernelised version of the dictionary learning algorithm. Experiments on several classification tasks (face recognition, action recognition, dynamic texture classification) show that the proposed approach achieves considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as kernelised Affine Hull Method and graphembedding Grassmann discriminant analysis.
3 0.77576065 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
Author: Zhibin Hong, Xue Mei, Danil Prokhorov, Dacheng Tao
Abstract: Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix to two collaborative components which enable a more robust and accurate approximation. We show that theproposedformulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several stateof-the-art trackers.
4 0.76518619 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
Author: Stefan Duffner, Christophe Garcia
Abstract: In this paper, we present a novel algorithm for fast tracking of generic objects in videos. The algorithm uses two components: a detector that makes use of the generalised Hough transform with pixel-based descriptors, and a probabilistic segmentation method based on global models for foreground and background. These components are used for tracking in a combined way, and they adapt each other in a co-training manner. Through effective model adaptation and segmentation, the algorithm is able to track objects that undergo rigid and non-rigid deformations and considerable shape and appearance variations. The proposed tracking method has been thoroughly evaluated on challenging standard videos, and outperforms state-of-theart tracking methods designed for the same task. Finally, the proposed models allow for an extremely efficient implementation, and thus tracking is very fast.
5 0.76273763 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments
Author: Fuxin Li, Taeyoung Kim, Ahmad Humayun, David Tsai, James M. Rehg
Abstract: We propose an unsupervised video segmentation approach by simultaneously tracking multiple holistic figureground segments. Segment tracks are initialized from a pool of segment proposals generated from a figure-ground segmentation algorithm. Then, online non-local appearance models are trained incrementally for each track using a multi-output regularized least squares formulation. By using the same set of training examples for all segment tracks, a computational trick allows us to track hundreds of segment tracks efficiently, as well as perform optimal online updates in closed-form. Besides, a new composite statistical inference approach is proposed for refining the obtained segment tracks, which breaks down the initial segment proposals and recombines for better ones by utilizing highorder statistic estimates from the appearance model and enforcing temporal consistency. For evaluating the algorithm, a dataset, SegTrack v2, is collected with about 1,000 frames with pixel-level annotations. The proposed framework outperforms state-of-the-art approaches in the dataset, show- ing its efficiency and robustness to challenges in different video sequences.
6 0.75860667 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
7 0.75666124 215 iccv-2013-Incorporating Cloud Distribution in Sky Representation
8 0.75547898 338 iccv-2013-Randomized Ensemble Tracking
9 0.7522974 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
10 0.75121075 160 iccv-2013-Fast Object Segmentation in Unconstrained Video
11 0.75053096 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition
12 0.75005656 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
13 0.74950594 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
14 0.74773967 86 iccv-2013-Concurrent Action Detection with Structural Prediction
15 0.74718356 41 iccv-2013-Active Learning of an Action Detector from Untrimmed Videos
16 0.7436645 441 iccv-2013-Video Motion for Every Visible Point
17 0.74266815 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation
18 0.74214303 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests
19 0.74173063 166 iccv-2013-Finding Actors and Actions in Movies
20 0.74153829 37 iccv-2013-Action Recognition and Localization by Hierarchical Space-Time Segments