cvpr cvpr2013 cvpr2013-174 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jia Deng, Jonathan Krause, Li Fei-Fei
Abstract: Fine-grained recognition concerns categorization at sub-ordinate levels, where the distinction between object classes is highly local. Compared to basic level recognition, fine-grained categorization can be more challenging as there are in general less data and fewer discriminative features. This necessitates the use of stronger prior for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. The player’s goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. With proper setup the game generates discriminative bubbles with assured quality. We next propose the “BubbleBank” algorithm that uses the human selected bubbles to improve machine recognition performance. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.
Reference: text
sentIndex sentText sentNum sentScore
1 In this work, we include humans in the loop to help computers select discriminative features. [sent-4, score-0.114]
2 We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. [sent-5, score-0.468]
3 During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. [sent-7, score-0.231]
4 With proper setup the game generates discriminative bubbles with assured quality. [sent-8, score-1.149]
5 We next propose the “BubbleBank” algorithm that uses the human selected bubbles to improve machine recognition performance. [sent-9, score-0.729]
6 There is in general limited data as fine grained labels are much harder to acquire. [sent-16, score-0.117]
7 In comparison, the difference between fine grained classes can be very subtle and only a few key features matter. [sent-19, score-0.149]
8 Another promising direction is including the crowd in the loop by having humans either label or propose parts and attributes [3, 13, 11, 22, 9, 21]. [sent-30, score-0.156]
9 Specifically, we propose a novel online game called “Bubbles” that reveals the discriminative features. [sent-40, score-0.411]
10 At each round of the game, a player sees example images for two bird species. [sent-42, score-0.267]
11 Regardless of the outcome, the game advances to the next round with a new image and possibly a new pair of bird species. [sent-45, score-0.435]
12 The key twist of the game is that the new image is always heavily blurred so that the player can only see a rough outline of the bird. [sent-46, score-0.553]
13 The player can, however, click to reveal small, circular areas of the image (“bubbles”) to inspect the full details, with a penalty on game points. [sent-47, score-0.63]
14 Through proper setup of reward, the game can guarantee that bubbles selected by a successful human player contain discriminative features. [sent-48, score-1.323]
15 The game enjoys the following advantages: (1) Domain agnostic. [sent-49, score-0.349]
16 The only assumption is that humans can discover discriminative visual features from a handful of examples. [sent-50, score-0.127]
17 The game provides entertainment and people will volunteer to play. [sent-56, score-0.349]
18 Our second contribution is “BubbleBank”, a new algorithm that uses the crowd-selected bubbles for fine-grained recognition. [sent-58, score-0.729]
19 For each bubble from the game, we generate a “bubble detector” that tries to detect the same pattern from other images. [sent-59, score-0.312]
20 Each image can then be represented by “BubbleBank”, a collection of max-pooled responses from each bubble detector. [sent-60, score-0.345]
21 During the game, the crowd is allowed to inspect circular regions (“bubbles”), with a penalty of game points. [sent-66, score-0.415]
22 Next, when a computer tries to recognize fine grained categories, it collects the human selected bubbles and detects similar patterns on an image. [sent-68, score-0.846]
23 by asking humans to directly provide annotation rationales [9], to label features in NLP tasks [10], to describe the differences between pairs of images [20], or to perform tasks that are parts of the machine pipeline [23]. [sent-73, score-0.167]
24 Our work is different in that we use online games to discover discriminative features for fine grained recognition. [sent-74, score-0.326]
25 The game is named after a well known psychology technique for studying features that humans use for face recognition [14]. [sent-76, score-0.406]
26 Human subjects are shown a face image with random bubbles revealed and asked to identify the gender or expression. [sent-77, score-0.765]
27 Our approach differs in that our bubbles are actively chosen by the player. [sent-78, score-0.729]
28 Another connection to human vision studies is that our game to a certain extent resembles eye tracking, revealing the locations looked at by humans. [sent-79, score-0.349]
29 Our game also draws inspiration from human computation [30, 31, 17], especially the seminal “Peekaboom” game [31]. [sent-80, score-0.698]
30 In this two player game, player A is given a word (e. [sent-81, score-0.38]
31 First, Peekaboom is not suitable for fine grained recognition because an average player cannot be expected to come up with the same word “Northern Flicker”. [sent-87, score-0.117] [sent-91, score-0.199]
32 The player can click to reveal the area inside the bubble. [sent-89, score-0.241]
33 The more bubbles used, the fewer points the player can earn. [sent-90, score-0.91]
35 In our game, we replace word typing with binary choices and make discovering discriminative visual features between unfamiliar categories part of the game play. [sent-94, score-0.451]
36 Another difference is that our game is single-player. [sent-95, score-0.349]
37 This eliminates the need to match two players in real time, making it much easier to deploy on paid crowdsourcing platforms such as Amazon Mechanical Turk (AMT). [sent-96, score-0.152]
38 A player is given example images of two categories. [sent-102, score-0.181]
39 A green “bubble” (size adjustable) follows the mouse cursor as the player hovers over the center image. [sent-105, score-0.181]
40 When the player clicks, the area under the circle is revealed in full detail. [sent-106, score-0.217]
41 If the player answers correctly, she earns new points. [sent-107, score-0.239]
42 Either way, the game then advances to the next round, with a new center image and possibly a new pair of categories. [sent-109, score-0.349]
43 We design the reward of the game such that a player can only earn high scores if she identifies the categories correctly and uses bubbles parsimoniously. [sent-111, score-1.335]
44 , the player is allowed to pass difficult images or categories with no penalty, such that they are not forced to guess. [sent-115, score-0.205]
45 Another issue of game design is determining the amount of blurring for the center image. [sent-121, score-0.376]
46 With insufficient blurring, the player can directly identify the category, whereas too much blurring would obscure the global shape. [sent-122, score-0.208]
47 To address this issue, we start with a small amount of blurring and increase it gradually in new games until the use of bubbles becomes necessary. [sent-123, score-0.895]
48 The game can be enjoyable as it has an engaging challenge-reward setup with instant feedback. [sent-125, score-0.349]
49 To earn high scores, the user needs to discover the differences between highly confusing categories. [sent-126, score-0.122]
50 To further enhance the experience, we can create a sense of time pressure by adding a countdown timer and “freezing” the bubbles for a few seconds once a certain amount of area has been revealed. [sent-129, score-0.751]
51 We finally note that there is nothing specific about birds in the game design. [sent-130, score-0.374]
52 Thus the game can be readily applied in a different domain. [sent-132, score-0.349]
53 AMT Deployment The game is suitable for deployment on paid crowdsourcing platforms such as AMT. [sent-134, score-0.474]
54 The worker must score enough points in order to submit the task, otherwise the games will continue indefinitely. [sent-136, score-0.139]
55 We deployed the game on AMT using the CUB-200-2010 bird dataset [35] that contains 200 types of birds. [sent-142, score-0.405]
56 We generate the games from visually confusing category pairs (see Sec. [sent-150, score-0.23]
57 the player correctly identifies the category), 14% failed, and 15% were skipped by passing the image or switching categories. [sent-156, score-0.181]
58 4 shows examples of successful games for four pairs of categories. [sent-158, score-0.183]
59 5 plots the cumulative distribution of the area revealed in successful games — over 90% of the games reveal less than 10% of the object bounding box. [sent-168, score-0.37]
60 This validates our hypothesis that (1) humans can indeed discover the fine differences from a handful of examples and (2) for fine-grained recognition, the key features are highly local. [sent-169, score-0.143]
61 The area revealed in most of the successful games is small. [sent-170, score-0.199]
62 Over 90% of the games use less than 10% of the object bounding box. [sent-171, score-0.139]
63 Finally, we can aggregate the bubbles on the same image from multiple games played by multiple players and obtain a heat map of discriminative regions. [sent-172, score-0.986]
64 It suggests that the game can indeed discover meaningful cues for fine-grained recognition. [sent-175, score-0.379]
65 The BubbleBank Algorithm The Bubbles game reveals discriminative features. [sent-177, score-0.411]
66 In this section we show how to use the human selected bubbles to improve recognition. [sent-178, score-0.729]
67 Our basic idea is to generate a detector for each bubble and represent each image as a collection of responses of the bubble detectors. [sent-179, score-0.689]
68 The Bubble Detectors Since each bubble is drawn in the context of discriminating two classes, we start by assuming [sent-180, score-0.312]
69 Our intuition is that since each bubble contains discriminative features for recognition, it suffices to detect such patterns in a test image. [sent-183, score-0.352]
70 Since each bubble is usually a small area, it can be represented by a single descriptor such as SIFT, or a concatenation of simple descriptors. [sent-186, score-0.312]
71 Instead of convolving with the entire image, each detector operates on a fixed, rectangular region whose center is determined by the relative location of the bubble in the original image. [sent-189, score-0.344]
72 Note that here we have assumed that the object has been localized, as is standard in the classification task in fine grained recognition [35, 37, 36]. [sent-191, score-0.117]
73 Now, assume that we have collected multiple bubbles, each from a training image of one of the two classes (each training image can have multiple bubbles from a single round of game or multiple games played by different players). [sent-192, score-1.303]
74 We can then form a bank of bubble detectors (“BubbleBank”) and represent the image by a vector of the maxpooled responses from each detector, in a spirit similar to the ObjectBank [18] representation. [sent-193, score-0.381]
75 Extending to Multiple Classes Extending to multiple classes is straightforward: we can simply obtain bubbles for all pairs of categories and then use all of them to form our BubbleBank. [sent-197, score-0.805]
76 This, however, does not scale well with the number of classes because we need to run O(K^2) games for K classes. [sent-198, score-0.171]
77 Fortunately, obtaining bubbles for every pair of categories is unnecessary in practice. [sent-199, score-0.753]
78 It is likely that a bubble useful for differentiating a class from another very confusing class is also helpful for discriminating the same class against less similar ones. [sent-201, score-0.363]
79 For example, the bubbles selected for “Common Tern” against “Herring Gull” in Fig. [sent-202, score-0.729]
80 Bubble Detectors We implement the bubble detectors using SIFT [27] and color histograms extracted at the bubble locations. [sent-223, score-0.66]
81 To run the bubble detectors, we resize an image to a max dimension of 300 pixels […] detector responses are raised to the power of p > 1, the differences between values in the higher range are amplified. [sent-230, score-0.383]
82 The detector response at each location is the dot product of the image patch descriptor and the bubble descriptor. [sent-232, score-0.363]
83 7 (left) plots the distribution of the maximum bubble detector responses on images from a pair of classes in CUB-14. [sent-250, score-0.409]
84 The red bars correspond to the responses of bubbles on images from the same class (i. [sent-251, score-0.782]
85 Since the 14 classes come from two visually very distinctive subgroups, vireo and woodpecker, we run the bubbles game within each subgroup. [sent-275, score-1.134]
86 We obtained 16336 bubbles from 4101 successful, positively scored games using a total of 210 unflipped training images. [sent-277, score-0.868]
87 The same bubbles can be mirrored on the flipped images, which gives a total of 32672 bubble detectors. [sent-278, score-1.041]
88 As a control experiment, we replace the crowdsourced bubbles with randomly generated ones while keeping everything else exactly the same. [sent-287, score-0.729]
89 This control experiment demonstrates that (1) the Bubbles game is essential and (2) the quality of the bubbles is indeed assured by the game mechanism. [sent-293, score-1.479]
90 8 reports recognition performances using subsampled bubbles (using 1%, 5%, 10%, 20%, 50%, 80% of the full set of 32672 bubbles). [sent-299, score-0.729]
91 Strikingly, using only 1634 human selected bubbles (5% of the entire set), we already outperform CFAF [36] (51. [sent-301, score-0.729]
92 9 shows success and failure cases of classification along with the top bubbles contributing to the predictions. We observe that the correct predictions can indeed be attributed to discriminative bubbles. [sent-310, score-0.769]
93 The first two cases are a result of treating each bubble independently. [sent-312, score-0.312]
94 Often humans draw multiple bubbles on the same image and the bubbles may not be sufficiently discriminative in isolation. [sent-313, score-1.555]
95 Since there are many more classes than CUB-14 and visually confusing pairs of classes are not necessarily from the same subgroup, we use a different approach to select the pairs for crowdsourcing. [sent-318, score-0.155]
96 We then obtain 220242 bubbles through 46958 successful, positively scored games using training images. [sent-321, score-0.868]
97 We also observe that random bubbles cause a large performance drop (from 32. [sent-326, score-0.729]
98 introduce a new online game “Bubbles” that reveals important features humans use in fine-grained recognition. [sent-342, score-0.428]
99 The game is domain agnostic and guarantees high quality data. [sent-343, score-0.37]
100 Second, we propose the “BubbleBank” algorithm that uses the human selected bubbles to learn classifiers for fine-grained categories. [sent-344, score-0.729]
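The extracted sentences above describe the BubbleBank representation in words: each bubble yields a detector, the response at a location is the dot product of the patch descriptor and the bubble descriptor, each detector searches a fixed region around the bubble's relative location, responses are max-pooled, and raising them to a power p > 1 amplifies differences in the higher range. The following is a minimal Python sketch of those steps, not the authors' implementation. The function names, the normalized (x, y) coordinates, the square window of `half_size`, the clamping of negative responses to zero, and the zero response for an empty window are all assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length descriptor vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_region(location, center, half_size):
    """True if a patch location lies in the square window around center.

    Locations are (x, y) pairs in normalized image coordinates (assumption).
    """
    return (abs(location[0] - center[0]) <= half_size
            and abs(location[1] - center[1]) <= half_size)


def bubblebank_features(patches, bubbles, half_size=0.1, p=1.0):
    """Return one max-pooled, power-normalized response per bubble detector.

    patches: list of ((x, y), descriptor) extracted from one test image.
    bubbles: list of ((x, y), descriptor) collected from the game.
    half_size and p are hypothetical parameters for this sketch.
    """
    features = []
    for center, template in bubbles:
        # Detector response at each location inside the search window.
        responses = [dot(descriptor, template)
                     for location, descriptor in patches
                     if in_region(location, center, half_size)]
        # Max pooling; an empty window gives 0.0 (assumption).
        pooled = max(responses, default=0.0)
        # Clamp before the power so a non-integer p stays real (assumption).
        features.append(max(pooled, 0.0) ** p)
    return features


# Toy data: three patches and two bubbles with 2-D descriptors.
patches = [((0.20, 0.20), [1.0, 0.0]),
           ((0.25, 0.20), [0.6, 0.8]),
           ((0.90, 0.90), [0.0, 1.0])]
bubbles = [((0.20, 0.20), [1.0, 0.0]),
           ((0.20, 0.20), [0.0, 1.0])]

print(bubblebank_features(patches, bubbles, p=1.0))
print(bubblebank_features(patches, bubbles, p=2.0))
```

In this toy input the third patch matches the second bubble best, but it lies outside that bubble's search window and is ignored, which illustrates why restricting each detector to a region differs from convolving over the entire image. The resulting vector would then be passed to whatever classifier is trained on top of it.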
wordName wordTfidf (topN-words)
[('bubbles', 0.729), ('game', 0.349), ('bubble', 0.312), ('bubblebank', 0.23), ('player', 0.181), ('games', 0.139), ('tern', 0.094), ('cfaf', 0.081), ('grained', 0.08), ('crowdsourcing', 0.06), ('humans', 0.057), ('bird', 0.056), ('birdlet', 0.054), ('peekaboom', 0.054), ('players', 0.054), ('confusing', 0.051), ('auklet', 0.041), ('earns', 0.041), ('gull', 0.041), ('herring', 0.041), ('woodpecker', 0.041), ('amt', 0.04), ('discriminative', 0.04), ('branson', 0.038), ('attributes', 0.038), ('fine', 0.037), ('rounds', 0.036), ('revealed', 0.036), ('arctic', 0.036), ('tricos', 0.036), ('detectors', 0.036), ('responses', 0.033), ('wah', 0.032), ('classes', 0.032), ('reveal', 0.032), ('detector', 0.032), ('researcher', 0.031), ('assured', 0.031), ('discover', 0.03), ('pooling', 0.03), ('round', 0.03), ('reward', 0.03), ('ahn', 0.028), ('click', 0.028), ('deployment', 0.027), ('northern', 0.027), ('objectbank', 0.027), ('oouurrss', 0.027), ('parakeet', 0.027), ('rationales', 0.027), ('blurring', 0.027), ('crowd', 0.026), ('workers', 0.026), ('annotation', 0.026), ('birds', 0.025), ('species', 0.025), ('vireo', 0.024), ('kdes', 0.024), ('woodpeckers', 0.024), ('successful', 0.024), ('categories', 0.024), ('played', 0.024), ('perona', 0.023), ('welinder', 0.023), ('blurred', 0.023), ('von', 0.023), ('spm', 0.022), ('inspect', 0.022), ('earn', 0.022), ('pressure', 0.022), ('reveals', 0.022), ('settles', 0.021), ('flicker', 0.021), ('quality', 0.021), ('guessing', 0.02), ('unfamiliar', 0.02), ('platforms', 0.02), ('pairs', 0.02), ('category', 0.02), ('bars', 0.02), ('power', 0.019), ('categorization', 0.019), ('discriminated', 0.019), ('bagging', 0.019), ('farrell', 0.019), ('subordinate', 0.019), ('identification', 0.019), ('differences', 0.019), ('response', 0.019), ('circular', 0.018), ('word', 0.018), ('parts', 0.018), ('paid', 0.018), ('specify', 0.017), ('parikh', 0.017), ('answers', 0.017), ('schroff', 0.017), ('loop', 0.017), ('distinguishing', 0.017), ('barcelona', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition
Author: Jia Deng, Jonathan Krause, Li Fei-Fei
2 0.19232532 441 cvpr-2013-Tracking Sports Players with Context-Conditioned Motion Models
Author: Jingchen Liu, Peter Carr, Robert T. Collins, Yanxi Liu
Abstract: We employ hierarchical data association to track players in team sports. Player movements are often complex and highly correlated with both nearby and distant players. A single model would require many degrees of freedom to represent the full motion diversity and could be difficult to use in practice. Instead, we introduce a set of Game Context Features extracted from noisy detections to describe the current state of the match, such as how the players are spatially distributed. Our assumption is that players react to the current situation in only a finite number of ways. As a result, we are able to select an appropriate simplified affinity model for each player and time instant using a random decision forest based on current track and game context features. Our context-conditioned motion models implicitly incorporate complex inter-object correlations while remaining tractable. We demonstrate significant performance improvements over existing multi-target tracking algorithms on basketball and field hockey sequences several minutes in duration and containing 10 and 20 players respectively.
3 0.16718905 356 cvpr-2013-Representing and Discovering Adversarial Team Behaviors Using Player Roles
Author: Patrick Lucey, Alina Bialkowski, Peter Carr, Stuart Morgan, Iain Matthews, Yaser Sheikh
Abstract: In this paper, we describe a method to represent and discover adversarial group behavior in a continuous domain. In comparison to other types of behavior, adversarial behavior is heavily structured as the location of a player (or agent) is dependent both on their teammates and adversaries, in addition to the tactics or strategies of the team. We present a method which can exploit this relationship through the use of a spatiotemporal basis model. As players constantly change roles during a match, we show that employing a “role-based” representation instead of one based on player “identity” can best exploit the playing structure. As vision-based systems currently do not provide perfect detection/tracking (e.g. missed or false detections), we show that our compact representation can effectively “denoise” erroneous detections as well as enabling temporal analysis, which was previously prohibitive due to the dimensionality of the signal. To evaluate our approach, we used a fully instrumented field-hockey pitch with 8 fixed high-definition (HD) cameras and evaluated our approach on approximately 200,000 frames of data from a state-of-the-art real-time player detector and compare it to manually labelled data.
Author: Thomas Berg, Peter N. Belhumeur
Abstract: From a set of images in a particular domain, labeled with part locations and class, we present a method to automatically learn a large and diverse set of highly discriminative intermediate features that we call Part-based One-vs-One Features (POOFs). Each of these features specializes in discrimination between two particular classes based on the appearance at a particular part. We demonstrate the particular usefulness of these features for fine-grained visual categorization with new state-of-the-art results on bird species identification using the Caltech UCSD Birds (CUB) dataset and parity with the best existing results in face verification on the Labeled Faces in the Wild (LFW) dataset. Finally, we demonstrate the particular advantage of POOFs when training data is scarce.
5 0.05895574 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
Author: Anelia Angelova, Shenghuo Zhu
Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance especially for datasets which are considered particularly hard for recognition, e.g. birds species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.
6 0.057635855 116 cvpr-2013-Designing Category-Level Attributes for Discriminative Visual Recognition
7 0.055009305 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images
8 0.053796723 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
9 0.050338253 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
10 0.047085322 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes
11 0.04595498 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
12 0.045687761 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
13 0.045399599 325 cvpr-2013-Part Discovery from Partial Correspondence
14 0.042399663 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification
15 0.041997686 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
16 0.039719056 229 cvpr-2013-It's Not Polite to Point: Describing People with Uncertain Attributes
17 0.038279504 452 cvpr-2013-Vantage Feature Frames for Fine-Grained Categorization
18 0.037377052 130 cvpr-2013-Discriminative Color Descriptors
19 0.036923386 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition
20 0.036830653 146 cvpr-2013-Enriching Texture Analysis with Semantic Data
topicId topicWeight
[(0, 0.092), (1, -0.041), (2, -0.006), (3, -0.025), (4, 0.023), (5, 0.031), (6, -0.027), (7, 0.005), (8, 0.021), (9, 0.045), (10, -0.023), (11, -0.008), (12, 0.006), (13, -0.026), (14, 0.039), (15, 0.009), (16, 0.01), (17, 0.003), (18, 0.011), (19, -0.03), (20, 0.03), (21, 0.034), (22, -0.008), (23, 0.039), (24, 0.029), (25, 0.064), (26, 0.054), (27, -0.083), (28, -0.05), (29, -0.083), (30, -0.036), (31, -0.01), (32, -0.059), (33, -0.074), (34, 0.123), (35, -0.022), (36, -0.099), (37, -0.06), (38, 0.023), (39, 0.11), (40, 0.033), (41, -0.084), (42, -0.012), (43, -0.012), (44, -0.016), (45, 0.008), (46, 0.108), (47, -0.015), (48, -0.043), (49, 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.83713752 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition
Author: Jia Deng, Jonathan Krause, Li Fei-Fei
2 0.82041484 356 cvpr-2013-Representing and Discovering Adversarial Team Behaviors Using Player Roles
Author: Patrick Lucey, Alina Bialkowski, Peter Carr, Stuart Morgan, Iain Matthews, Yaser Sheikh
3 0.70403582 441 cvpr-2013-Tracking Sports Players with Context-Conditioned Motion Models
Author: Jingchen Liu, Peter Carr, Robert T. Collins, Yanxi Liu
Author: Thomas Berg, Peter N. Belhumeur
5 0.44545758 452 cvpr-2013-Vantage Feature Frames for Fine-Grained Categorization
Author: Asma Rejeb Sfar, Nozha Boujemaa, Donald Geman
Abstract: We study fine-grained categorization, the task of distinguishing among (sub)categories of the same generic object class (e.g., birds), focusing on determining botanical species (leaves and orchids) from scanned images. The strategy is to focus attention around several vantage points, which is the approach taken by botanists, but using features dedicated to the individual categories. Our implementation of the strategy is based on vantage feature frames, a novel object representation consisting of two components: a set of coordinate systems centered at the most discriminating local viewpoints for the generic object class and a set of category-dependent features computed in these frames. The features are pooled over frames to build the classifier. Categorization then proceeds from coarse-grained (finding the frames) to fine-grained (finding the category), and hence the vantage feature frames must be both detectable and discriminating. The proposed method outperforms state-of-the-art algorithms, in particular those using more distributed representations, on standard databases of leaves.
6 0.44164458 301 cvpr-2013-Multi-target Tracking by Rank-1 Tensor Approximation
7 0.36059886 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
8 0.35445166 325 cvpr-2013-Part Discovery from Partial Correspondence
9 0.35115767 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images
10 0.34943947 103 cvpr-2013-Decoding Children's Social Behavior
11 0.34216416 121 cvpr-2013-Detection- and Trajectory-Level Exclusion in Multiple Object Tracking
12 0.34048939 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection
13 0.33750266 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning
14 0.33071634 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
15 0.32936192 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
16 0.32815516 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification
17 0.32164878 201 cvpr-2013-Heterogeneous Visual Features Fusion via Sparse Multimodal Machine
18 0.32149717 99 cvpr-2013-Cross-View Image Geolocalization
19 0.31819293 96 cvpr-2013-Correlation Filters for Object Alignment
20 0.3178041 263 cvpr-2013-Learning the Change for Automatic Image Cropping
topicId topicWeight
[(10, 0.082), (16, 0.019), (19, 0.029), (26, 0.035), (27, 0.03), (28, 0.014), (33, 0.207), (67, 0.048), (69, 0.039), (70, 0.293), (72, 0.02), (80, 0.013), (87, 0.06)]
simIndex simValue paperId paperTitle
same-paper 1 0.75232118 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition
Author: Jia Deng, Jonathan Krause, Li Fei-Fei
Abstract: Fine-grained recognition concerns categorization at subordinate levels, where the distinction between object classes is highly local. Compared to basic level recognition, fine-grained categorization can be more challenging as there are in general less data and fewer discriminative features. This necessitates the use of stronger prior for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. The player’s goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. With proper setup the game generates discriminative bubbles with assured quality. We next propose the “BubbleBank” algorithm that uses the human selected bubbles to improve machine recognition performance. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.
2 0.72687435 23 cvpr-2013-A Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation
Author: Yinqiang Zheng, Shigeki Sugimoto, Masatoshi Okutomi
Abstract: Due to its simplicity, the eight-point algorithm has been widely used in fundamental matrix estimation. Unfortunately, the rank-2 constraint of a fundamental matrix is enforced via a posterior rank correction step, thus leading to non-optimal solutions to the original problem. To address this drawback, existing algorithms need to solve either a very high order polynomial or a sequence of convex relaxation problems, both of which are computationally ineffective and numerically unstable. In this work, we present a new rank-2 constrained eight-point algorithm, which directly incorporates the rank-2 constraint in the minimization process. To avoid singularities, we propose to solve seven subproblems and retrieve their globally optimal solutions by using tailored polynomial system solvers. Our proposed method is noniterative, computationally efficient and numerically stable. Experiment results have verified its superiority over existing algebraic error based algorithms in terms of accuracy, as well as its advantages when used to initialize geometric error based algorithms.
3 0.72276056 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow
Author: Brian Potetz, Mohammadreza Hajiarbabi
Abstract: For problems over continuous random variables, MRFs with large cliques pose a challenge in probabilistic inference. Difficulties in performing optimization efficiently have limited the probabilistic models explored in computer vision and other fields. One inference technique that handles large cliques well is Expectation Propagation. EP offers run times independent of clique size, which instead depend only on the rank, or intrinsic dimensionality, of potentials. This property would be highly advantageous in computer vision. Unfortunately, for grid-shaped models common in vision, traditional Gaussian EP requires quadratic space and cubic time in the number of pixels. Here, we propose a variation of EP that exploits regularities in natural scene statistics to achieve run times that are linear in both number of pixels and clique size. We test these methods on shape from shading, and we demonstrate strong performance not only for Lambertian surfaces, but also on arbitrary surface reflectance and lighting arrangements, which requires highly non-Gaussian potentials. Finally, we use large, non-local cliques to exploit cast shadow, which is traditionally ignored in shape from shading.
4 0.7084673 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources
Author: Lin Chen, Lixin Duan, Dong Xu
Abstract: In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos. We formulate this task as a new multi-domain adaptation problem with heterogeneous sources, in which the samples from different source domains can be represented by different types of features with different dimensions (e.g., the SIFT features from web images and space-time (ST) features from web videos) while the target domain samples have all types of features. To effectively cope with the heterogeneous sources where some source domains are more relevant to the target domain, we propose a new method called Multi-domain Adaptation with Heterogeneous Sources (MDA-HS) to learn an optimal target classifier, in which we simultaneously seek the optimal weights for different source domains with different types of features as well as infer the labels of unlabeled target domain data based on multiple types of features. We solve our optimization problem by using the cutting-plane algorithm based on group-based multiple kernel learning. Comprehensive experiments on two datasets demonstrate the effectiveness of MDA-HS for event recognition in consumer videos.
5 0.69180882 87 cvpr-2013-Compressed Hashing
Author: Yue Lin, Rong Jin, Deng Cai, Shuicheng Yan, Xuelong Li
Abstract: Recent studies have shown that hashing methods are effective for high dimensional nearest neighbor search. A common problem shared by many existing hashing methods is that in order to achieve satisfactory performance, a large number of hash tables (i.e., long codewords) are required. To address this challenge, in this paper we propose a novel approach called Compressed Hashing by exploring the techniques of sparse coding and compressed sensing. In particular, we introduce a sparse coding scheme, based on the approximation theory of integral operator, that generates sparse representations for high dimensional vectors. We then project sparse codes into a low dimensional space by effectively exploring the Restricted Isometry Property (RIP), a key property in compressed sensing theory. Both the theoretical analysis and the empirical studies on two large data sets show that the proposed approach is more effective than the state-of-the-art hashing algorithms.
6 0.65512985 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
7 0.63023984 63 cvpr-2013-Binary Code Ranking with Weighted Hamming Distance
8 0.62746233 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
9 0.62737757 62 cvpr-2013-Bilinear Programming for Human Activity Recognition with Unknown MRF Graphs
10 0.6272524 325 cvpr-2013-Part Discovery from Partial Correspondence
11 0.62717307 356 cvpr-2013-Representing and Discovering Adversarial Team Behaviors Using Player Roles
12 0.62702292 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
13 0.62608767 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
14 0.62556666 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
15 0.62553006 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
16 0.6248377 124 cvpr-2013-Determining Motion Directly from Normal Flows Upon the Use of a Spherical Eye Platform
17 0.62479937 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors
18 0.62395447 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
19 0.62369764 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
20 0.62356669 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning