iccv iccv2013 iccv2013-416 knowledge-graph by maker-knowledge-mining

416 iccv-2013-The Interestingness of Images


Source: pdf

Author: Michael Gygli, Helmut Grabner, Hayko Riemenschneider, Fabian Nater, Luc Van Gool

Abstract: We investigate human interest in photos. Based on our own and others’ psychological experiments, we identify various cues for “interestingness”, namely aesthetics, unusualness and general preferences. For the ranking of retrieved images, interestingness is more appropriate than cues proposed earlier. Interestingness is, for example, correlated with what people believe they will remember. This is opposed to actual memorability, which is uncorrelated to both of them. We introduce a set of features computationally capturing the three main aspects of visual interestingness that we propose and build an interestingness predictor from them. Its performance is shown on three datasets with varying context, reflecting diverse levels of prior knowledge of the viewers.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Based on our own and others’ psychological experiments, we identify various cues for “interestingness”, namely aesthetics, unusualness and general preferences. [sent-4, score-0.207]

2 For the ranking of retrieved images, interestingness is more appropriate than cues proposed earlier. [sent-5, score-0.826]

3 We introduce a set of features computationally capturing the three main aspects of visual interestingness that we propose and build an interestingness predictor from them. [sent-8, score-1.615]

4 These include image quality [15], memorability [14] and aesthetics [10, 11]. [sent-12, score-0.456]

5 Yet, a measure that would seem more relevant to automatically quantify is how interesting people find an image and this “interestingness” has hardly been studied so far. [sent-13, score-0.095]

6 [11], who used their high-level aesthetics features to train a classifier on Flickr’s interestingness. [sent-16, score-0.288]

7 It can predict the interestingness of these images well, but it is questionable that these results can be generalized to other image datasets. [sent-17, score-0.796]

8 Flickr’s interestingness [7] is based on social behavior, i. [sent-18, score-0.796]

9 This measure has not been shown to relate to what people find interesting in images. [sent-21, score-0.117]

10 Figure 1: Interestingness compared to aesthetics and memorability (panel labels: ↑mem ↓int, ↓mem ↑int). [sent-31, score-0.265]

11 In our own series of psychological experiments we analyze “interestingness” and how it relates to measures such as aesthetics and memorability (Fig. [sent-33, score-0.513]

12 There exists indeed a strong correlation between aesthetics and interestingness (Fig. [sent-35, score-1.107]

13 However, what is interesting does not necessarily need to be aesthetically pleasing, e. [sent-37, score-0.099]

14 While one would also expect a high correlation of memorability and interestingness, our experiments indicate the contrary (Fig. [sent-40, score-0.237]

15 2) and show that it is fundamentally different from other properties such as memorability; (ii) propose a set of features able to computationally capture the most important aspects of interestingness (Sec. [sent-44, score-0.849]

16 3); (iii) show the performance of these features and an interestingness predictor built from them, on three datasets with varying levels of context (Sec. [sent-45, score-0.838]

17 4); (iv) show that the context within which an image is viewed is crucial for the appraisal of interestingness. [sent-46, score-0.066]

18 Given the lack of clear-cut and quantifiable psychological findings, we investigate the correlation of interestingness with an extensive list of image attributes, including emotional, aesthetic and content related aspects. [sent-60, score-0.952]

19 2a we relate the provided image attributes to the interestingness ground truth we collected (c. [sent-64, score-0.818]

20 This figure shows the Spearman rank correlation of all attributes and highlights several with high correlations (either positive or negative). [sent-69, score-0.095]
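The Spearman rank correlation mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the sketch simply applies the Pearson correlation to average ranks, which is the standard way to handle ties.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(attribute_values, interestingness_scores):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(attribute_values),
                   average_ranks(interestingness_scores))
```

Calling `spearman` once per attribute against the same interestingness scores would give one correlation value per attribute, which is the kind of quantity the figure described above reports.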

21 When comparing the data of [13] with our own we find that people agree, to a large extent, on which images are interesting, despite personal preferences (c. [sent-78, score-0.169]

22 The cues that we implemented were selected on the basis of their experimentally verified correlation with interestingness. [sent-84, score-0.076]

23 Interestingness, on the other hand, has its highest correlation with this assumed memorability. [sent-90, score-0.072]

24 What a human observer finds interesting is what he wants to remember and believes he … (a) Interestingness correlated with an extensive set of image attributes, based on the data of [13]. [sent-91, score-0.129]

25 We compare the attributes to our interestingness score, collected as described in Sec. [sent-92, score-0.845]

26 (c) Correlation of scene categories and interest on the dataset of [18], interestingness scores obtained as described in Sec. [sent-98, score-0.917]

27 Additionally, we investigated the preference for certain scene types (Fig. [sent-104, score-0.131]

28 2c) and found, in agreement with [3], that people prefer natural outdoor scenes rather than man-made scenes. [sent-105, score-0.067]

29 While interestingness is higher for images containing sky, actual memorability decreases if sky is present. [sent-106, score-0.987]

30 Indeed, when comparing actual memorability and interestingness, we find them to be negatively correlated. [sent-107, score-0.214]

31 [13]), it makes more sense to select an interesting image, than a memorable but dull one. [sent-109, score-0.11]

32 Computational approach for interestingness prediction In this section we propose features that computationally capture the aspects/cues of interestingness which we found most important (c. [sent-112, score-1.645]

33 2) and are implementable: unusualness, aesthetics and general preferences. [sent-115, score-0.265]

34 Then, we use these to predict the interestingness of images. [sent-116, score-0.796]

35 Formally, given an image I, we are looking for an interestingness score s. [sent-118, score-0.066]

36 Our pipeline to achieve this task consists of two stages: (i) exploring various features to capture each of the above cues for interestingness and (ii) combining these individual features. [sent-119, score-0.879]

37 Instead, we want to capture unusualness in single images from arbitrary scenes. [sent-124, score-0.207]

38 Interestingly they found this feature to play a crucial role in the classification of image aesthetics (c. [sent-135, score-0.265]

39 1, correlation of interestingness and aesthetics ρ = 0. [sent-138, score-1.107]

40 The graph’s energy E(L) determines how unusual the configuration of patches is. [sent-148, score-0.083]

41 The unary cost D_i(l_i) is equal to the Euclidean distance in the descriptor space of a superpixel i to the nearest-neighboring superpixel in the database with label l_i. [sent-158, score-0.066]

42 With L being that optimal labeling, the unusualness by composition is defined as s_unusual^compose := E(L)/|S|, i. [sent-167, score-0.199]
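To make the normalisation E(L)/|S| concrete, here is a deliberately simplified sketch. The full energy did not survive in the text above, so this version keeps only the unary cost (distance to the nearest database superpixel with a given label) and drops any other terms. Under that assumption the best label can be picked independently per superpixel. All names, and the layout of the database as a dict from label to descriptors, are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def unary_cost(descriptor, database, label):
    """Distance from one superpixel descriptor to its nearest database
    descriptor carrying the given label."""
    return min(euclidean(descriptor, d) for d in database[label])


def unusualness_unary_only(superpixels, database):
    """Mean of the per-superpixel minimum unary cost.

    Simplification: with only unary terms, the optimal labeling chooses
    the cheapest label for each superpixel on its own, so the energy
    divided by the number of superpixels is just this mean.
    """
    total = 0.0
    for descriptor in superpixels:
        total += min(unary_cost(descriptor, database, label)
                     for label in database)
    return total / len(superpixels)
```

An image whose superpixels all have close matches in the database gets a score near zero; one that cannot be composed from database pieces gets a high score.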

43 Aesthetics To capture the aesthetics of an image, we propose several features that are rather simple in comparison to other, more extensive works in the area. [sent-173, score-0.318]

44 For example [11] uses content preferences, such as the presence of people and animals or the preference for certain scene types to classify aesthetically pleasing images. [sent-174, score-0.222]

45 We capture such general preferences with global scene descriptors in Sec. [sent-175, score-0.183]

46 Machajdik and Hanbury [17] extracted emotion scores from raw pixels. [sent-185, score-0.072]

47 Their features are based on the empirical findings of [25], which characterized emotions that are caused by color using the space of arousal, pleasure and dominance. [sent-186, score-0.113]

48 General preferences Following the observation that certain scene types tend to be more interesting than others (c. [sent-213, score-0.261]

49 of SIFT histograms [16] s_pref^pyr, and color histograms. Spatial pyramids and GIST are known to capture scene categories well. [sent-218, score-0.112]

50 As we use a linear model that assumes uncorrelated features, we also applied whitening to decorrelate the features before training the model. [sent-237, score-0.066]
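The whitening step can be illustrated as follows. The text does not say which whitening transform was used, so this sketch assumes Cholesky whitening: with sample covariance Σ = L Lᵀ, each centered feature vector x is mapped to z = L⁻¹x, which gives the transformed features identity covariance. Function names are hypothetical and the code uses only the standard library.

```python
import math


def mean_and_covariance(rows):
    """Column means and sample covariance matrix of a list of feature rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
            for b in range(d)] for a in range(d)]
    return mean, cov


def cholesky(matrix):
    """Lower-triangular L with L * L^T == matrix (must be positive definite)."""
    d = len(matrix)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                L[i][j] = (matrix[i][j] - s) / L[j][j]
    return L


def whiten(rows):
    """Center the rows and decorrelate them by solving L z = x per row."""
    mean, cov = mean_and_covariance(rows)
    L = cholesky(cov)
    whitened = []
    for r in rows:
        centered = [r[j] - mean[j] for j in range(len(r))]
        z = []
        for i in range(len(centered)):  # forward substitution
            s = sum(L[i][k] * z[k] for k in range(i))
            z.append((centered[i] - s) / L[i][i])
        whitened.append(z)
    return whitened
```

In practice the mean and L would be estimated on the training set only and then reused for validation and test features; the sketch folds both steps into one function for brevity.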

51 Experiments In this section we discuss the performance of the different interestingness features. [sent-244, score-0.796]

52 As we will see, the strength of the contextual cues that are relevant in the tested setting determines – in part – which types of features are most effective in capturing interestingness. [sent-245, score-0.075]

53 This agrees with [23], where it was shown sufficient to capture scene types and important objects. [sent-250, score-0.088]

54 As for the general preference features, we trained the ν-SVR on the training set and optimized the hyperparameters using grid search on the validation set. [sent-255, score-0.082]

55 Images with in-between scores are excluded in the computation of RP, as there is no clear agreement between individuals. [sent-265, score-0.07]

56 Suppose that s_i^* is the human interestingness score of image I_i, then TopN where P_N is the set of N images ranked highest by? [sent-267, score-0.888]

57 Since the presented webcam sequential evolving, there is a strong context viewer rates interestingness. [sent-273, score-0.087]

58 Secondly, we use of webcam images are in which a the 8 scene 11663366 GT score: 0. [sent-274, score-0.083]

59 Figure 3: An example (Sequence 1) out of the 20 webcam sequences [12] (GT: ground truth, Est: the estimated scores from our method). [sent-320, score-0.098]

60 Last, we use the memorability dataset [14], which contains arbitrary photographs and offers practically no context. [sent-322, score-0.191]

61 It is annotated with interestingness ground truth, acquired in a psychological study [12]. [sent-328, score-0.853]

62 The interestingness score of an image is calculated as the fraction of people who considered it interesting. [sent-329, score-0.912]

63 There are only a few interesting events in these streams (mean interestingness score of 0. [sent-330, score-0.928]

64 Interestingness is highly subjective and there are individuals who did not consider any image interesting in some sequences. [sent-332, score-0.066]

65 We tested each sequence separately and split the remaining sequences randomly into training and validation sets (80% for training / 20% for validation) to train the SVRs and the combination of the features. [sent-342, score-0.067]
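The evaluation protocol described above (test on one sequence, split the rest 80/20 into training and validation) can be sketched as a generator. Two assumptions are made here that the text does not confirm: the random split is done at the level of whole sequences rather than individual frames, and a fixed seed is used for repeatability. The function name is hypothetical.

```python
import random


def leave_one_sequence_out(sequence_ids, train_fraction=0.8, seed=0):
    """Yield (test_id, train_ids, validation_ids) once per sequence.

    Each sequence is held out for testing in turn; the remaining
    sequences are shuffled and split into training and validation.
    """
    rng = random.Random(seed)
    for test_id in sequence_ids:
        remaining = [s for s in sequence_ids if s != test_id]
        rng.shuffle(remaining)
        n_train = round(train_fraction * len(remaining))
        yield test_id, remaining[:n_train], remaining[n_train:]
```

With 20 sequences, each fold holds out one sequence and splits the other 19 into 15 for training and 4 for validation.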

66 the unusualness scores are computed with respect to the previous frames only (while [12] uses the whole sequence). [sent-346, score-0.209]
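Scoring each frame against earlier frames only can be sketched as below. The outlier measure is an assumption: the sketch uses the distance to the nearest previous frame in feature space as a stand-in, since the text here does not specify which measure is applied. Names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def online_unusualness(frame_features):
    """Score each frame using only the frames that came before it.

    The score is the distance to the nearest earlier frame; the first
    frame has no history and gets None.
    """
    scores = []
    for t, current in enumerate(frame_features):
        history = frame_features[:t]
        if not history:
            scores.append(None)
        else:
            scores.append(min(euclidean(current, past) for past in history))
    return scores
```

Because later frames never influence earlier scores, this kind of scheme can run on a live stream, unlike one that looks at the whole sequence.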

67 3b shows the correlation of predicted interestingness and ground truth score and Fig. [sent-356, score-0.936]

68 3d plots the Recall-Precision curve for the combination of features along with the five single features having the highest weights. [sent-357, score-0.091]

69 Yet, not everything predicted as unusual is rated as interesting by humans, e. [sent-359, score-0.177]

70 This is not unusual at the semantic level and therefore not considered interesting by humans. [sent-363, score-0.149]

71 Aesthetics and general preference features show a lower performance. [sent-365, score-0.076]

72 Figure 4: The 8 scene category dataset (GT: ground truth, Est: the estimated scores from our method). [sent-413, score-0.068]

73 Weak context: Scene categories dataset. The 8 scene categories dataset of Oliva and Torralba [18] consists of 2,688 images with a fixed size of 256 × 256 pixels. [sent-416, score-0.076]

74 The images are annotated with their scene categories, which allows us to investigate the correlation between scene types and interestingness. [sent-417, score-0.164]

75 We extended this dataset with an interestingness score by setting up a simple binary task on Amazon Mechanical Turk. [sent-421, score-0.862]

76 The interestingness score of an image was calculated as the fraction of selections over views. [sent-424, score-0.883]
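The score definition above (selections divided by views) is simple to compute from raw annotation events. The sketch below assumes a hypothetical log format of `(image_id, selected)` pairs, one per time an image was shown to a worker; the actual data layout of the study is not given in the text.

```python
from collections import Counter


def interestingness_scores(events):
    """Map each image id to (times selected) / (times viewed).

    `events` is an iterable of (image_id, selected) pairs, where
    `selected` is True if the viewer marked the image as interesting.
    """
    views = Counter()
    selections = Counter()
    for image_id, selected in events:
        views[image_id] += 1
        if selected:
            selections[image_id] += 1
    return {image_id: selections[image_id] / views[image_id]
            for image_id in views}
```

Images that were never shown do not appear in the result, which avoids a division by zero.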

77 The scene categories provide a weak context, given by the prior on the scene type, which allows us to capture novelty/unusualness, as outliers to what are typical images of a certain scene category. [sent-434, score-0.178]

78 The algorithm can only capture unusualness with respect to the training images (the prior knowledge of our algorithms), not the observer’s prior experience. [sent-437, score-0.207]

79 Therefore a viewer mainly rates the images in this dataset according to aesthetics and general preferences, which transpires from the performance of the individual features. [sent-439, score-0.286]

80 General preference features yield the highest performance, as they are able to capture scene type and illumination effects (s_pref^hist), such as the color of a sunset. [sent-440, score-0.168]

81 The features learn the preference for certain scene types (c. [sent-441, score-0.154]

82 Arbitrary photos: Memorability dataset. The memorability dataset consists of 2,222 images with a fixed size of 256 × 256 pixels. [sent-448, score-0.191]

83 It was introduced in [13, 14] to investigate the memorability of images (see examples in Fig. [sent-451, score-0.215]

84 Figure 5: The memorability dataset (GT: ground truth, Est: the estimated scores from our method). [sent-494, score-0.223]

85 Despite the different experimental setting, the scores obtained show a strong correlation (ρ = 0. [sent-501, score-0.078]

86 Unfortunately, we are not able to capture it for two reasons: (i) What is unusual or novel, in this unconstrained setting, depends on the prior knowledge of the observers, which is unknown to the algorithm. [sent-513, score-0.113]

87 (ii) Semantics are crucial in the appraisal of what is unusual in this dataset. [sent-514, score-0.149]

88 To predict the interestingness of such an image correctly, we need to understand such semantics. [sent-518, score-0.796]

89 It is clearly subjective and depends, to a certain degree, on personal preferences and prior knowledge. [sent-521, score-0.16]

90 We proposed a set of features able to capture interestingness in varying contexts. [sent-524, score-0.849]

91 With strong context, such as for static webcams, unusualness is the most important cue for interestingness. [sent-525, score-0.177]

92 In single, context-free images, general preferences for certain scene types are more important. [sent-526, score-0.195]

93 6 illustrates the importance of the different interestingness cues as context gets weaker. [sent-528, score-0.826]

94 To overcome the current limitations of interestingness prediction, one would need: (i) an extensive knowledge of what is known to most people, (ii) algorithms able to capture unusualness at the semantic level and (iii) knowledge about personal preferences of the observer. [sent-530, score-1.143]

95 Table 1: The interestingness cues and their performance on the 3 datasets (columns: Context, Cue, Feature, ρ, AP, Top5). [sent-538, score-0.826]

96 importance of unusualness features decreases, as the context becomes weak. [sent-569, score-0.219]

97 Studying aesthetics in photographic images using a computational approach. [sent-608, score-0.265]

98 High level describable attributes for predicting aesthetics and interestingness. [sent-616, score-0.314]

99 Affective image classification using features inspired by psychology and art theory. [sent-655, score-0.068]

100 Coherence progress: A measure of interestingness based on fixed compressors. [sent-668, score-0.796]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('interestingness', 0.796), ('aesthetics', 0.265), ('gt', 0.233), ('memorability', 0.191), ('unusualness', 0.177), ('preferences', 0.117), ('unusual', 0.083), ('isola', 0.07), ('appraisal', 0.066), ('arousal', 0.066), ('score', 0.066), ('interesting', 0.066), ('psychological', 0.057), ('preference', 0.053), ('attributes', 0.049), ('webcam', 0.047), ('correlation', 0.046), ('psychology', 0.045), ('pleasure', 0.044), ('memorable', 0.044), ('datta', 0.044), ('topn', 0.041), ('lof', 0.039), ('agreement', 0.038), ('scene', 0.036), ('spearman', 0.034), ('superpixel', 0.033), ('interest', 0.033), ('aesth', 0.033), ('aesthetically', 0.033), ('berlyne', 0.033), ('druey', 0.033), ('gtscore', 0.033), ('gygl', 0.033), ('leastinter', 0.033), ('mem', 0.033), ('mostinter', 0.033), ('saceosmthplex', 0.033), ('sppirxefel', 0.033), ('ssel', 0.033), ('stingbot', 0.033), ('outlier', 0.032), ('scores', 0.032), ('lj', 0.032), ('rp', 0.032), ('jpeg', 0.031), ('est', 0.031), ('int', 0.03), ('cues', 0.03), ('capture', 0.03), ('hayko', 0.029), ('irregularities', 0.029), ('aesthetic', 0.029), ('pleasing', 0.029), ('people', 0.029), ('validation', 0.029), ('novelty', 0.028), ('predicted', 0.028), ('biederman', 0.027), ('pleasant', 0.027), ('vessel', 0.027), ('compression', 0.027), ('highest', 0.026), ('pyramids', 0.026), ('superpixels', 0.025), ('conflict', 0.024), ('emotions', 0.024), ('investigate', 0.024), ('boiman', 0.023), ('colorful', 0.023), ('features', 0.023), ('smith', 0.023), ('personal', 0.023), ('negatively', 0.023), ('ii', 0.022), ('findings', 0.022), ('types', 0.022), ('relate', 0.022), ('correlated', 0.022), ('whitening', 0.022), ('flickr', 0.022), ('composition', 0.022), ('calculated', 0.021), ('cor', 0.021), ('emotion', 0.021), ('observer', 0.021), ('viewer', 0.021), ('dhar', 0.021), ('uncorrelated', 0.021), ('gist', 0.021), ('zurich', 0.02), ('workers', 0.02), ('certain', 0.02), ('remember', 0.02), ('categories', 0.02), ('combination', 0.019), ('raw', 0.019), ('sequences', 
0.019), ('context', 0.019), ('slic', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 416 iccv-2013-The Interestingness of Images

2 0.20463938 272 iccv-2013-Modifying the Memorability of Face Photographs

Author: Aditya Khosla, Wilma A. Bainbridge, Antonio Torralba, Aude Oliva

Abstract: Contemporary life bombards us with many new images of faces every day, which poses non-trivial constraints on human memory. The vast majority of face photographs are intended to be remembered, either because of personal relevance, commercial interests or because the pictures were deliberately designed to be memorable. Can we make a portrait more memorable or more forgettable automatically? Here, we provide a method to modify the memorability of individual face photographs, while keeping the identity and other facial traits (e.g. age, attractiveness, and emotional magnitude) of the individual fixed. We show that face photographs manipulated to be more memorable (or more forgettable) are indeed more often remembered (or forgotten) in a crowd-sourcing experiment with an accuracy of 74%. Quantifying and modifying the ‘memorability’ of a face lends itself to many useful applications in computer vision and graphics, such as mnemonic aids for learning, photo editing applications for social networks and tools for designing memorable advertisements.

3 0.053241558 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model

Author: Xiao Cai, Feiping Nie, Weidong Cai, Heng Huang

Abstract: Automatic image categorization has become increasingly important with the development of Internet and the growth in the size of image databases. Although the image categorization can be formulated as a typical multiclass classification problem, two major challenges have been raised by the real-world images. On one hand, though using more labeled training data may improve the prediction performance, obtaining the image labels is a time consuming as well as biased process. On the other hand, more and more visual descriptors have been proposed to describe objects and scenes appearing in images and different features describe different aspects of the visual characteristics. Therefore, how to integrate heterogeneous visual features to do the semi-supervised learning is crucial for categorizing large-scale image data. In this paper, we propose a novel approach to integrate heterogeneous features by performing multi-modal semi-supervised classification on unlabeled as well as unsegmented images. Considering each type of feature as one modality, taking advantage of the large amount of unlabeled data information, our new adaptive multimodal semi-supervised classification (AMMSS) algorithm learns a commonly shared class indicator matrix and the weights for different modalities (image features) simultaneously.

4 0.047006063 52 iccv-2013-Attribute Adaptation for Personalized Image Search

Author: Adriana Kovashka, Kristen Grauman

Abstract: Current methods learn monolithic attribute predictors, with the assumption that a single model is sufficient to reflect human understanding of a visual attribute. However, in reality, humans vary in how they perceive the association between a named property and image content. For example, two people may have slightly different internal models for what makes a shoe look “formal”, or they may disagree on which of two scenes looks “more cluttered”. Rather than discount these differences as noise, we propose to learn user-specific attribute models. We adapt a generic model trained with annotations from multiple users, tailoring it to satisfy user-specific labels. Furthermore, we propose novel techniques to infer user-specific labels based on transitivity and contradictions in the user’s search history. We demonstrate that adapted attributes improve accuracy over both existing monolithic models as well as models that learn from scratch with user-specific data alone. In addition, we show how adapted attributes are useful to personalize image search, whether with binary or relative attributes.

5 0.045061667 282 iccv-2013-Multi-view Object Segmentation in Space and Time

Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le Clerc, Patrick Pérez

Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidences in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state of the art methods but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.

6 0.041542098 53 iccv-2013-Attribute Dominance: What Pops Out?

7 0.039867751 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

8 0.039674122 238 iccv-2013-Learning Graphs to Match

9 0.038291622 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval

10 0.037973374 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines

11 0.037614692 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction

12 0.037547626 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects

13 0.03750838 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary

14 0.036925524 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context

15 0.036790743 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes

16 0.036124401 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search

17 0.035824697 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

18 0.035133187 414 iccv-2013-Temporally Consistent Superpixels

19 0.035039235 246 iccv-2013-Learning the Visual Interpretation of Sentences

20 0.03498508 1 iccv-2013-3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.098), (1, 0.018), (2, 0.005), (3, -0.042), (4, 0.037), (5, 0.001), (6, -0.014), (7, -0.023), (8, 0.023), (9, -0.01), (10, 0.005), (11, 0.011), (12, 0.008), (13, 0.008), (14, -0.014), (15, -0.005), (16, -0.03), (17, 0.005), (18, 0.009), (19, -0.009), (20, -0.008), (21, -0.02), (22, -0.012), (23, -0.016), (24, -0.005), (25, -0.003), (26, 0.018), (27, -0.001), (28, -0.007), (29, 0.005), (30, 0.008), (31, 0.015), (32, 0.002), (33, 0.022), (34, 0.029), (35, -0.013), (36, 0.03), (37, -0.039), (38, -0.024), (39, 0.024), (40, -0.01), (41, 0.039), (42, -0.064), (43, 0.009), (44, 0.004), (45, -0.048), (46, 0.056), (47, -0.049), (48, -0.021), (49, 0.008)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.85859787 416 iccv-2013-The Interestingness of Images

2 0.67303193 350 iccv-2013-Relative Attributes for Large-Scale Abandoned Object Detection

Author: Quanfu Fan, Prasad Gabbur, Sharath Pankanti

Abstract: Effective reduction of false alarms in large-scale video surveillance is rather challenging, especially for applications where abnormal events of interest rarely occur, such as abandoned object detection. We develop an approach to prioritize alerts by ranking them, and demonstrate its great effectiveness in reducing false positives while keeping good detection accuracy. Our approach benefits from a novel representation of abandoned object alerts by relative attributes, namely staticness, foregroundness and abandonment. The relative strengths of these attributes are quantified using a ranking function [19] learnt on suitably designed low-level spatial and temporal features. These attributes of varying strengths are not only powerful in distinguishing abandoned objects from false alarms such as people and light artifacts, but also computationally efficient for large-scale deployment. With these features, we apply a linear ranking algorithm to sort alerts according to their relevance to the end-user. We test the effectiveness of our approach on both public data sets and large ones collected from the real world.

3 0.62203538 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context

Author: Ming Shao, Liangyue Li, Yun Fu

Abstract: In this paper, we investigate the problem of recognizing occupations of multiple people with arbitrary poses in a photo. Previous work utilizing single person’s nearly frontal clothing information and fore/background context preliminarily proves that occupation recognition is computationally feasible in computer vision. However, in practice, multiple people with arbitrary poses are common in a photo, and recognizing their occupations is even more challenging. We argue that with appropriately built visual attributes, co-occurrence, and spatial configuration model that is learned through structure SVM, we can recognize multiple people’s occupations in a photo simultaneously. To evaluate our method’s performance, we conduct extensive experiments on a new well-labeled occupation database with 14 representative occupations and over 7K images. Results on this database validate our method’s effectiveness and show that occupation recognition is solvable in a more general case.

4 0.60059565 246 iccv-2013-Learning the Visual Interpretation of Sentences

Author: C. Lawrence Zitnick, Devi Parikh, Lucy Vanderwende

Abstract: Sentences that describe visual scenes contain a wide variety of information pertaining to the presence of objects, their attributes and their spatial relations. In this paper we learn the visual features that correspond to semantic phrases derived from sentences. Specifically, we extract predicate tuples that contain two nouns and a relation. The relation may take several forms, such as a verb, preposition, adjective or their combination. We model a scene using a Conditional Random Field (CRF) formulation where each node corresponds to an object, and the edges to their relations. We determine the potentials of the CRF using the tuples extracted from the sentences. We generate novel scenes depicting the sentences’ visual meaning by sampling from the CRF. The CRF is also used to score a set of scenes for a text-based image retrieval task. Our results show we can generate (retrieve) scenes that convey the desired semantic meaning, even when scenes (queries) are described by multiple sentences. Significant improvement is found over several baseline approaches.

5 0.58886266 332 iccv-2013-Quadruplet-Wise Image Similarity Learning

Author: Marc T. Law, Nicolas Thome, Matthieu Cord

Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.

6 0.5774985 365 iccv-2013-SIFTpack: A Compact Representation for Efficient SIFT Matching

7 0.56740093 327 iccv-2013-Predicting an Object Location Using a Global Image Representation

8 0.56516683 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation

9 0.56376868 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification

10 0.56086653 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors

11 0.56026798 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors

12 0.55859458 104 iccv-2013-Decomposing Bag of Words Histograms

13 0.55741322 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes

14 0.55693185 388 iccv-2013-Shape Index Descriptors Applied to Texture-Based Galaxy Analysis

15 0.55280656 248 iccv-2013-Learning to Rank Using Privileged Information

16 0.54385775 285 iccv-2013-NEIL: Extracting Visual Knowledge from Web Data

17 0.5413326 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

18 0.5391649 406 iccv-2013-Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time

19 0.5270862 191 iccv-2013-Handling Uncertain Tags in Visual Recognition

20 0.51991671 192 iccv-2013-Handwritten Word Spotting with Corrected Attributes


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.057), (7, 0.012), (8, 0.032), (12, 0.023), (26, 0.078), (31, 0.033), (35, 0.015), (42, 0.09), (44, 0.294), (64, 0.036), (73, 0.031), (89, 0.151), (98, 0.027)]
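
The topic weights above form a sparse vector: each pair is a topic id and its weight, and topics not listed are treated as zero. The page does not state how simValue is computed from these vectors, so the following is only a minimal sketch that assumes cosine similarity as the comparison measure. The function name `cosine_similarity` and the second vector `other_topics` are hypothetical and are not taken from the page.

```python
import math

# Sparse LDA topic vector for this paper, copied from the list above.
paper_topics = [(2, 0.057), (7, 0.012), (8, 0.032), (12, 0.023), (26, 0.078),
                (31, 0.033), (35, 0.015), (42, 0.09), (44, 0.294), (64, 0.036),
                (73, 0.031), (89, 0.151), (98, 0.027)]

# Hypothetical topic vector for another paper (made up for illustration only).
other_topics = [(3, 0.05), (44, 0.25), (89, 0.10)]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse (topicId, weight) lists.

    Assumption: cosine similarity is used here for illustration; the
    page's actual similarity measure is not documented.
    """
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity(paper_topics, other_topics))
```

Under this assumption a vector compared with itself scores 1. The same-paper entry in the list below has a simValue of 0.71062738 rather than 1, so the page evidently uses a different measure or normalization than plain cosine similarity.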

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.71062738 416 iccv-2013-The Interestingness of Images

Author: Michael Gygli, Helmut Grabner, Hayko Riemenschneider, Fabian Nater, Luc Van Gool

Abstract: We investigate human interest in photos. Based on our own and others’ psychological experiments, we identify various cues for “interestingness”, namely aesthetics, unusualness and general preferences. For the ranking of retrieved images, interestingness is more appropriate than cues proposed earlier. Interestingness is, for example, correlated with what people believe they will remember. This is opposed to actual memorability, which is uncorrelated to both of them. We introduce a set of features computationally capturing the three main aspects of visual interestingness that we propose and build an interestingness predictor from them. Its performance is shown on three datasets with varying context, reflecting diverse levels of prior knowledge of the viewers.

2 0.70610034 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

Author: Taegyu Lim, Seunghoon Hong, Bohyung Han, Joon Hee Han

Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate pose of the human from the videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization; segmentation step generates foreground mask for human pose tracking, and human pose tracking step provides foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively in real videos involving various challenges, and present its outstanding performance compared to the state-of-the-art techniques for segmentation and pose estimation.

3 0.70259768 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation

Author: Seunghoon Hong, Suha Kwak, Bohyung Han

Abstract: We propose a novel offline tracking algorithm based on model-averaged posterior estimation through patch matching across frames. Contrary to existing online and offline tracking methods, our algorithm is not based on temporally-ordered estimates of target state but attempts to select easy-to-track frames first out of the remaining ones without exploiting temporal coherency of target. The posterior of the selected frame is estimated by propagating densities from the already tracked frames in a recursive manner. The density propagation across frames is implemented by an efficient patch matching technique, which is useful for our algorithm since it does not require motion smoothness assumption. Also, we present a hierarchical approach, where a small set of key frames are tracked first and non-key frames are handled by local key frames. Our tracking algorithm is conceptually well-suited for the sequences with abrupt motion, shot changes, and occlusion. We compare our tracking algorithm with existing techniques in real videos with such challenges and illustrate its superior performance qualitatively and quantitatively.

4 0.66754031 86 iccv-2013-Concurrent Action Detection with Structural Prediction

Author: Ping Wei, Nanning Zheng, Yibiao Zhao, Song-Chun Zhu

Abstract: Action recognition has often been posed as a classification problem, which assumes that a video sequence only has one action class label and different actions are independent. However, a single human body can perform multiple concurrent actions at the same time, and different actions interact with each other. This paper proposes a concurrent action detection model where the action detection is formulated as a structural prediction problem. In this model, an interval in a video sequence can be described by multiple action labels. A detected action interval is determined both by the unary local detector and the relations with other actions. We use a wavelet feature to represent the action sequence, and design a composite temporal logic descriptor to describe the action relations. The model parameters are trained by structural SVM learning. Given a long video sequence, a sequential decision window search algorithm is designed to detect the actions. Experiments on our newly collected concurrent action dataset demonstrate the strength of our method.

5 0.63566917 447 iccv-2013-Volumetric Semantic Segmentation Using Pyramid Context Features

Author: Jonathan T. Barron, Mark D. Biggin, Pablo Arbeláez, David W. Knowles, Soile V.E. Keranen, Jitendra Malik

Abstract: We present an algorithm for the per-voxel semantic segmentation of a three-dimensional volume. At the core of our algorithm is a novel “pyramid context” feature, a descriptive representation designed such that exact per-voxel linear classification can be made extremely efficient. This feature not only allows for efficient semantic segmentation but enables other aspects of our algorithm, such as novel learned features and a stacked architecture that can reason about self-consistency. We demonstrate our technique on 3D fluorescence microscopy data of Drosophila embryos for which we are able to produce extremely accurate semantic segmentations in a matter of minutes, and for which other algorithms fail due to the size and high-dimensionality of the data, or due to the difficulty of the task.

6 0.62286317 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects

7 0.58122158 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

8 0.5738523 150 iccv-2013-Exemplar Cut

9 0.57071209 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

10 0.57057035 3 iccv-2013-3D Sub-query Expansion for Improving Sketch-Based Multi-view Image Retrieval

11 0.57026821 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition

12 0.56922656 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation

13 0.56901443 349 iccv-2013-Regionlets for Generic Object Detection

14 0.56882608 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

15 0.56843746 379 iccv-2013-Semantic Segmentation without Annotating Segments

16 0.5681209 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning

17 0.5680148 6 iccv-2013-A Convex Optimization Framework for Active Learning

18 0.56748307 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation

19 0.56710052 186 iccv-2013-GrabCut in One Cut

20 0.56687856 272 iccv-2013-Modifying the Memorability of Face Photographs