iccv iccv2013 iccv2013-325 knowledge-graph by maker-knowledge-mining

325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields


Source: pdf

Author: Hyun Soo Park, Eakta Jain, Yaser Sheikh

Abstract: We present a method to predict primary gaze behavior in a social scene. Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. This field model is used to predict primary gaze behavior at any location or time in the scene. We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. We validate the model by evaluating its predictive precision via cross-validation in a variety of social scenes.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: We present a method to predict primary gaze behavior in a social scene. [sent-6, score-1.262]

2 Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. [sent-7, score-1.911]

3 These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. [sent-8, score-2.275]

4 We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. [sent-10, score-1.166]

5 Introduction: Humans interact, in part, by transmitting and receiving social signals, such as gaze direction, voice tone, or facial expression [36, 48]. [sent-13, score-1.01]

6 Inspired by Coulomb’s law, which describes the electrostatic interaction between charged particles, we present a model to describe the primary gaze behavior of individuals in a social scene. [sent-20, score-1.256]

7 We characterize how information of the time-varying location and charge of multiple moving social charges is combined to induce a social saliency field analogous to an electric field. [sent-24, score-2.295]

8 We use this feature to also establish correspondence of social charges over time. [sent-36, score-1.032]

9 Such models can also be used within a filtering framework to more effectively track primary gaze directions in a social scene. [sent-42, score-1.195]

10 In augmented reality applications, predictive models of primary gaze behavior will enable the insertion of believable virtual characters into social scenes that respond to the social dynamics of a scene. [sent-43, score-1.875]

11 We validate our social charge model on four real world sequences where various human interactions occur, including a social game, office meetings, and an informal party. [sent-45, score-1.54]

12 Related Work: We review prior research on representing social scenes and predicting gaze directions. [sent-51, score-1.01]

13 A generalized F-formation concept has been applied to estimate social attention where gaze directions intersect in the scene [2, 10, 29, 34]. [sent-68, score-1.097]

14 Time is another axis along which to represent a social scene, because social scenes often include dynamic human interactions. [sent-69, score-1.236]

15 Friesen and Kingstone [11] showed that gaze is a strong social attention stimulus that can trigger attention shift. [sent-117, score-1.078]

16 We present a novel predictive representation based on the concept of latent social charges for any 3D location or time, and validate it on real measurements of 3D gaze behavior. [sent-120, score-1.526]

17 We model the relationship between a primary gaze direction and a social charge via a social saliency field inspired by Coulomb’s law. [sent-122, score-2.346]

18 The two social charges (the blue and green points) generate the social saliency field shown in the left figure. [sent-123, score-1.99]

19 The size of the social charges is proportional to social saliency. [sent-124, score-1.626]

20 Primary Gaze Behavior Prediction: A social member is a participant in a social scene in which multiple members interact with each other. [sent-127, score-1.398]

21 In this paper, we predict the primary gaze direction at any 3D location and time, given the observed gaze behavior of the members. [sent-134, score-1.136]

22 Inspired by Coulomb’s law, we generatively model the relationship between primary gaze directions via latent social charges that drive the attention of members in the scene; we show that this approach demonstrates superior predictive precision in the presence of missing and noisy measurements. [sent-144, score-1.87]

23 According to Coulomb’s law, the force exerted on an electric charge due to the presence of another electric charge is directed along the line that connects these two charges. [sent-145, score-0.904]

24 We represent a social charge as Q = (q, r), where q ∈ R is a measure of social saliency. [sent-146, score-0.923]

25 Here r ∈ R^3 is the location of the charge, and the decay of the spatial influence of the social charge is modeled as an inverse squared function (as with the classic electric field model). [sent-149, score-1.104]

26 A social charge is a quantity that changes over time because the scene includes dynamic human interactions. [sent-150, score-0.947]

27 There may exist multiple social charges, {Q_i}_{i=1}^{I}, when multiple social groups are formed, where I is the number of charges. [sent-151, score-1.188]

28 Estimating the social charges given the primary gaze directions of the members is equivalent to optimizing the likelihood in Equation (2), {Q_i^*}_{i=1}^{I} = argmax_{{Q_i}_{i=1}^{I}} L. [sent-155, score-1.751]

29 This estimates the optimal {Q_i^*}_{i=1}^{I} such that each observed primary gaze direction is oriented towards one of the social charges. [sent-158, score-1.028]

30 From these social charges, we can predict the most likely primary gaze direction at p by maximizing the probability in Equation (3), v^* = argmax_v p(v). [sent-159, score-1.254]
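
A plausible LaTeX reconstruction of the two garbled optimizations above, reading Equation (2) as maximum-likelihood estimation of the charges from the observed gaze data and Equation (3) as the most likely gaze direction at a position p (the exact conditioning in the paper may differ), is:

```latex
% Assumed reconstruction of Equations (2) and (3) from the garbled extraction.
\{Q_i^*\}_{i=1}^{I} = \operatorname*{argmax}_{\{Q_i\}_{i=1}^{I}}
    L\!\left( \{(\mathbf{v}_j, \mathbf{p}_j)\}_{j=1}^{J} \,\middle|\, \{Q_i\}_{i=1}^{I} \right) \tag{2}

\mathbf{v}^* = \operatorname*{argmax}_{\mathbf{v}}
    \, p\!\left( \mathbf{v} \,\middle|\, \mathbf{p}, \{Q_i^*\}_{i=1}^{I} \right) \tag{3}
```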

31 We will develop a computational representation for the relationship between social charges and primary gaze directions to predict the primary gaze behavior by optimizing Equation (3) in Section 4. [sent-170, score-2.301]

32 Based on the relationship, we present a method to estimate the latent social charges given primary gaze behaviors of the observers by optimizing Equation (2) in Section 5. [sent-171, score-1.656]

33 Social Saliency Field: In this section, we present a computational model that captures the relationship between time-varying social charges and primary gaze behavior. [sent-173, score-1.604]

34 The charges induce a social saliency field that enables us to define a probability of the primary gaze direction given a location and time in Equation (3). [sent-174, score-1.878]

35 A comparison between the social saliency field and the electric field can be found in Table 1. [sent-175, score-0.997]

36 The force between two charges, Q = (q, r) and Q_x = (q_x, x), from Coulomb’s law is F = K q q_x (r − x) / ||r − x||^3. [sent-180, score-0.878]

37 The force between two charges is proportional to the magnitudes of the charges and inversely proportional to the square of the distance between them. [sent-183, score-0.92]

38 We posit that a negative social charge, q, exerts an attractive force on a member (with an infinitesimal positive charge), along the line connecting the two charges, (r − x)/||r − x||. [sent-187, score-1.163]

39 Analogous to the electric field, a social saliency field is defined by the limiting process S(x) = lim_{q_x → 0} F / q_x = K q (r − x) / ||r − x||^3 (Equation 5). [sent-196, score-0.917]

40 Here S(x) is the social saliency field evaluated at x, induced by a single social charge, Q = (q, r). [sent-198, score-1.424]
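
As a concrete illustration of Equation (5) and of the selective combination described in the next sentences, here is a minimal numpy sketch; the constant K, the example charge values, and the pick-the-largest-magnitude selection rule are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def saliency_field(x, charges, K=1.0):
    """Net social saliency field S(x) at a 3D point x.

    charges: list of (q, r) pairs -- saliency magnitude q, 3D location r.
    Per Equation (5), one charge induces S_i(x) = K*q*(r - x)/||r - x||^3.
    Unlike the electric field, the net field selectively takes ONE of the
    per-charge fields; picking the largest-magnitude one is an assumption.
    """
    fields = [K * q * (r - x) / np.linalg.norm(r - x) ** 3 for q, r in charges]
    return max(fields, key=lambda s: float(np.linalg.norm(s)))

# Example: the field at the origin points toward the dominant charge.
charges = [(2.0, np.array([1.0, 0.0, 0.0])),
           (1.0, np.array([0.0, 3.0, 0.0]))]
S = saliency_field(np.array([0.0, 0.0, 0.0]), charges)
```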

41 When multiple electric charges exist, the net electric field induced by the charges is the superposition of the electric fields of all charges. [sent-199, score-1.299]

42 Unlike the electric field, the net social saliency field selectively takes one of the per-charge social saliency fields, i.e., one of {S_i(x)}_{i=1}^{I}. [sent-204, score-1.679]

43 Here S_i(x) is the social saliency field induced by the ith social charge, Q_i. [sent-209, score-1.424]

44 To reflect the selective gaze behavior, we model the underlying probability distribution of a primary gaze direction using a mixture of von Mises-Fisher distributions. [sent-210, score-1.043]

45 Each component measures the distance between the primary gaze direction, v, and the unit vector of each social saliency field, S_i/||S_i||. [sent-235, score-1.308]
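
A minimal sketch of this mixture and of the gaze prediction, assuming the 3D von Mises-Fisher normalization, fixed mixture weights, and a fixed concentration kappa (the paper estimates such quantities from data; the sampling-based argmax is also an assumption):

```python
import numpy as np

def vmf_pdf(v, mu, kappa):
    """von Mises-Fisher density on the unit sphere in 3D."""
    return kappa / (4.0 * np.pi * np.sinh(kappa)) * np.exp(kappa * np.dot(mu, v))

def gaze_probability(v, fields, weights, kappa=5.0):
    """Mixture of vMF components, one mean direction S_i/||S_i|| per field."""
    mus = [S / np.linalg.norm(S) for S in fields]
    return sum(w * vmf_pdf(v, mu, kappa) for w, mu in zip(weights, mus))

def predict_gaze(fields, weights, kappa=5.0, n_samples=2000):
    """Approximate v* = argmax_v p(v) by scoring random unit directions."""
    rng = np.random.default_rng(0)
    vs = rng.normal(size=(n_samples, 3))
    vs /= np.linalg.norm(vs, axis=1, keepdims=True)
    scores = [gaze_probability(v, fields, weights, kappa) for v in vs]
    return vs[int(np.argmax(scores))]
```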

46 Each social charge may move independently depending on the primary gaze behavior of the participating group. [sent-241, score-0.998]

47 A charge is defined as Q(t) = (q(t), r(t)) for t_e ≤ t ≤ t_d, and is undefined otherwise (Equation 8); a primary gaze direction is not oriented towards an average location between two social charges but towards one of the charges. [sent-244, score-1.692]

48 Given the saliency field from each charge at each time instant, the net time-varying saliency field can be written as S(x, t), selecting one of the per-charge fields {S_i(x, t)}_{i=1}^{I} at each instant. [sent-248, score-0.799]
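
Putting Equations (5) and (8) together, a hedged sketch of the time-varying net field; the dict layout for a charge (callables q(t), r(t) plus scalars t_e, t_d) is an assumption:

```python
import numpy as np

def single_field(x, q, r, K=1.0):
    """Field of one charge at point x, as in Equation (5)."""
    d = r - x
    return K * q * d / np.linalg.norm(d) ** 3

def net_field(x, t, charges):
    """Net time-varying field S(x, t): a charge contributes only while
    t_e <= t <= t_d (Equation (8)); one active field is then selected."""
    active = [single_field(x, c["q"](t), c["r"](t))
              for c in charges if c["t_e"] <= t <= c["t_d"]]
    if not active:
        return np.zeros(3)
    return max(active, key=lambda s: float(np.linalg.norm(s)))
```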

49 Social Saliency Field Estimation: In this section, we present a method to estimate the time-varying location and magnitude of the social charges {Q_i(t)}_{i=1}^{I}, given the primary gaze directions of members, {(v_j(t), p_j(t))}_{j=1}^{J}, in the scene. [sent-252, score-1.694]

50 We find optimal estimates of {Q_i}_{i=1}^{I} that explain the observed primary gaze directions, {(v_j, p_j)}_{j=1}^{J}, given the number of social charges. [sent-271, score-1.166]

51 In the expectation step, we estimate the membership of each social charge given the social charge locations. [sent-275, score-1.927]

52 This also allows us to compute the social saliency, q_i, from the membership weights γ_ij. [sent-291, score-0.8]

53 In the maximization step, we estimate the social charge locations. [sent-295, score-0.923]
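
A rough, simplified sketch of this EM loop; the vMF-style responsibilities in the E-step and the weighted ray-intersection M-step are assumed stand-ins for the paper's exact updates:

```python
import numpy as np

def em_charges(V, P, R0, kappa=5.0, n_iters=20):
    """Estimate charge locations from gaze data (a simplified sketch).

    V: (J, 3) unit gaze directions; P: (J, 3) member positions;
    R0: (I, 3) initial charge locations (from the detector below).
    """
    R = R0.copy()
    I, J = len(R), len(V)
    q = np.ones(I)
    for _ in range(n_iters):
        # E-step: gamma[i, j] = responsibility of charge i for member j,
        # from how well member j's gaze points at charge i.
        gamma = np.zeros((I, J))
        for i in range(I):
            d = R[i] - P
            d /= np.linalg.norm(d, axis=1, keepdims=True)
            gamma[i] = np.exp(kappa * np.sum(d * V, axis=1))
        gamma /= gamma.sum(axis=0, keepdims=True)
        # M-step: move each charge to the weighted least-squares point
        # closest to the gaze rays p_j + s * v_j.
        for i in range(I):
            A = np.zeros((3, 3))
            b = np.zeros(3)
            for j in range(J):
                M = np.eye(3) - np.outer(V[j], V[j])  # project off ray dir
                A += gamma[i, j] * M
                b += gamma[i, j] * (M @ P[j])
            R[i] = np.linalg.solve(A, b)
        q = gamma.sum(axis=1)  # saliency as membership mass (assumed proxy)
    return R, q
```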

54 The trajectories of the social charges are illustrated in Figure 2(b). [sent-315, score-1.032]

55 Note that the emergence and dissolution times, t_e and t_d, are the same for all social charges in Equation (14). [sent-334, score-1.099]

56 In practice, we split the time windows such that the number of the social charges remains constant for each time window. [sent-335, score-1.032]

57 This EM method requires prior knowledge of the number of social charges and a good initialization of {Q_i}_{i=1}^{I}. [sent-336, score-1.032]

58 Initialization: Detecting social charges in a static scene has been presented by Fathi et al. [sent-340, score-1.056]

59 We present a method to track the detected social charges across time based on membership features to initialize the EM algorithm. [sent-346, score-1.113]

60 Each element of the membership feature indicates the probability that the jth member belongs to the ith social charge, obtained by Equation (11). [sent-348, score-1.056]

61 This membership feature enables us to describe a social charge in terms of the participating members. [sent-365, score-1.033]

62 The membership feature of a social charge retains a similar pattern across time because the same members tend to stay in their social clique, as shown in Figure 2(a). [sent-366, score-1.716]

63 We compute the membership features of all the detected social charges and cluster the charges using the classic meanshift algorithm [12] based on the features. [sent-367, score-1.568]

64 A set of the charges clustered by the same label forms a trajectory of a single social charge. [sent-369, score-1.032]

65 When multiple charges at the same time instant are labeled in a single cluster, we choose the charge that is close to the center of the feature cluster. [sent-370, score-0.785]

66 The social charge representation via a membership feature enables us to track a social charge in a manner invariant to location and time. [sent-371, score-1.927]

67 This introduces missing data because of temporary dissolution of the social charge as shown in Figure 2(b). [sent-374, score-0.965]

68 Our tracking method can re-associate with the re-emerging charges based on the membership feature clustering because two temporally separated trajectories of the social charge have the same membership feature. [sent-375, score-1.54]
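
The clustering-and-tracking steps above (sentences 59-68) can be sketched with scikit-learn's MeanShift; the membership-feature matrix here is synthetic, and only the clustering call itself mirrors the text:

```python
import numpy as np
from sklearn.cluster import MeanShift

# Membership features: one row per detected charge (over all frames); each
# row holds the J member-participation probabilities gamma_ij of Equation
# (11). Random rows stand in for real detections in this sketch.
rng = np.random.default_rng(0)
features = rng.random((50, 11))
features /= features.sum(axis=1, keepdims=True)

labels = MeanShift().fit(features).labels_
# Detections sharing a label form one charge's trajectory; a re-emerging
# charge re-associates because its membership pattern matches its cluster.
```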

69 Results: We validate our social saliency field model and evaluate the prediction accuracy, quantitatively and qualitatively, via four real world sequences capturing various human interactions from third person and first person cameras. [sent-377, score-0.899]

70 We leave out one of the members and estimate the time-varying social charges from the primary gaze behaviors of the rest of members. [sent-383, score-1.74]

71 Using the estimated social charges, we evaluate the predictive validity of the left-out primary gaze direction. [sent-384, score-1.231]

72 We run this cross validation scheme and measure the angle difference between the predicted gaze direction and the ground truth gaze direction. [sent-385, score-0.867]
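
The per-prediction error in this scheme is the angle between two unit vectors; a small helper (the leave-one-out loop is sketched in the comment):

```python
import numpy as np

def angular_error_deg(v_pred, v_true):
    """Angle in degrees between predicted and ground-truth gaze directions."""
    cos = np.dot(v_pred, v_true) / (np.linalg.norm(v_pred) * np.linalg.norm(v_true))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Leave-one-out sketch: for each member j, estimate the charges from the
# other members' gaze data, predict member j's gaze, and record the angle.
```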

73 We randomly choose k members. (First person cameras refer to head-mounted or wearable cameras that produce video from the point of view of the wearer; third person cameras refer to infrastructure cameras in the scene looking at the social interaction.) [sent-393, score-0.852]

74 We exploit social charge motion estimated by other members to regulate the noisy face tracking process. [sent-400, score-1.076]

75 We estimate the social charges from randomly chosen observed members (E to J) and predict the primary gaze directions of the unobserved members (A to D). [sent-403, score-1.897]

76 We choose k out of 11 members and predict their primary gaze directions using the remaining (11−k) primary gaze directions. [sent-407, score-1.201]

77 The orange vector field and dark gray vector field in Figure 3 are the RBF regression model and a social saliency field, respectively. [sent-408, score-0.896]

78 The social saliency field outperforms the RBF regression in three aspects: (1) the social saliency field is insensitive to outliers, while the RBF regression is often biased by them. [sent-409, score-1.632]

79 Two sequences (Party and Meeting) are used to estimate the social saliency field as shown in Figure 5(c) and 5(d). [sent-421, score-0.816]

80 Given the measurements, i.e., primary gaze directions, we generated a social saliency field as shown in Figure 5(a). [sent-431, score-1.388]

81 We estimated social charge motion from other members and fused gaze prediction by the social saliency field with the face orientation estimate at each frame from the PittPatt system. [sent-433, score-2.323]

82 While the players interrogate each other during the game, the social charge stays in the group. [sent-438, score-0.923]

83 We estimate a social saliency field from both third person cameras and first person cameras. [sent-445, score-0.866]

84 (a) A social charge is formed at the presenter and splits into two subgroups in the meeting scene. [sent-446, score-1.084]

85 The social saliency field reflects the selective gaze behavior. [sent-447, score-1.252]

86 We also apply our method to estimate a social saliency field on a public dataset provided by Park et al. [sent-450, score-0.816]

87 In Figure 5(b), we estimate the social charge motion. [sent-453, score-0.923]

88 In most cases, the social charge stays near the player who is being investigated. [sent-454, score-0.944]

89 Based on the social saliency field, we show that we can detect the outliers whose primary gaze direction does not behave in accord with social attention. [sent-455, score-1.937]

90 Thus, to build perceptual systems that can similarly interpret human social interaction, those systems need to be equipped with internal models of social behavior that they can appeal to when direct measurements from data are noisy or insufficient. [sent-460, score-1.256]

91 The social saliency field model we present in this paper is a step towards this vision. [sent-461, score-0.834]

92 By describing the activity in the scene in terms of the motion of latent social charges, we move beyond measuring scene activity, and towards understanding the narrative of the events of the scene, as interpreted by the members of the social group itself. [sent-462, score-1.386]

93 We present the social saliency field induced by the motion of social charges as a model to predict primary gaze behavior of people in a social scene. [sent-464, score-3.124]

94 The motion of the charges is estimated from the observed primary gaze behavior of members of a social scene. [sent-465, score-1.79]

95 The net social saliency field is created by selecting the maximum of a mixture of von Mises-Fisher distributions, each produced by a different social charge. [sent-466, score-1.436]

96 We evaluate the predictive validity of spatial and temporal forecasting on real sequences and demonstrate that the social saliency field model is supported empirically. [sent-467, score-0.881]

97 The principal assumption in the model is the conditional independence of gaze behavior between two observers given the behavior of the social charges. [sent-469, score-1.166]

98 In practice, the gaze behavior of each observer in the scene is known to have a degree of influence on the gaze behavior of other observers [9, 38]. [sent-470, score-1.012]

99 In this paper, we limited our analysis to a single social signal: primary gaze behavior. [sent-472, score-1.166]

100 This field would include the influence of observer prediction of the behavior of social charges. [sent-475, score-0.774]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('social', 0.594), ('charges', 0.438), ('gaze', 0.416), ('charge', 0.329), ('primary', 0.156), ('saliency', 0.142), ('members', 0.118), ('electric', 0.101), ('membership', 0.081), ('field', 0.08), ('behavior', 0.068), ('jj', 0.068), ('qi', 0.064), ('member', 0.052), ('predictive', 0.047), ('causality', 0.046), ('vj', 0.045), ('pj', 0.044), ('force', 0.044), ('coulomb', 0.042), ('dissolution', 0.042), ('proxemics', 0.037), ('direction', 0.035), ('anomaly', 0.035), ('attention', 0.034), ('cristani', 0.033), ('prediction', 0.032), ('game', 0.031), ('rbf', 0.031), ('directions', 0.029), ('participating', 0.029), ('eyes', 0.029), ('predict', 0.028), ('meeting', 0.026), ('ii', 0.026), ('net', 0.026), ('argvmaxp', 0.025), ('enet', 0.025), ('mafia', 0.025), ('menegaz', 0.025), ('pittpatt', 0.025), ('emergence', 0.025), ('scene', 0.024), ('park', 0.024), ('interactions', 0.023), ('helbing', 0.022), ('interaction', 0.022), ('cameras', 0.022), ('player', 0.021), ('observers', 0.02), ('selective', 0.02), ('bazzani', 0.02), ('microscopic', 0.02), ('posit', 0.02), ('spatiotemporal', 0.019), ('towards', 0.018), ('law', 0.018), ('validity', 0.018), ('instant', 0.018), ('crowd', 0.018), ('face', 0.018), ('abnormal', 0.018), ('behaviors', 0.018), ('tracking', 0.017), ('location', 0.017), ('autism', 0.017), ('birmingham', 0.017), ('diagnoses', 0.017), ('dissolve', 0.017), ('inattentional', 0.017), ('kingstone', 0.017), ('marshall', 0.017), ('mehran', 0.017), ('oop', 0.017), ('paggetti', 0.017), ('presenter', 0.017), ('raghavendra', 0.017), ('sophie', 0.017), ('ttd', 0.017), ('vvj', 0.017), ('meanshift', 0.017), ('equation', 0.017), ('si', 0.017), ('fisher', 0.016), ('interact', 0.016), ('players', 0.016), ('cmu', 0.015), ('qx', 0.015), ('orienting', 0.015), ('friesen', 0.015), ('abowd', 0.015), ('vinciarelli', 0.015), ('macroscopic', 0.015), ('ssii', 0.015), ('ssjjk', 0.015), ('attractive', 0.015), ('latent', 0.014), ('induced', 0.014), ('person', 0.014), ('itti', 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields

Author: Hyun Soo Park, Eakta Jain, Yaser Sheikh

Abstract: We present a method to predict primary gaze behavior in a social scene. Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. This field model is used to predict primary gaze behavior at any location or time in the scene. We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. We validate the model by evaluating its predictive precision via cross-validation in a variety of social scenes.

2 0.40701428 67 iccv-2013-Calibration-Free Gaze Estimation Using Human Gaze Patterns

Author: Fares Alnajar, Theo Gevers, Roberto Valenti, Sennay Ghebreab

Abstract: We present a novel method to auto-calibrate gaze estimators based on gaze patterns obtained from other viewers. Our method is based on the observation that the gaze patterns of humans are indicative of where a new viewer will look at [12]. When a new viewer is looking at a stimulus, we first estimate a topology of gaze points (initial gaze points). Next, these points are transformed so that they match the gaze patterns of other humans to find the correct gaze points. In a flexible uncalibrated setup with a web camera and no chin rest, the proposed method was tested on ten subjects and ten images. The method estimates the gaze points after looking at a stimulus for a few seconds with an average accuracy of 4.3◦. Although the reported performance is lower than what could be achieved with dedicated hardware or calibrated setup, the proposed method still provides a sufficient accuracy to trace the viewer attention. This is promising considering the fact that auto-calibration is done in a flexible setup , without the use of a chin rest, and based only on a few seconds of gaze initialization data. To the best of our knowledge, this is the first work to use human gaze patterns in order to auto-calibrate gaze estimators.

3 0.37180009 247 iccv-2013-Learning to Predict Gaze in Egocentric Video

Author: Yin Li, Alireza Fathi, James M. Rehg

Abstract: We present a model for gaze prediction in egocentric video by leveraging the implicit cues that exist in camera wearer’s behaviors. Specifically, we compute the camera wearer’s head motion and hand location from the video and combine them to estimate where the eyes look. We further model the dynamic behavior of the gaze, in particular fixations, as latent variables to improve the gaze prediction. Our gaze prediction results outperform the state-of-the-art algorithms by a large margin on publicly available egocentric vision datasets. In addition, we demonstrate that we get a significant performance boost in recognizing daily actions and segmenting foreground objects by plugging in our gaze predictions into state-of-the-art methods.

4 0.28559077 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

Author: Huiying Liu, Dong Xu, Qingming Huang, Wen Li, Min Xu, Stephen Lin

Abstract: We present a method for estimating human scanpaths, which are sequences of gaze shifts that follow visual attention over an image. In this work, scanpaths are modeled based on three principal factors that influence human attention, namely low-levelfeature saliency, spatialposition, and semantic content. Low-level feature saliency is formulated as transition probabilities between different image regions based on feature differences. The effect of spatial position on gaze shifts is modeled as a Levy flight with the shifts following a 2D Cauchy distribution. To account for semantic content, we propose to use a Hidden Markov Model (HMM) with a Bag-of-Visual-Words descriptor of image regions. An HMM is well-suited for this purpose in that 1) the hidden states, obtained by unsupervised learning, can represent latent semantic concepts, 2) the prior distribution of the hidden states describes visual attraction to the semantic concepts, and 3) the transition probabilities represent human gaze shift patterns. The proposed method is applied to task-driven viewing processes. Experiments and analysis performed on human eye gaze data verify the effectiveness of this method.

5 0.13861077 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction

Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti

Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.

6 0.12401581 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics

7 0.11415014 71 iccv-2013-Category-Independent Object-Level Saliency Detection

8 0.11198549 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction

9 0.10597574 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context

10 0.093669154 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection

11 0.089323327 396 iccv-2013-Space-Time Robust Representation for Action Recognition

12 0.081067272 370 iccv-2013-Saliency Detection in Large Point Sets

13 0.073638067 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features

14 0.068689838 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

15 0.064140245 369 iccv-2013-Saliency Detection: A Boolean Map Approach

16 0.06282033 371 iccv-2013-Saliency Detection via Absorbing Markov Chain

17 0.061986133 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

18 0.050058059 157 iccv-2013-Fast Face Detector Training Using Tailored Views

19 0.04713995 216 iccv-2013-Inferring "Dark Matter" and "Dark Energy" from Videos

20 0.043292075 167 iccv-2013-Finding Causal Interactions in Video Sequences


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.095), (1, -0.014), (2, 0.201), (3, -0.092), (4, -0.052), (5, -0.023), (6, 0.076), (7, -0.036), (8, 0.025), (9, 0.068), (10, -0.005), (11, -0.089), (12, -0.069), (13, 0.077), (14, -0.032), (15, 0.157), (16, -0.32), (17, 0.08), (18, 0.178), (19, -0.215), (20, -0.063), (21, 0.01), (22, -0.066), (23, 0.022), (24, -0.036), (25, 0.016), (26, -0.033), (27, -0.024), (28, -0.021), (29, -0.004), (30, 0.007), (31, 0.021), (32, 0.002), (33, 0.022), (34, -0.011), (35, 0.007), (36, 0.03), (37, 0.008), (38, 0.032), (39, -0.006), (40, -0.029), (41, -0.027), (42, 0.004), (43, 0.026), (44, 0.032), (45, 0.005), (46, 0.01), (47, -0.036), (48, 0.001), (49, -0.001)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96581304 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields

Author: Hyun Soo Park, Eakta Jain, Yaser Sheikh

Abstract: We present a method to predict primary gaze behavior in a social scene. Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. This field model is used to predict primary gaze behavior at any location or time in the scene. We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. We validate the model by evaluating its predictive precision via cross-validation in a variety of social scenes.

2 0.95891541 67 iccv-2013-Calibration-Free Gaze Estimation Using Human Gaze Patterns

Author: Fares Alnajar, Theo Gevers, Roberto Valenti, Sennay Ghebreab

Abstract: We present a novel method to auto-calibrate gaze estimators based on gaze patterns obtained from other viewers. Our method is based on the observation that the gaze patterns of humans are indicative of where a new viewer will look at [12]. When a new viewer is looking at a stimulus, we first estimate a topology of gaze points (initial gaze points). Next, these points are transformed so that they match the gaze patterns of other humans to find the correct gaze points. In a flexible uncalibrated setup with a web camera and no chin rest, the proposed method was tested on ten subjects and ten images. The method estimates the gaze points after looking at a stimulus for a few seconds with an average accuracy of 4.3◦. Although the reported performance is lower than what could be achieved with dedicated hardware or calibrated setup, the proposed method still provides a sufficient accuracy to trace the viewer attention. This is promising considering the fact that auto-calibration is done in a flexible setup , without the use of a chin rest, and based only on a few seconds of gaze initialization data. To the best of our knowledge, this is the first work to use human gaze patterns in order to auto-calibrate gaze estimators.

3 0.9522571 247 iccv-2013-Learning to Predict Gaze in Egocentric Video

Author: Yin Li, Alireza Fathi, James M. Rehg

Abstract: We present a model for gaze prediction in egocentric video by leveraging the implicit cues that exist in camera wearer’s behaviors. Specifically, we compute the camera wearer’s head motion and hand location from the video and combine them to estimate where the eyes look. We further model the dynamic behavior of the gaze, in particular fixations, as latent variables to improve the gaze prediction. Our gaze prediction results outperform the state-of-the-art algorithms by a large margin on publicly available egocentric vision datasets. In addition, we demonstrate that we get a significant performance boost in recognizing daily actions and segmenting foreground objects by plugging in our gaze predictions into state-of-the-art methods.

4 0.85707545 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

Author: Huiying Liu, Dong Xu, Qingming Huang, Wen Li, Min Xu, Stephen Lin

Abstract: We present a method for estimating human scanpaths, which are sequences of gaze shifts that follow visual attention over an image. In this work, scanpaths are modeled based on three principal factors that influence human attention, namely low-levelfeature saliency, spatialposition, and semantic content. Low-level feature saliency is formulated as transition probabilities between different image regions based on feature differences. The effect of spatial position on gaze shifts is modeled as a Levy flight with the shifts following a 2D Cauchy distribution. To account for semantic content, we propose to use a Hidden Markov Model (HMM) with a Bag-of-Visual-Words descriptor of image regions. An HMM is well-suited for this purpose in that 1) the hidden states, obtained by unsupervised learning, can represent latent semantic concepts, 2) the prior distribution of the hidden states describes visual attraction to the semantic concepts, and 3) the transition probabilities represent human gaze shift patterns. The proposed method is applied to task-driven viewing processes. Experiments and analysis performed on human eye gaze data verify the effectiveness of this method.

5 0.43221042 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction

Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti

Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.

6 0.40617645 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics

7 0.32196885 369 iccv-2013-Saliency Detection: A Boolean Map Approach

8 0.25763553 180 iccv-2013-From Where and How to What We See

9 0.24969187 370 iccv-2013-Saliency Detection in Large Point Sets

10 0.22411132 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction

11 0.22042055 416 iccv-2013-The Interestingness of Images

12 0.21760401 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection

13 0.21568951 71 iccv-2013-Category-Independent Object-Level Saliency Detection

14 0.20912342 371 iccv-2013-Saliency Detection via Absorbing Markov Chain

15 0.2074459 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

16 0.20048639 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

17 0.19436505 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?

18 0.18735173 396 iccv-2013-Space-Time Robust Representation for Action Recognition

19 0.17079821 433 iccv-2013-Understanding High-Level Semantics by Modeling Traffic Patterns

20 0.16464798 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.066), (7, 0.016), (12, 0.011), (26, 0.059), (31, 0.044), (34, 0.012), (42, 0.101), (48, 0.014), (64, 0.029), (70, 0.312), (73, 0.027), (84, 0.012), (89, 0.146)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.76320028 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields

Author: Hyun Soo Park, Eakta Jain, Yaser Sheikh

Abstract: We present a method to predict primary gaze behavior in a social scene. Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. This field model is used to predict primary gaze behavior at any location or time in the scene. We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. We validate the model by evaluating its predictive precision via cross-validation in a variety of social scenes.

2 0.61758709 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

Abstract: We propose an unsupervised detector adaptation algorithm to adapt any offline trained face detector to a specific collection of images, and hence achieve better accuracy. The core of our detector adaptation algorithm is a probabilistic elastic part (PEP) model, which is offline trained with a set of face examples. It produces a statisticallyaligned part based face representation, namely the PEP representation. To adapt a general face detector to a collection of images, we compute the PEP representations of the candidate detections from the general face detector, and then train a discriminative classifier with the top positives and negatives. Then we re-rank all the candidate detections with this classifier. This way, a face detector tailored to the statistics of the specific image collection is adapted from the original detector. We present extensive results on three datasets with two state-of-the-art face detectors. The significant improvement of detection accuracy over these state- of-the-art face detectors strongly demonstrates the efficacy of the proposed face detector adaptation algorithm.

3 0.60859632 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

Author: Suyog Dutt Jain, Kristen Grauman

Abstract: The mode of manual annotation used in an interactive segmentation algorithm affects both its accuracy and easeof-use. For example, bounding boxes are fast to supply, yet may be too coarse to get good results on difficult images; freehand outlines are slower to supply and more specific, yet they may be overkill for simple images. Whereas existing methods assume a fixed form of input no matter the image, we propose to predict the tradeoff between accuracy and effort. Our approach learns whether a graph cuts segmentation will succeed if initialized with a given annotation mode, based on the image ’s visual separability and foreground uncertainty. Using these predictions, we optimize the mode of input requested on new images a user wants segmented. Whether given a single image that should be segmented as quickly as possible, or a batch of images that must be segmented within a specified time budget, we show how to select the easiest modality that will be sufficiently strong to yield high quality segmentations. Extensive results with real users and three datasets demonstrate the impact.

4 0.60050654 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

5 0.5484913 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization

Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang

Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization com, , tom. . cai@sydney . edu . au , heng@uta .edu make the learning tasks easier to deal with and reduce the computational cost. For example, in image tagging, instead of using the raw pixel-wise features, semi-local or patch- based features, such as SIFT and geometric blur, are usually more desirable to achieve better performance. In practice, finding a set of compact features bases, also referred to as dictionary, with enhanced representative and discriminative power, plays a significant role in building a successful computer vision system. In this paper, we explore this important problem by proposing a novel formulation and its solution for learning Semi-Supervised Robust Dictionary (SSRD), where we examine the challenges in dictionary learning, and seek opportunities to overcome them and improve the dictionary qualities. 1.1. Challenges in Dictionary Learning to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method.

6 0.5469135 180 iccv-2013-From Where and How to What We See

7 0.54640114 349 iccv-2013-Regionlets for Generic Object Detection

8 0.54628265 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

9 0.54603219 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification

10 0.54594529 208 iccv-2013-Image Co-segmentation via Consistent Functional Maps

11 0.54569548 327 iccv-2013-Predicting an Object Location Using a Global Image Representation

12 0.5455907 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps

13 0.54548645 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition

14 0.5452311 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation

15 0.54478776 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

16 0.54476273 187 iccv-2013-Group Norm for Learning Structured SVMs with Unstructured Latent Variables

17 0.54444385 206 iccv-2013-Hybrid Deep Learning for Face Verification

18 0.54407942 80 iccv-2013-Collaborative Active Learning of a Kernel Machine Ensemble for Recognition

19 0.54407567 29 iccv-2013-A Scalable Unsupervised Feature Merging Approach to Efficient Dimensionality Reduction of High-Dimensional Visual Data

20 0.54383957 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model