iccv iccv2013 iccv2013-146 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yifan Zhang, Qiang Ji, Hanqing Lu
Abstract: In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always yield robust and reliable results. It is essential to detect them in a holistic way which incorporates the causality and temporal dependency among them to compensate for the limitations of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extend Allen’s interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system, which can not only capture the complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty from the noisy visual observation. In the model, the topology of the IAN on each time slice and the interlinks between the time slices are discovered by an advanced structure learning method. The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. Empirical results on two real world datasets show the power of the proposed interval temporal constrained model.
Reference: text
sentIndex sentText sentNum sentScore
1 In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always yield robust and reliable results. [sent-2, score-0.973]
2 The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. [sent-6, score-2.26]
3 Empirical results on two real world datasets show the power of the proposed interval temporal constrained model. [sent-7, score-0.594]
4 Each event may be correlated and affected by others. [sent-12, score-0.576]
5 Detecting each individual event separately may not always yield reliable results, for reasons such as occlusion, motion blur, appearance variation, background clutter, etc. [sent-13, score-0.55]
6 In a complex scene, the atomic events often maintain certain temporal relationships with each other, and their occurrences are governed by an underlying temporal structure based on some domain knowledge and rules of thumb. [sent-19, score-1.234]
7 If we can tell how long they overlap, and for how long the later one still lasts after the earlier one has finished, that would greatly help to detect the event and better determine the event boundary. [sent-23, score-1.123]
8 Hence, it is important to capture the interval based temporal relationships and discover the underlying temporal structure amongst the events, which can be used as an inference engine to disambiguate the uncertainties from the low-level visual processing and facilitate the event detection. [sent-24, score-1.705]
9 before, after and equal), and are not expressive enough to capture a larger number of interval based temporal relationships such as during, overlapping, etc. [sent-28, score-0.722]
10 To address these issues, we develop an interval temporal constrained dynamic Bayesian network to extend Allen’s interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system. [sent-36, score-1.222]
11 In the first stage, the IAN is constructed by analyzing the interval temporal relationships among the events, and its topology is used as the prior structure within each time slice. [sent-38, score-0.89]
12 A duration model is attached to the DBN to capture the interval length of the event. [sent-41, score-0.599]
13 Unlike existing explicit duration models, in which the duration is determined only by the event itself, we claim that the duration of the dependent event is also affected by the status of the event it depends on, given their interval temporal relationships. [sent-42, score-2.683]
14 Thus, duration fragmentation is performed to better represent the interval temporal relationships, so that we can determine the start and end points of the event more accurately. [sent-43, score-1.492]
15 Related work Among various methodologies which can model multiple event relationships and interactions, time-sliced graphical models, i. [sent-46, score-0.703]
16 Xiang and Gong [18] presented a dynamically multi-linked hidden Markov model (DML-HMM) for modeling the temporal and causal correlations among events in an outdoor scene. [sent-51, score-0.648]
17 Pinhanez [13] captured relative temporal relationships in a propagation network to detect event in a deterministic way, which cannot handle the uncertainties brought by the visual observation. [sent-52, score-1.12]
18 Generally, time-sliced graphical models typically model events as occurring instantaneously, and thus lack the expressive power to capture the full range of interval temporal relationships. [sent-53, score-0.937]
19 [15] introduced a DBN framework that provides duration modeling within the network with the limited ability to capture only simple sequential temporal relationships such as before or after. [sent-57, score-0.793]
20 Most of the explicit duration models determine the event duration solely by the event itself, while discarding the implied influence of the other events it depends on, and thus cannot capture well the unsynchronized temporal lags between the correlated event intervals. [sent-61, score-2.754]
21 Moreover, these methods seldom perform structure learning, leaving the model structure manually defined or fully connected, which cannot automatically discover the underlying temporal constraints beneath the observations. [sent-62, score-0.517]
22 Morariu and Davis [10] proposed a Markov logic network based approach for complex multi-agent event recognition that employs knowledge such as rules, event descriptions, and physical constraints of the events being modeled. [sent-64, score-1.527]
23 Probabilistic event logic (PEL) proposed by Brendel et al. [sent-65, score-0.594]
24 [9] proposed a DDPHMM model to recognize the activities in traffic scenes, in which each event corresponds to a topic which is a specific spatial flow pattern. [sent-70, score-0.65]
25 The topic models, while powerful in modeling some complex activities, still cannot effectively handle events with strong and diverse temporal dependencies. [sent-73, score-0.622]
26 Based on the limitations of the approaches mentioned above, we need to find a model which can systematically account for a full range of interval temporal relationships among different events. [sent-74, score-0.722]
27 Interval Algebra Network (IAN) An event is defined as the state change of one or more entities over a period of time. [sent-78, score-0.614]
28 The actual interval relationship between two events that happen over a time interval can be a union of these atomic relations, e. [sent-82, score-0.918]
29 Allen’s thirteen atomic interval temporal relations to represent the temporal relations between two events 푋 and 푌 . [sent-85, score-1.392]
30 An interval algebra network [2], or simply IAN can be used to represent the temporal relationships among a set of events, where the nodes represent events, and the directed links represent the temporal relationships among the events. [sent-87, score-1.432]
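To make the thirteen atomic relations concrete, the following is a minimal sketch that classifies the relation of one interval with respect to another. It assumes intervals are given as hypothetical `(start, end)` pairs with `start < end`; the function name `allen_relation` and the relation labels are illustrative choices, not taken from the paper.

```python
def allen_relation(x, y):
    """Return the Allen relation of interval x relative to interval y.

    Intervals are (start, end) pairs with start < end (an assumption of
    this sketch). Exactly one of the 13 atomic relations is returned.
    """
    (xs, xe), (ys, ye) = x, y
    if xs >= xe or ys >= ye:
        raise ValueError("intervals must satisfy start < end")
    if xe < ys:
        return "before"
    if ye < xs:
        return "after"
    if xe == ys:
        return "meets"
    if ye == xs:
        return "met-by"
    if xs == ys and xe == ye:
        return "equals"
    if xs == ys:
        return "starts" if xe < ye else "started-by"
    if xe == ye:
        return "finishes" if xs > ys else "finished-by"
    if ys < xs and xe < ye:
        return "during"
    if xs < ys and ye < xe:
        return "contains"
    # Only proper partial overlaps remain at this point.
    return "overlaps" if xs < ys else "overlapped-by"
```

In an IAN, a label of this kind (or a union of several labels) would sit on the directed link between two event nodes.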
31 3 shows an IAN that models the interval temporal relationships among 6 events in basketball games. [sent-90, score-1.113]
32 To fully represent the total thirteen Allen’s interval temporal relations in a time-sliced model, we share the idea from Pinhanez and Bobick’s work [13]. [sent-102, score-0.719]
33 Mapping the 13 interval temporal relations into intra-slice pairwise constraints using the 3-valued state domain (a), where the symbols “P”,“N” and “F” represent “past”, “now” and “future”; and the 2-valued state domain (b), where the symbols “T” and “F” represent “true” and “false”. [sent-105, score-0.894]
34 the 13 interval temporal relations are systematically transformed into pairwise temporal constraints to restrict the admissible states of dependent event node given its temporal reference. [sent-106, score-2.024]
35 If 푒푖푡 = 푃, then 푒푗푡 = 푃푁, meaning that the event 푒푗 is happening now or has finished already. [sent-112, score-0.682]
36 Table 1 (a) displays the mapping from the 13 interval temporal relations to the equivalent intra-slice pairwise constraints represented by “P/N/F” values, where 푟푃푡 represents the admissible values of 푒푗푡 given its temporal reference 푒푖푡 = 푃, and similarly for 푟푁푡 and 푟퐹푡. [sent-113, score-1.058]
37 It can be seen that using the 3-valued state domain is more expressive to represent the 13 interval temporal relations than using the 2-valued state domain. [sent-115, score-0.814]
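The intra-slice constraints can be illustrated by sweeping time over two example intervals and recording which "P/N/F" states of the dependent event co-occur with each state of the reference event. This is only a sketch: the half-open interval convention `[start, end)`, the discrete time grid, and the names `pnf_state` and `admissible_states` are assumptions made here, not details from the paper.

```python
def pnf_state(interval, t):
    """Return "F" (future), "N" (now) or "P" (past) for an event at time t.

    The event is assumed to occupy the half-open interval [start, end).
    """
    start, end = interval
    if t < start:
        return "F"
    if t < end:
        return "N"
    return "P"


def admissible_states(ref, dep, t_range):
    """Collect the states of dep observed for each state of ref."""
    table = {"P": set(), "N": set(), "F": set()}
    for t in t_range:
        table[pnf_state(ref, t)].add(pnf_state(dep, t))
    return table


# Reference event "starts" the dependent event: same start, earlier end.
table = admissible_states(ref=(0, 2), dep=(0, 5), t_range=range(-2, 8))
```

For this "starts" example, the reference being in state "P" admits the dependent states "N" and "P", which corresponds to the "PN" entry described in the text.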
38 Inter-slice constraint: In [13], the interval temporal relations are only mapped to the intra-slice pairwise constraints. [sent-116, score-0.681]
39 However, in many cases, the temporal dependent 푒푗 is not only restricted by the current state of its temporal reference 푒푖 at time 푡 but also the previous state of 푒푖 at time 푡 − 1. [sent-117, score-0.893]
40 Hence, the inter-slice constraints from the temporal reference events at the previous time slice are also critical to reveal the interval temporal relations. [sent-122, score-1.421]
41 They can be considered as inter-slice pairwise constraints, and are represented as the inter-slice links from the correlated event nodes in [Figure 2: timelines of the reference event 푒푖 and the dependent event 푒푗 with duration segments D1 to D6] [sent-123, score-0.765]
42 Figure 2. The duration of the dependent event 푒푗 is fragmented by the state transition point of its reference event 푒푖. [sent-124, score-1.614]
43 Besides the inter-slice pairwise constraint, the evolution of each event is also restricted by the previous state of itself, which can be called inter-slice self constraint. [sent-126, score-0.668]
44 Given the previous state, the event node can either stay in the same state or transit to a restricted state. [sent-127, score-0.743]
45 Particularly, the jump from state “past” to state “future” means that after a certain time when the event has finished, it will revisit the state of waiting for a new occurrence. [sent-129, score-0.64]
46 Interval duration: To capture the duration of the event and better represent the interval temporal relations, each event node is attached with a duration node in the model. [sent-130, score-2.492]
47 Unlike most existing explicit duration models, in which the duration is determined only by the event itself, the duration node of the dependent event in our model is also conditioned on the status of its temporal reference, given their interval temporal relationships. [sent-131, score-3.066]
48 For example, if 푒푖 starts 푒푗, the duration conditioned on 푒푖 = 푃 and 푒푗 = 푁 is the length of the interval in which 푒푗 still lasts while 푒푖 has already finished. [sent-132, score-0.565]
49 Since the duration node of the dependent event has multiple parents, it actually performs a duration fragmentation which is shown in Fig 2. [sent-134, score-1.306]
50 The interval of the dependent event being in the same state is fragmented by the time point of the reference event state transition. [sent-135, score-1.631]
51 By this, we can provide a quantitative description for the unsynchronized time lags between two events, and thus better model their interval temporal relations. [sent-136, score-0.775]
52 Therefore, we construct an IAN for the events by analyzing the interval temporal relationships between them in the training data. [sent-144, score-0.986]
53 To construct the IAN, we use a temporal window with predefined length sliding along the time axis in the training data to get temporal interval samples. [sent-146, score-0.93]
54 Thus we can obtain the statistics of the temporal relationships for each pair of events by analyzing every sampled temporal interval. [sent-147, score-1.012]
55 The pairwise temporal dependency between event 푒푗 and its temporal reference 푒푖 is represented by 푃(푒푗 = 1∣푟푒푖 = 1), that is, the probability of 푒푗 being present and related to 푒푖 by temporal relation 푟, conditioned on 푒푖 being present. [sent-148, score-1.607]
56 For each event 푒푖 and its temporal reference 푒푗, we get the maximal 푟 related co-occurrence conditional probability, 푃˜푖푗 = max푟 푃(푒푗 = 1∣푟푒푖 = 1). [sent-153, score-0.904]
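One way to read the maximal relation-conditioned co-occurrence probability is as a count ratio over sampled windows. The sketch below assumes a hypothetical input format: one entry per sampled window in which the reference event is present, holding either the relation label that links the dependent event to the reference or `None` when the dependent event is absent. The function name and this input format are illustrative, not the authors' implementation.

```python
from collections import Counter


def max_relation_probability(samples):
    """Estimate max over r of P(e_j = 1 | r, e_i = 1) from window samples.

    samples: list with one entry per window where the reference e_i is
    present; each entry is a relation label, or None if e_j is absent.
    Returns (best_relation, probability).
    """
    if not samples:
        raise ValueError("need at least one sampled window")
    counts = Counter(r for r in samples if r is not None)
    if not counts:
        return None, 0.0
    relation, n = counts.most_common(1)[0]
    return relation, n / len(samples)


best = max_relation_probability(["overlaps", "overlaps", "meets", None])
```

Here two of the four windows carry the "overlaps" relation, so the estimate for that pair is 0.5 under these assumptions.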
57 The TC makes sure that the temporal relationship on the newly added link is consistent with the temporal relationships on the existing links. [sent-159, score-0.764]
58 Specifically, if a link is added to the current graph and forms a triangle with two existing links, the temporal relationship on the new link should satisfy the transitivity rules governed by the temporal relationships on the two existing links. [sent-160, score-0.949]
59 3 shows a constructed IAN modeling the interval temporal relationships among 6 events from the OSUPEL basketball data [4]. [sent-164, score-1.113]
60 The basketball IAN modeling the interval temporal relationships among six events from the OSUPEL basketball data. [sent-170, score-1.24]
61 For clarity, the observation nodes of each event node are omitted. [sent-209, score-0.702]
62 Duration and observation node After we obtain the skeleton structure of our DBN model, each event node is attached with a duration node as its child. [sent-221, score-1.236]
63 The state of the duration node represents how long the current state of the event node lasts. [sent-222, score-1.161]
64 The duration node deterministically counts down on every time slice, and the event node state will not change until its duration node counts down to 0. [sent-223, score-1.505]
65 In particular, the duration node of the temporally dependent event also has a link from the temporal reference event node. [sent-224, score-2.27]
66 Besides the duration node, each event node is also attached with an observation node as its child so as to form a two-layer model. [sent-225, score-1.067]
67 The top layer encodes the events and their temporal relationships. [sent-226, score-0.574]
68 The CPD of the event node 푒푘 can be written as follows: $P(e_k^t = j \mid e_k^{t-1} = i,\ p_k^t = m,\ q_k^{t-1} = n,\ D_k^{t-1} = d) = \begin{cases} \delta(i, j) & \text{if } d > 0 \\ A(m, n, i, j) & \text{if } d = 0 \end{cases}$ (6) where 푝푘 is the configuration of intra-slice event parents of 푒푘, 푞푘 is the configuration of inter-slice event parents of 푒푘, and 퐷푘 is the duration node of 푒푘. [sent-233, score-2.189]
69 퐴(푚, 푛, 푖, 푗) is the state transition probability given its event parents. [sent-234, score-0.715]
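The duration-gated behaviour described here (the event state is frozen while the duration counter is positive, and a transition is drawn only when the counter reaches 0) can be sketched as a small lookup. The transition table below is a hypothetical example for one fixed configuration of the event parents; its numbers are invented for illustration and do not come from the paper.

```python
def event_cpd(j, i, d, transition):
    """P(e_t = j | e_{t-1} = i, D_{t-1} = d) for one fixed parent configuration.

    transition[i][j] plays the role of the state transition probability;
    it is used only when the duration counter d has reached 0.
    """
    if d < 0:
        raise ValueError("duration counter must be non-negative")
    if d > 0:
        # Counter still running: the state cannot change (delta function).
        return 1.0 if i == j else 0.0
    return transition[i][j]


# Hypothetical transition table over the states F (future), N (now), P (past).
T = {
    "F": {"F": 0.0, "N": 1.0, "P": 0.0},
    "N": {"F": 0.0, "N": 0.0, "P": 1.0},
    "P": {"F": 0.7, "N": 0.3, "P": 0.0},
}
```

With this table, `event_cpd("N", "N", 3, T)` is 1.0 because the counter is still positive, while `event_cpd("P", "N", 0, T)` falls back to the table entry.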
70 During model inference, the event nodes and the duration nodes in the top layer are hidden and need to be inferred from the observations in the bottom layer. [sent-240, score-0.976]
71 Let 푒1푡:푛 represent all the event nodes at time 푡, where 푛 is the number of the event nodes. [sent-242, score-1.177]
72 Experiments In this section, we report the event detection results in complex scenes using the proposed model. [sent-246, score-0.601]
73 Before discussing the event detection results on the OSUPEL basketball dataset, we first briefly describe how to obtain the visual observations of the events from low-level features. [sent-253, score-0.996]
74 We extract features from the bounding box of the tracks and use an HMM to detect each event separately. [sent-255, score-0.55]
75 The HMMs are trained for each event class and used to detect events in the videos separately. [sent-257, score-0.814]
76 The F1 scores of the event detection at both the interval level and the frame level are reported in Tables 2 and 3. [sent-265, score-0.863]
77 The results showed that the overall F1 score of event detection decreases by 7% and 10% on interval and frame level respectively. [sent-270, score-0.863]
78 Our model is also superior to DML-HMM which does not explicitly model the event durations. [sent-271, score-0.55]
79 Without duration model, it lacks the ability to fully express the interval temporal relations. [sent-272, score-0.897]
80 In addition, the duration fragmentation is also an important step in our duration model. [sent-273, score-0.606]
81 Based on a comparison experiment, the overall F1 score of event detection on the frame level decreases by 4% without using duration fragmentation. [sent-275, score-0.86]
82 These 5 events are regulated by the traffic lights and the right of way, thus they have strong interval temporal relationships. [sent-296, score-0.907]
83 For each clip we can get the distribution on the topics, so that we can determine which event occurs in the clip. [sent-298, score-0.573]
84 Using this data of complex scene, our model can be learned to capture the interval temporal relationships and the event durations. [sent-300, score-1.294]
85 Hence, to evaluate the robustness of our model, the detections are corrupted by 2 types of noises which are common in event detection and fed to the model as the observations. [sent-302, score-0.579]
86 One common noise in event detection is mis-detection, i. [sent-304, score-0.579]
87 , the event is not detected or falsely recognized as another event. [sent-306, score-0.55]
88 This is accomplished by perturbing the event labels of the testing data to simulate incorrect event detection. [sent-310, score-1.123]
89 This result shows that our model is more robust to event mis-detection compared with the other two. [sent-315, score-0.55]
90 often makes mistakes in determining the start and end time of the event, as well as the event duration. [sent-324, score-0.576]
91 In this experiment, we investigate the performance of our model under varying event time measurement errors. [sent-325, score-0.576]
92 We corrupted the testing data by perturbing the event start and end times with noise whose intensity varies from ±30% to ±50% of the event duration. [sent-326, score-0.599]
93 Table 5 shows the performance of the three models under different event time errors. [sent-327, score-0.576]
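A boundary perturbation of this kind could be simulated as follows. This is a sketch under stated assumptions: the noise is taken to be uniform and expressed as a fraction of the event duration, and the function name `perturb_interval` and its parameters are hypothetical. The paper does not specify the noise distribution in the text available here.

```python
import random


def perturb_interval(start, end, intensity, rng):
    """Shift both boundaries by up to +/- intensity * duration.

    Uniform noise is an assumption of this sketch. With intensity <= 0.5
    the perturbed start can never pass the perturbed end.
    """
    duration = end - start
    if duration <= 0:
        raise ValueError("end must be greater than start")
    if not 0.0 <= intensity <= 0.5:
        raise ValueError("intensity must lie in [0, 0.5]")
    max_shift = intensity * duration
    new_start = start + rng.uniform(-max_shift, max_shift)
    new_end = end + rng.uniform(-max_shift, max_shift)
    return new_start, new_end


# Seeded generator so that a corrupted test set is reproducible.
rng = random.Random(0)
noisy_start, noisy_end = perturb_interval(100.0, 200.0, 0.3, rng)
```

Passing an explicit `random.Random` instance keeps the corruption reproducible across runs, which matters when several models are compared on the same noisy data.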
94 It shows again that our model is more robust to the time measurement errors than CHSMM and DML-HMM, due to its ability in modeling the event duration and unsynchronized time lags between two temporal intervals. [sent-328, score-1.348]
95 Conclusions We have proposed a temporal interval constrained DBN model for event detection in complex scenes. [sent-337, score-1.195]
96 Allen’s interval temporal relationships are successfully captured in our model to compensate for the poor image measurements of the low-level visual detectors. [sent-338, score-0.722]
97 Our model suits scenarios with multiple events occurring sequentially or in parallel, especially with overlapping event intervals, which form complex relationships. [sent-340, score-0.868]
98 Currently, our model only focuses on the atomic event detection. [sent-341, score-0.61]
99 In future work, we are interested in simultaneously detecting both the atomic events and the high-level complex activity in a unified model. [sent-342, score-0.658]
100 Bridging the past, present and future: Modeling scene activities from event relationships and global rules. [sent-464, score-0.703]
wordName wordTfidf (topN-words)
[('event', 0.55), ('temporal', 0.31), ('interval', 0.284), ('duration', 0.281), ('events', 0.264), ('ian', 0.222), ('dbn', 0.196), ('relationships', 0.128), ('basketball', 0.127), ('links', 0.112), ('chsmm', 0.109), ('node', 0.101), ('osupel', 0.094), ('lags', 0.083), ('slice', 0.08), ('happening', 0.077), ('transition', 0.076), ('link', 0.075), ('network', 0.074), ('unsynchronized', 0.072), ('existed', 0.069), ('structure', 0.068), ('state', 0.064), ('relations', 0.061), ('allen', 0.06), ('atomic', 0.06), ('intervals', 0.058), ('finished', 0.055), ('nodes', 0.051), ('traffic', 0.049), ('dependent', 0.049), ('static', 0.047), ('cpds', 0.047), ('junction', 0.046), ('logic', 0.044), ('dag', 0.044), ('fragmentation', 0.044), ('reference', 0.044), ('topology', 0.044), ('hidden', 0.043), ('pinhanez', 0.042), ('thirteen', 0.042), ('rules', 0.041), ('algebra', 0.035), ('attached', 0.034), ('catch', 0.033), ('qmul', 0.033), ('dependency', 0.032), ('coupled', 0.032), ('occurring', 0.032), ('fitness', 0.032), ('preliminary', 0.032), ('cpd', 0.031), ('hongeng', 0.031), ('interlinks', 0.031), ('varadarajan', 0.031), ('causal', 0.031), ('domain', 0.031), ('prior', 0.03), ('overlaps', 0.03), ('probabilistic', 0.029), ('meets', 0.029), ('uncertainties', 0.029), ('detection', 0.029), ('deterministic', 0.029), ('parents', 0.028), ('dynamic', 0.028), ('evolution', 0.028), ('markov', 0.028), ('subrahmanian', 0.028), ('notated', 0.028), ('depended', 0.028), ('duong', 0.028), ('transit', 0.028), ('discover', 0.026), ('topic', 0.026), ('pairwise', 0.026), ('lowlevel', 0.026), ('time', 0.026), ('activity', 0.026), ('correlated', 0.026), ('bayesian', 0.026), ('nevatia', 0.026), ('finishes', 0.026), ('probability', 0.025), ('activities', 0.025), ('graphical', 0.025), ('hmm', 0.024), ('instantaneously', 0.024), ('causality', 0.024), ('climbing', 0.024), ('oliver', 0.024), ('determine', 0.023), ('bobick', 0.023), ('perturbing', 0.023), ('occurrence', 0.023), ('topics', 0.023), 
('constraints', 0.023), ('complex', 0.022), ('fully', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999893 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints
Author: Yifan Zhang, Qiang Ji, Hanqing Lu
Abstract: In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always yield robust and reliable results. It is essential to detect them in a holistic way which incorporates the causality and temporal dependency among them to compensate for the limitations of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extend Allen’s interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system, which can not only capture the complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty from the noisy visual observation. In the model, the topology of the IAN on each time slice and the interlinks between the time slices are discovered by an advanced structure learning method. The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. Empirical results on two real world datasets show the power of the proposed interval temporal constrained model.
2 0.47811511 268 iccv-2013-Modeling 4D Human-Object Interactions for Event and Object Recognition
Author: Ping Wei, Yibiao Zhao, Nanning Zheng, Song-Chun Zhu
Abstract: Recognizing the events and objects in the video sequence are two challenging tasks due to the complex temporal structures and the large appearance variations. In this paper, we propose a 4D human-object interaction model, where the two tasks jointly boost each other. Our human-object interaction is defined in 4D space: i) the cooccurrence and geometric constraints of human pose and object in 3D space; ii) the sub-events transition and objects coherence in 1D temporal dimension. We represent the structure of events, sub-events and objects in a hierarchical graph. For an input RGB-depth video, we design a dynamic programming beam search algorithm to: i) segment the video, ii) recognize the events, and iii) detect the objects simultaneously. For evaluation, we built a large-scale multiview 3D event dataset which contains 3815 video sequences and 383,036 RGBD frames captured by the Kinect cameras. The experiment results on this dataset show the effectiveness of our method.
3 0.36864769 147 iccv-2013-Event Recognition in Photo Collections with a Stopwatch HMM
Author: Lukas Bossard, Matthieu Guillaumin, Luc Van_Gool
Abstract: The task of recognizing events in photo collections is central for automatically organizing images. It is also very challenging, because of the ambiguity of photos across different event classes and because many photos do not convey enough relevant information. Unfortunately, the field still lacks standard evaluation data sets to allow comparison of different approaches. In this paper, we introduce and release a novel data set of personal photo collections containing more than 61,000 images in 807 collections, annotated with 14 diverse social event classes. Casting collections as sequential data, we build upon recent and state-of-the-art work in event recognition in videos to propose a latent sub-event approach for event recognition in photo collections. However, photos in collections are sparsely sampled over time and come in bursts from which transpires the importance of specific moments for the photographers. Thus, we adapt a discriminative hidden Markov model to allow the transitions between states to be a function of the time gap between consecutive images, which we coin as Stopwatch Hidden Markov model (SHMM). In our experiments, we show that our proposed model outperforms approaches based only on feature pooling or a classical hidden Markov model. With an average accuracy of 56%, we also highlight the difficulty of the data set and the need for future advances in event recognition in photo collections.
4 0.33531204 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
Author: Weixin Li, Qian Yu, Ajay Divakaran, Nuno Vasconcelos
Abstract: The problem of adaptively selecting pooling regions for the classification of complex video events is considered. Complex events are defined as events composed of several characteristic behaviors, whose temporal configuration can change from sequence to sequence. A dynamic pooling operator is defined so as to enable a unified solution to the problems of event specific video segmentation, temporal structure modeling, and event detection. Video is decomposed into segments, and the segments most informative for detecting a given event are identified, so as to dynamically determine the pooling operator most suited for each sequence. This dynamic pooling is implemented by treating the locations of characteristic segments as hidden information, which is inferred, on a sequence-by-sequence basis, via a large-margin classification rule with latent variables. Although the feasible set of segment selections is combinatorial, it is shown that a globally optimal solution to the inference problem can be obtained efficiently, through the solution of a series of linear programs. Besides the coarse-level location of segments, a finer model of video structure is implemented by jointly pooling features of segment tuples. Experimental evaluation demonstrates that the resulting event detector has state-of-the-art performance on challenging video datasets.
5 0.29981613 203 iccv-2013-How Related Exemplars Help Complex Event Detection in Web Videos?
Author: Yi Yang, Zhigang Ma, Zhongwen Xu, Shuicheng Yan, Alexander G. Hauptmann
Abstract: Compared to visual concepts such as actions, scenes and objects, complex event is a higher level abstraction of longer video sequences. For example, a “marriage proposal” event is described by multiple objects (e.g., ring, faces), scenes (e.g., in a restaurant, outdoor) and actions (e.g., kneeling down). The positive exemplars which exactly convey the precise semantic of an event are hard to obtain. It would be beneficial to utilize the related exemplars for complex event detection. However, the semantic correlations between related exemplars and the target event vary substantially as relatedness assessment is subjective. Two related exemplars can be about completely different events, e.g., in the TRECVID MED dataset, both bicycle riding and equestrianism are labeled as related to “attempting a bike trick” event. To tackle the subjectiveness of human assessment, our algorithm automatically evaluates how positive the related exemplars are for the detection of an event and uses them on an exemplar-specific basis. Experiments demonstrate that our algorithm is able to utilize related exemplars adaptively, and the algorithm gains good performance for complex event detection.
6 0.2393624 4 iccv-2013-ACTIVE: Activity Concept Transitions in Video Event Classification
7 0.21841273 85 iccv-2013-Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach
8 0.21349263 81 iccv-2013-Combining the Right Features for Complex Event Recognition
9 0.2017801 440 iccv-2013-Video Event Understanding Using Natural Language Descriptions
10 0.16606116 86 iccv-2013-Concurrent Action Detection with Structural Prediction
11 0.16454485 163 iccv-2013-Feature Weighting via Optimal Thresholding for Video Analysis
12 0.16219783 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing
13 0.16189903 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
14 0.1333425 40 iccv-2013-Action and Event Recognition with Fisher Vectors on a Compact Feature Set
15 0.12875642 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation
16 0.11162654 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
17 0.1101995 274 iccv-2013-Monte Carlo Tree Search for Scheduling Activity Recognition
18 0.10447589 167 iccv-2013-Finding Causal Interactions in Video Sequences
19 0.10048615 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition
20 0.094359152 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
topicId topicWeight
[(0, 0.183), (1, 0.166), (2, 0.085), (3, 0.179), (4, 0.097), (5, 0.074), (6, 0.121), (7, -0.059), (8, -0.023), (9, -0.131), (10, -0.229), (11, -0.204), (12, -0.057), (13, 0.326), (14, -0.278), (15, -0.047), (16, 0.04), (17, 0.034), (18, 0.014), (19, 0.112), (20, 0.099), (21, -0.024), (22, 0.043), (23, 0.031), (24, -0.007), (25, -0.021), (26, -0.042), (27, 0.051), (28, 0.002), (29, -0.018), (30, -0.042), (31, 0.061), (32, 0.034), (33, 0.031), (34, 0.054), (35, -0.018), (36, 0.019), (37, -0.003), (38, 0.023), (39, 0.008), (40, 0.009), (41, 0.032), (42, -0.021), (43, 0.006), (44, -0.029), (45, -0.018), (46, 0.021), (47, 0.046), (48, 0.082), (49, -0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.98048717 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints
Author: Yifan Zhang, Qiang Ji, Hanqing Lu
Abstract: In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always yield robust and reliable results. It is essential to detect them in a holistic way which incorporates the causality and temporal dependency among them to compensate for the limitations of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extend Allen’s interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system, which can not only capture the complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty from the noisy visual observation. In the model, the topology of the IAN on each time slice and the interlinks between the time slices are discovered by an advanced structure learning method. The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. Empirical results on two real world datasets show the power of the proposed interval temporal constrained model.
2 0.89968234 268 iccv-2013-Modeling 4D Human-Object Interactions for Event and Object Recognition
Author: Ping Wei, Yibiao Zhao, Nanning Zheng, Song-Chun Zhu
Abstract: Recognizing the events and objects in the video sequence are two challenging tasks due to the complex temporal structures and the large appearance variations. In this paper, we propose a 4D human-object interaction model, where the two tasks jointly boost each other. Our human-object interaction is defined in 4D space: i) the cooccurrence and geometric constraints of human pose and object in 3D space; ii) the sub-events transition and objects coherence in 1D temporal dimension. We represent the structure of events, sub-events and objects in a hierarchical graph. For an input RGB-depth video, we design a dynamic programming beam search algorithm to: i) segment the video, ii) recognize the events, and iii) detect the objects simultaneously. For evaluation, we built a large-scale multiview 3D event dataset which contains 3815 video sequences and 383,036 RGBD frames captured by the Kinect cameras. The experiment results on this dataset show the effectiveness of our method.
3 0.84987217 203 iccv-2013-How Related Exemplars Help Complex Event Detection in Web Videos?
Author: Yi Yang, Zhigang Ma, Zhongwen Xu, Shuicheng Yan, Alexander G. Hauptmann
Abstract: Compared to visual concepts such as actions, scenes and objects, a complex event is a higher level abstraction of longer video sequences. For example, a “marriage proposal” event is described by multiple objects (e.g., ring, faces), scenes (e.g., in a restaurant, outdoor) and actions (e.g., kneeling down). The positive exemplars which exactly convey the precise semantics of an event are hard to obtain. It would be beneficial to utilize the related exemplars for complex event detection. However, the semantic correlations between related exemplars and the target event vary substantially as relatedness assessment is subjective. Two related exemplars can be about completely different events, e.g., in the TRECVID MED dataset, both bicycle riding and equestrianism are labeled as related to the “attempting a bike trick” event. To tackle the subjectiveness of human assessment, our algorithm automatically evaluates how positive the related exemplars are for the detection of an event and uses them on an exemplar-specific basis. Experiments demonstrate that our algorithm is able to utilize related exemplars adaptively, and the algorithm gains good performance for complex event detection.
4 0.82245439 147 iccv-2013-Event Recognition in Photo Collections with a Stopwatch HMM
Author: Lukas Bossard, Matthieu Guillaumin, Luc Van Gool
Abstract: The task of recognizing events in photo collections is central for automatically organizing images. It is also very challenging, because of the ambiguity of photos across different event classes and because many photos do not convey enough relevant information. Unfortunately, the field still lacks standard evaluation data sets to allow comparison of different approaches. In this paper, we introduce and release a novel data set of personal photo collections containing more than 61,000 images in 807 collections, annotated with 14 diverse social event classes. Casting collections as sequential data, we build upon recent and state-of-the-art work in event recognition in videos to propose a latent sub-event approach for event recognition in photo collections. However, photos in collections are sparsely sampled over time and come in bursts from which transpires the importance of specific moments for the photographers. Thus, we adapt a discriminative hidden Markov model to allow the transitions between states to be a function of the time gap between consecutive images, which we coin as Stopwatch Hidden Markov model (SHMM). In our experiments, we show that our proposed model outperforms approaches based only on feature pooling or a classical hidden Markov model. With an average accuracy of 56%, we also highlight the difficulty of the data set and the need for future advances in event recognition in photo collections.
5 0.80651665 4 iccv-2013-ACTIVE: Activity Concept Transitions in Video Event Classification
Author: Chen Sun, Ram Nevatia
Abstract: The goal of high level event classification from videos is to assign a single, high level event label to each query video. Traditional approaches represent each video as a set of low level features and encode it into a fixed length feature vector (e.g. Bag-of-Words), which leaves a big gap between low level visual features and high level events. Our paper tries to address this problem by exploiting activity concept transitions in video events (ACTIVE). A video is treated as a sequence of short clips, all of which are observations corresponding to latent activity concept variables in a Hidden Markov Model (HMM). We propose to apply Fisher Kernel techniques so that the concept transitions over time can be encoded into a compact and fixed length feature vector very efficiently. Our approach can utilize concept annotations from independent datasets, and works well even with a very small number of training samples. Experiments on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset show that our approach performs favorably against the state-of-the-art.
6 0.77157801 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
7 0.64776158 163 iccv-2013-Feature Weighting via Optimal Thresholding for Video Analysis
8 0.61405355 85 iccv-2013-Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach
9 0.51264286 443 iccv-2013-Video Synopsis by Heterogeneous Multi-source Correlation
10 0.49991339 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
11 0.49702153 81 iccv-2013-Combining the Right Features for Complex Event Recognition
12 0.47261286 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
13 0.43916973 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing
14 0.42480773 440 iccv-2013-Video Event Understanding Using Natural Language Descriptions
15 0.40683731 191 iccv-2013-Handling Uncertain Tags in Visual Recognition
16 0.39560318 40 iccv-2013-Action and Event Recognition with Fisher Vectors on a Compact Feature Set
17 0.39451432 243 iccv-2013-Learning Slow Features for Behaviour Analysis
18 0.38464671 274 iccv-2013-Monte Carlo Tree Search for Scheduling Activity Recognition
19 0.3725158 331 iccv-2013-Pyramid Coding for Functional Scene Element Recognition in Video Scenes
20 0.36802149 400 iccv-2013-Stable Hyper-pooling and Query Expansion for Event Detection
topicId topicWeight
[(2, 0.037), (7, 0.015), (12, 0.032), (26, 0.118), (28, 0.145), (31, 0.044), (34, 0.027), (42, 0.114), (48, 0.014), (64, 0.082), (73, 0.032), (78, 0.021), (89, 0.195), (98, 0.01)]
simIndex simValue paperId paperTitle
Author: Sarah Parisot, William Wells III, Stéphane Chemouny, Hugues Duffau, Nikos Paragios
Abstract: Graph-based methods have become popular in recent years and have successfully addressed tasks like segmentation and deformable registration. Their main strength is optimality of the obtained solution while their main limitation is the lack of precision due to the grid-like representations and the discrete nature of the quantized search space. In this paper we introduce a novel approach for combined segmentation/registration of brain tumors that adapts graph and sampling resolution according to the image content. To this end we estimate the segmentation and registration marginals towards adaptive graph resolution and intelligent definition of the search space. This information is considered in a hierarchical framework where uncertainties are propagated in a natural manner. State of the art results in the joint segmentation/registration of brain images with low-grade gliomas demonstrate the potential of our approach.
same-paper 2 0.89691341 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints
Author: Yifan Zhang, Qiang Ji, Hanqing Lu
Abstract: In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always obtain robust and reliable results. It is essential to detect them in a holistic way which incorporates the causality and temporal dependency among them to compensate for the limitations of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extend Allen's interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system, which can not only capture the complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty from the noisy visual observation. In the model, the topology of the IAN on each time slice and the interlinks between the time slices are discovered by an advanced structure learning method. The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. Empirical results on two real world datasets show the power of the proposed interval temporal constrained model.
3 0.86403257 150 iccv-2013-Exemplar Cut
Author: Jimei Yang, Yi-Hsuan Tsai, Ming-Hsuan Yang
Abstract: We present a hybrid parametric and nonparametric algorithm, exemplar cut, for generating class-specific object segmentation hypotheses. For the parametric part, we train a pylon model on a hierarchical region tree as the energy function for segmentation. For the nonparametric part, we match the input image with each exemplar by using regions to obtain a score which augments the energy function from the pylon model. Our method thus generates a set of highly plausible segmentation hypotheses by solving a series of exemplar augmented graph cuts. Experimental results on the Graz and PASCAL datasets show that the proposed algorithm achieves favorable segmentation performance against the state-of-the-art methods in terms of visual quality and accuracy.
4 0.85620493 338 iccv-2013-Randomized Ensemble Tracking
Author: Qinxun Bai, Zheng Wu, Stan Sclaroff, Margrit Betke, Camille Monnier
Abstract: We propose a randomized ensemble algorithm to model the time-varying appearance of an object for visual tracking. In contrast with previous online methods for updating classifier ensembles in tracking-by-detection, the weight vector that combines weak classifiers is treated as a random variable and the posterior distribution for the weight vector is estimated in a Bayesian manner. In essence, the weight vector is treated as a distribution that reflects the confidence among the weak classifiers used to construct and adapt the classifier ensemble. The resulting formulation models the time-varying discriminative ability among weak classifiers so that the ensembled strong classifier can adapt to the varying appearance, backgrounds, and occlusions. The formulation is tested in a tracking-by-detection implementation. Experiments on 28 challenging benchmark videos demonstrate that the proposed method can achieve results comparable to and often better than those of state-of-the-art approaches.
5 0.85519302 414 iccv-2013-Temporally Consistent Superpixels
Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann
Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation, the proposed approach is compared to state-of-the-art supervoxel algorithms using established benchmarks and shows superior performance.
6 0.85489064 379 iccv-2013-Semantic Segmentation without Annotating Segments
7 0.85457069 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
8 0.85370553 420 iccv-2013-Topology-Constrained Layered Tracking with Latent Flow
9 0.85328865 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
10 0.85255027 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments
11 0.85195929 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
12 0.85158503 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness
13 0.85155761 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
14 0.85135567 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
15 0.85026807 270 iccv-2013-Modeling Self-Occlusions in Dynamic Shape and Appearance Tracking
16 0.8500247 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
17 0.8499589 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
18 0.84936535 230 iccv-2013-Latent Data Association: Bayesian Model Selection for Multi-target Tracking
19 0.84778583 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
20 0.84759843 349 iccv-2013-Regionlets for Generic Object Detection