iccv iccv2013 iccv2013-397 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tali Dekel_(Basha), Yael Moses, Shai Avidan
Abstract: Photo-sequencing is the problem of recovering the temporal order of a set of still images of a dynamic event, taken asynchronously by a set of uncalibrated cameras. Solving this problem is a first, crucial step for analyzing (or visualizing) the dynamic content of the scene captured by a large number of freely moving spectators. We propose a geometric based solution, followed by rank aggregation, to the photo-sequencing problem. Our algorithm trades spatial certainty for temporal certainty. Whereas the previous solution proposed by [4] relies on two images taken from the same static camera to eliminate uncertainty in space, we drop the static-camera assumption and replace it with temporal information available from images taken from the same (moving) camera. Our method thus overcomes the limitation of the static-camera assumption, and scales much better with the duration of the event and the spread of cameras in space. We present successful results on challenging real data sets and large scale synthetic data (250 images).
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: Photo-sequencing is the problem of recovering the temporal order of a set of still images of a dynamic event, taken asynchronously by a set of uncalibrated cameras. [sent-5, score-0.831]
2 Solving this problem is a first, crucial step for analyzing (or visualizing) the dynamic content of the scene captured by a large number of freely moving spectators. [sent-6, score-0.358]
3 Whereas the previous solution proposed by [4] relies on two images taken from the same static camera to eliminate uncertainty in space, we drop the static-camera assumption and replace it with temporal information available from images taken from the same (moving) camera. [sent-14, score-0.95]
4 We are interested in developing tools that analyze, explore and visualize the dynamic regions of the scene given images taken by CrowdCam (e.g., smartphones used casually by spectators). [sent-25, score-0.352]
5 A preliminary step in solving this problem is to recover the temporal order of the still images taken asynchronously by a set of uncalibrated cameras. [sent-29, score-0.655]
6 Visualization of a dynamic event from a set of still images; each image was captured from a different location at a different time. [sent-31, score-0.276]
7 First, we compute corresponding static and dynamic feature points across images. [sent-37, score-0.357]
8 The static features are used to determine the epipolar geometry between pairs of images. [sent-38, score-0.281]
9 Each set of corresponding dynamic features votes for the temporal order of the images in which it appears. [sent-39, score-0.68]
10 The partial orders provided by the dynamic feature sets are aggregated into a globally consistent temporal order of images using rank aggregation. [sent-40, score-0.991]
11 One of the non-trivial problems that must be solved is how a set of corresponding dynamic features can be used to determine the partial order of the images in which it was found. [sent-41, score-0.327]
12 That is, each feature set contains the projections of a 3D dynamic point onto different viewpoints, each at a different time instant. [sent-43, score-0.271]
13 Basha et al. [4] proposed a 2D geometric based solution that requires that two of the input images be captured by a static camera. [sent-45, score-0.268]
14 Under linear motion of each of the dynamic features, this assumption allows them to compute a unique ordering by mapping all the features to the same reference image. [sent-46, score-0.407]
15 This scenario increases the uncertainty in space, because the features cannot be mapped to the same reference image, but decreases the uncertainty in time (since the temporal order of images taken by the same camera is known). [sent-52, score-0.817]
16 We show that, using both the spatial and the temporal constraints, a small number of temporal orders can be determined from each feature set. [sent-53, score-0.864]
17 In addition, the temporal information is also integrated as a confidence vote into the rank aggregation, which improves the robustness to errors and noise. [sent-54, score-0.411]
18 This demonstrates the advantages of the tradeoff between spatial and temporal cues. [sent-62, score-0.316]
19 In D-SFM and NRSFM the goal is to recover a 3D model of a dynamic world from a set of images taken at different time instances. [sent-68, score-0.396]
20 Input: The temporal order of still images taken by the same camera is known. [sent-82, score-0.499]
21 3: Classify the features into static and dynamic feature sets, Si . [sent-87, score-0.304]
22 4: for each set of dynamic features, Si do 5: for each image, Ij ∈ ISi do 6: Compute the order set, Γji, using Ij as reference (see Sec. [sent-88, score-0.255]
23 Hartley & Vidal [9] proposed a closed-form solution to nonrigid shape and motion recovery from multiple perspective views, under the assumption that the nonrigid object deforms as a linear combination of K rigid shapes. [sent-99, score-0.296]
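For reference, the low-rank deformation model invoked here is usually written as a linear combination of K rigid basis shapes; the following is the standard formulation in our notation (not copied from [9]):

```latex
S_t \;=\; \sum_{k=1}^{K} c_{t,k}\, B_k, \qquad B_k \in \mathbb{R}^{3 \times P},\quad c_{t,k} \in \mathbb{R},
```

where S_t is the nonrigid shape at time t and P is the number of tracked points.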
24 In both cases the intersections of the trajectory of dynamic features with the epipolar lines of corresponding features in the other images were used to define order. [sent-111, score-0.682]
25 Method: A group of cameras moves in space and captures a set of still images, I, of a dynamic event. [sent-123, score-0.306]
26 The temporal order of images taken by the same camera is given, and we use it to impose temporal constraints on those images. [sent-128, score-0.439]
27 In this case, feature points that obey the epipolar constraint are labeled as static points, and those that do not are labeled as dynamic points. [sent-135, score-0.52]
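A minimal sketch of this classification step, assuming the fundamental matrix F between an image pair has already been estimated; the use of the Sampson distance and the pixel threshold are our illustrative choices, not values stated in the paper:

```python
import numpy as np

def classify_matches(F, pts1, pts2, thresh=2.0):
    """pts1, pts2: (N, 2) arrays of matched pixel coordinates in images 1 and 2."""
    n = len(pts1)
    x1 = np.hstack([pts1, np.ones((n, 1))])   # homogeneous coordinates
    x2 = np.hstack([pts2, np.ones((n, 1))])
    Fx1 = x1 @ F.T                            # epipolar lines in image 2
    Ftx2 = x2 @ F                             # epipolar lines in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2       # (x2^T F x1)^2 per match
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    sampson = num / den                       # first-order geometric error
    is_static = sampson < thresh**2           # obeys the epipolar constraint
    return is_static                          # ~is_static marks dynamic points
```

Points flagged as static feed the epipolar-geometry estimation; the remaining (dynamic) matches are grouped into the feature sets Si used for voting.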
28 Order from a Single Feature Set: Let Si be a set of corresponding dynamic features, which are the projections of a dynamic 3D point Pi onto a subset of images, ISi ⊆ I. [sent-136, score-0.406]
29 The set Si is used to compute a set of possible temporal orders (permutations), Γi, of its set of images, ISi. [sent-137, score-0.596]
30 The global order is computed to be compatible with the computed partial temporal orders from all sets, Γi. [sent-144, score-0.548]
31 Then, the temporal order of the image set, IS, is determined (up to time-flip) by the spatial order of the 3D locations of P along its 3D trajectory. [sent-159, score-0.397]
32 Critical points: (a) a point p1 and two epipolar lines (green and purple); each black dashed line, ℓ [sent-164, score-0.327]
33 , is in a different sector and induces a different order; the 3 sectors are defined by 3 critical points that are marked on the unit circle centered at p1. [sent-165, score-0.849]
34 (b) a point and 3 epipolar lines; here not all critical points are marked; see proof in Sec. [sent-166, score-0.366]
35 [4] suggested recovering the 2D linear trajectory of the point P in one reference image. [sent-171, score-0.28]
36 Then, the 2D projections of P at times {t(Ii) | Ii ∈ IS} could be easily computed, and their 2D spatial order along the 2D trajectory induces a unique temporal order of IS. [sent-172, score-0.611]
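The ordering primitive behind this is a 1D sort: project the feature locations onto a candidate line direction and read off the order, keeping in mind that the reversed permutation is equally valid (the time-flip ambiguity). A small sketch with illustrative names:

```python
import numpy as np

def order_along_direction(points, theta):
    """points: (N, 2) array of 2D projections; theta: line orientation in radians.
    Returns image indices sorted by the points' 1D coordinate along the line."""
    d = np.array([np.cos(theta), np.sin(theta)])
    proj = points @ d                 # signed 1D coordinate along direction d
    return np.argsort(proj).tolist()  # its reverse is equally valid (time-flip)

# Example: projections along the x-axis come back in left-to-right order.
pts = np.array([[3.0, 1.0], [1.0, 0.9], [2.0, 1.1]])
print(order_along_direction(pts, 0.0))  # -> [1, 2, 0]
```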
37 This assumption limits the spatial and temporal configurations of cameras that can be considered by their method. [sent-174, score-0.482]
38 That is, only feature sets that involve the reference image are used to induce the temporal order, and valuable temporal information, such as known temporal orders of images taken by the same camera, is ignored. [sent-176, score-1.426]
39 Order Without Trajectory Recovery: We drop the static camera assumption, which means that a unique order of IS, based on the recovered 2D trajectory, cannot be obtained as in [4]. [sent-181, score-0.479]
40 As a result, we obtain n = |S| sets of temporal orders, Γ1, . . . , Γn. [sent-183, score-0.316]
41 The main challenge is how to efficiently compute the set Γj and we show that geometric and temporal constraints can drastically reduce the size of Γj, compared to pure combinatorial considerations. [sent-186, score-0.387]
42 For example, from a combinatorial point of view there are ∼10^43 possible ways to order a set of images obtained by 10 cameras that take 5 images each. [sent-187, score-0.295]
43 In practice, it can be further reduced to ∼4 using temporal constraints. [sent-189, score-0.316]
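One consistent reading of the ∼10^43 figure: with each camera's internal order known, the orderings of the 50 images are the interleavings of 10 ordered sequences of 5 images each, i.e., a multinomial count:

```latex
\frac{50!}{(5!)^{10}} \;\approx\; \frac{3.04 \times 10^{64}}{6.19 \times 10^{20}} \;\approx\; 4.9 \times 10^{43}.
```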
44 These points induce the temporal order of the images, up to order reversing. [sent-205, score-0.549]
45 Different lines may induce different temporal orders (see the example in Fig. 2). [sent-212, score-0.749]
46 However, thanks to geometric and temporal constraints, we can recover a small bounded number of valid orders. [sent-214, score-0.399]
47 Geometric Analysis: The key observation is that there are ranges of orientations for which the temporal order induced by ℓ is the same. [sent-215, score-0.452]
48 Consider Fig. 2(a) for a particular configuration of one point and two epipolar lines; the order defined by all lines in sector R1 will be the same, with p1 between the pink and the green lines, while in sector R2 the green line will be in the center. [sent-218, score-0.932]
49 With this observation in mind, we divide the image plane into sectors, such that all lines within a sector give rise to the same (up to reversing) temporal order. [sent-219, score-0.66]
50 We define the image sectors by critical points, which are points on the unit circle centered at p1. [sent-220, score-0.591]
51 The first type are lines connecting the point p1 and the intersection of a pair of epipolar lines. [sent-222, score-0.457]
52 The second type are lines passing through p1 and parallel to an epipolar line. [sent-227, score-0.423]
53 Each such line intersects the unit circle at a critical point, ci. [sent-229, score-0.344]
54 The number of possible temporal orderings, |Γj|, is bounded by the number of sectors (or critical points). [sent-232, score-0.316]
55 In fact, it can be further reduced by eliminating sectors that do not fulfill the known temporal orders of images taken from the same camera. [sent-233, score-1.025]
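The construction above can be sketched directly. The following is our illustration (not the authors' implementation) of collecting the two types of critical directions around p1 and picking one representative line orientation per sector; orientations are taken modulo π because a line and its reverse induce the same order up to time-flip:

```python
import itertools
import numpy as np

def line_intersection(l1, l2):
    """Lines as homogeneous 3-vectors (a, b, c) representing ax + by + c = 0."""
    p = np.cross(l1, l2)
    if abs(p[2]) < 1e-12:               # parallel lines: no finite intersection
        return None
    return p[:2] / p[2]

def sector_representatives(p1, epilines):
    """p1: (2,) point; epilines: list of (a, b, c) epipolar lines.
    Returns one representative orientation per sector (modulo pi)."""
    angles = set()
    # Type 1: directions from p1 toward pairwise intersections of epipolar lines.
    for l1, l2 in itertools.combinations(epilines, 2):
        q = line_intersection(np.asarray(l1), np.asarray(l2))
        if q is not None:
            angles.add(np.arctan2(q[1] - p1[1], q[0] - p1[0]) % np.pi)
    # Type 2: directions parallel to each epipolar line (direction vector (b, -a)).
    for a, b, _ in epilines:
        angles.add(np.arctan2(-a, b) % np.pi)
    angles = sorted(angles)
    reps = []
    for i in range(len(angles)):
        a0, a1 = angles[i], angles[(i + 1) % len(angles)]
        if a1 <= a0:
            a1 += np.pi                 # wrap around: orientations live modulo pi
        reps.append(((a0 + a1) / 2) % np.pi)
    return reps                         # one candidate orientation per sector

# Example: one point and two generic epipolar lines give 3 critical directions,
# hence 3 sectors, matching the Fig. 2(a) description above.
print(len(sector_representatives(np.array([0.0, 0.0]),
                                 [(1.0, 0.0, -2.0), (0.0, 1.0, -2.0)])))  # -> 3
```

Each representative orientation is then fed to the ordering primitive sketched earlier, and orders violating the known same-camera constraints are discarded.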
56 That is, S = {p1, p2, p3, p4}, such that the temporal order of each of the pairs {I1, I2} and {I3, I4} is known. [sent-236, score-0.397]
57 Fig. 2(b) shows an example of four of these points and the resulting sectors (R1-R4), while ignoring for the sake of clarity the critical points c2,3 and c2,4. [sent-241, score-0.533]
58 We make the following claim: Claim 1: There are at most 4 possible orders that are both temporally and geometrically consistent. [sent-242, score-0.265]
59 Time-Direction: All lines in the same sector induce the same order, up to time-direction ambiguity. [sent-245, score-0.415]
60 The known temporal order between two images (e.g., taken by the same camera) resolves the time-direction ambiguity. [sent-246, score-0.397]
61 Temporal Consistency: The order induced in each sector may be either consistent or inconsistent with all the known temporal orders. [sent-253, score-0.757]
62 In our example, if we are given that t(I1) < t(I2) and t(I3) < t(I4) (the green before the pink), then the sectors R2 and R4 are consistent, while R1 and R3 are not. [sent-254, score-0.311]
63 Adjacent sectors: The orders induced in adjacent sectors are different. [sent-256, score-0.598]
64 Proof of Claim 1: We prove that at most 4 sectors are consistent with the known temporal orders. [sent-266, score-0.685]
65 Hence, the question is how many sectors are consistent with the temporal order of I3 and I4? [sent-268, score-0.766]
66 Let’s consider only the critical points that affect the order of I3 and I4: c2, c3, c3,4 and c4. [sent-269, score-0.25]
67 Each of them can split either an inconsistent sector or a consistent one. [sent-277, score-0.305]
68 In the first case, the number of consistent sectors remains 2. [sent-278, score-0.369]
69 In the second case, a consistent sector is replaced by two consistent ones. [sent-279, score-0.33]
70 Thus, it follows that the maximum number of consistent sectors is 4 and is obtained if each of c2,3 and c2,4 splits a consistent sector. [sent-280, score-0.427]
71 The number of temporal permutations of IS from a combinatorial point of view is given by π = n! [sent-285, score-0.433]
72 As in the two ordered pairs case, we can further reduce the number of valid orders by determining the sectors that are temporally consistent. [sent-304, score-0.576]
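To make the reduction concrete for the four-point example above: combinatorially there are 4! = 24 orders (12 up to time-flip), while Claim 1 bounds the geometrically and temporally consistent ones:

```latex
\pi = n! \;\Rightarrow\; \pi\big|_{n=4} = 24, \qquad |\Gamma_j| \le 4 \quad \text{(Claim 1)}.
```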
73 A representative line ℓ ∈ Rj is selected, and the temporal order induced by this ℓ is computed. [sent-308, score-0.316]
74 In case all available temporal constraints for S are satisfied, the computed order is added to the order list, Γ. [sent-310, score-0.51]
75 In our case, each feature generally votes for more than a single order, and we set the weight of the vote to be inversely proportional to the number of orders it votes for (|Γi|). [sent-317, score-0.345]
76 In addition, the global order should be consistent with the known temporal orders (of images taken by the same camera) as well as with the computed partial orders Γi. [sent-320, score-1.085]
77 We set the pairwise known temporal orders to have a probability of 1 in the matrix. [sent-323, score-0.548]
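A minimal sketch of this weighted-vote aggregation, under our reading of the scheme (the Borda-style readout and the helper names are illustrative, not the paper's exact formulation): each feature set votes for every candidate order with weight 1/|Γi|, votes accumulate in a pairwise preference matrix, known same-camera orders are clamped to probability 1, and a global order is read off the row scores.

```python
import numpy as np

def aggregate_orders(num_images, candidate_orders, known_pairs):
    """candidate_orders: one list of permutations per feature set (its Gamma_i).
    known_pairs: iterable of (a, b) with t(I_a) < t(I_b), e.g., same-camera pairs."""
    M = np.zeros((num_images, num_images))
    for orders in candidate_orders:
        w = 1.0 / len(orders)               # weight inversely prop. to |Gamma_i|
        for perm in orders:
            for i, a in enumerate(perm):
                for b in perm[i + 1:]:
                    M[a, b] += w            # evidence that image a precedes image b
    tot = M + M.T
    with np.errstate(invalid="ignore"):
        P = np.where(tot > 0, M / tot, 0.5) # pairwise precedence probabilities
    for a, b in known_pairs:
        P[a, b], P[b, a] = 1.0, 0.0         # clamp known temporal orders to 1
    score = P.sum(axis=1)                   # Borda-style score: wins over others
    return np.argsort(-score).tolist()      # global order, earliest image first

# Tiny usage example: 3 images, one feature set voting for two orders, plus the
# known constraint t(I_1) < t(I_2); the aggregate resolves to [0, 1, 2].
print(aggregate_orders(3, [[[0, 1, 2], [0, 2, 1]]], [(1, 2)]))
```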
78 To quantitatively evaluate the results, we measured the percentage of incorrect pairwise orders out of the total number of image pairs, known as the Kendall distance. [sent-327, score-0.275]
79 That is, the error ranges from 0% (the order is perfectly correct) to 100% (all pairwise orders are incorrect). [sent-328, score-0.313]
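A sketch of this error measure, i.e., the Kendall distance expressed as a percentage of discordant image pairs:

```python
from itertools import combinations

def kendall_percent(estimated, ground_truth):
    """Both arguments are permutations (lists of image ids); returns 0-100."""
    pos_est = {img: i for i, img in enumerate(estimated)}
    pos_gt = {img: i for i, img in enumerate(ground_truth)}
    pairs = list(combinations(ground_truth, 2))
    wrong = sum(1 for a, b in pairs
                if (pos_est[a] < pos_est[b]) != (pos_gt[a] < pos_gt[b]))
    return 100.0 * wrong / len(pairs)

# One of the six pairs is flipped, so the error is 1/6, i.e., ~16.7%.
print(kendall_percent([0, 2, 1, 3], [0, 1, 2, 3]))
```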
80 Theoretically, there is no limitation to the scalability of our method as long as the motion of 3D points is not periodic, and provided that dynamic features are correctly matched in as many images as needed. [sent-341, score-0.429]
81 In particular, we used 54 images taken by 10 freely moving cameras (each camera provided a maximum of 10 images). [sent-350, score-0.498]
82 The 3D scene consisted of 100 3D lines, where each line was projected to only 3 images on average; the fundamental matrices were computed between 58% of the image pairs. [sent-351, score-0.278]
83 Yet, the maximum mean error is below 6% incorrect pairwise orders out of a total of 1431 image pairs. [sent-355, score-0.275]
84 Scalability: In this experiment, we evaluated the scalability of our method with respect to the total number of images and considered two cases: increasing the number of cameras, or increasing the number of images captured by each camera. [sent-356, score-0.31]
85 The size of the feature sets (i.e., the number of images in which each dynamic feature appears) in all experiments was in the range of 3 to 7, and the fundamental matrices were computed for 60% of the image pairs. [sent-360, score-0.365]
86 In each trial, we increased the maximum number of images provided by each camera (the actual number of images was randomly chosen for each camera in the range of 1 to a predefined maximum value). [sent-367, score-0.302]
87 The correspondences of dynamic features across the input images and the ground truth order were given by [13]. [sent-381, score-0.327]
88 Thus, although the dynamic features were manually extracted and matched across all images by [13], the actual size of the feature sets was much smaller. [sent-384, score-0.305]
89 Our method successfully recovered the correct temporal order of 19 images of RockClimbing taken by five cameras, and 14 images of HandWave taken by four cameras. [sent-385, score-0.775]
90 For the RockClimbing, the feature set consists of 10 images, taken by 3 cameras, whereas the HandWave feature set consists of 7 images taken by 4 cameras. [sent-392, score-0.272]
91 In particular, we used the method of Avidan & Shashua [16] to reconstruct the 3D linear trajectory of each 3D dynamic feature and the 3D locations along it. [sent-396, score-0.329]
92 The temporal order is then given by the spatial order of the 3D locations along the line. [sent-397, score-0.478]
93 For the HandWave dataset, 28% of the 91 pairwise orders were incorrect, and for the RockClimbing dataset 43% out of the 171 pairwise orders were incorrect. [sent-400, score-0.464]
94 The original dataset consists of 15 images, taken by two hand-held mobile phones (iPhone 4), where ten of the images were taken by the first camera, and five by the other one. [sent-419, score-0.327]
95 Our method was not provided with any prior information except for the known temporal constraints. [sent-423, score-0.316]
96 We successfully recovered the correct temporal order with no error. [sent-424, score-0.443]
97 The main challenge in this dataset was matching the dynamic features due to the change in appearance of the boats. [sent-426, score-0.266]
98 The temporal order can, in some cases, be determined by the spatial order of the features in the stroboscopic image, but this will not work in the general case. [sent-451, score-0.585]
99 Since all we care about is the order of images, we can tolerate inaccuracy and partial information in both the computed geometry and the matching of dynamic features. [sent-455, score-0.267]
100 In particular, we dropped the static camera assumption of [4] and compensated for the uncertainty in space by adding temporal certainty that stems from knowing the order of images taken by each camera. [sent-460, score-0.95]
wordName wordTfidf (topN-words)
[('temporal', 0.316), ('sectors', 0.311), ('rockclimbing', 0.268), ('orders', 0.232), ('sector', 0.214), ('handwave', 0.19), ('basha', 0.19), ('dynamic', 0.186), ('epipolar', 0.163), ('trajectory', 0.143), ('lines', 0.13), ('cameras', 0.12), ('static', 0.118), ('critical', 0.116), ('stroboscopic', 0.107), ('taken', 0.106), ('nonrigid', 0.106), ('pj', 0.094), ('camera', 0.091), ('permutations', 0.083), ('order', 0.081), ('boats', 0.08), ('sequencing', 0.08), ('shashua', 0.076), ('circle', 0.074), ('avidan', 0.073), ('moving', 0.071), ('fundamental', 0.071), ('induce', 0.071), ('reference', 0.069), ('ordering', 0.068), ('line', 0.062), ('images', 0.06), ('matched', 0.059), ('rank', 0.058), ('consistent', 0.058), ('voting', 0.057), ('isi', 0.057), ('phones', 0.055), ('intersects', 0.055), ('induced', 0.055), ('carnival', 0.054), ('crowdcam', 0.054), ('kaminski', 0.054), ('nrsfm', 0.054), ('pi', 0.054), ('points', 0.053), ('increasing', 0.053), ('captured', 0.051), ('viewpoints', 0.051), ('freely', 0.05), ('synchronization', 0.049), ('matrices', 0.048), ('crossing', 0.048), ('asynchronously', 0.048), ('casually', 0.048), ('reversing', 0.048), ('aviv', 0.048), ('soeft', 0.048), ('akhter', 0.048), ('aggregation', 0.047), ('uncertainty', 0.047), ('assumption', 0.046), ('photo', 0.046), ('recovered', 0.046), ('dropped', 0.045), ('ij', 0.045), ('recover', 0.044), ('marked', 0.044), ('incorrect', 0.043), ('synthetic', 0.042), ('smartphones', 0.041), ('bregler', 0.04), ('certainty', 0.04), ('overlaying', 0.04), ('moses', 0.04), ('geometric', 0.039), ('event', 0.039), ('tel', 0.038), ('iphone', 0.038), ('transitivity', 0.038), ('ballan', 0.038), ('motion', 0.038), ('votes', 0.038), ('unit', 0.037), ('vote', 0.037), ('jn', 0.037), ('orr', 0.037), ('consisted', 0.037), ('pink', 0.034), ('visualization', 0.034), ('claim', 0.034), ('point', 0.034), ('recovering', 0.034), ('eng', 0.033), ('inconsistent', 0.033), ('scalability', 0.033), ('temporally', 0.033), ('constraints', 0.032), ('sc', 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing
Author: Tali Dekel_(Basha), Yael Moses, Shai Avidan
Abstract: Photo-sequencing is the problem of recovering the temporal order of a set of still images of a dynamic event, taken asynchronously by a set of uncalibrated cameras. Solving this problem is a first, crucial step for analyzing (or visualizing) the dynamic content of the scene captured by a large number of freely moving spectators. We propose a geometric based solution, followed by rank aggregation, to the photo-sequencing problem. Our algorithm trades spatial certainty for temporal certainty. Whereas the previous solution proposed by [4] relies on two images taken from the same static camera to eliminate uncertainty in space, we drop the static-camera assumption and replace it with temporal information available from images taken from the same (moving) camera. Our method thus overcomes the limitation of the static-camera assumption, and scales much better with the duration of the event and the spread of cameras in space. We present successful results on challenging real data sets and large scale synthetic data (250 images).
2 0.16219783 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints
Author: Yifan Zhang, Qiang Ji, Hanqing Lu
Abstract: In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always obtain robust and reliable result. It is essential to detect them in a holistic way which incorporates the causality and temporal dependency among them to compensate the limitation of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extendAllen ’s interval algebra network (IAN) [2]from a deterministic static model to a probabilistic dynamic system, which can not only capture the complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty from the noisy visual observation. In the model, the topology of the IAN on each time slice and the interlinks between the time slices are discovered by an advanced structure learning method. The duration of the event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundary of the event. Empirical results on two real world datasets show the power of the proposed interval temporal constrained model.
3 0.15629399 68 iccv-2013-Camera Alignment Using Trajectory Intersections in Unsynchronized Videos
Author: Thomas Kuo, Santhoshkumar Sunderrajan, B.S. Manjunath
Abstract: This paper addresses the novel and challenging problem of aligning camera views that are unsynchronized by low and/or variable frame rates using object trajectories. Unlike existing trajectory-based alignment methods, our method does not require frame-to-frame synchronization. Instead, we propose using the intersections of corresponding object trajectories to match views. To find these intersections, we introduce a novel trajectory matching algorithm based on matching Spatio-Temporal Context Graphs (STCGs). These graphs represent the distances between trajectories in time and space within a view, and are matched to an STCG from another view to find the corresponding trajectories. To the best of our knowledge, this is one of the first attempts to align views that are unsynchronized with variable frame rates. The results on simulated and real-world datasets show trajectory intersections area viablefeatureforcamera alignment, and that the trajectory matching method performs well in real-world scenarios.
4 0.11750508 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
Author: Weixin Li, Qian Yu, Ajay Divakaran, Nuno Vasconcelos
Abstract: The problem of adaptively selecting pooling regions for the classification of complex video events is considered. Complex events are defined as events composed of several characteristic behaviors, whose temporal configuration can change from sequence to sequence. A dynamic pooling operator is defined so as to enable a unified solution to the problems of event specific video segmentation, temporal structure modeling, and event detection. Video is decomposed into segments, and the segments most informative for detecting a given event are identified, so as to dynamically determine the pooling operator most suited for each sequence. This dynamic pooling is implemented by treating the locations of characteristic segments as hidden information, which is inferred, on a sequence-by-sequence basis, via a large-margin classification rule with latent variables. Although the feasible set of segment selections is combinatorial, it is shown that a globally optimal solution to the inference problem can be obtained efficiently, through the solution of a series of linear programs. Besides the coarselevel location of segments, a finer model of video struc- ture is implemented by jointly pooling features of segmenttuples. Experimental evaluation demonstrates that the re- sulting event detector has state-of-the-art performance on challenging video datasets.
5 0.1165254 439 iccv-2013-Video Co-segmentation for Meaningful Action Extraction
Author: Jiaming Guo, Zhuwen Li, Loong-Fah Cheong, Steven Zhiying Zhou
Abstract: Given a pair of videos having a common action, our goal is to simultaneously segment this pair of videos to extract this common action. As a preprocessing step, we first remove background trajectories by a motion-based figureground segmentation. To remove the remaining background and those extraneous actions, we propose the trajectory cosaliency measure, which captures the notion that trajectories recurring in all the videos should have their mutual saliency boosted. This requires a trajectory matching process which can compare trajectories with different lengths and not necessarily spatiotemporally aligned, and yet be discriminative enough despite significant intra-class variation in the common action. We further leverage the graph matching to enforce geometric coherence between regions so as to reduce feature ambiguity and matching errors. Finally, to classify the trajectories into common action and action outliers, we formulate the problem as a binary labeling of a Markov Random Field, in which the data term is measured by the trajectory co-saliency and the smooth- ness term is measured by the spatiotemporal consistency between trajectories. To evaluate the performance of our framework, we introduce a dataset containing clips that have animal actions as well as human actions. Experimental results show that the proposed method performs well in common action extraction.
6 0.11469999 250 iccv-2013-Lifting 3D Manhattan Lines from a Single Image
7 0.11043074 382 iccv-2013-Semi-dense Visual Odometry for a Monocular Camera
8 0.10738644 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
9 0.10358574 128 iccv-2013-Dynamic Probabilistic Volumetric Models
10 0.093237825 280 iccv-2013-Multi-view 3D Reconstruction from Uncalibrated Radially-Symmetric Cameras
11 0.092640668 17 iccv-2013-A Global Linear Method for Camera Pose Registration
12 0.091781683 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
13 0.091204777 282 iccv-2013-Multi-view Object Segmentation in Space and Time
14 0.090614796 314 iccv-2013-Perspective Motion Segmentation via Collaborative Clustering
15 0.089329779 39 iccv-2013-Action Recognition with Improved Trajectories
16 0.088861816 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition
17 0.086412914 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
18 0.084612638 49 iccv-2013-An Enhanced Structure-from-Motion Paradigm Based on the Absolute Dual Quadric and Images of Circular Points
19 0.083824418 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
20 0.082251109 423 iccv-2013-Towards Motion Aware Light Field Video for Dynamic Scenes
topicId topicWeight
[(0, 0.214), (1, -0.072), (2, 0.002), (3, 0.11), (4, -0.008), (5, 0.064), (6, 0.062), (7, -0.072), (8, 0.063), (9, 0.018), (10, -0.023), (11, -0.032), (12, -0.018), (13, 0.063), (14, -0.057), (15, 0.006), (16, 0.035), (17, 0.073), (18, -0.028), (19, 0.012), (20, 0.008), (21, -0.092), (22, -0.02), (23, 0.06), (24, -0.027), (25, 0.037), (26, -0.009), (27, 0.008), (28, -0.026), (29, -0.03), (30, 0.003), (31, 0.087), (32, -0.019), (33, 0.024), (34, 0.034), (35, -0.027), (36, 0.021), (37, -0.006), (38, 0.067), (39, -0.002), (40, 0.035), (41, 0.056), (42, -0.032), (43, 0.011), (44, 0.014), (45, -0.014), (46, 0.011), (47, 0.051), (48, 0.087), (49, 0.062)]
simIndex simValue paperId paperTitle
same-paper 1 0.95106357 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing
Author: Tali Dekel_(Basha), Yael Moses, Shai Avidan
Abstract: Photo-sequencing is the problem of recovering the temporal order of a set of still images of a dynamic event, taken asynchronously by a set of uncalibrated cameras. Solving this problem is a first, crucial step for analyzing (or visualizing) the dynamic content of the scene captured by a large number of freely moving spectators. We propose a geometric based solution, followed by rank aggregation, to the photo-sequencing problem. Our algorithm trades spatial certainty for temporal certainty. Whereas the previous solution proposed by [4] relies on two images taken from the same static camera to eliminate uncertainty in space, we drop the static-camera assumption and replace it with temporal information available from images taken from the same (moving) camera. Our method thus overcomes the limitation of the static-camera assumption, and scales much better with the duration of the event and the spread of cameras in space. We present successful results on challenging real data sets and large scale synthetic data (250 images).
2 0.67284179 68 iccv-2013-Camera Alignment Using Trajectory Intersections in Unsynchronized Videos
Author: Thomas Kuo, Santhoshkumar Sunderrajan, B.S. Manjunath
Abstract: This paper addresses the novel and challenging problem of aligning camera views that are unsynchronized by low and/or variable frame rates using object trajectories. Unlike existing trajectory-based alignment methods, our method does not require frame-to-frame synchronization. Instead, we propose using the intersections of corresponding object trajectories to match views. To find these intersections, we introduce a novel trajectory matching algorithm based on matching Spatio-Temporal Context Graphs (STCGs). These graphs represent the distances between trajectories in time and space within a view, and are matched to an STCG from another view to find the corresponding trajectories. To the best of our knowledge, this is one of the first attempts to align views that are unsynchronized with variable frame rates. The results on simulated and real-world datasets show trajectory intersections area viablefeatureforcamera alignment, and that the trajectory matching method performs well in real-world scenarios.
3 0.65938032 348 iccv-2013-Refractive Structure-from-Motion on Underwater Images
Author: Anne Jordt-Sedlazeck, Reinhard Koch
Abstract: In underwater environments, cameras need to be confined in an underwater housing, viewing the scene through a piece of glass. In case of flat port underwater housings, light rays entering the camera housing are refracted twice, due to different medium densities of water, glass, and air. This causes the usually linear rays of light to bend and the commonly used pinhole camera model to be invalid. When using the pinhole camera model without explicitly modeling refraction in Structure-from-Motion (SfM) methods, a systematic model error occurs. Therefore, in this paper, we propose a system for computing camera path and 3D points with explicit incorporation of refraction using new methods for pose estimation. Additionally, a new error function is introduced for non-linear optimization, especially bundle adjustment. The proposed method allows to increase reconstruction accuracy and is evaluated in a set of experiments, where the proposed method’s performance is compared to SfM with the perspective camera model.
4 0.65588081 280 iccv-2013-Multi-view 3D Reconstruction from Uncalibrated Radially-Symmetric Cameras
Author: Jae-Hak Kim, Yuchao Dai, Hongdong Li, Xin Du, Jonghyuk Kim
Abstract: We present a new multi-view 3D Euclidean reconstruction method for arbitrary uncalibrated radially-symmetric cameras, which needs no calibration or any camera model parameters other than radial symmetry. It is built on the radial 1D camera model [25], a unified mathematical abstraction to different types of radially-symmetric cameras. We formulate the problem of multi-view reconstruction for radial 1D cameras as a matrix rank minimization problem. Efficient implementation based on alternating direction continuation is proposed to handle scalability issue for real-world applications. Our method applies to a wide range of omnidirectional cameras including both dioptric and catadioptric (central and non-central) cameras. Additionally, our method deals with complete and incomplete measurements under a unified framework elegantly. Experiments on both synthetic and real images from various types of cameras validate the superior performance of our new method, in terms of numerical accuracy and robustness.
5 0.65108889 17 iccv-2013-A Global Linear Method for Camera Pose Registration
Author: Nianjuan Jiang, Zhaopeng Cui, Ping Tan
Abstract: We present a linear method for global camera pose registration from pairwise relative poses encoded in essential matrices. Our method minimizes an approximate geometric error to enforce the triangular relationship in camera triplets. This formulation does not suffer from the typical ‘unbalanced scale ’ problem in linear methods relying on pairwise translation direction constraints, i.e. an algebraic error; nor the system degeneracy from collinear motion. In the case of three cameras, our method provides a good linear approximation of the trifocal tensor. It can be directly scaled up to register multiple cameras. The results obtained are accurate for point triangulation and can serve as a good initialization for final bundle adjustment. We evaluate the algorithm performance with different types of data and demonstrate its effectiveness. Our system produces good accuracy, robustness, and outperforms some well-known systems on efficiency.
7 0.63055521 436 iccv-2013-Unsupervised Intrinsic Calibration from a Single Frame Using a "Plumb-Line" Approach
8 0.62554306 226 iccv-2013-Joint Subspace Stabilization for Stereoscopic Video
9 0.61873645 164 iccv-2013-Fibonacci Exposure Bracketing for High Dynamic Range Imaging
10 0.61166751 250 iccv-2013-Lifting 3D Manhattan Lines from a Single Image
11 0.61117238 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
12 0.5923627 145 iccv-2013-Estimating the Material Properties of Fabric from Video
13 0.58820075 346 iccv-2013-Rectangling Stereographic Projection for Wide-Angle Image Visualization
14 0.57959545 115 iccv-2013-Direct Optimization of Frame-to-Frame Rotation
15 0.57927597 152 iccv-2013-Extrinsic Camera Calibration without a Direct View Using Spherical Mirror
16 0.57714391 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints
17 0.57311213 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
18 0.56208342 90 iccv-2013-Content-Aware Rotation
19 0.5530718 268 iccv-2013-Modeling 4D Human-Object Interactions for Event and Object Recognition
20 0.54942161 243 iccv-2013-Learning Slow Features for Behaviour Analysis
topicId topicWeight
[(2, 0.059), (7, 0.037), (12, 0.014), (26, 0.066), (31, 0.038), (40, 0.019), (42, 0.1), (48, 0.017), (62, 0.181), (64, 0.064), (73, 0.046), (89, 0.219), (95, 0.021), (98, 0.018)]
simIndex simValue paperId paperTitle
1 0.91965109 100 iccv-2013-Curvature-Aware Regularization on Riemannian Submanifolds
Author: Kwang In Kim, James Tompkin, Christian Theobalt
Abstract: One fundamental assumption in object recognition as well as in other computer vision and pattern recognition problems is that the data generation process lies on a manifold and that it respects the intrinsic geometry of the manifold. This assumption is held in several successful algorithms for diffusion and regularization, in particular, in graph-Laplacian-based algorithms. We claim that the performance of existing algorithms can be improved if we additionally account for how the manifold is embedded within the ambient space, i.e., if we consider the extrinsic geometry of the manifold. We present a procedure for characterizing the extrinsic (as well as intrinsic) curvature of a manifold M which is described by a sampled point cloud in a high-dimensional Euclidean space. Once estimated, we use this characterization in general diffusion and regularization on M, and form a new regularizer on a point cloud. The resulting re-weighted graph Laplacian demonstrates superior performance over classical graph Laplacian in semisupervised learning and spectral clustering.
2 0.9153564 55 iccv-2013-Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images
Author: Juan Liu, Emmanouil Psarakis, Ioannis Stamos
Abstract: Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes. Therefore, detection of these repeated patterns becomes very important for city scene analysis. This paper attacks the problem of repeated patterns detection in a precise, efficient and automatic way, by combining traditional feature extraction followed by a Kronecker product lowrank modeling approach. Our method is tailored for 2D images of building fac ¸ades. We have developed algorithms for automatic selection ofa representative texture withinfa ¸cade images using vanishing points and Harris corners. After rectifying the input images, we describe novel algorithms that extract repeated patterns by using Kronecker product based modeling that is based on a solid theoretical foundation. Our approach is unique and has not ever been used for fac ¸ade analysis. We have tested our algorithms in a large set of images.
same-paper 3 0.86370504 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing
Author: Tali Dekel_(Basha), Yael Moses, Shai Avidan
Abstract: Photo-sequencing is the problem of recovering the temporal order of a set of still images of a dynamic event, taken asynchronously by a set of uncalibrated cameras. Solving this problem is a first, crucial step for analyzing (or visualizing) the dynamic content of the scene captured by a large number of freely moving spectators. We propose a geometric based solution, followed by rank aggregation, to the photo-sequencing problem. Our algorithm trades spatial certainty for temporal certainty. Whereas the previous solution proposed by [4] relies on two images taken from the same static camera to eliminate uncertainty in space, we drop the static-camera assumption and replace it with temporal information available from images taken from the same (moving) camera. Our method thus overcomes the limitation of the static-camera assumption, and scales much better with the duration of the event and the spread of cameras in space. We present successful results on challenging real data sets and large scale synthetic data (250 images).
4 0.85001981 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
Author: Meng Yang, Luc Van_Gool, Lei Zhang
Abstract: Face recognition (FR) with a single training sample per person (STSPP) is a very challenging problem due to the lack of information to predict the variations in the query sample. Sparse representation based classification has shown interesting results in robust FR; however, its performance will deteriorate much for FR with STSPP. To address this issue, in this paper we learn a sparse variation dictionary from a generic training set to improve the query sample representation by STSPP. Instead of learning from the generic training set independently w.r.t. the gallery set, the proposed sparse variation dictionary learning (SVDL) method is adaptive to the gallery set by jointly learning a projection to connect the generic training set with the gallery set. The learnt sparse variation dictionary can be easily integrated into the framework of sparse representation based classification so that various variations in face images, including illumination, expression, occlusion, pose, etc., can be better handled. Experiments on the large-scale CMU Multi-PIE, FRGC and LFW databases demonstrate the promising performance of SVDL on FR with STSPP.
5 0.84382689 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
Author: Xiang Yu, Junzhou Huang, Shaoting Zhang, Wang Yan, Dimitris N. Metaxas
Abstract: This paper addresses the problem of facial landmark localization and tracking from a single camera. We present a two-stage cascaded deformable shape model to effectively and efficiently localize facial landmarks with large head pose variations. For face detection, we propose a group sparse learning method to automatically select the most salient facial landmarks. By introducing 3D face shape model, we use procrustes analysis to achieve pose-free facial landmark initialization. For deformation, the first step uses mean-shift local search with constrained local model to rapidly approach the global optimum. The second step uses component-wise active contours to discriminatively refine the subtle shape variation. Our framework can simultaneously handle face detection, pose-free landmark localization and tracking in real time. Extensive experiments are conducted on both laboratory environmental face databases and face-in-the-wild databases. All results demonstrate that our approach has certain advantages over state-of-theart methods in handling pose variations1.
6 0.8179751 433 iccv-2013-Understanding High-Level Semantics by Modeling Traffic Patterns
7 0.8177669 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
8 0.81712866 57 iccv-2013-BOLD Features to Detect Texture-less Objects
9 0.81598681 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
10 0.81533098 160 iccv-2013-Fast Object Segmentation in Unconstrained Video
11 0.81517577 410 iccv-2013-Support Surface Prediction in Indoor Scenes
12 0.81514674 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
13 0.81500119 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
14 0.81478608 379 iccv-2013-Semantic Segmentation without Annotating Segments
15 0.81477475 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
16 0.81463349 82 iccv-2013-Compensating for Motion during Direct-Global Separation
17 0.81457651 338 iccv-2013-Randomized Ensemble Tracking
18 0.8142277 300 iccv-2013-Optical Flow via Locally Adaptive Fusion of Complementary Data Costs
19 0.81410283 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
20 0.81389874 190 iccv-2013-Handling Occlusions with Franken-Classifiers