iccv iccv2013 iccv2013-278 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kaoning Hu, Lijun Yin
Abstract: In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand region and its convex hull. Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. In addition to hand postures, we also extend the feature representations to arm postures. The results demonstrate the feasibility and reliability of the proposed method.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. [sent-2, score-0.487]
2 Our method studies the topology of the holes between the hand region and its convex hull. [sent-4, score-0.636]
3 Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. [sent-5, score-0.582]
4 In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. [sent-6, score-0.612]
5 The 2D hand models [13] are B-spline curves constructed by control points which are matched to the hand region by using partitioned sampling algorithms. [sent-13, score-0.43]
6 However, a single 2D model is limited to describe one or two postures [13]. [sent-14, score-0.415]
7 , static hand postures) [14], and can involve one hand or two hands [16]. [sent-30, score-0.436]
8 In this paper, we focus on the static hand postures of a single hand. [sent-31, score-0.629]
9 In shape-based approaches, hand postures are recognized by matching the hand contour to the training samples or to hand postures and the desire of real-time applications, people either fuse multiple features [25] or develop new features. [sent-40, score-1.46]
10 It is an algebraic topological invariant that has been used as a mathematical tool for algorithmically analyzing topological features of shapes or functions, and has been previously applied to the problem of shape analysis and retrieval [4, 7]. [sent-43, score-0.553]
11 Moreover, the topological similarity of hand postures makes it difficult to distinguish various hand postures based only on the topology of hand shapes. [sent-45, score-1.543]
12 In this work, we consider the complementary holes between the hand region and its convex hull as topological spaces. [sent-46, score-0.975]
13 The complementary holes (colored regions) between the hand postures and their convex hulls. [sent-49, score-1.03]
14 We propose a multi-scale topological feature representation for hand posture analysis inspired by the principle of Persistent Homology. [sent-52, score-0.813]
15 We study the holes between the hand region and its convex hull and distinguish different hand postures by the unique topology of these holes. [sent-53, score-1.338]
16 The unique multi-scale Betti Numbers matrix (MSBNM) is computed for hand posture representation, characterization, and classification. [sent-54, score-0.552]
17 Representing hand postures using complementary holes. Topological features rely on the number of parts (connected components) and holes of an object. [sent-58, score-1.294]
18 Since many hand postures yield the same topology as they do not show any holes, we propose to “create” topological holes from hand images through the construction of their convex hulls. [sent-60, score-1.476]
19 The convex hull of a given hand posture is the minimal convex set containing the hand region and can be derived by a convex polygon. [sent-61, score-0.988]
20 Due to the typical concave shape of the hand region, non-trivial holes are observed between the hand region and its convex hull. [sent-62, score-0.837]
21 We call these holes the complementary holes of a hand posture. [sent-63, score-0.879]
22 Figure 1 shows two examples of different postures and their complementary holes. [sent-64, score-0.465]
23 Using the complementary holes to describe hand postures has the following advantages. [sent-65, score-0.978]
24 First, it is observed that each hand posture produces unique sets of complementary holes. [sent-66, score-0.587]
25 Second, the complementary holes are independent of illumination and rotation. [sent-67, score-0.412]
26 Third, it is straightforward to quantify each set of the complementary holes using the topological features. [sent-68, score-0.614]
27 Although each hand posture produces unique sets of complementary holes, simply counting the number of holes is not sufficient to distinguish different postures, nor reliable enough to tolerate variances of a single posture. [sent-69, score-0.918]
28 This is because the number of holes of a posture is changeable due to noise in the hand images or variations of viewing angles. [sent-70, score-0.839]
29 Ambiguity may be caused by the same number of holes from different postures. [sent-71, score-0.334]
30 Betti numbers [27] are used to quantify the topological features: in a topological space, B0 is defined as the number of parts (connected components) of the space, and B1 is defined as the number of holes. [sent-78, score-0.549]
31 So this scaling process will change the connectivity of the points, therefore the persistence of the holes can change, and so can the topological features. [sent-87, score-0.564]
32 Because topological features come in all scale-levels and can be nested or in more complicated relationships, observing the homology classes as to how they change as the scale changes [6] will help us to exploit detailed features of hand postures. [sent-88, score-0.719]
33 To serve our purpose of hand posture representation, the number of holes with respect to sequential scales is adopted for Betti number representation. [sent-89, score-0.865]
34 If we group the hand shape and its convex hull as one topological space, we can analyze the number of holes (B1) of this space. [sent-90, score-0.947]
35 Moreover, if we consider each hole as a topological space, we can analyze the number of parts (B0) of the hole’s space (i. [sent-91, score-0.517]
36 Such a representation takes into account not only the number of holes but also the life-span of each hole. [sent-94, score-0.34]
37 Details about the feature representation of hand postures are illustrated in the following sections. [sent-95, score-0.654]
38 Feature representation by MSBNM. Persistent Homology can detect holes in a coordinate-free system [27]. [sent-98, score-0.34]
39 In order to instantiate its application on a Cartesian coordinate system, which is the 2D domain on which the hand images are represented, we simplify the computation of Persistent Homology with image processing tools to locate holes in 2D images. [sent-99, score-0.513]
40 Multiple morphological dilations of the hand region lead to multiple scales associated with the homology (i. [sent-104, score-0.528]
41 Since we consider each hole as a topological space, applying the dilation operation to the convex hull region of a hand is equivalent to applying an erosion operation O to the hole regions of the hand within the convex hull. [sent-107, score-1.436]
42 This operation is applied repeatedly until the holes are completely eroded. [sent-108, score-0.316]
43 ... → O^s(K) → 0 (Eq. 2). In our proposed approach, we track the topology of each hole as the scale changes. [sent-112, score-0.316]
44 At the original scale, each hole appears as an individual connected component and is considered as a topological space. [sent-113, score-0.555]
45 A B0 sequence is constructed for each hole in the convex hull of the hand. [sent-122, score-0.393]
46 The multi-scale Betti Numbers are defined as a matrix as shown in Figure 2, where each column represents the B0 sequence (connected components) of each hole along with the scales at which it is found, as we consider each hole as a topological space. [sent-124, score-0.804]
47 Notice that the sum of each row is the total number of holes at each individual scale, which is also the hand’s B1 value of the topological space within its convex hull. [sent-127, score-0.616]
48 If a hole is preserved at very large scales, the hole is significant and should be considered as a key feature of the hand posture. [sent-135, score-0.731]
49 As a result, MSBNM partially reflects the size of each hole, improving the distinguishability of different postures with the same topology. [sent-137, score-0.415]
50 Figure 4 shows MSBNM of three hand postures from the dorsal view. [sent-138, score-0.646]
51 Since we use only 7 scales rather than infinitely many, some large holes may persist at all scales. [sent-143, score-0.33]
52 We bound the hand region using an ellipse E, and use a smaller ellipse which is 1/32 (by dimension, not area) of E as the erosion structure. [sent-149, score-0.369]
53 Then, the rest of the holes are sorted in counter-clockwise order with respect to the center of the hand, following the reference hole. [sent-154, score-0.316]
54 Figure 6. Twelve different hand postures. [sent-155, score-0.612]
55 Figure 5 shows an example where the holes of two samples with the same posture were matched. [sent-156, score-0.681]
56 Based on our observation, hand postures can only generate a finite number of hole regions. [sent-157, score-0.862]
57 Thus, for hand posture analysis, we only count at most 7 largest holes, resulting in 7 columns of MSBNM. [sent-158, score-0.523]
58 In cases where some postures produce fewer than 7 holes, we add padding columns of 0s to the MSBNM to make the matrix dimensions consistent. [sent-159, score-0.451]
59 The acquisition of MSBNM follows six steps: (1) Compute the convex hull of the hand region. [sent-160, score-0.341]
60 (2) Consider the hand region and its convex hull as one topological space and locate all holes inside the convex hull. [sent-161, score-0.977]
61 It is optional to use a threshold and discard the tiny holes which are caused by noise. [sent-162, score-0.334]
62 (3) Enclose the hand region using an ellipse E, and use a smaller ellipse which is 1/32 of E as the erosion structure. [sent-163, score-0.369]
63 (4) Consider each hole as a topological space, and apply the morphological erosion operations iteratively 6 times on the region to generate multiple scale representations. [sent-164, score-0.658]
64 Then, we record the B0 (connected components) sequence of each individual hole space along with each step of the erosion operation. [sent-165, score-0.343]
65 (6) Sort the remaining holes in counter-clockwise order with the reference hole first. [sent-167, score-0.566]
66 The MSBNM is then used as the input of a classifier for hand posture classification. [sent-170, score-0.523]
67 Experiments on Hand Posture Recognition. We created a database of 12 different types of hand postures with 100 samples of each posture, for 1,200 samples in total. [sent-172, score-1.016]
68 In addition, we have also tested our approach on another public hand posture database [20]. [sent-174, score-0.523]
69 Before the hand posture is described by MSBNM, the region of the hand must be detected and segmented. [sent-175, score-0.756]
70 Since our focus is to evaluate the efficacy of the new hand feature representation, we use a simple yet relatively reliable method [9], which combines background subtraction, skin region segmentation, and the AAM method to detect the hand region. [sent-181, score-0.463]
71 After segmentation of the hand region, the program computes the MSBNM of the hand posture at each frame. [sent-184, score-0.737]
72 The yaw rotation range of the hand is [0°, 90°], and the pitch/roll range of the hand is [−30°, 30°]. [sent-186, score-0.447]
73 We selected 4 types of postures from their data set, which had been included in our posture set as (a), (b), (c) and (h) shown in Figure 6. [sent-212, score-0.741]
74 The samples of their postures are shown in Figure 9. [sent-214, score-0.454]
75 Since the hand images of Jochen’s dataset are taken from palm view, where the textures are different from dorsal view, we created a training set of 100 samples of each of the 12 hand postures (1200 in total) of the palm view from our lab. [sent-215, score-0.918]
76 The results demonstrate that our approach is more robust than the compared approaches in terms of the person-independent test for hand posture classification. [sent-230, score-0.523]
77 Performance Evaluation. In order to evaluate the robustness of our MSBNM approach, we conducted experiments for hand posture recognition under various imaging conditions (e. [sent-232, score-0.538]
78 It shows that the proposed topological feature representation is more robust under various image resolutions than the compared approaches. [sent-246, score-0.33]
79 There are two types of degradations affecting hand posture recognition. [sent-250, score-0.63]
80 The second one is imperfect or noisy segmentation of the hand region caused by uneven illumination. [sent-252, score-0.308]
81 Evaluation under cross illuminations. Knowing that illumination conditions can impact the performance of hand posture recognition, we evaluated the proposed MSBNM approach using a training set and a testing set with different lighting conditions. [sent-264, score-0.492]
82 We collected hand posture samples under four different lighting conditions: bright illumination, medium bright illumination, dark illumination, and uneven illumination. [sent-265, score-0.599]
83 Evaluation under cross poses on various rotations. We evaluated the performance of the MSBNM approach when the training set and the testing set had different hand rotations. [sent-289, score-0.325]
84 We collected 100 samples of each posture within each rotation range, so the data set consists of 4,800 samples in total. [sent-292, score-0.433]
85 The depth camera was put in front of the hand, so the hand is segmented by using the distance between the hand and the camera as shown in Figure 11 (a). [sent-316, score-0.41]
86 In this application, the user can use hand postures defined in Figure 6 to interact with a program, such as drag (Posture a), draw on (Posture b), zoom in (Posture h), or reset (Posture c) the map. [sent-334, score-0.612]
87 Eight arm postures and their complementary holes. The results demonstrate the computational efficiency of the proposed feature representation and show the feasibility of hand posture recognition in real time. [sent-337, score-1.424]
88 Extension to arm postures. The MSBNM representation of hand postures extends naturally to arm posture representation, since arm postures exhibit similar “hole” topological features under the convex hull of a body. [sent-340, score-2.47]
89 Figure 12 shows eight postures used for our study. [sent-342, score-0.415]
90 Using the depth camera, we captured 100 samples of each arm posture of one person as the training set, and 100 samples of each arm posture of another person as the testing data. [sent-343, score-0.888]
91 The results in Table 8 demonstrate the applicability of using our MSBNM representation for arm posture recognition. [sent-356, score-0.413]
92 Conclusion. In this paper we proposed a novel approach to analyze the topological features of hand postures at multiple scales. [sent-358, score-0.879]
93 Since many postures do not show explicit “holes”, we compute the convex hull of the hand region and consider the complementary space of the hand as holes. [sent-359, score-1.023]
94 We use the multi-scale Betti Numbers matrix inspired by Persistent Homology to describe the multi-scale topological features of the hand posture. [sent-360, score-0.46]
95 Experimental results show that the multi-scale topological feature representation of hand postures by MSBNM is capable of distinguishing multiple hand postures against various illuminations, rotations, and resolutions. [sent-361, score-1.529]
96 In our future work, we will further analyze the Homology of the hand posture and will investigate the issue of partial occlusion by fusing it with the other texture based or contour based features. [sent-362, score-0.564]
97 The multi-scale Betti numbers matrix as a new feature descriptor is also applicable for representing objects with holes or complementary holes. [sent-365, score-0.452]
98 A mayer-vietoris formula for persistent homology with an application to shape recognition in the presence of occlusions. [sent-398, score-0.437]
99 Persistent betti numbers for a noise tolerant shape-based approach to image retrieval. [sent-413, score-0.306]
100 Hand posture classification and recognition using the modified census transform. [sent-430, score-0.346]
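The MSBNM acquisition described in sentences 59 to 66 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the complementary holes have already been extracted and ordered by the caller, represents each hole as a set of (row, col) pixels, and erodes with a fixed 3x3 square rather than the scaled ellipse described in the text. The names count_components, erode, b0_sequence, and msbnm are hypothetical.

```python
def count_components(pixels):
    """Count 4-connected components of a pixel set (the B0 value)."""
    remaining = set(pixels)
    count = 0
    while remaining:
        count += 1
        stack = [remaining.pop()]
        while stack:
            r, c = stack.pop()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    stack.append(nb)
    return count


def erode(pixels):
    """Erode with a 3x3 square (assumed structure): keep a pixel only if
    its full 3x3 neighbourhood lies inside the set."""
    return {
        (r, c)
        for (r, c) in pixels
        if all((r + dr, c + dc) in pixels
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))
    }


def b0_sequence(hole, n_scales=7):
    """B0 of one hole at the original scale and after each erosion."""
    seq, current = [], set(hole)
    for _ in range(n_scales):
        seq.append(count_components(current))
        current = erode(current)
    return seq


def msbnm(holes, n_scales=7, n_cols=7):
    """Rows are scales, columns are holes; missing holes are zero-padded."""
    columns = [b0_sequence(h, n_scales) for h in holes[:n_cols]]
    columns += [[0] * n_scales for _ in range(n_cols - len(columns))]
    return [[col[s] for col in columns] for s in range(n_scales)]


def blob(c0):
    """A 5x5 block of pixels whose left column is c0."""
    return {(r, c) for r in range(5) for c in range(c0, c0 + 5)}


# Toy holes: two blocks joined by a thin bridge, and a small 3x3 square.
dumbbell = blob(0) | {(2, 5), (2, 6), (2, 7)} | blob(8)
square = {(r, c) for r in range(3) for c in range(3)}

print(b0_sequence(dumbbell))   # [1, 2, 2, 0, 0, 0, 0]
matrix = msbnm([dumbbell, square])
print([sum(row) for row in matrix])  # holes per scale: [2, 3, 2, 0, 0, 0, 0]
```

In the toy example the thin bridge disappears after one erosion, so the first hole splits into two components before vanishing. That splitting and the life-span of each hole are what the column sequences record, and each row sum gives the number of surviving hole components at that scale.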
wordName wordTfidf (topN-words)
[('msbnm', 0.43), ('postures', 0.415), ('posture', 0.326), ('holes', 0.316), ('betti', 0.253), ('hole', 0.25), ('topological', 0.248), ('homology', 0.24), ('hand', 0.197), ('persistent', 0.158), ('mct', 0.135), ('jochen', 0.089), ('erosion', 0.078), ('hull', 0.076), ('shm', 0.076), ('arm', 0.063), ('approachesdtbncrperformance', 0.063), ('rotations', 0.057), ('numbers', 0.053), ('convex', 0.052), ('complementary', 0.05), ('illuminations', 0.046), ('samples', 0.039), ('shape', 0.039), ('hoscaptbgen', 0.038), ('oteuxrts', 0.038), ('region', 0.036), ('topology', 0.035), ('dorsal', 0.034), ('fgr', 0.031), ('hog', 0.03), ('rotation', 0.029), ('morphological', 0.029), ('ellipse', 0.029), ('leap', 0.027), ('scales', 0.026), ('chamfer', 0.026), ('snr', 0.026), ('context', 0.025), ('hoscaptgben', 0.025), ('knuckles', 0.025), ('toeuxrts', 0.025), ('resolutions', 0.025), ('hands', 0.025), ('evaluated', 0.024), ('representation', 0.024), ('gestures', 0.024), ('kalman', 0.024), ('yaw', 0.024), ('illumination', 0.023), ('imperfect', 0.023), ('degradation', 0.022), ('texture', 0.022), ('connected', 0.021), ('ofhand', 0.021), ('padding', 0.021), ('accuracies', 0.02), ('lighting', 0.02), ('census', 0.02), ('crossvalidation', 0.02), ('oikonomidis', 0.02), ('analyze', 0.019), ('gesture', 0.019), ('stenger', 0.019), ('caused', 0.018), ('decision', 0.018), ('distinguishable', 0.018), ('algorithmically', 0.018), ('degradations', 0.018), ('palm', 0.018), ('feature', 0.018), ('static', 0.017), ('segmentation', 0.017), ('longest', 0.017), ('pitch', 0.017), ('nested', 0.017), ('scale', 0.017), ('wherein', 0.017), ('uneven', 0.017), ('pages', 0.016), ('acquisition', 0.016), ('testing', 0.016), ('preserved', 0.016), ('roll', 0.016), ('cross', 0.016), ('construction', 0.016), ('depth', 0.016), ('confusion', 0.016), ('tech', 0.016), ('reliable', 0.015), ('various', 0.015), ('matrix', 0.015), ('sequence', 0.015), ('tracking', 0.015), ('highest', 0.015), ('feasibility', 0.015), ('tree', 0.015), 
('track', 0.014), ('unique', 0.014), ('stay', 0.014)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 278 iccv-2013-Multi-scale Topological Features for Hand Posture Representation and Analysis
Author: Kaoning Hu, Lijun Yin
Abstract: In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand region and its convex hull. Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. In addition to hand postures, we also extend the feature representations to arm postures. The results demonstrate the feasibility and reliability of the proposed method.
2 0.1012153 133 iccv-2013-Efficient Hand Pose Estimation from a Single Depth Image
Author: Chi Xu, Li Cheng
Abstract: We tackle the practical problem of hand pose estimation from a single noisy depth image. A dedicated three-step pipeline is proposed: Initial estimation step provides an initial estimation of the hand in-plane orientation and 3D location; Candidate generation step produces a set of 3D pose candidates from the Hough voting space with the help of the rotational invariant depth features; Verification step delivers the final 3D hand pose as the solution to an optimization problem. We analyze the depth noises, and suggest tips to minimize their negative impacts on the overall performance. Our approach is able to work with Kinect-type noisy depth images, and reliably produces pose estimations of general motions efficiently (12 frames per second). Extensive experiments are conducted to qualitatively and quantitatively evaluate the performance with respect to the state-of-the-art methods that have access to additional RGB images. Our approach is shown to deliver on par or even better results.
3 0.077026434 218 iccv-2013-Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data
Author: Srinath Sridhar, Antti Oulasvirta, Christian Theobalt
Abstract: Tracking the articulated 3D motion of the hand has important applications, for example, in human–computer interaction and teleoperation. We present a novel method that can capture a broad range of articulated hand motions at interactive rates. Our hybrid approach combines, in a voting scheme, a discriminative, part-based pose retrieval method with a generative pose estimation method based on local optimization. Color information from a multiview RGB camera setup along with a person-specific hand model are used by the generative method to find the pose that best explains the observed images. In parallel, our discriminative pose estimation method uses fingertips detected on depth data to estimate a complete or partial pose of the hand by adopting a part-based pose retrieval strategy. This part-based strategy helps reduce the search space drastically in comparison to a global pose retrieval strategy. Quantitative results show that our method achieves state-of-the-art accuracy on challenging sequences and a near-realtime performance of 10 fps on a desktop computer.
4 0.044602286 379 iccv-2013-Semantic Segmentation without Annotating Segments
Author: Wei Xia, Csaba Domokos, Jian Dong, Loong-Fah Cheong, Shuicheng Yan
Abstract: Numerous existing object segmentation frameworks commonly utilize the object bounding box as a prior. In this paper, we address semantic segmentation assuming that object bounding boxes are provided by object detectors, but no training data with annotated segments are available. Based on a set of segment hypotheses, we introduce a simple voting scheme to estimate shape guidance for each bounding box. The derived shape guidance is used in the subsequent graph-cut-based figure-ground segmentation. The final segmentation result is obtained by merging the segmentation results in the bounding boxes. We conduct an extensive analysis of the effect of object bounding box accuracy. Comprehensive experiments on both the challenging PASCAL VOC object segmentation dataset and GrabCut50 image segmentation dataset show that the proposed approach achieves competitive results compared to previous detection or bounding box prior based methods, as well as other state-of-the-art semantic segmentation methods.
5 0.042617019 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
Author: Bernhard Zeisl, Kevin Köser, Marc Pollefeys
Abstract: We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans without any artificial targets or prediction on the relative motion. Our approach allows to fully automatically register scans taken in GPS-denied environments such as urban canyon, industrial facilities or even indoors. We build upon image features which are plenty, localized well and much more discriminative than geometry features; however, they suffer from viewpoint distortions and request for normalization. We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or other cues such as observable symmetries. Compared to previous work we pose no requirements on the scanned scene (like containing large textured planes) and can handle arbitrary surface shapes. Rendering the whole scene from these repeatable directions using an orthographic camera generates textures which are identical up to 2D similarity transformations. This ambiguity is naturally handled by 2D features and allows to find stable correspondences among scans. For geometric pose estimation from tentative matches we propose a fast and robust 2 point sample consensus scheme integrating an early rejection phase. We evaluate our approach on different challenging real world scenes.
6 0.042015288 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests
7 0.040135082 263 iccv-2013-Measuring Flow Complexity in Videos
8 0.039689604 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
9 0.036275588 448 iccv-2013-Weakly Supervised Learning of Image Partitioning Using Decision Trees with Structured Split Criteria
10 0.036079973 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination
11 0.035638396 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation
12 0.034460958 429 iccv-2013-Tree Shape Priors with Connectivity Constraints Using Convex Relaxation on General Graphs
13 0.032763179 103 iccv-2013-Deblurring by Example Using Dense Correspondence
14 0.032064486 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data
15 0.031764437 6 iccv-2013-A Convex Optimization Framework for Active Learning
16 0.031472169 140 iccv-2013-Elastic Net Constraints for Shape Matching
17 0.031135567 143 iccv-2013-Estimating Human Pose with Flowing Puppets
18 0.031099649 367 iccv-2013-SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
19 0.030505978 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics
20 0.030075239 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
topicId topicWeight
[(0, 0.086), (1, -0.028), (2, -0.012), (3, -0.005), (4, 0.002), (5, -0.011), (6, 0.004), (7, 0.004), (8, -0.013), (9, 0.007), (10, 0.007), (11, -0.013), (12, -0.011), (13, -0.001), (14, 0.014), (15, 0.028), (16, -0.006), (17, -0.033), (18, -0.004), (19, 0.014), (20, 0.031), (21, 0.009), (22, 0.001), (23, 0.011), (24, -0.024), (25, 0.022), (26, 0.016), (27, 0.062), (28, 0.003), (29, -0.015), (30, 0.028), (31, 0.03), (32, -0.03), (33, 0.006), (34, -0.028), (35, -0.039), (36, -0.011), (37, 0.009), (38, -0.037), (39, 0.015), (40, 0.009), (41, 0.013), (42, 0.007), (43, 0.05), (44, 0.022), (45, -0.021), (46, 0.038), (47, -0.001), (48, -0.122), (49, 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.88190359 278 iccv-2013-Multi-scale Topological Features for Hand Posture Representation and Analysis
Author: Kaoning Hu, Lijun Yin
Abstract: In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand region and its convex hull. Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. In addition to hand postures, we also extend the feature representations to arm postures. The results demonstrate the feasibility and reliability of the proposed method.
2 0.61685193 133 iccv-2013-Efficient Hand Pose Estimation from a Single Depth Image
Author: Chi Xu, Li Cheng
Abstract: We tackle the practical problem of hand pose estimation from a single noisy depth image. A dedicated three-step pipeline is proposed: Initial estimation step provides an initial estimation of the hand in-plane orientation and 3D location; Candidate generation step produces a set of 3D pose candidates from the Hough voting space with the help of the rotational invariant depth features; Verification step delivers the final 3D hand pose as the solution to an optimization problem. We analyze the depth noises, and suggest tips to minimize their negative impacts on the overall performance. Our approach is able to work with Kinect-type noisy depth images, and reliably produces pose estimations of general motions efficiently (12 frames per second). Extensive experiments are conducted to qualitatively and quantitatively evaluate the performance with respect to the state-of-the-art methods that have access to additional RGB images. Our approach is shown to deliver on par or even better results.
3 0.56896132 218 iccv-2013-Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data
Author: Srinath Sridhar, Antti Oulasvirta, Christian Theobalt
Abstract: Tracking the articulated 3D motion of the hand has important applications, for example, in human–computer interaction and teleoperation. We present a novel method that can capture a broad range of articulated hand motions at interactive rates. Our hybrid approach combines, in a voting scheme, a discriminative, part-based pose retrieval method with a generative pose estimation method based on local optimization. Color information from a multiview RGB camera setup along with a person-specific hand model are used by the generative method to find the pose that best explains the observed images. In parallel, our discriminative pose estimation method uses fingertips detected on depth data to estimate a complete or partial pose of the hand by adopting a part-based pose retrieval strategy. This part-based strategy helps reduce the search space drastically in comparison to a global pose retrieval strategy. Quantitative results show that our method achieves state-of-the-art accuracy on challenging sequences and a near-realtime performance of 10 fps on a desktop computer.
4 0.5512656 388 iccv-2013-Shape Index Descriptors Applied to Texture-Based Galaxy Analysis
Author: Kim Steenstrup Pedersen, Kristoffer Stensbo-Smidt, Andrew Zirm, Christian Igel
Abstract: A texture descriptor based on the shape index and the accompanying curvedness measure is proposed, and it is evaluated for the automated analysis of astronomical image data. A representative sample of images of low-redshift galaxies from the Sloan Digital Sky Survey (SDSS) serves as a testbed. The goal of applying texture descriptors to these data is to extract novel information about galaxies; information which is often lost in more traditional analysis. In this study, we build a regression model for predicting a spectroscopic quantity, the specific star-formation rate (sSFR). As texture features we consider multi-scale gradient orientation histograms as well as multi-scale shape index histograms, which lead to a new descriptor. Our results show that we can successfully predict spectroscopic quantities from the texture in optical multi-band images. We successfully recover the observed bi-modal distribution of galaxies into quiescent and star-forming. The state-of-the-art for predicting the sSFR is a color-based physical model. We significantly improve its accuracy by augmenting the model with texture information. This study is the first step towards enabling the quantification of physical galaxy properties from imaging data alone.
5 0.54176962 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation
Author: Tomáš Kazmar, Evgeny Z. Kvon, Alexander Stark, Christoph H. Lampert
Abstract: In this work we propose a system for automatic classification of Drosophila embryos into developmental stages. While the system is designed to solve an actual problem in biological research, we believe that the principle underlying it is interesting not only for biologists, but also for researchers in computer vision. The main idea is to combine two orthogonal sources of information: one is a classifier trained on strongly invariant features, which makes it applicable to images of very different conditions, but also leads to rather noisy predictions. The other is a label propagation step based on a more powerful similarity measure that however is only consistent within specific subsets of the data at a time. In our biological setup, the information sources are the shape and the staining patterns of embryo images. We show experimentally that while neither of the methods can be used by itself to achieve satisfactory results, their combination achieves prediction quality comparable to human performance.
6 0.5406096 90 iccv-2013-Content-Aware Rotation
7 0.51252609 341 iccv-2013-Real-Time Body Tracking with One Depth Camera and Inertial Sensors
9 0.48463836 368 iccv-2013-SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor
10 0.47805339 115 iccv-2013-Direct Optimization of Frame-to-Frame Rotation
11 0.47115764 185 iccv-2013-Go-ICP: Solving 3D Registration Efficiently and Globally Optimally
12 0.47058147 130 iccv-2013-Dynamic Structured Model Selection
13 0.46801335 353 iccv-2013-Revisiting the PnP Problem: A Fast, General and Optimal Solution
14 0.4678967 273 iccv-2013-Monocular Image 3D Human Pose Estimation under Self-Occlusion
15 0.46639928 308 iccv-2013-Parsing IKEA Objects: Fine Pose Estimation
16 0.46459815 30 iccv-2013-A Simple Model for Intrinsic Image Decomposition with Depth Cues
17 0.46315831 422 iccv-2013-Toward Guaranteed Illumination Models for Non-convex Objects
18 0.46200117 254 iccv-2013-Live Metric 3D Reconstruction on Mobile Phones
19 0.46130204 5 iccv-2013-A Color Constancy Model with Double-Opponency Mechanisms
20 0.45906776 444 iccv-2013-Viewing Real-World Faces in 3D
topicId topicWeight
[(2, 0.038), (7, 0.012), (22, 0.014), (26, 0.059), (31, 0.034), (34, 0.388), (35, 0.017), (42, 0.083), (64, 0.04), (73, 0.03), (78, 0.011), (84, 0.016), (89, 0.133)]
simIndex simValue paperId paperTitle
1 0.73618442 53 iccv-2013-Attribute Dominance: What Pops Out?
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
2 0.73319227 202 iccv-2013-How Do You Tell a Blackbird from a Crow?
Author: Thomas Berg, Peter N. Belhumeur
Abstract: How do you tell a blackbird from a crow? There has been great progress toward automatic methods for visual recognition, including fine-grained visual categorization in which the classes to be distinguished are very similar. In a task such as bird species recognition, automatic recognition systems can now exceed the performance of non-experts – most people are challenged to name a couple dozen bird species, let alone identify them. This leads us to the question, “Can a recognition system show humans what to look for when identifying classes (in this case birds)?” In the context of fine-grained visual categorization, we show that we can automatically determine which classes are most visually similar, discover what visual features distinguish very similar classes, and illustrate the key features in a way meaningful to humans. Running these methods on a dataset of bird images, we can generate a visual field guide to birds which includes a tree of similarity that displays the similarity relations between all species, pages for each species showing the most similar other species, and pages for each pair of similar species illustrating their differences.
same-paper 3 0.66423786 278 iccv-2013-Multi-scale Topological Features for Hand Posture Representation and Analysis
Author: Kaoning Hu, Lijun Yin
Abstract: In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand region and its convex hull. Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. In addition to hand postures, we also extend the feature representations to arm postures. The results demonstrate the feasibility and reliability of the proposed method.
4 0.61816704 230 iccv-2013-Latent Data Association: Bayesian Model Selection for Multi-target Tracking
Author: Aleksandr V. Segal, Ian Reid
Abstract: We propose a novel parametrization of the data association problem for multi-target tracking. In our formulation, the number of targets is implicitly inferred together with the data association, effectively solving data association and model selection as a single inference problem. The novel formulation allows us to interpret data association and tracking as a single Switching Linear Dynamical System (SLDS). We compute an approximate posterior solution to this problem using a dynamic programming/message passing technique. This inference-based approach allows us to incorporate richer probabilistic models into the tracking system. In particular, we incorporate inference over inliers/outliers and track termination times into the system. We evaluate our approach on publicly available datasets and demonstrate results competitive with, and in some cases exceeding the state of the art.
5 0.59792489 64 iccv-2013-Box in the Box: Joint 3D Layout and Object Reasoning from Single Images
Author: Alexander G. Schwing, Sanja Fidler, Marc Pollefeys, Raquel Urtasun
Abstract: In this paper we propose an approach to jointly infer the room layout as well as the objects present in the scene. Towards this goal, we propose a branch and bound algorithm which is guaranteed to retrieve the global optimum of the joint problem. The main difficulty resides in taking into account occlusion in order to not over-count the evidence. We introduce a new decomposition method, which generalizes integral geometry to triangular shapes, and allows us to bound the different terms in constant time. We exploit both geometric cues and object detectors as image features and show large improvements in 2D and 3D object detection over state-of-the-art deformable part-based models.
6 0.58654249 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition
7 0.58193088 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
8 0.57106131 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing
9 0.54799831 138 iccv-2013-Efficient and Robust Large-Scale Rotation Averaging
10 0.51760864 52 iccv-2013-Attribute Adaptation for Personalized Image Search
11 0.49135366 7 iccv-2013-A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
12 0.48593694 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context
13 0.48521465 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
14 0.47585118 169 iccv-2013-Fine-Grained Categorization by Alignments
15 0.46684045 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
17 0.45878094 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs
18 0.45497331 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
19 0.45496029 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
20 0.45113012 272 iccv-2013-Modifying the Memorability of Face Photographs