
243 iccv-2013-Learning Slow Features for Behaviour Analysis

Source: pdf

Author: Lazaros Zafeiriou, Mihalis A. Nicolaou, Stefanos Zafeiriou, Symeon Nikitidis, Maja Pantic

Abstract: A recently introduced latent feature learning technique for time varying dynamic phenomena analysis is the so-called Slow Feature Analysis (SFA). SFA is a deterministic component analysis technique for multi-dimensional sequences that, by minimizing the variance of the first order time derivative approximation of the input signal, finds uncorrelated projections that extract slowly-varying features ordered by their temporal consistency and constancy. In this paper, we propose a number of extensions in both the deterministic and the probabilistic SFA optimization frameworks. In particular, we derive a novel deterministic SFA algorithm that is able to identify linear projections that extract the common slowest varying features of two or more sequences. In addition, we propose an Expectation Maximization (EM) algorithm to perform inference in a probabilistic formulation of SFA and similarly extend it in order to handle two or more time varying data sequences. Moreover, we demonstrate that the probabilistic SFA (EM-SFA) algorithm that discovers the common slowest varying latent space of multiple sequences can be combined with dynamic time warping techniques for robust sequence time alignment. The proposed SFA algorithms were applied to facial behavior analysis, demonstrating their usefulness and appropriateness for this task.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A recently introduced latent feature learning technique for time varying dynamic phenomena analysis is the so-called Slow Feature Analysis (SFA). [sent-4, score-0.413]

2 In this paper, we propose a number of extensions in both the deterministic and the probabilistic SFA optimization frameworks. [sent-6, score-0.236]

3 In particular, we derive a novel deterministic SFA algorithm that is able to identify linear projections that extract the common slowest varying features of two or more sequences. [sent-7, score-0.448]

4 In addition, we propose an Expectation Maximization (EM) algorithm to perform inference in a probabilistic formulation of SFA and similarly extend it in order to handle two and more time varying data sequences. [sent-8, score-0.194]

5 Moreover, we demonstrate that the probabilistic SFA (EM-SFA) algorithm that discovers the common slowest varying latent space of multiple sequences can be combined with dynamic time warping techniques for robust sequence time alignment. [sent-9, score-0.587]

6 The proposed SFA algorithms were applied for facial behavior analysis demonstrating their usefulness and appropriateness for this task. [sent-10, score-0.091]

7 Introduction Slow Feature Analysis (SFA) was first proposed in [25] as an unsupervised methodology for finding slowly varying (invariant) features from rapidly temporal varying signals. [sent-12, score-0.422]

8 The exploited slowness learning principle in [25] was motivated by the empirical observation that higher order meanings of sensory data, such as objects and their attributes, are often more persistent (i. [sent-13, score-0.242]

9 , change smoothly) than the independent activation of any single sensory receptor. [sent-15, score-0.111]

10 For instance, the position and the identity of an object are visible for extended periods of time and change with time in a continuous fashion. [sent-16, score-0.054]

11 any primary sensory signal (like the responses of individual retinal receptors or the gray-scale values of a single pixel in a video camera), thus being more robust to subtle changes in the environment. [sent-21, score-0.176]

12 The proposed in [25] optimization problem aims to minimize the magnitude of the approximated first order time derivative of the extracted slowly varying features under the constraints that these are centered (i. [sent-23, score-0.246]

13 Thus, the slowest varying features are identified by solving a generalized eigenvalue problem for the joint diagonalization of the data covariance matrix and the covariance matrix of the first order forward data differences. [sent-26, score-0.533]

14 Intuitively, SFA imitates the functionality of the receptive fields of the visual cortex [2], thus being appropriate for describing the evolution of time varying visual phenomena. [sent-27, score-0.167]

15 Recently, SFA and its discriminant extensions have been successfully applied for human action recognition in [26], while hierarchical segmentation of video sequences using SFA was investigated in [15]. [sent-29, score-0.113]

16 In [8] SFA was applied for object and object-pose recognition on a homogeneous background, while in [14] SFA for vectorvalued functions was studied for blind source separation. [sent-30, score-0.033]

17 In [4], the equivalence between linear SFA and the second-order ICA algorithm, in the case of one time delay, is demonstrated. [sent-33, score-0.027]

18 In [20], the relation between [Footnote 1: Continuous time SFA has been proposed in [24], but since in this paper we assume discrete time signals, such works are out of our scope.] [sent-34, score-0.054]

19 Figure 1: The latent space obtained by EM-SFA, accurately capturing the transition between temporal phases of action units (sample frames shown: f1, f7, f10, f23, f33, f38, f42). [sent-35, score-0.384]

20 LE and SFA was studied and exhibited that SFA is a special case of kernel Locality Preserving Projections (LPP) [9] acquired by defining the data neighborhood structure using their temporal variations. [sent-37, score-0.089]

21 In [21], it was shown that the projection bases provided by SFA are similar to those yielded by the Maximum Likelihood (ML) solution of a probabilistic generative model in the limit case that the noise variance tends to zero. [sent-38, score-0.136]

22 The probabilistic generative model comprises a linear model for the generation of observations and imposes a Gaussian linear dynamical system with diagonal covariances over the latent space. [sent-39, score-0.289]

23 In this paper, we study the application of SFA for unsupervised facial behaviour analysis. [sent-40, score-0.161]

24 1, we can see the resulting latent space obtained by EM-SFA, applied on a video sequence where the subject is activating Action Unit (AU) 22 (Lip Funneler). [sent-45, score-0.179]

25 In general, when activating an AU, the following temporal phases are recorded: Neutral, when the face is relaxed, Onset, when the action initiates, Apex, when the muscles reach the peak intensity and Offset when the muscles begin to relax. [sent-46, score-0.475]

26 It can be clearly observed in the figure, that the latent space obtained by EM-SFA accurately captures the transitions between the temporal phases of the AU, providing an unsupervised method for detecting the temporal phases of AUs. [sent-48, score-0.586]

27 Summarising the contributions of our paper, we propose the following theoretical novelties: • We propose the first Expectation Maximization (EM) algorithm for learning the model parameters of a probabilistic SFA (EM-SFA). [sent-49, score-0.076]

28 In contrast to existing ML approaches ([21]), our approach allows for full probabilistic modelling of the latent distributions instead of mapping the variances to zero, as in ML. [sent-50, score-0.194]

29 • We extend both deterministic and probabilistic SFA to enable us to find the common slowest varying features of two or more time varying data sequences, thus allowing the simultaneous analysis of multiple data streams. [sent-51, score-0.603]

30 The novelties of our paper in terms of application can be summarized as follows: • We apply the proposed EM-SFA to facial behaviour dynamics analysis and in particular to facial Action Units (AUs) analysis. [sent-52, score-0.271]

31 More precisely, we demonstrate that it is possible to discover the dynamics of AUs in an unsupervised manner using EM-SFA. [sent-53, score-0.075]

32 To the best of our knowledge, this is the first unsupervised approach which detects the temporal phases of AUs (other unsupervised approaches such as [29] focus on detecting global structures (i. [sent-54, score-0.3]

33 We combine the common latent space derived by EM-SFA with Dynamic Time Warping techniques [18] for the temporal alignment of dynamic facial behaviour. [sent-57, score-0.294]

34 We claim that using the slowest varying features for sequence alignment is well motivated by the principle of slowness as described above (i. [sent-58, score-0.377]

35 , slowly varying features correspond to meaningful changes rather than rapidly varying ones, which most likely correspond to noise [25]). [sent-60, score-0.289]
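
The alignment idea in the sentences above can be illustrated with a minimal sketch of classic Dynamic Time Warping. It is not the authors' implementation: it assumes each sequence has already been reduced to a single slow feature (a 1-D list of numbers), uses the absolute difference as the local cost, and the function name `dtw` is hypothetical.

```python
def dtw(a, b):
    """Align two 1-D sequences with classic DTW.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    matching a[i] with b[j]. Assumption: a and b hold one slow feature each.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            D[i][j] = local + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (D[i - 1][j - 1], i - 1, j - 1),  # match both
            (D[i - 1][j], i - 1, j),          # advance in a only
            (D[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return D[n][m], path


# Two renditions of the same rise-and-fall pattern at different speeds.
cost, path = dtw([0, 1, 2, 1, 0], [0, 0, 1, 2, 2, 1, 0])
```

For multi-dimensional latent features the local cost would typically be a vector norm instead of `abs`; that choice is left open here.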

36 The rest of the paper is organised as follows. [sent-61, score-0.03]

37 2, we describe the deterministic SFA model, while in Sec. [sent-63, score-0.136]

38 2), while the latter method is incremented with warpings in Sec. [sent-70, score-0.063]

39 Deterministic Slow Feature Analysis In order to identify the slowest varying features deterministic SFA considers the following optimization problem. [sent-80, score-0.409]

40 Given an M-dimensional time-varying input sequence X = [xt, t ∈ [1, T]], where t denotes time and xt ∈ ? [sent-81, score-0.087]

41 [...] determine appropriate projection bases stored in the columns of matrix V = [v1, v2, . [sent-83, score-0.054]

42 [...] minimize the variance of the approximated first order time derivative of the latent variables Y = [y1, y2, . [sent-88, score-0.197]

43 N×T subject to zero mean, unit covariance and decorrelation [sent-92, score-0.081]

44 (1) where tr[.] is the matrix trace operator, 1 is a T × 1 vector with all its [sent-96, score-0.126]

45 elements equal to 1/T, I is an N × N identity matrix and matrix Ẏ approximates the first order time derivative of Y, evaluated using the forward latent variable differences as follows: Ẏ = [y2 − y1, y3 − y2, . [sent-97, score-0.388]

46 (3) where B is the input data covariance matrix and A is an M × M covariance matrix evaluated using the forward temporal differences of the input data, contained in matrix Ẋ: A = (1/(T−1)) ẊẊ^T, B = (1/T) XX^T. [sent-104, score-0.389]
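
Because the extracted sentences refer to equations (1) to (4) without showing them in full, the block below writes out the standard linear SFA problem in a form consistent with the definitions given in the text (trace operator, forward differences, covariance matrices A and B). It is a reconstruction under assumptions, not a quotation of the paper: the linear mapping Y = V^T X, the 1/T scaling of the unit-covariance constraint, and the 1/(T−1) normalisation of A are assumed, and the equation tags simply follow the numbers mentioned in the fragments.

```latex
\begin{align}
\min_{\mathbf{Y}} \ \operatorname{tr}\!\left[\dot{\mathbf{Y}}\dot{\mathbf{Y}}^{T}\right]
  \quad \text{s.t.} \quad
  \mathbf{Y}\mathbf{1} = \mathbf{0}, \qquad
  \tfrac{1}{T}\,\mathbf{Y}\mathbf{Y}^{T} = \mathbf{I}
  \tag{1} \\
\dot{\mathbf{Y}} = \left[\mathbf{y}_2 - \mathbf{y}_1,\ \mathbf{y}_3 - \mathbf{y}_2,\ \ldots,\ \mathbf{y}_T - \mathbf{y}_{T-1}\right]
  \tag{2} \\
\min_{\mathbf{V}} \ \operatorname{tr}\!\left[\mathbf{V}^{T}\mathbf{A}\mathbf{V}\right]
  \quad \text{s.t.} \quad
  \mathbf{V}^{T}\mathbf{B}\mathbf{V} = \mathbf{I}
  \tag{3} \\
\mathbf{A} = \tfrac{1}{T-1}\,\dot{\mathbf{X}}\dot{\mathbf{X}}^{T}, \qquad
\mathbf{B} = \tfrac{1}{T}\,\mathbf{X}\mathbf{X}^{T}
  \tag{4}
\end{align}
```

With Y = V^T X and centred X, tr[Ẏ Ẏ^T] = (T−1) tr[V^T A V] and (1/T) Y Y^T = V^T B V, so (3) is (1) up to a constant factor that does not change the minimiser.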

47 The solution of (3) can be found from the Generalized Eigenvalue Problem (GEP) [25]: AV = BVL (5) where the columns of the projection matrix V are the generalized eigenvectors associated with the N lowest generalized eigenvalues, which are sorted along the diagonal of matrix L. [sent-105, score-0.11]
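
A minimal NumPy sketch of this step is given below. It solves AV = BVL by whitening with B and then diagonalising A in the whitened space, which is one common way to handle a symmetric generalized eigenproblem; it is not taken from the paper. The function name `linear_sfa`, the 1/(T−1) and 1/T normalisations, and the toy two-sine data are all assumptions made for illustration, and B is assumed to be positive definite.

```python
import numpy as np


def linear_sfa(X, n_components):
    """Linear SFA for data X of shape (M, T), columns ordered in time.

    Returns (V, eigenvalues): V has shape (M, n_components) and holds the
    generalized eigenvectors of (A, B) with the smallest eigenvalues.
    """
    X = X - X.mean(axis=1, keepdims=True)   # centre each input dimension
    T = X.shape[1]
    X_dot = X[:, 1:] - X[:, :-1]            # forward temporal differences
    A = X_dot @ X_dot.T / (T - 1)           # covariance of the differences
    B = X @ X.T / T                         # covariance of the data

    # Whitening transform W with W.T @ B @ W = I (requires B > 0).
    w, U = np.linalg.eigh(B)
    W = U / np.sqrt(w)

    # Ordinary symmetric eigenproblem in the whitened space;
    # eigh returns eigenvalues in ascending order, slowest first.
    lam, Q = np.linalg.eigh(W.T @ A @ W)
    V = W @ Q[:, :n_components]
    return V, lam[:n_components]


# Toy data: two linear mixtures of a slow and a fast sinusoid.
t = np.linspace(0.0, 2.0 * np.pi, 500)
slow, fast = np.sin(t), np.sin(25.0 * t)
X_demo = np.vstack([slow + fast, slow - 0.5 * fast])

V_demo, lam_demo = linear_sfa(X_demo, n_components=1)
X_centred = X_demo - X_demo.mean(axis=1, keepdims=True)
y_demo = V_demo[:, 0] @ X_centred          # slowest extracted feature
```

On this toy input the extracted feature should follow the slow sinusoid up to sign and scale, since the fast component contributes far more to the difference covariance A.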

48 A Probabilistic Interpretation of SFA In this section, we discuss a probabilistic approach to SFA latent variable extraction. [sent-107, score-0.194]

49 Let us assume the following linear generative model that relates the latent variable yt with the observed samples xt as: xt = V^{−T} yt + et, et ∼ N(0, σx² I) (6) where et is the noise, which is assumed to follow an isotropic Gaussian model. [sent-108, score-0.437]
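
To make the generative reading of (6) concrete, the sketch below draws a sequence from such a model. Only the observation equation x_t = V^{-T} y_t + e_t comes from the text. The latent dynamics are an assumption: each latent dimension follows a first-order autoregressive process y_{t,n} = λ_n y_{t−1,n} + ε with ε ∼ N(0, 1 − λ_n²) and unit initial variance, which is one common choice of linear Gaussian dynamical system with diagonal covariances. The function name `sample_prob_sfa` and the parameter `lambdas` are hypothetical, and V is assumed square and invertible.

```python
import numpy as np


def sample_prob_sfa(V, lambdas, sigma_x, T, rng):
    """Sample (X, Y) from a linear-Gaussian latent model.

    V        : (N, N) invertible projection matrix (assumed square)
    lambdas  : per-dimension AR(1) coefficients in (-1, 1) (assumption)
    sigma_x  : standard deviation of the isotropic observation noise
    T        : number of time steps
    rng      : numpy.random.Generator
    """
    lambdas = np.asarray(lambdas, dtype=float)
    N = lambdas.shape[0]
    # Innovation scale chosen so every latent keeps unit variance over time.
    sigma_n = np.sqrt(1.0 - lambdas ** 2)

    Y = np.empty((N, T))
    Y[:, 0] = rng.normal(0.0, 1.0, size=N)          # initial latent state
    for t in range(1, T):
        Y[:, t] = lambdas * Y[:, t - 1] + sigma_n * rng.normal(size=N)

    V_inv_T = np.linalg.inv(V).T                    # V^{-T}
    noise = sigma_x * rng.normal(size=(V.shape[0], T))
    X = V_inv_T @ Y + noise                         # x_t = V^{-T} y_t + e_t
    return X, Y


rng = np.random.default_rng(0)
V_true = np.array([[2.0, 1.0], [0.0, 1.0]])
X_s, Y_s = sample_prob_sfa(V_true, lambdas=[0.99, 0.5], sigma_x=0.0, T=200, rng=rng)
```

With the observation noise set to zero, projecting the samples with V^T returns the latent sequence exactly, which matches the deterministic limit mentioned in the text; a larger `lambdas` entry gives a slower latent dimension.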

50 Let us also assume the linear Gaussian dynamical system priors over the latent space Y are: ? [sent-110, score-0.156]

51 m=o =h Σ = [δi,jσ2n] and Σ1 = [δi,jσ2n,1] the prior over the latent space can be evaluated as: P(Y|θy)=+(Z21 yσ1n2e xy? [sent-119, score-0.118]

52 In [21], it was shown that the ML solution of the above model in the deterministic case (i. [sent-128, score-0.136]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('sfa', 0.817), ('slowest', 0.182), ('yt', 0.144), ('deterministic', 0.136), ('phases', 0.123), ('latent', 0.118), ('sensory', 0.111), ('aus', 0.095), ('varying', 0.091), ('temporal', 0.089), ('slowly', 0.076), ('probabilistic', 0.076), ('muscles', 0.074), ('slowness', 0.074), ('novelties', 0.065), ('tyt', 0.065), ('riou', 0.065), ('ml', 0.063), ('apex', 0.061), ('activating', 0.061), ('xt', 0.06), ('behaviour', 0.059), ('facial', 0.058), ('afe', 0.057), ('onset', 0.054), ('covariance', 0.054), ('action', 0.054), ('derivative', 0.052), ('ica', 0.052), ('slow', 0.052), ('imperial', 0.05), ('au', 0.05), ('trace', 0.047), ('unsupervised', 0.044), ('forward', 0.04), ('projections', 0.039), ('dynamical', 0.038), ('signal', 0.038), ('neutral', 0.037), ('tr', 0.036), ('dy', 0.035), ('sequences', 0.035), ('vectorvalued', 0.033), ('appropriateness', 0.033), ('initiates', 0.033), ('robability', 0.033), ('warpings', 0.033), ('rices', 0.033), ('txe', 0.033), ('ilse', 0.033), ('argv', 0.033), ('atoc', 0.033), ('lpp', 0.033), ('maja', 0.033), ('stefanos', 0.033), ('summarising', 0.033), ('twente', 0.033), ('slower', 0.032), ('rapidly', 0.031), ('generalized', 0.031), ('generative', 0.031), ('eigenvalue', 0.031), ('dynamics', 0.031), ('wwihtehr', 0.03), ('yyt', 0.03), ('conditio', 0.03), ('eirsa', 0.03), ('mts', 0.03), ('organised', 0.03), ('incremented', 0.03), ('motivated', 0.03), ('expectation', 0.03), ('dynamic', 0.029), ('maximization', 0.029), ('bases', 0.029), ('warping', 0.029), ('fied', 0.029), ('mthaet', 0.029), ('ized', 0.029), ('retinal', 0.027), ('eigenmaps', 0.027), ('idi', 0.027), ('lip', 0.027), ('mvin', 0.027), ('persistent', 0.027), ('time', 0.027), ('interpretation', 0.027), ('zero', 0.027), ('covariances', 0.026), ('socalled', 0.026), ('coef', 0.025), ('delay', 0.025), ('functionality', 0.025), ('matrix', 0.025), ('cortex', 0.024), ('aell', 0.024), ('ei', 0.024), ('extensions', 0.024), ('cdo', 0.024), ('uncorrelated', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 243 iccv-2013-Learning Slow Features for Behaviour Analysis

Author: Lazaros Zafeiriou, Mihalis A. Nicolaou, Stefanos Zafeiriou, Symeon Nikitidis, Maja Pantic

2 0.103945 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition

Author: Ziheng Wang, Yongqiang Li, Shangfei Wang, Qiang Ji

Abstract: In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines the bottom-level image features and the top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike methods that can only capture local pair-wise AU dependencies, our model is developed upon the restricted Boltzmann machine and therefore can exploit the global relationships among AUs. 2) Although AU relationships are influenced by many related factors such as facial expressions, these factors are generally ignored by the current methods. Our model, however, can successfully capture them to more accurately characterize the AU relationships. Efficient learning and inference algorithms of the proposed model are also developed. Experimental results on benchmark databases demonstrate the effectiveness of the proposed approach in modelling complex AU relationships as well as its superior AU recognition performance over existing approaches.

3 0.087576665 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks

Author: Xiaoyu Ding, Wen-Sheng Chu, Fernando De la Torre, Jeffery F. Cohn, Qiao Wang

Abstract: Automatic facial Action Unit (AU) detection from video is a long-standing problem in facial expression analysis. AU detection is typically posed as a classification problem between frames or segments of positive examples and negative ones, where existing work emphasizes the use of different features or classifiers. In this paper, we propose a method called Cascade of Tasks (CoT) that combines the use of different tasks (i.e., frame, segment and transition) for AU event detection. We train CoT in a sequential manner embracing diversity, which ensures robustness and generalization to unseen data. In addition to conventional frame-based metrics that evaluate frames independently, we propose a new event-based metric to evaluate detection performance at event-level. We show how the CoT method consistently outperforms state-of-the-art approaches in both frame-based and event-based metrics, across three public datasets that differ in complexity: CK+, FERA and RUFACS.

4 0.081717879 249 iccv-2013-Learning to Share Latent Tasks for Action Recognition

Author: Qiang Zhou, Gang Wang, Kui Jia, Qi Zhao

Abstract: Sharing knowledge for multiple related machine learning tasks is an effective strategy to improve the generalization performance. In this paper, we investigate knowledge sharing across categories for action recognition in videos. The motivation is that many action categories are related, where common motion patterns are shared among them (e.g. diving and high jump share the jump motion). We propose a new multi-task learning method to learn latent tasks shared across categories, and reconstruct a classifier for each category from these latent tasks. Compared to previous methods, our approach has two advantages: (1) The learned latent tasks correspond to basic motion patterns instead of full actions, thus enhancing discrimination power of the classifiers. (2) Categories are selected to share information with a sparsity regularizer, avoiding falsely forcing all categories to share knowledge. Experimental results on multiple public data sets show that the proposed approach can effectively transfer knowledge between different action categories to improve the performance of conventional single task learning methods.

5 0.06923756 85 iccv-2013-Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach

Author: Arash Vahdat, Kevin Cannons, Greg Mori, Sangmin Oh, Ilseo Kim

Abstract: We present a compositional model for video event detection. A video is modeled using a collection of both global and segment-level features and kernel functions are employed for similarity comparisons. The locations of salient, discriminative video segments are treated as a latent variable, allowing the model to explicitly ignore portions of the video that are unimportant for classification. A novel, multiple kernel learning (MKL) latent support vector machine (SVM) is defined, that is used to combine and re-weight multiple feature types in a principled fashion while simultaneously operating within the latent variable framework. The compositional nature of the proposed model allows it to respond directly to the challenges of temporal clutter and intra-class variation, which are prevalent in unconstrained internet videos. Experimental results on the TRECVID Multimedia Event Detection 2011 (MED11) dataset demonstrate the efficacy of the method.

6 0.067339234 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

7 0.066794477 187 iccv-2013-Group Norm for Learning Structured SVMs with Unstructured Latent Variables

8 0.066659912 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition

9 0.056018654 417 iccv-2013-The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection

10 0.054863449 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera

11 0.053192068 86 iccv-2013-Concurrent Action Detection with Structural Prediction

12 0.052350152 438 iccv-2013-Unsupervised Visual Domain Adaptation Using Subspace Alignment

13 0.051222179 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing

14 0.050779853 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies

15 0.050045777 127 iccv-2013-Dynamic Pooling for Complex Event Recognition

16 0.048827022 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints

17 0.047991291 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

18 0.04740417 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification

19 0.044512875 128 iccv-2013-Dynamic Probabilistic Volumetric Models

20 0.043596536 232 iccv-2013-Latent Space Sparse Subspace Clustering

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.103), (1, 0.038), (2, -0.002), (3, 0.045), (4, -0.015), (5, -0.006), (6, 0.062), (7, 0.012), (8, 0.008), (9, -0.002), (10, -0.03), (11, -0.028), (12, -0.006), (13, 0.011), (14, -0.008), (15, 0.014), (16, 0.011), (17, 0.016), (18, -0.013), (19, -0.04), (20, 0.031), (21, 0.026), (22, -0.013), (23, -0.036), (24, -0.032), (25, -0.027), (26, 0.037), (27, -0.046), (28, 0.049), (29, 0.04), (30, 0.012), (31, 0.101), (32, -0.038), (33, 0.018), (34, 0.018), (35, -0.009), (36, -0.014), (37, 0.036), (38, 0.036), (39, 0.0), (40, 0.029), (41, 0.012), (42, 0.039), (43, -0.063), (44, -0.02), (45, 0.034), (46, -0.035), (47, 0.093), (48, 0.029), (49, 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91497314 243 iccv-2013-Learning Slow Features for Behaviour Analysis

Author: Lazaros Zafeiriou, Mihalis A. Nicolaou, Stefanos Zafeiriou, Symeon Nikitidis, Maja Pantic

2 0.76575005 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition

Author: Ziheng Wang, Yongqiang Li, Shangfei Wang, Qiang Ji

3 0.64376485 249 iccv-2013-Learning to Share Latent Tasks for Action Recognition

Author: Qiang Zhou, Gang Wang, Kui Jia, Qi Zhao

4 0.63816243 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks

Author: Xiaoyu Ding, Wen-Sheng Chu, Fernando De la Torre, Jeffery F. Cohn, Qiao Wang

5 0.60253453 187 iccv-2013-Group Norm for Learning Structured SVMs with Unstructured Latent Variables

Author: Daozheng Chen, Dhruv Batra, William T. Freeman

Abstract: Latent variable models have been applied to a number of computer vision problems. However, the complexity of the latent space is typically left as a free design choice. A larger latent space results in a more expressive model, but such models are prone to overfitting and are slower to perform inference with. The goal of this paper is to regularize the complexity of the latent space and learn which hidden states are really relevant for prediction. Specifically, we propose using group-sparsity-inducing regularizers such as ℓ1-ℓ2 to estimate the parameters of Structured SVMs with unstructured latent variables. Our experiments on digit recognition and object detection show that our approach is indeed able to control the complexity of latent space without any significant loss in accuracy of the learnt model.

6 0.54108512 85 iccv-2013-Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach

7 0.47045755 127 iccv-2013-Dynamic Pooling for Complex Event Recognition

8 0.47032592 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies

9 0.46932158 231 iccv-2013-Latent Multitask Learning for View-Invariant Action Recognition

10 0.46662402 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification

11 0.4292807 265 iccv-2013-Mining Motion Atoms and Phrases for Complex Action Recognition

12 0.42695814 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera

13 0.42643359 147 iccv-2013-Event Recognition in Photo Collections with a Stopwatch HMM

14 0.42209247 240 iccv-2013-Learning Maximum Margin Temporal Warping for Action Recognition

15 0.4139837 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing

16 0.40487373 146 iccv-2013-Event Detection in Complex Scenes Using Interval Temporal Constraints

17 0.40161541 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification

18 0.3955994 253 iccv-2013-Linear Sequence Discriminant Analysis: A Model-Based Dimensionality Reduction Method for Vector Sequences

19 0.39289138 60 iccv-2013-Bayesian Robust Matrix Factorization for Image and Video Processing

20 0.38625148 86 iccv-2013-Concurrent Action Detection with Structural Prediction

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.025), (7, 0.03), (19, 0.299), (26, 0.063), (31, 0.045), (34, 0.018), (42, 0.105), (48, 0.014), (64, 0.049), (73, 0.046), (78, 0.036), (89, 0.138), (97, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.7317189 243 iccv-2013-Learning Slow Features for Behaviour Analysis

Author: Lazaros Zafeiriou, Mihalis A. Nicolaou, Stefanos Zafeiriou, Symeon Nikitidis, Maja Pantic

2 0.6714654 418 iccv-2013-The Way They Move: Tracking Multiple Targets with Similar Appearance

Author: Caglayan Dicle, Octavia I. Camps, Mario Sznaier

Abstract: We introduce a computationally efficient algorithm for multi-object tracking by detection that addresses four main challenges: appearance similarity among targets, missing data due to targets being out of the field of view or occluded behind other objects, crossing trajectories, and camera motion. The proposed method uses motion dynamics as a cue to distinguish targets with similar appearance, minimize target mis-identification and recover missing data. Computational efficiency is achieved by using a Generalized Linear Assignment (GLA) coupled with efficient procedures to recover missing data and estimate the complexity of the underlying dynamics. The proposed approach works with tracklets of arbitrary length and does not assume a dynamical model a priori, yet it captures the overall motion dynamics of the targets. Experiments using challenging videos show that this framework can handle complex target motions, non-stationary cameras and long occlusions, on scenarios where appearance cues are not available or poor.

3 0.63624454 371 iccv-2013-Saliency Detection via Absorbing Markov Chain

Author: Bowen Jiang, Lihe Zhang, Huchuan Lu, Chuan Yang, Ming-Hsuan Yang

Abstract: In this paper, we formulate saliency detection via absorbing Markov chain on an image graph model. We jointly consider the appearance divergence and spatial distribution of salient objects and the background. The virtual boundary nodes are chosen as the absorbing nodes in a Markov chain and the absorbed time from each transient node to boundary absorbing nodes is computed. The absorbed time of transient node measures its global similarity with all absorbing nodes, and thus salient objects can be consistently separated from the background when the absorbed time is used as a metric. Since the time from transient node to absorbing nodes relies on the weights on the path and their spatial distance, the background region on the center of image may be salient. We further exploit the equilibrium distribution in an ergodic Markov chain to reduce the absorbed time in the long-range smooth background regions. Extensive experiments on four benchmark datasets demonstrate robustness and efficiency of the proposed method against the state-of-the-art methods.

4 0.62966645 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation

Author: Zhengxiang Wang, Rujie Liu

Abstract: This paper introduces the use of semi-supervised learning for large scale image cosegmentation. Different from traditional unsupervised cosegmentation that does not use any segmentation groundtruth, semi-supervised cosegmentation exploits the similarity from both the very limited training image foregrounds, as well as the common object shared between the large number of unsegmented images. This would be a much more practical way to effectively cosegment a large number of related images simultaneously, where previous unsupervised cosegmentation works poorly due to the large variances in appearance between different images and the lack of segmentation groundtruth for guidance in cosegmentation. For semi-supervised cosegmentation in large scale, we propose an effective method by minimizing an energy function, which consists of the inter-image distance, the intra-image distance and the balance term. We also propose an iterative updating algorithm to efficiently solve this energy function, which decomposes the original energy minimization problem into sub-problems, and updates each image alternatively to reduce the number of variables in each subproblem for computation efficiency. Experiment results on iCoseg and Pascal VOC datasets show that the proposed cosegmentation method can effectively cosegment hundreds of images in less than one minute. And our semi-supervised cosegmentation is able to outperform both unsupervised cosegmentation as well as fully supervised single image segmentation, especially when the training data is limited.

5 0.61987913 314 iccv-2013-Perspective Motion Segmentation via Collaborative Clustering

Author: Zhuwen Li, Jiaming Guo, Loong-Fah Cheong, Steven Zhiying Zhou

Abstract: This paper addresses real-world challenges in the motion segmentation problem, including perspective effects, missing data, and unknown number of motions. It first formulates the 3-D motion segmentation from two perspective views as a subspace clustering problem, utilizing the epipolar constraint of an image pair. It then combines the point correspondence information across multiple image frames via a collaborative clustering step, in which tight integration is achieved via a mixed norm optimization scheme. For model selection, we propose an over-segment and merge approach, where the merging step is based on the property of the ℓ1-norm of the mutual sparse representation of two oversegmented groups. The resulting algorithm can deal with incomplete trajectories and perspective effects substantially better than state-of-the-art two-frame and multi-frame methods. Experiments on a 62-clip dataset show the significant superiority of the proposed idea in both segmentation accuracy and model selection.

6 0.58179545 190 iccv-2013-Handling Occlusions with Franken-Classifiers

7 0.5488596 150 iccv-2013-Exemplar Cut

8 0.54632574 330 iccv-2013-Proportion Priors for Image Sequence Segmentation

9 0.5458262 259 iccv-2013-Manifold Based Face Synthesis from Sparse Samples

10 0.54533899 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

11 0.54321355 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

12 0.54312015 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition

13 0.54306519 79 iccv-2013-Coherent Object Detection with 3D Geometric Context from a Single Image

14 0.54297948 349 iccv-2013-Regionlets for Generic Object Detection

15 0.54286087 376 iccv-2013-Scene Text Localization and Recognition with Oriented Stroke Detection

16 0.54269749 230 iccv-2013-Latent Data Association: Bayesian Model Selection for Multi-target Tracking

17 0.54244357 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation

18 0.54194391 223 iccv-2013-Joint Noise Level Estimation from Personal Photo Collections

19 0.541713 270 iccv-2013-Modeling Self-Occlusions in Dynamic Shape and Appearance Tracking

20 0.54103124 187 iccv-2013-Group Norm for Learning Structured SVMs with Unstructured Latent Variables