jmlr jmlr2012 jmlr2012-88 knowledge-graph by maker-knowledge-mining

88 jmlr-2012-PREA: Personalized Recommendation Algorithms Toolkit


Source: pdf

Author: Joonseok Lee, Mingxuan Sun, Guy Lebanon

Abstract: Recommendation systems are important business applications with significant economic impact. In recent years, a large number of algorithms have been proposed for recommendation systems. In this paper, we describe an open-source toolkit implementing many recommendation algorithms as well as popular evaluation metrics. In contrast to other packages, our toolkit implements recent state-of-the-art algorithms as well as most classic algorithms. Keywords: recommender systems, collaborative filtering, evaluation metrics

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 College of Computing, Georgia Institute of Technology, Atlanta, Georgia 30332, USA. Editor: Soeren Sonnenburg. Abstract: Recommendation systems are important business applications with significant economic impact. [sent-5, score-0.038]

2 In recent years, a large number of algorithms have been proposed for recommendation systems. [sent-6, score-0.489]

3 In this paper, we describe an open-source toolkit implementing many recommendation algorithms as well as popular evaluation metrics. [sent-7, score-0.797]

4 In contrast to other packages, our toolkit implements recent state-of-the-art algorithms as well as most classic algorithms. [sent-8, score-0.26]

5 Keywords: recommender systems, collaborative filtering, evaluation metrics. [sent-9, score-0.411]

6 Introduction. As the demand for personalized services in E-commerce increases, recommendation systems are emerging as an important business application. [sent-10, score-0.632]

7 Amazon.com, for example, provides personalized product recommendations based on previous purchases. [sent-12, score-0.184]

8 A wide variety of algorithms have been proposed by the research community for recommendation systems. [sent-17, score-0.489]

9 Unlike classification where comprehensive packages are available, existing recommendation systems toolkits lag behind. [sent-18, score-0.763]

10 They concentrate on implementing traditional algorithms rather than the rapidly evolving state-of-the-art. [sent-19, score-0.126]

11 Implementations of modern algorithms are scattered over different sources which makes it hard to have a fair and comprehensive comparison. [sent-20, score-0.108]

12 In this paper we describe a new toolkit PREA (Personalized Recommendation Algorithms Toolkit), implementing a wide variety of recommendation systems algorithms. [sent-21, score-0.73]

13 PREA offers implementations of modern state-of-the-art recommendation algorithms as well as most traditional ones. [sent-22, score-0.563]

14 In addition, it provides many popular evaluation methods and data sets. [sent-23, score-0.067]

15 As a result, it can be used to conduct fair and comprehensive comparisons between different recommendation algorithms. [sent-24, score-0.573]

16 The implemented evaluation routines, data sets, and the open-source nature of PREA make it easy for third parties to contribute additional implementations. [sent-25, score-0.067]

17 Not surprisingly, the performance of recommendation algorithms depends on the data characteristics. [sent-26, score-0.489]

18 For example, some algorithms may work better or worse depending on the amount of missing data (sparsity), the distribution of ratings, and the number of users and items. [sent-28, score-0.029]

19 Furthermore, the different evaluation methods that have been proposed in the literature may have conflicting orderings over algorithms (Gunawardana and Shani, 2009). [sent-30, score-0.116]

20 Comparing algorithms using a variety of evaluation metrics may clarify which algorithms perform better under what circumstances. [sent-32, score-0.183]

21 The toolkit consists of three groups of classes, as shown in Figure 1. [sent-37, score-0.189]

22 The top-level routines of the toolkit may be directly called from other programming environments like Matlab. [sent-38, score-0.229]

23 The top two groups implement basic data structures and recommendation algorithms. [sent-39, score-0.489]

24 The third group (bottom) implements evaluation metrics, statistical distributions, and other utility functions. [sent-40, score-0.138]

25 1 Input Data File Format. As in the WEKA toolkit, PREA accepts data in ARFF (Attribute-Relation File Format). [sent-42, score-0.189]

26 Since virtually all recommendation systems data sets are sparse, PREA implements the sparse (rather than dense) ARFF format. [sent-43, score-0.588]
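As an illustration of the sparse ARFF convention, each instance lists only its non-zero attributes as zero-based index/value pairs inside braces; omitted attributes are treated as zero. The relation and attribute names below are hypothetical, not taken from a PREA data set:

```
@RELATION ratings
@ATTRIBUTE item0 NUMERIC
@ATTRIBUTE item1 NUMERIC
@ATTRIBUTE item2 NUMERIC
@DATA
{0 5, 2 3}
{1 4}
```

Here the first user rated item0 with 5 and item2 with 3; the second user rated only item1.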

27 2 Evaluation Framework. For ease of comparison, PREA provides the following unified interface for running the different recommendation algorithms. [sent-45, score-0.489]

28 • Instantiate a class instance according to the type of the recommendation algorithm. [sent-46, score-0.489]

29 Given a test set, predict user ratings over unseen items. [sent-48, score-0.215]

30 Given the above prediction and held-out ground truth ratings, evaluate the prediction using various evaluation metrics. [sent-49, score-0.067]

31 Note that lazy learning algorithms like the constant models or memory-based algorithms may skip the second step. [sent-50, score-0.023]

32 A simple train/test split is constructed by choosing a certain proportion of all ratings as the test set and assigning the remaining ratings to the training set. [sent-51, score-0.318]
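The instantiate/build/predict/evaluate flow and the random split can be sketched as follows. This is an illustrative Python sketch, not PREA's actual (Java) API; the `ItemAverage` baseline and all names here are hypothetical:

```python
import random

def split_ratings(ratings, test_fraction=0.2, seed=0):
    # Randomly choose a proportion of all (user, item, rating) triples
    # as the test set; the remaining triples form the training set.
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

class ItemAverage:
    # Baseline recommender: predict each item's mean training rating,
    # falling back to the global mean for unseen items.
    def build_model(self, train):
        sums, counts = {}, {}
        for user, item, r in train:
            sums[item] = sums.get(item, 0.0) + r
            counts[item] = counts.get(item, 0) + 1
        self.global_mean = sum(r for _, _, r in train) / len(train)
        self.item_mean = {i: sums[i] / counts[i] for i in sums}

    def predict(self, user, item):
        return self.item_mean.get(item, self.global_mean)

# 1. instantiate, 2. build a model on the train set,
# 3. predict held-out ratings, 4. evaluate (here: mean absolute error).
ratings = [(u, i, (u * i) % 5 + 1) for u in range(10) for i in range(8)]
train, test = split_ratings(ratings)
model = ItemAverage()
model.build_model(train)
errors = [abs(model.predict(u, i) - r) for u, i, r in test]
mae = sum(errors) / len(errors)
```

A lazy (memory-based) method would simply make `build_model` a no-op and do all the work in `predict`, which is the skipped second step mentioned above.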

33 3 Data Structures. PREA uses a sparse matrix representation for the rating matrices, which are generally extremely sparse (users provide ratings for only a small subset of items). [sent-67, score-0.269]

34 Figure 2 (left) shows an example of a sparse vector, containing data only in indices 1, 3, and 6. [sent-70, score-0.028]

35 This design is also useful for fast transposing of sparse matrices by interchanging rows and columns. [sent-73, score-0.055]

36 Figure 2 (right) shows an example of sparse matrix. [sent-74, score-0.028]

37 Figure 2: Sparse Vector (left) and Matrix (right) Implementation. PREA also uses dense matrices in some cases. [sent-75, score-0.038]

38 For example, dense representations are used for low-rank matrix factorizations or other algebraic operations that do not maintain sparsity. [sent-76, score-0.089]

39 The dense representations are based on the matrix implementations in the Universal Java Matrix Package (UJMP) (http://www. [sent-77, score-0.059]

40 4 Implemented Algorithms and Evaluation Metrics. PREA implements the following prediction algorithms: • Baselines (constant, random, overall average, user average, item average): make little use of personalized information for recommending items. [sent-81, score-0.266]

41 • Memory-based neighborhood algorithms (user-based and item-based collaborative filtering and their extensions, including default vote and inverse user frequency): predict ratings of unseen items by referring to those of similar users or items. [sent-82, score-0.478]

42 • Matrix factorization methods (SVD, NMF, PMF, Bayesian PMF, Non-linear PMF): build low-rank user/item profiles by factorizing the training data set with linear algebraic techniques. [sent-83, score-0.054]

43 • Others: recent state-of-the-art algorithms such as Fast NPCA and Rank-based collaborative filtering. [sent-84, score-0.183]
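To make the memory-based family concrete, a minimal user-based collaborative filtering predictor looks roughly like this. It is a sketch with hypothetical function names, not PREA's implementation, and it uses plain cosine similarity (PREA's extensions such as default vote and inverse user frequency would adjust the similarity weights):

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity over the items both users have rated.
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = sqrt(sum(u[i] ** 2 for i in common)) * sqrt(sum(v[i] ** 2 for i in common))
    return num / den if den else 0.0

def predict_user_based(ratings, user, item, k=2):
    # Predict an unseen rating as the similarity-weighted average of the
    # ratings given to the item by the k most similar users.
    neighbours = [(cosine(ratings[user], ratings[v]), v)
                  for v in ratings if v != user and item in ratings[v]]
    neighbours.sort(key=lambda p: p[0], reverse=True)
    top = [(s, v) for s, v in neighbours[:k] if s > 0]
    weight = sum(s for s, _ in top)
    if weight == 0:
        return None  # no informative neighbours rated this item
    return sum(s * ratings[v][item] for s, v in top) / weight
```

Item-based filtering is the transpose of this computation: similarities are computed between item rating vectors instead of user rating vectors.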

44 We provide popular evaluation metrics as follows: • Accuracy metrics for rating prediction (RMSE, MAE, NMAE): measure how similar the predictions are to the actual ratings. [sent-85, score-0.216]

45 • Rank-based evaluation metrics (HLU, NDCG, Kendall's Tau, and Spearman): score predictions by the similarity between the ordering of predicted ratings and that of the ground truth. [sent-86, score-0.391]
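Both metric families above are a few lines each. The sketch below shows RMSE and MAE, plus one common linear-gain variant of NDCG (definitions vary across the literature; this is illustrative, not necessarily the exact formulation PREA uses):

```python
from math import log2, sqrt

def rmse(pred, truth):
    # Root mean squared error between predicted and actual ratings.
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

def mae(pred, truth):
    # Mean absolute error between predicted and actual ratings.
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

def ndcg(pred, truth):
    # Rank items by predicted rating, accumulate the true ratings with a
    # logarithmic position discount, and normalise by the ideal ordering.
    order = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)
    dcg = sum(truth[i] / log2(rank + 2) for rank, i in enumerate(order))
    ideal = sum(t / log2(rank + 2)
                for rank, t in enumerate(sorted(truth, reverse=True)))
    return dcg / ideal if ideal else 0.0
```

An NDCG of 1.0 means the predicted ordering matches the ideal one; misranking high-rated items near the top of the list costs more than misranking items near the bottom, which is exactly what accuracy metrics like RMSE cannot capture.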

46 Related Work. Several other open-source recommendation toolkits are available. [sent-88, score-0.66]

47 Table 1 summarizes the implemented features in these toolkits and compares them to those in PREA. [sent-89, score-0.171]

48 It also supports powerful mathematical and statistical operations as it is a general-purpose machine learning toolkit. [sent-91, score-0.022]

49 Cofi implements several traditional algorithms with a simple design based on providing wrappers for publicly available data sets. [sent-93, score-0.121]

50 MyMedia is a C#-based recommendation toolkit which supports most traditional algorithms and several evaluation metrics. [sent-94, score-0.817]

51 As indicated in Table 1, existing toolkits generally provide simple memory-based algorithms, while recent state-of-the-art algorithms are often not supported. [sent-95, score-0.171]

52 Also, these toolkits are generally limited in their evaluation metrics (with the notable exception of MyMedia). [sent-96, score-0.354]

53 In contrast, PREA provides wide coverage of the most up-to-date algorithms as well as various evaluation metrics, facilitating a comprehensive comparison between state-of-the-art and newly proposed algorithms in the research community. [sent-97, score-0.137]

54 The open-source nature of the software (available under GPL) may encourage other recommendation systems experts to add their own algorithms to PREA. [sent-104, score-0.489]

55 More documentation for developers as well as user tutorials are available on the web (http://prea. [sent-105, score-0.084]

56 A survey of accuracy evaluation metrics of recommendation tasks. [sent-140, score-0.672]

57 Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. [sent-185, score-0.077]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('prea', 0.571), ('recommendation', 0.489), ('toolkit', 0.189), ('collaborative', 0.183), ('toolkits', 0.171), ('pmf', 0.163), ('ratings', 0.159), ('java', 0.139), ('metrics', 0.116), ('personalized', 0.105), ('sun', 0.09), ('lebanon', 0.09), ('gpl', 0.09), ('cf', 0.084), ('salakhutdinov', 0.084), ('recommendations', 0.079), ('implements', 0.071), ('evaluation', 0.067), ('lee', 0.064), ('breese', 0.063), ('ebanon', 0.063), ('ecommendation', 0.063), ('ersonalized', 0.063), ('gunawardana', 0.063), ('guy', 0.063), ('hashmap', 0.063), ('joonseok', 0.063), ('lemire', 0.063), ('mingxuan', 0.063), ('mymedia', 0.063), ('npca', 0.063), ('sarwar', 0.063), ('sparsevector', 0.063), ('gatech', 0.063), ('factorization', 0.056), ('ltering', 0.056), ('mnih', 0.054), ('oolkit', 0.054), ('implementing', 0.052), ('traditional', 0.05), ('tau', 0.049), ('orderings', 0.049), ('lgpl', 0.049), ('arff', 0.049), ('file', 0.049), ('nmf', 0.049), ('comprehensive', 0.047), ('kendall', 0.045), ('spearman', 0.045), ('rmse', 0.045), ('recommender', 0.045), ('mae', 0.042), ('su', 0.04), ('routines', 0.04), ('dense', 0.038), ('lgorithms', 0.038), ('business', 0.038), ('fair', 0.037), ('georgia', 0.036), ('baselines', 0.036), ('item', 0.033), ('vote', 0.033), ('svd', 0.033), ('rating', 0.033), ('co', 0.033), ('format', 0.032), ('algebraic', 0.03), ('ee', 0.03), ('user', 0.03), ('web', 0.03), ('packages', 0.029), ('items', 0.029), ('users', 0.029), ('lawrence', 0.028), ('sparse', 0.028), ('interchanging', 0.027), ('lag', 0.027), ('instantiate', 0.027), ('konstan', 0.027), ('nmae', 0.027), ('year', 0.027), ('recommending', 0.027), ('soeren', 0.027), ('unseen', 0.026), ('un', 0.025), ('factorizing', 0.024), ('release', 0.024), ('evolving', 0.024), ('friend', 0.024), ('urtasun', 0.024), ('sigir', 0.024), ('tutorials', 0.024), ('modern', 0.024), ('industry', 0.023), ('facilitating', 0.023), ('lazy', 0.023), ('filtering', 0.023), ('default', 0.022), ('supports', 0.022), ('matrix', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 88 jmlr-2012-PREA: Personalized Recommendation Algorithms Toolkit


2 0.14643867 101 jmlr-2012-SVDFeature: A Toolkit for Feature-based Collaborative Filtering

Author: Tianqi Chen, Weinan Zhang, Qiuxia Lu, Kailong Chen, Zhao Zheng, Yong Yu

Abstract: In this paper we introduce SVDFeature, a machine learning toolkit for feature-based collaborative filtering. SVDFeature is designed to efficiently solve the feature-based matrix factorization. The feature-based setting allows us to build factorization models incorporating side information such as temporal dynamics, neighborhood relationship, and hierarchical information. The toolkit is capable of both rate prediction and collaborative ranking, and is carefully designed for efficient training on large-scale data set. Using this toolkit, we built solutions to win KDD Cup for two consecutive years. Keywords: large-scale collaborative filtering, context-aware recommendation, ranking

3 0.059609868 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization

Author: Marinka Žitnik, Blaž Zupan

Abstract: NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports both dense and sparse matrix representation. NIMFA’s component-based implementation and hierarchical design should help the users to employ already implemented techniques or design and code new strategies for matrix factorization tasks. Keywords: nonnegative matrix factorization, initialization methods, quality measures, scripting, Python

4 0.054698061 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel

Author: Stephen R. Piccolo, Lewis J. Frey

Abstract: Motivated by a need to classify high-dimensional, heterogeneous data from the bioinformatics domain, we developed ML-Flex, a machine-learning toolbox that enables users to perform two-class and multi-class classification analyses in a systematic yet flexible manner. ML-Flex was written in Java but is capable of interfacing with third-party packages written in other programming languages. It can handle multiple input-data formats and supports a variety of customizations. MLFlex provides implementations of various validation strategies, which can be executed in parallel across multiple computing cores, processors, and nodes. Additionally, ML-Flex supports aggregating evidence across multiple algorithms and data sets via ensemble learning. This open-source software package is freely available from http://mlflex.sourceforge.net. Keywords: toolbox, classification, parallel, ensemble, reproducible research

5 0.041038256 108 jmlr-2012-Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

Author: Nicolas Gillis

Abstract: Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way to obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on the preprocessing of the nonnegative input data matrix, and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (2003), and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image data sets. Keywords: nonnegative matrix factorization, data preprocessing, uniqueness, sparsity, inversepositive matrices

6 0.036372881 90 jmlr-2012-Pattern for Python

7 0.033369776 10 jmlr-2012-A Unified View of Performance Metrics: Translating Threshold Choice into Expected Classification Loss

8 0.033315904 60 jmlr-2012-Local and Global Scaling Reduce Hubs in Space

9 0.029364999 52 jmlr-2012-Iterative Reweighted Algorithms for Matrix Rank Minimization

10 0.028173175 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development

11 0.024160856 86 jmlr-2012-Optimistic Bayesian Sampling in Contextual-Bandit Problems

12 0.02121095 62 jmlr-2012-MULTIBOOST: A Multi-purpose Boosting Package

13 0.020288756 99 jmlr-2012-Restricted Strong Convexity and Weighted Matrix Completion: Optimal Bounds with Noise

14 0.019756826 79 jmlr-2012-Oger: Modular Learning Architectures For Large-Scale Sequential Processing

15 0.018666174 70 jmlr-2012-Multi-Assignment Clustering for Boolean Data

16 0.018606238 15 jmlr-2012-Algebraic Geometric Comparison of Probability Distributions

17 0.017617101 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression

18 0.017482437 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies

19 0.01744728 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems

20 0.017368991 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.073), (1, 0.029), (2, 0.14), (3, -0.032), (4, -0.021), (5, 0.051), (6, 0.051), (7, -0.035), (8, -0.029), (9, -0.183), (10, -0.177), (11, -0.123), (12, 0.215), (13, -0.089), (14, -0.014), (15, 0.105), (16, -0.026), (17, -0.028), (18, -0.103), (19, -0.125), (20, -0.007), (21, 0.168), (22, 0.163), (23, -0.096), (24, -0.045), (25, 0.412), (26, 0.069), (27, -0.097), (28, -0.031), (29, 0.024), (30, -0.009), (31, 0.044), (32, -0.081), (33, -0.137), (34, -0.035), (35, 0.015), (36, 0.029), (37, 0.062), (38, -0.02), (39, 0.014), (40, -0.081), (41, 0.001), (42, -0.006), (43, -0.051), (44, -0.035), (45, 0.082), (46, 0.04), (47, -0.031), (48, 0.01), (49, 0.002)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97852045 88 jmlr-2012-PREA: Personalized Recommendation Algorithms Toolkit


2 0.9278779 101 jmlr-2012-SVDFeature: A Toolkit for Feature-based Collaborative Filtering


3 0.37527779 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel


4 0.26260397 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization


5 0.19876102 10 jmlr-2012-A Unified View of Performance Metrics: Translating Threshold Choice into Expected Classification Loss

Author: José Hernández-Orallo, Peter Flach, Cèsar Ferri

Abstract: Many performance metrics have been introduced in the literature for the evaluation of classification performance, each of them with different origins and areas of application. These metrics include accuracy, unweighted accuracy, the area under the ROC curve or the ROC convex hull, the mean absolute error and the Brier score or mean squared error (with its decomposition into refinement and calibration). One way of understanding the relations among these metrics is by means of variable operating conditions (in the form of misclassification costs and/or class distributions). Thus, a metric may correspond to some expected loss over different operating conditions. One dimension for the analysis has been the distribution for this range of operating conditions, leading to some important connections in the area of proper scoring rules. We demonstrate in this paper that there is an equally important dimension which has so far received much less attention in the analysis of performance metrics. This dimension is given by the decision rule, which is typically implemented as a threshold choice method when using scoring models. In this paper, we explore many old and new threshold choice methods: fixed, score-uniform, score-driven, rate-driven and optimal, among others. By calculating the expected loss obtained with these threshold choice methods for a uniform range of operating conditions we give clear interpretations of the 0-1 loss, the absolute error, the Brier score, the AUC and the refinement loss respectively. Our analysis provides a comprehensive view of performance metrics as well as a systematic approach to loss minimisation which can be summarised as follows: given a model, apply the threshold choice methods that correspond with the available information about the operating condition, and compare their expected losses. 
In order to assist in this procedure we also derive several connections between the aforementioned performance metrics, and we highlight the role of calibration.

6 0.19861965 60 jmlr-2012-Local and Global Scaling Reduce Hubs in Space

7 0.18036284 79 jmlr-2012-Oger: Modular Learning Architectures For Large-Scale Sequential Processing

8 0.16613457 70 jmlr-2012-Multi-Assignment Clustering for Boolean Data

9 0.13328178 52 jmlr-2012-Iterative Reweighted Algorithms for Matrix Rank Minimization

10 0.13101861 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression

11 0.11519351 86 jmlr-2012-Optimistic Bayesian Sampling in Contextual-Bandit Problems

12 0.1099764 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development

13 0.10571464 55 jmlr-2012-Learning Algorithms for the Classification Restricted Boltzmann Machine

14 0.10331957 15 jmlr-2012-Algebraic Geometric Comparison of Probability Distributions

15 0.10161251 108 jmlr-2012-Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

16 0.10053387 114 jmlr-2012-Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies

17 0.098740846 90 jmlr-2012-Pattern for Python

18 0.086707003 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy

19 0.081136376 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies

20 0.079965338 36 jmlr-2012-Efficient Methods for Robust Classification Under Uncertainty in Kernel Matrices


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.046), (21, 0.025), (26, 0.023), (27, 0.016), (29, 0.013), (35, 0.014), (49, 0.021), (56, 0.057), (57, 0.015), (69, 0.587), (75, 0.019), (92, 0.027), (96, 0.051)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.88377845 88 jmlr-2012-PREA: Personalized Recommendation Algorithms Toolkit


2 0.84942144 6 jmlr-2012-A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives

Author: Aleix Martinez, Shichuan Du

Abstract: In cognitive science and neuroscience, there have been two leading models describing how humans perceive and classify facial expressions of emotion—the continuous and the categorical model. The continuous model defines each facial expression of emotion as a feature vector in a face space. This model explains, for example, how expressions of emotion can be seen at different intensities. In contrast, the categorical model consists of C classifiers, each tuned to a specific emotion category. This model explains, among other findings, why the images in a morphing sequence between a happy and a surprise face are perceived as either happy or surprise but not something in between. While the continuous model has a more difficult time justifying this latter finding, the categorical model is not as good when it comes to explaining how expressions are recognized at different intensities or modes. Most importantly, both models have problems explaining how one can recognize combinations of emotion categories such as happily surprised versus angrily surprised versus surprise. To resolve these issues, in the past several years, we have worked on a revised model that justifies the results reported in the cognitive science and neuroscience literature. This model consists of C distinct continuous spaces. Multiple (compound) emotion categories can be recognized by linearly combining these C face spaces. The dimensions of these spaces are shown to be mostly configural. According to this model, the major task for the classification of facial expressions of emotion is precise, detailed detection of facial landmarks rather than recognition. We provide an overview of the literature justifying the model, show how the resulting model can be employed to build algorithms for the recognition of facial expression of emotion, and propose research directions in machine learning and computer vision researchers to keep pushing the state of the art in these areas. 
We also discuss how the model can aid in stu

3 0.83684272 50 jmlr-2012-Human Gesture Recognition on Product Manifolds

Author: Yui Man Lui

Abstract: Action videos are multidimensional data and can be naturally represented as data tensors. While tensor computing is widely used in computer vision, the geometry of tensor space is often ignored. The aim of this paper is to demonstrate the importance of the intrinsic geometry of tensor space which yields a very discriminating structure for action recognition. We characterize data tensors as points on a product manifold and model it statistically using least squares regression. To this aim, we factorize a data tensor relating to each order of the tensor using Higher Order Singular Value Decomposition (HOSVD) and then impose each factorized element on a Grassmann manifold. Furthermore, we account for underlying geometry on manifolds and formulate least squares regression as a composite function. This gives a natural extension from Euclidean space to manifolds. Consequently, classification is performed using geodesic distance on a product manifold where each factor manifold is Grassmannian. Our method exploits appearance and motion without explicitly modeling the shapes and dynamics. We assess the proposed method using three gesture databases, namely the Cambridge hand-gesture, the UMD Keck body-gesture, and the CHALEARN gesture challenge data sets. Experimental results reveal that not only does the proposed method perform well on the standard benchmark data sets, but also it generalizes well on the one-shot-learning gesture challenge. Furthermore, it is based on a simple statistical model and the intrinsic geometry of tensor space. Keywords: gesture recognition, action recognition, Grassmann manifolds, product manifolds, one-shot-learning, kinect data

4 0.2766979 106 jmlr-2012-Sign Language Recognition using Sub-Units

Author: Helen Cooper, Eng-Jon Ong, Nicolas Pugeault, Richard Bowden

Abstract: This paper discusses sign language recognition using linguistic sub-units. It presents three types of sub-units for consideration; those learnt from appearance data as well as those inferred from both 2D or 3D tracking data. These sub-units are then combined using a sign level classifier; here, two options are presented. The first uses Markov Models to encode the temporal changes between sub-units. The second makes use of Sequential Pattern Boosting to apply discriminative feature selection at the same time as encoding temporal information. This approach is more robust to noise and performs well in signer independent tests, improving results from the 54% achieved by the Markov Chains to 76%. Keywords: sign language recognition, sequential pattern boosting, depth cameras, sub-units, signer independence, data set

5 0.26546454 45 jmlr-2012-Finding Recurrent Patterns from Continuous Sign Language Sentences for Automated Extraction of Signs

Author: Sunita Nayak, Kester Duncan, Sudeep Sarkar, Barbara Loeding

Abstract: We present a probabilistic framework to automatically learn models of recurring signs from multiple sign language video sequences containing the vocabulary of interest. We extract the parts of the signs that are present in most occurrences of the sign in context and are robust to the variations produced by adjacent signs. Each sentence video is first transformed into a multidimensional time series representation, capturing the motion and shape aspects of the sign. Skin color blobs are extracted from frames of color video sequences, and a probabilistic relational distribution is formed for each frame using the contour and edge pixels from the skin blobs. Each sentence is represented as a trajectory in a low dimensional space called the space of relational distributions. Given these time series trajectories, we extract signemes from multiple sentences concurrently using iterated conditional modes (ICM). We show results by learning single signs from a collection of sentences with one common pervading sign, multiple signs from a collection of sentences with more than one common sign, and single signs from a mixed collection of sentences. The extracted signemes demonstrate that our approach is robust to some extent to the variations produced within a sign due to different contexts. We also show results whereby these learned sign models are used for spotting signs in test sequences. Keywords: pattern extraction, sign language recognition, signeme extraction, sign modeling, iterated conditional modes

6 0.25754765 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems

7 0.25533265 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization

8 0.24384405 100 jmlr-2012-Robust Kernel Density Estimation

9 0.24279729 101 jmlr-2012-SVDFeature: A Toolkit for Feature-based Collaborative Filtering

10 0.21047123 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development

11 0.20620744 83 jmlr-2012-Online Learning in the Embedded Manifold of Low-rank Matrices

12 0.19290671 62 jmlr-2012-MULTIBOOST: A Multi-purpose Boosting Package

13 0.1886487 92 jmlr-2012-Positive Semidefinite Metric Learning Using Boosting-like Algorithms

14 0.18422741 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies

15 0.18416663 5 jmlr-2012-A Local Spectral Method for Graphs: With Applications to Improving Graph Partitions and Exploring Data Graphs Locally

16 0.18370074 36 jmlr-2012-Efficient Methods for Robust Classification Under Uncertainty in Kernel Matrices

17 0.18312331 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models

18 0.17956445 43 jmlr-2012-Fast Approximation of Matrix Coherence and Statistical Leverage

19 0.17867383 64 jmlr-2012-Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning

20 0.1780367 77 jmlr-2012-Non-Sparse Multiple Kernel Fisher Discriminant Analysis