nips nips2007 nips2007-211 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sabri Boutemedjet, Djemel Ziou, Nizar Bouguila
Abstract: Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. We propose a generative model in which the visual features and users are clustered into separate classes. We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. The goal is to ensure an accurate prediction of ratings for multidimensional, non-Gaussian, and continuous image descriptors. Experiments on collected data have demonstrated the merits of our approach.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. [sent-7, score-0.841]
2 In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. [sent-8, score-0.188]
3 We propose a generative model in which the visual features and users are clustered into separate classes. [sent-9, score-0.307]
4 We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. [sent-10, score-0.74]
5 The goal is to ensure an accurate prediction of ratings for multidimensional non-Gaussian and continuous image descriptors. [sent-11, score-0.285]
6 1 Introduction Products in today’s e-market are described using both visual and textual information. [sent-13, score-0.202]
7 In consumer psychology, visual information has been recognized as an important factor that influences the consumer's decision making and has strong persuasive power [4]. [sent-14, score-0.278]
8 Furthermore, it is well recognized that consumer choice is also influenced by the external environment or context, such as time and location [4]. [sent-15, score-0.141]
9 For example, a consumer could express an information need during travel that differs from the situation when she or he is working or at home. [sent-16, score-0.078]
10 “Content-Based Image Suggestion” (CBIS) [4] motivates the modeling of user preferences with respect to visual information under the influence of the context. [sent-17, score-0.598]
11 Therefore, CBIS aims to suggest products whose relevance is inferred from users' histories, gathered in different contexts, on images of previously consumed products. [sent-18, score-0.294]
12 The domains considered by CBIS are a set of users U = {1, 2, ..., Nu}, a set of visual documents V = {v1, v2, ..., vNv}, a set of contexts E, and a set of ratings R. [sent-19, score-0.06] [sent-22, score-0.172]
14 Each vk is an arbitrary descriptor (visual, textual, or categorical) used to represent images or products. [sent-29, score-0.093]
15 In this work, we consider an image as a D-dimensional vector v = (v1, v2, ..., vD). [sent-30, score-0.124]
16 The visual features may be local such as interest points or global such as color, texture, or shape. [sent-34, score-0.214]
17 The history of each user u ∈ U is defined as Du = {<u, e(j), v(j), r(j)> | e(j) ∈ E, v(j) ∈ V, r(j) ∈ R, j = 1, ..., |Du|}. [sent-42, score-0.240]
18 Figure 1: The VCC-FMM identifies like-mindedness from similar appreciations on similar images represented in 3-dimensional space. [sent-46, score-0.093]
19 Notice the inter-relation between the number of image clusters and the considered feature subset. [sent-47, score-0.242]
20 In the literature, the modeling of user preferences has been addressed mainly within the collaborative filtering (CF) and content-based filtering (CBF) communities. [sent-48, score-0.528]
21 On the other hand, CF approaches predict the relevance of a given product for a given user based on the preferences provided by a set of “like-minded” (similar tastes) users. [sent-50, score-0.432]
22 The Aspect model [7] and the flexible mixture model (FMM) [15] are examples of some model-based CF approaches. [sent-52, score-0.122]
23 Recently, the authors in [4] proposed a statistical model for CBIS which uses both visual and contextual information in modeling user preferences with respect to multidimensional, non-Gaussian, and continuous data. [sent-53, score-0.678]
24 Users with similar preferences are considered in [4] to be those who appreciated similar images to similar degrees. [sent-54, score-0.137]
25 Therefore, instead of considering products as categorical variables (as in CF), visual documents are represented by richer visual information in the form of a vector of visual features (texture, shape, and interest points). [sent-55, score-0.635]
26 The similarity between images and between user preferences is modeled in [4] through a single graphical model which clusters users and images separately into homogeneous groups in a similar way to the flexible mixture model (FMM) [15]. [sent-56, score-0.826]
27 In addition, since image data are generally non-Gaussian [1], the class-conditional distributions of visual features are assumed to be Dirichlet densities. [sent-57, score-0.338]
28 In this way, the like-mindedness in user preferences is captured at the level of visual features. [sent-58, score-0.549]
29 First, once the model is learned from training data (the union of user histories), it can be used to "suggest" unknown (possibly unrated) images efficiently. [sent-60, score-0.366]
30 Second, the model can be updated from new data (images or ratings) in an online fashion in order to handle changes in image clusters and/or user preferences. [sent-63, score-0.441]
31 Third, model selection approaches can be employed to identify "without supervision" the numbers of both user preferences and image clusters (i.e., the model order). [sent-64, score-0.659]
32 It should be stressed that the unsupervised selection of the model order has not been addressed in the CF/CBF literature. [sent-67, score-0.195]
33 Indeed, the model order in many well-founded statistical models such as the Aspect model [7] or FMM [15] was set "empirically" as a compromise between the model's complexity and the accuracy of prediction, but not learned from the data. [sent-68, score-0.098]
34 From an “image collection modeling” point of view, the work in [4] has focused on modeling user preferences with respect to non-Gaussian image data. [sent-69, score-0.55]
35 However, since CBIS generally employs high-dimensional image descriptors, the problem of accurately modeling image collections needs to be addressed in order to overcome the curse of dimensionality and provide accurate suggestions. [sent-70, score-0.424]
36 Indeed, the presence of many irrelevant features substantially degrades the performance of modeling and prediction [6], in addition to increasing the computational complexity. [sent-71, score-0.163]
37 To achieve a better modeling, we consider feature selection and extraction as another “key issue” for CBIS. [sent-72, score-0.155]
38 In the literature [6], the process of feature selection in mixture models has not received as much attention as in supervised learning. [sent-73, score-0.211]
39 The main reason is the absence of class labels that may guide the selection process [6]. [sent-74, score-0.081]
40 In this paper, we address the issue of feature selection in CBIS through a new generative model which we call Visual Content Context-aware Flexible Mixture Model (VCC-FMM). [sent-75, score-0.188]
41 Due to the inter-relation between feature subsets and the model order, i.e., different feature subsets correspond to different natural groupings of images, we propose to learn the VCC-FMM from unlabeled data using the Minimum Message Length (MML) approach [16]. [sent-76, score-0.107] [sent-78, score-0.074]
43 The next section details the VCC-FMM model with integrated feature selection. [sent-79, score-0.107]
44 2 The Visual Content Context Flexible Mixture Model The data set D used to learn a CBIS system is the union of all user histories, i.e., D = ∪_{u∈U} Du. [sent-83, score-0.276]
45 From this data set we model both the like-mindedness shared by user groups and the visual and semantic similarity between images [4]. [sent-86, score-0.575]
46 To that end, we introduce two latent variables z and c to label each observation <u, e, v, r> with information about user classes and image classes, respectively. [sent-87, score-0.396]
47 Then, the rating r for a given user u, context e and a visual document v can be predicted on the basis of probabilities p(r|u, e, v) that can be derived by conditioning the generative model p(u, e, v, r). [sent-89, score-0.56]
48 Let K and M be the numbers of user classes and image classes, respectively. An initial model for CBIS can be derived as [4]: p(v, r, u, e) = Σ_{z=1}^{K} Σ_{c=1}^{M} p(z) p(c) p(u|z) p(e|z) p(v|c) p(r|z, c) (1) The quantities p(z) and p(c) denote the a priori weights of the user and image classes. [sent-93, score-0.794]
49 p(u|z) and p(e|z) denote the likelihoods of a user and a context belonging, respectively, to the user class z. [sent-94, score-0.275]
50 p(r|z, c) is the probability of sampling a rating for a given user class and image class. [sent-95, score-0.444]
51 On the other hand, image descriptors are high-dimensional, continuous, and generally non-Gaussian data [1]. [sent-97, score-0.201]
52 Thus, the class-conditional densities p(v|c) should be modeled carefully in order to efficiently capture the added value of the visual information. [sent-98, score-0.172]
53 In this work, we assume that p(v|c) is a Generalized Dirichlet distribution (GDD), which is more appropriate than other distributions, such as the Gaussian or the Dirichlet, for modeling image collections [1]. [sent-99, score-0.216]
54 p(v | Θ*_c) = ∏_{l=1}^{D} [Γ(α*_{cl} + β*_{cl}) / (Γ(α*_{cl}) Γ(β*_{cl}))] v_l^{α*_{cl} − 1} (1 − Σ_{k=1}^{l} v_k)^{γ*_{cl}} (2) where Σ_{l=1}^{D} v_l < 1 and 0 < v_l < 1 for l = 1, ..., D. [sent-103, score-1.38]
55 Under a change of variables, each x_l follows a Beta distribution p_b(x_l | θ*_{cl}) with parameters θ*_{cl} = (α*_{cl}, β*_{cl}), which leads to the factorization p(x | Θ*_c) = ∏_{l=1}^{D} p_b(x_l | θ*_{cl}). [sent-121, score-0.24]
56 The independence between the x_l makes the estimation of a GDD very efficient. [sent-122, score-0.718]
57 However, even with independent features, the unsupervised identification of image clusters based on high-dimensional descriptors remains a hard problem due to the omnipresence of noisy, redundant and uninformative features [6] that degrade the accuracy of the modeling and prediction. [sent-125, score-0.449]
58 We consider feature selection and extraction as a "key" methodology in order to remove such features from our modeling. [sent-126, score-0.197]
59 In Figure 1, four well-separated image clusters can be identified from only the two relevant features 1 and 2, which are multimodal and influenced by class labels. [sent-129, score-0.21]
60 Otherwise, a feature is unimodal (i.e., irrelevant) and can be approximated by a single Beta distribution p_b(x_l | ξ_l). [sent-132, score-0.24]
61 This definition of feature relevance has been motivated in unsupervised learning [2][9]. [sent-134, score-0.107]
62 φ_l is set to 1 when the l-th feature is relevant and 0 otherwise. [sent-139, score-0.074]
63 The set Θ of all VCC-FMM parameters comprises the class priors θ^Z and θ^C, the conditional probabilities θ_zu, θ_ze, and θ_zcr, the feature relevance weights θ_{φl}, and the Beta parameters θ_cl and ξ_l. [sent-148, score-0.438]
64 The log-likelihood of D = {<u^(i), e^(i), x^(i), r^(i)> | i = 1, ..., N, u^(i) ∈ U, e^(i) ∈ E, x^(i) ∈ X, r^(i) ∈ R} is given by: log p(D|Θ) = Σ_{i=1}^{N} log Σ_{z=1}^{K} Σ_{c=1}^{M} p(z) p(c) p(u^(i)|z) p(e^(i)|z) p(r^(i)|z, c) ∏_{l=1}^{D} [ε_{l1} p_b(x_l^(i) | θ_cl) + ε_{l2} p_b(x_l^(i) | ξ_l)] (5) The maximum likelihood (ML) approach optimizes equation (5) with respect to Θ, which is problematic for identifying K and M. [sent-152, score-0.536]
65 To overcome these problems, we define a message length objective [16] for both the estimation of Θ and the identification of K and M using MML [9][2]. [sent-156, score-0.077]
66 It is natural to assume independence among the different groups of parameters, which factorizes both |I(Θ)| and p(Θ) over the Fisher information and the prior distributions of the different parameter groups, respectively. [sent-159, score-0.074]
67 The Fisher information of θ_cl and ξ_l can be computed by following a methodology similar to that of [1]. [sent-161, score-0.438]
68 4 Experiments The benefits of using feature selection and the contextual information are evaluated by considering two variants, V-FMM and V-GD-FMM, in addition to the original VCC-FMM given by equation (4). [sent-175, score-0.202]
69 V-FMM does not handle the contextual information and assumes θ_ze constant for all e ∈ E. [sent-176, score-0.137]
70 On the other hand, feature selection is not considered for V-GD-FMM, by setting ε_{l1} = 1 and pruning the uninformative components ξ_l for l = 1, ..., D. [sent-177, score-0.184]
71 4.1 Data Set We have collected ratings from 27 subjects who participated in the experiment (i.e., Nu = 27). [sent-182, score-0.162]
72 Subjects periodically received (twice a day) a list of three images to which they assigned relevance degrees expressed on a five-star rating scale (i.e., Nr = 5). [sent-186, score-0.255]
73 A data set D of 13446 ratings was collected (N = 13446). [sent-191, score-0.128]
74 The image set comprises 4775 images (Nv = 4775) collected from Washington University [10] and collections of free photographs, which we categorized manually into 41 categories. [sent-194, score-0.17]
75 For visual content characterization, we have employed both local and global descriptors. [sent-195, score-0.222]
76 For local descriptors, we use the 128-dimensional Scale Invariant Feature Transform (SIFT) [11] to represent image patches. [sent-196, score-0.124]
77 We apply vector quantization to the SIFT descriptors and build a histogram for each image ("bag of visual words"). [sent-197, score-0.373]
78 For global descriptors, we used the color correlogram for image texture representation, and the edge histogram descriptor. [sent-199, score-0.161]
79 Therefore, a visual feature vector is represented in a 540-dimensional space (D = 540). [sent-200, score-0.246]
80 We measure the accuracy of the prediction by the Mean Absolute Error (MAE), which is the average absolute deviation between the actual and predicted ratings. [sent-201, score-0.072]
81 4.2 First Experiment: Evaluating the influence of model order on the prediction accuracy This experiment investigates the relationship between the assumed model order, defined by K and M, and the prediction accuracy of VCC-FMM. [sent-203, score-0.21]
82 It should be noted that the ground truth number of user classes K* is not known for our data set D. [sent-204, score-0.363]
83 D_GT is sampled from the preferences P1 and P2 of the two most dissimilar subjects according to the Pearson correlation coefficient [14]. [sent-206, score-0.171]
84 We sample ratings for 100 simulated users from the preferences P1 and P2, only on images of four image classes. [sent-207, score-0.508]
85 For each user, we generate 80 ratings (∼ 20 ratings per context). [sent-208, score-0.188]
86 Therefore, the ground truth model order is K ∗ = 2 and M ∗ = 4. [sent-209, score-0.092]
87 Figure 3(a) shows that both K and M were identified correctly on D_GT, since the lowest MML was reported for the model order defined by M = 4 and K = 2. [sent-213, score-0.061]
88 The selection of the best model order is important since it influences the accuracy of the prediction (MAE) as illustrated by Figure 3(b). [sent-214, score-0.186]
89 4.3 Second Experiment: Comparison with the state of the art The aim of this experiment is to measure the contribution of the visual information and the user's context in making accurate predictions, compared with some existing CF approaches. [sent-217, score-0.297]
90 The first five columns of Table 1 show the added value provided by the visual information compared with pure CF techniques. [sent-243, score-0.235]
91 For example, the improvement in rating prediction reported by V-FMM is 3. [sent-244, score-0.068]
92 The algorithms (with context information) shown in the last two columns have also improved the accuracy of the prediction compared with the others (at least 15. [sent-247, score-0.17]
93 This demonstrates the importance of the contextual information for user preferences. [sent-249, score-0.287]
94 Feature selection is also important, since VCC-FMM reported better accuracy (14. [sent-250, score-0.141]
95 Furthermore, Figure 4(a) shows that VCC-FMM is less sensitive to data sparsity (the number of ratings per user) than pure CF techniques. [sent-252, score-0.149]
96 Finally, the evolution of the average MAE provided by VCC-FMM for different proportions of unrated images remains under 25% for up to 30% unrated images, as shown in Figure 4(b). [sent-253, score-0.304]
97 We explain the stability of VCC-FMM's accuracy under data sparsity and new images by the visual information, since only cluster representatives need to be rated. [sent-254, score-0.324]
98 Figure 4: MAE curves with error bars on the data set D. (a) Data sparsity. (b) New images. [sent-255, score-0.12]
99 5 Conclusions This paper has motivated, theoretically and empirically, both feature selection and model order identification from unlabeled data as key issues in content-based image suggestion. [sent-256, score-0.312]
100 Experiments on collected data also showed the importance of the visual information and the user's context in making accurate suggestions. [sent-257, score-0.268]
wordName wordTfidf (topN-words)
[('cl', 0.438), ('xl', 0.28), ('qzci', 0.248), ('pb', 0.24), ('user', 0.24), ('cbis', 0.225), ('mml', 0.216), ('visual', 0.172), ('nr', 0.142), ('preferences', 0.137), ('fmm', 0.135), ('mae', 0.134), ('image', 0.124), ('cf', 0.12), ('ratings', 0.094), ('images', 0.093), ('bouguila', 0.09), ('boutemedjet', 0.09), ('gdd', 0.09), ('sherbrooke', 0.09), ('zcr', 0.09), ('ze', 0.09), ('np', 0.09), ('fisher', 0.082), ('selection', 0.081), ('rating', 0.08), ('consumer', 0.078), ('flexible', 0.078), ('urp', 0.078), ('descriptors', 0.077), ('feature', 0.074), ('collaborative', 0.073), ('beta', 0.073), ('nu', 0.072), ('zc', 0.068), ('zu', 0.068), ('comparatively', 0.063), ('users', 0.06), ('unrated', 0.059), ('dirichlet', 0.059), ('mixture', 0.056), ('relevance', 0.055), ('qc', 0.054), ('unsupervised', 0.052), ('canada', 0.051), ('suggestion', 0.05), ('gt', 0.05), ('du', 0.05), ('content', 0.05), ('modeling', 0.049), ('message', 0.049), ('contextual', 0.047), ('boulevard', 0.045), ('cbf', 0.045), ('concordia', 0.045), ('universite', 0.045), ('clusters', 0.044), ('identi', 0.044), ('collections', 0.043), ('features', 0.042), ('categorical', 0.041), ('prediction', 0.04), ('pcc', 0.039), ('campus', 0.039), ('aspect', 0.038), ('groups', 0.037), ('texture', 0.037), ('products', 0.036), ('histories', 0.036), ('context', 0.035), ('collected', 0.034), ('subjects', 0.034), ('parent', 0.034), ('vl', 0.033), ('model', 0.033), ('accuracy', 0.032), ('irrelevant', 0.032), ('truth', 0.032), ('classes', 0.032), ('recommendation', 0.032), ('cd', 0.032), ('noticed', 0.032), ('pearson', 0.032), ('textual', 0.03), ('uenced', 0.03), ('uninformative', 0.029), ('addressed', 0.029), ('reported', 0.028), ('overcome', 0.028), ('log', 0.028), ('recognized', 0.028), ('sift', 0.028), ('accurate', 0.027), ('sparsity', 0.027), ('filtering', 0.027), ('bell', 0.027), ('star', 0.027), ('parents', 0.027), ('universit', 0.027), ('ground', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data
Author: Sabri Boutemedjet, Djemel Ziou, Nizar Bouguila
Abstract: Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. We propose a generative model in which the visual features and users are clustered into separate classes. We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. The goal is to ensure an accurate prediction of ratings for multidimensional, non-Gaussian, and continuous image descriptors. Experiments on collected data have demonstrated the merits of our approach.
2 0.13596183 212 nips-2007-Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes
Author: Geoffrey E. Hinton, Ruslan Salakhutdinov
Abstract: We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process. We first learn a deep generative model of the unlabeled data using the fast, greedy algorithm introduced by [7]. If the data is high-dimensional and highly-structured, a Gaussian kernel applied to the top layer of features in the DBN works much better than a similar kernel applied to the raw input. Performance at both regression and classification can then be further improved by using backpropagation through the DBN to discriminatively fine-tune the covariance kernel.
3 0.095940351 19 nips-2007-Active Preference Learning with Discrete Choice Data
Author: Brochu Eric, Nando D. Freitas, Abhijeet Ghosh
Abstract: We propose an active learning algorithm that learns a continuous valuation model from discrete preferences. The algorithm automatically decides what items are best presented to an individual in order to find the item that they value highly in as few trials as possible, and exploits quirks of human psychology to minimize time and cognitive burden. To do this, our algorithm maximizes the expected improvement at each query without accurately modelling the entire valuation surface, which would be needlessly expensive. The problem is particularly difficult because the space of choices is infinite. We demonstrate the effectiveness of the new algorithm compared to related active learning methods. We also embed the algorithm within a decision making tool for assisting digital artists in rendering materials. The tool finds the best parameters while minimizing the number of queries. 1
4 0.090255208 183 nips-2007-Spatial Latent Dirichlet Allocation
Author: Xiaogang Wang, Eric Grimson
Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” and “documents” when using a language model to solve vision problems. In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. The spatial information is not encoded in the values of visual words but in the design of documents. Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. We use SLDA to discover objects from a collection of images, and show it achieves better performance than LDA. 1
5 0.077793784 41 nips-2007-COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking
Author: Markus Weimer, Alexandros Karatzoglou, Quoc V. Le, Alex J. Smola
Abstract: In this paper, we consider collaborative filtering as a ranking problem. We present a method which uses Maximum Margin Matrix Factorization and optimizes ranking instead of rating. We employ structured output prediction to optimize directly for ranking scores. Experimental results show that our method gives very good ranking scores and scales well on collaborative filtering tasks. 1
6 0.072857156 143 nips-2007-Object Recognition by Scene Alignment
7 0.072694808 105 nips-2007-Infinite State Bayes-Nets for Structured Domains
8 0.068527408 158 nips-2007-Probabilistic Matrix Factorization
9 0.060894318 193 nips-2007-The Distribution Family of Similarity Distances
10 0.060132582 111 nips-2007-Learning Horizontal Connections in a Sparse Coding Model of Natural Images
11 0.059585776 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
12 0.059097007 156 nips-2007-Predictive Matrix-Variate t Models
13 0.05893806 181 nips-2007-Sparse Overcomplete Latent Variable Decomposition of Counts Data
14 0.056838572 175 nips-2007-Semi-Supervised Multitask Learning
15 0.055448167 145 nips-2007-On Sparsity and Overcompleteness in Image Models
16 0.054876473 83 nips-2007-Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks
17 0.05322174 113 nips-2007-Learning Visual Attributes
18 0.05265744 135 nips-2007-Multi-task Gaussian Process Prediction
19 0.052578989 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection
20 0.05226621 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images
topicId topicWeight
[(0, -0.172), (1, 0.082), (2, -0.047), (3, -0.091), (4, 0.029), (5, 0.089), (6, -0.051), (7, 0.021), (8, 0.043), (9, -0.029), (10, -0.037), (11, -0.027), (12, 0.075), (13, 0.028), (14, -0.058), (15, 0.069), (16, -0.03), (17, -0.036), (18, 0.023), (19, 0.022), (20, -0.04), (21, -0.009), (22, 0.056), (23, -0.022), (24, -0.064), (25, 0.08), (26, -0.003), (27, -0.159), (28, 0.07), (29, 0.038), (30, 0.044), (31, -0.019), (32, 0.009), (33, -0.084), (34, 0.068), (35, 0.017), (36, -0.053), (37, -0.047), (38, -0.091), (39, -0.141), (40, -0.157), (41, -0.105), (42, 0.001), (43, 0.097), (44, -0.14), (45, -0.077), (46, 0.042), (47, -0.092), (48, -0.005), (49, 0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.9414075 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data
Author: Sabri Boutemedjet, Djemel Ziou, Nizar Bouguila
Abstract: Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. We propose a generative model in which the visual features and users are clustered into separate classes. We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. The goal is to ensure an accurate prediction of ratings for multidimensional, non-Gaussian, and continuous image descriptors. Experiments on collected data have demonstrated the merits of our approach.
2 0.68521434 158 nips-2007-Probabilistic Matrix Factorization
Author: Andriy Mnih, Ruslan Salakhutdinov
Abstract: Many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings. In this paper we present the Probabilistic Matrix Factorization (PMF) model which scales linearly with the number of observations and, more importantly, performs well on the large, sparse, and very imbalanced Netflix dataset. We further extend the PMF model to include an adaptive prior on the model parameters and show how the model capacity can be controlled automatically. Finally, we introduce a constrained version of the PMF model that is based on the assumption that users who have rated similar sets of movies are likely to have similar preferences. The resulting model is able to generalize considerably better for users with very few ratings. When the predictions of multiple PMF models are linearly combined with the predictions of Restricted Boltzmann Machines models, we achieve an error rate of 0.8861, that is nearly 7% better than the score of Netflix’s own system.
3 0.65756267 19 nips-2007-Active Preference Learning with Discrete Choice Data
Author: Brochu Eric, Nando D. Freitas, Abhijeet Ghosh
Abstract: We propose an active learning algorithm that learns a continuous valuation model from discrete preferences. The algorithm automatically decides what items are best presented to an individual in order to find the item that they value highly in as few trials as possible, and exploits quirks of human psychology to minimize time and cognitive burden. To do this, our algorithm maximizes the expected improvement at each query without accurately modelling the entire valuation surface, which would be needlessly expensive. The problem is particularly difficult because the space of choices is infinite. We demonstrate the effectiveness of the new algorithm compared to related active learning methods. We also embed the algorithm within a decision making tool for assisting digital artists in rendering materials. The tool finds the best parameters while minimizing the number of queries. 1
4 0.57411867 196 nips-2007-The Infinite Gamma-Poisson Feature Model
Author: Michalis K. Titsias
Abstract: We present a probability distribution over non-negative integer valued matrices with possibly an infinite number of columns. We also derive a stochastic process that reproduces this distribution over equivalence classes. This model can play the role of the prior in nonparametric Bayesian learning scenarios where multiple latent features are associated with the observed data and each feature can have multiple appearances or occurrences within each data point. Such data arise naturally when learning visual object recognition systems from unlabelled images. Together with the nonparametric prior we consider a likelihood model that explains the visual appearance and location of local image patches. Inference with this model is carried out using a Markov chain Monte Carlo algorithm. 1
5 0.5115459 143 nips-2007-Object Recognition by Scene Alignment
Author: Bryan Russell, Antonio Torralba, Ce Liu, Rob Fergus, William T. Freeman
Abstract: Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: by matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study algorithm component contributions using held-out test sets from the LabelMe database. 1
6 0.46993241 41 nips-2007-COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking
7 0.46817493 156 nips-2007-Predictive Matrix-Variate t Models
8 0.45853907 105 nips-2007-Infinite State Bayes-Nets for Structured Domains
9 0.4442153 193 nips-2007-The Distribution Family of Similarity Distances
10 0.43675047 113 nips-2007-Learning Visual Attributes
11 0.39169034 44 nips-2007-Catching Up Faster in Bayesian Model Selection and Model Averaging
12 0.38581735 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
13 0.37578756 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images
14 0.36405769 131 nips-2007-Modeling homophily and stochastic equivalence in symmetric relational data
15 0.36184099 139 nips-2007-Nearest-Neighbor-Based Active Learning for Rare Category Detection
16 0.35877723 212 nips-2007-Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes
17 0.35773301 86 nips-2007-Exponential Family Predictive Representations of State
18 0.35237044 115 nips-2007-Learning the 2-D Topology of Images
19 0.34536558 24 nips-2007-An Analysis of Inference with the Universum
20 0.34473306 96 nips-2007-Heterogeneous Component Analysis
topicId topicWeight
[(5, 0.039), (13, 0.035), (16, 0.021), (18, 0.015), (19, 0.011), (21, 0.086), (31, 0.015), (34, 0.017), (35, 0.018), (37, 0.315), (47, 0.078), (49, 0.011), (83, 0.1), (85, 0.024), (87, 0.059), (88, 0.013), (90, 0.076)]
simIndex simValue paperId paperTitle
same-paper 1 0.74703699 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data
Author: Sabri Boutemedjet, Djemel Ziou, Nizar Bouguila
Abstract: Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. We propose a generative model in which the visual features and users are clustered into separate classes. We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. The goal is to ensure an accurate prediction of ratings for multidimensional, non-Gaussian, and continuous image descriptors. Experiments on collected data have demonstrated the merits of our approach.
2 0.49023041 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation
Author: David Newman, Padhraic Smyth, Max Welling, Arthur U. Asuncion
Abstract: We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or "topic" model – using distributed computation, where each of P processors only sees 1/P of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates—it is simple to implement and can be viewed as an approximation to a single processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across processors—it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are indistinguishable from those obtained with single-processor learning. Our extensive experimental results include large-scale distributed computation on 1000 virtual processors; and speedup experiments of learning topics in a 100-million word corpus using 16 processors.
3 0.48912913 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
Author: Michael Ross, Andrew Cohen
Abstract: This paper describes a new model for human visual classification that enables the recovery of image features that explain human subjects’ performance on different visual classification tasks. Unlike previous methods, this algorithm does not model their performance with a single linear classifier operating on raw image pixels. Instead, it represents classification as the combination of multiple feature detectors. This approach extracts more information about human visual classification than previous methods and provides a foundation for further exploration. 1
4 0.48470327 189 nips-2007-Supervised Topic Models
Author: Jon D. Mcauliffe, David M. Blei
Abstract: We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and web page popularity predicted from text descriptions. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression. 1
5 0.48422027 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images
Author: Bill Triggs, Jakob J. Verbeek
Abstract: Conditional Random Fields (CRFs) are an effective tool for a variety of different data segmentation and labeling tasks including visual scene interpretation, which seeks to partition images into their constituent semantic-level regions and assign appropriate class labels to each region. For accurate labeling it is important to capture the global context of the image as well as local information. We introduce a CRF based scene labeling model that incorporates both local features and features aggregated over the whole image or large sections of it. Secondly, traditional CRF learning requires fully labeled datasets which can be costly and troublesome to produce. We introduce a method for learning CRFs from datasets with many unlabeled nodes by marginalizing out the unknown labels so that the log-likelihood of the known ones can be maximized by gradient ascent. Loopy Belief Propagation is used to approximate the marginals needed for the gradient and log-likelihood calculations and the Bethe free-energy approximation to the log-likelihood is monitored to control the step size. Our experimental results show that effective models can be learned from fragmentary labelings and that incorporating top-down aggregate features significantly improves the segmentations. The resulting segmentations are compared to the state-of-the-art on three different image datasets. 1
6 0.48050183 59 nips-2007-Continuous Time Particle Filtering for fMRI
7 0.48035944 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
8 0.479224 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
9 0.47889516 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
10 0.4776476 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
11 0.4761889 154 nips-2007-Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression
12 0.47580847 156 nips-2007-Predictive Matrix-Variate t Models
13 0.47435158 105 nips-2007-Infinite State Bayes-Nets for Structured Domains
14 0.47174948 56 nips-2007-Configuration Estimates Improve Pedestrian Finding
15 0.47128698 47 nips-2007-Collapsed Variational Inference for HDP
16 0.47122049 86 nips-2007-Exponential Family Predictive Representations of State
17 0.4706811 69 nips-2007-Discriminative Batch Mode Active Learning
18 0.47064719 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging
19 0.47059083 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
20 0.4705292 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)