nips nips2007 nips2007-183 knowledge-graph by maker-knowledge-mining

183 nips-2007-Spatial Latent Dirichlet Allocation


Source: pdf

Author: Xiaogang Wang, Eric Grimson

Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” and “documents” when using a language model to solve vision problems. In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. The spatial information is not encoded in the values of visual words but in the design of documents. Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. We use SLDA to discover objects from a collection of images, and show it achieves better performance than LDA. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. [sent-5, score-0.27]

2 However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. [sent-6, score-0.589]

3 In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. [sent-8, score-0.706]

4 The spatial information is not encoded in the values of visual words but in the design of documents. [sent-9, score-0.423]

5 Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. [sent-10, score-0.267]

6 There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. [sent-11, score-0.459]

7 For example, LDA was used to discover objects from a collection of images [2, 3, 4] and to classify images into different scene categories [5]. [sent-15, score-0.393]

8 First, LDA assumes that a document is a bag of words, such that spatial and temporal structures among visual words, which are meaningless in a language model but important in many computer vision problems, are ignored. [sent-20, score-0.678]

9 For example, in order to cluster image patches, which are treated as words, into classes of objects, researchers treated images as documents [2]. [sent-23, score-0.496]

10 This assumes that if two types of patches are from the same object class, they often appear in the same images. [sent-24, score-0.336]

11 As an example shown in Figure 1, even though sky is far from vehicles, if they often exist in the same images in some data set, they would be clustered into the same topic by LDA. [sent-26, score-0.42]

12 Furthermore, since in this image most of the patches are sky and building, a patch on a vehicle is likely to be labeled as building or sky as well. [sent-27, score-0.762]

13 These problems could be solved if the document of a patch, such as the yellow patch in Figure 1, only includes other (Figure 1 caption: There will be some problems (see text) if the whole image is treated as one document when using LDA to discover classes of objects.) [sent-28, score-1.056]

14 patches falling within its neighborhood, marked by the red dashed window in Figure 1, instead of the whole image. [sent-29, score-0.341]

15 So a better assumption is that if two types of image patches are from the same object class, they are not only often in the same images but also close in space. [sent-30, score-0.639]

16 We expect to utilize spatial information in a flexible way when designing documents for solving vision problems. [sent-31, score-0.31]

17 In this paper, we propose a Spatial Latent Dirichlet Allocation (SLDA) model which encodes the spatial structure among visual words. [sent-32, score-0.282]

18 an eye patch and a nose patch), which often occur in the same images and are close in space, into one topic (e. [sent-35, score-0.549]

19 However the spatial or temporal information is not encoded in the values of visual words, but in the design of documents. [sent-40, score-0.323]

20 LDA and its extensions, such as the author-topic model [8], the dynamic topic model [9], and the correlated topic model [10], all assume that the partition of words into documents is known a priori. [sent-41, score-0.691]

21 When visual words are close in space or time, they have a high probability to be grouped into the same document. [sent-44, score-0.298]

22 Some approaches such as [11, 3, 12, 4] could also capture some spatial structures among visual words. [sent-45, score-0.282]

23 [11] assumed that the spatial distribution of an object class could be modeled as Gaussian and the number of objects in the image was known. [sent-46, score-0.518]

24 [12] modeled the spatial dependency among image patches as Markov random fields. [sent-48, score-0.58]

25 And an image usually contains several objects of different classes. [sent-52, score-0.291]

26 The goal is to segment objects from images, and at the same time, to label these segments as different object classes in an unsupervised way. [sent-53, score-0.306]

27 A local descriptor is computed for each image patch and quantized into a visual word. [sent-56, score-0.582]

28 Using topic models, the visual words are clustered into topics which correspond to object classes. [sent-57, score-0.723]

29 Thus an image patch can be labeled as one of the object classes. [sent-58, score-0.504]

30 Instead of only computing visual words at interest points as in [2], we divide an image into local patches on a grid and densely sample a local descriptor for each patch. [sent-63, score-0.853]

31 Each local patch is quantized into a visual word according to the codebook. [sent-65, score-0.551]
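As a rough illustration of this step, the sketch below (Python/NumPy) divides an image into patches on a grid and maps each patch to its nearest codebook center, i.e. its visual word. The descriptor function describe, the patch size, and the stride are placeholders introduced here for illustration; they are not the exact choices made in the paper.

import numpy as np

def quantize_patches(image, codebook, describe, patch_size=16, stride=8):
    """Divide an image into local patches on a grid and quantize each patch
    into a visual word: the index of its nearest codebook center.
    `describe` maps a patch to a D-dim descriptor; `codebook` is a (V, D)
    array of cluster centers learned beforehand (e.g. by k-means)."""
    words, locations = [], []
    H, W = image.shape[:2]
    for y in range(0, H - patch_size + 1, stride):
        for x in range(0, W - patch_size + 1, stride):
            desc = describe(image[y:y + patch_size, x:x + patch_size])
            dists = ((codebook - desc) ** 2).sum(axis=1)  # squared distance to every center
            words.append(int(np.argmin(dists)))           # visual word = nearest center
            locations.append((x + patch_size / 2.0, y + patch_size / 2.0))
    return np.array(words), np.array(locations)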

32 In the next step, these visual words (image patches) will be further clustered into classes of objects. [sent-66, score-0.343]

33 Figure 2: Given a collection of images as shown in the first row (which are selected from the MSRC image dataset [13]), the goal is to segment images into objects and cluster these objects into different classes. [sent-68, score-0.676]

34 Under the same labeling approach, image patches marked in the same color are in one object cluster, but the meaning of colors changes across different labeling methods. [sent-71, score-0.64]

35 Section 3 (LDA): When LDA is used to solve our problem, we treat local patches of images as words and the whole image as a document. [sent-72, score-0.723]

36 wji is the observed value of word i in document j. [sent-76, score-0.534]

37 All the words in the corpus will be clustered into K topics (classes of objects). [sent-77, score-0.307]

38 Each topic k is modeled as a multinomial distribution over the codebook. [sent-78, score-0.278]

39 For a topic k, a multinomial parameter φk is sampled from Dirichlet prior φk ∼ Dir(β). [sent-82, score-0.332]

40 For a document j, a multinomial parameter πj over the K topics is sampled from Dirichlet prior πj ∼ Dir(α). [sent-84, score-0.46]

41 For a word i in document j, a topic label zji is sampled from the discrete distribution zji ∼ Discrete(πj). [sent-86, score-1.04]

42 The value wji of word i in document j is sampled from the discrete distribution of topic zji, wji ∼ Discrete(φzji). [sent-88, score-1.211]

43 zji can be sampled through a Gibbs sampling procedure which integrates out πj and φk [14]:

p(z_{ji} = k \mid \mathbf{z}_{-ji}, \mathbf{w}, \alpha, \beta) \;\propto\; \frac{n^{(k)}_{-ji,\,w_{ji}} + \beta_{w_{ji}}}{\sum_{w=1}^{W} n^{(k)}_{-ji,\,w} + \beta_{w}} \cdot \frac{n^{(j)}_{-ji,\,k} + \alpha_{k}}{\sum_{k'=1}^{K} n^{(j)}_{-ji,\,k'} + \alpha_{k'}} \qquad (1)

[sent-89, score-0.779]

44 In Eq 1, n^{(k)}_{-ji,w} is the number of words in the corpus with value w assigned to topic k, excluding word i in document j, and n^{(j)}_{-ji,k} is the number of words in document j assigned to topic k, excluding word i in document j. [sent-90, score-2.221]

45 Eq 1 is the product of two ratios: the probability of word wji under topic k and the probability of topic k in document j. [sent-91, score-0.984]
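A minimal sketch of this collapsed Gibbs update (Eq 1) is given below, assuming a flat token representation (parallel arrays of word values, document indices, and current topic labels) and vector hyperparameters α and β; the variable names and bookkeeping are ours, not the paper's.

import numpy as np

def lda_gibbs_sweep(words, docs, z, n_kw, n_dk, n_k, alpha, beta, rng):
    """One sweep of collapsed Gibbs sampling for LDA (Eq 1).
    words[i], docs[i], z[i]: word value, document index, topic of token i.
    n_kw[k, w]: # tokens with word value w assigned to topic k.
    n_dk[d, k]: # tokens in document d assigned to topic k.
    n_k[k]:     # tokens assigned to topic k (row sums of n_kw)."""
    K = n_kw.shape[0]
    beta_sum = beta.sum()
    for i in range(len(words)):
        w, d, k_old = words[i], docs[i], z[i]
        # exclude token i from all counts (the "-ji" counts in Eq 1)
        n_kw[k_old, w] -= 1; n_dk[d, k_old] -= 1; n_k[k_old] -= 1
        # word-topic ratio times document-topic ratio; the document-side
        # denominator does not depend on k, so it cancels after normalization
        p = (n_kw[:, w] + beta[w]) / (n_k + beta_sum) * (n_dk[d] + alpha)
        p /= p.sum()
        k_new = rng.choice(K, p=p)
        z[i] = k_new
        n_kw[k_new, w] += 1; n_dk[d, k_new] += 1; n_k[k_new] += 1
    return z

The counts n_kw, n_dk, n_k must be built from a random initial assignment z before the first sweep, e.g. with rng = np.random.default_rng().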

46 So LDA clusters the visual words often co-occurring in the same images into one object class. [sent-92, score-0.489]

47 Although LDA assumes that one image contains multiple topics, from experimental results we observe that the patches in the same image are likely to have the same labels. [sent-97, score-0.598]

48 Since the whole image is treated as one document, if one object class, e. [sent-98, score-0.349]

49 car in Figure 2, is dominant in the image, the second ratio in Eq 1 will lead to a large bias towards the car class, and thus the patches of street are also likely to be labeled as car. [sent-100, score-0.269]

50 This problem could be solved if a local patch only considers its neighboring patches as being in the same document. [sent-101, score-0.471]

51 4 SLDA We assume that if visual words are from the same class of objects, they not only often co-occur in the same images but also are close in space. [sent-102, score-0.389]

52 So we try to group image patches which are close in space into the same documents. [sent-103, score-0.452]

53 Each region is treated as a document instead of the whole image. [sent-105, score-0.374]

54 In Figure 4 (a), patch A on the cow is likely to be labeled as grass, since most other patches in its document are grass. [sent-107, score-0.732]

55 Any two patches whose distance is smaller than the region size “could” belong to the same document if the regions are placed densely enough. [sent-110, score-0.675]

56 We use the word “could” because each local patch is covered by several regions, so we have to decide to which document it belongs. [sent-111, score-0.665]

57 If two patches are closer in space, they have a higher probability to be assigned to the same document since there are more regions covering both of them. [sent-113, score-0.627]

58 As shown in Figure 4 (c), each document can be represented by a point (marked by magenta circle) in the image, assuming its region covers the whole image. [sent-115, score-0.365]

59 If an image patch is close to a document, it has a high probability to be assigned to that document. [sent-116, score-0.493]
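Below is a sketch of one possible document design: document points placed on a regular grid over each image. The paper only requires that documents be placed densely; the grid layout and the stride are our assumption.

import numpy as np

def place_documents(image_shapes, stride):
    """Place document points densely on a grid over every image.
    Returns the document hyperparameters c^d = (g^d, x^d, y^d):
    doc_img[j] is the image index g^d_j, doc_loc[j] is (x^d_j, y^d_j)."""
    doc_img, doc_loc = [], []
    for g, (H, W) in enumerate(image_shapes):
        for y in range(stride // 2, H, stride):
            for x in range(stride // 2, W, stride):
                doc_img.append(g)
                doc_loc.append((float(x), float(y)))
    return np.array(doc_img), np.array(doc_loc)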

60 A hidden variable di indicates which document word i is assigned to. [sent-119, score-0.648]

61 For each document j there is a hyperparameter c^d_j = (g^d_j, x^d_j, y^d_j) known a priori. [sent-120, score-0.41]

62 g^d_j is the index of the image where document j is placed and (x^d_j, y^d_j) is the location of the document. [sent-121, score-0.516]

63 For a word i, in addition to the observed word value wi , its location (xi , yi ) and image index gi are also observed and stored in variable ci = (gi , xi , yi ). [sent-122, score-0.684]

64 For a topic k, a multinomial parameter φk is sampled from Dirichlet prior φk ∼ Dir(β). [sent-124, score-0.332]

65 Figure 4: There are several ways to add spatial information among image patches when designing documents. [sent-125, score-0.607]

66 Image patches inside the region are assigned to the corresponding document. [sent-128, score-0.358]

67 (c): Each document is associated with a point (marked in magenta color). [sent-131, score-0.29]

68 If an image patch is close to a document, it has a high probability to be assigned to that document. [sent-133, score-0.493]

69 For a document j, a multinomial parameter πj over the K topics is sampled from Dirichlet prior πj ∼ Dir(α). [sent-135, score-0.46]

70 For a word (image patch) i, a random variable di is sampled from the prior p(di | η), indicating to which document word i is assigned. [sent-137, score-0.792]

71 The image index and location of word i are sampled from the distribution p(ci | c^d_{di}). [sent-140, score-0.442]

72 If word i and document di are in the same image (gi = g^d_{di}), then

p(c_i \mid c^d_{d_i}, \sigma) \;\propto\; \exp\!\left(-\,\frac{(x^d_{d_i} - x_i)^2 + (y^d_{d_i} - y_i)^2}{2\sigma^2}\right),

and p(ci | c^d_{di}) = 0 if the word and the document are not in the same image. [sent-142, score-0.624]

73 The topic label zi of word i is sampled from the discrete distribution of document di, zi ∼ Discrete(π_{di}). [sent-144, score-1.24]

74 The value wi of word i is sampled from the discrete distribution of topic zi, wi ∼ Discrete(φ_{zi}). [sent-146, score-0.756]

75 Section 4.1 (Gibbs Sampling): zi and di can be sampled through a Gibbs sampling procedure integrating out πj and φk; the conditional distribution of zi given di is the same as in LDA. [sent-148, score-0.52]

76 In SLDA the conditional of zi has the same form as Eq 1:

p(z_i = k \mid d_i = j, \mathbf{z}_{-i}, \mathbf{w}, \alpha, \beta) \;\propto\; \frac{n^{(k)}_{-i,\,w_i} + \beta_{w_i}}{\sum_{w=1}^{W} n^{(k)}_{-i,\,w} + \beta_{w}} \cdot \frac{n^{(j)}_{-i,\,k} + \alpha_{k}}{\sum_{k'=1}^{K} n^{(j)}_{-i,\,k'} + \alpha_{k'}} \qquad (2)

where n^{(k)}_{-i,w} is the number of words in the corpus with value w assigned to topic k excluding word i, and n^{(j)}_{-i,k} is the number of words in document j assigned to topic k excluding word i. [sent-150, score-1.613]

77 The document assignment di is sampled from

p(d_i = j \mid z_i = k, \mathbf{z}_{-i}, \mathbf{d}_{-i}, c_i, \mathbf{c}^d, \alpha, \eta, \sigma) \;\propto\; p(d_i = j \mid \eta)\; p(c_i \mid c^d_j, \sigma)\; p(z_i = k \mid \mathbf{z}_{-i}, d_i = j, \mathbf{d}_{-i}, \alpha) \qquad (3)

where p(z_i = k \mid \mathbf{z}_{-i}, d_i = j, \mathbf{d}_{-i}, \alpha) is obtained by integrating out πj. [sent-153, score-0.287]

78 Integrating out πj gives

p(z_i = k \mid \mathbf{z}_{-i}, d_i = j, \mathbf{d}_{-i}, \alpha) \;=\; \int p(z_i = k \mid \pi_j)\, p(\pi_j \mid \mathbf{z}_{-i}, \mathbf{d}_{-i}, \alpha)\, d\pi_j \;=\; \frac{n^{(j)}_{-i,\,k} + \alpha_{k}}{\sum_{k'=1}^{K} n^{(j)}_{-i,\,k'} + \alpha_{k'}}.

We choose p(di = j | η) as a uniform prior and p(ci | c^d_j, σ) as a Gaussian kernel. [sent-154, score-0.287]

79 From Eq 2 and 3, we observed that a word tends to have the same topic label as other words in its document and words closer in space are more likely to be assigned to the same documents. [sent-157, score-1.057]
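The sketch below combines Eqs 2 and 3 into a single update that jointly resamples (di, zi) for one word: the Gaussian spatial kernel restricts candidate documents to the word's image and favors nearby document points, while the collapsed document-topic and word-topic ratios play the same role as in LDA. The flat array layout, and sampling di and zi jointly rather than alternately, are our assumptions for this sketch; p(di | η) is uniform and is therefore omitted.

import numpy as np

def slda_sample_word(i, words, locs, imgs, d, z,
                     n_kw, n_k, n_dk, n_d, doc_img, doc_loc,
                     alpha, beta, sigma, rng):
    """Resample (d_i, z_i) for word i in SLDA, combining the Gaussian spatial
    kernel p(c_i | c^d_j, sigma) with the collapsed document-topic and
    word-topic terms (Eqs 2-3)."""
    K = n_kw.shape[0]
    M = len(doc_img)
    w, j_old, k_old = words[i], d[i], z[i]
    # exclude word i from the counts
    n_kw[k_old, w] -= 1; n_k[k_old] -= 1
    n_dk[j_old, k_old] -= 1; n_d[j_old] -= 1
    # spatial kernel: zero unless the word and the document share an image
    same_img = (doc_img == imgs[i]).astype(float)
    sq_dist = ((doc_loc - locs[i]) ** 2).sum(axis=1)
    p_c = same_img * np.exp(-sq_dist / (2.0 * sigma ** 2))    # shape (M,)
    # collapsed document-topic term (Eq 3) and word-topic term (Eq 2)
    p_dk = (n_dk + alpha) / (n_d + alpha.sum())[:, None]      # shape (M, K)
    p_w = (n_kw[:, w] + beta[w]) / (n_k + beta.sum())         # shape (K,)
    prob = p_c[:, None] * p_dk * p_w[None, :]
    prob = prob.ravel() / prob.sum()
    idx = rng.choice(M * K, p=prob)
    j_new, k_new = idx // K, idx % K
    d[i], z[i] = j_new, k_new
    n_kw[k_new, w] += 1; n_k[k_new] += 1
    n_dk[j_new, k_new] += 1; n_d[j_new] += 1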

80 So essentially under SLDA a word tends to be labeled as the same topic as other words close to it. [sent-158, score-0.602]

81 This satisfies our assumption that visual words from the same object class are closer in space. [sent-159, score-0.361]

82 Since we densely place many documents over one image, during Gibbs sampling some documents are only assigned a few words and the distributions cannot be well estimated. [sent-160, score-0.477]

83 To solve this problem we replicate each image patch to get many particles. [sent-161, score-0.379]

84 These particles have the same word value and location but can be assigned to different documents and have different labels. [sent-162, score-0.349]

85 Thus each document will have enough samples of words to estimate the distributions. [sent-163, score-0.407]
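A sketch of this replication step, reusing the flat arrays from the sketches above; the number of particles per patch is a free parameter chosen here only for illustration.

import numpy as np

def make_particles(words, locs, imgs, n_particles=5):
    """Replicate each image patch into n_particles particles that share the
    same word value and location but receive independent (d, z) assignments
    during Gibbs sampling."""
    words_p = np.repeat(words, n_particles)
    locs_p = np.repeat(locs, n_particles, axis=0)
    imgs_p = np.repeat(imgs, n_particles)
    return words_p, locs_p, imgs_p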

86 Section 4.2 (Discussion): SLDA is a flexible model, intended to encode spatial structure among image patches in the design of documents. [sent-165, score-0.607]

87 If there is only one document placed over one image, SLDA simply reduces to LDA. [sent-166, score-0.297]

88 However the object class model φk , simply a multinomial distribution over the codebook, has no spatial structure. [sent-171, score-0.28]

89 By simply adding a time stamp to ci and c^d_j, it is easy for SLDA to encode temporal structure among visual words. [sent-173, score-0.374]

90 Our codebook size is 200 and the topic number is 15. [sent-176, score-0.263]

91 The results of LDA are noisy and within one image most of the patches are labeled as one topic. [sent-179, score-0.448]

92 We treat all the frames in the sequence as an image collection and ignore their temporal order. [sent-186, score-0.282]

93 SLDA clusters image patches into tigers, rock, water, and grass. [sent-192, score-0.456]

94 If we choose the topic of tiger, as shown in the last row of Figure 5, all the tigers in the video can be segmented out. [sent-193, score-0.357]

95 6 Conclusion We propose a novel Spatial Latent Dirichlet Allocation model which clusters co-occurring and spatially neighboring visual words into the same topic. [sent-194, score-0.302]

96 Instead of knowing word-document assignment a priori, SLDA has a generative procedure partitioning visual words which are close in space into the same documents. [sent-195, score-0.354]

97 In the second column, we label the patches in the two frames as different topics using LDA. [sent-199, score-0.391]

98 In the fourth column, we segment tigers out by choosing the topic marked in red. [sent-202, score-0.376]

99 Using multiple segmentations to discover objects and their extent in image collections. [sent-250, score-0.337]

100 Spatially coherent latent topic model for concurrent object segmentation and classification. [sent-256, score-0.418]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('slda', 0.515), ('lda', 0.32), ('document', 0.263), ('patches', 0.24), ('topic', 0.225), ('patch', 0.2), ('image', 0.179), ('word', 0.171), ('words', 0.144), ('zji', 0.133), ('di', 0.133), ('spatial', 0.131), ('visual', 0.121), ('objects', 0.112), ('cd', 0.107), ('wji', 0.1), ('zi', 0.1), ('dirichlet', 0.099), ('documents', 0.097), ('object', 0.096), ('images', 0.091), ('topics', 0.09), ('ji', 0.082), ('assigned', 0.081), ('ci', 0.072), ('overlapped', 0.066), ('msrc', 0.066), ('marked', 0.063), ('densely', 0.058), ('segmentation', 0.058), ('bicycles', 0.057), ('cows', 0.057), ('video', 0.057), ('sky', 0.057), ('vision', 0.055), ('gi', 0.054), ('sampled', 0.054), ('dir', 0.053), ('multinomial', 0.053), ('tigers', 0.05), ('clustered', 0.047), ('allocation', 0.047), ('discover', 0.046), ('temporal', 0.044), ('iccv', 0.044), ('regions', 0.043), ('cars', 0.042), ('excluding', 0.041), ('gj', 0.04), ('latent', 0.039), ('cdi', 0.038), ('xdi', 0.038), ('ydi', 0.038), ('fa', 0.038), ('codebook', 0.038), ('segment', 0.038), ('whole', 0.038), ('wi', 0.037), ('clusters', 0.037), ('cvpr', 0.037), ('region', 0.037), ('treated', 0.036), ('language', 0.034), ('placed', 0.034), ('close', 0.033), ('sivic', 0.033), ('efros', 0.033), ('discrete', 0.032), ('frames', 0.032), ('classes', 0.031), ('labeling', 0.031), ('local', 0.031), ('gibbs', 0.031), ('tiger', 0.03), ('generative', 0.03), ('activities', 0.03), ('among', 0.03), ('label', 0.029), ('eq', 0.029), ('labeled', 0.029), ('blei', 0.028), ('quantized', 0.028), ('wang', 0.028), ('collection', 0.027), ('designing', 0.027), ('design', 0.027), ('russell', 0.027), ('magenta', 0.027), ('corpus', 0.026), ('scene', 0.026), ('assignment', 0.026), ('divide', 0.026), ('faces', 0.026), ('cluster', 0.026), ('segmented', 0.025), ('atomic', 0.024), ('human', 0.024), ('discovering', 0.023), ('alarm', 0.023), ('descriptor', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 183 nips-2007-Spatial Latent Dirichlet Allocation

Author: Xiaogang Wang, Eric Grimson

Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” and “documents” when using a language model to solve vision problems. In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. The spatial information is not encoded in the values of visual words but in the design of documents. Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. We use SLDA to discover objects from a collection of images, and show it achieves better performance than LDA. 1

2 0.54716897 189 nips-2007-Supervised Topic Models

Author: Jon D. Mcauliffe, David M. Blei

Abstract: We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and web page popularity predicted from text descriptions. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression. 1

3 0.28658086 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation

Author: David Newman, Padhraic Smyth, Max Welling, Arthur U. Asuncion

Abstract: We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or “topic” model – using distributed computation, where each of P processors only sees 1/P of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates—it is simple to implement and can be viewed as an approximation to a single processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across processors—it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are indistinguishable from those obtained with single-processor learning. Our extensive experimental results include large-scale distributed computation on 1000 virtual processors; and speedup experiments of learning topics in a 100-million word corpus using 16 processors.

4 0.2027465 143 nips-2007-Object Recognition by Scene Alignment

Author: Bryan Russell, Antonio Torralba, Ce Liu, Rob Fergus, William T. Freeman

Abstract: Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: by matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study algorithm component contributions using held-out test sets from the LabelMe database. 1

5 0.19787015 105 nips-2007-Infinite State Bayes-Nets for Structured Domains

Author: Max Welling, Ian Porteous, Evgeniy Bart

Abstract: A general modeling framework is proposed that unifies nonparametric-Bayesian models, topic-models and Bayesian networks. This class of infinite state Bayes nets (ISBN) can be viewed as directed networks of ‘hierarchical Dirichlet processes’ (HDPs) where the domain of the variables can be structured (e.g. words in documents or features in images). We show that collapsed Gibbs sampling can be done efficiently in these models by leveraging the structure of the Bayes net and using the forward-filtering-backward-sampling algorithm for junction trees. Existing models, such as nested-DP, Pachinko allocation, mixed membership stochastic block models as well as a number of new models are described as ISBNs. Two experiments have been performed to illustrate these ideas. 1

6 0.18230729 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images

7 0.16038993 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging

8 0.15765879 95 nips-2007-HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation

9 0.13781855 47 nips-2007-Collapsed Variational Inference for HDP

10 0.12845188 129 nips-2007-Mining Internet-Scale Software Repositories

11 0.09997955 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI

12 0.096647382 181 nips-2007-Sparse Overcomplete Latent Variable Decomposition of Counts Data

13 0.09417887 113 nips-2007-Learning Visual Attributes

14 0.090255208 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data

15 0.0899745 197 nips-2007-The Infinite Markov Model

16 0.088195145 1 nips-2007-A Bayesian Framework for Cross-Situational Word-Learning

17 0.082969278 111 nips-2007-Learning Horizontal Connections in a Sparse Coding Model of Natural Images

18 0.078255959 84 nips-2007-Expectation Maximization and Posterior Constraints

19 0.074582696 71 nips-2007-Discriminative Keyword Selection Using Support Vector Machines

20 0.073684998 9 nips-2007-A Probabilistic Approach to Language Change


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.211), (1, 0.176), (2, -0.102), (3, -0.523), (4, 0.159), (5, 0.001), (6, 0.128), (7, -0.102), (8, 0.035), (9, 0.191), (10, -0.007), (11, -0.003), (12, -0.101), (13, 0.201), (14, -0.023), (15, 0.183), (16, 0.181), (17, -0.112), (18, 0.042), (19, 0.006), (20, -0.002), (21, -0.01), (22, -0.001), (23, 0.001), (24, -0.063), (25, -0.038), (26, 0.016), (27, -0.057), (28, 0.019), (29, -0.009), (30, -0.031), (31, 0.064), (32, -0.012), (33, 0.051), (34, -0.037), (35, 0.002), (36, -0.029), (37, -0.087), (38, 0.034), (39, 0.02), (40, -0.0), (41, 0.043), (42, -0.023), (43, -0.013), (44, -0.012), (45, 0.084), (46, -0.046), (47, 0.101), (48, -0.021), (49, 0.05)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97261113 183 nips-2007-Spatial Latent Dirichlet Allocation

Author: Xiaogang Wang, Eric Grimson

Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” and “documents” when using a language model to solve vision problems. In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. The spatial information is not encoded in the values of visual words but in the design of documents. Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. We use SLDA to discover objects from a collection of images, and show it achieves better performance than LDA. 1

2 0.83444583 189 nips-2007-Supervised Topic Models

Author: Jon D. Mcauliffe, David M. Blei

Abstract: We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and web page popularity predicted from text descriptions. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression. 1

3 0.81981856 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation

Author: David Newman, Padhraic Smyth, Max Welling, Arthur U. Asuncion

Abstract: We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or “topic” model – using distributed computation, where each of P processors only sees 1/P of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates—it is simple to implement and can be viewed as an approximation to a single processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across processors—it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are indistinguishable from those obtained with single-processor learning. Our extensive experimental results include large-scale distributed computation on 1000 virtual processors; and speedup experiments of learning topics in a 100-million word corpus using 16 processors.

4 0.56006414 47 nips-2007-Collapsed Variational Inference for HDP

Author: Yee W. Teh, Kenichi Kurihara, Max Welling

Abstract: A wide variety of Dirichlet-multinomial ‘topic’ models have found interesting applications in recent years. While Gibbs sampling remains an important method of inference in such models, variational techniques have certain advantages such as easy assessment of convergence, easy optimization without the need to maintain detailed balance, a bound on the marginal likelihood, and side-stepping of issues with topic-identifiability. The most accurate variational technique thus far, namely collapsed variational latent Dirichlet allocation, did not deal with model selection nor did it include inference for hyperparameters. We address both issues by generalizing the technique, obtaining the first variational algorithm to deal with the hierarchical Dirichlet process and to deal with hyperparameters of Dirichlet variables. Experiments show a significant improvement in accuracy. 1

5 0.55922139 105 nips-2007-Infinite State Bayes-Nets for Structured Domains

Author: Max Welling, Ian Porteous, Evgeniy Bart

Abstract: A general modeling framework is proposed that unifies nonparametric-Bayesian models, topic-models and Bayesian networks. This class of infinite state Bayes nets (ISBN) can be viewed as directed networks of ‘hierarchical Dirichlet processes’ (HDPs) where the domain of the variables can be structured (e.g. words in documents or features in images). We show that collapsed Gibbs sampling can be done efficiently in these models by leveraging the structure of the Bayes net and using the forward-filtering-backward-sampling algorithm for junction trees. Existing models, such as nested-DP, Pachinko allocation, mixed membership stochastic block models as well as a number of new models are described as ISBNs. Two experiments have been performed to illustrate these ideas. 1

6 0.50486213 143 nips-2007-Object Recognition by Scene Alignment

7 0.48949307 95 nips-2007-HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation

8 0.48102349 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images

9 0.46245354 129 nips-2007-Mining Internet-Scale Software Repositories

10 0.41441044 113 nips-2007-Learning Visual Attributes

11 0.41273594 196 nips-2007-The Infinite Gamma-Poisson Feature Model

12 0.38509592 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging

13 0.37772647 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI

14 0.33957946 1 nips-2007-A Bayesian Framework for Cross-Situational Word-Learning

15 0.32065687 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data

16 0.29287735 193 nips-2007-The Distribution Family of Similarity Distances

17 0.28232443 181 nips-2007-Sparse Overcomplete Latent Variable Decomposition of Counts Data

18 0.2815772 9 nips-2007-A Probabilistic Approach to Language Change

19 0.27825946 71 nips-2007-Discriminative Keyword Selection Using Support Vector Machines

20 0.25354856 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.028), (13, 0.019), (21, 0.062), (26, 0.016), (31, 0.035), (34, 0.011), (35, 0.017), (47, 0.052), (49, 0.011), (83, 0.089), (85, 0.012), (87, 0.508), (90, 0.048)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92635697 183 nips-2007-Spatial Latent Dirichlet Allocation

Author: Xiaogang Wang, Eric Grimson

Abstract: In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” and “documents” when using a language model to solve vision problems. In this paper, we propose a topic model Spatial Latent Dirichlet Allocation (SLDA), which better encodes spatial structures among visual words that are essential for solving many vision problems. The spatial information is not encoded in the values of visual words but in the design of documents. Instead of knowing the partition of words into documents a priori, the word-document assignment becomes a random hidden variable in SLDA. There is a generative procedure, where knowledge of spatial structure can be flexibly added as a prior, grouping visual words which are close in space into the same document. We use SLDA to discover objects from a collection of images, and show it achieves better performance than LDA. 1

2 0.9186067 129 nips-2007-Mining Internet-Scale Software Repositories

Author: Erik Linstead, Paul Rigor, Sushil Bajracharya, Cristina Lopes, Pierre F. Baldi

Abstract: Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated crawling, parsing, and database storage of open source software. Sourcerer allows us to gather Internet-scale source code. For instance, in one experiment, we gather 4,632 java projects from SourceForge and Apache totaling over 38 million lines of code from 9,250 developers. Simple statistical analyses of the data first reveal robust power-law behavior for package, SLOC, and lexical containment distributions. We then develop and apply unsupervised author-topic, probabilistic models to automatically discover the topics embedded in the code and extract topic-word and author-topic distributions. In addition to serving as a convenient summary for program function and developer activities, these and other related distributions provide a statistical and information-theoretic basis for quantifying and analyzing developer similarity and competence, topic scattering, and document tangling, with direct applications to software engineering. Finally, by combining software textual content with structural information captured by our CodeRank approach, we are able to significantly improve software retrieval performance, increasing the AUC metric to 0.84– roughly 10-30% better than previous approaches based on text alone. Supplementary material may be found at: http://sourcerer.ics.uci.edu/nips2007/nips07.html. 1

3 0.82936013 50 nips-2007-Combined discriminative and generative articulated pose and non-rigid shape estimation

Author: Leonid Sigal, Alexandru Balan, Michael J. Black

Abstract: Estimation of three-dimensional articulated human pose and motion from images is a central problem in computer vision. Much of the previous work has been limited by the use of crude generative models of humans represented as articulated collections of simple parts such as cylinders. Automatic initialization of such models has proved difficult and most approaches assume that the size and shape of the body parts are known a priori. In this paper we propose a method for automatically recovering a detailed parametric model of non-rigid body shape and pose from monocular imagery. Specifically, we represent the body using a parameterized triangulated mesh model that is learned from a database of human range scans. We demonstrate a discriminative method to directly recover the model parameters from monocular images using a conditional mixture of kernel regressors. This predicted pose and shape are used to initialize a generative model for more detailed pose and shape estimation. The resulting approach allows fully automatic pose and shape recovery from monocular and multi-camera imagery. Experimental results show that our method is capable of robustly recovering articulated pose, shape and biometric measurements (e.g. height, weight, etc.) in both calibrated and uncalibrated camera environments. 1

4 0.78076887 59 nips-2007-Continuous Time Particle Filtering for fMRI

Author: Lawrence Murray, Amos J. Storkey

Abstract: We construct a biologically motivated stochastic differential model of the neural and hemodynamic activity underlying the observed Blood Oxygen Level Dependent (BOLD) signal in Functional Magnetic Resonance Imaging (fMRI). The model poses a difficult parameter estimation problem, both theoretically due to the nonlinearity and divergence of the differential system, and computationally due to its time and space complexity. We adapt a particle filter and smoother to the task, and discuss some of the practical approaches used to tackle the difficulties, including use of sparse matrices and parallelisation. Results demonstrate the tractability of the approach in its application to an effective connectivity study. 1

5 0.54675996 189 nips-2007-Supervised Topic Models

Author: Jon D. Mcauliffe, David M. Blei

Abstract: We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-likelihood procedure for parameter estimation, which relies on variational approximations to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and web page popularity predicted from text descriptions. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression. 1

6 0.49677375 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation

7 0.47012183 95 nips-2007-HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation

8 0.45000041 143 nips-2007-Object Recognition by Scene Alignment

9 0.44462302 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging

10 0.44171256 105 nips-2007-Infinite State Bayes-Nets for Structured Domains

11 0.42417517 47 nips-2007-Collapsed Variational Inference for HDP

12 0.41213316 1 nips-2007-A Bayesian Framework for Cross-Situational Word-Learning

13 0.41129991 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images

14 0.40710947 113 nips-2007-Learning Visual Attributes

15 0.39744076 56 nips-2007-Configuration Estimates Improve Pedestrian Finding

16 0.39701653 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data

17 0.39485571 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model

18 0.38365659 169 nips-2007-Retrieved context and the discovery of semantic structure

19 0.37613338 154 nips-2007-Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression

20 0.37512746 180 nips-2007-Sparse Feature Learning for Deep Belief Networks