nips nips2001 nips2001-90 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jorg Ontrup, Helge Ritter
Abstract: We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a “hyperbolic SOM” (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. We describe experiments, showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. [sent-5, score-0.155]
2 We propose a “hyperbolic SOM” (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. [sent-6, score-0.658]
3 The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. [sent-7, score-0.665]
4 We describe experiments, showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods. [sent-8, score-0.262]
5 So far, the overwhelming majority of SOM approaches have taken it for granted to use a flat space as their data model and, motivated by its convenience for visualization, have favored the (suitably discretized) euclidean plane as their chief “canvas” for the generated mappings. [sent-10, score-0.17]
6 However, even if our thinking is deeply entrenched with euclidean space, an obvious limiting factor is the rather restricted neighborhood that “fits” around a point on a euclidean 2D surface. [sent-11, score-0.172]
7 Hyperbolic spaces are characterized by uniform negative curvature, resulting in a geometry such that the size of a neighborhood around a point increases exponentially with its radius. [sent-13, score-0.155]
8 Consequently, we suggest using hyperbolic spaces also in conjunction with the SOM. [sent-15, score-0.52]
9 The lattice structure of the resulting hyperbolic SOMs (HSOMs) is based on a tesselation of the hyperbolic space (in 2D or 3D) and the lattice neighborhood reflects the hyperbolic distance metric that is responsible for the non-intuitive properties of hyperbolic spaces. [sent-16, score-2.464]
10 After a brief introduction to the construction of hyperbolic spaces we describe several computer experiments that indicate that the HSOM offers new interesting perspectives in the field of text-mining. [sent-17, score-0.52]
11 (see, e.g., [7]), and the relationships for the area $A$ and the circumference $C$ of a circle of radius $r$ are given by $A(r) = 4\pi\sinh^2(r/2)$ and $C(r) = 2\pi\sinh(r)$. [sent-21, score-0.133]
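To make the exponential growth concrete, here is a small numerical sketch (our own illustration, assuming the standard curvature −1 formulas above; not code from the paper):

```python
import numpy as np

def h2_circle_area(r):
    # area of a circle of radius r in H2 (constant curvature -1): 4*pi*sinh^2(r/2)
    return 4.0 * np.pi * np.sinh(r / 2.0) ** 2

def h2_circle_circumference(r):
    # circumference of the same circle: 2*pi*sinh(r)
    return 2.0 * np.pi * np.sinh(r)

for r in (1.0, 2.0, 5.0, 10.0):
    print(f"r={r:4.1f}  hyperbolic area={h2_circle_area(r):14.1f}  "
          f"euclidean area={np.pi * r**2:8.1f}  "
          f"hyperbolic circumference={h2_circle_circumference(r):14.1f}")
```

Already at r = 10 the hyperbolic area exceeds its euclidean counterpart by more than two orders of magnitude, which is the extra "room" the HSOM exploits.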
12 It is this property that was observed in [3, 4] to make hyperbolic spaces extremely useful for accommodating hierarchical structures. [sent-24, score-0.52]
13 To be able to inspect the generated maps, the (hyperbolic) lattice has to be projected into “flat space”. [sent-27, score-0.135]
14 Since there is no isometric (i.e., distance preserving) embedding of the hyperbolic plane into a “flat” euclidean space, we may use a Minkowski space [8]. [sent-31, score-0.669]
15 Under this embedding, the hyperbolic plane appears as a hyperboloid of revolution about the $z$-axis. [sent-36, score-0.565]
16 The Klein model is obtained by projecting the points of H2 onto the plane along rays passing through the origin (see Fig. 1). [sent-39, score-0.173]
17 The Poincaré model results if we add two further steps: first a perpendicular projection of the Klein model onto the (“northern”) surface of the unit sphere centered at the origin, [sent-45, score-0.170]
18 and then a stereographic projection of the “northern” hemisphere onto the unit circle about the origin in the ground plane. (Figure 1: Construction steps underlying the Klein and Poincaré models of the space H2.) [sent-47, score-0.261]
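In coordinates, both projections have a simple closed form. The sketch below uses the standard hyperboloid-model formulas and is our own illustration (the paper describes only the geometric construction); a point (x, y, z) is assumed to lie on the hyperboloid z² = 1 + x² + y²:

```python
import numpy as np

def h2_point(x, y):
    # lift a point of the plane onto the hyperboloid z^2 = 1 + x^2 + y^2
    return np.array([x, y, np.sqrt(1.0 + x * x + y * y)])

def klein_coords(p):
    # central projection along rays through the origin onto the plane z = 1
    x, y, z = p
    return np.array([x / z, y / z])

def poincare_coords(p):
    # equivalent to the sphere + stereographic construction described above
    x, y, z = p
    return np.array([x / (1.0 + z), y / (1.0 + z)])

p = h2_point(2.0, 1.0)
print(klein_coords(p), poincare_coords(p))   # both lie inside the unit disk
```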
19 The neighborhood of the H2 origin is mapped almost faithfully (up to a linear shrinkage factor of 2), while more distant regions become increasingly “squeezed”. [sent-52, score-0.217]
20 Since the Poincaré mapping is conformal, (sufficiently small) shapes painted onto H2 are not deformed; only their size shrinks with increasing distance from the origin. [sent-55, score-0.106]
21 By translating the original H2, the fish-eye fovea can be moved to any other part of H2, allowing us to selectively zoom in on interesting portions of a map painted on H2 while still keeping a coarser view of its surrounding context. [sent-56, score-0.111]
22 2 Tesselations of the Hyperbolic Plane To complete the set-up for a hyperbolic SOM we still need an equivalent of a regular grid in the hyperbolic plane. [sent-58, score-0.979]
23 For the hyperbolic plane there exists an infinite number of tesselations with congruent polygons such that each grid point is surrounded by the same number of neighbors [9, 10]. [sent-59, score-0.713]
24 Fig. 2 shows two example tesselations (one with the minimal vertex degree and one with a larger value), using the Poincaré model for their visualization. [sent-61, score-0.123]
25 While these tesselations appear non-uniform, this is only due to the fish-eye effect of the Poincaré projection. [sent-62, score-0.123]
26 In the original H2, each tesselation triangle has the same size. [sent-63, score-0.153]
27 One way to generate these tesselations algorithmically is by repeated application of a suitable set of generators of their symmetry group to a (suitably sized) starting triangle. [sent-64, score-0.123]
28 Figure 2: Regular triangle tesselations of the hyperbolic plane, projected into the unit disk using the Poincaré mapping. [sent-67, score-0.71]
29 The left tesselation shows the case where the minimal number of equilateral triangles meets at each vertex; the right figure was constructed with a larger number. [sent-68, score-0.123]
30 We organize the nodes of a lattice as described above in “rings” around an origin node. [sent-71, score-0.25]
31 The number of nodes of such a lattice grows very rapidly (asymptotically exponentially) with the chosen lattice radius (its number of rings). [sent-72, score-0.418]
32 Each lattice node carries a prototype vector from some -dimensional feature space (if we wish to make any non-standard assumptions about the metric structure of this space, we would build this into the distance metric that is used for determining the best-match node). [sent-74, score-0.377]
33 Adaptation follows the standard SOM scheme: the winner node is repeatedly determined, and all nodes in a radial lattice neighborhood around it are adjusted according to the familiar rule (4). [sent-79, score-0.421]
34 However, since we now work on a hyperbolic lattice, we have to determine both the neighborhood and the (squared) node distance according to the natural metric that is inherited by the hyperbolic lattice. [sent-80, score-1.207]
35 The simplest way to do this is to keep with each node a complex number $z$ identifying its position in the Poincaré model. [sent-81, score-0.117]
36 The node distance is then given (using the Poincaré model, see, e.g., [7]) by $d(z, z') = 2\,\mathrm{arctanh}\!\left(\frac{|z - z'|}{|1 - z\bar{z}'|}\right)$ (5), where $z$ and $z'$ are the Poincaré positions of the two nodes. [sent-82, score-0.155]
37 The neighborhood can be defined as the subset of nodes within a certain graph distance (chosen as a small multiple of the neighborhood radius) around the winner node. [sent-92, score-0.308]
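A minimal sketch of one such adaptation step is given below. It is our own illustration and assumes a Gaussian neighborhood function and a euclidean best-match criterion in feature space; the paper's exact forms of (4), (5) and (7) may differ in detail.

```python
import numpy as np

def poincare_dist(z1, z2):
    # hyperbolic distance between two points of the unit disk (Poincare model), cf. (5)
    return 2.0 * np.arctanh(abs(z1 - z2) / abs(1.0 - z1 * np.conj(z2)))

def hsom_step(prototypes, positions, x, eta=0.1, sigma=1.0):
    """One HSOM adaptation step (illustrative).

    prototypes : (N, D) array of prototype vectors, one per lattice node
    positions  : length-N array of complex Poincare-disk node coordinates
    x          : (D,) input feature vector
    """
    # winner = best-matching node in feature space
    winner = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))
    # neighborhood weights computed from the *hyperbolic* node distances
    d = np.array([poincare_dist(z, positions[winner]) for z in positions])
    h = np.exp(-d ** 2 / (2.0 * sigma ** 2))
    prototypes += eta * h[:, None] * (x - prototypes)
    return winner
```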
38 A major example of the use of the SOM for text mining is the WEBSOM project [2]. [sent-94, score-0.181]
39 1 Text Categorization In order to apply the HSOM to natural text categorization, i.e., [sent-96, score-0.155]
40 the assignment of natural language documents to a number of predefined categories, we follow the widely used vector-space model of Information Retrieval (IR). [sent-98, score-0.233]
41 For each document we construct a feature vector, where the components are determined by the frequency with which each term occurs in that document. [sent-99, score-0.13]
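As a sketch of this construction (plain term frequencies over a fixed vocabulary; the paper may additionally weight or select terms, which is not shown here):

```python
from collections import Counter
import numpy as np

def tf_vector(tokens, vocabulary):
    """Term-frequency feature vector for one document.

    tokens     : list of terms in the document (already tokenized)
    vocabulary : dict mapping term -> component index
    """
    v = np.zeros(len(vocabulary))
    for term, freq in Counter(tokens).items():
        if term in vocabulary:
            v[vocabulary[term]] = freq
    return v

vocab = {"oil": 0, "rig": 1, "earnings": 2}
print(tf_vector("the oil rig the oil".split(), vocab))   # -> [2. 1. 0.]
```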
42 The HSOM can be utilized for text categorization in the following manner. [sent-103, score-0.262]
43 During the second step (after the unsupervised training of the HSOM), the training set is mapped onto the HSOM lattice. [sent-105, score-0.164]
44 To this end, for each training example its best match node is determined, i.e., the node whose prototype vector best matches the example's feature vector (7). [sent-106, score-0.147]
45 Here, the feature vector of each document is constructed as described above. [sent-118, score-0.103]
46 After all examples have been presented to the net, each node is labelled with the union of all categories of the documents that were mapped to this node. [sent-119, score-0.563]
47 A new, unknown text is then classified into the union of categories which are associated with its winner node selected in the HSOM. [sent-120, score-0.384]
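A compact sketch of this labelling and classification scheme (illustrative only; all function and variable names are ours, not the paper's):

```python
from collections import defaultdict
import numpy as np

def label_nodes(best_match, doc_categories):
    """best_match[i] is the winner node of training document i;
    doc_categories[i] is the set of categories of that document."""
    labels = defaultdict(set)
    for i, node in enumerate(best_match):
        labels[node] |= set(doc_categories[i])
    return labels

def classify(x, prototypes, labels):
    # an unknown text inherits the union of categories of its winner node
    winner = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))
    return labels.get(winner, set())
```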
48 The classification effectiveness is commonly measured in terms of precision and recall [16], which can be estimated as $\text{precision} = TP/(TP+FP)$ and $\text{recall} = TP/(TP+FN)$, where $TP$ and $TN$ are the numbers of documents correctly classified and correctly not classified to a category, and $FP$ and $FN$ count the corresponding misclassifications. [sent-132, score-0.292]
49 For each node and each category a confidence value is determined. [sent-134, score-0.176]
50 It describes the number of training documents belonging to the class which were mapped to that node. [sent-135, score-0.477]
51 When retrieving documents from a given category, we compare for each node its associated confidence value against a threshold. [sent-136, score-0.409]
52 Documents from nodes whose confidence exceeds the threshold are then included in the retrieval set. [sent-137, score-0.096]
53 For nodes which contain a set of documents, an ordering of those documents is defined as well. [sent-138, score-0.302]
54 In this way the number of retrieved documents can be controlled, and we obtain the precision-recall diagrams shown in Fig. 3. [sent-142, score-0.258]
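The following sketch ties these pieces together: per-node confidence values, a threshold-controlled retrieval set, and the resulting precision/recall estimates (again our own illustration, not the paper's code):

```python
def precision_recall(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    precision = tp / max(len(retrieved), 1)
    recall = tp / max(len(relevant), 1)
    return precision, recall

def pr_curve(node_conf, node_docs, relevant, thresholds):
    """node_conf[n]: confidence of node n for the category,
    node_docs[n]: documents mapped to node n,
    relevant: documents that truly belong to the category."""
    points = []
    for t in thresholds:
        retrieved = [d for n, docs in node_docs.items()
                     if node_conf.get(n, 0.0) >= t
                     for d in docs]
        points.append(precision_recall(retrieved, relevant))
    return points
```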
55 In order to compare the HSOM’s performance for text categorization, we also evaluated a $k$-nearest neighbor ($k$-NN) classifier with our training set. [sent-144, score-0.185]
56 The confidence level of a $k$-NN classifier for assigning document $d$ to class $c$ is $\mathrm{conf}_{k\text{-NN}}(d, c) = \sum_{d' \in N_k(d)} \mathrm{sim}(d, d')\,[d' \in c]$ (8), where $N_k(d)$ is the set of $k$ training documents $d'$ for which $\mathrm{sim}(d, d')$ is maximum. [sent-146, score-0.336]
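A sketch of such a $k$-NN confidence score, here assuming cosine similarity as the similarity measure (the paper's choice of sim may differ):

```python
import numpy as np

def knn_confidence(x, train_vectors, train_categories, category, k=30):
    """Confidence that the document with feature vector x belongs to `category`.

    train_vectors    : (N, D) matrix of training feature vectors
    train_categories : list of category sets, one per training document
    """
    sims = train_vectors @ x / (
        np.linalg.norm(train_vectors, axis=1) * np.linalg.norm(x) + 1e-12)
    nearest = np.argsort(-sims)[:k]
    # sum the similarities of those neighbors that carry the category label
    return float(sum(sims[i] for i in nearest if category in train_categories[i]))
```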
57 We have compared a HSOM with rings and a tesselation with neighbors (summing up to 1306 nodes) to a spherical standard euclidean SOM as described in [11] with approx. [sent-151, score-0.253]
58 This is due to the fact that the nodes in H2 cover a much broader space and therefore offer more freedom to map smaller portions of the original data space with less distortion than in euclidean space. [sent-155, score-0.229]
59 As the $k$-NN results suggest, other state-of-the-art techniques like support vector machines will probably lead to better numerical categorization results than the HSOM. [sent-156, score-0.107]
60 Figure 3: Precision-recall curves for the three most frequent categories earn, acq and money-fx. [sent-208, score-0.157]
61 2 Text Mining & Semantic Navigation A major advantage of the HSOM is its remarkable capability to map high-dimensional similarity relationships to a low-dimensional space which can be more easily handled and interpreted by the human observer. [sent-210, score-0.134]
62 This feature and the particular “fish-eye” capability motivate our approach to visualize whole text collections with the HSOM. [sent-211, score-0.248]
63 It can be regarded as an interface capturing the semantic structure of a text database and provides a way to guide the user's attention. [sent-212, score-0.247]
64 In preliminary experiments we have labelled the nodes with glyphs corresponding to the categories of the documents mapped to each node. [sent-213, score-0.55]
65 Note that most of the data gets mapped to the outermost region, where the nodes of the HSOM make use of the large space offered by the hyperbolic geometry. [sent-216, score-0.672]
66 During the unsupervised training process, the documents' categories were not presented to the HSOM. [sent-217, score-0.116]
67 The two most prominent regions are the earn and acquisition (acq) regions of the map, reflecting the large proportion of these categories in the Reuters-21578 collection. [sent-219, score-0.174]
68 Note that categories which are semantically similar are located beside each other, as can be seen in the corn/wheat/grain, the interest/money-fx, or the crude/ship areas of the map. [sent-220, score-0.258]
69 In addition to the category (glyph type) and the number of training documents per node (glyph size), the number of test documents mapped to each node is shown as the height of the symbol above the ground plane. [sent-221, score-0.918]
70 In this way the HSOM can be used as a novelty detector in chronological document streams. [sent-222, score-0.103]
71 For the Reuters-21578 dataset, a particular node stands out. [sent-223, score-0.117]
72 It corresponds to the small glyph tagged with the “ship” label in Fig. 4. [sent-224, score-0.137]
73 Only a few documents from the training collection are mapped to that node, as shown by its relatively small glyph size. [sent-226, score-0.583]
74 A closer inspection reveals that the vast majority (35 of 40) of the test documents describe an incident where an Iranian oil rig was attacked in the Gulf. [sent-230, score-0.29]
75 The next example illustrates that the HSOM can provide more information about an unknown text than just its category. [sent-232, score-0.155]
76 For this experiment we have taken movie reviews from the rec. [sent-233, score-0.132]
77 Since all the reviews describe a certain movie, we retrieved their associated genres from the Internet Movie Database (http://www. [sent-237, score-0.092]
78 Figure 4: The left figure shows a central view of the Reuters data. (Glyph legend: money-fx, ship, trade, corn, wheat, grain, interest, acq, crude, earn.) [sent-240, score-0.485]
79 We used a HSOM with rings and a tesselation with neighbors. [sent-241, score-0.179]
80 Ten different glyphs were used to visualize the ten most frequent categories. [sent-242, score-0.126]
81 The glyph sizes and the $z$-values (height above the ground plane) reflect the number of training and test documents mapped to the corresponding node, respectively. [sent-244, score-0.498]
82 The training set contained 8923 randomly selected reviews (without their genre information) from films released before 2000. [sent-245, score-0.129]
83 We then presented the system with five reviews from the film “Atlantis”, a Disney cartoon released in 2001. [sent-246, score-0.098]
84 The HSOM correctly classified all of the five texts as reviews for an animation movie. [sent-247, score-0.169]
85 In Fig. 5 the projection of the five new documents onto the map with the previously acquired text collection is shown. [sent-249, score-0.537]
86 In Fig. 5 it can be seen that all of the “Atlantis” reviews were mapped to a node in the immediate vicinity of documents describing other Disney animation movies. [sent-253, score-0.57]
87 This example motivates the approach of “semantic navigation” to rapidly visualize the linkage between unknown documents and previously acquired semantic concepts. [sent-254, score-0.472]
88 In both figures, glyph size and $z$-value indicate the number of texts related to the animation genre mapped to the corresponding node. [sent-256, score-0.336]
89 Nodes exceeding a certain threshold were labelled with the title corresponding to the most frequently occurring movie mapped to that node. [sent-257, score-0.192]
90 The underlined label in the right figure indicates the position of the node to which the five new documents were mapped. [sent-258, score-0.447]
91 5 Conclusion Efficient navigation in “Semantic Space” requires addressing two challenges: (i) how to create a low-dimensional display of the semantic relationships of documents, and (ii) how to obtain these relationships by automated text categorization. [sent-259, score-0.378]
92 The HSOM is able to exploit the peculiar geometric properties of hyperbolic space to successfully compress complex semantic relationships between text documents. [sent-261, score-0.83]
93 Additionally, the use of hyperbolic lattice topology for the arrangement of the HSOM nodes offers new and attractive features for interactive “semantic navigation”. [sent-262, score-0.679]
94 Large document databases can be inspected at a glance, while the HSOM provides additional information which was captured during a previous training step, allowing, e.g., [sent-263, score-0.133]
95 to rapidly visualize relationships between new documents and previously acquired collections. [sent-265, score-0.401]
96 Future work will address more sophisticated visualization strategies based on the new approach, as well as the exploration of other text representations which might take advantage of hyperbolic space properties. [sent-266, score-0.698]
97 Laying out and visualizing large trees using a hyperbolic space. [sent-276, score-0.475]
98 Text categorization and semantic browsing with self-organizing maps on non-euclidean spaces. [sent-323, score-0.246]
99 Text categorization with support vector machines: learning with many relevant features. [sent-333, score-0.107]
100 An improved boosting algorithm and its application to automated text categorization. [sent-346, score-0.186]
wordName wordTfidf (topN-words)
[('hsom', 0.511), ('hyperbolic', 0.475), ('documents', 0.233), ('som', 0.23), ('poincar', 0.176), ('text', 0.155), ('lattice', 0.135), ('tesselation', 0.123), ('tesselations', 0.123), ('node', 0.117), ('categorization', 0.107), ('glyph', 0.106), ('document', 0.103), ('mapped', 0.097), ('semantic', 0.092), ('plane', 0.09), ('earn', 0.088), ('ship', 0.088), ('categories', 0.086), ('neighborhood', 0.074), ('acq', 0.071), ('klein', 0.07), ('nodes', 0.069), ('reviews', 0.067), ('movie', 0.065), ('visualize', 0.065), ('category', 0.059), ('rings', 0.056), ('animation', 0.056), ('bielefeld', 0.056), ('disk', 0.056), ('map', 0.054), ('radius', 0.053), ('atlantis', 0.053), ('corn', 0.053), ('grain', 0.053), ('wheat', 0.053), ('navigation', 0.051), ('euclidean', 0.049), ('relationships', 0.049), ('crude', 0.049), ('classi', 0.049), ('maps', 0.047), ('kohonen', 0.046), ('texts', 0.046), ('origin', 0.046), ('spaces', 0.045), ('helge', 0.042), ('ve', 0.039), ('distance', 0.038), ('onto', 0.037), ('visualization', 0.037), ('embedding', 0.035), ('disney', 0.035), ('glyphs', 0.035), ('hsoms', 0.035), ('isometric', 0.035), ('kaski', 0.035), ('mulan', 0.035), ('northern', 0.035), ('ontrup', 0.035), ('tarzan', 0.035), ('ground', 0.032), ('space', 0.031), ('released', 0.031), ('automated', 0.031), ('painted', 0.031), ('reuters', 0.031), ('circumference', 0.031), ('genre', 0.031), ('incident', 0.031), ('semantically', 0.031), ('tagged', 0.031), ('training', 0.03), ('labelled', 0.03), ('triangle', 0.03), ('projection', 0.03), ('regular', 0.029), ('geometry', 0.028), ('acquired', 0.028), ('money', 0.028), ('peculiar', 0.028), ('motivates', 0.028), ('neuroinformatics', 0.028), ('metric', 0.028), ('frequency', 0.027), ('retrieval', 0.027), ('ten', 0.026), ('mining', 0.026), ('inspection', 0.026), ('winner', 0.026), ('portions', 0.026), ('riemannian', 0.026), ('arcs', 0.026), ('fx', 0.026), ('unit', 0.026), ('rapidly', 0.026), ('neighbors', 0.025), ('retrieved', 0.025), ('suitably', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 90 nips-2001-Hyperbolic Self-Organizing Maps for Semantic Navigation
Author: Jorg Ontrup, Helge Ritter
Abstract: We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a “hyperbolic SOM” (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. We describe experiments, showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods.
2 0.11700086 107 nips-2001-Latent Dirichlet Allocation
Author: David M. Blei, Andrew Y. Ng, Michael I. Jordan
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model , also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification. 1
3 0.084119968 169 nips-2001-Small-World Phenomena and the Dynamics of Information
Author: Jon M. Kleinberg
Abstract: unkown-abstract
4 0.059473373 24 nips-2001-Active Information Retrieval
Author: Tommi Jaakkola, Hava T. Siegelmann
Abstract: In classical large information retrieval systems, the system responds to a user initiated query with a list of results ranked by relevance. The users may further refine their query as needed. This process may result in a lengthy correspondence without conclusion. We propose an alternative active learning approach, where the system responds to the initial user's query by successively probing the user for distinctions at multiple levels of abstraction. The system's initiated queries are optimized for speedy recovery and the user is permitted to respond with multiple selections or may reject the query. The information is in each case unambiguously incorporated by the system and the subsequent queries are adjusted to minimize the need for further exchange. The system's initiated queries are subject to resource constraints pertaining to the amount of information that can be presented to the user per iteration. 1
5 0.05851499 144 nips-2001-Partially labeled classification with Markov random walks
Author: Martin Szummer, Tommi Jaakkola
Abstract: To classify a large number of unlabeled examples we combine a limited number of labeled examples with a Markov random walk representation over the unlabeled examples. The random walk representation exploits any low dimensional structure in the data in a robust, probabilistic manner. We develop and compare several estimation criteria/algorithms suited to this representation. This includes in particular multi-way classification with an average margin criterion which permits a closed form solution. The time scale of the random walk regularizes the representation and can be set through a margin-based criterion favoring unambiguous classification. We also extend this basic regularization by adapting time scales for individual examples. We demonstrate the approach on synthetic examples and on text classification problems.
6 0.057344757 47 nips-2001-Causal Categorization with Bayes Nets
7 0.055841818 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning
8 0.050891355 33 nips-2001-An Efficient Clustering Algorithm Using Stochastic Association Model and Its Implementation Using Nanostructures
9 0.050230388 62 nips-2001-Duality, Geometry, and Support Vector Regression
10 0.050093397 60 nips-2001-Discriminative Direction for Kernel Classifiers
11 0.045778729 46 nips-2001-Categorization by Learning and Combining Object Parts
12 0.044105675 22 nips-2001-A kernel method for multi-labelled classification
13 0.041544031 171 nips-2001-Spectral Relaxation for K-means Clustering
14 0.041069508 28 nips-2001-Adaptive Nearest Neighbor Classification Using Support Vector Machines
15 0.040883545 52 nips-2001-Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks
16 0.040061474 77 nips-2001-Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
17 0.039592303 101 nips-2001-K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms
18 0.039496247 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
19 0.038605992 194 nips-2001-Using Vocabulary Knowledge in Bayesian Multinomial Estimation
20 0.036477458 164 nips-2001-Sampling Techniques for Kernel Methods
topicId topicWeight
[(0, -0.124), (1, 0.02), (2, -0.028), (3, -0.004), (4, -0.017), (5, -0.06), (6, -0.103), (7, -0.013), (8, -0.053), (9, -0.024), (10, 0.087), (11, -0.083), (12, -0.051), (13, -0.004), (14, 0.008), (15, 0.012), (16, 0.003), (17, 0.023), (18, 0.011), (19, 0.024), (20, -0.034), (21, 0.058), (22, 0.064), (23, 0.019), (24, -0.084), (25, 0.106), (26, 0.029), (27, 0.066), (28, 0.125), (29, 0.106), (30, 0.004), (31, -0.038), (32, -0.027), (33, 0.026), (34, -0.033), (35, -0.113), (36, -0.139), (37, -0.058), (38, 0.229), (39, -0.186), (40, -0.031), (41, -0.134), (42, -0.088), (43, 0.106), (44, 0.044), (45, -0.061), (46, -0.093), (47, 0.115), (48, -0.032), (49, 0.093)]
simIndex simValue paperId paperTitle
same-paper 1 0.95659822 90 nips-2001-Hyperbolic Self-Organizing Maps for Semantic Navigation
Author: Jorg Ontrup, Helge Ritter
Abstract: We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a “hyperbolic SOM” (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. We describe experiments, showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods.
2 0.59083098 169 nips-2001-Small-World Phenomena and the Dynamics of Information
Author: Jon M. Kleinberg
Abstract: unkown-abstract
3 0.57417268 107 nips-2001-Latent Dirichlet Allocation
Author: David M. Blei, Andrew Y. Ng, Michael I. Jordan
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model , also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification. 1
4 0.43100193 194 nips-2001-Using Vocabulary Knowledge in Bayesian Multinomial Estimation
Author: Thomas L. Griffiths, Joshua B. Tenenbaum
Abstract: Estimating the parameters of sparse multinomial distributions is an important component of many statistical learning tasks. Recent approaches have used uncertainty over the vocabulary of symbols in a multinomial distribution as a means of accounting for sparsity. We present a Bayesian approach that allows weak prior knowledge, in the form of a small set of approximate candidate vocabularies, to be used to dramatically improve the resulting estimates. We demonstrate these improvements in applications to text compression and estimating distributions over words in newsgroup data. 1
5 0.3997122 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning
Author: Ran El-Yaniv, Oren Souroujon
Abstract: We present a powerful meta-clustering technique called Iterative Double Clustering (IDC). The IDC method is a natural extension of the recent Double Clustering (DC) method of Slonim and Tishby that exhibited impressive performance on text categorization tasks [12]. Using synthetically generated data we empirically find that whenever the DC procedure is successful in recovering some of the structure hidden in the data, the extended IDC procedure can incrementally compute a significantly more accurate classification. IDC is especially advantageous when the data exhibits high attribute noise. Our simulation results also show the effectiveness of IDC in text categorization problems. Surprisingly, this unsupervised procedure can be competitive with a (supervised) SVM trained with a small training set. Finally, we propose a simple and natural extension of IDC for semi-supervised and transductive learning where we are given both labeled and unlabeled examples. 1
6 0.3339836 144 nips-2001-Partially labeled classification with Markov random walks
7 0.32026124 184 nips-2001-The Intelligent surfer: Probabilistic Combination of Link and Content Information in PageRank
8 0.31900647 30 nips-2001-Agglomerative Multivariate Information Bottleneck
9 0.3133108 62 nips-2001-Duality, Geometry, and Support Vector Regression
10 0.31153166 149 nips-2001-Probabilistic Abstraction Hierarchies
11 0.29269633 26 nips-2001-Active Portfolio-Management based on Error Correction Neural Networks
12 0.28991368 118 nips-2001-Matching Free Trees with Replicator Equations
13 0.28779677 43 nips-2001-Bayesian time series classification
14 0.28583264 52 nips-2001-Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks
15 0.26918468 99 nips-2001-Intransitive Likelihood-Ratio Classifiers
16 0.26763678 101 nips-2001-K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms
17 0.25175023 17 nips-2001-A Quantitative Model of Counterfactual Reasoning
18 0.24810682 47 nips-2001-Causal Categorization with Bayes Nets
19 0.23965335 93 nips-2001-Incremental A*
20 0.2229498 85 nips-2001-Grammar Transfer in a Second Order Recurrent Neural Network
topicId topicWeight
[(14, 0.045), (17, 0.024), (19, 0.036), (27, 0.103), (30, 0.052), (38, 0.022), (59, 0.036), (68, 0.333), (70, 0.011), (72, 0.049), (79, 0.042), (83, 0.018), (91, 0.15)]
simIndex simValue paperId paperTitle
same-paper 1 0.823493 90 nips-2001-Hyperbolic Self-Organizing Maps for Semantic Navigation
Author: Jorg Ontrup, Helge Ritter
Abstract: We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a “hyperbolic SOM” (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. We describe experiments, showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods.
2 0.80043304 17 nips-2001-A Quantitative Model of Counterfactual Reasoning
Author: Daniel Yarlett, Michael Ramscar
Abstract: In this paper we explore two quantitative approaches to the modelling of counterfactual reasoning – a linear and a noisy-OR model – based on information contained in conceptual dependency networks. Empirical data is acquired in a study and the fit of the models compared to it. We conclude by considering the appropriateness of non-parametric approaches to counterfactual reasoning, and examining the prospects for other parametric approaches in the future.
3 0.77937144 118 nips-2001-Matching Free Trees with Replicator Equations
Author: Marcello Pelillo
Abstract: Motivated by our recent work on rooted tree matching, in this paper we provide a solution to the problem of matching two free (i.e., unrooted) trees by constructing an association graph whose maximal cliques are in one-to-one correspondence with maximal common subtrees. We then solve the problem using simple replicator dynamics from evolutionary game theory. Experiments on hundreds of uniformly random trees are presented. The results are impressive: despite the inherent inability of these simple dynamics to escape from local optima, they always returned a globally optimal solution.
4 0.52381051 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1
5 0.52307957 13 nips-2001-A Natural Policy Gradient
Author: Sham M. Kakade
Abstract: We provide a natural gradient method that represents the steepest descent direction based on the underlying structure of the parameter space. Although gradient methods cannot make large changes in the values of the parameters, we show that the natural gradient is moving toward choosing a greedy optimal action rather than just a better action. These greedy optimal actions are those that would be chosen under one improvement step of policy iteration with approximate, compatible value functions, as defined by Sutton et al. [9]. We then show drastic performance improvements in simple MDPs and in the more challenging MDP of Tetris. 1
6 0.52036852 95 nips-2001-Infinite Mixtures of Gaussian Process Experts
7 0.51905203 132 nips-2001-Novel iteration schemes for the Cluster Variation Method
8 0.51849163 89 nips-2001-Grouping with Bias
9 0.51775146 66 nips-2001-Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms
10 0.51507658 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments
11 0.51503015 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
13 0.5140183 183 nips-2001-The Infinite Hidden Markov Model
14 0.51384896 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
15 0.51326233 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning
16 0.51324713 58 nips-2001-Covariance Kernels from Bayesian Generative Models
17 0.51274455 57 nips-2001-Correlation Codes in Neuronal Populations
18 0.51258928 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
19 0.51258683 68 nips-2001-Entropy and Inference, Revisited
20 0.51169878 56 nips-2001-Convolution Kernels for Natural Language