nips nips2010 nips2010-71 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. [sent-4, score-0.448]
2 To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). [sent-6, score-0.827]
3 On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. [sent-9, score-0.698]
4 1 Introduction Statistical relational learning has attracted ever-growing interest in the last decade, because of widely available relational data, which can be as complex as citation graphs, the World Wide Web, or relational databases. [sent-12, score-1.077]
5 Relational Markov Networks (RMNs) are excellent tools to capture the statistical dependency among entities in a relational dataset, as has been shown in many tasks such as collective classification [22] and information extraction [18][2]. [sent-13, score-0.548]
6 For example, Markov Logic Networks [10] can be automatically instantiated as an RMN, given just a set of predicates representing attributes and relations among entities. [sent-15, score-0.2]
7 Relational Bayesian networks [22], in contrast, would require expert knowledge to design proper model structures and parameterizations whenever the schema of the domain under consideration is changed. [sent-17, score-0.115]
8 For example, work by Kok and Domingos [10][11][12] has shown that a prominent problem of relational undirected models is how to handle the exponentially many features, each of which is a conjunction of several neighboring variables (or “ground atoms” in terms of first order logic). [sent-19, score-0.405]
9 The main goal of this paper is to show that instead of learning a very expressive relational model, which can be extremely expensive, an alternative approach that explores Hidden Variable Detection (HVD) to compensate for a family of restricted relational models (e.g., [sent-21, score-0.81]
10 treeRMNs) can yield a very efficient yet competent relational learning framework. [sent-23, score-0.359]
11 First, to achieve efficient inference, we introduce a restricted class of RMNs called relation tree-based RMNs (treeRMNs), which only considers unary (single variable assignment) and pairwise (conjunction of two variable assignments) features. [sent-24, score-0.319]
12 Since the Markov blanket of a variable is concisely defined by a relation tree on the schema, we can easily control the complexities of treeRMN models. [sent-25, score-0.227]
13 Second, to compensate for the restricted expressive power of treeRMNs, we further introduce a hidden variable induction algorithm called Contrastive Variable Induction (CVI), which can effectively detect latent variables capturing long range dependencies. [sent-26, score-0.394]
14 It has been shown in relational Bayesian networks [24] that hidden variables can help propagate information across network structures, thus reducing the burden of extensive structural learning. [sent-27, score-0.577]
15 In this work, we explore the usefulness of hidden variables in learning RMNs. [sent-28, score-0.169]
16 Our experiments on four real datasets show that the proposed relational learning framework can achieve similar prediction quality to the state-of-the-art RMN models, but is significantly more efficient in training. [sent-29, score-0.389]
17 Furthermore, the induced hidden variables are semantically meaningful and are crucial to improving training speed of treeRMN. [sent-30, score-0.2]
18 Most of the existing RMN models construct Markov networks by applying templates to entity relation graphs [21][8]. [sent-36, score-0.474]
19 The treeRMN model that we are going to introduce uses a type of template called a relation tree, which is very general and applicable to a wide range of applications. [sent-37, score-0.269]
20 This relation tree template resembles the path-based feature generation approach for relational classifiers developed by Huang et al. [sent-38, score-0.608]
21 Recently, much work has been done to induce hidden variables for generative Bayesian networks [5][4][16][9][20][14]. [sent-40, score-0.218]
22 On the other hand, very little progress has been made in the direction of non-parametric hidden variable models based on discriminative Markov networks (MNs). [sent-42, score-0.209]
23 One recent attempt is the Multiple Relational Clustering (MRC) [11] algorithm, which performs top-down clustering of predicates and symbols. [sent-43, score-0.162]
24 The “ideal parent” evaluates candidate hidden variables based on the estimated gain of log-likelihood they can bring to the Bayesian network. [sent-46, score-0.246]
25 Similarly, the CVI algorithm evaluates candidate hidden variables based on the estimated gain of a regularized RMN log-likelihood, thus avoiding the costly step of parameter estimation. [sent-47, score-0.246]
26 Here we consider the general case that Markov networks have observed variables O, labeled variables Y, and hidden variables H. [sent-54, score-0.31]
27 Let X = (Y, H) be the joint of hidden and labeled variables. [sent-55, score-0.123]
28 The conditional distribution of X given observations O is p(x|o; θ) = exp(θ⊤ f(x, o))/Z(θ), where f is a vector of feature functions fk; θ is a vector of weights; Z(θ) = ∑_x exp(θ⊤ f(x, o)) is a normalization factor; and fk(x, o) counts the number of times the k-th feature fires in (x, o). [sent-56, score-0.116]
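As a rough illustration of this log-linear form, the following sketch computes p(x|o; θ) by brute-force enumeration of the assignments of X. It assumes a tiny toy state space and a hand-written feature map; none of the names below come from the paper.

import itertools
import numpy as np

def feature_vector(x, o):
    # toy feature map for two binary variables x = (x1, x2) and one observed bit o:
    # two unary features plus two pairwise (conjunction) features
    x1, x2 = x
    return np.array([x1, x2, x1 * x2, x1 * o], dtype=float)

def conditional_prob(x, o, theta):
    # unnormalized score exp(theta^T f(x, o)) for the queried assignment
    score = np.exp(theta @ feature_vector(x, o))
    # partition function Z(theta): sum over all assignments of X
    Z = sum(np.exp(theta @ feature_vector(xp, o))
            for xp in itertools.product([0, 1], repeat=2))
    return score / Z

print(conditional_prob((1, 0), o=1, theta=np.array([0.5, -0.2, 0.1, 0.3])))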
29 After applying each template to all the nodes in a graph, we get a graphical model with a large number of features (i. [sent-62, score-0.12]
30 In the context of relational learning, the templates can be defined similarly, except that they have richer representations, with multiple types of entities and neighboring relations. [sent-66, score-0.642]
31 Because of its singularity at the origin, the ℓ1 -norm can yield a sparse estimate, which is a desired property for hidden variable discovery, as we shall see. [sent-68, score-0.16]
32 4 Relation Tree-Based RMNs In the following, we formally define the treeRMN model with relation tree templates, which is very general and applicable to a wide range of applications. [sent-86, score-0.165]
33 T = {Ti } is a set of entity types which include both basic entity types (e. [sent-88, score-0.562]
34 Each entity type is associated with a set of attributes A(T ) = {T. [sent-93, score-0.332]
35 For each argument of a composite entity type, we define two relations, one with outward direction (e. [sent-100, score-0.37]
36 We further introduce a Twin relation, which connects a composite entity type to itself. [sent-106, score-0.429]
37 In principle, we can define other types of relations, such as those corresponding to functions in second-order logic (e. [sent-108, score-0.14]
38 An entity relation graph G = IE(S) (Figure 1, right) is the instantiation of schema S on a set of basic entities E = {ei}. [sent-111, score-0.635]
39 We define the instantiation of a basic entity type T as IE(T) = {e : e.T = T}, [sent-112, score-0.322]
40 and similarly for a composite type IE(T = ⟨T1, . . . [sent-113, score-0.15]
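The instantiation step can be pictured with a small sketch: a basic type collects the entities declared with that type, and a composite type ⟨T1, T2⟩ is instantiated as all ordered pairs drawn from its argument instantiations (consistent with the later remark that the number of composite entities is the square of the number of basic entities). The dictionary layout and names here are illustrative assumptions, not the paper's data structures.

def instantiate_basic(entities, T):
    # I_E(T) = {e : e.T == T} for a basic entity type T
    return [e for e in entities if e["type"] == T]

def instantiate_composite(entities, T1, T2):
    # I_E(<T1, T2>): all ordered pairs of basic entities of the argument types
    return [(e1, e2)
            for e1 in instantiate_basic(entities, T1)
            for e2 in instantiate_basic(entities, T2)]

people = [{"type": "Person", "name": "ann"}, {"type": "Person", "name": "bob"}]
print(len(instantiate_composite(people, "Person", "Person")))  # 4 = 2 * 2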
41 Xi } that correspond to the set of attributes of its entity type A(e. [sent-123, score-0.332]
42 For a composite entity that consists of two entities of the same type, we’d like to capture its correlation with its twin, the composite entity made of the same basic entities but in reversed order. [sent-125, score-0.2]
43 Therefore, we add the Twin relation between all pairs of twin entities: e. [sent-126, score-0.195]
44 Figure 2: Two-level relation trees for the Person type (left) and the ⟨Person, Person⟩ type (right). [sent-129, score-0.473]
45 Given a schema, we can conveniently express how one entity can reach another entity by the concept of a relation path. [sent-130, score-0.599]
46 A relation path P is a sequence of relations R1 . [sent-131, score-0.18]
47 , the path Person is actually equivalent to the path Person −PC1→ ⟨Person, Class⟩ −→ Person. [sent-152, score-0.574]
48 To avoid creating these uninteresting paths, we add a constraint to outward composite relations (e. [sent-153, score-0.171]
49 We also require that the Twin relation not be combined with any other relation. [sent-156, score-0.143]
50 Now, the Markov blanket of an entity e ∈ T can be concisely defined by the set of all relation paths with domain T and of length ≤ ℓ (as shown in Figure 2). [sent-157, score-0.434]
51 We call this set the relation tree of type T, and denote it as Tree(T, ℓ) = {P}. [sent-158, score-0.187]
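A relation tree of this kind can be enumerated directly from the schema by a breadth-first walk over relations, keeping every path of length at most ℓ that starts at the given domain type. The sketch below assumes the schema is given as an adjacency map from a type to its (relation name, target type) edges; for brevity it omits the constraints on outward composite relations and on the Twin relation mentioned above, and the relation names are only illustrative.

def relation_tree(schema, domain_type, max_len):
    # schema: dict mapping a type to a list of (relation_name, target_type) pairs
    # returns Tree(domain_type, max_len) as a list of relation-name sequences
    paths = []
    frontier = [((), domain_type)]
    for _ in range(max_len):
        next_frontier = []
        for path, current_type in frontier:
            for relation, target_type in schema.get(current_type, []):
                new_path = path + (relation,)
                paths.append(new_path)
                next_frontier.append((new_path, target_type))
        frontier = next_frontier
    return paths

schema = {"Person": [("PP1", "<Person,Person>"), ("PC1", "<Person,Class>")],
          "<Person,Person>": [("PP1-1", "Person"), ("PP2-1", "Person")]}
print(relation_tree(schema, "Person", 2))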
52 This template can be applied to any entity e of type T in the entity relation graph. [sent-161, score-0.73]
53 This template can be applied to any entity pair (e1 , e2 ), where e1 . [sent-166, score-0.325]
54 We define e.P as the set of entities reachable from entity e ∈ T through the relation path P. [sent-170, score-0.568]
55 In our current implementation, we systematically enumerate all possible unary and pairwise templates. [sent-175, score-0.136]
56 Given the above concepts, we define a treeRMN model M = (G, f , θ) as the tuple of an entity relation graph G, a set of feature functions f , and their weights θ. [sent-176, score-0.387]
57 Each feature function fk counts the number of times the k-th template fires in G. [sent-177, score-0.138]
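To make the counting concrete, a pairwise template can be read as "entity e of type T has attribute value a, and some entity reached from e along relation path P has attribute value b"; the corresponding feature value is the number of such co-occurrences in the graph. The helpers and names in this sketch are assumptions made for illustration, not the paper's implementation.

def pairwise_template_count(entities_of_T, attr, value, path, attr2, value2,
                            get_attr, follow_path):
    # get_attr(e, attr) -> value of attribute attr on entity e
    # follow_path(e, path) -> entities reachable from e along the relation path
    count = 0
    for e in entities_of_T:
        if get_attr(e, attr) != value:
            continue
        for e2 in follow_path(e, path):
            if get_attr(e2, attr2) == value2:
                count += 1
    return count

# toy usage: count smoker-smoker pairs linked by a (hypothetical) Friend path
smokes = {"ann": 1, "bob": 1, "carl": 0}
friends = {"ann": ["bob"], "bob": ["ann", "carl"], "carl": []}
print(pairwise_template_count(["ann", "bob", "carl"], "smokes", 1, ("Friend",),
                              "smokes", 1,
                              get_attr=lambda e, a: smokes[e],
                              follow_path=lambda e, p: friends[e]))  # 2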
58 Generally, the complexity of inference is exponential in the depth of the relation trees, because both the number of templates and the sizes of their Markov blankets grow exponentially w. [sent-178, score-0.18]
59 However, the limited expressive power of treeRMN can be effectively compensated for by detecting hidden variables, which is another key component of our relational learning approach, as explained in the next section. [sent-184, score-0.543]
60 We give closed-form expressions of the likelihood gain and the weights of newly added features under the contrastive divergence approximation [23] (other types of inference can be handled similarly). [sent-189, score-0.218]
61 Consider introducing a new HV H to the entity type T . [sent-191, score-0.296]
62 Here we assume that any feature f ∈ fH is in the pairwise form f_{H=1 ∧ A=a}, where a is an assignment to one of the existing variables A in the relation tree of type T. [sent-194, score-0.301]
63 For easy evaluation of H, we set its mean field variational parameters µH to either 0 or 1 on the entities of type T . [sent-196, score-0.24]
64 Therefore, a candidate HV can be represented as (I, fH , θH ), where I is the set of indices to the entities with µH = 1. [sent-198, score-0.225]
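Read this way, a candidate hidden variable is just a subset of the entities of type T on which the new variable is switched on, together with the features attached to it and their weights. A minimal container (names assumed, not from the paper) might look like:

from dataclasses import dataclass, field

@dataclass
class CandidateHV:
    # indices of the entities of type T whose mean-field parameter mu_H is set to 1
    active_entities: set
    # pairwise features of the form f_{H=1 and A=a} attached to the new variable
    features: list = field(default_factory=list)
    # weights theta_H for those features
    weights: list = field(default_factory=list)

candidate = CandidateHV(active_entities={0, 3, 7})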
65 large β) smoothly shrinks both the (estimated) likelihood gain and the feature weights, while the non-differentiable ℓ1-norm not only shrinks the estimated gain and feature weights but also drives features to have zero gains, and can therefore automatically select features. [sent-207, score-0.188]
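The qualitative difference between the two penalties can be illustrated with a generic shrinkage step: a ridge-style (ℓ2) penalty rescales every estimated weight smoothly, while ℓ1 soft-thresholding sets small weights exactly to zero, which is what makes feature (and hence hidden-variable) selection automatic. This is a standard illustration under simplified assumptions, not the paper's closed-form gain expressions.

import numpy as np

def l2_shrink(raw_weights, beta):
    # ridge-style shrinkage: every weight is scaled down smoothly, none become zero
    return raw_weights / (1.0 + beta)

def l1_shrink(raw_weights, lam):
    # soft-thresholding: weights with |w| <= lam are driven exactly to zero
    return np.sign(raw_weights) * np.maximum(np.abs(raw_weights) - lam, 0.0)

w = np.array([0.8, -0.05, 0.02, -1.3])
print(l2_shrink(w, beta=1.0))  # all entries shrunk, all still nonzero
print(l1_shrink(w, lam=0.1))   # small entries zeroed out -> sparse selection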
66 Remark 2: Currently, we only induce HVs for basic entity types. [sent-227, score-0.271]
67 Extension to composite types can show interesting ternary relations such as “Abnormality can be PartOf Animals”. [sent-228, score-0.168]
68 We demonstrate that CVI can discover semantically meaningful hidden variables, which can significantly improve the speed and quality of treeRMN models. [sent-231, score-0.154]
69 These datasets are commonly used by previous work in relational learning [9][11][20][14]. [sent-234, score-0.389]
70 It consists exclusively of unary predicates of the form A(a), where A is an attribute and a is an animal (e. [sent-236, score-0.34]
71 This is a simple propositional dataset with no relational structure, but is useful as a base case for comparison. [sent-239, score-0.584]
Table 1: Number of entities (#E) and attributes (#A) for each dataset. Animal: 50 80 0 0; Nation: 14 111 196 56; Kinship: 104 0 10,816 1* (the Kinship data has only one attribute, which has 26 possible values).
72 The binary predicates are of the form R(n1, n2), where n1, n2 are nations and R is a relation between them. [sent-242, score-0.296]
73 The unary predicates are of the form A(n), where n is a nation and A is an attribute (e. [sent-246, score-0.337]
74 It consists of binary predicates of the form R(c1 , c2 ), where c1 and c2 are biomedical concepts and R is a relation between them (e. [sent-250, score-0.269]
75 The Kinship dataset contains kinship relationships among members of the Alyawarra tribe from Central Australia. [sent-253, score-0.164]
76 Predicates are of the form R(p1 , p2 ), where R is a kinship term and p1 , p2 are persons. [sent-254, score-0.164]
77 Except for the animal data, the number of composite entities is the square of the number of basic entities. [sent-255, score-0.396]
78 2 Characterization of treeRMN and CVI In this section, we analyze the properties of the discovered hidden variables and demonstrate the behavior of the CVI algorithm. [sent-257, score-0.169]
79 For the simple non-relational Animal data, if we start with a full model with all pairwise features, CVI will decide not to introduce any hidden variables. [sent-258, score-0.158]
80 If we run CVI starting from a model with only unary features, however, CVI decides to introduce one hidden variable H0 with 8 categories. [sent-259, score-0.261]
81 Table 2 shows the associated entities and features for the first four categories. [sent-260, score-0.229]
82 Table 2: The associated entities and features (sorted by decreasing magnitude of feature weights) for the first four categories of the induced hidden variable. [sent-286, score-0.422]
83 Table 3: The associated entities and features (sorted by decreasing magnitude of feature weights) for the first three categories of the induced hidden variable. [sent-310, score-0.447]
84 For the three relational datasets, we use UML as an example. [sent-314, score-0.359]
85 CVI induces two multinomial hidden variables H0 and H1. [sent-318, score-0.169]
86 As we can see from Figure 3, the inclusion of each hidden variable significantly [sent-321, score-0.16]
87 represent the hidden concepts Abnormalities, Animals and Plants. [sent-328, score-0.165]
88 3 Overall Performance Now we present quantitative evaluation of the treeRMN model, and compare it with other relational learning methods including MLN structure learning (MSL) [10], Infinite Relational Models (IRM) [9] and Multiple Relational Clustering (MRC) [11]. [sent-335, score-0.359]
89 At each run, we treat one fold as hidden during training, and then evaluate the prediction of these variables conditioned on the observed variables during testing. [sent-338, score-0.215]
90 Table 4 compares the overall performance of treeRMN (RMN), treeRMN with hidden variable discovery (RMNCVI), and other relational models (MSL, IRM and MRC) as reported in [11]. [sent-342, score-0.519]
91 § The CLL of kinship data is not comparable to previous approaches, because we treat each of its labels as one variable with 26 categories instead of 26 binary variables. [sent-466, score-0.226]
92 On all three relational datasets, treeRMNs with CVI can significantly improve CLL and AUC. [sent-471, score-0.359]
93 Since both UML and Kinship data have no attributes in basic entity types, composite entities become more important to model. [sent-478, score-0.595]
94 Therefore, we suspect that the MRC model achieves better performance because it can perform clustering on two-argument predicates, which correspond to composite entities. [sent-479, score-0.261]
95 By using simple treeRMNs, we achieve computational efficiency, and CVI can effectively detect hidden variables, which compensates for the limited expressive power of treeRMNs. [sent-481, score-0.184]
96 Experiments on four real datasets show that the proposed relational learning approach can achieve state-of-the-art prediction accuracy and is much faster than existing relational Markov network models. [sent-482, score-0.748]
97 Second, as we have explained at the end of section 5, we’d like to apply HVD on composite entity types. [sent-485, score-0.344]
98 Third, we’d also like to treat the introduced hidden variables as free variables and to make their cardinalities adaptive. [sent-486, score-0.238]
99 Predictive modeling using features derived from paths in relational graphs. [sent-513, score-0.425]
100 Learning systems of concepts with an infinite relational model. [sent-521, score-0.401]
wordName wordTfidf (topN-words)
[('treermn', 0.418), ('relational', 0.359), ('cvi', 0.314), ('erson', 0.262), ('entity', 0.245), ('entities', 0.189), ('kinship', 0.164), ('rmn', 0.157), ('cll', 0.134), ('treermns', 0.134), ('hidden', 0.123), ('uml', 0.119), ('mrc', 0.118), ('predicates', 0.118), ('relation', 0.109), ('ie', 0.104), ('unary', 0.101), ('composite', 0.099), ('dom', 0.096), ('hvs', 0.09), ('msl', 0.09), ('contrastive', 0.086), ('irm', 0.084), ('animal', 0.082), ('template', 0.08), ('nation', 0.079), ('rmncv', 0.075), ('kok', 0.072), ('rmns', 0.072), ('templates', 0.071), ('logic', 0.071), ('markov', 0.068), ('induction', 0.067), ('mns', 0.066), ('schema', 0.066), ('expressive', 0.061), ('fh', 0.059), ('nir', 0.053), ('twin', 0.052), ('type', 0.051), ('networks', 0.049), ('dim', 0.049), ('relations', 0.046), ('variables', 0.046), ('auc', 0.046), ('clustering', 0.044), ('pedro', 0.043), ('hv', 0.043), ('concepts', 0.042), ('gain', 0.041), ('daphne', 0.041), ('features', 0.04), ('attribute', 0.039), ('stanley', 0.039), ('increment', 0.039), ('variable', 0.037), ('py', 0.037), ('candidate', 0.036), ('attributes', 0.036), ('pairwise', 0.035), ('win', 0.034), ('feature', 0.033), ('uai', 0.033), ('eld', 0.032), ('elidan', 0.032), ('semantically', 0.031), ('domingos', 0.031), ('compensate', 0.031), ('gal', 0.031), ('ei', 0.03), ('datasets', 0.03), ('blanket', 0.03), ('fish', 0.03), ('hvd', 0.03), ('nations', 0.03), ('paws', 0.03), ('raccoon', 0.03), ('swims', 0.03), ('toughskin', 0.03), ('vegetation', 0.03), ('range', 0.029), ('res', 0.028), ('pp', 0.027), ('tree', 0.027), ('animals', 0.027), ('exibility', 0.026), ('paths', 0.026), ('dolphin', 0.026), ('lao', 0.026), ('outward', 0.026), ('basic', 0.026), ('path', 0.025), ('categories', 0.025), ('fk', 0.025), ('abnormalities', 0.024), ('mlns', 0.024), ('concisely', 0.024), ('types', 0.023), ('plants', 0.023), ('cardinalities', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.19921711 83 nips-2010-Evidence-Specific Structures for Rich Tractable CRFs
Author: Anton Chechetka, Carlos Guestrin
Abstract: We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer a large expressive power penalty inherent to fixed tractable structures. On real-life relational datasets, our approach matches or exceeds state of the art accuracy of the dense models, and at the same time provides an order of magnitude speedup. 1
3 0.17906117 67 nips-2010-Dynamic Infinite Relational Model for Time-varying Relational Data Analysis
Author: Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, Joshua B. Tenenbaum
Abstract: We propose a new probabilistic model for analyzing dynamic evolutions of relational data, such as additions, deletions and split & merge, of relation clusters like communities in social networks. Our proposed model abstracts observed timevarying object-object relationships into relationships between object clusters. We extend the infinite Hidden Markov model to follow dynamic and time-sensitive changes in the structure of the relational data and to estimate a number of clusters simultaneously. We show the usefulness of the model through experiments with synthetic and real-world data sets.
4 0.10228697 128 nips-2010-Infinite Relational Modeling of Functional Connectivity in Resting State fMRI
Author: Morten Mørup, Kristoffer Madsen, Anne-marie Dogonowski, Hartwig Siebner, Lars K. Hansen
Abstract: Functional magnetic resonance imaging (fMRI) can be applied to study the functional connectivity of the neural elements which form complex network at a whole brain level. Most analyses of functional resting state networks (RSN) have been based on the analysis of correlation between the temporal dynamics of various regions of the brain. While these models can identify coherently behaving groups in terms of correlation they give little insight into how these groups interact. In this paper we take a different view on the analysis of functional resting state networks. Starting from the definition of resting state as functional coherent groups we search for functional units of the brain that communicate with other parts of the brain in a coherent manner as measured by mutual information. We use the infinite relational model (IRM) to quantify functional coherent groups of resting state networks and demonstrate how the extracted component interactions can be used to discriminate between functional resting state activity in multiple sclerosis and normal subjects. 1
5 0.081437975 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
Author: Matthias Broecheler, Lise Getoor
Abstract: Continuous Markov random fields are a general formalism to model joint probability distributions over events with continuous outcomes. We prove that marginal computation for constrained continuous MRFs is #P-hard in general and present a polynomial-time approximation scheme under mild assumptions on the structure of the random field. Moreover, we introduce a sampling algorithm to compute marginal distributions and develop novel techniques to increase its efficiency. Continuous MRFs are a general purpose probabilistic modeling tool and we demonstrate how they can be applied to statistical relational learning. On the problem of collective classification, we evaluate our algorithm and show that the standard deviation of marginals serves as a useful measure of confidence. 1
6 0.078748614 78 nips-2010-Error Propagation for Approximate Policy and Value Iteration
7 0.060934171 159 nips-2010-Lifted Inference Seen from the Other Side : The Tractable Features
8 0.054295912 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
9 0.051574696 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data
10 0.050425321 144 nips-2010-Learning Efficient Markov Networks
11 0.044265404 162 nips-2010-Link Discovery using Graph Feature Tracking
12 0.043665688 235 nips-2010-Self-Paced Learning for Latent Variable Models
13 0.043336745 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
14 0.042956974 99 nips-2010-Gated Softmax Classification
15 0.04151696 101 nips-2010-Gaussian sampling by local perturbations
16 0.041073479 206 nips-2010-Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
17 0.041057602 135 nips-2010-Label Embedding Trees for Large Multi-Class Tasks
18 0.040024258 230 nips-2010-Robust Clustering as Ensembles of Affinity Relations
19 0.039104469 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits
20 0.037940975 273 nips-2010-Towards Property-Based Classification of Clustering Paradigms
topicId topicWeight
[(0, 0.131), (1, 0.03), (2, -0.026), (3, 0.009), (4, -0.108), (5, -0.031), (6, 0.007), (7, -0.031), (8, 0.069), (9, -0.008), (10, -0.09), (11, -0.011), (12, 0.055), (13, -0.056), (14, 0.05), (15, -0.048), (16, 0.021), (17, -0.001), (18, -0.024), (19, -0.083), (20, -0.022), (21, 0.09), (22, 0.101), (23, -0.026), (24, -0.135), (25, 0.051), (26, -0.056), (27, -0.015), (28, -0.03), (29, 0.062), (30, -0.12), (31, -0.228), (32, -0.133), (33, -0.056), (34, -0.121), (35, 0.086), (36, 0.003), (37, 0.153), (38, -0.122), (39, -0.01), (40, -0.044), (41, -0.008), (42, 0.045), (43, -0.002), (44, -0.079), (45, 0.079), (46, 0.149), (47, 0.146), (48, -0.044), (49, 0.11)]
simIndex simValue paperId paperTitle
same-paper 1 0.9234581 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.77257657 67 nips-2010-Dynamic Infinite Relational Model for Time-varying Relational Data Analysis
Author: Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, Joshua B. Tenenbaum
Abstract: We propose a new probabilistic model for analyzing dynamic evolutions of relational data, such as additions, deletions and split & merge, of relation clusters like communities in social networks. Our proposed model abstracts observed timevarying object-object relationships into relationships between object clusters. We extend the infinite Hidden Markov model to follow dynamic and time-sensitive changes in the structure of the relational data and to estimate a number of clusters simultaneously. We show the usefulness of the model through experiments with synthetic and real-world data sets.
3 0.76440179 83 nips-2010-Evidence-Specific Structures for Rich Tractable CRFs
Author: Anton Chechetka, Carlos Guestrin
Abstract: We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer a large expressive power penalty inherent to fixed tractable structures. On real-life relational datasets, our approach matches or exceeds state of the art accuracy of the dense models, and at the same time provides an order of magnitude speedup. 1
4 0.58428693 159 nips-2010-Lifted Inference Seen from the Other Side : The Tractable Features
Author: Abhay Jha, Vibhav Gogate, Alexandra Meliou, Dan Suciu
Abstract: Lifted Inference algorithms for representations that combine first-order logic and graphical models have been the focus of much recent research. All lifted algorithms developed to date are based on the same underlying idea: take a standard probabilistic inference algorithm (e.g., variable elimination, belief propagation etc.) and improve its efficiency by exploiting repeated structure in the first-order model. In this paper, we propose an approach from the other side in that we use techniques from logic for probabilistic inference. In particular, we define a set of rules that look only at the logical representation to identify models for which exact efficient inference is possible. Our rules yield new tractable classes that could not be solved efficiently by any of the existing techniques. 1
5 0.48232126 121 nips-2010-Improving Human Judgments by Decontaminating Sequential Dependencies
Author: Harold Pashler, Matthew Wilder, Robert Lindsey, Matt Jones, Michael C. Mozer, Michael P. Holmes
Abstract: For over half a century, psychologists have been struck by how poor people are at expressing their internal sensations, impressions, and evaluations via rating scales. When individuals make judgments, they are incapable of using an absolute rating scale, and instead rely on reference points from recent experience. This relativity of judgment limits the usefulness of responses provided by individuals to surveys, questionnaires, and evaluation forms. Fortunately, the cognitive processes that transform internal states to responses are not simply noisy, but rather are influenced by recent experience in a lawful manner. We explore techniques to remove sequential dependencies, and thereby decontaminate a series of ratings to obtain more meaningful human judgments. In our formulation, decontamination is fundamentally a problem of inferring latent states (internal sensations) which, because of the relativity of judgment, have temporal dependencies. We propose a decontamination solution using a conditional random field with constraints motivated by psychological theories of relative judgment. Our exploration of decontamination models is supported by two experiments we conducted to obtain ground-truth rating data on a simple length estimation task. Our decontamination techniques yield an over 20% reduction in the error of human judgments. 1
6 0.41148955 215 nips-2010-Probabilistic Deterministic Infinite Automata
7 0.38286892 144 nips-2010-Learning Efficient Markov Networks
8 0.35199061 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
9 0.34305692 1 nips-2010-(RF)^2 -- Random Forest Random Field
10 0.33218551 128 nips-2010-Infinite Relational Modeling of Functional Connectivity in Resting State fMRI
11 0.32705289 155 nips-2010-Learning the context of a category
12 0.32584089 62 nips-2010-Discriminative Clustering by Regularized Information Maximization
13 0.29781848 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits
14 0.29612675 188 nips-2010-On Herding and the Perceptron Cycling Theorem
15 0.28854948 13 nips-2010-A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction
16 0.27522641 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
17 0.25794458 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
18 0.25719351 242 nips-2010-Slice sampling covariance hyperparameters of latent Gaussian models
19 0.25422883 9 nips-2010-A New Probabilistic Model for Rank Aggregation
20 0.25399283 206 nips-2010-Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
topicId topicWeight
[(13, 0.03), (17, 0.014), (27, 0.064), (30, 0.046), (35, 0.025), (41, 0.356), (45, 0.192), (50, 0.062), (52, 0.02), (60, 0.035), (77, 0.029), (78, 0.012), (90, 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.73024511 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.6626327 176 nips-2010-Multiple Kernel Learning and the SMO Algorithm
Author: Zhaonan Sun, Nawanol Ampornpunt, Manik Varma, S.v.n. Vishwanathan
Abstract: Our objective is to train p-norm Multiple Kernel Learning (MKL) and, more generally, linear MKL regularised by the Bregman divergence, using the Sequential Minimal Optimization (SMO) algorithm. The SMO algorithm is simple, easy to implement and adapt, and efficiently scales to large problems. As a result, it has gained widespread acceptance and SVMs are routinely trained using SMO in diverse real world applications. Training using SMO has been a long standing goal in MKL for the very same reasons. Unfortunately, the standard MKL dual is not differentiable, and therefore can not be optimised using SMO style co-ordinate ascent. In this paper, we demonstrate that linear MKL regularised with the p-norm squared, or with certain Bregman divergences, can indeed be trained using SMO. The resulting algorithm retains both simplicity and efficiency and is significantly faster than state-of-the-art specialised p-norm MKL solvers. We show that we can train on a hundred thousand kernels in approximately seven minutes and on fifty thousand points in less than half an hour on a single core. 1
3 0.6048755 138 nips-2010-Large Margin Multi-Task Metric Learning
Author: Shibin Parameswaran, Kilian Q. Weinberger
Abstract: Multi-task learning (MTL) improves the prediction performance on multiple, different but related, learning problems through shared parameters or representations. One of the most prominent multi-task learning algorithms is an extension to support vector machines (svm) by Evgeniou et al. [15]. Although very elegant, multi-task svm is inherently restricted by the fact that support vector machines require each class to be addressed explicitly with its own weight vector which, in a multi-task setting, requires the different learning tasks to share the same set of classes. This paper proposes an alternative formulation for multi-task learning by extending the recently published large margin nearest neighbor (lmnn) algorithm to the MTL paradigm. Instead of relying on separating hyperplanes, its decision function is based on the nearest neighbor rule which inherently extends to many classes and becomes a natural fit for multi-task learning. We evaluate the resulting multi-task lmnn on real-world insurance data and speech classification problems and show that it consistently outperforms single-task kNN under several metrics and state-of-the-art MTL classifiers. 1
4 0.52757221 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
Author: Dahua Lin, Eric Grimson, John W. Fisher
Abstract: We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models. 1
5 0.52656782 63 nips-2010-Distributed Dual Averaging In Networks
Author: Alekh Agarwal, Martin J. Wainwright, John C. Duchi
Abstract: The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. We develop and analyze distributed algorithms based on dual averaging of subgradients, and provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis clearly separates the convergence of the optimization algorithm itself from the effects of communication constraints arising from the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and simulations for various networks. 1
6 0.52623528 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
7 0.526205 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models
8 0.52610093 155 nips-2010-Learning the context of a category
9 0.52591652 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
10 0.52580971 158 nips-2010-Learning via Gaussian Herding
11 0.52523428 287 nips-2010-Worst-Case Linear Discriminant Analysis
12 0.52433068 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs
13 0.52362788 12 nips-2010-A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups
14 0.52359676 239 nips-2010-Sidestepping Intractable Inference with Structured Ensemble Cascades
15 0.52337688 277 nips-2010-Two-Layer Generalization Analysis for Ranking Using Rademacher Average
16 0.52333933 217 nips-2010-Probabilistic Multi-Task Feature Selection
17 0.52297854 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
18 0.52233857 164 nips-2010-MAP Estimation for Graphical Models by Likelihood Maximization
19 0.52226913 13 nips-2010-A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction
20 0.52223247 150 nips-2010-Learning concept graphs from text with stick-breaking priors