nips nips2010 nips2010-71 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. [sent-4, score-0.448]
2 To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). [sent-6, score-0.827]
3 On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. [sent-9, score-0.698]
4 1 Introduction Statistical relational learning has attracted ever-growing interest in the last decade, because of widely available relational data, which can be as complex as citation graphs, the World Wide Web, or relational databases. [sent-12, score-1.077]
5 Relational Markov Networks (RMNs) are excellent tools to capture the statistical dependency among entities in a relational dataset, as has been shown in many tasks such as collective classification [22] and information extraction [18][2]. [sent-13, score-0.548]
6 For example, Markov Logic Networks [10] can be automatically instantiated as an RMN, given just a set of predicates representing attributes and relations among entities. [sent-15, score-0.2]
7 Relational Bayesian networks [22], in contrast, would require expert knowledge to design proper model structures and parameterizations whenever the schema of the domain under consideration is changed. [sent-17, score-0.115]
8 For example, work by Kok and Domingos [10][11][12] has shown that a prominent problem of relational undirected models is how to handle the exponentially many features, each of which is a conjunction of several neighboring variables (or “ground atoms” in terms of first order logic). [sent-19, score-0.405]
9 The main goal of this paper is to show that instead of learning a very expressive relational model, which can be extremely expensive, an alternative approach that explores Hidden Variable Detection (HVD) to compensate for a family of restricted relational models (e.g., [sent-21, score-0.81]
10 treeRMNs) can yield a very efficient yet competent relational learning framework. [sent-23, score-0.359]
11 First, to achieve efficient inference, we introduce a restricted class of RMNs called relation tree-based RMNs (treeRMNs), which only considers unary (single variable assignment) and pairwise (conjunction of two variable assignments) features. [sent-24, score-0.319]
12 Since the Markov blanket of a variable is concisely defined by a relation tree on the schema, we can easily control the complexities of treeRMN models. [sent-25, score-0.227]
13 Second, to compensate for the restricted expressive power of treeRMNs, we further introduce a hidden variable induction algorithm called Contrastive Variable Induction (CVI), which can effectively detect latent variables capturing long range dependencies. [sent-26, score-0.394]
14 It has been shown in relational Bayesian networks [24] that hidden variables can help propagate information across network structures, thus reducing the burden of extensive structural learning. [sent-27, score-0.577]
15 In this work, we explore the usefulness of hidden variables in learning RMNs. [sent-28, score-0.169]
16 Our experiments on four real datasets show that the proposed relational learning framework can achieve similar prediction quality to the state-of-the-art RMN models, but is significantly more efficient in training. [sent-29, score-0.389]
17 Furthermore, the induced hidden variables are semantically meaningful and are crucial to improving training speed of treeRMN. [sent-30, score-0.2]
18 Most of the existing RMN models construct Markov networks by applying templates to entity relation graphs [21][8]. [sent-36, score-0.474]
19 The treeRMN model that we are going to introduce uses a type of template called a relation tree, which is very general and applicable to a wide range of applications. [sent-37, score-0.269]
20 This relation tree template resembles the path-based feature generation approach for relational classifiers developed by Huang et al. [sent-38, score-0.608]
21 Recently, much work has been done to induce hidden variables for generative Bayesian networks [5][4][16][9][20][14]. [sent-40, score-0.218]
22 On the other hand, very little progress has been made in the direction of non-parametric hidden variable models based on discriminative Markov networks (MNs). [sent-42, score-0.209]
23 One recent attempt is the Multiple Relational Clustering (MRC) [11] algorithm, which performs top-down clustering of predicates and symbols. [sent-43, score-0.162]
24 The “ideal parent” evaluates candidate hidden variables based on the estimated gain of log-likelihood they can bring to the Bayesian network. [sent-46, score-0.246]
25 Similarly, the CVI algorithm evaluates candidate hidden variables based on the estimated gain of a regularized RMN log-likelihood, thus avoiding the costly step of parameter estimation. [sent-47, score-0.246]
26 Here we consider the general case that Markov networks have observed variables O, labeled variables Y, and hidden variables H. [sent-54, score-0.31]
27 Let X = (Y, H) be the joint of hidden and labeled variables. [sent-55, score-0.123]
28 The conditional distribution of X given observations O is p(x|o; θ) = exp(θ⊤ f(x, o))/Z(θ), where f is a vector of feature functions fk; θ is a vector of weights; Z(θ) = ∑_x exp(θ⊤ f(x, o)) is a normalization factor; and fk(x, o) counts the number of times the k-th feature fires in (x, o). [sent-56, score-0.116]
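As a rough illustration of this log-linear form, the following sketch computes p(x|o; θ) by brute-force enumeration of the assignments of X. It assumes a tiny toy state space and a hand-written feature map; none of the names below come from the paper.

import itertools
import numpy as np

def feature_vector(x, o):
    # toy feature map for two binary variables x = (x1, x2) and one observed bit o:
    # two unary features plus two pairwise (conjunction) features
    x1, x2 = x
    return np.array([x1, x2, x1 * x2, x1 * o], dtype=float)

def conditional_prob(x, o, theta):
    # unnormalized score exp(theta^T f(x, o)) for the queried assignment
    score = np.exp(theta @ feature_vector(x, o))
    # partition function Z(theta): sum over all assignments of X
    Z = sum(np.exp(theta @ feature_vector(xp, o))
            for xp in itertools.product([0, 1], repeat=2))
    return score / Z

print(conditional_prob((1, 0), o=1, theta=np.array([0.5, -0.2, 0.1, 0.3])))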
29 After applying each template to all the nodes in a graph, we get a graphical model with a large number of features (i. [sent-62, score-0.12]
30 In the context of relational learning, the templates can be defined similarly, except that they have richer representations, with multiple types of entities and neighboring relations. [sent-66, score-0.642]
31 Because of its singularity at the origin, the ℓ1 -norm can yield a sparse estimate, which is a desired property for hidden variable discovery, as we shall see. [sent-68, score-0.16]
32 4 Relation Tree-Based RMNs In the following, we formally define the treeRMN model with relation tree templates, which is very general and applicable to a wide range of applications. [sent-86, score-0.165]
33 T = {Ti } is a set of entity types which include both basic entity types (e. [sent-88, score-0.562]
34 Each entity type is associated with a set of attributes A(T ) = {T. [sent-93, score-0.332]
35 For each argument of a composite entity type, we define two relations, one with outward direction (e. [sent-100, score-0.37]
36 We further introduce a Twin relation, which connects a composite entity type to itself. [sent-106, score-0.429]
37 In principle, we can define other types of relations, such as those corresponding to functions in second-order logic (e. [sent-108, score-0.14]
38 An entity relation graph G = IE(S) (Figure 1, right) is the instantiation of schema S on a set of basic entities E = {ei}. [sent-111, score-0.635]
39 We define the instantiation of a basic entity type T as IE(T) = {e : e.T = T}, [sent-112, score-0.322]
40 and similarly for a composite type IE(T = ⟨T1, . . . [sent-113, score-0.15]
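The instantiation step can be pictured with a small sketch: a basic type collects the entities declared with that type, and a composite type ⟨T1, T2⟩ is instantiated as all ordered pairs drawn from its argument instantiations (consistent with the later remark that the number of composite entities is the square of the number of basic entities). The dictionary layout and names here are illustrative assumptions, not the paper's data structures.

def instantiate_basic(entities, T):
    # I_E(T) = {e : e.T == T} for a basic entity type T
    return [e for e in entities if e["type"] == T]

def instantiate_composite(entities, T1, T2):
    # I_E(<T1, T2>): all ordered pairs of basic entities of the argument types
    return [(e1, e2)
            for e1 in instantiate_basic(entities, T1)
            for e2 in instantiate_basic(entities, T2)]

people = [{"type": "Person", "name": "ann"}, {"type": "Person", "name": "bob"}]
print(len(instantiate_composite(people, "Person", "Person")))  # 4 = 2 * 2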
41 Xi } that correspond to the set of attributes of its entity type A(e. [sent-123, score-0.332]
42 For a composite entity that consists of two entities of the same type, we’d like to capture its correlation with its twin, the composite entity made of the same basic entities but in reversed order. [sent-125, score-0.2]
43 Therefore, we add the Twin relation between all pairs of twin entities: e. [sent-126, score-0.195]
44 Figure 2: Two-level relation trees for the Person type (left) and the ⟨Person, Person⟩ type (right). [sent-129, score-0.473]
45 Given a schema, we can conveniently express how one entity can reach another entity by the concept of a relation path. [sent-130, score-0.599]
46 A relation path P is a sequence of relations R1 . [sent-131, score-0.18]
47 , the path Person is actually equivalent to the path Person −PC1→ ⟨Person, Class⟩ −→ Person. [sent-152, score-0.574]
48 To avoid creating these uninteresting paths, we add a constraint to outward composite relations (e. [sent-153, score-0.171]
49 We also require that the Twin relation not be combined with any other relation. [sent-156, score-0.143]
50 Now, the Markov blanket of an entity e ∈ T can be concisely defined by the set of all relation paths with domain T and of length ≤ ℓ (as shown in Figure 2). [sent-157, score-0.434]
51 We call this set the relation tree of type T, and denote it as Tree(T, ℓ) = {P}. [sent-158, score-0.187]
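A relation tree of this kind can be enumerated directly from the schema by a breadth-first walk over relations, keeping every path of length at most ℓ that starts at the given domain type. The sketch below assumes the schema is given as an adjacency map from a type to its (relation name, target type) edges; for brevity it omits the constraints on outward composite relations and on the Twin relation mentioned above, and the relation names are only illustrative.

def relation_tree(schema, domain_type, max_len):
    # schema: dict mapping a type to a list of (relation_name, target_type) pairs
    # returns Tree(domain_type, max_len) as a list of relation-name sequences
    paths = []
    frontier = [((), domain_type)]
    for _ in range(max_len):
        next_frontier = []
        for path, current_type in frontier:
            for relation, target_type in schema.get(current_type, []):
                new_path = path + (relation,)
                paths.append(new_path)
                next_frontier.append((new_path, target_type))
        frontier = next_frontier
    return paths

schema = {"Person": [("PP1", "<Person,Person>"), ("PC1", "<Person,Class>")],
          "<Person,Person>": [("PP1-1", "Person"), ("PP2-1", "Person")]}
print(relation_tree(schema, "Person", 2))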
52 This template can be applied to any entity e of type T in the entity relation graph. [sent-161, score-0.73]
53 This template can be applied to any entity pair (e1 , e2 ), where e1 . [sent-166, score-0.325]
54 We define e.P as the set of entities reachable from entity e ∈ T through the relation path P. [sent-170, score-0.568]
55 In our current implementation, we systematically enumerate all possible unary and pairwise templates. [sent-175, score-0.136]
56 Given the above concepts, we define a treeRMN model M = (G, f , θ) as the tuple of an entity relation graph G, a set of feature functions f , and their weights θ. [sent-176, score-0.387]
57 Each feature function fk counts the number of times the k-th template fires in G. [sent-177, score-0.138]
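To make the counting concrete, a pairwise template can be read as "entity e of type T has attribute value a, and some entity reached from e along relation path P has attribute value b"; the corresponding feature value is the number of such co-occurrences in the graph. The helpers and names in this sketch are assumptions made for illustration, not the paper's implementation.

def pairwise_template_count(entities_of_T, attr, value, path, attr2, value2,
                            get_attr, follow_path):
    # get_attr(e, attr) -> value of attribute attr on entity e
    # follow_path(e, path) -> entities reachable from e along the relation path
    count = 0
    for e in entities_of_T:
        if get_attr(e, attr) != value:
            continue
        for e2 in follow_path(e, path):
            if get_attr(e2, attr2) == value2:
                count += 1
    return count

# toy usage: count smoker-smoker pairs linked by a (hypothetical) Friend path
smokes = {"ann": 1, "bob": 1, "carl": 0}
friends = {"ann": ["bob"], "bob": ["ann", "carl"], "carl": []}
print(pairwise_template_count(["ann", "bob", "carl"], "smokes", 1, ("Friend",),
                              "smokes", 1,
                              get_attr=lambda e, a: smokes[e],
                              follow_path=lambda e, p: friends[e]))  # 2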
58 Generally, the complexity of inference is exponential in the depth of the relation trees, because both the number of templates and the sizes of their Markov blankets grow exponentially w. [sent-178, score-0.18]
59 However, the limited expressive power of treeRMN can be effectively compensated for by detecting hidden variables, which is another key component of our relational learning approach, as explained in the next section. [sent-184, score-0.543]
60 We give closed-form expressions of the likelihood gain and the weights of newly added features under the contrastive divergence approximation [23] (other types of inference can be handled similarly). [sent-189, score-0.218]
61 Consider introducing a new HV H to the entity type T . [sent-191, score-0.296]
62 Here we assume that any feature f ∈ fH is in the pairwise form f_{H=1 ∧ A=a}, where a is an assignment to one of the existing variables A in the relation tree of type T. [sent-194, score-0.301]
63 For easy evaluation of H, we set its mean field variational parameters µH to either 0 or 1 on the entities of type T . [sent-196, score-0.24]
64 Therefore, a candidate HV can be represented as (I, fH , θH ), where I is the set of indices to the entities with µH = 1. [sent-198, score-0.225]
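Read this way, a candidate hidden variable is just a subset of the entities of type T on which the new variable is switched on, together with the features attached to it and their weights. A minimal container (names assumed, not from the paper) might look like:

from dataclasses import dataclass, field

@dataclass
class CandidateHV:
    # indices of the entities of type T whose mean-field parameter mu_H is set to 1
    active_entities: set
    # pairwise features of the form f_{H=1 and A=a} attached to the new variable
    features: list = field(default_factory=list)
    # weights theta_H for those features
    weights: list = field(default_factory=list)

candidate = CandidateHV(active_entities={0, 3, 7})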
65 large β) smoothly shrinks both the (estimated) likelihood gain and the feature weights, while the non-differentiable ℓ1-norm not only shrinks the estimated gain and feature weights but also drives features to have zero gains, and can therefore automatically select features. [sent-207, score-0.188]
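The qualitative difference between the two penalties can be illustrated with a generic shrinkage step: a ridge-style (ℓ2) penalty rescales every estimated weight smoothly, while ℓ1 soft-thresholding sets small weights exactly to zero, which is what makes feature (and hence hidden-variable) selection automatic. This is a standard illustration under simplified assumptions, not the paper's closed-form gain expressions.

import numpy as np

def l2_shrink(raw_weights, beta):
    # ridge-style shrinkage: every weight is scaled down smoothly, none become zero
    return raw_weights / (1.0 + beta)

def l1_shrink(raw_weights, lam):
    # soft-thresholding: weights with |w| <= lam are driven exactly to zero
    return np.sign(raw_weights) * np.maximum(np.abs(raw_weights) - lam, 0.0)

w = np.array([0.8, -0.05, 0.02, -1.3])
print(l2_shrink(w, beta=1.0))  # all entries shrunk, all still nonzero
print(l1_shrink(w, lam=0.1))   # small entries zeroed out -> sparse selection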
66 Remark 2: Currently, we only induce HVs for basic entity types. [sent-227, score-0.271]
67 Extension to composite types can show interesting ternary relations such as “Abnormality can be PartOf Animals”. [sent-228, score-0.168]
68 We demonstrate that CVI can discover semantically meaningful hidden variables, which can significantly improve the speed and quality of treeRMN models. [sent-231, score-0.154]
69 These datasets are commonly used by previous work in relational learning [9][11][20][14]. [sent-234, score-0.389]
70 It consists exclusively of unary predicates of the form A(a), where A is an attribute and a is an animal (e. [sent-236, score-0.34]
71 This is a simple propositional dataset with no relational structure, but is useful as a base case for comparison. [sent-239, score-0.584]
Table 1: Number of entities (#E) and attributes (#A) for each dataset. Animal: 50 80 0 0; Nation: 14 111 196 56; Kinship: 104 0 10,816 1* (the Kinship data has only one attribute, which has 26 possible values).
72 The binary predicates are of the form R(n1, n2), where n1, n2 are nations and R is a relation between them. [sent-242, score-0.296]
73 The unary predicates are of the form A(n), where n is a nation and A is an attribute (e. [sent-246, score-0.337]
74 It consists of binary predicates of the form R(c1 , c2 ), where c1 and c2 are biomedical concepts and R is a relation between them (e. [sent-250, score-0.269]
75 The Kinship dataset contains kinship relationships among members of the Alyawarra tribe from Central Australia. [sent-253, score-0.164]
76 Predicates are of the form R(p1 , p2 ), where R is a kinship term and p1 , p2 are persons. [sent-254, score-0.164]
77 Except for the animal data, the number of composite entities is the square of the number of basic entities. [sent-255, score-0.396]
78 2 Characterization of treeRMN and CVI In this section, we analyze the properties of the discovered hidden variables and demonstrate the behavior of the CVI algorithm. [sent-257, score-0.169]
79 For the simple non-relational Animal data, if we start with a full model with all pairwise features, CVI will decide not to introduce any hidden variables. [sent-258, score-0.158]
80 If we run CVI starting from a model with only unary features, however, CVI decides to introduce one hidden variable H0 with 8 categories. [sent-259, score-0.261]
81 Table 2 shows the associated entities and features for the first four categories. [sent-260, score-0.229]
82 Table 2: The associated entities and features (sorted by decreasing magnitude of feature weights) for the first four categories of the induced hidden variable. [sent-286, score-0.422]
83 Table 3: The associated entities and features (sorted by decreasing magnitude of feature weights) for the first three categories of the induced hidden variable. [sent-310, score-0.447]
84 For the three relational datasets, we use UML as an example. [sent-314, score-0.359]
85 CVI induces two multinomial hidden variables H0 and H1. [sent-318, score-0.169]
86 As we can see from Figure 3, the inclusion of each hidden variable significantly [sent-321, score-0.16]
87 represent the hidden concepts Abnormalities, Animals and Plants. [sent-328, score-0.165]
88 3 Overall Performance Now we present quantitative evaluation of the treeRMN model, and compare it with other relational learning methods including MLN structure learning (MSL) [10], Infinite Relational Models (IRM) [9] and Multiple Relational Clustering (MRC) [11]. [sent-335, score-0.359]
89 At each run, we treat one fold as hidden during training, and then evaluate the prediction of these variables conditioned on the observed variables during testing. [sent-338, score-0.215]
90 Table 4 compares the overall performance of treeRMN (RMN), treeRMN with hidden variable discovery (RMNCVI), and other relational models (MSL, IRM and MRC) as reported in [11]. [sent-342, score-0.519]
91 § The CLL of kinship data is not comparable to previous approaches, because we treat each of its labels as one variable with 26 categories instead of 26 binary variables. [sent-466, score-0.226]
92 On all three relational datasets, treeRMNs with CVI can significantly improve CLL and AUC. [sent-471, score-0.359]
93 Since both UML and Kinship data have no attributes in basic entity types, composite entities become more important to model. [sent-478, score-0.595]
94 Therefore, we suspect that the MRC model achieves better performance because it can perform clustering on two-argument predicates, which correspond to composite entities. [sent-479, score-0.261]
95 By using simple treeRMNs, we achieve computational efficiency, and CVI can effectively detect hidden variables, which compensates for the limited expressive power of treeRMNs. [sent-481, score-0.184]
96 Experiments on four real datasets show that the proposed relational learning approach can achieve state-of-the-art prediction accuracy and is much faster than existing relational Markov network models. [sent-482, score-0.748]
97 Second, as we have explained at the end of section 5, we’d like to apply HVD on composite entity types. [sent-485, score-0.344]
98 Third, we’d also like to treat the introduced hidden variables as free variables and to make their cardinalities adaptive. [sent-486, score-0.238]
99 Predictive modeling using features derived from paths in relational graphs. [sent-513, score-0.425]
100 Learning systems of concepts with an infinite relational model. [sent-521, score-0.401]
wordName wordTfidf (topN-words)
[('treermn', 0.418), ('relational', 0.359), ('cvi', 0.314), ('erson', 0.262), ('entity', 0.245), ('entities', 0.189), ('kinship', 0.164), ('rmn', 0.157), ('cll', 0.134), ('treermns', 0.134), ('hidden', 0.123), ('uml', 0.119), ('mrc', 0.118), ('predicates', 0.118), ('relation', 0.109), ('ie', 0.104), ('unary', 0.101), ('composite', 0.099), ('dom', 0.096), ('hvs', 0.09), ('msl', 0.09), ('contrastive', 0.086), ('irm', 0.084), ('animal', 0.082), ('template', 0.08), ('nation', 0.079), ('rmncv', 0.075), ('kok', 0.072), ('rmns', 0.072), ('templates', 0.071), ('logic', 0.071), ('markov', 0.068), ('induction', 0.067), ('mns', 0.066), ('schema', 0.066), ('expressive', 0.061), ('fh', 0.059), ('nir', 0.053), ('twin', 0.052), ('type', 0.051), ('networks', 0.049), ('dim', 0.049), ('relations', 0.046), ('variables', 0.046), ('auc', 0.046), ('clustering', 0.044), ('pedro', 0.043), ('hv', 0.043), ('concepts', 0.042), ('gain', 0.041), ('daphne', 0.041), ('features', 0.04), ('attribute', 0.039), ('stanley', 0.039), ('increment', 0.039), ('variable', 0.037), ('py', 0.037), ('candidate', 0.036), ('attributes', 0.036), ('pairwise', 0.035), ('win', 0.034), ('feature', 0.033), ('uai', 0.033), ('eld', 0.032), ('elidan', 0.032), ('semantically', 0.031), ('domingos', 0.031), ('compensate', 0.031), ('gal', 0.031), ('ei', 0.03), ('datasets', 0.03), ('blanket', 0.03), ('fish', 0.03), ('hvd', 0.03), ('nations', 0.03), ('paws', 0.03), ('raccoon', 0.03), ('swims', 0.03), ('toughskin', 0.03), ('vegetation', 0.03), ('range', 0.029), ('res', 0.028), ('pp', 0.027), ('tree', 0.027), ('animals', 0.027), ('exibility', 0.026), ('paths', 0.026), ('dolphin', 0.026), ('lao', 0.026), ('outward', 0.026), ('basic', 0.026), ('path', 0.025), ('categories', 0.025), ('fk', 0.025), ('abnormalities', 0.024), ('mlns', 0.024), ('concisely', 0.024), ('types', 0.023), ('plants', 0.023), ('cardinalities', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.19921711 83 nips-2010-Evidence-Specific Structures for Rich Tractable CRFs
Author: Anton Chechetka, Carlos Guestrin
Abstract: We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer a large expressive power penalty inherent to fixed tractable structures. On real-life relational datasets, our approach matches or exceeds state of the art accuracy of the dense models, and at the same time provides an order of magnitude speedup. 1
3 0.17906117 67 nips-2010-Dynamic Infinite Relational Model for Time-varying Relational Data Analysis
Author: Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, Joshua B. Tenenbaum
Abstract: We propose a new probabilistic model for analyzing dynamic evolutions of relational data, such as additions, deletions and split & merge, of relation clusters like communities in social networks. Our proposed model abstracts observed timevarying object-object relationships into relationships between object clusters. We extend the infinite Hidden Markov model to follow dynamic and time-sensitive changes in the structure of the relational data and to estimate a number of clusters simultaneously. We show the usefulness of the model through experiments with synthetic and real-world data sets.
4 0.10228697 128 nips-2010-Infinite Relational Modeling of Functional Connectivity in Resting State fMRI
Author: Morten Mørup, Kristoffer Madsen, Anne-marie Dogonowski, Hartwig Siebner, Lars K. Hansen
Abstract: Functional magnetic resonance imaging (fMRI) can be applied to study the functional connectivity of the neural elements which form complex network at a whole brain level. Most analyses of functional resting state networks (RSN) have been based on the analysis of correlation between the temporal dynamics of various regions of the brain. While these models can identify coherently behaving groups in terms of correlation they give little insight into how these groups interact. In this paper we take a different view on the analysis of functional resting state networks. Starting from the definition of resting state as functional coherent groups we search for functional units of the brain that communicate with other parts of the brain in a coherent manner as measured by mutual information. We use the infinite relational model (IRM) to quantify functional coherent groups of resting state networks and demonstrate how the extracted component interactions can be used to discriminate between functional resting state activity in multiple sclerosis and normal subjects. 1
5 0.081437975 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
Author: Matthias Broecheler, Lise Getoor
Abstract: Continuous Markov random fields are a general formalism to model joint probability distributions over events with continuous outcomes. We prove that marginal computation for constrained continuous MRFs is #P-hard in general and present a polynomial-time approximation scheme under mild assumptions on the structure of the random field. Moreover, we introduce a sampling algorithm to compute marginal distributions and develop novel techniques to increase its efficiency. Continuous MRFs are a general purpose probabilistic modeling tool and we demonstrate how they can be applied to statistical relational learning. On the problem of collective classification, we evaluate our algorithm and show that the standard deviation of marginals serves as a useful measure of confidence. 1
6 0.078748614 78 nips-2010-Error Propagation for Approximate Policy and Value Iteration
7 0.060934171 159 nips-2010-Lifted Inference Seen from the Other Side : The Tractable Features
8 0.054295912 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
9 0.051574696 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data
10 0.050425321 144 nips-2010-Learning Efficient Markov Networks
11 0.044265404 162 nips-2010-Link Discovery using Graph Feature Tracking
12 0.043665688 235 nips-2010-Self-Paced Learning for Latent Variable Models
13 0.043336745 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
14 0.042956974 99 nips-2010-Gated Softmax Classification
15 0.04151696 101 nips-2010-Gaussian sampling by local perturbations
16 0.041073479 206 nips-2010-Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
17 0.041057602 135 nips-2010-Label Embedding Trees for Large Multi-Class Tasks
18 0.040024258 230 nips-2010-Robust Clustering as Ensembles of Affinity Relations
19 0.039104469 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits
20 0.037940975 273 nips-2010-Towards Property-Based Classification of Clustering Paradigms
topicId topicWeight
[(0, 0.131), (1, 0.03), (2, -0.026), (3, 0.009), (4, -0.108), (5, -0.031), (6, 0.007), (7, -0.031), (8, 0.069), (9, -0.008), (10, -0.09), (11, -0.011), (12, 0.055), (13, -0.056), (14, 0.05), (15, -0.048), (16, 0.021), (17, -0.001), (18, -0.024), (19, -0.083), (20, -0.022), (21, 0.09), (22, 0.101), (23, -0.026), (24, -0.135), (25, 0.051), (26, -0.056), (27, -0.015), (28, -0.03), (29, 0.062), (30, -0.12), (31, -0.228), (32, -0.133), (33, -0.056), (34, -0.121), (35, 0.086), (36, 0.003), (37, 0.153), (38, -0.122), (39, -0.01), (40, -0.044), (41, -0.008), (42, 0.045), (43, -0.002), (44, -0.079), (45, 0.079), (46, 0.149), (47, 0.146), (48, -0.044), (49, 0.11)]
simIndex simValue paperId paperTitle
same-paper 1 0.9234581 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.77257657 67 nips-2010-Dynamic Infinite Relational Model for Time-varying Relational Data Analysis
Author: Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, Joshua B. Tenenbaum
Abstract: We propose a new probabilistic model for analyzing dynamic evolutions of relational data, such as additions, deletions and split & merge, of relation clusters like communities in social networks. Our proposed model abstracts observed timevarying object-object relationships into relationships between object clusters. We extend the infinite Hidden Markov model to follow dynamic and time-sensitive changes in the structure of the relational data and to estimate a number of clusters simultaneously. We show the usefulness of the model through experiments with synthetic and real-world data sets.
3 0.76440179 83 nips-2010-Evidence-Specific Structures for Rich Tractable CRFs
Author: Anton Chechetka, Carlos Guestrin
Abstract: We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer a large expressive power penalty inherent to fixed tractable structures. On real-life relational datasets, our approach matches or exceeds state of the art accuracy of the dense models, and at the same time provides an order of magnitude speedup. 1
4 0.58428693 159 nips-2010-Lifted Inference Seen from the Other Side : The Tractable Features
Author: Abhay Jha, Vibhav Gogate, Alexandra Meliou, Dan Suciu
Abstract: Lifted Inference algorithms for representations that combine first-order logic and graphical models have been the focus of much recent research. All lifted algorithms developed to date are based on the same underlying idea: take a standard probabilistic inference algorithm (e.g., variable elimination, belief propagation etc.) and improve its efficiency by exploiting repeated structure in the first-order model. In this paper, we propose an approach from the other side in that we use techniques from logic for probabilistic inference. In particular, we define a set of rules that look only at the logical representation to identify models for which exact efficient inference is possible. Our rules yield new tractable classes that could not be solved efficiently by any of the existing techniques. 1
5 0.48232126 121 nips-2010-Improving Human Judgments by Decontaminating Sequential Dependencies
Author: Harold Pashler, Matthew Wilder, Robert Lindsey, Matt Jones, Michael C. Mozer, Michael P. Holmes
Abstract: For over half a century, psychologists have been struck by how poor people are at expressing their internal sensations, impressions, and evaluations via rating scales. When individuals make judgments, they are incapable of using an absolute rating scale, and instead rely on reference points from recent experience. This relativity of judgment limits the usefulness of responses provided by individuals to surveys, questionnaires, and evaluation forms. Fortunately, the cognitive processes that transform internal states to responses are not simply noisy, but rather are influenced by recent experience in a lawful manner. We explore techniques to remove sequential dependencies, and thereby decontaminate a series of ratings to obtain more meaningful human judgments. In our formulation, decontamination is fundamentally a problem of inferring latent states (internal sensations) which, because of the relativity of judgment, have temporal dependencies. We propose a decontamination solution using a conditional random field with constraints motivated by psychological theories of relative judgment. Our exploration of decontamination models is supported by two experiments we conducted to obtain ground-truth rating data on a simple length estimation task. Our decontamination techniques yield an over 20% reduction in the error of human judgments. 1
6 0.41148955 215 nips-2010-Probabilistic Deterministic Infinite Automata
7 0.38286892 144 nips-2010-Learning Efficient Markov Networks
8 0.35199061 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
9 0.34305692 1 nips-2010-(RF)^2 -- Random Forest Random Field
10 0.33218551 128 nips-2010-Infinite Relational Modeling of Functional Connectivity in Resting State fMRI
11 0.32705289 155 nips-2010-Learning the context of a category
12 0.32584089 62 nips-2010-Discriminative Clustering by Regularized Information Maximization
13 0.29781848 32 nips-2010-Approximate Inference by Compilation to Arithmetic Circuits
14 0.29612675 188 nips-2010-On Herding and the Perceptron Cycling Theorem
15 0.28854948 13 nips-2010-A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction
16 0.27522641 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
17 0.25794458 171 nips-2010-Movement extraction by detecting dynamics switches and repetitions
18 0.25719351 242 nips-2010-Slice sampling covariance hyperparameters of latent Gaussian models
19 0.25422883 9 nips-2010-A New Probabilistic Model for Rank Aggregation
20 0.25399283 206 nips-2010-Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
topicId topicWeight
[(13, 0.03), (17, 0.014), (27, 0.064), (30, 0.046), (35, 0.025), (41, 0.356), (45, 0.192), (50, 0.062), (52, 0.02), (60, 0.035), (77, 0.029), (78, 0.012), (90, 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.73024511 71 nips-2010-Efficient Relational Learning with Hidden Variable Detection
Author: Ni Lao, Jun Zhu, Liu Xinwang, Yandong Liu, William W. Cohen
Abstract: Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at a sharp price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; and on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve similar prediction quality as the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improve the training speed and prediction qualities of treeRMNs.
2 0.6626327 176 nips-2010-Multiple Kernel Learning and the SMO Algorithm
Author: Zhaonan Sun, Nawanol Ampornpunt, Manik Varma, S.v.n. Vishwanathan
Abstract: Our objective is to train p-norm Multiple Kernel Learning (MKL) and, more generally, linear MKL regularised by the Bregman divergence, using the Sequential Minimal Optimization (SMO) algorithm. The SMO algorithm is simple, easy to implement and adapt, and efficiently scales to large problems. As a result, it has gained widespread acceptance and SVMs are routinely trained using SMO in diverse real world applications. Training using SMO has been a long standing goal in MKL for the very same reasons. Unfortunately, the standard MKL dual is not differentiable, and therefore can not be optimised using SMO style co-ordinate ascent. In this paper, we demonstrate that linear MKL regularised with the p-norm squared, or with certain Bregman divergences, can indeed be trained using SMO. The resulting algorithm retains both simplicity and efficiency and is significantly faster than state-of-the-art specialised p-norm MKL solvers. We show that we can train on a hundred thousand kernels in approximately seven minutes and on fifty thousand points in less than half an hour on a single core. 1
3 0.6048755 138 nips-2010-Large Margin Multi-Task Metric Learning
Author: Shibin Parameswaran, Kilian Q. Weinberger
Abstract: Multi-task learning (MTL) improves the prediction performance on multiple, different but related, learning problems through shared parameters or representations. One of the most prominent multi-task learning algorithms is an extension to support vector machines (svm) by Evgeniou et al. [15]. Although very elegant, multi-task svm is inherently restricted by the fact that support vector machines require each class to be addressed explicitly with its own weight vector which, in a multi-task setting, requires the different learning tasks to share the same set of classes. This paper proposes an alternative formulation for multi-task learning by extending the recently published large margin nearest neighbor (lmnn) algorithm to the MTL paradigm. Instead of relying on separating hyperplanes, its decision function is based on the nearest neighbor rule which inherently extends to many classes and becomes a natural fit for multi-task learning. We evaluate the resulting multi-task lmnn on real-world insurance data and speech classification problems and show that it consistently outperforms single-task kNN under several metrics and state-of-the-art MTL classifiers. 1
4 0.52757221 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
Author: Dahua Lin, Eric Grimson, John W. Fisher
Abstract: We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models. 1
5 0.52656782 63 nips-2010-Distributed Dual Averaging In Networks
Author: Alekh Agarwal, Martin J. Wainwright, John C. Duchi
Abstract: The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. We develop and analyze distributed algorithms based on dual averaging of subgradients, and provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis clearly separates the convergence of the optimization algorithm itself from the effects of communication constraints arising from the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and simulations for various networks. 1
6 0.52623528 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
7 0.526205 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models
8 0.52610093 155 nips-2010-Learning the context of a category
9 0.52591652 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
10 0.52580971 158 nips-2010-Learning via Gaussian Herding
11 0.52523428 287 nips-2010-Worst-Case Linear Discriminant Analysis
12 0.52433068 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs
13 0.52362788 12 nips-2010-A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups
14 0.52359676 239 nips-2010-Sidestepping Intractable Inference with Structured Ensemble Cascades
15 0.52337688 277 nips-2010-Two-Layer Generalization Analysis for Ranking Using Rademacher Average
16 0.52333933 217 nips-2010-Probabilistic Multi-Task Feature Selection
17 0.52297854 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
18 0.52233857 164 nips-2010-MAP Estimation for Graphical Models by Likelihood Maximization
19 0.52226913 13 nips-2010-A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction
20 0.52223247 150 nips-2010-Learning concept graphs from text with stick-breaking priors