JMLR 2006 knowledge-graph by maker-knowledge-mining

JMLR 2006 knowledge graph


Similar papers computed by the TF-IDF model


Similar papers computed by the LSI model


Similar papers computed by the LDA model
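
As an illustration of how such similarity lists can be produced, the sketch below builds TF-IDF, LSI, and LDA representations of the paper abstracts with scikit-learn and compares papers by cosine similarity. The repository's actual pipeline is not shown here, so the input list (abstracts) and all parameter choices are assumptions.

    from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
    from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation
    from sklearn.metrics.pairwise import cosine_similarity

    # abstracts: one string per paper (placeholder texts; hypothetical variable)
    abstracts = [
        "sparse kernel learning algorithms and large margin classifiers",
        "AMP chain graphs and Markov equivalence classes",
        "a hierarchy of support vector machines for pattern detection",
    ]

    # TF-IDF similarity: cosine similarity of TF-IDF vectors
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
    sim_tfidf = cosine_similarity(tfidf)

    # LSI similarity: truncated SVD of the TF-IDF matrix
    lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
    sim_lsi = cosine_similarity(lsi)

    # LDA similarity: cosine similarity of per-document topic distributions
    counts = CountVectorizer(stop_words="english").fit_transform(abstracts)
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
    sim_lda = cosine_similarity(lda)

    print(sim_tfidf.round(2), sim_lsi.round(2), sim_lda.round(2), sep="\n\n")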


Papers list:

1 jmlr-2006-A Direct Method for Building Sparse Kernel Learning Algorithms

Author: Mingrui Wu, Bernhard Schölkopf, Gökhan Bakır

Abstract: Many kernel learning algorithms, including support vector machines, result in a kernel machine, such as a kernel classifier, whose key component is a weight vector in a feature space implicitly introduced by a positive definite kernel function. This weight vector is usually obtained by solving a convex optimization problem. Based on this fact we present a direct method to build sparse kernel learning algorithms by adding one more constraint to the original convex optimization problem, such that the sparseness of the resulting kernel machine is explicitly controlled while at the same time performance is kept as high as possible. A gradient based approach is provided to solve this modified optimization problem. Applying this method to the support vector machine results in a concrete algorithm for building sparse large margin classifiers. These classifiers essentially find a discriminating subspace that can be spanned by a small number of vectors, and in this subspace, the different classes of data are linearly well separated. Experimental results over several classification benchmarks demonstrate the effectiveness of our approach. Keywords: sparse learning, sparse large margin classifiers, kernel learning algorithms, support vector machine, kernel Fisher discriminant

2 jmlr-2006-A Graphical Representation of Equivalence Classes of AMP Chain Graphs

Author: Alberto Roverato, Milan Studený

Abstract: This paper deals with chain graph models under alternative AMP interpretation. A new representative of an AMP Markov equivalence class, called the largest deflagged graph, is proposed. The representative is based on revealed internal structure of the AMP Markov equivalence class. More specifically, the AMP Markov equivalence class decomposes into finer strong equivalence classes and there exists a distinguished strong equivalence class among those forming the AMP Markov equivalence class. The largest deflagged graph is the largest chain graph in that distinguished strong equivalence class. A composed graphical procedure to get the largest deflagged graph on the basis of any AMP Markov equivalent chain graph is presented. In general, the largest deflagged graph differs from the AMP essential graph, which is another representative of the AMP Markov equivalence class. Keywords: chain graph, AMP Markov equivalence, strong equivalence, largest deflagged graph, component merging procedure, deflagging procedure, essential graph

3 jmlr-2006-A Hierarchy of Support Vector Machines for Pattern Detection

Author: Hichem Sahbi, Donald Geman

Abstract: We introduce a computational design for pattern detection based on a tree-structured network of support vector machines (SVMs). An SVM is associated with each cell in a recursive partitioning of the space of patterns (hypotheses) into increasingly finer subsets. The hierarchy is traversed coarse-to-fine and each chain of positive responses from the root to a leaf constitutes a detection. Our objective is to design and build a network which balances overall error and computation. Initially, SVMs are constructed for each cell with no constraints. This “free network” is then perturbed, cell by cell, into another network, which is “graded” in two ways: first, the number of support vectors of each SVM is reduced (by clustering) in order to adjust to a pre-determined, increasing function of cell depth; second, the decision boundaries are shifted to preserve all positive responses from the original set of training data. The limits on the numbers of clusters (virtual support vectors) result from minimizing the mean computational cost of collecting all detections subject to a bound on the expected number of false positives. When applied to detecting faces in cluttered scenes, the patterns correspond to poses and the free network is already faster and more accurate than applying a single pose-specific SVM many times. The graded network promotes very rapid processing of background regions while maintaining the discriminatory power of the free network. Keywords: statistical learning, hierarchy of classifiers, coarse-to-fine computation, support vector machines, face detection

4 jmlr-2006-A Linear Non-Gaussian Acyclic Model for Causal Discovery

Author: Shohei Shimizu, Patrik O. Hoyer, Aapo Hyvärinen, Antti Kerminen

Abstract: In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data. Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-Gaussian distributions of non-zero variances. The solution relies on the use of the statistical method known as independent component analysis, and does not require any pre-specified time-ordering of the variables. We provide a complete Matlab package for performing this LiNGAM analysis (short for Linear Non-Gaussian Acyclic Model), and demonstrate the effectiveness of the method using artificially generated data and real-world data. Keywords: independent component analysis, non-Gaussianity, causal discovery, directed acyclic graph, non-experimental data
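
The core identification step can be sketched in a few lines: run ICA, then resolve the row permutation and scaling of the unmixing matrix to recover the connection matrix. The sketch below uses scikit-learn's FastICA rather than the authors' Matlab package, brute-forces the permutation (feasible only for a handful of variables), and omits the pruning and causal-ordering steps of the full LiNGAM procedure; treat it as an assumption-laden illustration.

    import numpy as np
    from itertools import permutations
    from sklearn.decomposition import FastICA

    def lingam_estimate(X):
        """Bare-bones LiNGAM-style estimate of B in x = Bx + e (illustrative only).

        ICA returns an unmixing matrix W ~ P D (I - B) with unknown row
        permutation P and scaling D. We pick the permutation that keeps the
        diagonal away from zero, rescale rows to unit diagonal, and read off B.
        """
        d = X.shape[1]
        W = FastICA(n_components=d, random_state=0, max_iter=2000).fit(X).components_
        best = max(permutations(range(d)),
                   key=lambda p: np.sum(np.log(np.abs(W[list(p), range(d)]) + 1e-12)))
        Wp = W[list(best)]
        Wp = Wp / np.diag(Wp)[:, None]   # rescale rows so the diagonal is one
        return np.eye(d) - Wp

    # toy usage: x2 <- 0.8 * x1 with uniform (non-Gaussian) disturbances
    rng = np.random.default_rng(0)
    e = rng.uniform(-1.0, 1.0, size=(5000, 2))
    x1 = e[:, 0]
    x2 = 0.8 * x1 + e[:, 1]
    B_hat = lingam_estimate(np.column_stack([x1, x2]))
    print(np.round(B_hat, 2))   # off-diagonal entry B[1, 0] should be close to 0.8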

5 jmlr-2006-A Robust Procedure For Gaussian Graphical Model Search From Microarray Data With p Larger Than n

Author: Robert Castelo, Alberto Roverato

Abstract: Learning of large-scale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a consequence, the prime objects of inference are full-order partial correlations, which are partial correlations between two variables given the remaining ones. In the context of microarray data the number of variables exceeds the sample size and this precludes the application of traditional structure learning procedures because a sampling version of full-order partial correlations does not exist. In this paper we consider limited-order partial correlations, that is, partial correlations computed on marginal distributions of manageable size, and we provide a set of rules that allow one to assess the usefulness of these quantities to derive the independence structure of the underlying Gaussian graphical model. Furthermore, we introduce a novel structure learning procedure based on a quantity, obtained from limited-order partial correlations, that we call the non-rejection rate. The applicability and usefulness of the procedure are demonstrated by both simulated and real data. Keywords: Gaussian distribution, gene network, graphical model, microarray data, non-rejection rate, partial correlation, small-sample inference
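
For intuition, a q-order partial correlation can be computed by inverting the sample correlation matrix restricted to the two variables and a conditioning set of size q, which stays well defined even when p exceeds n. The sketch below is an illustration of these building blocks only, not the paper's non-rejection-rate procedure; it enumerates all conditioning sets of size q for every pair of variables.

    import numpy as np
    from itertools import combinations

    def partial_correlation(R, i, j, Q):
        """Partial correlation of i and j given the set Q, from the correlation
        matrix R, via inversion of the submatrix over {i, j} and Q."""
        idx = [i, j] + list(Q)
        P = np.linalg.inv(R[np.ix_(idx, idx)])
        return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

    def limited_order_pcors(X, q):
        """All q-order partial correlations from an n x p data matrix X; usable when
        p > n provided q is small relative to n, so the (q+2)x(q+2) submatrices stay invertible."""
        n, p = X.shape
        R = np.corrcoef(X, rowvar=False)
        out = {}
        for i, j in combinations(range(p), 2):
            others = [k for k in range(p) if k not in (i, j)]
            out[(i, j)] = [partial_correlation(R, i, j, Q)
                           for Q in combinations(others, q)]
        return out

    # toy usage: p = 20 variables, n = 10 samples, first-order partial correlations
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 20))
    pc = limited_order_pcors(X, q=1)
    print(len(pc), "pairs;", len(pc[(0, 1)]), "conditioning sets per pair")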

6 jmlr-2006-A Scoring Function for Learning Bayesian Networks based on Mutual Information and Conditional Independence Tests

Author: Luis M. de Campos

Abstract: We propose a new scoring function for learning Bayesian networks from data using score+search algorithms. This is based on the concept of mutual information and exploits some well-known properties of this measure in a novel way. Essentially, a statistical independence test based on the chi-square distribution, associated with the mutual information measure, together with a property of additive decomposition of this measure, are combined in order to measure the degree of interaction between each variable and its parent variables in the network. The result is a non-Bayesian scoring function called MIT (mutual information tests) which belongs to the family of scores based on information theory. The MIT score also represents a penalization of the Kullback-Leibler divergence between the joint probability distributions associated with a candidate network and with the available data set. Detailed results of a complete experimental evaluation of the proposed scoring function and its comparison with the well-known K2, BDeu and BIC/MDL scores are also presented. Keywords: Bayesian networks, scoring functions, learning, mutual information, conditional independence tests
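
The building block is the classical fact that 2N times the empirical mutual information between a variable and its parents is asymptotically chi-square distributed under independence, so a score can reward mutual information while subtracting a chi-square critical value. The sketch below computes one such local term; it is a simplified illustration (the actual MIT score decomposes the penalty over the parents one at a time, which is not reproduced here).

    import numpy as np
    from collections import Counter
    from scipy.stats import chi2

    def mutual_information(data, i, parents):
        """Empirical mutual information I(X_i; X_parents) from a discrete n x d array."""
        n = len(data)
        xi = data[:, i]
        pa = [tuple(row) for row in data[:, parents]] if parents else [()] * n
        joint, px, ppa = Counter(zip(xi, pa)), Counter(xi), Counter(pa)
        return sum((c / n) * np.log((c / n) / ((px[x] / n) * (ppa[p] / n)))
                   for (x, p), c in joint.items())

    def mit_like_term(data, i, parents, alpha=0.95):
        """2*N*I(X_i; Pa_i) minus a chi-square critical value (simplified MIT-style term)."""
        n = data.shape[0]
        r_i = len(np.unique(data[:, i]))
        q = int(np.prod([len(np.unique(data[:, p])) for p in parents])) if parents else 1
        df = (r_i - 1) * (q - 1)
        penalty = chi2.ppf(alpha, df) if df > 0 else 0.0
        return 2 * n * mutual_information(data, i, parents) - penalty

    # toy usage: X0 depends on X1, while X2 is independent noise
    rng = np.random.default_rng(0)
    x1 = rng.integers(0, 2, size=1000)
    x0 = (x1 + (rng.random(1000) < 0.2)) % 2
    x2 = rng.integers(0, 2, size=1000)
    data = np.column_stack([x0, x1, x2])
    print(mit_like_term(data, 0, [1]), mit_like_term(data, 0, [2]))  # large positive vs. near/below zero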

7 jmlr-2006-A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events

Author: Shalabh Bhatnagar, Vivek S. Borkar, Madhukar Akarapu

Abstract: We study the problem of long-run average cost control of Markov chains conditioned on a rare event. In a related recent work, a simulation based algorithm for estimating performance measures associated with a Markov chain conditioned on a rare event has been developed. We extend ideas from this work and develop an adaptive algorithm for obtaining, online, optimal control policies conditioned on a rare event. Our algorithm uses three timescales or step-size schedules. On the slowest timescale, a gradient search algorithm for policy updates that is based on one-simulation simultaneous perturbation stochastic approximation (SPSA) type estimates is used. Deterministic perturbation sequences obtained from appropriate normalized Hadamard matrices are used here. The fast timescale recursions compute the conditional transition probabilities of an associated chain by obtaining solutions to the multiplicative Poisson equation (for a given policy estimate). Further, the risk parameter associated with the value function for a given policy estimate is updated on a timescale that lies in between the two scales above. We briefly sketch the convergence analysis of our algorithm and present a numerical application in the setting of routing multiple flows in communication networks. Keywords: Markov decision processes, optimal control conditioned on a rare event, simulation based algorithms, SPSA with deterministic perturbations, reinforcement learning

8 jmlr-2006-A Very Fast Learning Method for Neural Networks Based on Sensitivity Analysis

Author: Enrique Castillo, Bertha Guijarro-Berdiñas, Oscar Fontenla-Romero, Amparo Alonso-Betanzos

Abstract: This paper introduces a learning method for two-layer feedforward neural networks based on sensitivity analysis, which uses a linear training algorithm for each of the two layers. First, random values are assigned to the outputs of the first layer; later, these initial values are updated based on sensitivity formulas, which use the weights in each of the layers; the process is repeated until convergence. Since these weights are learnt solving a linear system of equations, there is an important saving in computational time. The method also gives the local sensitivities of the least square errors with respect to input and output data, with no extra computational cost, because the necessary information becomes available without extra calculations. This method, called the Sensitivity-Based Linear Learning Method, can also be used to provide an initial set of weights, which significantly improves the behavior of other learning algorithms. The theoretical basis for the method is given and its performance is illustrated by its application to several examples in which it is compared with several learning algorithms on well-known data sets. The results have shown a learning speed generally faster than other existing methods. In addition, it can be used as an initialization tool for other well-known methods with significant improvements. Keywords: supervised learning, neural networks, linear optimization, least-squares, initialization method, sensitivity analysis

9 jmlr-2006-Accurate Error Bounds for the Eigenvalues of the Kernel Matrix

Author: Mikio L. Braun

Abstract: The eigenvalues of the kernel matrix play an important role in a number of kernel methods, in particular, in kernel principal component analysis. It is well known that the eigenvalues of the kernel matrix converge as the number of samples tends to infinity. We derive probabilistic finite sample size bounds on the approximation error of individual eigenvalues which have the important property that the bounds scale with the eigenvalue under consideration, reflecting the actual behavior of the approximation errors as predicted by asymptotic results and observed in numerical simulations. Such scaling bounds have so far only been known for tail sums of eigenvalues. Asymptotically, the bounds presented here have a slower than stochastic rate, but the number of sample points necessary to make this disadvantage noticeable is often unrealistically large. Therefore, under practical conditions, and for all but the largest few eigenvalues, the bounds presented here form a significant improvement over existing non-scaling bounds. Keywords: kernel matrix, eigenvalues, relative perturbation bounds
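
As a quick numerical illustration of the convergence these bounds quantify (the demo below is not the paper's bound itself), the leading eigenvalues of the scaled kernel matrix K/n stabilize as the sample size grows:

    import numpy as np

    def rbf_kernel(X, gamma=1.0):
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    rng = np.random.default_rng(0)
    for n in (100, 400, 1600):
        X = rng.normal(size=(n, 2))
        eig = np.linalg.eigvalsh(rbf_kernel(X) / n)[::-1]   # eigenvalues of K/n, descending
        print(n, np.round(eig[:5], 4))                      # the top few stabilize with n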

10 jmlr-2006-Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems

Author: Eyal Even-Dar, Shie Mannor, Yishay Mansour

Abstract: We incorporate statistical confidence intervals in both the multi-armed bandit and the reinforcement learning problems. In the bandit problem we show that given n arms, it suffices to pull the arms a total of O((n/ε²) log(1/δ)) times to find an ε-optimal arm with probability of at least 1 − δ. This bound matches the lower bound of Mannor and Tsitsiklis (2004) up to constants. We also devise action elimination procedures in reinforcement learning algorithms. We describe a framework that is based on learning the confidence interval around the value function or the Q-function and eliminating actions that are not optimal (with high probability). We provide model-based and model-free variants of the elimination method. We further derive stopping conditions guaranteeing that the learned policy is approximately optimal with high probability. Simulations demonstrate a considerable speedup and added robustness over ε-greedy Q-learning.
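
The bandit part is in the spirit of elimination algorithms like the one sketched below: sample all surviving arms each round, maintain Hoeffding-style confidence intervals, and drop arms whose upper bound falls below the best lower bound. The confidence radius and constants here are generic choices, not the paper's.

    import numpy as np

    def action_elimination(pull, n_arms, eps=0.1, delta=0.05):
        """Eliminate arms using confidence intervals until an eps-optimal arm is certified.

        pull(arm) is assumed to return a reward in [0, 1].
        """
        active = list(range(n_arms))
        means = np.zeros(n_arms)
        t = 0
        while len(active) > 1:
            t += 1
            for a in active:
                means[a] += (pull(a) - means[a]) / t   # running average: each active arm is pulled once per round
            # Hoeffding radius with a union bound over arms and rounds (generic constants)
            rad = np.sqrt(np.log(4 * n_arms * t * t / delta) / (2 * t))
            best_lcb = max(means[a] - rad for a in active)
            active = [a for a in active if means[a] + rad >= best_lcb]
            if 2 * rad <= eps:      # every surviving arm is eps-optimal w.h.p.
                break
        return max(active, key=lambda a: means[a])

    # toy usage: three Bernoulli arms with means 0.3, 0.5, 0.7
    rng = np.random.default_rng(1)
    mus = [0.3, 0.5, 0.7]
    print(action_elimination(lambda a: float(rng.random() < mus[a]), 3))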

11 jmlr-2006-Active Learning in Approximately Linear Regression Based on Conditional Expectation of Generalization Error

Author: Masashi Sugiyama

Abstract: The goal of active learning is to determine the locations of training input points so that the generalization error is minimized. We discuss the problem of active learning in linear regression scenarios. Traditional active learning methods using least-squares learning often assume that the model used for learning is correctly specified. In many practical situations, however, this assumption may not be fulfilled. Recently, active learning methods using “importance”-weighted least-squares learning have been proposed, which are shown to be robust against misspecification of models. In this paper, we propose a new active learning method also using the weighted least-squares learning, which we call ALICE (Active Learning using the Importance-weighted least-squares learning based on Conditional Expectation of the generalization error). An important difference from existing methods is that we predict the conditional expectation of the generalization error given training input points, while existing methods predict the full expectation of the generalization error. Due to this difference, the training input design can be fine-tuned depending on the realization of training input points. Theoretically, we prove that the proposed active learning criterion is a more accurate predictor of the single-trial generalization error than the existing criterion. Numerical studies with toy and benchmark data sets show that the proposed method compares favorably to existing methods. Keywords: Active Learning, Conditional Expectation of Generalization Error, Misspecification of Models, Importance-Weighted Least-Squares Learning, Covariate Shift.
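
The importance-weighted least-squares estimator that ALICE and the earlier weighted methods build on is straightforward; a minimal sketch is below, with the density-ratio weights treated as given and chosen here purely for illustration (the quantity ALICE actually optimizes, the conditional expectation of the generalization error, is not reproduced).

    import numpy as np

    def iw_least_squares(X, y, w):
        """Importance-weighted least squares: argmin_theta sum_i w_i (x_i^T theta - y_i)^2."""
        W = np.diag(w)
        return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

    # toy usage with a misspecified (linear) model for a nonlinear target
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 2.0, size=200)                      # training inputs
    y = np.sin(x) + 0.1 * rng.normal(size=200)
    X = np.column_stack([np.ones_like(x), x])
    w = np.exp(-0.5 * x**2) / np.exp(-0.5 * (x - 0.5)**2)     # hypothetical density ratio p_test/p_train
    print(iw_least_squares(X, y, w))                          # the weighted fit emphasises the test region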

12 jmlr-2006-Active Learning with Feedback on Features and Instances

Author: Hema Raghavan, Omid Madani, Rosie Jones

Abstract: We extend the traditional active learning framework to include feedback on features in addition to labeling instances, and we execute a careful study of the effects of feature selection and human feedback on features in the setting of text categorization. Our experiments on a variety of categorization tasks indicate that there is significant potential in improving classifier performance by feature re-weighting, beyond that achieved via membership queries alone (traditional active learning) if we have access to an oracle that can point to the important (most predictive) features. Our experiments on human subjects indicate that human feedback on feature relevance can identify a sufficient proportion of the most relevant features (over 50% in our experiments). We find that on average, labeling a feature takes much less time than labeling a document. We devise an algorithm that interleaves labeling features and documents which significantly accelerates standard active learning in our simulation experiments. Feature feedback can complement traditional active learning in applications such as news filtering, e-mail classification, and personalization, where the human teacher can have significant knowledge on the relevance of features. Keywords: active learning, feature selection, relevance feedback, term feedback, text classification

13 jmlr-2006-Adaptive Prototype Learning Algorithms: Theoretical and Experimental Studies

Author: Fu Chang, Chin-Chin Lin, Chi-Jen Lu

Abstract: In this paper, we propose a number of adaptive prototype learning (APL) algorithms. They employ the same algorithmic scheme to determine the number and location of prototypes, but differ in the use of samples or the weighted averages of samples as prototypes, and also in the assumption of distance measures. To understand these algorithms from a theoretical viewpoint, we address their convergence properties, as well as their consistency under certain conditions. We also present a soft version of APL, in which a non-zero training error is allowed in order to enhance the generalization power of the resultant classifier. Applying the proposed algorithms to twelve UCI benchmark data sets, we demonstrate that they outperform many instance-based learning algorithms, the k-nearest neighbor rule, and support vector machines in terms of average test accuracy. Keywords: adaptive prototype learning, cluster-based prototypes, consistency, instance-based prototype, pattern classification

14 jmlr-2006-An Efficient Implementation of an Active Set Method for SVMs    (Special Topic on Machine Learning and Optimization)

Author: Katya Scheinberg

Abstract: We propose an active set algorithm to solve the convex quadratic programming (QP) problem which is the core of the support vector machine (SVM) training. The underlying method is not new and is based on the extensive practice of the Simplex method and its variants for convex quadratic problems. However, its application to large-scale SVM problems is new. Until recently the traditional active set methods were considered impractical for large SVM problems. By adapting the methods to the special structure of SVM problems we were able to produce an efficient implementation. We conduct an extensive study of the behavior of our method and its variations on SVM problems. We present computational results comparing our method with Joachims’ SVMlight (see Joachims, 1999). The results show that our method has overall better performance on many SVM problems. It seems to have a particularly strong advantage on more difficult problems. In addition, this algorithm has better theoretical properties and it naturally extends to the incremental mode. Since the proposed method solves the standard SVM formulation, as does SVMlight, the generalization properties of these two approaches are identical and we do not discuss them in the paper. Keywords: active set methods, support vector machines, quadratic programming

15 jmlr-2006-Bayesian Network Learning with Parameter Constraints     (Special Topic on Machine Learning and Optimization)

Author: Radu Stefan Niculescu, Tom M. Mitchell, R. Bharat Rao

Abstract: The task of learning models for many real-world problems requires incorporating domain knowledge into learning algorithms, to enable accurate learning from a realistic volume of training data. This paper considers a variety of types of domain knowledge for constraining parameter estimates when learning Bayesian networks. In particular, we consider domain knowledge that constrains the values or relationships among subsets of parameters in a Bayesian network with known structure. We incorporate a wide variety of parameter constraints into learning procedures for Bayesian networks, by formulating this task as a constrained optimization problem. The assumptions made in module networks, dynamic Bayes nets and context specific independence models can be viewed as particular cases of such parameter constraints. We present closed form solutions or fast iterative algorithms for estimating parameters subject to several specific classes of parameter constraints, including equalities and inequalities among parameters, constraints on individual parameters, and constraints on sums and ratios of parameters, for discrete and continuous variables. Our methods cover learning from both frequentist and Bayesian points of view, from both complete and incomplete data. We present formal guarantees for our estimators, as well as methods for automatically learning useful parameter constraints from data. To validate our approach, we apply it to the domain of fMRI brain image analysis. Here we demonstrate the ability of our system to first learn useful relationships among parameters, and then to use them to constrain the training of the Bayesian network, resulting in improved cross-validated accuracy of the learned model. Experiments on synthetic data are also presented. Keywords: Bayesian networks, constrained optimization, domain knowledge
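
As a toy instance of the constrained-estimation idea, the snippet below fits a three-state multinomial by maximum likelihood subject to an equality between two of its parameters, one of the constraint classes the paper treats (and for which it derives closed forms); here the constrained MLE is simply solved numerically, and the counts are made up.

    import numpy as np
    from scipy.optimize import minimize, LinearConstraint

    counts = np.array([30.0, 50.0, 20.0])            # hypothetical observed counts

    def neg_log_lik(theta):
        return -np.sum(counts * np.log(theta))

    constraints = [
        LinearConstraint(np.ones(3), 1.0, 1.0),                  # parameters sum to one
        LinearConstraint(np.array([1.0, -1.0, 0.0]), 0.0, 0.0),  # domain knowledge: theta_0 = theta_1
    ]
    res = minimize(neg_log_lik, np.full(3, 1 / 3), method="trust-constr",
                   bounds=[(1e-6, 1.0)] * 3, constraints=constraints)
    print(np.round(res.x, 3))   # -> [0.4, 0.4, 0.2]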

16 jmlr-2006-Bounds for Linear Multi-Task Learning

Author: Andreas Maurer

Abstract: We give dimension-free and data-dependent bounds for linear multi-task learning where a common linear operator is chosen to preprocess data for a vector of task specific linear-thresholding classifiers. The complexity penalty of multi-task learning is bounded by a simple expression involving the margins of the task-specific classifiers, the Hilbert-Schmidt norm of the selected preprocessor and the Hilbert-Schmidt norm of the covariance operator for the total mixture of all task distributions, or, alternatively, the Frobenius norm of the total Gramian matrix for the data-dependent version. The results can be compared to state-of-the-art results on linear single-task learning. Keywords: learning to learn, transfer learning, multi-task learning

17 jmlr-2006-Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation

Author: Magnus Ekdahl, Timo Koski

Abstract: In many pattern recognition/classification problems the true class conditional model and class probabilities are approximated for reasons of reducing complexity and/or of statistical estimation. The approximated classifier is expected to have worse performance, here measured by the probability of correct classification. We present an analysis valid in general, and easily computable formulas for estimating the degradation in probability of correct classification when compared to the optimal classifier. An example of an approximation is the Naïve Bayes classifier. We show that the performance of the Naïve Bayes depends on the degree of functional dependence between the features and labels. We provide a sufficient condition for zero loss of performance, too. Keywords: Bayesian networks, naïve Bayes, plug-in classifier, Kolmogorov distance of variation, variational learning
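
A tiny worked example of the kind of degradation the paper bounds: with two binary features that are strongly dependent given the label, the naïve Bayes plug-in classifier loses accuracy relative to the optimal classifier. The joint distribution below is made up for illustration.

    import numpy as np

    # p[y, x1, x2]: a made-up joint distribution with conditionally dependent features
    p = np.zeros((2, 2, 2))
    p[0] = 0.5 * np.array([[0.4, 0.1], [0.1, 0.4]])   # class 0: the features tend to agree
    p[1] = 0.5 * np.array([[0.1, 0.4], [0.4, 0.1]])   # class 1: the features tend to disagree

    # optimal classifier: argmax_y p(y, x1, x2)
    opt_acc = sum(p[:, x1, x2].max() for x1 in (0, 1) for x2 in (0, 1))

    # naive Bayes plug-in: argmax_y p(y) p(x1 | y) p(x2 | y)
    py = p.sum(axis=(1, 2))
    px1_y = p.sum(axis=2) / py[:, None]
    px2_y = p.sum(axis=1) / py[:, None]
    nb_acc = sum(p[int(np.argmax(py * px1_y[:, x1] * px2_y[:, x2])), x1, x2]
                 for x1 in (0, 1) for x2 in (0, 1))

    print(opt_acc, nb_acc)   # 0.8 vs 0.5: the loss comes from ignoring the feature dependence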

18 jmlr-2006-Building Support Vector Machines with Reduced Classifier Complexity     (Special Topic on Machine Learning and Optimization)

Author: S. Sathiya Keerthi, Olivier Chapelle, Dennis DeCoste

Abstract: Support vector machines (SVMs), though accurate, are not preferred in applications requiring great classification speed, due to the number of support vectors being large. To overcome this problem we devise a primal method with the following properties: (1) it decouples the idea of basis functions from the concept of support vectors; (2) it greedily finds a set of kernel basis functions of a specified maximum size (dmax) to approximate the SVM primal cost function well; (3) it is efficient and roughly scales as O(n dmax²) where n is the number of training examples; and, (4) the number of basis functions it requires to achieve an accuracy close to the SVM accuracy is usually far less than the number of SVM support vectors. Keywords: SVMs, classification, sparse design
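
The flavor of the approach can be sketched with a naive exhaustive greedy loop: at each step, try every remaining training point as a new basis center, refit the expansion coefficients against a regularized squared-hinge primal objective, and keep the candidate that lowers the objective most. This brute-force variant is only an illustration of the idea, not the paper's efficient O(n dmax²) scheme, and the objective and kernel choices below are assumptions.

    import numpy as np
    from scipy.optimize import minimize

    def rbf(X, Z, gamma=0.5):
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def fit_beta(K_nJ, K_JJ, y, lam):
        """Fit expansion coefficients for a fixed basis set by minimizing a
        regularized squared hinge loss (a primal SVM-style objective)."""
        def obj(beta):
            margins = np.maximum(0.0, 1.0 - y * (K_nJ @ beta))
            return np.mean(margins ** 2) + lam * beta @ K_JJ @ beta
        res = minimize(obj, np.zeros(K_nJ.shape[1]), method="L-BFGS-B")
        return res.x, res.fun

    def greedy_basis(X, y, d_max, lam=1e-2, gamma=0.5):
        """Naive greedy selection of basis centers from the training set."""
        J, beta = [], np.zeros(0)
        for _ in range(d_max):
            best = None
            for j in (j for j in range(len(X)) if j not in J):
                Jt = J + [j]
                b, val = fit_beta(rbf(X, X[Jt], gamma), rbf(X[Jt], X[Jt], gamma), y, lam)
                if best is None or val < best[0]:
                    best = (val, j, b)
            J.append(best[1])
            beta = best[2]
        return np.array(J), beta

    # toy usage: two Gaussian blobs, five basis functions
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (30, 2)), rng.normal(1, 1, (30, 2))])
    y = np.array([-1.0] * 30 + [1.0] * 30)
    J, beta = greedy_basis(X, y, d_max=5)
    print("chosen centers:", J, "training accuracy:", np.mean(np.sign(rbf(X, X[J]) @ beta) == y))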

19 jmlr-2006-Causal Graph Based Decomposition of Factored MDPs

Author: Anders Jonsson, Andrew Barto

Abstract: We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesian network model of actions, and constructs a causal graph that captures relationships between state variables. In tasks with sparse causal graphs VISA exploits structure by introducing activities that cause the values of state variables to change. The result is a hierarchy of activities that together represent a solution to the original task. VISA performs state abstraction for each activity by ignoring irrelevant state variables and lower-level activities. In addition, we describe an algorithm for constructing compact models of the activities introduced. State abstraction and compact activity models enable VISA to apply efficient algorithms to solve the stand-alone subtask associated with each activity. Experimental results show that the decomposition introduced by VISA can significantly accelerate construction of an optimal, or near-optimal, policy. Keywords: Markov decision processes, hierarchical decomposition, state abstraction

20 jmlr-2006-Collaborative Multiagent Reinforcement Learning by Payoff Propagation

Author: Jelle R. Kok, Nikos Vlassis

Abstract: In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a) which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and describe a payoff propagation algorithm that computes the individual actions that approximately maximize the global payoff function. The method can be viewed as the decision-making analogue of belief propagation in Bayesian networks. Second, we focus on learning the behavior of the agents in sequential decision-making tasks. We introduce different model-free reinforcement-learning techniques, unitedly called Sparse Cooperative Q-learning, which approximate the global action-value function based on the topology of a coordination graph, and perform updates using the contribution of the individual agents to the maximal global action value. The combined use of an edge-based decomposition of the action-value function and the payoff propagation algorithm for efficient action selection results in an approach that scales only linearly in the problem size. We provide experimental evidence that our method outperforms related multiagent reinforcement-learning methods based on temporal differences. Keywords: collaborative multiagent system, coordination graph, reinforcement learning, Q-learning, belief propagation
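
The single-state payoff propagation step is essentially max-plus message passing on the coordination graph. The sketch below runs it on a tiny three-agent chain with random edge payoffs (where the messages are exact) and checks the result against brute force; it illustrates only the action-selection step, not Sparse Cooperative Q-learning itself, and the payoff tables are made up.

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(0)
    n_actions = 3
    edges = [(0, 1), (1, 2)]                     # a chain of three agents
    f = {e: rng.normal(size=(n_actions, n_actions)) for e in edges}
    neighbors = {0: [1], 1: [0, 2], 2: [1]}

    def global_payoff(joint):
        return sum(f[(i, j)][joint[i], joint[j]] for (i, j) in edges)

    # messages mu[(i, j)][a_j]: agent i's report to neighbor j about j's actions
    mu = {(i, j): np.zeros(n_actions) for (i, j) in edges}
    mu.update({(j, i): np.zeros(n_actions) for (i, j) in edges})

    for _ in range(10):                          # a few sweeps suffice on a chain
        for (i, j) in list(mu):
            fij = f[(i, j)] if (i, j) in f else f[(j, i)].T       # table indexed [a_i, a_j]
            incoming = sum((mu[(k, i)] for k in neighbors[i] if k != j),
                           np.zeros(n_actions))
            # mu_{i->j}(a_j) = max_{a_i} [ f_ij(a_i, a_j) + sum_{k != j} mu_{k->i}(a_i) ]
            mu[(i, j)] = (fij + incoming[:, None]).max(axis=0)

    best = tuple(int(np.argmax(sum(mu[(k, i)] for k in neighbors[i]))) for i in range(3))
    print("max-plus:", best, "  brute force:",
          max(product(range(n_actions), repeat=3), key=global_payoff))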

21 jmlr-2006-Computational and Theoretical Analysis of Null Space and Orthogonal Linear Discriminant Analysis

22 jmlr-2006-Considering Cost Asymmetry in Learning Classifiers

23 jmlr-2006-Consistency and Convergence Rates of One-Class SVMs and Related Algorithms

24 jmlr-2006-Consistency of Multiclass Empirical Risk Minimization Methods Based on Convex Loss

25 jmlr-2006-Distance Patterns in Structural Similarity

26 jmlr-2006-Efficient Learning of Label Ranking by Soft Projections onto Polyhedra     (Special Topic on Machine Learning and Optimization)

27 jmlr-2006-Ensemble Pruning Via Semi-definite Programming     (Special Topic on Machine Learning and Optimization)

28 jmlr-2006-Estimating the "Wrong" Graphical Model: Benefits in the Computation-Limited Setting

29 jmlr-2006-Estimation of Gradients and Coordinate Covariation in Classification

30 jmlr-2006-Evolutionary Function Approximation for Reinforcement Learning

31 jmlr-2006-Exact 1-Norm Support Vector Machines Via Unconstrained Convex Differentiable Minimization     (Special Topic on Machine Learning and Optimization)

32 jmlr-2006-Expectation Correction for Smoothed Inference in Switching Linear Dynamical Systems

33 jmlr-2006-Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problems     (Special Topic on Machine Learning and Optimization)

34 jmlr-2006-Generalized Bradley-Terry Models and Multi-Class Probability Estimates

35 jmlr-2006-Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

36 jmlr-2006-In Search of Non-Gaussian Components of a High-Dimensional Distribution

37 jmlr-2006-Incremental Algorithms for Hierarchical Classification

38 jmlr-2006-Incremental Support Vector Learning: Analysis, Implementation and Applications     (Special Topic on Machine Learning and Optimization)

39 jmlr-2006-Inductive Synthesis of Functional Programs: An Explanation Based Generalization Approach     (Special Topic on Inductive Programming)

40 jmlr-2006-Infinite-σ Limits For Tikhonov Regularization

41 jmlr-2006-Kernel-Based Learning of Hierarchical Multilabel Classification Models     (Special Topic on Machine Learning and Optimization)

42 jmlr-2006-Kernels on Prolog Proof Trees: Statistical Learning in the ILP Setting     (Special Topic on Inductive Programming)

43 jmlr-2006-Large Scale Multiple Kernel Learning     (Special Topic on Machine Learning and Optimization)

44 jmlr-2006-Large Scale Transductive SVMs

45 jmlr-2006-Learning Coordinate Covariances via Gradients

46 jmlr-2006-Learning Factor Graphs in Polynomial Time and Sample Complexity

47 jmlr-2006-Learning Image Components for Object Recognition

48 jmlr-2006-Learning Minimum Volume Sets

49 jmlr-2006-Learning Parts-Based Representations of Data

50 jmlr-2006-Learning Recursive Control Programs from Problem Solving     (Special Topic on Inductive Programming)

51 jmlr-2006-Learning Sparse Representations by Non-Negative Matrix Factorization and Sequential Cone Programming     (Special Topic on Machine Learning and Optimization)

52 jmlr-2006-Learning Spectral Clustering, With Application To Speech Separation

53 jmlr-2006-Learning a Hidden Hypergraph

54 jmlr-2006-Learning the Structure of Linear Latent Variable Models

55 jmlr-2006-Linear Programming Relaxations and Belief Propagation -- An Empirical Study     (Special Topic on Machine Learning and Optimization)

56 jmlr-2006-Linear Programs for Hypotheses Selection in Probabilistic Inference Models     (Special Topic on Machine Learning and Optimization)

57 jmlr-2006-Linear State-Space Models for Blind Source Separation

58 jmlr-2006-Lower Bounds and Aggregation in Density Estimation

59 jmlr-2006-Machine Learning for Computer Security    (Special Topic on Machine Learning for Computer Security)

60 jmlr-2006-Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

61 jmlr-2006-Maximum-Gain Working Set Selection for SVMs     (Special Topic on Machine Learning and Optimization)

62 jmlr-2006-MinReg: A Scalable Algorithm for Learning Parsimonious Regulatory Networks in Yeast and Mammals

63 jmlr-2006-New Algorithms for Efficient High-Dimensional Nonparametric Classification

64 jmlr-2006-Noisy-OR Component Analysis and its Application to Link Analysis

65 jmlr-2006-Nonparametric Quantile Estimation

66 jmlr-2006-On Model Selection Consistency of Lasso

67 jmlr-2006-On Representing and Generating Kernels by Fuzzy Equivalence Relations

68 jmlr-2006-On the Complexity of Learning Lexicographic Strategies

69 jmlr-2006-One-Class Novelty Detection for Seizure Analysis from Intracranial EEG

70 jmlr-2006-Online Passive-Aggressive Algorithms

71 jmlr-2006-Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis

72 jmlr-2006-Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems     (Special Topic on Machine Learning and Optimization)

73 jmlr-2006-Pattern Recognition for Conditionally Independent Data

74 jmlr-2006-Point-Based Value Iteration for Continuous POMDPs

75 jmlr-2006-Policy Gradient in Continuous Time

76 jmlr-2006-QP Algorithms with Guaranteed Accuracy and Run Time for Support Vector Machines

77 jmlr-2006-Quantile Regression Forests

78 jmlr-2006-Rearrangement Clustering: Pitfalls, Remedies, and Applications

79 jmlr-2006-Second Order Cone Programming Approaches for Handling Missing and Uncertain Data     (Special Topic on Machine Learning and Optimization)

80 jmlr-2006-Segmental Hidden Markov Models with Random Effects for Waveform Modeling

81 jmlr-2006-Some Discriminant-Based PAC Algorithms

82 jmlr-2006-Some Theory for Generalized Boosting Algorithms

83 jmlr-2006-Sparse Boosting

84 jmlr-2006-Stability Properties of Empirical Risk Minimization over Donsker Classes

85 jmlr-2006-Statistical Comparisons of Classifiers over Multiple Data Sets

86 jmlr-2006-Step Size Adaptation in Reproducing Kernel Hilbert Space

87 jmlr-2006-Stochastic Complexities of Gaussian Mixtures in Variational Bayesian Approximation

88 jmlr-2006-Streamwise Feature Selection

89 jmlr-2006-Structured Prediction, Dual Extragradient and Bregman Projections     (Special Topic on Machine Learning and Optimization)

90 jmlr-2006-Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition

91 jmlr-2006-The Interplay of Optimization and Machine Learning Research     (Special Topic on Machine Learning and Optimization)

92 jmlr-2006-Toward Attribute Efficient Learning of Decision Lists and Parities

93 jmlr-2006-Universal Kernels

94 jmlr-2006-Using Machine Learning to Guide Architecture Simulation

95 jmlr-2006-Walk-Sums and Belief Propagation in Gaussian Graphical Models

96 jmlr-2006-Worst-Case Analysis of Selective Sampling for Linear Classification

97 jmlr-2006- (Special Topic on Machine Learning for Computer Security)