jmlr jmlr2009 jmlr2009-20 knowledge-graph by maker-knowledge-mining

20 jmlr-2009-DL-Learner: Learning Concepts in Description Logics


Source: pdf

Author: Jens Lehmann

Abstract: In this paper, we introduce DL-Learner, a framework for learning in description logics and OWL. OWL is the official W3C standard ontology language for the Semantic Web. Concepts in this language can be learned for constructing and maintaining OWL ontologies or for solving problems similar to those in Inductive Logic Programming. DL-Learner includes several learning algorithms, support for different OWL formats, reasoner interfaces, and learning problems. It is a cross-platform framework implemented in Java. The framework allows easy programmatic access and provides a command line interface, a graphical interface as well as a WSDL-based web service. Keywords: concept learning, description logics, OWL, classification, open-source

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Department of Computer Science, University of Leipzig, Johannisgasse 26, 04103 Leipzig, Germany. Editor: Soeren Sonnenburg. Abstract: In this paper, we introduce DL-Learner, a framework for learning in description logics and OWL. [sent-3, score-0.252]

2 OWL is the official W3C standard ontology language for the Semantic Web. [sent-5, score-0.293]

3 Concepts in this language can be learned for constructing and maintaining OWL ontologies or for solving problems similar to those in Inductive Logic Programming. [sent-6, score-0.161]

4 DL-Learner includes several learning algorithms, support for different OWL formats, reasoner interfaces, and learning problems. [sent-7, score-0.166]

5 The framework allows easy programmatic access and provides a command line interface, a graphical interface as well as a WSDL-based web service. [sent-9, score-0.377]

6 Introduction. The Semantic Web grows steadily1 and contains knowledge from diverse areas such as science, music, literature, geography, and social networks, as well as from upper and cross-domain ontologies.2 [sent-12, score-0.04]

7 The underlying semantic technologies are currently starting to create substantial industrial impact in application scenarios on and off the web, including knowledge management, expert systems, web services, e-commerce, e-collaboration, etc. [sent-13, score-0.457]

8 Since 2004, the Web Ontology Language OWL, which is based on description logics (Baader et al. … [sent-14, score-0.225]

9 …, 2007), has been the W3C-recommended standard for Semantic Web ontologies and is a key to the growth of the Semantic Web. [sent-15, score-0.133]

10 …field, there is a need for well-structured ontologies with large amounts of instance data, since engineering such ontologies constitutes a considerable investment of resources. [sent-17, score-0.322]

11 Nowadays, knowledge bases often provide large amounts of instance data without sophisticated schemata. [sent-18, score-0.131]

12 Methods for automated schema acquisition and maintenance are therefore sought (see, e.g., …). [sent-19, score-0.087]

13 DL-Learner provides an open source framework for such methods, as we will briefly … [sent-25, score-0.071]

14 Outside of DL-Learner, there exist only non-open-source implementations of algorithms (YinYang, DL-FOIL), to the best of our knowledge. [sent-28, score-0.044]

15 To give a rough estimate, the semantic index Sindice (http://sindice.com/) … [sent-30, score-0.204]

16 …com/) lists more than 10 billion entities from more than 100 million web pages. [sent-31, score-0.184]

17 A component manager can be used to create, combine, and configure components. [sent-41, score-0.118]

18 Framework. DL-Learner consists of core functionality, which provides machine learning algorithms for solving learning problems in OWL, support for different knowledge base formats, an OWL library, and reasoner interfaces. [sent-44, score-0.24]

19 There are several interfaces for accessing this functionality, a couple of tools which use the DL-Learner algorithms, and a set of convenience scripts. [sent-45, score-0.182]

20 There are four types of components: knowledge source, reasoning service, learning problem, and learning algorithm. [sent-48, score-0.04]

21 For each type, there are several implemented components and each component can have its own configuration options. [sent-49, score-0.097]

22 Configuration options can be used to change parameters/settings of a component. [sent-52, score-0.097]
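
To make the component model concrete, here is a minimal Java sketch of the idea. All names below (ComponentDemo, Component, ComponentManager, SparqlKnowledgeSource, the "endpoint" option) are invented for illustration and are not DL-Learner's actual API: a manager instantiates a component of a given type and applies configuration options to change its settings.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch only -- these names are NOT DL-Learner's real API.
    public class ComponentDemo {

        interface Component { void applyConfig(Map<String, Object> options); }

        // One of the four component types: a knowledge source.
        static class SparqlKnowledgeSource implements Component {
            String endpoint;
            public void applyConfig(Map<String, Object> options) {
                endpoint = (String) options.getOrDefault("endpoint", "http://example.org/sparql");
            }
        }

        // Creates and configures components; combining them works the same way.
        static class ComponentManager {
            <T extends Component> T component(Class<T> type, Map<String, Object> options)
                    throws Exception {
                T c = type.getDeclaredConstructor().newInstance();
                c.applyConfig(options); // configuration options change parameters/settings
                return c;
            }
        }

        public static void main(String[] args) throws Exception {
            ComponentManager cm = new ComponentManager();
            Map<String, Object> opts = new HashMap<>();
            opts.put("endpoint", "http://dbpedia.org/sparql");
            SparqlKnowledgeSource ks = cm.component(SparqlKnowledgeSource.class, opts);
            System.out.println("knowledge source endpoint: " + ks.endpoint);
        }
    }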

23 Almost all standard OWL formats are supported through the OWL API,3 for example, RDF/XML, Manchester OWL Syntax, or Turtle. [sent-54, score-0.117]

24 DL-Learner supports the inclusion of several knowledge sources, since knowledge can be widespread in the Semantic Web. [sent-55, score-0.08]

25 In addition, DL-Learner facilitates the extraction of knowledge fragments from SPARQL4 endpoints. [sent-56, score-0.04]
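
A rough Java sketch of how such fragment extraction can work (the query shape and the class name FragmentQueries are illustrative assumptions, not DL-Learner's actual extraction code): triples about each seed instance are requested from the endpoint, and objects found at depth d become seeds at depth d+1, up to a fixed recursion depth, so only a small fragment of a huge knowledge base is ever loaded.

    import java.util.List;

    // Illustrative sketch of recursive fragment extraction via SPARQL;
    // not DL-Learner's actual implementation.
    public class FragmentQueries {

        // SPARQL CONSTRUCT query returning all outgoing triples of one resource.
        static String outgoingTriples(String uri) {
            return "CONSTRUCT { <" + uri + "> ?p ?o } WHERE { <" + uri + "> ?p ?o }";
        }

        public static void main(String[] args) {
            List<String> seeds = List.of("http://dbpedia.org/resource/Leipzig");
            for (String seed : seeds) {
                // In a real run, this query would be sent to the endpoint and the
                // returned objects queued as seeds for the next recursion level.
                System.out.println(outgoingTriples(seed));
            }
        }
    }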

26 This feature allows DL-Learner to scale up to very large knowledge bases containing millions of axioms (cf. … [sent-57, score-0.079]

27 …DIG 1.15 and OWL API reasoner interfaces, which allow connecting to all standard OWL reasoners via an HTTP- and XML-based mechanism or a Java interface, respectively. [sent-62, score-0.166]

28 DL-Learner offers its own approximate reasoner, which uses Pellet6 for bootstrapping and loading the inferred model in memory. [sent-75, score-0.027]

29 …efficiently by using a local closed world assumption (see Badea and Nienhuys-Cheng 2000 on why this assumption is useful in description logics). [sent-77, score-0.059]
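
A toy Java illustration of why a local closed world assumption makes such checks cheap (all class and instance names are invented for this sketch): once the inferred model is held in memory as plain sets, "not derivable" is treated as false, so instance checks and negation reduce to set lookups instead of open-world reasoning.

    import java.util.Map;
    import java.util.Set;

    // Toy sketch of closed-world instance checks over an in-memory inferred
    // model; all names are invented for illustration.
    public class ClosedWorldCheck {

        static final Map<String, Set<String>> INSTANCES = Map.of(
                "Father", Set.of("stefan", "markus"),
                "Male", Set.of("stefan", "markus", "heinz"));

        static boolean hasType(String individual, String cls) {
            // An open-world reasoner would answer "unknown" for heinz/Father;
            // under the closed world assumption the answer is simply false.
            return INSTANCES.getOrDefault(cls, Set.of()).contains(individual);
        }

        public static void main(String[] args) {
            System.out.println(hasType("heinz", "Father"));  // false under CWA
            System.out.println(!hasType("heinz", "Father")); // negation is a lookup
        }
    }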

30 …efficient coverage checks, which can be used in the learning algorithms, for example, stochastic approaches for computing coverage up to a desired accuracy with respect to a 95% confidence interval. [sent-86, score-0.161]
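
The following Java sketch shows one way such a stochastic coverage check can be realized (a minimal sketch under stated assumptions, not DL-Learner's code): examples are sampled until a 95% Wald confidence interval around the estimated coverage is tight enough, so large example sets never need to be tested exhaustively.

    import java.util.List;
    import java.util.Random;
    import java.util.function.Predicate;

    // Illustrative stochastic coverage estimation; not DL-Learner's code.
    public class StochasticCoverage {

        static double estimateCoverage(List<String> examples, Predicate<String> covers,
                                       double maxHalfWidth) {
            Random rnd = new Random(42);
            int n = 0, hits = 0;
            double halfWidth;
            do {
                if (covers.test(examples.get(rnd.nextInt(examples.size())))) hits++;
                n++;
                double p = (double) hits / n;
                // Wald interval; 1.96 is the z-value for 95% confidence.
                halfWidth = 1.96 * Math.sqrt(p * (1 - p) / n);
            } while (n < 30 || halfWidth > maxHalfWidth);
            return (double) hits / n;
        }

        public static void main(String[] args) {
            List<String> examples = List.of("a", "b", "c", "d");
            // Estimate how many examples the (toy) concept covers, to +/- 0.05.
            System.out.println(estimateCoverage(examples, e -> !e.equals("d"), 0.05));
        }
    }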

31 Learning Algorithm components provide methods to solve one or more specific learning problems. [sent-88, score-0.057]

32 Apart from simple algorithms involving brute force or random guessing techniques, DL-Learner comprises a number of sophisticated algorithms based on genetic programming with a novel genetic operator (Lehmann, 2007), refinement … [sent-90, score-0.259]

33 …refinement operators for the description logic ALC (Lehmann and Hitzler, 2008), an extended operator supporting many features of OWL including datatype support, and an algorithm tailored for ontology engineering with a strong bias on short and readable concepts. [sent-91, score-0.62]
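
As a flavor of what a downward refinement operator does, here is a deliberately tiny Java sketch (class hierarchy and names invented for illustration; real ALC operators also handle negation, disjunction, and role restrictions): a conjunction of named classes is specialized by replacing one conjunct with a direct subclass.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.TreeSet;

    // Toy downward refinement over conjunctions of named classes;
    // illustrative only, far simpler than real ALC refinement operators.
    public class ToyRefinement {

        static final Map<String, List<String>> SUBCLASSES = Map.of(
                "Person", List.of("Parent"),
                "Parent", List.of("Father", "Mother"));

        static List<Set<String>> refine(Set<String> conjunction) {
            List<Set<String>> result = new ArrayList<>();
            for (String c : conjunction) {
                for (String sub : SUBCLASSES.getOrDefault(c, List.of())) {
                    Set<String> refined = new TreeSet<>(conjunction);
                    refined.remove(c);
                    refined.add(sub); // e.g. Person AND Male  ~>  Parent AND Male
                    result.add(refined);
                }
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(refine(new TreeSet<>(Set.of("Person", "Male"))));
            // prints [[Male, Parent]]
        }
    }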

34 Some of those algorithms have been shown to be superior to other description logic learning systems and also superior to state-of-the-art ILP systems, for example, on the carcinogenesis problem. [sent-92, score-0.168]

35 A manual,8 which complements the homepage and describes how to run DL-Learner, is included in its release. [sent-96, score-0.195]

36 The code base of DL-Learner consists of approximately 50,000 lines of code (excluding comments), with its core, that is, the component framework itself, accounting for roughly 1,500 lines. [sent-98, score-0.067]

37 About 20 learning examples are included in the latest release (to be precise: 132 if smaller variations of existing problems/configurations are counted). [sent-100, score-0.114]

38 27 unit tests based on the JUnit framework are used to detect errors. [sent-102, score-0.027]

39 There are several interfaces available to access DL-Learner: to use components programmatically, the core package, in particular the component manager, can be of service. [sent-103, score-0.262]

40 Similar methods are also available at the web service interface, which is based on WSDL. [sent-104, score-0.23]

41 DL-Learner starts a web service using functionality included in Java 6, that is, no further tools are necessary. [sent-105, score-0.286]
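
A minimal sketch of what "a web service included in Java 6" can look like, using the JAX-WS stack that ships with Java SE 6 (the service name and method are invented for illustration and are not DL-Learner's actual web service interface; on Java 11+ the javax.xml.ws modules must be added as a dependency):

    import javax.jws.WebService;
    import javax.xml.ws.Endpoint;

    // Illustrative JAX-WS service; not DL-Learner's actual web service.
    @WebService
    public class LearnerService {

        public String ping() { return "DL-Learner web service sketch"; }

        public static void main(String[] args) {
            // The generated WSDL is served at http://localhost:8181/dl-learner?wsdl
            Endpoint.publish("http://localhost:8181/dl-learner", new LearnerService());
            System.out.println("service published");
        }
    }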

42 For end users, a command line interface is available. [sent-106, score-0.163]

43 A prototypical graphical user interface is also available, which can create, load, and save conf files. [sent-109, score-0.215]
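
For a rough idea of what such a conf file contains, a hypothetical sketch follows; every identifier below is invented for illustration, and the actual syntax is documented in the DL-Learner manual mentioned above.

    // Hypothetical conf-file sketch -- illustrative syntax only.
    import("father.owl");
    problem = posNegLP;
    posNegLP.positiveExamples = { "ex:stefan", "ex:markus" };
    posNegLP.negativeExamples = { "ex:heinz" };
    algorithm = refinement;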

44 An advantage of the component-based architecture is that none of the interfaces mentioned needs to be changed when new components are added or existing ones are modified. [sent-113, score-0.191]

45 Another means to access DL-Learner, in particular for ontology engineering, is … [sent-116, score-0.298]

46 …through plugins for the ontology editors OntoWiki10 and Protégé.11 [sent-129, score-0.298]

47 The OntoWiki plugin is under construction, but can be used in its latest SVN version. [sent-130, score-0.164]

48 The Protégé 4 plugin is included in the official … [sent-131, score-0.138]

49 …official Protégé plugin repository, that is, it is easy to install within Protégé. [sent-132, score-0.109]

50 Special thanks go to Francesca Lisi for her comments, as well as to the developers working on DL-Learner and the tools based on it. [sent-134, score-0.082]

51 …refinement operator, to Christian Kötteritzsch for his work on the Protégé plugin, to Sebastian Bader, who contributed a Prolog parser, and to those people using DL-Learner in their software. [sent-136, score-0.105]

52 …on Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence … [sent-157, score-0.027]

53 Ontology Learning from Text: Methods, Evaluation and Applications, volume 123 of Frontiers in Artificial Intelligence … [sent-164, score-0.027]

54 Learning of OWL class descriptions on very large knowledge bases. [sent-183, score-0.04]

55 on Machine Learning and Data Mining in Pattern Recognition, volume 4571 of Lecture Notes in Computer Science, pages 883–898. [sent-191, score-0.027]

56 A refinement operator based learning algorithm for the ALC description logic. [sent-197, score-0.191]

57 on Inductive Logic Programming, volume 4894 of Lecture Notes in Computer Science, pages 147–160. [sent-201, score-0.027]

58 on Inductive Logic Programming, volume 5194 of Lecture Notes in Computer Science, pages 158– 175. [sent-212, score-0.027]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('owl', 0.586), ('ontology', 0.265), ('semantic', 0.204), ('lehmann', 0.203), ('prot', 0.199), ('homepage', 0.166), ('logics', 0.166), ('reasoner', 0.166), ('web', 0.154), ('ontologies', 0.133), ('formats', 0.117), ('hellmann', 0.117), ('lisi', 0.117), ('ontowiki', 0.117), ('plugin', 0.109), ('logic', 0.109), ('interface', 0.104), ('esposito', 0.1), ('interfaces', 0.098), ('leipzig', 0.089), ('baader', 0.078), ('badea', 0.078), ('buitelaar', 0.078), ('conf', 0.078), ('dig', 0.078), ('ehmann', 0.078), ('hitzler', 0.078), ('jens', 0.078), ('manager', 0.078), ('sparql', 0.078), ('service', 0.076), ('nement', 0.075), ('alc', 0.066), ('javadoc', 0.066), ('guration', 0.064), ('inductive', 0.064), ('checks', 0.059), ('api', 0.059), ('command', 0.059), ('functionality', 0.059), ('description', 0.059), ('http', 0.058), ('operator', 0.057), ('components', 0.057), ('developers', 0.055), ('ilp', 0.055), ('latest', 0.055), ('sebastian', 0.055), ('coverage', 0.051), ('source', 0.044), ('java', 0.041), ('knowledge', 0.04), ('component', 0.04), ('bases', 0.039), ('lecture', 0.039), ('architecture', 0.036), ('core', 0.034), ('springer', 0.034), ('options', 0.033), ('access', 0.033), ('comprises', 0.033), ('afterwards', 0.033), ('manchester', 0.033), ('parser', 0.033), ('plugins', 0.033), ('prototypical', 0.033), ('syntax', 0.033), ('notes', 0.033), ('create', 0.032), ('sources', 0.031), ('genetic', 0.031), ('investment', 0.03), ('readable', 0.03), ('super', 0.03), ('acquisition', 0.03), ('accessing', 0.03), ('billion', 0.03), ('contributed', 0.03), ('extensible', 0.03), ('release', 0.03), ('schema', 0.03), ('included', 0.029), ('language', 0.028), ('tools', 0.027), ('ios', 0.027), ('technologies', 0.027), ('bootstrapping', 0.027), ('sought', 0.027), ('axiom', 0.027), ('brute', 0.027), ('couple', 0.027), ('gpl', 0.027), ('guessing', 0.027), ('programming', 0.027), ('volume', 0.027), ('framework', 0.027), ('amounts', 0.026), ('sophisticated', 0.026), ('music', 0.025), ('tailored', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 20 jmlr-2009-DL-Learner: Learning Concepts in Description Logics

Author: Jens Lehmann

Abstract: In this paper, we introduce DL-Learner, a framework for learning in description logics and OWL. OWL is the official W3C standard ontology language for the Semantic Web. Concepts in this language can be learned for constructing and maintaining OWL ontologies or for solving problems similar to those in Inductive Logic Programming. DL-Learner includes several learning algorithms, support for different OWL formats, reasoner interfaces, and learning problems. It is a cross-platform framework implemented in Java. The framework allows easy programmatic access and provides a command line interface, a graphical interface as well as a WSDL-based web service. Keywords: concept learning, description logics, OWL, classification, open-source

2 0.081702597 43 jmlr-2009-Java-ML: A Machine Learning Library    (Machine Learning Open Source Software Paper)

Author: Thomas Abeel, Yves Van de Peer, Yvan Saeys

Abstract: Java-ML is a collection of machine learning and data mining algorithms, which aims to be a readily usable and easily extensible API for both software developers and research scientists. The interfaces for each type of algorithm are kept simple and algorithms strictly follow their respective interface. Comparing different classifiers or clustering algorithms is therefore straightforward, and implementing new algorithms is also easy. The implementations of the algorithms are clearly written, properly documented and can thus be used as a reference. The library is written in Java and is available from http://java-ml.sourceforge.net/ under the GNU GPL license. Keywords: open source, machine learning, data mining, java library, clustering, feature selection, classification

3 0.046472047 77 jmlr-2009-RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments    (Machine Learning Open Source Software Paper)

Author: Brian Tanner, Adam White

Abstract: RL-Glue is a standard, language-independent software package for reinforcement-learning experiments. The standardization provided by RL-Glue facilitates code sharing and collaboration. Code sharing reduces the need to re-engineer tasks and experimental apparatus, both common barriers to comparatively evaluating new ideas in the context of the literature. Our software features a minimalist interface and works with several languages and computing platforms. RL-Glue compatibility can be extended to any programming language that supports network socket communication. RL-Glue has been used to teach classes, to run international competitions, and is currently used by several other open-source software and hardware projects. Keywords: reinforcement learning, empirical evaluation, standardization, open source

4 0.040539991 26 jmlr-2009-Dlib-ml: A Machine Learning Toolkit    (Machine Learning Open Source Software Paper)

Author: Davis E. King

Abstract: There are many excellent toolkits which provide support for developing machine learning software in Python, R, Matlab, and similar environments. Dlib-ml is an open source library, targeted at both engineers and research scientists, which aims to provide a similarly rich environment for developing machine learning software in the C++ language. Towards this end, dlib-ml contains an extensible linear algebra toolkit with built in BLAS support. It also houses implementations of algorithms for performing inference in Bayesian networks and kernel-based methods for classification, regression, clustering, anomaly detection, and feature ranking. To enable easy use of these tools, the entire library has been developed with contract programming, which provides complete and precise documentation as well as powerful debugging tools. Keywords: kernel-methods, svm, rvm, kernel clustering, C++, Bayesian networks

5 0.037911542 78 jmlr-2009-Refinement of Reproducing Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels

6 0.032002985 56 jmlr-2009-Model Monitor (M2): Evaluating, Comparing, and Monitoring Models    (Machine Learning Open Source Software Paper)

7 0.027193069 96 jmlr-2009-Transfer Learning for Reinforcement Learning Domains: A Survey

8 0.025641011 50 jmlr-2009-Learning When Concepts Abound

9 0.025396524 81 jmlr-2009-Robust Process Discovery with Artificial Negative Events    (Special Topic on Mining and Learning with Graphs and Relations)

10 0.025096431 60 jmlr-2009-Nieme: Large-Scale Energy-Based Models    (Machine Learning Open Source Software Paper)

11 0.024257492 4 jmlr-2009-A Survey of Accuracy Evaluation Metrics of Recommendation Tasks

12 0.020864509 76 jmlr-2009-Python Environment for Bayesian Learning: Inferring the Structure of Bayesian Networks from Knowledge and Data    (Machine Learning Open Source Software Paper)

13 0.019411445 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization

14 0.019043593 72 jmlr-2009-Polynomial-Delay Enumeration of Monotonic Graph Classes

15 0.018604465 31 jmlr-2009-Evolutionary Model Type Selection for Global Surrogate Modeling

16 0.017714158 39 jmlr-2009-Hybrid MPI OpenMP Parallel Linear Support Vector Machine Training

17 0.014556046 41 jmlr-2009-Improving the Reliability of Causal Discovery from Small Data Sets Using Argumentation    (Special Topic on Causality)

18 0.013774421 28 jmlr-2009-Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks

19 0.012173711 84 jmlr-2009-Scalable Collaborative Filtering Approaches for Large Recommender Systems    (Special Topic on Mining and Learning with Graphs and Relations)

20 0.011643896 98 jmlr-2009-Universal Kernel-Based Learning with Applications to Regular Languages    (Special Topic on Mining and Learning with Graphs and Relations)


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.069), (1, -0.077), (2, 0.024), (3, -0.043), (4, 0.019), (5, -0.077), (6, 0.131), (7, 0.018), (8, 0.093), (9, 0.043), (10, -0.058), (11, 0.145), (12, 0.0), (13, 0.175), (14, -0.023), (15, -0.017), (16, 0.023), (17, -0.076), (18, -0.039), (19, -0.002), (20, 0.016), (21, -0.078), (22, -0.017), (23, 0.014), (24, -0.064), (25, 0.054), (26, 0.077), (27, -0.129), (28, 0.104), (29, -0.203), (30, 0.118), (31, 0.039), (32, -0.142), (33, 0.038), (34, -0.253), (35, -0.014), (36, 0.092), (37, -0.302), (38, 0.195), (39, 0.141), (40, -0.058), (41, -0.414), (42, 0.033), (43, 0.078), (44, 0.243), (45, 0.077), (46, 0.055), (47, -0.036), (48, 0.164), (49, -0.103)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98578805 20 jmlr-2009-DL-Learner: Learning Concepts in Description Logics

Author: Jens Lehmann

Abstract: In this paper, we introduce DL-Learner, a framework for learning in description logics and OWL. OWL is the official W3C standard ontology language for the Semantic Web. Concepts in this language can be learned for constructing and maintaining OWL ontologies or for solving problems similar to those in Inductive Logic Programming. DL-Learner includes several learning algorithms, support for different OWL formats, reasoner interfaces, and learning problems. It is a cross-platform framework implemented in Java. The framework allows easy programmatic access and provides a command line interface, a graphical interface as well as a WSDL-based web service. Keywords: concept learning, description logics, OWL, classification, open-source

2 0.30531642 43 jmlr-2009-Java-ML: A Machine Learning Library    (Machine Learning Open Source Software Paper)

Author: Thomas Abeel, Yves Van de Peer, Yvan Saeys

Abstract: Java-ML is a collection of machine learning and data mining algorithms, which aims to be a readily usable and easily extensible API for both software developers and research scientists. The interfaces for each type of algorithm are kept simple and algorithms strictly follow their respective interface. Comparing different classifiers or clustering algorithms is therefore straightforward, and implementing new algorithms is also easy. The implementations of the algorithms are clearly written, properly documented and can thus be used as a reference. The library is written in Java and is available from http://java-ml.sourceforge.net/ under the GNU GPL license. Keywords: open source, machine learning, data mining, java library, clustering, feature selection, classification

3 0.2246234 77 jmlr-2009-RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments    (Machine Learning Open Source Software Paper)

Author: Brian Tanner, Adam White

Abstract: RL-Glue is a standard, language-independent software package for reinforcement-learning experiments. The standardization provided by RL-Glue facilitates code sharing and collaboration. Code sharing reduces the need to re-engineer tasks and experimental apparatus, both common barriers to comparatively evaluating new ideas in the context of the literature. Our software features a minimalist interface and works with several languages and computing platforms. RL-Glue compatibility can be extended to any programming language that supports network socket communication. RL-Glue has been used to teach classes, to run international competitions, and is currently used by several other open-source software and hardware projects. Keywords: reinforcement learning, empirical evaluation, standardization, open source

4 0.21553172 81 jmlr-2009-Robust Process Discovery with Artificial Negative Events    (Special Topic on Mining and Learning with Graphs and Relations)

Author: Stijn Goedertier, David Martens, Jan Vanthienen, Bart Baesens

Abstract: Process discovery is the automated construction of structured process models from information system event logs. Such event logs often contain positive examples only. Without negative examples, it is a challenge to strike the right balance between recall and specificity, and to deal with problems such as expressiveness, noise, incomplete event logs, or the inclusion of prior knowledge. In this paper, we present a configurable technique that deals with these challenges by representing process discovery as a multi-relational classification problem on event logs supplemented with Artificially Generated Negative Events (AGNEs). This problem formulation allows using learning algorithms and evaluation techniques that are well-know in the machine learning community. Moreover, it allows users to have a declarative control over the inductive bias and language bias. Keywords: graph pattern discovery, inductive logic programming, Petri net, process discovery, positive data only

5 0.17626712 72 jmlr-2009-Polynomial-Delay Enumeration of Monotonic Graph Classes

Author: Jan Ramon, Siegfried Nijssen

Abstract: Algorithms that list graphs such that no two listed graphs are isomorphic, are important building blocks of systems for mining and learning in graphs. Algorithms are already known that solve this problem efficiently for many classes of graphs of restricted topology, such as trees. In this article we introduce the concept of a dense augmentation schema, and introduce an algorithm that can be used to enumerate any class of graphs with polynomial delay, as long as the class of graphs can be described using a monotonic predicate operating on a dense augmentation schema. In practice this means that this is the first enumeration algorithm that can be applied theoretically efficiently in any frequent subgraph mining algorithm, and that this algorithm generalizes to situations beyond the standard frequent subgraph mining setting. Keywords: graph mining, enumeration, monotonic graph classes

6 0.16305977 50 jmlr-2009-Learning When Concepts Abound

7 0.15144788 78 jmlr-2009-Refinement of Reproducing Kernels

8 0.14968558 60 jmlr-2009-Nieme: Large-Scale Energy-Based Models    (Machine Learning Open Source Software Paper)

9 0.14022669 56 jmlr-2009-Model Monitor (M2): Evaluating, Comparing, and Monitoring Models    (Machine Learning Open Source Software Paper)

10 0.12699081 28 jmlr-2009-Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks

11 0.12400535 96 jmlr-2009-Transfer Learning for Reinforcement Learning Domains: A Survey

12 0.10589221 55 jmlr-2009-Maximum Entropy Discrimination Markov Networks

13 0.10585631 26 jmlr-2009-Dlib-ml: A Machine Learning Toolkit    (Machine Learning Open Source Software Paper)

14 0.10390053 88 jmlr-2009-Stable and Efficient Gaussian Process Calculations

15 0.10064238 64 jmlr-2009-On The Power of Membership Queries in Agnostic Learning

16 0.094489425 41 jmlr-2009-Improving the Reliability of Causal Discovery from Small Data Sets Using Argumentation    (Special Topic on Causality)

17 0.087765791 14 jmlr-2009-CarpeDiem: Optimizing the Viterbi Algorithm and Applications to Supervised Sequential Learning

18 0.081282534 15 jmlr-2009-Cautious Collective Classification

19 0.077629127 46 jmlr-2009-Learning Halfspaces with Malicious Noise

20 0.075203195 53 jmlr-2009-Marginal Likelihood Integrals for Mixtures of Independence Models


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(8, 0.023), (26, 0.656), (38, 0.057), (52, 0.012), (58, 0.015), (66, 0.042), (90, 0.02), (91, 0.044), (96, 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94949895 20 jmlr-2009-DL-Learner: Learning Concepts in Description Logics

Author: Jens Lehmann

Abstract: In this paper, we introduce DL-Learner, a framework for learning in description logics and OWL. OWL is the official W3C standard ontology language for the Semantic Web. Concepts in this language can be learned for constructing and maintaining OWL ontologies or for solving problems similar to those in Inductive Logic Programming. DL-Learner includes several learning algorithms, support for different OWL formats, reasoner interfaces, and learning problems. It is a cross-platform framework implemented in Java. The framework allows easy programmatic access and provides a command line interface, a graphical interface as well as a WSDL-based web service. Keywords: concept learning, description logics, OWL, classification, open-source

2 0.84317815 35 jmlr-2009-Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination    (Special Topic on Model Selection)

Author: Eugene Tuv, Alexander Borisov, George Runger, Kari Torkkola

Abstract: Predictive models benefit from a compact, non-redundant subset of features that improves interpretability and generalization. Modern data sets are wide, dirty, mixed with both numerical and categorical predictors, and may contain interactive effects that require complex models. This is a challenge for filters, wrappers, and embedded feature selection methods. We describe details of an algorithm using tree-based ensembles to generate a compact subset of non-redundant features. Parallel and serial ensembles of trees are combined into a mixed method that can uncover masking and detect features of secondary effect. Simulated and actual examples illustrate the effectiveness of the approach. Keywords: trees, resampling, importance, masking, residuals

3 0.3023183 91 jmlr-2009-Subgroup Analysis via Recursive Partitioning

Author: Xiaogang Su, Chih-Ling Tsai, Hansheng Wang, David M. Nickerson, Bogong Li

Abstract: Subgroup analysis is an integral part of comparative analysis where assessing the treatment effect on a response is of central interest. Its goal is to determine the heterogeneity of the treatment effect across subpopulations. In this paper, we adapt the idea of recursive partitioning and introduce an interaction tree (IT) procedure to conduct subgroup analysis. The IT procedure automatically facilitates a number of objectively defined subgroups, in some of which the treatment effect is found prominent while in others the treatment has a negligible or even negative effect. The standard CART (Breiman et al., 1984) methodology is inherited to construct the tree structure. Also, in order to extract factors that contribute to the heterogeneity of the treatment effect, variable importance measure is made available via random forests of the interaction trees. Both simulated experiments and analysis of census wage data are presented for illustration. Keywords: CART, interaction, subgroup analysis, random forests

4 0.29659271 3 jmlr-2009-A Parameter-Free Classification Method for Large Scale Learning

Author: Marc Boullé

Abstract: With the rapid growth of computer storage capacities, available data and demand for scoring models both follow an increasing trend, sharper than that of the processing power. However, the main limitation to a wide spread of data mining solutions is the non-increasing availability of skilled data analysts, which play a key role in data preparation and model selection. In this paper, we present a parameter-free scalable classification method, which is a step towards fully automatic data mining. The method is based on Bayes optimal univariate conditional density estimators, naive Bayes classification enhanced with a Bayesian variable selection scheme, and averaging of models using a logarithmic smoothing of the posterior distribution. We focus on the complexity of the algorithms and show how they can cope with data sets that are far larger than the available central memory. We finally report results on the Large Scale Learning challenge, where our method obtains state of the art performance within practicable computation time. Keywords: large scale learning, naive Bayes, Bayesianism, model selection, model averaging

5 0.28912118 76 jmlr-2009-Python Environment for Bayesian Learning: Inferring the Structure of Bayesian Networks from Knowledge and Data    (Machine Learning Open Source Software Paper)

Author: Abhik Shah, Peter Woolf

Abstract: In this paper, we introduce PEBL, a Python library and application for learning Bayesian network structure from data and prior knowledge that provides features unmatched by alternative software packages: the ability to use interventional data, flexible specification of structural priors, modeling with hidden variables and exploitation of parallel processing. PEBL is released under the MIT open-source license, can be installed from the Python Package Index and is available at http://pebl-project.googlecode.com. Keywords: Bayesian networks, python, open source software

6 0.27456704 32 jmlr-2009-Exploiting Product Distributions to Identify Relevant Variables of Correlation Immune Functions

7 0.27389503 85 jmlr-2009-Settable Systems: An Extension of Pearl's Causal Model with Optimization, Equilibrium, and Learning

8 0.26845503 70 jmlr-2009-Particle Swarm Model Selection    (Special Topic on Model Selection)

9 0.25608402 97 jmlr-2009-Ultrahigh Dimensional Feature Selection: Beyond The Linear Model

10 0.24896973 43 jmlr-2009-Java-ML: A Machine Learning Library    (Machine Learning Open Source Software Paper)

11 0.2447636 31 jmlr-2009-Evolutionary Model Type Selection for Global Surrogate Modeling

12 0.22985135 81 jmlr-2009-Robust Process Discovery with Artificial Negative Events    (Special Topic on Mining and Learning with Graphs and Relations)

13 0.20926638 26 jmlr-2009-Dlib-ml: A Machine Learning Toolkit    (Machine Learning Open Source Software Paper)

14 0.20322567 62 jmlr-2009-Nonlinear Models Using Dirichlet Process Mixtures

15 0.18550994 19 jmlr-2009-Controlling the False Discovery Rate of the Association Causality Structure Learned with the PC Algorithm    (Special Topic on Mining and Learning with Graphs and Relations)

16 0.18026641 96 jmlr-2009-Transfer Learning for Reinforcement Learning Domains: A Survey

17 0.17161676 10 jmlr-2009-Application of Non Parametric Empirical Bayes Estimation to High Dimensional Classification

18 0.1715166 41 jmlr-2009-Improving the Reliability of Causal Discovery from Small Data Sets Using Argumentation    (Special Topic on Causality)

19 0.17040253 58 jmlr-2009-NEUROSVM: An Architecture to Reduce the Effect of the Choice of Kernel on the Performance of SVM

20 0.16929936 4 jmlr-2009-A Survey of Accuracy Evaluation Metrics of Recommendation Tasks