Inductive logic Programming 2009 Papers

Machine Learning Technology Artificial Intelligence Technology Natural Language Processing Technology Semantic Web Technology Ontology Technology Reasoning Technology Knowledge Information Technology Collecting AI Conference Papers Digital Transformation Technology

Following the previous ILP2008, we will now discuss ILP2009.

The ILP conference series has been the premier forum for work on logic-based approaches to machine learning for almost two decades. The 19th International Conference on Inductive Logic Programming, which was organized in Leuven, July 2-4, 2009, continued this tradition but also reached out to other communities as it was colocated with SRL-2009 – the International Workshop on Statistical Relational Learning, and MLG-2009 – the 7th International Workshop on Mining and Learning with Graphs. While these three series of events each have their own focus, emphasis and tradition, they essentially share the problem that is studied: learning about structured data in the form of graphs, relational descriptions or logic. The colocation of the events was intended to increase the interaction between the three communities.

There was a single program with joint invited and tutorial speakers, a panel, regular talks and poster sessions. The invited speakers and tutorial speakers were James Cussens, Jason Eisner, Jure Leskovec, Raymond Mooney, Scott Sanner, and Philip Yu. The panel featured Karsten Borgwardt, Luc De Raedt, Pedro Domingos, Paolo Frasconi, Thomas Ga ̈rtner, Kristian Kersting, Stephen Muggleton, and C. David Page. Video-recordings of these talks can be found at www.videolectures.net. The overall program featured 30 talks presented in two parallel tracks and 53 posters. The talks and posters were selected on the basis of an extended abstract. These abstracts can be found at http:// dtai.cs.kuleuven.be/ilp-mlg-srl/. In addition, as in previous years, a se- lection of the papers of ILP 2009 have been published in a volume in the Lectures Notes in Artificial Intelligence series and in a special issue of the Machine Learn- ing Journal. From the initial 54 extended abstracts (6 pages in LNCS format) that were submitted to ILP-2009, 5 papers were invited for the special issue, 10 papers were published as a long paper and 14 more as a short paper in the proceedings. These papers were prepared after the conference. Many of the other abstracts were accepted for poster presentation.

This event would not have been possible without the help of a large number of people: I would especially like to thank Bernhard Pfahringer for suggesting this colocation, the invited and tutorial speakers, the panelists, the Chairs of MLG (Hendrik Blockeel, Karsten Borgwardt and Xifeng Yan) and SRL (Pedro Domingos and Kristian Kersting), Filip Zˇelezny ́ for handling the Leuven papers, the Program Committee, the additional reviewers, the authors that submitted papers, the participants, the videolectures organization, and the extensive local organizing team (listed herein) as well as the sponsors (the BNVKI – the Benelux Association for Artificial Intelligence, Pascal 2 – the Network of Excellence, FWO – the Research Foundation Flanders) for their financial support. In preparing

this volume, the support of Jan Struyf (who managed Easychair) and Laura Antanas (who managed the proceedings) was essential.

Knowledge-Directed Theory Revision

Using domain knowledge to speed up learning is widely ac- cepted but theory revision of such knowledge continues to use general syntactic operators. Using such operators for theory revision of teleore- active logic programs is especially expensive in which proof of a top-level goal involves playing a game. In such contexts, one should have the option to complement general theory revision with domain-specific knowledge. Using American football as an example, we use Icarus’ multi-agent tele- oreactive logic programming ability to encode a coach agent whose con- cepts correspond to faults recognized in execution of the play and whose skills correspond to making repairs in the goals of the player agents. Our results show effective learning using as few as twenty examples. We also show that structural changes made by such revision can pro- duce performance gains that cannot be matched by doing only numeric optimization.

Towards Clausal Discovery for Stream Mining

With the increasing popularity of data streams it has become time to adapt logical and relational learning techniques for dealing with streams. In this note, we present our preliminary results on upgrading the clausal discovery paradigm towards the mining of streams. In this setting, there is a stream of interpretations and the goal is to learn a clausal theory that is satisfied by these interpretations. Furthermore, in data streams the interpretations can be read (and processed) only once.

On the Relationship between Logical Bayesian Networks and Probabilistic LogicProgramming Based on the Distribution Semantics

A significant part of current research on (inductive) logic program- ming deals with probabilistic logical models. Over the last decade many logics or languages for representing such models have been introduced. There is cur- rently a great need for insight into the relationships between all these languages. One kind of languages are those that extend probabilistic models with elements of logic, such as the language of Logical Bayesian Networks (LBNs). Some other languages follow the converse strategy of extending logic programs with a prob- abilistic semantics, often in a way similar to that of Sato’s distribution semantics.

In this paper we study the relationship between the language of LBNs and languages based on the distribution semantics. Concretely, we define a mapping from LBNs to theories in the Independent Choice Logic (ICL). We also show how this mapping can be used to learn ICL theories from data.

Induction of Relational Algebra Expressions

We study the problem of inducing relational data bases queries expressing quantification. Such queries express interesting multi-relational patterns in a database in a concise manner. Queries on relational databases can be expressed as Datalog programs. Inducing Datalog programs ex- pressing quantification requires both negation and predicate invention. Predicate invention has been studied in the ILP literature. However, we propose a radical new approach to inducing quantification. We express queries in the relational algebra and we perform a heuristic search for the desired expression. A heuristic, which takes the complement operator into account, is proposed. We report some preliminary experimental results of a software prototype implementing our ideas. The results are compared with the results of FOIL and Tilde on the same examples.

A Logic-Based Approach to Relation Extraction from Texts

In recent years, text mining has moved far beyond the clas- sical problem of text classification with an increased interest in more sophisticated processing of large text corpora, such as, for example, eval- uations of complex queries. This and several other tasks are based on the essential step of relation extraction. This problem becomes a typical ap- plication of learning logic programs by considering the dependency trees of sentences as relational structures and examples of the target rela- tion as ground atoms of a target predicate. In this way, each example is represented by a definite first-order Horn-clause. We show that an adaptation of Plotkin’s least general generalization (LGG) operator can effectively be applied to such clauses and propose a simple and effec- tive divide-and-conquer algorithm for listing a certain set of LGGs. We use these LGGs to generate binary features and compute the hypothesis by applying SVM to the feature vectors obtained. Empirical results on the ACE–2003 benchmark dataset indicate that the performance of our approach is comparable to state-of-the-art kernel methods.

Discovering Rules by Meta-level Abduction

This paper addresses discovery of unknown relations from incomplete network data by abduction. Given a network information such as causal relations and metabolic pathways, we want to infer missing links and nodes in the network to account for observations. To this end, we introduce a framework of meta-level abduction, which performs abduction in the meta level. This is implemented in SOLAR, an automated deduction system for consequence finding, using a first- order representation for algebraic properties of causality and the full-clausal form of network information and constraints. Meta-level abduction by SOLAR is pow- erful enough to infer missing rules, missing facts, and unknown causes that in- volve predicate invention in the form of existentially quantified hypotheses. We also show an application of rule abduction to discover certain physical techniques and related integrity constraints within the subject area of Skill Science.

Inductive Generalization of Analytically Learned Goal Hierarchies

We describe a new approach for learning procedural knowledge rep- resented as teleoreactive logic programs using relational behavior traces as input. This representation organizes task decomposition skills hierarchically and asso- ciate explicitly defined goals with them. Our approach integrates analytical learn- ing with inductive generalization in order to learn these skills. The analytical component predicts the goal dependencies in a successful solution and gener- ates a teleoreactive logic program that can solve similar problems by determining the structure of the skill hierarchy and skill applicability conditions (precondi- tions), which may be overgeneral. The inductive component experiments with these skills on new problems and uses the data collected in this process to refine the preconditions. Our system achieves this by converting the data collected dur- ing the problem solving experiments into the positive and negative examples of preconditions that can be learned with a standard Inductive Logic Programming system. We show that this conversion uses one of the main commitments of tele- oreactive logic programs: associating all skills with explicitly defined goals. We claim that our approach uses less expert effort compared to a purely inductive approach and performs better compared to a purely analytical approach.

Ideal Downward Refinement in the EL Description Logic

With the proliferation of the Semantic Web, there has been a rapidly rising interest in description logics, which form the logical foun- dation of the W3C standard ontology language OWL. While the num- ber of OWL knowledge bases grows, there is an increasing demand for tools assisting knowledge engineers in building up and maintaining their structure. For this purpose, concept learning algorithms based on re- finement operators have been investigated. In this paper, we provide an ideal refinement operator for the description logic EL and show that it is computationally feasible on large knowledge bases.

Nonmonotonic Onto-Relational Learning

In this paper we carry on the work on Onto-Relational Learning by investigating the impact of having disjunctive Datalog with de- fault negation either in the language of hypotheses or in the language for the background theory. The inclusion of nonmonotonic features strength- ens the ability of our ILP framework to deal with incomplete knowledge. One such ability can turn out to be useful in application domains, such as the Semantic Web. As a showcase we face the problem of inducing an integrity theory for a relational database whose instance is given and whose schema encompasses an ontology and a set of rules linking the database to the ontology.

CP-Logic Theory Inference with Contextual Variable Elimination and Comparison to BDD Based Inference Methods

There is a growing interest in languages that combine proba- bilistic models with logic to represent complex domains involving uncer- tainty. Causal probabilistic logic (CP-logic), which has been designed to model causal processes, is such a probabilistic logic language. This paper investigates inference algorithms for CP-logic; these are crucial for devel- oping learning algorithms. It proposes a new CP-logic inference method based on contextual variable elimination and compares this method to variable elimination and to methods based on binary decision diagrams.

Speeding Up Inference in Statistical Relational Learning by Clustering Similar Query Literals

Markov logic networks (MLNs) have been successfully ap- plied to several challenging problems by taking a “programming lan- guage” approach where a set of formulas is hand-coded and weights are learned from data. Because inference plays an important role in this pro- cess, “programming” with an MLN would be significantly facilitated by speeding up inference. We present a new meta-inference algorithm that exploits the repeated structure frequently present in relational domains to speed up existing inference techniques. Our approach first clusters the query literals and then performs full inference for only one repre- sentative from each cluster. The clustering step incurs only a one-time up-front cost when weights are learned over a fixed structure.

Chess Revision: Acquiring the Rules of Chess Variants through FOL Theory Revision from Examples

The game of chess has been a major testbed for research in artificial intelligence, since it requires focus on intelligent reasoning. Particularly, several challenges arise to machine learning systems when inducing a model describing legal moves of the chess, including the col- lection of the examples, the learning of a model correctly representing the official rules of the game, covering all the branches and restrictions of the correct moves, and the comprehensibility of such a model. Besides, the game of chess has inspired the creation of numerous variants, ranging from faster to more challenging or to regional versions of the game. The question arises if it is possible to take advantage of an initial classifier of chess as a starting point to obtain classifiers for the different variants. We approach this problem as an instance of theory revision from exam- ples. The initial classifier of chess is inspired by a FOL theory approved by a chess expert and the examples are defined as sequences of moves within a game. Starting from a standard revision system, we argue that abduction and negation are also required to best address this problem. Experimental results show the effectiveness of our approach

ProGolem: A System Based on Relative Minimal Generalisation

Over the last decade Inductive Logic Programming systems have been dominated by use of top-down refinement search techniques. In this paper we re-examine the use of bottom-up approaches to the con- struction of logic programs. In particular, we explore variants of Plotkin’s Relative Least General Generalisation (RLGG) which are based on sub- sumption relative to a bottom clause. With Plotkin’s RLGG, clause length grows exponentially in the number of examples. By contrast, in the Golem system, the length of ij-determinate RLGG clauses were shown to be polynomially bounded for given values of i and j. However, the determinacy restrictions made Golem inapplicable in many key applica- tion areas, including the learning of chemical properties from atom and bond descriptions. In this paper we show that with Asymmetric Relative Minimal Generalisations (or ARMGs) relative to a bottom clause, clause length is bounded by the length of the initial bottom clause. ARMGs, therefore do not need the determinacy restrictions used in Golem. An algorithm is described for constructing ARMGs and this has been imple- mented in an ILP system called ProGolem which combines bottom-clause construction in Progol with a Golem control strategy which uses ARMG in place of determinate RLGG. ProGolem has been evaluated on several well-known ILP datasets. It is shown that ProGolem has a similar or better predictive accuracy and learning time compared to Golem on two determinate real-world applications where Golem was originally tested. Moreover, ProGolem was also tested on several non-determinate real- world applications where Golem is inapplicable. In these applications, ProGolem and Aleph have comparable times and accuracies. The ex- perimental results also suggest that ProGolem significantly outperforms Aleph in cases where clauses in the target theory are long and complex.

An Inductive Logic Programming Approach to Validate Hexose Binding Biochemical Knowledge

Hexoses are simple sugars that play a key role in many cellular pathways, and in the regulation of development and disease mechanisms. Current protein-sugar computational models are based, at least partially, on prior biochemical findings and knowledge. They incorporate different parts of these findings in predictive black-box models. We investigate the empirical support for biochemical findings by comparing Inductive Logic Programming (ILP) induced rules to actual biochemical results. We mine the Protein Data Bank for a representative data set of hexose binding sites, non-hexose binding sites and surface grooves. We build an ILP model of hexose-binding sites and evaluate our results against several baseline ma- chine learning classifiers. Our method achieves an accuracy similar to that of other black-box classifiers while providing insight into the discriminat- ing process. In addition, it confirms wet-lab findings and reveals a previ- ously unreported Trp-Glu amino acids dependency.

Boosting First-Order Clauses for Large, Skewed Data Sets

Creating an effective ensemble of clauses for large, skewed data sets requires finding a diverse, high-scoring set of clauses and then combining them in such a way as to maximize predictive performance. We have adapted the RankBoost algorithm in order to maximize area under the recall-precision curve, a much better metric when working with highly skewed data sets than ROC curves. We have also explored a range of possibilities for the weak hypotheses used by our modified RankBoost algorithm beyond using individual clauses. We provide results on four large, skewed data sets showing that our modified RankBoost algorithm outperforms the original on area under the recall-precision curves.

Incorporating Linguistic Expertise Using ILP for Named Entity Recognition in Data Hungry Indian Languages

Developing linguistically sound and data-compliant rules for named entity annotation is usually an intensive and time consuming process for any de- veloper or linguist. In this work, we present the use of two Inductive Logic Pro- gramming (ILP) techniques to construct rules for extracting instances of various named entity classes thereby reducing the efforts of a linguist/developer. Using ILP for rule development not only reduces the amount of effort required but also provides an interactive framework wherein a linguist can incorporate his intuition about named entities such as in form of mode declarations for refinements (suita- bly exposed for ease of use by the linguist) and the background knowledge (in the form of linguistic resources). We have a small amount of tagged data – approxi- mately 3884 sentences for Marathi and 22748 sentences in Hindi. The paucity of tagged data for Indian languages makes manual development of rules more chal- lenging, However, the ability to fold in background knowledge and domain exper- tise in ILP techniques comes to our rescue and we have been able to develop rules that are mostly linguistically sound that yield results comparable to rules hand- crafted by linguists. The ILP approach has two advantages over the approach of hand-crafting all rules: (i) the development time reduces by a factor of 240 when ILP is used instead of involving a linguist for the entire rule development and (ii) the ILP technique has the computational edge that it has a complete and consistent view of all significant patterns in the data at the level of abstraction specified through the mode declarations. The point (ii) enables the discovery of rules that could be missed by the linguist and also makes it possible to scale the rule devel- opment to a larger training dataset. The rules thus developed could be optionally edited by linguistic experts and consolidated either (a) through default ordering (as in TILDE[1]) or (b) with an ordering induced using [2] or (c) by using the rules as features in a statistical graphical model such a conditional random field (CRF) [3]. We report results using WARMR [4] and TILDE to learn rules for named entities of Indian languages namely Hindi and Marathi.

Transfer Learning via Relational Templates

Transfer Learning described in “Overview of Transfer Learning and Examples of Algorithms and Implementations” refers to learning of knowledge in one do- main that can be applied to a different domain. In this paper, we view transfer learning as generalization of knowledge in a richer representa- tion language that includes multiple subdomains as parts of the same superdomain. We employ relational templates of different specificity to learn pieces of additive value functions. We show significant transfer of learned knowledge across different subdomains of a real-time strategy game by generalizing the value function using relational templates.

Automatic Revision of Metabolic Networks through Logical Analysis of Experimental Data

This paper presents a nonmonotonic ILP approach for the automatic revision of metabolic networks through the logical analysis of experimental data. The method extends previous work in two respects: by suggesting revisions that involve both the addition and removal of in- formation; and by suggesting revisions that involve combinations of gene functions, enzyme inhibitions, and metabolic reactions. Our proposal is based on a new declarative model of metabolism expressed in a non- monotonic logic programming formalism. With respect to this model, a mixture of abductive and inductive inference is used to compute a set of minimal revisions needed to make a given network consistent with some observed data. In this way, we describe how a reasoning system called XHAIL was able to correctly revise a state-of-the-art metabolic pathway in the light of real-world experimental data acquired by an autonomous laboratory platform called the Robot Scientist.

Finding Relational Associations in HIV Resistance Mutation Data

HIV therapy optimization is a hard task due to rapidly evolving mutations leading to drug resistance. Over the past five years, several machine learning approaches have been developed for decision support, mostly to predict therapy failure from the genotypic sequence of viral proteins and additional factors. In this paper, we define a relational rep- resentation for an important part of the data, namely the sequences of a viral protein (reverse transcriptase), their mutations, and the drug resis- tance(s) associated with those mutations. The data were retrieved from the Los Alamos National Laboratories’ (LANL) HIV databases. In con- trast to existing work in this area, we do not aim directly for predictive modeling, but take one step back and apply descriptive mining methods to develop a better understanding of the correlations and associations between mutations and resistances. In our particular application, we use the Warmr algorithm to detect non-trivial patterns connecting muta- tions and resistances. Our findings suggest that well-known facts can be rediscovered, but also hint at the potential of discovering yet unknown associations.

ILP, the Blind, and the Elephant: Euclidean Embedding of Co-proven Queries

Relational data is complex. This complexity makes one of the basic steps of ILP difficult: understanding the data and results. If the user cannot easily understand it, he draws incomplete conclusions. The situation is very much as in the parable of the blind men and the elephant that appears in many cultures. In this tale the blind work inde- pendently and with quite different pieces of information, thereby drawing very different conclusions about the nature of the beast. In contrast, vi- sual representations make it easy to shift from one perspective to another while exploring and analyzing data. This paper describes a method for embedding interpretations and queries into a single, common Euclidean space based on their co-proven statistics. We demonstrate our method on real-world datasets showing that ILP results can indeed be captured at a glance.

Parameter Screening and Optimisation for ILP Using Designed Experiments

Reports of experiments conducted with an Inductive Logic Programming system rarely describe how specific values of parameters of the system are arrived at when constructing models. Usually, no at- tempt is made to identify sensitive parameters, and those that are used are often given “factory-supplied” default values, or values obtained from some non-systematic exploratory analysis. The immediate consequence of this is, of course, that it is not clear if better models could have been obtained if some form of parameter selection and optimisation had been performed. Questions follow inevitably on the experiments themselves: specifically, are all algorithms being treated fairly, and is the exploratory phase sufficiently well-defined to allow the experiments to be replicated? In this paper, we investigate the use of parameter selection and opti- misation techniques grouped under the study of experimental design. Screening and “response surface” methods determine, in turn, sensitive parameters and good values for these parameters. This combined use of parameter selection and response surface-driven optimisation has a long history of application in industrial engineering, and its role in ILP is investigated using two well-known benchmarks. The results suggest that computational overheads from this preliminary phase are not substantial, and that much can be gained, both on improving system performance and on enabling controlled experimentation, by adopting well-established procedures such as the ones proposed here.

Don’t Fear Optimality: Sampling for Probabilistic-Logic Sequence Models

One of the current challenges in artificial intelligence is modeling dynamic environments that change due to the actions or activities undertaken by people or agents. The task of inferring hidden states, e.g. the activities or inten- tions of people, based on observations is called filtering. Standard probabilistic models such as Dynamic Bayesian Networks are able to solve this task efficiently using approximative methods such as particle filters. However, these models do not support logical or relational representations. The key contribution of this pa- per is the upgrade of a particle filter algorithm for use with a probabilistic logical representation through the definition of a proposal distribution. The performance of the algorithm depends largely on how well this distribution fits the target distri- bution. We adopt the idea of logical compilation into Binary Decision Diagrams for sampling. This allows us to use the optimal proposal distribution which is normally prohibitively slow.

Policy Transfer via Markov Logic Networks

We propose using a statistical-relational model, the Markov Logic Network, for knowledge transfer in reinforcement learning. Our goal is to extract relational knowledge from a source task and use it to speed up learning in a related target task. We show that Markov Logic Networks are effective models for capturing both source-task Q-functions and source-task policies. We apply them via demonstration, which in- volves using them for decision making in an initial stage of the target task before continuing to learn. Through experiments in the RoboCup simulated-soccer domain, we show that transfer via Markov Logic Net- works can significantly improve early performance in complex tasks, and that transferring policies is more effective than transferring Q-functions.

Can ILP Be Applied to Large Datasets?

There exist large data in science and business. Existing ILP systems cannot be applied effectively for data sets with 10000 data points. In this paper, we consider a technique which can be used to apply for more than 10000 data by simplifying it. Our approach is called Approximative Generalisation and can compress several data points into one example. In case that the original examples are mixture of positive and negative examples, the resulting example is ascribed in probability values representing proportion of positiveness. Our longer term aim is to apply on large Chess endgame database to allow well controlled evalua- tions of the technique. In this paper we start by choosing a simple game of Noughts and Crosses and we apply mini-max backup algorithm to obtain database of examples. These outcomes are compacted using our approach and empirical results show this has advantage both in accu- racy and speed. In further work we hope to apply the approach to large database of both natural and artificial domains.

In the next article, we will discuss ILP2010.