Misplaced Pages

ILP

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

Inductive logic programming ( ILP ) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term " inductive " here refers to philosophical (i.e. suggesting a theory to explain observed facts) rather than mathematical (i.e. proving a property for all members of a well-ordered set) induction. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples.

#148851

54-453: ILP can refer to: Computer science [ edit ] Inductive logic programming Information Leak Prevention Instruction-level parallelism Integer linear programming Other [ edit ] ilp . , a 2013 album by Kwes Independent Labour Party , United Kingdom Independent Living Program , a US Veteran Affairs program aimed at making sure that each eligible veteran

108-470: A d ← b o d y {\textstyle \mathrm {head} \leftarrow \mathrm {body} } in B ∪ H {\textstyle B\cup H} such that b o d y θ ⊆ e {\textstyle \mathrm {body} \theta \subseteq e} , h e a d θ ⊆ e {\displaystyle \mathrm {head} \theta \subseteq e} also holds. The goal

162-583: A background theory. In general, such relative least general generalisations are not guaranteed to exist; however, if the background theory B is a finite set of ground literals , then the negation of B is itself a clause. In this case, a relative least general generalisation can be computed by disjoining the negation of B with both C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} and then computing their least general generalisation as before. Relative least general generalisations are

216-464: A beam search among probabilistic logic programs by iteratively refining probabilistic theories and optimizing the parameters of each theory using expectation-maximisation. Its extension SLIPCOVER, proposed in 2014, uses bottom clauses generated as in Progol to guide the refinement process, thus reducing the number of revisions and exploring the search space more effectively. Moreover, SLIPCOVER separates

270-439: A bridge theory satisfying the conditions B ∧ ¬ E ⊨ F {\displaystyle B\land \neg E\models F} and F ⊨ ¬ H {\displaystyle F\models \neg H} . Then as H ⊨ ¬ F {\displaystyle H\models \neg F} , they generalize the negation of the bridge theory F with anti-entailment. However,

324-694: A clause C 2 {\textstyle C_{2}} such that R {\textstyle R} is the resolvent of C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} . A W-operator takes two clauses R 1 {\textstyle R_{1}} and R 2 {\textstyle R_{2}} and returns thre clauses C 1 {\textstyle C_{1}} , C 2 {\textstyle C_{2}} and C 3 {\textstyle C_{3}} such that R 1 {\textstyle R_{1}}

378-744: A clause C {\textstyle C} that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , and that is subsumed by every other clause that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} . The least general generalisation can be computed by first computing all selections from C {\textstyle C} and D {\textstyle D} , which are pairs of literals ( L , M ) ∈ ( C 1 , C 2 ) {\displaystyle (L,M)\in (C_{1},C_{2})} sharing

432-614: A correct hypothesis H with respect to theories B , E + , E − {\displaystyle B,E^{+},E^{-}} . A system is complete if and only if for any input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} any correct hypothesis H with respect to these input theories can be found with its hypothesis search procedure. Inductive logic programming systems can be roughly divided into two classes, search-based and meta-interpretative systems. Search-based systems exploit that

486-400: A form of statistical relational learning within the formalism of probabilistic logic programming. Given the goal of probabilistic inductive logic programming is to find a probabilistic logic program H {\textstyle H} such that the probability of positive examples according to H ∪ B {\textstyle {H\cup B}} is maximized and

540-456: A literal with the most information gain in an iteration of FOCL is non-operational, it is operationalized and its definition is added to the clause under construction. An operational rule might be the literal lessThan(X,Y) ; a non-operational rule might be between(X,Y,Z) ← lessThan(X,Y), lessThan(Y,Z) . The addition of non-operational rules to the knowledge base increases the size of the space which FOCL must search. Rather than simply providing

594-473: A meta-level logic program which is then solved to obtain an optimal hypothesis. Formalisms used to express the problem specification include Prolog and answer set programming , with existing Prolog systems and answer set solvers used for solving the constraints. And example of a Prolog-based system is Metagol , which is based on a meta-interpreter in Prolog , while ASPAL and ILASP are based on an encoding of

SECTION 10

#1732765281149

648-490: A paper by Stephen Muggleton in 1990, defined as the intersection of machine learning and logic programming. Muggleton and Wray Buntine introduced predicate invention and inverse resolution in 1988. Several inductive logic programming systems that proved influential appeared in the early 1990s. FOIL , introduced by Ross Quinlan in 1990 was based on upgrading propositional learning algorithms AQ and ID3 . Golem , introduced by Muggleton and Feng in 1990, went back to

702-638: A restricted form of Plotkin's least generalisation algorithm. The Progol system, introduced by Muggleton in 1995, first implemented inverse entailment, and inspired many later systems. Aleph , a descendant of Progol introduced by Ashwin Srinivasan in 2001, is still one of the most widely used systems as of 2022 . At around the same time, the first practical applications emerged, particularly in bioinformatics , where by 2000 inductive logic programming had been successfully applied to drug design, carcinogenicity and mutagenicity prediction, and elucidation of

756-493: A restriction on h , but forbids any generation of a hypothesis as long as the positive facts are explainable without it. . "Weak consistency", which states that no contradiction can be derived from B ∧ H {\textstyle B\land H} , forbids generation of any hypothesis h that contradicts the background knowledge B . Weak consistency is implied by strong consistency; if no negative examples are given, both requirements coincide. Weak consistency

810-413: A variety of ways, which affect how FOCL selects literals to test while extending a clause under construction. Constraints on the search space are allowed, as are predicates that are defined on a rule rather than on a set of examples (called intensional predicates); most importantly a potentially incorrect hypothesis is allowed as an initial approximation to the predicate to be learned. The main goal of FOCL

864-541: Is a rule-based learning algorithm. Developed in 1990 by Ross Quinlan , FOIL learns function-free Horn clauses , a subset of first-order predicate calculus . Given positive and negative examples of some concept and a set of background-knowledge predicates , FOIL inductively generates a logical concept definition or rule for the concept. The induced rule must not involve any constants ( color(X,red) becomes color(X,Y), red(Y) ) or function symbols, but may allow negated predicates; recursive concepts are also learnable. Like

918-847: Is a set of clauses satisfying the following requirements, where the turnstile symbol ⊨ {\displaystyle \models } stands for logical entailment : Completeness: B ∪ H ⊨ E + Consistency:  B ∪ H ∪ E − ⊭ false {\displaystyle {\begin{array}{llll}{\text{Completeness:}}&B\cup H&\models &E^{+}\\{\text{Consistency: }}&B\cup H\cup E^{-}&\not \models &{\textit {false}}\end{array}}} Completeness requires any generated hypothesis h to explain all positive examples E + {\textstyle E^{+}} , and consistency forbids generation of any hypothesis h that

972-501: Is able to live independently Index Librorum Prohibitorum , the list of publications banned by the Catholic Church between 1559 and 1966. Individual Learning Plan , a teaching methodology Inner Line Permit , a permission required for mainland Indian citizens to be able to travel into a restricted/protected state of North-East India Institution of Lighting Professionals , a professional lighting association based in

1026-764: Is allowed until the number of free variables exceeds the maximum used for any predicate. Unlike FOIL, which does not put typing constraints on its variables, FOCL uses typing as an inexpensive way of incorporating a simple form of background knowledge. For example, a predicate livesAt(X,Y) may have types livesAt(person, location) . Additional predicates may need to be introduced, though – without types, nextDoor(X,Y) could determine whether person X and person Y live next door to each other, or whether two locations are next door to each other. With types, two different predicates nextDoor(person, person) and nextDoor(location, location) would need to exist for this functionality to be maintained. However, this typing mechanism eliminates

1080-507: Is different from Wikidata All article disambiguation pages All disambiguation pages Inductive logic programming Inductive logic programming is particularly useful in bioinformatics and natural language processing . Building on earlier work on Inductive inference , Gordon Plotkin was the first to formalise induction in a clausal setting around 1970, adopting an approach of generalising from examples. In 1981, Ehud Shapiro introduced several ideas that would shape

1134-516: Is equivalent to adjacent(Y,X) ), still others may require that a particular variable be present in the current clause, and many other potential constraints. Operational rules are those rules which are defined extensionally , or as a list of tuples for which a predicate is true. FOIL allows only operational rules; FOCL extends its knowledge base to allow combinations of rules called non-operational rules as well as partially defined or incorrect rules for robustness. Allowing for partial definitions reduces

SECTION 20

#1732765281149

1188-550: Is false. On the next iteration of FOIL after parent(X,Z) has been added, the algorithm will consider all combinations of predicate names and variables such that at least one variable in the new literal is present in the existing clause. This results in a very large search space. Several extensions of the FOIL theory have shown that additions to the basic algorithm may reduce this search space, sometimes drastically. The FOCL algorithm ( First Order Combined Learner ) extends FOIL in

1242-576: Is given as a hypothesis H , itself a logical theory that typically consists of one or more clauses. The two settings differ in the format of examples presented. As of 2022 , learning from entailment is by far the most popular setting for inductive logic programming. In this setting, the positive and negative examples are given as finite sets E + {\textstyle E^{+}} and E − {\textstyle E^{-}} of positive and negated ground literals , respectively. A correct hypothesis H

1296-464: Is inconsistent with the negative examples E − {\textstyle E^{-}} , both given the background knowledge B . In Muggleton's setting of concept learning, "completeness" is referred to as "sufficiency", and "consistency" as "strong consistency". Two further conditions are added: " Necessity ", which postulates that B does not entail E + {\textstyle E^{+}} , does not impose

1350-573: Is particularly important in the case of noisy data, where completeness and strong consistency cannot be guaranteed. In learning from interpretations, the positive and negative examples are given as a set of complete or partial Herbrand structures , each of which are themselves a finite set of ground literals. Such a structure e is said to be a model of the set of clauses B ∪ H {\textstyle B\cup H} if for any substitution θ {\textstyle \theta } and any clause h e

1404-409: Is required to be present already in an unnegated literal of the clause). If FOIL extends a clause grandfather(X,Y) ← true by conjoining the literal parent(X,Z) , it is introducing the new variable Z . Positive examples now consist of those values < X,Y,Z > such that grandfather(X,Y) is true and parent(X,Z) is true; negative examples are those where grandfather(X,Y) is true but parent(X,Z)

1458-455: Is the resolvent of C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} and R 2 {\textstyle R_{2}} is the resolvent of C 2 {\textstyle C_{2}} and C 3 {\textstyle C_{3}} . Inverse resolution was first introduced by Stephen Muggleton and Wray Buntine in 1988 for use in

1512-513: Is then to output a hypothesis that is complete, meaning every positive example is a model of B ∪ H {\textstyle B\cup H} , and consistent, meaning that no negative example is a model of B ∪ H {\textstyle B\cup H} . An inductive logic programming system is a program that takes as an input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} and outputs

1566-409: Is to incorporate the methods of explanation-based learning (EBL) into the empirical methods of FOIL. Even when no additional knowledge is provided to FOCL over FOIL, however, it utilizes an iterative widening search strategy similar to depth-first search : first FOCL attempts to learn a clause by introducing no free variables. If this fails (no positive gain), one additional free variable per failure

1620-455: The ID3 algorithm , FOIL hill climbs using a metric based on information theory to construct a rule that covers the data. Unlike ID3, however, FOIL uses a separate-and-conquer method rather than divide-and-conquer , focusing on creating one rule at a time and collecting uncovered examples for the next iteration of the algorithm. The FOIL algorithm is as follows: Suppose FOIL's task is to learn

1674-409: The resolution inference rule. A least general generalisation algorithm takes as input two clauses C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} and outputs the least general generalisation of C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , that is,

ILP - Misplaced Pages Continue

1728-452: The Progol hypothesis search procedure based on the inverse entailment inference rule is not complete by Yamamoto's example . On the other hand, Imparo is complete by both anti-entailment procedure and its extended inverse subsumption procedure. Rather than explicitly searching the hypothesis graph, metainterpretive or meta-level systems encode the inductive logic programming program as

1782-616: The UK and Ireland. Intelligence-led policing Isolated Limb Perfusion , a limb-sparing, neoadjuvant therapy for soft tissue sarcomas Topics referred to by the same term [REDACTED] This disambiguation page lists articles associated with the title ILP . If an internal link led you here, you may wish to change the link to point directly to the intended article. Retrieved from " https://en.wikipedia.org/w/index.php?title=ILP&oldid=867057022 " Category : Disambiguation pages Hidden categories: Short description

1836-426: The amount of work needed as the algorithm need not generate these partial definitions for itself, and the incorrect rules do not add significantly to the work needed since they are discarded if they are not judged to provide positive information gain. Non-operational rules are advantageous as the individual rules which they combine may not provide information gain on their own, but are useful when taken in conjunction. If

1890-444: The authors learn the structure of first-order rules with associated probabilistic uncertainty parameters. Their approach involves generating the underlying graphical model in a preliminary step and then applying expectation-maximisation. In 2008, De Raedt et al. presented an algorithm for performing theory compression on ProbLog programs, where theory compression refers to a process of removing as many clauses as possible from

1944-430: The concept grandfather(X,Y) given the relations father(X,Y) and parent(X,Y) . Furthermore, suppose our current Body consists of grandfather(X,Y) ← parent(X,Z) . This can be extended by conjoining Body with any of the literals father(X,X) , father(Y,Z) , parent(U,Y) , or many others – to create this literal, the algorithm must choose both a predicate name and a set of variables for the predicate (at least one of which

1998-417: The examples can be given as examples or as (partial) interpretations. Parameter learning for languages following the distribution semantics has been performed by using an expectation-maximisation algorithm or by gradient descent . An expectation-maximisation algorithm consists of a cycle in which the steps of expectation and maximization are repeatedly performed. In the expectation step, the distribution of

2052-478: The field in his new approach of model inference, an algorithm employing refinement and backtracing to search for a complete axiomatisation of given examples. His first implementation was the Model Inference System in 1981: a Prolog program that inductively inferred Horn clause logic programs from positive and negative examples. The term Inductive Logic Programming was first introduced in

2106-549: The foundation of the bottom-up system Golem . Inverse resolution is an inductive reasoning technique that involves inverting the resolution operator . Inverse resolution takes information about the resolvent of a resolution step to compute possible resolving clauses. Two types of inverse resolution operator are in use in inductive logic programming: V-operators and W-operators. A V-operator takes clauses R {\textstyle R} and C 1 {\textstyle C_{1}} as input and returns

2160-411: The hidden variables is computed according to the current values of the probability parameters, while in the maximisation step, the new values of the parameters are computed. Gradient descent methods compute the gradient of the target function and iteratively modify the parameters moving in the direction of the gradient. Structure learning was pioneered by Daphne Koller and Avi Pfeffer in 1997, where

2214-549: The inductive logic programming problem in answer set programming. Evolutionary algorithms in ILP use a population-based approach to evolve hypotheses, refining them through selection, crossover, and mutation. Methods like EvoLearner have been shown to outperform traditional approaches on structured machine learning benchmarks. Probabilistic inductive logic programming adapts the setting of inductive logic programming to learning probabilistic logic programs . It can be considered as

ILP - Misplaced Pages Continue

2268-496: The inductive logic programming system FOIL with ProbLog . Logical rules are learned from probabilistic data in the sense that both the examples themselves and their classifications can be probabilistic. The set of rules has to allow one to predict the probability of the examples from their description. In this setting, the parameters (the probability values) are fixed and the structure has to be learned. In 2011, Elena Bellodi and Fabrizio Riguzzi introduced SLIPCASE, which performs

2322-561: The inductive logic programming system Cigol. By 1993, this spawned a surge of research into inverse resolution operators and their properties. The ILP systems Progol, Hail and Imparo find a hypothesis H using the principle of the inverse entailment for theories B , E , H : B ∧ H ⊨ E ⟺ B ∧ ¬ E ⊨ ¬ H {\displaystyle B\land H\models E\iff B\land \neg E\models \neg H} . First they construct an intermediate theory F called

2376-565: The introduction of meta-interpretative learning makes predicate invention and learning recursive programs more feasible. This technique was pioneered with the Metagol system introduced by Muggleton, Dianhuan Lin, Niels Pahlavi and Alireza Tamaddoni-Nezhad in 2014. This allows ILP systems to work with fewer examples, and brought successes in learning string transformation programs, answer set grammars and general algorithms. Inductive logic programming has adopted several different learning settings,

2430-461: The most common of which are learning from entailment and learning from interpretations. In both cases, the input is provided in the form of background knowledge B , a logical theory (commonly in the form of clauses used in logic programming ), as well as positive and negative examples, denoted E + {\textstyle E^{+}} and E − {\textstyle E^{-}} respectively. The output

2484-658: The need for predicates such as isPerson(X) or isLocation(Y) , and need not consider livesAt(A,B) when A and B are defined to be person variables, reducing the search space. Additionally, typing can improve the accuracy of the resulting rule by eliminating from consideration impossible literals such as livesAt(A,B) which may nevertheless appear to have a high information gain . Rather than implementing trivial predicates such as equals(X,X) or between(X,X,Y) , FOCL introduces implicit constraints on variables, further reducing search space. Some predicates must have all variables unique, others must have commutativity ( adjacent(X,Y)

2538-414: The operation of anti-entailment is computationally more expensive since it is highly nondeterministic. Therefore, an alternative hypothesis search can be conducted using the inverse subsumption (anti-subsumption) operation instead, which is less non-deterministic than anti-entailment. Questions of completeness of a hypothesis search procedure of specific inductive logic programming system arise. For example,

2592-416: The probability of negative examples is minimized. This problem has two variants: parameter learning and structure learning. In the former, one is given the structure (the clauses) of H and the goal is to infer the probabilities annotations of the given clauses, while in the latter the goal is to infer both the structure and the probability parameters of H . Just as in classical inductive logic programming,

2646-562: The result of applying θ {\textstyle \theta } to C 1 {\textstyle C_{1}} , is a subset of C 2 {\textstyle C_{2}} . This lattice can be traversed either bottom-up or top-down. Bottom-up methods to search the subsumption lattice have been investigated since Plotkin's first work on formalising induction in clausal logic in 1970. Techniques used include least general generalisation, based on anti-unification , and inverse resolution, based on inverting

2700-441: The same predicate symbol and negated/unnegated status. Then, the least general generalisation is obtained as the disjunction of the least general generalisations of the individual selections, which can be obtained by first-order syntactical anti-unification . To account for background knowledge, inductive logic programming systems employ relative least general generalisations , which are defined in terms of subsumption relative to

2754-589: The search for promising clauses from that of the theory: the space of clauses is explored with a beam search , while the space of theories is searched greedily . [REDACTED]  This article incorporates text from a free content work. Licensed under CC-BY 4.0 ( license statement/permission ). Text taken from A History of Probabilistic Inductive Logic Programming​ , Fabrizio Riguzzi, Elena Bellodi and Riccardo Zese, Frontiers Media . First-order inductive learner In machine learning , first-order inductive learner ( FOIL )

SECTION 50

#1732765281149

2808-429: The space of possible clauses forms a complete lattice under the subsumption relation, where one clause C 1 {\textstyle C_{1}} subsumes another clause C 2 {\textstyle C_{2}} if there is a substitution θ {\textstyle \theta } such that C 1 θ {\textstyle C_{1}\theta } ,

2862-448: The structure and function of proteins. Unlike the focus on automatic programming inherent in the early work, these fields used inductive logic programming techniques from a viewpoint of relational data mining . The success of those initial applications and the lack of progress in recovering larger traditional logic programs shaped the focus of the field. Recently, classical tasks from automated programming have moved back into focus, as

2916-515: The theory in order to maximize the probability of a given set of positive and negative examples. No new clause can be added to the theory. In the same year, Meert, W. et al. introduced a method for learning parameters and structure of ground probabilistic logic programs by considering the Bayesian networks equivalent to them and applying techniques for learning Bayesian networks. ProbFOIL, introduced by De Raedt and Ingo Thon in 2010, combined

#148851