Description of methods used in ILP algorithms

A generic ILP algorithm needs operations for constructing a new hypothesis. The intended hypothesis can be constructed using one of the following approaches:
• Top-down: specialization (used e.g. in FOIL)
• Bottom-up: generalization (used e.g. in GOLEM)
• A combination of both (used e.g. in PROGOL)

FOIL(Goal_P, BKPredicates, Examples)
Input: Goal_P, BKPredicates, Examples
Output: Induced_rules

    Pos := { e ∈ Examples : e meets the goal Goal_P }
    Neg := { e ∈ Examples : e does not meet the goal Goal_P }
    Induced_rules := ∅
    while Pos is not ∅ do
        "induce new rule New_Rule"
        Induced_rules := Induced_rules ∪ {New_Rule}
        Pos := Pos − {examples from Pos covered by Induced_rules}
    output Induced_rules

Specialization algorithm in FOIL: procedure "induce new rule New_Rule"

    New_Rule := (Goal_P)   /* most general unconditional fact covering the considered predicate */
    New_Neg := Neg
    while New_Neg is not ∅ do   /* "specialize rule" */
        • Suggestions_P := choice of candidate literals that can be included in the body of New_Rule
        • Best_P := argmax { FOIL_Gain(L, New_Rule) : L ∈ Suggestions_P }
        • New_Rule := New_Rule extended by adding the literal Best_P to its body
        • New_Neg := { P : P ∈ New_Neg & P is covered by all literals in the body of New_Rule }   /* the part of Neg that is still covered by New_Rule */
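The covering loop and the specialization step can be sketched in Python. This is a simplified, propositional illustration and not FOIL itself: examples are plain tuples, a candidate literal is a hypothetical `(name, test)` pair with a Boolean test and no new variables, and the gain is weighted by the number of positives that remain covered. The guard that stops when no candidate excludes a negative example is an addition to keep the sketch from looping forever.

```python
import math

def covers(body, example):
    """A rule body covers an example if every literal test holds."""
    return all(test(example) for _, test in body)

def foil_gain(body, literal, pos, neg):
    """Information gain of adding `literal` to `body`, weighted by kept positives."""
    p0 = [e for e in pos if covers(body, e)]
    n0 = [e for e in neg if covers(body, e)]
    p1 = [e for e in p0 if literal[1](e)]
    n1 = [e for e in n0 if literal[1](e)]
    if not p1:
        return float("-inf")  # the literal would lose every positive example
    inf0 = -math.log2(len(p0) / (len(p0) + len(n0)))
    inf1 = -math.log2(len(p1) / (len(p1) + len(n1)))
    return (inf0 - inf1) * len(p1)

def induce_rule(pos, neg, candidates):
    """Specialize the most general rule until no negative example is covered."""
    body, new_neg = [], list(neg)
    while new_neg:
        best = max(candidates, key=lambda lit: foil_gain(body, lit, pos, new_neg))
        remaining = [e for e in new_neg if best[1](e)]
        if len(remaining) == len(new_neg):
            break  # assumption: give up when no literal excludes a negative
        body.append(best)
        new_neg = remaining
    return body

def foil(pos, neg, candidates):
    """Sequential covering: induce rules until all positives are covered."""
    rules, pos = [], list(pos)
    while pos:
        body = induce_rule(pos, neg, candidates)
        if not any(covers(body, e) for e in pos):
            break  # the new rule covers nothing; stop
        rules.append([name for name, _ in body])
        pos = [e for e in pos if not covers(body, e)]
    return rules

# Toy data (hypothetical): the target concept is "a and b".
candidates = [("a", lambda e: e[0] == 1), ("b", lambda e: e[1] == 1)]
print(foil([(1, 1)], [(1, 0), (0, 1), (0, 0)], candidates))
```

On this toy input the loop first adds `a`, which still covers the negative `(1, 0)`, and then adds `b`, after which no negative is covered.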

Choice of literals for the created clause
(1) P(X) :- T1(X,Y), T2(X,Y).
Suppose (1) covers a negative example e ∈ Neg, i.e. P(e) holds (because there is a c such that T1(e,c) and T2(e,c) hold). The clause (1) then has to be specialized.
The choice of additional candidate literals of the form L(Z1,Z2,Z3) respects some heuristic constraints:
• Syntactic constraints on the terms used, e.g. "at most 1 variable among Z1,...,Z3 is new (does not appear among X and Y)"
• Restrictions on the set of possible literals ("=", recursive rules, ...)
Selection criterion FOIL_Gain(L, New_Rule): weighted information gain, i.e. the decrease in the number of bits necessary for coding the corresponding decision (similar to ID3).

FOIL_Gain(L, New_Rule)
(1) P(X) :- T1(X,Y), T2(X,Y).
(2) P(X) :- T1(X,Y), T2(X,Y), L(X,Y,Z).
Let Examples_i denote the set of training examples that meet the condition given by the body of clause (1), i.e. T1(X,Y), T2(X,Y).

Set                                                                      Cardinality
Examples_i = {<e,f> : T1(e,f) & T2(e,f)}                                 m_i
Pos_Examples_i = {<e,f> : <e,f> ∈ Examples_i & e ∈ Pos}                  k_i
Neg_Examples_i = {<e,f> : <e,f> ∈ Examples_i & e ∈ Neg}                  z_i
Examples_i(L) = {<e,f,g> : T1(e,f) & T2(e,f) & L(e,f,g)}                 m_i(L)
Pos_Examples_i(L) = {<e,f,g> : <e,f,g> ∈ Examples_i(L) & e ∈ Pos}        k_i(L)
Neg_Examples_i(L) = {<e,f,g> : <e,f,g> ∈ Examples_i(L) & e ∈ Neg}        z_i(L)

$\mathrm{Inf}_i = -\log_2 (k_i / m_i)$
$\mathrm{Inf}_i(L) = -\log_2 (k_i(L) / m_i(L))$
$\mathrm{FOIL\_Gain}(L, \mathrm{Examples}_i) = (\mathrm{Inf}_i - \mathrm{Inf}_i(L)) \cdot \mathrm{card}\{u : u \in \mathrm{Examples}_i \wedge \exists v\, (\langle u, v \rangle \in \mathrm{Examples}_i(L))\}$
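As a small numeric illustration, the gain can be computed directly from the cardinalities. The function below is a sketch: the parameter names mirror k_i, m_i, k_i(L), m_i(L), and `t` stands for the cardinality of the set in the last factor of the formula (the bindings of clause (1) that can still be extended to satisfy L). Returning minus infinity when L keeps no positive binding is an assumption of this sketch, not part of the definition.

```python
import math

def foil_gain(k_i, m_i, k_l, m_l, t):
    """FOIL gain from counts.

    k_i, m_i: positive and all bindings covered by the clause before adding L
    k_l, m_l: positive and all bindings covered after adding L
    t:        number of old bindings that have at least one extension satisfying L
    """
    if k_l == 0:
        return float("-inf")  # assumption: a literal that keeps no positives is useless
    inf_before = -math.log2(k_i / m_i)
    inf_after = -math.log2(k_l / m_l)
    return (inf_before - inf_after) * t

# Hypothetical counts: 2 of 4 bindings are positive before, 2 of 2 after.
print(foil_gain(2, 4, 2, 2, 2))  # (1 bit - 0 bits) * 2
```

With these counts the clause needs 1 bit per positive binding before the literal is added and 0 bits after, so the gain is 2.0.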

PROGOL
Combines both techniques: it searches the hypothesis space top-down, while mode-directed inverse entailment (MDIE) supplies a lower bound for the search space. MDIE applies a bottom-up approach, but in contrast to the syntactic approach of inverse resolution (based on proof theory) it is based on model theory. MDIE generalizes and enhances the earlier approaches.
How do the MDIE constraints increase efficiency?
• Mode declarations for the used predicates characterize the "flow of information" from input to output.
• A clear definition of the lower bound for the top-down part (the most specific clause is used).

Declaration of modes restricts the usage of input/output variables in the hypothesis
modeh(n, atom) and modeb(n, atom) specify the literals that can be used in the head or in the body of the rule, respectively.
• n is a natural number > 0 or *; it is referred to as the "recall" and specifies the upper bound on the number of different solutions to be searched for the considered predicate (* means "no restrictions"). This is very useful in the case of functional dependence ("1 mother", ...).
• The expression atom has the structure name(Types), where Types is a sequence whose length is given by the arity of the predicate name and which consists of expressions +type, -type, #type (meaning input, output and constant of the corresponding type).
• The placement of input variables in a hypothesized clause h :- b1, ..., bn is restricted: every variable of +type in any atom bi is either of +type in h or of -type in some atom bj, where 0 < j < i.
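The placement restriction on input variables can be checked mechanically. The following Python sketch assumes a hypothetical clause encoding in which every atom is a list of `(mode, variable)` pairs with mode `"+"`, `"-"` or `"#"`; it is an illustration of the rule, not Progol's implementation.

```python
def modes_respected(head, body):
    """Check that every +variable in a body atom is an input of the head
    or an output of an earlier body atom."""
    available = {var for mode, var in head if mode == "+"}
    for atom in body:
        for mode, var in atom:
            if mode == "+" and var not in available:
                return False
        # outputs of this atom become usable by the atoms that follow
        available |= {var for mode, var in atom if mode == "-"}
    return True

# implies5(A,B,C) :- not5(A,D), or5(B,D,C).
head = [("+", "A"), ("+", "B"), ("-", "C")]
not5 = [("+", "A"), ("-", "D")]
or5 = [("+", "B"), ("+", "D"), ("-", "C")]

print(modes_respected(head, [not5, or5]))  # D is produced before it is used
print(modes_respected(head, [or5, not5]))  # D is used before it is produced
```

The first call accepts the clause because D is an output of not5 before or5 consumes it; the second rejects the reordered body.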

Theoretical foundations
• Let B be the background knowledge and E the set of training examples. Let H denote the intended hypothesis (a Horn clause) such that B & H |= E.
• This is equivalent to B & not(E) |= not(H), where not(H) is a conjunction of literals.
• Let not(⊥) denote the conjunction of all ground literals that are true in all models of B & not(E). The main difference among the considered models is their domain (i.e. the set of objects on which the predicates are defined). The formula not(⊥) is defined in such a way that the following condition holds:
  B & not(E) |= not(⊥)
• On the other hand, not(H) must hold in all models of B & not(E) as well. The formula not(⊥) contains everything that is shared by these models, thus the literals of not(H) must be a subset of the literals of not(⊥). Consequently
  B & not(E) |= not(⊥) |= not(H) and B & H |= not(not(⊥)), i.e. B & H |= ⊥
• Using ⊥ it is possible to define the most specific clause Se, which can serve as the lower bound for the hypothesis H (H θ-subsumes Se).

Construction of not() and Se • In general,  can have infinite cardinality. Progol uses mode declarations to build the most specific clause as a constraint for the search of suitable hypothesis. • The declaration of modes provides these constraints by specifying: • the predicates, which can be used in the head (modeh) and in the body (modeb) of the constructed hypothesis • the arguments (ground terms), which have to included as a consequence of validity of ground atoms in not(E) • The predicates assigned as modeh appear in not() as negative literals • The predicates assigned as modeb appear in not() as positive literals • Let hash: Terms --> N be a function uniquely mapping all (ground) terms to natural numbers

Example: Define implication in 5-valued logic

    modeh(1, implies5(+truthvalue, +truthvalue, -truthvalue))
    modeb(1, or5(+truthvalue, +truthvalue, -truthvalue))
    modeb(1, not5(+truthvalue, -truthvalue))

    % Types
    truthvalue(0). truthvalue(1). truthvalue(2). truthvalue(3). truthvalue(4).

    % Background knowledge: definitions of not5 and or5
    not5(X,Y) :- Y is (4-X).
    or5(X,X,X).
    or5(X,Y,Z) :- X>Y, Z is X.
    or5(X,Y,Z) :- X<Y, Z is Y.

    % Positive examples
    implies5(4,4,4).
    implies5(4,0,0).
    implies5(0,4,4).
    implies5(0,0,4).
    implies5(1,2,3).

    % Negative examples
    :- implies5(2,0,0).
    :- implies5(4,2,4).

Example: implication in 5-valued logic (continuation 1)
Construction of the most specific clause Se and ⊥
Generalize the first example e = implies5(4,4,4). Using modeh(1, implies5(+truthvalue, +truthvalue, -truthvalue)) we get B & not(e) |= ¬implies5(4,4,4). Object 4 must be in the domain.
Now apply the body declarations modeb(1, or5(+truthvalue, +truthvalue, -truthvalue)) and modeb(1, not5(+truthvalue, -truthvalue)) to find the other objects that have to be included. The background knowledge has to be taken into account. Consider not5(X,Y) :- Y is (4-X). Replacing the input variable by 4 we get B & not(e) |= not5(4,0). We have a new object 0, which must be included in the domain. As a consequence, new facts on or5 and not5 with inputs from {0, 4} have to be added.
Finally, ⊥ (the negation of not(⊥)) is
    implies5(4,4,4) v ¬or5(4,4,4) v ¬not5(4,0) v ¬or5(4,0,4) v ¬or5(0,4,4) v ¬or5(0,0,0) v ¬not5(0,4).
The most specific clause Se is created by replacing constants with variables:
    implies5(A,A,A) v ¬or5(A,A,A) v ¬not5(A,B) v ¬or5(A,B,A) v ¬or5(B,A,A) v ¬or5(B,B,B) v ¬not5(B,A).
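The step from the ground clause to Se, replacing each distinct constant by its own variable, can be sketched as follows. The encoding of a literal as a `(predicate, arguments)` pair and the use of first-occurrence order in place of the hash function are assumptions of this sketch; it also handles at most 26 distinct terms and ignores #type arguments, which would stay constant.

```python
from string import ascii_uppercase

def variablize(literals):
    """Replace every distinct ground term by a variable name A, B, C, ..."""
    names = {}   # ground term -> variable name (stands in for hash)
    lifted = []
    for pred, args in literals:
        new_args = []
        for term in args:
            if term not in names:
                names[term] = ascii_uppercase[len(names)]
            new_args.append(names[term])
        lifted.append((pred, tuple(new_args)))
    return lifted

# Head literal first, then the body literals of the ground clause.
ground = [
    ("implies5", (4, 4, 4)),
    ("or5", (4, 4, 4)), ("not5", (4, 0)), ("or5", (4, 0, 4)),
    ("or5", (0, 4, 4)), ("or5", (0, 0, 0)), ("not5", (0, 4)),
]
for pred, args in variablize(ground):
    print(f"{pred}({','.join(args)})")
```

Here 4 is mapped to A and 0 to B, which gives the literals of Se in the order listed above.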

Example: implication in 5-valued logic (continuation 2)
The other notation for the most specific clause Se is
    implies5(A,A,A) :- or5(A,A,A), not5(A,B), or5(A,B,A), or5(B,A,A), or5(B,B,B), not5(B,A).
This clause is used as the lower bound for the top-down search starting from False :- true. From then on, only those refinements which subsume Se are considered, namely
    implies5(A,A,A) :- ….
    implies5(B,B,A) :- ….
    implies5(B,C,A) :-
The best one is chosen according to a predictive/compression measure. Let it be implies5(B,A,A) :-
Further refinement considers the predicates in the body of Se …
Finally, the result is
    implies5(A,B,C) :- not5(A,D), or5(B,D,C).
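The resulting rule can be checked against the training examples with a few lines of Python. The functions below are direct transcriptions of the background predicates under the assumption that or5 behaves as a maximum, which is what its three clauses define; the rule is evaluated as a function from the two inputs to the output.

```python
def not5(x):
    return 4 - x          # not5(X,Y) :- Y is (4-X).

def or5(x, y):
    return max(x, y)      # the three or5 clauses together compute the maximum

def implies5(a, b):
    # implies5(A,B,C) :- not5(A,D), or5(B,D,C).
    d = not5(a)
    return or5(b, d)

positives = [(4, 4, 4), (4, 0, 0), (0, 4, 4), (0, 0, 4), (1, 2, 3)]
negatives = [(2, 0, 0), (4, 2, 4)]

covers_all_positives = all(implies5(a, b) == c for a, b, c in positives)
covers_no_negatives = all(implies5(a, b) != c for a, b, c in negatives)
print(covers_all_positives, covers_no_negatives)
```

Both checks succeed: the rule derives every positive example and neither of the negative ones.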

Construction of the most specific clause Se for the example e
1. Add e to BK; InTerms = ∅; VarDepth = 0; Set = ∅
2. Design the positive member of Se and add it to Set
3. Consider the next body mode declaration and use it to design the negative members of Se and add them to Set
4. VarDepth = VarDepth + 1
5. If VarDepth < MaxVarDepth (YES), go to 3; otherwise (NO), stop

Construction of Se for the example e (a ground fact): design of the positive member of Se
• Find the head mode declaration h such that h subsumes e with substitution θ.
• For each v/t in θ:
  • if v corresponds to #type, replace v in h by t
  • if v corresponds to +type or -type, replace v in h by vk, where vk is the variable such that k = hash(t)
  • if v corresponds to +type, add t to the set InTerms
• Add h to Set

Construction of Se for the example e (a ground fact): design of the negative members of Se
For each body mode declaration b:
• For every possible substitution θ of the variables corresponding to +type by terms from the set InTerms:
  • Repeat recall times:
    • If Prolog succeeds on the goal b (instantiated by θ) with answer substitution θ1, do:
      • for each v/t in θ or θ1:
        • if v corresponds to #type, replace v in b by t; otherwise replace v in b by vk, where k = hash(t)
        • if v corresponds to -type, add t to the set InTerms
      • Add ¬b to Set
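For the implies5 example, the collection of ground body literals can be imitated in Python. This is a heavily simplified sketch under several assumptions: every body predicate is functional (recall 1) with its inputs first and a single output last, Python functions stand in for the Prolog calls against the background knowledge, and the names `BODY_MODES` and `ground_body_literals` are hypothetical.

```python
from itertools import product

def not5(x):
    return 4 - x

def or5(x, y):
    return max(x, y)

# (predicate name, number of +type arguments, function computing the -type argument)
BODY_MODES = [("not5", 1, not5), ("or5", 2, or5)]

def ground_body_literals(head_inputs, max_var_depth=2):
    """Collect ground body literals reachable from the head's input terms."""
    in_terms = list(dict.fromkeys(head_inputs))  # InTerms, without duplicates
    literals = []
    for _ in range(max_var_depth):
        new_terms = []
        for name, n_inputs, fn in BODY_MODES:
            for args in product(in_terms, repeat=n_inputs):
                output = fn(*args)
                literal = (name, args + (output,))
                if literal not in literals:
                    literals.append(literal)
                if output not in in_terms and output not in new_terms:
                    new_terms.append(output)  # -type term becomes a future input
        in_terms += new_terms
    return literals

# Head implies5(4,4,4) has the +type terms 4 and 4.
for name, args in ground_body_literals([4, 4]):
    print(name, args)
```

With depth 2 the sketch first finds not5(4,0) and or5(4,4,4), then adds the literals that use the new term 0, giving the same six body literals as in the implies5 example.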

Top-down algorithm for constructing a hypothesis with the lower bound Se (generalization of the example e)
1. Open = {□}, Closed = ∅
2. s = best(Open), Open = Open − {s}, Closed = Closed ∪ {s}
3. If prune(s) goto 5
4. Find the set refinement(s) of all minimal refinements of s which subsume Se; set Open = (Open ∪ refinement(s)) − Closed
5. If terminated(Closed, Open) return best(Closed)
6. If Open = ∅ return e (no generalization)
7. Goto 2
Notation:
• □ is the empty clause
• p_s = number of positive examples covered by s
• n_s = number of negative examples covered by s
• c_s = (the length of the clause s) − 1
• h_s = number of additional atoms necessary to "close" the clause s with respect to the mode declarations
• f_s = p_s − (n_s + c_s + h_s)
• best(M) chooses m ∈ M with the highest value of f_m
• prune(s) is true iff n_s = 0 (consequently further refinement makes no sense) and f_s > 0
• terminated(Closed, Open) is true iff ∃r {(r = best(Closed) & n_r = 0 & f_r > 0) & ∀t ∈ Open (h_r > h_t)}
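The evaluation function and the two helper tests best and prune translate directly into code. The sketch below uses a hypothetical `Candidate` record; the counts for the first clause follow the implies5 example (5 positives, no negatives, 3 literals), while the values for the second clause, in particular h = 2, are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    clause: str
    p: int  # positive examples covered
    n: int  # negative examples covered
    c: int  # clause length minus 1
    h: int  # atoms still needed to "close" the clause

    @property
    def f(self):
        return self.p - (self.n + self.c + self.h)

def best(candidates):
    """Choose the candidate with the highest f value."""
    return max(candidates, key=lambda s: s.f)

def prune(s):
    """No negatives are covered and the score is positive: stop refining s."""
    return s.n == 0 and s.f > 0

final = Candidate("implies5(A,B,C) :- not5(A,D), or5(B,D,C)", p=5, n=0, c=2, h=0)
bare = Candidate("implies5(A,B,C)", p=5, n=2, c=0, h=2)  # hypothetical counts

print(final.f, bare.f)
print(best([bare, final]).clause, prune(final), prune(bare))
```

The longer clause wins here because the penalty for its two body atoms is smaller than the penalty the bare head pays for covered negatives and missing atoms.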

The covering algorithm used: B is the background knowledge and E the set of training examples
1. If E = ∅ return B
2. Let e be the first example in E
3. Construct the clause Se for e
4. Construct the clause H from Se
5. Let B = B ∪ H
6. Let E´ = {e : e ∈ E and B |= e}
7. Let E = E − E´
8. Goto 1
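The outer covering loop is independent of how Se and H are built, so it can be sketched with those steps passed in as functions. Everything named here (`cover_set`, `construct_bottom`, `search`, `entails`) is hypothetical, and the toy problem at the end, where a hypothesis is a Python predicate over integers, only exercises the control flow. The fallback that adds e itself when the found hypothesis does not cover it is an assumption that keeps the loop finite.

```python
def cover_set(background, examples, construct_bottom, search, entails):
    """Add one hypothesis per uncovered example until all examples are entailed."""
    theory = list(background)
    remaining = list(examples)
    while remaining:                             # step 1
        e = remaining[0]                         # step 2
        bottom = construct_bottom(theory, e)     # step 3
        hypothesis = search(theory, bottom, e)   # step 4
        theory.append(hypothesis)                # step 5
        if not entails(theory, e):
            # assumption: fall back to the example itself (no generalization)
            theory.append(lambda x, e=e: x == e)
        remaining = [x for x in remaining if not entails(theory, x)]  # steps 6-7
    return theory

# Toy instance: "generalizing" an integer means covering all numbers of its parity.
def toy_bottom(theory, e):
    return e

def toy_search(theory, bottom, e):
    return lambda x, parity=bottom % 2: x % 2 == parity

def toy_entails(theory, x):
    return any(h(x) for h in theory)

theory = cover_set([], [1, 2, 3, 4, 5], toy_bottom, toy_search, toy_entails)
print(len(theory))
```

In the toy run the first hypothesis covers 1, 3 and 5, the second covers 2 and 4, and the loop stops with two hypotheses.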

Further resources
• Links:
  • http://www.cs.york.ac.uk/mlg/progol.html
  • http://www-ai.ijs.si/~ilpnet2
• Books:
  • Nada Lavrač, Sašo Džeroski: Inductive Logic Programming: Techniques and Applications, Ellis Horwood Ltd., 1994
  • Sašo Džeroski, Nada Lavrač: Relational Data Mining, Springer, 2001
