
Theory Revision



  1. Theory Revision Chris Murphy

  2. The Problem • Sometimes we: • Have theories for existing data that do not match new data • Do not want to repeat learning every time we update data • Believe that our rule learners could perform much better if given basic theories to build off of

  3. Two Types of Errors in Theories • Over-generalization • Theory covers negative examples • Caused by incorrect rules in the theory, or by existing rules missing necessary constraints • Example: uncle(A,B) :- brother(A,C). • Solution: uncle(A,B) :- brother(A,C), parent(C,B).

  4. Two Types of Errors in Theories • Over-specialization • Theory does not cover all positive examples • Caused by rules carrying additional, unnecessary constraints, or by the theory missing rules needed to prove some examples • Example: uncle(A,B) :- brother(A,C), mother(C,B). • Solution: uncle(A,B) :- brother(A,C), parent(C,B).
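The two error types can be made concrete with a small coverage check. The Python sketch below encodes the slides' uncle example over a handful of made-up family facts (all names and the fact encoding are hypothetical): the over-general rule proves `uncle` for anyone at all, while the specialized rule only proves the genuinely covered case.

```python
# Toy ground facts, stored as (predicate, (arg1, arg2)) tuples.
facts = {
    ("brother", ("tom", "mary")),   # tom is mary's brother
    ("parent", ("mary", "ann")),    # mary is ann's parent
    ("brother", ("tom", "sue")),    # sue has no children in this data
}

def covers_overgeneral(a, b):
    # uncle(A,B) :- brother(A,C).  -- B is never constrained
    return any(rel == "brother" and args[0] == a for rel, args in facts)

def covers_fixed(a, b):
    # uncle(A,B) :- brother(A,C), parent(C,B).
    constants = {x for _, args in facts for x in args}
    return any(
        ("brother", (a, c)) in facts and ("parent", (c, b)) in facts
        for c in constants
    )

# The over-general rule wrongly proves uncle(tom, bob) for any bob.
assert covers_overgeneral("tom", "bob")
# The specialized rule proves only the true case uncle(tom, ann).
assert covers_fixed("tom", "ann")
assert not covers_fixed("tom", "bob")
```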

  5. What is Theory Refinement? • “…learning systems that have a goal of making small changes to an original theory to account for new data.” • Combination of two processes: • Using a background theory to improve rule effectiveness and adequacy on data • Using problem detection and correction processes to make small adjustments to said theories

  6. Basic Issues Addressed • Is there an error in the existing theory? • What part of the theory is incorrect? • What correction needs to be made?

  7. Theory Refinement Basics • System is given a beginning theory about the domain • Can be incorrect or incomplete (and often is) • A well-refined theory will: • Be accurate on new/updated data • Make as few changes as possible to the original theory • Changes are monitored by a “Distance Metric” that keeps a count of every change made

  8. The Distance Metric • Counts every addition, deletion, or replacement of clauses • Used to: • Measure the syntactic corruption of the original theory • Determine how good a learning system is at replicating human-created theories • Drawback: it does not recognize equivalent literals such as less(X,Y) and greq(Y,X) • The table on the right shows examples of the distance between theories, as well as its relationship to accuracy
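A minimal sketch of such a metric, treating a theory as a set of clause strings (this simplified version counts a replacement as one deletion plus one addition, and, as the slide notes, it is purely syntactic):

```python
def theory_distance(original, revised):
    """Count clause additions and deletions between two theories,
    each given as a set of clause strings (simplified sketch)."""
    added = revised - original
    deleted = original - revised
    return len(added) + len(deleted)

t0 = {"uncle(A,B) :- brother(A,C)."}
t1 = {"uncle(A,B) :- brother(A,C), parent(C,B)."}

# Revising the clause counts as one deletion plus one addition here.
assert theory_distance(t0, t1) == 2
assert theory_distance(t0, t0) == 0
# Purely syntactic: less(X,Y) and greq(Y,X) would still count as different.
```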

  9. Why Preserve the Original Theory? • If you understood the original theory, you’ll likely understand the new one • Similar theories will likely retain the ability to use abstract predicates from the original theory

  10. Theory Refinement Systems • EITHER • FORTE • AUDREY II • KBANN • FOCL, KR-FOCL, A-EBL, AUDREY, and more

  11. EITHER • Explanation-based and Inductive Theory Extension and Revision • First system able to fix both over-generalization and over-specialization • Able to correct multiple faults • Uses one or more failings at a time to learn one or more corrections to a theory • Able to correct intermediate points in theories • Uses positive and negative examples • Able to learn disjunctive rules • Specialization algorithm does not allow positives to be eliminated • Generalization algorithm does not allow negatives to be admitted

  12. FORTE • Attempts to prove all positive and negative examples using the current theory • When errors are detected: • Identify all clauses that are candidates for revision • Determine whether each clause needs to be specialized or generalized • Determine which operators to test for the various revisions • The best revision is chosen based on its accuracy on the complete training set • The process repeats until the system perfectly classifies the training set or until FORTE finds that no revision improves the accuracy of the theory
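The revise-and-retest loop above can be sketched as greedy hill-climbing on training accuracy. This is a simplification of FORTE's actual operator search; the theory-as-a-set-of-values representation and the revision generator below are invented purely for illustration.

```python
def revise(theory, examples, propose_revisions, accuracy):
    """Hill-climb on training accuracy, stopping when the training
    set is perfectly classified or no revision improves accuracy."""
    best, best_acc = theory, accuracy(theory, examples)
    while best_acc < 1.0:
        candidates = propose_revisions(best, examples)
        if not candidates:
            break
        scored = [(accuracy(t, examples), t) for t in candidates]
        acc, theory = max(scored, key=lambda pair: pair[0])
        if acc <= best_acc:
            break  # no revision improves accuracy
        best, best_acc = theory, acc
    return best

# Toy setup: a "theory" is the set of values it labels positive.
examples = [(1, True), (2, True), (3, False)]

def accuracy(theory, exs):
    return sum((x in theory) == label for x, label in exs) / len(exs)

def propose_revisions(theory, exs):
    values = {x for x, _ in exs}
    # generalizations (add a value) plus specializations (drop one)
    return [theory | {v} for v in values - theory] + \
           [theory - {v} for v in theory]

final = revise({1, 3}, examples, propose_revisions, accuracy)
assert accuracy(final, examples) == 1.0
```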

  13. Specializing a Theory • Needs to happen when one or more negatives are covered • Ways to fix the problem: • Delete a clause: simple, just delete and retest • Add new antecedents to existing clause • More difficult • FORTE uses two methods... • Add one antecedent at a time, like FOIL, choosing the antecedent that provides the best info gain at any point • Relational Pathfinding – uses graph structures to find new relations in data
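The FOIL-style antecedent choice mentioned above scores each candidate with FOIL's information gain. A minimal version (the formula is FOIL's; the function name and example counts are my own):

```python
import math

def foil_gain(p0, n0, p1, n1):
    """FOIL information gain for adding one antecedent: (p0, n0) are
    the positives/negatives covered before the addition, (p1, n1)
    those still covered afterwards."""
    if p1 == 0:
        return 0.0
    info = lambda p, n: -math.log2(p / (p + n))  # bits to signal a positive
    return p1 * (info(p0, n0) - info(p1, n1))

# An antecedent that keeps 4 of 5 positives and excludes all 5 negatives:
assert abs(foil_gain(5, 5, 4, 0) - 4.0) < 1e-9
# An antecedent that drops every positive scores zero.
assert foil_gain(5, 5, 0, 5) == 0.0
```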

  14. Generalizing a Theory • Need to generalize when positives are not covered • Ways FORTE generalizes: • Delete antecedents from an existing clause (either singly or in groups) • Add a new clause • Copy the clause identified at the revision point • Purposely over-generalize it • Send the over-general rule to the specialization algorithm • Use the inverse resolution operators “identification” and “absorption” • These use intermediate rules to provide more options for alternative definitions
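Single-antecedent deletion, the first generalization operator above, can be sketched as follows. The clause representation (a list of predicate functions) and the example data are invented for illustration; the data reconstructs the slides' uncle example, where the `mother(C,B)` literal is too strict.

```python
def clause_covers(antecedents, example):
    return all(test(example) for test in antecedents)

def generalize(antecedents, positives, negatives):
    """Drop antecedents one at a time, keeping a deletion only if it
    increases positive coverage without admitting any negative."""
    ants = list(antecedents)
    improved = True
    while improved:
        improved = False
        for i in range(len(ants)):
            trial = ants[:i] + ants[i + 1:]
            gains = sum(clause_covers(trial, e) for e in positives) > \
                    sum(clause_covers(ants, e) for e in positives)
            safe = not any(clause_covers(trial, e) for e in negatives)
            if gains and safe:
                ants, improved = trial, True
                break
    return ants

brother = lambda e: e["brother"]
mother = lambda e: e["mother"]
positives = [{"brother": True, "mother": True},   # uncle via a mother
             {"brother": True, "mother": False}]  # uncle via a father
negatives = [{"brother": False, "mother": False}]

# Deleting the mother literal covers both positives and no negatives.
kept = generalize([brother, mother], positives, negatives)
assert kept == [brother]
```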

  15. AUDREY II • Runs in two main phases: • First, the initial domain theory is specialized to eliminate negative coverage • At each step, a best clause is chosen, it is specialized, and the process repeats • The best clause is the one responsible for the most negatives being incorrectly covered while being required by the fewest positives • If the best clause covers no positives, it is deleted; otherwise, literals are added in a FOIL-like manner to eliminate the covered negatives
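One way to read the best-clause criterion above is as a lexicographic preference: most negatives covered first, then fewest positives requiring the clause as a tie-breaker. The encoding below (clauses as strings, a `stats` map of precomputed counts) is a made-up simplification.

```python
def best_clause(clauses, stats):
    """Pick the clause covering the most negatives, breaking ties
    toward the clause required by the fewest positives.
    stats maps clause -> (negatives_covered, positives_requiring)."""
    return max(clauses, key=lambda c: (stats[c][0], -stats[c][1]))

stats = {"c1": (3, 2), "c2": (3, 1), "c3": (1, 0)}
# c1 and c2 tie on negatives; c2 is needed by fewer positives.
assert best_clause(["c1", "c2", "c3"], stats) == "c2"
```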

  16. AUDREY II • Second, the revised theory is generalized to cover all positives (without covering any negatives) • An uncovered positive example is randomly chosen, and the theory is generalized to cover it • The process repeats until all remaining positives are covered • If assumed literals can be removed without decreasing positive coverage, that is done • If not, AUDREY II tries replacing literals with a new conjunction of literals (also using a FOIL-type process) • If deletion and replacement both fail, the system uses a FOIL-like method to determine entirely new clauses for proving the literal

  17. KBANN • System that takes a domain theory of Prolog-style clauses and transforms it into a knowledge-based neural network (KNN) • Uses the knowledge base (background theory) to determine the topology and initial weights of the KNN • Different units and links within the KNN correspond to various components of the domain theory • Topologies of KNNs can differ from the topologies we have seen in neural networks

  18. KBANN • KNNs are trained on example data, and rules are extracted using an N-of-M method (saves time) • Domain theories for KBANN need not contain all intermediate terms necessary to learn certain concepts • Adding hidden units alongside the units specified by the domain theory allows the network to induce necessary terms not stated in the background info • Problems arise when interpreting intermediate rules learned from hidden nodes • It is difficult to label them based on the inputs they resulted from • In one case, programmers labeled rules based on the section of info they were attached to in the topology
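The rule-to-network mapping on the previous slide is often described as follows: each antecedent of a conjunctive rule becomes a link of weight +/-omega into a sigmoid unit, and the bias is set so the unit fires only when every antecedent holds. The sketch below follows that description; the constant and names are my own.

```python
import math

OMEGA = 4.0  # a large weight, so the sigmoid unit acts like a hard AND

def rule_to_unit(pos_antecedents, neg_antecedents):
    """Map one conjunctive rule to a unit's weights and bias: +OMEGA
    per positive antecedent, -OMEGA per negated one, and a bias that
    demands all positive antecedents be active."""
    weights = {a: OMEGA for a in pos_antecedents}
    weights.update({a: -OMEGA for a in neg_antecedents})
    bias = -(len(pos_antecedents) - 0.5) * OMEGA
    return weights, bias

def activate(weights, bias, inputs):
    net = bias + sum(w * inputs[a] for a, w in weights.items())
    return 1.0 / (1.0 + math.exp(-net))  # sigmoid activation

# uncle(A,B) :- brother(A,C), parent(C,B) as an AND unit:
w, b = rule_to_unit(["brother", "parent"], [])
assert activate(w, b, {"brother": 1, "parent": 1}) > 0.5  # both hold
assert activate(w, b, {"brother": 1, "parent": 0}) < 0.5  # one missing
```

Because the weights start near a working encoding of the theory, later training on examples only has to nudge them, which is how KBANN preserves the background theory while correcting it.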

  19. System Comparison • AUDREY II is better than FOCL at theory revision, but it still has room for improvement • Its revised theories are closer to both the original theory and the human-created correct theory

  20. System Comparison • AUDREY II is slightly more accurate than FORTE, and its revised theories are closer to the original and correct theories • KR-FOCL addresses some issues of other systems by allowing user to decide among changes that have the same accuracy

  21. Applications of Theory Refinement • Used to identify different parts of both DNA and RNA sequences • Used to debug student-written basic Prolog programs • Used to maintain working theories as new data is obtained
