
Inductive Transfer With Context-sensitive Neural Networks



  1. Inductive Transfer With Context-sensitive Neural Networks Danny Silver, Ryan Poirier, & Duane Currie Acadia University, Wolfville, NS, Canada danny.silver@acadiau.ca

  2. Outline • Machine Lifelong Learning (ML3) and Inductive Transfer • Multiple Task Learning (MTL) and its Limitations • csMTL – context-sensitive MTL • Empirical Studies of csMTL • Conclusions and Future Work

  3. Machine Lifelong Learning (ML3) • Considers methods of retaining and using learned knowledge to improve the effectiveness and efficiency of future learning [Thrun97] • We investigate systems that must learn: • From impoverished training sets • For diverse domains of related/unrelated tasks • Where practice of the same task is possible • Applications: IA, User Modeling, Robotics, DM

  4. Knowledge-Based Inductive Learning: An ML3 Framework [Diagram: an inductive learning system (short-term memory) maps training examples (x, f(x)) from instance space X to a model of classifier h, with h(x) ~ f(x) evaluated on testing examples; domain knowledge (long-term memory) provides retention & consolidation, knowledge transfer, and inductive bias selection.]

  5. Knowledge-Based Inductive Learning: An ML3 Framework with MTL [Diagram: the same framework as slide 4, with the inductive learning system realized as a Multiple Task Learning (MTL) network mapping inputs x1 … xn to task outputs f1(x), f2(x), …, fk(x).]

  6. Multiple Task Learning (MTL) [Diagram: an MTL network with inputs x1 … xn, a common feature layer forming a common internal representation, and task-specific representation feeding outputs f1(x), f2(x), …, fk(x).] • Multiple hypotheses develop in parallel within one back-propagation network [Caruana, Baxter 93-95] • An inductive bias occurs through shared use of a common internal representation • Knowledge (inductive) transfer to the primary task f1(x) depends on the choice of secondary tasks (see the sketch below)
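The architecture on this slide can be made concrete in a few lines. Below is a minimal PyTorch sketch of a Caruana-style MTL network; the class name `MTLNet` and all layer sizes are illustrative assumptions, not taken from the slides. One shared hidden layer provides the common internal representation, and one linear head per task provides the task-specific outputs.

```python
import torch
import torch.nn as nn

class MTLNet(nn.Module):
    """Sketch of an MTL network: shared hidden layer + one output head per task."""
    def __init__(self, n_inputs, n_hidden, n_tasks):
        super().__init__()
        # Common internal representation, shared by all tasks.
        self.shared = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.Sigmoid())
        # Task-specific representation: one output per task.
        self.heads = nn.ModuleList([nn.Linear(n_hidden, 1) for _ in range(n_tasks)])

    def forward(self, x):
        h = self.shared(x)  # inductive transfer happens through this layer
        return torch.cat([torch.sigmoid(head(h)) for head in self.heads], dim=1)

net = MTLNet(n_inputs=10, n_hidden=20, n_tasks=6)
x = torch.rand(4, 10)
print(net(x).shape)  # (4, 6): outputs f1(x) .. f6(x) for the same input x
```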

  7. Lifelong Learning with MTL [Chart: mean percent misclassification for conditions A–D on the Band domain, the Logic domain, and the Coronary Artery Disease domain.]

  8. Limitations of MTL for ML3 [Diagram: the MTL network of slide 6 [Caruana, Baxter].] • Problems with multiple outputs: • Training examples must have matching target values • Redundant representation • Frustrates practice of a task • Prevents a fluid development of domain knowledge • No way to naturally associate examples with tasks • Inductive transfer limited to sharing of hidden-node weights • Inductive transfer relies on selecting related secondary tasks

  9. Context-Sensitive MTL (csMTL) [Diagram: a single-output network y' = f'(c, x) with context inputs c1 … ck and primary inputs x1 … xn; one output for all tasks.] • Recently developed an alternative approach that is meant to overcome these limitations: • Uses a single-output neural network structure • Context inputs associate an example with a task • All weights are shared; the focus shifts from learning separate tasks to learning a domain of tasks • No measure of task relatedness is required (see the sketch below)
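For contrast with the MTL sketch above, here is a minimal sketch of the csMTL structure described on this slide, again with illustrative names and sizes: a one-hot context vector c is concatenated with the primary inputs x, and a single output y' = f'(c, x) serves every task.

```python
import torch
import torch.nn as nn

class CsMTLNet(nn.Module):
    """Sketch of a csMTL network: one output for all tasks, y' = f'(c, x)."""
    def __init__(self, n_context, n_inputs, n_hidden):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_context + n_inputs, n_hidden), nn.Sigmoid(),
            nn.Linear(n_hidden, 1), nn.Sigmoid())  # single shared output

    def forward(self, c, x):
        # All weights are shared across tasks; only the context c
        # distinguishes which task's function is being queried.
        return self.body(torch.cat([c, x], dim=1))

net = CsMTLNet(n_context=6, n_inputs=10, n_hidden=20)
c = nn.functional.one_hot(torch.tensor([0, 2, 5]), num_classes=6).float()
x = torch.rand(3, 10)
print(net(c, x).shape)  # (3, 1): the same x can be queried under any task via c
```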

  10. Context-Sensitive MTL (csMTL) [Diagram: the single-output csMTL network y' = f'(c, x) of slide 9.] We have recently shown that csMTL imposes two important constraints, coupling: • Context and bias weights • Context and output weights. As a consequence, VC(csMTL) < VC(MTL).

  11. csMTL Empirical Studies: Task Domains [Diagram: the Band domain, tasks T0–T6, each defined by a band of positive examples.] • Band: 7 tasks, 2 primary inputs • Logic: 6 tasks, 10 primary inputs, e.g. T0 = (x1 > 0.5 ∨ x2 > 0.5) ∧ (x3 > 0.5 ∨ x4 > 0.5) • fMRI: 2 tasks, 24 primary inputs
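As an illustration of the Logic domain's data, the snippet below generates examples for T0. Note the ∨/∧ operators in the slide's formula were lost in transcription; the reading used here is an assumption, as are the sample count and random seed.

```python
import numpy as np

rng = np.random.default_rng(0)

def t0(x):
    """T0 = (x1 > 0.5 or x2 > 0.5) and (x3 > 0.5 or x4 > 0.5);
    operators reconstructed, assumed reading of the slide."""
    return ((x[:, 0] > 0.5) | (x[:, 1] > 0.5)) & ((x[:, 2] > 0.5) | (x[:, 3] > 0.5))

X = rng.random((20, 10))   # 10 primary inputs; only x1..x4 are relevant to T0
y = t0(X).astype(float)    # boolean target for the primary task
print(X.shape, y[:5])
```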

  12. csMTL Empirical Studies: Results

  13. csMTL Empirical Studies: Results (2 more domains)

  14. Why is csMTL doing so well? [Diagram: the csMTL network y' with context inputs c1 … ck and primary inputs x1 … xn.] • Consider two unrelated tasks: • From a task-relatedness perspective, correlation or mutual information over all examples is 0 • From an example-by-example perspective, 50% of examples have matching target values • csMTL transfers knowledge at the example level • Greater sharing of representation (see the demonstration below)
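The distinction between task-level and example-level relatedness is easy to check numerically. The toy snippet below (data and thresholds are illustrative) builds two independent, hence unrelated, boolean tasks: their correlation over all examples is near 0, yet roughly half the examples still have matching target values, which is the example-level signal csMTL can exploit.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random((10000, 2))

# Two tasks constructed to be unrelated in aggregate (illustrative choice):
f1 = (x[:, 0] > 0.5).astype(float)
f2 = (x[:, 1] > 0.5).astype(float)   # depends on a different input than f1

print(np.corrcoef(f1, f2)[0, 1])     # near 0: no task-level relatedness
print((f1 == f2).mean())             # near 0.5: half the examples still agree
```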

  15. csMTL Results – Same Task [Diagram: an MTL network with outputs f1(x) … f5(x) over inputs x1 … x10, beside a csMTL network f'(c, x) with context inputs c1 … c5 and inputs x1 … x10.] • Learn the primary task with transfer from 5 secondary tasks • 20 training examples per task; all examples drawn from the same function

  16. csMTL Results – Same Task • Learn the primary task with transfer from 5 secondary tasks • 20 training examples per task; all examples drawn from the same function

  17. Measure of Task Relatedness? [Diagram: the csMTL network with one output for all tasks, context inputs, and primary inputs.] Early conjecture: context-to-hidden-node weight vectors can be used to measure task relatedness. Not true: two hypotheses for the same examples can develop that • have equivalent function • use different representation. Transfer is functional in nature. (A small demonstration follows below.)
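The "equivalent function, different representation" point can be demonstrated directly: permuting the hidden units of a one-hidden-layer network leaves its function unchanged while changing every weight vector, so weight-space comparisons (including comparisons of context weight vectors) cannot be a reliable relatedness measure. A minimal numpy demonstration, with arbitrary sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

# A tiny one-hidden-layer net (8 hidden units, 12 inputs) and a
# hidden-unit permutation of it (cyclic shift, guaranteed non-identity).
W1, b1 = rng.normal(size=(8, 12)), rng.normal(size=8)
W2, b2 = rng.normal(size=(1, 8)), rng.normal(size=1)
perm = np.roll(np.arange(8), 1)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

f = lambda W1, b1, W2, x: sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)
x = rng.random(12)
print(np.allclose(f(W1, b1, W2, x), f(W1p, b1p, W2p, x)))  # True: same function
print(np.allclose(W1, W1p))  # False: different representation (weights differ)
```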

  18. Conclusions • csMTL is a method of inductive transfer using multiple tasks: • Single task output, additional context inputs • Shifts focus to learning a continuous domain of tasks • Eliminates redundant task representation (multiple outputs) • Empirical studies: • csMTL performs inductive transfer at or above the level of MTL • Without a measure of relatedness • A machine lifelong learning (ML3) system based on two csMTL networks is also proposed in the paper

  19. Future Work • Relationship between the theory of hints [Abu-Mostafa] and secondary tasks (inductive bias, VC dimension) • Conditions under which csMTL ANNs succeed / fail • Exploring domains with real-valued context inputs • Will csMTL work with other ML methods? • Develop and test a csMTL ML3 system

  20. csMTL Using IDT (Logic Domain)

  21. csMTL Using kNN (Logic Domain, k=5)
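The slides do not spell out how csMTL was combined with kNN; a natural reading, sketched below with scikit-learn and synthetic stand-in data (not the actual Logic domain tasks), is to append the one-hot context to each example's feature vector, so that neighbours are found within, or functionally near, the queried task.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

# Pool examples from several tasks; the one-hot context becomes extra features.
n_tasks, n_inputs = 6, 10
X = rng.random((120, n_inputs))
task = rng.integers(0, n_tasks, size=120)
C = np.eye(n_tasks)[task]
y = (X[:, 0] > 0.5).astype(int) ^ (task % 2)   # toy task-dependent labels

knn = KNeighborsClassifier(n_neighbors=5)      # k=5, as on the slide
knn.fit(np.hstack([C, X]), y)

# Query the primary task (context index 0) on a new example.
query = np.hstack([np.eye(n_tasks)[0], rng.random(n_inputs)])
print(knn.predict(query.reshape(1, -1)))
```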

  22. An ML3 Based on csMTL [Diagram: a short-term learning network f1(c, x) and a long-term consolidated domain knowledge (CDK) network f'(c, x), each with task context inputs c1 … ck, standard inputs x1 … xn, and one output for all tasks; representational transfer flows from the CDK network for rapid learning, while functional transfer (virtual examples) supports slow consolidation.]
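A rough sketch of the functional-transfer step in this architecture, using hypothetical stand-in networks and sizes: random probes are labelled by a snapshot of the long-term CDK net to form virtual examples of prior tasks, and the CDK net is then updated on new-task examples plus these virtual examples so that consolidation does not erase prior knowledge. The direction and details of this rehearsal step are an assumption based on the slide's description, not a definitive implementation.

```python
import copy
import torch
import torch.nn as nn

# Hypothetical stand-in for the long-term consolidated domain knowledge net.
cdk = nn.Sequential(nn.Linear(6 + 10, 20), nn.Sigmoid(),
                    nn.Linear(20, 1), nn.Sigmoid())
frozen = copy.deepcopy(cdk)  # snapshot used to label virtual examples

# Virtual examples: random probes for prior tasks 0-4, labelled by the snapshot.
c = nn.functional.one_hot(torch.randint(0, 5, (32,)), 6).float()
x = torch.rand(32, 10)
with torch.no_grad():
    virt_y = frozen(torch.cat([c, x], dim=1))

# New-task examples (random stand-ins here), with context index 5.
c_new = nn.functional.one_hot(torch.full((8,), 5), 6).float()
x_new, y_new = torch.rand(8, 10), torch.rand(8, 1)

# One consolidation step: fit the new task while rehearsing prior tasks.
opt = torch.optim.SGD(cdk.parameters(), lr=0.1)
inputs = torch.cat([torch.cat([c, x], dim=1), torch.cat([c_new, x_new], dim=1)])
targets = torch.cat([virt_y, y_new])
loss = nn.functional.mse_loss(cdk(inputs), targets)
loss.backward()
opt.step()
print(float(loss))
```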

  23. Thank You! • danny.silver@acadiau.ca • http://plato.acadiau.ca/courses/comp/dsilver/ • http://birdcage.acadiau.ca:8080/ml3/

  24. Inductive Bias and Knowledge Transfer • Human learners use inductive bias [Illustration: a street map with Ash St, Elm St, Pine St, and Oak St crossing First, Second, and Third.] • Inductive bias depends upon: • Knowledge of the task domain • Selection of the most related tasks

  25. Requirements for a ML3 System: Long-term Retention • Effective Retention • Resist introduction and accumulation of error • Retention of new task knowledge should improve related prior task knowledge (practice should improve performance) • Efficient Retention • Minimize redundant use of memory via consolidation • Meta-knowledge Collection • e.g. example distribution over the input space • Ensures Effective and Efficient Indexing • Selection of related prior knowledge for inductive bias should be accurate and rapid

  26. Requirements for a ML3 System: Short-term Learning • Effective (transfer) Learning • New learning should benefit from related prior task knowledge • ML3 hypotheses should meet or exceed the accuracy of hypotheses developed without the benefit of transfer • Efficient (transfer) Learning • Transfer should reduce training time • Increase in space complexity should be minimized • Transfer versus Training Examples • Must weigh the relevance and accuracy of prior knowledge against the number and accuracy of available training examples

  27. MTL – A Recent Example: Stream flow rate prediction [Lisa Gaudette, 2006] [Diagram: x = weather data, f(x) = flow rate.]

  28. Benefits of csMTL ML3: • Long-term Consolidation … • Effective retention (all tasks in DK net improve) • Efficient retention (redundancy eliminated) • Meta-knowledge collection (context cues) • Short-term Learning … • Effective learning (inductive transfer) • Efficient learning (representation + function) • Transfer / training examples used appropriately
