Towards a Knowledgeable Machine that can Pass an Elementary Science Test

Peter Clark

Vulcan Inc

August 2013

Outline
  • Halo: The Goal and Road Travelled…
    • AURA, Inquire, and reflections
  • Exploiting Semi-Formal Representations and Textual Inference
  • A New Challenge: Fourth-Grade Science Tests
Overall Goals
  • Long-Term Goal: The Digital Aristotle
    • Have large volumes of knowledge encoded in a computable form, such that the computer can answer questions, explain its answers, and ultimately dialog with users about the subject matter

“Explainable Reasoning”

  • History
    • Halo Pilot: Assess representation & reasoning technologies
      • Formal reasoning works, but acquisition and language are problems
    • Halo: Develop high-performance acquisition tool (AURA)
    • HaloBook (2010-12): Aim to encode much of a textbook
      • Inquire: An iPad app – the knowledgeable book
    • Halo 2.0: Reorient towards semi-automated acquisition
      • focus on taking K-12 science exams
The Knowledge Encoding Process

…Eukaryotic cells similarly have a plasma membrane, but also contain a cell nucleus that houses the eukaryotic cell's DNA…

Concept Map (User View)

Logic (Internal View)

∀x isa(x, Eukaryotic-cell) →
    ∃p,n,d isa(p, Plasma-membrane) ∧ isa(n, Nucleus) ∧ isa(d, DNA) ∧
           has-part(x, p) ∧ has-part(x, n) ∧ has-part(x, d) ∧ is-inside(d, n)

Reasoning: Deductive elaboration of the graph using other graphs and commonsense rules

EukaryoticCell
  • Parts:
    • Plasma membrane
    • Nucleus
    • DNA

PlantCell
  • Parts:
    • Plasma membrane
    • Cell wall
    • Chloroplast

Plant Cell (more)
  • Parts:
    • Plasma membrane
    • Cell wall
    • Chloroplast
    • Nucleus
    • DNA
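The elaboration step above can be illustrated with a minimal Python sketch. It assumes a taxonomy link from PlantCell to EukaryoticCell, which the slide implies but does not state, and the dictionary layout and function name are hypothetical rather than AURA's actual representation.

```python
# Hypothetical mini knowledge base; the structure is illustrative only.
PARTS = {
    "EukaryoticCell": ["Plasma membrane", "Nucleus", "DNA"],
    "PlantCell": ["Plasma membrane", "Cell wall", "Chloroplast"],
}

# Assumed taxonomy link (implied by the slide, not stated on it).
ISA = {"PlantCell": "EukaryoticCell"}


def elaborate_parts(concept):
    """Return the concept's own parts followed by inherited ones, without duplicates."""
    parts = []
    current = concept
    while current is not None:
        for part in PARTS.get(current, []):
            if part not in parts:
                parts.append(part)
        current = ISA.get(current)  # climb one level up the taxonomy
    return parts


print(elaborate_parts("PlantCell"))
```

Walking up the taxonomy and merging part lists reproduces the five parts listed for the elaborated plant cell.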

Question Answering

Typical examples of questions the system can answer:

During mitosis, when does the cell plate begin to form?

What happens during DNA replication?

What is the relationship between photosynthesis and cellular respiration?

What do ribosomes do?

During synapsis, when are chromatids exchanged?

What are the differences between eukaryotic cells and prokaryotic cells?

How many chromosomes are in a human cell?

In which phase of mitosis does the cell divide?

What is the structure of a plasma membrane?

Outcomes
  • The good…
    • Experiments suggested Inquire is educationally useful
    • Some question classes answered well
    • “Suggested question” mechanism helped a lot
  • The bad…
    • Only covered ~25% of the book after 2 years
    • Deductive question-answering somewhat hit-and-miss
The Dilemma of Knowledge Engineering

Manual methods are expensive, automatic methods are shallow

  • It’s not that manually constructed rulebases are “bad”, but:
    • Expensive (of course, costs may be brought down)
    • Brittle (unless the task is very tightly constrained)
    • Never seem to be finished (permanently incomplete)…
  • Textual Inference / Semi-Formal Representations:
    • Create language-based representations from (lots of) text
      • include words/phrases – deferred ontological commitment
    • Imprecise, shallower reasoning
      • an evidential process, using multiple sources of evidence

Outline
  • Halo: The Goal and Road Travelled…
    • AURA, Inquire, and reflections
  • Exploiting Semi-Formal Representations and Textual Inference
  • A New Challenge: Fourth-Grade Science Tests
Levels of Formality

[Figure: three levels of formality (Text, Semi-Formal, Logic). Text is labelled with textual entailment, Logic with logical entailment, and a “?” is left open.]

Query

?- has-part(ribosome,?x).

1. Representation

Sentence

"Channel proteins facilitate the passage of molecules across the membrane."

Parse

*S:-17
  NP:-3
    N^:-2
      N:-2
        N:-1 CHANNEL
        N:0 PROTEINS
  VP:-13
    V:0 FACILITATE
    *NP:-12*
      NP:-8
        NP:-1
          DET:0 THE
          N^:0
            N:0 PASSAGE
        PP:-2
          P:0 OF
          NP:-1
            N^:0
              N:0 MOLECULES
      PP:-2
        P:0 ACROSS
        NP:-1
          DET:0 THE
          N^:0
            N:0 MEMBRANE

Logical Form

[Figure: graph with nodes “channel protein”, “facilitate”, “passage”, “molecule”, “membrane” and edges subj, obj, of, across]

subject(facilitate-1, channel-protein-1).

object(facilitate-1, passage-1).

of(passage-1, molecule-1).

across(passage-1, membrane-1).
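As a rough illustration, the four logical-form facts above could be stored as relation triples and queried with wildcards. The `Triple` type and `query` helper below are hypothetical names chosen for this sketch, not part of the system described in the talk.

```python
from collections import namedtuple

# Each fact relation(head, tail) from the logical form becomes one triple.
Triple = namedtuple("Triple", ["relation", "head", "tail"])

LOGICAL_FORM = [
    Triple("subject", "facilitate-1", "channel-protein-1"),
    Triple("object", "facilitate-1", "passage-1"),
    Triple("of", "passage-1", "molecule-1"),
    Triple("across", "passage-1", "membrane-1"),
]


def query(triples, relation=None, head=None, tail=None):
    """Return triples matching every field that is not None (None is a wildcard)."""
    return [
        t for t in triples
        if (relation is None or t.relation == relation)
        and (head is None or t.head == head)
        and (tail is None or t.tail == tail)
    ]


# Who is the subject of the "facilitate" event?
subjects = [t.tail for t in query(LOGICAL_FORM, relation="subject", head="facilitate-1")]
print(subjects)
```

Keeping words such as "passage" and "membrane" as node labels, rather than mapping them to ontology concepts, mirrors the deferred ontological commitment mentioned earlier in the talk.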

2. Textual Inference
  • Reasoning with semi-formal structures
  • Find sequence of transformations from text to question
  • Requires general lexical and world knowledge

Which proteins help move molecules through the membrane?

IF X facilitates Y THEN X helps Y

“passage”(n) → “move”(v)

“through” ↔ “across”

Knowledge resources

Channel proteins facilitate the passage of molecules across the membrane.

A. Channel proteins

2. Textual Inference

Which proteins help move molecules through the membrane?

1. (simple) question decomposition

What ?x help move molecules through the membrane?

Is ?x a protein?

Is an evidence-gathering process

2a. textual entailment

Channel proteins facilitate the passage of molecules across the membrane.

IF X “facilitates” Y THEN X “helps” Y

Channel proteins help the passage of molecules across the membrane.

“passage”(n) → “move”(v),

“through” ↔ “across”

Channel proteins help move molecules through the membrane.

What ?x help move molecules through the membrane?
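The chain of transformations on this slide can be sketched as string rewriting. This is a deliberately crude stand-in: the real system works over parsed structures, while here the noun-to-verb rule “passage”(n) → “move”(v) is flattened into a phrase rewrite, and the function names are hypothetical. The second sub-question ("Is ?x a protein?") is not handled.

```python
# Rewrite rules taken from the slide, simplified to plain phrase substitutions.
REWRITE_RULES = [
    ("facilitate", "help"),       # IF X "facilitates" Y THEN X "helps" Y
    ("the passage of", "move"),   # "passage"(n) -> "move"(v), flattened to a phrase rewrite
    ("across", "through"),        # "through" <-> "across"
]

TEXT = "channel proteins facilitate the passage of molecules across the membrane"
QUESTION = "what ?x help move molecules through the membrane"


def derive(text, rules):
    """Apply each applicable rule once, in order, recording every intermediate sentence."""
    steps = [text]
    for old, new in rules:
        if old in text:
            text = text.replace(old, new)
            steps.append(text)
    return steps


def answer(question_pattern, text, rules):
    """Return the binding for ?x if the derived sentence ends with the question's predicate."""
    final = derive(text, rules)[-1]
    predicate = question_pattern.replace("what ?x ", "", 1)
    if final.endswith(predicate):
        return final[: -len(predicate)].strip()
    return None


STEPS = derive(TEXT, REWRITE_RULES)
for step in STEPS:
    print(step)
print(answer(QUESTION, TEXT, REWRITE_RULES))
```

Each printed step corresponds to one sentence in the derivation above, ending in a sentence that matches the question with ?x bound to the answer.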

2. Textual Inference

Channel proteins facilitate the passage of molecules across the membrane.

Channel proteins help the passage of molecules across the membrane.

What evidence can I find that “X facilitates Y” → “X helps Y”?

  • Rule-set sizes shown: 30k rules, 146k rules, 12M rules, 4M rules
  • Knowledge resources: PPDB (Johns Hopkins), DIRT paraphrases, BioKB-101 ontology, WordNet

Domain-Biased Paraphrases (Johns Hopkins)
  • Paraphrases learned via bilingual pivoting, and rescored using distributional similarity.
Some examples from PPDB

travel fly 0.893

travel roll over 0.882

travel relax 0.87

travel freeze 0.861

travel breathe 0.861

travel swim 0.858

travel move 0.855

travel die 0.848

travel swell 0.845

travel switch 0.842

travel consumers 0.838

travel bend 0.835

travel walk 0.835

travel paint 0.828

travel work 0.828

travel move over 0.825

travel feed 0.825

travel evolve 0.825

travel survive 0.821

… … …

amplify elevate 0.993

amplify explore 0.992

amplify enhance 0.984

amplify speed up 0.984

amplify strengthen 0.982

amplify improve 0.982

amplify magnify 0.98

amplify extend 0.978

amplify accept 0.97

amplify follow 0.965

amplify carry out 0.965

amplify broaden 0.962

amplify go into 0.962

amplify promote 0.959

amplify explain 0.955

amplify implement 0.951

amplify leave 0.944

amplify adopt 0.944

amplify acquire 0.942

amplify expand 0.942

… … …

???

???

Performance
  • Currently, 3 databases of semi-formal representations
    • Current F1 ≈ 30% (e.g., 50% on 10% of questions)
    • Answer = weighted sum of evidence
    • Learn the weights (via simulated annealing)
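A minimal sketch of "answer = weighted sum of evidence" follows. The source names, evidence values, and weights are invented for illustration. The slide says the weights are learned via simulated annealing, whereas here they are simply hard-coded.

```python
# Hypothetical per-source weights (learned in the real system, hard-coded here).
WEIGHTS = {"wordnet": 0.5, "ppdb": 0.25, "dirt": 0.25}

# Hypothetical evidence scores per candidate answer, keyed by knowledge source.
CANDIDATES = {
    "channel proteins": {"wordnet": 1.0, "ppdb": 0.5},
    "ribosomes": {"dirt": 0.5},
}


def score_answer(evidence, weights):
    """Weighted sum of evidence scores; sources without a weight contribute nothing."""
    return sum(weights.get(source, 0.0) * value for source, value in evidence.items())


def best_answer(candidates, weights):
    """Pick the candidate whose combined evidence score is highest."""
    return max(candidates, key=lambda name: score_answer(candidates[name], weights))


print(best_answer(CANDIDATES, WEIGHTS))
```

No single source has to prove the answer: each one contributes a score, and the combination decides.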

Levels of Formality

[Figure: three levels of formality (Text, Semi-Formal, Logic), with a “?” annotated “What should go in here?”]

Query

?- has-part(ribosome,?x).

Outline
  • Halo: The Goal and Road Travelled…
    • AURA, Inquire, and reflections
  • Exploiting Semi-Formal Representations and Textual Inference
  • A New Challenge: Fourth-Grade Science Tests
K-12 Grade Science Tests
  • Provide a (task-oriented) focus
  • Simpler (question) language
  • Involves more common sense
  • Wide variety of question types and difficulties
  • Caveats
    • Multiple-choice questions are common
    • Diagrams are common
The 4th Grade NY Regents’ Science Exam
  • What types of questions are there?
  • What would it take to answer them?

“Retrieval”

1. Taxonomic
  • Question interpretation:
    • Decompose question into “isa” queries
  • Several good sources of simple “isa” knowledge
    • WordNet, Cyc, Wikipedia
    • Within text itself
  • “isa” knowledge is fundamental to other reasoning types
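Decomposing a question into “isa” queries might look like the sketch below. The tiny taxonomy is hand-written for illustration; in practice the links would come from sources such as those listed above, and the function names are hypothetical.

```python
# Hypothetical hand-written taxonomy: child -> parent.
ISA = {
    "sparrow": "bird",
    "bird": "animal",
    "oak": "tree",
    "tree": "plant",
}


def isa(entity, category):
    """True if category is reachable from entity by following isa links upward."""
    seen = set()
    current = entity
    while current is not None and current not in seen:
        if current == category:
            return True
        seen.add(current)
        current = ISA.get(current)
    return False


def filter_options(options, category):
    """Keep the answer options that belong to the category named in the question."""
    return [option for option in options if isa(option, category)]


# "Which of these is an animal?" becomes one isa query per answer option.
print(filter_options(["sparrow", "oak", "rock"], "animal"))
```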
2. Definitions

Dictionary Resources

erosion: The process of being eroded by wind, water, or other natural agents.

erosion: The wearing away of rocks and other deposits on the earth's surface …

erosion: The gradual wearing away of land surface materials, especially rocks, …

Entailment-Style Reasoning

the movement of soil by wind or water

The gradual wearing away of land surface materials, especially rocks, sediments, and soils, by the action of water, wind, or a glacier.
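One very rough way to approximate this entailment-style match is to measure how many content words of the question phrase are covered by a definition. This word-overlap score is a swapped-in simplification, not the entailment method used in the talk. The stopword list and plural stripping are ad hoc assumptions, and the second definition is left out because the source truncates it.

```python
import re

# Ad hoc assumptions: a tiny stopword list and naive plural stripping.
STOPWORDS = {"the", "of", "by", "or", "a", "and"}


def content_words(text):
    """Lowercase, tokenize, drop stopwords, and strip a trailing 's' from longer words."""
    words = re.findall(r"[a-z]+", text.lower())
    normalized = (w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words)
    return {w for w in normalized if w not in STOPWORDS}


def coverage(hypothesis, definition):
    """Fraction of the hypothesis content words that also occur in the definition."""
    h = content_words(hypothesis)
    return len(h & content_words(definition)) / len(h)


PHRASE = "the movement of soil by wind or water"
DEFINITIONS = [
    "The process of being eroded by wind, water, or other natural agents.",
    "The gradual wearing away of land surface materials, especially rocks, "
    "sediments, and soils, by the action of water, wind, or a glacier.",
]

for definition in DEFINITIONS:
    print(coverage(PHRASE, definition))
```

The longer definition covers "soil", "wind", and "water" but not "movement". Linking "movement" to "wearing away" is where lexical and world knowledge would be needed.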

3. Basic Facts

“Semantic Databases”

  • Some basic facts can be pre-extracted and cleaned
    • parts, functions, steps in a process, etc.
  • + existing resources have some of this knowledge
Building Semantic Databases…

[Figure: pipeline for building a parts database]
  • Known parts (from LOD, WordNet, AURA), e.g., has-part(Leaf,Stomata)
  • Text: sentences expressing those relations, e.g., “Stomata in a leaf's surface lead to a maze of internal air spaces”
  • Good “parts” relations (training data)
  • MultiR (Univ Washington)
  • Classifier: candidate pair, e.g., “plant cell” has-part “chloroplast”? → Decision (yes/no + confidence)
  • Final parts database
  • Iterate, + human/machine validation
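The first stage of this pipeline, pairing known part relations with sentences that express them, resembles distant supervision and can be sketched as follows. The second sentence and the function name are made up for illustration, and the classifier stage (MultiR in the talk) is not reproduced.

```python
# Known (whole, part) pairs, e.g. drawn from existing resources.
KNOWN_PARTS = [("leaf", "stomata")]

SENTENCES = [
    "Stomata in a leaf's surface lead to a maze of internal air spaces",
    "Plants need water to survive",  # hypothetical distractor sentence
]


def training_sentences(known_parts, sentences):
    """Collect sentences mentioning both terms of a known pair as positive examples."""
    examples = []
    for whole, part in known_parts:
        for sentence in sentences:
            lowered = sentence.lower()
            if whole in lowered and part in lowered:
                examples.append((whole, part, sentence))
    return examples


EXAMPLES = training_sentences(KNOWN_PARTS, SENTENCES)
print(EXAMPLES)
```

Substring matching of this kind is noisy, which is one reason the pipeline includes an iterate-and-validate step.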

The 4th Grade NY Regents’ Science Exam
  • What types of questions are there?
  • What would it take to answer them?

“Inference”

4. “Rules” (simple inference)
  • Many questions require simple, one-step entailments
    • X eats → X gets nutrients
    • X breathes oxygen –enables→ X make energy
    • X made of metal → X conducts electricity
  • Large number of such facts and rules needed
    • Manually enter them?
    • Induce them?
    • Just read them?

Via:
  • Judicious forms of text
  • Good NLP
  • Manual validation

4. Knowledge (Rule) Extraction from Text

Animals take in air by breathing. They need oxygen, which is in the air. Oxygen allows the animal to make and use energy, which it needs to survive. Animals also need water to survive. Water is used to break down and move materials throughout the body. Animals cannot make their own food so they must eat to get nutrients. Nutrients are necessary for growth and energy.

  • Assertions
    • air contains oxygen
    • animals need oxygen
    • animals need energy
    • animals need water
  • Implications
    • animal breathes → animal takes in air
    • animal breathes oxygen -enables→ animal make energy
    • animal eat -enables→ animal get nutrients
    • animal get nutrients -enables→ animal grow
    • animal has water -enables→ animal break down materials
4. Knowledge (Rule) Extraction from Text
  • Rule acquisition:
    • specific patterns in text

Pattern: “X Ys by Z” → IF X Zs THEN X Ys

Example: “Animals take in air by breathing.” → IF an animal breathes THEN an animal takes in air

  • Rule application: using textual entailment-style inference
      • If rule condition entailed, then infer conclusion
  • Current status: Pretty noisy rules!
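A pattern of this kind can be approximated with a regular expression. The sketch below is an assumption-laden toy: the gerund lookup table stands in for a real morphological analyser, the regex only handles single-word subjects, and the function name is hypothetical.

```python
import re

# Hypothetical gerund-to-base lookup; a real system would use a morphological analyser.
BASE_FORM = {"breathing": "breathe", "eating": "eat"}

# "X Ys by Z-ing": one-word subject X, verb phrase Y, gerund Z.
PATTERN = re.compile(r"^(?P<x>\w+) (?P<y>.+) by (?P<z>\w+ing)\.?$")


def extract_rule(sentence):
    """Map 'X Ys by Z-ing' to (condition, conclusion), or None if there is no match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    x, y, z = match.group("x"), match.group("y"), match.group("z")
    verb = BASE_FORM.get(z.lower())
    if verb is None:
        return None  # unknown gerund: skip rather than guess the base form
    return (f"{x} {verb}", f"{x} {y}")


print(extract_rule("Animals take in air by breathing."))
print(extract_rule("Animals also need water to survive."))
```

Surface patterns like this will also fire on sentences where "by" does not express a means, which fits the "pretty noisy" status noted above.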
The 4th Grade NY Regents’ Science Exam
  • What types of questions are there?
  • What would it take to answer them?

“Models”

5. Domain Models
  • Sometimes you do need some “computational clockwork”
  • Qualitative models
    • qualitative influences (X goes up → Y goes down)
    • what happens to Z if X goes up?
  • Process models
    • partially ordered network of events
    • how does X contribute to Y?
  • Acquisition Task ≠ “read the text”; = extract/build model instances from the text
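A qualitative model of the "X goes up → Y goes down" kind can be sketched as a signed influence graph. The variables below are invented examples, and the search returns the sign along the first path it finds, so conflicting paths are not reconciled.

```python
# Hypothetical influence graph: (cause, effect) -> +1 (same direction) or -1 (opposite).
INFLUENCES = {
    ("temperature", "evaporation"): +1,
    ("evaporation", "water level"): -1,
}


def effect_of_increase(source, target, influences):
    """Return +1/-1 for how target changes when source goes up, or None if no path exists."""
    frontier = [(source, +1)]
    visited = {source}
    while frontier:
        node, sign = frontier.pop()
        if node == target:
            return sign
        for (cause, effect), influence in influences.items():
            if cause == node and effect not in visited:
                visited.add(effect)
                # Signs multiply along a chain of influences.
                frontier.append((effect, sign * influence))
    return None


# "What happens to the water level if temperature goes up?"
print(effect_of_increase("temperature", "water level", INFLUENCES))
```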

5. Example: Process Models

Process reasoner: Given a process, can answer questions, e.g.,
  • What is the role of Entity in Process?
  • What Entity performs Role in Event?
  • During X, what happens after Y?

KA Task = extract a process instance from text:

1. Identify where a process is being described

2. Extract it, e.g., with a set of trained classifiers

Example text: When the cell is stimulated, gated channels open that facilitate Na+ diffusion. Sodium ions then diffuse down their electrochemical gradient….

Extracted events:
  • “stimulate” [theme: “cell”]
  • “open” [theme: “gated channels”]
  • “diffuse” [theme: “sodium ions”, “Na+”; direction: “down ec gradient”]
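The extracted process instance above could be held in a small data structure and queried, as in this sketch. It assumes a simple total order of events, whereas the talk describes a partially ordered network, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    verb: str
    roles: dict = field(default_factory=dict)  # role name -> list of fillers


# Events from the example, stored in textual order (an assumed total order).
PROCESS = [
    Event("stimulate", {"theme": ["cell"]}),
    Event("open", {"theme": ["gated channels"]}),
    Event("diffuse", {"theme": ["sodium ions", "Na+"], "direction": ["down ec gradient"]}),
]


def what_happens_after(process, verb):
    """Answer 'During X, what happens after Y?' for the event named by verb."""
    for index, event in enumerate(process[:-1]):
        if event.verb == verb:
            return process[index + 1]
    return None


def role_of(process, entity):
    """Answer 'What is the role of Entity in Process?' as (event, role) pairs."""
    return [
        (event.verb, role)
        for event in process
        for role, fillers in event.roles.items()
        if entity in fillers
    ]


print(what_happens_after(PROCESS, "open").verb)
print(role_of(PROCESS, "Na+"))
```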

Extracting Process Models

H+ ions flowing down their gradient enter a half channel in a stator, which is anchored in the membrane. H+ ions enter binding sites within a rotor, changing the shape of each subunit so that the rotor spins within the membrane... Spinning of the rotor causes an internal rod to spin as well. This rod extends like a stalk into the knob below it, which is held stationary by part of the stator. Turning of the rod activates catalytic sites in the knob that can produce ATP from ADP and Pi.

[Figure: extracted process graph. Events and participants: flow down (H+ ions, gradient); enter (binding site, rotor); change (shape); spin (rotor, rod); activate (catalytic site); produce (ATP; ADP, Pi). Events are linked by “causes”.]

Another Example: Energy Conversion
  • Modeling technique: Energy conversion
    • extract event sequence (process model)
    • layer energy types on top
    • → initial form of energy? final? form that produced X? etc

Example:
  • baby shake rattle: movement → mechanical energy
  • rattle make noise: sound → sound energy
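Layering energy types on top of an event sequence can be sketched with two small tables. The mapping from event kind to energy type comes from the example above, while the data layout and function name are assumptions made for illustration.

```python
# Event sequence (process model) with the kind of phenomenon each event involves.
EVENTS = [
    ("baby shake rattle", "movement"),
    ("rattle make noise", "sound"),
]

# Energy layer: phenomenon kind -> energy type.
ENERGY_OF = {"movement": "mechanical energy", "sound": "sound energy"}


def energy_chain(events):
    """Return the energy type associated with each event, in order."""
    return [ENERGY_OF[kind] for _, kind in events]


chain = energy_chain(EVENTS)
initial, final = chain[0], chain[-1]
print(initial, "->", final)
```

Questions such as "what is the initial form of energy?" or "the final form?" then reduce to reading off the ends of the chain.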

The 4th Grade NY Regents’ Science Exam
  • What types of questions are there?
  • What would it take to answer them?

“Diagrams”

6. Diagrams, Images, Tables
  • Common in exams; many different styles and challenges
  • Three examples shown: (Non-essential diagram), (Hard), (Extremely hard)

Where to?
  • Revised picture of intelligence
    • Knowledge as a collection of resources, at various levels of formality
      • taxonomic, factual, semi-formal rules, formal models
    • Reasoning as a collection of “experts”, with various specialized skills
      • taxonomic, textual entailment, targeted formal systems
    • Semi-formal representations avoid some of the rigidity of deductive logic
      • ≠ proof tree, = most plausible chain of inference
    • Introspection: Why materialize knowledge at all?
      • Allows refinement and inconsistency reduction
What did we learn from Watson?
  • The obvious:
    • leverage lots of data
    • multiple solvers + machine reasoning = better results
  • The less obvious:
    • evidential reasoning
      • not about finding a proof, but searching for evidence
      • deduction often comes “tantalizingly close”
    • no single, pre-defined ontology
      • Doesn’t mean we don’t need ontologies! (judiciously chosen)

  • “What material is DNA made of?” → “nucleotides”
  • “What shape does the six carbon atoms in glucose form?”
Summary
  • Halo: toward knowledgeable machines
  • Now pursuing a quite different model of intelligence

  • Fourth-Grade Science Tests
    • Wide variety of question types and challenges
      • taxonomic
      • definitional
      • basic facts
      • simple (but many possible) inferences from given facts
      • formal modeling techniques
      • diagrams
    • A good driver and test for this picture!

Thank you!