Watson System

By: Devendra Chaplot, Priyank Chhipa, Pratik Kumar


What Computers Find Easier

(ln(12,546,798 × π))^2 / 34,567.46 = 0.00885
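As a quick check of the arithmetic, the sketch below evaluates the expression in Python. It assumes the intended reading is the square of the natural logarithm, because that reading reproduces the 0.00885 shown on the slide; squaring inside the logarithm gives a different value. The variable names are illustrative only.

```python
import math

x = 12_546_798 * math.pi

# Reading 1: take the logarithm, square it, then divide.
log_squared = math.log(x) ** 2 / 34_567.46

# Reading 2: square inside the logarithm, then divide.
log_of_square = math.log(x ** 2) / 34_567.46

print(round(log_squared, 5))
print(round(log_of_square, 5))
```

Either way, this is the kind of exact numeric work a computer does instantly.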


What Computers Find Easier

Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”

David Jones = David Jones

Dave Jones ≠ David Jones


What Computers Find Hard

Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols. But natural language is implicit, highly contextual, ambiguous and often imprecise.

Structured

Unstructured

  • Where was X born?

    One day, from among his city views of Ulm, Otto chose a water color to send to Albert Einstein as a remembrance of Einstein's birthplace.

  • X ran this?

    If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE.


A Grand Challenge Opportunity

  • Capture the imagination

    • The Next Deep Blue

  • Engage the scientific community

    • Envision new ways for computers to impact society & science

    • Drive important and measurable scientific advances

  • Be Relevant to Important Problems

    • Enable better, faster decision making over unstructured and structured content

    • Business Intelligence, Knowledge Discovery and Management, Government, Compliance, Publishing, Legal, Healthcare, Business Integrity, Customer Relationship Management, Web Self-Service, Product Support, etc.


Real Language is Real Hard

  • Chess

    • A finite, mathematically well-defined search space

    • Limited number of moves and states

    • Grounded in explicit, unambiguous mathematical rules

  • Human Language

    • Ambiguous, contextual and implicit

    • Grounded only in human cognition

    • Seemingly infinite number of ways to express the same meaning


Automatic Open-Domain Question Answering

A Long-Standing Challenge in Artificial Intelligence to emulate human expertise

  • Given

    • Rich Natural Language Questions

    • Over a Broad Domain of Knowledge

  • Deliver

    • Precise Answers: Determine what is being asked & give precise response

    • Accurate Confidences: Determine likelihood answer is correct

    • Consumable Justifications: Explain why the answer is right

    • Fast Response Time: Precision & Confidence in <3 seconds


You may have heard of IBM’s Watson…

A. What is the computer system that played against human opponents on “Jeopardy”…and won.

Why Jeopardy?

The game of Jeopardy! makes great demands on its players, from the range of topical knowledge covered to the nuances in language employed in the clues. The question IBM had for itself was: “Is it possible to build a computer system that could process big data and come up with sensible answers in seconds, so well that it could compete with human opponents?”


Some Basic Jeopardy! Clues

The type of thing being asked for is often indicated but can go from specific to very vague

  • This fish was thought to be extinct millions of years ago until one was found off South Africa in 1938

    • Category: ENDS IN "TH"

    • Answer: coelacanth

  • When hit by electrons, a phosphor gives off electromagnetic energy in this form

    • Category: General Science

    • Answer: light (or photons)

  • Secy. Chase just submitted this to me for the third time--guess what, pal. This time I'm accepting it

    • Category: Lincoln Blogs

    • Answer: his resignation


Lexical Answer Type

  • We define a LAT to be a word in the clue that indicates the type of the answer, independent of assigning semantics to that word. For example in the following clue, the LAT is the string “maneuver.”

  • Category: Oooh….Chess

  • Clue: Invented in the 1500s to speed up the game, this maneuver involves two pieces of the same color.

  • Answer: Castling


Lexical Answer Type

  • About 12 percent of the clues do not indicate an explicit lexical answer type but may refer to the answer with pronouns like “it,” “these,” or “this” or not refer to it at all. In these cases the type of answer must be inferred by the context. Here’s an example:

    • Category: Decorating

    • Clue: Though it sounds “harsh,” it’s just embroidery, often in a floral pattern, done with yarn on cotton cloth.

    • Answer: crewel
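To make the two cases concrete, here is a minimal sketch of a naive LAT guesser. It assumes, purely for illustration, that the LAT is the word following "this" or "these"; the regular expression and the function name `guess_lat` are hypothetical and are not how Watson detects LATs.

```python
import re
from typing import Optional

# Naive heuristic (an assumption for illustration): treat the word that
# follows "this" or "these" as the lexical answer type.
LAT_PATTERN = re.compile(r"\b(?:this|these)\s+([a-z]+)", re.IGNORECASE)


def guess_lat(clue: str) -> Optional[str]:
    """Return the word after 'this'/'these', or None if there is none."""
    match = LAT_PATTERN.search(clue)
    return match.group(1).lower() if match else None


chess_clue = ("Invented in the 1500s to speed up the game, this maneuver "
              "involves two pieces of the same color.")
crewel_clue = ("Though it sounds \"harsh,\" it's just embroidery, often in a "
               "floral pattern, done with yarn on cotton cloth.")

print(guess_lat(chess_clue))   # maneuver
print(guess_lat(crewel_clue))  # None
```

The second clue returns nothing because it only refers to the answer as "it", which is the situation where the type has to be inferred from context.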


How we convert data into knowledge for Watson’s use

Three types of knowledge

  • Domain Data (articles, books, documents)

    • Converted to indices for search/passage lookup

    • Redirects extracted for disambiguation

    • Frame cuts generated with frequencies to determine likely context

    • Pseudo docs extracted for candidate answer generation

  • Training and test question sets with answer keys

    • Used to create the logistic regression model that Watson uses for merging scores

  • NLP Resources (vocabularies, taxonomies, ontologies)

    • Named entity detection, relationship detection algorithms

    • Custom slot grammar parsers, Prolog rules for semantic analysis


Machine Learning

  • One of the core components of the system

    • Multiple models

    • 14,000+ training questions

  • Every candidate answer gets hundreds of features/scores associated with it. These features/scores are passed through a previously trained ML model for candidate answer scoring

  • It's not just one model. In fact there is a chain of models, where each subsequent model uses the scores produced by previously run models

  • Machine learning is also used in other parts of the system, such as LAT confidence analysis.
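The chaining idea can be sketched in a few lines. This is an illustrative two-stage example only: the logistic form, the weights and the function names are hypothetical, hand-set values, whereas real models would be learned from the training questions.

```python
import math
from typing import Sequence


def logistic(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic-regression style score in (0, 1)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-set parameters (not learned, not Watson's).
STAGE1_WEIGHTS, STAGE1_BIAS = [2.0, 1.0], -1.5
STAGE2_WEIGHTS, STAGE2_BIAS = [1.0, 0.5, 3.0], -2.0


def chained_score(features: Sequence[float]) -> float:
    """Stage 2 sees the original features plus the stage-1 score."""
    stage1 = logistic(features, STAGE1_WEIGHTS, STAGE1_BIAS)
    return logistic(list(features) + [stage1], STAGE2_WEIGHTS, STAGE2_BIAS)


candidates = {"strong evidence": [0.9, 0.8], "weak evidence": [0.2, 0.1]}
for name, feats in candidates.items():
    print(name, round(chained_score(feats), 3))
```

The second model consumes the first model's output as one more feature, which is the sense in which the models form a chain.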


NLP

  • Used in many places (Question Analysis, Evidence Analysis, Content Pre-processing)

  • Combines both rule-based and statistical approaches

  • Full NLP stack (used in QA)

    • Tokenization

    • Named Entity Recognition

    • Deep Parsing and Predicate Argument Structure creation

    • Lexical Answer Type (LAT) and Focus detection

    • Anaphora resolution

    • Semantic Relationships extraction

  • Various technologies and techniques are used (English Slot Grammar parser, R2 NED, machine learning for LAT confidence analysis, custom annotators written in Prolog and Java)


NLP Examples

  • LAT and Focus

    • It's the Peter Benchley novel about a killer giant squid that menaces the coast of Bermuda

  • Named Entity Recognition

    • It's the {Person::Peter Benchley} novel about a killer giant {Animal::squid} that menaces the {Location::coast of Bermuda}

  • Anaphora Resolution

    • Columbus embarked on his first voyage to this continent in 1492. In the next two decades he led three more expeditions there.


NLP in evidence analysis and content pre-processing

  • Why do NLP on evidence passages and ingested content?

  • NLP in Evidence Analysis allows:

    • LAT based scoring

    • Named entities alignment based scoring

  • NLP in Content Pre-processing

    • Extracting and accumulating “knowledge” frames from the content

      • For instance

        • SVO frame cuts will contain frequencies of Subject-Verb-Object occurrences in the content that Watson has ingested.

        • e.g. squid menaces coast 809

    • These “knowledge” frames are then used to generate candidate answers
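A minimal sketch of how such SVO frame counts could be accumulated and queried is shown below. The triples are made-up stand-ins for parser output, and `count_svo_frames` and `candidates_for` are hypothetical helper names, not part of Watson.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def count_svo_frames(triples: Iterable[Triple]) -> Counter:
    """Accumulate how often each (subject, verb, object) triple occurs."""
    return Counter(triples)


# Hypothetical parser output; a real system would extract these triples
# from parsed sentences in the ingested content.
parsed = [
    ("squid", "menaces", "coast"),
    ("squid", "menaces", "coast"),
    ("shark", "menaces", "coast"),
    ("squid", "eats", "fish"),
]

frames = count_svo_frames(parsed)


def candidates_for(verb: str, obj: str) -> List[Tuple[str, int]]:
    """Subjects seen with this verb and object, most frequent first."""
    hits = [(s, n) for (s, v, o), n in frames.items() if v == verb and o == obj]
    return sorted(hits, key=lambda pair: -pair[1])


print(candidates_for("menaces", "coast"))
```

Querying the counts with a verb and object yields ranked subjects, which is one simple way frequencies can suggest candidate answers.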


Broad Domain

We do NOT attempt to anticipate all questions and build databases.

We do NOT try to build a formal model of the world.

In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurred <3% of the time. The distribution has a very long tail.

And for each of these types, 1000’s of different things may be asked.

Even going for the head of the tail will barely make a dent.

*13% are non-distinct (e.g. it, this, these or NA)

Our focus is on reusable NLP technology for analyzing vast volumes of as-is text.

Structured sources (DBs and KBs) provide background knowledge for interpreting the text.


Automatic Learning for “Reading”

[Diagram: Volumes of Text → Sentence Parsing → Syntactic Frames (subject, verb, object) → Generalization & Statistical Aggregation → Semantic Frames]

Inventors patent inventions (.8)

Officials Submit Resignations (.7)

People earn degrees at schools (0.9)

Fluid is a liquid (.6)

Liquid is a fluid (.5)

Vessels Sink (0.7)

People sink 8-balls (0.5) (in pool/0.8)


Evaluating Possibilities and Their Evidence

In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus.

  • Many candidate answers (CAs) are generated from many different searches

  • Each possibility is evaluated according to different dimensions of evidence.

  • Just One piece of evidence is if the CA is of the right type. In this case a “liquid”.

  • Organelle

  • Vacuole

  • Cytoplasm

  • Plasma

  • Mitochondria

  • Blood …

“Cytoplasm is a fluid surrounding the nucleus…”

Is(“Cytoplasm”, “liquid”) = 0.2

Is(“organelle”, “liquid”) = 0.1

Is(“vacuole”, “liquid”) = 0.2

Is(“plasma”, “liquid”) = 0.7

WordNet: Is_a(Fluid, Liquid) ?

Learned: Is_a(Fluid, Liquid) yes.
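The sketch below ranks the candidates using only the type scores listed above. It is an illustration of why type match is "just one piece of evidence", not a description of Watson's scorer; the function name `rank_by_type` is hypothetical.

```python
# Type scores copied from the slide: how strongly each candidate
# answer is believed to be a "liquid".
type_scores = {
    "cytoplasm": 0.2,
    "organelle": 0.1,
    "vacuole": 0.2,
    "plasma": 0.7,
}


def rank_by_type(scores: dict) -> list:
    """Order candidates by type score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_by_type(type_scores)
print(ranking[0])
```

On type evidence alone, "plasma" would come out on top, even though the passage evidence points to cytoplasm. Other dimensions of evidence, such as the learned relation that a fluid is a liquid, are needed to rank the candidates well.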


Different Types of Evidence: Keyword Evidence

Clue: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

Passage: In May, Gary arrived in India after he celebrated his anniversary in Portugal.

Keyword matching between clue and passage:

  • In May 1898 ↔ In May

  • celebrated ↔ celebrated

  • 400th anniversary ↔ anniversary

  • Portugal ↔ in Portugal

  • arrival in ↔ arrived in

  • India ↔ India

  • explorer ↔ Gary

Evidence suggests “Gary” is the answer, BUT the system must learn that keyword matching may be weak relative to other types of evidence.
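A small sketch shows why keyword evidence can mislead. It counts shared keywords between the clue and two passages: the "Gary" passage and the Vasco da Gama passage about the landing at Kappad Beach. The tokenizer, the stopword list and the function names are simplifying assumptions made for this illustration.

```python
import re

# Tiny hand-picked stopword list (an assumption for this example).
STOPWORDS = {"in", "on", "the", "of", "this", "s", "after", "he", "his"}


def keywords(text: str) -> set:
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def keyword_overlap(clue: str, passage: str) -> int:
    """Number of distinct keywords shared by clue and passage."""
    return len(keywords(clue) & keywords(passage))


clue = ("In May 1898 Portugal celebrated the 400th anniversary of this "
        "explorer's arrival in India.")
gary = "In May, Gary arrived in India after he celebrated his anniversary in Portugal."
gama = "On 27th May 1498, Vasco da Gama landed in Kappad Beach"

print(keyword_overlap(clue, gary))  # 5
print(keyword_overlap(clue, gama))  # 1
```

The misleading passage shares five keywords with the clue while the correct one shares only one, so keyword overlap has to be weighed against stronger kinds of evidence.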


Different Types of Evidence: Deeper Evidence

Clue: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

Passage: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach.

  • Search far and wide

  • Explore many hypotheses

  • Find and judge evidence

  • Many inference algorithms

Matching between clue and passage:

  • May 1898, 400th anniversary ↔ 27th May 1498 (Temporal Reasoning, Date Math)

  • arrival in ↔ landed in (Statistical Paraphrasing, Para-phrases)

  • India ↔ Kappad Beach (GeoSpatial Reasoning, Geo-KB)

  • explorer ↔ Vasco da Gama

Stronger evidence can be much harder to find and score.

The evidence is still not 100% certain.
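The date arithmetic behind the temporal match can be written out directly. This is only a sketch of the reasoning step, with a hypothetical function name; it does not reflect how Watson implements temporal reasoning.

```python
def anniversary_event_year(celebration_year: int, nth: int) -> int:
    """Year of the original event, given the year of its nth anniversary."""
    return celebration_year - nth


clue_year, nth = 1898, 400   # "In May 1898 ... the 400th anniversary"
passage_year = 1498          # "On 27th May 1498 ..."

matches = anniversary_event_year(clue_year, nth) == passage_year
print(matches)  # True
```

Subtracting 400 from 1898 gives 1498, so the dates agree even though the two sentences share no year in common.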


DeepQA

The technology & architecture behind Watson


The Difference Between Search & DeepQA

Search Engine:

  • Decision Maker has question

  • Distills to 2-3 keywords

  • Search engine finds documents containing keywords

  • Delivers documents based on popularity

  • Decision Maker reads documents, finds answers

  • Finds & analyzes evidence

Expert (DeepQA):

  • Decision Maker asks NL question

  • Expert understands question

  • Produces possible answers & evidence

  • Analyzes evidence, computes confidence

  • Delivers response, evidence & confidence

  • Decision Maker considers answer & evidence


DeepQA: the technology & architecture behind Watson

[Diagram: Initial Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation (Primary Search and Candidate Answer Generation, drawing on Answer Sources) → Hypothesis & Evidence Scoring (Answer Scoring, Evidence Retrieval and Deep Evidence Scoring, drawing on Evidence Sources) → Synthesis → Final Confidence Merging & Ranking → Answer & Confidence. Learned Models help combine and weigh the Evidence.]

1. Initial question formulated: “The name of this monetary unit comes from the word for "round"; earlier coins were often oval”

2. Watson performs question analysis and determines what is being asked.

3. It decides whether the question needs to be subdivided.


4. Watson then starts to generate hypotheses based on decomposition and initial analysis, as many hypotheses as may be relevant to the initial question.

5. In creating the hypotheses it will use, Watson consults numerous sources for potential answers.


6. Watson then uses algorithms to “score” each potential answer and assign a confidence to that answer.

7. Watson uses Evidence Sources to validate its hypotheses and help score the potential answers.

8. If the question was decomposed, Watson brings together hypotheses from sub-parts.


9. Using models on the merged hypotheses, Watson can weigh evidence based on prior “experiences”.

10. Once Watson has ranked its answers, it then provides its answers as well as the confidence it has in each answer.



Step 0 : Content Acquisition

Content acquisition is a combination of manual and automatic steps.

The first step is to analyze example questions from the problem space to produce a description of the kinds of questions that must be answered and a characterization of the application domain.

Analyzing example questions is primarily a manual task, while domain analysis may be informed by automatic or statistical analyses, such as the LAT analysis.


Step 1 : Question Analysis

The system attempts to understand what the question is asking and performs the initial analyses that determine how the question will be processed by the rest of the system.

  • Question Classification, e.g. puzzle/math

  • Focus and Lexical Answer Type (LAT), e.g. “On this day”: LAT is date/day

  • Relation Detection, e.g. sea(India, x, west)

  • Decomposition: divide and conquer


Step 2 : Hypothesis Generation

  • Primary search:

    • Keyword-based search

    • Top 250 results are considered for Candidate Answer generation.

    • Empirical statistics: 85% of the time the answer is within the top 250 results.

  • CA generation: generates CAs using the results of Primary Search

  • Soft Filtering

    • Applies lightweight (less resource-intensive) scoring algorithms to a larger set of initial candidates to prune them down to a smaller set of candidates

    • Reduces the number of CAs to approx. 100

    • Answers are not fully discarded; they may be reconsidered at the final stage.
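Soft filtering can be sketched as a cheap ranking followed by a cut-off, with the remainder set aside rather than dropped. The pool of 250 candidates, the scoring function and the name `soft_filter` are all hypothetical placeholders for illustration.

```python
from typing import Callable, List, Tuple


def soft_filter(
    candidates: List[str],
    cheap_score: Callable[[str], float],
    keep: int = 100,
) -> Tuple[List[str], List[str]]:
    """Rank by a lightweight score; return (kept, set_aside).

    The set-aside candidates are not discarded, so a later stage
    can still reconsider them.
    """
    ranked = sorted(candidates, key=cheap_score, reverse=True)
    return ranked[:keep], ranked[keep:]


# Hypothetical pool of candidates with a made-up lightweight score.
pool = [f"candidate_{i}" for i in range(250)]
score = lambda name: float(name.split("_")[1])

kept, set_aside = soft_filter(pool, score, keep=100)
print(len(kept), len(set_aside))  # 100 150
```

Returning both lists, instead of only the survivors, mirrors the point that filtered answers may be reconsidered at the final stage.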


4. Each CA plugged back into the question is considered a hypothesis, which the system has to prove correct with some threshold of confidence.

5. If the correct answer is not generated as a candidate at this stage, the system has no hope of answering the question.

  • Noise tolerance - tolerate noise in the early stages of the pipeline and drive up precision downstream

  • Favors recall over precision, with the expectation that the rest of the processing pipeline will tease out the correct answer, even if the set of candidates is quite large


Step 3 : Hypothesis & Evidence Scoring

  • Candidate answers that pass the soft filtering threshold undergo a rigorous evaluation process that involves two steps:

  • Evidence retrieval :

    • Gathers additional supporting evidence for each candidate answer, or hypothesis.

      e.g. Passage search: gathering passages by adding CA to primary search query.

  • Scoring:

    • Deep content analysis – includes many different components, or scorers, that consider different dimensions of the evidence

    • Produce a score that corresponds to how well evidence supports a candidate answer for a given question.
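The idea of many scorers, each judging one dimension of the evidence, can be sketched as a set of small functions whose outputs are collected per candidate. The two scorers below are deliberately simple stand-ins invented for this example; they are not Watson's scoring components.

```python
from typing import Callable, Dict

# A scorer maps (question, candidate, passage) to a number.
Scorer = Callable[[str, str, str], float]


def term_overlap(question: str, candidate: str, passage: str) -> float:
    """Fraction of question words that also appear in the passage."""
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def candidate_in_passage(question: str, candidate: str, passage: str) -> float:
    """1.0 if the candidate string occurs in the passage, else 0.0."""
    return 1.0 if candidate.lower() in passage.lower() else 0.0


SCORERS: Dict[str, Scorer] = {
    "term_overlap": term_overlap,
    "candidate_in_passage": candidate_in_passage,
}


def score_evidence(question: str, candidate: str, passage: str) -> Dict[str, float]:
    """Run every scorer and collect one score per evidence dimension."""
    return {name: fn(question, candidate, passage) for name, fn in SCORERS.items()}


scores = score_evidence(
    "this explorer landed in India",
    "Vasco da Gama",
    "Vasco da Gama landed in Kappad Beach",
)
print(scores)
```

Each candidate ends up with a dictionary of named scores, which is the kind of feature set a later ranking stage can combine.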


Step 4 : Final Merging and Ranking

  • Merging:

    • Multiple candidate answers for a question may be equivalent despite very different surface forms.

    • Using an ensemble of matching, normalization and co-reference resolution algorithms, Watson identifies equivalent and related hypotheses.

    • Without merging, ranking algorithms would be comparing multiple surface forms that represent the same answer and trying to discriminate among them.
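As a rough illustration of merging, the sketch below groups candidates whose normalized surface forms coincide and keeps the best score per group. It covers only crude string normalization (no matching or co-reference resolution), and keeping the maximum score is an assumption made for this example; the function names and sample data are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def normalize(answer: str) -> str:
    """Crude normalization: lowercase, drop punctuation and a leading article."""
    text = re.sub(r"[^a-z0-9 ]", "", answer.lower())
    text = re.sub(r"^(the|a|an) ", "", text)
    return " ".join(text.split())


def merge_candidates(scored: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Merge equivalent surface forms, keeping the highest score of each group."""
    merged: Dict[str, float] = defaultdict(float)
    for answer, score in scored:
        key = normalize(answer)
        merged[key] = max(merged[key], score)
    return dict(merged)


scored = [("J.F.K.", 0.4), ("JFK", 0.6), ("The Nile", 0.3), ("nile", 0.5)]
print(merge_candidates(scored))
```

After merging, the ranker compares two distinct answers instead of four surface forms competing against each other.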


  • Ranking and confidence estimation:

    • After merging, the system must rank the hypotheses and estimate confidence based on their merged scores.

    • The system is run over a set of training questions with known answers, and a model is trained on the resulting scores.

    • Watson’s metalearner uses multiple trained models to handle different question classes because, for instance, certain scores that may be crucial to identifying the correct answer for a factoid question may not be as useful on puzzle questions.


The Final Blow! (ctd.)

“I for one welcome our new computer overlords” - Jennings


Watson: a Workload Optimized System

90 x IBM Power 750 servers (see note 1)

Note 1: The Power 750 featuring POWER7 is a commercially available server that runs AIX, IBM i and Linux and has been in market since Feb 2010.

2880 POWER7 cores

POWER7 3.55 GHz chip

500 GB per sec on-chip bandwidth

10 Gb Ethernet network

15 Terabytes of memory

20 Terabytes of disk, clustered

Can operate at 80 Teraflops

Runs IBM DeepQA software

Scales out and searches vast amounts of unstructured information with UIMA & Hadoop open source components

Linux provides a scalable, open platform, optimized to exploit POWER7 performance

10 racks include servers, networking, shared disk system, cluster controllers


Watson: Precision, Confidence & Speed

Deep Analytics: We achieved champion levels of Precision and Confidence over a huge variety of expression.

Speed: By optimizing Watson’s computation for Jeopardy! on 2,880 POWER7 processing cores we went from 2 hours per question on a single CPU to an average of just 3 seconds, fast enough to compete with the best.

Results: In 55 real-time sparring games against former Tournament of Champions players last year, Watson put on a very competitive performance, winning 71%. In the final Exhibition Match against Ken Jennings and Brad Rutter, Watson won!
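The speed figures can be turned into a rough speedup estimate. The sketch below uses only numbers quoted in the slides (2 hours, 3 seconds, 2,880 cores, 90 servers); treating the single-CPU and cluster timings as directly comparable is a simplifying assumption.

```python
single_cpu_seconds = 2 * 60 * 60   # 2 hours per question on a single CPU
cluster_seconds = 3                # average on the full system

speedup = single_cpu_seconds / cluster_seconds
print(speedup)  # 2400.0

cores, servers = 2880, 90
cores_per_server = cores // servers
print(cores_per_server)  # 32
```

A speedup of about 2,400 on 2,880 cores suggests the workload parallelizes well, though the slides do not give enough detail to say more than that.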


Potential Business Applications

  • Healthcare Analytics

    • Analyzing: E-Medical records, hospital reports

    • For: Clinical analysis; treatment protocol optimization

    • Benefits: Better management of chronic diseases; optimized drug formularies; improved patient outcomes

  • Customer Care

    • Analyzing: Call center logs, emails, online media

    • For: Buyer behavior, churn prediction

    • Benefits: Improve customer satisfaction and retention, marketing campaigns, find new revenue opportunities

  • Crime Analytics

    • Analyzing: Case files, police records, 911 calls…

    • For: Rapid crime solving & crime trend analysis

    • Benefits: Safer communities & optimized force deployment

  • Insurance Fraud

    • Analyzing: Insurance claims

    • For: Detecting fraudulent activity & patterns

    • Benefits: Reduced losses, faster detection, more efficient claims processes

  • Automotive Quality Insight

    • Analyzing: Tech notes, call logs, online media

    • For: Warranty analysis, quality assurance

    • Benefits: Reduce warranty costs, improve customer satisfaction, marketing campaigns

  • Social Media for Marketing

    • Analyzing: Call center notes, SharePoint, multiple content repositories

    • For: Churn prediction, product/brand quality

    • Benefits: Improve consumer satisfaction, marketing campaigns, find new revenue opportunities or product/brand quality issues



Questions?


Thank You!