Watson Systems

By Team 7:

  • Pallav Dhobley 09005012
  • Vihang Gosavi 09005016
  • Ashish Yadav 09005018

Motivation:
  • Deep Blue’s triumph over Kasparov in 1997.
  • IBM in search of a new challenge.

Jeopardy!
  • 2004 – the search ends!
  • One of the most popular quiz shows in the U.S.A.
  • Broad/open domain.
  • Complex language.
  • High speed.
  • High precision.
  • Accurate confidence.

Easier than playing Chess?

NO!!

  • Chess:
    • Finite moves and states.
    • Mathematically well-defined search space.
    • Symbols have mathematical meaning.
  • Natural Language:
    • Implicit.
    • Highly contextual.
    • Ambiguous.
    • Imprecise.
Easy Question:

(LN(1,25,46,798 * π))^3 / 34,600.47 = ?

= 0.155
Hard Question:
  • Where was our “father of the nation” born?
    • Contextual.
    • Imprecise.
  • Easy for us Indians to relate the term “father of the nation” to M.K. Gandhi.
  • Not so for computers.
  • Hence the need to learn from as-is content.
What is Watson?
  • An advanced search engine? ×
  • Some fancy database-retrieval system? ×
  • The beginning of Skynet? ×
  • The science behind an answer? √
Principles of DeepQA:
  • Massive parallelism – each hypothesis and interpretation is analyzed independently, in parallel, to generate candidate answers.
  • Many experts – facilitate the integration and contextual evaluation of a wide range of analytics produced by many algorithms running in parallel.
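These two principles can be sketched together: several independent "expert" scorers each analyze every candidate answer, and the per-candidate analyses run in parallel. The scorers and names below are purely illustrative, not Watson's actual analytics.

```python
from concurrent.futures import ThreadPoolExecutor

# Two toy "experts" (stand-ins for Watson's many parallel analytics).
def keyword_overlap(candidate, question):
    q = set(question.lower().split())
    return len(q & set(candidate.lower().split())) / len(q)

def length_prior(candidate, question):
    # Prefer short, name-like answers (purely illustrative heuristic).
    return 1.0 / (1 + abs(len(candidate.split()) - 3))

EXPERTS = [keyword_overlap, length_prior]

def analyze(candidate, question):
    # Each hypothesis is analyzed independently by every expert.
    return [expert(candidate, question) for expert in EXPERTS]

question = "Who is the antagonist of Treasure Island?"
candidates = ["Long John Silver", "Jim Hawkins", "Billy Bones"]

# Independence of hypotheses makes the analysis trivially parallel.
with ThreadPoolExecutor() as pool:
    scores = list(pool.map(lambda c: analyze(c, question), candidates))
print(scores)
```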

Principles of DeepQA (ctd.)
  • Pervasive confidence estimation – no single component commits to an answer.
  • Integrate shallow and deep knowledge – use both shallow and deep semantics for better precision, e.g.:
    • Shallow semantics: keyword matching
    • Deep semantics: logical relationships
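A minimal contrast of the two levels, with a made-up one-fact "knowledge base" (nothing here reflects Watson's internals):

```python
import re

passage = "Long John Silver is the antagonist of Treasure Island."

def shallow_match(query, text):
    # Shallow semantics: plain keyword overlap.
    q = set(re.findall(r"\w+", query.lower()))
    return len(q & set(re.findall(r"\w+", text.lower())))

# Deep semantics: an explicit logical relationship (toy knowledge base).
RELATIONS = {("antagonist_of", "Treasure Island"): "Long John Silver"}

def deep_match(relation, work):
    return RELATIONS.get((relation, work))

print(shallow_match("antagonist of Treasure Island", passage))  # → 4
print(deep_match("antagonist_of", "Treasure Island"))  # → Long John Silver
```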

Step 0: Content Acquisition
  • Identifying and gathering the content to be used for answering and for supporting evidence.
  • Involves analyzing example questions from the problem space, which consist of Q–A pairs from previous games.
  • Encyclopedias, dictionaries, wiki pages, etc. are used to build the evidence sources.
  • Extract, verify and merge the most informative nuggets as part of content acquisition.
Step 1: Question Analysis

The initial analysis that determines how the question will be processed by the rest of the system.

  • Question classification, e.g. puzzle/math.
  • Focus and Lexical Answer Type (LAT), e.g. for “On this day” the LAT is date/day.
  • Relation detection, e.g. sea(India, x, west).
  • Decomposition – divide and conquer.
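A toy question-analysis pass, sketched below; the patterns and LAT labels are illustrative heuristics, not Watson's actual rules:

```python
import re

# (pattern, lexical answer type) pairs -- hypothetical examples only.
LAT_RULES = [
    (r"\bon this day\b", "date/day"),
    (r"^who\b", "person"),
    (r"^where\b", "location"),
]

def lexical_answer_type(question):
    q = question.strip().lower()
    for pattern, lat in LAT_RULES:
        if re.search(pattern, q):
            return lat
    return "unknown"

print(lexical_answer_type("Who is the antagonist of Treasure Island?"))  # person
print(lexical_answer_type("Where was our father of the nation born?"))   # location
```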
Step 2: Hypothesis Generation
  • Primary search:
    • Keyword-based search.
    • The top 250 results are considered for candidate answer (CA) generation.
    • Empirical statistic: 85% of the time the answer is within the top 250 results.
  • CA generation: the above results are further processed to generate CAs.
  • Soft filtering:
    • Reduces the set of candidate answers using superficial analysis (machine learning).
    • Cuts the number of CAs to approx. 100.
    • Answers are not fully discarded; they may be reconsidered at the final stage.
Step 2: Hypothesis Generation (ctd.)
  • Each CA, plugged back into the question, is considered a hypothesis which the system has to prove correct with some threshold of confidence.
  • If this stage fails, the system has no hope of answering the question at all.
  • Noise tolerance.
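The primary-search and soft-filtering flow above can be sketched as follows; the cheap score is a placeholder for the learned soft filter:

```python
def soft_filter(candidates, cheap_score, keep=100):
    # Rank by an inexpensive, superficial score and keep the top `keep`.
    ranked = sorted(candidates, key=cheap_score, reverse=True)
    return ranked[:keep], ranked[keep:]  # kept, deferred (not discarded)

# Primary search: pretend each of the top 250 hits yielded one candidate.
candidates = [f"candidate-{i:03d}" for i in range(250)]
kept, deferred = soft_filter(candidates, cheap_score=lambda c: hash(c) % 97)
print(len(kept), len(deferred))  # → 100 150
```

Note that `deferred` is retained rather than thrown away, matching the slide's point that filtered answers may be reconsidered at the final stage.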
Step 3: Hypothesis & Evidence Scoring
  • Evidence retrieval:
    • Further evidence is gathered to support the hypotheses formed in the last step, e.g. passage search: gathering passages by adding the CA to the primary search query.
  • Scoring:
    • Deep content analysis.
    • Determines the degree of certainty that the retrieved evidence supports the CA.
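A sketch of evidence scoring: each retrieved passage contributes a degree of support for the candidate. Token overlap stands in for Watson's deep content analysis.

```python
import re

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def passage_score(candidate, passage):
    # Fraction of the candidate's terms found in the passage.
    cand = tokens(candidate)
    return len(cand & tokens(passage)) / len(cand)

passages = [
    "The antagonist in Treasure Island is Long John Silver.",
    "Treasure Island, by Stevenson, was a great book.",
]
support = sum(passage_score("Long John Silver", p) for p in passages)
print(support)  # → 1.0 (full support from the first passage, none from the second)
```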
Step 4: Final Merging and Ranking
  • Merging:
    • All hypotheses that yield the same answer are merged.
    • Using an ensemble of matching, normalization and co-reference resolution algorithms, Watson identifies equivalent and related hypotheses.
  • Ranking and confidence estimation:
    • The final merged set of hypotheses is run over a set of training questions with known answers.
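Merging can be sketched as normalization plus score pooling; the normalizer and the integer scores are illustrative, not Watson's ensemble:

```python
from collections import defaultdict

def normalize(answer):
    # Toy normalizer: case-fold and treat hyphens as spaces.
    return " ".join(answer.lower().replace("-", " ").split())

def merge(hypotheses):
    merged = defaultdict(int)
    for answer, score in hypotheses:
        merged[normalize(answer)] += score
    return dict(merged)

hyps = [("Long-John Silver", 4), ("long john silver", 3), ("Jim Hawkins", 2)]
print(merge(hyps))  # → {'long john silver': 7, 'jim hawkins': 2}
```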
Example:
  • Q: “Who is the antagonist of Stevenson's Treasure Island?”
  • Step 1: parse and generate a logical structure describing the question:
    • antagonist(X)
    • antagonist_of(X, Stevenson’s TI)
    • adj_possessive(Stevenson, TI)

Example (ctd.):
  • Step 2: generate semantic assumptions:
    • island(TI), book(TI), movie(TI)
    • author(Stevenson), director(Stevenson)
  • Step 3: build different semantic queries based on phrases, keywords and semantic assumptions.
  • Step 4: generate hundreds of answers from the passages, documents and facts returned by step 3. Long John Silver is likely to be one of them.

Example (ctd.):
  • Step 5: formulate evidence in support or refutation.

(+ve) evidence:

1. Long John Silver is the main character in TI.

2. The antagonist in Treasure Island is Long John Silver.

3. Treasure Island, by Stevenson, was a great book.

(-ve) evidence:

Stevenson = Richard Lewis Stevenson

antagonist = Wolverine

Example (ctd.):
  • Step 6:
    • Combine all the evidence and their scores.
    • Analyze the evidence to compute confidence and return the most confident answer: Long John Silver in this case!
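Combining evidence scores into a single confidence can be sketched as a weighted logistic combination; the scores and weights below are invented (Watson learns its weights from training questions):

```python
import math

def confidence(scores, weights, bias=0.0):
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1 / (1 + math.exp(-z))  # logistic squashing to (0, 1)

evidence_scores = [1.0, 0.8, 0.6]  # illustrative per-evidence scores
weights = [2.0, 1.5, 1.0]          # made-up learned weights
print(round(confidence(evidence_scores, weights), 3))  # → 0.978
```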

Watson’s Brain (Software):
  • Languages used: Java, C++, Prolog.
  • Apache Hadoop framework for distributed computing.
  • Apache UIMA framework:
    • Helps meet DeepQA’s demand for massive parallelism.
    • Facilitated rapid component integration, testing and evaluation.
  • SUSE Linux Enterprise Server 11.
Watson’s Brain (Hardware):
  • One Jeopardy! question takes 2 hours on a normal desktop computer!
  • The real task: confidence determination before buzzing.
  • Hence the pressing need for faster hardware.
Watson’s Brain: (ctd.)
  • Ninety POWER 750 servers in total.
  • 2,880 POWER7 processor cores in total.
  • 16 terabytes of RAM in total.
  • Each POWER 750 server uses 3.5 GHz eight-core POWER7 processors, with 4 threads per core.
  • About the size of eight refrigerators in total.
  • Can process data at up to 500 GB/s.
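The totals are consistent with four eight-core POWER7 chips per server; note the per-server chip count is inferred from the totals, not stated on the slide:

```python
servers = 90
cores_per_server = 4 * 8      # four 8-core POWER7 chips per POWER 750 (inferred)
threads_per_core = 4

total_cores = servers * cores_per_server
print(total_cores)                     # → 2880
print(total_cores * threads_per_core)  # → 11520 hardware threads
```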
The Final Blow!
  • 3 rounds of Jeopardy! between Watson, Rutter and Jennings.
  • Watson comprehensively defeated its competitors with a net score of $77,147.
  • Jennings managed $24,000.
  • Rutter ended third with $21,600.
The Final Blow! (ctd.)

“I for one welcome our new computer overlords” - Jennings

Conclusion:
  • High performance analytics
  • Non-cognitive
  • Smart Learner
  • Not invincible
Watson & Suits
  • Tech support
  • Knowledge management
  • Business intelligence
  • Improved information sharing
Watson for Society – Health Care
  • Symptoms
  • Patient records
  • Tests
  • Medications
  • Notes/hypotheses
  • Texts, journals

Diagnosis models: finding the appropriate “disease” by adjoining the given “symptoms” and “records”.

References:
  • Watson Systems: http://www-03.ibm.com/innovation/us/watson/
  • Wiki page: http://en.wikipedia.org/wiki/Watson_%28computer%2
  • Research papers: http://researcher.ibm.com/researcher/view_page.php?id=2121

References:
  • Jeopardy! IBM Watson Day 1 (Feb 14, 2011): http://www.youtube.com/watch?v=seNkjYyG3gI&feature=related
  • Science Behind an Answer: http://www-03.ibm.com/innovation/us/watson/what-is-watson/science-behind-an-answer.html
  • The AI Magazine: http://www.aaai.org/ojs/index.php/aimagazine/article/view/2303

References:
  • Philip Resnik. 1999. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research.
  • Tom M. Mitchell. 1997. Machine Learning. Computer Science Series. McGraw-Hill.