Advanced AI - Part II - PowerPoint PPT Presentation

Presentation Transcript

  1. Advanced AI - Part II • Luc De Raedt, University of Freiburg, WS 2004/2005 • Many slides taken from Helmut Schmid

  2. Topic • Statistical Natural Language Processing • Applies Machine Learning / Statistics to Natural Language Processing • Learning: the ability to improve one’s behaviour at a specific task over time; involves the analysis of data (statistics)

  3. Rationalism versus Empiricism • Rationalist • Noam Chomsky: innate language structures • AI: hand-coded NLP • Dominant view 1960-1985 • Empiricist • Ability to learn is innate • AI: language is learned from corpora • Dominant 1920-1960 and becoming increasingly important again

  4. Rationalism versus Empiricism • Noam Chomsky: • But it must be recognized that the notion of “probability of a sentence” is an entirely useless one, under any known interpretation of this term • Fred Jelinek (IBM 1988) • Every time a linguist leaves the room the recognition rate goes up. • (Alternative: Every time I fire a linguist the recognizer improves)

  5. This course • Empiricist approach • Focus will be on probabilistic models for learning natural language • No time to treat natural language in depth! • (though this would be quite useful and interesting) • Deserves a full course by itself

  6. Ambiguity

  7. NLP and Statistics • Statistical Disambiguation • Define a probability model for the data • Compute the probability of each alternative • Choose the most likely alternative
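
The three steps above can be sketched as follows. This is only a minimal illustration, assuming a naive Bayes model for the two senses of the ambiguous word "bank"; the probability tables, the `UNSEEN` fallback value and the function names are hypothetical and not taken from the slides.

```python
import math

# Step 1: define a probability model. All numbers are invented for
# illustration and are not estimated from any corpus.
PRIOR = {"river": 0.3, "finance": 0.7}
LIKELIHOOD = {
    "river": {"water": 0.20, "money": 0.01, "fish": 0.15},
    "finance": {"water": 0.01, "money": 0.25, "fish": 0.01},
}
UNSEEN = 0.001  # assumed probability for context words missing from the table


def log_score(sense, context):
    """Step 2: log P(sense) + sum of log P(word | sense) over the context."""
    score = math.log(PRIOR[sense])
    for word in context:
        score += math.log(LIKELIHOOD[sense].get(word, UNSEEN))
    return score


def disambiguate(context):
    """Step 3: choose the alternative with the highest score."""
    return max(PRIOR, key=lambda sense: log_score(sense, context))


print(disambiguate(["water", "fish"]))
print(disambiguate(["money"]))
```

Working in log space avoids numerical underflow when many context words are multiplied together; it does not change which alternative wins.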

  8. NLP and Statistics • Statistical methods deal with uncertainty. They predict the future behaviour of a system based on the behaviour observed in the past. • Statistical methods require training data. • The data in statistical NLP are the corpora.

  9. Corpora • Corpus: text collection for linguistic purposes • Tokens: How many words are contained in Tom Sawyer? 71,370 • Types: How many different words are contained in Tom Sawyer? 8,018 • Hapax legomena: words appearing only once
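
The three notions can be computed with a few lines of Python. This is a sketch under one stated assumption: a token is taken to be a lowercase run of letters and apostrophes. Real counts such as those quoted for Tom Sawyer depend on the tokenization used, so this will not necessarily reproduce them. The function name `corpus_stats` is made up for the example.

```python
import re
from collections import Counter


def corpus_stats(text):
    """Return (number of tokens, number of types, list of hapax legomena)."""
    # Assumption: a token is a lowercase run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    hapaxes = sorted(w for w, c in counts.items() if c == 1)
    return len(words), len(counts), hapaxes


sample = "the cat sat on the mat and the dog sat too"
n_tokens, n_types, hapaxes = corpus_stats(sample)
print(n_tokens, n_types, hapaxes)
```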

  10. Word Counts • The most frequent words are function words

  11. Word Counts • How many words appear f times? (f = frequency, n_f = number of words with that frequency)

        f         n_f
        1         3993
        2         1292
        3          664
        4          410
        5          243
        6          199
        7          172
        8          131
        9           82
        10          91
        11-50      540
        51-100      99
        > 100      102
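
A "frequency of frequencies" table of this kind can be derived from word counts in two counting passes. The sketch below uses an invented toy word list; `frequency_of_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def frequency_of_frequencies(words):
    """Map each frequency f to the number of distinct words occurring f times."""
    word_freq = Counter(words)          # word -> f
    return Counter(word_freq.values())  # f -> number of words with that f


words = "a b a c b a d".split()
fof = frequency_of_frequencies(words)
for f in sorted(fof):
    print(f, fof[f])
```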

  12. Word Counts

  13. Word Counts

  14. Zipf's Law: f ~ 1/r, i.e. f*r is roughly constant (f = frequency, r = rank)

        word          f       r     f*r
        the        3332       1    3332
        and        2972       2    5944
        a          1775       3    5325
        he          877      10    8770
        but         410      20    8200
        be          294      30    8820
        there       222      40    8880
        one         172      50    8600
        about       158      60    9480
        more        138      70    9660
        never       124      80    9920
        Oh          116      90   10440
        two         104     100   10400
        turned       51     200   10200
        you'll       30     300    9000
        name         21     400    8400
        comes        16     500    8000
        group        13     600    7800
        lead         11     700    7700
        friends      10     800    8000
        begin         9     900    8100
        family        8    1000    8000
        brushed       4    2000    8000
        sins          2    3000    6000
        Could         2    4000    8000
        Applausive    1    8000    8000
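
To check Zipf's law on a corpus, one can rank the words by frequency and look at the product f*r. The sketch below shows one way to build such a table; `zipf_table` is a hypothetical helper, and the second part simply recomputes the product for four (word, f, r) rows copied from the slide.

```python
from collections import Counter


def zipf_table(words):
    """Return (word, f, r, f*r) rows, where rank 1 is the most frequent word."""
    ranked = Counter(words).most_common()  # sorted by descending frequency
    return [(w, f, r, f * r) for r, (w, f) in enumerate(ranked, start=1)]


# A few (word, f, r) rows copied from the slide's table.
slide_rows = [("the", 3332, 1), ("he", 877, 10), ("two", 104, 100), ("family", 8, 1000)]
for word, f, r in slide_rows:
    print(word, f * r)
```

Note that words with equal frequency receive consecutive ranks in this sketch; other tie-handling conventions are possible.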

  15. Some probabilistic models • N-grams • Predicting the next word • Artificial intelligence and machine …. • Statistical natural language …. • Probabilistic grammars • Regular (Markov models) • Context-free grammars
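
Next-word prediction with an n-gram model can be sketched for the simplest case, n = 2 (bigrams). The toy corpus below is invented, so the predictions only reflect what this tiny corpus makes likely; `train_bigrams` and `predict_next` are hypothetical names, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follow[prev][nxt] += 1
    return follow


def predict_next(follow, word):
    """Return (most likely next word, estimated P(next | word)), or None if unseen."""
    counts = follow.get(word.lower())
    if not counts:
        return None
    nxt, c = counts.most_common(1)[0]
    return nxt, c / sum(counts.values())


# Invented toy corpus, for illustration only.
corpus = [
    "artificial intelligence and machine learning",
    "machine learning and statistics",
    "statistical natural language processing",
    "natural language processing and machine translation",
]
model = train_bigrams(corpus)
print(predict_next(model, "machine"))
print(predict_next(model, "language"))
```

The probabilities are plain relative frequencies (maximum likelihood estimates), which assign zero probability to any unseen word pair.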

  16. Illustration • Wall Street Journal Corpus • 3,000,000 words • Correct parse tree known for each sentence • Constructed by hand • Can be used to derive stochastic context-free grammars (SCFGs) • An SCFG assigns a probability to each parse tree • Compute the most probable parse tree
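
How an SCFG scores a parse tree can be sketched as follows: the probability of a tree is the product of the probabilities of the rules used in it. The grammar, its probabilities and the example sentence are all invented for illustration (they are not derived from the Wall Street Journal corpus), and the code only compares two hand-written candidate trees rather than searching over all parses as a real parser would.

```python
import math

# Hypothetical rule probabilities; for each left-hand side they sum to 1.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("VP", "PP")): 0.4,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("she",)): 0.3,
    ("NP", ("stars",)): 0.3,
    ("NP", ("telescopes",)): 0.2,
    ("PP", ("P", "NP")): 1.0,
    ("V", ("saw",)): 1.0,
    ("P", ("with",)): 1.0,
}


def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); leaves are strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULES[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p


# Two candidate parses of "she saw stars with telescopes".
pp = ("PP", ("P", "with"), ("NP", "telescopes"))
verb_attach = ("S", ("NP", "she"),
               ("VP", ("VP", ("V", "saw"), ("NP", "stars")), pp))
noun_attach = ("S", ("NP", "she"),
               ("VP", ("V", "saw"), ("NP", ("NP", "stars"), pp)))

candidates = {"verb_attach": verb_attach, "noun_attach": noun_attach}
best = max(candidates, key=lambda name: tree_prob(candidates[name]))
print(best, tree_prob(candidates[best]))
```

With these made-up numbers the reading in which the prepositional phrase attaches to the verb phrase scores higher; different rule probabilities could reverse the outcome.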

  17. Conclusions • Overview of some probabilistic and machine learning methods for NLP • Also very relevant to bioinformatics! • Analogy between parsing a sentence and parsing a biological string (DNA, protein, mRNA, …)