Textbooks you need
PowerPoint Slideshow about ' *Textbooks you need' - peigi

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Textbooks you need
*Textbooks you need

  • Manning, C. D., Schütze, H.:

    • Foundations of Statistical Natural Language Processing. The MIT Press. 1999. ISBN 0-262-13360-1. [required]

  • Allen, J.:

    • Natural Language Understanding. The Benjamin/Cummings Publishing Co. 1995. 2nd edition

  • Jurafsky, D. and J. H. Martin:

    • Speech and Language Processing. Prentice-Hall. 2009. 2nd edition

    Other reading

    • Charniak, E:

      • Statistical Language Learning. The MIT Press. 1996. ISBN 0-262-53141-0.

    • Cover, T. M., Thomas, J. A.:

      • Elements of Information Theory. Wiley. 1991. ISBN 0-471-06259-6.

    • Jelinek, F.:

      • Statistical Methods for Speech Recognition. The MIT Press. 1998. ISBN 0-262-10066-5

    • Proceedings of major conferences:

      • ACL (Association for Computational Linguistics)

      • NAACL HLT (North American Chapter of the ACL)

      • COLING (Intl. Committee on Computational Linguistics)

      • ACM SIGIR

      • Interspeech/ASRU/SLT

    Course segments

    • Intro & Probability & Information Theory

      • The very basics: definitions, formulas, examples.

    • Language Modeling

      • n-gram models, parameter estimation

      • smoothing (EM algorithm)

    • A Bit of Linguistics

      • phonology, morphology, syntax, semantics, discourse

    • Words and the Lexicon

      • word classes, mutual information, bit of lexicography.

    Course segments (cont.)

    • Hidden Markov Models

      • background, algorithms, parameter estimation

    • Tagging: Methods, Algorithms, Evaluation

      • tagsets, morphology, lemmatization

      • HMM tagging, Transformation-based, Feature-based

    • NL Grammars and Parsing: Data, Algorithms

      • Grammars and Automata, Deterministic Parsing

      • Statistical parsing. Algorithms, parameterization, evaluation

    • Applications (MT, ASR, IR, Q&A, ...)

    Goals of the HLT

    Computers would be a lot more useful if they could handle our email, do our library research, talk to us …

    But they are fazed by natural human language.

    How can we give computers the ability to handle human language? (Or help them learn it as kids do?)

    A few applications of HLT

    • Spelling correction, grammar checking … (language learning and evaluation, e.g. TOEFL essay scoring)

    • Better search engines

    • Information extraction, gisting

    • Psychotherapy; Harlequin romances; etc.

    • New interfaces:

      • Speech recognition (and text-to-speech)

      • Dialogue systems (USS Enterprise onboard computer)

      • Machine translation; speech translation (the Babel tower??)

    • Trans-lingual summarization, detection, extraction …

    Question Answering: IBM’s Watson

    • Won Jeopardy on February 16, 2011!





    Bram Stoker

    Information Extraction

    Email:

      Subject: curriculum meeting

      Date: January 15, 2012

      To: Dan Jurafsky

      Hi Dan, we’ve now scheduled the curriculum meeting.

      It will be in Gates 159 tomorrow from 10:00-11:30.

    Create new Calendar entry:

      Event: Curriculum mtg

      Date: Jan-16-2012

      Start: 10:00am

      End: 11:30am

      Where: Gates 159
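
    The calendar fields on this slide could be pulled from the email with a few hand-written patterns. The sketch below is only an illustration under stated assumptions: the function name `extract_event`, the regular expressions, and the rule that "tomorrow" means the day after the email's Date header are all invented here, not taken from any real system.

```python
import re
from datetime import datetime, timedelta

def extract_event(sent_date, body):
    """Extract a calendar entry from an email body (illustrative only)."""
    sent = datetime.strptime(sent_date, "%B %d, %Y")
    # Assumption: the only relative date handled is "tomorrow".
    day = sent + timedelta(days=1) if "tomorrow" in body.lower() else sent
    # Assumed patterns: a room looks like "in <Capitalized word> <number>",
    # and a time span looks like "from H:MM-H:MM".
    where = re.search(r"\bin ([A-Z]\w+ \d+)", body)
    span = re.search(r"from (\d{1,2}:\d{2})-(\d{1,2}:\d{2})", body)
    return {
        "date": day.strftime("%b-%d-%Y"),
        "start": span.group(1) if span else None,
        "end": span.group(2) if span else None,
        "where": where.group(1) if where else None,
    }

body = ("Hi Dan, we've now scheduled the curriculum meeting. "
        "It will be in Gates 159 tomorrow from 10:00-11:30.")
event = extract_event("January 15, 2012", body)
```

    Patterns this rigid break on almost any rewording of the email, which is part of why information extraction is usually treated as a statistical problem.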

    Information Extraction & Sentiment Analysis




    Attributes: size and weight; ease of use

    Size and weight

    • nice and compact to carry!

    • since the camera is small and light, I won't need to carry around those heavy, bulky professional cameras either!

    • the camera feels flimsy, is plastic and very light in weight you have to be very delicate in the handling of this camera

    Machine Translation

    • Helping human translators

    • Fully automatic

    Enter Source Text:

     这 不过 是 一 个 时间 的 问题 .

    Translation from Stanford’s Phrasal:

    This is only a matter of time.

    Language Technology

    mostly solved

    • Spam detection

      • Let’s go to Agra!

      • Buy V1AGRA …

    • Part-of-speech (POS) tagging

      • Colorless green ideas sleep furiously.

    • Named entity recognition (NER)

      • Einstein met with UN officials in Princeton

    making good progress

    • Sentiment analysis

      • Best roast chicken in San Francisco!

      • The waiter ignored us for 20 minutes.

    • Coreference resolution

      • Carter told Mubarak he shouldn’t run again.

    • Word sense disambiguation (WSD)

      • I need new batteries for my mouse.

    • Parsing

      • I can see Alcatraz from the window!

    • Machine translation (MT)

      • The 13th Shanghai International Film Festival…

    • Information extraction (IE)

      • You’re invited to our dinner party, Friday May 27 at 8:30

      • Party | May 27 | add

    still really hard

    • Question answering (QA)

      • Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?

    • Paraphrase

      • XYZ acquired ABC yesterday

      • ABC has been taken over by XYZ

    • Summarization

      • The Dow Jones is up; The S&P500 jumped; Housing prices rose

      • Economy is good

    • Dialog

      • Where is Citizen Kane playing in SF?

      • Castro Theatre at 7:30. Do you want a ticket?

    Ambiguity makes NLP hard: “Crash blossoms”

    Violinist Linked to JAL Crash Blossoms

    Teacher Strikes Idle Kids

    Red Tape Holds Up New Bridges

    Hospitals Are Sued by 7 Foot Doctors

    Juvenile Court to Try Shooting Defendant

    Local High School Dropouts Cut in Half


    Ambiguity is pervasive

    • New York Times headline (17 May 2000)

    Fed raises interest rates 0.5%

    [Figure: alternative parse trees for “Fed raises interest rates”]
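
    To get a feel for how quickly readings multiply, the sketch below enumerates the part-of-speech sequences that a tiny lexicon allows for the headline. The lexicon is a hypothetical toy assumption (real tagsets and dictionaries are much richer), and the "exactly one verb" filter is only a crude stand-in for a grammar.

```python
from itertools import product

# Hypothetical toy lexicon: the tags each word could take out of context.
LEXICON = {
    "Fed": ["NOUN"],
    "raises": ["NOUN", "VERB"],
    "interest": ["NOUN", "VERB"],
    "rates": ["NOUN", "VERB"],
}

def tag_sequences(words):
    """Enumerate every tag sequence the lexicon allows for the words."""
    return list(product(*(LEXICON[w] for w in words)))

headline = "Fed raises interest rates".split()
sequences = tag_sequences(headline)
# Keep the sequences with exactly one verb, a crude stand-in for "one main verb".
one_verb = [s for s in sequences if s.count("VERB") == 1]
```

    Even with this four-word headline, several tag sequences survive the filter: the verb could be "raises", "interest", or "rates", and each choice corresponds to a different reading.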

    Why else is natural language understanding difficult?

    non-standard English

    • Great job @justinbieber! Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥

    segmentation issues

    • the New York-New Haven Railroad

    idioms

    • dark horse

    • get cold feet

    • lose face

    • throw in the towel

    tricky entity names

    • Where is A Bug’s Life playing …

    • Let It Be was recorded …

    • … a mutation on the for gene …

    world knowledge

    • Mary and Sue are sisters.

    • Mary and Sue are mothers.

    But that’s what makes it fun!

    Making progress on this problem…

    • The task is difficult! What tools do we need?

      • Knowledge about language

      • Knowledge about the world

      • A way to combine knowledge sources

    • How we generally do this:

      • probabilistic models built from language data

        • P(“maison” → “house”) high

        • P(“L’avocat général” → “the general avocado”) low

      • Luckily, rough text features can often do half the job.
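
    As a rough illustration of "probabilistic models built from language data", the sketch below estimates word translation probabilities by relative frequency. The aligned pairs are invented for the example, and real systems estimate such tables from large parallel corpora and over phrases rather than single words, so treat this as a toy under those assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical aligned (French, English) pairs, invented for illustration.
ALIGNED = [
    ("maison", "house"), ("maison", "house"), ("maison", "home"),
    ("avocat", "lawyer"), ("avocat", "lawyer"), ("avocat", "lawyer"),
    ("avocat", "avocado"),
]

def translation_probs(pairs):
    """Relative-frequency estimate of P(english | french)."""
    counts = defaultdict(Counter)
    for fr, en in pairs:
        counts[fr][en] += 1
    return {
        fr: {en: c / sum(ens.values()) for en, c in ens.items()}
        for fr, ens in counts.items()
    }

probs = translation_probs(ALIGNED)
```

    With this toy data, `probs["maison"]["house"]` comes out high and `probs["avocat"]["avocado"]` comes out low, simply because of how often each pairing was observed.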

    This class

    • Teaches key theory and methods for statistical NLP:

      • Viterbi

      • Naïve Bayes, Maxent classifiers

      • N-gram language modeling

      • Statistical Parsing

      • Inverted index, tf-idf, vector models of meaning

    • For practical, robust real-world applications

      • Information extraction

      • Spelling correction

      • Information retrieval

      • Sentiment analysis
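
    Of the methods listed on this slide, tf-idf is the quickest to show in a few lines. The sketch below uses raw term frequency and idf = log(N / df), which is one common variant among several; the function name `tf_idf` and the three example documents are assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Uses raw term frequency and idf = log(N / df), one common variant.
    """
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    return [
        {t: tf * math.log(n / df[t]) for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]

docs = ["the camera is small", "the camera is heavy", "the waiter ignored us"]
weights = tf_idf(docs)
```

    A word that occurs in every document ("the") gets weight zero, while a word unique to one document ("small") gets the largest weight there.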

    Levels of Language

    • Phonetics/phonology/morphology: what words (or subwords) are we dealing with?

    • Syntax: What phrases are we dealing with? Which words modify one another?

    • Semantics: What’s the literal meaning?

    • Pragmatics: What should you conclude from the fact that I said something? How should you react?

    What’s hard – ambiguities, ambiguities, all different levels of ambiguities

    John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there. [from J. Eisner]

    - donut: To get a donut (doughnut; spare tire) for his car?

    - Donut store: store where donuts shop? or is run by donuts? or looks like a big donut? or made of donut?

    - From work: Well, actually, he stopped there from hunger and exhaustion, not just from work.

    - Every few hours: That’s how often he thought it? Or that’s for coffee?

    - it: the particular coffee that was good every few hours? the donut store? the situation?

    - Too expensive: too expensive for what? what are we supposed to conclude about what John did?

    NLP: The Main Issues

    • Why is NLP difficult?

      • many “words”, many “phenomena” --> many “rules”

        • OED: 400k words; Finnish lexicon (of forms): ~2 × 10^7

        • sentences, clauses, phrases, constituents, coordination, negation, imperatives/questions, inflections, parts of speech, pronunciation, topic/focus, and much more!

        • irregularity (exceptions, exceptions to the exceptions, ...)

        • potato -> potatoes (tomato, hero, ...); photo -> photos; and even both: mango -> mangos or mangoes

        • Adjective / Noun order: new book, electrical engineering, general regulations, flower garden, garden flower, ...; but Governor General
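
    The pluralization examples above show why categorical rules end up carrying exception lists. A minimal sketch, assuming tiny hand-made exception sets (`TAKES_ES` and `BOTH` are hypothetical names, and real lexicons are far larger):

```python
# Hypothetical exception lists; real lexicons are far larger.
TAKES_ES = {"potato", "tomato", "hero"}
BOTH = {"mango"}

def plurals(noun):
    """Return the accepted plural forms of a noun ending in -o."""
    if noun in BOTH:
        return [noun + "s", noun + "es"]
    if noun in TAKES_ES:
        return [noun + "es"]
    return [noun + "s"]  # default rule, e.g. photo -> photos

forms = {n: plurals(n) for n in ["potato", "photo", "mango"]}
```

    Every new irregular noun means another entry in one of the sets, which is the "exceptions, exceptions to the exceptions" problem in miniature.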

    Difficulties in NLP (cont.)

    • ambiguity

      • books: NOUN or VERB?

        • you need many books vs. she books her flights online

      • No left turn weekdays 4-6 pm / except transit vehicles (Charles Street at Cold Spring)

        • when may transit vehicles turn: Always? Never?

      • Thank you for not smoking, drinking, eating or playing radios without earphones. (MTA bus)

        • Thank you for not eating without earphones??

        • or even: Thank you for not drinking without earphones!?

      • My neighbor’s hat was taken by wind. He tried to catch it.

        • ...catch the wind or ...catch the hat?

    (Categorical) Rules or Statistics?

    • Preferences:

      • clear cases: context clues: she books --> books is a verb

        • rule: if an ambiguous word (verb/nonverb) is preceded by a matching personal pronoun -> word is a verb

      • less clear cases: pronoun reference

        • she/he/it refers to the most recent noun or pronoun (?) (but maybe we can specify exceptions)

      • selectional:

        • catching hat >> catching wind (but why not?)

      • semantic:

        • never thank for drinking in a bus! (but what about the earphones?)
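
    The "clear case" rule above (an ambiguous word preceded by a matching personal pronoun is a verb) can be written down directly. In this sketch the word lists are hypothetical toy data, and falling back to NOUN when the rule does not fire is a simplifying assumption, not part of the rule on the slide.

```python
# Hypothetical toy data for illustrating the rule.
AMBIGUOUS = {"books", "flights", "turns"}   # verb or noun out of context
THIRD_SINGULAR = {"she", "he", "it"}        # pronouns that match an -s verb form

def tag_ambiguous(tokens):
    """Tag each ambiguous token VERB if a matching pronoun precedes it, else NOUN."""
    tags = {}
    for i, tok in enumerate(tokens):
        if tok.lower() in AMBIGUOUS:
            prev = tokens[i - 1].lower() if i > 0 else ""
            tags[i] = "VERB" if prev in THIRD_SINGULAR else "NOUN"
    return tags

verb_case = tag_ambiguous("she books her flights online".split())
noun_case = tag_ambiguous("you need many books".split())
```

    The rule handles "she books" cleanly, but it has nothing to say about the less clear cases on the slide, which is where statistical preferences come in.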

    Solutions

    • Don’t guess if you know:

      • morphology (inflections)

      • lexicons (lists of words)

      • unambiguous names

      • perhaps some (really) fixed phrases

      • syntactic rules?

    • Use statistics (based on real-world data) for preferences (only?)

      • No doubt: but this is the big question!

    Statistical NLP

    • Imagine:

      • Each sentence W = { w1, w2, ..., wn } gets a probability P(W|X) in a context X (think of it in the intuitive sense for now)

      • For every possible context X, sort all the imaginable sentences W according to P(W|X):

      • Ideal situation:

        [Figure: sentences sorted by P(W), from the best sentence (most probable in context X) down to the “ungrammatical” sentences. NB: same for …]

    Real World Situation

    • Unable to specify set of grammatical sentences today using fixed “categorical” rules (maybe never)

    • Use statistical “model” based on REAL WORLD DATA and care about the best sentence only (disregarding the “grammaticality” issue)

      [Figure: sentences sorted by probability, from Wbest (the best sentence) to Wworst]
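
    The "care about the best sentence only" idea amounts to scoring candidate sentences with a statistical model and keeping the top one. The sketch below is one possible instance under stated assumptions: it uses an add-one smoothed bigram model trained on a two-sentence toy corpus, ignores the context X entirely, and the names `train_bigram` and `log_prob` are invented for illustration.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count history tokens and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks[:-1])          # every token that can start a bigram
        bigrams.update(zip(toks, toks[1:]))
    vocab = {t for s in corpus for t in s.split()} | {"</s>"}
    return unigrams, bigrams, len(vocab)

def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a sentence."""
    unigrams, bigrams, v = model
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + v))
        for a, b in zip(toks, toks[1:])
    )

corpus = ["she books her flights online", "you need many books"]
model = train_bigram(corpus)
candidates = ["she books her flights online", "online flights her books she"]
# Sort from most to least probable; ranked[0] plays the role of Wbest.
ranked = sorted(candidates, key=lambda w: log_prob(w, model), reverse=True)
```

    Note that the scrambled sentence still receives a non-zero probability: the model never declares anything "ungrammatical", it only ranks.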