
Parallel Tools for Natural Language Processing

Mark Brigham, Melanie Goetz, Andrew Hogue. 6.338 / 18.337, March 16, 2004. Sentence parsing: consider the sentence “John ate the cookie on the table”. We want to tag the sentence with parts of speech and group the words by phrase.


Presentation Transcript


  1. Parallel Tools for Natural Language Processing • Mark Brigham, Melanie Goetz, Andrew Hogue • 6.338 / 18.337 - March 16, 2004

  2. Sentence Parsing • Consider the sentence: “John ate the cookie on the table” • We want to: • Tag the sentence with parts of speech • Group the words by phrase

  3. Context Free Grammars • Recursive set of rules • Defines what syntactic structure can be applied to a phrase or word • Top-level rule S defines the sentence …

  4. Context Free Grammars • Applying a CFG to a sentence creates a parse-tree for that sentence
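A minimal sketch of how a CFG and a parse tree might be written down as plain Python data. The grammar rules used on the original slides are not in this transcript, so `GRAMMAR` below is a hypothetical toy grammar for the example sentence, and the helper names `leaves`, `licensed`, and `bracketed` are illustrative only.

```python
# Hypothetical toy grammar (an assumption, not the slides' grammar).
# Each left-hand side maps to a list of right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("John",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",)],
    "N": [("cookie",), ("table",)],
    "V": [("ate",)],
    "P": [("on",)],
}

# A parse tree as nested tuples: (label, child, child, ...); leaves are words.
TREE = ("S",
        ("NP", "John"),
        ("VP",
         ("V", "ate"),
         ("NP",
          ("NP", ("Det", "the"), ("N", "cookie")),
          ("PP", ("P", "on"),
           ("NP", ("Det", "the"), ("N", "table"))))))


def leaves(tree):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def licensed(tree, grammar=GRAMMAR):
    """True if every node of the tree is an instance of some grammar rule."""
    if isinstance(tree, str):
        return True
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return rhs in grammar.get(label, []) and all(
        licensed(c, grammar) for c in children)


def bracketed(tree):
    """Render the tree in bracket notation, e.g. [NP [Det the] [N cookie]]."""
    if isinstance(tree, str):
        return tree
    return "[" + tree[0] + " " + " ".join(bracketed(c) for c in tree[1:]) + "]"


if __name__ == "__main__":
    print(" ".join(leaves(TREE)))
    print(bracketed(TREE))
```

The recursion in the rule `NP -> NP PP` is what makes the rule set recursive: a noun phrase can contain a prepositional phrase that itself contains a noun phrase.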

  5. Context Free Grammars • Top-down parse

  6. Context Free Grammars • Bottom-up parse • Parallelizable!

  7. Ambiguity • More than one parse for a single sentence!
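To make the ambiguity concrete, here is a small sketch of two parse trees for the example sentence, assuming the usual prepositional-phrase attachment reading: “on the table” can modify “the cookie” or the act of eating. The tree shapes and category labels are assumptions for illustration; the trees drawn on the slide are not in this transcript.

```python
# Parse trees as nested tuples: (label, child, child, ...); leaves are words.
PP = ("PP", ("P", "on"), ("NP", ("Det", "the"), ("N", "table")))
COOKIE = ("NP", ("Det", "the"), ("N", "cookie"))

# Reading 1: the PP attaches to the noun phrase (the cookie that is on the table).
NOUN_ATTACH = ("S", ("NP", "John"),
               ("VP", ("V", "ate"), ("NP", COOKIE, PP)))

# Reading 2: the PP attaches to the verb phrase (the eating happened on the table).
VERB_ATTACH = ("S", ("NP", "John"),
               ("VP", ("VP", ("V", "ate"), COOKIE), PP))


def words(tree):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree[1:]:
        out.extend(words(child))
    return out


if __name__ == "__main__":
    # Same word sequence, different structure.
    print(words(NOUN_ATTACH) == words(VERB_ATTACH))
    print(NOUN_ATTACH != VERB_ATTACH)
```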

  8. Parallelization • Bottom-up rule application appropriate for parallel processing • Ambiguous parses also parallelizable • Long, complex sentences may be most interesting • Proust?

  9. Chart Parsing • Create a matrix where entries correspond to words/phrases • If there is a valid CFG parse of a phrase [i,j], add it to that matrix cell • A cell [i,j] depends only on cells for shorter spans inside it: cells [m,n] with i ≤ m, n ≤ j, and [m,n] ≠ [i,j]
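The chart idea can be sketched as a CYK-style bottom-up parser. This is a minimal sketch under stated assumptions, not the authors' implementation: the grammar (`LEXICON`, `BINARY`) is a hypothetical toy grammar in binary form, each cell stores a count of parses per category, and `fill_cell` and `parse` are illustrative names. Cells for spans of the same length depend only on shorter spans, so each length can be filled concurrently; a thread pool stands in for the parallel workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy grammar (an assumption, not taken from the slides).
LEXICON = {"John": ["NP"], "ate": ["V"], "the": ["Det"],
           "cookie": ["N"], "table": ["N"], "on": ["P"]}
BINARY = {("NP", "VP"): ["S"], ("Det", "N"): ["NP"], ("NP", "PP"): ["NP"],
          ("V", "NP"): ["VP"], ("VP", "PP"): ["VP"], ("P", "NP"): ["PP"]}


def fill_cell(chart, i, j):
    """Count parses of words[i..j] (inclusive) from already-filled sub-spans."""
    counts = defaultdict(int)
    for k in range(i, j):  # split the span into [i,k] and [k+1,j]
        for left, n_left in chart[i, k].items():
            for right, n_right in chart[k + 1, j].items():
                for parent in BINARY.get((left, right), []):
                    counts[parent] += n_left * n_right
    return dict(counts)


def parse(words, workers=4):
    """Fill the chart bottom-up, one span length at a time."""
    n = len(words)
    chart = {}
    for i, word in enumerate(words):
        chart[i, i] = {label: 1 for label in LEXICON.get(word, [])}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for length in range(2, n + 1):
            spans = [(i, i + length - 1) for i in range(n - length + 1)]
            # All cells of this length are independent of one another.
            cells = list(pool.map(lambda s: fill_cell(chart, *s), spans))
            for span, cell in zip(spans, cells):
                chart[span] = cell
    return chart


if __name__ == "__main__":
    sentence = "John ate the cookie on the table".split()
    chart = parse(sentence)
    print(chart[0, len(sentence) - 1])  # categories and parse counts for the full sentence
```

With this toy grammar the top cell holds two parses of category S, one per attachment of “on the table”. In standard CPython the threads mainly illustrate the dependency structure; a process pool or a different runtime would be needed for an actual speedup on CPU-bound work.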

  10-20. [Figures: chart-parsing matrix for “John ate the cookie on the table”, with the words of the sentence as both row and column labels]

  21. Other Tools • Considering parallelizing other NLP tools • Word-stemming: Multiple finite state automata applied to a single word in parallel • Automated part-of-speech recognition on large corpora
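As a rough sketch of the word-stemming idea, each suffix rule below is a regular expression (and so recognizable by a finite state automaton), and all rules are applied to one word concurrently. The rules, the tie-breaking choice of the shortest candidate, and the names `RULES`, `apply_rule`, and `stem` are assumptions for illustration; the slides do not describe the actual automata.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical suffix-stripping rules: (pattern, replacement).
RULES = [
    (re.compile(r"(.+)ies$"), r"\1y"),
    (re.compile(r"(.+)ing$"), r"\1"),
    (re.compile(r"(.+)ed$"), r"\1"),
    (re.compile(r"(.+[^s])s$"), r"\1"),
]


def apply_rule(rule, word):
    """Return the rewritten word if the rule matches, else None."""
    pattern, replacement = rule
    if pattern.match(word):
        return pattern.sub(replacement, word)
    return None


def stem(word):
    """Run every rule on the word concurrently and keep the shortest result."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda rule: apply_rule(rule, word), RULES)
        candidates = [r for r in results if r is not None]
    return min(candidates, key=len) if candidates else word


if __name__ == "__main__":
    for w in ["tables", "parsing", "ponies", "John"]:
        print(w, "->", stem(w))
```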
