Presentation Transcript

  1. Ling 570 Day 17: Named Entity Recognition / Chunking

  2. Sequence Labeling
  • Goal: find the most probable labeling of a sequence
  • Many sequence labeling tasks:
    • POS tagging
    • Word segmentation
    • Named entity tagging
    • Story/spoken sentence segmentation
    • Pitch accent detection
    • Dialog act tagging

  3. NER as Sequence Labeling

  4-9. NER as Classification Task
  • Instance: token
  • Labels:
    • Position: B(eginning), I(nside), O(utside)
    • NER types: PER, ORG, LOC, NUM
    • Label: Type-Position, e.g. PER-B, PER-I, O, ...
  • How many tags? (|NER Types| × 2) + 1
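
To make the tag arithmetic concrete, here is a minimal Python sketch that enumerates the Type-Position tag set for the four NER types above: four types times two positions, plus the single O tag, gives nine labels.

```python
# Enumerate the Type-Position tag set: (|NER types| * 2) + 1 labels.
NER_TYPES = ["PER", "ORG", "LOC", "NUM"]
TAGS = [f"{t}-{pos}" for t in NER_TYPES for pos in ("B", "I")] + ["O"]
print(len(TAGS))  # 9
print(TAGS)       # ['PER-B', 'PER-I', 'ORG-B', ..., 'O']
```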

  10-13. NER as Classification: Features
  • What information can we use for NER?
    • Predictive tokens, e.g. MD, Rev, Inc., ...
    • How general are these features? Language? Genre? Domain?

  14-21. NER as Classification: Shape Features
  • Shape types:
    • lower (all lower case), e.g. e. e. cummings
    • capitalized (first letter uppercase), e.g. Washington
    • all caps (all letters capitalized), e.g. WHO
    • mixed case (mixed upper and lower case), e.g. eBay
    • capitalized with period, e.g. H.
    • ends with digit, e.g. A9
    • contains hyphen, e.g. H-P
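
The shape categories above can be computed from the token string alone. A minimal sketch, with one reasonable (but not canonical) choice of category names and test order:

```python
import re

def word_shape(token):
    """Map a token to one of the shape categories from the slide.
    The test order matters: e.g. 'H-P' is also all upper case,
    so the hyphen check comes before the all-caps check."""
    if re.fullmatch(r"[A-Z]\.", token):
        return "cap-period"    # H.
    if token[-1:].isdigit():
        return "ends-digit"    # A9
    if "-" in token:
        return "has-hyphen"    # H-P
    if token.isupper():
        return "all-caps"      # WHO
    if token.islower():
        return "lower"         # cummings
    if token[:1].isupper() and token[1:].islower():
        return "capitalized"   # Washington
    return "mixed-case"        # eBay

for w in ["cummings", "Washington", "WHO", "eBay", "H.", "A9", "H-P"]:
    print(f"{w}\t{word_shape(w)}")
```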

  22. Example Instance Representation
  • Example

  23. Sequence Labeling
  • Example
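
The slide's example is a figure; as a stand-in, here is an invented sentence hand-labeled with the Type-Position scheme from the earlier slides:

```python
# Illustrative only: tokens paired with Type-Position labels.
tokens = ["John", "Smith", "joined", "IBM", "in", "New", "York", "in", "1995"]
labels = ["PER-B", "PER-I", "O", "ORG-B", "O", "LOC-B", "LOC-I", "O", "NUM-B"]
for tok, lab in zip(tokens, labels):
    print(f"{tok}\t{lab}")
```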

  24-26. Evaluation
  • System: output of automatic tagging
  • Gold standard: true tags
  • Precision: # correct chunks / # system chunks
  • Recall: # correct chunks / # gold chunks
  • F-measure: F1 = 2PR / (P + R)
    • F1 balances precision & recall
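
A minimal sketch of chunk-level scoring: as in the CoNLL-style evaluation mentioned on the next slides, a chunk counts as correct only if both its label and its span match the gold standard. The example chunks are invented.

```python
def precision_recall_f1(gold_chunks, sys_chunks):
    """Chunks are (label, start, end) spans; exact match required."""
    gold, sys = set(gold_chunks), set(sys_chunks)
    correct = len(gold & sys)
    p = correct / len(sys) if sys else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [("PER", 0, 2), ("ORG", 3, 4), ("LOC", 5, 7)]   # invented example
sys  = [("PER", 0, 2), ("ORG", 3, 5)]                  # one span is wrong
print(precision_recall_f1(gold, sys))   # (0.5, 0.333..., 0.4)
```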

  27-31. Evaluation
  • Standard measures: precision, recall, F-measure
    • Computed on entity types (CoNLL evaluation)
  • Classifiers vs. evaluation measures:
    • Classifiers optimize tag accuracy
    • Most common tag? O: most tokens aren't NEs
    • Evaluation measures focus on NEs
  • State of the art:
    • Standard tasks: PER, LOC: 0.92; ORG: 0.84

  32-35. Hybrid Approaches
  • Practical systems exploit lists, rules, learning, ...
  • Multi-pass:
    • Early passes: high precision, low recall
    • Later passes: noisier sequence learning
  • Hybrid system (see the sketch below):
    • High-precision rules tag unambiguous mentions
    • Use string matching to capture substring matches
    • Tag items from domain-specific name lists
    • Apply sequence labeler
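
A hand-wavy sketch of the first two passes; the name list, sentence, and function are all invented for illustration, and the final sequence-labeling pass is only stubbed out.

```python
KNOWN_ORGS = {"Acme Data Systems"}   # hypothetical high-precision name list

def hybrid_tag(text):
    mentions = []
    # Pass 1: high-precision list lookup tags unambiguous full names.
    for name in KNOWN_ORGS:
        idx = text.find(name)
        if idx == -1:
            continue
        mentions.append(("ORG", name, idx))
        # Pass 2: substring matching catches shortened later mentions.
        rest_start = idx + len(name)
        for part in name.split():
            j = text.find(part, rest_start)
            if j != -1:
                mentions.append(("ORG", part, j))
    # Pass 3 (not shown): run a trained sequence labeler on what's left.
    return mentions

print(hybrid_tag("Acme Data Systems reported that Acme will expand."))
# [('ORG', 'Acme Data Systems', 0), ('ORG', 'Acme', 32)]
```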

  36. Chunking

  37. What is Chunking?
  • Form of partial (shallow) parsing: extracts major syntactic units, but not full parse trees
  • Task: identify and classify flat, non-overlapping segments of a sentence
    • Basic non-recursive phrases corresponding to major POS categories
    • May ignore some categories, e.g. base NP chunking
  • Creates a simple bracketing:
    • [NP The morning flight] [PP from] [NP Denver] [VP has arrived]
    • [NP The morning flight] from [NP Denver] has arrived

  38. Example

  39. Example (figure: sentence bracketed into NP, VP, PP, NP chunks)

  40. Why Chunking?
  • Used when a full parse is unnecessary, or infeasible or impossible (when?)
  • Extraction of subcategorization frames: identify verb arguments
    • e.g. VP NP; VP NP NP; VP NP to NP
  • Information extraction: who did what to whom
  • Summarization: keep base information, remove modifiers
  • Information retrieval: restrict indexing to base NPs

  41. Processing Example
  • Tokenization: The morning flight from Denver has arrived
  • POS tagging: DT JJ N PREP NNP AUX V
  • Chunking: NP PP NP VP
  • Extraction: NP NP VP
  • etc.
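
The same pipeline can be run end to end with NLTK. A sketch: it assumes the tokenizer and tagger models have been downloaded, and the tagger's Penn Treebank tags may differ slightly from the slide's (e.g. NN rather than JJ for "morning").

```python
import nltk
# One-time setup (uncomment on first run; resource names vary by NLTK version):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

sent = "The morning flight from Denver has arrived"
tokens = nltk.word_tokenize(sent)   # tokenization
tagged = nltk.pos_tag(tokens)       # POS tagging (Penn Treebank tags)
chunker = nltk.RegexpParser(r"NP: {<DT>?<JJ|NN.*>*<NN.*>}")  # base NP chunking
print(chunker.parse(tagged))
```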

  42. Approaches
  • Finite-state approaches:
    • Grammatical rules in FSTs
    • Cascade to produce more complex structure
  • Machine learning:
    • Similar to POS tagging

  43. Finite-State Rule-Based Chunking
  • Hand-crafted rules model phrases; typically application-specific
  • Left-to-right longest match (Abney 1996):
    • Start at beginning of sentence
    • Find longest matching rule
  • Greedy approach; not guaranteed optimal

  44-45. Finite-State Rule-Based Chunking
  • Chunk rules cannot contain recursion:
    • NP → Det Nominal: okay
    • Nominal → Nominal PP: not okay
  • Examples:
    • NP → (Det) Noun* Noun
    • NP → Proper-Noun
    • VP → Verb
    • VP → Aux Verb
  • Consider: Time flies like an arrow
    • Is this what we want? (see the sketch below)
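
A pure-Python sketch of left-to-right longest-match chunking with exactly these four rules. The rule encoding and Penn-style tag sets are my own: "1" = required, "?" = optional, "*" = zero or more. The two runs show how the chunking changes with the tagging of "Time flies".

```python
DET, NOUN, PROPER = {"DT"}, {"NN", "NNS"}, {"NNP"}
VERB, AUX = {"VB", "VBZ", "VBP", "VBD"}, {"MD", "AUX"}

RULES = [
    ("NP", [(DET, "?"), (NOUN, "*"), (NOUN | PROPER, "1")]),  # NP -> (Det) Noun* Noun
    ("NP", [(PROPER, "1")]),                                  # NP -> Proper-Noun
    ("VP", [(AUX, "1"), (VERB, "1")]),                        # VP -> Aux Verb
    ("VP", [(VERB, "1")]),                                    # VP -> Verb
]

def match_from(tags, i, pattern, k):
    """Longest end index j such that tags[i:j] matches pattern[k:], else None."""
    if k == len(pattern):
        return i
    tagset, rep = pattern[k]
    if rep == "1":
        if i < len(tags) and tags[i] in tagset:
            return match_from(tags, i + 1, pattern, k + 1)
        return None
    if rep == "?":
        if i < len(tags) and tags[i] in tagset:
            end = match_from(tags, i + 1, pattern, k + 1)
            if end is not None:
                return end
        return match_from(tags, i, pattern, k + 1)
    # rep == "*": match greedily, then back off so later elements can match
    j = i
    while j < len(tags) and tags[j] in tagset:
        j += 1
    while j >= i:
        end = match_from(tags, j, pattern, k + 1)
        if end is not None:
            return end
        j -= 1
    return None

def chunk(words, tags):
    out, i = [], 0
    while i < len(words):
        best_end, best_label = i, None
        for label, pattern in RULES:       # longest match wins
            end = match_from(tags, i, pattern, 0)
            if end is not None and end > best_end:
                best_end, best_label = end, label
        if best_label:
            out.append("[%s %s]" % (best_label, " ".join(words[i:best_end])))
            i = best_end
        else:
            out.append(words[i])           # no rule matches: leave unchunked
            i += 1
    return " ".join(out)

words = "Time flies like an arrow".split()
print(chunk(words, ["NN", "VBZ", "IN", "DT", "NN"]))
# [NP Time] [VP flies] like [NP an arrow]
print(chunk(words, ["NN", "NNS", "VBP", "DT", "NN"]))  # 'time flies' reading
# [NP Time flies] [VP like] [NP an arrow]
```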

  46. Cascading FSTs
  • Richer partial parsing: pass output of one FST to the next
  • Approach:
    • First stage: base phrase chunking
    • Next stage: larger constituents (e.g. PPs, VPs)
    • Highest stage: sentences
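
NLTK's RegexpParser supports such cascades directly: a multi-stage grammar whose later stages match over the chunks built by earlier ones. A sketch; the grammar and the hand-assigned tags are illustrative.

```python
import nltk

grammar = r"""
  NP: {<DT>?<JJ|NN.*>*<NN.*>}   # stage 1: base noun phrases
  PP: {<IN><NP>}                # stage 2: PPs over chunked NPs
  VP: {<VB.*>+<NP|PP>*}         # stage 3: VPs over NPs and PPs
"""
cascade = nltk.RegexpParser(grammar)

tagged = [("The", "DT"), ("morning", "JJ"), ("flight", "NN"),
          ("from", "IN"), ("Denver", "NNP"),
          ("has", "VBZ"), ("arrived", "VBN")]
print(cascade.parse(tagged))
# (S (NP The/DT morning/JJ flight/NN) (PP from/IN (NP Denver/NNP))
#    (VP has/VBZ arrived/VBN))
```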

  47. Example

  48-50. Chunking by Classification
  • Model chunking as a task similar to POS tagging
  • Instance: tokens
  • Labels:
    • Simultaneously encode segmentation & identification
    • IOB (or BIO) tagging (also BIOE or BIOSE)
    • Segment: B(eginning), I(nternal), O(utside)
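
A small sketch of the encoding: the bracketing from the earlier chunking example rewritten as one B-/I-/O label per token.

```python
# Chunked sentence: [NP The morning flight] [PP from] [NP Denver] [VP has arrived]
chunks = [("NP", ["The", "morning", "flight"]), ("PP", ["from"]),
          ("NP", ["Denver"]), ("VP", ["has", "arrived"])]

def to_bio(chunks):
    """First token of a chunk gets B-label, the rest I-label;
    tokens outside any chunk (none in this example) would get O."""
    pairs = []
    for label, words in chunks:
        for k, word in enumerate(words):
            pairs.append((word, ("B-" if k == 0 else "I-") + label))
    return pairs

for word, tag in to_bio(chunks):
    print(f"{word}\t{tag}")
# The      B-NP
# morning  I-NP
# flight   I-NP
# from     B-PP
# ...
```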