PoS tagging and Chunking with HMM and CRF

Presentation Transcript

  1. PoS tagging and Chunking with HMM and CRF • Dept. of CSE, IIT Madras • Pranjal Awasthi, Delip Rao, Ravindran Balaraman

  2. Outline • Overview of the system • PoS tagging with HMM • Chunking with CRF • Results • Summary

  3. Overview of the system Aim: To leverage existing tools and algorithms (for English) for the NLPAI task Tools used: TnT tagger, TBL, MALLET

  4. Overview of the system • PoS tagging: TnT + TBL • Chunking: CRF (MALLET)

  5. The TnT tagger (Brants, 2000) • A second-order Hidden Markov Model based tagger • Used for English and other languages • On the NLPAI dataset, TnT alone gave F1 = 78.9 • Why TnT? • PoS tagging is a sequence labeling task • HMMs and CRFs are good candidates
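
To make "second-order HMM" concrete, here is a minimal sketch of trigram tagging with Viterbi decoding. It is not TnT itself: the add-one smoothing, the `<s>` start symbol, and the function names `train` and `viterbi` are assumptions made for illustration (TnT uses its own smoothing and unknown-word handling).

```python
import math
from collections import defaultdict

START = "<s>"  # assumed padding symbol for the first two tag positions

def train(tagged_sentences):
    """Count tag trigrams and (tag, word) emissions from [(word, tag), ...] sentences."""
    tri, ctx, emit, tag_count = (defaultdict(int) for _ in range(4))
    vocab = set()
    for sent in tagged_sentences:
        t1, t2 = START, START
        for word, tag in sent:
            tri[(t1, t2, tag)] += 1   # trigram count
            ctx[(t1, t2)] += 1        # count of the two-tag context
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            vocab.add(word)
            t1, t2 = t2, tag
    return tri, ctx, emit, tag_count, vocab

def viterbi(words, model):
    """Return the most probable tag sequence under the trigram model."""
    tri, ctx, emit, tag_count, vocab = model
    tags = sorted(tag_count)
    T, V = len(tags), len(vocab) + 1  # +1 reserves mass for unknown words

    def log_trans(t1, t2, t3):  # add-one smoothed P(t3 | t1, t2)
        return math.log((tri.get((t1, t2, t3), 0) + 1) / (ctx.get((t1, t2), 0) + T))

    def log_emit(tag, word):    # add-one smoothed P(word | tag)
        return math.log((emit.get((tag, word), 0) + 1) / (tag_count[tag] + V))

    # A state is the pair (previous tag, current tag); best holds its top score.
    best = {(START, START): 0.0}
    backptrs = []
    for word in words:
        new_best, back = {}, {}
        for (t1, t2), score in best.items():
            for t3 in tags:
                s = score + log_trans(t1, t2, t3) + log_emit(t3, word)
                if (t2, t3) not in new_best or s > new_best[(t2, t3)]:
                    new_best[(t2, t3)] = s
                    back[(t2, t3)] = t1
        best = new_best
        backptrs.append(back)

    # Follow back-pointers from the best final state.
    state = max(best, key=best.get)
    out = []
    for back in reversed(backptrs):
        out.append(state[1])
        state = (back[state], state[0])
    return out[::-1]
```

Because each state is a pair of tags, decoding cost grows with the cube of the tag-set size per word, which is still manageable for a tag set of a few dozen tags.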

  6. Poor performance of CRFs in PoS tagging • For the NLPAI dataset, F1 = 69.4 • Features used: w_{i-1}, w_{i-1}w_i, w_{i+1}, w_iw_{i+1} • Linear-chain CRF was used (MALLET) • Reasons for poor performance • Large number of PoS tags (26) compared to chunking • Selection of features • Type of CRF?
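
The four context features listed above can be sketched as follows. The feature-name prefixes, the `<s>` and `</s>` boundary symbols, and both function names are assumptions for illustration. The output layout (whitespace-separated features with the label last, one token per line) is what we understand MALLET's SimpleTagger to read, but check it against the MALLET documentation before relying on it.

```python
def token_features(words, i):
    """Features for position i: previous word, previous+current bigram,
    next word, and current+next bigram."""
    prev_w = words[i - 1] if i > 0 else "<s>"                 # assumed boundary symbol
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"   # assumed boundary symbol
    w = words[i]
    return [
        f"prev={prev_w}",
        f"prev_cur={prev_w}_{w}",
        f"next={next_w}",
        f"cur_next={w}_{next_w}",
    ]

def to_tagger_lines(words, labels):
    """One line per token: features first, label last (assumed input layout)."""
    return [" ".join(token_features(words, i) + [labels[i]])
            for i in range(len(words))]
```

Note that this feature set never includes the current word w_i on its own, only inside bigrams, which may be part of the "selection of features" concern raised on the slide.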

  7. Transformation Based Learning (Brill, 1995) • Added as a post processing step to “correct” TnT output • Idea: • Derive correction rules during training based on observing what has gone wrong • Apply these rules for testing

  8. Transformation Based Learning (contd …) • Use of TBL improved F1 by 1% • TBL is sensitive to the templates used • Possible improvements on template selection • Training time can be long unless indexing is used
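
The TBL idea of deriving correction rules from observed errors can be sketched with a single, assumed rule template: "change tag A to B when the previous tag is C". The template, the function names, and the greedy stopping rule are illustrative assumptions, not the authors' implementation. Each round re-scores every candidate over the whole corpus, which is the naive, unindexed approach whose training cost the slide warns about.

```python
def apply_rule(tags, rule):
    """Apply (from_tag, to_tag, prev_tag) to one sentence's tag list."""
    frm, to, prev = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # Context is read from the input tags, so changes do not cascade.
        if tags[i] == frm and tags[i - 1] == prev:
            out[i] = to
    return out

def count_errors(tagged, gold):
    """Number of token positions where the tags disagree with the gold tags."""
    return sum(t != g for ts, gs in zip(tagged, gold) for t, g in zip(ts, gs))

def learn_rules(predicted, gold, max_rules=10):
    """Greedily pick the rule with the largest net error reduction each round."""
    current = [list(s) for s in predicted]
    rules = []
    for _ in range(max_rules):
        # Candidate rules are instantiated only at positions that are wrong.
        candidates = {(cur[i], ref[i], cur[i - 1])
                      for cur, ref in zip(current, gold)
                      for i in range(1, len(cur)) if cur[i] != ref[i]}
        base = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = base - count_errors([apply_rule(s, rule) for s in current], gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule helps any more
            break
        rules.append(best_rule)
        current = [apply_rule(s, best_rule) for s in current]
    return rules
```

At test time the learned rules are applied in order to the base tagger's output, for example `for r in rules: tags = apply_rule(tags, r)`.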

  9. Summary of PoS tagging Results

  10. Chunking with CRF • Based on (Sha & Pereira, 2003) • Using SimpleTagger provided with MALLET • Chunking accuracies
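
Chunking is usually scored at the chunk level rather than per token. The sketch below computes chunk-level F1 for one sentence, assuming labels in a B-X / I-X / O scheme; the slides do not state which label scheme or scorer was used, so both the scheme and the function names are assumptions.

```python
def chunk_spans(labels):
    """Return the set of (start, end, type) chunks encoded by B-/I-/O labels."""
    spans, start, ctype = set(), None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, lab in enumerate(list(labels) + ["O"]):
        continues = lab.startswith("I-") and lab[2:] == ctype
        if not continues:
            if ctype is not None:
                spans.add((start, i, ctype))  # end index is exclusive
            if lab == "O":
                start, ctype = None, None
            else:
                start, ctype = i, lab[2:]     # B-X, or a stray I-X, opens a chunk
    return spans

def chunk_f1(gold, pred):
    """F1 over exact-match chunks (same boundaries and same type)."""
    g, p = chunk_spans(gold), chunk_spans(pred)
    correct = len(g & p)
    if correct == 0:
        return 0.0
    precision, recall = correct / len(p), correct / len(g)
    return 2 * precision * recall / (precision + recall)
```

For a whole corpus, it is safer to collect spans per sentence and sum the counts before computing precision and recall, so that chunks cannot merge across sentence boundaries.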

  11. Summary • Demonstrated the use of off-the-shelf software for tagging and chunking • Only code written: TBL + glue scripts • Overall PoS F1 = 80.74 and Chunk F1 = 79.58 • Have we “hit the wall” with pure ML-based tools? • Not sure yet!

  12. Thanks!