
Shock

Shock: Progress & Direction. MetaMap: tokenized words for Mohammed, which enable him to test his new models for the pattern matcher; Mallet training data for Laura, which enable her to work on creating a better version of MetaMap.

dextra

Presentation Transcript


  1. Shock: Progress & Direction

  2. MetaMap
     • Tokenized words for Mohammed
       • Enables him to test his new models for the pattern matcher.
     • Mallet training data for Laura
       • Enables her to work on creating a better version of MetaMap.
     • MetaMap currently has many concept annotation issues because the dictionary it uses is so large. Concepts are frequently tagged incorrectly.

  3. Mishaps & Solutions
     • Mapping concepts to the phrases
       • Maddening XML schema
       • Makes it difficult to understand how the words and their POS tags in each phrase map to the concepts.
       • Created a model of the utterances, phrases and concepts to solve this problem.
     • Used a DOM parser at first. It took about 15 minutes per XML file, 18 hours in total.
     • Replaced it with a SAX parser, which sped up progress.
     • Lesson learned: do not use a DOM parser for a large document.

  4. Next Task: Exploration of New Methods for Extracting Named Entities
     • The XConc Suite
       • Corpus developer and annotator
     • Explore XConc as a MetaMap replacement for extracting named entities using event-based annotation.
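
The DOM-to-SAX switch described in slide 3 can be illustrated with a minimal sketch. It assumes a simplified, hypothetical XML layout (`utterance`, `phrase` with a `text` attribute, and `concept` elements); the real MetaMap schema is more involved and is not reproduced here. The handler and function names are also illustrative. A DOM parser builds the whole document tree before any query can run, whereas a SAX parser streams events and lets the handler keep only the phrase-to-concept pairs it needs, which is one common reason it copes better with large files.

```python
import io
import xml.sax

# Hypothetical, simplified input. Element and attribute names are
# assumptions for illustration, not the actual MetaMap XML schema.
SAMPLE = b"""<utterances>
  <utterance>
    <phrase text="septic shock">
      <concept>Shock, Septic</concept>
    </phrase>
    <phrase text="in the patient">
      <concept>Patients</concept>
    </phrase>
  </utterance>
</utterances>"""


class PhraseConceptHandler(xml.sax.ContentHandler):
    """Collects (phrase text, concept name) pairs while streaming."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._phrase_text = None   # text of the phrase currently open
        self._in_concept = False   # True while inside a <concept> element
        self._buffer = []          # character chunks of the current concept

    def startElement(self, name, attrs):
        if name == "phrase":
            self._phrase_text = attrs.get("text", "")
        elif name == "concept":
            self._in_concept = True
            self._buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate.
        if self._in_concept:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "concept":
            concept = "".join(self._buffer).strip()
            self.pairs.append((self._phrase_text, concept))
            self._in_concept = False
        elif name == "phrase":
            self._phrase_text = None


def extract_pairs(source):
    """Stream-parse a file name or binary file object and return the pairs."""
    handler = PhraseConceptHandler()
    xml.sax.parse(source, handler)
    return handler.pairs


if __name__ == "__main__":
    for phrase, concept in extract_pairs(io.BytesIO(SAMPLE)):
        print(phrase, "->", concept)
```

Because the handler only holds the currently open phrase and the collected pairs, its working state does not grow with the depth or size of the document tree, unlike a DOM representation of the same file.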
