
KMi

Integration of Information Extraction with an Ontology

Knowledge Media Institute

M. Vargas-Vera, J. Domingue, Y. Kalfoglou, E. Motta and S. Buckingham Shum

Introduction
  • Ontology -> Information Extractor
  • English text (NLP)
  • Group of tools in their IE system:
    • KMi Ontology
    • From UMass:
      • Marmot
      • Crystal
      • Badger
    • OCML preprocessor
Presentation Layout
  • Background on tool origins and area of work
  • Description of tool integration
  • Coping with ambiguity
  • Description of output
  • Population of Ontology
  • Future Work
UMass: University of Massachusetts Amherst
  • Marmot, Crystal, Badger
    • Classifies text by recognizing extraction patterns and semantic features associated with slots in predefined frames.
Testing Area: KMi Planet
  • Web-based news server
    • Story Library
      • Collections of news stories and postings
    • Ontology Library
      • Ontologies stored for use in extracting information from the story library.
      • Uses OCML

myPlanet

myPlanet uses cue phrases defined as “research areas” to query KMi Planet through the ontology library and the information extraction tools we’re about to talk about.

The Ontology Library
  • 40 different types of events or activities that can be described by the ontology library.

Event type 3: demonstration-of-technology

Each slot is listed as: slot name (type) (example value).
  • technology-being-demonstrated (technology) (Info Extraction)
  • has-duration (duration) (30 min)
  • start-time (time-point) (3:30pm)
  • end-time (time-point) (4pm)
  • has-location (a place) (room 120 TMCB BYU campus)
  • other-agents-involved (list of person(s)) (Dr. Embley)
  • main-agent (list of person(s)) (Brian Goodrich)
  • location-at-start (a place) (room 120 TMCB BYU campus)
  • location-at-end (a place) (room 120 TMCB BYU campus)
  • medium-used (equipment) (multimedia projector, ppt)
  • subject-of-the-demo (title) (Integration of Information Extraction with an Ontology)
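
To make the frame idea concrete, here is a minimal sketch of the demonstration-of-technology event as a Python data structure. The class name, field names and the `unfilled_slots` helper are assumptions made for illustration; the slides only list the slots, and the actual ontology is written in OCML, not Python.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DemonstrationOfTechnology:
    """Hypothetical Python mirror of the event-type-3 frame from the slide."""
    technology_being_demonstrated: str
    main_agent: List[str]
    other_agents_involved: List[str] = field(default_factory=list)
    has_duration: Optional[str] = None
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    has_location: Optional[str] = None
    location_at_start: Optional[str] = None
    location_at_end: Optional[str] = None
    medium_used: List[str] = field(default_factory=list)
    subject_of_the_demo: Optional[str] = None

    def unfilled_slots(self) -> List[str]:
        # A slot counts as unfilled when it is None or an empty list.
        return [name for name, value in vars(self).items() if value in (None, [])]


# Fill some slots with the example values shown on the slide.
demo = DemonstrationOfTechnology(
    technology_being_demonstrated="Info Extraction",
    main_agent=["Brian Goodrich"],
    other_agents_involved=["Dr. Embley"],
    has_duration="30 min",
    start_time="3:30pm",
    end_time="4pm",
    has_location="room 120 TMCB BYU campus",
)
print(demo.unfilled_slots())
```

An extractor working against such a frame would try to fill as many slots as the story supports; the unfilled ones are the gaps the story did not cover.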

Marmot
  • Natural Language Processor
    • Segments sentences into noun, verb, and prepositional phrases

“John Domingue Wed, 15 Oct 1997.

David Brown, University for Industry visits the OU.”

  • <ex> 2 1
  • SUBJ(1): DAVID BROWN %COMMA% UNIVERSITY
  • PP (2): FOR INDUSTRY
  • VB (3): VISITS
  • OBJ1(4): THE OU
  • PUNC(5): %PERIOD%
  • </ex>
  • <ex> 1 1
  • SUBJ(1): JOHN DOMINGUE
  • ADVP(2): @WED_%COMMA%_15_OCT_1997@
  • PUNC(3): %PERIOD%
  • </ex>
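
The segmented output above is regular enough to read back into a program. The following is a rough sketch, assuming the format is exactly as shown on the slide (an `<ex>` header line, one `LABEL(n): TEXT` line per segment, then `</ex>`). The function name `parse_marmot` and the regular expression are illustrative and are not part of Marmot.

```python
import re
from typing import List, Tuple

# Sample text copied from the slide (bullets removed).
MARMOT_OUTPUT = """\
<ex> 2 1
SUBJ(1): DAVID BROWN %COMMA% UNIVERSITY
PP (2): FOR INDUSTRY
VB (3): VISITS
OBJ1(4): THE OU
PUNC(5): %PERIOD%
</ex>
<ex> 1 1
SUBJ(1): JOHN DOMINGUE
ADVP(2): @WED_%COMMA%_15_OCT_1997@
PUNC(3): %PERIOD%
</ex>
"""

# Matches lines such as "PP (2): FOR INDUSTRY" or "OBJ1(4): THE OU".
SEGMENT = re.compile(r"^(\w+)\s*\((\d+)\):\s*(.*)$")

Segment = Tuple[str, int, str]  # (label, position, text)


def parse_marmot(text: str) -> List[List[Segment]]:
    """Group segment lines into one list per <ex> ... </ex> block."""
    sentences: List[List[Segment]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("<ex>"):
            current = []
        elif line == "</ex>":
            if current is not None:
                sentences.append(current)
            current = None
        elif current is not None:
            match = SEGMENT.match(line)
            if match:
                current.append((match.group(1), int(match.group(2)), match.group(3)))
    return sentences


sentences = parse_marmot(MARMOT_OUTPUT)
for sentence in sentences:
    print(sentence)
```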
Crystal
  • Dictionary Induction Tool
    • Uses keywords to annotate text with semantic tags.
      • Visitor (<VI> David Brown <VI>)
      • Place (<PL> the OU <PL>)
    • Specific-to-general, data-driven search
      • Relaxes constraints on initial definitions until it finds the most specific definition that covers all instances of the word in the text.
      • Retains results for future use
    • Tested on over 300 stories, 100% precision and recall
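
The specific-to-general idea can be illustrated with a toy sketch: start from a fully specific definition built from one instance, then drop every constraint that another instance does not satisfy. What remains is the most specific definition that still covers all instances. This is only an illustration of the principle and is not Crystal's actual algorithm; the feature names (`subj_class`, `verb`, and so on) are invented for the example.

```python
from typing import Dict, List

Constraints = Dict[str, str]


def generalise(instances: List[Constraints]) -> Constraints:
    """Return the constraints shared by every instance (toy generalisation)."""
    if not instances:
        raise ValueError("at least one instance is required")
    # Start from the most specific definition: all constraints of the first instance.
    definition = dict(instances[0])
    for instance in instances[1:]:
        # Relax (drop) any constraint that this instance does not satisfy.
        definition = {k: v for k, v in definition.items() if instance.get(k) == v}
    return definition


# Hypothetical training instances for a "visit" pattern.
instances = [
    {"subj_class": "person", "verb": "visits", "obj_class": "organisation", "subj_head": "brown"},
    {"subj_class": "person", "verb": "visits", "obj_class": "organisation", "subj_head": "smith"},
]
definition = generalise(instances)
print(definition)
```

Here the `subj_head` constraint is dropped because the two instances disagree on it, while the class and verb constraints survive.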
Badger

(fairly certain whoever wrote this section did not speak English as a first language)

Matches sentences from the text against concept nodes passed from Crystal. Selects the best match: the concept node with the largest number of matching features.

Can remove irrelevant sentences from problem set.
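
The matching rule described above (pick the concept node with the most matching features, and discard sentences that match nothing) can be sketched as follows. The representation of features as strings and the concept node names are assumptions for illustration; the slides do not show Badger's internal format.

```python
from typing import Dict, Optional, Set, Tuple


def best_concept_node(
    sentence_features: Set[str],
    concept_nodes: Dict[str, Set[str]],
) -> Optional[Tuple[str, int]]:
    """Return (node name, score) for the node sharing the most features, or None."""
    best: Optional[Tuple[str, int]] = None
    for name, node_features in concept_nodes.items():
        score = len(sentence_features & node_features)
        # Ignore nodes with no overlap; keep the first node with the highest score.
        if score > 0 and (best is None or score > best[1]):
            best = (name, score)
    return best


# Hypothetical concept nodes and sentence features.
concept_nodes = {
    "visit-event": {"subj:person", "verb:visits", "obj:organisation"},
    "award-event": {"subj:organisation", "verb:awards", "obj:person"},
}
visit = best_concept_node({"subj:person", "verb:visits", "obj:organisation"}, concept_nodes)
irrelevant = best_concept_node({"subj:weather", "verb:rains"}, concept_nodes)
print(visit, irrelevant)
```

A sentence for which the function returns `None` matches no concept node and can be dropped from the problem set.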


http://rockape.qgl.org/crap/badger.swf

Coping with Ambiguity

Query list of institutions

Return list of institutions – no match

Query list of projects

Return list of projects – match

No discussion of whether this was automatically done by the extractor or manually by the users.
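
The query sequence above amounts to trying each candidate type in turn until one of the known lists contains the name. A minimal sketch, assuming the lists are available as in-memory sets; the function name, type labels and list contents are invented for illustration, and in the real system the lists would come from the ontology library.

```python
from typing import Optional, Sequence, Set, Tuple


def resolve_type(name: str, lookups: Sequence[Tuple[str, Set[str]]]) -> Optional[str]:
    """Return the first type whose known-name list contains `name`, else None."""
    wanted = name.casefold()
    for type_name, known_names in lookups:
        if wanted in {known.casefold() for known in known_names}:
            return type_name
    return None


# Hypothetical lists, queried in the same order as on the slide.
lookups = [
    ("institution", {"The Open University", "University for Industry"}),
    ("project", {"Example Project"}),
]
resolved = resolve_type("example project", lookups)
unknown = resolve_type("Unknown Name", lookups)
print(resolved, unknown)
```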

OCML Code Translator (Operational Conceptual Modeling Language)
  • Tokenise Badger output, find corresponding CN definitions and extract all the objects found in the story
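
The Badger output format is not shown in the slides, so the sketch below starts one step later, from slot values that have already been extracted, and only illustrates the final step of emitting an OCML-style instance definition. The `def-instance` layout, the class name `visit-event` and the slot names are assumptions for illustration and should be checked against the actual OCML ontology.

```python
from typing import Dict


def to_ocml_instance(instance_name: str, class_name: str, slots: Dict[str, str]) -> str:
    """Render extracted slot values as an OCML-style def-instance form (assumed layout)."""
    pairs = " ".join(f"({slot} {value})" for slot, value in slots.items())
    return f"(def-instance {instance_name} {class_name} ({pairs}))"


# Hypothetical objects extracted from the example story.
ocml_text = to_ocml_instance(
    "visit-of-david-brown",
    "visit-event",
    {"visitor": "david-brown", "place": "the-ou"},
)
print(ocml_text)
```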
Ontology Maintenance
  • Uses Badger (lexicon) and Crystal (concept) output to automatically update the ontology library whenever a new story is added to the story library
  • Some stories cannot be used for an automatic update:
    • There is not enough information in the story
    • There is no current template that matches the sentence concepts
Conclusion
  • IE system created using Marmot, Crystal, Badger and the OCML translator.
  • Obtained good results on KMi stories.

Assessment

Sporadic periods of quality technical writing, interspersed with nearly impenetrable English

A borrowing of tools, translated to OCML and ported for KMi

Future Work
  • Deriving the type of an object when it does not match a predefined template.
  • Automatic creation of new classes and subclasses.
  • Using this IE tool in other domains (need new training data?)
  • Trying out a new Machine Learning algorithm in Crystal and comparing performance.
  • Using the IE tool with hypertext.
  • Saving Badger’s output in XML
  • Creating a more visual GUI for the ontologies.