
Automating Generation of Textual Class Definitions from OWL to English




  1. Automating Generation of Textual Class Definitions from OWL to English Robert Stevens, James Malone, Sandra Williams, Richard Power

  2. Summary
     • Motivation
     • Use Case
     • Methods and Description Generator
     • Results
     • Evaluation
     • Open Questions (still)

  3. Motivation
     • Textual definitions are
       • a cornerstone of good practice in ontology delivery
       • a requirement of the OBO process
       • hard work to produce
     • Logical definitions
       • make meaning explicit to the computer
       • help with maintenance of the ontology’s structure, querying, and so on
       • are also hard to produce, and harder to understand
     • The information in one form should reflect the information in the other
     • Textual and logical definitions need to be kept synchronised
     • Aim: produce fluent textual definitions from logical definitions/descriptions in OWL

  4. OWL Smackdown: Computer vs Human

  5. Our Hypotheses
     • Text = humans
     • Logical = computers (and future human-computer hybrids)
     • Textual definitions ≈ logical definitions
     • Textual definitions tend to be more lossy than logical ones (cardinalities are often dropped, specific roles not mentioned, etc.)
     • Logical definitions are often more explicit than natural language and should therefore contain sufficient content to produce a textual definition

  6. EFO Use Case (www.ebi.ac.uk/efo)
     • The Experimental Factor Ontology (EFO) is an application ontology which consumes domain ontologies to satisfy specific application-focused use cases
     • Primarily gene expression data from ArrayExpress @ EBI

  7. EFO @ Gene Expression Atlas (www.ebi.ac.uk/gxa)

  8. Related Work
     • Generating descriptions from ontologies is often called ‘ontology verbalisation’
     • A number of approaches are concerned only with ABox verbalisation (Hielkema, 2009; Galanis and Androutsopoulos, 2007)
     • Others produce only separate sentences, one for each OWL axiom (Kaljurand, 2007)
     • Our approach has much in common with these but differs in two ways:
       • only a subset of OWL is considered (the simple description logic EL++)
       • instead of realising axioms in isolation, we apply rules for organisation and aggregation to give a more natural feel

  9. Method Overview
     • An OWL ontology is just a “pile of axioms”
     • We can produce individual sentences based on a grammar that guides the transformation from OWL to English (or another natural language)
     • Sentences need to be grouped (axioms with the same subject are grouped together)
     • Axioms need to be aggregated (axioms with the same relationship are collapsed together)
     • Once grouped and aggregated, a paragraph of text can be produced sentence by sentence
     Example:
       hasPart some leg
       hasPart some body
       hasPart some head
       => Has parts leg, body and head
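
The aggregation example on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation (the slides describe a Prolog pipeline): the `(property, filler)` tuple representation, the `TEMPLATES` table, and the function names are all hypothetical.

```python
from collections import OrderedDict

# Hypothetical phrase templates per property: (singular form, plural form).
TEMPLATES = {"hasPart": ("Has part {}", "Has parts {}")}


def aggregate_by_property(restrictions):
    """Collapse (property, filler) pairs that share a property into one entry."""
    grouped = OrderedDict()
    for prop, filler in restrictions:
        grouped.setdefault(prop, []).append(filler)
    return list(grouped.items())


def join_list(items):
    """Join items as 'a', 'a and b', or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def realise(restrictions):
    """Produce one phrase per property after aggregation."""
    phrases = []
    for prop, fillers in aggregate_by_property(restrictions):
        singular, plural = TEMPLATES[prop]
        template = singular if len(fillers) == 1 else plural
        phrases.append(template.format(join_list(fillers)))
    return phrases


axioms = [("hasPart", "leg"), ("hasPart", "body"), ("hasPart", "head")]
print(realise(axioms))  # ['Has parts leg, body and head']
```

Three separate restrictions become a single phrase, which is the effect the slide describes as collapsing axioms with the same relationship.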

  10. Processing stages
     • Transcode OWL/XML to Prolog
     • Construct a lexicon for atomic entities (next slide)
     • Group axioms by atomic entity
     • Aggregate axioms with similar structure
     • Generate sentences from aggregated axioms
     Example:
       class(animal).
       subClassOf(class(cat), class(animal)).
       subClassOf(class(dog), class(animal)).
       =>
       class(animal).
       subClassOf([class(cat), class(dog)], class(animal)).
       =>
       ANIMAL. A cat and a dog are both kinds of animals.
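
The two transformations in the example (facts to aggregated facts, then aggregated facts to a sentence) might look roughly like this in Python. It is a sketch rather than the authors' Prolog code: the `(subclass, superclass)` tuples stand in for `subClassOf/2` facts, and `LEXICON`, `article`, `aggregate_subclasses` and `verbalise` are names invented for the illustration.

```python
from collections import defaultdict

# Hypothetical lexicon: class name -> (singular noun, plural noun).
LEXICON = {
    "animal": ("animal", "animals"),
    "cat": ("cat", "cats"),
    "dog": ("dog", "dogs"),
}


def article(noun):
    """Naive indefinite article choice based on the first letter."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def aggregate_subclasses(axioms):
    """Group (subclass, superclass) pairs by their shared superclass."""
    by_super = defaultdict(list)
    for sub, sup in axioms:
        by_super[sup].append(sub)
    return dict(by_super)


def verbalise(sup, subs):
    """Generate a headed sentence for one aggregated subclass axiom."""
    singular, plural = LEXICON[sup]
    nps = [f"{article(LEXICON[s][0])} {LEXICON[s][0]}" for s in subs]
    if len(nps) == 1:
        body = f"{nps[0]} is a kind of {singular}."
    elif len(nps) == 2:
        body = f"{nps[0]} and {nps[1]} are both kinds of {plural}."
    else:
        listed = ", ".join(nps[:-1]) + f", and {nps[-1]}"
        body = f"the following are kinds of {plural}: {listed}."
    return f"{sup.upper()}. {body[0].upper()}{body[1:]}"


facts = [("cat", "animal"), ("dog", "animal")]
groups = aggregate_subclasses(facts)
print(verbalise("animal", groups["animal"]))
# ANIMAL. A cat and a dog are both kinds of animals.
```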

  11. Description Generator
     • Input: OWL/XML ontology
     • Output: Text describing atomic entities
       • generation from label/URL
     • It is assumed that the syntax of each phrase will be severely constrained as follows:
       • individuals are expressed by proper names
       • classes by common nouns (with singular and plural forms)
       • properties by transitive verbs (simple or compound) with slots for a subject and an object
     Example:
       ANIMAL. The following are kinds of animals: a cat, a duck, a giraffe, a person, a sheep, and a tiger. An animal eats a thing. If X has as pet Y then necessarily Y is an animal.
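
As a rough illustration of the constrained lexicon described on this slide, the Python sketch below derives a phrase from an identifier when no label is available, then builds noun and verb entries and realises one existential restriction. The IRI, the dictionary layout, the naive "+s" plural, and every function name here are assumptions made for the example; the slides do not specify how the real generator does this.

```python
import re


def name_from_iri(iri):
    """Fallback phrase from an IRI: last segment, camelCase and '_' split."""
    fragment = re.split(r"[#/]", iri.rstrip("/"))[-1]
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", fragment).replace("_", " ")
    return spaced.lower()


def noun_entry(label):
    """Class entry: common noun with singular and (naive) plural forms."""
    return {"pos": "noun", "singular": label, "plural": label + "s"}


def verb_entry(label):
    """Property entry: transitive verb phrase with subject and object slots."""
    return {"pos": "verb", "phrase": label}


def article(noun):
    """Naive indefinite article choice based on the first letter."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def some_sentence(subject, prop, obj):
    """Realise 'subject SubClassOf prop some obj' as one English sentence."""
    subj_np = f"{article(subject['singular'])} {subject['singular']}"
    obj_np = f"{article(obj['singular'])} {obj['singular']}"
    sentence = f"{subj_np} {prop['phrase']} {obj_np}."
    return sentence[0].upper() + sentence[1:]


print(name_from_iri("http://example.org/onto#hasPet"))  # has pet
print(some_sentence(noun_entry("animal"), verb_entry("eats"), noun_entry("thing")))
# An animal eats a thing.
```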

  12. Results
     * axioms placed on subclasses

  13. Results
     • Online survey of ontology users at EBI
     • 10 of the 50 verbalisations were evaluated, chosen to cover the widest range of axioms
     [Figure labels: “Total”, “Judgement”]

  14. Findings
     • Discovery of a dodgy class:
       • the definition for Ara-C-resistant murine leukemia listed b117h and b140h as subclasses (types of it), implying that they were diseases rather than cell lines
     • Desire amongst this user group for simplicity of language: avoid ontological formality
       • e.g. bearer of
     • Especially property names for qualities
       • e.g. has as quality male
     • The initial verbalisation, which made the semantics clear, was not liked
     • Plural forms are occasionally an issue:
       lex(class(EFO_0000322), noun, 'cell line', 'cell lines').
       lex(class(EFO_0002095), noun, '22rv1', '22rv1s').
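
The plural-form problem can be reproduced with a naive pluraliser. The Python sketch below is purely illustrative: `naive_plural` is an assumed stand-in for whatever rule produced the lexicon entries, and the digit check in `needs_review` is a hypothetical heuristic for flagging identifier-like labels, not something proposed in the slides.

```python
def naive_plural(singular):
    """Append 's', as the lexicon entries on this slide appear to do."""
    return singular + "s"


def needs_review(singular):
    """Hypothetical heuristic: labels containing digits look like identifiers."""
    return any(ch.isdigit() for ch in singular)


for label in ["cell line", "22rv1"]:
    flag = " (flag for manual review)" if needs_review(label) else ""
    print(f"{label} -> {naive_plural(label)}{flag}")
```

Under this heuristic "cell line" passes through unflagged, while "22rv1" is flagged because its generated plural "22rv1s" is unlikely to be a natural form.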

  15. Conclusion
     • Initial results were largely well received and considered useful in most cases
     • Discovery of an incorrect class definition demonstrates potential as a tool for class validation
     • The preference for text definitions was for ‘clear and simple’ over ‘precise and complex’
     • Dependent entities could become adjectival forms of the independent entities in which they inhere (cell has quality female becomes female cell)
     • Formal relations/class labels reduce understanding and should be brought closer to domain language
     • Many ontologies are not amenable to text mining; this is an important use case neglected by most
     • Definitions are now being imported into EFO
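
The adjectival-form idea (cell has quality female becomes female cell) could be prototyped along the following lines. This Python sketch is an assumption-laden illustration of the proposal, not an existing feature of the generator: the `(property, filler)` tuples, the `"has quality"` property name, and `fold_qualities` are all hypothetical.

```python
def fold_qualities(noun, restrictions, quality_property="has quality"):
    """Turn quality restrictions into premodifiers of the noun.

    Returns the new noun phrase and the restrictions left to verbalise.
    """
    adjectives = [f for p, f in restrictions if p == quality_property]
    remaining = [(p, f) for p, f in restrictions if p != quality_property]
    return " ".join(adjectives + [noun]), remaining


phrase, rest = fold_qualities("cell", [("has quality", "female"), ("part of", "organism")])
print(phrase)  # female cell
print(rest)    # [('part of', 'organism')]
```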

  16. Next Steps
     • Systematic study of acceptable wordings
     • Different wording styles for different users
     • Adjectival forms for qualities etc.; the role of an upper-level ontology
     • Moving beyond EL++
     • Parsing for OBO

  17. Next Steps: Round Tripping

  18. Open Questions
     • Should textual descriptions ≡ logical descriptions?
     • Are discrepancies acceptable?

  19. Acknowledgements
     • Sandra Williams, Richard Power and Robert Stevens are funded by the SWAT project (EPSRC grants EP/G033579/1 and EP/G032459/1)
     • James Malone is funded by EMBL and EMERALD (project number LSHG-CT-2006-037686)
     • We would like to thank the members of the EBI’s ontology interest group, the functional genomics group and Dr Helen Parkinson for comments and survey participation
