User-Friendly Ontology Authoring Using a Controlled Language
Valentin Tablan, Tamara Polajnar, Hamish Cunningham, Kalina Bontcheva
NLP Research Group, University of Sheffield
Regent Court, 211 Portobello Street, Sheffield, S1 4DP, UK
http://nlp.shef.ac.uk, http://gate.ac.uk

Motivation
• Ontologies are starting to be used in many NLP applications for:
  • encoding system ‘knowledge’;
  • storing results.
• Current standards (RDF-S, OWL) are complex:
  • Large number of features supported;
  • Steep learning curve;
  • Training required;
  • Authoring tools (e.g. Protégé) are complicated and difficult for non-specialists to use.

Motivation (continued)
• Ontological requirements for NLP applications are usually simple:
  • Taxonomy of classes;
  • Hierarchy of properties;
  • Instances.
• Graphical tools are difficult to embed in text-based pipelines (e.g. wikis, existing NLP apps, other web set-ups).

Controlled Languages
• Good compromise between structured data and natural language:
  • Feels [almost] natural to humans;
  • Can be ‘understood’ by machines.
• People find it easy to ‘put into words’ ontological information (which they may find difficult to do with a specialised tool).
• Used before for automating translation (e.g. Caterpillar and Boeing).

Round-Trip Authoring
• Very little or no training necessary (learning by example).
• Can be used to extend existing ontologies or create new ones.
• Limited number of syntactical constructs.
• Open vocabulary.
[Diagram labels: CLIE, CL Text, Generation]

An Example
There are pets and owners.
Cat is a type of pet.
Tabatha is a cat.
John is an owner.
Owners have pets.
Pets can have textual nickname.
John has Tabatha.
Tabatha has nickname with value "Tabby".

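To make the mapping from the example sentences to ontology statements concrete, here is a minimal Python sketch that turns each sentence into one or more triples. It is not the CLIE parser, which runs as a GATE pipeline; the regular expressions, the `norm` helper and the relation labels (`is-a`, `subclass-of`, `instance-of`, and so on) are assumptions made for illustration and cover only the eight sentences shown here.

```python
import re

def norm(word):
    """Lower-case and drop a trailing plural 's' (naive stand-in for morphology)."""
    w = word.strip().lower()
    return w[:-1] if w.endswith("s") else w

def parse(sentence):
    """Return a list of (subject, relation, object) tuples for one CL sentence.

    The relation labels are hypothetical, not RDF/OWL vocabulary.
    """
    s = sentence.strip()
    m = re.fullmatch(r"There are (.+)\.", s)
    if m:  # class declarations, e.g. "There are pets and owners."
        names = re.split(r",\s*|\s+and\s+", m.group(1))
        return [(norm(n), "is-a", "class") for n in names if n]
    m = re.fullmatch(r"(\w+) is a type of (\w+)\.", s)
    if m:  # subclass statement
        return [(norm(m.group(1)), "subclass-of", norm(m.group(2)))]
    m = re.fullmatch(r"(\w+) is an? (\w+)\.", s)
    if m:  # instance statement
        return [(m.group(1), "instance-of", norm(m.group(2)))]
    m = re.fullmatch(r'(\w+) has (\w+) with value "([^"]*)"\.', s)
    if m:  # datatype property value on an instance
        return [(m.group(1), m.group(2), m.group(3))]
    m = re.fullmatch(r"(\w+) can have textual (\w+)\.", s)
    if m:  # datatype property declared on a class
        return [(norm(m.group(1)), "has-datatype-property", m.group(2))]
    m = re.fullmatch(r"(\w+) have (\w+)\.", s)
    if m:  # property between two classes
        return [(norm(m.group(1)), "has-property", norm(m.group(2)))]
    m = re.fullmatch(r"(\w+) has (\w+)\.", s)
    if m:  # property between two instances
        return [(m.group(1), "has", m.group(2))]
    raise ValueError(f"Sentence not covered by this sketch: {sentence!r}")
```

For example, `parse("There are pets and owners.")` yields two class declarations, and `parse('Tabatha has nickname with value "Tabby".')` yields a single triple carrying the quoted string as its value.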
From Text to Ontologies
[Diagram labels: CL Text, Tokeniser, POS Tagger, Morph, Quote Finder, Key-phrase, NP Chunker, CLIE Parser]

Closing the Loop
Generating CL text from ontologies:
• Generate triples.
• Match triples to generation templates.
• Group similar triples.
• Generate sentences for each group of triples.

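The generation steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CLIE generator: the triples are assumed to have been read from the ontology already (the first step), the three template kinds and their wording are hypothetical, and the plural phrasings in particular are guesses at what a controlled-language sentence for several subjects might look like.

```python
from collections import defaultdict

def join_names(names):
    """Join names as 'a', 'a and b', or 'a, b and c'."""
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]

def article(noun):
    """Pick 'a' or 'an' from the first letter (naive rule)."""
    return "an" if noun[0].lower() in "aeiou" else "a"

def match_template(triple):
    """Match a triple to a (hypothetical) template kind, or None."""
    _, prop, obj = triple
    if prop == "rdf:type" and obj == "owl:Class":
        return "class"
    if prop == "rdfs:subClassOf":
        return "subclass"
    if prop == "rdf:type":
        return "instance"
    return None  # no template covers this triple

def render(kind, obj, subjects):
    """Generate one sentence for a group, choosing singular or plural wording."""
    o = obj.lower()
    if kind == "class":
        return "There are " + join_names([s.lower() + "s" for s in subjects]) + "."
    if kind == "subclass":
        if len(subjects) == 1:
            return f"{subjects[0]} is a type of {o}."
        return f"{join_names(subjects)} are types of {o}."  # assumed plural form
    if len(subjects) == 1:
        return f"{subjects[0]} is {article(o)} {o}."
    return f"{join_names(subjects)} are {o}s."  # assumed plural form

def generate(triples):
    """Match, group and render; triples sharing template and object form one group."""
    groups = defaultdict(list)
    for triple in triples:
        kind = match_template(triple)
        if kind is not None:
            groups[(kind, triple[2])].append(triple[0])
    return [render(kind, obj, subjects) for (kind, obj), subjects in groups.items()]
```

Grouping on the template kind plus the object is what lets two class declarations collapse into a single "There are pets and owners." sentence instead of two separate ones.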
<Pet, rdf:type, owl:Class>

<in>
  <triple id="t1">
    <property ns="rdf" name="type"/>
    <object ns="owl" name="Class"/>
  </triple>
</in>
<out>
  <singular>
    <phrase>There are <ref ref="t1.subject" number="plural"/>. </phrase>
  </singular>
  <plural>
    <phrase>There are <ref ref="t1.subject" number="plural"/>. </phrase>
  </plural>
</out>

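One way to read the template above is sketched below in Python using the standard `xml.etree.ElementTree` module. Several things here are assumptions rather than documented CLIE behaviour: the `<template>` wrapper element (added only so the fragment parses as a single document), the rule that `<singular>` is used for one matching triple and `<plural>` for several, the naive pluralisation by appending "s", and the helper names `matches`, `inflect` and `apply_template`.

```python
import xml.etree.ElementTree as ET

# The slide's template, wrapped in a hypothetical <template> root element.
TEMPLATE = """
<template>
  <in>
    <triple id="t1">
      <property ns="rdf" name="type"/>
      <object ns="owl" name="Class"/>
    </triple>
  </in>
  <out>
    <singular>
      <phrase>There are <ref ref="t1.subject" number="plural"/>. </phrase>
    </singular>
    <plural>
      <phrase>There are <ref ref="t1.subject" number="plural"/>. </phrase>
    </plural>
  </out>
</template>
"""

def matches(pattern, triple):
    """True if every part named in the <triple> pattern equals the triple's part."""
    for part, value in zip(("subject", "property", "object"), triple):
        node = pattern.find(part)
        # A part missing from the pattern acts as a wildcard.
        if node is not None and f'{node.get("ns")}:{node.get("name")}' != value:
            return False
    return True

def inflect(names, number):
    """Lower-case the names, pluralise naively if asked, and join them."""
    words = [n.lower() + "s" if number == "plural" else n.lower() for n in names]
    if len(words) == 1:
        return words[0]
    return ", ".join(words[:-1]) + " and " + words[-1]

def apply_template(template_xml, triples):
    """Render one sentence for all triples matching the template, or None."""
    root = ET.fromstring(template_xml)
    pattern = root.find("in/triple")
    matched = [t for t in triples if matches(pattern, t)]
    if not matched:
        return None
    form = "singular" if len(matched) == 1 else "plural"
    phrase = root.find(f"out/{form}/phrase")
    bindings = {pattern.get("id") + ".subject": [t[0] for t in matched]}
    text = phrase.text or ""
    for ref in phrase:  # each <ref> is replaced by its bound, inflected names
        text += inflect(bindings[ref.get("ref")], ref.get("number"))
        text += ref.tail or ""
    return text.strip()
```

Under these assumptions, applying the template to the triple `("Pet", "rdf:type", "owl:Class")` produces "There are pets.", and adding `("Owner", "rdf:type", "owl:Class")` produces "There are pets and owners."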
Conclusions
• Simple way of editing ontologies.
• Standards compliant (through GATE’s ontology support I/O).
• No training required.
• Embeddable in text-only applications.
• Language could be extended to:
  • Better cover OWL features;
  • Better cover natural ways of expression.

Thank you!
More information:
• http://gate.ac.uk
• http://nlp.shef.ac.uk