Learn about the OntoLex-Lemon Model, a versatile RDF ontology for multilingual linked data, emphasizing its design, history, and reuse of standards. Understand its key features such as semantics by reference, part-of-speech values, and extensibility.
Introduction to the OntoLex-Lemon Model
John P. McCrae (1), Thierry Declerck (2)
(1) Insight Centre for Data Analytics, National University of Ireland Galway
(2) Austrian Centre for Digital Humanities
Simple (!) RDF Document

    <http://dbpedia.org/resource/Paris> <http://dbpedia.org/ontology/populationTotal> "2229621"^^<http://www.w3.org/2001/XMLSchema#nonNegativeInteger> .
    <http://dbpedia.org/resource/Paris> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Location> .
    <http://dbpedia.org/resource/Paris> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.wikidata.org/entity/Q486972> .
Prefixes
A prefix declaration abbreviates long URIs: @prefix pre: <long> . pre:name => <long+name>

    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix dbp: <http://dbpedia.org/resource/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    dbp:Paris dbo:populationTotal "2229621"^^xsd:nonNegativeInteger .
    dbp:Paris rdf:type dbo:Location .
    dbp:Paris rdf:type <http://www.wikidata.org/entity/Q486972> .
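The expansion rule (pre:name => <long+name>) can be sketched in a few lines of Python. This is only an illustration, not how a Turtle parser is actually implemented; the helper name `expand` and the `PREFIXES` table are assumptions made for this example.

```python
# Illustrative sketch only: `expand` and PREFIXES are hypothetical names,
# mirroring the @prefix declarations from the slide.
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbp": "http://dbpedia.org/resource/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(name, prefixes=PREFIXES):
    """Return the full URI for a prefixed name (pre:name => long + name)."""
    # A name already written as <full-uri> only loses its angle brackets.
    if name.startswith("<") and name.endswith(">"):
        return name[1:-1]
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


print(expand("dbp:Paris"))
print(expand("dbo:populationTotal"))
```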
Continuations
Replace . with ; to repeat the subject, and with , to repeat the subject and predicate.

    dbp:Paris dbo:populationTotal "2229621"^^xsd:nonNegativeInteger .
    dbp:Paris rdf:type dbo:Location .
    dbp:Paris rdf:type <http://www.wikidata.org/entity/Q486972> .

becomes

    dbp:Paris dbo:populationTotal "2229621"^^xsd:nonNegativeInteger ; rdf:type dbo:Location , <http://www.wikidata.org/entity/Q486972> .

Or more nicely formatted:

    dbp:Paris dbo:populationTotal "2229621"^^xsd:nonNegativeInteger ;
              rdf:type dbo:Location ,
                       <http://www.wikidata.org/entity/Q486972> .
RDF Lists
[Diagram: ex:node has rdf:first "one" and rdf:rest pointing to an unnamed node (??); that node has rdf:first "two" and rdf:rest rdf:nil.]
Blank nodes
Some nodes do not have a known URI. We call these blank nodes; they are denoted with [ ] or _:id. A typical use is for lists:

    ex:node rdf:first "one" ; rdf:rest _:n1 .
    _:n1 rdf:first "two" ; rdf:rest rdf:nil .

or, with the blank node written inline:

    ex:node rdf:first "one" ; rdf:rest [ rdf:first "two" ; rdf:rest rdf:nil ] .

Turtle also supports an even more compact syntax for lists:

    ( "one" "two" )
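To make the rdf:first / rdf:rest chain concrete, here is a minimal Python sketch that encodes a list as triples and walks it back. Triples are plain tuples of strings; the function names and the `_:n1`-style blank node labels are assumptions for illustration, not part of any RDF library.

```python
# Sketch under stated assumptions: triples are (subject, predicate, object)
# tuples of strings, and blank nodes are labelled _:n1, _:n2, ...
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def list_to_triples(head, items):
    """Encode a non-empty Python list as an rdf:first/rdf:rest chain from `head`."""
    triples = []
    node = head
    for i, item in enumerate(items):
        is_last = i == len(items) - 1
        next_node = RDF_NIL if is_last else f"_:n{i + 1}"
        triples.append((node, RDF_FIRST, item))
        triples.append((node, RDF_REST, next_node))
        node = next_node
    return triples


def triples_to_list(head, triples):
    """Follow rdf:rest links from `head` until rdf:nil, collecting rdf:first values."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items = []
    node = head
    while node != RDF_NIL:
        items.append(first[node])
        node = rest[node]
    return items


triples = list_to_triples("ex:node", ['"one"', '"two"'])
for t in triples:
    print(t)
print(triples_to_list("ex:node", triples))
```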
URLs
http://www.example.com/path/to/file#identifier
- Protocol: http
- Domain: www.example.com
- Path: /path/to/file
- Fragment: identifier
Relative URLs
URLs may be resolved relative to the Base URL (e.g., the URL used to find the document):

    <http://www.example.com/path/to/file#identifier>
    <//www.example.com/path/to/file#identifier>
    </path/to/file#identifier>
    <file#identifier>
    <#identifier>
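Relative resolution can be tried out with Python's standard `urllib.parse.urljoin`. The sketch below assumes the document was retrieved from http://www.example.com/path/to/file (the base URL is an assumption for this example); under that assumption every reference in the list resolves to the same absolute URL.

```python
from urllib.parse import urljoin

# Assumed base URL: the location the document was fetched from.
BASE = "http://www.example.com/path/to/file"

REFERENCES = [
    "http://www.example.com/path/to/file#identifier",  # absolute
    "//www.example.com/path/to/file#identifier",       # protocol-relative
    "/path/to/file#identifier",                        # path-absolute
    "file#identifier",                                 # relative to the base path
    "#identifier",                                     # fragment only
]

# Resolve every reference against the base URL.
resolved = [urljoin(BASE, ref) for ref in REFERENCES]

for ref, full in zip(REFERENCES, resolved):
    print(f"{ref:50} -> {full}")
```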
History
- LingInfo (Buitelaar, 2006)
- LexOnto (Cimiano, 2007)
- Linguistic Information Repository (Montiel-Ponsoda, 2008)
- LexInfo (2010)
- Monnet Lemon (2011)
- OntoLex CG Founded (2012)
- OntoLex Use Cases (2014)
- OntoLex Lemon Final Specification (2016)
- Lexicography Module (2019)
General Requirements
R1. OWL and RDF
R2. Multilinguality
R3. Semantics by Reference
R4. Openness
R5. Reuse relevant standards
RDF and OWL
- RDF models are labelled directed graphs
- Representation
- Each entry has a URI
- Reuse of lexicon data
- Reasoning
Multilinguality
- Support any language
- Do not make language-specific assumptions
- Part-of-speech values
- Gender
- Translation and variation
Semantics by Reference
- Meaning of a word given by reference
- Reference captures semantic information
- Disambiguation is performed relative to the ontology
- No (traditional) word senses
Openness
- Extensible with new models
- No unnecessary choices of linguistic categories
- No payment or restrictions in using the model
Reuse standards
Reuse as many standards as possible:
- OWL
- RDF
- SKOS
- Dublin Core
- LMF
- TMF
Ontologies http://www.example.com/foo/ID1234 “Language”@en “Teanga”@ga ...Cuairt Liteartha do Theangacha Mionlaigh san Eoraip...
Linked Data on the Web
[Diagram: the label "Edema" attached to the linked resources http://dbpedia.org/resource/Edema, http://de.dbpedia.org/resource/Ödem, umls:C0013604, mesh:D00487 and icd10:R60.9.]
Linked Data with Language
[Diagram: the labels "Edemata", "Edema" and "Dropsy" attached to the linked resources http://dbpedia.org/resource/Edema, http://de.dbpedia.org/resource/Ödem, mesh:D00487, icd10:R60.9 and umls:C0013604.]
Lexical Entries
[Diagram: the lexical entries EDEMA and DROPSY sit between the forms ("Edemata", "Edema", "Dropsy") and the resources dbpedia:Edema, mesh:D00487, icd10:R60.9 and umls:C0013604.]
What is a lexical entry?
A lexical entry represents a unit of analysis of the lexicon that consists of a set of forms that are grammatically related and a set of base meanings that are associated with these forms. Thus, a lexical entry is a word, multiword expression or affix with a single part-of-speech, morphological pattern, etymology and set of senses.
Forms
EDEMA
- "Edema" (number=singular)
- "Edemata" (number=plural)
Senses
[Diagram: the entries EDEMA and DROPSY linked through senses to the references dbpedia:Edema and dbpedia:Fish_Dropsy; one sense is annotated dating=old.]
Simple Entry

    # OntoLex namespace
    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .

    <#cat> a ontolex:Word ;
      # Lemma
      ontolex:canonicalForm [ ontolex:writtenRep "cat"@en ] ;
      # Sense
      ontolex:denotes [ skos:definition "A four-legged, furry animal"@en ] .
Simple Entry with Grammatical Information

    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    # LexInfo ontology
    @prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .

    <#cat> a ontolex:Word ;
      # Part of speech
      lexinfo:partOfSpeech lexinfo:noun ;
      ontolex:canonicalForm [ ontolex:writtenRep "cat"@en ;
                              lexinfo:number lexinfo:singular ] ;
      # Inflected form
      ontolex:otherForm [ ontolex:writtenRep "cats"@en ;
                          lexinfo:number lexinfo:plural ] ;
      ontolex:denotes [ skos:definition "A four-legged, furry animal"@en ] .
Lexical Sense
sense ⚬ reference = denotes

    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix dbpedia: <http://dbpedia.org/resource/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

    <#bulrush> a ontolex:Word ;
      ontolex:sense [ ontolex:reference dbpedia:Typha ;
                      # Restriction on the lexical sense
                      ontolex:usage [ rdf:value "British English" ] ] ;
      ontolex:denotes dbpedia:Typha .

    <#cattail> a ontolex:Word ;
      ontolex:sense [ ontolex:reference dbpedia:Typha ;
                      ontolex:usage [ rdf:value "American English" ] ] ;
      ontolex:denotes dbpedia:Typha .
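The property chain sense ⚬ reference = denotes says that whenever an entry has a sense and that sense has a reference, the entry denotes that reference. A minimal Python sketch of the inference follows; it is not an OWL reasoner, and the blank node labels `_:s1` and `_:s2` as well as the function name are assumptions for illustration.

```python
def infer_denotes(triples):
    """Apply the chain sense o reference = denotes and return the inferred triples."""
    # Index sense -> set of references.
    references = {}
    for s, p, o in triples:
        if p == "ontolex:reference":
            references.setdefault(s, set()).add(o)
    # For every entry --sense--> sense --reference--> ref, infer entry --denotes--> ref.
    inferred = set()
    for entry, p, sense in triples:
        if p == "ontolex:sense":
            for ref in references.get(sense, ()):
                inferred.add((entry, "ontolex:denotes", ref))
    return inferred


# The bulrush/cattail senses, with hypothetical labels for the blank nodes.
TRIPLES = {
    ("<#bulrush>", "ontolex:sense", "_:s1"),
    ("_:s1", "ontolex:reference", "dbpedia:Typha"),
    ("<#cattail>", "ontolex:sense", "_:s2"),
    ("_:s2", "ontolex:reference", "dbpedia:Typha"),
}

for triple in sorted(infer_denotes(TRIPLES)):
    print(triple)
```

Under this reading, the explicit ontolex:denotes triples in the Turtle are redundant with the sense and reference triples, which is what the chain expresses.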
Syntax and Semantics
John knows Philipp

    <http://john.mccr.ae> foaf:knows agsc:cimiano
Syntactic Frames

    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    # Synsem module
    @prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
    @prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .

    <#know> a ontolex:Word ;
      synsem:synBehavior <#know/transitive> .

    # Frame
    <#know/transitive> a synsem:SyntacticFrame ;
      lexinfo:subject <#know/subject> ;
      lexinfo:directObject <#know/directObject> .
Semantic Frames

    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
    @prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    <#know> a ontolex:Word ;
      ontolex:sense <#know/sense> ;
      synsem:synBehavior <#know/transitive> .

    # Lexical sense is an ontology mapping
    <#know/sense> a ontolex:LexicalSense , synsem:OntoMap ;
      synsem:ontoMap <#know/sense> ;
      ontolex:reference foaf:knows ;
      # Identifiers from syntactic frame
      synsem:subjOfProp <#know/subject> ;
      synsem:objOfProp <#know/directObject> .

    # Ontological definition of semantic frame
    foaf:knows a rdf:Property ;
      rdfs:domain foaf:Person ;
      rdfs:range foaf:Person .
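Syntactic-Semantic Mapping
[Diagram: a Lexical Entry has a Syntactic Frame with Argument (subject) and Argument (object), and a Lexical Sense / Onto Map that refers to a Property with Class (domain) and Class (range); the sense maps the subject and object arguments onto the property.]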
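As a rough illustration of how this mapping could be used, the Python sketch below turns the analysed sentence "John knows Philipp" into the triple from the Syntax and Semantics slide. The dictionary layout and the function name `frame_to_triple` are assumptions for this example and are not defined by the synsem module.

```python
# The sense of "know": its ontology reference and which syntactic arguments
# fill the subject and object positions of the property.
SENSE = {
    "reference": "foaf:knows",
    "subjOfProp": "<#know/subject>",
    "objOfProp": "<#know/directObject>",
}


def frame_to_triple(sense, bindings):
    """Build (subject, property, object) from a sense and its argument bindings.

    `bindings` maps syntactic argument identifiers to the entities that fill
    them in a parsed sentence (a hypothetical parser output).
    """
    return (
        bindings[sense["subjOfProp"]],
        sense["reference"],
        bindings[sense["objOfProp"]],
    )


# "John knows Philipp": John fills the subject, Philipp the direct object.
BINDINGS = {
    "<#know/subject>": "<http://john.mccr.ae>",
    "<#know/directObject>": "agsc:cimiano",
}

triple = frame_to_triple(SENSE, BINDINGS)
print(" ".join(triple))
```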
Decomposition
Qualitätsmanagement-System = Qualität + Management + System
Decomposition
constituent ⚬ correspondsTo = subterm

    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix decomp: <http://www.w3.org/ns/lemon/decomp#> .
    @prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

    <#summer_school> a ontolex:MultiWordExpression ;
      decomp:subterm <#summer> , <#school> .

    <#école_d’été> a ontolex:MultiWordExpression ;
      decomp:constituent <#école_d’été/école> ,
                         <#école_d’été/de> ,
                         <#école_d’été/été> ;
      # Order
      rdf:_1 <#école_d’été/école> ;
      rdf:_2 <#école_d’été/de> ;
      rdf:_3 <#école_d’été/été> .

    # Component properties
    <#école_d’été/de> a decomp:Component ;
      decomp:correspondsTo <#de> ;
      lexinfo:lexTermType lexinfo:contraction .
Variation and Translation
Cultural Translation: "Japanese Rice Cake"@en, "もち"@ja
How to represent translation
Example: "Rice"@en and "米"@ja
- Semantic Level: (1) Shared Reference (dbpedia:Rice)
- Lexicosemantic Level: (2) Translation, between Sense and Sense
- Lexicosemantic Level: (3) Stand-off (vartrans:Translation)
- Lexical Level: (4) Translatable As
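Option (1), shared reference, needs no explicit translation links at all: two entries in different languages that denote the same ontology entity are candidate translations. A minimal Python sketch of that idea follows; the input layout (entry, language, reference) and the function name are assumptions for illustration only.

```python
from itertools import combinations


def translation_candidates(denotations):
    """Pair entries of different languages that share an ontology reference.

    `denotations` is an iterable of (entry, language, reference) tuples.
    """
    # Group entries by the entity they denote.
    by_reference = {}
    for entry, lang, ref in denotations:
        by_reference.setdefault(ref, []).append((entry, lang))
    # Within each group, pair up entries whose languages differ.
    pairs = []
    for ref, entries in by_reference.items():
        for (e1, l1), (e2, l2) in combinations(entries, 2):
            if l1 != l2:
                pairs.append((e1, e2, ref))
    return pairs


DENOTATIONS = [
    ("Rice", "en", "dbpedia:Rice"),
    ("米", "ja", "dbpedia:Rice"),
]

print(translation_candidates(DENOTATIONS))
```

The other options trade this simplicity for precision: links between senses or a stand-off vartrans:Translation can carry information about the translation itself, which a shared reference alone cannot.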
Linguistic Metadata Magic Ontology Jace the Wizard Erhnam the Djinn
LiMe - Linguistic Metadata See Manuel’s Talk
New Modules
- Morphology
- Lexicography (for traditional lexicographic resources)
- Frequency, Attribution and Corpus Information (FRAC)
- Etymology and Diachronicity
- Lexico-Syntactic Categories
Community Group
http://www.w3.org/community/ontolex
Please join!
Thanks. This work has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289, co-funded by the European Regional Development Fund, and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 731015, ELEXIS - European Lexical Infrastructure.