ISO/TC37/SC4/TDG6 Language Resource Ontologies

ISO/TC37/SC4/TDG6 Language Resource Ontologies. 2008-05-25, Marrakech. HASIDA Koiti, hasida.k@aist.go.jp, CfSR, AIST, Japan. TDG6 Issues: ontologization (DC, LAF, LMF, FS, MAF, SemAF, SynAF, TDG3, etc.). Cf. the Pisa group’s work on LMF.

Presentation Transcript


  1. ISO/TC37/SC4/TDG6 Language Resource Ontologies 2008-05-25, Marrakech HASIDA Koiti hasida.k@aist.go.jp CfSR, AIST, Japan

  2. TDG6 Issues • ontologization • DC, LAF, LMF, FS, MAF, SemAF, SynAF, TDG3, etc. • Cf. the Pisa group’s work on LMF • extension of RDF (and ontology framework) to more straightforwardly address linguistic information • extended RDF instead of XML • nodes embedding nodes … rdf:Container? • publish TRs • launch ISs

  3. Ontologization • ontology-based reformulation • Most current standards are based on XML and lack a standard framework for semantic interpretation. • not XML but RDF as the base description and modeling tool • Semantic interpretation is standardized not for XML but for RDF. • ontology as schema • not DTD, XML Schema, RELAX NG, etc.

  4. Motivations of Ontologization • Lack of a formal tool with which to write schemas that fully address the specifications in ISs. • The DCR model lacks descriptive power.

  5. Weaknesses of DCR Metamodel • The DCR metamodel cannot address • sorts of DCs, such as unary predicate, binary relation, symmetric binary relation, etc. • types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)

  6. Semantic Mess of XML • Semantic interpretation of XML is not standardized but rather arbitrary. • Many inconsistent ‘standards’ on overlapping issues. • Huge standards containing many different manners of semantic interpretation. • e.g., MPEG-7 > 2000 pages

  7. RDF • Resource Description Framework • W3C recommendation http://www.w3.org/RDF/ • basis of ontology standards such as RDFS, OWL, and SKOS. • graph data model • textual representation • XML • N3

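  8. RDF Graph (edges of the figure, read as subject, property, object) • http://meetings.example.com/cal#m1 m:homePage http://meetings.example.com/m1/hp • http://www.example.org/people#fred m:attending http://meetings.example.com/cal#m1 • http://www.example.org/people#fred m:givenName Fred • http://www.example.org/people#fred m:hasEmail mailto:fred@example.com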

  9. Cf. RDF in Text • XML: <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:m="http://www.example.org/meeting_organization#" xmlns="http://www.example.org/people#" xmlns:p="http://www.example.org/personal_details#"> <rdf:Description about="http://meetings.example.com/cal#m1"> <m:homePage resource="http://meetings.example.com/m1/hp"/> </rdf:Description> <rdf:Description about="http://www.example.org/people#fred"> <m:attending resource="http://meetings.example.com/cal#m1"/> <p:GivenName>Fred</p:GivenName> <p:hasEmail resource="mailto:fred@example.com"/> </rdf:Description> </rdf:RDF> • N3: @prefix p: <http://www.example.org/personal_details#> . @prefix m: <http://www.example.org/meeting_organization#> . <http://meetings.example.com/cal#m1> m:homePage <http://meetings.example.com/m1/hp> . <http://www.example.org/people#fred> p:GivenName "Fred"; p:hasEmail <mailto:fred@example.com>; m:attending <http://meetings.example.com/cal#m1> . • Let’s forget these texts and use graphs!
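The graph view of slides 8 and 9 can be sketched without any RDF library as a set of (subject, property, object) tuples. This is only an illustration: the helper names `objects` and `attended_home_pages` are hypothetical, and the sketch makes no distinction between IRIs and literals, which a real RDF store would.

```python
# Namespace prefixes copied from the N3 text on the slide.
M = "http://www.example.org/meeting_organization#"
P = "http://www.example.org/personal_details#"

MEETING = "http://meetings.example.com/cal#m1"
FRED = "http://www.example.org/people#fred"

# Each triple is (subject, property, object); the set of triples is the graph.
GRAPH = {
    (MEETING, M + "homePage", "http://meetings.example.com/m1/hp"),
    (FRED, M + "attending", MEETING),
    (FRED, P + "GivenName", "Fred"),
    (FRED, P + "hasEmail", "mailto:fred@example.com"),
}


def objects(graph, subject, prop):
    """Return every object linked to subject by prop."""
    return {o for s, p, o in graph if s == subject and p == prop}


def attended_home_pages(graph, person):
    """Follow m:attending, then m:homePage, starting from a person node."""
    pages = set()
    for meeting in objects(graph, person, M + "attending"):
        pages |= objects(graph, meeting, M + "homePage")
    return pages
```

Calling `attended_home_pages(GRAPH, FRED)` walks two edges of the graph and needs no knowledge of whether the data arrived as XML or N3, which is the point the slide makes.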

  10. ISO 24610: Feature Structure • typed feature structure as in HPSG, etc. • ISO 24610-1: Feature Structure Representation • ISO 24610-2: Feature System Declaration • graph model • AVM (attribute-value matrix) • textual encoding by XML

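  11. FS Graph (figure: the feature structure for “la pomme” drawn as a graph) • SPECIFIER node: POS determiner, ORTH la, AGR • HEAD node: POS noun, ORTH pomme, AGR • both AGR edges point to one shared node: NUMBER singular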

  12. FS in AVM [ SPECIFIER [ POS determiner, ORTH ‘la’, AGR [1][ NUMBER singular ] ], HEAD [ POS noun, ORTH ‘pomme’, AGR [1] ] ]

  13. FS in XML <fs> <f name="specifier"> <fs> <f name="pos"><symbol value="determiner"/></f> <f name="orth"><string>la</string></f> <f name="agr"> <var label="n1"> <fs><f name="number"><symbol value="singular"/></f></fs> </var> </f> </fs> </f> <f name="head"> <fs> <f name="pos"><symbol value="noun"/></f> <f name="orth"><string>pomme</string></f> <f name="agr"><var label="n1"/></f> </fs> </f> </fs> Let’s forget this, too!

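  14. FS in RDF Graph (= FS Graph) (figure: the same graph as on slide 11) • SPECIFIER node: POS determiner, ORTH la, AGR • HEAD node: POS noun, ORTH pomme, AGR • both AGR edges point to one shared node: NUMBER singular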
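The claim that a feature structure is already an RDF-style graph can be illustrated with a short sketch. It assumes the feature structure is held as nested Python dictionaries and that structure sharing (the tag [1] in the AVM) is expressed by reusing the same dictionary object; the function name `fs_to_triples` and the blank-node labels `_:n0`, `_:n1`, ... are hypothetical.

```python
import itertools


def fs_to_triples(fs, node_ids=None, counter=None, triples=None):
    """Flatten a nested-dict feature structure into (node, feature, value) triples.

    Sub-structures that are the same Python object (reentrancy) are mapped
    to the same node label, so sharing survives the conversion.
    """
    if node_ids is None:
        node_ids, counter, triples = {}, itertools.count(), []
    if id(fs) in node_ids:  # already visited: reuse the shared node
        return node_ids[id(fs)], triples
    node = f"_:n{next(counter)}"
    node_ids[id(fs)] = node
    for feature, value in fs.items():
        if isinstance(value, dict):
            child, _ = fs_to_triples(value, node_ids, counter, triples)
            triples.append((node, feature, child))
        else:
            triples.append((node, feature, value))
    return node, triples


agr = {"NUMBER": "singular"}  # one object, shared by both AGR features
la_pomme = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}
root, triples = fs_to_triples(la_pomme)
```

Both AGR triples end at the same node label, which corresponds to the single shared node in the slide's graph and to the `n1` variable in the XML encoding.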

  15. Ontologies Subsume Feature Systems • Features are partial functions, whereas RDF properties are relations in general (possibly partial functions). • Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).
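Both points on this slide can be made concrete over plain triples. The sketch below is an assumption-laden illustration, not an RDFS reasoner: `is_functional` tests whether a property behaves like a feature (at most one value per subject), and `with_superproperties` adds the triples that a subPropertyOf table would entail. The property names `attending` and `participatesIn` and the node names are invented for the example.

```python
from collections import defaultdict


def is_functional(triples, prop):
    """True if no subject has more than one distinct value for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return all(len(v) <= 1 for v in values.values())


def with_superproperties(triples, sub_property_of):
    """Return triples plus those entailed by a sub-property -> super-property table."""
    closed = set(triples)
    frontier = list(triples)
    while frontier:
        s, p, o = frontier.pop()
        parent = sub_property_of.get(p)
        if parent is not None and (s, parent, o) not in closed:
            closed.add((s, parent, o))
            frontier.append((s, parent, o))
    return closed


DATA = {
    ("w1", "ORTH", "pomme"),       # feature-like: one value per node
    ("fred", "attending", "m1"),   # general relation: several values
    ("fred", "attending", "m2"),
}
SUB_PROPERTY_OF = {"attending": "participatesIn"}  # hypothetical taxonomy
CLOSED = with_superproperties(DATA, SUB_PROPERTY_OF)
```

Here `ORTH` passes the functional test while `attending` does not, and `CLOSED` additionally contains the two `participatesIn` triples.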

  16. Feature Structure Declaration • XML: <fsDecl type="word" baseTypes="sign"> <fsDescr>The fundamental type for individual words</fsDescr> <fDecl name="orth"> <fDescr>The orthographic representation for this word</fDescr> <vRange><string/></vRange> </fDecl> </fsDecl> • RDF graph (figure, read as subject, property, object): word rdfs:subClassOf sign • word rdfs:comment “The fundamental type for individual words” • orth rdf:type owl:FunctionalProperty • orth rdfs:domain word • orth rdfs:range string • orth rdfs:comment “The orthographic representation for this word”
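The mapping from the declaration to ontology statements can be sketched mechanically. The following reads the slide's `fsDecl` with the standard-library XML parser and emits RDFS/OWL-style triples as plain strings. The mapping rules (base type to `rdfs:subClassOf`, each feature to an `owl:FunctionalProperty` whose `rdfs:domain` is the declared type) are an interpretation of the slide's figure, and the function name `fsdecl_to_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

FSD = """
<fsDecl type="word" baseTypes="sign">
  <fsDescr>The fundamental type for individual words</fsDescr>
  <fDecl name="orth">
    <fDescr>The orthographic representation for this word</fDescr>
    <vRange><string/></vRange>
  </fDecl>
</fsDecl>
"""


def fsdecl_to_triples(xml_text):
    """Translate one fsDecl element into (subject, property, object) triples."""
    decl = ET.fromstring(xml_text.strip())
    fs_type = decl.get("type")
    triples = []
    for base in decl.get("baseTypes", "").split():
        triples.append((fs_type, "rdfs:subClassOf", base))
    descr = decl.findtext("fsDescr")
    if descr:
        triples.append((fs_type, "rdfs:comment", descr.strip()))
    for fdecl in decl.findall("fDecl"):
        feature = fdecl.get("name")
        # Features are partial functions, hence functional properties.
        triples.append((feature, "rdf:type", "owl:FunctionalProperty"))
        triples.append((feature, "rdfs:domain", fs_type))
        fdescr = fdecl.findtext("fDescr")
        if fdescr:
            triples.append((feature, "rdfs:comment", fdescr.strip()))
        vrange = fdecl.find("vRange")
        if vrange is not None:
            for value_type in vrange:  # e.g. the <string/> child
                triples.append((feature, "rdfs:range", value_type.tag))
    return triples


FSD_TRIPLES = fsdecl_to_triples(FSD)
```

The six resulting triples correspond to the six labelled edges in the slide's graph.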

  17. Constraint (Conditional) • XML: <cond> <fs> <f name="inv"> <binary value="true"/> </f> </fs> <then/> <fs> <f name="aux"> <binary value="true"/> </f> <f name="vform"> <symbol value="fin"/> </f> </fs> </cond> • RDF graph (figure): a named graph containing X inv true, linked by cond to a graph containing X aux true and X vform fin
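What the conditional means operationally can be sketched as a check over dictionary-based feature structures: if a structure matches the antecedent (inv is true), it must also match the consequent (aux is true and vform is fin). This is a simplified reading in which a missing feature counts as a non-match; the function names `subsumes` and `satisfies` are hypothetical and do not come from ISO 24610.

```python
def subsumes(pattern, fs):
    """True if every feature-value pair in pattern also holds in fs."""
    for feature, expected in pattern.items():
        if feature not in fs:
            return False
        actual = fs[feature]
        if isinstance(expected, dict):
            if not (isinstance(actual, dict) and subsumes(expected, actual)):
                return False
        elif actual != expected:
            return False
    return True


def satisfies(fs, antecedent, consequent):
    """Implication: a structure matching the antecedent must match the consequent."""
    return (not subsumes(antecedent, fs)) or subsumes(consequent, fs)


IF_INV = {"inv": True}
THEN_AUX_FIN = {"aux": True, "vform": "fin"}

ok = satisfies({"inv": True, "aux": True, "vform": "fin"}, IF_INV, THEN_AUX_FIN)
bad = satisfies({"inv": True, "aux": False, "vform": "fin"}, IF_INV, THEN_AUX_FIN)
vacuous = satisfies({"inv": False}, IF_INV, THEN_AUX_FIN)
```

An inverted finite auxiliary passes, an inverted non-auxiliary fails, and a non-inverted structure passes vacuously because the antecedent does not apply.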

  18. FS Ontologization (Summary) • RDF ⊃ FS • Use ontologies for feature-system declarations. • We need RDF-based notations to encode constraints. • Defaults are outside of ontology.

  19. ISO 24612: Linguistic Annotation Framework

  20. RDF Extended for Embedding (figure) • a node embedding nodes: an NP node (rdf:type NP, NUMBER SING) embeds two token nodes • “The”: rdf:type TOKEN, POS DET, BASE THE • “clock”: rdf:type TOKEN, POS NN, BASE CLOCK • possibly stand-off annotation
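One way to picture a node that embeds other nodes, combined with stand-off annotation, is sketched below. It is not the extension proposed in the talk, only an assumed data structure: each node carries its properties, a pair of character offsets into the source text, and a list of embedded nodes. The class name `Node`, its fields, and the use of `rdf:type` as a plain dictionary key are choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    properties: dict
    span: tuple  # (start, end) character offsets: the stand-off anchor
    embedded: list = field(default_factory=list)

    def text(self, source):
        """Recover the annotated string from the untouched source text."""
        return source[self.span[0]:self.span[1]]


SOURCE = "The clock"
the = Node({"rdf:type": "TOKEN", "POS": "DET", "BASE": "THE"}, (0, 3))
clock = Node({"rdf:type": "TOKEN", "POS": "NN", "BASE": "CLOCK"}, (4, 9))
np = Node({"rdf:type": "NP", "NUMBER": "SING"}, (0, 9), [the, clock])
```

Because the annotation refers to the text only through offsets, the source string stays unmodified, and the NP node's span covers the spans of the tokens it embeds.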

  21. Prospects • RDF as basic data structure • Graph model is essential. • Forget about textual encoding such as XML • though W3C insists on plain-text encoding. • ontology to address FSD • straightforward to basically declare features and feature structures • need some inventions for constraints • extension of RDF • embeddings (of strings) • collections (sets, bags, lists) • lots more to do