
ISO 16642

ISO 16642. TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria. Overview. General principles. Expressing constraints on the representation of computerized terminologies What is the underlying structure of computerized terminologies?



  1. ISO 16642 TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria

  2. Overview

  3. General principles • Expressing constraints on the representation of computerized terminologies • What is the underlying structure of computerized terminologies? • Which data-category is used and under which conditions? • Maintaining interoperability between representations • Providing a conceptual tool to compare two given formats

  4. Definitions • TMF: Terminological Mark-up Framework • Definition of underlying structures and mechanisms needed for the computer representation of terminological data • Independence with regard to any specific format • TML: Terminological Mark-up Language • One specific representation format generated within TMF • E.g.: DXLT is a possible TML

  5. A family of formats TMF … TML1 TML2 TML3 TML1 (Geneter) (DXLT)

  6. Meta-model Representing the underlying structure of terminological data

  7. [Diagram: meta-model] Terminological Data Collection • Global Information • Complementary Information • Terminological Entry • Terminology-related Information • Language Section • Term Section • Term Component Section

  8. The structural skeleton Terminological Data Collection (TDC) Global Information (GI) Complementary Information (CI) * Terminological Entry (TE) * Language Section (LS) * Term Level (TL) * Term Component Level (TCL)
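
The nesting of the skeleton can be sketched as plain Python data classes. This is only an illustration of the containment hierarchy: the class and field names are hypothetical and are not taken from ISO 16642, and each level simply carries a dictionary of data-category values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TermComponentLevel:  # TCL
    info: Dict[str, str] = field(default_factory=dict)


@dataclass
class TermLevel:  # TL
    info: Dict[str, str] = field(default_factory=dict)
    components: List[TermComponentLevel] = field(default_factory=list)


@dataclass
class LanguageSection:  # LS
    info: Dict[str, str] = field(default_factory=dict)
    terms: List[TermLevel] = field(default_factory=list)


@dataclass
class TerminologicalEntry:  # TE
    info: Dict[str, str] = field(default_factory=dict)
    languages: List[LanguageSection] = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:  # TDC
    global_info: Dict[str, str] = field(default_factory=dict)         # GI
    complementary_info: Dict[str, str] = field(default_factory=dict)  # CI
    entries: List[TerminologicalEntry] = field(default_factory=list)


# A one-entry collection; the values are illustrative only.
tdc = TerminologicalDataCollection(
    entries=[
        TerminologicalEntry(
            info={"subjectField": "manufacturing"},
            languages=[
                LanguageSection(
                    info={"language": "en"},
                    terms=[TermLevel(info={"term": "alpha smoothing factor"})],
                )
            ],
        )
    ]
)
```

Each level only knows about the level directly below it, which mirrors the way the skeleton is drawn on the slide.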

  9. How does this work? Walking through an example…

  10. DXLT example
      <termEntry id='ID67'>
        <descrip type='subjectField'>manufacturing</descrip>
        <descrip type='definition'>A value between 0 and 1 used in ...</descrip>
        <langSet lang='en'>
          <tig>
            <term>alpha smoothing factor</term>
            <termNote type='termType'>fullForm</termNote>
          </tig>
        </langSet>
        <langSet lang='hu'>
          <tig>
            <term>Alfa ...</term>
          </tig>
        </langSet>
      </termEntry>

  11. Identifying the structural skeleton (TE: Terminological Entry, LS: Language Section, TS: Term Section) • TE: id='ID67' [attribute], subjectField='manufacturing' [typedElement], definition='A value…' [typedElement] • LS: lang='en' [attribute] • TS: term='alpha smoothing factor' [element], termType='fullForm' [typedElement] • LS: lang='hu' [attribute] • TS: term='…' [element]

  12. TMF information model • TE: id='ID67', subjectField='manufacturing', definition='A value…' • LS: lang='en' • TS: term='alpha smoothing factor', termType='fullForm' • LS: lang='hu' • TS: term='…'

  13. GMT representation
      <struct type="TE">
        <feat type="id">ID67</feat>
        <feat type="subjectField">manufacturing</feat>
        <feat type="definition">A value between 0 and 1 used in ...</feat>
        <struct type="LS">
          <feat type="lang">en</feat>
          <struct type="TS">
            <feat type="term">alpha smoothing factor</feat>
            <feat type="termType">fullForm</feat>
          </struct>
        </struct>
        <struct type="LS">
          <feat type="lang">hu</feat>
          <struct type="TS">
            <feat type="term">Alfa ...</feat>
          </struct>
        </struct>
      </struct>
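
The step from the DXLT entry to its GMT form can be sketched in a few lines of Python with the standard `xml.etree.ElementTree` module. This is a hedged illustration, not the SALT tooling: the `LEVELS` table and the `to_gmt` function are hypothetical names, and the style detection (attribute, typed element, plain element) is hard-coded for this one example rather than driven by a TML specification.

```python
import xml.etree.ElementTree as ET

DXLT = """<termEntry id='ID67'>
  <descrip type='subjectField'>manufacturing</descrip>
  <descrip type='definition'>A value between 0 and 1 used in ...</descrip>
  <langSet lang='en'>
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type='termType'>fullForm</termNote>
    </tig>
  </langSet>
  <langSet lang='hu'>
    <tig>
      <term>Alfa ...</term>
    </tig>
  </langSet>
</termEntry>"""

# Assumed mapping from DXLT container elements to skeleton levels.
LEVELS = {"termEntry": "TE", "langSet": "LS", "tig": "TS"}


def to_gmt(elem):
    """Recursively turn a DXLT container element into a GMT <struct>."""
    struct = ET.Element("struct", {"type": LEVELS[elem.tag]})
    # Attribute style: the attribute name is the data category.
    for name, value in elem.attrib.items():
        ET.SubElement(struct, "feat", {"type": name}).text = value
    for child in elem:
        if child.tag in LEVELS:
            # A nested level of the structural skeleton.
            struct.append(to_gmt(child))
        elif "type" in child.attrib:
            # TypedElement style: the type attribute is the data category.
            ET.SubElement(struct, "feat", {"type": child.get("type")}).text = child.text
        else:
            # Element style: the element name is the data category.
            ET.SubElement(struct, "feat", {"type": child.tag}).text = child.text
    return struct


gmt = to_gmt(ET.fromstring(DXLT))
print(ET.tostring(gmt, encoding="unicode"))
```

The output carries the same information as the DXLT entry, but every level is a `struct` and every data category a `feat`, which is what makes GMT usable as a pivot between formats.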

  14. TML à la mode ISO • Ingredients • A structural skeleton • (take the TMF Metamodel) • A reference Data Category Registry • ISO 12620 is a good place to find one • Recipe • Choose some data categories from the registry • You can even constrain the values of your datcats • Associate a style and vocabulary with each datcat • You can take inspiration from others (DXLT) • Serve it hot to your software guy with a piece of SALT software

  15. GMT Generic Mapping Tool

  16. Background • Interoperability principle • If any two TMLs have exactly the same DCS (data category specification), even though they differ radically in style and vocabulary, they are equivalent. • Consequence • It is always possible to define a filter from one TML to another when they are interoperable • GMT is the intermediate representation to do so
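
The interoperability principle can be illustrated with a small Python sketch. It assumes, as a simplification, that a TML is described by a mapping from (level, data category) pairs to a styling, and that the DCS is just the set of those pairs; value constraints are ignored. The names `Styling`, `dcs` and `same_dcs` are hypothetical.

```python
from typing import Dict, FrozenSet, NamedTuple, Tuple


class Styling(NamedTuple):
    style: str       # e.g. "Attribute", "Element", "TypedElement"
    vocabulary: str  # element or attribute name used in the TML


DataCategory = Tuple[str, str]           # (level, data category name)
TmlSpec = Dict[DataCategory, Styling]    # simplified TML description


def dcs(spec: TmlSpec) -> FrozenSet[DataCategory]:
    """Return the data categories a TML uses, ignoring how they are styled."""
    return frozenset(spec)


def same_dcs(a: TmlSpec, b: TmlSpec) -> bool:
    """Two TMLs with the same DCS are equivalent whatever their styling."""
    return dcs(a) == dcs(b)


tml_a = {
    ("TE", "definition"): Styling("TypedElement", "descrip"),
    ("LS", "language"): Styling("Attribute", "lang"),
    ("TS", "term"): Styling("Element", "term"),
}
tml_b = {
    ("TE", "definition"): Styling("Element", "definition"),
    ("LS", "language"): Styling("Element", "language"),
    ("TS", "term"): Styling("ValuedElement", "term"),
}
tml_c = {("TS", "term"): Styling("Element", "term")}

print(same_dcs(tml_a, tml_b))  # same categories, different styling
print(same_dcs(tml_a, tml_c))  # tml_c lacks two categories
```

Styling and vocabulary are deliberately left out of the comparison: they only matter when a filter between the two formats is actually generated.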

  17. From one TML to another • GMT - Generic mapping tool • an abstract XML representation • identification of levels • <struct type="LS">…</struct> • a recursive element • representation of data-categories • <feat type="definition">…</feat>
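
Because `struct` is recursive and `feat` carries the data categories, a GMT document can be read with one short recursive function. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `read_struct` and the dictionary layout it returns are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GMT = """<struct type="TE">
  <feat type="id">ID67</feat>
  <struct type="LS">
    <feat type="lang">en</feat>
    <struct type="TS">
      <feat type="term">alpha smoothing factor</feat>
    </struct>
  </struct>
</struct>"""


def read_struct(struct):
    """Read one GMT level: its data categories and its nested levels."""
    return {
        "level": struct.get("type"),
        "feats": [(f.get("type"), f.text) for f in struct.findall("feat")],
        "children": [read_struct(s) for s in struct.findall("struct")],
    }


entry = read_struct(ET.fromstring(GMT))
print(entry["level"], entry["feats"])
```

`findall` with a bare tag name only looks at direct children, so each call handles exactly one level of the skeleton.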

  18. GMT description cont. • Bracketing features
      <brack>
        <feat type="classificationCode">xxx</feat>
        <feat type="classificationSystem">Lenoc</feat>
      </brack>

  19. GMT description cont. • Annotating information <feat type="definition">pencil whose <annot type="characteristic">casing</annot> is fixed around a central graphite medium which is used for writing or making marks</feat>
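
An annotated `feat` is mixed content: running text with `annot` elements embedded in it. A minimal Python sketch, assuming the standard `xml.etree.ElementTree` module, shows how the plain definition and the annotations can be read separately; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

FEAT = (
    '<feat type="definition">pencil whose '
    '<annot type="characteristic">casing</annot>'
    " is fixed around a central graphite medium which is used for"
    " writing or making marks</feat>"
)

feat = ET.fromstring(FEAT)

# The full definition text, with the annotation markup flattened away.
plain = "".join(feat.itertext())

# The annotations themselves, as (type, annotated text) pairs.
annotations = [(a.get("type"), a.text) for a in feat.iter("annot")]

print(plain)
print(annotations)
```

The annotation adds information about a span of the value without changing the value itself, which is why flattening it recovers the original definition.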

  20. Data Categories A Formal Description

  21. Data Category Registry DCRegistry rdf:about Description dcsd:DataCategory VersionNumber Data Category

  22. Data Category description • dcsd:DCIdentifier • dcsd:DCParent • dcsd:DCName • dcsd:DCDefinition • dcsd:DCType (S, C) • dcsd:DCExample • dcsd:DCAdmin • dcsd:DCComment • dcsd:Content • dcsd:Level (Locus) • Diagram source: Salt 2000-11-08/SEW

  23. Levels and content [diagram] • Content: dcsd:DataType, dcsd:TargetType • Level/Loci • DataType, TargetType: rdf:Alt list of references (rdf:li), each a reference to other datcat(s)

  24. Actualizing a DatCat TMF specific properties

  25. Styling properties [diagram] • dcsd:StyleName: Simple, Element, Attribute, TypedElement, ValuedElement, TVElement • dcsd:Anchor • dcsd:Style: dcsd:ElementName, dcsd:AttributeName, dcsd:TypeValue, dcsd:Value (for Simple)

  26. Attribute style description • dcsd:StyleName="Attribute" • Conditions of use: • Not valid for annotations • Required properties • dcsd:AttributeName • Example: • dcsd:AttributeName="id" • <anchorElement id="xx54893">…</anchorElement>

  27. Element style description • dcsd:StyleName="Element" • Required properties • dcsd:ElementName • Example: • dcsd:ElementName="definition" • <definition>…</definition>

  28. TypedElement style description • dcsd:StyleName="TypedElement" • Required properties • dcsd:ElementName, dcsd:TypeValue • Example: • dcsd:ElementName="termNote" • dcsd:TypeValue="partOfSpeech" • <termNote type="partOfSpeech">N</termNote>

  29. ValuedElement style description • dcsd:StyleName="ValuedElement" • Conditions of use: • Not valid for annotations • Required properties • dcsd:ElementName • Example: • dcsd:ElementName="pos" • <pos value="noun"/>

  30. TVElement style description • dcsd:StyleName="TVElement" • Conditions of use: • Not valid for annotations • Required properties • dcsd:ElementName, dcsd:TypeValue • Example: • dcsd:ElementName="free" • dcsd:TypeValue="pos" • <free type="pos" value="noun"/>

  31. Simple style description • dcsd:StyleName="Simple" • Conditions of use: • Express the value of simple data categories • Required properties: • dcsd:Value • Example: • dcsd:Value="Nom" • <pos>Nom</pos>
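
The Attribute, Element, TypedElement, ValuedElement and TVElement styles can be compared side by side with a small Python sketch that writes one value onto an anchor element according to the chosen style. The function `apply_style` and its keyword arguments are hypothetical; they loosely mirror dcsd:AttributeName, dcsd:ElementName and dcsd:TypeValue. The Simple style is left out, because the slides do not show how its dcsd:Value is bound to a carrying element.

```python
import xml.etree.ElementTree as ET

ELEMENT_STYLES = {"Element", "TypedElement", "ValuedElement", "TVElement"}


def apply_style(anchor, style, value, element_name=None,
                attribute_name=None, type_value=None):
    """Attach `value` to `anchor` using one of the TMF styles."""
    if style == "Attribute":
        # <anchorElement id="xx54893">
        anchor.set(attribute_name, value)
        return anchor
    if style not in ELEMENT_STYLES:
        raise ValueError(f"unsupported style: {style}")
    child = ET.SubElement(anchor, element_name)
    if style == "Element":
        child.text = value                 # <definition>...</definition>
    elif style == "TypedElement":
        child.set("type", type_value)      # <termNote type="partOfSpeech">N</termNote>
        child.text = value
    elif style == "ValuedElement":
        child.set("value", value)          # <pos value="noun"/>
    else:  # TVElement
        child.set("type", type_value)      # <free type="pos" value="noun"/>
        child.set("value", value)
    return child


anchor = ET.Element("anchorElement")
apply_style(anchor, "Attribute", "xx54893", attribute_name="id")
typed = apply_style(anchor, "TypedElement", "N",
                    element_name="termNote", type_value="partOfSpeech")
valued = apply_style(anchor, "ValuedElement", "noun", element_name="pos")
tv = apply_style(anchor, "TVElement", "noun",
                 element_name="free", type_value="pos")
print(ET.tostring(anchor, encoding="unicode"))
```

The same value ("noun") ends up in three different places depending on the style, while the data category it instantiates stays the same.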

  32. Dealing with languages

  33. Two types of languages • Working language • The language used at a given place in a document, along the XML hierarchy • Representation: xml:lang • Object language • The language about which you speak at a given place in your terminological entry (e.g. describes the Language Section level) • Representation: as a data category “language”, with a narrow scope

  34. Example: DXLT
      <langSet lang='en' xml:lang='fr'>
        <descrip type='definition'>Une valeur entre 0 et 1 utilisée…</descrip>
        <tig>
          <term xml:lang='en'>alpha smoothing factor</term>
          <termNote type='termType'>fullForm</termNote>
        </tig>
      </langSet>

  35. Example: GMT
      <struct type="LS" xml:lang="fr">
        <feat type="language">en</feat>
        <feat type="definition">Une valeur entre 0 et 1 utilisée…</feat>
        <struct type="TL">
          <feat type="term" xml:lang="en">alpha smoothing factor</feat>
          <feat type="termType">fullForm</feat>
        </struct>
      </struct>
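
The difference between the two kinds of language can be made concrete with a short Python sketch over the GMT example. The working language is resolved by inheriting `xml:lang` down the XML hierarchy, while the object language is read from the `language` data category. The helper `working_languages` is a hypothetical name; the only library used is the standard `xml.etree.ElementTree`, which exposes `xml:lang` under the XML namespace URI.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

GMT = """<struct type="LS" xml:lang="fr">
  <feat type="language">en</feat>
  <feat type="definition">Une valeur entre 0 et 1 utilisée…</feat>
  <struct type="TL">
    <feat type="term" xml:lang="en">alpha smoothing factor</feat>
    <feat type="termType">fullForm</feat>
  </struct>
</struct>"""


def working_languages(elem, inherited=None):
    """Yield (element, working language), inheriting xml:lang downward."""
    lang = elem.get(XML_LANG, inherited)
    yield elem, lang
    for child in elem:
        yield from working_languages(child, lang)


root = ET.fromstring(GMT)

# Object language: the language this Language Section is about.
object_language = root.find("feat[@type='language']").text

# Working language of each data category in the section.
langs = {
    elem.get("type"): lang
    for elem, lang in working_languages(root)
    if elem.tag == "feat"
}
print(object_language, langs)
```

The section describes English terms, yet everything written in it is French unless an element overrides `xml:lang`, as the `term` feature does.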

  36. Conclusion • A general model for analysing and representing terminological data collections • An underlying formalism expressed in XML and RDF • Associated tools (SALT project) • DCSEditor • DCSBrowser • Automatic generation of XSLT filters and XML schemas from a given TML specification

  37. Useful pointers • SALT project • http://www.loria.fr/projets/SALT • http://www.ttt.org/ • The TMF site • http://www.loria.fr/projets/TMF
