Discussion, Outlook and Further Directions

caelan


  1. Discussion, Outlook and Further Directions
     Standardizing Data Categories in ISOcat - Implementing Group Work for Thematic Domains

  2. Topics
     • Container Data Categories
     • Relation Registries
     • Data Category Concepts
     • …

  3. Container Data Categories - I
     [Two diagrams: "A (schema for a) typological database" and "A LMF (ISO 24613:2008) compliant (schema for a) lexicon". Recoverable labels: Lexicon, Lexical Entry, Form, Sense, Lemma, Word Form (with multiplicities 1..*, 1..*, 0..*, 0..*); partOfSpeech, writtenForm, grammaticalGender, lexicalType, wordOrder.]
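
The distinction on this slide between containers (Lexicon, Lexical Entry, Form, Sense) and the data categories attached to them can be pictured with the minimal Python sketch below. It is an illustration only: the class name `Container`, the attachment of partOfSpeech and writtenForm to particular containers, and the sample values are assumptions, not taken from LMF or ISOcat.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Container:
    """An inner node of a data model (hypothetical class, for illustration)."""
    name: str
    # Data categories attached to this container, as name -> value.
    data_categories: Dict[str, str] = field(default_factory=dict)
    children: List["Container"] = field(default_factory=list)

    def add(self, child: "Container") -> "Container":
        """Attach a child container and return it for chaining."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Container"]]:
        """Yield (depth, container) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Assumed placement of data categories; the slide does not spell this out.
lexicon = Container("Lexicon")
entry = lexicon.add(Container("Lexical Entry", {"partOfSpeech": "noun"}))
entry.add(Container("Form", {"writtenForm": "nihongo"}))
entry.add(Container("Sense"))

for depth, node in lexicon.walk():
    print("  " * depth + node.name, node.data_categories)
```

The containers carry the structure, while the data categories carry the values; the question raised on the following slides is where the semantics of the containers themselves should be described.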

  4. Container Data Categories - II
     • A (TC 37) meta model which is instantiated with a domain/application specific data category selection into a data model
     • A proprietary data model with a related data category selection
     • A tweaked standardized meta model:
       • e.g., additional classes to the LMF meta model
     • Problem: where are the semantics of these ‘containers’ described?
       • LMF meta model in ISO 24613:2008
       • But no standard place for own adaptations/models

  5. Container Data Categories - III
     • Use the administrative and descriptive parts to manage standardization and describe the containers (components/tables/classes/objects/inner nodes…) of a meta/data model in the DCR
     • But the relationships between components and complex data categories wouldn’t be stored in the DCR (maybe in the RR)

  6. Relation Registries - I
     • Value domain membership
     • Subsumption relationships between simple data categories (legacy)
     • Relationships between complex data categories are not stored in the DCR
     [Diagram labels: partOfSpeech, string, pronoun, personal pronoun]
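
The two kinds of relationship named on this slide can be sketched in a few lines of Python. This is a hedged illustration, not the DCR data model: the dictionaries `VALUE_DOMAIN` and `BROADER`, the function names, and the reading of the diagram (pronoun and personal pronoun as values of partOfSpeech, with personal pronoun subsumed by pronoun) are all assumptions.

```python
from typing import Dict, Set

# Value domain membership: which simple DCs are allowed values of a complex DC.
VALUE_DOMAIN: Dict[str, Set[str]] = {
    "partOfSpeech": {"pronoun", "personal pronoun"},
}

# Subsumption between simple DCs: narrower -> broader (assumed reading).
BROADER: Dict[str, str] = {
    "personal pronoun": "pronoun",
}


def in_value_domain(complex_dc: str, simple_dc: str) -> bool:
    """True if simple_dc is listed in the value domain of complex_dc."""
    return simple_dc in VALUE_DOMAIN.get(complex_dc, set())


def is_subsumed_by(narrow: str, broad: str) -> bool:
    """True if broad is reachable from narrow by following broader links."""
    seen = set()  # guards against accidental cycles in the table
    current = narrow
    while current in BROADER and current not in seen:
        seen.add(current)
        current = BROADER[current]
        if current == broad:
            return True
    return False


print(in_value_domain("partOfSpeech", "pronoun"))
print(is_subsumed_by("personal pronoun", "pronoun"))
```

Both tables relate simple data categories to something else; relationships between complex data categories have no counterpart here, which matches the third bullet.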

  7. Relation Registries - II
     • Rationale for not storing ontological relationships in the DCR:
       • Relation types and modeling strategies for a given data category may differ from application to application;
       • Motivation to agree on relation and modeling strategies will be stronger at individual application level;
       • Integration of multiple relation structures in DCR itself could lead to endless ontological clutter.

  8. Relation Registries - III
     • TC 37 needs ontological relationships:
       • resurrect ‘broader generic concept’
       • is-a relationships (between complex DCs?)
       • …
     • Bridges
       • within the DCR:
         • users create the same (or very close) DCs
       • between ISOcat and ISO/CDB
         • ISOcat PID vs IRDI PID
       • between various registries:
         • interoperability between various communities
       • same-as relationships
     • Resource discovery needs context:
       • granularity of DCs
         • the /title/ of a book or the /title/ of a …
       • has-a relationships
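
A relation registry that bridges registries with same-as relationships could, under simple assumptions, be modelled as a list of triples kept outside the DCR. The sketch below is hypothetical throughout: the identifiers are placeholders rather than real ISOcat PIDs or IRDIs, and the relation names `sameAs` and `isA` are chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject PID, relation type, object PID)

# Placeholder identifiers; not real ISOcat PIDs or ISO/CDB IRDIs.
RELATIONS: List[Triple] = [
    ("isocat:DC-A", "sameAs", "isocdb:IRDI-A"),
    ("isocdb:IRDI-A", "sameAs", "mpi:DC-A"),
    ("isocat:DC-B", "isA", "isocat:DC-A"),
]


def same_as_closure(pid: str, relations: List[Triple]) -> Set[str]:
    """Return every PID reachable from pid over symmetric sameAs links."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for subject, relation, obj in relations:
        if relation == "sameAs":
            graph[subject].add(obj)
            graph[obj].add(subject)
    seen = {pid}
    stack = [pid]
    while stack:
        current = stack.pop()
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


print(sorted(same_as_closure("isocat:DC-A", RELATIONS)))
```

Because the triples live in the relation registry, each application can choose which relation types to follow without changing the data category specifications themselves.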

  9. Relation Registries - IV
     [Diagram with three layers. Relation registries: MPI RR, Typological Database System RR. Data category registries: MPI DCR, ISO DCR. Linguistic resources: resource, TDS database, MPI archive.]

  10. Data Category Concepts - I
      • TDGs create DCs from within a certain domain modeling view:
        • CMDI
        • LMF
        • TBX
        • …
      • DCs get types based on these views. However, users with other (proprietary) data models might want to use a DC whose type doesn’t fit.
        • POS field (closed DC) of the lexical entry “walk” gets the value ‘verb’ (simple DC)
        • Verb (open DC) feature of a feature structure gets the value “walk”
        • Both DCs could be semantically equivalent
      • Decouple some of the semantics of the DC specification and move it to a (DC) Concept so multiple DCs, with different types, can reuse it?
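
The decoupling proposed in the last bullet can be sketched as follows. This is one possible reading, not a description of ISOcat: the classes `Concept` and `DataCategory`, the identifier `concept:verb`, and the placeholder definition text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Type-independent semantics shared by several DCs (hypothetical)."""
    identifier: str
    definition: str


@dataclass(frozen=True)
class DataCategory:
    """A typed DC that wraps a concept and adds type-specific information."""
    name: str
    dc_type: str  # "simple", "closed" or "open"
    concept: Concept


def semantically_equivalent(a: DataCategory, b: DataCategory) -> bool:
    """Two DCs are treated as equivalent here if they share a concept."""
    return a.concept == b.concept


# Placeholder definition text; a real registry would hold the actual one.
verb = Concept("concept:verb", "placeholder definition of 'verb'")

# 'verb' as a value of the POS field of the lexical entry "walk".
verb_value = DataCategory("verb", "simple", verb)
# 'Verb' as a feature of a feature structure, taking the value "walk".
verb_feature = DataCategory("Verb", "open", verb)

print(semantically_equivalent(verb_value, verb_feature))
```

The two data categories keep their different types, yet a tool can still recognise that they share semantics by comparing the concept they reference.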

  11. Data Category Concepts - II
      • GOLD has been put into the DCR, but
        • Only some of the ontological relationships can be maintained
        • Only some concepts make sense as DCs, e.g., the upper ontology is too abstract
      • But you might still want to share/standardize these semantics and maintain these relationships …

  12. Data Category Concepts - III
      [Diagram labels: Linguistic resource (schema), Linguistic knowledge base, Data categories, Containers, Concepts, Relation]

  13. Data Category Concepts - IV
      <lmf:lexicon xml:lang="jp" alphabet="ipa">
        <lmf:entry>
          <lmf:lemma>
            <lmf:writtenForm>nihongo</…>
            …
          </…>
          …
        </…>
        …
      </…>

  14. Data Category Concepts - V
      [Diagram relating a data model to a knowledge base. Labels: lexicon, language, entry, alphabet, japanese, ipa, lemma, writtenForm]

  15. Data Category Concepts - VI
      • Data Category Concepts use the administrative and descriptive parts of the DCR data model; complex and simple DCs stay as they are now
      • Complex and simple DCs wrap around the semantics of a Data Category Concept and add information specific to their type
      • Is that possible? Or does the semantic description reflect the type?

  16. Data Category Concepts - VII
      • DCR would move to or include a concept registry
      • Relationship to ISO/CDB?
        • standardized snapshot is also available in ISO/CDB
        • grassroots approach leads to possibly many more non-standardized concepts available in ISOcat
        • alignment of PIDs using the RR
