RDA Data Foundation and Terminology (DFT) IG: Introduction

Goal: Describe a basic, abstract (but clear) data organization model that systemizes the already large body of definition work on data management terms, especially as involved in RDA’s efforts.

  1. RDA Data Foundation and Terminology (DFT) IG: Introduction
     • A PID record that points to a metadata record and to instantiations of identical bit-streams that may store additional attributes
     • Prepared for RDA Plenary, San Diego, March 9, 2015
     • Gary Berg-Cross, Raphael Ritz, Co-Chairs, DFT IG

  2. Prior DFT WG Activities & Accomplishments
     • One of the first RDA WGs
     • Drafted 4 related Model Documents on core work:
       • Data Models 1: Overview – 20+ models
       • Data Models 2: Analysis & Synthesis
       • Data Models 3: Term Snapshot
       • Data Models 4: Use Cases (work with other RDA WGs on use cases to illustrate data concepts)
     • Presented draft work & held community discussions at the RDA P1-P3 meetings
     • Participated in cross-WG discussions
     • Developed the Semantic MediaWiki Term Definition Tool (TeD-T) to capture an initial list of terms and definitions for discussion; demo held at P3 (see
     • Participated in Adoption Day: "Common Language Resources and Technology Infrastructure Adopting DFT" (CLARIN, Dieter van Uytvanck)
     • Figures: candidate list evolved to refined list; tool demo at Plenary 3

  3. Overview of Term Development: Getting Defs Organized for Review
     • Process: Scope; Terms from Model Papers; Placed in Tool; Analysis and Revision Process; Defs & Refinement
     • Starter areas and items:
       • Persistent Identifiers (PIDs and types)
       • Digital Object - Data Object
       • Collection - Data Set - Aggregation
       • Repository (Registries and related Policies)

  4. Example of Current Work on 10 Categories of Terms: Digital Object (aka Digital Entity). A digital object is composed of a structured sequence of bits/bytes. As an object, it is named. This bit sequence can be identified and accessed by a unique and persistent identifier or by referencing attributes that describe its properties. Note: the X.1255 ITU standard defines Digital Entity as a “machine-independent data structure consisting of one or more elements in digital form that can be parsed by different information systems; the structure helps to enable interoperability among diverse information systems in the Internet.”
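
The Digital Object definition on slide 4 can be pictured as a small data structure. The following is only a minimal sketch, not part of the DFT model documents: the class names `DigitalObject` and `ObjectStore`, the attribute keys, and the `hdl:0000/...` identifier strings are all hypothetical and chosen for illustration. It shows the two access routes the definition names: lookup by persistent identifier, and lookup by attributes that describe the object's properties.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """A named, structured bit sequence with an identifier and attributes."""
    name: str                     # "As an object, it is named"
    pid: str                      # unique, persistent identifier (format is hypothetical)
    bits: bytes                   # the structured sequence of bits/bytes
    attributes: Dict[str, str] = field(default_factory=dict)


class ObjectStore:
    """Hypothetical in-memory store, used only to illustrate the two access routes."""

    def __init__(self) -> None:
        self._by_pid: Dict[str, DigitalObject] = {}

    def register(self, obj: DigitalObject) -> None:
        # A PID must be unique, so a second registration is rejected.
        if obj.pid in self._by_pid:
            raise ValueError(f"PID already registered: {obj.pid}")
        self._by_pid[obj.pid] = obj

    def resolve(self, pid: str) -> Optional[DigitalObject]:
        """Access route 1: by unique and persistent identifier."""
        return self._by_pid.get(pid)

    def find(self, **attrs: str) -> List[DigitalObject]:
        """Access route 2: by attributes describing the object's properties."""
        return [
            obj for obj in self._by_pid.values()
            if all(obj.attributes.get(k) == v for k, v in attrs.items())
        ]


# Illustrative data only; names, PIDs, and attribute values are made up.
store = ObjectStore()
store.register(DigitalObject("temps-2014", "hdl:0000/example-1", b"\x00\x01",
                             {"format": "csv", "creator": "lab-a"}))
store.register(DigitalObject("temps-2015", "hdl:0000/example-2", b"\x02\x03",
                             {"format": "csv", "creator": "lab-b"}))

by_pid = store.resolve("hdl:0000/example-1")
by_attrs = store.find(creator="lab-b")
```

A PID lookup returns at most one object, while an attribute query may return several or none; that difference is one reason the definition treats the two routes separately.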

  5. More Terms and Initial definitions are in TeD-T

  6. Lessons Learned and Follow Up
     • It has, of course, been difficult to get consensus on the scope of a common vocabulary with detailed definitions.
     • The work has been more one of model and vocabulary identification than of integrated definition.
     • We have been in frequent discussions with communities about our results and will intensify this interaction.
     • Based on this experience, a broader plan for long-term maintenance will be submitted to the TAB and Council.
     • As needed, in consultation with these and other appropriate RDA entities, some updates to term definitions can be anticipated as part of maintenance.
     • A plan must be provided for maintaining the term tool (TeD-T) and for its use for DFT terms and perhaps by other WGs.
     • A special task force may be empowered to do this and other maintenance activities in line with guidance from RDA governance organizations.
     • Based on interest, a DFT IG was formed to continue these efforts.

  7. Coordinated with Several Other RDA Groups
     • Considerable discussion of vocabularies has been part of RDA group activities at Plenaries and of ongoing RDA group discussion.
     • We coordinated with several RDA WGs, as shown in the Data Fabric figure on data concepts and relations.
     • This coordination task needs to be ongoing.
     • Potentially all groups could be engaged in this IG, and we with them.
     • Much more work and discussion would be useful, for example with the PP WG, whose terminology was only briefly sketched out without full definitions.
     • PP, along with MIG, has expressed an interest in more formalized definitions that can be processed by computer; the TeD-T tool may be capable of providing these, or at least of demonstrating their feasibility.
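
To make the idea of "formalized definitions that can be processed by computer" on slide 7 concrete, here is a rough sketch of a term definition held as a structured record. This is not TeD-T's actual data format, which the slides do not describe; the `TermDefinition` fields, the `undefined_references` helper, and the placeholder text for Persistent Identifier are assumptions made for illustration. The Digital Object entry paraphrases the definition on slide 4.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class TermDefinition:
    """Hypothetical record layout for one vocabulary term."""
    term: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)   # terms this one refers to
    source: str = ""


def undefined_references(vocab: Dict[str, TermDefinition]) -> List[str]:
    """Return related terms that have no entry (or synonym) in the vocabulary."""
    known = set(vocab)
    for entry in vocab.values():
        known.update(entry.synonyms)
    missing = {ref for entry in vocab.values()
               for ref in entry.related if ref not in known}
    return sorted(missing)


entries = [
    TermDefinition(
        term="Digital Object",
        definition="A named, structured sequence of bits/bytes that can be "
                   "identified and accessed by a persistent identifier or by "
                   "attributes describing its properties.",
        synonyms=["Digital Entity"],
        related=["Persistent Identifier", "Repository"],
        source="DFT slide 4 (paraphrased)",
    ),
    TermDefinition(
        term="Persistent Identifier",
        definition="(placeholder: no definition is given in these slides)",
        synonyms=["PID"],
    ),
]
vocab = {entry.term: entry for entry in entries}

# A machine-readable serialization that other groups' tools could consume.
serialized = json.dumps([asdict(e) for e in entries], indent=2)

# A simple automated consistency check over the vocabulary.
missing = undefined_references(vocab)
```

Once definitions are records rather than free text, checks such as "every related term has its own entry" become mechanical, which is one kind of processing the slide may have in mind.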

  8. Objectives for P5
     • Start IG discussion, leveraging the existing work and approach but improving both.
     • We expect considerable discussion of new requirements coming out of groups nearing completion, as well as requests for support as part of adoption.
     • We can also leverage the experience of other IGs regarding success factors.
     • Focus on facilitating community discussion on core concepts.
     • Based on feedback, some curated revisions of definitions and an extension of the current synthesis model can be expected, to finalize and stabilize the effort for subsequent use.
     • Facilitate definition development.
     • Potential adopters will be encouraged at P5 to provide feedback on additional use case scenarios that illustrate the areas of work in which they plan to use the models and vocabulary.
     • This will serve to plan work and virtual meetings between P5 and P6.

