ICSTI/ITOC

ICSTI/ITOC, 15 October 2013, Larry Lannom





Presentation Transcript



ICSTI/ITOC

15 October 2013

Larry Lannom

Research Data Alliance

Corporation for National Research Initiatives



DAITF: Enabling Technologies

21 March 2012

Larry Lannom

Corporation for National Research Initiatives

http://www.cnri.reston.va.us/

http://www.handle.net/


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID; Scientists, Data Curators, End Users, Applications]


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID, Accessed via Repositories; Scientists, Data Curators, End Users, Applications]


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID, Accessed via Repositories; Scientists, Data Curators, End Users, Applications; Enabling Technologies between them: Discovery]



Discovery & Evaluation

  • Search

    • Metadata registries

      • Subject

      • Parties

      • Dates

      • Etc

    • Crawlers – more ad hoc

  • Citation

    • Formats

  • Permissions

    • Can I see it?

    • Can I use it?

  • Trust
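As a rough illustration of the metadata-registry search outlined above, the sketch below filters records by subject, party, and date. The `MetadataRecord` class, its field names, and the sample entries are all hypothetical; no real registry schema is implied.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical registry entry; field names are illustrative only."""
    identifier: str
    subject: str
    parties: tuple
    published: date


def search(records: List[MetadataRecord],
           subject: Optional[str] = None,
           party: Optional[str] = None,
           since: Optional[date] = None) -> List[MetadataRecord]:
    """Return records matching every criterion that was supplied."""
    hits = []
    for rec in records:
        if subject is not None and subject.lower() not in rec.subject.lower():
            continue
        if party is not None and party not in rec.parties:
            continue
        if since is not None and rec.published < since:
            continue
        hits.append(rec)
    return hits


# Made-up sample data, for illustration only.
REGISTRY = [
    MetadataRecord("example/0001", "marine temperature", ("Lab A",), date(2012, 5, 1)),
    MetadataRecord("example/0002", "marine salinity", ("Lab B",), date(2013, 2, 10)),
    MetadataRecord("example/0003", "soil chemistry", ("Lab A",), date(2013, 6, 30)),
]

marine_2013 = search(REGISTRY, subject="marine", since=date(2013, 1, 1))
print([rec.identifier for rec in marine_2013])
```

A crawler-based approach would populate such records after the fact rather than relying on curated registration, which is one reading of "more ad hoc" above.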


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID, Accessed via Repositories; Scientists, Data Curators, End Users, Applications; Enabling Technologies between them: Discovery, Access]



Access

  • ID / reference resolution

    • Go from ‘subject search’ to ‘known item’ search

  • Access Protocols

    • How to get it

    • Protocol registries

    • Bootstrapping into new protocols

  • Authentication & Authorization

    • Proof of identity (tradeoff: usability vs security)

    • Permissions: with the object or in some external system?
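The move from an identifier to a "known item" location can be sketched as follows. This assumes the handle syntax `prefix/suffix` and the public proxy at `hdl.handle.net`; the helper names and the sample handle are illustrative, and no network call is made.

```python
from typing import Tuple
from urllib.parse import quote


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


def resolution_url(handle: str, proxy: str = "https://hdl.handle.net") -> str:
    """Build the proxy URL a client would dereference for a known item."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters that are unsafe in a URL path; keep '/'.
    return f"{proxy}/{quote(prefix)}/{quote(suffix, safe='/')}"


# Hypothetical identifier, used only to show the shape of the result.
print(resolution_url("12345/example-dataset"))
```

Authentication and authorization are deliberately left out of this sketch; whether permissions travel with the object or live in an external system is the open question the slide raises.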


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID, Accessed via Repositories; Scientists, Data Curators, End Users, Applications; Enabling Technologies between them: Discovery, Access, Interpretation]



Interpretation

  • Registries

    • Schemas

    • Vocabularies

    • Formats

    • Available services

    • Useful client-side tools

  • Trust

    • Who did this?

    • Who owns this?

  • Provenance

    • Data Source

    • Processing steps

    • Computing environment

      • what is needed to trust the numbers?

      • Domain specific?


Enabling Technologies

[Diagram: Datasets, shown as bit strings each labeled with an ID, Accessed via Repositories; Scientists, Data Curators, End Users, Applications; Enabling Technologies between them: Discovery, Access, Interpretation, Reuse]



Reuse

  • Everything from Interpretation slide + Permissions

    • Example from BOF: I need to understand a data set for peer review but that doesn’t give me permission to use the data

  • Validation

  • Education & Training

    • Integrate ‘live’ data into education and training

  • Repurpose data



DAITF Roles?

  • Bring good people together on a regular basis to discuss these issues

  • Get agreement on vocabulary for discussing data access and interoperability?

  • Working groups on specific topics

    • Prototyping specific interoperability issues / domains

  • Create high-level framework, à la OAIS? Multiple frameworks?

  • Guides to Registries and Best Practices


Research Data Alliance Plenary 2 Update

Dr. Francine Berman

Chair, RDA/US

Hamilton Distinguished Chair in Computer Science

Rensselaer Polytechnic Institute



RDA Plenary 2 -- September 16-18, Washington D.C. -- 3 days of Peace, Love and Data

  • RDA Plenary 2

    • 368 participants from 22 countries and all sectors

    • All-hands stakeholder talks and RDA working meeting

    • Data Citation Summit convened by DataCite, FORCE11, CODATA/ICSTI, ESIP, DCC, etc. to create a common agenda

    • ~5000 tweets over 3 days



RDA Community Current Status: ~1300 participants from 50+ countries

Albania

Australia

Austria

Bangladesh

Belgium

Bolivia

Botswana

Brazil

Bulgaria

Canada

China

Congo {Democratic Rep}

Costa Rica

Czech Republic

Denmark

Estonia

Finland

France

Germany

Greece

Iceland

India

Iran

Ireland

Ireland {Rep}

Italy

Japan

Kyrgyzstan

Kuwait

Mexico

Netherlands

New Zealand

Norway

Palestine

Poland

Portugal

Russian Federation

Rwanda

Serbia

Singapore

Slovenia

South Africa

South Korea

Spain

Sweden

Switzerland

Taiwan

Turkey

United Arab Emirates

United Kingdom

United States

Vatican City

Venezuela

Fran Berman



RDA Community Building Momentum

  • Growth in number and scope of Interest Groups and Working Groups

    • New: BOFs for groups as precursor to Interest Groups

    • Groups beginning to “self-monitor” to promote concrete deliverables to be used and adopted

    • Increasing interest in more interaction and “connective tissue” between groups

  • Pressing To-Dos before Plenary 3:

    • Develop an RDA policy for IP that comes up in Interest and Working Groups

    • Determine the form of RDA deliverables and what’s needed in terms of an “RDA archive”



Groups that Met at the RDA Plenary

BOLD = new since last Plenary

  • Birds-of-a-Feather

    • Linked Data

    • Chemical Safety Data

    • Education and Skills Development in Data Intensive Science

    • Libraries and Research Data

    • Cloud Computing and Data Analysis Training for the Developing World

  • Working Groups

    • Data Type Registries

    • Metadata Standards

    • Practical Policy

    • Persistent Identifier Types

    • Data Foundations and Terminology

    • Data Categories and Codes

  • Interest Groups

    • Agricultural Data

    • Big Data Analytics

    • Data Brokering

    • Certification of Trusted Repositories (joint with ICSU-WDS)

    • Long tail of Research Data

    • Marine Data Harmonization

    • Community Capability Model

    • Data Publishing (joint with WDS)

    • Toxicogenomics Interoperability

    • Research Data Provenance

    • Data Citation

    • Metadata

    • Economic Models and Infrastructure for Federated Materials Data Management

    • Engagement

    • Preservation e-Infrastructure

    • Legal Interoperability (joint with CODATA)

    • Global Registry of Trusted Data Repositories and Services

    • Digital Practices in History and Ethnography

  • Data Citation Harmonization Summit

    • DataCite, FORCE11, CODATA/ICSTI, ESIP, DCC, etc.



RDA Organizational Partners

New RDA constituencies / stakeholders

  • Organizational Assembly = Organizational Members (subscription) + Organizational Affiliates (MOUs).

  • Organizational Advisory Board will represent the Organizational Assembly.

  • Current Status:

    • Organizational Membership under discussion with Microsoft, IBM, ANDS, Australian Antarctic Data Center, Intersect, Terrestrial Ecosystems Research Network, CSC – IT Center for Science Ltd., Oracle, STFC, CNRI, STM, EUDAT, Barcelona Supercomputing Center, Columbia University Libraries / Information Services, and many more after the Plenary

    • Organizational Affiliation under discussion with CODATA, WDS and others

  • Next 6 months (before Plenary 3)

    • Firm up model for Affiliates (how many, how substantive should the interaction be?)

    • Complete creation of legal entity to host subscriptions for Organizational Members

    • Elect Organizational Advisory Board at Plenary 3



New Position: RDA recruiting for full-time Secretary-General

RDA Constituent Groups Coming Together

RDA Colloquium (National Research Agencies and Funders)

RDA Membership

RDA Council (overarching leadership)

Technical Advisory Board

(Technical oversight)

Secretary-General and Secretariat

(Administration and Operations)

Organizational Advisory Boards and Organizational Assembly

(Organizational partnerships and guidance)

Working Groups and Interest Groups

(Impact-focused infrastructure)



Next Plenaries (2X a year)

  • Plenary 3 will be in Dublin, March 26-28, 2014, hosted by Australia and Ireland

  • Plenary 4 will be in the Netherlands – late September in 2014

  • Plenary 5 or 6 likely back in the U.S. (west coast?)


Info: enquiries@rd-alliance.org

Fran Berman



Data Type Registries (DTR)

Co-Chairs

Larry Lannom: CNRI

Daan Broeder: MPI

September 2013

RDA Plenary 2

Washington, DC



Goal: Interoperable Set of Data Type Registries

  • Data Types

    • Characterize data structures at multiple levels of granularity

    • Formats are just part of the story

    • Optimize interactions between data producers & consumers by having types defined and associated with the data they describe

    • Types should be standardized, discoverable, and unique

  • Type Registries

    • Each type registered with unique identifier

    • Common data model and expression

    • Associate with services, tools, format registries, etc.

    • Common API for machine consumption
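A minimal sketch of what a type registry along these lines might look like, assuming a very small data model. The `TypeRecord` fields, the `TypeRegistry` class, the identifier, and the service URL are all hypothetical; the working group's actual data model and API were still being drafted at the time of this talk.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class TypeRecord:
    """Hypothetical common data model for one registered type."""
    identifier: str               # unique ID for the type (assumed PID-like)
    name: str
    description: str
    services: List[str] = field(default_factory=list)  # pointers to tools/services


class TypeRegistry:
    """Minimal in-memory registry; a real one would sit behind a network API."""

    def __init__(self) -> None:
        self._records: Dict[str, TypeRecord] = {}

    def register(self, record: TypeRecord) -> None:
        # Enforce the "unique identifier" requirement.
        if record.identifier in self._records:
            raise KeyError(f"type already registered: {record.identifier}")
        self._records[record.identifier] = record

    def lookup(self, identifier: str) -> TypeRecord:
        return self._records[identifier]

    def to_json(self, identifier: str) -> str:
        """Machine-readable form of a record, as a common API might return."""
        return json.dumps(asdict(self.lookup(identifier)), sort_keys=True)


registry = TypeRegistry()
registry.register(TypeRecord(
    identifier="example.types/temperature-series",   # made-up identifier
    name="Temperature time series",
    description="Timestamped temperature readings",
    services=["https://example.org/plot"],            # placeholder URL
))
print(registry.to_json("example.types/temperature-series"))
```

Serializing every record through one function is a stand-in for the "common data model and expression" goal: clients of any conforming registry would parse the same shape.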



Schedule

  • 3/2013 – 9/2013

    • Gathering use cases

    • Investigating other work in the area

    • First drafts of data model and functional specs for a type registry

  • 10/2013 – 12/2013

    • Refine data model and functional specs

    • Deploy initial prototype

  • 1/2014 – 5/2014 

    • Finalize data model and functional specs

    • Deploy functional type registry for PID types

    • Release turnkey registry conforming to functional specs



DTR Use Cases

  • Broad Functional Classification

    • Repos hold widely varying levels of data & metadata

    • High-level functional classification of the identified object needed to make sense of what is available, e.g., data object, metadata, repo description, contact info, etc.

  • Simple License Information via PID Resolution

    • Data set access conditions cannot be predicted based on ID

    • For DataCite DOIs, a handle/type/value triple could be used to provide access information, probably through a level of indirection, resulting in a pop-up or intervening page or open linked data

  • Object Types as a Short-cut for Dependent Services to Match Processing Requirements to Data Objects

    • Using data acquisition as an example

      • Determine object type you are trying to build

      • Consult registry to index into an ontology to dynamically define required and optional properties

      • Does the input data have what is needed?

  • Registration of PID Types (in ID/Type/Value triples) for Data Processing and Interpretation

    • Distinguish pointers to objects from pointers to metadata from pointers to services

    • Enable complex client interactions as opposed to simple one-to-one re-direction
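One way to picture the ID/Type/Value triples in these use cases is as typed entries in an identifier record, which a client sorts by type instead of following a single redirect. The type names (`object-location`, `metadata-location`, `service-endpoint`, `license`), the record contents, and the URLs below are invented for illustration; they are not registered PID types.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One ID/Type/Value entry in an identifier record."""
    identifier: str
    type: str
    value: str


# Hypothetical record for a single identifier; all values are placeholders.
RECORD = [
    Triple("12345/example", "object-location", "https://example.org/data.csv"),
    Triple("12345/example", "metadata-location", "https://example.org/data.xml"),
    Triple("12345/example", "service-endpoint", "https://example.org/subset"),
    Triple("12345/example", "license", "https://example.org/terms"),
]


def group_by_type(triples: List[Triple]) -> Dict[str, List[str]]:
    """Index values by type so a client can pick the pointer it needs."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for triple in triples:
        grouped[triple.type].append(triple.value)
    return dict(grouped)


def access_plan(triples: List[Triple]) -> List[str]:
    """Order the steps a client might take: license terms first, then data."""
    grouped = group_by_type(triples)
    plan = [f"show terms: {url}" for url in grouped.get("license", [])]
    plan += [f"fetch object: {url}" for url in grouped.get("object-location", [])]
    return plan


for step in access_plan(RECORD):
    print(step)
```

Putting the license pointer ahead of the data pointer mirrors the "intervening page" idea in the license use case: the access conditions cannot be read off the identifier, so the client consults the typed value first.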


One Use of Type Registries

[Diagram: Users; Typed Data, each item carrying an ID, a Type, and a Payload; Federated Set of Type Registries; Services: Visualization, Rights (Terms: … I Agree), Data Set, Dissemination, Data Processing. Arrows are numbered 1 to 4.]

  1. Client (process or people) encounters unknown type

  2. Resolved to Type Registry

  3. Response includes type definitions, relationships, properties, and possibly service pointers. Response can be used locally for processing, or, optionally

  4. Typed data or reference to typed data can be sent to service provider
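The client flow described above can be sketched as a small routine. Everything here is a stand-in: the registries are plain dictionaries, and the type identifiers, property names, and service URL are invented for illustration.

```python
from typing import Dict, List, Optional

# Two stand-in registries forming a "federation"; contents are made up.
REGISTRY_A = {
    "example.types/csv-table": {
        "properties": ["columns", "delimiter"],
        "service": None,
    },
}
REGISTRY_B = {
    "example.types/gridded-field": {
        "properties": ["grid", "units"],
        "service": "https://example.org/visualize",   # placeholder pointer
    },
}
FEDERATION: List[Dict[str, dict]] = [REGISTRY_A, REGISTRY_B]


def resolve_type(type_id: str) -> Optional[dict]:
    """Ask each registry in turn for the definition of an unknown type."""
    for registry in FEDERATION:
        if type_id in registry:
            return registry[type_id]
    return None


def handle_typed_data(type_id: str, payload: bytes, known_types: set) -> str:
    """Decide what to do with a typed payload the client has encountered."""
    if type_id in known_types:
        return "process locally (type already understood)"
    definition = resolve_type(type_id)       # unknown type: consult registries
    if definition is None:
        return "unresolved type"
    if definition["service"]:                # optional hand-off to a service
        return f"send {len(payload)} bytes to {definition['service']}"
    # No service pointer: use the returned definition locally.
    return "process locally using properties " + ", ".join(definition["properties"])


print(handle_typed_data("example.types/gridded-field", b"\x01\x02\x03", set()))
print(handle_typed_data("example.types/csv-table", b"a,b\n1,2\n", set()))
```

The sketch always prefers a service when one is listed; the slide presents the hand-off as optional, so a real client would presumably make that choice per request.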



A Few Words About CNRI

  • Not-for-profit organization formed in 1986 to foster research and development for the National Information Infrastructure (now internationally focused)

  • Major focus on management of information on networks: Digital Object Architecture

    • Handle System

    • DO Repository

    • DO Registry



Handle System Adoption by Domain

  • Research Project: Early 90s

    • Initial US-funded digital library project (DARPA)

  • Library/Publishing: late 90s through 00s and continuing to grow

    • DSpace – turnkey digital library platform (MIT + HP)

    • Digital Object Identifier (DOI) for journal articles

    • International from the start, including Asia

  • Breaking out of the publisher/library ghetto: starting late 00s

    • Scientific data

      • Australian National Data Service (ANDS)

      • Max Planck (handles)

      • DataCite (DOIs)

      • EPIC (European Persistent Id Consortium)

      • EUDAT

    • Entertainment Industry

      • EIDR (DOIs)

  • Threshold of use and dependence brings governance and sustainability issues

    • Who is CNRI? How long will they be around?

    • Who is in charge?

    • Not just a standards issue due to the global service (cf. DNS)



Infrastructural Governance and Sustainability

  • Spread Responsibility and Control from One Group to Many

    • Involve stakeholders

    • Develop financial sustainability plan

  • Develop an organizational model

    • Try to balance long-term and short-term incentives

    • Try to keep the organization from being captured by minority and/or moneyed interests

    • Build in flexibility

  • Independence from individual governments or industry players

  • DONA Foundation

    • Non-profit being established in Switzerland

    • Peer group of stakeholders will run and financially support the global infrastructure

    • Board of Directors will provide high-level guidance

    • CNRI will transfer relevant rights and technology to the Foundation and continue as one of N stakeholders

    • Each stakeholder has identical responsibilities to the Foundation but is otherwise independent

      • Governments could participate and provide their support out of general revenues

      • Industry could create appropriate business models

    • Formation in process, near term completion

    • Longer range objective is Digital Object Architecture approach to information system interoperability

