
Digital Object Architecture: an Advanced Architecture for Managing Digital Information

Digital Object Architecture: an Advanced Architecture for Managing Digital Information. WSIS Forum 2011 May 19, 2011. Presentation by Robert E. Kahn President & CEO Corporation for National Research Initiatives. Origins of the Internet. Multiple Different Packet Networks Open Architecture




Presentation Transcript


  1. Digital Object Architecture: an Advanced Architecture for Managing Digital Information WSIS Forum 2011 May 19, 2011 Presentation by Robert E. Kahn President & CEO Corporation for National Research Initiatives

  2. Origins of the Internet • Multiple Different Packet Networks • Open Architecture • Implemented via the TCP/IP Protocols • Standards Processes • Sustained Research Support • Eventually resulting in • Commercialization • Widespread Dissemination • Global Acceptance

  3. Three Initial Networks • DARPA originally funded three seminal packet networks – ARPANET, Packet Radio, Packet Satellite • The Internet came about from a desire to enable users and their computers to communicate efficiently, independent of the network they were using • Initial challenges were in areas such as: • Addressing • Routing • Congestion Control • Host Protocols • Addressing (16 bits to the wire, 32 bit IPv4 addresses; later -- 128 bit IPv6 addresses, URLs)

  4. Key Initial Decisions • Global Addresses (IP) freed us from ARPANET addressing of the wires • Gateways introduced for IP routing and for Network “Impedance Matching” – now called routers • TCP dealt with network-related concerns • different packet sizes, duplicates, error detection, losses due to tunnels, mountains, jamming, etc. • Enabled separate network administration • Global information system based on an open architecture

  5. From Packet Communication to Information Management • The Internet did not start out with a primary goal of assisting users in managing information. • Fast, efficient, reliable, global connectivity was the main goal • Information management was limited to ensuring proper information flows in the Internet • The World Wide Web was an important step in simplifying user access to information • Other alternatives are now emerging. • We now present an open architecture approach to information management that • Makes use of existing Internet capabilities • Allows different types of information management systems to be developed and interoperate.

  6. Digital Object Architecture • To reformulate the Internet architecture to focus more specifically on managing information rather than just communicating bits • Making use of its world-wide connectivity, but independent of current technology choices • Enabling existing and new types of information to be reliably managed and accessed in the Internet environment, including over very long periods of time • Providing mechanisms to stimulate dynamic new forms of expression and to manifest older forms • Support for multi-lingual identifier names in most native/local scripts • While supporting privacy, security, intellectual property protection, managed access and well-formed business practices

  7. Digital Object Architecture • Technical Components • Digital Objects (DOs) • Structured data with a unique persistent identifier • Resolution of the Unique Identifiers • To “state information” about the DOs • Repositories • To deposit DOs • To access DOs with security • Registries • To create and store metadata • For secure searching

  8. Digital Object Architecture • [Diagram: a User's Client connected to Resource Discovery, the Resolution System, and Repositories / Collections] • Resource Discovery: Metadata Registries in lieu of traditional • Search Engines • Metadata Databases • Catalogues, Guides, etc.

  9. Selected Digital Object Types • Documents, Books, Music, Videos, Spreadsheets • Personal data (coordinates, financial, medical) • Observational data (climate, radio astronomy) • Networking Information (operations, provisioning, forecasting) • Commerce and Business Information (contracts, bills of lading, letters of credit, etc) • Software (programs, running processes & distributed systems) • Information about “Things”

  10. Repositories Store and Access Digital Objects on the Net • Logical External Interface • Any Hardware & Software Configuration • Digital Object Protocol

  11. Digital Object Protocol • Uniform interface for accessing repositories and their digital objects • Based on the use of identifiers • Provides authentication of both users and servers upon request or where required • Uses identity management based on the use of public keys • Key means of implementing interoperability

  12. The Digital Object Protocol is a Meta-Level, Extensible Interface • <input sequence> <H1> <H2> <Params> <output sequence> • H1 is a handle for the operation applied to the Target DO H2. • Similarly, both A and B are known by their Handles HA and HB. • The steps of the protocol are: • Establish a connection from A to B • {Optionally} A asks B to authenticate himself • If successful, A provides an input string to B • {Optionally} B asks A to authenticate herself • B provides the results of the operation • Either party may choose to continue or close
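
The steps on this slide can be walked through with a small in-memory sketch. It is not the Digital Object Protocol's wire format or any real library's API: the `Request`, `Repository` and `exchange` names, the `99.OPS/...`, `99.REPO/...` and `99.USERS/...` handles, and the allow-list that stands in for public-key authentication are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    operation: str                      # H1: handle naming the operation
    target: str                         # H2: handle of the target digital object
    params: dict = field(default_factory=dict)
    input_data: bytes = b""             # the input sequence


class Repository:
    """Party B. All handles and operations here are hypothetical examples."""

    def __init__(self, handle, objects, known_clients):
        self.handle = handle                      # HB
        self.objects = dict(objects)              # target handle -> stored bytes
        self.known_clients = set(known_clients)   # stand-in for public-key checks
        self.operations = {
            "99.OPS/read": lambda data: data,
            "99.OPS/size": lambda data: str(len(data)).encode(),
        }

    def perform(self, request):
        operation = self.operations[request.operation]   # KeyError if unknown
        return operation(self.objects[request.target])   # KeyError if unknown


def exchange(client_handle, repo, request, expect_repo=None, check_client=True):
    """Walk the listed steps once and return the output sequence."""
    # Step 1 (establish a connection) is implicit: we hold a reference to repo.
    # Step 2 (optional): A asks B to authenticate itself.
    if expect_repo is not None and repo.handle != expect_repo:
        raise PermissionError("repository failed authentication")
    # Step 3: A provides the input string (the request) to B.
    # Step 4 (optional): B asks A to authenticate.
    if check_client and client_handle not in repo.known_clients:
        raise PermissionError("client failed authentication")
    # Step 5: B provides the results of the operation.
    return repo.perform(request)
```

Because the operation is itself named by a handle, new operations can be added to the table without changing the shape of the exchange, which is one way to read "meta-level, extensible".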

  13. Metadata Registry • Registers the existence and access conditions for Digital Objects • Enables collections to be defined with appropriate access controls • Provides a user interface to browse and search the registry, and an API for other programs to search the registry • Integrates existing technologies • Handle System for identification and access • Digital Object Repository for metadata object storage and access • XML for object description and submission • Specification of Metadata Schemas
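
A registry that records the existence of digital objects together with their access conditions, and exposes a search API, might look roughly like the sketch below. The class names, the `allowed_readers` field (an empty set is treated as public) and the exact-match search are assumptions for illustration; the slide does not specify a schema or query language.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    handle: str                                       # identifier of the registered object
    metadata: dict                                    # descriptive metadata
    allowed_readers: set = field(default_factory=set)  # empty set means public


class MetadataRegistry:
    """Hypothetical in-memory registry keyed by handle."""

    def __init__(self):
        self._entries = {}

    def register(self, handle, metadata, allowed_readers=()):
        self._entries[handle] = RegistryEntry(
            handle, dict(metadata), set(allowed_readers)
        )

    def search(self, user, **criteria):
        """Return handles visible to `user` whose metadata matches all criteria."""
        hits = []
        for entry in self._entries.values():
            visible = not entry.allowed_readers or user in entry.allowed_readers
            matches = all(entry.metadata.get(k) == v for k, v in criteria.items())
            if visible and matches:
                hits.append(entry.handle)
        return sorted(hits)
```

The search returns identifiers rather than the objects themselves, so a caller would still go through resolution and the repository's own access checks to retrieve content.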

  14. [Diagram: CORDRA federation. Labels: Master Registry of Registries; Intermediate Registry of Registries; CORDRA Community; CORDRA Registry; Content Repositories; Federation Level Metadata]

  15. What are Handles? Why Resolution Systems? • CNRI uses the name “Handles” to denote digital object identifiers • Others may prefer to use their own descriptors • Existing identifier schemes are accommodated • Identifiers provide a way to identify data structures independent of their physical form or location, if any • Identifiers can be of many forms, and may contain randomly generated strings, date-time stamps as well as semantics • The identifier itself will not usually contain useful information about the digital object • The resolution system is intended to make available the useful information

  16. Why are Identifiers Important? • For global addressing • and possibly routing • For long-term information preservation • For building linkages • In lieu of attachments • To create virtual structures • For accessing related metadata • To convey search results • To authenticate/validate • Connectivity • Individual Digital Objects • Identity

  17. Structure of the Identifiers • Digital Object Identifiers are structured as “prefix/suffix” • They may be conveyed in various forms, such as: • 10.1234/Conf_Summary • HDL:10.1234/Conf_Summary • hdl.handle.net/10.1234/Conf_Summary • Each prefix has its own administrator with PKI access to the system for creation, change and deletion. • Resolution of an identifier results in a returned resolution record – generally within a fraction of a second
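
The three forms listed on this slide all carry the same “prefix/suffix” pair. A minimal sketch of normalizing them follows; `Handle` and `parse_handle` are hypothetical helper names, and accepting an `http://` or `https://` scheme in front of the proxy host is an added assumption not shown on the slide.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    prefix: str
    suffix: str


# Leading markers to strip, compared case-insensitively.
_LEADS = (
    "http://hdl.handle.net/",
    "https://hdl.handle.net/",
    "hdl.handle.net/",
    "hdl:",
)


def parse_handle(text: str) -> Handle:
    """Split an identifier in any of the listed forms into prefix and suffix."""
    s = text.strip()
    for lead in _LEADS:
        if s.lower().startswith(lead):
            s = s[len(lead):]
            break
    # Split at the first "/" only, so a suffix may itself contain slashes.
    prefix, sep, suffix = s.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix identifier: {text!r}")
    return Handle(prefix, suffix)
```

For example, `parse_handle("HDL:10.1234/Conf_Summary")` and `parse_handle("hdl.handle.net/10.1234/Conf_Summary")` both yield the prefix `10.1234` and the suffix `Conf_Summary`.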

  18. Resolution Mechanism • [Diagram: a DO Identifier is sent to the Handle System <www.handle.net>, which returns a Resolution Record; Multiple Workstations Distributed Globally] • System is non-nodal • Scalable & Distributed • Supports global (and local) resolution

  19. Handle System Features • Supports both Resolution and Administration • Internationalized character sets • Secured resolution service • Provides for Unique Persistent Identifiers • Current Users include: • DOI System, Open Archives Initiative, Library of Congress, CNNIC, Office of European Publications, DataCite, EIDR, DSpace Community and others

  20. Handle Resolution • The Handle System is a collection of handle services, each of which consists of one or more replicated sites, each of which may have one or more servers. • [Diagram: a Client, the GHR, and several LHSs; each service has Sites 1 to n, and each site has servers #1 to #n. Example: handle 123.456/abc resolves to a record with value 4 of type URL, http://www.acme.com/, and value 8 of type URL, http://www.ideal.com/]
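
The two-stage lookup suggested by this slide (the client consults the GHR to find the service responsible for a prefix, then asks that LHS for the record) can be sketched in memory as below. The class and function names are hypothetical, replication across sites and servers is left out, and the pairing of index 4 and index 8 with the two URLs is a reading of the slide's diagram rather than something the transcript states outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int    # position of the value within the record
    type: str     # e.g. "URL"
    data: str


class LocalHandleService:
    """Holds the non-system handle records for the prefixes it serves."""

    def __init__(self, records):
        self.records = dict(records)

    def resolve(self, handle):
        return self.records[handle]          # KeyError if the handle is unknown


class GlobalHandleRegistry:
    """Maps each prefix to the service responsible for it."""

    def __init__(self):
        self.services_by_prefix = {}

    def register(self, prefix, service):
        self.services_by_prefix[prefix] = service

    def locate(self, handle):
        prefix, _, _ = handle.partition("/")
        return self.services_by_prefix[prefix]   # KeyError if prefix is unknown


def resolve(ghr, handle, value_type=None):
    """Find the responsible service via the GHR, then fetch the record."""
    values = ghr.locate(handle).resolve(handle)
    if value_type is not None:
        values = [v for v in values if v.type == value_type]
    return values


# Example data taken from the slide's diagram.
lhs = LocalHandleService({
    "123.456/abc": [
        HandleValue(4, "URL", "http://www.acme.com/"),
        HandleValue(8, "URL", "http://www.ideal.com/"),
    ]
})
ghr = GlobalHandleRegistry()
ghr.register("123.456", lhs)
```

A call such as `resolve(ghr, "123.456/abc", "URL")` returns both URL values, leaving the choice between them to the client.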

  21. Mirroring the Global Handle Registry • [Diagram: one primary (P) with mirrors (M) under a single Administration, serving users] • Contains System Handle Records • Non-System Handle Records are in lots of Local Handle Services

  22. Planned Deployment of a Multi-Primary Global Registry • [Diagram: several primaries (P), each with mirrors, serving users] • A limited number of primaries, each Administered Separately, Plus Mirrors • Contains System Handle Records • Non-System Handle Records are in lots of Local Handle Services

  23. Observations • Identifiers provide the glue that holds complex distributed systems together • Security can be provided at a very fine level of granularity in the system • Repositories enable reliable long-term access to digital objects over generations of technology change • Registries enable digital objects to be made known and findable using multiple metadata schemas • The Multi-primary Global Registry enables distributed administration on a collaborative basis by multiple parties around the world. • Finally, DONA will provide a framework for the management of the DO Architecture in the future.
