DFC Vision

    Presentation Transcript

    1. DFC Vision
       • Build collaboration environment
          • Sharing of data, information, and knowledge
       • Form national data cyberinfrastructure
          • Federation of existing data management systems
       • Support reproducible data-driven research
          • Encapsulate knowledge within shared workflows
       • Enable student participation in research
          • Policy-controlled analysis of “live” data
       [Diagram labels: Compute Resources – HPC centers, institutional clusters | DFC Collaboration Environment – Data Grid | NEW | Community Resources – Repository, Catalog]

    2. Data Driven Science and Engineering
       • Plant biology – the iPlant Collaborative
          • Enable collaborative research across existing data repositories
       • Cognitive science – the Temporal Dynamics of Learning Center
          • Manage research data, apply IRB policies
       • Social Science – the Odum Institute
          • Integrate policy-based data management with the existing Dataverse repository
       Collaboration Environments
       • Oceanography – Ocean Observatories Initiative
          • Archiving climatic data records from real-time sensor data streams
       • Engineering – CIBER-U
          • Engineering Digital Library: curating civil engineering data, materials data, archaeology data, student training materials
       • Hydrology – EarthCube
          • Automating hydrology research workflows (data retrieval, transformation, analysis)

    3. Challenges
       • Federated national data cyberinfrastructure
          • Existing projects have web services, data repositories, digital libraries, archives, processing pipelines, science portals
          • What are the interoperability mechanisms needed to enable federation of existing resources?

    4. DFC Builds on the iRODS data grid (integrated Rule Oriented Data System)
       • Astrophysics – Auger supernova search
       • Atmospheric science – NASA Langley Atmospheric Sciences Center
       • Biology – Phylogenetics at CC IN2P3
       • Climate – NOAA National Climatic Data Center
       • Cognitive Science – Temporal Dynamics of Learning Center
       • Computer Science – GENI experimental network
       • Cosmic Ray – AMS experiment on the International Space Station
       • Dark Matter Physics – Edelweiss II
       • Earth Science – NASA Center for Climate Simulations
       • Ecology – CEED Caveat Emptor Ecological Data
       • Engineering – CIBER-U
       • High Energy Physics – BaBar / Stanford Linear Accelerator
       • Hydrology – Institute for the Environment, UNC-CH; Hydroshare
       • Genomics – Broad Institute, Wellcome Trust Sanger Institute, NGS
       • Medicine – Sick Kids Hospital
       • Neuroscience – International Neuroinformatics Coordinating Facility
       • Neutrino Physics – T2K and dChooz neutrino experiments
       • Oceanography – Ocean Observatories Initiative
       • Optical Astronomy – National Optical Astronomy Observatory
       • Particle Physics – Indra multi-detector collaboration at IN2P3
       • Plant genetics – the iPlant Collaborative
       • Quantum Chromodynamics – IN2P3
       • Radio Astronomy – Cyber Square Kilometer Array, TREND, BAOradio
       • Seismology – Southern California Earthquake Center
       • Social Science – Odum, TerraPop

    5. Policy Concept Graph
       [Diagram: concept graph; only the node and edge labels survive in the transcript.]
       • Main concepts: Purpose, Collection, Property, Policy, Procedure, Persistent State Information, Policy Enforcement Point, Function, Workflow, Operation, Client Action, Digital Object, Attribute
       • Edge labels: Defines, Has, Isa, SubType, HasFeature, Updates, Controls, Chains, Invokes
       • Policy nodes: Replication Policy, Checksum Policy, Quota Policy, Data Type Policy
       • Other node labels: Integrity, Authenticity, Access control, Periodic Assessment Criteria, Completeness, Correctness, Consensus, Consistency
       • Function nodes: GetUserACL, SetDataType, SetQuota, DataObjRepl, SysChksumDataObj
       • Attribute nodes: DATA_ID, DATA_REPL_NUM, DATA_CHECKSUM

    6. Policy-based Data Management – Implementation in iRODS
       [Diagram: the concept graph of slide 5 annotated with its iRODS implementation; only the labels survive in the transcript.]
       • Purpose (5 main types): Archive, Data grid, Collection, Digital Library, Processing Pipeline
       • Property (7 default), Policy (11 default), Procedure (11 default)
       • Persistent State Information (338)
       • Policy Enforcement Points (70)
       • Micro-service (317)
       • Clients (50)
       • Edge labels: Defines, Has, Isa, SubType, HasFeature, Updates, Controls, Chains, Invokes
       • Policy nodes: Replication Policy, Checksum Policy, Quota Policy, Data Type Policy
       • Other node labels: Integrity, Authenticity, Access control, Periodic Assessment Criteria, Completeness, Correctness, Consensus, Consistency, Workflow, Operation, Digital Object, Attribute
       • Micro-service nodes: msiGetUserACL, msiSetDataType, msiSetQuota, msiDataObjRepl, msiSysChksumDataObj
       • Attribute nodes: DATA_ID, DATA_REPL_NUM, DATA_CHECKSUM
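The relationship named on slide 6 (a policy enforcement point invokes a chain of micro-services, which update persistent state) can be sketched in a few lines of Python. This is an illustration only, not iRODS code: the `PolicyEngine` class, the `post_put` enforcement point name, and the two toy functions are hypothetical, and `DATA_REPL_NUM` is used here as a plain replica counter, which simplifies its real meaning.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-in for the persistent state of one digital object.
State = Dict[str, Any]
MicroService = Callable[[State], None]


def checksum_object(state: State) -> None:
    """Toy checksum micro-service: records a SHA-256 digest of the content."""
    state["DATA_CHECKSUM"] = hashlib.sha256(state["content"]).hexdigest()


def replicate_object(state: State) -> None:
    """Toy replication micro-service: increments a replica counter."""
    state["DATA_REPL_NUM"] = state.get("DATA_REPL_NUM", 0) + 1


@dataclass
class PolicyEngine:
    # Maps a policy enforcement point (PEP) name to the micro-service chain it invokes.
    rules: Dict[str, List[MicroService]] = field(default_factory=dict)

    def add_policy(self, pep: str, *chain: MicroService) -> None:
        self.rules.setdefault(pep, []).extend(chain)

    def trigger(self, pep: str, state: State) -> State:
        # Run every micro-service registered for this PEP, in order.
        for micro_service in self.rules.get(pep, []):
            micro_service(state)
        return state


engine = PolicyEngine()
engine.add_policy("post_put", checksum_object, replicate_object)  # made-up PEP name
record = engine.trigger("post_put", {"DATA_ID": 1, "content": b"hello"})
print(record["DATA_REPL_NUM"], record["DATA_CHECKSUM"][:8])
```

Keeping the policy (which chain runs at which point) separate from the procedures (the micro-services) means a collection's behavior can be changed by editing the rule table alone.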

    7. Federation Approach
       • Use middleware to implement unifying name spaces for:
          • Users – Single sign-on
          • Collections – Directories, workflow, time series
          • Objects – Files, soft links, workflows
          • Storage systems – Cloud, tape, file systems, objects
          • Metadata – Provenance, description, state
          • Policies – Management, assessment
          • Micro-services – Procedures, interactions
       DFC - CNI
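A unifying name space, as listed on slide 7, separates the name a user sees from where the bytes are stored. The sketch below shows that idea for objects and collections only. The `LogicalNamespace` and `Replica` classes, the resource names, and all paths are hypothetical; only the zone name `dfcmain` comes from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Replica:
    storage: str        # hypothetical storage resource name
    physical_path: str  # where the bytes live on that resource


class LogicalNamespace:
    """Maps location-independent logical names to physical replicas."""

    def __init__(self) -> None:
        self._catalog: Dict[str, List[Replica]] = {}

    def register(self, logical_name: str, replica: Replica) -> None:
        self._catalog.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name: str) -> List[Replica]:
        # Return a copy so callers cannot mutate the catalog.
        return list(self._catalog.get(logical_name, []))

    def list_collection(self, collection: str) -> List[str]:
        prefix = collection.rstrip("/") + "/"
        return sorted(name for name in self._catalog if name.startswith(prefix))


ns = LogicalNamespace()
ns.register("/dfcmain/home/demo/run1.nc", Replica("disk-cache", "/vault/a/run1.nc"))
ns.register("/dfcmain/home/demo/run1.nc", Replica("tape-archive", "/tape/0042/blob"))
ns.register("/dfcmain/home/demo/notes.txt", Replica("disk-cache", "/vault/a/notes.txt"))
print(ns.list_collection("/dfcmain/home/demo"))
```

The same object can move between tape, disk, or cloud storage without its logical name changing, which is what lets a collection span storage systems.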

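    8. DFC Federation Hub
       [Diagram: hub zone with federated zones; labels are zone name and port.]
       • Hub – Zone: dfcmain, Port: 1237, iCAT
       • Hub resources – res-bk15, res-dfcmain, demoResc, hydroResc
       • Federated zones:
          • ooi – 1247
          • renci – 1247
          • engineering – 1247
          • hydrology – 2823
          • odumMain – 1247
          • TDLC – 6688
          • dfctest – 1248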
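The zone and port labels on slide 8 amount to a small routing table. The sketch below encodes them and looks up the zone for a logical path. It assumes that a logical path starts with its zone name, and the `route` helper is hypothetical. Host names do not appear on the slide, so none are given.

```python
from typing import Dict, Tuple

# Zone name -> port, copied from the slide (hub zone included).
ZONE_PORTS: Dict[str, int] = {
    "dfcmain": 1237,
    "ooi": 1247,
    "renci": 1247,
    "engineering": 1247,
    "hydrology": 2823,
    "odumMain": 1247,
    "TDLC": 6688,
    "dfctest": 1248,
}


def route(logical_path: str) -> Tuple[str, int]:
    """Return (zone, port) for a path, assuming its first component is the zone."""
    zone = logical_path.strip("/").split("/")[0]
    if zone not in ZONE_PORTS:
        raise KeyError(f"zone {zone!r} is not part of the federation")
    return zone, ZONE_PORTS[zone]


print(route("/hydrology/home/demo/gauge.csv"))
```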

    9. National Infrastructure
       [Diagram labels, by tier:]
       • Existing infrastructure – XSEDE, Kepler, OOI, TDLC, iPlant, CUAHSI, NCDC, Dataverse, GeoBrain, DataONE, NCSA Polyglot
       • Research Environment – Portals, Applications, Workflows
       • DFC Collaboration Environment – Data Grid
       • Community Resource Repository, Community Resource Catalog, Community Resource Services

    10. The Future: Reproducible Research
       • Data sources: Sensors, Simulation, Literature, Archives, Experiments
       • Petabytes, doubling every two years
       • The Challenge: Support reproducible data-driven research
          • Deliver the capability to manage, mine, and publish knowledge through collaboration environments.

    11. National Infrastructure Approach
       • Build national data cyberinfrastructure prototype
          • Support multiple science and engineering domains by loosely coupling their existing infrastructure with a collaboration environment
       • Develop generic interoperability framework
          • Define the generic infrastructure needed for the national infrastructure to manage knowledge as well as data and information
       • Define interoperability mechanisms
          • Support access across the disparate types of infrastructure in common use
       • Define domain specific extensions
          • Support three levels: technical interoperability, project level policy, and end user usage requirements

    12. Interoperability Mechanisms
       • Policies control execution of each interoperability mechanism
       [Diagram labels, in transcript order: Analysis Workflows | Knowledge Creation | Knowledge Procedures: Micro-services | Knowledge Management | Soft Links | Collection Registration | Information | Message Queue | Information Exchange | Database Query | Information Manipulation | Micro-services | Data Access | Data Storage Driver | Data Manipulation]

    13. DataNet Interoperability
       [Diagram labels, in transcript order: Research Environment – Portals, Applications, Workflows | DFC Data Grid | DFC Collaboration Environment | SEAD Portal (VIVO) | Message Queue | Web Service | DataONE Coordinating Node | DataONE Member Node | SEAD Data | TerraPop Server | DFC Data Grid | SEAD Engagement Center]

    14. DFC Interoperability Layers
       (Each entry: layer – examples – mechanism)
       • Authentication – InCommon, GSI, Kerberos, Shibboleth, LDAP – PAM / GSSAPI
       • Data Access – DataONE, Data Conservancy, CUAHSI, NCDC – Micro-Services
       • Data Manipulation – NetCDF, HDF5, THREDDS, ERDDAP – Format Drivers
       • Workflows – Kepler, NCSA Cyberintegrator, Taverna, NCSA Polyglot – Micro-Services
       • Networks – HTTPS, TCP/IP, Parallel TCP/IP, RBUDP – Network Drivers
       • Clients – Web browsers, Web Services, Workflows, FUSE, Synchronization, MediaWiki – OpenSocial
       • Storage Systems – File Systems, Tape Archives, Object Stores, Cloud Storage – Storage Drivers
       • Messaging – AMQP, iRODSXmsg – Micro-Services
       • Vocabulary – HIVE, (Cheshire) – Micro-Services
       • Management – (RDA Policies), (ISO 16363 Criteria) – Policies

    15. Interoperability Mechanisms
       • Drivers
          • Encapsulate knowledge to support your operations at the remote repository: partial I/O, parsing of formats, manipulation of data structures
          • Authentication, format, storage
       • Micro-services
          • Encapsulate knowledge needed to interact with an external system or with a data set using the remote protocol
          • Data access, external workflows, semantics, messaging
       • Policies
          • Encapsulate knowledge needed for management functions
          • Federation control, administrative tasks, validation checks
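The division of labor among the three mechanisms on slide 15 can be sketched as three small pieces of Python: a driver that offers partial I/O, a micro-service that uses the driver to fetch part of a data set, and a policy that applies a validation check to the result. All names here (`Driver`, `InMemoryDriver`, `fetch_header`, `magic_bytes_policy`) are hypothetical, and the `b"CDF"` prefix is only an example value for the check.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Driver(ABC):
    """Driver: hides how a remote repository performs partial I/O."""

    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes:
        ...


class InMemoryDriver(Driver):
    """Toy driver backed by a dict, standing in for a real storage system."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = dict(files)

    def read(self, path: str, offset: int, length: int) -> bytes:
        return self.files[path][offset:offset + length]


def fetch_header(driver: Driver, path: str, length: int = 4) -> bytes:
    """Micro-service: a procedure that retrieves only the first bytes of a data set."""
    return driver.read(path, 0, length)


def magic_bytes_policy(header: bytes, expected: bytes) -> bool:
    """Policy: a validation check deciding whether the object is acceptable."""
    return header.startswith(expected)


driver = InMemoryDriver({"/demo/a.nc": b"CDF\x01rest-of-file", "/demo/b.txt": b"plain text"})
accepted = {
    path: magic_bytes_policy(fetch_header(driver, path), b"CDF")
    for path in ("/demo/a.nc", "/demo/b.txt")
}
print(accepted)
```

Swapping `InMemoryDriver` for another `Driver` subclass changes where the data is read from without touching the micro-service or the policy.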

    16. Assertion
       • Three basic types of interoperability mechanisms are sufficient for assembling national data cyberinfrastructure
       • Example: Linked software defined networks to data grids
          • From an iRODS data grid, controlled the selection of three disjoint network paths for optimizing data transport by adding appropriate policy enforcement points and micro-services
          • Expect functionality currently in data grid middleware to migrate into network middleware

    17. Future Architecture
       [Diagram labels, in transcript order: Clients | Clients | Virtual collection | Data Grid Middleware | Data Grid Middleware | Virtual network | Resources | Network Middleware | DFC Federation | Resources]
       GEMI - GENI

    18. Contacts
       • Reagan W. Moore
       • National Science Foundation Cooperative Agreement: OCI-0940841