

  1. (e)Science-Driven, Production-Quality, Distributed Grid and Cloud Data Infrastructure for the Transformative, Disruptive, Revolutionary, Next-Generation TeraGrid (now with free ponies) Data Architecture Progress Report December 11, 2008 Chris Jordan

  2. Goals for the Data Architecture • Improve the experience of working with data in the TeraGrid for the user community • Reliability, Ease of use, Performance • Integrate data management into the user workflow • Balance performance goals against usability • Avoid overdependence on data location • Support the most common use cases as transparently as possible • Move data in, run job, move data out as basic pattern • Organize, search, and retrieve data from large “collections”
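The last two bullets describe the basic stage-in / compute / stage-out pattern. A minimal sketch of that pattern in Python, assuming scp-reachable hosts and a PBS-style scheduler; every hostname, path, and command here is an illustrative assumption, not part of any TeraGrid specification:

```python
# Minimal sketch of the "move data in, run job, move data out" pattern.
# All hostnames, paths, and the scp/qsub invocations are hypothetical.
import subprocess

def stage(src, dst):
    """Copy a file between resources with scp (a real deployment might
    use globus-url-copy or a WAN file system instead)."""
    subprocess.check_call(["scp", src, dst])

def run_workflow():
    # 1. Move input data onto the compute resource
    stage("archive.example.org:/data/input.dat",
          "compute.example.org:/scratch/input.dat")
    # 2. Submit the job (scheduler command is site-specific)
    subprocess.check_call(["ssh", "compute.example.org", "qsub", "job.pbs"])
    # 3. Move results back out (in practice this runs only after the
    #    job completes; a real workflow would wait on the job first)
    stage("compute.example.org:/scratch/output.dat",
          "archive.example.org:/data/output.dat")

if __name__ == "__main__":
    run_workflow()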

  3. Some Realities • Cannot address the issue of available storage • Limited opportunity to improve data transfer performance at the high end • Cannot introduce drastic changes to TG infrastructure at this stage of the project • Remain dependent on the availability of technology and resources for wide-area file systems

  4. Areas of Effort • Simplify command-line data movement • Extend the reach of WAN file systems • Develop unified data replication and management infrastructure • Extend and unify user portal interfaces to data • Integrate data into scheduling and workflows • Provide common access mechanisms to diverse, distributed data resources

  5. Extending Wide-Area File Systems • A “Wide-Area” file system is available on multiple resources • A “Global” file system is available on all TeraGrid resources • Indiana and SDSC each have a WAN-FS in production now • PSC has promising technology for distributed storage and Kerberos integration, but it needs testing to understand best management practices • Point of emphasis: going to production

  6. Data Capacitor-WAN (DC-WAN) • IU has this in production on BigRed, PSC Pople • Can be mounted on any cluster running Lustre 1.4 or Lustre 1.6 • Ready for testing and move to production • Sites and resources committed: • TACC Lonestar, Ranger, Spur • NCSA Abe, possibly Cobalt and/or Mercury • LONI Queen Bee (testing, possible production) • Purdue Steele? • This presentation is an excellent opportunity to add your site to this list.
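Since DC-WAN is a mounted file system rather than a transfer service, site scripts can test for it before relying on it. A small sketch of that check, where the mount point /N/dcwan is an assumption for illustration (the real path is site-specific):

```python
# Hedged sketch: verify that the DC-WAN mount point is an active mount
# on this node before staging data onto it.
import os
import sys

DCWAN_MOUNT = "/N/dcwan"  # hypothetical mount point; varies by site

def dcwan_available():
    """True if the path is an active mount, not just an empty directory."""
    return os.path.ismount(DCWAN_MOUNT)

if __name__ == "__main__":
    if not dcwan_available():
        sys.exit("DC-WAN not mounted here; fall back to GridFTP staging.")
    print("DC-WAN mounted at", DCWAN_MOUNT)
```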

  7. PSC “Josephine-WAN” • Two major new design features: • Kerberos-based identity mapping • Distributed data and metadata • Kerberos is likely to work well out of the box • Distributed data/“storage pools” will need careful configuration and management • The technology is working well, but needs to be actively investigated and tested in various configurations • Want to work on integration with the TG User Portal (a purely illustrative sketch of the identity-mapping idea follows)
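To make the identity-mapping bullet concrete, here is a purely illustrative Python sketch of translating a Kerberos principal to a local account; the map contents, file-less table format, and the TERAGRID.ORG realm usage are all invented for illustration, not PSC's actual mechanism:

```python
# Hypothetical sketch of Kerberos-based identity mapping: resolve a
# Kerberos principal to a local username before granting FS access.
PRINCIPAL_MAP = {  # invented entries for illustration only
    "cjordan@TERAGRID.ORG": "chris",
    "jdoe@TERAGRID.ORG": "jdoe",
}

def local_user(principal):
    """Return the local username mapped to a Kerberos principal."""
    try:
        return PRINCIPAL_MAP[principal]
    except KeyError:
        raise PermissionError("no identity mapping for %s" % principal)

print(local_user("cjordan@TERAGRID.ORG"))  # -> chris
```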

  8. Getting to Global • No single file system technology will be compatible/feasible to deploy on every system • Will require hybrid solutions • TGUP helps, but … • Need to understand the limit on simultaneous mounts, and … • Once production DC-WAN reaches the technical limit, look at technologies to extend the FS: • pNFS • FUSE/SSHFS
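A sketch of the FUSE/SSHFS fallback named above: mounting a remote file system over SSH on a system where a native Lustre client is not feasible. The host, remote path, and mount point are illustrative assumptions, and the sshfs and fusermount binaries must already be installed:

```python
# Sketch of extending FS reach via SSHFS where Lustre cannot be mounted.
import subprocess

def sshfs_mount(remote, mountpoint):
    """Mount remote (user@host:/path) at mountpoint via SSHFS;
    -o reconnect re-establishes the SSH session after drops."""
    subprocess.check_call(["sshfs", "-o", "reconnect", remote, mountpoint])

def sshfs_umount(mountpoint):
    """Unmount a FUSE mount (fusermount -u on Linux)."""
    subprocess.check_call(["fusermount", "-u", mountpoint])

if __name__ == "__main__":
    sshfs_mount("user@dc-wan.example.org:/dcwan", "/mnt/dcwan")
```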

  9. Command-line tools • Many users are still oriented towards shell access • GridFTP is complicated to use via globus-url-copy • Long URLs, many (often inconsistent) options • SSH/SCP is almost universally available and familiar to users • Limited usefulness for data transfer in current configuration • Simple changes to SSH/SCP configuration: • Support SCP-based access to data mover nodes • Support simpler addressing of data resources (sketched below) • Provide resource-specific “default” configuration
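One way to picture the “simpler addressing” bullet: expand a short resource alias into the long gsiftp URL that globus-url-copy expects, so users type something scp-like. globus-url-copy is the real tool; the alias table, hostnames, and port are assumptions for illustration:

```python
# Hedged sketch: alias expansion in front of globus-url-copy, so
# "ranger:/scratch/..." replaces a long hand-typed gsiftp URL.
import subprocess

ALIASES = {  # hypothetical alias -> GridFTP endpoint prefix
    "ranger": "gsiftp://gridftp.ranger.example.org:2811",
    "bigred": "gsiftp://gridftp.bigred.example.org:2811",
}

def expand(spec):
    """Turn 'alias:/path' into a full gsiftp URL; treat anything
    without a known alias prefix as a local file path."""
    if ":" in spec and spec.split(":", 1)[0] in ALIASES:
        alias, path = spec.split(":", 1)
        return ALIASES[alias] + path
    return "file://" + spec

def copy(src, dst):
    subprocess.check_call(["globus-url-copy", expand(src), expand(dst)])

copy("/home/user/input.dat", "ranger:/scratch/user/input.dat")
```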

  10. Unified Data Management • Management of both data and metadata, distributed across storage resources • Multiple sites support data collections using SRB, iRODS, databases, web services, etc. • This diversity is good in the abstract, but also confusing to new users • Extend current iRODS-based data management infrastructure to additional sites • Expand REDDNET “cloud storage” availability • Integrate access to as many collections as possible through the User Portal
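For the iRODS-based piece, scripted access typically goes through the standard i-commands. A small sketch, assuming the i-commands are installed and iinit has already been run; the collection path is a made-up example:

```python
# Sketch of scripted collection access via the iRODS i-commands.
import subprocess

COLLECTION = "/teragridZone/home/projects/climate"  # hypothetical path

def put(local_path):
    """Upload a local file into the shared collection with iput."""
    subprocess.check_call(["iput", local_path, COLLECTION])

def get(object_name, local_dir="."):
    """Download a data object from the collection with iget."""
    subprocess.check_call(["iget", COLLECTION + "/" + object_name, local_dir])

put("results.nc")
get("results.nc", "/tmp")
```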

  11. Interfaces to Data • SSH and “ls” are not effective interfaces to large, complex datasets • Portal and Gateway interfaces to data have proven useful and popular, but: • They may not be able to access all resources, and may require significant gateway developer effort • Extend the user portal to support WAN file systems and distributed data management • Possible to expose the user portal and other APIs to ease development of gateways?
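To make the closing question concrete, here is a purely hypothetical sketch of the kind of portal-exposed data API a gateway could call instead of scripting SSH and ls. The endpoint, URL shape, and JSON fields are all invented; no such TGUP API existed at the time of this talk:

```python
# Purely hypothetical sketch: a gateway listing a user's files on a
# remote resource through an HTTP API exposed by the user portal.
import json
import urllib.request

PORTAL = "https://portal.example.org/api"  # invented endpoint

def list_files(resource, path, token):
    url = "%s/files?resource=%s&path=%s" % (PORTAL, resource, path)
    req = urllib.request.Request(
        url, headers={"Authorization": "Bearer " + token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # assumed shape: [{"name":..., "size":...}]

for entry in list_files("dc-wan", "/projects/climate", "TOKEN"):
    print(entry["name"], entry["size"])
```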

  12. Integrating Data into Workflows • Almost all tasks run on the TeraGrid require some data management and multiple storage resources • Users should be able to include these steps as part of a job or workflow submission • Port DMOVER to additional schedulers, deploy across TeraGrid • Working on BigBen, ready for Kraken • Working on SGE and LSF • Evaluate PetaShare and other “data scheduling” systems (Stork?)
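A sketch of what folding data movement into a submission looks like, in the spirit of DMOVER: generating a PBS script whose stagein/stageout directives move data around the compute step. The hostnames, paths, and resource request are illustrative, and the exact -W stagein/stageout syntax varies by site and scheduler:

```python
# Sketch: generate a PBS job script that declares its own data staging,
# so the scheduler handles stage-in before and stage-out after the run.
SCRIPT = """#!/bin/bash
#PBS -l nodes=4,walltime=01:00:00
#PBS -W stagein=/scratch/$USER/input.dat@archive.example.org:/data/input.dat
#PBS -W stageout=/scratch/$USER/output.dat@archive.example.org:/data/output.dat
cd /scratch/$USER
./simulate input.dat output.dat
"""

with open("job.pbs", "w") as f:
    f.write(SCRIPT)
print("wrote job.pbs; submit with: qsub job.pbs")
```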

  13. Gratuitous end slide #42 • Data-WG has many attendees, but few participants • We need: • More sites committed to deploying DC-WAN in production • More sites committed to testing “Josephine-WAN” • More sites contributing to Data Collections infrastructure • Help porting DMOVER, testing PetaShare and REDDNET • Users and projects to exercise the infrastructure • Select one or more • If not you, who? If not now, when?
