Guiding ideas of the FACIT Storage Architecture
Library of Congress Storage Vendor Meeting, Sept 18, 2007
Terry Moore, University of Tennessee, Knoxville
Larry Carver, University of California at Santa Barbara
What is FACIT?
• FACIT – Federated Archive Cyberinfrastructure Testbed
• FACIT is a project of the National Geospatial Digital Archive (an NDIIPP partner)
• Goal of FACIT: create a testbed for trying out a different approach to federated resource sharing, redundancy, and access
• FACIT partners:
  • NGDA (UCSB and Stanford)
  • Logistical Networking (UTK) – network storage technology
  • REDDnet (Vanderbilt) – NSF-funded infrastructure using LN for data-intensive collaboration
Typical design perspective
• [Timeline diagram: "The archive begins today" – action is taken now, on recent content, with the design looking ahead to now + 100 years]
FACIT design perspective
• [Timeline diagram: a "mid-century perspective" – action is taken now on content, old content, and ancient content reaching back to now - 50, with the design looking ahead to now + 50]
What the archivist in the middle sees
• Repeated migrations across storage media and storage systems
  • past and future – 20 to 30+ over a century
• Repeated migrations across archive systems
  • each possibly necessitating transformation and reorganization of archived content
• Repeated handoffs between institutions
  • each implementing different policies
• How can we create a "handoff process" that can be sustained?
  • Design for interoperability and deployment scalability first
• How do you do that?
Generic storage stack
• [Stack diagram: applications and services layered above a common interface, storage technologies beneath it]
• The common interface "virtualizes" the technology beneath it
• What interface goes here?
• Issue: whatever you choose will become the basis for storage interoperability for adopters
• LN hypothesis: do it the way the network people did it (see the sketch below)
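A minimal sketch of what such a deliberately thin common interface could look like, assuming a toy in-memory depot with made-up operation names (allocate/store/load) rather than the actual LN/IBP API: only time-limited, best-effort buffer management and coarse-grained byte transfer, with no higher-level semantics.

```python
import time


class ToyDepot:
    """Toy in-memory stand-in for a storage provider exposing only generic operations.
    Hypothetical sketch; not the real IBP depot interface."""

    def __init__(self):
        self._allocs = {}    # alloc_id -> {"data": bytearray, "expires": float}
        self._next_id = 0

    def allocate(self, capacity, duration_s):
        """Reserve raw space for a limited time; the provider may reclaim it afterwards."""
        self._next_id += 1
        self._allocs[self._next_id] = {
            "data": bytearray(capacity),
            "expires": time.time() + duration_s,
        }
        return self._next_id

    def store(self, alloc_id, offset, payload):
        """Write raw bytes; the depot attaches no meaning to them."""
        self._allocs[alloc_id]["data"][offset:offset + len(payload)] = payload

    def load(self, alloc_id, offset, length):
        """Read raw bytes back."""
        return bytes(self._allocs[alloc_id]["data"][offset:offset + length])
```

The point of keeping the interface this small is the one the next slide makes: everything the depot does not do can be layered on top by clients.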
"Bits are bits" infrastructure for storage
• One infrastructure serves all
• Standardize on what we have an adequate common model for
  • Storage/buffer management
  • Coarse-grained data transfer
• Leave everything else to higher layers (see the sketch below)
  • End-to-end services: checksums, encryption, error encoding, etc.
  • Enable autonomy in wide-area service creation: security, resource allocation, QoS guarantees…
• Gain the benefits of interoperability today!
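A small illustration of the end-to-end point above, assuming nothing more than an opaque byte store (a plain dict stands in for a depot): the integrity check belongs to the client, so the storage layer never needs to know about it.

```python
import hashlib

depot = {}                                       # opaque byte store: alloc_id -> bytes
payload = b"archival bits are just bits"

depot["alloc-1"] = payload                       # coarse-grained transfer into storage
expected = hashlib.sha256(payload).hexdigest()   # end-to-end integrity, kept by the client

retrieved = depot["alloc-1"]                     # later: read the bytes back
assert hashlib.sha256(retrieved).hexdigest() == expected, "end-to-end check failed"
print("integrity verified without the storage layer knowing anything about it")
```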
Basic elements of the LN stack
• The exNode: a metadata container
  • Modeled on the Unix inode: bit-level structure, control keys, …
  • XML encoded (see the sketch below)
• A highly generic, "best effort" protocol for using storage (the IBP depot protocol)
  • Generic → doesn't restrict applications
  • "Best effort" → low burden on providers
  • Easy to port and deploy
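The sketch below builds an illustrative exNode-style XML document in Python. The element and attribute names (exnode, mapping, read_key) and the example depot hostnames are invented for the illustration, not the actual exNode schema; the idea shown is inode-like mappings from logical byte ranges to allocations on IBP depots, plus the keys needed to use them.

```python
import xml.etree.ElementTree as ET

exnode = ET.Element("exnode", {"name": "scan-0001.tiff", "length": "300"})

# Like inode block pointers: each mapping ties a byte range of the logical
# file to an allocation on some IBP depot, plus the control key needed to use it.
for depot, offset in [("depot.utk.example:6714", 0),
                      ("depot.ucsb.example:6714", 100),
                      ("depot.stanford.example:6714", 200)]:
    ET.SubElement(exnode, "mapping", {
        "depot": depot,
        "logical_offset": str(offset),
        "length": "100",
        "read_key": "placeholder-key",   # capability/control key for this allocation
    })

print(ET.tostring(exnode, encoding="unicode"))
```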
Sample exNodes
• [Diagram: exNodes for data objects A, B, and C, with byte ranges 0–300 striped across IBP depots at Tennessee, UCSB, Stanford, and REDDnet, connected by the network]
• Question: where is data object C? (see the sketch below)
• Crossing administrative domains, sharing resources
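Answering "where is data object C?" is a pure metadata operation: walk the exNode's mappings and collect the depots whose byte ranges overlap the range you want. The mapping layout here is the illustrative one from the previous sketch, not the real exNode schema.

```python
def locate(mappings, offset, length):
    """Return the depots holding any part of the byte range [offset, offset + length)."""
    depots = []
    for m in mappings:
        start = m["logical_offset"]
        end = start + m["length"]
        if start < offset + length and offset < end:   # byte ranges overlap
            depots.append(m["depot"])
    return depots


mappings = [
    {"depot": "depot.utk.example:6714",      "logical_offset": 0,   "length": 100},
    {"depot": "depot.ucsb.example:6714",     "logical_offset": 100, "length": 100},
    {"depot": "depot.stanford.example:6714", "logical_offset": 200, "length": 100},
]

print(locate(mappings, offset=200, length=100))   # -> ['depot.stanford.example:6714']
```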
New federation members?
• [Diagram: an LoC depot joining the Tennessee, UCSB, Stanford, and REDDnet depots on the IBP depot network]
• Add new depots
• Copy the data
• Rewrite the exNodes (see the sketch below)
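A sketch of those three steps as one copy-plus-metadata operation, under the assumption of the illustrative mapping layout used above; the transfer callback and the depot hostnames (including the LoC one) are stand-ins, not real endpoints or real LN tools.

```python
import copy


def replicate_into(exnode_mappings, new_depot, transfer):
    """Copy every allocation to new_depot and return the rewritten mapping list."""
    new_mappings = list(exnode_mappings)
    for m in exnode_mappings:
        transfer(m["depot"], new_depot, m["logical_offset"], m["length"])  # copy the data
        replica = copy.deepcopy(m)                                         # rewrite the exNode:
        replica["depot"] = new_depot                                       # same byte range,
        new_mappings.append(replica)                                       # new location
    return new_mappings


# Toy transfer callback; a real one would actually move bytes between depots.
log = []
updated = replicate_into(
    [{"depot": "depot.utk.example:6714", "logical_offset": 0, "length": 100}],
    "depot.loc.example:6714",
    lambda src, dst, off, ln: log.append((src, dst, off, ln)),
)
print(log)
print(updated)
```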
LN file download
• [Diagram: an exNode holding 4 copies of a file drives a download from IBP depots over multiple TCP streams (sketched below)]
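A simplified sketch of such a replica-aware download, assuming made-up helper names and toy stand-ins for the network calls: each byte range is fetched concurrently (standing in for the parallel TCP streams) from whichever replica responds.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(depot, offset, length):
    """Stand-in for pulling one allocation's bytes from an IBP depot over TCP."""
    return bytes(length)  # toy payload


def download(ranges):
    """ranges: list of (offset, length, [replica depots]) covering the whole file."""
    def one_range(r):
        offset, length, replicas = r
        for depot in replicas:                 # fall back to another copy on failure
            try:
                return offset, fetch(depot, offset, length)
            except OSError:
                continue
        raise OSError(f"no replica reachable for offset {offset}")

    with ThreadPoolExecutor(max_workers=4) as pool:
        chunks = sorted(pool.map(one_range, ranges))
    return b"".join(data for _, data in chunks)


blob = download([(0, 100, ["depot.utk.example:6714", "depot.loc.example:6714"]),
                 (100, 100, ["depot.ucsb.example:6714", "depot.loc.example:6714"])])
print(len(blob))   # -> 200
```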
REDDnet depot unit: COTS
• Dual-core 2.4 GHz AMD 64 X2 processor with 4 GB of memory
• 4 × 750 GB SATA2 drives in hot-swap bays
• Dual GigE NICs
• OS stored on a USB-header-mounted TransFlash drive, so all disk drives are available for use
• >$700 per TB
• "But there's so much it doesn't do!" True.
• Question: how much can we do in software on top? E.g. checksums, error encoding, encryption, etc.