OceanStore: An Infrastructure for Global-Scale Persistent Storage. John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon, Westley Weimer, Chris Wells, Ben Zhao.
A few slides have been borrowed from the authors’ presentations
Source: Berkeley OceanStore Website
What is OceanStore?
Where is persistent information stored?
20th-century tie between location and content outdated
In a world-scale system, locality is key
How is it protected?
Can a disgruntled ISP employee sell your secrets?
Can’t trust anyone (how paranoid are you?)
Can we make it indestructible?
Want our data to survive “the big one”!
Highly resistant to hackers (denial of service)
Wide-scale disaster recovery
Is it hard to manage?
Worst failures are human-related
Want automatic (introspective) diagnosis and repair
Mark Weiser from Xerox: Transparent computing is the ultimate goal. Computers should disappear into the background
In the context of storage:
Don’t want to worry about backup
Don’t want to worry about obsolescence
Need lots of resources to make data secure and highly available, BUT don’t want to own them
Outsourcing of storage already becoming popular
Pay monthly fee and your “data is out there”
Service provided by confederation of companies
Monthly fee paid to one service provider
Companies buy and sell capacity from each other
Group calendar, contacts
Distributed design tools
Computer Supported Cooperative Work
A small number of servers may crash or leak information
most of the servers function correctly
a financially “responsible party” for the servers ensures integrity
but only clients trusted with cleartext
data divorced from location
flows freely within the storage infrastructure
promiscuous caching: “anywhere, anytime”
location important for performance
dynamic system tuning through introspection
The Bayou System (Xerox PARC) is a platform of replicated, highly available, variable-consistency databases on which collaborative applications can be built. It caters to portable devices with intermittent connections.
Simple parity bits, or generalized Reed-Solomon codes, can be used to implement it.
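The simpler of the two options, a single parity fragment, can be sketched as follows. This is only an illustration of XOR parity, not OceanStore's actual code: the function names `make_parity` and `recover_missing`, the equal-length fragments, and the sample data are all assumptions made for the example. A single parity fragment tolerates the loss of exactly one fragment; Reed-Solomon codes generalize the idea to multiple losses.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(fragments: list[bytes]) -> bytes:
    """Compute one parity fragment as the XOR of all data fragments."""
    return reduce(xor_bytes, fragments)


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing fragment from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity)


# Hypothetical data split into three equal-sized fragments.
fragments = [b"ocea", b"nsto", b"re!!"]
parity = make_parity(fragments)

# Simulate losing the middle fragment and rebuilding it.
recovered = recover_missing([fragments[0], fragments[2]], parity)
assert recovered == fragments[1]
```

Because XOR is its own inverse, XOR-ing the parity with every surviving fragment cancels them out and leaves only the missing one.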