
NeST: Network Storage



Presentation Transcript


  1. NeST: Network Storage Flexible Commodity Storage Appliances John Bent, Miron Livny, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau

  2. Terms • Appliance (Merriam-Webster) • b : an instrument or device designed for a particular use; specifically a household or office device • Storage appliance • Storage plus access methods

  3. What storage users want • Reliability and availability • Manageability • cost of management > cost of storage itself • “no futz” computing • Scalability • Performance

  4. What storage vendors have • NetApp, EMC, others make storage appliances (network-attached storage) • Manageable • Just plug it in and it works • Administrative web interface • Reliable and available • Standard RAID techniques • High performance • Specialized, thin OS focused on serving files

  5. What storage vendors get: annual revenues • NetApp: $800 million in 2000 • EMC: $9 billion in 2000

  6. What’s the problem? • False coupling between HW and SW • “Playground syndrome” • Myth of specialization

  7. H/W and S/W are bundled • Hardware decisions are imposed • Hard to ride commodity curve • Example: • NetApp F720 • $35,000.00, 252 GB • about $139 / GB • Maxtor DiamondMax • $279.00, 80 GB • about $3.50 / GB
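The per-gigabyte comparison on this slide is a single division, sketched below so the size of the gap is explicit. The prices and capacities are the slide's own; the function name `cost_per_gb` is only illustrative.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the price of one gigabyte in US dollars."""
    return price_usd / capacity_gb


# Figures taken from the slide above.
netapp = cost_per_gb(35_000.00, 252)   # NetApp F720
maxtor = cost_per_gb(279.00, 80)       # Maxtor DiamondMax

# How many times more the bundled appliance costs per gigabyte.
premium = netapp / maxtor

print(f"NetApp F720:       ${netapp:.2f} per GB")
print(f"Maxtor DiamondMax: ${maxtor:.2f} per GB")
print(f"Appliance premium: roughly {premium:.0f}x")
```

The comparison covers raw capacity only; it leaves out the RAID, management and support that the appliance price includes.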

  8. “Playground syndrome” • “We have storage appliances . . . • if you use these protocols, • if you use these security mechanisms, • if you are comfortable with our data semantics” • Non-flexible software entity

  9. Myth of specialization • Specialize for one protocol on one machine • Specialization decreases over time as • Protocols are added • Product line expands • Example: NetApp software • Generation 1 fit on a single floppy • Generation 2 took six • Generation 3?

  10. Alternatives? • Appliance (Merriam-Webster) • a : a piece of equipment for adapting a tool or machine to a special purpose

  11. Our game? • Flexible, commodity based, software-only storage appliances • Goal • Find a networked machine • “Drop” some software on it • Have a ready to use storage appliance with flexible mechanisms

  12. New worlds, new problems • Diverse hardware, software platforms • NetApp, EMC advantage • fewer platforms, control over OS • Our approach • Automate configuration to each host system • Hardware example - use file system or self-manage • Software example - use either read/write or mmap • Cost of flexibility • Key is design of the software

  13. Outline • Introduction • Building flexible storage modules • Big picture • Protocol layer • Concurrency architecture • Storage layer • Motivations for flexible storage appliances • Conclusion and current status

  14. NeST structure • Cleanly separated modules for communication, transfer and storage • Protocol layer • Maps diverse protocols into common control flows • Concurrency architectures • Different models to maximize system throughput • Storage layer • Provides abstract interface to disks

  15. NeST structure (diagram) • Protocol layer: GFTP, NeST, WiND, HTTP, NFS • Central control (transfer request) • Concurrency architecture: event driven, multi-process, multi-threaded • Storage layer: raw disk, local FS, RAID

  16. Protocol layer (diagram: separate NFSd and HTTPd servers on an operating system, compared with a single NeST serving NFS and HTTP on an operating system) • A collection of servers is less than the sum of their parts.

  17. Consolidate protocols • Single point of control • Storage quotas and guarantees can be supported across multiple protocols. • Bandwidth can be controlled and quality of service can be guaranteed. • Single administrative interface • Set policies • Manage user accounts
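A minimal sketch of the "single point of control" idea: every protocol front end charges the same quota table, so a limit holds no matter which protocol a user writes through. The class and function names (`QuotaManager`, `put`) and the byte limits are hypothetical and not taken from NeST.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a user past the storage limit."""


class QuotaManager:
    """One quota table shared by all protocol front ends (hypothetical)."""

    def __init__(self, limits):
        self._limits = dict(limits)   # user -> maximum bytes
        self._used = {}               # user -> bytes currently charged

    def charge(self, user, nbytes):
        used = self._used.get(user, 0)
        limit = self._limits.get(user, 0)
        if used + nbytes > limit:
            raise QuotaExceeded(f"{user}: {used + nbytes} > {limit} bytes")
        self._used[user] = used + nbytes

    def used(self, user):
        return self._used.get(user, 0)


def put(quota, protocol, user, data):
    """Store data on behalf of any protocol; the quota check is shared."""
    quota.charge(user, len(data))
    return f"{protocol}: stored {len(data)} bytes for {user}"


quota = QuotaManager({"alice": 100})
print(put(quota, "http", "alice", b"x" * 60))
try:
    put(quota, "nfs", "alice", b"x" * 60)   # same user, different protocol
except QuotaExceeded as err:
    print("rejected:", err)
```

With one daemon per protocol, each server would need its own accounting or some external coordination to reach the same result.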

  18. Protocol layer implementation • Each protocol listens on well-defined port • Central control accepts connections • Protocol layer reads from connection and returns generic request object • Like Linux V-nodes • Add new protocol by writing a couple of methods
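The slide's claim that a new protocol needs only "a couple of methods" can be sketched as follows. This is an illustration under assumptions, not NeST's actual interface: the `Request` shape, the two method names, and the simplified FTP-like and HTTP-like command syntaxes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    """Protocol-independent request handed to central control (assumed shape)."""
    op: str          # "list" or "get" in this sketch
    path: str = ""


class Protocol:
    """A new protocol is added by supplying these two methods."""

    def parse(self, line: str) -> Optional[Request]:
        raise NotImplementedError

    def format_listing(self, names: List[str]) -> str:
        raise NotImplementedError


class FtpLike(Protocol):
    # Simplified command syntax for illustration; not a full FTP server.
    def parse(self, line):
        verb, _, arg = line.strip().partition(" ")
        op = {"LIST": "list", "RETR": "get"}.get(verb.upper())
        return Request(op, arg) if op else None

    def format_listing(self, names):
        return "".join(f"{name}\r\n" for name in names)


class HttpLike(Protocol):
    # Simplified request line for illustration; "GET /" is treated as a listing.
    def parse(self, line):
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "GET":
            return None
        path = parts[1].lstrip("/")
        return Request("get", path) if path else Request("list")

    def format_listing(self, names):
        return "\n".join(names)


def handle(protocol: Protocol, line: str, storage: Dict[str, str]) -> str:
    """Central control: one code path regardless of the wire protocol."""
    request = protocol.parse(line)
    if request is None:
        return "unsupported request"
    if request.op == "list":
        return protocol.format_listing(sorted(storage))
    if request.op == "get":
        return storage.get(request.path, "not found")
    return "unsupported request"


files = {"b.txt": "B", "a.txt": "A"}
print(repr(handle(FtpLike(), "LIST", files)))
print(repr(handle(HttpLike(), "GET / HTTP/1.0", files)))
```

Only `parse` and `format_listing` differ per protocol; `handle` and the storage lookup are shared, which is the point of the generic request object.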

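  19. Protocol layer example: directory list request (diagram) • FTP sends “31: LIST”, NeST speak sends “5” • Protocol layer turns both into a directory list request for central control • Storage layer returns a linked list • Protocol layer formats the linked list for each protocol: “ftp, ftp, ftp” or “nest, nest”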

  20. Concurrency architecture • Three difficult goals • Low latency • High bandwidth • Multiple simultaneous clients • No single portable solution • Provide multiple models to provide solutions on a range of different platforms • Multi-threaded • Multi-process • Event driven

  21. Concurrency architecture • Central control creates transfer object • Socket descriptor from the protocol layer • File descriptor from the storage layer • Transfer object passed to concurrency architecture (event driven, multi-process or multi-threaded)
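The hand-off described above can be sketched as a transfer object that is independent of the model that executes it. This is an assumption-laden illustration: `Transfer`, `Sequential` and `MultiThreaded` are hypothetical names, in-memory byte streams stand in for the socket and file descriptors, and only a thread-based model plus a sequential baseline are shown (the multi-process and event-driven models are omitted).

```python
import io
import shutil
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import BinaryIO, List


@dataclass
class Transfer:
    """Pairs a data source with a data sink (hypothetical shape)."""
    source: BinaryIO   # stands in for the storage layer's file descriptor
    sink: BinaryIO     # stands in for the protocol layer's socket descriptor

    def run(self) -> None:
        shutil.copyfileobj(self.source, self.sink)


class Sequential:
    """Baseline model: one transfer at a time."""

    def run_all(self, transfers: List[Transfer]) -> None:
        for transfer in transfers:
            transfer.run()


class MultiThreaded:
    """Thread-pool model: transfers run concurrently."""

    def __init__(self, workers: int = 4):
        self.workers = workers

    def run_all(self, transfers: List[Transfer]) -> None:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # list() waits for completion and re-raises any worker exception.
            list(pool.map(Transfer.run, transfers))


def make_transfers(payloads):
    return [Transfer(io.BytesIO(p), io.BytesIO()) for p in payloads]


payloads = [b"alpha" * 100, b"beta" * 100, b"gamma" * 100]
for model in (Sequential(), MultiThreaded()):
    batch = make_transfers(payloads)
    model.run_all(batch)
    ok = all(t.sink.getvalue() == p for t, p in zip(batch, payloads))
    print(type(model).__name__, "copied everything:", ok)
```

Because central control only builds `Transfer` objects, the model can be swapped per platform without touching the protocol or storage layers.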

  22. Concurrency on Linux

  23. Storage layer • Three needed areas of flexibility • File system interfaces • Example: read()/write() or mmap() • Abstract storage models • RAID, JBOD, etc. • User account administration • Creation and removal • Quotas and guarantees for users and groups
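The read()/write() versus mmap() choice can be hidden behind one function so the rest of the system does not care which is used. The sketch below is illustrative only; the function names and the `READERS` table are hypothetical, and a real storage layer would pick the method by measuring the host rather than by a string argument.

```python
import mmap
import os
import tempfile


def read_with_read(path: str, chunk_size: int = 64 * 1024) -> bytes:
    """Read a whole file with explicit read() calls."""
    parts = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parts.append(chunk)
    return b"".join(parts)


def read_with_mmap(path: str) -> bytes:
    """Read a whole file by mapping it into memory."""
    if os.path.getsize(path) == 0:
        return b""   # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]


# Table of interchangeable access methods (hypothetical selection mechanism).
READERS = {"read": read_with_read, "mmap": read_with_mmap}


def read_file(path: str, method: str = "read") -> bytes:
    return READERS[method](path)


# Usage: both methods return identical bytes for the same file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"nest" * 1000)
os.close(fd)
print(read_file(demo_path, "read") == read_file(demo_path, "mmap"))
os.remove(demo_path)
```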

  24. File system interfaces on Linux

  25. Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Conclusion and current status

  26. Clients have different needs • Communication protocols • Replacement costs • Data semantics • Security and authentication

  27. Communication protocols • The Esperanto problem • Too many protocols to implement them all • Too many clients use proprietary protocols • Storage must allow pluggable protocols.

  28. Replacement costs • Infinite cost to replace first class data. • Variable cost to replace cached data depending on size and distance. • Variable cost to replace job output files depending on computation cost. • (Diagram labels: first class data, cheap cached files) • Cost-aware storage can effectively increase its own capacity.
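One way to read "cost-aware storage" is as an eviction policy: when space is needed, give up the files that are cheapest to recreate and never touch first class data. The sketch below assumes a single numeric `replacement_cost` per file and a cost-per-byte ordering; both are illustrative choices, not something the slides specify.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class StoredFile:
    name: str
    size: int                 # bytes
    replacement_cost: float   # math.inf marks first class data


def choose_evictions(files: List[StoredFile], bytes_needed: int) -> List[StoredFile]:
    """Pick files to evict, cheapest replacement cost per byte first.

    First class data (infinite cost) is never chosen. Returns an empty
    list if the request cannot be met from replaceable files alone.
    """
    candidates = sorted(
        (f for f in files if math.isfinite(f.replacement_cost) and f.size > 0),
        key=lambda f: f.replacement_cost / f.size,
    )
    chosen, freed = [], 0
    for f in candidates:
        if freed >= bytes_needed:
            break
        chosen.append(f)
        freed += f.size
    return chosen if freed >= bytes_needed else []


files = [
    StoredFile("thesis.tex", 100, math.inf),   # first class data
    StoredFile("mirror.iso", 50, 1.0),         # cached copy, cheap to refetch
    StoredFile("job42.out", 50, 100.0),        # output of a long computation
]
print([f.name for f in choose_evictions(files, 40)])
```

Space held by cheap cached files is in effect reclaimable, which is how such a policy could "increase" usable capacity.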

  29. Data semantics • Must stored objects be protected from read and write dependencies? • Is transaction support necessary? • Acceptable replies to storage requests.

  30. Data semantics, example • Problem • PFS on top of FTP fakes open • read may then return file not found • Solution • Mechanisms are needed to support flexible semantics independent of the transfer protocol. • Divorce semantics from the protocol.

  31. Security and authentication • Ownership • Privacy • Encryption • Authentication • Access rights

  32. Who, when, how and how much? • (Diagram: spectrum from promiscuous to abstinent) • Who is allowed to use the storage? • Promiscuity and monogamy are easy • Polygamy is also easy

  33. Do I know you? • Problem • Migrant grid users may need temporary, preferential storage access • Solution • Provide mechanisms to • advertise available storage • create self-destructing user accounts • Matchmake applications with storage opportunities.
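A self-destructing account can be modelled as a lease: the account carries an expiry time and disappears the first time it is looked up after that time. This is a minimal sketch under that assumption; `AccountTable`, its methods and the injectable clock are hypothetical names, not NeST's mechanism.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Account:
    user: str
    quota_bytes: int
    expires_at: float   # seconds, on the same scale as the table's clock


class AccountTable:
    """Temporary accounts that vanish once their lifetime has passed."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._accounts: Dict[str, Account] = {}

    def create_temporary(self, user: str, quota_bytes: int, lifetime_s: float) -> Account:
        account = Account(user, quota_bytes, self._clock() + lifetime_s)
        self._accounts[user] = account
        return account

    def lookup(self, user: str) -> Optional[Account]:
        account = self._accounts.get(user)
        if account is not None and self._clock() >= account.expires_at:
            del self._accounts[user]   # the account destroys itself on expiry
            return None
        return account


# Usage with a fake clock so expiry can be shown without waiting.
now = [0.0]
table = AccountTable(clock=lambda: now[0])
table.create_temporary("visitor", quota_bytes=10_000_000, lifetime_s=60)
print(table.lookup("visitor") is not None)   # still valid
now[0] = 61.0
print(table.lookup("visitor") is None)       # expired and removed
```

A fuller version would also reclaim the files owned by the expired account; that step is left out here.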

  34. Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Conclusion • Current status • Future work • Concluding remarks

  35. Current status • Concurrency architectures are done • Gets, puts, reads and writes perform well • Virtual protocol class interface is built • NeST speak is fully implemented • GridFTP coming soon • Simple first implementation of storage reservations and remote quota management is done • Venkateshwaran Venkataramani

  36. Future work • Discovery process of client storage requirements • Quality of service guarantees for bandwidth and storage • Support for transient and opportunistic users

  37. Concluding remarks • Return storage to the commodity curve by creating software-only storage appliances • Allow greater storage flexibility for a wide range of application needs
