
Research Opportunities in IP Wide Area Storage

George Porter, Li Yin. Department of EECS, U.C. Berkeley.

Presentation Transcript


  1. Research Opportunities in IP Wide Area Storage • George Porter, Li Yin • Department of EECS, U.C. Berkeley • SAHARA Retreat

  2. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  3. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  4. Storage Wide-Area Networking • SAN technology works well in a relatively small (metro-wide) area • There is a desire to implement storage applications in the wide area • Goal: performance comparable to small-area storage applications • [Figure label: Metro Area]

  5. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  6. Challenges in Wide-Area Storage • Speed of light is constant • Long distance implies propagation delay • Network dynamics • Variation in cross-traffic load • Route changes • Increasing storage capacity • Huge amounts of data must be transmitted across the wide area

  7. Challenges in Wide-Area Storage • Simple operation: host writes data to the remote target disk • Send data to the remote target • Write disk • Where is the bottleneck? • Disk? • Network link? • Distance? • Performance degradation • [Figure: timeline of the local disk write operation time plus the extra delay caused by the network]

  8. Challenges in Wide-Area Storage • Three Cases: • Case 1: Limited link bandwidth • Case 2: Small data set with high bandwidth • Case 3: Large data set with high bandwidth

  9. Challenges in Wide-Area Storage • Case 1: Limited link bandwidth • [Figure: timelines of local disk write operation time and transmission time; the extra delay caused by the network grows as more data is transmitted]

  10. Challenges in Wide-Area Storage • Case 1: Limited link bandwidth • As more data is transmitted: • The performance degradation caused by the transmission delay gets larger • Propagation delay does not matter • As disks get faster, more bandwidth is required to shift the bottleneck away from the network

  11. Challenges in Wide-Area Storage • Case 2: Small data set with high link bandwidth • In this case, the throughput is very sensitive to the distance, especially when the propagation delay becomes of the order of the disk latency • [Figure: timelines of local disk write operation time at increasing distance]

  12. Challenges in Wide-Area Storage • Case 3: Large data set with high link bandwidth • [Figure: timelines of local disk write operation time and the extra delay caused by the network as more data is transmitted]

  13. Challenges in Wide-Area Storage • Case 3: Large data set with high link bandwidth • In this case the disk is the bottleneck; the network only introduces the propagation delay, which becomes negligible as more data is transmitted • As disks get faster, more bandwidth is required to keep the bottleneck away from the network

  14. Challenges in Wide-Area Storage • Where is the bottleneck? • Link bandwidth • Size of data to be transmitted • Disk speed • The key issue in wide-area storage is how to reduce latency • Latency introduced by the network • Latency introduced by the storage
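
The three cases on slides 8 to 13 can be illustrated with a small model. This is only a sketch under assumptions that are not in the slides: the write is streamed, so the slower of link and disk sets the rate; the network adds one round trip of propagation delay; signals travel at about 200,000 km/s in fiber; and the disk has a fixed 5 ms latency. All function names and numbers are hypothetical.

```python
# Assumed propagation speed in fiber, roughly two thirds of the speed of light.
SIGNAL_SPEED_KM_S = 200_000.0


def local_write_time(size_mb, disk_mb_s, disk_latency_s=0.005):
    """Time to write size_mb to a local disk (fixed latency plus transfer)."""
    return disk_latency_s + size_mb / disk_mb_s


def remote_write_time(size_mb, disk_mb_s, link_mb_s, distance_km,
                      disk_latency_s=0.005):
    """Time to write size_mb to a remote disk under the assumed streaming model."""
    round_trip_s = 2.0 * distance_km / SIGNAL_SPEED_KM_S
    # The slower of link and disk limits the streaming rate.
    bottleneck_mb_s = min(disk_mb_s, link_mb_s)
    return round_trip_s + disk_latency_s + size_mb / bottleneck_mb_s


def slowdown(size_mb, disk_mb_s, link_mb_s, distance_km):
    """Remote write time divided by local write time."""
    remote = remote_write_time(size_mb, disk_mb_s, link_mb_s, distance_km)
    return remote / local_write_time(size_mb, disk_mb_s)


if __name__ == "__main__":
    cases = {
        "Case 1: limited link bandwidth": (100.0, 50.0, 10.0, 1000.0),
        "Case 2: small data, high bandwidth": (0.01, 50.0, 1000.0, 1000.0),
        "Case 3: large data, high bandwidth": (10_000.0, 50.0, 1000.0, 1000.0),
    }
    for name, args in cases.items():
        print(f"{name}: slowdown x{slowdown(*args):.3f}")
```

Under these assumptions, Case 1 is dominated by transmission time, Case 2 by propagation delay, and Case 3 by the disk, matching the qualitative argument in the slides.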

  15. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  16. Common Techniques to Hide Latency • Caching • Parallelism • Pipelining • Prefetching • … • Where and how to implement these techniques for wide-area storage applications?
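
As one illustration of these techniques, pipelining can be sketched as a two-stage model in which the link transfers chunk k+1 while the disk writes chunk k. This is a minimal sketch, assuming fixed per-chunk network and disk times; the function names are hypothetical and nothing here is taken from the authors' system.

```python
def serial_time(n_chunks, net_s, disk_s):
    """Each chunk is sent and then written before the next one is sent."""
    return n_chunks * (net_s + disk_s)


def pipelined_time(n_chunks, net_s, disk_s):
    """The link sends chunks back to back; the disk writes each chunk as soon
    as it has arrived and the previous write has finished."""
    net_done = 0.0   # time at which the latest chunk finished arriving
    disk_done = 0.0  # time at which the latest chunk finished being written
    for _ in range(n_chunks):
        net_done += net_s
        disk_done = max(disk_done, net_done) + disk_s
    return disk_done


if __name__ == "__main__":
    for net_s, disk_s in [(2.0, 1.0), (1.0, 2.0)]:
        print(net_s, disk_s,
              serial_time(4, net_s, disk_s),
              pipelined_time(4, net_s, disk_s))
```

In this model the pipelined time approaches the number of chunks times the slower stage, so the faster stage is effectively hidden.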

  17. Code at edge vs. in the fabric • Location of data is separated from use of data • Idea is to put processing near the data it acts on • Better visibility into network conditions and dynamics • Big performance gains if we can act on streams of data in the datapath • Network processors are more powerful today • A good match?

  18. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  19. Gather • Synchronous: digital animation editing; large dataset visualization • Asynchronous: N-to-1 disk copies (KaZaa); recreate dataset from multiple sources/disks (scientific experiment); restore backup

  20. Gather (synchronous) • Applications: digital animation editing; large dataset visualization • Techniques: caching; parallelism; prefetching • Network primitives: FS semantic information; store block location state in router; view into network routes/conditions; table lookup in router; modify disk requests to point to correct locations; join data streams to deliver coherent data to app; orthogonal path selection
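
The "table lookup in router" and "modify disk requests to point to correct locations" primitives could look roughly like the following. This is a hypothetical sketch, not the authors' design: `BlockRequest`, `BlockLocationTable`, and the address strings are invented for illustration, and a real router would act on SCSI or iSCSI packets rather than Python objects.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BlockRequest:
    """A simplified disk request (hypothetical format)."""
    volume: str
    block: int
    target: str  # address the host originally addressed


class BlockLocationTable:
    """Block location state kept in the router (assumed data structure)."""

    def __init__(self):
        self._locations = {}

    def record(self, volume, block, target):
        """Remember where a copy of (volume, block) currently lives."""
        self._locations[(volume, block)] = target

    def redirect(self, request):
        """Rewrite the request's target if a known location exists."""
        location = self._locations.get((request.volume, request.block))
        if location is None:
            return request  # no state for this block: forward unchanged
        return replace(request, target=location)


if __name__ == "__main__":
    table = BlockLocationTable()
    table.record("vol0", 42, "cache.example:3260")
    print(table.redirect(BlockRequest("vol0", 42, "origin.example:3260")))
    print(table.redirect(BlockRequest("vol0", 43, "origin.example:3260")))
```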

  21. Gather • Synchronous: digital animation editing; large dataset visualization • Asynchronous: N-to-1 disk copies (KaZaa); recreate dataset from multiple sources/disks (scientific experiment); restore backup

  22. Gather (asynchronous) • Applications: N-to-1 disk copies (KaZaa); recreate dataset from multiple sources/disks (scientific experiment); restore backup • Techniques: pipelining; avoid congestion/optimize for bandwidth • Network primitives: FS semantic information; store block location state in router; view into network routes/conditions; table lookup in router; modify disk requests to point to correct locations; join data streams to deliver coherent data to app; orthogonal path selection; volume state in routers; replicate SCSI requests; reorder SCSI responses
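
The "reorder SCSI responses" and "join data streams" primitives amount to buffering responses that arrive out of order from several sources and releasing them in sequence. Below is a minimal sketch, assuming each response carries a sequence number starting at 0; the `(seq, payload)` tuple format is an illustrative assumption, not a SCSI structure.

```python
def reorder(responses):
    """Yield (seq, payload) pairs in sequence order.

    `responses` is an iterable of (seq, payload) pairs that may arrive in
    any order. Pairs are held back until every earlier sequence number has
    been delivered.
    """
    pending = {}
    next_seq = 0
    for seq, payload in responses:
        pending[seq] = payload
        # Release the longest in-order run that is now complete.
        while next_seq in pending:
            yield next_seq, pending.pop(next_seq)
            next_seq += 1


if __name__ == "__main__":
    arrivals = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
    for seq, payload in reorder(arrivals):
        print(seq, payload)
```

The buffer grows with the amount of reordering, which is one reason the question of what state is reasonable to hold in a router matters.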

  23. Scatter • Synchronous: state dissemination; CDN/web server updating?; gaming?; updating mapping tables • Asynchronous: disaster-recovery application; experimental data unloading

  24. Scatter (synchronous) • Applications: state dissemination; CDN/web server updating?; gaming?; updating mapping tables • Techniques: delay-sensitive path selection; congestion avoidance; synchronization • Network primitives: network monitoring; FS semantic information; store block location state in router; view into network routes/conditions; table lookup in router; …

  25. Scatter • Synchronous: state dissemination; CDN/web server updating?; gaming?; updating mapping tables • Asynchronous: disaster-recovery application; experimental data unloading

  26. Scatter (asynchronous) • Applications: disaster-recovery application; experimental data unloading • Techniques: disk location/selection; load balancing; physical distance knowledge • Network primitives: network monitoring; FS semantic information; store block location state in router; view into network routes/conditions; table lookup in router; …
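
Disk selection that combines load balancing with physical distance knowledge could be sketched as choosing the target with the lowest estimated completion time. This is an illustrative model only: the `Target` fields, the 200,000 km/s propagation speed, and the cost formula are assumptions, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A candidate remote disk (hypothetical description)."""
    name: str
    distance_km: float  # physical distance knowledge
    queued_mb: float    # outstanding work, used for load balancing
    disk_mb_s: float    # sustained disk bandwidth


def estimated_completion_s(target, size_mb, signal_speed_km_s=200_000.0):
    """Round-trip propagation delay plus time to drain the queue and the write."""
    propagation_s = 2.0 * target.distance_km / signal_speed_km_s
    service_s = (target.queued_mb + size_mb) / target.disk_mb_s
    return propagation_s + service_s


def select_target(targets, size_mb):
    """Pick the target with the smallest estimated completion time."""
    return min(targets, key=lambda t: estimated_completion_s(t, size_mb))


if __name__ == "__main__":
    targets = [
        Target("near-busy", distance_km=50.0, queued_mb=500.0, disk_mb_s=50.0),
        Target("far-idle", distance_km=3000.0, queued_mb=0.0, disk_mb_s=50.0),
    ]
    print(select_target(targets, size_mb=100.0).name)
```

With these example numbers the distant idle disk wins for a large write, while for small writes to equally idle disks the nearer one is preferred because propagation delay dominates.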

  27. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  28. Useful Network Primitives: What is reasonable and possible? • FS semantic information • Store block location state in router • View into network routes/conditions • Table lookup in router • Modify disk requests to point to correct locations • Join data streams to deliver coherent data to app • Orthogonal path selection • Volume state in routers • Replicate SCSI requests • Reorder SCSI responses • Others?

  29. Outline • Trends • Challenges for wide-area storage • Programmability inside networks • Common techniques to hide latency • Functionality that will benefit applications • Network Support for that functionality • Reconsidering the programmability model and application space • Feedback

  30. Your Feedback? SAHARA Retreat
