StarFish: highly-available block storage

Computer Science, 3rd year: 李益昌 (B00902051)

Computer Science, 3rd year: 何柏勳 (B00902097)


Introduction

  • Data protection

    • Disk failure vs. catastrophic site failure

  • Low price of disk drives and high-speed networking infrastructure

  • StarFish

    • Survive catastrophic site failure

    • Uses an IP network: (1) geographically dispersed, (2) inexpensive

    • Good performance

    • Block level


Architecture

  • One Host Element (HE)

    • Provides storage virtualization and a read cache

  • N Storage Elements (SEs)

    • Q: write quorum size.

    • Synchronous updates to a quorum of Q SEs, and asynchronous updates to the rest.

  • The HE and SEs communicate via TCP/IP over a high-speed network
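The quorum write on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StorageElement`, `quorum_write`, the in-memory block map, and the simulated delays are all hypothetical names and values, not part of StarFish itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


class StorageElement:
    """Hypothetical in-memory SE; delay_s simulates network plus disk latency."""

    def __init__(self, name, delay_s):
        self.name = name
        self.delay_s = delay_s
        self.blocks = {}

    def write(self, block_no, data):
        time.sleep(self.delay_s)
        self.blocks[block_no] = data
        return self.name


def quorum_write(pool, ses, q, block_no, data):
    """Send the write to every SE and return once q of them have acknowledged.

    The writes that have not finished yet keep running in the pool; they play
    the role of the asynchronous updates to the remaining SEs.
    """
    futures = [pool.submit(se.write, block_no, data) for se in ses]
    acked = []
    for fut in as_completed(futures):
        acked.append(fut.result())
        if len(acked) >= q:
            break
    return acked


# N = 3, Q = 2: two nearby SEs and one far-away SE (delays are made up).
ses = [
    StorageElement("local", 0.0),
    StorageElement("near", 0.01),
    StorageElement("far", 0.2),
]
with ThreadPoolExecutor(max_workers=len(ses)) as pool:
    acked = quorum_write(pool, ses, q=2, block_no=7, data=b"data")
    print("write acknowledged by:", acked)
    # Leaving the with-block waits for the remaining asynchronous update.
```

In this toy setup the caller is unblocked after the two fast SEs answer, while the slow SE still receives the block in the background.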


Architecture

  • Recommended configuration

    • N=3 Q=2


Architecture

  • Another configuration


Data consistency and SE recovery

  • Log

    • Sequence numbers

    • NVRAM

  • Data consistency

  • Failure

    • RAID or network connection fails

  • SE recovery

    • Quick recovery

    • Replay recovery

    • Full recovery
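The slide does not say when each recovery mode applies, so the following Python sketch only illustrates the general idea of recovering from a sequence-numbered log. The rule it uses (replay when the log still holds every missed write, otherwise copy everything) is an assumption made for illustration, and `recover` and its parameters are hypothetical names.

```python
def recover(se_blocks, last_applied, log, full_image, latest_seq):
    """Bring a lagging SE up to date and report which path was taken.

    se_blocks    : dict block_no -> data held by the recovering SE (mutated)
    last_applied : highest sequence number the SE has already applied
    log          : list of (seq, block_no, data), in increasing seq order
    full_image   : dict block_no -> data, a complete up-to-date copy
    latest_seq   : sequence number of the most recent write in the system
    """
    if last_applied >= latest_seq:
        return "none"  # nothing was missed

    missed = [entry for entry in log if entry[0] > last_applied]
    expected = list(range(last_applied + 1, latest_seq + 1))

    if [seq for seq, _, _ in missed] == expected:
        # The log covers every missed write: apply them in sequence order.
        for _, block_no, data in missed:
            se_blocks[block_no] = data
        return "replay"

    # The log has a gap (older entries were discarded): copy the whole image.
    se_blocks.clear()
    se_blocks.update(full_image)
    return "full"


log = [(3, 0, b"c"), (4, 1, b"d"), (5, 0, b"e")]
image = {0: b"e", 1: b"d", 2: b"z"}

slightly_behind = {0: b"c", 1: b"old", 2: b"z"}
far_behind = {0: b"a"}

print(recover(slightly_behind, 3, log, image, latest_seq=5))
print(recover(far_behind, 1, log, image, latest_seq=5))
```

Replaying only the missed writes moves far less data than a full copy, which is why a log that survives failures (for example in NVRAM) matters.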


Availability and reliability analysis

  • Parameter

    • SE failure process: rate λ

    • SE recovery process: rate μ

    • Number of SEs: N

    • Quorum size: Q

  • Model

    • SE failures are i.i.d. Poisson processes with mean rate λ

    • SE recoveries are i.i.d. Poisson processes with mean rate μ

    • HE failure is a Poisson process with mean rate λ_HE

    • HE recovery is a Poisson process with mean rate μ_HE


Availability

  • An HE or SE is available if it can serve data

  • Availability of StarFish A(Q, N): the steady-state probability that at least Q SEs are available

  • ρ = λ/μ is called the load, where λ and μ are the SE failure and recovery rates

  • Repairman model
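The definition of A(Q, N) can be made concrete with a short calculation. The sketch below assumes that SEs fail and recover independently, so the number of available SEs is binomial, and that each SE follows the usual two-state up/down model with steady-state availability 1/(1+ρ). The function names are hypothetical.

```python
from math import comb


def se_availability(failure_rate, recovery_rate):
    """Steady-state availability of one SE in a two-state up/down model."""
    rho = failure_rate / recovery_rate  # the load
    return 1 / (1 + rho)


def availability(q, n, a):
    """P(at least q of n SEs are available), each independently up with prob. a."""
    return sum(comb(n, k) * a**k * (1 - a) ** (n - k) for k in range(q, n + 1))


a = se_availability(failure_rate=1.0, recovery_rate=99.0)  # 0.99 per SE
for q in (1, 2, 3):
    print(f"N=3, Q={q}: A = {availability(q, 3, a):.6f}")
```

The rates 1.0 and 99.0 are arbitrary values chosen to give a per-SE availability of 0.99.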



Availability(cont.)

  • SE availability = 1 - ρ/(1+ρ) = 1/(1+ρ), with load ρ = λ/μ

  • X★9: the number of 9s in an availability measure

  • For fixed N, availability decreases as Q grows

    • Trade off availability for reliability
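Counting 9s is easy to get wrong with floating point, because `1 - 0.999` is not exactly `0.001`. The sketch below uses `Decimal` to avoid that. The helper name `nines` is hypothetical. The sample values are the probabilities that at least Q of 3 SEs are up when each SE is independently available 99% of the time, which is an assumed model, not a figure from the slides.

```python
from decimal import Decimal


def nines(availability):
    """Count the leading 9s of an availability in [0, 1), e.g. 0.999 -> 3."""
    a = Decimal(str(availability))
    if not (0 <= a < 1):
        raise ValueError("availability must be in [0, 1)")
    unavailability = 1 - a
    count = 0
    # Each factor of 10 that still leaves the unavailability <= 1 is one more 9.
    while unavailability * 10 <= 1:
        unavailability *= 10
        count += 1
    return count


# N = 3, per-SE availability 0.99, independent SEs (assumed):
for q, a in ((1, "0.999999"), (2, "0.999702"), (3, "0.970299")):
    print(f"Q={q}: availability {a} has {nines(a)} nines")
```

Under these assumptions, raising Q from 1 to 3 takes the system from six 9s down to one.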


Reliability

  • Probability of data loss

    • The HE and Q SEs fail

  • Reliability increases with larger Q

  • Two approaches

    • Q > floor(N/2) and at least Q SEs are available

      • Reduces availability

    • Read-only consistency


Read-only consistency

  • The system stays available in read-only mode during failures.

    • Read-only mode obviates the need for Q SEs to be available to handle updates.

    • Increases availability
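The effect of read-only consistency on what the system can still do is shown by the small decision function below. It assumes, as a simplification, that any single reachable SE with current data is enough to serve reads. `service_mode` and `is_majority_quorum` are hypothetical names.

```python
def is_majority_quorum(q, n):
    """True when the write quorum is a strict majority: Q > floor(N/2)."""
    return q > n // 2


def service_mode(available_ses, q):
    """Which operations the system can serve with this many reachable SEs."""
    if available_ses >= q:
        return "read-write"   # a full write quorum is reachable
    if available_ses >= 1:
        return "read-only"    # too few SEs for updates, but reads still work
    return "unavailable"


print(is_majority_quorum(2, 3))
for up in (3, 2, 1, 0):
    print(f"{up} SE(s) up with Q=2: {service_mode(up, 2)}")
```

Without the read-only fallback, the case with one SE up would count as unavailable, which is where the availability gain comes from.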





Setting

  • Gigabit Ethernet (GbE), with dummynet controlling delays and bandwidth limits to model Internet links

  • Different network delays

    • 1, 2, 4, 8, 23, 36, 65 ms

  • Different bandwidth limitations

    • 31, 51, 62, 93, 124 Mb/s

  • Benchmark

    • Micro-benchmark

    • PostMark


Effects of network delays and HE cache size

  • Larger cache improves performance

  • Larger cache doesn’t change the response time of write requests





Observation

  • Performance is affected by two parameters

    • Write quorum size Q

    • Delay to the SE

  • StarFish performs adequately when one of the SEs is placed in a remote location

    • At least 85% of the performance of a direct-attached RAID
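A toy latency model shows why Q and the SE delays interact this way. It assumes a synchronous write completes as soon as the Q fastest SEs have acknowledged, and it ignores bandwidth, disk time, and HE processing. The function name is hypothetical, and the delays are taken from the tested set of 1, 8, and 65 ms.

```python
def write_latency_ms(se_delays_ms, q):
    """Latency of a quorum write under the toy model: the q-th smallest SE delay."""
    if not 1 <= q <= len(se_delays_ms):
        raise ValueError("q must be between 1 and the number of SEs")
    return sorted(se_delays_ms)[q - 1]


# Two local SEs and one remote SE, N = 3.
delays = [1, 1, 65]
print(write_latency_ms(delays, q=2))  # the remote SE is off the critical path
print(write_latency_ms(delays, q=3))  # every write now waits for the remote SE

# One local, one intermediate, one remote SE.
print(write_latency_ms([1, 8, 65], q=2))
```

In this model a remote SE only slows writes when it is needed to complete the quorum, which matches the observation that one remote SE is tolerable when Q is smaller than N.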


Recovery

  • Performance degrades more during full recovery


Conclusion

  • The StarFish system reveals significant benefits from a third copy of data at an intermediate distance

  • A StarFish system with 3 replicas, a write quorum size of 2, and read-only consistency yields better than 99.9999% availability assuming individual Storage Element availability of 99%

