Distributed Storage System Survey

Distributed Storage System Survey Yang Kun

Agenda • 1. History of DSS • 2. Definition & Terminology • 3. Basic Factors • 4. DSS Common Design • 5. Basic Theories • 6. Popular Algorithms • 7. Replication Strategies • 8. Implementations • 9. Open Source & Business

History of DSS • Network File System (1980s)

History of DSS • Storage Area Network (SAN) File System (1990s)

History of DSS • Object-oriented parallel file system (2000s)

History of DSS • Cloud Storage

Definition & Terminology • Transparency • network-transparency, user-mobility • Performance Measurement • The amount of time needed to satisfy service requests. • The performance should be comparable to that of a conventional file system.

Definition & Terminology • Fault Tolerance: Communication faults, machine failures (of the fail-stop type), storage device crashes, and decay of storage media. • Scalability: A scalable system should react more gracefully to increased load. • Its performance should degrade more moderately than that of a non-scalable system. • Its resources should reach a saturated state later than those of a non-scalable system.

Definition & Terminology • Consistency: There must exist a total order on all operations such that each operation looks as if it were completed at a single instant. • Availability: Every request received by a non-failing node in the system must result in a response. • Reliability

Basic Factors • Location Transparency • User Mobility • Security • Performance • Scalability • Availability • Fault Tolerance

DSS Common Design • Client: Writing • Client: Reading

Basic Theories • CAP Theory • ACID vs. BASE Model • Quorum NRW

CAP Theory

CAP Theory • In a network subject to partitions (under both the asynchronous and the partially synchronous model), it is impossible for a web service to guarantee consistency, availability, and partition tolerance at the same time. • Consistency • Availability • Partition tolerance

CAP Theory • CP: All data lives on a single node, and the other nodes read from and write to that node. • CA: Database systems • AP: Every request returns a value, even if that value may be stale. • Cassandra = A + P + eventual consistency

ACID vs. BASE Model

Quorum NRW • N: The number of replicas, that is, how many copies are kept of each data object. • R: The minimum number of replicas that must respond for a read operation to count as successful. • W: The minimum number of replicas that must acknowledge for a write operation to count as successful. • Together, the three factors determine availability, consistency, and fault tolerance. Strong consistency can be guaranteed only if W + R > N.
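The W + R > N rule can be checked mechanically. The following is a minimal sketch, not taken from the slides; the function name `quorum_properties` and the dictionary keys are illustrative assumptions.

```python
def quorum_properties(n: int, r: int, w: int) -> dict:
    """Summarise what an (N, R, W) quorum configuration provides.

    Hypothetical helper for illustration only.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # Every read quorum shares at least one replica with every write quorum.
        "strong_consistency": r + w > n,
        # Replicas that may be down while a read can still gather R responses.
        "read_failures_tolerated": n - r,
        # Replicas that may be down while a write can still gather W acks.
        "write_failures_tolerated": n - w,
    }


balanced = quorum_properties(n=3, r=2, w=2)   # overlapping quorums
fast = quorum_properties(n=3, r=1, w=1)       # quorums may miss each other
```

With N = 3, choosing R = W = 2 gives overlapping quorums and tolerates one failed replica for both reads and writes, while R = W = 1 tolerates two failures but a read may miss the latest write.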

Popular Algorithms • Paxos Algorithm • Roles: Proposer, Acceptor, Learner • Phases: Prepare, Accept, Learn

PAXOS
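As a rough illustration of the roles and phases, here is a minimal single-decree Paxos sketch that runs in one process with no network, no failures, and no retries. The class and function names (`Acceptor`, `propose`) are assumptions made for this example, and the learner is reduced to the return value of `propose`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                           # highest proposal number promised
    accepted: Optional[Tuple[int, str]] = None   # (number, value) last accepted

    def prepare(self, n: int):
        """Prepare phase: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n: int, value: str) -> bool:
        """Accept phase: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one proposer round; return the chosen value, or None if the round fails."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    granted = [prior for ok, prior in replies if ok]
    if len(granted) < majority:
        return None
    # If any acceptor already accepted a value, adopt the highest-numbered one.
    previous = [p for p in granted if p is not None]
    if previous:
        value = max(previous, key=lambda p: p[0])[1]
    votes = sum(a.accept(n, value) for a in acceptors)
    # A learner would be informed once a majority has accepted the value.
    return value if votes >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "x")    # "x" is chosen
second = propose(acceptors, 2, "y")   # still "x": a chosen value is never replaced
stale = propose(acceptors, 1, "z")    # None: proposal number is too old
```

The second round shows the safety property: a later proposer with a different value must adopt the value that was already accepted.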

Popular Algorithms • Consistent Hashing
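A minimal consistent-hashing ring can be sketched as follows. This is an illustration under assumptions, not the design of any particular system: the `HashRing` class, the use of SHA-256, and the default of 100 virtual nodes per server are all choices made for the example, and hash collisions between ring positions are ignored.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(text):
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")

    def add(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-c")
after = {k: ring.lookup(k) for k in keys}
```

Only the keys that were owned by the removed node move; every other key keeps its owner, which is the property that distinguishes consistent hashing from `hash(key) % n`.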

Popular Algorithms • Mutual Exclusion Algorithms • Lamport Algorithm (3*(n - 1) messages) • Improved Lamport Algorithm (3*(n - 1) messages) • Ricart–Agrawala Algorithm (2*(n - 1) messages) • Maekawa Algorithm • Roucairol-Carvalho Algorithm

Popular Algorithms • Election Algorithms • Chang-Roberts Algorithm (O(n log n) messages on average) • Garcia-Molina's Bully Algorithm • Non-comparison-based Algorithms
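To make the ring-election idea concrete, here is a small simulation of the Chang-Roberts algorithm on a unidirectional ring, assuming unique process ids, reliable delivery, and that every process starts the election at once. The function name `chang_roberts` and the round-based delivery are simplifications for this sketch.

```python
def chang_roberts(ids):
    """Simulate Chang-Roberts on a ring; return (leader_id, messages_sent).

    ids[i] is the id of process i, and process i sends to process (i + 1) % n.
    """
    n = len(ids)
    if n == 0 or len(set(ids)) != n:
        raise ValueError("ids must be non-empty and unique")
    # in_flight[i] holds the candidate ids about to be delivered to process i.
    in_flight = [[] for _ in range(n)]
    messages = 0
    for i, pid in enumerate(ids):            # every process nominates itself
        in_flight[(i + 1) % n].append(pid)
        messages += 1
    while True:
        next_round = [[] for _ in range(n)]
        for i, inbox in enumerate(in_flight):
            for candidate in inbox:
                if candidate == ids[i]:
                    return candidate, messages   # own id travelled the whole ring
                if candidate > ids[i]:
                    next_round[(i + 1) % n].append(candidate)   # forward larger ids
                    messages += 1
                # smaller ids are dropped
        in_flight = next_round


typical = chang_roberts([3, 1, 4, 2])   # (4, 8)
worst = chang_roberts([4, 3, 2, 1])     # (4, 10), i.e. n(n + 1)/2 messages
```

The largest id always wins. The message count depends on how the ids are arranged around the ring, which is why the O(n log n) figure is an average and the worst case is quadratic.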

Popular Algorithms • Bidding Algorithms • Self-Stabilization Algorithms

Replication Strategies • Asynchronous Master/Slave Replication: Log appends are acknowledged at the master in parallel with transmission to slaves. (Does not support ACID) • Synchronous Master/Slave Replication: A master waits for changes to be mirrored to slaves before acknowledging them. (Needs timely failure detection) • Optimistic Replication: Any member of a homogeneous replica group can accept mutations. (The order is not known, so transactions are impossible)

Chain Replication
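In chain replication as commonly described, writes enter at the head, are forwarded node by node, and are acknowledged by the tail, which also serves reads. The sketch below illustrates that flow in a single process with no failures or reconfiguration; `ChainNode` and `build_chain` are hypothetical names for this example.

```python
class ChainNode:
    """One replica in the chain; `next` is None for the tail."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.next = None

    def write(self, key, value):
        """Apply the update locally, then forward it down the chain."""
        self.store[key] = value
        if self.next is not None:
            return self.next.write(key, value)
        return f"ack from {self.name}"   # only the tail acknowledges the client

    def read(self, key):
        """Reads are served by the tail, which holds only fully replicated writes."""
        return self.store.get(key)


def build_chain(names):
    """Link the nodes in order and return (head, tail)."""
    nodes = [ChainNode(n) for n in names]
    for current, successor in zip(nodes, nodes[1:]):
        current.next = successor
    return nodes[0], nodes[-1]


head, tail = build_chain(["head", "middle", "tail"])
ack = head.write("k", "v1")
value = tail.read("k")
```

Because the tail acknowledges only after every replica has applied the update, any value read from the tail is already stored on all nodes in the chain.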

CRAQ • Chain Replication with Apportioned Queries

Funnel Replication • Topology • Vector Clock • Total Order • Write Request (key, value, vector clock, originating head replica)
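The vector clock carried in each write request can be illustrated with a generic implementation. This is a sketch of vector clocks in general, not of Funnel Replication's exact bookkeeping; the function names and the dictionary representation are assumptions.

```python
def vc_increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def vc_merge(a, b):
    """Element-wise maximum, used when a replica learns of another's history."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def vc_compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    nodes = a.keys() | b.keys()
    le = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    ge = all(a.get(n, 0) >= b.get(n, 0) for n in nodes)
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"


w1 = vc_increment({}, "A")   # {"A": 1}: first write at replica A
w2 = vc_increment(w1, "A")   # {"A": 2}: causally after w1
w3 = vc_increment(w1, "B")   # {"A": 1, "B": 1}: concurrent with w2
merged = vc_merge(w2, w3)    # {"A": 2, "B": 1}
```

Vector clocks give only a partial order: `w2` and `w3` are concurrent, so a system that needs a total order has to add a tie-breaking rule on top.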

Atomic Commit Protocol • Two-Phase Commit (2PC) • 1. Voting phase: The coordinator requests all participating sites to prepare to commit. • 2. Decision phase: The coordinator either commits the transaction if all participants are prepared to commit (voted “yes”), or aborts the transaction if any participant has decided to abort (voted “no”).
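The two phases can be sketched in a few lines. This is a failure-free, in-process illustration only: there are no timeouts, logging, or recovery, and the `Participant` class and `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """A participating site that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        """Voting phase: vote yes (True) or no (False)."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: collect all votes, then broadcast one decision."""
    votes = [p.prepare() for p in participants]   # phase 1: voting
    if all(votes):                                # phase 2: decision
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


ok_sites = [Participant("s1"), Participant("s2")]
ok_result = two_phase_commit(ok_sites)

mixed_sites = [Participant("s1"), Participant("s2", can_commit=False)]
mixed_result = two_phase_commit(mixed_sites)
```

A single "no" vote is enough to abort everywhere, including at sites that had already voted "yes".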

Atomic Commit Protocol • Presumed Abort Protocol: Designed to reduce the cost associated with aborting transactions. • Presumed Commit Protocol: Designed to reduce the cost associated with committing transactions by interpreting missing information about transactions as commit decisions. • One-PC: The One-Phase Commit protocol consists of only a single phase, which is the decision phase of 2PC. • One-Two-PC

Implementations • BigTable • Windows Azure Storage • Google MegaStore • Chubby

Open Source & Business

Thank you!!!
