
A Simple and Scalable Distributed File System
Dennis Fetterly, Maya Haridasan, and Michael Isard
Microsoft Research – Silicon Valley Lab



Presentation Transcript


  1. A Simple and Scalable Distributed File System
     Dennis Fetterly, Maya Haridasan, and Michael Isard
     Microsoft Research – Silicon Valley Lab

     Design Goals
     • A simple, fault-tolerant, distributed filesystem that provides the abstractions necessary for data-parallel computations on HPC clusters
     • High-performance, reliable, scalable service
     • Prototypical workload: high throughput, sequential I/O, write once
     • Cluster machines working in parallel
     • Configurable number of replicas per dataset

     Names
     • Stream: a sequence of partitions, e.g. tidyfs://dryadlinqusers/fetterly/clueweb09-English
       • Can have leases for temp files or cleanup from app crashes
     • Partition:
       • Immutable
       • 64-bit identifier
       • Can be a member of multiple streams
       • Stored as an NTFS file on cluster nodes
       • Clients directly access partitions using standard APIs for performance
       • Multiple replicas of each partition can be stored

     Attributes
     • Streams have metadata
       • Lease time, replication factor, fingerprint, size, creation time
     • Partitions have attributes
       • Fingerprint, size
     • User-defined attributes and metadata
       • Key-value pairs associated with a stream or partition
       • Currently support string, UInt64, and blob values

     Metadata Server
     • Contains metadata for the system
       • Maps streams to partitions
       • Maps partitions (NTFS file or dir, SQL table) to data path
       • Contains per-stream metadata and per-partition attributes
       • Maintains machine state
     • Replicated for scalability and fault tolerance
     • Separate implementations utilizing SQL or RSL
       • RSL: Replicated State Library, an implementation of the Paxos consensus algorithm

     Node Service
     • Garbage collection
       • Delete partitions that have been removed from the TidyFS server
       • Verify that the machine has all partitions expected by the TidyFS server, to ensure the correct replica count
     • Load balancing
       • TidyFS server assigns partition replicas to a machine
       • Machine replicates the partition to its local filesystem
       • Easy to change policies
     • Validation
       • Validate the checksum of stored partitions

     Example Uses
     • Distributed computations using Dryad or DryadLINQ, e.g. Terasort
       • 240 machines reading at 240 MB/s = 56 GB/s
       • 240 machines writing at 160 MB/s = 37 GB/s
     • Replicate data partitions among machines for fault-tolerant storage

     [Architecture diagram: a client reads and writes partitions (p1, p2, p3, ..., pn) directly on the storage nodes, where the partitions are replicated, and gets and sets stream and partition metadata on the metadata servers, which hold the metadata for streams, partitions, nodes, etc. The example stream shown is tidyfs://dryadlinqusers/fetterly/clueweb09-English.]
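The Node Service garbage collection described on the poster (deleting partitions the TidyFS server no longer tracks, and confirming the machine holds every partition the server expects) can be read as a comparison of two sets of partition identifiers. The following is a minimal Python sketch under that assumption, not TidyFS code: the names `Partition`, `Stream`, and `reconcile` are hypothetical, and the poster does not describe the real service's interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Partition:
    """Hypothetical model of a partition: immutable, with a 64-bit identifier."""
    partition_id: int
    size: int
    fingerprint: int

    def __post_init__(self):
        # The poster says identifiers are 64 bits wide; enforce that here.
        if not 0 <= self.partition_id < 2**64:
            raise ValueError("partition id must fit in 64 bits")


@dataclass
class Stream:
    """Hypothetical model of a stream: a named sequence of partition ids."""
    name: str
    partition_ids: list = field(default_factory=list)
    # Key-value metadata such as lease time or replication factor.
    metadata: dict = field(default_factory=dict)


def reconcile(expected: set, local: set):
    """Compare the ids the server expects on a node with the ids found locally.

    Returns (to_delete, to_replicate):
      to_delete    - stored locally but no longer expected (garbage)
      to_replicate - expected but missing locally (replica count too low)
    """
    return local - expected, expected - local


# Example: the server expects partitions 1, 2, 3; the node holds 2, 3, 7.
stream = Stream(
    name="tidyfs://dryadlinqusers/fetterly/clueweb09-English",
    partition_ids=[1, 2, 3],
    metadata={"replication factor": 3},
)
to_delete, to_replicate = reconcile(set(stream.partition_ids), {2, 3, 7})
print(sorted(to_delete), sorted(to_replicate))  # prints: [7] [1]
```

Keeping the comparison as a pure function of two sets is one way to keep replication and cleanup policies easy to change, since the policy code only decides what to do with the two result sets.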
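The aggregate Terasort throughput figures on the poster can be checked with a short calculation. This sketch assumes the poster converts with 1 GB = 1024 MB and rounds down; that convention is an inference from the numbers, not something the poster states. With 1 GB = 1000 MB the same inputs would give 57.6 and 38.4 GB/s.

```python
MACHINES = 240
READ_MB_PER_S = 240    # per-machine read rate from the poster
WRITE_MB_PER_S = 160   # per-machine write rate from the poster


def aggregate_gb_per_s(machines, per_machine_mb_per_s, mb_per_gb=1024):
    """Total cluster throughput in GB/s (mb_per_gb=1024 is an assumption)."""
    return machines * per_machine_mb_per_s / mb_per_gb


read_gb_s = aggregate_gb_per_s(MACHINES, READ_MB_PER_S)
write_gb_s = aggregate_gb_per_s(MACHINES, WRITE_MB_PER_S)
print(f"read: {read_gb_s:.2f} GB/s, write: {write_gb_s:.2f} GB/s")
# prints: read: 56.25 GB/s, write: 37.50 GB/s
```

Rounded down, these match the 56 GB/s and 37 GB/s quoted for the 240-machine run.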
