Distributed File System: Data Storage for Networks Large and Small
Pei Cao, Cisco Systems, Inc.

Review: DFS Design Considerations
  • Name space construction
  • AAA
  • Operator batching
  • Client caching
  • Data consistency
  • Locking
Summing it Up: CIFS as an Example
  • Network transport in CIFS
    • Use SMB (Server Message Block) messages over a reliable, connection-oriented transport
      • TCP
      • NetBIOS over TCP
    • Use persistent connections called “sessions”
      • If a session is broken, client does the recovery
Design Choices in CIFS
  • Name space construction:
    • per-client linkage, multiple methods for server resolution
      • file://fs.xyz.com/users/alice/stuff.doc
      • \\cifsserver\users\alice\stuff.doc
      • E:\stuff.doc
    • CIFS also offers “redirection” method
      • A share can be replicated in multiple servers or moved
      • Client open → server replies “STATUS_DFS_PATH_NOT_COVERED” → client issues “TRANS2_DFS_GET_REFERRAL” → server replies with the new server
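The referral exchange above can be sketched as client-side logic. This is a minimal illustration of the message flow, not real SMB code: the `Share` class and its methods are invented stand-ins for the transport layer.

```python
# Sketch of CIFS/DFS referral handling on the client side.
# Status names follow the slide; everything else is hypothetical.

STATUS_OK = 0
STATUS_DFS_PATH_NOT_COVERED = 1

class Share:
    """A share that may have been moved to another server."""
    def __init__(self, files, referral=None):
        self.files = files          # path -> contents
        self.referral = referral    # server now holding the share, or None

    def open(self, path):
        if self.referral is not None:
            return STATUS_DFS_PATH_NOT_COVERED, None
        return STATUS_OK, self.files[path]

    def get_referral(self):
        # Models the TRANS2_DFS_GET_REFERRAL round trip.
        return self.referral

def dfs_open(server, path):
    """Open `path`, chasing at most one DFS referral."""
    status, data = server.open(path)
    if status == STATUS_DFS_PATH_NOT_COVERED:
        new_server = server.get_referral()
        status, data = new_server.open(path)
    return data

new_home = Share({r"\users\alice\stuff.doc": b"contents"})
old_home = Share({}, referral=new_home)
print(dfs_open(old_home, r"\users\alice\stuff.doc"))
```

The client, not the server, drives recovery: it retries the open against whatever server the referral names.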
Design Choices in CIFS
  • AAA: Kerberos
    • Older systems use NTLM
  • Operator batching: supported
    • These methods have “AndX” variations: TREE_CONNECT, OPEN, CREATE, READ, WRITE, LOCK
    • Server implicitly takes results of preceding operations as input for subsequent operations
    • First command that encounters an error stops all subsequent processing in the batch
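The two batching rules above (implicit result passing, stop on first error) can be shown with a toy server loop. The operation names and handler table here are illustrative; they are not the actual SMB wire format.

```python
# Sketch of server-side "AndX" batch processing: each operation's
# result implicitly feeds the next, and the first error stops all
# subsequent processing in the batch.

class ToyServer:
    def __init__(self, files):
        self.files = files
        self.handlers = {"OPEN": self.open, "READ": self.read}

    def open(self, _carry, path):
        if path not in self.files:
            return "ERROR", None
        return "OK", path            # carry the opened path to the next op

    def read(self, carry, _arg):
        return "OK", self.files[carry]

def run_andx_batch(server, batch):
    results = []
    carry = None                     # implicit result of the previous op
    for op, arg in batch:
        status, carry = server.handlers[op](carry, arg)
        results.append((op, status))
        if status != "OK":
            break                    # abort the rest of the batch
    return results

srv = ToyServer({"a.txt": b"hello"})
print(run_andx_batch(srv, [("OPEN", "a.txt"), ("READ", None)]))
# [('OPEN', 'OK'), ('READ', 'OK')]
print(run_andx_batch(srv, [("OPEN", "missing"), ("READ", None)]))
# [('OPEN', 'ERROR')]
```

Batching saves round trips: OPEN+READ costs one network exchange instead of two.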
Design Choices in CIFS
  • Client caching
    • Cache both file data and file metadata, write-back cache, can read-ahead
    • Offers strong cache consistency using an invalidation-based approach
  • Data access consistency
    • Oplocks: similar to “tokens” in AFS v3
      • “level II oplock”: read-only data locks
      • “exclusive oplock”: exclusive read/write data lock
      • “batch oplock”: exclusive read/write “open” lock and data lock and metadata lock
    • Transition among the oplocks
    • Observation: can have a hierarchy of lock managers
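One oplock transition can be sketched concretely: a sole opener gets an exclusive oplock, and a second opener forces the server to break it down to level II (shared, read-only caching). This toy manager is an invented illustration of that single transition, not the full CIFS state machine (batch oplocks and break acknowledgements are omitted).

```python
# Toy oplock manager showing the exclusive -> level II transition.

EXCLUSIVE, LEVEL_II = "exclusive", "level II"

class OplockManager:
    def __init__(self):
        self.holders = {}            # file -> {client: oplock level}

    def open(self, client, path):
        holders = self.holders.setdefault(path, {})
        if not holders:
            holders[client] = EXCLUSIVE      # sole opener: full caching
        else:
            # Break existing exclusive oplocks down to level II; in real
            # CIFS the broken client must first flush its dirty data.
            for other in holders:
                if holders[other] == EXCLUSIVE:
                    holders[other] = LEVEL_II
            holders[client] = LEVEL_II
        return holders[client]

mgr = OplockManager()
print(mgr.open("c1", "f"))   # exclusive
print(mgr.open("c2", "f"))   # level II (c1 is also broken to level II)
```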
Design Choices in CIFS
  • File and data record locking
    • Offer “shared” (read-only) and “exclusive” (read/write) locks
    • Part of the file system; Mandatory
    • Can lock either a whole file or byte-range in the file
    • Lock request can specify a timeout for waiting
    • Enables atomic writes using “AndX” batching with writes
      • “Lock/write/unlock” as a batched command sequence
  • Additional capability: “directory change notification”
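The byte-range locks and the lock/write/unlock idiom above can be sketched with a small lock table. The class and its API are invented for illustration; real CIFS locks are mandatory, enforced by the server, and lock requests may wait up to a client-specified timeout rather than failing immediately as here.

```python
# Sketch of shared/exclusive byte-range locks plus the
# "lock/write/unlock" atomic-write sequence done as one batch.

class LockedFile:
    def __init__(self, size):
        self.data = bytearray(size)
        self.locks = []              # (client, start, end, exclusive)

    def _conflicts(self, client, start, end, exclusive):
        for c, s, e, ex in self.locks:
            # Overlapping ranges conflict unless both locks are shared.
            if c != client and s < end and start < e and (ex or exclusive):
                return True
        return False

    def lock(self, client, start, end, exclusive=True):
        if self._conflicts(client, start, end, exclusive):
            return False             # real CIFS could wait for a timeout
        self.locks.append((client, start, end, exclusive))
        return True

    def unlock(self, client, start, end):
        self.locks = [l for l in self.locks if l[:3] != (client, start, end)]

    def atomic_write(self, client, offset, payload):
        """Lock/write/unlock as one batched command sequence."""
        if not self.lock(client, offset, offset + len(payload)):
            return False
        self.data[offset:offset + len(payload)] = payload
        self.unlock(client, offset, offset + len(payload))
        return True

f = LockedFile(16)
f.lock("c1", 0, 8)                       # c1 holds bytes 0-8 exclusively
print(f.atomic_write("c2", 4, b"xx"))    # False: overlaps c1's range
print(f.atomic_write("c2", 8, b"ok"))    # True: disjoint range
```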
DFS for Mobile Networks
  • What properties of DFS are desirable:
    • Handle frequent connection and disconnection
    • Enable clients to operate in disconnected state for an extended period of time
    • Ways to resolve/merge conflicts
Design Issues for DFS in Mobile Networks
  • What should be kept in client cache?
  • How to update the client cache copies with changes made on the server?
  • How to upload changes made by the client to the server?
  • How to resolve conflicts when more than one client changes a file during the disconnected state?
Example System: Coda
  • Client cache content:
    • User can specify which directories should always be cached on the client
    • Also cache recently used files
    • Cache replacement: walk over the cached items every 10 min to reevaluate their priorities
  • Updates from server to client:
    • The server keeps a log of callbacks that could not be delivered and delivers them when the client reconnects
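The cache-content policy above can be sketched as a "hoard walk": user-pinned directories always stay cached, while other entries are ranked by recency and evicted when the cache is over budget. The class, names, and priority rule here are invented simplifications of Coda's actual hoard-priority scheme.

```python
# Sketch of Coda-style cache management: pinned ("hoarded") paths
# survive every walk; unpinned entries are evicted oldest-first.

import time

class HoardCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.hoarded = set()         # paths the user asked to keep
        self.entries = {}            # path -> last-access timestamp

    def pin(self, path):
        self.hoarded.add(path)

    def access(self, path):
        self.entries[path] = time.monotonic()

    def walk(self):
        """Periodic reevaluation (Coda walks roughly every 10 minutes)."""
        evictable = sorted(
            (p for p in self.entries if p not in self.hoarded),
            key=lambda p: self.entries[p])           # oldest first
        while len(self.entries) > self.capacity and evictable:
            del self.entries[evictable.pop(0)]

cache = HoardCache(capacity=2)
cache.pin("/coda/usr/alice")
cache.access("/coda/usr/alice")
cache.access("/tmp/a")
cache.access("/tmp/b")
cache.walk()
print(sorted(cache.entries))   # pinned entry survives; oldest unpinned goes
```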
Coda File System
  • Upload the changes from client to server
    • The client has to keep a “replay log”
      • Contents of the “replay log”
    • Ways to reduce the “replay log” size
  • Handling conflicts
    • Detecting conflicts
    • Resolving conflicts
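One standard way to reduce the replay log's size is log optimization: a later operation can cancel earlier ones on the same file. The sketch below shows two such rules (a remove cancels pending records for the path; a later store supersedes earlier stores). Record shapes are invented, and real Coda's cancellation rules are more elaborate.

```python
# Sketch of a disconnected-operation replay log with simple
# log-optimization rules applied at append time.

def append_optimized(log, op):
    kind, path = op[0], op[1]
    if kind == "remove":
        # Removing the file cancels pending creates/stores for it.
        log[:] = [r for r in log if r[1] != path]
    elif kind == "store":
        # Only the final contents matter: drop earlier stores.
        log[:] = [r for r in log if not (r[0] == "store" and r[1] == path)]
    log.append(op)

log = []
append_optimized(log, ("create", "/coda/f"))
append_optimized(log, ("store", "/coda/f", b"v1"))
append_optimized(log, ("store", "/coda/f", b"v2"))
print(log)   # create survives; only the last store is kept
append_optimized(log, ("remove", "/coda/f"))
print(log)   # [('remove', '/coda/f')]
```

On reconnection, the (shortened) log is replayed at the server; a shorter log also means fewer chances to conflict with changes made by other clients.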
Performance Issues in File Servers
  • Components of server load
    • Network protocol handling
    • File system implementation
    • Disk accesses
  • Read operations
    • Metadata
    • Data
  • Write operations
    • Metadata
    • Data
  • Workload characterization
DFS for High-Speed Networks: DAFS
  • Proposal from Network Appliance and other companies
  • Goal: eliminate memory copies and protocol processing
    • Standard implementation: network buffers → file system buffer cache → user-level application buffers
  • Designed to take advantage of RDMA (“Remote DMA”) network protocols
    • Network transport provides direct memory-to-memory transfer
    • Protocol processing is provided in hardware
  • Suitable for high-bandwidth, low-error-rate, low-latency networks
DAFS Protocol
  • Data reads by the client:
    • The server issues an RDMA request to copy file data directly into the client’s application buffer
  • Data writes by the client:
    • The server issues an RDMA request to copy the client’s application buffer into server memory
  • Implementation:
    • as a library linked into the user application, interfacing with the RDMA network library directly
      • Eliminate two data copies
    • as a new file system implementation in the kernel
      • Eliminate one data copy
  • Performance advantage:
    • Example: 90 usec/op in NFS vs. 25 usec/op in DAFS
DAFS Features
  • Session-based
  • Offer authentication of client machines
  • Flow control by server
  • Stateful lock implementation with leases
  • Offers atomic writes
  • Offers operator batching
Clustered File Servers
  • Goal: scalability in file service
    • Build a high-performance file service using a collection of cheap file servers
  • Methods for Partitioning the Workload
    • Each server can support one “subtree”
      • Advantages
      • Disadvantages
    • Each server can support a group of clients
      • Advantages
      • Disadvantages
    • Client requests are sent to servers in a round-robin or load-balanced fashion
      • Advantages
      • Disadvantages
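Two of the partitioning methods above can be contrasted in a few lines: subtree partitioning routes each request by path prefix (simple, but hotspots stay on one server), while hashed/load-balanced routing spreads requests evenly (but any server may then serve any file, so the cluster must coordinate caching and consistency). Server names and the subtree map are placeholders.

```python
# Sketch of request routing under two workload-partitioning schemes.

import hashlib

SERVERS = ["fs0", "fs1", "fs2"]
SUBTREES = {"/home": "fs0", "/proj": "fs1", "/scratch": "fs2"}

def route_by_subtree(path):
    """Each server owns one subtree of the name space."""
    for prefix, server in SUBTREES.items():
        if path.startswith(prefix):
            return server
    return SERVERS[0]                # default server for everything else

def route_by_hash(path):
    """Spread files over all servers by hashing the path."""
    h = int(hashlib.sha256(path.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

print(route_by_subtree("/home/alice/notes"))  # fs0, always
print(route_by_hash("/home/alice/notes"))     # deterministic, but spread
```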
Non-Subtree-Partition Clustered File Servers
  • Design issues
    • On which disks should the data be stored?
    • Management of memory cache in file servers
    • Data consistency management
      • Metadata operation consistency
      • Data operation consistency
    • Server failure management
      • Single server failure fault tolerance
      • Disk failure fault tolerance
Mapping Between Disks and Servers
  • Direct-attached disks
  • Network-attached disks
    • Fiber-channel attached disks
    • iSCSI attached disks
  • Managing the network-attached disks: “volume manager”
Functionalities of a Volume Manager
  • Group multiple disk partitions into a “logical” disk volume
  • Volume can expand or shrink in size without affecting existing data
  • Volume can be RAID-0/1/5, tolerating disk failures
  • Volume can offer “snapshot” functionalities for easy backup
  • Volumes are “self-evident”
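The core job behind these features is address translation: mapping a logical block on the volume to a (partition, physical block) pair. The function below shows the textbook RAID-0 striping layout over equal-sized partitions; it is a generic illustration, not any particular volume manager's on-disk format.

```python
# Sketch of a volume manager's logical-to-physical block mapping
# for a RAID-0 (striped) volume.

def raid0_map(logical_block, num_partitions, stripe_blocks=1):
    stripe = logical_block // stripe_blocks          # which stripe unit
    offset = logical_block % stripe_blocks           # offset within it
    partition = stripe % num_partitions              # round-robin member
    block = (stripe // num_partitions) * stripe_blocks + offset
    return partition, block

# With 3 partitions and 1-block stripes, consecutive logical blocks
# round-robin across partitions:
print([raid0_map(b, 3) for b in range(6)])
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

RAID-1 and RAID-5 change what is stored at each mapped location (a mirror copy, or rotating parity), but the same style of translation sits underneath.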
Implementations of Volume Manager
  • In-kernel implementation
    • Example: Linux volume manager, Veritas volume manager, etc.
  • Disk server implementation
    • Example: EMC storage systems
Serverless File Systems
  • Serverless file systems in WAN
    • Motivation: peer-to-peer storage; never lose the file
  • Serverless file system in LAN
    • Motivation: client powerful enough to be like servers; use all client’s memory to cache file data