
Distributed File Systems: Issues and Strategies

This lecture covers the concepts and challenges of distributed file systems: naming and transparency, remote file access, caching, and stateful versus stateless servers. It compares naming strategies and their trade-offs, then discusses remote file caching and cache consistency. It closes with a case study of Sun's Network File System (NFS).


Presentation Transcript


  1. Operating Systems • CMPSCI 377 • Lecture 21: Distributed File Systems • Emery Berger • University of Massachusetts, Amherst

  2. Distributed File Systems • Most common use of distributed systems • Idea: • Given set of disks attached to different nodes, share as if all were attached to every node • Examples: • Edlab: one server, diskless workstations on LAN • AppleShare: nodes are servers with disk & client

  3. Distributed File Systems: Issues • Naming & transparency • Remote file access • Caching • Server with state or without • Replication

  4. Naming & Transparency • Issues • How are files named? • Do filenames reveal location? • Do filenames change if file moves? • Do filenames change if user moves?

  5. Transparency • Location transparency: filename does not reveal physical storage location • Location independence: filename need not change if file’s storage location changes • In practice: • Most naming schemes do not have location independence • Many have location transparency

  6. Naming Strategies: Absolute Names • <machine name, pathname> • Examples: AppleShare, Windows NT • Advantages: • Easy to find fully specified filename • Easy to add & delete names • No global state • Scales easily • Disadvantages: • User must know complete name – aware of which files are local & which are remote • File is location dependent (cannot move) • Makes sharing harder • Not fault-tolerant

  7. Naming Strategies: Mount Points • Mount points (NFS – Network File System) • Each host has set of local names for remote locations • Mount table (/etc/fstab): specifies <remote pathname @ machine name, local pathname> • At boot: bind local name to remote • Users refer to local pathnames • NFS manages mapping

  8. Mount Points: Pros & Cons • Advantages: • Location transparent • Remote name can change across reboots • Disadvantages: • Single unified strategy hard to maintain • Same file can have different names

  9. NFS Example • Partial contents of /etc/fstab for Edlab: • /usr1/mail@elux3.cs.umass.edu:/var/spool/mail • /users/users1@elsrv1:/users/users1 • /courses/cs300@elsrv3:/courses/cs300 • /rcf/common@elsrv1:/exp/rcf/common
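
The mapping that the mount table provides can be sketched in a few lines of Python. This is an illustration only, not how NFS or the kernel resolves paths. It assumes each Edlab entry reads as remote pathname @ machine name : local pathname, following the tuple given on the mount-points slide; that reading, and the names `Mount`, `parse_entry`, and `resolve`, are assumptions made for this sketch.

```python
from typing import List, NamedTuple, Optional, Tuple


class Mount(NamedTuple):
    remote_path: str
    host: str
    local_path: str


def parse_entry(entry: str) -> Mount:
    """Parse 'remote_path@host:local_path' (the slide's notation, assumed here)."""
    remote, rest = entry.split("@", 1)
    host, local = rest.split(":", 1)
    return Mount(remote, host, local)


def resolve(path: str, table: List[Mount]) -> Optional[Tuple[str, str]]:
    """Map a local pathname to (host, remote pathname); None means the path is local."""
    best = None
    for m in table:
        prefix = m.local_path.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            # Prefer the longest matching mount point.
            if best is None or len(prefix) > len(best.local_path.rstrip("/")):
                best = m
    if best is None:
        return None
    suffix = path[len(best.local_path.rstrip("/")):]
    return best.host, best.remote_path + suffix


EDLAB = [parse_entry(e) for e in (
    "/usr1/mail@elux3.cs.umass.edu:/var/spool/mail",
    "/users/users1@elsrv1:/users/users1",
    "/courses/cs300@elsrv3:/courses/cs300",
    "/rcf/common@elsrv1:/exp/rcf/common",
)]
```

Under that reading, `resolve("/var/spool/mail/alice", EDLAB)` yields the host `elux3.cs.umass.edu` and the remote pathname `/usr1/mail/alice`, while a path outside every mount point returns `None`. The user only ever types the local pathname, which is what makes the scheme location transparent.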

  10. NFS Example

  11. Naming Strategies: Global Name Space • Single name space: • No matter which node you are on, filenames remain the same • Examples: • AFS (CMU’s Andrew File System) • Sprite (Berkeley) • Client: gets filename structure from server(s) • When users access files, server sends copies to workstation, where they are cached

  12. Global Name Space: Pros & Cons • Advantages: • Naming – consistent & easy to keep consistent • Ensures all files are same regardless of where you log in • Late binding of names ⇒ moving them is easier • Disadvantages: • Difficult for OS to keep files consistent (caching) • Global name space may limit flexibility • Performance issues

  13. Distributed File Systems: Issues • Naming & transparency • Remote file access • Caching • Server with state or without • Replication

  14. Remote File Access & Caching • Can access files: • Remotely: returns results using RPC = remote service • Transfer part of file, perform local access = caching • Caching issues: • Where & when are file blocks cached? • When are modifications propagated back to remote file? • What happens when multiple clients cache same file?

  15. Remote File Caching • Local disk: • Reduces access time (compared to remote) • Safe if node fails • Difficult to keep local copy consistent with remote copy • Requires client to have disk! • Local memory: • Quick access time • Works without disks • Difficult to keep local copy consistent with remote copy • Smaller cache size • Not fault-tolerant

  16. Cache Update Policies • Write-through: write to remote disk • Reliable • Low-performance = remote service for all writes • Write-back: write only to cache • Write to disk on evictions, periodic sync • Quick • Reduces network traffic (repeated writes to same block) • User machine crashes ⇒ data loss
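
The trade-off between the two policies can be made concrete with a toy model. The sketch below is a minimal illustration in Python, not taken from the lecture; `RemoteDisk`, `WriteThroughCache`, and `WriteBackCache` are hypothetical names, and the "network" is just a counter of writes that reach the server.

```python
class RemoteDisk:
    """Stand-in for the server's disk; counts writes that cross the network."""

    def __init__(self):
        self.blocks = {}
        self.remote_writes = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.remote_writes += 1


class WriteThroughCache:
    """Every write updates the local cache and goes to the remote disk at once."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.disk.write(block_no, data)  # remote service on every write


class WriteBackCache:
    """Writes stay in the local cache until flush() (eviction or periodic sync)."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}
        self.dirty = set()

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.dirty.add(block_no)  # nothing is sent yet

    def flush(self):
        for block_no in sorted(self.dirty):
            self.disk.write(block_no, self.cache[block_no])
        self.dirty.clear()
```

Writing the same block three times costs three remote writes under write-through but only one under write-back, after `flush()`. If the client crashes before `flush()` runs, the remote disk never sees the data, which is the data-loss risk listed above.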

  17. Cache Consistency • Client-initiated consistency: client contacts server and checks consistency • Can check every access • Can check at given intervals • Can check only upon opening a file • Server-initiated consistency: server detects potential conflicts, invalidates caches • Server needs to know which clients have cached which parts of which files • Server must know which clients are readers & which are writers
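
The two approaches differ mainly in who keeps track of staleness. The Python sketch below is a simplified illustration under stated assumptions: whole files are cached rather than parts of files, staleness is detected with a per-file version counter, and the client-initiated variant checks only on open. The class names `Server`, `PollingClient`, and `CallbackClient` are hypothetical and do not correspond to any real file system's API.

```python
class Server:
    def __init__(self):
        self.files = {}    # name -> (version, data)
        self.cachers = {}  # name -> clients holding a copy (state the server must keep)

    def version(self, name):
        return self.files[name][0]

    def read(self, name, client=None):
        if client is not None:
            # Server-initiated consistency needs to remember who cached what.
            self.cachers.setdefault(name, set()).add(client)
        return self.files[name]

    def write(self, name, data):
        old_version = self.files.get(name, (0, b""))[0]
        self.files[name] = (old_version + 1, data)
        # Server-initiated: invalidate every registered cached copy.
        for client in self.cachers.pop(name, set()):
            client.invalidate(name)


class PollingClient:
    """Client-initiated: compare versions with the server on every open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.version(name):
            self.cache[name] = self.server.read(name)
        return self.cache[name][1]


class CallbackClient:
    """Server-initiated: trust the cache until the server invalidates it."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(name, client=self)
        return self.cache[name][1]

    def invalidate(self, name):
        self.cache.pop(name, None)
```

The polling client pays a round trip on every open but the server stays simple. The callback client avoids that round trip, at the cost of the server tracking which clients cache which files, as the slide notes.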

  18. Case Study: Sun’s Network File System • NFS: standard for distributed UNIX file access • Designed to run on LANs • Nodes are both servers & clients • Servers have no state • Uses mount protocol to make global name local • /etc/exports: lists local names server willing to export • /etc/fstab: lists global names that local nodes import • Corresponding global name must be in /etc/exports on server
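
What "servers have no state" means in practice is that each request carries everything the server needs to answer it. The Python sketch below illustrates the idea only; it is not the NFS RPC interface (real NFS requests use opaque file handles rather than pathnames), and `StatelessServer` and `Client` are hypothetical names.

```python
class StatelessServer:
    """Keeps no per-client record: no open-file table, no read positions."""

    def __init__(self, files):
        self.files = files  # path -> bytes, standing in for the server's disk

    def read(self, path, offset, count):
        # The request itself names the file, the offset, and the length.
        return self.files[path][offset:offset + count]


class Client:
    """The client, not the server, remembers how far it has read."""

    def __init__(self, server, path):
        self.server = server
        self.path = path
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.path, self.offset, count)
        self.offset += len(data)
        return data
```

Because the read position lives on the client, replacing the server object mid-stream (a stand-in for a server crash and reboot) does not disturb the client: its next `read` simply continues from its own offset.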

  19. NFS Implementation • NFS defines set of RPC operations for remote file access: • Directory search, reading directory entries • Manipulating links & directories • Accessing file attributes • Reading/writing files • Does not rely on node homogeneity • Heterogeneous nodes support NFS mount & remote access protocols using RPC • Users may need to know different names depending upon which node they log on to

  20. NFS Implementation

  21. Summary • Distributed File Systems • Naming & transparency • Remote file access • Caching • Server with state or without • Replication
