Distributed File Systems: Design Comparisons

David Eckhardt, Bruce Maggs

slides used and modified with permission from

Pei Cao’s

lectures in Stanford Class CS-244B

Other Materials Used
  • 15-410 Lecture 34, NFS & AFS, Fall 2003, Eckhardt
  • NFS:
    • RFC 1094 for v2 (3/1989)
    • RFC 1813 for v3 (6/1995)
    • RFC 3530 for v4 (4/2003)
  • AFS:
    • “The ITC Distributed File System: Principles and Design”, Proceedings of the 10th ACM Symposium on Operating System Principles, Dec. 1985, pp. 35-50.
    • “Scale and Performance in a Distributed File System”, ACM Transactions on Computer Systems, Vol. 6, No. 1, Feb. 1988, pp. 51-81.
    • IBM AFS User Guide, version 36
More Related Material

RFCs related to Remote Procedure Calls (RPCs)

  • RFC 1831: RPC protocol
  • RFC 1832: XDR data representation
  • RFC 2203 RPC security
Outline

  • Why Distributed File Systems?
  • Basic mechanisms for building DFSs
    • Using NFS V2 as an example
  • Design choices and their implications
    • Naming (this lecture)
    • Authentication and Access Control (this lecture)
    • Batched Operations (this lecture)
    • Caching (next lecture)
    • Concurrency Control (next lecture)
    • Locking implementation (next lecture)
15-410 Gratuitous Quote of the Day

Good judgment comes from experience… Experience comes from bad judgment.

- attributed to many

What Distributed File Systems Provide
  • Access to data stored at servers using file system interfaces
  • What are the file system interfaces?
    • Open a file, check status of a file, close a file
    • Read data from a file
    • Write data to a file
    • Lock a file or part of a file
    • List files in a directory, create/delete a directory
    • Delete a file, rename a file, add a symlink to a file
    • etc.
Why DFSs are Useful
  • Data sharing among multiple users
  • User mobility
  • Location transparency
  • Backups and centralized management
“File System Interfaces” vs. “Block Level Interfaces”
  • Data are organized in files, which in turn are organized in directories
  • Compare these with disk-level access or “block” access interfaces: [Read/Write, LUN, block#]
  • Key differences:
    • Implementation of the directory/file structure and semantics
    • Synchronization (locking)
Components in a DFS Implementation
  • Client side:
    • What has to happen to enable applications to access a remote file the same way a local file is accessed?
  • Communication layer:
    • Just TCP/IP or a protocol at a higher level of abstraction?
  • Server side:
    • How are requests from clients serviced?
Client Side Example: Basic UNIX Implementation
  • Accessing remote files the same way as local files → requires kernel support
VFS interception
  • VFS provides “pluggable” file systems
  • Standard flow of remote access
    • User process calls read()
    • Kernel dispatches to VOP_READ() in some VFS
    • nfs_read()
      • check local cache
      • send RPC to remote NFS server
      • put process to sleep
VFS interception
  • Standard flow of remote access (continued)
    • server interaction handled by kernel process
      • retransmit if necessary
      • convert RPC response to file system buffer
      • store in local cache
      • wake up user process
    • nfs_read()
      • copy bytes to user memory
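The dispatch flow above can be sketched at user level. This is an illustrative model only (real VFS dispatch happens in the kernel through C operation vectors); the names `VfsOps`, `LocalFs`, and `NfsClientFs` are invented for the sketch:

```python
# Toy sketch of VFS-style "pluggable" file systems: the kernel keeps a
# per-file-system operation vector and dispatches read() through it, so a
# local FS and an NFS client sit behind the same interface.

class VfsOps:
    """Operation vector a file system registers with the VFS layer."""
    def vop_read(self, path, offset, length):
        raise NotImplementedError

class LocalFs(VfsOps):
    def __init__(self, files):
        self.files = files                       # path -> bytes
    def vop_read(self, path, offset, length):
        return self.files[path][offset:offset + length]

class NfsClientFs(VfsOps):
    def __init__(self, server):
        self.server = server                     # stands in for the RPC stub
        self.cache = {}                          # local cache checked first
    def vop_read(self, path, offset, length):
        key = (path, offset, length)
        if key not in self.cache:                # miss: "send RPC", then cache
            self.cache[key] = self.server.read_rpc(path, offset, length)
        return self.cache[key]

def vfs_read(mount_table, path, offset, length):
    """Kernel-side dispatch: pick the FS by mount prefix, call VOP_READ."""
    for prefix, fs in mount_table.items():       # more specific mounts first
        if path.startswith(prefix):
            return fs.vop_read(path, offset, length)
    raise FileNotFoundError(path)
```

Note the mount table is ordered so that a more specific mount point (e.g. `/fs1`) is matched before the root file system.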
Communication Layer Example: Remote Procedure Calls (RPC)

[Diagram: client sends an RPC call; server returns an RPC reply]

Failure handling: timeout and re-issue
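The timeout-and-re-issue discipline can be sketched as a small retry loop. This is a hedged sketch, not a real RPC runtime; `transport` stands for any callable that may raise `TimeoutError`:

```python
# Sketch of at-least-once RPC failure handling: time out, retransmit the
# same request, give up after a bounded number of tries.

def rpc_call(transport, request, retries=3):
    last_err = None
    for _attempt in range(retries):
        try:
            return transport(request)        # may time out and raise
        except TimeoutError as err:
            last_err = err                   # re-issue the identical request
    raise last_err

# Because a reply can be lost after the server already executed the call,
# a retransmitted request may run twice -- one reason NFS v2 favors
# idempotent operations.
```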
External Data Representation (XDR)
  • Argument data and response data in RPC are packaged in XDR format
    • Integers are encoded in big-endian format
    • Strings: 4-byte length followed by ASCII bytes, zero-padded to a four-byte boundary
    • Arrays: 4-byte size followed by array entries
    • Opaque: 4-byte len followed by binary data
  • Marshalling and un-marshalling data
  • Extra overhead in data conversion to/from XDR
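The encoding rules above can be written down directly with Python's `struct` module. A minimal sketch of the XDR primitives (integers, opaque data, strings, arrays) as described in RFC 1832:

```python
import struct

# XDR primitive encodings: big-endian 4-byte integers; opaque/string as a
# 4-byte length followed by the bytes, zero-padded to a 4-byte boundary;
# arrays as a 4-byte count followed by the encoded entries.

def xdr_int(n):
    return struct.pack(">i", n)              # signed 32-bit, big-endian

def xdr_opaque(data):
    pad = (4 - len(data) % 4) % 4            # pad to 4-byte boundary
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def xdr_string(s):
    return xdr_opaque(s.encode("ascii"))     # string = opaque of ASCII bytes

def xdr_array(items, encode_item):
    out = struct.pack(">I", len(items))      # 4-byte element count
    for item in items:
        out += encode_item(item)
    return out
```

Marshalling every argument through routines like these is exactly the conversion overhead the slide refers to.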
Some NFS V2 RPC Calls
  • NFS RPCs using XDR over, e.g., TCP/IP
  • fhandle: 32-byte opaque data (64-byte in v3)
Server Side Example: mountd and nfsd
  • mountd: provides the initial file handle for the exported directory
    • Client issues nfs_mount request to mountd
    • mountd checks if the pathname is a directory and if the directory should be exported to the client
  • nfsd: answers the RPC calls, gets reply from local file system, and sends reply via RPC
    • Usually listening at port 2049
  • Both mountd and nfsd use underlying RPC implementation
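The mountd/nfsd division of labor can be sketched as two small services. Everything here is illustrative (the class names, the export-table shape, and deriving the 32-byte fhandle from a hash are all invented for the sketch):

```python
import hashlib

# Sketch of the server-side split: mountd checks the export list and hands
# out the root fhandle for an exported directory; nfsd then serves
# per-fhandle operations (listening on port 2049 in real NFS).

class Mountd:
    def __init__(self, exports):
        self.exports = exports                   # path -> set of allowed clients
    def mount(self, client, path):
        if client not in self.exports.get(path, set()):
            raise PermissionError(path)
        # 32-byte opaque fhandle (the v2 size); derivation is made up here
        return hashlib.sha256(path.encode()).digest()

class Nfsd:
    def __init__(self):
        self.by_fhandle = {}                     # fhandle -> file contents
    def register(self, fhandle, data):
        self.by_fhandle[fhandle] = data
    def read(self, fhandle, offset, length):
        # every request names its file by fhandle -- no per-client state
        return self.by_fhandle[fhandle][offset:offset + length]
```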
NFS V2 Design
  • “Dumb”, “Stateless” servers
  • Smart clients
  • Portable across different OSs
  • Immediate commitment and idempotency of operations
  • Low implementation cost
  • Small number of clients
  • Single administrative domain
Stateless File Server?
  • Statelessness
    • Files are state, but...
    • Server exports files without creating extra state
      • No list of “who has this file open” (permission check on each operation on open file!)
      • No “pending transactions” across crash
  • Results
    • Crash recovery is “fast”
      • Reboot, let clients figure out what happened
    • Protocol is “simple”
  • State stashed elsewhere
    • Separate MOUNT protocol
    • Separate NLM locking protocol
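Statelessness as described above can be made concrete in a short sketch: every request carries everything the server needs (fhandle, credentials, offset), permissions are re-checked on each operation, and a reboot loses nothing because no open-file or session tables exist. The shapes of the tables are invented for illustration:

```python
# Sketch of a stateless file server: the files themselves are state, but
# serving requests creates no additional server-side state.

class StatelessServer:
    def __init__(self, files, acl):
        self.files = files       # fhandle -> bytes (the files ARE state)
        self.acl = acl           # fhandle -> set of uids allowed to read

    def read(self, fhandle, uid, offset, length):
        # permission re-checked on EVERY operation: there is no open-file
        # table recording "who has this file open"
        if uid not in self.acl[fhandle]:
            raise PermissionError(fhandle)
        return self.files[fhandle][offset:offset + length]

    def reboot(self):
        # "fast" crash recovery: nothing to rebuild beyond the files
        return StatelessServer(self.files, self.acl)
```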
NFS V2 Operations
  • V2:
    • STATFS (get file system attributes)
NFS V3 and V4 Operations
  • V3 added:
    • READDIRPLUS, COMMIT (server cache!)
  • V4 added:
    • COMPOUND (bundle operations)
    • LOCK (server becomes more stateful!)
    • PUTROOTFH, PUTPUBFH (no separate MOUNT)
    • Better security and authentication
NFS File Server Failure Issues
  • Semantics of file write in V2
    • Bypass UFS file buffer cache
  • Semantics of file write in V3
    • Provide “COMMIT” procedure
  • Locking provided by server in V4
Topic 1: Name-Space Construction and Organization
  • NFS: per-client linkage
    • Server: export /root/fs1/
    • Client: mount server:/root/fs1 /fs1 → fhandle
  • AFS: global name space
    • Name space is organized into Volumes
      • Global directory /afs;
      • /afs/cs.wisc.edu/vol1/…; /afs/cs.stanford.edu/vol1/…
    • Each file is identified as fid = <vol_id, vnode #, uniquifier>
    • All AFS servers keep a copy of the “volume location database”, a table of vol_id → server_ip mappings
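The AFS scheme can be sketched in a few lines: a fid names a file independently of any server, and the volume location database maps vol_id to the server currently hosting the volume. The addresses and volume names below are made up:

```python
# Sketch of AFS-style naming: fid = (vol_id, vnode, uniquifier); the
# replicated volume location database resolves vol_id to a server address.

vldb = {"vol1": "128.2.0.1", "vol2": "171.64.0.9"}   # vol_id -> server_ip

def locate(fid):
    vol_id, _vnode, _uniquifier = fid
    return vldb[vol_id]

def move_volume(vol_id, new_server_ip):
    # Moving a volume only updates the VLDB; fids (and so every
    # client-visible /afs path) are unchanged -- location transparency.
    vldb[vol_id] = new_server_ip
```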
Implications on Location Transparency
  • NFS: no transparency
    • If a directory is moved from one server to another, client must remount
  • AFS: transparency
    • If a volume is moved from one server to another, only the volume location database on the servers needs to be updated
Topic 2: User Authentication and Access Control
  • User X logs onto workstation A, wants to access files on server B
    • How does A tell B who X is?
    • Should B believe A?
  • Choices made in NFS V2
    • All servers and all client workstations share the same <uid, gid> name space → A sends X’s <uid, gid> to B
      • Problem: root access on any client workstation can lead to creation of users of arbitrary <uid, gid>
    • Server believes client workstation unconditionally
      • Problem: if any client workstation is broken into, the protection of data on the server is lost;
      • <uid, gid> sent in clear text over the wire → request packets can be faked easily
User Authentication (cont’d)
  • How do we fix the problems in NFS v2?
    • Hack 1: root remapping → strange behavior
    • Hack 2: UID remapping → no user mobility
    • Real solution: use a centralized Authentication/Authorization/Access-control (AAA) system
Example AAA System: NTLM
  • Microsoft Windows Domain Controller
    • Centralized AAA server
    • NTLM v2: per-connection authentication

[Diagram: client workstations and the file server both authenticate via the centralized Domain Controller]
A Better AAA System: Kerberos
  • Basic idea: shared secrets
    • User proves to KDC who he is; KDC generates shared secret between client and file server


[Diagram: client sends “Need to access fs” to the ticket server; the ticket server generates S and returns it encrypted with the client’s key; the client then presents it to the file server]

  • S: specific to the {client, fs} pair; a “short-term session key” with an expiration time (e.g., 8 hours)
Kerberos Interactions


[Diagram: Kerberos message flow]

  1. Client → ticket server: “Need to access fs”; ticket server generates S
  2. Ticket server → client: Kclient[S], ticket = Kfs[use S for client]
  3. Client → file server: ticket = Kfs[use S for client], S{client, time}
  • Why “time”?: guard against replay attack
  • mutual authentication
  • File server doesn’t store S, which is specific to {client, fs}
  • Client doesn’t contact “ticket server” every time it contacts fs
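The exchange can be modeled end to end with a toy notion of encryption. Here `enc`/`dec` are stand-ins (a tagged tuple only openable with the right key) purely to show who can read what; a real Kerberos uses a proper cipher, and all function names are invented:

```python
# Toy model of the Kerberos ticket exchange in the figure above.

def enc(key, payload):
    return ("enc", key, payload)

def dec(key, box):
    tag, k, payload = box
    if tag != "enc" or k != key:
        raise ValueError("wrong key")
    return payload

def ticket_server_grant(k_client, k_fs, client, session_key):
    """Reply to "Need to access fs": S sealed for the client, plus a ticket
    sealed for the file server (which never has to store S itself)."""
    ticket = enc(k_fs, ("use-key-for", client, session_key))
    return enc(k_client, session_key), ticket

def file_server_verify(k_fs, ticket, authenticator):
    _tag, client, s = dec(k_fs, ticket)   # fs learns S from the ticket only
    who, when = dec(s, authenticator)     # timestamp guards against replay
    if who != client:
        raise ValueError("client mismatch")
    return client, when
```

Only the holder of the client's key can recover S, and only the file server can open the ticket, which is why the client need not contact the ticket server on every request.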
Kerberos: User Log-on Process
  • How does the user prove to the KDC who the user is?
    • Long-term key: 1-way-hash-func(passwd)
    • Long-term key comparison happens once only, at which point the KDC generates a shared secret for the user and the KDC itself → the ticket-granting ticket, or “logon session key”
    • The “ticket-granting ticket” is encrypted in the KDC’s long-term key
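A sketch of that log-on step, with invented names and with plain SHA-256 standing in for the key derivation (real Kerberos uses a salted string-to-key function, and real encryption rather than the tagged tuple used here):

```python
import hashlib

# Sketch of Kerberos log-on: the long-term key is a one-way hash of the
# password, checked exactly once; afterwards the client holds a short-term
# logon session key, and the TGT is sealed under the KDC's own key.

def long_term_key(password):
    return hashlib.sha256(password.encode()).hexdigest()

def kdc_logon(stored_key, presented_key, session_key, kdc_key):
    if presented_key != stored_key:          # the one password comparison
        raise PermissionError("bad password")
    # TGT: the logon session key sealed under the KDC's long-term key,
    # so only the KDC itself can later reopen it
    tgt = ("enc", kdc_key, ("logon-session", session_key))
    return session_key, tgt
```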
Operator Batching
  • Should each client/server interaction accomplish one file system operation or multiple operations?
  • Advantage of batched operations: fewer round trips and less per-call overhead
  • How should batched operations be defined?
Examples of Batched Operators
  • NFS v3:
    • READDIRPLUS (directory listing and attributes in one call)
  • NFS v4:
    • COMPOUND RPC calls
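A COMPOUND-style call can be sketched as a list of operations executed in order on the server, threading the current filehandle between them. The operation names mimic NFS v4, but the server here is an invented stand-in:

```python
# Sketch of COMPOUND batching: one RPC carries several operations, so a
# lookup-then-read needs one round trip instead of three.

def compound(server_state, ops):
    results, fh = [], None          # current filehandle threads through ops
    for op, *args in ops:
        if op == "PUTROOTFH":       # start at the root (no separate MOUNT)
            fh = "/"
        elif op == "LOOKUP":        # descend one path component
            fh = fh.rstrip("/") + "/" + args[0]
        elif op == "READ":
            offset, length = args
            results.append(server_state[fh][offset:offset + length])
        else:
            raise ValueError(op)
    return results
```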
Summary

  • Functionalities of DFS
  • Implementation of DFS
    • Client side: VFS interception
    • Communication: RPC or TCP/UDP
    • Server side: server daemons
  • DFS name space construction
    • Mount vs. Global name space
  • DFS access control
    • NTLM
    • Kerberos