
Distributed File Systems



  1. Distributed File Systems Group A5 Amit Sharma Dhaval Sanghvi Ali Abbas

  2. Outline • What is a DFS • Requirements of a DFS • Sun Network File System • History • Architecture • Protocols • Implementation

  3. Basics • File: named collection of logically related data • Unix file: an uninterpreted sequence of bytes • File system: • Provides a logical view of data and storage functions • User-friendly interface • Provides facility to create, modify, organize, and delete files • Provides sharing among users in a controlled manner • Provides protection

  4. What is a DFS • A distributed implementation of time sharing model of a file system, where multiple users share files and storage resources. • Overall storage space managed by a DFS consists of different, remotely located, smaller storage spaces.

  5. Requirements • Transparency: • Access transparency • Location transparency • Mobility transparency • Failure transparency • Performance transparency

  6. Other Requirements • Scaling • Security • Hardware and operating system heterogeneity

  7. Sun’s Network File System • Introduced by Sun Microsystems in 1985 • Sun published the protocol and licensed reference implementation • Since then, NFS has been supported by every Unix variant

  8. NFS design objectives • Machine and OS independence, no recompilation of applications • Crash recovery • Transparent access • Reasonable performance (comparable to local FS)

  9. NFS - The basic idea • allow an arbitrary collection of clients and servers to share a common file system • In most cases all clients and servers are on the same LAN • each machine can be both a client and a server • Each NFS server exports one or more of its directories for access by remote clients

  10. NFS - The basic idea (cont.) • When a directory is made available, so are all of its subdirectories. • Whole directory trees are exported by NFS as a unit • The list of directories a server exports is maintained in the /etc/exports file • Uses RPC / XDR

  11. NFS - How do we get the files • Mount protocol • Clients access shared file systems by mounting them from an NFS server machine. • Where? At a mount point • Mount point? An empty directory or subdirectory, created as a place to attach a remote file system.

  12. How do we get the files (cont.) • server returns a file handle to the client. • The file handle contains fields uniquely identifying • the file system type (ext2, vfat, Novell, BSD, NeXTSTEP..) • the disk • the i-node number of the directory • and security information
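The file handle described above can be pictured as a small fixed-size record that the server packs into an opaque byte string. The following is only a sketch: the four-integer layout and the field names are assumptions made for illustration, not the actual NFS wire format.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: four unsigned 32-bit integers in network byte order.
_LAYOUT = struct.Struct("!IIII")


@dataclass(frozen=True)
class FileHandle:
    fs_type: int   # file system type identifier
    disk: int      # disk (device) identifier
    inode: int     # i-node number of the directory
    security: int  # security information (illustrative placeholder)

    def pack(self) -> bytes:
        """Serialize to the opaque byte string returned to the client."""
        return _LAYOUT.pack(self.fs_type, self.disk, self.inode, self.security)

    @classmethod
    def unpack(cls, raw: bytes) -> "FileHandle":
        """Decode a handle on the server side."""
        return cls(*_LAYOUT.unpack(raw))


handle = FileHandle(fs_type=1, disk=8, inode=4711, security=0)
wire = handle.pack()             # what the client stores and sends back
decoded = FileHandle.unpack(wire)
```

The client treats `wire` as opaque bytes and simply presents it again on later requests; only the server interprets the fields.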

  13. How do we get the files (cont.) • The server daemons: • nfsd: The NFS daemon, which services file requests from NFS clients. • mountd: The NFS mount daemon, which handles mount requests from clients, checks them against the export list, and returns the file handle of the exported directory. • portmap: The portmapper daemon, which allows NFS clients to find out which port the NFS server is using.

  14. NFS software architecture

  15. VFS • VFS allows diverse specific file systems to coexist in a file tree, isolating all FS-dependencies in pluggable filesystem modules. • VFS was an internal kernel restructuring with no effect on the syscall interface. • VFS layer maintains a table with one entry for each open file

  16. VFS 2 • The VFS layer has an entry called a v-node (virtual i-node) for every open file. • V-nodes are used to tell whether the file is local or remote. • A v-node points to either an i-node, when the file is on the local disk, or an r-node in the NFS client code, when the reference is to data on a remote disk. • All state information on the open files is stored on the client's side.

  17. Vnode use • To mount a remote file system, the system admin (or /etc/rc) calls the mount program • The kernel constructs a vnode for the remote directory and asks the NFS client code to create an r-node in its internal tables. A vnode in the client VFS points to either a local i-node or an r-node.
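The v-node indirection from the VFS slides can be sketched as a small dispatch step. All class and function names below (`INode`, `RNode`, `VNode`, `vfs_read`, `fake_nfs_read`) are hypothetical and stand in for kernel structures; the remote read is simulated by a local function rather than a real RPC.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class INode:
    number: int
    data: bytes          # contents of a file on the local disk


@dataclass
class RNode:
    server: str
    file_handle: bytes   # opaque handle previously returned by the server


@dataclass
class VNode:
    target: Union[INode, RNode]

    @property
    def is_remote(self) -> bool:
        return isinstance(self.target, RNode)


def vfs_read(vnode: VNode, offset: int, count: int,
             nfs_read: Callable[[str, bytes, int, int], bytes]) -> bytes:
    """Route a read to the local file system or to the NFS client code."""
    if vnode.is_remote:
        rnode = vnode.target
        return nfs_read(rnode.server, rnode.file_handle, offset, count)
    return vnode.target.data[offset:offset + count]


def fake_nfs_read(server: str, file_handle: bytes, offset: int, count: int) -> bytes:
    """Stand-in for the RPC a real NFS client would issue."""
    return b"remote data"[offset:offset + count]


local = VNode(INode(number=42, data=b"local data"))
remote = VNode(RNode(server="fileserver", file_handle=b"\x00\x01"))

local_bytes = vfs_read(local, 0, 5, fake_nfs_read)
remote_bytes = vfs_read(remote, 0, 6, fake_nfs_read)
```

The caller of `vfs_read` never needs to know which kind of node sits behind the v-node, which is the access transparency the earlier slides ask for.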

  18. NFS implementation • Servers are stateless: each request carries complete information and does not rely on previous state. Most operations are also idempotent, so a repeated request is harmless • User’s identity must be verified for each request • Most UNIX system calls are supported except for open and close
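A minimal sketch of what statelessness buys, assuming a toy in-memory server (the `StatelessServer` class and its method names are made up for illustration): every request names the file handle, the offset, and the data or count, so the server keeps no open-file table or file position, and repeating a write leaves the file unchanged.

```python
class StatelessServer:
    """Toy server that keeps file contents but no per-client state."""

    def __init__(self):
        self.files = {}  # file handle -> bytearray

    def read(self, handle, offset, count):
        # The request says exactly which bytes it wants.
        return bytes(self.files[handle][offset:offset + count])

    def write(self, handle, offset, data):
        # Writing at an explicit offset is idempotent: doing it twice
        # gives the same file as doing it once.
        buf = self.files.setdefault(handle, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)


server = StatelessServer()
server.write("fh1", 0, b"hello")
server.write("fh1", 5, b" nfs")
once = server.read("fh1", 0, 9)

# A duplicate of the second request (for example, a retransmission) is harmless.
server.write("fh1", 5, b" nfs")
twice = server.read("fh1", 0, 9)
```

An "append" request with no offset would not have this property, since each repetition would add the data again.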

  19. Idempotent • idem·po·tent • Pronunciation: 'I -d&m-"pO-t&nt • Date: 1870 • : relating to or being a mathematical quantity which when applied to itself under a given binary operation (as multiplication) equals itself; • also : relating to or being an operation under which a mathematical quantity is idempotent

  20. Semantics of file sharing • On a single processor, when a read follows a write, the value returned by the read is the value just written. • In a distributed system with caching, obsolete values may be returned.

  21. Semantics of file sharing
      Method              Comment
      UNIX semantics      Every operation on a file is instantly visible to all processes
      Session semantics   No changes are visible to other processes until the file is closed
      Immutable files     No updates are possible; simplifies sharing and replication
      Transaction         All changes occur atomically
      • NFS implements session semantics

  22. Caching • The cache consistency problem: cached data may become stale if cached data is updated elsewhere in the network. • NFS solution: • Timestamp invalidation. Timestamp each cache entry, and periodically query the server: “has this file changed since time t?”; invalidate cache if stale.

  23. NFS Client Caching • Where? In the main memory of clients • What? File blocks, translations of file names to vnodes, and attributes of files and directories. • (1) File blocks: cached with the time stamp of the file (when last modified on the server). • After a certain age, blocks have to be validated with the server • Delayed-write policy: modified blocks are flushed to the server after a certain delay

  24. NFS Client Caching • Clients do not free delayed-write blocks until the server confirms that the data have been written to disk. • (2) Caching of file names to vnodes for remote directory access • speeds up the lookup procedure • (3) Caching of file and directory attributes • updated when new attributes received from server, discarded after certain time

  25. NFS Client Caching • Writes: • block marked dirty and scheduled for flushing. • flushing: when file is closed, or a sync occurs at client. • What if multiple clients write to same file at the same time? • Can get either version (or parts of both). Completely arbitrary. Just like normal Unix • Problem: writes from clients are delayed. If writes happen at time t and the close happens at t’, other clients might not see the new data until t’
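The delayed-write behaviour on this slide can be simulated in a few lines. This is a sketch under simplifying assumptions: the server is a plain dictionary, files are written whole rather than block by block, and `ClientFile` is a hypothetical name.

```python
class ClientFile:
    """Client-side view of one open file with delayed writes."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.cached = server.get(name, b"")  # fetched at open
        self.dirty = False

    def read(self):
        return self.cached

    def write(self, data):
        # Only the local copy changes; the block is marked dirty.
        self.cached = data
        self.dirty = True

    def close(self):
        # Dirty data is flushed to the server when the file is closed.
        if self.dirty:
            self.server[self.name] = self.cached
            self.dirty = False


server = {"notes.txt": b"old"}

writer = ClientFile(server, "notes.txt")
writer.write(b"new")                                   # time t

seen_before_close = ClientFile(server, "notes.txt").read()
writer.close()                                         # time t'
seen_after_close = ClientFile(server, "notes.txt").read()
```

A second client that opens the file between the write and the close still sees the old contents, which is the window between t and t’ described above.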

  26. Cache validation • Validation check performed : • at file open • whenever server contacted to get new block • after timeout (3s for file blocks, 30s for directories) • Done for all files (even if not being shared). • Expensive! • Potentially, every 3 sec get file attributes. • If needed invalidate all blocks.
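Timestamp-based validation with the 3-second and 30-second timeouts can be sketched as follows. The `FakeServer` and `ClientCache` classes are illustrative assumptions, and time is passed in explicitly as `now` so the example is deterministic rather than depending on a real clock.

```python
FILE_TIMEOUT = 3.0   # seconds before a cached file block must be revalidated
DIR_TIMEOUT = 30.0   # seconds before a cached directory must be revalidated


class FakeServer:
    def __init__(self):
        self.data = {}
        self.mtime = {}

    def write(self, name, data, now):
        self.data[name] = data
        self.mtime[name] = now

    def getattr(self, name):
        return self.mtime[name]  # last-modified time on the server

    def read(self, name):
        return self.data[name]


class ClientCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}  # name -> [data, mtime, validated_at]

    def read(self, name, now, is_dir=False):
        timeout = DIR_TIMEOUT if is_dir else FILE_TIMEOUT
        entry = self.entries.get(name)
        if entry is not None:
            data, mtime, validated_at = entry
            if now - validated_at < timeout:
                return data              # trusted without asking the server
            if self.server.getattr(name) == mtime:
                entry[2] = now           # unchanged on server: revalidate
                return data
        # Cache miss or stale entry: fetch fresh attributes and data.
        mtime = self.server.getattr(name)
        data = self.server.read(name)
        self.entries[name] = [data, mtime, now]
        return data


server = FakeServer()
cache = ClientCache(server)

server.write("f", b"v1", now=0.0)
first = cache.read("f", now=0.0)
server.write("f", b"v2", now=1.0)   # another client updates the file
stale = cache.read("f", now=2.0)    # within 3 s: cached copy is returned
fresh = cache.read("f", now=4.0)    # timeout expired: change is detected
```

The read at `now=2.0` shows the inconsistency window the slides warn about: the client returns old data because its timeout has not yet expired.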

  27. Locking in NFS
      Operation   Description
      Lock        Creates a lock for a range of bytes (non-blocking)
      Lockt       Tests whether a conflicting lock has been granted
      Locku       Removes a lock from a range of bytes
      Renew       Renews the lease on a specified lock
      • NFS supports file locking • Applications can use locks to ensure consistency • Through version 3, locking was handled by a separate protocol (the Network Lock Manager) rather than by NFS itself • NFS v4 supports locking as part of the protocol (see the table above)
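The Lock, Lockt, and Locku operations can be illustrated with a toy byte-range lock table. This is a sketch only: it assumes exclusive locks, leaves out leases and Renew, and uses a hypothetical `LockTable` class that does not mirror the NFS v4 message formats.

```python
class LockTable:
    """Exclusive byte-range locks for a single file."""

    def __init__(self):
        self.locks = []  # list of (owner, start, end), end exclusive

    def _conflict(self, owner, start, end):
        for held in self.locks:
            other, s, e = held
            # Ranges overlap if each one starts before the other ends.
            if other != owner and s < end and start < e:
                return held
        return None

    def lock(self, owner, start, length):
        """Non-blocking: grant the lock or fail immediately."""
        end = start + length
        if self._conflict(owner, start, end) is not None:
            return False
        self.locks.append((owner, start, end))
        return True

    def lockt(self, owner, start, length):
        """Return the conflicting lock, if any, without taking a lock."""
        return self._conflict(owner, start, start + length)

    def locku(self, owner, start, length):
        """Remove a previously granted lock on exactly this range."""
        held = (owner, start, start + length)
        if held in self.locks:
            self.locks.remove(held)
            return True
        return False


table = LockTable()
a_granted = table.lock("A", 0, 100)
b_denied = table.lock("B", 50, 10)      # overlaps A's range
conflict = table.lockt("B", 50, 10)
released = table.locku("A", 0, 100)
b_granted = table.lock("B", 50, 10)
```

Because `lock` is non-blocking, a client that is refused has to retry later or test with `lockt` first.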

  28. NFS score card • Pros: • simple • highly portable • Cons: • Not Secure • Locking is not good • Sometimes inconsistent • Clients maintain 2 caches, one for file attributes (i-nodes) and one for file data. Caching can be nasty

  29. Summary • How do we make it fast? • Answer: caching, read-ahead • How do we make it reliable? What if a message is dropped? What if the server crashes? • Answer: client retransmits request until it receives a response. • How do we preserve file system semantics in the presence of failures and/or sharing by multiple clients? • Answer: well, we don’t, at least not completely.
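The "client retransmits until it receives a response" answer can be sketched with a simulated lossy channel. The names `LossyChannel` and `call_with_retry` are hypothetical, a dropped message is modelled as a `None` reply instead of a real timeout, and the attempt limit is an assumption added so the loop terminates.

```python
class LossyChannel:
    """Delivers requests to a handler but drops the first few."""

    def __init__(self, handler, drop_first):
        self.handler = handler
        self.drop_first = drop_first
        self.sent = 0

    def __call__(self, request):
        self.sent += 1
        if self.sent <= self.drop_first:
            return None  # message lost: the client sees no reply
        return self.handler(request)


def call_with_retry(send, request, max_attempts=10):
    """Resend the same request until a reply arrives."""
    for attempt in range(1, max_attempts + 1):
        reply = send(request)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after %d attempts" % max_attempts)


def server_handler(request):
    # Stateless and idempotent: the same request always gives the same reply.
    return request.upper()


channel = LossyChannel(server_handler, drop_first=2)
reply, attempts = call_with_retry(channel, "hello")
```

Blind retransmission is only safe because the server is stateless and the request is idempotent; a request whose effect accumulated on each delivery could not be retried this way.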

  30. Alternatives to NFS • Andrew File System - CMU, now IBM • Sprite • Coda • Distributed File System • Remote File System • Netware - Novell based file system
