
Caching in the Sprite Network File System
Scale and Performance in a Distributed File System

COMP 520

September 21, 2004


Agenda


The Sprite file system is functionally similar to UNIX

  • Read, write, open, and close calls provide access to files

  • Sprite communicates kernel-to-kernel

  • Remote-procedure-calls (RPC) allow kernels to talk to each other


Sprite uses caching on the client and server side

  • Two different caching mechanisms

    • Server workstations use caching to reduce delays caused by disk accesses

    • Client workstations use caching to minimize the number of calls made to non-local disks

[Diagram: file traffic flows between client caches and the server cache over the network; disk traffic flows between the server cache and the server's disk]
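To make the two cache levels concrete, here is a minimal sketch (hypothetical Python, not from the paper; ServerCache, ClientCache, and read_block are invented names) of a read that first checks the client's in-memory cache and only generates file traffic to the server on a miss, with the server in turn only generating disk traffic on a miss in its own cache:

    # Hypothetical sketch of Sprite's two cache levels on a read.
    class ServerCache:
        def __init__(self, disk):
            self.blocks = {}       # (file_id, block_index) -> bytes
            self.disk = disk       # backing store: same key -> bytes

        def read_block(self, file_id, index):
            key = (file_id, index)
            if key not in self.blocks:       # server cache miss -> disk traffic
                self.blocks[key] = self.disk[key]
            return self.blocks[key]

    class ClientCache:
        def __init__(self, server):
            self.blocks = {}       # client's main-memory cache
            self.server = server

        def read_block(self, file_id, index):
            key = (file_id, index)
            if key not in self.blocks:       # client cache miss -> file traffic (RPC)
                self.blocks[key] = self.server.read_block(file_id, index)
            return self.blocks[key]

    # Usage: a miss goes client cache -> (RPC) server cache -> disk.
    disk = {("f1", 0): b"hello world".ljust(4096, b"\0")}
    client = ClientCache(ServerCache(disk))
    print(client.read_block("f1", 0)[:11])   # first read: misses at both levels
    print(client.read_block("f1", 0)[:11])   # second read: served from client cache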


Three main issues are addressed by Sprite’s caching system

  • Should client caches be kept in main memory or on local disk?

  • What structure and addressing scheme should be used for caching?

  • What should happen when a block is written back to disk?


Sprite caches client data in main memory, not on local disk

  • Allows clients to be diskless

    • Cheaper

    • Quieter

  • Data access is faster

  • Physical memory is large enough

    • Provides a high hit ratio

    • Memory size will continue to grow

  • A single caching mechanism can be used for both client and server


A virtual addressing structure is used for caching

  • Data organized into blocks

    • 4 Kbytes

    • Virtually addressed

    • Unique file identifier and block index

    • Both client and server cache data blocks

  • Server also caches naming info.

    • Addressed using physical address

    • All naming operations (open, close, etc.) passed to the server

    • Cached file info. is lost if the server crashes

[Diagram: open, close, read, and write calls flow from the client cache to the server cache; both caches hold 4 Kbyte data blocks, and the server cache also holds management info addressed by physical address]
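As a rough illustration of the virtual addressing scheme (an assumption-laden sketch; blocks_for_range is an invented helper), a byte range of a file maps onto (file identifier, block index) cache keys without the client knowing anything about the disk layout:

    BLOCK_SIZE = 4096  # Sprite caches 4 Kbyte blocks

    def blocks_for_range(file_id, offset, length):
        """Map a byte range of a file onto virtually addressed cache keys."""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        return [(file_id, index) for index in range(first, last + 1)]

    # A 6000-byte read starting at offset 3000 touches blocks 0-2 of file 17.
    print(blocks_for_range(file_id=17, offset=3000, length=6000))
    # [(17, 0), (17, 1), (17, 2)]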


Sprite uses a delayed-write policy to write dirty blocks to disk

  • Every 30 seconds dirty blocks which have not been changed in the last 30 seconds are written to disk

  • Blocks written by a client are written to the server’s cache in 30-60 seconds and to the server’s disk in 30-60 more seconds

  • Limits server traffic

  • Minimizes the damage in a crash

[Diagram: a dirty block untouched for 30 seconds moves from the client cache to the server cache, and after another 30 seconds untouched there moves from the server cache to disk]
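A minimal sketch of the delayed-write policy under the 30-second rule above (the class and the flush_fn hook are hypothetical, not kernel code):

    import time

    AGE_THRESHOLD = 30               # a dirty block must be idle this long

    class DelayedWriteCache:
        def __init__(self, flush_fn):
            self.dirty = {}          # key -> (data, time of last modification)
            self.flush_fn = flush_fn # e.g. send to server cache, or write to disk

        def write_block(self, key, data):
            self.dirty[key] = (data, time.time())

        def sweep(self):
            """Run every 30 seconds: flush blocks untouched for 30 seconds."""
            now = time.time()
            for key, (data, modified) in list(self.dirty.items()):
                if now - modified >= AGE_THRESHOLD:
                    self.flush_fn(key, data)
                    del self.dirty[key]

    # On a client, flush_fn would push the block to the server's cache; on the
    # server, it would write the block to disk. Chaining the two gives the
    # 30-60 second client-to-server and server-to-disk delays described above.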


Agenda


Two unusual design optimizations differentiate the system and solve problems

  • Consistency guaranteed

    • All clients see the most recent version of a file

    • Provides transparency to the user

    • Concurrent and sequential write-sharing permitted

  • Cache size changes

    • Virtual memory system and file system negotiate over physical memory

    • Cache space reallocated dynamically


Concurrent write-sharing makes the file system more user friendly

  • A file is opened by multiple clients

  • At least one client has the file open for writing

  • Concurrent write-sharing occurs



Concurrent write-sharing can jeopardize file consistency

  • Server detects concurrent write-sharing

  • Server instructs client B to write all dirty blocks to memory

  • Server notifies all clients that file is no longer cacheable

  • Clients remove all cached blocks

  • All future access requests sent to server

  • Server serializes requests

  • The file becomes cacheable again once it is no longer open and undergoing write-sharing

[Diagram: the server notifies all clients that F1 is no longer cacheable; the writing client writes its dirty blocks back, and all subsequent requests for F1 go through the server]
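The cache-disabling protocol above might look roughly like this on the server side (a simplified, hypothetical sketch; flush_dirty_blocks and invalidate_cache stand in for the real client RPCs):

    class SpriteServer:
        def __init__(self):
            self.open_files = {}     # file_id -> {"writers": set(), "readers": set()}
            self.uncacheable = set() # files currently under concurrent write-sharing

        def handle_open(self, file_id, client, for_write):
            state = self.open_files.setdefault(file_id, {"writers": set(), "readers": set()})
            (state["writers"] if for_write else state["readers"]).add(client)
            clients = state["writers"] | state["readers"]
            # Concurrent write-sharing: open by more than one client, at least one writing.
            if state["writers"] and len(clients) > 1 and file_id not in self.uncacheable:
                for writer in state["writers"]:
                    writer.flush_dirty_blocks(file_id)   # dirty blocks back to the server
                for c in clients:
                    c.invalidate_cache(file_id)          # clients drop cached blocks
                self.uncacheable.add(file_id)            # future accesses come to the server

        def handle_close(self, file_id, client):
            state = self.open_files[file_id]
            state["writers"].discard(client)
            state["readers"].discard(client)
            # Simplification: re-enable caching once write-sharing has ended.
            if not state["writers"] or len(state["writers"] | state["readers"]) <= 1:
                self.uncacheable.discard(file_id)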


Sequential write-sharing provides transparency, but not without risks

[Diagram: Client A has modified File 1 to v2 while the server and Clients B and C still hold v1]

  • Sequential write-sharing

    • Occurs when a file is modified by a client, closed, then opened by a second client

    • Clients are always guaranteed to see the most recent version of the file



Sequential write-sharing: Problem 1

  • Problem:

    • Client A modifies a file

    • Client A closes the file

    • Client B opens the file using out-of-date cached blocks

    • Client B has an out-of-date version of the file

  • Solution: version numbers

[Diagram: Client A closes F1 after writing v2; Client B then opens F1 while still holding v1 blocks in its cache]
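A sketch of the version-number fix (hypothetical names; get_version, invalidate, and the versions map are assumptions): on open, the client compares the server's current version number against the one its cached blocks came from and drops them if they differ.

    def open_file(client_cache, server, file_id):
        """Discard stale cached blocks by comparing version numbers on open."""
        current_version = server.get_version(file_id)    # returned with the open reply
        cached_version = client_cache.versions.get(file_id)
        if cached_version is not None and cached_version != current_version:
            client_cache.invalidate(file_id)             # out-of-date blocks dropped
        client_cache.versions[file_id] = current_version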


Sequential write-sharing: Problem 2

  • Problem: The last client to write to a file did not flush the dirty blocks

  • Solution:

    • Server keeps track of last writer

    • Only last writer allowed to have dirty blocks

    • When server receives open request, notifies last writer

    • Writer writes any dirty blocks to server

  • Ensures reader will receive up-to-date info.

[Diagram: on an open request for F1, the server notifies the last writer, which writes its dirty blocks back to the server before the open completes]
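A minimal sketch of the last-writer mechanism (hypothetical; the method names and flush_dirty_blocks are invented):

    class Server:
        def __init__(self):
            self.last_writer = {}    # file_id -> client that may still hold dirty blocks

        def handle_close_after_write(self, file_id, client):
            self.last_writer[file_id] = client           # only the last writer keeps dirty blocks

        def handle_open(self, file_id, opening_client):
            writer = self.last_writer.get(file_id)
            if writer is not None and writer is not opening_client:
                writer.flush_dirty_blocks(file_id)       # dirty data reaches the server first
            # ... proceed with the open; the reader now sees up-to-date data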


Cache consistency does increase server traffic

  • Server traffic was reduced by over 70% due to client caching

  • 25% of all traffic is a result of cache consistency

  • Table 2

    • Gives an upper bound on cache consistency algorithms

    • Unrealistic since incorrect results occurred


Dynamic cache allocation also sets Sprite apart

  • Virtual memory and the file system battle over main memory

  • Both modules keep a time-of-last access

  • Each compares its oldest page with the other module’s oldest page

  • Whichever page is older is recycled

[Diagram: virtual memory keeps pages in approximate LRU order using a clock algorithm; the file system keeps blocks in perfect LRU order by tracking read and write calls; both compete for pages and blocks of main memory]
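A rough sketch of the negotiation (hypothetical code; the real modules track ages differently, as the diagram above notes): whichever module holds the older "oldest page" gives it up.

    import time

    class Module:
        """Stand-in for either the virtual memory system or the file cache."""
        def __init__(self, name):
            self.name = name
            self.pages = {}          # page_id -> time of last access

        def oldest(self):
            return min(self.pages.values(), default=time.time())

        def evict_oldest(self):
            victim = min(self.pages, key=self.pages.get)
            del self.pages[victim]
            return victim

    def grow(requester, other):
        """When `requester` needs a page, recycle the older of the two oldest pages."""
        if requester.oldest() <= other.oldest():
            return requester.evict_oldest()      # reuse one of its own pages
        return other.evict_oldest()              # take a page away from the other module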


Negotiations could cause double-caching

  • Problem:

    • Pages being read from backing files could wind up in both the file cache and the virtual memory cache

    • Could force a page eliminated from the virtual memory pool to be moved to the file cache

    • The page would then have to wait another 30 seconds to be sent to the server

  • Solution:

    • When writing and reading backing files, virtual memory skips the local file cache

[Diagram: the same page (Page A) ends up in both the virtual memory pool and the file cache within main memory]


Multi-block pages create problems in shared caching

  • Problem:

    • Virtual memory pages are big enough to hold multiple file blocks

    • Which block’s age should be used to represent the LRU time of the page?

    • What should be done with the other blocks once one is relinquished?

  • Solution:

    • The age of the page is the age of the youngest block

    • All blocks in a page are removed together

[Diagram: virtual memory pages and file cache blocks in main memory, with per-block LRU timestamps (2:15, 2:16, 2:19, 3:05, 4:30, 4:31)]
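A one-line illustration of the chosen policy (hypothetical helper; the times are the minutes shown in the diagram): the page's age is that of its youngest block, and recycling removes the whole page.

    def page_age(block_access_times):
        """A page's LRU time is that of its most recently used (youngest) block."""
        return max(block_access_times)

    # A page holding blocks last touched at 2:15, 2:16, and 2:19 (expressed as
    # minutes past the hour) is treated as last used at 2:19; if it is
    # recycled, all three blocks are removed together.
    print(page_age([135, 136, 139]))   # -> 139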


Agenda


Micro-benchmarks show reading from a server cache is fast

  • An upper limit on remote file access costs

  • Two important results:

    • A client can access his own cache 6-8 times faster than he can access the server’s cache

    • A client is able to write and read from the server’s cache about as quickly as he can from a local disk


Macro-benchmarks indicate disks and caching together run fastest

  • With a warm start and client caching, diskless machines were only up to 12% worse than machines with disks

  • Without caching, machines were 10-50% slower


Agenda


Andrew’s caching is notably different from Sprite’s

[Diagram: in Sprite, both the client cache and the server cache live in memory and hold data blocks, with the server also holding naming and management info; open, close, read, and write calls go to the server. In Andrew, Venus on the client caches status info in memory and whole data files on local disk, while Vice servers hold data files, status info, and naming information; only open and close involve the server.]

  • Vice: a group of trusted servers

    • Stores data and status information in separate files

    • Has a directory hierarchy

  • Venus: A user-level process on each client workstation

    • Status cache: stored in virtual memory for quick status checks

    • Data cache: stored on local disk


…the pathname conventions are also very different

  • Two level naming

    • Each Vice file or directory identified by a unique fid

    • Venus maps Vice pathnames to fids

    • Servers see only fids

  • Each fid has 3 parts and is 96 bits long

    • 32-bit Volume number

    • 32-bit Vnode number (index into the Volume)

    • 32-bit Uniquifier (guarantees no fid is ever used twice)

    • Contains no location information

  • Volume locations are maintained in a Volume Location Database found on each server

[Diagram: a fid consists of a 32-bit Volume number, a 32-bit Vnode number, and a 32-bit Uniquifier]
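A small sketch of the 96-bit fid layout (the packing code is illustrative, not AFS source):

    import struct

    def pack_fid(volume, vnode, uniquifier):
        """Pack the three 32-bit fields of an Andrew fid into 12 bytes (96 bits)."""
        return struct.pack(">III", volume, vnode, uniquifier)

    def unpack_fid(fid_bytes):
        volume, vnode, uniquifier = struct.unpack(">III", fid_bytes)
        return volume, vnode, uniquifier

    fid = pack_fid(volume=7, vnode=42, uniquifier=1)
    print(len(fid) * 8, unpack_fid(fid))   # 96 (7, 42, 1)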


Andrew uses a write-on-close convention

Sprite: delayed-write policy

  • Changes written back every 30 seconds

  • Prevents writing changes that are quickly erased

  • Decreases damage in the event of a crash

  • Rations network traffic

Andrew: write-on-close policy

  • Write changes are visible to the network only after the file is closed

  • Caching on local disk, not main memory

  • Little information lost in a crash

  • The network will not see a file in the event of a client crash

  • 75% of files are open less than 0.5 seconds; 90% less than 10 seconds

  • Could result in higher server traffic

  • Delays the closing process
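A hypothetical sketch of write-on-close from the client's point of view (VenusFile and vice.store are invented names): writes touch only the locally cached copy, and the server sees the file once, at close.

    class VenusFile:
        """Sketch of Andrew's write-on-close behaviour (illustrative only)."""
        def __init__(self, fid, vice):
            self.fid = fid
            self.vice = vice
            self.local_copy = bytearray()   # whole file cached on the local disk
            self.dirty = False

        def write(self, offset, data):
            # Writes only touch the local copy; the network sees nothing yet.
            end = offset + len(data)
            if end > len(self.local_copy):
                self.local_copy.extend(b"\0" * (end - len(self.local_copy)))
            self.local_copy[offset:end] = data
            self.dirty = True

        def close(self):
            # Only now is the modified file shipped back to the Vice server, so
            # a client crash before close silently loses the changes.
            if self.dirty:
                self.vice.store(self.fid, bytes(self.local_copy))
                self.dirty = False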


Sequential consistency is guaranteed in Andrew and Sprite

[Diagram: Client A has modified File 1 to v2 while the server and Clients B and C still hold v1]

  • Clients are guaranteed to see the latest version of a file

    • Venus assumes that cached entries are valid

    • Server maintains Callbacks to cached entries

    • Server notifies callbacks before allowing a file to be modified

    • Server has the ability to break callbacks to reclaim storage

    • Reduces server utilization since communication occurs only when a file is changed

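A hypothetical sketch of the callback idea (ViceServer, fetch, store, and break_callback are invented names): the server remembers which clients hold a cached copy and notifies them before a new version becomes visible.

    class ViceServer:
        def __init__(self):
            self.callbacks = {}      # fid -> set of Venus clients holding a callback

        def fetch(self, fid, venus):
            # Promise to notify this client if the file changes, then return it.
            self.callbacks.setdefault(fid, set()).add(venus)

        def store(self, fid, writer):
            # Before the new version becomes visible, break callbacks so other
            # clients stop trusting their cached copies.
            for venus in self.callbacks.get(fid, set()) - {writer}:
                venus.break_callback(fid)
            self.callbacks[fid] = {writer}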


Concurrent write-sharing consistency is not guaranteed

  • Different workstations can perform the same operation on a file at the same time

  • No implicit file locking

  • Applications must coordinate if synchronization is an issue

[Diagram: multiple clients have File 1 open at the same time for reading and writing]


Comparison to other systems

  • With one client, Sprite is around 30% faster than NFS and 35% faster than Andrew

  • Andrew has the greatest scalability

    • Each client in Andrew utilized about 2.4% of the CPU

    • 5.4% in Sprite

    • 20% in NFS


Summary: Sprite vs. Andrew


Conclusion: Both file systems have benefits and drawbacks

Sprite

  • Benefits:

    • Guarantees sequential and concurrent consistency

    • Faster runtime with a single client due to memory caching and kernel-to-kernel communication

    • Files can be cached in blocks

  • Drawbacks:

    • Lacks the scalability of Andrew

    • Writing every 30 seconds could result in lost data

    • Fewer files can be cached in main memory than on disk

Andrew

  • Benefits:

    • Better scalability, due in part to shifting path lookup to the client

    • Transferring entire files reduces communication with the server; no read and write calls

    • Tracking entire files is easier than tracking individual pages

  • Drawbacks:

    • Lacks concurrent write-sharing consistency guarantees

    • Caching to disk slows runtime

    • Files larger than the disk cannot be cached


Backup


Cache consistency does increase server traffic

  • Server traffic was reduced by over 70% due to client caching

  • 25% of all traffic is a result of cache consistency

  • Table 2

    • Gives an upper bound on cache consistency algorithms

    • Unrealistic since incorrect results occurred


Micro-benchmarks show reading from a server cache is fast

Maximum read and write rates in various places

  • Give an upper limit on remote file access costs

  • Two important results:

    • A client can access his own cache 6-8 times faster than he can access the server’s cache

    • A client is able to write and read from the server’s cache about as quickly as he can from a local disk


Macro-benchmarks indicate disks and caching together run fastest

  • With a warm start and client caching, diskless machines were only up to 12% worse than machines with disks

  • Without caching, machines were 10-50% slower

Top number: time in seconds

Bottom number: normalized time


Status info. on Andrew and Sprite

Sprite mgmt cache contains:

  • File maps

  • Disk management info.


Volumes in Andrew

  • Volume

    • A collection of files

    • Forms a partial subtree in Vice name space

  • Volumes joined at Mount Points

  • Resides in a single disk partition

  • Can be moved from server to server easily for load balancing

  • Enables quotas and backup


Sprite caching improves speed and reduces overhead

  • Client side caching enables diskless workstations

    • Caching on diskless workstations improves runtime by 10-40%

    • Diskless workstations with caching are only 0-12% slower than workstations with disks

  • Caching on the server and client side results in overall system improvement

    • Server utilization is reduced from 5-18% to 1-9% per active client

    • File intensive benchmarking was completed 30-35% faster on Sprite than on other systems

