COSC 6360 Review Session

vilina

Presentation Transcript


  1. COSC 6360 Review Session December 2004

  3. About the second program • It is quite easy to get the timestamps of a file: • System call stat(…) returns a data structure containing all of them • Data structure defined in <sys/stat.h>

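  4. Example

       #include <stdio.h>
       #include <sys/stat.h>
       #include <sys/types.h>

       int main(int argc, char *argv[]) {
           int mymode;
           struct stat statbuf;

           /* stat() returns -1 on failure, e.g. when the file does not exist */
           if (argc < 2 || stat(argv[1], &statbuf) == -1)
               return 1;
           mymode = statbuf.st_mode & 0777;  /* keep the nine permission bits */
           printf("File mode is %o\n", mymode);
           return 0;
       } /* main */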
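
The example above prints the file mode, while the previous slide is about timestamps. A minimal sketch of reading the three timestamps from the same `struct stat` might look as follows. The helper name `get_file_times` is hypothetical and not part of the course material; `st_atime`, `st_mtime` and `st_ctime` are the standard POSIX fields.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper (not from the slides): copies the three timestamps
 * that stat() reports for path. Returns 0 on success, -1 if stat() fails. */
int get_file_times(const char *path, time_t *accessed,
                   time_t *modified, time_t *changed)
{
    struct stat statbuf;

    if (stat(path, &statbuf) == -1)
        return -1;
    *accessed = statbuf.st_atime;  /* last access */
    *modified = statbuf.st_mtime;  /* last modification of the data */
    *changed  = statbuf.st_ctime;  /* last change of the i-node (status) */
    return 0;
}
```

A caller could pass each returned `time_t` to `ctime()` to print it in readable form.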

  5. File Systems LFS, Journaling file systems and Soft Updates all address the issue of metadata updates

  6. What is the metadata update problem?

  7. Metadata updates (I) • UNIX file system depends on its delayed write policy to achieve good performance • Writes are recorded in a shared I/O buffer • Recorded on disk when • File blocks are expelled from I/O buffer or • After a few seconds • User and FS have no control over timing or order of these writes

  8. Metadata updates (II) • Delayed writes are not acceptable for metadata updates • A system crash would leave the file system in an inconsistent state

  9. First Example • Directory block contains new entry pointing to a new i-node • Cannot write directory block to disk before the i-node block it points to [Figure: I/O buffer holding a directory block that points to an i-node block]

  10. Otherwise • Crash wipes entire contents of I/O buffer [Figure: on disk, the new directory block points to an i-node block that was never written (?)]

  11. Second Example • We removed from directory block an entry pointing to an i-node that is being recycled • Must write directory block to disk before the i-node block it points to [Figure: I/O buffer holding a directory block with a deleted entry (X) and the i-node block that entry pointed to]

  12. Otherwise • Crash wipes entire contents of I/O buffer [Figure: on disk, the old directory block still holds the entry, which points to an i-node block in an unknown state (?)]

  13. Solutions (I) • Use NVRAM cache • Costly • Require all metadata updates to be written synchronously in the right order • Solution of UNIX FFS • Makes inefficient use of disk bandwidth • Do all the writes on a log (LFS) • Makes much better use of disk bandwidth

  14. Solutions (II) • Record all metadata updates on a sequential log before writing them at their regular location • Journaling file systems • Requires additional writes to the log/journal • These writes can be buffered or non-buffered • Ensure that I/O buffer blocks are written back to disk in the right order • Soft updates • Not always possible
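
The journaling idea above can be illustrated with a toy model: every metadata update is first appended to a sequential journal, and the in-place writes can happen later because recovery simply replays the journal. This is only a sketch under simplifying assumptions (integer "blocks", a fixed-size in-memory journal); the names `journaled_fs`, `journal_append` and `journal_replay` are hypothetical and do not come from any real file system.

```c
enum { NBLOCKS = 4, JOURNAL_CAP = 8 };

struct journal_record { int block; int value; };

struct journaled_fs {
    int disk[NBLOCKS];                          /* regular on-disk locations */
    struct journal_record journal[JOURNAL_CAP]; /* sequential log of updates */
    int journal_len;
};

/* Record an update in the journal only; the regular location is untouched.
 * Returns 0 on success, -1 if the block is invalid or the journal is full. */
int journal_append(struct journaled_fs *fs, int block, int value)
{
    if (block < 0 || block >= NBLOCKS || fs->journal_len >= JOURNAL_CAP)
        return -1;
    fs->journal[fs->journal_len].block = block;
    fs->journal[fs->journal_len].value = value;
    fs->journal_len++;
    return 0;
}

/* After a crash, reapply every journaled update in log order. */
void journal_replay(struct journaled_fs *fs)
{
    for (int i = 0; i < fs->journal_len; i++)
        fs->disk[fs->journal[i].block] = fs->journal[i].value;
}
```

Because replay follows the order of the log, the order in which the regular locations are eventually written no longer matters for consistency.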

  15. The trouble with Soft Updates • Same directory block contains a new entry and an old entry being deleted • Both entries point to the same i-node block • We have a circular dependency [Figure: I/O buffer holding a directory block with one entry crossed out (X) and the i-node block both entries point to]

  16. Soft Update Solution • Do it in three steps • Write directory block with old entry deleted but without new entry • Write i-node block with new i-node and old i-node already recycled (no directory entry points to it) • Write directory block with old entry deleted and new entry pointing to new i-node • Disk always remains in a consistent state
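
The three steps above can be checked with a small simulation. The sketch below assumes a simplified consistency rule (a directory entry on disk must never point to an i-node that is not valid on disk) and models each block with two flags. All type and function names are hypothetical.

```c
#include <stdbool.h>

struct dir_block   { bool old_entry; bool new_entry; };  /* entries present on disk */
struct inode_block { bool old_inode; bool new_inode; };  /* i-nodes valid on disk */
struct disk_state  { struct dir_block dir; struct inode_block ino; };

/* Assumed invariant: every directory entry points to a valid i-node. */
bool disk_consistent(const struct disk_state *d)
{
    if (d->dir.old_entry && !d->ino.old_inode) return false;
    if (d->dir.new_entry && !d->ino.new_inode) return false;
    return true;
}

/* Performs the three writes in order and returns how many of the three
 * resulting disk states satisfy the invariant (3 means always consistent). */
int soft_update_three_steps(struct disk_state *d)
{
    int consistent_steps = 0;

    /* Step 1: directory block with old entry deleted, new entry held back. */
    d->dir = (struct dir_block){ .old_entry = false, .new_entry = false };
    consistent_steps += disk_consistent(d);

    /* Step 2: i-node block with new i-node written and old i-node recycled. */
    d->ino = (struct inode_block){ .old_inode = false, .new_inode = true };
    consistent_steps += disk_consistent(d);

    /* Step 3: directory block again, now including the new entry. */
    d->dir = (struct dir_block){ .old_entry = false, .new_entry = true };
    consistent_steps += disk_consistent(d);

    return consistent_steps;
}
```

Writing either block in full first breaks the invariant in this model, which is the circular dependency described on the previous slide.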

  17. Which data structures are used by LFS?

  18. Key considerations • We can assume that most reads will be completed without disk access • I-node tables now fragmented into multiple blocks

  19. On-Disk Data Structures • Log • Contains data blocks, i-node blocks, blocks of i-node map, segment summaries and directory change log • Checkpoint area • Contains • Address of end of log at checkpoint time • Addresses of all i-node map blocks at checkpoint time

  20. In-Memory Data Structures • I/O Buffer • Also contains recently accessed i-node map blocks

  21. Finding the data (I) • This means • Finding the i-node • Locating the data blocks • If data blocks are in I/O buffer, we are done • Otherwise check whether i-node is cached • Can reasonably hope that at least the required i-node block is cached

  22. Finding the data (II) • When nothing can be found in main memory • Go to checkpoint area • Find there address of i-node map blocks • Locate i-node of file • Locate data blocks • After a crash we may have to look up the portion of the log after the last checkpoint to locate new i-node map blocks, new i-node blocks and new data blocks
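
The lookup chain above (checkpoint area, then i-node map, then i-node, then data block) can be sketched as a series of indirections into the log. This toy model makes several assumptions that are not in the slides: a single i-node map block, one data block per file, and a log small enough to be an array. The names `log_block`, `checkpoint` and `lfs_read` are hypothetical.

```c
enum { MAX_FILES = 4, NO_ADDR = -1 };

/* One log block; which field is meaningful depends on what the block holds. */
struct log_block {
    int  inode_addr[MAX_FILES]; /* i-node map block: i-number -> log address */
    int  data_addr;             /* i-node block: log address of the data block */
    char data;                  /* data block: the file contents (one byte here) */
};

/* Checkpoint area, reduced to the address of the single i-node map block. */
struct checkpoint { int imap_addr; };

static int valid_addr(int addr, int log_len)
{
    return addr >= 0 && addr < log_len;
}

/* Cold lookup with nothing cached. Returns 0 and stores the byte in *out,
 * or -1 if the file or any block along the chain cannot be found. */
int lfs_read(const struct log_block *lg, int log_len,
             const struct checkpoint *cp, int inumber, char *out)
{
    if (inumber < 0 || inumber >= MAX_FILES || !valid_addr(cp->imap_addr, log_len))
        return -1;

    int inode_at = lg[cp->imap_addr].inode_addr[inumber]; /* i-node map lookup */
    if (!valid_addr(inode_at, log_len))
        return -1;

    int data_at = lg[inode_at].data_addr;                 /* i-node lookup */
    if (!valid_addr(data_at, log_len))
        return -1;

    *out = lg[data_at].data;                              /* data block */
    return 0;
}
```

The sketch does not model the roll-forward step after a crash, where the portion of the log written after the last checkpoint is scanned for newer blocks.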

  23. What should I know about segment cleaning?

  25. Segment Cleaning • Key idea is to group files of equal age into the same segments in order to have • Stable segments that will be rarely cleaned • Segments whose contents change very quickly • Their data age very quickly • These segments will return a lot of free space whenever they are cleaned

  26. What about Elephant?

  27. Elephant • Key idea is defining which versions to preserve • Two objectives • Being able to undo recent mistakes • Being able to retrieve old versions of a file • Solution • Keep the complete history of a file over a short period of time (one hour to one week) • Keep forever landmark versions of each file

  28. Example [Figure: timeline showing the complete history of a file (seven versions) and, below it, the versions Elephant keeps (five): two landmark versions and all recent versions]
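
The retention rule above can be written as a simple predicate: a version survives if it is a landmark or if it is recent enough. The sketch below takes the landmark flag as given, since the slides do not say how landmarks are chosen; the names `file_version`, `elephant_keeps` and `elephant_prune` are hypothetical.

```c
#include <stdbool.h>

struct file_version {
    long created;   /* creation time, in arbitrary time units */
    bool landmark;  /* assumed to be set by some external policy */
};

/* A version is kept if it is a landmark or falls inside the recent window. */
bool elephant_keeps(const struct file_version *v, long now, long window)
{
    return v->landmark || (now - v->created) <= window;
}

/* Compacts the array in place, preserving order; returns versions kept. */
int elephant_prune(struct file_version *v, int n, long now, long window)
{
    int kept = 0;

    for (int i = 0; i < n; i++)
        if (elephant_keeps(&v[i], now, window))
            v[kept++] = v[i];
    return kept;
}
```

With seven versions of which two old ones are landmarks and three are recent, pruning leaves five versions.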

  29. Distributed File Systems Let us look first at NFS and Coda

  30. What makes a server stateless?

  31. Stateless server • Keeps no track of previous user requests

  32. Advantages • Robustness • Server can reboot after any crash • Simplicity of design

  33. Disadvantages • Inefficient consistency control • Neither server nor client knows whether other workstations access a given file • Must always assume risk of shared access even though shared access is infrequent • Requires a write-through policy at server

  34. Reintroducing state • Some state information does not need to be saved in stable storage: • Temporary data that would expire before the server can be rebooted • Leases, callbacks • Data that the client keeps in stable storage • Safe asynchronous writes allow NFS server to delay writes until client commits them

  35. What is close to open consistency?

  36. Close to open consistency • Guarantees that every process opening a file will see all the changes brought to the file by the last process that closed the file • Processes must • Check at open time whether they have the most recent version of the file known to server • Propagate all their changes to the server when they close the file
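
The two obligations above (check at open, propagate at close) can be sketched with a version counter on the server. This is a toy model with a single integer as file contents, not the NFS or Coda protocol; all names are hypothetical.

```c
struct server_file  { int version; int contents; };
struct client_cache { int valid; int dirty; int version; int contents; };

/* Open: compare the cached version with the server's and refetch if stale.
 * Returns 1 if the file had to be fetched, 0 if the cached copy was current. */
int client_open(struct client_cache *c, const struct server_file *s)
{
    if (c->valid && c->version == s->version)
        return 0;
    c->contents = s->contents;
    c->version = s->version;
    c->valid = 1;
    c->dirty = 0;
    return 1;
}

/* Write: changes stay in the client cache until close. */
void client_write(struct client_cache *c, int value)
{
    c->contents = value;
    c->dirty = 1;
}

/* Close: propagate any changes to the server, creating a new version. */
void client_close(struct client_cache *c, struct server_file *s)
{
    if (!c->dirty)
        return;
    s->contents = c->contents;
    s->version++;
    c->version = s->version;
    c->dirty = 0;
}
```

A second client that opened the file before the writer closed it keeps seeing the old contents until its next open, which is exactly what the guarantee allows.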

  37. What are callbacks?

  38. Callbacks • Are promises made by the server to notify a client when it receives a new version of the file from any other client • A client having a callback on a file does not need to check with the server whether it has the most recent version of the file known to server • Notifications can be lost • Clients must periodically check the validity of their callbacks

  39. Why do we have callbacks?

  40. Callbacks • Reduce the server workload whenever most files are not shared • The server bets that it will never have to break the callback • “Do not call us, we will call you!”
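
A callback can be modeled as a per-client flag held by the server. The sketch below is a toy model under stated assumptions: a fixed number of clients, notifications that are never lost, and a writer whose own callback state is left unchanged. None of the names come from a real system.

```c
enum { MAX_CLIENTS = 3 };

struct cb_server { int version; int callback[MAX_CLIENTS]; };
struct cb_client { int id; int version; int has_callback; };

/* Fetch: the client gets the current version and a callback promise. */
void cb_fetch(struct cb_server *s, struct cb_client *c)
{
    c->version = s->version;
    c->has_callback = 1;
    s->callback[c->id] = 1;
}

/* Open: no server contact is needed while the callback is intact.
 * Returns 1 if the server had to be contacted, 0 otherwise. */
int cb_open(struct cb_server *s, struct cb_client *c)
{
    if (c->has_callback)
        return 0;
    cb_fetch(s, c);
    return 1;
}

/* Store: the server accepts a new version from the writer and breaks the
 * callbacks of all other clients. clients[] is indexed by client id.
 * Returns the number of callbacks broken. */
int cb_store(struct cb_server *s, struct cb_client *clients, int writer)
{
    int broken = 0;

    s->version++;
    clients[writer].version = s->version;
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (i != writer && s->callback[i]) {
            s->callback[i] = 0;
            clients[i].has_callback = 0;  /* the notification */
            broken++;
        }
    }
    return broken;
}
```

When files are rarely shared, `cb_store` breaks few callbacks and most opens return without contacting the server, which is the workload reduction the slide describes.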

  41. What is the NFS model of consistency?

  42. NFS • Clients • Frequently check the validity of the data blocks in their cache • Frequently send to the server the new values of the blocks they have modified • Client and server also enforce close to open consistency

  43. How does Coda detect inconsistencies?

  44. Coda • Each replica has • ID of last store (LSID) • A current version vector (CVV) with • The version number of the replica • Conservative estimates of the version numbers of the other replicas

  45. Example • Three copies • A: LSID = 33345, v = 4, CVV = (4 4 3) • B: LSID = 33345, v = 4, CVV = (4 4 3) • C: LSID = 2235, v = 3, CVV = (3 3 3)

  46. Coda (II) • Coda compares the states of replicas by comparing their LSID’s and CVV’s • Four outcomes are possible • Strong equality: same LSID’s and same CVV’s • Everything is fine

  47. Coda (III) • Weak equality: Same LSID’s and different CVV’s • Happens when one site was never notified that the other was updated • Must fix CVV’s

  48. Coda (IV) • Dominance/Submission: LSID’s are different and every element of the CVV of a replica is greater than or equal to the corresponding element of the CVV of the other replica • Example: two replicas A and B with CVVA = (4 3) and CVVB = (3 3) • A dominates B • B is dominated by A • A has the most recent version of the file

  49. Coda (V) • Inconsistency: LSID’s are different and some elements of the CVV of a replica are greater than the corresponding elements of the CVV of the other replica but others are smaller • Example: two replicas A and B with CVVA = (4 2) and CVVB = (2 3) • A and B are inconsistent • Must fix inconsistency before allowing access to the file
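
The four outcomes above follow mechanically from the LSID's and an element-by-element comparison of the two CVV's. The sketch below is one way to write that comparison; the enum and function names are hypothetical and this is not Coda's actual code.

```c
#include <stddef.h>

enum coda_outcome {
    STRONG_EQUALITY, WEAK_EQUALITY, DOMINATES, DOMINATED, INCONSISTENT
};

/* Compares replica A with replica B, given their LSID's and CVV's of
 * length n. DOMINATES means A dominates B; DOMINATED means the reverse. */
enum coda_outcome coda_compare(long lsid_a, const int *cvv_a,
                               long lsid_b, const int *cvv_b, size_t n)
{
    int a_ge_b = 1;  /* every element of A's CVV >= B's */
    int b_ge_a = 1;  /* every element of B's CVV >= A's */

    for (size_t i = 0; i < n; i++) {
        if (cvv_a[i] < cvv_b[i]) a_ge_b = 0;
        if (cvv_b[i] < cvv_a[i]) b_ge_a = 0;
    }

    if (lsid_a == lsid_b)
        return (a_ge_b && b_ge_a) ? STRONG_EQUALITY : WEAK_EQUALITY;

    /* Different LSID's. Identical CVV's also land in the first branch
     * under the slide's wording; the slides do not discuss that case. */
    if (a_ge_b) return DOMINATES;
    if (b_ge_a) return DOMINATED;
    return INCONSISTENT;
}
```

Applied to the three copies of the earlier example, A and B compare as strongly equal and A dominates C.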

  50. What is the key idea in the LBFS paper?
