
Using Declarative Invariants for Protecting File-System Integrity

By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown, University of Toronto.


Presentation Transcript


  1. Using Declarative Invariants for Protecting File-System Integrity By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown, University of Toronto

  2. Motivation • File systems have bugs • Cause corruption and/or data loss • Existing reliability techniques, e.g., journaling, RAID, don’t help • Existing recovery solutions • Restore from backup, but slow, risks loss of data • Offline checker (i.e. fsck), but too slow • Can we do better?

  3. Possible Alternatives? • Eliminate Bugs • Static Analysis • Does not scale • Bugs may be input dependent • Tolerate Bugs • N-version programming (e.g., Envyfs) • High overheads (performance and storage) • Can only check features common to all versions • Micro reboot of file system (e.g. Membrane) • Requires detectable failures • Many corruption bugs are fail-silent

  4. Our Approach • Verify correctness at runtime • Benefit: Make silent failures detectable • What are we checking? • Same thing fsck checks • When are we checking for consistency? • When file system claims to be consistent • Leverage transactions provided for crash recovery • I.e. check at commit time • How are we doing the checks? • Convert global fsck checks into local checks on transactions

  5. No Really… How? • File systems have consistency properties • E.g. all in-use data blocks are marked in the block allocation bitmap • This is a global property that fsck checks • For each property, we derive an invariant • Invariant must hold in any transaction to preserve the corresponding property • E.g. when Block N is allocated, bit N must be set in the allocation bitmap, in the same transaction • Invariants operate on changes in transactions, requiring local checks

  6. Recon Data Flow • The focus of work… • Change records encode the updates in a transaction [Figure: data flow from the Write Cache (modified block) and the Read Cache (original block) through the Logical Difference Engine to change records, which the Invariant Checker tests against the invariants to flag a violation]

  7. Change Record • Example: write 8 bytes of data into an empty file with inode #1 [Table of change records not preserved; its annotations read: block 7 allocated, direct block, bit set]
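The "bit set" record in the example above could come from a logical diff of the allocation bitmap. The following is a minimal sketch of that idea, not Recon's implementation: the `Change` layout, the `diff_bitmap` helper, and the least-significant-bit-first numbering are all assumptions, since the slides do not give the record format.

```python
from typing import NamedTuple


class Change(NamedTuple):
    # Hypothetical record layout; the slides do not specify the real fields.
    block_type: str
    bit: int   # bit index within the bitmap
    old: int
    new: int


def diff_bitmap(old: bytes, new: bytes, block_type: str = "b_freemap") -> list[Change]:
    """Emit one change record per bit that differs between two block images."""
    if len(old) != len(new):
        raise ValueError("block images must be the same size")
    changes = []
    for byte_idx, (o, n) in enumerate(zip(old, new)):
        delta = o ^ n  # bits that differ in this byte
        for bit in range(8):
            if delta & (1 << bit):
                changes.append(
                    Change(block_type, byte_idx * 8 + bit, (o >> bit) & 1, (n >> bit) & 1)
                )
    return changes


original = bytes([0b00000000])   # read-cache image: nothing allocated
modified = bytes([0b10000000])   # write-cache image: bit 7 set (block 7 allocated)
records = diff_bitmap(original, modified)
```

Diffing at the level of fields rather than raw bytes is what lets an invariant talk about "bit N was set" instead of "byte 0 changed".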

  8. How to Express Invariants?

  9. Datalog Invariant Checking • Change records are trivially converted to Datalog facts
     R1_violation(IN, BN) :- block_allocated(IN, BN), not(change(b_freemap, _, _, BN, _, 1)).
     R2_violation(BN) :- change(b_freemap, _, _, BN, _, 1), not(block_allocated(_, BN)).
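For readers less familiar with Datalog, the two rules can be read as set operations over the facts of one transaction. This is only an illustrative Python rendering: `block_allocated` is assumed to be a set of (inode, block) pairs, and `bits_set` an assumed set of block numbers whose bitmap bit changed to 1.

```python
def r1_violations(block_allocated, bits_set):
    """R1: a block was allocated to an inode, but its bitmap bit was not set."""
    return {(ino, bn) for (ino, bn) in block_allocated if bn not in bits_set}


def r2_violations(block_allocated, bits_set):
    """R2: a bitmap bit was set, but no inode was given that block."""
    allocated = {bn for (_, bn) in block_allocated}
    return {bn for bn in bits_set if bn not in allocated}


# A consistent transaction: inode 1 gets block 7 and bit 7 is set.
ok_r1 = r1_violations({(1, 7)}, {7})
ok_r2 = r2_violations({(1, 7)}, {7})

# A buggy transaction: inode 1 gets block 7 but bit 8 is set instead.
bad_r1 = r1_violations({(1, 7)}, {8})
bad_r2 = r2_violations({(1, 7)}, {8})
```

The two rules are deliberately symmetric: R1 catches an allocation without a bitmap update, and R2 catches a bitmap update without an allocation.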

  10. Invariant Checking • On each transaction • Add facts from change records into Datalog knowledge base • Check all invariants on Datalog facts • Problem • Set of facts grows over time • Facts need to persist across reboots • Slows invariant checking • Introduces more consistency problems • Insight • After commit, all facts in the transaction are incorporated in file system

  11. Querying File System State • FS state is available in Recon caches • We provide Datalog primitives to access caches • Can discard all facts after transaction commit [Figure: the Datalog Interpreter receives facts from change records together with the invariants, uses primitives to reach the Read Cache and Write Cache (backed by disk), and reports any violation]
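The fact lifecycle described here (collect facts for one transaction, check at commit, then discard) might be sketched as follows. `TransactionChecker`, the tuple-shaped facts, and `unmarked_blocks` are hypothetical names for illustration, not Recon's interface.

```python
class TransactionChecker:
    """Hypothetical per-transaction fact store; not Recon's actual interface."""

    def __init__(self, invariants):
        # Each invariant maps the current fact set to a set of violations.
        self.invariants = invariants
        self.facts = set()

    def add_fact(self, fact):
        self.facts.add(fact)

    def commit(self):
        """Check every invariant, then drop the facts whether or not any failed."""
        violations = {}
        for name, check in self.invariants.items():
            found = check(self.facts)
            if found:
                violations[name] = found
        self.facts.clear()
        return violations


def unmarked_blocks(facts):
    """Blocks allocated in this transaction whose bitmap bit was not set."""
    allocated = {f[2] for f in facts if f[0] == "block_allocated"}
    marked = {f[1] for f in facts if f[0] == "bit_set"}
    return allocated - marked


checker = TransactionChecker({"R1": unmarked_blocks})
checker.add_fact(("block_allocated", 1, 7))
checker.add_fact(("bit_set", 7))
first = checker.commit()      # consistent transaction

checker.add_fact(("block_allocated", 1, 9))
second = checker.commit()     # bit 9 was never set
```

Because the fact set is emptied at every commit, it stays proportional to the size of one transaction and nothing has to persist across reboots.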

  12. Using Primitives • Example: Directory Cycle Detection • dir_get_parent is a primitive • No cycle for this tree! [Figure: directory tree with / containing a, a containing b, and b containing c]
      path(X, P) :- dir_get_parent(X, P).
      path(X, A) :- dir_get_parent(X, P), path(P, A).
      cycle(X) :- path(X, X).
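The same check can be written imperatively as a walk up the parent links. In this sketch `has_cycle` is a hypothetical helper, `dir_get_parent` is any callable returning a directory's parent, and returning `None` for the root is an assumption about how the primitive behaves.

```python
def has_cycle(start, dir_get_parent):
    """True if following parent links from `start` leads back to `start`."""
    seen = set()
    node = dir_get_parent(start)
    while node is not None and node not in seen:
        if node == start:
            return True
        seen.add(node)           # guards against loops that do not include `start`
        node = dir_get_parent(node)
    return False


# The tree from the slide: / contains a, a contains b, b contains c.
parents = {"a": "/", "b": "a", "c": "b"}
tree_ok = has_cycle("c", parents.get)

# After moving /a into /a/b/c, the parent of a becomes c.
moved = {"a": "c", "b": "a", "c": "b"}
tree_bad = has_cycle("c", moved.get)
```

The loop mirrors the recursive `path` rule: each iteration applies `dir_get_parent` once more, and reaching `start` again corresponds to deriving `path(X, X)`.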

  13. Current Status • Implemented for a simple test file system • TestFS implemented at user level • Designed to be a simplified version of Ext3 • All TestFS invariants are applicable to Ext3 • TestFS has 12 Datalog invariants • Ext3 has 33 invariants in C • Invariants are independent • Total number of lines of invariant code is 38

  14. Future Work • Datalog invariants for the Ext3 and Btrfs file systems • Currently, Ext3/Btrfs Recon is implemented in the OS • We plan to implement it in a hypervisor to provide a strong fault model • Don’t need to port Datalog to kernel! • Customize Datalog interpreter • Optimize for file-system specific operations

  15. Conclusion • The Recon framework allows detecting arbitrary metadata corruption through runtime consistency checking • When a transaction commits, Recon checks invariants to ensure file system consistency • Invariants can be expressed in Datalog clearly and concisely

  16. Using Declarative Invariants for Protecting File-System Integrity By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown • Questions?

  17. Evaluation • Workload • ~203K commands • e.g. mkdir, rmdir, rm, touch, cd, write to file

  18. Directory Cycle Detection • Example: move /a into /a/b/c [Figure: the directory tree before and after the move, with child entries and parent entries marked]

  19. Directory Cycle Detection • Change records for move /a into /a/b/c

  20. Invariant Checking • cycle(3). • path(3, 3). • parent(3, 3). • parent(3, ?), path(?, 3). • parent(3, 2), path(2, 3). • parent(3, 2), parent(2, 3). • parent(3, 2), parent(2, ?), path(?, 3). • parent(3, 2), parent(2, 1), path(1, 3). • parent(3, 2), parent(2, 1), parent(1, 3). • We have a match, a.k.a. a violation! [Figure: directories a, b and c with inode numbers 1, 2 and 3]
      path(IN, PIN) :- dir_get_parent(IN, PIN).
      path(IN, AIN) :- dir_get_parent(IN, PIN), path(PIN, AIN).
      cycle(IN) :- path(IN, IN).

  21. Primitives • Problem: • The set of change records that we have is insufficient. • From the transaction alone, we cannot deduce the parents of ‘c’ and ‘b’. • We know the parent of ‘a’ is ‘c’. • Solution: • Primitives are predicates written in the C language that can query the read and write caches in Recon [Figure: directories a, b and c with inode numbers 1, 2 and 3]
      change(dir_block, 3, 1, φ, ‘a’).
      change(dir_block, 1, 3, φ, ‘..’).
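A rough Python model of such a primitive is shown below. The slides say primitives are written in C; everything here (`ReconCaches`, the dictionary-backed caches, the rule that pending writes override on-disk state, and inode 0 standing in for the root) is an assumption made for illustration only.

```python
class ReconCaches:
    """Hypothetical stand-in for Recon's caches, keyed by directory inode number."""

    def __init__(self, read_cache, write_cache):
        self.read_cache = read_cache     # parent links as currently on disk
        self.write_cache = write_cache   # parent links changed by the pending transaction

    def dir_get_parent(self, ino):
        # Assumption: pending updates take precedence over on-disk state.
        if ino in self.write_cache:
            return self.write_cache[ino]
        return self.read_cache.get(ino)


def ancestors(ino, caches, limit=64):
    """Follow parent links from `ino`, stopping at the root, on a repeat, or at `limit`."""
    chain = []
    node = caches.dir_get_parent(ino)
    while node is not None and node not in chain and len(chain) < limit:
        chain.append(node)
        node = caches.dir_get_parent(node)
    return chain


# On disk: a (1) is under the root (0), b (2) is under a, c (3) is under b.
on_disk = {1: 0, 2: 1, 3: 2}

before = ancestors(3, ReconCaches(on_disk, write_cache={}))

# The transaction moves a into c, so the write cache records parent(1) = 3.
after = ancestors(3, ReconCaches(on_disk, write_cache={1: 3}))
cycle_found = 3 in after
```

Only the link for ‘a’ comes from the transaction; the links for ‘b’ and ‘c’ are answered from the read cache, which is the gap the primitive is meant to fill.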
