
Files and Crypto

Files and Crypto. Recitation – 4/14 Nisarg Raval. DeFiler interfaces: overview. create , destroy , read , write a dfile. list dfiles. DFS. read (), write () startFetch (), startPush () waitValid (), waitClean (). DBuffer dbuf = getBlock (blockID) releaseBlock (dbuf).


Presentation Transcript


  1. Files and Crypto Recitation – 4/14 Nisarg Raval

  2. DeFiler interfaces: overview create, destroy, read, write a dfile list dfiles DFS read(), write() startFetch(), startPush() waitValid(), waitClean() DBuffer dbuf = getBlock(blockID) releaseBlock(dbuf) DBufferCache DBuffer ioComplete() startRequest(dbuf, r/w) VirtualDisk

  3. DFS /* creates a new dfile and returns the DFileID */ public DFileID createDFile(); /* destroys the dfile named by the DFileID */ public void destroyDFile(DFileID dFID); /* reads contents of the dfile named by DFileID into the ubuffer * starting from ubuffer offset startOffset; at most count bytes are transferred */ public int read(DFileID dFID, byte[] ubuffer, int startOffset, int count); /* writes to the file named by DFileID from the ubuffer * starting from ubuffer offset startOffset; at most count bytes are transferred */ public int write(DFileID dFID, byte[] ubuffer, int startOffset, int count); /* List DFileIDs for all existing dfiles in the volume */ public List<DFileID> listAllDFiles();

  4. Small details: low-level I/O DBuffer required interfaces VirtualDisk needs basic info about a DBuffer for I/O: the dbuf’s blockID and a reference to its byte buffer. blockID = getBlockID() buf[] = getBuffer() VirtualDisk has private methods to “format” the “disk” (VDF file), and fetch/push blocks at specified blockIDs. We give you sample code for these functions. VirtualDisk

  5. Small details: initializing new DFS(format) Initialization starts with a call from the test program. DFS If DFS constructor call has format == true, then all data in VDF is discarded and zeroed. constructor (int numblocks) DBuffer DBufferCache Create numblocks DBuffer objects (dbufs) and memory buffer regions (of size blocksize) for those dbufs. constructor (boolean format) VirtualDisk Create/truncate VDF (w/ optional name), or open existing VDF if format == false.

  6. Small details: exiting sync DFS A test program should call DFS sync before exit(), to force any dirty blocks in the I/O cache out to disk. Sync is implemented in DBufferCache, and is synchronous: don’t return until all writes complete. sync DBufferCache VirtualDisk

  7. Big details: DBuffer DFS read(…) write(...) startFetch(), startPush() waitValid(), waitClean() A DBuffer dbuf returned by getBlock is always associated with exactly one block in the disk volume. But it might or might not be “in sync” with the underlying disk contents. DBuffer A dbuf is valid iff it has the “correct” copy of the data. A dbuf is dirty iff it is valid and has an update (a write) that has not yet been written to disk. A valid dbuf is clean if it is not dirty. Your DeFiler should return only valid data to a client. That may require you to zero the dbuf or fetch data from the disk. Your DeFiler should ensure that all dirty data is eventually pushed to disk.
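The valid/dirty/clean rules on this slide can be sketched as a small monitor. This is a minimal sketch under stated assumptions, not the course's reference code: the class name `SketchDBuffer` and the hooks `markValid()` and `markClean()` are hypothetical stand-ins for whatever the I/O completion path calls, and the sketch refuses reads and writes on a buffer that is not yet valid.

```java
/** Hypothetical sketch of DBuffer state tracking; not the course's reference code. */
class SketchDBuffer {
    private final byte[] buffer;
    private boolean valid = false;  // true once the buffer holds the correct block data
    private boolean dirty = false;  // true while an update has not reached the disk

    SketchDBuffer(int blockSize) {
        buffer = new byte[blockSize];
    }

    /** Assumed hook: called when a fetch completes or a new block is zeroed. */
    synchronized void markValid() {
        valid = true;
        notifyAll();
    }

    /** Assumed hook: called when a push of this buffer completes. */
    synchronized void markClean() {
        dirty = false;
        notifyAll();
    }

    synchronized boolean checkValid() { return valid; }

    synchronized boolean checkClean() { return !dirty; }

    /** Blocks until the buffer is valid; returns false if the thread is interrupted. */
    synchronized boolean waitValid() {
        try {
            while (!valid) wait();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Blocks until the buffer is clean; returns false if the thread is interrupted. */
    synchronized boolean waitClean() {
        try {
            while (dirty) wait();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Copies from offset 0 of this buffer into ubuffer, starting at startOffset. */
    synchronized int read(byte[] ubuffer, int startOffset, int count) {
        if (!valid) throw new IllegalStateException("dbuf is not valid");
        int n = Math.min(count, Math.min(buffer.length, ubuffer.length - startOffset));
        System.arraycopy(buffer, 0, ubuffer, startOffset, n);
        return n;
    }

    /** Copies from ubuffer (at startOffset) to offset 0 of this buffer and marks it dirty. */
    synchronized int write(byte[] ubuffer, int startOffset, int count) {
        if (!valid) throw new IllegalStateException("dbuf is not valid");
        int n = Math.min(count, Math.min(buffer.length, ubuffer.length - startOffset));
        System.arraycopy(ubuffer, startOffset, buffer, 0, n);
        dirty = true;
        return n;
    }
}
```

The `while` loops around `wait()` guard against spurious wakeups, which is why the state flags, not the notification itself, decide when a waiter proceeds.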

  8. DBufferCache /* Get buffer for block specified by blockID The buffer is “held” until the caller releases it. A “held” buffer cannot be evicted: its block ID cannot change. */ public DBuffer getBlock(int blockID); /* Release the buffer so that it may be eligible for eviction. */ public void releaseBlock(DBuffer dbuf); /* Write back all dirty blocks to the volume, and wait for completion. */ public void sync();

  9. DBuffer /* Start an asynchronous fetch of associated block from the volume */ public void startFetch(); /* Start an asynchronous write of buffer contents to block on volume */ public void startPush(); /* Check whether the buffer has valid data*/ public boolean checkValid(); /* Wait until the buffer has valid data (i.e., wait for fetch to complete) */ public boolean waitValid(); /* Check whether the buffer is dirty, i.e., has modified data to be written back */ public boolean checkClean(); /* Wait until the buffer is clean (i.e., wait for push to complete) */ public boolean waitClean(); /* Check if buffer is evictable: not evictable if I/O in progress, or buffer is held. */ public boolean isBusy();

  10. DBuffer /* Reads into the ubuffer[ ] from the contents of this DBuffer dbuf. * Check first that dbuf has a valid copy of the data! * startOffset is for the ubuffer, not for dbuf. * Reads begin at offset 0 in dbuf and move at most count bytes. */ public int read(byte[] ubuffer, int startOffset, int count); /* Writes into this DBuffer dbuf from the contents of ubuffer[ ]. * Mark dbuf dirty! startOffset is for the ubuffer, not for dbuf. * Writes begin at offset 0 in dbuf and move at most count bytes. */ public int write(byte[] ubuffer, int startOffset, int count); These calls are for use by the DFS layer to read/write user data between client ubuffers and DBuffer dbufs. DFS may read/write only on a held dbuf.

  11. VirtualDisk /* * Start an asynchronous I/O request to the device/disk. * The blockID and buffer array are given by the DBuffer dbuf. * The operation is either READ or WRITE (DiskOperationType). */ public void startRequest(DBuffer dbuf, DiskOperationType rw) throws…;

  12. Big issues: caching DeFiler uses an I/O cache in memory to stage transfers to/from disk and to reduce the need for I/O. The cache has a set of DBuffer (dbuf) buffer objects. Each dbuf is either free or it is associated with exactly one disk block blockID. I/O to/from a block is staged from its dbuf. Each block has at most one dbuf. The dbuf for a block is kept in cache after access. If a requested block is already resident in the cache, then getBlock finds its dbuf and returns it. Else it allocates a free dbuf for the block. The system discards (evicts) a cached block if it has a better use for the memory. It frees the evicted block’s dbuf and soon reuses the dbuf for some other block. DFS dbuf = getBlock(blockID) releaseBlock(dbuf) DBufferCache DBuffer

  13. Big issues: eviction The I/O cache system has a replacement policy to select candidate blocks for eviction. It keeps an evict pool of dbufs ordered by some measure of their suitability for eviction, e.g., Least Recently Used (LRU). The evict pool data structure may require more state in dbufs, or interactions between DBufferCache and DBuffer. This is up to you: no formats or interfaces are specified. DFS dbuf = getBlock(blockID) releaseBlock(dbuf) DBufferCache DBuffer startRequest(dbuf, r/w); ioComplete() VirtualDisk

  14. Buffer states DBufferCache must not evict a block when its dbuf is in use by the layer above or below. You must think carefully about DBuffer (dbuf) states and how to synchronize access to dbufs. This is up to you. Suggestion. A dbuf is pinned if I/O is in progress, i.e., a VDF request has started but not yet completed. A dbuf is held if DFS obtained a reference to the dbuf from getBlock but has not yet released the dbuf. Don’t evict a dbuf that is pinned or held: pick another candidate. DFS dbuf = getBlock(blockID) releaseBlock(dbuf) DBufferCache DBuffer startRequest(dbuf, r/w); ioComplete() VirtualDisk
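One way to combine the LRU evict pool with the "pinned or held" rule is sketched below. Everything here is an assumption for illustration: the names `SketchCache` and `Buf`, the use of an access-ordered `LinkedHashMap` as the evict pool, and the choice to throw when every buffer is busy (a real cache would more likely wait). The sketch also ignores dirty data, which would have to be pushed before a victim is reused.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache sketch that never evicts a pinned or held buffer. */
class SketchCache {
    static final class Buf {
        int blockID = -1;
        boolean held;    // DFS got this buffer from getBlock and has not released it
        boolean pinned;  // an I/O request on this buffer has started but not completed
        boolean isBusy() { return held || pinned; }
    }

    private final int capacity;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<Integer, Buf> resident =
            new LinkedHashMap<>(16, 0.75f, true);

    SketchCache(int capacity) {
        this.capacity = capacity;
    }

    synchronized Buf getBlock(int blockID) {
        Buf b = resident.get(blockID);  // a hit also moves the entry to the MRU end
        if (b == null) {
            if (resident.size() >= capacity && !evictOne()) {
                throw new IllegalStateException("all buffers are pinned or held");
            }
            b = new Buf();
            b.blockID = blockID;
            resident.put(blockID, b);
        }
        b.held = true;
        return b;
    }

    synchronized void releaseBlock(Buf b) {
        b.held = false;
    }

    synchronized boolean isResident(int blockID) {
        return resident.containsKey(blockID);  // does not change the LRU order
    }

    /** Walks from the LRU end and removes the first buffer that is not busy. */
    private boolean evictOne() {
        Iterator<Map.Entry<Integer, Buf>> it = resident.entrySet().iterator();
        while (it.hasNext()) {
            if (!it.next().getValue().isBusy()) {
                it.remove();
                return true;
            }
        }
        return false;
    }
}
```

Skipping busy candidates means the victim is the least recently used buffer among those that are safe to evict, which matches the slide's "pick another candidate" suggestion.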

  15. Log Structured File System • With cheaper memory caching is efficient • Most read requests can be addressed through caching • The bottleneck is write requests • Limited by disk I/O • Needs several writes (data as well as metadata) • Require several disk seeks

  16. Log Structured File System • Instead of writing in place, use logging • Logging writes only at the head of the log • Data always appended sequentially • Reduces head seeks • If we keep appending we will run out of disk space • Garbage collection reclaims space held by superseded data
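The append-then-clean idea can be illustrated with a toy in-memory log. This is only a sketch: `SketchLog` is a hypothetical name, a `List` stands in for the on-disk log, and a `Map` plays the role of the index that records where the live copy of each block sits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy log-structured store: every write appends, cleaning drops dead entries. */
class SketchLog {
    private List<byte[]> log = new ArrayList<>();                 // stands in for the disk log
    private final Map<Integer, Integer> latest = new HashMap<>(); // blockID -> index of live copy

    void write(int blockID, byte[] data) {
        log.add(data.clone());                // always append at the head of the log
        latest.put(blockID, log.size() - 1);  // older copies of this block become garbage
    }

    byte[] read(int blockID) {
        Integer index = latest.get(blockID);
        return index == null ? null : log.get(index).clone();
    }

    int logLength() {
        return log.size();
    }

    /** Garbage collection: copy only live entries into a fresh log. */
    void clean() {
        List<byte[]> fresh = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : latest.entrySet()) {
            fresh.add(log.get(e.getValue()));
            e.setValue(fresh.size() - 1);
        }
        log = fresh;
    }
}
```

Overwriting a block never touches its old copy; the old entry simply stops being referenced, which is why the log grows until `clean()` runs.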

  17. LFS Vs UFS http://work.tinou.com/2012/03/log-structured-file-system-for-dummies.html

  18. Checkpoints for Inode Map http://work.tinou.com/2012/03/log-structured-file-system-for-dummies.html

  19. Clean up http://work.tinou.com/2012/03/log-structured-file-system-for-dummies.html

  20. Crypto: The Basics Jeff Chase Duke University

  21. Principals in a networked system Principals are users or organizations, or software entities acting on their behalf. Mallory Bob attack How can principals communicate securely? How do they decide whom to trust? Alice

  22. Message Indistinguishability • Plain Text - m • Cipher Text - c • Key - k m1 or m2 ? k Randomly Choose from c1 and c2 c1 m1 c k c2 m2

  23. What does it mean to be Secure? • Perfect Secrecy • Shannon showed that the key must be at least as long as the message • An adversary cannot tell which of two messages was encrypted with better than 50% probability • One Time Pad • Computational Security • Infeasible to decode the message • No polynomial-time algorithm can tell which of two messages was encrypted with probability non-negligibly better than 50%
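The one-time pad mentioned above is short enough to sketch directly. The class and method names are hypothetical; the sketch assumes the pad is truly random, at least as long as the message, and never reused, which are the conditions perfect secrecy depends on.

```java
import java.security.SecureRandom;

/** Minimal one-time pad sketch: XOR the message with a random pad of equal length. */
class OneTimePad {
    /** Draws a fresh random pad; it must be used for one message only. */
    static byte[] newPad(int length) {
        byte[] pad = new byte[length];
        new SecureRandom().nextBytes(pad);
        return pad;
    }

    /** Encrypts or decrypts: XOR with the same pad is its own inverse. */
    static byte[] xor(byte[] data, byte[] pad) {
        if (pad.length < data.length) {
            throw new IllegalArgumentException("pad must be at least as long as the message");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ pad[i]);
        }
        return out;
    }
}
```

Because the key is as long as the message, sharing the pad is as hard as sharing the message itself, which is what motivates the computational notion of security.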

  24. Cryptography for Busy People • Standard crypto functions parameterized by keys. • Fixed-width “random” value (length matters, e.g., 256-bit) • Symmetric (DES: fast, requires shared key K1 = K2) • Asymmetric (RSA: slow, uses two keys) • “Believed to be computationally infeasible” to break E D M Encrypt K1 Decrypt K2 M [Image: Landon Cox]

  25. Symmetric Crypto • “Secret key” or “private key” cryptography. • DES, 3DES, DESX, IDEA, AES • Sender and receiver must possess a shared secret • Shared key K [a random bit string of chosen length] • K = K1 = K2 • Message M, Key K {M}K = Encrypt(M, K) M = Decrypt({M}K , K)

  26. Symmetric crypto Borrowed from https://spideroak.com/blog/20130523083520-drink-your-ovaltine-intro-to-encryption-101

  27. Example: Java Cipher class Symmetric crypto is easy to use from (e.g.) your Java code. “The Cipher class provides the functionality of a cryptographic cipher used for encryption and decryption. Encryption is the process of taking data (called cleartext) and a key, and producing data (ciphertext) meaningless to a third-party who does not know the key. Decryption is the inverse process: that of taking ciphertext and a key and producing cleartext.” [oracle.com]

  28. Symmetric crypto: keys • Anyone can generate keys. Functions are available for common programming languages. • A key is just a random bit string. Choose length wisely: short keys are cheaper to generate and use, but easier to crack. • Generators need a good random number generator, with a good seed for randomness. Clock? Timing of a burst of keystrokes? Unix systems have /dev/random: reads return a stream of “random” numbers. • The hard part is sharing the key securely. • We need secure communication with the partner to transfer the key, but the whole point of the key is to enable secure communication! • This is the key distribution problem. • Asymmetric crypto can help, as we shall see.

  29. Example: Java KeyGenerator class “A key generator is used to generate secret keys for symmetric algorithms.” [oracle.com] But how to share the secret securely? This is the key distribution problem.
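A minimal sketch of the two classes working together, assuming AES with a 256-bit key in GCM mode (the slides do not prescribe an algorithm or mode). `SymmetricSketch` and its method names are hypothetical; checked exceptions are wrapped so callers stay short.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Sketch: generate a shared secret key K, then Encrypt(M, K) and Decrypt({M}K, K). */
class SymmetricSketch {
    private static final int IV_BYTES = 12;   // nonce length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(256);  // key length in bits
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns IV || ciphertext; a fresh random IV is drawn for every message. */
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] boxed = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, boxed, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, boxed, IV_BYTES, ciphertext.length);
            return boxed;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); fails if the key is wrong or the data was modified. */
    static byte[] decrypt(SecretKey key, byte[] boxed) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, boxed, 0, IV_BYTES));
            return cipher.doFinal(boxed, IV_BYTES, boxed.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both parties must hold the same `SecretKey` for this to work, so the sketch does nothing about the key distribution problem raised on the slide.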

  30. Asymmetric (public key) crypto • Each subject/principal possesses a keypair (K, K⁻¹). • Decrypt(K, Encrypt(K⁻¹, M)) = M • Keep one key private; the other is public. • Either key can be used to encrypt/decrypt. If we know one another’s public keys then we can communicate securely. Anyone can mint a keypair.

  31. Asymmetric crypto works both ways Crypt E D A’s private key or A’s public key Crypt A’s public key or A’s private key [Landon Cox]

  32. How to use asymmetric crypto? • A can send a message to B, encrypted with A’s private key. • B can send a message to A, encrypted with A’s public key. • Benefits? Other possibilities?

  33. abstrusegoose.com

  34. Spelling it out • Do encrypt message M with your private key to authenticate it, i.e., to convince the recipient that M really came from you. • Better yet, digitally sign M: that’s faster (next). • Do encrypt M with the recipient’s public key to keep it secret: only the intended recipient can decrypt it. • Don’t encrypt M with your public key: it’s just weird and pointless, since nobody else can read the encrypted message. Bob probably blew his chances with Alice. • Don’t encrypt M with the recipient’s private key: if you know someone’s private key then you should not use it! Forget it and don’t tell anyone.
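The "encrypt with the recipient's public key" rule can be sketched with the JDK's RSA support. The choices of 2048-bit RSA and OAEP padding are assumptions made for the example, and `AsymmetricSketch` with its method names is hypothetical. RSA can only encrypt short messages directly (about 190 bytes with these parameters).

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

/** Sketch: keep M secret by encrypting it with the recipient's public key. */
class AsymmetricSketch {
    private static final String TRANSFORM = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Anyone can mint a keypair. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Only the holder of the matching private key can read the result. */
    static byte[] encryptFor(PublicKey recipient, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORM);
            cipher.init(Cipher.ENCRYPT_MODE, recipient);
            return cipher.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] decryptWith(PrivateKey own, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORM);
            cipher.init(Cipher.DECRYPT_MODE, own);
            return cipher.doFinal(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the slide's terms, Bob calls `encryptFor(alicePublic, m)` and only Alice, holding the private half, can call `decryptWith` successfully.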

  35. Hybrid cryptosystems • Symmetric crypto is much cheaper than asymmetric (10Kx). • But asymmetric is useful to “bootstrap” communication. • All it takes is knowledge of another party’s public key, and it is not necessary to keep the public keys secret. • These properties motivate hybrid cryptosystems that use asymmetric in combination with cheaper techniques. • Digital signatures combine asymmetric with hashing. “As for SpiderOak, our old clients used a combination of 2048 bit RSA and 256 bit AES. Now new clients use 3072-bit RSA combined with 256 bit AES to meet industry recommendations. We use this mixture of techniques where each is best suited: asymmetric encryption for communications channel setup and key exchange, and symmetric encryption for internal data structures and improved client performance.” August 2013: https://spideroak.com/blog/20130523083520-drink-your-ovaltine-intro-to-encryption-101
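A hybrid scheme of the kind described can be sketched as an "envelope": a fresh AES session key encrypts the bulk data, and RSA protects only that small key. The names `HybridSketch`, `Envelope`, `seal`, and `open` are hypothetical, and the parameter choices (2048-bit RSA with OAEP, AES-256 in GCM mode) are assumptions for the example rather than anything the slides specify.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Sketch: asymmetric crypto bootstraps a symmetric session key. */
class HybridSketch {
    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String AES = "AES/GCM/NoPadding";

    static final class Envelope {
        final byte[] wrappedKey;  // AES session key, encrypted under the recipient's public key
        final byte[] iv;
        final byte[] ciphertext;  // the message, encrypted under the session key
        Envelope(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static Envelope seal(PublicKey recipient, byte[] message) {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(256);
            SecretKey sessionKey = gen.generateKey();

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.ENCRYPT_MODE, sessionKey, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(message);  // cheap, any length

            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.WRAP_MODE, recipient);     // expensive, but only 32 bytes
            return new Envelope(rsa.wrap(sessionKey), iv, ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] open(PrivateKey own, Envelope env) {
        try {
            Cipher rsa = Cipher.getInstance(RSA);
            rsa.init(Cipher.UNWRAP_MODE, own);
            SecretKey sessionKey =
                    (SecretKey) rsa.unwrap(env.wrappedKey, "AES", Cipher.SECRET_KEY);
            Cipher aes = Cipher.getInstance(AES);
            aes.init(Cipher.DECRYPT_MODE, sessionKey, new GCMParameterSpec(128, env.iv));
            return aes.doFinal(env.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The sender needs nothing but the recipient's public key, and the message can be far larger than RSA alone could encrypt.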

  36. Digital Signature • A hash digest of message M encrypted with principal B’s private key is called a digital signature • Unforgeable. “Proves” that B sent M. • Certified. “Proves” M has not been tampered. • Non-repudiable. B cannot deny sending M. • But not private. Alice, Will you marry me? Signed, Bob
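The JDK's `Signature` class hashes the message and applies the private key in one step, which is the construction this slide describes. The sketch below assumes RSA keys and the `SHA256withRSA` algorithm; `SignatureSketch` and its method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/** Sketch: sign a message with a private key, verify it with the public key. */
class SignatureSketch {
    private static final String ALGORITHM = "SHA256withRSA";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hashes the message and signs the digest with the signer's private key. */
    static byte[] sign(PrivateKey signer, byte[] message) {
        try {
            Signature s = Signature.getInstance(ALGORITHM);
            s.initSign(signer);
            s.update(message);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True only if the signature matches this exact message and this public key. */
    static boolean verify(PublicKey signer, byte[] message, byte[] signature) {
        try {
            Signature s = Signature.getInstance(ALGORITHM);
            s.initVerify(signer);
            s.update(message);
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The message itself travels in the clear, so a signature authenticates M without keeping it private.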

  37. http://pst.libre.lu/mssi-luxmbg/p1/04_auth-art.html

  38. Digitally signed code • We have talked about the problem of verifying that programs originate from some trusted/trustworthy source, and are not hacked. • Where did you get those tools? • It is common for software updates and other code to be digitally signed by the originator. • It works if you think you can trust the originator, and you know the originator’s public key.

  39. Two “key points” • Digital signatures are “stronger” than physical signatures, because they are bound to the document contents. • Attacker cannot change the document contents without invalidating the signature. • To verify a signature, the receiver must already know the public key of the signer. • And it must be right. • But how to know for sure?

  40. Hashed password file # hashed This is a line from /etc/passwd for user Fred Flintstone. login uses this record to validate the user’s password. The file is public, but Fred’s password is secret. Or is it?

  41. Cryptographic hashes # • Also called a secure hash or one-way hash • E.g., SHA0, MD5, SHA1, SHA2, SHA3 • Result called a hash, checksum, fingerprint, digest • Very efficient • SHA-x: Secure Hashing Algorithm SHA1 hash “Hash digest” Arbitrarily large 160 bits

  42. Properties of Secure Hashing • Collision-resistant • There exist distinct M1 and M2 such that h(M1) == h(M2). • Such collisions are “very hard” to find. • One way • Given digest, cannot generate an M with h(M) == digest. • Such preimages are “very hard” to find. • Secure • The digest does not help to discover any part of M. # X X # SHA1 SHA1 Cheap “Computationally infeasible”
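Computing a digest in Java takes a few lines with `MessageDigest`. The helper class `HashSketch` is hypothetical. SHA-1 appears here only because the slides use it as their example; it is no longer considered collision-resistant, so SHA-256 is shown alongside it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch: arbitrarily large input in, fixed-width digest out. */
class HashSketch {
    static byte[] digest(String algorithm, byte[] message) {
        try {
            return MessageDigest.getInstance(algorithm).digest(message);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Renders a digest as lowercase hexadecimal, the usual display form. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

`digest("SHA-1", m)` always returns 20 bytes (160 bits) and `digest("SHA-256", m)` always returns 32 bytes, regardless of how long `m` is.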

  43. Using hashed passwords • This protocol takes place over an encrypted connection. The connection is established first, e.g., using SSL/TLS. (later) • Threat model: attacker steals stored password from server. • Defense: the server stores a hash of the password, and not the password itself. So an attacker cannot steal the password. “Hi, this is server. Login please.” “I’m fflintstone. Password: yabbadabbado.” Server code: phash = SHAx(“yabbadabbado”); shash = getStoredHash(“fflintstone”); verify shash == phash; Server Fred “Hi Fred. Welcome back.” …
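The server-side check can be sketched as follows. `PasswordStoreSketch` is a hypothetical class, and the per-user random salt is an addition that the slide's pseudocode does not show; it is included because it keeps identical passwords from producing identical stored hashes. Production systems generally prefer a deliberately slow password hash over a single SHA-256 pass, so treat this only as an illustration of "store the hash, forget the password".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

/** Sketch: the server keeps (salt, hash) per user and never stores the password. */
class PasswordStoreSketch {
    private static final class Record {
        final byte[] salt;
        final byte[] hash;
        Record(byte[] salt, byte[] hash) {
            this.salt = salt;
            this.hash = hash;
        }
    }

    private final Map<String, Record> records = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    void register(String user, String password) {
        byte[] salt = new byte[16];  // salt is an assumption, not part of the slide
        random.nextBytes(salt);
        records.put(user, new Record(salt, hash(salt, password)));
    }

    /** Recomputes the hash from the presented password and compares with the stored one. */
    boolean login(String user, String password) {
        Record r = records.get(user);
        if (r == null) {
            return false;
        }
        byte[] phash = hash(r.salt, password);
        return MessageDigest.isEqual(r.hash, phash);
    }

    private static byte[] hash(byte[] salt, String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt);
            return md.digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The client always presents the password, never the hash, so a stolen record does not by itself let an attacker log in.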

  44. Let’s get this right • Hashing is not encryption. • “One way” ⇒ No way to decrypt! • No keys! • Client uses password to login, and not the hash. • If the hash alone is sufficient to log in, then an attacker who gains access to the hashed password file can compromise all accounts, even without knowing the “real” passwords! • The goal is not to protect the password in transit: we use encryption for that. We want to protect it on the server. • Server must remember something about a password so that it can verify it, but a hash is “good enough”: the server doesn’t need to remember the password itself. • So: server stores the hash, and forgets the password.
