
Timing-based Attestation: A survey through today

Timing-based Attestation: A survey through today. Xeno Kovah. Outline: explain what it is; talk about the past decade of applying the technique to everything from keyboards to cars to SCADA RTUs; end with our BIOS work, which is wicked sweet and which is going to be presented at BlackHat.


Presentation Transcript


  1. Timing-based Attestation: A survey through today. Xeno Kovah

  2. Outline • Explain what it is • Talk about the past decade of applying the technique to everything from keyboards to cars to SCADA RTUs • End with our BIOS work, which is wicked sweet and which is going to be presented at BlackHat

  3. The past decade (available at http://bit.ly/11xEmlV)

  4. But First: What is Timing-Based Attestation? • The fundamental premise: • "Build your software so that if its code is modified, it runs slower." • I think of it as a purposeful and desirable timing side-channel that provides tamper-evidence • We coined "timing-based" because it is a superset of the "software-based" techniques that also covers using hardware (e.g. a TPM) for timing measurement • Assumptions: • Attacker has complete control of execution environment before self-checking begins (i.e. same privilege as defender) • Self-checksumming code is time-optimal for a given microarchitecture • There are no free execution slots where an attacker can insert a "free" instruction and suffer no timing slowdown • There is a decade of work in this area; we can't do the many, many nuances justice. A timeline of related work is here: • http://bit.ly/11xEmlV

  5. Components of all self-checks • Nonce/Pseudo-Random Number (PRN) • Decrease likelihood of precomputation due to storage constraints, and prevent replay (here only with online SMM-based challenges, not the boot) • Read own data • Incorporated into checksum so if it changes the checksum changes • Read own instruction and data pointers • Indicates where in memory the code itself is executing • Set a benign/beneficial value and then read "other stuff" that could a) help attackers or that b) will serve as a side-channel of your own code execution • E.g. a) debug registers, interrupt descriptor table, etc. • E.g. b) instruction counts, TLB/cache state, etc. • Do all the above in millions of loop iterations • So that ideally an instruction or two worth of conditional checks per loop iteration leads to millions of extra instructions in the overall runtime

  6. How the defender tries to funnel the attacker's perspective • "I want to attack the application, but the OS is protecting it…" • "I want to attack the OS but the security software is protecting it…" • "I want to attack the security software, but the self-check is protecting it…" • "I will attack the self-checksum code."

  7. "Welcome to my parlor," said the spider to the fly [Diagram labels: Applications; OS; Security Software; Self-Check; Dynamic Root of Trust; Measurements]

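  8. Simplified Selfcheck()
     Selfcheck(checksum, nonce, codeStart, codeEnd, codeSize) {
         while (iteration < 2500000) {
             checksum[0] += nonce;                  // fold in the challenge
             checksum[1] ^= DP;                     // data pointer: where we are reading
             checksum[2] += *DP;                    // the code bytes being read
             checksum[4] ^= EIP;                    // instruction pointer: where we are executing
             mix(checksum);
             nonce += (nonce*nonce) | 5;            // pseudo-random update
             DP = codeStart + (nonce % codeSize);   // pseudo-random traversal of own code
             iteration++;
         }
     }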
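
A rough Python simulation of the Selfcheck() loop on this slide, just to make the moving parts concrete. It is a sketch under assumptions, not the real self-check: the code region is a `bytes` object, `code_start` is a made-up address, `eip` is a constant stand-in for the instruction pointer, and `mix()` is a hypothetical rotate-and-XOR step (the slide does not say what mix does). Timing is not modeled here, only the checksum value.

```python
MASK = 0xFFFFFFFF  # emulate 32-bit registers


def mix(checksum):
    # Hypothetical mixing step: rotate each word left by 1 and XOR in its neighbor.
    n = len(checksum)
    for i in range(n):
        rotated = ((checksum[i] << 1) | (checksum[i] >> 31)) & MASK
        checksum[i] = rotated ^ checksum[(i + 1) % n]


def selfcheck(code, nonce, code_start=0x1000, iterations=10_000):
    """Checksum `code` (bytes) with a pseudo-random traversal driven by `nonce`."""
    checksum = [0] * 5
    dp = code_start            # data pointer (DP)
    eip = code_start           # assumption: constant stand-in for EIP
    for _ in range(iterations):
        checksum[0] = (checksum[0] + nonce) & MASK
        checksum[1] ^= dp
        checksum[2] = (checksum[2] + code[dp - code_start]) & MASK
        checksum[4] ^= eip
        mix(checksum)
        nonce = (nonce + ((nonce * nonce) | 5)) & MASK
        dp = code_start + (nonce % len(code))
    return checksum


if __name__ == "__main__":
    image = bytes(range(256)) * 4          # pretend this is the self-check's own code
    print([hex(w) for w in selfcheck(image, 0xF005BA11)])
```

The same nonce over the same bytes always gives the same checksum, and flipping a byte that the traversal reads changes it, which is the property the server relies on.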

  9. Simplified Selfcheck() Forgery
     Selfcheck_forge(checksum, nonce, codeStart, codeEnd, codeSize) {
         while (iteration < 2500000) {
             checksum[0] += nonce;
             checksum[1] ^= DP;
             if (DP == myHookLocation)              // extra compare-and-branch on every iteration
                 checksum[2] += copyOfGoodBytes;    // substitute the original, unhooked bytes
             else
                 checksum[2] += *DP;
             checksum[4] ^= EIP;
             mix(checksum);
             nonce += (nonce*nonce) | 5;
             DP = codeStart + (nonce % codeSize);
             iteration++;
         }
     }
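
To see why the forgery costs time, here is a small Python sketch that runs the same loop over a hooked image and counts the attacker's added compare-and-branch. Everything here is an illustrative assumption: `run_check`, `hook_offset`, and `good_byte` are made-up names, `mix()` is a hypothetical stand-in, DP is an offset into `memory` (codeStart = 0), and the EIP term is a constant. A real forgery also has to lie about the pointers, which this ignores.

```python
MASK = 0xFFFFFFFF


def mix(c):
    # Hypothetical mixing step; the slides do not define mix().
    for i in range(len(c)):
        c[i] = (((c[i] << 1) | (c[i] >> 31)) & MASK) ^ c[(i + 1) % len(c)]


def run_check(memory, nonce, hook_offset=None, good_byte=0, iterations=5000):
    """Return (checksum, extra_ops). With hook_offset set, behave like the forgery."""
    checksum, dp, extra_ops = [0] * 5, 0, 0
    for _ in range(iterations):
        checksum[0] = (checksum[0] + nonce) & MASK
        checksum[1] ^= dp
        value = memory[dp]
        if hook_offset is not None:
            extra_ops += 1                 # the attacker's added conditional check
            if dp == hook_offset:
                value = good_byte          # lie: report the clean byte instead
        checksum[2] = (checksum[2] + value) & MASK
        checksum[4] ^= 0x1000              # assumption: constant stand-in for EIP
        mix(checksum)
        nonce = (nonce + ((nonce * nonce) | 5)) & MASK
        dp = nonce % len(memory)
    return checksum, extra_ops


if __name__ == "__main__":
    clean = bytes(range(256)) * 4
    hooked = bytearray(clean)
    hooked[100] = 0xCC                     # attacker's hook
    expected, _ = run_check(clean, 0xF005BA11)
    forged, extra = run_check(bytes(hooked), 0xF005BA11, hook_offset=100, good_byte=clean[100])
    print(forged == expected, extra)
```

The forged checksum matches the clean one, but only by paying one extra check per iteration. Across millions of iterations that is the slowdown the defender is trying to measure.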

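  10. Network Timing Implementation [Sequence diagram along a time axis. Server → Client: Measurement Type: FOO, Nonce = 0xf005ba11. Client runs Selfcheck(Nonce = 0xf005ba11). Client → Server: Self-Checksum, Nonce = 0xf005ba11. Δt is the interval the server observes between request and reply. Client performs the FOO measurement. Client → Server: FOO measurement results.]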

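  11. Network Timing Implementation (with attack) [Sequence diagram of the same exchange with an attacker present. Server → Client: Measurement Type: FOO, Nonce = 0xf005ba11. Client runs Selfcheck(Nonce = 0xf005ba11). Client → Server: Self-Checksum, Nonce = 0xf005ba11, with Δt marked on the time axis. Client performs the FOO measurement. Client → Server: FOO measurement results.]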
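
The server side of the exchange on slides 10 and 11 boils down to "right answer, in time". A minimal sketch, assuming a hypothetical `ask_client` callable that sends the nonce and blocks until the self-checksum comes back, and a `max_seconds` limit that the server has calibrated somehow (how to pick that limit is the hard part and is not shown here):

```python
import time


def timed_challenge(ask_client, nonce, expected_checksum, max_seconds, clock=time.monotonic):
    """Send a nonce, time the reply, accept only a correct and timely answer.

    ask_client: hypothetical callable(nonce) -> checksum reported by the client.
    clock: injectable so the logic can be exercised without real delays.
    """
    start = clock()
    reply = ask_client(nonce)
    elapsed = clock() - start              # this is the delta-t on the slides
    ok = (reply == expected_checksum) and (elapsed <= max_seconds)
    return ok, elapsed


if __name__ == "__main__":
    # A fake client that answers correctly and immediately.
    ok, dt = timed_challenge(lambda n: "checksum-for-f005ba11", 0xF005BA11,
                             "checksum-for-f005ba11", max_seconds=0.5)
    print(ok, dt)
```

A correct checksum that arrives late is rejected just like a wrong one, since a forged self-check is expected to produce the right value but take longer.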

  12. Memory Copy Attacks Maximum attacker overhead. Defender targets this situation

  13. Alternate Timing-based Attestation Attacks ("The only way to win is to not play the game") • Nonce to checksum pre-computation and lookup • Easily defeated with large nonce/checksum sizes • Parallelize for Faster checksum computation • Defeated by what Pioneer called "Strongly Ordered Function" • (((A XOR B) + C) XOR D)… != (A XOR B) + (C XOR D) …
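
Both bullets can be checked with a few lines of Python. This is only an illustration with made-up values: 32-bit words are assumed, `strongly_ordered` and `lookup_table_bytes` are hypothetical helper names, and the 16-byte checksum size is an assumption, not a figure from any of the papers.

```python
MASK = 0xFFFFFFFF  # assume 32-bit words


def strongly_ordered(values):
    """Fold left to right, alternating XOR and ADD: (((a ^ b) + c) ^ d) ..."""
    acc = values[0]
    for i, v in enumerate(values[1:]):
        acc = (acc ^ v) if i % 2 == 0 else (acc + v) & MASK
    return acc


def lookup_table_bytes(nonce_bits, checksum_bytes):
    """Storage needed to precompute one checksum for every possible nonce."""
    return (2 ** nonce_bits) * checksum_bytes


A, B, C, D = 0x0F, 0x33, 0x55, 0xFF
sequential = strongly_ordered([A, B, C, D])    # ((A ^ B) + C) ^ D  -> 0x6E
regrouped = ((A ^ B) + (C ^ D)) & MASK         # halves computed independently -> 0xE6

if __name__ == "__main__":
    print(hex(sequential), hex(regrouped))
    print(lookup_table_bytes(32, 16) // 2**30, "GiB for 32-bit nonces")
```

Because XOR and ADD do not regroup, an attacker cannot hand half the chain to another core and combine the halves afterwards. And even a 32-bit nonce with a 16-byte checksum already needs a 64 GiB table, so growing the nonce prices precomputation out quickly.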

  14. Timing-based Attestation Attacks 2 • Optimization: Forge one of DP or PC, but then make your self-check faster than • Will always be implementation-specific. This will be the final frontier of this research. • Seshadri tried to work on a proof of optimality of some x86 code. I don't hold out any hope for such a thing. As far as I'm concerned, if someone from Intel/AMD says "this is the best we can do", that's the final "good enough" end state

  15. Timing-based Attestation Attacks 3 • Proxy Attack • Send the checksum computation to a faster machine elsewhere on the network • Send the computation to a faster device within the system (e.g. if attesting peripheral firmware) • Unfortunately this is possible even with TPM • We haven't been able to come up with a way to use the TPM to truly bind the computation to the platform in a time-efficient way (TPMs are slow, it limits our options)

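  16. Network Timing Implementation [Sequence diagram along a time axis. Server → Client: Measurement Type: FOO, Nonce = 0xf005ba11. Client runs Selfcheck(Nonce = 0xf005ba11). Client → Server: Self-Checksum, Nonce = 0xf005ba11. Δt is the interval the server observes between request and reply.]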

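  17. Proxy Attacks [Sequence diagram along a time axis. Server → Compromised Client: Measurement Type: FOO, Nonce = 0xf005ba11. Compromised Client → Faster Client: Measurement Type: FOO, Nonce = 0xf005ba11. Faster Client runs Self-Check (Nonce = 0xf005ba11). Faster Client → Compromised Client: Self-Checksum, Nonce = 0xf005ba11. Compromised Client → Server: Self-Checksum, Nonce = 0xf005ba11. Δt is the interval the server observes.]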

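  18. TPM Timing Implementation Proxy Attack [Sequence diagram along a time axis. Server → Slow Client: TPM Tickstamp, Nonce = 0xf005ba11. Slow Client → TPM: Request Tickstamp(0xf005ba11). TPM → Slow Client: Signed Tickstamp 1. Slow Client → Fast Client: Request Self-Check (nonce = signature). Fast Client runs Self-Check (nonce = signature). Fast Client → Slow Client: Self-Checksum. Slow Client → TPM: Request Tickstamp(checksum[0]). TPM → Slow Client: Signed Tickstamp 2. Slow Client → Server: Signed Tickstamp 1 & 2, Self-Checksum, Nonce = 0xf005ba11. Δt is the interval between the two tickstamps.]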

  19. Timing-based Attestation Attacks 4 • Time of Check,Time of Use (TOCTOU) • Conditions for TOCTOU • 1) The attacker must know when the measurement is about to start. • 2) The attacker must have some un-measured location to hide in for the duration of the measurement. • 3) The attacker must be able to reinstall as soon as possible after the measurement has finished.

  20. So what do these things look like?

  21. Establishing the Genuinity of Remote Computer Systems • Goal was to "check the integrity of its own instructions while simultaneously ensuring that the instructions are running on a real computer." • Focusing on side-effects of the memory hierarchy, such as TLB misses – introduced pseudo-random traversal Aug 2003

  22. Establishing the Genuinity of Remote Computer Systems 2 • Initial checksum modeled as: • Need isolated execution, so that other software doesn't influence the TLB. Thus, run w/o interrupts • Attested the result over the network. "In practice, however, we can assume that the network transport delay is small" – a limiting assumption • Assumed the location of the checksum code is known in advance, and downloaded the code over the network – only workable if the entire OS is provided via something like PXE boot like they did Aug 2003

  23. Establishing the Genuinity of Remote Computer Systems 3 • Summary: • Platform: PC • Architecture: x86 (133MHz, 32 bit, Pentium, Von Neumann) • Coverage: 16MB chunk of vmem (including OS kernel) • Attestation channel: network Aug 2003

  24. (Attack paper) Side effects are not sufficient to authenticate software & An analysis of proposed attacks against genuinity tests • A: "We made our own self-check and could totally fool it into replying fast enough." • B: "Your self-check was wrong! You left out a bunch of stuff!" • A: "Well…we asked for YOUR self-check but you didn't give it to us!" • B: "Well I know the architecture better than you and you didn't consider this and this and this." • Winner? It's in the eye of the beholder, but I side with the Genuinity researchers. • But the attack paper did bring up proxy attacks, which haven't been solved for PC systems yet Aug 2004

  25. SWATT: SoftWare-based ATTestation for Embedded Devices • Adds explicit discussion of the non-parallelizable property (which can still take the form of adds and xors) • First thing to cover all of memory (but that's obviously more possible on a small embedded system) • Also fills in memory with a pseudo-random pattern • First in the line of many CMU software(timing)-based attestation systems • Summary: • Platform: Berkeley Mica Motes • Architecture: Atmel ATmega163L (8 bit, RISC, Harvard) • Coverage: all mem • Attestation channel: network Mar 2004

  26. SWATT: SoftWare-based ATTestation for Embedded Devices 2 • Posted asm (good on them!) Mar 2004

  27. Using Software-based Attestation for Verifying Embedded Systems in Cars • "Many of the embedded systems used in cars, such as the engine timing controller, are based on these kind of small microcontrollers. Hence, our approach is directly applicable to car-based embedded systems" • Bait and switch! It's just SWATT advertising >:( • I hadn't actually read this one before since I don't care as much about embedded systems. • My bad Nov 2004

  28. Remote Software-Based Attestation for Wireless Sensors • Entirely protocol specification, no actual implementation. • Lame. This is me saying, don't do this. Jan 2005

  29. (Attack paper) A generic attack on checksumming-based software tamper resistance • Play games with virtual memory mapping • Outlined the same type of attack on Sparc which later was used by ShadowWalker on x86, but said explicitly they didn't think it was possible on x86 • Play games with segmented memory mapping • "This execute-only permission can be used to detect when an application attempts to read memory relative to CS. As soon as the exception is delivered to an OS modified for our attack, the OS can automatically modify the memory map to make it appear as if the unmodified data was present at that memory page." • We consider this the first TOCTOU paper ("temporarily replacing the page table entry for Code with that for Code"), though they didn't call this fact out May 2005

  30. Pioneer: Verifying Code Integrity and Enforcing Untampered Code Execution on Legacy Systems • "We also assume that the communication channel between the dispatcher and the untrusted platform provides the property of message-origin authentication" uh…unlikely • Meant to serve as a dynamic root of trust to measure hashing code that then hashes whatever else you want to measure • Summary: • Platform: Linux PC • Architecture: 2.2GHz Pentium 4 x86-64 (64 bit, Von Neumann) • Coverage: own kernel module (built into network driver) • Attestation channel: network Oct 2005

  31. Pioneer: Verifying Code Integrity and Enforcing Untampered Code Execution on Legacy Systems 2 • Posted asm (good on them!) Oct 2005

  32. Pioneer: Verifying Code Integrity and Enforcing Untampered Code Execution on Legacy Systems 3 Oct 2005

  33. PRISM: Enabling Personal Verification of Code Integrity, Untampered Execution, and Trusted I/O on Legacy Systems or Human-Verifiable Code Execution • Crank up the number of iterations so that a human can time the execution with a stopwatch. Provide the user with a precomputed list of challenge/response pairs. • Implicit discussion of memory hierarchy benefits • Summary: • Platform: Sharp Zaurus SL6000 (PDA) w/Linux • remember PDAs? :D • Architecture: 400 MHz Intel XScale-PXA255 (ARM!, 32 bit, Von Neumann) • Coverage: own kernel module • Attestation channel: display to human! Feb 2007

  34. Mechanisms to Provide Integrity in SCADA and PCS devices • A port of PRISM to another ARM system • "We assume that the RTU cannot access a faster computing platform (proxy) to perform computation on its behalf." – how convenient! • Summary: • Platform: VxWorks 5.5, SCADAPack 350 RTU by Control Microsystems • Architecture: ARM (32 MHz. ARM7TDMI, 32 bit, Von Neumann) • Coverage: own kernel driver • Attestation channel: network Jun 2008

  35. Remote attestation on legacy operating systems with trusted platform modules • Apply the Pioneer Protocol, but use the TPM as a trusted timer • We actually came up with this in parallel and then found this paper :-/ • But they never implemented it (again, don't be that guy), so we did • We also found some problems in their verification, but we didn't make a big deal out of it ;) Dec 2008

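  36. Network Timing Implementation [Sequence diagram along a time axis. Server → Client: Measurement Type: FOO, Nonce = 0xf005ba11. Client runs Selfcheck(Nonce = 0xf005ba11). Client → Server: Self-Checksum, Nonce = 0xf005ba11. Δt is the interval the server observes between request and reply. Client performs the FOO measurement. Client → Server: FOO measurement results.]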

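  37. Trusted Platform Module (TPM) Timing Implementation [Sequence diagram along a time axis. Server → Client: TPM Tickstamp, Nonce = 0xf005ba11. Client → TPM: Request Tickstamp(0xf005ba11). TPM → Client: Signed Tickstamp 1. Client runs Self-Check (nonce = signature). Client → TPM: Request Tickstamp(Self-Checksum). TPM → Client: Signed Tickstamp 2. Client → Server: Signed Tickstamp 1 & 2, Self-Checksum, Nonce = 0xf005ba11. Δt is the interval between the two tickstamps.]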
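
The message flow on this slide can be mocked up in Python to show what the server ends up checking. Loud assumptions: `FakeTPM` is a toy, not a TPM API; an HMAC with a shared key stands in for the TPM's signature (a real TPM signs with a private key and the server verifies with the public one); the tick counter is advanced by hand via `work_ticks`; and `selfcheck` is any callable you pass in, here just a hash.

```python
import hashlib
import hmac


def sign(key, ticks, blob):
    # Stand-in for the TPM signature over (tick count, blob).
    msg = ticks.to_bytes(8, "big") + blob
    return hmac.new(key, msg, hashlib.sha256).digest()


class FakeTPM:
    """Toy trusted timer: returns the current tick count and a 'signature'."""

    def __init__(self, key):
        self._key = key
        self.ticks = 0

    def tickstamp(self, blob):
        return self.ticks, sign(self._key, self.ticks, blob)


def client_attest(tpm, nonce, selfcheck, work_ticks):
    t1, sig1 = tpm.tickstamp(nonce)        # Signed Tickstamp 1 over the server nonce
    checksum = selfcheck(sig1)             # self-check is seeded with that signature
    tpm.ticks += work_ticks                # simulate the time the self-check took
    t2, sig2 = tpm.tickstamp(checksum)     # Signed Tickstamp 2 over the checksum
    return (t1, sig1), (t2, sig2), checksum


def server_verify(key, nonce, stamp1, stamp2, checksum, expected_selfcheck, max_ticks):
    (t1, sig1), (t2, sig2) = stamp1, stamp2
    return (hmac.compare_digest(sig1, sign(key, t1, nonce))
            and hmac.compare_digest(sig2, sign(key, t2, checksum))
            and checksum == expected_selfcheck(sig1)
            and 0 <= t2 - t1 <= max_ticks)


if __name__ == "__main__":
    key, nonce = b"k" * 32, bytes.fromhex("f005ba11")
    check = lambda seed: hashlib.sha256(seed + b"driver code").digest()
    s1, s2, cs = client_attest(FakeTPM(key), nonce, check, work_ticks=100)
    print(server_verify(key, nonce, s1, s2, cs, check, max_ticks=120))
```

Chaining the first signature into the self-check and the checksum into the second tickstamp is what ties Δt to this particular computation. The timing is taken on the client's own TPM, so network latency drops out of the measurement.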

  38. TPM Implementation – 32 Hosts (e.g. the difference between paper protocols and reality!)

  39. (Attack paper & refutation) On the difficulty of software-based attestation of embedded devices. • ROP attack • They can subvert control flow through ROP in order to do a TOCTOU (but they don't call it a TOCTOU) • Counter: "To the first assumption, software-based attestation was primarily designed to achieve code integrity, but not control-flow integrity. In particular, the SWATT function’s main purpose was to validate the code memory. Hence, the presented “attack” is on a property that is not attempted in general software-based attestation mechanisms." – Technically true, but not a good argument • Counter: "To the second point, some software-based attestation mechanism do verify the stack information. For example, in the Pioneer [12] paper in Section 6.2, list point 3, we state: “The KMA also verifies that the return address on the stack points back to the kernel/LKM code segment.”" – A better argument Nov 2009

  40. (Attack paper & refutation) On the difficulty of software-based attestation of embedded devices. 2 • Compression Attack • "While incurred delay could be detected by a verifier" … WTF? Who cares then!?!? • In general the counter speaks to a lot of the differences of implementing on MicaZ instead of ATMega163L and errors in the port • In general I believe people who have implemented a bunch of TBA instances rather than some new person who comes in, tries to make an implementation, and then looks at their implementation and says "they're all broken!" Nov 2009

  41. SBAP: Software-Based Attestation for Peripherals • Counter-move to a 2009 talk at BlackHat on attacking the firmware on an Apple Wireless Keyboard! • Another port of SWATT • Summary: • Platform: Apple USB Keyboard • Architecture: Cypress enCoRe™ II (24 MHz. CY7C63923, 8 bit, Harvard) • Coverage: all of flash & RAM • Attestation channel: USB Jun 2010

  42. Retroactive Detection of Malware With Applications to Mobile Platforms • Basically an independent port of SWATT to an Android phone (independent replication is a Good Thing) • Playing up the exploitation of memory hierarchy (and filling of RAM) even though it has been done before • Focusing on speed of accessing RAM vs. flash • Trying to counter proxy attacks by making the self-check keep talking out to the server to get new nonces • Like VIPER later, trying to infer proxy communication latency • Summary: • Platform: Android G1 • Architecture: ARM (600 MHz. ARM1136EJ-S, 32 bit, Von Neumann) • Coverage: all of flash & RAM • Attestation channel: ? Aug 2010

  43. A software-based root-of-trust primitive on multicore platforms • Proposes 2 attacks utilizing multi-processing • Have one core run the PRNG and send results to the other so it doesn't need to execute those instructions • Very slight speed improvement, very easy fix of making the PRNG inputs incorporate checksum data (and thus depend on the self-data being read.) • A TOCTOU attack where one processor gains control right after the other processor is done with the self-check • Summary: • Platform: Android G1 • Architecture: x86-64 (2.53 GHz, Intel Core 2, 64 bit, Von Neumann) • Coverage: all of flash & RAM • Attestation channel: USB Mar 2011
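
The "very easy fix" for the first attack can be illustrated in a few lines: if the PRNG update never sees checksum data, the whole nonce sequence can be computed ahead of time on another core; once the checksum is fed in, the sequence depends on what was read from memory. This is a toy sketch with a single-word checksum and a hypothetical `nonce_trace` helper, and XORing the checksum into the nonce is just one assumed way of mixing it in, not the paper's construction.

```python
MASK = 0xFFFFFFFF


def nonce_trace(memory, nonce, steps, feed_checksum):
    """Return the sequence of nonces produced while checksumming `memory`."""
    checksum, dp, trace = 0, 0, []
    for _ in range(steps):
        checksum = (checksum + memory[dp]) & MASK
        nonce = (nonce + ((nonce * nonce) | 5)) & MASK
        if feed_checksum:
            nonce ^= checksum              # assumed fix: PRNG input depends on data read
        dp = nonce % len(memory)
        trace.append(nonce)
    return trace


if __name__ == "__main__":
    mem_a, mem_b = bytes(64), bytes([1]) * 64
    # Without the fix the nonce stream ignores memory, so a helper core can precompute it.
    print(nonce_trace(mem_a, 0xF005BA11, 8, False) == nonce_trace(mem_b, 0xF005BA11, 8, False))
    # With the fix the stream changes as soon as the data differs.
    print(nonce_trace(mem_a, 0xF005BA11, 8, True) == nonce_trace(mem_b, 0xF005BA11, 8, True))
```

With the dependency in place, the second core has nothing useful to run ahead on, since each nonce needs the checksum state from the iteration before it.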

  44. VIPER: Verifying the Integrity of Peripherals' Firmware • Want to make sure all your peripherals (NIC, HD, GPU) are unmodified • But a slow peripheral can proxy to a colluding fast peripheral • Change the protocol from one request for many iterations to many requests for few iterations • But now each few iterations only get a small chunk of memory measured • Make sure the expected RTT is less than would be allowed given the known latencies to talk to the colluder • Summary: • Platform: Netgear GA620 NIC • Architecture: MIPS (2 @ 200 MHz, 32 bit, Von Neumann) • Coverage: its own "driver" • Attestation channel: Memory Mapped IO Oct 2011
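
The reason for switching to many short requests is plain arithmetic: a proxy saves compute time but pays two trips over the link to its colluder on every request. A back-of-the-envelope sketch, with all function names and all numbers (microseconds) invented purely for illustration:

```python
def round_deadline(honest_compute_us, jitter_us):
    """Longest reply time the verifier accepts for one request."""
    return honest_compute_us + jitter_us


def proxied_time(proxy_compute_us, link_latency_us):
    """Reply time when the work is shipped to a faster colluder and back."""
    return proxy_compute_us + 2 * link_latency_us


def proxy_detectable(honest_compute_us, jitter_us, proxy_compute_us, link_latency_us):
    return proxied_time(proxy_compute_us, link_latency_us) > round_deadline(honest_compute_us, jitter_us)


if __name__ == "__main__":
    # One long request: the colluder's speedup swamps the link latency.
    print(proxy_detectable(1_000_000, 10, 500_000, 50))
    # Many short requests: each round is so short that the latency dominates.
    print(proxy_detectable(100, 10, 50, 50))
```

With these made-up numbers the single long request is not detectable (500,100 µs fits easily inside the 1,000,010 µs deadline), while the short round is (150 µs against a 110 µs deadline). The price, as the slide says, is that each short round only measures a small chunk of memory.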

  45. VIPER: Verifying the Integrity of Peripherals' Firmware 2

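  46. Can we apply this to network? [Sequence diagram along a time axis. Server → Compromised Client: Measurement Type: FOO, Nonce = 0xf005ba11. Compromised Client → Faster Client: Measurement Type: FOO, Nonce = 0xf005ba11. Faster Client runs Self-Check (Nonce = 0xf005ba11). Faster Client → Compromised Client: Self-Checksum, Nonce = 0xf005ba11. Compromised Client → Server: Self-Checksum, Nonce = 0xf005ba11. Δt is the interval the server observes.]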

  47. Can we apply this to network? 2 [Sequence diagram labels: Compromised Client; Server; Faster Client; Measurement Type: FOO, Nonce = 0xf005ba11 (sent twice); Infinitely fast proxy host; Δt; Time]

  48. Can we apply this to network? 3 • Make this the new, shorter, expected Δt [Sequence diagram labels: Compromised Client; Server; Faster Client; Measurement Type: FOO, Nonce = 0xf005ba11 (sent twice); Attacker :'(; Δt; Time]

  49. New results for timing-based attestation (my group) • We can very clearly detect attackers over 10 links on an intranet! (6 switches, 3 routers) • We can also use the TPM as a trusted timer for long-haul networks! • We can do attestation on Windows, from a non-hardware-specific kernel driver! • TOCTOU attacks are a big problem, people! • Summary: • Platform: Win XP 32 bit (subsequently ported to Win 7 32/64) • Architecture: Intel Core 2 (3GHz, 32 bit (later 64), Von Neumann) • Coverage: own kernel driver (which includes other measurement code) • Attestation channel: network May 2012

  50. Network Topology [Diagram: one server and clients at 1, 2, 3, 8, and 10 links away, connected through switches and three routers (building 1, building 2, core). Links to client or server are copper. All other links are fiber.] Icons from http://nag.ru/goodies/manuals/Cisco-icons.ppt
