
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to

All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask). Edward J. Schwartz, Thanassis Avgerinos, David Brumley. (Yes, we were trying to overflow the title length field on the submission server.)


Presentation Transcript


  1. All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask). Edward J. Schwartz, Thanassis Avgerinos, David Brumley. (Yes, we were trying to overflow the title length field on the submission server.) Carnegie Mellon University

  2. A Few Things You Need to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask). Edward J. Schwartz, Thanassis Avgerinos, David Brumley

  3. The Root of All Evil: Humans write programs. This Talk: Computers Analyzing Programs Dynamically at Runtime

  4. Two Essential Runtime Analyses
     • Dynamic Taint Analysis: What values are derived from user input? Used to detect exploits [Costa2005, Crandall2005, Newsome2005, Suh2004] and to detect packing in malware [Bayer2009, Yin2007].
     • Forward Symbolic Execution: What input will make execution reach this line of code? Used for automated test case generation [Cadar2008, Godefroid2005, Sen2005] and input filter generation [Costa2007, Brumley2008].

  5. Our Contributions (Computers Analyzing Programs Dynamically at Runtime)
     1: Turn English descriptions into an algorithm • Operational semantics
     2: The algorithm highlights caveats, issues, and unsolved problems that are deceptively hard
     Dynamic Taint Analysis: Is this value affected by user input?
     Forward Symbolic Execution: What input will make execution reach this line of code?

  6. Our Contributions (cont’d) 3: Systematize recurring themes in a wealth of previous work

  7. Dynamic Taint Analysis: What values are derived from user input? • How it works – example • Desired properties • Example issue. Paper has many more.

  8. Taint Introduction: input is tainted.
     Program: x = get_input(); y = x + 42; …; goto y
     Rule: if t = IsUntrusted(src), then get_input(src) ↓ t
     Δ (Var → Val): x = 7
     τ (Var → Tainted?): x = T

  9. Taint Propagation: data derived from user input is tainted.
     Program: x = get_input(); y = x + 42; …; goto y
     Rule (BinOp): if t1 = τ[x1] and t2 = τ[x2], then x1 + x2 ↓ t1 ∨ t2
     Δ (Var → Val): x = 7, y = 49
     τ (Var → Tainted?): x = T, y = T

  10. Taint Checking: policy violation detected.
     Program: x = get_input(); y = x + 42; …; goto y
     Policy: P_goto(t_a) = ¬t_a (must be true to execute)
     Δ (Var → Val): x = 7, y = 49
     τ (Var → Tainted?): x = T, y = T
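The three steps on slides 8 to 10 (introduction, propagation, checking) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' formal semantics: the `Value` class, the `const` helper, and the `TaintPolicyViolation` exception are hypothetical names introduced here, and values carry their own taint bit instead of using separate Δ and τ maps.

```python
from dataclasses import dataclass


class TaintPolicyViolation(Exception):
    """Raised when a tainted value is used as a jump target (hypothetical name)."""


@dataclass(frozen=True)
class Value:
    val: int        # concrete value, as in Δ
    tainted: bool   # taint status, as in τ


def get_input(raw: int, untrusted: bool = True) -> Value:
    # Taint introduction: t = IsUntrusted(src)
    return Value(raw, untrusted)


def const(n: int) -> Value:
    # Constants in the program text are untainted.
    return Value(n, False)


def add(a: Value, b: Value) -> Value:
    # Taint propagation for a binary operator: t1 ∨ t2
    return Value(a.val + b.val, a.tainted or b.tainted)


def goto(target: Value) -> int:
    # Taint checking: P_goto(t_a) = ¬t_a must hold for the jump to execute.
    if target.tainted:
        raise TaintPolicyViolation(f"jump to tainted address {target.val}")
    return target.val


# The running example from the slides: x = get_input(); y = x + 42; goto y
x = get_input(7)
y = add(x, const(42))
try:
    goto(y)
    violation = False
except TaintPolicyViolation:
    violation = True
```

With the input 7, `y` holds 49 and is tainted, so the final `goto` is rejected, matching the "policy violation detected" state on slide 10.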

  11. Real Use: Exploit Detection
     Model: x = get_input(); y = …; …; goto y
     Jumping to an overwritten return address: … strcpy(buffer, argv[1]); … return;

  12. Memory Load
     Δ, variables (Var → Val): x = 7
     μ, memory (Addr → Val): 7 → 42
     τ (Var → Tainted?): x = T
     τμ (Addr → Tainted?): 7 → F

  13. Problem: Memory Addresses
     Program: x = get_input(); y = load(x); …; goto y
     Δ (Var → Val): x = 7
     μ (Addr → Val): 7 → 42
     τμ (Addr → Tainted?): 7 → F
     All values derived from user input are tainted??

  14. Policy 1: Taint depends only on the memory cell
     Rule (Load, taint propagation): if v = Δ[x] and t = τμ[v], then load(x) ↓ t
     Program: x = get_input(); y = load(x); …; goto y
     State: Δ: x = 7; μ: 7 → 42; τμ: 7 → F
     Undertainting: failing to identify tainted values, e.g., missing exploits. The jump target could be any untainted memory cell value.

  15. Policy 2: If either the address or the memory cell is tainted, then the value is tainted
     Rule (Load, taint propagation): if v = Δ[x], t = τμ[v], and t_a = τ[x], then load(x) ↓ t ∨ t_a
     Program: x = get_input(); y = load(jmp_table + x % 2); …; goto y
     The address expression is tainted, so y is tainted. Policy violation?
     Overtainting: unaffected values are tainted, e.g., exploits reported on safe inputs.
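The two load policies on slides 14 and 15 can be compared side by side with a small sketch. It is an illustration under assumptions, not the paper's formal rules: the `load` signature, the policy names `"cell_only"` and `"cell_or_address"`, and the jump-table base address and contents are all hypothetical.

```python
def load(addr, addr_tainted, memory, mem_taint, policy):
    """Return (value, taint) for a memory load under one of the two policies."""
    value = memory[addr]
    cell_tainted = mem_taint.get(addr, False)
    if policy == "cell_only":          # Policy 1: t = τμ[v]
        return value, cell_tainted
    if policy == "cell_or_address":    # Policy 2: t ∨ t_a
        return value, cell_tainted or addr_tainted
    raise ValueError(f"unknown policy: {policy}")


# Slides 13 and 14: x = get_input() = 7 is tainted, but memory cell 7 holds
# an untainted 42, so Policy 1 reports the loaded value as untainted.
memory = {7: 42}
mem_taint = {7: False}
undertainted = load(7, True, memory, mem_taint, "cell_only")

# Slide 15: y = load(jmp_table + x % 2). The table base and its two entries
# are made-up values; both entries are untainted, legitimate jump targets.
JMP_TABLE = 100
memory.update({JMP_TABLE: 0x1000, JMP_TABLE + 1: 0x2000})
x = 7
overtainted = load(JMP_TABLE + x % 2, True, memory, mem_taint, "cell_or_address")
```

Under Policy 1 the attacker-chosen load comes back untainted, so a later `goto y` would pass the check (undertainting). Under Policy 2 the jump-table lookup comes back tainted even though either entry is a safe target, so the same check would fire on a benign input (overtainting).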

  16. Research Challenge: State-of-the-art is not perfect for all programs. Undertainting: policy may miss taint. Overtainting: policy may wrongly detect taint.

  17. Forward Symbolic Execution: What input will make execution reach this line of code? • How it works – example • Inherent problems of symbolic execution • Proposed solutions

  18. The Challenge
     bad_abs(x is input): if (x < 0) then return -x; if (x = 0x12345678) then return -x; return x
     2^32 possible inputs; the second branch is taken only for 0x12345678.
     Forward Symbolic Execution: What input will make execution reach this line of code?

  19. A Simple Example: What input will make execution reach this line of code?
     x is symbolic and can have any value. The interpreter for bad_abs(x is input) forks at each branch:
     • if (x < 0), true: return -x, with path condition x < 0
     • if (x < 0), false: continue with x ≥ 0
     • if (x == 0x12345678), true: return -x, with path condition x ≥ 0 ∧ x == 0x12345678
     • if (x == 0x12345678), false: return x, with path condition x ≥ 0 ∧ x != 0x12345678
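The forking interpreter on slide 19 can be approximated with a short sketch that walks `bad_abs` as a tree and collects one path condition per leaf. Several things here are assumptions made for illustration: the tuple encoding of the program, the names `explore`, `holds`, and `solve`, and above all `solve` itself, which is a crude stand-in for a real constraint solver that only tries the constants appearing in the path and their neighbours. It also treats x as a mathematical integer and ignores 32-bit wraparound.

```python
import operator

OPS = {"<": operator.lt, "==": operator.eq}

# bad_abs from the slides, encoded as ("if", (op, const), then, else)
# or ("return", expr). The encoding is hypothetical.
BAD_ABS = ("if", ("<", 0),
           ("return", "-x"),
           ("if", ("==", 0x12345678),
            ("return", "-x"),
            ("return", "x")))


def explore(node, path=()):
    """Fork at each branch; yield (path_condition, returned_expr) per path."""
    if node[0] == "return":
        yield path, node[1]
        return
    _, cond, then_node, else_node = node
    yield from explore(then_node, path + ((cond, True),))
    yield from explore(else_node, path + ((cond, False),))


def holds(path, x):
    """Check whether the concrete input x satisfies every branch decision."""
    return all(OPS[op](x, c) == want for (op, c), want in path)


def solve(path):
    """Stand-in for a solver: try constants from the path and their neighbours."""
    candidates = {0}
    for (_, c), _ in path:
        candidates.update((c - 1, c, c + 1))
    for x in sorted(candidates):
        if holds(path, x):
            return x
    return None


results = [(solve(path), expr) for path, expr in explore(BAD_ABS)]
```

Each entry of `results` pairs a witness input with the expression returned on that path; the second path is the one that random testing over 2^32 inputs would almost never hit, and its witness is exactly 0x12345678.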

  20. One Problem: Exponential Blowup Due to Branches. The interpreter forks at Branch 1, Branch 2, Branch 3, and so on: the number of interpreters/formulas is exponential in the number of branches.
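The growth described on slide 20 is easy to see by enumerating branch outcomes directly. The sketch below assumes the worst case, where every branch depends on the input and both outcomes are feasible; `enumerate_paths` is a hypothetical helper, not something from the talk.

```python
def enumerate_paths(num_branches):
    """List every combination of taken / not-taken outcomes for sequential branches."""
    paths = [()]
    for _ in range(num_branches):
        # Each existing interpreter forks into a true and a false successor.
        paths = [p + (taken,) for p in paths for taken in (True, False)]
    return paths


three = enumerate_paths(3)
ten = enumerate_paths(10)
```

Three branches already give 8 paths and ten give 1024, which is why the path selection strategy matters in practice.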

  21. Path Selection Heuristics (over the symbolic execution tree) • Depth-First Search (bounded), Random Search [Cadar2008] • Concolic Testing [Sen2005, Godefroid2008] … However, these are heuristics. In the worst case, all create an exponential number of formulas in the tree height.

  22. Symbolic Execution is not Easy • Exponential number of interpreters/formulas (from branching) • Exponentially-sized formulas (from substitution), e.g., s + s + s + s + s + s + s + s == 42 • Solving a formula is NP-Complete!
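The formula on slide 22 is the kind of expression that naive substitution produces. As a sketch, assume a hypothetical program that starts with `x = s` and then repeats `x = x + x`; the slide shows only the resulting formula, so the program and the `substitute_doubling` helper are assumptions.

```python
def substitute_doubling(steps: int) -> str:
    """Symbolic value of x after `steps` repetitions of x = x + x, starting at x = s."""
    expr = "s"
    for _ in range(steps):
        # Substituting the current expression for both occurrences of x doubles its size.
        expr = f"{expr} + {expr}"
    return expr


formula = f"{substitute_doubling(3)} == 42"
occurrences_after_ten = substitute_doubling(10).count("s")
```

Three assignments produce the eight-term sum from the slide, and each further assignment doubles the number of occurrences of `s`, so the formula size is exponential in the number of statements even without any branching.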

  23. Other Important Issues • Formalization • More complex policies. Example formula: Π = (s + s + s + s + s + s + s + s) == 42

  24. Conclusion • Dynamic taint analysis and forward symbolic execution are used extensively in the literature • The formal algorithm, and what is done for each possible step of execution, is often not emphasized • We provided a formal definition and summarized: critical issues, state-of-the-art solutions, and common tradeoffs

  25. Thank You! Questions? thanassis@cmu.edu
