
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring

Shimin Chen. Joint work with Michael Kozuch (1), Theodoros Strigkos (2), Babak Falsafi (3), Phillip B. Gibbons (1), Todd C. Mowry (1,2), Vijaya Ramachandran (4), Olatunji Ruwase (2), Michael Ryan (1), Evangelos Vlachos (2).


Presentation Transcript


  1. Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
     Shimin Chen
     Joint work with Michael Kozuch (1), Theodoros Strigkos (2), Babak Falsafi (3), Phillip B. Gibbons (1), Todd C. Mowry (1,2), Vijaya Ramachandran (4), Olatunji Ruwase (2), Michael Ryan (1), Evangelos Vlachos (2)
     (1) Intel Research Pittsburgh, (2) CMU, (3) EPFL, (4) UT Austin

  2. Instruction-Grain Monitoring
     [Diagram: Application, Lifeguard]
     • Software often contains bugs
       • Memory corruptions, data races, …, crashes
     • Security attacks are often designed to exploit bugs
     • Instruction-grain lifeguards can help
       • Dynamic monitoring: during application execution
       • Instruction-grain: e.g., memory access, data flow
       • Enables a wide range of powerful lifeguards

  3. Example Instruction-Grain Lifeguards
     • AddrCheck [Nethercote’04]:
       • Monitor malloc/free, memory accesses
       • Check if all memory accesses visit allocated memory regions
     • MemCheck [Nethercote & Seward ’03, ’07]: AddrCheck + check uninitialized values
       • Copying partially uninitialized structures is not an error
       • Lazy error detection to avoid many false positives
       • Track propagation of uninitialized values
     • TaintCheck [Newsome & Song’05]: detect overwrite-based security exploits
       • Tainted data: data from network or disk
       • Track propagation of tainted data to detect violations
     • LockSet [Savage et al.’97]: detect data races in parallel programs

  4. Design Space of Support Platform
     [Figure: performance (poor to good) plotted against generality (general purpose, supporting a wide range of lifeguards, to a specific lifeguard)]
     • Dynamic binary instrumentation (DBI): 10-100X slowdowns [Bruening’04], [Luk et al’05], [Nethercote’04]
     • General-purpose HW improving DBI: 3-8X slowdowns [Chen et al’06], [Corliss’03]
     • Lifeguard-specific hardware: [Crandall & Chong’04], [Dalton et al’07], [Shetty et al’06], [Shi et al’06], [Suh et al’04], [Venkataramani’07], [Venkataramani’08], [Zhou et al’07]
     • This paper

  5. Outline
     • Introduction
     • Background
     • Three Hardware Acceleration Techniques
     • Experimental Evaluation
     • Conclusion

  6. Example Lifeguard: TaintCheck [Newsome & Song’05]
     • Purpose: detect overwrite-based security exploits
     • Metadata kept for application memory and registers
       • Tainted data: data from network or disk
       • Track taint propagation
       • Detect violation: e.g., tainted jump target address
     Application instruction      TaintCheck lifeguard action
       mov %eax ← A                 taint(%eax) = taint(A)
       mov B ← %eax                 taint(B) = taint(%eax)
       add %ebx ← D                 taint(%ebx) |= taint(D)
       jmp *(F)                     if (taint(F)==1) error;
     Detect the exploit before attack code takes control
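The propagation rules on slide 6 can be sketched in software roughly as follows. This is a minimal illustration, not the lifeguard's actual code: the toy 64 KiB address space, the register count, and the function names `taint_input`, `mem_to_reg`, `reg_to_mem`, `dest_reg_op_mem`, and `check_jump_target` are all assumptions made for the example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical sizes: a toy 64 KiB address space and 8 registers. */
#define MEM_SIZE 65536u
#define NUM_REGS 8u

uint8_t mem_taint[MEM_SIZE];   /* 1 = tainted, one entry per application byte */
uint8_t reg_taint[NUM_REGS];   /* 1 = tainted, one entry per register */

/* Data arriving from network or disk is marked tainted. */
void taint_input(uint32_t addr, uint32_t len) {
    for (uint32_t i = 0; i < len; i++)
        mem_taint[(addr + i) % MEM_SIZE] = 1;
}

/* mov reg <- mem:  taint(reg) = taint(mem) */
void mem_to_reg(uint32_t reg, uint32_t addr) {
    reg_taint[reg % NUM_REGS] = mem_taint[addr % MEM_SIZE];
}

/* mov mem <- reg:  taint(mem) = taint(reg) */
void reg_to_mem(uint32_t addr, uint32_t reg) {
    mem_taint[addr % MEM_SIZE] = reg_taint[reg % NUM_REGS];
}

/* add reg <- mem:  taint(reg) |= taint(mem) */
void dest_reg_op_mem(uint32_t reg, uint32_t addr) {
    reg_taint[reg % NUM_REGS] |= mem_taint[addr % MEM_SIZE];
}

/* jmp *(mem): returns true when the jump target is tainted (a violation). */
bool check_jump_target(uint32_t addr) {
    return mem_taint[addr % MEM_SIZE] != 0;
}
```

Calling `taint_input` on a buffer, copying one of its bytes through a register with `mem_to_reg` and `reg_to_mem`, and then calling `check_jump_target` on the destination reports a violation, which is the point at which a lifeguard would stop the application.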

  7. TaintCheck w/ Detailed Tracking
     [Diagram: taint trail from Input to Violation]
     • TaintCheck [Newsome & Song’05]:
       • Detect violation
       • 1 taint bit / application byte
     • TaintCheck w/ detailed tracking:
       • Construct taint propagation trail
       • More detailed metadata per application location
         • PC of the instruction that tainted this location
         • “tainted from” address
       • Not supported by previous lifeguard-specific HW

  8. Instruction-Grain Lifeguard Metadata Characteristics
     • Organization varies
       • per application byte/word
       • size, format, semantics vary greatly
     • Frequently updated
       • e.g., propagation tracking
     • Frequently checked
       • e.g., memory accesses

  9. Lifeguard Support
     [Diagram: events from the Application (unmodified) pass through event capture and delivery (general-purpose HW improving DBI) to the event handlers of the Lifeguard (software), which update and check metadata]
     • Rare events: e.g., malloc/free, system calls
     • Frequent events: e.g., memory access, data movement
     • Performance bottlenecks: metadata mapping, updates, and checks

  10. Our Contributions
     [Diagram: same Application (unmodified) → event capture and delivery → Lifeguard (software) event handlers pipeline, with M-TLB, IT, and IF added]
     • Rare events: e.g., malloc/free, system calls
     • Frequent events: e.g., memory access, data movement
     • Metadata-TLB (M-TLB) for metadata mapping
     • Inheritance Tracking (IT) for metadata updates
     • Idempotent Filters (IF) for metadata checks

  11. Outline
     • Introduction
     • Background
     • Three Hardware Acceleration Techniques
       • Metadata-TLB
       • Inheritance Tracking
       • Idempotent Filters
     • Experimental Evaluation
     • Conclusion

  12. Metadata-TLB: Motivation
     [Diagram: metadata organized as a level-1 index pointing to level-2 chunks]
     • Metadata per app byte/word
       • Element size may vary
     • Two-level structure:
       • Robustness & space efficiency
     • Mapping: application address → metadata address
       • Frequently used in almost every handler
       • Can be very costly

  13. Example (TaintCheck)
     // app instruction type: dest_reg ← dest_reg op mem(src_addr)
     // handler operation: reg_taint(dest_reg) |= mem_taint(src_addr)
     void dest_reg_op_mem_4B (UINT32 src_addr /* %eax */, UINT32 dest_reg /* %edx */)
       C statement                                 Generated assembly
       map *mp = level1_index[src_addr>>16];       mov    %eax, %ecx
                                                   shr    $16, %ecx
                                                   mov    level1_index(,%ecx,4), %ecx
       int idx = (src_addr & 0xffff)>>2;           and    $0xffff, %eax
                                                   shr    $2, %eax
       UChar mem_taint = mp[idx];                  movzbl (%ecx,%eax,1), %eax
       reg_taint[dest_reg] |= mem_taint;           or     %al, reg_taint(%edx)
       nlba();                                     nlba
     Metadata mapping takes 5 out of 8 instructions!

  14. Our Solution: Metadata-TLB
     • A TLB-like HW associative lookup table
     • LMA (Load Metadata Address) instruction:
       • Application address → lifeguard metadata address
     • Managed by (user-mode) lifeguard software
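To make the idea concrete, here is a small software model of what an LMA lookup might do. It is only a sketch under stated assumptions: the function names `lma` and `walk_two_level`, the 8-entry table, the round-robin replacement, and the hit/miss counters are invented for illustration, while the address split (high 16 bits select a chunk, one metadata byte per 4-byte word) follows the handler code on slide 13. On a miss, lifeguard software walks the two-level structure and refills the table, in the spirit of "managed by (user-mode) lifeguard software".

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_BITS   16u            /* each level-2 chunk covers 64 KiB of app addresses */
#define L1_ENTRIES   (1u << 16)     /* 32-bit app addresses: 2^16 chunks */
#define CHUNK_ELEMS  (1u << 14)     /* one metadata byte per 4-byte app word */
#define MTLB_ENTRIES 8u             /* hypothetical M-TLB size */

uint8_t *level1_index[L1_ENTRIES];  /* level-1 index of level-2 chunk pointers */

typedef struct { int valid; uint32_t tag; uint8_t *chunk; } mtlb_entry;
mtlb_entry mtlb[MTLB_ENTRIES];
unsigned mtlb_next;                 /* round-robin replacement cursor */
unsigned long mtlb_hits, mtlb_misses;

/* Miss path (software): walk the two-level table, allocating chunks on demand. */
uint8_t *walk_two_level(uint32_t app_addr) {
    uint32_t hi = app_addr >> CHUNK_BITS;
    if (level1_index[hi] == NULL)
        level1_index[hi] = calloc(CHUNK_ELEMS, 1);
    return level1_index[hi];        /* NULL only if allocation failed */
}

/* Model of LMA: application address -> lifeguard metadata address. */
uint8_t *lma(uint32_t app_addr) {
    uint32_t tag = app_addr >> CHUNK_BITS;
    uint32_t idx = (app_addr & 0xffffu) >> 2;

    /* Associative lookup: compare the tag against every entry. */
    for (unsigned i = 0; i < MTLB_ENTRIES; i++) {
        if (mtlb[i].valid && mtlb[i].tag == tag) {
            mtlb_hits++;
            return mtlb[i].chunk + idx;
        }
    }

    /* Miss: fall back to the software walk, then refill one entry. */
    mtlb_misses++;
    uint8_t *chunk = walk_two_level(app_addr);
    if (chunk == NULL)
        return NULL;
    mtlb[mtlb_next] = (mtlb_entry){1, tag, chunk};
    mtlb_next = (mtlb_next + 1) % MTLB_ENTRIES;
    return chunk + idx;
}
```

With good locality most lookups hit, so a handler pays for one lookup instead of the five mapping instructions counted on slide 13.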

  15. Example (TaintCheck) w/ M-TLB
     // app instruction type: dest_reg ← dest_reg op mem(src_addr)
     // handler operation: reg_taint(dest_reg) |= mem_taint(src_addr)
     void dest_reg_op_mem_4B (UINT32 src_addr /* %eax */, UINT32 dest_reg /* %edx */)
     Without M-TLB:
       map *mp = level1_index[src_addr>>16];       mov    %eax, %ecx
                                                   shr    $16, %ecx
                                                   mov    level1_index(,%ecx,4), %ecx
       int idx = (src_addr & 0xffff)>>2;           and    $0xffff, %eax
                                                   shr    $2, %eax
       UChar mem_taint = mp[idx];                  movzbl (%ecx,%eax,1), %eax
       reg_taint[dest_reg] |= mem_taint;           or     %al, reg_taint(%edx)
       nlba();                                     nlba
     With M-TLB:
       UChar *p = LMA_macro(src_addr);             LMA    %eax, %ecx
       UChar mem_taint = *p;                       mov    (%ecx), %al
       reg_taint[dest_reg] |= mem_taint;           or     %al, reg_taint(%edx)
       nlba();                                     nlba
     Reduces the handler size by half!

  16. Inheritance Tracking: Motivation
     • Propagation tracking is expensive
       • Metadata updates for almost every app instruction
     • Previous hardware solutions track propagation
       • automatically update metadata in hardware
       • Problem: only support simple metadata semantics
       • e.g., do not support TaintCheck w/ detailed tracking
     • Our goal: flexibility AND performance
     • Idea: inheritance structure is common, so let’s track inheritance in hardware!

  17. Problem with General Inheritance Tracking
     Application        Propagation Tracking        Inheritance Tracking
       mov %eax ← A       taint(%eax) = taint(A)      %eax inherits from A
       mov B ← %eax       taint(B) = taint(%eax)      B inherits from %eax
       add %ebx ← D       taint(%ebx) |= taint(D)     insert D into %ebx’s inherit-from list
     Problem: state explosion for binary operations!

  18. Unary Inheritance Tracking
     • Many lifeguards can take advantage of unary IT:
       • MemCheck
       • TaintCheck
     • Large performance improvements if used
     • Can be disabled if unary IT does not match the lifeguard
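A much-simplified software model of unary inheritance tracking for registers is sketched below. It is not the state transition table from the paper: the event names are borrowed from slide 20, but the `it_table` layout, the `app_load`, `app_store`, `app_binary_op_mem`, and `materialize` functions, and the conflict rule are assumptions made for illustration. A register-from-memory move only records the inheritance. A later store of that register is delivered as a single `mem_to_mem` event. A binary operation conservatively falls back to ordinary event delivery, which is likely more conservative than the real design.

```c
#include <stdint.h>

#define NUM_REGS   8u
#define MAX_EVENTS 64u

typedef enum { EV_MEM_TO_REG, EV_REG_TO_MEM, EV_MEM_TO_MEM, EV_DEST_REG_OP_MEM } event_kind;
typedef struct { event_kind kind; uint32_t src; uint32_t dst; } event;

/* Events that reach the lifeguard's handlers (recorded here for inspection). */
event delivered[MAX_EVENTS];
unsigned num_delivered;

/* IT table: each register inherits from at most one memory address (unary). */
typedef struct { int inherits; uint32_t from_addr; } it_entry;
it_entry it_table[NUM_REGS];

void deliver(event_kind kind, uint32_t src, uint32_t dst) {
    if (num_delivered < MAX_EVENTS)
        delivered[num_delivered++] = (event){kind, src, dst};
}

/* Turn a pending inheritance into an explicit mem_to_reg event. */
void materialize(uint32_t reg) {
    it_entry *e = &it_table[reg % NUM_REGS];
    if (e->inherits) {
        deliver(EV_MEM_TO_REG, e->from_addr, reg);
        e->inherits = 0;
    }
}

/* mov reg <- mem: record the inheritance, deliver nothing yet. */
void app_load(uint32_t reg, uint32_t addr) {
    it_table[reg % NUM_REGS] = (it_entry){1, addr};
}

/* mov mem <- reg */
void app_store(uint32_t addr, uint32_t reg) {
    /* Simplified conflict rule: any other register that still inherits from
       the overwritten address must read its metadata before it changes. */
    for (uint32_t r = 0; r < NUM_REGS; r++)
        if (r != reg % NUM_REGS && it_table[r].inherits && it_table[r].from_addr == addr)
            materialize(r);

    it_entry *e = &it_table[reg % NUM_REGS];
    if (e->inherits)
        deliver(EV_MEM_TO_MEM, e->from_addr, addr);  /* two moves become one event */
    else
        deliver(EV_REG_TO_MEM, reg, addr);
}

/* add reg <- mem: not unary, so fall back to ordinary delivery. */
void app_binary_op_mem(uint32_t reg, uint32_t addr) {
    materialize(reg);
    deliver(EV_DEST_REG_OP_MEM, addr, reg);
}
```

For the sequence `mov %eax ← A; mov B ← %eax`, this model delivers one `mem_to_mem` event instead of a `mem_to_reg` followed by a `reg_to_mem`.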

  19. Tracking Register Inheritance
     [Diagram: an original event looks up IT(%rs) and IT(%rd) in the IT table for registers; the state transition logic decides the event to deliver; the transformed event is then delivered]
     • More details in the paper:
       • IT table and state transition table details
       • Conflict detection

  20. Example
     Application        Events before Inheritance Tracking
       mov %eax ← A       mem_to_reg
       mov B ← %eax       reg_to_mem
       mov %ebx ← C       mem_to_reg
       add %ebx ← D       dest_reg_op_mem
       mov E ← %ebx       reg_to_mem
     Events with Inheritance Tracking: mem_to_mem, imm_to_mem
     Can significantly reduce metadata update events!

  21. Idempotent Filters: Idea
     • Typically, metadata checks give the same result if
       • Event parameters are the same and
       • Metadata are the same
     • Idea: filter out idempotent (redundant) events
     • For example:
       • AddrCheck:
         • After checking that a memory location is allocated
         • Subsequent loads/stores to the same location are safe
         • Until the next free() event
       • LockSet: (surprisingly)
         • In between synchronization events (e.g., lock/unlock)
         • Check first load to a location
         • Check first store to a location
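The AddrCheck case on slide 21 might be modeled in software as a small set-associative cache of already-checked addresses that is flushed on every `free()`. This is a sketch under assumptions: the sizes, the round-robin replacement, the counters, and the names `filter_access` and `filter_invalidate_all` are hypothetical, and accesses are assumed to be fixed-size so that the address alone identifies an event. A LockSet-style filter would additionally keep loads and stores apart and flush at synchronization events.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define FILTER_SETS 8u   /* hypothetical geometry: 8 sets x 4 ways = 32 entries */
#define FILTER_WAYS 4u

typedef struct { bool valid; uint32_t addr; } filter_entry;
filter_entry filter[FILTER_SETS][FILTER_WAYS];
unsigned next_way[FILTER_SETS];           /* round-robin replacement per set */
unsigned long checks_delivered, checks_filtered;

/* Returns true if the check event must be delivered to the lifeguard,
   false if an identical check already passed since the last flush. */
bool filter_access(uint32_t addr) {
    unsigned set = (addr >> 2) % FILTER_SETS;

    for (unsigned w = 0; w < FILTER_WAYS; w++) {
        if (filter[set][w].valid && filter[set][w].addr == addr) {
            checks_filtered++;
            return false;                 /* redundant check: filter it out */
        }
    }

    /* First check of this address since the last flush: remember it. */
    filter[set][next_way[set]] = (filter_entry){true, addr};
    next_way[set] = (next_way[set] + 1) % FILTER_WAYS;
    checks_delivered++;
    return true;
}

/* Call on free() (or any event that changes the checked metadata):
   earlier results may no longer hold, so forget all of them. */
void filter_invalidate_all(void) {
    memset(filter, 0, sizeof filter);
    memset(next_way, 0, sizeof next_way);
}
```

Flushing the whole filter on `free()` is the simplest safe choice; it trades some filtering opportunity for never suppressing a check whose result could have changed.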

  22. Outline
     • Introduction
     • Background
     • Three Hardware Acceleration Techniques
     • Experimental Evaluation
       • Log-Based Architectures (LBA)
       • Simulation Study (w/ reduced input sets)
       • PIN-based Analysis (w/ full inputs)
     • Conclusion

  23. Log-Based Architectures
     [Diagram: events from the Application (unmodified) pass through event capture and delivery, provided by the Log-Based Architecture (LBA), to the event handlers of the Lifeguard (software), which update and check metadata]
     • Rare events: e.g., malloc/free, system calls
     • Frequent events: e.g., memory access, data movement

  24. Idea: Exploiting Chip Multiprocessors
     [Figure: a chip multiprocessor with 16 processor cores (P), annotated with LBA components]

  25. Simulation Setup: Dual-Core LBA System
     [Diagram: Application and Lifeguard run on Core 1 and Core 2 under Fedora Core 5. Log path: capture → compress → log transport (e.g., L2 cache) → decompress → dispatch. Added components: M-TLB, IT & IF]
     • Extend Virtutech Simics
     • Application and lifeguard are processes
     • Application is stalled when log buffer is full
     • Model a 2-level cache hierarchy

  26. Overall Performance: TaintCheck
     Slowdown = (application execution time w/ lifeguard) / (application execution time w/o lifeguard)
     [Chart: slowdowns for LBA baseline vs. LBA optimized; annotated value: 1.36X]

  27. Applying Our Techniques One by One
     • IT, IF, and M-TLB are indeed complementary
     • Achieve dramatically better performance
     [Bar chart: average slowdowns (y-axis, 0.0 to 10.0) for TaintCheck, MemCheck, TaintCheck w/ detailed tracking, AddrCheck, and LockSet under the configurations BASE, MTLB, MTLB+IT, MTLB+IF, and MTLB+IT+IF; bar labels range from 7.80 down to 1.02]

  28. PIN-Based Analysis: IT
     • IT removes 35.8% to 82.0% of the propagation events

  29. PIN-Based Analysis: IF
     [Two plots, AddrCheck and LockSet: reduced check events (%) vs. number of filter entries (8 to 256), one curve each for 1-way, 2-way, 4-way, 8-way, 16-way, and fully-associative filters]
     • IF can effectively reduce check events
     • 4-way works as well as fully-associative

  30. Conclusion
     • Our focus: Instruction-Grain Lifeguards
     • Three complementary hardware techniques:
       • Metadata-TLB (M-TLB)
       • Inheritance Tracking (IT)
       • Idempotent Filters (IF)
     • Flexible to support a wide range of lifeguards
     • Reducing overheads by 2-3X in our experiments
     • Achieving 2-51% overheads for all but MemCheck

  31. Thank you!

  32. People Working on LBA Project
     • Intel Research: Shimin Chen, Phillip B. Gibbons, Mike Kozuch, Michael Ryan
     • University Faculty: Babak Falsafi (EPFL), Todd C. Mowry (CMU), Vijaya Ramachandran (UT Austin)
     • CMU Students: Michelle Goodstein, Olatunji Ruwase, Theodoros Strigkos, Evangelos Vlachos
     • Previous Contributors: Limor Fix (IRP), Steve Schlosser (IRP), Anastasia Ailamaki (CMU), Greg Ganger (CMU), Bin Lin (Northwestern), Radu Teodorescu (UIUC)
