Persistent Code Caching

Exploiting Code Reuse Across Executions & Applications. Persistent Code Caching. Vijay Janapa Reddi†, Dan Connors‡, Robert Cohn§, Michael D. Smith†. Execution environments that provide an interface to the dynamic instruction stream of an application. Runtime Compilation System.


Presentation Transcript


  1. Exploiting Code Reuse Across Executions & Applications: Persistent Code Caching. Vijay Janapa Reddi†, Dan Connors‡, Robert Cohn§, Michael D. Smith†

  2. Runtime Compilation System: execution environments that provide an interface to the dynamic instruction stream of an application. Overheads: • Runtime compilation • Performance of the compiled code

  3. Managing compilation overhead via software code caching. Original dynamic instruction stream: A B C C A. With code caching, the runtime system (RS) translates each block once and then reuses the cached code: RS A’, RS B’, RS C’, C’, A’ (plotted against execution time). Basis: 90% of execution time is spent in 10% of the code (the hot code).
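
The reuse pattern on this slide can be sketched in a few lines of C++. This is a toy model, not Pin's implementation: the names `CodeCache`, `lookup`, and `run` are hypothetical, and "translation" is just appending a prime mark to the block name while counting how often the runtime system would be invoked.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of a software code cache; names are illustrative only.
struct CodeCache {
    std::unordered_map<char, std::string> translated; // block id -> translated block
    std::size_t translations = 0;                     // times the runtime system ran

    // Return the cached translation, translating on first use only.
    const std::string& lookup(char block) {
        auto it = translated.find(block);
        if (it == translated.end()) {
            ++translations; // stand-in for the expensive runtime compilation step
            it = translated.emplace(block, std::string(1, block) + "'").first;
        }
        return it->second;
    }
};

// Execute a dynamic block stream through the cache and return what was executed.
std::vector<std::string> run(CodeCache& cache, const std::string& stream) {
    std::vector<std::string> executed;
    for (char block : stream) executed.push_back(cache.lookup(block));
    return executed;
}
```

Running the stream "ABCCA" through `run` invokes the translation step three times (A, B, C) and serves the second C and second A from the cache, which is the saving the slide illustrates.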

  4. Highlights of this talk: • Challenges in deploying dynamic binary instrumentation into production regression testing environments • Case study of the Oracle database. Problem statement: There exist execution domains where code caching is ineffective, which limits the deployment of runtime compilation systems.

  5. Caching performance varies based on program behavior. Loop-intensive application: 181.mcf. Large code footprint and infrequent code reuse: 176.gcc. (Chart legend: Runtime Compilation, Code Cache.)

  6. Caching performance varies based on program behavior. Chart of normalized execution time (legend: Runtime Compilation, Code Cache) for Mcf, Eon, Vpr, Twolf, Gap, Bzip2, Gzip, Parser, Vortex, Crafty, Perl, and Gcc, annotated from "loop intensive (frequent reuse)" to "large footprint (infrequent reuse)".

  7. Benchmark 176.gcc is not an outlier. GUI applications: large startup cost; library initialization executed < 10 times. Chart of normalized execution time (legend: Runtime Compilation, Code Cache) for Oracle, Gedit, Dia, Gvim, File Roller, Gftp, and Gqview.

  8. Code caching suffers under certain execution behaviors: less code reuse, large code footprint, short run times. Not uncommon! • Regression testing • Oracle (100,000 tests) • Gcc (4000+ tests). Chart: execution time of 176.gcc (5 SPEC reference inputs). Cold code is hot code across executions!

  9. Caching code across executions improves caching performance. Original dynamic instruction stream: A B C C A. Caching (Run 1): RS A’, RS B’, RS C’, C’, A’. Caching (Run 2): RS A’, RS B’, RS C’, C’, A’. Persistent caching (Run 2): A’, B’, C’, C’, A’. Reduce overhead by storing and reusing caches. (Timelines plotted against execution time.)
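
A minimal sketch of "storing and reusing caches", under loud assumptions: the `Cache`, `save`, and `load` names are hypothetical, the on-disk format (one "id translation" pair per line) is invented for illustration, and a translated block is modelled as a short string rather than machine code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical model: block id -> translated block, plus a translation counter.
struct Cache {
    std::map<char, std::string> blocks;
    std::size_t translations = 0;

    // Execute a block stream, translating only blocks that are not cached yet.
    void execute(const std::string& stream) {
        for (char b : stream) {
            if (blocks.count(b) == 0) {
                ++translations; // runtime system (RS) invoked
                blocks[b] = std::string(1, b) + "'";
            }
        }
    }
};

// Store the cache at the end of a run: one "id translation" pair per line.
void save(const Cache& c, std::ostream& out) {
    for (const auto& kv : c.blocks) out << kv.first << ' ' << kv.second << '\n';
}

// Reload a stored cache at the start of the next run.
Cache load(std::istream& in) {
    Cache c;
    char id;
    std::string code;
    while (in >> id >> code) c.blocks[id] = code;
    return c;
}
```

In this model, a first run over "ABCCA" performs three translations; a second run that starts from the loaded cache performs none, which corresponds to the "Persistent caching (Run 2)" timeline. The stream arguments accept a `std::stringstream` or a file stream equally.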

  10. Appropriate system for evaluating persistence: general model, robust design, enterprise-scale usage. Implementation framework: Pin (dynamic binary instrumentation). Diagram labels: Address Space, Client, Interface, Runtime System Components, Application, Code Cache, Operating System, Hardware.

  11. Persistent Pin. The persistent cache holds: translated code, translation data structures, correctness metadata. Diagram labels: Address Space, Client, Persistent Cache DB, Interface, Persistence Mgr., Pin Components, Application, Code Cache, Operating System, Hardware.
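
The slide does not say what the correctness metadata contains. As a sketch only, assume each stored translation records the image it came from, where that image was loaded, and a checksum of its contents; `ImageKey` and `reusable` are hypothetical names, not part of Pin.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed metadata recorded per loaded image; the slide does not list the actual fields.
struct ImageKey {
    std::string path;          // binary or shared library the code came from
    std::uint64_t loadAddress; // where it was mapped when the cache was built
    std::uint64_t checksum;    // fingerprint of the image contents

    bool operator==(const ImageKey& o) const {
        return path == o.path && loadAddress == o.loadAddress && checksum == o.checksum;
    }
};

// A stored translation is reusable only if the image it came from is unchanged
// and mapped at the same address in the current process.
bool reusable(const ImageKey& stored, const std::vector<ImageKey>& current) {
    for (const ImageKey& k : current) {
        if (k == stored) return true;
    }
    return false;
}
```

Under these assumptions, a library that is rebuilt or mapped at a different address fails the check and its code would have to be translated again rather than reused.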

  12. Experimental setup: • IA32 Linux implementation • Bounded cache (320MB) • Applications ran unmodified • No cache flushes occurred. Procedure: Pin runs Input X starting from an empty cache and produces Persistent Cache X; Pin then runs Input ? with Persistent Cache X, and the improvement is measured.

  13. Exploiting code reuse across executions and applications. Code coverage: Bull's eye (100% reuse)

  14. Persistent caching is complementary to the current code caching model. Persistent caching works across program classes. Benefits large code footprint applications. Chart: SPEC 2000 INT (reference inputs).

  15. Persistent caching is effective for short-running applications. Input data set alters program behavior. Small improvements get bigger (Gap) and large improvements get even larger (Gcc).

  16. Evaluating persistent caching across program inputs. Chart: code coverage between inputs (axis from 50% to 100%) for 253.perlbmk, 175.vpr, 176.gcc, 164.gzip, 256.bzip2, and Oracle.
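
The slide does not define "code coverage between inputs". One plausible reading, used here as an assumption, is the fraction of the blocks a new input needs that a persistent cache built from an earlier input already contains. The function name `coverage` and the use of block start addresses as keys are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Fraction of the blocks needed by a new input that a persistent cache built
// from an earlier input already contains (assumed definition of coverage).
double coverage(const std::set<unsigned long>& cached,
                const std::set<unsigned long>& needed) {
    if (needed.empty()) return 1.0; // nothing to translate, so nothing is missing
    std::size_t hits = 0;
    for (unsigned long addr : needed) hits += cached.count(addr);
    return static_cast<double>(hits) / static_cast<double>(needed.size());
}
```

With this definition, a cache holding blocks {0x1000, 0x1010, 0x1020, 0x1030} covers half of an input that needs {0x1020, 0x1030, 0x1040, 0x1050}; the other half would still be translated at run time.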

  17. Production environments require runtime system improvements. • Case study: Regression testing of Oracle XE. Timings for one unit-test: Oracle: 80s; Oracle + Pin (translation): 2000s; Oracle + Pin (translation) + Instrumentation (memory tracing): 3000s.
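
Expressed as slowdown factors relative to native execution, the timings on this slide work out as follows (plain arithmetic on the three figures given, nothing else assumed):

```latex
\[
\frac{2000\,\text{s}}{80\,\text{s}} = 25\times \ \text{(translation only)},
\qquad
\frac{3000\,\text{s}}{80\,\text{s}} = 37.5\times \ \text{(translation + memory tracing)}
\]
```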

  18. Challenges. Oracle’s execution phases: Start, Mount, Open, Work, Close. (1) Large number of process compilations: Oracle is a multi-process programming environment.

  19. Challenges. Oracle’s execution phases: Start, Mount, Open, Work, Close. (1) Large number of process compilations. (2) Redundant translations across processes: processes exhibit code sharing (diagram blocks: A, B, C, Z).

  20. Challenges. Every Oracle unit-test starts a new instance of the database. Oracle’s execution phases for each unit-test: Start, Mount, Open, Unit-test, Close. (1) Large number of process compilations. (2) Redundant translations across processes. (3) Redundant translations across unit-tests: every unit-test executes all phases, and the unit-test phase is the only phase that changes across unit-tests.

  21. Leveraging persistence across processes. Persistent Cache (Start): low code coverage (15%). Persistent Cache (Open): high code coverage (77%).

  22. Persistent Cache Accumulation (PCA) addresses limited code coverage. • Accumulate code across executions. Procedure: Pin runs Input X starting from an empty cache and produces Persistent Cache X; Pin runs Input Y with Persistent Cache X and produces Persistent Cache X+Y; the timed run executes Input Z under Pin with Persistent Cache X+Y.
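
The accumulation procedure on this slide can be sketched as a running union of translated blocks. This is a toy model with hypothetical names (`BlockSet`, `runAndAccumulate`); blocks are single characters and translation is reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

using BlockSet = std::set<char>;

// Run one input starting from an existing persistent cache. Blocks missing from
// the cache are "translated" and added, so the returned cache is the accumulation
// of the old cache and this run. `translated` reports how many blocks had to be
// compiled during this run.
BlockSet runAndAccumulate(BlockSet cache, const std::string& input,
                          std::size_t& translated) {
    translated = 0;
    for (char block : input) {
        if (cache.insert(block).second) ++translated; // inserted means it was missing
    }
    return cache;
}
```

For example, if Input X touches blocks "ABC" and Input Y touches "BCD", the accumulated cache X+Y lets a timed run of an Input Z that touches "ACD" proceed with no translations, whereas Persistent Cache X alone would still miss block D.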

  23. Persistent Cache Accumulation (PCA) improves unit-test performance. Performance improves with more accumulation of code. Chart: accumulated persistent caches.

  24. Contributions: improved code caching. • Cold code is hot code! • Persistence is effective for: less code reuse, short run times, large code footprint • Robust and performance-efficient implementation • Production environment regression testing study

  25. Backup Slides

  26. Future Research Questions. Selective persistent caching: cache only cold/hot code. Effectiveness of optimizations across inputs and applications. Impact of excessive cache accumulation.

  27. Persistent Cache Sizes: DS is larger than CC!

  28. Persistent Cache Sizes: DS is larger than CC!

  29. Cross-input persistence reduces re-translation across inputs. Persistence is effective even across changing input data sets. The timeline compares three cases: without persistence; re-invocation with persistence using a previously cached execution; re-invocation with persistence using a cache from a different input for a previously unseen input. ~30% improvement via cross-input persistence.

  30. Persistent instrumentation issues: dynamically allocated memory. Memory allocated during cache generation leaves an invalid pointer during cache reuse.

        // Called upon every instruction execution
        VOID Analysis(COUNTER * counter) { (*counter)++; }

        // Called once per instruction compilation
        VOID Instrumentation(INS ins, VOID *v) {
            // Memory allocation during cache generation
            STATS * stats = new STATS(INS_Address(ins));
            // &stats->counter is an invalid pointer during cache reuse
            INS_InsertCall(ins, IPOINT_BEFORE, AFUNPTR(Analysis), IARG_PTR, &stats->counter, …);
            …
        }

        VOID main(INT32 argc, CHAR *argv[]) {
            …
            INS_AddInstrumentFunction(Instrumentation, 0);
            …
            PIN_StartProgram();
        }

      Solution: Allocate memory using the Persistent Memory Allocator.
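
The slide names a Persistent Memory Allocator but does not describe its interface or design. The sketch below is therefore only one way such an allocator could avoid the dangling pointer, and every name in it (`PersistentArena`, `allocateCounter`, `snapshot`, `restore`) is hypothetical. The idea shown: analysis data lives in a single buffer that is saved with the cache, and instrumentation refers to it by offset rather than by raw heap pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical arena: counters live in one buffer that can be saved alongside
// the persistent cache. Callers keep offsets, which stay meaningful after a
// reload, instead of raw pointers, which do not.
class PersistentArena {
public:
    // Reserve space for one zero-initialised 64-bit counter; return its offset.
    std::size_t allocateCounter() {
        std::size_t offset = buffer_.size();
        buffer_.resize(offset + sizeof(std::uint64_t), 0);
        return offset;
    }

    // Read a counter by offset (memcpy avoids alignment and aliasing concerns).
    std::uint64_t read(std::size_t offset) const {
        std::uint64_t value;
        std::memcpy(&value, buffer_.data() + offset, sizeof value);
        return value;
    }

    // Analysis-routine stand-in: bump the counter stored at the given offset.
    void increment(std::size_t offset) {
        std::uint64_t value = read(offset) + 1;
        std::memcpy(buffer_.data() + offset, &value, sizeof value);
    }

    // Save and restore the arena contents across runs.
    std::vector<unsigned char> snapshot() const { return buffer_; }
    void restore(const std::vector<unsigned char>& saved) { buffer_ = saved; }

private:
    std::vector<unsigned char> buffer_;
};
```

In this model, an offset handed out while the cache is generated still resolves to the same counter after `restore` in a later run, whereas the `new STATS(...)` pointer in the slide's example would refer to memory that no longer exists.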

  31. Inter-application persistence exploits redundancy of library translations: libraries (DSO), initialization, toolkits/packages (X11, GTK+, FLTK). Diagram: Pin runs Application A (Input X) and Application B (Input Y) from empty caches to produce Persistent Cache X and Persistent Cache Y, which are then used in the timed run.

  32. Inter-Application Persistence: ~60% improvement. Verifies that a large amount of time is spent initializing library routines.

  33. Challenges. Oracle’s execution phases: Start, Mount, Open, Work, Close. (1) Large number of process compilations: fork(), exec(). (2) Redundant translations across processes: processes exhibit code sharing; exec() loses the parent cache and may re-translate parent code!
