
Designing a Trace Format for Heap Allocation Events

Trishul Chilimbi, Microsoft Research; Richard Jones, University of Kent; Ben Zorn, Microsoft Research


Presentation Transcript


  1. Designing a Trace Format for Heap Allocation Events • Trishul Chilimbi, Microsoft Research • Richard Jones, University of Kent • Ben Zorn, Microsoft Research

  2. Is Heap Allocation a Solved Problem? • Yes? • Numerous techniques, 40+ years of research • Fragmentation not an issue? (Johnstone et al. ISMM98) • How much faster can it get? • No! • Arenas, regions, user-defined heaps, etc. • Scalability of MP allocators • Data locality T. Chilimbi, B. Zorn (MSR), R.E. Jones (UKC)

  3. Are our Evaluation Methods Sound? • Heap allocation important in: • Streaming media applications • Long-running, quasi-real-time • Server applications (Larson & Krishnan ISMM98) • Heavy load, complex structure, multi-threaded • “Large” applications (OS, word proc., etc.) • Current benchmarks (BZ’s especially): • Small, single-threaded, short-running

  4. Are Traces a Solution? • Yes? • Easy to share, portable • Captures real behaviors, real programs under real loads • Easy to use for experimental evaluations • No? • Fixed format implies potential missing info • E.g., capturing references problematic • Trace size a significant issue • MP interleaving is non-deterministic

  5. Contributions • HATF – an allocation trace format • Trace contents focus on important issues • Representation is flexible, portable • Traces are compact, processing efficient • MetaTF – a language for describing trace formats • Raise awareness of issues • What should be in a trace? • Do you care about how a trace is represented?

  6. Assumptions and HATF Design Goals • We assume • Long traces (100M events) are necessary • Consumer will read/process events sequentially • Ease of consumption critical • Minimal dependencies, resource requirements • HATF design goals • Expressiveness – contents must be useful • Compactness – 10% space reduction “valuable” • Flexibility – allow limited extension (see MetaTF)

  7. Trace Content • Standard allocation events • Allocate, reallocate, free • Context • In a specific region • In a specific thread • At a specific time • Attributes allow additional info
     Example record:
     Tag       Size  Heap  Thread  Time  Address  Attrib. (e.g., caller)
     Allocate  16    11    39      4601  0xf4567  main
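
The record on slide 7 can be pictured as a small value type. The following is only a sketch in Python: the class name `AllocEvent`, the field types, and the use of a tuple for attributes are assumptions made for illustration, not part of HATF.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AllocEvent:
    """One allocation event with the fields named on the slide (types assumed)."""
    tag: str                 # event kind: "allocate", "reallocate" or "free"
    size: int                # requested size
    heap: int                # heap (region) identifier
    thread: int              # thread identifier
    time: int                # timestamp
    address: int             # address of the object
    attributes: tuple = ()   # optional extra info, e.g. the caller

# The example record shown on the slide.
example = AllocEvent("allocate", 16, 11, 39, 4601, 0xF4567, ("main",))
```

Keeping attributes as an open-ended tuple mirrors the slide's point that attributes carry additional, optional information.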

  8. Trace Representation • Fixed formats have obvious weaknesses • Multiple address sizes (32 vs. 64-bit) • Fields often empty (e.g., thread, heap) • Fields have exploitable properties • Skewed or predictable distributions of values • Size often small, time monotonically increasing • HATF includes dynamic metadata • Dynamically vary field width, interpretation

  9. Changing Field Size with Metadata
     HATF format          Fixed binary format
                          tag    size  address
     setWidth size 1
     alloc 32 0x4a0       alloc  32    0x4a0
     setWidth size 2
     alloc 1024 0xa10     alloc  1024  0xa10
     alloc 1024 0xc10     alloc  1024  0xc10
     setWidth size 1
     alloc 16 0xf10       alloc  16    0xf10
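
A reader for the trace on slide 9 might look roughly like the sketch below. The byte layout is invented for illustration: the one-byte tags `SET_WIDTH` and `ALLOC`, the field ids, the little-endian encoding, and the initial widths are all assumptions, not the HATF encoding.

```python
import io

SET_WIDTH, ALLOC = 0, 1            # hypothetical one-byte tags
FIELDS = ("size", "address")       # hypothetical field ids 0 and 1

def read_events(stream):
    """Yield (size, address) for each alloc, honouring setWidth metadata."""
    width = {"size": 4, "address": 4}           # assumed initial widths in bytes
    while True:
        tag = stream.read(1)
        if not tag:
            return                               # end of trace
        if tag[0] == SET_WIDTH:
            field_id, new_width = stream.read(2)
            width[FIELDS[field_id]] = new_width  # stays in effect until changed
        elif tag[0] == ALLOC:
            size = int.from_bytes(stream.read(width["size"]), "little")
            addr = int.from_bytes(stream.read(width["address"]), "little")
            yield size, addr
        else:
            raise ValueError(f"unknown tag {tag[0]}")

def set_width(field, n):
    """Encode a setWidth metadata command (hypothetical layout)."""
    return bytes([SET_WIDTH, FIELDS.index(field), n])

def alloc(size, addr, size_width, addr_width=2):
    """Encode an alloc event using the given field widths."""
    return (bytes([ALLOC]) + size.to_bytes(size_width, "little")
            + addr.to_bytes(addr_width, "little"))

# The event sequence from the slide, with addresses held in 2 bytes.
trace = (set_width("address", 2) + set_width("size", 1) + alloc(32, 0x4A0, 1)
         + set_width("size", 2) + alloc(1024, 0xA10, 2) + alloc(1024, 0xC10, 2)
         + set_width("size", 1) + alloc(16, 0xF10, 1))
events = list(read_events(io.BytesIO(trace)))
```

The reader never looks ahead: each metadata command simply updates the state used to decode the events that follow it, which fits the assumption that consumers process events sequentially.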

  10. Changing Field Interpretation
     HATF format                          Fixed binary format
                                          tag    size  address
     setInterp size default 32
     setInterp addr stride 0 100
     alloc                                alloc  32    100
     alloc                                alloc  32    200
     alloc                                alloc  32    300
     alloc                                alloc  32    400
     setInterp size none, addr none, …
     alloc 1024 5000                      alloc  1024  5000
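
The effect of the `default` and `stride` interpretations on slide 10 can be sketched as per-field decoder state. The semantics here are an assumption inferred from the slide's example (a `stride` field starts at a base and advances by a fixed step on every event); the class `Field` and its method names are hypothetical.

```python
class Field:
    """Tracks the current interpretation of one field (semantics assumed)."""

    def __init__(self):
        self.mode, self.value, self.step = "none", 0, 0

    def set_interp(self, mode, *args):
        self.mode = mode
        if mode == "default":          # every event implicitly carries args[0]
            self.value = args[0]
        elif mode == "stride":         # start at base, add step on each event
            self.value, self.step = args

    def next_value(self, stored=None):
        if self.mode == "none":        # value is stored explicitly in the event
            return stored
        if self.mode == "stride":
            self.value += self.step
        return self.value

size, addr = Field(), Field()
size.set_interp("default", 32)
addr.set_interp("stride", 0, 100)
# Four alloc events that store no size or address bytes at all.
decoded = [(size.next_value(), addr.next_value()) for _ in range(4)]
size.set_interp("none")
addr.set_interp("none")
decoded.append((size.next_value(1024), addr.next_value(5000)))
```

With both fields computed by the reader, each of the first four events shrinks to its tag alone, which is what the HATF column on the slide shows.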

  11. Representation Effectiveness • Is HATF necessary, useful? • Comparison • Alternate representations • HATF (size/time opt), fixed width binary, ASCII • With/without gzip compression • Applications • Single-threaded, single-heap benchmarks • Multi-threaded, multi-heap MS apps • Trace size, reading/writing costs

  12. Trace Compression (Benchmark Avg.) [chart not preserved; surviving data labels: 16, 12.3]

  13. Trace Compression (MS Apps, w. gzip)

  14. Read Processing Time (Benchmark Avg.)

  15. HATF: Evaluation Summary • Space • Without compression, HATF smallest • With compression, ASCII and HATF close • Representing 64-bit timestamps is expensive • Time • Current implementation limited by I/O • Compression overhead small by comparison • ASCII marginally slower to decode

  16. Contributions • HATF – an allocation trace format • Trace contents focus on important issues • Representation is flexible, portable • Traces are compact, processing efficient • MetaTF – a language for describing trace formats • MetaTF ≈ HATF as XML ≈ HTML • HATF reader/writer generated automatically • Raise awareness of issues

  17. MetaTF – Beyond HATF • Aim: to facilitate exchange of trace data sets • Generalise HATF • An expressive way of specifying traces • Allow easy construction of readers and writers • Generate readers and writers automatically from the specification • Separate representation from content

  18. Component approach • Idea: Provide traces and API as a unit • Separate representation from content • A trace contains event types • Each event has a concrete representation • Reveal content; hide representation • Implementation: jar files? • Good for reader • Simple interface, e.g. Event getNextEvent(); • Doesn’t help writer • Design trace format • Implement interfaces

  19. MetaTF approach • A trace comprises • Document type definition (DTD) • Trace event data • Meta-approach: say how to specify events, DTD • Abstract syntax notations • SGML • Ride the XML wave • Verbose, ASCII only • ASN.1 • Obese, inflexible

  20. MetaTF DTD example
     • Section heap 1 {
         alloc : (tag, size, address) {
           tag.value = 4;
           size.width = 4; size.interpretation = none;
           address.width = 4; address.interpretation = none;
         }
       }
     • Metadata can change representation of event, e.g.
         Tag       Event  Field  Interpretation  Value
         Metadata  alloc  2      Width           2
         Metadata  alloc  2      Delta           310004

  21. MetaTF effectiveness • Auto-generation of readers and writers from MetaTF DTD • Simple interface • Class for each event type, inherited from Event • Event getNextEvent(); • void Event.putEvent(); • Separation of content and representation to some degree
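
The interface on slide 21 (one class per event type, plus `getNextEvent` and `putEvent`) could be mimicked as below. This is a hand-written Python sketch, not generated MetaTF output: the snake_case names `get_next_event` and `put_event`, the `Alloc` and `Free` classes, and the toy one-line ASCII representation are all assumptions.

```python
import io

class Event:
    """Base class; generated code would supply one subclass per event type."""
    fields = ()

    def __init__(self, *values):
        self.values = dict(zip(self.fields, values))

    def put_event(self, out):
        # Toy ASCII representation: event name followed by its field values.
        out.write(" ".join([type(self).__name__.lower(),
                            *map(str, self.values.values())]) + "\n")

class Alloc(Event):
    fields = ("size", "address")

class Free(Event):
    fields = ("address",)

EVENT_TYPES = {cls.__name__.lower(): cls for cls in (Alloc, Free)}

def get_next_event(inp):
    """Return the next Event from the trace, or None at end of input."""
    line = inp.readline()
    if not line:
        return None
    name, *values = line.split()
    return EVENT_TYPES[name](*map(int, values))

buf = io.StringIO()
Alloc(32, 1184).put_event(buf)
Free(1184).put_event(buf)
buf.seek(0)
first, second, third = (get_next_event(buf) for _ in range(3))
```

The client sees only event objects and their fields; swapping the ASCII lines for a binary or gzipped stream would change `put_event` and `get_next_event` but not the calling code.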

  22. Architecture
     Client
     HATF10 (generated) classes: high-level readers/writers; understand interpretations
     Binary reader/writer | Gzipped reader/writer | ASCII reader/writer: client-supplied; read/write n bytes
     Data

  23. MetaTF: Evaluation summary • Simple but expressive syntax, familiar to programmers • Generated readers and writers, comparable performance • Separation of representation and content? • Field properties • User-supplied, low-level readers/writers • Interface • Event classes • getNextEvent, putEvent methods • What else?

  24. Preliminary Results: XML Compression [chart not preserved; surviving data label: 25.5]

  25. Summary • Heap allocation research faces challenges • We want to support easy, effective research • HATF, MetaTF are suggestions • Content issues • What is the minimum content? • How do we define extensible formats? • Representation issues • Is HATF sufficiently better than ASCII? • How to separate, hide representation? • Organisation issues • What other meta-information should be stored? • What do you think?

  26. Status • HATF • Preliminary implementation complete • Trying to make code/traces available • Hoping 3rd party will develop implementation from specification • Will help fix specification, implementation • MetaTF • Preliminary implementation in progress • Definition converging

  27. Feedback – You tell us… • What else does HATF need to contain? • How important are references? • Does anybody really care about representation? • Should we just pick one and everybody will be happy?

  28. Backup Slides

  29. Talk Overview • Motivation • HATF – Heap Allocation Trace Format • Design goals • Trace content • Trace representation • Representation Effectiveness • MetaTF – specifying trace formats • Design • Generating readers and writers • Traces as “Components”

  30. Separating Content and Representation • Ideally, representation and content would be entirely separate • User could use trace via standard API with no external dependencies • Trace + API would be delivered as a “unit” • Similar in spirit to components (Java Beans, COM) • No “standard” off-the-shelf way to achieve this • Best thing we can think of is to make readers/writers easy to acquire and use

  31. ASCII versus Binary Representation • ASCII • Portable, easy to examine and debug • Manipulated via text scripting tools (Perl) • Potential to ride the XML wave • Binary • More compact representation (more later) • Faster to read • Contents exported to ASCII on demand • We chose… Binary

  32. HATF Metadata • Metadata commands embedded in data • Field sizes range from 0 to 8 bytes • Field interpretations (mini compression ops) • Compute field value as some function • Examples: None, default, base/offset, delta, stride • Size/interpretation stay in effect until changed again • Reader interprets value of fields on-the-fly
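
To see why an interpretation such as `delta` acts as a mini compression op, consider a monotonically increasing time field. The sketch below assumes that `delta` means "store the difference from the previous value"; the function names and the sample timestamps are made up for illustration.

```python
def delta_encode(values):
    """Replace each value by its difference from the previous one (first from 0)."""
    prev, out = 0, []
    for v in values:
        out.append(v - prev)
        prev = v
    return out

def delta_decode(deltas):
    """Rebuild the original values by keeping a running sum."""
    total, out = 0, []
    for d in deltas:
        total += d
        out.append(total)
    return out

def bytes_needed(n):
    """Smallest byte width that holds the non-negative integer n (at least 1)."""
    return max(1, (n.bit_length() + 7) // 8)

times = [4601, 4610, 4633, 4640]      # hypothetical timestamps
deltas = delta_encode(times)
widths_raw = [bytes_needed(t) for t in times]
widths_delta = [bytes_needed(d) for d in deltas]
```

After the first value, every delta in this example fits in a single byte, so a reader that applies the interpretation on the fly can work with a narrower field width.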

  33. Metadata Example • Goal: encode most allocate sizes in 1 byte • Example trace contents: • Metadata: setWidth field:size width:1 • Data: allocate size=40, addr=0x3ff, … • Metadata: setWidth field:size width:2 • Data: allocate size=1024, addr=0xa10, … • Data: allocate size=1024, addr=0xc10, … • Metadata: setWidth field:size width:1 • Data: allocate size=16, addr=0xf00, … • Data: allocate size=24, addr=0xf10, …
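
On the writer side, the sequence on slide 33 can be reproduced by a very simple policy: emit a `setWidth` command whenever the next size needs a different width from the current one. This is a sketch under that assumption; `plan_size_widths` and `min_width` are hypothetical names, and the slides do not say how a HATF writer actually decides when to switch.

```python
def min_width(n):
    """Smallest byte width that holds the non-negative integer n (at least 1)."""
    return max(1, (n.bit_length() + 7) // 8)

def plan_size_widths(sizes):
    """Return a list of trace items: ('setWidth', w) or ('allocate', size)."""
    items, current = [], None
    for size in sizes:
        needed = min_width(size)
        if needed != current:          # naive policy: always use the tightest width
            items.append(("setWidth", needed))
            current = needed
        items.append(("allocate", size))
    return items

# The sizes from the slide's example trace.
plan = plan_size_widths([40, 1024, 1024, 16, 24])
```

A less naive writer would presumably weigh the cost of each metadata command against the bytes it saves, for example by staying at width 2 when sizes alternate rapidly between small and large.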

  34. Trace Compression (MS Apps, w/o gzip)

  35. HATF Compression across Apps

  36. XML
     <!-- DTD for HATF-1.0 -->
     <!element size (#PCDATA)>
     <!element address (#PCDATA)>
     <!element time (#PCDATA)>
     <!element thread (#PCDATA)>
     <!element heap (#PCDATA)>
     <!element attributes (#PCDATA)>
     <!element alloc size address time thread heap attributes>
     <!element reallocNoAlloc address address time thread heap attributes>
     <!element reallocAllocFree address address time thread heap attributes>
     <!element reallocAlloc address address time thread heap attributes>
     <!element reallocFree address address time thread heap attributes>
     <!element free address time thread heap attributes>
     <!element createHeap thread heap attributes>
     <!element destroyHeap thread heap attributes>
     <!element createThread thread attributes>
     <!element destroyThread thread attributes>
     <!element comment attributes>

  37. HATF1.0 specified in MetaTF1.1
     tag.width = 1;
     size.width = 4; size.interpretation = none;
     address.width = 4; address.interpretation = none;
     attributes.width = 0; attributes.interpretation = none;
     time.interpretation = default 0;
     thread.interpretation = default 0;
     heap.interpretation = default 0;
     section heap 1 {
       reallocNoAlloc : (tag, address, address, time, thread, heap, vfield) { tag.value = 3; }
       reallocAllocFree : (tag, address, address, time, thread, heap, vfield) { tag.value = 4; }
       reallocAlloc : (tag, address, address, time, thread, heap, vfield) { tag.value = 5; }
       reallocFree : (tag, address, address, time, thread, heap, vfield) { tag.value = 6; }
       alloc : (tag, size, address, time, thread, heap, vfield) { tag.value = 1; }
       free : (tag, address, time, thread, heap, vfield) { tag.value = 2; }
       createHeap : (tag, thread, heap, vfield) { tag.value = 7; }
       destroyHeap : (tag, thread, heap, vfield) { tag.value = 8; }
       createThread : (tag, thread, vfield) { tag.value = 9; }
       destroyThread : (tag, thread, vfield) { tag.value = 10; }
     }
