Instant Profiling: Instrumentation Sampling for Profiling Datacenter Applications

Hyoun Kyu Cho¹, Tipp Moseley², Richard Hank², Derek Bruening², Scott Mahlke¹

¹University of Michigan ²Google

Datacenter Applications

http://googleblog.blogspot.com

  • In 2010, US datacenters consumed 70 to 90 billion kWh*
  • Datacenter application performance is critical
  • Profiling can help

*[Koomey '11]

Traditional Profiling

Diagram (workflow): Source Code → Instrumentation Build → Instrumented Binary → Training Run (with Input Data) → Profile Data

  • Challenges for Datacenters
    • Need to run on live traffic
      • Difficult to isolate
    • Overheads
      • Value profiling: 3.8x slowdown¹
      • Path profiling: 31%; edge profiling: 16%²
    • Binary management
      • Many programs, multiple versions

¹[Calder '99] ²[Ball '96]

Google-Wide Profiling
  • Continuous profiling infrastructure for datacenters
  • Negligible overhead
    • Sampling based
    • Aggregated profiling overhead less than 0.01%
  • Limitations
    • Relies heavily on hardware Performance Monitoring Units (PMUs)
    • Limited flexibility and portability

[Ren et al. '10]

Goals
  • Unified profiling infrastructure for datacenters
    • Flexible types of profile data
    • Portable across heterogeneous datacenters
  • While maintaining
    • Low overhead
    • No added burden on binary management

Approach: Dynamic Binary Instrumentation + Sampling

Instrumentation Sampling

Diagram (layers, top to bottom): application / system call gateway / operating system / hardware

Instrumentation Sampling

Diagram (layers, top to bottom): application / DynamoRIO instrumentation engine (dispatch, client, context switch, code cache) / operating system / hardware

[Bruening '04]

Instrumentation Sampling

Diagram labels: application, shepherding thread (start profiling / stop profiling), instrumentation engine (dispatch, client, code cache), operating system, hardware
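
The start/stop behavior named in this diagram can be pictured with a toy model. The sketch below is not the authors' implementation and does not use DynamoRIO; the names `SamplingSchedule`, `period`, and `burst` are hypothetical, and time is counted in abstract units (one executed block per unit). It only shows that a shepherd switching instrumentation on for a short burst in every period observes a fixed fraction of the execution.

```python
from dataclasses import dataclass


@dataclass
class SamplingSchedule:
    """Hypothetical on/off schedule a shepherding thread might follow."""
    period: int  # length of one on/off cycle, in abstract time units
    burst: int   # units at the start of each cycle with profiling on

    def is_profiling(self, t: int) -> bool:
        # Profiling is on during the first `burst` units of every period.
        return (t % self.period) < self.burst

    def duty_cycle(self) -> float:
        # Fraction of time spent under instrumentation.
        return self.burst / self.period


def observe(trace, schedule):
    """Return only the events executed while profiling was switched on."""
    return [ev for t, ev in enumerate(trace) if schedule.is_profiling(t)]


schedule = SamplingSchedule(period=10, burst=2)
trace = [f"bb{t % 4}" for t in range(40)]  # made-up block stream
sampled = observe(trace, schedule)
```

With these assumed numbers, 8 of the 40 blocks are observed, matching the 20% duty cycle.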

Problems with Basic Implementation
  • Unbounded profiling periods due to fragment linking
  • Latency degradation due to initial instrumentation
  • Multithreaded programs
Temporal Unlinking/Relinking of Fragments

Diagram labels: dispatch, context switch, code cache holding fragments BB1 and BB2, direct link BB2->BB1
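
Why linked fragments make a profiling period unbounded, and why unlinking bounds it, can be shown with a small simulation. This is a sketch under assumptions, not DynamoRIO code: `Fragment`, `run_in_cache`, and `set_links` are hypothetical names, and the two-block loop BB1/BB2 is taken from the diagram. A linked fragment jumps straight to its successor inside the code cache; an unlinked fragment exits to dispatch, where a stop request could be honored.

```python
class Fragment:
    """A cached copy of one basic block (toy model)."""

    def __init__(self, name, successor):
        self.name = name
        self.successor = successor  # next block the application executes
        self.linked = False         # True: jump directly to the successor fragment


def run_in_cache(cache, entry, max_blocks):
    """Run from `entry` until control returns to dispatch or the budget ends.

    Returns (blocks_executed, returned_to_dispatch).
    """
    current = entry
    executed = 0
    while executed < max_blocks:
        frag = cache[current]
        executed += 1
        if not frag.linked:
            return executed, True  # exit stub leads back to dispatch
        current = frag.successor   # direct link, dispatch never runs
    return executed, False


def set_links(cache, linked):
    """Link or unlink every fragment in the cache."""
    for frag in cache.values():
        frag.linked = linked


cache = {"BB1": Fragment("BB1", "BB2"), "BB2": Fragment("BB2", "BB1")}

set_links(cache, True)
linked_result = run_in_cache(cache, "BB1", max_blocks=1000)

set_links(cache, False)
unlinked_result = run_in_cache(cache, "BB1", max_blocks=1000)
```

In the linked case the loop consumes the whole 1000-block budget without reaching dispatch; after unlinking, control returns after a single block, and the links can be restored when profiling resumes.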

S/W Code Cache Pre-population

  • Latency still degrades during the initial instrumentation phases

Diagram labels: application, shepherding thread, instrumentation engine (dispatch, client, code cache), operating system, hardware

Multithreaded Program Support
  • Sampling makes it possible to miss thread operations
  • Forces Instant Profiling's signal handler for every thread
  • Enumerates all threads and sends a profiling start signal to each thread
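
The enumerate-and-notify step can be sketched by analogy. The code below does not use OS signals (Python runs signal handlers only in the main thread); instead, a per-thread `threading.Event` stands in for the state a signal handler would set. `Worker`, `registry`, and `start_profiling_all` are hypothetical names introduced for illustration.

```python
import threading


class Worker(threading.Thread):
    """Application thread with a flag standing in for signal-handler state."""

    def __init__(self, registry, shutdown):
        super().__init__(daemon=True)
        self.profiling = threading.Event()  # set when "start profiling" arrives
        self.shutdown = shutdown
        registry.append(self)  # registered at creation so no thread is missed

    def run(self):
        self.shutdown.wait()  # placeholder for real application work


def start_profiling_all(registry):
    """Enumerate every known thread and deliver the start notification."""
    for worker in registry:
        worker.profiling.set()
    return len(registry)


registry = []
shutdown = threading.Event()
workers = [Worker(registry, shutdown) for _ in range(3)]
for w in workers:
    w.start()

notified = start_profiling_all(registry)

shutdown.set()
for w in workers:
    w.join()
```

A real implementation would deliver an actual signal to each thread; the point here is only that every thread, not just the one that triggered sampling, receives the start notification.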
Experimental Setup
  • 6-core Intel Xeon 2.67GHz w/ 12MB L3
  • 12GB main memory
  • Linux kernel 2.6.32
  • gcc 4.4.3 w/ -O3
  • SPEC INT2006, BigTable, Web search
  • Edge profiling client
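
For readers unfamiliar with edge profiling, a minimal sketch of what such a client records: a count for every (source block, destination block) transition. This is an illustration on a made-up block trace, not the client used in the experiments; `edge_profile` is a hypothetical name.

```python
from collections import Counter


def edge_profile(trace):
    """Count control-flow edges between consecutive basic blocks."""
    return Counter(zip(trace, trace[1:]))


# Made-up execution trace of basic-block names.
trace = ["BB1", "BB2", "BB1", "BB2", "BB3"]
profile = edge_profile(trace)
```

Here the edge BB1->BB2 is taken twice, while BB2->BB1 and BB2->BB3 are each taken once.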
Conclusion
  • Low-overhead, portable, flexible profiling needed
  • Instant Profiling
    • Combines sampling and DBI
    • Pre-populates S/W code cache
    • Tunable tradeoff between overhead and information
    • Provides eventual profiling accuracy
  • Less than 5% overhead, more than 80% accuracy for naïve edge profiling client
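
The overhead/information tradeoff can be made concrete with a first-order model. Both functions below are assumptions for illustration, not the evaluation method of the talk: `estimated_overhead` ignores the cost of switching in and out of instrumentation, and `overlap_accuracy` uses a common overlap measure (the sum over edges of the smaller normalized weight), which may differ from the metric behind the reported numbers.

```python
def estimated_overhead(duty_cycle, instrumented_slowdown):
    """First-order overhead: extra time is paid only while instrumentation is on."""
    return duty_cycle * (instrumented_slowdown - 1.0)


def overlap_accuracy(full, sampled):
    """Overlap of two edge profiles after normalizing each to sum to 1."""
    full_total = sum(full.values())
    sampled_total = sum(sampled.values())
    if full_total == 0 or sampled_total == 0:
        return 0.0
    return sum(
        min(count / full_total, sampled.get(edge, 0) / sampled_total)
        for edge, count in full.items()
    )


# Edge profiling at 16% overhead (slowdown 1.16), instrumented 10% of the time.
overhead = estimated_overhead(duty_cycle=0.10, instrumented_slowdown=1.16)

# Made-up profiles: the sampled run never saw the rare edge B->C.
full = {("A", "B"): 6, ("B", "A"): 3, ("B", "C"): 1}
sampled = {("A", "B"): 2, ("B", "A"): 2}
accuracy = overlap_accuracy(full, sampled)
```

Under these assumptions the model gives about 1.6% overhead and an overlap of 0.8; raising the duty cycle increases both the overhead and the chance of observing rare edges.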