
Early Experiences with KTAU on the Blue Gene / L

A. Nataraj, A. Malony, A. Morris, S. Shende. Performance Research Lab, University of Oregon.


Presentation Transcript


  1. Early Experiences with KTAU on the Blue Gene / L A. Nataraj, A. Malony, A. Morris, S. Shende Performance Research Lab University of Oregon

  2. Outline • Introduction • Motivations • Objectives • Architecture • KTAU on Blue Gene / L • Ongoing / Recent work • Future work and directions • Acknowledgements • References • Team

  3. Introduction: ZeptoOS and TAU • DOE OS/RTS for Extreme Scale Scientific Computation (Fastos) • Conduct OS research to provide effective OS/Runtime for petascale systems • ZeptoOS (under Fastos) • Scalable components for petascale architectures • Joint project of Argonne National Lab and University of Oregon • ANL: Putting light-weight kernel (based on Linux) on BG/L and other platforms (XT3) • University of Oregon • Kernel performance monitoring, tuning • KTAU • Integration of TAU infrastructure in Linux Kernel • Integration with ZeptoOS, installation on BG/L • Port to 32-bit and 64-bit Linux platforms

  4. ZeptoOS: The Small Linux for Big Computers • Research Exploration • What are the fundamental limits and advanced designs required for petascale Operating System Suites • Behaviour at large scales • Management & optimization of OS suites • Collectives • Fault tolerance • Measurement, collection and analysis of OS performance data from large number of nodes • Strategy • Modified Linux on BG/L I/O nodes • Measure and understand behavior • Modified Linux for BG/L compute nodes • Measure and understand behavior • Specialized I/O daemon on I/O node (ZOID) • Measure and understand behavior (ZeptoOS BG/L Symposium presentation slide reused with permission from Pete Beckman [beckman06-bgl])

  5. ZeptoOS and KTAU • Lots of fine-grained OS measurement is required for each component of the ZeptoOS work • Exactly what aspects of Linux need to be changed to achieve ZeptoOS goals? • How and why do the various OS source and configuration changes affect parallel applications? • How do we correlate performance data between • the parallel application, • the compute node OS, • the I/O Daemon and • the I/O Node OS • Enter TAU/KTAU - An integrated methodology and framework to measure performance of applications and OS kernel across a system like BG/L.

  6. Motivation • Application Performance • user-level execution performance + • OS-level operations performance • Domains: Time and Hardware Performance Metrics • PAPI (Performance Application Programming Interface) • Exposes virtualized hardware counters • TAU (Tuning and Analysis Utilities) • Measures most user-level entities: parallel application, MPI, libraries … • Time domain • Uses PAPI to correlate counter information to source • But how to correlate OS-level influences with application performance?

  7. Motivation (continued) • As HPC systems continue to scale to larger processor counts • Application performance more sensitive • New OS factors become performance bottlenecks (E.g. [Petrini’03, Jones’03, other works…]) • Isolating these system-level issues as bottlenecks is non-trivial • Require Comprehensive Performance Understanding • Observation of all performance factors • Relative contributions and interrelationship • Can we correlate?

  8. Motivation (continued): Program-OS Interactions • Program-OS Interactions - Direct vs. Indirect Entry Points • Direct - Applications invoke the OS for certain services • Syscalls (and internal OS routines called directly from syscalls) • Indirect - OS takes actions without explicit invocation by application • Preemptive Scheduling • (HW) Interrupt handling • OS-background activity (keeping track of time and timers, bottom-half handling, etc.) • Indirect interactions can occur at any OS entry (not just when entering through Syscalls) • Direct Interactions easier to handle • Synchronous with user-code and in process-context • Indirect Interactions more difficult to handle • Usually asynchronous and in interrupt-context: Hard to measure and harder to correlate/integrate with app. measurements

  9. Motivation (continued): Kernel-wide vs. Process-centric • Kernel-wide - Aggregate kernel activity of all active processes in system • Understand overall OS behavior, identify and remove kernel hot spots. • Cannot show what parts of app. spend time in OS and why • Process-centric perspective - OS performance within context of a specific application’s execution • Virtualization and Mapping performance to process • Interactions between programs, daemons, and system services • Tune OS for specific workload or tune application to better conform to OS config. • Expose real source of performance problems (in the OS or the application)
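
The two perspectives on this slide can be illustrated with a minimal sketch that aggregates the same kernel event records in two ways. The record layout `(pid, kernel_routine, microseconds)`, the routine names, and the numbers are assumptions made up for illustration; this is not KTAU's data format or API.

```python
from collections import defaultdict

# Hypothetical kernel event records: (pid, kernel_routine, microseconds).
# Values are illustrative only and do not come from KTAU.
EVENTS = [
    (101, "sys_read", 40),
    (101, "schedule", 15),
    (202, "sys_read", 10),
    (202, "do_IRQ", 5),
    (101, "do_IRQ", 20),
]


def kernel_wide(events):
    """Kernel-wide view: total time per kernel routine over all processes."""
    totals = defaultdict(int)
    for _pid, routine, usec in events:
        totals[routine] += usec
    return dict(totals)


def process_centric(events):
    """Process-centric view: time per kernel routine, kept separate per pid."""
    per_pid = defaultdict(lambda: defaultdict(int))
    for pid, routine, usec in events:
        per_pid[pid][routine] += usec
    return {pid: dict(routines) for pid, routines in per_pid.items()}


if __name__ == "__main__":
    print(kernel_wide(EVENTS))
    print(process_centric(EVENTS))
```

The kernel-wide totals show which routines are hot overall, while only the per-process breakdown shows which process incurred the time, which is the mapping the slide says kernel-wide tools lack.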

  10. Motivation (continued): Existing Approaches • User-space Only measurement tools • Many tools only work at user-level and cannot observe system-level performance influences • Kernel-level Only measurement tools • Most only provide the kernel-wide perspective – lack proper mapping/virtualization • Some provide process-centric views but cannot integrate OS and user-level measurements • Combined or Integrated User/Kernel Measurement Tools • A few powerful tools allow fine-grained measurement and correlation of kernel and user-level performance • Typically these focus only on Direct OS interactions. Indirect interactions not merged. • Using Combinations of above tools • Without better integration, does not allow fine-grained correlation between OS and App. • Many kernel tools do not explicitly recognize Parallel workloads (e.g. MPI ranks) • Need an integrated approach to parallel perf. observation, analyses

  11. High-Level Objectives • Support low-overhead OS performance measurement at multiple levels of function and detail • Provide both kernel-wide and process-centric perspectives of OS performance • Merge user-level and kernel-level performance information across all program-OS interactions • Provide online information and the ability to function without a daemon where possible • Support both profiling and tracing for kernel-wide and process-centric views in parallel systems • Leverage existing parallel performance analysis tools • Support for observing, collecting and analyzing parallel data

  12. KTAU: Outline • Introduction • Motivations • Objectives • Architecture • KTAU on Blue Gene / L • Recent/Ongoing Work (since publication) • Future work and directions • Acknowledgements • References • Team

  13. KTAU Architecture

  14. KTAU On BG/L’s ZeptoOS • I/O Node • Open source modified Linux Kernel (2.4, 2.6) - ZeptoOS • Control I/O Daemon (CIOD) handles I/O syscalls from Compute nodes in pset. • Compute Node • IBM proprietary (closed-source) light-weight kernel • No scheduling or virtual memory support • Forwards I/O syscalls to CIOD on I/O node • KTAU on I/O Node: • Integrated into ZeptoOS config and build system. • Requires KTAU-D (daemon) because CIOD is closed-source. • KTAU-D periodically monitors system-wide or individual processes • Visualization of trace/profile of ZeptoOS, CIOD using ParaProf, Vampir/Jumpshot.

  15. KTAU On BG/L (current)

  16. On BG/L (continued): Early Experiences • CIOD Kernel Trace zoomed-in (running iotest benchmark)

  17. On BG/L (continued): Early Experiences

  18. On BG/L (continued): Early Experiences • Correlating CIOD and RPC-IOD Activity

  19. KTAU On BG/L Will Eventually Look Like … • Replace with: Linux + KTAU • Replace with: ZOID + TAU

  20. Ongoing/Recent Work (since publication) • Accurate Identification of “noise” sources • Modified Linux on BG/L should not take a performance loss • One area of concern - OS “noise” effects on Synchronization / Collectives • Requires identifying exactly what aspects (code paths, configurations, devices attached) of the OS induce what types of interference • This will require user-level as well as OS measurement • Our Approach • Use the Selfish benchmark [Beckman06] to identify “detours” (or noise events) in user-space • This shows durations and frequencies of events, but NOT cause/source. • Simultaneously use KTAU OS-tracing to record OS activity • Correlate time of occurrence (both use same time source - hw time counter) • Infer which type of OS-activity (if any) caused the “detour” • Remove or alleviate interference using above information (Work-in-progress)
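
The correlation step described on this slide can be sketched as interval matching on a shared time base. This is a minimal sketch under stated assumptions: the `Detour` and `KernelEvent` records, the event names, and the rule "attribute each detour to the kernel event that overlaps it the most" are all hypothetical choices for illustration, not the actual output format of the Selfish benchmark or of KTAU traces.

```python
from typing import NamedTuple, Optional


class Detour(NamedTuple):
    """A user-space noise event seen by a Selfish-style benchmark (hypothetical)."""
    start: int  # ticks of the shared hardware time counter
    end: int


class KernelEvent(NamedTuple):
    """One interval of OS activity from a kernel trace (hypothetical)."""
    name: str
    start: int
    end: int


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Length of the intersection of two intervals, or 0 if they are disjoint."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def attribute_detours(detours, kernel_events):
    """Pair each detour with the kernel event name that overlaps it most.

    A detour with no overlapping kernel event is paired with None, meaning
    the trace offers no OS-level explanation for it.
    """
    result = []
    for d in detours:
        best_name: Optional[str] = None
        best_overlap = 0
        for ev in kernel_events:
            o = overlap(d.start, d.end, ev.start, ev.end)
            if o > best_overlap:
                best_name, best_overlap = ev.name, o
        result.append((d, best_name))
    return result


# Illustrative data only.
DETOURS = [Detour(100, 130), Detour(500, 510), Detour(900, 960)]
KERNEL_EVENTS = [
    KernelEvent("timer_interrupt", 102, 128),
    KernelEvent("schedule", 905, 950),
    KernelEvent("timer_interrupt", 952, 958),
]

if __name__ == "__main__":
    for detour, cause in attribute_detours(DETOURS, KERNEL_EVENTS):
        print(detour, "->", cause)
```

Matching by overlap only works because both measurements use the same time source, as the slide notes; with separate clocks the intervals would first need to be aligned.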

  21. Ongoing/Recent Work (continued): “Noise” Source Identification • BGL IO-N: Merged OS/User Performance View of Scheduling

  22. Ongoing/Recent Work (continued): “Noise” Source Identification • Merged OS/User View of OS Background Activity

  23. Ongoing/Recent Work (continued): “Noise” Source Identification • Zoomed-In: Merged OS/User View of OS Background Activity

  24. Future Work • Dynamic measurement control - enable/disable events w/o recompilation or reboot • Improve performance data sources that KTAU can access - e.g. PAPI • Improve integration with TAU’s user-space capabilities to provide even better correlation of user and kernel performance information • full callpaths, • phase-based profiling, • merged user/kernel traces (already available) • Integration of TAU, KTAU with Supermon • Porting efforts: IA-64, PPC-64 and AMD Opteron • ZeptoOS: Planned characterization efforts • BGL I/O node • Dynamically adaptive kernels

  25. Support Acknowledgements • Department of Energy’s Office of Science (contract no. DE-FG02-05ER25663) and • National Science Foundation (grant no. NSF CCF 0444475)

  26. References • [Petrini’03]: F. Petrini, D. J. Kerbyson, and S. Pakin, “The case of the missing supercomputer performance: Achieving optimal performance on the 8,192 processors of ASCI Q,” in SC ’03 • [Jones’03]: T. Jones et al., “Improving the scalability of parallel jobs by adding parallel awareness to the operating system,” in SC ’03 • [PAPI]: S. Browne et al., “A Portable Programming Interface for Performance Evaluation on Modern Processors,” The International Journal of High Performance Computing Applications, 14(3):189–204, Fall 2000. • [VAMPIR]: W. E. Nagel et al., “VAMPIR: Visualization and analysis of MPI resources,” Supercomputer, vol. 12, no. 1, pp. 69–80, 1996. • [ZeptoOS]: “ZeptoOS: The small Linux for big computers,” http://www.mcs.anl.gov/zeptoos/ • [NPB]: D. H. Bailey et al., “The NAS parallel benchmarks,” The International Journal of Supercomputer Applications, vol. 5, no. 3, pp. 63–73, Fall 1991.

  27. References (continued) • [Sweep3d]: A. Hoisie et al., “A general predictive performance model for wavefront algorithms on clusters of SMPs,” in International Conference on Parallel Processing, 2000 • [LMBENCH]: L. W. McVoy and C. Staelin, “lmbench: Portable tools for performance analysis,” in USENIX Annual Technical Conference, 1996, pp. 279–294 • [TAU]: “TAU: Tuning and Analysis Utilities,” http://www.cs.uoregon.edu/research/paracomp/tau/ • [KTAU-BGL]: A. Nataraj, A. Malony, A. Morris, and S. Shende, “Early experiences with KTAU on the IBM BG/L,” in EuroPar’06, European Conference on Parallel Processing, 2006. • [KTAU]: A. Nataraj et al., “Kernel-Level Measurement for Integrated Parallel Performance Views: the KTAU Project,” in IEEE Cluster-2006 (Best Paper)

  28. Team • University of Oregon (UO) Core Team • Aroon Nataraj, PhD Student • Prof. Allen D Malony • Dr. Sameer Shende, Senior Scientist • Alan Morris, Senior Software Engineer • Argonne National Lab (ANL) Contributors • Pete Beckman • Kamil Iskra • Kazutomo Yoshii • Past Members • Suravee Suthikulpanit, MS Student, UO (Graduated)

  29. Thank You • Questions? • Comments? • Feedback?
