
KTAU: Kernel Tuning and Analysis Utilities

KTAU: Kernel Tuning and Analysis Utilities. Department of Computer and Information Science, Performance Research Laboratory, University of Oregon. Agenda: Motivations; KTAU Overview; ZeptoOS - KTAU - TAU on BG/L; KTAU - TAU on Linux Cluster.


Presentation Transcript


  1. KTAU: Kernel Tuning and Analysis Utilities. Department of Computer and Information Science, Performance Research Laboratory, University of Oregon

  2. Agenda • Motivations • KTAU Overview • ZeptoOS - KTAU - TAU on BG/L • KTAU - TAU on Linux Cluster University of Oregon Performance Research Lab

  3. What is a process doing inside the kernel? Solution: Context-of-execution based profile/trace. We can analyze the execution path of a process and store the data local to that process.

  4. What about other processes on the system? Solution: System-wide performance analysis. By aggregating the performance data of each process in the system (all or selectively), we can capture interactions among processes.
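
As a rough illustration of the aggregation idea on this slide, the Python sketch below sums per-process kernel profiles into a system-wide view, either for all processes or for a selected subset. The data layout, the `aggregate` helper, and the timing numbers are assumptions made for illustration; they are not KTAU's actual data format or API.

```python
from collections import Counter

# Hypothetical per-process profiles: process name -> {kernel routine: time in microseconds}.
per_process = {
    "ciod": {"sys_write": 70, "schedule": 5},
    "rpciod": {"socket_send": 15, "schedule": 8},
}

def aggregate(profiles, selected=None):
    """Sum per-routine times over all processes, or only over those in `selected`."""
    total = Counter()
    for name, profile in profiles.items():
        if selected is None or name in selected:
            total.update(profile)  # adds values key by key
    return dict(total)

system_wide = aggregate(per_process)
only_rpciod = aggregate(per_process, selected={"rpciod"})
```

Keeping the per-process data separate and aggregating on demand is what allows both the process-centric and the system-wide view to come from the same measurements.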

  5. Profiling or Tracing? Answer: Why not do both? • Profile • A summarized view of performance data, with the advantage of compact data size. • Trace • A detailed view of the process execution timeline, with the disadvantage of large data size.
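
To make the size trade-off concrete, here is a small Python sketch that collapses a trace (one record per event) into a profile (one record per routine). The event tuples and the `summarize` helper are hypothetical, and the sketch assumes calls are neither nested nor recursive.

```python
from collections import defaultdict

# Hypothetical trace: (timestamp in microseconds, routine, "enter" or "exit").
trace = [
    (0, "sys_write", "enter"),
    (40, "sys_write", "exit"),
    (50, "sys_read", "enter"),
    (65, "sys_read", "exit"),
    (70, "sys_write", "enter"),
    (100, "sys_write", "exit"),
]

def summarize(events):
    """Collapse a trace into a profile: per-routine call count and total time."""
    profile = defaultdict(lambda: {"calls": 0, "time_us": 0})
    entered_at = {}  # routine -> timestamp of its pending "enter"
    for ts, name, kind in events:
        if kind == "enter":
            entered_at[name] = ts
        else:
            profile[name]["calls"] += 1
            profile[name]["time_us"] += ts - entered_at.pop(name)
    return dict(profile)

profile = summarize(trace)
```

The trace grows with every event, while the profile grows only with the number of distinct routines; the price is that the profile no longer says when each call happened.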

  6. Why do we need another kernel profiling/tracing tool? Answer: Why not? • LTT • OProfile • KernInst

  7. KTAU Design Goals • Fine-grained kernel-level performance measurement • Parallel applications • Support both profiling and tracing • Both process-centric and system-wide view • Merge user-space performance with kernel-space • Detailed program-OS interaction data • Analysis and visualization compatible with existing tools

  8. KTAU Method • Instruments Linux kernel source with KTAU profiling API • Maintains performance data for each kernel routine (per process) • Performance data accessible via /proc filesystem • Instrumented application maintains data in user-space • Post-execution performance data analysis

  9. KTAU Framework

  10. KTAU Architecture • 5 modules • KTAU Instrumentation • KTAU Profiling/Tracing Infrastructure • KTAU Proc Interface • KTAU User-API Library • KTAU-D

  11. Kernel Profiling Issues on BG/L • I/O node kernel • Linux kernel approach • Compute node kernel • No daemon processes • Single address space • single performance database • single callstack across user/kernel • Keeps track of one process only (optimization) • Instrumented compute node kernel

  12. KTAU on BG/L I/O Node

  13. KTAU on BG/L • Current status • IO Node ZeptoOS kernel profiling/tracing • KTAU integrated into ZeptoOS build system • Detailed IO Node kernel observation now possible • KTAU-Daemon (KTAU-D) on IO Node • monitors system-wide and individual-process activity • more than what strace allows • Visualization of trace/profile of ZeptoOS and CIOD • Vampir/Jumpshot (trace) and ParaProf (profile)

  14. KTAU Usage Models for BG/L IO-Node • Daemon-based monitoring (KTAU-D) • Use KTAU-D to monitor (profile/trace) a single process (e.g., CIOD) or the entire IO-Node kernel • No access to source code of the user-space program is needed • CIOD kernel activity is available even though CIOD source is not • ‘Self’ monitoring • A user-space program can be instrumented (e.g., with TAU) to access its OWN kernel-level trace/profile data • ZIOD (ZeptoOS IO-D) source (when available) can be instrumented • Can produce MERGED user-kernel trace/profile

  15. More on KTAU-D • A daemon running on the BG/L IO-node that periodically accesses kernel profile/trace data and writes it to the filesystem • Configuration done through the ZeptoOS configuration tool • KTAU-D, its configuration file, and the necessary scripts are integrated into the ZeptoOS runtime environment.
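
The periodic access pattern described on this slide can be sketched in Python as a simple polling loop. This is only an outline under stated assumptions: `run_daemon`, `read_kernel_data`, and `write_output` are hypothetical names, and the real KTAU-D reads through the KTAU proc interface and writes files, neither of which is modeled here.

```python
import time

def run_daemon(read_kernel_data, write_output, interval_s, rounds, sleep=time.sleep):
    """Poll `read_kernel_data` `rounds` times and hand each snapshot to `write_output`."""
    for i in range(rounds):
        write_output(i, read_kernel_data())
        if i + 1 < rounds:
            sleep(interval_s)  # wait before the next access

# In-memory stand-ins so the sketch runs without a kernel or a filesystem.
collected = []
run_daemon(
    read_kernel_data=lambda: "snapshot",             # hypothetical reader
    write_output=lambda i, data: collected.append((i, data)),
    interval_s=1.0,
    rounds=3,
    sleep=lambda seconds: None,                      # skip real waiting in the demo
)
```

Passing the reader, writer, and sleep function as parameters keeps the loop independent of where the data comes from, which is convenient for trying it out off the target machine.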

  16. KTAU-D Configuration in ZeptoOS-1.2

  17. KTAU-D Profile Data • KTAU-D can be used to access profile data (system-wide and individual process) of the BG/L IO-Node • Data is obtained at the start and stop of KTAU-D, and then the resulting profile is generated • Currently, flat profiles with inclusive/exclusive times and function call counts are produced • (Future work: Call-graph profiles). • Profile data is viewed using the ParaProf visualization tool
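
For readers unfamiliar with the terms, inclusive time covers a routine and everything it calls, while exclusive time subtracts the time spent in callees. The Python sketch below computes a flat profile with both, plus call counts, from properly nested enter/exit events. The `flat_profile` helper and the event data are illustrative assumptions (the callee name `vfs_write` is chosen only as an example), not KTAU's implementation, and recursion is not handled.

```python
def flat_profile(events):
    """events: (timestamp, routine, 'enter' or 'exit') tuples, properly nested."""
    stats = {}
    stack = []  # each frame: [routine, enter timestamp, time spent in callees]
    for ts, name, kind in events:
        if kind == "enter":
            stack.append([name, ts, 0])
            continue
        routine, start, child_time = stack.pop()
        assert routine == name, "unbalanced enter/exit"
        inclusive = ts - start
        entry = stats.setdefault(name, {"calls": 0, "incl": 0, "excl": 0})
        entry["calls"] += 1
        entry["incl"] += inclusive
        entry["excl"] += inclusive - child_time
        if stack:
            stack[-1][2] += inclusive  # charge this call to the caller's callee time
    return stats

# Hypothetical events: sys_write calls vfs_write once.
events = [
    (0, "sys_write", "enter"),
    (10, "vfs_write", "enter"),
    (35, "vfs_write", "exit"),
    (50, "sys_write", "exit"),
]
result = flat_profile(events)
```

A flat profile like this keeps one row per routine; a call-graph profile, listed on the slide as future work, would additionally record which caller each call came from.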

  18. Example of Operating System Profile on I/O Nodes. Running Flash3 on 32 compute nodes. Figure label: CIOD kernel profile

  19. KTAU-D Trace • KTAU-D can be used to access system-wide and individual process trace data of the BG/L IO-Node • Trace from KTAU-D is converted into TAU trace format, which can then be converted into other formats • Vampir, Jumpshot • Trace from KTAU-D can be used together (merged) with trace from TAU to monitor both user-space and kernel-space activities • (Work in progress)
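
One simple way to picture the merge of a user-space trace with a kernel trace is a timestamp-ordered merge of two already sorted event streams. The Python sketch below uses the standard `heapq.merge`; the event tuples are hypothetical, and the sketch assumes both traces share a common clock, which a real merge would have to establish first. It does not reflect the TAU trace format.

```python
import heapq

# Hypothetical events: (timestamp in microseconds, source, description), sorted by time.
user_trace = [
    (100, "user", "write() enter"),
    (900, "user", "write() exit"),
]
kernel_trace = [
    (150, "kernel", "sys_write enter"),
    (850, "kernel", "sys_write exit"),
]

def merge_traces(user_events, kernel_events):
    """Interleave two time-sorted traces into a single timeline."""
    return list(heapq.merge(user_events, kernel_events, key=lambda e: e[0]))

timeline = merge_traces(user_trace, kernel_trace)
labels = [description for _, _, description in timeline]
```

In the merged timeline the kernel events fall between the enter and exit of the user-level call, which is the kind of program-OS interaction the merged view is meant to expose.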

  20. Exp 1: Observe activities on the IO node. Set up: • KTAU: • Enable all instrumentation points • Number of kernel trace entries per process = 10K • KTAU-D: • System-wide tracing • Accessing the trace every 1 second and dumping the trace output to a file in the user’s home directory through NFS • IOTEST: • An MPI-based benchmark (open/write/read/close) • Running with default parameters (block-size = 16MB) on NFS.

  21. IOTEST with TAU instrumentation. Figure labels: Write Seek Time; Write Time; Main; Read Seek Time; Read Time

  22. KTAU Trace of CIOD running 2, 4, 8, 16, 32 nodes. Figure labels: sys_write() / sys_read(). As the number of compute nodes increases, CIOD has to handle a larger number of forwarded system calls.

  23. Zoomed View of CIOD Trace (8 compute nodes)

  24. Can Correlate CIOD Activity with RPC-IOD? • Activity within a BG/L IO node switches from “CIOD” to “rpciod” during a “sys_write” system call • rpciod performs “socket_send” and interrupt handling before switching back. Figure labels: rpciod; ciod

  25. Exp 2: Correlating multiple traces from Compute-node and IO-node • Set up: • Running IOTEST with TAU instrumentation on 64 compute nodes • Running ZeptoOS-1.2 with KTAU on 2 IO nodes • Reduced set of kernel instrumentation • No TCP stack or schedule() instrumentation • Ring trace buffer of 10K entries • Using PVFS2 (Note: Trace of 64 compute nodes and 2 IO nodes)
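
The fixed-size ring trace buffer in this setup can be pictured with the Python sketch below. It assumes the common ring-buffer behavior in which the oldest entries are overwritten once the buffer is full; whether KTAU behaves exactly this way is not stated on the slide. The `TraceRing` class and its `dropped` counter are hypothetical, and a capacity of 3 stands in for the 10K entries used in the experiment.

```python
from collections import deque

class TraceRing:
    """Fixed-capacity trace buffer that overwrites the oldest entry when full."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.dropped = 0  # events overwritten before anyone read them

    def record(self, event):
        if len(self.entries) == self.entries.maxlen:
            self.dropped += 1
        self.entries.append(event)  # deque discards the oldest entry if full

    def drain(self):
        """Return buffered events in order and empty the buffer, as a reader would."""
        out = list(self.entries)
        self.entries.clear()
        return out

ring = TraceRing(capacity=3)
for i in range(5):
    ring.record(i)
drained = ring.drain()
```

Under this assumption, the buffer size and the interval at which the trace is read together determine whether events are lost, which is one reason to reduce the set of instrumentation points.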

  26. TAU Trace. Figure labels: read() @ 12:678 sec; write() @ 3:283 sec

  27. Figure labels: pvfs2-client on ionode23; sys_open() @ 53:1; sys_write() @ 56:6; sys_read() @ 1:05:545; ciod on ionode23; pvfs2-client on ionode47; sys_open() @ 53:2; sys_write() @ 56:85; sys_read() @ 1:05:778; ciod on ionode47

  28. Exp 3: Analyze system-wide performance • Set up: • 2 runs of IOTEST with TAU instrumentation on 32 compute nodes • NFS • PVFS • Running ZeptoOS-1.2 with KTAU on 1 io-node • Analyzing both profile and trace data

  29. Figure labels: pvfs2-client; write() @ 39:00; ciod; read() @ 47.804; rpciod; write() @ 42:99; read() @ 54:61; ciod
