1 / 56

Introduction and Case Studies

Introduction and Case Studies. (Xperf). Tate Calhoun Platforms Global Escalation Services Core Team. Agenda. What is Xperf? How does it work? What can it do? Case Studies! Where can I get it?. What is Xperf?.

misu
Download Presentation

Introduction and Case Studies

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction and Case Studies (Xperf) Tate Calhoun Platforms Global Escalation Services Core Team

  2. Agenda • What is Xperf? • How does it work? • What can it do? • Case Studies! • Where can I get it?

  3. What is Xperf? • No application or driver is an island! Triaging system wide performance issues via multiple trace streams introduces complexity. Data must be aggregated, visualized, and understood to determine root cause. Xperf is the long term investment Microsoft is making to meet this diagnostic need. • Support Engineer Trends/Challenges: • 1.) Memory.dmps are not getting smaller! (yet) • 2.) Classic yet challenging application and system hangs, slow I/O, and other system wide issues • 3.) User observation to code behavior. Stackwalking! • 4.) The visualization of complex ETW data

  4. What is Xperf? • An ETW controller and consumer • Created by the Windows Fundamentals team • Designed for analyzing system and application performance on Windows Vista, Windows Server 2008, and later.

  5. What is Xperf? • Capture and analysis model follows this general flow: • ETW tracing is enabled using xperf (or xbootmgr). • Scenario of interest is performed. • ETW tracing is stopped using xperf (or automatically by xbootmgr) and the data is saved to an ETL trace file. • Trace files can then further be processed with xperf and/or viewed with xperfview. • Traces can be processed on the machine on which they were taken, or copied to another machine for analysis (including cross-architecture).  Everything needed for analysis is stored in the trace file.

  6. How does this work? • Performance Analyzer is built on top of the Event Tracing for Windows (ETW) • infrastructure. ETW enables Windows and applications to efficiently generate • events, which can be enabled and disabled at any time without requiring system or • process restarts. ETW collects requested kernel events and saves to an etl file. • Provides extensive details about the system. Context switches, interrupts, deferred procedure calls, process and thread creation and destruction, disk I/Os, hard faults, processor P-State transitions, and registry operations, etc. • One of the great features of ETW, supported in WPT, is the support of symbol decoding, sample profiling, and capture of call stacks on kernel events.

  7. ETW Overview • ETW, or Event Tracing for Windows, is a high performance kernel level tracing system that made its first • appearance in Windows 2000 and has found widespread use since. Most operations throughout the • operating system that are interesting to a performance analyst are already instrumented. • ETW is comprised of four key components: • Providers: Components that generate events. They are instrumented with the ETW EventWrite() API. • Sessions: Kernel objects that collect events into buffers and send those events to a file or directly a consuming application. • Controllers: Programs that create and control sessions, and that control providers. Controlling providers consists of enabling and disabling events, or groups of events. • Consumer: Programs that consume event data from ETW sessions. The data can be consumed from a file, or in ‘real time’ streaming mode. • If you are interested in instrumenting your own applications, please see the ETW documentation in MSDN

  8. ETW Components

  9. ETW Advantages for Performance Tracing • Events can be enabled or disabled dynamically, no system or process restarts • By default all events are disabled and the call to EventWrite() incurs a very minimal overhead – just a flag check. This allows retail (shipping) code to be fully instrumented and enables tracing to be enabled in shipping components at any time. • Even when enabled, event tracing has a very low run-time overhead, even when large numbers of events are generated per second. • Once begun, an ETW logging session has a defined maximum memory footprint. The disk footprint can be similarly limited. CPU load is directly proportional to the event rate and scales very well. • Events from multiple providers, sessions and the kernel logger can be merged seamlessly when collected simultaneously on the same machine. This allows different kinds of events and settings to be used on different logging sessions and then merged together after the fact. • The Windows kernel and multiple components are already extensively instrumented. • ETW is the instrumentation method of choice on Windows.

  10. Key Benefits to Support Engineers • 1.) Lightweight/Non-Invasive, only about 2-3% CPU, 5% Memory, configurable. • 2.) STACK TRACING! • 3.) Independent Data Gathering vs. Analysis • (Easily understated…this is almost like being able to share a live debug session) • 4.) We are trying to move away from "how do I reproduce this problem internally?" to "What was the problem and how to fix it“ • 5.) A different…temporal debugger?

  11. More about stacks… • In Vista+, stack walking can be enabled for certain kernel events. When combined with symbols on the processing side this extends analysis using system events to correlate back into the code context that triggered the corresponding event, from up in kernel mode to deep inside applications. This is a very powerful tool!

  12. What can it do? • More and more each release and in the upcoming Win7 beta releases of the tool you will see improved data capturing ability. For now, here are some scenarios that we have explored. • 1.) CPU Consumption and Scheduling Issues • 2.) Storage Delay, Disk I/O Issues • 3.) Power Management Events

  13. Let's get started… • Download and install of the tools (preferably custom to c:\xperf) and an elevated CMD. http://msdn.microsoft.com/en-us/performance/default.aspx

  14. Let's get started… • Let's take a view of the advanced stackwalk options and other Kernel groups • (xperf -providers K)

  15. Let's get started… • Here are the stackwalk flags (xperf -help stackwalk)

  16. Let's get started… • Any Easy Syntax to start out!?! • Using the following is a good head start and we see that massive customization is possible with stackwalk, etc. • Xperf -on DiagEasy • For profiling issues to get stacks one can use: • Xperf -on Latency -stackwalk Profile

  17. Case Study 1 – IE Hang • Xperf -on latency -stackwalk profile • Repro IE Hang… • Xperf -d ielatency.etl • (-dMergedETL Merge the ETL files of stopped logging sessions into MergedETL; if no session is stopped explicitly, the "NT Kernel Logger" stopped implicitly.)

  18. Case Study 1 – IE Hang Case Study 1 – IE Hang

  19. Case Study 1 – IE Hang • So we have spent a significant portion of time in MSHTML.DLL!CHtPvPv::Lookup •  I happened to have a dump of the process as well….

  20. Case Study 1 – IE Hang • Summary: Searching around this mshtml code, this is being reviewed via a Windows 7 bug. Unfortunately, we didn’t have a repro at the time... I provided this etl trace to the team (with dumps and error reporting data) and the investigation continues! • A great example of the portability advantages of this tool!

  21. Case Study 2 – Everything slow… • Xperf -on latency -stackwalk profile • Repro slowness… • Xperf -d WUlatency.etl • Note overlay of Disk Utilization in the middle...

  22. Case Study 2 – Everything slow… • So we have some activity here in cbscore.dll!CCbsPublicPackage::EvaluateApplicability, etc.

  23. Case Study 2 – Everything slow… • What is cbscore.dll? • Component-Based Servicing • Component-Based Servicing (CBS) is part of the servicing stack. The servicing stack is a set of files and resources that are required to service a Windows image or operating system. The servicing stack is available on all Windows Vista and Windows Server 2008 installations, as well as in the Windows Automated Installation Kit AIK (Windows AIK) and the Windows OEM Preinstallation Kit (Windows OPK). CBS provides various APIs (which are not publicly available) to its client installers to service the operating system components. Client installers such as Windows Update or Windows Installer work with CBS to enumerate, install, update, and uninstall component packages on the destination operating system. CBS interacts with the Component Servicing Infrastructure to perform the necessary system changes. • For more information about CBS, see http://go.microsoft.com/fwlink/?LinkId=91917. • Wow, there's even CBS logging as well if needed: http://technet.microsoft.com/en-us/library/cc732334.aspx#CBS

  24. Case Study 2 – Everything slow… • Summary: My machine is working hard scanning for updates, so this seems "expected". However, it turns out there are actually performance improvements exactly here in cbscore for Windows 7 and Vista SP1 to mitigate this issue!

  25. Case Study 3 – Outlook hangs on startup… • Xperf -on latency -stackwalk profile • Start Outlook… • Xperf -d outlookhang.etl

  26. I've got a DPC issue on this machine to track down further with the NIC :)….15% DPC time as a constant is not good for performance!

  27. Case Study 3 – Outlook hangs on startup… • Interesting to note that it matches the CPU Sampling by CPU.

  28. Case Study 3 – Outlook hangs on startup… However, we are ~41% idleper the above…

  29. Case Study 3 – Outlook hangs on startup… What is going on here? Half Idle, Half…

  30. Case Study 3 – Outlook hangs on startup… Ah, my backup application at PID 496 is running doing it's thing… What is going on here? Half Idle, Half…

  31. Case Study 3 – Outlook hangs on startup… But what is Outlook doing during all this?

  32. Case Study 3 – Outlook hangs on startup… Summary: So I've found a taxing network DPC issue on my machine so I'll go follow up with my NIC configuration/driver. (turns out disable renable did the trick) Also, I've seen that I’ve got my backup application querying the file system here likely indexing by design. Also, I've got more digging to do into Outlook using the Cswitch stackwalk event (running the trace before the hang). Addendum: Found two separate hang bugs in Outlook actually where the root cause of the hangs(with the debugger, I cheated, sorry, I feel bad). The write stack matched one though!

  33. Case Study 4 – Run Scott Run… The situation is that on two different platforms (physical vs. virtual) of stated same build are displaying two different scheduling behaviors when viewing CPU time for RunScottRun.exe. The exe “doesn't stay on a processor”… xperf -on PROFILE+DISPATCHER+PROC_THREAD+LOADER+HARD_FAULTS+INTERRUPT+DPC+CSWITCH -maxbuffers 1024 Here is what the "good" trace looks like…We have our Process level view to clone the selection….

  34. Case Study 4 – Run Scott Run… Now the CPU Scheduling section. At a high level the problem reported is that RunScottRun.exe is having trouble getting enough time on a single CPU and we see context switches...

  35. Case Study 4 – Run Scott Run… We can see this at a little bit lower level what is getting to run…

  36. Case Study 4 – Run Scott Run… At an even lower level RunScottRun.exe is lower priority, we see SomeService.exe sneak in here at Pri 9, then back to RunScottRun.exe after Delay Execution.

  37. Case Study 4 – Run Scott Run… If you are an Excel wiz :) the data can be exported for another view.   Here's the "good" trace summary/pivot “Bad” “Good”

  38. Case Study 5 – A Random Stackwalk -CreateProcess C:\xperf>xperf -on DiagEasy -stackwalk ProcessCreate+ProcessDelete C:\xperf>notepad C:\xperf>xperf -d processcreate.etl Merged Etl: processcreate.etl (shows stacks on 7sdk Stack Counts by Type)

  39. Case Study 5 – A Random Stackwalk -CreateProcess

  40. Case Study 5 – A Random Stackwalk -CreateProcess

  41. Case Study 5 – A Random Stackwalk -CreateProcess

  42. Case Study 5 – A Random Stackwalk -CreateProcess

  43. Case Study 6 – A Random Stackwalk – File I/O C:\xperf>xperf -on FileIO+FILENAME -stackwalk @file.txt  C:\xperf>xperf -d file.etl Merged Etl: file.etl C:\xperf>xperf file.etl (shows stacks with 08 ver, in File IO Section)

  44. Case Study 6 – A Random Stackwalk – File I/O

  45. Case Study 7 – A Random Stackwalk – Registry C:\xperf>xperf -on DiagEasy+REGISTRY -stackwalk @reg.txt C:\xperf>regedit C:\xperf>xperf -d reg.etl Merged Etl: reg.etl (shows stacks with 08 ver, in Registry section)

  46. Case Study 7 – A Random Stackwalk – Registry

  47. Case Study 8 – A Random Stackwalk – Power C:\xperf>xperf -on DiagEasy+POWER -stackwalk @power.txt C:\xperf>xperf -d power.etl Merged Etl: power.etl C:\xperf>xperf power.etl C:\xperf>c:\xperf7sdk\xperf power.etl (shows stacks with 7sdk Stack Counts by Type)

  48. Case Study 8 – A Random Stackwalk – Power

  49. Case Study 9 – SLOOOW I/O

  50. Case Study 9 – SLOOOW I/O

More Related