The Case of the Unexplained

Presentation Transcript


    1. The Case of the Unexplained… Mark Russinovich, Technical Fellow, Microsoft Corporation

    2. Outline Introduction Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    3. Troubleshooting. Applications or the system sometimes exhibit mysterious sluggish performance. Most applications do a poor job of reporting unexpected errors: locked, missing or corrupt files; missing or corrupt registry data; permissions problems. You might be plagued by the occasional hard hang or blue screen.

    4. Purpose of Talk. Show you how to solve these classes of problems by peering beneath the surface: interpreting process, thread, file and registry activity; interpreting thread call stacks; analyzing crash dumps. I’ll show you real case studies, some of which have been written up on my blog. You’ll learn tools and techniques to help you solve seemingly unsolvable problems, and see that sometimes finding a workaround is all you can do.

    5. Tools We’ll Use. Sysinternals (www.microsoft.com/technet/sysinternals): Process Explorer – process/thread viewer; Process Monitor – file/registry/process/thread tracing; Autoruns – displays all autostart locations; SigCheck – shows file version information; PsExec – executes processes remotely or in the system account; Pslist – lists process information; Strings – dumps printable strings in any file; Zoomit – the presentation tool I’m using. Debugging Tools for Windows (www.microsoft.com/whdc/devtools/debugging/Windbg): Windbg – application and kernel debugger. Kernrate: www.microsoft.com (search for it in the download center).

    6. Outline Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    7. Process Explorer. Process Explorer is a Task Manager replacement. You can literally replace Task Manager with Options->Replace Task Manager. Use hide-when-minimized to always have it handy. Hover the mouse over its tray icon to see a tooltip showing the process consuming the most CPU. Open the System Information graph to see CPU usage history; graphs are time-stamped, and hovering shows the biggest consumer at that point in time. It also shows other activity such as I/O and kernel memory limits.

    8. The Case of the Sidebar CPU Spike. Noticed a CPU spike every few minutes. The system is dual-core, so the spike shows as 50% of available CPU. Process Explorer showed the Sidebar consuming 30 seconds of CPU every 5 minutes:
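
The percentages on this slide follow from simple arithmetic. The sketch below is illustrative only: the function names are made up for this example, and it assumes the spike comes from a single thread saturating one core.

```python
def spike_share_of_total_cpu(busy_cores: float, total_cores: int) -> float:
    """Percent of total CPU displayed while `busy_cores` cores are saturated."""
    return 100.0 * busy_cores / total_cores


def average_cpu_share(cpu_seconds: float, interval_seconds: float, total_cores: int) -> float:
    """Average percent of total CPU capacity consumed over an interval."""
    return 100.0 * cpu_seconds / (interval_seconds * total_cores)


# One saturated core on a dual-core system (assumption: single busy thread).
print(spike_share_of_total_cpu(1, 2))

# 30 CPU-seconds every 5 minutes on the same dual-core system.
print(average_cpu_share(30, 5 * 60, 2))
```

A low long-run average can hide short bursts that saturate a core, which is why the history graph is more informative than a single reading.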

    9. Process Monitor. Process Monitor is a real-time file, registry, process and thread monitor. It requires Windows 2000 SP4 w/Update Rollup 1, XP SP2 or higher, Server 2003 SP1 or higher, Vista, or Server 2008 (including 64-bit versions of Windows). It replaces Filemon and Regmon, but you can still use Filemon and Regmon on older operating systems. Enhancements over Filemon/Regmon include more advanced filtering, operation call stacks, boot-time logging, data mining views, and a process tree to see short-lived processes. When in doubt, run Process Monitor! It will often show you the cause of error messages, and it often tells you what is causing sluggish performance.

    10. The Case of the Sidebar CPU Spike: Solved. I didn’t know which Gadget was causing the spike. Ran Process Monitor to see if there was associated file system or Registry activity. The trace pointed at the RSS Gadget:

    11. The Case of the System Process CPU Spikes. Noticed the System process spiking CPU intermittently. The System process hosts operating system and device driver threads. Needed to look inside the System process at its threads.

    12. Processes and Threads. A process represents an instance of a running program: an address space, resources (e.g., open handles), and a security profile (token). A thread is an execution context within a process. It is the unit of scheduling (threads run, processes don’t run), and all threads in a process share the same per-process address space. The System process is the default home for kernel-mode system threads: functions in the OS and some drivers that need to run as real threads, e.g., because they need to run concurrently with other system activity, wait on timers, or perform background “housekeeping” work.
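
The point that threads in a process share one address space can be seen in a few lines of Python. This is only an analogy to the Windows model described on the slide; the worker function and thread names are invented for the example.

```python
import threading

# One list in the process's memory; every thread sees the same object.
shared = []
lock = threading.Lock()


def worker(name: str) -> None:
    # Each thread appends to the same list because threads share the
    # process address space; the lock only serializes the updates.
    with lock:
        shared.append(name)


threads = [threading.Thread(target=worker, args=(f"thread-{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))
```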

    13. Viewing Threads. Task Manager doesn’t show thread details within a process. Process Explorer does, on the “Threads” tab, which displays thread details such as ID, CPU usage, start time, state, and priority. The start address is where the thread began running (not where it is now). Click Module to get details on the module containing the thread start address.

    14. The Case of the System Process CPU Spikes: Investigation. Looked at the System process threads, but the spiking thread was a generic worker thread. Needed to look at thread activity within the System process.

    15. The Case of the System Process CPU Spikes: Investigation. Can’t look at the stacks of the System process because it’s a protected process, so needed to sample thread execution to see where time was being spent. Kernrate is a kernel profiling tool that’s part of the WDK and is also separately downloadable. Ran Kernrate and hit Ctrl+C after a spike.
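
The idea behind a sampling profiler can be sketched as follows: record where execution is at regular intervals, then attribute each sample to the module whose address range contains it. This is not how Kernrate is implemented; the module layout, names, and sample addresses below are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical module layout, sorted by base address: (base, size, name).
MODULES = [
    (0x1000, 0x1000, "kernel_core"),
    (0x4000, 0x2000, "net_driver"),
    (0x8000, 0x1000, "disk_driver"),
]


def module_for(address: int, modules=MODULES) -> str:
    """Return the name of the module whose range contains `address`."""
    bases = [base for base, _size, _name in modules]
    index = bisect_right(bases, address) - 1
    if index >= 0:
        base, size, name = modules[index]
        if address < base + size:
            return name
    return "<unknown>"


def hits_by_module(samples) -> Counter:
    """Count how many sampled addresses fall inside each module."""
    return Counter(module_for(address) for address in samples)


# Hypothetical sampled instruction addresses collected during a spike.
samples = [0x4010, 0x4FF0, 0x1200, 0x5ABC, 0x9000, 0x4400]
profile = hits_by_module(samples)
for name, hits in profile.most_common():
    print(f"{name}: {hits} of {len(samples)} samples")
```

The module with the most hits is where the CPU spent most of its time while sampling was active.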

    16. The Case of the System Process CPU Spikes: Solved (sort of). Looked at the System process DLL view. Went to the Broadcom site, but the driver was the latest version. At least the cause was identified.

    17. Outline Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    18. The Case of the Explorer Hangs. Explorer started hanging on certain folders. Hangs lasted up to a minute; Explorer would work normally for a minute and then hang again. Ran Filemon and saw a network path error that contained references to a decommissioned computer. Regmon showed an icon lookup configured for the missing computer. Fix: delete the Paint Shop Pro (PSP) browse files and all PSP file associations.

    19. Call Stacks. Sometimes a thread start address doesn’t tell you what a thread is doing. The stack might provide a hint: the stack is a per-thread region of memory that records a history of function nesting. The bottom frame (Function 3) is where the thread will continue executing.

    20. Viewing Call Stacks. Click Stack on the Threads tab to view a thread’s call stack, which lists functions in reverse chronological order. Note that the start address on the Threads tab is different from the first function shown in the stack. This is because all threads created by Windows programs start in a library function in Kernel32.dll, which calls the programmed start address.
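
"Reverse chronological order" means the most recently called function comes first. The Python sketch below is an analogy rather than a Windows thread stack: it uses the standard `inspect.stack()` call, and the three function names simply mirror the Function 1 to Function 3 nesting mentioned on the previous slide.

```python
import inspect


def function1():
    return function2()


def function2():
    return function3()


def function3():
    # inspect.stack() lists frames innermost first: the function running
    # now, then its caller, then the caller's caller, and so on.
    return [frame.function for frame in inspect.stack()]


stack = function1()
print(stack[:3])
```

The first three entries are `function3`, `function2`, `function1`; anything after that is whatever called `function1`, much as a Windows stack ends in the library function that started the thread.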

    21. The Case of the PowerPoint Hang. Started experiencing a one-minute hang every time PowerPoint started. Looked at the thread stack in Process Explorer, saw a wait on a network printer, and removed the printer.

    22. Outline Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    23. The Case of the Error Message Error Message. I was using the system one day and got an error message apparently out of the blue:

    24. Associating Windows with Processes. Task Manager can associate a window in its list with a process, but sometimes windows appear that are not in its “Applications” list. Process Explorer has a “window finder” tool: on the toolbar, drag the window finder icon over the window and release, and the process that owns the thread that owns the window is highlighted. The Visual Studio Spy++ tool shows which thread owns a window…

    25. The Case of the Error Message Error Message: Solved. I used the Process Explorer window finder to determine it was LiveMeeting, which I had recently exited:

    26. The Case of the Missing Quicktime Installer. Inserted a game magazine CD. The launcher reported that it required Quicktime. Pressing the button caused the launcher to exit, but Quicktime setup didn’t start.

    27. The Case of the Missing Quicktime Installer. Captured a trace with Process Monitor and looked at the process tree to identify the launcher. Scrolled through the trace looking for references to Quicktime. Confirmed launcher was missing from CD. Installed Quicktime from Apple.com.

    28. The Case of the Failed File Copy. Saw an error when trying to copy files to a USB flash drive. The volume had plenty of space and Chkdsk reported no problems.

    29. The Case of the Failed File Copy: Solved. Process Monitor shows the actual error is “CANNOT MAKE”. A web search revealed that’s a FAT error returned when the root directory is full: FAT has 512 root-directory entries. Copied the files to a subdirectory as a workaround.
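
A rough sketch of why the root directory can fill up long before 512 files are copied: on FAT volumes with long-file-name support, a file with a long name occupies one short-name entry plus one extra entry per 13 characters of the long name. The code assumes every file is stored with long-name entries and that names are plain ASCII; the function names and file names are made up for illustration.

```python
import math

ROOT_ENTRIES = 512         # root-directory capacity quoted on the slide
CHARS_PER_LFN_ENTRY = 13   # characters held by one long-name entry


def entries_for(name: str) -> int:
    """Directory entries used by one file, assuming it gets long-name entries."""
    return 1 + math.ceil(len(name) / CHARS_PER_LFN_ENTRY)


def files_that_fit(names, capacity: int = ROOT_ENTRIES) -> int:
    """How many of `names`, in order, fit before the root directory is full."""
    used = 0
    count = 0
    for name in names:
        needed = entries_for(name)
        if used + needed > capacity:
            break
        used += needed
        count += 1
    return count


# Hypothetical 23-character file names: each consumes 3 entries.
names = [f"holiday_photo_{i:04d}.jpeg" for i in range(1000)]
print(entries_for(names[0]), files_that_fit(names))
```

Under these assumptions only 170 such files fit in the root directory, regardless of free space. Subdirectories are ordinary files that can grow, which is why copying into a subdirectory works around the limit.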

    30. The Case of the Build Failure. While building a program using nmake on a command line, link reported an error: “error writing to program database, check for insufficient disk space, invalid path, or insufficient privileges”

    31. Viewing Open Handles. Each process has a list of open “objects”: files, Registry keys, synchronization objects, TCP/UDP ports… It may be useful to query this list. Microsoft tools: Oh.exe in the Resource Kit, and XP/2003 have a new “Openfiles /query” command, which only shows handles to open files, not other non-file objects. Both require setting a global flag and rebooting (see Gflags.exe in Support Tools). Process Explorer and Sysinternals Handle can show open handles without this flag, using a device driver.

    32. Uses Of Handle View. Understand resources used by an application: files and Registry keys. Note: by default, it shows only named objects; click Options->Show Unnamed Objects to see the rest. Solve file-locked errors: use the search feature to determine which process is holding a file or directory open. You can even close an open file (be careful!). View the state of synchronization objects (mutexes, semaphores, events). Detect handle leaks using refresh difference highlighting.
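
Difference highlighting amounts to comparing two snapshots of a process's handle list. The sketch below shows only that comparison; it does not read real handles, and the snapshot contents are hypothetical.

```python
def diff_handles(before: set, after: set):
    """Return (opened, closed) handles between two snapshots."""
    opened = sorted(after - before)
    closed = sorted(before - after)
    return opened, closed


# Hypothetical snapshots of (object type, name) pairs for one process.
snapshot_1 = {
    ("File", r"C:\build\app.pdb"),
    ("Key", r"HKLM\Software\Example"),
}
snapshot_2 = {
    ("File", r"C:\build\app.pdb"),
    ("File", r"C:\temp\log1.txt"),
    ("File", r"C:\temp\log2.txt"),
}

opened, closed = diff_handles(snapshot_1, snapshot_2)
print("opened:", opened)
print("closed:", closed)
```

A leak typically shows up as the "opened" side growing on every refresh while nothing is ever closed.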

    33. The Case of the Build Failure: Solved. You can see the sharing violation in Process Monitor. Performed a handle search for the file in Process Explorer and saw that Windbg had it open from an earlier debug session, even though the debug session was closed. Closed Windbg.

    34. The Case of the Mysterious Logon Error Message. Started getting an error message after logging on. The Process Explorer window finder showed that the dialog is owned by Csrss.exe, the Win32 subsystem process.

    35. The Case of the Mysterious Logon Error Message: Solved. Ran Filemon with PsExec in the System account (Psexec -sid <appname>), then logged off and on. During logon saw that Qttask, the QuickTime quick start applet, was accessing the Z: drive. I had previously watched a QuickTime movie from a temporarily mapped Z: drive, and QuickTime remembered the drive letter. Used Autoruns to disable Qttask.

    36. Viewing Autostarts. Use Autoruns to see what’s configured to start when the system boots and you log in. Windows MsConfig shows only a subset of the defined autostart locations and doesn’t show as much information.

    37. The Case of the Missing Autoplay. Inserted a USB key on a laptop and expected the Autoplay dialog; wanted to speed up the system with ReadyBoost. Looked at the autoplay settings and everything looked fine:

    38. The Case of the Missing Autoplay: Solved. Captured a Process Monitor trace and searched for “autorun”, which turned up a value that group policy had set. Documentation found through a web search said 255 (0xFF) turns off all autoplay. Reset the value to 0 and restarted Explorer, and autoplay worked on the next insertion. Set permissions so the value wouldn’t get reset by group policy.

    39. Outline Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    40. The Case of the Unknown Autostart. Was doing a routine check of auto-starting programs using Autoruns and saw an item I didn’t recognize. It exhibits all the signs of malware: it is in the \Windows directory, it has no version information, and it has a name that implies that it is a Windows component.

    41. The Case of the Unknown Autostart: Solved. Dumped the strings of the image using Strings and recognized IconEdit2 as a freeware icon editor I had recently installed. The author said that it is a licensing component.
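
The technique of dumping printable strings from a binary can be approximated in a few lines. This is a minimal sketch, not the Sysinternals Strings implementation: it looks for runs of printable ASCII characters, both as single bytes and as UTF-16LE (each character followed by a zero byte), and the sample bytes are invented for the example.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII and UTF-16LE runs of at least `min_len` characters."""
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_run.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_run.finditer(data)]
    return found


# Hypothetical file contents: binary noise around two embedded strings.
sample = b"\x00\x01IconEdit2\x00\xff" + "License".encode("utf-16-le") + b"\x02ab\x03"
print(printable_strings(sample))
```

To scan a real file, read it with `open(path, "rb").read()` and pass the bytes in; product names and copyright strings in the output often identify an otherwise anonymous image.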

    42. Outline Sluggish Performance Application Hangs Error Messages Unknown Images Blue Screens and System Hangs

    43. Crashes and Hangs. Windows has various components that run in kernel mode, the highest privilege mode of the OS: OS components (Ntoskrnl.exe, Hal.dll) and drivers (Ntfs.sys, Tcpip.sys, device drivers). Kernel-mode components are privileged extensions to the OS that have to adhere to various rules: not accessing invalid memory, accessing memory at the right “Interrupt Request Level”, and not causing resource deadlocks. When a kernel-mode component performs an illegal operation, Windows crashes (blue screens); crashing helps preserve the integrity of user data. A resource deadlock can hang the system.

    44. Crash Dump Analysis. When you reboot after a crash, Windows offers to upload the crash dump to Microsoft Online Crash Analysis (OCA). An automated server generates a thumbprint of the crash and uses it as a key in a database; if the database has an entry, the user is told the cause and directed to a fix. Many times, however, the cause is unknown. Basic crash dump analysis is easy and it might tell you the cause. It requires Windbg and symbol configuration. Dump files are in either \Windows\Memory.dmp (Vista and servers) or \Windows\Minidump (Windows 2000 Pro and Windows XP).

    45. The Case of the Periodic System Hangs. User experienced intermittent hard hangs. Configured Crash-on-Control-Scroll by setting HKLM\System\CurrentControlSet\Services\i8042prt\Parameters\CrashOnCtrlScroll to DWORD 1. After a reboot, you can crash the system with Ctrl+Scroll Lock+Scroll Lock. Opened the dump file generated at a subsequent hang and looked at the stack with the “kb” command, which pointed at the wireless driver. Windows Update had a new version.
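
One way to apply the registry setting above is to import a `.reg` file. The sketch below only builds the text of such a file; the helper name is made up, and writing the result to disk, importing it, and rebooting are left to the reader.

```python
def reg_dword(key_path: str, value_name: str, value: int) -> str:
    """Build the text of a .reg file that sets one DWORD value."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{value:08x}\n'
    )


# Key and value taken from the slide; HKLM is written out in full.
key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"
reg_text = reg_dword(key, "CrashOnCtrlScroll", 1)
print(reg_text)
```

Generating the file instead of editing the registry directly keeps the change reviewable before it is applied to a machine.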

    46. Another Case of the Periodic System Hangs. Home system was intermittently hard-hanging. Configured Crash-on-Control-Scroll. Got a crash (not a hang) during the reboot! Opened the dump file in Windbg and looked at the driver version with the “lm kvm ctaud2k” command:

    47. Another Case of the Periodic System Hangs: Solved. Went to Creative’s site and got an updated driver. Rebooted and got the hang. Manually crashed the system and opened the dump after the second reboot. Looked at the stack and saw the wireless driver (command: kb). Checked the second CPU just to be sure (command: ~1). Downloaded a new driver and the problems were gone.

    48. Summary and More Information. A few basic tools and techniques can solve seemingly impossible problems. I learn by always trying to determine the root cause. Resources: Windows Internals, 4th Ed. (understand the way the OS works); Sysinternals Video Library (in-depth dive on tools and troubleshooting); Process Monitor webcast (see Sysinternals->Mark’s Webcasts). If you’ve solved one, send me a description, screenshots and log files! I’ll send you a signed copy of Windows Internals.
