
Windows NT 4.0 Setup and Debugging

Joseph West

Sr Technology Specialist


Agenda

  • Setup (build overview)

    • Three phases of Setup

      • Character-Based Setup

      • Boot from Character-Based to GUI-Based Setup

      • GUI-Based Setup

  • Troubleshooting

    • (Blue Screens & Stop Codes)

  • Latest information for NT 4.0

    • SP4


Hardware Compatibility List

  • How important is it

  • Support parameters

    http://www.microsoft.com/hwtest/

    http://support.microsoft.com/


Character-Based Setup

Gathering of System Architecture Information

  • CPU Type

  • Motherboard Architecture

  • Hard Drive Controllers

  • File Systems

  • Disk Free Space

  • Memory


Info Gathered is Required for Basic System Initialization

  • ‘Failure to Detect’ will lead to failure of Setup

  • Unsupported components and enhancements

    • PCI 2.1

    • Special Bus Drivers

    • Caching Chips for Burst Mode


Boot from Character-Based to GUI-Based Setup

  • Windows NT Kernel is loaded completely for the first time

    • Finds a valid Hard Drive

    • Polls Adapters and tests Bus

  • Most likely point of failure

    • Drivers are loaded into Memory and Multi-threading is initialized


GUI-Based Setup

  • Install secondary Drivers

  • Create Accounts

    • Machine and Administrator

  • Configure Network Settings

  • Build final System Tree and Registry


Troubleshooting Character-Based Setup

NTHQ Tool

  • Located in Support Directory

  • Purpose is to show all hardware peripheral settings

  • Works with PCI, PnP and Legacy peripherals


Troubleshooting Character-Based Setup

NTHQ Demo


Troubleshooting Character-Based Setup

Unsupported Controller and BIOS Enhancements

  • 32-bit I/O

  • Enhanced Drive Access

  • Multiple Block Access or Rapid IDE

  • Power Management Features


Troubleshooting Character-Based Setup

Setup Hangs During Initial Boot

  • Disable CD-Boot capability before installing

    • Needs to be done at both the Controller and BIOS levels


Troubleshooting Character-Based Setup

Setup Cannot Find Hard Drive

  • Scan System for Viruses

  • Make certain there is valid Boot Sector on the Hard Drive


Troubleshooting Character-Based Setup

Setup Cannot Find Hard Drive

  • If Hard Drive Controller is SCSI

    • Are devices properly terminated?

    • Is the SCSI BIOS enabled on the first Controller (if at all)?

    • On secondary Controllers, make certain BIOS is disabled

    • Partition and format using current Controller


Troubleshooting Character-Based Setup

Setup Cannot Find Hard Drive

  • If Hard Drive Controller is IDE or EIDE

    • Make certain drive is on primary Controller Channel

    • Make certain drive is jumpered correctly

      • (i.e., Master, Slave, or Independent)


Troubleshooting Character-Based Setup

Setup Does Not Detect Hard Drive Controller Correctly

  • Manually select Controller type

  • Make certain that an NT 4.0 driver is being loaded

  • Use NTHQ Tool to check for correct IRQ and Memory addressing


Troubleshooting Character-Based Setup

Setup Cannot Find a Valid Partition

  • If Windows 95 is on the system, back up and Fdisk the Hard Drive (no support for FAT32)

    • Recreate Partitions and Format with DOS 6.22

  • Restore Windows 95 and proceed with Windows NT installation

  • Make certain that correct HAL is being loaded


Troubleshooting Failure to Reboot From Character-Based to GUI-Based Setup

Stop Messages

  • Record Hex Value, 0x1e, 0x7b, etc.

  • Record Values in parentheses

  • Record component where failure occurred

  • Note where in Boot Process error occurred

  • Call PSS (installation support)
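The recording steps above can be sketched as a small parser that pulls the hex value and the values in parentheses out of the first line of a Stop screen. This is a minimal illustration only: the function name `parse_stop_line` and the regular expression are assumptions made for this example, not part of any Microsoft tool.

```python
import re

# Matches the first line of a Stop screen, for example:
#   Stop 0x0000001E ( 0xC0000005, 0xFDE38AF9, 0x00000001, 0x7E8B0EB4 )
# The pattern is an assumption for this sketch; real screens may vary.
STOP_RE = re.compile(
    r"STOP:?\s*(0x[0-9A-Fa-f]+)\s*\(\s*([^)]*?)\s*\)",
    re.IGNORECASE,
)


def parse_stop_line(line):
    """Return (stop_code, [parameters]) as integers, or None if no match."""
    match = STOP_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    # Parameters are comma-separated hex values; "0" parses as zero in base 16.
    params = [int(p.strip(), 16) for p in match.group(2).split(",") if p.strip()]
    return code, params


result = parse_stop_line(
    "Stop 0x0000001E ( 0xC0000005, 0xFDE38AF9, 0x00000001, 0x7E8B0EB4 )"
)
print(result)
```

Keeping the code and the four parameters as integers makes it easy to compare a recorded Stop against a reference list before calling PSS.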


Troubleshooting Failure to Reboot

Stop Messages Which can be Solved in the Field

  • 0x7b, (0x4,0,0,0), or 0x8b

    • Indicates problem with Master Boot Record

    • Scan for Viruses

    • Confirm correct Controller driver is loaded

    • Refresh Master Boot Record


Troubleshooting Failure to Reboot

After Reboot, Video Remains “Black”

  • Check for devices using IRQs 2, 9 or 12 (PCI)

  • Scan Hard Drive for Viruses


Troubleshooting Failure to Reboot

Stop Messages Which can be Solved in the Field

  • 0x1e or 0xa

    • Disable any Third-party services or drivers which were loaded prior to Upgrade

    • Use NTHQ to confirm appropriate Memory and IRQ settings


Troubleshooting GUI-Based Setup Issues

Setup Will Not Read From CD-ROM Drive

  • Make certain the CD-ROM drive is on the HCL

  • Copy I386 directory to the Hard Drive and start again from the beginning

  • Make certain that the Controller and/or Hard Drive is correctly configured


Troubleshooting GUI-Based Setup Issues

If Setup Fails During Copy of Files to Hard Drive

  • Disable all external Caches in BIOS

  • Make certain Hard Drives are terminated correctly; active termination is preferred


Setup Enhancements in Windows NT 4.0

Bootable CD-ROM

  • Supports only the El Torito Specification

  • Can only be used in ‘No Emulation Mode’

  • Must be supported by both System and SCSI BIOS


Setup Enhancements in Windows NT 4.0

Winnt Character-Based Setup Logging

  • Using Winnt or Winnt32 /L:

    • Logs all actions during character-based setup to find last successful action

    • Helps to isolate where setup halted without requiring special DLLs


Setup Enhancements in Windows NT 4.0

Restartable GUI-Based Setup

  • If the machine fails during GUI-mode Setup, the problem can be fixed and Setup will continue after the reboot


Agenda

  • Setup (build overview)

    • Three phases of Setup

      • Character-Based Setup

      • Boot from Character-Based to GUI-Based Setup

      • GUI-Based Setup

  • Troubleshooting

    • (Blue Screens & Stop Codes)

  • Latest information for NT 4.0

    • SP4



Debugging (the connection)

  • Connect

    • Modem, Null-modem cable, LAN

  • Boot.ini

    • /debug /debugport=com1 /baudrate=19200

  • Symbols

    • Retail NT CD, in the support\debug\[platform]\symbols subdirectory
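As an illustration of the Boot.ini step above, the sketch below appends the debug switches to one operating-system entry. The helper name `add_debug_switches` and the sample ARC path are assumptions for this example; the actual entry on a given machine will differ.

```python
def add_debug_switches(os_entry, port="com1", baud=19200):
    """Append kernel-debugger switches to one [operating systems] line of Boot.ini."""
    if "/debug" in os_entry.lower():
        return os_entry  # already configured for debugging; leave unchanged
    switches = f"/debug /debugport={port} /baudrate={baud}"
    return f"{os_entry.rstrip()} {switches}"


# Hypothetical entry, used only to show the shape of the resulting line.
entry = r'multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows NT Server Version 4.00"'
print(add_debug_switches(entry))
```

The port and baud rate default to the values on the slide; both ends of the connection need to agree on them.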


Interpreting Blue Screens

  • The error code and parameters at the top of the screen

  • The list of modules that have successfully loaded and initialized in the middle of the screen

  • The list of modules that are currently on the stack at the bottom of the screen


Stop Codes

Note: For a complete listing of stop codes, see the Windows NT Workstation 4.0 Resource Kit, Chapter 39, “Windows NT Debugger”, or article Q142657 on http://support.microsoft.com


Common Stop Codes

  • 0xA

  • 0x1E

  • 0x24

  • 0x3F

  • 0x50

  • 0x7B

  • 0x7F

  • 0xC000021A
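For quick reference, the codes above can be kept in a small lookup table. The sketch below uses only the symbolic names given on the slides that follow; the names `STOP_NAMES` and `describe_stop` are made up for this example.

```python
# Symbolic names as given in this presentation for each common stop code.
STOP_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x00000024: "NTFS_FILE_SYSTEM",
    0x0000003F: "NO_MORE_SYSTEM_PTES",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000007B: "INACCESSIBLE_BOOT_DEVICE",
    0x0000007F: "UNEXPECTED_KERNEL_MODE_TRAP",
    0xC000021A: "FATAL_SYSTEM_ERROR",
}


def describe_stop(code):
    """Format a stop code as eight hex digits plus its name, if known."""
    name = STOP_NAMES.get(code, "(not in this table)")
    return f"0x{code:08X} {name}"


print(describe_stop(0x1E))
print(describe_stop(0x7B))
```

Codes missing from the table are reported as such rather than guessed, so the complete listing in the Resource Kit remains the authority.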


0xA

  • 0x0000000A IRQL_NOT_LESS_OR_EQUAL

  • Description

    • An attempt was made to touch paged-out memory at a process interrupt request level (IRQL) that is too high. Code that runs at higher interrupt levels can’t touch paged-out memory because paging would be too expensive. If a pageable page has not actually been paged out and its virtual address range is still in the translation buffer, high-IRQL code can get away with touching it. But if the system is stressed, the memory manager will likely have paged that page out, and the bugcheck occurs when the page-in is attempted. This is why certain bugs tend not to show up on developers’ machines, which are less stressed than production systems.

  • Typical Scenarios

    • System configuration changes, virus scanners, other file I/O filters.


0x1E

  • 0x0000001E KMODE_EXCEPTION_NOT_HANDLED

  • Description

    • Essentially, this bugcheck identifies an error that occurred in a section of code where no error detection routines were in place. Most exceptions are generated directly in the section of code that is executing. In this case, the error was not trapped in the middle of the code that was executing. Therefore, the error was allowed to fall through to this default error handler. This makes the error a very common exception. The actual instruction fault is usually similar to a STOP 0xA – that is a memory access violation.

  • Typical Scenarios

    • Invalid or obsolete third-party driver or system service, Microsoft driver or system service bug, file I/O filter drivers.


0x24

  • 0x00000024 NTFS_FILE_SYSTEM

  • Description

    • A STOP 0x24 is the result of NTFS code that detects a problem with the structure of the NTFS file system. This is not a cut and dried exception code and debugging it is sometimes difficult. Disk corruption can generate a STOP 0x23 (FAT_FILE_SYSTEM) and 0x24. However any processes involved in reading or writing data from a FAT or NTFS file system could cause the disk data to appear corrupted. Therefore SCSI and IDE drivers as well as the disk structure itself (hard errors, i.e. bad blocks) can be suspect. The file system calls this bug check in multiple places and this will help us identify the actual source line that generated the bug check. Also, this bugcheck can be caused by I/O filter drivers (resource hangs, race conditions, etc.). After the above is eliminated, more low-level constructs such as file system synchronization objects, scb attributes, etc. need to be examined by the debug engineer.

  • Typical Scenarios

    • This bugcheck is encountered when the NTFS file system has a corruption, or the hard drive has a bad block.


0x3F

  • 0x0000003F NO_MORE_SYSTEM_PTES

  • Description

    • This stop isn’t as common as most of the others in this section, but a good explanation is warranted. A STOP 0x3F is the result of a system doing lots of I/O, thereby fragmenting the system PTEs. The bugcheck occurs not because the system is out of PTEs, but because a driver requests a huge chunk of memory that can’t be satisfied because a contiguous block that big isn’t available.

  • Typical Scenarios

    • Often video drivers will allocate large amounts of kernel memory that must succeed. Also, some backup programs do the same.

    • For these situations, consult a PSS engineer for the Registry hack that allows the increase of total system PTEs.
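The distinction above, plenty of free PTEs in total but no contiguous run large enough, can be shown with a toy model. This is not how the NT memory manager is implemented; the list of booleans and the function name `largest_free_run` are assumptions for illustration only.

```python
def largest_free_run(free_map):
    """Length of the longest run of consecutive free (True) entries."""
    best = run = 0
    for is_free in free_map:
        run = run + 1 if is_free else 0
        best = max(best, run)
    return best


# Toy map of 1000 entries where every other entry is in use:
# half of the entries are free, but no two free entries are adjacent.
free_map = [i % 2 == 0 for i in range(1000)]

total_free = sum(free_map)
request = 8  # a driver asks for 8 contiguous entries
can_satisfy = largest_free_run(free_map) >= request

print(total_free, largest_free_run(free_map), can_satisfy)
```

In this model the request fails even though far more than 8 entries are free, which mirrors the fragmentation case described on the slide.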


0x50

  • 0x00000050 PAGE_FAULT_IN_NONPAGED_AREA

  • Description

    • A STOP 0x50 is caused when a memory region that is not supposed to be paged out (usually for performance reasons) is paged out. This stop can be caused by a variety of problems including corrupt NTFS volumes, bad network packet data, and in general kernel mode drivers that corrupt memory. Another cause is a driver that frees an MDL without communicating that to all portions of the driver. Others include Disk, Controller, and Disk Driver problems.

  • Typical Scenarios

    • Usually third-party kernel mode drivers munging memory, or reading beyond allowable memory. Also, when the file system is pushed to the tested limits (large Mac volumes), bugs in NTFS are exposed that result in this STOP. This STOP can occur due to interaction problems between SCSI Controller firmware and Hard Drive firmware.


0x7B

  • 0x0000007B INACCESSIBLE_BOOT_DEVICE

  • Description

    • During the initialization of the I/O system, the driver for the boot device may have failed to initialize the device that the system is attempting to boot from, or the file system that is supposed to read that device may have either failed its initialization or simply not recognized the data on the boot device as a file system structure.

    • If this is the initial setup of the system, this error may have occurred because the system was installed on an unsupported Hard Disk or SCSI Controller.

    • This error can also be caused by the installation of a new SCSI Adapter or Hard Disk Controller or by repartitioning the Hard Disk with the System Partition.

  • Typical Scenarios

    • VIRUS

    • LBA type problems, MBR type problems, SCSI Controller/Hard Drive geometry issues, etc.


0x7F

  • 0x0000007F UNEXPECTED_KERNEL_MODE_TRAP

  • Description

    • This error means a trap occurred in kernel mode, either a kind of trap that the kernel is not allowed to have or catch (a bound trap), or a kind of trap that is always instant death (double fault).

  • Typical Scenarios

    • Hardware, kernel mode drivers that manipulate critical system data in an untimely fashion.

    • This STOP is most often the result of the processor taking a double fault: 0x7F (8,0,0,0). Note that these parameters can also show up for a modern software issue involving Netmon (bhnt.sys).
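The first parameter of a STOP 0x7F identifies which processor trap occurred, which is how (8,0,0,0) indicates a double fault. The sketch below maps a few x86 trap numbers to names; the table is deliberately partial and the names `TRAP_NAMES` and `describe_trap` are assumptions for this example.

```python
# Partial table of x86 trap numbers; only a few well-known entries are listed.
TRAP_NAMES = {
    0x00: "divide error",
    0x05: "bound trap (BOUND range exceeded)",
    0x06: "invalid opcode",
    0x08: "double fault",
}


def describe_trap(first_param):
    """Name the trap given by the first parameter of a STOP 0x7F."""
    return TRAP_NAMES.get(first_param, f"trap 0x{first_param:X} (not in this table)")


print(describe_trap(0x8))
```

Anything not listed is reported as unknown rather than guessed.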


0xC000021A

  • 0xC000021A FATAL_SYSTEM_ERROR

  • Description

    • This is a typical description that accompanies this error: The Windows Subsystem System process terminated unexpectedly with a status of (0x6130F2B6 0x01B6FBA4). The system has been shut down.

    • The failing process sometimes is listed in the blue screen itself.

    • This bugcheck occurs when a user-mode subsystem such as Winlogon or CSRSS is fatally compromised such that security cannot be guaranteed. The Operating System makes a transition into kernel mode and throws this exception.

  • Typical Scenarios

    • A typical cause of this crash would be an extensible perfmon counter that overwrites its Winlogon shared data buffer (Q171033), and in general any access violation that compromises a user-mode subsystem.



Agenda

  • Setup (build overview)

    • Hardware Compatibility List

    • Three Phases of Setup

    • Character-Based Setup

    • Boot from Character-Based to GUI-Based Setup

    • GUI-Based Setup

  • Troubleshooting

    • (Blue Screens & Stop Codes)

  • Latest Information for NT 4.0

    • SP4



NT4 Service Pack 4

  • Contents

    • Hotfixes for important customer-reported problems

    • Resource and memory leak bugfixes from NT5

    • 30+ support, diagnostic and repair tools from the NT Resource Kit are included on the SP4 CD-ROM

    • Event log entries for clean and dirty shutdown

  • Process Improvements

    • Dedicated Service Pack test team

    • Beta Program for Service Packs

    • Improving the Knowledge Base, depth and ease of use

    • Slipstreaming Service Packs into OEM releases


Resource / Memory Leaks

  • Problem

    • Leaks lead to hung systems and bluescreen crashes

    • Some customers do “preventive reboots”

    • Difficult to stop or kill the offending process

  • Solutions

    • Fix leaks: several hundred in NT5, key fixes in NT4 SP4

    • Job objects in NT5, set memory limits on a collection of processes

    • Visual Studio adding leak checking to MFC and CRT

  • Next Work Items

    • Better leak detection

    • Logging in under low resource conditions

    • Stopping and killing processes


Bugchecks (Blue Screens)

  • Kernel mode code detected a serious error

    • Blue screens are still frequent and very hard to diagnose

    • Crash dumps take too long on large memory systems

  • Prevention

    • Find and fix bugs in our code

    • Review all calls to KeBugCheck by NT5 RTM

  • Improve diagnosis

    • Reduced clutter on the blue screen, focus on key data, and add hints

    • Crash dumps are now dramatically faster in NT5

    • Developing comprehensive crashdump analysis tools for NT4 and NT5


Bugchecks (Blue Screens)

Stop 0x0000001E ( 0xC0000005, 0xFDE38AF9, 0x00000001, 0x7E8B0EB4 )

KMODE_EXCEPTION_NOT_HANDLED

Address <x> has base at <x> - <filename> <manufacturer> <version>

If this is the first time you've seen this Stop error screen, restart your computer. If this screen appears again, follow these steps:

Check to make sure any new hardware or software is properly installed. If this is a new installation, ask your hardware or software manufacturer for any Windows NT updates you might need.

If problems continue, disable or remove any newly installed hardware or software. Disable BIOS memory options such as caching or shadowing. If you need to use Safe Mode to remove or disable components, restart your computer, press F8 to select Advanced Startup Options, and then select Safe Mode.

Refer to your Getting Started manual for more information on troubleshooting Stop errors.


3rd Party Drivers

  • Problem

    • One of the most common complaints from PSS

    • Source of pool corruption - difficult to diagnose

  • Solution

    • DDK driver samples and documentation are improved in NT5

    • Enhanced driver testing in NT4 and NT5, including pool corruption tests

    • NT5 will have driver signing, “warning” level by default

    • WDM drivers will drive higher quality

    • We are testing major third-party anti-virus software regularly


Unnecessary Reboots in NT5

  • Problem

    • Hardware and software configuration and maintenance

  • Solutions

    • Fixed 50 software configuration cases which required a reboot in NT4. Key fixes include:

      • Adding, removing and configuring network protocols; changing IP addresses

      • Reconfiguring settings on PCI and other PnP hardware

    • Reboots still required for some rare cases

      • Machine name change, domain membership changes, system locale and system font changes, service pack installation

    • Hardware reconfiguration by clustering solutions in NTS/E

    • Where possible, hotfixes will avoid requiring a reboot


Diagnosis and Recovery

  • Recovery Involves

    • Detection (hard with a hung application or server)

    • Diagnosis (need good tools, need parallel installs, bad error messages)

    • System Recovery (chkdsk, crash dump biggest time hits)

    • Application recovery (SQL, Exchange Store, etc)

  • We are delivering

    • 30+ of the most critical support, diagnostic, and repair tools in SP4 and NT5 B2

    • Fixing 35 worst error messages by B2+30, then next 200 as time allows

    • NT5 Safe-mode Boot today and Floppy Boot by NT5 RTM

      • Both support NTFS

    • Web-based trouble-shooter for most common bluescreens

    • Online chkdsk post NT5


NT Test Initiatives

  • Long duration Server stress

    • 10 Servers running stress for a month+ starting at NT5 Beta 2

    • Mix of stress including BackOffice, IIS, Client/Server, etc

    • Specifically watching for memory and resource leaks

  • Improved driver testing for NT4 and NT5

    • Catch pool corruption

    • Fault injection

  • Better integration testing of Server applications

    • BackOffice applications: Exchange, SQL Server

    • Using automated scripts from BackOffice teams

    • Testing with Oracle, SAP R/3, Lotus Notes

    • 100 Top Server Applications from Tier 1 RDP customers

  • Expanded tests for customer configurations

    • RDP Customer configurations, ISP


Resource Kit Tools

  • Network Diagnostic and Support Tools

    • nettest - quickly determine whether the local user's network is configured properly (IDW)

  • Applications, Service Problems and Memory Leaks

    • memsnap - detection of memory and resource leaks over time (dump directory)

  • Disk Problems

    • fixacls - resets ACLs on system files to installation defaults, fixes users who hose their ACLs

  • Debugger Tools

    • debug wizard - easy setup of debuggers for customers

  • Other

    • windiff - file compare utility; critical for many situations (reskit)


Event Log Analyst

  • Prototype tool for collecting and analyzing event log reliability data

  • Designed for collecting reliability trend data from an entire datacenter in a few hours

    • Collected data from 800+ CDC servers in 5 hours

    • Analysis is manual with Excel, less than 3 hours

  • Provides trend analysis of reboots, bugchecks, and Dr Watsons




Event Log Analyst Metrics

  • Mean time between reboots

  • Mean time between bugchecks

  • Mean time between Dr Watsons

  • Trend analysis of reboots/server-year

  • Trend analysis of bugchecks/server-year

  • Trend analysis of Dr Watsons/server-year

  • Bugcheck distribution

  • Dr Watson distribution

  • SP4 Only: Availability percentage

  • SP4 Only: Mean time to repair
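Several of the metrics above reduce to simple arithmetic on event timestamps. The sketch below shows one plausible way to compute them; it is not the Event Log Analyst tool itself, and the function names and sample dates are assumptions for illustration.

```python
from datetime import datetime, timedelta


def mean_time_between(events):
    """Average gap between consecutive event timestamps, or None if fewer than two."""
    ordered = sorted(events)
    if len(ordered) < 2:
        return None
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def events_per_server_year(event_count, servers, period):
    """Normalize an event count by the number of server-years observed."""
    server_years = servers * period / timedelta(days=365)
    return event_count / server_years


def availability_percent(period, downtime):
    """Share of the observation period during which the server was up."""
    return 100.0 * (1 - downtime / period)


# Hypothetical reboot times for a single server.
reboots = [datetime(1998, 1, 1), datetime(1998, 1, 11), datetime(1998, 1, 31)]
print(mean_time_between(reboots))
print(events_per_server_year(20, servers=10, period=timedelta(days=73)))
print(availability_percent(timedelta(hours=100), timedelta(hours=1)))
```

Mean time to repair would follow the same pattern: total downtime divided by the number of outages, which needs the clean and dirty shutdown events that SP4 adds to the event log.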


Tools for NT4 SP4 and NT5

  • Network Diagnostic and Support Tools

    • browstat - only useful tool for diagnosing browser problems (reskit)

    • dhcpcmd - useful for fixing DHCP issues (reskit)

    • dnscmd - diagnose and repair DNS problems (reskit)

    • eseutil - used for WINS and DHCP database diagnosis and repair

    • nettest - quickly determine whether the local user's network is configured properly (IDW)

    • winscl - diagnose and repair WINS (reskit)

    • winsadd - command line tool for batching static and dynamic entries in WINS

    • nltest - used for resetting secure channels, diagnosing and fixing trust problems (reskit)


Tools for NT4 SP4 and NT5

  • Applications, Service Problems and Memory Leaks

    • depends - display and troubleshoot application dependency problems (IDW)

    • tlist - list running processes, used in conjunction with kill (reskit)

    • kill - forcibly terminate processes (reskit)

    • memsnap - detection of memory and resource leaks over time (dump directory)

    • pmon - detection of memory and resource leaks over time (reskit)

    • pviewer - gather extended information about running processes (reskit)

    • reg - registry utility, used for diagnosis and repair of many types of issues


Tools for NT4 SP4 and NT5

  • Disk Problems

    • disksave - saves and restores the MBR (reskit)

    • fixacls - resets ACLs on system files to installation defaults, fixes users who hose their ACLs

    • ftedit - used daily to help customers repair fault tolerant volumes (reskit)

  • Debugger Tools

    • gflags - set global flags needed for various kinds of debugging (IDW)

    • remote - allow remote debugging by PSS (reskit)

    • debug wizard - easy setup of debuggers for customers

    • all standard debuggers - already ship in the /support directory


Tools for NT4 SP4 and NT5

  • Other

    • uptomp - update system from uniproc to multiproc (reskit)

    • robocopy - used daily by PSS during support calls, easiest way to move large amounts of data around very quickly.

    • shutdown - remote shutdown of systems (reskit)

    • ntevntlg.mdb & ntmsgs.hlp - better error message docs (reskit)

    • windiff - file compare utility; critical for many situations (reskit)

    • dumpel - dump event log messages from local or remote systems (reskit)

    • list - used daily by PSS for reviewing exceedingly large log files, etc.


Summary

  • Best Practices matter

    • Mature, disciplined planning & procedures

    • Design, Implement, Test

    • Configuration & Operational control

  • Technology matters

    • OS system services

    • UPS, RAID, ECC Memory, multi-homing

    • Cluster Services

  • We can deliver availability with Windows NT today

  • Microsoft is investing heavily in availability


References and Resources

  • http://www.microsoft.com/ntserver/

  • http://www.microsoft.com/ntworkstation/

  • http://www.microsoft.com/windowsnt5/

  • http://www.microsoft.com/hwtest/

  • http://support.microsoft.com/

  • http://support.microsoft.com/support/kb/articles/q103/0/59.asp

    • Descriptions of Bug Codes for Windows NT


References and Resources

  • Inside Windows NT Second Edition, David A. Solomon

    MS Press 1998

  • Windows NT Workstation 4.0 Resource Kit

    • Chapter 19: “What Happens When You Start Your Computer”

    • Chapter 21: “Troubleshooting Startup and Disk Problems”

    • Chapter 36: “General Troubleshooting”

    • Chapter 39: “Windows NT Debugger”, or Q142657 article

  • Supporting Windows NT Server in the Enterprise

    MS Press 1998

    • Chapter 7: “Troubleshooting Tools and Methods”


Questions?