SCO Unix Diagnostics and Troubleshooting Alexander Sack ( alexs@sco.com ) Senior Software Engineer


Intro

Initial System Load (ISL)

Common Hardware and Driver Issues

System Tuning

Networking Tips

Reporting Problems

Q & A

Agenda
Before installing…

Has the system itself been certified by the OEM?

Is the motherboard in the CHWP? (Intel whitebox)

Is it compatible kinda sorta maybe?

Do I need a third-party HBA diskette?

Network card supported?

Does X support my graphics chipset?

Disk layout issues, multi-boot?

ISL: Overview
“Alt-SysReq-H” or “Alt-Ctrl-H” to enter console mode

“Alt-SysReq-F1” or “Alt-Ctrl-F1” to go back to install screens

Access to resmgr and the ISL scripts (/isl/ui_modules); note any console messages during install

IVAR_DEBUG_ALL=1

Dumps log files in /tmp/log

Transfer logs to floppy via cpio

E.g. find /tmp/log/* | cpio -oc -O /dev/dsk/f03ht

cpio -ic -I /dev/dsk/f03ht

ISL: Debugging
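The log transfer above is easy to mistype at a console prompt. A minimal sketch in Python, assuming you stage the commands on another machine (Python is not on the ISL media); `build_cpio_commands` is a hypothetical helper that only assembles the two command lines from the slide and the file list that would be piped in. It runs nothing.

```python
import os
import tempfile


def build_cpio_commands(log_dir="/tmp/log", device="/dev/dsk/f03ht"):
    """Return (file_list, write_cmd, read_cmd) for copying ISL logs to floppy.

    Nothing is executed. file_list is what would be fed to cpio on stdin.
    Only the top level of log_dir is listed, a simplification of `find`.
    """
    files = sorted(os.path.join(log_dir, name) for name in os.listdir(log_dir))
    write_cmd = ["cpio", "-oc", "-O", device]  # on the system being installed
    read_cmd = ["cpio", "-ic", "-I", device]   # on the machine reading the floppy
    return files, write_cmd, read_cmd
```

Keeping the device node as a parameter makes it harder to write with one node and read back with another.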
Problem: Installation sees more processors than actually present

Reasons:

Bad MPS tables

Cores listed as physical CPUs in BIOS

Limited ACPI support (OSR5 only)

Solution:

Boot in single processor mode (ATUP) and apply latest MP/SMP pack

ACPI=Y, USE_XAPIC=Y, ENABLE_JT=Y, MULTICORE=N

Flash BIOS

ISL: Issues
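Boot parameters such as the ones above are NAME=VALUE pairs, and it helps to record exactly which ones were changed while debugging. A minimal sketch in Python, assuming the pairs are separated by commas or whitespace as written on the slide; `parse_boot_params` and `diff_boot_params` are hypothetical helpers, not SCO tools.

```python
def parse_boot_params(text):
    """Parse NAME=VALUE pairs separated by commas and/or whitespace."""
    params = {}
    for token in text.replace(",", " ").split():
        name, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"not a NAME=VALUE pair: {token!r}")
        params[name] = value
    return params


def diff_boot_params(before, after):
    """Return {name: (old, new)} for every parameter that differs."""
    names = sorted(set(before) | set(after))
    return {
        n: (before.get(n), after.get(n))
        for n in names
        if before.get(n) != after.get(n)
    }


slide = parse_boot_params("ACPI=Y, USE_XAPIC=Y, ENABLE_JT=Y, MULTICORE=N")
print(diff_boot_params({"ACPI": "N"}, slide))
```

A diff like this is also a convenient thing to attach when reporting a problem.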
Problem: Kernel hangs on boot-up

Reasons:

Missing interrupts

Mixed stepping processors

Solution:

Boot in single processor mode (ATUP)

Swap the mixed-stepping processors so that the LOWER stepping processor is in slot 1

Check BIOS settings, ACPI vs. MPS

Move add-on PCI card to a different slot

PnP set to OFF in BIOS

ISL: Issues
Problem: Cannot load an HBA from USB floppy

Reasons:

BIOS does not support legacy mode (OSR5 only)

“Device enumeration timeout”

USB is disabled in the BIOS

ISL CD left in tray

Solution:

Check USB BIOS settings

Re-plug USB floppy device, verify sdiconfig output on console

Follow TA article on renaming disk nodes

Remove CD before load

Make sure disk was created correctly, dd image to p0 not s0

Try a different USB floppy device

ISL: Issues
Problem: Root HBA not found after the DCU runs

Reasons:

Didn’t load the right third-party HBA

Software based RAID issues

Valid media kit

USB floppy wasn’t really picked up (ISL will use CD1 for HBA drivers from an ATAPI drive)

Solution:

Disconnect USB floppy after HBA loads

Bind third-party resmgr entry to HBA driver manually via DCU

Check resmgr entry BOARDID and verify that HBA really supports the card

Download a later driver from IHV website

ISL: Issues
Problem: SATA or IDE hangs after loading or fails to recognize my devices

Reasons:

Missed interrupts (polling messages)

DMA incompatibility

Driver in slave only configuration (OSR6/UW7)

SATA/PATA card uses custom third-party driver (e.g. Adaptec, Silicon Image, Marvell)

Solution:

Check cables and jumpers

Change mode in BIOS: Legacy, Compatible, Enhanced, AHCI

ATAPI_DMA_DISABLE=Y

Avoid cable select (legacy PATA)

ISL: Issues
Problem: Red screen during mount of CD

Reasons:

Missed interrupts (polling messages)

DMA incompatibility

Driver in slave only configuration (OSR6/UW7)

SATA/PATA card uses custom third-party driver (e.g. Adaptec, Silicon Image, Marvell)

Solution:

Check cables and jumpers

Change mode in BIOS: Legacy, Compatible, Enhanced, AHCI

ATAPI_DMA_DISABLE=Y

Avoid cable select (legacy PATA)

ISL: Issues
Problem: NIC is not auto-detected

Reasons:

Driver on ISL media is older than card

Driver issues with card, driver loads but fails

Solution:

Defer networking and pkgadd the drivers after install

After install, use SCOadmin Network to configure card

Bind entry to particular NIC driver if card is within the same family via DCU

Stick in another card!

ISL: Issues
Problem: vfs_mountroot() failure

Reasons:

Driver on ISL media is older than card

Driver issues with card, driver loads but fails

“$static” not added to ROOT HBA sdevice file

Solution:

Follow TA to mount disk from ISL

Use the RECUT media

Make sure you are using the latest HBA driver

ISL: Issues
Problem: Screen goes blank after logo appears

Reasons:

VESA mode is not supported by card

On-board chipset uses system memory for framebuffer

Solution:

AGP Gart is now supported, install latest maintenance pack

USE_VESA_BIOS=Y

Use a supported graphics chipset!

ISL: Issues
Problem: Filesystem is left dirty after ISL and every reboot

Reasons:

Aggressive BIOS Power Management

RAID battery failure

Target issues – CHECK CONDITIONS

Older driver and the write cache

Solution:

Check RAID battery levels

Check HBA and target firmware revision

Update to latest driver

ISL: Issues
Problem: Installed one OS and another one won’t boot

Reasons:

OSR5 8GB limit

UW7/OSR6 128GB limit

OSR5 on the first partition of a drive is recommended

MBR rewritten

Solution:

Boot from CD1 and run the UW7/OSR6 fdisk to rewrite the MBR

Use a third-party boot loader like GRUB

ISL: Issues
Problem: Failing to create large logical volumes

Reasons:

VXFS technical 2TB limit

OSR6/UW7 1TB physical capacity limit

HTFS has issues (slow) with filesystems larger than 1TB

RAID utility issues

Solution:

Use VXFS and ODM

Split volumes in 1TB chunks

Always set up volumes with the RAID BIOS or OEM utility if possible

ISL: Issues
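Splitting a large array into 1TB chunks is simple arithmetic, but it is worth checking before committing a RAID layout. A minimal sketch in Python; `split_into_chunks` is a hypothetical helper, and treating 1TB as 1024^4 bytes is an assumption, since the slide does not say whether binary or decimal units are meant.

```python
TB = 1024 ** 4  # assumption: binary terabyte


def split_into_chunks(total_bytes, chunk_bytes=TB):
    """Return volume sizes, each at most chunk_bytes, that sum to total_bytes."""
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be non-negative and chunk size positive")
    full, rest = divmod(total_bytes, chunk_bytes)
    sizes = [chunk_bytes] * full
    if rest:
        sizes.append(rest)  # the last volume takes whatever is left over
    return sizes


# A 2.5TB array becomes two 1TB volumes plus one 0.5TB volume.
print([size / TB for size in split_into_chunks(5 * TB // 2)])
```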
Problem: ISL load time is very slow

Reasons:

ATAPI DMA is disabled

Write caching is disabled

Media errors

Faulty hardware

Solution:

Check IDE/SATA settings

Some OEMs disable write caching, which makes the install slow (future boot parameter)

Check hardware and BIOS settings

ISL: Issues
Problem: Kernel link failure at end of ISL

Reasons:

IRQ conflicts in System driver file

Driver configuration build error

Solution:

Check BIOS settings

Disable serial or legacy devices you don’t need

Chroot into fresh install and check build files

Update HBA drivers if available

ISL: Issues
Problem: Kernel panics on boot-up

Reasons:

Full moon out

You weren’t nice to the machine that day

The customer is out to get you

Solution:

Boot in single processor mode

Disable USB via boot parameter or BIOS

If possible, note the stack trace to identify the error

Cry to the OEM

Cry to SCO support

ISL: Issues
Migrating OSR5 disk to OSR6

Install wd supplement before migration!

Administer the disk at the source system FIRST before migration

OSR6 Divvy now works on OSR5 (wd) and OSR6 disks

Limitations:

There is no conversion for UW VTOC disks to dual format OSR6

OSR6 does not support extended VTOC slices

Always back up your data before migration!

Hardware and Driver Issues: Disk migration
All Intel based processors are multi-core!

ACPI is required to fully support multi-core (OSR6/UW7)

OSR5 supports multi-core provided MPS tables are sane – has some ACPI support (HT)

OEMs have stopped testing MPS tables!

SCO licenses per CPU package not core (industry standard)

Mixed steppings headaches

Hardware and Driver Issues: Multi-core
What driver to use?

If in doubt, always use the driver diskette with the higher IHVVERSION in it!

Supported cards can be found in the Drvmap files of the HBA driver/btld package

http://pciids.sourceforge.net/

Sometimes adding an OEM-branded BOARDID will work – sometimes it will panic your system!

“echo pcilong | ndcfg”

Management utilities are packaged with the driver if available

Recut media and maintenance packs include latest drivers

Read the README posted on the SCO download area!

Hardware and Driver Issues: HBAs
Migrating from OSR5 to OSR6

DO NOT BLINDLY import OSR5 tunables into OSR6

E.g. buffer cache has different use on OSR6

Identify the performance problem you are trying to solve first! [ GOLDEN RULE ]

Take measurements

/etc/conf/bin/idtune

SCOadmin has wrapper for idtune

System Tuning: General
Performance Tuning

Identify bottleneck

rtpm, prfstat, sar, prof, lprof

CPU performance

sar -u

00:00:00  %usr  %sys  %wio  %idle  %intr
00:00:01    30    10    10     46      4

high usr, investigate with truss, prof

high sys, intr, investigate with prfstat

high wio, storage throughput

System Tuning: Performance
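The triage rules above can be applied mechanically to each `sar -u` sample. A minimal sketch in Python, assuming the column order shown on the slide (time, %usr, %sys, %wio, %idle, %intr); `triage_sar_u` is a hypothetical helper, and the 50% default threshold is an arbitrary illustrative cut-off, not a SCO recommendation.

```python
def triage_sar_u(line, threshold=50):
    """Suggest where to look next for one `sar -u` data line.

    Expects: time %usr %sys %wio %idle %intr. The threshold is illustrative.
    """
    fields = line.split()
    usr, sys_, wio, idle, intr = (float(x) for x in fields[1:6])
    hints = []
    if usr >= threshold:
        hints.append("high usr: investigate with truss, prof")
    if sys_ >= threshold or intr >= threshold:
        hints.append("high sys/intr: investigate with prfstat")
    if wio >= threshold:
        hints.append("high wio: look at storage throughput")
    return hints


# The sample line from the slide is below the default threshold everywhere.
print(triage_sar_u("00:00:01 30 10 10 46 4"))
```

Pick the threshold from your own baseline measurements; the point is only to apply the same rule to every sample.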
Storage Performance

Hardware configuration

Device topology

don’t connect slow devices and fast devices on the same bus e.g. put your slow tape drive on a separate controller

Cabling

ensure your cables are up to specifications

Hardware RAID

performance (RAID 0) vs. integrity (RAID 1, RAID 5)

Filesystem tuning

fsadm, block size, increase logsize (@ mkfs only)

mount options; tmplog

ODM dramatic performance boost for $99

System Tuning: Storage
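The performance versus integrity trade-off between RAID levels can be put in numbers. A minimal sketch in Python using the standard capacity formulas for RAID 0, 1 and 5; `raid_summary` is a hypothetical helper that assumes identical disks and treats RAID 1 as a single mirror set.

```python
def raid_summary(level, disks, disk_size_gb):
    """Return (usable_gb, disk_failures_survived) for RAID 0, 1 or 5."""
    minimum = {0: 2, 1: 2, 5: 3}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == 0:
        return disks * disk_size_gb, 0      # striping only, no redundancy
    if level == 1:
        return disk_size_gb, disks - 1      # every disk holds a full copy
    return (disks - 1) * disk_size_gb, 1    # one disk's worth of parity


for level, disks in ((0, 4), (1, 2), (5, 4)):
    print(level, raid_summary(level, disks, 500))
```

RAID 0 gives the most space and no protection; RAID 5 gives up one disk of capacity to survive one failure.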
Memory

Avoid swapping

DEDICATED_MEMORY, use if using shared memory

mkdev dedicated

Dedicated memory reserves physical memory

Saves kernel virtual address space

Reduces paging, uses large mappings (PSE)

SEGKMEM_PSE_BYTES

Add more memory!

System Tuning: Memory
Tuning for largefile support

HDATLIM, SDATLIM, HVMMLIM, SVMMLIM, HFSZLIM, SFSZLIM set to 0x7fffffff (unlimited)

/etc/conf/bin/idbuild -B && init 6

fsadm /mountpoint or raw device

fsadm -o largefiles /

OSR6 defaults to largefiles, UW7 does not

Building large file aware applications

-D_FILE_OFFSET_BITS=64

System Tuning: Filesystem
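Setting six tunables by hand invites typos, so it can help to generate the command list first and review it. A minimal sketch in Python, assuming the usual `idtune NAME VALUE` invocation (check the idtune manual page on your release); `largefile_commands` is a hypothetical helper that only prints commands and executes nothing.

```python
LARGEFILE_TUNABLES = (
    "HDATLIM", "SDATLIM", "HVMMLIM", "SVMMLIM", "HFSZLIM", "SFSZLIM",
)
UNLIMITED = "0x7fffffff"  # value given on the slide


def largefile_commands(idtune="/etc/conf/bin/idtune"):
    """Return the shell commands for the largefile limits, then the rebuild."""
    cmds = [f"{idtune} {name} {UNLIMITED}" for name in LARGEFILE_TUNABLES]
    cmds.append("/etc/conf/bin/idbuild -B && init 6")  # rebuild kernel, reboot
    return cmds


for cmd in largefile_commands():
    print(cmd)
```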
Network configuration

netconfig

drivers installed in /etc/inst/nd/

bcfg files are parsed by ndcfg

/etc/confnet.d/inet/interface is configured

at boot, /etc/tcp (cf. S69inet on UW) is run to link the driver into dlpi: initialize -U

STREAMS based network stack

ndcfg

useful for displaying info about the system

geared toward network device driver writers

Networking Tips: Configuration
Network monitoring & tuning tools

netstat

ifconfig

inconfig

ndstat

ndcfg

traceroute

ping

tcpdump

dlpid logging

dlpid -l <logfile> /etc/inst/nd/dlpidPIPE

or edit /etc/default/dlpid

LOG=<logfile>

NIC failover

automatically and transparently switch to a backup NIC in the event of failure of the primary

Chains of backup NICs supported

Networking Tips: Tuning and Tools
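The idea of a failover chain is that traffic moves to the first backup that is still healthy. A minimal conceptual sketch in Python; it models only the selection rule, not how the SCO failover feature is implemented, and `active_nic` and the interface names are hypothetical.

```python
def active_nic(chain, is_up):
    """Return the first NIC in the ordered chain that is up, else None."""
    for nic in chain:
        if is_up(nic):
            return nic
    return None


# Primary net0 has failed, so the first healthy backup takes over.
link_state = {"net0": False, "net1": True, "net2": True}
print(active_nic(["net0", "net1", "net2"], link_state.get))
```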
Network is UP but can’t connect to other systems

is DNS configured correctly?

netstat -rna

do you have a default route?

Network performance is poor

check cabling

ndstat -l

collisions

inconfig

nfsstat

Networking Tips: Common Issues
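The default-route check above can be automated over saved `netstat -rn` output. A minimal sketch in Python, assuming the default route appears as a line whose first field is `default` or `0.0.0.0`; the exact layout varies by release, and the sample text is made up for illustration, not captured from a real system.

```python
def has_default_route(netstat_output):
    """Return True if any routing-table line starts with a default destination."""
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("default", "0.0.0.0"):
            return True
    return False


# Illustrative sample only.
sample = """Destination  Gateway        Flags
default      132.147.103.1  UG
127.0.0.1    127.0.0.1      UH
"""
print(has_default_route(sample))
```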
Network responds to pings but can’t login

are the daemons running?

licensed?

Multiple hosts with the same IP or MAC

arp -an (-n disables name resolution)

? (132.147.103.1) at xx:xx:xx:xx:xx:xx (802.3)

? (132.147.103.9) at xx:xx:xx:xx:xx:xx (802.3)

Stopping and starting the interface

ifconfig net0 down

/etc/tcp stop – daemons stopped, NIC is UP

/etc/tcp shutdown – everything down

/etc/nd stop start

Networking Tips: Common Issues
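Spotting a duplicate in a long `arp -an` listing by eye is error-prone. A minimal sketch in Python, assuming lines in the `? (ip) at mac (802.3)` form shown on the slide; `duplicate_macs` is a hypothetical helper and the MAC addresses in the sample are invented. One MAC answering for several IPs can be legitimate (aliases, proxy ARP), so treat hits as leads, not proof.

```python
import re
from collections import defaultdict

ARP_LINE = re.compile(r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9A-Fa-f:]+)")


def duplicate_macs(arp_output):
    """Map each MAC seen with more than one IP to its sorted list of IPs."""
    by_mac = defaultdict(set)
    for match in ARP_LINE.finditer(arp_output):
        by_mac[match.group("mac").lower()].add(match.group("ip"))
    return {mac: sorted(ips) for mac, ips in by_mac.items() if len(ips) > 1}


# Invented sample in the slide's format.
sample = (
    "? (132.147.103.1) at 00:11:22:33:44:55 (802.3)\n"
    "? (132.147.103.9) at 00:11:22:33:44:55 (802.3)\n"
    "? (132.147.103.12) at 00:11:22:33:44:66 (802.3)\n"
)
print(duplicate_macs(sample))
```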
crash

Primarily used for panic analysis

/var/spool/dump

dumpmemory to generate a crash dump on a live system

crash -a <dumpfile> will produce a listing suitable for SCO support

provide dumpfile, /stand/unix, all of /etc/conf/mod.d, /usr/sbin/crash

Useful crash commands

ps, as, trace, u, eng, od, addstruct, help

walk data structures using od

od -f

ksh style history buffer

lsof, can save hours of fun on a live system

Reporting Problems
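Before sending a dump to support, it is worth confirming that every companion file listed above is actually present. A minimal sketch in Python; `bundle_status` is a hypothetical helper, the dump file name in the example is an assumption, and the `exists` parameter is there only so the check can be tried without touching a real system.

```python
import os

# Items the slide says to provide alongside the dump file.
SUPPORT_FILES = ["/stand/unix", "/etc/conf/mod.d", "/usr/sbin/crash"]


def bundle_status(dumpfile, extra=SUPPORT_FILES, exists=os.path.exists):
    """Return (present, missing) lists for the files support asks for."""
    wanted = [dumpfile] + list(extra)
    present = [path for path in wanted if exists(path)]
    missing = [path for path in wanted if not exists(path)]
    return present, missing


# Hypothetical dump name under the directory the slide mentions.
present, missing = bundle_status("/var/spool/dump/dumpfile")
print("missing:", missing)
```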
When reporting problems to support:

Establish a reproducible case (if possible)

Save any crash related files

Note stack trace, crash -a

Save system log files

/var/adm/

Include hardware specs when filing a bug

run sysinfo

Be aware of changes made to /stand/boot

bootparam

Reporting Problems