Ta19 vi3 advanced log analysis
Download
1 / 38

TA19 VI3 Advanced Log Analysis - PowerPoint PPT Presentation


  • 110 Views
  • Uploaded on

TA19 VI3 Advanced Log Analysis. Mostafa Khalil VCP, VMware Product Support Engineering. Housekeeping. Please turn off your mobile phones, blackberries and laptops

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'TA19 VI3 Advanced Log Analysis' - ronat


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Ta19 vi3 advanced log analysis

TA19VI3 Advanced Log Analysis

Mostafa KhalilVCP, VMware Product Support Engineering


Housekeeping
Housekeeping

  • Please turn off your mobile phones, blackberries and laptops

  • Your feedback is valued: please fill in the session evaluation form (specific to that session) & hand it to the room monitor / the materials pickup area at registration

  • Each delegate to return their completed event evaluation form to the materials pickup area will be eligible for a free evaluation copy of VMware’s ESX 3i

  • Please leave the room between sessions, even if your next session is in the same room as you will need to be rescanned


Agenda
Agenda

  • ESX Server Boot process

  • Locating logs

  • How to read the logs

  • Interpreting log entries

  • Making sense of it all


Esx server 3 0 x 3 5 boot process

boot loader

initrd

VMkernel

vmnix

VMware init

scripts

init scripts

/sbin/init

ESX Server 3.0.x/3.5 Boot Process

  • Boot loader in MBR points to the boot device

  • Grub.conf lists the boot menu options.

  • Selected menu provides:

    • Location of root partition. Uses UUID instead of device name (e.g. /dev/sda)

    • RAM Disk file name relative to /boot location


Esx server 3i boot process
ESX Server 3i Boot Process

boot loader

initrd

VMkernel

  • Boot loader in MBR points to the boot device

  • initrd (initial Ram disk) is loaded

VMware init

scripts

/sbin/init


Collecting logs ui
Collecting Logs – UI

  • Logon to VI Client as an Administrator

  • Select: File  Export  Export Diagnostics Data

  • Or Administration  Export Diagnostics Data

  • Select servers from which to collect the logs including VC Server

  • Select “Include information ..” Checkbox

  • Specify location for storing the files


Esx server 3 0 x logs collected via vm support
ESX Server 3.0.x Logs (collected via vm-support)

  • Logs are located mostly under /var/log directory. Locations listed here are relative to that directory

  • vmkernel

  • messages

  • dmesg

  • boot.log

  • initrdlogs/*

  • vmksummary

  • vmware/hostd.log

  • vmware/vpx/vpxa.log

  • vmware/esxcfg-boot.log

  • vmware/esxcfg-firewall.log

  • vmware/vmware-cim.log

  • vmware/esxcfg-linuxnet.log

  • vmware/esxupdate.log

  • oldconf/esx.conf.*

  • rpmpkgs

  • vmkernel-version


Esx 3i log files

config.log

messages

slpd.log

wsmand.log

configRP.log

vmware/hostd.log

vmware/aam/*

vmware/vpx/vpxa.log

ESX 3i Log Files


Vmkernel log 3 0 x 3 5
vmkernel Log (3.0.x/3.5)

  • Located in /var/log directory

  • Contains all events generated by vmkernel

  • vmkwarning log is a subset of this one and contains only the warning events

  • Rotated with a numeric extension. The current log wihout extension and the next newest one with “.1” extension

  • All events since last vmkernel load are also in memory in /proc/vmware/log


Messages log files 3i
messages log files (3i)

  • Located in /var/log directory

  • Contains all events generated by vmkernel

  • Rotated with a numeric extension. The current log wihout extension and the next newest one with “.0.gz” extension (rotated and compressed)


Vmkernel log components
vmkernel Log - Components

Jun 19 09:12:54 giza vmkernel: 14:22:31:50.009 cpu3:1033)scsi-qla0: Scheduling SCAN for new luns....

System Date/Time

Message source

Device

CPU:World ID

Host name

uptime

Message


Sample rescan event vmkernel log entries
Sample Rescan Event vmkernel Log Entries

  • cpu3:1033)<6>scsi-qla0: Scheduling SCAN for new luns....

  • cpu1:1034)SCSI: 8244: Starting rescan of adapter vmhba0

The beginning of a SAN Rescan event for vmhba0


Rescan event lun discovery
Rescan Event – LUN Discovery

  • Vendor: IBM Model: 1722-600 Rev: 0520

  • Type: Direct-Access ANSI SCSI revision: 03

  • cpu3:1033)LinSCSI: 4625: Device vmhba0:0:0 has appeared

  • VMWARE SCSI Id: Supported VPD pages for vmhba0:0:0 : 0x0 0x80 0x83 0xc0 0xc1 0xc2 0xc3 0xc4 0xc5 0xc6 0xc7 0xc8 0xc9 0xca 0xd0

Storage Vendor’s ID, Array Model and Microcode Rev.

Reported ANSI version is 3 = SCSI-3

LUN 0 on target 0 on vmhba0 discovered

Array sends supported Vital Product Data (VPD) pages for LUN 0

Array supports VPD pages 0x80 and 0x83


Rescan event lun discovery cont
Rescan Event – LUN Discovery – Cont.

  • VMWARE SCSI Id: Device id info for vmhba0:0:0: 0x1 0x3 0x0 0x10 0x60 0xa 0xb 0x80 0x0 0x17 0x4e 0x84 0x0 0x0 0x12 0x84 0x43 0x 78 0xa3 0x9f

  • VMWARE SCSI Id: Id for vmhba0:0:0 0x60 0x0a 0x0b 0x80 0x00 0x17 0x4e 0x84 0x00 0x00 0x12 0x84 0x43 0x78 0xa3 0x9f 0x31 0x37 0x32 0x32 0x2d 0x36

Device ID for the LUN reported. If VPD page 0x83 were not supported, this line would not show in the log

LUN Id is reported. This matches what is in the proc node/proc/vmware/scsi/vmhba0/0:0:Id: 60 a b 80 0 17 4e 84 0 0 12 84 43 78 a3 9f 31 37 32 32 2d 36


Rescan event lun discovery conclusion
Rescan Event – LUN Discovery - Conclusion

  • cpu3:1033)SCSI: 1424: Device vmhba0:0:0 is attached to a V53 FAStT SAN.

  • cpu3:1033)SCSI: 640: Dual Controllers active for adapter vmhba0

  • cpu3:1033)SCSI: 1450: The IBM FAStT device on vmhba0:0:0 is not configured in Auto-Volume Transfer mode. ESX will handle path failover to passive controllers as necessary.

  • cpu3:1033)SCSI: 2044: Setting default path policy to MRU on target vmhba0:0:0

The LUN is identified as attached to a V53 FAStT SAN

vmhba0 can access both controllers on the FAStT

AVT is identified as “Disabled” and path failover will be handled by ESX

Default Path Policy is set to MRU for that LUN

If AVT were enabled, the policy would have been set to Fixed


Path failover event
Path Failover Event

  • cpu1:1038)WARNING: SCSI: 1785: Manual switchover to path vmhba0:1:5 begins.

  • cpu1:1038)WARNING: SCSI: 1110: Did not switchover to vmhba0:1:5. Check Unit Ready Command returned READY instead of NOT READY for standby controller.

  • cpu1:1038)WARNING: SCSI: 1820: Manual switchover to vmhba0:1:5 completed successfully.

  • cpu1:1038)SCSI: 1789: Changing active path to vmhba0:1:5

Starting a manual switchover (done on ESX side)

No need to move the LUN on the array to target 1 since it returned a “READY” state on that target

Switchover (failover) completed

Now vmhba0:0:5 active path is changed to vmhba0:1:5


Snapshot lun detection esx 3 0 x
Snapshot LUN Detection (ESX 3.0.x)

LVM: 5739: Device vmhba2:2:2:1 is a snapshot:

LVM: 5745: disk ID: <type 3, len 15, lun 2, devType 0, scsi 3, h(id) 1771423412675533879>

LVM: 5747: m/d disk ID: <type 3, len 15, lun 2, devType 0, scsi 3, h(id) 9219142619163180480>

LVM: 5739: Device vmhba2:2:2:1 is a snapshot:

LVM: 5745: disk ID: <type 3, len 15, lun 2, devType 0, scsi 3, h(id) 1771423412675533879>

LVM: 5747: m/d disk ID: <type 3, len 15, lun 2, devType 0, scsi 3, h(id) 9219142619163180480>

ALERT: LVM: 4903: vmhba2:2:2:1 may be snapshot: disabling access. See resignaturing section in SAN config guide.

  • This logging appears in the /var/log/vmkernel log file.

  • The line containing m/d is the metadata.

  • In this case it is the h(id) data in the LVM header which is mismatched.


Snapshot lun detection esx 3 5 and 3i
Snapshot LUN Detection (ESX 3.5 and 3i)

LVM: 5573: Device vml.010044000044363048313739443030343420202020444636303046:1 detected to be a snapshot:

LVM: 5580: queried disk ID: <type 1, len 22, lun 68, devType 0, scsi 4, h(id) 3084339621621410734>

LVM: 5587: on-disk disk ID: <type 1, len 22, lun 68, devType 0, scsi 4, h(id) 3661551745314019942>

ALERT: LVM: 4469: vml.010044000044363048313739443030343420202020444636303046:1 may be snapshot: disabling access. See resignaturing section in SAN config guide.

  • This logging appears in /var/log/vmkernel log file on ESX 3.5

  • This logging appears in /var/log/message log file on ESX 3i

  • The line containing m/d is the metadata

  • In this case it is the h(id) data in the LVM header which is mismatched

  • The “type” field identifies the Disk ID type


Understanding scsi error strings
Understanding SCSI Error Strings

  • Format:

  • Device/Host Sense_buffer[2] 12 13

  • Abbrev: D/H S ASC ASCQ

  • Extended:

  • “Device Status”/”Host Status” “Sense Key” “Additional Sense Code” “Additional Sense Code Qualifier”

  • Example:

  • 2/0 0x6 0x29 0x0


Understanding scsi error strings device status
Understanding SCSI Error Strings – Device Status

  • Device Status: (Displayed in decimal values)


Understanding scsi error strings host status
Understanding SCSI Error Strings – Host Status

  • Host Status (displayed in decimal values)


Understanding scsi error strings host status cont
Understanding SCSI Error Strings – Host Status –cont.

  • Host Status (displayed in decimal values)




Understanding scsi error strings asc ascq
Understanding SCSI Error Strings – ASC/ASCQ

  • ASC and ASCQ are always in pairs (in hex) (ASQ usually 0)


Understanding scsi error strings examples
Understanding SCSI Error Strings - Examples

cpu3)WARNING: SCSI: 5663: vmhba1:0:10:1 status = 2/0 0x6 0x29 0x0

cpu3)WARNING: SCSI: 5663: vmhba2:1:5:0 status = 24/0 0x0 0x0 0x0

cpu0)SCSI: 8879: vmhba2:1:5:0 status = 24/0 0x0 0x0 0x0

cpu0)WARNING: SCSI: 8760: returns error: "SCSI reservation conflict". Code: 0xbad0023.

cpu3)WARNING: SCSI: 5663: vmhba1:0:9:1 status = 0/1 0x0 0x0 0x0

  • 2/0 0x6 0x29 0x0 (Device Check Condition - Lun Reset)

  • 24/0 0x0 0x0 0x0 (SCSI Reservation Conflict)

  • 0/1 0x0 0x0 0x0 (Device OK/Host No_Connect)


Translating vmkernel error codes
Translating vmkernel Error Codes

  • In VI3 already listed in English along with the code

  • You can find them in:

    VMware-esx-drivers-public-source-<ver>-<build>.tar.gz

    At:

    http://www.vmware.com/download/vi/open_source.html

  • File: return_status.h

    In:

    /src/include/vmware/vmklinks/vmkernel/public

  • Codes listed sequentially starting from 0x0 (hex)

  • Codes get renumbered with new releases


Translating vmkernel error codes examples 3 0 2u1
Translating vmkernel Error Codes – Examples (3.0.2U1)

Line number starts from “0”. Calculate the hex value which gives the last 2 digts in the hex code that starts with 0xBAD00


Messages log 3 0x 3 5
Messages Log (3.0x/3.5)

  • Console events

  • Logon events

  • iSCSI Authentication events

Jul 24 19:13:33 giza sshd[18915]: Connection from 10.16.112.24 port 1396

Jul 24 19:13:36 giza sshd[18915]: Accepted password for root from 10.16.112.24 port 1396 ssh2

Jul 24 19:13:36 giza sshd(pam_unix)[18915]: session opened for user root by (uid=0)

Jul 29 01:01:03 giza iscsid[32725]: cannot make connection to 10.16.95.161:3260: No route to host


Initrdlogs
initrdlogs

  • Located in /var/log/initrdlogs

  • Events during initial boot from RAM Disk

  • Logs include:

    • vmklog.vmk

    • messages

    • vmklog.<storage-driver-name> (e.g. vmklog.qla2300_7xx)


Hostd log
hostd.log

  • Located in /var/log/vmware

  • Sym-linked to the current rotated hostd log file

  • Hostd events

    • VI Client communications when directly connected to ESX

    • Events done on behalf of

      • VPXA

      • System Services

      • Firewall System

      • HA services

      • VMware Converter


Vpxa log
vpxa.log

  • Located in /var/log/vmware/vpx

  • Sym-linked to the current rotated vpxa.log

  • Events of intractions with Virtual Center Server

Log for VMware VirtualCenter, pid=4470, version=2.0.2, build=build-50618, option=Release, section=2

[2007-07-19 11:41:11.172 'App' 3076436896 info] Current working directory: /var/log/vmware/vpx

[2007-07-19 11:41:11.172 'App' 3076436896 info] Initializing SSL context

[2007-07-19 11:41:11.216 'App' 3076436896 info] Starting VMware VirtualCenter Agent Daemon 2.0.2 build-50618

[2007-07-19 11:41:11.221 'App' 3076436896 info] [VpxaInvtHost] Manager IP: :902 Host IP:


Esxcfg firewall log
esxcfg-firewall.log

  • Located in /var/log/vmware directory

  • All VMware Firewall rules events

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A icmp-out -p icmp --icmp-type echo-reply -j ACCEPT"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A icmp-out -j DROP"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -N log-and-drop"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A log-and-drop -j LOG --log-ip-options --log-tcp-options --log-level debug"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A log-and-drop -j DROP"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -N valid-source-address"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A valid-source-address -s 127.0.0.1 -j DROP"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A valid-source-address -s 0.0.0.0/8 -j DROP"

2006-10-09 13:57:02 (15463) INFO : "/sbin/iptables -A valid-source-address -d 255.255.255.255 -j DROP“

2007-07-14 13:28:30 (30269) INFO : Setting service AAMClient to 1


Oldconf files
oldconf Files

  • Backup copies of /etc/vmware/esx.conf file

  • Located in /var/log/oldconf directory

    • Created prior to updating the existing file

    • Only when changes done via VC, VI Client or esxcfg-* scripts

    • Date and time of backup used as the extension of the file name

esx.conf.2007-06-30_03:56:58


Esxupdate log
esxupdate.log

  • In /var/log/vmware directory

  • History of all updates done via esxupdate tool

  • Date and PID

  • Packages installed

  • Results of the installation


Vmkernel version
vmkernel-version

  • In /var/log directory

  • lists current and all previous kernel build numbers

Found vmkernel version 27701

Found vmkernel version 32039


Questions
Questions?

  • TA19

  • VI3 Advanced Log Analysis

  • Mostafa Khalil, VCP

  • VMware Product Support Engineering

  • For more information …

  • http://www.vmware.com