Incremental Detection and Visualization of Problem Patterns – a “Simplified” Symptomatic Event Visualizer –

Marcelo Perazolo, Autonomic Computing Architecture
Abdi Salahshour, Autonomic Computing Technology & Development


April 25-26, 2006

Agenda
  • Statement of Problem
  • What is the Common Event Format
  • What is the Symptoms Reference Format
  • A Solution
  • Conclusion
  • Helpful Links
Problems Facing Today's Data Collection
  • Complexity of e-Business
    • Collection of distributed and heterogeneous software and hardware components
  • Variety of Data and Collectors/Adapters
    • Consume and publish proprietary data formats
    • Require ad hoc, product-specific code
      • Data format and APIs
    • Design and Standards considerations
    • Different skill sets needed to configure, maintain, and tune
    • Difficult to correlate for end-to-end problem diagnostics
  • Instrumentation
    • Many-to-Many
    • Standards compliance
    • Customer pain and cost of ownership
[ibm][db2][jcc][t4] 0150 0400162110E2C1D4 D7D3C5F140404040 [email protected]@@@ .....SAMPLE1

[ibm][db2][jcc][t4] 0160 4040404040404000 59D0030003005324 @@@@@@@.Y.....S$ ..}......

[ibm][db2][jcc][t4] 0170 0800640000003032 30303053514C5249 ..d...02000SQLRI .............<..

[ibm][db2][jcc][t4] 0180 4558540001000480 0100000000000000 EXT............. ................

[ibm][db2][jcc][t4] 0190 0000000000000000 0000000020202020 ............ ................

[ibm][db2][jcc][t4] 01A0 2020202020202000 1253414D504C4531 ..SAMPLE1 ...........(&<..

[ibm][db2][jcc][t4] 01B0 2020202020202020 20202000000000FF ..... ................

[ibm][db2][jcc][t4]

[ibm][db2][jcc][[email protected]] BEGIN TRACE_RESULT_SET_META_DATA

[ibm][db2][jcc][[email protected]] Result set meta data for statement [email protected]

[ibm][db2][jcc][[email protected]] Number of result set columns: 1

isDescribed=true[ibm][db2][jcc][[email protected]] Column 1: { label=BALANCE, name=BALANCE, type name=DECIMAL, type=3, nullable=1, precision=9, scale=2, schema name=TEST , table name=ACCOUNTS, writable=false, sqlPrecision=9, sqlScale=2, sqlLength=0, sqlType=485, sqlCcsid=0, sqlName=BALANCE, sqlLabel=null, sqlUnnamed=0, sqlComment=null, sqludtxType=, sqludtRdb=, sqludtSchema=, sqludtName=, sqlxKeymem=0, sqlxGenerated=0, sqlxParmmode=0, sqlxCorname=ACCOUNTS, sqlxName=BALANCE, sqlxBasename=ACCOUNTS, sqlxUpdatable=0, sqlxSchema=TEST , sqlxRdbnam=SAMPLE1, internal type=3, is locator parameter=false }

[ibm][db2][jcc][[email protected]] { sqldHold=0, sqldReturn=0, sqldScroll=0, sqldSensitive=0, sqldFcode=85, sqldKeytype=0,

Event Logging

source=com.ibm.ws.rsadapter.spi.WSRdbDataSource org=IBM prod=WebSphere component=Application Server

[11/25/03 14:14:33:695 EST] 42754514 > UOW= source=com.ibm.ws.rsadapter.DSConfigurationHelper org=IBM prod=WebSphere component=Application Server

createDataStoreHelper parm1=com.ibm.websphere.rsadapter.CloudscapeDataStoreHelper parm2={}

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.GenericDataStoreHelper org=IBM prod=WebSphere component=Application Server

init [email protected]451b

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.DataStoreHelperMetaData org=IBM prod=WebSphere component=Application Server

setGetTypeMapSupport: false

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.DataStoreHelperMetaData org=IBM prod=WebSphere component=Application Server

setHelperType: 0

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.CloudscapeDataStoreHelper org=IBM prod=WebSphere component=Application Server

the cloudscape metadata is : parm1=

The defaultTransactionIsolation is: 2

The supportsExtendedForUpdate is: false

The supportsKerberos is: false

The supportsSelectForUpdate is: true

The supportsGetCatalog is: true

The supportsGetTypeMap is: false

The supportsIsReadOnly is: true

The supporstMultiplePartitionDB is: false

[Figure: applications, database, application servers, servers, storage devices, and networks, each logging in a proprietary format]

Blame Storming Syndrome
  • Problem determination may take days or weeks
  • Proprietary log format
  • Domain specific set of tools
  • No interfaces between tools
  • Siloed problem determination
  • Finger pointing resolution

[Figure: applications, database, application servers, servers, storage devices, and networks, each with a proprietary format and its own specialized skills and tools]

Common Base Event (CBE) / WSDM Event Format (WEF)
  • Richer, normalized data enables cross-product analysis and correlation, and is a prerequisite to effective root cause analysis and automation
  • Without standards, event data are of little value to autonomic management for problem determination and responsive action
  • To alleviate this, event data are structured in four categories:
    • The identification of the component that is affected by or experienced the situation
      • This is also known as the source of a situation
    • The identification of the component that is reporting the situation
      • This is also known as the reporter of a situation
      • It may be the same as the source component of the situation
    • The situation data
      • Properties or attributes that describe the situation
    • The Context/Correlation data
      • Properties or attributes to correlate the situations with others
  • CBE / WEF
    • A consistent specification for the definition of normalized event and log information for various domains (business, security, network, system, etc.)
    • An exchange format for events and logs
    • Describes situations concerning the external operational capabilities of a component
    • Data that captures execution information within a component (i.e. trace) is out of scope: CBE/WEF is not positioned for it
    • Context Data
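
The four categories above could be modeled roughly as follows. This is a minimal Python sketch, not the CBE or WEF schema: the names `ComponentId`, `Event`, `source`, `reporter`, `situation`, `context`, and `correlates` are assumptions chosen to mirror the bullet points.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ComponentId:
    """Identifies a component (hypothetical fields, not the CBE schema)."""
    component: str
    location: str


@dataclass
class Event:
    source: ComponentId                      # component affected by the situation
    situation: Dict[str, str]                # properties describing the situation
    reporter: Optional[ComponentId] = None   # None means "same as the source"
    context: Dict[str, str] = field(default_factory=dict)  # correlation data

    def effective_reporter(self) -> ComponentId:
        # The reporter may be the same component as the source.
        return self.reporter if self.reporter is not None else self.source


def correlates(a: Event, b: Event, key: str) -> bool:
    """Two events correlate on `key` when both carry the same context value."""
    return key in a.context and a.context.get(key) == b.context.get(key)
```

Keeping the source and reporter as separate fields is what lets a consumer tell "the database failed" apart from "the application server noticed the database failed".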
What is a Symptom?
  • Dictionary definition: “A characteristic sign or indication of the existence of something else.”
  • AC definition: “A characteristic sign or indication of a possible problem or situation happening in the context of one or more manageable resources.”
    • A form of knowledge, used to solve problems and situations automatically in an autonomic system.
    • Symptoms are composite records of information, formed by the combination of raw or composite information into patterns
    • Symptoms may be composed of other symptoms as well
From Events to Symptoms
  • Event: an indication of something being monitored
    • For example, memory usage has exceeded a set limit
  • Symptom: a characteristic sign or indication of a possible problem or situation happening in the context of one or more manageable resources
    • Symptom: If event x (and y (and…) ) occur (under certain conditions), then report the occurrence and possible resolution actions
    • For example, memory usage has exceeded a set limit three times in a 10-minute stretch: suggest increasing your buffer sizes
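
The memory example above can be sketched as a sliding-window rule. This is an illustrative Python sketch only: the class name `ThresholdSymptom`, its parameters, and the assumption that events arrive in timestamp order (seconds) are not from the presentation.

```python
from collections import deque
from typing import Deque, Optional


class ThresholdSymptom:
    """Reports a symptom when `count` matching events fall within `window_s` seconds."""

    def __init__(self, count: int = 3, window_s: float = 600.0,
                 advice: str = "suggest increasing your buffer sizes") -> None:
        self.count = count
        self.window_s = window_s
        self.advice = advice
        self._times: Deque[float] = deque()

    def observe(self, timestamp: float, memory_used: float, limit: float) -> Optional[str]:
        """Feed one monitoring event; return the advice if the symptom fires."""
        if memory_used <= limit:
            return None  # not a "limit exceeded" event
        self._times.append(timestamp)
        # Drop occurrences that have fallen out of the time window.
        while self._times and timestamp - self._times[0] > self.window_s:
            self._times.popleft()
        if len(self._times) >= self.count:
            self._times.clear()  # avoid re-reporting the same occurrences
            return self.advice
        return None
```

A single "limit exceeded" event returns `None`; only the third occurrence inside the 10-minute window returns the suggested resolution, which is the difference between an event and a symptom.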
Symptoms Reference Architecture

[Figure: Symptoms Reference Architecture. A Monitor, Analyze, Plan, Execute loop around shared Knowledge, with the labels Event, Symptom, Change Req, Change Plan, and Policy. A SymptomCatalog contains SymptomDefinition entries with the parts schema, metadata, rule, effect, engine, and instance. Other labels: rule engine, deploy, instance.]

The Value Proposition
  • Management data made more consumable for the end user
    • Visualization of product symptoms within problem determination tooling
    • Symptoms are more deterministic than individual events
    • Increased customer satisfaction
  • Reduced problem determination costs
    • Administrators use automated event correlation to recognize symptoms (and potentially, corrective actions)
    • Support personnel access symptoms directly from the problem determination tools
    • Cross-product symptom catalogs allow quick diagnosis for known errors
  • Reduced maintenance costs
    • Incremental improvements to symptom databases will reduce requests to L2 and L3 support
    • Reduced support requests from other IBM organizations
    • Standard symptom format allows products to leverage problem resolution cost from other IBM organizations (e.g. Collaboration Center)
One Tool Does Not Fit All!

[Figure: tools positioned by user skills, from basic (e.g. operators) to advanced (e.g. developers), and from simple to advanced. Tools: LTA-JD, LTA-portal, LTA-eclipse. Tasks: triage, analysis, correlation. Users: operators, system analysts, support engineers, change team, developers.]

“Simple” Log and Trace Analyzer for Java Desktop
  • Standalone simple Java event viewer to merge, filter, sort, and display contents of event sources in a common event format (i.e., CBE) for problem isolation and triage to problem analysis
    • Enables end-to-end viewing of event sources across the heterogeneous environment
    • Customizable summary view
    • Ability to select and expand any row from the summary view to display the full CBE attributes
    • Correlate on timestamp and/or sort on any Common Base Event property
    • Filtering and multi-level sorting on any event properties
    • Custom highlighting of triage events (simple symptoms definition)
    • Save and share configuration settings (import/export)
    • Starting point for Support personnel and Operations staff
      • Springboard to more advanced analyzer tools
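
The merge, filter, and multi-level sort operations listed above can be sketched in a few lines. This is a hedged Python illustration, not LTA-JD code: events are represented as plain dictionaries, and the property names (`creationTime`, `severity`, `source`) and function names are assumptions.

```python
import heapq
from operator import itemgetter
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]  # hypothetical flat view of a CBE: property name -> value


def merge_sources(*sources: Iterable[Event]) -> List[Event]:
    """Merge event sources that are each already ordered by creation time."""
    return list(heapq.merge(*sources, key=itemgetter("creationTime")))


def filter_events(events: Iterable[Event], keep: Callable[[Event], bool]) -> List[Event]:
    """Keep only the events for which the predicate holds."""
    return [e for e in events if keep(e)]


def multi_level_sort(events: List[Event], *keys: str) -> List[Event]:
    """Sort by several properties; the first key is the most significant."""
    return sorted(events, key=lambda e: tuple(e[k] for k in keys))
```

Merging on the timestamp is what gives the end-to-end view: once every source is in the same format, a database log and an application server log interleave into one timeline.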
Overall Architecture

[Figure: architecture diagram with the labels CBE Event Sources, Process CBE, Fast XPath, and Visual Filters]

  • FastXPath
    • Integrates solution with existing code generation tools
    • Extracts XML schema-specific metadata from the object it queries
    • Uses metadata available in auto-generated classes to build optimized XSL engines
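
The idea of using class metadata to evaluate a path expression directly on objects can be illustrated with a toy. This Python sketch is not FastXPath and does not reflect its implementation: it handles only the single form `/Type[@attr OP 'literal']`, the `CommonBaseEvent` class is a stand-in for an auto-generated class, and `compile_predicate` is a hypothetical name.

```python
import operator
import re
from dataclasses import dataclass, fields
from typing import Callable


@dataclass
class CommonBaseEvent:  # stand-in for an auto-generated class
    severity: int
    msg: str


_OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
        "=": operator.eq, ">": operator.gt, "<": operator.lt}
_PATTERN = re.compile(r"^/(\w+)\[@(\w+)\s*(>=|<=|!=|=|>|<)\s*'([^']*)'\]$")


def compile_predicate(expr: str, cls: type) -> Callable[[object], bool]:
    """Compile "/Type[@attr OP 'literal']" into a closure over plain objects."""
    m = _PATTERN.match(expr)
    if m is None:
        raise ValueError(f"unsupported expression: {expr}")
    type_name, attr, op, literal = m.groups()
    if type_name != cls.__name__:
        raise ValueError(f"expression targets {type_name}, not {cls.__name__}")
    # Class metadata gives the attribute's type, so the literal is
    # converted once at compile time rather than once per event.
    field_types = {f.name: f.type for f in fields(cls)}
    if attr not in field_types:
        raise ValueError(f"{cls.__name__} has no attribute {attr}")
    value = field_types[attr](literal)
    compare = _OPS[op]
    # Simplification: string fields compare as strings here, whereas
    # XPath 1.0 relational operators always compare numerically.
    return lambda obj: compare(getattr(obj, attr), value)
```

The compiled closure touches only typed attributes of the object, with no XML parsing or string conversion per event, which is the general reason an object-aware evaluator can be cheaper than a DOM-based one.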
[Screenshot: main window with the event sources collection, the customizable results/summary area, and the events detail area]

Filters are equivalent to Symptom Rules

This filter selects by creation time, using an XPath expression that can be generated by the Filter Builder.
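
A creation-time filter of this kind might look like the following. This is a minimal sketch using Python's standard library rather than the tool's own XPath engine; the sample XML is made up and heavily simplified (real CBE documents use namespaces and more attributes), and `by_creation_time` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List

# Made-up sample data, for illustration only.
SAMPLE = """<events>
  <CommonBaseEvent creationTime="2003-11-25T14:14:33" msg="init"/>
  <CommonBaseEvent creationTime="2003-11-25T14:20:00" msg="setHelperType"/>
  <CommonBaseEvent creationTime="2003-11-25T15:05:10" msg="shutdown"/>
</events>"""


def by_creation_time(root: ET.Element, start: datetime, end: datetime) -> List[ET.Element]:
    """Return events whose creationTime lies in [start, end]."""
    selected = []
    # ElementTree implements only a subset of XPath without relational
    # operators, so the path selects candidates and Python compares times.
    for ev in root.findall("./CommonBaseEvent[@creationTime]"):
        created = datetime.fromisoformat(ev.get("creationTime"))
        if start <= created <= end:
            selected.append(ev)
    return selected
```

With the sample above, a window from 14:00 to 15:00 on 2003-11-25 selects the first two events and leaves out the third.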

Filter Builder (Novice Users)

Powerful composition dialogs, while still showing full XPath syntax for power users

We associate visualization attributes with Symptom Rules
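
One way to picture this association is a rule that pairs a filter with a display attribute. The sketch below is an assumption-laden Python illustration, not the tool's data model: `HighlightRule`, `highlight`, the first-match-wins policy, and the use of a color as the only visualization attribute are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Event = Dict[str, object]  # hypothetical flat view of an event


@dataclass(frozen=True)
class HighlightRule:
    name: str
    matches: Callable[[Event], bool]   # the symptom rule (a filter)
    color: str                         # the visualization attribute


def highlight(event: Event, rules: List[HighlightRule]) -> Optional[str]:
    """Return the color of the first rule that matches, or None."""
    for rule in rules:
        if rule.matches(event):
            return rule.color
    return None
```

Ordering the rules from most to least severe means an event that satisfies several rules is shown with the most urgent highlight.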

Flexibility to show only what the user wants to see: filters out the non-participating events
Helpful Links
  • Autonomic Computing Enablement Site
    • http://acenablement.raleigh.ibm.com/
    • http://acenablement.raleigh.ibm.com/html/technology/pd/pddwnlds.html
  • Autonomic Computing
    • http://www.ibm.com/autonomic
  • Autonomic Computing Toolkit
    • http://www.ibm.com/developerworks/autonomic
  • Autonomic Computing Toolkit Download
    • http://www-106.ibm.com/developerworks/autonomic/probdet1.html
  • Common Base Event Version V1.0.1 (CBE)
    • http://dev.eclipse.org/viewcvs/indextools.cgi/~checkout~/hyades-home/docs/components/common_base_event/cbe101spec/CommonBaseEvent_SituationData_V1.0.1.pdf
  • WSDM Event Format V1.0 (WEF)
    • PART 1: http://docs.oasis-open.org/wsdm/2004/12/muws/cd-wsdm-muws-part1-1.0.pdf
    • PART 2: http://docs.oasis-open.org/wsdm/2004/12/muws/cd-wsdm-muws-part2-1.0.pdf
  • Common Event Infrastructure (CEI)
    • http://www.ibm.com/software/tivoli/features/cei/
    • http://www-106.ibm.com/developerworks/library-combined/ac-cei
Use Cases

[Figure: Log and Trace Analyzer tools retrieve and analyze CBE log data. Diagram labels: Applications, Generic Log Adapters (GLA), RAC (API), CEI/ESB, ACT/XPath, Import, CBE Object, CBE XML, CBE Logs, Formatted Logs, CBE Events, Triaged CBE Events, SymptomDB. Use-case labels: Solution Problem Isolation, Solution Problem Analysis, Solution Problem Isolation & Analysis, Product Problem Isolation & Analysis.]

LTA-JD (Triage) / LTA-JD (Analyze)

  • Event viewing
  • Merge/sort/filter
  • Single Event Analysis (highlighting/simple symptom rules)
  • Local data collection
  • Remote data collection from CEI server

LTA-Eclipse (Correlate/Analyze)

  • Event viewing
  • Merge/sort/filter
  • Event correlation
  • Cross-Event analysis (symptoms)
  • Remote/local data collection
  • Event conversion

LTA-Portal (Correlate/Analyze)

  • Event viewing
  • Merge/sort/filter
  • Event correlation
  • Cross-Event analysis (symptoms)
  • Remote/local data collection
  • Event conversion


LTA-JD Performance
  • Evaluation of LTA-JD end-to-end (XML input – convert & process object – filter – display)
  • Evaluation of simple FastXPath expression
    • /CommonBaseEvent[@severity >= '10'] on 100000 CBEs
    • FastXPath (157 millisecs), JXPath (468 millisecs), Xalan (1328 secs)
  • Better results with
    • smarter filters
    • bigger JVM heap
    • IBM JDK 1.5 (~ 60% improvement !!!)
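
A measurement of this kind can be reproduced in spirit with a small timing harness. The Python sketch below is an assumption: it does not use FastXPath, JXPath, or Xalan, its timings are not comparable to the figures above, and the synthetic events and function names are invented for illustration.

```python
import time
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_events(n: int) -> ET.Element:
    """Build n synthetic events with severities cycling through 0..69."""
    root = ET.Element("events")
    for i in range(n):
        ET.SubElement(root, "CommonBaseEvent", severity=str(i % 70))
    return root


def filter_by_severity(root: ET.Element, minimum: int) -> List[ET.Element]:
    # Mirrors /CommonBaseEvent[@severity >= '10']: XPath 1.0 compares
    # relational operands as numbers, hence the int() conversion.
    result = []
    for ev in root.iter("CommonBaseEvent"):
        sev = ev.get("severity")
        if sev is not None and int(sev) >= minimum:
            result.append(ev)
    return result


def time_filter(n: int, minimum: int = 10) -> Tuple[int, float]:
    """Return (number of matches, elapsed milliseconds) for one run."""
    root = build_events(n)
    start = time.perf_counter()
    matches = filter_by_severity(root, minimum)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(matches), elapsed_ms
```

Calling `time_filter(100000)` times the filter alone, excluding event construction; repeating the run several times and discarding the first gives steadier numbers.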