Incremental Detection and Visualization of Problem Patterns – a “Simplified” Symptomatic Event Visualizer

  • Marcelo Perazolo, Autonomic Computing Architecture, mperazolo@us.ibm.com
  • Abdi Salahshour, Autonomic Computing Technology & Development, abdis@us.ibm.com

April 25-26, 2006

Agenda
  • Statement of Problem
  • What is the Common Event Format
  • What is the Symptoms Reference Format
  • A Solution
  • Conclusion
  • Helpful Links
Problems Facing Today's Data Collection
  • Complexity of e-Business
    • Collection of distributed and heterogeneous software and hardware components
  • Variety of Data and Collectors/Adapters
    • Consume and publish proprietary data formats
    • Require ad hoc, product-specific code
      • Data formats and APIs
    • Design and standards considerations
    • Different skill sets needed to configure, maintain, and tune
    • Difficult to correlate for end-to-end (e2e) problem diagnostics
  • Instrumentation
    • Many-to-Many
    • Standards compliance
    • Customer pain and cost of ownership

[ibm][db2][jcc][t4] 0150 0400162110E2C1D4 D7D3C5F140404040 ...!........@@@@ .....SAMPLE1

[ibm][db2][jcc][t4] 0160 4040404040404000 59D0030003005324 @@@@@@@.Y.....S$ ..}......

[ibm][db2][jcc][t4] 0170 0800640000003032 30303053514C5249 ..d...02000SQLRI .............<..

[ibm][db2][jcc][t4] 0180 4558540001000480 0100000000000000 EXT............. ................

[ibm][db2][jcc][t4] 0190 0000000000000000 0000000020202020 ............ ................

[ibm][db2][jcc][t4] 01A0 2020202020202000 1253414D504C4531 ..SAMPLE1 ...........(&<..

[ibm][db2][jcc][t4] 01B0 2020202020202020 20202000000000FF ..... ................

[ibm][db2][jcc][t4]

[ibm][db2][jcc][ResultSetMetaData@108ac50a] BEGIN TRACE_RESULT_SET_META_DATA

[ibm][db2][jcc][ResultSetMetaData@108ac50a] Result set meta data for statement Statement@2b2cc50a

[ibm][db2][jcc][ResultSetMetaData@108ac50a] Number of result set columns: 1

isDescribed=true[ibm][db2][jcc][ResultSetMetaData@108ac50a] Column 1: { label=BALANCE, name=BALANCE, type name=DECIMAL, type=3, nullable=1, precision=9, scale=2, schema name=TEST , table name=ACCOUNTS, writable=false, sqlPrecision=9, sqlScale=2, sqlLength=0, sqlType=485, sqlCcsid=0, sqlName=BALANCE, sqlLabel=null, sqlUnnamed=0, sqlComment=null, sqludtxType=<null>, sqludtRdb=<null>, sqludtSchema=<null>, sqludtName=<null>, sqlxKeymem=0, sqlxGenerated=0, sqlxParmmode=0, sqlxCorname=ACCOUNTS, sqlxName=BALANCE, sqlxBasename=ACCOUNTS, sqlxUpdatable=0, sqlxSchema=TEST , sqlxRdbnam=SAMPLE1, internal type=3, is locator parameter=false }

[ibm][db2][jcc][ResultSetMetaData@108ac50a] { sqldHold=0, sqldReturn=0, sqldScroll=0, sqldSensitive=0, sqldFcode=85, sqldKeytype=0,

Event Logging

source=com.ibm.ws.rsadapter.spi.WSRdbDataSource org=IBM prod=WebSphere component=Application Server

<init>

[11/25/03 14:14:33:695 EST] 42754514 > UOW= source=com.ibm.ws.rsadapter.DSConfigurationHelper org=IBM prod=WebSphere component=Application Server

createDataStoreHelper parm1=com.ibm.websphere.rsadapter.CloudscapeDataStoreHelper parm2={}

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.GenericDataStoreHelper org=IBM prod=WebSphere component=Application Server

init parm1=com.ibm.websphere.rsadapter.CloudscapeDataStoreHelper@2128451b

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.DataStoreHelperMetaData org=IBM prod=WebSphere component=Application Server

setGetTypeMapSupport: false

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.DataStoreHelperMetaData org=IBM prod=WebSphere component=Application Server

setHelperType: 0

[11/25/03 14:14:33:695 EST] 42754514 d UOW= source=com.ibm.websphere.rsadapter.CloudscapeDataStoreHelper org=IBM prod=WebSphere component=Application Server

the cloudscape metadata is : parm1=

The defaultTransactionIsolation is: 2

The supportsExtendedForUpdate is: false

The supportsKerberos is: false

The supportsSelectForUpdate is: true

The supportsGetCatalog is: true

The supportsGetTypeMap is: false

The supportsIsReadOnly is: true

The supporstMultiplePartitionDB is: false

[Figure labels: Applications; Database; Application Servers; Servers; Storage devices; Networks; Proprietary format]

Blame Storming Syndrome
  • Proprietary log format
  • Domain specific set of tools
  • No interfaces between tools
  • Siloed problem determination
  • Finger pointing resolution

Problem determination may take days or weeks

[Figure labels: Blame Storming; Applications; Database; Application Servers; Servers; Storage devices; Networks; Proprietary format; Specialized skills and tools]

Common Base Event (CBE) / WSDM Event Format (WEF)
  • Richer, normalized data enables cross-product analysis and correlation, and is a prerequisite for effective root cause analysis and automation
  • Without standards, event data is of little value to autonomic management for problem determination and responsive action
  • To alleviate this, event data is structured in 4 categories
    • The identification of the component that is affected by or experienced the situation
      • This is also known as the source of a situation
    • The identification of the component that is reporting the situation
      • This is also known as the reporter of a situation
      • It may be the same as the source component of the situation
    • The situation data
      • Properties or attributes that describe the situation
    • The Context/Correlation data
      • Properties or attributes used to correlate the situation with others
  • CBE / WEF
    • A consistent specification for the definition of normalized event and log information for various domains (business, security, network, system, etc.)
    • An exchange format for events and logs
    • Describes situations about the external operational capabilities of the component
    • Is not positioned for data that captures execution information within a component (i.e., trace)
    • Context Data
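
The four categories above can be pictured as a small record type. The sketch below is a simplified illustration in Python, not the CBE or WEF schema: the class and field names (`ComponentId`, `Event`, `source`, `reporter`, `situation`, `context`) and the sample values are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ComponentId:
    """Identifies a component (hypothetical, simplified fields)."""
    component: str
    location: str


@dataclass
class Event:
    """An event split into the four categories named on the slide."""
    source: ComponentId                      # component that experienced the situation
    situation: Dict[str, str]                # properties describing the situation
    context: Dict[str, str] = field(default_factory=dict)  # correlation data
    reporter: Optional[ComponentId] = None   # component reporting the situation

    def effective_reporter(self) -> ComponentId:
        # The reporter may be the same as the source component.
        return self.reporter if self.reporter is not None else self.source


# Illustrative values only.
db = ComponentId(component="DB2", location="dbhost01")
event = Event(
    source=db,
    situation={"category": "ConnectSituation", "outcome": "FAILED"},
    context={"transaction_id": "tx-42"},
)
```

Keeping the reporter optional mirrors the note that the reporter may be the same component as the source.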
What is a Symptom?
  • Dictionary definition: “A characteristic sign or indication of the existence of something else.”
  • AC definition: “A characteristic sign or indication of a possible problem or situation happening in the context of one or more manageable resources.”
    • A form of knowledge, used to solve problems and situations automatically in an autonomic system.
    • Symptoms are composite records of information, formed by the combination of raw or composite information into patterns
    • Symptoms may be composed of other symptoms as well
From Events to Symptoms
  • Event: an indication of something being monitored
    • For example, memory usage has exceeded a set limit
  • Symptom: a characteristic sign or indication of a possible problem or situation happening in the context of one or more manageable resources
    • Symptom: If event x (and y (and…) ) occur (under certain conditions), then report the occurrence and possible resolution actions
    • For example, memory usage has exceeded a set limit three times in a 10-minute stretch: suggest increasing your buffer sizes
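
The memory-usage example can be sketched as a sliding-window rule. This is a minimal illustration, not the symptom rule format used by the tooling: the class name `RepeatedEventRule` and its parameters are hypothetical, and timestamps are assumed to arrive in non-decreasing order, in seconds.

```python
from collections import deque
from typing import Deque, Optional


class RepeatedEventRule:
    """Report a symptom when an event occurs `threshold` times within `window_s` seconds."""

    def __init__(self, threshold: int, window_s: float, recommendation: str) -> None:
        self.threshold = threshold
        self.window_s = window_s
        self.recommendation = recommendation
        self._times: Deque[float] = deque()

    def observe(self, timestamp_s: float) -> Optional[str]:
        """Record one matching event; return the recommendation if the rule fires."""
        self._times.append(timestamp_s)
        # Drop occurrences that have fallen out of the time window.
        while timestamp_s - self._times[0] > self.window_s:
            self._times.popleft()
        if len(self._times) >= self.threshold:
            return self.recommendation
        return None


# Three "memory limit exceeded" events within 10 minutes (600 s) trigger the symptom.
rule = RepeatedEventRule(threshold=3, window_s=600, recommendation="increase buffer sizes")
results = [rule.observe(t) for t in (0, 200, 900, 1000, 1100)]
```

Here the first two events age out before a third arrives, so the rule only fires on the last observation, when three events fall inside the same 10-minute stretch.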
Symptoms Reference Architecture

  • schema: schema used to create a new instance of the symptom
  • metadata: schema used to index and categorize all forms of knowledge
  • rule: schema used to recognize a symptom instance
  • effect: schema that describes how to react to instances of the symptom
  • engine: a runtime artifact used to produce symptom instances
  • instance: an instance of this symptom that conforms to the symptom schema

[Figure labels: Monitor; Analyze; Plan; Execute; Knowledge; Event; Symptom; Change Req; Change Plan; Policy; SymptomDefinition; SymptomCatalog; rule; engine; instance; deploy]
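
One way to read the parts named above is as a definition (metadata, rule, effect) that an engine applies to events to produce instances. The sketch below is an assumption-laden illustration, not the symptom schema itself: `SymptomDefinition`, `SymptomInstance`, `run_engine`, and the sample rule are hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SymptomDefinition:
    name: str
    metadata: Dict[str, str]                 # used to index and categorize the definition
    rule: Callable[[Dict[str, str]], bool]   # recognizes a symptom instance in an event
    effect: str                              # how to react to an instance


@dataclass(frozen=True)
class SymptomInstance:
    definition: str
    event: Dict[str, str]
    recommended_action: str


def run_engine(catalog: List[SymptomDefinition],
               events: List[Dict[str, str]]) -> List[SymptomInstance]:
    """Apply every definition in the catalog to every event and collect instances."""
    return [
        SymptomInstance(d.name, e, d.effect)
        for e in events
        for d in catalog
        if d.rule(e)
    ]


# A one-entry catalog with an illustrative rule.
catalog = [
    SymptomDefinition(
        name="connection-refused",
        metadata={"domain": "database"},
        rule=lambda e: e.get("msg", "").startswith("Connection refused"),
        effect="check that the database listener is running",
    )
]
events = [{"msg": "Connection refused: dbhost01"}, {"msg": "Started"}]
instances = run_engine(catalog, events)
```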

The Value Proposition
  • Management data made more consumable for the end user
    • Visualization of product symptoms within problem determination tooling
    • Symptoms are more deterministic than individual events
    • Increased customer satisfaction
  • Reduced problem determination costs
    • Administrators use automated event correlation to recognize symptoms (and potentially, corrective actions)
    • Support personnel access symptoms directly from the problem determination tools
    • Cross-product symptom catalogs allow quick diagnosis for known errors
  • Reduced maintenance costs
    • Incremental improvements to symptom databases will reduce requests to L2 and L3 support
    • Reduced support requests from other IBM organizations
    • Standard symptom format allows products to leverage problem resolution cost from other IBM organizations (e.g. Collaboration Center)
One Tool Does Not Fit All!

[Figure: tools positioned by User Skills, from Basic (e.g. operators) to Advanced (e.g. developers), and from Simple to Advanced. Tools: LTA-JD, LTA-portal, LTA-eclipse. Tasks: Triage, Analysis, Correlation. Users: Operators, System Analysts, Support Engineers, Change Team, Developers]

“Simple” Log and Trace Analyzer for Java Desktop
  • Standalone, simple Java event viewer that merges, filters, sorts, and displays the contents of event sources in a common event format (i.e., CBE), supporting problem isolation and triage ahead of problem analysis
    • Enables end-to-end viewing of event sources across the heterogeneous environment
    • Customizable summary view
    • Ability to select and expand any row from the summary view to display the full CBE attributes
    • Correlation on timestamp and/or sorting on any Common Base Event property
    • Filtering and multi-level sorting on any event properties
    • Custom highlighting of triage events (simple symptoms definition)
    • Save and share configuration settings (import/export)
    • Starting point for Support personnel and Operations staff
      • Springboard to more advanced analyzer tools
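
The merge, filter, and multi-level sort operations listed above can be sketched in a few lines. This is an illustration only, not the LTA-JD implementation: events are plain dictionaries, and the property names `creation_time`, `severity`, and `source` are assumptions for the example.

```python
import heapq
from operator import itemgetter
from typing import Dict, Iterable, List

Event = Dict[str, object]


def merge_sources(*sources: Iterable[Event]) -> List[Event]:
    """Merge event sources that are each already ordered by creation time."""
    return list(heapq.merge(*sources, key=itemgetter("creation_time")))


def multi_level_sort(events: List[Event], *keys: str) -> List[Event]:
    """Sort by several properties; the first key is the most significant."""
    return sorted(events, key=itemgetter(*keys))


app_log = [
    {"creation_time": 1, "severity": 10, "source": "app"},
    {"creation_time": 4, "severity": 50, "source": "app"},
]
db_log = [
    {"creation_time": 2, "severity": 50, "source": "db"},
    {"creation_time": 3, "severity": 20, "source": "db"},
]

merged = merge_sources(app_log, db_log)                  # one timeline across sources
severe = [e for e in merged if e["severity"] >= 20]      # filter
by_severity_then_time = multi_level_sort(severe, "severity", "creation_time")
```

Merging on creation time gives the end-to-end view across sources; filtering and sorting then narrow that view for triage.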
Overall Architecture

[Figure labels: CBE Event Sources; Process CBE; Fast XPath; Visual Filters]


  • FastXPath
    • Integrates solution with existing code generation tools
    • Extracts XML schema-specific metadata from the object it queries
    • Uses metadata available in auto-generated classes to build optimized XSL engines
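
The slides do not describe FastXPath internals, so the following is not FastXPath. It is a hedged sketch of the general idea of evaluating a simple XPath-style predicate directly against typed objects instead of an XML tree. The function `compile_predicate`, the two-field `CommonBaseEvent` stand-in class, and the single supported expression shape (`/Type[@attr OP 'number']`) are all assumptions made for this example.

```python
import operator
import re
from dataclasses import dataclass
from typing import Callable

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
        "<": operator.lt, "=": operator.eq}
_PATTERN = re.compile(r"^/(\w+)\[@(\w+)\s*(>=|<=|>|<|=)\s*'(\d+)'\]$")


@dataclass
class CommonBaseEvent:
    """Stand-in for an auto-generated event class (hypothetical, two fields only)."""
    severity: int
    msg: str


def compile_predicate(expr: str) -> Callable[[object], bool]:
    """Turn `/Type[@attr OP 'n']` into a function that tests typed objects directly."""
    match = _PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr}")
    type_name, attr, op, literal = match.groups()
    compare, value = _OPS[op], int(literal)

    def predicate(obj: object) -> bool:
        # Check the class name, then compare the attribute as a number.
        return type(obj).__name__ == type_name and compare(getattr(obj, attr), value)

    return predicate


is_severe = compile_predicate("/CommonBaseEvent[@severity >= '10']")
events = [CommonBaseEvent(5, "info"), CommonBaseEvent(10, "warning"), CommonBaseEvent(50, "fatal")]
selected = [e.msg for e in events if is_severe(e)]
```

The expression is parsed once and the resulting function is reused for every event, which is the kind of saving a schema-aware engine can aim for.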
[Screenshot callouts: Event sources collection; Customizable Results/Summary area; Events detail area]

Filters are equivalent to Symptom Rules

This filter selects by Creation Time, using an XPath expression that can be generated by the Filter Builder

Filter Builder (Novice Users)

Powerful composition dialogs, while still showing full XPath syntax for power users

We associate visualization attributes with Symptom Rules
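
Attaching a visualization attribute to a rule can be sketched as a rule object that carries a color alongside its match condition. The names (`HighlightRule`, `highlight`), the severity thresholds, and the colors are hypothetical and only illustrate the association.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Event = Dict[str, object]


@dataclass(frozen=True)
class HighlightRule:
    name: str
    matches: Callable[[Event], bool]   # the filter / simple symptom rule
    color: str                         # visualization attribute for matching rows


def highlight(event: Event, rules: List[HighlightRule]) -> Optional[str]:
    """Return the color of the first rule that matches, or None for no highlight."""
    for rule in rules:
        if rule.matches(event):
            return rule.color
    return None


# Rules are checked in order, so the most specific rule comes first.
rules = [
    HighlightRule("fatal", lambda e: e["severity"] >= 50, "red"),
    HighlightRule("warning", lambda e: e["severity"] >= 30, "yellow"),
]
colors = [highlight({"severity": s}, rules) for s in (10, 30, 60)]
```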


Helpful Links
  • Autonomic Computing Enablement Site
    • http://acenablement.raleigh.ibm.com/
    • http://acenablement.raleigh.ibm.com/html/technology/pd/pddwnlds.html
  • Autonomic Computing
    • http://www.ibm.com/autonomic
  • Autonomic Computing Toolkit
    • http://www.ibm.com/developerworks/autonomic
  • Autonomic Computing Toolkit Download
    • http://www-106.ibm.com/developerworks/autonomic/probdet1.html
  • Common Base Event Version V1.0.1 (CBE)
    • http://dev.eclipse.org/viewcvs/indextools.cgi/~checkout~/hyades-home/docs/components/common_base_event/cbe101spec/CommonBaseEvent_SituationData_V1.0.1.pdf
  • WSDM Event Format V1.0 (WEF)
    • PART 1: http://docs.oasis-open.org/wsdm/2004/12/muws/cd-wsdm-muws-part1-1.0.pdf
    • PART 2: http://docs.oasis-open.org/wsdm/2004/12/muws/cd-wsdm-muws-part2-1.0.pdf
  • Common Event Infrastructure (CEI)
    • http://www.ibm.com/software/tivoli/features/cei/
    • http://www-106.ibm.com/developerworks/library-combined/ac-cei
Use Cases

[Figure labels: CBE Object; ACT/XPath; CEI/ESB; CBE Logs; XPath; Import; CBE XML Formatted Logs; SymptomDB. Use cases shown: Solution Problem Isolation & Analysis; Product Problem Isolation & Analysis; Solution Problem Isolation; Solution Problem Analysis]

Log and Trace Analyzer Tools Retrieve and Analyze CBE Log Data

LTA-Eclipse (Correlate/Analyze)

  • Event viewing
  • Merge/sort/filter
  • Event correlation
  • Cross-Event analysis (symptoms)
  • Remote/local data collection
  • Event conversion

LTA-Portal (Correlate/Analyze)

  • Event viewing
  • Merge/sort/filter
  • Event correlation
  • Cross-Event analysis (symptoms)
  • Remote/local data collection
  • Event conversion

LTA-JD (Triage) / LTA-JD (Analyze)

  • Event viewing
  • Merge/sort/filter
  • Single Event Analysis (highlighting/simple symptom rules)
  • Local data collection
  • Remote data collection from CEI server

[Figure labels: Applications; Generic Log Adapters (GLA); RAC (API); CBE Events; Triaged CBE Events; CBE XML; CBE XML Formatted Logs]


LTA-JD Performance
  • Evaluation of LTA-JD end-to-end (XML input, convert and process object, filter, display)
  • Evaluation of a simple FastXPath expression
    • /CommonBaseEvent[@severity >= '10'] on 100000 CBEs
    • FastXPath (157 millisecs), JXPath (468 millisecs), Xalan (1328 secs)
  • Better results with
    • smarter filters
    • bigger JVM heap
    • IBM JDK 1.5 (~60% improvement)
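
A comparable measurement can be set up as a small timing harness. The sketch below uses Python's standard XML parser on synthetic events, so its timings say nothing about FastXPath, JXPath, or Xalan and are not comparable to the figures above. The event layout (a single `CommonBaseEvent` element with `severity` and `msg` attributes) is an assumption for the example.

```python
import time
import xml.etree.ElementTree as ET

N = 100_000

# Synthetic input: one element per event, severity cycling through 0..69.
xml_events = [f'<CommonBaseEvent severity="{i % 70}" msg="event {i}"/>' for i in range(N)]

start = time.perf_counter()
# XML input, then convert each document into an element object.
elements = [ET.fromstring(x) for x in xml_events]
# Filter: the same condition as /CommonBaseEvent[@severity >= '10'].
matches = [e for e in elements if int(e.get("severity")) >= 10]
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(matches)} of {N} events matched in {elapsed_ms:.0f} ms")
```

Timing the parse and filter steps together matches the end-to-end framing of the evaluation; timing them separately would show where a smarter filter actually helps.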