
Capacity Planning for the Newer Workloads

Linwood Merritt

Capital One Services, Inc.


Disclaimer

  • These generic issues are addressed by this presentation:

    • Vendor capacity ratings

    • e-Commerce

    • Continuous availability

    • Data warehousing

    • Growth rates

  • This presentation contains no specific business-related information.

Introduction: Environment

  • Capital One

    • 5th largest card issuer in the United States

    • Added to the S&P 500 in 1998

    • Fortune 500 company (#260)

    • Managed loans at $48.6 billion as of Q1 2002

    • Accounts at 46.6 million as of Q1 2002

    • Fortune 100 “Best Places to Work in America”

    • CIO 100 Award “Master of the Customer Connection”

    • Information Week “Innovation 100” Award Winner

    • ComputerWorld “Top 100 places to work in IT”

Outline of Approach

  • Understand behavior and issues around workloads, hardware, and data

  • Create projections and build recommendations.

  • Report the findings.

Outline of Presentation

  • Discussion of workload types and capacity projection approaches

  • Overall summary of issues and approaches

  • Examples

What Workloads?

  • E-Commerce

  • Relational database systems

  • Mainframe-class UNIX

  • Multiple platforms

  • New characteristics

e-Commerce Workloads: Direct to Client (business-to-business)

  • Access

    • Internet

    • Leased line

  • Services

    • Point of Care / Point of Sale

    • Value-added analysis

e-Commerce Workloads: Direct to Customer

  • Access

    • Internet

    • Dial-in

  • Services

    • Marketing

    • Account query

e-Commerce Workloads: How to Predict

  • Take business projections of volumes or users (include fudge factor)

  • Estimate transaction volumes and CPU/transaction

  • Convert to normalized unit such as MIPS
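As a rough illustration of the three steps above, the following Python sketch turns a user projection into a MIPS figure. The function name, its parameters, and every number in the example are assumptions chosen for illustration, not values from the presentation. In particular, the conversion assumes that CPU time per transaction was measured on an engine with a known MIPS rating.

```python
def peak_mips(users, tx_per_user_hour, cpu_sec_per_tx, engine_mips, fudge=1.0):
    """Estimate peak-hour MIPS demand from a business projection.

    All parameters are hypothetical inputs:
      users            - projected user count from the business
      tx_per_user_hour - transactions each user submits in the peak hour
      cpu_sec_per_tx   - CPU seconds per transaction, measured on one engine
      engine_mips      - MIPS rating of the engine used for that measurement
      fudge            - multiplier covering uncertainty in the projection
    """
    tx_per_sec = users * fudge * tx_per_user_hour / 3600.0
    engines_busy = tx_per_sec * cpu_sec_per_tx  # CPU seconds consumed per second
    return engines_busy * engine_mips           # normalize to MIPS


# Illustrative numbers only: 10,000 users padded by 20%, 6 transactions per
# user in the peak hour, 0.05 CPU seconds each on a 200-MIPS engine.
demand = peak_mips(10_000, 6, 0.05, 200, fudge=1.2)
print(f"Projected peak demand: {demand:.0f} MIPS")
```

Keeping the fudge factor as an explicit argument makes it easy to show the business both the padded and unpadded figures.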

Relational Databases

  • Sub-second (OLTP), decision support / data mining

  • Distributed gateways

  • Database machines

  • Redundant data with extracts

  • How to predict: estimate a factor over current database demand or take usage estimates

Mainframe-Class Unix

  • Types: Mainframe USS or Linux, Future UNIX vendor offerings

  • Candidate applications

    • Web server

    • Vendor-ported applications

    • User-ported / new applications

  • How to predict:

    • Estimate by timeframe

    • Add factor to growth rates

Multiple Platforms

  • Mainframe: plan like existing applications (#users, transactions * CPU/transaction, application look-alikes, sizing tools)

  • Distributed: use vendor sizing, modeling tools, existing applications

  • Network: use network simulation tools, rules-of-thumb, bandwidth calculations

New Characteristics

  • External users

  • Continuous availability

  • New user interfaces

  • Cross-platform

External Users

  • Drive need for continuous availability

  • Different access patterns (e.g., doctor’s office vs. call center)

  • Service level measurement is harder: agents cannot easily be placed on external workstations

Continuous Availability

  • Driven by external users

  • 24x7 schedule

    • Application redesign

    • Data Sharing: CPU overhead

    • Coupling Facility

    • Expansion of “prime shift”

  • 99.999% “up time”

    • Redundancy, overhead

    • Availability reporting
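To put the 99.999% target in concrete terms, the sketch below converts an availability percentage into an outage budget and shows how redundancy raises combined availability. The function names are hypothetical, and the redundancy formula assumes independent failures, which real configurations only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Minutes of outage allowed per period at a given availability target."""
    return period_minutes * (1.0 - availability_pct / 100.0)


def redundant_availability_pct(component_pct, copies):
    """Availability of `copies` redundant components, any one of which suffices.

    Assumes failures are independent (an idealization).
    """
    unavailable = (1.0 - component_pct / 100.0) ** copies
    return 100.0 * (1.0 - unavailable)


print(f"99.999% allows {downtime_budget_minutes(99.999):.2f} minutes/year")
print(f"Two 99.9% components: {redundant_availability_pct(99.9, 2):.4f}%")
```

At "five nines" the yearly budget is a little over five minutes, which is why the slide pairs the target with redundancy and availability reporting.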

User Interfaces

  • TCP/IP has no “definite response”, so end-to-end response time is harder to measure

  • Multiple internal transactions per “mouse click”

  • Response time measurement:

    • Agent on workstations

    • Scripting from “robots”

Cross Platform Applications

  • Only unified view: simulation package

  • Each platform (“silo”) can be analyzed separately.

  • Different application development groups

  • May be able to cross-validate user numbers

Types of Implementation (1)

  • Standalone / “shrink-wrap”

  • Layered onto legacy applications

    • New mainframe application code

    • GUI front-end

    • Browser

    • Middle-tier (Unix or NT)

    • MQSeries - can add middle-tier and new mainframe applications

Types of Implementation (2)

  • Legacy extracts

  • Re-engineered legacy applications

    • Convergence of business rules / applications

    • Re-usable components

    • Redundant access

    • Salvage investment, fix Band-Aids

    • Simplify logic, reduce platform complexity

What Are We Analyzing? (Mainframe)

  • MIPS - growth, latent demand, software cost

  • Memory - track and watch 2 GB limit on central storage (goes away with 64-bit)

  • I/O - channels, gigabytes of disk, tape

  • Coupling Facility - Parallel Sysplex, Shared Data, continuous availability

  • Vendor upgrade paths

  • New partitions

What Are We Analyzing? (Distributed)

  • Number and types of platforms

  • CPU, memory, disk space

  • Bandwidth

  • Location of applications / processes

  • Platform limitations (CPU, memory)

  • Software pricing considerations

  • Porting opportunities

Measurement of New Workloads

  • Summarize by platform:

    • Workload rules (process or user names)

    • Processes by descending CPU%

  • Resources: CPU, memory, disk space, Coupling Facility, network traffic

  • Growth:

    • Resources/user/application

    • Number of users + application changes

Distributed Approach

  • Consider tiers of service (not currently at Capital One)

  • Address service level measurement issue

  • Implement reporting

  • Add to Capacity Plan

  • “Silo” vs. “Application”

Tiers of Service: “Platinum”

  • Most expensive

  • Modeling product

  • Install in one server for each major application, use collection product for other servers

Tiers of Service: “Gold”

  • Collection product

  • Capacity planning with Rules of Thumb

Tiers of Service: “Brass”

  • Least expensive (man-hours only)

  • “Native”

    • Unix scripts

    • NT PerfMon

Service Level Measurement

  • API call at workstation - “Application Response Measurement” (ARM) or Windows 2000 trace API calls

  • Agents: software tracing of Windows API calls - can be installed in a subset of end-user base (sampling)

  • Scripting (“robots”)

  • Stop watch sampling and logging
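The scripting (“robot”) option can be pictured as a small harness that repeats a scripted user action and records how long each run takes. This is a minimal Python sketch under stated assumptions: `scripted_transaction` is a stand-in for a real end-user action, and the nearest-rank percentile is one of several reasonable ways to summarize the samples.

```python
import math
import time


def time_transaction(transaction, samples=5):
    """Run `transaction` (a zero-argument callable) and return seconds per run."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        transaction()
        timings.append(time.perf_counter() - start)
    return timings


def percentile(values, pct):
    """Nearest-rank percentile; `pct` is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def scripted_transaction():
    # Stand-in for a real scripted user action (hypothetical).
    sum(range(10_000))


timings = time_transaction(scripted_transaction, samples=20)
print(f"95th percentile response: {percentile(timings, 95) * 1000:.3f} ms")
```

Reporting a high percentile rather than the mean keeps a few slow responses from being averaged away.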

Distributed Reporting

Add to Capacity Plan

Scope of Analysis

  • Silos

    • Look at each hardware/application environment independently.

  • Applications

    • Look at each application as a whole.

    • Application instrumentation

    • Inference: put platform silos together.

Analyzing the Data: Growth Rates

  • General list of business plans

  • List of technical scenarios

  • Timeline

  • Estimate median and maximum likely MIPS/CPU/users/business units

  • Derive scenario growth rates

Analyzing the Data: Additional Resources

  • Parallel Sysplex (Coupling Facility): important for continuous availability, level set functionality

  • Disk / channels / tape: disk megabytes, channel maximum, tape connectivity

  • Communications connectivity: new partitions for availability

  • Memory: 2 GB constraint, 64-bit


  • “Baseline” growth

  • “Scenario” growth

  • Independent events (merger/acquisition, potential major project)

Example 1: Mainframe Upgrade

  • Task force, led by Capacity Planner

  • Driven by expiring three-year lease (CPU replacement, three-year planning horizon)

  • “Vendor parade” - presentations and dialogues

    • Upgrade paths

    • Technology / service differences

    • References / site visits

    • Capacity sizing: MIPS charts, LSPR / sizing tools

Mainframe Upgrade Deliverables

  • Document

    • Business drivers and technical scenarios

    • Growth forecasts

    • Vendor options and growth paths

    • Coupling Facility / Parallel Sysplex

  • Evaluation

    • Difference thresholds: MIPS claims, price/MIPS, ICF

    • Differentiators

Business and Technical

Technical Scenarios

Consolidation of distributed servers

Continuous availability

Significant external business

Data Warehousing


Business Drivers

Cost management

External business

Improved data access

Business expansion


  • Make educated guess by timeframe for each scenario

  • Add to “baseline” growth

  • Convert to growth rate

  • Use both “baseline” and “scenario growth”

  • Compare maximum scenario growth to maximum for platform family
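The steps above amount to simple compound-growth arithmetic. The Python sketch below is one way to express them; the function names, the starting capacity, the scenario increments, and the platform ceiling are all invented for illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0


def bracket_growth(start, baseline_rate, years, scenario_adds):
    """Return (baseline_end, scenario_end, scenario_rate).

    scenario_adds holds the educated-guess demand each scenario contributes
    by the end of the horizon, in the same units as `start`.
    """
    baseline_end = start * (1.0 + baseline_rate) ** years
    scenario_end = baseline_end + sum(scenario_adds)
    return baseline_end, scenario_end, cagr(start, scenario_end, years)


# Hypothetical inputs: 1,000 MIPS today, 25% baseline growth, 3-year horizon,
# two scenarios adding 300 and 490.875 MIPS by the end of the horizon.
baseline_end, scenario_end, scenario_rate = bracket_growth(
    1000.0, 0.25, 3, [300.0, 490.875]
)
platform_max = 3000.0  # assumed largest configuration in the platform family

print(f"Baseline end:  {baseline_end:.1f} MIPS")
print(f"Scenario end:  {scenario_end:.1f} MIPS ({scenario_rate:.1%} per year)")
print(f"Fits platform family: {scenario_end <= platform_max}")
```

The baseline figure gives the lower end of the bracket and the scenario figure the upper end, so both can be carried into the comparison against the platform family.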

Impact Analysis


  • Initial muck exploitation with 250 users

  • First Parallel Sysplex exploitation

  • First mainframe Wk1 Application

  • (Potential acquisition)

  • Major Project A with 100 users, 150% CAGR

  • New DB2 functionality exploitation

  • 64-bit OS/390

  • Full Data Sharing exploitation (IMS, CICS, DB2)

  • Full subsystem redundancy (IMS, CICS, DB2)

  • 24x7 operation


Scenario Timeline

Vendor Upgrade Paths: Detail

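  • Use logarithms to find when growth reaches an upgrade threshold. Here CAGR is the annual growth factor (for example, 1.40 for +40%/Yr):

    Start * CAGR^x = Threshold

    x (years) = log(Threshold / Start) / log(CAGR)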

    Model      MIPS   MSU   +40%/Yr   +25%/Yr
    GS2068E     952   160   Aug-00    Sep-00
    GS2074E    1013   171   Oct-00    Dec-00
    GS2084E    1141   193   Apr-01    Jul-01
    GS2094E    1260   213   Sep-01    Dec-01
    GS2104E    1378   234   Nov-01    May-02
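The logarithm formula on this slide is easy to check in a few lines. The sketch below is in Python rather than the SAS used elsewhere in the presentation, and the function name is hypothetical. The example uses the first two MIPS ratings listed on the slide (952 and 1013) as the start and threshold.

```python
import math


def years_to_threshold(start, threshold, growth_factor):
    """Solve start * growth_factor**x = threshold for x (in years).

    growth_factor is 1 plus the annual growth rate, e.g. 1.40 for +40%/Yr.
    """
    return math.log(threshold / start) / math.log(growth_factor)


for label, factor in (("+40%/Yr", 1.40), ("+25%/Yr", 1.25)):
    months = 12 * years_to_threshold(952, 1013, factor)
    print(f"{label}: 952 -> 1013 MIPS takes about {months:.1f} months")
```

At +40%/Yr the step takes a little over two months and at +25%/Yr a little over three, which lines up with the month-to-month spacing of the dates shown for those two models.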

Vendor Upgrade Paths: Summary

Upgrade Document

Example 2: UNIX Modeling

  • Modeling product installed on MQSeries server

  • Application running with a known number of users

  • Projected rollout schedule used to drive model

  • Mainframe side: CICS application, IMS load

UNIX Platform Workloads

  • Two primary workloads:

    • MQSeries userids (mqm*) - memory intensive

    • Messaging application processes (MDA*) - “CPU intensive”

Workload Modeling Methodology

  • MQSeries - Calculate relative workload intensity, enter model ratio.

  • Messaging application processes - Keep constant until application is removed from platform (“design loop” - always uses 1 CPU). Must adjust across CPU upgrade to continue using 1 CPU.


Track Across Upgrade

Model Spreadsheet

Model Presentation

Timeframe: April 2000

#Users: 180, 100

Ratios: 1.27, 1.00

Comment: Add Event1 Users

Validation - Tracking Users (on mainframe)





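/* Select the user-count records and stamp each with the capture time. */
data ecld1;
  format date date.;
  format dt datetime.;
  /* The INFILE and INPUT statements that read recnum and rectype
     appear to be missing from this transcript. */
  if recnum =: '99999' and rectype =: 'TCSCONFG';
  dt = datetime();
  date = datepart(dt);
  hour = hour(dt);
run;

/* Apply the new observations to the permanent user-count table. */
data ecldpdb.users;
  update ecldpdb.users ecld1;
  by date hour;
run;

proc print;
  title 'Ecloud1 Users';
run;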

Example 3: Server Replacement

  • Project: replace “old” NT servers

  • Application: Imaging servers

  • Capacity sizing data:

    • Rules-of-thumb analysis by vendor, using projected claims/minute and processor clock speeds

    • Benchmark information

Server Replacement Process

  • Multiple servers: each server is a workload, must be sized separately.

  • Enumerate and measure servers.

  • Apply growth rates and determine processing power requirements for the replacements.

  • Research available configurations and order appropriate server configurations.

  • Track CPU utilization across the upgrades.

  • Update relative capacity specs for next upgrade.

Server Sizing

  • Find (or derive) benchmark capacity ratings for starting and replacement configurations.

  • Apply an estimate of current CPU utilization, a growth percentage, and a combined “peak/average” and performance buffer (+100% for this study).

  • Output: estimated percentages of a standard configuration. The number of estimated CPUs needed (23) came very close to the vendor’s original number of 24.
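A minimal sketch of that sizing arithmetic follows. The benchmark ratings, utilizations, and growth percentage are invented for illustration; only the +100% buffer comes from the slide. The function name is hypothetical, and the multiplication order is one plausible reading of the method.

```python
def required_fraction(old_rating, utilization, growth, buffer, new_rating):
    """Fraction of one replacement configuration that a server's load needs.

    old_rating / new_rating - benchmark capacity ratings in the same unit
    utilization             - current CPU busy as a fraction (0.60 = 60%)
    growth                  - expected growth as a fraction (0.50 = +50%)
    buffer                  - peak/average and performance buffer (1.00 = +100%)
    """
    required = old_rating * utilization * (1.0 + growth) * (1.0 + buffer)
    return required / new_rating


# Hypothetical servers: (benchmark rating, current CPU utilization).
servers = [(100.0, 0.60), (100.0, 0.30), (150.0, 0.40)]
fractions = [
    required_fraction(rating, util, growth=0.50, buffer=1.00, new_rating=400.0)
    for rating, util in servers
]

for (rating, util), frac in zip(servers, fractions):
    print(f"rating {rating:.0f} at {util:.0%} busy -> {frac:.1%} of a standard config")
print(f"Total: {sum(fractions):.3f} standard configurations")
```

Because each server is sized separately, the per-server percentages can be summed or rounded up individually, depending on whether workloads will share replacement hardware.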

Sizing Spreadsheet

Example 4: Hundreds of Servers

  • Data capture

  • Reporting

  • Business drivers

Data Capture

  • Time-based scheduling product

  • Script-based data “pull”

  • Issue: data loss, time to find and rebuild

  • Potential fixes:

    • Product

    • Data “push” from servers

Data Reporting, Analysis

  • Color-based “health index” (Concord NetHealth metric).

  • Statistical Analysis (over two standard deviations from mean)

  • Thumbnail drilldown graphs

  • Automatic generation of HTML

  • “Treemap” graphs
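The two-standard-deviation test listed above can be sketched as follows. The function names and the utilization history are made up, and using the sample standard deviation of a fixed history window is an assumption about how the limits might be set.

```python
from statistics import mean, stdev


def control_limits(history, sigmas=2.0):
    """Return (lower, upper) limits at `sigmas` standard deviations from the mean."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def is_exception(history, latest, sigmas=2.0):
    """True when the latest reading falls outside the control limits."""
    low, high = control_limits(history, sigmas)
    return latest < low or latest > high


# Hypothetical daily CPU-busy percentages for one server.
history = [40, 42, 38, 41, 39, 40, 43, 37]
low, high = control_limits(history)
print(f"Limits: {low:.1f}% to {high:.1f}%")
for reading in (43, 45):
    print(f"{reading}% flagged: {is_exception(history, reading)}")
```

With hundreds of servers, a test like this lets the report highlight only the servers whose latest reading departs from their own history.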

Health Index *

* Concord NetHealth metric

Statistical Process Control


Thumbnail HTML

Automatic Generation of HTML

  • Driven by “matrix”

    • Originally spreadsheet

    • Converted to relational database

    • Ultimate capacity planning solution: information by server, application, platform, business driver

  • SAS code - builds web pages and hyperlinks
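The idea of driving page generation from the “matrix” can be illustrated with a short sketch. The presentation did this in SAS; the Python below is a stand-in, and the matrix fields, server names, and page naming scheme are all hypothetical.

```python
from html import escape
from urllib.parse import quote


def build_index(matrix):
    """Build an HTML index page with one hyperlinked row per matrix entry."""
    rows = []
    for entry in matrix:
        name = entry["server"]
        rows.append(
            f'<tr><td><a href="{quote(name)}.html">{escape(name)}</a></td>'
            f'<td>{escape(entry["application"])}</td>'
            f'<td>{escape(entry["platform"])}</td></tr>'
        )
    header = "<tr><th>Server</th><th>Application</th><th>Platform</th></tr>"
    return (
        "<html><body><table>\n"
        + header + "\n"
        + "\n".join(rows)
        + "\n</table></body></html>"
    )


# Hypothetical matrix rows; a real matrix would come from the database.
matrix = [
    {"server": "srv01", "application": "Imaging", "platform": "NT"},
    {"server": "srv02", "application": "R&D Messaging", "platform": "UNIX"},
]
print(build_index(matrix))
```

Because the page is rebuilt from the matrix each time, adding a server to the matrix is enough to add it to the report.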











Paper by Ben Shneiderman, University of Maryland,

Business Drivers

  • Capacity Councils - business units responsible for capacity planning of “demand” side

  • Capacity Planners - build projections based on business drivers and historical trending

Business Driver Based Forecasts











Regression Analysis

Input = CPU and Business Drivers by month

Output = Coefficients








By month (input = Widgets, Gadgets, Customers):

projection = Widgets*f1 + Gadgets*f2 + Customers*f3;
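The regression step can be sketched with ordinary least squares and no intercept term, matching the form of the projection formula. This is an illustrative Python version, not the presentation's own code; the function names and the monthly data are invented, and the CPU values are generated from known coefficients so the fit can be checked.

```python
def fit_coefficients(drivers, cpu):
    """Least-squares coefficients for cpu ~ sum(driver_i * f_i), no intercept.

    Solves the normal equations (X'X) f = X'y by Gaussian elimination.
    """
    n = len(drivers[0])
    a = [[sum(row[i] * row[j] for row in drivers) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * y for row, y in zip(drivers, cpu)) for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def projection(driver_row, coeffs):
    """Apply the fitted coefficients to one month of business drivers."""
    return sum(d * f for d, f in zip(driver_row, coeffs))


# Hypothetical months of (Widgets, Gadgets, Customers).
drivers = [(100, 20, 1000), (120, 25, 1100), (90, 30, 1200),
           (150, 22, 1300), (130, 35, 1250)]
true_f = (0.02, 0.05, 0.001)  # made-up coefficients used to generate CPU
cpu = [projection(row, true_f) for row in drivers]

f1, f2, f3 = fit_coefficients(drivers, cpu)
print(f"f1={f1:.4f} f2={f2:.4f} f3={f3:.4f}")
print(f"Forecast for (160, 40, 1400): {projection((160, 40, 1400), (f1, f2, f3)):.2f}")
```

With real data the fit will not be exact, so the residuals are worth reviewing before the coefficients are used for forecasting.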

Graphical Output

Widgets Gadgets Customers

Enterprise “Capacity at a Glance”


  • Access patterns and schedules

  • Platforms (more types and numbers)

  • Resources (what to track)

  • Levels of capacity management

  • Reporting of utilization and service levels, for large numbers of platforms

  • Higher availability (redundancy, reporting)

  • Deriving and reporting projections

Summary: Deriving Projections

  • Basic capacity planning:

    • Growth rates

    • Upgrade thresholds

  • Aggressive estimate of “scenario” demand

  • Bracket growth:

    • Lower end: “baseline”

    • Upper end: “scenarios”

Summary: Types of Projections

  • Number of transactions

  • Number of users

  • Number of platforms

  • Application sizing input

  • Application complexity

  • Fraction of an existing workload

  • Growth rate

Summary: Capacity Planning

  • Projections based on application and platform

  • Levels of capacity planning service

  • Report on all enterprise resources

  • Organize data with “matrix” database
