
Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale



Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale. Daniel Ling Senior Solution Specialist Information Management Software, IBM HK. Now everything Oracle got on database, DB2 has it and works better. Database High Availability Options.


Presentation Transcript



Active-Active Zero Downtime and Unlimited Scalability with DB2 pureScale

Daniel Ling

Senior Solution Specialist

Information Management Software, IBM HK



Now everything Oracle has in its database, DB2 has too, and it works better



Database High Availability Options


Server-Based Failover (i.e., most OS clustering)

  • Integrated with Tivoli System Automation cluster manager

  • (included in both DB2 Enterprise and DB2 Workgroup without charge)

    • - Node Failure Detection

    • - Disk takeover

    • - IP takeover

    • - Restart DB2

Active Server



DB2 (HADR) – database log shipping HA

  • Redundant copy of the database to protect against site or storage failure

  • Support for Rolling Upgrades

  • Failover in under 15 seconds

    • Real SAP workload with 600 SAP users – database available in 11 sec.

  • 100% performance after primary failure

  • Included in DB2 Enterprise and DB2 Workgroup without charge

Automatic Client Reroute

Client application automatically resumes on Standby

  • DB2 High Availability Disaster Recovery (HADR) enables highly available database standby

  • Failover in a minute

  • DB2 and MSSQL (2 modes)

    • Sync and Async

  • DB2 (3 modes)

    • Semi-sync

    • Minimal performance delay

    • Assures integrity

[Diagram: Primary Server and Standby Server joined by a network connection; HADR keeps the two servers in sync]
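The client-side effect of Automatic Client Reroute can be pictured with a small Python sketch. This is only an illustration of the idea (try the primary, then resume on the standby); the names `connect_with_reroute`, `fake_connect` and `ServerDown` are hypothetical and are not part of any DB2 driver, which handles rerouting internally.

```python
class ServerDown(Exception):
    """Raised by the hypothetical connect function when a server is unreachable."""


def connect_with_reroute(servers, connect):
    """Try each server in order (primary first) and return the first live connection."""
    last_error = None
    for server in servers:
        try:
            return connect(server)
        except ServerDown as err:
            last_error = err  # fall through to the next (standby) server
    raise ConnectionError("no server reachable") from last_error


def fake_connect(server, down=frozenset({"primary"})):
    # Stand-in for a real driver call; pretends the primary has failed.
    if server in down:
        raise ServerDown(server)
    return f"connected to {server}"


result = connect_with_reroute(["primary", "standby"], fake_connect)
```

With the primary marked as down, `result` is the connection to the standby, and the calling application never has to know which server answered.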



Critical IT Applications Need Reliability and Scalability

  • Local Databases are Becoming Global

    • Successful global businesses must deal with exploding data and server needs

    • Competitive IT organizations need to handle rapid change

  • Customers need a highly scalable, flexible solution for the growth of their information with the ability to easily grow existing applications

  • Down-time is Not Acceptable

    • Any outage means lost revenue and permanent customer loss

    • Today’s distributed systems need reliability



Introducing DB2 pureScale (Active-Active Shared Disk)

  • Unlimited Capacity

    • Buy only what you need, add capacity as your needs grow

  • Application Transparency

    • Avoid the risk and cost of application changes

  • Continuous Availability

    • Deliver uninterrupted access to your data with consistent performance


DB2 pureScale Architecture

  • Automatic workload balancing

  • Cluster of DB2 nodes running on Power servers

  • Leverages the global lock and memory manager technology from z/OS

  • Integrated Cluster Manager

  • InfiniBand network & DB2 Cluster Services

  • Shared Data

  • Now available on

    • AIX InfiniBand

    • Intel InfiniBand

    • Intel Ethernet

    • AIX Ethernet (target 2Q 11)



The Key to Scalability and High Availability

  • Efficient Centralized Locking and Caching

    • As the cluster grows, DB2 maintains one place to go for locking information and shared pages

    • Optimized for very high speed access

      • DB2 pureScale uses Remote Direct Memory Access (RDMA) to communicate with the PowerHA pureScale server

      • No IP socket calls, no interrupts, no context switching

  • Results

    • Near Linear Scalability to large numbers of servers

    • Constant awareness of what each member is doing

      • If one member fails, no need to block I/O from other members

      • Recovery runs at memory speeds

[Diagram: three members connected to the CF (PowerHA pureScale), which holds the Group Buffer Pool and the Group Lock Manager]
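The "one place to go for locking information" idea can be sketched as a toy lock table in Python. This is a simplified illustration under stated assumptions: the class `GroupLockManager` and its methods are hypothetical names, locks are exclusive only, and nothing here models RDMA or the real CF protocol.

```python
class GroupLockManager:
    """Toy centralized lock table: one place that knows every lock in the cluster."""

    def __init__(self):
        self._owners = {}  # resource -> member holding the exclusive lock

    def acquire(self, member, resource):
        """Grant the lock if it is free or already held by this member."""
        owner = self._owners.setdefault(resource, member)
        return owner == member

    def release(self, member, resource):
        """Release the lock only if this member actually holds it."""
        if self._owners.get(resource) == member:
            del self._owners[resource]

    def locks_held_by(self, member):
        """What a failed member was updating: the only data that must stay locked."""
        return sorted(r for r, m in self._owners.items() if m == member)


glm = GroupLockManager()
first = glm.acquire("member1", "row:42")     # granted
blocked = glm.acquire("member2", "row:42")   # refused, member1 holds it
other = glm.acquire("member2", "row:7")      # granted, unrelated row
held = glm.locks_held_by("member1")
```

Because the table is central, a failure of `member1` leaves `held` as the exact list of resources to keep locked, while `member2` keeps working on everything else.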



Recover Instantaneously From Node Failure

- Using RDMA (Remote Direct Memory Access)

  • Protect from infrastructure related outages

    • Redistribute workload to surviving nodes immediately

    • Completely redundant architecture

    • Recover in-flight transactions on failing node in as little as 15 seconds including detection of the problem

Application Servers and DB2 Clients



Minimize the Impact of Planned Outages

  • Keep your system up

    • During OS fixes

    • HW updates

    • Administration

  • Steps

    • Identify Member

    • Do Maintenance

    • Bring node back online


Online Recovery

[Chart: % of Data Available over Time (~seconds) after a database member failure; only data with in-flight updates is locked during recovery]

  • DB2 pureScale design point is to maximize availability during failure recovery processing

  • When a database member fails, only in-flight data remains locked until member recovery completes

    • In-flight = data being updated on the failed member at the time it failed

  • Time to row availability

    • <20 seconds

[Diagram: four DB2 members, each with its own log, over shared data and two CFs]


Compare to traditional technology – data availability

DB2 pureScale

  • 1: Member failure

  • 2: Recovery performed on another host

  • 2a: CF services most page requests from memory (hot pages)

  • Survivors can get locks throughout; the CF holds the global lock state

  • Only data that was being updated on the failed database member is (temporarily) locked

Oracle RAC Shared Disk

  • 1: Node failure

  • 2: Lock remaster/rebuild; lock requests frozen until lock state ‘rediscovered’ from log of failed node

  • 3: Another machine performs recovery

  • 3a: More random disk I/Os needed

  • Partial or full “freeze” for global lock state re-master/re-build

[Chart: % of Data Available over Time (~seconds) after the failure, for both systems]

This example assumes about 5% of the database data was being updated on the database node that failed, at the time of the failure.
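As a worked example of the 5% assumption on this slide, the short Python sketch below computes the share of data that stays available when only in-flight rows are locked. The function name and the row counts are hypothetical; the 5% figure is the one quoted on the slide.

```python
def percent_available(total_rows, in_flight_rows):
    """Share of rows still readable and writable while in-flight rows stay locked."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return 100.0 * (total_rows - in_flight_rows) / total_rows


# 5% of a hypothetical 1000-row table was being updated on the failed member.
available = percent_available(1000, 50)
```

Under that assumption, 95% of the data remains available throughout member recovery.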



Automatic Workload Balancing and Routing

  • Run-time load information used to automatically balance load across the cluster (as in Z sysplex)

    • Load information of all members kept on each member

    • Piggy-backed to clients regularly

    • Used to route next connection or optionally next transaction to least loaded member

    • Routing occurs automatically (transparent to application)

  • Failover: load of failed member evenly distributed to surviving members automatically

    • Once the failed member is back online, fallback does the reverse

  • Affinity-based routing, and failover and fallback also possible

[Diagram: clients routed across the members; each member has its own transaction logs, and all members share the data]
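The routing and failover rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the client already holds a load figure per member; `route_next_connection`, `fail_over` and the load numbers are hypothetical and do not reflect the actual DB2 client algorithm.

```python
def route_next_connection(load_by_member):
    """Pick the least loaded member; ties go to the lowest member name."""
    return min(sorted(load_by_member), key=lambda m: load_by_member[m])


def fail_over(load_by_member, failed):
    """Spread the failed member's load evenly over the surviving members."""
    survivors = {m: load for m, load in load_by_member.items() if m != failed}
    if not survivors:
        raise RuntimeError("no surviving members")
    share = load_by_member[failed] / len(survivors)
    return {m: load + share for m, load in survivors.items()}


loads = {"m1": 30, "m2": 10, "m3": 20}   # hypothetical load figures
target = route_next_connection(loads)    # least loaded member
after_failure = fail_over(loads, "m1")   # m1's load split between m2 and m3
```

Here the next connection goes to `m2`, and after `m1` fails its load of 30 is split evenly, giving `m2` 25 and `m3` 35.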



Reduce System Overhead by Minimizing Inter-node Communication

DB2 pureScale’s central locking and memory manager minimizes communication traffic

Other database software requires CPU-intensive communication between all servers in a cluster

DB2 pureScale grows efficiently as servers are added

Other database software wastes more and more CPU as it grows


The Result

  • 2, 4 and 8 Members: Over 95% Scalability

  • 16 Members: Over 95% Scalability

  • 32 Members: Over 95% Scalability

  • 64 Members: 95% Scalability

  • 88 Members: 90% Scalability

  • 112 Members: 89% Scalability

  • 128 Members: 84% Scalability

Validation testing includes capabilities to be available in future releases.
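One way to read these figures is sketched below in Python. It assumes that "N% scalability" means throughput relative to perfectly linear scaling, which the slide does not state explicitly; the function name `effective_members` is hypothetical, and only the exact percentages quoted on the slide are used.

```python
# Scalability figures quoted on the slide (members -> fraction of linear scaling).
RESULTS = {64: 0.95, 88: 0.90, 112: 0.89, 128: 0.84}


def effective_members(members, scalability):
    """Throughput expressed as the number of perfectly scaling members it equals."""
    return members * scalability


equivalents = {m: round(effective_members(m, s), 2) for m, s in RESULTS.items()}
```

Under that reading, 128 members at 84% deliver roughly the throughput of 107.5 ideal members, still more than the 99.7 of the 112-member configuration.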



IT Needs to Adapt in Hours…Not Months

  • Handling Change is a Competitive Advantage

  • Dynamic Capacity is not the Exception

    • Over-provisioning to handle critical business spikes is inefficient

    • IT must respond to changing capacity demand in days, not months

  • Businesses need to be able to grow their infrastructure without adding risk

  • Application Changes are Expensive

    • Changes to handle more workload volume can be costly and risky

    • Developers rarely design with scaling in mind

    • Adding capacity should be stress free


DB2 now has built-in Oracle Compatibility

Changes are the exception. Not the rule.

THIS IS WHY WE CALL IT ENABLEMENT AND NOT A PORT!

PL/SQL = Procedural Language/Structured Query Language



Concurrency prior to DB2 v9.7

  • Oracle default / DB2 9.7 default

    • Statement level snapshot

  • DB2 before 9.7

    • Cursor stability

* In default isolation DB2 keeps no rows locked while scanning

Enabling an Oracle application on DB2 required significant effort to reorder table access to avoid deadlocks


SQL Procedural Language (SQL PL) enhancements

  • Advancements in DB2 PL/SQL

  • New SQL, stored procedures and triggers

[Diagram: PL/SQL compiler]

PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's procedural extension language for SQL and the Oracle relational database. PL/SQL's general syntax resembles that of Ada (which extends Pascal). Source: www.wikipedia.org


DB2 now allows both Shared-Disk and Shared-Nothing scale-out designs

  • Shared-Disk (best for transactions): DB2 pureScale Feature. Balance CPU nodes with shared disk and memory

  • Shared-Nothing (best for data warehouses): DB2 Database Partitioning Feature. Balance each node with dedicated CPU, memory and storage


Parallel Processing Across Data Modules

[Diagram: 16 partitions ("part") holding small tables and large tables]

  • Partitioned Database Model

  • Database is divided into multiple partitions

  • Partitions are spread across data modules

  • Each Partition has dedicated resources – CPU, memory, disk

  • Parallel Processing occurs on all partitions and is coordinated by the DBMS

  • Single system image to user and application

[Diagram labels: Corporate network, Foundation, SQL, 10 Gb Ethernet]


Parallel Query Processing

select sum(x) from table_a,table_b where a = b

[Diagram: the client connects to a coordinator agent, which gets statistics from the catalog and optimizes the query. Agents on Part1, Part2, Part3 ... PartN each read A and B, join them, and compute a partial sum (sum=10, sum=12, sum=13, sum=11). The coordinator adds the partial sums to return sum(…) = 46.]
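The flow on this slide (join and sum on every partition, then a final sum at the coordinator) can be sketched in Python. This is an illustration only: the sample rows are made up so that the partial sums match the slide's 10, 12, 13 and 11, column x is assumed to live in table_a, and rows with equal join keys are assumed to sit on the same partition. Threads stand in for the per-partition agents.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition):
    """Join table_a and table_b on a = b inside one partition and sum x."""
    rows_a, rows_b = partition           # rows_a: (a, x) pairs, rows_b: b values
    b_counts = Counter(rows_b)           # how many table_b rows match each key
    return sum(x * b_counts[a] for a, x in rows_a)


def parallel_sum(partitions):
    """Coordinator: run every partition's work in parallel, then add the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_sum, partitions))
    return partials, sum(partials)


# Hypothetical data, one (table_a rows, table_b rows) pair per partition.
PARTITIONS = [
    ([(1, 4), (2, 6)], [1, 2]),
    ([(3, 12)], [3]),
    ([(4, 6), (5, 7), (6, 100)], [4, 5]),   # key 6 has no match in table_b
    ([(7, 11)], [7]),
]

partials, total = parallel_sum(PARTITIONS)
```

With this data `partials` is `[10, 12, 13, 11]` and `total` is 46, the same numbers shown in the slide's diagram.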


Automatic Data Distribution

[Diagram: on Insert/Load, rows of a table are hashed on trans_id (DISTRIBUTE BY HASH (trans_id)) across Partition 1, Partition 2 and Partition 3 of the database]

CREATE TABLE sales(trans_id, col2, col3, …)

DISTRIBUTE BY HASH (trans_id)
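The effect of hashing on trans_id can be illustrated with a short Python sketch. It assumes three partitions and uses `zlib.crc32` purely as a stand-in hash; DB2's own hashing function and distribution map are different, and `partition_for` is a hypothetical name.

```python
import zlib
from collections import Counter


def partition_for(trans_id, num_partitions):
    """Map a distribution-key value to a partition number (0-based)."""
    digest = zlib.crc32(str(trans_id).encode("utf-8"))
    return digest % num_partitions


# Place 3000 hypothetical transactions on 3 partitions and count rows per partition.
placement = Counter(partition_for(trans_id, 3) for trans_id in range(3000))
```

The same trans_id always lands on the same partition, and `placement` shows how the rows spread across the partitions without the application choosing a location.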



Hash Partitioning – “Divide and Conquer”

With IBM’s DB2 with the Database Partitioning Feature (DPF), the query may still read most of the data, but now this work can be attacked on all nodes in parallel.

[Diagram: data hashed across partitions P 1, P 2 and P 3]


Range (Table) Partitioning Reduces I/O

SELECT NAME,TOTAL_SPEND,LOYALTY_TIER from CUSTOMERS where REGION= and MONTH=‘Mar’

[Diagram: hash partitions P 1, P 2 and P 3, with range partitions for Jan, Feb and Mar]
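Why range partitioning cuts I/O can be shown with a small Python sketch: a query on MONTH only touches the matching range and skips the others. The customer rows and the region value "East" are hypothetical (the region value is missing from the slide's query), and `select_customers` is an illustrative name, not a DB2 interface.

```python
def select_customers(partitions, month, region):
    """Read only the range partition for `month`, then filter by region."""
    rows = partitions.get(month, [])      # other months are never read
    matches = [row["NAME"] for row in rows if row["REGION"] == region]
    return matches, len(rows)             # second value: rows actually read


# Hypothetical CUSTOMERS data, already split into monthly ranges.
CUSTOMERS = {
    "Jan": [{"NAME": "Ann", "REGION": "East"}, {"NAME": "Bob", "REGION": "West"}],
    "Feb": [{"NAME": "Cho", "REGION": "East"}, {"NAME": "Dev", "REGION": "West"}],
    "Mar": [{"NAME": "Eve", "REGION": "East"}, {"NAME": "Fay", "REGION": "West"}],
}

names, rows_read = select_customers(CUSTOMERS, "Mar", "East")
```

Only the two Mar rows are read out of six in total, which is the I/O saving the slide refers to.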



Multi Dimensional Clustering Reduces I/O

I/O problem solver

With MDC, data is further clustered by multiple attributes

Now even less I/O is done to retrieve the records of interest

Less I/O per query leads to more concurrency

[Diagram: hash partitions P 1, P 2 and P 3, range partitions for Jan, Feb and Mar, and MDC cells within them]
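Clustering by multiple attributes can be pictured as grouping rows into cells keyed by every clustering dimension, as in the Python sketch below. The dimensions REGION and TIER, the sample rows and the function `build_cells` are all hypothetical, and the sketch ignores how DB2 stores MDC blocks on disk.

```python
from collections import defaultdict


def build_cells(rows, dimensions):
    """Group rows into cells keyed by the values of all clustering dimensions."""
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cells[key].append(row)
    return cells


# Hypothetical customer rows clustered on two dimensions.
ROWS = [
    {"REGION": "East", "TIER": "Gold", "NAME": "Ann"},
    {"REGION": "East", "TIER": "Silver", "NAME": "Bob"},
    {"REGION": "West", "TIER": "Gold", "NAME": "Cho"},
    {"REGION": "East", "TIER": "Gold", "NAME": "Dev"},
]

cells = build_cells(ROWS, ("REGION", "TIER"))
gold_east = [row["NAME"] for row in cells[("East", "Gold")]]
```

A query for East and Gold customers reads one cell (Ann and Dev) instead of scanning every row, so less I/O is needed per query.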

