storage network designs for oltp business continuity

Storage Network Designs for OLTP Business Continuity

Marc Farley

President, Building Storage Networks, Inc.

agenda
Agenda
  • The Vendor Neutral Approach
  • Overview of OLTP & High Availability
  • I/O Redundancy Methods
  • Storage Network Technologies
  • Storage Networking for HA OLTP
vendor neutral approach
Vendor Neutral Approach
  • Generic terms, not vendor terms
  • Assumed basic knowledge of SAN, NAS, RAID
oltp environments
OLTP Environments
  • Mission critical business applications
    • Business in real-time
  • Expensive equipment and software
  • Aggressive performance objectives
  • Highly skilled IT staff
    • Hands-on computing operations
oltp database software
OLTP Database Software
  • Oracle
    • 8i Oracle Parallel Server (OPS)
    • 9i Real Application Clusters (RAC)
  • IBM
    • DB2 UDB
    • Informix
  • MS SQL Server
  • Sybase, MySQL, others
oltp os platforms
OLTP OS Platforms
  • IBM S/390 MVS
  • Unix Systems
  • Windows 2000+
  • HA Linux
oltp requirements
OLTP Requirements
  • 99.999% uptime
  • Non-degrading response time
  • High transaction rates
  • Seamless scalability
  • Cost relief
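
To make the 99.999% figure concrete, the short sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365.25-day year are illustrative assumptions, not something the slides specify.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At "five nines" the budget is a little over five minutes per year, which is why planned maintenance and failover time both have to be counted against it.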
database storage approaches
Database Storage Approaches
  • Raw partitions
    • Bypass OS I/O buffering
  • File system
    • Facilitates data management
  • NFS mounted
    • Offloads the DB server, e.g. NetApp (NTAP) + Oracle
acid properties of oltp
ACID Properties of OLTP
  • Atomicity: No partial transactions
  • Consistency: All tables are in a consistent state before and after a completed transaction
  • Isolation: One transaction cannot contaminate other transactions
  • Durability: Transactions are complete only when the database updates are written to disk storage
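
As a rough illustration of atomicity and consistency, here is a minimal sketch using Python's built-in sqlite3 module rather than any of the OLTP databases named in this deck. The `accounts` table, the `transfer` helper, and the balance constraint are hypothetical. An in-memory database is used, so durability is not demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds in one transaction; any failure rolls back both updates."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


print(transfer(conn, "a", "b", 500), balances(conn))  # overdraft is rejected
print(transfer(conn, "a", "b", 30), balances(conn))   # valid transfer commits
```

The rejected transfer leaves no partial credit on account `b`: the first update is undone when the second one violates the constraint.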
challenges of oltp
Challenges of OLTP
  • Major systems integration effort
    • Intricate tuning and monitoring
    • Little tolerance for errors
  • Complex data structures & relationships
  • Time and sequence-sensitive processes
    • Must be adhered to for data integrity
  • Shifting workloads and bottlenecks
oltp database files
OLTP Database Files
  • Data files
    • Database data, tablespaces
  • Redo log files, archive log files
    • Reconstruct or rollback transactions
  • Control files
    • File layout information
oltp table space storage
OLTP Table Space Storage
  • Use many spindles to distribute hot spots
  • RAID 0+1 recommended
  • File system recommended over raw partitions
    • Easier data management
striping for performance
Striping for Performance

[Diagram: a RAID controller (microsecond performance) striping across six disk drives (millisecond performance, from rotational latency and seek time)]
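
One way to picture striping is as address arithmetic: consecutive stripe units land on consecutive drives, so a burst of I/O is spread over many spindles. The sketch below assumes a simple round-robin RAID 0 layout; the function name and the stripe-unit size are illustrative and do not describe any particular controller.

```python
def locate_block(logical_block: int, num_drives: int, stripe_unit_blocks: int):
    """Map a logical block to (drive index, block offset on that drive)."""
    stripe_unit, offset_in_unit = divmod(logical_block, stripe_unit_blocks)
    drive = stripe_unit % num_drives   # round-robin across the drives
    row = stripe_unit // num_drives    # how many full stripes precede this unit
    return drive, row * stripe_unit_blocks + offset_in_unit


# Six drives with 8-block stripe units: blocks 0-7 go to drive 0, 8-15 to drive 1, ...
for block in (0, 8, 47, 48, 53):
    print(block, locate_block(block, num_drives=6, stripe_unit_blocks=8))
```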

my personal favorite raid 0 1
My Personal Favorite, RAID 0+1

[Diagram: a RAID controller with ten disk drives arranged as mirrored pairs of striped members, numbered 1 to 5]
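
The following sketch shows how a write might be placed when ten drives are organized as five mirrored pairs with data striped across the pairs. The drive numbering (pair `p` uses drives `2p` and `2p + 1`), the stripe-unit size, and the function names are assumptions made for illustration only.

```python
def mirrored_stripe_targets(logical_block, num_pairs=5, stripe_unit_blocks=8):
    """Return the two (drive, block) locations that hold a logical block."""
    unit, offset = divmod(logical_block, stripe_unit_blocks)
    pair = unit % num_pairs    # stripe across the mirrored pairs
    row = unit // num_pairs
    drive_block = row * stripe_unit_blocks + offset
    primary, mirror = 2 * pair, 2 * pair + 1  # assumed drive numbering
    return [(primary, drive_block), (mirror, drive_block)]


def usable_capacity_gb(num_drives, drive_gb):
    """Mirroring halves the raw capacity."""
    return num_drives // 2 * drive_gb


print(mirrored_stripe_targets(0))
print(mirrored_stripe_targets(8))
print(usable_capacity_gb(10, 73))
```

Every block lives on two drives, so usable capacity is half of raw capacity; that is the cost paid for the redundancy and the extra read bandwidth.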

oltp redo log storage
OLTP Redo Log Storage
  • Raw partitions recommended
    • Sequential high speed writes
  • Separate mirror pairs per log file group
  • Capacity for 30 – 60 minutes of data
  • Goal is to limit disk contention for current and active log files
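
The 30 to 60 minute guideline turns into a capacity figure once a redo generation rate is known. The sketch below is simple arithmetic; the 2 MB/s rate is a made-up example and should be replaced with a measured value.

```python
def redo_log_capacity_mb(redo_rate_mb_per_s: float, minutes: float) -> float:
    """Capacity needed to hold `minutes` of redo at a steady generation rate."""
    return redo_rate_mb_per_s * 60 * minutes


# Hypothetical workload generating 2 MB of redo per second.
for minutes in (30, 60):
    print(minutes, "minutes:", redo_log_capacity_mb(2.0, minutes), "MB")
```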
oltp archive log storage
OLTP Archive Log Storage
  • File system or NFS mounting is required
    • NFS mounting is recommended
  • Mirroring or RAID
  • Goal is to have easy access in case they are needed for reconstruction
high availability
High Availability
  • The ability for a system or application to immediately continue its mission after loss or damage to system components, systems, facilities and data
availability threats
Availability Threats
  • Expected
    • Scaling limitations
    • Processor
    • Storage capacity
    • Network
    • Consolidations
    • Product life cycles
  • Unexpected
    • Failures
    • Bugs
    • Virus
    • Operator errors
    • Disasters
ha engages all elements
HA Engages All Elements
  • Systems
    • Application
  • Network connections
    • Network services
  • Storage and I/O subsystems
managing the risks
Managing the Risks
  • Local copies of data
    • Immediate availability
  • Remote nearby
    • Immediate availability to several hours
  • Remote far away
    • One to several days availability
disaster availability radii
Disaster/Availability Radii

[Diagram: three radii: Local, Remote Nearby, Remote Far Away]

nobody expects
Nobody Expects…..
  • Weird things to happen to them
  • Disintegration of media
  • Underground flooding through tunnels
  • Fires in Telco switching centers
high availability for oltp
High Availability for OLTP
  • Duplication of functions
    • Without degrading performance
    • Without risking data integrity
  • Brute force techniques
  • Automation and efficiency
  • Cost is always an issue
    • And high availability DOES cost
a long time ago in a job not so far away

A Long Time Ago in a Job Not So Far Away…

[Cartoon: Jedi Jim Gast and Marc Skyfaller Farley]

You must learn to be a master of redundancy if you are going to be a storage geek.

Remember Marc, there is only one concept:

REDUNDANCY!

Redundancy. Again!

Got it Jim. Let’s Eat!

Whatever

eventually i learned to appreciate his teachings
Eventually, I Learned to Appreciate His Teachings……
  • REDUNDANCY: NSPoF (No Single Point of Failure)

Don’t get the giant spicy Polish for lunch; it’s too much for the digestion

oltp ha requires complete redundancy protection
OLTP HA Requires Complete Redundancy Protection
  • Client network
  • Server systems and components
  • Application modules
  • I/O Channels and Networks
  • Storage subsystems and components
  • Data
a quick look at clustered storage
A Quick Look At Clustered Storage

  • Shared Everything
    • Both servers share control of a common storage address space
  • Shared Nothing
    • Each server controls its own storage address space

examples of oltp clusters
Examples of OLTP Clusters

  • Microsoft SQL Server
    • Data is exchanged between servers
    • Failover paths only
  • Oracle 9i RAC
    • Data is accessed directly from storage

one more time with subsystems
One more time, with subsystems…

  • Microsoft SQL Server
    • Same subsystem but different address spaces
  • Oracle 9i RAC
    • All storage is shared by all cluster nodes

i o redundancy
I/O Redundancy
  • Host to subsystem
    • Mirroring: Host to independent targets
    • Multi-pathing: Host to a single target
  • Subsystem to subsystem
    • Store and forward:
      • Local
      • Remote
disk mirroring redundant storage targets
Disk Mirroring: Redundant storage targets

[Diagram: a host mirroring to independent, identically sized storage address spaces, shown with one controller and with two controllers]

disk mirroring i os to 2 targets
Disk Mirroring: I/Os to 2 Targets
  • “Brute force” redundancy: fast and simple
  • Both read and write I/Os
    • Overlapped reads for performance
  • Local connections
  • Limited capacity*
  • I/O Bottlenecks* for random I/O activity
    • * if targets are disk drives
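
The behavior described on this slide can be sketched as a small host-side model: every write goes to both targets, while reads alternate between them so that two reads can be in flight at once. The class and attribute names are hypothetical, and in-memory lists stand in for the two storage targets.

```python
from itertools import cycle


class MirroredVolume:
    """Host-side mirror over two independent, identically sized targets."""

    def __init__(self, num_blocks: int):
        self.targets = [[None] * num_blocks, [None] * num_blocks]
        self._next_reader = cycle(range(2))
        self.reads_per_target = [0, 0]

    def write(self, block: int, data: bytes) -> None:
        # A write is complete only after both targets hold the data.
        for target in self.targets:
            target[block] = data

    def read(self, block: int) -> bytes:
        # Alternate reads between the targets to spread the load.
        index = next(self._next_reader)
        self.reads_per_target[index] += 1
        return self.targets[index][block]


volume = MirroredVolume(num_blocks=4)
volume.write(1, b"redo record")
for _ in range(4):
    volume.read(1)
print(volume.reads_per_target)
```

Because both reads and writes cross the host connection, this model also shows why mirroring consumes more host I/O than a scheme that forwards only writes.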
disk mirroring for redo log files
Disk Mirroring for Redo Log Files
  • Log files are a common bottleneck
  • Use raw partitions
  • Redundancy is required
    • Mirroring is adequate
  • Use highest RPM with lowest seek times
  • Put on a separate channel from database I/O
  • Use separate mirrored pairs per group
mirroring to storage subsystems
Mirroring to Storage Subsystems

[Diagram: two controllers, each connected to a storage subsystem; the subsystems present independent, identically sized storage address spaces]

mirroring to subsystems
Mirroring to Subsystems
  • Targets are subsystems, not disks
    • Separate address spaces
  • Capacity scales to subsystem max
  • Double level redundancy
    • Mirroring plus RAID
  • Multiple disk spindles reduce I/O bottlenecks
disk mirroring datafiles from host to storage subsystems
Disk Mirroring Datafiles from Host to Storage Subsystems
  • Disk mirroring + subsystem RAID
  • Excellent capacity scaling
  • Adjacent and across campus/town
    • One subsystem outside site radius
  • Requires longer distance cabling
  • Reads and writes both transmitted
multi pathing redundant paths between a host subsystem
Multi-Pathing: Redundant Paths Between a Host & Subsystem

[Diagram: a host with redundant paths to an application data volume; one path is marked failed (X)]

Pathing software detects a transmission error and switches to a redundant path

multi pathing vs mirroring
Multi-pathing vs Mirroring
  • Mirroring assumes independent, but similar storage targets
  • Multi-pathing assumes multiple paths to the exact same target
  • Mirroring can use a single HBA, multi-pathing needs two HBAs
path failures
Path Failures

[Diagram: an application data volume reached through three numbered failure points]

  1. HBA problem
  2. Link, switch or network problem
  3. Subsystem controller problem

transmission failures recognized after scsi timeouts are exceeded

Transmission failures recognized after SCSI timeouts are exceeded
  • I/O sent to storage
  • No ack received
  • The I/O is retried and eventually an error is passed back to the process that issued the I/O

path failover for oltp i o
Path Failover for OLTP I/O
  • Redundant path resources take over activities for a failed path to sustain operations without disrupting service or risking data integrity
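
A minimal sketch of path failover follows. It models each path as a callable that either returns an acknowledgement or raises an error after a timeout. The `PathError` type, the function names, and the retry count are all hypothetical and do not correspond to any real multi-pathing driver.

```python
class PathError(Exception):
    """Raised when an I/O does not complete on a path (for example, a timeout)."""


def send_with_failover(paths, io_request, retries_per_path=2):
    """Retry on each path in turn; return the first acknowledgement received."""
    failures = 0
    for path in paths:
        for _ in range(retries_per_path):
            try:
                return path(io_request)
            except PathError:
                failures += 1
    # Only when every path is exhausted does the caller see an error.
    raise PathError(f"all paths failed after {failures} attempts")


def failed_path(io_request):
    raise PathError("no ack before timeout")


def healthy_path(io_request):
    return f"ack: {io_request}"


print(send_with_failover([failed_path, healthy_path], "write block 42"))
```

In this sketch the application sees only a delay while the failed path is retried, which is the point of failover: the error is absorbed below the process that issued the I/O.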
store and forward
Store and Forward

[Diagram: Host, A, and B; independent, identically sized storage address spaces]

store forward one host i o and two copies of data
Store & Forward: One Host I/O and Two Copies of Data
  • Only real option for remote copies
  • Does not forward read I/Os
  • Proprietary protocols and methods
    • Standards are emerging, e.g. FC/IP
  • First step to storage snapshots
store and forward acknowledgements

Store and Forward: Acknowledgements
  • Asynchronous
    • Host sends I/O to A; A ACKs the host, then forwards to B
  • Synchronous
    • Host sends I/O to A; A forwards to B; B ACKs A; A ACKs the host

trade offs with acknowledgement handling
Trade-offs with Acknowledgement Handling
  • Synchronous
    • Always preferred
    • Slowest performance
    • State of copy is precise
  • Asynchronous
    • Fastest performance
    • Least precise knowledge of copy status
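
The trade-off can be sketched with a toy model of two subsystems, A and B. In synchronous mode the host is acknowledged only after B has the data; in asynchronous mode the host is acknowledged immediately and the forward happens later, so B may lag. The class, its attributes, and the `drain` method are illustrative assumptions, not a vendor protocol.

```python
from collections import deque


class StoreAndForward:
    """Toy model: subsystem A stores a write and forwards it to subsystem B."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.a = {}
        self.b = {}
        self.pending = deque()  # writes acknowledged to the host but not yet on B

    def write(self, block, data):
        self.a[block] = data
        if self.synchronous:
            self.b[block] = data                # wait for B before acking the host
        else:
            self.pending.append((block, data))  # ack the host now, forward later
        return "ACK"

    def drain(self):
        """Forward all queued writes to B (asynchronous mode only)."""
        while self.pending:
            block, data = self.pending.popleft()
            self.b[block] = data

    def copy_is_current(self):
        return self.a == self.b


sync_pair = StoreAndForward(synchronous=True)
sync_pair.write(1, "x")
async_pair = StoreAndForward(synchronous=False)
async_pair.write(1, "x")
print(sync_pair.copy_is_current(), async_pair.copy_is_current())
```

The pending queue is the "least precise knowledge of copy status": anything still in it would be lost if A failed before the queue drained.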
store forward local and remote copies
Store & Forward: Local and Remote Copies
  • Local & nearby copy techniques
    • Synchronous
    • Fiber optic cabling, optical/DWDM services
  • Remote-far away copy techniques
    • Asynchronous
    • ATM gateways, OC-12 or less, FC/IP
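
One reason synchronous copies are kept local or nearby is propagation delay: every synchronous write waits for at least one round trip to the remote subsystem. The sketch below estimates that delay, assuming light travels roughly 200 km per millisecond in optical fiber (an approximation) and ignoring switch, gateway, and protocol overhead.

```python
FIBER_KM_PER_MS = 200.0  # approximate propagation speed in optical fiber


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over a fiber link of this length."""
    return 2 * distance_km / FIBER_KM_PER_MS


for km in (10, 100, 1000):
    print(f"{km} km: at least {round_trip_ms(km):.2f} ms added per synchronous write")
```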
mirroring vs synchronous store and forward for local nearby copies
Mirroring vs Synchronous Store and Forward for Local & Nearby Copies
  • Mirroring
    • Async I/O
    • Reads and writes
    • No snapshot tie-in
    • Uses more host slots
    • Least costly
  • Store and Forward
    • Async or Sync I/O
    • Writes only
    • Snapshot ready
    • May conserve host I/O slots
    • Most costly
combining mirroring with store and forward
Combining Mirroring with Store and Forward

[Diagram: the mirroring radius and the store and forward radius shown across Local, Nearby, and Remote Far Away]

data redundancy for oltp
Data Redundancy for OLTP
  • Backup
  • Snapshots
  • Delta (log files)
backup for oltp
Backup for OLTP
  • A whole subject unto itself
  • Disaster recovery primarily
  • Cold? Who can afford to do that anymore?
  • Hot – put DB in backup mode
  • Backup snapshot image of data
subsystem snapshots for oltp
Subsystem Snapshots for OLTP

1. Flush host buffers (sync, sync)

2. Create snapshot

[Diagram: a database server attached to three disk storage subsystems, A, B, and C]

logical snapshots for oltp
Logical Snapshots for OLTP

1. The address space is mapped

2. First updates

3. Second updates

Overwritten data locations are not returned to the free space pool. (Undelete)
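
The idea that overwritten locations are not returned to the free pool can be sketched as a redirect-on-write volume: each update goes to a fresh physical location, and a snapshot is just a saved copy of the block map. This is one possible reading of the slide, and the class and method names are hypothetical.

```python
class LogicalVolume:
    """Toy volume where a snapshot is a saved copy of the logical-to-physical map."""

    def __init__(self):
        self.store = []      # physical locations; never reused in this sketch
        self.block_map = {}  # logical block -> index into self.store
        self.snapshots = {}

    def write(self, block, data):
        self.store.append(data)  # always write to a fresh location
        self.block_map[block] = len(self.store) - 1

    def snapshot(self, name):
        self.snapshots[name] = dict(self.block_map)  # copy the map, not the data

    def read(self, block, snapshot=None):
        mapping = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.store[mapping[block]]


volume = LogicalVolume()
volume.write(0, "original")
volume.snapshot("before-update")
volume.write(0, "first update")
volume.write(0, "second update")
print(volume.read(0), "|", volume.read(0, snapshot="before-update"))
```

Because old data stays in place, taking the snapshot costs only a map copy, and the earlier image remains readable after any number of later updates.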

delta redundancy with log files
Delta Redundancy with Log Files
  • Recording of all transaction activities
  • Roll forward, bring up to date
  • Roll Backward, go to known good state
  • Terrific tool for remote redundancy
  • Not HA
  • Process cannot have holes in it
remote redundancy w log files
Remote Redundancy w/ Log Files

[Diagram: redo log files carry the delta between two database states]
  • f(x-1): Previous Instance
  • d(x) = f(x) - f(x-1): Latest Redo Log File
  • f(x): Current to Log File Switch Checkpoint
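
Rearranging d(x) = f(x) - f(x-1) gives f(x) = f(x-1) + d(x): a remote site holding the previous state can reach the current state by applying the latest log. The sketch below models states as dictionaries and a log as an ordered list of (key, new value) records; this representation and the function names are illustrative assumptions. It also checks the log sequence for gaps, since the process cannot have holes in it.

```python
def roll_forward(previous_state: dict, redo_records: list) -> dict:
    """Apply redo records, in order, to a copy of the previous state."""
    state = dict(previous_state)
    for key, new_value in redo_records:
        state[key] = new_value
    return state


def has_holes(sequence_numbers: list) -> bool:
    """True if any log sequence number is missing between consecutive logs."""
    return any(b - a != 1 for a, b in zip(sequence_numbers, sequence_numbers[1:]))


remote_copy = {"order-1": "open"}
latest_log = [("order-1", "shipped"), ("order-2", "open")]
print(roll_forward(remote_copy, latest_log))
print(has_holes([7, 8, 9]), has_holes([7, 9]))
```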

and now some thoughts from our sponsor

And now, some thoughts from our sponsor…

How come I always end up doing all the work?

He never does anything except eat and sleep

Redundancy is a way of life

Managing Redundancy is Hard Work

san considerations
SAN Considerations
  • Fabrics and SAN Islands
  • Zoning
  • Switches and directors
  • Multiplexing (oversubscribing)
  • Security
fabrics are the san environment
Fabrics ARE the SAN Environment
  • One size does not fit all applications
  • Larger fabrics carry more risks
  • VSANs are probably a good idea
  • Only use switches supporting hot, stateful firmware upgrades
san islands may be best for oltp
SAN Islands May be Best for OLTP
  • Most risk averse approach
  • Dual fabrics, one fabric per I/O path
  • Switch problems do not cascade
  • But, higher management costs
zoning oltp
Zoning & OLTP
  • All ports defined to zones
    • No rogue ports and zombie zones
  • Restrict access to current servers
    • Need-to-access only
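
A zoning policy like this can be checked mechanically. The sketch below audits a zone configuration for ports that belong to no zone ("rogue ports") and for zones that contain no current server ("zombie zones"). Those two definitions, the data layout, and all names are assumptions made for illustration; real switches expose zoning through their own management interfaces.

```python
def audit_zoning(fabric_ports, zones, active_servers):
    """Return (rogue_ports, zombie_zones) for a zone configuration.

    fabric_ports:   iterable of all port names in the fabric
    zones:          dict mapping zone name -> set of member port names
    active_servers: set of port names belonging to current servers
    """
    zoned_ports = set().union(*zones.values()) if zones else set()
    rogue_ports = set(fabric_ports) - zoned_ports
    zombie_zones = {
        name for name, members in zones.items()
        if not (set(members) & set(active_servers))
    }
    return rogue_ports, zombie_zones


ports = {"p1", "p2", "p3", "p4"}
zones = {"oltp": {"p1", "p2"}, "retired": {"p3"}}
print(audit_zoning(ports, zones, active_servers={"p1"}))
```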
switches and directors
Switches and Directors
  • Redundancy eats slots and ports
    • Pathing, mirroring
    • Separate channels for data and logs
  • Avoid traversing ISLs, if possible
    • Added latency and blocking potential
    • Trunking must have NSPoF
security
Security
  • Admin security for an OLTP SAN should be as strong as possible
    • No monkey business
  • No default passwords left
  • WAN encryption of log files
recommendations
Recommendations:
  • Determine OLTP availability needs
    • Where copies should be, time to access
  • Match storage network implementation to DB file types
  • Develop availability-driven policies
    • Equipment
    • Processes