Oracle9i Real Application Clusters on Commodity-based Cluster Technology
A Case Study

Kevin Closson

Sr. Staff Engineer, Database Engineering

PolyServe Incorporated

kevinc@polyserve.com

Agenda
  • Answers and Questions
  • Project Description
    • Project Goals
  • System and Test Environment Configuration
    • Hardware
    • Linux Distribution/Kernel Release
    • Oracle
  • Lessons Learned
  • Performance Observations
  • Clustered File System Considerations
Questions and Answers
  • Intel-based Servers? Can those processors really handle Oracle?
  • What about high I/O rates?
  • Huh? An 8-node cluster? Isn’t that a management nightmare?
  • Really now, Linux? That isn’t a stable enough Operating System, is it?
  • Linux can’t be right for Oracle, it doesn’t scale to large processor counts, right?
  • What on earth could I possibly do with the primary release of Oracle9i Real Application Clusters (9.0.1)? Doesn’t it make sense to hold off until 9iR2?
  • What can I achieve without a high-speed interconnect, really?
Project Description
  • Case Study
    • OLTP workload analysis on an 8-node Intel-based cluster running Oracle9i RAC on Red Hat Linux
  • Goals
    • To conduct a preliminary assessment of Oracle9i Real Application Clusters
    • Determine “Barriers to Entry”
    • To study new Oracle9i features and how to use them to improve performance, manageability, and uptime.
    • Assess Linux/SAN technology interoperability
System Configuration - Hardware
  • 8 Dell PowerEdge 1550 dual-processor (900 MHz) servers, each with:
    • 1 GB physical memory
    • 2 100BaseT NICs
    • 1 Gigabit Ethernet NIC (Broadcom BCM5700)
    • 1 QLogic Fibre Channel HBA
  • Brocade 16-port FC switch
  • 288 GB SAN storage subsystem
    • RAID 1+0
Test System Overview

[Diagram: eight dual-CPU Intel nodes connected by a 100BaseT public network (PolyServe secondary), a 100BaseT PolyServe primary interconnect, and a Gigabit Ethernet network for Oracle9i RAC; all nodes attach through a Brocade Silkworm 2250 16-port FC switch to a 288 GB SAN.]
System Configuration - Software
  • Red Hat Linux 7.2
    • Red Hat kernel 2.4.7-10
    • glibc-2.2.4-13
    • gcc-2.96-98
    • binutils-2.11.90.0.8-9
  • Oracle9i Release 1 (9.0.1) with Real Application Clusters
    • Oracle “Clusterware” for Linux
      • OCMS
        • GCS
        • GES
Goals – Ascertain Barriers to Entry
  • Oracle9i Interconnect Requirement
    • It seemed to be commonly accepted that Gigabit Ethernet is required for Oracle9i Real Application Clusters
      • Tested both 100BaseT and Gigabit Ethernet for Oracle’s interconnect
        • Configured in $ORACLE_HOME/oracm/admin/nmcfg.ora via the DefinedNodes parameter (a sketch follows this list)
      • At all OLTP scale points, 100BaseT and Gigabit Ethernet provided the same throughput!
        • Why?
          • Same latencies
          • With OLTP, Oracle’s demand on the interconnect is mostly for lock converts and shipping consistent-version row pieces (not entire blocks)
        • A credit to Oracle’s efficient use of interconnect bandwidth
      • This testing did not include IPQO, so the available interconnects posed no barrier to entry
      • For early-access Oracle9i RAC OLTP testing, 100BaseT appears to be a safe bet for up to 8 dual-processor nodes
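
A hedged sketch of the interconnect selection in nmcfg.ora follows; only the DefinedNodes parameter is confirmed by this slide, and the hostnames are placeholders that would resolve to the chosen 100BaseT or Gigabit Ethernet interfaces:

# $ORACLE_HOME/oracm/admin/nmcfg.ora (excerpt; hostnames are hypothetical)
# Listing names that resolve to the Gigabit interfaces selects that
# network as the RAC interconnect; list the 100BaseT names to use it instead.
DefinedNodes=rac1-gig rac2-gig rac3-gig rac4-gig rac5-gig rac6-gig rac7-gig rac8-gig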
Goals – Study New Oracle9i Features
  • Oracle Managed Files
    • All testing done on tablespaces created using OMF (see the SQL sketch after this list)
      • Tablespaces created at the default 100 MB and grown via AUTOEXTEND
  • Dynamic SGA
    • Not all cache buffers need to be allocated at instance boot time.
    • This feature is an excellent enhancement for availability infrastructure
        • “Standby instances” don’t have to be pre-sized and idle.
          • Saves physical memory on “secondary” nodes
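
A minimal SQL sketch of the OMF usage described above, assuming a hypothetical directory and tablespace name; with db_create_file_dest set, CREATE TABLESPACE with no DATAFILE clause creates a 100 MB Oracle-managed datafile that grows via AUTOEXTEND:

-- directory path and tablespace name are placeholders
ALTER SYSTEM SET db_create_file_dest = '/mnt/psfs/oradata/PROD';
CREATE TABLESPACE orders_ts;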
Performance Observations
  • Testing based on OLTP workloads with varying read:write ratios
    • Same database, different transaction mixes
  • Fixed pseudo-user count from 1 through 8 nodes
    • Tests the ability of the same number of users to get more work done as hardware is added
  • Completely baseline configuration
    • Version 9.0.1 from OTN, no patches even!
    • Minimal init.ora (only 17 parameters set; a hedged sketch follows this list)
    • Very minimal schema tuning
      • Added freelist groups, not much else …
  • No partitioning of any sort
    • No data-dependent request routing
    • No “application partitioning”
    • Simple tables, not physically partitioned
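
The deck does not enumerate the 17 parameters, so the following is only an assumed sketch of what a minimal Oracle9i RAC init.ora of that size might contain (all names and values here are illustrative, not taken from the tested system):

db_name                    = PROD
cluster_database           = true
cluster_database_instances = 8
db_block_size              = 8192
db_cache_size              = 400M
shared_pool_size           = 150M
control_files              = /mnt/psfs/oradata/PROD/control01.ctl
undo_management            = AUTO
processes                  = 200
# per-instance values, set in each node's init<SID>.ora
instance_number            = 1
thread                     = 1
undo_tablespace            = UNDOTBS1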
OLTP Workload Specifics
  • Four read to write ratios
    • Reporting 99:1
    • Customer Service 76:24
    • Order Entry 70:30
    • Maintenance 60:40
  • I/O Rate per Transaction
    • Reporting 1.3/tps
    • Customer Service 2.5/tps
    • Order Entry 9/tps
    • Maintenance 15/tps
Reporting Workload

[Chart: throughput in TPS (0 to 1800) versus node count (1 to 6)]

  • Read to write ratios
    • Reporting 99:1
  • I/O Rate per Transaction
    • Reporting 1.3/tps
Customer Service Workload

[Chart: throughput in TPS (0 to 1500) versus node count (1 to 6)]

  • Read to write ratios
    • Customer Service 76:24
  • I/O Rate per Transaction
    • Customer Service 2.5/tps
Order Entry Workload

[Chart: throughput in TPS (0 to 600) versus node count (1 to 6)]

  • Read to write ratios
    • Order Entry 70:30
  • I/O Rate per Transaction
    • Order Entry 9/tps
Maintenance Workload

[Chart: throughput in TPS (0 to 360) versus node count (1 to 6)]

  • Read to write ratios
    • Maintenance 60:40
  • I/O Rate per Transaction
    • Maintenance 15/tps
Lesson Learned – Validate FC Switch
  • All Fibre Channel switches may be created equal, but …
    • Use a non-Oracle test to determine FC switch capability
      • Conduct single-node random and sequential I/O tests using dd(1) or a similar tool (a dd sketch follows the example output below)
      • Conduct a multiple-node concurrent test
      • Practical example of a poorly behaved switch:
        • 1 GB file, fixed number of random 4K pread() calls*, raw disk access:

node1: TIME  81.10  OPS/sec: 3945.75  MB/sec: 15.4
node2: TIME 101.29  OPS/sec: 3159.25  MB/sec: 12.3
node3: TIME  72.03  OPS/sec: 4442.59  MB/sec: 17.4
node4: TIME  72.04  OPS/sec: 4441.98  MB/sec: 17.4
node5: TIME  72.05  OPS/sec: 4441.36  MB/sec: 17.3
node6: TIME  72.06  OPS/sec: 4440.74  MB/sec: 17.3
node7: TIME  72.53  OPS/sec: 4411.97  MB/sec: 17.2
node8: TIME  72.76  OPS/sec: 4398.02  MB/sec: 17.2

*Program available upon request
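
A hedged sketch of the single-node sequential check mentioned above, using a placeholder raw device; the random 4K pread() test in the output above came from the separate program noted on the slide:

# ~1 GB sequential read from one shared LUN, timed
time dd if=/dev/raw/raw1 of=/dev/null bs=1024k count=1024
# Repeat the same command concurrently from every node and compare the
# elapsed times; a healthy switch should deliver roughly equal per-node
# throughput.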

Lessons Learned – Proper Statistics Collection
  • Use Statspack
    • Statspack does track multiple instance statistics
    • Produces per-instance report.
      • Post-process for aggregate cluster-wide stats
  • Use gv$ performance views
    • Readings “sampled” from a single instance are misleading
    • An important example:
      • Global Physical I/O profile:

select tablespace_name, sum(PHYRDS) reads, sum(PHYWRTS) writes
  from dba_data_files, gv$filestat
 where dba_data_files.file_id = gv$filestat.file#
 group by tablespace_name
 order by reads desc;

Global I/O Profile

TABLESPACE        READS    WRITES
---------------------------------
WAREHOUSE        632344    624621
WHR_IDX          475200       794
PRODUCT          482780     59523
CUSTOMER         128967    136494
ITM_IDX          107937    121074
ORDERS            52678     66485
NAME_IDX          24728      3746
ORD_IDX           29663     27842
CUS_IDX            6452      2384
ITEM              17435     72617
CSTATE_IDX         1028      1554
ROLL_1                4     56131
ROLL_2                3     61119
STATE_IDX          2982      4217
---------------------------------
TOTALS:         1962460   1238977

TOTAL I/O: 3201437    I/O per second: 3557
READ PCT:  62
WRITE PCT: 38

Lessons Learned – Linux Kernel
  • Track regression of the Linux kernel
    • 2.4.7 to 2.4.9 kernel regression: 80% degradation for shared-memory-based I/O due to a virtual memory bug
      • Red Hat Bug #59748
    • Establish an environment for quick regression analysis before deploying new kernel revisions.
Lesson Learned – DBWR Performance
  • Async I/O is not supported on Linux
    • This is not really a problem
      • CPU-to-DBWR ratio
      • Huge SMPs cannot survive without async I/O for DBWR
    • Configure DBWR writer slaves
      • Caveats (see the following slides)
  • Async I/O in Linux will be available soon (reportedly in the 2.5 kernel)
  • Seek out a vendor that supports Oracle Disk Manager
    • PolyServe is implementing ODM through a partnership with Oracle
Lesson Learned –DBWR I/O Slaves
  • Error during instance startup:

Sat Jan 9 10:26:30 2002
Errors in file /usr1/oracle/rdbms/log/dbw0_22830.trc:
ORA-01114: IO error writing block to file 1 (block # 300)
ORA-01110: data file 1: '/usr1/oracle/rw/DATA/benchSYSTEM'
ORA-27064: skgfdisp: cannot perform async I/O to file
DBW0: terminating instance due to error 1242

  • Workaround – explicitly disable async I/O if slaves are used:

dbwr_io_slaves = 20
disk_asynch_io = false
_lgwr_async_io = false
_dbwr_async_io = false

Lesson Learned – DBWR I/O Slaves
  • Estimated DBWR slave I/O capability
    • DBWR I/O slaves are synchronous I/O handling processes
      • Process name: “ora_i101_PROD1”, etc.
    • Total slave write capacity ≈ number of slaves × (1 / average disk write service time)
      • Example:
        • Given an average service time of 12 ms, 20 DBWR slaves can handle roughly 1,600 writes per second
          • ( (1 / 0.012) × 20 )
    • If suffering “free buffer waits”, try adding more slaves. This tactic does top out, however.
Lessons Learned – Buffer Overhead for Cache Fusion
  • SGA disk block buffers have additional overhead with Cache Fusion
    • Buffers are pinned while the current version of the data is sent to requesting nodes
      • Buffer pin duration is reported as ‘global cache current block send time’ (a query sketch follows this list)
      • Most of the latency measured by this statistic is time the processes involved spend in the ‘runnable’ state; that is, it includes time spent waiting for CPU. This was a very heavily utilized cluster 
      • Only a very small portion of this time is interconnect data transfer
    • It may be necessary to add buffers once Real Application Clusters is in use
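
A hedged SQL sketch for watching this statistic per instance via gv$sysstat; the statistic name comes from the slide, while the query itself is an assumption rather than something shown in the deck:

-- per-instance value of the buffer-pin latency statistic named above
select inst_id, value
  from gv$sysstat
 where name = 'global cache current block send time';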
Lesson Learned – Linux raw(8) binding
  • To use raw partitions with Oracle, each /dev/sd* device must be bound with the raw(8) command
    • Ensure scripts are THOROUGHLY TESTED
    • Watch for “device slip” (a defensive check is sketched after the example below)
    • Bindings are not persistent across system reboots 
    • Example of a simple script error that could be disastrous:

# raw /dev/raw/raw1 /dev/sdb1
/dev/raw/raw1: bound to major 8, minor 17
# raw /dev/raw/raw3 /dev/sdb1
/dev/raw/raw3: bound to major 8, minor 17
# raw -qa
/dev/raw/raw1: bound to major 8, minor 17
/dev/raw/raw3: bound to major 8, minor 17

$ sqlplus '/ as sysdba' <<EOF
alter tablespace system add datafile '/dev/raw/raw3' SIZE 50M;
EOF
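
A hedged shell sketch of a pre-flight check for device slip; the awk field positions are assumptions based on the raw -qa output format shown above:

# warn if two raw devices are bound to the same major/minor pair
dups=$(raw -qa | awk '{print $5, $7}' | sort | uniq -d)
[ -n "$dups" ] && echo "WARNING: duplicate raw(8) bindings detected: $dups"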

Clustered File System Considerations
  • A good CFS should take care of the following:
    • SAN physical device abstraction
      • Based on WWN, a spindle or LUN does not need to be dealt with as some raw device
        • Disks are imported and treated as managed entities
        • Eliminates concerns of device slippage, raw(8) binding, etc
    • Allow Direct I/O for Oracle files in the Clustered File System
      • Control files, redo logs, tablespaces, etc.
    • Shared Oracle Home directory
      • $ORACLE_HOME contains some 100,000 files under many directories
      • With Shared Oracle Home, only one copy of the Oracle product is needed
        • Ease of management
          • Patches, config, sheer mass, etc
      • OCMS presumes the absence of a CFS*

*Tips and Techniques Guide Available Documenting a Workaround

Clustered File Systems Considerations
  • Use a non-Oracle “micro” benchmark* with an I/O profile similar to Oracle’s to ensure that the cost of I/O operations to and from the SAN is nearly the same whether:
    • Accessed through raw devices (bound with the raw(8) command)
    • Accessed through the Clustered File System
      • No serializing anomalies
        • Allow concurrent writes to files
      • Direct, completely unbuffered
        • External Caching in Page Cache is a NO-NO with RAC
      • Commensurate processor overhead
        • Some filesystem implementations consume inordinate amounts of processor bandwidth in the I/O code path

*Sample Program Available Upon Request

Micro Benchmark: Random 4K pread() Fixed Workload

[Chart: completion time in seconds (0 to 100) for 72 processes each executing 5000 random I/Os, comparing 2-node and 4-node runs on the CFS versus raw devices]

Summary
  • Oracle9i Real Application Clusters, in its primary release, delivers tremendous horizontal scaling – and it’s stable 
    • “Barriers to Entry” are minimal given a Linux approach
  • It is recommended that you have a good Clustered File System
    • All OLTP performance data in the presentation was collected from a cluster with Oracle executing from a shared Oracle Home in the PolyServe Clustered Filesystem
    • The datafiles, redo logs, control files, and RAC quorum disk were also stored in the PolyServe Clustered Filesystem
  • An interesting quote
What the Analysts Say….
  • “PolyServe is delivering on the dream of true clustered file systems. The Oracle9i integration is a perfect application for this technology, and users will see an immediate benefit.”
  • Steve Duplessie, Enterprise Storage Group