Performance Tuning

Performance Tuning Adam Backman adam@dbappraise.com DBAppraise, LLC

Overview • OpenEdge Architecture • OpenEdge Performance tuning tips • Vicious Sales Pitch

What is performance tuning? Performance tuning is a process in which application response times are recorded and modifications to the environment are made in a systematic manner to reduce the time it takes to complete system tasks.

What is performance tuning? A process of moving the bottleneck to the fastest resource.

Application Performance The single largest component of performance tuning is application performance. A well-written application will run well given the proper resources, while a poorly written application will rarely run well no matter how many resources it is given.

A few application hints • Use indexes - not USE-INDEX • Use NO-UNDO • Use TEMP-TABLES not WORKFILES • Use NO-LOCK • Watch out for CAN-FIND • Use FIELD-LIST (Client/Server) • Go n-tier – Use OpenEdge AppServer

Hardware Resources • Network (Slowest) • Disk (Slow) • Memory (Less slow) • CPU (Not Slow) Once you have a real CPU bottleneck you are done tuning and have begun shopping

When to Tune Performance? • Installation or upgrade of hardware • Installation or upgrade of software • Material (5%+) change in workload: number of users or volume of data • Basically, it is an ongoing process

How Fast is Fast? Remember: Performance is relative, so it is important to have a baseline. A baseline is an end-to-end timing of one or more tasks; it can be as simple as wristwatch timings or as advanced as a full-blown benchmark.
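A baseline somewhere between the wristwatch and the full benchmark can be sketched in a few lines of Python. This is only an illustration: `time_task`, `summarize`, and `sample_task` are hypothetical names, and `sample_task` is a stand-in for whatever end-to-end task matters on your system.

```python
import statistics
import time


def time_task(task, repeats=5):
    """Run task() several times and return the wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce raw timings to the figures worth recording as a baseline."""
    return {
        "runs": len(timings),
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def sample_task():
    # Hypothetical stand-in for a real task (a report, a batch job, ...).
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    print(summarize(time_task(sample_task)))
```

Record the summary before each change and compare it afterwards; the median is usually less sensitive to one-off spikes than a single timing.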

OpenEdge Memory Architecture • Shared memory • Server-less • Multi-server • Multi-broker

OpenEdge Memory Architecture (diagram)

OpenEdge Network Architecture • Primary broker • Splitting clients across servers • Secondary broker • Splitting clients across brokers

OpenEdge Architecture C/S Overview • The OpenEdge Server • A process that accesses the database for 1 or more remote clients

Network contention The network is the slowest resource for client/server applications, so eliminate contention for it before moving on to the other resources. Get things off the network wherever possible.

Message buffer size • Increase -Mm to at least 4096 (8192 is likely better) • Must be changed on both the client and server side • When increasing -Mm, remember to move to large frames on the system side (MTU size)

Networking tips • Keep things local • No temp files on network drives • Move the application “close” to the user • Push reporting to the database engine • PrefetchDelay, priority, numrecs (10.2B) • Use -cache to speed initial connection • Use -pls if you are using program libraries over the network • Application issues are magnified over a network (field-lists, no-lock, indexes, …)

Odd Network Issues • Bandwidth • Speed (10 (yikes)/100/1000/Auto) • Duplex (Always full) • Sharing TCP/IP, Client, iSCSI • Latency (mainly a WAN Issue) • Mixing hardware (Cisco and 3com)

OpenEdge Storage Considerations • Database block size • Setting records per block • Type II Storage areas

Database Block Size • Generally, 8k works best for Unix/Linux • 4k works best for Windows • Remember to build filesystems with larger block sizes (match if possible) • There are exceptions so a little testing goes a long way but if in doubt use the above guidelines

Determining Records per Block • Determine “Mean” record size • Use proutil <dbname> -C dbanalys • Add 20 bytes for record and block overhead • Divide this sum into your database block size • Choose the next HIGHER binary number (power of 2) • Must be between 1 and 256

Example: Records/Block • Mean record size = 90 • Add 20 bytes for overhead (90 + 20 = 110) • Divide the sum into the database block size • 8192 ÷ 110 = 74.47 • Choose the next higher binary number: 128 • Default records per block is 64 in versions 9 and 10
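The arithmetic above can be sketched as a small Python helper. The function name `records_per_block` is hypothetical; the 20-byte overhead and the 1 to 256 limits are taken from the slides, and the mean record size is assumed to come from the dbanalys output.

```python
def records_per_block(mean_record_size, block_size=8192, overhead=20):
    """Suggest a records-per-block setting from the mean record size.

    Divides the block size by (mean record size + overhead), then rounds
    up to a power of two and clamps the result to the range 1..256.
    """
    raw = block_size / (mean_record_size + overhead)
    power = 1
    while power < raw:
        power *= 2
    return max(1, min(power, 256))


if __name__ == "__main__":
    # The example from the slide: mean record size 90 in an 8k block.
    print(records_per_block(90))
```

With a mean record size of 90 and an 8192-byte block, the helper gives 128, matching the worked example.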

Records per block

Type I Storage Areas • Data blocks are social • They allow data from any table in the area to be stored within a single block • Index blocks only contain data for a single index • Data and index blocks can be tightly interleaved, potentially causing scatter

Database Blocks

Type II Storage Areas • Data is clustered together • A cluster will only contain records from a single table • A cluster can contain 8, 64 or 512 blocks • This helps performance as data scatter is reduced • Disk arrays have a feature called read-ahead that really improves efficiency with type II areas.

Type II Clusters (diagram: separate clusters for Customer, Order, and Order Index)

Storage Areas Compared (diagram: Type I vs. Type II)

OpenEdge Storage Cheat Sheet

Disk contention In most environments, disks are the largest area for improvement. All of the data flows from the disks to the other resources, so this affects both local and networked users.

Causes of Disk I/O • Database • User requests (Usually 90% of total load) • Updates (This affects DB, BI and AI) • Temporary file I/O - Use as a disk utilization leveler • Operating system - usually minimal provided enough memory is installed • Other I/O

Disks • This is where to spend your money • Goal: Use all disks evenly • Buy as many physical disks as possible • RAID 5 is still bad in many cases. Improvements have been made, but test before you buy: there is a performance wall out there, and it is closer with RAID 5

Disks – General Rules • Use RAID 10 (0+1) or Mirroring and Striping for best protection of data with optimal performance for the database • For the AI and BI RAID 10 still makes sense in most cases. Exception: Single database environments

RAID 5: Not as bad but still not good • Poor man’s mirroring - This is the kiss of death for OLTP performance • User information is striped • Parity information is striped WITH user information • OK for 100% read only applications • Poor performance for writes

RAID 10 vs. RAID 5 Cache Fill Rate • fillTime = cacheSize / (requestRate - serviceRate), with cacheSize in blocks and both rates in io/sec • Setup: 4 disks, RAID 10 vs. RAID 5, 4KB db blocks, 4GB RAM cache (1048576 blocks) • Typical Production DB Example: • 1048576 blocks / (200 io/sec - 800 io/sec) = cache doesn’t fill! • Heavy Update Production DB Example: • 1048576 blocks / (1200 io/sec - 800 io/sec) = 2621 sec. (≈ 44 min.) (RAID 10) • 1048576 blocks / (1200 io/sec - 200 io/sec) = 1049 sec. (≈ 17 min.) (RAID 5) • Maintenance Example: • 1048576 blocks / (5000 io/sec - 3200 io/sec) = 583 sec. (≈ 10 min.) (RAID 10) • 1048576 blocks / (5000 io/sec - 200 io/sec) = 218 sec. (≈ 4 min.) (RAID 5)
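The fill-time formula is easy to replay for your own numbers. The Python sketch below assumes the cache size is expressed in blocks (4GB of 4KB blocks = 1048576) and reuses the io/sec figures from the slide; `fill_time_seconds` is a hypothetical helper name.

```python
def fill_time_seconds(cache_blocks, request_rate, service_rate):
    """Seconds until the cache fills, or None if it never fills.

    cache_blocks  -- cache size in blocks
    request_rate  -- I/O requests arriving per second
    service_rate  -- I/O requests the disks complete per second
    """
    backlog_rate = request_rate - service_rate
    if backlog_rate <= 0:
        # The disks keep up with the load, so the cache never fills.
        return None
    return cache_blocks / backlog_rate


if __name__ == "__main__":
    cache_blocks = 4 * 1024**3 // 4096  # 4GB cache of 4KB blocks
    scenarios = [
        ("typical, RAID 10", 200, 800),
        ("heavy update, RAID 10", 1200, 800),
        ("heavy update, RAID 5", 1200, 200),
        ("maintenance, RAID 10", 5000, 3200),
        ("maintenance, RAID 5", 5000, 200),
    ]
    for label, request_rate, service_rate in scenarios:
        seconds = fill_time_seconds(cache_blocks, request_rate, service_rate)
        if seconds is None:
            print(f"{label}: cache does not fill")
        else:
            print(f"{label}: {seconds:.0f} sec")
```

The service rates are the slide's illustrative figures, not measurements; substitute numbers from your own disk subsystem before drawing conclusions.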

Disk tips • No RAID-5 (yes, still) • Use type II storage areas • Use 8k block size • Use the correct BI cluster size • Use page writers • Use private buffers (-Bp) • Use -T to eliminate variance • Use Secondary buffer pools (-B2)

Disk Tips (continued) • Buy many small disks: two heads are better than one • Buy fast disks (SSDs are a reality): buy at least 10,000 RPM • Buy fast controllers: Fibre Channel is better than SCSI

BI Cluster size • Somewhere between 1MB and 4MB works for most people • If you are checkpointing every 2 minutes or more often during peak periods, increase the cluster size • If you have a “workgroup” version of Progress, leave your cluster size alone (512KB)
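The two-minute rule can be checked mechanically once you have checkpoint times for a peak period. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the checkpoint times are assumed to have been collected by hand (for example from promon) and entered as seconds from any fixed starting point.

```python
def checkpoint_intervals(checkpoint_times):
    """Seconds between consecutive checkpoints, given times in seconds."""
    return [later - earlier
            for earlier, later in zip(checkpoint_times, checkpoint_times[1:])]


def should_increase_bi_cluster(checkpoint_times, threshold_seconds=120):
    """True if any two checkpoints are 2 minutes apart or less."""
    intervals = checkpoint_intervals(checkpoint_times)
    return any(interval <= threshold_seconds for interval in intervals)


if __name__ == "__main__":
    # Hypothetical peak-period checkpoint times, in seconds.
    peak_checkpoints = [0, 95, 210, 480]
    print(checkpoint_intervals(peak_checkpoints))
    print(should_increase_bi_cluster(peak_checkpoints))
```

The check only says whether the rule of thumb is triggered; how much to increase the cluster size is still a matter of testing.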

Progress page writers • Every database that does updates should have a before image writer (BIW) • Every database that does updates should have at least 1 asynchronous page writer (APW) • Every database that is using after imaging should have an after image writer (AIW)

Tuning APWs • Start with 1 APW • Monitor buffers flushed at checkpoint on the activity screen (option 5) in promon • If buffers flushed increases during the “important” hours of the day add 1 APW

Secondary Buffer Pool -B2 • Has its own LRU chain • Reduce or eliminate LRU activity by allocating a bit more than the storage needed • Very good for high volume tables • Known size • Frequent reuse • Best if all of the -B2 data can remain in memory to eliminate LRU chain scanning

Use -T to level disk I/O Local (host based) users and batch jobs should use the -T parameter to place their temporary file (.srt, .pge, .lbi, …) I/O on a drive that is not working as hard as the other drives on the system. Note: -T should never point to a network drive.

Memory contention Memory should be used to reduce disk I/O. Broker (server) side parameters should be tuned first and then user parameters can be modified. In a memory lean situation, memory should be taken away from individual users before reducing broker parameters.

Memory hints • Swapping and/or excessive paging is bad; buy more memory or reduce parameters to avoid it • Increase -B in large increments until the point of diminishing returns (BigBGuestimator) • Use -B2 for highly accessed portions of the data. These are generally small(ish), high-access tables

Memory hints (continued) • Use memory for the users closest to the customer first (developers increase last) • Use -Bt for large temp tables • Set -bibufs and -aibufs to 120. You can look at the promon summary screen to see if any additional tuning is required; it won’t be in 99.9999999% of cases

CPU contention High CPU activity is not bad in and of itself, but high “system” CPU activity is potentially bad and should be corrected.

Components of CPU activity • USER - This is what you paid for • SYSTEM - This is overhead • WAIT - This is waste • IDLE - This is nothing, kind of like management

CPU activity goals The goal is to have as much USER time as possible with as little SYSTEM and WAIT time as possible. A practical split is USER: 70%, SYSTEM: 20%, WAIT: 0%, IDLE: 10%.

Eliminate high SYSTEM CPU activity • Always use -spin • Use a setting of 1 for single CPU systems • Use a higher setting for multiple CPU systems • Testing has shown that the optimal setting for -spin is somewhere between 2000 and 10000 for most systems

Eliminate High CPU • -semsets set to at least 1 per 100 users • -lruskips to change latching behavior (10-50) • -napmax should default to 5000, but in some late version 7 and early version 8 releases of Progress it is set to 100, which is way too low
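The "at least 1 per 100 users" rule for -semsets is simple ceiling arithmetic. The Python sketch below is a hypothetical helper, not part of any Progress tooling, and it only computes the minimum the rule implies.

```python
import math


def minimum_semsets(user_count, users_per_set=100):
    """Smallest -semsets value that gives at least 1 set per 100 users."""
    return max(1, math.ceil(user_count / users_per_set))


if __name__ == "__main__":
    for users in (40, 100, 250, 1200):
        print(users, "users ->", minimum_semsets(users))
```

Treat the result as a floor; the slide's wording ("at least") leaves room to set the value higher.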

Eliminating high WAIT CPU activity • WAIT = Waiting on I/O • If you still have IDLE time it generally is not a big problem • Look at paging/swapping first • Next look at your disk I/O
