
How Much Do Concurrent Updates Impact Query Performance in Oracle?


Presentation Transcript


  1. How Much Do Concurrent Updates Impact Query Performance in Oracle? Roger Schrag Database Specialists, Inc. www.dbspecialists.com

  2. Today's Session • Read-consistency and concurrency: • Basic concepts • How Oracle does it • Other approaches • Theoretical cost of read-consistency in Oracle • Measure the true cost in two simulations: • Fully repeatable—all scripts provided • TKPROF report analysis • Evaluation of v$ data • Lessons learned

  3. White Paper • Contains all of the material we will discuss today and more • Code samples and TKPROF reports are easier to read • Easier to cut and paste the code for testing on your system • Download: www.dbspecialists.com/presentations

  4. Read-Consistency and Concurrency • Basic concepts • How Oracle does it • Other approaches

  5. Read-Consistency • Accurate retrieval of information • Query results reflect data integrity • Results never include uncommitted work (no invoice lines without a header or vice versa if the header and lines were created in one transaction) • Query results are accurate as of a single point in time • Every row of the result set reflects data in the database as of a single point in time

  6. Bank Account Example • $100 in a checking account • $1,300 in a savings account • Combined balance must be at least $1,000 to avoid monthly service charge • Transfer $500 from savings to checking • What if the bank was computing the combined balance while the transfer was happening? • Correct: $100 + $1,300 > $1,000 • Also correct: $600 + $800 > $1,000 • Wrong: $100 + $800 < $1,000
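To make the anomaly concrete, here is a minimal sketch of the transfer as a single transaction. The table name, account IDs, and customer ID are hypothetical illustrations, not part of the test schema built later:

UPDATE accounts SET balance = balance - 500 WHERE account_id = 1;  -- savings: 1300 -> 800
UPDATE accounts SET balance = balance + 500 WHERE account_id = 2;  -- checking: 100 -> 600
COMMIT;

-- Without read-consistency, a combined-balance query running concurrently
-- could read the checking row ($100) before the transfer and the savings
-- row ($800) after it, yielding the incorrect total of $900:
SELECT SUM (balance) FROM accounts WHERE customer_id = 42;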

  7. Read-Consistency in Oracle • Maximizes concurrency: • Updates are never blocked by queries • Queries are never blocked by updates or other queries • Query results reflect the data as of a single “reference point” in time: • When the cursor was opened, or • When the read-only transaction that the query is part of began, or • A user-specified time (flashback query)
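As an illustration of the third option, a flashback query names its own reference point explicitly. A minimal sketch, assuming the bank_accounts test table built later in this session (the AS OF syntax is available in Oracle 9i Release 2 and later; earlier releases use the DBMS_FLASHBACK package instead):

-- Report the balance as it stood 15 minutes ago, regardless of
-- updates committed since then (available undo permitting).
SELECT current_balance
FROM   bank_accounts AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
WHERE  account_id = 2;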

  8. Oracle’s Secret: Multi-Versioning • During an insert, update, or delete, “undo” information is written to an undo segment: • Allows the user to roll back the transaction if necessary instead of committing • Also enables Oracle to reconstruct an image of what data in the database looked like at a time in the past • Enables Oracle to ensure read-consistency while allowing a high degree of concurrency • Implemented in Oracle V6
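You can watch this undo being generated. A sketch, assuming SELECT access to the v$ views and the bank_accounts test table built later in this session; run an update without committing, then ask how much undo the open transaction holds:

UPDATE bank_accounts
SET    current_balance = current_balance + 20
WHERE  account_id = 3;

-- Undo held by this session's open (uncommitted) transaction:
SELECT t.used_ublk undo_blocks, t.used_urec undo_records
FROM   v$transaction t, v$session s
WHERE  t.addr = s.taddr
AND    s.audsid = USERENV ('SESSIONID');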

  9. Other Approaches • Read locks and write locks: • Concurrency is limited in order to ensure read-consistency • Queries block updates and vice versa • Accurate results, but performance sacrificed • Don’t ensure read-consistency: • Dirty reads: Query results include uncommitted work • Fuzzy reads: Query results not accurate as of a single point in time

  10. Summarizing Read-Consistency and Concurrency • Oracle ensures read-consistency while allowing a high degree of concurrency: • Very strict about this • Basic RDBMS functionality for 17+ years • Other databases compromise: • Allow anomalies/incorrect results, or • Stifle throughput by controlling concurrency

  11. The Theoretical Cost of Read-Consistency in Oracle • Approach the question from a theoretical basis • Look at checks every query must perform on every data block read • Identify what must happen when a check is not satisfied • Consider the performance implications

  12. Checks a Query Must Perform on Every Data Block Read • Check for updates after the query’s reference point: • The SCN (system change number) of the last update is recorded in every block • If the SCN of the last update precedes the SCN of the query’s reference point, then the block contains no updates made after the query’s reference point • Check for uncommitted work: • An ITL (interested transaction list) is recorded in every data block • If the ITL is empty, then the block contains no uncommitted work
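If you want to see the ITL and last-change SCN for yourself, you can dump a block to a trace file. A sketch, using DBMS_ROWID to locate the block holding a row of interest; the file and block numbers returned will differ on your system:

SELECT dbms_rowid.rowid_relative_fno (rowid) file_no,
       dbms_rowid.rowid_block_number (rowid) block_no
FROM   bank_accounts
WHERE  account_id = 2;

-- Substitute the file and block numbers returned above; the dump,
-- including the ITL entries, goes to a trace file in user_dump_dest.
ALTER SYSTEM DUMP DATAFILE 5 BLOCK 100;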

  13. When a Check is Not Satisfied • Oracle must create an alternate version of the data block: • Allocate a new buffer in the buffer cache • Copy data block to new buffer • Read undo segment referenced by ITL • Apply undo to the copied data block • Repeat as necessary until a version of the data block that satisfies the two checks is found

  14. Multi-Versioning Performance Implications • Impact is low when data blocks contain neither updates made after the query’s reference point nor uncommitted work • Otherwise Oracle must use resources: • Tie up an extra buffer in the buffer cache • Generate logical reads to fetch undo • Possibly generate physical reads to fetch undo • Use CPU time to copy data and apply undo • Risk of ORA-01555 (“snapshot too old” query failure) if the undo is no longer available
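These costs are visible in the instance-wide statistics. A quick check against v$sysstat (statistic names as they appear in Oracle 9i); steadily climbing values for the last three suggest heavy multi-versioning:

SELECT name, value
FROM   v$sysstat
WHERE  name IN ('consistent gets',
                'consistent changes',
                'data blocks consistent reads - undo records applied',
                'CR blocks created');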

  15. Summarizing Theoretical Cost • Should be negligible most of the time • Could be non-trivial when significant multi-versioning occurs • Seems like a small price to pay for data integrity and accuracy without sacrificing concurrency • Even better if we can: • Detect excessive multi-versioning • Quantify the cost • Take steps to reduce impact

  16. Measuring True Costs of Read-Consistency in Oracle • Walk through two fully repeatable simulations: • Create, populate test schema • Trace query execution • Trace query execution again while an external activity forces multi-versioning to occur • Compare TKPROF reports and v$ data to deduce multi-versioning costs • All code is provided here: • I ran it in an Oracle 9i environment on Solaris • Should work on Oracle 10g and 8i as well (minor changes needed for Oracle 8i)

  17. Simulation Overview • Querying a bank account balance • Query an account balance • Query again while another session is posting a deposit to a different bank account stored in the same data block • Reporting combined account balances • Launch a report to show customers below the minimum combined account balance • Launch report again while another session is posting ATM transactions

  18. Test Schema Setup • Create a tablespace without ASSM:

CREATE TABLESPACE test
  DATAFILE '/u03/oradata/dev920ee/test01.dbf' SIZE 200m
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE
  SEGMENT SPACE MANAGEMENT MANUAL;

• Give ourselves quota:

ALTER USER rschrag QUOTA UNLIMITED ON test;
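A quick sanity check that the tablespace really uses manual segment space management (assumes access to DBA_TABLESPACES):

SELECT tablespace_name, segment_space_management
FROM   dba_tablespaces
WHERE  tablespace_name = 'TEST';
-- SEGMENT_SPACE_MANAGEMENT should come back as MANUAL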

  19. Test Schema Setup • bank_accounts table: • 1,000,000 rows, about 120 bytes per row • Each row is one checking or savings account • About 90% of accounts are active, 10% inactive • Some customers will have multiple accounts • Data loaded will be “pseudo-random”: • Data scattered over a spectrum • Not truly random • Running script again should yield exact same data

  20. Test Schema Setup

CREATE TABLE bank_accounts (
  account_id         NUMBER,
  account_number     VARCHAR2(18),
  customer_id        NUMBER,
  current_balance    NUMBER,
  last_activity_date DATE,
  account_type       VARCHAR2(10),
  status             VARCHAR2(10),
  other_stuff        VARCHAR2(100)
) TABLESPACE test;

  21. Test Schema Setup

BEGIN
  dbms_random.seed ('Set the random seed so that ' ||
                    'this script will be repeatable');
  FOR i IN 0..9 LOOP
    FOR j IN i * 100000 .. i * 100000 + 99999 LOOP
      INSERT INTO bank_accounts
        (account_id, account_number, customer_id, current_balance,
         last_activity_date, account_type, status, other_stuff)
      VALUES
        (j,
         LPAD (LTRIM (TO_CHAR (TRUNC (dbms_random.value * 1000000000000000000))), 15, '0'),
         TRUNC (dbms_random.value * 700000),
         TRUNC (dbms_random.value * 5000, 2) + 250.00,
         TO_DATE ('12-31-2005 12:00:00', 'mm-dd-yyyy hh24:mi:ss') - dbms_random.value * 30,
         DECODE (TRUNC (dbms_random.value * 3), 1, 'SAVINGS', 'CHECKING'),
         DECODE (TRUNC (dbms_random.value * 10), 1, 'INACTIVE', 'ACTIVE'),
         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
    END LOOP;
    COMMIT;
  END LOOP;
END;
/

  22. Test Schema Setup

ALTER TABLE bank_accounts ADD CONSTRAINT bank_accounts_pk
  PRIMARY KEY (account_id) USING INDEX TABLESPACE test;

BEGIN
  dbms_stats.gather_table_stats (USER, 'BANK_ACCOUNTS', cascade=>TRUE);
END;
/

  23. Test Schema Setup • bank_transactions table: • Roughly 90,000 rows • Each row is one ATM deposit or withdrawal • Flag on each row set to “n” to indicate transaction not yet posted to account balance

CREATE TABLE bank_transactions (
  transaction_id   NUMBER,
  account_id       NUMBER,
  transaction_date DATE,
  transaction_type VARCHAR2(10),
  amount           NUMBER,
  processed        VARCHAR2(1)
) TABLESPACE test;

  24. Test Schema Setup

DECLARE
  v_transaction_id   NUMBER;
  v_transaction_date DATE;
  v_transaction_type VARCHAR2(10);
  v_amount           NUMBER;
BEGIN
  v_transaction_id := 1;
  v_transaction_date := TO_DATE ('01-01-2006 00:00:00', 'mm-dd-yyyy hh24:mi:ss');
  FOR i IN 1..100000 LOOP
    v_amount := TRUNC (dbms_random.value * 10) * 20 + 20;
    IF TRUNC (dbms_random.value * 2) = 1 THEN
      v_transaction_type := 'DEPOSIT';
    ELSE
      v_amount := 0 - v_amount;
      v_transaction_type := 'WITHDRAWAL';
    END IF;
    INSERT INTO bank_transactions
      (transaction_id, account_id, transaction_date,
       transaction_type, amount, processed)
    SELECT v_transaction_id, account_id, v_transaction_date,
           v_transaction_type, v_amount, 'n'
    FROM   bank_accounts
    WHERE  account_id = TRUNC (dbms_random.value * 1000000)
    AND    status = 'ACTIVE';
    v_transaction_id := v_transaction_id + SQL%ROWCOUNT;
    v_transaction_date := v_transaction_date + (dbms_random.value / 5000);
  END LOOP;
  COMMIT;
END;
/

  25. Test Schema Setup

ALTER TABLE bank_transactions ADD CONSTRAINT bank_transactions_pk
  PRIMARY KEY (transaction_id) USING INDEX TABLESPACE test;

BEGIN
  dbms_stats.gather_table_stats (USER, 'BANK_TRANSACTIONS', cascade=>TRUE);
END;
/

  26. Test Schema Setup • post_transactions procedure: • Reads a specified number of unprocessed records from bank_transactions, updates balances in bank_accounts, and updates the processed flag in bank_transactions • Uses an autonomous transaction • Simulates updates being performed in another session

  27. Test Schema Setup

CREATE OR REPLACE PROCEDURE post_transactions (p_record_count IN NUMBER) IS
  PRAGMA AUTONOMOUS_TRANSACTION;
  CURSOR c_bank_transactions IS
    SELECT account_id, transaction_date, amount
    FROM   bank_transactions
    WHERE  processed = 'n'
    ORDER BY transaction_id
    FOR UPDATE;
  v_record_count NUMBER;
BEGIN
  v_record_count := 0;
  FOR r IN c_bank_transactions LOOP
    UPDATE bank_accounts
    SET    current_balance = current_balance + r.amount,
           last_activity_date = r.transaction_date
    WHERE  account_id = r.account_id;
    UPDATE bank_transactions
    SET    processed = 'y'
    WHERE  CURRENT OF c_bank_transactions;
    v_record_count := v_record_count + 1;
    EXIT WHEN v_record_count >= p_record_count;
  END LOOP;
  COMMIT;
END post_transactions;
/

  28. Simulation #1: Querying a Bank Account Balance • Overview: • Query an account balance • Query again while another session is posting a deposit to a different bank account stored in same data block • Objectives: • Quantify the cost of creating an alternate version of a data block in order to back out uncommitted work • Identify multi-versioning indicators • Learn scope of conflicting activities

  29. Collect Trace Data • Start a new database session • Enable tracing with high level of detail:

ALTER SESSION SET statistics_level = ALL;
ALTER SESSION SET sql_trace = TRUE;
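One optional convenience, to make the resulting trace file easy to pick out of the trace directory (TRACEFILE_IDENTIFIER is available in Oracle 9i; the identifier value here is just an illustration):

ALTER SESSION SET tracefile_identifier = 'simulation1';

-- The directory holding the trace file:
SELECT value FROM v$parameter WHERE name = 'user_dump_dest';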

  30. Collect Trace Data • Query balance for account_id 2:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2;

• Possible variations from one run to next: • Hard parse • Physical disk reads • Run query three more times to get repeatable results • Use identical query text

  31. Collect Trace Data • Update balance on account_id 3 in a second session: • Different row from query in first session, but same data block (note no ASSM in this tablespace) • Do not commit the update

UPDATE bank_accounts
SET    last_activity_date = TO_DATE ('01-03-2006 11:15:22', 'mm-dd-yyyy hh24:mi:ss'),
       current_balance = current_balance + 20
WHERE  account_id = 3;
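You can confirm that the queried row and the updated row really share a data block. A sketch using DBMS_ROWID; both rows should report the same file and block numbers:

SELECT account_id,
       dbms_rowid.rowid_relative_fno (rowid) file_no,
       dbms_rowid.rowid_block_number (rowid) block_no
FROM   bank_accounts
WHERE  account_id IN (2, 3);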

  32. Collect Trace Data • Query balance for account_id 2 again in first session: • Oracle will need to create an alternate version of the data block to undo the uncommitted update against account_id 3 • Use identical query text

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2;

  33. Generate TKPROF Report • Generate TKPROF report, listing each execution of each statement individually: tkprof simulation1.trc simulation1.prf aggregate=no sys=no • Recap of traced session activities: • Identical query run five times • First execution might involve hard parse and/or physical reads • Last execution involves multi-versioning

  34. First Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.05          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.02          4          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.03       0.08          4          4          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=4 r=4 w=0 time=14008 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=3 w=0 time=13763 us)(object id 32144)

  35. Second Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          4          0           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=4 r=0 w=0 time=58 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=0 w=0 time=36 us)(object id 32144)

  36. Fifth Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          6          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          6          0           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=6 r=0 w=0 time=538 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=0 w=0 time=64 us)(object id 32144)

  37. Query Statistics Summary

  38. Learned From This Exercise • One simple multi-versioning operation caused: • Two extra logical reads • About 460 µs of extra time • Table access by index ROWID operation required three logical reads for one row: • An indicator that more is happening than just a table access by index ROWID • Multi-versioning was necessary even though the row containing uncommitted work was not relevant to our query
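The arithmetic behind the indicator: the index unique scan consistently cost three logical reads (cr=3, consistent with an index of BLEVEL 2, i.e. a three-block root-to-leaf walk), so a healthy single-row lookup totals four; anything above one read on the table-access step points to undo being applied. A sketch for checking the index height:

SELECT blevel
FROM   user_indexes
WHERE  index_name = 'BANK_ACCOUNTS_PK';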

  39. Who Cares About 460 µs? • So what if the query required two extra logical reads and 460 µs of extra time? • Probably not a big deal for this query • But consider: • Multi-versioning made this query take about seven times longer • This could add up if it happens a lot • Multi-versioning here was the simplest case: • Only one operation to undo • No physical reads

  40. Simulation #2: Minimum Balances • Overview: • Launch a report to show customers below the minimum combined account balance • Launch report again while another session is posting ATM transactions • Objectives: • Examine the case of multi-versioning caused by committed transactions occurring after a query’s reference point • See widespread multi-versioning • Identify more multi-versioning indicators

  41. Collect Trace Data • Start a new database session • Enable tracing with high level of detail:

ALTER SESSION SET statistics_level = ALL;
ALTER SESSION SET sql_trace = TRUE;

  42. Collect Trace Data • Run report for subset of customers:

VARIABLE low_balances REFCURSOR

BEGIN
  OPEN :low_balances FOR
    SELECT /*+ CACHE (bank_accounts) */
           customer_id, COUNT (*) active_accounts,
           SUM (current_balance) combined_balance,
           MAX (last_activity_date) last_activity_date
    FROM   bank_accounts
    WHERE  status = 'ACTIVE'
    AND    customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id;
END;
/

  43. Collect Trace Data

SELECT b.value, a.name
FROM   v$statname a, v$mystat b
WHERE  a.name IN ('consistent gets', 'consistent changes')
AND    b.statistic# = a.statistic#
ORDER BY a.statistic#;

PRINT low_balances

SELECT b.value, a.name
FROM   v$statname a, v$mystat b
WHERE  a.name IN ('consistent gets', 'consistent changes')
AND    b.statistic# = a.statistic#
ORDER BY a.statistic#;

• Run report three more times to get repeatable results: • Use identical query text

  44. Report Notes • Use subset of customers to keep output brief • CACHE hint encourages Oracle to retain bank_accounts data blocks in the buffer cache by treating the full-scanned blocks as recently used on the LRU list • Opening the cursor causes Oracle to set the reference point, but the real work does not begin until the first fetch • Query from v$mystat shows the count of changes that had to be rolled back in alternate data block versions

  45. Collect Trace Data • Run report a fifth time: • Simulate transactions committed in another session after query reference point by doing the following after opening the cursor: EXECUTE post_transactions (10000) • Oracle will need to back out the updates committed by this procedure call when fetching report results • Use identical query text

  46. Generate TKPROF Report • Generate TKPROF report, listing each execution of each statement individually: tkprof simulation2.trc simulation2.prf aggregate=no sys=no • Recap of traced session activities: • Identical query (in report) run five times • First execution might involve hard parse and/or physical reads • Last execution involves widespread multi-versioning

  47. First Execution

SELECT /*+ CACHE (bank_accounts) */
       customer_id, COUNT (*) active_accounts,
       SUM (current_balance) combined_balance,
       MAX (last_activity_date) last_activity_date
FROM   bank_accounts
WHERE  status = 'ACTIVE'
AND    customer_id BETWEEN 10000 AND 10999
GROUP BY customer_id
HAVING SUM (current_balance) < 1000
ORDER BY active_accounts, customer_id

  48. First Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.06          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      5.24       7.84      16669      16679          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      5.26       7.91      16669      16679          0          48

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 97 (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=16679 r=16669 w=0 time=7846722 us)
     48   FILTER (cr=16679 r=16669 w=0 time=7835555 us)
    708    SORT GROUP BY (cr=16679 r=16669 w=0 time=7834846 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=16679 r=16669 w=0 time=7795324 us)

  49. Second Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      2.80       2.79          0      16679          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      2.80       2.79          0      16679          0          48

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97 (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=16679 r=0 w=0 time=2793933 us)
     48   FILTER (cr=16679 r=0 w=0 time=2793371 us)
    708    SORT GROUP BY (cr=16679 r=0 w=0 time=2792563 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=16679 r=0 w=0 time=2768765 us)

  50. Fifth Execution

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      3.42       3.81          0      26691          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      3.42       3.81          0      26691          0          48

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97 (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=26691 r=0 w=0 time=3814002 us)
     48   FILTER (cr=26691 r=0 w=0 time=3813425 us)
    708    SORT GROUP BY (cr=26691 r=0 w=0 time=3812575 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=26691 r=0 w=0 time=3780240 us)
