
DBMS Performance: A Multidimensional Challenge



Presentation Transcript


  1. DBMS Performance: A Multidimensional Challenge • Vadiraja Bhatt, Database Performance Engineering Group

  2. DBMS Performance • DBMS servers are the de facto backend for most applications • Simple client-server • Multi-tier • ERP (SAP, PeopleSoft, …) • Financial • Healthcare • Web applications • Each application has unique performance requirements • Throughput vs. response time

  3. Performance Challenges • Keeping up with various application characteristics • Financial • Mostly OLTP • Batch jobs • Report generation • Security • Internet portal • Mostly read-only • Loosely structured data • Transactions are not important • Short-lived sessions

  4. Performance Challenges: The Data Explosion • Web: 174 TB on the surface • Email: 400,000 TB/year • Instant messaging: 274 TB/year • Transactions: growing 125%/year • [Chart: data volume vs. time] • Source: http://www.sims.berkeley.edu/research/projects/how-much-info-2003/

  5. Performance Challenges • Demand for increased capacity • Large number of users • Varying classes of service • Need to satisfy SLAs • Several generations of applications • Client-server to Web applications • Legacy is always painful

  6. Performance Challenges • Consolidation • Cheaper hardware • Increased maintenance cost • Increased CPU power • Reduction in TCO (total cost of ownership) • Net effect • DBMSs have to work a lot harder to keep up

  7. System Performance 101 • How do I get my hardware and software to do more work without buying more hardware? • System performance mainly depends on 3 areas • Processor • Memory • I/O

  8. De-mystifying Processors • Classes of processors • CISC/RISC/EPIC • Pentium is the most popular one • Itanium didn’t make a big impact (in spite of superior performance) • Dual-core/quad-core processors are making noise • One 1.5 GHz CPU does not necessarily yield the same performance as two 750 MHz CPUs. Parameters to account for are: • L1/L2/L3 cache sizes (internal CPU caches) • Memory latencies • Cycles per instruction
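The clock-speed comparison on this slide can be sketched with a simple effective-CPI model: once memory stalls are included, a faster clock does not guarantee more instructions retired per second. All numbers below are illustrative assumptions, not measurements from any real processor.

```python
# Sketch: why clock speed alone doesn't determine throughput.
# Stall counts and CPI values here are made up for illustration.

def instructions_per_second(clock_hz, base_cpi, stall_cycles_per_instr):
    """Effective instruction rate once memory stalls are included."""
    effective_cpi = base_cpi + stall_cycles_per_instr
    return clock_hz / effective_cpi

# One 1.5 GHz CPU that stalls often (assume a small cache, high memory latency)
fast_core = instructions_per_second(1.5e9, base_cpi=1.0, stall_cycles_per_instr=2.0)

# Two 750 MHz CPUs that stall less (assume a larger, better-used L2 cache)
slow_cores = 2 * instructions_per_second(750e6, base_cpi=1.0, stall_cycles_per_instr=0.5)

# Under these assumptions the two slower CPUs deliver more work.
print(fast_core, slow_cores)  # 500000000.0 1000000000.0
```

The point of the sketch is only that cache sizes, memory latency, and cycles per instruction enter the equation alongside clock speed.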

  9. DBMS and CPU Utilization • DBMS servers are memory centric • Experiments have shown < 50% CPU utilization • > 50% of cycles wasted in memory-related stalls • The computational power of the CPU is underutilized • Memory access is very expensive • Large L2 cache → increased memory latency • Cacheline sharing • Light-weight locking constructs cause a lot of cache-to-cache transfers • Context switches become expensive • Maintaining locality is very important

  10. ASE – Architecture Overview • [Diagram: ASE engines (Engine 0, Engine 1, … Engine N) run as OS processes on the CPUs, each with its own registers and file descriptors; they share the executable (program memory) and shared memory, which holds the procedure cache, run queues, sleep queues, lock chains, and pending network/disk I/Os]

  11. ASE Engines • ASE engines are OS processes that schedule tasks • ASE engines are multi-threaded • Native thread implementation • ASE performs automatic load balancing of tasks • Load balancing is dynamic • Network endpoints are also balanced • ASE has automatic task-affinity management • Tasks tend to run on the same engine that they last ran on, to improve locality

  12. ASE Engines • 2 configuration parameters control the number of engines • The sp_engine stored procedure can be used to “online” or “offline” engines dynamically • Tune the number of engines based on the “Engine Busy Utilization” values presented by sp_sysmon
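The tuning loop this slide suggests can be sketched as a small decision rule: read the "Engine Busy Utilization" figure from sp_sysmon and decide whether to online or offline an engine. The thresholds below are illustrative assumptions, not Sybase recommendations.

```python
# Sketch of engine tuning driven by "Engine Busy Utilization".
# The 90%/40% thresholds are hypothetical, chosen only for illustration.

def recommend_engines(current_engines, busy_utilization_pct):
    """Suggest an engine count from an sp_sysmon-style utilization figure."""
    if busy_utilization_pct > 90:
        return current_engines + 1      # engines saturated: online one more
    if busy_utilization_pct < 40 and current_engines > 1:
        return current_engines - 1      # mostly idle: offline one
    return current_engines              # within band: leave as is

print(recommend_engines(4, 95))  # 5
print(recommend_engines(4, 30))  # 3
print(recommend_engines(4, 60))  # 4
```

In ASE itself the actual change would be made with the sp_engine stored procedure mentioned above; this sketch only captures the decision, not the mechanism.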

  13. Benefits of Multiple Engines • Multiple Engines take advantage of CPU processing power • Tasks operate in parallel • Efficient asynchronous processing • Network I/O is distributed across all engines • Adaptive Server performance is scalable according to the CPU processing power

  14. ASE Engines • Logical process management • Lets users create execution classes • Bind applications/logins to specific execution classes • Add engines to execution classes • Priorities can be assigned to execution classes • Lets DBAs create various service hierarchies • Dynamically change the priorities • Dynamically modify the engine assignments based on the CPU workload

  15. Let’s Not Forget “Memory” • Memory is a critical factor in overall system performance • Every disk I/O saved is performance gained • Tools to monitor and manage memory

  16. ASE Memory Consumption • Over 90% of memory is reserved for buffer caches • DBMS I/Os • Varying sizes • Random vs. sequential • Asynchronous vs. synchronous • Object access patterns • Most often accessed objects

  17. Traditional Cache Management • All the objects share the space • Different objects can keep throwing each other out • A long-running, low-priority query can throw out often-accessed objects • Increased contention in accessing data in the cache
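The eviction problem on this slide can be seen in a few lines: in a single shared LRU cache, a long scan of a big table pushes the hot pages out. The cache below is a minimal sketch (page IDs and capacity are made up), not ASE's actual buffer manager.

```python
from collections import OrderedDict

class SharedCache:
    """Minimal shared LRU buffer cache, for illustration only."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> owning object

    def access(self, page_id, obj):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # mark most recently used
        else:
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
            self.pages[page_id] = obj

cache = SharedCache(capacity=4)
for p in ("hot1", "hot2"):                      # frequently used lookup table
    cache.access(p, "hot_table")
for p in ("scan1", "scan2", "scan3", "scan4"):  # long low-priority table scan
    cache.access(p, "big_table")

# The scan has flushed the hot pages out of the shared cache.
print("hot1" in cache.pages)  # False
```

This is exactly the contention that the named caches on the next slides are designed to avoid.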

  18. ASE: Distributing I/O Across Named Caches • If there are many widely used objects in the database, distributing them across named caches can reduce spinlock contention • Each named cache has its own hash table
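The structure this slide describes can be sketched as follows: each named cache carries its own hash table and its own lock, so lookups on objects bound to different caches never contend. The cache names and bindings below are hypothetical examples, not ASE defaults.

```python
import threading

class NamedCache:
    """Sketch of a named cache with its own lock and its own hash table."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # stands in for the per-cache spinlock
        self.hash_table = {}           # page_id -> page contents

    def get(self, page_id):
        with self.lock:                # only users of THIS cache wait here
            return self.hash_table.get(page_id)

# Hypothetical caches and object bindings, for illustration only.
caches = {
    "default data cache": NamedCache("default data cache"),
    "log_cache": NamedCache("log_cache"),
    "tempdb_cache": NamedCache("tempdb_cache"),
}
bindings = {"syslogs": "log_cache", "tempdb": "tempdb_cache"}

def cache_for(obj):
    """Route an object to its bound cache, else to the default data cache."""
    return caches[bindings.get(obj, "default data cache")]

print(cache_for("syslogs").name)  # log_cache
print(cache_for("orders").name)   # default data cache
```

Splitting one lock into several per-cache locks is the whole point: hot objects in different caches stop queueing on the same spinlock.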

  19. Distributing I/O Across Cachelets • [Diagram: a named cache divided into cachelets, each with its own hash table]

  20. ASE Named Caches • What to bind • Transaction log • Tempdb • Hot objects • Hot indexes • When to use named caches • For highly contended objects • Hot lookup tables, frequently used indexes, tempdb activity, and high-transaction-throughput applications are all good scenarios for using named caches • How to determine what is hot? • ASE provides the Cache Wizard

  21. Cache Wizard: Examples • sp_sysmon '00:05:00', 'cache wizard', '2', 'default data cache' • Output for the default data cache includes: buffer pool information, object statistics, and cache occupancy information

  22. Reading ’n’ Writing [Disk I/O] • I/O avoided is performance gained • The ASE buffer cache has algorithms to avoid/delay/optimize I/Os whenever possible • LRU replacement • MRU replacement • Tempdb writes delayed • Write-ahead logging [only the log is written immediately] • Group commit to batch log writes • Coalescing I/O using large buffer pools • UFS support • Raw devices and file systems supported • Asynchronous I/O is supported
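The LRU vs. MRU bullet points deserve a tiny illustration: with an MRU ("fetch-and-discard") strategy, pages read by a large scan are inserted at the eviction end of the chain, so the scan recycles its own slot instead of flushing the hot pages. This is a sketch of the idea, not ASE's buffer chain implementation.

```python
from collections import OrderedDict

class BufferPool:
    """Sketch of a buffer pool supporting LRU and MRU placement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # front = LRU (eviction) end, back = MRU end

    def read(self, page_id, strategy="lru"):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)              # cache hit
            return
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)               # evict from LRU end
        self.pages[page_id] = True
        if strategy == "mru":
            self.pages.move_to_end(page_id, last=False)  # discard this page first

pool = BufferPool(capacity=3)
for p in ("hot1", "hot2"):
    pool.read(p)                       # hot pages, normal LRU placement
for p in ("scan1", "scan2", "scan3"):
    pool.read(p, strategy="mru")       # scan pages recycle one slot

print("hot1" in pool.pages)  # True: the scan did not evict the hot pages
```

Compare this with the shared-LRU example earlier, where the same scan wiped out both hot pages.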

  23. ASE Asynchronous Prefetch • Issue I/Os in advance on data that we are going to process later • Non-clustered index scan • Leaf pages have (key, pageno) • Table scan • Recovery • Reduces waiting on I/O completion • Eliminates context switches
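The idea can be sketched with a thread pool standing in for asynchronous disk I/O: because the leaf pages of a non-clustered index already list the page numbers to visit, reads for later pages can be in flight while the current page is processed. The page contents and the prefetch depth below are simulated assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def read_page(pageno):
    """Stands in for an asynchronous disk read of one page."""
    return f"data-for-page-{pageno}"

def scan_with_prefetch(page_numbers, depth=2):
    """Process pages in order while up to `depth` reads are in flight."""
    results = []
    with ThreadPoolExecutor(max_workers=depth) as pool:
        # Issue all the reads up front; the pool keeps `depth` in flight.
        futures = [pool.submit(read_page, p) for p in page_numbers]
        for fut in futures:
            results.append(fut.result())   # consume pages in scan order
    return results

# The (key, pageno) pairs from the index leaf pages give us the pages to visit.
pages = scan_with_prefetch([10, 42, 7])
print(pages[1])  # data-for-page-42
```

The task never sleeps waiting for a page it could have requested earlier, which is where the reduced I/O waits and fewer context switches come from.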

  24. ASE – Private Log Cache • [Diagram: a transaction (begin tran … sql … commit tran) writes its log records L1, L2, L3 into a private log cache (PLC); at commit the PLC is flushed into the shared log cache and the log pages are written to the log disk]

  25. Logging Resource Contention • The PLC eliminates steady-state logging contention • During commit • Acquire the lock on the last log page • Flush the PLC • Allocate new log pages while flushing the PLC • Issue writes on the dirty log pages • On high-transaction-throughput systems (1 million/min) • The Asynchronous Logging Service (ALS) acts as a coordinator for flushing PLCs and issuing log writes • Eliminates contention for the log cache and streamlines log writing
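The commit sequence above can be sketched as follows: each task appends log records to its private cache without any locking, and only the flush at commit takes the shared lock, once per transaction. The class and record names are hypothetical; this is the contention pattern, not ASE's log manager.

```python
import threading

shared_log = []                    # the shared transaction log
log_lock = threading.Lock()        # stands in for the last-log-page lock

class Task:
    """Sketch of a task with a private log cache (PLC)."""
    def __init__(self, name):
        self.name = name
        self.plc = []              # private log cache: no lock needed here

    def log(self, record):
        self.plc.append((self.name, record))   # buffered privately, no contention

    def commit(self):
        with log_lock:             # one lock acquisition per transaction
            shared_log.extend(self.plc)        # flush the PLC to the shared log
        self.plc.clear()

t = Task("txn1")
t.log("insert row 1")
t.log("insert row 2")
t.commit()
print(len(shared_log))  # 2
```

Without the PLC, every individual log record would have to take the shared lock; with it, contention is proportional to commits rather than to log records.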

  26. Very Large Database Support • Very large device support • Supports storage of 1 million terabytes in one server instance • Virtually unlimited devices per server (> 2 billion) • Individual devices up to 4 TB (supports large SATA drives) • Partitions • Divide up and manage very large tables as individual components

  27. More on Partitions • [Diagram: an unpartitioned large table serves DSS queries, large deletes, update statistics, OLTP apps, and data loads against one structure; in a partitioned table each operation works against a small chunk (one partition)]

  28. Partitions – Benefits • Why partitions • VLDB support • High performance and parallel processing • Increased operational scalability • Lower total cost of ownership (TCO) through: improved data availability, improved index availability, enhanced data manageability • Benefits • Partition-level management operations require smaller maintenance windows, thereby increasing data availability for all applications • Partition-level management operations reduce the DBA time required for data management • Improved VLDB and mixed-workload performance reduces the total cost of ownership by enabling server consolidation • Partitions that are unavailable, perhaps due to disk problems, do not impede queries that only need to access other partitions
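The maintenance-window benefit above can be sketched with a range-partitioned table: a large purge (or an update-statistics run) touches exactly one partition and never scans the others. The partition key, bounds, and rows below are made up for illustration.

```python
# Sketch of range partitioning by year; rows are (order_id, year) tuples.
# The table layout is hypothetical, not an ASE catalog structure.

partitions = {
    2004: [("ord1", 2004), ("ord2", 2004)],
    2005: [("ord3", 2005)],
    2006: [("ord4", 2006), ("ord5", 2006)],
}

def partition_for(year):
    """Partition pruning: route a query on the key to one partition."""
    return partitions[year]

def drop_old_data(year):
    """A partition-level purge: the other partitions are never touched."""
    partitions[year] = []

drop_old_data(2004)                     # large delete, confined to 2004
total_rows = sum(len(rows) for rows in partitions.values())
print(total_rows)  # 3
```

Queries against 2005 or 2006 run unaffected while the 2004 partition is being purged, which is the availability argument the slide makes.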

  29. Sustained Query Performance • Upgrades are a double-edged sword • Offer new features, fixes, enhancements • May destabilize applications when a fix exposes issues that previously cancelled each other out • Customers don’t want to compromise current performance levels • Long testing process before any upgrade happens • Time is money • Expensive to upgrade from the customer’s perspective • Unless we have a mechanism to ensure the same performance level, we are at a loss

  30. Query Optimization - The Reality • ASE query optimization usually chooses the best query plan, however, sub-optimal query plans will occasionally result. • Furthermore, whenever changes are made to the optimization process on ASE: • Most queries will execute faster • Some queries will be unaffected • Some queries may execute slower (possibly a lot slower) • The bottom line is that optimizers are not perfect! • As a Senior Sybase Optimizer Developer once told me, “If they were perfect, they would call them perfectizers.”

  31. ASE Solution: Abstract Plans • The Abstract Plans feature provides a new mechanism for users to communicate with ASE regarding how queries are executed. • This communication: • Does not require changing application code • Applies to virtually all query plan attributes • Occurs at the individual query-level • May be stored and bundled in with applications • Will persist across ASE releases

  32. What Are Abstract Plans? • An abstract plan (AP) is a persistent, readable description of a query plan • Abstract plans: • Are associated with exactly one query • Are not syntactically part of the query • May describe all or part of the query plan • Override or influence the optimizer • A relational algebra language, specific to ASE, defines the grammar of the abstract plan language • The AP language uses a Lisp-like syntax

  33. What Do Abstract Plans Look Like? • Full plan examples: • select * from t1 where c=0 → (i_scan c_index t1) • Instructs the optimizer to perform an index scan on table t1 using the c_index index • select * from t1, t2 where t1.c = t2.c and t1.c = 0 → (nl_g_join (i_scan i1 t1) (i_scan i2 t2)) • Instructs the optimizer to: • perform a nested-loop join with table t1 outer to t2 • perform an index scan on table t1 using the i1 index • perform an index scan on table t2 using the i2 index
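Because the AP language is Lisp-like, plans such as the ones above parse naturally into nested lists. The toy parser below handles only parenthesized operator trees, a deliberate simplification of ASE's real AP grammar.

```python
import re

def parse_ap(text):
    """Parse a Lisp-like abstract plan string into nested Python lists.

    This grammar (atoms and parenthesized groups only) is a simplified
    stand-in for ASE's abstract plan language.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)

    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1       # skip the closing paren
        return tokens[pos], pos + 1    # an atom: operator or identifier

    node, _ = read(0)
    return node

plan = parse_ap("(nl_g_join (i_scan i1 t1) (i_scan i2 t2))")
print(plan)  # ['nl_g_join', ['i_scan', 'i1', 't1'], ['i_scan', 'i2', 't2']]
```

The nesting mirrors the plan shape: the join operator at the root, one index scan per child, which is why a Lisp-like syntax is a good fit for describing operator trees.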

  34. Compilation Flow – Capture Mode • Abstract plans are generated and captured during compilation; query plans are generated as usual • [Diagram: SQL text (stored in sysprocedures for compiled objects only) passes through the query resolution process to produce a query tree, then through the query compilation process; the resulting query plan is stored in the procedure cache (non-persistent ASE memory) and the abstract plan is stored in sysqueryplans (the persistent system catalog on disk)]

  35. Compilation Flow – Association Mode • At query execution, the optimizer searches for a matching abstract plan; if a match is found, it impacts the query plan, and if not, compilation proceeds normally • [Diagram: the same flow as capture mode, except that during compilation the server checks for a matching AP; if one exists, it is read from sysqueryplans and used in building the query plan stored in the procedure cache]
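The association step can be sketched as a lookup keyed by the query text: if a stored abstract plan matches, it constrains the optimizer; otherwise optimization proceeds normally. The dictionary below is a hypothetical stand-in for the sysqueryplans catalog, and the return strings are illustrative only.

```python
# Hypothetical stand-in for the sysqueryplans catalog: query text -> AP.
sysqueryplans = {
    "select * from t1 where c=0": "(i_scan c_index t1)",
}

def compile_query(sql):
    """Sketch of association mode: check for a matching AP at compile time."""
    ap = sysqueryplans.get(sql)            # matching AP?
    if ap is not None:
        return f"plan forced by AP: {ap}"  # the AP impacts the query plan
    return "plan chosen by optimizer"      # no match: normal optimization

print(compile_query("select * from t1 where c=0"))  # plan forced by AP: (i_scan c_index t1)
print(compile_query("select * from t2"))            # plan chosen by optimizer
```

Because the match is keyed by the query text rather than by the application, this is how abstract plans can change plan choice without any change to application code.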

  36. Q & A Thank you
