410 likes | 1k Views
Practical Use of MDA Tables. A Case-Study in the Practical Use of MDA Tables from Sybase's Internal IT/DBA Team. David Burgess Sybase Staff DBA. Agenda. DBA Nightmares ASE Performance and Tuning Problems The ASE is slow, fix it! How MDA Tables can help How MDA Collector can help
E N D
Practical Use of MDA Tables A Case-Study in the Practical Use of MDA Tables from Sybase's Internal IT/DBA Team David Burgess Sybase Staff DBA
Agenda • DBA Nightmares • ASE Performance and Tuning Problems • The ASE is slow, fix it! • How MDA Tables can help • How MDA Collector can help • Performance and Tuning with MDA Collector • MDA Solutions: Practical applications of MDA Tables and Collector • Common Problems with MDA Tables • Q&A
Presentation Goal: Fixing DBA Nightmares • Black Box ASE • Black Box Applications connecting to the ASE • Hung query • Hung ASE • Query hogging 100% CPU • Query crashing ASE • Query resulting in massive blocking • Problem logins running problem SQL • Runaway SQL consuming log
MDA Quick Review • MDA tables were introduced in ASE 12.5.0.3 • MDA = Monitoring and Diagnostic Access • Also called “monitoring tables” • 41 tables installed ‘master’ database (15.0.2) • Table names begin with “mon” (e.g., monProcessActivity) • Can be accessed with standard SQL commands • Table data is generated from in-memory structures • No data is stored in master database • Tables are proxy tables • No license needed: included in ASE base product • MDA tables provide access to detailed monitoring data • Process Activity • CPU usage, IO activity, resource usage • Resource usage • Data cache, procedure cache, engines • Object usage • Tables, partitions, indexes, stored procedures • Query history • SQL text, statement metrics, query plans, errors
How MDA Tables can help • Identify, track and fix: • Root cause of many ASE P&T issues • Bad SQL that impacts performance • Rogue processes hogging resources (CPU, IO, etc.) • Bad stored procedures • Bad indexes • Problem applications and logins • Unused objects (tables, indexes) • MDA Tables provide a lot of data • Up until recently, ASE did not provide many ready-to-run sprocs to sort that data into information
How MDA Collector can help • Why Setup MDA Collector? • Collector tables can be used as ASE baseline • Problems always happen when DBA is away from desk; Collector is always working • Collector should be setup to gather enough info to catch the problem when it happens • Use the Collector reports to review errors • Review last few “collections” just before ASE problem • Archive Collector reports for root cause analysis
How MDA Collector can help • MDA Collector Reports • MDA Collector has two dozen ready-to-run reports to help ‘translate’ MDA data into information • Use the Collector reports to review errors • Review last few “collections” just before ASE crash or other error • Archive Collector reports for root cause analysis • Run Collector reports to generate a new baseline • New Baseline required after any change • Compare new baseline with old baseline to measure performance improvements
How MDA Collector works • Can be configured to collect on three options • Application SPID running longer than 15 seconds • System info every 60 seconds: Engine Counters, Device, syslogs, contention... • Historical SQL, Errorlog, Deadlocks... every 15 seconds • Can be set to collect ‘snapshot’ for any frequency • May set to archive for any number of days • Example: Prune everything older than 7 days • Techwave 2006 presentation on MDA collector reference: • http://www.sybase.com/techwave/techwave2006
How to setup MDA Collector • Enable MDA tables and grant ‘mon_role’ permission • Create Collector database • Install ASE sprocs and schema from zip-files • http://www.sybase.com/techwave/techwave2006 • Copy start_scripts from zip-files onto ASE host • Run start_script and purge_script
How to startup MDA Collector • START_MDA to call all 3 'start' sprocs • Makes 3 user connections to ASE as the mda_collect user • dbadb_metrics..sp_dba_MON_SYSTEM_control 'start‘ • dbadb_metrics..sp_dba_MON_APP_control 'start' • dbadb_metrics..sp_dba_MON_HIST_control 'start' • Those sprocs then wait in infinite loop until 'stop‘ • A snapshot is collected at every specified control_table interval • How to make it work for your version • Edit control table : …to collect what? …and how often? • Edit Collector sprocs to add additional “filters” sp_dba_MON_SYSTEM_col_v1252 sp_dba_MON_APP_col_v1252 sp_dba_MON_HIST_col_v1252 WHERE ml.SPID = mp.SPID AND ml.KPID = mp.KPID AND mp.SPID <> @v_my_SPID --AND mp.DBID <> db_id("dbadb_metrics") --AND mp.DBID <> db_id("sybsystemprocs") --AND mp.Login <> "mda_collect"
How to use MDA Collector • Example Collector Reports: HIST & APP show minutes, SYS is hour • HIST PROCS • SYS PROCS • APP PROCS dbasp_MONR_HIST_DeadLock Captures all deadlocks and can go back in history dbasp_MONR_HIST_ErrorLog All errors caused by SPIDS. dbasp_MONR_HIST_SQLSmt All statements and associated CPU, Read, Writes, Network activity etc. dbasp_MONR_HIST_SQLText Recreate all the SQL executed, can be limited to a particular SPID dbasp_MONR_SYS_Engine_Stats Collected every 180 Seconds, allows us to identify peak times at a granular level. dbasp_MONR_SYS_OpenDB_Stats DB activity, logs contention .. tune syslogs and tempdb, could be used to justify multiple tempdbs. dbasp_MONR_SYS_SysWaits_Stats Context Switches,. Allows us to identify what kind of resources are limiting the ASE. dbasp_MONR_SYS_CachedObj_Stats Used to identity if a user defined cached is being used or sized right ? dbatb_MON_App_Locks Similar to lock history, will identify which spid are holding locks on which objects during the snapshot. dbatb_MON_App_ProcessActivity Will identify the source of the SPID (IP Address) dbasp_MONR_APP_Process_Waits Shows what the SPID waiting for dbasp_MONR_APP_Process_Smt Information on Statements executing, e.g. CPU, Reads/Writes, source of statement etc., Network activity. dbasp_MONR_APP_Process_SQLText Catches all SQL text, for queries running longer that interval (15 seconds currently) dbasp_MONR_APP_Process_Procs Procedures running longer than 15 secs, and currently executing during the snapshot.
MDA Collector : Tips & Tricks • Consolidating MDA Collector START/STOP scripts to one central host • Consolidating MDA Collector data to one central Reporting ASE • Set RPC or loopback to the reporting ASE • Run Collector reports on Reporting rather than OLTP • Future Direction of DBA team for MDA Collector
ASE Performance and Tuning: ASE is slow, fix it! • DBA Solutions : • Slower than what? Compare against a baseline • Any recent system changes : upgrades, new hardware • Identify the problem: system resource …or… sql and query plan • Rogue transaction hogging resources or blocking users • Users blocked on tempdb • Make one change at a time and compare against previous baseline
P & T with MDA Collector : Where to Start? • Gather baseline of ‘normal’ performance • Advise smaller intervals of about 15 – 20 minutes • Get baseline for peak times and non-peak times • Keep archive of baseline samples dbasp_MONR_SYS_SysWaits_Stats 1 WaitEventID SUM_WaitTime SUM_Waits AVG_WaitTime AVG_Waits Description 250 35419 43653 621 765 waiting for incoming network data 179 13867 288344 243 5058 waiting while no network read or write is required 260 13680 570 240 10 waiting for date or time in waitfor command 61 10017 1797 175 31 hk: pause for some time 19 7081 118 124 2 xact coord: pause during idle loop 57 7079 65 124 1 checkpoint process idle loop 104 3541 118 62 2 wait until an engine has been offlined 178 3522 833 61 14 waiting while allocating new client socket 214 579 30190 10 529 waiting on run queue after yield 215 306 438260 5 7688 waiting on run queue after sleep 251 293 40505 5 710 waiting for network send to complete 51 153 14839 2 260 waiting for last i/o on MASS to complete 29 96 16440 1 288 waiting for regular buffer read to complete 266 88 169 1 2 waiting for message in worker thread mailbox 31 71 13061 1 229 waiting for buf write to complete before writing 41 51 707 0 12 wait to acquire latch 55 47 8310 0 145 wait for i/o to finish after writing last log page 150 40 536 0 9 waiting for a lock 36 17 4483 0 78 waiting for MASS to finish writing before changing 52 11 991 0 17 waiting for i/o on MASS initated by another task 124 9 344 0 6 wait for mass read to finish when getting page 209 3 1875 0 32 waiting for a pipe buffer to read
P & T with MDA Collector : Where to Start? • Reading that Baseline • Start with monProcessWaits/monSysWaits • Info on whether the next step is query related, client software, hardware or contention in ASE • If you know this to be SQL query related, you may be able to skip monProcessWaits and go directly to monProcessActivity/ monProcessStatement/monSysStatement • Most closely approximates sp_sysmon context switching section • …but gives you the details not available with sysmon • …and lets you focus down to the process detail level • Unfortunately, the “WaitEvents” require explanation (see later slide) • Comparing Baselines • Compare Apples-to-Apples : • Remember to compare an equivalent timeframe • Sample interval, peak periods vs. peak periods
P & T with MDA Collector : What to look for? • Focus on the "Waits" • Log, Tempdb, data IO, WaitEvents • Use MS Excel or OpenOffice to plot Requests vs. Waits • Look at monOpenObjectActivity for explanation • Sort report output • Sort by : CPUTime, WaitTime, PhysicalReads, LogicalReads, PagesRead, PhysicalWrites, PagesWritten, MemUsageKB • Remember that SPIDs report total of every and any sql every executed by that spid. • If same spid logged in for weeks, then those report numbers will be very high
P & T with MDA Collector : Good wait events • dbasp_MONR_SYS_SysWaits_Stats 1 WaitEventID SUM_WaitTime SUM_Waits AVG_WaitTime AVG_Waits Description 250 35419 43653 621 765 waiting for incoming network data 179 13867 288344 243 5058 waiting while no network read or write is required 260 13680 570 240 10 waiting for date or time in waitfor command 61 10017 1797 175 31 hk: pause for some time 19 7081 118 124 2 xact coord: pause during idle loop 57 7079 65 124 1 checkpoint process idle loop 104 3541 118 62 2 wait until an engine has been offlined 178 3522 833 61 14 waiting while allocating new client socket 214 579 30190 10 529 waiting on run queue after yield 215 306 438260 5 7688 waiting on run queue after sleep 251 293 40505 5 710 waiting for network send to complete 51 153 14839 2 260 waiting for last i/o on MASS to complete 29 96 16440 1 288 waiting for regular buffer read to complete 266 88 169 1 2 waiting for message in worker thread mailbox 31 71 13061 1 229 waiting for buf write to complete before writing 41 51 707 0 12 wait to acquire latch 55 47 8310 0 145 wait for i/o to finish after writing last log page 150 40 536 0 9 waiting for a lock 36 17 4483 0 78 waiting for MASS to finish writing before changing 52 11 991 0 17 waiting for i/o on MASS initated by another task 124 9 344 0 6 wait for mass read to finish when getting page 209 3 1875 0 32 waiting for a pipe buffer to read 54 1 158 0 2 waiting for write of the last log page to complete 334 1 307 0 5 waiting for Lava pipe buffer for write 35 0 0 0 0 waiting for buffer validation to complete 37 0 8 0 0 wait for MASS to finish changing before changing 53 0 0 0 0 waiting for MASS to finish changing to start i/o 83 0 0 0 0 wait for DES state is changing 85 0 0 0 0 wait for flusher to queue full DFLPIECE 91 0 0 0 0 waiting for disk buffer manager i/o to complete
P & T with MDA Collector : More wait events to watch for… • dbasp_MONR_SYS_SysWaits_Stats 1 WaitEventID SUM_WaitTime SUM_Waits AVG_WaitTime AVG_Waits Description … 51 153 14839 2 260 waiting for last i/o on MASS to complete 29 96 16440 1 288 waiting for regular buffer read to complete 266 88 169 1 2 waiting for message in worker thread mailbox 31 71 13061 1 229 waiting for buf write to complete before writing 41 51 707 0 12 wait to acquire latch 55 47 8310 0 145 wait for i/o to finish after writing last log page 150 40 536 0 9 waiting for a lock 36 17 4483 0 78 waiting for MASS to finish writing before changing 52 11 991 0 17 waiting for i/o on MASS initated by another task 124 9 344 0 6 wait for mass read to finish when getting page 209 3 1875 0 32 waiting for a pipe buffer to read 54 1 158 0 2 waiting for write of the last log page to complete 334 1 307 0 5 waiting for Lava pipe buffer for write 35 0 0 0 0 waiting for buffer validation to complete 37 0 8 0 0 wait for MASS to finish changing before changing 53 0 0 0 0 waiting for MASS to finish changing to start i/o 83 0 0 0 0 wait for DES state is changing 85 0 0 0 0 wait for flusher to queue full DFLPIECE 91 0 0 0 0 waiting for disk buffer manager i/o to complete 92 0 0 0 0 waiting for synchronous disk buffer manager i/o 99 0 0 0 0 wait for data from client 143 0 0 0 0 pause to synchronise with site manager 157 0 0 0 0 wait for object to be returned to pool 169 0 0 0 0 wait for message 197 0 0 0 0 waiting for read to complete in parallel dbcc 200 0 0 0 0 waiting for page reads in parallel dbcc 201 0 0 0 0 waiting for disk read in parallel dbcc 202 0 0 0 0 waiting to re-read page in parallel dbcc 203 0 0 0 0 waiting on MASS_READING bit in parallel dbcc 205 0 0 0 0 waiting on TPT lock in parallel dbcc 207 0 0 0 0 waiting sending fault msg to parent in PLL dbcc 230 0 0 0 0 waiting for site handler to complete setup
P & T with MDA Collector : Wait events to watch for… • Wait event descriptionsWhere to look • “waiting to be scheduled” (cpu) • “waiting for a disk read to complete” (read) • “waiting for a disk write to complete” (write) • “waiting to acquire the log semaphore” (log contention) • “waiting to take a lock” (lock contention) • “waiting for memory or a buffer” (address contention) • “waiting for input from the network” (client speed) • “waiting to output to the network” (client fetch/net stat) • “waiting for internal system event” (PLC, index balance) • “waiting on another thread” (contention)
P & T with MDA Collector : Wait events to watch for… • Client Related S/W Issues • 171 : Waiting for CTLIB event to complete -waiting on next Client command to be sent • 251 : Waiting for network send to complete -waiting for ct_sendpassthru(), • 250 : Waiting for incoming network data -Equivalent to ‘awaiting command’ • Transaction Log Delays • 259 : Waiting until last chance threshold is cleared -Transaction log keeps filling and crossing the LCT • 150 : Waiting for semaphore • 54 : waiting for write of the last log page to complete -you are waiting to write to the last log page • 55 : wait for i/o to finish after writing last log page -you are waiting for the last log page you wrote to flush • Contention • 41 : Wait to acquire latch -Address locking contention (tran log) • 70 : Waiting for semaphore -Typically normal row/pg lock, but could be log semaphore or spinlock contention • 52 : "waiting for last MASS on which i/o was issued by some other task" -Two nearly concurrent select/into's in tempdb • H/W Issues: CPU contention • 214 : Waiting on run queue after yield -in memory scan, join operations, sorting, looping logic in proc, etc. • 215 : Waiting on run queue after sleep -Slow cpu's could result in higher waits on log semaphore and disk writes • 33 : Wait for buffer read to complete -Logical read or network read • 34 : Wait for buffer write to complete -Logical write (update in cache before disk flush)/network send • 179 : waiting while no network read or write is required -Netserver checked and no network read/write pending • Waiting for disk write to complete -Exceeded disk i/o structures and delayed for pending i/o queue
P & T with MDA Collector : Where to look? • monCachedObject - dbasp_MONR_SYS_CachedObj_Stats • Provides statistics for all objects and indexes that currently have pages cached within a data cache • monCachePool - dbasp_MONR_SYS_CachePool_Stats • Provides statistics for all pools allocated for all caches • monOpenObjectActivity - dbasp_MONR_SYS_OpenObj_Stats • Provides statistics for all open objects • monIOQueue - dbasp_MONR_SYS_IOQueue_Stats • Provides device IO statistics broken down into data and log IO, for normal and temporary databases on each device. • monDeviceIO - dbasp_MONR_SYS_DeviceIO_Stats • Provides statistical information about devices • monSysWaits - dbasp_MONR_SYS_SysWaits_Stats • Provides a server-wide view of events that processes are waiting for • monProcessActivity - dbasp_MONR_APP_Proc_Activity • Provides statistics about process activity • monSysStatement - dbasp_MONR_HIST_SQLSmt • Provides statistics for the most recently executed statements. • monCachedProcedures - dbasp_MONR_APP_Process_Procs • Provides statistics about all procedures currently stored in procedure cache • monEngine - dbasp_MONR_SYS_Engine_Stats • Provides statistics regarding ASE engines • monProcedureCacheModuleUsage • Provides procedure cache usage statistics by ASE modules
P & T with MDA Collector : What next? • Step 1: Gather current statement statistics • Check monProcessStatement : dbasp_MONR_APP_Process_Smt • Check monProcessSQLtext : dbasp_MONR_APP_Process_SQLText • Step 2: Get SPID Resource Consumption • Check monProcessActivity : dbasp_MONR_APP_Proc_Activity • Step 3: If High Wait Time – Find cause • Check monProcessWaits : dbasp_MONR_APP_Process_Waits • Step 4: If High I/O Write waits or Tempdb is suspect • Check monProcessObject : dbasp_MONR_APP_Process_Object • Check monOpenObjectActivity : dbasp_MONR_SYS_OpenObj_Stats • Step 5: If Contention • Check monOpenObjectActivity to find table(s) with most contention (LockWaits) • Check monProcess for Blocking : dbasp_MONR_APP_Proc_Activity • Check monLocks, monDeadLocks : dbasp_MONR_APP_Lock_History • Step 6: If Proc (somewhere in proc is slow) • monProcessStatement only gives metrics on current statement within the current batch/context/line • monSysStatement – historical view of the query tree : dbasp_MONR_HIST_SQLSmt
MDA Collector SolutionsCase Study 1: Tempdb full, ASE sprocs hang <SERVER.master.1> select * from master..monProcessSQLText <SERVER.master.2> go SPID KPID ServerUserID BatchID LineNumber SequenceInLine SQLText ------ ----------- ------------ ----------- ----------- -------------- -------------------------------------------------------------------------------------------------------------------------------------- 369 1102774818 5 4 2 1 select bug_id, 369 1102774818 5 4 3 1 ( 369 1102774818 5 4 4 1 select min(a.action_ts) 369 1102774818 5 4 5 1 from sy_resolution r, 369 1102774818 5 4 6 1 sy_resolution_activity a 369 1102774818 5 4 7 1 where 369 1102774818 5 4 8 1 r.resolution_id = a.resolution_id 369 1102774818 5 4 9 1 and a.fieldname = 'res_status' 369 1102774818 5 4 10 1 and a.newval = 'Closed' 369 1102774818 5 4 11 1 and bug_id = bugs.bug_id 369 1102774818 5 4 12 1 ) as report_dt 369 1102774818 5 4 13 1 into #bug_closed 369 1102774818 5 4 14 1 from bugs 369 1102774818 5 4 15 1 where 369 1102774818 5 4 16 1 bug_id in ( 369 1102774818 5 4 17 1 select sy_addl_case.bug_id 369 1102774818 5 4 18 1 from 369 1102774818 5 4 19 1 sy_addl_case join clarify..table_case on 369 1102774818 5 4 20 1 left(sy_customer_case,8)=id_number 369 1102774818 5 4 21 1 join bugs b on b.bug_id = sy_addl_case.bug_id 369 1102774818 5 4 22 1 where 369 1102774818 5 4 23 1 sy_customer_case is not null 369 1102774818 5 4 24 1 and sy_customer_case <> '10001' 369 1102774818 5 4 25 1 and active_flag <> 2 369 1102774818 5 4 26 1 and cure_code = '1' 369 1102774818 5 4 27 1 and b.sy_product_family in ( 'SQL Server','Docs: SQL Server') 369 1102774818 5 4 28 1 and b.bug_type in ('Regression', 'Defect') 369 1102774818 5 4 29 1 and b.bug_severity in (0, 1) 369 1102774818 5 4 30 1 )
MDA Collector SolutionsCase Study 2: ASE stacktrace and crash • ASE stacktrace and die on spid 287 • Use Collector to find out what was spid 287 doing ? • Turns out that the sp_configure 'allow nested triggers' option was enabled on ASE. This results in a trigger + sproc infinite loop = stacktrace and crash. 00:00000:00287:2008/06/13 14:19:03.61 kernel Stack overflow detected: limit: 0x0000010003249290, sp: 0x0000010003249388 00:00000:00287:2008/06/13 14:19:03.62 kernel pc: 0x0000000080f84ea0 pcstkwalk+0x24(0x0000010003249208, 0x0000010003248580, 0x000000000000270f, 0x0000000000000002, 0x0000000000000250) 00:00000:00287:2008/06/13 14:19:03.62 kernel pc: 0x0000000080f84d04 ucstkgentrace+0x1c0(0x000001000be2ff18, 0x0000000000000002, 0x000000000000270f, 0x0000000000000000, 0x0000000000000000) dbasp_MONR_HIST_SQLText 60, null, null , 287 KeyCounter SampleTime SPID KPID BatchID SequenceInBatch SQLText ------ ------ ------ ------ ------ ------ ------ ------ 15244 6/13/08 2:18 PM 287 250937603 12 1 select * from x_debug_on 15244 6/13/08 2:18 PM 287 250937603 13 1 select name from master.dbo.sysdatabases 15244 6/13/08 2:18 PM 287 250937603 14 1 UPDATE table_subcase SET x_eng_prty_rating = "40"\t\t WHERE id_number = @subcase_num
MDA Collector SolutionsCase Study 3: Bad Sprocs • Use MDA to find problem sprocs • dbasp_MONR_APP_Process_Object • dbasp_MONR_APP_Process_Procs • Use MDA to find stored Procedure Performance Averages • Use Collector to find sproc averages over historical timeline… SPID KPID Login Application Command MasterTranID ObjectID OBJ_NAME IndexID LogicalReads 121 23855236 tools_user tools UPDATE $upd 466012129 [NULL] 0 0 121 23855236 tools_user tools UPDATE $upd 466012129 [NULL] 0 1 121 23855236 tools_user tools UPDATE $upd 1327447903 work_nav_names 0 1015 121 23855236 tools_user tools UPDATE $upd 1823449670 work_nav_trees 0 51797903 SPID KPID Login Application Command MasterTranID OwnerUID ObjectID OBJ_NAME 121 23855236 tools_user tools UPDATE $upd 1 276352199 p_tools_get_authorized_path
MDA Collector SolutionsCase Study 4: Bad Cache • Use MDA to show key columns in monStatementCache • TotalSizeKB, NumStatements, NumSearches, HitCount, NumRemovals • Use Collector to show cache stats over time… • dbasp_MONR_SYS_DataCache_Stats 1 , null, null, null, 'summary' 1> select * from monStatementCache 2> go TotalSizeKB UsedSizeKB NumStatements NumSearches HitCount NumInserts NumRemovals NumRecompilesSchemaChanges NumRecompilesPlanFlushes ----------- ----------- ------------- ----------- ----------- ----------- ----------- -------------------------- ------------------------ 40000 37800 1329 529133 138721 390412 421693 12 1398 KeyCounter SampleTime CacheID RelaxedReplacement BufferPools CacheSearches PhysicalReads LogicalReads PhysicalWrites Stalls CachePartitions CacheName Accurate SamplePeriod Hit Ratio 3793 7/7/08 4:10 PM 2 0 1 0 0 0 0 0 1 app_cache Y 97 0 3794 7/7/08 4:12 PM 2 0 1 0 0 0 0 0 1 app_cache Y 96 0 3795 7/7/08 4:13 PM 2 0 1 0 0 0 0 0 1 app_cache Y 93 0 3796 7/7/08 4:15 PM 2 0 1 0 0 0 0 0 1 app_cache Y 93 0 3797 7/7/08 4:16 PM 2 0 1 0 0 0 0 0 1 app_cache Y 94 0 3798 7/7/08 4:18 PM 2 0 1 0 0 0 0 0 1 app_cache Y 93 0 3799 7/7/08 4:19 PM 2 0 1 0 0 0 0 0 1 app_cache Y 92 0
MDA Collector SolutionsCase Study 5: Baseline for setting resource limits • Use Collector to find “high-watermark” of sql • Can then use this information to set realistic resource limits • dbasp_MONR_APP_Proc_Activity, dbasp_MONR_APP_Process_Object • LogicalReads ~ resource limit io_cost SPID Application ObjectID OwnerUserID LogicalReads 590 isql 298132503 1609 824100 sp_help_resource_limit name appname rangename rangeid limitid limitvalue enforced action scope ---- ------- ------------ ------- ------- ---------- -------- ------ ----- NULL isql at all times 1 4 1024000 2 2 1
MDA Collector SolutionsCase Study 6: Bad SQL • Use MDA to find most I/O intensive statement • Use Collector to find most I/O intensive statement in last 2 hours… select * into #ts from master..monSysStatement select KPID, BatchID, LineNumber, LogicalReads, Elapsed = datediff(ms, StartTime, EndTime) from #ts where LogicalReads > 100 order by 4 desc KPID BatchID LineNumber LogicalReads Elapsed ----------- ----------- ----------- ------------ ----------- 574619857 9 13 5509405 68613 575602932 10 3 360656 7956 575209751 10 3 86241 606 575406493 10 3 59546 983 576258422 2 62 39963 223 575275534 1 62 39959 216 575930476 2 62 39955 250 332857800 65454 1 15884 1600 575275534 9 1 12758 176 dbasp_MONR_APP_Proc_Activity 120 , null, null, null, 'ANALYZE'
MDA Collector SolutionsCase Study 7: Archiving unused objects • Use MDA to find old unused tables to be archived • Unused indexes that be dropped or impacting performance • Use Collector to find old unused tables over last few months…. • Archive Collector reports for a few Quarters Find Indexes that have not been used since the server was started select DB = convert(char(20), db_name()), TableName = convert(char(20), object_name(i.id, db_id())), IndexName = convert(char(20),i.name), IndID = i.indid from master..monOpenObjectActivity a, sysindexes i where a.ObjectID =* i.id and a.IndexID =* i.indid and (a.UsedCount = 0 or a.UsedCount is NULL) and i.indid > 0 and i.id > 99 -- No system tables order by 2, 4 asc
MDA Collector SolutionsCase Study 8: Hot tables • Use MDA to find Hot Tables and Indexes • Use Collector to find tables usage over time… • dbasp_MONR_SYS_OpenObj_Stats select db_name = db_name(DBID), table_name = object_name(ObjectID, DBID), IndexID, LogicalReads, PhysicalReads, Operations, LockWaits from master..monOpenObjectActivity order by LogicalReads desc DBName TableName IndexID LogicalReads PhysicalReads Operations ------------------------------ ------------------------------ ----------- ------------ ------------- ----------- qts_db bugs 0 1858361243 181866 3526752 qts_db sy_resolution 0 1534176756 16555 18742290 qts_db e2_cset 0 1462264415 1347 1307975 qts_db x_bugs_nc1 2 1023824797 332059 0 qts_db sy_product_name 0 930762883 145 799079 qts_db sy_resolution_activity 0 873960390 408227 229939 qts_db sn_activity_details 0 675669657 726 17946 qts_db doc_xref 0 396974905 3098 6262053
Common Problems with MDA Tables • Overconfiguration of MDA Table options could result in problems in the master database. • Keep the MDA Table options realistic • Turn on only what you need • Another benefit with Collector; smaller pipe config is enough • exec sp_configure "sql text pipe max messages",100 • exec sp_configure "plan text pipe max messages",100 • exec sp_configure "statement pipe max messages",100 • exec sp_configure "errorlog pipe max messages",100 • exec sp_configure "deadlock pipe max messages",100 • exec sp_configure "max SQL text monitored",2048
Common Problems with MDA Tables (Cont.) • Excessive Polling • E.g. sampling every second • If more than every minute, you'd better have a real good reason • Drives cpu & network I/O artificially high • Collecting Everything for Everybody • Instead of using MDA parameters (especially SPID & KPID) • "turn it all on and wait for magic to happen” • Joining MDA tables (or subqueries) • Accuracy problems if self-joins, subqueries – even normal joins • Enabling pipe tables too early • Determine that you have a bad query before looking for it
Common Problemswith MDA Tables (Cont.) • Performance problems with the following options • Enable only if requried for specific problem • exec sp_configure "plan text pipe active" • exec sp_configure "plan text pipe max messages“ • exec sp_configure "sql text pipe active" • exec sp_configure "sql text pipe max messages” • exec sp_configure "statement pipe max messages” • exec sp_configure "statement statistics active” • exec sp_configure "per object statistics active”
Common Problemswith MDA Tables (Cont.) • MDA table changes between ASE versions • Install script after ASE upgrade • Internal Ct-Lib/Cs-Lib error 33620240: 'cs_convert: cslib user api layer: external error: Conversion between 8 and 12 datatypes is not supported.‘ • Must re-run requried $SYBASE/$SYBASE_ASE/scripts/installmontables • ASE 15.0.2, this script is part of 'installmaster' • Do NOT run this script with 'sqsh‘. 'sqsh' sees a '$' as the start of a sqsh variable, and this messes up the native RPC names, since these RPC names start with a '$' as well.
Common Problemswith MDA Tables (Cont.) • Arithmetic overflow for multiple engines while using MDA monEngine table - ContextSwitches field. • "select sum(convert(numeric(18,0),ContextSwitches)) from master.dbo.monEngine“ • Invalid MDA table data • Incorrect DBID • Invalid columns : wait time
Common Problems with MDA Tables • CR 343252, 367005, 446789 : Fixed in ASE 12.5.4 • Querying the monCachedObject table can cause other processes to timeslice resulting in performance hit. • CR 427028 : Fixed in 12.5.4 • "ubo_object_from_slot:Pss Invalid slot id 0" … "current process infected with 11" in module "lock__print_deadlockchain" after ASE encounters a deadlock when "print deadlock information" is set or the "deadlock pipe active" MDA option is in use. • CR 446789 : Fixed in 12.5.4 ESD#5 • Enabling the 'per object statistics active' Tables configuration option may cause degradation in query performance.
Common Problems with MDA Tables • CR#436330 : Fixed in 15.0.2 ESD#3 • CR#336077 : Fixed in 12.5.1 ESD#2 , 15.0.2 ESD#5 • Under rare conditions querying the monOpenObjectActivity table may result in error 14108, or stacktrace, or crash • Workaround to not collect from monOpenObjectActivity table • update dbadb_metrics..dbatb_mon_control set Record = 'N‘ where KeyType = 'openobjectactivity‘ • sp_sysmon with MDA • Must use sp_sysmon option "noclear" to preserve the counters • sp_sysmon "00:01:00", noclear
Learn more about MDA • ISUG • http://www.sybase.com/sb_content/1027266/MDA_Tables_ISUG_Final.pdf • http://download.sybase.com/presentation/isug_presentations/sybs31104.wmv • http://www.sybase.com/sb_content/1027266/SybaseISUG_MDA-042406.pdf • MDA Tables in ASE – Tips and Tricks • Rob V’s 2004 Techwave Presentation • http://www.sypron.nl/mda.html • Source for the MDA Collector
Q & A and Collector Demo • Latest events from Sybase DBA team… • MDA Collector Demo • Examples of configuring Collector to Collect • sp_configure parameters • dbo.dbatb_mon_control table settings • Collector table schema • Collector sprocs • Example Collector setup in our Dev environment • Examples of running Collector Reports