1 / 42

Super Efficient Indexing

Super Efficient Indexing. Dan Foreman Progress Expert > Hasbeen. Introduction- Dan Foreman. Progress user since 1984 Author of: Progress Performance Tuning Guide Progress Database Admin Guide Progress System Tables ProMonitor - Database Monitoring Tool

nwilliamson
Download Presentation

Super Efficient Indexing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Super Efficient Indexing Dan Foreman Progress Expert > Hasbeen

  2. Introduction- Dan Foreman • Progress user since 1984 • Author of: • Progress Performance Tuning Guide • Progress Database Admin Guide • Progress System Tables • ProMonitor - Database Monitoring Tool • Pro D&L - Dump/Load with minimal downtime

  3. Who Are You? • Progress V6, V7, V8, V9, V10, V12 • Production Database OS?: • Unix • Linux • Windows • Largest Single Database • Largest Concurrent User Count

  4. Progress Access Methods • Indexes • Prefix B+ Tree • Does anyone know what that is? • Read Knuth’s (Donald E. Knuth) Books for an excellent description

  5. B-Tree Indexes

  6. Progress Access Methods • Recid/Rowid • Direct access to a Database Block and within the Block to a Record; no need to read an Index Block • Very Fast (fewest I/Os) • But how often do you (or the application) know the Recid of a Record?

  7. Record I/O Reading one Record may require accessing several Database Blocks (this is called a “DB Request” in promon) 1. Root Block of the B-Tree for that Index 2. One I/O for each B-Tree Level (you can see the total number of B-Tree Levels for each index in proutil dbanalys) 3. If Multi-Index Query, repeat #1 and #2 4. One I/O for each Record Fragment; Fragmentation can also be seen in dbanalys and VSTs

  8. Indexing Record? • The Current Record for most indexes on a table (that I have ever seen) • 59 !!!

  9. The Cost of More Indexes • Disk space due to larger Database size • Increased Index Rebuild time during D&L • Increased time to Create, Delete, and (sometimes) Update Records • Higher Potential for Unused or Underused Indexes

  10. Indexing Rules • Single Index Rules • Multi-Index Rules • Beyond the time available for this Presentation • Learn them! • Progress Database Design Guide • Progress Performance Tuning Guide

  11. The “Default” Index • If no Index is created for a table Progress will create one automatically • Index of RECIDs • Benchmark

  12. Terminology • Bracketing • Using an Index to read a subset of a Table using conditions specified in WHERE, OF, and USING Clauses • Equality Match ( EQ or = ) • Range Match • GT, GE • LT, LE • BEGINS

  13. Bracketing Principles • Components in an index after a Range Match cannot be bracketed • Example • Index on [ entity + order# ] FOR EACH order WHERE entity GE v_e1 AND entity LE v_e2 AND order# GE v_o1 AND order# LE v_o2

  14. Query by Server • A Remote Client sends the WHERE Clause to the Server for resolution so that too many (non-indexed) records do not need to be sent back to the Client over a Network Connection • However certain functions in the WHERE Clause can disable this, examples to follow • Sorting never takes place on the Server

  15. Query by Server • Examples of WHERE Clauses that Disable Query by Server • WHERE CAN-FIND ( something ) • WHERE CAN-DO ( something ) • WHERE RECID ( table ) = something

  16. Server Side Joins – V12.0 • Support of “for each” statements for joins up to 10 tables • No open query or dynamic query operations • Requires multi-threaded database server • -ssj can be changed online (currently primary broker only) • -threadedServer 1 and -ssj 1

  17. Which is Better? • Several Single Field Indexes? • Few Multi-Field Indexes? • Benchmark results

  18. Indexing Quiz - The Easy Ones • for each cus where can-do(cus-type,”a,b,c”) • cus-type is indexed • for each xxx /* Index on [ f1 + f2 ] */ where f1 GE v1a and f1 LE v1b and f2 GE v2a and f2 LE v2b • for each order where not back-ordered • back-ordered is indexed

  19. Indexing Quiz - The Easy Ones • Note that there is NO performance difference between: • WHERE back-ordered • WHERE back-ordered = YES

  20. Indexing Quiz • How do you know for sure which Index is being used? • We can FORCE Progress with USE-INDEX but is that a good thing?

  21. USE-INDEX Example • 100,000 Records • Index on [ order# ] • Index on [ cust# ] • Query: FOR EACH order WHERE order.order# GE v-start# AND order.order# LE v-end# BY order.cust#

  22. USE-INDEX Example • Query #1 Identifies a Bracket of 10,000 Orders • order# Index: 18 seconds • cust# Index: 62 seconds (3.4X slower) • No Surprise • Query #2 Identifies a Bracket of 90,000 Orders • order# Index: 191 seconds • cust# Index: 110 seconds (42% faster) • Big Surprise?

  23. Indexing Quiz • COMPILE code.p XREF code.xrf x.p x.p 1 COMPILE x.p ... x.p x.p 1 SEARCH sports.x f1f2f3asc WHOLE-INDEX x.p x.p 1 SORT-ACCESS sports.x f1 x.p x.p 1 SORT-ACCESS sports.x f2 x.p x.p 1 SORT-ACCESS sports.x f3 ... x.p x.p 4 SEARCH sports.x f1f2f3asc WHOLE-INDEX

  24. COMPILE/XREF • Neither example below is shown as WHOLE-INDEX even though each query could do a Full Table Scan • FOR EACH customer WHERE custnum > 0 : • FOR EACH customer WHERE custnum GE start# AND custnum LE end#

  25. COMPILE/XREF • All the queries below show WHOLE-INDEX • FIND FIRST customer. • no bracket, but fast • FIND LAST customer • no bracket, but fast • FIND NEXT customer. • no bracket, but fast • FIND PREV customer. • no bracket, but fast

  26. Indexing Quiz • Now we know what index or indexes were chosen by the compiler Are we sure that they were used at run-time? How effective were our brackets?

  27. Indexing Quiz • -zqil (Zecret Query Info Log) • Client Startup Option • Writes compile and execution information for a query to the database .lg file each time the query is executed • Beware of queries embedded in loops

  28. Indexing Quiz FOR EACH customer WHERE name EQ "John" AND Comments CONTAINS "hello": • -zqil output ==Compiled Query Resolution Method: Query No. 1== (6135) 11:02:43 AND (6137) 11:02:43 INDEX 13 1 1 (6157) 11:02:43 INDEX 15 1 1 EQUALITY (6154) 11:02:43 ==Server Query execution Method Query No. 1== (6136) 11:02:43 AND (6137) 11:02:43 QUERY-BY-WORD (6164) 11:02:43 INDEX 13 CONTAINS HELLO (6148) 11:02:43 INDEX 15 (6141)

  29. Indexing Quiz • Index on: [ F1 + F2 + F3 + F4 ] FOR EACH table WHERE f1 EQ v1 AND ( f2 EQ v2a OR f2 EQ v2b ) AND f3 EQ v3 • How far does the bracket extend? • Can you prove how far the bracket extends?

  30. Indexing Quiz FOR EACH table WHERE f1 EQ v1 AND ( f2 EQ v2a OR f2 EQ v2b ) AND f3 EQ v3 • -zqil shows: Compiled Query Resolution Method: Query No.1 (6135) INDEX 28 1 1 EQUALITY (6157) Can we get better Bracketing?

  31. Indexing Quiz • A Better Bracket (but uglier) FOR EACH table WHERE ( f1 EQ v1 and f2 EQ v2a and f3 EQ v3 ) OR ( f1 EQ v1 and f2 EQ v2b and f3 EQ v3 ):

  32. Indexing Quiz • -zqil shows: ==Compiled Query Resolution Method: Query No. 1== (6135) 14:07:17 OR (6138) 14:07:17 INDEX 36 3 3 EQUALITY (6154) 14:07:17 INDEX 36 3 3 EQUALITY (6154) 14:07:17 ==Server Query execution Method Query No. 1== (6136) 14:07:17 DUPLICATE-REMOVE (6163) 14:07:17 OR (6138) 14:07:17 INDEX 36 (6141) 14:07:17 INDEX 36 (6141)

  33. Indexing Quiz EACH customer WHERE sales-rep = “bbb” AND comments CONTAINS “Prog*” • COMPILE/XREF shows 2 indexes x.p x.p 1 SEARCH Customer Comments x.p x.p 1 SEARCH Customer Sales-Rep

  34. Indexing Quiz • -zqil shows only one index is actually used at run time; Why? ==Compiled Query Resolution Method: Query No. 1= (6135) 17:32:44 AND (6137) 17:32:44 INDEX 13 1 1 (6157) 17:32:44 INDEX 16 1 1 EQUALITY (6154) ==Server Query execution Method Query No. 1== (6136) 17:32:45 DUPLICATE-REMOVE (6163) 17:32:45 QUERY-BY-WORD (6164) 17:32:45 INDEX 13 CONTAINS Prog* (6148) • Full info on -zqil in Dan’s Progress Performance Tuning Guide

  35. Indexing Quiz FOR EACH table WHERE f1 = v1 OR f1 = v2 OR f1 = v3 • Index on f1 • Can this code be made more flexible? • Remember that for each x where lookup(f1,”a,b,c”) > 0 can’t Bracket on an Index

  36. Indexing Quiz • vstr = v1 + ‘|’ + v2 + ‘|’ + v3. for each x where f1 contains vstr • Flexible (can have one or more values) • Very Fast due to existence of a Word Index • Word Indexing Cost is Zero (for this code) • The size of the vstr variable can’t exceed approximately 128 bytes in older versions • Tested at least 8000 bytes in V11.6.2 • But is there another way?

  37. Indexing Quiz • Temp Table Join /* a browser let’s the user select values and store in temp table */ for each tmp, each cust where cust.f1 = tmp.f1 • Alternatively use a Master File instead of a Temp-Table

  38. Indexing Quiz • What is the fastest way to find unused Indexes? • Indexes defined but never used in the code; beware of Dynamic Queries • Indexes that are defined and used in the code but never accessed

  39. Indexing Quiz • _IndexStat Virtual System Table • Database Startup options required to make data in this table ‘visible’ • V8: -indexbase, -indexlimit • Now: -baseindex, -indexrangesize

  40. -indexrangesize Memory Impact • Increasing -tablerangesize and/or -indexrangesize has an impact on the amount of shared memory allocated by the Database Broker. • Starting in V10.1B -indexrangesize also allocates the memory for the _UserTableStat VST and the size of this memory is approximately: • ( 32 bytes ) * ( -n + -Mn ) * ( -indexrangesize ) • This could be a substantial amount of shared memory with the large value of -n.

  41. _IndexStat • Also use _IndexStat to fix Logical Scatter problems by identifying non-Primary Indexes that are used for frequent, high volume record scans

  42. Conclusion • Questions? • Thank You For Coming! • danf@prodb.com

More Related