1 / 17

Improved Query Performance With Variant Indexes

Improved Query Performance With Variant Indexes. written by Patrick O’Neil Dallan Quass. Outline. Value-List Index (B + tree) Bitmap Indexes Projection Indexes Bit-Sliced Indexes Comparison of Index Types for different queries. Value-List Index (B + tree).

isla
Download Presentation

Improved Query Performance With Variant Indexes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Improved Query Performance With Variant Indexes written by Patrick O’Neil Dallan Quass

  2. Outline • Value-List Index (B+ tree) • Bitmap Indexes • Projection Indexes • Bit-Sliced Indexes • Comparison of Index Types for different queries

  3. Value-List Index (B+ tree) Used to retrieve rows of a table with specified values involving one or more indexed columns Leaf values of the B+ tree consists of a sequence of entries for index key values Each key value entry references the set of rows with that value ----usually B+ trees references each row as RID(Row ID) Problem- In indexes with a relatively small number of key values compared to the number of rows, most key values will have large number of associated RID’s

  4. Bitmap Indexes • Bitmap is an alternate method of representing RID in a Value-List index • A “Bitmap B” is defined on table T as a sequence of M bits, where for each row with ordinal number j, we set the jth bit in B if that row satisfies the property of the index • Advantages:- • Bitmaps are more space efficient than RID lists in a Value-List index • Bitmaps are more CPU efficient for may functions because of the simple representation • Disk savings are enormous in case when Bitmaps are dense i.e number of one bits as a proportion of all bits in a Bitmap are large

  5. Bitmap Indexes (contd ..) • Bitmap Index Performance :- • boolean operations such as AND, OR, and Not are extremely fast for • Bitmaps • --- AND for ( I=0 ; I < len(B1) ; I++) • B3[I] = B1[I] & B2[I]; • --- OR • for ( I=0 ; I < len(B1) ; I++) • B3[I] = B1[I] || B2[I]; • ---NOT • for ( I=0 ; I < len(B1) ; I++) • B3[I] = ~B1[I] & EBM[I]; • These loops to calculate AND,OR, or NOT are extremely fast compared to same done on RID lists as long as the Bitmaps involved have reasonably • high density.

  6. Advantages (Bitmap indexing)- • Fast operations • The most common operations are the bitwise logical operations • They are well supported by hardware • Easy to compress, potentially small index size • Each individual bitmap is small and frequently used ones can be cached in memory • Efficient for read-mostly data: data produced from scientific experiments can be appended in large groups • Available in most major commercial DBMS

  7. Projection Index A projection index for column duplicates all column values for lookup by ordinal number . projection index for col2

  8. Projection Index (cont..) ----projection index turns out to be quite useful in cases where column values must be retrieved for all rows of the foundset, where the foundset is dense enough such that several column values would probably be found on the same disk page of the projection index, but rows themselves being larger, would appear on several different pages ----faster to retrieve a foundset of column value from projection indexthan from rows themselves

  9. Bit-Sliced Index A Bit-Sliced index for a column creates Bitmaps for significant bits in the column value B5 bit-slice B4 B3 B2 B1 B0 Bnn – bitmap representing set of non null values in the indexed column

  10. Comparison of Indexes evaluating Single-Column Sum Aggregates Plan 1 : Direct access to the rows to calculate the Sum -- costs includes I/O cost and CPU cost for getting proper row and column value from buffer resident page. Plan 2 : Calculating Sum through a Projection Index Plan 3 : Calculating the Sum through a Bitmap Index if (COUNT (Bf AND Bnn) = = 0) Return null; SUM = 0.0; for each non-null value v in the index for C { Designate the set of rows with value v as Bv SUM += v * COUNT(Bf AND BV); } Return SUM;

  11. Comparison of Indexes evaluating Single-Column Sum Aggregates Plan 4 : Calculating the SUM through a Bit-Sliced Index if (COUNT (Bf AND Bnn) = = 0) Return null;SUM = 0.0;for i = 0 to N SUM += 2i * COUNT(Bi AND Bf);Return SUM;

  12. Comparison of Indexes evaluating Single-Column Sum Aggregates

  13. Evaluating Range Predicates SELECT target-list FROM T WHERE C-range AND <condition> Plan 1 : Evaluating the Range using the Projection Index :- --- we create BF in a straightforward way by accessing each C value in the index corresponding to an ordinal row number in Bf and testing whether it lies with the specified range Plan 2 : Evaluating range predicates using Value-List index :- Br = empty set; for each entry v in the index for C that satisfies the range specified Designate the set of rows with the value v as Bv Br = Br OR Bv BF = Bf AND Br

  14. Evaluating OLAP-style Queries • Data warehouses are often built to support OLAP. OLAP query • performance depends on creating a set of summary tables to efficiently • evaluate an expected set of queries. This approach is possible only whenthe expected set of queries is known in advance. Specifically, the OLAP • Approach addresses queries that group by different combination columns, known as dimensions. But when ad-hoc queries must be issued that selects the rows on the criteria that are not part of the dimensional scheme,summary tables that don’t foresee such selection cannot be used and in such cases other indexes on the base data must be used. • Join Indexes and Bitmap-Join-Indexes are tried to address this problem.

  15. Join Indexes • A join index is an index on one table that involves a column value from different table through a commonly encountered join. • but, if there r numerous columns used for restrictions in each dimensiontable, then the number of star join indexes needed to be able to combine any single column choice from each dimension table is a product of the number of columns in each dimension. There will be a combinatorial explosion of join indexes in terms of the number of useful columns. • bitmap join index addresses this problem

  16. Bitmap Join Index • It is an index on a table T based on a single column of a table S, where S commonly joins with T in a specified way. • Obviously the number of indexes of this kind increases linearly with the number of useful columns in the dimension tables, but the speed of combining Bitmapped indexes to create ad-hoc combinations takescare of the problem of explosion of star join indexes.

  17. Improved Grouping Efficiency Using Segmentation & Clustering • Segmentation: It is one of the methods used for speed up the query with one or more group-by attributes. It states that rather than evaluatinga query for all rows of a table at once, the rows are partitioned into segments. Query evaluation is performed on one segment at a time, and the results from evaluating each segment are combined at the end to formthe final query result. • Clustering: It asks for the clustering for the finely divided fact table Fto improve the performance. Not only queries on fact tables, it also improves group-by queries on the dimensions.

More Related