RDBMS Fundamentals: Indexing

Relational data structures: Indexes (examples on Informix Dynamic Server – IDS) RDBMS Fundamentals: Indexing

Objectives Understand the benefits and costs of indexes. Understand how indexes are implemented. Identify the different index structures: Understand the B+ tree. Understand features and guidelines to use: Implicit versus Explicit indexes. Unique versus Duplicate indexes. Simple versus Composite indexes. Cluster indexes. Functional indexes. Understand decision criteria for Index storage: Attached versus Detached indexes. Index fragmentation. Select an appropriate fill factor for an index. Know how to alter, drop, rename and maintain an index. Understand how Indexing works on IBM Informix.

Sequential Scans • A Sequential Scan or Table Scan: • Reads all the pages that belong to the table, returns all the rows of the table. • Starts with the first page of the table, and orderly travels across all the devices that contain pages of the table until retrieve the last page. • Sequential scan of a large table is an expensive operation in OLTP systems. • I/O is expensive (mechanic) operation. • Also fills up the memory buffers with unnecessary pages, affecting the buffer profiling and the performance of the database server. • Small sequential scans (seq scans in small tables) are acceptable in OLTP.

When do Sequential Scans work well? • In any case of access (random or sequential) on very small tables: • The table fits in just a few pages and can be accessed with minimum I/O. • Reporting-type of queries, where all the rows of the table are needed • Random-access is not an option because the whole table is needed to process the results: • Example: Queries using aggregate functions (max, min, avg, sum) and no filters. • Result set is large enough or non-selective enough, so scanning the whole table is cheaper than randomly access every row: • Non-restrictive filters using low cardinality columns or low selectivity filters.

Sequential Scans – The need for indexes (1) • Imagine this scenario: Filter rows in a table based on a condition: • Lookup case, search for rows meeting a restrictive condition (filter). • Just a few rows meet the condition (e.g. <= 5% of total data). • The filter will retrieve scattered rows, not placed together in disk. • We should NOT use a sequential scan to do a random or non-sequential access with high-selectivity filters: • The bad performance will be more evident as we use a more restrictive / selective filter (for instance, on columns with high-cardinality).

Sequential Scans – The need for indexes (2) • Imagine this scenario: Join between two tables, at least one is large • The tables are related in just a few rows (equi-join). • The iterations in a nested loop join scenario will severely decrease the performance • The # of access reads will be multiplied by the rows sequentially scanned. • The performance depends on the order taken for the tables in the join and whether or not an index will be used in any of the tables.

Indexes – The Basics (1) An index is a structure or object in the database. Database indexes are similar to indexes in books or file cabinets. Indexes provide fast access to rows in tables meeting certain condition in a query. Minimize I/O, improving performance, specially in random access. It is a dynamic structure: Changes as the data in the table changes. A table can have several indexes, to satisfy several queries.

Indexes –The Basics (2) Unique indexes are necessary on column(s) that must be unique. Presence of an index can allow the optimizer to speed up a query. The optimizer decides whether to use an index or not We can force the optimizer to use or avoid an index. The optimizer can use an index in the following ways: To replace sequential table scans with nonsequential/random access. To avoid reading row data when processing expressions that name only indexed columns. To avoid a sort (including building a temporary table) when executing the GROUP BY and ORDER BY clauses. An index on the appropriate column can save thousands, tens of thousands, or in extreme cases, even millions of disk operations during a query. Indexes entail costs (in space, processing and maintenance).

Query Speed Comparison: Seq scan vs Index scan Example: Suppose you have two tables: tab1: with 200 rows (small table), and tab2: with 500,000 rows (large table) Both with a unique index on the joining column. Assume 1 row per page, therefore 1 data read (I/O) needed per row: For tab1: Assume it takes 2 index reads (+ 1 data read per row) => 3 I/Os For tab2: Assume it takes 3 index reads (+ 1 data read per row) => 4 I/Os. A simple select to find related rows from the two tables is: SELECT * FROM tab1, tab2 where tab1.col1=tab2.col2 Results: Depending on the database optimizer’s decision How many I/O (disk) accesses (index + data reads) needed? If select from the tab1 first and then joins to tab2 using the index: 1,000 disk reads = 200 for tab1 + 200x4 for tab2. If selects from tab2 first, and then joins to tab1 using the index: 2 million disk reads! = 500,000 for tab2 + 500,000x3 for tab1. If no indexes used at all: Sequential scan in both tables: around 1 billion disk reads! = 200 for tab1 x 500,000 for tab2.

Typical Index Structures Binary Search Tree (BST) B-Tree B+ Tree: DB World’s favorite and most widely used index Hash Bitmap R-Tree Forest of Trees (FOT)

B-Tree (1) – Analogy with a File Cabinet • File Cabinet: Organized storage • B-Tree index

B-Tree (2) Allows several keys in one index page / node Keeps similar valued records together on a disk page. This takes advantage of locality of reference over binary search trees (BST). Allows nodes to be incompletely filled Every node in the tree will be full at least to a certain minimum percentage. Some space is wasted (nodes are not entirely full). Requires less re-balancing than binary search trees (BST). Reduces the number of disk fetches necessary during a search.

B Tree (3) • They are always in perfect balance, as all leaf nodes are at the same depth. • Keeps data sorted and allows searches, insertions, and deletions in logarithmic amortized time. • Unlike self-balancing BST, it is optimized for systems that read and write large blocks of data. • Grows from the bottom: When a node is over-full, it is split and the added node is put one level up. Deletions are the reverse of additions.

B-Tree (4) • In B-trees: • The root node points down to branch nodes. • Branch nodes point down to leaf nodes. • Leaf nodes address the actual data rows. • But: B-tree has a problem • It is inefficient to search for a range of data values. • Not able to be used to move laterally through the tree, as well as up and down: • If all adjacent sibling nodes were connected, you could scan an entire tree level with minimum effort. This is done by B+ trees, used by some of the major database vendors.

B-plus Tree (B+ tree) (1) • It is an improved B-tree. The preferred index structure: • Widely used in databases. Used in Informix. • Levels: • The topmost level of the index structure hierarchy contains a single root page. • Each branch page has entries that point to pages in the next level of the index and to the immediate peer nodes (next nodes at the same level). • Each leaf page contains a list of index entries that point to rows in the table. • Index entries in leaf pages/nodes are sorted in key-value order: • An Index entry consists of a key and one or more row pointers. • The key is a copy of the indexed columns from one row of data. • A row pointer provides an address used to locate a row that contains the key (rowid). A sample b+ tree where only forward scan is available

B-plus Tree (B+ tree) (2) – Informix • In Informix: • B+ tree index structures. • An index leaf is formed of: • A key value (the index key). • Row ID’s to find the data row(s). • A delete flag. • The Index root and branch nodes: • Similar to the leaves, but instead of rowids, they have the IDs of other index nodes (pointers to other nodes). • Each Index node/page is doubled-linked with its previous and next index peer node, to optimize: • Key-only and Range searches. • Both Forward and Reverse Order searches. • An index key contains the value of the column we want to search for: • e.g.: first_name, SSN, zip_code.

B+Tree (3) – Informix • In B+ trees: • The root node points down to branch nodes. • Branch nodes point down, left and right. • Leaf nodes point left, right, and down to data via rowids. • The ability to move right or left from a node to its adjacent node puts the plus in B+ tree: • All Informix indexes for non-multidimensional data are B+ trees.

Forest of Trees Index A larger B-Tree index divided into smaller subtrees called “buckets.” You define which columns are used to hash to a “bucket.” You define the number of “buckets.” Each “bucket” acts as sort of a mini index. Traditional Btree Index: create unique index security_idx on SECURITY( S_SYMB, S_CO_ID ) in dbs; Forest of Trees Index: create unique index security_idx on SECURITY( S_SYMB, S_CO_ID ) in dbs hash on ( S_SYMB ) with 1000 buckets;

Traditional B-Tree Index Key Organization

Forest of Trees Index Key Organization

Why Use a Forest of Trees Index? Reduces rootnode contention on small to medium size tables. Reduces the B-Tree Index depth on larger tables: Reducing a 4-level to a 3-level B-Tree reduces buffer reads 25 percent. onstat -g spioutput shows spin locks with large wait numbers and loops: Num Waits Num Loops Avg Loop/Wait Name 1427723 60715268 42.53 fast mutex, 7:bf[299] 0x3400006 0x108b7908000 security_s_symb_s_co_id 879324 40682456 46.27 fast mutex, 7:bf[24] 0x3400003 0x108b74bc000 pk_fk_last_trade 477038 19177763 40.20 fast mutex, 3:bf[391] 0x320000a 0x10177fe2000 daily_market_idx 436564 17239046 39.49 fast mutex, 7:bf[26] 0x3400004 0x108b74c4000 security_s_symb_s_num_out 330244 14563521 44.10 fast mutex, 7:bf[30] 0x3400008 0x108b74d4000 security_s_symb_s_ex_id_s 176874 5183331 29.31 fast mutex, 7:bf[38] 0x3400008 0x108b74f4000 security_s_symb_s_ex_id_ 27050 3636325 134.43 mutex lock, name = log 16555 1301066 78.59 fast mutex, lockhash[3104] 18101 885933 48.94 fast mutex, AIOSHCB lock 17274 879611 50.92 fast mutex, 3:bf[319] 0x1800002 0x10177f52000 trade_history_idx 22507 744135 33.06 fast mutex, 3:bf[655] 0x320000a 0x101781f2000 daily_market_idx 18791 650429 34.61 fast mutex, 3:bf[694] 0x320000a 0x10178240000 daily_market_idx 17600 488946 27.78 fast mutex, 7:bf[54] 0x3400008 0x108b7534000 security_s_symb_s_ex_id_ Large numbers of waits and loops on indexes as monitored by onstat -g spi | sort +1 -nr may indicate a need for a FOT index.

B-Tree Functions Supported in FOT B-Tree functionality supported: Primary key Foreign key B-Tree cleaning Replication Tools (dbschema, oncheck, and other utilties)

B-Tree Functions Not Supported in FOT B-Tree functionality that not is supported: Aggregates such as max() and min() Clustering FILLFACTOR Range scans on columns listed in the HASH ON column list. Range scan example: create index …( col1, col2 ) … hash on ( col1) with 100 buckets; select * …. where col1 > 0; /* not allowed */ select * …. where col1 = 100; /* allowed */ select * …. where col1 = 100 and col2 > 300 /* allowed */

B-plus Tree (B+ tree) (4) – Informix • In Informix: • Forward (Ascending) and Reversed-order (Descending) Index scan available using one single index. Examples (second one is a Key-Only search): • Select fname, lname from customer order by customer_num; • Select customer_num from customer order by customer_num desc; • Range searches get improved with this structure too. Ex: • Select customer_num, fname, lname from customer where customer_num between 50 and 300; • Select customer_num from customer where customer_num >= 50 and customer_num <= 300;

B+ Tree Splits

R-Tree (1) An R-tree index is a secondary data structure (or access method) that organizes data access similar to a B-tree index. Used in Informix, for multi-dimensional and spatial data. R-tree is specifically designed to index table columns that contain the following types of data: Multidimensional data: Spatial data in two or three dimensions An extra dimension that represents time could also be included. Combinations of numerical values treated as multidimensional values: Such as a configuration for a house that includes the number of stories, the number of bedrooms, the number of baths, the age of the house, and the sqf Range values: Such as the time of a television program (9:00 P.M. to 9:30 P.M.) (X, Y) coordinates of geographical data. A common real-world usage for an R-tree might be: "Find all museums within 2 miles (3.2 km) of my current location".

R-Tree (2) • Simple example of an R-tree for 2D rectangles:

Reading through a B+ tree Index (Index scan) • When you access a row through an index: • You read the B+ tree starting at the root node and follow the nodes down to the lowest level, which contains the pointer to the data. • In the example below, 3 index read operations are needed to find the pointer to the data. • Keep key size to a minimum for two reasons • To allow a single index page in memory to hold more key values: • Reduces the number of read operations necessary to look up several rows. • To have fewer B+ tree levels in the index, very important for performance: • An index with a 4-level tree needs 1 more read per row than an index with a 3-level tree • If 100,000 rows read in 1 hour -> 100,000 fewer reads needed to obtain the same data. • For Informix (IDS), the size of a node is the size of one page. 3 level b+ tree. # of page reads needed to find a row: If key-only scan (no need to get the row data page): 3 (index-only) • If we need to get the data pages: 4 (3 index + 1 data)

Placement of an Index – Informix Storage Review • Physical Storage Units: • Chunk • Largest unit of physical disk dedicated to Informix DB server data storage. • Provide DBAs with a significantly large unit for allocating disk space. • Max size of individual chunk is 4TB. • Number of allowable chunks is 32,766. • It is a disk unit: • Raw device (disk partition), or • Regular file system file (cooked file). • Contains a certain number of pages. • Pages from different tables/indexes can be allocated in the same chunk. • Page • Minimum I/O unit. • Commonly 2KB or 4KB, configurable up to 16K.

Placement of an Index – Review of Informix storage • Physical Storage Units (cont): • Extent • Group of contiguous pages within a chunk that store data for a given table, index, table fragment or index fragment. • When creating a table, you specify its initial extent size and next extent size. • As the table grows, it can have multiple extents. • Important to manage appropriate extent sizes and number of extents. • Default size configurable in Informix configuration file (ONCONFIG): • If not set, default is 16k. • An extent contains pages of a single table, index or fragment.

Placement of an Index – Informix Storage Review • Logical Storage Units • Tblspace • Logical collection of all the extents (not necessarily contiguous) allocated to a specific table, index or fragment. • It can include extents stored on a single chunk or on multiple chunks. • A tblspace, however, is always contained within a single dbspace. • Dbspace • Logical collection of chunks. • Form a pool of disk space that is used to store DBs, tables, indexes or fragments. • A single dbspace can contain pages of different tables/indexes. • Special purpose dbspaces: • Root dbspace: stores system’s catalog information and databases. • Blobspace: stores simple binary large objects. • Sbspace:stores smart binary large objects. • Temporary dbspaces: for non-logged temp data.

Placement of an Index – Informix Storage Review • CREATE DATABASE stores_demo WITH LOG IN dbs1; • CREATE TABLE customer (customer_num integer,…) IN dbs2; • CREATE INDEX ix_cust ON customer (customer_num) IN dbs3; • You can place different database objects in different disk spaces: • To balance I/O work across the disk devices and controllers available • Example: • The database stores_demo in a data dbspace dbs1 • If no dbspace is specified, the database is created in rootdbs (the Root dbspace). • The table customer in a data dbspace dbs2 • If no dbspace is specified, the table is created in a new tblspace where its database resides. • The index ix_cust in a data dbspace dbs3 • If no dbspace is specified, the index is created in a new tblspace in the same dbspace as its table. • When an index is created on an empty table, just the root node of the B-tree is created.

Attached vs Detached Indexes (1) • In the past, indexes used to be attached to the table – meaning they did not have a separate tblspace, and the index pages were interleaving with the data pages of the table in the table’s tblspace. • Now, on Informix, all indexes are detached – meaning index extents are stored separately (in a new tblspace) from table extents, even if they are placed within the same dbspace as the table.

Attached vs Detached Indexes (2) An index can be placed in a separate dbspace. Example, this index is stored in a separate dbspace called cust_ix_dbs: CREATE INDEX customer_ix ON customer (zipcode) IN cust_ix_dbs; By default, index extents are created in the dbspace that holds the data (table) extents. A detached index can have a fragmentation strategy different from the one used by its table, that you set up explicitly with CREATE INDEX.

Bidirectional Traversal of Indexes (1) ASC and DESC keywords specify the order to maintain the index. When creating an index on a column, if you omit or specify the ASC keyword, Informix stores the key values in ascending order: Default column order. From the smallest to the largest key. Example on customer’s last name: Albertson, Beatty, Currie. create index ix_cust on customer(lname asc); or create index ix_cust on customer(lname); Use DESC keyword for Informix to store the key values in descending order: From the largest to the smallest key. Example on customer’s last names: Currie, Beatty, Albertson. create index ix_cust on customer(lname desc);

Bidirectional Traversal of Indexes (2) • Informix’s bidirectional traversal capability of the database server lets you create just one index on a column and use that index for queries that specify sorting of results in either ascending or descending order of the sort column.

Implicit vs Explicit Index Implicit indexes are created when a constraint (primary key, foreign key, unique constraint) is defined that cannot use an existing index You cannot specify a dbspace location, fragmentation strategy or fill factor for the index. Implicit indexes are created in the same dbspace as the database: CREATE TABLE tab1 ( col1 INTEGER, col2 INTEGER, col3 CHAR(25), PRIMARY KEY (col1)) IN table1dbs; Explicit indexes are created using CREATE INDEX statement It is recommended to explicitly create indexes that exactly match the referential constraint and then use ALTER TABLE to add the constraint: The constraint will use the existing index instead of implicitly creating one CREATE TABLE tab1 (col1 INTEGER, col2 INTEGER, col3 CHAR (25)) IN table1dbs; CREATE INDEX index1 ON TABLE table_name(col1) IN idx1dbs FILLFACTOR 70; ALTER TABLE tab1 ADD CONSTRAINT PRIMARY KEY (col1);

Unique vs Duplicate Index • Unique indexes allow only one occurrence of a value in the indexed column • Created for columns whose values cannot be repeated within the table. • Used to enforce primary key (PK)’s uniqueness or unique constraints. • An index entry is created for each row in the table, prevents duplicates. • Ex: customer_num, SSN, employee_id • A non-unique, duplicate or secondary index is an index based in a non-key attribute, and allows identical values for multiple rows in an indexed column • Avoid having highly duplicated indexes • Indexes become less effective as they are less unique. • Ex: city, first_name, last_name, order_date, zipcode. CREATE UNIQUE INDEX cust_num_ix ON customer(customer_num); or CREATE DISTINCT INDEX cust_num_ix ON customer(customer_num); CREATE INDEX cust_lname_ix ON customer(lname);

Simple vs Composite Index • A simple index lists only one column (or for IDS, only one column or function) in its index key specification. Example: • CREATE INDEX cust_lname_ix • ON customer(lname); • Any index listing two or more columns is a composite index: • List the columns in the order from most frequently used to least frequently used • Facilitates multiple column joins • Increases uniqueness of indexed values • Example: • CREATE INDEX ix_items ON • items(manu_code, stock_num);

Taking advantage of a Composite Index On Informix, the optimizer can use a composite index (one that covers more than one column) in several ways: You can use an index on columns a, b, and c (in that order) in these ways: CREATE INDEX ix_sample ON sample_table (a, b, c); To locate a particular row using partial-key search: WHERE a=1 WHERE a>=12 AND a<15 WHERE a=1 AND b < 5 WHERE a=1 AND b = 17 AND c >= 40 The following examples of filters cannot use that composite index: WHERE b=10 WHERE c=221 WHERE a>=12 AND b=15 To replace a table scan by a key-only search: when all of the desired columns are contained within the index. To join column a, columns ab, or columns abc to another table. To implement ORDER BY or GROUP BY on columns a, aborabc but not on b, c, ac, or bc.

Cluster Indexes (1) • Used to physically order records of the table according to the index, remove extents interleaving – making pages in the table contiguous, and avoid sorts. • Use a cluster index: • In static tables, tables subject to few or no modifications, and • Where the table is frequently read. • CLUSTER physically reorders the rows of the table: • Informix rewrites the table data rows to match the index order. • Therefore, each table can have only ONE cluster index. • Ex: Create an index on a customer table and physically order the rows according to their last name values, in (by default) ascending order: • CREATE CLUSTER INDEX ix_cust ON customer(lname);

Cluster Indexes (2) • Over time, Informix does not maintain clustering of the data rows as new rows are inserted or as existing key values are updated: • If the table is modified, the benefit of an earlier cluster will disappear as rows are added in space-available, not sequential, order. • Cluster indexes are most effective on relatively static tables but less effective on very dynamic tables. • You can recluster the table to regain performance by issuing another ALTER INDEX TO CLUSTER statement on the clustered index: • ALTER INDEX ix_cust TO CLUSTER; • The TO NOT CLUSTER option drops the cluster attribute on the index name without affecting the physical table.

Functional Indexes (1) A functional index is one in which all keys derive from the results of a function. The functional index can be a B-tree index, an R-tree index, or a user-defined secondary-access method. R-Tree example: If you have a column of pictures, for example, and a function to identify the predominant color, you can create an index on the result of the function: Such an index would enable you to quickly retrieve all pictures having the same predominant color without re-executing the function.

Functional Indexes (2) The function must be a user-defined function (UDF) You cannot create a functional index on any built-in function of SQL. But you can create a UDF that calls a built in function and use this UDF as the index key of a functional index. B-Tree examples: invalid(upper is an Informix built-in function): CREATE INDEX ix1 on state (UPPER(sname)); valid(define UDF myupper as not variant): CREATE FUNCTION myupper (v_value char(15)) RETURNING char(15) with (not variant); define r_value char(15); execute function upper(v_value) into r_value; return r_value; END FUNCTION; create index ix1 on state (myupper (sname));

Index Fill Factor (1) • The DBA can specify the percentage of each page/node that the index will fill during index creation. This is the index fill factor. • The index fill factor is not maintained over the life of the index. Works only for index build.

Index Fill Factor (2) The Informix onconfig FILLFACTOR parameter sets the system default and is used by all indexes created in the system: If the FILLFACTOR is not specified the default is 90 Unless all table indexes receive the same type of activity it is recommended to use the FILLFACTOR option in the CREATE INDEX statement. Ex: CREATE INDEX state_code_idx ON state(code) FILLFACTOR 80;

Index Fill Factor (3) • A high fill factor will produce an initially compact/dense index, providing more efficient caching and reducing the number of pages to read when retrieving rows: • Use a FILLFACTOR of 100 for tables that are receive selects (read only) or deletes to minimize the merging and shuffling pages as keys are removed. • Creating an index with a lower fill factor will produce a sparse index (with more pages and probably levels to read), which can delay the need for node (page) splitting and the accompanying performance impact.

Creating indexes Examples: CREATE UNIQUE INDEX ix_orders ON orders(orders_num) IN idx_dbs; CREATE INDEX ix_items ON items(manu_code, stock_num); CREATE UNIQUE CLUSTER INDEX ix_manufact ON manufact(manu_code) FILLFACTOR 80; CREATE INDEX ix_man_stk ON items(manu_code desc, stock_num); CREATE INDEX order_ix1 ON orders (order_num, order_date desc);

Altering, Dropping, and Renaming Indexes Examples: ALTER INDEX ix_man_cd TO CLUSTER; RENAME INDEX ix_cust TO new_ix_cust; DROP INDEX ix_stock;

Index Partitioning (Fragmentation) • Fragmentation is the distribution of data or index from one table across separate dbspaces (logical groups of disk storage devices). • In Informix, you can create several partitions of a table/index in the same and/or different dbspaces. • Each fragment is stored in its own tablespace (group of extents).

RDBMS Fundamentals: Indexing