ECE 569 Database System Engineering Spring 2003

ECE 569 Database System EngineeringSpring 2003 Yanyong Zhang www.ece.rutgers.edu/~yyzhang Course URL www.ece.rutgers.edu/~yyzhang/spring03

Index • “If you don’t find it in the index, look very carefully through the entire catalog” • An index is a data structure that organizes data records on disk to optimize certain kind of retrieval operations. • A data entry refers to the records stored in an index file. A data entry with search key k, denoted as k*, contains enough information to locate (one or more) data records with search key value k. Three alternatives: • K* can be an actual data record • K* is a (k, tid) pair • K* is a (k, tid-list) pair

Clustered Indexes • When is a file is organized so that the ordering of data records is the same as or close to the ordering of data entries in some index, we say that the index is clustered. • Alternative (1) is clustered • Alternatives (2) and (3) can be a clustered index only if the data records are sorted on the search key field => this is expensive => usually they are unclustered.

Index Data Structures • Two basic approaches • Hash-based indexing • Tree-based indexing • ISAM tree • B+ tree

Indexed Sequential Access Method (ISAM) • Highly static • Each node is a disk page • Leaf nodes are first allocated, then index pages, then overflow pages • Once the ISAM file is created, inserts and deletes affect only the contents of leaf pages. Index pages leaf pages primary pages overflow pages

ISAM lookup 40 20 33 51 63 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97* • Primary leaf pages are allocated sequentially? • Is this assumption reasonable? • no “next-leaf” pointer is necessary

23* 48* 41* 42* ISAM insert Insert 23, 48, 41, 42 40 20 33 51 63 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*

ISAM delete • Removes the entry • If the page becomes empty • If it is overflow page, then delete it • If it is primary page, just leave it as a place holder

ISAM discussion • Pros • We know that the index nodes will not be changed, so that we don’t need to lock them • Cons • Long chains of overflow pages are performance bottleneck

B+-tree • The tree grows/shrinks dynamically • Root index fits in one page and directs search for records in index below it • B+-tree is balanced, i.e., every path through tree is same length. Reasonably easy to maintain this property • Large fan-out of index nodes result in few levels. Three levels can address 16M pages (256 records / page) • depth Index entries (to direct search) data entries

Format of a node • Index node • An index node contains m entries, with d  m  2d. d is called the order of the tree. The root node is required to have 1  m  2d. • p0 K1 p1 K2 p2 … Km pm • Leaf node • Leaf nodes contain the data entries. • A page contains at most 2e-1 records • Records sorted by key value • Doubly linked list

Lookup of key K • Assume B+-tree is of depth l • Construct path B0B1…Bl-1 where • B0 is root node • Kj in block Bi-1 covers K and j th block pointer in Bi-1 is Bi. • Example – Find key K = 245 B0 168 296 B1 B2 B3 140 220 256 303 120 136 140 151 168 170 190 220 255 256 271 296 299 303 312 318 B5 B4 B6 B7 B8 B9 B10 • Path is B0 (168) B1 (220) B7 • 245 is not in B7 => 245 is not in main file

Insertion of record with key K • Follow lookup procedure to find block in which K belongs. Path is B0B1…Bl-1 • If room in Bl-1, then insert there. (Maintain sorted order of Bl-1) • Otherwise, allocate B’ and split records evenly between Bl-1 and B’ • Keys in Bl-1 are less than K’ and those in B’ are greater than or equal to K’ • Insert record (K’, B’) in Block Bl-1. (This insertion can also cause split) • Splitting the root (maintain B0 as the root) • Allocate two new blocks Bl and Br. • Move half of keys in root (keys smaller than K’) to Bl and the rest (keys greater than or equal to K’) to Br . • Modify B0 (original root) to contain (Bl , K’, Br) • Depth is increased to l+1

Delete record with key K • lookup K and find path B0B1…Bl-1 • Delete record from Bl-1 • If Bl-1 now contains fewer than e records • If a neighbor B’ has more than e records, divide the records between Bl-1 and B’ as evenly as possible. Update any ancestors necessary to reflect change. • Otherwise, the records of Bl-1 and one of its neighbors B’ can be combined. B’ is removed and (K’, B’) is removed from parent. This merge can propagate to the root. • If last two children of root are combined, depth is decreased.

Discussion • Merge operations have a high performance penalty; databases tend to grow, so some merges may not be necessary • Remove blocks when empty • Treat merge as a maintenance operation, and do it periodically • What kind of queries can B+-trees help with?

168 220 140 175 296 303 168 170 Dense Indices • Decouple allocation of tuples from access method • Allocate tuples following a heap organization (good utilization) • Access tuples using hashing, B+-tree, etc. • Access methods must be modified slightly • B+-trees: Keys adjacent in key space need not be physically adjacent. Need tuple pointer for each key value (not key range) in leaf nodes. (each tuple can be in a different page) • Hashing: Hash buckets contain key value, tuple pointer pairs.

Secondary Indices • Primary indices provide access based on primary key • Secondary indices provide access based on search fields other than primary key • Index can be used to cluster tuples • Sparse B-tree can be easily modified 168 220 140 175 296 303 168 168 168 170 175 175 175 180 187 200 200

Secondary Indexes (cont’d) • Non-clustered indexes 168 220 140 175 296 303 168 170 175 180 187 200

Performance • Lookup requires l accesses where l is depth • The depth is directly dependant on the fanout of index nodes • Define (sparse B+-tree) – • n = number of records • R = number of records / block (max) • F = number of index entries / block (max0 • u = average node occupancy • R eff = R  u = average number of records / page • F eff = F  u = average number of index entries / page • N  Fl-1eff  Reff •  logFeff( n / Reff) + 1 = l

Performance – cont’d • Utilization • If nodes are merged as described above u  69% • If nodes are removed when empty • # inserts = # deletes, u  40% • 60% inserts, 40% deletes, u  60%

Example • 4000 bytes / block • 200 bytes / record • Key requires 20 bytes • Block pointer requires 4 bytes • n = 1000000 • What is the depth? ( 4)

Key compression • Tree height can be reduced by increasing fan-out of index nodes • Key compression can increase the number of keys that can be stored in an index node • Suffix compression • Store only enough of the key value to discriminate between the children of the index node. For the following keys artful deliver hand access alert amass artful boom deal Deliver everyone fiddle Hand integral leaf • Only need to store the following ar del h access alert amass artful boom deal Deliver everyone fiddle Hand integral leaf

Key compression (cont’d) • Prefix compression • Rather than storing each key value, store the difference from the previous key value • Represent keyi as <j, keyi’> where j is the length of the common prefix shared by keyi and keyi-1, and keyi’ is the remainder of keyi after the common prefix is removed • The following keys (length 36 bytes) – can, cannon, canter, cantor, capacity, capital – can be encoded as (length 27 bytes) - <0, can>, <3, non>, <3, ter>, <4, or>, <2, pacity>, <3, ital> • How would you decide which of these two techniques to use in a particular situation?

ECE 569 Database System Engineering Spring 2003

ECE 569 Database System Engineering Spring 2003

Presentation Transcript

ECE 242 Spring 2003 Data Structures in Java

Software Engineering 2 Spring 2003 The development process

Software Engineering 2 Spring 2003

Software Engineering 2 Spring 2003

Gigabit Ethernet Design ECE 4006C – Spring 2003

ECE 569 Database System Engineering Spring 2004

ECE 569 High Performance Processors and Systems

ECE Engineering Model

ECE 569 High Performance Processors and Systems

ECE 569 High Performance Processors and Systems

ECE Engineering Model

ECE 569 Database System Engineering Fall 2004

ECE Engineering Model

ECE 569 Database System Engineering Fall 2004 Topic VIII: Query Execution and optimization

ECE 569 Database System Engineering Spring 2004 Topic VIII: Query Execution and optimization

Spring 2003

ECE 569 Database System Engineering Spring 2003

Computers in Civil Engineering 53:081 Spring 2003

Spring 2003

ECE Engineering Model

Gigabit Ethernet Design ECE 4006C – Spring 2003