Data structures

Data Structures. Lecture 5: B-Trees. Eran Halperin and Hanoch Levy, March 2014. How does a binary tree compare with a k-ary tree? Binary worse: higher height (cost is log k). Binary better: lower width (cost is k). Overall: binary better! So why bother with k-ary?

Presentation Transcript
Data Structures

Lecture 5

B-Trees

Eran Halperin and Hanoch Levy

March 2014


How does a binary tree compare with a k-ary tree?

  • Binary worse: higher height

    • Cost is log k

  • Binary better: lower width

    • Cost is k

  • OVERALL: BINARY BETTER!

  • SO WHY BOTHER WITH K-ary?
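
The trade-off on this slide can be checked numerically. The sketch below assumes a perfectly balanced tree and a linear scan of the k − 1 keys in each visited node; the function name `search_cost` is illustrative and not part of the lecture.

```python
import math

def search_cost(n: int, k: int) -> float:
    """Approximate comparisons for one search in a balanced k-ary tree.

    Assumes height log_k(n) and a linear scan of the k - 1 keys per node.
    """
    height = math.log(n) / math.log(k)
    return (k - 1) * height

n = 2**20
for k in (2, 4, 16, 256):
    print(f"k={k:3d}: about {search_cost(n, k):.0f} comparisons")
```

Under this unit-cost count the binary tree needs the fewest comparisons, which is the slide's point; the picture changes once memory accesses stop being uniform in cost.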


Idealized computation model

CPU ↔ RAM

Each instruction takes one unit of time

Each memory access takes one unit of time


A more realistic model

CPU ↔ Cache ↔ RAM ↔ Disk

Each level is much larger but much slower

Information is moved in blocks


A simplified I/O model

CPU ↔ RAM ↔ Disk

Each block is of size m.

Count both operations and I/O operations


Data structures in the I/O model

Linked lists and search trees behave poorly in the I/O model.

Each pointer followed may cause a disk access

We need an alternative to binary search trees that is more suited to the I/O model

B-Trees!


A 4-node

Keys: [10 25 42]

Subtrees: key < 10 | 10 < key < 25 | 25 < key < 42 | 42 < key

3 keys

4-way branch


An r-node

Keys: k_0, k_1, k_2, …, k_{r−3}, k_{r−2}

Children: c_0, c_1, c_2, …, c_{r−2}, c_{r−1}

r−1 keys

r-way branch


B-Trees (with minimum degree d)

Each node holds between d−1 and 2d−1 keys

Each non-leaf node has between d and 2d children

The root is special: it has between 1 and 2d−1 keys

and between 2 and 2d children (if not a leaf)

All leaves are at the same depth
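
These structural rules are easy to state as a checker. The following is a minimal sketch, assuming a hypothetical `Node` class with `keys` and `children` lists (names chosen here for illustration); it checks only the size and depth rules listed above, not the ordering of keys.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

def is_valid_btree(root, d):
    """Check key counts, child counts and equal leaf depth for minimum degree d."""
    leaf_depths = set()

    def visit(x, depth, is_root):
        min_keys = 1 if is_root else d - 1       # the root is special
        if not min_keys <= len(x.keys) <= 2 * d - 1:
            return False
        if x.leaf:
            leaf_depths.add(depth)
            return True
        if len(x.children) != len(x.keys) + 1:   # r-1 keys, r-way branch
            return False
        return all(visit(c, depth + 1, False) for c in x.children)

    return visit(root, 0, True) and len(leaf_depths) == 1
```

Because a non-leaf node with j keys must have j + 1 children, the bounds on children (d to 2d, or 2 to 2d for the root) follow from the bounds on keys.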


A 2-4 tree

B-Tree with minimum degree d=2

Root: [13]
Children of the root: [4 6 10], [15 28]
Leaves under [4 6 10]: [1 3], [5], [7], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]


Node structure

Keys: k_0, k_1, k_2, …, k_{r−3}, k_{r−2}

Children: c_0, c_1, c_2, …, c_{r−2}, c_{r−1}

r – the degree

key[0], …, key[r−2] – the keys

item[0], …, item[r−2] – the associated items

child[0], …, child[r−1] – the children

leaf – is the node a leaf?

Possibly a different representation for leaves
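
One possible in-memory rendering of this node layout is sketched below. The class name `BTreeNode` and the plural field names are assumptions made for the example; the slide only fixes the fields key, item, child and leaf.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class BTreeNode:
    keys: List[Any] = field(default_factory=list)       # key[0..r-2]
    items: List[Any] = field(default_factory=list)      # item[0..r-2]
    children: List["BTreeNode"] = field(default_factory=list)  # child[0..r-1]
    leaf: bool = True

    @property
    def degree(self) -> int:
        """r, the branching factor: one more than the number of keys."""
        return len(self.keys) + 1

node = BTreeNode(keys=[10, 25, 42], items=["a", "b", "c"])
print(node.degree)
```

Here the degree is derived from the key list rather than stored, so it cannot drift out of sync with the keys.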


The height of B-Trees

  • At depth 1 we have at least 2 nodes

  • At depth 2 we have at least 2d nodes

  • At depth 3 we have at least 2d^2 nodes

  • At depth h we have at least 2d^(h−1) nodes
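
Summing these per-depth counts gives the usual bound on the height. The derivation below is a standard one, written out as a sketch; it assumes, as in the definition of B-Trees, that the root holds at least one key and every other node at least d−1 keys, with n the number of keys and h the height.

```latex
\begin{align*}
n &\ge 1 + (d-1)\sum_{i=1}^{h} 2d^{\,i-1}
   = 1 + 2(d-1)\cdot\frac{d^{h}-1}{d-1}
   = 2d^{h} - 1, \\
h &\le \log_d \frac{n+1}{2}.
\end{align*}
```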


Red-Black Trees vs. B-Trees

n = 2^30 ≈ 10^9

30 ≤ height of Red-Black Tree ≤ 60

Up to 60 pages read from disk

Height of B-Tree with d=1000 is only 3

Each B-Tree node resides in a block/page

Only 3 (or 4) pages read from disk

Disk access ≈ 1 millisecond (10^−3 sec)

Memory access ≈ 100 nanoseconds (10^−7 sec)
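
The numbers on this slide can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that a B-Tree of height h holds at least 2d^h − 1 keys and that the slide's "3" counts the nodes on a root-to-leaf path; `btree_height_bound` is a name introduced here.

```python
def btree_height_bound(n: int, d: int) -> int:
    """Largest h with 2*d**h - 1 <= n: the tallest B-tree that n keys can fill."""
    h = 0
    while 2 * d ** (h + 1) - 1 <= n:
        h += 1
    return h

n, d = 2**30, 1000
DISK_ACCESS_SEC = 1e-3    # the slide's rough figure for one disk access

btree_pages = btree_height_bound(n, d) + 1   # nodes on a root-to-leaf path
red_black_pages = 60                         # worst case quoted on the slide

print("B-Tree:    ", btree_pages, "pages,", btree_pages * DISK_ACCESS_SEC, "sec")
print("Red-Black: ", red_black_pages, "pages,", red_black_pages * DISK_ACCESS_SEC, "sec")
```

Integer arithmetic is used for the bound so that the result does not depend on floating-point rounding of logarithms.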


Look for k in node x

Look for k in the subtree of node x

Number of I/Os – log_d n

Number of operations – O(d log_d n)

Number of ops with binary search – O(log_2 d · log_d n) = O(log_2 n)
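
The search described here can be sketched as follows, using binary search inside each node. The `Node` class is a hypothetical stand-in for a disk block, and each step down to a child corresponds to one I/O in the model.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

def search(x, k):
    """Return (node, index) with node.keys[index] == k, or None if k is absent."""
    while True:
        i = bisect_left(x.keys, k)          # O(log d) comparisons in this node
        if i < len(x.keys) and x.keys[i] == k:
            return x, i
        if x.leaf:
            return None
        x = x.children[i]                   # one I/O in the disk model

# The 2-4 tree from the earlier slide.
root = Node([13], [
    Node([4, 6, 10], [Node([1, 3]), Node([5]), Node([7]), Node([11])]),
    Node([15, 28], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
found = search(root, 16)
```

Replacing `bisect_left` with a linear scan gives the O(d log_d n) operation count from the slide; the number of I/Os is the same either way.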


B-Trees – What are they good for?

  • Large degree B-trees are used to represent very large disk dictionaries. The minimum degree d is chosen according to the size of a disk block.

  • Smaller degree B-trees are used for internal-memory dictionaries to overcome cache-miss penalties.

  • B-trees with d=2, i.e., 2-4 trees, are very similar to Red-Black trees.


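Rotate right / Rotate left

[Figure: rotation diagram with keys A and B. Rotate right turns the left configuration (A, B) into the right one (B, A); Rotate left is the inverse.]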


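Split (a full node) / Join

[Figure: a full node with keys A, B, C, where A and C each stand for d−1 keys. Split moves B up into the parent and leaves two nodes of d−1 keys each; Join is the inverse.]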


Insert

Root: [13]
Children of the root: [5 10], [15 28]
Leaves under [5 10]: [1 3], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,2)


Insert

Root: [13]
Children of the root: [5 10], [15 28]
Leaves under [5 10]: [1 2 3], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,2)


Insert

Root: [13]
Children of the root: [5 10], [15 28]
Leaves under [5 10]: [1 2 3], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,4)


Insert

Root: [13]
Children of the root: [5 10], [15 28]
Leaves under [5 10]: [1 2 3 4], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,4)


Split

Root: [13]
Children of the root: [5 10], [15 28]
Leaves under [5 10]: [1 2 3 4], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,4)


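Split

Root: [13]
Children of the root: [5 10], [15 28]
Key moving up into [5 10]: 2
Leaves under [5 10]: [1], [3 4], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,4)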


Split

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,4)


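Splitting an overflowing node

[Figure: an overflowing node with 2d keys is split around a middle key B, which moves up into the parent; one of the two resulting nodes keeps d keys and the other d−1 keys.]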


Another insert

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,7)


Another insert

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6 7], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,7)


and another insert

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6 7], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,8)


and another insert

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6 7 8], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,8)


and the last for today

Root: [13]
Children of the root: [2 5 10], [15 28]
Leaves under [2 5 10]: [1], [3 4], [6 7 8 9], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,9)


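Split

Root: [13]
Children of the root: [2 5 10], [15 28]
Key moving up into [2 5 10]: 7
Leaves under [2 5 10]: [1], [3 4], [6], [8 9], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,9)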


Split

Root: [13]
Children of the root: [2 5 7 10], [15 28]
Leaves under [2 5 7 10]: [1], [3 4], [6], [8 9], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,9)


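Split

Root: [13]
Key moving up into the root: 5
Children of the root: [2], [7 10], [15 28]
Leaves under [2]: [1], [3 4]
Leaves under [7 10]: [6], [8 9], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,9)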


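Split

Root: [5 13]
Children of the root: [2], [7 10], [15 28]
Leaves under [2]: [1], [3 4]
Leaves under [7 10]: [6], [8 9], [11]
Leaves under [15 28]: [14], [16 17], [30 40 50]

Insert(T,9)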


Insert – Bottom up

  • Find the insertion point by a downward search

  • Insert the key in the appropriate place

  • If the current node is overflowing, split it

  • If its parent is now overflowing, split it, etc.

  • Disadvantages:

    • Need both a downward scan and an upward scan

    • Nodes are temporarily overflowing

    • Need to keep parents on a stack
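
The bottom-up procedure can be sketched as below. The `Node` class and the function name `insert_bottom_up` are assumptions for this example, and the split convention (left part keeps d−1 keys, right part d keys) is chosen to match the worked example in the preceding slides.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

def insert_bottom_up(root, k, d):
    """Insert k and return the (possibly new) root."""
    # Downward scan, remembering (parent, child index) pairs on a stack.
    stack = []
    x = root
    while not x.leaf:
        i = bisect_left(x.keys, k)
        stack.append((x, i))
        x = x.children[i]
    x.keys.insert(bisect_left(x.keys, k), k)

    # Upward scan: split nodes that temporarily hold 2d keys.
    while len(x.keys) > 2 * d - 1:
        median = x.keys[d - 1]
        right = Node(x.keys[d:], x.children[d:])
        x.keys, x.children = x.keys[:d - 1], x.children[:d]
        if stack:
            parent, i = stack.pop()
            parent.keys.insert(i, median)
            parent.children.insert(i + 1, right)
            x = parent
        else:
            root = Node([median], [x, right])   # the tree grows by one level
            x = root
    return root

# Starting tree and insertions from the slides above (d = 2).
root = Node([13], [
    Node([5, 10], [Node([1, 3]), Node([6]), Node([11])]),
    Node([15, 28], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
for key in (2, 4, 7, 8, 9):
    root = insert_bottom_up(root, key, 2)
```

The explicit `stack` and the second loop are exactly the two disadvantages listed on the slide: parents must be remembered, and a second pass walks back up.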


Split-Root(T)

[Figure: the full root T.root is split into two nodes of d−1 keys each, and its middle key becomes the only key of a new root.]

Number of I/Os – O(1)

Number of operations – O(d)


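Split-Child(x,i)

[Figure: x.child[i] is a full node with keys A, B, C, where A and C each stand for d−1 keys. The middle key B moves up into x as key[i]; A stays in x.child[i] and C moves to a new child immediately to its right.]

Number of I/Os – O(1)

Number of operations – O(d)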
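
A minimal sketch of Split-Child follows. It assumes a hypothetical `Node` class with `keys` and `children` lists and passes the minimum degree d explicitly; neither detail is prescribed by the slide.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

def split_child(x, i, d):
    """Split the full child x.children[i] (2d-1 keys) around its middle key."""
    y = x.children[i]
    assert len(y.keys) == 2 * d - 1, "only full nodes are split"
    z = Node(y.keys[d:], y.children[d:])   # C: the last d-1 keys (and d children)
    middle = y.keys[d - 1]                 # B: moves up into x
    y.keys = y.keys[:d - 1]                # A: the first d-1 keys stay in y
    y.children = y.children[:d]
    x.keys.insert(i, middle)
    x.children.insert(i + 1, z)

x = Node([5, 10], [Node([1, 2, 3]), Node([6]), Node([11])])
split_child(x, 0, 2)
```

Only x, the split child and the new node are touched, which is why the slide counts O(1) I/Os; the list slicing and insertion account for the O(d) operations.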


Insert – Top down

  • While conducting the search, split full children on the search path before descending to them!

Number of I/Os – O(log_d n)

Number of operations – O(d log_d n)

Amortized no. of splits – O(1)


Insert – Top down

Number of I/Os – O(log_d n)

Number of operations – O(d log_d n)

Amortized no. of splits – O(1)

  • Argument:

    • Each split increases # nodes by 1

    • # nodes <= # values = # inserts

    • ⇒ # splits <= # inserts
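
Top-down insertion can be sketched in a single downward pass. The classes `Node` and `BTree` and the helper `inorder` are illustrative names introduced here, and the sketch assumes distinct keys.

```python
from bisect import bisect_left, insort

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

class BTree:
    def __init__(self, d):
        self.d = d
        self.root = Node()

def split_child(x, i, d):
    """Split the full child x.children[i]; its middle key moves up into x."""
    y = x.children[i]
    z = Node(y.keys[d:], y.children[d:])
    x.keys.insert(i, y.keys[d - 1])
    x.children.insert(i + 1, z)
    y.keys, y.children = y.keys[:d - 1], y.children[:d]

def insert(T, k):
    d = T.d
    if len(T.root.keys) == 2 * d - 1:        # Split-Root: the tree grows upward
        T.root = Node([], [T.root])
        split_child(T.root, 0, d)
    x = T.root
    while not x.leaf:
        i = bisect_left(x.keys, k)
        if len(x.children[i].keys) == 2 * d - 1:
            split_child(x, i, d)             # split before descending
            if k > x.keys[i]:
                i += 1
        x = x.children[i]
    insort(x.keys, k)                        # the leaf is guaranteed not full

def inorder(x):
    """Keys of the subtree rooted at x in sorted order."""
    if x.leaf:
        return list(x.keys)
    out = []
    for i, c in enumerate(x.children):
        out += inorder(c)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out

T = BTree(2)
for key in (13, 5, 10, 15, 28, 1, 3, 6, 11, 14, 16, 17, 30, 40, 50):
    insert(T, key)
```

Because every node on the path is made non-full before it is entered, no parent stack and no upward pass are needed, in contrast to the bottom-up variant.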


Bottom-Up Deletions from B-Trees

As always, similar, but slightly more complicated than insertions

To delete an item in an internal node, replace it by its successor (or predecessor) and delete the successor (or predecessor)

To delete an item in a leaf, delete the relevant key, and if the leaf then has too few keys, fix the tree using rotations and joins.


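Split (a full node) / Join

[Figure: a full node with keys A, B, C, where A and C each stand for d−1 keys. Split moves B up into the parent and leaves two nodes of d−1 keys each; Join is the inverse.]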


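Rotate right / Rotate left

[Figure: rotation diagram with keys A and B. Rotate right turns the left configuration (A, B) into the right one (B, A); Rotate left is the inverse.]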


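Delete

Root: [7 15]
Children of the root: [3], [10 13], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 13]: [8 9], [11 12], [14]
Leaves under [22 28]: [20], [24 26], [30 40 50]

delete(T,26)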


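Delete

Root: [7 15]
Children of the root: [3], [10 13], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 13]: [8 9], [11 12], [14]
Leaves under [22 28]: [20], [24], [30 40 50]

delete(T,26)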


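Delete

Root: [7 15]
Children of the root: [3], [10 13], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 13]: [8 9], [11 12], [14]
Leaves under [22 28]: [20], [24], [30 40 50]

delete(T,13)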


Delete (Replace with predecessor)

Root: [7 15]
Children of the root: [3], [10 12], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11 12], [14]
Leaves under [22 28]: [20], [24], [30 40 50]

delete(T,13)


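Delete

Root: [7 15]
Children of the root: [3], [10 12], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 28]: [20], [24], [30 40 50]

delete(T,13)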


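Delete

Root: [7 15]
Children of the root: [3], [10 12], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 28]: [20], [24], [30 40 50]

delete(T,24)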


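Delete

Root: [7 15]
Children of the root: [3], [10 12], [22 28]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 28]: [20], [ ] (empty), [30 40 50]

delete(T,24)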


Delete (steal from sibling)

Root: [7 15]
Children of the root: [3], [10 12], [22 30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 30]: [20], [28], [40 50]

delete(T,24)


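Rotate right / Rotate left

[Figure: rotation diagram with keys A and B. Rotate right turns the left configuration (A, B) into the right one (B, A); Rotate left is the inverse.]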


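Delete

Root: [7 15]
Children of the root: [3], [10 12], [22 30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 30]: [20], [28], [40 50]

delete(T,20)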


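Delete

Root: [7 15]
Children of the root: [3], [10 12], [22 30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [22 30]: [ ] (empty), [28], [40 50]

delete(T,20)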


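Delete (Join)

Root: [7 15]
Children of the root: [3], [10 12], [30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [30]: [22 28], [40 50]

delete(T,20)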


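Split (a full node) / Join

[Figure: a full node with keys A, B, C, where A and C each stand for d−1 keys. Split moves B up into the parent and leaves two nodes of d−1 keys each; Join is the inverse.]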


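Few more..

Root: [7 15]
Children of the root: [3], [10 12], [30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [30]: [22 28], [40 50]

delete(T,22)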


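Few more..

Root: [7 15]
Children of the root: [3], [10 12], [30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [30]: [28], [40 50]

delete(T,22)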


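Few more..

Root: [7 15]
Children of the root: [3], [10 12], [30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [30]: [28], [40 50]

delete(T,28)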


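Few more..

Root: [7 15]
Children of the root: [3], [10 12], [30]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [30]: [ ] (empty), [40 50]

delete(T,28)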


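Stealing again

Root: [7 15]
Children of the root: [3], [10 12], [40]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [40]: [30], [50]

delete(T,28)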


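Another one

Root: [7 15]
Children of the root: [3], [10 12], [40]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [40]: [30], [50]

delete(T,30)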


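Another one

Root: [7 15]
Children of the root: [3], [10 12], [40]
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaves under [40]: [ ] (empty), [50]

delete(T,30)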


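After Join

Root: [7 15]
Children of the root: [3], [10 12], [ ] (empty)
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaf under the empty node: [40 50]

delete(T,30)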


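Now we can steal

Root: [7 15]
Children of the root: [3], [10 12], [ ] (empty)
Leaves under [3]: [1 2], [4 6]
Leaves under [10 12]: [8 9], [11], [14]
Leaf under the empty node: [40 50]

delete(T,30)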


Now we can steal

Root: [7 12]
Children of the root: [3], [10], [15]
Leaves under [3]: [1 2], [4 6]
Leaves under [10]: [8 9], [11]
Leaves under [15]: [14], [40 50]

delete(T,30)


More?

Root: [7 12]
Children of the root: [3], [10], [15]
Leaves under [3]: [1 2], [4 6]
Leaves under [10]: [8 9], [11]
Leaves under [15]: [14], [40 50]

delete(T,40)


Delete – Top down

  • Assume, at first, that the item to be deleted is in a leaf

  • While conducting the search, make sure that each child descended into contains at least d keys

    • How?

    • Use rotations or joins

  • When the item is located, it resides in a leaf containing at least d keys, so it can be removed


Delete – Top down

  • While conducting the search, make sure that each child you descend to contains at least d keys

Child has d−1 keys and a sibling has ≥ d keys: Rotate! (Steal)

Child has d−1 keys and its sibling has d−1 keys: Join!
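
The steal-or-join rule, together with the leaf case of top-down deletion, can be sketched as follows. All names (`Node`, `rotate_right`, `rotate_left`, `join`, `fix_child`, `delete_from_leaf`) are chosen for this example, and the sketch assumes the key to delete sits in a leaf; the internal-node case is not handled.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

    @property
    def leaf(self):
        return not self.children

def rotate_right(x, i):
    """Steal from the left sibling: its last key goes up, x.keys[i-1] comes down."""
    c, left = x.children[i], x.children[i - 1]
    c.keys.insert(0, x.keys[i - 1])
    x.keys[i - 1] = left.keys.pop()
    if not left.leaf:
        c.children.insert(0, left.children.pop())

def rotate_left(x, i):
    """Steal from the right sibling: its first key goes up, x.keys[i] comes down."""
    c, right = x.children[i], x.children[i + 1]
    c.keys.append(x.keys[i])
    x.keys[i] = right.keys.pop(0)
    if not right.leaf:
        c.children.append(right.children.pop(0))

def join(x, i):
    """Merge x.children[i], x.keys[i] and x.children[i+1] into one node."""
    left, right = x.children[i], x.children.pop(i + 1)
    left.keys += [x.keys.pop(i)] + right.keys
    left.children += right.children

def fix_child(x, i, d):
    """Make sure x.children[i] has at least d keys; return the index to descend to."""
    if len(x.children[i].keys) >= d:
        return i
    if i > 0 and len(x.children[i - 1].keys) >= d:
        rotate_right(x, i)
    elif i + 1 < len(x.children) and len(x.children[i + 1].keys) >= d:
        rotate_left(x, i)
    elif i > 0:
        join(x, i - 1)
        i -= 1
    else:
        join(x, i)
    return i

def delete_from_leaf(root, k, d):
    """Delete k, assumed to be in a leaf; return the (possibly new) root."""
    x = root
    while not x.leaf:
        i = fix_child(x, bisect_left(x.keys, k), d)
        child = x.children[i]
        if x is root and not x.keys:    # a join emptied the root: tree shrinks
            root = child
        x = child
    x.keys.remove(k)                    # raises ValueError if k is not in the leaf
    return root
```

Applied to the right-hand subtree of the deletion example (parent [22 28] with leaves [20], [24], [30 40 50]), deleting 24 triggers a steal from the right sibling before the descent, which yields the same [22 30] parent as the bottom-up slides.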


Delete – Top down

  • What if the item to be deleted is in an internal node?

    • Descend as before from the root until the item to be deleted is located

    • Keep a pointer to the node containing the item

    • Carry on descending towards the successor, making sure that nodes contain at least d keys

    • When the successor is found, delete it from its leaf and use it to replace the item to be deleted