Data structures

Data Structures. Lecture 5: B-Trees. Haim Kaplan and Uri Zwick, November 2012.


Data Structures

Lecture 5

B-Trees

Haim Kaplan and Uri Zwick, November 2012


A 4-node

Keys: 10, 25, 42 (3 keys, 4-way branch)

Subtrees: key < 10 | 10 < key < 25 | 25 < key < 42 | 42 < key


An r-node

Keys: $k_0, k_1, k_2, \ldots, k_{r-3}, k_{r-2}$ (r−1 keys)

Children: $c_0, c_1, c_2, \ldots, c_{r-2}, c_{r-1}$ (r-way branch)


B-Trees (with minimum degree d)

Each node holds between d−1 and 2d−1 keys

Each non-leaf node has between d and 2d children

The root is special: it has between 1 and 2d−1 keys

and between 2 and 2d children (if not a leaf)

All leaves are at the same depth


A 2-4 tree

B-Tree with minimal degree d=2

Tree: root 13; internal nodes 4 6 10 | 15 28; leaves 1 3 | 5 | 7 | 11 | 14 | 16 17 | 30 40 50


Node structure

[Figure: an r-node with keys $k_0, k_1, k_2, \ldots, k_{r-3}, k_{r-2}$ and children $c_0, c_1, c_2, \ldots, c_{r-2}, c_{r-1}$]

r – the degree

key[0],…,key[r−2] – the keys

item[0],…,item[r−2] – the associated items

child[0],…,child[r−1] – the children

leaf – is the node a leaf?

Possibly a different representation for leaves
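
A node with these fields could be sketched in Python as follows. This is only an illustration: the class name `BTreeNode` and the `degree` property are assumptions, and the field names simply mirror the slide.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class BTreeNode:
    """One B-tree node; field names follow the slide (key, item, child, leaf)."""
    key: List[Any] = field(default_factory=list)    # key[0..r-2], kept sorted
    item: List[Any] = field(default_factory=list)   # item[i] belongs to key[i]
    child: List["BTreeNode"] = field(default_factory=list)  # child[0..r-1], empty in a leaf
    leaf: bool = True

    @property
    def degree(self) -> int:
        """r: the number of branches, one more than the number of keys."""
        return len(self.key) + 1


# The 4-node from the opening slide: 3 keys, hence a 4-way branch.
# The items "a", "b", "c" are made-up placeholders.
node = BTreeNode(key=[10, 25, 42], item=["a", "b", "c"])
```

Storing `leaf` explicitly, rather than deriving it from an empty `child` list, keeps the door open for the separate leaf representation the slide mentions.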


The height of B-Trees

  • At depth 1 we have at least 2 nodes

  • At depth 2 we have at least $2d$ nodes

  • At depth 3 we have at least $2d^2$ nodes

  • At depth $h$ we have at least $2d^{h-1}$ nodes
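
These per-level counts lead to the usual height bound. A short derivation, assuming $n$ is the number of keys, $h$ is the depth of the leaves (the root is at depth 0), the root holds at least 1 key, and every other node holds at least $d-1$ keys:

```latex
\begin{align*}
n &\ge 1 + (d-1)\sum_{i=1}^{h} 2d^{\,i-1}
   = 1 + 2(d-1)\cdot\frac{d^{h}-1}{d-1}
   = 2d^{h} - 1, \\
h &\le \log_d \frac{n+1}{2}.
\end{align*}
```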


Look for k in node x

Look for k in the subtree of node x

Number of nodes accessed: $O(\log_d n)$

Number of operations: $O(d \log_d n)$

Number of operations with binary search: $O(\log_2 d \cdot \log_d n) = O(\log_2 n)$
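
The search itself is short. Below is a minimal sketch, assuming keys only (no items) and a hypothetical `Node` class; `bisect_left` from the standard library plays the role of the binary search inside a node.

```python
from bisect import bisect_left


class Node:
    """Hypothetical keys-only node; a node with no children is a leaf."""
    def __init__(self, key, child=None):
        self.key = key
        self.child = child or []
        self.leaf = not self.child


def search(x, k):
    """Return (node, index) for key k in the subtree of x, or None."""
    while True:
        i = bisect_left(x.key, k)            # look for k in node x
        if i < len(x.key) and x.key[i] == k:
            return x, i
        if x.leaf:
            return None
        x = x.child[i]                       # continue in the subtree: one more node access


# The 2-4 tree from the slides.
root = Node([13], [
    Node([4, 6, 10], [Node([1, 3]), Node([5]), Node([7]), Node([11])]),
    Node([15, 28], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
hit = search(root, 17)    # found in the leaf 16 17
miss = search(root, 12)   # 12 is not in the tree
```

Each loop iteration touches exactly one node, which is why the number of nodes accessed is bounded by the height of the tree.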


B-Trees vs binary search trees

  • Wider and shallower

  • Access fewer nodes during search

  • But may take more operations


B-Trees – What are they good for?


The hardware structure

CPU → Cache → RAM → Disk

Each memory level is much larger but much slower

Information is moved in blocks


A simplified I/O model

CPU ↔ RAM ↔ Disk

Each block is of size m.

Count both operations and I/O operations


Data structures in the I/O model

Each node (struct) is allocated contiguously.

Harder to control which disk blocks contain different nodes

Linked lists and search trees behave poorly in the I/O model.

Each pointer followed may cause a disk access

Pick d such that a node fits in a block

B-trees reduce the worst-case number of I/Os
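
As a rough illustration of picking d, the sketch below computes the largest minimum degree for which a full node fits in one block. The function name and all byte sizes are assumptions chosen for the example, and per-node overhead such as the leaf flag is ignored.

```python
def max_min_degree(block_size, key_size, item_size, pointer_size):
    """Largest d such that 2d-1 keys, 2d-1 items and 2d child pointers fit in a block."""
    per_slot = key_size + item_size + pointer_size
    # (2d-1)*(key+item) + 2d*pointer <= block
    #   <=>  2d * per_slot <= block + key + item
    return (block_size + key_size + item_size) // (2 * per_slot)


# Assumed sizes: 4096-byte block, 8-byte keys, 8-byte items, 8-byte pointers.
d = max_min_degree(4096, 8, 8, 8)
full_node_bytes = (2 * d - 1) * (8 + 8) + 2 * d * 8
```

With larger items the same block holds fewer keys, so d shrinks and the tree gets taller.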


Look for k in node x

Look for k in the subtree of node x

Number of nodes accessed (I/Os): $O(\log_d n)$

Number of operations: $O(d \log_d n)$

Number of operations with binary search: $O(\log_2 d \cdot \log_d n) = O(\log_2 n)$


Red-Black Trees vs. B-Trees

$n = 2^{30} \approx 10^9$

30 ≤ height of Red-Black Tree ≤ 60

Up to 60 pages read from disk

Height of B-Tree with d=1000 is only 3

Each B-Tree node resides in a block/page

Only 3 (or 4) pages read from disk

Disk access ≈ 1 millisecond ($10^{-3}$ sec)

Memory access ≈ 100 nanoseconds ($10^{-7}$ sec)


B-Trees – What are they good for?

  • Large-degree B-trees are used to represent very large disk dictionaries. The minimum degree d is chosen according to the size of a disk block.

  • Smaller-degree B-trees are used for internal-memory dictionaries to overcome cache-miss penalties.

  • B-trees with d=2, i.e., 2-4 trees, are very similar to Red-Black trees.


Updates to a B-tree


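Rotate/Steal right, Rotate/Steal left

[Figure: Rotate/Steal right and its inverse, Rotate/Steal left, shown on two adjacent siblings and their parent; labels A and B]

Number of operations: $O(d)$

Number of I/Os: $O(1)$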
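
A steal can be sketched as follows. The `Node` class and the function names `steal_right` and `steal_left` are assumptions, and so is the direction convention: here "right" means that a key travels from the left sibling, through the parent, to the right sibling.

```python
class Node:
    """Hypothetical keys-only node; an empty child list means a leaf."""
    def __init__(self, key=None, child=None):
        self.key = list(key or [])
        self.child = list(child or [])


def steal_right(x, i):
    """Rotate right around x.key[i]: child[i] gives its last key to child[i+1]."""
    left, right = x.child[i], x.child[i + 1]
    right.key.insert(0, x.key[i])        # the separator moves down to the right child
    x.key[i] = left.key.pop()            # the last key of the left child moves up
    if left.child:                       # internal nodes also hand over a subtree
        right.child.insert(0, left.child.pop())


def steal_left(x, i):
    """Rotate left around x.key[i]: child[i+1] gives its first key to child[i]."""
    left, right = x.child[i], x.child[i + 1]
    left.key.append(x.key[i])            # the separator moves down to the left child
    x.key[i] = right.key.pop(0)          # the first key of the right child moves up
    if right.child:
        left.child.append(right.child.pop(0))


# A leaf that has just lost its only key steals from its right sibling 30 40 50.
x = Node([22, 28], [Node([20]), Node([]), Node([30, 40, 50])])
steal_left(x, 1)
after_steal = (x.key[:], [c.key[:] for c in x.child])
```

Only three nodes are touched, which matches the O(1) I/O count; the O(d) operations come from shifting keys inside the arrays.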


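Split, Join

[Figure: Split turns one node holding A, B, C, where A and C have d−1 keys each, into two nodes A and C, with the key B moved up into the parent; Join is the reverse operation]

Number of operations: $O(d)$

Number of I/Os: $O(1)$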
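
Split and join can be sketched as a pair of inverse functions. The names `Node`, `split` and `join` are assumptions, and only keys are stored.

```python
class Node:
    """Hypothetical keys-only node; an empty child list means a leaf."""
    def __init__(self, key=None, child=None):
        self.key = list(key or [])
        self.child = list(child or [])


def split(x, i, d):
    """Split the full child x.child[i] (2d-1 keys); its median moves up into x."""
    y = x.child[i]
    z = Node(y.key[d:], y.child[d:])           # right half: d-1 keys
    x.key.insert(i, y.key[d - 1])              # median becomes a separator in x
    x.child.insert(i + 1, z)
    y.key, y.child = y.key[:d - 1], y.child[:d]  # left half: d-1 keys


def join(x, i):
    """Join x.child[i], the separator x.key[i] and x.child[i+1] into one node."""
    left, right = x.child[i], x.child[i + 1]
    left.key += [x.key.pop(i)] + right.key
    left.child += right.child
    del x.child[i + 1]


d = 2
x = Node([13], [Node([5, 7, 10]), Node([15, 28])])
split(x, 0, d)
after_split = (x.key[:], [c.key[:] for c in x.child])
join(x, 0)
after_join = (x.key[:], [c.key[:] for c in x.child])
```

Running split and then join on the same position returns the original node, which is a convenient sanity check for either function.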


Insert

Tree: root 13; internal nodes 5 10 | 15 28; leaves 1 3 | 6 | 11 | 14 | 16 17 | 30 40 50

Insert(T,2): the leaf 1 3 becomes 1 2 3

Insert(T,4): the leaf 1 2 3 becomes 1 2 3 4, which overflows

Split

The leaf 1 2 3 4 is split into 1 and 3 4, and the key 2 moves up: the parent 5 10 becomes 2 5 10

Tree: root 13; internal nodes 2 5 10 | 15 28; leaves 1 | 3 4 | 6 | 11 | 14 | 16 17 | 30 40 50

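Splitting an overflowing node

[Figure: an overflowing node holding A, B, C with 2d keys in total is split into two nodes A and C, one with d keys and one with d−1 keys; the key B moves up into the parent]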


Another insert

Insert(T,7): the leaf 6 becomes 6 7

and another insert

Insert(T,8): the leaf 6 7 becomes 6 7 8

and the last for today

Insert(T,9): the leaf 6 7 8 becomes 6 7 8 9, which overflows

Split

The leaf 6 7 8 9 is split into 6 and 8 9, and the key 7 moves up: the parent 2 5 10 becomes 2 5 7 10, which overflows in turn

Split

The node 2 5 7 10 is split into 2 and 7 10, and the key 5 moves up: the root 13 becomes 5 13

Tree: root 5 13; internal nodes 2 | 7 10 | 15 28; leaves 1 | 3 4 | 6 | 8 9 | 11 | 14 | 16 17 | 30 40 50


Insert – Bottom up

  • Find the insertion point by a downward search

  • Insert the key in the appropriate place

  • If the current node is overflowing, split it

  • If its parent is now overflowing, split it, etc.

  • Disadvantages:

      • Need both a downward scan and an upward scan

      • Need to keep parents on a stack

      • Nodes are temporarily overflowing
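
The bottom-up procedure can be sketched recursively, so that the call stack plays the role of the stack of parents. This is an illustration under stated assumptions: keys only, hypothetical names (`Node`, `split_overflowing`, `insert_rec`, `insert`), and an overflowing node of 2d keys split into d−1 keys on the left and d keys on the right, as in the worked example.

```python
from bisect import bisect_left


class Node:
    """Hypothetical keys-only node; an empty child list means a leaf."""
    def __init__(self, key=None, child=None):
        self.key = list(key or [])
        self.child = list(child or [])

    @property
    def leaf(self):
        return not self.child


def split_overflowing(x, d):
    """Split a node holding 2d keys; return (median, new right node)."""
    mid = d - 1
    median = x.key[mid]
    right = Node(x.key[mid + 1:], x.child[mid + 1:])   # d keys
    x.key, x.child = x.key[:mid], x.child[:mid + 1]    # d-1 keys stay in x
    return median, right


def insert_rec(x, k, d):
    """Insert k below x; return (median, right) if x overflowed and was split."""
    i = bisect_left(x.key, k)
    if x.leaf:
        x.key.insert(i, k)                       # may overflow temporarily
    else:
        up = insert_rec(x.child[i], k, d)        # downward scan
        if up:                                   # upward scan: absorb the split
            median, right = up
            x.key.insert(i, median)
            x.child.insert(i + 1, right)
    if len(x.key) > 2 * d - 1:
        return split_overflowing(x, d)
    return None


def insert(root, k, d):
    """Insert k and return the (possibly new) root."""
    up = insert_rec(root, k, d)
    if up:
        median, right = up
        root = Node([median], [root, right])     # the tree grows by one level
    return root


# The example tree from the slides, followed by the same five insertions.
root = Node([13], [
    Node([5, 10], [Node([1, 3]), Node([6]), Node([11])]),
    Node([15, 28], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
for k in (2, 4, 7, 8, 9):
    root = insert(root, k, 2)
```

After the five insertions the root holds 5 13, matching the last slide of the example.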


Insert – Top down

  • While conducting the search, split full children on the search path before descending to them!

  • When the appropriate leaf is reached, it is not full, so the new key may be added!


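Split-Root(T)

[Figure: Split-Root(T) splits a full root around its median key C; C becomes the only key of a new root T.root, whose two children hold d−1 keys each]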


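Split-Child(x,i)

[Figure: Split-Child(x,i) splits the full child x.child[i], which holds A, B, C with d−1 keys in each of A and C; the key B moves up into x next to key[i], and A and C become adjacent children of x]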


Insert – Top down

  • While conducting the search, split full children on the search path before descending to them!

Number of I/Os: $O(\log_d n)$

Number of operations: $O(d \log_d n)$
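
A single-pass, top-down insertion might look like the sketch below. It assumes keys only and distinct keys; `Node`, `split_child` and `insert` are hypothetical names modelled on Split-Child and Split-Root from the slides.

```python
from bisect import bisect_left


class Node:
    """Hypothetical keys-only node; an empty child list means a leaf."""
    def __init__(self, key=None, child=None):
        self.key = list(key or [])
        self.child = list(child or [])

    @property
    def leaf(self):
        return not self.child


def split_child(x, i, d):
    """Split the full child x.child[i] (2d-1 keys) around its median key."""
    y = x.child[i]
    z = Node(y.key[d:], y.child[d:])
    x.key.insert(i, y.key[d - 1])
    x.child.insert(i + 1, z)
    y.key, y.child = y.key[:d - 1], y.child[:d]


def insert(root, k, d):
    """Insert k in one downward pass and return the (possibly new) root."""
    if len(root.key) == 2 * d - 1:               # Split-Root
        root = Node([], [root])
        split_child(root, 0, d)
    x = root
    while not x.leaf:
        i = bisect_left(x.key, k)
        if len(x.child[i].key) == 2 * d - 1:     # split before descending
            split_child(x, i, d)
            if k > x.key[i]:                     # the median moved up to x.key[i]
                i += 1
        x = x.child[i]
    x.key.insert(bisect_left(x.key, k), k)       # the leaf is guaranteed not full
    return root


# Same starting tree and insertions as in the worked example.
root = Node([13], [
    Node([5, 10], [Node([1, 3]), Node([6]), Node([11])]),
    Node([15, 28], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
for k in (2, 4, 7, 8, 9):
    root = insert(root, k, 2)
```

No parent stack is needed and no node ever overflows, at the price of occasionally splitting a node that would not have needed it.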


Deletions from B-Trees

Tree: root 7 15; internal nodes 3 | 10 13 | 22 28; leaves 1 2 | 4 6 | 8 9 | 11 12 | 14 | 20 | 24 26 | 30 40 50

Delete

delete(T,26): the leaf 24 26 becomes 24

Delete (Replace with predecessor)

delete(T,13): 13 is in an internal node; it is replaced by its predecessor 12, so 10 13 becomes 10 12, and 12 is removed from its leaf, so 11 12 becomes 11

Delete (steal from sibling)

delete(T,24): the leaf 24 becomes empty; it steals from its sibling 30 40 50: 28 moves down into the leaf, 30 moves up, and the parent 22 28 becomes 22 30

Tree: root 7 15; internal nodes 3 | 10 12 | 22 30; leaves 1 2 | 4 6 | 8 9 | 11 | 14 | 20 | 28 | 40 50


Rotate/Steal right, Rotate/Steal left

[Figure: Rotate/Steal right and its inverse, Rotate/Steal left; labels A and B]


Delete (Join)

delete(T,20): the leaf 20 becomes empty, and its sibling 28 has no key to spare; the two leaves are joined with the separating key 22 into the leaf 22 28, and the parent 22 30 becomes 30

Tree: root 7 15; internal nodes 3 | 10 12 | 30; leaves 1 2 | 4 6 | 8 9 | 11 | 14 | 22 28 | 40 50


Few more..

delete(T,22): the leaf 22 28 becomes 28

delete(T,28): the leaf 28 becomes empty

Stealing again

The empty leaf steals from its sibling 40 50: 30 moves down into the leaf, 40 moves up, and the parent 30 becomes 40

Tree: root 7 15; internal nodes 3 | 10 12 | 40; leaves 1 2 | 4 6 | 8 9 | 11 | 14 | 30 | 50


Another one

delete(T,30): the leaf 30 becomes empty, and its sibling 50 has no key to spare

After Join

The empty leaf and 50 are joined with the separating key 40 into the leaf 40 50; their parent is left with no keys

Now we can steal

The empty internal node steals from its sibling 10 12: 15 moves down from the root, 12 moves up, and the leaf 14 moves over with it

Tree: root 7 12; internal nodes 3 | 10 | 15; leaves 1 2 | 4 6 | 8 9 | 11 | 14 | 40 50

More ?

delete(T,40)


Delete – Top down

  • Assume, at first, that the item to be deleted is in a leaf

  • While conducting the search, make sure that each child descended into contains at least d keys

  • How?

  • Steal or join

  • When the item is located, it resides in a leaf containing at least d keys, so it can be removed


Delete – Top down

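  • While conducting the search, make sure that each child you descend to contains at least d keys

[Figure: a child with d−1 keys whose sibling has at least d keys: Rotate! (Steal). A child with d−1 keys whose sibling also has d−1 keys: Join!]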


Delete – Top down

  • What if the item to be deleted is in an internal node?

  • Descend as before from the root until the item to be deleted is located

  • Keep a pointer to the node containing the item

  • Carry on descending towards the successor, making sure that nodes contain at least d keys

  • When the successor is found, delete it from its leaf and use it to replace the item to be deleted


Deletions from B-Trees

As always, similar, but slightly more complicated than insertions

(may need to replace with successor)

Deletion is slightly simpler for B+-Trees


B-Trees vs. B+-Trees

  • In a B-tree, each node contains items and keys.

  • In a B+-tree, leaves contain items and keys. Internal nodes contain only keys to direct the search.

  • Keys in internal nodes are either keys of existing items, or keys of items that were deleted.

  • Internal nodes may contain more keys, so overall the number of items we can store increases.
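
The last point can be made concrete with a little arithmetic. The sketch compares how many children fit in one block when internal nodes store items (B-tree) and when they store keys only (B+-tree); the function names and all byte sizes are assumptions chosen for illustration.

```python
def btree_fanout(block, key, item, pointer):
    """Largest r with (r-1) keys, (r-1) items and r pointers in one block."""
    return (block + key + item) // (key + item + pointer)


def bplus_internal_fanout(block, key, pointer):
    """Largest r with (r-1) keys and r pointers in one block (no items)."""
    return (block + key) // (key + pointer)


# Assumed sizes: 4096-byte block, 8-byte keys, 100-byte items, 8-byte pointers.
b_fanout = btree_fanout(4096, 8, 100, 8)
bplus_fanout = bplus_internal_fanout(4096, 8, 8)
```

With these assumed sizes the B+-tree's internal nodes branch about seven times more widely, so the same number of levels can index far more leaves.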

