Data Structures


Data Structures. Lecture 5: B-Trees. Haim Kaplan and Uri Zwick, November 2012.


Data Structures

Lecture 5

B-Trees

Haim Kaplan and Uri Zwick, November 2012

A 4-node

Keys: 10, 25, 42 (3 keys, 4-way branch)

Subtrees, left to right: key < 10; 10 < key < 25; 25 < key < 42; 42 < key

An r-node

Keys: k_0, k_1, k_2, …, k_{r−3}, k_{r−2} (r−1 keys)

Children: c_0, c_1, c_2, …, c_{r−2}, c_{r−1} (r-way branch)

B-Trees (with minimum degree d)

Each node holds between d−1 and 2d−1 keys

Each non-leaf node has between d and 2d children

The root is special: it has between 1 and 2d−1 keys, and between 2 and 2d children (if it is not a leaf)

All leaves are at the same depth

A 2-4 tree

B-Tree with minimum degree d=2

Root: [13]
Internal nodes: [4 6 10], [15 28]
Leaves: [1 3], [5], [7], [11], [14], [16 17], [30 40 50]

Node structure

An r-node: keys k_0, k_1, …, k_{r−2}; children c_0, c_1, …, c_{r−1}

r – the degree

key[0], …, key[r−2] – the keys

item[0], …, item[r−2] – the associated items

child[0], …, child[r−1] – the children

leaf – is the node a leaf?

Possibly a different representation for leaves
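
The node layout listed above could be written down roughly as follows. This is only a sketch: the class name `BTreeNode` and the `degree` property are illustrative assumptions, while the field names `key`, `item`, `child` and `leaf` follow the slide.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class BTreeNode:
    """One B-tree node; field names follow the slide, the class name is an assumption."""
    key: List[Any] = field(default_factory=list)    # key[0..r-2], kept sorted
    item: List[Any] = field(default_factory=list)   # item[j] is associated with key[j]
    child: List["BTreeNode"] = field(default_factory=list)  # child[0..r-1]; empty in a leaf
    leaf: bool = True

    @property
    def degree(self) -> int:
        """r, the number of branches: one more than the number of keys."""
        return len(self.key) + 1


# The 4-node with keys 10, 25, 42, stored as a leaf (items are placeholder strings).
node = BTreeNode(key=[10, 25, 42], item=["a", "b", "c"])
print(node.degree)  # 4
```

Keeping `child` empty in leaves is one simple way to realise the "different representation for leaves" remark; a disk-based implementation would more likely use two fixed-size record layouts.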

The height of B-Trees
  • At depth 1 we have at least 2 nodes
  • At depth 2 we have at least 2d nodes
  • At depth 3 we have at least 2d^2 nodes
  • At depth h we have at least 2d^(h−1) nodes
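
Summing these counts gives the usual height bound. The derivation below is a sketch under the assumptions that the tree holds n keys, has height h with the root at depth 0, that the root holds at least 1 key, and that every other node holds at least d−1 keys.

```latex
n \;\ge\; 1 + (d-1)\sum_{i=1}^{h} 2d^{\,i-1}
  \;=\; 1 + 2(d-1)\cdot\frac{d^{h}-1}{d-1}
  \;=\; 2d^{h} - 1
\qquad\Longrightarrow\qquad
h \;\le\; \log_d \frac{n+1}{2}
```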

Look for k in node x

Look for k in the subtree of node x

Number of nodes accessed – log_d n

Number of operations – O(d log_d n)

Number of ops with binary search – O(log_2 d · log_d n) = O(log_2 n)
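
A minimal sketch of the search with binary search inside each node is shown here. The `Node` class and the function name `search` are assumptions made for illustration, and items are left out so the example stays short.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Node:
    key: List[Any] = field(default_factory=list)
    child: List["Node"] = field(default_factory=list)
    leaf: bool = True


def search(x: Node, k: Any) -> Optional[Tuple[Node, int]]:
    """Return (node, index) of key k in the subtree of x, or None if k is absent."""
    while True:
        # Look for k in node x: binary search over the sorted keys.
        i = bisect_left(x.key, k)
        if i < len(x.key) and x.key[i] == k:
            return x, i
        if x.leaf:
            return None
        # Look for k in the subtree of x: child[i] holds the keys between key[i-1] and key[i].
        x = x.child[i]


# The 2-4 tree with root 13 from the earlier slide.
leaves = [Node(key=ks) for ks in ([1, 3], [5], [7], [11], [14], [16, 17], [30, 40, 50])]
left = Node(key=[4, 6, 10], child=leaves[:4], leaf=False)
right = Node(key=[15, 28], child=leaves[4:], leaf=False)
root = Node(key=[13], child=[left, right], leaf=False)

print(search(root, 17) is not None, search(root, 12) is None)  # True True
```

Each loop iteration touches one node, so the number of iterations is the number of nodes accessed; the `bisect_left` call is the O(log_2 d) step per node.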

B-Trees vs binary search trees
  • Wider and shallower
  • Access fewer nodes during search
  • But may take more operations

The hardware structure

CPU → Cache → RAM → Disk

Each memory level is much larger but much slower

Information is moved in blocks

A simplified I/O model

CPU → RAM → Disk

Each block is of size m.

Count both operations and I/O operations

Data structures in the I/O model

Each node (struct) is allocated contiguously.

Harder to control the disk blocks containing different nodes

Linked lists and search trees behave poorly in the I/O model.

Each pointer followed may cause a disk access

Pick d such that a node fits in a block
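
As a rough illustration of picking d, the sketch below assumes a full node stores 2d−1 keys, 2d−1 items and 2d child pointers and nothing else. The function name and the byte sizes in the example are hypothetical, not taken from the lecture.

```python
def max_min_degree(block_size: int, key_size: int, item_size: int, pointer_size: int) -> int:
    """Largest d such that a full node (2d-1 keys and items, 2d child pointers) fits in one block."""
    # Solve (2d-1)*(key+item) + 2d*pointer <= block_size for d.
    per_key = key_size + item_size
    return (block_size + per_key) // (2 * (per_key + pointer_size))


# Hypothetical sizes: 4096-byte blocks, 8-byte keys, items and pointers.
d = max_min_degree(block_size=4096, key_size=8, item_size=8, pointer_size=8)
print(d)  # 85
```

A real node layout would also reserve space for the leaf flag and a key count, which lowers d slightly.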

 B-trees reduce the worst case # of I/Os

Look for k in node x

Look for k in the subtree of node x

Number of nodes accessed (I/Os) – log_d n

Number of operations – O(d log_d n)

Number of ops with binary search – O(log_2 d · log_d n) = O(log_2 n)

Red-Black Trees vs. B-Trees

n = 2^30 ≈ 10^9

30 ≤ height of Red-Black Tree ≤ 60

Up to 60 pages read from disk

Height of B-Tree with d=1000 is only 3

Each B-Tree node resides in a block/page

Only 3 (or 4) pages read from disk

Disk access ≈ 1 millisecond (10^-3 sec)

Memory access ≈ 100 nanoseconds (10^-7 sec)
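
The figures on this slide can be reproduced with a few lines of arithmetic. The sketch assumes the standard B-tree height bound h ≤ log_d((n+1)/2) with the root at depth 0, and treats the red-black height as lying between about log_2 n and 2 log_2 n; the variable names are illustrative.

```python
import math

n = 2 ** 30
d = 1000

# Red-black tree with n keys: height between about log2(n) and 2*log2(n).
rb_min = math.floor(math.log2(n))
rb_max = 2 * rb_min

# B-tree: h <= log_d((n+1)/2), so a root-to-leaf path visits at most h+1 nodes.
h = math.floor(math.log((n + 1) / 2, d))
levels = h + 1

disk_ms = 1.0  # one disk access, as on the slide
print(rb_min, rb_max, levels)                # 30 60 3
print(rb_max * disk_ms, levels * disk_ms)    # worst-case disk time in ms
```

With one disk access per level, the worst case is about 60 ms for the red-black tree against about 3 ms for the B-tree.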

B-Trees – What are they good for?
  • Large degree B-trees are used to represent very large disk dictionaries. The minimum degree d is chosen according to the size of a disk block.
  • Smaller degree B-trees are used for internal-memory dictionaries to overcome cache-miss penalties.
  • B-trees with d=2, i.e., 2-4 trees, are very similar to Red-Black trees.

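Rotate/Steal right and Rotate/Steal left

[Figure: Rotate/Steal right and Rotate/Steal left, shown as inverse transformations on the nodes labelled A and B]

Number of operations – O(d)

Number of I/Os – O(1)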

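Split and Join

[Figure: Split and Join are inverse operations. Split moves the middle key B of a full node up into the parent, leaving two nodes A and C with d−1 keys each; Join reverses this.]

Number of operations – O(d)

Number of I/Os – O(1)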

Insert

Insert(T,2)

Before: root [13]; internal nodes [5 10], [15 28]; leaves [1 3], [6], [11], [14], [16 17], [30 40 50]

After: root [13]; internal nodes [5 10], [15 28]; leaves [1 2 3], [6], [11], [14], [16 17], [30 40 50]

Insert(T,4)

After adding 4, the leaf [1 2 3 4] is overflowing: root [13]; internal nodes [5 10], [15 28]; leaves [1 2 3 4], [6], [11], [14], [16 17], [30 40 50]

Split

The overflowing leaf is split into [1] and [3 4], and the key 2 moves up into the parent: root [13]; internal nodes [2 5 10], [15 28]; leaves [1], [3 4], [6], [11], [14], [16 17], [30 40 50]
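Splitting an overflowing node

[Figure: an overflowing node with 2d keys is split around its middle key B, which moves up into the parent; the two resulting nodes A and C hold d keys and d−1 keys.]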
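
On the keys alone, the split of an overflowing node can be sketched as below. The function name is an assumption, and the choice of which side receives d−1 keys follows the worked example (where [1 2 3 4] becomes [1], 2, [3 4]); child pointers are omitted.

```python
from typing import List, Tuple


def split_overflowing(keys: List[int], d: int) -> Tuple[List[int], int, List[int]]:
    """Split a node holding 2d keys into (left keys, middle key, right keys)."""
    assert len(keys) == 2 * d, "an overflowing node holds one key more than the maximum 2d-1"
    # d-1 keys stay on the left, the next key moves up, the remaining d keys go right.
    return keys[:d - 1], keys[d - 1], keys[d:]


print(split_overflowing([1, 2, 3, 4], d=2))   # ([1], 2, [3, 4])
```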

Another insert

Insert(T,7)

Before: root [13]; internal nodes [2 5 10], [15 28]; leaves [1], [3 4], [6], [11], [14], [16 17], [30 40 50]

After: root [13]; internal nodes [2 5 10], [15 28]; leaves [1], [3 4], [6 7], [11], [14], [16 17], [30 40 50]

And another insert

Insert(T,8)

After: root [13]; internal nodes [2 5 10], [15 28]; leaves [1], [3 4], [6 7 8], [11], [14], [16 17], [30 40 50]

And the last for today

Insert(T,9)

After adding 9, the leaf [6 7 8 9] is overflowing: root [13]; internal nodes [2 5 10], [15 28]; leaves [1], [3 4], [6 7 8 9], [11], [14], [16 17], [30 40 50]

Split

The overflowing leaf is split into [6] and [8 9], and the key 7 moves up; the parent [2 5 7 10] is now overflowing: root [13]; internal nodes [2 5 7 10], [15 28]; leaves [1], [3 4], [6], [8 9], [11], [14], [16 17], [30 40 50]

The overflowing parent is split into [2] and [7 10], and the key 5 moves up into the root: root [5 13]; internal nodes [2], [7 10], [15 28]; leaves [1], [3 4], [6], [8 9], [11], [14], [16 17], [30 40 50]

Insert – Bottom up

  • Find the insertion point by a downward search
  • Insert the key in the appropriate place
  • If the current node is overflowing, split it
  • If its parent is now overflowing, split it, etc.
  • Disadvantages:
  • Need both a downward scan and an upward scan
  • Need to keep parents on a stack
  • Nodes are temporarily overflowing

Insert – Top down

  • While conducting the search, split full children on the search path before descending to them!
  • When the appropriate leaf is reached, it is not full, so the new key may be added!

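Split-Root(T)

[Figure: the full root, with d−1 keys on each side of its middle key, is split into two nodes with d−1 keys each under a new root T.root; the tree grows by one level.]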

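Split-Child(x,i)

[Figure: the full child x.child[i] holds A, B, C, with d−1 keys in each of A and C. After the split, the middle key B sits in x as key[i], and A and C are two separate children of x with d−1 keys each.]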
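
Split-Child might look roughly like this in code. The sketch assumes a simplified `Node` class with keys and children only (items omitted) and passes d explicitly; both are illustrative choices, not the lecture's definitions.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Node:
    key: List[Any] = field(default_factory=list)
    child: List["Node"] = field(default_factory=list)
    leaf: bool = True


def split_child(x: Node, i: int, d: int) -> None:
    """Split the full child x.child[i] (2d-1 keys) around its middle key."""
    y = x.child[i]
    assert len(y.key) == 2 * d - 1, "only a full child is split"
    # The new right sibling takes the last d-1 keys (and the last d children, if any).
    z = Node(key=y.key[d:], child=y.child[d:], leaf=y.leaf)
    middle = y.key[d - 1]
    y.key = y.key[:d - 1]
    y.child = y.child[:d]
    # The middle key moves up into x as key[i]; z becomes child[i+1].
    x.key.insert(i, middle)
    x.child.insert(i + 1, z)


# d = 2: split the full leaf [30 40 50] under the parent [15 28].
parent = Node(key=[15, 28], leaf=False,
              child=[Node(key=[14]), Node(key=[16, 17]), Node(key=[30, 40, 50])])
split_child(parent, 2, d=2)
print(parent.key, [c.key for c in parent.child])
# [15, 28, 40] [[14], [16, 17], [30], [50]]
```

The work is proportional to the number of keys moved, in line with the O(d) operations and O(1) I/Os quoted for Split.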

Insert – Top down

  • While conducting the search, split full children on the search path before descending to them!

Number of I/Os – O(log_d n)

Number of operations – O(d log_d n)
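
Putting the pieces together, a top-down insert can be sketched as follows. The class and method names are assumptions, keys are assumed to be distinct, and items are omitted; it is meant as an illustration of the single downward pass, not a complete dictionary.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Node:
    key: List[Any] = field(default_factory=list)
    child: List["Node"] = field(default_factory=list)
    leaf: bool = True


class BTree:
    def __init__(self, d: int) -> None:
        self.d = d
        self.root = Node()

    def _is_full(self, x: Node) -> bool:
        return len(x.key) == 2 * self.d - 1

    def _split_child(self, x: Node, i: int) -> None:
        d, y = self.d, x.child[i]
        z = Node(key=y.key[d:], child=y.child[d:], leaf=y.leaf)
        x.key.insert(i, y.key[d - 1])       # middle key moves up
        x.child.insert(i + 1, z)
        y.key, y.child = y.key[:d - 1], y.child[:d]

    def insert(self, k: Any) -> None:
        # Split-Root: a full root first gets a new, empty parent.
        if self._is_full(self.root):
            self.root = Node(child=[self.root], leaf=False)
            self._split_child(self.root, 0)
        x = self.root
        while not x.leaf:
            i = bisect_left(x.key, k)
            if self._is_full(x.child[i]):
                # Split the full child before descending into it.
                self._split_child(x, i)
                if k > x.key[i]:
                    i += 1
            x = x.child[i]
        # The leaf reached is not full, so the key can simply be added.
        x.key.insert(bisect_left(x.key, k), k)


def in_order(x: Node) -> List[Any]:
    """All keys of the subtree of x in sorted order."""
    if x.leaf:
        return list(x.key)
    out: List[Any] = []
    for i, c in enumerate(x.child):
        out += in_order(c)
        if i < len(x.key):
            out.append(x.key[i])
    return out


keys = [13, 5, 10, 15, 28, 1, 3, 6, 11, 14, 16, 17, 30, 40, 50, 2, 4, 7, 8, 9]
t = BTree(d=2)
for k in keys:
    t.insert(k)
print(in_order(t.root) == sorted(keys))  # True
```

Because full nodes are split on the way down, no parent stack is needed and no node ever overflows; the resulting tree shape can differ from the bottom-up examples shown earlier.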

Deletions from B-Trees

delete(T,26)

Before: root [7 15]; internal nodes [3], [10 13], [22 28]; leaves [1 2], [4 6], [8 9], [11 12], [14], [20], [24 26], [30 40 50]

After: root [7 15]; internal nodes [3], [10 13], [22 28]; leaves [1 2], [4 6], [8 9], [11 12], [14], [20], [24], [30 40 50]

Delete (Replace with predecessor)

delete(T,13)

The key 13 is in an internal node, so it is replaced by its predecessor 12, which is then removed from its leaf: root [7 15]; internal nodes [3], [10 12], [22 28]; leaves [1 2], [4 6], [8 9], [11], [14], [20], [24], [30 40 50]

Delete (steal from sibling)

delete(T,24)

Removing 24 leaves its leaf empty, so a key is stolen from the sibling [30 40 50] through the parent: root [7 15]; internal nodes [3], [10 12], [22 30]; leaves [1 2], [4 6], [8 9], [11], [14], [20], [28], [40 50]

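Rotate/Steal right and Rotate/Steal left

[Figure: Rotate/Steal right and Rotate/Steal left, shown as inverse transformations on the nodes labelled A and B]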

Delete (Join)

delete(T,20)

Before: root [7 15]; internal nodes [3], [10 12], [22 30]; leaves [1 2], [4 6], [8 9], [11], [14], [20], [28], [40 50]

Removing 20 leaves its leaf empty, and the sibling [28] has no key to spare, so the empty leaf is joined with [28] and the separating key 22: root [7 15]; internal nodes [3], [10 12], [30]; leaves [1 2], [4 6], [8 9], [11], [14], [22 28], [40 50]

Few more..

delete(T,22)

After: root [7 15]; internal nodes [3], [10 12], [30]; leaves [1 2], [4 6], [8 9], [11], [14], [28], [40 50]

delete(T,28)

Stealing again

Removing 28 leaves its leaf empty, so a key is stolen from the sibling [40 50] through the parent: root [7 15]; internal nodes [3], [10 12], [40]; leaves [1 2], [4 6], [8 9], [11], [14], [30], [50]

Another one

delete(T,30)

Before: root [7 15]; internal nodes [3], [10 12], [40]; leaves [1 2], [4 6], [8 9], [11], [14], [30], [50]

Removing 30 leaves its leaf empty, and the sibling [50] has no key to spare.

After Join

The empty leaf is joined with [50] and the separating key 40, which leaves the parent with no keys: root [7 15]; internal nodes [3], [10 12], [ ]; leaves [1 2], [4 6], [8 9], [11], [14], [40 50]

Now we can steal

The empty internal node steals from its sibling [10 12] through the root: root [7 12]; internal nodes [3], [10], [15]; leaves [1 2], [4 6], [8 9], [11], [14], [40 50]

More?

delete(T,40)

Delete – Top down

  • Assume, at first, that the item to be deleted is in a leaf
  • While conducting the search, make sure that each child descended into contains at least d keys
  • How?
  • Steal or join
  • When the item is located, it resides in a leaf containing at least d keys, so it can be removed

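Delete – Top down

  • While conducting the search, make sure that each child you descend to contains at least d keys

[Figure: the child to descend into has only d−1 keys. If an adjacent sibling has at least d keys: Rotate! (Steal). If the sibling also has only d−1 keys: Join!]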
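
The steal-or-join step can be sketched as a helper that is called before descending into a child. All function names here are assumptions, the `Node` class is a simplified stand-in without items, and the case where a join empties the root is not handled.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Node:
    key: List[Any] = field(default_factory=list)
    child: List["Node"] = field(default_factory=list)
    leaf: bool = True


def steal_from_left(x: Node, i: int) -> None:
    c, s = x.child[i], x.child[i - 1]
    c.key.insert(0, x.key[i - 1])      # separator moves down into the child
    x.key[i - 1] = s.key.pop()         # sibling's last key moves up
    if not s.leaf:
        c.child.insert(0, s.child.pop())


def steal_from_right(x: Node, i: int) -> None:
    c, s = x.child[i], x.child[i + 1]
    c.key.append(x.key[i])             # separator moves down into the child
    x.key[i] = s.key.pop(0)            # sibling's first key moves up
    if not s.leaf:
        c.child.append(s.child.pop(0))


def join(x: Node, i: int) -> None:
    """Merge child[i], key[i] and child[i+1] into child[i]."""
    c, s = x.child[i], x.child.pop(i + 1)
    c.key += [x.key.pop(i)] + s.key
    c.child += s.child


def ensure_d_keys(x: Node, i: int, d: int) -> int:
    """Make x.child[i] hold at least d keys; return the index of the child to descend into."""
    if len(x.child[i].key) >= d:
        return i
    if i > 0 and len(x.child[i - 1].key) >= d:
        steal_from_left(x, i)
        return i
    if i + 1 < len(x.child) and len(x.child[i + 1].key) >= d:
        steal_from_right(x, i)
        return i
    if i > 0:
        join(x, i - 1)
        return i - 1
    join(x, i)
    return i


# d = 2, the subtree under [22 28] from the deletion example.
parent = Node(key=[22, 28], leaf=False,
              child=[Node(key=[20]), Node(key=[24]), Node(key=[30, 40, 50])])
i = ensure_d_keys(parent, 1, d=2)      # steals from [30 40 50]
parent.child[i].key.remove(24)
after_steal = (list(parent.key), [list(c.key) for c in parent.child])

j = ensure_d_keys(parent, 0, d=2)      # no sibling can spare a key, so join
parent.child[j].key.remove(20)
after_join = (list(parent.key), [list(c.key) for c in parent.child])
print(after_steal, after_join)
```

The two calls reproduce the steal for delete(T,24) and the join for delete(T,20) from the worked example, restricted to this subtree.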

Delete – Top down

  • What if the item to be deleted is in an internal node?
  • Descend as before from the root until the item to be deleted is located
  • Keep a pointer to the node containing the item
  • Carry on descending towards the successor, making sure that nodes contain at least d keys
  • When the successor is found, delete it from its leaf and use it to replace the item to be deleted

Deletions from B-Trees

As always, similar to, but slightly more complicated than, insertions

(may need to replace with successor)

Deletion is slightly simpler for B+-Trees

B-Trees vs. B+-Trees
  • In a B-tree each node contains items and keys
  • In a B+-tree leaves contain items and keys. Internal nodes contain keys to direct the search.
  • Keys in internal nodes are either keys of existing items, or keys of items that were deleted.
  • Internal nodes may contain more keys, so overall the # of items we can store increases