# Application of Data Structures


### Application of Data Structures

Christopher Moh 2005

• Priority Queue structures

• Heaps

• Application: Dijkstra’s algorithm

• Cumulative Sum Data Structures on Intervals

• Augmenting data structures with extra info to solve questions

• Stores elements in a list by comparing a key field

• Often has other satellite data

• For example, when sorting pixels by their R value, we treat the R value as the key field and the G and B values as satellite data

• Priority queues allow us to sort elements by their key field.

• Create()

• Creates an empty priority queue

• Find_Min()

• Returns the smallest element (by key field)

• Insert(x)

• Insert element x (with predefined key field)

• Delete(x)

• Delete the element at position x from the queue

• Change(x, k)

• Change key field of position x to k

• Union (a,b)

• Combines two PQs a and b

• Search (k)

• Returns the position of the element in the heap with key value k

• How complicated is it?

• Is the code likely to be buggy?

• How fast does it need to be?

• Does a constant factor also come into the equation?

• Do I need to store extra data to do a Search?

• Throughout this presentation, we assume that extra bookkeeping data exists which lets us perform a Search in O(1) time. Maintaining that data is assumed and not covered.

• Unsorted Array

• Create, Insert, Change in O(1) time

• Find_min, Delete in O(n) time

• Sorted Array

• Create, Find_min in O(1) time

• Insert, Delete, Change in O(n + log n) = O(n) time

• The most common structure to implement in a competition setting

• Efficient for most applications

• Easy to implement

• A heap is a tree structure in which the value of each node is less than or equal to the values of all of its children (a min-heap)

• A binary heap is a heap where the maximum number of children for each node is 2.

• Consider a heap of size nheap in an array BHeap[1..nheap] (Define BHeap[nheap+1 .. (nheap*2)+1] to be INFINITY for practical reasons)

• The children of BHeap[x] are BHeap[x*2] and BHeap[x*2+1]

• The parent of BHeap[x] is BHeap[x/2] (integer division)

• This allows a near uniform Binary Heap where we can ensure that the number of levels in this heap is O(log n)

• Some properties with respect to key values: BHeap[x] >= BHeap[x/2], BHeap[x] <= BHeap[x*2], BHeap[x] <= BHeap[x*2+1]; no particular order holds between the siblings BHeap[x*2] and BHeap[x*2+1]

• We define BTree(x) to be the Binary Tree rooted at BHeap[x]

• We define Heapify(x) to be an operation that does the following:

• Assume: BTree(x*2) and BTree(x*2+1) are binary heaps but BTree(x) is not necessarily a binary heap

• Produce: BTree(x) binary heap

• Details of Heapify in later slides – but for now, we assume Heapify is O(log n)

• For the rest of the presentation, we assume the variable n refers to nheap

• Create is trivial – O(1) time

• Find_min:

• Return BHeap[1]

• O(1) time

• Insert (element with key value x)

• nheap++

• BHeap[nheap] = x

• T = nheap

• While (T != 1 && BHeap[T] < BHeap[T/2])

• Swap (BHeap[T], BHeap[T/2])

• T = T / 2

• O(log n) time as the number of levels is O(log n)
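
The Insert and Find_min steps above can be sketched in Python as follows. This is a minimal illustration rather than the presenter's code: the class name `BinaryHeap` and the list `a` are assumptions, index 0 is left unused so that the parent of position t is t // 2, and the list simply grows instead of being padded with INFINITY.

```python
class BinaryHeap:
    """Min-heap stored 1-indexed in a Python list (a[0] is unused)."""

    def __init__(self):
        self.a = [None]              # keys live in a[1..n]

    def find_min(self):
        return self.a[1]             # the root holds the smallest key

    def insert(self, key):
        a = self.a
        a.append(key)                # place the new key in the last slot
        t = len(a) - 1
        # Bubble up while the new key is smaller than its parent.
        while t != 1 and a[t] < a[t // 2]:
            a[t], a[t // 2] = a[t // 2], a[t]
            t //= 2


heap = BinaryHeap()
for key in [5, 3, 8, 1, 4]:
    heap.insert(key)
print(heap.find_min())   # 1
print(heap.a[1:])        # [1, 3, 8, 5, 4]
```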

• ChangeDown (position x, new key value k)

• Assume: k < existing BHeap[x]

• BHeap[x] = k

• T = x

• While (T != 1 && BHeap[T] < BHeap[T/2])

• Swap (BHeap[T], BHeap[T/2])

• T = T/2

• Complexity: O(log n)

• This procedure is known as “bubbling up” the heap

• ChangeUp (position x, new key value k)

• Assume: k > existing BHeap[x]

• BHeap[x] = k

• Heapify(x)

• O(log n) as complexity of Heapify is O(log n)

• Delete (position x on the heap)

• BHeap[x] = BHeap[nheap]

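• nheap-- (then reset the vacated slot BHeap[nheap+1] to INFINITY so that Heapify still sees INFINITY beyond the end of the heap)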

• Heapify(x)

• T = x

• While (T != 1 && BHeap[T] < BHeap[T/2])

• Swap (BHeap[T], BHeap[T/2])

• T = T / 2

• Complexity is O(log n)

• Why must I do both Heapify and “bubble up”?
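
One way to explore the question above is to run Delete on a small heap. The sketch below is an illustration under stated assumptions: the helper names `sift_down` and `sift_up` are hypothetical, the heap is a 1-indexed Python list, and bounds checks replace the INFINITY padding used on the slides. The element moved into position x comes from the end of the array, so it may belong to a different subtree and can be either larger than the children of x or smaller than the parent of x.

```python
def sift_down(a, n, x):
    """Heapify: push a[x] down until neither child is smaller."""
    while True:
        k = x
        for child in (2 * x, 2 * x + 1):
            if child <= n and a[child] < a[k]:
                k = child
        if k == x:
            return
        a[x], a[k] = a[k], a[x]
        x = k


def sift_up(a, x):
    """Bubble a[x] up while it is smaller than its parent."""
    while x != 1 and a[x] < a[x // 2]:
        a[x], a[x // 2] = a[x // 2], a[x]
        x //= 2


def delete(a, x):
    """Remove position x from the 1-indexed min-heap a (a[0] is unused)."""
    last = a.pop()
    n = len(a) - 1
    if x <= n:                 # nothing to repair if x was the last slot
        a[x] = last
        sift_down(a, n, x)
        sift_up(a, x)


heap = [None, 1, 10, 2, 11, 12, 3, 4]
delete(heap, 4)    # 11 is replaced by 4, which has to move up past 10
print(heap[1:])    # [1, 4, 2, 10, 12, 3]
delete(heap, 1)    # 1 is replaced by 3, which has to move down past 2
print(heap[1:])    # [2, 4, 3, 10, 12]
```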

• Heapify (position x on the heap)

• T = min(BHeap[x], BHeap[x*2], BHeap[x*2+1])

• If (T == BHeap[x]) return;

• K = position where BHeap[K] = T

• Swap(BHeap[x], BHeap[K])

• Heapify(K)

• O(log n) as the maximum number of levels in the heap is O(log n) and Heapify only goes through each level at most once
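
A recursive Heapify along the lines of the slide might look like this in Python. It is a sketch under assumptions: the function signature `heapify(a, n, x)` is hypothetical, the heap occupies a[1..n] of a Python list, and explicit bounds checks stand in for the INFINITY padding.

```python
def heapify(a, n, x):
    """Restore the min-heap property at x, assuming both subtrees are heaps."""
    k = x
    if 2 * x <= n and a[2 * x] < a[k]:
        k = 2 * x
    if 2 * x + 1 <= n and a[2 * x + 1] < a[k]:
        k = 2 * x + 1
    if k == x:
        return                    # a[x] is already the smallest of the three
    a[x], a[k] = a[k], a[x]
    heapify(a, n, k)              # continue one level further down


a = [None, 9, 2, 3, 4, 5, 6, 7]   # only the root violates the heap property
heapify(a, 7, 1)
print(a[1:])                      # [2, 4, 3, 9, 5, 6, 7]
```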

• Create, Find_min in O(1) time

• Change (includes both ChangeUp and ChangeDown), Insert, and Delete are O(log n) time

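• How long do Union operations take?

• By inserting the elements of one heap into the other one by one: O(n log n)

• By concatenating the two arrays and rebuilding the heap with Heapify: O(n)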

• We can convert an unsorted array to a heap using Heapify (why does this work?):

• For (i = n/2; i >= 1; i--)

• Heapify(i)

• We can then return a sorted list (list initially empty):

• For (i = 1; i <= n; i++)

• Append the value of find_min to the list

• Delete(1)

• Complexity is O(n log n)
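
The two loops above can be combined into a short heap sort. The following Python sketch is illustrative only: `heap_sort` and `sift_down` are hypothetical names, and the Delete(1) step is written inline as "move the last element to the root, shrink, Heapify".

```python
def sift_down(a, n, x):
    """Heapify position x of the 1-indexed min-heap a[1..n]."""
    while True:
        k = x
        for child in (2 * x, 2 * x + 1):
            if child <= n and a[child] < a[k]:
                k = child
        if k == x:
            return
        a[x], a[k] = a[k], a[x]
        x = k


def heap_sort(values):
    a = [None] + list(values)        # a[0] is unused
    n = len(values)
    # Build phase: positions n//2+1..n are leaves, so start at n//2.
    for i in range(n // 2, 0, -1):
        sift_down(a, n, i)
    out = []
    # Extraction phase: repeatedly take the minimum and delete the root.
    while n >= 1:
        out.append(a[1])
        a[1] = a[n]
        n -= 1
        sift_down(a, n, 1)
    return out


print(heap_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```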

• Define Binomial Tree B(k) as follows:

• B(0) is a single node

• B(n), n != 0, is formed by merging two B(n-1) trees in the following way:

• The root of the B(n) tree is the root of one of the B(n-1) trees, and the (new) leftmost child of this root is the root of the other B(n-1) tree.

• Within the tree, the heap property holds, i.e. the key field of any node is less than or equal to the key fields of all its children.

• The number of nodes in B(k) is exactly 2^k.

• The height of B(k), counted in levels, is exactly (k + 1)

• For any tree B(k)

• The root of B(k) has exactly k children

• If we take the children of B(k) from left to right, they form the roots of a B(k-1), B(k-2), …, B(0) tree in that order

• Binomial Heaps are a forest of binomial trees with the following properties:

• All the binomial trees are of different sizes

• The binomial trees are ordered (from left to right) by increasing size

• If we consider the fact that the size of B(k) is 2^k, the binomial tree B(k) exists in a binomial heap of n nodes iff the bit representing 2^k is “1” in the binary representation of n

• For example: 13 (decimal) = 1101 (binary), so the binomial heap with 13 nodes consists of the binomial trees B(0), B(2), and B(3).
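
The correspondence between the bits of n and the trees in the heap can be checked with a few lines of Python. The function name `binomial_tree_orders` is an assumption made for this illustration.

```python
def binomial_tree_orders(n):
    """Orders k of the trees B(k) in a binomial heap holding n nodes."""
    return [k for k in range(n.bit_length()) if (n >> k) & 1]


print(binomial_tree_orders(13))   # [0, 2, 3]
# The tree sizes 2^k add back up to n.
print(sum(2 ** k for k in binomial_tree_orders(13)))   # 13
```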

• Each node will store the following data:

• Key field

• Pointers (if non-existent, points to NIL) to

• Parent

• Next Sibling (ordered left to right; a sibling must have the same parent); For roots of binomial trees, next sibling points to the root of the next binomial tree

• Leftmost child

• Number of children in field degree

• Any other data that might be useful for the program

• The binomial heap is represented by a head pointer that points to the root of the smallest binomial tree (which is the leftmost binomial tree)

• Links two binomial trees with root h1 and h2 of the same order k to form a new binomial tree of order (k+1)

• We assume h1->key < h2->key which implies that h1 is the root of the new tree

• T = h1->leftchild

• h1->leftchild = h2

• h2->parent = h1

• h2->next_sibling = T

• h1->degree++ (h1 has gained one child)

• O(1) time
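
A possible Python rendering of Link, using the node fields listed earlier, is sketched below. The class name `Node`, the attribute names, and the helper `size` are assumptions for illustration. Unlike the slide, this version does not assume h1 has the smaller key; it swaps the two roots when needed.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.child = None      # leftmost child
        self.sibling = None    # next sibling to the right
        self.degree = 0        # number of children


def link(h1, h2):
    """Link two trees of equal order k into one tree of order k + 1."""
    if h2.key < h1.key:
        h1, h2 = h2, h1        # keep the smaller key at the root
    h2.parent = h1
    h2.sibling = h1.child      # old leftmost child becomes h2's sibling
    h1.child = h2              # h2 becomes the new leftmost child
    h1.degree += 1
    return h1


def size(root):
    """Count the nodes of the tree rooted at root."""
    count, c = 1, root.child
    while c is not None:
        count += size(c)
        c = c.sibling
    return count


# Build B(3) by linking eight single nodes pairwise, three rounds in total.
trees = [Node(k) for k in [7, 3, 5, 1, 6, 2, 4, 0]]
while len(trees) > 1:
    trees = [link(trees[i], trees[i + 1]) for i in range(0, len(trees), 2)]
root = trees[0]

child_degrees = []
c = root.child
while c is not None:
    child_degrees.append(c.degree)
    c = c.sibling
print(root.key, root.degree, size(root), child_degrees)   # 0 3 8 [2, 1, 0]
```

The printed values agree with the properties stated earlier: B(3) has 2^3 = 8 nodes, its root has 3 children, and those children are roots of B(2), B(1), B(0) from left to right.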

• Create – Create a new binomial heap with one node (key field set)

• Set Parent, Leftchild, Next sibling to NIL

• O(1) time

• Find_min

• X = head, min = INFINITY

• While (X != nil)

• If (X->key < min) min = X->key

• X = X->next_sibling

• Return min

• O(log n) time, as there are at most floor(log n) + 1 binomial trees (one per bit of n)

• Merge (h1, h2, L)

• Given binomial heaps with head pointers h1 and h2, create a list L of all the binomial trees of h1 U h2 arranged in ascending order of size

• For any order k, there may be zero, one, or two binomial trees of order k in this list.

• Merge (h1, h2, L)

• Assume that NIL is a node of infinitely large order, so that an exhausted list never wins the comparison below

• L = empty

• While (h1 != NIL || h2 != NIL)

• If (h1->degree < h2->degree)

• Append the (binomial)tree with root h1 to L

• h1 = h1->next_sibling

• Else

• Apply above steps to h2 instead

• Union (h1, h2)

• The fundamental operation involving binomial heaps

• Takes two binomial heaps with head pointers h1 and h2 and creates a new binomial heap of the union of h1 and h2

• Union (h1, h2)

• Merge (h1, h2, L)

• Go by increasing k in the list L until L is empty

• If there are exactly one or exactly three (how can this happen?) binomial trees of order k in L, append one binomial tree of order k to the binomial heap and remove that tree from L

• If there are two trees of order k, remove both trees, use Link to form a tree of order (k+1) and pre-pend this tree to L

• Union is O(log n)
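
The Union procedure can be sketched compactly in Python if some simplifications are accepted. The assumptions here, none of which come from the slides: a heap is a plain Python list of roots in increasing order of size, each tree keeps its children in a list instead of sibling pointers, and the Merge step uses `sorted` rather than a linear merge. The names `Tree`, `link`, and `union` are hypothetical.

```python
from collections import deque
from itertools import islice


class Tree:
    def __init__(self, key):
        self.key = key
        self.children = []           # leftmost child first

    @property
    def degree(self):
        return len(self.children)


def link(a, b):
    """Link two trees of equal order; the smaller key becomes the root."""
    if b.key < a.key:
        a, b = b, a
    a.children.insert(0, b)
    return a


def union(h1, h2):
    """Union of two binomial heaps given as root lists sorted by order."""
    L = deque(sorted(h1 + h2, key=lambda t: t.degree))   # Merge step
    heap = []
    while L:
        k = L[0].degree
        # At most three trees of order k can sit at the front of L.
        count = sum(1 for t in islice(L, 3) if t.degree == k)
        if count == 2:
            a, b = L.popleft(), L.popleft()
            L.appendleft(link(a, b))      # carry a tree of order k + 1
        else:                             # one or three trees of order k
            heap.append(L.popleft())
    return heap


heap = []
for key in [7, 3, 9, 1, 5, 8, 2, 6, 4, 10, 11, 12, 13]:
    heap = union(heap, [Tree(key)])       # Insert is a Union with one node
print([t.degree for t in heap])           # [0, 2, 3]
print(min(t.key for t in heap))           # 1
```

With 13 nodes inserted, the resulting orders match the binary representation 1101, and Find_min only has to scan the three roots.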

• Inserting a new node with key field set

• Create a new binomial heap with that one node

• Union (existing heap with head h, new heap)

• O (log n) time

• ChangeDown (node at position x, new value)

• Decreasing the key value of a node

• Same idea as binary heap: “Bubble” up the binomial tree containing this node (exchange only key fields and satellite data! What’s the complexity if you physically change the node?)

• O (log n) time

• Delete (node at position x)

• Deleting position x from the heap

• ChangeDown(x, -INFINITY)

• Now x is at the root of its binomial tree

• Supposing that the binomial tree is of order k

• Recall that the children of the root of the binomial tree, from right to left, are binomial trees of order 0, 1, 2, 3, 4, …, k-1

• Form a new binomial heap whose roots are the children of the root of this binomial tree (in reverse order, so that they are sorted by increasing size, with their parent pointers cleared)

• Remove the original binomial tree from the original binomial heap

• Union (original heap, new heap)

• O(log n) complexity

• ChangeUp (node at position X, new value)

• Delete (X)

• Insert (new value)

• O (log n) time

• Create in O(1) time

• Union, Find_min, Delete, Insert, and Change operations take O(log n) time

• In general, because they are more complicated, in competition it is far more prudent (saves time coding and debugging) to use a binary heap instead

• Unless there are MANY Union operations

• The following describes how Dijkstra’s algorithm can be coded with a binary heap

• Initializing phase:

• Let n be the number of nodes

• Create a heap of size n, all key fields initialized to INFINITY

• ChangeDown (position of s in heap, 0) where s is the source node

• While (heap is not empty)

• X = node corresponding to find_min value

• Delete (position of X in heap = 1)

• For all nodes k that are adjacent to X

• If (cost[X] + distance[X][k] < cost[k])

• ChangeDown (position of k in heap, cost[X] + distance[X][k])
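
For comparison, here is a short Python sketch of Dijkstra's algorithm. Note that it swaps in a different technique from the slides: Python's `heapq` module has no ChangeDown operation, so instead of decreasing a key in place, this version pushes a new entry and skips stale entries when they are popped ("lazy deletion"). The adjacency-list format and the function name `dijkstra` are assumptions.

```python
import heapq


def dijkstra(adj, source):
    """adj maps node -> list of (neighbour, weight); weights must be >= 0."""
    cost = {v: float("inf") for v in adj}
    cost[source] = 0
    pq = [(0, source)]
    while pq:
        d, x = heapq.heappop(pq)        # find_min followed by Delete
        if d > cost[x]:
            continue                    # stale entry: x was already improved
        for k, w in adj[x]:
            if d + w < cost[k]:
                cost[k] = d + w
                heapq.heappush(pq, (cost[k], k))   # stands in for ChangeDown
    return cost


graph = {
    "s": [("a", 4), ("b", 1)],
    "a": [("c", 1)],
    "b": [("a", 2), ("c", 5)],
    "c": [],
}
print(dijkstra(graph, "s"))   # {'s': 0, 'a': 3, 'b': 1, 'c': 4}
```

The lazy variant can hold up to m entries in the heap, so each operation costs O(log m), which is still O(log n) because m is at most n^2.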

• At most n nodes are deleted

• O(n log n)

• Let m be the number of edges. Each edge is relaxed at most once.

• O(m log n)

• Total running time O([m+n] log n)

• This is faster than the O(n^2) of a basic array implementation unless the graph is very dense, in which case m is about n^2 and the heap version runs in O(n^2 log n)

• Problem: We have a line that runs from x coordinate 1 to x coordinate N. At x coordinate X [X an integer between 1 and N], there is g(X) gold. Given an interval [a,b], how much gold is there between a and b (inclusive)?

• How efficiently can this be done if we dynamically change the amount of gold and the interval [a,b] keeps changing?

• Let us define C(0) = 0, and C(x) = C(x-1) + g(x) where g(x) is the amount of gold at position x

• C(x) then defines the total amount of gold from position 1 to position x

• The amount of gold in interval [a,b] is simply C(b) – C(a-1)

• For any change in a or b, we can recompute the answer in O(1) time

• However, if we change g(x), we will have to change C(x), C(x+1), C(x+2), …, C(N)

• Any change in gold results in an update in O(N) time
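
The plain cumulative array can be illustrated as follows. The sample gold values and the function name `gold_between` are made up for the example; index 0 is padding so that C(0) = 0.

```python
from itertools import accumulate

g = [0, 3, 1, 4, 1, 5, 9]        # g[1..6]; g[0] is padding
C = list(accumulate(g))          # C[x] = g[1] + ... + g[x], C[0] = 0


def gold_between(a, b):
    """Gold in the interval [a, b], answered in O(1)."""
    return C[b] - C[a - 1]


print(C)                    # [0, 3, 4, 8, 9, 14, 23]
print(gold_between(2, 4))   # 6

# Changing g[3] invalidates C[3..N], so the rebuild costs O(N).
g[3] = 0
C = list(accumulate(g))
print(gold_between(2, 4))   # 2
```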

• We can use the binary representation of any number to come up with a cumulative sum tree

• For example, let us take 13 (decimal) = 1101 (binary)

• The cumulative sum of g(1) + g(2) + … g(13) can be represented as the sum of:

• g(1) + g(2) + … + g(8) [ 8 elements ]

• g(9) + g(10) + … + g(12) [ 4 elements ]

• g(13) [ 1 element ]

• Notice that the number of elements in each case represents a bit that is “1” in the binary representation of the number

• Another example: C(19)

• 19 (decimal) is 10011 (binary)

• C(19) is the sum of the following:

• g(1) + g(2) + … + g(16) [ 16 elements ]

• g(17) + g(18) [ 2 elements ]

• g(19) [ 1 element ]

• Let us define C2(x) to be the sum g(p + 1) + … + g(x - 1) + g(x), where p is the number whose binary representation is that of x with the lowest set bit (the rightmost bit of x that is “1”) changed to “0”

• Examples of x and the corresponding p:

• x = 6 [110], p = 4 [100]

• x = 13 [1101], p = 12 [1100]

• x = 16 [10000], p = 0 [00000]
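
In code, p and the value of the lowest set bit are usually obtained with two bit tricks. The sketch below uses hypothetical function names and relies on Python integers behaving like two's-complement numbers under `&`.

```python
def lowest_set_bit(x):
    """Value of the rightmost 1 bit of x, e.g. 4 for 12 (1100)."""
    return x & -x


def clear_lowest_set_bit(x):
    """The number p: x with its rightmost 1 bit changed to 0."""
    return x & (x - 1)


for x in (6, 13, 16):
    p = clear_lowest_set_bit(x)
    # C2(x) covers g(p+1) .. g(x), which is lowest_set_bit(x) elements.
    print(x, p, lowest_set_bit(x))
# 6 4 2
# 13 12 1
# 16 0 16
```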

• If we want to find the cumulative sum C(x) = g(1) + g(2) + … + g(x), we can trace through the values of C2 using the binary representation of x

• Examples:

• C(13) = C2(8) + C2(8+4) + C2(8+4+1)

• C(16) = C2(16)

• C(21) = C2(16) + C2(16+4) + C2(16+4+1)

• C(99) = C2(64) + C2(64+32) + C2(64+32+2) + C2(64+32+2+1)

• This allows us to find C(x) in log x time

• Hence the amount of gold in interval [a,b] = C(b) – C(a-1) can be found in log N time, which implies updates of a and b can be done in O(log N)
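
The decompositions listed above can be generated mechanically by repeatedly clearing the lowest set bit of x. The function name `c2_terms` is an assumption for this illustration.

```python
def c2_terms(x):
    """Indices y such that C(x) is the sum of C2(y), in increasing order."""
    terms = []
    while x > 0:
        terms.append(x)
        x &= x - 1            # drop the lowest set bit
    return terms[::-1]


for x in (13, 16, 21, 99):
    print(x, c2_terms(x))
# 13 [8, 12, 13]
# 16 [16]
# 21 [16, 20, 21]
# 99 [64, 96, 98, 99]
```

The number of terms equals the number of 1 bits in x, which is at most about log x.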

• What happens when we change g(x)?

• If g(x) is changed, we only need to update C2(y) where C2(y) covers g(x)

• We can go through all necessary C2(y) in the following way:

• While (x <= N)

• Update C2(x)

• Add the value of the least significant bit of x to x

• This runs in O(log N) time

• Hence updates to g can also be done in O(log N) time, which is a great improvement over the O(N) needed for an array.
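
Putting the query and the update together gives a complete cumulative sum tree in a few lines. The Python sketch below stores C2 in a linear array; the class and method names are assumptions, and "Update C2(x)" is interpreted as adding the change in g(x) to every C2 entry that covers it.

```python
class CumulativeSumTree:
    def __init__(self, n):
        self.n = n
        self.c2 = [0] * (n + 1)       # c2[1..n]; c2[0] is unused

    def add(self, x, delta):
        """Add delta to g(x) by updating every C2(y) that covers x."""
        while x <= self.n:
            self.c2[x] += delta
            x += x & -x               # add the lowest set bit

    def prefix(self, x):
        """C(x) = g(1) + ... + g(x)."""
        total = 0
        while x > 0:
            total += self.c2[x]
            x &= x - 1                # clear the lowest set bit
        return total

    def range_sum(self, a, b):
        return self.prefix(b) - self.prefix(a - 1)


tree = CumulativeSumTree(8)
for x, amount in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    tree.add(x, amount)
print(tree.range_sum(2, 4))   # 6
tree.add(3, 10)               # g(3) grows by 10
print(tree.range_sum(2, 4))   # 16
print(tree.prefix(8))         # 41
```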

• Examples [binary representation in brackets]

• Change to g(5) [ 101 ]: Update C2(5), C2(6), C2(8), C2(16), and every C2(2^j) with 16 < 2^j <= N

• Change to g(13) [ 1101 ]: Update C2(13), C2(14), C2(16), and every C2(2^j) with 16 < 2^j <= N

• Change to g(35) [ 100011 ]: Update C2(35), C2(36), C2(40), C2(48), C2(64), and every C2(2^j) with 64 < 2^j <= N

• We can implement a cumulative sum tree very simply, by using a linear array to store the values of C2.

• Can we extend a cumulative sum tree to 2 or more dimensions?

• See IOI 2001 Day 1 Question 1

• Another way to solve the question is to use a “Sum of Intervals” Binary Tree

• Each node in the tree is represented by (L, R) and the value of (L,R) is g(L) + g(L+1) + … + g(R)

• The root of the tree has L = 1 and R = N

• Every leaf has L = R

• Every non-leaf has children (L, [L+R]/2) [left child] and ([L+R]/2+1, R) [right child]

• The number of nodes in the tree is O(N), at most 2N - 1 [ why? ]

• In an implementation, every node should have pointers to its children and its parent

• How to find C(x) = g(1) + g(2) + … + g(x)?

• We trace from the root downwards

• L = 1, R = N, C = 0

• While (L != R)

• M = (L + R) / 2

• If (M < x)

• C += value of the left child (L, M)

• Set L and R to the right child of the current node

• Else

• Set L and R to the left child of the current node

• C += value at (L,R) [ or (L,L) or (R,R) as L = R ]

• Time complexity: O(log n)

• What happens when g(x) is changed?

• Trace from (x,x) upwards to the root

• Let L = R = x

• While (L,R) is not the root

• Update the value of (L,R)

• Set (L,R) to the parent of (L,R)

• Update the root

• Complexity of O(log N)

• Hence all updates of interval [a,b] and g(x) can be done in O(log N) time
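
Both operations on the "Sum of Intervals" tree can be sketched in Python with explicit child and parent pointers. Everything named here (`IntervalNode`, `prefix`, `change`, `find_leaf`) is an assumption for illustration. The prefix query adds the value of the left child whenever it steps into the right child, and the update propagates the difference from the leaf up to the root.

```python
class IntervalNode:
    def __init__(self, lo, hi, parent=None):
        self.lo, self.hi, self.parent = lo, hi, parent
        self.value = 0                      # g(lo) + ... + g(hi)
        self.left = self.right = None
        if lo < hi:
            mid = (lo + hi) // 2
            self.left = IntervalNode(lo, mid, self)
            self.right = IntervalNode(mid + 1, hi, self)


def find_leaf(root, x):
    node = root
    while node.lo != node.hi:
        node = node.left if x <= node.left.hi else node.right
    return node


def prefix(root, x):
    """C(x) = g(1) + ... + g(x), tracing from the root downwards."""
    total, node = 0, root
    while node.lo != node.hi:
        if node.left.hi < x:
            total += node.left.value        # whole left half lies below x
            node = node.right
        else:
            node = node.left
    return total + node.value               # the leaf (x, x)


def change(root, x, new_gold):
    """Set g(x) = new_gold, tracing from the leaf (x, x) up to the root."""
    node = find_leaf(root, x)
    delta = new_gold - node.value
    while node is not None:
        node.value += delta
        node = node.parent


root = IntervalNode(1, 6)
for x, amount in enumerate([3, 1, 4, 1, 5, 9], start=1):
    change(root, x, amount)
print(prefix(root, 3))                      # 8
print(prefix(root, 4) - prefix(root, 1))    # gold in [2, 4]: 6
change(root, 3, 0)
print(prefix(root, 6))                      # 19
```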

• It is often useful to change the data structure in some way, by adding additional data in each node or changing what each node represents.

• This allows us to reuse the same data structure to solve other problems

• For example, we can use so-called “interval trees” to solve not just cumulative sum problems

• Each node (L,R) can store any property of the elements in the interval (L,R) that can be combined from the values stored in its two children.

• Balanced (and unbalanced) binary trees

• Red-Black trees

• 2-3-4 trees

• Splay trees

• Suffix Trees

• Fibonacci Heaps
