
Lecture 4 : Accelerated Cascading and Parallel List Ranking




  1. Lecture 4: Accelerated Cascading and Parallel List Ranking
  • We will first discuss a technique called accelerated cascading for designing very fast parallel algorithms.
  • We will then study a very important technique for ranking the elements of a list in parallel.
  Advanced Topics in Algorithms and Data Structures

  2. Fast computation of maximum
  Input: An array A holding p elements from a linearly ordered universe S. We assume that all the elements in A are distinct.
  Output: The maximum element of the array A.
  We use a boolean array M such that M(k) = 1 if and only if A(k) is the maximum element in A.
  Initialization: We allocate p processors to set each entry of M to 1.

  3. Fast computation of maximum
  Step 1: Assign p processors to each element of A, p^2 processors overall.
  • Consider the p processors allocated to A(j). We name these processors P1, P2, ..., Pi, ..., Pp.
  • Pi compares A(j) with A(i): if A(i) > A(j) then M(j) := 0, else do nothing.

  4. Fast computation of maximum
  Step 2: At the end of Step 1, M(k), 1 ≤ k ≤ p, will be 1 if and only if A(k) is the maximum element.
  • We allocate p processors, one for each entry of M.
  • If the entry is 0, the processor does nothing.
  • If the entry is 1, it outputs the index k of the maximum element.
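The two steps above can be simulated sequentially. The sketch below (function name `crcw_max` is ours, not from the lecture) uses nested loops to play the roles of the p^2 processors; the repeated writes of 0 into M(j) model the Common CRCW rule, since all concurrent writers write the same value.

```python
def crcw_max(A):
    """Sequential sketch of the O(1)-time Common CRCW PRAM maximum.

    The two nested loops stand in for the p*p processors P_{i,j};
    on a real CRCW PRAM all comparisons happen in a single step.
    """
    p = len(A)
    M = [1] * p                      # initialization: M(k) = 1 for all k
    for j in range(p):               # the p processors allocated to A(j) ...
        for i in range(p):           # ... where Pi compares A(j) with A(i)
            if A[i] > A[j]:
                M[j] = 0             # concurrent writers all write 0
    for k in range(p):               # Step 2: one processor per entry of M
        if M[k] == 1:
            return k                 # index of the maximum element
```

For example, `crcw_max([3, 9, 4])` returns 1, the index of the maximum.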

  5. Fast computation of maximum
  Complexity: The processor requirement is p^2 and the time complexity is O(1).
  • Several processors may write M(j) := 0 simultaneously, but they all write the same value. We need this concurrent write facility and hence the Common CRCW PRAM model.

  6. Optimal computation of maximum
  • This is the same balanced binary tree algorithm that we used for adding n numbers: we combine pairs of elements level by level, taking the maximum instead of the sum.

  7. Optimal computation of maximum
  • This algorithm takes O(n) processors and O(log n) time.
  • We can reduce the processor complexity to O(n / log n) by first letting each processor find the maximum of log n elements sequentially. Hence the algorithm does optimal O(n) work.
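The binary tree algorithm can be sketched as repeated pairwise "tournament" rounds; each round runs in O(1) time on the PRAM, and the number of candidates halves, giving O(log n) rounds. (The function name `tree_max` is ours.)

```python
def tree_max(A):
    """Sequential sketch of the binary-tree maximum algorithm.

    Each while-loop iteration is one parallel round: one processor
    per pair keeps the larger element, halving the candidate set.
    """
    cand = list(A)
    while len(cand) > 1:
        nxt = [max(cand[i], cand[i + 1])       # one processor per pair
               for i in range(0, len(cand) - 1, 2)]
        if len(cand) % 2 == 1:
            nxt.append(cand[-1])               # odd element passes through
        cand = nxt
    return cand[0]
```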

  8. An O(log log n) time algorithm
  • Instead of a binary tree, we use a more complex tree. Assume that n = 2^(2^k).
  • The root of the tree has 2^(2^(k-1)) = √n children.
  • Each node at the i-th level has 2^(2^(k-i-1)) children, for 1 ≤ i ≤ k-1.
  • Each node at level k has two children.

  9. An O(log log n) time algorithm
  Some Properties
  • The depth of the tree is O(log log n), since n = 2^(2^k) gives k = log log n.
  • The number of nodes at the i-th level is 2^(2^k - 2^(k-i)). Prove this by induction.
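The induction can be checked numerically. The sketch below (our own helper, assuming the construction just described) builds the level sizes from the children counts and compares them with the closed form 2^(2^k - 2^(k-i)).

```python
def level_sizes(k):
    """Node counts per level of the doubly-logarithmic tree with
    n = 2^(2^k) leaves (construction assumed from the lecture)."""
    n = 2 ** (2 ** k)
    nodes = [1]                               # the root, at level 0
    for i in range(k):                        # levels 0 .. k-1
        children = 2 ** (2 ** (k - i - 1))    # 2^(2^(k-i-1)) children each
        nodes.append(nodes[-1] * children)
    nodes.append(nodes[-1] * 2)               # level-k nodes have two children
    return n, nodes

n, nodes = level_sizes(3)                     # k = 3, so n = 256
assert nodes[-1] == n                         # the n elements sit at the leaves
for i in range(4):                            # closed form holds for levels 0..k
    assert nodes[i] == 2 ** (2 ** 3 - 2 ** (3 - i))
```

For k = 3 the level sizes come out as 1, 16, 64, 128, 256, matching the closed form at every level.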

  10. An O(log log n) time algorithm
  The Algorithm
  • The algorithm proceeds level by level, starting from the leaves.
  • At every level, we compute the maximum of all the children of each internal node by the O(1) time algorithm.
  • The time complexity is O(log log n), since the depth of the tree is O(log log n).
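An equivalent recursive view: split the input into about √n groups of about √n elements, find each group's maximum (one level deeper in the tree), and combine the group maxima with the O(1) time algorithm. The recursion depth satisfies T(n) = T(√n) + O(1) = O(log log n). A sequential sketch under these assumptions (`dlog_max` is our own name; Python's built-in `max` stands in for the O(1) CRCW combining step):

```python
import math

def dlog_max(A):
    """Sketch of the doubly-logarithmic-tree maximum.

    Groups of size ~sqrt(n) are reduced recursively; the ~sqrt(n)
    group maxima would then be combined by the O(1)-time CRCW
    algorithm, modelled here by Python's max().
    """
    n = len(A)
    if n <= 2:
        return max(A)
    g = max(2, math.isqrt(n))                 # group size ~ sqrt(n)
    maxima = [dlog_max(A[i:i + g]) for i in range(0, n, g)]
    return max(maxima)                        # stand-in for the O(1) CRCW step
```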

  11. An O(log log n) time algorithm
  Total Work:
  • Recall that the O(1) time algorithm needs O(p^2) work for p elements.
  • Each node at the i-th level has 2^(2^(k-i-1)) children.
  • So the total work for each node at the i-th level is O((2^(2^(k-i-1)))^2) = O(2^(2^(k-i))).

  12. An O(log log n) time algorithm
  Total Work:
  • There are 2^(2^k - 2^(k-i)) nodes at the i-th level. Hence the total work for the i-th level is:
  O(2^(2^k - 2^(k-i)) · 2^(2^(k-i))) = O(2^(2^k)) = O(n).
  • For O(log log n) levels, the total work is O(n log log n). This is suboptimal.

  13. Accelerated cascading
  • The first algorithm, which is based on a binary tree, is optimal but slow.
  • The second algorithm is suboptimal, but very fast.
  • We combine these two algorithms through the accelerated cascading strategy.

  14. Accelerated cascading
  • We start with the optimal algorithm until the size of the problem is reduced to a certain value.
  • Then we use the suboptimal but very fast algorithm.

  15. Accelerated cascading
  Phase 1:
  • We apply the binary tree algorithm, starting from the leaves, for the first log log log n levels.
  • The number of candidates reduces to n / 2^(log log log n) = n / log log n.
  • The total work done so far is O(n) and the total time is O(log log log n).

  16. Accelerated cascading
  Phase 2:
  • In this phase, we use the fast algorithm on the remaining n / log log n candidates.
  • The total work is O((n / log log n) · log log n) = O(n).
  • The total time is O(log log log n) + O(log log n) = O(log log n).
  • Theorem: The maximum of n elements can be computed in O(log log n) time and O(n) work on the Common CRCW PRAM.
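The two phases can be sketched together. In this sequential simulation (`cascaded_max` is our own name), Phase 1 runs about log log log n pairwise rounds of the binary-tree algorithm, and Python's `max` then stands in for the fast O(log log n)-time doubly-logarithmic algorithm of Phase 2.

```python
import math

def cascaded_max(A):
    """Sequential sketch of accelerated cascading for maximum.

    Phase 1: ~log log log n pairwise rounds shrink the candidate set
    to roughly n / log log n elements, doing O(n) work in total.
    Phase 2: the fast doubly-logarithmic algorithm (modelled here by
    Python's max) finishes on the survivors.
    """
    cand = list(A)
    x = float(len(cand))
    for _ in range(3):                    # x becomes ~log log log n
        x = math.log2(max(2.0, x))        # clamp so the log stays defined
    rounds = math.ceil(x)
    for _ in range(rounds):               # Phase 1: binary-tree rounds
        if len(cand) == 1:
            break
        nxt = [max(cand[i], cand[i + 1])
               for i in range(0, len(cand) - 1, 2)]
        if len(cand) % 2 == 1:
            nxt.append(cand[-1])          # odd element passes through
        cand = nxt
    return max(cand)                      # Phase 2: fast algorithm
```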
