
Parallel Algorithms



Presentation Transcript


  1. Parallel Algorithms

  2. Parallel Models • Hypercube • Butterfly • Fully Connected • Other Networks • Shared Memory vs. Distributed Memory • SIMD vs. MIMD

  3. The PRAM Model • Parallel Random Access Machine • All processors act in lock-step • Number of processors is not limited • All processors have local memory • One global memory accessible to all processors • Processors must explicitly read from and write to global memory

  4. A PRAM Algorithm • Every processor knows its own index (usually indicated by variable i) • Vector Sum:
       Read M[i] into x;
       Read M[i+n] into y;
       x := x + y;
       Write x into M[i];
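The vector-sum step can be tried out with a small sequential simulation. This is only a sketch under assumptions not in the slides: indices are 0-based, global memory is a Python list, and the hypothetical function `pram_vector_sum` emulates lock-step by letting every processor finish reading before any processor writes.

```python
def pram_vector_sum(M, n):
    """Simulate one lock-step round with n processors (0-based indices assumed)."""
    # Phase 1: every processor i reads its two operands from global memory.
    x = [M[i] for i in range(n)]
    y = [M[i + n] for i in range(n)]
    # Phase 2: local computation in each processor.
    x = [a + b for a, b in zip(x, y)]
    # Phase 3: every processor writes its result back to global memory.
    for i in range(n):
        M[i] = x[i]
    return M


memory = [1, 2, 3, 10, 20, 30]
pram_vector_sum(memory, 3)
print(memory[:3])  # [11, 22, 33]
```

The whole vector is added in a constant number of steps because each of the n processors handles exactly one element.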

  5. Binary Fan-In
       Read M[i] into Largest;
       Write M[i] into M[i+n];
       Delta := 1;
       For k := 1 to ⌈lg n⌉
         Read M[i+Delta] into x;
         Largest := Maximum(x,Largest);
         Write Largest into M[i];
         Delta := Delta * 2;
       End For
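A sequential simulation of the fan-in loop shows why ⌈lg n⌉ rounds are enough. The sketch below assumes 0-based indices and a non-empty input; `fan_in_max` is a hypothetical name, and the per-round lists stand in for the processors' local registers.

```python
def fan_in_max(values):
    """Simulate binary fan-in for the maximum (assumes a non-empty list)."""
    n = len(values)
    M = list(values) + list(values)   # "Write M[i] into M[i+n]" pads the upper half
    largest = list(values)            # local register Largest of each processor
    delta = 1
    for _ in range((n - 1).bit_length()):       # equals ceil(lg n) rounds
        xs = [M[i + delta] for i in range(n)]   # all reads happen before any write
        largest = [max(x, cur) for x, cur in zip(xs, largest)]
        M[:n] = largest                         # all processors write M[i]
        delta *= 2
    return M[0]


print(fan_in_max([3, 9, 2, 7, 5]))  # 9
```

After round k, cell M[i] covers a window of up to 2^k consecutive elements starting at i, so the first cell covers the whole input once 2^k ≥ n.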

  6. Parallel Addition
       Read M[i] into Total;
       Write 0 into M[i+n];
       Delta := 1;
       For k := 1 to ⌈lg n⌉
         Read M[i+Delta] into x;
         Total := x + Total;
         Write Total into M[i];
         Delta := Delta * 2;
       End For
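The same loop with addition can be simulated to see what each cell holds at the end. In this sketch (0-based indices, hypothetical name `fan_in_sum`), the zero padding makes reads past the end of the input harmless, and every cell M[i] finishes with the sum of the elements from position i onward, so the first cell holds the grand total.

```python
def fan_in_sum(values):
    """Simulate parallel addition; returns the final contents of M[0..n-1]."""
    n = len(values)
    M = list(values) + [0] * n        # "Write 0 into M[i+n]"
    total = list(values)              # local register Total of each processor
    delta = 1
    for _ in range((n - 1).bit_length()):       # equals ceil(lg n) rounds
        xs = [M[i + delta] for i in range(n)]   # lock-step: read phase
        total = [x + t for x, t in zip(xs, total)]
        M[:n] = total                           # lock-step: write phase
        delta *= 2
    return M[:n]


print(fan_in_sum([1, 2, 3, 4, 5]))  # [15, 14, 12, 9, 5]
```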

  7. Pointer Jumping
       Read M[i] into Total;
       For k := 1 to ⌈lg n⌉
         Read Next[i] into Ptr;
         If Ptr ≠ 0 Then
           Read M[Ptr] into x;
           Total := Total + x;
           Write Total into M[i];
           Read Next[Ptr] into NewPtr;
           Write NewPtr into Next[i];
         End If
       End For

  8. Initialization of Next[i]
       If i = n Then
         Write 0 into Next[i];
       Else
         Write i+1 into Next[i];
       End If
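Pointer jumping and the initialization of Next can be combined in one small simulation. The sketch keeps the slides' 1-based indexing with 0 as the null pointer; the function name `pointer_jumping_sums` and the double-buffered copies (used to emulate lock-step reads) are assumptions of this example.

```python
def pointer_jumping_sums(values):
    """Simulate pointer jumping on a linked list stored in arrays (1-based)."""
    n = len(values)
    M = [0] + list(values)                 # M[0] is unused; 0 means "no successor"
    nxt = [0] * (n + 1)
    for i in range(1, n + 1):              # initialization of Next[i]
        nxt[i] = 0 if i == n else i + 1
    total = M[:]                           # local register Total of each processor
    for _ in range((n - 1).bit_length()):  # equals ceil(lg n) rounds for n >= 1
        new_M, new_nxt = M[:], nxt[:]      # writes go to copies: reads see old values
        for i in range(1, n + 1):
            ptr = nxt[i]
            if ptr != 0:
                total[i] += M[ptr]
                new_M[i] = total[i]
                new_nxt[i] = nxt[ptr]      # jump over the successor
        M, nxt = new_M, new_nxt
    return M[1:]


print(pointer_jumping_sums([1, 2, 3, 4, 5]))  # [15, 14, 12, 9, 5]
```

Each round doubles the distance a pointer spans, so every list element reaches the end of the list within ⌈lg n⌉ rounds, and M[i] ends up holding the sum from element i to the end.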

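  9. Calculate Node Depth 1 • If there is a left child • [Diagram: node cells labeled 1, -1, 0; arrows labeled To “1” of Left Child and From “-1” of Left Child]

  10. Calculate Node Depth 2 • If there is no left child • [Diagram: node cells labeled 1, -1, 0]

  11. Calculate Node Depth 3 • If there is a right child • [Diagram: node cells labeled 1, -1, 0; arrows labeled From “-1” of Right Child and To “1” of Right Child]

  12. Calculate Node Depth 4 • If there is no right child • [Diagram: node cells labeled 1, -1, 0]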
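The four node-depth slides appear to describe the standard Euler-tour technique, although the diagrams did not survive in the transcript. The sketch below is therefore an assumption-heavy reconstruction, not the author's exact method: each node gets three cells valued 1, 0 and -1; cell 1 links to cell 1 of the left child (or to the node's own cell 0), cell 0 links to cell 1 of the right child (or to the node's own cell -1), and cell -1 links back to the parent's cell 0 or -1 depending on whether the node is a left or right child. A running sum along this list, read at each node's -1 cell, gives the depth. The sequential walk stands in for a parallel prefix sum by pointer jumping, and all names (`node_depths`, `left`, `right`) are hypothetical.

```python
def node_depths(left, right, root):
    """left/right map each node to its child or None; returns node -> depth."""
    side = {}                                   # child -> (parent, "L" or "R")
    for v in left:
        if left[v] is not None:
            side[left[v]] = (v, "L")
        if right[v] is not None:
            side[right[v]] = (v, "R")

    succ = {}                                   # linked list over (node, value) cells
    for v in left:
        succ[(v, 1)] = (left[v], 1) if left[v] is not None else (v, 0)
        succ[(v, 0)] = (right[v], 1) if right[v] is not None else (v, -1)
        if v in side:
            parent, which = side[v]
            succ[(v, -1)] = (parent, 0) if which == "L" else (parent, -1)
        else:
            succ[(v, -1)] = None                # the root's -1 cell ends the list

    depth, running, cell = {}, 0, (root, 1)
    while cell is not None:                     # sequential stand-in for prefix sums
        running += cell[1]
        if cell[1] == -1:
            depth[cell[0]] = running            # depth is read at the -1 cell
        cell = succ[cell]
    return depth


left = {"a": "b", "b": None, "c": None, "d": None}
right = {"a": "c", "b": "d", "c": None, "d": None}
print(node_depths(left, right, "a"))  # {'d': 2, 'b': 1, 'c': 1, 'a': 0}
```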

  13. Concurrent Reads & Writes • EREW - Exclusive Read, Exclusive Write • CREW - Concurrent Read, Exclusive Write • CRCW - Concurrent Read, Concurrent Write • Concurrent writes need a resolution rule, for example: • Common: all concurrent writes must write the same value • Priority: the highest-priority processor wins the contest • CREW is more powerful than EREW • CRCW is more powerful than CREW

  14. Finding Max • Square array of processors indexed by i,j
       Write True into R[i];
       Read M[i] into x;
       Read M[j] into y;
       If x < y Then
         Write False into R[i];
       Else If y < x Then
         Write False into R[j];
       End If
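The constant-time maximum can be illustrated by simulating the n × n processors with two nested loops. This is a sketch under stated assumptions: indices are 0-based, the input is non-empty, and `crcw_max` is a hypothetical name. On a real CRCW PRAM all the comparisons would happen in the same step; every concurrent write to a given R cell writes the same value (False), so the "all writers agree" rule is satisfied.

```python
def crcw_max(M):
    """Simulate the CRCW maximum with one processor per pair (i, j)."""
    n = len(M)
    R = [True] * n                    # "Write True into R[i]"
    for i in range(n):                # the loops stand in for n*n parallel processors
        for j in range(n):
            x, y = M[i], M[j]
            if x < y:
                R[i] = False          # M[i] lost a comparison: not the maximum
            elif y < x:
                R[j] = False          # M[j] lost a comparison: not the maximum
    return M[R.index(True)]           # any cell still marked True holds the maximum


print(crcw_max([3, 9, 2, 9, 5]))  # 9
```

The speed comes at the cost of n² processors, compared with n for the fan-in version.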

  15. CRCW vs. CREW • CRCW Max runs in constant time • CREW Max runs in lg n time • With p processors, CRCW can be at most a factor of lg p faster than EREW

  16. EREW vs. CREW • Finding Roots by Shortcutting Pointers • CREW Runs in lg lg n Time • EREW Runs in lg n Time
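Root finding by pointer shortcutting can be sketched as follows. The representation is an assumption of this example: the forest is a list of parent indices in which a root points to itself, and the hypothetical function `find_roots` repeats the shortcut step until nothing changes, which is a convenience of the simulation rather than a statement about the round count. Several nodes may read the same parent's pointer in one round, which is where concurrent reads come in.

```python
def find_roots(parent):
    """Replace each node's pointer by its grandparent's until all point at roots."""
    root = list(parent)
    while True:
        # One lock-step round: every node i reads the pointer of the node it points to.
        shortcut = [root[root[i]] for i in range(len(root))]
        if shortcut == root:
            return root
        root = shortcut


# Two trees: 3 -> 2 -> 1 -> 0 (root 0) and 5 -> 4 (root 4).
print(find_roots([0, 0, 1, 2, 4, 4]))  # [0, 0, 0, 0, 4, 4]
```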

  17. Optimal Parallel Algorithms • NC -- The class of algorithms that run in Θ(log^m n) time using Θ(n^k) processors • General Boolean functions cannot be computed any faster than Θ(lg n) • Θ(lg n) is optimal for computing the sum of n integers
