Introduction to CUDA Programming: Scan Algorithm Explained. Andreas Moshovos, Winter 2009.
Reading
• You are strongly encouraged to read the following, as it contains a more formal treatment of the algorithm, plus an overview of various applications of scan.
• Guy E. Blelloch. “Prefix Sums and Their Applications”. In John H. Reif (Ed.), Synthesis of Parallel Algorithms, Morgan Kaufmann, 1990. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/scandal/public/papers/CMU-CS-90-190.html
Two phases
• Up-Sweep
  • Essentially a reduction
  • Produces many partial results
• Down-Sweep
  • Propagates the partial results to all relevant elements
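The two phases can be sketched as a sequential Python model. This is only an illustration, not the CUDA kernel: the function name `exclusive_scan` is hypothetical, the input length is assumed to be a non-empty power of two, and each inner `for` loop stands in for work that the GPU threads would do in parallel within one step.

```python
def exclusive_scan(values):
    """Sequential model of the two-phase scan (assumes len(values) is a power of two)."""
    data = list(values)
    n = len(data)

    # Up-sweep: a reduction that leaves partial sums in place.
    stride = 1
    while stride < n:
        for right in range(2 * stride - 1, n, 2 * stride):
            data[right] += data[right - stride]
        stride *= 2

    # Down-sweep: clear the root, then push partial sums back down the tree.
    data[n - 1] = 0
    stride = n // 2
    while stride >= 1:
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            # The left child receives the old root; the root receives old left + old root.
            data[left], data[right] = data[right], data[left] + data[right]
        stride //= 2
    return data


print(exclusive_scan([1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]))
```

Each output element is the sum of all input elements strictly to its left, so the first element is always 0.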
Up-Sweep
• Just a reduction:
Input:        1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5
After step 1: 1 3 2 7 6 9 8 10 4 5 5 7 7 16 3 8
After step 2: 1 3 2 10 6 9 8 19 4 5 5 12 7 16 3 24
After step 3: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 36
After step 4: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65
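The sequence of arrays above can be reproduced with a short sequential sketch. The name `up_sweep_steps` is hypothetical, and the input length is assumed to be a power of two; on the GPU, every addition within one step would be done by a different thread.

```python
def up_sweep_steps(values):
    """Return the array after each up-sweep step, starting with the input itself."""
    data = list(values)
    n = len(data)
    snapshots = [list(data)]
    stride = 1
    while stride < n:
        # Each right child of the current level absorbs its left sibling.
        for right in range(2 * stride - 1, n, 2 * stride):
            data[right] += data[right - stride]
        snapshots.append(list(data))
        stride *= 2
    return snapshots


for step, array in enumerate(up_sweep_steps([1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5])):
    print(step, array)
```

For 16 elements this takes log2(16) = 4 steps, and the last element ends up holding the total, 65.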
Up-Sweep
• Now let’s view this as a tree:
Leaves:  1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5
Level 1: 3 7 9 10 5 7 16 8
Level 2: 10 19 12 24
Level 3: 29 36
Root:    65
• Notice that only some of these nodes are left in our array; the rest were partial results that have been overwritten:
1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65
Up-Sweep
• So, this is what’s left
• Nodes without values don’t exist; they were partial results
Leaves:  1 2 6 8 4 5 7 3
Level 1: 3 9 5 16
Level 2: 10 12
Level 3: 29
Root:    65
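One way to see which tree level each array slot ends up holding is to look at the index: the level of slot `i` equals the number of trailing zero bits of `i + 1`. The sketch below groups the up-swept array by that level; the function name `surviving_nodes_by_level` is hypothetical and the bit trick is an observation added here for illustration, not something stated on the slides.

```python
def surviving_nodes_by_level(swept):
    """Group the values of an up-swept array by the tree level they belong to."""
    levels = {}
    for index, value in enumerate(swept):
        position = index + 1
        # Lowest set bit of position; its bit index is the tree level (0 = leaf).
        level = (position & -position).bit_length() - 1
        levels.setdefault(level, []).append(value)
    return levels


swept = [1, 3, 2, 10, 6, 9, 8, 29, 4, 5, 5, 12, 7, 16, 3, 65]
for level, nodes in sorted(surviving_nodes_by_level(swept).items()):
    print(level, nodes)
```

Level 0 holds the eight untouched leaves, and each higher level holds half as many nodes, up to the single root 65.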
Down-Sweep
• For the second phase we need to think of:
  • The edges in reverse
  • The empty nodes as placeholders for partial results
(Figure: the tree of surviving nodes. Leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3: 29; root 65.)
Down-Sweep
• Now let’s view the tree as a collection of n subtrees
• The root of each subtree, where it is still present, contains the reduction of all subtree elements
  • i.e., the sum of all subtree elements
(Figure: the tree of surviving nodes. Leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3: 29; root 65.)
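Down-Sweep
• Let’s focus on the rightmost subtree:
(Figure: the tree of surviving nodes with the rightmost subtree highlighted. Leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3: 29; root 65.)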
Down-Sweep
• Before the last step of the down-sweep phase, the yellow element (the root of this subtree) will contain the sum (57) of all elements to the left of the subtree.
(Figure: left element 3, yellow root element 57.)
• The last step will take the following two actions:
  • 3 + 57 = 60; this goes into the rightmost element
    • This is the sum of all elements including 3 but excluding the rightmost one
  • Overwrite 3 with 57
    • This is the sum of all elements left of 3
Down-Sweep
• In terms of the array stored in memory, the aforementioned actions look like this:
Before:  3 57
After:  57 60
• Where:
  • the dark arrows represent addition
  • the red dotted arrow represents a move
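The move and the addition can be written as one small operation on a (left element, root element) pair. This is a minimal sketch with a hypothetical function name, `down_sweep_pair`; it models a single pair only, not the whole array.

```python
def down_sweep_pair(left_value, root_value):
    """Apply one down-sweep action to a (left element, root element) pair.

    The move:     the left element is overwritten with the old root value.
    The addition: the root is overwritten with old left + old root.
    """
    new_left = root_value
    new_root = left_value + root_value
    return new_left, new_root


print(down_sweep_pair(3, 57))  # (57, 60)
```

Both new values are computed from the old ones, so in an in-place version the old left value has to be read (or saved in a temporary) before it is overwritten.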
Down-Sweep
• Let’s now focus on the rightmost subtree that contains the last four nodes:
  • This will be processed at the step before the subtree we just discussed
(Figure: subtree with leaves 7 and 3, inner node 16, and an empty root.)
Down-Sweep
• Before the second-to-last step of the down-sweep phase, the green element (the root of this subtree) will contain the sum (41) of all elements to the left of the subtree.
(Figure: subtree with leaves 7 and 3, inner node 16, and root 41.)
Down-Sweep
• The actions that will be taken at this step are:
  • 16 + 41 = 57 will be written as the root of the rightmost subtree
    • As we saw before, this is the sum of all elements left of the rightmost subtree
  • 41 will replace 16
    • This is the sum of all elements left of the subtree rooted by 16
(Figure: subtree with leaves 7 and 3, inner node now 41, and root now 57.)
Down-Sweep
• In terms of the array stored in memory, the aforementioned actions look like this:
Before: 7 16 3 41
After:  7 41 3 57
• Where:
  • the dark arrows represent addition
  • the red dotted arrow represents a move
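The same action is applied to every subtree of a given level at once. The sketch below models one whole down-sweep level sequentially; the name `down_sweep_level` and the `stride` parameter (the distance between the two elements of each pair) are assumptions made for this illustration. The "before" array is the state of the running example just before the step discussed above, worked out by hand from the earlier steps.

```python
def down_sweep_level(data, stride):
    """Apply one down-sweep level in place; pairs are (right - stride, right)."""
    n = len(data)
    for right in range(2 * stride - 1, n, 2 * stride):
        left = right - stride
        old_left = data[left]
        data[left] = data[right]              # the move
        data[right] = old_left + data[right]  # the addition
    return data


before = [1, 3, 2, 0, 6, 9, 8, 10, 4, 5, 5, 29, 7, 16, 3, 41]
after = down_sweep_level(list(before), stride=2)
print(before[-4:], "->", after[-4:])  # [7, 16, 3, 41] -> [7, 41, 3, 57]
```

In a CUDA kernel each pair would typically be handled by its own thread, with a barrier between levels; the loop here just visits the pairs one after another.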
Down-Sweep
• Now let’s go a step back and look at the complete right subtree (in green)
(Figure: subtree with leaves 4 5 7 3, level 1 nodes 5 and 16, level 2 node 12, and an empty root.)
Down-Sweep
• Before this step, the root node will contain the sum (29) of all elements of the left subtree
(Figure: subtree with leaves 4 5 7 3, level 1 nodes 5 and 16, level 2 node 12, and root 29.)
Down-Sweep
• As before, we’ll do two things:
  • 29 + 12 = 41, and this becomes the root of the rightmost subtree
    • This should be the sum of all elements to the left of that subtree for the next step (which we saw previously)
  • 29 replaces 12
    • Same reason: 29 is the sum of all elements left of the subtree rooted by what was 12
(Figure: subtree with leaves 4 5 7 3, level 1 nodes 5 and 16, level 2 node now 29, and root now 41.)
Down-Sweep
• Let’s try to generalize what happens at every step of the down-sweep phase
• Let’s look at step 1:
  • There is only one subtree, shown in purple
(Figure: the tree of surviving nodes. Leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3: 29; root 65.)
Down-Sweep
• Before we process this tree as described before, the root node must contain the sum of all elements to the left of the tree
  • There are no such elements
  • Hence the root must be 0
(Figure: the same tree with the root 65 replaced by 0. Leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3: 29; root 0.)
Down-Sweep
• Now repeat the steps we saw before:
  • 29 + 0 = 29, and this becomes the root of the right subtree
  • 29 gets replaced by 0
(Figure: leaves 1 2 6 8 4 5 7 3; level 1: 3 9 5 16; level 2: 10 12; level 3 node now 0 (was 29); root now 29 (was 0).)
Down-Sweep
• In terms of the array stored in memory, the aforementioned actions look like this:
Before: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 0
After:  1 3 2 10 6 9 8 0 4 5 5 12 7 16 3 29
• Where:
  • the dark arrows represent addition
  • the red dotted arrow represents a move
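Putting the generalized step together, the whole down-sweep can be sketched as below. It starts from the array left by the up-sweep, clears the root to 0, and records the array after every level. The name `down_sweep_steps` is hypothetical, and this is a sequential model that assumes a power-of-two length, not the CUDA implementation.

```python
def down_sweep_steps(swept):
    """Return the array after each down-sweep level, given the up-swept array."""
    data = list(swept)
    n = len(data)
    data[n - 1] = 0  # nothing lies to the left of the whole tree
    snapshots = []
    stride = n // 2
    while stride >= 1:
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            # Move the root down to the left; add old left and old root into the root.
            data[left], data[right] = data[right], data[left] + data[right]
        snapshots.append(list(data))
        stride //= 2
    return snapshots


swept = [1, 3, 2, 10, 6, 9, 8, 29, 4, 5, 5, 12, 7, 16, 3, 65]
for array in down_sweep_steps(swept):
    print(array)
```

The first snapshot matches the "after" array of step 1 above, and the last one is the exclusive scan of the original input.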