
Recursion Unrolling for Divide and Conquer Programs

Radu Rugina and Martin Rinard. Presented by: Cristian Petrescu-Prahova.


Presentation Transcript


  1. Recursion Unrolling for Divide and Conquer Programs
     Radu Rugina and Martin Rinard
     Presented by: Cristian Petrescu-Prahova

  2. Divide and Conquer
     • Idea:
       • Divide the problem into smaller subproblems and solve each in turn
       • Use recursion as the primary control structure
       • The base case computation terminates the recursion when a small enough size is reached
       • Combine the results to generate the solution of the original problem
     • Interesting properties:
       • Lots of inherent parallelism; concurrency is generated naturally by the recursion
       • Good cache performance; a natural fit for cache hierarchies
     • In practice:
       • Potentially too much time is spent in the divide/combine phases
       • Increasing the size of the base case alleviates the problem
       • But the simplest and least error-prone coding style reduces the problem to a minimum size (typically one)
     • Solution: recursion unrolling
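
The remark that a larger base case alleviates the divide/combine overhead can be illustrated with a hand-written cutoff. This is only a sketch of the manual alternative that recursion unrolling aims to automate: the name `dcIncCutoff` and the value of `BASE_SIZE` are hypothetical, and the code assumes `n` is a positive power of two, as the halving scheme in the slides implicitly does.

```c
#include <assert.h>

/* Hypothetical cutoff; the slides do not prescribe a value. */
enum { BASE_SIZE = 4 };

/* Divide-and-conquer array increment with a hand-enlarged base case.
   Assumption: n is a positive power of two, so halving never drops elements. */
void dcIncCutoff(int *p, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n <= BASE_SIZE) {
        /* Enlarged base case: handle up to BASE_SIZE elements without recursing. */
        for (int i = 0; i < n; ++i) {
            p[i] += 1;
        }
    } else {
        /* Divide: recurse on the two halves. */
        dcIncCutoff(p, n / 2);
        dcIncCutoff(p + n / 2, n / 2);
    }
}
```

Calling `dcIncCutoff(a, 16)` on a 16-element array increments every element once while recursing only down to blocks of four elements.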

  3. Example: Divide and Conquer Array Increment

         void dcInc(int *p, int n) {
             if (n == 1) {
                 *p += 1;                  /* base case */
             } else {
                 dcInc(p, n/2);            /* divide */
                 dcInc(p + n/2, n/2);
             }
         }
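
To make the overhead of the size-one base case concrete, the following sketch instruments `dcInc` with a call counter. The counter `calls` and the helpers `dcIncCounted` and `countCalls` are hypothetical additions for illustration, and `n` is assumed to be a power of two no larger than 1024.

```c
#include <assert.h>

/* Hypothetical instrumentation counter; not part of the slides. */
static long calls = 0;

/* Same logic as dcInc, plus a counter incremented on every invocation.
   Assumption: n is a positive power of two. */
void dcIncCounted(int *p, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    ++calls;
    if (n == 1) {
        *p += 1;
    } else {
        dcIncCounted(p, n / 2);
        dcIncCounted(p + n / 2, n / 2);
    }
}

/* Runs dcIncCounted on a scratch array of n elements (n <= 1024)
   and returns the number of procedure invocations it took. */
long countCalls(int n) {
    int buf[1024] = {0};
    assert(n <= 1024);
    calls = 0;
    dcIncCounted(buf, n);
    return calls;
}
```

For a power-of-two `n` the call tree is a full binary tree with `n` leaves, so the procedure is invoked `2n - 1` times to perform `n` increments.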

  4. Inlining Recursive Calls

         void dcIncI(int *p, int n) {
             if (n == 1) {
                 *p += 1;                              /* base case */
             } else {
                 /* inlined body of the first recursive call */
                 if (n/2 == 1) {
                     *p += 1;
                 } else {
                     dcIncI(p, n/2/2);
                     dcIncI(p + n/2/2, n/2/2);
                 }
                 /* inlined body of the second recursive call */
                 if (n/2 == 1) {
                     *(p + n/2) += 1;
                 } else {
                     dcIncI(p + n/2, n/2/2);
                     dcIncI(p + n/2 + n/2/2, n/2/2);
                 }
             }
         }

  5. Conditional Fusion

     Before fusion, the two inlined conditionals test the same condition:

         void dcIncI(int *p, int n) {
             if (n == 1) {
                 *p += 1;
             } else {
                 if (n/2 == 1) {
                     *p += 1;
                 } else {
                     dcIncI(p, n/2/2);
                     dcIncI(p + n/2/2, n/2/2);
                 }
                 if (n/2 == 1) {
                     *(p + n/2) += 1;
                 } else {
                     dcIncI(p + n/2, n/2/2);
                     dcIncI(p + n/2 + n/2/2, n/2/2);
                 }
             }
         }

     After fusion, they are merged into a single conditional:

         void dcIncF(int *p, int n) {
             if (n == 1) {
                 *p += 1;                              /* base case */
             } else {
                 if (n/2 == 1) {
                     *p += 1;                          /* fused base case */
                     *(p + n/2) += 1;
                 } else {
                     dcIncF(p, n/2/2);                 /* fused divide */
                     dcIncF(p + n/2/2, n/2/2);
                     dcIncF(p + n/2, n/2/2);
                     dcIncF(p + n/2 + n/2/2, n/2/2);
                 }
             }
         }

  6. Reroll Second Unrolling Iteration

     After the second unrolling iteration:

         void dcInc2(int *p, int n) {
             if (n == 1) {
                 *p += 1;
             } else {
                 if (n/2 == 1) {
                     *p += 1;
                     *(p + n/2) += 1;
                 } else {
                     if (n/2/2 == 1) {
                         *p += 1;
                         *(p + n/2/2) += 1;
                         *(p + n/2) += 1;
                         *(p + n/2 + n/2/2) += 1;
                     } else {
                         dcInc2(p, n/2/2/2);
                         dcInc2(p + n/2/2/2, n/2/2/2);
                         dcInc2(p + n/2/2, n/2/2/2);
                         dcInc2(p + n/2/2 + n/2/2/2, n/2/2/2);
                         dcInc2(p + n/2, n/2/2/2);
                         dcInc2(p + n/2 + n/2/2/2, n/2/2/2);
                         dcInc2(p + n/2 + n/2/2, n/2/2/2);
                         dcInc2(p + n/2 + n/2/2 + n/2/2/2, n/2/2/2);
                     }
                 }
             }
         }

     After rerolling, the recursive block is replaced by that of the original procedure:

         void dcIncR(int *p, int n) {
             if (n == 1) {
                 *p += 1;
             } else {
                 if (n/2 == 1) {
                     *p += 1;
                     *(p + n/2) += 1;
                 } else {
                     if (n/2/2 == 1) {
                         *p += 1;
                         *(p + n/2/2) += 1;
                         *(p + n/2) += 1;
                         *(p + n/2 + n/2/2) += 1;
                     } else {
                         dcIncR(p, n/2);
                         dcIncR(p + n/2, n/2);
                     }
                 }
             }
         }

     We need rerolling to ensure that the largest unrolled base case is always executed.
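
The effect of rerolling can be checked with a small experiment. The sketch below instruments the twice-unrolled and the rerolled procedures with hypothetical call counters (`calls2`, `callsR`) and uses the hypothetical names `dcInc2Counted` and `dcIncRCounted`. It assumes the unrolled version recurses into itself and that `n` is a positive power of two.

```c
#include <assert.h>

/* Hypothetical instrumentation counters; not part of the slides. */
static long calls2 = 0;
static long callsR = 0;

/* Twice-unrolled version WITHOUT rerolling: recursion shrinks n by a factor of 8. */
void dcInc2Counted(int *p, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    ++calls2;
    if (n == 1) {
        *p += 1;
    } else if (n/2 == 1) {
        *p += 1;
        *(p + n/2) += 1;
    } else if (n/2/2 == 1) {
        *p += 1;
        *(p + n/2/2) += 1;
        *(p + n/2) += 1;
        *(p + n/2 + n/2/2) += 1;
    } else {
        int e = n/2/2/2;  /* size of each of the eight sub-blocks */
        for (int i = 0; i < 8; ++i) {
            dcInc2Counted(p + i * e, e);  /* same eight calls as on the slide, in order */
        }
    }
}

/* Rerolled version: recursion shrinks n by a factor of 2 again. */
void dcIncRCounted(int *p, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    ++callsR;
    if (n == 1) {
        *p += 1;
    } else if (n/2 == 1) {
        *p += 1;
        *(p + n/2) += 1;
    } else if (n/2/2 == 1) {
        *p += 1;
        *(p + n/2/2) += 1;
        *(p + n/2) += 1;
        *(p + n/2 + n/2/2) += 1;
    } else {
        dcIncRCounted(p, n/2);
        dcIncRCounted(p + n/2, n/2);
    }
}
```

For `n == 8` the unrolled version jumps straight from size 8 to size 1, so it makes nine calls and only ever runs the smallest base case. The rerolled version halves to size 4, makes three calls, and runs the four-element base case on each half.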

  7. Algorithm

         Algorithm RecursionUnrolling(Proc f, Int m)
             f_unroll[0] = clone(f);
             for (i = 1; i <= m; ++i) {
                 f_unroll[i] = RecursionInline(f_unroll[i-1], f);
                 f_unroll[i] = ConditionalFusion(f_unroll[i]);
             }
             f_reroll[m] = RecursionRerolling(f_unroll[m], f);
             return f_reroll[m];

  8. Implementation details
     • Recursion unrolling
       • Standard procedure inlining
       • Increases the code size exponentially, so it must be used with care
     • Conditional fusion
       • Bottom-up traversal of the HTG plus conditional matching
     • Recursion rerolling
       • Replaces the recursion block of the unrolled procedure with the recursion block of the rolled procedure if the conditional sequence of the unrolled procedure implies that of the rolled procedure
     • Simple transformations!
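
The exponential growth in code size can be estimated with a short calculation. The sketch below assumes every recursive call site is inlined in every unrolling iteration, as in the `dcInc` example, and that the base case is a single statement. The function names are hypothetical.

```c
#include <assert.h>

/* Integer power k^e for small non-negative e. */
static long ipow(int k, int e) {
    long r = 1;
    for (int i = 0; i < e; ++i) {
        r *= k;
    }
    return r;
}

/* Recursive call sites in the unrolled procedure after m unrolling
   iterations, for a procedure with k recursive calls (before rerolling). */
long unrolledCallSites(int k, int m) {
    assert(k >= 1 && m >= 0);
    return ipow(k, m + 1);
}

/* Copies of the original base-case statement in the largest unrolled base case. */
long largestBaseCaseCopies(int k, int m) {
    assert(k >= 1 && m >= 0);
    return ipow(k, m);
}
```

With `k = 2` and `m = 2` this gives the eight call sites and the four-statement base case of `dcInc2`. Under the same assumptions, a procedure with eight recursive calls would reach 512 call sites after two iterations.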

  9. Experiments
     • Programs:
       • Mul: divide and conquer matrix multiplication
         • 1 recursive procedure with 8 recursive calls
         • Base case size: 1 element
       • LU: divide and conquer LU decomposition
         • 4 mutually recursive procedures; the main procedure has 8 recursive calls
         • Base case size: 1 element
     • Implementation:
       • C-to-C transformations in SUIF
     • Comparison:
       • Handcoded divide and conquer programs from the Cilk benchmark set (designed for thread parallelization)

  10. Results

  11. Conclusion
      • Recursion unrolling is analogous to loop unrolling
      • Basic recursion unrolling reduces procedure call overhead
      • Extra optimizations:
        • Conditional fusion: simplifies the control flow
        • Recursion rerolling: ensures the biggest unrolled base case is always executed
      • The performance of the optimized programs is close to that of the handcoded programs
