
GC16/3011 Functional Programming, Lecture 22: The Four-Stroke Reduction Engine



  1. GC16/3011 Functional Programming, Lecture 22: The Four-Stroke Reduction Engine

  2. Contents
  • Motivation
  • Model for Parallel Graph Reduction
  • Parallelism and Tasks
  • FSRE representation, synchronisation and scheduling
  • Two-stroke reduction
  • Four-stroke reduction
  • Summary

  3. Motivation
  • Previously: abstract/theoretical
  • This lecture: a real graph reducer
  • Details of the Four-Stroke Reduction Engine

  4. Model for Parallel Graph Reduction
  [Diagram: several agents, each with access to a shared heap holding the program graph (e.g. 4 + 3 built from @ application nodes) and to a shared task pool.]
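A minimal sketch of this model in Haskell, purely for illustration: the Node, Task and TaskPool names are invented here, and IORefs stand in for cells of the shared heap.

    import Data.IORef (IORef, newIORef)

    -- Graph nodes live in a heap shared by all agents.
    data Node
      = App (IORef Node) (IORef Node)   -- application node (@): function applied to argument
      | Num Int                         -- constant
      | Prim String                     -- primitive such as (+)
      | Ind (IORef Node)                -- indirection to an already-computed result

    -- A task names a subgraph to be reduced to weak head normal form;
    -- the task pool is shared, so any idle agent may pick up the next task.
    type Task     = IORef Node
    type TaskPool = IORef [Task]

    -- The expression from the diagram, 4 + 3, built from @ nodes as ((+ 4) 3).
    exampleGraph :: IO Task
    exampleGraph = do
      plus  <- newIORef (Prim "+")
      four  <- newIORef (Num 4)
      three <- newIORef (Num 3)
      inner <- newIORef (App plus four)
      newIORef (App inner three)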

  5. Each task:
  • has access to any part of the graph
  • performs reductions in normal order
  • reduces a subgraph to (weak head) normal form
    • overwrites the root node of the redex (with an indirection to the result) as an indivisible operation
    • then simply “dies”
  • may anticipate the need for the value of a subgraph
    • places a task for that subgraph in the task pool (“sparking”)
  • is executed by an agent (a physical processor)
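A sketch of a task's lifecycle using the Node, Task and TaskPool types from the sketch above; reduceToWHNF is a stand-in for the reducer described on later slides, and atomicModifyIORef' merely models the indivisible overwrite (the real FSRE achieves indivisibility differently).

    import Data.IORef (atomicModifyIORef')

    -- Sparking: place a task for a subgraph in the shared pool.  The parent
    -- does not wait; it merely anticipates that the value will be needed.
    spark :: TaskPool -> Task -> IO ()
    spark pool sub = atomicModifyIORef' pool (\ts -> (sub : ts, ()))

    -- One task: reduce its subgraph to WHNF in normal order, overwrite the
    -- root of the redex with an indirection to the result in one indivisible
    -- step, then simply die.
    runTask :: TaskPool -> Task -> IO ()
    runTask pool root = do
      result <- reduceToWHNF pool root
      atomicModifyIORef' root (\_ -> (Ind result, ()))

    -- Stand-in for the reducer itself (the two-/four-stroke cycles below).
    reduceToWHNF :: TaskPool -> Task -> IO Task
    reduceToWHNF = undefined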

  6. Parallelism and tasks
  • Sparking could be conservative or speculative
  • Speculative sparking needs careful management
  • The FSRE uses conservative sparking
  • For an application (e1 e2), e1 may not yet be evaluated
  • So e1 could be evaluated in parallel with e2
  • Extends to many arguments evaluated in parallel
  • But only those we know will be needed
  • Parallelism annotations advise when and what to spark (see the sketch below)
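For comparison only (this is not the FSRE's own mechanism): GHC-style par/pseq annotations from the "parallel" package play the same advisory role. Here they are used conservatively, sparking only a value that is definitely needed.

    import Control.Parallel (par, pseq)   -- from the "parallel" package

    -- Both arguments of (+) are definitely needed, so sparking nf is safe,
    -- conservative advice that the runtime may follow or ignore.
    sumBoth :: Integer -> Integer -> Integer
    sumBoth n m = nf `par` (mf `pseq` (nf + mf))
      where
        nf = fib n
        mf = fib m

    fib :: Integer -> Integer
    fib k = if k < 2 then k else fib (k - 1) + fib (k - 2)

    main :: IO ()
    main = print (sumBoth 30 31)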

  7. Want to detect parallelism in three cases:
  • f x y = x + y
    • f will always evaluate x and y
    • Could annotate the function f, or the application nodes ((f x) y)
  • ((if e1 f g) e2)
    • Don’t know which function is used until runtime
    • So annotate the functions
  • f x y = y 3 x
    • f is not strict in x if y doesn’t use x
    • But for the application (f e +) the expression e WILL be used
    • So annotate the application nodes
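The three cases written out in Haskell; the comments restate the slide's reasoning, and the type signatures are only one possible instantiation.

    -- Case 1: strict in both x and y, so annotate f itself or the
    -- application nodes ((f x) y).
    f1 :: Int -> Int -> Int
    f1 x y = x + y

    -- Case 2: which of f or g is applied to e2 is only known at run time,
    -- so the annotation has to live on the functions f and g.
    apply :: Bool -> (Int -> Int) -> (Int -> Int) -> Int -> Int
    apply e1 f g e2 = (if e1 then f else g) e2

    -- Case 3: f3 is strict in x only when its argument y uses x; at a call
    -- site such as (f3 e (+)) the expression e WILL be used, so there the
    -- annotation belongs on the application nodes.
    f3 :: Int -> (Int -> Int -> Int) -> Int
    f3 x y = y 3 x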

  8. FSRE representation
  • A node (or cell) has a tag, a left field and a right field
  • Tags denote application, lambda, constant, parallelism annotations, “paint” (see later), etc.
  • A “task” is two pointers (B and F)
  • Graph traversal is achieved using pointer reversal (no stack required)
  • The current state of a suspended task is held in the graph
  • Reversed pointers are made inaccessible to other tasks (because the nodes are “painted”; see later)
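A sketch of the cell and task representation, refining the earlier Node sketch; the constructor and field names are invented, and the real FSRE packs the tag, the paint mark and both fields into a fixed-size cell.

    import Data.IORef (IORef)

    -- Tags distinguish the kinds of cell; “painted” variants of each tag mark
    -- cells a task is currently working on.
    data Tag = TApp | TLam | TConst | TPar   -- application, lambda, constant, parallelism annotation
      deriving (Eq, Show)

    data Cell = Cell
      { tag     :: Tag
      , painted :: Bool          -- a painted cell is inaccessible to other tasks
      , left    :: Field
      , right   :: Field
      }

    data Field = Ptr (IORef Cell) | Datum Int

    -- A task is just two pointers: F points forward into the graph, B points
    -- back up the pointer-reversed spine, so traversal needs no stack and a
    -- suspended task's entire state is held in the graph itself.
    data TaskPtrs = TaskPtrs
      { fPtr :: IORef Cell   -- forward pointer
      , bPtr :: IORef Cell   -- backward pointer
      }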

  9. FSRE synchronisation
  • Two tasks attempt to evaluate a common subgraph?
  • Mutual exclusion is not required, but is desirable to prevent duplicated effort
  [Diagram: two tasks whose expression graphs (built from @, * and + nodes) share a common subgraph.]

  10. FSRE synchronisation (2)
  • As a task traverses the graph, it “paints” all nodes it is working on (special versions of the tags)
  • After working on a section of graph, it “unpaints” the nodes
  • If a task attempts to access a node that has been “painted” by another task, it blocks until the node is unpainted
  • Tasks are blocked and later resumed with no explicit communication between agents or tasks

  11. FSRE synchronisation (3)
  • A task (the parent) sparks a subtask (the child) to evaluate a subgraph
  • Later, the parent accesses the subgraph to get its value. The subgraph might be in one of three states:
    • Already evaluated: the parent uses the value
    • Being evaluated: the subgraph is “painted” and the parent blocks until it is “unpainted”
    • Evaluation not yet started: the parent evaluates the subgraph itself (“painting” the nodes), and the child will later block or die

  12. FSRE synchronisation (4)
  • A task is blocked when it accesses a “painted” node:
    • It is placed on a queue of blocked tasks
    • This queue is attached to the node that caused the block
    • Using a reversed pointer, so no extra memory overhead!
  • When the node is “unpainted”, all tasks in the queue for that node are sent to the task pool
  • Block on unwind, resume on rewind
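A self-contained sketch of the painting and blocking discipline; the names are invented, and here the blocked-task queue is stored explicitly in the mark, whereas the FSRE threads it through the reversed pointer at no extra memory cost.

    import Data.IORef (IORef, atomicModifyIORef')

    -- A cell is either unpainted, or painted and carrying the queue of tasks
    -- that have blocked on it.
    data Mark t = Unpainted | Painted [t]

    -- Try to paint a cell.  If another task already painted it, enqueue
    -- ourselves on the cell and report that we must block; no explicit
    -- message passing between agents or tasks is needed.
    tryPaint :: IORef (Mark t) -> t -> IO Bool
    tryPaint cell me = atomicModifyIORef' cell step
      where
        step Unpainted   = (Painted [], True)          -- we now own the cell
        step (Painted q) = (Painted (me : q), False)   -- block: join the queue

    -- Unpaint a cell and send every task that blocked on it back to the pool.
    unpaint :: IORef (Mark t) -> IORef [t] -> IO ()
    unpaint cell pool = do
      blocked <- atomicModifyIORef' cell release
      atomicModifyIORef' pool (\ts -> (blocked ++ ts, ()))
      where
        release Unpainted   = (Unpainted, [])
        release (Painted q) = (Unpainted, q)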

  13. [Diagram: a spine of @ nodes traversed with pointer reversal; B/F and B’/F’ are task pointer pairs (B = backward, F = forward).]

  14. [Diagram: the same spine with queues (Q) of blocked tasks attached to nodes, alongside the B/F and B’/F’ task pointers.]

  15. FSRE scheduling
  • Too many sparked tasks: the task pool fills up
    • Ignore new sparked tasks!
    • Discard already-sparked tasks!
    • (Parents always check on their children and do the work themselves if the child doesn’t)
  • NB: resumed tasks can’t be ignored or discarded (a resumed task may itself be a parent, and no other task will redo its work)
  • Always schedule resumed tasks first
  • Use LIFO/FIFO switching of the task pool to control the amount of parallelism in the system (LIFO: less, FIFO: more), as sketched below
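A sketch of the scheduling policy with an invented Pool type; the capacity bound, the resumed-first rule and the LIFO/FIFO switch follow the slide, the rest is illustrative.

    data Pool t = Pool
      { resumed  :: [t]     -- resumed tasks: must never be ignored or discarded
      , sparked  :: [t]     -- sparked tasks: safe to drop, the parent will do the work
      , capacity :: Int     -- bound on how many sparked tasks are held
      , fifo     :: Bool    -- False = LIFO (less parallelism), True = FIFO (more)
      }

    -- New sparks may simply be ignored when the pool is full: conservative
    -- sparking guarantees the parent will evaluate the subgraph itself.
    addSpark :: t -> Pool t -> Pool t
    addSpark t p
      | length (sparked p) >= capacity p = p
      | otherwise                        = p { sparked = t : sparked p }

    -- Resumed tasks are always scheduled before sparked ones.
    next :: Pool t -> Maybe (t, Pool t)
    next p =
      case resumed p of
        (t : ts) -> Just (t, p { resumed = ts })
        []       ->
          case sparked p of
            []             -> Nothing
            xs | fifo p    -> Just (last xs, p { sparked = init xs })   -- FIFO: oldest spark first
               | otherwise -> Just (head xs, p { sparked = tail xs })   -- LIFO: newest spark first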

  16. Two-stroke reduction
  • “Inlet”
    • Unwind down the spine to find the leftmost outermost function
    • Use pointer reversal and “paint” the nodes
    • If parallelism annotations are found in application nodes, spark tasks to evaluate those arguments
    • Might block on the way down, so don’t remember arguments
    • If the leftmost outermost function is a lambda (or a primitive with no strict args), use 2-stroke reduction
    • If it is a primitive operator, use 4-stroke reduction
  • “Exhaust”
    • Get parallelism info and number of args
    • Rewind (& unpaint) up the spine to find the root of the redex
    • Overwrite the root with an IND to the result of the reduction
  • Then go to “Inlet” again!

  17. Four-stroke reduction
  • “Inlet”: same as before
  • “Compression”
    • Get parallelism info and number of strict args
    • Rewind (& unpaint) up the spine to the topmost strict argument, sparking strict args on the way up
  • “Power”
    • Unwind (& paint) the spine again, checking the evaluation of all strict args one at a time
  • “Exhaust”: same as before
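The strokes themselves as a small illustrative Haskell enumeration summarising both cycles; the names are taken from the slides, the comments paraphrase them.

    data Stroke
      = Inlet        -- unwind down the spine with pointer reversal, painting
                     -- nodes and sparking annotated arguments
      | Compression  -- rewind/unpaint up to the topmost strict argument,
                     -- sparking strict arguments on the way up
      | Power        -- unwind/paint the spine again, checking each strict
                     -- argument in turn is evaluated
      | Exhaust      -- rewind/unpaint to the root of the redex and overwrite
                     -- it with an indirection (IND) to the result
      deriving (Show)

    -- Lambdas (and primitives with no strict arguments) need only two strokes;
    -- primitive operators with strict arguments need all four.
    twoStroke, fourStroke :: [Stroke]
    twoStroke  = [Inlet, Exhaust]
    fourStroke = [Inlet, Compression, Power, Exhaust]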

  18. Summary
  • Motivation
  • Model for Parallel Graph Reduction
  • Parallelism and Tasks
  • FSRE representation, synchronisation and scheduling
  • Two-stroke reduction
  • Four-stroke reduction
  • Summary
