
Expressing Pipeline Parallelism Using TBB Constructs

A Case Study on What Works and What Doesn't. Eric C. Reed, Nicholas Chen, Ralph E. Johnson.


Presentation Transcript


  1. Expressing Pipeline Parallelism Using TBB Constructs: A Case Study on What Works and What Doesn't. Eric C. Reed, Nicholas Chen, Ralph E. Johnson

  2. Motivation
     • Goal: Identify core programming patterns used in pipeline parallelism
     • Convert “pipeline-ish” serial programs to parallel ones
     • Identifying transformations could lead to automation
     • PARSEC & TBB pipelines
     • REU project focused on just part of the bigger picture
     • Always some “pre-transformation” needed before TBB could be used
     • TBB performed on par with or better than pthreads, making library/framework-based approaches attractive
     • TBB Flow Graph had not yet been released
       • Resolves some problems we found
     • Our work provides empirical evidence for needing more complex constructs than available in TBB pipelines

  3. ferret: Content-based Image Search

  4. ferret: Content-based Image Search
     • Read in image
     • Break image into segments
     • Extract feature vectors from segments
     • Query database with feature vectors to find candidate images
     • Rank candidate images based on similarity
     • Output best-matching images

  5. TBB filter
     • A single stage of the pipeline
     • Represented as a function object
     • Input: void* to output of previous stage
     • Output: void* to input of next stage
     • First/Last stage generates/consumes tokens
     • Serial-in-order, serial-out-of-order, or parallel

         class foo : public tbb::filter {
         public:
             foo() : tbb::filter(tbb::filter::parallel) {}  // mode picked for illustration
             void* operator()(void* inp) { … operate on token … }
         };

  6. TBB pipeline
     • A pipeline is a sequence of filters
     • Specified max number of live tokens
     • Calls first stage to get a new token
     • A NULL pointer signifies no more input

         ReadFilter read; DoFilter work; WriteFilter write;  // filters must outlive the pipeline
         tbb::pipeline pipe;
         pipe.add_filter(read);   // add_filter takes a filter&, not a pointer
         pipe.add_filter(work);
         pipe.add_filter(write);
         pipe.run(10);            // at most 10 live tokens
         pipe.clear();
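The token protocol on the two slides above (a first stage that produces tokens, later stages that map one `void*` to another, and a null pointer that ends the run) can be sketched without TBB. This is a serial stand-in for illustration only: `Stage`, `CountSource`, `Square`, `Sum`, and `run_serially` are hypothetical names, not part of the TBB API, and the sketch ignores parallelism and the live-token limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one pipeline stage. This is NOT tbb::filter.
struct Stage {
    virtual ~Stage() = default;
    virtual void* operator()(void* token) = 0;
};

// First stage: ignores its input and hands out tokens until it runs dry.
class CountSource : public Stage {
    std::vector<int> values_;
    std::size_t next_ = 0;
public:
    explicit CountSource(int n) {
        for (int i = 1; i <= n; ++i) values_.push_back(i);
    }
    void* operator()(void*) override {
        if (next_ == values_.size()) return nullptr;  // null pointer: no more input
        return &values_[next_++];
    }
};

// Middle stage: transforms the token in place and passes it on.
class Square : public Stage {
public:
    void* operator()(void* token) override {
        int* value = static_cast<int*>(token);
        *value *= *value;
        return value;
    }
};

// Last stage: consumes the token; its return value is ignored.
class Sum : public Stage {
public:
    long total = 0;
    void* operator()(void* token) override {
        total += *static_cast<int*>(token);
        return nullptr;
    }
};

// Serial driver: pulls a token from the first stage, pushes it through the
// remaining stages, and stops when the first stage returns nullptr.
inline void run_serially(const std::vector<Stage*>& stages) {
    assert(!stages.empty());
    for (;;) {
        void* token = (*stages[0])(nullptr);
        if (token == nullptr) break;
        for (std::size_t i = 1; i < stages.size(); ++i) {
            token = (*stages[i])(token);
        }
    }
}
```

Running `CountSource(4)`, `Square`, and `Sum` through `run_serially` leaves `Sum::total` at 1 + 4 + 9 + 16. A real pipeline would keep several tokens in flight at once, which this driver does not attempt.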

  7. ferret: Content-based Image Search
     • Read in image (serial-in-order)
     • Break image into segments (parallel)
     • Extract feature vectors from segments (parallel)
     • Query database with feature vectors to find candidate images (parallel)
     • Rank candidate images by similarity (parallel)
     • Output best-matching images (serial-out-of-order)
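The stage annotations above distinguish in-order from out-of-order serial stages. One way to picture the difference: tokens leave a parallel stage in arbitrary order, and an in-order stage has to hold early arrivals until their predecessors show up, while an out-of-order stage can take them as they come. The sketch below shows that holding step. `InOrderGate` is a hypothetical helper written for this note; it does not describe how TBB implements ordering internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of TBB: releases tokens strictly by
// sequence number, buffering any token that arrives early.
class InOrderGate {
    std::size_t next_ = 0;                        // sequence number expected next
    std::map<std::size_t, std::string> pending_;  // early arrivals by sequence number
public:
    // Accepts a token that finished a parallel stage and returns every
    // token that may now proceed, in sequence order (possibly none).
    std::vector<std::string> arrive(std::size_t seq, std::string token) {
        assert(seq >= next_ && pending_.count(seq) == 0);
        pending_.emplace(seq, std::move(token));
        std::vector<std::string> ready;
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(next_)) {
            ready.push_back(std::move(it->second));
            pending_.erase(it);
            ++next_;
        }
        return ready;
    }
};
```

If tokens 1 and 2 arrive before token 0, the gate releases nothing until token 0 arrives and then releases all three at once. ferret's output stage is marked serial-out-of-order, so it would not need a gate like this.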

  8. ferret Performance

  9. x264: H.264 Video Encoding
     • Frame contents predicted from already encoded reference frames
     • Frame processing cannot start until all reference frames are encoded
       • Cannot be guaranteed by TBB without blocking
     • TBB pipelines are not a suitable representation
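The constraint on this slide is a dependency graph rather than a fixed sequence of stages: a frame becomes ready only once every frame it references is encoded. A minimal sketch of that readiness rule, using a standard topological ordering (Kahn's algorithm), is shown below. The function name `encode_order` and the `refs` input format are assumptions made for illustration; this is not how x264 or the PARSEC benchmark schedules frames.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// refs[f] lists the frames that frame f is predicted from (hypothetical
// input format). Returns one order in which frames can be encoded so that
// every reference frame comes first, or an empty vector if the references
// are cyclic.
std::vector<std::size_t> encode_order(
        const std::vector<std::vector<std::size_t>>& refs) {
    const std::size_t n = refs.size();
    std::vector<std::size_t> waiting_on(n, 0);            // unencoded references per frame
    std::vector<std::vector<std::size_t>> dependents(n);  // frames waiting on each frame
    for (std::size_t f = 0; f < n; ++f) {
        for (std::size_t r : refs[f]) {
            assert(r < n);
            ++waiting_on[f];
            dependents[r].push_back(f);
        }
    }
    std::queue<std::size_t> ready;
    for (std::size_t f = 0; f < n; ++f) {
        if (waiting_on[f] == 0) ready.push(f);
    }
    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t f = ready.front();
        ready.pop();
        order.push_back(f);  // "encode" frame f
        for (std::size_t d : dependents[f]) {
            if (--waiting_on[d] == 0) ready.push(d);  // d's last reference is done
        }
    }
    if (order.size() != n) order.clear();  // some frames never became ready
    return order;
}
```

With made-up reference lists where frame 1 depends on frames 0 and 2, frame 2 on frame 0, and frame 3 on frame 2, the frames become ready in the order 0, 2, 1, 3. Frame 1 enters first but must wait for frame 2, which is the waiting-on-another-token behaviour the slide says a linear pipeline cannot guarantee without blocking.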

  10. dedup: File (de)compression
     • Write each distinct file segment once; for every later occurrence, write only its hash
       1. Read in a block of the file (serial-in-order)
       2. Split block into small segments (parallel)
       3. Hash the segment and check database (parallel)
          • If hash found in database go to step 5
          • Otherwise go to step 4
       4. Compress the segment’s data (parallel)
       5. Reorder segments into a block. Reorder blocks and write out data (serial-in-order)
     • Token generating stage (step 2)
     • Optional stage (step 4)

  11. dedup: File (de)compression
     1. Read in a block from file (serial-in-order)
     2. Do the following on the block (parallel)
        a. Split block into segments (serial-in-order)
        b. Compute and check hash (parallel)
        c. Compress segment (parallel)
           • Check flag to either compress data or immediately return
        d. Reorder segments into block (serial-in-order)
           • TBB handles reordering so we need only append the segment to the block data structure
     3. Write out block (serial-in-order)
        • TBB handles reordering so we can just write out the block data
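The restructuring on this slide combines two ideas: the per-block work becomes an inner sequence of stages over segments, and the formerly optional compression stage always runs but returns immediately when a flag says the segment is a duplicate. The serial sketch below shows that shape under several assumptions. All names are hypothetical, the "database" is a set keyed by segment content instead of a real hash, the duplicate marker `#` is invented, and the compressor is a toy run-length encoder standing in for the real one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

struct Segment {
    std::string data;
    bool is_new = false;  // flag set by the check stage, read by the compress stage
    std::string out;      // what this segment contributes to the output block
};

// Inner stage (a): one block token turns into many segment tokens.
std::vector<Segment> split_block(const std::string& block, std::size_t segment_size) {
    assert(segment_size > 0);
    std::vector<Segment> segments;
    for (std::size_t pos = 0; pos < block.size(); pos += segment_size) {
        Segment s;
        s.data = block.substr(pos, segment_size);
        segments.push_back(s);
    }
    return segments;
}

// Inner stage (b): record whether the segment was seen before.
// The set stands in for the hash database (assumption).
void check_stage(Segment& s, std::unordered_set<std::string>& seen) {
    s.is_new = seen.insert(s.data).second;
    if (!s.is_new) s.out = "#";  // invented marker for "already written"
}

// Inner stage (c): formerly optional, now always invoked.
void compress_stage(Segment& s) {
    if (!s.is_new) return;  // flag clear: nothing to compress
    for (std::size_t i = 0; i < s.data.size();) {  // toy run-length encoding
        std::size_t j = i;
        while (j < s.data.size() && s.data[j] == s.data[i]) ++j;
        s.out += std::to_string(j - i) + s.data[i];
        i = j;
    }
}

// Outer stage 2: runs the inner stages over one block and appends the
// segment outputs in order (stage d).
std::string process_block(const std::string& block, std::size_t segment_size,
                          std::unordered_set<std::string>& seen) {
    std::string result;
    std::vector<Segment> segments = split_block(block, segment_size);
    for (Segment& s : segments) {
        check_stage(s, seen);
        compress_stage(s);
        result += s.out;
    }
    return result;
}
```

Processing the block `aaaabbbbaaaa` with four-character segments yields `4a4b#`: the third segment repeats the first, so the compress stage returns without doing work. Every segment passes through the same stages, which is what lets a fixed pipeline express the conditional step.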

  12. dedup Performance

  13. Summary
     • Transformations
       • Recursive generators become iterators with stacks
         • Semi-automation with user identifying state
       • Optional stages become required stages with flags
         • Semi-automation with user identifying conditions
       • Token generating stages require nested pipelines
         • Semi-automation with user specifying how to convert between pipelines
     • TBB pipeline unsuitability
       • Dynamically constructed pipeline
       • Waiting on earlier tokens to finish first
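The first transformation in the summary, turning a recursive generator into an iterator with an explicit stack, can be illustrated on a small tree walk. A first pipeline stage must return one token per call, so a recursive traversal that emits items from deep inside the call stack has to be rewritten so that its state lives in a data structure. The `Node` type and both functions below are hypothetical examples, not code from the benchmarks, and the sketch assumes C++17 so that `Node` can hold a `std::vector<Node>`.

```cpp
#include <string>
#include <vector>

struct Node {
    std::string name;
    std::vector<Node> children;  // empty for a leaf
};

// Recursive generator: emits every leaf, but only all at once.
void collect_leaves(const Node& n, std::vector<std::string>& out) {
    if (n.children.empty()) {
        out.push_back(n.name);
        return;
    }
    for (const Node& child : n.children) collect_leaves(child, out);
}

// Iterator form: the call stack becomes an explicit stack, so each call
// to next() can return a single leaf and resume later.
class LeafIterator {
    std::vector<const Node*> stack_;
public:
    explicit LeafIterator(const Node& root) { stack_.push_back(&root); }

    // Returns the next leaf in the same order as collect_leaves, or
    // nullptr when the traversal is finished (matching the "null pointer
    // means no more input" convention of a first pipeline stage).
    const Node* next() {
        while (!stack_.empty()) {
            const Node* n = stack_.back();
            stack_.pop_back();
            if (n->children.empty()) return n;
            // Push children in reverse so the leftmost child is visited first.
            for (auto it = n->children.rbegin(); it != n->children.rend(); ++it) {
                stack_.push_back(&*it);
            }
        }
        return nullptr;
    }
};
```

The state the user has to identify for this transformation is exactly the contents of `stack_`: everything the recursive version kept implicitly in its pending calls.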
