
The Case for Tiny Tasks in Compute Clusters

Kay Ousterhout*, Aurojit Panda*, Joshua Rosen*, Shivaram Venkataraman*, Reynold Xin*, Sylvia Ratnasamy*, Scott Shenker*+, Ion Stoica* (* UC Berkeley, + ICSI)


Presentation Transcript


  1. The Case for Tiny Tasks in Compute Clusters. Kay Ousterhout*, Aurojit Panda*, Joshua Rosen*, Shivaram Venkataraman*, Reynold Xin*, Sylvia Ratnasamy*, Scott Shenker*+, Ion Stoica* (* UC Berkeley, + ICSI)

  2. Setting. (Diagram: a MapReduce/Spark/Dryad job composed of many tasks.)

  3. Use smaller tasks! (Diagram: today's tasks vs. tiny tasks.)

  4. Why? How? Where?

  5. Why? How? Where?

  6. Problem: Skew and Stragglers. Contended machine? Data skew?

  7. Benefit: Handling of Skew and Stragglers. (Diagram: today's tasks vs. tiny tasks.) As much as 5.2x reduction in job completion time!
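
The straggler effect on slides 6 and 7 can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the authors' evaluation: the function name `job_completion_time`, the machine speeds, and the pull-based assignment rule are all hypothetical, and it does not reproduce the 5.2x figure, which comes from a real trace.

```python
import heapq

def job_completion_time(total_work, num_tasks, speeds):
    """Toy model: split total_work evenly into num_tasks and let each
    machine pull the next task as soon as it becomes free.
    speeds[i] is the work machine i completes per unit time (hypothetical)."""
    task_work = total_work / num_tasks
    # Heap of (time the machine becomes free, machine index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tasks):
        t, i = heapq.heappop(free_at)        # earliest-free machine
        done = t + task_work / speeds[i]     # slower machine takes longer
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish

# Three healthy machines and one contended machine at quarter speed.
speeds = [1.0, 1.0, 1.0, 0.25]
coarse = job_completion_time(16.0, 4, speeds)   # one big task per machine
tiny = job_completion_time(16.0, 64, speeds)    # many small tasks
print(coarse, tiny)
```

With one large task per machine, the job waits for the contended machine to finish its full share. With many small tasks, the healthy machines absorb most of the work, so the slow machine delays the job by at most about one small task.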

  8. Problem: Batch and Interactive Sharing. A low-priority batch task is running when a high-priority interactive job arrives. Clusters forced to trade off utilization and responsiveness!

  9. Benefit: Improved Sharing. (Diagram: today's tasks vs. tiny tasks.) High-priority tasks not subject to long wait times!

  10. Benefits: Recap. (1) Straggler mitigation: Mantri (OSDI '10), Scarlett (EuroSys '11), SkewTune (SIGMOD '12), Dolly (NSDI '13), … (2) Improved sharing: Quincy (SOSP '09), Amoeba (SOCC '12), …

  11. Why? How? Where?

  12. Schedule task. Scheduling requirements: high throughput (millions per second), low latency (milliseconds). Distributed scheduling (e.g., Sparrow scheduler).

  13. Steps: schedule task, launch task. Use existing thread pool to launch tasks.

  14. Steps: schedule task, launch task. Use existing thread pool to launch tasks + cache task binaries. Task launch = RPC time (<1ms).
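
The launch path on slides 13 and 14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's actual code: the `Worker` class, its `launch` method, and the `load_binary` callback are hypothetical names, and a real worker would receive launch requests over RPC rather than through a local call.

```python
from concurrent.futures import ThreadPoolExecutor

class Worker:
    """Hypothetical worker that keeps one long-lived thread pool and a
    cache of already-loaded task code, so a launch is just a submit."""

    def __init__(self, load_binary, threads=4):
        self._pool = ThreadPoolExecutor(max_workers=threads)  # created once, reused
        self._load_binary = load_binary   # assumed: fetches and loads task code
        self._cache = {}                  # binary_id -> callable

    def launch(self, binary_id, arg):
        fn = self._cache.get(binary_id)
        if fn is None:
            # Only the first task of a job pays the load cost.
            fn = self._cache[binary_id] = self._load_binary(binary_id)
        return self._pool.submit(fn, arg)

    def shutdown(self):
        self._pool.shutdown(wait=True)

loads = []

def load_binary(binary_id):
    loads.append(binary_id)       # record each (expensive) load
    return lambda x: x * 2        # stand-in for the task's code

worker = Worker(load_binary)
futures = [worker.launch("double", i) for i in range(5)]
results = [f.result() for f in futures]
worker.shutdown()
```

Because neither the threads nor the task code are set up per task, the remaining per-task cost in this sketch is the submit call itself, which mirrors the slide's point that launch time reduces to the RPC.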

  15. Steps: schedule task, launch task, read input data. Smallest efficient file block size: 8MB. Distribute metadata (à la Flat Datacenter Storage, OSDI '12).

  16. Steps: schedule task, launch task, read input data, execute task + read data for next task. Tons of tiny transfers! Framework-controlled I/O (enables optimizations, e.g., pipelining).
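
The "execute task + read data for next task" step can be sketched as a one-block prefetch. This is an illustrative sketch only: `run_pipelined`, `read_block`, and `execute` are hypothetical names, and the slides do not say how the framework implements its pipelining.

```python
from concurrent.futures import ThreadPoolExecutor

def run_pipelined(block_ids, read_block, execute):
    """Run execute(read_block(b)) for every block, overlapping the read
    of the next block with the execution of the current one."""
    block_ids = list(block_ids)
    results = []
    if not block_ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as io_pool:
        pending = io_pool.submit(read_block, block_ids[0])
        for next_id in block_ids[1:] + [None]:
            data = pending.result()                  # wait for the current block
            if next_id is not None:
                # Start the next read before computing on the current data.
                pending = io_pool.submit(read_block, next_id)
            results.append(execute(data))
    return results

# Stand-ins: a "read" returns three copies of the block id, a "task" sums them.
totals = run_pipelined([1, 2, 3], lambda b: [b] * 3, sum)
```

The framework can only do this if it controls the I/O itself: a task that opens and reads its own input gives the framework no opportunity to start the next read early.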

  17. How low can you go? Steps: schedule task, launch task, read input data, execute task + read data for next task. 8MB disk block: 100's of milliseconds.

  18. Why? How? Where?

  19. Original Job vs. Tiny Tasks Job. (Diagram: a few large map and reduce tasks in the original job vs. map tasks 1 … N and many reduce tasks over keys K1 … Kn in the tiny tasks job.)

  20. Original Reduce Phase. (Diagram: Reduce Task 1 receives many values for the single key K1.) Tiny Tasks = ?

  21. Splitting Large Tasks • Aggregation trees • Works for functions that are associative and commutative • Framework-managed temporary state store • Ultimately, need to allow a small number of large tasks
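
An aggregation tree for an associative, commutative function can be sketched as below. The function name `tree_aggregate` and the `fan_in` parameter are assumptions for illustration; the slides do not specify a tree shape or an API.

```python
from functools import reduce
from operator import add

def tree_aggregate(values, combine, fan_in=4):
    """Combine values level by level. Each group of up to fan_in inputs
    stands for one tiny task. Returns (result, number of levels)."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(values)
    if not level:
        raise ValueError("nothing to aggregate")
    levels = 0
    while len(level) > 1:
        # One tiny task per group; the partial results form the next level.
        level = [reduce(combine, level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
        levels += 1
    return level[0], levels

# Sixteen values for one key, combined by tasks that each see at most four.
total, depth = tree_aggregate(range(16), add, fan_in=4)
```

This sketch combines partial results in input order, so it only relies on associativity. In a cluster, partial results may arrive in any order, which is where the commutativity requirement comes in.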

  22. Tiny tasks mitigate stragglers + improve sharing. Distributed scheduling, launch task in existing thread pool, distributed file metadata, pipelined task execution. Questions? Find me or Shivaram.

  23. Backup Slides

  24. Benefit of Eliminating Stragglers (Based on Facebook Trace). 5.2x at the 95th percentile!

  25. Why Not Preemption? • Preemption only handles sharing (not stragglers) • Task migration is time consuming • Tiny tasks improve fault tolerance

  26. Dremel/Drill/Impala • Similar goals and challenges (supporting short tasks) • Dremel statically assigns tablets to machines; rebalances if query dispatcher notices that a machine is processing a tablet slowly → standard straggler mitigation • Most jobs expected to be interactive (no sharing)

  27. Scheduling Throughput: 10,000 machines, 16 cores/machine, 100 millisecond tasks. Over 1 million task scheduling decisions per second.
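
The arithmetic behind slide 27 can be checked directly. The sketch assumes every core runs one task at a time and is kept continuously busy, so each core needs one scheduling decision per task duration; the function name is hypothetical.

```python
def scheduling_decisions_per_second(machines, cores_per_machine, task_ms):
    """Decisions per second needed to keep every core busy, assuming one
    running task per core and one decision per task."""
    slots = machines * cores_per_machine
    return slots * 1000 / task_ms

rate = scheduling_decisions_per_second(10_000, 16, 100)
print(f"{rate:,.0f} decisions/second")   # 1,600,000 decisions/second
```

160,000 cores each finishing a task every 100 ms works out to 1.6 million decisions per second, consistent with the slide's "over 1 million".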

  28. Sparrow: Technique. Place m tasks on the least loaded of d·m slaves. Example: a job with m = 2 tasks and d = 2 sends 4 probes. (Diagram: a job, several schedulers, and the slaves they probe.) More at tinyurl.com/sparrow-scheduler
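
The placement rule on slide 28 can be sketched as follows. This is a simplified illustration, not Sparrow's implementation: `batch_sample` is a hypothetical name, and using queue length as the load measure is an assumption.

```python
import random

def batch_sample(queue_lengths, m, d=2, rng=random):
    """Probe d*m distinct, randomly chosen workers and place the job's m
    tasks on the m least loaded of them. Mutates queue_lengths."""
    probes = rng.sample(range(len(queue_lengths)), d * m)
    chosen = sorted(probes, key=lambda w: queue_lengths[w])[:m]
    for w in chosen:
        queue_lengths[w] += 1      # one task enqueued on each chosen worker
    return chosen

# Six workers, a job with m = 2 tasks. With d = 3 every worker is probed,
# so the two shortest queues (workers 1 and 3) are always chosen.
queues = [5, 0, 3, 1, 4, 2]
placed = batch_sample(queues, m=2, d=3, rng=random.Random(0))
```

With the slide's parameters (m = 2, d = 2) the scheduler sends 4 probes and keeps the best 2. No scheduler needs a global view of the cluster, which is what lets many schedulers run in parallel.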

  29. Sparrow: Performance on TPC-H Workload. Within 12% of offline optimal; median queuing delay of 8ms. More at tinyurl.com/sparrow-scheduler
