
Toward Progress Indicators on Steroids for Big Data Systems


Presentation Transcript


  1. Toward Progress Indicators on Steroids for Big Data Systems. Jiexing Li#*, Rimma Nehme*, Jeff Naughton#*. #University of Wisconsin-Madison, *Microsoft Jim Gray Systems Lab

  2. Explosive growth in big data systems • Explosive growth in the complexity, diversity, number of deployments, and capabilities of big data processing systems. [Ailamaki et al., 2011] [Diagram: a stack of programming models (Jaql, Hive, Pig, Cascading, SQL, SimpleDB, PM, PNuts, AQL, HBase, Map/Reduce, KVS, Algebrix, PACT) over runtime/execution engines (Hadoop Map/Reduce Engine, Hyracks, Nephele, PDW Engine, Azure Engine) and stores (PDW Store (SQL Server), Azure Store, HDFS, Asterix B-Tree, Amazon S3, MySQL indexes)]

  3. Big data systems • They are large and complex beasts. • To operate them efficiently, we need information about what is going on in the system. [Diagram: thousands of servers; Node 1 … Node n, each running tasks (Task 1 … Task n) over data partitions (Data 1 … Data n)]

  4. Need to know the future state • Instantaneous “snapshot” information is useful, but not sufficient. • We also need to know what the system will look like in the future. [Diagram: nodes with emerging problems such as “CPU overload!”, “Bad disk!”, and “Lack of memory!”]

  5. Need a “predict, monitor, revise” paradigm • Predicting the future in these systems is difficult or impossible. • Don’t require perfect predictions: • Instead, anticipate the presence of errors. • Detect them and react as time progresses. • Progress indicators fit this “predict, monitor, revise” paradigm really well. [Diagram: one-shot “predict and ignore” is unreliable; “predict, monitor, and revise” adapts]

  6. Progress indicator (PI) • A PI provides feedback to users on how much of the task has been completed or when the task will finish. • It begins with a prediction of the query progress and, while the query executes, modifies the prediction based on the observed information. • But current PIs are too weak and limited for big data systems.
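To make the “predict, monitor, revise” loop concrete, here is a minimal sketch of such an estimator in Python. The class name, the bytes-per-second progress model, and the 50/50 blending weight are illustrative assumptions, not part of any system described in the slides.

```python
import time

class ProgressIndicator:
    """Minimal "predict, monitor, revise" loop: start from an initial
    speed prediction, then repeatedly revise it from observed progress."""

    def __init__(self, total_bytes, predicted_bytes_per_sec):
        self.total = total_bytes
        self.speed = predicted_bytes_per_sec  # initial prediction
        self.done = 0
        self.start = time.time()

    def revise(self, bytes_processed_so_far):
        """Monitor: record observed work; revise: re-estimate speed."""
        self.done = bytes_processed_so_far
        elapsed = time.time() - self.start
        if elapsed > 0 and self.done > 0:
            observed = self.done / elapsed
            # Blend prediction with observation so one noisy sample
            # does not dominate the estimate (weights are arbitrary).
            self.speed = 0.5 * self.speed + 0.5 * observed

    def remaining_seconds(self):
        return (self.total - self.done) / self.speed

    def percent_complete(self):
        return 100.0 * self.done / self.total
```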

  7. Progress indicator on steroids • Our goal: to advocate for the creation of, and research into, “progress indicators on steroids.” • Use more practical evaluation metrics to depict quality. • Expand the class of computations they can serve. • Expand the kinds of information they can provide. • Continue to increase the accuracy of their predictions.

  8. Our vision • Change our way of evaluating progress indicator technology. [Diagram: current PIs are judged on accuracy alone; progress indicators should also be helpful for specific tasks, accurate when nothing changes, and quick to react to changes]

  9. Our vision (cont.) • Expand the class of computations they can serve. [Diagram: beyond the user interface (current PIs), progress indicators could serve the scheduler, resource manager, optimizer, performance debugger, straggler/skew handler, …]

  10. Our vision (cont.) • Expand the kinds of information they can provide. [Diagram: beyond “p% done” or remaining time (current PIs), progress indicators could report good/bad machines, resource availability, disk fragmentation, straggling tasks, automatic failure diagnosis, …]

  11. A promising simple example • A progress score provided by Pig for a MapReduce job: • Divide the job into 4 phases. • For each phase, the score is the percentage of data read/processed. • The overall progress for the job is the average of these 4 scores. • This is a very rough estimate, which assumes that each phase contributes equally to the overall score. [Diagram: a map task (record reader, map, combine) feeding a reduce task (copy, sort, reduce)]
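A sketch of this averaging scheme (the phase names below follow the diagram; the equal 1/4 weighting is exactly the rough assumption the slide calls out):

```python
def pig_progress_score(phase_fractions):
    """Pig-style job progress: average the per-phase completion
    fractions, implicitly weighting all four phases equally.

    phase_fractions: dict mapping phase name -> fraction in [0, 1]
    of that phase's input data already read/processed.
    """
    phases = ["map", "copy", "sort", "reduce"]
    return sum(phase_fractions.get(p, 0.0) for p in phases) / len(phases)

# Example: map done, copy done, sort halfway, reduce not started
# -> (1.0 + 1.0 + 0.5 + 0.0) / 4 = 0.625, i.e. 62.5% complete.
print(pig_progress_score({"map": 1.0, "copy": 1.0, "sort": 0.5, "reduce": 0.0}))
```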

  12. A promising simple example (cont.) • Hadoop uses these progress estimates to select stragglers and schedule backup executions on other machines. • Improved execution time by 44%. [Dean et al., OSDI, 2004] • Improved execution time further by a factor of 2. [Zaharia et al., OSDI, 2008] • Straggler: a task that makes much less progress than tasks in its category. [Diagram: per-node progress P1%, P2%, …, Pn%; a straggler on one node triggers a backup execution on another] Already deployed! Simple and rough estimates, but really helpful!
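A minimal sketch of this use of progress scores: flag tasks whose progress falls well below the average of their category, so a scheduler can re-launch them elsewhere. The 0.5 cutoff is an illustrative assumption, not Hadoop's actual speculative-execution rule.

```python
def find_stragglers(progress_by_task, threshold=0.5):
    """Return ids of tasks whose progress is far below the category mean.

    progress_by_task: dict mapping task id -> progress in [0, 1] for
    tasks in the same category (e.g., all map tasks of one job).
    threshold: a task is a straggler if its progress is below
    threshold * mean progress (illustrative cutoff).
    """
    mean = sum(progress_by_task.values()) / len(progress_by_task)
    return [t for t, p in progress_by_task.items() if p < threshold * mean]

# Example: task "t3" lags far behind its peers, so the scheduler
# would launch a backup copy of it on a different node.
print(find_stragglers({"t1": 0.9, "t2": 0.85, "t3": 0.2}))
```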

  13. Achieving the vision requires research • One line of research: retargeting even today’s simple progress indicators to new systems can be interesting and challenging. • Think: the complexity and diversity of different data processing systems. • Example: we attempted to apply a debug run-based PI developed for MapReduce jobs to parallel database systems.

  14. The idea of a debug run-based PI [Morton et al., SIGMOD, 2010] • For a query plan, estimate the processing speed for each phase/pipeline using information from earlier (debug) runs: • 1. Start with the original data. • 2. Draw a sample of the data. • 3. Execute the job on the sample (the debug run). • 4. Calculate the processing speed (e.g., how many bytes can be processed per second) for each phase. • 5. Remaining time (RT) = remaining data / speed.
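A sketch of the resulting estimate (function and variable names are my own; the formula RT = remaining data / speed is from the slide):

```python
def debug_run_remaining_time(phases):
    """Estimate remaining time from debug-run speeds.

    phases: list of (remaining_bytes, bytes_per_sec) pairs, one per
    phase/pipeline, where bytes_per_sec was measured on the sample
    (debug) run. Total RT is the sum of per-phase remaining times.
    """
    return sum(remaining / speed for remaining, speed in phases)

# Example: two pipelines, 10 GB left at 100 MB/s and 2 GB left at
# 50 MB/s -> 100 s + 40 s = 140 seconds remaining.
print(debug_run_remaining_time([(10e9, 100e6), (2e9, 50e6)]))
```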

  15. Questions • This approach worked very well for MapReduce jobs. • But what happens when we apply this debug-run approach to a parallel database system? • We ran a simple experiment to find out.

  16. Experimental setup • Implemented the progress indicator in SQL Server PDW. • Cluster: 18 nodes (1 control node, 1 data landing node, and 16 compute nodes). • Connected with a 1 Gbit Ethernet switch • 2 Intel Xeon L5630 quad-core processors per node • 32 GB memory (at most 24 GB for the DBMS) • Ten 300 GB hard drives (8 disks for data)

  17. Experimental setup (cont.) • Database: 1 TB TPC-H. • Each table is either hash partitioned or replicated. • When a table is hash partitioned, each compute node contains 8 horizontal data partitions (8 × 16 = 128 in total).
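For illustration, a row could be routed to one of the 128 partitions roughly as below. This is only a sketch of hash partitioning; PDW's actual hash function and distribution map are internal, and Python's built-in hash() stands in for them.

```python
def route_row(distribution_key, num_nodes=16, partitions_per_node=8):
    """Map a row's distribution-column value to (compute node, local
    partition): 16 nodes x 8 partitions = 128 horizontal partitions.
    hash() is a stand-in for the system's real hash function (and is
    seed-randomized across Python processes)."""
    bucket = hash(distribution_key) % (num_nodes * partitions_per_node)
    return bucket // partitions_per_node, bucket % partitions_per_node

node, partition = route_row(4242)  # e.g., an orderkey value
print(f"row goes to compute node {node}, partition {partition}")
```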

  18. Debug run-based PI can work well • TPC-H Q1: no joins, 7 pipelines, and the speed estimates are accurate.

  19. Complex queries are more challenging • TPC-H Q4: later joins in the debug run have very few tuples to process. • The debug run uses 1% of the 1 TB data; the fraction of tuples flowing through successive pipelines drops as 1%, 0.01%, 0.0001%, 0%, 0%.
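The shrinkage is easy to reproduce: if each join input is sampled independently at rate r, a matching pair survives in the sample with probability about r², so a chain of joins starves quickly. A sketch of that arithmetic, under an assumed uniform independent-sampling model:

```python
def surviving_fraction(sample_rate, num_joins):
    """Expected fraction of tuples surviving when every join input is
    independently sampled at sample_rate: each additional join
    multiplies the surviving fraction by sample_rate again."""
    return sample_rate ** (num_joins + 1)

# With a 1% sample: 1 join keeps ~0.01%, 2 joins ~0.0001%, and deeper
# joins see essentially zero tuples -- matching the slide's sequence.
for joins in range(4):
    print(joins, f"{surviving_fraction(0.01, joins):.8%}")
```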

  20. Optimizer also presents challenges • Cost-based optimization may yield different plans for the sampled versus the entire dataset. [Diagram: two plans over Table Scan [l] and a filtered Table Scan [o]: a hash match join with a shuffle move for the original data versus a nested loop join with a broadcast move for the sample] • Only 6 out of 22 TPC-H queries used the same plans.

  21. Conclusion from the experiment • Even a simple task (porting a debug run-based PI from MapReduce to a parallel DBMS) is challenging. • New ideas are needed to make it work. • How to build progress indicators for a variety of systems and a variety of uses is a wide-open problem.

  22. Some specific technical challenges • Operators • Work and speed estimation • Pipeline definition & shape • Dynamicity • Statistics • Parallelism • … A promising direction, but still a really long way to go!

  23. Conclusions • Proposed and discussed the desirability of developing “progress indicators on steroids.” • Issues to consider include: • Evaluation metrics. • Computations to serve. • Information to provide. • Accuracy. • A small case study illustrates that even small steps toward “progress indicators on steroids” require effort and careful thought.
