General and Effective Monetary Optimizations for Workflows in IaaS Clouds

Presented by Amelie Chi Zhou (amelie.czhou@gmail.com), Xtra Computing Group (http://pdcc.ntu.edu.sg/xtra), Nanyang Technological University, Singapore.


Presentation Transcript


  1. General and Effective Monetary Optimizations for Workflows in IaaS Clouds presented by Amelie Chi Zhou amelie.czhou@gmail.com Xtra Computing Group http://pdcc.ntu.edu.sg/xtra Nanyang Technological University, Singapore

  2. Workflows for Scientific Applications • Workflows are structured • Tasks have very different I/O and computational behavior. • Real-world workflows • Montage, Ligo, Epigenomics, water-simulation • Workflow ensembles [Malawski et al., SC’12] • Composition of workflows with similar structures and different parameters and priorities [Figure: workflow structures of Epigenomics, Ligo and Montage]

  3. Running Workflows on IaaS Clouds • Define IaaS clouds • Provide fundamental computing resources for users to provision • Examples: Amazon EC2, Rackspace, OpenStack, Google Compute Engine … • Example projects • Montage, Broadband, Epigenomics on Amazon EC2 [Juve et al., eScience’09] • Astronomy applications on Nimbus, Eucalyptus, and EC2 [Vöckler et al., ScienceCloud’11] • …

  4. Workflows in IaaS Clouds • Features of IaaS clouds • Pay as you go (e.g., hourly pricing scheme) • Rich and evolving cloud offerings • Research problems • Monetary cost optimizations • Performance optimizations • Elasticity • Fault tolerance • … Are the current solutions ideal/sufficient?

  5. Monetary Cost Opportunities • Instance types • Amazon EC2 provides 29 types of instances • Instance reuse • Hourly charging scheme • Pricing schemes • On-demand, spot and reserved pricing • vs. • Tasks can have very different I/O and computational behavior. • Workflows have different deadline and monetary constraints. • Users may have various workflow application scenarios.
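A minimal sketch of why instance reuse matters under an hourly charging scheme. It assumes that partial hours are rounded up and that tasks can run back to back on one instance; the task runtimes and the price are hypothetical values, not figures from the presentation.

```python
import math

def billed_hours(runtime_minutes: float) -> int:
    """Hours charged for one instance when partial hours are rounded up."""
    return math.ceil(runtime_minutes / 60.0)

def cost_separate(task_minutes, hourly_price):
    """Cost when every task runs on its own instance."""
    return sum(billed_hours(t) for t in task_minutes) * hourly_price

def cost_shared(task_minutes, hourly_price):
    """Cost when all tasks run back to back on one reused instance."""
    return billed_hours(sum(task_minutes)) * hourly_price

tasks = [20, 25, 10]  # hypothetical task runtimes in minutes
price = 0.10          # hypothetical price per instance-hour

print("separate instances:", cost_separate(tasks, price))
print("one reused instance:", cost_shared(tasks, price))
```

With these numbers, three separate instances are billed three hours in total, while one reused instance is billed a single hour, because the 55 minutes of work fit inside one charging period.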

  6. Current Solutions are Far From Ideal • Problems of current approaches • Auto-scaling [Mao et al., SC’11] resource management • More effective optimizations → 29% less cost • Assume static cloud performance and pricing • Cloud dynamics + spot instances → 73% less cost • Heuristic-based cost and performance optimizations are specific. • They are likely to be suboptimal in evolving and diversified workflow applications.

  7. Our Research Efforts • Effectiveness • Dyna: Minimize the monetary cost of workflows, addressing both the price and performance dynamics in clouds • Generality • ToF: Define transformation operations to model common cost and performance optimizations • Deco: Design a declarative language called WLog to specify various workflow optimization problems The focus of this presentation.

  8. Overall Design • We design general workflow optimization frameworks to fully explore the optimization opportunities that lie in workflows [Figure labels: problem specification layer, WLog programs, Deco, transformation-based optimizer, optimization layer, ToF, execution layer]

  9. Outline • Related Work • Generalized Optimization Frameworks • General transformations for cost and performance optimizations • A declarative language for workflow optimization problems • Conclusions

  10. Related Work • Performance and monetary cost optimization heuristics • Auto-scaling [Mao et al., SC’11] • Fixed sequence of workflow optimizations • Workflow scheduling with performance and cost constraints [Kllapi et al., SIGMOD’11] • Consider only one on-demand instance type The heuristics are specifically designed for specific optimization problems and the optimization opportunities are not fully explored.

  11. Related Work (cont’d) • Generalized optimization frameworks: overhead is a problem • Generalized bin-ball abstraction for resource allocation [Rai et al., SoCC’12] • GPU acceleration • Not always convenient to model a problem with the bin-ball model • Declarative language to model a wide range of COPs [Liu et al., VLDB’12] • Distributed systems • Ignores the special features and optimization opportunities in workflows There is no general optimization framework for workflows.

  12. Outline • Related Work • Generalized Optimization Frameworks • General transformations for cost and performance optimizations • A declarative language for workflow optimization problems • Conclusions

  13. ToF: A Transformation-based Optimization Framework • Outline • Main contributions of this work • System overview • Design details • Evaluation results

  14. Main Contributions • This study has two major contributions • We define a series of common transformations for the performance and cost optimizations of workflows. • We design a light-weight optimizer to guide the transformation process.

  15. Workflow Transformation • Definitions • Instance assignment graph • Each node represents the instance configuration for a task. • Same structure as the workflow DAG • Transformation operation • Structural change in the instance assignment graph [Figure: an instance assignment graph with nodes 0 to 3 and the graphs produced by transformations, in which some nodes are combined (for example 1,2 or 2,3 or 1,2,3)]
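The definitions above can be sketched as a small data structure. This is an illustration under stated assumptions, not the ToF implementation: the names `Node`, `AssignmentGraph` and `merge` are hypothetical, the instance type `m1.small` is only an example value, and the rule that merged nodes must share an instance type is an assumption made to keep the sketch simple.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    tasks: tuple          # workflow tasks that share this instance
    instance_type: str    # instance configuration chosen for them

@dataclass
class AssignmentGraph:
    nodes: dict = field(default_factory=dict)  # node id -> Node
    edges: set = field(default_factory=set)    # (source id, target id)

def merge(graph, a, b):
    """Return a new graph in which node b is folded into node a."""
    na, nb = graph.nodes[a], graph.nodes[b]
    if na.instance_type != nb.instance_type:
        raise ValueError("this sketch only merges nodes of the same instance type")
    nodes = {k: v for k, v in graph.nodes.items() if k != b}
    nodes[a] = Node(na.tasks + nb.tasks, na.instance_type)
    edges = set()
    for src, dst in graph.edges:
        src = a if src == b else src   # redirect edges that touched b
        dst = a if dst == b else dst
        if src != dst:                 # drop self-loops created by the merge
            edges.add((src, dst))
    return AssignmentGraph(nodes, edges)

# Initially one node per task, with the same structure as the workflow DAG.
g = AssignmentGraph(
    nodes={i: Node((i,), "m1.small") for i in range(4)},
    edges={(0, 1), (0, 2), (0, 3)},
)
merged = merge(g, 1, 2)
print(sorted(merged.nodes), merged.nodes[1].tasks, sorted(merged.edges))
```

Returning a new graph rather than mutating the input keeps each candidate transformation independent, which is convenient when several candidates are compared by a cost model.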

  16. System Overview • Design ideas • Two types of transformations • Main schemes: reduce cost • Auxiliary schemes: help main schemes to reduce cost • Use cost model to guide the transformation optimization • Periodic batch optimization • Maximize instance sharing and reuse • Reduce optimizer overhead [Figure: optimization process in one plan period, with main schemes, auxiliary schemes and the cost model in a loop that repeats until the termination check succeeds and the result is output]
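A rough sketch of a cost-model-guided loop of this shape. Everything here is an assumption for illustration: a plan is simply a tuple of busy minutes per instance, the cost model counts hourly charging periods, the only main scheme is a pairwise merge, and no auxiliary scheme is supplied in the toy run. The real ToF optimizer is not described in enough detail on the slide to reproduce.

```python
import math
from itertools import combinations

def cost(plan):
    """Hypothetical cost model: billed hours summed over all instances."""
    return sum(math.ceil(minutes / 60) for minutes in plan)

def merge_pairs(plan):
    """Main scheme: place the work of two instances on one instance."""
    for i, j in combinations(range(len(plan)), 2):
        rest = tuple(m for k, m in enumerate(plan) if k not in (i, j))
        yield rest + (plan[i] + plan[j],)

def optimize(plan, main_ops, aux_ops, cost, max_rounds=100):
    """Apply the cheapest transformation each round until none reduces cost."""
    best, best_cost = plan, cost(plan)
    for _ in range(max_rounds):
        # Candidates from main schemes applied directly to the current plan.
        candidates = [p for op in main_ops for p in op(best)]
        # Candidates from an auxiliary scheme followed by a main scheme.
        for aux in aux_ops:
            for reshaped in aux(best):
                candidates += [p for op in main_ops for p in op(reshaped)]
        if not candidates:
            break
        candidate = min(candidates, key=cost)
        if cost(candidate) >= best_cost:  # termination: no further saving
            break
        best, best_cost = candidate, cost(candidate)
    return best, best_cost

best, best_cost = optimize((20, 25, 10, 50), [merge_pairs], [], cost)
print(best, best_cost)
```

The loop is greedy, so it can stop at a local optimum; the role of auxiliary schemes in the slide is to reshape the plan so that a main scheme becomes applicable again.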

  17. Design Details • Transformation operations • Main schemes: Merge, Demote • Auxiliary schemes: Move, Promote, Split, Co-scheduling • Transformations can combine with each other

  18. Using Transformations • Example of using Move and Merge operations • Move: only transforms the shape • Merge: reduces cost [Figure: charging hours of the plan at each step]

  19. Experimental Setup • Workload • Montage, Ligo and Mixed • Workflow submission rate follows a Poisson distribution • Comparisons • ToF • Baseline: only implement the initial instance configuration • Auto-scaling [Mao et al., SC’11] • Greedy: randomly select the transformation during optimization • All results are normalized to Baseline

  20. Evaluation Results on Cost Optimizations • Optimization results under the pricing scheme of Amazon EC2. • ToF obtains the lowest monetary cost on all workflows. • Over Auto-scaling by 29% • Over Baseline by 27% • Over Greedy by 17% [Figure: monetary cost results; chart annotations: 29%, 15%, 28%, 21%, 17%, 16%]

  21. Evaluation Results on Performance Optimizations • Performance optimization results. • ToF obtains the lowest average execution time on all workflows. • Over Auto-scaling by 21% • Over Baseline by 21% • Over Greedy by 18% [Figure: average execution time results; chart annotations: 12%, 21%, 21%, 18%, 8%, 16%]

  22. Outline • Related Work • Generalized Optimization Frameworks • General transformations for cost and performance optimizations • A declarative language for workflow optimization problems • Conclusions

  23. Deco: A Declarative Optimization Framework • Outline • Main contributions of this work • System overview • A declarative language for workflows • GPU-accelerated search engine • Evaluation results

  24. Main Contributions • This work has three main contributions • A declarative language for resource provisioning of scientific workflows in IaaS clouds • A generalized optimization framework to serve a wide range of optimization problems • Fast GPU-based implementation for low optimization overhead

  25. Motivating Ideas • Why declarative language? • Declarative languages like HTML, SQL, Prolog • Concise and clear • Focus on what to do rather than how to do it • Why GPU acceleration? • Generic search has large runtime overhead • Monte Carlo method is used for probabilistic approximation [Raedt et al. 2007] which is suitable for GPU acceleration
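A minimal sketch of Monte Carlo approximation for a probabilistic question, here the chance that a chain of tasks finishes by a deadline. The function name, the uniform runtime distributions and the deadlines are hypothetical; the presentation does not state which distributions Deco samples.

```python
import random

def deadline_hit_rate(task_samplers, deadline, trials=10_000, seed=0):
    """Estimate P(total runtime <= deadline) for tasks that run in sequence."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One trial: draw a runtime for every task and add them up.
        total = sum(sample(rng) for sample in task_samplers)
        if total <= deadline:
            hits += 1
    return hits / trials

# Three hypothetical tasks, each taking between 1 and 3 hours.
samplers = [lambda rng: rng.uniform(1.0, 3.0)] * 3

print(deadline_hit_rate(samplers, deadline=6.0))   # close to one half
print(deadline_hit_rate(samplers, deadline=10.0))  # always met
```

Each trial is independent of the others, which is the property that makes this kind of estimation a natural fit for massively parallel hardware such as GPUs.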

  26. System Overview • Overview of the Deco system • WLog, a declarative language for workflows • GPU-Accelerated search engine

  27. WLog – A Declarative Language for Workflows • WLog is designed based on Prolog • A WLog program describing a workflow scheduling problem:
        goal minimize Ct in totalcost(Ct).
        cons deadline(95%, 10h).
        var  configs(Tid, Vid) forall task(Tid) and Vm(Vid).
        r1   import(amazonec2).
        r2   import(montage).
        r3   path(X,Y,Y,C) :- edge(X,Y), exetime(X,Vid,T), C is T.
        r4   path(X,Y,Z,C) :- edge(X,Z), Z \== Y, path(Z,Y,Z2,C1), exetime(X,Vid,T), C is T+C1.
        r5   maxtime(Path,T) :- setof([Z,C],path(root,tail,Z,C),Set), max(Set,[Path,T]).
        r6   cost(Tid,Vid,C) :- price(Vid,Up), exetime(Tid,Vid,T), C is ceil(T/60.0)*Up.
        r7   totalcost(Ct) :- findall(C,cost(Tid,Vid,C),Bag), sum(Bag,Ct).
      • Problem-specific keywords: • goal: optimization goal defined by the user. • cons: problem constraint defined by the user. • var: problem variable to be optimized. • deadline(P, D): a probabilistic deadline requirement that D is at the P-th percentile of workflow execution time. • import(cloud): import the cloud-related facts from the cloud metadata. • import(daxfile): import the workflow-related facts generated from a DAX file.
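To make the rules easier to read, here is a rough Python counterpart of what rules r3 to r7 compute: the longest path time through the workflow and the total cost under hourly charging. The tiny workflow, the runtimes, the prices and the chosen configuration are hypothetical facts standing in for what import(cloud) and import(daxfile) would supply, and `tail` is assumed to be a zero-length exit task.

```python
import math

# Hypothetical facts: edge/2, exetime/3 (minutes), price/2 (per hour).
edges = {"root": ["a", "b"], "a": ["tail"], "b": ["tail"], "tail": []}
exetime = {
    ("root", "small"): 30,
    ("a", "small"): 90, ("a", "large"): 45,
    ("b", "small"): 100,
    ("tail", "small"): 0,
}
price = {"small": 0.10, "large": 0.40}

# One candidate value of the optimization variable configs(Tid, Vid).
configs = {"root": "small", "a": "large", "b": "small", "tail": "small"}

def max_time(task):
    """Longest total runtime of any path from task to the exit (r3 to r5)."""
    own = exetime[(task, configs[task])]
    successors = edges[task]
    if not successors:
        return own
    return own + max(max_time(s) for s in successors)

def task_cost(task):
    """Cost of one task: billed hours times the unit price (r6)."""
    vm = configs[task]
    return math.ceil(exetime[(task, vm)] / 60.0) * price[vm]

def total_cost():
    """Sum of the costs of all tasks (r7)."""
    return sum(task_cost(t) for t in configs)

print(max_time("root"), round(total_cost(), 2))
```

The declarative program only states these relations; searching over the possible `configs` values to minimize `total_cost` while respecting the deadline is left to the engine.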

  28. GPU Accelerations • Explore vs. exploit • Exploitation prioritizes partial results. • Exploration traverses the search tree level by level, which gives the GPU an opportunity to parallelize the search. • Memory optimizations • Minimize the usage of global memory • Reduce accesses to shared memory
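A small sketch of level-by-level exploration of a search tree, written sequentially in Python rather than for a GPU. Each level assigns an instance type to one more task, and every node in a level is independent of its siblings, which is the property a parallel implementation could use. The runtimes, prices and the sequential two-task workflow are hypothetical, and the sketch covers exploration only, not the exploit strategy.

```python
import math

MINUTES = {"small": 80, "large": 40}   # hypothetical runtime per task
PRICE = {"small": 0.10, "large": 0.40}  # hypothetical price per hour

def evaluate(assignment, deadline_minutes):
    """Return (meets deadline, cost) for tasks that run one after another."""
    total_minutes = sum(MINUTES[vm] for vm in assignment)
    total_cost = sum(math.ceil(MINUTES[vm] / 60) * PRICE[vm] for vm in assignment)
    return total_minutes <= deadline_minutes, total_cost

def explore(num_tasks, vm_types, deadline_minutes):
    """Expand the tree one level per task, then keep the cheapest feasible leaf."""
    frontier = [()]
    for _ in range(num_tasks):
        # All nodes of this level are expanded independently of each other.
        frontier = [partial + (vm,) for partial in frontier for vm in vm_types]
    feasible = []
    for assignment in frontier:
        ok, cost = evaluate(assignment, deadline_minutes)
        if ok:
            feasible.append((cost, assignment))
    if not feasible:
        return None
    return min(feasible, key=lambda pair: pair[0])

result = explore(2, ["small", "large"], deadline_minutes=130)
print(result)
```

The frontier grows exponentially with the number of tasks, which is one reason a generic search of this kind has a large runtime overhead without acceleration or pruning.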

  29. Evaluation Settings • Three use cases • Workflow scheduling problem • Workflow ensemble [Malawski et al., SC’12] • Goal: execute more workflows with high priorities within given budget and deadline • Follow-the-cost: multiple workflows, multiple datacenters • Comparison for workflow ensemble problem • Algorithms: Deco vs. SPSS [Malawski et al., SC’12] • Ensemble types: Constant, Uniform (un)sorted, Pareto (un)sorted • Generate 5 budgets in [MinBudget, MaxBudget] • All results are normalized to that of SPSS

  30. Evaluation Results • Under all ensemble types and budget constraints, Deco obtains a better score than SPSS. [Figure: scores obtained by SPSS and Deco with different ensemble types under budgets 1 to 5 and a fixed deadline; the workflow type is Ligo]

  31. Evaluation Results (cont’d) • Programmability of WLog in Deco (lines of code) • Without Deco, users (re-)implement the workflow application in C++. • With Deco, users implement in WLog. Deco allows much lower coding complexity than manual implementation.

  32. Performance Speedup of GPUs • Speedup of the GPU implementation over the CPU implementation on a single core for the three applications [Figure: chart annotations: 437x, 93x, 31x]

  33. Outline • Related Work • Generalized Optimization Frameworks • General transformations for cost and performance optimizations • A declarative language for workflow optimization problems • Conclusions

  34. Conclusions • IaaS clouds have become an attractive platform for hosting workflows. • Despite recent efforts in monetary cost optimizations of workflows in the cloud, there is still considerable room for further improvement. • Due to the complex cloud offerings and problem specifications, we develop general optimization frameworks. • ToF achieves up to 29% improvement over the state-of-the-art algorithm. • Deco achieves up to 77% improvement over the state-of-the-art algorithm.

  35. Future Work • Energy-efficient Cloud • Reduce the investment cost of cloud provider to potentially reduce instance price with energy-efficient hardware/software • Optimization opportunities in Multi-Cloud • Utilize different cloud offerings, e.g., instance types, to further reduce cost

  36. References • Maciej Malawski, Gideon Juve, Ewa Deelman, and Jarek Nabrzyski. 2012. Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. SC '12. 11 pages. • Juve, G.; Deelman, E.; Vahi, K.; Mehta, G.; Berriman, B.; Berman, B.P.; Maechling, P., "Scientific workflow applications on Amazon EC2," E-Science Workshops, pp. 59-66, 9-11 Dec. 2009. • Jens-Sönke Vöckler, Gideon Juve, Ewa Deelman, Mats Rynge, and Bruce Berriman. 2011. Experiences using cloud computing for a scientific workflow application. ScienceCloud '11. pp. 15-24. • Ming Mao, Marty Humphrey: Auto-scaling to minimize cost and meet application deadlines in cloud workflows. SC 2011: 49. • Herald Kllapi, Eva Sitaridi, Manolis M. Tsangaris, and Yannis Ioannidis. 2011. Schedule optimization for data processing flows on the cloud. SIGMOD '11. 289-300. • Anshul Rai, Ranjita Bhagwan, and Saikat Guha. 2012. Generalized resource allocation for the cloud. SoCC '12. Article 15, 12 pages. • Changbin Liu, Lu Ren, Boon Thau Loo, Yun Mao, and Prithwish Basu. 2012. Cologne: a declarative distributed constraint optimization platform. Proc. VLDB Endow. 5, 8, 752-763. • L. De Raedt, A. Kimmig, and H. Toivonen, ProbLog: A probabilistic Prolog and its application in link discovery, IJCAI 2007, pages 2462-2467, 2007. • Amelie Chi Zhou, Bingsheng He, Transformation-based Monetary Cost Optimizations for Workflows in the Cloud, accepted by TCC, Dec 2013. • Amelie Chi Zhou, Bingsheng He, A declarative optimization framework for workflows in IaaS clouds, submitted to SC 2014. • Amelie Chi Zhou, Bingsheng He, Cheng Liu, Monetary Cost Optimizations for Hosting Workflow-as-a-Service in IaaS Clouds, submitted to ToC, 2014.

  37. Thank you! Amelie Chi Zhou amelie.czhou@gmail.com Advisor: Bingsheng He bshe@ntu.edu.sg Xtra Computing Group http://pdcc.ntu.edu.sg/xtra Nanyang Technological University, Singapore
