
An Effective Framework for Handling Recoverable Temporal Violations in Scientific Workflows

Xiao Liu¹, Zhiwei Ni², Zhangjun Wu², Dong Yuan¹, Jinjun Chen¹, Yun Yang¹. ¹ SUCCESS (Centre for Computing and Engineering Software Systems), Swinburne University of Technology


Presentation Transcript


  1. An Effective Framework for Handling Recoverable Temporal Violations in Scientific Workflows • Xiao Liu¹, Zhiwei Ni², Zhangjun Wu², Dong Yuan¹, Jinjun Chen¹, Yun Yang¹ • ¹ SUCCESS (Centre for Computing and Engineering Software Systems), Swinburne University of Technology, Melbourne, Australia • ² Institute of Intelligent Management, Hefei University of Technology, Hefei, China

  2. Outline • Background • Workflow Technology Group • SwinDeW Family, SwinGrid, SwinCloud • Brief Overview: Workflow Temporal QoS Support • Handling Temporal Violations in Scientific Workflows • Problem Analysis • An Effective Light-Weight Handling Framework • Two-Stage Local Workflow Rescheduling Strategy • Evaluation • Summary

  3. Workflow Technology Group Overview • The WT group is part of SUCCESS (Centre for Computing and Engineering Software Systems), a Tier-1 university research centre at Swinburne University of Technology. • Our group conducts research into workflow technologies for complex software systems and services, including peer-to-peer, grid, and cloud computing based e-science, e-business, transactional and inter-organisational workflows. • Leader: Prof Yun Yang • Researchers: Dr Jinjun Chen (Senior Lecturer), Xiao Liu (PostDoc), Dong Yuan (PhD), Gaofeng Zhang (PhD), Wenhao Li (PhD), Dahai Cao (PhD), Xuyun Zhang (PhD) • Visitors (7-8/09): Prof Lee Osterweil, Prof Lori Clarke • Others: Prof Ryszard Kowalczyk, Prof Chengfei Liu, Dr Jun Yan (Wollongong), Prof Hai Jin (HUST), Prof Mingshu Li (ISCAS), Prof Qing Wang (ISCAS), Prof Zhiwei Ni (HFUT), Prof Jinpeng Huai (BUAA)

  4. SwinDeW Family • SwinDeW (Swinburne Decentralised Workflow) – foundation prototype based on p2p • SwinDeW – past • SwinDeW-A (for Agents) – ARC DP06 • SwinDeW-G (for Grid) – past • SwinDeW-V (for Verification) – current (ARC DP) • SwinDeW-C (for Cloud) – current (ARC LP) • Others: SwinDeW-B / -S / -P / -G – past • Current Projects: • ARC DP110101340, Cost effective storage of massive intermediate data in cloud computing applications, Duration: 2011-2013 • ARC LP0990393, Novel cloud computing based on workflow technology for managing large numbers of process instances, Duration: 2010-2012

  5. SwinGrid to SwinCloud

  6. Outline • Background • Workflow Technology Group • SwinDeW Family, SwinGrid, SwinCloud • Brief Overview: Workflow Temporal QoS Support • Handling Temporal Violations in Scientific Workflows • Problem Analysis • An Effective Light-Weight Handling Framework • Two-Stage Local Workflow Rescheduling Strategy • Evaluation • Summary

  7. Scientific Workflows • Scientific workflows often underlie large-scale, complex e-science applications such as climate modeling, astrophysics, structural biology and chemistry, earthquake simulation and disaster recovery. • Scientific workflows are usually deployed on distributed high performance computing infrastructures such as clusters, grids and clouds. • Compared with conventional business workflows, most scientific workflows are more data and/or computation intensive, involve less human interaction, run at larger scale and have more complex process structures.

  8. Temporal QoS Support for Scientific Workflows • Motivation: most e-science applications are time constrained with global temporal constraints (deadlines) and local temporal constraints (milestones) to achieve some pre-defined goals on schedule. • Basic requirements: automation and cost-effectiveness. • Challenges: highly dynamic system environments, changing process structures, charge for the usage of resources • Solution: A Novel Probabilistic Temporal Framework and Its Strategies for Cost-Effective Delivery of High QoS in Scientific Cloud Workflow Systems [PhD Thesis - Xiao Liu]

  9. Lifecycle Support of Temporal QoS

  10. Lifecycle Support of Temporal QoS • At workflow build-time modeling stage • Component 1: temporal constraint setting • Forecasting activity durations [eScience08], [JSS10b] • Setting both coarse-grained and fine-grained temporal constraints [BPM08], [CCPE09], [JCSS10] • Component 2: temporal consistency monitoring • Temporal checkpoint selection [ICSE08], [TAAS07] • Temporal verification [CCPE07], [ToSEM09] • Component 3: temporal violation handling • Temporal violation handling point selection [TSE] • Temporal violation handling [CCGrid], [JSS10a], [TSE], [ICPADS]

  11. Outline • Background • Workflow Technology Group • SwinDeW Family, SwinGrid, SwinCloud • Brief Overview: Workflow Temporal QoS Support • Handling Temporal Violations in Scientific Workflows • Problem Analysis • An Effective Light-Weight Handling Framework • Two-Stage Local Workflow Rescheduling Strategy • Evaluation • Summary

  12. Problem Analysis • Basic requirements: automation and cost-effectiveness • 1) How should fine-grained recoverable temporal violations be defined? • Define statistically recoverable and non-recoverable temporal violations, to avoid heavy-weight exception handling strategies and facilitate light-weight ones • Divide recoverable temporal violations into fine-grained levels, to facilitate the choice among handling strategies of different capability (higher capability, higher cost) • 2) Which light-weight, effective exception handling strategies should be used? • Employ or design a set of light-weight handling strategies, from low capability to high capability (low cost to high cost)

  13. An Effective Light-Weight Handling Framework • Three levels of temporal violations • Level I, Level II and Level III • Corresponding three levels of temporal violation handling strategies • TDA, ACOWR and TDA+ACOWR
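The level-to-strategy correspondence on this slide can be sketched as a small dispatcher. This is an illustrative sketch only, not the authors' implementation: the names `ViolationLevel` and `handle_violation`, the convention that each handler takes a time deficit and returns the deficit still remaining, and the cap of 3 ACOWR rounds in the hybrid case are all assumptions made for the example.

```python
from enum import Enum
from typing import Callable

# Hypothetical handler signature: takes the current time deficit and
# returns the deficit that remains after handling.
Handler = Callable[[float], float]


class ViolationLevel(Enum):
    LEVEL_I = 1
    LEVEL_II = 2
    LEVEL_III = 3


def handle_violation(
    level: ViolationLevel,
    deficit: float,
    tda: Handler,
    acowr: Handler,
    max_acowr_rounds: int = 3,  # assumed cap, not taken from the slides
) -> float:
    """Dispatch a recoverable violation to the strategy for its level."""
    if level is ViolationLevel.LEVEL_I:
        return tda(deficit)            # Level I   -> TDA
    if level is ViolationLevel.LEVEL_II:
        return acowr(deficit)          # Level II  -> ACOWR
    # Level III -> TDA+ACOWR: TDA once, then ACOWR until the deficit
    # is removed or the assumed round cap is reached.
    remaining = tda(deficit)
    rounds = 0
    while remaining > 0 and rounds < max_acowr_rounds:
        remaining = acowr(remaining)
        rounds += 1
    return remaining
```

Passing the two strategies in as callables keeps the dispatcher independent of how TDA and ACOWR are actually implemented.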

  14. Three Levels of Handling Strategies • TDA (Time Deficit Allocation) [CCPE07] • TDA actively propagates small time deficits to the subsequent workflow activities so that the deficits can be compensated by the execution time those activities save. • ACOWR (Ant Colony Optimisation based Workflow Rescheduling) [CCGrid10] • Based on our general two-stage local workflow rescheduling strategy • Uses ACO as the metaheuristic algorithm • TDA+ACOWR (the hybrid strategy of TDA and ACOWR) • One round of TDA followed by multiple rounds of ACOWR (normally fewer than 3)
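To make the idea of propagating a time deficit concrete, here is a minimal sketch of one possible allocation rule. The slides do not give the formula, so the rule used here is an assumption: the deficit is split across the subsequent activities in proportion to each activity's expected saved time (called `slacks` below). The function name and the error raised when the slack is insufficient are likewise hypothetical.

```python
from typing import List


def allocate_time_deficit(deficit: float, slacks: List[float]) -> List[float]:
    """Split a time deficit across subsequent activities.

    Assumed rule (not from the slides): each activity absorbs a share
    of the deficit proportional to its expected saved execution time.
    """
    total_slack = sum(slacks)
    if total_slack <= 0 or deficit > total_slack:
        # The subsequent activities cannot absorb the deficit, so TDA
        # alone is not enough under this assumed rule.
        raise ValueError("time deficit exceeds available slack")
    return [deficit * slack / total_slack for slack in slacks]
```

For example, a deficit of 6 time units spread over activities with slacks of 1, 2 and 3 units assigns 1, 2 and 3 units respectively. A deficit larger than the total slack is exactly the case where a stronger strategy such as ACOWR would be needed.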

  15. A General Two-Stage Workflow Local Rescheduling Strategy • Handling temporal violations with workflow rescheduling • Key objective: reduce or ideally remove the time deficit at the current checkpoint, i.e. reduce the execution time of the activities after the checkpoint in the violated workflow segment as much as possible • Requirement 1: strike a good balance between time deficit compensation and the completion time of other activities (workflow activities and general tasks, with or without temporal constraints) – from the overall makespan perspective • Requirement 2: utilise the resources already available in the system rather than recruiting additional resources – from the overall cost perspective
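The objective and two requirements above can be illustrated with a toy two-stage search. The transcript does not spell out the two stages, so this sketch rests on stated assumptions: stage one searches for candidate plans with a short overall makespan (Requirement 1) over a fixed pool of existing resources (Requirement 2), and stage two picks, from a shortlist, the plan that finishes the violated segment earliest. Random search stands in for the metaheuristic (ACO, GA, etc.), tasks are treated as independent, and cost is ignored; `Task`, `evaluate` and `two_stage_reschedule` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    duration: float
    in_violated_segment: bool  # True for activities after the checkpoint


def evaluate(plan: Sequence[int], tasks: Sequence[Task], n_resources: int) -> Tuple[float, float]:
    """Return (overall makespan, finish time of the violated segment)."""
    finish = [0.0] * n_resources
    segment_finish = 0.0
    for task, resource in zip(tasks, plan):
        finish[resource] += task.duration  # tasks run back to back per resource
        if task.in_violated_segment:
            segment_finish = max(segment_finish, finish[resource])
    return max(finish), segment_finish


def two_stage_reschedule(
    tasks: Sequence[Task],
    n_resources: int,
    n_candidates: int = 50,
    shortlist_size: int = 5,
    seed: int = 0,
) -> Tuple[int, ...]:
    rng = random.Random(seed)
    # Stage 1: search for plans with a short overall makespan.
    # Random search is a stand-in for the metaheuristic.
    candidates: List[Tuple[int, ...]] = [
        tuple(rng.randrange(n_resources) for _ in tasks)
        for _ in range(n_candidates)
    ]
    candidates.sort(key=lambda plan: evaluate(plan, tasks, n_resources)[0])
    shortlist = candidates[:shortlist_size]
    # Stage 2: among the shortlist, prefer the plan that completes the
    # violated segment earliest, i.e. compensates the most time.
    return min(shortlist, key=lambda plan: evaluate(plan, tasks, n_resources)[1])
```

Separating the two stages keeps the overall makespan from being sacrificed for the sake of one violated segment: only plans that are already good globally are considered for time compensation.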

  16. Integrated Task Resource List

  17. Pseudo-code for An Abstract Strategy

  18. Outline • Background • Workflow Technology Group • SwinDeW Family, SwinGrid, SwinCloud • Brief Overview: Workflow Temporal QoS Support • Handling Temporal Violations in Scientific Workflows • Problem Analysis • An Effective Light-Weight Handling Framework • Two-Stage Local Workflow Rescheduling Strategy • Evaluation • Summary

  19. Evaluation • Performance analysis and comparison (with GA) for ACOWR • Optimisation on Total Makespan • Optimisation on Total Cost • Time Compensation on Violated Workflow Segment • CPU Time • Effectiveness evaluation of the three-level handling framework • Violation Rate of Global Temporal Constraints and Local Temporal Constraints • Cost Analysis

  20. Optimisation on Total Makespan

  21. Optimisation on Total Cost

  22. Time Compensation on Violated Workflow Segment

  23. CPU Time

  24. Experiment Results on Temporal Violation Rates

  25. Cost Analysis

  26. Outline • Background • Workflow Technology Group • SwinDeW Family, SwinGrid, SwinCloud • Brief Overview: Workflow Temporal QoS Support • Handling Temporal Violations in Scientific Workflows • Problem Analysis • An Effective Light-Weight Handling Framework • Two-Stage Local Workflow Rescheduling Strategy • Evaluation • Summary

  27. Summary • Temporal QoS Support is Critical in e-Science Applications • Temporal Violation Handling in Scientific Workflows • Automatic, Cost-Effective • Level I, Level II and Level III • TDA, ACOWR, TDA+ACOWR • A Two-Stage Workflow Local Rescheduling Strategy • ACO, GA, PSO, many other metaheuristics • Future Work • Data movement cost • More scheduling algorithms

  28. The End – Thank You! • Any questions or comments? • Email: xliu@swin.edu.au • Website: http://www.ict.swin.edu.au/personal/xliu/ • An extension of this paper, titled “A Novel General Framework for Automatic and Cost-Effective Handling of Recoverable Temporal Violations in Scientific Workflow Systems,” has been accepted by the Journal of Systems and Software (JSS), http://dx.doi.org/10.1016/j.jss.2010.10.027.
