Overview: Cloud Computing and Workflow Research in NGSP Group
Dr. Xiao Liu
Sessional Lecturer, Research Fellow
Centre of SUCCESS
Swinburne University of Technology, Melbourne, Australia
Outline • SUCCESS Centre and NGSP Group • Background: Cloud Computing and Workflow • Research Topics • Performance Management in Scientific Workflows • Data Management in Scientific Cloud Workflows • Security and Privacy Protection in the Cloud • Data Reliability Assurance in the Cloud • SwinDeW-C Cloud Workflow System • Future Work and Conclusions
The Centre of SUCCESS • SUCCESS: Swinburne University Centre for Computing and Engineering Software Systems • SUCCESS is the No. 1 Software Engineering Centre in Australia • SUCCESS is one of the seven Tier 1 Centres at Swinburne University of Technology (Times World Ranking: 351-400) • The ambition of the Centre is to become the top centre for software research in the Southern Hemisphere within the next five years, achieving world-renowned software innovation and engineering with a balanced theoretical, applied, industry and education impact across the Centre
SUCCESS • Research Focus Areas • Knowledge and Data Intensive Systems • Nature of Software • Next Generation Software Platforms • SE Education and IBL/RBL • Software Analysis and Testing • Software R&D Group • http://www.swinburne.edu.au/ict/success/research-expertise/
NGSP (Small) Group Overview • This group conducts research into cloud computing and workflow technologies for complex software systems and services. • Leader: Prof Yun Yang (PC Member for ICSE 07/08, FSE 09, ICSE 10/11/12) • Researchers: A/Prof Jinjun Chen (UTS), Dr Xiao Liu (Postdoc), Dr Dong Yuan (Postdoc), Gaofeng Zhang, Wenhao Li, Dahai Cao, Xuyun Zhang, Chang Liu, Jofry Hadi SUTANTO • Others: Prof John Grundy, Prof Chengfei Liu • Visitors: Prof Lee Osterweil, Prof Lori Clarke, Prof Ivan Stojmenovic, Prof Paola Inverardi, Prof Amit Sheth, Prof Wil van der Aalst, Prof Hai Zhuge
R&D Projects – Grants • Primary projects: • (Cloud) workflow technology • ARC LP0990393 (Y Yang, R Kotagiri, J Chen, C Liu) • Cloud computing • ARC DP110101340 (Y Yang, J Chen, J Grundy) • Secondary project: • Management control systems for effective information sharing and security in government organisations • ARC LP110100228 (S Cugenasen, Y Yang)
R&D Projects – Overview • SwinDeW workflow family including SwinDeW-C • Architectures / Models (D Cao) • Scheduling / Data and service management (D Yuan, X Liu) • Verification / Exception handling (X Liu) • Cloud computing: • Data management (D Yuan, X Liu, W Li) • Privacy and Security (G Zhang, X Zhang, C Liu)
Some Recent ERA A* Ranked Publications • J. Chen and Y. Yang, Temporal Dependency based Checkpoint Selection for Dynamic Verification of Temporal Constraints in Scientific Workflow Systems. ACM Transactions on Software Engineering and Methodology, 20(3), 2011. • X. Liu, Y. Yang, Y. Jiang and J. Chen, Preventing Temporal Violations in Scientific Workflows: Where and How. IEEE Transactions on Software Engineering, 37(6):805-825, Nov./Dec. 2011. • D. Yuan, Y. Yang, X. Liu and J. Chen, On-demand Minimum Cost Benchmarking for Intermediate Datasets Storage in Scientific Cloud Workflow Systems. Journal of Parallel and Distributed Computing, 71:316-332, 2011. • J. Chen and Y. Yang, Localising Temporal Constraints in Scientific Workflows. Journal of Computer and System Sciences, Elsevier, 76(6):464-474, Sept. 2010. • G. Zhang, Y. Yang and J. Chen, A Historical Probability based Noise Generation Strategy for Privacy Protection in Cloud Computing. Journal of Computer and System Sciences, Elsevier, published online, Dec. 2011.
Outline • SUCCESS Centre and NGSP Group • Background: Cloud Computing and Workflow • Research Topics • Performance Management in Scientific Workflows • Data Management in Scientific Cloud Workflows • Security and Privacy Protection in the Cloud • Data Reliability Assurance in the Cloud • SwinDeW-C Cloud Workflow System • Future Work and Conclusions
Background: Cloud Computing • What is cloud computing? • R. Buyya: "A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers." • I. Foster: "Cloud computing is a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualised, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet." • UC Berkeley: Cloud computing is utility computing plus SaaS.
Why Cloud Computing • Data explosion • TB (10^12), PB (10^15), exabyte (EB, 10^18), zettabyte (ZB, 10^21), yottabyte (YB, 10^24) • The total amount of global data in 2010: 1.2 ZB • Google processed 24 PB of data every day in 2009 • Every day: Facebook 10 TB, Twitter 7 TB, YouTube 4.5 TB • Moore's law vs. data explosion speed • Buzzwords: data storage, data processing, parallel, distributed, virtualisation, commodity machines, energy consumption, data centres, utility computing, software (everything) as a service
Benefits of Clouds • No upfront infrastructure investment • No procuring hardware, setup, hosting, power, etc. • On-demand access • Lease what you need, when you need it • Efficient resource allocation • Globally shared infrastructure • Attractive pricing • Based on usage, QoS, supply and demand, loyalty, ... • Application acceleration • Parallelism for large-scale data analysis • High availability, scalability, and energy efficiency • Supports creation of 3rd-party services and seamless offerings • Builds on the infrastructure and follows a similar business model to the Cloud
Success Stories • Google • Animoto: 750,000 sign-ups in three days, 25,000 accesses in one hour, 10 times the capacity required • Amazon • NY Times: articles from 1851 to 1980, accomplished in 24 hours at a cost of only US$240 • Facebook, Salesforce CRM, IBM Research Compute Cloud ...
Cloud Computing Classification • Cloud Services • IaaS: infrastructure as a service, e.g. Amazon S3, EC2 • PaaS: platform as a service, e.g. Google App Engine • SaaS: software as a service, e.g. Salesforce.com • Cloud Types • Public/Internet Clouds • Private/Enterprise Clouds • Hybrid/Mixed Clouds
Example (PaaS): Hadoop Project • The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop provides a reliable shared storage and analysis system • Storage provided by HDFS: a distributed file system that provides high-throughput access to application data • Analysis provided by MapReduce: a software framework for distributed processing of large data sets on compute clusters • Hadoop for Yahoo! search • Hadoop: The Definitive Guide (by Tom White) • http://hadoop.apache.org/
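The MapReduce model that Hadoop's analysis layer implements can be illustrated without Hadoop itself. The sketch below simulates the map and reduce phases of a word count (the canonical MapReduce example) in plain Python; it is an illustration of the programming model, not the actual Hadoop API.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["the cat sat", "the dog sat"]
result = reduce_phase(map_phase(docs))
# result == {"the": 2, "cat": 1, "sat": 2, "dog": 1}
```

In a real Hadoop job the map tasks run in parallel across the cluster against HDFS blocks, and the framework performs the shuffle between the two phases.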
Cloud in Australia • Gartner estimated the global demand for cloud computing in 2009 at $46 billion, rising to $150 billion by 2013 • The Australian Government's business operations: ICT costs around $4.3 billion p.a. • Australian Government ICT Sustainability Plan 2010–2015: an energy-efficient technology for the Australian Government Data Centre Strategy • The Department of Finance and Deregulation estimated that costs of $1 billion could be avoided by developing a data centre strategy for the next 15 years • Australian Taxation Office (ATO), Department of Immigration and Citizenship (DIAC), and Australian Maritime Safety Authority (AMSA): proof-of-concept initiatives • The Australian Academy of Technological Sciences and Engineering (ATSE): opportunities and challenges for government, universities and business • Westpac, Telstra, MYOB, Commonwealth Bank, Australia and New Zealand Banking Group and SAP: initiatives to support the migration and running of their business applications in the cloud
Cloud in China • China's national Twelfth Five-Year Plan • http://www.chinacloud.cn/ • http://www.china-cloud.com/ • http://www.cloudcomputing-china.cn/
Background: Workflow The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules. A Workflow Management System is a system that provides procedural automation of a business process by managing the sequence of work activities and by managing the required resources (people, data & applications) associated with the various activity steps. -- [Workflow Management Coalition]
Why Workflow • Originated from office automation • Business process management, business agility • Business process analysis, re-design • Separation of the workflow management system from software applications • Just like the separation of the database management system from software applications • Software component reuse, Web services • Programming by scripting the composition of software components
Workflow Applications • Office automation, review and approval processes • Business process management systems, ERP systems • Machine shops, job shops and flow shops • Flight booking, insurance claim, tax refund... • Scientific workflows • IBM WebSphere Workflow • Microsoft Windows Workflow Foundation • http://wm.microsoft.com/ms/msdn/netframework/introwf.wmv
Example: Pulsar Searching Workflow • Astrophysics: pulsar searching • Pulsars: the collapsed cores of stars that were once more massive than 6-10 times the mass of the Sun (http://astronomy.swin.edu.au/cosmos/P/Pulsar) • Parkes Radio Telescope (http://www.parkes.atnf.csiro.au/) • The Swinburne Astrophysics group (http://astronomy.swinburne.edu.au/) has been conducting pulsar searching surveys (http://astronomy.swin.edu.au/pulsar/) based on the observation data from the Parkes Radio Telescope. • A typical scientific workflow which involves a large number of data- and computation-intensive activities. For a single searching process, the average data volume (not including the raw stream data from the telescope) is over 4 terabytes and the average execution time is about 23 hours on the Swinburne high-performance supercomputing facility (http://astronomy.swinburne.edu.au/supercomputing/). • Left: image of the Crab Nebula taken with the Palomar telescope. Right: a close-up of the Crab Pulsar from the Hubble Space Telescope. Credit: Jeff Hester and Paul Scowen (Arizona State University) and NASA
Pulsar Searching Workflow Dr. Willem van Straten
Outline • SUCCESS Centre and NGSP Group • Cloud Computing and Workflow • Research Topics • Performance Management in Scientific Workflows • Data Management in Scientific Cloud Workflows • Security and Privacy Protection in the Cloud • Data Reliability Assurance in the Cloud • SwinDeW-C Cloud Workflow System • Future Work and Conclusions
Research Topics: Performance Management in Scientific Workflows
Dr. Xiao Liu, firstname.lastname@example.org
http://www.ict.swin.edu.au/personal/xliu/
Workflow QoS • QoS dimensions: time, cost, fidelity, reliability, security ... • QoS of cloud services • Workflow QoS: the overall QoS for a collection of cloud services, which is not simply the sum of the individual services' QoS!
Temporal QoS • System performance • Response time • Throughput • Temporal constraints • Global constraints: deadlines • Local constraints: milestones, individual activity durations • Satisfactory temporal QoS • High performance: fast response, high throughput • On-time completion: low temporal violation rate
Problem Analysis • Setting temporal constraints • Coarse-grained and fine-grained temporal constraints • Prerequisite: effective forecasting of activity durations • Monitoring temporal consistency state • Monitor workflow execution state • Detect potential temporal violations • Temporal violation handling • Where to conduct violation handling • Which strategies to use
Ultimate Goal • Achieving on-time completion • Measurements: • Temporal correctness • Cost effectiveness
Temporal Consistency Model • Temporal correctness: workflow execution towards the satisfaction of temporal constraints • A temporal consistency model defines the system running state at a specific workflow activity point (i.e. temporal checkpoint) against specific temporal constraints • Basic elements: real workflow running time (before and including the activity point), estimated running time for the uncompleted workflow (after the checkpoint), temporal constraints
Probability Based Temporal Consistency Model • Time attributes for workflow activity ai • Maximum activity duration: D(ai) • Mean activity duration: M(ai) • Minimum activity duration: d(ai) • Runtime activity duration: R(ai) • 3-sigma rule: for a normal distribution, 99.73% of values fall within (μ-3σ, μ+3σ); R(ai) ~ N(μ, σ²) • D(ai) = μ+3σ, M(ai) = μ, d(ai) = μ-3σ
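As a minimal sketch of these time attributes, the snippet below fits μ and σ to a sample of observed activity durations and derives D(ai), M(ai) and d(ai) via the 3-sigma rule (the function name and sample values are illustrative, not from the papers):

```python
import statistics

def duration_bounds(samples):
    # Fit a normal model N(mu, sigma^2) to observed durations and apply
    # the 3-sigma rule: ~99.73% of runtime durations fall in (mu-3s, mu+3s).
    mu = statistics.mean(samples)
    sigma = statistics.pstdev(samples)
    return {"d": mu - 3 * sigma,   # minimum activity duration d(ai)
            "M": mu,               # mean activity duration M(ai)
            "D": mu + 3 * sigma}   # maximum activity duration D(ai)

# Example: five observed durations (in hours) of one workflow activity
bounds = duration_bounds([10.0, 12.0, 11.0, 13.0, 14.0])
# bounds["M"] == 12.0
```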
Probability Based Temporal Consistency Model • Types of temporal constraints • Upper bound temporal constraint, U(W) • Lower bound temporal constraint, L(W) • Fixed-time temporal constraint, F(W) • Relationships • Upper bound and lower bound: symmetric • Upper bound and fixed-time: special case • Choice • Upper bound/lower bound constraints for workflow build-time • Fixed-time constraints for workflow runtime
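The consistency check behind such a model can be sketched for an upper bound constraint U(W): at a temporal checkpoint, execution is consistent if the elapsed time plus the estimated remaining time stays within U(W). The hypothetical helper below approximates the remaining time by the sum of the mean durations M(ai) of the uncompleted activities; the published model refines this with probabilities rather than a single point estimate.

```python
def check_upper_bound(elapsed, remaining_means, upper_bound):
    # At a temporal checkpoint: consistent with U(W) if the real running
    # time so far plus the estimated remaining time (sum of mean activity
    # durations of the uncompleted activities) does not exceed U(W).
    return elapsed + sum(remaining_means) <= upper_bound

# 10 hours elapsed, two activities left with mean durations 2 and 3 hours,
# against a 20-hour upper bound constraint:
ok = check_upper_bound(10.0, [2.0, 3.0], 20.0)
# ok == True
```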
Temporal Framework • Component 1: Temporal Constraint Setting • Forecasting workflow activity durations • Setting coarse-grained temporal constraints • Setting fine-grained temporal constraints • Component 2: Temporal Consistency Monitoring • Temporal checkpoint selection • Temporal verification • Component 3: Temporal Violation Handling • Temporal violation handling point selection • Temporal violation handling
Forecasting Activity Durations • Statistical time-series pattern based forecasting strategies • Selected Publications: • X. Liu, Z. Ni, D. Yuan, Y. Jiang, Z. Wu, J. Chen, Y. Yang, A Novel Statistical Time-Series Pattern based Interval Forecasting Strategy for Activity Durations in Workflow Systems, Journal of Systems and Software (JSS), vol. 84, no. 3, pages 354-376, March 2011. • X. Liu, J. Chen, K. Liu and Y. Yang, Forecasting Duration Intervals of Scientific Workflow Activities based on Time-Series Patterns, Proc. of 4th IEEE International Conference on e-Science (e-Science08), pages 23-30, Indianapolis, USA, Dec. 2008.
Setting Temporal Constraints • Probability based temporal consistency model • Time analysis based on Stochastic Petri Nets • Selected Publications: • X. Liu, Z. Ni, J. Chen, Y. Yang, A Probabilistic Strategy for Temporal Constraint Management in Scientific Workflow Systems, Concurrency and Computation: Practice and Experience (CCPE), Wiley, 23(16):1893-1919, Nov. 2011. • X. Liu, J. Chen and Y. Yang, A Probabilistic Strategy for Setting Temporal Constraints in Scientific Workflows, Proc. 6th International Conference on Business Process Management (BPM2008), Lecture Notes in Computer Science, Vol. 5240, pages 180-195, Milan, Italy, Sept. 2008.
Temporal Consistency Monitoring • Minimum (Probability) Time Redundancy based Checkpoint Selection Strategy • Temporal Dependency based Checkpoint Selection Strategy • Selected Publications: • X. Liu, Y. Yang, Y. Jiang and J. Chen, Preventing Temporal Violations in Scientific Workflows: Where and How. IEEE Transactions on Software Engineering, 37(6):805-825, Nov./Dec. 2011. • J. Chen and Y. Yang, Temporal Dependency based Checkpoint Selection for Dynamic Verification of Temporal Constraints in Scientific Workflow Systems. ACM Transactions on Software Engineering and Methodology, 20(3), 2011
Violation Handling • Violation Handling Point Selection • (Probability) Time deficit allocation • Workflow local rescheduling strategy – ACO, GA, PSO • Selected Publications: • X. Liu, Z. Ni, Z. Wu, D. Yuan, J. Chen and Y. Yang, A Novel General Framework for Automatic and Cost-Effective Handling of Recoverable Temporal Violations in Scientific Workflow Systems, Journal of Systems and Software, vol. 84, no. 3, pp. 492-509, 2011 • X. Liu, Y. Yang, Y. Jiang and J. Chen, Do We Need to Handle Every Temporal Violation in Scientific Workflow Systems, submitted to ACM Transactions on Software Engineering and Methodology
Yearly Cost and Time Reduction Yearly cost reduction for the pulsar searching workflow Yearly time reduction for the pulsar searching workflow
Research Topics: Data Management in Scientific Cloud Workflows
Dr. Dong Yuan, Dr. Xiao Liu
dyuan@swin.edu.au, email@example.com
http://www.ict.swin.edu.au/personal/dyuan/
Data Management in Cloud Computing • Scientific applications in cloud computing • Computation and data intensive applications • Massive computation and storage resources • Pay-as-you-go model • Computation and storage trade-off • Some datasets should be stored (storage cost) • Some datasets can be regenerated (computation cost) • Data Placement
Data Dependency Graph (DDG) • A classification of the application data: original data and generated data • Data provenance: a kind of meta-data that records how data are generated • DDG
Attributes of a Dataset in DDG • A dataset di in DDG has the attributes: <xi, yi, fi, vi, provSeti, CostRi> • xi ($) denotes the generation cost of dataset di from its direct predecessors • yi ($/t) denotes the cost of storing dataset di in the system per time unit • fi (Boolean) is a flag indicating whether dataset di is stored or deleted in the system • vi (Hz) denotes the usage frequency, which indicates how often di is used
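These attributes drive the computation-versus-storage trade-off: keeping di costs yi per time unit, while deleting it means paying the generation cost xi on each use, i.e. roughly vi × xi per time unit. The sketch below applies this simplified per-time-unit comparison; the full model is more involved (e.g. deleted predecessors may also need regenerating, which the cost rate CostRi accounts for), so this is an illustration, not the papers' benchmarking algorithm.

```python
def should_store(x_gen, y_store, v_freq):
    # Per-time-unit cost comparison for one dataset di:
    #   keep it  -> pay y_store ($/t) in storage cost
    #   delete it -> pay roughly v_freq * x_gen ($/t) in regeneration cost
    # Store the dataset when regeneration would be the more expensive option.
    # (Simplification: ignores cascaded regeneration of deleted predecessors.)
    return v_freq * x_gen > y_store

# Generation cost $10, storage $0.5 per time unit, used 0.1 times per time unit:
keep = should_store(10.0, 0.5, 0.1)
# keep == True (regeneration rate $1/t exceeds storage rate $0.5/t)
```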