Service, Grid Service and Workflow

Xian-He Sun
Scalable Computing Software Laboratory, Illinois Institute of Technology
sun@iit.edu
Nov. 30, 2006, Fermi Laboratory

Scalable Computing Software (SCS) Lab.
[Figure: Distributed Optical Testbed (Grid) linking NU-E, NU-C, UIC, Star Tap, IIT, ANL, U of C and NCSA/UIUC over I-WIRE and OMNI; Parallel Computers at SCS; Pervasive Computing Environments at SCS]

Grid and Utility Computing
• Mimic the electrical power grid
• Increased Efficiency
• Higher Quality of Service
• Increased Productivity
• Reduced Complexity & Cost
• Improved Resiliency

Service Oriented Computing: Computing as a service
[Figure: Grid and Web technologies started far apart in applications & technology and have been converging. Grid: GT1, GT2, OGSI, WSRF. Web: HTTP; XML; WSDL, SOAP; WS-*; BPEL; WS-I Compliant Technology Stack.]
• Internet computing: Web service
• Grid computing: Grid service, which is merging with WS
• Pervasive computing: Human centered service
• Mobile computing: Phone service
Convergence of Core Technology Standards allows a Common base for Business and Technology Services

Challenge: Computing as a Service
[Diagram label: Information Service]
• SOC is about separation, sharing, and workflow
• Sharing (service/resource)
  • Modeling
  • Scheduling: system vs application, replica vs consistency
  • QoS: external task vs local jobs
  • Security
• Separation (service)
  • Abstraction: personalized service
  • Primary service: Automatic coding
  • Separation of concern
  • Separation of resource: Virtualization
• Workflow Management

Service Oriented Architecture (SOA)
• SOA is a software architecture in which services are the key building blocks
• SOA is basically an application development style using services
• It is a set of principles or patterns for developing applications using services
The concept of SOA comes from software research
SOA has developed from over 30 years of IT experience

What is SOA? More detail
• An architecture that implements business functionality as a set of shared, reusable services
• A way of designing a software system and its surrounding environment to provide services to end-user applications, to executable business processes, or to other services through published and discoverable service interfaces
• Aggregation of components for a business driver
• Extended bus with shared services
• Service interfaces are defined separately from the implementation, which provides service encapsulation and platform/language independence

The General Service Oriented Architecture (SOA)
• Service Provider
  • Provides a stateless, location transparent business service
• Service Registry
  • Allows service consumers to locate service providers that meet required criteria
• Service Consumer
  • Uses service providers to complete business processes
[Figure: Publish-Find-Bind mechanism. The Service Provider publishes to the Service Registry, the Service Requestor finds providers in the registry, and the requestor then binds to the provider.]
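The publish-find-bind mechanism can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the class names (ServiceRecord, ServiceRegistry), the "utility" category and the echo service are hypothetical, and a plain function call stands in for binding to a remote endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceRecord:
    name: str                        # hypothetical service name
    category: str                    # criterion that consumers search on
    endpoint: Callable[[str], str]   # stands in for a network address


class ServiceRegistry:
    """In-memory stand-in for a service registry."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def publish(self, record: ServiceRecord) -> None:
        # Provider side: make the service discoverable.
        self._records[record.name] = record

    def find(self, category: str) -> List[ServiceRecord]:
        # Consumer side: locate providers that meet the criterion.
        return [r for r in self._records.values() if r.category == category]


def echo_service(message: str) -> str:
    # Stateless: the reply depends only on the request.
    return f"echo: {message}"


registry = ServiceRegistry()
registry.publish(ServiceRecord("echo", "utility", echo_service))  # publish
matches = registry.find("utility")                                # find
reply = matches[0].endpoint("hello")                              # bind and invoke
```

The consumer never names the provider directly; it only knows the search criterion, which is what makes the provider location transparent in this sketch.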

What is a Web Service?
• A software component
• Identified by a unique URI
• That can be discovered by other software components
• Web services are a stack of emerging standards that describe a service-oriented, component-based architecture

Key Players
• Do you know me? Described by: WSDL
• Do you want to find me? Discovered in: UDDI
• Do you want to communicate with me? Communicate through: SOAP/XML

Web Service Components
[Figure: the Service Provider registers its WSDL service contract with the UDDI Service Registry; the Service Consumer finds the contract in UDDI; the client then binds to the service and the two communicate over SOAP.]
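As an illustration of the "communicate through SOAP/XML" step, the following sketch builds and reads back a SOAP 1.1 envelope with Python's standard xml.etree.ElementTree module. The application namespace urn:example:quotes, the GetQuote operation and the symbol parameter are hypothetical; in practice they would come from the service's WSDL contract.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"                          # hypothetical application namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for key, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{key}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message: bytes) -> tuple:
    """Return (operation name, parameters) from a SOAP envelope."""
    root = ET.fromstring(message)
    call = root.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}")[1]
    params = {child.tag.split("}")[1]: child.text for child in call}
    return operation, params


request = build_request("GetQuote", {"symbol": "IIT"})
operation, params = read_request(request)
```

Only the message format is shown here; transport (typically an HTTP POST) and discovery through UDDI are left out.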

Grid Computing
• Infrastructure (“middleware” & “services”) for establishing, managing, and evolving multi-organizational federations
• Mechanisms for creating and managing workflow within such federations
• Three key criteria
  • Coordinates distributed resources …
  • using standard, open, general-purpose protocols and interfaces …
  • to deliver non-trivial qualities of service.

Data Grids for High Energy Physics
www.griphyn.org www.ppdg.net www.eu-datagrid.org
[Figure: tiered data grid, from the detector down to physicist workstations]
• Online System: ~PBytes/sec in, ~100 MBytes/sec out. There is a “bunch crossing” every 25 nsecs. There are 100 “triggers” per second. Each triggered event is ~1 MByte in size.
• Offline Processor Farm: ~20 TIPS, ~100 MBytes/sec
• Tier 0: CERN Computer Centre (HPSS), linked to Tier 1 at ~622 Mbits/sec or Air Freight (deprecated)
• Tier 1: FermiLab (~4 TIPS), France Regional Centre, Germany Regional Centre, Italy Regional Centre (HPSS), linked to Tier 2 at ~622 Mbits/sec
• Tier 2: Tier2 Centres of ~1 TIPS each, including Caltech, linked to institutes at ~622 Mbits/sec
• Institutes: ~0.25 TIPS each, with a physics data cache, ~1 MBytes/sec to workstations
• Tier 4: Physicist workstations (Pentium II 300 MHz)
1 TIPS is approximately 25,000 SpecInt95 equivalents.
Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
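The rates quoted on this slide can be cross-checked with a little arithmetic. The sketch below uses only the approximate figures from the slide; the constant names and the decimal unit conversions (1 TByte = 10^6 MBytes, 8 bits per byte) are assumptions made for the illustration.

```python
# Approximate figures taken from the slide.
BUNCH_CROSSING_NSEC = 25          # one bunch crossing every 25 nsecs
TRIGGERS_PER_SECOND = 100         # events kept by the trigger
EVENT_SIZE_MBYTES = 1.0           # size of one triggered event
TIER_LINK_MBITS_PER_SECOND = 622  # link rate between tiers

# Crossings per second: 10^9 nsecs in a second divided by 25 nsecs.
crossings_per_second = 1e9 / BUNCH_CROSSING_NSEC

# Fraction of crossings that survive the trigger.
trigger_fraction = TRIGGERS_PER_SECOND / crossings_per_second

# Data rate leaving the online system, in MBytes/sec.
online_rate_mbytes = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Volume accumulated in one day, in TBytes (decimal units).
daily_volume_tbytes = online_rate_mbytes * 24 * 60 * 60 / 1e6

# A 622 Mbits/sec link expressed in MBytes/sec.
link_rate_mbytes = TIER_LINK_MBITS_PER_SECOND / 8

print(f"{crossings_per_second:.0f} crossings/sec, keep {trigger_fraction:.1e}")
print(f"online rate {online_rate_mbytes:.0f} MBytes/sec, "
      f"{daily_volume_tbytes:.2f} TBytes/day")
print(f"tier link {link_rate_mbytes:.2f} MBytes/sec")
```

On these round numbers, 100 triggers per second at ~1 MByte each reproduces the ~100 MBytes/sec figure, while a single ~622 Mbits/sec link carries somewhat less than that.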

The Emergence of Open Grid Standards
[Figure: timeline from 1990 to 2010 showing increased functionality and standardization. Custom solutions; then the Globus Toolkit (Internet standards; de facto standard, single implementation); then the Open Grid Services Arch (Web services, etc.; real standards, multiple implementations); then managed shared virtual systems (computer science research).]

Open Grid Services Architecture
• Everything is a service
• A standard substrate: the Grid service
  • A Grid service is a Web service
  • Standard interfaces and behaviors that address key distributed system issues: naming, service state, lifetime, notification
• Supports standard service specifications
  • Agreement, data access & integration, workflow, security, policy, diagnostics, etc.
  • Target of current & planned GGF efforts
• Supports arbitrary application-specific services based on these & other definitions
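The four issues named above (naming, service state, lifetime, notification) can be made concrete with a toy class. This is a sketch only and is not the OGSI or WSRF interface: the class name GridServiceSketch, its methods and the handle urn:example:job-42 are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional

Listener = Callable[[str, object], None]


class GridServiceSketch:
    """Toy stateful service instance; NOT the OGSI or WSRF API."""

    def __init__(self, handle: str, lifetime_seconds: float,
                 now: Optional[float] = None) -> None:
        self.handle = handle                      # naming: unique instance handle
        self.state: Dict[str, object] = {}        # service state
        start = time.time() if now is None else now
        self.termination_time = start + lifetime_seconds  # lifetime
        self._listeners: List[Listener] = []      # notification subscribers

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_state(self, key: str, value: object) -> None:
        # Update the state, then notify every subscriber of the change.
        self.state[key] = value
        for listener in self._listeners:
            listener(key, value)

    def extend_lifetime(self, extra_seconds: float) -> None:
        # A client that still needs the instance pushes the deadline back.
        self.termination_time += extra_seconds

    def is_alive(self, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        return current < self.termination_time


events = []
job = GridServiceSketch("urn:example:job-42", lifetime_seconds=60, now=0.0)
job.subscribe(lambda key, value: events.append((key, value)))
job.set_state("status", "running")
```

The explicit termination time illustrates why lifetime matters for stateful services: an instance whose clients disappear eventually expires instead of holding resources forever.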

SOA and Web Service
• SOA is mostly defined and explained, with some accompanying implementations
• Web services are a stack of emerging standards that describe a service-oriented, component-based architecture
• Web services are a limited form of SOA, but they are the best practical solution available so far
• SOA and Web services are still evolving together
• Web services in their current form cannot support all computing services

Grid and Web Service
• Grid? What is the Grid?
  • Standard, technology, infrastructure, application
  • Globus or general distributed computing?
• Standard
  • Merging with Web service
• Application
  • Large scientific application vs. light business application
• Technology
  • Resource sharing vs. service sharing, resource sharing vs. pay for service, coordinate virtual organizations vs. create VOs (very hard), stateful vs. stateless

Workflow and LQCD
[Diagram labels: Workflow, Information Service]
• All SOC needs workflow management
• Is LQCD computing an SOC?
• Does LQCD need to follow the Web service standard?
  • If yes, we need to support Grid service (GT4)
  • If no, we do not

LQCD Workflow System
[Figure: architecture of the LQCD workflow system]
• Users
• Build Time (user): Workflow Design; Workflow template identification & generation Tools
• Workflow Instantiation
• Run Time (system): Workflow Execution & Control; workflow change
• Workflow Enactment Service: Workflow Scheduling, Data Movement, Fault Management
• Information Services: Performance Info Service, Reliability Info Service
• Interaction with Information Services
• LQCD Middleware
• Interaction with computing Resources
• Resources
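The split between build time (a workflow template) and run time (execution and control) can be illustrated with a small dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9 or later); the task names and the run_workflow helper are hypothetical and are not taken from any LQCD middleware.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_workflow(dependencies: Dict[str, Set[str]],
                 actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every task after all the tasks it depends on."""
    # Run time: derive one valid execution order from the template.
    order = list(TopologicalSorter(dependencies).static_order())
    for task in order:
        actions[task]()
    return order


log: List[str] = []

# Build time: the template maps each task to the tasks it depends on.
dependencies = {
    "propagators": {"configurations"},
    "correlators": {"propagators"},
    "archive": {"configurations"},
}
task_names = ["configurations", "propagators", "correlators", "archive"]
actions = {name: (lambda n=name: log.append(n)) for name in task_names}

order = run_workflow(dependencies, actions)
```

A real enactment service would also dispatch independent tasks in parallel, move data between them and handle faults; this sketch covers ordering only.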

Workflow Management Systems
• Comparison Functionality
  • Workflow template identification & generation Tools
  • Workflow specification
  • Workflow scheduling & rescheduling
  • Fault Management
  • Data Movement
  • Interaction with monitor system
• Target Systems
  • Askalon
  • Kepler
  • Grid Physics Network

Current Result: the GHS System
The GHS (Grid Harvest Service) system
• GHS is a long-term, application-level performance evaluation and task scheduling system, specially designed to handle resource availability issues when solving large-scale applications.
• A resource may become unavailable because of contention or because of a fault. The two causes require different performance modeling and prediction.
• Supports rescheduling

GHS System Design Structure
[Figure: layered design, top to bottom]
• Task Rescheduling
• Task-Execution
• Scheduling: Task Partition, Task Scheduling
• Prediction: System-level Prediction, Application-level Prediction
• Modeling: Computation, Communication
• Measurement: System Monitoring, Application Monitoring
• Resource Management: Reservation, Compete, Best-Effort
• CPU, Network, Memory

Rescheduling Algorithm
Reasons for rescheduling
• Availability pattern change
• Fault tolerance
• New jobs arrive
• Multi-campaign
• New milestones
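A rescheduling decision triggered by a change in the availability pattern might look like the sketch below. It is a toy model and not the GHS prediction model or algorithm: the formula (work divided by the share of speed left over by local jobs), the Machine fields and the migration_cost threshold are assumptions chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Machine:
    speed: float        # work units per second when idle
    utilization: float  # fraction consumed by local jobs, 0 <= u < 1


def predicted_time(work: float, machine: Machine) -> float:
    # Toy contention model: local jobs leave only (1 - u) of the speed.
    return work / (machine.speed * (1.0 - machine.utilization))


def pick_machine(work: float, machines: Dict[str, Machine]) -> str:
    # Choose the machine with the smallest predicted completion time.
    return min(machines, key=lambda name: predicted_time(work, machines[name]))


def should_reschedule(remaining_work: float, current: str,
                      machines: Dict[str, Machine],
                      migration_cost: float) -> bool:
    # Move only if the predicted saving exceeds the cost of moving.
    best = pick_machine(remaining_work, machines)
    if best == current:
        return False
    saving = (predicted_time(remaining_work, machines[current])
              - predicted_time(remaining_work, machines[best]))
    return saving > migration_cost


machines = {"a": Machine(speed=10.0, utilization=0.5),
            "b": Machine(speed=8.0, utilization=0.0)}
first_choice = pick_machine(80.0, machines)    # initial placement

machines["b"].utilization = 0.75               # availability pattern change
move = should_reschedule(80.0, first_choice, machines, migration_cost=5.0)
stay = should_reschedule(80.0, first_choice, machines, migration_cost=30.0)
```

With these numbers the task starts on machine b (10 seconds predicted against 16 on a); once b becomes busy its prediction rises to 40 seconds, so moving saves 24 seconds and is worthwhile only when migration costs less than that.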

Automated Deployment of Meta-task
• APST software
  • AppLeS scheduling
  • NWS prediction
• Integrating GHS prediction and scheduling into APST
  • Modify the MetricType and ServiceType data structures in the Meta-data Bookkeeper
  • Add GHS server to provide information service
  • Add GhsMetataskSched()
  • Modify XmlFile parser in the Controller component

Software Released
• http://www.meta.cs.iit.edu/~ghs
• GHS 1.0
  • Functionalities for performance prediction, measurement, task allocation, and task scheduling
• GHS-APST 1.0
  • Integrates GHS prediction and scheduling into APST execution management
  • Adds GHS server and GHS daemons for performance data collection and inquiry
  • Unchanged user interface
  • apstd –heuristc=ghs
• Tested on SunOS 5.9 and Linux 2.4.20
• Releases cover contention-related availability; fault-related availability is a work in progress.
