
Holding slide prior to starting show

Holding slide prior to starting show. (Some) Key Issues in Grid Computing. David Walker School of Computer Science Cardiff University. http://www.cs.cf.ac.uk/user/David.W.Walker. Main Thesis of Talk.

Presentation Transcript


  1. Holding slide prior to starting show

  2. (Some) Key Issues in Grid Computing David Walker School of Computer Science Cardiff University http://www.cs.cf.ac.uk/user/David.W.Walker

  3. Main Thesis of Talk • At a surface level many aspects of Grid Computing appear to be straightforward, and reduce to simple programming tasks and the use of existing tools. • This talk aims to show that for domain scientists to effectively use the Grid many challenging CS issues need to be addressed.

  4. A Typical Scientific Process

  5. Key Elements of the Grid • The specification of problems – how do you program the Grid? • The dynamic discovery of Grid resources. • Provenance support for Grid applications. • The interoperability and federation of different Grid middleware stacks. • Grid access to legacy applications. • Support for remote collaboration over the Grid.

  6. A Simple Example • A simple use of the Grid involves the use of a PSE or portal to do a set of pre-determined tasks. • This corresponds to the “utility computing” mode of use. • No support for building new applications or services. • No support for dynamic discovery of resources. • No support for collaboration.

  7. Programming the Grid • Problem specification could involve: • Use of a high-level domain-specific programming/scripting language. • Representing coordinated tasks with a workflow graph assembled in a visual programming environment. • Use of recommender systems to assist users in formulating and solving problems.

  8. Workflow • Commonly used to represent applications composed of interacting services. • Services may be hierarchical – composed of other services. • Easy to represent graphically, but not scalable with number of services or number of inputs/outputs.

  9. Problems in Workflow Composition • How do you know that the input port of one service is compatible with the output port of another service? • Given that the services may have been created by different people/organisations? • Type signatures must match, but semantics must also match.

  10. Annotating Services • To support “plug-and-play” between services in a workflow requires the use of ontologies. • Need to give semantic content (meaning) to service inputs and outputs. • This allows composition hints in the form of “semantic suggestions”. For example, for a given service port we could find all services that could be connected to it.
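A minimal sketch of the "semantic suggestions" idea from slides 9 and 10, assuming a toy ontology and invented service descriptions (the `ONTOLOGY` table, the `Port` and `Service` classes, and the service names are all hypothetical, not taken from the talk). An output port is suggested for an input port only when the type signatures match and the output's concept is the same as, or a specialisation of, the input's concept.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each concept maps to its direct parent concept.
ONTOLOGY = {
    "TimeSeries": "Dataset",
    "Spectrum": "Dataset",
    "Dataset": None,
}


@dataclass(frozen=True)
class Port:
    data_type: str  # syntactic type signature, e.g. "float[]"
    concept: str    # semantic annotation drawn from the ontology


@dataclass(frozen=True)
class Service:
    name: str
    inputs: tuple
    outputs: tuple


def is_subconcept(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the ontology."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False


def compatible(out_port, in_port):
    """An output can feed an input when the types match and the meaning fits."""
    return (out_port.data_type == in_port.data_type
            and is_subconcept(out_port.concept, in_port.concept))


def suggest(out_port, services):
    """Names of services with at least one input that `out_port` can feed."""
    return [s.name for s in services
            if any(compatible(out_port, p) for p in s.inputs)]


SERVICES = [
    Service("FFT", (Port("float[]", "TimeSeries"),), (Port("float[]", "Spectrum"),)),
    Service("Plot", (Port("float[]", "Dataset"),), ()),
    Service("PeakFinder", (Port("float[]", "Spectrum"),), (Port("float[]", "Dataset"),)),
]

signal = Port("float[]", "TimeSeries")
suggestions = suggest(signal, SERVICES)
```

In this toy setup `PeakFinder` has the same type signature as the other services but is not suggested for a time series, which is the case where matching types alone would give a wrong connection.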

  11. Types of Workflow Composition • Manual: User generates workflow graphically or through text editor. Examples: Triana, BPWS4J, Self-Serve. • Semi-automated: “Semantic suggestions”. User still has to select the service required from a shortlist. Examples: Cardoso & Sheth, GEODISE, myGRID, Sirin, Hendler et al. • Automated: The entire composition is automated using AI technologies. Examples: SHOP2, Pegasus – ISI, McIlraith, IRS-II.

  12. Workflow Composition in Semantic Grids • Semantic Web technologies enable automation at several levels – automated resource discovery, selection, management, service composition, execution. • Promises automated seamless interoperation of autonomous, heterogeneous distributed applications. • Our focus is on the use of Semantic Web technologies to automate service composition in Grid environments. • See S Majithia, DW Walker, and WA Gray “Automatic Composition of Web Services,” in Proceedings of the UK e-Science Programme All-Hands Meeting 2004. Available online at http://www.allhands.org.uk/proceedings/papers/148.pdf • Main developer is Shalil Majithia.

  13. Framework - Overview (diagram: a high-level objective enters the WFMS, which is linked to AWFC, CWFC, RS, MMS, AWFR, CWFR, and RB) • WFMS – Workflow Manager Service • AWFC – Abstract Workflow Composition Service • CWFC – Concrete Workflow Composition Service • RS – Reasoning Service • MMS – Matchmaking Service • AWFR – Abstract Workflow Repository • CWFR – Concrete Workflow Repository • RB – Rulebase

  14. Framework - Interactions (diagram: Client, WFMS, AWFC, CWFC, WFEE, with numbered messages) • 1. High Level Request • 2. Request for Abstract WF • 3. Composed Abstract WF • 4. Request for Concrete WF • 5. Composed Concrete WF • 6. Request for Execution • 7. Results or Request for Alternatives • 8. Final Results

  15. Abstract Workflow Composer • An abstract workflow specifies a workflow without referring to a specific service implementation. • The Abstract Composer tries to generate an abstract workflow by using: • AWF Repository: stores semantically annotated descriptions of services and workflows. Uses an ontology to match services. • Rulebase: a rulebase specifies the “recipe” to achieve an objective. • Chaining services: try to chain services by matching service outputs and inputs.

  16. Concrete Workflow Composer • A concrete workflow specifies an executable workflow by referring to specific service implementations. • The Concrete Composer tries to generate an executable workflow by using: • Matchmaking: match abstract workflow with service implementations available at that time. • Chaining services: try to chain services by matching service outputs and inputs.

  17. Other Components • Matchmaker service (based on that of Paolucci et al.) adapted for dynamic substitution. • Chaining service: backward chaining service based on domain ontologies. • Repositories: store semantically annotated abstract and concrete workflows.
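The chaining idea (matching one service's outputs to another's inputs, working backwards from the goal) can be sketched as follows. This is an illustration only, not the talk's Chaining Service: the service table and concept names are hypothetical, and concepts are compared by plain string equality instead of ontology reasoning.

```python
# Hypothetical service descriptions: name -> (input concepts, output concept).
SERVICES = {
    "ReadFile": ((), "RawSignal"),
    "Filter": (("RawSignal",), "CleanSignal"),
    "FFT": (("CleanSignal",), "Spectrum"),
}


def backward_chain(goal, available, services, _seen=None):
    """Return an ordered list of service names that produces `goal`,
    or None if no chain exists. `available` holds concepts already on hand."""
    seen = _seen or frozenset()
    if goal in available:
        return []          # nothing to run, the data is already there
    if goal in seen:
        return None        # cyclic dependency, give up on this branch
    for name, (inputs, output) in services.items():
        if output != goal:
            continue
        plan = []
        for needed in inputs:
            sub = backward_chain(needed, available, services, seen | {goal})
            if sub is None:
                break      # this service cannot be fed, try the next one
            plan += [s for s in sub if s not in plan]
        else:
            return plan + [name]
    return None


from_raw = backward_chain("Spectrum", {"RawSignal"}, SERVICES)
from_nothing = backward_chain("Spectrum", set(), SERVICES)
impossible = backward_chain("Image", set(), SERVICES)
```

Starting from the goal and recursing on each unmet input is what makes this backward chaining; a forward chainer would instead start from the available data and apply every service whose inputs are satisfied.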

  18. Implementation • All components implemented as Web services using Axis server. • Services and workflows described using OWL-S. • DQL/JTP server used for subsumption reasoning. • Rulebase implemented in RuleML. • Plug-in module enables generation of concrete workflows in BPEL4WS. (Figure: snippet of OWL-S Profile for FFT.)

  19. Family Tree Example • Family trees have 3 basic relationships: • Spouse_of • Child_of • Parent_of • Other relationships (aunt, grandparent, cousin, etc.) can be expressed in terms of these relationships through an ontology.

  20. Cousins Example • Suppose we want to create a workflow to find the cousins of a given person, X. • Query is submitted to WFMS which checks the AWF repository (i.e., checks annotated name of workflows) • If no match then check rule base

  21. Rulebase • Grandparents(X) = Parents[Parents[X]] • Cousins(X) = Exclude[Grandchildren[Grandparents(X)], Children[Parents[X]]] • Note: There is no rule for Grandchildren[X]. The Chaining Service would deduce how to do this from the ontology.
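As a worked check of the Cousins rule, here is a small sketch over an invented family (the `PARENTS` table and all names in it are hypothetical). It reads the rule as "grandchildren of X's grandparents, excluding the children of X's parents", and it defines Grandchildren as Children applied twice, which is the kind of expansion the slide attributes to the Chaining Service.

```python
# Hypothetical family data: child -> set of parents.
PARENTS = {
    "X": {"Mum", "Dad"},
    "Sib": {"Mum", "Dad"},
    "Mum": {"Gran", "Grandad"},
    "Aunt": {"Gran", "Grandad"},
    "Cousin1": {"Aunt"},
    "Cousin2": {"Aunt"},
}


def parents(people):
    """All parents of anyone in `people`."""
    return set().union(*(PARENTS.get(p, set()) for p in people))


def children(people):
    """Everyone who has at least one parent in `people`."""
    return {child for child, ps in PARENTS.items() if ps & set(people)}


def grandparents(people):
    return parents(parents(people))


def grandchildren(people):
    # No explicit rule on the slide; expanded here as Children[Children[...]].
    return children(children(people))


def exclude(a, b):
    return a - b


def cousins(x):
    return exclude(grandchildren(grandparents({x})), children(parents({x})))


result = cousins("X")
```

The excluded set, `children(parents({x}))`, contains X and X's siblings, which is why they drop out of the result.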

  22. Abstract Workflow From Rulebase (diagram nodes: X, Grandparents, Grandchildren, Parents, Children, Exclude, Cousins; legend: atomic service, composite service)

  23. WF after Recursive Application of Rulebase (diagram nodes: X, Parents, Parents, Grandchildren, Parents, Children, Exclude, Cousins)

  24. WF after Application of Chaining Service (diagram nodes: X, Parents, Parents, Children, Children, Parents, Children, Exclude, Cousins) • Note opportunity for optimization and parallelism.

  25. Dynamic Resource Discovery and Scheduling • Assume that semantically annotated services can be found through a registry or repository service. • Scheduling of workflow nodes on distributed resources. • Early binding model: bind to specific service/platform at composition time (“validation”). • Intermediate binding model: bind at “compile” time (when converting from XML to executable form). • Late binding model: bind dynamically at runtime. • Later binding allows the use of more up-to-date information to make scheduling decisions. • In our framework binding is done by the Matchmaker Service, and can follow any of the above binding models.
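The three binding models differ only in when the matchmaker is asked for a concrete implementation. The sketch below illustrates that timing with a hypothetical in-memory registry and a least-loaded selection policy; neither the registry layout nor the policy comes from the talk.

```python
# Hypothetical registry: abstract task -> {implementation: current load}.
registry = {"FFT": {"fft@siteA": 0.2, "fft@siteB": 0.5, "fft@siteC": 0.9}}


def matchmake(task):
    """Pick the least-loaded implementation the registry knows about right now."""
    impls = registry[task]
    return min(impls, key=impls.get)


class Node:
    """One workflow node whose service is bound early, at compile time, or late."""

    def __init__(self, task, binding):
        self.task = task
        self.binding = binding
        # Early binding: resolve as soon as the workflow is composed.
        self.impl = matchmake(task) if binding == "early" else None

    def compile(self):
        # Intermediate binding: resolve when converting to executable form.
        if self.binding == "intermediate":
            self.impl = matchmake(self.task)

    def run(self):
        # Late binding: resolve immediately before invocation.
        if self.binding == "late":
            self.impl = matchmake(self.task)
        return self.impl


# Composition time: siteA is the least loaded.
nodes = [Node("FFT", b) for b in ("early", "intermediate", "late")]

registry["FFT"]["fft@siteA"] = 0.95   # load shifts before compilation
for n in nodes:
    n.compile()

registry["FFT"]["fft@siteB"] = 0.99   # load shifts again before execution
registry["FFT"]["fft@siteC"] = 0.1
chosen = {n.binding: n.run() for n in nodes}
```

Each node ends up on a different site because each consulted the registry at a different moment; only the late-bound node sees the load figures that hold at execution time.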

  26. Provenance Support in Service-Oriented Grids • A workflow may produce many intermediate and final data products that may need to be later reviewed and analysed. • A person, project, or organisation may need to archive many such workflows and their results. • Want to store the provenance of data products: how they were produced and why. • Main developer is Shrija Rajbhandari.

  27. Provenance • Provenance can be regarded as historical metadata that provides an explanation of how a particular data product has been generated. • Uniquely defines the derived data. • Identifies what data is passed between services. • Provides a traceable path to the origin of the data.

  28. Provenance Importance and Problem • No known standards to support archiving provenance in a service-oriented Grid environment. • Requires recording the provenance of: • The transformation of data that occurred during the invocation of services in a workflow. • Complex services executed via a workflow engine.

  29. Original Motivation • Would like to be able to view an electronic publication, and click on tables and figures of results to: • See how they were generated: requires provenance browser. • Re-run the workflows that generated the results to verify them, or to perform a “what-if” study by changing the workflow inputs. • See the results of any re-run workflows in the same format as the original data (table or graph).

  30. Provenance Model (diagram: Workflow Engine [BPWS4J]; Provenance Server with Interface, PCS, PQS, RDF Schema, and Jena; Provenance MySQL Database) • PCS = Provenance Collection Service • PQS = Provenance Query Service • Jena is a Java framework for building Semantic Web applications. http://jena.sourceforge.net/

  31. Prototype Provenance System • Provenance Schema • Resource Description Framework (RDF). • Provenance of workflow execution. • Provenance Collection Service (PCS) • Provenance is represented in RDF statements. • Database storage. • Provenance Query Service (PQS) • Client interface to browse provenance. • Allows re-execution of retrieved provenance for “what-if” style of analysis.

  32. Prototype Dataflow (diagram: PCS Client Interface, PCS, BPWS4J Engine, Web Services, Provenance Database, PQS Client Interface) • 1) User Client Interface sends the workflow invocation parameters to PCS. • 2) PCS sends the invocation initiation of a workflow to BPWS4J. • 3) BPWS4J invokes the partner services. • 4) BPWS4J sends message about invoked services, and the input and output parameters, to PCS. • 5) PCS creates RDF representation of the collected provenance data of the workflow execution, using the Provenance RDF schema. • 6) PCS stores the RDF graph in the database server using Jena tools. • 7) PQS Client passes query to the database server, which returns the provenance data using Jena tools to access RDF data. • 8) PQS allows re-execution of the workflow from the provenance data retrieved. Also allows parameter changes during re-execution of such a workflow.

  33. Services Composition and Invocation • Compose Web services using BPEL4WS • Execute with BPEL4WS compliant engine: IBM’s BPWS4J • Dynamically invoke Web services using Web Service Invocation Framework (WSIF).

  34. Provenance Recording Example: Adding two numbers and multiplying the result by a third number

  35. Provenance Recording (cont.)

  36. Provenance Recording (cont.)

  37. Provenance Query

  38. Re-execution for “what-if” analysis
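The record, query, and re-execute cycle of slides 34 to 38 can be sketched for the add-then-multiply example. This is a simplified stand-in, not the prototype: provenance is kept as plain (subject, predicate, object) tuples in a Python list to mimic RDF statements, whereas the talk's system uses an RDF schema, Jena, and a database. The call identifiers and predicate names are hypothetical.

```python
import operator

# Stand-ins for the partner Web services invoked by the workflow engine.
SERVICES = {"add": operator.add, "multiply": operator.mul}


def invoke(store, call_id, service, x, y):
    """Invoke a service and record the call as (subject, predicate, object) triples."""
    result = SERVICES[service](x, y)
    store.extend([
        (call_id, "service", service),
        (call_id, "input0", x),
        (call_id, "input1", y),
        (call_id, "output", result),
    ])
    return result


def run_workflow(a, b, c, store):
    """Compute (a + b) * c, appending provenance for both invocations to `store`."""
    total = invoke(store, "call1", "add", a, b)
    return invoke(store, "call2", "multiply", total, c)


def lookup(store, call_id, predicate):
    """Query: the first recorded object for a given call and predicate."""
    return next(o for s, p, o in store if s == call_id and p == predicate)


def rerun(store, **changes):
    """Re-execute from recorded inputs, optionally overriding some of them."""
    inputs = {
        "a": lookup(store, "call1", "input0"),
        "b": lookup(store, "call1", "input1"),
        "c": lookup(store, "call2", "input1"),
    }
    inputs.update(changes)
    return run_workflow(inputs["a"], inputs["b"], inputs["c"], store=[])


provenance = []
original = run_workflow(2, 3, 4, provenance)   # (2 + 3) * 4
what_if = rerun(provenance, c=10)              # (2 + 3) * 10
```

Because the original inputs are recovered from the stored records and not from the caller, the same mechanism serves both verification (re-run unchanged) and "what-if" studies (re-run with overrides).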

  39. Support for Collaboration in Grid Environments • Collaboration can take various forms. • Making services available to others. • Making workflows available to others. • Making results available to others. • Collaboratively steering an application. • Collaborative visualisation of results.

  40. Resource-Aware Visualisation Environment (RAVE) • Aims to develop a collaborative visualization environment that scales across a wide range of network-enabled devices. • Will respond to changes in network bandwidth and capabilities of the target display device. • Will start by examining VizServer and COVISE systems. • RAVE postdoc is Dr Ian Grimstead.

  41. RAVE Overview

  42. RAVE Motivation • Current systems make assumptions about available resources. • RAVE makes use of local and/or remote resources, and can react dynamically to changes in these resources and the network connecting them.

  43. RAVE Infrastructure • The RAVE infrastructure is based on Web services. • Services are published and discovered through a UDDI server. • Main services are • Data Service. • Render Service.

  44. Data Service • Imports data from a file, web resource, or external application. • Acts as a central distribution point for scene graph. • Bridging services link to external applications.

  45. Render Service • Render services connect to the Data Service which accepts and broadcasts changes in the scene graph. • Render services contain complete scene graph. • View may be rendered in mono or stereo mode. • Multiple render sessions supported.
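The relationship between the Data Service and the render services described in slides 44 and 45 resembles a publish/subscribe arrangement. The sketch below shows only that pattern, under the assumption that the scene graph can be modelled as a flat dictionary; the class and method names are hypothetical and do not reflect RAVE's actual interfaces, which are Web services.

```python
class RenderService:
    """Holds a complete local copy of the scene graph."""

    def __init__(self, name):
        self.name = name
        self.scene = {}

    def apply(self, node, value):
        # Receive one broadcast change and update the local copy.
        self.scene[node] = value


class DataService:
    """Central distribution point: accepts changes and broadcasts them."""

    def __init__(self):
        self.scene = {}
        self.renderers = []

    def connect(self, renderer):
        # A newly connected renderer first receives the full scene graph.
        renderer.scene = dict(self.scene)
        self.renderers.append(renderer)

    def update(self, node, value):
        self.scene[node] = value
        for renderer in self.renderers:
            renderer.apply(node, value)


data = DataService()
data.update("camera", (0, 0, 5))

desktop = RenderService("desktop")
data.connect(desktop)                  # receives the existing camera node

data.update("mesh", "molecule.obj")    # broadcast to every connected renderer
laptop = RenderService("laptop")
data.connect(laptop)
data.update("camera", (1, 0, 5))
```

After these calls both renderers hold the same two nodes as the Data Service, regardless of when they connected.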

  46. Thin Client • A thin client is a client with modest rendering capabilities, e.g., a PDA. • It can connect to a remote render service and make requests for off-screen rendered copies of the data. • Local user can still manipulate camera and underlying data.

  47. RAVE on Zaurus PDA

  48. Connecting to an Application • Data Service can receive live updates from an external application via a bridging service. • Future work will extend this to allow computational steering.

  49. Other Grid Projects • Quality of Service: http://www.cs.cf.ac.uk/user/Rashid/ • Grid-Enabled Computational Electromagnetics (GECEM): http://www.wesc.ac.uk/projects/gecem/ • Workflow Optimization Services for e-Science (WOSE): http://www.wesc.ac.uk/projects/wose/

  50. Summary • Semantic Web technologies play a key role in enabling: • “plug-and-play” in the composition of services to create workflows. • dynamic discovery of resources. • Support for provenance. • The above, together with collaborative visualisation, are important in convincing scientists (and others) to use the Grid.
