Geospatial Service Workflow Concepts and Tools

Geospatial Service Workflow Concepts and Tools. Liping Di Laboratory for Advanced Information Technology and Standards (LAITS) George Mason University lpd@rattler.gsfc.nasa.gov. Contents. What are Service oriented architecture and web services? What is a workflow tool?  What does it do? 

Presentation Transcript


  1. Geospatial Service Workflow Concepts and Tools Liping Di Laboratory for Advanced Information Technology and Standards (LAITS) George Mason University lpd@rattler.gsfc.nasa.gov

  2. Contents • What are service-oriented architecture and Web services? • What is a workflow tool? • What does it do? • Why do we need one in the Grid? • What are some common workflow tools used by the Grid community and the Web service community?

  3. The Service-Oriented Architecture (SOA) • The key component of the service-oriented architecture is the service. • A service is a well-defined set of actions. It is self-contained, stateless, and does not depend on the state of other services. • Stateless means that each time a consumer interacts with a Web Service, an action is performed. After the results of the service invocation have been returned, the action is finished. There is no assumption that subsequent invocations are associated with prior ones. • In the service-oriented architecture, the description of a service is essentially a description of the messages that are exchanged between the consumer and the service. • Standards-based individual services can be chained together to solve complex tasks. • The implementation of SOA in the web environment is called Web services.
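
The statelessness described on this slide can be illustrated with a minimal Python sketch. The names `SubsetRequest`, `SubsetResponse`, and `subset_service` are hypothetical and do not come from the presentation; the point is only that the result depends on the request message alone, so repeated invocations are independent of each other.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SubsetRequest:
    """Message sent by the consumer; it carries everything the service needs."""
    values: Tuple[float, ...]
    start: int
    stop: int


@dataclass(frozen=True)
class SubsetResponse:
    """Message returned by the service."""
    values: Tuple[float, ...]


def subset_service(request: SubsetRequest) -> SubsetResponse:
    """Hypothetical stateless service: no data survives between invocations."""
    return SubsetResponse(values=request.values[request.start:request.stop])


request = SubsetRequest(values=(1.0, 2.0, 3.0, 4.0), start=1, stop=3)
first = subset_service(request)
second = subset_service(request)  # not associated with the first invocation
print(first == second)
```

Because the service is described entirely by the two message types, a consumer needs nothing beyond that description to call it.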

  4. Web Services • Web Services are self-contained, self-describing, modular applications that can be published, located, and dynamically invoked across the Web. • Web services perform functions, which can be anything from simple requests to complicated business processes. • Once a Web service is deployed, other applications (and other Web services) can discover and invoke the deployed service. • The real power of Web services comes from two facts: • Anyone on the Internet can set up a Web service to provide service to anyone who wants it, so many services will be available. • Standards-based services can be chained together dynamically to solve complicated tasks (just-in-time integration).

  5. Globus Toolkit 3.0 (GT3): OGSA, OGSI, and GT3 [Figure: diagram relating OGSA, OGSI, and GT3, with the labels "the engineer", "the architect", and "the workers"]

  6. Difference between Web Services and Open Grid Services • Globus 3.0 implemented the Open Grid Services Architecture (OGSA). • The fundamental concepts of services in the Grid are the same as those of Web services. • The differences between Grid and Web services include: • A Web service can be invoked by any consumer over the Web, while a Grid service can only be invoked by consumers within the virtual organization, similar to the difference between the Internet and an intranet. • Web services practice has been extended in the Grid to accommodate the additional requirements of Grid services: • Stateful interactions between consumers and services • Exposure of a web service’s “publicly visible state” • Access to (possibly large amounts of) identifiable data • Service lifetime management • Currently the Grid and Web communities are merging through the Web Services Resource Framework (WSRF).

  7. Service operations

  8. Service Operations • Publish – advertise (or remove) data and services to a broker (e.g., a registry, catalog, or clearinghouse). • Find – Service requestors and service brokers collaborate to perform the find operation. Service requestors describe the kinds of services they are looking for to the broker, and the broker delivers the results that match the request. • Bind – A service requestor and a service provider negotiate as appropriate so that the requestor can access and invoke the services of the provider. • Chain – The chain operation binds a sequence of services.
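
The four operations above can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not a real registry API: `Advertisement`, `Broker`, `bind`, and `chain` are hypothetical names, and a plain Python callable stands in for a network endpoint, so the negotiation step of bind is reduced to returning that callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Service = Callable[[int], int]


@dataclass(frozen=True)
class Advertisement:
    """What a provider publishes to the broker (hypothetical structure)."""
    name: str
    kind: str
    endpoint: Service  # stands in for a network address


class Broker:
    """Hypothetical in-memory registry, catalog, or clearinghouse."""

    def __init__(self) -> None:
        self._ads: Dict[str, Advertisement] = {}

    def publish(self, ad: Advertisement) -> None:
        self._ads[ad.name] = ad

    def remove(self, name: str) -> None:
        self._ads.pop(name, None)

    def find(self, kind: str) -> List[Advertisement]:
        """Return the advertisements that match the requested kind."""
        return [ad for ad in self._ads.values() if ad.kind == kind]


def bind(ad: Advertisement) -> Service:
    """Bind: obtain something the requestor can invoke (simplified here)."""
    return ad.endpoint


def chain(services: List[Service]) -> Service:
    """Chain: bind a sequence of services into one callable."""
    def run(value: int) -> int:
        for service in services:
            value = service(value)
        return value
    return run


broker = Broker()
broker.publish(Advertisement("add_five", "offset", lambda v: v + 5))
broker.publish(Advertisement("double", "scale", lambda v: v * 2))

offset = bind(broker.find("offset")[0])
scale = bind(broker.find("scale")[0])
pipeline = chain([offset, scale])
print(pipeline(1))
```

In a real deployment the broker would return service descriptions, and bind would involve the provider directly rather than a lookup in the same process.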

  9. Service Chaining • A service chain is defined as a sequence of services where, for each adjacent pair of services, occurrence of the first action is necessary for the occurrence of the second action. • When services are chained, they are combined in a dependent series to achieve larger tasks. • Three types of chaining are defined in ISO 19119 and by OGC: • User-defined (transparent) – the human user defines and manages the chain. • Workflow-managed (translucent) – the human user invokes a service that manages and controls the chain, where the user is aware of the individual services in the chain. • Aggregate (opaque) – the human user invokes a service that carries out the chain, where the user has no awareness of the individual services in the chain.
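
The three chaining types differ mainly in who controls the chain and how much of it the user can see. The Python sketch below illustrates that difference only; `reproject`, `subset`, `workflow_service`, and `aggregate_service` are hypothetical stand-ins and are not taken from ISO 19119 or any OGC interface.

```python
from typing import Callable, List, Tuple

Data = List[str]
Step = Callable[[Data], Data]


def reproject(data: Data) -> Data:
    """Hypothetical service: returns a new list marking the data as reprojected."""
    return data + ["reprojected"]


def subset(data: Data) -> Data:
    """Hypothetical service that needs the output of reproject."""
    return data + ["subset"]


# User-defined (transparent): the user calls each service and manages the order.
transparent_result = subset(reproject(["scene"]))


def workflow_service(data: Data, steps: List[Step]) -> Tuple[Data, List[str]]:
    """Workflow-managed (translucent): a service runs the chain for the user
    and reports which individual services it invoked."""
    trace = []
    for step in steps:
        data = step(data)
        trace.append(step.__name__)
    return data, trace


def aggregate_service(data: Data) -> Data:
    """Aggregate (opaque): the user sees one service and nothing of the chain."""
    return subset(reproject(data))


translucent_result, trace = workflow_service(["scene"], [reproject, subset])
opaque_result = aggregate_service(["scene"])
print(trace)
```

All three calls produce the same product; only the visibility and control of the intermediate services change.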

  10. Construction of Service Chains • The first type of chaining allows users to construct a geospatial model to be run in the system. • It requires domain knowledge and lets experts contribute their domain knowledge. • The knowledge is kept in the geo-tree/service chain. • The second type of chaining uses an existing geo-tree to materialize a virtual object. • Anyone can use this type of chaining to produce a virtual product on demand. • It cannot produce a product whose geo-tree does not already exist in a data/information system. • The third type of chaining requires the system to be intelligent enough to form a geo-tree/service chain automatically by decomposing the user’s query. • It requires domain knowledge. • It requires automated reasoning. • Anyone can use it, and it can produce a new product automatically based on the user’s query. • The first two types of chaining do not require significant machine intelligence. • Current technology is sufficient for implementing them. • The third type requires significant machine intelligence. • Current technologies are not able to provide this kind of chaining. • Significant research is needed.

  11. Workflows and Workflow Tools • What we mean by a workflow: • The executable scripts representing the service chains. • The total composition and orchestration of an experimental run, including all the details of post-processing, data mining, and visualization. • What the high-end user (scientist) needs to do in order to get the underlying computational code to produce accessible and usable results somewhere. • What in the past was usually done through shell scripting, but now involves more (e.g., RPCs). • Previous examples: not a “single” workflow, but a number of decoupled, cooperating, communicating workflows. • Workflows, in most cases, are encoded in BPEL4WS, an OASIS standard. • Any tools dealing with the creation, management, and execution of workflows are called workflow tools. • The most significant ones are the workflow engines, which manage the execution of workflows.

  12. Steps from Geospatial Process Model to a User-Defined Product (User Geo-Object) [Figure: diagram with three phases (knowledge capture phase, user retrieval phase, user query phase) and the stages geospatial model, virtual geo-object, logical workflow, concrete workflow, workflow execution, and user geo-object]

  13. Availability of Workflow Tools for Geospatial Services • Tools are needed for every step from the creation of geospatial models to the materialization of virtual geospatial products. • General workflow tools are being developed in both the Grid and Web service communities. • Most of the tools have not been tested in a geospatial environment.

  14. Workflow Tools Built by GriPhyN • Use the Virtual Data Language (VDL) from the Globus team to encode both abstract and concrete workflows. • Build an abstract workflow based on VDL descriptions (Chimera). • Build an executable workflow based on the abstract workflow (Pegasus). • Execute the workflow (Condor’s DAGMan). • These tools run under Globus 2.
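
The three stages listed above (abstract workflow, executable workflow, execution in dependency order) can be sketched generically in Python. This is not VDL and does not use any Chimera, Pegasus, or DAGMan interface: the step names, the `site_catalog` mapping, and the `plan` and `execute` helpers are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) is swapped in to order the graph.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Abstract workflow: logical steps and the steps each one depends on.
abstract_workflow: Dict[str, Set[str]] = {
    "extract": set(),
    "reproject": {"extract"},
    "classify": {"reproject"},
    "mosaic": {"reproject"},
    "publish": {"classify", "mosaic"},
}

# Hypothetical catalog mapping each logical step to a site that can run it.
site_catalog: Dict[str, str] = {
    "extract": "siteA",
    "reproject": "siteA",
    "classify": "siteB",
    "mosaic": "siteB",
    "publish": "siteA",
}


def plan(abstract: Dict[str, Set[str]], catalog: Dict[str, str]) -> Dict[str, dict]:
    """Turn the abstract workflow into a concrete one by assigning sites."""
    return {
        step: {"site": catalog[step], "after": sorted(deps)}
        for step, deps in abstract.items()
    }


def execute(concrete: Dict[str, dict]) -> List[str]:
    """Visit the steps in an order that respects every dependency.
    A real engine would submit jobs; this sketch only records the order."""
    graph = {step: set(info["after"]) for step, info in concrete.items()}
    order = []
    for step in TopologicalSorter(graph).static_order():
        order.append(f"{step}@{concrete[step]['site']}")
    return order


concrete_workflow = plan(abstract_workflow, site_catalog)
run_order = execute(concrete_workflow)
print(run_order)
```

Steps with no dependency between them, such as `classify` and `mosaic` here, may appear in either order and could run concurrently in a real engine.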

  15. Alliance Science Portal Expedition Workflow Tools Development • Objective • Provide a workflow tool (engine + interface) through which all of this can be accomplished without any knowledge of: • XML • Jython, Java, or any particular programming language • Provide a tool that is reusable in the sense of not being specific to any one scientific research domain • Approach 1. Templated patterns (a repertoire of pre-defined, parameterized “workflow scripts”) • Just as with designing software systems in general ... • High-level (Sequence, Branch/Merge, Parallel, ...) • Extend these down several levels, e.g.: • “STAGE” = [ make dirs, get files, set permissions ] 2. An environment through which the high-level user can create and manipulate workflow scripts.
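
The idea of templated patterns can be sketched as small composable functions. The Python below is an illustration under stated assumptions, not the portal's actual tool: `step`, `sequence`, `parallel`, and `stage` are hypothetical names, and each task only appends a message to a log instead of touching the file system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Task = Callable[[List[str]], None]


def step(message: str) -> Task:
    """Leaf task: records what it would do rather than doing it."""
    def run(log: List[str]) -> None:
        log.append(message)
    return run


def sequence(*tasks: Task) -> Task:
    """High-level pattern: run tasks one after another."""
    def run(log: List[str]) -> None:
        for task in tasks:
            task(log)
    return run


def parallel(*tasks: Task) -> Task:
    """High-level pattern: run tasks concurrently and wait for all of them."""
    def run(log: List[str]) -> None:
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda task: task(log), tasks))
    return run


def stage(directory: str, files: List[str], mode: str) -> Task:
    """Parameterized "STAGE" template: make dirs, get files, set permissions."""
    return sequence(
        step(f"make dir {directory}"),
        parallel(*(step(f"get file {name}") for name in files)),
        step(f"set permissions {mode} on {directory}"),
    )


log: List[str] = []
stage("run01", ["a.dat", "b.dat"], "750")(log)
print(log)
```

A user fills in the template parameters only; the ordering and concurrency rules live in the pattern, which is what makes the script reusable across domains.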

  16. O.G.R.E.: An Extension to Apache Ant • O.G.R.E. = Open Grid Computing Environments Runtime Engine • What Ant lacked, but we needed: • Broader conditional execution • Ant: based on write-once String properties • A general “loop” structure for Task execution • Data communication between Tasks (and with their containers) • Specialized tasks • File reading and writing • Local and remote file management (gridftp) • Web service related tasks • Event-monitoring and process-monitoring tasks

  17. Workflow Execution Engines in Web Services • We are examining two workflow execution engines: • IBM BPWS4J (http://www.alphaworks.ibm.com/tech/bpws4j) • The Collaxa BPEL Server • IBM BPWS4J is free software, while the Collaxa BPEL Server is commercial software. • Collaxa BPEL Server, Developer Edition: $2K per developer • Collaxa BPEL Server, Enterprise Edition: $20K per CPU • Both engines work in the Web service environment. • Questions that need to be answered: • Are the engines good enough for geospatial Grid/Web services? • Can those engines be made to work in the Grid environment? • How will Grid workflow standards and execution engines evolve?

  18. BPWS4J: The BPEL Engine for Execution • What is BPWS4J? The IBM Business Process Execution Language for Web Services Java Runtime provides a platform upon which business processes written using BPEL4WS may execute. Version 2.0 of the engine (BPWS4j-engine-2.0) supports the BPEL4WS v1.1 specification. • How does it work? For each process, the engine takes in a BPEL4WS document that describes the process, a WSDL document (without binding information) that describes the interface the process will present to clients, and WSDL documents (with binding information) that describe the services the process may or will invoke during its execution. After deployment the process is made available to outside consumers through a SOAP interface. The engine has been tested on WebSphere Application Server 5.0 and on Apache Tomcat under both Linux and Windows. ** Note: This and the next slide are from the BPWS4J documentation.

  19. Developing and Deploying a Process Step 1: Create a BPEL4WS document and the corresponding WSDL document. The WSDL document describes the interface of the process that will be presented to the outside world. (This includes the description of all receive and onMessage elements.) The WSDL document should not contain any bindings; the SOAP binding will be added by the engine during deployment. One service element must be present within the WSDL file (the name of the process is taken from the name attribute on the service element). Step 2: If the process invokes another Web service (i.e., if the process contains an invoke activity), then create or obtain the WSDL document(s) that describe the service to be invoked. These WSDL documents must have bindings and endpoint information that describe where and how the service may be invoked. The engine supports SOAP, EJB, JMS, and direct Java class bindings. Step 3: Deploy the process to the engine. When deploying the process, you will need to specify the WSDL documents that fulfill the partner roles. Step 4: Create the SOAP client. The client interaction with the service is defined by the process's WSDL document that you provided during deployment. Additional Notes: All imports within the WSDL documents must be absolute. If you are deploying on Tomcat and have WSDL documents that contain imports, you must make sure that you have defined the .wsdl extension and the text/xml MIME type to Tomcat; otherwise it will complain about not being able to resolve the imports. You can do so either by modifying conf/web.xml under Tomcat or by modifying the WEB-INF/web.xml file within your WAR file. See the web.xml file in the engine's WAR for an example.
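
Step 4 can be sketched with the Python standard library. This is a minimal illustration, assuming a SOAP 1.1 interface; the operation name `materialize`, the namespace `urn:example:geoprocess`, the parameter `product`, and the endpoint URL are hypothetical and would in practice be read from the process's WSDL document. The sketch builds the request but does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(operation: str, namespace: str, params: Dict[str, str]) -> bytes:
    """Build a SOAP 1.1 envelope whose body holds one operation element."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def build_request(url: str, envelope: bytes, soap_action: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST that carries the envelope."""
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": f'"{soap_action}"',
    }
    return urllib.request.Request(url, data=envelope, headers=headers, method="POST")


# Hypothetical operation, namespace, and endpoint, for illustration only.
envelope = build_envelope("materialize", "urn:example:geoprocess", {"product": "ndvi"})
request = build_request(
    "http://localhost:8080/example/process", envelope, "urn:example:geoprocess/materialize"
)
print(envelope.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send the message once a process is actually deployed at that address.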

  20. The Collaxa BPEL Server • Native BPEL 1.1 Implementation • Easy-to-Use Modeling Tool • Rich and Flexible Binding Framework (Web Services but also JCA, JMS, Email, EDI) • Unparalleled Management and Monitoring (In-flight Instance Management, Auditing, Debugging) • High Performance and Scalability (Throughput, Clustering, Large XML Documents) • Easy-to-deploy/Non-intrusive (Get up and running in less than 15 minutes)

  21. The Collaxa BPEL Server [Figure: architecture diagram with the labels Eclipse BPEL Designer (design), TaskService (tasks, portal), BPEL Server (dehydrate), Java platform, BPEL Console (connect, monitor), and WSDL binding framework (JCA, JMS, Email)]
