
Lightweight Service Oriented Parallelism

Paul Roe, Queensland University of Technology (QUT), p.roe@qut.edu.au


Presentation Transcript


  1. Lightweight Service Oriented Parallelism Paul Roe Queensland University of Technology (QUT) p.roe@qut.edu.au

  2. Brisbane QUT • Queensland University of Technology (QUT) • One of the largest universities in Australia: 40,000+ students (undergraduate, postgraduate, 10% international) • Applied emphasis, strong links with industry • Motto: “A university for the real world” • Faculty of IT: 4000 students, 20% international

  3. My Background • Academic at QUT for 10 years • I am a computer scientist with a background in: • Programming languages • Distributed computing • Practical / applied emphasis • I lead a small research group interested in grid computing and eScience

  4. Two Parts • Introduction to web services and service orientation • Lightweight Service Oriented Parallelism

  5. Web services

  6. Web services (WS) • Computer-to-computer messaging using XML • Typically SOAP as the messaging protocol, with WSDL (Web Services Description Language) describing the service • Standard and platform neutral • Designed for eCommerce and enterprise application integration • Similarities with MPI: • Message passing • Support for different message exchange patterns • Web service principles and technologies are evolving • Originally SOAP was for lightweight RPC between objects • SOAP and WSDL support both RPC and messaging styles and encodings • Now a strong move towards XML-centric messaging

  7. Why Not CORBA, DCOM, Java RMI etc.? • Distributed object models try to scale the local OO model • OK for a LAN • Breaks on the Internet • Too complex • Assume an object model, virtual machine etc. • Large investment for little return • Poor interoperability • WS designed for interoperability: the primary goal • Designed for local area networks rather than the Internet • Not standards based (except CORBA) • Problems bootstrapping: an ‘all or nothing’ approach • Other attempts e.g. EDI • Problem: fixed, not extensible

  8. XML Basics • XML is the basis for web services • XML is a platform-neutral data language • XML is three things: • A family of specifications e.g. XSLT, XPath, … • A serialisation format (XML 1.0 with tags etc.) • Infoset: a model for data • XML documents can be described by XML Schema

  9. Infoset • Infoset is a model of XML • Essence of XML • XML is no longer just a syntax • This is important: it opens the way to other representations of XML • Common claim: “XML is very inefficient; it’s verbose, there’s lots of angle brackets, everything’s a Unicode string, there’s no binary format; you’ve always got to parse it first, and that’s why web services are slow …” Wrong!

  10. SOAP • Provides two key features for XML based messaging • Separation of message header vs payload data (envelope with header and body) • Standard way to report faults • No further evolution of SOAP necessary! • Extensible header mechanism supports modular and composable advanced services e.g. security, transactions and reliability • Vital feature

  11. SOAP <envelope> • <header>: Message context • <body>: Message payload, data • <fault>: SOAP error (optional)

  12. SOAP Extensible Headers • Extensible header info: can be optional or mandatory • SOAP body: message payload

      <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Header>
          <t:Transaction xmlns:t="some-URI" soap:mustUnderstand="1">5</t:Transaction>
        </soap:Header>
        <soap:Body>
          <Add xmlns="http://www.qut.edu.au/">
            <a>1</a>
            <b>2</b>
          </Add>
        </soap:Body>
      </soap:Envelope>
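A minimal sketch of how a receiver might separate header blocks from the payload, using Python's standard `xml.etree.ElementTree`. The helper names `mandatory_headers` and `body_payload` are hypothetical and not part of the presentation; the envelope is the slide's own example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# The envelope from the slide, trimmed to the namespaces it actually uses.
ENVELOPE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <t:Transaction xmlns:t="some-URI" soap:mustUnderstand="1">5</t:Transaction>
  </soap:Header>
  <soap:Body>
    <Add xmlns="http://www.qut.edu.au/"><a>1</a><b>2</b></Add>
  </soap:Body>
</soap:Envelope>"""


def mandatory_headers(envelope_xml):
    """Return qualified names of header blocks marked mustUnderstand="1"."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    flag = f"{{{SOAP_NS}}}mustUnderstand"
    return [block.tag for block in header if block.get(flag) == "1"]


def body_payload(envelope_xml):
    """Return the first child element of the SOAP body (the message payload)."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    return list(body)[0]
```

A receiver that does not recognise one of the names returned by `mandatory_headers` would be expected to report a fault rather than process the body; header blocks without the flag can be ignored.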

  13. WSDL (1.1) • <definitions>: root element • Abstract part, c.f. interface: • <types>: What data types will be transmitted? (Typically XML Schema) • <message>: What messages will be transmitted? • <portType>: What operations (functions) will be supported? • Concrete part: • <binding>: How will messages be transmitted, plus SOAP specifics, encoding etc. • <service>: Where is the service located? • WSDL is an XML document. Elements can be split across multiple files.

  14. Web service invocation: The big picture • WSDL doc (contains/refs XML schema) describes the web service • Proxy and stub are generated using developer tools e.g. Visual Studio or Eclipse • Sender: Client Program → Web service Proxy → Serialise message • Send XML message on the wire, SOAP format • Receiver: Deserialise message → Web service stub → Server Program

  15. Web Services Landscape • Description: WSDL, XML Schema, WS-Policy • Discovery: UDDI, WS-Discovery, MetaDataExchange • Composable service assurances: Security, Reliable Messaging, Transactions • Messaging: XML, SOAP, WS-Addressing, MTOM • Transport: HTTP, HTTPS, SMTP, TCP, …

  16. Service Orientation

  17. Service Orientation (SO) • Architectural view of software and systems inspired by web services • Much hype! • “Service-oriented development focuses on systems that are built from a set of autonomous services.” (Don Box) • No flat space containing a sea of objects • There are four tenets: • Boundaries are explicit • Services are autonomous • Services share schema and contract, not class • Service compatibility is determined based on policy • Key idea: services are loosely coupled and autonomous • Web services are one possible implementation

  18. SO vs Distributed Objects • CORBA, DCOM, Java RMI etc. try to present a uniform view of the world • Common object model • Set of objects all living in the same space • OK for a LAN: single admin domain, reliable, simple security, homogeneous • Doesn’t work on the Internet • Can’t do business by dictation: “you must use CORBA / RMI / DCOM etc.” • Increasingly doesn’t work in a LAN either • Move to more structure, local firewalls and tiered admin within organisations • Déjà vu? • C.f. TCP sockets (no shared implementation) • Policy => metadata

  19. Parallelism

  20. Motivation and Ideas • Use SOAP instead of MPI • Interoperability • Leverage higher level WS specs e.g. security • Service orientation decouples clients and servers, producers and consumers • Simple producer consumer models of parallelism can benefit from SO • E.g. when producers are legacy applications and consumers are modern e.g. WS enabled apps or modern scripts

  21. Two Simple Models of Parallelism • (Both producer consumer) • Futures (Task-result) • Lisp futures or Cilk etc. • Linda • Tuple space, JavaSpaces etc.

  22. Futures • Idea: spawn function calls asynchronously • handle = Future(Add(1,2)) • Creates a task to perform Add(1,2) • Can interrogate the handle to enquire about the result • Web services can naturally express this form of communication • Client-to-cluster interface: handle Add(int,int) and int+ getAdd(handle)

  23. Add Request

      <?xml version="1.0" encoding="utf-8"?>
      <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
          <Add xmlns="http://www.qut.edu.au/">
            <a>1</a>
            <b>2</b>
          </Add>
        </soap:Body>
      </soap:Envelope>

  24. Add Response

      <?xml version="1.0" encoding="utf-8"?>
      <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
          <AddResult xmlns="http://www.qut.edu.au/">437643786432</AddResult>
        </soap:Body>
      </soap:Envelope>

  25. getResultAdd Request

      <?xml version="1.0" encoding="utf-8"?>
      <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
          <getAdd xmlns="http://www.qut.edu.au/">
            <handle>437643786432</handle>
          </getAdd>
        </soap:Body>
      </soap:Envelope>

  26. getResultAdd Response • If the result is not ready, return null (empty)

      <?xml version="1.0" encoding="utf-8"?>
      <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
        <soap:Body>
          <getAddResult xmlns="http://www.qut.edu.au/">3</getAddResult>
        </soap:Body>
      </soap:Envelope>
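The Add / getAdd exchange above can be mimicked in-process to show the shape of the futures model. This is only a sketch: `AddService` and `wait_for` are hypothetical names, a thread pool stands in for the cluster, and plain method calls stand in for the SOAP messages.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


class AddService:
    """In-process stand-in for the cluster-side web service (hypothetical)."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._tasks = {}                 # handle -> running or finished task
        self._ids = itertools.count(1)   # source of fresh handles

    def add(self, a, b):
        """Like the Add request: start the task and return a handle at once."""
        handle = next(self._ids)
        self._tasks[handle] = self._pool.submit(lambda: a + b)
        return handle

    def get_add(self, handle):
        """Like the getAdd request: the result, or None if it is not ready."""
        task = self._tasks[handle]
        return task.result() if task.done() else None


def wait_for(service, handle, interval=0.01):
    """Client-side polling loop: ask again until the result is available."""
    while True:
        result = service.get_add(handle)
        if result is not None:
            return result
        time.sleep(interval)
```

A client would typically issue many `add` calls first and only then start polling, so that the tasks overlap on the cluster.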

  27. Caching • Assume computation is ‘functional’ • Cache results on the server • Sessionless • Poll the server until the result is available • Need to match arguments to see whether the result has already been computed • Can support both kinds of function in the web service interface • Client-to-cluster interface: int+ Add(int,int)
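A small sketch of the sessionless, cached variant, assuming an in-memory dictionary in place of the server's stable store. `CachedAddService` and `run_pending` are hypothetical names; the client simply repeats the same call until the cache holds a result for those arguments.

```python
class CachedAddService:
    """Function cache keyed by arguments; no handles, no sessions (hypothetical)."""

    def __init__(self):
        self._results = {}   # (a, b) -> computed result
        self._pending = []   # argument tuples waiting for an executor

    def add(self, a, b):
        """Return the cached result, or None after queuing the job once."""
        key = (a, b)
        if key in self._results:
            return self._results[key]
        if key not in self._pending:
            self._pending.append(key)
        return None

    def run_pending(self):
        """Stand-in for cluster executors draining the job queue."""
        while self._pending:
            a, b = self._pending.pop(0)
            self._results[(a, b)] = a + b
```

Because the key is the argument tuple, a second client asking for the same computation shares the stored result instead of triggering a new job.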

  28. Data Parallelism • Problem: the asynchronous programming model is rather tricky • Often want to invoke many functions en masse • Can build data parallel abstractions in the language to support data parallelism • E.g. matrix add • Also build into the web service framework: automatically lift pointwise operations • Client-to-cluster interface: int+[] Add(int[],int[])
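Lifting a pointwise operation to arrays can be sketched as a higher-order function. The name `lift` is an assumption for illustration; in the system described, each element would become a separate job, whereas here the operation is just called directly.

```python
from typing import Callable, List, Optional


def lift(op: Callable[[int, int], Optional[int]]):
    """Turn a pointwise operation into one that works on whole arrays."""

    def lifted(xs: List[int], ys: List[int]) -> List[Optional[int]]:
        if len(xs) != len(ys):
            raise ValueError("argument arrays must have equal length")
        # One independent call per element pair; each could run as its own job.
        return [op(x, y) for x, y in zip(xs, ys)]

    return lifted


add_all = lift(lambda a, b: a + b)
```

An element whose result is not yet available would appear as `None` in the returned list, matching the optional result of the single-value call.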

  29. System Overview • (Diagram) Client and Grid/Cluster are decoupled and autonomous • Each talks through web services to the server • Server: web server plus job repository (function cache)

  30. System Properties • Job requestors contact the server to create tasks and poll for results • Job executors poll for jobs • Decouples result requestors/consumers from result producers • Result producers can be legacy code • Result consumers can be different code • Completely decoupled • Can share results • Also naturally fault tolerant if results are cached in a stable store • (Service orientation: 1. Boundaries are explicit 2. Services are autonomous)

  31. Result cache • Need a stable store • Need to efficiently store results and compare XML arguments • Use an XML database e.g. Xindice, SQL Server 2005 etc. • One table per job type e.g. a table for Add • Use stored procedures to perform operations • Need a facility to create tables • Also a web service

  32. Jobs, Schema and Web Services • (Diagram) Job creators / consumers and job executors reach the server's job table through web services • Operations shown: Create table, Create job, Get result, Put result, Data parallel • Described by Schema and WSDL

  33. Database

  34. WSDL, Schema etc. • Typed jobs: when a job type is created, the schema must be provided for the inputs and outputs of the function • The WSDL, table, and web services are created automatically • (Service orientation: 3. Services share schema and contract, not class 4. Service compatibility is determined based on policy)

  35. Details • Using SQL Server 2005 • Supports XML indexing, but not testing XML for equality • Therefore need an efficient mechanism to compare web service call inputs with what is already in the database • Use the canonicalisation provided by XML security and generate a hash from the canonical form
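The canonicalise-then-hash idea can be sketched with the Python standard library. Note the substitutions: `xml.etree.ElementTree.canonicalize` (Python 3.8+) implements C14N 2.0, which may differ from the canonicalisation the system actually used, and SHA-256 and the function name `argument_key` are assumptions for illustration.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


def argument_key(xml_text: str) -> str:
    """Hash the canonical form so equivalent inputs map to the same key."""
    # strip_text=True removes whitespace around text content, so pretty-printed
    # and compact versions of the same call arguments canonicalise identically.
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


COMPACT = '<Add xmlns="http://www.qut.edu.au/"><a>1</a><b>2</b></Add>'
INDENTED = """<Add xmlns="http://www.qut.edu.au/">
  <a>1</a>
  <b>2</b>
</Add>"""
DIFFERENT = '<Add xmlns="http://www.qut.edu.au/"><a>1</a><b>3</b></Add>'
```

Storing the key in an indexed column would let the database find an earlier call with the same arguments by a plain equality lookup, without comparing XML values directly.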

  36. User Interface

  37. Utilising Idle Machines • (Old project G2, g2.fit.qut.edu.au) • System is amenable to cycle scavenging • Extend the system to also support code caching and distribution for simple code • Can be heterogeneous and support Java applets, .NET etc. • Volunteer machines download jobs and code • Extra table in database

  38. Results • BLAST application running on a ten node test cluster • Speedup of 9.96 times for 40 jobs of approx 1m57s duration • The bioinformatics SVM application in a 50 PC lab (cycle scavenging) • Speedup of 46 times with 200 jobs of approx 1m44s duration (input and output were negligible) • Works well for coarse grained parallelism • To generate tasks simply send an XML doc to the server via a tool or DIY
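To put the reported speedups in context, the sketch below computes parallel efficiency (speedup divided by worker count) and a rough wall-clock estimate. Two assumptions are made that the slides do not state: that all ten nodes and all 50 PCs took part, and that the serial time is simply the number of jobs times the approximate job duration.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear speedup achieved."""
    return speedup / workers


def estimated_wall_time(jobs: int, job_seconds: float, speedup: float) -> float:
    """Serial time (jobs * duration, an assumption) divided by the speedup."""
    return jobs * job_seconds / speedup


# Figures from the slide: 1m57s = 117 s, 1m44s = 104 s.
blast_efficiency = parallel_efficiency(9.96, 10)
svm_efficiency = parallel_efficiency(46, 50)
blast_wall = estimated_wall_time(40, 117, 9.96)
svm_wall = estimated_wall_time(200, 104, 46)
```

Under these assumptions the cluster run is close to linear and the cycle-scavenging run somewhat below it, which is consistent with the slide's remark that the approach suits coarse grained parallelism.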

  39. REST • Many end user applications support binding to XML • E.g. in Excel can simply import XML data • REST – different style of web services based on HTTP verbs • Expose results as XML through a URL e.g. • eresearch.fit.qut.edu.au/g2x/Add/1/2 • Results in an XML doc
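The example URL suggests a path layout of prefix, job type, then arguments. The sketch below parses a URL on that assumption; the layout is inferred from the single example on the slide, and `parse_job_url` is a hypothetical helper, not part of the system.

```python
from urllib.parse import urlsplit


def parse_job_url(url: str):
    """Split a result URL into (job type, argument strings).

    Assumes the path looks like /<prefix>/<job>/<arg>/<arg>/...
    """
    # Add "//" when no scheme is given so the host is not parsed as a path.
    parts = urlsplit(url if "//" in url else "//" + url)
    segments = [s for s in parts.path.split("/") if s]
    return segments[1], segments[2:]
```

With this layout a spreadsheet or script only needs to build a URL to request a result; no SOAP tooling is involved on the consumer side.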

  40. Linda • (Work in progress) • Alternative simple model of parallelism • Linda has a tuple space and 4 operations: • in, out, rd, eval • Add and copy/remove tuples from tuplespace • Remove and copy by associative matching on data • Naturally asynchronous model

  41. XML Databases and Linda • Use XML instead of tuples • XML databases store XML data and support querying data • Build a Linda-like system • SQL Server supports XQuery (Xindice supports XPath) • Use XQuery to query for data • XQuery is a SQL-like functional language for querying XML data • Have a few simple web services to add and remove XML data • (Related work on XSpaces etc.)

  42. Operations • As in the functional case, support creation of typed XML tables, but holding just a single XML value • Operations (web services): • URL CreateLindaTable(XML Schema) • void Put(XMLDoc[]) • XMLDoc[] Take(XQuery-string) • XMLDoc[] Copy(XQuery-string)

  43. Linda Cluster • (Diagram) Producers call Put(<foo> … </foo>) through web services • Table holds XML documents: <foo> </foo> <foo> </foo> <foo> </foo> • Consumers call Take(“for $v in / where $v/@val < 2000 return $v”) and receive the matching <foo> documents
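The Put / Take / Copy operations can be sketched with an in-memory table. This is an illustration only: `LindaTable` is a hypothetical class, and because the Python standard library has no XQuery engine, a Python predicate over the parsed element stands in for the XQuery string.

```python
import xml.etree.ElementTree as ET


class LindaTable:
    """In-memory stand-in for one XML table in the tuple space (hypothetical)."""

    def __init__(self):
        self._docs = []

    def put(self, docs):
        """Add XML documents (given as strings) to the table."""
        self._docs.extend(ET.fromstring(d) for d in docs)

    def copy(self, matches):
        """Return matching documents and leave them in the table (like rd)."""
        return [ET.tostring(d, encoding="unicode") for d in self._docs if matches(d)]

    def take(self, matches):
        """Return matching documents and remove them from the table (like in)."""
        taken = [d for d in self._docs if matches(d)]
        self._docs = [d for d in self._docs if not matches(d)]
        return [ET.tostring(d, encoding="unicode") for d in taken]


def val_below_2000(doc):
    """Predicate playing the role of the slide's query on @val < 2000."""
    return int(doc.get("val")) < 2000
```

The difference between `copy` and `take` is the only state change in the model: a taken document can be consumed by exactly one caller, which is what makes the table usable as a work queue.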

  44. Preliminary Results • Preliminary results encouraging • Sending around XQueries raises some security issues e.g. DoS attacks etc. • Model well suited to certain algorithms e.g. genetic algorithms, where there is a set of improving values • Producers and consumers tend to be the same program • But just need to generate and send XML docs to the server • Can have multiple tables • Locking?

  45. Future Work • Search on functional parallelism cache • Notification interface • WS Resource Framework • Untyped jobs • Security • Connect to a proper job scheduler • Server is a bottleneck: can we use database replication etc. to alleviate this?

  46. Conclusions • Web services and databases can support simple lightweight service oriented parallelism • Service orientation very useful, particularly the decoupling • Databases useful – highly tuned • Need to support different paradigms
