
Compiling High-level Access Interfaces for Multi-site Software

CHAIMS: Mega-Programming Research. Compiling High-level Access Interfaces for Multi-site Software. Stanford University. Objective: Investigate revolutionary approaches to large-scale software composition. Approach: Develop and validate a composition-only language.



Presentation Transcript


  1. CHAIMS: Mega-Programming Research Compiling High-level Access Interfaces for Multi-site Software Stanford University Objective: Investigate revolutionary approaches to large-scale software composition. Approach: Develop and validate a composition-only language. Planned contributions: Asynchrony by splitting up CALL-statement. Hardware and software platform independence. Potential for multi-site dataflow optimization. Performance optimization by invocation scheduling. CHAIMS

  2. Participants • Support • DARPA ISO EDCS program (1996-1999) • Siemens Corporate Research (1996-1998) • DoD AFOSR AASERT student support (1997-1999) • Sloan Foundation - computer industry study (1996-97) • People - Gio Wiederhold PI -- Marianne Siroker: Administration • Dorothea Beringer (postdoc EPF Lausanne) since Dec.1997 • Ron Burback (CS PhD cand.) • Laurence Melloul (CS MS) • Woody Pollack (CS MS) • MS and BS CS graduated: Joshua Hui, Gaurav Bhatia, Prasanna Ramaswami, Kirti Kwatra, Pankaj Jain, Mehul Bastawala, Catherine Tornabene, Wayne Lim, Connan King • Louis Perrochon (postdoc ETH Zurich) Fall quarter 1996 CHAIMS

  3. My Personal Background • Masters in Computer Science: hybrid-monitoring tool for debugging and software performance analysis for distributed software • Software engineer: telecommunication systems • Consultant: software methodologies, quality assurance, project management, CASE-tools • PhD: Modeling scenarios in object-oriented analysis • Teaching: Fusion • Now: CHAIMS -- large-scale software composition, distributed systems CHAIMS

  4. Presentation • Motivation and Objectives • changes in software production • basis for new visions and education • Concepts of CHAIMS • CHAIMS language • CHAIMS architecture and composition process • Scheduling • Dataflow optimization • Status, Plans, Conclusions CHAIMS

  5. Shift in Programming Tasks [Figure: the balance between integration and coding, plotted over 1970, 1990 and 2010.]

  6. Typical Scenario: Logistics A general has to ship troops and/or various material from L.A. to Washington DC: • different kinds of material: criteria for preferred transport differ • not every airport is equally suited • congestion, prices • weather constraints • exact due or ready dates Today: calling different companies, looking up information on the web, making reservations by hand Future: a system proposes possibilities that take the various conditions into account • hand-coded systems • composition of processes

  7. Scaling alternatives ? CHAIMS

  8. CHAIMS [Figure: a megaprogram for composition, written by a domain programmer; the CHAIMS system automates generation of the client for the distributed system; the megamodules are provided by various megamodule providers.]

  9. Megamodules - Definition Megamodules are large, autonomous, distributed, heterogeneous services or processes. • large: computation intensive, data intensive, ongoing processes (monitoring services) • distributed: to be used by more than one client • heterogeneous: accessible by various distribution protocols (not only different languages and systems) • autonomous: maintenance and control over resources remain with the provider, differing ontologies ( ==> SKC) Examples: • logistics: “find best transportation route from A to B”, reservation systems • genomics: easier framework for composing various processing tools than ad-hoc coding

  10. Challenge: Thin Clients (1) [Figure labels: Domain expert; Client computer; I/O; Control & Computation Services; Data Resources; nodes a to e; Wrappers to resolve differences.]

  11. Challenge: Thin Clients (2) [Figure labels: Domain expert; Client workstation; IO module (twice); Computation Services; MEGA modules; Sites; Data Resources; nodes labelled a to e and C, R, S, T, U.]

  12. Challenge: Heavy-weight Services Services are not free for a client: • execution time of a service • transfer time for data • fees for services What we need: ==> monitoring progress of a service ==> possibility to choose among equivalent services based on estimated waiting time and fees ==> parallelism among services ==> preliminary overview results, choosing level of accuracy / number of results for complex processes ==> novel optimization techniques

  13. Challenge: Non-technical Domain Experts Company providing services: • domain experts in the domain of the service (e.g. weather) • technical experts for programming for distribution protocols and setting up servers in a middleware system • marketing experts “Megaprogrammer”: • is a domain expert in the domain that uses these services • is not a technical expert in the middleware system or an experienced programmer • wants to focus on the problem at hand (= results of using the megaprogram) • e.g. scientist, assistant of a general

  14. Challenge: Purely Compositional Language Possible? Which languages succeeded? • Algol, Ada: integrated composition and computation • C, C++: focus on computation Why a new language? • complexity: not all facilities of a common language (compare to the approach of Java) • inhibiting traditional computational programming (compare C++ and Smalltalk concerning object-oriented programming) • focus on the issues of composition, parallelism by asynchrony, and optimization

  15. CHAIMS “Logical” Architecture Customer Megaprogram clients (in CHAIMS) Network/Transport (DCE, CORBA,...) Megamodules (Wrapped or Native) CHAIMS

  16. CHAIMS Physical Architecture Megaprogram Clients in CHAIMS Network DCE, CORBA, JAVA RMI, DCOM... Megamodules (wrapped, native) each supporting setup, estimate, invoke, examine, extract, and terminate. CHAIMS

  17. Decomposing CALL statements CALL gained functionality • Copying • Code sharing • Parameterized computation • Objects with overloaded method names • Remote procedure calls to distributed modules • Constrained (black box) access to encapsulated data progress in scale of computing CHAIMS decomposes CALL functions Setup Estimate Invoke Examine Extract CHAIMS

  18. CHAIMS Primitives Pre-invocation: • SETUP: set up the connection to a megamodule • SETATTRIBUTES, GETATTRIBUTES: set or get global parameters in a megamodule • ESTIMATE: get estimate of execution time for optimization Invocation and result gathering: • INVOKE: start a specific method • EXAMINE: test status of an invoked method • EXTRACT: extract results from an invoked method Termination: • TERMINATE: terminate a method invocation or a connection to a megamodule Control: • WHILE, IF Utility: • GETPARAM: get default parameters

  19. Megaprogram Example: Overview General I/O-megamodule • Input function takes as parameter a default data structure containing names, types and default values for expected input Travel information: • Computing all possible routes between two cities • Computing the air and ground cost for each leg given a list of city-pairs and data about the goods to be transported Two megamodules that offer equivalent functions for calculating optimal routes • Optimum and BestRoute both calculate the optimum route given routes and costs • Global variables: Optimization can be done for cost or for time InputOutput - Input - Output RouteInfo - AllRoutes - CityPairList - ... AirGround - CostForGround - CostForAir - ... Routing - BestRoute - ... RouteOptimizer - Optimum - ... CHAIMS

  20. Megaprogram Example: Code
      // Setup connections to megamodules.
      io_mmh = SETUP ("InputOutput")
      route_mmh = SETUP ("RouteInfo")
      ...
      // Set global variables valid for all invocations of this client.
      best2_mmh.SETATTRIBUTES (criterion = "cost")
      // Get information from the megaprogram user about the goods to be
      // transported and about the two desired cities.
      cities_default = route_mmh.GETPARAM(Pair_of_Cities)
      input_cities_ih = io_mmh.INVOKE ("input", cities_default)
      WHILE (input_cities_ih.EXAMINE() != DONE) {}
      cities = input_cities_ih.EXTRACT()
      ...
      // Get all routes between the two cities.
      route_ih = route_mmh.INVOKE ("AllRoutes", Pair_of_Cities = cities)
      WHILE (route_ih.EXAMINE() != DONE) {}
      routes = route_ih.EXTRACT()
      // Get all city pairs in these routes.
      // Calculate the costs of all the routes.
      …
      // Figure out the optimal megamodule for picking the best route.
      IF (best1_mmh.ESTIMATE("Best_Route") < best2_mmh.ESTIMATE("Optimum") )
      THEN {best_ih = best1_mmh.INVOKE ("Best_Route", Goods = info_goods, Pair_of_Cities = cities, List_of_Routes = routes, Cost_Ground = cost_list_ground, Cost_Air = cost_list_air)}
      ELSE {best_ih = best2_mmh.INVOKE ("Optimum", Goods = info_goods, …
      // Pick the best route and display the result.
      ...
      // Terminate all invocations
      best2_mmh.TERMINATE()
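The ESTIMATE comparison in the megaprogram above picks between two equivalent megamodules. A minimal Python sketch of the same decision, generalised to any number of candidates, might look as follows. The function name `pick_megamodule` and the estimate values are invented for illustration, and the tie-breaking differs slightly from the IF/ELSE in the slide.

```python
def pick_megamodule(candidates):
    """Return the (megamodule, method) pair with the smallest estimate.

    `candidates` maps (megamodule name, method name) to an estimated
    execution time, as ESTIMATE might report it. Ties go to the first entry.
    """
    return min(candidates, key=candidates.get)


# Hypothetical estimates in seconds; the numbers are invented.
estimates = {
    ("Routing", "Best_Route"): 12.0,
    ("RouteOptimizer", "Optimum"): 7.5,
}
chosen = pick_megamodule(estimates)
```

A fuller cost function could also weigh fees and data transfer time, which the slides list alongside execution time as costs of a service.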

  21. Operation of one Megamodule • SETUP • SETATTRIBUTES provides context • ESTIMATE serves scheduling • INVOKE initiates remote computation • EXAMINE checks for completion • EXTRACT obtains results • TERMINATE (I / ALL) [Figure: each primitive is annotated with the handle involved, either a megamodule handle (M handle) or an invocation handle (I handle).]

  22. CHAIMS Megaprogr. Language Purely compositional: • no primitives for arithmetic ==> math megamodules • no primitives for input/output ==> general and problem-specific I/O megamodules Splitting up CALL-statement: • parallelism by asynchrony in sequential program • novel possibilities for optimizations • reduction of complexity of invoke statements • higher-level language (assembler => HLLs, HLLs => composition/megamodule paradigm) CHAIMS

  23. Architecture: Runtime [Figure: the CSRT (compiled megaprogram) reaches megamodules a to e through a distribution system (CORBA, RMI…).]

  24. Architecture: Composition Process Megamodule provider: wraps non-CHAIMS-compliant megamodules (wrapper templates); adds information to the CHAIMS repository. [Figure: megamodules a to e.]

  25. Architecture: Composition Process Megaprogrammer: writes the megaprogram (in CHAIMS language), with information from the CHAIMS repository. CHAIMS compiler: takes information from the CHAIMS repository and generates the CSRT (compiled megaprogram).

  26. Architecture: Overview Megamodule provider: wraps non-CHAIMS-compliant megamodules (wrapper templates); adds information to the CHAIMS repository. Megaprogrammer: writes the megaprogram (in CHAIMS language), with information from the repository. CHAIMS compiler: takes information from the repository and generates the CSRT (compiled megaprogram). [Figure: the CSRT and megamodules a to e connected by the distribution system (CORBA, RMI…).]

  27. Architecture: Overview Megamodule providers: wrap non-CHAIMS-compliant megamodules (wrapper templates); add information to the CHAIMS repository. Megaprogrammer: writes the megaprogram (in CHAIMS language), with information from the repository. CHAIMS compiler: takes information from the repository and generates the CSRT (compiled megaprogram). [Figure: at run-time, the CSRT and megamodules a to e are connected by the distribution system (CORBA, RMI…).]

  28. Architecture: CHAIMS-Language and CHAIMS-Protocols The CHAIMS API defines the interface between megaprogrammer and megaprogram; the megaprogram is written in the CHAIMS language. The CHAIMS protocols define the calls the megamodules have to understand. These protocols differ slightly between distribution protocols and are defined by an IDL for CORBA, another IDL for DCE, and a Java class for RMI. [Figure labels: Megaprogrammer; CHAIMS-language; Megaprogram; CHAIMS-protocols (CORBA-idl, DCE-idl, Java-class); Megamodules.]

  29. Architecture: Gentype A Gentype is a triple of name, type and value, where value is either a simple type or a list of other gentypes (i.e. a complex type). Possible simple types: given by ASN.1, the ASN.1-conversion library for C++, our own conversion routines. Example: Person_Information
      name             type      value
      Name of Person   complex
      Personal Data    complex
      Address
      First Name       string    Joe
      Last Name        string    Smith
      Date of Birth    date      6/21/54
      Soc.Sec.No       string    345-34-345
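The Gentype definition maps naturally onto a small recursive data structure. The Python sketch below is one possible reading, not the CHAIMS implementation: the class layout and the `find` helper are assumptions, and so is the nesting of the example (First Name and Last Name under Name of Person, Date of Birth and Soc.Sec.No under Personal Data). The Address entry is left out because the slide gives no type or value for it.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Gentype:
    """A (name, type, value) triple; value is simple or a list of Gentypes."""
    name: str
    type: str
    value: Union[str, int, bool, List["Gentype"]]

    def is_complex(self) -> bool:
        return isinstance(self.value, list)

    def find(self, name):
        """Depth-first search for a nested Gentype by name (hypothetical helper)."""
        if self.name == name:
            return self
        if self.is_complex():
            for child in self.value:
                hit = child.find(name)
                if hit is not None:
                    return hit
        return None


# Assumed nesting of the Person_Information example from the slide.
person = Gentype("Person_Information", "complex", [
    Gentype("Name of Person", "complex", [
        Gentype("First Name", "string", "Joe"),
        Gentype("Last Name", "string", "Smith"),
    ]),
    Gentype("Personal Data", "complex", [
        Gentype("Date of Birth", "date", "6/21/54"),
        Gentype("Soc.Sec.No", "string", "345-34-345"),
    ]),
])
last_name = person.find("Last Name").value
```

In CHAIMS the structure would additionally be BER-encoded for transport, a step this sketch does not attempt.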

  30. Wrapper: CHAIMS Compliance • CHAIMS protocol: knowing all CHAIMS primitives • State management and asynchrony: • clientId (megamodule handle in CHAIMS language) • callId (invocation handle in CHAIMS language) • results must be stored for possible extraction(s) until termination of the invocation • Data transformation: • all parameters of type blob (BER-encoded Gentype) must be converted into the megamodule-specific data types (combination of hand-coding and decoding routines)
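The state-management requirement can be pictured with a small bookkeeping class. This Python sketch is an assumption about how a wrapper might track clientId and callId; the class `WrapperState` and its method names are hypothetical, and the invocation runs synchronously for brevity where a real wrapper would run it asynchronously.

```python
import itertools


class WrapperState:
    """Hypothetical per-client and per-call bookkeeping inside a wrapper."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._calls = {}   # callId -> (clientId, stored result)

    def setup(self):
        # Hands out a clientId (the megamodule handle on the client side).
        return next(self._ids)

    def invoke(self, client_id, func, *args):
        # Hands out a callId (the invocation handle on the client side).
        call_id = next(self._ids)
        self._calls[call_id] = (client_id, func(*args))
        return call_id

    def extract(self, call_id):
        # Results stay stored, so repeated extraction is allowed.
        return self._calls[call_id][1]

    def terminate_call(self, call_id):
        # Only termination of the invocation releases the stored result.
        self._calls.pop(call_id, None)

    def terminate_client(self, client_id):
        owned = [c for c, (owner, _) in self._calls.items() if owner == client_id]
        for call_id in owned:
            del self._calls[call_id]


state = WrapperState()
client = state.setup()
call = state.invoke(client, sum, [1, 2, 3])
first = state.extract(call)
second = state.extract(call)
state.terminate_call(call)
```

Keeping results until termination is what lets a megaprogram extract the same result more than once.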

  31. Architecture: Three Views Composition View (megaprogram) - composition of megamodules - directing of opaque data blobs Data View - exchange of data - interpretation of data - in/between megamodules CHAIMS Layer Transportation View moving around data blobs and CHAIMS messages Distribution Layer Objective: Clear separation between composition of services, computation of data, and transport CHAIMS

  32. Scheduler: Decomposed Execution [Figure: three timelines comparing synchronous, asynchronous and decomposed execution (decomposed: no benefit for one module). Legend: s = setup / set attributes, i = invoke a method, e = extract results; bars mark execution of a remote method and time available for other methods.]

  33. Scheduler: Optimized Execution [Figure: two timelines for five method executions M1 to M5 with data dependencies. Non-optimized or hand-programmed: i1 M1 e1, i2 M2 e2, i3 M3 e3, i4 M4 e4, i5 M5 e5 in sequence. Optimized by the scheduler according to estimates: i3, i1 and i4 are issued early, with M3 (>M1+M2) and M4 (<M1+M2); i2 and M2 follow e1; e4, e3 and e2 come before i5, M5 and e5. Legend: s = set up / set attributes (not shown), i = invoke a method, e = extract results.]

  34. Scheduling: Simple Example (first column: order in the unscheduled megaprogram; second column: order in the automatically scheduled megaprogram)
      1  1  cost_ground_ih = cost_mmh.INVOKE ("Cost_for_Ground", List_of_City_Pairs = city_pairs, Goods = info_goods)
      2  3  WHILE (cost_ground_ih.EXAMINE() != DONE) {}
            cost_list_ground = cost_ground_ih.EXTRACT()
      3  2  cost_air_ih = cost_mmh.INVOKE ("Cost_for_Air", List_of_City_Pairs = city_pairs, Goods = info_goods)
      4  4  WHILE (cost_air_ih.EXAMINE() != DONE) {}
            cost_list_air = cost_air_ih.EXTRACT()
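The scheduled order in this example (both INVOKEs first, then the two EXTRACTs) can be mimicked in Python with futures. This is only an analogy under stated assumptions: the two cost functions and their factors are invented placeholders, and a local thread pool stands in for the remote megamodule.

```python
from concurrent.futures import ThreadPoolExecutor


def cost_for_ground(city_pairs, goods):
    # Placeholder computation; the factor 1.0 is invented.
    return {pair: 1.0 * goods["weight"] for pair in city_pairs}


def cost_for_air(city_pairs, goods):
    # Placeholder computation; the factor 3.0 is invented.
    return {pair: 3.0 * goods["weight"] for pair in city_pairs}


city_pairs = [("L.A.", "Washington DC")]
info_goods = {"weight": 2}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Scheduled order: issue both invocations first ...
    cost_ground_ih = pool.submit(cost_for_ground, city_pairs, info_goods)
    cost_air_ih = pool.submit(cost_for_air, city_pairs, info_goods)
    # ... then wait for and extract each result.
    cost_list_ground = cost_ground_ih.result()
    cost_list_air = cost_air_ih.result()
```

Since neither invocation needs the other's result, issuing both before waiting lets their execution times overlap instead of adding up.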

  35. Scheduling: Possible Actions INVOKE: call INVOKEs as soon as possible • may depend on other data • moving an INVOKE outside of an if-block depends on a cost function (ESTIMATEs of this and the following functions concerning execution time, dataflow and fees (resources)) EXTRACT: move EXTRACTs to where the result is actually needed • there is no sense in checking or waiting for results before they are needed • instead of waiting, poll all invocations and issue the next possible invocation as soon as its data can be extracted TERMINATE: terminate invocations that are no longer needed (saves resources) • not every method invocation has an extract (e.g. print-like functions)
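The effect of issuing every INVOKE as soon as its input data exists can be estimated with a small calculation over the dependency graph. The Python sketch below is a simplified model, assuming unlimited parallelism, an acyclic graph, and exact estimates. The function name `earliest_finish`, the durations, and the dependencies among M1 to M5 are invented for illustration, loosely following the optimized-execution slide.

```python
def earliest_finish(durations, deps):
    """Finish time of each invocation if issued as soon as its inputs exist.

    durations: name -> estimated execution time
    deps: name -> list of names whose results are needed first
    Assumes an acyclic dependency graph and unlimited parallelism.
    """
    finish = {}

    def visit(name):
        if name not in finish:
            start = max((visit(d) for d in deps.get(name, [])), default=0)
            finish[name] = start + durations[name]
        return finish[name]

    for name in durations:
        visit(name)
    return finish


# Invented estimates: M3 is longer than M1 + M2, M4 is shorter.
durations = {"M1": 2, "M2": 3, "M3": 6, "M4": 4, "M5": 1}
# Assumed dependencies: M2 needs M1; M5 needs M2, M3 and M4.
deps = {"M2": ["M1"], "M5": ["M2", "M3", "M4"]}

finish = earliest_finish(durations, deps)
makespan = max(finish.values())        # overlapped schedule
sequential = sum(durations.values())   # one invocation after another
```

With these numbers the overlapped schedule finishes at time 7 against 16 for the strictly sequential order; the gap depends entirely on the estimates and dependencies chosen.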

  36. Compiling into a Network [Figure: two configurations of a megaprogram and modules A to F, one for the current CHAIMS system and one with automatic dataflow optimization; control flow and data flow are drawn as separate arrows.]

  37. CHAIMS Implementation • Specify minimal language • minimal functions: CALLs, While, If * • minimal typing {boolean, integer, string, handles, object} • objects encapsulated using ASN.1 standard • type conversion in wrappers, service modules* • Compiler for multiple protocols (one-at-time, mixed*) • Wrapper generation for multiple protocols • Native modules for I/O, simple mathematics*, other • Implement API for CORBA, Java RMI, DCE usage • Wrap / construct several programs for simple demos • Schedule optimization * • Demonstrate use in heterogeneous setting * • Define full-scale demonstration * in process CHAIMS

  38. Concept Status • Definition of architecture for Megaprogramming • bottom up assessment of code to be generated • examples: room reservation, shipping • primitives • handles for parallel operation • heterogeneity -- common features of distribution protocols • Minimal language that can generate the code • no versus very few types -- ASN.1 for complex types • natural parallelism -- still a major research issue • Awareness of novel optimizations • information flow constraints -- scheduling • direct data flow between megamodules CHAIMS

  39. Focus for Future • Finishing basic infrastructure and demo examples. • CHAIMS interpreter instead of CHAIMS compiler. • Scheduling of invocations and extractions. • Flexible interaction with megamodules; extracting and handling overview results. • Direct dataflows between megamodules (future project). CHAIMS

  40. Conclusion: Research Questions • Is a Megaprogramming language focusing only on composition feasible? • Can it exploit on-going progress in client-server models and be protocol independent? • Can natural parallelism for distributed services be effectively scheduled? • Can high-level dataflow among distributed modules be optimized? • Can CHAIMS express clearly a high-level distributed SW architecture? • Can the approach affect SW process concepts and practice? CHAIMS

  41. Paying for SW Services You cannot run an effective (SW) business and not be reimbursed for it. How? Four approaches: • Sell software (sell oilfield to customer) • Lease copy / usage rights (lease well) • Time / user limited access (fill tank) • Charge by use instance (provide bus) General problems, effects differ: • IP protection? • keeping SW updated • billing for est. value • performance effect
                 Buy      Lease    Limit    Use
      protect    poor     some     fair     good
      update     poor     ok       good     good
      bill       simple   simple   awkw.    hard
      perform    no       no       little   some

  42. Conclusion: Questions not addressed • Will one Client/Server protocol subsume all others? • distributed optimization remains an issue • Synchronization / Concurrency Control • autonomy of sources negates current concepts • if modules share databases, then database locks may span setup/terminate all for a megaprogram handle. • Will software vendors consider moving to a service paradigm? • need CHAIMS demonstration for evaluation CHAIMS

  43. Composition of Processes... • versus composition and integration of Data • data-warehouses • wrapping data available on web • versus composition of Components • reusing small components via copy/paste or shared libraries locally installed • large distributed components within same “domain” as composition, e.g. within one bank or airline CHAIMS: » processed information » composing autonomous execution threads

  44. Summary • CHAIMS requires rethinking of many common assumptions • gain understanding via simple examples • Work focused on CALL statement decomposition • to accomplish integration of large services • exploit inherent asynchrony • First versions of the architecture and language drafts are completed; basic infrastructure partially available (compiler, wrapper templates). • More demos will come soon. Half-way through a four-year project. ⇒ http://www-db.stanford.edu/CHAIMS

  45. CHAIMS

  46. Other systems (1) Darwin/Regis: • configuration language, architecture definition language, allowing various communication mechanisms (yet all in C++ and under local control) • local (distributed) architecture, non-autonomous, system not only specifies, generates and controls composition, but also basic components • only for homogeneous environment (C++), no legacy Polylith/Polygen: • changes source-files (C, Ada, Pascal, Lisp) according to annotated design data (architecture information) for distribution • only for local distribution (access to source code) and only for TCP/IP CHAIMS

  47. Other systems (2) Hadas: • network centric framework with peer-to-peer communication • Java-based (RMI) with Java-wrappers for non-compliant objects • concept of ambassador objects which are copied to the peer object • allows administrative and design autonomy for objects (heterogeneity, changes of methods) KQML: • protocol for asynchronous communication among agents • allows explicitly different ontologies • offers primitives for getting and updating data, but also for advertising available knowledge CHAIMS

  48. Other systems (3) Differences: • autonomous megamodules, legacy modules • higher-level (just invoke and extract, no specification of various kinds of communication; no specification of module content) • heavy-weight services: ESTIMATE, parallelism by automatic scheduling of invocations according to run-time estimates and cost-functions, EXAMINE, SETPARAM • heterogeneity in distribution protocols Focus so far: heterogeneity in distribution protocol on client side. Focus in future: using ESTIMATE and EXAMINE, automatic optimization and invocation scheduling.

  49. CHAIMS proves that... • We can do composition in a high-level language. • same language for Java-RMI-invocations and CORBA-invocations (and DCE, DCOM, TCP/IP protocols) • (single megaprogram can deal with multiple protocols simultaneously) • multiple megamodules can run in parallel • Large-scale composition can be automated. • in contrast to manual non-software composition (e.g. telephone, cut&paste) • in contrast to fixed programs for one specific problem (e.g. transporting military goods within US) • We can schedule programs in a way that right now only smart logistics officers can, avoiding unnecessary waits. • Scheduling of invocations can be optimized.

  50. Long-term Objectives of CHAIMS 1 Implementing a system for a simple and purely compositional language hiding differences of diverse protocols 2 Automatic optimized scheduling of invocations (taking advantage of inherent parallelism and estimate-capabilities of megamodules, hence splitting up of CALL-statement) 3 Decision-making support (direct) interaction with megamodules, based on overview and incremental results (fixed flow, not yet interactive changes to megaprogram) 4 Automatic dataflow optimization (direct dataflows between megamodules), not yet CHAIMS
