
Distributed Programming in Mozart




  1. Distributed Programming in Mozart Per Brand

  2. Programming system for distributed applications • Design a programming system from the start that is suitable for distributed applications (Mozart) • Extend an existing programming system with libraries to support distributed computing (Java) • Provide a distribution layer that is language independent (CORBA); this layer might be needed anyway for communication with foreign software.

  3. Programming system for distributed applications • The programming language by design provides the abstractions necessary for distributed applications: • Concurrency and various communication abstractions • Mobility of code (or more generally closures) and other entities • Mechanisms for security at the language level -- the programming language by construction supports all the concepts needed for allowing arbitrary security levels (no holes) • Notion of sited resources, how to plug and unplug resources • Notion of a distributed/mobile component (for mobility of applications) • Dynamic connectivity, transfer of entities and modification of various applications at runtime • Abstraction of the network transport media

  4. Programming system for distributed applications • The programming system (runtime system) by design provides mechanisms to support: • Network transparency • Well-defined and extended distributed behavior for all language entities -- part of network awareness. • Mechanisms for guaranteeing security on untrusted sites (fake implementations) • Mechanisms for limiting resource (memory and processor time) consumption by foreign computations at runtime • Network layer that supports location transparency (mobile applications) (multiple) IP independent addressing • Configurable and scalable network layer (multiple protocols, TCP, TTCP, Reliable UDP, …) • Dynamic connectivity, fault/connectivity detection • Firewall enabled

  5. Transparency or hiding the network - 1: Centralized Execution • Threads T1 ... Ti ... Tn operate on language entities in one store • Everything runs within a single site or OS process

  6. Transparency or hiding the network - 2: Distributed Execution • Many sites or OS processes, each with its own threads T1 ... Ti ... Tn • All threads operate on one global store

  7. Network Transparency • Language semantics are unchanged irrespective of whether an entity is shared or how it is shared • Observed behavior is identical modulo • Speed (how fast threads run) • Under the assumptions that • Network partitioning is temporary • Sites do not crash • The system gives the programmer/user the illusion of a global computation space • It means: • If you develop an application on a single machine, you can distribute the entities to different sites without changing the logical behavior (functionality/semantics) of the application • If you connect two independent applications together they will logically behave as if they were running on a single machine

  8. Example • Assume: by a magical bootstrapping procedure, two threads on two different sites share a single-assignment variable X • (Figure: threads T1 and T2 on sites 1 and 2, sharing a global store)

  9. Thread T1 and T2 current PC • Site 1: local X1 in X=ping(X1) case X1 of pong then {Show pong} end • Site 2: case X of ping(X1) then {Show ping} X1=pong end • What does network transparency say about what will happen?

  10. Thread T1 and T2 make progress - 1 • Site 1: local X1 in X=ping(X1) case X1 of pong then {Show pong} end • Site 2: case X of ping(X1) then {Show ping} X1=pong end • (Figure: execution steps 1-3) • Eventually ‘ping’ will be written to standard output on site 2

  11. Thread T1 and T2 make progress - 2 • Site 1: local X1 in X=ping(X1) case X1 of pong then {Show pong} end • Site 2: case X of ping(X1) then {Show ping} X1=pong end • (Figure: execution steps 4-6) • Eventually ‘pong’ will be written to standard output on site 1
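
The ping/pong exchange on slides 9-11 can be imitated on a single machine. The following is a minimal sketch in Python, not Oz, and it does not involve Mozart's distribution layer: `DataflowVar` and `run_ping_pong` are hypothetical names, and a `threading.Event` stands in for a single-assignment variable. It only illustrates why the order of the two outputs is fixed by dataflow synchronization, whichever thread happens to start first.

```python
import threading


class DataflowVar:
    """Single-assignment variable: bound at most once, readers block until bound."""

    def __init__(self):
        self._bound = threading.Event()
        self._lock = threading.Lock()
        self._value = None

    def bind(self, value):
        with self._lock:
            if self._bound.is_set():
                raise ValueError("variable already bound")
            self._value = value
            self._bound.set()

    def wait(self):
        self._bound.wait()          # suspend until some thread binds the variable
        return self._value


def run_ping_pong():
    output = []                     # stands in for standard output on both sites
    x = DataflowVar()               # the shared variable X

    def site1_thread():             # local X1 in X=ping(X1) case X1 of pong ...
        x1 = DataflowVar()
        x.bind(("ping", x1))
        if x1.wait() == "pong":
            output.append("pong")

    def site2_thread():             # case X of ping(X1) then {Show ping} X1=pong end
        tag, x1 = x.wait()
        if tag == "ping":
            output.append("ping")
            x1.bind("pong")

    threads = [threading.Thread(target=site2_thread),
               threading.Thread(target=site1_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output


print(run_ping_pong())
```

Site 1 cannot show `pong` before `X1` is bound, and site 2 binds `X1` only after showing `ping`, so `ping` always precedes `pong`.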

  12. More about network transparency • It is not known at compile time, or for objects at creation time, whether an entity might be shared • e.g. an object is created and used locally for some time and only later shared with other sites - for instance by binding a shared variable • not the case with RMI in Java • We do require that achieving network transparency does not hurt ordinary centralized execution (by much) • execution on a local entity should not be much slower just because it might later be shared • We would like network transparency to include other properties of the programming system as well • garbage collection properties

  13. Mozart provides network transparency • In principle for all language entities including • Records, numbers, atoms, floats • Procedures, classes • Cells, ports, objects • In practice • (not yet dictionaries, arrays) • If site 1 had instead X={New MyClass init} then sites 1 and 2 would share an object and both could access and update object attributes. • If site 1 had instead X=class $ … end then sites 1 and 2 could create objects of the same (shared) class • If site 1 had instead local Y Z U V in X=[Y Z U V] then the sites would share 4 single-assignment variables that they could later use to share other entities, ad infinitum

  14. How to achieve network transparency • Interesting for a number of different reasons • framework for comparing with other distributed programming platforms • similarities • differences • framework for comparing with distributed applications • applications may contain similar protocols (without realizing that what they have is more general-purpose) • good example of distributed algorithms (protocols) • also provides a model for awareness aspects • also provides the basis for understanding control aspects • also provides a basis for understanding challenges in further development along these lines - shows how Mozart falls far short of an ideal DPP.

  15. Fundamental Classification • Entities in Mozart are • stateless, e.g. records, classes, procedures, and object-records • single-assignment or logical variables • stateful, e.g. object-state • resources, e.g. the Open modules • Challenges/difficulties are different • In Mozart/Oz this distinction is clear-cut and network transparency says this must be obeyed • This is not always so • web pages are treated by browsers as being stateless even though they are stateful - in a sense this is on a different level. • RMI (remote method invocation) in Java may treat certain entities as being stateful in one context and stateless in another • semantic mess

  16. Stateless Entities - 1 • Strategy for distribution - replication • If sites share a stateless data structure then the data structure is replicated on all the sites. • e.g. if the shared variable X is bound to the list [1,2,…, 1000] then the list is copied (replicated) on all sites • Technical issue - 1 • marshaling or serialization of stateless entities and the unmarshaling counterpart. • data structures (e.g. records) • code (e.g. procedures) • basically the same format as for pickling • Java does this for Java bytecode, RMI in Java for data structures • Mozart for Mozart bytecode and stateless data structures
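
Replication by marshaling can be illustrated with a short sketch. It uses Python's standard `pickle` module as a stand-in for the Mozart marshaler, which is an assumption made purely for illustration, and `replicate` is a hypothetical helper name. The receiving side ends up with a structurally equal but physically separate copy.

```python
import pickle


def replicate(entity):
    """Marshal a stateless value to bytes and rebuild it, as an importing site would."""
    wire = pickle.dumps(entity)     # marshaling on the exporting site
    return pickle.loads(wire)       # unmarshaling on the importing site


shared = list(range(1, 1001))       # the list [1,2,...,1000] bound to X
replica = replicate(shared)

print(replica == shared)            # structural equality is preserved
print(replica is shared)            # but the replica is a distinct copy
```

For values compared structurally this is all that is needed, which is why such entities are the easy case.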

  17. Stateless Entities - 2 • Technical Issue - Token equality • entities with structural equality are easy • token or pointer equality on procedures and classes requires a global name space • recognize equality - you only want one copy per site • otherwise you can fill the memory with copies of same procedure (arriving at different times) • recognize inequality - • In Java this is not guaranteed (names are strings) • in Mozart achieved by names that are rather long • 3 parts <machine><process><long int> • But only one <machine> field per site (other names with the same first field are pointers to it). • sites have name tables (gc-enabled)
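
The role of the name table can be sketched as follows. This is a simplified model under stated assumptions: `Site`, `export` and `import_entity` are hypothetical names, a global name is modeled as a `(machine, process, counter)` tuple, and garbage collection of table entries is left out.

```python
import itertools


class Site:
    """Hypothetical site with a table mapping global names to local copies."""

    def __init__(self, machine, process):
        self.machine = machine
        self.process = process
        self._counter = itertools.count(1)
        self.name_table = {}

    def export(self, entity):
        """Assign a fresh global name and return the (name, body) to be sent."""
        name = (self.machine, self.process, next(self._counter))
        self.name_table[name] = entity
        return name, entity

    def import_entity(self, name, body):
        """Install body on first arrival; later arrivals reuse the existing copy."""
        return self.name_table.setdefault(name, body)


site1 = Site("host-a", 101)
site2 = Site("host-b", 202)

name, body = site1.export({"code": "proc {P X} ... end"})
first = site2.import_entity(name, dict(body))    # first message carrying the procedure
second = site2.import_entity(name, dict(body))   # the same procedure arrives again later

print(first is second)   # one copy per site, so token equality can be decided by name
```

Two entities with different names are known to be different even if their bodies look alike, which covers the inequality side of the requirement.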

  18. Stateless Entities - 3 • Principal Issue - lazy or eager replication • advantages with eager • lower latency • fewer difficulties with failure • no protocol • advantages with lazy • less memory consumption • less traffic (site may never need to access the structure) • Mozart choice • object-records lazy • all other stateless entities eager • later we consider if this gives enough control

  19. Lazy objects • Consider a matrix of objects connected via object features • each object only acts on its neighbors • e.g. simulation • with lazy replication each object is replicated 5 times • with eager replication each object is replicated on all sites
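
The replica counts on this slide can be reproduced with a small model. The sketch below assumes one object per site arranged in a grid, that a site only ever operates on its own object and its direct neighbors, and that eager replication follows object features transitively. The function names are hypothetical.

```python
from collections import deque


def grid_neighbors(cell, rows, cols):
    """Cells directly above, below, left and right of cell that lie inside the grid."""
    r, c = cell
    steps = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < rows and 0 <= c + dc < cols]


def replicas_per_object(rows, cols, lazy):
    """Return, for every object, the number of sites that end up holding a copy."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    held_by = {cell: 0 for cell in cells}
    for home in cells:
        if lazy:
            # Only objects the site actually operates on are fetched.
            held = {home, *grid_neighbors(home, rows, cols)}
        else:
            # Eager: everything reachable through object features is copied.
            held, queue = {home}, deque([home])
            while queue:
                for nxt in grid_neighbors(queue.popleft(), rows, cols):
                    if nxt not in held:
                        held.add(nxt)
                        queue.append(nxt)
        for cell in held:
            held_by[cell] += 1
    return held_by


print(replicas_per_object(5, 5, lazy=True)[(2, 2)])
print(replicas_per_object(5, 5, lazy=False)[(2, 2)])
```

In this model an interior object of a 5 x 5 grid is held by 5 sites with lazy replication and by all 25 sites with eager replication. Objects on the border have fewer neighbors and therefore fewer lazy replicas.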

  20. Protocols and Access Structures • Lazy objects (object-records) require a protocol • very simple protocol • The Mozart protocols work on a cross-site data structure called an access structure. • Access structures have both features common to all kinds of entities and features that differ.

  21. Access structure for a shared entity • Depending on the type of entity and the manager state, links may be double or single. • An operation on the entity may invoke protocols over the linked structure • Proxies always know their manager • (Figure: three sites; two hold proxies, one holds the manager; links labeled single and double)

  22. Distributed Memory Management • The lone manager will be reclaimed. • (Figure: two sites, one manager)

  23. Creating an access structure for lazy objects • Thread 1 on site 1 exports Object to site 2

  24. Creating an access structure for lazy objects - 2 • Object manager created • Object name sent inside a message for another protocol

  25. Creating an access structure for lazy objects - 3 • Proxy created on site 2, as the object name is not in the site table

  26. Creating an access structure for lazy objects - 4a • Thread on site 2 performs an operation on the proxy • askForObject message sent

  27. Creating an access structure for lazy objects - 5a • Object-record marshaled and sent

  28. Creating an access structure for lazy objects - 6a • Object-record built on site 2, proxy reclaimed • Manager will eventually be reclaimed

  29. Creating an access structure for lazy objects - 4b • Thread on site 2 exports the object proxy

  30. Creating an access structure for lazy objects - 5b • Thread on site 2 exports the object proxy • (Figure: site 3 now also holds a proxy for the object managed on site 1)

  31. Stateless Entities - Concluding Remarks • Stateless entities are relatively easy to deal with • Once they have been imported (in their entirety) • no further messages required • no effect on failure • Eager stateless entities • no extra messages at all (they are encapsulated in a message for another entity) • no extra latency • Lazy stateless entities • two extra messages at most per site • extra latency on first access by site • no extra latency on subsequent access
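
The message count for lazy stateless entities can be checked with a small model of the proxy/manager exchange from slides 23-28. This is a sketch under assumptions, not Mozart's implementation: `ObjectManager` and `ObjectProxy` are hypothetical classes, a dict stands in for the object-record, and the "network" is just a list that records message names.

```python
class ObjectManager:
    """Hypothetical manager on the exporting site; it owns the object-record."""

    def __init__(self, object_record):
        self.object_record = object_record


class ObjectProxy:
    """Hypothetical proxy on an importing site; fetches the record on first use."""

    def __init__(self, manager, message_log):
        self.manager = manager
        self.message_log = message_log
        self.record = None                                # nothing replicated yet

    def access(self, feature):
        if self.record is None:
            self.message_log.append("askForObject")       # proxy -> manager
            self.record = dict(self.manager.object_record)
            self.message_log.append("object-record")      # manager -> proxy
        return self.record[feature]                       # purely local from now on


log = []
manager = ObjectManager({"left": "o1", "right": "o2"})
proxy = ObjectProxy(manager, log)     # importing the name alone costs no messages

proxy.access("left")                  # first access runs the protocol
proxy.access("right")                 # later accesses are local
print(log)
```

The log holds exactly two entries however many accesses follow, matching the "two extra messages at most per site" bound. A site that never touches the object sends none.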

  32. Stateful Entities • With only stateless entities you can’t do much • Maintaining the consistency of stateful entities has been much studied • Databases, cache-coherence protocols, distributed shared memory etc. • For network transparency we want sequential consistency • if T1 and T2 are two threads that are synchronized (e.g. by dataflow) so that T1 updates before T2 accesses, then when T2 accesses the stateful entity the update is seen in its entirety • Example: attribute a is initially set to rec(u v w) • Thread 1: a <- rec(x y z) X=unit • Thread 2: {Wait X} {Show @a} • Thread 2 should show rec(x y z), not rec(u v w) and not a partial update such as rec(x y w)

  33. Consistency Protocols - 1 • Protocol 1: Stationary stateful entity • remote operations, both access and update, are translated to access and update messages sent to the ‘home’ of the entity • if operations are asynchronous, channels may need to be FIFO • every access/update requires messages to be sent (except on the site that owns the entity) • 2 network hops for each access/update • Protocol 2: Token protocols • the state is a token that can be moved • operations, both access and update, require the token • if the token is on another site, access/update first requires that the token is brought to the site - protocol run • if the token is on the current site then both access/update can be done without message sending • K network hops for first access/update, 0 thereafter if no other site grabs the state in between

  34. Consistency Protocols - 2 • Protocol 3: Invalidation protocols • the state is replicated freely on access operations • update operations require invalidation and acknowledgement • invalidation requires at least 2*N messages where N is the number of sites that have a reference • invalidation latency is 2 network hops - but note that invalidation latency depends on the slowest responding site • access operations (after having received a copy) require no messages • Observations • Protocol 1 is terrible from the viewpoint of latency over WAN unless operations can be made coarse-grained. • Protocol 3 requires a heavy infrastructure (all references to the entity must be known) • Protocol 3 is better for access than protocol 2, but worse for update • so relative frequency of update vs. access makes a difference
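
The costs quoted for the three protocols can be written down as a tiny cost model. The functions below only restate the figures from the slides. The names are hypothetical, the default `k=3` is an assumed value chosen for illustration, and the token figure assumes no other site grabs the state in between.

```python
def stationary_hops(n_operations):
    """Protocol 1: every remote access or update is a round trip to the home site."""
    return 2 * n_operations


def token_hops(n_operations, k=3):
    """Protocol 2: k hops to bring the token over, then purely local operations."""
    return k if n_operations > 0 else 0


def invalidation_messages(n_sites):
    """Protocol 3: minimum messages per update (one invalidation and one ack per site)."""
    return 2 * n_sites


for ops in (1, 10, 100):
    print(ops, stationary_hops(ops), token_hops(ops))
print(invalidation_messages(10))
```

For a single remote site doing many fine-grained operations, the stationary cost grows linearly while the token cost stays constant. The invalidation cost per update grows with the number of sites holding a reference.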

  35. Mozart • Cells and object states • Token-based protocol • Ports • Stationary • Stationary Objects • Can be achieved (almost) by an abstraction based on port • Is this enough?? • Consider this later

  36. State mobility protocol (1) • Thread attempts a state access: a <- …@a • (Figure labels: Manager, Thread, object, state S1)

  37. State mobility protocol (2) • Get: the requesting site asks the manager for the state

  38. State mobility protocol (3) • Forward: the manager forwards the request to the site currently holding the state

  39. State mobility protocol (4) • Put: the state S1 is sent to the requesting site

  40. State mobility protocol: Summary • Provides predictable network behavior • Provides lightweight migratory object behavior • Maintains consistency • At most 3 network hops for first access/update • thereafter local operation • Repeated operations - 0 network hops if no competition for state
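
The hop counts in this summary can be reproduced with a small simulation of the Get / Forward / Put exchange. It is a sketch under assumptions: `MobileState` is a hypothetical class, the manager is assumed to stay on one fixed site and to always know the current holder, and only messages that actually cross between two different sites are counted as hops.

```python
class MobileState:
    """Hypothetical model of one mobile state with a fixed manager site."""

    def __init__(self, manager_site, state):
        self.manager_site = manager_site
        self.holder = manager_site      # the state starts on the manager's site
        self.state = state
        self.log = []                   # (message, from_site, to_site)

    def _acquire(self, site):
        """Bring the state to site and return the number of network hops used."""
        if self.holder == site:
            return 0                    # state already local: no messages
        route = [("get", site, self.manager_site),
                 ("forward", self.manager_site, self.holder),
                 ("put", self.holder, site)]
        hops = [m for m in route if m[1] != m[2]]   # same-site steps are free
        self.log.extend(hops)
        self.holder = site
        return len(hops)

    def access(self, site):
        hops = self._acquire(site)
        return self.state, hops

    def update(self, site, new_state):
        hops = self._acquire(site)
        self.state = new_state
        return hops


cell = MobileState(manager_site="site1", state="S1")
print(cell.update("site2", "S2"))   # manager and holder coincide, so fewer than 3 hops
print(cell.update("site2", "S3"))   # repeated operation on the same site
print(cell.access("site3"))         # requester, manager and holder all differ
```

The third call is the worst case of three hops. The second shows the zero-hop behavior when no other site competes for the state.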

  41. Mobile object: Local object • (Figure labels: Owner, class, state1)

  42. Mobile object: Remote reference • (Figure labels: Owner node, cell, class, state1)

  43. Mobile object: Object application • Class is replicated, no local state • (Figure labels: object, state, class, class, state1)

  44. Mobile object: Object application • Class is replicated, no local state • (Figure labels: state1, object, state, class, class)

  45. Mobile object: State access • (Figure labels: object, state, class, class, state2)

  46. Mobile object: State access • (Figure labels: object, state, class, class, state3)

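  47. Ports: Local object • (Figure labels: Owner, Stream)

  48. Ports: Remote reference • (Figure labels: Owner node, Stream)

  49. Ports: Send • {Send P Msg} • (Figure labels: S, Owner node, Stream)

  50. Ports: Send • {Send P Msg} • Send(Msg) message goes to the owner node • (Figure labels: S, Owner node, Stream)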
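
The stationary behavior of ports can be sketched in a few lines. This is an illustrative model only: `Port` is a hypothetical Python class, a list stands in for the stream, and `network` records the messages that would cross between sites.

```python
class Port:
    """Hypothetical stationary port: the stream lives only on the owner site."""

    def __init__(self, owner):
        self.owner = owner
        self.stream = []        # messages in arrival order, readable at the owner
        self.network = []       # messages that had to cross the network

    def send(self, from_site, msg):
        """Model of {Send P Msg}: one message to the owner if the sender is remote."""
        if from_site != self.owner:
            self.network.append(("send", from_site, self.owner, msg))
        self.stream.append(msg)


port = Port(owner="site1")
port.send("site2", "hello")     # remote send: one network message
port.send("site1", "local")     # send on the owner site: no network traffic
print(port.stream)
print(len(port.network))
```

The port never moves, so every remote send costs one message to the owner, and the stream is built up in one place.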
