Published by Steven Harvey. Modified over 9 years ago.
1 Distributed Programming in Mozart
Per Brand
2 Programming system for distributed applications
–Design a programming system from the start that is suitable for distributed applications (Mozart)
–Extend an existing programming system with libraries to support distributed computing (Java)
–Provide a distribution layer that is language independent (CORBA); this layer might be needed anyway for communication with foreign software
3 Programming system for distributed applications
The programming language by design provides the abstractions necessary for distributed applications:
–Concurrency and various communication abstractions
–Mobility of code (or, more generally, closures) and other entities
–Mechanisms for security at the language level: the programming language by construction supports all the concepts needed to allow arbitrary security levels (no holes)
–Notion of sited resources, and how to plug and unplug resources
–Notion of a distributed/mobile component (for mobility of applications)
–Dynamic connectivity, transfer of entities, and modification of various applications at runtime
–Abstraction of the network transport media
4 Programming system for distributed applications
The programming system (runtime system) by design provides mechanisms to support:
–Network transparency
–Well-defined and extended distributed behavior for all language entities (part of network awareness)
–Mechanisms for guaranteeing security on untrusted sites (fake implementations)
–Mechanisms for limiting resource (memory and processor time) consumption by foreign computations at runtime
–Network layer that supports location transparency (mobile applications) and (multiple) IP-independent addressing
–Configurable and scalable network layer (multiple protocols: TCP, TTCP, Reliable UDP, …)
–Dynamic connectivity, fault/connectivity detection
–Firewall enabled
5 Transparency or hiding the network-1
[Figure: centralized execution. Threads T1 ... Ti ... Tn operate on a single store of language entities within one site (OS process).]
6 Transparency or hiding the network-2
[Figure: distributed execution. Threads T1 ... Ti ... Tn on many sites (OS processes) operate on one global store.]
7 Network Transparency
Language semantics are unchanged irrespective of whether an entity is shared or how it is shared.
Observed behavior is identical modulo:
–Speed (how fast threads run)
Under the assumptions that:
–Network partitioning is temporary
–Sites do not crash
The system gives the programmer/user the illusion of a global computation space. It means:
–If you develop an application on a single machine, you can distribute the entities to different sites without changing the logical behavior (functionality/semantics) of the application
–If you connect two independent applications together, they will logically behave as if they were running on a single machine
8 Example
Assume: by a magical bootstrapping procedure, two threads on two different sites share a single-assignment variable X.
[Figure: thread T1 on site 1 and thread T2 on site 2, both connected to the global store.]
9 Thread T1 and T2 current PC
Site 1, thread T1:
  local X1 in
    X=ping(X1)
    case X1 of pong then {Show pong} end
  end
Site 2, thread T2:
  case X of ping(X1) then {Show ping} X1=pong end
What does network transparency say about what will happen?
10 Thread T1 and T2 make progress-1
[Figure: same code as slide 9, with steps 1 to 3 marked.]
Eventually ‘ping’ will be written to standard output on site 2.
11 Thread T1 and T2 make progress-2
[Figure: same code as slide 9, with steps 4 to 6 marked.]
Eventually ‘pong’ will be written to standard output on site 1.
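The ping/pong exchange can be imitated on a single machine to see why the output order is forced by dataflow alone. The sketch below is a rough model in Python rather than Oz, so it runs without a Mozart installation; `DataflowVar` and `run_ping_pong` are illustrative names, and two Python threads stand in for the two sites.

```python
import threading

class DataflowVar:
    """Single-assignment variable: readers block until it is bound."""
    def __init__(self):
        self._bound = threading.Event()
        self._value = None
        self._lock = threading.Lock()

    def bind(self, value):
        with self._lock:
            if self._bound.is_set():
                if self._value != value:
                    raise ValueError("inconsistent binding")
                return
            self._value = value
            self._bound.set()

    def wait(self):
        self._bound.wait()
        return self._value

def run_ping_pong():
    output = []          # stands in for standard output on both sites
    x = DataflowVar()    # the shared variable X

    def t1():            # thread T1 on "site 1"
        x1 = DataflowVar()
        x.bind(("ping", x1))
        if x1.wait() == "pong":
            output.append("pong")

    def t2():            # thread T2 on "site 2"
        label, x1 = x.wait()
        if label == "ping":
            output.append("ping")
            x1.bind("pong")

    # T2 is started first on purpose: it simply blocks until X is bound.
    threads = [threading.Thread(target=t2), threading.Thread(target=t1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output
```

Whatever the scheduling, `ping` is recorded before `X1` is bound and `pong` only after, which is the behavior network transparency promises for the distributed case as well.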
12 More about network transparency
It is not known at compile time, or even at object creation time, whether an entity might be shared.
–e.g. an object is created and used locally for some time, and only thereafter shared with other sites, for instance by binding a shared variable
–this is not the case with RMI in Java
We do require that achieving network transparency does not hurt ordinary centralized execution (by much).
–operations on a local entity should not be much slower just because it might later be shared
We would like network transparency to also include other properties of the programming system.
–garbage collection properties
13 Mozart provides network transparency
In principle for all language entities, including:
–Records, numbers, atoms, floats
–Procedures, classes
–Cells, ports, objects
In practice:
–(not yet dictionaries, arrays)
If site 1 had instead X={New MyClass init}, then sites 1 and 2 would share an object, and both could access and update the object's attributes.
If site 1 had instead X=class $ … end, then sites 1 and 2 could create objects of the same (shared) class.
If site 1 had instead local Y Z U V in X=[Y Z U V], then the sites would share 4 single-assignment variables through which they could later share other entities, ad infinitum.
14 How to achieve network transparency
Interesting for a number of different reasons:
–framework for comparing with other distributed programming platforms
  –similarities
  –differences
–framework for comparing with distributed applications
  –applications may contain similar protocols (without their authors realizing that these protocols are more general-purpose)
–good example of distributed algorithms (protocols)
–also provides a model for awareness aspects
–also provides the basis for understanding control aspects
–also provides a basis for understanding challenges in further development along these lines, showing how Mozart falls far short of an ideal DPP (distributed programming platform)
15 Fundamental Classification
Entities in Mozart are:
–stateless, e.g. records, classes, procedures, and object-records
–single-assignment or logical variables
–stateful, e.g. object-state
–resources, e.g. the Open modules
The challenges/difficulties differ for each kind.
In Mozart/Oz this distinction is clear-cut, and network transparency says it must be obeyed.
This is not always so:
–web pages are treated by browsers as stateless even though they are stateful (in a sense this is on a different level)
–RMI (remote method invocation) in Java may treat certain entities as stateful in one context and stateless in another: a semantic mess
16 Stateless Entities - 1
Strategy for distribution: replication
If sites share a stateless data structure, then the data structure is replicated on all the sites.
–e.g. if the shared variable X is bound to the list [1,2,…, 1000], then the list is copied (replicated) on all sites
Technical issue 1: marshaling (serialization) of stateless entities, and the unmarshaling counterpart
–data structures (e.g. records)
–code (e.g. procedures)
–basically the same format as for pickling
–Java does this for Java bytecode, RMI in Java for data structures
–Mozart does this for Mozart bytecode and stateless data structures
17 Stateless Entities - 2
Technical issue: token equality
–entities with structural equality are easy
–token or pointer equality on procedures and classes requires a global name space
–recognize equality: you only want one copy per site, otherwise memory can fill up with copies of the same procedure (arriving at different times)
–recognize inequality: in Java this is not guaranteed (names are strings)
–in Mozart this is achieved by names that are rather long
  –3 parts
  –but only one field per site (other names with the same first field are pointers to it)
–sites have name tables (gc-enabled)
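A per-site name table is what lets an importing site recognize that two arriving copies are the same procedure. The following is only a sketch of that idea in Python, not Mozart's actual name format or wire format: the `(name, code)` message, the `Site` class and the use of `uuid` for globally unique names are all assumptions made for illustration.

```python
import uuid

def new_procedure(code):
    """Create a stateless entity tagged with a globally unique name."""
    return {"name": uuid.uuid4().hex, "code": code}

class Site:
    """A site with a name table mapping global names to local copies."""
    def __init__(self):
        self.name_table = {}

    def export(self, proc):
        # Hypothetical wire format: (global name, payload).
        return (proc["name"], proc["code"])

    def import_(self, message):
        name, code = message
        # Equality: a known name yields the existing local copy.
        # Inequality: a different name always yields a different copy,
        # even when the code is textually identical.
        if name not in self.name_table:
            self.name_table[name] = {"name": name, "code": code}
        return self.name_table[name]

site1, site2 = Site(), Site()
p = new_procedure("fun {$ X} X+1 end")
q = new_procedure("fun {$ X} X+1 end")   # same text, different entity

first = site2.import_(site1.export(p))
second = site2.import_(site1.export(p))  # arrives again later
other = site2.import_(site1.export(q))
```

Here `first` and `second` are the same local object, so repeated arrivals do not fill memory, while `other` stays distinct. A real name table would also need to cooperate with garbage collection, which this sketch leaves out.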
18 Stateless Entities - 3
Principal issue: lazy or eager replication
–advantages of eager
  –lower latency
  –fewer difficulties with failure
  –no protocol
–advantages of lazy
  –less memory consumption
  –less traffic (a site may never need to access the structure)
Mozart's choice:
–object-records lazy
–all other stateless entities eager
–later we consider whether this gives enough control
19 Lazy objects
Consider a matrix of objects connected via object features.
–each object only acts on its neighbors, e.g. in a simulation
–with lazy replication each object is replicated 5 times
–with eager replication each object is replicated on all sites
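The figure of 5 replicas can be checked with a little arithmetic. The sketch below assumes one object per site, a 4-neighbor grid, and that a site fetches an object-record only if its own object acts on that object; the function names are illustrative.

```python
def replicas_lazy(rows, cols, r, c):
    """Sites holding object (r, c): its own site plus each neighbor's site."""
    neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    inside = [(i, j) for i, j in neighbors if 0 <= i < rows and 0 <= j < cols]
    return 1 + len(inside)

def replicas_eager(rows, cols):
    """With eager replication every site ends up with a copy."""
    return rows * cols

interior = replicas_lazy(10, 10, 5, 5)   # 5: itself and four neighbors
corner = replicas_lazy(10, 10, 0, 0)     # 3: itself and two neighbors
eager = replicas_eager(10, 10)           # 100: one copy per site
```

Under these assumptions the lazy cost per object stays constant as the matrix grows, while the eager cost grows with the number of sites.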
20 Protocols and Access Structures
Lazy objects (object-records) require a protocol.
–very simple protocol
The Mozart protocols work on a cross-site data structure called an access structure.
Access structures have both features common to all kinds of entities and features that differ between kinds.
21 Access structure for a shared entity
[Figure: one manager linked to proxies, each on its own site.]
–An operation on the entity may invoke protocols over the linked structure
–Depending on the type of entity and the manager state, links may be double or single
–Proxies always know their manager
22 Distributed Memory Management
[Figure: a manager with no remaining proxies.]
The lone manager will be reclaimed.
23 Creating an access structure for lazy objects
[Figure: sites 1 and 2; the object is on site 1.]
Thread 1 on site 1 exports the object to site 2.
24 Creating an access structure for lazy objects-2
[Figure: a manager for the object on site 1.]
Manager created. The object name is sent inside a message for another protocol.
25 Creating an access structure for lazy objects-3
[Figure: manager on site 1, proxy on site 2.]
A proxy is created because the object name is not in the site table.
26 Creating an access structure for lazy objects-4a
[Figure: manager, proxy and thread; message labeled askForObject.]
A thread performs an operation on the proxy. An askForObject message is sent.
27 Creating an access structure for lazy objects-5a
[Figure: manager, proxy and thread.]
The object-record is marshaled and sent.
28 Creating an access structure for lazy objects-6a
[Figure: manager and thread; the object is now on site 2 as well.]
The object-record is built and the proxy is reclaimed. The manager will eventually be reclaimed.
29 Creating an access structure for lazy objects-4b
[Figure: manager, proxy and thread; message labeled object.]
A thread exports the object proxy.
30 Creating an access structure for lazy objects-5b
[Figure: sites 1, 2 and 3; manager, two proxies and a thread.]
A thread exports the object proxy.
31 Stateless Entities - Concluding Remarks
Stateless entities are relatively easy to deal with.
Once they have been imported (in their entirety):
–no further messages required
–no effect on failure
Eager stateless entities:
–no extra messages at all (they are encapsulated in a message for another entity)
–no extra latency
Lazy stateless entities:
–at most two extra messages per site
–extra latency on first access by a site
–no extra latency on subsequent accesses
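The "two extra messages at most per site" figure for lazy object-records follows from the request/reply shape of the protocol. The sketch below models that in Python with direct method calls in place of network messages; `Manager`, `ImportingSite` and the message log are illustrative, and only the askForObject name comes from the slides.

```python
class Manager:
    """Lives on the exporting site and owns the object-record."""
    def __init__(self, object_record):
        self.object_record = object_record

    def ask_for_object(self, log):
        log.append("askForObject")    # proxy -> manager
        log.append("objectRecord")    # manager -> proxy (marshaled record)
        return dict(self.object_record)

class ImportingSite:
    def __init__(self):
        # object name -> ("proxy", manager) or ("record", object_record)
        self.table = {}

    def receive_name(self, name, manager):
        # A proxy is created only if the name is not already in the table.
        self.table.setdefault(name, ("proxy", manager))

    def access(self, name, log):
        kind, payload = self.table[name]
        if kind == "proxy":
            record = payload.ask_for_object(log)
            self.table[name] = ("record", record)   # proxy is replaced
            return record
        return payload                              # already local

log = []
manager = Manager({"feat": 1})
site2 = ImportingSite()
site2.receive_name("obj1", manager)   # the name arrives inside another message
first = site2.access("obj1", log)     # triggers the protocol
second = site2.access("obj1", log)    # purely local
```

After both accesses the log holds exactly two entries, and a site that never touches the object would add none.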
32 Stateful Entities
With only stateless entities you can’t do much.
Maintaining the consistency of stateful entities has been much studied:
–databases, cache-coherence protocols, distributed shared memory, etc.
For network transparency we want sequential consistency:
–if T1 and T2 are two threads that are synchronized (e.g. by dataflow) so that T1 updates before T2 accesses, then when T2 accesses the stateful entity the update is seen in its entirety
–Example: attribute a is initially set to rec(u v w)
  Thread1: a <- rec(x y z)  X=unit
  Thread2: {Wait X} {Show @a}  %% should be rec(x y z), not rec(u v w) or rec(x y w)
33 Consistency Protocols -1
Protocol 1: Stationary stateful entity
–remote operations, both access and update, are translated to access and update messages sent to the ‘home’ of the entity
–if operations are asynchronous, channels may need to be FIFO
–every access/update on a site other than the owner of the entity requires messages to be sent
–2 network hops for each access/update
Protocol 2: Token protocols
–the state is a token that can be moved
–operations, both access and update, require the token
–if the token is on another site, access/update first requires that the token be brought to the site (protocol run)
–if the token is on the current site, both access and update can be done without sending messages
–K network hops for the first access/update, 0 thereafter if no other site grabs the state in between
34 Consistency Protocols-2
Protocol 3: Invalidation protocols
–the state is replicated freely on access operations
–update operations require invalidation and acknowledgement
–invalidation requires at least 2*N messages, where N is the number of sites that have a reference
–invalidation latency is 2 network hops, but note that it depends on the slowest responding site
–access operations (after having received a copy) require no messages
Observations
–Protocol 1 is terrible from the viewpoint of latency over a WAN unless operations can be made coarse-grained
–Protocol 3 requires a heavy infrastructure (all references to the entity must be known)
–Protocol 3 is better for access than protocol 2 but worse for update, so the relative frequency of update vs. access makes a difference
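The cost figures quoted for the three protocols can be put side by side with a few lines of arithmetic. This is a back-of-the-envelope sketch in Python using only the numbers given on the slides; the function names are illustrative, and the token case assumes that no other site grabs the state between operations.

```python
def stationary_hops(remote_ops):
    """Protocol 1: every remote access/update costs 2 network hops."""
    return 2 * remote_ops

def token_hops(ops, k):
    """Protocol 2: K hops to fetch the token once, then 0 per operation."""
    return k if ops > 0 else 0

def invalidation_messages(updates, n_sites):
    """Protocol 3: lower bound of 2*N messages per update."""
    return 2 * n_sites * updates

# 100 operations from one remote site, token reachable in 3 hops:
p1 = stationary_hops(100)            # 200 hops
p2 = token_hops(100, 3)              # 3 hops
# One update while 10 sites hold a reference:
p3 = invalidation_messages(1, 10)    # at least 20 messages
```

The comparison only covers the single-site, uncontended case; with several sites competing for the state, the token has to be fetched again each time it moves away.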
35 Mozart
Cells and object states
–Token-based protocol
Ports
–Stationary
Stationary objects
–Can be achieved (almost) by an abstraction based on ports
Is this enough?
–Consider this later
36 State mobility protocol (1)
[Figure: object, state S1, thread and manager. The thread performs a state access (a <- …, @a).]
37 State mobility protocol (2)
[Figure: same layout; message labeled Get.]
38 State mobility protocol (3)
[Figure: same layout; message labeled Forward.]
39 State mobility protocol (4)
[Figure: same layout; message labeled Put, carrying the state S1.]
40 State mobility protocol: Summary
Provides predictable network behavior
Provides lightweight migratory object behavior
Maintains consistency
At most 3 network hops for the first access/update
–thereafter local operation
Repeated operations: 0 network hops if there is no competition for the state
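The three-hop bound comes from the Get, Forward, Put sequence. The sketch below replays that sequence in Python with a message log instead of a network. The routing (requester to manager, manager to current holder, holder to requester) is the usual reading of the diagrams and is an assumption here; the class and function names are illustrative.

```python
class MobileSite:
    def __init__(self, name):
        self.name = name
        self.state = None            # the token, when this site holds it

class TokenManager:
    def __init__(self, holder):
        self.holder = holder         # site currently holding the state

def access(site, manager, log):
    """Bring the state to `site` if needed, then return it."""
    if manager.holder is not site:
        old = manager.holder
        log.append(("get", site.name, "manager"))
        manager.holder = site
        log.append(("forward", "manager", old.name))
        log.append(("put", old.name, site.name))
        site.state, old.state = old.state, None
    return site.state

s1, s2 = MobileSite("s1"), MobileSite("s2")
s1.state = {"a": 1}
manager = TokenManager(holder=s1)
log = []

access(s2, manager, log)      # first access from s2: get, forward, put
hops_first = len(log)
access(s2, manager, log)      # repeated access: state is already local
hops_repeat = len(log) - hops_first
```

With no competing site, `hops_first` is 3 and `hops_repeat` is 0; the count can be lower than 3 when the manager and the holder happen to share a site, which this sketch does not model.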
41 Mobile object: Local object
[Figure: owner site with the class and state1.]
42 Mobile object: Remote reference
[Figure: owner node with the class, state1 and a cell.]
43 Mobile object: Object application
[Figure: class and state1 at the owner; a second copy of the class next to the object state.]
Class is replicated, no local state.
44 Mobile object: Object application
[Figure: as slide 43.]
Class is replicated, no local state.
45 Mobile object: State access
[Figure: two copies of the class; the object state is state2.]
46 Mobile object: State access
[Figure: two copies of the class; the object state is state3.]
47 Ports: Local object
[Figure: owner site with the port's stream.]
48 Ports: Remote reference
[Figure: owner node with the stream.]
49 Ports: Send
[Figure: owner node with stream S; a thread executes {Send P Msg}.]
50 Ports: Send
[Figure: as slide 49, with a Send(Msg) message.]
51 Ports: Asynchronous
[Figure: as slide 50; the Send(Msg) message is still shown.]
52 Ports: Added to stream
[Figure: owner node; Msg has been added to the stream S.]
53 Ports: Send
[Figure: owner node, stream, and {Send P Msg}.]
54 Summary: Ports
Stationary
Asynchronous
Messages coming from the same thread must arrive in the same order
–achieved by a FIFO channel
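The ordering guarantee for ports is per sender, not global. The sketch below illustrates this in Python with one FIFO queue per sending site feeding a stream kept at the owner; `Port`, `FifoChannel` and `deliver_one` are illustrative names, and delivery is driven by hand so that the interleaving is explicit.

```python
from collections import deque

class Port:
    """Stationary port: messages are appended to a stream at the owner."""
    def __init__(self):
        self.stream = []

class FifoChannel:
    """One channel per sending site; delivers in the order sent."""
    def __init__(self, port):
        self.port = port
        self.in_flight = deque()

    def send(self, msg):
        # Asynchronous: the sender continues without waiting for delivery.
        self.in_flight.append(msg)

    def deliver_one(self):
        if self.in_flight:
            self.port.stream.append(self.in_flight.popleft())

port = Port()
from_a = FifoChannel(port)
from_b = FifoChannel(port)

from_a.send("a1")
from_a.send("a2")
from_b.send("b1")

# The network may interleave channels in any way...
from_b.deliver_one()
from_a.deliver_one()
from_a.deliver_one()

# ...but each sender's messages keep their relative order.
from_a_in_stream = [m for m in port.stream if m.startswith("a")]
```

The stream ends up as `b1, a1, a2`: `b1` overtakes messages sent earlier from another site, while `a1` still precedes `a2`.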
55 Variables
Single-assignment variables are:
–stateful
–but the state is changed only once
–can this invariant be used to achieve a more efficient protocol?
56 Variable elimination (1)
[Figure: a manager with three proxies; one thread executes X = … and another thread performs an operation on the variable.]
57 Variable elimination (2)
[Figure: same layout; message labeled Bind.]
58 Variable elimination (3)
[Figure: same layout; message labeled Surrender.]
59 Variable elimination (4)
[Figure: same layout; message labeled Redirect.]
60 Variable elimination (5)
[Figure: the proxies are gone; the manager and the two threads remain.]
61 Variable elimination (6)
[Figure: as slide 60; message labeled Register.]
62 Variable elimination (7)
[Figure: as slide 60; message labeled Redirect.]
63 Variable elimination (8)
[Figure: the manager and the two threads.]
64 Variable elimination (9)
[Figure: the manager is gone; only the two threads remain.]
65 Properties of the variable elimination protocol
Is the only distributed algorithm required for distributed unification. All other operations are local.
Maintains consistency of the store. No inconsistent bindings can happen.
Handles rational tree unification
Is sufficient for both ask and tell
Is efficient for client-server uses
66 Variable Protocol
The protocol makes use of eager elimination
–bindings are propagated eagerly
–reasonable, as a binding is done only once
Access needs no protocol (unlike objects/cells)
–this makes use of monotonicity
–access is by waiting for the binding
Compare to broadcasting state updates for objects/cells
–invalidation for updating, to guarantee access semantics
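The way a single manager rules out inconsistent bindings can be shown with a small model. This Python sketch is one reading of the protocol, not the Mozart implementation: a proxy asks the manager to bind, the manager accepts the first request and eagerly redirects every registered proxy, and a proxy that registers late is redirected at once. The method names are illustrative and do not map one-to-one onto the message labels in the diagrams.

```python
class VarManager:
    def __init__(self):
        self.proxies = []
        self.bound = False
        self.value = None

    def register(self, proxy):
        self.proxies.append(proxy)
        if self.bound:                  # late arrival: redirect at once
            proxy.redirect(self.value)

    def request_bind(self, value):
        if self.bound:
            # The first request won. In a full system the losing site would
            # unify its own value with the binding it receives by redirect.
            return
        self.bound, self.value = True, value
        for proxy in self.proxies:      # eager propagation of the binding
            proxy.redirect(value)

class VarProxy:
    def __init__(self, manager):
        self.value = None
        self.is_bound = False
        self.manager = manager
        manager.register(self)

    def bind(self, value):
        # No local binding yet: the value is set only by a redirect.
        self.manager.request_bind(value)

    def redirect(self, value):
        self.value, self.is_bound = value, True

manager = VarManager()
p1, p2 = VarProxy(manager), VarProxy(manager)
p1.bind(1)
p2.bind(2)               # arrives second and is not applied
p3 = VarProxy(manager)   # registers after the binding happened
```

All three proxies end up seeing the value 1. Reading the variable never sends a message: a site simply waits until its proxy has been redirected.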
67 Locks and Cells
Locks/cells/object-state share the same protocol
–the only difference is in the local execution
  –locks have only two possible values (locked or unlocked)
  –object state/cells are updated/accessed by different operations
–once a site has the token, operations on it are performed almost exactly as if the entity were local
  –the only difference is the need to schedule the re-export of the token
Example:
–threads 1, 2, 3 all ask for the currently non-local lock
–the lock arrives and thread 1 is given the lock
–a forward message is received (some other site wants the lock)
–threads 4, 5, 6 ask for the lock
–it is important that, after thread 3 has finished using the lock, the lock is sent out (put message) and a new request for the lock is made
68 Resources
Resources are (in principle) references to system resources that are bound to a site
–abstractions are built on top of them
–also referred to in the documentation as ‘sited’ entities
Distribution policy
–resources do not work outside their home site, except for the equality test (token equality)
–if resources are exported and then reimported to the original site, they work normally
Examples:
–system modules
Currently some stateful language entities are also treated as resources: arrays and dictionaries
69 Distribution Protocols
Stateless: replication
–eager: all but object-records
–lazy: object-records
Single assignment: eager elimination
–logical variables
Stateful: localization
–mobile: locks, object-state, cells
–stationary: ports
Resources: no-goods
–system resources
70 Distributed behavior
[Figure: the same classification (stateless: replication, eager or lazy; single assignment: eager elimination; stateful: localization, mobile or stationary), with two annotations.]
–Increased complexity in implementation to achieve transparency
–Better performance and behavior operating on distributed entities
71 Distribution support in Mozart
The module Connection provides the basic mechanism (known as tickets) for active applications to connect with each other.
The module Remote allows an active application to create a new site (a local or remote operating system process) and connect with it. The site may be on the same machine or on a remote machine.
The module Pickle allows an application to store and retrieve arbitrary stateless data from files and URLs.
The module Fault gives the basic primitives for fault detection and handling.
All of these are found under System modules in the documentation.
72 Pickling and URLs
Pickle.load can be given a URL and so access pickles over the Internet.
For example:
–Site 1: { X ‘/home/perbrand/public_html/mySave’}
–Site 2: {Pickle.load ‘’}
73 Connection module
Let's say Application 1 has an entity E that it wants others to access. It can do this by creating a ticket that references the entity. Other applications then just need to know the ticket to get access to the entity.
Tickets are implemented by the module Connection, which has the following operations:
–{Connection.offer X T} creates a ticket T for X, which can be any language entity. The ticket can be taken just once. Attempting to take a ticket more than once will raise an exception.
–{Connection.take T X} creates a reference X when given a valid ticket in T. X refers to exactly the same language entity as the original reference that was offered when the ticket was created. A ticket can be taken at any site. If it is taken at a different site, then there is network communication between the two sites.
74 Connection module - bootstrapping distribution
What can you do with a ticket?
–{Connection.offer X T} {Show T}
–copy the value of T from the emulator window
–put it in an e-mail and send it to a friend
–the friend reads the e-mail and copies and pastes the ticket into the OPI
–use {Connection.take T Ref} to get a reference to the entity that the ticket refers to
What more can you do with a ticket?
–very secret stuff: hand over the ticket by hand
–very public stuff: put it on a web page
75 Connection module
Ordinary tickets can only be used once
–extra protection
–a second {Connection.take T X} by anyone will raise an exception ‘Connection refused’
Unlimited tickets
–{Connection.offerUnlimited X T}
–can be used by any number of sites
–useful for web pages
Unlimited but can be closed
–class Connection.gate
–with methods
  –init(X ?Ticket): the ticket can be used any number of times
  –close: the ticket is no longer viable
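The three kinds of ticket differ only in how many times they may be taken. The sketch below models those rules in Python; it is not the Connection module's API. `TicketOffice`, `TicketError` and the method names are hypothetical stand-ins, and a dictionary lookup replaces the network connection a real `Connection.take` would open.

```python
import secrets

class TicketError(Exception):
    pass

class TicketOffice:
    """Hypothetical stand-in for the ticket rules described above."""
    def __init__(self):
        self._offers = {}   # ticket -> [entity, remaining takes or None]

    def _new(self, entity, uses):
        ticket = secrets.token_hex(8)    # opaque string, easy to e-mail
        self._offers[ticket] = [entity, uses]
        return ticket

    def offer(self, entity):
        return self._new(entity, 1)      # ordinary ticket: one take

    def offer_unlimited(self, entity):
        return self._new(entity, None)   # any number of takes

    def take(self, ticket):
        entry = self._offers.get(ticket)
        if entry is None or entry[1] == 0:
            raise TicketError("connection refused")
        if entry[1] is not None:
            entry[1] -= 1
        return entry[0]                  # the very same entity, not a copy

    def close(self, ticket):
        # Models closing a gate: the ticket is no longer viable.
        self._offers.pop(ticket, None)

office = TicketOffice()
entity = ["shared"]
once = office.offer(entity)
many = office.offer_unlimited(entity)
ref = office.take(once)                  # a second take of `once` would raise
```

Returning the entity itself rather than a copy mirrors the statement that the taker's reference denotes exactly the same language entity as the one offered.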
76 Demonstration
The files distOne.oz and distTwo.oz illustrate distribution in Mozart and are demonstrated.