Network Objects
Marco F. Duarte
COMP 520: Distributed Systems
September 14, 2004
Introduction
- Distributed systems require sharing of data and processes among nodes
- Object-oriented programming is appropriate for distributed systems
- A. Birrell, G. Nelson, S. Owicki, E. Wobber (1993)
- How to share objects in distributed systems?
- Methods provide the sharing interface, so share methods?
- Network objects: objects whose methods can be accessed by other programs
Pickles
- Solution to marshaling complex data types
- Simple variable types are marshaled in-line
- Complex types (i.e. objects) are marshaled by the pickle package, which can be customized for each object type
- Network objects are passed by reference
- Non-network objects are copied to the destination
- Marshaling support for inter-process streams
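The by-reference versus by-copy rule above can be sketched with Python's own `pickle` module, which is a different package from the Modula-3 pickles discussed here and is used only as an analogy. The class `NetworkObject`, the helpers `marshal` and `unmarshal`, and the `table` argument are hypothetical names chosen for illustration.

```python
import io
import pickle


class NetworkObject:
    """Hypothetical marker class: instances travel by reference."""

    def __init__(self, space_id, obj_id):
        self.space_id = space_id  # identifies the owner's address space
        self.obj_id = obj_id      # identifies the object within that space


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Network objects are replaced by a small reference on the wire.
        if isinstance(obj, NetworkObject):
            return ("netobj", obj.space_id, obj.obj_id)
        return None  # everything else is pickled (copied) as usual


class RefUnpickler(pickle.Unpickler):
    def __init__(self, file, table):
        super().__init__(file)
        # Assumed lookup table: (space_id, obj_id) -> local object or surrogate.
        self.table = table

    def persistent_load(self, pid):
        tag, space_id, obj_id = pid
        if tag != "netobj":
            raise pickle.UnpicklingError("unknown persistent id")
        return self.table[(space_id, obj_id)]


def marshal(value):
    buf = io.BytesIO()
    RefPickler(buf).dump(value)
    return buf.getvalue()


def unmarshal(data, table):
    return RefUnpickler(io.BytesIO(data), table).load()
```

After a round trip through `marshal` and `unmarshal`, an ordinary list comes back as a fresh copy, while a `NetworkObject` resolves to whatever the receiver's table holds for its `(space_id, obj_id)` pair.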
Network Object Sharing
- Network object type T, with subtypes TImpl (the owner's implementation) and TSrg (the client's surrogate)
- Surrogates are created by the unmarshaling code
- Clients select a transport shared by client and owner
- Clients select the TSrg corresponding to TImpl
Object Sharing
- How to choose the best surrogate for a network object?
- Narrowest surrogate: choose the TSrg that is the most specific type consistent with TImpl and for which stubs are available in both client and owner
- Third-party transfers: obtaining a reference to a network object from another client rather than from its owner
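A minimal sketch of the narrowest surrogate rule, assuming the owner describes the object as a chain of type names ordered from most specific to least specific and the client keeps a table of the surrogate types it has stubs for. The function name and both data layouts are assumptions made for illustration, not the runtime's actual interface.

```python
def narrowest_surrogate(owner_chain, client_stubs):
    """Pick the most specific surrogate type known to both sides.

    owner_chain: type names for the concrete object, most specific first,
                 limited to types for which the owner has stubs.
    client_stubs: dict mapping type name -> surrogate class at the client.
    """
    for type_name in owner_chain:
        if type_name in client_stubs:
            return client_stubs[type_name]
    raise LookupError("client and owner share no surrogate type")


# Hypothetical surrogate classes standing in for generated stubs.
class NetObjSrg:
    pass


class FSFileSrg(NetObjSrg):
    pass


class NewFSFileSrg(FSFileSrg):
    pass
```

With an owner chain of `["NewFS.File", "FS.File", "NetObj.T"]`, a client that only has stubs for `FS.File` and `NetObj.T` ends up with the `FS.File` surrogate: the object is usable, but the `close` method added by `NewFS.File` is not visible to that client.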
Object Sharing: Example

    MODULE Server EXPORTS Main;
    IMPORT NetObj, FS, Time;
    TYPE
      (* Concrete implementations of the FS network object types *)
      File = FS.File OBJECT
        OVERRIDES
          getChar := GetChar;
          eof := Eof
        END;
      Svr = FS.Server OBJECT
        OVERRIDES
          open := Open;
        END;
    BEGIN
      (* Make the server object available under the name "FS1" *)
      NetObj.Export(NEW(Svr), "FS1");
    END Server.

    MODULE Client EXPORTS Main;
    IMPORT NetObj, FS, IO;
    VAR
      (* s and f are surrogates for objects owned by the server *)
      s: FS.Server := NetObj.Import("FS1", NetObj.LocateHost("server"));
      f := s.open("/usr/dict/words");
    BEGIN
      WHILE NOT f.eof() DO
        IO.PutChar(f.getChar())
      END
    END Client
    …
    TYPE NewFS.File OBJECT METHODS close() END;
Typecodes
- Unique identifier for an object type within one address space
- Used for allocation
- Each typecode is matched with the typecode of its supertype (parent)
Typecodes: Problem
Fingerprints: Solution
- 64-bit checksum dependent on the structure of the object type
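The idea of a structure-dependent 64-bit checksum can be sketched as follows. The slides do not say which checksum algorithm is used, so BLAKE2b truncated to 8 bytes is swapped in here purely as a stand-in, and the nested-tuple type description is a hypothetical encoding.

```python
import hashlib


def fingerprint(type_desc):
    """Return a 64-bit integer derived only from the structure of a type.

    type_desc: nested tuples describing the type, for example
               ("OBJECT", supertype_name, ((method_name, signature), ...)).
    """
    data = repr(type_desc).encode("utf-8")
    # Stand-in hash (assumption): any well-mixed 64-bit checksum would do.
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big")
```

Two programs that describe a type identically compute the same fingerprint without exchanging typecodes, which is what lets them agree on the type across address spaces.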
Network Object Marshaling
- Network objects are marshaled through their wire representation: (SpaceID, ObjID)
- If the object is not known at the client, a surrogate type is chosen for it using the narrowest surrogate rule
Remote Invocation
- Stubs are registered in a table with srgType, disp
- Stubs obtain and release connections
- Dispatcher: obj.disp(c, obj), written by the stub generator, unmarshals arguments and calls the appropriate method
- Methods are identified by integers
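A minimal sketch of integer-identified methods and a dispatcher, assuming a toy wire format in which a request is just a 4-byte method number. The names `FileImpl`, `FileSurrogate`, `file_dispatcher`, and the `send` callable are hypothetical; the real stubs are produced by the stub generator and also handle arguments and connections.

```python
import struct

GET_CHAR, EOF = 0, 1  # methods are identified by integers on the wire


class FileImpl:
    """Owner-side concrete object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def get_char(self):
        ch = self.data[self.pos]
        self.pos += 1
        return ch

    def eof(self):
        return self.pos >= len(self.data)


def file_dispatcher(request, obj):
    """Owner-side stub: decode the method number, call it, encode the result."""
    (method_id,) = struct.unpack_from("!I", request, 0)
    if method_id == GET_CHAR:
        return obj.get_char().encode("utf-8")
    if method_id == EOF:
        return struct.pack("!?", obj.eof())
    raise ValueError("unknown method id: %d" % method_id)


class FileSurrogate:
    """Client-side stub: same methods, each forwarded as a request."""

    def __init__(self, send):
        self._send = send  # assumed transport: request bytes -> reply bytes

    def get_char(self):
        return self._send(struct.pack("!I", GET_CHAR)).decode("utf-8")

    def eof(self):
        (flag,) = struct.unpack("!?", self._send(struct.pack("!I", EOF)))
        return flag
```

For a quick in-process check, the transport can be a lambda that hands each request straight to `file_dispatcher`; a real transport would carry the same bytes over a connection.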
Garbage Collection
Dirty Set: Set of clients holding surrogates for the object
Garbage Collection
- When a surrogate is collected, an RPC to the owner removes the client from the dirty set
- If the dirty set is empty and there are no local references, TImpl can be collected
Garbage Collection
- Third-party transfers sent as results require an Ack message to protect both copies of the reference
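The dirty-set bookkeeping can be sketched on the owner side as follows. The class `Owner` and the method names `dirty_call`, `clean_call`, and `drop_local_ref` are assumptions made for illustration, the local reference count is a stand-in for the owner's local collector, and the Ack message for transfers sent as results is not modeled.

```python
class Owner:
    """Owner-side bookkeeping for exported network objects (illustrative only)."""

    def __init__(self):
        self.objects = {}     # obj_id -> concrete object (the TImpl)
        self.dirty = {}       # obj_id -> set of client ids holding a surrogate
        self.local_refs = {}  # obj_id -> local reference count (stand-in)

    def export(self, obj_id, obj):
        self.objects[obj_id] = obj
        self.dirty[obj_id] = set()
        self.local_refs[obj_id] = 1

    def dirty_call(self, obj_id, client):
        # Made by a client when it creates a surrogate for obj_id.
        self.dirty[obj_id].add(client)

    def clean_call(self, obj_id, client):
        # Made by a client after its surrogate has been collected.
        self.dirty[obj_id].discard(client)
        self._maybe_collect(obj_id)

    def drop_local_ref(self, obj_id):
        self.local_refs[obj_id] -= 1
        self._maybe_collect(obj_id)

    def _maybe_collect(self, obj_id):
        # Collect only when no client holds a surrogate and no local
        # reference remains.
        if not self.dirty[obj_id] and self.local_refs[obj_id] == 0:
            del self.objects[obj_id]
            del self.dirty[obj_id]
            del self.local_refs[obj_id]
```

The object survives as long as either condition fails: a remaining client in the dirty set keeps it alive even after the owner has dropped its own reference.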
Explicit Import/Export

    MODULE Server EXPORTS Main;
    IMPORT NetObj, FS, Time;
    TYPE
      File = FS.File OBJECT
        OVERRIDES
          getChar := GetChar;
          eof := Eof
        END;
      Svr = FS.Server OBJECT
        OVERRIDES
          open := Open;
        END;
    BEGIN
      (* Explicit export: publish the server object as "FS1" *)
      NetObj.Export(NEW(Svr), "FS1");
    END Server.

    MODULE Client EXPORTS Main;
    IMPORT NetObj, FS, IO;
    VAR
      (* Explicit import: look up "FS1" on the host named "server" *)
      s: FS.Server := NetObj.Import("FS1", NetObj.LocateHost("server"));
      f := s.open("/usr/dict/words");
    BEGIN
      WHILE NOT f.eof() DO
        IO.PutChar(f.getChar())
      END
    END Client
    …
    TYPE NewFS.File OBJECT METHODS close() END;
Bootstrapping
- Objects are passed as results of method calls
- How to share an “original” object?
- Forge the original surrogate from: location, object ID, surrogate type
- Special object with ID = 0
- Its methods implement network object runtime operations (Import, Export, Locate, etc.)
- get, put operations
- A specific TCP port is assigned for location
Performance doesn’t matter

Network penalty: 1600 usecs

    Test                    usecs/call   Note
    Null call               3310
    Ten integer call        3435
    Same object argument    3895
    Same object return      4290         ack
    New object argument     9148         dirty
    New object return                    dirty

    Test                    Kbytes/sec
    TCP throughput          3400
    Reader test             2824
    Writer test             2830
Linda: Basic Concepts
- N. Carriero and D. Gelernter
- Simpler, more powerful and more elegant than alternatives
- Tuple: unconstrained data structure
- A tuple is a series of typed fields:

    ("a string", 15.01, 17, "another string")
Tuple Operations
- Four basic operations:
- eval, out: create new data objects
- in, rd: remove and read data objects, respectively
- Operation syntax:

    out("a string", 15.01, 17, "another string")
    in("a string", ? f, ? i, "another string")
    rd("a string", ? f, ? i, "another string")
Using Tuples
- Live tuple: a tuple whose data is still to be determined by a running process
- Tuple space: the collection of tuples available to all programs
- Implementing data structures as a collection of tuples, e.g. an n-vector V:

    ("V", 1, FirstElt), ("V", 2, SecondElt) … ("V", n, NthElt)

- To read the jth element:

    rd("V", j, ? x);

- To modify the jth element:

    in("V", j, ? OldVal);
    …
    out("V", j, NewVal);
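The vector example above can be tried out with a small in-memory tuple space. This is a sketch under simplifying assumptions: a single process, no blocking (where Linda's `in` and `rd` would wait for a match, these return `None`), and the hypothetical names `TupleSpace`, `Formal`, and `in_` (renamed because `in` is a Python keyword).

```python
class Formal:
    """Stands for Linda's '? x': matches any value of the given type."""

    def __init__(self, type_):
        self.type = type_


class TupleSpace:
    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        # Add a passive tuple to the space.
        self._tuples.append(tuple(fields))

    @staticmethod
    def _match(pattern, tup):
        if len(pattern) != len(tup):
            return False
        for p, v in zip(pattern, tup):
            if isinstance(p, Formal):
                if type(v) is not p.type:
                    return False
            elif type(p) is not type(v) or p != v:
                return False
        return True

    def _find(self, pattern):
        for i, tup in enumerate(self._tuples):
            if self._match(pattern, tup):
                return i
        return None

    def rd(self, *pattern):
        # Read a matching tuple without removing it (None if no match).
        i = self._find(pattern)
        return None if i is None else self._tuples[i]

    def in_(self, *pattern):
        # Remove and return a matching tuple (None if no match).
        i = self._find(pattern)
        return None if i is None else self._tuples.pop(i)
```

Modifying element j is then the `in_` followed by `out` pair from the slide: while the tuple is withdrawn, no other process can read or update that element, which is how the pair doubles as synchronization.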
Advantages of Linda over Concurrent Objects
- Communication, synchronization and process creation are facets of the same operation
- Tuples are persistent
- Asynchronous communication between processes
- Data structures can be expressed as a collection of tuples
- Live data structures are a collection of live tuples
- Fine-grained live data structure programs
Linda and Objects
- Can be used with object-oriented programming
- Generate passive objects using out
- Generate active objects using eval
- Communication with an active object goes through tuple space
- Parallelism-oriented, unlike other methods
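An active object created with eval can be sketched using threads. Everything here is an illustrative assumption: the `TupleSpace` class, the use of `None` as a wildcard in place of `? x`, the `counter` process, and the convention that an amount of 0 tells it to stop.

```python
import threading


class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *fields):
        with self._cond:
            self._tuples.append(tuple(fields))
            self._cond.notify_all()

    def in_(self, *pattern):
        """Block until a tuple matches, then remove and return it."""
        with self._cond:
            while True:
                for i, tup in enumerate(self._tuples):
                    if len(tup) == len(pattern) and all(
                        p is None or p == v for p, v in zip(pattern, tup)
                    ):
                        return self._tuples.pop(i)
                self._cond.wait()

    def eval(self, *fields):
        """Live tuple: callable fields run in a new thread; once they
        finish, the result is added as an ordinary passive tuple."""
        def run():
            self.out(*(f() if callable(f) else f for f in fields))
        thread = threading.Thread(target=run)
        thread.start()
        return thread


def counter(ts, name):
    # Active object: receives requests through tuple space only.
    total = 0
    while True:
        _, _, amount = ts.in_(name, "add", None)
        if amount == 0:  # hypothetical stop convention
            return total
        total += amount


def demo():
    ts = TupleSpace()
    ts.eval("result", lambda: counter(ts, "ctr"))  # create the active object
    ts.out("ctr", "add", 5)
    ts.out("ctr", "add", 7)
    ts.out("ctr", "add", 0)
    return ts.in_("result", None)  # waits until the live tuple resolves
```

The caller never holds a reference to the counter itself; it only writes request tuples and waits for the result tuple, so communication and synchronization both happen through the space.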
Conclusions
- Network objects simplify communication in distributed systems, but introduce new complexities:
- Identifying objects consistently across computers
- Network-based garbage collection
- Communication between objects can be implemented in several ways
- RPC conveniently implements remote method access
- Object-oriented programming itself doesn’t implement parallelism