
1 - Distributed Systems (part 1)
Chris Gill cdgill@cse.wustl.edu
Department of Computer Science and Engineering, Washington University, St. Louis, MO, USA
CSE 591 Area 5 Talk, Monday, November 10, 2008

2 - Gill: Distributed Systems – 11/9/2015
What is a Distributed System?
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” - Leslie Lamport
(BTW, this is entirely “ha, ha, only serious” ;-)

3 - Key Characteristics of a Distributed System
Programs on different computers must interact
» A distributed system spans multiple computers
» Programs must send information to each other
» Programs must receive information from each other
» Programs also need to do some work ;-)
Programs play different roles in those interactions
» Send a request (client), process the request (server), send a reply (server), receive and process the reply (client)
» Remember where to find things (directory, etc. services)
» Mediate interactions among distributed programs (coordination, orchestration, etc. services)
Programs can interact in many other ways as well
» Coordination “tuple spaces” (JavaSpaces, Linda, LIME)
» Publish-subscribe and message passing middleware
» Externally driven (e.g., a workflow management system)

4 - Distribution Semantics Matters a Lot
How are the different computers inter-connected?
» Does all traffic move on a common data bus?
» Or, does traffic move across (hierarchical) networks?
» Or, does traffic move point-to-point between hosts?
Are there spatial and/or temporal factors?
» Does a host's physical location/movement matter?
» Is delay noticeable, are bandwidth limits relevant?
» Are connections “always on” or can they be intermittent?
» Does the inter-connection topology change?
» Is the inter-connection topology entirely dynamic?

5 - Distribution Semantics Examples (1/3)
Wired (hierarchical) internet
» Can reach any host from any other host
» Hosts are “always” on and available (% failure, downtime)
» Much of the WWW depends on this notion (example?)
[Figure: wired hierarchical network of hosts A–J]

6 - Distribution Semantics Examples (2/3)
Nomadic (hierarchical) internet
» Some hosts are mobile, connect to nearest access point
» Hosts may be unavailable, but reconnect eventually
» Host-to-host path topology may change due to this
» Cell phones, wireless laptops exhibit this behavior
[Figure: nomadic network of hosts A–J, with host C moving between access points]

7 - Distribution Semantics Examples (3/3)
Mobile ad hoc networks (MANETs)
» Mobile hosts connect to each other (without an access point)
» Hosts may detect dynamic connection, disconnection
» Hosts must exploit communication windows of opportunity
» Enables ad-hoc routing, message “mule” behaviors
[Figure: ad hoc network of hosts A–J, with host H moving between peers]

8 - Distributed System Example (Wired)
Real-time avionics middleware
» Layer(s) between the application and the operating system
» Ensures non-critical activities don’t interfere with timing of critical ones
» Based on other open-source middleware projects: the ACE C++ library and the TAO object request broker
» Standards-based (CORBA), written in C++/Ada
Flight demonstrations: BBN, WUSTL, Boeing, Honeywell

9 - Distributed System Example (Nomadic/MANET)
Sliver
» A compact (small-footprint) workflow engine for personal computing devices (e.g., cell phones, PDAs)
» Allows mobile collaboration to assemble and complete automated workflows (task graphs)
» Standards-based (BPEL, SOAP), written in Java
Developed by Greg Hackmann at WUSTL

10 - How Do Distributed Systems Interact?
Remote method invocations are one popular style
» Allow method calls to be made between programs
» Middleware uses threads, sockets, etc. to make it so
» CORBA, Java RMI, SOAP, etc. standardize the details
Other styles (better for nomadic/mobile settings)
» Coordination “tuple spaces” (JavaSpaces, Linda, LIME)
» Publish-subscribe and message passing middleware
» Externally driven (e.g., a workflow management system)

11 - Challenges for (Wired) Distributed Systems
Distributed systems are inherently complex
» Remote concurrent programs must inter-operate
» Interactions must be assured of liveness and safety
They must also avoid accidental complexity
» Design for ease of configuration, avoidance of mistakes
» System architectures and design patterns can help map low-level abstractions into appropriate higher-level ones

12 - How to Abstract Concurrent Event Handling?
Goal: process multiple service requests concurrently using OS-level threads
[Figure: server listening on port 30000; Client1 (port 27098) and Client2 (port 26545) connect and are accepted on ports 24467 and 25667]

13 - Basis: Synchronous vs. Reactive Read
[Figure: clients and server; a synchronous server blocks in read() on one handle, while a reactive server blocks in select() on a handle set and calls read() only when data is ready]

14 - Approach: Reactive Serial Event Dispatching
[Figure: clients feed a Reactor that blocks in select() on a handle set and dispatches read()/handle_*() upcalls serially to application event handlers]

15 - Interactions among Participants
Participants: Main Program, Concrete Event Handler, Reactor, Synchronous Event Demultiplexer
[Sequence diagram: the main program calls register_handler(handler, event_types); the reactor calls get_handle() on the handler; the main program calls handle_events(); the reactor blocks in select(); when an event arrives, the reactor dispatches handle_event() on the handler]

16 - Distributed Interactions with Reactive Hosts
Application components implemented as handlers
» Use reactor threads to run input and output methods
» Send requests to other handlers via sockets, upcalls
Example of a multi-host request/result chain
» h1 to h2, h2 to h3, h3 to h4
[Figure: handlers h1–h4 hosted on reactors r1–r3, connected by sockets]

17 - WaitOnConnection Strategy
Handler waits on the socket connection for the reply
– Makes a blocking call to the socket’s recv() method
Benefits
– No interference from other requests that arrive while the reply is pending
Drawbacks
– One less thread in the Reactor for new requests
– Could allow deadlocks when upcalls are nested

18 - WaitOnReactor Strategy
Handler returns control to the reactor until the reply comes back
– Reactor can keep processing other requests while replies are pending
Benefits
– Thread available, no deadlock
– Thread stays fully occupied
Drawbacks
– Interleaving of request/reply processing
– Interference from other requests issued while the reply is pending

19 - Blocking with WaitOnReactor
The Wait-on-Reactor strategy can cause interleaved request/reply processing
The blocking factor could be large or even unbounded
– Based on the upcall duration
– And the sequence of other intervening upcalls
Blocking factors may affect real-time properties of other end-systems
– Call chains can have a cascading blocking effect
[Timeline: while f2 awaits its reply, the f5 reply is queued, f3 completes, the f5 reply is processed, and only then does f2 complete; the blocking factor for f2 spans the intervening upcalls]

20 - Why Not a “Stackless” WaitOnReactor Variant?
What if we didn’t “stack” processing of results?
– But instead allowed them to be handled asynchronously as they become ready
– “Stackless Python” takes this approach
– Thanks to Caleb Hines, who pointed this out in CSE 532
Benefits
– No interference from other requests that arrive while the reply is pending
– No risk of deadlock, as the thread still returns to the reactor
Drawbacks
– Significant increase in implementation complexity
– Time and space overhead to match requests to results (other patterns we cover in CSE 532 could help, though)

21 - Could WaitOnConnection Be Used?
Its main limitation is its potential for deadlock
» And yet, it offers low overhead and ease of implementation/use
Could we make a system deadlock-free…
» if we knew its call graph, and were careful about how threads were allowed to proceed?
Notice that a lot of distributed systems research has this kind of flavor…
» Given one approach (of probably several alternatives):
Can we solve problem X that limits its applicability and/or utility?
Can we apply that solution efficiently in practice?
Does the solution raise other problems that need to be solved?

22 - Deadlock Problem in Terms of a Call Graph
The call graph often can be obtained
Each reactor is assigned a color
Deadlock can exist
» if there exist more than K_C segments of color C
» where K_C is the number of threads in the node with color C
» e.g., f3-f2-f4-f5-f2 needs at least 2 & 1
[Figure: call graph over functions f1–f5]
From V. Subramonian and C. Gill, “A Generative Programming Framework for Adaptive Middleware”, 2004

23 - Simulation Showing Thread Exhaustion
Formally, increasing the number of reactor threads may not prevent deadlock
[Figure: Client1–Client3 send Flow1–Flow3 through event handlers EH1–EH3 on Reactor1 (Server1) and Reactor2 (Server2)]
Clients send requests:
3: Client3 : TRACE_SAP_Buffer_Write(13,10)
4: Unidir_IPC_13_14 : TRACE_SAP_Buffer_Transfer(13,14,10)
5: Client2 : TRACE_SAP_Buffer_Write(7,10)
6: Unidir_IPC_7_8 : TRACE_SAP_Buffer_Transfer(7,8,10)
7: Client1 : TRACE_SAP_Buffer_Write(1,10)
8: Unidir_IPC_1_2 : TRACE_SAP_Buffer_Transfer(1,2,10)
Reactor1 makes upcalls to event handlers:
10: Reactor1_TPRHE1 ---handle_input(2,1)---> Flow1_EH1
12: Reactor1_TPRHE2 ---handle_input(8,2)---> Flow2_EH1
14: Reactor1_TPRHE3 ---handle_input(14,3)---> Flow3_EH1
Flow1 proceeds:
15: Time advanced by 25 units. Global time is 28
16: Flow1_EH1 : TRACE_SAP_Buffer_Write(3,10)
17: Unidir_IPC_3_4 : TRACE_SAP_Buffer_Transfer(3,4,10)
19: Reactor2_TPRHE4 ---handle_input(4,4)---> Flow1_EH2
20: Time advanced by 25 units. Global time is 53
21: Flow1_EH2 : TRACE_SAP_Buffer_Write(5,10)
22: Unidir_IPC_5_6 : TRACE_SAP_Buffer_Transfer(5,6,10)
Flow2 proceeds:
23: Time advanced by 25 units. Global time is 78
24: Flow2_EH1 : TRACE_SAP_Buffer_Write(9,10)
25: Unidir_IPC_9_10 : TRACE_SAP_Buffer_Transfer(9,10,10)
27: Reactor2_TPRHE5 ---handle_input(10,5)---> Flow2_EH2
28: Time advanced by 25 units. Global time is 103
29: Flow2_EH2 : TRACE_SAP_Buffer_Write(11,10)
30: Unidir_IPC_11_12 : TRACE_SAP_Buffer_Transfer(11,12,10)
Flow3 proceeds:
31: Time advanced by 25 units. Global time is 128
32: Flow3_EH1 : TRACE_SAP_Buffer_Write(15,10)
33: Unidir_IPC_15_16 : TRACE_SAP_Buffer_Transfer(15,16,10)
35: Reactor2_TPRHE6 ---handle_input(16,6)---> Flow3_EH2
36: Time advanced by 25 units. Global time is 153
37: Flow3_EH2 : TRACE_SAP_Buffer_Write(17,10)
38: Unidir_IPC_17_18 : TRACE_SAP_Buffer_Transfer(17,18,10)
39: Time advanced by 851 units. Global time is 1004

24 - Solution: New Deadlock Avoidance Protocols
Papers at FORTE 2005 through EMSOFT 2006
http://www.cse.wustl.edu/~cdgill/PDF/forte05.pdf
http://www.cse.wustl.edu/~cdgill/PDF/emsoft06_liveness.pdf
César Sánchez: PhD dissertation at Stanford
» Collaboration with Henny Sipma and Zohar Manna
Paul Oberlin: MS project here at WUSTL
Avoid interactions leading to deadlock
» a liveness property
Like synchronization, achieved via scheduling
» Upcalls are delayed until enough threads are ready
But this introduces small blocking delays
» a timing property
» in real-time systems, also a safety property

25 - Deadlock Avoidance Protocol Overview
Regulates upcalls based on the number of available reactor threads and the call graph’s “thread height”
– Does not allow exhaustion
The BASIC-P protocol is implemented in the ACE Thread Pool Reactor
– Using handle suspension and resumption
– Backward compatible, minimal overhead
[Figure: Client1–Client3 send Flow1–Flow3 through event handlers EH1–EH3 on Reactor1 (Server1) and Reactor2 (Server2)]

26 - Timing Traces: DA Protocol at Work
Timing traces from the model and the execution show the DA protocol regulating the flows to use available resources without deadlock
[Figure: per-flow timing traces for Flow1–Flow3 through handlers EH1–EH3 across R1 and R2]

27 - DA Blocking Delay (Simulated vs. Actual)
[Figure: model execution vs. actual execution, showing the blocking delays for Client2 and Client3]

28 - Overhead of ACE TP Reactor with DA
Negligible overhead with no DA protocol
Overhead increases with the number of event handlers, because of their suspension and resumption on protocol entry and exit

29 - Where Can We Go From Here?
Distributed computing is ubiquitous
» …in planes, trains, and automobiles…
» …in medical devices and equipment…
» …in more and more places each day
Distributed systems offer many research opportunities
» Discover them from specific problems
» May allow advances even in well-worked areas (e.g., deadlock avoidance)
What new systems can we build by spanning different platforms?
» I’ll leave that as an open question for you to consider (and ultimately, to answer)
A fire extinguisher that runs UNIX?

