Università di Roma La Sapienza Dipartimento di Informatica e Sistemistica Distributed Systems Corso di Laurea Specialistica in Ingegneria Informatica AA 2006/2007 Introduction Prof. Roberto Baldoni Ing. Alessia Milani Ing. Leonardo Querzoni Ing. Silvia Bonomi
A definition
A distributed system is a set of spatially separated entities, each with some computational power, that are able to communicate and coordinate among themselves to reach a common goal.
Primary goal: sharing data/resources
Problems:
–Synchronization
–Coordination
Coordination has to be implemented taking into account the following conditions, which deviate from those of centralized systems:
1. Temporal and spatial concurrency
2. No global clock
3. Failures
4. Unpredictable latencies
These limitations restrict, for example, the set of coordination problems that can be solved in a distributed setting.
Distributed Systems Examples
Internet
Intranet
Mobile system
But also...
Service Oriented Architectures
Overlay Networks
Grid
P2P
Pervasive Systems & Ubiquitous Computing
Pervasive Systems
Internet everywhere
One person, many computers
Mobility
...
How can I keep my mailbox consistent?
Characteristics... and Challenges
Heterogeneity
Openness
Security
Scalability
Fault Tolerance
Concurrency
Transparency
Heterogeneity
Networks
Hardware
Operating systems
Programming languages
Implementations from different developers
Middleware solutions (from RPC to Service Oriented Architectures)
Mobile code and virtual machines
Openness
Capability of a system to be extended and re-implemented
Necessary condition: a set of documents describing the software interfaces
Interface Definition Language (it describes the syntax and the semantics of a service/component: available functions/services, input parameters, exceptions, etc.)
Problem: semantic description of a service.
A specification of a service/component is well-formed if it is:
–Complete. A specification is complete if everything related to the implementation has been specified. If something has not been specified, the designer needs to add implementation-dependent details.
–Neutral. A specification is neutral if it does not offer any detail on a possible implementation.
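As a rough illustration of an interface specification that stays neutral about the implementation, the sketch below declares an abstract service in Python and one possible implementation behind it. The names `MailboxService`, `MailboxNotFound` and `InMemoryMailbox` are hypothetical and chosen only for this example; they do not come from any IDL or standard.

```python
from abc import ABC, abstractmethod


class MailboxNotFound(Exception):
    """Raised when the requested mailbox does not exist (hypothetical)."""


class MailboxService(ABC):
    """Hypothetical interface: it states what is offered, not how."""

    @abstractmethod
    def fetch(self, user: str) -> list:
        """Return the messages of `user`; raise MailboxNotFound if unknown."""

    @abstractmethod
    def store(self, user: str, message: str) -> None:
        """Append `message` to the mailbox of `user`, creating it if needed."""


class InMemoryMailbox(MailboxService):
    """One possible implementation; clients depend only on MailboxService."""

    def __init__(self) -> None:
        self._boxes = {}

    def fetch(self, user: str) -> list:
        if user not in self._boxes:
            raise MailboxNotFound(user)
        return list(self._boxes[user])   # copy, so callers cannot alter state

    def store(self, user: str, message: str) -> None:
        self._boxes.setdefault(user, []).append(message)


box = InMemoryMailbox()
box.store("alice", "hello")
print(box.fetch("alice"))   # ['hello']
```

The docstrings carry the semantic part of the specification (return values, exceptions), which is the part the slide identifies as hard to describe completely.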
Openness (ii)
Interoperability. The capability of two systems to cooperate by using services/components specified by a common standard.
Portability. The capability of a service/component implemented on a distributed system A to work on a system B without any modification.
Flexibility. The capacity of a system to configure/orchestrate components developed by various programmers.
Add-on features. The capacity of a distributed system to add components/services and integrate them into a running system.
Openness (iii)
Other recent capabilities:
–Evolvability. The capacity of a system to evolve over time, for example by keeping two different versions of the same service active.
–Self-* (self-organization, self-management, self-healing, etc.). The capacity of a system to reconfigure and manage itself without human intervention.
The number of independent software developers makes the development of a distributed platform very complex.
Examples:
–RFCs for the Internet
–JBoss for the J2EE platform
Security
Confidentiality (protection against the interception of data by unauthorized users)
Integrity (protection against data alteration)
Availability (protection against interference with access to a resource)
Scalability
A system is scalable if it keeps running with adequate performance even if the number of users grows by orders of magnitude.
Centralization works against scalability:
–Service (single service for all users)
–Data (single data structure for all processes)
Computers connected to the Internet:
Date        Computers     Web servers
1979, Dec   ...           ...
..., July   130,...       ...
..., July   56,218,000    5,560,866
Scalability (ii)
It becomes necessary to use:
–Service replication: coordination problems
–Data replication: consistency problems
–Distributed algorithms:
    No node has the current state of the whole system
    Nodes base their decisions on the data they own
    A failure of a node should not compromise the goal of the algorithm
Geographical scalability
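The consistency problem raised by data replication can be sketched in a few lines. The example below is a toy model, assuming two replicas that each apply writes in the order they happen to receive them (last write wins); the function and variable names are made up for illustration.

```python
def apply_writes(writes):
    """Apply (key, value) writes in the given order; the last write wins."""
    state = {}
    for key, value in writes:
        state[key] = value
    return state


# Two clients update the same item concurrently.
write_a = ("x", 1)
write_b = ("x", 2)

# With unpredictable latencies, each replica may receive the writes
# in a different order.
replica_1 = apply_writes([write_a, write_b])
replica_2 = apply_writes([write_b, write_a])

print(replica_1, replica_2)     # {'x': 2} {'x': 1}
print(replica_1 == replica_2)   # False: the replicas have diverged
```

Keeping the replicas consistent requires the nodes to coordinate on a common order of updates, which is exactly the extra work replication introduces.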
What is a large-scale distributed system?
(Figure: Internet-scale Applications; Enterprise Data Centers; Cooperative Information Systems; eGov; Scalable Consistency-based Applications; Scalable QoS-based Applications)
First Open Workshop, Budapest
Failure Management
Failure detection
–Example: a checksum detects a corrupted packet
Failure masking
–Example: message retransmission
Tolerating failures
–Example: intrusion-tolerant systems
Failure recovery
–Example: completing a long-running computation
Redundancy
–Example: DNS
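The first two examples (checksum-based detection and masking by retransmission) can be combined in a small sketch. It assumes a hypothetical channel that flips one bit of a packet with a given probability and uses CRC-32 from Python's `zlib` as the checksum; the packet layout and all function names are illustrative only, not a real protocol.

```python
import random
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_packet(packet: bytes):
    """Failure detection: return the payload if the checksum matches, else None."""
    payload, received = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == received:
        return payload
    return None


def lossy_channel(packet: bytes, rng: random.Random, error_rate: float) -> bytes:
    """Hypothetical channel: flips one bit with probability error_rate."""
    if rng.random() < error_rate:
        corrupted = bytearray(packet)
        corrupted[0] ^= 0x01
        return bytes(corrupted)
    return packet


def send_with_retransmission(payload, rng, error_rate=0.5, max_attempts=20):
    """Failure masking: resend until an uncorrupted copy arrives."""
    for attempt in range(1, max_attempts + 1):
        received = check_packet(lossy_channel(make_packet(payload), rng, error_rate))
        if received is not None:
            return received, attempt
    raise TimeoutError("packet could not be delivered")


rng = random.Random(0)
data, attempts = send_with_retransmission(b"hello", rng)
print(data, attempts)
```

The receiver never sees the corrupted copies: detection turns corruption into an omission, and retransmission masks the omission, at the cost of extra latency.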
Concurrency
Multiple access to shared resources
–If clients concurrently invoke read and write methods on a shared variable
–Which value does each read return?
Coordination
Synchronization
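To make the question about concurrent reads and writes concrete, the sketch below enumerates every interleaving of two clients that each read a shared variable and then write back the value read plus one. The scenario (two clients, an increment) is an assumption chosen for illustration; no real threads are used, so the result is deterministic.

```python
from itertools import permutations


def run(schedule):
    """Execute one interleaving of read/write steps on a shared variable."""
    shared = 0
    local = {}                       # value each client last read
    for client, step in schedule:
        if step == "read":
            local[client] = shared
        else:                        # "write": store the value read, plus one
            shared = local[client] + 1
    return shared


steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]


def preserves_program_order(schedule):
    """Each client must perform its read before its write."""
    return all(
        schedule.index((c, "read")) < schedule.index((c, "write")) for c in "AB"
    )


outcomes = {run(s) for s in permutations(steps) if preserves_program_order(s)}
print(sorted(outcomes))   # [1, 2]
```

The outcome 1 appears whenever both clients read before either writes, so one update is lost. Ruling out such interleavings is the job of synchronization.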
Transparency
Access: allows remote and local resources to be accessed with the same operations
Location: allows resources to be accessed without knowing their physical location
Concurrency: allows a set of processes to run concurrently on shared resources without interfering with each other
Failures: allows failures to be masked so that users can complete the remaining requested operations
Mobility: allows resources and users to move without affecting the operations issued by users
Performance: allows the system to be reconfigured as the load changes
The performance of a solution based on a distributed system does not always improve with respect to a solution based on a centralized system.
Layering hw and sw
Interaction Models
client/server
peer-to-peer
Interaction Models
Interaction models have an impact on scalability, availability, cost, security and performance.
E.g. client/server with replicated services:
–availability, scalability
–performance: replication imposes extra work to maintain consistency despite replica crashes
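The availability gain from replicated services can be estimated with a back-of-the-envelope calculation. The sketch below assumes that replicas fail independently and that the service is up as long as at least one replica is up; both are simplifying assumptions, and the function name is made up for this example.

```python
def service_availability(replica_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent replicas is up."""
    return 1.0 - (1.0 - replica_availability) ** replicas


for n in (1, 2, 3):
    print(n, round(service_availability(0.9, n), 6))
```

Under these assumptions each extra replica of a 90%-available server removes roughly one order of magnitude of downtime, while every update now has to reach all replicas, which is the consistency overhead mentioned on the slide.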
Web proxy server
Web applets
Thin clients and compute servers
(Figure labels: thin client; application process; network computer or PC; compute server; network)
Spontaneous networking in a hotel
(Figure labels: Internet; gateway; PDA; music service; discovery service; alarm; camera; guests' devices; laptop; TV/PC; hotel wireless network)
Real-time ordering of events
Processes and channels
Middleware: problems to face
Heterogeneity: OS, clock speeds, data representation, memory, hardware architecture
Local asynchrony: load on a node, different OS, interrupts
Lack of global knowledge: knowledge propagates through messages, whose propagation times are much longer than the time taken to execute an internal event
Network asynchrony: message propagation times can be unpredictable
Failures of nodes or network partitions
Lack of a global order of events
This limits the set of problems that can be solved through deterministic algorithms on some distributed systems.
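The lack of a global order of events can be illustrated with a toy model of two nodes whose local clocks disagree. The offsets, node names and event times below are invented for the example. The `real_time` field stands for a global time that no node can actually observe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    node: str
    real_time: float   # true time in seconds, not observable by any node


# Hypothetical skew: the clock of n2 runs 0.5 s behind the clock of n1.
CLOCK_OFFSET = {"n1": 0.0, "n2": -0.5}


def local_timestamp(event: Event) -> float:
    """Timestamp the event with the clock of the node where it occurs."""
    return event.real_time + CLOCK_OFFSET[event.node]


events = [Event("send", "n1", 10.0), Event("receive", "n2", 10.2)]

by_real_time = [e.name for e in sorted(events, key=lambda e: e.real_time)]
by_local_clock = [e.name for e in sorted(events, key=local_timestamp)]

print(by_real_time)     # ['send', 'receive']
print(by_local_clock)   # ['receive', 'send']
```

Ordering by local timestamps places the receive before the send that caused it, so physical timestamps alone cannot be trusted to order events across nodes.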