Distributed Processing, Client/Server and Clusters Chapter 16
Client/Server Computing
- Client machines: single-user PCs or workstations that provide a highly user-friendly interface to the end user.
- Each server provides a set of shared user services to the clients.
- The server enables many clients to share access to the same database and enables the use of a high-performance computer system to manage the database.
Client/Server Applications
- Client and server platforms and operating systems may differ.
- These lower-level differences are irrelevant as long as client and server share the same communications protocols (e.g., TCP/IP) and support the same applications.
- The functions performed by the application can be split between client and server in a way that optimizes the use of resources.
- The aim is to optimize the ability of users to perform various tasks and to cooperate with one another using shared resources.
- Heavy emphasis on providing a user-friendly graphical user interface (GUI) on the client side (the presentation services layer).
Generic Client/Server Architecture
Database Applications
- One of the most common families of client/server applications.
- The server is a database server responsible for maintaining the database.
- Interaction between client and server takes the form of transactions: the client makes a database request and receives a database response.
- A variety of client applications can use the same database server, all through the same interface/protocol.
Client/Server Architecture for Database Applications (e.g., SQL: Structured Query Language)
Client/Server Database Usage
- Example: an application needs to compute the mean age of a certain population, and the search criteria return 300K records. Sending all of them to the client causes heavy network traffic.
- To optimize performance, the server can be equipped with application logic for data analysis (here, computing the mean). The application logic is thus split between client and server.
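
The trade-off above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 module with an in-memory table stands in for a remote database server, and the table name `person`, the helper names, and the sample ages are all made up for the demo. Each function returns the mean together with the number of rows that would have to travel from server to client.

```python
import sqlite3

def make_demo_db(ages):
    """Build an in-memory table standing in for the server's database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (age INTEGER)")
    conn.executemany("INSERT INTO person (age) VALUES (?)", [(a,) for a in ages])
    return conn

def mean_at_client(conn):
    """Application logic at the client: every matching record is transferred."""
    rows = conn.execute("SELECT age FROM person").fetchall()
    return sum(r[0] for r in rows) / len(rows), len(rows)

def mean_at_server(conn):
    """Application logic at the server: only one aggregate row is transferred."""
    rows = conn.execute("SELECT AVG(age) FROM person").fetchall()
    return rows[0][0], len(rows)

if __name__ == "__main__":
    conn = make_demo_db([20, 30, 40])   # made-up sample data
    print(mean_at_client(conn))         # (mean, rows transferred)
    print(mean_at_server(conn))
```

Both functions compute the same mean; the difference is that the second moves one row instead of one row per matching record, which is the point of placing the analysis logic on the server.
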
Client/Server Interaction
- Another example: an application searches for a person named Jo Smith, born in 1980, whose SSN starts with 123 ...
- If the initial query matches 100K records, the server may just report that count without sending the records.
- The search can then be narrowed down, drastically reducing the number of candidate records.
- After a few iterations, the client receives the desired record.
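
The iterative narrowing described above might look like the following sketch. Everything here is hypothetical: the record layout, the `limit` threshold above which the server returns only a count, the function names, and the generated demo records are assumptions chosen so that each added criterion shrinks the result set.

```python
def make_demo_records():
    """Generate 200 made-up records; only one matches all three criteria."""
    return [
        {"name": "Jo Smith",
         "born": 1957 + i % 50,
         "ssn": f"{100 + i % 40}-00-{i:04d}"}
        for i in range(200)
    ]

def matches(record, name=None, born=None, ssn_prefix=None):
    """Check a record against whichever criteria the client supplied."""
    if name is not None and record["name"] != name:
        return False
    if born is not None and record["born"] != born:
        return False
    if ssn_prefix is not None and not record["ssn"].startswith(ssn_prefix):
        return False
    return True

def server_search(records, limit, **criteria):
    """Return the records only when there are few enough; otherwise just the count."""
    hits = [r for r in records if matches(r, **criteria)]
    if len(hits) > limit:
        return {"count": len(hits), "records": None}
    return {"count": len(hits), "records": hits}

if __name__ == "__main__":
    records = make_demo_records()
    # The client adds one criterion per round until the server sends records.
    for criteria in ({"name": "Jo Smith"},
                     {"name": "Jo Smith", "born": 1980},
                     {"name": "Jo Smith", "born": 1980, "ssn_prefix": "123"}):
        reply = server_search(records, limit=3, **criteria)
        print(criteria, "->", reply["count"], "match(es)")
```
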
Classes of Client/Server Applications
- Host-based processing (dumb terminal)
  - Not true client/server computing
  - Traditional mainframe environment
- Server-based processing (thin client)
  - The server does all the processing
  - The user (client) workstation provides a graphical user interface
Classes of Client/Server Applications
- Fat client models
  - Take advantage of desktop power and can serve a large number of clients
  - Cooperative processing
    - Application processing is performed in an optimized fashion across client and server
    - More complex to set up and maintain, but offers greater user productivity gains and greater network efficiency
  - Client-based processing
    - Most common client/server model
    - Nearly all application processing is done at the client; the exceptions are data validation routines and other database logic functions, which are done at the server
    - Some of the more sophisticated database logic functions are housed on the client side
    - Enables the user to employ applications tailored to local needs
Three-Tier Client/Server Architecture
- Application software is distributed among three types of machines:
  - User machine: a thin client
  - Middle-tier server: a gateway
    - Converts protocols
    - Maps from one type of database query to another
    - Merges/integrates results from different data sources
    - Assumes both roles: server and client
  - Backend server: legacy applications
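
As a rough sketch of the middle tier's gateway role, the example below maps one client request onto two different backend query styles and merges the results. The two backends, their query formats (`GET <id>` strings versus dictionaries), and the sample data are hypothetical and exist only to show the middle tier acting as a server to the user machine and as a client to the backends.

```python
LEGACY_DATA = {"42": "SMITH,JO"}                    # made-up legacy records
MODERN_DATA = {42: {"email": "jo@example.com"}}     # made-up modern records

def legacy_backend(command):
    """Hypothetical legacy application: understands only 'GET <id>' strings."""
    verb, key = command.split()
    if verb != "GET":
        raise ValueError(f"unsupported command: {command!r}")
    return LEGACY_DATA.get(key)

def modern_backend(request):
    """Hypothetical second data source: takes a dict, returns a dict."""
    return MODERN_DATA.get(request["id"], {})

def middle_tier(client_request):
    """Server to the client, client to both backends; merges their answers."""
    person_id = client_request["person_id"]
    legacy_row = legacy_backend(f"GET {person_id}")      # query-format mapping
    modern_row = modern_backend({"id": person_id})
    last, first = legacy_row.split(",") if legacy_row else (None, None)
    return {"id": person_id, "first": first, "last": last, **modern_row}

if __name__ == "__main__":
    print(middle_tier({"person_id": 42}))
```
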
File Cache Consistency
- File caches hold recently accessed file records.
- Cache consistency problem: caches are consistent when they contain exact copies of the remote data.
- Simple solution: file locking, which prevents simultaneous access to a file.
- More complicated approach: allow multiple readers but only one writer; when a file is opened for writing, mark it as non-cacheable.
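
One way to picture the multiple-reader, single-writer approach is the sketch below. It is a simplification under assumptions: the `FileServer` class, its method names, and the `notices` list (standing in for "this file is no longer cacheable" messages sent to readers) are invented for illustration, and the writer is never allowed to cache.

```python
class FileServer:
    """Tracks who has each file open and whether clients may cache it."""

    def __init__(self):
        self.readers = {}    # filename -> set of clients with read access
        self.writer = {}     # filename -> the single client with write access
        self.notices = []    # (client, filename) "stop caching" notifications

    def open(self, client, filename, mode):
        """Grant access; return True if the client may cache the file locally."""
        readers = self.readers.setdefault(filename, set())
        if mode == "r":
            readers.add(client)
            return filename not in self.writer   # cacheable only while no writer
        if mode == "w":
            if filename in self.writer:
                raise PermissionError(f"{filename} already has a writer")
            self.writer[filename] = client
            for reader in sorted(readers - {client}):
                self.notices.append((reader, filename))  # invalidate reader caches
            return False                         # the file is now non-cacheable
        raise ValueError(f"unknown mode: {mode!r}")

    def close(self, client, filename):
        """Release whatever access the client holds on the file."""
        self.readers.get(filename, set()).discard(client)
        if self.writer.get(filename) == client:
            del self.writer[filename]

if __name__ == "__main__":
    server = FileServer()
    print(server.open("A", "notes.txt", "r"))   # reader, no writer yet
    print(server.open("B", "notes.txt", "w"))   # writer arrives
    print(server.notices)                       # readers told to stop caching
```
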
Middleware
- The lack of standards for client/server models makes it difficult to implement an integrated, multivendor, enterprise-wide client/server configuration.
- Middleware: a set of tools that provide a uniform means and style of access to system resources across different platforms.
- Goal: to enable an application or user at a client to access a variety of services on servers without being concerned about differences among them.
- Provides standard programming interfaces and protocols that sit between the application above and the communications software and operating system below.
- Hides the complexities and disparities of different network protocols and operating systems.
- Enables programmers to build applications that look and feel the same with little effort.
- Enables programmers to use the same method to access data.
Logical View of Middleware
- Middleware, which cuts across all client and server platforms, is responsible for routing client requests to the appropriate server.
- SOA (Service-Oriented Architecture): services with well-defined interfaces are shared by different departments.
- Standardized interfaces enable service modules to communicate with one another and enable client applications to communicate with service modules.
- The most popular interface is XML (Extensible Markup Language) over HTTP (Hypertext Transfer Protocol), known as Web services.
- SOAs are also implemented using other standards, such as CORBA (Common Object Request Broker Architecture).
Distributed Message Passing
- Middleware products are typically based on one of two underlying mechanisms: message passing or remote procedure calls (RPC).
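
The relationship between the two mechanisms can be illustrated with a toy example: a procedure call that is carried as a request message and answered with a reply message. This is a sketch only. The JSON message layout, the `ClientStub` class, and the `add` procedure are assumptions, and an ordinary function call stands in for the network transport.

```python
import json

def add(a, b):
    """A procedure that lives on the server side."""
    return a + b

PROCEDURES = {"add": add}   # procedures the hypothetical server exposes

def server_handle(request_bytes):
    """Server side: unpack the request message, run the procedure, pack a reply."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

class ClientStub:
    """Client side: makes a remote call look like a local one."""

    def __init__(self, transport):
        self.transport = transport   # callable that carries bytes to the server

    def call(self, proc, *args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode()
        reply = self.transport(request)      # send message, wait for the reply
        return json.loads(reply)["result"]

if __name__ == "__main__":
    stub = ClientStub(server_handle)         # in-process stand-in for a network
    print(stub.call("add", 2, 3))
```
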
Message-Passing Schemes
- Reliable
  - Guarantees delivery if possible
  - Not necessary to let the sending process know that the message was delivered
- Unreliable
  - Sends the message out into the communication network without reporting success or failure
  - Reduces complexity and overhead
- Blocking
  - Send does not return control to the sending process until the message has been transmitted, or (in a stricter variant) until an acknowledgment is received
  - Receive does not return until a message has been placed in the allocated buffer
- Nonblocking
  - The process is not suspended as a result of a Send or a Receive
  - Efficient and flexible, but difficult to debug
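
The difference between blocking and nonblocking primitives can be seen in a small sketch. It assumes a bounded `queue.Queue` from the Python standard library as a stand-in for a network mailbox; the four wrapper function names are made up for the illustration.

```python
import queue
import threading

def blocking_send(mailbox, message):
    """Does not return until the message has been placed in the mailbox."""
    mailbox.put(message)              # suspends the caller while the mailbox is full

def nonblocking_send(mailbox, message):
    """Returns at once; reports False instead of waiting when the mailbox is full."""
    try:
        mailbox.put_nowait(message)
        return True
    except queue.Full:
        return False

def blocking_receive(mailbox):
    """Does not return until a message is available."""
    return mailbox.get()

def nonblocking_receive(mailbox):
    """Returns at once; None means no message was waiting."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None

if __name__ == "__main__":
    mailbox = queue.Queue(maxsize=1)
    print(nonblocking_receive(mailbox))    # nothing has been sent yet
    # A second thread sends after a short delay; the receiver waits for it.
    threading.Timer(0.1, blocking_send, args=(mailbox, "hello")).start()
    print(blocking_receive(mailbox))
```

The nonblocking variants push the "what if nothing happened?" case onto the caller, which is one reason such programs are flexible but harder to debug.
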
Clusters
- An alternative to symmetric multiprocessing (SMP)
- A group of interconnected, whole computers working together as a unified computing resource
  - Creates the illusion of being one machine
  - "Whole computer" means that each system can run on its own
Clusters Compared to SMP
- SMP is easier to manage and configure
- SMP takes up less space and draws less power
- Clusters are better for incremental and absolute scalability
  - New systems can be added in small increments
  - A cluster can have dozens of machines, each of which is a multiprocessor
- Clusters are superior in terms of availability
  - Failure of one node does not mean loss of service
- Clusters have superior price/performance
Beowulf and Linux Clusters
- Mass-market commodity components (no custom components)
- A dedicated, private network (LAN, WAN, or internetworked combination)
- Easy replication from multiple vendors
- Scalable I/O
- A freely available software base
- Design and improvements are returned to the community
Issues on Clusters
- Failure management
  - Highly available vs. fault-tolerant clusters
  - A highly available cluster offers a high probability that all resources will be in service
  - A fault-tolerant cluster ensures that all resources are always available (through redundant disks, processors, etc.)
- Load balancing
  - When a new computer is added to the cluster, the load-balancing facility should automatically include this computer in scheduling applications
- Parallelizing computation
  - Parallelizing compiler
  - Parallelized application
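
The load-balancing requirement above, that a newly added computer is picked up automatically, can be sketched as follows. The `LoadBalancer` class, its least-loaded policy, and the node names are assumptions for illustration; real cluster schedulers use richer load metrics.

```python
class LoadBalancer:
    """Assigns each job to the currently least-loaded node."""

    def __init__(self, nodes):
        self.load = {node: 0 for node in nodes}   # node name -> assigned work

    def add_node(self, node):
        """A new computer joins; it becomes eligible for scheduling at once."""
        self.load.setdefault(node, 0)

    def remove_node(self, node):
        """A node fails or leaves; it is no longer considered."""
        self.load.pop(node, None)

    def schedule(self, cost=1):
        """Pick the least-loaded node (ties broken by name) and charge it."""
        node = min(self.load, key=lambda n: (self.load[n], n))
        self.load[node] += cost
        return node

if __name__ == "__main__":
    balancer = LoadBalancer(["n1", "n2"])
    print([balancer.schedule() for _ in range(4)])   # work alternates between nodes
    balancer.add_node("n3")                          # new machine joins the cluster
    print([balancer.schedule() for _ in range(2)])   # it absorbs the next jobs
```
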
Parallelizing Computation
- In some cases, effective use of a cluster requires executing software from a single application in parallel. There are three general approaches to the problem.
- Parallelizing compiler
  - Determines, at compile time, which parts of an application can be executed in parallel; these are then split off to be assigned to different computers in the cluster
  - Performance depends on the nature of the problem and how well the compiler is designed
- Parallelized application
  - The programmer writes the application from the outset to run on a cluster and uses message passing to move data, as required, between cluster nodes
  - This places a high burden on the programmer but may be the best approach for exploiting clusters for some applications
- Parametric computing
  - Can be used if the essence of the application is an algorithm or program that must be executed a large number of times, each time with a different set of starting conditions or parameters
  - For this approach to be effective, parametric processing tools are needed to organize, run, and manage the jobs in an orderly manner
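
Parametric computing is the easiest of the three approaches to sketch, because the runs are independent of one another. In the example below, a thread pool from Python's `concurrent.futures` module is only a stand-in for cluster nodes, and the `simulate` job and its parameter sets are made up; on a real cluster a job-management tool would dispatch each run to a different machine.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(params):
    """Hypothetical job: the same program, run with different starting conditions."""
    start, steps = params
    value = start
    for _ in range(steps):
        value = value * 2 + 1
    return value

def run_sweep(job, parameter_sets, workers=4):
    """Run one job per parameter set; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(job, parameter_sets))

if __name__ == "__main__":
    sweep = [(1, 3), (0, 2), (5, 1)]         # made-up (start, steps) pairs
    for params, result in zip(sweep, run_sweep(simulate, sweep)):
        print(params, "->", result)
```
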