Implementing a Load-balanced Web Server System
Architecture of a Cluster-based Web System
[Figure courtesy of IBM Research Report, "The State of the Art in Locally Distributed Web-Server Systems."]
Architecture of Our Web Server Cluster
[Figure: HTTP requests arrive at the Load Distributor (grid1.cs.ucr.edu), which forwards them to Web Server 1 (File Service, Database Service) and Web Server 2 (File Service, Video-On-Demand).]
Our Web Server Cluster
- The whole cluster exposes only one visible web address to the outside world.
- Each web server provides two kinds of web services.
- The load distributor distributes incoming requests among the servers according to either a content-aware or a content-unaware load-balancing strategy.
Tasks to Set Up the System
- Building up the web services on the servers
  - File services
  - Video-on-demand services
  - Database services
- Implementing the load distributor on the front-end node
  - Content-aware request distribution
  - Content-unaware request distribution
Building Up the Web Services
- File service
  - Built on top of the Apache server.
  - The file set is generated by SPECWEB99.
- Video-on-demand
  - Real MPEG-2 movies are stored in a specific directory on the Apache server.
  - Client video streaming software (VideoLAN) is installed and automatically launched by the Apache server.
- Database service
  - Built on top of Apache and MySQL.
Video-On-Demand Service
- VideoLAN project (open-source media streaming solution)
  - Targets multimedia streaming of MPEG-1, MPEG-2, MPEG-4 and DivX files, DVDs, digital satellite channels, digital terrestrial television channels and live videos on a high-bandwidth IPv4 or IPv6 network in unicast or multicast.
- Client-server architecture
  - The server streams MPEG-1, MPEG-2 and MPEG-4 / DivX files, DVDs and live videos on the network in unicast or multicast.
  - The client receives, decodes and displays the MPEG stream.
VideoLAN System
Building Up the Video-On-Demand Service in Our Web Server
- The VideoLAN client-server software is installed.
- The server can stream movies to the client in real time through UDP/RTP or HTTP/TCP.
- For video-on-demand service using HTTP/TCP, only the client is needed.
- The client software (vlc) is automatically launched once the Apache server detects that the requested file is a video file.
Load Balancing Schemes
- Content-unaware scheme
  - Chooses a server before receiving the URL request
  - Round robin
- Content-aware schemes
  - Choose a server to dispatch a request to after receiving and examining the URL request
  - Balance load according to the type of URL request
    - Database service: database server
    - Video-on-demand service: multimedia server
    - File service: round robin
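The two schemes above can be sketched as a pair of selection functions. This is only an illustration, not the distributor's actual code: the `backend` enum, the function names, and the rule that `.php` paths are database requests and `.mpg`/`.mpeg` paths are video requests are all assumptions made for the example. The mapping of the database service to Web Server 1 and video-on-demand to Web Server 2 follows the architecture slide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical back-end identifiers; the slides name roles, not hosts. */
enum backend { WEB_SERVER_1 = 0, WEB_SERVER_2 = 1, NUM_BACKENDS = 2 };

/* Round-robin position, kept in the process that makes the choice. */
static unsigned next_rr = 0;

/* Content-unaware: pick the next server without looking at the request. */
enum backend choose_round_robin(void) {
    enum backend b = (enum backend)(next_rr % NUM_BACKENDS);
    next_rr++;
    return b;
}

/* True if the path part of the URL (query string ignored) ends with suffix. */
static int path_ends_with(const char *url, const char *suffix) {
    size_t n = strcspn(url, "?");
    size_t m = strlen(suffix);
    return n >= m && strncmp(url + n - m, suffix, m) == 0;
}

/* Content-aware: route by request type, fall back to round robin for files.
 * The suffix rules below are assumptions for illustration only. */
enum backend choose_content_aware(const char *url) {
    if (path_ends_with(url, ".php"))
        return WEB_SERVER_1;               /* database service */
    if (path_ends_with(url, ".mpg") || path_ends_with(url, ".mpeg"))
        return WEB_SERVER_2;               /* video-on-demand service */
    return choose_round_robin();           /* file service */
}
```

With these rules, `choose_content_aware("/movies/demo.mpg")` always selects the video server, while successive requests for plain files alternate between the two servers.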
Implementing the Load Distributor
- Install TCPSP
  - TCP splicing is a technique that splices two connections inside the kernel, so that data relaying between the two connections can run at near-router speeds.
- Write the distributor program in C
  - Two load-balancing strategies are implemented.
  - The installed kernel module TCPSP is invoked to perform TCP splicing.
- Run the distributor program at the application level
Flow Chart of the Load Distributor (content aware)
Distributor
1. Listen for incoming connections on port 8888
2. Accept the connection
3. Read the URL request
4. Choose a server according to the request type and load balancing scheme
5. Create a child process to do further processing
Child Process
1. Establish a TCP connection with the chosen server
2. Write the URL request to the second TCP connection
3. Splice the two TCP connections
4. Monitor the two TCP connections and close them when no more activity is going on
5. End
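The "read the URL request" step amounts to pulling the request target out of the first line of the HTTP request. A minimal sketch follows; the function name and the fixed-buffer interface are assumptions for illustration, not taken from the distributor's source.

```c
#include <stddef.h>
#include <string.h>

/* Extract the request target from an HTTP request line such as
 * "GET /index.html HTTP/1.0\r\n" into path (NUL-terminated).
 * Returns 0 on success, -1 if the line is malformed or path is too small. */
int parse_request_path(const char *request, char *path, size_t path_len) {
    const char *sp = strchr(request, ' ');   /* end of the method token */
    if (sp == NULL)
        return -1;
    const char *start = sp + 1;
    size_t len = strcspn(start, " \r\n");    /* target ends at next space or line end */
    if (len == 0 || len >= path_len)
        return -1;
    memcpy(path, start, len);
    path[len] = '\0';
    return 0;
}
```

The extracted path is what a content-aware scheme would inspect to decide between the database, video-on-demand, and file services.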
Flow Chart of the Load Distributor (content unaware)
Distributor
1. Listen for incoming connections on port 8888
2. Accept the connection
3. Choose a server according to the load balancing scheme
4. Create a child process to do further processing
Child Process
1. Establish a TCP connection with the chosen server
2. Read the URL request
3. Write the URL request to the second TCP connection
4. Splice the two TCP connections
5. Monitor the two TCP connections and close them when no more activity is going on
6. End
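The first step of either flow chart, listening on a port, can be sketched with the standard sockets API. The function name and the backlog of 128 are assumptions chosen for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port on all interfaces.
 * Returns the listening descriptor, or -1 on failure. */
int open_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts of the distributor on the same port. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 128) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A distributor following the chart would call `open_listener(8888)`, then loop on `accept()` and `fork()` a child for each accepted connection.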
Comparison with Gage
- Gage: a QoS-aware web server system
  - “Performance Guarantees for Cluster-Based Internet Services”, Chang Li, State University of New York at Stony Brook.
- In Gage, the load distributor is implemented as a kernel module.
  - It is faster but can only implement content-unaware load balancing.
- Gage doesn’t provide a variety of web services.
Planned Performance Measurement
- Let all servers provide file service and use SPECWEB99 to test the performance of the cluster-based file server.
- Compare the time taken to service a database request through the load distributor with the time taken without it.
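For the database comparison, the measurement itself reduces to the difference between two timestamps. A minimal helper, with a hypothetical name, might look like this:

```c
#include <time.h>

/* Milliseconds elapsed between two timestamps, for example ones taken
 * with clock_gettime(CLOCK_MONOTONIC, ...) before and after a request. */
double elapsed_ms(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec) * 1000.0 +
           (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}
```

One timestamp would be taken just before sending the database request and one after the full response arrives, once with the client pointed at the distributor and once pointed directly at the back-end server.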
SPECWEB99
Let’s go to the lab to see the demo!