Cluster-Based Scalable Network Services Authors: Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer (University of California, Berkeley) Paul Gauthier (Inktomi Corporation)
Contents Clusters & Load balancing Problems in providing clustered services TACC - programming model Cluster-Based Scalable Network Architecture
Contents (cont..) TranSend & HotBot cluster implementations TranSend performance results Current cluster servers
Clusters & Load balancing Cluster: a domain in which several homogeneous systems behave as a single system to provide high-performance service, availability, reliability & transparency of data over a network. [Figure: a load balancer distributing requests r1, r2, r3, r4 across the cluster nodes]
Problems in providing clustered services Scalability Availability Cost effective high performance Transparency Configurability & maintenance Extensibility
TACC - programming model Transformation Aggregation Caching Customization
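As a rough illustration of how the four TACC elements might compose, here is a minimal Python sketch. Every name in it (`shorten`, `aggregate`, `TaccService`, the `max_chars` preference) is hypothetical and chosen for the example; it is not the TranSend API. Customization is modeled as per-user preferences passed to the same worker code, and caching as a dictionary keyed by URL plus preferences.

```python
from typing import Callable, Dict, List, Tuple

# All names here are illustrative assumptions, not taken from the original system.

def shorten(content: str, prefs: Dict[str, str]) -> str:
    """Transformation: operate on one object; here, truncate text.
    Customization: the limit comes from the user's preferences."""
    limit = int(prefs.get("max_chars", "20"))
    return content[:limit]

def aggregate(parts: List[str]) -> str:
    """Aggregation: combine several objects into a single result."""
    return " | ".join(parts)

class TaccService:
    def __init__(self,
                 transform: Callable[[str, Dict[str, str]], str],
                 fetch: Callable[[str], str]) -> None:
        self.transform = transform
        self.fetch = fetch
        # Caching: transformed results keyed by (url, preferences).
        self.cache: Dict[Tuple[str, Tuple[Tuple[str, str], ...]], str] = {}
        self.fetch_count = 0

    def get(self, urls: List[str], prefs: Dict[str, str]) -> str:
        results = []
        for url in urls:
            key = (url, tuple(sorted(prefs.items())))
            if key not in self.cache:
                self.fetch_count += 1
                self.cache[key] = self.transform(self.fetch(url), prefs)
            results.append(self.cache[key])
        return aggregate(results)
```

A caller would supply its own `fetch` function; repeated requests with the same preferences are then served from the cache, while a different preference set triggers a fresh transformation.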
Conventional transactional systems provide ACID properties (Atomicity, Consistency, Isolation & Durability). Internet services require BASE properties (Basically Available, Soft state & Eventual consistency)
Cluster-Based Scalable Network Architecture
TranSend version of SNS (a) Front Ends: accept HTTP requests; pair each HTTP request with the user's preferences; assign requests to distillers; return cached data
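TranSend (Cont…) (b) Load Balancing [Figure: requests arrive at the front ends (each with a manager stub, MS), which consult the manager, the cache ($), and the user profile database, and dispatch work to distillers (each with a worker stub, WS); responses return through the front ends]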
TranSend (Cont…) (c) Crash recovery & fault tolerance (d) User Profile database (e) Graphical Monitor (f) Caching
HotBot implementation (Inktomi work) Front End Nodes -- multiple threads put incoming connections in a queue. Load balancer -- statically partitions the database among the worker nodes. Failure Management -- similar nodes are attached to each partition
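The static partitioning and failure handling described on this slide can be sketched roughly as follows. This is an illustrative toy, not HotBot's implementation: the class `PartitionedIndex`, the CRC32-based placement, and the two-replica default are all assumptions made for the example. Each document is assigned to a fixed partition, every query is sent to all partitions, and if every replica of a partition is down the query still returns a partial answer.

```python
import zlib
from typing import Dict, List

def partition_of(doc_id: str, num_partitions: int) -> int:
    """Static assignment: a stable hash of the id picks the partition (assumed scheme)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_partitions

class PartitionedIndex:
    def __init__(self, num_partitions: int, replicas: int = 2) -> None:
        # nodes[p][r] holds replica r of partition p as {doc_id: text}.
        self.nodes: List[List[Dict[str, str]]] = [
            [{} for _ in range(replicas)] for _ in range(num_partitions)
        ]
        self.alive: List[List[bool]] = [
            [True] * replicas for _ in range(num_partitions)
        ]

    def add(self, doc_id: str, text: str) -> None:
        p = partition_of(doc_id, len(self.nodes))
        for replica in self.nodes[p]:
            replica[doc_id] = text

    def fail(self, partition: int, replica: int) -> None:
        """Mark one node as crashed."""
        self.alive[partition][replica] = False

    def search(self, word: str) -> List[str]:
        hits: List[str] = []
        for p, replicas in enumerate(self.nodes):
            for r, store in enumerate(replicas):
                if self.alive[p][r]:
                    # The first live replica answers for this partition.
                    hits.extend(d for d, t in store.items() if word in t.split())
                    break
            # With no live replica, this partition's results are simply missing.
        return sorted(hits)
```

Losing one replica leaves results unchanged; losing every replica of a partition shrinks the result set instead of failing the query.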
TranSend – performance details Distiller Performance
TranSend -- Cache performance An average cache hit takes 27 ms to serve. 95% of cache hits take less than 100 ms. LRU caching improved performance as long as the total number of users did not exceed what the cache could hold. Some latency is due to the many connections with the Front-Ends
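The effect of cache size on an LRU policy can be seen in a small Python sketch. It is a generic LRU cache built on `collections.OrderedDict`, not TranSend's cache; the `LRUCache` class and the `hit_rate` helper are names invented for this example.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, Optional

class LRUCache:
    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, object]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Optional[object]:
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key: Hashable, value: object) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

def hit_rate(capacity: int, accesses: Iterable[Hashable]) -> float:
    """Replay an access trace and return the fraction of hits."""
    cache = LRUCache(capacity)
    total = 0
    for key in accesses:
        total += 1
        if cache.get(key) is None:
            cache.put(key, key)
    return cache.hits / total if total else 0.0
```

For a cyclic trace over three keys, `hit_rate(3, "abcabc")` gives 0.5, while `hit_rate(2, "abcabc")` gives 0.0: once the working set exceeds the capacity, LRU evicts each entry just before it is needed again.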
TranSend – scalability Front-Ends scalability Distiller scalability
Current cluster servers (WebLogic, WebSphere, PRAMATI Web-Server) [Figure: requests and responses pass through load balancers to an interface layer for cluster object distribution over nodes Node1-r1, Node1-r2, Node2-r1, and Node3-r1; a new node registers with the interface layer]
Scalability of web servers [Figure: connections handled by accepting threads and processing threads (keep alive); labels K, E, T]
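The split between accepting threads and processing threads named on this slide can be approximated with a producer/consumer queue. The sketch below is only an assumption about the general pattern, not the design of any particular web server: connections are plain strings, `serve` is a hypothetical function, and a sentinel object tells each processing thread when to stop.

```python
import queue
import threading
from typing import List

def serve(connections: List[str], num_workers: int = 4) -> List[str]:
    pending: "queue.Queue[object]" = queue.Queue()
    handled: List[str] = []
    lock = threading.Lock()
    stop = object()  # sentinel: one per processing thread

    def acceptor() -> None:
        # Accepting thread: only enqueues connections, never processes them.
        for conn in connections:
            pending.put(conn)
        for _ in range(num_workers):
            pending.put(stop)

    def worker() -> None:
        # Processing thread: takes connections off the queue until told to stop.
        while True:
            conn = pending.get()
            if conn is stop:
                break
            response = f"handled {conn}"
            with lock:
                handled.append(response)

    threads = [threading.Thread(target=acceptor)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(handled)
```

Keeping the accepting thread free of processing work means new connections can be queued even while all processing threads are busy.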
Conclusions & Contributions Stateless workers make TACC for Internet content easy to implement. Large-scale network services can be built on BASE principles. References: www.theServerSide.com --- for cluster architecture; PRAMATI web server, Resin, Tomcat --- for connection scalability