Enhance Features and Performance of Content Switches
Chandra Prakash
Department of Computer Science, University of Colorado at Colorado Springs
9/18/2018
Outline of the Talk
- Introduction
- Existing Content Switch related products/techniques
- Basic Architecture of Content Switch (CS)
- TCP Delayed Binding and proposed improvements
- Performance results of various schemes for improving TCP Delayed Binding
- Handle multiple requests in HTTP Keep-Alive connection
- Enhancements to CS
  - Handling multiple packets of an HTTP request
  - Handling different data encoding formats
  - Improved XML rule matching
- High-Availability of Linux CS Cluster
- Conclusion
- Future Work
Content Switch-based Cluster
[Figure: a client (CIP) connects through the Internet/WAN/LAN to the virtual server/content switch (VIP), which fronts Real Server1 (RIP1), Real Server2 (RIP2), and Real Server3 (RIP3).]
- CIP: Client IP Address
- VIP: Virtual IP Address
- RIP: Real Server IP Address
Existing Content Switch Related Products
- Alteon's A180e/A184 series products support URL-based server load balancing
- F5's Big-IP product supports load balancing and content switching
- Foundry Networks' ServerIron product supports URL, Cookie, and SSL Session ID-based switching
- Intel's XML accelerator products can distribute Web load based on XML tag values
Existing Content Switch Related Techniques
- MAC address translation (MAT)
- MAC multicast
- Half network address translation (HNAT), also known as NAT in the Linux Virtual Server Project (http://www.linuxvirtualserver.org)
- Full network address translation (FNAT)
- IP tunneling
MAC Address Translation
[Figure: client traffic and server-to-client traffic paths among the virtual server, the LAN, and three real servers.]
- virtual server: VIP 128.198.192.182
- real server 1: ethernet IP 128.198.192.1, loopback IP 128.198.192.182
- real server 2: ethernet IP 128.198.192.2, loopback IP 128.198.192.182
- real server 3: ethernet IP 128.198.192.3, loopback IP 128.198.192.182
MAC Multicast
[Figure: client traffic and server-to-client traffic paths among a switch, the LAN, and real servers 1-3 (IP1, IP2, IP3), which form a MAC multicast group.]
Half NAT
[Figure: client traffic and server-to-client traffic paths among the virtual server, the LAN, and three real servers.]
- virtual server: VIP 128.198.192.182
- real server 1: ethernet IP 128.198.192.1, default GW 128.198.192.182
- real server 2: ethernet IP 128.198.192.2, default GW 128.198.192.182
- real server 3: ethernet IP 128.198.192.3, default GW 128.198.192.182
Full NAT
[Figure: client traffic and server-to-client traffic paths among the virtual server (VIP) and real servers 1-3 (IP1, IP2, IP3).]
IP Tunneling
[Figure: client traffic and server-to-client traffic paths among the virtual server (VIP) and real servers 1-3 (IP1, IP2, IP3). A packet destined for a real server is encapsulated at the virtual server and decapsulated at the real server; the encapsulated packet is labeled VIP, RIP, VIP, client packet, and the decapsulated packet is labeled VIP.]
Basic Operations of Content Switching
[Figure: block diagram. Incoming packets pass through Packet Classification, Header Content Extraction, the CS Rule Matching Algorithm, and Packet Routing (Load Balancing) before being forwarded to the servers. Other blocks: CS Rule Editor, CS Rules, Network Path Info, Server Load Status, Load Balancing Repository.]
CS: Content Switching
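TCP Delayed Binding (Basic Scheme)
Participants: client, content switch, server
- step 1: client -> content switch: SYN(CSEQ)
- step 2: content switch -> client: SYN(DSEQ)/ACK(CSEQ+1)
- step 3: client -> content switch: ACK(DSEQ+1)
- step 4: client -> content switch: DATA(CSEQ+1)/ACK(DSEQ+1)
- step 5: content switch -> server: SYN(CSEQ)
- step 6: server -> content switch: SYN(SSEQ)/ACK(CSEQ+1)
- step 7: content switch -> server: ACK(SSEQ+1)
- step 8: content switch -> server: DATA(CSEQ+1)/ACK(SSEQ+1)
- step 9: server -> content switch: DATA(SSEQ+1)/ACK(CSEQ+lenR+1); content switch -> client: DATA(DSEQ+1)/ACK(CSEQ+lenR+1)
- step 10: client -> content switch: ACK(DSEQ+lenD+1); content switch -> server: ACK(SSEQ+lenD+1)
lenR: size of the HTTP request. lenD: size of the returned document.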
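In steps 9 and 10 the content switch has to rewrite numbers between the server's sequence space (SSEQ) and the one it offered the client (DSEQ). The following is a minimal Python sketch of that conversion, not the implementation from the talk: the function names and sample values are hypothetical, and the modulo 2^32 wraparound is added because TCP sequence numbers are 32-bit.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def server_seq_to_client(seq: int, sseq: int, dseq: int) -> int:
    """Map a server sequence number into the DSEQ space seen by the client."""
    return (dseq + (seq - sseq)) % SEQ_MOD


def client_ack_to_server(ack: int, sseq: int, dseq: int) -> int:
    """Map a client acknowledgment number into the SSEQ space seen by the server."""
    return (sseq + (ack - dseq)) % SEQ_MOD


# Illustrative values only (assumptions, not measurements from the talk).
SSEQ, DSEQ, LEN_D = 9000, 5000, 1200

# Step 9: DATA(SSEQ+1) from the server is forwarded as DATA(DSEQ+1).
first_data_seq = server_seq_to_client(SSEQ + 1, SSEQ, DSEQ)

# Step 10: ACK(DSEQ+lenD+1) from the client is forwarded as ACK(SSEQ+lenD+1).
final_ack = client_ack_to_server(DSEQ + LEN_D + 1, SSEQ, DSEQ)
```

The client's own sequence numbers need no conversion in this scheme because the content switch reuses CSEQ when it opens the connection to the server in step 5.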
Pre-Allocate Scheme if Guess is Correct
Participants: client, content switch, pre-allocated server
- step 1: client -> content switch -> server: SYN(CSEQ)
- step 2: server -> content switch -> client: SYN(SSEQ)/ACK(CSEQ+1)
- step 3: client -> content switch -> server: ACK(SSEQ+1)
- step 4: client -> content switch -> server: DATA(CSEQ+1)/ACK(SSEQ+1)
- step 5: server -> content switch -> client: DATA(SSEQ+1)/ACK(CSEQ+lenR+1)
- step 6: client -> content switch -> server: ACK(SSEQ+lenD+1)
The routing decision is guessed from IP address, port number, and history.
Advantages:
- Faster than TCP delayed binding
- Possible direct route between client and server
- Reduced session processing overhead: no need to convert the server sequence number
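Pre-Allocate Scheme if Guess is Wrong
Participants: client, content switch, pre-allocated server, right server
- step 1: client -> content switch -> pre-allocated server: SYN(CSEQ)
- step 2: pre-allocated server -> content switch -> client: SYN(SSEQ)/ACK(CSEQ+1)
- step 3: client -> content switch -> pre-allocated server: ACK(SSEQ+1)
- step 4: client -> content switch -> pre-allocated server: DATA(CSEQ+1)/ACK(SSEQ+1)
- step 5: pre-allocated server -> content switch: DATA(SSEQ+1) (server sent HTTP 404)
- step 6: content switch -> pre-allocated server: RST
- step 7: content switch -> right server: SYN(CSEQ)
- step 8: right server -> content switch: SYN(RSEQ)/ACK(CSEQ+1)
- step 9: content switch -> right server: ACK(RSEQ+1)
- step 10: content switch -> right server: DATA(CSEQ+1)/ACK(RSEQ+1)
- step 11: right server -> content switch: DATA(RSEQ+1)/ACK(CSEQ+lenR+1); content switch -> client: DATA(SSEQ+1)/ACK(CSEQ+lenR+1)
- step 12: client -> content switch: ACK(SSEQ+lenD+1); content switch -> right server: ACK(RSEQ+lenD+1)
Sequence number conversion is now needed for the right server.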
Filter Process Scheme
Participants: client, content switch, filter process (runs on the server), server
- step 1: client -> content switch: SYN(CSEQ)
- step 2: content switch -> client: SYN(DSEQ)/ACK(CSEQ+1)
- step 3: client -> content switch: ACK(DSEQ+1)
- step 4: client -> content switch: DATA(CSEQ+1)/ACK(DSEQ+1)
- step 5a: SYN(CSEQ)
- step 5b: Migrate (Data, CSEQ, DSEQ)
- step 6: SYN(SSEQ)/ACK(CSEQ+1)
- step 7: ACK(SSEQ+1)
- step 8: DATA(CSEQ+1)/ACK(SSEQ+1)
- step 9: DATA(SSEQ+1)/ACK(CSEQ+lenR+1) on the server side; DATA(DSEQ+1)/ACK(CSEQ+lenR+1) toward the client
- step 10: ACK(DSEQ+lenD+1) from the client; ACK(SSEQ+lenD+1) on the server side
Performance Metrics
- Processing time vs. document size for GET requests
- Processing time vs. document size for POST requests
- Webbench results for each scheme, obtained by varying the delay and the number of threads sending requests
- Plot of max sustainable requests/sec vs. number of rules
- Plot of max throughput in bytes/sec vs. number of rules
Benchmark Configuration
- fladnag.uccs.edu: content switch, 128.198.192.184, Linux 2.2.16-3
- vinci.uccs.edu: real server 1, 128.198.192.193, Linux 2.2.16-3
- gandalf.uccs.edu: real server 2, 128.198.192.194, Linux 2.2.16-3
- dilbert.uccs.edu: client, 128.198.192.195, Windows NT 4.0
For the plots of processing time vs. document size, a Perl script sent GET and POST requests with varying request and response sizes.
For the response time and throughput measurements, Webbench was used.
Processing Time vs. Response Size for GET Request
Processing Time vs. Request Size for POST Request
Comparison of Overall Webbench Requests/Second Metric
Comparison of Overall Webbench Throughput (Bytes/Second) Metric
Requests/Sec vs. Number of CS Rules
Throughput in Bytes/Sec vs. Number of CS Rules
Handling Multiple Requests in a Keep-Alive Connection
- Determine when a new request arrives
  - Verify that the previous request has been completely received
  - TCP payload size is > 0
  - Key assumption: the client sends only one outstanding request at a time, i.e., requests are not pipelined.
- Reuse connections
  - Once a connection is established, store its control information in a hash table keyed by real server address.
Keep-Alive Connection Hash Table
For each real server, the hash table entry stores the following parameters:
- rs_addr
- cli_str_seq
- cli_str_ack_seq
- rs_last_next_seq
- rs_last_ack_seq
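One way to picture this table is as a dictionary keyed by real server address. The Python sketch below is illustrative only: the class and method names are hypothetical, the field names are the parameters listed above, and keeping one table per client connection is an assumption, since the slides do not say how the table is scoped.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class KeepAliveEntry:
    """Control information for one established real-server connection."""
    rs_addr: str            # real server address, also the hash key
    cli_str_seq: int
    cli_str_ack_seq: int
    rs_last_next_seq: int
    rs_last_ack_seq: int


class KeepAliveTable:
    """Hypothetical per-client-connection table of reusable server connections."""

    def __init__(self) -> None:
        self._entries: Dict[str, KeepAliveEntry] = {}

    def store(self, entry: KeepAliveEntry) -> None:
        # Called once the connection to entry.rs_addr is established.
        self._entries[entry.rs_addr] = entry

    def lookup(self, rs_addr: str) -> Optional[KeepAliveEntry]:
        # Returns None when no connection to rs_addr exists yet.
        return self._entries.get(rs_addr)


table = KeepAliveTable()
table.store(KeepAliveEntry("128.198.192.193", 1000, 2000, 7000, 3000))
reused = table.lookup("128.198.192.193")   # existing connection can be reused
missing = table.lookup("128.198.192.194")  # would require a new connection
```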
Client to Real Server Sequence Translation in a Keep-Alive Connection
- Sequence number of a packet sent by the client and forwarded to the current real server:
  rs_last_ack_seq + (cli_cur_seq - cli_str_seq)
  Here cli_cur_seq is the sequence number of the currently forwarded client packet.
- Acknowledgment number of a packet sent by the client and forwarded to the current real server:
  rs_last_next_seq + (cli_cur_ack_seq - cli_str_ack_seq)
  Here cli_cur_ack_seq is the acknowledgment number of the currently forwarded client packet.
Real Server to Client Sequence Translation in a Keep-Alive Connection
- Sequence number of a packet sent by the real server and forwarded to the client:
  cli_str_ack_seq + (rs_cur_seq - rs_last_next_seq)
  Here rs_cur_seq is the sequence number of the currently forwarded real server packet.
- Acknowledgment number of a packet sent by the real server and forwarded to the client:
  cli_str_seq + (rs_cur_ack_seq - rs_last_ack_seq)
  Here rs_cur_ack_seq is the acknowledgment number of the currently forwarded real server packet.
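The four translation formulas for a Keep-Alive connection can be written as two small functions, one per direction. This is a sketch under stated assumptions rather than the kernel code from the talk: the function names and sample numbers are hypothetical, the field names are those of the hash table entry, and the modulo 2^32 step is added to account for 32-bit TCP sequence wraparound.

```python
from dataclasses import dataclass
from typing import Tuple

SEQ_MOD = 2 ** 32  # TCP sequence and acknowledgment numbers are 32-bit


@dataclass
class KeepAliveEntry:
    cli_str_seq: int
    cli_str_ack_seq: int
    rs_last_next_seq: int
    rs_last_ack_seq: int


def client_to_server(e: KeepAliveEntry, cli_cur_seq: int,
                     cli_cur_ack_seq: int) -> Tuple[int, int]:
    """Translate (seq, ack) of a client packet for the current real server."""
    seq = (e.rs_last_ack_seq + (cli_cur_seq - e.cli_str_seq)) % SEQ_MOD
    ack = (e.rs_last_next_seq + (cli_cur_ack_seq - e.cli_str_ack_seq)) % SEQ_MOD
    return seq, ack


def server_to_client(e: KeepAliveEntry, rs_cur_seq: int,
                     rs_cur_ack_seq: int) -> Tuple[int, int]:
    """Translate (seq, ack) of a real server packet for the client."""
    seq = (e.cli_str_ack_seq + (rs_cur_seq - e.rs_last_next_seq)) % SEQ_MOD
    ack = (e.cli_str_seq + (rs_cur_ack_seq - e.rs_last_ack_seq)) % SEQ_MOD
    return seq, ack


# Illustrative numbers only (assumptions).
entry = KeepAliveEntry(cli_str_seq=1000, cli_str_ack_seq=2000,
                       rs_last_next_seq=7000, rs_last_ack_seq=3000)
to_server = client_to_server(entry, 1100, 2050)   # (3100, 7050)
to_client = server_to_client(entry, 7050, 3100)   # (2050, 1100)
```

The two directions are inverses of each other: a server packet whose sequence and acknowledgment numbers mirror the translated client packet maps back to the client's original acknowledgment and sequence numbers.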
Handling Multiple Packets of an HTTP Request
- A request may span multiple TCP segments, which requires queuing incoming packets
- Determining when a request has been completely received requires parsing the HTTP header content, e.g., the “Content-Length” header in requests like PUT and POST
- The content switch must keep in sync with the client and server TCP
HTTP request fragmentation example, where TCP Segment n contains:
  POST /cgi-bin/cs622/purchase.pl HTTP/1.0\r\n
  Referer: http://archie.uccs.edu/~acsd/lcs/xmldemo.html\r\n
  Connection: Keep-Alive\r\n
  Content-type: application/x-www-form-urlencoded\r\n
  Content-length: 7
and TCP Segment n+1 contains:
  53\r\n
  data (753 bytes)
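The completeness check described here can be sketched as a function over the bytes queued so far. This Python example is an illustration under assumptions, not the content switch code: `request_complete` is a hypothetical name, the request is assumed to separate headers from body with a blank line (`\r\n\r\n`), and a missing Content-Length is treated as an empty body.

```python
def request_complete(buffered: bytes) -> bool:
    """Return True once the queued segments hold a whole HTTP request."""
    header_end = buffered.find(b"\r\n\r\n")
    if header_end < 0:
        return False  # header block still incomplete, keep queuing

    header_block = buffered[:header_end].decode("iso-8859-1")
    content_length = 0  # assumption: no Content-Length means no body
    for line in header_block.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())

    body = buffered[header_end + 4:]
    return len(body) >= content_length


# The header value "753" is split across two segments, as in the example above.
segment_n = (b"POST /cgi-bin/cs622/purchase.pl HTTP/1.0\r\n"
             b"Connection: Keep-Alive\r\n"
             b"Content-type: application/x-www-form-urlencoded\r\n"
             b"Content-length: 7")
segment_n1 = b"53\r\n\r\n" + b"x" * 753

after_first = request_complete(segment_n)               # False
after_both = request_complete(segment_n + segment_n1)   # True
```

Parsing only after the header terminator has arrived avoids misreading the truncated value "7" in segment n as the body length.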
Handling Different Data Encodings in XML Document
Typically there are two encoding techniques:
- text/xml
  - Consists of plain ASCII text with no special encoding
- x-www-form-urlencoded
  - Consists of text where special characters are encoded as “%XX”, where XX is the hexadecimal value of the special character. For example, the newline and left angle bracket (‘<’) characters are encoded as “%0A” and “%3C” respectively.
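Before rule matching, a urlencoded body has to be turned back into plain XML text. The Python sketch below shows one way to do that with the standard library; `normalize_body` is a hypothetical helper, not part of the content switch, and treating "+" as a space follows the usual form-encoding convention rather than anything stated on the slide.

```python
from urllib.parse import unquote_plus


def normalize_body(body: str, content_type: str) -> str:
    """Return the request body as plain text, decoding %XX escapes if needed."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return unquote_plus(body)  # "%3C" -> "<", "%0A" -> newline, "+" -> " "
    return body  # text/xml and other types are passed through unchanged


decoded = normalize_body("%3Cpurchase%3E%0A", "application/x-www-form-urlencoded")
untouched = normalize_body("<purchase>\n", "text/xml")
```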
Improved Rule Specification
  <purchase>
    <customerID>111222333</customerID>
    <item>
      <productID>309121544</productID>
      <unitPrice>5000</unitPrice>
    </item>
    <item>
      <productID>309121538</productID>
      <unitPrice>200</unitPrice>
    </item>
  </purchase>
- Many tags with the same name make rule specification ambiguous, e.g., the item tag in the above XML sample document
- A rule specification like “purchase:1:item:2:unitPrice:1 > 200” makes it possible to access the unitPrice tag of the second item
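The indexed path in such a rule can be resolved by walking the document one tag:index pair at a time. The following Python sketch uses the standard `xml.etree.ElementTree` module and is only an illustration: `select` is a hypothetical helper, indices are assumed to be 1-based counts among same-named siblings, and the sample document assumes each product sits inside its own item element.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DOC = """
<purchase>
  <customerID>111222333</customerID>
  <item><productID>309121544</productID><unitPrice>5000</unitPrice></item>
  <item><productID>309121538</productID><unitPrice>200</unitPrice></item>
</purchase>
"""


def select(root: ET.Element, path: str) -> Optional[ET.Element]:
    """Resolve a path such as 'purchase:1:item:2:unitPrice:1' to an element."""
    parts = path.split(":")
    pairs = [(parts[i], int(parts[i + 1])) for i in range(0, len(parts), 2)]

    root_tag, root_index = pairs[0]
    if root.tag != root_tag or root_index != 1:
        return None  # a document has exactly one root element

    node = root
    for tag, index in pairs[1:]:
        matches = node.findall(tag)  # direct children with this tag name
        if index < 1 or index > len(matches):
            return None
        node = matches[index - 1]
    return node


root = ET.fromstring(DOC)
second_price = select(root, "purchase:1:item:2:unitPrice:1")
first_price = select(root, "purchase:1:item:1:unitPrice:1")

# Evaluate the "> 200" condition of the rule against each selected element.
second_matches = second_price is not None and int(second_price.text) > 200
first_matches = first_price is not None and int(first_price.text) > 200
```

With this sample document the rule does not fire for the second item, whose unitPrice equals 200, while the same condition on the first item would hold.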
High-Availability of Linux Content Switch (HA-LCS)
- Address issues related to:
  - fault tolerance of the virtual server
  - fault tolerance of real servers or services on the real servers
  - high availability of data files (e.g., HTML docs)
- The setup is based on the existing high-availability configuration of LVS, with the following key software components:
  - Heartbeat (for fault tolerance of the virtual server)
  - Mon (for fault tolerance of services on the real servers)
  - Coda (for high availability of data files)
HA-LCS Architecture
[Figure: a user reaches a virtual server cluster made up of a primary and a backup virtual server, each labeled with mon and heartbeat; the cluster connects over the LAN to real servers 1-3, which use the Coda file system.]
HA-LCS Configuration
- fladnag.uccs.edu: content switch (primary), 128.198.192.184, Linux 2.2.16-3
- walden.uccs.edu: content switch (secondary), 128.198.192.203, Linux 2.2.16-3
- vinci.uccs.edu: real server 1 (coda client), 128.198.192.193, Linux 2.4.2-2
- gandalf.uccs.edu: real server 2 (coda client), 128.198.192.194, Linux 2.4.2-2
- wait.uccs.edu: coda server, 128.198.192.202, Linux 2.4.2-2
Unique Constraints Imposed in HA-LCS as Compared to HA-LVS
- In LCS, switching rules based on application content are hard-wired in the kernel rule module. Changing a switching rule requires:
  - modifying the rule module code to reflect the changed rule
  - compiling the modified rule module
  - removing the old rule module
  - inserting the new rule module
- In LVS, switching rules are based on a simple load balancing policy and can be changed via built-in commands
HA-LCS Configuration Notes
- Heartbeat
  - Set up the “ha.cf” and “haresources” files
  - Note: the IP address specified in the “haresources” file should not be configured via the OS
- Mon
  - Set up “mon.cf” and the real server failure/startup handler script “wk_up.ksh”
- Coda
  - Set up the coda server using “vice-setup” and configure the client
  - When creating a “coda volume” on the server, the INSTALL.linux setup file says to create the volume as: createvol_rep coda:root E0000100 /vicepa, where “coda:root” is the root volume. The online coda help, on the other hand, says the root volume should be set as “coda.root”.
  - When creating the coda admin, do not specify user id 1, even though the instructions say user id 1 can be used for the coda admin.
Performance of Different File Systems with HA-LCS
Conclusion
- Implemented three schemes to study improvements on TCP delayed binding. The pre-allocate scheme gave the best results, followed by the basic scheme and then the filter scheme.
- Implemented a scheme to handle multiple requests arriving on a given connection in a non-pipelined fashion.
- Proposed ways to handle requests sent in a pipelined fashion in a content switch.
- Addressed issues related to content switch processing: handling multiple packets in a request, improving rule matching, and handling different data encoding formats.
- Implemented a highly available Linux Content Switch (LCS) system.
- Identified key issues related to fault tolerance in LCS and implemented the solutions.
Future Work
- Improve the reliability of LCS by moving the content switch processing from the IP layer to the Transport layer.
- Enhance load balancing policies by considering network path status and server load.
- Improve performance by reusing connections in a connection pool to avoid setup overhead.
- Utilize the latest protocols, e.g., ASAP/ENRP, for managing fault-tolerant clusters.