Transparent Caching The art of caching network traffic without requiring user / browser side configuration.
Who am I?
Overview What is transparent caching, and why use it? Tools available How to set it up Common problems Alternatives
What is transparent caching? Transparently proxying / caching network traffic without requiring user configuration or knowledge. A way to simplify caching for the end user Forces all users to use the cache.
Why use transparent caching? Ease of use. No configuration required by the end user. Catching all users. No users can bypass the cache.
Reasons to not use it It is not a magical tool without problems. Technical issues –Networking issues. “Voodoo magic” –Stability / Reliability –Only port 80 –FTP not supported –To be efficient modern browsers are required
Reasons to not use it (cont.) Political reasons –What is an internet connection? –Privacy No user control. Users can’t bypass the cache.
What is involved? TCP level routing Reverse NAT or related technology to hijack port 80 traffic. A proxy with some knowledge of transparent proxying A cache
Tools available TCP level Routing –Policy routing / route maps –TCP / layer 4 switches with or without NAT –Cisco WCCP Host level NAT –Linux firewall code –FreeBSD firewall code –IP-Filter
Using policy routing to redirect traffic A standard router configured to route TCP port 80 to the cache server. [Diagram: Users 1..3, Router, Cache Server, Internet]
Policy routing (cont.) Benefits –Can usually be deployed without extra hardware Drawbacks –Only static routing –No fault tolerance. Port 80 traffic disrupted if cache server fails. –More CPU load on the router
Running the cache on a router Small network / firewall. Host used as router. [Diagram: Users 1..3, Router / Cache Server, Internet]
Caching router / firewall Benefits –Less hardware required –Well suited for small to medium sized firewalls. Drawbacks –Stability / reliability. Can disrupt all communication. –If running on a firewall: make sure the firewall protects the cache software.
Cisco Web Cache Control Protocol (WCCP) Developed by Cisco for Cisco Automated configuration. Proxy servers announce their presence to the router. Load balancing Fault recovery Commercial Licensing required. Not currently an option for free software.
TCP level / layer 4 switching The use of a smart and efficient network device to redirect traffic. [Diagram: Users 1..3, TCP Switch, Cache Server, Router, Internet]
TCP switch benefits –Can bypass the cache if it malfunctions –Good reliability –Can distribute the load on multiple cache servers –Can do the required NAT, allowing the use of any OS on the cache server. –Some do HTTP proxy translation, allowing the use of any proxy software.
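Load distribution and bypass, sketched. The slide does not say how a switch picks a cache server, so the following is only one plausible scheme, assumed here for illustration: hash the destination address so that the same web site keeps landing on the same cache, and forward directly when no cache is healthy. The function name `pick_cache` and the addresses are hypothetical.

```python
import zlib
from typing import Optional, Sequence


def pick_cache(dst_ip: str, healthy_caches: Sequence[str]) -> Optional[str]:
    """Choose a cache server for a connection to dst_ip.

    Returns None when no cache is healthy, meaning "bypass the cache and
    forward the traffic directly". This is an illustrative scheme, not the
    algorithm of any particular switch.
    """
    if not healthy_caches:
        return None
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    index = zlib.crc32(dst_ip.encode("ascii")) % len(healthy_caches)
    return healthy_caches[index]


if __name__ == "__main__":
    caches = ["10.0.0.11", "10.0.0.12"]  # hypothetical cache servers
    print(pick_cache("192.0.2.80", caches))
    print(pick_cache("192.0.2.80", []))  # no healthy cache: bypass
```

Hashing on the destination rather than the client keeps each site's objects on one cache, which avoids storing the same object on every server.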
TCP switch drawbacks –One more expensive box to purchase –When the switch does the NAT, the proxy software needs switch vendor specific support to handle old browsers.
Request formats Proxy request TCP connection from client to proxy GET HTTP/ Server request TCP connection from client to server IP GET /path/to/file HTTP/1.0 Host: (if supported)...
Problems related to request formats A transparent proxy must reconstruct the URL of the request. Host: headers not always available. HTTP/1.1 feature or 1.0 add-on. IP address from NAT translation.
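URL reconstruction, sketched. A minimal illustration of the step described above, assuming the proxy is handed the request line, the parsed headers, and (optionally) the original destination address recovered from the NAT layer. The function name `reconstruct_url`, its parameters, and the example hosts are hypothetical; a real proxy does considerably more validation.

```python
from typing import Mapping, Optional


def reconstruct_url(request_line: str,
                    headers: Mapping[str, str],
                    original_dst_ip: Optional[str] = None) -> str:
    """Rebuild the absolute URL of an intercepted HTTP request."""
    parts = request_line.split()
    if len(parts) < 2:
        raise ValueError("malformed request line: %r" % request_line)
    target = parts[1]
    if target.lower().startswith("http://"):
        # Proxy-style request: the client already sent the full URL.
        return target
    # Server-style request: only the path is present, so the host must
    # come from the Host: header (header names are case-insensitive).
    host = next((value.strip() for name, value in headers.items()
                 if name.lower() == "host"), "")
    if not host:
        # Old browser without Host: fall back to the address the client
        # was really connecting to, if the NAT layer can supply it.
        if original_dst_ip is None:
            raise ValueError("no Host: header and no original destination")
        host = original_dst_ip
    return "http://" + host + target


if __name__ == "__main__":
    print(reconstruct_url("GET /path/to/file HTTP/1.0",
                          {"Host": "www.example.com"}))
    print(reconstruct_url("GET /path/to/file HTTP/1.0", {}, "192.0.2.10"))
```

The fallback branch is why the IP address from the NAT translation matters: without it, a request from a browser that sends no Host: header cannot be turned back into a URL at all.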
What happens at the TCP level? Normal communication / proxying –IP based routing –TCP is end-to-end –One IP address, one Host Transparent proxying –TCP based routing –TCP is no longer end-to-end –One IP address, “multiple hosts”
Problems at the TCP level TCP normally relies on two IP protocols: TCP and ICMP. Of these, only TCP can be reliably redirected. ICMP is required for Path MTU discovery. The TCP connection is reset if a single packet takes another path that bypasses the redirection.
Things to consider when configuring OS level NAT Try not to disturb traffic to/from the cache server host. Make sure that the proxy traffic is not redirected back to the proxy. Be prepared to do packet level traces, preferably from a separate box.
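The redirect decision, sketched. The considerations above boil down to a small rule set: redirect client port 80 traffic, but leave alone anything sent by the proxy itself and anything addressed to the cache host. The model below is an illustration only; the `Packet` type, the function `should_redirect`, and the addresses are hypothetical and do not correspond to any firewall's API.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    protocol: str   # "tcp", "udp", "icmp", ...
    src_ip: str
    dst_ip: str
    dst_port: int


def should_redirect(pkt: Packet, proxy_ip: str, http_port: int = 80) -> bool:
    """Return True if the packet should be diverted to the proxy."""
    if pkt.protocol != "tcp" or pkt.dst_port != http_port:
        return False  # only TCP port 80 is hijacked
    if pkt.src_ip == proxy_ip:
        return False  # the proxy's own fetches must reach the Internet
    if pkt.dst_ip == proxy_ip:
        return False  # traffic for the cache host itself is left alone
    return True


if __name__ == "__main__":
    proxy = "10.0.0.2"  # hypothetical cache server address
    print(should_redirect(Packet("tcp", "10.0.0.50", "192.0.2.80", 80), proxy))
    print(should_redirect(Packet("tcp", proxy, "192.0.2.80", 80), proxy))
```

Forgetting the second rule creates a loop: the proxy's outgoing requests are redirected back to the proxy.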
Recommended steps when building a transparent proxy Think it over. Is it really required? Build and test the proxy server Configure NAT on the proxy server Test it using a local LAN client Set up TCP level routing.
Common problems Communication hangs for some users –Most likely caused by MTU related problems. Connection reset errors –Usually misconfigured NAT or TCP routing. Bad performance –Possibly a CPU bottleneck in the router.
TCP Reset errors (cont.) Error messages seen from the proxy. –TCP routing or NAT affects traffic generated by the proxy. Error messages seen by the browser (popup) –TCP routing or NAT failure, causing some client traffic to bypass the redirection.
Alternatives PAC files Blocking port 80 –Selectively or everything –Possibly with an automated message
Selectively blocking port 80 with a message A good alternative to transparent proxying Uses the same techniques as transparent proxying for hijacking port 80, but only to deliver the instructions.
Blocking port 80, benefits Forces the users to configure their proxy settings Users are automatically provided with configuration instructions when needed. Fewer calls to the support line. Users get the information on why caching is good for them. PAC file allows easy configuration of exceptions
Blocking port 80, drawbacks Not all browsers support proxy settings Users are required to be capable of following instructions.
Summary Transparent caching is a good tool in most configurations to ease user side configuration. It has some important limitations. Not a full replacement for standard proxying. For many sites, automatically delivered instructions on how to configure proxy settings achieve the same goals.
Sources for more information –Squid FAQ –Router manuals on policy routing –IP-Filter home page –Linux 2.0 ipfwadm –Linux 2.2 ipchains
Questions
Example Cisco IP policy route map Policy route map, routing port 80 (www) to server
    ! Enable policy routing
    interface Ethernet0
     ip policy route-map proxy-redirect
    ! Route to proxy server
    route-map proxy-redirect permit 10
     match ip address 110
     set ip next-hop
    ! Only policy route client www traffic
    access-list 110 deny tcp any any neq www
    access-list 110 deny tcp host any
    access-list 110 permit tcp any any
Example Linux ipfwadm NAT Linux 2.0 redirecting eth0 TCP port 80 to Squid on port 3128
–Kernel options:
    CONFIG_IP_FIREWALL=y
    CONFIG_IP_ALWAYS_DEFRAG=y
–ipfwadm ruleset
    # Accept local traffic
    ipfwadm -I -a accept -W eth0 -D this.host
    # Redirect port 80 to Squid on 3128
    ipfwadm -I -a accept -W eth0 -P tcp -D /0 80 -r 3128
Example Linux ipchains NAT Linux 2.2 redirecting eth0 TCP port 80 to Squid on port 3128
–Kernel options:
    CONFIG_IP_FIREWALL=y
    CONFIG_IP_ALWAYS_DEFRAG=y
–ipchains ruleset
    # Accept local traffic
    ipchains -A input -j ACCEPT -i eth0 -d /32
    # Redirect port 80 to Squid on port 3128
    ipchains -A input -j REDIRECT 3128 -i eth0 -p tcp -d /0 80
Example IP-Filter NAT ipnat ruleset redirecting TCP port 80 to Squid on port 3128
    # Redirect direct web traffic to local web server.
    rdr de /32 port 80 -> port 80 tcp
    # Redirect everything else to squid on port 3128
    rdr de /0 port 80 -> port 3128 tcp
Running Squid on Linux
What is Linux Linux is like any other UNIX POSIX standards GNU tools Best of SysV and BSD families
Filesystem performance Too few performance counters for I/O to make any good measurements Asynchronous writes by default (like fastfs on Solaris) noatime mount option
Kernel performance / tuning Memory freelist tuning on smaller systems –/proc/sys/vm/freepages File descriptor limits –Default 256 –Later revisions of 2.2 may allow 1024 –Patches available for higher limits
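Checking the file descriptor limit, sketched. Since every client and server connection uses a descriptor, it helps to confirm what limit a process actually gets before starting the cache. The following is a small illustration using Python's standard `resource` module on a Unix-like system; the helper names `fd_limits` and `describe` are hypothetical, and the numbers reported depend entirely on the kernel and shell settings in use.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

The soft limit is what the process is held to; it can be raised up to the hard limit without special privileges.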
Hands on transparent caching Linux configuration
–Kernel configuration: Firewalling & Transparent proxy support
–ipfwadm configuration
    ipfwadm -I -a accept -D thishost
    ipfwadm -I -a accept -P tcp -D /0 80 -r 3128
Hands on transparent caching (cont.) Squid configuration
    httpd_accel_host virtual
    httpd_accel_uses_host_header on