Web Server Design Week 8 Old Dominion University Department of Computer Science CS 495/595 Spring 2007 Michael L. Nelson <mln@cs.odu.edu> 2/26/07
Encodings
  gzip: extension .gz (sometimes seen as x-gzip)
  compress: extension .Z (sometimes seen as x-compress)
  deflate: the zlib format (RFC 1950) wrapping deflate compression (RFC 1951); not the same thing as a .zip archive
  identity: no encoding at all
  chunked: breaks the body into a series of server-chosen “chunks”; an optimization for dynamically produced content
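The codings listed above can be tried out with a short Python sketch. It is only an illustration: the helper names `encode` and `decode` and the sample body are assumptions, not part of the course material, and the "compress" (LZW) coding is left out because the Python standard library has no module for it.

```python
import gzip
import zlib

# Sample body (made up for this sketch); repetitive text compresses well.
body = b"<html><body>" + b"hello, world " * 50 + b"</body></html>"


def encode(data: bytes, coding: str) -> bytes:
    """Apply one content-coding to a body (hypothetical helper)."""
    if coding in ("gzip", "x-gzip"):
        return gzip.compress(data)   # gzip file format, RFC 1952
    if coding == "deflate":
        return zlib.compress(data)   # zlib format, RFC 1950, around RFC 1951 deflate
    if coding == "identity":
        return data                  # no transformation at all
    raise ValueError("unsupported coding: " + coding)


def decode(data: bytes, coding: str) -> bytes:
    """Undo encode() for the same coding (hypothetical helper)."""
    if coding in ("gzip", "x-gzip"):
        return gzip.decompress(data)
    if coding == "deflate":
        return zlib.decompress(data)
    if coding == "identity":
        return data
    raise ValueError("unsupported coding: " + coding)


for name in ("gzip", "deflate", "identity"):
    wire = encode(body, name)
    print(name, len(body), "->", len(wire))
```

Chunked is not handled here because it frames the message rather than compressing the entity.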
Identity Encoding The default, “no transformation” encoding “applying the identity encoding to a resource is an _________ operation”
Content Encoding vs. Transfer Encoding
  [Figure: the Client sends "GET /bottle1 HTTP/1.1" to the Origin Server, which answers "HTTP/1.1 200 OK"; diagram labels: Transfer-Encoding, Content-Encoding]
  images from: http://www.madeirawineguide.com/madeira_v300/
Content-Encodings “Content-Encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.” section 14.11
Content-Encoding Example (Correct)
  AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80
  Trying 128.82.4.2...
  Connected to xenon.cs.odu.edu.
  Escape character is '^]'.
  HEAD /~mln/pubs/bollenj_adaptive.ps.gz HTTP/1.1
  Host: www.cs.odu.edu
  Connection: close

  HTTP/1.1 200 OK
  Date: Mon, 20 Feb 2006 04:30:25 GMT
  Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4
  Last-Modified: Thu, 25 Jul 2002 16:58:58 GMT
  ETag: "1c16-139ea-3d402e52"
  Accept-Ranges: bytes
  Content-Length: 80362
  Content-Type: application/postscript
  Content-Encoding: x-gzip

  Connection closed by foreign host.
Content-Encoding Example (Incorrect)
  AIHT:~/Desktop mln$ telnet www.cs.odu.edu 80
  Trying 128.82.4.2...
  Connected to xenon.cs.odu.edu.
  Escape character is '^]'.
  HEAD /~mln/pubs/bollenj_adaptive.ps.gz HTTP/1.1
  Host: www.cs.odu.edu
  Connection: close

  HTTP/1.1 200 OK
  Date: Mon, 26 Feb 2007 02:06:25 GMT
  Server: Apache/2.2.0
  Last-Modified: Thu, 25 Jul 2002 16:58:58 GMT
  ETag: "1c16-139ea-92cab880"
  Accept-Ranges: bytes
  Content-Length: 80362
  Content-Type: application/x-gzip

  Connection closed by foreign host.

  Wrong, Wrong, Wrong!!!!!!!
  The compression is reported as the media type, so the underlying type (application/postscript) is lost and no Content-Encoding header is sent.
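One way a server might keep the two headers separate, as in the correct response, is to derive them from the file name. The sketch below uses `mimetypes.guess_type` from the Python standard library, which reports the media type and the encoding as separate values. The function name `entity_headers` is hypothetical, and a real server would more likely rely on its own configuration tables.

```python
import mimetypes


def entity_headers(path: str) -> dict:
    """Build Content-Type / Content-Encoding from a file name (hypothetical helper)."""
    # guess_type() strips an encoding suffix such as .gz before looking up
    # the media type, so the two properties come back separately.
    media_type, coding = mimetypes.guess_type(path)
    headers = {"Content-Type": media_type or "application/octet-stream"}
    if coding is not None:
        headers["Content-Encoding"] = coding
    return headers


print(entity_headers("bollenj_adaptive.ps.gz"))
```

With this split, the media type continues to describe the PostScript document and the compression is carried in Content-Encoding.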
Transfer Encodings “Transfer-Encoding is a property of the message, not of the entity, and thus MAY be added or removed by any application along the request/response chain.” section 4.3
TE Request Header & Transfer-Encoding Response Header
  Client specifies preferences for transfer encoding in the “TE:” header
    section 14.39
  Server marks the encoding used with the “Transfer-Encoding” header
    section 14.41
  Both headers use the same encoding values available with “Content-Encoding”, plus the special “chunked” encoding; TE additionally accepts the “trailers” keyword
TE & Transfer-Encoding
  dhcp65-74-196-93:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80
  Trying 128.82.4.2...
  Connected to xenon.cs.odu.edu.
  Escape character is '^]'.
  GET / HTTP/1.1
  TE: gzip;q=1.0, Trailers
  Host: www.cs.odu.edu

  HTTP/1.1 200 OK
  Date: Mon, 27 Feb 2006 15:52:33 GMT
  Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4
  Transfer-Encoding: chunked
  Content-Type: text/html

  5f6
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
  <!-- saved from url=(0036)http://www.cs.odu.edu/newcssite/new/ -->
  <!-- saved from url=(0019)http://sci.odu.edu/ -->
  [more html deleted]
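A server that wants to honor the TE header first has to split it into codings, q-values, and the trailers keyword. The following is a minimal sketch of that parsing step, assuming a well-formed header value. The function name `parse_te` is hypothetical, and the keyword is matched case-insensitively so that the "Trailers" spelling sent in the session above is accepted.

```python
def parse_te(value: str):
    """Return (accepts_trailers, {coding: qvalue}) for a TE header value."""
    accepts_trailers = False
    codings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # "gzip;q=1.0" -> name "gzip", params ["q=1.0"]
        name, *params = [part.strip() for part in item.split(";")]
        name = name.lower()
        if name == "trailers":
            accepts_trailers = True
            continue
        q = 1.0  # a coding without an explicit q-value defaults to 1
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        codings[name] = q
    return accepts_trailers, codings


print(parse_te("gzip;q=1.0, Trailers"))
```

The server in the session ignored the gzip preference and answered with plain chunked encoding, which a client must always be prepared to accept.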
Time / Space Tradeoff
  Hard to find examples of compression used in transfer encoding
    http://www.webreference.com/internet/software/servers/http/compression/2.html
    http://www-128.ibm.com/developerworks/web/library/wa-httpcomp/
  idea: for very heavy volume web servers, answering the request quickly is more important than preserving bandwidth
  Complexity of management seems to be the limiting factor in compression with content encodings
Chunked Encoding “The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.” sections 3.6.1, 19.4.6
Chunked Encoding Example
  AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80 | tee 1-6.out
  Trying 128.82.4.2...
  Connected to xenon.cs.odu.edu.
  Escape character is '^]'.
  GET /~mln HTTP/1.1
  Connection: close
  Host: www.cs.odu.edu

  Connection closed by foreign host.
  HTTP/1.1 301 Moved Permanently
  Date: Mon, 09 Jan 2006 19:32:24 GMT
  Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4
  Location: http://www.cs.odu.edu/~mln/
  Connection: close
  Transfer-Encoding: chunked
  Content-Type: text/html; charset=iso-8859-1

  12e
  <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
  <HTML><HEAD>
  <TITLE>301 Moved Permanently</TITLE>
  </HEAD><BODY>
  <H1>Moved Permanently</H1>
  The document has moved <A HREF="http://www.cs.odu.edu/~mln/">here</A>.<P>
  <HR>
  <ADDRESS>Apache/1.3.26 Server at www.cs.odu.edu Port 80</ADDRESS>
  </BODY></HTML>

  0
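The framing seen in this response (a hexadecimal size line, the chunk data, and a final zero-length chunk) can be sketched as a small Python generator. This is an illustration under simple assumptions: the function name `chunked` is hypothetical, no chunk extensions or trailers are produced, and the caller supplies the pieces of the body as byte strings.

```python
def chunked(pieces):
    """Yield the wire bytes of a chunked body for an iterable of byte strings."""
    for piece in pieces:
        if piece:  # skip empty pieces: a zero-length chunk would end the body early
            # size in hex, CRLF, chunk data, CRLF
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk followed by an empty trailer


wire = b"".join(chunked([b"hello", b" world!"]))
print(wire)
```

A 302-byte piece would be announced as `12e`, the size line seen in the 301 response above.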
Chunked Encoding Example 2
  AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80
  Trying 128.82.4.2...
  Connected to xenon.cs.odu.edu.
  Escape character is '^]'.
  GET / HTTP/1.1
  Host: www.cs.odu.edu

  HTTP/1.1 200 OK
  Date: Tue, 21 Feb 2006 03:54:31 GMT
  Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4
  Transfer-Encoding: chunked
  Content-Type: text/html

  5f6
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
  <!-- saved from url=(0036)http://www.cs.odu.edu/newcssite/new/ -->
  <!-- saved from url=(0019)http://sci.odu.edu/ -->
  <HTML xmlns:st1 = "urn:schemas-microsoft-com:office:smarttags"><HEAD><TITLE>Department Of Computer Science</TITLE>
  <META http-equiv=Content-Type content="text/html; charset=windows-1252">
  <META content="College of Sciences" name=Description>
  <META content="College of Sciences" name=Keywords><LINK title=style href="files/style.css" type=text/css rel=stylesheet>
  [lots of html deleted]

  [demo this example to see the various “chunks”]
Trailer Response Header
  The “Trailer” response header lets the client know that additional headers will appear at the end of the chunked response
    section 14.40
  headers can be reconstructed by downstream servers
  headers that can never be trailers:
    Transfer-Encoding
    Content-Length
    Trailer
Trailer Example
  HTTP/1.1 200 OK
  Date: Mon, 22 Mar 2004 11:15:03 GMT
  Content-Type: text/html
  Content-Length: 129
  Expires: Sat, 27 Mar 2004 21:12:00 GMT

  <html><body><p>The file you requested is 3,400 bytes long and was last modified: Sat, 20 Mar 2004 21:12:00 GMT.</p></body></html>

  HTTP/1.1 200 OK
  Date: Mon, 22 Mar 2004 11:15:03 GMT
  Content-Type: text/html
  Transfer-Encoding: chunked
  Trailer: Expires

  29
  <html><body><p>The file you requested is 
  5
  3,400
  23
   bytes long and was last modified: 
  1d
  Sat, 20 Mar 2004 21:12:00 GMT
  13
  .</p></body></html>
  0
  Expires: Sat, 27 Mar 2004 21:12:00 GMT

  “Expires:” response header covered in section 14.21
  Examples from: http://www.tcpipguide.com/free/t_HTTPDataLengthIssuesChunkedTransfersandMessageTrai-3.htm
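On the receiving side, a client has to reassemble the chunks and then read any trailer fields. The sketch below shows one way to do that in Python for a well-formed message. The function name `read_chunked` is hypothetical, and the sample bytes follow the chunked example from the slide, written with CRLF line endings and with the zero-length last-chunk that section 3.6.1 requires before the trailer.

```python
import io


def read_chunked(stream):
    """Decode a chunked body from a binary stream; return (body, trailers)."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        # The size is hexadecimal; anything after ";" is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break                 # last-chunk reached
        body += stream.read(size)
        stream.readline()         # consume the CRLF that ends the chunk data
    trailers = {}
    while True:
        line = stream.readline().rstrip(b"\r\n")
        if not line:
            break                 # blank line (or end of stream) ends the trailer
        name, _, value = line.partition(b":")
        trailers[name.decode("ascii").strip()] = value.decode("latin-1").strip()
    return bytes(body), trailers


sample = io.BytesIO(
    b"29\r\n<html><body><p>The file you requested is \r\n"
    b"5\r\n3,400\r\n"
    b"23\r\n bytes long and was last modified: \r\n"
    b"1d\r\nSat, 20 Mar 2004 21:12:00 GMT\r\n"
    b"13\r\n.</p></body></html>\r\n"
    b"0\r\n"
    b"Expires: Sat, 27 Mar 2004 21:12:00 GMT\r\n"
    b"\r\n"
)
body, trailers = read_chunked(sample)
print(len(body), trailers)
```

The chunk sizes add up to 41 + 5 + 35 + 29 + 19 = 129 bytes, matching the Content-Length of the non-chunked version of the same response.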
Two More Request Headers to Process
User-Agent Request Header
  section 14.43
  “This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations. User agents SHOULD include this field with requests.”
  examples:
    User-Agent: CERN-LineMode/2.15 libwww/2.17b3
    User-Agent: CS 495/595 Spring 2006 Automated Checker (A3)
Referer Request Header
  section 14.36
  “The Referer[sic] request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained (the "referrer", although the header field is misspelled.) The Referer request-header allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance. The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard.”
  examples:
    Referer: http://www.w3.org/hypertext/DataSources/Overview.html
    Referer: /opening-page.php
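For logging, a server might pull both headers out of the request and turn a relative Referer into an absolute URI. The sketch below is one possible approach, not a required design. The function name `log_fields`, the "-" placeholder for a missing header, and the assumption of plain http are all choices made for illustration. Relative values are resolved against the Request-URI, as section 14.36 suggests.

```python
from urllib.parse import urljoin


def log_fields(host: str, request_uri: str, headers: dict) -> tuple:
    """Return (referer, user_agent) for an access log; "-" marks an absent header."""
    # Header field names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "-")
    referer = lowered.get("referer")
    if referer is None:
        return "-", user_agent
    base = "http://%s%s" % (host, request_uri)  # assumes plain http
    return urljoin(base, referer), user_agent   # absolute values pass through unchanged


print(log_fields(
    "www.cs.odu.edu",
    "/~mln/",
    {"Referer": "/opening-page.php",
     "User-Agent": "CERN-LineMode/2.15 libwww/2.17b3"},
))
```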