1 HTTP messages Entities and Encoding Herng-Yow Chen.


1 1 HTTP messages Entities and Encoding Herng-Yow Chen

2 2 Outline
  The format and behavior of HTTP message entities as HTTP containers
  How HTTP describes the size of entity bodies, and what HTTP requires in the way of sizing
  The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly

3 3
  Reversible content encoding transforms the data format to take up less space or be more secure
  Transfer encoding modifies how HTTP ships data to enhance the communication of some kinds of data
  Chunked encoding chops data into multiple pieces to deliver content of unknown length safely

4 4
  The assortment of tags, labels, times, and checksums helps clients get the latest version of requested content
  Ranges are useful for continuing aborted downloads where they left off
  Delta encoding extensions allow clients to request just those parts of a web page that have actually changed since a previously viewed revision

5 5 Checksums of entity bodies are used to detect changes in entity content as it passes through proxies

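6 6 A message is made up of headers and a body
    HTTP/1.0 200 OK
    Server: Netscape_Enterprise/3.6
    Date: Sun, 17 Sep 2000 00:01:05 GMT
    Content-Type: text/plain
    Content-Length: 18

    Hi! I'm a message!
  The Content-Type and Content-Length lines are the entity headers, "Hi! I'm a message!" is the entity body, and together they form the entity.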

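7 7 HTTP/1.1 defines 10 entity headers: Content-Type, Content-Length, Content-Language, Content-Encoding, Content-Location, Content-Range, Content-MD5, Last-Modified, Expires, Allow. Two related headers are often listed alongside them: ETag (a response header) and Cache-Control (a general header).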

8 8 Entity Bodies

9 9 Why is Content-Length important?
  Detecting truncation: without a length, the receiver cannot tell a complete body from a connection that closed early.
  An incorrect Content-Length causes problems of its own.
  On a persistent connection, it marks where one entity body ends and the next message begins.
  Chunked encoding is an alternative: the data is sent as a series of chunks, each with a specified chunk size.
  When a content encoding is applied, Content-Length refers to the encoded body, not the length of the original, unencoded body.
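
As a rough illustration of truncation detection, the sketch below compares a declared Content-Length with the number of body bytes actually received. The function name check_body_length and the plain-dict header representation are assumptions made for this example, not part of the slides.

```python
def check_body_length(headers, body):
    """Compare the declared Content-Length with the bytes actually received.

    `headers` is a plain dict and `body` is the raw (possibly content-encoded)
    entity body; both representations are assumptions for this sketch.
    """
    declared = headers.get("Content-Length")
    if declared is None:
        # No header: the length must be found another way
        # (for example chunked encoding or connection close).
        return "unknown"
    expected = int(declared)
    if len(body) < expected:
        return "truncated"
    if len(body) > expected:
        # On a persistent connection the extra bytes belong to the next message.
        return "overrun"
    return "complete"


print(check_body_length({"Content-Length": "18"}, b"Hi! I'm a message!"))
print(check_body_length({"Content-Length": "18"}, b"Hi! I'm"))
```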

10 10 Entity Digest: Content-MD5
  Used to check message integrity.
  Can also be used as a key into a hash table to quickly locate documents and reduce duplicate storage of content.
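
A minimal sketch of how a Content-MD5 value can be produced and checked: per RFC 1864 the header carries the base64 encoding of the 128-bit MD5 digest of the body. The helper names content_md5 and verify_content_md5 are hypothetical.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value: base64 of the raw MD5 digest."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_content_md5(body, header_value):
    """True if the received body matches the digest the sender declared."""
    return content_md5(body) == header_value.strip()


body = b"Hi! I'm a message!"
header = content_md5(body)
print("Content-MD5:", header)
print("intact:", verify_content_md5(body, header))
print("after corruption:", verify_content_md5(body + b"?", header))
```

The same digest string could serve as the hash-table key mentioned on the slide, since identical bodies produce identical digests.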

11 11 Media Type and Charset
  Content-Type refers to the type of the original entity body, before any content encoding.
  It supports optional parameters that further specify the content type.
  Character encodings for text media: Content-Type: text/html; charset=iso-8859-4
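
One way to split a Content-Type value into its media type and charset parameter is to lean on Python's standard email package, which understands MIME-style headers. This is only a sketch: parse_content_type is a hypothetical helper, and the fallback to iso-8859-1 is an assumption of the example.

```python
from email.message import Message


def parse_content_type(value):
    """Return (media_type, charset) for a Content-Type header value.

    charset is None when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=iso-8859-4")
print(media_type, charset)

# Decode a text body with the declared charset (fallback chosen for this sketch).
text = b"<p>hello</p>".decode(charset or "iso-8859-1")
print(text)
```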

12 12 Common media types
  Media type                      Description
  text/html                       Entity body is an HTML document
  text/plain                      Entity body is a document in plain text
  image/gif                       Entity body is an image of type GIF
  image/jpeg                      Entity body is an image of type JPEG
  audio/x-wav                     Entity body contains WAV sound data
  model/vrml                      Entity body is a three-dimensional VRML model
  application/vnd.ms-powerpoint   Entity body is a Microsoft PowerPoint presentation
  multipart/byteranges            Entity body has multiple parts, each containing a different range (in bytes) of the full document
  message/http                    Entity body contains a complete HTTP message (see TRACE)

13 13 Multipart Media Types MIME "multipart" email messages contain multiple messages stuck together and sent as a single, complex message. Each component is self-contained, with its own headers describing its contents; the different components are concatenated together and delimited by a boundary string. HTTP also supports multipart bodies, but they are used in only two cases: fill-in form submissions and range responses carrying pieces of a document.

14 14 Multipart Form Submissions Example form with two fields, "Your Name?" and "Your File to send?", submitted to http://xxx/cgi

15 15 If the user enters "John" and selects the text file "hello.txt":
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --AaBo3x--

16 16 If the user selects the text file "hello.txt" and a second file, the image "image.gif":
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"
    Content-Type: multipart/mixed; boundary=BbC04y

    --BbC04y
    Content-Disposition: file; filename="hello.txt"
    Content-Type: text/plain

    … contents of hello.txt …
    --BbC04y
    Content-Disposition: file; filename="image.gif"
    Content-Type: image/gif
    Content-Transfer-Encoding: binary

    … contents of image.gif …
    --BbC04y--
    --AaBo3x--
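
The layout of a multipart/form-data body can be sketched in a few lines of Python. The function build_form_data and its argument shapes are hypothetical, it handles only the single-file-per-field case from slide 15, and it performs no escaping of names or checking that the boundary does not occur in the content.

```python
CRLF = b"\r\n"


def build_form_data(fields, files, boundary):
    """Return (content_type_header_value, body_bytes) for a form submission.

    fields: {name: text value}
    files:  {name: (filename, media_type, content_bytes)}
    """
    parts = []
    for name, value in fields.items():
        head = (f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n')
        parts.append(head.encode("utf-8") + value.encode("utf-8") + CRLF)
    for name, (filename, media_type, content) in files.items():
        head = (f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"; '
                f'filename="{filename}"\r\n'
                f"Content-Type: {media_type}\r\n\r\n")
        parts.append(head.encode("utf-8") + content + CRLF)
    # The closing delimiter is the boundary followed by two more hyphens.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


header, body = build_form_data(
    {"submit-name": "John"},
    {"files": ("hello.txt", "text/plain", b"hello, world\n")},
    "AaBo3x",
)
print("Content-Type:", header)
print(body.decode("utf-8"))
```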

17 17 Multipart Range Response
    HTTP/1.0 206 Partial Content
    Server: Microsoft-IIS/5.0
    Content-Location: http://xxx/hello.txt
    Content-Type: multipart/x-byteranges; boundary=--[abcdefghik … z]--

    ----[abcdefghik … z]--
    Content-Type: text/plain
    Content-Range: bytes 0-174/1441

    …. Part I content
    ----[abcdefghik … z]--
    Content-Type: text/plain
    Content-Range: bytes 1344-1440/1441

    …. Part II content
    ----[abcdefghik … z]----

18 18 Content-Encoding HTTP applications sometimes want to encode content before sending it, for example to lessen the time it takes to transmit the data. Content-Type is the type of the original format, before encoding. Content-Length is the length of the encoded body.

19 19 Content Encoding (figure)
  Original content: Content-Type: text/html, Content-Length: 17571
  After the gzip content encoder: Content-Type: text/html, Content-Length: 5746, Content-Encoding: gzip
  After the gzip content decoder: the original content again (Content-Type: text/html, Content-Length: 17571)

20 20 Content-Encoding tokens
  Value      Description
  gzip       Uses the GNU zip encoding (RFC 1952)
  compress   Uses the UNIX file compression program
  deflate    Uses the zlib format (RFC 1950) for deflate compression (RFC 1951)
  identity   No encoding has been performed. When a Content-Encoding header is not present, this can be assumed.
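
The relationship between Content-Type, Content-Encoding, and Content-Length can be sketched with Python's gzip module. The helper names and the sample HTML body are made up for this example; the point is only that Content-Type stays the same while Content-Length describes the encoded bytes.

```python
import gzip


def encode_response(body, media_type):
    """Gzip the body and return (headers, payload) describing the encoded entity."""
    payload = gzip.compress(body)
    headers = {
        "Content-Type": media_type,            # type of the original content
        "Content-Encoding": "gzip",
        "Content-Length": str(len(payload)),   # length of the encoded body
    }
    return headers, payload


def decode_response(headers, payload):
    """Undo the content encoding named in the headers (gzip or identity only)."""
    if headers.get("Content-Encoding", "identity") == "gzip":
        return gzip.decompress(payload)
    return payload


original = b"<html><body>" + b"<p>hello</p>" * 500 + b"</body></html>"
headers, payload = encode_response(original, "text/html")
print(len(original), "bytes before,", headers["Content-Length"], "bytes after")
assert decode_response(headers, payload) == original
```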

21 21 Accept-Encoding Headers
  Request message (client to server):
    GET /logo.gif HTTP/1.1
    Accept-Encoding: gzip
    [ … ]
  Response message (server to client):
    HTTP/1.1 200 OK
    Content-Type: image/gif
    Content-Encoding: gzip
    [ … ]
  The server compresses the image with gzip to transport a smaller file over the thin network connection between itself and the client. This saves network bandwidth and reduces the amount of time that the client waits for the transfer, though the client has to spend time decompressing (gunzip) the image once it is served.

22 22 A client can indicate preferred encodings by attaching Q values
    Accept-Encoding: compress, gzip
    Accept-Encoding:
    Accept-Encoding: *
    Accept-Encoding: compress;q=0.5, gzip;q=1.0
    Accept-Encoding: gzip;q=1.0, identity;q=0.5, *;q=0
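
A server-side reading of these headers might look like the sketch below. It is a simplification under stated assumptions: parse_accept_encoding and choose_encoding are hypothetical names, only the q parameter is understood, an unlisted identity is treated as acceptable with quality 1.0 unless "*" overrides it, and ties are broken by the server's own preference order.

```python
def parse_accept_encoding(value):
    """Map each listed content-coding (or '*') to its q value (default 1.0)."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        token, _, params = item.partition(";")
        params = params.strip()
        q = float(params[2:]) if params.startswith("q=") else 1.0
        prefs[token.strip().lower()] = q
    return prefs


def choose_encoding(value, supported):
    """Pick the best encoding from `supported` (listed in server preference order)."""
    prefs = parse_accept_encoding(value)

    def quality(encoding):
        if encoding in prefs:
            return prefs[encoding]
        if "*" in prefs:
            return prefs["*"]
        # Unlisted: only identity stays acceptable (assumption of this sketch).
        return 1.0 if encoding == "identity" else 0.0

    candidates = [e for e in supported if quality(e) > 0]
    return max(candidates, key=quality, default=None)


server_supports = ["gzip", "compress", "identity"]
for header in ["compress, gzip", "", "*",
               "compress;q=0.5, gzip;q=1.0",
               "gzip;q=1.0, identity;q=0.5, *;q=0"]:
    print(repr(header), "->", choose_encoding(header, server_supports))
```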

23 23 Transfer Encoding Content encodings transform the entity content itself, to save space or for security reasons, and are tightly associated with the content format. In comparison, transfer encodings are applied for architectural reasons and are independent of the content format.

24 24 Content encoding vs. transfer encoding
  Content-encoded response (normal header block, normal entity that is just encoded):
    HTTP/1.0 200 OK
    Content-Encoding: gzip
    Content-Type: text/html
    [ … ]

    [encoded message]
  Transfer-encoded response (basic header, then encoded blocks):
    HTTP/1.1 200 OK
    Transfer-Encoding: chunked

    b
    abcdefghijk
    1
    a
  A content-encoded message just encodes the entity section of the message. With transfer-encoded messages, the encoding is a function of the entire message, changing the structure of the message itself.

25 25 Transfer-Encoding Headers
  TE: used in the request header to tell the server which extension transfer encodings are okay to use.
  Transfer-Encoding: used in the response header to tell the receiver (the client) which encoding has been performed.

26 26 Example
  Request:
    GET /1.html HTTP/1.1
    Host: www.csie.ncnu.edu.tw
    User-Agent: Mozilla/4.61
    TE: trailers, chunked
  Response:
    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Server: Apache 3.0

27 27 Chunked Encoding

28 28 Chunked Encoding (continued) Chunking and Persistent connection Trailers in chunked messages Combining Content and Transfer Encoding
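
To make the chunked wire format concrete, here is a minimal sketch of an encoder and decoder. Each chunk is a hexadecimal size line, CRLF, the data, and CRLF, and a zero-size chunk ends the body. The function names are hypothetical, chunk extensions are skipped, and trailers are neither written nor read.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, with no trailers
    return bytes(out)


def decode_chunked(data):
    """Reassemble the entity body from a complete chunked message body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)         # sizes are hexadecimal
        pos = line_end + 2
        if size == 0:
            break  # anything after this point would be trailers
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


wire = encode_chunked([b"abcdefghijk", b"a"])
print(wire)
print(decode_chunked(wire))
```

Because the receiver learns each chunk's size just before the data arrives, the sender does not need to know the total length up front, which is what lets chunking coexist with persistent connections.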

29 29 Combining Content and Transfer Encodings (figure)
  Original entity: Content-Type: text/html
  After content encoding: Content-Type: text/html, Content-Encoding: gzip (sample data in the figure: 9BF2578EA4 2670CD)
  After transfer encoding (chunking): Content-Type: text/html, Content-Encoding: gzip, Transfer-Encoding: chunked (sample data in the figure: 426 8EA 257 98B)

30 30 Time-Varying Instances Web objects usually are not static. The same URL can, over time, point to different versions of an object. Examples are the websites of media companies such as CNN and the BBC.

31 31 Time-Varying Instances

32 32 Validators and Freshness In the previous CNN example, the client got the initial resource V1 and can cache this copy, but for how long? Once the document has "expired" at the client, the client must request a fresh copy from the server. It can use a "conditional request", which carries a validator telling the server which version the client currently has, and asks for a copy to be sent only if that version is no longer valid.

33 33 Cache-Control header directives
  Directive        Message type
  no-cache         Request
  no-store         Request
  max-age          Request
  max-stale        Request
  min-fresh        Request
  no-transform     Request
  only-if-cached   Request
  public           Response
  private          Response

34 34 Cache-Control header directives (continued)
  Directive          Message type
  no-cache           Response
  no-store           Response
  no-transform       Response
  must-revalidate    Response
  proxy-revalidate   Response
  max-age            Response
  s-maxage           Response

35 35 Conditional request types
  Request type          Validator
  If-Modified-Since     Last-Modified
  If-Unmodified-Since   Last-Modified
  If-Match              ETag
  If-None-Match         ETag
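
How a server might evaluate two of these conditional headers on a GET can be sketched as follows. This is a simplification, not a full implementation of the HTTP rules: evaluate_conditional is a hypothetical function, ETags are compared as exact strings (no weak-validator handling), and If-Match and If-Unmodified-Since are left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def evaluate_conditional(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    etag is the current entity tag (including its quotes); last_modified is a
    timezone-aware datetime. If-None-Match is checked before If-Modified-Since.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        tags = [t.strip() for t in if_none_match.split(",")]
        return 304 if ("*" in tags or etag in tags) else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        return 304 if last_modified <= since else 200

    return 200  # unconditional request: send the full entity


modified = datetime(2000, 9, 17, 0, 1, 5, tzinfo=timezone.utc)
print(evaluate_conditional({"If-None-Match": '"v1"'}, '"v1"', modified))
print(evaluate_conditional(
    {"If-Modified-Since": "Sat, 16 Sep 2000 00:01:05 GMT"}, '"v2"', modified))
```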

36 36 Range Requests
  HTTP allows clients to request just a part, or range, of a document.
  Applications: requesting a region of interest (RoI), media indexing and access, streaming applications.

37 37 Range Requests
  Request message (client to www.csie.ncnu.edu.tw):
    GET /bigfile.html HTTP/1.1
    [ … ]
  Response message:
    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 65537
    Accept-Ranges: bytes
    [ … ]
  Range request message:
    GET /bigfile.html HTTP/1.1
    Range: bytes=20224-
    [ … ]
  Range response message:
    HTTP/1.1 206 Partial Content
    Content-Type: text/html
    Content-Range: bytes 20224-65536/65537
    Accept-Ranges: bytes
    [ … ]
  The client's original request was interrupted, but a second request for the part of the message that was not received allows the client to resume from the point of the interruption.
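
The arithmetic behind a single-range request can be sketched like this. resolve_range and content_range are hypothetical helpers; the sketch handles only one bytes range (no multipart/byteranges) and returns None where a real server would fall back to a full response or a 416 status.

```python
def resolve_range(range_header, total):
    """Return (first, last) inclusive byte offsets for a single bytes range.

    Returns None when the range is unsupported or cannot be satisfied.
    """
    unit, _, spec = range_header.partition("=")
    if total <= 0 or unit.strip() != "bytes" or "," in spec:
        return None
    first_s, _, last_s = spec.strip().partition("-")
    if first_s == "":
        # Suffix form: bytes=-N asks for the last N bytes.
        if not last_s or int(last_s) == 0:
            return None
        return max(total - int(last_s), 0), total - 1
    first = int(first_s)
    last = min(int(last_s), total - 1) if last_s else total - 1
    if first > last:
        return None
    return first, last


def content_range(first, last, total):
    """Format the Content-Range header value for a 206 response."""
    return f"bytes {first}-{last}/{total}"


document = bytes(65537)  # stand-in for /bigfile.html
first, last = resolve_range("bytes=20224-", len(document))
partial = document[first:last + 1]
print("Content-Range:", content_range(first, last, len(document)))
print("Content-Length:", len(partial))
```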

38 38 Delta Encoding An extension to the HTTP protocol that optimizes transfers by communicating changes instead of entire objects. RFC 3229 describes delta encoding.

39 39 Delta Encoding

40 40 Delta Encoding

41 41 Delta-encoding headers: ETag, If-None-Match, A-IM, IM, Delta-Base

42 42 IANA registered types of instance manipulations
  Type       Description
  vcdiff     Delta using the vcdiff algorithm
  diffe      Delta using the Unix diff -e command
  gdiff      Delta using the gdiff algorithm
  gzip       Compression using the gzip algorithm
  deflate    Compression using the deflate algorithm
  range      Used in a server response to indicate that the response is partial content as the result of a range selection
  identity   Used in a client request's A-IM header to indicate that the client is willing to accept an identity instance manipulation

43 43 For More Information
  http://www.ietf.org/rfc/rfc2616.txt  Hypertext Transfer Protocol -- HTTP/1.1
  http://www.ietf.org/rfc/rfc3229.txt  Delta encoding in HTTP
  http://www.ietf.org/rfc/rfc1521.txt  MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
  http://www.ietf.org/rfc/rfc2045.txt  Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
  http://www.ietf.org/rfc/rfc1864.txt  The Content-MD5 Header Field
  http://www.ietf.org/rfc/rfc3230.txt  Instance Digests in HTTP

