HTTP Protocol Design

HTTP - timeline
- Mar 1990: CERN labs document proposing the Web
- Jan 1992: HTTP/0.9 specification
- Dec 1992: Proposal to add MIME to HTTP
- Feb 1993: UDI (Universal Document Identifier) Network
- Mar 1993: HTTP/1.0 first draft
- Jun 1993: HTML (1.0 Specification)
- Oct 1993: URL specification
- Nov 1993: HTTP/1.0 second draft
- Mar 1994: URI in WWW
- May 1996: HTTP/1.0 Informational, RFC 1945
- Jan 1997: HTTP/1.1 Proposed Standard, RFC 2068
- Jun 1999: HTTP/1.1 Draft Standard, RFC 2616
- 2001: HTTP/1.1 Formal Standard

Uniform Resource Identifier (URI)
- Identifies a resource independently of its current location or the name by which it is known
- A URI is a combination of:
  - Uniform Resource Locator (URL)
    - Several alternatives (e.g., ftp://)
    - Most popular
  - Uniform Resource Name (URN)
    - Globally unique
    - Like an ISBN for a book
- URI characteristics
  - Absolute: has the form scheme:string (scheme: file, news, http, telnet, …)
  - Relative: no scheme
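
The absolute/relative distinction can be checked mechanically. The following is a minimal sketch using Python's standard `urllib.parse`; the helper name `is_absolute` and the example.com base URI are assumptions made for illustration, not part of the slides.

```python
from urllib.parse import urlsplit, urljoin


def is_absolute(uri: str) -> bool:
    """A URI is absolute when it starts with a scheme (the part before ':')."""
    return bool(urlsplit(uri).scheme)


# Hypothetical base URI used to resolve a relative reference.
base = "http://www.example.com/docs/index.html"

print(is_absolute(base))                  # True: scheme is "http"
print(is_absolute("urn:isbn:0451450523")) # True: scheme is "urn"
print(is_absolute("images/logo.png"))     # False: no scheme
print(urljoin(base, "images/logo.png"))   # http://www.example.com/docs/images/logo.png
```

A relative URI only becomes usable once it is resolved against an absolute base, which is what `urljoin` does here.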

MIME and HTTP
- Original proposal
  - All resources MIME encapsulated
  - Protocols such as the Web should only handle MIME-compliant data
- Adopted
  - Classification of data formats (MIME types)
  - Formats for multipart messages
- Not adopted
  - Rich text markup mechanism (HTML used instead)
  - Addressing external documents (URLs used instead)

MIME and HTTP differences
- MIME defined for email
- HTTP designed for high performance
- Interpretation of header fields (Content-Length)
- Limitation on line length
- HTTP is not MIME-compliant (Content-Encoding)
- Different kinds of entities

HTTP terms
- Message
  - Sequence of octets
  - Syntax: Request
      Request-Line
      General/Request/Entity Header(s)
      CRLF
      Optional Message Body
  - Syntax: Response
      Status-Line
      General/Response/Entity Header(s)
      CRLF
      Optional Message Body
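
To make the request syntax concrete, here is a minimal sketch that splits a raw request message into request line, headers, and body. The function name `parse_request` is hypothetical, and the sketch assumes a well-formed message with no folded (multi-line) header values.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    # The empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request-Line: Method SP Request-URI SP HTTP-Version
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


raw = b"GET /index.html HTTP/1.0\r\nUser-Agent: demo/0.1\r\n\r\n"
print(parse_request(raw))
```

A response could be parsed the same way, with the Status-Line (version, status code, reason phrase) in place of the Request-Line.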

HTTP terms (cont.)
- Entity
  - Representation of a resource from a request or response message
  - Includes entity headers and an optional entity body
- Resource
  - “Network data object or service that can be identified by a URI”
- User agent

HTTP/1.0 request methods
- Safe: the request only examines the state of a resource and does not change it
- Idempotent: side effects of one request == those of multiple identical requests
- GET (safe, idempotent)
- HEAD
- POST (not safe, not idempotent)
- PUT (not safe, idempotent)
- DELETE
- LINK/UNLINK
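
One practical use of these two properties is deciding whether a client may silently repeat a request after a failure. The table below is a sketch: the GET, POST, and PUT entries come from the slide, while the HEAD and DELETE entries follow the HTTP/1.1 specification and are an addition here. The function name `may_retry_automatically` is hypothetical.

```python
# (safe, idempotent) per method. HEAD and DELETE values are taken from
# RFC 2616, not from the slide above.
METHODS = {
    "GET":    {"safe": True,  "idempotent": True},
    "HEAD":   {"safe": True,  "idempotent": True},
    "POST":   {"safe": False, "idempotent": False},
    "PUT":    {"safe": False, "idempotent": True},
    "DELETE": {"safe": False, "idempotent": True},
}


def may_retry_automatically(method: str) -> bool:
    """Repeating an idempotent request has the same side effects as sending it once."""
    return METHODS[method.upper()]["idempotent"]


print(may_retry_automatically("PUT"))   # True
print(may_retry_automatically("POST"))  # False
```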

HTTP/1.0 headers
- General
  - Date
  - Pragma (no-cache)
- Request
  - Authorization
  - From
  - If-Modified-Since
  - Referer
  - User-Agent
- Response
  - Location (redirects)
  - Server
  - WWW-Authenticate (issues challenge)

HTTP/1.0 headers (cont.)
- Entity
  - Allow (valid methods)
  - Content-Type
  - Content-Encoding
  - Content-Length
  - Expires
  - Last-Modified

HTTP/1.0 response classes
- Derived from SMTP reply codes (yet with no specific meaning)
- X00: default response for a class
- 1XX: Informational
- 2XX: Success
  - 200 OK, 201 Created, 202 Accepted, 204 No Content
- 3XX: Redirection
  - 300 Multiple Choices, 301 Moved Permanently, 302 Moved Temporarily, 304 Not Modified
- 4XX: Client error
  - 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found
- 5XX: Server error
  - 500 Internal Server Error, 501 Not Implemented
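
Because the first digit alone determines the class, a client can handle a status code it does not recognize by treating it as the X00 code of the same class. A minimal sketch of that rule, with hypothetical function names:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Map a three-digit status code to its class using the first digit only."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


def fallback_code(code: int) -> int:
    """An unrecognized code is handled like the X00 code of its class."""
    return (code // 100) * 100


print(status_class(304))   # Redirection
print(fallback_code(499))  # 400
```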

Problems with HTTP/1.0
- Lack of control: cache duration, cache location, selection among cached variants, …
- Ambiguity of rules for proxies and caches
- Download of the full resource instead of only the needed part
- Poor use of TCP: Web responses are short
- No guarantee of full receipt of dynamically generated responses
- Depletion of IP addresses
- Inability to tailor requests and responses to client and server preferences
- Poor level of security
- Miscellaneous

New concepts
- Hop-by-hop mechanism
  - Headers valid only for a single transport-level connection: Transfer-Encoding, Connection
  - Cannot be stored by caches or forwarded by proxies
- Transfer coding
  - Split: message vs. entity (including headers)
  - Content coding is applied to the whole entity
  - Transfer coding applies to the entity-body
    - Property of the message, not of the original entity
    - TE, Transfer-Encoding
- Virtual hosting
- Semantic transparency for caching
- Support for variants of a resource

HTTP/1.1 methods
- GET, HEAD, POST
- PUT, DELETE: formalized
- OPTIONS: purpose extensibility
  - Learn about a server's capabilities
  - Learn about intermediate servers in the path (Max-Forwards header == TTL)
- TRACE: purpose extensibility
  - The receiver returns the content of the message as it received it (Via header == records intermediaries)
- CONNECT: future use (extensibility: Upgrade header allows switch to other protocols)

New headers: General
- Old: Date, Pragma
- New:
  - Cache-Control: caching
  - Connection: hop-by-hop
  - Trailer: list of headers at end
  - Transfer-Encoding: transformation to message body
  - Upgrade: upgrade to other protocols
  - Via: intermediate servers
  - Warning: error notification

New headers: Request
- Response preference
  - New: Accept (charset, encoding, language), TE
- Information
  - Old: Authorization, From, Referer, User-Agent
  - New: Proxy-Authorization
- Conditional request
  - Old: If-Modified-Since
  - New: If-Match, If-None-Match, If-Unmodified-Since, If-Range
- Constraint on server
  - New: Expect, Host, Max-Forwards, Range

New headers: Response
- Redirection
  - Old: Location
- Information
  - Old: Server
  - New: Retry-After, Accept-Ranges
- Security related
  - Old: WWW-Authenticate
  - New: Proxy-Authenticate
- Caching related
  - New: ETag, Age, Vary

New headers: Entity
- Old:
  - Allow
  - Content-Encoding, -Length, -Type
  - Expires
  - Last-Modified
- New:
  - Content-Language, -Location, -MD5, -Range

Response codes: Examples
- Informational
  - 100 Continue, 101 Switching Protocols
- Success: 206 Partial Content, …
- Redirection: 305 Use Proxy, …
- Client errors
  - 14 new ones
  - Error codes: 400 Bad Request, 404 Not Found
  - Clarification status codes: 405 Method Not Allowed, 410 Gone
  - Using negotiation: 406 Not Acceptable, 415 Unsupported Media Type
  - Length related: 411 Length Required
  - Other features: 402 Payment Required, 417 Expectation Failed
- Server errors: 504 Gateway Timeout, …

Caching HTTP/1.0
- Control options
  - Request directive (Pragma: no-cache)
  - Modifier to GET (If-Modified-Since)
  - Response header (Expires)
- Cache busting
  - Expires header forces immediate expiry of the resource
  - Last-Modified typically means not dynamically generated
- Absolute clock values

Caching HTTP/1.1
- Issues:
  - Separation of cacheability from safe use of a cached copy
  - Ensure correctness (no cache should unknowingly return a stale value)
  - More control by the server over cacheability
  - No absolute timestamps (no clock synchronization)
  - Caching of negotiated responses
- Headers:
  - Age, Cache-Control, ETag, Vary

Cache-control HTTP/1.1 request
- no-cache: forcible revalidation
- only-if-cached: resource only from cache
- no-store: cache cannot store
- max-age: age <= value
- max-stale: expired OK, but by <= value
- min-fresh: must remain fresh for value
- no-transform: no change of media type
- Extension: new tokens

Cache-control HTTP/1.1 response
- public: OK to cache
- private: response for a specific user only
- no-store: not permitted to store
- no-cache: do not serve from cache without revalidation
- no-transform: proxy cannot change media type, etc.
- must-revalidate: may be cached, but revalidate if stale
- proxy-revalidate: shared caches need revalidation
- max-age: response age should be <= value
- s-maxage: shared caches use value as max-age
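
The response directives can be read as a decision procedure for a shared cache. The sketch below parses a Cache-Control value and decides whether a stored copy may be served without revalidation. It is a simplification under stated assumptions: it ignores Expires, heuristic freshness, and request directives, and the function names are hypothetical.

```python
def parse_cache_control(value: str) -> dict:
    """Parse 'a, b=1' into {'a': None, 'b': '1'} with lowercase directive names."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_serve(cache_control: str, age_seconds: int) -> bool:
    """True if a shared cache may serve its copy without contacting the origin."""
    d = parse_cache_control(cache_control)
    # no-store and private forbid a shared cache from keeping the response;
    # no-cache allows storing it but requires revalidation before each use.
    if "no-store" in d or "private" in d or "no-cache" in d:
        return False
    # s-maxage overrides max-age for shared caches.
    lifetime = d.get("s-maxage") or d.get("max-age")
    if lifetime is None:
        return False  # assumption: no explicit lifetime means "revalidate"
    # RFC 2616 treats a response as fresh while its age is below the lifetime.
    return age_seconds < int(lifetime)


print(shared_cache_may_serve("public, max-age=60", 30))        # True
print(shared_cache_may_serve("max-age=60, s-maxage=10", 30))   # False
```

Because freshness is computed from a relative age rather than from absolute clock values, the cache and the origin server do not need synchronized clocks.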

ETag
- Opaque value
- Different versions of a resource => different ETag values
- Decoupled from cache validation
  - If-Match, If-None-Match
  - E.g., If-None-Match in PUT to avoid overwriting
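
The PUT example works because a server that finds a matching entity tag must not perform the method. A minimal sketch of how a server might evaluate If-None-Match follows; the function name is hypothetical, and weak validators (the `W/` prefix) are deliberately not handled.

```python
def evaluate_if_none_match(method, if_none_match, current_etag):
    """Return the status code to send, or None if the request may proceed.

    current_etag is None when the resource does not exist yet.
    """
    if if_none_match is None:
        return None
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    # "*" matches any existing version of the resource.
    matched = current_etag is not None and (
        "*" in candidates or current_etag in candidates
    )
    if not matched:
        return None
    # On a match: 304 Not Modified for GET/HEAD, 412 Precondition Failed otherwise.
    return 304 if method in ("GET", "HEAD") else 412


# PUT with "If-None-Match: *" is refused when the resource already exists...
print(evaluate_if_none_match("PUT", "*", '"v1"'))  # 412
# ...and allowed when it does not.
print(evaluate_if_none_match("PUT", "*", None))    # None
```

The same check serves cache validation: a GET carrying the cached entity tag gets a 304 when the resource is unchanged.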

Vary
- E.g., Accept-Language
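
Vary tells a cache which request headers were used to select the response, so those header values have to become part of the cache key. The sketch below shows one way to build such a key; `cache_key` and the example URL are hypothetical, and header values are compared as plain strings without normalization.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    request = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, request.get(name, "")) for name in names)


url = "http://www.example.com/welcome"  # hypothetical resource
english = cache_key(url, {"Accept-Language": "en"}, "Accept-Language")
french = cache_key(url, {"Accept-Language": "fr"}, "Accept-Language")
print(english != french)  # True: the two variants are cached separately
```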

Bandwidth Optimization: Factors
- Resource sizes are growing
- Embedded images
- More users, better connected
- Multiple parallel connections

Bandwidth Optimization: Solutions
- Only transfer necessary pieces of a resource
  - Range request
- Only transfer if the receiver can handle the response
  - Expect/continue
- Transform the resource before sending
  - Compression
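
As an illustration of range requests, the sketch below resolves a single-range `Range: bytes=...` header against a resource of known size and builds the matching Content-Range value for a 206 Partial Content response. The function names are hypothetical, and multi-range requests are treated as unsupported to keep the example short.

```python
def parse_byte_range(header: str, size: int):
    """Return (start, end) inclusive, or None if the range cannot be satisfied."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec or size == 0:
        return None
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form "-N": the last N bytes of the resource.
        if not last.isdigit() or int(last) == 0:
            return None
        return (max(size - int(last), 0), size - 1)
    if not first.isdigit():
        return None
    start = int(first)
    if last == "":
        end = size - 1          # open-ended form "N-"
    elif last.isdigit():
        end = int(last)
    else:
        return None
    if start >= size or end < start:
        return None
    return (start, min(end, size - 1))


def content_range(start: int, end: int, size: int) -> str:
    """Content-Range value sent with a 206 Partial Content response."""
    return f"bytes {start}-{end}/{size}"


print(parse_byte_range("bytes=0-499", 1000))  # (0, 499)
print(parse_byte_range("bytes=-200", 1000))   # (800, 999)
print(content_range(0, 499, 1000))            # bytes 0-499/1000
```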

Connection management
- Problem:
  - TCP not optimized for the typical short-lived HTTP message exchange
  - Use of parallel connections
- Solution:
  - Persistent connections - Keep-Alive
  - Pipelined connections - Connection header
  - Problems:
    - Head-of-line blocking
    - Unexpected close (aborts)

Message transmission
- HTTP/1.0
  - Content-Length field
- HTTP/1.1
  - Chunked encoding (ends with zero-length chunk)
  - Response: Transfer-Encoding: chunked
  - Request: TE: trailers
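
Chunked encoding lets a sender transmit a body whose total length is not known in advance: each chunk is prefixed with its size in hexadecimal, and a zero-length chunk marks the end. The following is a minimal sketch of both directions; the function names are hypothetical, and trailers and chunk extensions are ignored.

```python
def encode_chunked(pieces) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk, then an empty trailer section
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body, stopping at the zero-length chunk."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; skip it.
        size = int(data[pos:eol].split(b";")[0].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


wire = encode_chunked([b"Hello, ", b"world"])
print(wire)                  # b'7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n'
print(decode_chunked(wire))  # b'Hello, world'
```

With Content-Length the receiver knows the body is complete when it has read the stated number of bytes; with chunking, the terminating zero-length chunk plays that role for dynamically generated responses.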

Internet address conservation
- Many Web servers on a single host
- HTTP/1.0: one IP address per Web server
- HTTP/1.1: Host header
  - Host:
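
With the Host header, one server on one IP address can pick the right site per request. The sketch below illustrates the dispatch; the site table, directory paths, and function name are all hypothetical, and host values are assumed to be plain names (no IPv6 literals).

```python
# Hypothetical mapping from host name to document root on a single server.
SITES = {
    "www.example.com": "/srv/example-com",
    "www.example.org": "/srv/example-org",
}
DEFAULT_ROOT = "/srv/default"


def select_site(version: str, headers: dict):
    """Return (status, document_root) for a request with lowercase header names."""
    host = headers.get("host")
    if host is None:
        if version == "HTTP/1.1":
            return 400, None         # HTTP/1.1 requests must carry a Host header
        return 200, DEFAULT_ROOT     # HTTP/1.0 clients may omit it
    name = host.split(":", 1)[0].strip().lower()  # drop an optional port
    return 200, SITES.get(name, DEFAULT_ROOT)


print(select_site("HTTP/1.1", {"host": "www.example.org:8080"}))  # (200, '/srv/example-org')
print(select_site("HTTP/1.1", {}))                                # (400, None)
```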

Content negotiation
- Different formats for each resource
- Client and server negotiate about the preferred representation
  - Agent-driven
  - Server-driven
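
In server-driven negotiation the server picks a variant from the preferences the client sends in its Accept headers. The sketch below chooses a language from an Accept-Language value using quality values (q); the function names are hypothetical and the matching rule is a simplified prefix match, not the full algorithm of the specification.

```python
def parse_accept(value: str):
    """Parse 'da, en-gb;q=0.8' into [('da', 1.0), ('en-gb', 0.8)]."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((parts[0].lower(), q))
    return prefs


def choose_language(accept_language: str, available):
    """Return the available language with the highest q, or None if none is acceptable."""
    best, best_q = None, 0.0
    for lang, q in parse_accept(accept_language):
        for candidate in available:
            c = candidate.lower()
            if (lang == "*" or c == lang or c.startswith(lang + "-")) and q > best_q:
                best, best_q = candidate, q
    return best


print(choose_language("da, en-gb;q=0.8, en;q=0.7", ["en", "fr"]))  # en
print(choose_language("fr;q=0.5, de", ["en"]))                     # None
```

When nothing acceptable is available, a server can answer 406 Not Acceptable; in agent-driven negotiation it would instead return the list of variants and let the client choose.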

Proxies in HTTP/1.1: syntax
- Requirements dealing with forwarding messages
  - HTTP/1.1 vs. HTTP/1.0
  - Forward headers that are not understood
  - Treat hop-by-hop headers
  - Remove Connection header
- Requirements dealing with modifying existing headers and adding new ones
  - Add Via header
  - Do not alter the order of field values
  - Adhere to cache control directives
  - Do not modify From and Server
  - Do not alter fully qualified domain names
  - Do not generate certain headers: Content-MD5
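
Several of these forwarding rules can be sketched in a few lines: drop the hop-by-hop headers (including any header named in Connection), pass everything else through in its original order, and add a Via entry. The function name and the proxy name `proxy.example` are assumptions for illustration.

```python
# Hop-by-hop headers defined by HTTP/1.1 (RFC 2616, section 13.5.1).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def forward_headers(headers, proxy_name="proxy.example"):
    """headers is a list of (name, value) pairs; returns the list to forward."""
    # Headers named in Connection are hop-by-hop for this connection as well.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | listed

    # Keep everything else, including headers the proxy does not understand,
    # without changing their order.
    forwarded = [(n, v) for n, v in headers if n.lower() not in drop]

    # Record this hop; a separate Via line is equivalent to extending an existing one.
    forwarded.append(("Via", f"1.1 {proxy_name}"))
    return forwarded


incoming = [
    ("Host", "www.example.com"),
    ("Connection", "close, X-Session-Hint"),
    ("X-Session-Hint", "abc"),
    ("Keep-Alive", "300"),
    ("X-Unknown", "kept"),
]
print(forward_headers(incoming))
```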

Proxies in HTTP/1.1: semantics
- Caching requirements
  - See cache control
  - Obligated to send Age header
- Connection management requirements
  - HTTP/1.1 proxies must not establish persistent connections with HTTP/1.0 clients
  - Different guidelines regarding persistent connections (2 * simultaneously active users)
- Bandwidth management requirements
  - Range requests
  - Forward Expect header / 417 Expectation Failed response