The abs_path in a URI If the abs_path is not present in the URL, it must be given as "/" when the URL is used as a Request-URI for a resource. Thus, if a user points a browser at a URL that has no path after the host name, the browser acts, when writing the URI in a HTTP request, as if the user had entered the same URL with a trailing /, where the extra / is the default abs_path, the path for the default resource in the HTTP document-root directory at the server. –Note that the HTTP document-root directory is not the same as the root directory for the operating system on the server machine. –The default resource in a HTTP directory (whether it is the HTTP document-root or not) is usually a file called index.html or welcome.html, depending on the configuration of the server software.
Example 5: two pipelined requests which actually refer to the same resource
interzone.ucc.ie> telnet student.cs.ucc.ie 80
Trying
Connected to student.cs.ucc.ie.
Escape character is '^]'.
HEAD HTTP/1.1
Host: student.cs.ucc.ie

HEAD HTTP/1.1
Host: student.cs.ucc.ie
Connection: close
Example 5: responses show same resource
HTTP/ OK
Date: Wed, 31 Jan :54:06 GMT
Server: Apache/ (Unix) PHP/4.0.3pl1
Last-Modified: Thu, 25 Jan :26:32 GMT
ETag: "2160-2e25-3a702988"
Accept-Ranges: bytes
Content-Length:
Content-Type: text/html

HTTP/ OK
Date: Wed, 31 Jan :54:06 GMT
Server: Apache/ (Unix) PHP/4.0.3pl1
Last-Modified: Thu, 25 Jan :26:32 GMT (Same time/date as above)
ETag: "2160-2e25-3a702988" (Is given the same ETag since it is the same resource)
Accept-Ranges: bytes
Content-Length: (Same file size as above; it is the same file.)
Connection: close
Content-Type: text/html
Connection closed by foreign host.
URI Comparison –A comparison of two URIs should be case-sensitive, with these exceptions: A port that is empty or not given is equivalent to the default port for that URI-reference; Comparisons of host names must be case-insensitive; Comparisons of scheme names must be case-insensitive; An empty abs_path is equivalent to an abs_path of "/". Characters other than those in the "reserved" and "unsafe" sets are equivalent to their escaped encoding (%HexHex encoding). URL-codes are case-insensitive.
URI comparison (contd.) For example, the following URIs are equivalent: because scheme names and host names are case-insensitive, 80 is the default port, the escaped (URL) encoding for ~ is %7E, and escaped encodings are case-insensitive.
URI comparison (contd.) For example, the following URIs are equivalent: because an empty abs_path is equivalent to an abs_path of "/".
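The comparison rules above can be sketched in Python. This is only an illustration under stated assumptions: the host www.example.com and the helper names normalize_uri and uris_equivalent are hypothetical, only the http and https default ports are known to the sketch, and the set of characters that may safely be un-escaped is taken to be the RFC 3986 "unreserved" set, which is a conservative reading of "other than reserved and unsafe".

```python
import re
from urllib.parse import urlsplit

# Assumption: only these two schemes are handled by this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Assumption: characters that are safe to un-escape (RFC 3986 "unreserved").
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def _normalize_escapes(text):
    """Decode %HexHex for unreserved characters; upper-case the rest."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)


def normalize_uri(uri):
    """Reduce a URI to a tuple that can be compared case-sensitively."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()                 # scheme: case-insensitive
    host = (parts.hostname or "").lower()         # host: case-insensitive
    port = parts.port or DEFAULT_PORTS.get(scheme)  # missing port = default
    path = _normalize_escapes(parts.path) or "/"  # empty abs_path = "/"
    query = _normalize_escapes(parts.query)
    return (scheme, host, port, path, query)


def uris_equivalent(a, b):
    return normalize_uri(a) == normalize_uri(b)
```

Everything that the rules do not mention, such as the case of the path itself, is still compared case-sensitively, because the normalized tuples are compared with plain equality.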
Detailed Consideration of HTTP Message Types
HTTP Message Types HTTP messages consist of requests from client to server and responses from server to client. Both types of message consist of –a start-line (a request-line or a status-line) –zero or more header-fields (also known as "headers"), –an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, –and (possibly) a message-body. We will consider first the features that are common to requests and responses: –Header-fields –Message-Bodies Later, we will consider features specific to requests (request-lines) and responses (status-lines)
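As a rough illustration of this layout, the following Python sketch splits a raw message into its start-line, its header lines and its message-body. The function name split_message and the sample request are hypothetical, and the sketch assumes the whole message is already in memory and uses CRLF line endings throughout.

```python
def split_message(raw: bytes):
    """Split a raw HTTP message into (start_line, header_lines, body).

    The first empty line (CRLF CRLF) marks the end of the header fields;
    everything after it is the message-body, which may be empty.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    return start_line, header_lines, body


# Hypothetical request used only to exercise the function.
sample = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: student.cs.ucc.ie\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)
start_line, header_lines, body = split_message(sample)
```

The same function works for a response, since a status-line occupies the same position as a request-line.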
HTTP Header-fields HTTP header-fields include –general-header-fields –request-header-fields (or response-header-fields), and –entity-header-fields. Each header-field consists of a name followed by a colon and the field value. There are many different types of header-fields, which we will consider later.
Header-fields (contd.) Field names are case-insensitive. The field value may be preceded by any amount of white-space. Header fields can extend over multiple lines if each extra line starts with at least one SP or HT. The order in which header fields with differing field names are received is not significant. However, it is "good practice" to send general-header fields first, followed by request-header (or response-header) fields, and ending with the entity-header fields.
Header-fields (contd.) Multiple message-header fields with the same field-name may be present in a message if and only if –the entire field-value for that header field is defined as a comma-separated list. Appending the multiple header field-values into a comma-separated list must not alter the meaning of the message. Therefore, the order in which header fields with the same field-name are received is significant. Thus a proxy must NOT change the order of these field values when a message is forwarded.
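The header-field rules from the last few slides (case-insensitive names, optional leading white-space, continuation lines, and combining repeated fields in the order received) can be sketched as follows. The function name parse_headers and the sample header lines are hypothetical; the sketch assumes the header lines have already been split on CRLF.

```python
def parse_headers(header_lines):
    """Turn raw header lines into a dict of lower-cased name -> value."""
    # Step 1: join continuation lines (those starting with SP or HT)
    # onto the header line they extend.
    unfolded = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)

    # Step 2: split each line at the first colon. Names are compared
    # case-insensitively, so they are stored in lower case.
    fields = {}
    for line in unfolded:
        name, _, value = line.partition(":")
        key = name.strip().lower()
        value = value.strip()
        if key in fields:
            # Repeated field: append to a comma-separated list,
            # preserving the order in which the values arrived.
            fields[key] += ", " + value
        else:
            fields[key] = value
    return fields


# Hypothetical header lines used only to exercise the function.
example = parse_headers([
    "Accept: text/html",
    "ACCEPT:    text/plain",
    "User-Agent: demo",
    "\tclient/1.0",
])
```

Because repeated values are appended rather than replaced or sorted, the combined value keeps the order that a proxy is required to preserve.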
Message-bodies The message-body (if any) of a HTTP message is used to carry the entity-body associated with the request or response. The message-body differs from the entity-body only when a transfer-coding has been applied, as indicated by a header-field called the Transfer-Encoding header field
Transfer Encoding The Transfer-Encoding header is used to indicate any transfer-codings applied by an application to ensure safe and proper transfer of the message. Transfer-Encoding is a property of the message, not of the entity, and thus may be added or removed by any application along the request/response chain.
One type of Transfer-Encoding: chunked The chunked encoding method modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message: –such content cannot be preceded by a Content-Length header if the program producing the content dynamically is not able to predict how long its output will be. A chunk-size indicator is a line containing a string of hex digits, giving the number of octets in the chunk. The chunked encoding is ended by a chunk whose size is zero.
Example: request for dynamic output The resource in the request below is a CGI program
interzone.ucc.ie> telnet student.cs.ucc.ie 80
Trying
Connected to student.cs.ucc.ie.
Escape character is '^]'.
GET HTTP/1.1
Host: student.cs.ucc.ie
Example (contd.): chunked response
HTTP/ OK
Date: Wed, 31 Jan :52:27 GMT
Server: Apache/ (Unix) PHP/4.0.3pl1
Transfer-Encoding: chunked
Content-Type: text/html

7c
Short response This is a short response produced by short.cgi
0
Connection closed by foreign host.
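In the response above, the chunk-size line 7c is hexadecimal for 124, so the single chunk carries 124 octets, and the line 0 ends the encoding. A minimal Python sketch of how a sender might produce this encoding and how a recipient might undo it is given below. The function names encode_chunked and decode_chunked are hypothetical, the whole body is assumed to be in memory, and trailer fields after the last chunk are ignored.

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings as a chunked message-body."""
    out = bytearray()
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the encoding early
            size_line = format(len(chunk), "x").encode("ascii")
            out += size_line + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # the zero-size chunk marks the end
    return bytes(out)


def decode_chunked(body: bytes) -> bytes:
    """Recover the original entity-body from a chunked message-body."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size is a string of hex digits; anything after ';' is a
        # chunk extension, which this sketch simply discards.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(decoded)
```

Because each chunk announces its own length, the sender never needs to know the total length in advance, which is why the response above carries no Content-Length header.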
Presence of message-body Not every message (request or response) can have a message-body The rules for when a message-body is allowed in a message differ for requests and responses.
Message-bodies in requests Presence of a message-body in a request is signaled by inclusion of a Content-Length or Transfer-Encoding header field in the request's message-headers. A message-body must NOT be included in a request if the specification of the request method (see later) does not allow sending an entity-body in requests.
Message-bodies in responses For response messages, whether or not a message-body is included with a message is dependent on both –the method used in the request which prompted the response and –the status-code (see later) in the status-line of the response. As we have already seen, no response to a HEAD method may include a message-body, even if entity-header fields are present. No response with one of the following status-codes may include a message-body: 1xx (informational), 204 (no content), and 304 (not modified). All other responses do include a message-body, although it may be of zero length.
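The rules for requests and responses on the last two slides can be written down as two small predicates. This is only a sketch: the function names are hypothetical, and the request check assumes the headers are supplied as a dict of field names to values.

```python
def request_has_body(headers: dict) -> bool:
    """A request signals a message-body with Content-Length or
    Transfer-Encoding; field names are case-insensitive."""
    names = {name.lower() for name in headers}
    return "content-length" in names or "transfer-encoding" in names


def response_may_have_body(request_method: str, status_code: int) -> bool:
    """Decide whether a response is allowed to carry a message-body."""
    if request_method == "HEAD":
        return False  # never, even if entity-header fields are present
    if 100 <= status_code < 200:
        return False  # 1xx (informational)
    if status_code in (204, 304):
        return False  # no content / not modified
    return True       # a body is included, possibly of zero length
```

A client can use the second predicate to know whether to start reading a body after the empty line that ends the header fields.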
General Header Fields These are header fields which can appear in both request and response messages. These header fields apply only to the message being transmitted (as opposed to the entity being carried by the message). The types of general-header-fields are Cache-Control: Connection: Date: Pragma: Trailer: Transfer-Encoding: Upgrade: Via: Warning: We have seen some of these headers already (e.g., Date: Connection: Transfer-Encoding: ). Some of the others may be presented later. Otherwise, use the web to read RFC 2616.
Requests
Request format Remember that a request message consists of –a request-line, –zero or more header-fields (general-headers or request-headers or entity-headers), –an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, –and (possibly) a message-body.
Request-line The first line of a request message, the request-line, contains – a method-token, followed by –the request-URI and –the protocol version, and ending with –a CRLF. These elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
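A strict reading of this slide can be sketched as a small parser. The function name parse_request_line is hypothetical; the sample request-line is the abs_path one used elsewhere in these notes. The sketch rejects anything that is not exactly three elements separated by single SP characters and ended by one CRLF.

```python
def parse_request_line(line: str):
    """Split a request-line into (method, request_uri, version)."""
    if not line.endswith("\r\n"):
        raise ValueError("request-line must end with CRLF")
    core = line[:-2]
    if "\r" in core or "\n" in core:
        raise ValueError("CR or LF is only allowed in the final CRLF")
    parts = core.split(" ")
    if len(parts) != 3 or "" in parts:
        raise ValueError("expected: method SP request-URI SP version")
    method, request_uri, version = parts
    return method, request_uri, version


parsed = parse_request_line("GET /cs1064/jabowen/ HTTP/1.1\r\n")
```

Real servers are often more lenient than this, for example about extra white-space, so the strictness here is a choice made for the illustration.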
Method-token The Method token indicates the method to be performed on the resource identified by the Request-URI. The token is case-sensitive. HTTP/1.1 defines the following method tokens: OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT The semantics of these predefined method tokens will be defined later. In addition to these predefined methods, HTTP/1.1 allows arbitrary “extension-methods” to appear in request-lines, provided sender and recipient programs have implemented semantics for them.
Request-URI The Request-URI is a Uniform Resource Identifier It identifies the resource upon which to apply the request. It must be one of the following forms: "*" | absoluteURI | abs_path | authority These options are dependent on the nature of the request.
Request-URI (contd.) The asterisk "*" request-URI means that the request applies to the server itself, rather than to any specific resource on the server –therefore, it is allowed only when the method used does not require a resource. One example request-line would be OPTIONS * HTTP/1.1 in which the client asks for the capabilities of the server.
Request-URI (contd.) We have already seen the abs_path form, as in GET /cs1064/jabowen/ HTTP/1.1 We could have used the absoluteURI form instead, as in GET HTTP/1.1 However, the absoluteURI form is required when the request is being made to a proxy. The proxy is requested to forward the request, or service it from a valid cache, and return the response. Note that the proxy may forward the request on to another proxy or directly to the server specified by the absoluteURI.
Request-URIs (contd.) The last form of Request-URI, the authority form, is only used by a method we have not seen yet –the CONNECT method, which is reserved by the protocol for use with a proxy that can dynamically switch to being a tunnel, e.g. in Secure Sockets Layer (SSL) tunneling.
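The four forms of Request-URI can be told apart with a simple heuristic, sketched below. The function name request_uri_form and the host www.example.com are hypothetical, and the test for the absoluteURI form (looking for "://") is a simplification that is adequate for http-style URIs but is not a full URI parser.

```python
def request_uri_form(request_uri: str) -> str:
    """Classify a Request-URI as one of the four allowed forms."""
    if request_uri == "*":
        return "asterisk"      # applies to the server itself
    if request_uri.startswith("/"):
        return "abs_path"      # usual form for requests to an origin server
    if "://" in request_uri:
        return "absoluteURI"   # required when talking to a proxy
    return "authority"         # host[:port], used only by CONNECT


forms = [
    request_uri_form("*"),
    request_uri_form("/cs1064/jabowen/"),
    request_uri_form("http://www.example.com/index.html"),
    request_uri_form("www.example.com:443"),
]
```

A server would normally combine this with the method-token, for instance accepting the authority form only when the method is CONNECT.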
Request Header Fields The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. The following are the types of request-header-fields defined in HTTP/1.1: Accept: Accept-Charset: Accept-Encoding: Accept-Language: Authorization: Expect: From: Host: If-Match: If-Modified-Since: If-None-Match: If-Range: If-Unmodified-Since: Max-Forwards: Proxy-Authorization: Range: Referer: TE: User-Agent: The Host: header, which we have seen before, must appear in all HTTP/1.1 requests The semantics of some of these fields will be given later. Otherwise, use the web to read RFC 2616
Identification of resource in a request The exact resource identified by an Internet request is determined by examining both –the Request-URI in the request-line and –the Host: request-header field. HTTP/1.1 allows origin servers to support several “virtual” hosts and the Host: header is used to distinguish among the virtual hosts supported by the server listening to a connection An origin server that does not support virtual hosts may ignore the Host: header field value when determining the resource identified by an HTTP/1.1 request. An origin server that does support virtual hosts must use the following rules for determining the requested resource on a HTTP/1.1 request:
Identification (contd.) 1. If the Request-URI in the Request-line is an absoluteURI, the host is part of the Request-URI. –so any Host: request-header in the request must be ignored. 2. If the Request-URI is not an absoluteURI, and the request includes a Host request-header, the host is determined by the value in the Host request-header. 3. If the host as determined by rule 1 or 2 is not a valid host on the server, the response must be a 400 (Bad Request) error message.
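The three rules can be sketched as one function for a server that supports virtual hosts. The name resolve_host, the set of valid hosts and the example URIs are hypothetical. The sketch returns None when the response should be 400 (Bad Request), and it also treats a missing Host: header as that case, on the basis that the header must appear in all HTTP/1.1 requests. Any port in the Host: value is dropped, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit


def resolve_host(request_uri, headers, valid_hosts):
    """Return the host a request is addressed to, or None for a 400."""
    if "://" in request_uri:
        # Rule 1: an absoluteURI carries the host; any Host: header
        # in the request is ignored.
        host = urlsplit(request_uri).hostname
    else:
        # Rule 2: otherwise the host comes from the Host: header.
        host = None
        for name, value in headers.items():
            if name.lower() == "host":
                host = value.strip().split(":")[0].lower()
    # Rule 3: a host that is not valid on this server means 400.
    if host is None or host not in valid_hosts:
        return None
    return host


# Hypothetical virtual hosts served by one server.
VALID = {"student.cs.ucc.ie", "www.example.com"}
```

For example, resolve_host("/index.html", {"Host": "student.cs.ucc.ie"}, VALID) selects the first virtual host, while a request naming a host outside VALID yields None.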