HTTP The HyperText Transfer Protocol
Objectives Introduce HTTP Introduce HTTP support in .NET
Content What’s the purpose? HTTP Messages – Bottom up Overview Requests and Responses State/Session Management: Cookies Security: Challenge and Response Authentication HTTP and .NET
What’s the End Goal? Make it possible to share information Publish some kind of resource Written information A software application Data from a database Whatever!
Overview of How it Works A “host” makes resources available A resource is identified by a Uniform Resource Identifier (URI) The host listens for requests for its resource(s) It listens on what is called a port The HTTP port can be any valid port number, but “80” is the default Clients request a resource from the host Provides a scheme: HTTP Provides a Uniform Resource Identifier (URI) May specify the port with which to talk The host responds!
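The overview above can be tried end to end on one machine. The following is a minimal sketch using Python's standard library, not part of the original deck: `HelloHandler`, `fetch_once` and the `/greeting` path are hypothetical names, and the host binds to an OS-chosen port on 127.0.0.1 instead of the default port 80.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that publishes one small text resource."""

    def do_GET(self):
        body = b"Hello from the host\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/greeting"):
    """Start a host, request one resource from it, and return the reply."""
    # Port 0 asks the OS for any free port; a public host would normally use 80.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client names the host, the port, and the resource path.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once()` returns the status code, the reason phrase, and the body that the host sent back.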
HTTP Defined 1/3 HTTP: HyperText Transfer Protocol Application level protocol HTTP communication usually takes place over TCP/IP This is not a requirement, but most often the case
HTTP, TCP/IP and the OSI Model [Diagram] OSI model layers: Application, Presentation, Session, Transport, Network, Data Link, Physical. TCP/IP layers: Application (HTTP/1.1), Transport, Internet, Network, Physical.
HTTP Request / Response in Action [Diagram] Over time, the Client sends an HTTP Request to the Server, and the Server returns an HTTP Response.
HTTP Defined 2/3 It is a request/response protocol A “client” sends a request to a “server” Requests are made to a specific resource – more later The “server” returns a response Message based communication
HTTP Defined 3/3 Designed for distributed, collaborative information systems Designed specifically for “HyperMedia” – or HyperText Generic, stateless protocol HTTP/1.1 extends the previous version, HTTP/1.0 Digest authentication, persistent connections, etc. The Web as you know it is built on HTTP!
HTTP/1.1 vs. HTTP/1.0 Persistent connections Default behavior is now: persistent connections Replace the practice of using “Keep-alive” messages Additional status codes 1xx status codes introduced
Protocol Parameters of Interest HTTP Version Uniform Resource Identifier (URI) Date and Time Formats Character sets Content codings Transfer codings Chunked transfer codings
Messages Only two types of messages in HTTP Request Response The two types of messages differ only in their “start line” Messages contain zero or more headers Provide information about the message Depend on the type and the message content May contain a message body
A Message by Example
HTTP/ OK
Server: Microsoft-IIS/5.0
Date: Tue, 27 Mar :35:30 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Tue, 27 Mar :34:52 GMT
ETag: "8c70de8ea9b6c01:d0d"
Content-Length: 488

Test Page For HTTP Test Page!
Message Dissected by Diagram Request Line Method Request URI HTTP Version Info Response Line (a.k.a. Status Line) HTTP Version Info Status Code Description Headers Message Body
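To make the two kinds of start line concrete, here is a small sketch that splits a start line into the parts named above. The function name `parse_start_line` and the returned dictionary keys are illustrative assumptions, and the parser is deliberately lenient compared with a real HTTP implementation.

```python
def parse_start_line(line):
    """Split an HTTP start line into its parts.

    A response (status) line starts with the HTTP version;
    a request line starts with the method.
    """
    parts = line.strip().split(" ", 2)
    if len(parts) < 2:
        raise ValueError("not an HTTP start line: %r" % line)

    if parts[0].startswith("HTTP/"):
        # Status line: version, 3-digit status code, optional description.
        reason = parts[2] if len(parts) == 3 else ""
        return {
            "type": "response",
            "version": parts[0],
            "status": int(parts[1]),
            "reason": reason,
        }

    if len(parts) != 3:
        raise ValueError("request line needs method, URI and version")
    method, uri, version = parts
    return {"type": "request", "method": method, "uri": uri, "version": version}
```

For example, `parse_start_line("GET /index.html HTTP/1.1")` yields a request with method `GET`, while `parse_start_line("HTTP/1.1 404 Not Found")` yields a response with status `404`.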
Message Body Overview Used to carry an entity body Entity body differs from message body when an “encoding” has been applied Example: the entity body is compressed It is a sequence of octets (8-bit bytes) May be divided into pieces and sent in chunks When size cannot be predetermined Reassembled during reception of the messages Messages do not have to have a message body Some messages cannot have a message body
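Reassembling a body that was sent in chunks can be sketched as follows. This assumes the HTTP/1.1 chunked transfer coding (each chunk is a hexadecimal size line, the data, and a CRLF, ending with a zero-size chunk); `decode_chunked` is a hypothetical helper that ignores trailers and does no error recovery.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a message body sent with the chunked transfer coding."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, ended by CRLF.
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

The sender never has to know the total length in advance: it only announces the size of each piece as it goes.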
Examples of a Message Body A Web page! The text to render as the page is the body Login information or other form data Shopping information – item you wish to buy Data from a data source
Overview of Headers Provide information about the message This may be about the entire message The length of the message Date or time when the message was generated The message body specifically Is it compressed or otherwise “transformed” in some way? Or the method Request information only after a certain date and/or time
Header Syntax Each message header is a name/value pair header name “:” header value The header value can be a comma-separated list Examples: Content-Encoding: gzip, abc, xyz Accept: audio/* Accept: text/html, text/plain, text/pdf Header names are case insensitive
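The header syntax above can be sketched in a few lines. `parse_headers` is a hypothetical helper that lower-cases header names and splits every value on commas; that is fine for list-valued headers such as Accept, but a real parser must not split headers like Date, whose value itself contains a comma.

```python
def parse_headers(lines):
    """Parse 'Name: value' lines into a dict of lower-cased names to value lists."""
    headers = {}
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()  # header names are case insensitive
        values = [item.strip() for item in value.split(",") if item.strip()]
        # Repeated headers are merged into one list, in order of appearance.
        headers.setdefault(key, []).extend(values)
    return headers
```

With this sketch, `Accept: text/html, text/plain` followed by `ACCEPT: audio/*` ends up as a single `accept` entry holding three media types.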
Types of Headers Several types of headers General Request Response Entity “Best Practice”: Order the headers from General to Entity
General Headers Applicable to both requests and responses Apply only to the transmitted message Examples of general headers: Connection: Connection options Date: Date & time at which message was originated Via: Used for tracking message forwards etc
Entity Headers Give meta-information About the entity-body being transferred Or, if no entity-body exists, about the resource of the request Apply only if a message body exists Examples of entity headers Allow: List of methods supported by the resource Content-Encoding: Indicates types of content codings applied Content-Language: Language of the intended audience Content-Length: Size of entity-body Expires: Date/time after which response is considered stale etc
Request Headers Additional information about the request May include information about the client (or sender) itself Examples of request headers Accept: Specifies media types acceptable for response Accept-Charset: Indicates acceptable character sets Accept-Encoding: Similar to Accept; specific to encodings Accept-Language: Limits response to preferred languages Host: Specifies the host & (optional) port of the resource etc
Response Headers More information than available from just the status line May be information about the server or the resource Examples of response headers Age: Estimate of time since response was generated ETag: Current value of the entity tag Location: Used to redirect to a different location (URI) Proxy-Authenticate: Proxy authentication challenge Retry-After: Expected time that a service will be unavailable Server: Information about the server software used WWW-Authenticate: Authentication challenge etc
Three Parts of a Request Line Request Method Request URI HTTP version information – which protocol are we using?
Request Methods Indicates the type of request to perform Request methods of interest GET (or retrieve) information from the resource server POST “the information” back to the resource server A few other request methods of interest DELETE “the information” from the resource server PUT “the information” at the resource location HEAD: Like GET but only returns meta-information OPTIONS: Gets the communication options available for the resource
Uniform Resource Identifier (URI) Identifies a (network) resource RFC 2396 defines syntax and semantics of URIs May be an absolute or relative address The resource syntax http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
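The pieces of the http_URL syntax can be pulled apart with Python's `urllib.parse`. This is an illustrative sketch: `split_http_url` is a hypothetical helper, and it fills in port 80 and the path `/` when the URL omits them.

```python
from urllib.parse import urlsplit


def split_http_url(url):
    """Break an http URL into host, port, abs_path and query."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("expected an http URL, got %r" % url)
    return {
        "host": parts.hostname,
        "port": parts.port or 80,       # the port is optional; 80 is the default
        "abs_path": parts.path or "/",  # an empty path means the root resource
        "query": parts.query,
    }
```

For instance, the made-up URL `http://www.example.com:8080/docs/index.html?lang=en` splits into host `www.example.com`, port `8080`, path `/docs/index.html`, and query `lang=en`.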
Uniform Resources: URI, URL, URN Three types of identifiers, all acceptable! Uniform Resource Identifier (URI) Uniform Resource Locator (URL) Uniform Resource Name (URN) No limits on character length of a URI But the server may “artificially” constrain length - typically 4-8 KB Examples of HTTP resource:
HTTP Version Used by sender to notify receiver of its abilities Version information is included in first line of message Uses "major.minor" numeric notation Examples: 1.0 or 1.1 Major number indicates the message format Minor number indicates extensions to the major format HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT Examples: HTTP/1.0 or HTTP/1.1
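The HTTP-Version rule maps directly onto a regular expression. The sketch below is one possible reading of that rule; `parse_http_version` is a hypothetical name, and it returns the two numbers as a tuple so that versions compare numerically.

```python
import re

# "HTTP" "/" 1*DIGIT "." 1*DIGIT
_HTTP_VERSION = re.compile(r"HTTP/(\d+)\.(\d+)")


def parse_http_version(text):
    """Return (major, minor) for a string such as 'HTTP/1.1'."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError("not an HTTP version: %r" % text)
    return int(match.group(1)), int(match.group(2))
```

Comparing tuples avoids the trap of comparing version strings as text, where a hypothetical `HTTP/1.10` would sort before `HTTP/1.2`.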
Response Line Dissected HTTP Version Information Status Code Status Description
Status Codes & Descriptions Status Code Conveys information about the response 3-digit result code Intended for use by automata Reason phrase or description Text description of the status code For presentation to the user Existing phrases are only suggestions - may be modified
Status Codes – 5 Categories 1xx: Informational Request received and processing is continuing 2xx: Success The action was successfully received, understood, & accepted 3xx: Redirection Further action must be taken to complete the request 4xx: Client Error A client error occurred 5xx: Server Error A server error occurred
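Because the first digit carries the category, a client can react sensibly even to a status code it does not recognize. A minimal sketch, with `status_category` as a hypothetical helper:

```python
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_category(code):
    """Map a 3-digit status code to one of the five categories."""
    if not 100 <= code <= 599:
        raise ValueError("not a valid HTTP status code: %r" % code)
    return CATEGORIES[code // 100]  # the first digit selects the category
```

So `status_category(302)` gives `"Redirection"` and `status_category(505)` gives `"Server Error"`.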
Status Codes of Interest 1/2 100: Continue Tells the client to continue with a request 200: OK The request has succeeded Information returned depends on the type of request 202: Accepted The request has been accepted but not processed 302: Found Resource requested found but temporarily moved
Status Codes of Interest 2/2 400: Bad Request The request could not be understood 401: Unauthorized The request requires proper authorization 403: Forbidden The client may not access the resource 500: Internal Server Error The server encountered an unexpected error The request was not fulfilled 505: HTTP Version Not Supported The server does not or will not support the HTTP version
Persistent Connections Default behavior of connections in HTTP/1.1 Faster and more efficient than “temporary” connections Fewer connections require fewer resources Requests and responses can be pipelined in one connection Reduced number of packets generated Reduced TCP handshaking performed Summary of Benefits Decreased Internet congestion Decreased load on the server: CPU, memory, etc
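One way to observe a persistent connection is to send several requests through a single client connection and check that the server sees the same client port each time. The sketch below assumes a local test server from Python's standard library; `PortEchoHandler` and `ports_seen_by_server` are hypothetical names, and the requests are sent one after another, not pipelined.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    """Replies with the client's TCP port so connection reuse becomes visible."""

    protocol_version = "HTTP/1.1"  # enables persistent connections on the server

    def do_GET(self):
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def ports_seen_by_server(request_count=3):
    """Send several requests over one connection; return the ports the server saw."""
    server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        ports = []
        for _ in range(request_count):
            conn.request("GET", "/")
            ports.append(conn.getresponse().read())
        return ports
    finally:
        conn.close()  # close the client side first so the server loop can exit
        server.shutdown()
        server.server_close()
```

If every entry in the returned list is identical, all requests travelled over one TCP connection, so only one handshake was needed.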
Cookies: State/Session Management HTTP is stateless by definition Achieve state/session management using cookies Defined and described in RFC 2965 Intent is to have 1 cookie per host or group of related hosts Created by the server and stored on the client Accomplished using the Set-Cookie2, Cookie and Cookie2 headers Contain attribute-value pairs Not designed or intended to hold authentication information Cookie information is unprotected
Baking and Eating Cookies State/session initiated by server – not the client Sends a response which includes the Set-Cookie2 header Set-Cookie2 may carry predefined attribute-value pairs Max-Age : Defines the maximum lifespan of the cookie Version : Version of the state management specification Discard : Tells client to discard the cookie when it terminates etc Client's subsequent requests include the Cookie header
Cookies in Action
Client: POST /foo/login HTTP/1.1 [some form data]
Server: HTTP/ OK Set-Cookie2: Customer="you"; Version="1"; Path="/foo"
Client: POST /foo/bar HTTP/1.1 Cookie: $Version="1"; Customer="you"; $Path="/foo" [some form data]
Server: HTTP/ OK...
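The same bake-and-return cycle can be sketched with Python's `http.cookies` module. Note the assumption: that module produces the classic Set-Cookie and Cookie header values, not the Set-Cookie2 form from RFC 2965, although the Path, Max-Age and Version attributes carry the same meaning. `make_set_cookie` and `read_cookie_header` are hypothetical helpers.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, path, max_age):
    """Server side: build the value of a Set-Cookie style header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path        # limits which request paths get the cookie
    cookie[name]["max-age"] = max_age  # maximum lifespan in seconds
    cookie[name]["version"] = 1        # version of the state management spec
    return cookie[name].OutputString()


def read_cookie_header(header_value):
    """Server side: parse the Cookie header value sent back by the client."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}
```

The server calls `make_set_cookie("Customer", "you", "/foo", 3600)` once, and on later requests recovers the state with `read_cookie_header`.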
HTTP/1.1 Authentication Basic and Digest Access Authentication Described and defined in RFC 2617 Supports basic authentication of HTTP/1.0 Adds digest based authentication Challenge / response authorization scheme Used for both basic and digest based authentication
Challenge / Response in Action [Diagram] The Client sends a Request; the Server answers with a Challenge; the Client sends a Response carrying its Credentials.
Basic Authentication User name and password are passed as clear text (Base64 encoded, not encrypted) Client requests a resource Server challenges the request Sends a 401 Unauthorized response Includes the WWW-Authenticate header Provides the realm or protected space accessed Client responds by resending request with credentials Includes the Authorization header
Basic Authentication in Action
Client: GET HTTP/1.1
Server: HTTP/ Unauthorized WWW-Authenticate: Basic realm="
Client: GET HTTP/1.1 Authorization: Basic user_id : password
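On the wire, the `user_id : password` pair is joined with a colon and Base64 encoded. The sketch below shows both directions to underline that this is an encoding, not encryption; the function names are hypothetical, and the credentials in the usage note are sample values.

```python
import base64


def basic_authorization(user_id, password):
    """Client side: build the value of the Authorization header."""
    raw = ("%s:%s" % (user_id, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic(header_value):
    """Server side (or an eavesdropper): recover user id and password."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Only the first colon separates the fields; passwords may contain colons.
    user_id, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user_id, password
```

`basic_authorization("Aladdin", "open sesame")` produces `Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==`, and `decode_basic` turns it straight back into the original pair, which is why Basic authentication should only travel over a protected channel.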
Digest Authentication 1/2 User name and password are not passed as clear text Client and server use a common hashing algorithm This algorithm is used to mask the user and password Same algorithm must be supported by both client and server Default hashing algorithm is MD5 Possible to define your own algorithm(s) Does not provide any encryption of the message Encryption can be done but is not part of the specification
Digest Authentication 2/2 Client requests a resource Server challenges Client responds Concatenates user name, realm and password user_name : realm : password Generates a hash using the concatenated value Sends the response Server uses the same algorithm to authorize the Client Server sends back an acknowledgment of success
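The hashing steps above can be sketched as follows. This assumes the simplest RFC 2617 form (MD5, no `qop` parameter), in which the hash of `user_name:realm:password` is combined with the server's nonce and a hash of the method and URI; `digest_response` and its parameter names are illustrative.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string, as used by the default Digest algorithm."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user_name, realm, password, method, uri, nonce):
    """Compute the 'response' value a client sends in the Authorization header."""
    ha1 = md5_hex("%s:%s:%s" % (user_name, realm, password))  # identifies the user
    ha2 = md5_hex("%s:%s" % (method, uri))                    # identifies the request
    # The nonce from the server's challenge ties the answer to this exchange.
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))
```

The server runs the same computation with the password it has on record and compares the two results, so the password itself never crosses the network.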
Digest Authentication in Action
Client: GET HTTP/1.1
Server: HTTP/ Unauthorized WWW-Authenticate: Digest realm="
Client: GET HTTP/1.1 Authorization: Digest user_name...
Server: Response with Authentication-Info header
System.Net : HTTP Support Extracted Provides simple interface to network protocols WebRequest & WebResponse Base classes for request/response model in .NET Protocol agnostic abstract classes Should not be created directly Use WebRequestFactory.Create(...) WebRequest req; req = WebRequestFactory.Create("
HTTP Support in System.Net HttpWebRequest : Derived from WebRequest HttpWebResponse : Derived from WebResponse HttpVersion : Encapsulates the HTTP version HttpStatusCode : Contains the HTTP status codes etc
HttpWebRequest HTTP specific implementation of WebRequest HttpWebRequest objects should not be created directly Create a WebRequest using the WebRequestFactory WebRequestFactory will decide if an HttpWebRequest is needed Provides methods to ease working with HTTP requests GetResponse : Gets the response from the request GetRequestStream : Gets a Stream to write the request data etc
Properties of Interest Method : Gets/sets the request method RequestURI : Gets the original request URI ProtocolVersion : HTTP version in use (1.0 or 1.1) Headers : Collection of request headers Additional components of an HTTP request
HttpWebResponse HTTP specific implementation of WebResponse HttpWebResponse objects should not be created directly Returned by call to WebRequest.GetResponse() Provides methods to ease working with HTTP responses GetResponseHeader : Gets the value of a specified header GetResponseStream : Gets a Stream for reading the response body etc
Properties of Interest ProtocolVersion : HTTP version in use (1.0 or 1.1) Status : Gets the status code StatusDescription : Gets the status description Headers : Collection of response headers etc
HttpWebRequest/Response in Action
using System;
using System.IO;
using System.Net;

// Issue a request...
HttpWebRequest req;
req = (HttpWebRequest) WebRequestFactory.Create("
// Retrieve the response...
HttpWebResponse result = (HttpWebResponse) req.GetResponse();
// Print the response...
Stream resStream = result.GetResponseStream();
Byte[] read = new Byte[512];
int bytes = resStream.Read(read, 0, 512);
Console.WriteLine("Your HTML...\r\n");
while (bytes > 0)
{
    // Decode only the bytes actually read in this pass
    Console.Write(System.Text.Encoding.ASCII.GetString(read, 0, bytes));
    bytes = resStream.Read(read, 0, 512);
}
Summary HTTP is an application protocol The World Wide Web runs on it It's a simple but robust message based protocol It's designed for more than just the Web HTTP is fully supported in .NET
Section 5: Q&A