Hypertext Transfer Protocol Anupam Joshi
HTTP/1.0 Basics
- TCP protocol (not required)
- Connection-oriented, 1 connection / request
- Stateless
- Request - Reply
- Version 1.0 is most prevalent; version 1.1 is picking up
HTTP Versions
- Old: HTTP/0.9
- Oldish: HTTP/1.0
- Currentish: HTTP/1.1
- When?: HTTP Next Generation
HTTP
- Chat between client and server in ISO Latin1 (negotiable in 1.1)
- CR LF separates lines in request/reply
- Format:
    request_method URL [protocol_version]
    header_field: header_field_data
    <blank line>
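As a rough illustration of the format above, the sketch below assembles a request as text lines joined by CR LF and encodes it as ISO Latin1. The function name `build_request`, the path `/index.html`, and the `User-Agent` value are hypothetical, chosen only for the example.

```python
CRLF = "\r\n"

def build_request(method, url, headers, version="HTTP/1.0"):
    """Assemble a request: request line, header fields, then a blank line."""
    lines = [f"{method} {url} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The first CRLF ends the last line; the second one is the blank line
    # that closes the header block.
    return (CRLF.join(lines) + CRLF + CRLF).encode("latin-1")

# Hypothetical path and header value, for illustration only.
request = build_request("GET", "/index.html", {"User-Agent": "demo/0.1"})
print(request)
```

The bytes returned here are what a client would write to the connection; nothing is sent over the network in this sketch.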
HTTP Requests
- Full request syntax:
    Method SP Request-URI SP HTTP-Version CRLF
- Methods:
  - GET: return requested doc
  - HEAD: return header info about requested doc
  - POST: treat doc as script and send data
  - PUT: replace doc with data
  - DELETE: delete doc
HTTP Requests
- Request-URI is an absolute URI (if server is a proxy) or absolute path
- Request header fields: Authorization, From, If-Modified-Since, Referer, User-Agent, Accept, Accept-Encoding
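The two Request-URI forms can be sketched as follows, assuming the client knows whether it is talking to a proxy. The helper name `request_uri` and the URL `http://www.example.com/docs/a.html?x=1` are hypothetical.

```python
from urllib.parse import urlsplit

def request_uri(url, via_proxy):
    """Return the Request-URI form suited to the next hop."""
    if via_proxy:
        # A proxy needs the absolute URI to know which server to contact.
        return url
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is sent as "/"
    return path + ("?" + parts.query if parts.query else "")

# Hypothetical URL, for illustration only.
url = "http://www.example.com/docs/a.html?x=1"
print(request_uri(url, via_proxy=True))
print(request_uri(url, via_proxy=False))
```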
HTTP Requests
- Request data: if POST or PUT, Content-Length bytes of data follow the empty line
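A minimal sketch of how the request data is framed: the Content-Length value is the byte count of the data, and the data starts right after the empty line. The function name `build_post`, the path `/cgi-bin/order`, the form data, and the use of a `Content-Type` field are assumptions made for the example.

```python
def build_post(path, body, content_type="application/x-www-form-urlencoded"):
    """Frame a POST request whose body is exactly Content-Length bytes."""
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # counted in bytes, not characters
        "\r\n"                               # empty line ends the headers
    )
    return head.encode("latin-1") + body

# Hypothetical path and form data, for illustration only.
message = build_post("/cgi-bin/order", b"item=42&qty=1")
print(message.decode("latin-1"))
```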
GET Requests
- Unconditional or conditional
- If-Modified-Since: date
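A conditional GET might be built as sketched below, assuming the client remembers when it last fetched the document as a Unix timestamp. The function name `conditional_get` and the path `/a.html` are hypothetical; `email.utils.formatdate` with `usegmt=True` produces a date in the form HTTP expects.

```python
from email.utils import formatdate

def conditional_get(path, last_fetch_epoch):
    """Build a GET that asks for the doc only if it changed since last fetch."""
    http_date = formatdate(last_fetch_epoch, usegmt=True)
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"If-Modified-Since: {http_date}\r\n"
        "\r\n"
    )

# Hypothetical path; timestamp 0 is the Unix epoch.
print(conditional_get("/a.html", 0))
```

If the document has not changed, the server can answer with 304 Not Modified and no body, so the client reuses its cached copy.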
HEAD Requests
- Same as GET, except no body
- No conditional requests
POST Requests
- Do something based on the URI given
- Content-Length bytes of data follow
- Can result in no reply or some reply
- Shouldn’t cache responses!
HTTP Responses
- Simple response: no header, just data [ONLY if HTTP/0.9 request or server]
- Full response syntax:
    status_line
    header_fields
    <blank line>
    data
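The full response layout above can be taken apart with a few lines of code. This is a sketch only: the function name `parse_response` and the sample response bytes are made up for illustration, and it assumes the whole response is already in memory.

```python
def parse_response(raw):
    """Split a full response into status line, header fields, and data."""
    # The first blank line (CRLF CRLF) separates the head from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    status_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # field names are case-insensitive
    return status_line, headers, body

# Hypothetical response, for illustration only.
raw = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status_line, headers, body = parse_response(raw)
print(status_line, headers, body)
```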
HTTP Responses
- Status line:
    HTTP-Version SP Status-Code SP Reason-Phrase CRLF
- Status code: 3-digit integer:
  - 1xx: informational (not used, but reserved)
  - 2xx: Success (action complete)
  - 3xx: Redirection (action incomplete)
  - 4xx: Client error (bad request)
  - 5xx: Server error (no can do)
- Reason phrase: a comment for humans
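Since the first digit of the status code carries the class, a client can decide what to do without knowing every individual code. The sketch below splits a status line and maps the code to its class; the names `parse_status_line` and `status_class` are hypothetical, and the parser assumes a non-empty reason phrase.

```python
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    # maxsplit=2 keeps spaces inside the reason phrase intact.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Classify a status code by its first digit."""
    return CLASSES.get(code // 100, "unknown")

version, code, reason = parse_status_line("HTTP/1.0 404 Not Found\r\n")
print(version, code, reason, status_class(code))
```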
2xx Status Codes
- 200 OK
- 201 Created: URL created by POST
- 202 Accepted: accepted for later processing
- 203 Partial Information: “unofficial” info
- 204 No Content: done, but no output
Other Status Codes
- 304 Not Modified: response to a conditional GET
- 401 Unauthorized: need authorization to complete
- 403 Forbidden: have info, but no can do
- 404 Not Found: huh?
- 500 Internal Error: ouch
Access Authentication
- Simple challenge-response authentication mechanism
- If no perms to get doc, server sends 401 (Unauthorized) + WWW-Authenticate field
    WWW-Authenticate: auth_scheme realm=realm_value params
- Client re-requests with Authorization field
    Authorization: auth_scheme stuff
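On the client side, handling the challenge starts with reading the scheme and realm out of the WWW-Authenticate value. The sketch below is a simplified parser that only understands quoted parameter values; the function name `parse_challenge` is hypothetical.

```python
import re

def parse_challenge(value):
    """Split a WWW-Authenticate value into (auth_scheme, params)."""
    scheme, _, rest = value.partition(" ")
    # Collect name="value" pairs; unquoted values are ignored in this sketch.
    params = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return scheme, params

scheme, params = parse_challenge('Basic realm="SLNet News"')
print(scheme, params)
```

A client would then look up credentials for that realm and repeat the request with an Authorization field.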
Basic Authorization Scheme
- Based on user-agent authenticating with user-ID + passwd for each realm
- Realm is an opaque string for equality comparison with others
- Example challenge:
    WWW-Authenticate: Basic realm="SLNet News"
Basic Authorization Scheme
- Client must send user-ID + passwd separated by ‘:’ in a base64 encoded string (<= 76 chars/line)
- Example response:
    Authorization: Basic QWxhZGRpbjpvc=Q2Ft
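The encoding step can be sketched with the standard `base64` module. The user-ID `alice` and password `secret` are made-up values, and the function names `basic_authorization` and `decode_basic` are hypothetical. The decoder is included to show that anyone who sees the header can recover the password, since base64 is an encoding and not encryption.

```python
import base64

def basic_authorization(user_id, password):
    """Build the Authorization field for the Basic scheme."""
    credentials = f"{user_id}:{password}".encode("latin-1")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"

def decode_basic(header_line):
    """Recover (user_id, password) from an Authorization: Basic field."""
    token = header_line.split(" ", 2)[2]
    decoded = base64.b64decode(token).decode("latin-1")
    # Split on the first ':' only, so the password may itself contain ':'.
    user_id, _, password = decoded.partition(":")
    return user_id, password

# Made-up credentials, for illustration only.
header = basic_authorization("alice", "secret")
print(header)
print(decode_basic(header))
```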
Basic Authorization Scheme
- Not secure!
- Assumes: connection between client and server is a trusted carrier
- Clients should implement to talk with servers that use it
Security Considerations
- Client authentication: basic isn’t safe
- Method safety: GET/HEAD should be just that
  - Allow clients to treat POST in a special way
  - Unannounced side effects of GET/HEAD: can’t hold user responsible!
- Abuse of server log information
Security Considerations
- Transfer of sensitive information: applications must be careful
  - Server: field can be abused by crackers
  - Referer: field can expose private stuff
  - From: field can break privacy or security policies
Problems with HTTP
- Doesn’t handle (well):
  - A whole class wanting to look at the same slides at once
  - Low bandwidth connections
  - “Flash crowds”
  - Pages containing dynamically updating text etc.
  - Disconnected browsing
- Bad network usage
- Issues: scaling, latency, bandwidth and disconnection
HTTP Next Generation
- Family of protocols:
  - caching and replication of servers
  - notification of changes
  - client/server transport
- Replacement of HTTP/1.x, not fix
HTTP-NG Proposal
- Multiple, asynchronous requests over a single connection
- Server responds in any order or interleaved: “parallel” transfer
- Session layer protocol implemented with separate channels for control and data
- One data channel for each object
- ASN.1 and PER for describing and encoding requests