Published by Eleanor McKinney. Modified over 9 years ago.
1
HTTP
Here, we examine the Hypertext Transfer Protocol (HTTP)
– originally introduced around 1990; version 1.0 was documented in 1996 (RFC 1945) and version 1.1 was standardized in 1997 (RFC 2068)
– the protocol permits the transfer of hypertext documents
  the request is usually generated by clicking on a hyperlink in a browser
– the server responds to the request and sends back the requested document
Diagram: a PC running Explorer and a Mac running Navigator each send an HTTP request to a server running the Apache web server, which sends back an HTTP response
– different types of machines can request the same resource
– Apache is just one of many web servers
2
Some Definitions
Client – the machine requesting a resource
– often through a web browser
Server – the machine that responds to requests and transfers documents to fulfill them
– usually a dedicated machine running some server software
Request – a message, containing an HTTP method (we will cover these shortly), sent from client to server
Response – the document/file requested, along with a message
– or the message alone if the document/file does not exist or the request was ill-formed or not understood
Header – both requests and responses begin with headers
– headers are usually not visible to the user
– a request header starts with the method (e.g., GET), the resource requested and the protocol/version
– we explore headers in more detail in a few slides
3
HTTP Methods
The method is the action that the client wishes the server to perform
– GET – request a resource, to be displayed in the web browser (if possible, else saved to disk)
– Conditional GET adds one of the following headers
  If-Modified-Since – comes with a specified date; the server returns the requested item only if it has been modified since that date
  If-Unmodified-Since – the reverse test: the server proceeds only if the item has not been modified since that date
  If-Match – comes with one or more entity tags (ETags); the server returns the resource only if its current ETag matches one of them
  If-None-Match – the server returns the resource only if its current ETag matches none of the listed tags
  If-Range – used together with a Range header: if the resource is unchanged, the server returns only the requested range, otherwise it returns the whole resource
– For example:
  GET /index.html HTTP/1.1
  If-Modified-Since: Mon, 11 Jan 2010 12:30:15 GMT
– GET is the most common method
– Conditional GETs spare the server's time and network bandwidth when resending the resource is unnecessary, for example when the client's cached copy is still current
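The If-Modified-Since test above can be sketched in a few lines. This is a minimal illustration, not how any particular web server implements it: the function name `conditional_get_status` and the sample modification times are assumptions made for the example, and the status code 304 (Not Modified) is what a server conventionally returns when the condition fails.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 200 if the resource changed after the client's date, else 304."""
    client_time = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) > client_time:
        return 200  # send the full resource
    return 304      # Not Modified: the client's copy is still current


# Header value taken from the slide's example; the two timestamps are made up.
header = "Mon, 11 Jan 2010 12:30:15 GMT"
newer = datetime(2010, 2, 1, 9, 0, 0, tzinfo=timezone.utc)
older = datetime(2009, 12, 25, 9, 0, 0, tzinfo=timezone.utc)

print(conditional_get_status(header, newer))
print(conditional_get_status(header, older))
```

A 304 response carries no body, which is where the saving in server time and bandwidth comes from.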
4
Other Methods
HEAD – return the header portion only, not the actual page
PUT – used to upload a page (or content)
– must be sent with the content to be uploaded
– should only be permitted once the user has been authenticated (allowing PUT without authentication is a security flaw)
POST – like PUT, sends content to the server, but the content is handed to the named resource for processing rather than stored at that URL
– this can be used to append data to a bulletin/posting board or database
OPTIONS – queries the web server to find out what methods are available for use
DELETE – used to delete the specified resource
TRACE – used for troubleshooting; the server echoes back the request it received so the client can see what intermediaries changed along the way
CONNECT – used in conjunction with a proxy server, to ask the proxy to open a tunnel to the destination
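The difference between GET, HEAD and OPTIONS can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: the `RESOURCES` store, the `ALLOWED` list and the `handle` function are hypothetical names invented for the example, and a real server does far more (authentication, PUT/POST/DELETE handling, and so on).

```python
# Hypothetical in-memory document store standing in for the server's files.
RESOURCES = {"/index.html": b"<html>hello</html>"}
# Methods this toy server chooses to support.
ALLOWED = ["GET", "HEAD", "OPTIONS"]


def handle(method: str, path: str):
    """Return a (status, headers, body) tuple for a request."""
    if method == "OPTIONS":
        # OPTIONS reports which methods may be used.
        return 200, {"Allow": ", ".join(ALLOWED)}, b""
    if method not in ALLOWED:
        # 405: method not allowed; the Allow header lists what is permitted.
        return 405, {"Allow": ", ".join(ALLOWED)}, b""
    body = RESOURCES.get(path)
    if body is None:
        return 404, {}, b""
    headers = {"Content-Type": "text/html", "Content-Length": str(len(body))}
    if method == "HEAD":
        # HEAD returns the same headers as GET but no body.
        return 200, headers, b""
    return 200, headers, body


print(handle("GET", "/index.html"))
print(handle("HEAD", "/index.html"))
print(handle("OPTIONS", "/index.html"))
print(handle("DELETE", "/index.html"))
```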
5
Headers
The header is a portion of the message transmitted
– if a request, the header is the request
– if a response, it precedes the resource being returned
Request headers will include
– the method, resource location, protocol and version
– host name
– user agent (browser) if sent by a browser, including the version of the browser and the preferred language (e.g., English)
– what form(s) of encoding are preferred
– how long the connection should remain active
Response headers will include
– protocol and version, status of the request (see next slide)
– date/time
– server name
– last modification date/time
– content-type
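To make the request-header layout concrete, the sketch below assembles a request as text: a request line (method, resource, protocol/version), then one `Name: value` line per header, then a blank line. The function name `build_request` is an assumption for illustration; the host name is the one used in the deck's example.

```python
def build_request(method: str, path: str, host: str, extra=None) -> str:
    """Assemble the text of an HTTP/1.1 request with no body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_request(
    "GET",
    "/",
    "www.alcpress.com",
    {"Accept-Language": "en", "Connection": "keep-alive"},
)
print(request)
```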
6
More on Headers
Four classes of headers
– general headers may appear in both requests and responses and include
  Connection – indicates whether the TCP connection should close at the end of the request or response or be persistent (the default in HTTP/1.1)
  Date – date/time at which the message was sent
  Transfer-Encoding – what type of encoding, if any, has been applied
  Warning – additional information about the status of the message
– request headers are sent when a browser makes a request of a server and may contain the following
  Accept – what types of media are acceptable to the client, given as MIME types, e.g., text/html, image/png, etc.
  Accept-Charset – what character sets are acceptable
  Accept-Encoding – what types of encoding *can* be applied
  Accept-Language – what language(s) is (are) preferred
  From and Host specifiers
  Conditions – If-Match, If-Modified-Since, If-Range, If-Unmodified-Since, Range
  User-Agent – the type of browser
7
Continued
Response headers – sent by the server to the requester (which may be a proxy server, a web browser, another program such as a web crawler, or a command-line tool such as nc or curl) and may contain
– Accept-Ranges – whether the server accepts range requests for the resource (e.g., bytes)
– ETag – an identifier for a particular version of the resource; Apache has traditionally generated it from the file's inode, size and modification time
– Server – information about the server (web server software and version, platform)
Entity headers – describe a document carried in a message, whether returned in a response or sent to the server via POST, PUT, etc.
– Allow – lists the set of methods the server supports for the resource
– Content-Encoding, Content-Language, Content-Length, Content-Range, Content-Type – information about the document being sent
– Last-Modified – if the item being sent already existed, when it was last modified
8
Examples
Example GET header
  GET / HTTP/1.1
  Host: www.alcpress.com
  User-agent: Mozilla/5.0...
  Accept: */*
  Accept-Language: en
  Accept-Encoding: gzip,deflate,compress,identity
  Keep-Alive: 300
  Connection: keep-alive
Example response from a GET request
  HTTP/1.1 200 OK
  Date: Tue, 07 Aug 2001 23:06:18 GMT
  Server: Apache 1.3.20
  Cache-Control: max-age=604800
  Expires: Tue, 14 Aug 2001 23:06:18 GMT
  Last-Modified: Tue, 06 Feb 2001 20:16:28 GMT
  Etag: 1033e-607-3a7fd5d0
  Accept-Ranges: bytes
  Content-Length: 2357
  Keep-Alive: timeout=15, max=100
  Connection: keep-alive
  Content-Type: text/html

  [data]
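A client has to split a response like the one above into its status line and header fields before it can act on them. The following is a rough sketch of that step, using a few of the header lines from the example; `parse_response_head` is a name chosen for illustration, and the parser is deliberately simplified (it keeps only the last value when a header repeats and ignores the body).

```python
def parse_response_head(raw: str):
    """Split a response into (version, status code, reason, headers dict)."""
    lines = raw.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # blank line: end of headers, the body follows
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Tue, 07 Aug 2001 23:06:18 GMT\r\n"
    "Server: Apache 1.3.20\r\n"
    "Last-Modified: Tue, 06 Feb 2001 20:16:28 GMT\r\n"
    "Content-Length: 2357\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)
version, code, reason, headers = parse_response_head(raw)
print(version, code, reason)
print(headers["content-type"], headers["content-length"])
```

Splitting on the first colon only matters here: the Date value itself contains colons, so a naive `split(":")` would break it apart.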
9
Status Codes
See Appendix A for the complete list
– 100 codes – informational
  100 – continue, 101 – switching protocols
– 200 codes – success
  200 – request succeeded, 201 – resource created, 202 – request accepted, 204 – request succeeded but no content sent back, 205 – reset content
– 300 codes – redirection (URL redirected to a different resource)
  300 – multiple choices, 301 – resource permanently moved, 302 – resource temporarily moved, 305 – use proxy
– 400 codes – client error codes
  400 – bad request, 401 – unauthorized, 402 – payment required, 403 – forbidden, 404 – not found, 405 – method not allowed, 406 – not acceptable, 408 – request timeout, 410 – gone
– 500 codes – server error codes
  500 – internal server error, 501 – not implemented, 503 – service unavailable, 504 – gateway timeout
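Because the first digit of a status code determines its class, a client can react sensibly even to a code it does not recognize. The sketch below shows one way to express that, using Python's standard `http.HTTPStatus` table for the reason phrases; the `CLASSES` mapping and the `describe` function are names made up for the example.

```python
from http import HTTPStatus

# The five classes from the slide, keyed by the first digit of the code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code with its standard phrase and its class."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not in the standard library's table.
        phrase = "unregistered"
    return f"{code} {phrase} ({category})"


for code in (200, 301, 404, 503, 299):
    print(describe(code))
```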
10
URLs
The URL is the specification of the resource
– [protocol:]//host[:port][path/file][?query]
  protocol is typically http but could be https or ftp or other
  port defaults to 80 for http but can be overridden, for instance if the client knows that a different port should be used to fulfill the given request
  path specifies where to look in the web server's document space; servers may have defaults if the file is omitted (e.g., index.html, index.php, index.cgi)
  query passes parameters to the resource, usually as name=value pairs (e.g., to select a database record)
  URI is a more generic form of identifier used in the semantic web (the book will use URL & URI interchangeably)
– Unencoded, URLs consist only of letters, digits, and the characters $ - _ . + ! * ' ( ) , plus reserved characters such as / ? : & = used for their reserved purposes; anything else must be percent-encoded
– The path portion of a URL may be case sensitive (true for Linux/Unix servers, not necessarily true for Windows servers)
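The URL parts listed above can be pulled apart with Python's standard `urllib.parse` module. This is a small sketch: the `describe_url` function and the `DEFAULT_PORTS` table are assumptions for the example, and www.example.com is a placeholder host, not one from the slides.

```python
from urllib.parse import parse_qs, urlsplit

# Well-known default ports, used when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url: str) -> dict:
    """Break a URL into protocol, host, port, path and query parameters."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "query": parse_qs(parts.query),
    }


print(describe_url("http://www.example.com:8080/docs/index.html?id=42&lang=en"))
print(describe_url("http://www.example.com"))
```

`parse_qs` returns a list for each parameter because a name may legitimately appear more than once in a query string.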
11
Negotiation
In some cases, a request does not precisely match a resource, in which case negotiation may take place
– Language negotiation – if a file exists in multiple languages and the client has specified a preference, the server will respond with the document that fits the most preferred language if possible
  Accept-Language: de,en-us;q=0.7,en;q=0.3
  – request German first and, if not available, then American English and finally any other English
– Content negotiation – the client states its preference among media types as a prioritized list of MIME types
  Accept: image/png,image/jpeg;q=0.8,image/gif;q=0.5
– Content coding – lists what type(s) of encoding can be used to help reduce the message traffic over the Internet
  These may include gzip (or x-gzip), compress (or x-compress), deflate and identity (no encoding)
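The q-values in these headers are weights between 0 and 1, with 1 assumed when none is given. A server-side selection step might look roughly like the sketch below. The function names `parse_preferences` and `choose_language` are invented for the example, and the matching is intentionally simplified to exact tag comparison; real servers also handle wildcards and prefix matches such as `en` covering `en-gb`.

```python
def parse_preferences(header: str):
    """Parse an Accept-style header into (value, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        value, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((value.lower(), q))
    # sorted() is stable, so equal weights keep their order in the header.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header: str, available):
    """Pick the most preferred language that the server actually has."""
    for lang, q in parse_preferences(header):
        if q > 0 and lang in available:
            return lang
    return None  # nothing acceptable; a server would fall back to a default


header = "de, en-us;q=0.7,en;q=0.3"
print(parse_preferences(header))
print(choose_language(header, {"en", "fr"}))
```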
12
Other Topics
Caching – to reduce Internet traffic, caching can take place in three different locations
– web browser (client) caching
– server caching
– proxy caching
  we cover proxy caching in chapter 11
Cookies
– HTTP is a stateless form of communication: the protocol itself keeps no record of earlier requests in the conversation
– a cookie is a file that stores the state (e.g., passwords, preferred pages, contents of shopping carts)
– since cookie information is meant to be transmitted to a server, cookies can represent security holes
– what if a cookie is set up by server1 but server2 asks for that information?
– cookies can also violate privacy
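The cookie mechanism, and the server1/server2 question raised above, can be sketched with Python's standard `http.cookies` module. The cookie names, values and host names below are made up for illustration, and `should_send` is a hypothetical, simplified version of the domain check a browser performs before attaching a cookie to a request.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie header that stores state.
jar = SimpleCookie()
jar["session_id"] = "abc123"
jar["session_id"]["domain"] = "server1.example.com"
jar["session_id"]["path"] = "/"
set_cookie_value = jar["session_id"].OutputString()
print("Set-Cookie:", set_cookie_value)

# Server side, on a later request: read the Cookie header the client sent back.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
state = {name: morsel.value for name, morsel in incoming.items()}
print(state)


def should_send(cookie_domain: str, request_host: str) -> bool:
    """Simplified check: send a cookie only to its own domain or a subdomain."""
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


print(should_send("server1.example.com", "server1.example.com"))
print(should_send("server1.example.com", "server2.example.com"))
```

Under this check a cookie set by server1 is never sent to server2, which is the basic protection against one site reading another site's state.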