1 HTTP – HyperText Transfer Protocol Part 1
2 Common Protocols
In order for two remote machines to “understand” each other they should
–“speak the same language”
–coordinate their “talk”
The solution is to use protocols
Examples:
–FTP – File Transfer Protocol
–SMTP – Simple Mail Transfer Protocol
–NNTP – Network News Transfer Protocol
–HTTP – HyperText Transfer Protocol
3 Why Was HTTP Needed?
According to Tim Berners-Lee (1991), a protocol was needed with the following features:
–A subset of the file transfer protocol
–The ability to request an index search
–Automatic format negotiation
–The ability to refer the client to another server
4 [Figure: an HTTP request for http://www.cs.huji.ac.il/~dbi passes through a proxy server to the Web server at www.cs.huji.ac.il:80, which reads from its file system; the HTTP responses travel back along the same path]
5 [Figure: a chain of proxies (Department Proxy Server, University Proxy Server, Israel Proxy Server) leading to the Web server www.w3.org:80]
6 Terminology
User agent: client which initiates a request (browser, editor, Web robot, …)
Origin server: the server on which a given resource resides (Web server a.k.a. HTTP server)
Proxy: acts as both a server and a client
Gateway: server which acts as intermediary for other servers
Tunnel: acts as a blind relay between two applications
–we can implement a custom protocol using HTTP tunneling
7 Resources
A resource is a chunk of information that can be identified by a URL (Uniform Resource Locator)
A resource can be
–A file
–A dynamically created page
What we see in the browser can be a combination of several resources
8 Uniform Resource Locator
protocol://host:port/path?parameters#anchor
http://www.cs.huji.ac.il/~dbi/index.html#info
http://www.google.com/search?hl=en&q=blabla
There are other types of URLs
–mailto:
–news:
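The parts of a URL can be pulled apart with Python's standard urllib.parse module. This is only a sketch: the helper name split_url and the dictionary keys (chosen to match the slide's vocabulary) are illustrative assumptions, not part of the original slides.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the parts named on the slide (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,          # None when the URL gives no explicit port
        "path": parts.path,
        "parameters": parts.query,   # text after "?" and before "#"
        "anchor": parts.fragment,    # text after "#"
    }


info = split_url("http://www.cs.huji.ac.il/~dbi/index.html#info")
search = split_url("http://www.google.com/search?hl=en&q=blabla")
```

Note that the anchor is used by the browser only; it is not sent to the server as part of the request.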
9 In a URL
In the parameters part, spaces are represented by “+”
Characters such as &, +, % are encoded in the form “%xx”, where xx is the ASCII value of the character in hexadecimal; for example, “&” = “%26”
The inputs to the parameters are given as a list of pairs of a parameter and a value: var1=value1&var2=value2&var3=value3
10 war&peace Tolstoy
11 http://www.google.com/search?hl=en&q=war%26peace+Tolstoy
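The encoding of the search text "war&peace Tolstoy" can be reproduced with urllib.parse.urlencode from Python's standard library. This is a minimal sketch; the variable names are illustrative.

```python
from urllib.parse import urlencode, parse_qs

# Parameter/value pairs, as typed by the user (not yet encoded).
pairs = {"hl": "en", "q": "war&peace Tolstoy"}

# urlencode replaces "&" inside a value with %26 and each space with "+".
query = urlencode(pairs)
url = "http://www.google.com/search?" + query

# parse_qs reverses the encoding; each value comes back as a list.
decoded = parse_qs(query)
```

Encoding the "&" inside the value is what keeps it from being read as a separator between two parameters.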
12 An HTTP Session
A basic HTTP session has four phases:
1. Client opens the connection (a TCP connection)
2. Client makes a request
3. Server sends a response
4. Server closes the connection
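The four phases can be sketched with plain TCP sockets. Everything here is an assumption made for illustration: the toy server (serve_once), the client (fetch), the demo wrapper, and the "hello" body are hypothetical, and the server runs on the loopback interface so that the sketch needs no network access.

```python
import socket
import threading


def serve_once(listener):
    """Toy server: accept one connection, answer one request, then close."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:      # read until the blank line
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
        body = b"hello"
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)           # phase 3: server sends a response
    # leaving the "with" block closes the socket: phase 4


def fetch(host, port, path):
    """Toy client: one request, then read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:  # phase 1
        request = f"GET {path} HTTP/1.0\r\nHost: {host}:{port}\r\n\r\n"
        sock.sendall(request.encode("ascii"))                        # phase 2
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:                   # empty read: server closed
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo():
    """Run the toy server on a free loopback port and fetch one page."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))         # port 0: let the OS pick a port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling demo() returns the raw bytes of the response, status line and headers included.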
13 Nesting in Page
[Figure: Index.html with a left frame and a right frame, containing the items "Jumping fish", "Fairy icon" and "HUJI icon"]
What we see in the browser can be a combination of several resources
14 Nested Objects Suppose a client accesses a page containing 10 inline images, how many sessions will be required to display the page completely? The answer is 11 HTTP sessions – why? Some browsers/servers support a feature called keep-alive which can keep the connection open until it is explicitly closed How can this help?
15 Stateless Protocol
HTTP is a stateless protocol, which means that once a server has delivered the requested data to a client, the server retains no memory of what has just taken place (even if the connection is keep-alive)
What are the difficulties in working with a stateless protocol?
How would you implement a site for buying some items?
So why don’t we have states in HTTP?
16 The Format of HTTP Requests and Responses
–An initial line
–Zero or more header lines
–A blank line (i.e., a CRLF by itself), and
–An optional message body (e.g., a file, query data, or query output)
Note: CRLF = “\r\n” (ASCII 13 followed by ASCII 10)
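The four-part layout can be illustrated by splitting a raw message at the first blank line. This is a simplified sketch: the function name parse_message is hypothetical, header names are lower-cased for convenience, and repeated or folded headers are not handled.

```python
def parse_message(raw):
    """Split raw HTTP bytes into (initial line, headers dict, body bytes)."""
    # The first CRLF CRLF is the blank line that ends the headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body


raw_request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www.cs.huji.ac.il:80\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
)
initial_line, headers, body = parse_message(raw_request)
```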
17 Headers
HTTP 1.0 defines 16 headers
–None are required
HTTP 1.1 defines 46 headers
–One header (Host:) is required in every HTTP 1.1 request, including requests sent to a proxy (which also carry the full URL in the request line)
–A response does not have to include any header
How do we know who is the host when there is no Host header (e.g., in an HTTP 1.0 request)?
18 HTTP Requests
19 The Format of a Request
method SP URL SP version CRLF
header : value CRLF
…
header : value CRLF
CRLF
Entity Body
(the "header : value" lines are the header lines)
20 Request Example
GET /index.html HTTP/1.1 [CRLF]
Accept: image/gif, image/jpeg [CRLF]
User-Agent: Mozilla/4.0 [CRLF]
Host: www.cs.huji.ac.il:80 [CRLF]
Connection: Keep-Alive [CRLF]
[CRLF]
21 Request Example
GET /index.html HTTP/1.1
Accept: image/gif, image/jpeg
User-Agent: Mozilla/4.0
Host: www.cs.huji.ac.il:80
Connection: Keep-Alive
[blank line here]
The first line holds the method, the request URL and the version; the lines after it are the headers
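The example request can be assembled programmatically by joining the request line and the header lines with CRLF. This is a minimal sketch; the helper build_request and its parameters are illustrative assumptions, not a standard API.

```python
def build_request(method, url, headers, version="HTTP/1.1", body=b""):
    """Assemble raw request bytes from a request line, headers and a body."""
    lines = [f"{method} {url} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Each line ends with CRLF; one extra CRLF forms the blank line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/index.html",
    [
        ("Accept", "image/gif, image/jpeg"),
        ("User-Agent", "Mozilla/4.0"),
        ("Host", "www.cs.huji.ac.il:80"),
        ("Connection", "Keep-Alive"),
    ],
)
```

Headers are passed as a list of pairs rather than a dictionary so that their order in the output matches the order in which they were given.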
22 Request Methods
23 Common Request Methods GET returns the contents of the indicated document HEAD returns the header information for the indicated document –Useful for finding out info about a resource without retrieving it POST treats the document as an application and sends some data to it
24 More Request Methods PUT replaces the content of the document with some data DELETE deletes the indicated document TRACE invokes a remote loop-back of the request. The final recipient SHOULD reflect the message back to the client Usually these methods are not allowed
25 GET Request A request to get a resource from the Web The most frequently used method The request has no message body, but parameters can be sent in the request URL (i.e., the URL without the host part)
26 HEAD Request A HEAD request asks the server to return the response headers only, and not the actual resource (i.e., no message body) This is useful for checking characteristics of a resource without actually downloading it, thus saving bandwidth Used for testing hypertext links for validity, accessibility and recent modification
27 Post Request POST request can send data to the server POST is mostly used in form-filling –The data filled into the form are translated by the browser into some special format and sent to a program on the server using the POST command
28 Post Request (cont.) There is a block of data sent with the request, in the message body There are usually extra headers to describe this message body, like Content-Type: and Content-Length: The request URL is a URL of a program to handle the sent data, not a file The HTTP response is normally the output of a program, not a static file
29 Post Example
Here's a typical form submission, using POST:
POST /path/register.cgi HTTP/1.0
From: frog@cs.huji.ac.il
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 35

home=Ross+109&favorite+flavor=flies
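The body and the Content-Length value of this form submission can be derived from the form fields. The sketch below assumes that the user typed "Ross 109" and "flies" into fields named "home" and "favorite flavor", which is what the encoded body implies; the variable names are illustrative.

```python
from urllib.parse import urlencode

# Form fields as filled in by the user (before encoding).
fields = {"home": "Ross 109", "favorite flavor": "flies"}

# Encode the pairs the way a browser does for
# Content-Type: application/x-www-form-urlencoded.
body = urlencode(fields).encode("ascii")

# Content-Length is the size of the body in bytes.
request = (
    "POST /path/register.cgi HTTP/1.0\r\n"
    "From: frog@cs.huji.ac.il\r\n"
    "User-Agent: HTTPTool/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
).encode("ascii") + body
```

Computing the length from the encoded bytes, rather than typing it by hand, keeps the header consistent with the body when the fields change.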
30 Request Headers
31 HTTP 1.1 Request Headers The common request headers of HTTP 1.1 are described in the following slides –Accept –Accept-Encoding –Authorization –Connection –Cookie –Host –If-Modified-Since –Referer –User-Agent
32 Accept Request Headers Accept –Specifies the MIME types that the client can handle (e.g., text/html, image/gif) –Server can send different content to different clients Accept-Encoding –Indicates encodings (e.g., gzip) client can handle
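How a server might use the Accept header to pick among several versions of a resource can be sketched as follows. This is a deliberately simplified illustration: the function choose_type is hypothetical, and the quality values (q=...) that real negotiation takes into account are ignored.

```python
def choose_type(accept_header, available):
    """Return the first available MIME type that the client accepts, or None."""
    # Drop any parameters such as ";q=0.8" and surrounding spaces.
    accepted = [item.split(";")[0].strip() for item in accept_header.split(",")]
    for media_type in available:
        for pattern in accepted:
            if pattern in ("*/*", media_type):
                return media_type
            # A pattern like "image/*" matches every subtype of "image".
            if pattern.endswith("/*") and media_type.startswith(pattern[:-1]):
                return media_type
    return None


chosen = choose_type("image/gif, image/jpeg", ["image/png", "image/jpeg"])
```

The order of the available list expresses the server's preference; the first entry that the client can handle wins.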
33 More Accept Request Headers Accept-Charset Accept-Language
34 Authorization Request Header Authorization –User identification for password-protected pages –Instead of HTTP authorization, use HTML forms to send username/password and store in state (e.g., session object )
35 Connection Request Header Connection –Connection: keep-alive means that the browser can handle persistent connection –Keep-alive is the default in HTTP 1.1 –In a persistent connection, the server can reuse the same socket over again for requests that are very close together from the same client –Connection: close means that the connection is closed after each request
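The rules for the Connection header can be condensed into a small decision function. This is a sketch under stated assumptions: the name is_persistent is hypothetical, and headers is assumed to be a dictionary keyed by the exact name "Connection".

```python
def is_persistent(version, headers):
    """Decide whether the connection should stay open after the response."""
    value = headers.get("Connection", "")
    tokens = {token.strip().lower() for token in value.split(",")}
    if "close" in tokens:
        return False                 # an explicit close always wins
    if version == "HTTP/1.1":
        return True                  # keep-alive is the default in HTTP 1.1
    return "keep-alive" in tokens    # HTTP 1.0 must ask for it explicitly
```

With a persistent connection, a page with 10 inline images no longer needs a new TCP connection for every image.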
36 Content-Length Request Header
This header applies to requests that carry a message body, most commonly POST requests
It specifies the size of the message body (e.g., the POST data) in bytes
37 Cookie Request Header
Returns to the server the cookies that it previously sent to the client
Not in the HTTP 1.1 specification, but is widely supported (originally, a Netscape extension)
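On the server side, the value of the Cookie header can be read with Python's standard http.cookies module. This is a minimal sketch; the helper name parse_cookie_header and the cookie names "user" and "cart" are made up for illustration.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value):
    """Turn the value of a Cookie request header into a plain dictionary."""
    jar = SimpleCookie()
    jar.load(value)                  # parses "name=value; name=value"
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("user=frog; cart=42")
```

Because the client sends these values back on every request, a server can use them to recognise a returning client even though HTTP itself is stateless.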
38 Host Request Header Indicates host and port as given in the original URL –Required in HTTP 1.1 Needed due to request forwarding and machines that have multiple hostnames
39 If-Modified-Since Request Header
This header indicates that the client wants the page only if it has been changed after the specified date
If-Unmodified-Since is the reverse of If-Modified-Since
–It is used for PUT requests (“update this document only if nobody else has changed it since I generated it”)
40 The Format of the Date in If-Modified-Since and in If-Unmodified-Since
Greenwich Mean Time should be used and the format is:
If-Modified-Since: Fri, 31 Dec 1999 23:59:59 GMT
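Dates in this format can be produced and parsed with Python's standard email.utils module, which is enough to sketch the server's decision for a conditional GET. The helpers http_date and status_for are hypothetical names; 304 (Not Modified) is the status a server returns when the page has not changed.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def http_date(moment):
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return formatdate(moment.timestamp(), usegmt=True)


def status_for(last_modified, if_modified_since):
    """Return 200 if the page changed after the given date, else 304."""
    since = parsedate_to_datetime(if_modified_since)
    return 200 if last_modified > since else 304


header_value = http_date(datetime(1999, 12, 31, 23, 59, 59, tzinfo=timezone.utc))
```

The last_modified argument must be timezone-aware so that it can be compared with the parsed header date.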
41 Referer Request Header URL of referring Web page Useful for tracking traffic It is logged by many servers Can be easily spoofed Note the spelling error – correct spelling is Referrer, but use Referer
42 User-Agent Request Header The value of this header is a string identifying the browser making the request Use sparingly Again, can be easily spoofed