HyperText Transfer Protocol (HTTP) RICHI GUPTA CISC 856: TCP/IP and Upper Layer Protocols Fall 2007 Thanks to Dr. Amer, UDEL for some of the slides used in this presentation Thanks to Madhusri Nayak for some of the slides used in this presentation
Motivation: single informational network; light protocol; speed. Tim Berners-Lee, Director of the W3C. HTTP Versions. Format: HTTP/<major>.<minor>. HTTP/0.9 – No RFC; HTTP/1.0 – RFC 1945; HTTP/1.1 – RFC
Position of HTTP in the TCP/IP protocol suite: Application layer (HTTP); Transport layer (TCP); Network layer (IP, with IGMP, ICMP, ARP, RARP); Data link layer and Physical layer (underlying LAN or WAN).
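Request-Response Protocol: starting from a URL, the User-Agent (browser/client) sends a DNS query to the DNS server and receives a DNS response, opens a TCP connection (plus optional additional TCP connections) to the origin server, sends the HTTP request, and receives the HTTP response.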
A-PDU format. Request message: Request Line; General Headers; Request Headers; Entity Headers; a blank line; Body. Response message: Status Line; General Headers; Response Headers; Entity Headers; a blank line; Body. Note: each line ends with the CR LF control characters.
A-PDU format (cont'd). Request Line: Request Type, space, URL, space, HTTP version. Request types: GET, HEAD, POST, PUT, TRACE, CONNECT, OPTIONS. Header format: Header Name, colon, space, Header Value. General headers: Date, Pragma, Cache-Control, Connection, MIME-Version, Upgrade, Transfer-Encoding. Request headers: From, Referer, User-Agent, Authorization, If-Modified-Since, Accept*. Entity headers: Content-Length, Content-Type, Content-Encoding, Last-Modified, Expires. Response headers: Location, Age, Retry-After, Server.
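The request layout described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a full HTTP client: the helper name `build_request` and the host `www.example.com` are invented for the example, and only bodiless requests are handled.

```python
CRLF = "\r\n"  # every line of the message ends with CR LF


def build_request(method, url, headers, version="HTTP/1.1"):
    """Assemble a bodiless request: request line, headers, blank line."""
    # Request line: request type, space, URL, space, HTTP version.
    lines = [f"{method} {url} {version}"]
    # Each header is written as "Name: Value".
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The empty string produces the blank line that ends the header section.
    lines.append("")
    return (CRLF.join(lines) + CRLF).encode("ascii")


# Hypothetical host and path, chosen only for illustration.
message = build_request(
    "GET", "/index.html", {"Host": "www.example.com", "Connection": "Keep-Alive"}
)
print(message.decode("ascii"))
```

The blank line is what lets the receiver tell where the headers stop, so the sketch always emits it even when no headers are given.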
A-PDU format (cont'd). Status Line: HTTP Version, space, Status Code, space, Status Phrase. 1xx: Informational: Continue (100), Switching Protocols (101). 2xx: Success - action was successfully received, understood, and accepted: OK (200), Created (201), Accepted (202), No Content (204). 3xx: Redirection - further action needed to complete request: Moved Permanently (301), Moved Temporarily (302), Not Modified (304). 4xx: Client Error - request contains bad syntax or cannot be fulfilled: Bad Request (400), Unauthorized (401), Forbidden (403), Not Found (404). 5xx: Server Error - server failed to fulfill an apparently valid request: Internal Server Error (500), Not Implemented (501), Bad Gateway (502), Service Unavailable (503).
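As a rough sketch of how a client might read the status line, the snippet below splits it into its three fields and maps the first digit of the code to the classes listed above. The function name `parse_status_line` is an assumption made for this example.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-version SP status-code SP status-phrase' into its parts."""
    # maxsplit=2 keeps multi-word phrases such as "Not Found" intact.
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    status = int(code)
    # The first digit of the status code selects the class.
    return version, status, phrase, STATUS_CLASSES[status // 100]


print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```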
Example of Request/Response. Note: headers are in ASCII format.
4 variations of HTTP: nonpersistent with one connection; nonpersistent with parallel connections; persistent without pipelining; persistent with pipelining.
Nonpersistent (HTTP/1.0 default)
Client/server exchange for the web page: 3-way handshake (SYN, SYN-ACK, ACK); client sends GET web page; server replies HTTP/1.0 OK and the web page is transferred (data, acknowledged by the client); connection close (FIN/ACK exchange). The client then parses the HTML web page: 1. Found referenced object "Image 1". 2. Found referenced object "Image 2".
Nonpersistent (cont'd). Image 1: new 3-way handshake (SYN, SYN-ACK, ACK); client sends GET image1; server replies HTTP/1.0 OK and Image 1 is transferred (data, ack); connection close (FIN/ACK exchange). Image 2: another 3-way handshake (SYN, SYN-ACK, ACK); client sends GET image2; server replies HTTP/1.0 OK and Image 2 is transferred (data, ack); connection close (FIN/ACK exchange).
Key points: the connection does not persist for other objects; connections are sequential.
Rough calculation for number of RTTs (web page, Image 1, Image 2): each object costs one RTT of delay due to the connection request/handshake plus one RTT of delay due to its request (HTML page request or object request), so three objects cost 3 x 2 RTTs. Time delay in RTTs = 6. Can we reduce the number of RTTs?
Nonpersistent with parallel connections (browser dependent)
Parallel connections. Web page: 3-way handshake (SYN, SYN-ACK, ACK); client sends GET web page; server replies HTTP/1.0 OK and the web page is transferred (data, ack); connection close (FIN/ACK exchange). The client parses the HTML web page: 1. Referenced object "Image 1". 2. Referenced object "Image 2". The client then opens two connections in parallel. Image 1: 3-way handshake; GET image1; HTTP/1.0 OK; Image 1 transferred; connection close. Image 2: 3-way handshake; GET image2; HTTP/1.0 OK; Image 2 transferred; connection close.
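Rough calculation (web page, then Image1 & Image2 in parallel): the web page costs one RTT for the connection request/handshake plus one RTT for the HTML page request; the two images are fetched at the same time, so together they cost one more RTT for the handshake plus one RTT for the object request. Time delay in RTTs = 4.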
Disadvantages: overhead of multiple TCP connections; a busy server could end up with many connections in the TIME-WAIT state; a connection seldom gets past the slow-start region, so it fails to use the full end-to-end available bandwidth; extra time opening connections increases user-perceived latency. Can HTTP be further improved?
Persistent without pipelining
Client/server exchange: 3-way handshake (SYN, SYN-ACK, ACK); client sends GET web page; server replies HTTP/1.1 OK and the web page is transferred (data, ack); client sends GET image1; server replies HTTP/1.1 OK and Image 1 is transferred; client sends GET image2; server replies HTTP/1.1 OK and Image 2 is transferred; timer started; on time out, connection close (FIN/ACK exchange). Note: 1) Requests are sequential. 2) The timer is at the application layer.
Rough calculation (web page, Image1, Image2): one RTT of delay due to the connection request/handshake, one RTT due to the HTML page request, and one RTT due to each of the two object requests. Time delay in RTTs = 4.
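To make the persistent, non-pipelined pattern concrete, here is a small sketch built on Python's standard `http.server` and `http.client` modules. It is only an illustration under assumptions: the loopback server stands in for an origin server, and `Handler`, `CountingServer`, and `fetch_sequentially` are hypothetical names invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = f"contents of {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


class CountingServer(HTTPServer):
    """Hypothetical helper that counts accepted TCP connections."""

    accepted = 0

    def get_request(self):
        request = super().get_request()
        self.accepted += 1
        return request


def fetch_sequentially(paths):
    """Fetch every path over one TCP connection, one request at a time."""
    server = CountingServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)  # sent only after the previous response was read
            bodies.append(conn.getresponse().read().decode("ascii"))
        return bodies, server.accepted
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


bodies, connections = fetch_sequentially(["/page.html", "/image1", "/image2"])
print(connections, bodies)
```

All three objects travel over the single connection the server accepted, which is the saving over the nonpersistent case; each request still waits for the previous response, which is what pipelining removes.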
Persistent with pipelining
Client/server exchange: 3-way handshake (SYN, SYN-ACK, ACK); client sends GET web page; server replies HTTP/1.1 OK and the web page is transferred (data, ack). The client parses the web page and sends back-to-back requests: GET image1, GET image2. The server replies HTTP/1.1 OK with Image 1, then HTTP/1.1 OK with Image 2. Timer started; on time out, connection close (FIN/ACK exchange).
Rough calculation (web page, then Image1 & Image2 back to back): one RTT of delay due to the connection request/handshake, one RTT due to the HTML page request, and one RTT shared by the two pipelined object requests. Time delay in RTTs = 3.
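The four rough calculations can be collected into one small function. This is a back-of-the-envelope sketch, not a performance model: the function name `rtts` and the variation labels are assumptions for the example, it ignores transmission time, slow start, and connection teardown, and the parallel case assumes the browser opens enough connections to fetch every embedded object at once.

```python
def rtts(variation, embedded_objects):
    """Rough RTT count for one page plus `embedded_objects` referenced objects."""
    handshake, request = 1, 1  # one RTT each
    if variation == "nonpersistent":
        # Every object pays for its own handshake and request, one after another.
        return (1 + embedded_objects) * (handshake + request)
    if variation == "nonpersistent-parallel":
        # Page first, then all objects at once over parallel connections.
        extra = (handshake + request) if embedded_objects else 0
        return (handshake + request) + extra
    if variation == "persistent":
        # A single handshake, then one request per object in sequence.
        return handshake + request * (1 + embedded_objects)
    if variation == "persistent-pipelined":
        # A single handshake, the page request, then all object requests back to back.
        extra = request if embedded_objects else 0
        return handshake + request + extra
    raise ValueError(f"unknown variation: {variation}")


for name in ("nonpersistent", "nonpersistent-parallel", "persistent", "persistent-pipelined"):
    print(name, rtts(name, 2))
```

With two embedded images the function reproduces the figures from the slides: 6, 4, 4, and 3 RTTs. Raising `embedded_objects` shows that only the two sequential variations grow with the number of objects.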
Advantages: fewer connections; reduced network traffic; CPU time is saved in routers and hosts; reduced perceived latency on subsequent requests; either client or server can close the connection. Disadvantages: connections stay open longer at the server.
FTP vs HTTP. FTP: 1 RTT control-channel OPEN; 0.5 RTT send request on control channel; 1 RTT file-channel OPEN; 0.5 RTT file starts to arrive on file channel; Ftrans = time to transmit the file; 3 RTT + Ftrans = time to get the first FTP file. HTTP: 1 RTT channel OPEN; 0.5 RTT send request; 0.5 RTT file starts to arrive; Ftrans = time to transmit the file; 2 RTT + Ftrans = time to get a file in HTTP.
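Summing the per-step delays listed above gives a quick way to compare the two protocols. The sketch below only adds up those steps; the function names and the sample values (a 100 ms RTT and a 50 ms transmission time) are assumptions chosen for illustration.

```python
def ftp_first_file_time(rtt, ftrans):
    """Control-channel open + request + file-channel open + first bytes arriving."""
    return (1 + 0.5 + 1 + 0.5) * rtt + ftrans


def http_file_time(rtt, ftrans):
    """Channel open + request + first bytes arriving."""
    return (1 + 0.5 + 0.5) * rtt + ftrans


# Illustrative values in milliseconds, not measurements.
rtt_ms, ftrans_ms = 100, 50
print(ftp_first_file_time(rtt_ms, ftrans_ms), http_file_time(rtt_ms, ftrans_ms))
```

Under these assumptions the difference is exactly one RTT, the cost of opening FTP's second channel.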
Experimental Results. Figure 6-1: Latencies for a remote server, image size = 2544 bytes. [Plot: network latency (seconds) vs. number of inlined images; curves: NP HTTP/1.0 without parallel connections, persistent without pipelining, persistent with pipelining.]
Experimental Results (cont'd). Figure 6-2: Latencies for a remote server, image size = bytes. [Plot: network latency (seconds) vs. number of inlined images; curves: NP HTTP/1.0 without parallel connections, persistent without pipelining, persistent with pipelining.]
Cache. Eliminate the need to send requests to origin servers: reduces the number of network round-trips (expiration mechanism). Eliminate the need to send full responses: reduces network bandwidth requirements (validation mechanism). [Diagram: Client 1 and Client 2 send HTTP requests to a proxy server; on a MISS the proxy forwards the HTTP request to the origin server and relays its HTTP response; on a HIT the proxy returns the HTTP response itself.]
Expiration model: explicit expiration times (Expires header / max-age directive); heuristic expiration times. Validation model: cache validators (e.g., Last-Modified dates); the server attaches a validator to the full response; the user agent or proxy cache includes the associated validator in a later request; the server then checks the validator and returns either a special status code (304 Not Modified) or a full response.
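The two models can be sketched as a pair of small functions: one freshness test for the expiration model and one server-side check of a Last-Modified validator for the validation model. This is a simplified illustration, not the full rule set of the specification; the function names and the dates in the usage lines are made up for the example.

```python
from email.utils import parsedate_to_datetime


def is_fresh(current_age, max_age):
    """Expiration model: a cached copy is fresh while its age is below max-age (seconds)."""
    return current_age < max_age


def validate(if_modified_since, last_modified):
    """Validation model, server side: check the validator sent by the cache."""
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        if current <= cached:
            return 304, "Not Modified"  # the cache may reuse its copy; no body is sent
    return 200, "OK"  # full response, carrying a new validator


# Hypothetical dates, chosen only for illustration.
print(validate("Mon, 01 Oct 2007 12:00:00 GMT", "Mon, 01 Oct 2007 12:00:00 GMT"))
print(validate("Mon, 01 Oct 2007 12:00:00 GMT", "Tue, 02 Oct 2007 08:00:00 GMT"))
```

A fresh copy needs no request at all, while a stale copy costs one round trip but, on a 304, no retransmission of the body.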
What clients control: max-age: the response's age is no greater than the specified number of seconds; min-fresh: the response will stay fresh for at least the specified number of seconds; max-stale: the response has exceeded its expiration time by no more than the specified number of seconds. What servers control: cacheable/non-cacheable object; cacheable at proxy; cached object expiration time; operations performed on copy.
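The three client directives can be read as conditions on a cached response's age and freshness lifetime. The sketch below is a simplified reading that ignores server-side directives; the function name `acceptable` and its parameters are assumptions made for the example.

```python
def acceptable(age, freshness_lifetime, max_age=None, min_fresh=None, max_stale=None):
    """Decide whether a cached response satisfies the client's directives.

    All values are in seconds. `age` and `freshness_lifetime` describe the
    cached response; the keyword arguments are the client's directives.
    """
    if max_age is not None and age > max_age:
        return False  # older than the client accepts
    remaining = freshness_lifetime - age  # negative once the copy is stale
    if min_fresh is not None and remaining < min_fresh:
        return False  # would not stay fresh long enough
    if remaining < 0:
        # Stale: acceptable only within the client's max-stale allowance.
        return max_stale is not None and -remaining <= max_stale
    return True


print(acceptable(50, 100))                  # fresh copy, no directives
print(acceptable(50, 100, max_age=30))      # too old for this client
print(acceptable(120, 100, max_stale=30))   # stale by 20 s, within the allowance
```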
Content negotiation: multiple representations (variants) of a single resource; the process of selecting the best representation for a given response. Types: Server-driven negotiation: the selection algorithm is located at the server. Agent-driven negotiation: selection is done by the user agent from the list of available representations within the header fields or entity-body of the initial response; this negotiation is performed in 2 steps. Transparent negotiation: a combination of both server-driven and agent-driven negotiation.
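One possible shape for server-driven negotiation is to rank the available variants by the quality values in the client's Accept header. The sketch below is a simplified selection algorithm, not the one any particular server uses: the function names are assumptions, and it ignores media-type parameters other than `q`.

```python
def parse_accept(header):
    """Return {media-range: q} from an Accept header value (q defaults to 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs[parts[0]] = q
    return prefs


def quality(media_type, prefs):
    """The most specific matching range wins: exact type, then type/*, then */*."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in prefs:
            return prefs[candidate]
    return 0.0


def choose_variant(accept_header, variants):
    """Pick the available representation the client rates highest, or None."""
    prefs = parse_accept(accept_header)
    best = max(variants, key=lambda v: quality(v, prefs))
    return best if quality(best, prefs) > 0 else None


accept = "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"
print(choose_variant(accept, ["text/plain", "text/html"]))
```

In agent-driven negotiation the same ranking would run in the user agent instead, after a first response has listed the variants, which is why that style needs two steps.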
Messages for
Request Header:
GET / HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: ) Gecko/ Firefox/
Host:
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en,en-us;q=0.5;q=0.5
Accept-Encoding: none
Accept-Charset: ISO ,utf-8;q=0.7,*;q=0.7
Connection: Keep-Alive
Response Header:
HTTP/ OK
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=b77cac251a771420:TM= :LM= :S=rI0Vm3o4ZErGKlM8; expires=Fri, 09-Oct :47:59 GMT; path=/; domain=.google.com
Server: gws
Content-Length: 5471
Date: Wed, 10 Oct :47:59 GMT
Summary: 4 variations of HTTP. Nonpersistent with 1 connection: one TCP connection at a time; server initiates connection close. Nonpersistent with parallel connections: more than one TCP connection at a time; server initiates connection close. Persistent without pipelining: one TCP connection at a time; sequential requests for embedded web page objects; server or client initiates connection close. Persistent with pipelining: one TCP connection at a time; back-to-back requests for embedded web page objects; server or client initiates connection close.
Questions? Do you know? IE will open only 2 parallel HTTP connections to a named server by default. Do you know? Firefox will open 4 parallel HTTP connections to a named server by default. Do you know? Pipelining is implemented entirely at the browser end. Thanks