How does it work?

Hypertext Transfer Protocol (HTTP)
- A communication protocol
- It is used to communicate between a client and a web server
- It is the network protocol of the web
- Let us take a look at a typical scenario

1) Typing a URL
- The browser/client parses the URL
- The URL pattern is: protocol://server:port/requestURI?arg1=val1&…&argN=valN
  - Protocol: in our case HTTP
  - Server: server location (e.g. www.NBA.com)
  - Port: port that the web server listens on (default: 80)
  - Request-URI: web server resource (e.g. index.html)
  - Arg: argument name (e.g. username)
  - Val: value for the argument (e.g. JohnnyCash)
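
The URL parts listed above can be pulled apart with Python's standard library. This is only a sketch: the function name `parse_url` and the dictionary keys are made up for illustration, and falling back to port 80 assumes the protocol is plain HTTP.

```python
from urllib.parse import urlsplit, parse_qs


def parse_url(url):
    """Split a URL into the parts named on the slide (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,          # hostname is returned in lower case
        # No explicit port in the URL: assume HTTP's default port 80.
        "port": parts.port or 80,
        "request_uri": parts.path,
        # parse_qs maps each argument name to a list of its values.
        "args": parse_qs(parts.query),
    }


if __name__ == "__main__":
    print(parse_url("http://www.NBA.com/index.html?username=JohnnyCash"))
```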

2) Sending HTTP-Request
- The browser decides which information to send
- The browser sends a text message, called a request, to the server
- Request pattern:
    [METHOD] [REQUEST-URI] HTTP/[VER]
    [fieldname1]: [field-value1]
    [fieldname2]: [field-value2]

    [request body, if any]
- The server knows how to handle/parse a request
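
To make the request pattern concrete, here is a minimal sketch that assembles such a text by hand. `build_request` is a hypothetical helper, not what a real browser runs; it only shows that lines end with CRLF and that an empty line separates the header fields from the body.

```python
def build_request(method, request_uri, headers, body="", version="1.1"):
    """Assemble the text of an HTTP request (illustrative helper)."""
    lines = [f"{method} {request_uri} HTTP/{version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # An empty line marks the end of the header fields; the body follows it.
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(build_request("GET", "/index.html", {"Host": "www.nba.com"}))
```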

Request methods
GET method
- Data is visible in the URL
- GET requests can be cached
- GET requests remain in the browser history
- GET requests should never be used when dealing with sensitive data
- GET requests have length restrictions (URL length is limited in practice)
- GET requests should be used only to retrieve data
POST method
- Data is not displayed in the URL
- POST requests are not cached by default
- POST requests do not remain in the browser history
- POST requests cannot be bookmarked
- POST data is not subject to URL length restrictions
And more: HEAD, DELETE, ...

Request example
    GET /players/mJordan/info.html HTTP/1.1
    Host: www.nba.com
    User-Agent: Mozilla/5.0 (Windows;) Gecko Firefox/3.0.4
    Accept: text/html,application/xhtml+xml,application/xml;
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

    [no body]

Conditions in request
- It's possible to add conditions to the HTTP request
- Header fields: If-Match, If-None-Match, If-Range, If-Unmodified-Since, If-Modified-Since
- Servers along the way can change the request
  - What do we call these servers?
  - Why would we use it?
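
As a rough illustration of how a server might act on two of these conditions, the sketch below answers 304 (Not Modified) when the client's copy is still current and 200 otherwise. The function `evaluate_conditions` and its parameters are assumptions made for this example; entity tags are compared as plain strings, which ignores the weak/strong distinction, and the other If-* headers are not handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def evaluate_conditions(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    Hypothetical helper: `last_modified` must be a timezone-aware datetime.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match takes precedence over If-Modified-Since.
        if "*" in client_tags or current_etag in client_tags:
            return 304
        return 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304
    return 200


if __name__ == "__main__":
    modified = datetime(1999, 12, 31, 12, 0, 0, tzinfo=timezone.utc)
    headers = {"If-Modified-Since": "Fri, 31 Dec 1999 23:59:59 GMT"}
    print(evaluate_conditions(headers, '"abc"', modified))
```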

3) Server Processing
- The web server listens on specific ports (usually 80)
- It receives the request and parses it
- A typical web server (for the GET method) maps the resource URI to a location on its local hard drive
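
The mapping from resource URI to the local drive could look roughly like the sketch below. The helper `resolve_resource`, the `/var/www` document root, and the choice of `index.html` for directory requests are all assumptions for illustration; real servers make these configurable. The check at the end refuses URIs that try to climb out of the document root with `..`.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_resource(document_root, request_uri):
    """Map a request URI to a file path under document_root (illustrative).

    Returns None when the URI would escape the document root.
    """
    root = Path(document_root).resolve()
    # Drop the query string and decode %xx escapes.
    path = unquote(urlsplit(request_uri).path)
    if path.endswith("/"):
        path += "index.html"  # assumed default document
    candidate = (root / path.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        return None
    return candidate


if __name__ == "__main__":
    print(resolve_resource("/var/www", "/players/mJordan/info.html"))
    print(resolve_resource("/var/www", "/../etc/passwd"))
```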

4) Server response
- The server sends information and content back to the user agent
- Response pattern:
    HTTP/1.0 code text
    Field1: Value1
    Field2: Value2

    ...Document content here... (e.g. HTML code)

Status codes
- The status code is a three-digit integer; the first digit identifies the general category of the response:
  - 1xx indicates an informational message (e.g. 100 Continue)
  - 2xx indicates success of some kind (e.g. 200 OK)
  - 3xx redirects the client to another URL
    - 301: Moved Permanently
    - 302: Found (called "Moved Temporarily" in HTTP/1.0)
  - 4xx indicates an error on the client's part
    - 400: Bad Request (mostly bad syntax)
    - 401: Unauthorized (e.g. wrong user/pass)
    - 403: Forbidden (e.g. client not allowed)
    - 404: Not Found
  - 5xx indicates an error on the server's part
    - 500: Internal Server Error
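
A small sketch of the "first digit gives the category" rule. The `CATEGORIES` table and `describe_status` are names invented for this example; the reason phrases come from Python's `http.HTTPStatus` enum, which follows the current HTTP specifications.

```python
from http import HTTPStatus

# First digit of the status code -> general category.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    category = CATEGORIES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not one of the registered statuses known to Python.
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(describe_status(code))
```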

Response example
    HTTP/1.0 200 OK
    Date: Fri, 31 Dec 1999 23:59:59 GMT
    Content-Type: text/html
    Content-Length: 1354

    <html>
    <body>
    <h1>Hello World</h1>
    </body>
    </html>
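
On the client side, a response like the one above has to be split into status line, header fields, and content. The sketch below does this for a complete response held in a string; `parse_response` is a hypothetical helper that assumes CRLF line endings and ignores details such as repeated header fields.

```python
def parse_response(raw):
    """Split raw HTTP response text into its parts (illustrative helper)."""
    # The first empty line separates the header section from the content.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only: values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "code": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }


if __name__ == "__main__":
    raw = (
        "HTTP/1.0 200 OK\r\n"
        "Date: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<html><body><h1>Hello World</h1></body></html>"
    )
    print(parse_response(raw))
```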

Persistence (HTTP)
- When the browser receives and renders HTML, it sends new requests to get any resources the HTML points to (e.g. images)
- In HTTP 1.0, the connection is closed by default after a single request/response pair
- Each request creates a new TCP connection
- Creating a TCP connection is slow (the handshake costs an extra round trip before any data is sent)

Persistence (cont'd)
- HTTP 1.1 (1999) keep-alive mechanism: the browser creates one TCP connection and sends many requests through it
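
The difference can be observed with a small local experiment, sketched here with Python's standard library. A throwaway server on 127.0.0.1 records the client-side TCP port of every request: two requests sent through one connection share a port, while two separate connections use two ports. The names `PortRecordingHandler`, `fetch`, and `compare_connections` are invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortRecordingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    seen_ports = []                # client TCP port of each request, in order

    def do_GET(self):
        type(self).seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(conn, path):
    conn.request("GET", path)
    return conn.getresponse().read()


def compare_connections():
    """Return (ports seen over one shared connection, ports seen over two)."""
    PortRecordingHandler.seen_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortRecordingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        shared = http.client.HTTPConnection("127.0.0.1", port)
        fetch(shared, "/a")
        fetch(shared, "/b")          # reuses the same TCP connection
        first = http.client.HTTPConnection("127.0.0.1", port)
        second = http.client.HTTPConnection("127.0.0.1", port)
        fetch(first, "/a")
        fetch(second, "/b")          # a second, separate TCP connection
        for conn in (shared, first, second):
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    ports = list(PortRecordingHandler.seen_ports)
    return ports[:2], ports[2:]


if __name__ == "__main__":
    persistent, separate = compare_connections()
    print("one connection:", persistent)
    print("two connections:", separate)
```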

Stateless
- The HTTP protocol doesn't save any information from one request that can be used in a later request
- Advantage? If a client dies mid-transaction, no part of the system needs to be responsible for cleaning up the present state of the server
- Disadvantage? Keeping track of state becomes the responsibility of whoever uses the protocol (e.g. the application, with cookies or session IDs)

Caching
Browser caching
- The browser caches most resources in a temporary folder
- The browser sends a request to check whether it has the most up-to-date version of the resource
Proxy caching
- Some servers along the way hold a cache of resources

"Talking" with the server
- The user communicates with the web server through HTML forms
- The user fills in the form and hits the submit button, upon which the data is submitted to the server
- The <FORM> tag has a method attribute
  - E.g. <form method="post">

Submitting forms
- GET: form data are encoded into the URL
  - Disadvantages?
- POST: the HTTP request will include a body that contains the parameters
- Rule of thumb: primarily, POST should be used when the request causes a permanent change
- See example: form_get.html
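
The sketch below shows where the same form data ends up under each method. `encode_form` and the `/login` action are hypothetical; the encoding itself is the standard `application/x-www-form-urlencoded` format produced by `urllib.parse.urlencode`.

```python
from urllib.parse import urlencode


def encode_form(method, action, fields):
    """Return (request_target, body) for a form submission (illustrative)."""
    data = urlencode(fields)
    if method.upper() == "GET":
        # GET: the parameters become part of the URL; there is no body.
        return f"{action}?{data}", ""
    # POST: the URL stays clean and the parameters travel in the body.
    return action, data


if __name__ == "__main__":
    fields = {"username": "JohnnyCash"}
    print(encode_form("get", "/login", fields))
    print(encode_form("post", "/login", fields))
```

With GET the argument is visible in the request target (and therefore in history and bookmarks), while with POST it only appears in the request body.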

Static Web Pages
- Stored as files in the file system
- Delivered to the user exactly as stored
- Same information for all users, from all contexts, via HTTP
- Maintaining large numbers of static pages as files can be impractical

Cascading Style Sheets (CSS)
- CSS is a style sheet language used to describe the presentation of a document written in HTML
- Styles define how to display HTML elements
- External style sheets can save you a lot of work
- External style sheets are stored in CSS files
- HTML code becomes cleaner and less messy
- HTML pages become richer and more user-friendly
- See example: Zen Garden

Dynamic web pages
- A web page whose content varies based on parameters provided by a user or a computer program
- Client-side scripting: generally event driven, using the DOM elements
- Server-side scripting: the server's response is affected by HTML forms, browser type, etc.
- Combination: using Ajax

Dynamic web pages (cont.)
Client-side scripting (JavaScript, Flash, etc.)
- Client-side content is generated on the user's local computer system
- Event-driven
- Can appear in events on HTML or separately
- Pros: nice and dynamic
- Cons: slow, browsers behave differently

Dynamic web pages (cont.)
Server-side scripting (PHP, Perl, Ruby, etc.)
- The server processes the script on request
- The server returns HTML generated for the specific client
- Stateful behavior on a stateless protocol
- See example: WebGT
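
One common way to get stateful behavior on top of a stateless protocol is a session ID that the client sends back with every request. The sketch below is a simplified, in-memory illustration and is not taken from the WebGT example: `SESSIONS`, `handle_request`, and the `session_id` cookie name are all assumptions.

```python
import secrets
from html import escape

# session id -> per-visitor state, kept in the server's memory (assumption:
# a real deployment would use persistent or shared storage and expiry).
SESSIONS = {}


def handle_request(cookies, name="guest"):
    """Generate a page for one request; returns (html, cookies_to_set)."""
    session_id = cookies.get("session_id")
    if session_id not in SESSIONS:
        # Unknown or missing id: start a new session for this client.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"visits": 0}
    state = SESSIONS[session_id]
    state["visits"] += 1
    html = (
        "<html><body>"
        f"<h1>Hello {escape(name)}</h1>"
        f"<p>Visit number {state['visits']}</p>"
        "</body></html>"
    )
    return html, {"session_id": session_id}


if __name__ == "__main__":
    page, cookies = handle_request({})
    print(page)
    page, cookies = handle_request(cookies)  # client sends the cookie back
    print(page)
```

Each request is still independent at the HTTP level; the continuity comes only from the identifier the client returns and the state the server looks up with it.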
Combination - Basic Ajax Not a cleaning spray
Combination - Basic Ajax Not a soccer team

Combination - Basic Ajax
- Asynchronous JavaScript And XML
- Sending requests to the server without refreshing the page
- Using JavaScript code
- The client side uses callback functions to manipulate server responses
- Do you remember such behavior from Facebook?
- See example: WebGT

Pros and Cons
- Pros
  - Better layout
  - Efficiency: instead of getting an entire page, we retrieve only the needed data
  - Reduces the number of connections (css/images/js)
- Cons
  - The browser can't register an Ajax action as a different page, so the back button does not work for it
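
The efficiency point can be illustrated from the server's side: for an Ajax request the server only needs to return the data that changed, not the whole page. The sketch below uses made-up team names, function names, and JSON (rather than XML) as the data format, all of which are assumptions for illustration.

```python
import json


def render_full_page(home, away, home_score, away_score):
    """What a classic page reload would transfer (illustrative)."""
    return (
        "<html><head><title>Scores</title></head><body>"
        f"<h1>{home} vs {away}</h1>"
        f'<p id="score">{home_score} - {away_score}</p>'
        "</body></html>"
    )


def render_score_update(home_score, away_score):
    """What an Ajax request could transfer instead: just the new data."""
    return json.dumps({"home": home_score, "away": away_score})


if __name__ == "__main__":
    page = render_full_page("Bulls", "Jazz", 98, 95)
    update = render_score_update(98, 95)
    print(len(page), "bytes for the full page")
    print(len(update), "bytes for the update")
```

The client-side script would then use the returned values to change only the score element of the page it already has.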