27.1 Chapter 27 WWW and HTTP Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
ARCHITECTURE
The WWW today is a distributed client/server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called sites.
Topics discussed in this section:
  Client (Browser)
  Server
  Uniform Resource Locator
  Cookies
27.3 Figure 27.1 Architecture of WWW
27.4 Figure 27.2 Browser
27.5 Figure 27.3 URL
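As a rough illustration of how a browser might take a URL apart, the sketch below uses Python's standard `urllib.parse`. The example URL and the function name `url_parts` are made up for illustration, and the fallback to port 80 assumes an `http` URL.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a URL into the pieces a browser needs to reach a server."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # The port is optional in a URL; assume HTTP's well-known port 80 if absent.
        "port": parts.port if parts.port is not None else 80,
        "path": parts.path,
    }

print(url_parts("http://www.example.com:8080/docs/index.html"))
```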
WEB DOCUMENTS
The documents in the WWW can be grouped into three broad categories: static, dynamic, and active. The category is based on the time at which the contents of the document are determined.
Topics discussed in this section:
  Static Documents
  Dynamic Documents
  Active Documents
27.7 Figure 27.4 Static document
27.8 Figure 27.5 Boldface tags
27.9 Figure 27.6 Effect of boldface tags
27.10 Figure 27.7 Beginning and ending tags
27.11 Figure 27.8 Dynamic document using CGI
27.12 Figure 27.9 Dynamic document using server-site script
27.13 Note: Dynamic documents are sometimes referred to as server-site dynamic documents.
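To make the static/dynamic distinction concrete, here is a minimal sketch. The names `STATIC_DOCUMENT` and `dynamic_document` are hypothetical; a real server would run a CGI program or a server-site script rather than a plain function.

```python
from datetime import datetime

# Static document: contents were fixed when the file was written.
STATIC_DOCUMENT = "<html><body>Welcome</body></html>"

def dynamic_document(now=None):
    """Build the document at request time, so each request may see new contents."""
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return "<html><body>Server time: " + stamp + "</body></html>"

print(STATIC_DOCUMENT)
print(dynamic_document())
```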
27.14 Figure Active document using Java applet
27.15 Figure Active document using client-site script
27.16 Note: Active documents are sometimes referred to as client-site dynamic documents.
HTTP
The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web. HTTP functions as a combination of FTP and SMTP.
Topics discussed in this section:
  HTTP Transaction
  Persistent Versus Nonpersistent Connection
27.18 Note: HTTP uses the services of TCP on well-known port 80.
27.19 Figure HTTP transaction
27.20 Figure Request and response messages
27.21 Figure Request and status lines
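The request line and status line each hold three space-separated fields. A small sketch of splitting them follows; the function names are hypothetical and no validation is attempted.

```python
def parse_request_line(line):
    """Request line: method, URL, HTTP version."""
    method, url, version = line.split(" ", 2)
    return method, url, version

def parse_status_line(line):
    """Status line: HTTP version, status code, status phrase."""
    # The phrase may itself contain spaces, so split at most twice.
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase

print(parse_request_line("GET /usr/bin/image1 HTTP/1.1"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```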
27.22 Table 27.1 Methods
27.23 Table 27.2 Status codes
27.24 Table 27.2 Status codes (continued)
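HTTP status codes are grouped by their first digit. The sketch below maps a code to its class; `status_class` is a hypothetical helper, not part of any library.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Return the class of a three-digit HTTP status code."""
    return STATUS_CLASSES.get(code // 100, "unknown")

print(status_class(200), status_class(404))
```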
27.25 Figure Header format
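Each header line is a header name, a colon, and a value. A minimal parsing sketch, with a hypothetical function name, might look like this:

```python
def parse_header_line(line):
    """Split 'Name: value' at the first colon only; the value may contain colons."""
    name, _, value = line.partition(":")
    return name.strip(), value.strip()

print(parse_header_line("Accept: image/gif"))
```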
27.26 Table 27.3 General headers
27.27 Table 27.4 Request headers
27.28 Table 27.5 Response headers
27.29 Table 27.6 Entity headers
27.30 Example 27.1
This example retrieves a document. We use the GET method to retrieve an image with the path /usr/bin/image1. The request line shows the method (GET), the URL, and the HTTP version (1.1). The header has two lines that show that the client can accept images in the GIF or JPEG format. The request does not have a body. The response message contains the status line and four lines of header. The header lines define the date, server, MIME version, and length of the document. The body of the document follows the header (see Figure 27.16).
27.31 Figure Example 27.1
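The request in Example 27.1 can be sketched as text. The helper `build_request` is hypothetical, and the exact `Accept` values (`image/gif`, `image/jpeg`) are assumed from the description rather than copied from the figure.

```python
def build_request(method, url, headers, body=""):
    """Assemble an HTTP/1.1 request: request line, header lines, blank line, body."""
    lines = [f"{method} {url} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers]
    # A blank line (CRLF CRLF) separates the headers from the optional body.
    return "\r\n".join(lines) + "\r\n\r\n" + body

request = build_request(
    "GET",
    "/usr/bin/image1",
    [("Accept", "image/gif"), ("Accept", "image/jpeg")],
)
print(request)
```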
27.32 Example 27.2
In this example, the client wants to send data to the server. We use the POST method. The request line shows the method (POST), URL, and HTTP version (1.1). There are four lines of headers. The request body contains the input information. The response message contains the status line and four lines of headers. The created document, which is a CGI document, is included as the body (see Figure 27.17).
27.33 Figure Example 27.2
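A POST request carries its input in the body, and a length header tells the server how many bytes to read. The sketch below is not a reproduction of the figure: the URL `/cgi-bin/doc.pl`, the form fields, and the two headers shown are illustrative assumptions.

```python
from urllib.parse import urlencode

def build_post(url, form_fields):
    """Assemble a POST request whose body is URL-encoded form data."""
    body = urlencode(form_fields)
    head = [
        f"POST {url} HTTP/1.1",
        "Content-Type: application/x-www-form-urlencoded",
        # Length of the body in bytes, so the server knows where the body ends.
        f"Content-Length: {len(body.encode('ascii'))}",
    ]
    return "\r\n".join(head) + "\r\n\r\n" + body

message = build_post("/cgi-bin/doc.pl", {"name": "Susan", "item": "book"})
print(message)
```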
27.34 Example 27.3
HTTP uses ASCII characters. A client can directly connect to a server using TELNET, which logs into port 80 (see next slide). The next three lines show that the connection is successful. We then type three lines. The first shows the request line (GET method), the second is the header (defining the host), and the third is a blank line, terminating the request. The server response is seven lines starting with the status line. The blank line at the end terminates the server response. The file of 14,230 lines is received after the blank line (not shown here). The last line is output by the client.
27.35 Example 27.3 (continued)
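The same exchange can be sketched with a raw TCP socket instead of TELNET. The function names are hypothetical, and the `Connection: close` header is an addition not in the example; it asks the server to close the connection so the read loop ends.

```python
import socket

def build_get(host, path="/"):
    """Request line, Host header, Connection header, then a blank line."""
    text = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    return text.encode("ascii")

def split_response(raw):
    """Split raw response bytes into status line, header lines, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    return lines[0], lines[1:], body

def fetch(host, path="/", port=80):
    """Open a TCP connection to port 80, send the request, read until close."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return split_response(b"".join(chunks))
```

Calling `fetch("www.example.com")` needs network access and returns whatever that server sends; the two helper functions can be tried offline.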
27.36 Note: HTTP version 1.1 specifies a persistent connection by default.
27.37 HTTP connections
Nonpersistent HTTP
  At most one object is sent over a TCP connection.
  HTTP/1.0 uses nonpersistent HTTP.
Persistent HTTP
  Multiple objects can be sent over a single TCP connection between client and server.
  HTTP/1.1 uses persistent connections in default mode.
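A quick way to see the difference is to count TCP connections for a page. This sketch assumes a nonpersistent client opens exactly one connection per object and a persistent client reuses one connection for all of them; the function name is hypothetical.

```python
def tcp_connections_needed(num_objects, persistent):
    """Count the TCP connections used to fetch num_objects from one server."""
    if num_objects == 0:
        return 0
    # Persistent: one connection carries every object.
    # Nonpersistent: each object needs its own connection.
    return 1 if persistent else num_objects

# A page made of one HTML file plus 10 images is 11 objects.
print(tcp_connections_needed(11, persistent=False))
print(tcp_connections_needed(11, persistent=True))
```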
27.38 User-server state: cookies
Many major Web sites use cookies.
Four components:
  1) cookie header line of HTTP response message
  2) cookie header line in HTTP request message
  3) cookie file kept on user’s host, managed by user’s browser
  4) back-end database at Web site
Example:
  Susan always accesses the Internet from the same PC.
  She visits a specific e-commerce site for the first time.
  When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in its backend database for that ID.
27.39 Cookies: keeping “state” (cont.)
Client cookie file at start: ebay: 8734
  1. client -> server: usual http request msg
  2. server creates ID 1678 for user; entry in backend database
  3. server -> client: usual http response + Set-cookie: 1678
     Client cookie file is now: amazon: 1678, ebay: 8734
  4. client -> server: usual http request msg, cookie: 1678
  5. server: cookie-specific action (access backend database)
  6. server -> client: usual http response msg
One week later (cookie file: amazon: 1678, ebay: 8734):
  7. client -> server: usual http request msg, cookie: 1678
  8. server: cookie-specific action (access backend database)
  9. server -> client: usual http response msg
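The exchange above can be sketched as a toy simulation. The classes `Site` and `Browser` are hypothetical, and the cookie is reduced to a bare ID as on the slide; real `Set-Cookie` headers carry name=value pairs and attributes.

```python
import itertools

class Site:
    """Server side: hands out IDs and keeps a backend database keyed by ID."""
    def __init__(self, first_id=1678):
        self._ids = itertools.count(first_id)
        self.database = {}  # ID -> list of paths this user requested

    def handle(self, path, cookie=None):
        headers = {}
        if cookie is None or cookie not in self.database:
            # First visit: create a unique ID and a database entry.
            cookie = str(next(self._ids))
            self.database[cookie] = []
            headers["Set-cookie"] = cookie
        # Cookie-specific action: record the request against this ID.
        self.database[cookie].append(path)
        return headers

class Browser:
    """Client side: keeps a cookie file mapping site name -> ID."""
    def __init__(self):
        self.cookie_file = {}

    def request(self, site_name, site, path):
        headers = site.handle(path, self.cookie_file.get(site_name))
        if "Set-cookie" in headers:
            self.cookie_file[site_name] = headers["Set-cookie"]
        return headers

browser = Browser()
browser.cookie_file["ebay"] = "8734"  # left over from an earlier visit
amazon = Site()
first = browser.request("amazon", amazon, "/")      # response sets the cookie
later = browser.request("amazon", amazon, "/cart")  # request carries the cookie
print(first, later, browser.cookie_file)
```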
27.40 Cookies (continued)
What cookies can bring:
  authorization
  shopping carts
  recommendations
  user session state (Web e-mail)
Cookies and privacy:
  cookies permit sites to learn a lot about you
  you may supply name and e-mail to sites
  search engines use redirection & cookies to learn yet more
  advertising companies obtain info across sites