World Wide Web
- Hypertext model
- Use of hypertext in World Wide Web (WWW)
- WWW client-server model
- Use of TCP/IP protocols in WWW
Hypertext and Hypermedia
- Hypermedia system allows interactive access to collections of documents
- Document can hold:
  - Text (hypertext)
  - Graphics
  - Sound
  - Animations
  - Video
- Documents linked together
  - Nondistributed - all documents stored locally (like CD-ROM)
  - Distributed - documents stored on remote servers
HyperPointers
- Each document contains links (pointers) to other documents
- Link represented by "active area" on screen
  - Graphic - button
  - Text - highlighted
- Selecting link fetches referenced document for display
- Links may become invalid
  - Link is simply a text name for a remote document
  - Remote document may be removed while name in link remains in place
Browser Interface
- Interactive, "point-and-click" interface to hypermedia documents
- Each document is displayed on screen
- User can select and follow links - "point-and-click"
- Application is called a browser (infinite time sink)
Each WWW document is called a page
- Initial page for individual or organization is called a home page
- Page can contain many different types of information; page must specify:
  - Content
  - Type of content
  - Location
  - Links
- Rather than fixed WYSIWYG representation (e.g., Word), pages are formatted with a markup language (like TeX)
  - Allows browser to reformat to fit display
  - Allows text-only browser to discard graphics
- Standard is HyperText Markup Language (HTML)
HTML specifies:
- Major structure of document
- Formatting instructions
- Hypermedia links
- Additional information about document contents
Two parts to document:
- Head contains details about the document
- Body contains information/content
Page is represented in ASCII text with embedded HTML tags that carry formatting instructions
- Tags have format <TAGNAME>
- End of formatted section is </TAGNAME>
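The head/body split and the paired tag format can be illustrated with a small page. The sketch below is not part of the original slides: the page text, the URL, and the class name OutlineParser are made up for illustration, and Python's standard html.parser module is used only as a convenient way to list the tags.

```python
from html.parser import HTMLParser

# Hypothetical page: a head with details about the document,
# and a body with content and one hypermedia link.
PAGE = """<HTML>
<HEAD><TITLE>Example Page</TITLE></HEAD>
<BODY>
<H1>Welcome</H1>
<A HREF="http://www.example.com/next.html">Next page</A>
</BODY>
</HTML>"""


class OutlineParser(HTMLParser):
    """Records opening tags, closing tags, and link targets."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.closed_tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        self.open_tags.append(tag)
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

    def handle_endtag(self, tag):
        self.closed_tags.append(tag)


parser = OutlineParser()
parser.feed(PAGE)
parser.close()
print(parser.open_tags)
print(parser.links)
```

Every opening tag in this page has a matching closing tag, which is the <TAGNAME> ... </TAGNAME> pairing described above.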
Page identified by:
- Protocol used to access page
- Computer on which page is stored
- TCP port to access page
- Pathname of file on server
Specific syntax for Uniform Resource Locator (URL): protocol://computer_name:port/document_name
- Protocol can be http, ftp, file, mailto
- Computer name is DNS name
- (Optional) port is TCP port
- document_name is path on computer to page
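As a rough illustration of the URL syntax, the sketch below splits a URL into the four parts named above using Python's standard urllib.parse module. The function name describe_url, the example URLs, and the DEFAULT_PORTS table are assumptions made for this example; they do not come from the slides.

```python
from urllib.parse import urlsplit

# Assumed well-known ports, used when the optional port is omitted.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def describe_url(url):
    """Split a URL into protocol, computer name, port, and document name."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "port": port,
        "document_name": parts.path,
    }


print(describe_url("http://www.example.com:8080/docs/index.html"))
print(describe_url("http://www.example.com/index.html"))
```

Note that urlsplit keeps the leading "/" as part of the path, so document_name here is "/docs/index.html" rather than "docs/index.html".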
Client/Server Model
- Browser is client, WWW server is server
- Browser:
  - Makes TCP connection
  - Sends request for page
  - Reads page
- Each different item (e.g., IMG) requires separate TCP connection in HTTP/1.0; later HTTP versions can reuse a connection
- HyperText Transfer Protocol (HTTP) specifies commands and client-server interaction
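The three browser steps (make TCP connection, send request, read page) might look roughly like the following sketch. It assumes an HTTP/1.0-style GET request; the function names build_request, fetch_page, and split_response are invented for this example, and fetch_page is defined but not called here because it needs a reachable server.

```python
import socket


def build_request(computer_name, document_name):
    """Build a minimal HTTP/1.0 GET request for one page."""
    # The Host header is optional in HTTP/1.0 but widely expected by servers.
    text = f"GET {document_name} HTTP/1.0\r\nHost: {computer_name}\r\n\r\n"
    return text.encode("ascii")


def fetch_page(computer_name, document_name, port=80):
    """Make a TCP connection, send the request, and read the whole reply."""
    with socket.create_connection((computer_name, port), timeout=10) as conn:
        conn.sendall(build_request(computer_name, document_name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def split_response(raw):
    """Separate the status line from the page body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    return status_line, body


print(build_request("www.example.com", "/index.html"))
```

A browser would call something like fetch_page once for the HTML page and again for each embedded item it references.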
Server Architecture
- Much like ftp server
- Waits for incoming connection
- Accepts command from connection
- Writes page to connection
- Performance is hard issue
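The wait/accept/write cycle can be sketched as below. This is a simplified, single-threaded illustration rather than a realistic server: the PAGES table, the function names, and the choice of HTTP/1.0 status lines are assumptions for the example. The demo function uses socket.socketpair so that the request handling can be exercised without opening a network port.

```python
import socket

# Hypothetical in-memory store: pathname -> page contents.
PAGES = {"/index.html": b"<HTML><BODY>Hello</BODY></HTML>"}


def handle_connection(conn, pages):
    """Accept one command from the connection and write the page to it."""
    request = conn.recv(4096).decode("ascii", errors="replace")
    request_line = request.split("\r\n", 1)[0]
    fields = request_line.split()
    path = fields[1] if len(fields) >= 2 else ""
    body = pages.get(path)
    if body is None:
        conn.sendall(b"HTTP/1.0 404 Not Found\r\n\r\n")
    else:
        header = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n\r\n"
        )
        conn.sendall(header.encode("ascii") + body)
    conn.close()


def serve(port, pages):
    """Wait for incoming connections forever, handling one at a time."""
    with socket.create_server(("", port)) as listener:
        while True:
            conn, _addr = listener.accept()
            handle_connection(conn, pages)


def demo():
    """Run one request through handle_connection over a local socket pair."""
    client, server_side = socket.socketpair()
    client.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
    handle_connection(server_side, PAGES)
    chunks = []
    while True:
        data = client.recv(4096)
        if not data:
            break
        chunks.append(data)
    client.close()
    return b"".join(chunks)


print(demo())
```

Handling one connection at a time is the simplest design and also hints at why performance is hard: a slow client blocks everyone else unless the server adds concurrency.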
Browser Architecture
- Browser has more components:
  - Display driver for painting screen
  - HTML interpreter for HTML-formatted documents
  - Other interpreters (e.g., Shockwave) for other items
  - HTTP client to fetch HTML documents from WWW server
  - Other clients for other protocols (e.g., ftp)
  - Controller to accept input from user
- Must be multi-threaded
Caching
- Downloading HTML documents from servers may be slow
  - Internet congested
  - Dialup connection
  - Server busy
- Returning to previous HTML document requires reload from server
- Local cache can be used to hold copies of visited pages
- Also can implement organizational HTTP proxy that caches documents for multiple users
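A local cache of visited pages can be sketched as a table keyed by URL. The class name PageCache and the stand-in fetch function below are hypothetical; a real browser cache would also need expiry and size limits, which this sketch leaves out.

```python
class PageCache:
    """Holds copies of visited pages so a revisit avoids a reload."""

    def __init__(self, fetch):
        self._fetch = fetch  # function that downloads a page given its URL
        self._pages = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._pages:
            self.hits += 1
        else:
            self.misses += 1
            self._pages[url] = self._fetch(url)
        return self._pages[url]


# Stand-in for a slow download; records every URL actually fetched.
downloads = []


def fake_fetch(url):
    downloads.append(url)
    return "<HTML>" + url + "</HTML>"


cache = PageCache(fake_fetch)
first = cache.get("http://www.example.com/index.html")
second = cache.get("http://www.example.com/index.html")  # served from cache
print(len(downloads), cache.hits, cache.misses)
```

An organizational proxy applies the same idea, with one shared table serving many users instead of one.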
Security
- Routers forward packets - from any source
- Bad guys can send in packets from outside
- How to avoid security breaches?
Security Policies
- Can't describe a network as secure in the abstract
- University may have different notion of security than military installation
- Must define a security policy
- Many possibilities to consider:
  - Data stored on servers
  - Messages traversing LANs
  - Internal or external access
  - Read/write versus read-only access
Encryption
- Encryption - rewrite contents so that they cannot be read without key
  - Encrypting function - produces encrypted message
  - Decrypting function - extracts original message
  - Encryption key - parameter that controls encryption/decryption; sender and receiver share secret key
- Sender produces: E = encrypt(K, M)
- Sender transmits E on network
- Receiver extracts: M = decrypt(K, E)
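The relationship E = encrypt(K, M) and M = decrypt(K, E) can be demonstrated with a toy cipher. The sketch below uses a repeating-key XOR, which is chosen only because it is short; it is NOT secure and is not the scheme any real system should use. The key and message values are made up.

```python
from itertools import cycle


def encrypt(key: bytes, message: bytes) -> bytes:
    """Toy cipher: XOR each message byte with the repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(m ^ k for m, k in zip(message, cycle(key)))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return encrypt(key, ciphertext)


K = b"secret"            # shared secret key (hypothetical value)
M = b"attack at dawn"    # original message
E = encrypt(K, M)        # what the sender transmits on the network
print(decrypt(K, E))
```

Both sides must hold the same K, which is the weakness that public key encryption addresses.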
Public Key
- Previous scheme requires shared secret K
- If K is discovered, security is compromised
- Public key encryption uses two keys:
  - Private key - kept secret by user
  - Public key - published by user
- To send to user, encrypt using public key, decrypt using private key
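One way to see the two-key idea is textbook RSA with very small numbers. The slides do not name a specific algorithm, so RSA is an assumption here, and the primes 61 and 53 are far too small for real use; the sketch only shows that what the public key encrypts, the private key decrypts. It needs Python 3.8 or later for the modular inverse form of pow.

```python
# Toy RSA key generation with tiny primes (illustration only, not secure).
P, Q = 61, 53
N = P * Q                      # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E_PUB = 17                     # public exponent, published by user
D_PRIV = pow(E_PUB, -1, PHI)   # private exponent, kept secret by user


def encrypt_with_public(message: int) -> int:
    """Anyone can do this using the published key (E_PUB, N)."""
    return pow(message, E_PUB, N)


def decrypt_with_private(ciphertext: int) -> int:
    """Only the holder of D_PRIV can undo the encryption."""
    return pow(ciphertext, D_PRIV, N)


message = 65                   # message encoded as a number smaller than N
ciphertext = encrypt_with_public(message)
print(decrypt_with_private(ciphertext))
```

No shared secret is exchanged: the sender needs only the published values E_PUB and N.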
Digital Signatures
- Goal - guarantee that message must have originated with certain entity
  - Authenticate sender
- Idea - encrypt with private key, decrypt with public key
- Only owner of private key could have generated original message
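The "encrypt with private key, decrypt with public key" idea can be sketched with the same kind of toy RSA key. As before, RSA and the tiny primes are assumptions for illustration, and signing a SHA-256 digest reduced modulo N is a simplification; real signature schemes use large keys and proper padding.

```python
import hashlib

# Toy RSA key with tiny primes (illustration only, not secure).
P, Q = 61, 53
N = P * Q
PHI = (P - 1) * (Q - 1)
E_PUB = 17                     # public key, published
D_PRIV = pow(E_PUB, -1, PHI)   # private key, known only to the signer


def digest(message: bytes) -> int:
    """Reduce a SHA-256 hash to a number the toy key can handle."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Signer applies the private key to the message digest."""
    return pow(digest(message), D_PRIV, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone applies the public key and compares with the digest."""
    return pow(signature, E_PUB, N) == digest(message)


message = b"transfer approved"
signature = sign(message)
print(verify(message, signature))
```

Verification succeeds only for a signature produced with the matching private key, which is what ties the message to its sender.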
Packet Filtering
- Can configure packet forwarding devices - esp. routers - to drop certain packets
- Consider example:
  - Suppose one network is a test network and another network has the controlling workstations
  - Install filter to allow packets only from the test network to the network with the controlling workstations
  - Keeps potentially bad packets away from remainder of Internet
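A filter rule of this kind can be sketched with Python's standard ipaddress module. The network addresses in the source example are not available, so the two prefixes below are placeholders taken from the ranges reserved for documentation; the function name and the decision to model only packets leaving the test network are likewise assumptions.

```python
import ipaddress

# Placeholder networks (documentation ranges), not the slide's addresses.
TEST_NET = ipaddress.ip_network("192.0.2.0/24")        # test network
CONTROL_NET = ipaddress.ip_network("198.51.100.0/24")  # controlling workstations


def allow_from_test_net(src: str, dst: str) -> bool:
    """Filter on the test network's router: forward a packet only if it
    comes from the test network and is addressed to the control network."""
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    return source in TEST_NET and destination in CONTROL_NET


print(allow_from_test_net("192.0.2.7", "198.51.100.3"))
print(allow_from_test_net("192.0.2.7", "203.0.113.9"))
```

Any packet from the test network to another destination is dropped, which keeps stray test traffic off the rest of the Internet.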
Internet Firewall
- Packet filter at edge of intranet can disallow unauthorized packets
- Restricts external packets to just a few internal hosts
- Proxies forward packets through firewall after authorization
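Restricting external packets to a few internal hosts amounts to an allowlist check at the edge of the intranet. In the sketch below, the set EXPOSED_HOSTS, its addresses, and the function name are hypothetical; proxy authorization is not modeled.

```python
import ipaddress

# Hypothetical internal hosts that external packets may reach,
# for example a mail gateway and a web proxy.
EXPOSED_HOSTS = {
    ipaddress.ip_address("10.0.0.25"),
    ipaddress.ip_address("10.0.0.80"),
}


def admit_external_packet(dst: str) -> bool:
    """Edge filter: admit a packet from outside only if it is addressed
    to one of the few exposed internal hosts."""
    return ipaddress.ip_address(dst) in EXPOSED_HOSTS


print(admit_external_packet("10.0.0.80"))
print(admit_external_packet("10.0.0.99"))
```

All other internal hosts stay unreachable from outside, so only the exposed hosts need to be hardened against external traffic.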