Lecture 1: Basics of Web Technologies and Client-side Language Dr. Mohammad Anwar Hossain Software Engineering Department, KSU
Learning Outcomes In this chapter, you will learn about: The evolution of the Internet, Internet standards organizations, and the difference between the Internet, intranets, and extranets. The beginning of the World Wide Web, ethical use of information on the Web, Web Accessibility, and future Internet trends. The Client/Server Model, Internet Protocols, Networks, URLs and Domain Names, and Markup Languages.
The Evolution of the Internet Internet ◦ Interconnected network of computer networks ◦ ARPAnet: Advanced Research Projects Agency network; 1969 – four computers connected ◦ NSFnet: National Science Foundation network ◦ Use of the Internet was originally limited to government, research and academic use ◦ 1991: commercial ban lifted
Intranet & Extranets Intranet A private network contained within an organization or business used to share information and resources among coworkers. Extranet A private network that securely shares part of an organization’s information or operations with external partners
The World Wide Web The World Wide Web (WWW) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia and navigate between them by using hyperlinks. (wiki)
Web Standards and the W3C Consortium W3C – World Wide Web Consortium ◦ Develops recommendations and prototype technologies related to the Web ◦ Produces specifications, called Recommendations, in an effort to standardize web technologies
Web Accessibility WAI – Web Accessibility Initiative ◦ Develops recommendations for web content developers, web authoring tool developers, developers of web browsers, and developers of other user agents to facilitate use of the web by those with special needs. ◦ WCAG Web Content Accessibility Guidelines
Checkpoint 1. Describe the difference between the Internet and an intranet. 2. Describe the difference between the Internet and the Web.
WWW –The World Wide Web (WWW) was developed by Tim Berners-Lee and other research scientists at CERN, the European center for nuclear research, in the late 1980s and early 1990s. –WWW follows a client-server model and uses TCP connections to transfer information or web pages from server to client. –WWW uses a hypertext model. Hypertext allows interactive access to a collection of documents. –Documents can hold text (hypertext), graphics, sound, animations, video. –Documents are linked together: Non-distributed – all documents stored locally (e.g., on CD-ROM). Distributed – documents stored at remote servers on the Internet.
WWW - Hyperlinks (or links) –Each document contains links (pointers) to other documents. –A link is represented by an "active area" on screen: Graphic - button; Text - highlighted. –By selecting a particular link, the client fetches the referenced document from a server for display. –Links may become invalid: a link is simply a text name for a remote document, and the remote document may be moved to a new location while the name in the link remains in place.
WWW – Document Representation –Each WWW document is called a page. –The initial page for an individual or organization is called a home page. –A page can contain many different types of information; a page must specify: Content – The actual information. Type of content – The type of information, e.g. text, pictures, etc. Links to other documents. –Rather than having a fixed representation for every browser, pages are formatted with a markup language. –This allows the browser to format the page to fit the display. –The standard is called HyperText Markup Language (HTML).
WWW – HTML –HTML specifies: Major structure of document. Formatting instructions for browsers to execute. Hypertext links – Links to other documents. Additional information about document contents. –Two parts to document: Head contains details about the document. Body contains the information/content of the document. –Each web page is represented in ASCII text with embedded HTML tags that give formatting instructions to the browser. A formatted section begins with a tag, <TAGNAME>; the end of the formatted section is indicated by </TAGNAME>.
WWW – HTML Example Example Page for lecture Lecture notes for today go here! Previous Lecture Next Lecture Table of contents Solutions to Assignments Index of terms
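The markup of the example page above did not survive, so the following is only a sketch of how such a page might be written, held in a Python string and checked with the standard-library `html.parser`. The link targets (`prev.html`, `next.html`, and so on) are hypothetical placeholders, and the choice of `<TITLE>` for the first line is an assumption.

```python
from html.parser import HTMLParser

# Hypothetical reconstruction: the HREF targets are invented placeholders.
EXAMPLE_PAGE = """<HTML>
<HEAD>
<TITLE>Example Page for lecture</TITLE>
</HEAD>
<BODY>
<P>Lecture notes for today go here!</P>
<A HREF="prev.html">Previous Lecture</A>
<A HREF="next.html">Next Lecture</A>
<A HREF="contents.html">Table of contents</A>
<A HREF="solutions.html">Solutions to Assignments</A>
<A HREF="index.html">Index of terms</A>
</BODY>
</HTML>"""


class LinkCollector(HTMLParser):
    """Collects the page title and [target, link text] pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._in_link = True
            self.links.append([dict(attrs).get("href"), ""])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_link:
            self.links[-1][1] += data


def parse_page(html_text):
    """Return (title, links) for a page given as a string."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.title, collector.links
```

Calling `parse_page(EXAMPLE_PAGE)` separates the head information (the title) from the hyperlinks in the body, which is roughly what a browser has to do before it can display the page.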
WWW – Other HTML Tags –Headings - <H1>, <H2> –Lists: <OL> - Ordered (numbered) list; <UL> - Unordered (bulleted) list; <LI> - List item –Tables: <TABLE> - Define table; <TR> - Begin row; <TD> - Begin item in row –Parameters: Keyword-value pairs in HTML tags
WWW – Embedding Graphics –IMG tag specifies insertion of graphic. Parameters: SRC="filename"; ALIGN= - alignment relative to text. –For example, <IMG SRC="GCD.gif"> would insert the image in the file GCD.gif into any web page. –Image must be in a format known to the browser, e.g., Graphics Interchange Format (GIF), Joint Photographic Experts Group (JPEG), Bitmap, etc.
WWW – Style The layout and format of an HTML document can be simplified by using CSS (Cascading Style Sheets). Example style rules:
body {background-color: yellow}
h1 {background-color: #00ff00}
h2 {background-color: transparent}
p {background-color: rgb(250,0,255)}
Applied to a page containing: <h1>This is header 1</h1> <h2>This is header 2</h2> <p>This is a paragraph</p>
Basic Internet Protocols –TCP/IP is fundamental to the Internet – e-mail, web browsing, file downloads and database access are built on top of the TCP and IP protocols –TCP is the Transmission Control Protocol –TCP extends IP to provide added functionality –However, only IP is fundamental to the definition of the Internet –IP address (IPv4): 32-bit number (written as a sequence of 4 decimal numbers separated by dots) –Other protocols: UDP, FTP, SMTP, etc.
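To make the "32-bit number written as four decimal numbers" point concrete, here is a small sketch that converts between the two forms by hand. The function names are illustrative only; in practice the standard-library `ipaddress` module does the same job.

```python
def dotted_to_int(address):
    """Convert dotted-decimal IPv4 text to its 32-bit integer value."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part   # each decimal number fills one byte
    return value


def int_to_dotted(value):
    """Convert a 32-bit integer back to dotted-decimal text."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Each of the four decimal numbers is one byte of the address, which is why each must lie between 0 and 255.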
UDP - User Datagram Protocol –An alternative protocol to TCP –It builds on IP –Does not provide a two-way connection –Does not provide guaranteed delivery, unlike TCP –Faster than TCP for simple tasks –Used for tasks like downloading video, short messages, etc.
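A minimal sketch of UDP's connectionless style, using Python's standard `socket` module on the loopback interface. The function name and the 2-second timeout are illustrative choices. Since UDP does not guarantee delivery, the receiver uses a timeout instead of waiting forever.

```python
import socket


def udp_round_trip(message, timeout=2.0):
    """Send one datagram to a receiver on the loopback interface and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        receiver.settimeout(timeout)         # delivery is not guaranteed, so do not block forever
        # No connection set-up: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()
```

With TCP the same exchange would first need `listen`, `connect` and `accept` calls. Their absence here is the "does not provide a two-way connection" point in practice.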
FTP - File Transfer Protocol –People required a protocol to reliably transfer files between any two computers connected to the Internet. –Why not use HTTP? HTTP was developed in the late 1980s and early 1990s, well after FTP was already in use. HTTP provides a poor mechanism for authenticating users of the protocol. HTTP doesn’t easily allow files to be sent in both directions. HTTP doesn’t allow files to be downloaded in separate stages.
FTP –FTP follows the client/server model. –An FTP client program enables the user to interact with an FTP server in order to access files on the FTP server computer. –Client programs can be: Simple command line interfaces, e.g. at the MS-DOS prompt: C:\> ftp ftp.maths.tcd.ie; or integrated with web browsers, e.g. Netscape Navigator, Internet Explorer. –FTP provides similar services to those available on most filesystems: list directories, create new files, download files, delete files. –FTP uses TCP connections and the default server port for FTP is 21.
FTP - Transfer modes –Batch transfer: User creates a list of files to be transferred by the ftp program. The user's request is dropped into a queue of similar requests. The FTP program reads requests and performs transfers of files. The transfer program can retry until successful. Good for slow or unreliable transfers. –Interactive transfer: User starts the ftp program. User can interactively list contents of directories, transfer files, delete files, etc. User can find and transfer files immediately. Quick feedback in case of mistakes, e.g., spelling errors.
WWW – Identifying a web page –A web page is identified by: The protocol used to access the web page. The computer on which the web page is stored. The TCP port that the server is listening on to allow a client to access the web page. Directory pathname of web page on server. –Specific syntax for Uniform Resource Locator (URL): protocol://computer_name:port/document_name –Protocol can be, e.g., http, ftp or mailto.
WWW – Identifying a web page –Computer name can be DNS name or IP address. –TCP port is optional (http uses port 80 as its default port). –document_name is path on server to web page (file). –E.g. –Protocol is http –Computer name or DNS name is –Port number is the default port for http, i.e. port 80. –Document name is /Recreation/Sports/Soccer/index.html
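The URL syntax can be explored with Python's standard `urllib.parse` module. This is only a sketch. The host name `www.example.com` is a placeholder because the host in the original example was lost; the path is the one from the slide. The `DEFAULT_PORTS` table and `split_url` helper are illustrative names.

```python
from urllib.parse import urlsplit

# Default server ports mentioned in this lecture.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def split_url(url):
    """Break a URL into the four parts listed on the slide."""
    parts = urlsplit(url)
    # The port is optional; fall back to the protocol's default when it is absent.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "port": port,
        "document_name": parts.path,
    }
```

For instance, `split_url("http://www.example.com/Recreation/Sports/Soccer/index.html")` reports protocol `http`, port 80 by default, and the document name `/Recreation/Sports/Soccer/index.html`.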
WWW – Hyperlinks between web pages –Each hyperlink is specified in HTML by using a special tag. –An item on a page is associated with another HTML document. –Each link is passive; no action is taken until the link is selected. –HTML tags for a hyperlink are <A> and </A>. –The linked document is specified by a parameter to the <A> tag: HREF="document URL" –E.g. <A HREF="document URL">Click here to go to GCD web site.</A> –Whatever is between the HTML tags <A> and </A> is the highlighted hyperlink.
WWW – Client Server Model –The browser is the client, WWW (or web) server is the server. –Browser: The browser makes a TCP connection to the web server. The browser sends a request for the particular web page that it wishes to display. The browser reads the contents of the web page from the TCP connection and displays it in the browser's window. The browser closes the TCP connection used to transfer the web page. –In HTTP/1.0, each separate item in a web page (e.g., pictures, audio) requires a separate TCP connection. –HyperText Transfer Protocol (HTTP) specifies commands that the client (browser) issues to the server (web server) and the responses that the server sends back to the client.
WWW – Client Server Model Figure 1-1: Web client/server architecture
Web Server Basics Duties –Listen to a port –When a client is connected, read the HTTP request –Perform some lookup function –Send HTTP response and the requested data
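The four duties can be sketched with plain sockets. This is a teaching sketch under simplifying assumptions, not how a production server such as Apache is built: documents live in a hypothetical in-memory dictionary (`DOCUMENTS`), clients are served one at a time, and only the request line is interpreted.

```python
import socket
import threading

# Hypothetical in-memory "document store" standing in for files on disk.
DOCUMENTS = {"/index.html": b"<HTML><BODY>Hello</BODY></HTML>"}


def handle_client(connection):
    """Read one HTTP request, look the document up, send the response."""
    with connection:
        request = b""
        while b"\r\n\r\n" not in request:          # the header ends with a blank line
            chunk = connection.recv(1024)
            if not chunk:
                break
            request += chunk
        request_line = request.split(b"\r\n", 1)[0].decode("ascii", "replace")
        fields = request_line.split()
        path = fields[1] if len(fields) >= 2 else ""
        body = DOCUMENTS.get(path)                  # the "lookup function"
        if body is None:
            status, body = "404 Not Found", b"Not found"
        else:
            status = "200 OK"
        header = (f"HTTP/1.0 {status}\r\n"
                  f"Content-Type: text/html\r\n"
                  f"Content-Length: {len(body)}\r\n\r\n")
        connection.sendall(header.encode("ascii") + body)


def start_server(host="127.0.0.1", port=0):
    """Listen on a port and serve clients one at a time in a background thread."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))                     # port 0 lets the OS choose a free port
    listener.listen()

    def serve():
        while True:
            try:
                connection, _address = listener.accept()
            except OSError:                         # the listener was closed
                return
            handle_client(connection)

    threading.Thread(target=serve, daemon=True).start()
    return listener
```

`start_server()` returns the listening socket, so `listener.getsockname()` gives the address a client should connect to.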
Serving a Page (client, e.g. Netscape; server, e.g. Apache)
1. User of client machine types in a URL
2. Server name is translated to an IP address via DNS
3. Client connects to server using IP address and port number
4. Client determines path and file to request
5. Client sends HTTP request to server
6. Server determines which file to send
7. Server sends response code and the document
8. Connection is broken
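Seen from the client side, the sequence above can be sketched in a few lines of Python with the standard `socket` module. The function name `fetch_page` and its parameters are illustrative. A real browser does considerably more, such as handling redirects and fetching embedded images.

```python
import socket


def fetch_page(server_name, path, port=80, timeout=5.0):
    """Fetch one document using HTTP/1.0 and return (status_line, body)."""
    # The server name is translated to an IP address via DNS.
    ip_address = socket.gethostbyname(server_name)
    # The client connects to the server using IP address and port number.
    connection = socket.create_connection((ip_address, port), timeout=timeout)
    try:
        # The client sends the HTTP request for the chosen path.
        request = f"GET {path} HTTP/1.0\r\nHost: {server_name}\r\n\r\n"
        connection.sendall(request.encode("ascii"))
        # The server picks the file and sends the response code and document;
        # with HTTP/1.0 the end of the document is signalled by the server closing.
        response = b""
        while True:
            chunk = connection.recv(4096)
            if not chunk:
                break
            response += chunk
    finally:
        connection.close()                          # the connection is broken
    head, _separator, body = response.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    return status_line, body
```

The `Host` header is not required by HTTP/1.0 but is included because many servers today expect it.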
HTTP HTTP is… –Designed for document transfer –A form of communication protocol –Specifies how a server and client communicate –Most HTTP messages are sent using TCP –Generic: not tied to web browsers exclusively; can serve any data type –Stateless: no persistent client/server connection
HTTP Protocol Definitions MIME –Multipurpose Internet Mail Extensions –Standards for encoding different media types in a message –Originally developed for e-mailing files and messages in different languages
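As an illustration of MIME media types, the sketch below uses Python's standard `mimetypes` module to guess the type a server might announce for a file. The helper name and the fallback type `application/octet-stream` are assumptions made for this example.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Guess the MIME media type from a file name's extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default
```

A web server typically sends such a value in the `Content-Type` header so the browser knows whether it is receiving, say, `text/html` or `image/gif`.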
WWW – HTTP Protocol –When a user types in a URL, the browser creates an HTTP GET Request message and sends it over a TCP connection to the web server. –For the document /Recreation/Sports/Soccer/index.html, the HTTP GET Request message would be:
GET /Recreation/Sports/Soccer/index.html HTTP/1.0
User-Agent: InternetExplorer/5.0
Accept: text/html, text/plain, image/gif, audio/au
“\r\n” (blank line marking the end of the header)
WWW – HTTP Request messages –HTTP Request messages are sent from client to server. A request message consists of, in order:
Request Line: Type of request (e.g. GET)
Optional HTTP Header: Additional information such as browser being used, media types accepted
“\r\n”: Delimiter (carriage return, line feed)
Optional Data: User data, e.g. contents of completed form
–There are a number of valid HTTP Request messages: GET – Used to request a web page from a web server. POST – Used to send data (e.g. results of registration form) to a web server. HEAD – Return the header of a web page, used by search engines to test the validity of hyperlinks. PUT / DELETE – Not typically implemented by browsers.
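The layout of a request message (request line, optional header, delimiter, optional data) can be sketched as a small Python function. `build_request` is an illustrative helper, not part of any library.

```python
def build_request(method, path, headers=None, data=b""):
    """Assemble an HTTP/1.0 request: request line, header lines, blank line, data."""
    lines = [f"{method} {path} HTTP/1.0"]           # request line
    for name, value in (headers or {}).items():     # optional HTTP header
        lines.append(f"{name}: {value}")
    # Every line ends with "\r\n"; one extra "\r\n" is the delimiter.
    head = "\r\n".join(lines) + "\r\n" + "\r\n"
    return head.encode("ascii") + data              # optional data follows the delimiter
```

For example, `build_request("GET", "/Recreation/Sports/Soccer/index.html", {"User-Agent": "InternetExplorer/5.0"})` yields the bytes a browser would write to the TCP connection for that document.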
WWW – HTTP Response messages –HTTP Response messages are sent from server to client. A response message consists of, in order:
Status Line: Success/failure indication, a number between 200 and 599
Optional HTTP Header: Type of content returned, e.g. text/html or image/gif
“\r\n”: Delimiter
Optional Data: Requested data, e.g. web page
–The Status Line gives information about the success of the previous HTTP Request:
200 – 299: Success
300 – 399: Redirection – Document has been moved
400 – 499: Client Error – Bad Request, Unauthorised, Not found
500 – 599: Server Error – Internal Error, Service Overloaded
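A short sketch of how a client might interpret the status line, using the four ranges listed above. Both function names are illustrative. Codes outside 200 to 599 are rejected here only because the slide does not cover them.

```python
def parse_status_line(line):
    """Split a status line such as 'HTTP/1.0 404 Not Found' into its parts."""
    version, code, *reason = line.strip().split(" ", 2)
    return version, int(code), reason[0] if reason else ""


def classify_status(code):
    """Map a status code to the category names used in this lecture."""
    if 200 <= code <= 299:
        return "Success"
    if 300 <= code <= 399:
        return "Redirection"
    if 400 <= code <= 499:
        return "Client Error"
    if 500 <= code <= 599:
        return "Server Error"
    raise ValueError(f"status code outside the ranges covered here: {code}")
```

A browser would use the category to decide what to do next, for instance following a redirection or showing an error page.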
WWW – Caching Web pages –Downloading HTML documents from servers can be slow due to a number of conditions: Parts of the Internet can be congested. A dialup connection is typically very slow, 33 Kbps or 56 Kbps. A web server can have a lot of clients connecting to it at the same time, causing it to be overloaded. –If a user returns to a previous HTML document, then this could require downloading the document from the server again. –A browser can hold copies of recently visited pages. This avoids having to download pages again. –An organisation can use an HTTP proxy that caches documents for multiple users, thus improving the speed at which pages can be displayed on each user's computer.
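The idea of holding copies of recently visited pages can be sketched as a small cache keyed by URL. This is a simplified illustration: `PageCache`, its `download` callback and the `max_pages` limit are assumptions for this example, and real browser and proxy caches also check whether a stored copy has expired.

```python
from collections import OrderedDict


class PageCache:
    """Keeps copies of recently visited pages, discarding the least recently used."""

    def __init__(self, download, max_pages=50):
        self._download = download        # function that fetches a URL from the server
        self._max_pages = max_pages
        self._pages = OrderedDict()

    def get(self, url):
        if url in self._pages:
            self._pages.move_to_end(url)  # mark as recently used
            return self._pages[url]       # served from the cache, no download
        page = self._download(url)        # slow path: go to the server
        self._pages[url] = page
        if len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)   # drop the least recently used page
        return page
```

An HTTP proxy applies the same idea on behalf of many users, so a page fetched for one user can be served to the next without contacting the server again.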
WWW – Browser Architecture [Figure: browser components: controller, HTML interpreter, optional plugins, HTTP client, other clients, network interface and display driver; input from keyboard and mouse, output sent to display, communication with remote server]
WWW – Browser Architecture –Browser has more components than a server: Display driver for painting screen. HTML interpreter for formatting HTML documents. Plugins to display different content (e.g., Shockwave or Real Audio content) HTTP client to fetch HTML documents from WWW server. Other clients for other protocols (e.g., ftp, mail) Controller also must accept input from the computer user through the mouse or keyboard.
What has been covered this week: Overview of applications – what’s out there Internet basics – architecture and protocols HTML Next Week: JavaScript