Browsers and Servers: CGI Processing Model (Common Gateway Interface) © Norman White, 2013
WWW and client-server computing; forms and CGI programming; writing a CGI program.
Server computers are located all around the world and respond to requests (messages) from computers running browser software (Netscape, IE). Browser applications understand HTML (and now JavaScript, Java, etc.).
[Diagram: Browser (B) sends an HTTP request to the Server.] The browser sends an HTTP request to the server (e.g., GET index1.html).
<html><head><title>Sample Title</title></head><body> Here is some text and a picture <img src="pic1.gif"> </body></html>
[Diagram: Server returns the HTML file.] The server retrieves the file and sends it (index1.html) to the browser.
[Diagram: Browser holds index1.html.] The browser “formats” index1.html, which may mean retrieving more files in order to display it.
[Diagram: Browser sends GET pic1.gif.] index1.html contains a reference to pic1.gif, so the browser then requests pic1.gif.
[Diagram: Server returns pic1.gif.] The server next sends pic1.gif.
[Diagram: Browser holds index1.html and pic1.gif.] The browser displays pic1.gif.
The web server sends a header in front of each file identifying the file type (HTML, GIF, JPEG, etc.). Most browsers understand HTML, GIF, and plain text. Browsers can be configured to call external programs to handle new types of files.
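To make the header idea concrete, here is a minimal sketch of how a server might pick the file type from a file's extension and print it ahead of the file. The function names content_type_for and send_file and the short extension table are assumptions for illustration only; real servers use a much larger, configurable table.

```shell
#!/bin/sh
# Hypothetical helper: map a file name to the Content-type a server might send.
content_type_for() {
    case "$1" in
        *.html|*.htm) echo "text/html" ;;
        *.gif)        echo "image/gif" ;;
        *.jpg|*.jpeg) echo "image/jpeg" ;;
        *.txt)        echo "text/plain" ;;
        *)            echo "application/octet-stream" ;;
    esac
}

# Hypothetical helper: print the header, a blank line, then the file itself.
send_file() {
    echo "Content-type: $(content_type_for "$1")"
    echo ""
    cat "$1"
}

# Show the types chosen for the two files used in the slides.
content_type_for index1.html
content_type_for pic1.gif
```

The browser reads the header first and uses it to decide whether to display the data itself or hand it to another program.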
These programs are called helper applications and dramatically extend the capabilities of the browser, since they can be developed independently of the client software. Examples: QuickTime viewers, sound players, VRML viewers, etc. To see the currently configured viewers, go to Options on the browser title bar.
Browser functionality can also be extended by adding plugins. Plugins are not standalone applications, but executable code that is dynamically linked into the browser when necessary.
HTML provides an easy-to-use FORM capability, which allows a wide variety of input forms to be generated easily. Form data types include: text input (one line of text); textarea (multiple lines of text); check boxes (on/off); radio buttons (1 of N); etc.
The output of the form is formatted and sent to the server, along with the name of a program to process the contents of the form. The web server takes the information from the form and passes it on as input to a Common Gateway Interface (CGI) program. The output of the CGI program is sent back to the client browser as an HTML (or other) file.
CGI programs can do an almost unlimited set of activities: look up information in a database and send it to the browser; take input from the user and add it to a file; take input and send it to a standard business application. A CGI program can be written in any language that runs on the server, including a shell language like sh or bash.
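[Diagram: Browser (B) sends form content over HTTP to the HTTP server, which passes it as input to the CGI program; the program's HTML output goes back through the server to the browser.] Note: all processing is on the server.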
Develop a form to collect information from users. Write and test a CGI program to handle the form information. Put the name of the CGI program in the ACTION attribute of the form. Note: the program can be on another server.
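As a rough sketch of the first and third steps, the script below prints a page containing a one-field form whose ACTION names the CGI program. The names mycgiprog.cgi and userid are borrowed from the GET example later in these slides; the function name print_form and the page title are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print an HTML page containing a one-field form.
# $1 = name of the CGI program for ACTION, $2 = METHOD (GET or POST).
print_form() {
    cat <<EOF
<html><head><title>Sample Form</title></head><body>
<form action="$1" method="$2">
Userid: <input type="text" name="userid">
<input type="submit" value="Submit">
</form>
</body></html>
EOF
}

# Served as a CGI program itself: header, blank line, then the form page.
echo "Content-type: text/html"
echo ""
print_form mycgiprog.cgi GET
```

The same form could just as well live in a static HTML file; generating it from a script only keeps the example in one language.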
There are two FORM processing options, GET and POST. GET: parameters are sent as additions to the URL string, with individual parameters separated by &. POST: data is sent in the message body. This is a more general method and can handle more input data.
The server sends the form (in an HTML document) to the client. The client displays the form, and the user fills in the fields. The client sends the form information to the server (SUBMIT button). The server runs the CGI program named in the ACTION section of the FORM. The CGI program parses the data as input. The output of the CGI program is sent by the server to the client (i.e., it should be HTML).
Advantages: a very general model; easy to do really neat things like front-ending existing applications, databases, etc.; many toolkits are available to do common things. Disadvantages: all processing is done on the server, which may overload it; interaction is all through forms; lots of data traffic back and forth. Solution: HTML5 and its features.
A CGI program needs to parse the form input, process the input, and generate HTML output.
GET format: information is passed as a series of variable=value pairs separated by “&” to the program named in the form's ACTION, by adding them on to the URL (after a “?”). Simple example: a one-line form with a field named “userid” and ACTION=mycgiprog.cgi. The user enters “nwhite”. The browser then requests a URL ending in mycgiprog.cgi?userid=nwhite.
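The way the browser assembles such a URL can be sketched as follows. The function name build_get_url is hypothetical, and the sketch assumes the values need no URL-encoding (no spaces, “&”, “=” and so on), which a real browser would handle.

```shell
#!/bin/sh
# Hypothetical helper: append name=value pairs to an ACTION program name.
# $1 = program name; remaining arguments = name=value pairs.
# Assumes values contain no characters that would need URL-encoding.
build_get_url() {
    url=$1
    shift
    sep="?"                 # first pair follows "?", later pairs follow "&"
    for pair in "$@"; do
        url="$url$sep$pair"
        sep="&"
    done
    echo "$url"
}

build_get_url mycgiprog.cgi userid=nwhite
```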
The web server takes the information after the “?” and stores it in an environment variable named QUERY_STRING, then executes the program mycgiprog.cgi. QUERY_STRING contains userid=nwhite. The CGI program retrieves the value of QUERY_STRING, does the appropriate processing, and (optionally) sends an HTML response back.
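A minimal sketch of what mycgiprog.cgi might look like for this one-field form. It assumes QUERY_STRING holds a single userid=... pair; the function names get_userid and respond and the greeting text are invented for illustration.

```shell
#!/bin/sh
# Sketch of mycgiprog.cgi, assuming QUERY_STRING is a single "userid=..." pair.

# Parse: strip the leading "userid=" and keep only letters, digits and "_",
# so the value is safe to place inside the HTML response.
get_userid() {
    case "$1" in
        userid=*) echo "${1#userid=}" | tr -cd 'A-Za-z0-9_' ;;
        *)        echo "" ;;
    esac
}

# Generate: header, blank line, then HTML.
respond() {
    echo "Content-type: text/html"
    echo ""
    echo "<html><body>"
    echo "<p>Hello, $1</p>"
    echo "</body></html>"
}

respond "$(get_userid "${QUERY_STRING:-}")"
```

To try it without a web server, set the variable by hand: QUERY_STRING='userid=nwhite' sh mycgiprog.cgi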
Both Windows and Unix support environment variables. These are user session variables which contain character strings. Many, like PATH and PROMPT, are created automatically when the user logs in. Any program can create or retrieve the value of environment variables, so they are often used to pass small amounts of information from one application to another. Different operating systems have different methods for setting and retrieving environment variables. For example, in Unix you retrieve an environment variable's value by putting a $ in front of its name, e.g. $PATH. In Windows, you put % around it, e.g. %PATH%. Try this in Unix: echo $PATH. Or in Windows: echo %PATH%.
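A small Unix sketch of one program passing information to another through the environment. MY_MESSAGE is a made-up variable name; the last line imitates what a web server does for a CGI program.

```shell
#!/bin/sh
# Set an environment variable, then let a child process read it.
# MY_MESSAGE is a made-up name for this illustration.
MY_MESSAGE="hello from the parent"
export MY_MESSAGE

# The child shell is a separate program; it sees the variable only
# because it was exported into the environment.
sh -c 'echo "child sees: $MY_MESSAGE"'

# A web server does much the same for a CGI program: it places
# QUERY_STRING in the environment before starting the program.
QUERY_STRING="userid=nwhite" sh -c 'echo "QUERY_STRING is $QUERY_STRING"'
```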
What if we want to have more than one field? No problem: QUERY_STRING can contain many variable=value pairs separated by “&”, e.g. userid=nwhite&password=junk&fname=Norman. Possible problem: how big can environment variables be (how many characters)? GET is only useful for limited input.
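One way to pull a single field out of a multi-field QUERY_STRING is sketched below. The function name get_field is hypothetical, and no URL-decoding is attempted, so “+” and “%XX” sequences in values are left as they arrive.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field from a query string.
# $1 = query string, $2 = field name. Prints nothing if the field is absent.
# The body runs in a subshell "( ... )" so the IFS change does not leak out.
get_field() (
    IFS='&'                 # split the query string on "&"
    set -f                  # no filename expansion while splitting
    for pair in $1; do
        case "$pair" in
            "$2"=*) printf '%s\n' "${pair#*=}" ;;
        esac
    done
)

QS='userid=nwhite&password=junk&fname=Norman'
get_field "$QS" userid
get_field "$QS" fname
```

Unlike an eval-based approach, this never executes the input; it only compares and prints strings.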
The POST method is more general since it can handle lots of input. Input is passed as a sequence of characters on standard input (stdin): Variable1=value1&Variable2=value2 … The environment variable CONTENT_LENGTH is set to the number of characters of input. The environment variable REQUEST_METHOD is set to POST (instead of GET). Input processing logic needs to be (slightly) different for the GET and POST methods.
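The “slightly different” input logic can be sketched as one function that returns the raw form data for either method. The function name read_form_data is an assumption; the demonstration at the bottom fakes a POST by piping data in and setting the two variables by hand.

```shell
#!/bin/sh
# Hypothetical helper: return the raw form data regardless of method.
# GET:  data is in QUERY_STRING.
# POST: data is on standard input, CONTENT_LENGTH characters long.
read_form_data() {
    if [ "${REQUEST_METHOD:-GET}" = "POST" ]; then
        # Read exactly CONTENT_LENGTH bytes; do not rely on end-of-file.
        dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
    else
        printf '%s' "${QUERY_STRING:-}"
    fi
}

# Demonstration: imitate the server for a POST request.
data='userid=nwhite&fname=Norman'
printf '%s' "$data" | (
    REQUEST_METHOD=POST
    CONTENT_LENGTH=${#data}
    read_form_data
)
echo
```

After this step the data has the same variable=value&variable=value shape for both methods, so the rest of the program can be shared.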
CGI output is passed back to the browser, hence it has to be something (HTML) the browser can understand, like: a “Content-type: text/html” header line, then a blank line, then the HTML output from the CGI script. Sample output: What do you think of this?
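List the contents of your “websys” directory. Create a shell script named lister.cgi which contains:

    #!/bin/sh
    #
    # Header line, then an empty line to end the headers.
    echo "Content-type: text/html"
    echo ""
    echo "<html><body><h1>Listing</h1>"
    echo "<pre>"
    ls -alt
    echo "</pre></body></html>"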
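List the contents of your “websys” directory, with options passed as part of the URL (type=XXX). Create a shell script named lister.cgi which contains:

    #!/bin/sh
    #
    # Turn QUERY_STRING ("a=1&b=2") into shell assignments ("a=1 b=2") and
    # evaluate them, so each field (here: type) becomes a shell variable.
    # Caution: eval runs whatever the client sends; classroom use only.
    eval `echo "$QUERY_STRING" | sed -e 's/&/ /g'`
    echo "Content-type: text/html"
    echo ""
    echo "<html><body><h1>Listing</h1>"
    echo "<pre>"
    ls *.$type
    echo "</pre></body></html>"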
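The eval in the script above hands user input straight to the shell. A more defensive variant, offered here only as a sketch and not as the course's method, extracts type without eval and accepts only letters and digits. The function name safe_type and the error message are made up.

```shell
#!/bin/sh
# Hypothetical helper: extract "type" from a query string without eval.
# Succeeds and prints the value only if it is letters and digits, so it
# cannot carry shell syntax or path components.
safe_type() {
    value=$(printf '%s' "$1" | tr '&' '\n' | sed -n 's/^type=//p' | head -n 1)
    case "$value" in
        ''|*[!A-Za-z0-9]*) return 1 ;;   # empty, or contains other characters
        *) printf '%s\n' "$value" ;;
    esac
}

echo "Content-type: text/html"
echo ""
echo "<html><body><h1>Listing</h1>"
echo "<pre>"
if ext=$(safe_type "${QUERY_STRING:-}"); then
    ls -- *."$ext"
else
    echo "Missing or invalid type parameter"
fi
echo "</pre></body></html>"
```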
To run it, put it in your websys directory and make it readable and executable: chmod +rx lister.cgi. Then enter the script's URL in the browser, substituting your userid.
In the POST method, form data is NOT passed as part of the URL; instead it is passed to the standard input of the CGI program. Advantages: not limited by the maximum size of environment variables; users can't see the input fields in the URL. Disadvantages: a little harder to handle; users can't save or send the link plus form data (e.g., send the results of a search to someone else).
The World Wide Web model is much more powerful than it appears on the surface: it is easily integrated with existing applications, and it is easy to add new functionality. The CGI model can do lots of things: update files, link to corporate databases, run specialized applications.
Problems with the CGI model. Need to parse input. Overhead: need to start up a new program for every request. Scalability: all processing is on the server, so what happens as usage grows? Reliability: how do we replicate for redundancy?
We will look at Unix and how we can use it to develop web applications. Later we will see how to streamline the interaction with AJAX and then HTML5.