Basics of the HTTP Protocol and Apache Web Server
Brandon Checketts
At first there was HTTP 0.9
- This is as simple as it can get:
    GET
    Hello
- Created by Tim Berners-Lee (the Web proposal dates to 1989; the protocol was first documented in 1991)
- The 0.9 version number was assigned retroactively, after the 1.0 spec
HTTP 1.0
- The first really practical revision of the HTTP protocol
- HTTP Request Headers and Response Headers
- Simple caching
- Authentication
- Content-Type
- Sending data via POST
- HTTP Status codes (200, 404, etc)
HTTP 1.1 (in use today)
- Includes everything from HTTP 1.0
- Host header is required
- Defines more status codes, more request methods
- Much more flexible caching available
- Digest Authentication
Sample HTTP Request / Response

Request:
    GET / HTTP/1.1
    Host:
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: ) Gecko/ Firefox/3.5.3
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO ,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

Response:
    HTTP/1.x 200 OK
    X-TR: 1
    Date: Thu, 15 Oct :50:12 GMT
    Expires: -1
    Cache-Control: private, max-age=0
    Content-Type: text/html; charset=UTF-8
    Set-Cookie: __utmv=; expires=Mon, 01-Jan :00:00 GMT; path=/; domain=
    Set-Cookie: __utmv=; expires=Mon, 01-Jan :00:00 GMT; path=/; domain=.google.com
    Server: gws
    X-XSS-Protection: 0
    Content-Length: 9256
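The exchange above is plain text, so it can be built and parsed with a few lines of code. The following is a minimal sketch, not a full HTTP client: the function names `build_request` and `parse_response_head`, the host `example.com`, and the sample response bytes are all illustrative assumptions, and no network connection is made.

```python
def build_request(host, path="/", headers=None):
    """Build a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    # HTTP/1.1 requires a Host header on every request.
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end in CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response_head(raw):
    """Split a raw response into (version, status, reason, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    # Keep headers as a list of pairs: a header such as Set-Cookie may repeat.
    headers = []
    for line in header_lines:
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return version, int(status), reason, headers


request = build_request("example.com", headers={"Connection": "keep-alive"})

# Made-up response used only to exercise the parser.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html; charset=UTF-8\r\n"
          b"Content-Length: 5\r\n"
          b"\r\n"
          b"Hello")
version, status, reason, headers = parse_response_head(sample)
print(status, reason, dict(headers)["content-length"])
```

Header names are lowercased here because HTTP treats them as case-insensitive; a real client would also handle the body, chunked encoding, and malformed input.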
Headers of Interest
- Referer
  - Says which page referred you to the current URL
  - Note the misspelling (of "referrer")
  - Used in Analytics to provide a lot of useful metrics
- User-Agent
  - Specifies OS and Browser (often faked)
- Cookie / Set-Cookie (more on this later)
HTTP Cookies
- Cookies are generally good! They provide some incredibly useful functionality.
- Server sends a Set-Cookie
- Client sends back a Cookie
- Demonstrate a cookie
- Be careful what you put in a cookie!
  - Don’t store user IDs, authentication credentials, etc
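The Set-Cookie / Cookie round trip can be sketched with Python's standard `http.cookies` module. The cookie names `theme` and `lang` and their values are made up for illustration; in line with the warning above, the cookie carries a harmless preference rather than anything sensitive.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["theme"] = "dark"            # hypothetical, non-sensitive preference
outgoing["theme"]["path"] = "/"
outgoing["theme"]["max-age"] = 3600   # lifetime in seconds
set_cookie_header = outgoing["theme"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Client side: later requests send back only name=value pairs
# in a single Cookie header; attributes such as Path are not returned.
cookie_header = "theme=dark; lang=en"

# Server side again: parse the incoming Cookie header.
incoming = SimpleCookie()
incoming.load(cookie_header)
values = {name: morsel.value for name, morsel in incoming.items()}
print(values)
```

Anything placed in a cookie is visible to, and editable by, the client, which is why the value here is something the server can afford to have tampered with.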
Using Cookies to create sessions
- Without cookies, all HTTP requests are completely independent
- Cookies allow the server to add some persistence to multiple requests and create a session
- Most programming languages have some built-in support for sessions. (PHPSESSID, JSESSIONID, etc)
- Session information can be stored in file system, database, memcache, etc.
- Don’t pass Session ID through GET requests
- Demo some simple session examples:
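As a rough illustration of how a session is layered on top of a cookie, the sketch below keeps session data in an in-memory dictionary and hands the client only a random identifier. The cookie name `SESSIONID`, the `handle_request` function, and the view counter are assumptions made for this example; real frameworks add expiry, secure cookie flags, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "SESSIONID"   # hypothetical cookie name
SESSIONS = {}               # session ID -> server-side data (in memory only)


def start_session():
    """Create a new session under an unguessable random ID."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"views": 0}
    return session_id


def handle_request(cookie_header):
    """Return (session_id, view_count) for one incoming request."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get(COOKIE_NAME)
    session_id = morsel.value if morsel else None
    # Unknown or missing IDs get a fresh session instead of being trusted.
    if session_id not in SESSIONS:
        session_id = start_session()
    SESSIONS[session_id]["views"] += 1
    return session_id, SESSIONS[session_id]["views"]


# First request carries no cookie; the second sends the ID back.
sid, views = handle_request(None)
sid2, views2 = handle_request(f"{COOKIE_NAME}={sid}")
print(views, views2, sid == sid2)
```

Only the identifier travels in the cookie; the data itself stays on the server, which is what lets the store be swapped for files, a database, or memcache.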
Apache
Apache Web Server
- Apache is the most popular web server
  - Wikipedia says it powers 55% of all websites and 66% of the biggest websites
- Derived from patches to NCSA httpd … ‘A Patchy’ Server
- Modules provide a lot of extra functionality
  - Some people complain that the modules add a lot of bloat
- High performance, very configurable, easily available.
- Virtual Hosts allow granular control of almost everything
  - Hundreds or thousands of virtual hosts per physical host
- Worker (multi-threaded) versus Prefork (separate processes)
- Version 2.2 is in wide use today
Sample Apache VirtualHost Config
NameVirtualHost :80
<VirtualHost :80>
    ServerName mydomain.com
    ServerAlias *.mydomain.com
    DocumentRoot /home/mydomain.com/www
    CustomLog /home/mydomain.com/logs/access_log combined
    CustomLog /home/mydomain.com/logs/deflate_log deflate
    ErrorLog /home/mydomain.com/logs/error_log
    ScriptAlias /cgi-bin/ /home/mydomain.com/cgi-bin/
    php_admin_flag engine on
    php_admin_value open_basedir "/home/mydomain.com/"
    RewriteEngine On
</VirtualHost>
Apache Modules
- Authentication (mod_auth_*)
  - Via MySQL (multiple applications, single password database)
- Proxying (HTTP, AJP, load balancing)
- Programs (mod_php, mod_python, mod_perl, Passenger)
- SSL
- URL rewriting (mod_rewrite)
- CGI, FastCGI, and SCGI
- WebDAV
- SVN
- Practically anything
- … mod_security …
Apache Proxying
- Load Balancing
    BalancerMember
    BalancerMember
    ProxyPass /test balancer://mycluster/
- Proxying Tomcat
    ProxyPass /myapp ajp:// :8009/myapp/
    ProxyPassReverse /myapp ajp:// :8009/myapp/
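The idea behind a balancer with several members can be illustrated with a plain round-robin picker. This is only a conceptual sketch: the class name `RoundRobinBalancer` and the backend URLs are hypothetical, and Apache's own balancer schedulers are more sophisticated (they can weight members and account for load) than this simple rotation.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend members in a fixed rotation (illustrative only)."""

    def __init__(self, members):
        members = list(members)
        if not members:
            raise ValueError("at least one backend member is required")
        self._rotation = cycle(members)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical backends standing in for the balancer members.
balancer = RoundRobinBalancer([
    "http://backend-a.example:8080",
    "http://backend-b.example:8080",
])
picks = [balancer.pick() for _ in range(4)]
print(picks)
```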
mod_rewrite
- Used to create ‘pretty’ URLs
    RewriteRule ^(.*)\.html$ /realpage.php?name=$1
- Rewrite any request that does not match an existing file or directory to some page:
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
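A rewrite rule is a regular expression plus a substitution, so its matching behavior can be explored outside Apache. The sketch below imitates that with Python's `re` module to compare a loose pattern, `(.*).html`, against a stricter one with the dot escaped and both ends anchored. The `rewrite` helper is a hypothetical stand-in, not how mod_rewrite is implemented, and it ignores flags, conditions, and per-directory path handling.

```python
import re


def rewrite(path, pattern, substitution):
    """Apply one rewrite-style rule; return the path unchanged on no match."""
    match = re.search(pattern, path)
    if match is None:
        return path
    # Expand $1..$9 backreferences from the captured groups.
    return re.sub(r"\$(\d)",
                  lambda ref: match.group(int(ref.group(1))),
                  substitution)


LOOSE = r"(.*).html"        # unescaped dot matches any character
STRICT = r"^(.*)\.html$"    # literal dot, anchored at both ends
TARGET = "/realpage.php?name=$1"

# Both patterns handle the intended case.
print(rewrite("about.html", LOOSE, TARGET))
print(rewrite("about.html", STRICT, TARGET))

# Only the loose pattern fires on a path that merely contains "html".
print(rewrite("xhtml-guide", LOOSE, TARGET))
print(rewrite("xhtml-guide", STRICT, TARGET))
```

In the loose form the unescaped `.` consumes the `x` in `xhtml-guide`, so the rule matches with an empty captured name; the strict form leaves that path alone.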
Useful Apache Tricks
- /server-status/
- apachectl -t -D DUMP_VHOSTS
  - Shows all of the virtual hosts configured
- Debian style setup with a2ensite, a2enmod
  - Symlinks to enable/disable sites and modules
- Documentation is very good
Apache Alternatives
- Nginx (Engine X)
  - Supposed to be very good at proxying
- Lighttpd (Lighty)