Mod_perl : Performance CGI
Introduction to mod_perl

Mod_perl : Performance CGI in Apache

"mod_perl is more than CGI scripting on steroids. It is a whole new way to create dynamic content by utilizing the full power of the Apache web server to create stateful sessions, customized user authentication systems, smart proxies and much more. Yet, magically, your old CGI scripts will continue to work and work very fast indeed. With mod_perl you give up nothing and gain so much!"
-- Lincoln Stein

Classical Perl-CGI

Conventional perl CGI scripts are compiled, interpreted, and executed like any other perl script.
 · Every time a perl script is run, it is translated (interpreted/compiled) into op code, then executed.
 · The translation step takes time.
 · Any database connections, filehandles, or similar resources exist only for the life of the particular instance of the script being run.

Classical Perl-CGI

In CGI, a Perl script’s output is directed to the user’s browser via the web server, instead of the usual STDOUT (screen).
 · Each time a script is run, it becomes its own process.
 · Each process requires its own compilation and memory space, even if the script is the same.

    %> top i

Classical Perl-CGI : Performance Scenario

Consider the following scenario:
 · A website served from a single server.
 · 20 perl CGI scripts which each serve 5 clients / second.
 · Each script loads 5 modules.
 · Each script creates a database connection.
 · Each script accesses files on the filesystem.
 · The time that it takes to compile/interpret/run each script is 2 seconds.
 · Each instance of the script requires 10 MB of memory.

The effect on the server would be:
 · 100 perl processes compiling, interpreting, and running (5 processes / script / second).
 · 200 seconds of CPU time consumed.
 · 1 GB of memory used.
 · 100 database connections created and destroyed.

Solution:
 · Compile the scripts once.
 · Serve clients from a single cached version of the scripts.
 · Share the database connections.
 · Share any common data in memory.
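
The figures in this scenario follow from simple multiplication. The sketch below recomputes them; the sub name cgi_load and its parameter names are made up for illustration, and the input numbers are the ones stated on the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: derives the per-second server load of plain CGI
# from the scenario's inputs. Every request is its own perl process.
sub cgi_load {
    my (%arg) = @_;
    my $processes = $arg{scripts} * $arg{clients_per_second};
    return (
        processes      => $processes,
        cpu_seconds    => $processes * $arg{seconds_per_run},
        memory_mb      => $processes * $arg{mb_per_process},
        db_connections => $processes,   # one connection per process
    );
}

my %load = cgi_load(
    scripts            => 20,
    clients_per_second => 5,
    seconds_per_run    => 2,
    mb_per_process     => 10,
);

printf "%d processes, %d s CPU, %d MB, %d DB connections\n",
    @load{qw(processes cpu_seconds memory_mb db_connections)};
```

With the slide's inputs this gives 100 processes, 200 CPU seconds, 1000 MB (about 1 GB) and 100 database connections.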

Why Mod_perl?

Problems with conventional perl-CGI:
 · Compilation of script for each request (slow)
 · New process created for each request (resource-intensive)
 · No easy way to share commonly used resources such as modules, data memory, database connections, etc.
 · Limited integration with Apache server, limited control of Apache modules, services, and functions.

Mod_perl’s solutions and features:

Speed and Efficiency
 1. The standard Apache::Registry module can provide 100x speedups for existing CGI scripts and reduce the load on the server at the same time.
 2. Scripts are wrapped as subroutines within a handler in the server module, which execute faster.

Shared Resources
 1. Share database connections.
 2. Share memory.

Server control / Customization
 1. Apache can be controlled using existing modules.
 2. Custom modules and handlers can be easily written to extend server functionality.

Control over request stages
 1. Rewrite URLs in Perl based on the content of directories, settings stored in a database, or anything else conceivable.
 2. Maintenance of state within the server memory.

Approaches to Perl Coding

One-off scripting and the “one-pot” approach:
 · Sufficient for a non-persistent script
 · One set of output based on one set of inputs
 · Subroutines can access and modify the globally available data.

Programming by “passing-the-buck”:
 · Better for a persistent “program”
 · Input/Output dynamic based on parameters
 · Subroutines should only be able to access global data under certain conditions

[Diagram labels: Input, Output, Variables, Functions, Subroutines, Main]

Nested Subroutines in Perl

nested.pl

    #!/usr/bin/perl -w
    use diagnostics;
    use strict;

    sub print_power_of_2 {
        my $x = shift;

        sub power_of_2 {
            return $x ** 2;
        }
        my $result = power_of_2();
        print "$x^2 = $result\n";
    }

    print_power_of_2(5);
    print_power_of_2(6);

The script should print the square of each number passed to it, but the second result is wrong:

    % ./nested.pl
    5^2 = 25
    6^2 = 25

If we use the warnings (-w) pragma we get the warning:

    Variable "$x" will not stay shared at ./nested.pl line 9.

If we use diagnostics.pm we get:

    (W) An inner (nested) named subroutine is referencing a lexical variable defined in an outer subroutine.

    When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared.

    Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will never share the given variable.

    This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, they are automatically rebound to the current values of such variables.

How mod_perl Works

 · Mod_perl is a binary module extension which provides Apache with a “built-in” perl interpreter.
 · Requests which map to directories assigned to mod_perl are serviced by perl packages called “handlers”.
 · The handler is compiled once by the built-in interpreter and cached in memory.
 · The most important mod_perl handler is called Apache::Registry.
 · The Apache server loads a parent “server” process (httpd), and this process forks a specified number of children.
 · Each process contains the mod_perl module and can serve requests.
 · The children can share memory from the parent.

[Diagram: httpd parent (with mod_perl) on port 80, forking httpd children (each with mod_perl)]

Content Handlers

ModPerl/Rules1.pm

    package ModPerl::Rules1;
    use Apache::Constants qw(:common);

    sub handler {
        print "Content-type: text/plain\n\n";
        print "mod_perl rules!\n";
        return OK;   # We must return a status to mod_perl
    }
    1;   # This is a perl module so we must return true to perl

ModPerl/Rules2.pm

    package ModPerl::Rules2;
    use Apache::Constants qw(:common);

    sub handler {
        my $r = shift;   # the request object
        $r->send_http_header('text/plain');
        $r->print("mod_perl rules!\n");
        return OK;   # We must return a status to mod_perl
    }
    1;

All content handlers in mod_perl must have a ‘handler’ subroutine.

To add the handler to the server configuration, the httpd.conf file must be modified and the server restarted:

    /usr/local/apache/conf/httpd.conf

In Red Hat 9, httpd.conf is in a different location, and the mod_perl configuration is in a separate file:

    /etc/httpd/conf/httpd.conf
    /etc/httpd/conf.d/perl.conf

The following configuration snippet is added to httpd.conf or perl.conf:

    PerlModule ModPerl::Rules1
    SetHandler perl-script
    PerlHandler ModPerl::Rules1
    PerlSendHeader On

Output:

    mod_perl rules!

Apache::Registry / ModPerl::Registry

counter.pl:

    #!/usr/bin/perl -w
    use CGI qw(:all);
    use strict;

    print header;
    my $counter = 0;   # redundant
    for (1..5) {
        increment_counter();
    }

    sub increment_counter {
        $counter++;
        print "Counter is equal to $counter !", br;
    }

To use this script in mod_perl’s Apache::Registry, we must save the file in the appropriate directory specified in the directive in httpd.conf / perl.conf:

Standard Apache installation:

    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
    PerlSendHeader On

Red Hat 9 (Apache 2.0):

    SetHandler perl-script
    PerlModule ModPerl::Registry
    PerlHandler ModPerl::Registry::handler
    Options +ExecCGI

Apache::Registry / ModPerl::Registry : Continued

    package Apache::ROOT::perl::counter_2epl;
    use Apache qw(exit);

    sub handler {
        use strict;
        print header;
        my $counter = 0;   # redundant
        for (1..5) {
            increment_counter();
        }

        sub increment_counter {
            $counter++;
            print "Counter is equal to $counter !\r\n";
        }
    }

The script counter.pl is compiled into the package Apache::ROOT::perl::counter_2epl and is wrapped into this package’s “handler” subroutine.

We would expect to see the output:

    Counter is equal to 1 !
    Counter is equal to 2 !
    Counter is equal to 3 !
    Counter is equal to 4 !
    Counter is equal to 5 !

After some reloading, we start to get strange results, with the counter starting at higher numbers like 6, 11, 16 and so on:

    Counter is equal to 6 !
    Counter is equal to 7 !
    Counter is equal to 8 !
    Counter is equal to 9 !
    Counter is equal to 10 !

 · The major cause of this bug: nested subroutines. Once wrapped, increment_counter is a named sub nested inside handler, so it stays bound to the $counter of the first request.
 · The buggy output does not grow steadily because the requests are served by different children, each with its own copy of the counter.
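
The symptom can be reproduced without Apache. The following is a minimal sketch, assuming that calling a handler sub several times in one process is a fair stand-in for one httpd child serving several requests. It is not the code Apache::Registry actually generates, and it returns lines instead of printing HTTP output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the wrapper sub built around counter.pl.
sub handler {
    my @lines;
    my $counter = 0;
    for (1 .. 5) {
        push @lines, increment_counter();
    }
    return @lines;

    # Named inner sub: compiled once, bound to the first $counter only.
    sub increment_counter {
        no warnings 'closure';   # silence "will not stay shared"
        $counter++;
        return "Counter is equal to $counter !";
    }
}

# Two "requests" handled by the same process.
for my $request (1 .. 2) {
    print "request $request\n";
    print "  $_\n" for handler();
}
```

The first request counts from 1 to 5. The second continues from 6, because increment_counter still updates the $counter that existed during the first call.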

Solving the Nested Subroutine Problem: Anonymous subs, Scoping

anonymous.pl

    #!/usr/bin/perl
    use strict;

    sub print_power_of_2 {
        my $x = shift;
        my $func_ref = sub {
            return $x ** 2;
        };
        my $result = $func_ref->();
        print "$x^2 = $result\n";
    }

    print_power_of_2(5);
    print_power_of_2(6);

 · Change the named inner nested subroutine to an anonymous subroutine.
 · The anonymous subroutine is created anew on every call to the outer subroutine, so it sees the variables of the current lexical context whenever it is called.
 · The $x variable is in the same lexical scope as the anonymous subroutine, so the subroutine ‘sees’ the variable and its current value.
 · The anonymous subroutine acts as a closure.

    % ./anonymous.pl
    5^2 = 25
    6^2 = 36

Solving the Nested Subroutine Problem: Package Scoped Variables

multirun.pl

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..2) {
        print "run: [time $_]\n";
        run();
    }

    sub run {
        # Use exactly one of the three declarations:
        my $counter = 0;              # 1. lexical (buggy)
        # our $counter = 0;           # 2. package scoped
        # local our $counter = 0;     # 3. package scoped and localized

        increment_counter();
        increment_counter();

        sub increment_counter {
            $counter++;
            print "Counter is equal to $counter !\n";
        }
    } # end of sub run

When the script is run using the lexically scoped $counter variable we get:

    Variable "$counter" will not stay shared at ./nested.pl line 18.
    run: [time 1]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 2]
    Counter is equal to 3 !
    Counter is equal to 4 !

The $counter variable in the named subroutine remains bound to the $counter of the first run (named subs are compiled once).

If we use ‘our’ to scope $counter to the package it works:

    run: [time 1]
    Counter is equal to 1 !
    Counter is equal to 2 !
    run: [time 2]
    Counter is equal to 1 !
    Counter is equal to 2 !

If we add ‘local’, the variable is restored to its previous value (here undefined) when it goes out of scope. For variables which are references to large data structures, this is useful in preventing memory leakage.

Solving the Nested Subroutine Problem: Parameter Passing, References

multirun3.pl (pass the value in and return the result)

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..3) {
        print "run: [time $_]\n";
        run();
    }

    sub run {
        my $counter = 0;
        $counter = increment_counter($counter);

        sub increment_counter {
            my $counter = shift;
            $counter++;
            print "Counter is equal to $counter !\n";
            return $counter;
        }
    } # end of sub run

multirun4.pl (pass a reference to the variable)

    #!/usr/bin/perl
    use strict;
    use warnings;

    for (1..3) {
        print "run: [time $_]\n";
        run();
    }

    sub run {
        my $counter = 0;
        increment_counter(\$counter);

        sub increment_counter {
            my $r_counter = shift;
            $$r_counter++;
            print "Counter is equal to $$r_counter !\n";
        }
    } # end of sub run

Porting example

Param_printer.pl (original)

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my $opt_p = param('p') || 20;   # primer size
    my $opt_a = param('a') || 2;    # primer size range
    my $opt_t = param('t') || 60;   # opt. tm
    my $opt_b = param('b') || 5;    # tm range
    my $opt_y = param('y') || 5;    # primer sets per exon

    print header;
    print_options();
    print end_html;

    sub print_options {
        print "$opt_p, $opt_a, $opt_t, $opt_b, $opt_y", br;
    }

    sub front_page {
        # code to print the default webpage if no parameters are passed.
        # The front page will have the form which sends the parameters
        # back to the script using GET / POST
    }

When this script is run in mod_perl, it is wrapped in the handler subroutine of the package. This is the inner subroutine problem again: we get the same initial parameters repeatedly.

Since the variables follow a distinct pattern, we can use command-line perl and a regex to convert them to a hash:

    % perl -i.bak -pe 's/\$opt_(\w+)/\$opt{$1}/g' param_printer.pl

The ‘my’ scoping must be removed from the hash assignments. We declare the hash %opt and then pass the options into the subroutine:

Param_printer.pl (converted)

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my %opt;
    $opt{p} = param('p') || 20;   # primer size
    $opt{a} = param('a') || 2;    # primer size range
    $opt{t} = param('t') || 60;   # opt. tm
    $opt{b} = param('b') || 5;    # tm range
    $opt{y} = param('y') || 5;    # primer sets per exon

    print header;
    print_options(%opt);
    print end_html;

    sub print_options {
        my %opt = @_;   # private copy of the options passed in
        print "$opt{p}, $opt{a}, $opt{t}, $opt{b}, $opt{y}", br;
    }

    sub front_page {
        # code to print the default webpage if no parameters are passed.
    }

Porting example : continued

If we want to pass more than one variable of different types (arrays, scalars, and hashes) into the subroutine, we can use references:

    #!/usr/bin/perl -w
    use strict;
    use CGI qw(:standard);

    front_page() if !param();

    my %opt;
    $opt{p} = param('p') || 20;   # primer size
    $opt{a} = param('a') || 2;    # primer size range
    $opt{t} = param('t') || 60;   # opt. tm
    $opt{b} = param('b') || 5;    # tm range
    $opt{y} = param('y') || 5;    # primer sets per exon
    my $text = "These are the parameters";
    = split (param('a'));

    print header;
    print end_html;

    sub print_options {
        my ($opt_ref, $text_ref, $array_ref) = @_;
        my %opt  = %$opt_ref;
        my $text = $$text_ref;
        print $text, br, "$opt{p}, $opt{a}, $opt{t}, $opt{b}, $opt{y}", br;
        print join
    }

    sub front_page {
        # code to print the default webpage if no parameters are passed.
    }

The references will cause mod_perl to hold on to the data that they reference. We should use ‘local our’ to clean up those references after they are used:

    local our %opt;
    …..
    local our $text = "These are the parameters";
    local = split (param('a'));

    print header;
    print end_html;

    sub print_options {
        print $text, br, "$opt{p}, $opt{a}, $opt{t}, $opt{b}, $opt{y}", br;
        print join
    }
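
As a self-contained illustration of passing a hash, a scalar, and an array into one subroutine by reference, here is a minimal sketch. The sub name format_options, the array @sizes, and its values are made up for this example and do not come from the slide. The sketch returns a string instead of printing CGI output so it runs without the CGI module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: receives references to a hash, a scalar and an
# array, so it never reaches for lexicals of an enclosing scope.
sub format_options {
    my ($opt_ref, $text_ref, $list_ref) = @_;
    my $options = join ", ",
        map { "$_=$opt_ref->{$_}" } sort keys %$opt_ref;
    return "$$text_ref: $options [" . join(" ", @$list_ref) . "]";
}

my %opt   = (p => 20, a => 2, t => 60);
my $text  = "These are the parameters";
my @sizes = (18, 20, 22);   # illustrative values only

print format_options(\%opt, \$text, \@sizes), "\n";
```

Because everything the subroutine needs arrives through its argument list, the result does not depend on which copy of an outer variable a named sub happened to capture.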

Database Connections : Apache::DBI

In regular CGI, a script that connects to the database creates its own connection every time it is run. If 20 scripts are each accessed 10 times, that’s 200 database connections which are created and destroyed. Database connections are expensive.

To mitigate this shortcoming, use Apache::DBI, which allows persistent database connections to be created in mod_perl.
 · The DBI module checks the $ENV{MOD_PERL} environment variable. If Apache::DBI has been loaded, DBI forwards connect() requests to it.
 · The disconnect() method is overridden with a no-op, so the connection stays open.

Apache::DBI should be loaded in httpd.conf / perl.conf:

    PerlModule Apache::DBI

After that, you program DBI just as if you used “use DBI;”. The “use DBI;” statement can remain in your scripts.

    use DBI;
    $dbh = DBI->connect($data_source, $username, $auth, \%attr);
    $sth = $dbh->prepare($statement);
    $rv  = $sth->execute;
    @row_ary = $dbh->selectrow_array($statement);
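
The idea behind a persistent connection can be sketched as a per-process cache keyed on the connect arguments. This is only an illustration of the principle, not Apache::DBI's actual implementation. The subs open_connection and cached_connect, the data source string, and the user name are hypothetical, and a plain hash reference stands in for a real database handle so the sketch runs without a database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %handle_for;   # cache: connect arguments -> handle
our $opened = 0;   # number of "real" connections made so far

# Stand-in for an expensive connect call (hypothetical).
sub open_connection {
    my ($dsn, $user) = @_;
    $opened++;
    return { dsn => $dsn, user => $user };
}

# Hypothetical cached connect: reuse the handle when the arguments match.
sub cached_connect {
    my ($dsn, $user) = @_;
    my $key = join "\0", $dsn, $user;
    $handle_for{$key} ||= open_connection($dsn, $user);
    return $handle_for{$key};
}

# Ten "requests" in the same process ask for the same connection.
cached_connect("dbi:mysql:test", "web") for 1 .. 10;
print "requests: 10, real connections: $opened\n";
```

Package variables declared with ‘our’ are used for the cache so that the named subs do not capture a lexical, in line with the scoping advice on the earlier slides.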

Sharing Memory : Aliasing

A package is created with a hash that contains configuration parameters for some scripts. We want to be able to use this hash in other scripts.

My/Config.pm

    package My::Config;
    use strict;
    use vars qw(%c);

    %c = (
        dir => {
            cgi  => "/home/httpd/perl",
            docs => "/home/httpd/docs",
            img  => "/home/httpd/docs/images",
        },
        url => {
            cgi  => "/perl",
            docs => "/",
            img  => "/images",
        },
        color => {
            hint   => "#777777",
            warn   => "#990066",
            normal => "#000000",
        },
    );
    1;   # a module must return true

A script that uses the configuration:

    use strict;
    use My::Config ();
    use vars qw(%c);
    *c = \%My::Config::c;

    print "Content-type: text/plain\r\n\r\n";
    print "My url docs root: $c{url}{docs}\n";

 · The *c glob has been aliased with \%My::Config::c, a reference to a hash. From now on, %My::Config::c and %c are the same hash and you can read from or modify either of them.
 · Any script that loads the package can share this variable.
 · You can also use the Exporter arrays (@EXPORT, @EXPORT_OK) in your package to export the variables that you want to share.
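
That the alias and the original really are one hash can be checked in a single file. The sketch below is a condensed variant, assuming the package is defined inline rather than loaded from My/Config.pm, and using ‘our’ in place of ‘use vars’; the hash is shortened to two entries.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inline stand-in for My/Config.pm so the sketch runs on its own.
{
    package My::Config;
    our %c = (
        url   => { docs => "/", img => "/images" },
        color => { hint => "#777777" },
    );
}

package main;

our %c;                  # declare %main::c for strict
*c = \%My::Config::c;    # alias: both names now refer to one hash

$c{color}{hint} = "#000000";   # write through the alias

print "docs root: $c{url}{docs}\n";
print "hint seen by My::Config: $My::Config::c{color}{hint}\n";
```

The change made through %c is visible as $My::Config::c{color}{hint}, since no copy of the data is made.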

Server Configuration : httpd.conf / perl.conf

    ##
    ## Server-Pool Size Regulation (MPM specific)
    ##

    # prefork MPM
    # StartServers: number of server processes to start
    # MinSpareServers: minimum number of server processes which are kept spare
    # MaxSpareServers: maximum number of server processes which are kept spare
    # MaxClients: maximum number of server processes allowed to start
    # MaxRequestsPerChild: maximum number of requests a server process serves
    StartServers         1
    MinSpareServers      1
    MaxSpareServers      1
    MaxClients           1
    MaxRequestsPerChild  1000

    # worker MPM
    # StartServers: initial number of server processes to start
    # MaxClients: maximum number of simultaneous client connections
    # MinSpareThreads: minimum number of worker threads which are kept spare
    # MaxSpareThreads: maximum number of worker threads which are kept spare
    # ThreadsPerChild: constant number of worker threads in each server process
    # MaxRequestsPerChild: maximum number of requests a server process serves
    StartServers         1
    MaxClients           10
    MinSpareThreads      1
    MaxSpareThreads      1
    ThreadsPerChild      10
    MaxRequestsPerChild  0

    # perchild MPM
    # NumServers: constant number of server processes
    # StartThreads: initial number of worker threads in each server process
    # MinSpareThreads: minimum number of worker threads which are kept spare
    # MaxSpareThreads: maximum number of worker threads which are kept spare
    # MaxThreadsPerChild: maximum number of worker threads in each server process
    # MaxRequestsPerChild: maximum number of connections per server process
    NumServers           5
    StartThreads         5
    MinSpareThreads      5
    MaxSpareThreads      10
    MaxThreadsPerChild   20
    MaxRequestsPerChild  0

    #
    # Mod_perl incorporates a Perl interpreter into the Apache web server,
    # so that the Apache web server can directly execute Perl code.
    # Mod_perl links the Perl runtime library into the Apache web server
    # and provides an object-oriented Perl interface for Apache's C
    # language API. The end result is a quicker CGI script turnaround
    # process, since no external Perl interpreter has to be started.
    #
    LoadModule perl_module modules/mod_perl.so
    PerlRequire /etc/httpd/conf/start-up.pl

    # This will allow execution of mod_perl to compile your scripts to
    # subroutines which it will execute directly, avoiding the costly
    # compile process for most requests.
    Alias /perl /var/www/perl
    SetHandler perl-script
    PerlHandler ModPerl::Registry::handler
    PerlOptions +ParseHeaders
    Options +ExecCGI

Performance Tuning : Startup.pl

    use lib("/var/www/perl");

    # Preload modules in the parent so the children share them
    use MultisageConfig ();
    use DBI ();
    use CGI ();
    CGI->compile(':all');   # precompile CGI.pm's methods

    1;   # a file loaded with PerlRequire must return true

Mod Perl API / Packages

 · Apache::Session - Maintain session state across HTTP requests
 · Apache::DBI - Initiate a persistent database connection
 · Apache::Watchdog::RunAway - Hanging Processes Monitor and Terminator
 · Apache::VMonitor - Visual System and Apache Server Monitor
 · Apache::GTopLimit - Limit Apache httpd processes
 · Apache::Request (libapreq) - Generic Apache Request Library
 · Apache::RequestNotes - Allow Easy, Consistent Access to Cookie and Form Data Across Each Request Phase
 · Apache::PerlRun - Run unaltered CGI scripts under mod_perl
 · Apache::RegistryNG - Apache::Registry New Generation
 · Apache::RegistryBB - Apache::Registry Bare Bones
 · Apache::OutputChain - Chain Stacked Perl Handlers
 · Apache::Filter - Alter the output of previous handlers
 · Apache::GzipChain - compress HTML (or anything) in the OutputChain
 · Apache::Gzip - Auto-compress web files with Gzip
 · Apache::PerlVINC - Allows Module Versioning in Location blocks and Virtual Hosts
 · Apache::LogSTDERR
 · Apache::RedirectLogFix
 · Apache::SubProcess
 · Module::Use - Log and Load used Perl modules
 · Apache::ConfigFile - Parse an Apache style httpd.conf config file
 · Apache::Admin::Config - Object oriented access to Apache style config files

Conclusions

 · Mod_perl has to be done right
 · Take care of nested subroutines
 · Go to perl.apache.org