
An Introduction to CGI Programming in Perl

By Amy Morrell and Lisa Wilcox. What is Perl? Why use Perl? PERL: Practical Extraction and Reporting Language (by Larry Wall). Interpreted language (not a compiled language): simple to learn, very powerful, gets the job done quickly.


Presentation Transcript


  1. An Introduction to CGI Programming in Perl, by Amy Morrell and Lisa Wilcox

  2. What is Perl? Why use Perl?
     - PERL: Practical Extraction and Reporting Language (by Larry Wall).
     - Interpreted language (not a compiled language): simple to learn, very powerful, gets the job done quickly.
     - Open source. Widely and freely available.
     - Multi-platform and cross-platform compatible (Perl runs on nearly anything).
     - HINT: If working on Unix/Linux, learn to use the vi editor. We find it to be the most expedient way to edit our code.

  3. What is Perl? Why use Perl? Multi-use:
     - Powerful text manipulation language (uses a powerful regular expression engine). Example: searching for a substring is MUCH easier in Perl than in C++.
     - Web applications. (Side note: mod_perl is the marriage of Perl and Apache. You can use Perl to manage Apache, respond to requests for web pages, and much more.)
     - Standalone programs, batch processing, networking, GUI applications, image creation.
     - DB connectivity: use the DBI module to interface with different databases using the same syntax.
     - Can interact with other languages from within the Perl code.
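To make the substring claim concrete, here is a minimal sketch using only core Perl. The helper name `contains` and the sample string are illustrative assumptions, not code from the presentation.

```perl
use strict;
use warnings;

# Sample text for illustration: the title of this presentation.
my $title = "An Introduction to CGI Programming in Perl";

# Returns 1 if $needle occurs anywhere in $text (ignoring case), else 0.
sub contains {
    my ($text, $needle) = @_;
    # \Q...\E quotes regex metacharacters so $needle is matched literally.
    return $text =~ /\Q$needle\E/i ? 1 : 0;
}

print "found\n" if contains($title, "cgi");

# index() returns the zero-based position of a literal substring, or -1.
print "position: ", index($title, "Perl"), "\n";
```

The regex form is handy when the search needs to be case-insensitive or pattern-based; `index` is enough for a plain literal match.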

  4. CPAN: the “Comprehensive Perl Archive Network”. Located at http://www.cpan.org/. Contains thousands of Perl modules, along with corresponding documentation. Most software on CPAN is free. If you need code to accomplish a task, see if there’s a module on CPAN first (someone has usually done the work for you).

  5. The Obligatory “Hello World” Example
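The code from this slide did not survive in the transcript. As a stand-in, the following is a minimal sketch of what a "Hello World" CGI script typically looks like; it is not the authors' hello_world.pl, and the helper name `hello_page` is an assumption made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Builds the complete response: a Content-type header, a blank line,
# then the HTML body. The blank line separates headers from content.
sub hello_page {
    my $html = "<html><head><title>Hello</title></head>"
             . "<body><h1>Hello, World!</h1></body></html>";
    return "Content-type: text/html\n\n$html\n";
}

# A CGI script simply prints its response to standard output;
# the web server relays it to the browser.
print hello_page();
```

Placed in a CGI-enabled directory and made executable, a script of this shape would be run by the web server each time its URL is requested.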

  6. Let’s see what it looks like http://www.nacarbon.org/cgi-nacp/perl_presentation/hello_world.pl

  7. Now for an example that you can actually use in the real world: The NACP Publications List: http://www.nacarbon.org/cgi-nacp/web/investigations/nacp_pubs_list.pl

  8. Code to Create the NACP Publications List – Part I

         #!/usr/bin/perl -T
         ###########################################################################
         require "/var/www/cgi-nacp/nacp_web_gen_conf.pl";
         require "/var/www/cgi-nacp/web/investigations/inv_conf.pl";
         # We've required two of our home-grown configuration files. The first contains
         # variables, queries, and subroutines that we use throughout our www.nacarbon.org
         # website. The second contains only the variables, queries, subroutines, and
         # template locations that we use within the "investigations" directory.

         # REQUIRE VS USE: 'use' and 'require' behave differently. 'require' loads a file
         # at run time, so it's an easy way to create and maintain a configuration file
         # without having to create a Perl module and store it in one of the directory
         # paths listed in @INC (the list of directories Perl uses when searching for
         # modules). 'use' loads a module at compile time and calls its import method;
         # ideally, you would use 'use' as much as possible for real modules.

         use DBI;                 # Perl's database interface (from www.cpan.org).
         use CGI;
         my $query = CGI->new;    # Creating the new CGI object and parsing our input.
         print $query->header;    # Printing out the "Content-type" header.
         use Text::Template;      # CPAN module to do template handling.

         # Install interrupt handler for "die".
         $SIG{__DIE__} = \&die_handler;

         # Connect to the DB. Note that these variables are stored in our main
         # configuration file.
         $dbh = DBI->connect($DB_SOURCE, $DB_USER, $DB_PW);
         $dbh->{RaiseError} = 1;
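The body of `die_handler` lives in the authors' configuration file and is not shown in the slides. The sketch below shows one way a `__DIE__` handler can behave; the in-memory `@error_log` and the handler body are assumptions for illustration only.

```perl
use strict;
use warnings;

my @error_log;   # hypothetical in-memory log, for illustration only

# Perl calls this with the die message before the exception propagates.
sub die_handler {
    my ($message) = @_;
    push @error_log, "Fatal error: $message";
}

# A code reference is an unambiguous way to install the handler.
$SIG{__DIE__} = \&die_handler;

# eval traps the exception so this demo can continue after the handler runs.
my $ok = eval { die "could not connect to database\n"; 1 };
print $error_log[0] unless $ok;
```

In a CGI script, a handler like this is a natural place to print a friendly error page, since the Content-type header has usually been sent already.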

  9. Code to Create the NACP Publications List – Part II

         # Get the pub info. We're loading a query from our configuration file that gets the
         # publication information that we want to display in the report. We are then sending
         # this to a subroutine called get_hashref_data, which creates a "structure" of tabular
         # data using a reference to an array of references to hashes. If you're shaking your
         # head right now thinking "What the heck is she talking about?!?!", don't worry about
         # it. This is just an FYI that Perl can store data in complex ways using references.
         my $pub_refarray = get_hashref_data(*dbh, $QRY_PUBS, 1);

         # Assign title specific to page. We're using a Perl regular expression here to
         # replace the default NACP HTML title with a more specific title. Basically this is
         # looking to substitute anything within the <title> tags. The dot matches any
         # character except a newline, the '\n' matches a newline, * matches 0 or more times,
         # and ? is the minimal match or non-greedy quantifier (so that the regex doesn't
         # match the rest of the template). The (.|\n) alternation is what lets the match span
         # multiple lines. The 'm' modifier makes ^ and $ match at line boundaries (it has no
         # effect here, since the pattern uses neither), and the 'i' makes the match
         # case insensitive.
         $T{'HEADER'} =~ s/<title>(.|\n)*?<\/title>/<title>NACP Publications List<\/title>/mi;

         # Mung the email address. This is using something called an HTML entity in the regex
         # substitution.
         $T{'PUB_EMAIL'} =~ s/\@/&#64;/;

         # Using our template file with Text::Template.
         my $Template = Text::Template->new(TYPE => 'FILE', UNTAINT => 1, SOURCE => $T{'T_NACP_PUBS_LIST'});
         my $HTML = $Template->fill_in(HASH => [\%T, {publications => $pub_refarray}]);
         print "$HTML\n";
         $dbh->disconnect;
         exit 0;
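The two substitutions in Part II can be tried in isolation. This is a small sketch with made-up input; the helper names `set_title` and `mung_email` and the sample strings are assumptions, and the title pattern uses the `/s` modifier as an alternative to the `(.|\n)` alternation.

```perl
use strict;
use warnings;

# Replaces whatever sits between <title> tags with $new_title.
sub set_title {
    my ($html, $new_title) = @_;
    # /s lets "." match newlines, .*? is non-greedy, /i ignores case.
    $html =~ s{<title>.*?</title>}{<title>$new_title</title>}si;
    return $html;
}

# Replaces every "@" with its HTML entity so the address is harder to harvest.
sub mung_email {
    my ($email) = @_;
    $email =~ s/\@/&#64;/g;
    return $email;
}

# Sample header with an upper-case tag and a title that spans two lines.
my $header = "<head><TITLE>NACP\nHome</TITLE></head>";
print set_title($header, "NACP Publications List"), "\n";
print mung_email('someone@example.org'), "\n";
```

Because the quantifier is non-greedy, the match stops at the first closing title tag rather than swallowing the rest of the template.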

  10. Template to Create the NACP Publications List

          {$HEADER}
          <div align="left">
          <table width=600 cellpadding=3 cellspacing=3>
          <tr><td align="left" colspan=2>
          <p class="pageheader"><strong>NACP Publications List</strong></p>
          </td></tr>
          {
            foreach my $pub_ref (@publications) {
              # We're looping through the array, accessing each reference to the hash
              # of key/value pairs, getting the values that we need (referenced by the
              # key), and printing them out in a nicely formatted way. Note the '->'.
              # This is known as the arrow notation or the infix dereference operator.
              # In this case, it means that we are dereferencing $pub_ref to get the value
              # of the "pub_citations" key in the referenced hash.
              if ($pub_ref->{'pub_citations'} =~ /\w/) {
                $OUT .= "<tr><td align=\"left\" valign=\"top\" colspan=2><p class=\"sectionheader\"><a href=\"inv_pgp.pl?pgid=$pub_ref->{'project_group_id'}\"> $pub_ref->{'project_group'} </a></p></td></tr>";
                # We are now tidying things up a bit by using the regular expression
                # substitution to replace the line feed (\n) with an HTML line break tag.
                $pub_ref->{'pub_citations'} =~ s/\n/<br>/g;
                $OUT .= "<tr><td align=\"left\" valign=\"top\" colspan=2><p class=\"default-style\">$pub_ref->{'pub_citations'}</p></td></tr>";
              }
            }
          }
          <tr><td colspan=2> &nbsp;</td></tr></table>
          </div>
          {$FOOTER}
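For readers new to references, here is a self-contained sketch of the same loop outside a template. The rows are invented sample data shaped like the structure the template receives, and `render_rows` is a hypothetical helper, not part of the NACP code.

```perl
use strict;
use warnings;

# Invented sample data: a reference to an array of references to hashes.
my $pub_refarray = [
    { project_group_id => 1, project_group => 'Group A',
      pub_citations    => "First citation\nSecond citation" },
    { project_group_id => 2, project_group => 'Group B',
      pub_citations    => '' },
];

# Builds one table row per group that has at least one citation.
sub render_rows {
    my ($rows) = @_;
    my $out = '';
    foreach my $pub_ref (@$rows) {              # @$rows dereferences the array
        next unless $pub_ref->{pub_citations} =~ /\w/;
        # Copy the value, then turn line feeds into HTML line breaks.
        (my $citations = $pub_ref->{pub_citations}) =~ s/\n/<br>/g;
        $out .= "<tr><td>$pub_ref->{project_group}</td><td>$citations</td></tr>\n";
    }
    return $out;
}

print render_rows($pub_refarray);
```

Unlike the template, this sketch copies the citation text before substituting, so the original data structure is left unchanged.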

  11. Debugging Tips
      - Read “perldoc perldebtut” on your Unix system (at the command line). This talks about:
        - use strict: forces each variable to be declared before use.
        - -w switch: enables warnings, including for syntax that’s technically valid but isn’t working the way you think it should.
        - -c switch: syntax checking at the command line.
      - Use the Perl debugger: perl -d <scriptname.pl>. Also run at the Unix command line. Make sure you remove the -T switch first. Use n to step through each line of the script. Use s to also step into associated subroutines.
      - Old-fashioned brute-force debugging with the print command.
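As a small illustration of what `use strict` buys you, the sketch below compiles a snippet containing a misspelled variable and reports the resulting error. The helper name `check_snippet` and the snippet itself are assumptions made for this example.

```perl
use strict;
use warnings;

# Compiles and runs a snippet of Perl with strict enabled. Returns the
# error text, or an empty string if the snippet compiled and ran cleanly.
sub check_snippet {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? '' : $@;
}

# The second statement misspells the variable name declared in the first.
my $error = check_snippet('my $total = 1; $totl = 2');
print "strict reported: $error" if $error;
```

Without strict, the misspelled name would silently create a new global variable, which is exactly the kind of bug that is tedious to find with print statements alone.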

  12. Security
      - Perl taint mode (-T): strongly recommended! In taint mode, any data provided by users is not trusted until it is validated and declared “safe” within the program. See perldoc perlsec.
      - Always validate inputs. Perl’s regular expression engine is ideal for scrubbing values (blacklisting) and requiring valid values (whitelisting).
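Whitelisting with a regular expression can be sketched as follows. The parameter name `pgid` appears in the publications template, but the rule that it must be 1 to 9 digits, and the helper name `clean_pgid`, are assumptions for illustration.

```perl
use strict;
use warnings;

# Whitelist validation: accept the value only if it is 1 to 9 digits.
# Returns the validated value, or undef if the input is not acceptable.
sub clean_pgid {
    my ($input) = @_;
    return unless defined $input;
    if ($input =~ /\A(\d{1,9})\z/) {
        # Under -T, text captured by a regex match is considered untainted.
        return $1;
    }
    return;
}

for my $raw ('42', '42; drop table pubs') {
    my $pgid = clean_pgid($raw);
    print defined $pgid ? "accepted $pgid\n" : "rejected input\n";
}
```

The anchors `\A` and `\z` matter: without them, a value that merely contains digits somewhere would pass the check.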

  13. More on Security
      - Use CPAN modules and Perl functions: built-in security (a team of developers working on security).
      - Always avoid running system commands using backticks in Perl.
      - File manipulation: CPAN modules and Perl functions exist in place of system commands and offer safer list-argument methods.
      - Web: CGI, HTML::Entities::encode.
      - Database: DBI placeholders. Uses prepared SQL statements with placeholders where data should be inserted. The statement is compiled once (very efficient), and nothing in the inserted data can cause MySQL to do anything unexpected (secure). Proper quoting is performed for you automatically. We use our homegrown “get_query” function that serves the same purpose.
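The "safer list argument" idea can be sketched with the list form of a piped `open`, which hands arguments straight to the program without involving a shell. The helper name `run_command` is an assumption, and the running Perl interpreter (`$^X`) stands in for whatever external program you would really call.

```perl
use strict;
use warnings;

# Runs a command with its arguments passed as a list, so no shell ever
# parses them. Returns everything the command wrote to standard output.
sub run_command {
    my (@command) = @_;
    open(my $pipe, '-|', @command) or die "cannot run $command[0]: $!";
    local $/;                      # slurp all output in one read
    my $output = <$pipe>;
    close($pipe);
    return $output;
}

# With backticks, the text after the semicolon would run as a second command.
# In list form it is just part of one ordinary argument.
my $hostile = 'report.txt; echo injected';
print run_command($^X, '-e', 'print "arg: $ARGV[0]\n"', $hostile);
```

The same list form is available for `system` and `exec`; under taint mode you would also need to set a safe `$ENV{PATH}` before running anything.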

  14. Some Useful Resources
      - Books: Programming Perl and Learning Perl (O’Reilly) are great for beginning Perl programmers. There are also books in this series for intermediate and advanced users.
      - Get acquainted with the internal documentation on your system by typing in:
        - perldoc perldoc: overview of perldoc
        - perldoc perltoc: table of contents for the Perl documentation on your system
        - perldoc perlfunc: syntax of Perl functions
      - Search the Web (Google) for oodles of code snippets, information on specific Perl commands, or help troubleshooting error messages.
      - CPAN!!! If you need help with a CPAN module, you can:
        - Run perldoc on the full name of the module (such as ‘perldoc Text::Template’).
        - Visit http://www.cpan.org, which has documentation for each module.

  15. Other Help. Contact us if you’d like: Lisa Wilcox (lisa.e.wilcox@nasa.gov), Amy Morrell (amy.l.morrell@nasa.gov)

  16. Additional Code Examples: sub get_hashref_data, sub get_query, sub scrub_a_value

  17. sub get_hashref_data

          ################################################################################
          # Gets row(s) of data from the database in a neat associative array. Returns
          # a single reference or a reference to an array of references to hashes,
          # based on the third parameter passed:
          #   0 : single reference
          #   1 : array of references
          # Default action is a single reference. Pass $ref "as is" to Text::Template
          # to fill in the values of the requested keys. Use @$refarray where there are
          # multiple rows to fill in. In Text::Template, loop through as you would
          # rows from get_data, but instead of splitting on delimiters, just place
          # hash references ($refarray->{'key'}) where you want them.
          #
          ################################################################################
          sub get_hashref_data {
              (*dbh, $qry, $flag) = @_;
              my $sth = "";
              my @refarray = ();
              my $ref = "";
              $sth = $dbh->prepare($qry);
              $sth->execute;
              if ($flag == 1) {
                  while ($ref = $sth->fetchrow_hashref) {
                      push(@refarray, $ref);
                  }
                  return(\@refarray);
              }
              else {
                  $ref = $sth->fetchrow_hashref;
                  # Finish before returning; placed after the returns, it never ran.
                  $sth->finish;
                  return($ref);
              }
          }

  18. sub get_query – Part I

          ################################################################################
          # GET_QUERY - Typically used for replacing variables in a SQL query
          #   ex: select time from timetable where btime between ^1^ and ^2^;
          #   The ^1^ and ^2^ would be replaced with appropriate values from the specified
          #   @parms list
          # Parameters: sql   - SQL string (actually, can be any string)
          #             parms - Array of values. Values are substituted for ^1^, ^2^, ..., ^n^
          #                     in the SQL string
          ################################################################################
          sub get_query {
              local($sql, @parms) = @_;
              local($i) = 1;
              local($j) = 0;
              # Whether or not we send input to scrub_a_value is also determined by $T{'scrub_flag'}.
              # $T{'scrub_flag'} = 1 : all input is validated, regardless of existence of a session
              # $T{'scrub_flag'} = 2 : no input is validated, regardless of existence of a session
              # Added single quote escape code
              for ($j = 0; $j <= $#parms; $j++) {
                  # CHANGE all " to single '
                  $parms[$j] =~ s/\"/\'/g;
                  # escape single quotes except when "like '..." passed or enclosed
                  # in single quotes. for example "in (^1^)" where 1 is 'us-pi','sa-pi'

  19. sub get_query – Part II

                  if ( (!($parms[$j] =~ /like '/))
                       and (!($parms[$j] =~ /= '/))
                       and (!($parms[$j] =~ /like upper\('/))
                       and (!($parms[$j] =~ /^\'.*\'$/))
                       and (!($parms[$j] =~ /.*\\'.*/))
                       and (!($parms[$j] =~ /\\'/)) ) {
                      $parms[$j] =~ s/\'/\\'/gm;
                  }
                  # Remove all backticks from user input
                  $parms[$j] =~ s/`/ /gm;
                  # Remove all pipesigns from user input
                  $parms[$j] =~ s/\|/ /gm;
                  # Remove all script tags. Attempt to remove the whole tag and contents first.
                  $parms[$j] =~ s/<script.*?>.*?<\/script.*?>//igm;
                  $parms[$j] =~ s/javascript://ig;
                  $parms[$j] =~ s/<script.*?>//ig;
                  $parms[$j] =~ s/<\/script.*?>//ig;
                  unless ($T{'scrub_flag'} == 2) {
                      $parms[$j] = scrub_a_value($parms[$j]);
                  }
              } # End of for loop

              # ALM noticed a problem here. If one query uses ^1^ and ^3^, but not ^2^,
              # then the ^3^ will never get substituted
              for ($i=1; $i<=$#parms+1; $i++) {
                  if ($sql =~ /\^${i}\^/) {
                      $sql =~ s/\^${i}\^/$parms[$i-1]/g;
                  } #if
              } #for
              return($sql);
          }
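The marker substitution at the end of get_query can be seen on its own in the sketch below. It is a simplified illustration, not the authors' function: the name `fill_markers` is hypothetical, the time values are made up, and all of the escaping and scrubbing steps are deliberately left out.

```perl
use strict;
use warnings;

# Replaces ^1^, ^2^, ... in $sql with the corresponding entries of @parms.
# Each marker is handled independently, so a skipped number does no harm.
sub fill_markers {
    my ($sql, @parms) = @_;
    for my $i (1 .. @parms) {
        my $value = $parms[$i - 1];
        $sql =~ s/\^$i\^/$value/g;
    }
    return $sql;
}

my $sql = fill_markers(
    "select time from timetable where btime between ^1^ and ^2^",
    "'08:00'", "'17:00'",
);
print "$sql\n";
```

Because values are spliced into the SQL text, the safety of this approach rests entirely on the scrubbing done beforehand, which is why get_query cleans every parameter before substituting it.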

  20. sub scrub_a_value

          sub scrub_a_value {
              # Takes a scalar value and removes any tagged input and other problematic
              # characters that a hacker can use to compromise the system via the website.
              # This subroutine is initially being used to "untaint" input from a user that
              # has not logged in, and that does not go through get_query for whatever reason
              # (such as queries that we create from various search tools).
              my ($value) = @_;
              # escape single quotes except when "like '..." passed or enclosed
              # in single quotes. for example "in (^1^)" where 1 is 'us-pi','sa-pi'
              if ( (!($value =~ /like '/))
                   and (!($value =~ /= '/))
                   and (!($value =~ /like upper\('/))
                   and (!($value =~ /^\'.*\'$/))
                   and (!($value =~ /.*\\'.*/)) ) {
                  $value =~ s/\'/\\'/gm;
              }
              # Remove backticks.
              $value =~ s/`/ /gm;
              # Remove tagged input.
              $value =~ s/<.*?>//g;
              # Remove semi-colons and other potentially hazardous characters.
              $value =~ s/;//g;
              $value =~ s/\|//g;
              $value =~ s/\*//g;
              return $value;
          }
