Topics

Presentation Transcript


  1. Topics
     • Quiz 1
     • Homework Review
     • Programming Assignment #1
     • Perl shortcuts
     • Declaring variables and Scope
     • Subroutines
       • passing arguments
       • array references
     • Programming Methods
       • Top Down Design
       • Bottom Up Coding and Testing
     • Debugging
     • Reading manuals and help pages
     • Plain old documentation (POD)
     • Lab time
     BINF 634 FALL 2014

  2. Acknowledgements
     • Thanks to John Grefenstette for allowing me to use these slides as a starting point for tonight’s lecture

  3. Some Humor
     • Perl can be powerful

  4. Perl Shortcuts
     • Any simple statement can be followed by a single modifier right before the ; or closing }

         STATEMENT if EXPR
         STATEMENT unless EXPR
         STATEMENT while EXPR
         STATEMENT until EXPR

         $ave = $ave/$n unless $n == 0;

       Same as:

         unless ($n == 0) { $ave = $ave/$n }

     • What does this do?

         $x = 0;
         print $x++, "\n" until $x == 10;

       Output:

         0
         1
         2
         3
         4
         5
         6
         7
         8
         9

  5. Perl Shortcuts
     • Any simple statement can be followed by a single modifier

         STATEMENT foreach LIST

       STATEMENT is evaluated for each item in LIST, with $_ set to the current item.

         @A = qw/One two three four/;
         print "$_\n" foreach @A;

       Output:

         One
         two
         three
         four

  6. Perl Shortcuts
     • Predefined Perl functions may be used with or without parentheses around their arguments:

         $next = shift @A;
         open FILE, $filename or die "Can't open $filename";
         @chars = split //, $word;
         @fields = split /:/, $line;

     • Many Perl functions assume $_ if their argument is omitted:

         @A = qw/One two three four/;
         print length, " $_\n" foreach @A;

       Output:

         3 One
         3 two
         5 three
         4 four
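
The statement modifiers and the $_ default from the Perl Shortcuts slides can be combined in one short script. This is a minimal sketch: the helper name safe_average and the sample data are assumptions made for illustration, not part of the lecture.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: average of a list, guarded by a statement modifier.
sub safe_average {
    my @values = @_;
    my $n   = @values;                  # array in scalar context gives its length
    my $sum = 0;
    $sum += $_ foreach @values;         # foreach modifier; $_ is the current item
    my $ave = 0;
    $ave = $sum / $n unless $n == 0;    # division is skipped for an empty list
    return $ave;
}

my @A = qw/One two three four/;
print length, " $_\n" foreach @A;       # length defaults to $_
print safe_average(2, 4, 6), "\n";
print safe_average(), "\n";
```

The unless modifier keeps the divide-by-zero guard on the same line as the division it protects, which is the main readability argument for using modifiers on simple statements.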

  7. Scope of variables
     • my variables can be accessed only until the end of the enclosing block (or until the end of the file, if declared outside any block)
     • It's best to declare a variable in the smallest possible scope

         if ($x < $y) { my $tmp = $x; $x = $y; $y = $tmp }

     • Variables declared in a control-flow statement are visible only within the associated block:

         my @seq_list = qw/ATT TTT GGG/;
         my $sequence = "NNN";
         for my $sequence (@seq_list) {
             $sequence .= "TAG";
             print "$sequence\n";
         }
         print "$sequence\n";

       Output:

         ATTTAG
         TTTTAG
         GGGTAG
         NNN

     • Are these two different variables?
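
The scoping rule can be checked with a bare block as well as with a loop. The following is a small sketch under the assumption that the variable name $label and the helper ordered_pair are free to invent; neither appears in the slides.

```perl
use strict;
use warnings;

my $label = "outer";
{
    my $label = "inner";            # a new variable that hides the outer one
    print "in block: $label\n";
}
print "after block: $label\n";      # the outer variable is untouched

# Hypothetical helper: swap through a temporary that exists only
# inside the if block, as in the slide's one-line example.
sub ordered_pair {
    my ($x, $y) = @_;
    if ($x > $y) { my $tmp = $x; $x = $y; $y = $tmp }
    return ($x, $y);
}
print join(" ", ordered_pair(7, 3)), "\n";
```

Referring to $tmp after the closing brace of the if block would be a compile-time error under use strict, which is why the smallest possible scope is a useful habit.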

  8. Subroutines
     Advantages of Subroutines
     • Shorter code
     • Easier to test
     • Easier to understand
     • More reliable
     • Faster to write
     • Re-usable

  9. Subroutines
     • Defining a subroutine:

         sub name { BLOCK }

     • Arguments are accessed through the array @_
     • Subroutine values are returned by:

         return VALUE

     • Subroutines may be defined anywhere in the file, but are usually placed at the end
     • They can be arranged alphabetically or by functionality

  10. Passing Parameters Into Subroutines
      • Values are passed into subroutines using the special array @_
      • How do we know that this is an array? (The @ sigil marks an array.)
      • The name of this array, without the sigil, is _
      • It contains all of the scalars passed into the subroutine

  11. Pass by Value
      Why are the two values different?

          #!/usr/bin/perl -w
          # A driver program to test a subroutine that
          # uses pass by value
          use strict;
          use warnings;

          my $i = 2;
          simple_sub($i);
          print "In main program, after the subroutine call, \$i equals $i\n\n";
          exit;

          sub simple_sub {
              my($i)=@_;
              $i += 100;
              print "In subroutine simple_sub, \$i equals $i\n\n";
          }

      Output:

          In subroutine simple_sub, $i equals 102

          In main program, after the subroutine call, $i equals 2
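
The two values differ because my($i)=@_ copies the argument into a new lexical variable. The sketch below contrasts that copy with working on $_[0] directly; in Perl the elements of @_ are aliases for the caller's arguments. The sub names add_100_to_copy and add_100_in_place are hypothetical.

```perl
use strict;
use warnings;

# Copies its argument, as on the slide: the caller's variable is untouched.
sub add_100_to_copy {
    my ($n) = @_;
    $n += 100;
    return $n;
}

# Works on $_[0] directly. @_ aliases the caller's arguments,
# so this changes the caller's variable.
sub add_100_in_place {
    $_[0] += 100;
    return;
}

my $i = 2;
add_100_to_copy($i);
print "after add_100_to_copy:  $i\n";
add_100_in_place($i);
print "after add_100_in_place: $i\n";
```

Copying @_ into my variables on the first line of a subroutine, as the lecture examples do, is the usual way to get pass-by-value behavior.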

  12. There is a bug in this program as written. Can you find it? How would you fix it to produce the output shown below?

          #!/usr/bin/perl
          use strict;
          use warnings;
          # File: min.pl
          my $a = <STDIN>; chomp $a;
          my $b = <STDIN>; chomp $b;
          $small = min($a, $b);
          print "min of $a and $b is $small\n";
          exit;

          sub min {
              my ($n, $m) = @_; # @_ is the array of parameters
              if ($n < $m) { return $n }
              else { return $m }
          }

          % min.pl
          123
          45
          min of 123 and 45 is 45

      Answer: $small is not declared. Under use strict it must be declared with my, for example: my $small = min($a, $b);

  13. Subroutines can return lists

          #!/usr/bin/perl
          use strict;
          use warnings;
          # File: min_max.pl
          ## Subroutines can return lists
          my $a = <STDIN>; chomp $a;
          my $b = <STDIN>; chomp $b;
          my ($small, $big) = min_max($a, $b);
          print "max of $a and $b is $big\n";
          print "min of $a and $b is $small\n";
          exit;

          sub min_max {
              my ($n, $m) = @_; # @_ is the array of parameters
              if ($n < $m) { return ($n, $m) }
              else { return ($m, $n) }
          }

          % min_max.pl
          123
          45
          max of 123 and 45 is 123
          min of 123 and 45 is 45

  14. Passing arguments
      • All arguments are passed in a single list

          @a = qw/ This will all /;
          $b = "end";
          @c = qw/ up together /;
          @c = foo(@a, $b, @c);
          print "@c\n";

          sub foo {
              my @args = @_;
              return @args;
          }

        Output:

          This will all end up together

  15. Array Flattening

          #!/usr/bin/perl -w
          # A driver program to test a subroutine that
          # illustrates array flattening
          use strict;
          use warnings;

          my @i = ('1', '2', '3');
          my @j = ('a','b','c');
          print "In main program before calling subroutine: i = " . "@i\n";
          print "In main program before calling subroutine: j = " . "@j\n";
          reference_sub(@i, @j);
          print "In main program after calling subroutine: i = " . "@i\n";
          print "In main program after calling subroutine: j = " . "@j\n";
          exit;

          sub reference_sub {
              my (@i, @j) = @_;
              print "In subroutine : i = " . "@i\n";
              print "In subroutine : j = " . "@j\n";
              push(@i, '4');
              shift(@j);
          }

      Output:

          In main program before calling subroutine: i = 1 2 3
          In main program before calling subroutine: j = a b c
          In subroutine : i = 1 2 3 a b c
          In subroutine : j =
          In main program after calling subroutine: i = 1 2 3
          In main program after calling subroutine: j = a b c

  16. Passing by Value Versus Passing by Reference
      • Passing by Value
        • Pass a copy of the variable
        • Changes made to the variable in the subroutine do not affect the value of variables in the main body
        • Can cause array flattening
      • Passing by Reference
        • Pass a reference (pointer) to the variable
        • Must be dereferenced when used in the subroutine
        • This is the cure for array flattening

  17. Perl References - I
      • A reference is a scalar variable that refers to (points to) another variable
      • So a reference might refer to an array

          $aref = \@array;   # $aref now holds a reference to @array
          $xy = $aref;       # $xy now holds a reference to @array

      • In the snippet below, lines 2 and 3 working together do the same thing as line 1

          $aref = [ 1, 2, 3 ];
          @array = (1, 2, 3);
          $aref = \@array;

      http://perl.plover.com/FAQs/references.html

  18. Perl References - II
      http://perl.plover.com/FAQs/references.html

  19. Dereferencing
      ${$aref}[3] is too hard to read, so you can write $aref->[3] instead
      • Additional helpful discussions can be found at
        • http://oreilly.com/catalog/advperl/excerpt/ch01.html
        • http://perl.plover.com/FAQs/references.html
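
The two dereferencing spellings on the Dereferencing slide reach the same element. Here is a short sketch that puts them side by side; the helper element_at and the sample values are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch one element through an array reference.
sub element_at {
    my ($aref, $index) = @_;
    return $aref->[$index];        # same element as ${$aref}[$index]
}

my @array = (10, 20, 30, 40);
my $aref  = \@array;               # reference to a named array
my $anon  = [ 1, 2, 3 ];           # reference to an anonymous array

print "${$aref}[3]\n";             # block form
print "$aref->[3]\n";              # arrow form, same element
print "@{$aref}\n";                # the whole array
print scalar(@$anon), "\n";        # number of elements
print element_at($anon, 0), "\n";

$aref->[0] = 99;                   # writes through to @array
print "$array[0]\n";
```

Because $aref points at @array rather than holding a copy, the assignment through the reference is visible in the original array.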

  20. Passing by Reference

          #!/usr/bin/perl -w
          # A driver program to test a subroutine that
          # passes by reference
          use strict;
          use warnings;

          my @i = ('1', '2', '3');
          my @j = ('a','b','c');
          print "In main program before calling subroutine: i = " . "@i\n";
          print "In main program before calling subroutine: j = " . "@j\n";
          reference_sub(\@i, \@j);
          print "In main program after calling subroutine: i = " . "@i\n";
          print "In main program after calling subroutine: j = " . "@j\n";
          exit;

          sub reference_sub {
              my ($i, $j) = @_;
              print "In subroutine : i = " . "@$i\n";
              print "In subroutine : j = " . "@$j\n";
              push(@$i, '4');
              shift(@$j);
          }

      Output:

          In main program before calling subroutine: i = 1 2 3
          In main program before calling subroutine: j = a b c
          In subroutine : i = 1 2 3
          In subroutine : j = a b c
          In main program after calling subroutine: i = 1 2 3 4
          In main program after calling subroutine: j = b c

  21. Array references

          @a = qw/ This will all /;
          $b = "end";
          @c = qw/ up together /;

          # this passes in references to the arrays
          bar(\@a, $b, \@c);   # \@a is a reference (pointer) to @a

          sub bar {
              my ($x, $b, $z) = @_;   # @_ has three items
              # dereference first argument
              my @A = @$x;            # @$x is the array referenced by $x
              # dereference third argument
              my @C = @$z;
              print "@A\n";
              print "$b\n";
              print "@C\n";
          }

      Output:

          This will all
          end
          up together

      • To pass more than one list to a subroutine, use references to the arrays
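
Passing references also lets a subroutine walk two arrays in step, which is not possible once they have been flattened into one list. A minimal sketch follows; the helper pair_up and the sample identifiers are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: pair up items from two arrays passed by reference.
sub pair_up {
    my ($names, $seqs) = @_;           # two scalars, each an array reference
    my @pairs;
    for my $k (0 .. $#{$names}) {      # $#{$names} is the last index of @$names
        push @pairs, "$names->[$k]=$seqs->[$k]";
    }
    return @pairs;
}

my @ids = qw/seq1 seq2 seq3/;
my @dna = qw/ATT TTT GGG/;
print join(", ", pair_up(\@ids, \@dna)), "\n";
```

The sketch assumes both arrays have the same length; a production version would check that before looping.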

  22. Program Design
      Input → Algorithm → Output
      Q. What is the form of the input data?
      Q. How will the program get it?
         • interactive
         • command line
         • parameter file
      Q. How will the program process the data to compute the desired output?
      Q. How will the output be formatted and delivered?
         • Specified by user requirements

  23. Program Design
      • Design Top Down
        • Identify the inputs
        • Understand the requirements for the output
        • Design an overall algorithm for computing the output
        • Express overall method in pseudocode
        • Refine pseudocode until each step forms a well-defined subroutine
      • Test Bottom Up
        • Write and debug subroutines one at a time
        • Start with “utility” subroutines that will be used by other subroutines
        • Test each subroutine with input data that gives known results
        • Include subroutines that help debugging, such as printing routines for data structures

  24. Pseudocode
      • High level, informal program
      • No details
      Example: print out length statistics and overall nucleotide usage statistics for a file of sequences

          Input:
              get sequences from DNAfile
          Algorithm:
              for each DNA sequence,
                  get length statistics
                  count each type of nucleotide
          Output:
              print length statistics
              print nucleotide usage statistics

  25. Pseudocode
      • Keep pseudocode in the Perl program as comments

          # get sequences from DNAfile

          # for each DNA sequence,
          #     get length statistics
          #     count each type of nucleotide

          # print length statistics
          # print nucleotide usage statistics

  26. Refinement
      Refine pseudocode into more detailed steps:

          Input:
              get name of DNAfile
              open DNAfile
              read lines from DNAfile, putting DNA sequences in a list
          Algorithm:
              for each DNA sequence in the list
                  get length and update statistics
                  count each type of nucleotide in the sequence
          Output:
              print length statistics
              print nucleotide usage statistics

  27. Algorithm Refinement
      Try to express complex tasks using Perl control structures (e.g. loops) until the inner subtasks form well-defined tasks that can each be done by a single subroutine.

          Algorithm:
              for each DNA sequence in the list
                  get length and update statistics
                  count each type of nucleotide in the sequence

      becomes

              for each DNA sequence in the list
                  get length and update statistics
                  for each base
                      count the occurrence of that base in the sequence

      Now write a subroutine to count any base in any sequence
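
One way the refined pseudocode might turn into Perl is sketched below, with the pseudocode kept as comments. The sub name sequence_stats, the hash layout, and the hard-coded list standing in for DNAfile are all assumptions made to keep the sketch self-contained; they are not the lecture's solution.

```perl
use strict;
use warnings;

# get sequences (a hard-coded list stands in for reading DNAfile)
my @sequences = qw/ATTGC TTTT GGGCCA/;

my %stats = sequence_stats(@sequences);

# print length statistics
print "sequences: $stats{n}  shortest: $stats{min}  longest: $stats{max}\n";
# print nucleotide usage statistics
print "$_: $stats{count}{$_}\n" foreach qw/A C G T/;

sub sequence_stats {
    my @seqs = @_;
    my %s = (n => 0, min => undef, max => undef,
             count => { A => 0, C => 0, G => 0, T => 0 });
    # for each DNA sequence in the list
    for my $seq (@seqs) {
        # get length and update statistics
        my $len = length $seq;
        $s{n}++;
        $s{min} = $len if !defined $s{min} or $len < $s{min};
        $s{max} = $len if !defined $s{max} or $len > $s{max};
        # for each base, count the occurrence of that base in the sequence
        for my $base (qw/A C G T/) {
            my $hits = () = $seq =~ /$base/gi;   # number of matches
            $s{count}{$base} += $hits;
        }
    }
    return %s;
}
```

Each comment line from the pseudocode maps onto a few lines of code, which makes it easy to check that no step of the design was dropped.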

  28. Program Design
      • Design Top Down
        • Identify the inputs
        • Understand the requirements for the output
        • Design an overall algorithm for computing the output
        • Express overall method in pseudocode
        • Refine pseudocode until each step forms a well-defined subroutine
      • Test Bottom Up
        • Write and debug subroutines one at a time
        • Start with “utility” subroutines that will be used by other subroutines
        • Test each subroutine with input data that gives known results
        • Include subroutines that help debugging, such as printing routines for data structures

  29. A subroutine to count A's in DNA

          #!/usr/bin/perl
          # File: sub1.pl
          # subroutine to count A's in DNA
          use warnings;
          use strict;

          my $a;
          my $dna = "tagATAGAC";
          $a = count_A($dna);
          print "$dna\n";
          print "a: $a\n";
          exit;

          #########################################
          # subroutine to count A's in DNA
          #
          sub count_A {
              # @_ is the list of parameters
              my ($dna) = @_;   # list context assignment
              my $count;
              # tr returns number of matches
              $count = ($dna =~ tr/Aa//);
              return $count;
          }

      Output:

          tagATAGAC
          a: 4

      After you've written a subroutine, ask yourself if it can be made a bit more general

  30. A subroutine to count any letter in DNA

          #!/usr/bin/perl
          # File: sub2.pl
          # subroutine to count any letter in DNA
          use warnings;
          use strict;

          my ($a, $c, $g, $t);
          my $dna = "tagATAGAC";
          $a = count_base('A', $dna);
          $t = count_base('T', $dna);
          $c = count_base('C', $dna);
          $g = count_base('G', $dna);
          print "$dna\n";
          print "a: $a t: $t c: $c g: $g\n";
          exit;

          #########################################
          #
          # subroutine to count any letter in DNA
          #
          sub count_base {
              my( $base, $dna ) = @_;
              my( $count );
              # s///g returns the number of substitutions made
              $count = ($dna =~ s/$base//ig);
              return $count;
          }

      Output:

          tagATAGAC
          a: 4 t: 2 c: 1 g: 2
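
One edge case in count_base is worth knowing: when nothing matches, s///g returns a false value that prints as an empty string rather than 0. The variant below is a sketch of one alternative, counting matches in list context instead of deleting them; it reuses the slide's sub name but is not the lecture's code.

```perl
use strict;
use warnings;

# Variant of count_base that leaves the string alone and returns 0,
# not an empty string, when the base is absent.
sub count_base {
    my ($base, $dna) = @_;
    # A global match in list context returns every match; assigning that
    # list to () in scalar context yields the number of matches.
    my $count = () = $dna =~ /\Q$base\E/gi;
    return $count;
}

my $dna = "tagATAGAC";
print "$_: ", count_base($_, $dna), "\n" foreach qw/A T C G N/;
```

The \Q...\E pair quotes any regex metacharacters in $base, so a caller who passes something like "." counts literal dots rather than every character.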

  31. Program Design: Managing Complexity
      • Understand inputs and outputs
      • Use pseudocode to refine your algorithm
      • Use divide-and-conquer to turn big problems into manageable pieces
        • within a chromosome, process one gene at a time
        • within each gene, process one reading frame at a time
        • within each reading frame, process one ORF at a time
      • Pick data structures that make algorithms easier
        • this gets easier with experience!
      • Write subroutines to
        • transform one data object to another, for example:
          • dna (string) to reading frame (array of codons)
          • reading frame to orf
        • perform some well defined task
        • compute some statistics on a single data object
        • produce final output format
      • Write small programs (drivers) to test each subroutine before combining them together
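
The "dna (string) to reading frame (array of codons)" transformation is a good example of a small, testable subroutine with its own driver. A minimal sketch, assuming a hypothetical helper named dna_to_codons that takes a frame offset of 0, 1, or 2:

```perl
use strict;
use warnings;

# Hypothetical helper: split a DNA string into codons for reading
# frame 0, 1, or 2; a trailing partial codon is dropped.
sub dna_to_codons {
    my ($dna, $frame) = @_;
    my @codons;
    for (my $pos = $frame; $pos + 3 <= length($dna); $pos += 3) {
        push @codons, substr($dna, $pos, 3);
    }
    return @codons;
}

# Driver: print the codons for each forward reading frame.
my $dna = 'CGACGTCTTCTAAGGCGA';
for my $frame (0 .. 2) {
    my @codons = dna_to_codons($dna, $frame);
    print "frame $frame: @codons\n";
}
```

The driver loop is the kind of small test program the last bullet recommends: it exercises one subroutine on input whose correct answer can be checked by hand.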

  32. Some Good Programming References
      • Algorithms + Data Structures = Programs (Prentice-Hall Series in Automatic Computation) [Hardcover]
        • Niklaus Wirth (Author)
      • Introduction to Algorithms [Hardcover]
        • Thomas H. Cormen (Author), Charles E. Leiserson (Author), Ronald L. Rivest (Author), Clifford Stein (Author)

  33. Read The Fine Manual (RTFM)
      • The more you read manuals, the easier it will be
      • For each function we have covered tonight, read the corresponding description in Ch. 29 of Wall
      • If you find something in the manual you don't understand, look it up (or ask someone)
      • Learn to use the online help pages, e.g.,

          % perldoc -f join

      • To see a list of online tutorials, see

          % perldoc perl

        For example:

          % perldoc perlstyle

      • The interface is somewhat vi-like

  34. Debugging Strategies
      • Before running the program, always run

          % perl -c prog

      • Read the warnings and error messages from the compiler carefully
      • Always use strict and use warnings
      • Basic strategy: bottom-up debugging
        • Test and debug one subroutine at a time
      • Insert print statements
        • to figure out where a program fails
        • to print values of variables
        • Comment out when not needed - don't remove!
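
An alternative to commenting debugging prints in and out is to route them through one helper controlled by a flag. The sketch below assumes a hypothetical flag $DEBUG and helper subs debug and gc_content; it is one possible arrangement, not a convention the lecture prescribes.

```perl
use strict;
use warnings;

my $DEBUG = 1;   # set to 0 to silence the tracing without deleting it

# Hypothetical tracing helper: writes to STDERR so that the program's
# real output on STDOUT stays clean.
sub debug {
    my @msg = @_;
    print STDERR "DEBUG: @msg\n" if $DEBUG;
    return;
}

# Hypothetical subroutine under test: fraction of G and C in a sequence.
sub gc_content {
    my ($dna) = @_;
    my $gc = ($dna =~ tr/GCgc//);         # tr returns the number of matches
    debug("dna=$dna", "gc=$gc", "length=" . length($dna));
    return length($dna) ? $gc / length($dna) : 0;
}

printf "%.2f\n", gc_content("tagATAGAC");
```

Because the trace goes to STDERR, redirecting STDOUT to a file captures only the program's results.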

  35. Starting the Debugger

          [binf:~/binf634/workspace/binf634_book_examples] jsolka% perl -d example-6-4.pl

          Loading DB routines from perl5db.pl version 1.28
          Editor support available.

          Enter h or `h h' for help, or `man perldebug' for more help.

          main::(example-6-4.pl:11):      my $dna = 'CGACGTCTTCTAAGGCGA';
            DB<1>

  36. Getting Help Within the Debugger - I

            DB<2> h
          List/search source lines:               Control script execution:
            l [ln|sub]  List source code            T           Stack trace
            - or .      List previous/current line  s [expr]    Single step [in expr]
            v [line]    View around line            n [expr]    Next, steps over subs
            f filename  View source in file         <CR/Enter>  Repeat last n or s
            /pattern/ ?patt?   Search forw/backw    r           Return from subroutine
            M           Show module versions        c [ln|sub]  Continue until position
          Debugger controls:                        L           List break/watch/actions
            o [...]     Set debugger options        t [expr]    Toggle trace [trace expr]
            <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
            ! [N|pat]   Redo a previous command     B ln|*      Delete a/all breakpoints
            H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
            = [a val]   Define/list an alias        A ln|*      Delete a/all actions
            h [db_cmd]  Get help on command         w expr      Add a watch expression
            h h         Complete help page          W expr|*    Delete a/all watch exprs
            |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a subprocess
            q or ^D     Quit                        R           Attempt a restart

  37. Getting Help With the Debugger - II

          Data Examination:     expr     Execute perl code, also see: s,n,t expr
            x|m expr       Evals expr in list context, dumps the result or lists methods.
            p expr         Print expression (uses script's current package).
            S [[!]pat]     List subroutine names [not] matching pattern
            V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or !pattern.
            X [Vars]       Same as "V current_package [Vars]".  i class inheritance tree.
            y [n [Vars]]   List lexicals in higher scope <n>.  Vars same as V.
            e     Display thread id     E Display all thread ids.
          For more help, type h cmd_letter, or run man perldebug for all docs.

  38. Stepping Through Statements With the Debugger

          main::(example-6-4.pl:11):      my $dna = 'CGACGTCTTCTAAGGCGA';
            DB<2> p $dna

            DB<3>
            DB<3> n
          main::(example-6-4.pl:12):      my @dna;
            DB<6> l
          12==>   my @dna;
          13:     my $receivingcommittment;
          14:     my $previousbase = '';
          15
          16:     my$subsequence = '';
          17
          18:     if (@ARGV) {
          19:         my$subsequence = $ARGV[0];
          20      }else{
          21:         $subsequence = 'TA';
            DB<6> p $dna
          CGACGTCTTCTAAGGCGA

  39. Using the Perl Debugger

            DB<7> n
          main::(example-6-4.pl:13):      my $receivingcommittment;
            DB<7> n
          main::(example-6-4.pl:14):      my $previousbase = '';
            DB<7> n
          main::(example-6-4.pl:16):      my$subsequence = '';
            DB<7> n
          main::(example-6-4.pl:18):      if (@ARGV) {
            DB<7> n
          main::(example-6-4.pl:21):          $subsequence = 'TA';
            DB<7> n
          main::(example-6-4.pl:24):      my $base1 = substr($subsequence, 0, 1);

  40. Using the Perl Debugger

            DB<7> n
          main::(example-6-4.pl:25):      my $base2 = substr($subsequence, 1, 1);
            DB<7> n
          main::(example-6-4.pl:28):      @dna = split ( '', $dna );
            DB<7> p $base1
          T
            DB<8> p $base2
          A
            DB<9>
            DB<9> n
          main::(example-6-4.pl:39):      foreach (@dna) {
            DB<9> p @dna
          CGACGTCTTCTAAGGCGA
            DB<10> p "@dna"
          C G A C G T C T T C T A A G G C G A
            DB<11>

  41. Examining the Loop

            DB<12> l 39-52
          39==>   foreach (@dna) {
          40:         if ($receivingcommittment) {
          41:             print;
          42:             next;
          43          } elsif ($previousbase eq $base1) {
          44:             if ( /$base2/ ) {
          45:                 print $base1, $base2;
          46:                 $recievingcommitment = 1;
          47              }
          48          }
          49:         $previousbase = $_;
          50      }
          51
          52:     print "\n";
            DB<13>
            DB<13> b 40

  42. Clearing Breakpoints and Exiting the Debugger

            DB<14> c
          main::(example-6-4.pl:40):          if ($receivingcommittment) {
            DB<14> p
          C
            DB<16> B
          Deleting a breakpoint requires a line number, or '*' for all
            DB<18> q

      • For additional discussions please see
        • Ch. 20 of Wall or Ch. 6 of Tisdall
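
The listing in the debugger session spells the flag two ways: $receivingcommittment on lines 13 and 40, and $recievingcommitment on line 46. The slides do not say whether example-6-4.pl enables strict, so the following is only a sketch of how use strict reports that kind of misspelling at compile time. The helper compile_error is hypothetical, and the code under test is held in a string so the error can be captured with eval.

```perl
use strict;
use warnings;

# The code under test is kept in a string so the compile error can be
# captured with eval instead of stopping this script.
my $code = <<'END_CODE';
use strict;
my $receivingcommittment = 0;
$recievingcommitment = 1;      # misspelled, as on line 46 of the listing
END_CODE

# Hypothetical helper: compile and run a string of Perl, returning the
# error message, or an empty string if there was none.
sub compile_error {
    my ($source) = @_;
    eval $source;
    return $@;
}

print compile_error($code);
```

The message names the undeclared variable, which points straight at the misspelled line without any stepping in the debugger.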

  43. Modules and Libraries - I
      • We will have more to say about this later
      • We will collect subroutines into handy files called modules or libraries
      • We tell the Perl compiler to utilize a particular module with the “use” command

  44. Modules and Libraries - II
      • Modules end in .pm

          BeginPerlBioinfo.pm

      • The last line in a module must be

          1;

      • So we would access this module by putting the line

          use BeginPerlBioinfo;

      • If the Perl compiler can’t find it you may have to tell it the path

          use lib '/home/tisdall/book';
          use BeginPerlBioinfo;
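
To show how subroutines are grouped under a package name without needing a second file, the sketch below declares a package inline. The package name SeqUtils and its sub reverse_complement are hypothetical; in a real project the package would live in its own file, SeqUtils.pm, ending with 1; and would be loaded with use SeqUtils; as described on the slide.

```perl
use strict;
use warnings;

# In a real project everything from here to "package main" would be the
# contents of SeqUtils.pm, followed by a final line "1;".
package SeqUtils;

sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;              # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;       # swap each base for its complement
    return $rc;
}

package main;

# Without importing, a sub is called by its fully qualified name.
print SeqUtils::reverse_complement("ATTGC"), "\n";
```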

  45. POD (Ch. 26 in Wall)
      • Plain Old Documentation produces self-documenting programs
      • Comments can be extracted and formatted by external programs called translators
      • Keeps program documentation consistent with external documentation
      • pod text begins with "=identifier" at the start of a line
        • but only where the compiler is expecting a new statement
      • All text is ignored by the compiler until the next line starting with "=cut"
      • Various translators produce formatted documentation
        • perldoc, pod2text, pod2html, pod2latex, etc.
        • details of the format depend on the identifier

  46. POD examples

          =pod

          Put any number of lines of comments here. They will appear in the
          proper format when processed by pod translators.

          =cut

          # program text goes here

          =begin comment

          The identifier indicates which translator should process this text.
          This text will be ignored by all translators. Use this for internal
          documentation only.

          =end comment

          =cut

          # more program text ...

          =head1 Section Heading

          text goes here, for example:

          =head1 SYNOPSIS

          usage: fasta.pl fastafile

          =over

          This starts a list:

          =item *

          First item in a list.

          =item *

          Second item.

          =back

          =cut

  47. An Example Program

          #!/usr/bin/perl

          =head1 NAME

          arglist.pl

          =head1 AUTHOR

          Jeff Solka

          =head1 SYNOPSIS

          usage: arglist.pl arg1 arg2 ...

          =head1 DESCRIPTION

          Echoes out the command line arguments.

          =over

          =item *

          First item in a list.

          =item *

          Second item.

          =back

          =cut

          ### main program
          print "The arguments are: @ARGV\n";
          exit;

  48. Our Program in Action

          [binf:fall09/binf634/mycode] jsolka% arglist.pl cat
          The arguments are: cat

  49. pod2text Acting on Our Program

          [binf:fall09/binf634/mycode] jsolka% pod2text arglist.pl
          NAME
              arglist.pl

          AUTHOR
              Jeff Solka

          SYNOPSIS
              usage: arglist.pl arg1 arg2 ...

          DESCRIPTION
              Echoes out the command line arguments.

              * First item in a list.

              * Second item.

      • See Ch. 26 for other formatting tricks.

  50. perldoc Acting on Our Program

          [binf:fall09/binf634/mycode] jsolka% perldoc arglist.pl > arglist.mp
          [binf:fall09/binf634/mycode] jsolka% cat arglist.mp
          ARGLIST(1)      User Contributed Perl Documentation      ARGLIST(1)

          NAME
                 arglist.pl

          AUTHOR
                 Jeff Solka

          SYNOPSIS
                 usage: arglist.pl arg1 arg2 ...

          DESCRIPTION
                 Echoes out the command line arguments.

                 o   First item in a list.

                 o   Second item.

          perl v5.8.8                  2009-09-20                  ARGLIST(1)
