
Outline: Lab 1 Solution; Program 2; Scoping; Algorithm efficiency; Sorting; Hashes; Review for midterm; Quiz 3.
BINF634 Fall 2013 Regular Expression Lab (Key)
All problems except number 9 are worth 11 points. Number 9 is worth 12 points.
1) Write a PERL regular expression that would match only the strings: “bat”, “at”, and “t”.
/^(b?a)?t$/
2) Write a PERL regular expression to recognize any string that contains the substring “jeff”.
/jeff/
BINF 634 Fall 2013 - LECTURE06
3) Write a PERL regular expression that would match the strings: “bat”, “baat”, “baaat”, “baa…aat”, etc. (strings that start with b, followed by one or more a’s, ending with a t).
/^ba+t$/
4) Write a PERL regular expression that matches the strings: “hog”, “Hog”, “hOg”, “HOG”, “hOG”, etc. (That is, “hog” written in any combination of uppercase or lowercase letters.)
/^[hH][oO][Gg]$/
5) Write a PERL regular expression that matches any positive number (with or without a decimal point). Hint #1: if there is a decimal point, there must be at least one digit following the decimal point. Hint #2: Since the dot “.” matches any character, you must use \. to match a decimal point.
/^\d+(\.\d+)?$/
6) Write a PERL regular expression to match any integer that doesn’t end in 8.
/^\d*[0-79]$/
7) Write a PERL regular expression to match any line with exactly two words (or numbers) separated by any amount of whitespace (spaces or tabs). There may or may not be whitespace at the beginning or end of the line.
/^\s*\w+\s+\w+\s*$/
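One way to check answers like these is to run each pattern against strings that should and should not match. The sketch below does this for the patterns from problems 3, 5, and 7; the sample strings and the expected results are illustrative choices, not part of the lab.

```perl
use strict;
use warnings;

# Each entry: [ pattern, sample string, expected result (1 = match, 0 = no match) ].
# The sample strings are illustrative assumptions, not taken from the lab.
my @checks = (
    [ qr/^ba+t$/,            'baaat',         1 ],
    [ qr/^ba+t$/,            'bt',            0 ],
    [ qr/^\d+(\.\d+)?$/,     '3.14',          1 ],
    [ qr/^\d+(\.\d+)?$/,     '3.',            0 ],
    [ qr/^\s*\w+\s+\w+\s*$/, "  two\twords ", 1 ],
    [ qr/^\s*\w+\s+\w+\s*$/, 'one',           0 ],
);

my $failures = 0;
for my $check (@checks) {
    my ($re, $string, $expected) = @$check;
    my $matched = ($string =~ $re) ? 1 : 0;
    my $status  = ($matched == $expected) ? 'ok' : 'FAIL';
    $failures++ if $status eq 'FAIL';
    print "$status: '$string' against $re\n";
}
print "$failures failure(s)\n";
```

Adding a line to @checks for every edge case you can think of (empty string, extra characters, wrong case) is a quick way to find patterns that match too much.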
Be Careful With Scope
use strict;
use warnings;
my $x = 23;    # file-scoped lexical: visible in mysub below
print "value in main body is $x \n";
mysub($x);
print "value in main body is $x \n";
exit;
sub mysub {
    print "value in subroutine is $x \n";
    $x = 33;   # changes the file-scoped $x
}
Output:
value in main body is 23
value in subroutine is 23
value in main body is 33
#!/usr/bin/perl
use strict;
use warnings;
{
    my $x = 23;    # scoped to this block only
    print "value in main body is $x \n";
    mysub($x);
    print "value in main body is $x \n";
    exit;
}
sub mysub {
    print "value in subroutine is $x \n";
    $x = 33;
}
This will not compile: with "use strict", $x is not declared inside mysub, because "my $x" is visible only within the enclosing block.
Be Careful With Scope (cont.)
use strict;
use warnings;
{
    my $x = 23;
    print "value in main body is $x \n";
    mysub($x);
    print "value in main body is $x \n";
    exit;
}
sub mysub {
    my($x) = @_;
    $x = 33;
    print "value in subroutine is $x \n";
}
Output:
value in main body is 23
value in subroutine is 33
value in main body is 23
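Because the subroutine works on a private copy, a change it makes reaches the caller only if the caller stores the return value. A minimal sketch of that pattern follows; the subroutine name bump is hypothetical and not from the lecture.

```perl
use strict;
use warnings;

# Hypothetical helper: works on a private copy and returns the new value.
sub bump {
    my ($value) = @_;    # private copy; the caller's variable is untouched
    $value = 33;
    return $value;
}

my $x = 23;
bump($x);                # return value discarded, so $x is still 23
my $after_call = $x;
$x = bump($x);           # caller stores the result, so $x is now 33
print "$after_call $x\n";
```

Keeping the assignment in the caller makes it obvious at the call site which variables can change.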
Algorithm Efficiency
Algorithm is O(N^2), where N = size of the lists
# An inefficient way to compute intersections
my @a = qw/ A B C D E F G H I J K X Y Z /;
my @b = qw/ Q R S A C D T U G H V I J K X Z /;
my @intersection = ();
for my $i (@a) {
    for my $j (@b) {
        if ($i eq $j) {
            push @intersection, $i;
            last;
        }
    }
}
print "@intersection\n";
exit;
Output:
A C D G H I J K X Z
Data Structures and Algorithm Efficiency
Algorithm is O(N) on average, where N = size of the lists
# A better way to compute intersections
my @a = qw/ A B C D E F G H I J K X Y Z /;
my @b = qw/ Q R S A C D T U G H V I J K X Z /;
my @intersection = ();
# "mark" each item in @a
my %mark = ();
for my $i (@a) { $mark{$i} = 1 }
# intersection = any "marked" item in @b
for my $j (@b) {
    if (exists $mark{$j}) {
        push @intersection, $j;
    }
}
print "@intersection\n";
exit;
Output:
A C D G H I J K X Z
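The same mark-and-look-up idea can be written more compactly with map and grep. This is a sketch of an equivalent formulation, not the lecture's own code; it uses the same two lists.

```perl
use strict;
use warnings;

my @a = qw/ A B C D E F G H I J K X Y Z /;
my @b = qw/ Q R S A C D T U G H V I J K X Z /;

# Build the lookup hash in one pass over @a.
my %mark = map { $_ => 1 } @a;

# Keep each element of @b that was marked: one hash lookup per element.
my @intersection = grep { exists $mark{$_} } @b;

print "@intersection\n";
```

Each list is traversed once, so the work grows with the combined length of the lists rather than with their product.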
Algorithm Efficiency
Timing version 1 (intersect1.pl) against version 2 (intersect2.pl):
% wc -l list1 list2
24762 list1
12381 list2
37143 total
% /usr/bin/time intersect1.pl list1 list2 > out1
22.91 real 22.88 user 0.02 sys
% /usr/bin/time intersect2.pl list1 list2 > out2
0.06 real 0.05 user 0.00 sys
Speedup in user time: 22.88 / 0.05 = about 458
Hashes
Sorting
Sorting: Our First Attempt
use strict;
use warnings;
{
    my(@unsorted) = (17, 8, 2, 111);
    my(@sorted) = sort @unsorted;    # default sort compares elements as strings
    print "@unsorted \n";
    print "@sorted \n";
    exit;
}
Output:
17 8 2 111
111 17 2 8
Sorting
1. $a <=> $b returns 0 if equal, 1 if $a > $b, -1 if $a < $b
2. The "cmp" operator gives similar results for strings
3. $a and $b are special global variables: do NOT declare them with "my" and do NOT modify them.
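A quick way to see the difference between the two operators is to print what they return for a few pairs. The values below are illustrative; note that '2' cmp '10' is 1 because the comparison is character by character, which is why a default string sort places 111 before 17 and 2.

```perl
use strict;
use warnings;

# Numeric comparison: less than, equal, greater than.
my @numeric = ( 2 <=> 10, 10 <=> 10, 10 <=> 2 );

# String comparison: '2' sorts after '10' because '2' gt '1'.
my @string = ( '2' cmp '10', 'abc' cmp 'abc', 'a' cmp 'b' );

print "@numeric\n";    # prints: -1 0 1
print "@string\n";     # prints: 1 0 -1
```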
Sorting Numerically
use strict;
use warnings;
{
    my(@unsorted) = (17, 8, 2, 111);
    my(@sorted) = sort { $a <=> $b } @unsorted;
    print "@unsorted \n";
    print "@sorted \n";
    exit;
}
Output:
17 8 2 111
2 8 17 111
Sorting Using a Subroutine
use strict;
use warnings;
{
    my(@unsorted) = (17, 8, 2, 111);
    my(@sorted) = sort numerically @unsorted;
    print "@unsorted \n";
    print "@sorted \n";
    exit;
}
sub numerically { $a <=> $b }
Output:
17 8 2 111
2 8 17 111
Sorting Descending
use strict;
use warnings;
{
    my(@unsorted) = (17, 8, 2, 111);
    my(@reversesorted) = reverse sort numerically @unsorted;
    print "@unsorted \n";
    print "@reversesorted \n";
    exit;
}
sub numerically { $a <=> $b }
Output:
17 8 2 111
111 17 8 2
Sorting DNA by Length
use strict;
use warnings;
{
    # Sorting strings:
    my @dna = qw/ TATAATG TTTT GT CTCAT /;
    ## Sort @dna by length:
    @dna = sort { length($a) <=> length($b) } @dna;
    print "@dna\n"; # Output: GT TTTT CTCAT TATAATG
    exit;
}
Output:
GT TTTT CTCAT TATAATG
Sorting DNA by Number of T's (Largest First)
use strict;
use warnings;
{
    # Sorting strings:
    my @dna = qw/ TATAATG TTTT GT CTCAT /;
    # tr/Tt// with an empty replacement list counts T's without changing the string
    @dna = sort { ($b =~ tr/Tt//) <=> ($a =~ tr/Tt//) } @dna;
    print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
    exit;
}
Output:
TTTT TATAATG CTCAT GT
Sorting DNA by Number of T's (Largest First) (Take 2)
use strict;
use warnings;
{
    # Sorting strings:
    my @dna = qw/ TATAATG TTTT GT CTCAT /;
    @dna = reverse sort {
        ($a =~ tr/Tt//) <=> ($b =~ tr/Tt//)
    } @dna;
    print "@dna\n"; # Output: TTTT TATAATG CTCAT GT
    exit;
}
Output:
TTTT TATAATG CTCAT GT
Sorting Strings Without Regard to Case
use strict;
use warnings;
{
    # Sort strings without regard to case:
    my(@unsorted) = qw/ mouse Rat HUMAN eColi /;
    my(@sorted) = sort { lc($a) cmp lc($b) } @unsorted;
    print "@unsorted \n";
    print "@sorted \n";
    exit;
}
Output:
mouse Rat HUMAN eColi
eColi HUMAN mouse Rat
Sorting Hashes by Value
use strict;
use warnings;
{
    my(%sales_amount) = ( auto=>100, kitchen=>2000, hardware=>200 );
    sub bysales { $sales_amount{$b} <=> $sales_amount{$a} }
    for my $dept (sort bysales keys %sales_amount) {
        printf "%s:\t%4d\n", $dept, $sales_amount{$dept};
    }
    exit;
}
Output:
kitchen:	2000
hardware:	 200
auto:	 100
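When two keys have the same value, their relative order in the output is not determined by the value comparison alone. One common approach, sketched below, chains a second comparison with "or" so that ties are broken alphabetically by key. The garden department is hypothetical data added only to create a tie.

```perl
use strict;
use warnings;

# Hypothetical data: garden is added so that two departments tie at 200.
my %sales_amount = ( auto => 100, kitchen => 2000, hardware => 200, garden => 200 );

# Descending by value; when <=> returns 0, "or" falls through to the key comparison.
my @ranked = sort {
    $sales_amount{$b} <=> $sales_amount{$a}
        or
    $a cmp $b
} keys %sales_amount;

printf "%-10s %4d\n", $_, $sales_amount{$_} for @ranked;
```

This works because <=> returns 0 (false) for equal values, so the second comparison is evaluated only for ties.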
Midterm
$RNA =~ s/T/U/ig;
Midterm
$revcom =~ tr/ACGT/TGCA/;
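For context, the tr line above is usually the second half of a reverse-complement computation: first reverse the string, then complement each base. A minimal sketch follows, assuming an illustrative input sequence and extending the translation to lowercase bases.

```perl
use strict;
use warnings;

my $dna = 'ATGCGT';                  # illustrative sequence, not from the lecture

my $revcom = reverse $dna;           # scalar context: reverses the characters
$revcom =~ tr/ACGTacgt/TGCAtgca/;    # complement each base, preserving case

print "$revcom\n";                   # prints: ACGCAT
```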
Midterm
@bases = ('A', 'C', 'G', 'T');
$base1 = pop @bases;
unshift (@bases, $base1);
print "@bases\n\n";
Midterm
unless(COND){
#do something
}
Midterm
$protein = join('', @protein);
Midterm
$myfile = "myfile";
open(MYFILE, ">$myfile");
Midterm
while($DNA =~ /a/ig){$a++}
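The loop above counts matches one at a time with the /g modifier. The sketch below shows the same count two ways, using a counter named $count instead of $a (which sort treats as a special variable) and an illustrative sequence that is not from the lecture.

```perl
use strict;
use warnings;

my $DNA = 'AcgtAAtt';    # illustrative sequence

# Approach 1: advance through the string with /g, counting each match.
my $count = 0;
while ($DNA =~ /a/ig) { $count++ }

# Approach 2: tr with an empty replacement list returns the number of
# matching characters and leaves the string unchanged.
my $tr_count = ($DNA =~ tr/Aa//);

print "$count $tr_count\n";    # prints: 3 3
```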
Midterm
use strict;
Midterm
in the array @ARGV ?
Midterm
$verbs[rand @verbs]
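The expression above picks a random element: @verbs in numeric context gives the array length, rand returns a fractional number from 0 up to (but not including) that length, and the array subscript truncates it to an integer index. A small sketch with an illustrative word list:

```perl
use strict;
use warnings;

my @verbs = qw/ runs jumps sleeps /;    # illustrative list

# rand @verbs yields a value in [0, 3); the subscript truncates it to 0, 1, or 2.
my $verb = $verbs[rand @verbs];

print "$verb\n";
```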