
Perl IV



Presentation Transcript


  1. Perl IV Part V: Hashing Out the Genetic Code, Bioperl

  2. Hashes
     • There are three main data types in Perl: scalar variables, arrays, and hashes.
     • Hashes provide very fast look-up of a value by its key.
     • The format is similar to that of an array:
       %hash = ('key' => 'value');
       $value = $hash{'key'};

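  3. Hashes
     • A hash can be written as a flat list of alternating keys and values:
       %hash = ( 'key1', 'value1', 'key2', 'value2', 'key3', 'value3', );
     • or, equivalently and more readably, with the => operator:
       %hash = ( 'key1' => 'value1', 'key2' => 'value2', 'key3' => 'value3', );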

  4. Hashes
     • @keys = keys %hash;
     • @values = values %hash;
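The hash operations on the slides above can be tried together in a short script. This is only a sketch: the %amino_acid table and its three entries are illustrative data, not taken from the presentation.

```perl
use strict;
use warnings;

# Illustrative data (an assumption): codon => one-letter amino acid code.
my %amino_acid = (
    'ATG' => 'M',
    'TGG' => 'W',
    'GCA' => 'A',
);

# Look up a single value by its key.
my $value = $amino_acid{'ATG'};
print "ATG codes for $value\n";

# keys and values return lists in no guaranteed order, so sort them for display.
my @keys   = sort keys %amino_acid;
my @values = sort values %amino_acid;
print "keys: @keys\n";
print "values: @values\n";

# exists tests whether a key is present without creating it.
print "TAA is not in the table\n" unless exists $amino_acid{'TAA'};
```

Sorting the output of keys and values is a display choice here; the hash itself has no defined order.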

  5. The Binary Search of Arrays
     • The 'halving' method is considerably faster than comparing against every element in turn, e.g. finding a match in a sorted 30,000-element array takes at most 15 passes through the loop.
     • A good method when you sort once and then search many times.
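A minimal sketch of the halving method, assuming an array of strings that has already been sorted with the default sort. The name binary_search and the sample codons are illustrative, not from the slides.

```perl
use strict;
use warnings;

# Return the index of $target in the sorted array referenced by $sorted,
# or -1 if it is not present.
sub binary_search {
    my ($sorted, $target) = @_;
    my ($low, $high) = (0, $#{$sorted});
    while ($low <= $high) {
        my $mid   = int(($low + $high) / 2);
        my $order = $target cmp $sorted->[$mid];
        if    ($order < 0) { $high = $mid - 1; }   # target is in the lower half
        elsif ($order > 0) { $low  = $mid + 1; }   # target is in the upper half
        else               { return $mid; }        # found it
    }
    return -1;
}

# Sort once, then search as many times as needed.
my @codons = sort qw(TGG ATG GCA TCA GAT);   # ATG GAT GCA TCA TGG
print binary_search(\@codons, 'TCA'), "\n";  # prints 3
print binary_search(\@codons, 'AAA'), "\n";  # prints -1
```

Each pass discards half of the remaining range, which is where the figure of about 15 passes for 30,000 elements comes from (2 to the power 15 is 32,768).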

  6. Comparing Strings
     • To compare two strings alphabetically in Perl, use the cmp operator, which returns 0 if the two strings are the same, -1 if they are in alphabetical order, and 1 if they are in reverse order.
     • 'zzz' cmp 'zzz' returns 0
     • 'AAA' cmp 'ZZZ' returns -1
     • 'ZZZ' cmp 'AAA' returns 1

  7. Sorting Arrays
     • Sorting an array of strings alphabetically:
       @array = sort @array;
     • If given numbers, this will sort them lexically (as strings), not by numeric value.
     • Sorting an array of numbers in ascending order:
       @array = sort { $a <=> $b } @array;
     • The special variables $a and $b must be used inside the sort block.
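The difference between the default (string) sort and a numeric sort is easiest to see on numbers of different lengths. A small sketch, with made-up values:

```perl
use strict;
use warnings;

# Illustrative values (an assumption), e.g. sequence lengths.
my @lengths = (100, 9, 25, 1000);

my @as_strings = sort @lengths;                  # default: compares with cmp
my @ascending  = sort { $a <=> $b } @lengths;    # numeric, smallest first
my @descending = sort { $b <=> $a } @lengths;    # numeric, largest first

print "@as_strings\n";   # 100 1000 25 9
print "@ascending\n";    # 9 25 100 1000
print "@descending\n";   # 1000 100 25 9
```

Swapping $a and $b in the block is all it takes to reverse the order.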

  8. Sorting Hashes
     • Sorting by key and printing each value as a bar of asterisks:
       foreach ( sort keys %hash ) {
           print "$_\t", "*" x $hash{$_}, "\n";
       }
     • Sorting keys by their values in ascending order (swap $a and $b for descending):
       foreach ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) { …… }
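A short sketch of ordering a hash by key and by value. The %count hash and its numbers are illustrative, standing in for something like base counts in a sequence.

```perl
use strict;
use warnings;

# Illustrative counts (an assumption), e.g. occurrences of each base.
my %count = (A => 3, C => 1, G => 5, T => 2);

# Keys in alphabetical order.
my @by_key = sort keys %count;

# Keys ordered by their associated value, smallest first.
my @by_value = sort { $count{$a} <=> $count{$b} } keys %count;

# Print a simple text histogram in value order.
foreach my $base (@by_value) {
    print "$base\t", '*' x $count{$base}, "\n";
}
```

The sort block still receives keys in $a and $b; it simply compares the values those keys point to.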

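  9. Nested Arrays
     • Printing a reference to an array, e.g. print $array[$i] where that element holds a nested array, gives something like ARRAY(0x85d3ad0) rather than the contents; add a second index $j to reach a value inside it.
     • $array[$i]->[$j];
     • can also be written $array[$i][$j]
     • Or use hashes:
       %hash = ( duck => ['Huey', 'Louie', 'Dewey'], horse => ['Mr. Ed'], dog => ['Benji', 'Lassie'] );
       $value = $hash{$key}[$i];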
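A brief sketch of both nested forms. The @matrix data is made up for illustration; the %animals hash reuses the values from the slide under a different variable name.

```perl
use strict;
use warnings;

# An array whose elements are references to other arrays.
my @matrix = ([1, 2, 3], [4, 5, 6]);

my $element = $matrix[1][2];      # shorthand for $matrix[1]->[2]
my @row     = @{ $matrix[0] };    # dereference a whole inner array

# A hash whose values are array references.
my %animals = (
    duck  => ['Huey', 'Louie', 'Dewey'],
    horse => ['Mr. Ed'],
    dog   => ['Benji', 'Lassie'],
);

my $second_duck = $animals{duck}[1];
my $dog_count   = scalar @{ $animals{dog} };

print "$element\n";        # 6
print "@row\n";            # 1 2 3
print "$second_duck\n";    # Louie
print "$dog_count\n";      # 2
```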

  10. The Genetic Code is Redundant

  11. Searching for codons
      DIFFICULT:
      my($codon) = @_;
      return 'S' if ($codon =~ /TCA/i);
      return 'S' if ($codon =~ /TCC/i);
      return 'S' if ($codon =~ /TCG/i);
      … and so on, one line per codon

  12. Searching for codons
      BETTER:
      my($codon) = @_;
      return 'A' if ($codon =~ /GC./i);
      return 'C' if ($codon =~ /TG[TC]/i);
      return 'D' if ($codon =~ /GA[TC]/i);
      … and so on, one line per amino acid

  13. Searching for codons
      BEST:
      my($codon) = @_;
      $codon = uc $codon;
      my(%genetic_code) = ( 'TCA' => 'S', 'TCC' => 'S', 'TCG' => 'S', …. );
      return $genetic_code{$codon} if (exists $genetic_code{$codon});
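The hash-based lookup can be extended into a small translator. This is a sketch under stated assumptions: the table holds only a handful of the 64 codons, the names codon2aa and translate are illustrative, and returning 'X' for an unknown codon is a choice made here, not something the slides specify.

```perl
use strict;
use warnings;

# Partial table for illustration only: a real table has all 64 codons.
my %genetic_code = (
    'ATG' => 'M',
    'TCA' => 'S', 'TCC' => 'S', 'TCG' => 'S', 'TCT' => 'S',
    'GCA' => 'A', 'GCC' => 'A', 'GCG' => 'A', 'GCT' => 'A',
    'TGG' => 'W',
    'TAA' => '*', 'TAG' => '*', 'TGA' => '*',   # stop codons
);

# Translate one codon; case-insensitive thanks to uc.
sub codon2aa {
    my ($codon) = @_;
    $codon = uc $codon;
    return $genetic_code{$codon} if exists $genetic_code{$codon};
    return 'X';   # assumption: mark codons missing from the table
}

# Translate a DNA string three bases at a time, ignoring a trailing partial codon.
sub translate {
    my ($dna) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        $protein .= codon2aa(substr($dna, $i, 3));
    }
    return $protein;
}

print translate('atgtcagcctggtaa'), "\n";   # MSAW*
```

Building the hash once outside the subroutine avoids recreating the table on every call.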

  14. Modules
      • Perl supports object-oriented programming, with methods called on objects.
      • Classes are contained in packages.
      • These are often referred to as modules.
      • The OO call structure is:
        objectName->method(arguments)
      Note to self: how many objects?
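To make the package, class, and method-call ideas concrete, here is a minimal sketch of a class defined in the same file. The Sequence class and its methods are hypothetical and are not part of BioPerl.

```perl
use strict;
use warnings;

# A hypothetical class: in Perl, a class is just a package.
package Sequence;

# Constructor: store the arguments in a hash and bless it into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { id => $args{id}, seq => uc($args{seq}) };
    return bless $self, $class;
}

# Accessor for the identifier.
sub id {
    my ($self) = @_;
    return $self->{id};
}

# Reverse complement of the stored DNA sequence.
sub revcomp {
    my ($self) = @_;
    my $rc = reverse $self->{seq};   # scalar context reverses the string
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

package main;

# objectName->method(arguments)
my $obj = Sequence->new(id => 'demo1', seq => 'atgc');
print $obj->id, "\n";        # demo1
print $obj->revcomp, "\n";   # GCAT
```

In practice a class like this would live in its own file, Sequence.pm, and be loaded with use Sequence;.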

  15. BioPerl (www.bioperl.org)
      • The main focus of the BioPerl modules is to perform sequence manipulation, provide access to various biology databases (both local and web-based), and parse the output of various programs.
      • Its modules rely heavily on additional Perl modules available from CPAN (www.cpan.org).

  16. How to go about comparing an unknown sequence . . .
      $in = Bio::SeqIO->new('-file' => $infile, '-format' => 'genbank');
      $seqobj = $in->next_seq();
      @allfeatures = $seqobj->all_SeqFeatures();
      $feat = $allfeatures[0];
      $feature_start = $feat->start;
      $feature_strand = $feat->strand;
      if ($seqobj->species->common_name =~ /elegans/) {
          $seq = $seqobj->primary_seq->seq;
          $id = $seqobj->id;
      }
