
Bioinformatics



  1. Bioinformatics Lecture 7: Introduction to Perl

  2. Introduction • Basic concepts in Perl syntax: • variables, strings, input and output • Conditionals and iteration • File handling and error handling • Arrays, lists and hashes

  3. First program • a basic strings program: Test.pl • #!/usr/bin/perl • print "Hello boys and girls!\n this is an introduction to Perl"; • Open Notepad and type the above • Save the file as hello.pl • Ensure that the "hide file extensions" option is unchecked • Run via the command line

  4. Variable declarations • $variable_name: scalars hold integers, floats or strings • @array_name: arrays • Arithmetic operators: • +, -, *, /, ** (exponentiation), % (modulus) • Double v single quotation marks • $x = 'I am from Cork'; • print "the value of $x is $x\n"; # double quotes: both $x are replaced by the value • print 'the value of $x is $x\n'; # single quotes: printed literally, including the \n • print "the value of \$x is $x\n"; # note the \$x, which prints a literal $x • # evaluating expressions in print (# is the comment symbol) • $x = 15; • print "the value of x is ", $x + 3, "\n"; (ArithmeticExample.pl)
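
The quoting rules on this slide can be checked with a short script. This is a minimal sketch, not one of the lecture's example files; the variable names $double, $single and $n, and the use of strict and warnings, are additions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = 'I am from Cork';

# Double quotes interpolate variables and escapes such as \n.
my $double = "the value of \$x is $x\n";
# Single quotes keep the text literally, including $x and \n.
my $single = 'the value of $x is $x\n';

print $double;
print $single, "\n";

# Expressions are not evaluated inside quotes, so pass them as separate list items.
my $n = 15;
print "the value of n + 3 is ", $n + 3, "\n";
print "2 ** 10 = ", 2 ** 10, " and 17 % 5 = ", 17 % 5, "\n";
```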

  5. Input, output and file handling • Input • $var = <>; (reads a line of text and assigns it to $var; the newline at the end of the line is included) • chomp $var; removes the trailing newline from $var (chop, by contrast, removes the last character whatever it is) • Alternatively: chomp($var = <>); • $line = <DATA>; reads a line of "hardcoded data" placed after the __DATA__ marker in the script • Output • print (already covered) • File Handling • open MYFILE, 'data.txt'; (open file for reading) • open MYFILE, '>data.txt'; (open file for writing) • open MYFILE, '>>data.txt'; (open file for appending) • $line = <MYFILE>; # read one line from the file • @entire_file = <MYFILE>; # reads the whole file into an array (called slurping) • print MYFILE "Do you like computers….", $number/3, "\n"; # write out to the file • close MYFILE;
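
The three open modes can be tried together in one script. The sketch below uses the three-argument form of open with lexical filehandles rather than the bareword MYFILE style of the slide, and writes to a scratch file from File::Temp instead of data.txt; both choices are assumptions made so the example is safe to run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not touch real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# '>' opens for writing (any existing content is discarded).
open(my $out, '>', $path) or die "cannot write $path: $!";
print $out "GATTACA\n";
close $out;

# '>>' opens for appending.
open($out, '>>', $path) or die "cannot append to $path: $!";
print $out "CCGGAA\n";
close $out;

# '<' opens for reading; in list context <$in> slurps every line.
open(my $in, '<', $path) or die "cannot read $path: $!";
my @lines = <$in>;
close $in;

chomp @lines;    # remove the trailing newline from every element
print scalar(@lines), " lines: @lines\n";
```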

  6. Conditional operators (numeric) • == Equality: $a == $b • != Not equal: $a != $b • < Less than: $a < $b • > Greater than: $a > $b • <= Less than or equal to: $a <= $b • >= Greater than or equal to: $a >= $b • ! Logical not: $a = !$b

  7. String conditional operators • eq Equality: $a eq $b • ne Not equal: $a ne $b • lt Less than: $a lt $b • gt Greater than: $a gt $b • le Less than or equal to: $a le $b • ge Greater than or equal to: $a ge $b • . Concatenation: $a . $c • =~ Pattern match: $a =~ /gatc/
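
Choosing between the numeric and the string operators matters, because the same two values can compare differently. A small sketch, with the variables $p, $q and $seq invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($p, $q) = ('10', '9');

# Numeric comparison: 10 is greater than 9.
my $numeric = $p > $q ? 'greater' : 'not greater';
# String comparison: '10' sorts before '9' because the character '1' comes before '9'.
my $stringy = $p gt $q ? 'greater' : 'not greater';

print "numeric: $numeric, string: $stringy\n";

# Concatenation and pattern matching on a sequence.
my $seq = 'aa' . 'gatc' . 'tt';
my $has_site = $seq =~ /gatc/ ? 'yes' : 'no';
print "$seq contains gatc: $has_site\n";
```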

  8. Conditional statements • if, elsif and else (if_else.pl) • #!/usr/bin/perl • print "Enter your age: "; • $age = <>; • if ($age <= 0) { • print "You are way too young to be using a computer.\n"; • } • elsif ($age >= 100) • { • print "Not in a dog's life!\n"; • } else • { • print "Your age in dog years is ", $age/7, "\n"; • }
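
Note that Perl spells the keyword elsif, not elseif. The same branching can be wrapped in a subroutine so that it can be exercised without typing at the keyboard; age_message is a hypothetical helper name, not part of if_else.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the message instead of printing it.
sub age_message {
    my ($age) = @_;
    if ($age <= 0) {
        return "You are way too young to be using a computer.";
    }
    elsif ($age >= 100) {
        return "Not in a dog's life!";
    }
    else {
        return "Your age in dog years is " . $age / 7;
    }
}

# Try one value for each branch.
print age_message($_), "\n" for (0, 35, 120);
```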

  9. Iteration: loops • While-loops • #!/usr/bin/perl • $count = 1; • while ($count <= 5) { • print "$count potato\n"; • $count = $count + 1; • } • Until-loops • #!/usr/bin/perl • $count = 1; • until ($count > 5) { • print "$count potato\n"; • $count = $count + 1; • }

  10. Loops with defined (loops_defined.pl) • #!/usr/bin/perl • # defined() is true while $line has been assigned a value; it becomes false at end of input • print "Type something. 'quit' to finish\n"; • while ( defined($line = <>) ) { • chomp $line; • last if $line eq 'quit'; # breaks out of the loop at quit • print "You typed '$line'\n\n"; • print "Type something> "; • } • print "goodbye!\n";

  11. Shorthand input notation • #!/usr/bin/perl • print "Type something. 'quit' to finish\n"; • while (<>) { • chomp; # works on $_, the default variable • last if $_ eq 'quit'; • print "You typed '$_'\n\n"; • print "Type something> "; • } • print "goodbye!\n";
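
The shorthand loop can be tested without a keyboard by reading from an in-memory filehandle. In this sketch the string $typed stands in for what a user would type, and the array @seen is added only to record what the loop processed; neither appears in the lecture code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An in-memory filehandle stands in for the keyboard here.
my $typed = "hello\nworld\nquit\nignored\n";
open(my $in, '<', \$typed) or die "cannot open in-memory input: $!";

my @seen;
while (<$in>) {              # each line is assigned to $_
    chomp;                   # chomp with no argument works on $_
    last if $_ eq 'quit';    # leave the loop; later lines are never read
    push @seen, $_;
    print "You typed: $_\n";
}
close $in;
print "goodbye!\n";
```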

  12. Changing standard input/output • Redirect stdout to a file: • U:\test> test.pl > stdout.txt [produces a text file] • Output from print goes to the file and not to the screen • Run loops_defined.pl and redirect its output to a file • The <> input operator has one extra feature: if a file name is given on the command line it reads from that file, otherwise it reads from the keyboard • U:\test> commandline.pl stdin.txt

  13. Finding length of file • #!/usr/bin/perl • # File_size_1.pl • # file size.pl • $length = 0; # set length counter to zero • $lines = 0; # set number of lines to zero • print "enter text one line at a time and press (ctrl z) to quit\n"; • while (<>) { # read input one line at a time • chomp; # remove terminal newline • $length = $length + length $_; • $lines = $lines + 1; • } • print "LENGTH = $length\n"; • print "LINES = $lines\n"; • Try using the keyboard as stdin (ctrl Z to finish) and then a file name on the command line
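
The counting logic can be separated from the source of the input. The sketch below assumes a hypothetical helper count_text that accepts any filehandle, and feeds it an in-memory string so the totals are easy to check by hand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: counts characters (newlines excluded) and lines
# read from the filehandle it is given.
sub count_text {
    my ($fh) = @_;
    my ($length, $lines) = (0, 0);
    while (my $line = <$fh>) {
        chomp $line;                 # do not count the newline itself
        $length += length $line;
        $lines++;
    }
    return ($length, $lines);
}

# Four lines, one of them empty: 7 + 4 + 0 + 2 characters.
my $text = "GATTACA\nCCGG\n\nAT\n";
open(my $fh, '<', \$text) or die "cannot open in-memory input: $!";
my ($length, $lines) = count_text($fh);
close $fh;

print "LENGTH = $length\n";
print "LINES = $lines\n";
```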

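  14. Dynamic Arrays • Declaration of an array in Perl • @sequences = ('123a', '23ed4', '2334d'); • The array contains 3 strings • Array operations: • $one_seq = $sequences[2]; (arrays are zero-based, so this is the third element; note the $ when referring to a single element) • @seq = @sequences; copies one array to another • @seq = (@seq, '125f'); adds a value • @combined = (@seq, @seq2); joins two arrays • Removing (splice): @removed = splice @seq, 1, 2; removes 2 elements starting at index 1 and returns them • Slicing: @slice = @seq[1,2]; copies elements 1 and 2 without removing them • Splice_slice_array.pl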
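
The difference between slicing (copying) and splicing (removing) is easiest to see by printing the array afterwards. A minimal sketch using the values from the slide; it is not the lecture's Splice_slice_array.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @sequences = ('123a', '23ed4', '2334d');

my $one_seq = $sequences[2];          # single element: note the $ sigil
my @seq     = (@sequences, '125f');   # copy the array and append one value
my @slice   = @seq[1, 2];             # slice: copies elements 1 and 2
my @removed = splice @seq, 1, 2;      # splice: removes 2 elements starting at index 1

print "one_seq: $one_seq\n";
print "slice:   @slice\n";
print "removed: @removed\n";
print "left:    @seq\n";
```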

  15. Dynamic Arrays • push @sequences, '2345d'; (adds an element to the end of the array) • pop @sequences; (removes and returns the last element of the array) • shift: removes and returns the first element of an array • unshift: adds an element or list of elements onto the beginning of an array

  16. Shift Pop push unshift example • #! /usr/bin/perl • # The 'pushpop' program - pushing, popping, shifting and unshifting. • @sequences = ( 'TTATTATGTT', 'GCTCAGTTCT', 'GACCTCTTAA', • 'CTATGCGGTA', 'ATCTGACCTC' ); • print "@sequences\n"; • $last = pop @sequences; • print "@sequences\n"; • $first = shift @sequences; • print "@sequences\n"; • unshift @sequences, $last; • print "@sequences\n"; • push @sequences, ( $first, $last ); • print "@sequences\n"; • What is the expected output (run code to confirm)

  17. Arrays: two more functions • substr (extracting a substring from a string): • $sub = substr($string, $offset, $length); # $offset is the zero-based position at which extraction begins, $length is the size of the substring • index($string, $substring) returns the position of a substring within a string • To obtain the reverse complement of a DNA sequence (GGGGTTTT becomes AAAACCCC), assuming the sequences are stored in an array @dna: • Iterating through an array: • foreach $dna (@dna) • { • $dna = reverse $dna; # reverse the contents of the scalar $dna • $dna =~ tr/gatcGATC/ctagCTAG/; • # tr translates each character of the first set into the matching character of the second (e.g. g becomes c), which gives the complement • }
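
The reverse-complement loop can also be written as a subroutine that leaves the original array untouched. In this sketch reverse_complement is a hypothetical helper name and the second test sequence is invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of one DNA string.
sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;          # scalar context reverses the characters
    $rc =~ tr/gatcGATC/ctagCTAG/;   # swap each base for its complement
    return $rc;
}

my @dna = ('GGGGTTTT', 'gattaca');
for my $seq (@dna) {
    print "$seq -> ", reverse_complement($seq), "\n";
}

# substr takes the string, a zero-based offset and a length.
my $sub = substr('GGGGTTTT', 2, 4);
print "substr: $sub\n";
```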

  18. Questions • How would you read a file of DNA sequences into an array and print both the original and the reverse-complement copy of each? • What use could this program have? (biology-related answer)

  19. Arrays and lists • A list is an ordered sequence of constants or variables • The values of a list can be assigned to an array: • @clones = ('192a8','18c10','327h1','201e4'); • The values in an array can be assigned to a list of variables: • ($first,$second,$third) = @clones;
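
List assignment has a few behaviours worth seeing in action: surplus values are dropped, an array on the left absorbs whatever remains, and an array assigned to a scalar gives its length. A short sketch; $head, @rest and $count are names added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @clones = ('192a8', '18c10', '327h1', '201e4');

# Three variables, four values: the fourth value is simply dropped.
my ($first, $second, $third) = @clones;

# An array on the left-hand side absorbs everything that is left.
my ($head, @rest) = @clones;

# An array assigned to a scalar gives the number of elements.
my $count = @clones;

print "first three: $first $second $third\n";
print "head: $head, rest: @rest\n";
print "count: $count\n";
```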

  20. Hashes: associative arrays • Similar to arrays, but the elements are unordered • Each element has two parts: the identifier (key) and a scalar value (here a string) • Adding elements; elements are referred to by strings: • %oligos = (); • $oligos{'192a8'} = 'GGGTTCCGATTTCCAA'; • $oligos{'18c10'} = 'CTCTCTCTAGAGAGAGCCCC'; • $oligos{'327h1'} = 'GGACCTAACCTATTGGC'; • Note the quotes ' ' around the key • Removing elements: • delete $oligos{'192a8'};

  21. Hashes • Outputting hash results (input_output_hash.pl) • $s = $oligos{'192a8'}; • print "oligo 192a8 is $s\n"; • print "oligo 192a8 is ", length $oligos{'192a8'}, " base pairs long\n"; • print "oligo 18c10 is $oligos{'18c10'}\n"; • Expected output: • oligo 192a8 is GGGTTCCGATTTCCAA • oligo 192a8 is 16 base pairs long • oligo 18c10 is CTCTCTCTAGAGAGAGCCCC

  22. Hashes • Example of the use of a hash table • hash_bases.pl program • For loops and hash tables • foreach $clone ('327h1','192a8','18c10') { • print "$clone: $oligos{$clone}\n"; • } • %oligos refers to the whole hash table • $oligos{...} is used to refer to individual elements • $size = keys %oligos; returns the number of entries
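
Adding, counting, looking up and deleting entries can be combined in one script. This is a sketch built from the oligo data on the slides rather than a copy of hash_bases.pl; the variables $size_before, $size_after and $still_there, and the use of exists, are additions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %oligos = (
    '192a8' => 'GGGTTCCGATTTCCAA',
    '18c10' => 'CTCTCTCTAGAGAGAGCCCC',
    '327h1' => 'GGACCTAACCTATTGGC',
);

# keys in scalar context gives the number of entries.
my $size_before = keys %oligos;

for my $clone ('327h1', '192a8', '18c10') {
    print "$clone: $oligos{$clone} (", length($oligos{$clone}), " bases)\n";
}

# delete removes a key and its value; exists tests whether a key is present.
delete $oligos{'192a8'};
my $still_there = exists $oligos{'192a8'} ? 'yes' : 'no';
my $size_after  = keys %oligos;

print "entries before: $size_before, after: $size_after, 192a8 present: $still_there\n";
```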

  23. Displaying all entries in a hash table • while ( ( $genome, $count ) = each %gene_counts ) • { • print "`$genome' has a gene count of $count\n"; • } • foreach $genome ( sort keys %gene_counts ) • { • print "`$genome' has a gene count of $gene_counts{$genome}\n"; • } • Refer to genes.pl
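
Because each returns entries in no particular order, sort is the usual way to get repeatable output. The sketch below sorts by key and then by value; the genome names and counts are placeholders made up for the example and are not taken from genes.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder data: names and counts are invented for illustration only.
my %gene_counts = (
    genome_a => 30,
    genome_b => 10,
    genome_c => 20,
);

# Alphabetical order of the keys.
my @by_name = sort keys %gene_counts;
for my $genome (@by_name) {
    print "$genome has a gene count of $gene_counts{$genome}\n";
}

# Largest count first: compare the values numerically with <=>.
my @by_count = sort { $gene_counts{$b} <=> $gene_counts{$a} } keys %gene_counts;
print "by count: @by_count\n";
```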

  24. Error Handling • die function: • open MYFILE, 'stdin.txt' or • die "could not open file, aborting…\n"; • If the file does not exist, the program terminates with the above message • Exercise: write a program that reads data from a file into an array and, when all the data has been read, outputs it in reverse order • Exercise: create a hash table that performs the codon to amino acid conversion and use it to convert codons (entered from the keyboard) into their corresponding amino acids
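
Including the special variable $! in the die message reports why the open failed, and eval lets a script trap the failure instead of terminating. The following is one possible sketch, which also reads a file into an array and returns it in reverse order; read_reversed is a hypothetical helper, and the file lives in a temporary directory so that nothing real is touched.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns the lines of a file in reverse order,
# or dies with the operating system's reason, which Perl stores in $!.
sub read_reversed {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "could not open $path: $!\n";
    my @lines = <$fh>;
    close $fh;
    chomp @lines;
    return reverse @lines;
}

# A path inside a fresh, empty temporary directory cannot exist yet.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/stdin.txt";

# eval traps the die so the script can report the error and carry on.
my @nothing = eval { read_reversed($path) };
my $error   = $@;
print "caught: $error" if $error;

# Create the file, then read it back successfully.
open(my $out, '>', $path) or die "could not write $path: $!\n";
print $out "first\nsecond\nthird\n";
close $out;

my @reversed = read_reversed($path);
print "@reversed\n";
```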
