1 / 20

Introduction to Python

Introduction to Python. BCHB524 2013 Lecture 5. Outline. Review DNA as a string Extracting codons in DNA Counting in-frame codons in DNA Reverse Complement Program Input/Output raw_input, command-line arguments standard-input, standard-output, redirection. Review.

asabi
Download Presentation

Introduction to Python

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Python BCHB5242013Lecture 5 BCHB524 - 2013 - Edwards

  2. Outline • Review • DNA as a string • Extracting codons in DNA • Counting in-frame codons in DNA • Reverse Complement • Program Input/Output • raw_input, command-line arguments • standard-input, standard-output, redirection BCHB524 - 2013 - Edwards

  3. Review • Printing and execution • Variables and basic data-types: • integers, floats, strings • Arithmetic with, conversion between • String characters and chunks, string methods • Functions, using/calling and defining: • Use in any expression • Parameters as input, return for output • Control Flow: • if statements – conditional execution • for statements – iterative execution BCHB524 - 2013 - Edwards

  4. DNA as a string seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"seqlen = len(seq)# set i to 0, 3, 6, 9, ..., 36for i inrange(0,seqlen,3):# extract the codon as a string    codon = seq[i:i+3]print codonprint"Number of Met. amino-acids", seq.count("atg") BCHB524 - 2013 - Edwards

  5. DNA as a string • What about upper and lower case? • ATG vs atg? • Differences between DNA and RNA sequence? • Substitute U for each T? • How about ambiguous nucleotide symbols? • What should we do with ‘N’ and other ambiguity codes (R, Y, W, S, M, K, H, B, V, D)? • Strings don’t know any biology! BCHB524 - 2013 - Edwards

  6. DNA as a string seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"definFrameMet(seq):    seqlen = len(seq)    count = 0for i inrange(0,seqlen,3):        codon = seq[i:i+3]if codon.upper() == "ATG":            count = count + 1return countprint"Number of Met. amino-acids", inFrameMet(seq) BCHB524 - 2013 - Edwards

  7. DNA as a string input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)    comp = complements[i]return compdefreverseComplement(seq):    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  8. DNA as a string input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  9. Creating reusable programs • Need to get input data and options from the user • …often us, but sometimes others, or us later. • Sometimes, want completely new inputs • …but often, want the same or similar input. • Sometimes, typing the input is OK • …but often, want to use data in a file. • Sometimes, output to the screen is OK • …but often, want the result to go into a file. BCHB524 - 2013 - Edwards

  10. Interactive input input_seq = raw_input("Type your codon: ")defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  11. Command-line input import sysinput_seq = sys.argv[1]defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  12. Interactive and file input import sysinput_seq = sys.stdin.read() defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  13. File input only import sysseq_file = sys.argv[1]# MAGIC: open file, read contents, and remove whitespaceinput_seq = ''.join(open(seq_file).read().split())defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq) BCHB524 - 2013 - Edwards

  14. Input Summary • raw_input provides interactive values from the user (also copy-and-paste) • sys.stdin.read() provides interactive or file-based values from the user (also copy-and-paste) • sys.argv[1] provides command-line values from the user (also copy-and-paste) • value can be a filename that provides user-input • Terminal standard-input redirection "<" can be used to send a file's contents to raw_input or sys.stdin.read() BCHB524 - 2013 - Edwards

  15. Output is easy… • Just use print, right? • Print statements go to the terminal's standard-output. • We can redirect to a file using ">" • Errors still get printed to the terminal. • We can also link programs together – standard-output to standard-input using "|" • Also, cat just writes its file to standard out BCHB524 - 2013 - Edwards

  16. Connect reverse complement w/ codon counting… • Create and test rc.py from earlier slides: • Sequence from standard-input • Reverse complement sequence to standard-output • Create and test codons.py from earlier slides: • Sequence from standard-input • Count to standard-output • Place example sequence in file: test.seq • Execute: cat test.seq | python rc.py | python codons.py BCHB524 - 2013 - Edwards

  17. In general • Windows and OS X have similar facilities • cmd in windows, terminal in OS X • Powerful mechanism for making reusable programs • No knowledge of python required for use! • Most bioinformatics software is used from the command-line w/ command-line arguments: • Files provide sequence data, etc. • I'll promote this style of program I/O. BCHB524 - 2013 - Edwards

  18. Exercise 1 • Use UniSTS (“google UniSTS”) to look up PCR markers for your favorite gene • Write a command-line program to compute the reverse complement sequence for the forward and reverse primer. BCHB524 - 2013 - Edwards

  19. Exercise 2 • Write a command-line program to test whether a PCR primer is a reverse complement palindrome. • Such a primer might fold and self-hybridize! • Test your program on the following primers: • TTGAGTAGACGCGTCTACTCAA • TTGAGTAGACGTCGTCTACTCAA • ATATATATATATATAT • ATCTATATATATGTAT BCHB524 - 2013 - Edwards

  20. Homework 3 • Due Monday, September 16th. • Submit using Blackboard • Use only the techniques introduced so far. • Make sure you can run the programs demonstrated in lecture(s). • Exercises 1, 2 from Lecture 4 • Exercises 1, 2 from Lecture 5 • Rosalind exercises 6,7 BCHB524 - 2013 - Edwards

More Related