1 / 17

Advanced Python Concepts: Modules

Advanced Python Concepts: Modules. BCHB524 2013 Lecture 14. Modules. We've already used these from the python standard library and from BioPython.

orly
Download Presentation

Advanced Python Concepts: Modules

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Advanced Python Concepts: Modules BCHB5242013Lecture 14 BCHB524 - 2013 - Edwards

  2. Modules • We've already used these from the python standard library and from BioPython import sysfilename = sys.argv[1]file_handle = open(filename)import Bio.SeqIOseq_records = Bio.SeqIO.parse(file_handle, "swiss")import gzipfile_handle = gzip.open(filename)import urllibfile_handle = urllib.urlopen(url)import mathx = math.sqrt(y) BCHB524 - 2013 - Edwards

  3. Modules • Modules contain previously defined functions, variables, and (soon!) classes. • Store useful code for re-use in many programs • Structurerelated codetogether import sysimport MyNucStuffseqfilename = sys.argv[1]seq   = MyNucStuff.read_seq_from_filename(seqfilename)cseq  = MyNucStuff.complement(seq)rcseq = MyNucStuff.reverseComplement(seq)print"   Sequence:",seqprint" C sequence:",cseqprint"RC sequence:",rcseq BCHB524 - 2013 - Edwards

  4. Modules • In MyNucStuff.py defread_seq_from_filename(seq_filename):    seq_file = open(seq_filename)    dna_seq = ''.join(seq_file.read().split())    dna_seq = dna_seq.upper()    seq_file.close()return dna_seqcompl_dict = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}defcomplement(seq):return''.join(map(compl_dict.get,seq))defrevseq(seq):return''.join(reversed(seq))defreverseComplement(seq):return complement(revseq(seq))# plus anything else you like... BCHB524 - 2013 - Edwards

  5. Modules • We can import specific functions directly… • And reference them without the module name import sysfrom MyNucStuff import read_seq_from_filenamefrom MyNucStuff import complementfrom MyNucStuff import reverseComplementseqfilename = sys.argv[1]seq   = read_seq_from_filename(seqfilename)cseq  = complement(seq)rcseq = reverseComplement(seq)print"   Sequence:",seqprint" C sequence:",cseqprint"RC sequence:",rcseq BCHB524 - 2013 - Edwards

  6. Modules • We can even import all the functions from a module… import sysfrom MyNucStuff import * seqfilename = sys.argv[1]seq   = read_seq_from_filename(seqfilename)cseq  = complement(seq)rcseq = reverseComplement(seq)print"   Sequence:",seqprint" C sequence:",cseqprint"RC sequence:",rcseq BCHB524 - 2013 - Edwards

  7. Packages • Packages are collections of modules, grouped together. All equivalent: • Implemented using files and folders/directories. import Bio.SeqIOBio.SeqIO.parse(handle, "swiss")from Bio import SeqIOSeqIO.parse(handle, "swiss")from Bio.SeqIO import parseparse(handle, "swiss") BCHB524 - 2013 - Edwards

  8. What can go wrong? • Sometimes our own .py files can "collide" with Python's packages. • Test what happens with an "empty" module in files: • xml.py (and then try to import ElementTree) • Bio.py (and then try to import SeqIO) • etc… BCHB524 - 2013 - Edwards

  9. A module for codon tables • Module is called: codon_table • Functions: • read_codons_from_filename(filename) • returns dictionary of codons – value is pair: (amino-acid symbol, initiation codon true/false) • amino_acid(codon_table,codon) • returns amino-acid symbol for codon • is_init(codon_table,codon) • returns true if codon is an initiation codon, false, otherwise • get_ambig_aa (codon_table,codon) • Returns the single amino-acid consistent with ambiguous codon (containing N's), or X. • translate(codon_table,seq,frame) • returns amino-acid sequence for DNA sequence seq BCHB524 - 2013 - Edwards

  10. A module for codon tables from MyNucStuff import *from codon_table import *import sysiflen(sys.argv) < 3:print"Require codon table and DNA sequence on command-line."    sys.exit(1)table = read_codons_from_filename(sys.argv[1])seq = read_seq_from_filename(sys.argv[2])if is_init(table,seq[:3]):print"Initial codon is an initiation codon"for frame in (1,2,3):print"Frame",frame,"(forward):",translate(table,seq,frame) BCHB524 - 2013 - Edwards

  11. A module for codons • In codon_table.py: defread_codons_from_filename(codonfile):# magicreturn codon_tabledefamino_acid(table,codon):# magicreturn aadefis_init(table,codon):# magicreturn initdefget_ambig_aa(table,codon):# magicreturn aadeftranslate(table,seq,frame):# magicreturn aaseq BCHB524 - 2013 - Edwards

  12. A module for codons • In codon_table.py: defread_codons_from_filename(codonfile):    f = open(codonfile)    data = {}for l in f:        sl = l.split()        key = sl[0]        value = sl[2]        data[key] = value        f.close()    b1 = data['Base1']    b2 = data['Base2']    b3 = data['Base3']    aa = data['AAs']    st = data['Starts']    codon_table = {}    n = len(aa)for i inrange(n):        codon = b1[i] + b2[i] + b3[i]        isInit = (st[i] == 'M')        codon_table[codon] = (aa[i],isInit)return codon_table BCHB524 - 2013 - Edwards

  13. Lab exercises • Rework the lecture, and your solutions (or mine) from the homework exercises #1 through #3, to make a MyDNAStuff module. • Put as many useful nucleotide functions as possible into the module... • Rework the lecture, and your solutions (or mine) from homework exercises #4 and #5, to make the codon_table module with functions specified in this lecture. • Demonstrate the use of these modules to translate an amino-acid sequence in all six-frames with just a few lines of code. • The final result should look similar to Slide 10. • Your program should handle DNA sequence with N’s in it. BCHB524 - 2013 - Edwards

  14. Class Project: Expectations • 40% of your grade! • Project Report • Long version of your homework write-up • Project Presentation • Demo your program • Describe your project solution BCHB524 - 2013 - Edwards

  15. Class Project: Blast Database • Write a program that computes all pairwise blast alignments for two species' proteomes and stores the alignments in a relational database. • Write a program that retrieves the blast alignment for two proteins (specified by their accessions) from the relational database. • Write a program that finds pairs of orthologous proteins that are mutually best hits in the species' proteomes. BCHB524 - 2013 - Edwards

  16. Class Project: MS/MS Viewer • Write a program to display peptide fragmentation spectra from an mzXML file. • The program will take an mzXML file, a scan number, and a peptide sequence as input. • The peptide's b-ion and y-ion m/z values should be computed, and peaks matching these m/z values annotated with appropriate labels. • The output figure/plot should aid the user in determining whether or not the peptide is a good match to the spectrum. BCHB524 - 2013 - Edwards

  17. Class Project: Protein Digest • Write a simple web-server application using TurboGears to carry out anin silicoenzymatic digest of a user-provided protein sequence. • Users should be able to specify min and max length, min and max molecular weight, # of missed cleavages, and specific enzyme. • Output should be a table of peptides, with their length, molecular weight, # of missed cleavages, and amino-acids to left and right of each peptide in the protein sequence. BCHB524 - 2013 - Edwards

More Related