Introduction to python
This presentation is the property of its rightful owner.
Sponsored Links
1 / 20

Introduction to Python PowerPoint PPT Presentation


  • 85 Views
  • Uploaded on
  • Presentation posted in: General

Introduction to Python. BCHB524 2013 Lecture 5. Outline. Review DNA as a string Extracting codons in DNA Counting in-frame codons in DNA Reverse Complement Program Input/Output raw_input, command-line arguments standard-input, standard-output, redirection. Review.

Download Presentation

Introduction to Python

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Introduction to python

Introduction to Python

BCHB5242013Lecture 5

BCHB524 - 2013 - Edwards


Outline

Outline

  • Review

  • DNA as a string

    • Extracting codons in DNA

    • Counting in-frame codons in DNA

    • Reverse Complement

  • Program Input/Output

    • raw_input, command-line arguments

    • standard-input, standard-output, redirection

BCHB524 - 2013 - Edwards


Review

Review

  • Printing and execution

  • Variables and basic data-types:

    • integers, floats, strings

    • Arithmetic with, conversion between

    • String characters and chunks, string methods

  • Functions, using/calling and defining:

    • Use in any expression

    • Parameters as input, return for output

  • Control Flow:

    • if statements – conditional execution

    • for statements – iterative execution

BCHB524 - 2013 - Edwards


Dna as a string

DNA as a string

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"seqlen = len(seq)# set i to 0, 3, 6, 9, ..., 36for i inrange(0,seqlen,3):# extract the codon as a string    codon = seq[i:i+3]print codonprint"Number of Met. amino-acids", seq.count("atg")

BCHB524 - 2013 - Edwards


Dna as a string1

DNA as a string

  • What about upper and lower case?

    • ATG vs atg?

  • Differences between DNA and RNA sequence?

    • Substitute U for each T?

  • How about ambiguous nucleotide symbols?

    • What should we do with ‘N’ and other ambiguity codes (R, Y, W, S, M, K, H, B, V, D)?

  • Strings don’t know any biology!

BCHB524 - 2013 - Edwards


Dna as a string2

DNA as a string

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"definFrameMet(seq):    seqlen = len(seq)    count = 0for i inrange(0,seqlen,3):        codon = seq[i:i+3]if codon.upper() == "ATG":            count = count + 1return countprint"Number of Met. amino-acids", inFrameMet(seq)

BCHB524 - 2013 - Edwards


Dna as a string3

DNA as a string

input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)    comp = complements[i]return compdefreverseComplement(seq):    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


Dna as a string4

DNA as a string

input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


Creating reusable programs

Creating reusable programs

  • Need to get input data and options from the user

    • …often us, but sometimes others, or us later.

  • Sometimes, want completely new inputs

    • …but often, want the same or similar input.

  • Sometimes, typing the input is OK

    • …but often, want to use data in a file.

  • Sometimes, output to the screen is OK

    • …but often, want the result to go into a file.

BCHB524 - 2013 - Edwards


Interactive input

Interactive input

input_seq = raw_input("Type your codon: ")defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


Command line input

Command-line input

import sysinput_seq = sys.argv[1]defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


Interactive and file input

Interactive and file input

import sysinput_seq = sys.stdin.read()

defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


File input only

File input only

import sysseq_file = sys.argv[1]# MAGIC: open file, read contents, and remove whitespaceinput_seq = ''.join(open(seq_file).read().split())defcomplement(nuc):    nucleotides = 'ACGT'    complements = 'TGCA'    i = nucleotides.find(nuc)if i >= 0:        comp = complements[i]else:        comp = nucreturn compdefreverseComplement(seq):    seq = seq.upper()    newseq = ""for nuc in seq:        newseq = complement(nuc) + newseqreturn newseqprint"Reverse complement:", reverseComplement(input_seq)

BCHB524 - 2013 - Edwards


Input summary

Input Summary

  • raw_input provides interactive values from the user (also copy-and-paste)

  • sys.stdin.read() provides interactive or file-based values from the user (also copy-and-paste)

  • sys.argv[1] provides command-line values from the user (also copy-and-paste)

    • value can be a filename that provides user-input

  • Terminal standard-input redirection "<" can be used to send a file's contents to raw_input or sys.stdin.read()

BCHB524 - 2013 - Edwards


Output is easy

Output is easy…

  • Just use print, right?

  • Print statements go to the terminal's standard-output.

    • We can redirect to a file using ">"

    • Errors still get printed to the terminal.

  • We can also link programs together – standard-output to standard-input using "|"

    • Also, cat just writes its file to standard out

BCHB524 - 2013 - Edwards


Connect reverse complement w codon counting

Connect reverse complement w/ codon counting…

  • Create and test rc.py from earlier slides:

    • Sequence from standard-input

    • Reverse complement sequence to standard-output

  • Create and test codons.py from earlier slides:

    • Sequence from standard-input

    • Count to standard-output

  • Place example sequence in file: test.seq

  • Execute:cat test.seq | python rc.py | python codons.py

BCHB524 - 2013 - Edwards


In general

In general

  • Windows and OS X have similar facilities

    • cmd in windows, terminal in OS X

  • Powerful mechanism for making reusable programs

    • No knowledge of python required for use!

  • Most bioinformatics software is used from the command-line w/ command-line arguments:

    • Files provide sequence data, etc.

  • I'll promote this style of program I/O.

BCHB524 - 2013 - Edwards


Exercise 1

Exercise 1

  • Use UniSTS (“google UniSTS”) to look up PCR markers for your favorite gene

    • Write a command-line program to compute the reverse complement sequence for the forward and reverse primer.

BCHB524 - 2013 - Edwards


Exercise 2

Exercise 2

  • Write a command-line program to test whether a PCR primer is a reverse complement palindrome.

    • Such a primer might fold and self-hybridize!

    • Test your program on the following primers:

      • TTGAGTAGACGCGTCTACTCAA

      • TTGAGTAGACGTCGTCTACTCAA

      • ATATATATATATATAT

      • ATCTATATATATGTAT

BCHB524 - 2013 - Edwards


Homework 3

Homework 3

  • Due Monday, September 16th.

  • Submit using Blackboard

  • Use only the techniques introduced so far.

  • Make sure you can run the programs demonstrated in lecture(s).

  • Exercises 1, 2 from Lecture 4

  • Exercises 1, 2 from Lecture 5

  • Rosalind exercises 6,7

BCHB524 - 2013 - Edwards


  • Login