
why to become a Pyologist

Perl is for plumbers – Python is for biologists

Stefan Maetschke

Teasdale Group


why, why, why …

  • Biologists suffer for no good reason
    • Perl is difficult to write and read
    • Perl gives weak error feedback
    • Perl obscures basic concepts
    • Limited understanding of principles
    • Low productivity
    • Reduced research scope
  • Perl is for plumbers - Python is for scientists
  • I want to have an easy life
plumbers and others

spectrum of tasks, tools and roles

  • sys admin
    • plumbing
    • vi
    • awk/Perl
    • grep/diff
  • scientist
    • Python
  • SW developer
    • designing
    • Emacs/IDE
    • C/C++/Java
    • UML/Unit test
equals(Perl, Python)

  • Perl
    • Larry Wall
    • 1987
    • There’s more than one way
    • Difficult
  • Python
    • Guido van Rossum
    • 1991
    • There should be one way
    • Easy
  • Both
    • Cross-platform, open-source, scripting language, multi-paradigm, dynamic typing, statement ratio: 6

you must be joking!

# Python
list = [['a', 'b', 'c'], [1, 2, 3]]
print(list[0])

# Perl
@list = (['a', 'b', 'c'], [1, 2, 3]);
print "@{$list[0]}\n";

# Perl
my @list = ('a', 'b', 'c');
my %hash;
$hash{'letters'} = \@list;
print "@{$hash{'letters'}}\n";

# Python
list = ['a', 'b', 'c']
hash = {}
hash['letters'] = list
print(hash['letters'])

# Python
class Person:
    def __init__(self, age):
        self.age = age

# Perl
package Person;
use strict;

sub new {
    my $class = shift;
    my $age = shift or die "Must pass age";
    my $rSelf = {'age' => $age};
    bless ($rSelf, $class);
    return $rSelf;
}

http://www.strombergers.com/python/
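As a usage sketch for the Python class above (the variable name and the age value are made up for illustration): creating an object needs no `bless`, and the missing-argument check that the Perl version writes by hand with `die` comes for free, because Python raises a `TypeError` when a required argument is left out.

```python
class Person:
    def __init__(self, age):
        self.age = age


# Create an instance and read its attribute.
someone = Person(42)
print(someone.age)

# Forgetting the required argument is reported by Python itself.
try:
    Person()
except TypeError as error:
    print("rejected:", error)
```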

More Perl bashing…

# Python
def add(a, b):
    return a + b

# Perl: four ways to write the same subroutine
sub add {
    $_[0] + $_[1];
}

sub add($$) {
    local ($a, $b) = @_;
    return $a + $b;
}

sub add {
    my $a = shift;
    my $b = shift;
    return $a + $b;
}

sub add {
    my ($a, $b) = @_;
    return $a + $b;
}

# Perl
sub diff {
    my ($aref, $bref) = @_;
    my (@a) = @$aref;
    my (@b) = @$bref;
    return scalar(@a) - scalar(@b);
}

# Python
def diff(a, b):
    return len(a) - len(b)

http://www.strombergers.com/python/

complexity wall

  • simple scripts: ≈ 100 lines => fun stops
  • behind the wall:
    • Data structures
    • Functions
    • Classes
    • Higher order concepts
  • everything you can do in Python you can do in Perl but you don’t
  • => Python allows you to break through the complexity wall
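To make the four items behind the wall concrete, here is a minimal sketch that combines a class, a data structure, a function and a higher-order call in a few lines. The `Sequence` class and the toy DNA strings are hypothetical and not taken from the slides.

```python
from collections import Counter


class Sequence:
    """A named DNA sequence (hypothetical helper for this sketch)."""

    def __init__(self, name, bases):
        self.name = name
        self.bases = bases.upper()

    def gc_content(self):
        # Fraction of bases that are G or C.
        counts = Counter(self.bases)
        return (counts["G"] + counts["C"]) / len(self.bases)


# Data structure: a plain list of objects (toy data).
sequences = [
    Sequence("seq1", "ATGCGC"),
    Sequence("seq2", "ATATAT"),
    Sequence("seq3", "GGCCGC"),
]

# Higher-order concept: a method is passed to sorted() as the sort key.
ranked = sorted(sequences, key=Sequence.gc_content, reverse=True)
for seq in ranked:
    print(seq.name, round(seq.gc_content(), 2))
```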

googliness
kilo-hits, May 2008

  Language   X language   X load file   X bioinformatics
  C              53,000         1,820                572
  Java            7,760         2,890                320
  C++             1,290         3,100                231
  C#              1,020           794                161
  Perl            1,150           685                101
  Python            527           798                199
  Ruby              470           806                186
  Scala             394           354                 69
  Haskell           212           323                 74

and the winner is…

[Benchmark chart not preserved in the transcript; one entry was annotated "without Psyco"]

http://shootout.alioth.debian.org/

damn lies and stats

sourceforge projects

  • Perl declining, Python increasing?
  • May 2008, keyword search: Perl 3474, Python 4063

http://rengelink.textdriven.com/blog/

see the light…

classify Iris plants

  • Four attributes:
    • sepal length
    • sepal width
    • petal length
    • petal width
  • Three species:
    • Iris setosa
    • Iris versicolor
    • Iris virginica

http://archive.ics.uci.edu/ml/datasets/Iris

Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II, 179-188 (1936)
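The code shown on this slide is not part of the transcript. As a stand-in, the following is a minimal sketch of how the task could look in plain Python: a nearest-neighbour classifier over the four attributes. The handful of sample rows is illustrative and follows the dataset's column order; the function names are assumptions made for this example.

```python
import math

# Illustrative rows in the dataset's column order:
# (sepal length, sepal width, petal length, petal width) in cm, species
SAMPLES = [
    ((5.1, 3.5, 1.4, 0.2), "Iris setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris virginica"),
]


def distance(a, b):
    """Euclidean distance between two attribute tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(flower, samples=SAMPLES):
    """Return the species of the sample closest to the given flower."""
    _, species = min(samples, key=lambda sample: distance(flower, sample[0]))
    return species


print(classify((5.0, 3.4, 1.5, 0.2)))
print(classify((6.5, 3.0, 5.8, 2.2)))
```

With the full dataset the same two functions would work unchanged; only `SAMPLES` would be read from the file instead of being typed in.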

libs for life science
  • Scientific computing: SciPy, NumPy, matplotlib
  • Bioinformatics: Biopython
  • Phylogenetic trees: Mavric, Plone, P4, Newick
  • Microarrays: SciGraph, CompClust
  • Molecular modeling: MMTK, OpenBabel, CDK, RDKit, cinfony, mmLib
  • Dynamic systems modeling: PyDSTool
  • Protein structure visualization: PyMOL, UCSF Chimera
  • Networks/Graphs: NetworkX, PyGraphViz
  • Symbolic math: SymPy, Sage
  • Wrapper for C/C++ code: SWIG, Pyrex, Cython
  • R/SPlus interface: RSPython, RPy
  • Java interface: Jython
  • Fortran to Python: F2PY

Also check out: http://www.scipy.org/Topical_Software and http://pypi.python.org/pypi

last words
  • Perl perfect for plumbing
  • Python excellent for scientific programming
    • Easy to learn, write and maintain
    • Suited for scripting and mid-size projects
    • Huge number of scientific libraries
  • Python is an attractive alternative to Matlab/R
  • Easy integration of Java, C/C++ or Fortran code
questions

isn’t Python lovely…

Interest:

Python Course?

links
  • Wikipedia – Python: http://en.wikipedia.org/wiki/Python
  • Instant Python: http://hetland.org/writing/instant-python.html
  • How to think like a computer scientist: http://openbookproject.net//thinkCSpy/
  • Dive into Python: http://www.diveintopython.org/
  • Python course in bioinformatics: http://www.pasteur.fr/recherche/unites/sis/formation/python/index.html
  • Beginning Python for bioinformatics: http://www.onlamp.com/pub/a/python/2002/10/17/biopython.html
  • SciPy Cookbook: http://www.scipy.org/Cookbook
  • Matplotlib Cookbook: http://www.scipy.org/Cookbook/Matplotlib
  • Biopython tutorial and cookbook: http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial.html
  • Huge collection of Python tutorials: http://www.awaretek.com/tutorials.html
  • What’s wrong with Perl: http://www.garshol.priv.no/download/text/perl.html
  • 20 Stages of Perl to Python conversion: http://aspn.activestate.com/ASPN/Mail/Message/python-list/1323993
  • Why Python: http://www.linuxjournal.com/article/3882
some papers
  • Bassi S. (2007) A Primer on Python for Life Science Researchers. PLoS Comput Biol 3(11): e199. doi:10.1371/journal.pcbi.0030199
  • Mangalam H. (2002) The Bio* toolkits--a brief overview. Brief Bioinform. 3(3):296-302.
  • Fourment M., Gillings MR. (2008) A comparison of common programming languages used in bioinformatics. BMC Bioinformatics 9:82.
to whom it may concern

NP = Non-Programmer

  • NPs who don’t use Perl yet
  • NPs who want to see the light
  • NPs who want to give their code away without being rightfully ashamed
  • Matlab aficionados
one of ten Perl myths

http://www.perl.com/pub/a/2000/01/10PerlMyths.html

“…we can happily consign the idea that ‘Perl is hard’ to mythology.”

Swap two sections of a string: “aaa:bbb” -> “bbb:aaa”

“…Perl works the way you do…”

# Perl, with a regular expression
while (<>) {
    s/(.*):(.*)/$2:$1/;
    print;
}

# Perl, with split
while (<>) {
    chomp;
    ($first, $second) = split /:/;
    print $second, ":", $first, "\n";
}

“…That's one, fairly natural way to think about it…”

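# Python, with a regular expression
import sys
from re import sub

for line in sys.stdin:
    print(sub(r'(.*):(.*)', r'\2:\1', line), end='')

# Python, with split (an alternative to the loop above)
for line in sys.stdin:
    line = line.strip()
    first, second = line.split(':')
    print(second + ':' + first)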
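The two Python variants can be wrapped in functions to compare them on the slide's example string. This is a sketch; the function names `swap_regex` and `swap_split` are made up here. It also shows one difference worth knowing: on input with more than one colon the split variant stops with a `ValueError`, whereas the Perl split version above would silently drop the extra field.

```python
import re


def swap_regex(line):
    """Swap the parts before and after the last colon."""
    return re.sub(r'(.*):(.*)', r'\2:\1', line)


def swap_split(line):
    """Swap exactly two colon-separated fields."""
    first, second = line.split(':')
    return second + ':' + first


print(swap_regex('aaa:bbb'))
print(swap_split('aaa:bbb'))

# A third field is an error for the split variant, not a silent loss.
try:
    swap_split('aaa:bbb:ccc')
except ValueError as error:
    print('rejected:', error)
```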

camel chaos
  • does not scale well
  • complex syntax
  • cryptic commands
  • does not encourage clear code
  • difficult to read/maintain
  • hard to understand the principles
  • error prone
    • no check of subroutine arguments
    • variables are global by default
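For contrast with the last two points, a small Python sketch (the names are chosen for illustration): a variable assigned inside a function is local unless declared otherwise, and the number of arguments is checked on every call.

```python
def add(a, b):
    total = a + b  # 'total' is local to this call
    return total


total = 0  # module-level variable with the same name

print(add(1, 2))  # 3
print(total)      # 0, the call above did not touch it

# Calling with the wrong number of arguments raises an error.
try:
    add(1)
except TypeError as error:
    print("argument check:", error)
```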
why Python
  • overcome the complexity wall
  • many, excellent scientific libraries
  • clear, easy to learn syntax
  • hard to do it wrong
  • does not require prior suffering/experience
my bias
  • R&D: C/C++ -> applied ML in robotics, image processing, quality control
  • SW Development: Java -> Speech Processing, Data Mining
  • Computational Biology: Java, Python
  • Other languages I played with: Ada, APL, Basic, Matlab, Modula, Pascal, Perl, Prolog, R, Groovy, Forth, Fortran, Scala, Assembly code