MATLAB Bioinformatics Tools

Rob Henson

The MathWorks, Inc.


Who Am I?

  • Development manager for Bioinformatics group at The MathWorks

    • Natick, MA

  • Software developer

  • Background in algorithm design and software engineering


What do I do?

  • Write software for bioinformatics

    • Sequence analysis

    • Microarray data analysis

  • Some consulting

    • Bioinformatics algorithm design

    • Machine learning tools

      • e.g. neural networks, HMMs, etc.


My solution to dotplot

>> map = eye(128);

>> spy(map(seq1,seq2))

Why does this work?

How could we make this better?
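
One way to see why the two-line solution works: `eye(128)` is 1 only where the row index equals the column index, and indexing with character arrays uses their character codes, so entry (i,j) of `map(seq1,seq2)` is 1 exactly when `seq1(i)` and `seq2(j)` are the same letter. The sketch below checks this against a direct comparison; the two short sequences are made up for illustration.

```matlab
% Illustrative sequences (not from the slides).
seq1 = 'GATTACA';
seq2 = 'TACAGAT';

% Lookup table: map(a,b) is 1 only when the codes a and b are equal.
% 128 rows and columns cover every 7-bit ASCII character code.
map = eye(128);

% Characters used as subscripts are converted to their codes, so
% dots(i,j) is 1 exactly when seq1(i) and seq2(j) are the same letter.
dots = map(seq1, seq2);

% The same matrix built by comparing every letter pair directly.
same = bsxfun(@eq, seq1(:), seq2(:).');
agree = isequal(dots, double(same));
```

Passing `dots` to `spy` then draws one dot per matching pair of positions.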


Enhancements to dotplot

  • Does map need to be 128?

    • What is the right value?

  • Can we use less memory?

  • How do we deal with bad inputs?

  • Can we extend this to look for longer patterns?
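
A minimal sketch of how these questions might be addressed, under some assumptions: comparing the letters directly yields a logical matrix (one byte per element) and removes the need for a lookup table, a type check catches non-character input, and summing matches along each diagonal finds longer patterns. The names `window` and `stringency`, the example sequences, and the use of `conv2` are illustrative choices, not necessarily what the presenter had in mind.

```matlab
% Illustrative inputs; window and stringency are assumed parameter names.
seq1 = 'gattacagattaca';
seq2 = 'GATTACATACAGAT';
window = 3;       % length of the diagonal window
stringency = 3;   % matches required inside the window

% Reject bad inputs early.
if ~ischar(seq1) || ~ischar(seq2) || isempty(seq1) || isempty(seq2)
    error('dotplot:badInput', 'Both sequences must be non-empty character arrays.');
end
seq1 = upper(seq1);
seq2 = upper(seq2);
if ~all(isletter(seq1)) || ~all(isletter(seq2))
    error('dotplot:badSymbol', 'Sequences may contain letters only.');
end

% Logical match matrix: no 128-by-128 table, and 1 byte per element.
dots = bsxfun(@eq, seq1(:), seq2(:).');

% counts(i,j) is the number of matches on the diagonal starting at (i,j)
% and running for WINDOW positions; eye(window) is unchanged by the
% kernel flip that convolution applies.
counts = conv2(double(dots), eye(window), 'valid');

% Keep only window positions with enough matches.
filtered = counts >= stringency;
```

`spy(filtered)` would then show only the diagonal runs that pass the threshold.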


Some useful tools

  • edit

  • dbstop

  • profiler

  • Getting help

    • Documentation

    • Technical Support Knowledge Base

    • Newsgroup


A full implementation of dotplot

function matches = dotplot(seq1,seq2,window,stringency)
% DOTPLOT Visualize sequence matches.
% DOTPLOT(S,T) plots the sequence matches of sequences S and T.
%
% DOTPLOT(S,T,WINDOW,NUM) plots sequence matches when there
% are at least NUM matches in a window of size WINDOW. For nucleotide
% sequences a WINDOW of 11 and NUM of 7 is recommended in the
% literature.
%
% MATCHES = DOTPLOT(...) returns the number of dots in the dotplot
% matrix.
%
% Example:
% moufflon = getgenbank('AB060288','sequence',true);
% takin = getgenbank('AB060290','sequence',true);
% dotplot(moufflon,takin,11,7)
%
% This shows the similarities between prion protein (PrP) nucleotide
% sequences of two ruminants, the moufflon and the golden takin.
%
% See also NWALIGN, SWALIGN.


Sequence properties

  • Amino acid composition

    • histc function

  • Molecular weight

    • Indexing and sum function

  • Hydrophobicity
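
As a rough illustration of the first bullet, amino acid composition can be computed by letting `histc` count how often each letter code occurs. The variable names are illustrative, and the sequence is the example that appears later in the slides.

```matlab
% Example protein sequence (the one used later in the slides).
seq = 'MATLAPEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSP';

% One bin per letter A..Z; histc counts how many character codes
% fall into each bin.
letters = 'A':'Z';
counts = histc(double(seq), double(letters));

% Fraction of the sequence made up by each letter.
composition = counts / numel(seq);

% Letters that actually occur, with their counts.
present = letters(counts > 0);
presentCounts = counts(counts > 0);
```

`bar(counts)` gives a quick visual summary of the composition.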


Molecular weights

A: 89.000

R: 174.000

N: 132.000

D: 133.000

C: 121.000

Q: 146.000

E: 147.000

G: 75.000

H: 155.000

I: 131.000

L: 131.000

K: 146.000

M: 149.000

F: 165.000

P: 115.000

S: 105.000

T: 119.000

W: 204.000

Y: 181.000

V: 117.000

http://cn.expasy.org/tools/pscale/Molecularweight.html


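% Molecular weights indexed by letter (A = 1, B = 2, ..., Y = 25).
% Letters that are not standard amino acids are set to 0.
mw = [ 89.0900    % A
         0        % B
       121.1500   % C
       133.1000   % D
       147.1300   % E
       165.1900   % F
        75.0700   % G
       155.1600   % H
       131.1700   % I
         0        % J
       146.1900   % K
       131.1700   % L
       149.2100   % M
       132.1200   % N
         0        % O
       115.1300   % P
       146.1500   % Q
       174.2000   % R
       105.0900   % S
       119.1200   % T
         0        % U
       117.1500   % V
       204.2300   % W
         0        % X
       181.1900]; % Y

seq = 'MATLAPEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSP';

% seq-'A'+1 turns each letter into its position in the alphabet,
% which is used as an index into mw.
seqmw = mw(seq-'A'+1);

plot(seqmw)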
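
The "Sequence properties" slide pairs molecular weight with indexing and the `sum` function. A minimal sketch of that idea follows, reusing the weight table from this slide as a row vector. Subtracting one water molecule (about 18.015) per peptide bond is an added assumption about how to get from free amino acid weights to a peptide weight; it is not something the slides state.

```matlab
% Weights for letters A..Y from the slide; 0 marks letters that are
% not standard amino acids.
mw = [89.09 0 121.15 133.10 147.13 165.19 75.07 155.16 131.17 0 ...
      146.19 131.17 149.21 132.12 0 115.13 146.15 174.20 105.09 ...
      119.12 0 117.15 204.23 0 181.19];

seq = 'MATLAPEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSP';

% Look up the weight of every residue by alphabet position.
seqmw = mw(seq - 'A' + 1);

% A zero weight means the sequence contains a non-standard letter.
if any(seqmw == 0)
    error('mw:badSymbol', 'Sequence contains a non-standard amino acid letter.');
end

% Sum of the free amino acid weights.
total = sum(seqmw);

% Assumption: each peptide bond releases one water molecule.
water = 18.015;
peptide_mw = total - (numel(seq) - 1) * water;
```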


proteinplot


Assignments

1. Create a hydrophobicity plot

You can get the amino acid values from http://cn.expasy.org/cgi-bin/protscale.pl

Use Kyte & Doolittle’s values.

Create a function that has two inputs, the sequence and the window size. The function will create a hydrophobicity plot. The help for the function is on the next slide…


function hydrophobic(sequence, window_length)
% HYDROPHOBIC plots the hydrophobicity of an amino acid sequence.
% HYDROPHOBIC(SEQUENCE,WINDOW_LENGTH) creates a hydrophobicity plot of
% SEQUENCE using a smoothing window of length WINDOW_LENGTH.
%
% SEQUENCE must be a valid amino acid sequence. If SEQUENCE contains any
% symbols other than the standard 20 amino acid letters, the function
% will give an error message. SEQUENCE can be either upper or lower case.
%
% WINDOW_LENGTH must be an odd positive integer.


Assignments

2. Modify the function to return the maximum and minimum hydrophobicity values in the plot.

Make appropriate changes to the help for the function.
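
One possible approach to the two assignments, written as a script rather than a finished function so the steps are easy to follow. It assumes the standard Kyte & Doolittle scale values, an unweighted moving average computed with `conv`, and illustrative variable names; the intended solution may differ.

```matlab
% Inputs (illustrative values).
sequence = 'matlapeepqsdpsveppLSQETFSDLWKLLPENNVLSP';
window_length = 5;

% Kyte & Doolittle hydropathy values, listed in the same order as aa.
aa = 'ARNDCQEGHILKMFPSTWYV';
kd = [ 1.8 -4.5 -3.5 -3.5  2.5 -3.5 -3.5 -0.4 -3.2  4.5 ...
       3.8 -3.9  1.9  2.8 -1.6 -0.8 -0.7 -0.9 -1.3  4.2];

% Accept either case, but only the 20 standard letters.
sequence = upper(sequence);
[valid, idx] = ismember(sequence, aa);
if ~all(valid)
    error('hydrophobic:badSymbol', 'Sequence contains a non-standard amino acid letter.');
end

% The window must be an odd positive integer no longer than the sequence.
if window_length < 1 || mod(window_length, 2) ~= 1 || window_length > numel(sequence)
    error('hydrophobic:badWindow', 'WINDOW_LENGTH must be an odd positive integer.');
end

% Per-residue values, then an unweighted moving average over the window.
values = kd(idx);
smoothed = conv(values, ones(1, window_length) / window_length, 'valid');

% Residue position at the centre of each window, for the x axis.
centers = (1:numel(smoothed)) + (window_length - 1) / 2;

% Assignment 2: extremes of the plotted curve.
hmax = max(smoothed);
hmin = min(smoothed);
```

`plot(centers, smoothed)` draws the curve; inside a function, `hmax` and `hmin` would become the output arguments.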


Advanced example

  • Alignment significance

    • Alignment algorithms such as Smith-Waterman and Needleman-Wunsch always find some alignment. How do we know if what they find is significant or simply random?
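
The slides do not say how significance is to be assessed. One common approach, sketched here as an assumption, is a randomization test: shuffle one sequence many times, score each shuffled version against the other sequence, and see how often chance does as well as the real pair. The example sequences, the scoring values (match 1, mismatch -1, gap -2), and the small built-in Smith-Waterman scoring loop are illustrative; with the Bioinformatics Toolbox, the score from `swalign` could be used instead.

```matlab
% Illustrative sequences and scoring scheme (all assumptions).
seq1 = 'GATTACAGATTACA';
seq2 = 'CATTAGAGATTCCA';
match = 1; mismatch = -1; gap = -2;
nperm = 200;                 % number of shuffled comparisons
rng(0);                      % fixed seed so the run is repeatable

m = numel(seq1);
n = numel(seq2);
scores = zeros(1, nperm + 1);

for p = 0:nperm
    % p == 0 scores the real pair; p > 0 scores a shuffled seq2.
    if p == 0
        t = seq2;
    else
        t = seq2(randperm(n));
    end

    % Smith-Waterman local alignment score with a linear gap penalty.
    H = zeros(m + 1, n + 1);
    for i = 1:m
        for j = 1:n
            if seq1(i) == t(j)
                s = match;
            else
                s = mismatch;
            end
            H(i+1, j+1) = max([0, H(i, j) + s, H(i, j+1) + gap, H(i+1, j) + gap]);
        end
    end
    scores(p + 1) = max(H(:));
end

observed = scores(1);
null_scores = scores(2:end);

% Empirical p-value: share of shuffles scoring at least as well as the
% real pair (the +1 terms keep the estimate away from zero).
pvalue = (1 + sum(null_scores >= observed)) / (nperm + 1);
```

A small `pvalue` suggests the observed score is unlikely to arise from sequences with the same composition but random order.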

