The 7 Bridges in Königsberg and Compositional Representation of Protein Sequences

Bailin Hao (郝柏林)

(ITP & BGI, CAS)

Huimin Xie (谢惠民)

(Math Dept. Suzhou U)

Shuyu Zhang (张淑誉)

(IP. Acad. Sinica)

Compositional Approach in Prokaryote Phylogeny
  • Justification of using K-tuples instead of primary protein sequences.
  • Problem of uniqueness of reconstruction of protein sequence from its constituent K-tuples.
  • Picking up a special class of proteins without biological knowledge.

7 bridges in Königsberg

Euler (1736)

4 odd nodes: No!
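Euler's criterion behind the "No!" can be checked in a few lines. The sketch below is illustrative only: the land-mass labels A to D and the bridge list are the usual textbook encoding of the puzzle, not something given on the slide. A connected undirected graph has an Euler loop when no node has odd degree, an Euler path when exactly two do, and neither otherwise.

```python
from collections import Counter

# The seven bridges as an undirected multigraph. Labels are illustrative:
# A = the island, B and C = the two river banks, D = the eastern land mass.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]


def degrees(edges):
    """Return the degree of every node of an undirected multigraph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def odd_nodes(edges):
    """Nodes of odd degree, sorted for reproducibility."""
    return sorted(n for n, d in degrees(edges).items() if d % 2 == 1)


def euler_status(edges):
    """Classify a connected multigraph by Euler's degree criterion."""
    n_odd = len(odd_nodes(edges))
    if n_odd == 0:
        return "loop"
    if n_odd == 2:
        return "path"
    return "none"


print(odd_nodes(BRIDGES), euler_status(BRIDGES))
```

All four land masses have odd degree (5, 3, 3, 3), so neither an Euler loop nor an Euler path exists.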

Dénes König, Theory of Finite and Infinite Graphs

1st ed. (1936). Birkhäuser (1990)

“From Königsberg to König’s book, So runs the graphic tale…”

Basic Notions
  • A Graph G=(V, E), where V is a set of nodes (vertices), E is a set of edges (bonds)
  • Edges: undirected, (u,v) = (v,u), u and v are adjacent; directed, (u,v) differs from (v,u), u is incident to v.
  • A weight may be associated with (u,v): cost, distance, transfer function, reaction rate, etc.
  • Eulerian graph: has a closed path in which each edge appears once and only once.
  • Hamiltonian graph: has a closed path in which each vertex appears once and only once. Finding a Hamiltonian cycle of minimal weight is the Traveling Salesman Problem (TSP).

An Euler path:

An Euler loop:

Euler graph: loop

Semi-Euler graph: path, no loop

Problem of Eulerian loop: simple, known solutions

Problem of Hamiltonian paths: much harder
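One of the "known solutions" for the Eulerian loop problem is Hierholzer's algorithm, which runs in time linear in the number of arcs. The slides do not say which method the authors have in mind, so the sketch below is just one standard choice. It assumes a connected directed graph with equal indegree and outdegree at every node; the function name `euler_loop` is introduced here for illustration.

```python
from collections import defaultdict


def euler_loop(arcs):
    """Return one Euler loop of a directed graph as a list of nodes.

    Assumes the graph is connected and indegree equals outdegree
    at every node (Hierholzer's algorithm, iterative form).
    """
    out = defaultdict(list)
    for u, v in arcs:
        out[u].append(v)
    stack, loop = [arcs[0][0]], []
    while stack:
        u = stack[-1]
        if out[u]:
            # Follow any unused arc leaving u.
            stack.append(out[u].pop())
        else:
            # No unused arc left at u: u is final in the loop.
            loop.append(stack.pop())
    return loop[::-1]


arcs = [(0, 1), (1, 2), (2, 0), (0, 2), (2, 1), (1, 0)]
print(euler_loop(arcs))
```

No comparably efficient procedure is known for Hamiltonian paths, which is the contrast the slide draws.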

Hamiltonian Loops: much harder

No!

10 nodes

15 arcs

di=3 nodes

Traveling Salesman Problem

NP-hard problems

Yes!
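The "No!" and "Yes!" cases can be reproduced with a plain backtracking search. The transcript does not name the graphs in the figures, so this sketch makes an assumption: it takes the 10-node, 15-arc, degree-3 example to be the Petersen graph, and uses the pentagonal prism (same node, arc and degree counts) as a contrasting "Yes!" case. The helper `two_pentagons` is hypothetical and exists only to build both graphs.

```python
def hamiltonian_loop(adj):
    """Return a Hamiltonian loop as a list of nodes, or None if there is none.

    adj maps each node to the set of its neighbours (undirected graph).
    Plain backtracking: exponential time in the worst case.
    """
    nodes = list(adj)
    start = nodes[0]
    path, seen = [start], {start}

    def extend():
        if len(path) == len(nodes):
            return start in adj[path[-1]]
        for v in adj[path[-1]]:
            if v not in seen:
                path.append(v)
                seen.add(v)
                if extend():
                    return True
                path.pop()
                seen.remove(v)
        return False

    return path + [start] if extend() else None


def two_pentagons(inner_step):
    """Outer 5-cycle 0..4, inner nodes 5..9, five spokes i to i+5.

    inner_step=2 joins the inner nodes as a pentagram (Petersen graph);
    inner_step=1 joins them as a pentagon (pentagonal prism).
    """
    adj = {i: set() for i in range(10)}
    for i in range(5):
        for u, v in ((i, (i + 1) % 5),
                     (i, i + 5),
                     (i + 5, (i + inner_step) % 5 + 5)):
            adj[u].add(v)
            adj[v].add(u)
    return adj


print(hamiltonian_loop(two_pentagons(2)))  # Petersen graph
print(hamiltonian_loop(two_pentagons(1)))  # pentagonal prism
```

Both graphs pass every simple degree test alike, which illustrates why no Euler-style local criterion settles the Hamiltonian question.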

Graph = nodes + arcs

Directed, labeled arcs and nodes

Simple graph:

No rings at nodes (no arc from i back to i)

No repeated arcs (at most one arc from i to j)

Indegree $d_{\mathrm{in}}(i)$

Outdegree $d_{\mathrm{out}}(i)$

Euler graph:

$d_{\mathrm{in}}(i) = d_{\mathrm{out}}(i) \equiv d_i \quad \forall i$

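Simple Euler Graph

Diagonal matrix: $M = \mathrm{diag}(d_1, d_2, \ldots, d_n)$

Adjacency matrix: $A = \{a_{ij}\}$, with $a_{ij} = 1$ if there is an arc from $i$ to $j$ and $0$ otherwise; $a_{ii} = 0$

Kirchhoff matrix: $C = M - A$

$\sum_{i=1}^{n} C_{ij} = \sum_{j=1}^{n} C_{ij} = 0$, hence $\det(C) = 0$

All minors (cofactors) of $C$ are equal. Denote this common minor by $\Delta$.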
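A small worked case may help fix the notation. The graph is my own choice, not one from the slides: three nodes with one arc in each direction between every pair, so that $d_i = 2$ everywhere. The symbol for the common minor is lost in the transcript; $\Delta$ is assumed here.

```latex
M = \mathrm{diag}(2, 2, 2), \qquad
A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \qquad
C = M - A = \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}

\det C = 2(4 - 1) + (-2 - 1) - (1 + 2) = 0

\Delta = (-1)^{1+1} \begin{vmatrix} 2 & -1 \\ -1 & 2 \end{vmatrix}
       = (-1)^{1+2} \begin{vmatrix} -1 & -1 \\ -1 & 2 \end{vmatrix}
       = 3
```

Every row and column of $C$ sums to zero, and the two cofactors computed (deleting row 1 with column 1, and row 1 with column 2) agree, as the slide states.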

Number of Euler loops in a simple Euler graph

N G de Bruijn

T van Aardenne-Ehrenfest

C A B Smith

W T Tutte

BEST Theorem

$e(G) = \Delta \prod_i (d_i - 1)!$

where $\Delta$ is the common minor of the Kirchhoff matrix $C$.
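The BEST formula is easy to test against direct enumeration on small graphs. The sketch below is a check under stated assumptions, not the authors' code: the graph is simple and connected with equal in- and outdegree at every node, the common minor is taken as the cofactor obtained by deleting the first row and column, and all function names are introduced here for illustration.

```python
from math import factorial, prod  # math.prod needs Python 3.8+


def det(m):
    """Integer determinant by Laplace expansion (small matrices only)."""
    if not m:
        return 1
    total = 0
    for j, a in enumerate(m[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * a * det(minor)
    return total


def best_count(a):
    """Number of Euler loops from the BEST theorem for adjacency matrix a."""
    n = len(a)
    d = [sum(row) for row in a]  # outdegrees, equal to indegrees
    c = [[(d[i] if i == j else 0) - a[i][j] for j in range(n)]
         for i in range(n)]
    delta = det([row[1:] for row in c[1:]])  # common minor of C
    return delta * prod(factorial(di - 1) for di in d)


def brute_force_count(a):
    """Count Euler loops by exhaustive search, starting from a fixed arc."""
    n = len(a)
    arcs = [(i, j) for i in range(n) for j in range(n) if a[i][j]]
    first, used = arcs[0], {arcs[0]}

    def walk(node):
        if len(used) == len(arcs):
            return 1 if node == first[0] else 0
        total = 0
        for arc in arcs:
            if arc[0] == node and arc not in used:
                used.add(arc)
                total += walk(arc[1])
                used.remove(arc)
        return total

    return walk(first[1])


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(best_count(triangle), brute_force_count(triangle))
```

Fixing the first arc in the enumeration counts each loop once regardless of where it is started, which is the convention the formula uses.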

Number of Eulerian loops in a general Euler graph

some $a_{ii} \neq 0$: rings

some $a_{ij} > 1$: parallel arcs

Putting auxiliary nodes on these rings and parallel arcs makes the graph simple.

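No need to work with a bigger A matrix.

Just let some $a_{ii} \neq 0$, $a_{ij} > 1$ in the original A.

Eliminate the redundancy caused by unlabeled arcs.

Modified BEST Theorem:

$e(G) = \dfrac{\Delta \prod_i (d_i - 1)!}{\prod_{ij} a_{ij}!}$

where $\Delta$ is the common minor of the Kirchhoff matrix $C = M - A$.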
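A worked instance of the modified formula, on a graph of my own choosing rather than one from the slides: three nodes with two parallel arcs from 1 to 2 and single arcs from 2 to 1, from 2 to 3 and from 3 to 1. The symbol $\Delta$ for the common minor of $C$ is an assumption, since the transcript drops it.

```latex
A = \begin{pmatrix} 0 & 2 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
M = \mathrm{diag}(2, 2, 1), \qquad
C = M - A = \begin{pmatrix} 2 & -2 & 0 \\ -1 & 2 & -1 \\ -1 & 0 & 1 \end{pmatrix}

\Delta = \begin{vmatrix} 2 & -1 \\ 0 & 1 \end{vmatrix} = 2, \qquad
\prod_i (d_i - 1)! = 1! \, 1! \, 0! = 1, \qquad
\prod_{ij} a_{ij}! = 2!

e(G) = \frac{2 \cdot 1}{2!} = 1
```

With labeled arcs there are two loops, differing only in which of the two parallel arcs is used first; once the labels are dropped they coincide in the single node sequence 3, 1, 2, 1, 2, 3. The division is exact here because the single arc from 3 to 1 fixes a starting point; for graphs in which every arc is repeated the count may need more care.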

ANPA_PSEAM 82AA

MALSLFTVGQLIFLFWTMRITEASPDPAAKAAPAAAAAPAAAAPDTASDAAAAAALTAANAKAAAELTAANAAAAAAATARG

K=5

[Figure: graph of the sequence; node labels include MALS, ALSL, LSLF, SLFT, LFTV, FTVG, TVGQ, VGQL, …, AKAA]

6 rings

ANPA_PSEAM 82AA

Antifreeze protein A/B precursor in winter flounder

Alanine-rich

Amphiphilic

auxiliary arc
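The graph behind this slide can be rebuilt from the sequence alone. The sketch below assumes, from the four-letter node labels shown for K=5, that nodes are (K-1)-tuples and that each K-tuple contributes one arc from its first K-1 letters to its last K-1 letters; the function name `ktuple_graph` is introduced here for illustration. A ring is then an arc from a node to itself.

```python
from collections import Counter

ANPA_PSEAM = ("MALSLFTVGQLIFLFWTMRITEASPDPAAKAAPAAAAAPAAAAPDTASD"
              "AAAAAALTAANAKAAAELTAANAAAAAAATARG")
K = 5


def ktuple_graph(seq, k):
    """Map each arc (from_node, to_node) to its multiplicity.

    Nodes are (k-1)-tuples; every k-tuple of seq gives one arc.
    """
    return Counter((seq[i:i + k - 1], seq[i + 1:i + k])
                   for i in range(len(seq) - k + 1))


arcs = ktuple_graph(ANPA_PSEAM, K)
rings = sum(n for (u, v), n in arcs.items() if u == v)
repeated = {arc: n for arc, n in arcs.items() if n > 1 and arc[0] != arc[1]}

print(len(ANPA_PSEAM), sum(arcs.values()), rings)
print(repeated)
```

The rings all sit on the node AAAA and come from the alanine runs, in line with the "6 rings" and "Alanine-rich" notes on the slide.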

From pdb.seq, a special selection of SWISSPROT: 2821 - 1 = 2820 proteins (May 2000)

R: number of reconstructed AA sequences from a given protein decomposition
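For short sequences, R can be computed two ways and compared. This is a sketch under assumptions that the slides do not spell out: the first (K-1)-tuple of the sequence is kept fixed, an auxiliary arc from the last node back to the first closes the path into a loop, and that auxiliary arc is treated as labeled, so it is left out of the factorials in the denominator. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial, prod  # math.prod needs Python 3.8+


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for cc in range(col, n):
                m[r][cc] -= factor * m[col][cc]
    return int(result)


def count_by_formula(seq, k):
    """R from the modified BEST formula (assumes k >= 2)."""
    nodes = sorted({seq[i:i + k - 1] for i in range(len(seq) - k + 2)})
    idx = {w: i for i, w in enumerate(nodes)}
    n = len(nodes)
    a = [[0] * n for _ in range(n)]
    for i in range(len(seq) - k + 1):
        a[idx[seq[i:i + k - 1]]][idx[seq[i + 1:i + k]]] += 1
    # Denominator uses the original arcs only (assumption: the auxiliary
    # arc added next is labeled and therefore not divided out).
    denom = prod(factorial(x) for row in a for x in row)
    a[idx[seq[-(k - 1):]]][idx[seq[:k - 1]]] += 1  # auxiliary arc
    d = [sum(row) for row in a]
    c = [[(d[i] if i == j else 0) - a[i][j] for j in range(n)]
         for i in range(n)]
    delta = det([row[1:] for row in c[1:]])
    return delta * prod(factorial(di - 1) for di in d) // denom


def count_by_enumeration(seq, k):
    """R by exhaustive search over sequences with the same K-tuple counts."""
    left = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    alphabet = sorted(set(seq))

    def extend(tail, remaining):
        if remaining == 0:
            return 1
        total = 0
        for ch in alphabet:
            w = tail + ch
            if left[w] > 0:
                left[w] -= 1
                total += extend(w[1:], remaining - 1)
                left[w] += 1
        return total

    return extend(seq[:k - 1], len(seq) - k + 1)


for s in ("ACABA", "ABACADA", "XABYABZ"):
    print(s, count_by_formula(s, 2), count_by_enumeration(s, 2))
```

The enumeration grows exponentially with R, so it serves only as a check; the determinant route stays polynomial in the number of nodes.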

Compositional Representation of Proteins

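The collection $\{W_i^K\}_{i=1}^{L}$, with $L = N - K + 1$ for a sequence of $N$ amino acids, or $\{W_j^K, n_j\}_{j=1}^{M}$, where $n_j$ counts the occurrences of the $j$-th distinct K-tuple, may be used as an equivalent representation of the original protein sequence.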
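In code, the second form of the representation is simply a table of K-tuple counts. The minimal sketch below uses a hypothetical helper name, `composition`, and two toy inputs; the number of K-tuples with multiplicity is the sequence length minus K plus 1.

```python
from collections import Counter


def composition(seq, k):
    """Return the distinct K-tuples of seq with their multiplicities."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


for seq in ("MALSLFTVGQ", "AAAAAA"):
    comp = composition(seq, 5)
    total = sum(comp.values())  # number of K-tuples counted with repeats
    distinct = len(comp)        # number of distinct K-tuples
    print(seq, total, distinct, dict(comp))
```

Whether the counts determine the sequence uniquely is exactly the reconstruction question; the table itself is always well defined.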

A seemingly trivial result upon further reflection: random AA sequences have unique reconstruction as well.

Compositional Representation works equally well for random AA sequences and for most protein sequences.

A given realization of a short random AA sequence is as specific as a real protein sequence.

Nucleotide correlations in DNA/RNA

Much studied

K=2 correlation functions: 16 → 9 → 6

See Wentian Li, Computers Chem. 21 (1997) 257-271.

Amino Acid correlations in Proteins

Almost no study

Hard to comprehend 400 correlation functions at K=2

Proteins too short to define correlation functions

One should approach the problem from a more deterministic point of view

Repeated AA segments in proteins are strong manifestation of correlations!

On-going study: the other extreme

Quite a few proteins have an enormous number of reconstructions.

Transmembrane

Antifreeze

Fibrous: collagens

Coarse-graining: closer to biology by reducing the number of AAs
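Coarse-graining amounts to mapping the 20 amino acids onto a smaller alphabet before taking K-tuples. The slides do not say which reduction the authors use, so the six classes below are purely hypothetical and chosen only to make the sketch concrete; `GROUPS`, `REDUCE` and `coarse_grain` are names introduced here.

```python
# Hypothetical six-class grouping, for illustration only; the grouping
# actually used in the study is not given in the slides.
GROUPS = {
    "I": "AVLIMC",  # aliphatic and sulfur-containing
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "E": "DE",      # negatively charged
    "G": "GP",      # glycine and proline
}
REDUCE = {aa: label for label, members in GROUPS.items() for aa in members}


def coarse_grain(seq):
    """Rewrite an amino acid sequence in the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)


print(coarse_grain("MALSLFTVGQ"))
```

With fewer letters, distinct K-tuples of the original sequence can coincide after reduction, so the reconstruction count is computed on a different graph than before.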

Preprint:

NSF-ITP-01-018

LANL E-archive physics/0103028

arxiv.org or cn.arxiv.org

Cross-referenced in q-bio since 15 Sept 2003