SubMAP: Aligning metabolic pathways with subnetwork mappings

Ferhat Ay, Tamer Kahveci

RECOMB 2010

Bioinformatics Lab., University of Florida

What is Pathway Alignment?

[Figure: Pathway1 (reactions R1 to R8) aligned with Pathway2 (reactions R1 to R7).]

- Global alignment is GI-complete (as hard as graph isomorphism)
- Local alignment is NP-complete

Existing Methods

- Heymans et al., Bioinformatics (2003) – Undirected, hierarchical enzyme similarity
- Clemente et al., Genome Informatics (2005) – Gene Ontology similarity of enzymes
- Pinter et al., Bioinformatics (2005) – Directed, only multi-source trees
- Singh et al., RECOMB (2007) – PPI networks, sequence similarity
- Dost et al., RECOMB (2007) – QNET, color coding, tree queries of size at most 9
- Cheng et al., Bioinformatics (2009) – MetNetAligner, allows insertions & deletions

Problems

- Restriction to one-to-one mappings
- Similarity of biological functions
- Topology restrictions

Different Paths, Same Function

[Figure: different paths with the same function in E. coli and A. thaliana.]

One-to-many (Subnetwork) Mappings

- Biologically Relevant
- Frequently Observed in Nature
- Characterize Similarity

Challenges

- Exponential number of subnetworks
- Defining similarity between subnetworks
- Overlapping mappings (consistency)

Outline

- Enumerating Subnetworks
- Homological Similarity (one-to-one mapping, subnetwork mappings)
- Topological Similarity (one-to-one mapping, subnetwork mappings)
- Combining Homology & Topology
- Extracting Mappings

Enumeration of Subnetworks

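[Figure: example pathway with reactions 1 to 6.]

R1 (subnetworks of size 1): {1}, {2}, {3}, {4}, {5}, {6}

R2 (subnetworks of size 2): {1,3}, {2,3}, {3,4}, {3,5}, {5,6}

R3 (subnetworks of size 3): {1,2,3}, {1,3,4}, {1,3,5}, {2,3,4}, {2,3,5}, {3,4,5}, {3,5,6}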
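The enumeration on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes the pathway is given as an undirected adjacency map, the edges below are inferred from the size-2 sets listed above, and the name `connected_subnetworks` is hypothetical.

```python
def connected_subnetworks(adj, k):
    """Group all connected node sets of size 1..k by size.

    adj: dict mapping each reaction to the set of its neighbors
         (edge direction is ignored; an assumption of this sketch).
    """
    levels = {1: {frozenset([v]) for v in adj}}
    for size in range(2, k + 1):
        grown = set()
        for sub in levels[size - 1]:
            for v in sub:
                for u in adj[v]:
                    if u not in sub:
                        # Adding a neighbor keeps the set connected.
                        grown.add(sub | {u})
        levels[size] = grown
    return levels


# Edges inferred from the size-2 subnetworks: 1-3, 2-3, 3-4, 3-5, 5-6.
adj = {1: {3}, 2: {3}, 3: {1, 2, 4, 5}, 4: {3}, 5: {3, 6}, 6: {5}}
levels = connected_subnetworks(adj, 3)
for size in sorted(levels):
    print(size, sorted(sorted(s) for s in levels[size]))
```

For this example the sketch yields 6, 5 and 7 subnetworks of sizes 1, 2 and 3, matching the sets R1, R2 and R3 listed on the slide. The number of sets grows quickly with k, which is the first challenge named earlier.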

Homological Similarities (1-to-1)

Enzyme Similarity (SimE)

- Hierarchical Enzyme Similarity – Webb EC. (2002)
- Information-Content Enzyme Similarity – Pinter et al. (2005)
- Gene Ontology Similarity of Enzymes – Clemente et al. (2003)

Compound Similarity (SimC)

- Identity Score for compounds
- SIMCOMP Compound Similarity – Hattori et al. (2003)

- The similarity of two reactions is defined in terms of SimE and SimC

[Figure: example compounds L-Aspartate and L-Lysine.]

Homological Similarities(Subnetworks)

[Figure: Subnetwork1 (s1) and Subnetwork2 (s2), each with its input compounds, enzymes, and output compounds.]

$Sim(s_1, s_2) = w_1 \, MWBM(\text{inputs}) + w_2 \, MWBM(\text{enzymes}) + w_3 \, MWBM(\text{outputs})$

MWBM: maximum weight bipartite matching
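A small sketch of how the weighted sum of the three matchings could be computed. Everything beyond the formula itself is an assumption made for illustration: the brute-force matching, the normalization by the larger set size, the weights `w`, the identity similarity, and the names `mwbm_score` and `subnetwork_similarity`.

```python
from itertools import permutations


def mwbm_score(left, right, sim):
    """Brute-force maximum weight bipartite matching score (tiny sets only).

    Dividing by the larger set size is an assumption of this sketch.
    """
    left, right = list(left), list(right)
    if not left or not right:
        return 0.0
    swapped = len(left) > len(right)
    if swapped:
        left, right = right, left  # make `left` the smaller side
    best = 0.0
    for perm in permutations(right, len(left)):
        total = sum(sim(b, a) if swapped else sim(a, b)
                    for a, b in zip(left, perm))
        best = max(best, total)
    return best / len(right)


def subnetwork_similarity(s1, s2, sim_c, sim_e, w=(0.3, 0.4, 0.3)):
    """Sim(s1, s2) = w1*MWBM(inputs) + w2*MWBM(enzymes) + w3*MWBM(outputs)."""
    w1, w2, w3 = w  # hypothetical weights
    return (w1 * mwbm_score(s1["inputs"], s2["inputs"], sim_c)
            + w2 * mwbm_score(s1["enzymes"], s2["enzymes"], sim_e)
            + w3 * mwbm_score(s1["outputs"], s2["outputs"], sim_c))


def identity(a, b):
    """Identity score: 1 for equal labels, 0 otherwise."""
    return 1.0 if a == b else 0.0


# Hypothetical subnetworks with made-up compound and enzyme labels.
s1 = {"inputs": ["C1"], "enzymes": ["E1", "E2"], "outputs": ["C2"]}
s2 = {"inputs": ["C1"], "enzymes": ["E1"], "outputs": ["C3"]}
print(subnetwork_similarity(s1, s2, identity, identity))
```

The identity score stands in for SimC and SimE here; any of the enzyme or compound similarity measures listed earlier could be passed in instead.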

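Topological Similarities (1-to-1)

[Figure: two example pathways, each with |R| = 4 reactions. Pathway 1 contains R1, R2, R3, R4; pathway 2 contains R1, R3, R4, R5.]

Pathway 1: BN(R3) = {R1, R2}, FN(R3) = {R4}

Pathway 2: BN(R3) = {R1}, FN(R3) = {R4, R5}

(BN: backward neighbors, FN: forward neighbors)

$A[R_3, R_3][R_2, R_1] = \frac{1}{2 \cdot 1 + 1 \cdot 2} = \frac{1}{4}$

Each bracket pairs a reaction of pathway 1 with a reaction of pathway 2. The denominator is |BN(R3)| |BN(R3)| + |FN(R3)| |FN(R3)| taken over the two pathways.

Size of A: $(|R| \cdot |R'|) \times (|R| \cdot |R'|) = 16 \times 16$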
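The support matrix for the two example pathways can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: edges are inferred from the BN and FN sets of R3, a pair (u, v) is assumed to split its support equally among the pairs formed by its backward neighbors and by its forward neighbors, and the orientation of the two index pairs is an assumption (the slide only fixes the value 1/4 linking the pairs (R3, R3) and (R2, R1)). The function names are hypothetical.

```python
def neighbors(nodes, edges):
    """Return backward and forward neighbor sets for a directed graph."""
    bn = {n: set() for n in nodes}
    fn = {n: set() for n in nodes}
    for a, b in edges:
        fn[a].add(b)
        bn[b].add(a)
    return bn, fn


def support_matrix(nodes1, edges1, nodes2, edges2):
    """A[(i, j), (u, v)]: share of support that pair (u, v) gives to (i, j)."""
    bn1, fn1 = neighbors(nodes1, edges1)
    bn2, fn2 = neighbors(nodes2, edges2)
    pairs = [(a, b) for a in nodes1 for b in nodes2]
    A = {}
    for u, v in pairs:
        # Number of neighbor pairs of (u, v): |BN||BN'| + |FN||FN'|.
        n_uv = len(bn1[u]) * len(bn2[v]) + len(fn1[u]) * len(fn2[v])
        for i, j in pairs:
            linked = ((i in bn1[u] and j in bn2[v])
                      or (i in fn1[u] and j in fn2[v]))
            A[(i, j), (u, v)] = 1.0 / n_uv if linked else 0.0
    return pairs, A


p1_nodes = ["R1", "R2", "R3", "R4"]
p1_edges = [("R1", "R3"), ("R2", "R3"), ("R3", "R4")]
p2_nodes = ["R1", "R3", "R4", "R5"]
p2_edges = [("R1", "R3"), ("R3", "R4"), ("R3", "R5")]

pairs, A = support_matrix(p1_nodes, p1_edges, p2_nodes, p2_edges)
print(len(pairs), A[("R2", "R1"), ("R3", "R3")])
```

With 4 reactions in each pathway there are 16 pairs, so A is 16 x 16, and the entry linking (R3, R3) with (R2, R1) evaluates to 1 / (2*1 + 1*2).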

Topological Similarities (Subnetworks)

[Figure: subnetworks Si of Pathway 1 and S'j of Pathway 2 with their backward & forward neighbors.]

Support matrix entry for the mapping {Si, S'j}:

$\frac{1}{|FN(S_i)| \, |FN(S'_j)| + |BN(S_i)| \, |BN(S'_j)|}$

Combining Homology & Topology

$H^{k+1} = \alpha A H^{k} + (1-\alpha) H^{0}$

Iteration 0: Only pairwise similarity of R3 and R3

Iteration 1: Support of aligned first degree neighbors added

Iteration 2: Support of aligned second degree neighbors added

Iteration 3: Support of aligned third degree neighbors added

[Figure: two example pathways, focusing on the R3 – R3 matching.]

[Figure: the initial similarity matrix is written as the vector H^0, the power method iterations produce the vector H^k, and H^k is written back as the final similarity matrix.]
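The update H^{k+1} = αAH^k + (1-α)H^0 can be sketched as a plain power iteration. This is a minimal sketch: the stopping tolerance, the default α, and the tiny 2 x 2 example are assumptions for illustration, and the function name `power_iteration` is hypothetical.

```python
def power_iteration(A, h0, alpha=0.6, tol=1e-12, max_iter=1000):
    """Iterate h <- alpha * A h + (1 - alpha) * h0 until h stops changing.

    A:  square matrix as a list of rows (topological support).
    h0: initial (homological) similarity vector.
    """
    n = len(h0)
    h = list(h0)
    for _ in range(max_iter):
        nxt = [alpha * sum(A[i][j] * h[j] for j in range(n))
               + (1 - alpha) * h0[i]
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, h)) < tol:
            return nxt
        h = nxt
    return h


# Hypothetical example: two mutually supporting pairs.
A = [[0.0, 1.0],
     [1.0, 0.0]]
h0 = [1.0, 0.0]
print(power_iteration(A, h0, alpha=0.5))
```

Iteration k adds the support of neighbors that are k steps away, and α sets how much weight topology receives relative to the initial homological scores.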

Extracting Mappings (1-to-1)

[Figure: similarity matrix between the reactions R1 to R3 of one pathway and R1 to R4 of the other, shown as a weighted bipartite graph.]

One-to-one mappings are extracted with Maximum Weight Bipartite Matching.
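For small inputs, extracting one-to-one mappings by maximum weight bipartite matching can be illustrated with an exhaustive search. This is only a sketch: the similarity values below are hypothetical, the function name `best_one_to_one` is made up, and the matrix is assumed to have no more rows than columns.

```python
from itertools import permutations


def best_one_to_one(sim):
    """Return (score, pairs) of the best one-to-one row-to-column mapping.

    sim[i][j] is the similarity of row reaction i and column reaction j.
    Exhaustive search, so only suitable for tiny matrices.
    """
    n_rows, n_cols = len(sim), len(sim[0])
    best_score, best_pairs = -1.0, []
    for cols in permutations(range(n_cols), n_rows):
        score = sum(sim[i][c] for i, c in enumerate(cols))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(cols))
    return best_score, best_pairs


# Hypothetical 2 x 3 similarity matrix.
sim = [[0.9, 0.1, 0.0],
       [0.8, 0.7, 0.2]]
score, pairs = best_one_to_one(sim)
print(score, pairs)
```

For realistic sizes a polynomial-time solver would be the usual choice, for instance `scipy.optimize.linear_sum_assignment` with `maximize=True`.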

Extracting Mappings (Subnetworks)

[Figure: candidate subnetwork mappings. The mapping {1,2} – {1} conflicts with {1} – {2} and with {3} – {1}. The mapping {4} – {3,4,5} conflicts with {4,5} – {5}.]

Maximum bipartite matching will fail!

How to Handle Conflicts?

Label   Subnetwork from pathway 1   Subnetwork from pathway 2   Weight
a       1,2                         1                           0.7
b       1                           2                           0.6
c       3                           1                           0.4
d       4                           3,4,5                       0.9
e       4,5                         5                           0.8

[Figure: conflict graph with one vertex per mapping and an edge between every two conflicting mappings.]

Find the set of non-conflicting mappings that maximizes the sum of similarity scores.
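The conflict graph for the five mappings a to e can be built with a short sketch. The conflict rule used here, that two mappings conflict when they share a reaction in either pathway, is an assumption inferred from the example, and the name `conflict_edges` is hypothetical.

```python
from itertools import combinations


def conflict_edges(mappings):
    """Return the set of conflicting label pairs.

    mappings: label -> (reaction set from pathway 1, reaction set from pathway 2).
    Two mappings are assumed to conflict if they overlap in either pathway.
    """
    edges = set()
    for a, b in combinations(sorted(mappings), 2):
        a1, a2 = mappings[a]
        b1, b2 = mappings[b]
        if a1 & b1 or a2 & b2:
            edges.add((a, b))
    return edges


mappings = {
    "a": ({1, 2}, {1}),
    "b": ({1}, {2}),
    "c": ({3}, {1}),
    "d": ({4}, {3, 4, 5}),
    "e": ({4, 5}, {5}),
}
print(sorted(conflict_edges(mappings)))
```

Under this rule the example yields the edges a-b, a-c and d-e, so a competes with b and c, and d competes with e.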

Maximum Weight Independent Set Problem

Given an undirected vertex-weighted graph, find a vertex-induced subgraph that:

- maximizes the sum of the vertex weights (maximum weight)
- has no edges (independent set)

- NP-hard – Karp, 1972
- Hard to approximate – Håstad, 1996 (there is no PTAS unless P=NP)

[Figure: example graph whose selected vertices have total weight 9 + 8 + 2 + 0 = 19.]
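To make the problem statement concrete, here is an exhaustive-search sketch of maximum weight independent set. It is exponential in the number of vertices and only meant for tiny graphs; the example graph and the function name `max_weight_independent_set` are hypothetical.

```python
from itertools import combinations


def max_weight_independent_set(weights, edges):
    """Try every vertex subset and keep the heaviest one without edges."""
    nodes = sorted(weights)
    best_set, best_weight = (), 0
    for r in range(1, len(nodes) + 1):
        for subset in combinations(nodes, r):
            chosen = set(subset)
            # Skip subsets that contain both endpoints of some edge.
            if any(u in chosen and v in chosen for u, v in edges):
                continue
            total = sum(weights[v] for v in subset)
            if total > best_weight:
                best_set, best_weight = subset, total
    return best_set, best_weight


# Hypothetical path graph x - y - z with integer weights.
weights = {"x": 2, "y": 3, "z": 2}
edges = [("x", "y"), ("y", "z")]
print(max_weight_independent_set(weights, edges))
```

In this example the heaviest single vertex is y, but the two endpoints x and z together weigh more, which shows why picking vertices purely by weight is not enough.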

Finding the Best Alignment is NP-Hard

MWIS problem in bounded-degree graphs with max degree k+1

≤

Metabolic pathway alignment with subnetworks of size at most k

(≤ denotes a reduction: the MWIS problem reduces to the alignment problem.)

How do we find the mappings?

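Choose the vertex v that maximizes

$f(v) = \frac{w(v)}{\sum_{u \in N(v)} w(u)}$

where w(v) is the weight of mapping v and N(v) is the set of its neighbors in the conflict graph.

Label   Pathway 1   Pathway 2   Weight   f(v)
a       1,2         1           0.7      0.7
b       1           2           0.6      0.6/0.7
c       3           1           0.4      0.4/0.7
d       4           3,4,5       0.9      0.9/0.8
e       4,5         5           0.8      0.8/0.9

- d has the largest f value, so it is added to the alignment; d and its conflicting neighbor e leave the conflict graph.
- For the remaining mappings: f(a) = 0.7, f(b) = 0.6/0.7, f(c) = 0.4/0.7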
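The selection rule f(v) = w(v) / ∑ w(u) over the neighbors u of v can be sketched as a greedy loop. Assumptions of this sketch: the conflict edges a-b, a-c and d-e are inferred from the f values on the slide, a chosen vertex and its neighbors are removed before the next round, f is recomputed on the remaining graph, and a vertex with no remaining neighbors gets f = infinity. The name `greedy_mwis` is hypothetical.

```python
def greedy_mwis(weights, conflicts):
    """Greedy heuristic: repeatedly pick the vertex with the largest f(v)."""
    adj = {v: set() for v in weights}
    for u, v in conflicts:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(weights)
    chosen = []

    def f(v):
        # Weight of v divided by the total weight of its remaining neighbors.
        nb = sum(weights[u] for u in adj[v] if u in remaining)
        return float("inf") if nb == 0 else weights[v] / nb

    while remaining:
        best = max(sorted(remaining), key=f)
        chosen.append(best)
        # Drop the chosen mapping and everything that conflicts with it.
        remaining -= adj[best] | {best}
    return chosen


weights = {"a": 0.7, "b": 0.6, "c": 0.4, "d": 0.9, "e": 0.8}
conflicts = [("a", "b"), ("a", "c"), ("d", "e")]
print(greedy_mwis(weights, conflicts))
```

On this example the loop picks d first (f = 0.9/0.8), then b (f = 0.6/0.7), then c, which no longer has any conflicting neighbor.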

EXPERIMENTAL RESULTS

Alternative paths - 1

Comparison

MetNetAligner: Cheng & Zelikovsky, Bioinformatics 2009.

SubMAP: Ay & Kahveci, RECOMB 2010.

Alternative paths - 2

Alternative paths - 3

Alternative paths - 4

Mappings among major clades

Number of subnetworks

Performance of our algorithm

k : maximum size of subnetwork

Conclusion

- Considering subnetworks improves the accuracy of metabolic pathway alignment and reveals alternative paths that are biologically relevant.

- Alignments within and across the clades have different characteristics in terms of their mapping cardinalities.

- SubMAP can be used effectively in applications that need to identify different entity sets with the same or similar functions, e.g. filling pathway holes and metabolic/phylogenetic reconstruction.

Thank you