An Iterative Relaxation Technique for the NMR Backbone Assignment Problem

Wen-Lian Hsu

Institute of Information Science

Academia Sinica


Characteristics of Our Method

  • Model this as a constraint satisfaction problem

  • Solve it using natural language parsing techniques

    • Both top-down and bottom-up

  • An iterative approach

    • Create spin systems based on noisy data.

    • Link spin systems by using maximum independent set finding techniques.


Outline

  • Introduction

  • Method

  • Experimental Results

  • Conclusion


Blind Man’s Elephant

  • We cannot directly “see” the positions of these atoms (the structure)

  • But we can measure a set of parameters (with constraints) on these atoms

    • Which can help us infer their coordinates

  • Each experiment can determine only a subset of the parameters (with noise)

  • To combine the parameters from different experiments, we need to stitch them together


The Flow of NMR Experiments

  • Get protein samples

  • Collect NMR spectra

  • Resonance assignment

  • Structure constraints

  • Calculation and simulation

    • Energy minimization

    • Fitness of structure constraints


Chemical Shift Assignment

Find the chemical shift of each atom

  • Backbone atoms: Ca, Cb, C’, N, NH

  • Backbone experiments: HSQC, CBCANH, CBCA(CO)NH, HN(CA)CO, HNCO, HN(CO)CA, HNCA

  • Side-chain atoms: all others (especially CHs)

  • Side-chain experiments: TOCSY-HSQC, HCCCONH, CCCONH, HCCH-TOCSY

[Figure: one amino acid, with its N, H, Ca, CO, Cb, Cg and Cd atoms labeled]


Some Relevant Parameters

[Figure: a peptide backbone (-N-C-C-N-C-C-N-C-C-N-C-C-) with its side chains, annotated with chemical shift ranges in ppm: 18-23, 55-60, 17-23, 30-35, 16-20, 31-34, 19-24]


Three Important Experiments

  • Backbone: Ca, Cb, C’, N, NH

  • HSQC, CBCANH, CBCA(CO)NH, HN(CA)CO, HNCO, HN(CO)CA, HNCA

  • Sequential assignment

  • Chemical shifts of Ca, Cb, NH


Our NMR Spectra

[Figure: CBCA(CO)NH and CBCANH spectra]

  • HSQC

  • CBCA(CO)NH (2 peaks)

  • HNCACB (4 peaks)


HSQC Spectra

  • HSQC peaks (1 chemical shift for one amino acid)


CBCA(CO)NH Spectra

  • CBCA(CO)NH peaks (2 chemical shifts for one amino acid)


CBCANH Spectra

  • CBCANH peaks (4 chemical shifts for one amino acid)

    • Ca (+), Cb (-)


A Dataset Example

[Figure: example peak lists, indexed by their H and N chemical shifts]

  • HSQC

  • HNCACB (4 peaks)

  • CBCA(CO)NH (2 peaks)


Backbone Assignment

  • Goal

    • Assign chemical shifts to N, NH, Ca (and Cb) along the protein backbone.

  • General approaches

    • Generate spin systems

      • A spin system: an amino acid with known chemical shifts on its N, NH, Ca (and Cb).

    • Link spin systems
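As a rough illustration of spin system generation, the sketch below attaches each 3D peak to every HSQC peak whose N and NH shifts it matches. The function name `group_peaks`, the tolerances, and the reading of each peak as (N, NH, carbon) shifts are assumptions made for this example; the slides do not specify them.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float, float]  # assumed layout: (N, NH, carbon) shifts in ppm


def group_peaks(hsqc: List[Tuple[float, float]],
                peaks: List[Peak],
                tol_n: float = 0.4,
                tol_h: float = 0.04) -> Dict[int, List[float]]:
    """Attach each 3D peak to every HSQC root whose (N, NH) shifts match it.

    A peak may land in more than one group, which mirrors the idea that
    ambiguous peaks are kept and resolved later during linking.
    """
    groups: Dict[int, List[float]] = {i: [] for i in range(len(hsqc))}
    for n, h, c in peaks:
        for i, (root_n, root_h) in enumerate(hsqc):
            if abs(n - root_n) <= tol_n and abs(h - root_h) <= tol_h:
                groups[i].append(c)
    return groups


# Illustrative roots and peaks (tolerances are hypothetical).
hsqc_roots = [(115.481, 9.604), (119.857, 8.435)]
cbcanh = [(115.50, 9.60, 60.044), (119.86, 8.44, 28.166), (119.85, 8.43, 59.419)]
groups = group_peaks(hsqc_roots, cbcanh)
```

Here the first root collects one carbon shift and the second collects two; a real data set would also contain noise peaks that fall inside the same tolerance windows.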


Ambiguities

  • All 4-point experiments are mixed together

  • All 2-point experiments are mixed together

  • Each spin system can be mapped to several amino acids in the protein sequence

  • False positives, false negatives


Previous Approaches

  • Constrained bipartite matching problem

    • The spin system might be ambiguous

    • Can’t deal with ambiguous links

[Figure: a legal matching and an illegal matching under constraints]


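Natural Language Processing: Signal or Noise?

  • Speech recognition: homophone selection

  • Target sentence: 台北市一位小孩走失了 (A child in Taipei City has gone missing)

  • Candidate words competing for the same sounds:

    • 台北市 (Taipei City), 台北 (Taipei)

    • 小孩 (child)

    • 適宜 (suitable), 事宜 (matters, arrangements)

    • 走失 (to go missing)

    • 一位 (one person), 一味 (blindly), 移位 (to shift position)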


An Error-Tolerant Algorithm


Phrase, Sentence Combination


Hierarchical Analysis

  • Sentence-meaning templates (句意模版)

  • Sentence-pattern templates (句型模版)

  • Phrase templates (片語模版)

  • Word templates (字詞模版)


Perfect Group

  • Each spin group contains 6 points, in which

    • 4 points are from the first experiment

    • 2 points are from the second experiment

[Figure: two consecutive residues with their N, H, Ca, Cb and CO atoms labeled]



A Perfect Spin System Group

[Figure: the two CBCA(CO)NH peaks (both from residue i-1) shown against the four CBCANH peaks (Ca, Ca, Cb, Cb)]


False Positives and False Negatives

  • False positives

    • Noise with high intensity

    • Produce fake spin systems

  • False negatives

    • Peaks with low intensity

    • Missing peaks

  • In real wet-lab data, nearly 50% of the peaks are noise (false positives).


Spin System Group

[Figure: peaks plotted against the H and N axes, showing a perfect group, a false negative and a false positive]


Outline

  • Introduction

  • Method

  • Experimental Results

  • Conclusion


Main Idea

  • Deal with false negatives in the spin system generation procedure.

  • Eliminate false positives in the spin system linking procedure.

  • Perform the spin system generation and linking procedures in an iterative fashion.


Spin System Group Generation

  • Three types of spin system groups are generated, based on the quality of the CBCANH data:

    • Perfect

    • Weak false negative

    • Severe false negative


Perfect Spin Systems

  • A spin system is determined without any added pseudo peak.

[Figure: the two CBCA(CO)NH peaks (both from residue i-1) shown against the four CBCANH peaks (Ca, Ca, Cb, Cb)]


Weak False Negative Spin System Group

  • A spin system is determined with one added pseudo peak.

[Figure: CBCA(CO)NH peaks (residue i-1) and CBCANH peaks labeled Ca and Cb, with one pseudo peak added]

Example peak:

115.481 9.604 60.044 1.30407e+008


Severe False Negative Spin System Group

  • A spin system is determined with two added pseudo peaks.

[Figure: CBCA(CO)NH peaks (residue i-1) and CBCANH peaks labeled Ca and Cb, with two pseudo peaks added]

Example peaks:

119.857 8.435 28.166 3.36293e+007

119.857 8.435 59.419 1.56434e+008

Note: it is also possible that Ca(i-1) = 28.166 and Cb(i-1) = 59.419.
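The three group types can be read as a count of how many CBCANH peaks are missing. The sketch below encodes that reading; it assumes a complete group has four CBCANH peaks and that one pseudo peak is added per missing peak. The function name `classify_group` is hypothetical.

```python
from typing import Tuple


def classify_group(n_cbcanh_peaks: int, expected: int = 4) -> Tuple[str, int]:
    """Return (group type, number of pseudo peaks to add).

    Assumption: the group type depends only on how many of the
    expected CBCANH peaks were actually observed.
    """
    missing = expected - n_cbcanh_peaks
    names = {0: "perfect", 1: "weak false negative", 2: "severe false negative"}
    if missing not in names:
        raise ValueError(f"unsupported number of CBCANH peaks: {n_cbcanh_peaks}")
    return names[missing], missing


print(classify_group(4))
print(classify_group(3))
print(classify_group(2))
```

Groups with fewer than two observed CBCANH peaks are rejected here, since the slides describe only the three types above.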


A Note on Spin System Generation

  • To generate *ALL* possible spin systems, a peak can be included in more than one spin system.

    • False positives are eliminated in the spin system linking procedure.

    • False negatives are treated by adding pseudo peaks.

  • A rule-based mechanism is used to filter out incompatible spin systems (false positives).

    • Adopt a maximum weight independent set algorithm


Spin System Linking

  • Goal

    • Link spin systems into chains that are as long as possible.

  • Constraints

    • Each spin system is uniquely assigned to a position of the target protein sequence.

    • Two spin systems are linked only if the differences between their intra- and inter-residue chemical shifts are below predefined thresholds.
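The linking constraint can be sketched as a simple threshold test: the intra-residue shifts of one spin system must agree with the inter-residue (i-1) shifts of the next. The field names, the `can_link` helper, and the threshold values are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinSystem:
    ca_prev: float  # Ca shift of residue i-1 (inter-residue)
    cb_prev: float  # Cb shift of residue i-1 (inter-residue)
    ca: float       # Ca shift of residue i (intra-residue)
    cb: float       # Cb shift of residue i (intra-residue)


def can_link(left: SpinSystem, right: SpinSystem,
             tol_ca: float = 0.4, tol_cb: float = 0.4) -> bool:
    """True if `left` may directly precede `right` in a segment."""
    return (abs(left.ca - right.ca_prev) <= tol_ca
            and abs(left.cb - right.cb_prev) <= tol_cb)


# Illustrative values; 0 stands for an absent Cb (hypothetical convention).
a = SpinSystem(55.266, 38.675, 44.555, 0.0)
b = SpinSystem(44.417, 0.0, 55.043, 30.04)
c = SpinSystem(55.356, 29.782, 60.044, 37.541)
print(can_link(a, b), can_link(b, c), can_link(b, a))
```

The test is directional: `a` can precede `b`, but `b` cannot precede `a`, which is what lets linked spin systems form ordered segments.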


A Peculiar Parking Lot (Valet Parking)

Information you have: the make of your car and, approximately, the car parked in front of you. Together with others, try to identify as many cars in the right order as possible (maximizing the overall satisfaction).


Backbone Assignment

DGRIGEIKGRKTLATPAVRRLAMENNIKLS


Spin System Positioning

  • We assign spin system groups to a protein sequence according to their codes.

Residue codes: D 50, G 10, R 40, I 50|51

Spin systems and their codes:

55.266 38.675 44.555 0 => 50 10

44.417 0 55.043 30.04 => 10 40

44.417 0 30.665 28.72 => 10 40

55.356 29.782 60.044 37.541 => 40 50
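Positioning by codes amounts to finding every place in the sequence where two consecutive residues carry the code pair of a spin system. The sketch below uses only the four residue codes shown on this slide (D 50, G 10, R 40, I 50|51); how a code is derived from chemical shifts is not given in the slides, so the code pairs are taken as input. The name `candidate_positions` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Residue codes as listed on the slide; "I 50|51" means either code is allowed.
RESIDUE_CODES: Dict[str, Set[int]] = {"D": {50}, "G": {10}, "R": {40}, "I": {50, 51}}


def candidate_positions(sequence: str, code_pair: Tuple[int, int]) -> List[int]:
    """Return every index i such that residue i-1 and residue i match the pair."""
    prev_code, cur_code = code_pair
    return [i for i in range(1, len(sequence))
            if prev_code in RESIDUE_CODES[sequence[i - 1]]
            and cur_code in RESIDUE_CODES[sequence[i]]]


seq = "DGRIG"  # first five residues of the example sequence
print(candidate_positions(seq, (50, 10)))
print(candidate_positions(seq, (10, 40)))
print(candidate_positions(seq, (40, 50)))
```

The pair (50, 10) fits both D-G and I-G, which illustrates why a single spin system can map to several positions and why linking is needed to resolve the ambiguity.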


Link Spin System Groups

[Figure: spin systems placed under the residues D, G, R and I and linked into Segment 1, Segment 2 and Segment 3]


Iterative Concatenation

[Figure: spin systems 1, 2, …, 56, … are placed along the sequence DGRI….FKJJREKL and concatenated over Step 1, Step 2, …, Step n-1, Step n into progressively longer segments (e.g. Segment 1, Segment 2, Segment 31, Segment 78, Segment 79, Segment 99)]


Conflict Segments

DGRIGEIKGRKTLATPAVRRLAMENNIKLS

[Figure: segments 71, 78, 79, 97, 98 and 99 placed along the sequence]

  • Two kinds of conflicting segments

    • Overlap (e.g. segment 71 and segment 99)

    • Use of the same spin system (e.g. segments 78 and 79 both contain spin system 1)
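Both kinds of conflict can be checked pairwise. The sketch below represents a segment by its start position on the sequence and the spin system ids it contains; this representation, the `conflicts` helper, and the positions and ids in the example are made up for illustration.

```python
from typing import NamedTuple, Tuple


class Segment(NamedTuple):
    start: int                      # index of the first residue covered
    spin_systems: Tuple[int, ...]   # one spin system id per covered residue


def conflicts(a: Segment, b: Segment) -> bool:
    """True if the segments overlap on the sequence or share a spin system."""
    a_end = a.start + len(a.spin_systems)
    b_end = b.start + len(b.spin_systems)
    overlap = a.start < b_end and b.start < a_end
    shared = bool(set(a.spin_systems) & set(b.spin_systems))
    return overlap or shared


# Hypothetical placements: s78 and s79 share spin system 1; s71 and s99 overlap.
s78 = Segment(0, (1, 2, 3))
s79 = Segment(10, (1, 4, 5))
s71 = Segment(20, (6, 7, 8))
s99 = Segment(21, (9, 10, 11))
print(conflicts(s78, s79), conflicts(s71, s99), conflicts(s78, s71))
```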


A Graph Model for Spin System Linking

  • G(V,E)

    • V: a set of nodes (segments).

    • E: the pairs (u, v), with u, v ∈ V, such that segments u and v conflict.

  • Goal

    • Assign as many non-conflicting segments as possible => find the maximum independent set of G.


An Example of G

  • Seq.: GEIKGRKTLATPAVRRLAMENNIKLSE

  • Segment1: SP12->SP13->SP14

  • Segment2: SP9->SP13->SP20->SP4

  • Segment3: SP8->SP15->SP21

  • Segment4: SP7->SP1->SP15->SP3

[Figure: the conflict graph on Seg1, Seg2, Seg3 and Seg4, with edges labeled by the cause of the conflict: shared spin system SP13, shared spin system SP15, or overlap]
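For a graph this small, the maximum independent set can be found by exhaustive search. The sketch below builds conflict edges for the four segments listed above using only shared spin systems; the positional overlaps shown in the figure are not recoverable from the transcript and are left out, so the resulting graph is an assumption. Both function names are hypothetical, and exhaustive search is used here only because the example is tiny.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

segments: Dict[str, List[str]] = {
    "Seg1": ["SP12", "SP13", "SP14"],
    "Seg2": ["SP9", "SP13", "SP20", "SP4"],
    "Seg3": ["SP8", "SP15", "SP21"],
    "Seg4": ["SP7", "SP1", "SP15", "SP3"],
}


def build_conflict_graph(segs: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Add an edge between two segments that share a spin system."""
    return {(u, v) for u, v in combinations(sorted(segs), 2)
            if set(segs[u]) & set(segs[v])}


def maximum_independent_set(nodes, edges) -> List[str]:
    """Exhaustive search: try the largest subsets first (exponential time)."""
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for subset in combinations(ordered, size):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                return list(subset)
    return []


edges = build_conflict_graph(segments)
mis = maximum_independent_set(segments, edges)
print(sorted(edges), mis)
```

With these two edges no three segments are mutually compatible, so the largest conflict-free selection has two segments.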


Segment Weight

  • The longer a segment is, the higher its weight.

  • The less frequently a segment occurs, the higher its weight.


Find Maximum Weight Independent Set of G

  • Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
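To give a feel for the weighted problem, here is a simple greedy heuristic. It is not the Boppana and Halldórsson algorithm cited above; it is a stand-in that repeatedly picks the segment with the best ratio of weight to (remaining degree + 1). Using segment length as the weight is also an assumption, since the slides do not give the weight formula.

```python
from typing import Dict, List, Set, Tuple


def greedy_mwis(weights: Dict[str, float],
                edges: Set[Tuple[str, str]]) -> List[str]:
    """Greedy heuristic for maximum weight independent set (no optimality guarantee)."""
    neighbors: Dict[str, Set[str]] = {v: set() for v in weights}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    remaining = set(weights)
    chosen: List[str] = []
    while remaining:
        # Prefer heavy nodes that block few of the nodes still available.
        best = max(sorted(remaining),
                   key=lambda v: weights[v] / (len(neighbors[v] & remaining) + 1))
        chosen.append(best)
        remaining -= neighbors[best] | {best}
    return chosen


# Hypothetical weights: the segment lengths from the example of G.
weights = {"Seg1": 3, "Seg2": 4, "Seg3": 3, "Seg4": 4}
conflict_edges = {("Seg1", "Seg2"), ("Seg3", "Seg4")}
selected = greedy_mwis(weights, conflict_edges)
print(selected, sum(weights[s] for s in selected))
```

On this example the heuristic keeps the longer segment of each conflicting pair.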


An Iterative Approach

  • We perform spin system generation and linking iteratively.

  • Three stages.


First Stage

  • Generate perfect spin systems;

    • Perform spin system concatenation on the newly generated perfect spin systems to generate segments;

    • Retain segments that contain at least 3 spin systems;

    • Perform MaxIndSet on the segments;

    • Drop spin systems (and related peaks) that are used in the resulting segments.


Second Stage

  • Generate weak false negative spin systems.

    • Perform segment extension on the resulting segments of the first stage (using unused perfect and newly generated weak false negative spin systems);

    • Perform spin system concatenation on the unused spin systems (perfect + weak false negative) to generate longer segments;

    • Retain segments that contain at least 3 spin systems;

    • Perform MaxIndSet on the segments;

    • Drop spin systems (and related peaks) that are used in the resulting segments.


Third Stage

  • Generate severe false negative spin systems.

    • Perform segment extension on the resulting segments of the second stage (using unused perfect and weak false negative spin systems, as well as newly generated severe false negative spin systems);

    • Perform spin system concatenation on the unused spin systems (perfect + weak false negative + severe false negative) to generate longer segments;

    • Retain segments that contain at least 3 spin systems;

    • Perform MaxIndSet on the segments.
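The control flow shared by the three stages can be sketched as a loop over spin system generators. This is only a skeleton under stated assumptions: the generation, concatenation and MaxIndSet steps are passed in as callables (all hypothetical), segment extension is omitted, and used spin systems are dropped after every stage, including the last.

```python
from typing import Callable, Iterable, List, Sequence, Set, Tuple

Segment = Tuple[int, ...]  # a segment is an ordered tuple of spin system ids


def run_stages(generators: Sequence[Callable[[], Iterable[int]]],
               concatenate: Callable[[List[int]], List[Segment]],
               max_ind_set: Callable[[List[Segment]], List[Segment]],
               min_len: int = 3) -> Tuple[List[Segment], Set[int]]:
    """Run one generate/concatenate/select round per generator."""
    unused: Set[int] = set()
    accepted: List[Segment] = []
    for generate in generators:
        unused |= set(generate())                      # new spin systems for this stage
        candidates = [s for s in concatenate(sorted(unused)) if len(s) >= min_len]
        chosen = max_ind_set(candidates)               # keep non-conflicting segments
        accepted.extend(chosen)
        for segment in chosen:                         # drop spin systems already used
            unused -= set(segment)
    return accepted, unused


# Toy stand-ins: chunk ids into runs of three, and accept every candidate.
def chunk(ids: List[int]) -> List[Segment]:
    return [tuple(ids[i:i + 3]) for i in range(0, len(ids), 3)]


accepted, leftover = run_stages(
    [lambda: [1, 2, 3, 4], lambda: [5, 6], lambda: [7]],
    concatenate=chunk,
    max_ind_set=lambda segs: segs,
)
print(accepted, leftover)
```

In the toy run, spin system 4 is left over after the first stage and only becomes part of a segment once the second stage supplies more spin systems, which is the point of iterating.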


Segment Extension

[Figure: segments 12, 29 and 109 placed along the sequence ….FKJJREKL….; new spin systems 1, 2, …, 45 are used to extend segment 109 into a new segment 109]


Segment Extension

[Figure: segments 71, 77, 78, 97 and 99 placed along the sequence DGRGEKGRKTLATPAVRRLAMENNIKLS are extended with unused spin systems (23, 24, 26, 27, 28, 29, 31, 32, 33, 45) into segments 97‘ and 99‘, after which MaxIndSet is applied]


Outline

  • Introduction

  • Method

  • Experimental Results

  • Conclusion


Experimental Results

  • Two datasets obtained from our collaborator, Dr. Tai-Huang Huang at IBMS, Academia Sinica:

    • Average precision: 87.5%

    • Average recall: 73.1%

  • Perfect data from BMRB: 99.1%


Real Wet-Lab Datasets

  • The two datasets were obtained from our collaborator, Dr. Tai-Huang Huang at IBMS, Academia Sinica, Taiwan.



Outline

  • Introduction

  • Method

  • Experimental Results

  • Conclusion


Conclusion

  • We model the backbone assignment problem as a constraint satisfaction problem

  • This problem is solved using a natural language parsing technique (both bottom-up and top-down approaches)

  • The same approach seems to work for a large class of noise reduction problems that are discrete in nature


A Genetic Algorithm for NMR Backbone Resonance Assignment (I)

  • Randomly generate a population of chromosomes

    • Each chromosome represents a possible backbone resonance assignment

  • Fitness function

    • Evaluate the fitness of each chromosome according to the connectivity between adjacent amino acids


A Genetic Algorithm for NMR Backbone Resonance Assignment (II)

  • Crossover operation

    • An offspring inherits different connected blocks from its parents

  • Mutation operation

    • Make a new connected block from any position to increase the population diversity


Generation of a Random Chromosome

  • Step 1. Randomly select a position x

  • Step 2. Randomly select an SSGroup i from CL(x)

  • Step 3. Extend connected fragments from i to both sides, using adjacency lists, until no more extension can be found.

  • Step 4. Repeat Steps 1 to 3 until all positions are assigned.
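The four steps above can be sketched as follows. Several details are assumptions: CL(x) is read as the list of spin system groups allowed at position x, the adjacency lists are split into `successors` and `predecessors`, each group is used at most once, and a position with no usable group is left empty (`None`) so that the loop always terminates.

```python
import random
from typing import Dict, List, Optional


def random_chromosome(candidates: List[List[int]],
                      successors: Dict[int, List[int]],
                      predecessors: Dict[int, List[int]],
                      rng: random.Random) -> List[Optional[int]]:
    n = len(candidates)
    chrom: List[Optional[int]] = [None] * n
    used = set()
    open_positions = set(range(n))

    def try_place(x: int, options: List[int]) -> bool:
        """Place a random unused group from `options` that is allowed at x."""
        if x not in open_positions:
            return False
        free = [g for g in options if g in candidates[x] and g not in used]
        if not free:
            return False
        chrom[x] = rng.choice(free)
        used.add(chrom[x])
        open_positions.discard(x)
        return True

    while open_positions:                                   # Step 4
        x = rng.choice(sorted(open_positions))              # Step 1
        if not try_place(x, candidates[x]):                 # Step 2
            open_positions.discard(x)                       # nothing fits: leave a gap
            continue
        right = x + 1                                       # Step 3, to the right
        while right < n and try_place(right, successors.get(chrom[right - 1], [])):
            right += 1
        left = x - 1                                        # Step 3, to the left
        while left >= 0 and try_place(left, predecessors.get(chrom[left + 1], [])):
            left -= 1
    return chrom


chromosome = random_chromosome(
    candidates=[[1], [2], [3]],
    successors={1: [2], 2: [3]},
    predecessors={2: [1], 3: [2]},
    rng=random.Random(0),
)
print(chromosome)
```

In this toy input every position has exactly one candidate and the adjacency lists chain them together, so the result is the same whichever position is picked first.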


Fitness Evaluation

Building blocks: connected fragments

Fitness(ch) = the number of connected pairs, taking their chemical shift differences into account.

Two principles:

1. The more connected pairs a chromosome has, the higher its score.

2. The smaller its chemical shift differences, the higher its score.
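One scoring rule that satisfies both principles is sketched below. The exact formula is not given in the slides, so the per-pair score `2 - d / threshold`, the threshold value, and the `link_diff` callable (which returns the chemical shift difference of an adjacent pair, or `None` if the pair is not connected) are all assumptions.

```python
from typing import Callable, List, Optional


def fitness(chromosome: List[Optional[int]],
            link_diff: Callable[[int, int], Optional[float]],
            threshold: float = 0.5) -> float:
    """Sum a score over adjacent connected pairs.

    Each connected pair earns between 1 and 2 points: more pairs raise the
    total, and smaller shift differences raise each pair's contribution.
    """
    score = 0.0
    for a, b in zip(chromosome, chromosome[1:]):
        if a is None or b is None:
            continue
        d = link_diff(a, b)
        if d is not None and d <= threshold:
            score += 2.0 - d / threshold
    return score


# Hypothetical shift differences for two connected pairs.
diffs = {(1, 2): 0.0, (2, 3): 0.25}
lookup = lambda a, b: diffs.get((a, b))
print(fitness([1, 2, 3], lookup), fitness([1, 2, None], lookup), fitness([3, 2, 1], lookup))
```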


Crossover Operation

[Figure: two parents are cut at a cutting site and recombined into offspring]


Mutation Operation

  • Once a position is selected for mutation, the following positions also mutate so as to produce a connected fragment.

[Figure: a chromosome with the mutation point marked]


Experimental Results

  • Accuracy on two real datasets

    • SBD: 95.1% (FP: 67%)

    • LBD: 100% (FP: 48%)

  • The average accuracy on perfect BMRB datasets (902 proteins)

