
# An Iterative Relaxation Technique for the NMR Backbone Assignment Problem

An Iterative Relaxation Technique for the NMR Backbone Assignment Problem. Wen-Lian Hsu Institute of Information Science Academia Sinica. Characteristics of Our Method. Model this as a constraint satisfaction problem Solve it using natural language parsing techniques


### An Iterative Relaxation Technique for the NMR Backbone Assignment Problem

Wen-Lian Hsu

Institute of Information Science

Characteristics of Our Method

• Model this as a constraint satisfaction problem

• Solve it using natural language parsing techniques

• Both top-down and bottom-up

• An iterative approach

• Create spin systems based on noisy data.

• Link spin systems by using maximum independent set finding techniques.

Outline

• Introduction

• Method

• Experiment Results

• Conclusion

Blind Man’s Elephant

• We cannot directly “see” the positions of these atoms (the structure)

• But we can measure a set of parameters (with constraints) on these atoms

• Which can help us infer their coordinates

Each experiment can only determine a subset of parameters (with noise).

To combine the parameters of different experiments, we need to stitch them together.

The Flow of NMR Experiments

[Figure: flowchart: Get protein samples -> Collect NMR spectra -> Resonance assignment -> Structure constraints -> Calculation and simulation (energy minimization, fitness of structure constraints)]

Chemical Shift Assignment

Find the chemical shift of each atom

• Backbone atoms: Ca, Cb, C’, N, NH

• Various experiments: HSQC, CBCANH, CBCA(CO)NH, HN(CA)CO, HNCO, HN(CO)CA, HNCA

• Side chain: all others (especially CHs)

• TOCSY-HSQC, HCCCONH, CCCONH, HCCH-TOCSY

[Figure: one amino acid, with backbone atoms N, H, Ca, H, CO and side-chain groups Cb (H2), Cg (H2), Cd (H3)]

Some Relevant Parameters

[Figure: peptide backbone -N-C-C-N-C-C-N-C-C-N-C-C- with side-chain groups (CH3, H-C-H, O, H), annotated with chemical shift ranges in ppm: 18-23, 55-60, 17-23, 30-35, 16-20, 31-34, 19-24]

Three Important Experiments

HSQC

• Backbone: Ca, Cb, C’, N, NH

• HSQC, CBCANH, CBCA(CO)NH, HN(CA)CO, HNCO, HN(CO)CA, HNCA

• sequential assignment

• chemical shifts of Ca, Cb, NH

Our NMR spectra

CBCA(CO)NH

CBCANH

• HSQC

• CBCA(CO)NH (2 peaks)

• HNCACB (4 peaks)

HSQC Spectra

• HSQC peaks (1 chemical shift for one amino acid)

CBCA(CO)NH Spectra

• CBCA(CO)NH peaks (2 chemical shifts for one amino acid)

CBCANH Spectra

[Figure: CBCANH spectrum with positive (+) and negative (-) peaks]

• CBCANH peaks (4 chemical shifts for one amino acid)

• Ca (+), Cb (-)

A Dataset Example

[Figure: peaks plotted on H and N axes]

• HSQC

• HNCACB (4 peaks)

• CBCA(CO)NH (2 peaks)

Backbone Assignment

• Goal

• Assign chemical shifts to N, NH, Ca (and Cb) along the protein backbone.

• General approaches

• Generate spin systems

• A spin system: an amino acid with known chemical shifts on its N, NH, Ca (and Cb).

Ambiguities

• All 4 point experiments are mixed together

• All 2 point experiments are mixed together

• Each spin system can be mapped to several amino acids in the protein sequence

• False positives, false negatives

Previous Approaches

[Figure: a legal matching, and an illegal matching under constraints]

• Constrained bipartite matching problem

• The spin system might be ambiguous

• Cannot deal with ambiguous links

Natural Language Processing: Signal or Noise?

• Speech recognition: homophone selection

An Error-Tolerant Algorithm

Phrase, Sentence Combination

Hierarchical Analysis

Perfect Group

• Each spin group contains 6 points, in which

• 4 points are from the first experiment

• 2 points are from the second experiment

[Figure: backbone diagrams of adjacent residues, with atoms labeled H, N, Ca, Cb, C and O]

A Perfect Spin System Group

[Figure: CBCA(CO)NH panel with two i-1 peaks; CBCANH panel with two Ca and two Cb peaks]

False Positives and False Negatives

• False positives

• Noise with high intensity

• Produce fake spin systems

• False negatives

• Peaks with low intensity

• Missing peaks

• In real wet-lab data, nearly 50% of the peaks are noise (false positives).

Spin System Group

[Figure: peaks plotted on H and N axes, with a perfect group, a false negative and a false positive marked]

Outline

• Introduction

• Method

• Experiment Results

• Conclusion

Main Idea

• Deal with false negatives in the spin system generation procedure.

• Eliminate false positives in the spin system linking procedure.

• Perform spin system generation and linking procedures in an iterative fashion.

Spin System Group Generation

• Three types of spin system groups are generated based on the quality of the CBCANH data:

• Perfect

• Weak false negative

• Severe false negative

Perfect Spin Systems

• A spin system is determined without any added pseudo peak.

[Figure: CBCA(CO)NH panel with two i-1 peaks; CBCANH panel with two Ca and two Cb peaks]

Weak False Negative Spin System Group

• A spin system is determined with one added pseudo peak.

[Figure: CBCA(CO)NH panel with two i-1 peaks; CBCANH panel with Ca and Cb labels. Peak listed: 115.481 9.604 60.044 1.30407e+008]

Severe False Negative Spin System Group

• A spin system is determined with two added pseudo peaks.

[Figure: CBCA(CO)NH panel with two i-1 peaks; CBCANH panel with Ca and Cb labels. Peaks listed:]

119.857 8.435 28.166 3.36293e+007

119.857 8.435 59.419 1.56434e+008

Note: it is also possible that Ca(i-1) = 28.166 and Cb(i-1) = 59.419

A note on spin system generation

• To generate *ALL* possible spin systems, a peak can be included in more than one spin system.

• False positives are eliminated in the spin system linking procedure.

• False negatives are treated by adding pseudo peaks.

• A rule-based mechanism is used to filter out incompatible spin systems (false positives).

• Adopt a maximum weight independent set algorithm

• Goal

• Link spin systems into segments that are as long as possible.

• Constraints

• Each spin system is uniquely assigned to a position of the target protein sequence.

• Two spin systems are linked only if the chemical shift differences of their intra- and inter- residues are less than the predefined thresholds.
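The linking constraint above can be sketched as a simple predicate. This is only an illustration under stated assumptions, not the authors' implementation: the `SpinSystem` fields, the function name `can_link`, the threshold values and the sample shifts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinSystem:
    """Hypothetical container: carbon shifts (ppm) of residue i and residue i-1."""
    ca: float       # Ca of the residue itself (intra-residue)
    cb: float       # Cb of the residue itself (0.0 if absent, e.g. glycine)
    ca_prev: float  # Ca of the preceding residue (inter-residue)
    cb_prev: float  # Cb of the preceding residue (inter-residue)


def can_link(first: SpinSystem, second: SpinSystem,
             ca_tol: float = 0.2, cb_tol: float = 0.4) -> bool:
    """Return True if `second` may follow `first` along the backbone.

    The intra-residue shifts of `first` must agree with the inter-residue
    shifts of `second` within the (assumed) thresholds.
    """
    return (abs(first.ca - second.ca_prev) < ca_tol
            and abs(first.cb - second.cb_prev) < cb_tol)


# Illustrative values only.
g = SpinSystem(ca=44.555, cb=0.0, ca_prev=55.266, cb_prev=38.675)
r = SpinSystem(ca=55.043, cb=30.04, ca_prev=44.417, cb_prev=0.0)
print(can_link(g, r))  # does r fit directly after g?
print(can_link(r, g))  # and the reverse order?
```

Looser thresholds admit more candidate links (and more false ones), so in practice the tolerances would be tuned to the spectral resolution.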

A Peculiar Parking Lot (valet parking)

Information you have: the make of your car and the car parked in front of you (approximately). Together with others, try to identify as many cars in the right order as possible (maximizing the overall satisfaction).

Backbone Assignment

DGRIGEIKGRKTLATPAVRRLAMENNIKLS

Spin System Positioning

• We assign spin system groups to a protein sequence according to their codes.

Residue codes: D 50, G 10, R 40, I 50|51

Spin systems and their codes:

55.266 38.675 44.555 0 => 50 10

44.417 0 55.043 30.04 => 10 40

44.417 0 30.665 28.72 => 10 40

55.356 29.782 60.044 37.541 => 40 50

[Figure: the four spin systems placed at positions D, G, R and I of the sequence, grouped into Segment 1, Segment 2 and Segment 3]

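Iterative Concatenation

[Figure: spin systems (1, 2, …., 56, ….) are concatenated step by step along the sequence DGRI….FKJJREKL. Step 1 starts from individual spin systems (1, 2, 47, 56); Step 2 shows Segment 1, Segment 2 and Segment 31; Step n-1 shows Segment 78 and Segment 79; Step n shows Segment 99]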

Conflict Segments

DGRIGEIKGRKTLATPAVRRLAMENNIKLS

[Figure: Segments 71, 78, 79, 97, 98 and 99 placed along the sequence]

• Two kinds of conflict segments

• Overlap (e.g. segment 71, segment 99)

• Use the same spin system (e.g. both segment 78 and segment 79 contain spin system 1 )
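The two kinds of conflict can be checked with a short predicate. The sketch below assumes a hypothetical `Segment` record that stores a start position on the sequence and the ids of the spin systems it chains; none of these names come from the slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical segment: a chain of spin systems placed on the sequence."""
    name: str
    start: int                     # 0-based position of the first residue covered
    spin_systems: Tuple[int, ...]  # spin system ids, one per covered residue

    @property
    def end(self) -> int:
        """Position one past the last covered residue."""
        return self.start + len(self.spin_systems)


def in_conflict(a: Segment, b: Segment) -> bool:
    """Two segments conflict if they overlap or reuse a spin system."""
    overlap = a.start < b.end and b.start < a.end
    shared = bool(set(a.spin_systems) & set(b.spin_systems))
    return overlap or shared


seg_a = Segment("A", start=0, spin_systems=(1, 2, 3))
seg_b = Segment("B", start=2, spin_systems=(4, 5))      # overlaps A at position 2
seg_c = Segment("C", start=10, spin_systems=(1, 6, 7))  # reuses spin system 1
seg_d = Segment("D", start=20, spin_systems=(8, 9, 10))
print(in_conflict(seg_a, seg_b), in_conflict(seg_a, seg_c), in_conflict(seg_a, seg_d))
```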

A Graph Model for Spin System Linking

• G(V,E)

• V: a set of nodes (segments).

• E: (u, v) with u, v ∈ V, where u and v are in conflict.

• Goal

• Assign as many non-conflicting segments as possible => find the maximum independent set of G.
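To make the graph formulation concrete, here is a brute-force search for a maximum independent set on a tiny conflict graph. It is exponential in the number of nodes and is meant only as an illustration; the function name and the example edges are assumptions, not part of the original method.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple


def maximum_independent_set(nodes: List[str],
                            edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Exhaustively find a largest set of nodes with no edge between them.

    Exponential in len(nodes); only meant for tiny illustrative graphs.
    """
    conflicts: Set[FrozenSet[str]] = {frozenset(e) for e in edges}
    for size in range(len(nodes), 0, -1):          # try the largest sets first
        for subset in combinations(nodes, size):
            if all(frozenset(pair) not in conflicts
                   for pair in combinations(subset, 2)):
                return set(subset)
    return set()


# Hypothetical conflict graph on four segments.
nodes = ["Seg1", "Seg2", "Seg3", "Seg4"]
edges = [("Seg1", "Seg2"), ("Seg3", "Seg4"), ("Seg2", "Seg3")]
best = maximum_independent_set(nodes, edges)
print(sorted(best))
```

Several independent sets of the same size may exist; this sketch returns the first one found in enumeration order.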

An Example of G

Segment1: SP12->SP13->SP14

Segment2: SP9->SP13->SP20->SP4

Segment3: SP8->SP15->SP21

Segment4: SP7->SP1->SP15->SP3

• Seq. : GEIKGRKTLATPAVRRLAMENNIKLSE

[Figure: graph G with nodes Seg1, Seg2, Seg3 and Seg4; edges are labeled SP13, SP15 and Overlap]

Segment weight

• The longer a segment is, the higher its weight.

• The less frequent a segment is, the higher its weight.
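The slides do not give a formula for the weight, so the following is only one possible choice that respects both rules: the ratio of length to frequency. Both the function name and the interpretation of `frequency` (how many times the segment occurs among the candidates) are assumptions.

```python
def segment_weight(length: int, frequency: int) -> float:
    """Assumed weight: grows with segment length, shrinks with frequency."""
    if length <= 0 or frequency <= 0:
        raise ValueError("length and frequency must be positive")
    return length / frequency


print(segment_weight(5, 1), segment_weight(3, 1), segment_weight(4, 2))
```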

Find Maximum Weight Independent Set of G

• Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
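For intuition about weighted selection, the sketch below uses a plain greedy heuristic: take segments in order of decreasing weight and skip any that conflict with one already taken. This is not the algorithm of the cited paper, only a simpler stand-in, and the weights and conflicts are made up for the example.

```python
from typing import Dict, List, Set


def greedy_weighted_independent_set(weights: Dict[str, float],
                                    conflicts: Dict[str, Set[str]]) -> List[str]:
    """Pick nodes by decreasing weight, skipping any that conflict with a pick."""
    chosen: List[str] = []
    blocked: Set[str] = set()
    # Ties on weight are broken by name so the result is deterministic.
    for node in sorted(weights, key=lambda n: (-weights[n], n)):
        if node in blocked:
            continue
        chosen.append(node)
        blocked.add(node)
        blocked |= conflicts.get(node, set())
    return chosen


weights = {"Seg1": 3.0, "Seg2": 4.0, "Seg3": 3.0, "Seg4": 2.0}
conflicts = {
    "Seg1": {"Seg2"},
    "Seg2": {"Seg1", "Seg3"},
    "Seg3": {"Seg2", "Seg4"},
    "Seg4": {"Seg3"},
}
picked = greedy_weighted_independent_set(weights, conflicts)
print(picked, sum(weights[s] for s in picked))
```

A greedy pass is fast but carries no optimality guarantee in general, which is one reason to use an approximation algorithm with proven bounds instead.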

An Iterative Approach

• We perform spin system generation and linking iteratively.

• Three stages.

First Stage

• Generate perfect spin systems;

• Perform spin system concatenation on spin systems (newly generated perfect) to generate segments;

• Retain segments that contain at least 3 spin systems;

• Perform MaxIndSet on the segments;

• Drop spin systems (and related peaks) that are used in the resulting segments.

Second Stage

• Generate weak false negative spin systems.

• Perform segment extension on the resulting segments of the first iteration (using unused perfect and newly generated weak false negative);

• Perform spin system concatenation on the unused spin systems (perfect + weak false negative) to generate longer segments;

• Retain segments that contain at least 3 spin systems;

• Perform MaxIndSet on the segments;

• Drop spin systems (and related peaks) that are used in the resulting segments.

Third Stage

• Generate severe false negative spin systems.

• Perform segment extension on the resulting segments of the second iteration (using unused perfect and weak false negative, as well as newly generated severe false negative);

• Perform spin system concatenation on the unused spin systems (perfect + weak false negative + severe false negative) to generate longer segments;

• Retain segments that contain at least 3 spin systems;

• Perform MaxIndSet on the segments.
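The control flow shared by the three stages can be summarized in a small driver loop. This is a simplified sketch: the segment extension step is left out, spin systems are plain integer ids, and the `generators`, `concatenate` and `max_ind_set` hooks as well as the toy linker are hypothetical placeholders for the real procedures.

```python
from typing import Callable, List, Sequence, Set, Tuple

Segment = Tuple[int, ...]  # assumed: a segment is a tuple of spin system ids


def run_stages(generators: Sequence[Callable[[], List[int]]],
               concatenate: Callable[[List[int]], List[Segment]],
               max_ind_set: Callable[[List[Segment]], List[Segment]],
               min_length: int = 3) -> List[Segment]:
    """Run generation and linking once per stage, carrying over unused spin systems."""
    unused: List[int] = []
    accepted: List[Segment] = []
    for generate in generators:          # perfect, weak FN, severe FN
        unused.extend(generate())
        candidates = [s for s in concatenate(unused) if len(s) >= min_length]
        chosen = max_ind_set(candidates)
        accepted.extend(chosen)
        used: Set[int] = {sp for seg in chosen for sp in seg}
        unused = [sp for sp in unused if sp not in used]  # drop used spin systems
    return accepted


def consecutive_runs(ids: List[int]) -> List[Segment]:
    """Toy linker: chain spin systems whose ids are consecutive integers."""
    runs: List[Segment] = []
    current: List[int] = []
    for sp in sorted(ids):
        if current and sp != current[-1] + 1:
            runs.append(tuple(current))
            current = []
        current.append(sp)
    if current:
        runs.append(tuple(current))
    return runs


stages = [lambda: [1, 2, 3, 7], lambda: [8, 9], lambda: [20]]
result = run_stages(stages, consecutive_runs, lambda segs: segs)
print(result)
```

In the toy run, spin system 7 is left over after the first stage and only becomes part of a segment once the second stage supplies its neighbours, which mirrors how unused spin systems are reused in later stages.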

Segment Extension

[Figure: segment 109 on the sequence ….FKJJREKL…. is extended with new spin systems (1, 2, …., 45) to give a new segment 109; spin systems 12 and 29 are also shown]

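Segment Extension

[Figure: segments 71, 77, 78, 97 and 99 on the sequence DGRGEKGRKTLATPAVRRLAMENNIKLS; segments 97 and 99 are extended to 97‘ and 99‘ with spin systems such as 23, 24, 26, 27, 28, 29, 31, 32, 33 and 45, and MaxIndSet is then applied]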

Outline

• Introduction

• Method

• Experimental Results

• Conclusion

Experimental Results

• Two datasets obtained from our collaborator Dr. Tai-Huang Huang in IBMS, Academia Sinica:

• Average precision: 87.5%

• Average recall: 73.1%

• Perfect data from BMRB: 99.1%
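The slides do not define precision and recall, so the sketch below assumes the usual definitions: precision is the fraction of reported assignments that are correct, and recall is the fraction of true assignments that were recovered. The dictionaries and their values are invented for the example.

```python
from typing import Dict, Tuple


def precision_recall(predicted: Dict[int, int],
                     truth: Dict[int, int]) -> Tuple[float, float]:
    """Both arguments map a sequence position to a spin system id."""
    correct = sum(1 for pos, sp in predicted.items() if truth.get(pos) == sp)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


truth = {1: 10, 2: 11, 3: 12, 4: 13}
predicted = {1: 10, 2: 11, 3: 99}   # position 3 is wrong, position 4 is missing
p, r = precision_recall(predicted, truth)
print(round(p, 3), round(r, 3))
```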

Real Wet-Lab Datasets

• The two datasets were obtained from our collaborator Dr. Tai-Huang Huang in IBMS at Academia Sinica, Taiwan.

Outline

• Introduction

• Method

• Experiment Results

• Conclusion

Conclusion

• We model the backbone assignment problem as a constraint satisfaction problem

• This problem is solved using a natural language parsing technique (both bottom-up and top-down approach)

• The same approach seems to work for a large class of noise reduction problems that are discrete in nature

• Randomly generate a population of chromosomes

• Each chromosome represents a possible backbone resonance assignment

• Fitness function

• Evaluate the fitness of each chromosome according to the connectivity between adjacent amino acids

• Crossover operation

• An offspring inherits different connected blocks from parents

• Mutation operation

• Make a new connected block from any position to increase the population diversity

• Step1. Randomly select a position x

• Step2. Randomly select a SSGroup i from CL(x)

• Step3. Extend connected fragments from i to both sides by using adjacency lists until no more extension can be found.

• Step4. Repeat Step1~Step3 until all positions are assigned.

Building Blocks: connected fragments

Fitness(ch) = the number of connected pairs, combined with their chemical shift differences.

Two principles:

1. The more connected pairs it has, the higher score it gets.

2. The smaller its chemical shift differences are, the higher score it gets.
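One way to satisfy both principles is to give each connected pair a base score plus a bonus that shrinks with its chemical shift difference. The exact formula is not given in the slides; the scoring rule, the `tol` threshold and the input representation below are all assumptions made for illustration.

```python
from typing import List, Optional


def fitness(adjacent_diffs: List[Optional[float]], tol: float = 0.5) -> float:
    """Score a chromosome from the shift differences of its adjacent pairs.

    `adjacent_diffs[k]` is the chemical shift difference between positions
    k and k+1, or None when either position is unassigned. A pair counts as
    connected when its difference is below `tol` (an assumed threshold).
    """
    score = 0.0
    for diff in adjacent_diffs:
        if diff is not None and diff < tol:
            score += 1.0 + (tol - diff) / tol   # 1 per pair, bonus for closeness
    return score


print(fitness([0.1, 0.1]), fitness([0.1]), fitness([0.4]), fitness([None, 0.9]))
```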

[Figure: crossover at a cutting site, showing parents and offspring]

• Once a position is going to mutate, the following positions will also mutate to produce a connected fragment.

[Figure: mutation point]

• The accuracy on two real datasets

• SBD:95.1% (FP: 67%)

• LBD:100% (FP: 48%)

• The average accuracy on perfect BMRB datasets (902 proteins)