
Introduction to Bioinformatics: Lecture XVI
Global Optimization and Monte Carlo

Jarek Meller

Division of Biomedical Informatics, Children’s Hospital Research Foundation & Department of Biomedical Engineering, UC

JM - http://folding.chmcc.org



Outline of the lecture

  • Global optimization and local minima problem

  • Physical map assembly, ab initio protein folding and likelihood maximization as examples of global optimization problems

  • Biased random search heuristics

  • Monte Carlo approach

  • Biological motivations and genetic algorithms


Optimization, steepest descent and local minima

Optimization is a procedure in which an extremum of a function is sought. When the relevant extremum is the minimum of a function, the optimization procedure is called minimization.

[Figure: a one-dimensional function f(x) with several local minima and a single global minimum]
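To make the local minimum problem concrete, here is a minimal Python sketch (the test function, step size, and starting points are illustrative assumptions, not from the lecture) showing that steepest descent only reaches the minimum of whichever basin it starts in.

```python
# Minimal sketch: steepest (gradient) descent on a function with two minima.
# The function, step size, and starting points are illustrative assumptions.

def f(x):
    # A simple 1-D function with a global minimum near x = -0.18
    # and a shallower local minimum near x = 1.43.
    return x**4 - 3 * x**3 + 2 * x**2 + x

def df(x):
    # Analytic derivative of f.
    return 4 * x**3 - 9 * x**2 + 4 * x + 1

def steepest_descent(x0, step=0.01, n_iter=10_000):
    """Follow the negative gradient; ends up in whichever basin x0 lies in."""
    x = x0
    for _ in range(n_iter):
        x -= step * df(x)
    return x

# Different starting points converge to different minima, which is
# exactly the local-minimum problem illustrated by the figure.
print(steepest_descent(-1.0))  # converges to the global minimum (x ~ -0.18)
print(steepest_descent(2.5))   # gets stuck in the local minimum (x ~ 1.43)
```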


Rugged landscapes and local minima (maxima) problem


Algorithmic complexity of global optimization

  • Polynomial vs. exponential complexity, e.g., n^2 vs. 2^n steps to obtain the optimal solution, where n denotes the overall “size of the input”

  • The term global optimization is used to refer to optimization problems for which no polynomial-time algorithm that guarantees an optimal solution is known

  • In general, global optimization implies that there may be multiple local minima, and thus one is likely to find a local rather than the global optimum

  • Let us revisit some of the global optimization problems that we have stumbled upon so far …


The problem of ordering clone libraries with STS markers in the presence of errors

In the presence of experimental errors, the problem leads to a global optimization problem (see Pevzner, Chapter 3).

[Figure: a DNA molecule with STS markers 1-5 and four overlapping clones, summarized as a clone-by-STS hybridization matrix]



Heuristic solutions may still provide good probe ordering

The number of “gaps” (blocks of zeros in rows) in the hybridization matrix may be used as a cost function, since hybridization errors typically split blocks of ones (false negatives) or split a gap into two gaps (false positives).

The problem of finding a permutation that minimizes the number of gaps can be cast as a Traveling Salesman Problem (TSP), in which the cities are the columns of the hybridization matrix (plus an additional column of zeros) and the distance between two cities is the number of positions in which the two columns differ (the Hamming distance).

Thus, an efficient algorithm is unlikely in the general case (unless P = NP), and heuristic solutions are sought that provide good probe orderings, at least in most cases (e.g., Alizadeh et al., 1995).
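To make the TSP formulation concrete, the sketch below scores candidate orderings by the Hamming distance between adjacent columns and applies a simple greedy nearest-neighbour heuristic; the toy hybridization matrix and the choice of heuristic are illustrative assumptions, not the method of Alizadeh et al.

```python
# Sketch: probe ordering as a TSP over columns of the hybridization matrix.
# The example matrix and the greedy heuristic are illustrative assumptions.

def hamming(col_a, col_b):
    """Number of positions in which two columns differ."""
    return sum(a != b for a, b in zip(col_a, col_b))

def greedy_order(matrix):
    """Nearest-neighbour heuristic for ordering probe columns.

    The "cities" are the columns of the matrix plus an extra all-zero
    column, which serves as the fixed start of the tour."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [tuple(row[j] for row in matrix) for j in range(n_cols)]
    current = tuple([0] * n_rows)          # the additional column of zeros

    order, remaining = [], list(range(n_cols))
    while remaining:
        nxt = min(remaining, key=lambda j: hamming(current, columns[j]))
        order.append(nxt)
        current = columns[nxt]
        remaining.remove(nxt)
    return order

# Toy clone-by-probe hybridization matrix (rows = clones, columns = STS probes).
H = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1],
]
print(greedy_order(H))  # a heuristic probe ordering; not guaranteed optimal
```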


Profile HMMs and likelihood optimization when states (optimal multiple alignments) are not known


Random biased search: ideas and heuristics

GA, MC, SA (MC with smoothing)

Fitness landscapes

Biological and physical systems solve these “unsolvable” problems:

From optimization to biology and back to optimization


Literature watch: 10 years of DNA computing

Adleman LM, Molecular computation of solutions to combinatorial problems, Science 266:1021-1024 (1994)

Braich RS, Chelyapov N, Johnson C, Rothemund PWK, Adleman L, Solution of a 20-variable 3-SAT problem on a DNA computer, Science 296:499-502 (2002)


Monte Carlo random search

A simulation technique for conformational sampling and optimization based on a random search for energetically favourable conformations.

Finding the global (or at least a “good” local) minimum by a biased random walk may take some luck …


Monte Carlo algorithm

  • The core of the MC algorithm is a heuristic prescription for a plausible pattern of changes in the configurations assumed by the system. Such an elementary “move” depends on the type of the problem.

  • In the realm of protein structure it may be, for instance, a rotation around a randomly chosen backbone bond (see the sketch after this list). A long series of random moves is generated, with only some of them considered as “good” moves.

  • The advantage of the MC method is its generality and its relatively weak dependence on the dimensionality of the system. However, finding a “move” that ensures efficient sampling may be a highly non-trivial problem.
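As an illustration of such an elementary move, the sketch below perturbs one randomly chosen backbone dihedral; representing a conformation as a plain list of torsion angles, and the maximum rotation size, are assumptions made only for illustration.

```python
import math
import random

# Sketch of an elementary Monte Carlo "move": rotate one randomly chosen
# backbone dihedral by a random angle. Representing a conformation simply
# as a list of torsion angles is an illustrative assumption.

def random_move(torsions, max_rotation=math.radians(30.0)):
    """Return a new conformation with one dihedral perturbed at random."""
    new_torsions = list(torsions)
    i = random.randrange(len(new_torsions))
    new_torsions[i] += random.uniform(-max_rotation, max_rotation)
    # wrap the angle back into the [-pi, pi) range
    new_torsions[i] = (new_torsions[i] + math.pi) % (2 * math.pi) - math.pi
    return new_torsions
```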


Monte Carlo algorithm

  • In the standard Metropolis MC a move is accepted unconditionally if the new configuration results in a better (lower) potential energy. Otherwise it is accepted with a probability given by the Boltzmann factor:

P(accept) = exp(−ΔE / kT)

where ΔE denotes the change in the potential energy associated with the move, T is the temperature and k is the Boltzmann constant.
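A minimal sketch of this acceptance rule (the function name and the convention of passing kT as a single parameter are assumptions of the example):

```python
import math
import random

# Sketch of the Metropolis acceptance criterion: always accept downhill moves,
# accept uphill moves with probability exp(-dE / kT). The value of kT is an
# assumption of the example.

def metropolis_accept(delta_e, kT=1.0):
    """Return True if a move with energy change delta_e is accepted."""
    if delta_e <= 0.0:
        return True                      # better (lower) energy: always accept
    return random.random() < math.exp(-delta_e / kT)
```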


Climbing mountains more easily: simulated annealing

  • Increasing the effective “temperature” means a higher probability of accepting moves that increase the energy

  • Thus, the likelihood of escaping from a local minimum may be tuned

  • Heating and cooling cycles, in analogy to physical systems

  • In the limit of infinitely slow cooling, simulated annealing is guaranteed to find the global minimum (see the sketch below)
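The cooling idea can be sketched as follows; the objective function, move size, and geometric cooling schedule are illustrative assumptions rather than anything specified in the lecture.

```python
import math
import random

# Sketch of simulated annealing with a simple geometric cooling schedule.
# The "energy" function, move size, and schedule parameters are assumptions.

def energy(x):
    # A 1-D energy with many local minima.
    return x * x + 10.0 * math.sin(3.0 * x)

def simulated_annealing(x0, T0=5.0, cooling=0.999, n_steps=20_000):
    x, T = x0, T0
    best_x, best_e = x, energy(x)
    for _ in range(n_steps):
        x_new = x + random.uniform(-0.5, 0.5)            # elementary move
        dE = energy(x_new) - energy(x)
        # Metropolis rule at the current temperature
        if dE <= 0.0 or random.random() < math.exp(-dE / T):
            x = x_new
        if energy(x) < best_e:
            best_x, best_e = x, energy(x)
        T *= cooling                                     # cool down gradually
    return best_x, best_e

print(simulated_annealing(x0=8.0))
```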


From biology to optimization: genetic algorithms

Genetic algorithm (GA). A class of algorithms inspired by the mechanisms of genetics, which has been applied to global optimization (especially combinatorial optimization problems). It requires the specification of three operations (each typically probabilistic) on objects called “strings” (these could be real-valued vectors).

0. Initialize the population
1. Select parents for reproduction and “evolutionary” operators (e.g., mutation and crossover)
2. Perform the operations to generate an intermediate population and evaluate its fitness (the value of the objective function to be optimized)
3. Select a subpopulation for the next generation (survival of the fittest)
4. Repeat steps 1-3 until some stopping rule is reached (see the sketch below)
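A compact sketch of this loop on bit strings; the fitness function (counting 1-bits), population size, and operator rates are assumptions chosen only for illustration.

```python
import random

# Sketch of the generic GA loop from the slide: initialize, select parents,
# apply crossover and mutation, evaluate fitness, keep the fittest.
# Fitness function, rates, and sizes are illustrative assumptions.

STRING_LEN, POP_SIZE, N_GENERATIONS = 20, 30, 100

def fitness(s):
    return sum(s)                      # objective to maximize: number of ones

def mutate(s, rate=0.05):
    return [1 - b if random.random() < rate else b for b in s]

def crossover(a, b):
    point = random.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def genetic_algorithm():
    # 0. initialize the population with random bit strings
    population = [[random.randint(0, 1) for _ in range(STRING_LEN)]
                  for _ in range(POP_SIZE)]
    for _ in range(N_GENERATIONS):
        # 1. select parents (here: fitness-weighted random choice)
        weights = [fitness(s) + 1 for s in population]
        parents = random.choices(population, weights=weights, k=POP_SIZE)
        # 2. generate an intermediate population with crossover and mutation
        offspring = []
        for a, b in zip(parents[::2], parents[1::2]):
            c1, c2 = crossover(a, b)
            offspring += [mutate(c1), mutate(c2)]
        # 3. survival of the fittest: keep the best POP_SIZE strings
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:POP_SIZE]
    return max(population, key=fitness)

best = genetic_algorithm()
print(best, fitness(best))
```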


Genetic algorithm: operators and adaptation

Reproduction - combining strings in the population to create a new string (offspring);

Example: Taking 1st character from 1st parent + rest of string from 2nd parent:

[001001] + [111111] ===> [011111]

Mutation - spontaneous alteration of characters in a string;

Example: [001001] ===> [101001]

Crossover - combining strings to exchange values, creating new strings in their place.

Example: With crossover location at 2:

[001001] & [111111] ===> [001111], [111001]
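The three operators above can be reproduced directly in code; in the sketch below, the function names and the use of Python strings are assumptions made for illustration.

```python
import random

# Sketch of the three operators on bit strings, mirroring the slide examples.
# Function names and the string representation are illustrative assumptions.

def reproduce(parent1, parent2, k=1):
    """First k characters from parent1, the rest of the string from parent2."""
    return parent1[:k] + parent2[k:]

def mutate(s, position=None):
    """Flip one character, at a random position unless one is given."""
    if position is None:
        position = random.randrange(len(s))
    flipped = '1' if s[position] == '0' else '0'
    return s[:position] + flipped + s[position + 1:]

def crossover(a, b, point):
    """Exchange the tails of the two strings at the given crossover location."""
    return a[:point] + b[point:], b[:point] + a[point:]

# Reproducing the slide examples:
print(reproduce('001001', '111111'))       # -> '011111'
print(mutate('001001', position=0))        # -> '101001'
print(crossover('001001', '111111', 2))    # -> ('001111', '111001')
```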


Genetic algorithms for global optimization

The original GA was proposed by John Holland and used crossover and total population replacement. This means that a population of 2N objects (called chromosomes) forms N pairings of parents that produce 2N offspring. The offspring comprise the new generation and become the total population, replacing their parents. More generally, a population of size N produces an intermediate population of N + M, from which Ñ are kept to form the new population. One way to choose which Ñ survive is to take those with the greatest fitness values: survival of the fittest.
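A small sketch of this selection step (the fitness function, again counting 1-bits, is an assumption): from the union of parents and offspring, the Ñ fittest strings form the next generation.

```python
# Sketch of the selection scheme described above: from an intermediate
# population of N parents plus M offspring, keep the n_keep fittest.
# The fitness function (number of 1-bits) is an illustrative assumption.

def fitness(s):
    return sum(s)

def select_fittest(parents, offspring, n_keep):
    """Survival of the fittest over the combined (N + M) population."""
    combined = parents + offspring
    return sorted(combined, key=fitness, reverse=True)[:n_keep]

# Total replacement (Holland's original scheme) corresponds to keeping only
# the offspring, i.e. select_fittest([], offspring, len(offspring)).
```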


Random biased search: ideas and heuristics

GA, MC, SA (MC with smoothing)

Fitness landscapes
