## PowerPoint Slideshow about 'Connections between Computer Science and Biology' - LionelDale

connections

- bioinformatics: computational approach to problems in molecular biology
- biological processes inspire algorithms and data structures in computer science
- biomolecules “compute”
- biological organisms “compute”

bioinformatics

- sequencing the genome
- predicting the structure of molecules
- predicting genes, molecular function
- constructing evolutionary trees
- modeling cellular networks
- ...

constructing evolutionary trees

“The affinities of all the beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth. The green and budding twigs may represent existing species; and those produced during each former year may represent the long succession of extinct species.” - Darwin, On the Origin of Species

constructing evolutionary trees

- traditional approach: use morphological features of organisms (number of legs, etc.)
- current approach: use base sequences of universal molecules such as RNA

RNA molecules

- strings of ribonucleotides, of which there are four types, denoted by A, C, G, U.

5’ - ACCAUGGAC - 3’

- some “universal” RNA molecules function in life’s most basic processes, and so mutate slowly
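Because an RNA molecule is written as a string over A, C, G, U, it can be handled with ordinary string operations. A minimal sketch follows; the helper names `is_rna` and `count_differences` are made up for illustration and do not come from the slides.

```python
RNA_BASES = set("ACGU")

def is_rna(seq: str) -> bool:
    """Return True if seq is non-empty and uses only the bases A, C, G, U."""
    return len(seq) > 0 and set(seq) <= RNA_BASES

def count_differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

print(is_rna("ACCAUGGAC"))                # the example strand from the slide
print(count_differences("UGCA", "UGCG"))  # two four-base sequences
```

Counting differing positions is the basic measurement that sequence-based tree construction relies on: slowly mutating molecules show few differences between closely related organisms.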

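[Figure: five taxa (Aardvark, Bison, Chimp, Dog, Elephant), each labeled with a four-base RNA sequence. Sequences recovered from the slide: CAGA, CGCG, UGCA, UGCG.]

two possible evolutionary trees

- which is a better fit with the data? why?

[Figure: two candidate trees over the same five taxa, with internal nodes labeled by sequences. Internal labels recovered from the slide: UGCG, CACG, CAGG. The tree layout was not preserved.]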

parsimony score

- to get a parsimony score for a tree, count the number of places where a nucleotide differs from a parent to a child
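The counting rule above can be sketched in a few lines. The tree below is a hypothetical example, not one of the trees from the slides: each node is a `(sequence, children)` pair, and every internal node is assumed to already carry a sequence.

```python
def count_differences(parent_seq, child_seq):
    """Number of positions where the child's nucleotide differs from the parent's."""
    return sum(p != c for p, c in zip(parent_seq, child_seq))

def parsimony_score(tree):
    """Sum the parent-to-child differences over every edge of the tree.

    tree is a pair (sequence, children); children is a list of trees.
    """
    seq, children = tree
    score = 0
    for child in children:
        score += count_differences(seq, child[0]) + parsimony_score(child)
    return score

# Hypothetical, fully labeled tree (an assumption for illustration only).
tree = ("UGCG", [
    ("UGCG", [("UGCA", []), ("UGCG", [])]),
    ("CAGG", [("CAGA", []), ("CGCG", [])]),
])

print(parsimony_score(tree))  # 0 + 3 + 1 + 0 + 1 + 2 = 7
```

A lower score means fewer mutations are needed to explain the sequences, so under this criterion the lower-scoring tree is the better fit.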

parsimony problem

- input: RNA sequences for some taxa, or species
- output: the most parsimonious tree for the input taxa

the more taxa there are, the more candidate trees must be considered

application of parsimony (Luo et al., Nature, Jan 2001)

- did mammals evolve independently on the north and south continents?

how many trees are there?

- unfortunately, the number of possible trees grows exponentially with the number of taxa (organisms)
- example of an exponential function: 2^n (2 multiplied by itself n times)
- if there are n taxa, there are even more than 2^n possible evolutionary trees
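To make the growth concrete, the sketch below compares 2^n with the number of rooted, binary, leaf-labeled trees on n taxa, which is (2n-3)!! = 1 × 3 × 5 × ... × (2n-3). Restricting to rooted binary trees is an assumption; the slides do not say which kind of tree is being counted. Under that assumption the tree count overtakes 2^n from n = 5 onward.

```python
def num_rooted_binary_trees(n: int) -> int:
    """Number of rooted binary trees with n labeled leaves: (2n-3)!!."""
    if n < 1:
        raise ValueError("need at least one taxon")
    count = 1
    for k in range(3, 2 * n - 2, 2):  # odd numbers 3, 5, ..., 2n-3
        count *= k
    return count

for n in range(2, 11):
    print(n, 2 ** n, num_rooted_binary_trees(n))
```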

complexity of the parsimony problem

- all known algorithms for exactly solving the parsimony problem require an exponential number of steps in the worst case - this is a so-called NP-hard problem
- in practice, heuristic algorithms are typically used, which try to search in an intelligent way for a good tree, but offer no guarantee of finding the best tree
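The contrast between exact and heuristic search can be sketched as follows. This is one possible illustration, not the method used in the slides. Trees are nested pairs of taxon names, and each tree is scored with Fitch's method, which gives the fewest changes over all ways of labeling the internal nodes. The heuristic shown is greedy stepwise addition. The pairing of taxa with sequences is an assumption, since it is not recoverable from the slide.

```python
def fitch_site(tree, seqs, i):
    """Return (candidate bases at this node, changes needed below it) for site i."""
    if isinstance(tree, str):  # a leaf is a taxon name
        return {seqs[tree][i]}, 0
    left_set, left_cost = fitch_site(tree[0], seqs, i)
    right_set, right_cost = fitch_site(tree[1], seqs, i)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def fitch_score(tree, seqs):
    """Minimum number of changes for this tree shape, summed over all sites."""
    n_sites = len(seqs[next(iter(seqs))])
    return sum(fitch_site(tree, seqs, i)[1] for i in range(n_sites))

def insertions(tree, leaf):
    """Yield every tree made by attaching leaf above one node of tree."""
    yield (tree, leaf)
    if not isinstance(tree, str):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)

def exhaustive_search(taxa, seqs):
    """Exact: build every rooted binary tree, then keep the best one."""
    trees = [taxa[0]]
    for leaf in taxa[1:]:
        trees = [t for old in trees for t in insertions(old, leaf)]
    return min(trees, key=lambda t: fitch_score(t, seqs)), len(trees)

def greedy_search(taxa, seqs):
    """Heuristic: add one taxon at a time, keeping only the best placement."""
    tree = taxa[0]
    for leaf in taxa[1:]:
        tree = min(insertions(tree, leaf), key=lambda t: fitch_score(t, seqs))
    return tree

# Assumed pairing of taxa and sequences, for illustration only.
seqs = {"Aardvark": "CAGA", "Bison": "CGCG", "Chimp": "UGCA", "Dog": "UGCG"}
taxa = list(seqs)

best, n_trees = exhaustive_search(taxa, seqs)
greedy = greedy_search(taxa, seqs)
print("trees examined by exhaustive search:", n_trees)
print("best score:", fitch_score(best, seqs))
print("greedy score:", fitch_score(greedy, seqs))
```

The exhaustive search examines every tree, so its work grows with the number of trees. The greedy search examines only a handful of placements per taxon, but its result can depend on the order in which taxa are added, and it is not guaranteed to match the best score.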

connections: biologically inspired data structures

- tree structures for organizing data are ubiquitous in computing (e.g. folders in a windows environment)
- programming language environments support operations on trees (add-node, find-parent, etc.) for the programmer to use
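A tree with operations like these can be sketched as a small class. The class name `TreeNode` and the `path` helper are hypothetical; `add_node` and `find_parent` follow the operation names mentioned above. The folder names are made up.

```python
class TreeNode:
    """A node in a tree; each node knows its parent and its children."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []

    def add_node(self, label):
        """Create a child with the given label and return it."""
        child = TreeNode(label, parent=self)
        self.children.append(child)
        return child

    def find_parent(self):
        """Return the parent node, or None at the root."""
        return self.parent

    def path(self):
        """Labels from the root down to this node, joined like a folder path."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.label)
            node = node.parent
        return "/".join(reversed(parts))

# Folders as a tree.
root = TreeNode("home")
docs = root.add_node("documents")
slides = docs.add_node("slides")

print(slides.find_parent().label)
print(slides.path())
```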

summary

- strong connections between biology and cs
- many computational problems, such as constructing parsimonious evolutionary trees, are “intractable”
- algorithms for intractable problems are often heuristic

vocabulary

- bioinformatics
- evolutionary tree construction; parsimony problem
- exponential running time, intractable problem (technically sometimes called NP-hard problem)
- heuristic algorithms
