
Computational Biology Applications

Computational Biology Applications. Bartosz Nowierski, Poznań University of Technology. Laboratory of Bioinformatics (1). Institute of Bioorganic Chemistry (Polish Academy of Sciences). Founded: 1.11.1999. Members: Prof. Dr. Habil. Jacek Błażewicz (director), Ph.D. Piotr Formanowicz


Presentation Transcript


  1. Computational Biology Applications. Bartosz Nowierski, Poznań University of Technology

  2. Laboratory of Bioinformatics (1) • Institute of Bioorganic Chemistry (Polish Academy of Sciences) • Founded: 1.11.1999 • Members: • Prof. Dr. Habil. Jacek Błażewicz (director) • Ph.D. Piotr Formanowicz • Ph.D. Marta Kasprzak • M.Sc. Marcin Jaroszewski • M.Sc. Piotr Łukasiak • M.Sc. Piotr Wierzejewski ASM Team

  3. Laboratory of Bioinformatics (2) • Basic research areas: • algorithms for sequencing by hybridisation • analysis of DNA graphs • analysis of NMR spectra for RNA chains • restriction map construction • constructing phylogenetic trees • constructing a bio-server for selected problems of computational biology • prediction of protein secondary structures • DNA sequence assembly

  4. Laboratory of Bioinformatics (3) • International cooperation: • Universite LeHavre, France • Max Planck Institutes, Germany • TmBioscience, Canada • Rutgers University, USA • TU Clausthal, Germany • RIKEN, Japan

  5. Applications • DNA sequence assembly • Prediction of protein secondary structures • Constructing phylogenetic trees • Motivation: • popular subject • great demand • need for faster computation => distributed processing

  6. DNA Sequence Assembly: Problem specification • Alphabet = {A, C, G, T} • Problem: [Figure: the fragments ACCGT, CGTGC, TTACC and TACCGT are aligned by their overlaps and merged into TTACCGTGC]

  7. DNA Sequence Assembly: Overlap graph • input sequence → vertex • overlap of sequences → arc, labelled with: • shift • weight • [Figure: overlap graph on ACTGCCTA, CTAGGATC and TCAAGA with arcs labelled (5, 1) and (6, 1)]
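The overlap graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: overlaps are exact (no sequencing errors), the minimum overlap length of 2 is an arbitrary choice, and the helper names `overlap_shift` and `overlap_graph` are hypothetical. Only the shift is computed; the weight shown on the slide is not modelled.

```python
from itertools import permutations

def overlap_shift(a, b, min_overlap=2):
    """Smallest shift s >= 1 such that the suffix a[s:] is a prefix of b.

    Returns None when no exact overlap of at least min_overlap characters
    exists. Exact matching is an assumption of this sketch.
    """
    for s in range(1, len(a) - min_overlap + 1):
        if b.startswith(a[s:]):
            return s
    return None

def overlap_graph(seqs, min_overlap=2):
    """Vertices are indices into seqs; arcs map (i, j) to the shift of j against i."""
    arcs = {}
    for i, j in permutations(range(len(seqs)), 2):
        s = overlap_shift(seqs[i], seqs[j], min_overlap)
        if s is not None:
            arcs[(i, j)] = s
    return arcs

reads = ["ACTGCCTA", "CTAGGATC", "TCAAGA"]
arcs = overlap_graph(reads)
print(arcs)
```

For the three sequences on the slide this yields shifts 5 and 6, which coincide with the first components of the arc labels in the figure.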

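  8. DNA Sequence Assembly: Redundant arcs • Arc deletion • [Figure: overlap graph on ATGACTACT, GACTACTGA and ACTGAATCA; the arc with shift 6 is implied by the two consecutive arcs with shifts 2 and 4 (2 + 4 = 6) and is deleted. Arc labels shown before and after deletion: (3, 1), (7, 1), (6, 1), (2, 1), (2, 2), (4, 1), (4, 2)]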
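A minimal sketch of the arc deletion step, assuming an arc (u, w) counts as redundant exactly when some intermediate vertex v satisfies shift(u, v) + shift(v, w) = shift(u, w), as the "2 + 4 = 6" example suggests. The function name `remove_redundant_arcs` is hypothetical, and the way the weights change in the figure is not modelled here.

```python
def remove_redundant_arcs(arcs):
    """arcs maps (u, w) to a shift. Drop every arc explained by a two-arc path."""
    redundant = set()
    for (u, w), shift_uw in arcs.items():
        for (a, v), shift_uv in arcs.items():
            if a != u or v == w:
                continue  # only arcs leaving u towards some other vertex v
            shift_vw = arcs.get((v, w))
            if shift_vw is not None and shift_uv + shift_vw == shift_uw:
                redundant.add((u, w))
    return {arc: s for arc, s in arcs.items() if arc not in redundant}

# Shifts between the three sequences on the slide (exact overlaps).
arcs = {
    ("ATGACTACT", "GACTACTGA"): 2,
    ("GACTACTGA", "ACTGAATCA"): 4,
    ("ATGACTACT", "ACTGAATCA"): 6,
}
reduced = remove_redundant_arcs(arcs)
print(reduced)
```

Only the direct arc with shift 6 is removed; the two arcs that explain it remain.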

  9. DNA Sequence Assembly: Hamiltonian path with max. weight • NP-hard problem => heuristic • Selection of the first element: a vertex that is an unattractive successor for every other vertex • Selection of the next elements: an attractive successor of the current vertex that is not attractive to other vertices
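One possible reading of this heuristic is sketched below. The scoring is an assumption, not the authors' exact rule: "attractiveness" of a successor v for a vertex u is taken to be the arc weight w(u, v), the first vertex is the one whose best incoming weight is lowest, and each next vertex maximises its weight from the current vertex minus the best weight any other unvisited vertex could offer it. The name `greedy_path` and the example weights are hypothetical.

```python
def greedy_path(weights, n):
    """Greedy heuristic for a heavy Hamiltonian path on vertices 0..n-1.

    weights maps (u, v) to an arc weight; missing arcs count as weight 0.
    """
    def w(u, v):
        return weights.get((u, v), 0)

    unvisited = set(range(n))

    # First vertex: the least attractive successor overall.
    def best_incoming(v):
        return max((w(u, v) for u in unvisited if u != v), default=0)

    first = min(unvisited, key=lambda v: (best_incoming(v), v))
    path = [first]
    unvisited.remove(first)

    while unvisited:
        cur = path[-1]

        # Prefer successors that are attractive to cur but not to rivals.
        def score(v):
            rival = max((w(u, v) for u in unvisited if u != v), default=0)
            return (w(cur, v) - rival, -v)

        nxt = max(unvisited, key=score)
        path.append(nxt)
        unvisited.remove(nxt)
    return path

# Hypothetical weights: overlap lengths between TTACC, TACCGT, ACCGT, CGTGC.
weights = {(0, 1): 4, (0, 2): 3, (1, 2): 5, (1, 3): 3, (2, 3): 3}
path = greedy_path(weights, 4)
print(path)
```

With these weights the path visits the fragments in the order that spells TTACCGTGC.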

  10. DNA Sequence Assembly: Parallelization • Overlaps: distribute the set of sequences • Arc reduction: distribute the set of vertices • First vertex: distribute the set of vertices • Next vertices
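The first bullet, distributing the set of sequences for overlap computation, can be illustrated as follows. This is a sketch of the partitioning idea only: it uses Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in for grid workers, assumes exact overlaps with a minimum length of 2, and the helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def overlap_shift(a, b, min_overlap=2):
    """Smallest shift s >= 1 such that a[s:] is a prefix of b, else None."""
    for s in range(1, len(a) - min_overlap + 1):
        if b.startswith(a[s:]):
            return s
    return None

def arcs_from(sources, seqs):
    """Compute all arcs leaving the given source vertices (one worker's share)."""
    arcs = {}
    for i in sources:
        for j in range(len(seqs)):
            if i != j:
                s = overlap_shift(seqs[i], seqs[j])
                if s is not None:
                    arcs[(i, j)] = s
    return arcs

def parallel_overlap_graph(seqs, workers=2):
    """Split the source vertices round-robin and merge the partial results."""
    chunks = [range(k, len(seqs), workers) for k in range(workers)]
    arcs = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(arcs_from, chunks, [seqs] * workers):
            arcs.update(part)
    return arcs

reads = ["ATGACTACT", "GACTACTGA", "ACTGAATCA"]
result = parallel_overlap_graph(reads, workers=2)
print(result)
```

Each worker needs read access to all sequences but writes only arcs leaving its own vertices, so the partial results can be merged without conflicts. Threads are used here only to keep the example self-contained; a real deployment would place the chunks on separate processes or machines.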

  11. Protein Secondary Structures: Problem specification • amino acid → secondary structure • {A,C,D,E,F,G,H,I,K,....} → {H,E,X} • Example: VASYDYLVIGGGSGG...VAIHPTSSEELVTLR → XEEXXEEEEXXXHHH...XXXXXXXHHHHHXXX

  12. Protein Secondary Structures: Rule usage • Logical Analysis of Data approach • Window over the sequence VASYDYLVIGGGSGG, positions: x₋₃ x₋₂ x₋₁ x₀ x₁ x₂ x₃ • Numeric values: 1.1 -0.7 1.2 -0.5 0.1 -1.1 2.1 • Numeric rules: H: x₋₃ < 0.7 ∧ x₋₁ > 0.1 ∧ x₂ < -0.5; E: x₋₁ > 0.3 ∧ x₀ > -1.5 ∧ x₂ < 0.2 ∧ x₃ > 1.2 • Binarized values: 1 0 1 0 0 0 1 • Binary rules: H: x₋₃ = 0 ∧ x₋₁ = 1 ∧ x₂ = 0; E: x₋₁ = 1 ∧ x₀ = 0 ∧ x₂ = 0 ∧ x₃ = 1 • Result in both cases: E
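The rule evaluation on this slide can be replayed directly. The sketch below takes the window values and both rule sets from the slide; the dictionary layout keyed by window offset and the function names are illustrative assumptions.

```python
OFFSETS = [-3, -2, -1, 0, 1, 2, 3]

# Window values from the slide, keyed by offset from the central residue.
numeric = dict(zip(OFFSETS, [1.1, -0.7, 1.2, -0.5, 0.1, -1.1, 2.1]))
binary = dict(zip(OFFSETS, [1, 0, 1, 0, 0, 0, 1]))

# Numeric rules: each is a conjunction of threshold tests.
NUMERIC_RULES = {
    "H": lambda x: x[-3] < 0.7 and x[-1] > 0.1 and x[2] < -0.5,
    "E": lambda x: x[-1] > 0.3 and x[0] > -1.5 and x[2] < 0.2 and x[3] > 1.2,
}

# Binary rules: each maps a window offset to the required 0/1 value.
BINARY_RULES = {
    "H": {-3: 0, -1: 1, 2: 0},
    "E": {-1: 1, 0: 0, 2: 0, 3: 1},
}

def fired_numeric(x):
    """Labels of all numeric rules satisfied by the window x."""
    return [label for label, rule in NUMERIC_RULES.items() if rule(x)]

def fired_binary(x):
    """Labels of all binary rules satisfied by the binarized window x."""
    return [label for label, literals in BINARY_RULES.items()
            if all(x[k] == v for k, v in literals.items())]

print(fired_numeric(numeric), fired_binary(binary))
```

In both representations the H rule fails on its first condition (x₋₃ is 1.1, binarized to 1), while every condition of the E rule holds, so only E fires.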

  13. Protein Secondary Structures: Rule generation • [Figure: rule generation scenario linking amino acid sequences, rules and secondary structures] • Good rule properties: • if a rule says e.g. H, it must be right • if a rule says e.g. not H, it should be right • The best rules: if one variable is left out, the rule is no longer good

  14. Protein Secondary Structures: Rule generation algorithm • Algorithm: generate all reasonable 0-1 arrays • Classifier generation: • set of rules (logical OR) • mathematical formula • Parallelization: division of the array space
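A brute-force sketch of rule generation, assuming a "0-1 array" is a pattern that fixes some window positions to 0 or 1 and leaves the rest free (`None`). A pattern is kept when it matches at least one positive example and no negative example ("if a rule says H, it must be right") and when dropping any single fixed position would make it match a negative example ("if one variable is left out, the rule is no longer good"). The training data and all function names are hypothetical.

```python
from itertools import product

def covers(pattern, x):
    """True when every fixed position of the pattern agrees with x."""
    return all(p is None or p == v for p, v in zip(pattern, x))

def is_pure(pattern, positives, negatives):
    """Matches some positive example and no negative example."""
    return (any(covers(pattern, x) for x in positives)
            and not any(covers(pattern, x) for x in negatives))

def prime_patterns(positives, negatives):
    """Enumerate all patterns and keep the pure ones that cannot be shortened."""
    n = len(positives[0])
    result = []
    for pattern in product((0, 1, None), repeat=n):
        if not is_pure(pattern, positives, negatives):
            continue
        # Every pattern obtained by freeing one fixed position.
        weakened = [pattern[:i] + (None,) + pattern[i + 1:]
                    for i, p in enumerate(pattern) if p is not None]
        if all(any(covers(wp, x) for x in negatives) for wp in weakened):
            result.append(pattern)
    return result

def classify(rules, x):
    """Classifier as a logical OR over the generated rules."""
    return any(covers(rule, x) for rule in rules)

# Hypothetical binarized windows: positives are labelled H, negatives are not.
positives = [(1, 0, 1), (1, 1, 1)]
negatives = [(0, 0, 1), (1, 0, 0)]
rules = prime_patterns(positives, negatives)
print(rules)
```

The enumeration visits 3^n patterns for n window positions, which is why dividing the array space among workers is a natural way to parallelize it: each worker can take the patterns that share a fixed prefix.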

  15. Phylogenetic Trees: Problem specification • Fitch(tree with leaf order human, monkey, iguana, snake) = something small • Fitch(tree with leaf order human, iguana, monkey, snake) = something big
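Reading "Fitch" as Fitch's small-parsimony count (the minimum number of character changes a given tree needs), the comparison can be sketched as below. The sequences are invented for illustration, and the two tree shapes, written as nested pairs, are an assumption about what the slide's figures show.

```python
def fitch_site(tree, states):
    """Return (candidate state set, change count) for one character.

    tree is either a leaf name or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # Disjoint sets: one change is needed somewhere below this node.
    return left_set | right_set, left_cost + right_cost + 1

def fitch_score(tree, seqs):
    """Sum the per-site change counts over all aligned positions."""
    length = len(next(iter(seqs.values())))
    return sum(fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
               for i in range(length))

# Hypothetical aligned sequences, not taken from the presentation.
seqs = {"human": "ACG", "monkey": "ACG", "iguana": "TCA", "snake": "TTA"}

small = fitch_score((("human", "monkey"), ("iguana", "snake")), seqs)
big = fitch_score((("human", "iguana"), ("monkey", "snake")), seqs)
print(small, big)
```

Grouping the two mammals and the two reptiles needs fewer changes than the mixed grouping, which is the sense in which one tree scores "small" and the other "big".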

  16. Phylogenetic Trees: Algorithm • Branch & Bound • [Figure: search tree of candidate trees T1, T2, T3, T4, T5, T6, T7, T8, ..., with one level per number of species (4 species, 5 species, ...)] • Parallelization: • distribution of subtrees • exchange of information about solutions
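A compact branch-and-bound sketch over tree topologies, under several assumptions: the objective is the Fitch parsimony score, trees are rooted nested pairs built by inserting one species at a time on every edge, and the score of a partial tree serves as the lower bound (adding a species cannot lower the parsimony score). The sequences and function names are hypothetical, and the rooted enumeration visits some topologies that are equivalent when unrooted.

```python
def fitch_site(tree, states):
    """State set and change count of one character on a nested-pair tree."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    (ls, lc), (rs, rc) = fitch_site(tree[0], states), fitch_site(tree[1], states)
    common = ls & rs
    return (common, lc + rc) if common else (ls | rs, lc + rc + 1)

def fitch_score(tree, seqs):
    length = len(next(iter(seqs.values())))
    return sum(fitch_site(tree, {n: s[i] for n, s in seqs.items()})[1]
               for i in range(length))

def insertions(tree, leaf):
    """Yield every tree obtained by attaching leaf to one edge of tree."""
    yield (tree, leaf)
    if not isinstance(tree, str):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)

def branch_and_bound(seqs):
    names = sorted(seqs)
    best = {"score": float("inf"), "tree": None}

    def extend(tree, k):
        score = fitch_score(tree, seqs)
        if score >= best["score"]:
            return  # bound: no completion of this partial tree can improve
        if k == len(names):
            best["score"], best["tree"] = score, tree
            return
        for t in insertions(tree, names[k]):  # branch on the next species
            extend(t, k + 1)

    extend((names[0], names[1]), 2)
    return best["score"], best["tree"]

# Hypothetical aligned sequences, not taken from the presentation.
seqs = {"human": "ACG", "monkey": "ACG", "iguana": "TCA", "snake": "TTA"}
score, tree = branch_and_bound(seqs)
print(score, tree)
```

In a distributed setting, the subtrees rooted at the first few insertions could be handed to different workers, with the best score found so far exchanged between them so that every worker prunes against the global bound.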

  17. Usage of GridLab (1) • Resource management (GRMS): • assignment of resources • application structure (master-slave) • checkpointing & migration • dynamic assignment of new resources • frameworks • distributed shared memory (?)

  18. Usage of GridLab (2) • Monitoring: • dynamic link/processor states (tuning) • estimation of end time (GRMS, end user) • Others: • visualisation • security • mobile users

  19. Contact • Director: Prof. Dr. Habil. Jacek Błażewicz, Jacek.Blazewicz@cs.put.poznan.pl • DNA Sequence Assembly: B.Sc. Bartosz Nowierski, Bartosz.Nowierski@cs.put.poznan.pl • Protein Secondary Structures: M.Sc. Piotr Łukasiak, Piotr.Lukasiak@cs.put.poznan.pl • Phylogenetic Trees: Ph.D. Piotr Formanowicz, Piotr.Formanowicz@cs.put.poznan.pl
