
CSCI6904 Genomics and Biological Computing

CSCI6904 Genomics and Biological Computing. Instructor: Christian Blouin Schedule : - Monday 14:30 – 13:55 - Wednesday 14:30 – 13:55 Contact : cblouin@cs.dal.ca rm.: 321 CS building ph: 6702.


Presentation Transcript


  1. CSCI6904 Genomics and Biological Computing. Instructor: Christian Blouin. Schedule: Monday 14:30 – 13:55, Wednesday 14:30 – 13:55. Contact: cblouin@cs.dal.ca, rm.: 321 CS building, ph: 6702

  2. Genomics: Analysis of biological data within the context of the genetic content of an entire organism. Computational Molecular Biology: Modeling and problem solving using computational techniques. Bioinformatics: Using computational techniques to perform data analysis on biological datasets.

  3. Possible misconceptions about Bioinformatics: Bioinformatics is about large datasets! I need a biology degree to do bioinformatics. Biologists don’t know anything about computation. Trivial applications of CS can make a breakthrough. CSCI6904 midterms are hard!

  4. Why should CS people do biology? Sir Alexander Fleming discovered penicillin by designing experiments (although the actual discovery was itself a lucky accident). Rosalind Franklin generated X-ray diffraction patterns by developing methods and instrumentation. The nature of science is changing rapidly

  5. Why should CS people do biology? Again, more and more research in biology and chemistry boils down to the design of a clever analysis. The quantitative skills required to navigate biology/chemistry are highly sought after by the industrial sector (pharmaceutical, environment, agriculture, food science), government labs, and academic labs. The nature of science is changing rapidly

  6. Accessibility to data. The availability of a rapidly growing mass of information has been a cliché for a while already. It is nonetheless true. Researchers interested in biological questions cannot be bothered with database issues. Computer scientists are needed to make this connection and, in the process, generate more general and portable methodology. Annotation, curation, query, maintenance… Role of Computer Scientists in future developments in the field

  7. Accessibility to computation. Even if it’s easy to get all the relevant data, the appropriate tools to do the job are rarely available. There is a need for flexible and powerful computational platforms that allow biologists/chemists to get the information they want, when they want it. Toolkits, APIs, Interfaces, Visual Programming, Education. Role of Computer Scientists in future developments in the field

  8. Accessibility to Knowledge. Biological systems did not evolve in complexity with regard to human limitations. In the not-so-distant future, knowledge rather than data will become the more useful commodity. By knowledge, I refer to the inference of conceptual relationships between data and statements present in the literature. Knowledge mining, natural language processing. Role of Computer Scientists in future developments in the field

  9. Statistical Mechanics. As the base of data gets bigger and the bias in the nature of the data fades, the assumptions made by statistical mechanics are increasingly satisfied. Statistical mechanics has the potential to rid complex problems of the convoluted models used to represent them. Computational chemistry, pattern detection, design. Role of Computer Scientists in future developments in the field

  10. Nanotechnology. Molecular biology presents a pre-fabricated framework for a microscopic platform. Proteins and nucleotides can be used as machines and for computing. The limiting factor is the inadequate quality of the models used for molecular design. Modeling evolution and molecules may just be what we need for the next big thing since running water, electricity and the internet. Integrate all of the above. Role of Computer Scientists in future developments in the field

  11. Never mind technology! Whatever takes too long to run today will run slowly tomorrow, and will probably run in real time off your video card in five years. High performance computing should be seen as an open door to smarter rather than just faster computing. A great example is the massively parallel algorithm behind folding@home. Distributed computing and algorithms, data structures. Role of Computer Scientists in future developments in the field

  12. Academic activities. Lectures (partial examinations I and II), 2 × 10%: identify problems, relate computational techniques to biological problems, apply bioinformatic techniques to unrelated issues. Journal club content when relevant to class content. Two paper reviews (30 min critical presentation; 10% and 15%): present and discuss a paper on a topic of your choice. All are expected to read the papers ahead of the presentation. Project (a clear question, a brief answer), 55%: the main activity for this course will be a small project on a relevant issue in Bioinformatics (5% will be peer reviews).

  13. Objectives • Proficiency in the general applications of Bioinformatics. • Focus on Genomics and Evolutionary Biology. • Learn the minimum necessary in Biology, Chemistry and Medicine to understand current problems in the field. • Stimulate the generation of ideas for the course’s project. The Lectures

  14. Objectives • Read current papers. • Identify current issues in Bioinformatics. • Learn about new applications of CS to the field. • Personalize the course to your own interests. The Seminars

  15. I would be glad to swap one or a few seminar sessions for workshops and hands-on work if you are interested and enrollment leaves us with free seminar sessions. Workshops?

  16. The Project: Objectives. An excuse for you to get first-hand experience in a field you may never have touched before. Address an issue of general interest in bioinformatics: application of your favorite techniques; application of general methodologies; feasibility studies; straightforward implementations applied to bioinformatics; biologically-inspired computing, which requires more biology than you may want to get into.

  17. The Project Format Milestone 1 – Definition of an area. Milestone 2 – Definition of a problem. Milestone 3 – Design of an experiment. Milestone 4 – Journal discussion of the problem. Milestone 5 – Discussing the results in the form of a short paper.

  18. The Project Format II. Can be a group project. However, as the team size increases, so will the expectations!

  19. The Project

  20. The university has guidelines: http://plagiarism.dal.ca • Contact Gwendolyn MacNairn, our Librarian, if in doubt. • This should not be an issue anymore for graduate students! • As part of the project, I will offer to proofread each of the term papers. This proofreading will aim at pointing out logical or scientific errors and omissions, a bit like a mini peer review. However, please note the following: • It doesn’t make the instructor a co-author: I don’t want to be held responsible if you don’t get a perfect mark even though you implemented all of my comments. • If there is a suspicion of plagiarism, the manuscript WILL be sent out for disciplinary action, even though the review isn’t graded. Plagiarism

  21. Fundamental Concepts of Bioinformatics, Krane and Raymer, $92 (University Library). Discovering Genomics, Proteomics and Bioinformatics, Campbell and Heyer, $92 (Amazon). Inferring Phylogenies, Felsenstein, $75 (online). Unfortunately, each covers only part of what we are going to talk about. However, the first one is a rather comprehensive overview of the field. Recommended Readings

  22. CSCI6904 Genomics and Biological Computing. Content: Genomic data; Alphabet in biology; Statistical mechanics; Physical simulations; Classic/Modern genetics; Evolutionary theory; Cellular processing; Functional genomics; Sequence alignments; Structure alignments; Phylogeny; Protein folding; Machine learning methods; Conceptual biology; DNA computing

  23. Parallel history

  24. Life

  25. Life – Origins

  26. Quick glance at life forms. Eukaryotes: We are! Nucleus, linear chromosomes and extensive control machinery.

  27. Quick glance at life forms. Archaea: Bacteria look-alikes. Apparently more closely related to us than bacteria are. Many are known to live in exotic environments.

  28. Quick glance at life forms. Bacteria: Single cell, one circular genome, “omnipresent” life forms.

  29. Quick glance at self-replicative entities. Viruses: Sole purpose is to replicate; they usually don’t do much more.

  30. Quick glance at self-replicative entities. Transposons: Pieces of DNA that jump from one location in the genome to another. Example: in Indian corn, a transposon disrupts a pigmentation-related gene.

  31. Quick glance at self-replicative entities. Prions: Not even genetically encoded. Responsible for “Mad cow” disease. A similar protein-misfolding principle is implicated in neurodegenerative diseases such as Alzheimer’s and Parkinson’s.

  32. What is cellular biology? http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookTOC.html

  33. Real-world players: lipids, sugars, nucleotides, amino acids

  34. What is molecular biology? http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookTOC.html

  35. Complex systems are usually modeled well using a graph approach. Graph terminology isn’t part of the biological culture yet.

  36. Edges and vertices
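
To make the graph vocabulary concrete, here is a minimal sketch in Python. It assumes a made-up interaction network: the vertex names (geneA, proteinA, and so on) and the helper functions are hypothetical and do not come from the slides. Molecules are vertices, interactions are edges, and the graph is stored as an adjacency list.

```python
from collections import defaultdict

# Hypothetical interactions (edges) between made-up molecules (vertices).
edges = [("geneA", "proteinA"), ("proteinA", "proteinB"), ("proteinB", "geneC")]


def build_graph(edge_list):
    """Return an undirected graph as an adjacency list (dict of sets)."""
    graph = defaultdict(set)
    for u, v in edge_list:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def degree(graph, vertex):
    """Number of edges touching a vertex (0 if the vertex is unknown)."""
    return len(graph.get(vertex, ()))


graph = build_graph(edges)
```

With this representation, `degree(graph, "proteinA")` counts the interaction partners of one molecule, which is the kind of question a graph model makes easy to ask.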

  37. Lucky us, this encoding is 1-dimensional (and thus can be represented as strings) http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookTOC.html
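
Because the encoding is one-dimensional, ordinary string operations already go a long way. The sketch below is only an illustration, assuming DNA written with the four letters A, C, G and T; the function names are hypothetical. It computes the reverse complement of a strand and its GC content.

```python
# Translation table mapping each base to its complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read in the conventional direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def gc_content(dna):
    """Fraction of bases that are G or C (0.0 for an empty string)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and `gc_content("ATGC")` gives `0.5`.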

  38. What kind of information? Bergeron, Bioinformatics Computing, pp. 45-46

  39. What kind of information? http://www.ncbi.nlm.nih.gov

  40. Sequences: GenBank http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html • DNA sequences. • Primary data generators submit to GenBank. • Annotation issues. • Heart of most genomics projects. http://www.ncbi.nlm.nih.gov

  41. Structures: Protein Data Bank http://www.rcsb.org/pdb/ • Models of 3D structures • X-ray crystallography • NMR spectroscopy http://www.ncbi.nlm.nih.gov

  42. Microarray: Gene expression • Identify which genes are expressed under a given set of conditions. • Microchips require only a small amount of sample for a full analysis. http://www.ncbi.nlm.nih.gov

  43. What can we do with sequences? Multiple sequence alignments. Principle: Characters in sequences can be substituted randomly. An alignment places homologous positions together. It is unlikely that an ultimate alignment tool will ever be made. http://www.ncbi.nlm.nih.gov
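
The idea of placing homologous positions together can be illustrated on the simplest case. The sketch below is not a multiple sequence alignment tool: it is the classic Needleman-Wunsch dynamic program for the global alignment of just two sequences, and the scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of two strings.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `global_align("ACGT", "AGT")` returns a score together with two gapped strings of equal length. Practical multiple-alignment tools rely on heuristics on top of ideas like this one, which is part of why no single tool is expected to be the ultimate one.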

  44. What can we do with sequences? Multiple sequence alignments. Tools: BioEdit (Windows): free, all-inclusive functions. SeaView (Unix): free, unstable, few alternatives that I know of. http://www.mbio.ncsu.edu/BioEdit/bioedit.html

  45. What can we do with sequences? Whole genome analysis: Look for genes. Look for regulation mechanisms. Look for drug targets (exclusive pathway). Predict the function of unknown sequences. http://www.the-scientist.com/images/yr2001/oct29/y.gif
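
As a rough illustration of what "look for genes" can mean computationally, the sketch below scans a DNA string for open reading frames (ORFs). It is a deliberately naive assumption of how one might start: it only reads the forward strand, treats ATG as the only start codon, uses the three standard stop codons, and the function name and `min_codons` parameter are hypothetical. Real gene finders are far more sophisticated.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ORF on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon, inclusive.
    Positions are 0-based and `end` is exclusive.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # close it and keep scanning this frame
    return orfs
```

On the toy input `"CCATGAAATAGCC"`, the only ORF found is `ATGAAATAG`, which starts at position 2 in the third reading frame.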

  46. What can we do with sequences? Tell a story: Relationships amongst sequences. Origins of systems. Horizontal transfer of information between sequences. Understand evolution.
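
One simple way to quantify relationships amongst sequences is a pairwise distance matrix. The sketch below assumes sequences that are already aligned and uses the p-distance (the fraction of compared positions that differ), ignoring columns that contain a gap. The toy sequences and function names are hypothetical, and the p-distance is only one of many possible measures.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue  # a gap carries no substitution information here
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(aligned):
    """Map every ordered pair of sequence names to its p-distance."""
    names = list(aligned)
    return {(a, b): p_distance(aligned[a], aligned[b])
            for a in names for b in names}


# Hypothetical pre-aligned toy sequences.
toy = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "TCGAAC-A"}
matrix = distance_matrix(toy)
```

A matrix like this is a common starting point for distance-based tree building, where the smallest distances suggest the closest relatives.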

  47. What can we do with structures? What is its function? What is the mechanism? Does it relate to other known structures? Can we design a drug to enhance/suppress its function? Predict the structure of related proteins. http://www.ks.uiuc.edu/Research/vmd/
