1 / 40

CSCI2950-C Genomes, Networks, and Cancer

CSCI2950-C Genomes, Networks, and Cancer. http://cs.brown.edu/courses/cs2950-c/. Course Organization. Seminar style Grading: 10% participation 30% paper reviews (~10) 20% presentations (2-3) 40% project. Midterm proposal and final presentations

lerickson
Download Presentation

CSCI2950-C Genomes, Networks, and Cancer

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSCI2950-C Genomes, Networks, and Cancer http://cs.brown.edu/courses/cs2950-c/

  2. Course Organization • Seminar style • Grading: • 10% participation • 30% paper reviews (~10) • 20% presentations (2-3) • 40% project. Midterm proposal and final presentations • Extra credit: 5% for report on comp. bio. seminars (pre-arranged): Maximum 2 • CS requirements: PhD: Area B (Algorithms) ScM: Theory or Practice* *depending on project Significant programming*

  3. Biology 101

  4. Biology 101 Central Dogma

  5. What can we measure? Sequencing (expensive) Hybridization (noisy) Sequencing (expensive) Hybridization (noisy) Mass spectrometry (noisy) Hybridization (very noisy!)

  6. Human Genome Sequenced2000-2003, ??? • Comparative Genomics • Genomes of other mammals • 2) Disease genomics • HapMap • 1000 Genomes • 3) Cancer genomics • The Cancer Genome Atlas • Tumor Sequencing Project Now what? Sept. 3 “In the Genome Race, the Sequel Is Personal”

  7. Course Topics Genomes Algorithms Visualization Machine Learning Systems Data Mining Networks Cancer

  8. Course Topics Genomes Networks Cancer

  9. DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT…

  10. DNA Sequencing Goal: Find the complete sequence of A, C, G, T’s in DNA Challenge: There is no machine that takes long DNA as an input, and gives the complete sequence as output Can only sequence ~500 letters at a time

  11. Shotgun Sequencing genomic segment cut many times at random (shotgun) Get one or two reads from each segment ~500 bp ~500 bp

  12. Fragment Assembly reads Cover region with ~7-fold redundancy Overlap reads and extend to reconstruct the original genomic region

  13. Comparative Genomics Compare genome sequences of related species. • Sequence Alignment • Phylogeny: defining ancestral relationships between species, genomes, genes. • human • mouse • Sequences • Evolve via substitutions • Conservation suggests function • EFTPPVQAAYQKVVAG • DFNPNVQAAFQKVVAG

  14. History of Chromosome X Rat Consortium, Nature, 2004

  15. Comparative Genomic Architectures: Mouse vs Human Genome • Humans and mice have similar genomes, but their genes are ordered differently • ~245 rearrangements • Reversals • Fusions • Fissions • Translocation

  16. Genome rearrangements Mouse (X chrom.) • What are the similarity blocks and how to find them? • What is the architecture of the ancestral genome? • What is the evolutionary scenario for transforming one genome into the other? Unknown ancestor~ 75 million years ago Human (X chrom.)

  17. 1 2 3 9 10 8 4 7 5 6 Reversals • Blocks represent conserved genes. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

  18. Reversals 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 • Blocks represent conserved genes. • In the course of evolution or in a clinical context, blocks 1,…,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10.

  19. Reversals and Breakpoints 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 The reversion introduced two breakpoints(disruptions in order).

  20. Sorting Permutations by Reversals  = 12…n signed permutation (Sankoff et al.1990) Reversal (i,j) [inversion] 1…i-1 -j... -ij+1…n Problem: Given , find a sequence of reversals 1, …, t with such that:  . 1 . 2 … t = (1, 2, …, n) andt is minimal. Solution: Analysis of breakpoint graph • Polynomial time algorithms • O(n4) : Hannenhalli and Pevzner, 1995. O(n2) : Kaplan, Shamir, Tarjan, 1997. • O(n) [distance t] : Bader, Moret, and Yan, 2001.O(n3) : Bergeron, 2001.

  21. Course Topics Genomes Networks Cancer Network Alignment Network Integration

  22. Biological Interaction Networks Many types: • Protein-DNA (regulatory) • Protein-metabolite (metabolic) • Protein-protein (signaling) • RNA-RNA (regulatory) • Genetic interactions (gene knockouts)

  23. Regulatory Networks

  24. Cis-regulatory Network

  25. Metabolic Networks Nodes = reactants Edges = reactions labeled by enzyme (protein) that catalyzes reaction

  26. Protein-Protein Interaction (PPI) Network

  27. Protein-Protein Interaction Network? • Proteins are nodes • Interactions are edges • Edges may have weights Yeast PPI network H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41 (2001)

  28. Network Alignment Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, pp. 427-433, 2006

  29. The Network Alignment Problem Given: k different interaction networks belonging to different species, Find: Conserved sub-networks within these networks Conserved defined by protein sequence similarity (node similarity) and interaction similarity (network topology similarity)

  30. Protein Signaling Networks Art Salomon Biology Department Use machine learning methods (Bayesian networks, etc. to derive network structure.

  31. Computational Problems • Classifying Network Topology • Finding paths, cliques, dense subnetworks, etc. • Comparing Networks Across Species • Using networks to explain data • Dependencies revealed by network topology • Modeling dynamics of networks

  32. Course Topics Genomes Networks Cancer Cancer Genomes Cancer & Networks

  33. Single nucleotide change Cell Division and Mutation Copy number Structural

  34. Cancer Genomes Leukemia Breast

  35. Rearrangements in Tumors Change gene structure, create novel fusion genes • Gleevec (Novartis 2001) targets ABL-BCR fusion

  36. Challenges • What is detailed organization of cancer genomes? • What genes affected? • What sequence of rearrangements produced this genome? • Can we create custom treatments for tumors based on mutational spectrum? (e.g. Gleevec)

  37. Tumor Genome → Phenotype Gene Networks and Pathways Integration of Multiple Data Sources • ESP and copy number • Mutation • Expression • (mRNA and miRNA) • Binding (ChIP-chip) • Pathways • Epigenetics activation repression Duplicated genes Deleted genes

  38. Tumor Genome → Phenotype Gene Networks and Pathways Integration of Multiple Data Sources • ESP and copy number • Mutation • Expression • (mRNA and miRNA) • Binding (ChIP-chip) • Pathways • Epigenetics activation repression Duplicated genes Deleted genes

  39. Course Topics Genomes Algorithms Visualization Machine Learning Systems Data Mining Networks Cancer

More Related