Outline methods to find genome-wide maps of developmental processes • What to do after the genome is done? • What data do you need? • What models do you need to be able to build?
First • Describe the developing system in terms of gene expression • Can be used to assign genes to functional groups • Set baselines for perturbation experiments • Example: insect metamorphosis – an integrated set of developmental processes controlled by a transcriptional hierarchy that coordinates the action of hundreds of genes
Genomics of insect metamorphosis • Microarray analysis revealed that 13% of assayed transcripts displayed significant changes in the first 30 hours of metamorphosis. • Coordinate changes in groups of genes provided insight into the dynamics of the process • Ecdysone levels up – glycolysis down • Metamorphosis onset – muscle-related transcripts down
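A minimal sketch of how one might flag transcripts that change over a time course, assuming a simple fold-change cutoff. The statistical criteria used in the actual study are not given in these notes; the transcript names, expression values, and twofold threshold below are all hypothetical.

```python
import math

def changed_transcripts(profiles, min_log2_fold=1.0):
    """Return names of transcripts whose max/min expression ratio over the
    time course reaches the log2 threshold (1.0 means twofold)."""
    flagged = []
    for name, values in profiles.items():
        # Expression values are assumed to be strictly positive.
        log2_range = math.log2(max(values) / min(values))
        if log2_range >= min_log2_fold:
            flagged.append(name)
    return flagged

# Hypothetical expression levels (arbitrary units) at successive time points.
profiles = {
    "glycolysis_gene": [800.0, 600.0, 300.0, 150.0],
    "muscle_gene": [400.0, 220.0, 120.0, 90.0],
    "housekeeping_gene": [500.0, 520.0, 480.0, 510.0],
}

flagged = changed_transcripts(profiles)
fraction = len(flagged) / len(profiles)
print(flagged, f"{fraction:.0%} of assayed transcripts changed")
```

A max/min ratio ignores replicate variance, so a real analysis would pair it with a statistical test; it is used here only to make the "fraction of transcripts that change" idea concrete.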
Drosophila development • Comparison of expression patterns showed that embryogenesis and metamorphosis were more similar to each other than to other growth stages (insight into how complex organisms develop). • It may be easier to study one phase than the other, which is the logic for making the comparison. • It may be possible to differentiate the effects of the same genes on the two processes • 40% of genes differed between sexes and most of this was in the germline
Drosophila • It was possible to pull out tissue-specific developmental programs • Could verify this with mutants (critical for first model systems)
C. elegans • Time course from early embryogenesis to adulthood • Do they know what effect growth medium has? • 22% of the transcripts showed significant change over this time course • Less evolutionarily conserved genes tend to be expressed later in life than highly conserved genes.
Differences in experiments • 6 or 8 timepoints were used for C. elegans, whereas 80 were used for Drosophila • It has been suggested that these developmental studies are not necessary for other organisms in which tissues can be easily dissected. What do you think?
Examination of germline-specific transcripts in C. elegans • Used germline mutant + wild type to identify these • 1416 (12%) were germline enriched • 258 oocyte enriched • 650 sperm enriched • 508 germline enriched • Identities indicate that posttranslational mechanisms play an important part in sperm • Kinases, phosphatases • Is this enough information for you? What would you do next? What questions would you ask?
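The counts on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers listed above; the implied array size is an estimate, assuming the 12% refers to all genes assayed and allowing for rounding of that percentage.

```python
# Category counts as listed on the slide.
categories = {
    "oocyte enriched": 258,
    "sperm enriched": 650,
    "germline enriched": 508,
}
total_reported = 1416
fraction_reported = 0.12  # rounded percentage from the slide

# The three categories should partition the reported total.
total = sum(categories.values())

# Assumption: 12% is relative to all genes on the array, so this is
# only an estimate of how many genes were assayed.
implied_assayed = total_reported / fraction_reported

for name, count in categories.items():
    print(f"{name}: {count} ({count / total:.1%} of germline-enriched genes)")
print(f"sum of categories: {total}")
print(f"implied genes assayed: about {round(implied_assayed)}")
```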
Surprises in the C. elegans story? • Germline-specific genes were not on the X chromosome • There is no doubt more information in these datasets that will be mined in the future – if you know about them.
Is it possible to study behavior at this level? • Some Drosophila behaviors are under circadian control • Found 134 transcripts that are cyclic • 25% have no known function • Geotaxis (moving with or against gravity), a polygenic trait • Found several hundred differentially expressed genes • Found three genes that when mutated resulted in geotaxis defects
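One common way to ask whether a transcript is cyclic is to fit a cosine with a 24-hour period to its time course and look at the fitted amplitude. The sketch below illustrates that idea only; it is not the method used in the study summarized above, and the sampling scheme and expression values are hypothetical.

```python
import math

def rhythm_amplitude(times_h, values, period_h=24.0):
    """Amplitude of the best-fitting cosine of the given period.

    Valid as written only for evenly spaced samples that span a whole
    number of periods with more than two samples per period, so that
    the sine and cosine terms are orthogonal.
    """
    n = len(values)
    mean = sum(values) / n
    omega = 2.0 * math.pi / period_h
    cos_sum = sum((v - mean) * math.cos(omega * t) for t, v in zip(times_h, values))
    sin_sum = sum((v - mean) * math.sin(omega * t) for t, v in zip(times_h, values))
    return 2.0 / n * math.hypot(cos_sum, sin_sum)

# Hypothetical sampling: every 4 hours for 48 hours (two full cycles).
times = [4.0 * i for i in range(12)]
# Toy transcript oscillating around 5 with amplitude 2, peaking at hour 6.
cyclic = [5.0 + 2.0 * math.cos(2.0 * math.pi * (t - 6.0) / 24.0) for t in times]
# Toy transcript with constant expression.
flat = [5.0] * len(times)

print("cyclic amplitude:", rhythm_amplitude(times, cyclic))
print("flat amplitude:", rhythm_amplitude(times, flat))
```

In practice the amplitude would be compared against noise (for example by permuting time points) before calling a transcript cyclic.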
Genomics in model systems to study human disease • Leptin knockout mice vs wild type • Found 2000 detectably expressed genes, of which 450 differed significantly between mutant obese mice and wild type. • Added leptin back to ob mice to distinguish specific effects of leptin • Identified a transcription factor downregulated by leptin that regulates fatty-acid biosynthesis
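One way to read the add-back design is as set logic: genes that differ in ob mice but return toward wild type when leptin is restored are candidates for leptin-specific effects. The sketch below assumes that reading; the gene names and set contents are placeholders, not results from the study.

```python
# Hypothetical gene sets; names are placeholders.
differs_ob_vs_wt = {"geneA", "geneB", "geneC", "geneD"}
# Genes still different from wild type after leptin is added back.
differs_ob_plus_leptin_vs_wt = {"geneC", "geneD"}

# Rescued by leptin: candidates for leptin-specific regulation.
leptin_specific = differs_ob_vs_wt - differs_ob_plus_leptin_vs_wt
# Not rescued: possibly secondary consequences of obesity.
not_rescued = differs_ob_vs_wt & differs_ob_plus_leptin_vs_wt

# Fraction of detectably expressed genes that differed, from the slide.
slide_fraction = 450 / 2000

print(sorted(leptin_specific), sorted(not_rescued), f"{slide_fraction:.1%}")
```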
More disease models • MS • Retinal diseases – mouse • Identified 396 genes, looked for rod-specific expression. Human orthologs exist for 237. In all, 86 of the newly identified genes correspond to 37 different disease loci
Regulatory networks • How can multiple types of information be integrated? • Modules – multiple, adjacent cis-regulatory elements • “Building the core of the regulatory circuitry by determining the relationships between cis-regulatory sequences, regulatory proteins, and gene expression seems a first step in delineating developmental networks”
Genome-wide analysis of transcription-factor/binding-site interactions • ChIP (chromatin immunoprecipitation) – identify which genes transcription factors are interacting with. (Cross-link binding proteins to DNA. Shear DNA and IP. Label and hybridize DNA to identify genes to which factors are bound.)
Be careful • 9 transcription factors govern the yeast cell cycle (?) • Look at what these bind • Only 213 of 800 cell cycle genes are bound by these.
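The caution on this slide comes down to a coverage calculation: take the union of genes bound by any factor and ask what fraction of the cell cycle gene list it explains. A minimal sketch, with hypothetical factor and gene names standing in for real ChIP results:

```python
# Hypothetical ChIP results: factor name -> set of genes found bound.
bound_by_factor = {
    "factor_A": {"gene1", "gene2", "gene3"},
    "factor_B": {"gene3", "gene4"},
    "factor_C": {"gene9"},  # binds a gene outside the cell cycle list
}
# Hypothetical list of cell-cycle-regulated genes.
cell_cycle_genes = {
    "gene1", "gene2", "gene3", "gene4",
    "gene5", "gene6", "gene7", "gene8",
}

# Genes bound by at least one factor.
bound_any = set().union(*bound_by_factor.values())
explained = bound_any & cell_cycle_genes
unexplained = cell_cycle_genes - bound_any
coverage = len(explained) / len(cell_cycle_genes)

# The same ratio using the numbers on the slide.
slide_coverage = 213 / 800

print(f"toy coverage: {coverage:.0%}, slide coverage: {slide_coverage:.1%}")
print("not bound by any factor:", sorted(unexplained))
```

With the slide's numbers, roughly three quarters of cell cycle genes are not accounted for by direct binding of these factors.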
Comparative genomics • Can sequence important regions of closely related species to understand regulatory regions (Lander in Nature/Johnston in Science)
Future • Localization and processing of RNA • Post translational modifications • Integration of data “data currently being generated is..systematic but also superficial.” • Much of the data leads to speculative discourse
Neurospora genome • There are many of these types of papers. • Sequence a genome, assemble it (this is an enormous amount of work) – then present initial analyses that show this was worth doing. • 40 megabases – a little over 2X yeast. • 10K genes, only 25% fewer than Drosophila and almost 2X that of yeast.
Neurospora • Intro gives the placement of this organism – why is it important? • >20-fold sequence coverage • Have 958 sequence contigs, with 97% of the sequence in the largest 44 contigs • 41% of the genes aren’t found in other organisms • Important organism for studying epigenetics
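The assembly statistics on this slide are simple to compute from a list of contig lengths. The sketch below shows the two calculations on toy data; the contig lengths are hypothetical, and the genome size of 40 megabases is taken from the earlier Neurospora slide.

```python
def fraction_in_largest(contig_lengths, k):
    """Fraction of all assembled bases contained in the k largest contigs."""
    ordered = sorted(contig_lengths, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

def read_bases_for_coverage(genome_size_bp, fold):
    """Total sequenced bases needed for a given average fold coverage."""
    return genome_size_bp * fold

# Hypothetical contig lengths (bp) for a toy assembly.
toy_contigs = [900, 700, 300, 50, 30, 20]
top2 = fraction_in_largest(toy_contigs, 2)

# 20-fold coverage of a 40-megabase genome.
bases_needed = read_bases_for_coverage(40_000_000, 20)

print(f"{top2:.0%} of the toy assembly is in the 2 largest contigs")
print(f"{bases_needed:,} bases of sequence for 20-fold coverage")
```

For the real assembly the equivalent call would be `fraction_in_largest(lengths, 44)` on all 958 contig lengths, which the slide reports as 97%.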
Genome papers • What is there? • Sequence information • Genes • Pathways • Potential utility of the sequence • Comparative analysis with other genomes • Some researchers continue sequence analyses; others move to functional genomics
Webpages • All genome projects have a webpage: • http://www-genome.wi.mit.edu/. • www.tigr.org • Also see webpages for this class (GOLD)