Genes: Regulation and Structure

mikko
Presentation Transcript


  1. Genes: Regulation and Structure • Many slides from various sources, including S. Batzoglou

  2. Cells respond to environment • Various external messages (heat, food supply) • The cell responds to environmental conditions

  3. Genome is fixed – Cells are dynamic • A genome is static: every cell in our body has a copy of the same genome • A cell is dynamic: it responds to external conditions, most cells follow a cell cycle of division, and cells differentiate during development

  4. Gene regulation • Gene regulation is responsible for the dynamic behavior of the cell • Gene expression varies according to: cell type, cell cycle, external conditions, location

  5. Where gene regulation takes place • Opening of chromatin • Transcription • Translation • Protein stability • Protein modifications

  6. Transcriptional Regulation • Strongest regulation happens during transcription • Best place to regulate: no energy is wasted making intermediate products • However, it has the slowest response time. After a receptor notices a change: (1) cascade message to nucleus, (2) open chromatin & bind transcription factors, (3) recruit RNA polymerase and transcribe, (4) splice mRNA and send to cytoplasm, (5) translate into protein

  7. Transcription Factors Binding to DNA • Transcription regulation: certain transcription factors bind DNA • Binding recognizes DNA substrings: regulatory motifs
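
The idea that binding "recognizes DNA substrings" can be sketched as a plain substring scan over both strands. This is only an illustration: real binding sites are degenerate and are usually modelled with weight matrices, and the promoter fragment and the function names below are made up for the example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """List (0-based start, strand) for every exact motif match.

    Minus-strand hits are reported at the plus-strand coordinate of
    their leftmost base.
    """
    seq, motif = seq.upper(), motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif and rc_motif != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical promoter fragment, invented for illustration only.
promoter = "GGCCAATCGTATAAAGGCATTGGCC"
ccaat_hits = find_motif(promoter, "CCAAT")   # CCAAT box, either strand
tata_hits = find_motif(promoter, "TATAAA")   # a TATA-box-like hexamer
```

Exact matching is the simplest possible motif model; it misses near-matches that a transcription factor may still bind.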

  8. Promoter and Enhancers • Promoter necessary to start transcription • Enhancers can affect transcription from afar

  9. Regulation of Genes [figure labels: transcription factor (protein), RNA polymerase (protein), DNA, regulatory element, gene]

  10. Regulation of Genes [figure labels: transcription factor (protein), RNA polymerase, DNA, regulatory element, gene]

  11. Regulation of Genes [figure labels: new protein, RNA polymerase, transcription factor, DNA, regulatory element, gene]

  12. Example: A human heat shock protein • Promoter of the heat shock gene hsp70 [figure: promoter map from position -158 to 0 with the elements HSE, CCAAT, AP2, AP2, CCAAT, SP1, SP1, TATA, followed by the gene] • TATA box: positioning transcription start • TATA, CCAAT: constitutive transcription • GRE: glucocorticoid response • MRE: metal response • HSE: heat shock element

  13. Gene expression: DNA → (transcription) → RNA → (translation) → Protein • DNA: CCTGAGCCAACTATTGATGAA • RNA: CCUGAGCCAACUAUUGAUGAA • Protein: PEPTIDE
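
The slide's example can be reproduced in a few lines. The sketch below assumes the standard genetic code and treats the DNA string as the coding strand, so transcription is just T → U. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA from its first base, stopping at a stop codon."""
    coding = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(coding) - 2, 3):
        amino_acid = CODON_TABLE[coding[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("CCTGAGCCAACTATTGATGAA")
protein = translate(mrna)
```

Here `mrna` is `CCUGAGCCAACUAUUGAUGAA` and `protein` is `PEPTIDE`, matching the slide.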

  14. The Genetic Code

  15. Eukaryotes vs Prokaryotes • “Typical” human & bacterial cells drawn to scale. • Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes. Brown Fig 2.1 BIOS Scientific Publishers Ltd, 1999

  16. Prokaryotic genes – searching for ORFs • Small genomes have high gene density (Haemophilus influenzae: 85% genic) • No introns • Operons: one transcript, many genes • Open reading frame (ORF): a contiguous set of codons that starts with a Met codon and ends with a stop codon

  17. Example of ORFs • There are six possible reading frames in each sequence: three for each direction of transcription
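
A naive six-frame ORF scan, as a minimal sketch of the definition above: three frames on the given strand and three on its reverse complement, each ORF running from the first ATG to the next in-frame stop. The minimum-length cutoff and all names are assumptions for illustration. Real prokaryotic gene finders also handle alternative start codons and use statistical scoring.

```python
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def orfs_in_frame(seq: str, frame: int, min_codons: int):
    """Yield (start, end) of ATG...stop ORFs in one frame; end is exclusive."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # first ATG opens the ORF
        elif start is not None and codon in STOPS:
            if (i - start) // 3 >= min_codons:
                yield start, i + 3         # include the stop codon
            start = None


def six_frame_orfs(seq: str, min_codons: int = 2):
    """Return (strand, frame, ORF sequence) over all six reading frames."""
    seq = seq.upper()
    found = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(strand_seq, frame, min_codons):
                found.append((strand, frame, strand_seq[start:end]))
    return found


# Toy sequence, invented for illustration.
orfs = six_frame_orfs("AATGGCCAAATAGCC")
```

On this toy input the only ORF is `ATGGCCAAATAG`, in frame 1 of the plus strand.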

  18. Eukaryotes vs Prokaryotes • “Typical” human & bacterial cells drawn to scale. • Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes. Brown Fig 2.1 BIOS Scientific Publishers Ltd, 1999

  19. Gene structure [figure: gene with exon1, intron1, exon2, intron2, exon3; transcription, then splicing, then translation] • Codon: a triplet of nucleotides that is converted to one amino acid • exon = protein-coding • intron = non-coding

  20. Gene structure [figure: gene with exon1, intron1, exon2, intron2, exon3; transcription, then splicing, then translation] • exon = coding • intron = non-coding

  21. Finding genes [figure: gene drawn 5’ to 3’ as Exon 1, Intron 1, Exon 2, Intron 2, Exon 3] • Start codon: ATG • Stop codon: TAG/TGA/TAA • Splice sites at the exon/intron boundaries

  22. atg caggtg ggtgag cagatg ggtgag cagttg ggtgag caggcc ggtgag tga

  23. 0. We can sequence the mRNA • Expressed Sequence Tag (EST) sequencing is expensive • It has a nonzero false-positive rate (aberrant splicing) • The method sequences all RNAs, not just those transcribed from protein-coding genes • This is difficult for rare genes (those that are expressed rarely or in low quantities) • Still, this is an invaluable source of information (when available)

  24. Biology of Splicing (http://genes.mit.edu/chris/)

  25. 1. Consensus splice sites • Donor: 7.9 bits • Acceptor: 9.4 bits • (Stephens & Schneider, 1996) (http://www-lmmb.ncifcrf.gov/~toms/sequencelogo.html)
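
The "bits" figures are information content summed over the aligned positions of a site. A minimal sketch of that calculation follows, assuming a uniform background (2 bits maximum per position) and omitting the small-sample correction used in sequence-logo work. The toy donor sites are invented, so the result does not reproduce the 7.9-bit value.

```python
import math
from collections import Counter


def column_information(column: str) -> float:
    """Information at one aligned position, in bits: 2 - Shannon entropy."""
    total = len(column)
    entropy = -sum(
        (n / total) * math.log2(n / total) for n in Counter(column).values()
    )
    return 2.0 - entropy


def site_information(sites: list[str]) -> float:
    """Sum per-position information over equal-length aligned sites."""
    return sum(column_information("".join(col)) for col in zip(*sites))


# Toy aligned donor sites, invented for illustration.
toy_donors = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT"]
total_bits = site_information(toy_donors)
```

A fully conserved position contributes 2 bits and a position with all four bases equally likely contributes 0.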

  26. 2. Recognize “coding bias” • Each exon can be in one of three frames (Frame 0, Frame 1, Frame 2), e.g. the exon in ag—gattacagattacagattaca—gtaag • Frame of the next exon depends on how many nucleotides are left over from the previous exon • Codons “tag”, “tga”, and “taa” are STOP • No STOP codon appears in-frame until the end of the gene • A stretch with no in-frame STOP is called an open reading frame (ORF) • Different codons appear with different frequencies: coding bias
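
The frame bookkeeping above can be sketched as follows. The code tracks how many nucleotides are left over (0, 1 or 2) after each exon, and checks that the spliced exons contain no in-frame STOP before the final codon. The exon sequences and function names are invented for illustration, and the first exon is assumed to begin exactly at the start codon.

```python
STOPS = {"TAG", "TGA", "TAA"}


def leftover_after(exon_lengths: list[int]) -> list[int]:
    """Nucleotides left over (0, 1 or 2) after each exon in turn.

    The leftover from one exon fixes the frame of the next exon.
    """
    carry, leftovers = 0, []
    for length in exon_lengths:
        carry = (carry + length) % 3
        leftovers.append(carry)
    return leftovers


def has_open_reading_frame(exons: list[str]) -> bool:
    """True if the spliced exons end in a STOP and have no earlier in-frame STOP."""
    cds = "".join(exons).upper()
    if not cds or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOPS and not any(c in STOPS for c in codons[:-1])


# Toy exons, invented for illustration.
exons = ["ATGGATT", "ACAGATTACAG", "ATTACATAG"]
leftovers = leftover_after([len(e) for e in exons])
is_open = has_open_reading_frame(exons)
```

Dropping a single base from the middle exon shifts every downstream codon, which is why a gene finder has to carry the frame across introns.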

  27. 2. Recognize “coding bias”

      | Amino acid | SLC | DNA codons |
      |---|---|---|
      | Isoleucine | I | ATT, ATC, ATA |
      | Leucine | L | CTT, CTC, CTA, CTG, TTA, TTG |
      | Valine | V | GTT, GTC, GTA, GTG |
      | Phenylalanine | F | TTT, TTC |
      | Methionine | M | ATG |
      | Cysteine | C | TGT, TGC |
      | Alanine | A | GCT, GCC, GCA, GCG |
      | Glycine | G | GGT, GGC, GGA, GGG |
      | Proline | P | CCT, CCC, CCA, CCG |
      | Threonine | T | ACT, ACC, ACA, ACG |
      | Serine | S | TCT, TCC, TCA, TCG, AGT, AGC |
      | Tyrosine | Y | TAT, TAC |
      | Tryptophan | W | TGG |
      | Glutamine | Q | CAA, CAG |
      | Asparagine | N | AAT, AAC |
      | Histidine | H | CAT, CAC |
      | Glutamic acid | E | GAA, GAG |
      | Aspartic acid | D | GAT, GAC |
      | Lysine | K | AAA, AAG |
      | Arginine | R | CGT, CGC, CGA, CGG, AGA, AGG |
      | Stop codons | Stop | TAA, TAG, TGA |

      Can map 61 non-stop codons to frequencies & take log-odds ratios
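
One way to read "map 61 non-stop codons to frequencies & take log-odds ratios" is sketched below: estimate codon frequencies from known coding sequence, then score a candidate region by summing log2(coding frequency / background frequency) over its codons. The training sequence, the uniform background, and the pseudocount are all assumptions made for this example, not values from the slides.

```python
import math
from collections import Counter
from itertools import product

STOPS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = [
    "".join(c) for c in product("ACGT", repeat=3) if "".join(c) not in STOPS
]


def codon_frequencies(seqs, pseudocount=1.0):
    """Relative frequency of each of the 61 non-stop codons (in-frame input)."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            counts[seq[i:i + 3]] += 1
    total = sum(counts[c] for c in SENSE_CODONS) + pseudocount * len(SENSE_CODONS)
    return {c: (counts[c] + pseudocount) / total for c in SENSE_CODONS}


def log_odds_score(seq, coding, background):
    """Sum of log2(coding / background) over the codons of seq; stops are skipped."""
    seq = seq.upper()
    score = 0.0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in coding:
            score += math.log2(coding[codon] / background[codon])
    return score


# Toy training data and a uniform background, both invented for illustration.
coding_model = codon_frequencies(["GCCGCCAAGGAGCTG"])
background_model = {c: 1 / len(SENSE_CODONS) for c in SENSE_CODONS}
```

A positive score means the region uses codons the way the training genes do. In practice the background would be estimated from non-coding sequence rather than assumed uniform.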

  28. 3. Genes are “conserved”

  29. Approaches to gene finding • Homology: Procrustes • Ab initio: Genscan, Genie, GeneID • Comparative: TBLASTX, Rosetta • Hybrids: GenomeScan, GenieEST, Twinscan, SLAM…

  30. HMMs for single species gene finding: Generalized HMMs

  31. HMMs for gene finding [figure: hidden states (intergene, exon, intron) labelling the observed sequence GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC]
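
A toy version of the idea: three hidden states (intergene, exon, intron) emit nucleotides, and the Viterbi algorithm recovers the most probable state labelling of a sequence. Every probability below is invented for illustration. Real gene finders use far richer state spaces (frames, splice signals, both strands).

```python
import math

STATES = ("intergene", "exon", "intron")
# All probabilities are made-up illustrative values.
START = {"intergene": 0.8, "exon": 0.1, "intron": 0.1}
TRANS = {
    "intergene": {"intergene": 0.90, "exon": 0.10, "intron": 0.00},
    "exon":      {"intergene": 0.05, "exon": 0.90, "intron": 0.05},
    "intron":    {"intergene": 0.00, "exon": 0.10, "intron": 0.90},
}
EMIT = {
    "intergene": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "exon":      {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "intron":    {"A": 0.25, "C": 0.15, "G": 0.15, "T": 0.45},
}


def log(p: float) -> float:
    """Natural log that maps probability 0 to -infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq: str) -> list[str]:
    """Most probable state path for a non-empty A/C/G/T sequence."""
    seq = seq.upper()
    score = {s: log(START[s]) + log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: score[r] + log(TRANS[r][s]))
            new_score[s] = (
                score[best_prev] + log(TRANS[best_prev][s]) + log(EMIT[s][base])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):   # trace back from the end
        state = pointers[state]
        path.append(state)
    return path[::-1]


labels = viterbi("GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC")
```

The zero entries in `TRANS` encode the grammar of a gene: an intron can only be entered from, and left into, an exon.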

  32. GHMM for gene finding [figure: the sequence T A A T A T G T C C A C G G G T A T T G A G C A T T G T A C A C G G G G T A T T G A G C A T G T A A T G A A segmented into Exon1, Exon2, Exon3, with a duration for each segment]

  33. Observed duration times

  34. Better way to do it: negative binomial • EasyGene: prokaryotic gene-finder (Larsen TS, Krogh A) • Negative binomial with n = 3
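
For context: a single HMM state with a self-loop gives geometrically distributed durations, which always peak at length 1. Chaining n identical states in series gives a negative binomial duration, which peaks away from 1. The sketch below compares the two. The exit probability `p = 0.1` is an arbitrary illustrative value, and reading "n = 3" as three chained states is an assumption about the slide.

```python
from math import comb


def geometric_pmf(d: int, p: float) -> float:
    """P(duration = d) for one state with exit probability p (d >= 1)."""
    return p * (1 - p) ** (d - 1) if d >= 1 else 0.0


def negative_binomial_pmf(d: int, p: float, n: int = 3) -> float:
    """P(total duration = d) for n chained states, each with exit probability p."""
    if d < n:
        return 0.0
    return comb(d - 1, n - 1) * p ** n * (1 - p) ** (d - n)


p = 0.1  # illustrative exit probability, not taken from the slides
geometric_mode = max(range(1, 200), key=lambda d: geometric_pmf(d, p))
negbin_mode = max(range(1, 200), key=lambda d: negative_binomial_pmf(d, p))
```

With n = 1 the negative binomial reduces to the geometric distribution, so the extra states are what let the model fit length distributions that have a peak.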

  35. Splice Site Models • WMM: weight matrix model = PSSM (Staden 1984) • WAM: weight array model = 1st order Markov (Zhang & Marr 1993) • MDD: maximal dependence decomposition (Burge & Karlin 1997): decision-tree like algorithm to take significant pairwise dependencies into account
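
The difference between the first two models can be sketched as follows. A WMM scores each position independently. A WAM conditions each position on the base before it. Both are scored as log-odds against a uniform background. The training sites, the pseudocount, and the function names are invented for this example, and MDD is not shown.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def train_wmm(sites, pseudocount=1.0):
    """One probability table per position (independent positions)."""
    model = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        model.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return model


def score_wmm(model, site, background=0.25):
    return sum(math.log2(col[b] / background) for col, b in zip(model, site))


def train_wam(sites, pseudocount=1.0):
    """First position as in a WMM, then P(base | previous base) per position."""
    first = train_wmm([s[0] for s in sites], pseudocount)[0]
    conditionals = []
    for i in range(1, len(sites[0])):
        pairs = Counter((s[i - 1], s[i]) for s in sites)
        table = {}
        for prev in ALPHABET:
            total = sum(pairs[(prev, b)] for b in ALPHABET) + pseudocount * len(ALPHABET)
            for b in ALPHABET:
                table[(prev, b)] = (pairs[(prev, b)] + pseudocount) / total
        conditionals.append(table)
    return first, conditionals


def score_wam(model, site, background=0.25):
    first, conditionals = model
    score = math.log2(first[site[0]] / background)
    for i, table in enumerate(conditionals, start=1):
        score += math.log2(table[(site[i - 1], site[i])] / background)
    return score


# Toy donor sites (3 exon bases + 6 intron bases), invented for illustration.
donors = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT", "CAGGTGAGT"]
wmm, wam = train_wmm(donors), train_wam(donors)
```

The WAM needs a 4×4 table per position instead of 4 numbers, so it requires correspondingly more training sites to estimate reliably.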

  36. Splice site detection [figure: donor site, 5’ to 3’; axes: position vs. %]
