University of Brawijaya

4th December 2013

Austen Ganley

INMS

Understanding the Human Genome: Lessons from the ENCODE project

Glossary

  • Non-coding RNA
  • Sequencing
  • Microarray
  • Transcription start site
  • Active/open
  • Inactive/repression
  • Genome
  • Genes
  • DNA/RNA
  • Protein
  • Cell
  • Transcription
  • Chromatin
  • Histones
  • Nucleosomes

[Figure labels: transcriptional start site, transcriptional terminator]
  • Individual scientists worked together
  • The aim was to understand 1% of the human genome first (results published in 2007), and then 100% (2012)
  • Looked at:
    • Transcription
    • Chromatin/transcription factors
    • Replication
    • Evolution


  • The human genome is now estimated to contain about 21,000 protein-coding genes (taking up about 3% of the whole genome)
  • In addition, there are about 9,000 small RNAs (including microRNAs), and about 10,000 long non-coding RNAs



  • Transcription was measured by two different methods:
    • Whole genome microarrays
    • RNA-sequencing
  • They found at least 62% of the whole genome is transcribed (remember, genes only account for about 3% of the whole genome)

Transcriptional start sites

  • Goal is to identify the transcription start sites
  • Not easy to do!
  • Use a technique called CAGE (Cap Analysis Gene Expression)


  • Makes use of the 5′ cap on mRNA
  • First, mRNA is reverse-transcribed to form cDNA (giving an RNA-DNA hybrid)
  • Then, biotin is attached to the 5′ cap, and the cDNA is fragmented
  • The biotin-tagged fragments are isolated (they represent the 5′ end of the mRNA), and these fragments are sequenced
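
Once the 5′ fragments have been sequenced and mapped, nearby 5′ ends can be grouped into candidate transcription start sites. The following is only a minimal sketch of that idea, not the method ENCODE used: the function name `cluster_cage_tags` and the `max_gap` and `min_tags` cut-offs are hypothetical, and the input is assumed to be tag positions from one strand of one chromosome.

```python
from collections import Counter


def cluster_cage_tags(tag_starts, max_gap=20, min_tags=2):
    """Group 5' tag positions into candidate transcription start sites.

    tag_starts: iterable of integer positions (mapped 5' ends of tags).
    max_gap:    hypothetical cut-off; a tag further than this from the
                previous tag starts a new cluster.
    min_tags:   hypothetical minimum number of tags needed to report a cluster.
    Returns a list of (start, end, tag_count, peak_position) tuples.
    """
    clusters = []
    current = []
    for pos in sorted(tag_starts):
        # Close the current cluster when the gap to the previous tag is too big.
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    result = []
    for cluster in clusters:
        if len(cluster) < min_tags:
            continue  # too little support, likely noise
        # The most frequently used position is taken as the dominant start site.
        peak = Counter(cluster).most_common(1)[0][0]
        result.append((cluster[0], cluster[-1], len(cluster), peak))
    return result


if __name__ == "__main__":
    toy_tags = [100, 102, 102, 105, 500, 900, 903]  # made-up positions
    for start, end, count, peak in cluster_cage_tags(toy_tags):
        print(f"TSS cluster {start}-{end}: {count} tags, peak at {peak}")
```

Each reported cluster is a candidate transcription start site; the single tag at position 500 in the toy data is dropped because it has no supporting neighbours.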

About 60,000 transcription start sites found

  • Only half of these match known genes
  • What do the others do? They may explain the high level of transcription
  • The transcription start sites are often far upstream of the gene start, and can overlap genes

Transcriptional start sites from the DONSON gene

Overlapping Genes

  • An overlapping gene, starting far upstream
  • The DONSON gene is a known gene
  • However, some transcripts start in the ATP5O gene, and include some ATP5O exons
  • The two genes in between are skipped

Chromatin: histones and nucleosomes

  • Nucleosomes are formed from DNA that is packaged around histones
  • Histones are a set of proteins that usually associate as an octamer


DNase I hypersensitive sites (DHS)

  • DNase I preferentially digests nucleosome-depleted regions (DNase I hypersensitive sites)
  • These are associated with gene transcription
  • Chromatin is digested with DNase I, which only digests nucleosome-free regions
  • The remaining DNA is isolated, and put on a microarray or sequenced
  • Find the open, active regions of the genome

Image credits: Hebbes Lab, University of Portsmouth, UK; Gilbert, Developmental Biology, Sinauer


DNase I hypersensitive sites

  • In total, there are about 3 million DNase I hypersensitive sites in the genome, covering about 15% of it (versus about 40,000 genes covering about 4%)
  • Transcriptional start sites are regions of DNase I hypersensitivity, as expected
  • Most DNase I hypersensitive sites are not associated with transcriptional start sites, though
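
The split between hypersensitive sites at transcription start sites and those elsewhere can be illustrated with a small sketch. Everything here is an assumption for illustration: the function name `split_dhs_by_tss`, the 2,500 bp `window`, and the toy coordinates are not taken from the ENCODE analysis, and all positions are assumed to lie on the same chromosome.

```python
from bisect import bisect_left


def split_dhs_by_tss(dhs_sites, tss_positions, window=2500):
    """Split DNase I hypersensitive sites into TSS-proximal and distal groups.

    dhs_sites:     list of (start, end) intervals.
    tss_positions: list of integer transcription start site positions.
    window:        hypothetical distance (bp) within which a site counts
                   as being "at" a transcription start site.
    Returns (proximal, distal), two lists of (start, end) intervals.
    """
    tss_sorted = sorted(tss_positions)
    proximal, distal = [], []
    for start, end in dhs_sites:
        # Find the first TSS at or to the right of (start - window) ...
        i = bisect_left(tss_sorted, start - window)
        # ... and check whether it also lies to the left of (end + window).
        near = i < len(tss_sorted) and tss_sorted[i] <= end + window
        (proximal if near else distal).append((start, end))
    return proximal, distal


if __name__ == "__main__":
    toy_dhs = [(1000, 1200), (50000, 50300), (9000, 9150)]  # made-up sites
    toy_tss = [1100, 12000]                                 # made-up TSSs
    proximal, distal = split_dhs_by_tss(toy_dhs, toy_tss)
    print("At a TSS:", proximal)
    print("Distal:  ", distal)
```

Sorting the start sites once and using a binary search keeps the lookup fast even for millions of sites, which matters at the scale quoted above.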


[Figure labels: transcribed region, transcription start sites, DNase I hypersensitive region]



Histone Modification Effects

  • Modifications occur on the histone tails
  • They alter the strength of DNA-histone binding, and influence the binding of other proteins to the DNA
  • Thus they can activate or silence gene expression

The “Histone Code”

  • The combination of histone modifications determines a gene’s transcriptional status: this is the histone code
  • Some modifications are associated with active gene expression
    • H3K4me2
    • H3K4me3
    • H3ac
    • H4ac
  • Some with repression
    • H3K27me3
    • H3K4me1
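
The idea of reading a combination of marks can be sketched as a simple lookup. The two groups below are copied from the lists on this slide; the majority-vote rule and the function name `summarise_marks` are hypothetical simplifications for illustration, and real chromatin is not decoded this simply.

```python
# Grouping of marks exactly as listed on the slide.
ACTIVE_MARKS = {"H3K4me2", "H3K4me3", "H3ac", "H4ac"}
REPRESSIVE_MARKS = {"H3K27me3", "H3K4me1"}


def summarise_marks(marks):
    """Summarise a set of histone marks as active, repressed, or ambiguous.

    marks: iterable of mark names found at one region.
    Uses a hypothetical majority vote between the two groups above;
    marks that appear in neither group are ignored.
    """
    found = set(marks)
    n_active = len(found & ACTIVE_MARKS)
    n_repressive = len(found & REPRESSIVE_MARKS)
    if n_active > n_repressive:
        return "active"
    if n_repressive > n_active:
        return "repressed"
    return "ambiguous"


if __name__ == "__main__":
    print(summarise_marks(["H3K4me3", "H3ac"]))      # marks from the active list
    print(summarise_marks(["H3K27me3"]))             # mark from the repression list
    print(summarise_marks(["H3K4me3", "H3K27me3"]))  # one mark from each list
```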


ChIP (Chromatin immunoprecipitation)

  • A method to find where in the genome your protein of interest binds
  • Cross-link the proteins to the DNA in the sample, and fragment the DNA into pieces
  • Immunoprecipitate using an antibody to your protein of interest
  • Reverse the cross-links, and isolate the DNA
  • To find where in the genome the protein was bound, hybridise the DNA to a microarray (ChIP-chip) OR sequence it (ChIP-seq)
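
For ChIP-seq, the last step comes down to asking where the immunoprecipitated sample has more reads than expected. The sketch below is one simple way to express that, under assumptions that are not in the slides: reads are already counted in fixed-size bins, a matched input (non-immunoprecipitated) control is available, and the `pseudocount` and `threshold` values are arbitrary illustrative choices.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin fold enrichment of ChIP reads over an input control.

    chip_counts, input_counts: read counts per genomic bin (same length).
    pseudocount: hypothetical small value that avoids division by zero.
    The ChIP counts are first scaled to the sequencing depth of the input.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    if chip_total == 0 or input_total == 0:
        raise ValueError("both samples need at least one read")
    scale = input_total / chip_total
    return [
        (chip * scale + pseudocount) / (ctrl + pseudocount)
        for chip, ctrl in zip(chip_counts, input_counts)
    ]


def enriched_bins(chip_counts, input_counts, threshold=3.0):
    """Indices of bins whose fold enrichment reaches a hypothetical threshold."""
    scores = fold_enrichment(chip_counts, input_counts)
    return [i for i, score in enumerate(scores) if score >= threshold]


if __name__ == "__main__":
    toy_chip = [2, 3, 40, 2, 3]   # made-up read counts per bin
    toy_input = [5, 5, 5, 5, 5]
    print("Candidate binding bins:", enriched_bins(toy_chip, toy_input))
```

Bins that pass the threshold are candidate binding sites for the protein (or histone modification) the antibody recognises.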


Histone modification profiles

  • They found that histone modifications associated with active transcription were found around transcription start sites
  • They found that histone modifications associated with gene repression were depleted around transcription start sites
  • This is as expected
  • Around DNase I hypersensitive sites not near transcription start sites, they found almost the opposite pattern

Enrichment of active histone marks and depletion of inactive histone marks at a transcription start site

Enrichment of inactive histone marks but little enrichment of active histone marks at a DNase I hypersensitive site


Histone modification profiles

  • They also found other patterns
  • Combining all the results (plus results for transcription factor binding), they say that the human genome is divided into seven different types of chromatin states
  • Which state it is depends on what combination of histone modifications/transcription factor binding there is
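
Dividing the genome into states can be pictured as labelling each small bin with a state and then merging neighbouring bins that share a label. The sketch below shows only that merging step; it assumes the per-bin labels have already been produced by some other method, and the function name `states_to_segments`, the 200 bp `bin_size`, and the toy labels are illustrative assumptions.

```python
from itertools import groupby


def states_to_segments(bin_states, bin_size=200):
    """Merge consecutive bins with the same state into genome segments.

    bin_states: list of state labels, one per fixed-size bin, in genome order.
    bin_size:   hypothetical bin width in base pairs.
    Returns a list of (start, end, state) tuples with half-open coordinates.
    """
    segments = []
    position = 0
    for state, run in groupby(bin_states):
        length = sum(1 for _ in run) * bin_size
        segments.append((position, position + length, state))
        position += length
    return segments


if __name__ == "__main__":
    toy_bins = ["promoter", "promoter", "gene body",
                "gene body", "gene body", "inactive"]  # made-up labels
    for start, end, state in states_to_segments(toy_bins):
        print(f"{start:>5}-{end:<5} {state}")
```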

The seven chromatin states

Enhancer (yellow)

Gene body (green)

Inactive region (grey)

Promoter (red)



Grand Summary

Transcription start sites:

• Twice as many transcription start sites as traditional “genes”

• transcripts span large regions, even between genes

DNase I hypersensitive sites:

• more than just at transcription start sites

• two types: those found at TSS, and those found in other regions

• these have different chromatin profiles


Transcription:

• a lot of non-coding transcription (~60% of the genome transcribed), much more than needed just to transcribe all the genes

Histone modifications:

• active marks correlate with TSS/DHS

• distal DHS have a different histone modification profile

Chromatin states:

• the genome can be divided into seven different types

• these are determined by the combination of histone modifications and transcription factor binding that occur

• the function of some of these states is known (e.g. promoter)

• the function of others is not known, but may explain the high level of transcription and open chromatin structure