slide1 n.
Skip this Video
Loading SlideShow in 5 Seconds..
Genome organization PowerPoint Presentation
Download Presentation
Genome organization

Loading in 2 Seconds...

play fullscreen
1 / 46

Genome organization - PowerPoint PPT Presentation

  • Uploaded on

Eukaryotic genomes are complex and DNA amounts and organization vary widely between species. . Genome organization. Genome Organization. G. C value paradox:. The amount of DNA in the haploid cell of an organism is not related to its evolutionary complexity or number of genes.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Genome organization' - atalaya

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Eukaryotic genomes are complex and DNA amounts and organization vary widely between species.

Genome organization

c value paradox
C value paradox:

The amount of DNA in the haploid cell of an organism is not related to its evolutionary complexity or number of genes.


Amount of DNA in a Genome Does Not Correlate with Complexity

105 106 107 108 109 1010 1011 1012



How many genes do humans have?

Original estimate was between 50,000 to 100,000 genes

We now think humans have ~ 20,000 genes

How does this compare to other organisms?

Mice have ~30,000 genes

Pufferfish have ~35,000

Nematodes (C. elegans), have ~19,000

Yeast (S. cerevisiae) has ~6,000

The microbe responsible for tuberculosis has ~4,000


Some gene products are RNA (tRNA, rRNA, others) instead of protein

Some nucleic acid sequences that do not encode gene products (noncoding regions) are necessary for production of the gene product (protein or RNA).

Eukaryotic genes are complex!

gene identification
Gene Identification
  • Open reading frames
  • Sequence conservation
    • Database searches
    • Synteny
  • Sequence features
    • CpG islands
  • Evidence for transcription
    • ESTs, microarrays
  • Gene inactivation
    • Transformation, RNAi
noncoding regions
Noncoding regions
  • Regulatory regions
    • RNA polymerase binding site
    • Transcription factor binding sites
  • Introns
  • Polyadenylation [poly(A)] sites
splice sites
Splice Sites

Eukaryotes only

Removal of internal parts of the newly transcribed RNA.

Takes place in the cell nucleus

Splice sites difficult to predict


One gene, many proteins

via alternative splicing , 3’ cleavage and polyadenlyation

trans splicing in higher eukaryotes

Trans-Splicing in Higher Eukaryotes

Gingeras, Nature (2009) 461, 206-211

non contiguous transcription generates an enormous number of possible transcripts

Non-contiguous Transcription Generates An Enormous Number of Possible Transcripts

• Trans-splicing exists in higher eukaryotes as well as in lower ones like Trypanosomes

Blue: only co-linear

Red: all combinations

Six 2-exon co-linear combinations from four exons

325 combinations of 3-exons, non-colinear

• Reassortment of exons coding for ncRNA or protein domains could dramatically increase number of functional products beyond the number of ‘genes’

Gingeras, Nature (2009) 461, 206-211

why genome size isn t the only concern size doesn t matter
Why genome size isn’t the only concern (size doesn’t matter?)
  • More sophisticated regulation of expression?
  • Proteome vastly larger than genome?
    • Alternate splicing
    • RNA editing
  • Postranslational modifications?
  • Cellular location?
  • Moonlighting
gene families
Gene families

E.g. globins, actin, myosin

Clustered or dispersed



Nonfunctional copies of genes

Formed by duplication of ancestral gene, or reverse transcription (and integration)

Not expressed due to mutations that produce a stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences

duplicated genes
Duplicated genes

Encode closely related (homologous) proteins

Formed by duplication of an ancestral gene followed by mutation

  • Five functional genes and two pseudogenes

Different members of the globin gene family are paralogs, having evolved one from another through gene duplication. Paralogs are separated by a gene duplication event.

Each specific gene family member (e.g. a specific gene in human) is an ortholog of the same family member in another species (e.g. mouse). Both evolved from an ancestral globingene. Orthologs are separated by a speciation event.

It is not always easy to distinguish true orthologs from paralogs , especially in polyploid organisms!

noncoding rnas ncrna
Noncoding RNAs (ncRNA)

Do not have translated ORFs


Not polyadenylated

functions of known lncrnas

Functions of Known lncRNAs

• Transcriptional interference

-lncRNA transcription turns off transcription of nearby gene

• Initiation of chromatin remodeling

- lncRNA transcription turns on transcription of nearby gene

• Promoter inactivation

- lncRNA binds to TFIIB and to promoter DNA

• Activation of an accessory protein

- lncRNA binds to allosteric effector protein TLS and inhibits

histone acetyltransferase, decreasing transcription

Ponting et al, Cell (2009) 136, 629-641

functions of known lncrnas1

Functions of Known lncRNAs

• Activation of transcription factors

- binding of lncRNA to Dlx2 activates Dlx5/6 activity

• Oligomerization of an accessory protein

- lncRNA induces heat shock factor trimerization

• Transport of transcription factors

-lnRNA NRON keeps NFAT out of nucleus

• Epigenetic silencing of gene clusters

-Xist RNA inactivates X chromosome

• Epigenetic repression of genes in trans

-HOTAIR binds PRC2, leading to methylation and silencing of several genes in HOXD locus

Ponting et al, Cell (2009) 136, 629-641



  • ~97-98% of the transcriptional output of the human genome is ncRNA
    • Introns
    • Transfer RNAs (tRNA)
      • ~ 500 tRNA genes in human genome
    • Ribosomal RNAs
      • Tandem arrays on several chromosomes
      • 150-200 copies of 28S – 5.8S – 18S cluster
      • 200-300 copies of 5S cluster

Genome Organization - ncRNA

The level of transcription from human chromosomes 21 and 22 is an order of magnitude higher than can be accounted for by known or predicted exons

Almost half of all transcripts from well-constructed mouse cDNA libraries are ncRNAs (identified because they do not code for an open reading frame of larger than 100 codons)

repetitive dna
Repetitive DNA
  • Moderately repeated DNA
    • Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
    • Large duplicated gene families
    • Mobile DNA
  • Simple-sequence DNA
    • Tandemly repeated short sequences
    • Found in centromeres and telomeres (and others)
    • Used in DNA fingerprinting to identify individuals
segmental duplications
Segmental duplications

Found especially around centromeres and telomeres

Often come from nonhomologous chromosomes

Many can come from the same source

Tend to be large (10 to 50 kb)

Unique to humans?

mobile dna
Mobile DNA
  • Moves within genomes
  • Most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
    • L1 LINE is ~5% of human DNA (~50,000 copies)
    • Alu is ~5% of human DNA (>500,000 copies)
  • Some encode enzymes that catalyze movement