Functional Non-Coding DNA, Part I: Non-coding genes and non-coding elements of coding genes

BNFO 602/691

Biological Sequence Analysis

Mark Reimers, VIPBG


What Does ‘Functional Non-Coding DNA’ Mean?

  • DNA whose sequence affects transcripts made from DNA in some way

  • Could affect transcription levels, splicing or sequestering of RNA

  • Three main ways to identify functional non-coding elements

    • Sequence characteristics – favored bases

    • Genomic conservation

    • Epigenetic marks and open chromatin

      • especially outside of genes
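Genomic conservation, the second approach listed above, can be illustrated with a toy per-column identity score over an aligned block. This is only a minimal sketch: the function name `column_identity` and the three-sequence alignment are made up for illustration, and real analyses rely on phylogeny-aware scores rather than raw identity.

```python
from collections import Counter


def column_identity(alignment):
    """For each alignment column, return the fraction of sequences that
    carry the most common base. Gaps ('-') count as mismatches."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        bases = [b.upper() for b in column if b != "-"]
        if not bases:
            # Column made only of gaps: no evidence of conservation
            scores.append(0.0)
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical aligned block from three species
example = ["ACGTA", "ACGAA", "ACCTA"]
scores = column_identity(example)
```

Columns where every species shares the same base score 1.0; columns with one differing base out of three score about 0.67.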


Types of Non-Coding Elements

  • Non-coding RNAs

    • miRNAs, lncRNAs, etc

  • Non-coding gene elements

    • UTRs, splice sites and splicing regulatory elements, poly-adenylation sites, RNA-binding sites

  • DNA elements outside genes – our main focus

    • Promoters

    • Enhancers/Silencers

    • Insulators


Types of Non-Coding RNA

  • microRNAs

  • Silencing RNAs

  • Small nuclear/nucleolar RNAs

  • Piwi-Interacting RNAs

  • Long Non-Coding RNAs

  • Circular RNAs

  • Still other RNAs???

  • Comprehensive database at www.ncrna.org


Micro-RNAs

  • Micro-RNAs are small non-coding RNA molecules, about 21–25 nucleotides in length

  • They are processed from much longer genes, or from introns within mRNA, by several molecular pathways

  • Micro-RNAs base-pair with complementary sequences within mRNA molecules, often in 3’ or 5’ UTR.

  • miRNA binding usually results in gene repression either via translational stalling or by triggering mRNA degradation

Image by Charles Mallery, U of Miami
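The base-pairing described above is often approximated computationally by searching a UTR for the reverse complement of the miRNA "seed" (nucleotides 2–8 counted from the 5' end). The following is a minimal sketch under that assumption; the function names and the toy sequences are hypothetical, and real target-prediction tools weigh many more features than an exact seed match.

```python
# Translation table for complementing RNA bases
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (A<->U, C<->G)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def seed_match_positions(mirna, utr):
    """Return 0-based start positions in `utr` that are perfectly
    complementary to the miRNA seed (nucleotides 2-8).
    DNA input (T instead of U) is accepted for both arguments."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:8]                      # nucleotides 2-8, 7 bases
    target = reverse_complement_rna(seed)  # what the UTR must contain

    positions = []
    start = utr.find(target)
    while start != -1:
        positions.append(start)
        start = utr.find(target, start + 1)  # allow overlapping hits
    return positions


# Toy sequences, for illustration only
toy_mirna = "UAGCUUAUCAGACUGAUGUUGA"
toy_utr = "GGGAUAAGCUCCCAUAAGCUAA"
hits = seed_match_positions(toy_mirna, toy_utr)
```

Here `hits` lists the two positions in the toy UTR where the seed complement `AUAAGCU` occurs.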


Micro-RNAs

  • The human genome encodes over 1500 miRNAs, which are believed to affect more than half of human genes

  • miRNAs are abundant in many cell types

    • Thousands of copies per cell of some miRNAs

    • Those within gene introns share regulation

  • miRNAs are well-conserved across vertebrates

    • No orthologs between plant and animal miRNAs

    • miRBase is the comprehensive repository of micro-RNAs


Other Short RNAs: siRNA

  • Small interfering RNAs are double-stranded with an overhang

  • They are processed by some of the same machinery as miRNAs and have some of the same effects


Other Short RNAs: piRNA

  • Piwi-Interacting RNAs are longer 26-31 base single-stranded RNAs

    • PIWI (P-element Induces Wimpy Testis) protein

  • Over 50,000 sequences known in mouse

    • They are the largest class of nc-RNA

  • They seem to play an ancient role in defense against retro-viruses and transposons


Other Short RNAs: snRNAs & snoRNAs

  • Small nuclear RNAs (snRNAs) are typically ~ 150 bases long, and associate with protein

    • Many conserved copies of each snRNA gene

    • U1-U6 snRNAs key parts of splicing machinery

  • Small nucleolar RNAs (snoRNAs)

    • Guide chemical modifications of other RNAs

    • Prader-Willi syndrome results from deletion of region containing 29 copies of SNORD116 on chr 15q11

U6 snRNA


Long Non-Coding RNAs

  • Many long (>200bp) stretches of genome are transcribed and have epigenetic marks like those of protein-coding genes

  • Most of these are spliced RNAs with two (or more) exons

  • GENCODE v15 has 13.5K lncRNA

  • See also

    • Derrien et al, Genome Research 2012

    • Lee, Science 2012

From Derrien et al Genome Res 2012


Many lncRNAs Induce Silencing

  • Coat nearby gene(s) and silence them

  • Xist binds to gene clusters first

  • Xist binds disparate parts of chromosome

  • Many lncRNA are antisense to genes

  • Some lncRNAs maintain pluripotency of stem cells

From Jeannie Lee lab (Harvard) website


Long Non-Coding RNAs - 2

  • Most lncRNAs are expressed in only a few tissues

  • Most human lncRNAs are specific to the primate lineage

From Derrien et al Genome Res 2012


Circular RNAs

  • Several thousand non-coding RNAs apparently form circular structures

  • Many form complexes with AGO and seem to absorb attached miRNAs, blocking processing

  • CDR1as (the circular antisense transcript of CDR1) has about 70 conserved binding sites for miR-7


Functional Pseudo-Genes

  • Pseudo-genes are decaying copies of genes that rarely, if ever, make proteins

  • Some pseudo-genes act to absorb negative regulators of the original gene – e.g. SRGAP2B


How to Identify Non-Coding RNAs?

  • Short (and long) RNA transcriptomes

  • Promoter chromatin marks for independent (non-embedded) miRNAs and lncRNAs


DEMO: Display HOTAIR & XIST Tracks in UCSC Browser


Non-Coding Elements of Genes

  • TSS

  • 5' UTRs

  • Introns

  • Splicing regulation sites

  • 3' UTRs

  • Termination/Poly-adenylation sites


Transcription Start Sites

  • Transcription of most genes may initiate at several distinct clusters of locations with distinct promoters for each TSS

  • Two major types of metazoan TSS: CG-rich broad TSS, and narrow (often tissue-specific) TSS


Transcription Start Sites

Transcription often starts at CG within promoter
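The CG-rich character of broad promoters is commonly summarized by two numbers: GC content and the observed/expected CpG ratio. The sketch below computes both for a single sequence. The function name and example sequences are illustrative, and the formula (CpG count × length / (C count × G count)) is the widely used CpG-island definition, not something stated in the slides.

```python
def cpg_stats(seq):
    """Return (gc_content, cpg_observed_over_expected) for a DNA sequence.

    observed/expected = (#CG dinucleotides * length) / (#C * #G)
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")

    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so count() is exact

    gc_content = (c + g) / n
    # If either base is absent, no CpG is expected; report 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


# Illustrative inputs: a CpG-dense stretch and an AT-only stretch
dense = cpg_stats("CGCGCGCG")
sparse = cpg_stats("ATATAT")
```

A ratio well below 1 is typical of bulk vertebrate DNA, where CpG is depleted; CG-rich promoters stand out with higher values.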


5’ Untranslated Regions

  • First exon often contains dozens to thousands of bases before Start codon (median 150)

  • Sometimes contains regulatory sequences, e.g. binding sites for RNA binding proteins, and translation initiators


Splice Regulatory Sites

  • Splicing is achieved through binding of spliceosome to recognition sequences on nascent RNA molecule
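The best-known recognition sequences are the nearly invariant dinucleotides at intron ends: GT (GU in RNA) at the 5' donor site and AG at the 3' acceptor site. A minimal sketch that checks this rule on an intron sequence follows; the function name and toy introns are hypothetical, and real splice-site models also score the flanking bases, the branch point and the polypyrimidine tract.

```python
def has_canonical_splice_sites(intron):
    """Return True if the intron starts with GT and ends with AG.
    Accepts DNA or RNA (U is treated as T)."""
    s = intron.upper().replace("U", "T")
    # Need at least 4 bases so donor and acceptor do not overlap
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


# Toy introns, for illustration only
canonical = has_canonical_splice_sites("GTAAGTCCCTTTCAG")
non_canonical = has_canonical_splice_sites("ATAAGTCCCTTTCAC")
```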


Splice Regulatory Sites

  • Tissue-specific splice regulatory sites are highly conserved

From Merkin et al Science 2012


Splicing Patterns Evolve in All Tissues Except Brain

From Merkin et al Science 2012


Non-Coding Elements in Coding Exons

  • Many regulatory sites occur within coding exons, esp. toward 5’ end

  • These constrain some codons as much as protein sequence

  • Many human SNPs break transcription factor binding sites (TFBS) but, as far as we know, have little effect on the protein

From Stergachis et al Science 2013


3’ Untranslated Regions

  • Longest exon is usually 3’UTR (>1000 nt)

  • Typically 1/3 – 1/2 of a gene is in 5’ & 3’ UTRs

  • 3’UTR has binding sites for miRNAs and RNA binding proteins

  • AU-rich elements (AREs) regulate mRNA stability, most often by promoting decay

  • Proteins recognize complex secondary structure

GRIK4 3’UTR secondary structure is conserved
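The "1/3 – 1/2" figure above is simple arithmetic on transcript coordinates. The sketch below assumes the fraction is measured on the mature (spliced) transcript, with the coding sequence given as a 0-based, end-exclusive interval; the function name and the example numbers are hypothetical.

```python
def utr_summary(transcript_length, cds_start, cds_end):
    """Return 5' UTR length, 3' UTR length and the UTR fraction of a
    spliced transcript. Coordinates are 0-based, end-exclusive."""
    if not (0 <= cds_start < cds_end <= transcript_length):
        raise ValueError("CDS must lie within the transcript")

    utr5 = cds_start                      # bases before the start codon
    utr3 = transcript_length - cds_end    # bases after the stop codon
    return {
        "utr5": utr5,
        "utr3": utr3,
        "utr_fraction": (utr5 + utr3) / transcript_length,
    }


# Hypothetical 3000 nt transcript with a 1800 nt coding sequence
summary = utr_summary(transcript_length=3000, cds_start=150, cds_end=1950)
```

In this made-up example the UTRs account for 1200 of 3000 nt, a fraction of 0.4, which falls inside the range quoted on the slide.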


RNA Binding-Protein Sites

  • mRNAs are usually further processed (e.g. transported or sequestered)

  • RNA binding proteins recognize specific motifs within secondary structure of 3’ or 5’ UTR

  • These sites are often highly conserved

From Ray et al Nature 2013


Poly-adenylation/Termination Sites

  • Transcripts can be terminated and poly-adenylated at sites with specific sequences

  • Most genes have alternate poly-adenylation sites

  • Median lengths of 3’UTR are 250 & 1773 bp (mouse)
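The most common of these specific sequences is the hexamer AAUAAA (AATAAA in DNA), with AUUAAA as a frequent variant, usually located roughly 10–30 nt upstream of the cleavage site. A minimal sketch that scans a 3' UTR for both hexamers follows; the function name and toy sequence are hypothetical, and a hexamer hit alone does not prove that a site is used.

```python
import re

# Canonical poly-adenylation signal and its most common variant (DNA alphabet)
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(utr):
    """Return sorted (position, hexamer) tuples for every poly(A) signal
    hexamer in the sequence. Accepts DNA or RNA; positions are 0-based."""
    s = utr.upper().replace("U", "T")
    hits = []
    for hexamer in POLYA_SIGNALS:
        # Lookahead keeps overlapping occurrences
        for match in re.finditer(f"(?={hexamer})", s):
            hits.append((match.start(), hexamer))
    return sorted(hits)


# Toy 3' UTR with one canonical and one variant signal
signals = find_polya_signals("CCCAATAAAGGGATTAAACC")
```

A UTR with several hits is a candidate for alternate poly-adenylation, which is what distinguishes the short and long 3’UTR isoforms.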


Poly-adenylation/Termination Sites

  • Rapidly proliferating cells express gene isoforms with short 3’ UTRs

  • Neurons typically have longer 3’ UTRs

Types of alternate poly-adenylation

Elkon et al, NRG 2013


DEMO: GAPDH and GABRA1 in UCSC Browser

