


Kendra Baughman

York Marahrens’ Lab

UCLA

Finding Sequence Motifs in Alu Transposons that Enhance the Expression of Nearby Genes



Overview

Goal

Background

Prior Studies

Strategy

Results

Remaining Tasks

Future Directions



Goal

Determine if there are motifs present among Alu elements near highly expressed genes, and missing from Alu elements near poorly expressed genes, that might contribute to gene expression



Background – Alu Elements

Repetitive sequence

Transposons (DNA sequences that make copies of themselves and insert elsewhere in the genome)

Over 1 million in human genome

~50 subfamilies categorized by sequence differences



Prior Studies

“Repetitive sequence environment distinguishes housekeeping genes”

Eller, Daniel et al., submitted

“Alu abundance positively correlates with gene expression level”

C.D. Eller et al., submitted



Higher Alu concentration near widely expressed genes



Higher Alu concentration near highly expressed genes


Alu Subfamilies

[Chart: # Alu in the subfamily, plotted by subfamily]



Data

Human gene expression levels from microarray data (Stan Nelson’s lab, UCLA)

Alu information from the UCSC Genome Browser, RepeatMasker tracks



Goal, reiterated

Determine if there are motifs present among Alu elements near highly expressed genes, and missing from Alu elements near poorly expressed genes, that might contribute to gene expression



Strategy

Find Alu “near” high and low expression genes (within 20kb)

Perform multiple sequence alignment on Alu sequences

Identify motifs preferentially conserved around highly expressed genes (these motifs could help the genes be highly expressed)




Screening the genes…

Used Perl scripts to extract information from MySQL databases

Grouped genes by expression level in R

Chose genes in top and bottom 20%

[Plot: genes ranked by expression level]
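The gene-screening step can be illustrated with a short sketch. The slides state that the grouping was done in R; the Python below is only an illustration of the "top and bottom 20%" selection, and the function name `split_by_expression` and the expression values are hypothetical.

```python
def split_by_expression(expression, fraction=0.2):
    """Return (low_genes, high_genes): the bottom and top `fraction` of genes.

    expression: dict mapping gene name -> expression level (hypothetical input format).
    """
    ranked = sorted(expression, key=expression.get)  # lowest expression first
    n = int(len(ranked) * fraction)                  # number of genes in each tail
    return ranked[:n], ranked[len(ranked) - n:]


# Made-up expression levels for ten genes, for illustration only.
expression = {
    "g1": 0.5, "g2": 9.1, "g3": 3.3, "g4": 7.8, "g5": 1.2,
    "g6": 4.4, "g7": 6.0, "g8": 2.7, "g9": 8.5, "g10": 5.1,
}
low_genes, high_genes = split_by_expression(expression)
print("LO genes:", low_genes)
print("HI genes:", high_genes)
```

Slicing with `len(ranked) - n` rather than `-n` keeps the high group empty when `n` is zero, instead of returning every gene.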


Screening the Alu…

  • Used MySQL queries to determine flanking region

  • Used Perl scripts to screen Alu located within 20kb of genes

  • Omitted Alu in overlapping flanking regions

Percentages of Alu thrown out, by flanking-region size:

  Flank    Chrom1 (1st 20 Mb)    Chrom10    Chrom19 (1st 20 Mb)
  10kb     3%                    6%         20%
  20kb     7%                    7%         28%
  50kb     17%                   11%        50%

[Diagram labels: LO-gene, HI-gene, HI-Alu, ??-Alu, LO-Alu]
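The Alu screening rule described above (keep Alu within 20kb of a gene, omit Alu in overlapping flanking regions) could look roughly like the sketch below. The slides used MySQL queries and Perl scripts; this Python version is an illustration only. The tuple layouts, the name `screen_alu`, and the choice to count the gene body as part of the window are all assumptions.

```python
FLANK = 20_000  # "within 20kb", as stated on the slide


def screen_alu(genes, alus, flank=FLANK):
    """Map each kept Alu name to the label ('HI' or 'LO') of its single nearby gene.

    genes: list of (chrom, start, end, label)   -- hypothetical layout
    alus:  list of (name, chrom, start, end)    -- hypothetical layout
    Assumption: an Alu is "near" a gene if it overlaps the gene extended by
    `flank` bases on each side.
    """
    kept = {}
    for name, chrom, a_start, a_end in alus:
        hits = [
            label
            for g_chrom, g_start, g_end, label in genes
            if g_chrom == chrom
            and a_end >= g_start - flank
            and a_start <= g_end + flank
        ]
        # No hit: not near any screened gene. Several hits: the Alu sits in
        # overlapping flanking regions and is omitted.
        if len(hits) == 1:
            kept[name] = hits[0]
    return kept


# Made-up coordinates for illustration.
genes = [("chr1", 100_000, 110_000, "HI"), ("chr1", 140_000, 150_000, "LO")]
alus = [
    ("alu1", "chr1", 85_000, 85_300),    # near the HI gene only
    ("alu2", "chr1", 125_000, 125_300),  # inside both flanking regions
    ("alu3", "chr1", 160_000, 160_300),  # near the LO gene only
    ("alu4", "chr1", 300_000, 300_300),  # near no gene
]
print(screen_alu(genes, alus))
```

The loop compares every Alu with every gene, which is fine for a sketch but would likely need an interval index or a database join at genome scale.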





Alignment Process…

  • First alignment tool: ClustalW

    • Slow, inaccurate

  • Second alignment tool: T-COFFEE

    • Can’t handle hundreds of sequences

  • Third alignment tool: MUSCLE

  • Aligning thousands of sequences = big gaps and processing limitations

  • Chose to analyze by subfamily (S, Sp/q)

    • Aligned elements around highly expressed genes

    • Aligned elements around poorly expressed genes

    • Profile high/low alignment

    • Consensus sequence alignment



  • Alignment viewed in Jalview


Alignments of AluSp/q and AluS Elements

[Figure: alignment panels for AluSp/q and AluS; labels: High Alu, High conserv., Low conserv.]




Rows in each panel: Alu consensus sequence (High Alu, Low Alu), each followed by the frequency of the consensus base among all Alu and among Alu w/ a base.

AluSp/q

  High Alu:       TATCCACGCCTGCAAAATCTCAGCCACTCCCAAAGTTGCTGCG
  All Alu:        0444762289674300448576809499545545409449808
  Alu w/ a base:  *5547666896759699995769699999999999*9989979

  Low Alu:        CANCC-CGCCT-CGTAATCCCAA--------AATGTT--TG-G
  All Alu:        76044 55899 37444989894        454045  98 8
  Alu w/ a base:  77488 66899 67444999995        455645  98 9

AluS

  High Alu:       TGCTCAGAAATTTCTCGGCTCACTGCAACCTCCGTATCACCCC
  All Alu:        0860005458443600233333323333333345400000000
  Alu w/ a base:  596**65559458765699999978999999966566******

  Low Alu:        CG---A-AA--------------------CTCCGT--T---CT
  All Alu:        55   4 58                    444544  0   77
  Alu w/ a base:  56   5 69                    555655  6   99
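The two frequency rows can be made concrete with a small sketch. It assumes that "All Alu" means the share of all aligned sequences carrying the consensus base in a column, and that "Alu w/ a base" means the share among sequences that are not gapped in that column. The slides do not define these rows or the one-character encoding they use, so the sketch reports plain fractions, and the name `consensus_profile` is hypothetical.

```python
from collections import Counter


def consensus_profile(aligned):
    """Per alignment column, return (consensus_base, freq_all, freq_with_base).

    aligned: list of equal-length strings, with '-' marking a gap.
    freq_all:       consensus count / number of sequences
    freq_with_base: consensus count / number of sequences without a gap here
    """
    n = len(aligned)
    profile = []
    for column in zip(*aligned):
        bases = [b for b in column if b != "-"]
        if not bases:  # column made up entirely of gaps
            profile.append(("-", 0.0, 0.0))
            continue
        base, count = Counter(bases).most_common(1)[0]
        profile.append((base, count / n, count / len(bases)))
    return profile


# Toy alignment, not taken from the Alu data.
aligned = ["ACG-", "ACGT", "A-GT", "TCG-"]
for base, freq_all, freq_with_base in consensus_profile(aligned):
    print(base, freq_all, freq_with_base)
```

Under these assumptions a column can score low across all Alu yet high among Alu with a base, which is the pattern expected where many elements are gapped at that position.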



Remaining Tasks

Analyze the remaining sub-families

Determine whether identified motifs agree across subfamilies

BLAST motifs against all Alu sequences and correlate alignment scores with expression level
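One possible way to carry out the "correlate alignment scores with expression level" step is a Pearson correlation, sketched below. The slides do not say which statistic was intended, so this choice is an assumption, and the scores and expression values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one motif alignment score per Alu, paired with the
# expression level of the gene that Alu lies near.
scores = [42.0, 38.5, 51.0, 29.0, 47.5]
expression = [7.1, 6.0, 8.3, 3.2, 7.9]
print(round(pearson(scores, expression), 3))
```

A rank-based measure such as Spearman correlation may be a better fit if the relationship between score and expression is not linear.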



Future Directions

Cluster alignments into a relationship tree to see if HI and LO Alu groups cluster differently from each other

Create a matrix of pairwise alignments and cluster these into a tree using nearest neighbour clustering

Use Hidden Markov Models or Gibbs sampling to identify sequence motifs (non-multiple sequence alignment method of motif finding)
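The clustering idea above (a matrix of pairwise alignments clustered into a tree by nearest neighbour) can be sketched as single-linkage agglomerative clustering. The sketch assumes the pairwise alignment scores have already been converted to distances. The function name, the sequence names, and the distance values are hypothetical.

```python
def single_linkage(names, dist):
    """Build a tree (nested tuples) by repeatedly merging the two closest clusters.

    names: list of sequence names
    dist:  square matrix, dist[i][j] = distance between names[i] and names[j]
    Cluster-to-cluster distance is the smallest member-to-member distance
    (nearest neighbour, also called single linkage).
    """
    # Each cluster is (tuple of member indices, subtree).
    clusters = [((i,), names[i]) for i in range(len(names))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a][0] for j in clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = (clusters[a][0] + clusters[b][0], (clusters[a][1], clusters[b][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][1]


# Made-up distances: the two HI Alu are close, as are the two LO Alu.
names = ["HI_1", "HI_2", "LO_1", "LO_2"]
dist = [
    [0, 1, 5, 6],
    [1, 0, 7, 8],
    [5, 7, 0, 2],
    [6, 8, 2, 0],
]
print(single_linkage(names, dist))
```

If HI and LO Alu groups really cluster differently, the top-level split of such a tree would tend to separate them, as it does in this toy input.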



Acknowledgements

Danny Eller

York Marahrens

Marc Suchard

Chiara Sabatti

SoCalBSI

NIH/NSF

