Finding promoters and other important genomic sequences

Lecture 10: Introduction



  • Purpose of Promoter analysis

  • Finding Prokaryotic promoters

  • Finding Eukaryotic promoters

  • Two basic approaches to finding promoters and other regulatory elements

  • Brief reference to some other interesting sequence regions

Promoter Analysis

  • The existence of a “potential” ORF suggests that a promoter should be present nearby; finding one supports the ORF being a real gene.

  • Promoters are essential elements of the DNA sequence. They are usually located upstream of the protein coding sequence (CDS), although some regulatory elements can lie downstream, and they are essential for binding RNA polymerase and the other factors that initiate transcription.

  • They exist in both eukaryotic and prokaryotic organisms.


Proximity of promoters

  • Promoters in prokaryotes have well-defined base-pair (bp) sequences (motifs) upstream of the CDS (true ORF):

    • The Pribnow box: TATAAT at position -10

    • The -35 box: TTGACA at position -35

    • An AT-rich region upstream of the -35 box

  • The -10 and -35 positions are counted in bp upstream of the transcription start site (TSS).

  • Analyse the E. coli pal gene and see if you can find the promoter region and indicate the transcription start site.
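The consensus boxes listed above can be searched for with a few lines of Python. This is only a minimal sketch: the function names, the tolerance of one mismatch per box, and the 15 to 19 bp spacer between the two boxes are assumptions made for illustration (real promoters rarely match the consensus exactly, and the AT-rich region is ignored here).

```python
MINUS_35 = "TTGACA"   # consensus -35 box
MINUS_10 = "TATAAT"   # consensus -10 (Pribnow) box


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=1, spacer=range(15, 20)):
    """Return (pos35, pos10, total_mismatches) for each candidate promoter.

    Positions are 0-based indices of the first base of each box.
    max_mismatch and spacer are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        # Look for a -10 box at each allowed distance after the -35 box.
        for gap in spacer:
            j = i + 6 + gap
            box10 = seq[j:j + 6]
            if len(box10) < 6:
                break  # ran off the end of the sequence
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


# Made-up test sequence: a perfect -35 box, a 17 bp spacer, a perfect -10 box.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGGG"
print(find_promoters(example))
```

A candidate found this way is only a hint; it still needs to sit at a sensible distance upstream of an ORF to be convincing.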

Prokaryotic ORF (pal gene, E. coli)

Adapted from Understanding Bioinformatics 9.3

Basic promoter prediction program

  • Modify your existing code to search for possible promoter regions and to determine the distance from the beginning of the promoter to the start codon.

  • Analyse the region near the pal gene (CDS and promoter) and suggest any other interesting observations about the consequences of the high gene density.
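For the distance calculation in the first exercise, one possible starting point is sketched below. It assumes the promoter position has already been found by your own search code; the function name and parameters are hypothetical, and taking the first ATG downstream of the promoter as the start codon is a simplification (the true start codon may be a later ATG or an alternative codon).

```python
def distance_to_start_codon(seq, promoter_start, box_len=6, start_codon="ATG"):
    """Distance in bp from the first base of a promoter box to the first
    start codon found downstream of that box.

    promoter_start is a 0-based index supplied by the caller.
    Returns None if no start codon follows the promoter.
    """
    seq = seq.upper()
    # Begin the search just after the promoter box itself.
    codon_pos = seq.find(start_codon, promoter_start + box_len)
    if codon_pos == -1:
        return None
    return codon_pos - promoter_start


# Made-up sequence: a -35 box at index 2 and an ATG 26 bp further on.
example = "CC" + "TTGACA" + "C" * 20 + "ATGAAA"
print(distance_to_start_codon(example, promoter_start=2))
```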

Eukaryotic gene promoters

  • Eukaryotic promoters are more complex, and their regulatory elements can often be located long distances from the transcription start site (TSS):

    • While the core promoter is not as well defined as in prokaryotes, it can contain:

      • TATA box

      • CAAT box

      • GC-rich regions

    • Generally the core promoter is in close proximity to the transcription initiation region.

      • Some programs consider a predicted promoter region correct if it lies within:

        • 200 bp on the 5’ side of the TSS

        • 100 bp on the 3’ side of the TSS
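The acceptance window above can be written as a simple check. This is a sketch under the assumption that both distances are measured from an annotated TSS and that coordinates increase along the forward strand; the function name and the strand handling are illustrative, not taken from any particular program.

```python
def is_correct_prediction(predicted_pos, tss_pos, upstream=200,
                          downstream=100, strand="+"):
    """True if a predicted promoter position lies within the tolerance
    window around an annotated TSS (200 bp 5' and 100 bp 3' by default)."""
    offset = predicted_pos - tss_pos
    if strand == "-":
        # On the reverse strand, 5' corresponds to larger coordinates.
        offset = -offset
    return -upstream <= offset <= downstream


print(is_correct_prediction(850, 1000))    # 150 bp 5' of the TSS
print(is_correct_prediction(1150, 1000))   # 150 bp 3' of the TSS
```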

Promoter Analysis

  • Promoter characterisation (discovering transcription factor binding patterns) takes two basic approaches (Baxevanis 2005, Chapter 5):

    • Pattern-driven algorithms: these depend on existing, experimentally annotated binding-site data held in bioinformatics databases.

    • Care must be taken, as this approach can produce false positives because binding sites are variable and short.

    • The analysis of the results must take into account the region surrounding the “putative” promoter site.
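A common way to implement a pattern-driven search is a position weight matrix (PWM) built from known binding sites and slid along the sequence. The sketch below assumes log-odds scoring against a uniform background; the four example sites, the pseudocount, and the threshold are all made up for illustration, so scores from this matrix are not comparable with scores reported by any published tool.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix (one dict per column) from aligned sites.

    Assumes every site has the same length and contains only A, C, G, T.
    """
    pwm = []
    for col in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background)
                    for b in "ACGT"})
    return pwm


def score(pwm, window):
    """Sum the log-odds of each base in a window of the matrix width."""
    return sum(col[base] for col, base in zip(pwm, window))


def scan(seq, pwm, threshold):
    """Return (position, score) for every window scoring >= threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        s = score(pwm, seq[i:i + width])
        if s >= threshold:
            hits.append((i, round(s, 2)))
    return hits


# Hypothetical TATA-like sites, not taken from a real database.
known_sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
pwm = build_pwm(known_sites)
print(scan("GGCGTATAAAGGC", pwm, threshold=5.0))
```

Lowering the threshold quickly produces extra hits in unrelated sequence, which is the false-positive problem noted above: the sites are short and the matrix tolerates variation.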


Promoter region of eukaryotic genes

The figure below shows a number of eukaryotic promoters and illustrates their variability [Klug, 7th ed.]. However, it also illustrates the common features: TATA box…

Example of Pattern Driven approach

Figures A and B show the results for patterns associated with the TATA box.

Note that a score of -8.16 must be reached for a site to be classified as a TATA box “region”.

Figures C and D are associated with the DNA CAP signals (CAP is a transcriptional activator). Do not confuse this with the 5’ RNA cap (cap and poly-A tail).

Promoter Analysis

  • Sequence-driven algorithms: these rest on the assumption that common promoter/regulatory (silencer/enhancer) functionality arises from underlying conserved sequences.

    • Genes that are co-regulated or co-expressed are good candidates for providing data for this approach.

    • Co-regulated genes (switched on/off together) share regulatory elements and often contain similar promoter/regulatory regions (an operon promoter is a simple example of a common promoter).

    • Co-expressed genes (on at the same time) could also have similar promoter/regulatory regions.
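One very simple form of the sequence-driven idea is to look for short words (k-mers) shared by the upstream regions of co-regulated genes. The sketch below assumes exact matching and a fixed word length of 6; the function name, parameters, and example sequences are hypothetical, and real motif-discovery tools allow mismatches and model the background composition.

```python
from collections import Counter


def shared_kmers(sequences, k=6, min_fraction=1.0):
    """Return k-mers present in at least min_fraction of the sequences,
    most widespread first (ties broken alphabetically)."""
    presence = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Use a set so each k-mer counts once per sequence.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(kmers)
    needed = min_fraction * len(sequences)
    return sorted(
        (kmer for kmer, n in presence.items() if n >= needed),
        key=lambda kmer: (-presence[kmer], kmer),
    )


# Made-up upstream regions of three "co-regulated" genes.
upstream = ["AAAATGACGTCCCC", "GGGTGACGTAAAG", "CTCTGACGTGTGT"]
print(shared_kmers(upstream))
```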

Sequence Driven Approach

  • The sequence-driven approach can also be performed across species. This can help find regulatory sites (enhancers/silencers), as opposed to simply the RNA polymerase binding signals of the core promoter.

  • Compare the sequences of genes that are regulated in the same way, or that show similar regulatory patterns, looking for matching segments/motifs.

  • Baxevanis (p. 129) highlights some problems with the cross-species approach:

    • If background conservation is high, such sites are difficult to detect.

    • Some gene regions are more conserved than others.

    • Some important regulatory elements are not conserved across species.
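The background-conservation problem can be seen with a small sliding-window comparison. The sketch below assumes the two sequences have already been aligned to the same length (with "-" for gaps); the window size, the margin over background, and the function names are illustrative choices, not a published method.

```python
def window_identity(aligned_a, aligned_b, window=10):
    """Fraction of identical, non-gap columns in each sliding window of
    two equal-length, pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [a == b and a != "-"
               for a, b in zip(aligned_a.upper(), aligned_b.upper())]
    return [sum(matches[i:i + window]) / window
            for i in range(len(matches) - window + 1)]


def conserved_windows(aligned_a, aligned_b, window=10, margin=0.2):
    """Start positions of windows whose identity exceeds the mean
    (background) window identity by at least margin."""
    scores = window_identity(aligned_a, aligned_b, window)
    if not scores:
        return []
    background = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s >= background + margin]


# Diverged flanks with one conserved block: the block stands out.
print(conserved_windows("A" * 10 + "TATAAAGGCC" + "A" * 10,
                        "C" * 10 + "TATAAAGGCC" + "C" * 10))
# Identical sequences: background is so high that nothing stands out.
print(conserved_windows("ACGT" * 8, "ACGT" * 8))
```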

Other regions: repeating elements

  • Tandem repeats: sequences that are repeated many times in succession (head to tail) within the DNA sequence. These sequences are often associated with CDS regions [Baxevanis p. 297].

  • Inverted repeats: repeating sequences that are inverted and lie on opposite strands. They are often associated with regulatory elements.

    • 5’-ATGC----GCAT-3’

    • 3’-TACG----CGTA-5’

  • There are many other patterns that can be searched for, such as tRNA genes (refer to the E. coli pal gene figure in lecture 9), but these are not covered here. Interested readers can refer to chapters 9 and 10 of Understanding Bioinformatics.

  • SNPs (single nucleotide polymorphisms): single-bp changes, for example in the CDS. These can be associated with certain diseases such as sickle cell anaemia (A -> T, and so Glu -> Val), which changes the structure of haemoglobin. They can also be used in the study of evolution and in genetic fingerprinting (Baxevanis, Chapter 7).
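The inverted repeats described above can be located by looking for a short "arm" that is followed, a few bases later on the same strand, by its reverse complement. This is a minimal sketch assuming exact matches only; the arm length, the maximum loop size, and the function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=4, max_loop=10):
    """Return (left_start, right_start, arm_sequence) for each arm that
    is followed within max_loop bases by its reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop
            if j + arm > len(seq):
                break  # right arm would run off the end
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# ATGC followed four bases later by GCAT, its reverse complement.
print(find_inverted_repeats("ATGCTTTTGCAT"))
```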

Potential exam questions

  • The search for promoters is often used to help indicate the validity of an ORF.

    • Explain two approaches that can be used to find such regions (10 marks)

    • Describe the problems associated with each approach (8 marks)