Global view of enhancer–promoter interactome in human cells

Published in PNAS on May 27, 2014. Presented by CAO Qin at a group meeting. Outline: Biological Background, Contributions, Procedures and Results, Conclusions.

Presentation Transcript


  1. Global view of enhancer–promoter interactome in human cells. Published on May 27, 2014. PNAS. Presented by CAO Qin. Group Meeting.

  2. Outline • Biological Background • Contributions • Procedures and Results • Conclusions

  3. Biological Background

  4. Biological Background • Enhancer • Transcription is often weak in the absence of regulatory DNA regions that lie farther from the transcription start site (TSS); these regions are called enhancers or cis-regulatory modules. • Enhancer sequences contain short DNA motifs that act as binding sites for sequence-specific transcription factors. Figure 3. Enhancers and their features. Image credit: Stark et al., 2014. Transcriptional enhancers: from properties to genome-wide predictions. Nature Reviews Genetics, 15:272-286.

  5. Biological Background • Enhancer • Enhancer prediction • There is no perfect prediction method yet. Table 1. Features used in enhancer prediction and disadvantages of different methods

  6. Biological Background • Enhancer • Linking enhancer to its target gene(s) • There is no perfect linking method yet. • Increasing evidence suggests that many enhancers are not adjacent to their target promoters. Instead, they can lie tens of kilobases away and contact their targets via long-range interactions. • Enhancers are position independent: they may be located either upstream or downstream of the regulated promoter.

  7. Biological Background • Enhancer • Linking enhancer to its target gene(s) • Experimental approaches

  8. Biological Background • Enhancer • Linking enhancer to its target gene(s) • Current computational approaches • Assigning the promoter nearest to an enhancer as its target. • Correlating histone modification patterns at enhancers with transcription levels within a given genomic domain. • Correlating DNase I hypersensitivity signals at enhancers and promoters. • DNase I hypersensitive sites are regions of chromatin that are sensitive to cleavage by the DNase I enzyme. • In these regions, chromatin has lost its condensed structure, which exposes the DNA and makes it accessible; this accessibility is necessary for the binding of proteins such as transcription factors.
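
The nearest-promoter baseline listed on this slide can be sketched in a few lines. This is only an illustration, not code from the paper; the function name `nearest_promoter` and the input format (a sorted list of TSS coordinates on the same chromosome as the enhancer) are assumptions.

```python
from bisect import bisect_left

def nearest_promoter(enhancer_pos, tss_positions):
    """Return the TSS coordinate closest to enhancer_pos.

    tss_positions is assumed to be a sorted, non-empty list of TSS
    coordinates on the enhancer's chromosome (hypothetical input format).
    """
    i = bisect_left(tss_positions, enhancer_pos)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = tss_positions[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda tss: abs(tss - enhancer_pos))

# Made-up coordinates for illustration.
tss = [10_000, 55_000, 120_000]
closest = nearest_promoter(60_000, tss)
```

Because the rule looks only at genomic distance, it can never link an enhancer to a more distant promoter, which is the limitation the later slides address.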

  9. Biological Background • Enhancer • Linking enhancer to its target gene(s) • Disadvantages of current computational approaches • They either focus on the nearest promoter or use only limited types of genomic features. • No rigorous characterization of the performance of these methods has been reported. • Computational work can • Complement experimental protocols • Allow experiments to be prioritized much more efficiently.

  10. Contributions

  11. Contributions • Introduce an integrated method for predicting enhancer targets (IM-PET). • Analyze the global enhancer–promoter (EP) interactome across multiple cell types to gain better insight into the mechanisms of enhancer and promoter communication.

  12. Procedures and Results

  13. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Performance Assessment of the IM-PET Algorithm. • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • EP Interactions Have Higher Cell Type Specificity than Enhancers. • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF.

  14. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Building the training set • P300, a protein found in many enhancer-associated protein complexes, can be regarded as an enhancer marker.

  15. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 1. Enhancer and target promoter activity profile correlation (EPC) • Enhancer activity: score from the CSI-ANN algorithm (a neural-network-based prediction algorithm) using a histone modification signature (three histone modifications: H3K4me1, H3K4me3, and H3K27ac). • Promoter activity: fragments per kilobase of exon sequence per million reads (FPKM) value from RNA-Seq data (expression level). • EPC is the correlation between enhancer score and promoter expression across multiple cell types.
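
The EPC feature on this slide is a correlation between two per-cell-type profiles. A minimal sketch follows, assuming a Pearson correlation (the slide does not say which coefficient is used) and using made-up enhancer scores and FPKM values; `pearson` is a hypothetical helper, not part of IM-PET.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles (one value per cell type)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no correlation signal (assumed convention)
    return cov / sqrt(vx * vy)

# Made-up values: enhancer activity scores and promoter FPKM in four cell types.
enhancer_scores = [0.9, 0.1, 0.8, 0.2]
promoter_fpkm = [30.0, 2.0, 25.0, 5.0]
epc = pearson(enhancer_scores, promoter_fpkm)
```

A pair whose enhancer is active in the same cell types in which the promoter is expressed gets an EPC close to 1.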

  16. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 1. Enhancer and target promoter activity profile correlation (EPC) • The average correlation between real EP pairs is significantly higher than that of non-interacting pairs. • “Real pairs”: the positive training set. • “Non-pairs”: the negative training set. • “All pairs”: EP pairs formed by extracting all promoters within 2 Mbp of an enhancer. • “Nearest pair”: the EP pair in which the promoter is closest to the enhancer among all promoters in the genome.

  17. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 2. Transcription factor and target promoter correlation (TPC) • Enhancer: TF signals • Promoter: expression levels (FPKM) • Real EP pairs have significantly higher TPC scores than non-interacting pairs.

  18. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 3. Coevolution of enhancer and target promoter (COEV) • Question: do true EP pairs tend to coevolve, whereas non-interacting pairs do not? • Evolutionary constraint between interacting EP pairs can be quantified by: • Sequence similarity • Conserved synteny: humans and other mammals not only share most of the same genes, but large blocks of their genomes also contain these genes in the same order.

  19. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 3. Coevolution of enhancer and target promoter (COEV) • Sequence similarity: for an enhancer or promoter, • Extract all of its homologous sequences in 14 mammalian species. • Compute the sequence similarity score between the human sequence and each of its 14 homologous sequences separately. • Conserved synteny: previous studies suggested that a real EP pair is more likely to be maintained in a conserved synteny block among different species. • Define a synteny score for species s: the score is 1 if the distance between the enhancer and the promoter is less than 2 Mbp in species s, and 0 otherwise.
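
The per-species synteny score defined on this slide is a simple indicator. The sketch below assumes each homolog is given as a `(chromosome, coordinate)` tuple, and it treats a missing homolog or a homolog on a different chromosome as score 0; both choices are assumptions, as are the species names and coordinates. How the per-species scores are combined with sequence similarity into COEV is not given on the slide and is not attempted here.

```python
SYNTENY_MAX_DISTANCE = 2_000_000  # 2 Mbp, the cutoff stated on the slide

def synteny_score(enhancer_pos, promoter_pos):
    """Return 1 if enhancer and promoter homologs are < 2 Mbp apart in a species, else 0.

    Each argument is a (chromosome, coordinate) tuple, or None when no
    homolog was found in that species (assumed representation).
    """
    if enhancer_pos is None or promoter_pos is None:
        return 0
    (e_chrom, e_coord), (p_chrom, p_coord) = enhancer_pos, promoter_pos
    if e_chrom != p_chrom:
        return 0  # assumption: different chromosomes count as not syntenic
    return 1 if abs(e_coord - p_coord) < SYNTENY_MAX_DISTANCE else 0

# Hypothetical homolog positions in three species.
homologs = {
    "mouse": (("chr2", 1_000_000), ("chr2", 1_500_000)),
    "dog": (("chr5", 4_000_000), ("chr9", 4_100_000)),
    "cow": (("chr1", 100_000), ("chr1", 3_000_000)),
}
scores = {species: synteny_score(e, p) for species, (e, p) in homologs.items()}
```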

  20. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 3. Coevolution of enhancer and target promoter (COEV) • The COEV scores of real EP pairs are significantly higher than those of non-interacting pairs.

  21. Procedures and Results • A Set of Discriminative Features for Identifying EP Pairs. • Selecting features • Feature 4. Distance constraint between enhancer and target promoter (DIS) • The distance distribution of real EP pairs is significantly different from that of nearest pairs and that of non-interacting pairs.

  23. Procedures and Results • Performance Assessment of the IM-PET Algorithm. • IM-PET algorithm: • Train a random-forest classifier for predicting EP pairs using the four features (EPC, TPC, COEV, and DIS). • Compare IM-PET to four state-of-the-art methods.

  24. Procedures and Results • Performance Assessment of the IM-PET Algorithm. • Compare IM-PET to four state-of-the-art methods: 1. The AUC value of IM-PET is 94%, 27 percentage points higher than that of the Ernst et al. approach (AUC = 67%). 2. At equal false positive rates, the true positive rate of IM-PET is much higher than that of the PreSTIGE, nearest-promoter, and Thurman et al. methods. Figure, panel (E): ROC curve. Numbers next to circles indicate thresholds for predicting EP pairs using the Thurman et al. method. PreSTIGE made two sets of predictions: high- and low-confidence sets.
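
For readers unfamiliar with the metrics on this slide, here is a small sketch of how an AUC and a single ROC point (true positive rate, false positive rate) can be computed from classifier scores. The scores and labels are made up, and `roc_auc` and `tpr_fpr` are hypothetical helper names; this is not the evaluation code used in the paper.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tpr_fpr(scores, labels, threshold):
    """True and false positive rates when predicting 'pair' for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

# Made-up linkage scores for three real pairs (1) and three non-pairs (0).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
point = tpr_fpr(scores, labels, 0.5)
```

Sweeping the threshold traces out the full ROC curve; comparing methods at a fixed false positive rate, as the slide does, means comparing the true positive rates at the same horizontal position on that curve.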

  25. Procedures and Results • Performance Assessment of the IM-PET Algorithm. • IM-PET is able to discriminate true EP pairs from random ones on the RedFly dataset. • Obtain 831 EP pairs in Drosophila melanogaster from the RedFly database that are validated by in vivo transgenic reporter gene assays. • The four selected features are able to discriminate true EP pairs from random ones. • These results suggest that the IM-PET algorithm is generally applicable to a range of species.

  27. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • Prediction procedure: 1. Predict enhancers genome-wide in 12 cell types using CSI-ANN. 2. For each enhancer, extract all promoters within the 2 Mbp window centered at the enhancer. 3. For each candidate EP pair, compute the feature scores of EPC, TPC, COEV, and DIS. 4. Combine the feature scores in the RF model and compute a linkage score for each candidate pair. 5. Set a threshold for linkage scores (computed at 1% FDR based on the training set). 6. Keep as predicted EP pairs those with scores above the threshold.
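
The candidate-extraction and thresholding steps of this procedure can be sketched as follows. The sketch reads "2 Mbp window centered at the enhancer" as plus or minus 1 Mbp, which is an interpretation; `linkage_score` is a stand-in callable for the trained random-forest model, and all identifiers, coordinates, and scores are hypothetical.

```python
HALF_WINDOW = 1_000_000  # assumed reading: 2 Mbp window centered on the enhancer

def candidate_pairs(enhancers, promoters):
    """Yield (enhancer_id, promoter_id) for promoters inside the enhancer's window.

    Both arguments map an identifier to a (chromosome, coordinate) tuple.
    """
    for e_id, (e_chrom, e_pos) in enhancers.items():
        for p_id, (p_chrom, p_pos) in promoters.items():
            if e_chrom == p_chrom and abs(e_pos - p_pos) <= HALF_WINDOW:
                yield e_id, p_id

def predict_pairs(enhancers, promoters, linkage_score, threshold):
    """Keep candidate pairs whose linkage score is above the threshold."""
    return [
        (e_id, p_id)
        for e_id, p_id in candidate_pairs(enhancers, promoters)
        if linkage_score(e_id, p_id) > threshold
    ]

# Toy inputs; the lambda below stands in for the random-forest model.
enhancers = {"E1": ("chr1", 5_000_000)}
promoters = {
    "P1": ("chr1", 5_400_000),   # inside the window
    "P2": ("chr1", 7_500_000),   # too far away
    "P3": ("chr2", 5_000_100),   # different chromosome
}
toy_scores = {("E1", "P1"): 0.92}
predicted = predict_pairs(
    enhancers, promoters,
    lambda e, p: toy_scores.get((e, p), 0.0),
    threshold=0.5,
)
```

The nested loop is quadratic and only suitable for a toy example; a genome-wide run would need sorted coordinates or an interval index.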

  28. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. Cell type-specific enhancers and EP pairs are defined as those occurring in only one cell type. Cell type-specific promoters are defined as those with an expression specificity rank in the top 25%.

  29. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • Prediction validation, computational methods: • Evaluate the predictions using additional ChIA-PET interactions from K562, MCF7, and CD4+ T cells that were not used when training the classifier. • IM-PET achieves the highest F1 score, which is the harmonic mean of precision and recall and quantifies balanced performance. • IM-PET achieves a higher AUC value than the method by Ernst et al.
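
As a reminder of how the F1 score on this slide is defined, here is a short worked example. The counts are made up for illustration and are not results from the paper.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up counts: 80 correct predictions, 20 false predictions, 40 missed pairs.
example_f1 = f1_score(tp=80, fp=20, fn=40)  # precision 0.80, recall about 0.67
```

Because the harmonic mean is dominated by the smaller of the two values, a method cannot reach a high F1 by maximizing precision or recall alone.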

  30. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • Prediction validation, computational methods: • Use high-resolution Hi-C to identify a set of promoter–enhancer interactions in IMR90 cells. • Both ROC curve analysis and F1 score demonstrate that their method has the highest balanced performance.

  31. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • Prediction validation, computational methods: • Test whether predicted pairs significantly overlap with reported eQTL–gene pairs from GM12878 and HepG2 cells. • eQTLs (expression quantitative trait loci) are genomic loci that regulate expression levels of mRNAs. • Their method achieves the highest performance on this test as well, further supporting the conclusion from the previous validations.

  32. Procedures and Results • Genome-Wide Prediction of EP Pairs in 12 Human Cell Types. • Prediction validation, experimental methods: • Using 3C coupled with quantitative PCR. • Select 9 predicted pairs in GM12878 and K562 cells: 4 pairs were predicted only in GM12878 cells, 3 pairs were predicted only in K562 cells, and 1 pair was predicted in both cell types. • Perform 16 experiments and achieve an 81% (13 of 16) validation rate. • False positive rate = FP/(FP+TN) = 30%. • The false-positive rate of a single 5C experiment is 20–47%. • This suggests that their prediction method has accuracy similar to that of 5C.

  34. Procedures and Results • EP Interactions Have Higher Cell Type Specificity than Enhancers. • Although enhancers are known to function in a tissue-specific manner, it is not known quantitatively how and to what extent they contribute to the cell-specific gene expression program in a cell. • The degree of an enhancer is defined as the number of promoters targeted by that enhancer. • Each enhancer targets 2.92 promoters on average.

  35. Procedures and Results • EP Interactions Have Higher Cell Type Specificity than Enhancers. • Although enhancers are known to function in a tissue-specific manner, it is not known quantitatively how and to what extent they contribute to the cell-specific gene expression program in a cell. 1. About 32% of all enhancers are unique to a single cell type, whereas about 49% of the EP interactions are unique to a single cell type. 2. The higher specificity of EP pairs is not an artifact of the different thresholds used for enhancer and EP pair predictions. Figure: cumulative distributions of enhancers and EP pairs that are observed in only 1, 2, and up to 12 cell types.
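
The uniqueness fractions and cumulative distributions on this slide come down to counting, for each enhancer or EP pair, the number of cell types in which it is observed. A minimal sketch with made-up observations follows; the function names and the `(element, cell_type)` input format are assumptions.

```python
from collections import Counter

def cell_type_counts(observations):
    """Map each element (enhancer or EP pair) to its number of distinct cell types."""
    seen = {}
    for element, cell_type in observations:
        seen.setdefault(element, set()).add(cell_type)
    return {element: len(cell_types) for element, cell_types in seen.items()}

def cumulative_fraction(counts, max_cell_types):
    """Fraction of elements observed in at most k cell types, for k = 1..max_cell_types."""
    if not counts:
        return [0.0] * max_cell_types
    histogram = Counter(counts.values())
    total = len(counts)
    running, fractions = 0, []
    for k in range(1, max_cell_types + 1):
        running += histogram[k]
        fractions.append(running / total)
    return fractions

# Made-up observations of four enhancers in three cell types.
observations = [
    ("E1", "K562"), ("E1", "GM12878"),
    ("E2", "K562"),
    ("E3", "HepG2"),
    ("E4", "K562"), ("E4", "HepG2"), ("E4", "GM12878"),
]
counts = cell_type_counts(observations)
curve = cumulative_fraction(counts, max_cell_types=3)
```

The first entry of `curve` is the fraction unique to a single cell type, the quantity reported as about 32% for enhancers and about 49% for EP interactions.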

  36. Procedures and Results • EP Interactions Have Higher Cell Type Specificity than Enhancers. • The results suggest that cell type-specific EP interaction is more prevalent than cell type-specific activity of enhancers. • In other words, nonspecific enhancers may be involved in specific promoter interactions in different cell types.

  37. Procedures and Results • EP Interactions Have Higher Cell Type Specificity than Enhancers. • Cell type-specific target selection may contribute a large part to cell type-specific gene expression. • The expression specificity of the predicted targets is consistent with the predicted EP specificity. (Figure labels: target promoters; high expression.)

  39. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • Multiple enhancers controlling the same promoter have previously been identified and termed “shadow enhancers”. • Previous studies suggested that “shadow enhancers” are important for ensuring the robust expression of genes with a critical role in development. • The degree of a promoter is defined as the number of enhancers that interact with that promoter. • About 40% of promoters interact with two or more enhancers.
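
Enhancer and promoter degrees can be read directly off the list of predicted EP pairs. The following sketch uses a made-up pair list; `degrees` is a hypothetical helper name.

```python
from collections import Counter

def degrees(ep_pairs):
    """Return (enhancer_degree, promoter_degree) counters from (enhancer, promoter) pairs."""
    unique_pairs = set(ep_pairs)  # count each interaction once
    enhancer_degree = Counter(e for e, _ in unique_pairs)
    promoter_degree = Counter(p for _, p in unique_pairs)
    return enhancer_degree, promoter_degree

# Made-up predicted pairs.
pairs = [("E1", "P1"), ("E1", "P2"), ("E2", "P2"), ("E3", "P2"), ("E3", "P3")]
enhancer_degree, promoter_degree = degrees(pairs)

# Average number of promoters targeted per enhancer.
mean_enhancer_degree = sum(enhancer_degree.values()) / len(enhancer_degree)
# Fraction of promoters contacted by two or more enhancers.
multi_enhancer_fraction = (
    sum(1 for d in promoter_degree.values() if d >= 2) / len(promoter_degree)
)
```

On the real predictions, these two summary numbers correspond to the 2.92 promoters per enhancer and the roughly 40% of promoters with two or more enhancers quoted in the slides.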

  40. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • To better understand shadow enhancers and their target promoters, they investigated several features: • 1. Promoter expression specificity • An entropy-based method computes the expression specificity of every promoter in every cell type. • Example from the figure: PromoterA has the lowest expression specificity overall; PromoterB has the highest expression specificity in cell2.
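
The slide does not give the formula for the entropy-based specificity. One plain way to illustrate the idea is the Shannon entropy of a promoter's expression profile across cell types, where low entropy means expression is concentrated in few cell types. The sketch below uses that assumption, so the paper's actual score may differ, and the expression values are made up to mirror the PromoterA and PromoterB example.

```python
from math import log2

def expression_entropy(expression):
    """Shannon entropy (in bits) of an expression profile across cell types.

    Low entropy: expression concentrated in few cell types (high specificity).
    High entropy: expression spread evenly (low specificity).
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("promoter must be expressed in at least one cell type")
    shares = [x / total for x in expression if x > 0]
    return sum(p * log2(1 / p) for p in shares)

# Made-up FPKM values in four cell types (cell1..cell4).
promoter_a = [10.0, 10.0, 10.0, 10.0]  # uniform expression: least specific
promoter_b = [0.0, 40.0, 0.0, 0.0]     # expressed only in cell2: most specific

entropy_a = expression_entropy(promoter_a)
entropy_b = expression_entropy(promoter_b)
```

With four cell types the entropy ranges from 0 bits (one cell type only) to 2 bits (uniform), so ranking promoters by entropy orders them from most to least specific under this assumed definition.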

  41. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • To better understand shadow enhancers and their target promoters, they investigated several features: • 1. Promoter expression specificity • An entropy-based method computes the expression specificity of every promoter in every cell type. • There is a significant positive correlation between the degree of a promoter and the expression specificity of the promoter. • Rank 0: highest specificity. Rank 1: lowest specificity.

  42. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • To better understand shadow enhancers and their target promoters, they investigated several features: • 2. GO term analysis indicates that promoters controlled by three or more enhancers are more enriched in cell type-specific terms.

  43. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • To better understand shadow enhancers and their target promoters, they investigated several features: • 3. There is a significant negative correlation between enhancer sequence conservation and target promoter degree, suggesting that shadow enhancers are less conserved.

  44. Procedures and Results • Promoters with High Expression Specificity Are Regulated by Multiple Enhancers That Have Lower Conservation Levels. • In summary: • 1. Promoters with high expression specificity are more likely to be regulated by multiple shadow enhancers. • 2. Shadow enhancers are less conserved. • 3. There may exist a genetic backup mechanism for EP communication to ensure accurate and robust cell type-specific gene expression.

  46. Procedures and Results • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF. • Previous understanding: • CTCF is the best-characterized mammalian insulator-binding protein; insulators block enhancer–promoter interaction. • New studies suggest that: • CTCF can mediate chromatin loop formation between distal regulatory elements and promoters. • Because the role of CTCF extends well beyond that originally attributed to insulator proteins, and its functional effects are based on its ability to mediate interactions between distant sequences, the term 'architectural' has been proposed instead of 'insulator' to describe this type of protein [1]. [1] Ong et al., CTCF: an architectural protein bridging genome topology and function. Nature Reviews Genetics 15, 234–246 (2014)

  47. Procedures and Results • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF. • The cohesin complex has been shown to co-localize with CTCF and to facilitate CTCF-mediated chromatin looping. • However, cohesin alone has recently been implicated in tissue-specific transcriptional regulation in a CTCF-independent manner. Image credit: Ong et al., CTCF: an architectural protein bridging genome topology and function. Nature Reviews Genetics 15, 234–246 (2014)

  48. Procedures and Results • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF. • Goal: To better define the role of cohesin in EP interaction • Find CTCF and cohesin binding sites that overlap with their predicted EP pairs. • CAC: cohesin sites that co-localize with CTCF sites. • CNC: cohesin sites that do not contain CTCF sites.
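
The CAC/CNC split on this slide can be illustrated with a simple interval-overlap test. The sketch below treats "co-localize" as any overlap between a cohesin site and a CTCF site, which is an assumption, and it uses made-up half-open `(chromosome, start, end)` intervals.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_cohesin_sites(cohesin_sites, ctcf_sites):
    """Split cohesin sites into CAC (overlapping a CTCF site) and CNC (no CTCF overlap)."""
    cac, cnc = [], []
    for site in cohesin_sites:
        if any(overlaps(site, ctcf) for ctcf in ctcf_sites):
            cac.append(site)
        else:
            cnc.append(site)
    return cac, cnc

# Made-up binding sites.
cohesin = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
ctcf = [("chr1", 150, 250), ("chr2", 300, 400)]
cac_sites, cnc_sites = split_cohesin_sites(cohesin, ctcf)
```

The same `overlaps` test can then be applied between each site class and the enhancers or promoters of predicted EP pairs. The pairwise scan is quadratic and is meant only for illustration.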

  49. Procedures and Results • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF. • Goal: To better define the role of cohesin in EP interaction • CNC but not CAC sites significantly overlap with predicted EP pairs.

  50. Procedures and Results • Cohesin Mediates Chromatin Loop Formation and Regulates Cell Type-Specific Gene Expression in the Absence of CTCF. • Goal: To better define the role of cohesin in EP interaction • Both enhancers and promoters that overlap with CNC sites show higher cell type specificity than those overlapping with CAC sites.
