You Have to Know What You’re Looking for Before You Can Find It, or Why People Like Me Are Still Writing Programs to Find CpG Islands

Presentation Transcript


  1. You Have to Know What You’re Looking for Before You Can Find It, or Why People Like Me Are Still Writing Programs to Find CpG Islands. Presented by Emily Mitchell.

  2. So What’s a CpG Island, Anyway?
     Classic definition (Gardiner-Garden and Frommer, 1987):
     - At least 200 bases long
     - G + C content of at least 50%
     - Observed CpG/expected CpG ratio of at least 0.6
     - At least 7 CG dinucleotides present
     Problem with the classic definition: it’s not just CpG islands that meet these criteria.
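For reference, the observed/expected ratio in these criteria is usually computed as follows, where L is the window length and the N terms are counts within that window. The symbol names and the example numbers are illustrative assumptions, not values from the slides.

```latex
\[
\frac{\text{Obs}}{\text{Exp}}
  = \frac{N_{\mathrm{CpG}}}{\left(N_{\mathrm{C}} \cdot N_{\mathrm{G}}\right) / L}
  = \frac{N_{\mathrm{CpG}} \cdot L}{N_{\mathrm{C}} \cdot N_{\mathrm{G}}}
\]
% Hypothetical 200-base window with 60 C, 50 G and 12 CG dinucleotides:
\[
\frac{12 \cdot 200}{60 \cdot 50} = 0.8,
\qquad
\text{G + C content} = \frac{60 + 50}{200} = 55\%
\]
```

Such a window would pass the classic thresholds for length, G + C content, ratio, and CG count.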

  3. Takai and Jones redefine CpG islands, 2002:
     - At least 500 bases long
     - G + C content of at least 55%
     - Observed/expected ratio of at least 0.65
     - At least 7 CG dinucleotides present
     Problem with Takai and Jones’ definition: it preferentially locates CpG islands in the 5’ region of genes*, but filters out other CG-rich sequences that may be important for gene regulation.
     * Takai and Jones were only looking at human genes, specifically in chromosomes 21 and 22.
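The two definitions differ only in their thresholds, so both can be written as one check with different parameters. The following is a minimal sketch, not code from either paper; the names `Criteria`, `CLASSIC`, `TAKAI_JONES`, `stats`, and `meets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    min_length: int      # minimum sequence length in bases
    min_gc: float        # minimum G + C fraction
    min_obs_exp: float   # minimum observed/expected CpG ratio
    min_cpg: int = 7     # minimum CG dinucleotide count, as listed on the slides


# Thresholds as stated on slides 2 and 3.
CLASSIC = Criteria(min_length=200, min_gc=0.50, min_obs_exp=0.60)
TAKAI_JONES = Criteria(min_length=500, min_gc=0.55, min_obs_exp=0.65)


def stats(seq):
    """Return (G + C fraction, observed/expected CpG ratio, CG count)."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc = (c + g) / n if n else 0.0
    # Expected CpG count is (C count * G count) / length.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc, obs_exp, cpg


def meets(seq, crit):
    """True if seq satisfies every threshold in crit."""
    gc, obs_exp, cpg = stats(seq)
    return (
        len(seq) >= crit.min_length
        and gc >= crit.min_gc
        and obs_exp >= crit.min_obs_exp
        and cpg >= crit.min_cpg
    )
```

For example, `meets("ACGT" * 50, CLASSIC)` passes while `meets("ACGT" * 50, TAKAI_JONES)` fails on length alone, which shows how the same sequence can be an island under one definition and not the other.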

  4. The Sliding Window Algorithm: Analyze a window. Does it meet the CpG island criteria?

  5. If not, slide to the right one nucleotide and analyze again.

  6. And again, until it meets the criteria.

  7. Then jump ahead and check the window adjacent to the island on the 3’ side.

  8. Repeat as needed, until the new window does not meet the CpG island criteria.

  9. Then slide the window back toward the island.

  10. Keep sliding until the window meets CpG island criteria.

  11. Then analyze the combined island as a whole.

  12. If it doesn’t meet the criteria, try trimming a base pair off each end and analyzing again. You may have to do a lot of trimming. (not shown)

  13. Once it meets CpG island criteria, move on to the next adjacent window and analyze that.
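Slides 4 through 13 can be read as roughly the loop below. This is a simplified sketch under stated assumptions, not the published implementation: it uses the classic thresholds from slide 2, a 200-base window, and hypothetical names (`meets_criteria`, `find_islands`). It also recomputes the statistics from scratch for every window, which is slow but keeps the steps visible.

```python
def meets_criteria(seq, min_len=200, min_gc=0.5, min_oe=0.6, min_cpg=7):
    """Classic-style check: length, G + C fraction, obs/exp ratio, CG count."""
    n = len(seq)
    c, g, cpg = seq.count("C"), seq.count("G"), seq.count("CG")
    if n < min_len or c == 0 or g == 0:
        return False
    return (c + g) / n >= min_gc and cpg * n / (c * g) >= min_oe and cpg >= min_cpg


def find_islands(seq, window=200):
    """Return (start, end) pairs, end exclusive, for candidate islands."""
    seq = seq.upper()
    n = len(seq)
    islands = []
    start = 0
    while start + window <= n:
        # Slides 4-6: slide one nucleotide at a time until a window passes.
        if not meets_criteria(seq[start:start + window]):
            start += 1
            continue
        # Slides 7-8: jump to the adjacent 3' window while it also passes.
        end = start + window
        while end + window <= n and meets_criteria(seq[end:end + window]):
            end += window
        # Slides 9-10: slide the failing window back toward the island
        # until it passes (at worst it lands on the last passing window).
        ws = min(end, n - window)
        while ws > end - window and not meets_criteria(seq[ws:ws + window]):
            ws -= 1
        end = ws + window
        # Slides 11-12: test the combined island; trim one base off each
        # end until it passes or becomes shorter than the window.
        lo, hi = start, end
        while hi - lo >= window and not meets_criteria(seq[lo:hi]):
            lo += 1
            hi -= 1
        if hi - lo >= window:
            islands.append((lo, hi))
        # Slide 13: move on to the next adjacent window.
        start = end
    return islands
```

On a toy sequence such as `"AT" * 150 + "CG" * 200 + "AT" * 150`, this sketch reports a single island that extends past the CG-rich stretch into the AT flanks on both sides, because a window that is half flank still passes the thresholds. That is a small instance of the boundary problem the talk is about.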

  14. Changes to the Sliding Window Algorithm (CpGIE)
     - Trim from both ends at once, instead of one and then the other
     - Change the criteria for what constitutes a “mathematical” CpG island
     - Change how the window moves when it’s reading an island
     - Use a different representative window

  15. What’s the Difference?

  16. Distance-Based Algorithm (CpGcluster)
     - A statistical approach using p-values
     - Based entirely on how close together CG dinucleotides are and whether that clustering is statistically significant
     - Doesn’t rely on (arbitrary) user-input parameters
     - But is it really a CpG island if it’s only 11 bp long?
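The distance-based idea can be sketched as grouping CG dinucleotides whose neighbours are close together. The code below is an illustration only and is not CpGcluster itself: it assumes the cutoff defaults to the median start-to-start distance between neighbouring CGs, it omits the p-value step entirely, and the names `cpg_positions` and `cluster_cpgs` are hypothetical.

```python
import re
from statistics import median


def cpg_positions(seq):
    """0-based start positions of every CG dinucleotide."""
    return [m.start() for m in re.finditer("CG", seq.upper())]


def cluster_cpgs(seq, max_dist=None):
    """Group neighbouring CGs whose start-to-start distance is <= max_dist.

    Returns (start, end, cg_count) tuples with end exclusive. Clusters need
    at least two CGs. Significance testing is NOT done here.
    """
    pos = cpg_positions(seq)
    if len(pos) < 2:
        return []
    if max_dist is None:
        # Assumption for this sketch: use the median neighbour distance.
        max_dist = median(b - a for a, b in zip(pos, pos[1:]))
    clusters = []
    first = last = pos[0]
    count = 1
    for p in pos[1:]:
        if p - last <= max_dist:
            last, count = p, count + 1
        else:
            if count >= 2:
                clusters.append((first, last + 2, count))
            first = last = p
            count = 1
    if count >= 2:
        clusters.append((first, last + 2, count))
    return clusters
```

Calling `cluster_cpgs("CGACGTTCG" + "A" * 50 + "CG")` groups the first three CGs into one cluster only 9 bases long, which illustrates why a purely distance-based method can report very short "islands" like the one questioned on this slide.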

  17. So… What?
     - We know basically what CpG islands do.
     - We know roughly where they (or at least some of them) are.
     - But how can you study their sequences with any degree of confidence if you don’t know exactly where they begin and end?

  18. References
     - Antequera, F. “Structure, function and evolution of CpG island promoters.” CMLS, 2003.
     - Gardiner-Garden, M. and M. Frommer. “CpG islands in vertebrate genomes.” Journal of Molecular Biology, 1987.
     - Hackenberg, Michael, et al. “CpGcluster: a distance-based algorithm for CpG-island detection.” BMC Bioinformatics, 2006.
     - Wang, Yong and Frederick C.C. Leung. “An evaluation of new criteria for CpG islands in the human genome as gene markers.” Bioinformatics, 2004.