
Monte-Carlo Regression Algorithm for Isoform Frequency Estimation from RNA-Seq Data. Alex Zelikovsky, Department of Computer Science, Georgia State University. Joint work with Adrian Caciula (GSU), Serghei Mangul (UCLA), James Lindsay, and Ion Mandoiu (UCONN).

Presentation Transcript


  1. Monte-Carlo Regression Algorithm for Isoform Frequency Estimation from RNA-Seq Data • Alex Zelikovsky, Department of Computer Science, Georgia State University • Joint work with Adrian Caciula (GSU), Serghei Mangul (UCLA), James Lindsay, and Ion Mandoiu (UCONN) • IEEE ICCABS 2013, New Orleans, LA

  2. Outline • RNA-Seq: Introduction • MCReg: Monte Carlo Regression based Algorithm • Experimental Results • Conclusions and Future Work

  3. Genome-Guided RNA-Seq Protocol • RNA-Seq enables transcript-level resolution of gene expression • From RNA, through the process of hybridization, make cDNA and shatter it into fragments • Sequence fragment ends • Map reads to genome • Analyses: Gene Expression (GE), Isoform Expression (IE), Isoform Discovery (ID) • [Figure: segments A B C D E with isoforms A B C and A C D E] [Nicolae et al., 11]

  4. Outline • RNA-Seq: Introduction • MCReg: Monte Carlo Regression based Algorithm • Observed Read Distribution • MC-Based Estimation of Expected Read Distribution • Regression-Based Estimation of Isoform Frequencies • Experimental Results • Conclusions and Future Work

  5. MCReg: Monte-Carlo Regression • Motivation: reducing the error rate is critical for detecting similar transcripts, especially when one is a subset of another • [Screenshot from a genome browser]

  6. General Method Overview
     • Map paired-end reads onto the library of known isoforms using an ungapped aligner (e.g., Bowtie)
       B. Langmead, C. Trapnell, et al., “Ultrafast and memory-efficient alignment of short DNA sequences to the human genome,” Genome Biology, vol. 10, no. 3, p. R25, 2009.
     • Group reads that have been mapped to the same transcripts into classes
     • Monte-Carlo-based estimation of the expected read distribution, using e.g. the Grinder simulator
       F. E. Angly et al., “Grinder: a versatile amplicon and shotgun sequence simulator,” Nucleic Acids Research, 2012.
     • Solve the regression: the least-squares formulation can be solved with a constrained quadratic programming solver
       M. S. Andersen et al., “CVXOPT: A Python package for convex optimization,” available at cvxopt.org, 2013.
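The read-grouping step in the overview can be illustrated with a minimal Python sketch. It assumes the aligner output has already been parsed into a mapping from read id to the transcripts that read aligns to; the function name, the read ids, and the transcript names `T1` and `T2` are hypothetical and do not come from the slides.

```python
from collections import Counter

def observed_class_distribution(read_to_transcripts):
    """Group reads by the set of transcripts they map to; return class frequencies."""
    # A read class is identified by the (unordered) set of transcripts hit.
    counts = Counter(frozenset(ts) for ts in read_to_transcripts.values() if ts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical alignments (assumption): read id -> transcripts it maps to.
alignments = {
    "r1": ["T1", "T2"],
    "r2": ["T2", "T1"],
    "r3": ["T1"],
    "r4": ["T2"],
}
observed = observed_class_distribution(alignments)
```

Using a `frozenset` as the class key makes reads that hit the same transcripts in a different order fall into the same class.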

  7. Observed Read Distribution

  8. Monte-Carlo-Based Estimation of Expected Read Distribution

  9. MC-Based Estimation of Expected Read Distribution
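As a rough illustration of the Monte-Carlo idea only, and not the authors' pipeline: the overview names a read simulator such as Grinder followed by read mapping, while the toy sampler below stands in for both. It assumes each isoform is a list of named segments, that a read falls entirely inside one segment, and that read starts are uniform along the isoform. All names and lengths are hypothetical.

```python
import random
from collections import Counter

def simulate_expected_distribution(isoforms, segment_lengths, n_reads=10000, seed=0):
    """For each isoform, estimate the fraction of its reads landing in each read class."""
    rng = random.Random(seed)
    expected = {}
    for name, segments in isoforms.items():
        # Uniform read starts: a segment is hit in proportion to its length.
        weights = [segment_lengths[s] for s in segments]
        counts = Counter()
        for seg in rng.choices(segments, weights=weights, k=n_reads):
            # The read is compatible with every isoform that contains the segment.
            cls = frozenset(iso for iso, segs in isoforms.items() if seg in segs)
            counts[cls] += 1
        expected[name] = {cls: n / n_reads for cls, n in counts.items()}
    return expected

# Hypothetical gene (assumption): T2 is a sub-transcript of T1.
isoforms = {"T1": ["A", "B", "C"], "T2": ["A", "C"]}
segment_lengths = {"A": 100, "B": 200, "C": 100}
expected = simulate_expected_distribution(isoforms, segment_lengths)
```

In this toy gene every read from `T2` is also compatible with `T1`, which is the sub-transcript situation the motivation slide describes; only reads from segment `B` distinguish the two.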

  10. Regression-Based Estimation of Isoform Frequencies

  11. Regression-Based Estimation of Isoform Frequencies
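A hedged sketch of a constrained least-squares fit of this kind, under assumptions not taken from the slides: `A[j][i]` is the expected fraction of reads from isoform `i` that fall in read class `j`, `o[j]` is the observed frequency of class `j`, and the unknown `f` is nonnegative and sums to 1. The overview mentions a quadratic programming solver (CVXOPT); to stay self-contained this example swaps in projected gradient descent, and the numbers are made up.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {f : f >= 0, sum(f) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t  # the last index satisfying the condition sets the shift
    return [max(x - theta, 0.0) for x in v]

def estimate_frequencies(A, o, iterations=2000):
    """Minimize ||A f - o||^2 over the simplex by projected gradient descent."""
    n = len(A[0])
    f = [1.0 / n] * n
    # Safe step size: 1 / (2 * squared Frobenius norm) bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    for _ in range(iterations):
        residual = [sum(a * x for a, x in zip(row, f)) - oj for row, oj in zip(A, o)]
        grad = [2.0 * sum(A[j][i] * residual[j] for j in range(len(A))) for i in range(n)]
        f = project_to_simplex([x - step * g for x, g in zip(f, grad)])
    return f

# Hypothetical data: rows are classes {T1,T2} and {T1}; columns are isoforms T1, T2.
A = [[0.5, 1.0],
     [0.5, 0.0]]
o = [0.7, 0.3]  # generated from read fractions 0.6 (T1) and 0.4 (T2)
f = estimate_frequencies(A, o)
```

In this formulation `f` is the fraction of reads contributed by each isoform; turning it into a per-molecule frequency would additionally require normalizing by isoform length, which the sketch leaves out.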

  12. Outline • RNA-Seq: Introduction • MCReg: Monte Carlo Regression based Algorithm • Experimental Results • Conclusions and Future Work

  13. Simulation Setup

  14. Experimental Results • Frequency estimation accuracy was assessed using the coefficient of determination r². • For IsoEM r² = 0.92, while for MCReg r² = 0.97. • The results show better correlation for MCReg than for IsoEM, especially in cases of sub-transcripts, where IsoEM skewed the estimated frequencies toward super-transcripts.
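For reference, one common way to compute r² between true and estimated frequencies is the squared Pearson correlation. The slides do not say which definition was used, so treating r² this way is an assumption, and the input values below are invented for illustration.

```python
def r_squared(true_values, estimates):
    """Squared Pearson correlation between true and estimated values."""
    n = len(true_values)
    mean_t = sum(true_values) / n
    mean_e = sum(estimates) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_values, estimates))
    var_t = sum((t - mean_t) ** 2 for t in true_values)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    # Undefined (division by zero) when either input is constant.
    return cov * cov / (var_t * var_e)

# Invented values (assumption), not the experiment's data.
score = r_squared([1, 2, 3], [1, 3, 2])  # → 0.25
```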

  15. Thanks!