
Biases in RNA-Seq data


minna

Presentation Transcript


  1. Biases in RNA-Seq data

  2. Transcript length bias. Two transcripts, of length 50 and 100 bases, have the same abundance in a control sample. The expression of both transcripts is doubled in a treatment sample, and the biological variance is the same for both, so they have the same level of differential expression. The RNA-Seq experiment fragments the transcripts into short reads of 10 bases and reports the reads. There will be more hits to the 100-base transcript: its read count n will be larger, so a statistical test has more power and the longer transcript will be reported as more significantly changed.

  3. Oshlack and Wakefield 2009, Biology Direct, 4, 14

  4. Random priming aims to sample transcripts uniformly, rather than from just one end (such as with the oligo-dT primer ……)

  5. Counts of reads along gene Apoe in different tissues of the Wold data: (a) brain, (b) liver, (c) skeletal muscle. Each vertical line stands for the count of reads starting at that position. The grey lines are counts in the UTR regions and a further 100 bp. Here introns are deleted and exons are connected into a single piece. Li et al. 2010, Genome Biology, 11, R50

  6. Nucleotide frequencies versus position for stringently mapped reads. For each experiment, mapped reads were extended upstream of the 5′-start position, such that the first position of the actual read is 1 and positions 0 to −20 are obtained from the genome. The first hexamer of the read is shaded. Brief experimental protocols are indicated in the key. Biases are caused by hexamer priming that is not random. Hansen et al. 2010, Nucleic Acids Research, 38, e31

  7. Roberts et al. 2011, Genome Biology, 12, R22

  8. Panels: human experiment (SRA012427); yeast experiment (SRA020818_RH). GC content biases some RNA-Seq experiments, but not at the same level in all experiments. Roberts et al. 2011, Genome Biology, 12, R22

  9. Next-generation sequencing is rapidly evolving. There is no market leader, and only a relatively small number of RNA-Seq studies have been published, even for the most popular NGS platforms. There are clearly biases in the data, and the protocols and chemistry used to generate the data leave signatures. It is therefore hard to perform meta-analysis. By contrast, Affymetrix GeneChips are the dominant platform for microarray observations and have been so for almost a decade: there are more than one hundred thousand hybridizations in the public domain, and only a handful of standardised protocols have been used. This huge dataset allows sensitive meta-analysis.

  10. Affymetrix

  11. Applied Biosystems

  12. Illumina

  13. Life Technologies

  14. Pacific Biosciences

  15. Helicos (1 year); Helicos (since 2007)
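
The transcript length bias on slide 2 can be illustrated with a small numerical sketch. It is not the analysis from Oshlack and Wakefield 2009. It assumes that the expected read count is proportional to transcript length times abundance, that counts are Poisson distributed, and that both libraries have the same size. The names `expected_reads`, `z_score` and the parameter `reads_per_base` are hypothetical and chosen only for this illustration.

```python
import math

def expected_reads(length, reads_per_base=1.0, fold_change=1.0):
    """Expected read count, assumed proportional to length and abundance."""
    return length * reads_per_base * fold_change

def z_score(control, treatment):
    """Approximate z statistic comparing two Poisson counts.

    Assumes equal library sizes; under the null hypothesis of no change,
    the difference of the counts has variance control + treatment.
    """
    return (treatment - control) / math.sqrt(control + treatment)

results = {}
for length in (50, 100):
    control = expected_reads(length)
    treatment = expected_reads(length, fold_change=2.0)  # expression doubled
    results[length] = {
        "fold_change": treatment / control,
        "z": z_score(control, treatment),
    }
    print(length, results[length])
```

Under these assumptions both transcripts show the same fold change, but the z statistic grows with the square root of the read count, so the 100-base transcript scores higher by a factor of the square root of 2 and looks more significant.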
