1 / 22

RNA- Seq Data Analysis

RNA- Seq Data Analysis. UND Genomics Core. RNA- Seq Analysis Pipeline. Sequence Quality Control Assessment. Adapter Trimming. Alignment. Quantification. Differential expression. Software For RNA- Seq Analysis. Fastq files. Contains sequence and quality information. Q-score.

mtolbert
Download Presentation

RNA- Seq Data Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. RNA-Seq Data Analysis UND Genomics Core

  2. RNA-Seq Analysis Pipeline Sequence Quality Control Assessment Adapter Trimming Alignment Quantification Differential expression

  3. Software For RNA-Seq Analysis

  4. Fastq files • Contains sequence and quality information

  5. Q-score • Q-score is a metric to assess the accuracy of sequencing • Relates to the probability of a wrong base call via logarithmic function. • -10log10(P)

  6. Sequence Quality Assessment Fastqc–o FastQC *fastq.gz Good data Bad data

  7. Sequence Quality Assessement • Q-scores based with respect to the location on flow cell • Ideally should be all blue

  8. Sequence Quality Assessement • Common for RNA-Seq • Not all FastQC warnings apply

  9. Adapter contamination Read1 cDNA adapter

  10. Adapter Trimming cDNA • Trim galore • Tries to automatically detect adapter • For Illumina adapters, the first 13 bases of the Illumina indexed adapters What trim galore matches

  11. RNA-Seq Analysis Pipeline Sequence Quality Assessment Adapter Trimming Alignment Quantification Differential expression

  12. Alignment for RNA-Seq • For eukaryotic genomes, splice- aware aligners can align reads across exon – intron boundaries. Spliced read

  13. Splice – Aware Alignment Programs • Tophat - https://ccb.jhu.edu/software/tophat/index.shtml • Hisat2 - https://ccb.jhu.edu/software/hisat2/manual.shtml • STAR - https://github.com/alexdobin/STAR

  14. Non Splice-aware Aligners • Bowtie • http://bowtie-bio.sourceforge.net/index.shtml • Bowtie2 • http://bowtie-bio.sourceforge.net/bowtie2/index.shtml • BWA • http://bio-bwa.sourceforge.net/

  15. Reference Files for Alignment • fasta • gtf (gene transfer format) Where to find reference files? • UCSC • http://hgdownload.cse.ucsc.edu/downloads.html • Ensembl • https://uswest.ensembl.org/downloads.html) • iGenomes • https://support.illumina.com/sequencing/sequencing_software/igenome.html

  16. More File Formats • Fasta • Indexes (*ht.1) • Binary version of the genome which allows for quick reading of the genome

  17. GTF file

  18. Alignment Output • Sam (Sequence alignment format) • Bam (binary sam file)

  19. RNA-Seq Analysis Pipeline Sequence Quality Assessment Adapter Trimming Alignment Quantification Differential expression

  20. Counting Reads • Count reads that unambiguously align to a gene • Programs that will count reads: • FeatureCounts • HTSeq Read Read Gene A Gene A Gene B Gene B

  21. Transcriptome assembly • Assemble a map of the genes in the transcriptome using the reads in present sample • Can be guided with a gtf, if using a well –annotated genome • Assembles transcriptome and estimate the abundance of transcript at the same time • Programs that will assemble transcriptomes: • Cufflinks • Stringtie

  22. Your Turn! • Work through a RNA-Seq analysis described in the file Day2_HandsOn.docx in the Day2 workshop material folder

More Related