RDP – Capturing the Unclassified


  1. RDP – Capturing the Unclassified • Use only on data that can be publicly shared. These are not secure tools.

  2. Genboree RDP Output • Tutorial 2 Dataset • QIIME • chimeras removed • RDP • Sample Period

  3. Download files • Raw.results.tar.gz

  4. Unarchive and Decompress • Use 7zip • Seq.fna
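For readers who prefer a script to 7zip, the unarchive step can be sketched in Python with the standard `tarfile` module. This is a swapped-in alternative, not the method shown on the slide; the function name `extract_archive` and the destination folder are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" handles the gzip decompression and the tar unarchiving in one pass.
    # Only extract archives you trust, such as your own downloaded results.
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names
```

Calling `extract_archive("Raw.results.tar.gz", "raw_results")` would place the archive contents in a `raw_results` folder, assuming the downloaded file sits in the working directory.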

  5. Open in BioEdit

  6. In BioEdit: • Ctrl+A to select all sequences • Shift+Ctrl+C to copy all sequence titles • In Excel: • Paste into Excel. In column B (or another empty column), enter =left(a1,number_of_characters_in_titles) • Ctrl+Shift+Down arrow to extend the selection downward • Ctrl+D to copy the formula to all cells below • Check your work. Select only your samples. Do not select blank cells. Copy the correct titles.
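The title-trimming step done with =left(...) in Excel could also be sketched in Python. This is only an illustration under assumptions: the function name `truncate_fasta_titles` is hypothetical, and it assumes a plain FASTA file in which each title line starts with ">".

```python
def truncate_fasta_titles(lines, n_chars):
    """Keep only the first n_chars characters of each FASTA title."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Title line: keep ">" plus the first n_chars characters after it,
            # which mirrors =left(a1, n_chars) applied to the bare title.
            out.append(">" + line[1:1 + n_chars])
        else:
            # Sequence line: leave unchanged.
            out.append(line)
    return out


example = [">SampleA_1 extra info", "ACGT", ">SampleB_2 more text", "GGCC"]
print(truncate_fasta_titles(example, 9))
```

As with the Excel route, the trimmed titles should be checked by eye before the file is saved and uploaded.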

  7. In BioEdit: • Paste over the titles • Save as: your_filename.fas • In the file type pull-down menu, choose FASTA

  8. rdp.cme.msu.edu

  9. Make an Account

  10. For very tiny datasets

  11. very tiny datasets

  12. very tiny datasets • Do not navigate away

  13. For pyrosequenced datasets

  14. You can navigate away and pick up the results later.

  15. Check in while running?

  16. Done: Download

  17. What do you get back? • Confidence file • Classifications • Failed classifications: check this file. If it is not empty, problems have occurred. • Hierarchy

  18. Open classifications in Excel • Focus on Phylum for the tutorial. Any level can be used.

  19. For tutorial ease, condense sample periods

  20. Keep it Tidy • Cut out what isn’t needed or being used.

  21. Confidence in the Classification • Sort on the confidence level • Odd groups: leave in or take out? • Replace those below your confidence level with an Unclassified_ label • =concatenate($column$row,cell) • $ keeps the column or row static in your formula as you drag to multiple cells
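The relabeling rule on this slide can be sketched in Python as follows. The function name `label_with_threshold` and the 0.8 cutoff are assumptions for illustration; the slide leaves the confidence level to the reader. The sketch also assumes the Unclassified_ prefix is joined to the classification name itself, as the concatenate formula suggests.

```python
def label_with_threshold(taxon, confidence, threshold=0.8):
    """Return the taxon name, or an Unclassified_ label if confidence is too low."""
    if confidence >= threshold:
        return taxon
    # Mirrors =concatenate($column$row, cell): a fixed prefix plus the name.
    return "Unclassified_" + taxon


rows = [("Firmicutes", 0.97), ("Firmicutes", 0.42), ("Bacteroidetes", 0.81)]
print([label_with_threshold(name, conf) for name, conf in rows])
```

Keeping the original name inside the label preserves which group the low-confidence reads were tentatively assigned to.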

  22. Copy to a new column • Remove Duplicates

  23. Even at the Phylum Level • 60 categorical levels • (could be 2 for every known phylum)

  24. To count by sample and phylum classification • =countifs($K:$K,$O2,$A:$A,P$1) • Learn how to stop recalculation and restart it manually so you don't crash your machine. You can easily cause hours of computation on large matrices!
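The same sample-by-classification count can be sketched outside Excel, where a single pass over the rows avoids the recalculation cost. This is a minimal sketch; `count_matrix` is a hypothetical name, and the input is assumed to be (sample, label) pairs taken from the sample column and the classification column.

```python
from collections import Counter


def count_matrix(records):
    """Count reads per (sample, label) pair.

    records: iterable of (sample, label) tuples.
    Returns the sorted sample names and a dict mapping each label
    to a list of counts, one entry per sample in that order.
    """
    counts = Counter(records)
    samples = sorted({sample for sample, _ in counts})
    labels = sorted({label for _, label in counts})
    # A Counter returns 0 for missing pairs, like countifs with no matches.
    matrix = {label: [counts[(s, label)] for s in samples] for label in labels}
    return samples, matrix


records = [
    ("S1", "Firmicutes"),
    ("S1", "Firmicutes"),
    ("S2", "Firmicutes"),
    ("S2", "Unclassified_Firmicutes"),
]
samples, matrix = count_matrix(records)
print(samples)
print(matrix)
```

The rows of the result correspond to the deduplicated label column and the columns to the sample headers in the spreadsheet layout.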

  25. Stop Automatic Recalculation • In the Options menu • Under Formulas, set calculation to manual • Press F9 to recalculate manually

  26. Fill Formulas and Check Cells

  27. Copy Whole and Paste As Values

  28. Sum Rows and Sort On (Your Favorite) • Total is Customary • Can rearrange as needed
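As a rough Python equivalent of summing rows and sorting on the total, assuming the counts are held as a dict that maps each classification label to a list of per-sample counts (the name `sort_by_total` is hypothetical):

```python
def sort_by_total(matrix):
    """Return (label, counts, total) tuples, largest total first."""
    rows = [(label, counts, sum(counts)) for label, counts in matrix.items()]
    # Sort on the row total, descending; other sort keys can be swapped in.
    return sorted(rows, key=lambda row: row[2], reverse=True)


matrix = {"Firmicutes": [2, 1], "Proteobacteria": [5, 4], "Unclassified_Firmicutes": [0, 1]}
for label, counts, total in sort_by_total(matrix):
    print(label, counts, total)
```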

  29. Select Data and Titles Only

  30. Make a 100% Stacked Chart • Not very pretty
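A 100% stacked chart rescales each sample so its categories sum to 100. The arithmetic behind it can be sketched as below; `to_percentages` is a hypothetical name, and the input layout (label mapped to a list of per-sample counts) is an assumption carried over for illustration.

```python
def to_percentages(matrix):
    """Convert per-sample counts to percentages of each sample's total.

    matrix: dict mapping label -> list of counts, one per sample.
    """
    n_samples = len(next(iter(matrix.values())))
    # Column totals: all reads in each sample across every label.
    totals = [sum(row[i] for row in matrix.values()) for i in range(n_samples)]
    return {
        label: [100.0 * c / t if t else 0.0 for c, t in zip(row, totals)]
        for label, row in matrix.items()
    }


print(to_percentages({"Firmicutes": [1, 3], "Proteobacteria": [3, 1]}))
```

Each sample's percentages sum to 100, which is what lets samples with different read counts be compared side by side.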

  31. Switch Perspectives

  32. Size Correctly

  33. To Compare to Genboree • RDP must be run • png.result.tar.gz

  34. What did we learn?

  35. What did we learn?

  36. Some Problems Commonly Encountered • RDP output does not always follow a consistent column layout. • To get a clean graph with each taxonomic level in one column, you may need to sort and remove sections of data. • Some rows have additional levels of classification • Some rows have fewer levels of classification

  37. Additional Levels of Classification • Move over • Delete

  38. Fewer Levels of Classification • Common Trouble Makers: • Bacteroidetes • Verrucomicrobia • Acidobacteria • Dehalococcoidetes • Cyanobacteria • Chloroplast • Deltaproteobacteria • OD1_genera_incertae_sedis • TM7_genera_incertae_sedis • Armatimonadetes • WS3_genera_incertae_sedis • Move over
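An alternative to moving cells by hand is to look up each level by its rank label instead of by column position. The sketch below is not the slide's method and rests on a loud assumption: that each row of the classification file is a list of fields in which a taxon name is followed by its rank label and then its confidence. The function name `taxon_for_rank` and the example row are hypothetical; check your own file before relying on this.

```python
def taxon_for_rank(fields, rank="phylum"):
    """Return (taxon, confidence) for the requested rank, or None if absent.

    Assumes fields contain name, rank, confidence triplets somewhere in the row.
    """
    for i in range(1, len(fields) - 1):
        if fields[i] == rank:
            # The name sits just before the rank label, the confidence just after.
            return fields[i - 1].strip('"'), float(fields[i + 1])
    return None


# Hypothetical row with an extra level between domain and phylum.
row = ["seq1", "", "Bacteria", "domain", "1.0",
       "ExtraLevel", "superphylum", "0.9",
       "Verrucomicrobia", "phylum", "0.85"]
print(taxon_for_rank(row))
```

Because the lookup keys on the rank label, rows with additional or fewer levels do not shift the result, and rows that lack the rank entirely return None for manual review.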