This project explores viral metagenomes sampled directly from Yellowstone hot springs. Reads from unknown species/strains are assembled into contigs for gene discovery and phylogenetic comparison. Using BioBIKE and NCBI BLAST, the study examines common descent, sequence similarity, and the evolutionary implications for viruses of hot-spring microbes. The findings offer insight into ecological relationships and biodiversity in this environment.
A portrait of an analysis: Viral Metagenome Project
Viral metagenome? • Sampled directly from the environment • Unknown species/strains • Source: "Assembly of Viral Metagenomes from Yellowstone Hot Springs", T. Schoenfeld et al., 2008 • Two Yellowstone springs: Octopus Hot Springs and Bear Paw Hot Springs
From a read to a contig • Start with one read (OctHSe.ATYB2334-b2) of ~1 kb • 969 nt isn’t long enough! • BioBIKE: SEQUENCE-SIMILAR-TO to find overlapping reads • Thresholds for accepting an overlap: ≥95% identity over ≥20 nt
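The overlap criterion on this slide can be sketched in a few lines of Python. This is only an illustration of the thresholds (≥95% identity, ≥20 nt), not how SEQUENCE-SIMILAR-TO works internally; the function names are hypothetical, and the comparison is ungapped and same-strand only, which a real search would not assume.

```python
def overlap_identity(left: str, right: str, length: int) -> float:
    """Fraction of matching bases between the last `length` bases of `left`
    and the first `length` bases of `right` (ungapped comparison)."""
    tail, head = left[-length:], right[:length]
    matches = sum(x == y for x, y in zip(tail, head))
    return matches / length


def best_overlap(left: str, right: str, min_overlap: int = 20,
                 min_identity: float = 0.95):
    """Return (length, identity) for the longest suffix/prefix overlap that
    meets both thresholds, or None if no overlap qualifies."""
    # Try the longest candidate overlap first and shrink until one passes.
    for length in range(min(len(left), len(right)), min_overlap - 1, -1):
        identity = overlap_identity(left, right, length)
        if identity >= min_identity:
            return length, identity
    return None
```

Calling `best_overlap(read_a, read_b)` answers the question "does the end of read A continue into the start of read B?", which is the decision made by eye in the next slide.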
Building it up • MS Word, highlighting • GenBank format, then FASTA • Align, copy/paste, repeat • From 969nt to 5546nt
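The "align, copy/paste, repeat" loop can be mimicked programmatically. The sketch below is a simplified stand-in for the manual procedure, under stated assumptions: it extends the contig only to the right, uses ungapped suffix/prefix overlaps, ignores reverse-complement reads, and takes the first read that qualifies. The names `extend_right` and `to_fasta` are hypothetical helpers, not BioBIKE functions.

```python
def suffix_prefix_overlap(left, right, min_overlap=20, min_identity=0.95):
    """Length of the longest qualifying overlap between the end of `left`
    and the start of `right`, or 0 if none meets the thresholds."""
    for length in range(min(len(left), len(right)), min_overlap - 1, -1):
        matches = sum(x == y for x, y in zip(left[-length:], right[:length]))
        if matches / length >= min_identity:
            return length
    return 0


def extend_right(contig, reads, min_overlap=20, min_identity=0.95):
    """Greedily append reads that overlap the current 3' end of the contig.
    Returns the extended contig and the reads that were never used."""
    remaining = list(reads)
    extended = True
    while extended:
        extended = False
        for read in remaining:
            n = suffix_prefix_overlap(contig, read, min_overlap, min_identity)
            if 0 < n < len(read):          # the read must add new sequence
                contig += read[n:]         # keep only the non-overlapping part
                remaining.remove(read)
                extended = True
                break                      # rescan against the new contig end
    return contig, remaining


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"
```

A greedy extension like this can be misled by repeats, which is one reason the manual check of each alignment in the slides is still worthwhile.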
Building it up, cont. • 9 reads in total • But are there any genes?
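One quick way to ask "are there any genes?" is to scan the contig for open reading frames. The following is a minimal sketch, assuming the simplest possible definition of an ORF (an ATG followed in-frame by the first stop codon) and a hypothetical `min_codons` cutoff; gene finders used in practice are considerably more careful.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) for ATG..stop ORFs in the three frames of `seq`.
    Coordinates are 0-based, end-exclusive, and include the stop codon."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) for ORFs on both strands.
    Minus-strand coordinates refer to the reverse-complemented sequence."""
    seq = seq.upper()
    hits = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    hits += [("-", s, e)
             for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return hits
```

Each ORF found this way is only a candidate; the similarity searches in the following slides are what give it biological support.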
Searching for significance • Is the sequence similar to known viruses? • Known anything, for that matter? • What does it mean if it is, or if it isn’t? • Common descent
Finding similarities • SEQUENCE-SIMILAR-TO in BioBIKE • NCBI BLAST • Translated-nucleotide vs. protein
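A translated-nucleotide vs. protein search compares all six reading frames of the DNA query against protein sequences. The sketch below shows only the translation step, using the standard genetic code; it is an illustration of the idea, not the BLAST implementation, and `six_frame_translations` is a hypothetical helper name.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codons enumerated in T, C, A, G order.
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate a DNA string codon by codon; '*' marks a stop codon and
    'X' marks any codon containing an unrecognized base."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return the six conceptual translations keyed '+1'..'+3', '-1'..'-3'."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames
```

Searching at the protein level is what makes distant relationships detectable: synonymous codon changes disappear after translation, so proteins can remain recognizably similar after the underlying DNA has diverged.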
Finding similarities, cont. • Per protein [Figure: search results; hits labeled S. islandicus, E = 1E-120]
What are these? • Stygiolobus spp., Acidianus spp., Sulfolobus islandicus: hot-spring archaea in the family Sulfolobaceae • Viruses that infect these hot-spring archaea • Does this make sense?
Low identities • Note identities for known viruses: 20-40% • Low compared to the other reads • Reproductive isolation? • Evolution can occur, leading to differentiation • Changes in protein structure may or may not alter function [Figure: "Bacterial transfer" between Hot spring 1 and Hot spring 2, then divergence as time passes]
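The identity figures quoted above are percentages of matching positions in a pairwise alignment. A minimal sketch of that calculation follows, assuming the two sequences are already aligned with `-` for gaps and that gapped columns are excluded from the denominator. That convention is an assumption made here; BLAST, for instance, reports identities over the full alignment length, so the numbers can differ slightly.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue                 # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

For example, `percent_identity("MKV-LA", "MRVQLA")` compares five ungapped columns, four of which match, giving 80.0.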
Phylogenetic trees • Show (potential) lines of descent • BioBIKE has a tool for building them: TREE-OF • Compare genetic/protein sequences • Differences between sequences permit prediction of relationships
Phylogenetic trees, example • Tree of the 20 best matches from SEQUENCES-SIMILAR-TO for an open reading frame of the contig, searched against the Octopus Hot Springs reads (Seq_21 is the contig) • Predicts lines of descent from a common ancestor
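To make "differences permit prediction" concrete, here is a small distance-based tree builder. The slides do not say which method TREE-OF uses, so this sketch swaps in UPGMA on simple p-distances purely for illustration; it assumes the input sequences are already aligned to equal length, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences (name -> sequence) by UPGMA and return the
    tree topology as a Newick-style string without branch lengths."""
    sizes = {name: 1 for name in seqs}            # cluster label -> leaf count
    dist = {frozenset((a, b)): p_distance(seqs[a], seqs[b])
            for a, b in combinations(seqs, 2)}
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)            # closest pair of clusters
        a, b = sorted(pair)
        merged = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in pair:
                continue
            # Size-weighted average distance from the merged cluster.
            d = (dist[frozenset((a, other))] * sizes[a]
                 + dist[frozenset((b, other))] * sizes[b]) / new_size
            dist[frozenset((merged, other))] = d
        # Drop every distance that still refers to the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes)) + ";"
```

Sequences with the fewest differences are joined first, so the nesting of parentheses in the output reflects the inferred order of divergence from a common ancestor. UPGMA assumes a constant rate of change across lineages, which may not hold for fast-evolving viruses.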
Conclusions • Metagenome analysis allows one to find similarities between the organisms inhabiting an environment and known organisms or sequences • This makes it possible to draw inferences about the ecology and biodiversity of that environment