TandemGraph: A graphical tool for modeling string regularities
Dina Sokol, Ramin Rakhamimov
http://tandem.sci.brooklyn.cuny.edu
Department of Computer and Information Science, Brooklyn College, CUNY
BIOCOMP ’09: Las Vegas, Nevada
TandemGraph
• A graphical tool for modeling string regularities.
• Graphs the tandem repeats generated by various genome parsers.
• Lets the user see all of the data in a single view.
• Provides easy navigation of the data (shifting, scaling, zooming...).
• Licensed under the AFPL; you can modify it to suit your needs and contribute changes.
Why do we need TandemGraph?
• Graphical presentations of string patterns enable easy and efficient methods of analysis.
• When the input is large, summary statistics are hard to deduce from a textual presentation.
• Graphical tools let the user intuitively pinpoint individual or global patterns.
• Areas of high concentration as well as sparse areas are easily discernible.
Tandem Repeats
• Tandem repeats occur in DNA when a pattern of two or more nucleotides is repeated and the repetitions are directly adjacent to each other.
• In DNA, tandem repeats are used for disease diagnosis, mapping studies, and human identity testing.
• About 50% of the human genome consists of repeated sequences, some of which are tandem repeats.
• TRED, a tool developed by our team (Justin Tojeira and Dr. Dina Sokol), parses genomes and generates a list of the tandem repeats found.
Example
ATTCGATTCGATTCG…
The sequence ATTCG is repeated three or more times.
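The definition above can be sketched as a naive scan for exact tandem repeats. This is only an illustration, not TRED's algorithm (the slides show TRED reporting repeats with errors, which this sketch ignores). The function name `find_exact_tandem_repeats` and its parameters are hypothetical.

```python
def find_exact_tandem_repeats(seq, min_period=2, max_period=10, min_copies=3):
    """Return (start, period, copies) tuples for exact tandem repeats.

    Hypothetical helper for illustration: `start` is 0-based, and only
    error-free repetitions are detected.
    """
    found = []
    n = len(seq)
    for period in range(min_period, max_period + 1):
        i = 0
        while i + period <= n:
            unit = seq[i:i + period]
            copies = 1
            # Count how many directly adjacent copies of `unit` follow.
            while seq[i + copies * period:i + (copies + 1) * period] == unit:
                copies += 1
            if copies >= min_copies:
                found.append((i, period, copies))
                i += copies * period  # skip past the repeat just reported
            else:
                i += 1
    return found


print(find_exact_tandem_repeats("ATTCGATTCGATTCG"))  # [(0, 5, 3)]
```

A long run can also be reported at multiples of its period (six copies of a 2-letter unit are also three copies of a 4-letter unit); a real parser would need to filter such duplicates.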
Example Alignment
Start: 72001  End: 72021  Length: 21  Period: 2.1  Repeats: 10  Errors: 1
72001 AA  72002
72003 AA  72004
72005 AA  72006
72007 AA  72008
72009 AA  72010
72011 AA  72012
72013 AA  72014
72015 AA  72016
      A-A
72017 AGA 72019
72020 AG  72021
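The summary figures in this example are consistent with one another, which a few lines of arithmetic can confirm. The sketch assumes that length is counted inclusively (end - start + 1) and that the reported period is length divided by the number of repeats; the slides do not state either definition, but both match the numbers shown.

```python
# Values copied from the example alignment above.
start, end, repeats = 72001, 72021, 10

# The ten copies listed in the alignment: eight "AA", then "AGA" and "AG".
copies = ["AA"] * 8 + ["AGA", "AG"]

length = end - start + 1               # assumed: inclusive coordinates
period = length / repeats              # assumed: average copy length
total_bases = sum(len(c) for c in copies)

print(length, period, total_bases)     # 21 2.1 21
```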
Textual/Graphical Comparison
• Running TRED on chromosome 1 of Homo sapiens generated 91,815 repeats.
• As text, the output runs to about 1,000 pages per chromosome.
• Using the graphical approach, all 91,815 repeats are displayed within a single view.
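One generic way to summarize many repeats in a fixed-width view is to count them per screen column. The sketch below is not taken from TandemGraph, and the slides do not describe its rendering method. The function name `bin_repeat_starts` and the sample positions are made up for illustration.

```python
def bin_repeat_starts(starts, region_start, region_end, columns):
    """Count repeat start positions in each of `columns` equal-width bins.

    Hypothetical helper: positions outside [region_start, region_end]
    are ignored.
    """
    span = region_end - region_start + 1
    counts = [0] * columns
    for pos in starts:
        if region_start <= pos <= region_end:
            # Integer arithmetic maps a position to its column index.
            idx = (pos - region_start) * columns // span
            counts[idx] += 1
    return counts


# Made-up positions: three repeats near the start, two further along.
print(bin_repeat_starts([1, 2, 3, 50, 100], 1, 100, 4))  # [3, 1, 0, 1]
```

Columns with high counts correspond to dense regions, and zeros to sparse ones.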
Features of TandemGraph
• Load tandem repeat data from our in-house database.
• Navigate to regions of interest with the selecting, shifting, and zooming widgets.
• Switch between triangle and trapezoid representations.
• Select individual repeats.
• View individual period sizes and numbers of errors.
• View the actual alignment of a tandem repeat.
Future of TandemGraph
• Apply it to other applications of edit distance outside biology.
• Support other proprietary and standard input formats.
• Integrate with textual presentations for more fine-tuned control.