
Abstract

Visualization of Sequence Alignment for Large Genomes. Advisor: 黃耀廷. Project student: 洪碩懋.



Presentation Transcript


  1. Visualization of Sequence Alignment for Large Genomes

Advisor: 黃耀廷
Project student: 洪碩懋

Abstract

Next-generation sequencing alignment compares millions of reads against a reference sequence. The alignment results are written to a Sequence Alignment/Map (SAM) file. However, a SAM file is too large to inspect by eye. In this project, I developed software that visualizes sequence alignments for large genomes using the OpenGL API. Because a SAM file may be gigabytes in size, I devised methods for prefetching the necessary alignments according to how the user navigates the interface.

The Problem Encountered

• A SAM file can be 10+ gigabytes, so loading an entire SAM file into system memory is not feasible.
• Poor performance makes the viewer unpleasant to use.

Method: Recording Mapping Positions

During preprocessing, I build a table that records the file seek position of each mapping position. Searches then jump to these recorded file positions instead of scanning from the beginning of the file.

Additional Improvement: File Mapping

Originally, the program used standard read/write operations on each segment of data over and over, which created a very large overhead. I switched to file mapping to increase I/O performance. File mapping maps the file directly into virtual memory, so there is no overhead from copying data into user space.

Experiments

To verify whether these methods really improved performance, I tested the viewer with the methods above. The results show that my methods perform better than using fstream with no recorded positions.

Program Features

The ComboBox displays the list of reference sequences. The button labelled “Take A Screenshot” takes a screenshot. The progress bar shows whether preprocessing is done.

After the user specifies all settings and clicks the button labelled “View”, an OpenGL window is created to visualize the sequence alignment as follows.

• In the bottom half of the window, green, blue, yellow and red represent A, T, C and G respectively.
• In the middle section, the curve represents the coverage of the segment sequences.
• The top-right corner displays properties such as the current position, the average coverage and the value on the curve.
• Pressing the “A” and “D” keys moves the view left or right, depending on where the mouse cursor is.
• Users can zoom in and out with the mouse wheel to view varying ranges.
• Clicking the button in the top-left corner immediately converts all colored squares into the letters A, T, C and G, as shown in the figure above.
