Visualization and Classification of DNA Sequences Using Pareto Learning Self-Organizing Maps Based on Frequency and Correlation Coefficient
Hiroshi Dozono, Saga University
Introduction (1)
T. Abe, T. Ikemura, et al., Informatics for unveiling hidden genome signatures, Genome Res., vol. 13, pp. 693-702
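Correlation Coefficients (CC) of a DNA Sequence
Each base is encoded as a 0/1 indicator sequence (shown here for the sequence ACGCTACTAG):
  A  1000010010
  C  0101001000
  G  0010000001
  T  0000100100
Correlation coefficients are computed between pairs of indicator sequences:
  ρAA(n)  CC between A and n-shifted A
  ρAC(n)  CC between A and n-shifted C
  :
  ρTT(n)  CC between T and n-shifted T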
For every ordered pair of bases (A, C, G, T) and every shift from 1 to n, a correlation coefficient is calculated, giving 4 x 4 x n values that are used as the input vector of the SOM.
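The construction of the input vector can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the slides do not define the CC precisely, so the sketch assumes a Pearson correlation over the overlapping part of a non-circular shift, and it sets the coefficient to 0 when one of the two segments is constant. The names `indicator_sequences`, `shifted_cc`, and `cc_vector` are hypothetical.

```python
import numpy as np

BASES = "ACGT"

def indicator_sequences(seq):
    """Map each base to its 0/1 indicator array over the sequence."""
    s = np.array(list(seq.upper()))
    return {b: (s == b).astype(float) for b in BASES}

def shifted_cc(x, y, shift):
    """Correlation between x and y shifted by `shift` positions.

    Assumption: the shift is non-circular, so only the overlapping
    parts x[:-shift] and y[shift:] are compared (Pearson correlation).
    """
    a, b = x[:-shift], y[shift:]
    # Assumption: undefined cases (too short or constant) are reported as 0.
    if len(a) < 2 or a.std() == 0 or b.std() == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])

def cc_vector(seq, max_shift):
    """Stack the CC of all 4 x 4 base pairs for shifts 1..max_shift."""
    ind = indicator_sequences(seq)
    return np.array([
        shifted_cc(ind[p], ind[q], k)
        for p in BASES
        for q in BASES
        for k in range(1, max_shift + 1)
    ])

vec = cc_vector("ACGCTACTAG", max_shift=3)
print(vec.shape)  # (48,)
```

With a maximum shift of 3, the example sequence yields 4 x 4 x 3 = 48 coefficients, ordered by first base, second base, then shift. A circular shift would be an equally plausible reading of "n-shifted" and would change only the slicing in `shifted_cc`.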
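Compared with the dimension of n-tuple frequencies (4^n), the dimension of the CC vector (4 x 4 x n = 16n) is much smaller, because it grows linearly rather than exponentially with n.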
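The dimension comparison can be checked with a few lines of arithmetic. The sketch below assumes that the tuple length and the maximum shift are given the same value n, which is how the comparison is phrased here. The helper names `ntuple_dim` and `cc_dim` are hypothetical.

```python
def ntuple_dim(n):
    """Number of distinct n-tuples over the alphabet {A, C, G, T}."""
    return 4 ** n

def cc_dim(max_shift):
    """Number of correlation coefficients: 4 x 4 base pairs per shift."""
    return 4 * 4 * max_shift

for n in (2, 4, 6, 8, 10):
    print(n, ntuple_dim(n), cc_dim(n))
# 2 16 32
# 4 256 64
# 6 4096 96
# 8 65536 128
# 10 1048576 160
```

Under this assumption the CC vector is smaller from n = 3 onward (48 versus 64), and the gap widens quickly as n grows.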