1. MATLAB & SNPCLIP
MATLAB: GUIs, Databases, and C++ code
SNPCLIP: Theory and design
2. MATLAB & GUIs MATLAB Page:
http://www.mathworks.com/access/helpdesk_r13/help/techdoc/creating_guis/ch_overv.html
GUI Tutorial
http://www.matlabgui.com/
3. Database Interfacing Database Toolbox 3.4
Importing Data:
Establish a connection
conn = database('DatabaseName','username','password', 'DriverName','URL of Server');
conn = database('oracle','scott','tiger',...
'oracle.jdbc.driver.OracleDriver','jdbc:oracle:oci7:');
4. Database Interfacing Retrieving Data
Query Builder:
5. Database Interfacing
6. Database Interfacing Exporting Data
Query Builder
7. Database Interfacing Retrieving Data
Command Line
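% General form: run an SQL query, then fetch R rows from the cursor
cursorA = exec(connectionA, 'select X from Y')
cursorA = fetch(cursorA, R)
% Example: return the first 3 Ids as a numeric matrix
conn = database('SampleDB', '', '');
curs=exec(conn, 'select all Ids from Customer');
setdbprefs('DataReturnFormat','numeric')
curs=fetch(curs,3);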
curs.Data
ans =
39
17
13
8. Database Interfacing Exporting Data
Command Line
fastinsert(conn, 'tablename', colnames, exdata)
exdata = {'San Diego', 88}
colnames = {'City', 'Avg_Temp'}
fastinsert(conn, 'Temperatures', colnames, exdata)
http://www.mathworks.com/products/database/index.html
9. MATLAB and C++ Code MEX Files
Hello World
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[]) {
    mexPrintf("Hello, world!\n");
}
10. MATLAB and C++ Code Compiling:
mex -setup
Select a compiler:
[1] Lcc C version 2.4.1 in D:\APPLIC~1\MATLAB\R2006b\sys\lcc
[2] Microsoft Visual C/C++ version 8.0 in D:\Applications\Microsoft Visual Studio 8
[3] Microsoft Visual C/C++ version 7.1 in D:\Applications\Microsoft Visual Studio .NET 2003
11. MATLAB and C++ Code Compile:
mex hello.c
Run:
hello
Output:
Hello, world!
12. MATLAB and C++ Code
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    int i, j, m, n;
    double *data1, *data2;
    if (nrhs != nlhs)
        mexErrMsgTxt("The number of input and output arguments must be the same.");
    for (i = 0; i < nrhs; i++)
    {
        /* Find the dimensions of the data */
        m = mxGetM(prhs[i]);
        n = mxGetN(prhs[i]);
        /* Create an mxArray for the output data */
        plhs[i] = mxCreateDoubleMatrix(m, n, mxREAL);
        /* Retrieve the input data */
        data1 = mxGetPr(prhs[i]);
        /* Create a pointer to the output data */
        data2 = mxGetPr(plhs[i]);
        /* Put data in the output array */
        for (j = 0; j < m*n; j++)
        {
            data2[j] = 2 * data1[j];
        }
    }
}
13. MATLAB and C++ Code
[a,b]=timestwo([1 2 3 4; 5 6 7 8], 8)
a =
2 4 6 8
10 12 14 16
b =
16
14. S.A.G.E. Statistical Analysis for Genetic Epidemiology
Many libraries and programs.
Summary statistics: means, standard deviations, family sizes, etc.
Data quality: Mendelian inconsistencies
Allele frequency
Haplotype frequency (under development)
Segregation Analysis
Many more
http://darwin.cwru.edu/sage
15. S.A.G.E. Runs on all platforms
Command line or GUI
16. New Approach SNP data is too large to comprehend; it is much easier to visualize and play with
17. SNPCLIP - Problem The problem:
DNA chips are acquiring SNPs in large quantities
SNPs are stored in different formats
Too much data to look at 'by hand'
Researchers need an effective way of knowing which SNPs are most important
Want to apply assorted conditions on the data
SNP data can take a lot of memory to store
18. SNPCLIP - Solution The Solution:
Automated system to import large files of various formats
Create a system that efficiently displays various calculations on different data sets
Allow the user to exclude/include based on criteria
Allow the user to move between datasets with ease
19. SNPCLIP - UI
20. SNPCLIP - Core Sample data:
FAMID,ID,rs2132594,rs1078687,rs1885865,rs1591426,rs1037800,rs594535,rs241251,rs709209,rs2057008,rs1005670,rs705681,rs763161,rs1473420,rs1294028,rs630075,rs2205669,rs877309,rs1075793
1333,1,2/2,4/4,4/4,2/2,3/3,4/1,2/4,4/4,2/2,1/2,1/3,4/4,1/2,2/4,2/4,4/4,3/3
1333,2,2/2,2/4,4/4,2/2,3/3,4/1,0/0,2/2,2/2,1/2,1/3,4/4,1/2,4/4,2/2,4/4,1/1
1333,3,2/2,2/4,4/4,2/2,3/3,4/4,2/2,2/4,2/2,1/1,3/3,4/4,1/1,4/4,2/2,4/4,1/3
1333,4,2/2,4/4,4/4,2/2,3/3,4/1,2/4,2/4,2/2,1/2,1/3,4/4,1/2,4/4,2/2,4/4,1/3
1340,12,2/2,2/4,4/4,4/4,1/3,4/1,4/4,2/4,2/2,0/0,0/0,4/4,2/2,0/0,2/2,2/4,1/1
1340,13,2/2,2/4,4/4,4/4,3/3,1/1,4/4,4/4,2/2,2/2,1/3,4/4,2/2,2/2,2/2,2/4,1/1
1341,1,2/2,2/4,2/4,2/2,3/3,4/4,2/2,2/4,2/2,1/1,1/3,2/4,1/2,2/2,4/4,4/4,3/3
1341,2,2/2,4/4,2/4,4/4,3/3,4/4,2/2,2/4,2/2,1/2,1/1,2/4,2/2,2/2,2/4,4/4,1/3
1341,3,2/2,4/4,2/2,2/4,3/3,4/4,2/2,2/4,2/2,1/2,1/3,4/4,2/2,2/2,2/4,4/4,3/3
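Each genotype field in the sample data is a pair of allele codes separated by a slash. A minimal C sketch of reading one such field is given here. The names `Genotype`, `parse_genotype`, and `genotype_is_missing` are hypothetical, and treating a 0 allele code (as in `0/0`) as a missing call is an assumption; the slides do not say how SNPCLIP itself parses its input.

```c
#include <assert.h>
#include <stdio.h>

/* One genotype call: the two allele codes exactly as they appear in the file. */
typedef struct {
    int allele1;
    int allele2;
} Genotype;

/* Parse a field such as "2/4" into a Genotype.
   Returns 1 on success, 0 if the field is not of the form "int/int". */
int parse_genotype(const char *field, Genotype *out)
{
    assert(field != NULL && out != NULL);
    return sscanf(field, "%d/%d", &out->allele1, &out->allele2) == 2;
}

/* Assumption: an allele code of 0 marks a missing call, as in "0/0". */
int genotype_is_missing(const Genotype *g)
{
    return g->allele1 == 0 || g->allele2 == 0;
}
```

A full reader would split each line on commas, keep FAMID and ID as identifiers, and pass every remaining field through `parse_genotype`.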
21. SNPCLIP - Core How to handle data:
Storage:
Create a matrix of bitvectors
Saves space
Easily access any SNP by index or bitmasking
Missing data:
Create a parallel matrix for missing values
Use bitmasking to identify missing/non-missing values
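The storage idea can be sketched in C as a bit-packed matrix, with a second matrix of the same shape holding the missing flags. This is only an illustration: `BitMatrix` and its functions are hypothetical names, and the layout assumed here (rows are individuals, columns are SNPs, one bit per cell) is a guess, since the slides do not specify what each bit encodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: rows are individuals, columns are SNPs, one bit per cell. */
typedef struct {
    size_t rows;
    size_t cols;
    size_t bytes_per_row;   /* cols rounded up to whole bytes */
    uint8_t *bits;          /* rows * bytes_per_row bytes, zero-initialized */
} BitMatrix;

/* Allocate a zeroed matrix. Returns 1 on success, 0 on allocation failure. */
int bitmatrix_init(BitMatrix *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->bytes_per_row = (cols + 7) / 8;
    m->bits = calloc(rows * m->bytes_per_row, 1);
    return m->bits != NULL;
}

void bitmatrix_free(BitMatrix *m)
{
    free(m->bits);
    m->bits = NULL;
}

/* Set or clear the bit at (r, c) by masking within its byte. */
void bitmatrix_set(BitMatrix *m, size_t r, size_t c, int value)
{
    assert(r < m->rows && c < m->cols);
    uint8_t *byte = &m->bits[r * m->bytes_per_row + c / 8];
    uint8_t mask = (uint8_t)(1u << (c % 8));
    if (value)
        *byte |= mask;
    else
        *byte &= (uint8_t)~mask;
}

/* Read the bit at (r, c): index the byte, then shift and mask. */
int bitmatrix_get(const BitMatrix *m, size_t r, size_t c)
{
    assert(r < m->rows && c < m->cols);
    return (m->bits[r * m->bytes_per_row + c / 8] >> (c % 8)) & 1;
}
```

Two such matrices used side by side, one for data bits and one for missing flags, give the parallel layout described above; a cell is read from the data matrix only when the same cell in the missing matrix is 0.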
22. SNPCLIP - CORE Data:
{1,0,1,0,1,1,1,0,}
{1,1,0,0,1,1,1,0,}
{1,0,1,1,1,1,1,1,}
Missing: (1 represents missing)
{0,0,1,0,0,0,1,0,}
{0,0,1,0,0,1,0,0,}
{1,0,0,0,0,0,1,0,}
23. SNPCLIP - Core What to do with Data:
FILTER!!!!
Modular design so any calculation can be run
Missingness, Allele Frequency, Departure from HWE...anything the researcher wants
Each custom filter creates a new bitmask based on calculations.
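As an illustration of a filter that produces a bitmask, here is a minimal C sketch of a missingness filter. It assumes each row of the missing matrix is packed into one `uint64_t` with bit s standing for SNP s (so at most 64 SNPs); the function name `missingness_filter` and the threshold parameter `max_missing` are hypothetical and not taken from SNPCLIP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a mask with bit s set when SNP s passes the filter, that is,
   when the fraction of individuals with a missing call at SNP s is at
   most max_missing. Assumes one packed uint64_t per individual. */
uint64_t missingness_filter(const uint64_t *missing_rows, size_t n_rows,
                            size_t n_snps, double max_missing)
{
    assert(n_snps <= 64 && n_rows > 0);
    uint64_t keep = 0;
    for (size_t s = 0; s < n_snps; s++) {
        size_t n_missing = 0;
        /* Count individuals whose missing flag is set for this SNP. */
        for (size_t r = 0; r < n_rows; r++)
            n_missing += (size_t)((missing_rows[r] >> s) & 1u);
        if ((double)n_missing / (double)n_rows <= max_missing)
            keep |= (uint64_t)1 << s;
    }
    return keep;
}
```

Other filters (allele frequency, departure from HWE) would follow the same pattern: compute a per-SNP statistic, compare it to a user-chosen threshold, and set or clear that SNP's bit in the returned mask.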
24. SNPCLIP - CORE Bitmask is applied to data.
Summary statistics are generated based on the user's preferences
Can filter on existing filtered data or on original tab
Allows user to 'play' with data to find meaningful information
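Applying and stacking masks can be sketched in C as plain bitwise operations. This assumes up to 64 SNPs packed into one `uint64_t` with bit s standing for SNP s; the function names are hypothetical, and the retained-SNP count stands in for whatever summary statistics the user has selected.

```c
#include <assert.h>
#include <stdint.h>

/* Filter on already-filtered data: a SNP stays only if both masks keep it.
   To filter on the original data instead, use the new mask on its own. */
uint64_t stack_filters(uint64_t current, uint64_t next)
{
    return current & next;
}

/* Apply a mask to one packed row of data bits: excluded SNPs are cleared. */
uint64_t apply_mask(uint64_t row_bits, uint64_t mask)
{
    return row_bits & mask;
}

/* A simple summary statistic: how many of the first n_snps SNPs are kept. */
unsigned count_kept(uint64_t mask, unsigned n_snps)
{
    assert(n_snps <= 64);
    unsigned n = 0;
    for (unsigned s = 0; s < n_snps; s++)
        n += (unsigned)((mask >> s) & 1u);
    return n;
}
```

Because the original data is never modified, dropping a filter only means discarding its mask, which is what lets the user experiment freely.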
25. SNPCLIP - UI