
Selecting the microarray genes that link the specific genes of interest among them.




  1. Selecting the microarray genes that link the specific genes of interest among them.
     http://ibb.uab.es/revresearch
     Huerta, M., Cedano, J. and Querol, E. (2007) Analysis of non-linear relation between expression profiles by the Principal Curves of Oriented-Points approach. J Bioinform Comput Biol, 6:367-386.

  2. Objectives
     • Provide powerful tools for studying the non-linear dependences among gene expressions, focused on the researcher's genes of interest.
     • Take advantage of the high-throughput capability of microarray technology.

  3. Procedure
     • Pre-process:
       • The degree of correlation between each pair of genes is obtained using the PCOP calculus.
       • The minimum-spanning tree among the microarray genes is built from the pairwise correlations previously calculated.
     • Zoom-in operation:
       • The genes that connect the query genes are selected using the minimum-spanning tree calculated in the pre-process. The researcher provides the query genes in each new query.
       • The intra-set behaviour pattern of the gene subset returned by the selection algorithm is obtained using the PCOP calculus. This inner pattern relates the expression fluctuations of the selected genes and the query genes to one another.
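The pre-process step can be illustrated with a minimal sketch. It assumes a stand-in dissimilarity, 1 - |Pearson r|, because the PCOP calculus itself is not reproduced here; the gene names, the expression values and the function names are all hypothetical. The tree is built with Prim's algorithm over the complete gene-by-gene graph.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def dissimilarity(x, y):
    # ASSUMPTION: placeholder for the PCOP-based measure used in the slides.
    return 1.0 - abs(pearson(x, y))

def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (gene_a, gene_b, weight) edges."""
    genes = list(profiles)
    root = genes[0]
    # best[g] = (weight, tree_gene): cheapest known link from g to the tree.
    best = {g: (dissimilarity(profiles[root], profiles[g]), root)
            for g in genes[1:]}
    edges = []
    while best:
        g = min(best, key=lambda k: best[k][0])
        w, attach_to = best.pop(g)
        edges.append((attach_to, g, w))
        # Gene g is now in the tree; see whether it offers cheaper links.
        for other in best:
            d = dissimilarity(profiles[g], profiles[other])
            if d < best[other][0]:
                best[other] = (d, g)
    return edges

# Toy profiles (invented values, for illustration only).
profiles = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10.5],
    "C": [5, 3, 4, 1, 2],
    "D": [1, 5, 2, 4, 3],
}
edges = minimum_spanning_tree(profiles)
for a, b, w in edges:
    print(f"{a} - {b}: {w:.3f}")
```

A tree over n genes always has n - 1 edges, so the pairwise table only needs to be consulted while the tree is built; later queries can work on the tree alone.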

  4. The selection algorithm uses the minimum spanning tree to select the genes that link the query ones.
     • The researcher's genes of interest have a certain level of correlation as a set, and the researcher does not wish to lose that level in the study. The zoom-in operation therefore selects the maximum number of genes that connect the genes of the query set while maintaining the correlation level of the query set in the new query-plus-selected-genes set.
     • When building a hierarchical clustering using single linkage, we obtain a minimum-spanning tree in which each edge represents the relationship used to add each new gene or gene cluster to the tree (Gower and Ross 1969). In this way, these clusters, their hierarchy and the properties of the hierarchical clustering can be applied to the corresponding minimum-spanning tree.
     • As a result, along the minimum-spanning path, the genes nearest to the query genes are the genes best correlated with each query gene, and the genes farthest from the query genes are those most correlated among themselves and with the best-correlated genes of the query genes (Huerta et al, 2007). In this way, the correlation degree of the query set is preserved in the new query-plus-selected-genes set.
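One way to picture the selection step is to keep only the part of the tree that lies on paths between query genes. The sketch below is an illustration under that assumption, not the authors' algorithm: it prunes every non-query leaf until none remain, and it reports the weakest (largest-dissimilarity) edge of the remaining subtree as a rough proxy for the correlation level that has to be preserved. The node names, edge weights and the function name `linking_genes` are hypothetical.

```python
from collections import defaultdict

def linking_genes(tree_edges, query):
    """Return (selected_genes, weakest_edge_weight) for a set of query genes.

    tree_edges: (gene_a, gene_b, dissimilarity) edges of a spanning tree.
    Assumes at least two query genes, all present in the tree.
    """
    query = set(query)
    adj = defaultdict(dict)
    for a, b, w in tree_edges:
        adj[a][b] = w
        adj[b][a] = w
    # Repeatedly strip leaves that are not query genes; what survives is
    # the union of the tree paths between the query genes.
    leaves = [n for n in adj if len(adj[n]) == 1 and n not in query]
    while leaves:
        leaf = leaves.pop()
        for nbr in adj.pop(leaf):
            del adj[nbr][leaf]
            if len(adj[nbr]) == 1 and nbr not in query:
                leaves.append(nbr)
    selected = sorted(set(adj) - query)
    weakest = max((w for n in adj for w in adj[n].values()), default=0.0)
    return selected, weakest

# Toy tree with invented weights: q1 and q2 are the query genes.
tree_edges = [
    ("q1", "a", 0.2),
    ("a", "q2", 0.3),
    ("q2", "b", 0.4),
    ("b", "c", 0.5),
    ("a", "d", 0.6),
]
selected, weakest = linking_genes(tree_edges, ["q1", "q2"])
print(selected, weakest)
```

Because a tree has exactly one path between any two nodes, the pruned subtree is unique, and its largest edge weight equals the single-linkage height at which the query genes first fall into one cluster.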

  5. Example of microarray analysis.
     • The profiles of 9703 cDNAs, representing ~8000 unique genes, in 60 cell lines, in relation to the activity profiles of 1400 drugs. The data include a resulting table of 1376 genes and 118 compounds, with the most representative substances and genes normalised for the 60 cell lines (suitable data for knowledge discovery using our tools).

  6. Minimum spanning tree among some microarray gene-expressions using the f value provided by the PCOP calculus.

  7. Hierarchical cluster corresponding to the previous minimum spanning tree.
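The correspondence between the tree and its hierarchical cluster can be sketched as follows: sorting the tree edges by weight and joining their endpoints in that order reproduces the single-linkage merges, with each edge weight as the merge height. This is a minimal illustration with a hypothetical toy tree and invented weights; `single_linkage_merges` is an illustrative name, not a function from the authors' tools.

```python
def single_linkage_merges(tree_edges):
    """List the single-linkage merges implied by a minimum spanning tree.

    Returns (cluster_a, cluster_b, height) tuples in merge order.
    """
    parent, members = {}, {}
    for a, b, _ in tree_edges:
        for n in (a, b):
            parent.setdefault(n, n)
            members.setdefault(n, [n])

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    for a, b, w in sorted(tree_edges, key=lambda e: e[2]):
        ra, rb = find(a), find(b)  # always distinct for tree edges
        merges.append((sorted(members[ra]), sorted(members[rb]), w))
        parent[rb] = ra
        members[ra] = members[ra] + members.pop(rb)
    return merges

# Toy tree with invented weights.
tree_edges = [
    ("q1", "a", 0.2),
    ("a", "q2", 0.3),
    ("q2", "b", 0.4),
    ("b", "c", 0.5),
    ("a", "d", 0.6),
]
merges = single_linkage_merges(tree_edges)
for left, right, height in merges:
    print(left, "+", right, "at", height)
```

Each printed line corresponds to one junction of the dendrogram, so the n - 1 tree edges map one-to-one onto the n - 1 merges of the hierarchy.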

  8. Example 1: relating cyclin E1 (CCNE1) and TP53 expressions.
     • Selected genes that link the query ones:
       • TXNDC5, the gene encoding the thioredoxin-related protein endothelial protein disulphide isomerase.

  9. Minimum spanning tree among some microarray gene-expressions using the f value provided by the PCOP calculus. (Figure legend: query genes; selected genes.)

  10. The non-linear relationship among the expression of the three genes. (Figure panels: TXNDC5, TP53, CCNE1.)

  11. Results analysis
     • Cyclin E1 is a regulatory subunit of the cdc2-related protein kinase CDK2, which is activated shortly before S-phase entry. Lower levels of cyclin E1 imply lower cell-division rates, whereas higher levels of cyclin E1 precede higher rates of cell division (Hinchcliffe et al 1999).
     • High levels of TP53 either induce apoptosis (in the presence of appropriate mutations) or switch on the mechanisms of DNA repair. At low levels of TP53, less apoptosis occurs and mutations can accumulate more easily.
     • Rapidly dividing cells are known to show a higher mutation rate, whereas slowly dividing cells show lower rates (Bielas and Heddle 2003). Constitutive cyclin E1 over-expression, in both immortalized rat embryo fibroblasts and human breast epithelial cells, has been reported to result in chromosomal instability (Spruck et al 1999; Tissier et al 2004). A slight overproduction of cyclin E1 (just 5% more is enough) has been associated with the malignant phenotype and is strongly correlated with tumour size (Tissier et al 2004).
     • Other authors have also reported the association of the above genes with cell division and apoptosis (Knoblach et al 2003; Sullivan et al 2003).

  12. Example 2: relating cyclin E1 (CCNE1), TNK2 and CDK6 expressions.

  13. The non-linear relationship among the expression of the three genes. (Figure panels: CCNE1, TNK2, CDK6.)

  14. Results analysis
     • The CDK6 and cyclin E1 genes show mutually exclusive expression with respect to the TNK2 gene. When TNK2 is expressed above the control level, CDK6 and cyclin E1 levels stay around their minimum expression. When CDK6 and cyclin E1 are over-expressed, TNK2 is at its minimum level.
     • The 0 value of the curve parameter (the abscissa axis in the figure) is positioned where all of the genes are at basal expression. The location of this 0 value shows that cyclin E1 and CDK6 over-expression is more common than TNK2 over-expression.

  15. Conclusions
     • Our approach relates genes beyond the activation pathways or GO functional annotation:
       • Consider the connection found between cyclin E1 and TP53 in the example. In the GO tool, their functions and biological processes are only indirectly related. In activation or metabolic pathways, they are not related at all. In neither case is the dependence between their respective expression levels shown.
       • The hidden reason for the links found among TXNDC5, cyclin E1 and TP53 is the adaptive response of the cell (TXNDC5) in order to survive in a low-oxygen environment caused by the cell growing at a high division rate (cyclin E1). The link reflects neither membership of the same activation pathway nor an interaction between these proteins.

  16. Bibliography
     • Delicado, P. (2001) Another look at principal curves and surfaces. J. Multivariate Anal., 77, 84-116.
     • Delicado, P. and Huerta, M. (2003) Principal curves of oriented points: Theoretical and computational improvements. Comput. Stat., 18, 293-315.
     • Huerta, M., Cedano, J. and Querol, E. (2007) Analysis of non-linear relation between expression profiles by the Principal Curves of Oriented-Points approach. J Bioinform Comput Biol, 6:367-386.
