Construction of Molecular Networks and Pathways using OMICs and Literature Data

Presentation Transcript


  1. Construction of Molecular Networks and Pathways using OMICs and Literature Data. Mathew Palakal and Meeta Pradhan, School of Informatics, IUPUI

  2. From Bibliomics to Target Discovery for Colorectal Cancer
     • CRC related Keywords
     • BioSIFTER: Literature harvesting and Personalization
     • BioMAP: Mining and Identification of novel biomarkers

  3. BioSIFTER

  4. BioSIFTER

  5. BioSIFTER

  6. BioSIFTER

  7. BioSIFTER

  8. BioSIFTER

  9. BioMAP: BioMedical Literature Mining
     • "A major challenge faced by biologists is to identify the most significant genes in a disease that can be targeted."
     • Our hypothesis: Augmenting the experimental data with literature data can help to identify novel molecules that may be of significant relevance to the study under consideration.
     • Diagram labels: Nodes/Links (Experimental Data); New Nodes/Links (Augmented with Literature Data)

  10. Regulatory Network Construction and Analysis
     • Experiment Data: hMLH1 (DNA repair), MSH2 (DNA repair), CDK8 (Wnt signaling)
     • Literature augmented data: SMAD4, P53, NF-kB, AKT1, PAK1, SOS
     • Protein Interaction Prediction (diagram example: P53, EP300): Gene Ontology Annotation Similarity Association, Structural Interactions, Pfam Domain Interactions, Sequence Potential Analysis
     • Interaction Scoring: (i) First Principle Methods, (ii) Machine Learning
     • CRC TF Network; CRC miRNA Network
     • Multi-scale Multi-level Analysis: Topological Analysis, Sub-Graph Analysis, Hypergeometric Associations
     • Annotating the Interaction Network with miRNA and miRNA Expression Data
     • Validation of the Significant Genes

  11. Experiments on TF Networks
     • Set of 48 keywords: myh, mlh1, cdk8, crcs7, dcc, crcs6, tgfbr1, tpx2, crcs, apc, hnpcc7, msh2, mlh1, braf, hnpcc, msh6, pten, fus1, cxcl2, rad18, hgf, axin2, casp3, prl3, nat1, gstm1, gstt1, cyp2c9, bcl2, prmt1, sn38, cpt11, proxy, smad3, igfbp1, pdgfb, capg, plk1, ifim1, csnk2a2, mbl2, pms2, cxcl2, igfir, cyp27b1, cyp24, mucins, colorectal cancer

  12. Literature Mining
     • Retrieved 133,923 articles.
     • Obtained 2724 unique Swiss-Prot entry names.

  13. Protein Interaction Prediction
     • Protein-protein interaction prediction is based on:
       • Gene Ontology Annotation Similarity Association
       • Structural Interaction
       • Pfam domain interaction
       • Sequence Potential Analysis
     • Figure label: 2724
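One of the four evidence types listed on this slide, Gene Ontology annotation similarity, can be illustrated with a small sketch. The slide does not say which similarity measure was used, so the Jaccard index below is an assumption, and the protein names and GO identifiers are placeholders invented for illustration, not real annotations.

```python
def go_jaccard(terms_a, terms_b):
    """Jaccard similarity between two sets of GO term identifiers.

    Assumption: the talk does not name its similarity measure;
    Jaccard overlap is used here only as a simple stand-in.
    """
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0  # no annotations on either side: treat as no evidence
    return len(a & b) / len(a | b)


# Hypothetical annotation sets (placeholder IDs, not real GO terms).
annotations = {
    "PROT_X": {"GO:X1", "GO:X2", "GO:X3"},
    "PROT_Y": {"GO:X2", "GO:X3", "GO:X4"},
}

score = go_jaccard(annotations["PROT_X"], annotations["PROT_Y"])
print(score)  # 2 shared terms out of 4 distinct terms -> 0.5
```

A score like this would be only one input among the four listed; how the talk combines them into an interaction score is not stated on the slide.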

  14. Sliding Window Algorithm for PPI Prediction
     • Physico-chemical parameters for probable interacting interface identification:
       • Hydrophobicity
       • Accessibility
       • Residue Interface Propensity
     • P53 : EP300 = Total Interacting Score (Number of Interface Residue and Number of Structure Interacting)
     • Percentage of structure interacting, per protein pair:
       • P53_HUMAN 70 : MDM2_HUMAN 93
       • P53_HUMAN 59 : EP300_HUMAN 100
       • P53_HUMAN 67 : MDM4_HUMAN 100
       • UBP7_HUMAN 100 : P53_HUMAN 74
     • Structure labels shown in the figure (P53, EP300): 2F1Y A, 1c26 A, 1Z1M A, 1L3E B, 3BIY A
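The sliding-window idea can be sketched for one of the three parameters, hydrophobicity. This is a minimal illustration, not the authors' algorithm: the slide gives no scale, window length, or cutoff, so the Kyte-Doolittle hydropathy scale, `window=5`, and `threshold=1.5` are all assumptions. Accessibility and residue interface propensity would need structural data and are not covered.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slide names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(sequence, scale=KYTE_DOOLITTLE, window=5):
    """Mean scale value for every full-length window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    # Raises KeyError on non-standard residue letters.
    values = [scale[residue] for residue in sequence.upper()]
    running = sum(values[:window])
    scores = [running / window]
    for i in range(window, len(values)):
        # Slide the window one residue: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        scores.append(running / window)
    return scores


def candidate_windows(sequence, window=5, threshold=1.5):
    """Start positions (0-based) of windows whose mean meets the cutoff.

    The threshold is a hypothetical value chosen for illustration.
    """
    scores = window_scores(sequence, window=window)
    return [i for i, s in enumerate(scores) if s >= threshold]


# Toy sequence, not a real protein: hydrophobic run followed by charged run.
print(candidate_windows("IIIKKK", window=3))
```

Windows that pass the cutoff would be candidate interface patches; in the approach described on the slide they would be weighed together with the other two parameters.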

  15. Transcription Factor Network Generation for CRC
     • 117 transcription factors
     • 277 non-transcription factors
     • 700 interactions

  16. Multi-level Multi-parametric Approach to Identify Significant Transcription Factors in CRC Network
     • Topological Analysis: Node strength = function(Protein Interaction Propensity Score, Topological Features)
     • Sub-Graph Analysis
     • Hypergeometric Associations
     • A multiparametric approach is used to identify significant transcription factors.
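The hypergeometric association step can be illustrated with a standard over-representation test. The slide does not state what was counted, so the setup below is an assumption: it asks how surprising it is for a sub-graph of a given size to contain at least a certain number of transcription factors. The network totals (117 transcription factors among 117 + 277 = 394 nodes) come from the network-generation slide; the module size and hit count are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: total number of nodes in the network
    successes:  nodes carrying the property of interest (e.g. TFs)
    draws:      size of the sub-graph being examined
    observed:   nodes in the sub-graph that carry the property
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# 394 nodes and 117 TFs are from the slides; a 10-node module
# containing 7 TFs is a made-up example.
p = hypergeom_pvalue(population=394, successes=117, draws=10, observed=7)
print(f"p = {p:.3g}")
```

A small p-value would mark the module as enriched for transcription factors relative to the whole network. The same function applies to any other property, such as membership in a pathway.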

  17. Results: Significant Transcription Factors in CRC Network
     • Highly scored common transcription factors: c-Jun, NF-kB, P53, STAT3, SP1, STAT1, c-MYC, E2F1, SMAD3, MEF2A
     • Highly scored unique transcription factors:
       • Topological: LEF1, MEF2C, SMAD2, SMAD4, ELK-1, PPARA
       • Hypergeometric: DAND5, RXRA, ESR1, ATF-2, SP3, RARA, PPARD
       • Module: P73, ETS1, ETS2, GATA-1, FOXA1, FOXA2, SLUG, HAND1, SNAIL, VDR, TF7L2, ITF-2, REST, SRF, IRF1

  18. Result: A Highly-scored Module
     • Module nodes shown in the figure: PIAS1, C-JUN, ATF-2, ESR1, MAPK14, MAPK1, JNK1, ELK-1, MK09, MK10

  19. Validation of the Significant Genes

  20. Validation of the Significant Genes

  21. Validation of the Significant Genes

  22. Global Transcription Factor Association Network showing Functional Groups

  23. Annotation of miRNA with Transcription Factors in CRC
     • Expression dataset: GSE14985
     • 3 normal samples, 3 colon samples
     • Number of miRNA: 723
     • The top 100 differentially expressed miRNA are identified.
     • 26 upregulated and 74 downregulated miRNA are further analyzed.
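Selecting the most differentially expressed miRNA and splitting them into up- and down-regulated sets can be sketched as follows. The slide does not say which statistic was used, so ranking by the difference of group means on log2-scale values is an assumption (with three samples per group, a moderated test would usually be preferred in practice). The miRNA names and values are invented toy data, not taken from GSE14985.

```python
from statistics import mean


def split_top_changes(expression, top_n=100):
    """Rank miRNA by absolute log2 fold change and split by direction.

    expression: {name: (normal_values, tumor_values)}, log2-scale intensities.
    Returns (upregulated, downregulated) names among the top_n changes.
    """
    changes = {
        name: mean(tumor) - mean(normal)
        for name, (normal, tumor) in expression.items()
    }
    ranked = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)
    top = ranked[:top_n]
    up = [name for name, delta in top if delta > 0]
    down = [name for name, delta in top if delta < 0]
    return up, down


# Hypothetical toy data: three normal and three tumor values per miRNA.
toy = {
    "miR-A": ([5.0, 5.0, 5.0], [8.0, 8.0, 8.0]),  # log2 change +3.0
    "miR-B": ([9.0, 9.0, 9.0], [6.0, 6.0, 6.0]),  # log2 change -3.0
    "miR-C": ([7.0, 7.0, 7.0], [7.5, 7.5, 7.5]),  # log2 change +0.5
}

up, down = split_top_changes(toy, top_n=2)
print(up, down)  # ['miR-A'] ['miR-B']
```

With `top_n=100` on the 723 miRNA in the dataset, the same split would yield the two lists that the slide reports as 26 upregulated and 74 downregulated.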

  24. Novel miRNA Identified: Up-regulated
     • hsa-miR-663: targets CCND1, FOS, PTEN, TGFBR1; relevance to cancer: Not reported*
     • hsa-miR-630: targets ATM, BAX, BCL2, BCL2L2, CASP3, p53, TP73; relevance to cancer: Not reported*
     • hsa-miR-424: targets ATF2, BCR, CCND1, CDK6, CHEK1, E2F1, EGFR, ESR1, ETS1, FLT3, HIF1A, MUC1, MYB, RARA, RUNX1, SMAD3, SP2, WEE1; relevance to cancer: Kidney, Pancreatic cancer
     * The target genes were identified by literature mining, and many of the genes are important in CRC.

  25. Novel miRNA Identified: Down-regulated
     • hsa-let-7c: targets BBC3, BCL2, MCL1, MEF2C, MYC, NGF, PPARA, ADAM9; disease: Lung, hepatocellular cancer
     • hsa-let-7d: targets BDNF, CCND1, EGFR, SMAD3; disease: Epithelial ovarian cancer
     • hsa-let-7i: targets BCL2, HIF1A, NFKB1, TLR4; disease: Breast cancer
     • hsa-miR-103: targets BMP7, CDK6, PPARA; disease: Pancreatic cancer
     • hsa-miR-100: targets AKT1, CCND1, ESR1, FGFR3, JUN, P53, MYC; disease: Oral squamous cell carcinoma
     • hsa-miR-99a: targets AKT1, BDNF, CCND1, JUN, IGF1, MYC, p53; disease: Bladder cancer
     • hsa-miR-30e: targets Bcl2l2, ERBB2; disease: Lung cancer
     • hsa-miR-425: targets SMAD3; disease: Glioblastoma
     • hsa-miR-361-5p: targets AKT1, IRS1; disease: Ovarian cancer
     • hsa-miR-494: targets AKT1, CDK6, JUN, PTEN; disease: Cardiac hypertrophy
     • hsa-miR-331-3p: targets AKT1, EGFR, ERBB2; disease: Epithelial ovarian cancer

  26. miRNA-gene Network

  27. Number of miRNA Associated with CRC Related Pathways

  28. Validation of the Significant Genes
     • Module: Brca1 : p53 : c-Myc
     • Pathway: Brca1 as a transcription regulator
     • Domain: DNA Damage

  29. Protein-Protein Interaction Prediction Tool
     • Algorithm for Interacting Proteins

  30. Publications
     • M. Pradhan, P. Gandra, and M. Palakal, Predicting Protein-Protein Interactions using First Principle Methods and Statistical Scoring, ACM International Symposium on Biocomputing, Calicut, 2010.
     • M. Pradhan and M. Palakal, Global analysis of transcription factors and functional domains in CRC (Manuscript under preparation).
     • M. Pradhan and M. Palakal, Identifying CRC specific pathways and biomarkers from literature augmented proteomics data, BIOCOMP 2010.
     • M. Pradhan and M. Palakal, Global analysis of miRNA target genes in colon rectal cancer, IEEE BIBM, Hong Kong, 2010.
     • M. Pradhan and M. Palakal, Global analysis of transcription factors in CRC using protein interaction networks (Manuscript in final stages).
     • M. Pradhan and M. Palakal, Identifying candidate pathways and genes in CRC: meta-analysis of gene expression data (Manuscript in preparation).
     • M. Pradhan and M. Palakal, Machine Learning for Predicting Protein Interactions (Manuscript in preparation).
     • M. Pradhan, P. Sanders, and M. Palakal, Algorithm for Protein-drug binding predictions (Manuscript in preparation).
     • Y. Pandit, M. Pradhan, and M. Palakal, Database for Protein-Protein Interaction Predictions (Manuscript in preparation).

  31. Acknowledgements
     • The TiMAP team: Meeta Pradhan, Shielly Hartanto, Premchand Gandra, Deepali Jhamb, Rini Pauly, Gokul Kilaru, Philip Sanders, Yogesh Pandit, Sijin C. A., Tulip Nadu, Kshithija Nagulapalli
     • http://regen.informatics.iupui.edu/research/

  32. Questions?
