Class projects


Future Work and Possible Project Topics in Gene Regulatory Networks

Learning from multiple data sources;

Learning causality in Motifs;

Learning GRN with feedback loops;

Learning from Multiple Data Sources

  • We have gene expression data and topological ordering information;

  • Incorporate other data sources as prior knowledge for learning;

    • Transcription factor binding location data;

Example: Partial regulatory network recovered using expression data and location data.
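One way to picture how the ordering and the location data could work together is sketched below. It is only an illustration under stated assumptions: the function name `candidate_edges`, the gene names, and the idea of simply ranking binding-supported edges first are all hypothetical and are not taken from the slides.

```python
from itertools import combinations


def candidate_edges(order, binding_pairs):
    """Enumerate regulator -> target edges allowed by a topological ordering.

    order: genes listed so that regulators come before their targets.
    binding_pairs: set of (regulator, target) pairs supported by
        transcription factor binding location data.
    Returns a list of (regulator, target, has_binding_support) tuples.
    """
    edges = []
    # combinations() preserves input order, so the first gene of each
    # pair always precedes the second in the topological ordering.
    for regulator, target in combinations(order, 2):
        supported = (regulator, target) in binding_pairs
        edges.append((regulator, target, supported))
    # Stable sort: edges with binding support come first, so a learner
    # could consider them before the unsupported ones.
    edges.sort(key=lambda edge: not edge[2])
    return edges


# Hypothetical gene names, for illustration only.
order = ["geneA", "geneB", "geneC"]
binding_pairs = {("geneA", "geneC")}
edges = candidate_edges(order, binding_pairs)
```

A real learner would turn the binding support into a prior probability on each edge rather than a simple ranking; the ranking is used here only to keep the sketch short.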

Learning Causality in Motifs

  • Network motifs are the simplest units of network architecture.

  • They can be used to assemble a transcriptional regulatory network.

Future Work and Possible Project Topics in Protein Interaction

  • Learning from multiple data sources;

  • Disease related protein-protein interactions;

  • Learning from different species;

Learning from Multiple Data Sources

  • Gene Neighbor: identifies protein pairs encoded in close proximity across multiple genomes.

  • Rosetta Stone

  • Phylogenetic Profile

  • Gene Clustering: identifies closely spaced genes and assigns a probability P of observing a particular gap distance.
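As a rough illustration of the phylogenetic profile idea listed above: each protein is described by its presence or absence across a set of genomes, and proteins with similar profiles are candidates for functional linkage. The sketch below assumes 0/1 profiles and uses Jaccard similarity, which is just one simple choice and not necessarily the measure intended in the slides; the profiles are made-up toy values.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile is a sequence of 0/1 flags, one per genome, where 1
    means a homolog of the protein is present in that genome.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    pairs = list(zip(profile_a, profile_b))
    both = sum(1 for a, b in pairs if a and b)
    either = sum(1 for a, b in pairs if a or b)
    # Two all-absent profiles carry no evidence, so report 0.0.
    return both / either if either else 0.0


# Toy profiles over five hypothetical genomes.
protein_x = [1, 0, 1, 1, 0]
protein_y = [1, 0, 1, 1, 0]
protein_z = [0, 1, 0, 0, 1]

same = profile_similarity(protein_x, protein_y)
different = profile_similarity(protein_x, protein_z)
```

Pairs scoring above some threshold would then be passed on as one of several evidence sources for predicting an interaction.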

Disease-Related Protein-Protein Interactions

Disease related?

  • Query NCBI OMIM


Learning from Different Species

Projects for BioQA

  • Learning

    • Given a set of relevant abstracts, what kind of features can we obtain to enhance our queries?

    • Given a set of questions from users, how can we identify keywords from the questions to form queries?

  • Answer Presentation

    • Given a relevant abstract/article,

      • how can we retrieve the relevant passage with respect to the user’s question?

      • how can we extract answers?
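For the question-to-query item above, a very small baseline is to drop stop words and join what remains. The sketch below assumes a tiny hand-written stop-word list and an `AND`-joined query string; both are illustrative choices, and the sample question is made up.

```python
import re

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "is", "are", "what", "which",
    "how", "does", "do", "to", "for", "and", "with", "by",
}


def question_to_keywords(question):
    """Return the non-stop-word tokens of a question, in order, without repeats."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", question)
    keywords = []
    for token in tokens:
        if token.lower() not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords


# Hypothetical user question.
keywords = question_to_keywords("What is the role of PRNP in the disease?")
query = " AND ".join(keywords)
```

Features mined from relevant abstracts could then be appended to `keywords` to expand the query.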

Projects for BioQA

  • Automatic Extraction

    • Extract relations of gene-disease, gene-biological process (also their corresponding organisms)

    • Uniquely identify the genes

      • A gene symbol can be associated with multiple gene identifiers. Which gene identifier is the right one?

    • Can these extraction processes be generalized?

  • Sortal Resolution

    • Given an abstract and query, perform sortal resolution (but not on pronouns)

    • Example:

      • Given the following abstract:

        • “In this report, we show that virus infection of cells results in a dramatic hyperacetylation of histones H3 and H4 that is localized to the IFN-beta promoter. … Thus, coactivator-mediated localized hyperacetylation of histones may play a crucial role in inducible gene expression.” [PMID: 10024886]

      • and the query about histones, perform resolution on histones

      • Result: “histones” refers to H3 and H4.
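The histone example can be reproduced with a simple pattern-based baseline: look for the sortal noun followed directly by capitalised identifiers joined by commas or “and”. This is only a sketch; the function name `resolve_sortal` and the pattern are assumptions, and real abstracts would need far more robust handling.

```python
import re


def resolve_sortal(abstract, sortal):
    """Collect identifiers that directly follow the sortal noun in the text."""
    # Identifiers are capitalised tokens such as "H3".
    ident = r"[A-Z][A-Za-z0-9]*"
    joiner = r"(?:\s*,\s*|\s+and\s+)"
    pattern = re.compile(
        rf"\b{re.escape(sortal)}\s+({ident}(?:{joiner}{ident})*)"
    )
    members = []
    for match in pattern.finditer(abstract):
        for name in re.split(r"\s*,\s*|\s+and\s+", match.group(1)):
            if name not in members:
                members.append(name)
    return members


# Abstract text as quoted in the slide (PMID: 10024886), including its elision.
abstract = (
    "In this report, we show that virus infection of cells results in a "
    "dramatic hyperacetylation of histones H3 and H4 that is localized to "
    "the IFN-beta promoter. … Thus, coactivator-mediated localized "
    "hyperacetylation of histones may play a crucial role in inducible "
    "gene expression."
)
members = resolve_sortal(abstract, "histones")
```

The second mention, “histones may play”, contributes nothing because no capitalised identifier follows it, which is the behaviour the example calls for.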

Projects for BioQA

  • Semantics of Words

    • Dealing with the semantics of words to improve the retrieval of answers

      • Example: semantic relation between “role” and “play”

  • Gene symbol variants, gene symbol disambiguation, entity recognition

    • Generate gene symbol synonyms and variants given a gene symbol in a query

      • Example: variants of “CDC28” can be written as “Cdc28”, “Cdc28p”, “cdc-28”

    • “GSS” is a synonym of “PRNP”, but “GSS” itself is also a gene which is unrelated to “PRNP”.

    • Improve on recognition of diseases, biological processes

  • Extension of Ontology

    • To capture biological processes and their possible relations to diseases

    • Examples:

      • learning and/or memory can influence Alzheimer’s disease

      • Degradation of ubiquitin cycle can cause extra long/short half-life of genes

      • Extra long/short half-life of genes can cause cancer
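For the gene symbol variants item above, a rule-based generator is an easy starting point. The sketch below covers only symbols of the form letters followed by digits and produces case, hyphen, and trailing “p” variants modelled on the “CDC28” example; the function name and the rule set are assumptions, not an established standard.

```python
import re


def symbol_variants(symbol):
    """Generate spelling variants for a symbol made of letters then digits."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", symbol)
    if not match:
        # Symbols outside the simple letters+digits shape are returned as-is.
        return [symbol]
    letters, digits = match.groups()
    variants = []
    for stem in (letters.upper(), letters.capitalize(), letters.lower()):
        for separator in ("", "-"):
            variants.append(f"{stem}{separator}{digits}")
    # Protein-style name with a trailing "p", as in the "Cdc28p" example.
    variants.append(f"{letters.capitalize()}{digits}p")
    return variants


variants = symbol_variants("CDC28")
```

Generated variants can be OR-ed into the query. Synonym cases such as “GSS” and “PRNP” cannot be handled by string rules and need a gene identifier lookup instead.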

Other Projects

Build an Ontology

  • Build an ontology for a domain for which we do not have an ontology yet.

  • Verify its consistency.

Various Kinds of Text Extraction Systems

  • TREC suggested ones

    • Which method/protocol is used in which experiment/procedure

    • Gene – disease – role

    • Gene – biological process – role

    • Gene – mutation type – biological impact

    • Gene – interaction – gene – function – organ

    • Gene – interaction – gene – disease – organ

  • Protein Lounge inspired

    • Kinase-phosphatase

    • transcription factor

    • peptide antigen

Drug Classification in Pharmacogenetics

Experimental data available

  • Drug response on cell lines; gene expression data; gene copy data; mutation analysis data; RNAi data

Data from literature

  • Mutation data (Sanger lab); NCI-60 drug response data; Mutation analysis data; Pathway data (e.g. BIND); Gene Ontology

  • Proprietary data

    • Where does the drug physically interact? (600 kinases – IC50)

  • Gene expression data of patients after treatments


  • Given a patient, what kinds of data do we need in order to determine if a drug should be applicable to that patient or not? How do we develop a classifier using these kinds of data?

  • Find gene and protein interaction network (or components) using these data.
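As a starting point for the classifier question, a nearest-centroid classifier over per-patient feature vectors is about the simplest baseline to compare against. Everything in the sketch below is an assumption for illustration: the two features, the labels, and the numbers are invented toy values, not real measurements.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def train(features, labels):
    """Compute one centroid per class label."""
    groups = {}
    for row, label in zip(features, labels):
        groups.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in groups.items()}


def predict(model, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(center):
        return sum((x - y) ** 2 for x, y in zip(row, center))

    return min(model, key=lambda label: squared_distance(model[label]))


# Toy data invented for illustration: each row is a hypothetical patient,
# each column a hypothetical normalised feature (e.g. an expression level).
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
labels = ["responder", "responder", "non-responder", "non-responder"]

model = train(features, labels)
prediction = predict(model, [0.85, 0.15])
```

Which data types actually belong in the feature vector (expression, copy number, mutation status, kinase IC50 profiles) is the open question the project would need to answer.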