Informatics Journal Club and Research Talk Template

Research Paradigm

Driving Biomedical Problem

Informatics Methods (existing)

New Methods

Apply to 2nd problem area to see generality of new method

Evaluate: 1) ability to solve the biomedical problem and 2) incremental improvement from the new method

Guide to Talks
  • Pick journal club paper or research topic that discusses new or improved informatics methods
  • Make the focus of your talk be a description of the methods.
  • Describe methods in the context of previous work and in light of evaluations of the methods applied to the biomedical problem

BMI Journal Club: Finding function: evaluation methods for functional genomic data

Myers, Barrett, Hibbs, Huttenhower & Troyanskaya

BMC Genomics 2006 7:187

Russ B. Altman

10/4/06

Why this paper?
  • {Brief bullet points about why this paper is a good BMI journal club paper, and why you selected it}
Outline (part 1)
  • General description of medical/biological problem
  • Informatics issues that come up in solving those problems
  • Additional biological/informatics background
  • Aims of paper
Outline (part 2)
  • Methods Employed
  • Results
  • Comparison/Evaluation of Methods
  • Authors' Conclusions
  • Assessment of paper: informatics
  • Assessment of paper: biomedicine
  • Concerns
  • Summary/ Your conclusions
Why this paper?
  • Needed a good methodological paper
  • Proliferation of work here and elsewhere on predicting gene function from high-throughput genomics
  • This paper addresses an important problem in evaluation, and uses general informatics principles
  • Olga is a recent BMI graduate :)
Potentially confounding biomedicine! ;)
  • {What is application area of biology or medicine in which this work is presented?}
  • {Discussion of the biological or medical problem that drove/required/suggested researchers to recognize potential for informatics innovation}
  • {What is the significance of this biomedical problem}
  • {REMEMBER TO SEPARATE THE INFORMATICS FROM THE BIOMEDICAL APPLICATION. THAT MAY LEAVE NOTHING…}
(Potentially confounding) biomedical background…
  • With the human genome sequenced, we need to understand the interactions and functions of genes (for understanding, drug design)
  • High-throughput experimental data sets are used and integrated for this purpose: two-hybrid, mRNA expression, affinity precipitation
  • Diverse algorithms are also created for integrating these data:
    • Naïve Bayes (Troyanskaya & others)
    • Probabilistic Relational Models (Koller)
    • Comparative techniques (Segal & Stuart)
More biology context…
  • It is critical to assemble networks of interacting and functionally related genes in order to generate hypotheses about cellular biology, identify drug targets, and assess pathway engineering opportunities.
  • Yeast is the best-studied organism because of the wealth of data sets
  • Authors suspect that use of existing “silver standards” may skew conclusions about high vs. low information content methods/data sources.
  • Scientists are frustrated if many predictions are “high confidence” and then fail in the lab.
Informatics Problem
  • {Describe what is the general biomedical informatics question/problem addressed in the paper}
  • {Brief review of what others have done to solve this problem, and how performance has been. THIS MAY REQUIRE READING OTHER PAPERS!}
  • {Why is there another paper on this topic?}
Informatics Problem
  • Whenever a method is created that makes “predictions” or “diagnoses” it must be evaluated against a gold standard of truth.
  • When making multiple predictions, there can be biases in the gold standard based on its coverage of the predicted space
  • The resulting reports of performance can vary widely and unpredictably based on which parts of the gold standard are used.
  • This is a relatively new problem in the context of large scale predictive technologies
Informatics Problem
  • What is the best way to evaluate a system making thousands or millions of predictions?
  • How can we “level the playing field” so that different methods and data sources can be assessed with respect to information content fairly?
Biomedical Context (alternative slide location)
  • [You may want to address the informatics question first and then raise the medical/biological context, but it often flows better if you start with the biomedical context and use that to motivate the informatics question.]
Background
  • {Review of informatics and biomedicine people need to know in order to understand the key contributions of the paper}
Background
  • Gene Ontology
    • Taxonomy of gene function, 30K+ terms
    • Terms assigned to genes manually = genes related if they get the same term
  • KEGG
    • Database of biological pathways
    • Mostly metabolic, manually curated
    • Genes in same pathway = related
  • Each of these provides a biased coverage of gene function space!
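
To make the "same term = related" and "same pathway = related" rules concrete, here is a minimal sketch. The annotation tables, term names, and gene names are hypothetical toy data, not taken from GO, KEGG, or the paper; the point is only that two curated sources yield different, partially overlapping sets of "related" gene pairs.

```python
from itertools import combinations


def related_pairs(term_to_genes):
    """Return every unordered gene pair that shares at least one term or pathway."""
    pairs = set()
    for genes in term_to_genes.values():
        # Sorting gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(genes), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical toy annotations (illustrative only).
go = {"GO:term_A": {"g1", "g2", "g3"}, "GO:term_B": {"g3", "g4"}}
kegg = {"pathway_X": {"g1", "g2"}, "pathway_Y": {"g5", "g6"}}

go_pairs = related_pairs(go)
kegg_pairs = related_pairs(kegg)
shared = go_pairs & kegg_pairs

print(len(go_pairs), len(kegg_pairs), sorted(shared))
```

In this toy case only one pair is supported by both sources, which is the coverage bias the slide warns about: a prediction can look right or wrong depending on which standard is consulted.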
Background
  • GO is organized from most general (top) to most specific (bottom)
  • For validation, people often choose a “level” of GO at which they define GO annotations to be “meaningful.”
  • E.g. All GO codes at level 5 or below = sufficiently precise predictions.
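
The "level" cutoff can be sketched as follows. This assumes that a term's level is its shortest distance from the root; GO is a directed acyclic graph, so a term can be reached by paths of different lengths and other definitions are possible. The tiny ontology and the function names are hypothetical.

```python
from collections import deque


def term_levels(children, root):
    """Breadth-first search: level = shortest distance from the root (an assumption)."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in levels:
                levels[child] = levels[term] + 1
                queue.append(child)
    return levels


def terms_at_or_below(levels, min_level):
    """Keep terms at least `min_level` steps from the root (more specific terms)."""
    return {term for term, level in levels.items() if level >= min_level}


# Hypothetical toy ontology; "c" has two parents, as GO terms often do.
children = {"root": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": ["e"]}
levels = term_levels(children, "root")
specific = terms_at_or_below(levels, 2)
print(sorted(specific))
```

A fixed level cutoff treats all branches as equally deep, which is one reason a single "level" can be a crude proxy for how informative a term is.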
Aims of Paper
  • {As in BMI 212, a listing of the specific aims of the paper. No more than 3 usually (often less).}
  • {NOTE: the paper should be presented initially in the most positive light, as the authors would have presented it. The time for critique is after the “author perspective” presentation.}
Aims of Paper
  • Define the problem of biased gold standards in high-throughput evaluations.
  • Create a method for comparing prediction methods fairly
  • Build a manual gold standard and associated web tool
  • Allow evaluations to report not only overall performance, but area-specific performance.
Methods Employed
  • {This is the key part of the presentation for BMI crowd. This should be a presentation of the methods described in the paper at sufficient technical level so people can discuss and evaluate it. Avoid detailed math/equations unless absolutely critical to the discussion.}
Methods Employed
  • 6 post-doctoral biologists
  • Examine every GO code and vote on “informative” or “not informative” if applied to a gene
  • 3 “informative” votes = useful category
  • <1 “informative” and >1000 annotations = not useful category
  • “Not usefuls” are key denominator for computations of precision/specificity
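
The voting rule above can be sketched as a small function. The thresholds follow the slide as written; reading "3 votes" as "at least 3", and labeling terms that match neither rule as "unclassified", are assumptions made here for illustration, as are the function and parameter names.

```python
def classify_term(informative_votes, n_annotations,
                  min_useful_votes=3, not_useful_votes_below=1,
                  min_annotations=1000):
    """Classify one GO term from expert votes and its annotation count.

    Thresholds mirror the slide; treating '3 votes' as 'at least 3' and
    returning 'unclassified' for everything else are assumptions.
    """
    if informative_votes >= min_useful_votes:
        return "useful"
    if informative_votes < not_useful_votes_below and n_annotations > min_annotations:
        return "not useful"
    return "unclassified"


# Hypothetical (votes, annotation count) examples.
for votes, count in [(4, 50), (0, 5000), (0, 200), (2, 5000)]:
    print(votes, count, classify_term(votes, count))
```

Keeping the thresholds as parameters makes it easy to check how sensitive the resulting standard is to the cutoffs.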
Results
  • {Recapitulate major results. Usually by presenting main figures from the paper.}
Methods
  • With “gold standard” GO codes that they trust, can now analyze methods/data sources and give specific performance report on different areas (of biology).
  • Can also systematically remove GO topics in order to see if there are dominant effects (e.g. remove ribosomes)
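
A minimal sketch of area-specific reporting and of removing one topic to check for a dominant effect. This is not the authors' tool: the data are hypothetical, and precision is computed here only over predictions that appear in the positive or negative standard, which is an assumption.

```python
def precision(predicted, positives, negatives):
    """TP / (TP + FP), counting only predictions covered by the standard."""
    tp = len(predicted & positives)
    fp = len(predicted & negatives)
    return tp / (tp + fp) if tp + fp else float("nan")


def positives_without(term_pairs, excluded=()):
    """Union of positive pairs over all terms except the excluded ones."""
    kept = set()
    for term, pairs in term_pairs.items():
        if term not in excluded:
            kept |= pairs
    return kept


# Hypothetical toy standard: positive pairs per area, plus negative pairs.
term_pairs = {
    "ribosome": {("r1", "r2"), ("r1", "r3"), ("r2", "r3")},
    "dna_repair": {("d1", "d2")},
}
negatives = {("r1", "d1"), ("r2", "d2")}
predicted = {("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r1", "d1"), ("r2", "d2")}

overall = precision(predicted, positives_without(term_pairs), negatives)
without_ribosome = precision(
    predicted, positives_without(term_pairs, excluded={"ribosome"}), negatives
)
per_area = {t: precision(predicted, p, negatives) for t, p in term_pairs.items()}

print(overall, without_ribosome, per_area)
```

In this toy case every correct prediction comes from the ribosome area, so the overall figure collapses once that area is removed. A per-area breakdown is meant to expose exactly this situation.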
Authors' Conclusions
  • {A presentation of how the authors summarize their results and significance. Usually not more than 3 major points. Often one.}
Authors' Conclusions
  • Curated GO codes now provide more trustworthy gold-standard
  • Allows tools to be built that give
    • Overall performance
    • Subarea-specific breakdown of performance
    • Direct comparison of different methods/data sources
  • Sets the bar on evaluation, and starts a discussion about community-wide standards.
Assessment of Paper: Informatics
  • {What are the major methodological (engineering) innovations in the paper, in your opinion?}
  • {Are the methods presented soundly, completely, and evaluated appropriately?}
  • {How general are the methods presented for use in other areas either directly or with some effort by others?}
Assessment of Paper: Informatics
  • Beautiful description and justification of the work. Clearly a general problem.
  • Well informed by research in the field, and evaluation of problems that arise in eval.
  • Solution applicable in many domains
    • Close (KEGG, NLP, others)
    • Farther (Any large volume prediction activity)
  • Some bias in expert-based gold standards
  • Very good availability of specific tool to allow use (cf. Maureen)
Assessment of paper: Biomedicine
  • {Has the paper helped make a new contribution of biomedical knowledge?}
  • {What is the domain significance of this paper?}
  • {Was it published in the right journal to find the audience who should care about it the most?}
Assessment of paper: Biomedicine
  • Should greatly reduce the noise in papers about high-throughput predictions
  • Should create a new bar for performance
  • Systems biology and interaction informatics workers need to pay attention.
  • Microarray information content may be lower than thought previously on average
  • Genomics audience is a good one, since they need to be aware of these relatively sophisticated informatics issues.
Detailed Concerns
  • {Particularly if you don’t like the paper, what are your technical informatics concerns about the method, implementation or evaluation?}
Detailed Concerns
  • A little confused about negative gold standard and how it is meant to be used. (Email in to Olga…)
  • There are still biases in the gold standard (e.g. GO) by omission that can’t be addressed without more work
  • What is #2 bad GO area after “ribosome”?… that example is used a lot in the paper.
Summary Conclusions
  • {Do you accept all of the authors' conclusions previously presented?}
  • {Modified conclusions that you would accept}
Summary Conclusions
  • Very important paper for evaluation of these methods
    • Now mandatory for papers in future to address these issues.
  • Authors' aims achieved
    • Showed the problem
    • General solution proposed
    • Specific solution built and disseminated
References
  • {This paper, and other related papers that a BMI student studying for quals or otherwise interested could review.}
References
  • Myers CL, Barrett DR, Hibbs MA, Huttenhower C, Troyanskaya OG. Finding function: evaluation methods for functional genomic data. BMC Genomics. 2006 Jul 25;7:187. PMID: 16869964
  • Lin N, Wu B, Jansen R, Gerstein M, Zhao H. Information assessment on predicting protein-protein interactions. BMC Bioinformatics. 2004 Oct 18;5:154. PMID: 15491499
  • Lee SG, Hur JU, Kim YS. A graph-theoretic modeling on GO space for biological interpretation of gene clusters. Bioinformatics. 2004 Feb 12;20(3):381-8. Epub 2004 Jan 22. PMID: 14960465
  • Jansen R, Gerstein M. Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction. Curr Opin Microbiol. 2004 Oct;7(5):535-45. PMID: 15451510
  • Ben-Hur A, Noble WS. Choosing negative examples for the prediction of protein-protein interactions. BMC Bioinformatics. 2006 Mar 20;7 Suppl 1:S2. PMID: 16723005
Acknowledgments
  • {Thanks to those who contributed to preparation of presentation.}
  • {Don’t hesitate to contact authors of paper for clarifications. They are usually flattered that you are looking at their paper.}
Acknowledgments
  • Maureen Hillenmeyer first brought this paper to my attention.
  • Olga provided a few clarifications that I needed after reading the paper.
  • BMI-exec encouraged me to do this as an example for how we would like students to select and present BMI JC papers this year.

Thanks.

{insert your email address}


Thanks.

russ.altman@stanford.edu