
Data, Data Everywhere, But Not a Byte to Eat

Data, Data Everywhere, But Not a Byte to Eat. Michael F. Huerta, Ph.D., Associate Director, National Library of Medicine; Director, Office of Health Information Programs Development. BRDI/NAS, 2/26/13. Biomedical Research Enterprise - Today. Lots and lots of data – in individual labs.

Presentation Transcript


  1. Data, Data Everywhere, But Not a Byte to Eat. Michael F. Huerta, Ph.D., Associate Director, National Library of Medicine; Director, Office of Health Information Programs Development. BRDI/NAS, 2/26/13

  2. Biomedical Research Enterprise - Today • Lots and lots of data – in individual labs

  3. Biomedical Research Enterprise - Today • Lots and lots of data – in individual labs • Few data broadly available to research community • Exceptions: genomic, human subject autism, particular research initiatives (e.g., ADNI, Human Connectome Project)

  4. Biomedical Research Enterprise - Today • Lots and lots of data – in individual labs • Few data broadly available to research community • Exceptions: genomic, human subject autism, particular research initiatives (e.g., ADNI, Human Connectome Project) • For much of biomedical research enterprise • Major public products: concepts in scientific papers, not data • Biomedical research is concept-centric, not data-centric

  5. Biomedical Research Enterprise - Tomorrow • Liberated data - increase data sharing

  6. Biomedical Research Enterprise - Tomorrow • Liberated data - increase data sharing • Advances in relevant data science and data tools

  7. Biomedical Research Enterprise - Tomorrow • Liberated data - increase data sharing • Advances in relevant data science and data tools • Ways to make data • Discoverable • Useful to others • Citable • Linked to scientific literature

  8. Biomedical Research Enterprise - Tomorrow • Liberated data - increase data sharing • Advances in relevant data science and data tools • Ways to make data • Discoverable • Useful to others • Citable • Linked to scientific literature • Greater prominence of data in science & scholarship

  9. Today → Tomorrow

  10. NIH Big Data to Knowledge Initiative for Research Data: Today → Tomorrow

  11. NIH Big Data to Knowledge (BD2K) • Data and Informatics Working Group (DIWG) of the Advisory Committee to the Director of NIH • D DeMets & L Tabak

  12. NIH Big Data to Knowledge (BD2K) • Data and Informatics Working Group (DIWG) of the Advisory Committee to the Director of NIH • D DeMets & L Tabak • Recommendations for NIH Research Data:

  13. NIH Big Data to Knowledge (BD2K) • Data and Informatics Working Group (DIWG) of the Advisory Committee to the Director of NIH • D DeMets & L Tabak • Recommendations for NIH Research Data: • Sharing & Standards • Tools • Workforce

  14. NIH Big Data to Knowledge (BD2K) • Data and Informatics Working Group (DIWG) of the Advisory Committee to the Director of NIH • D DeMets & L Tabak • Recommendations for NIH Research Data: • Sharing & Standards • Tools • Workforce • Implementation Groups • Eric Green (Acting Assoc Dir of NIH for Data Science) • Sharing & Standards: M Huerta & J Larkin • Tools (Software Development): V Bonazzi & J Couch • Tools+ (Centers): L Brooks, M Huerta, P Lyster & B Seto • Workforce: M Dunn

  15. Sharing & Standards • Policies to increase data sharing and change the culture • Changes will liberate data

  16. Sharing & Standards • Policies to increase data sharing and change the culture • Changes will liberate data • Frameworks for community-based standards efforts • Standards make data useful • Community-base promotes their use

  17. Sharing & Standards • Policies to increase data sharing and change the culture • Changes will liberate data • Frameworks for community-based standards efforts • Standards make data useful • Community-base promotes their use • Catalog of data set information → research ecosystem • Discoverable, citable, and linked to the literature

  18. Sharing & Standards • Policies to increase data sharing and change the culture • Changes will liberate data • Frameworks for community-based standards efforts • Standards make data useful • Community-base promotes their use • Catalog of data set information → research ecosystem • Discoverable, citable, and linked to the literature Each adds value

  19. Sharing & Standards • Policies to increase data sharing and change the culture • Changes will liberate data • Frameworks for community-based standards efforts • Standards make data useful • Community-base promotes their use • Catalog of data set information → research ecosystem • Discoverable, citable, and linked to the literature Each adds value Synergy together

  20. NIH Data Catalog: A Use Case

  21. NIH Data Catalog: A Use Case An NIH-funded investigator

  22. NIH Data Catalog Just before submitting a scientific paper to a journal, investigator uploads minimal info about the data set to the NIH Data Catalog

  23. NIH Data Catalog Minimal info includes: -Authors → proper credit for data -Data descriptors (controlled) → efficient search -Data locus, availability & way to access → sharing

  24. NIH Data Catalog Minimal info includes: -Authors → proper credit for data -Data descriptors (controlled) → efficient search -Data locus, availability & way to access → sharing Upload generates: -Data publication citation -Data Unique IDentifier (DUID)

  25. NIH Data Catalog Data Unique IDentifier (DUID) is sent to the investigator

  26. NIH Data Catalog Investigator submits manuscript to the scientific journal - with DUID in abstract & data publication cited in manuscript

  27. NIH Data Catalog Journal paper is published & indexed in PubMed

  28. NIH Data Catalog PubMed pulls DUID from abstract as a separate data element in the PubMed citation

  29. NIH Data Catalog Data publication is also sent to PubMed for indexing

  30. NIH Data Catalog PubMed also pulls DUID from data publication as an element of PubMed citation

  31. NIH Data Catalog DUID now in PubMed citations of both the scientific publication & the data publication, forming a 2-way link

  32. NIH Data Catalog PubMed uses same data descriptors as data publication for indexing data publication

  33. NIH Data Catalog Use of same controlled terms in catalog and PubMed provides discoverability of info about data sets

  34. NIH Data Catalog DUIDs, citations of data publications & scientific publications can be used in NIH administrative systems
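The use case in slides 22-31 (upload minimal info, receive a DUID and a data publication citation, then let the DUID tie the paper and the data publication together) can be sketched roughly as follows. This is an illustrative toy, not the NIH design: the `DataCatalog` class, the `DUID-000001` identifier format, and the field names are all assumptions, since the slides do not specify any of them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class DataSetInfo:
    """Minimal info the investigator uploads (field names are hypothetical)."""
    authors: list[str]      # for proper credit
    descriptors: list[str]  # controlled terms, for efficient search
    locus: str              # where the data live
    access: str             # availability and way to access


class DataCatalog:
    """Toy stand-in for the proposed catalog; not a real NIH API."""

    def __init__(self) -> None:
        self._records: dict[str, DataSetInfo] = {}

    def upload(self, info: DataSetInfo) -> tuple[str, str]:
        # Hypothetical DUID format: the slides do not define one.
        duid = f"DUID-{len(self._records) + 1:06d}"
        self._records[duid] = info
        citation = f"{'; '.join(info.authors)}. Data set {duid}. NIH Data Catalog."
        return duid, citation


# Matches the hypothetical DUID format used above.
DUID_PATTERN = re.compile(r"DUID-\d{6}")


def extract_duid(text: str) -> str | None:
    """Pull the first DUID out of an abstract or a data publication citation."""
    match = DUID_PATTERN.search(text)
    return match.group(0) if match else None


def link_citations(citations: dict[str, str]) -> dict[str, list[str]]:
    """Map each DUID to the IDs of all citations whose text contains it."""
    links: dict[str, list[str]] = {}
    for citation_id, text in citations.items():
        duid = extract_duid(text)
        if duid is not None:
            links.setdefault(duid, []).append(citation_id)
    return links
```

Because both the paper's abstract and the data publication carry the same DUID, grouping citations by DUID gives the two-way link: from either citation, the other is one lookup away.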

  35. Bringing Data into the Research Ecosystem

  36. Bringing Data into the Research Ecosystem • Data more available (policies) & useful (standards)

  37. Bringing Data into the Research Ecosystem • Data more available (policies) & useful (standards) • Data sets are discoverable: • Same descriptors of data sets used in data catalog are used as index and search terms in PubMed

  38. Bringing Data into the Research Ecosystem • Data more available (policies) & useful (standards) • Data sets are discoverable: • Same descriptors of data sets used in data catalog are used as index and search terms in PubMed • Data sets are citable: • NIH Data Catalog produces citable data publications • Citability + proper credit → incentives related to data

  39. Bringing Data into the Research Ecosystem • Data more available (policies) & useful (standards) • Data sets are discoverable: • Same descriptors of data sets used in data catalog are used as index and search terms in PubMed • Data sets are citable: • NIH Data Catalog produces citable data publications • Citability + proper credit → incentives related to data • Data sets are linked with the literature • Common search & retrieval approach for scientific publications and data publications through PubMed • Use of DUID for direct, two-way linkage

  40. Bringing Data into the Research Ecosystem • Data more available (policies) & useful (standards) • Data sets are discoverable: • Same descriptors of data sets used in data catalog are used as index and search terms in PubMed • Data sets are citable: • NIH Data Catalog produces citable data publications • Citability + proper credit → incentives related to data • Data sets are linked with the literature • Common search & retrieval approach for scientific publications and data publications through PubMed • Use of DUID for direct, two-way linkage • Information in ecosystem - use by NIH and 3rd parties • Trend analysis, etc.
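The discoverability point above (the same controlled descriptors index both scientific publications and data publications) can be illustrated with a small inverted index. This is only a sketch under assumptions: the record IDs, the example terms, and the `build_term_index` and `search` helpers are hypothetical and do not reflect how PubMed indexing is actually implemented.

```python
from collections import defaultdict


def build_term_index(records):
    """Build an inverted index: controlled term -> set of record IDs.

    `records` maps a record ID (paper or data publication) to its list of
    controlled descriptors. Terms are lowercased so lookups ignore case.
    """
    index = defaultdict(set)
    for record_id, terms in records.items():
        for term in terms:
            index[term.lower()].add(record_id)
    return index


def search(index, *terms):
    """Return the record IDs tagged with every one of the given terms."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()


# Hypothetical records: one paper and its data publication share descriptors.
records = {
    "paper:1": ["Autism", "MRI"],
    "data:1": ["Autism", "MRI"],
    "paper:2": ["Genomics"],
}
index = build_term_index(records)
hits = search(index, "autism", "mri")
```

Since both kinds of record are indexed with the same vocabulary, a single query returns the paper and the data set together, which is the common search and retrieval approach the slides describe.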
