
Towards Data Attribution & Citation in the Life Sciences. Philip E. Bourne, UCSD, pbourne@ucsd.edu






Presentation Transcript


  1. Towards Data Attribution & Citation in the Life Sciences. Philip E. Bourne, UCSD, pbourne@ucsd.edu. Data Attribution and Citation

  2. Life Science Data Repositories
     • NLM is the elephant in the room .. However ..
     • There are thousands of community-maintained efforts – all want an NAR publication
     • The ability to cite and attribute the data is highly variable:
         • DOIs are assigned in some cases, but not used
         • Attribution is through the metadata in most cases
         • Citation is typically by the associated literature reference, if it exists, and/or a database identifier
     • The use of data repositories such as Dryad is compelling for the long tail problem
     • Data journals are on the horizon

  3. Consider the PDB as a Use Case
     • Oldest data resource in biology?
     • A resource used by ~200,000 individuals per month, including an increasing number of school kids!
     • A resource distributing worldwide the equivalent of ¼ of the Library of Congress each month
     • A bicoastal/worldwide resource
     • 1 TB

  4. PDB Typical Growth Curve – But the Complexity!
     [Figure: number of released entries by year.]

  5. The number of visits and page views is growing faster than the number of unique visitors. People are doing more with the data.

  6. The Data May Save Lives? *
     [Figure: Structure Summary page activity for H1N1 influenza-related structures, January 2008 to July 2010. Structures shown: 3B7E (neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir) and 1RUZ (1918 H1 hemagglutinin).]
     * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm

  7. PDB Data Attribution and Citation
     • About 25% of our budget has been spent on data remediation – multiple versions are supported – the copy of record (as defined by the publication) is always available
     • Can't publish unless data are deposited – motivated by the community – very good data-to-publication correspondence
     • Data objects are discrete and we assign DOIs – but they are not used – database identifiers are preferred
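To make the DOI versus database identifier point concrete, here is a minimal sketch of turning a four-character PDB identifier into a citable DOI. It assumes the DOI pattern `10.2210/pdb<id>/pdb` with a lowercase identifier; the function names `pdb_doi` and `pdb_doi_url` are hypothetical and not part of any PDB software.

```python
import re

# Assumption: a PDB ID is one digit followed by three alphanumeric characters.
PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def pdb_doi(pdb_id: str) -> str:
    """Return the DOI for a PDB entry, assuming the 10.2210/pdb<id>/pdb pattern."""
    if not PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"10.2210/pdb{pdb_id.lower()}/pdb"


def pdb_doi_url(pdb_id: str) -> str:
    """Return a resolvable link for the DOI via the doi.org resolver."""
    return "https://doi.org/" + pdb_doi(pdb_id)


if __name__ == "__main__":
    # The two H1N1-related entries named on slide 6.
    for entry in ("3B7E", "1RUZ"):
        print(entry, "->", pdb_doi_url(entry))
```

Because the mapping is purely mechanical under this assumption, a citation tool could accept whichever form an author supplies and emit both.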

  8. Ah yes .. But the CD4 Story…

  9. Literature/Data Integration: The Knowledge and Data Cycle
     0. Full text of PLoS papers stored in a database
     1. A link brings up figures from the paper
     2. Clicking the paper figure retrieves data from the PDB, which is analyzed
     3. A composite view of journal and database content results
     4. The composite view has links to pertinent blocks of literature text and back to the PDB
     Diagram labels: User clicks on content • Metadata and web services to data provide an interactive view that can be annotated • Selecting features provides a data/knowledge mashup • Analysis leads to new content I can share
     PLoS Comp. Biol. 2005 1(3) e34

  10. Example of Interoperability: The Database View www.rcsb.org/pdb/explore/literature.do?structureId=1TIM BMC Bioinformatics 2010 11:220
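The database view above is addressed by a single query parameter, `structureId`. As a small sketch, the link for any entry could be built as follows; the `http://` scheme and the helper name `literature_view_url` are assumptions, and the endpoint is the one shown on the slide, which may no longer resolve.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; the scheme is an assumption.
BASE = "http://www.rcsb.org/pdb/explore/literature.do"


def literature_view_url(structure_id: str) -> str:
    """Build the literature-view link for a PDB structure ID."""
    query = urlencode({"structureId": structure_id.upper()})
    return f"{BASE}?{query}"


if __name__ == "__main__":
    print(literature_view_url("1tim"))
```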

  11. Example of Interoperability – The Literature View From Anita de Waard, Elsevier

  12. Acknowledgements. Funding agencies: NSF, NIGMS, DOE, NLM, NCI, NCRR, NIBIB, NINDS, NIDDK
