NIF Resource Curation and Automated Resource Discovery?


Presentation Transcript


  1. NIF Resource Curation and Automated Resource Discovery?

  2. NIF Resources • NIF is cataloging websites that house information about databases, atlases, software tools, data, transgenic mice and other things that we consider of value to the neuroscience community.

  3. Definition of Resource • Individual resource boundary: a website shall be considered an individual resource if it is maintained by a single entity and consists of one or more individual web pages that are related by a theme and HTML links.

  4. [Workflow diagram; labels only] Resource Nomination Registry (4500) • User Feedback • *Automated tools • *Automated updates • Level 2 tools • Web Crawl • Nomination • NIF Web (499,952) • Level 2/3 (29) • Check: links, annotation, vocabulary • Registry Subset • Public Registry (2300) • *In development

  5. Resource is Nominated (by NIF staff, contact at meetings, or web form) • In NIF already? • Decision: should it be included? • If not: do not include, keep record • If so, assign metadata: short name, long name, URL; description (short description of 1-3 sentences, longer description); parent organization (physical location, university); support (grant numbers); keywords (species, technique, structure, age, level, disease, topic) • Assign resource type
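The metadata assigned in this step could be captured in a small record type. The following is a minimal sketch, not NIF's actual schema: the class name `ResourceRecord`, the function `missing_fields`, and the choice of which fields count as required are all assumptions made for illustration. Only the field list and the keyword facets come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Keyword facets listed on the slide.
KEYWORD_FACETS = ("species", "technique", "structure", "age",
                  "level", "disease", "topic")


@dataclass
class ResourceRecord:
    """Hypothetical container for the metadata named on the slide."""
    short_name: str
    long_name: str
    url: str
    short_description: str            # 1-3 sentences
    long_description: str = ""
    parent_organization: str = ""     # physical location, university
    support: List[str] = field(default_factory=list)   # grant numbers
    keywords: Dict[str, List[str]] = field(default_factory=dict)
    resource_type: str = ""


def missing_fields(record: ResourceRecord) -> List[str]:
    """List problems a curator would still need to resolve.

    Which fields are mandatory is an assumption of this sketch.
    """
    problems = []
    for name in ("short_name", "long_name", "url",
                 "short_description", "resource_type"):
        if not getattr(record, name).strip():
            problems.append(name)
    # Flag keyword facets that are not among those on the slide.
    for facet in sorted(set(record.keywords) - set(KEYWORD_FACETS)):
        problems.append(f"unknown keyword facet: {facet}")
    return problems
```

A record with no resource type yet would be reported as incomplete, which mirrors the final "Assign resource type" step.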

  6. Resources Difficult to Categorize • Link aggregates • Large organizations (NIH) • Poorly documented databases • Private data sites • Clinical trials that are still recruiting • Experimental protocol • Commercial entities • Journals • JoVE • Supplemental materials

  7. CINdy the resource curation tool

  8. Resource Ontology (BRO) • Data Resource: provides access to data; database, atlas, book • Software Resource: software programs or source code • Material Resource: reagents, tissue samples or organisms • Funding Resource: grants or contracts • Training Resource: educational materials, training programs • Job Resource: employment opportunities • People Resource: access to individual people’s web sites

  9. NIF Service vs BRO Service

  10. Solutions: Consolidating Classes • Synonyms where appropriate, e.g. Material storage service vs. Material storage repository • Temporary mapping, where appropriate • *Deprecated terms must be maintained* • Data loss • Moving forward with a joint descriptive terminology!
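One way to keep deprecated terms while consolidating classes is a lookup table that maps old or synonymous labels to the preferred one. This is a sketch under assumptions: the table `TERM_MAP`, the function `resolve_term`, and the direction of the mapping (service to repository) are illustrative and not taken from either ontology.

```python
# Hypothetical mapping from synonymous or deprecated labels to the
# preferred label. The direction of the mapping is an assumption.
TERM_MAP = {
    "material storage service": "material storage repository",
}


def resolve_term(term: str, mapping: dict) -> str:
    """Follow the mapping until a label with no further mapping is reached.

    Old labels stay in the table, so records annotated with a
    deprecated term can still be resolved.
    """
    current = term.strip().lower()
    seen = set()
    while current in mapping:
        if current in seen:
            raise ValueError(f"circular mapping involving {current!r}")
        seen.add(current)
        current = mapping[current]
    return current
```

Because the old label is never deleted, a temporary mapping can later be redirected without touching the annotated records.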

  11. Evolution of the NIF Resource Ontology

  12. Resource Boundary? • Software library • Software tool • Plugin: I2B2 • Our solution: use the URL as a uniqueness qualifier • Our problem: a single URL may house several resources • Individual plugins can have individual URLs
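The URL-as-uniqueness-qualifier idea, and the problem it runs into, can be sketched as follows. The normalization rules here (lowercasing the host, dropping the trailing slash and the fragment) and the function names are assumptions for illustration, not a description of how the NIF registry compares URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce trivially different spellings of a URL to one form."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    # Lowercase scheme and host, keep the query, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, ""))


def shared_urls(records):
    """Return {normalized URL: [names]} for URLs used by more than one record.

    `records` is an iterable of (name, url) pairs.
    """
    by_url = defaultdict(list)
    for name, url in records:
        by_url[normalize_url(url)].append(name)
    return {url: names for url, names in by_url.items() if len(names) > 1}
```

Entries returned by `shared_urls` are exactly the cases the slide flags: several resources housed under a single URL, which a human curator would then need to separate or merge.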

  13. Boundary cont. • Individual resource boundary: a website shall be considered an individual resource if it is maintained by a single entity and consists of one or more individual web pages that are related by a theme and HTML links. • Solution to random boundary problem: Human Curator

  14. Issues of Scope • Single line or short paragraph + keywords • Resource discovery problem: the Stanford ontologies description is very short (as are many), so finding this resource by keyword will be difficult unless we index the content of the website • Data dump • Small vs. large databases • Updates

  15. Internal referencing • Stanford example: License: “same as bioportal” – does not match any license type in any list • Problem: nonstandard terminology and references to another project (no URL), which can create loops • Also true in publications, e.g. “used the same protocol as paper X”, which in turn used the same protocol as paper Y • Automated text mining tools have a hard time recognizing these
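The "same as X" chains described here can be followed mechanically once each reference has been identified, which is the part the slide says automated tools struggle with. The sketch below assumes that step is already done and only shows the chain-following and loop detection. The function name and the two dictionaries are hypothetical.

```python
def resolve_reference(start, refers_to, values):
    """Follow "same as ..." references until a concrete value is found.

    refers_to: {item: item it points to}
    values:    {item: concrete value, e.g. a license or protocol text}
    Returns (value or None, list of items visited). The value is None
    when the chain loops or ends without a concrete value.
    """
    visited = []
    current = start
    while current not in values:
        if current in visited:
            return None, visited + [current]   # loop detected
        visited.append(current)
        nxt = refers_to.get(current)
        if nxt is None:
            return None, visited               # dead end, e.g. no URL given
        current = nxt
    return values[current], visited + [current]
```

For the publication example, paper X pointing to paper Y pointing to a paper with a full protocol resolves in two hops, while two papers citing each other return `None` together with the visited path.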

  16. What can we gain from automated systems? • Basic information: Name, url, contact info • Some keywords • Some descriptive text • No resource boundary • No resource description
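As a rough illustration of the "basic information" an automated system can pull from a page, the sketch below reads the title, meta description, and meta keywords from an HTML string using Python's standard `html.parser`. It is an assumption-laden example, not the crawler used by NIF, and it shows the limitation noted on the slide: nothing here says where one resource ends and the next begins.

```python
from html.parser import HTMLParser


class BasicInfoParser(HTMLParser):
    """Collect <title> text and description/keywords <meta> tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.info = {"name": "", "description": "", "keywords": []}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.info["description"] = content.strip()
            elif name == "keywords":
                self.info["keywords"] = [
                    k.strip() for k in content.split(",") if k.strip()
                ]

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.info["name"] += data.strip()


def extract_basic_info(html: str) -> dict:
    """Return name, description, and keywords found in the page head."""
    parser = BasicInfoParser()
    parser.feed(html)
    return parser.info
```

Pages without a meta description or keywords simply yield empty fields, which matches the "some keywords, some descriptive text" caveat.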

  17. How do we help the computers? • Common naming project (Neurocommons) http://sharedname.org/page/Main_Page • Automated URIs • Community building: • Shared data models • Shared ontology • RDF entity tags? (mouse vs mouse)
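The "mouse vs mouse" point is that the same string can name different things, and a shared identifier removes the ambiguity. A minimal sketch follows, assuming the two senses are the animal and the computer device. The `example.org` URIs and the function name are placeholders, not identifiers from any real naming project.

```python
# Placeholder URIs: the same label resolves to different entities
# depending on the entity type attached to it.
ENTITY_URIS = {
    ("mouse", "organism"): "http://example.org/entity/organism/mouse",
    ("mouse", "device"): "http://example.org/entity/device/mouse",
}


def tag_entity(label: str, entity_type: str):
    """Return the URI for a (label, type) pair, or None if unknown."""
    return ENTITY_URIS.get((label.strip().lower(), entity_type))
```

With tags like these, two records that both say "mouse" can be compared by URI instead of by string.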
