
D4: SKOS and HIVE—Enhancing the Creation, Design and Flow of Information

D4: SKOS and HIVE—Enhancing the Creation, Design and Flow of Information. Speakers: Hollie White Jane Greenberg Coordinator: Alan Keely. Overview. HIVE—Helping Interdisciplinary Vocabulary Engineering Motivation—Dryad repository HIVE—Goals, status, and design A scenario


Presentation Transcript


  1. D4: SKOS and HIVE—Enhancing the Creation, Design and Flow of Information • Speakers: Hollie White, Jane Greenberg • Coordinator: Alan Keely

  2. Overview • HIVE—Helping Interdisciplinary Vocabulary Engineering • Motivation—Dryad repository • HIVE—Goals, status, and design • A scenario • HIVE for Law Library, repositories, etc. • Challenges • Technical and social • Conclusion and questions

  3. HIVE model • <AMG> approach for integrating discipline CVs • Model addressing CV cost, interoperability, and usability constraints (interdisciplinary environment) 15/09/2014

  4. Motivation

  5. • Survey of 400 evolutionary biologists: 48% based on other data; 78% data not deposited • Evolutionary biologists use published data more frequently than they deposit it themselves! • Ecology • Paleontology • Physiology • Systematics • Genomics • Population genetics…

  6. Partner Journals • American Society of Naturalists • American Naturalist • Ecological Society of America • Ecology, Ecological Letters, Ecological Monographs, etc. • European Society for Evolutionary Biology • Journal of Evolutionary Biology • Society for Integrative and Comparative Biology • Integrative and Comparative Biology • Society for Molecular Biology and Evolution • Molecular Biology and Evolution • Society for the Study of Evolution • Evolution • Society for Systematic Biology • Systematic Biology • Commercial journals • Molecular Ecology • Molecular Phylogenetics and Evolution

  7. Vocabulary needs for Dryad • Vocabulary analysis • 600 keywords, Dryad partner journals • Vocabularies: NBII Thesaurus, LCSH, the Getty’s TGN, ERIC Thesaurus, Gene Ontology, ITIS (10 vocabularies) • Facets: taxon, geographic name, time period, topic, research method, genotype, phenotype… • Results • 431 topical terms, exact matches • NBII Thesaurus, 25%; MeSH, 18% • 531 terms (research method and taxon) • LCSH, 22% found exact matches, 25% partial • Conclusion: Need multiple vocabularies
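
The exact and partial match rates reported on this slide come from comparing journal keywords against vocabulary labels. The following is a minimal Python sketch of that kind of comparison, not the study's actual method. The sample keywords and labels are made up for illustration, and "partial match" is assumed here to mean that a keyword shares at least one word with some label; the slides do not define it.

```python
def normalize(term):
    """Lowercase and collapse whitespace so comparisons ignore case and spacing."""
    return " ".join(term.lower().split())


def classify(keyword, labels):
    """Return 'exact', 'partial', or 'none' for one keyword against a label set.

    Assumption: 'partial' means the keyword shares at least one word with a label.
    """
    key = normalize(keyword)
    normalized = {normalize(label) for label in labels}
    if key in normalized:
        return "exact"
    key_words = set(key.split())
    if any(key_words & set(label.split()) for label in normalized):
        return "partial"
    return "none"


def match_rates(keywords, labels):
    """Return the share of keywords in each category as a fraction of the total."""
    counts = {"exact": 0, "partial": 0, "none": 0}
    for keyword in keywords:
        counts[classify(keyword, labels)] += 1
    total = len(keywords) or 1  # avoid dividing by zero on an empty keyword list
    return {kind: count / total for kind, count in counts.items()}


# Hypothetical sample data, for illustration only.
vocabulary = ["Wood pulp", "Population genetics", "Sawdust"]
keywords = ["population genetics", "wood density", "phylogeny", "sawdust"]
rates = match_rates(keywords, vocabulary)
print(rates)
```

Running the same function once per vocabulary would give one set of rates per vocabulary, which is the shape of the per-vocabulary percentages listed on the slide.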

  8. Goals, status, and design

  9. HIVE...as a solution Address CV (controlled vocabulary) cost, interoperability, and usability constraints • COST: Expensive to create, maintain, and use • INTEROPERABILITY: Developed in silos (structurally and intellectually) • USABILITY: Interface design and functionality limitations have been well documented

  10. Relevance to the law library community? • Orphaned data (more of a Dryad issue) • More important, interdisciplinary needs • COST (create, maintain, and use) • INTEROPERABILITY • USABILITY

  11. HIVE Goals • Automatic metadata generation approach that dynamically integrates discipline-specific controlled vocabularies encoded with the Simple Knowledge Organization System (SKOS) • Provide efficient, affordable, interoperable, and user-friendly access to multiple vocabularies during metadata creation activities • A model that can be replicated -> model and service Three phases of HIVE: 1. Building HIVE - Vocabulary development - Server preparation - Primate Life Histories Working Group - Wood Anatomy and Wood Density Working Group 2. Sharing HIVE - Continuing education (empowering information professionals) 3. Evaluating HIVE - Examining HIVE in Dryad

  12. HIVE Partners Vocabulary Partners • Library of Congress: LCSH • The Getty Research Institute (GRI): TGN (Thesaurus of Geographic Names) • United States Geological Survey (USGS): NBII Thesaurus • Agrovoc Thesaurus Advisory Board • Jim Balhoff, NESCent • Libby Dechman, LCSH • Mike Frame, USGS • Alistair Miles, Ok • William Moen, University of North Texas • Eva Méndez Rodríguez, University Carlos III of Madrid • Joseph Shubitowski, Getty Research Institute • Ed Summers, LCSH • Barbara Tillett, Library of Congress • Kathy Wisser, Simmons • Lisa Zolly, USGS Workshop Hosts • Columbia Univ. • Univ. of California, San Diego • Univ. of North Texas • Universidad Carlos III de Madrid, Madrid, Spain

  13. HIVE Construction • HIVE stores millions of concepts from different vocabularies and makes them available on the Web over HTTP • Vocabularies are imported into HIVE in SKOS/RDF format • HIVE is divided into two modules: • HIVE Core • SKOS/RDF storage and management (SESAME/Elmo) • SMART HIVE: Automatic Metadata Extraction and Topic Detection (KEA++ and MAUI) • Concept Retrieval (Lucene and MG4J) • HIVE Web • Web user interface (Google Web Toolkit, GWT) • Machine-oriented interface (SOAP and REST)

  14. SKOS
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:skos="http://www.w3.org/2004/02/skos/core#">
        <rdf:Description rdf:about="http://thesaurus.nbii.gov/nbii#Wood-pulp">
          <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
          <skos:prefLabel>Wood pulp</skos:prefLabel>
          <skos:altLabel>Pulp (wood)</skos:altLabel>
          <skos:broader rdf:resource="http://thesaurus.nbii.gov/nbii#Wood"/>
          <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper"/>
          <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper-industry-wastes"/>
          <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Pulp-mills"/>
          <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Sawdust"/>
          <skos:inScheme rdf:resource="http://thesaurus.nbii.gov/nbii#"/>
          <skos:scopeNote>LSC Life Sciences</skos:scopeNote>
        </rdf:Description>
      </rdf:RDF>
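
To show how a SKOS record like the one on this slide can be read by a program, here is a minimal Python sketch using only the standard library. It is an illustration, not HIVE's own import code (the slides say HIVE uses SESAME/Elmo for storage). It assumes the standard RDF and SKOS namespace declarations are present on the root element and that each concept is written as an `rdf:Description` element; the function name `read_concepts` and the shortened sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Shortened copy of the Wood pulp record, with namespaces declared.
SKOS_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://thesaurus.nbii.gov/nbii#Wood-pulp">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <skos:prefLabel>Wood pulp</skos:prefLabel>
    <skos:altLabel>Pulp (wood)</skos:altLabel>
    <skos:broader rdf:resource="http://thesaurus.nbii.gov/nbii#Wood"/>
    <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Paper"/>
    <skos:related rdf:resource="http://thesaurus.nbii.gov/nbii#Sawdust"/>
  </rdf:Description>
</rdf:RDF>"""


def read_concepts(xml_text):
    """Return one dict per rdf:Description with its labels and linked concept URIs."""
    root = ET.fromstring(xml_text)
    concepts = []
    for node in root.findall(f"{{{RDF}}}Description"):

        def texts(tag):
            # Literal values such as prefLabel and altLabel.
            return [el.text for el in node.findall(f"{{{SKOS}}}{tag}")]

        def links(tag):
            # URIs referenced through rdf:resource, such as broader and related.
            return [el.get(f"{{{RDF}}}resource") for el in node.findall(f"{{{SKOS}}}{tag}")]

        pref = texts("prefLabel")
        concepts.append({
            "uri": node.get(f"{{{RDF}}}about"),
            "prefLabel": pref[0] if pref else None,
            "altLabels": texts("altLabel"),
            "broader": links("broader"),
            "related": links("related"),
        })
    return concepts


concept = read_concepts(SKOS_XML)[0]
print(concept["prefLabel"], "is narrower than", concept["broader"])
```

A plain XML parser is enough for this fixed layout, but RDF/XML allows several equivalent serializations of the same graph, so a dedicated RDF library would be the safer choice for real vocabulary files.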

  15. A scenario

  16. Meet Amy • Amy Zanne is a botanist. • Like every good scientist, she publishes.

  17. Meet Amy • Amy Zanne is a botanist. • Like every good scientist, she publishes. • She deposits data in Dryad.

  18. Law library/data repositories • http://www.law.harvard.edu/library/research/databases/major.html • http://www.digitalcurrent.com/legal_webhosting.aspx

  19. Challenges • Building vs. doing/analysis • Source for HIVE generation, beyond abstracts • Combining many vocabularies during the indexing/term-matching phase is difficult, time-consuming, and inefficient • NLP and machine learning offer promise • Interoperability = dumbing down ontologies • Proof of concept: illustrate the differences between HIVE and other vocabulary registries (NCBO and OBO Foundry) • General large-team logistics, and having people from multiple disciplines (also the ++)

  20. Conclusion • Vocabularies will enrich Dryad data description, and assist with access, use, reuse, etc. • Nothing novel, but infrastructure is supportive, finally… • Dryad and HIVE are real-world applications using Semantic Web technology Links • HIVE: http://ils.unc.edu/mrc/hive/ • Metadata Research Center <MRC>: http://www.ils.unc.edu/mrc/ • Dryad: http://datadryad.org/ • National Evolutionary Synthesis Center (NESCent): http://www.nescent.org/index.php
