
Goals of the Infrastructure Team

Patrick Leary


Presentation Transcript


  1. Goals of the Infrastructure Team (Patrick Leary)

  2. Core EOL Activities
     • Identify sources of biological content
     • Acquire content or metadata about content
     • Create a central biological index
     • Generate web pages aggregating content related to a particular taxon
     • Provide users with content search and retrieval

  3. Core Infrastructure Needs
     • Mobilizing content
     • Develop or improve standards
     • Create ‘connectors’ to map different data models
     • Tools for easy content provider initialization
     • Software to ingest content into EOL databases
     • Name-finding tools to identify relevant BHL pages
     • Tap into existing content stores (Flickr, Wikimedia)
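The name-finding need on slide 3 can be illustrated with a minimal sketch. It assumes a simple approach that the slides do not describe: pick out "Genus epithet" patterns with a regular expression, then keep only candidates present in a set of known names. The function `find_names` and the sample text are hypothetical and are not EOL or BHL tooling.

```python
import re

# Candidate binomials: a capitalised word followed by a lowercase word
# of at least three letters. This pattern is deliberately naive.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def find_names(text, known_names):
    """Return known scientific names found in text, in order of first appearance."""
    found = []
    for match in BINOMIAL.finditer(text):
        candidate = f"{match.group(1)} {match.group(2)}"
        # Checking against the index filters out ordinary phrases
        # such as "The bluefish".
        if candidate in known_names and candidate not in found:
            found.append(candidate)
    return found


# Hypothetical page text and name index, for illustration only.
page_text = (
    "The bluefish, Pomatomus saltatrix, was first described "
    "as Gasterosteus saltatrix."
)
index = {"Pomatomus saltatrix", "Gasterosteus saltatrix"}
print(find_names(page_text, index))
```

A real name finder would also have to cope with OCR errors, abbreviated genera, and names split across lines, none of which this sketch attempts.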

  4. [Diagram] Labels, in slide order: Content Partners (e.g. FishBase, Tree of Life, LifeDesks, …); APIs, Excel, Export; XML Schema; Processing; EOL Databases; Content Cache; Rails Models and Controllers; APIs
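The idea of a 'connector' that maps a partner's data model onto a shared one can be sketched as follows. Everything here is an assumption made for illustration: the `CommonRecord` fields, the partner keys `partner_a` and `partner_b`, and their field names are hypothetical and do not reflect the actual EOL schema or any real partner export.

```python
from dataclasses import dataclass


@dataclass
class CommonRecord:
    """Hypothetical shared record shape used after ingestion."""
    scientific_name: str
    description: str
    source: str


# One field map per partner: common field -> partner-specific field name.
FIELD_MAPS = {
    "partner_a": {"scientific_name": "SciName", "description": "Summary"},
    "partner_b": {"scientific_name": "taxon", "description": "text"},
}


def to_common(source, raw):
    """Translate one partner record (a dict) into the common shape."""
    mapping = FIELD_MAPS[source]
    return CommonRecord(
        scientific_name=raw[mapping["scientific_name"]].strip(),
        description=raw[mapping["description"]].strip(),
        source=source,
    )


record = to_common("partner_a", {"SciName": " Pomatomus saltatrix ", "Summary": "Bluefish"})
print(record)
```

Keeping the per-partner differences in a data table rather than in code means a new partner needs only a new entry in `FIELD_MAPS`, which is one way to lower the cost of onboarding contributors.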

  5. Content Partner Registry
     • Maintain and improve registry
     • Ensure regularly-scheduled harvests
     • Work with Species Pages Group

  6. Biological Index
     • Evolving names-based infrastructure
     • Proper handling of names maximizes the quality of the index
     • Names and hierarchies are the cornerstone of this biological index
     • Addressing the problem of scale
       • 12 million names
       • 9.5 million taxonomic assertions
       • 1 million published data objects
       • 18 million species references in BHL
       • 20 million verified out links

  7. Names Impede Retrieval

  8. Making Order Of The Mess
     • Group related names
       • Lexical Groups
         • Pomatomus saltatrix
         • Pomatomus saltatrix (Linnaeus, 1766)
         • Pomatomus saltator (Linnaeus, 1766)
         • Pomatomus saltratrix
       • Nomenclatural Groups
         • Gasterosteus saltatrix Linnaeus, 1766
         • Temnodon saltator (Linnaeus, 1766)
         • Pomatomus saltatrix (Linnaeus, 1766)
       • Common Names
         • Bluefish (Pomatomus saltatrix)
         • Skipjack (Pomatomus saltator (Linnaeus, 1766))
         • Çinekopbalığı (Gasterosteus saltatrix Linnaeus, 1766)
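The lexical grouping on slide 8 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method EOL uses: authorship is dropped by keeping only the first two words, and spelling variants are matched with the standard-library `difflib.SequenceMatcher` at an arbitrary threshold of 0.8. The function names `canonical` and `lexical_groups` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def canonical(name):
    """Reduce a name string to lowercase 'genus epithet', dropping authorship."""
    # Remove parenthesised authorship such as "(Linnaeus, 1766)".
    name = re.sub(r"\(.*?\)", " ", name)
    # Keep the first two tokens; any trailing author and year are discarded.
    return " ".join(name.split()[:2]).lower()


def lexical_groups(names, threshold=0.8):
    """Greedily group names whose canonical forms are similar strings."""
    groups = []  # list of (representative canonical form, member names)
    for name in names:
        key = canonical(name)
        for representative, members in groups:
            if SequenceMatcher(None, representative, key).ratio() >= threshold:
                members.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append((key, [name]))
    return [members for _, members in groups]


names = [
    "Pomatomus saltatrix",
    "Pomatomus saltatrix (Linnaeus, 1766)",
    "Pomatomus saltator (Linnaeus, 1766)",
    "Pomatomus saltratrix",
    "Gasterosteus saltatrix Linnaeus, 1766",
]
for group in lexical_groups(names):
    print(group)
```

With these inputs the four Pomatomus spellings fall into one group and the Gasterosteus name into another. That separation shows why nomenclatural groups are a distinct step: linking Gasterosteus saltatrix to Pomatomus saltatrix requires synonymy information that string similarity alone cannot supply.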

  9. Multiple Hierarchies

  10. Multiple Hierarchies [diagram; surviving node labels: A, B]
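One way to picture multiple hierarchies is a store in which the same taxon can have a different parent in each classification. The sketch below is an assumption about how such a structure might look, not a description of the EOL databases; the class `MultiHierarchy`, the hierarchy labels `H1` and `H2`, and the node `Root` are hypothetical, with `A` and `B` borrowed from the slide's diagram labels.

```python
class MultiHierarchy:
    """Stores several classifications side by side as child -> parent maps."""

    def __init__(self):
        self._parents = {}  # hierarchy name -> {child: parent}

    def add(self, hierarchy, child, parent):
        """Record that, within one hierarchy, child sits directly under parent."""
        self._parents.setdefault(hierarchy, {})[child] = parent

    def lineage(self, hierarchy, taxon):
        """Return the path from taxon up to the root of the given hierarchy."""
        parents = self._parents.get(hierarchy, {})
        path = [taxon]
        while path[-1] in parents:
            parent = parents[path[-1]]
            if parent in path:
                raise ValueError(f"cycle detected in hierarchy {hierarchy!r}")
            path.append(parent)
        return path


tree = MultiHierarchy()
tree.add("H1", "B", "A")      # in H1, B is placed under A
tree.add("H1", "A", "Root")
tree.add("H2", "B", "Root")   # in H2, B sits directly under the root

print(tree.lineage("H1", "B"))
print(tree.lineage("H2", "B"))
```

Keeping each classification separate avoids forcing a single consensus tree, at the cost of having to say which hierarchy a lineage query refers to.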

  11. Next Steps
      • Reduce impediments for contributors
      • Integrate more existing content stores such as YouTube or Wikipedia
      • Continue to improve name and concept reconciliation
      • Provide names and hierarchy editing interfaces for curators
      • Improve name-finding tools for Biodiversity Heritage Library (BHL)
      • Atomized descriptive data
