Data discovery from a digital library perspective

Greg Janée, Darren Hardy, UC Santa Barbara

Presentation Transcript


  1. Data discovery from a digital library perspective. Greg Janée, Darren Hardy, UC Santa Barbara

  2. Outline
     • Questions
       • grappling with granularity
       • struggling with search
       • dithering over distribution
       • pondering process
     • Integrating search with access

  3. Granularity (diagram labels)
     • institution (NASA)
     • data center (GSFC)
     • program (MODIS)
     • product (sea surface temperature)
     • resolution (1 km)
     • space
     • time
     • granule
     • datum
     • axis labels: type, organization

  4. Approaches I
     • ADL
       • uniform object (metadata) representation
       • flat list of collections (=containers)
       • possible extensions:
         • collections as first-order objects
         • nested containers
     • THREDDS
       • hierarchical “collection” datasets
       • “coherent” datasets (=aggregation server?)
       • “direct” datasets

  5. Approaches II
     • Granularity on the Web...
       • webpage
       • multi-page document
       • website
     • ...and sidestepping it
       • uniform representation (webpage)
       • page linking
       • visible, decomposable identifiers (URLs)
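The "visible, decomposable identifiers" point can be illustrated with a minimal sketch: given a URL, each coarser level of granularity is reachable just by trimming path segments. The URL and the `ancestors` helper below are hypothetical and only show the idea; they are not taken from ADL or THREDDS.

```python
from urllib.parse import urlsplit, urlunsplit


def ancestors(url):
    """Return the enclosing 'containers' of a URL, nearest first.

    Assumption: each path segment corresponds to one level of granularity.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    result = []
    # Drop one trailing segment at a time until only the site root is left.
    for i in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:i])
        if i:
            path += "/"
        result.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return result


# Hypothetical identifier: program / product / granule.
levels = ancestors("http://example.org/modis/sst/granule42")
for level in levels:
    print(level)
```

Because the identifier itself encodes the hierarchy, a client can move from a granule to its product or program without asking a catalog service.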

  6. Flattening granularity
     • Use heuristics to return “best” match
     • Diagram labels: inherit descriptive metadata; dataset; aggregate intrinsic metadata
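One way to read the diagram on this slide is that descriptive metadata flows down from containers to their members, while intrinsic metadata (here, temporal coverage) is aggregated up from members to containers. The sketch below assumes that reading; the `Node` class, its field names, and the sample values are hypothetical and are not ADL's or THREDDS's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical container or granule in a granularity hierarchy."""
    name: str
    descriptive: dict = field(default_factory=dict)   # added by curators
    time_range: Optional[tuple] = None                 # intrinsic: (first_year, last_year)
    children: list = field(default_factory=list, repr=False, compare=False)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def inherited_descriptive(node):
    """Merge descriptive metadata from the root down; deeper levels override."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    merged = {}
    for n in reversed(chain):
        merged.update(n.descriptive)
    return merged


def aggregated_time_range(node):
    """Union of the node's own time range and those of all its descendants."""
    ranges = [r for r in (aggregated_time_range(c) for c in node.children) if r]
    if node.time_range:
        ranges.append(node.time_range)
    if not ranges:
        return None
    return (min(r[0] for r in ranges), max(r[1] for r in ranges))


product = Node("sea surface temperature", descriptive={"program": "MODIS", "theme": "ocean"})
g1 = product.add(Node("granule-a", time_range=(2004, 2004)))
g2 = product.add(Node("granule-b", descriptive={"theme": "coastal"}, time_range=(2005, 2006)))

print(inherited_descriptive(g1))       # granule-a picks up the product's keywords
print(aggregated_time_range(product))  # product coverage is derived from its granules
```

With both directions filled in, every level carries comparable metadata, so a search can treat products and granules uniformly and then apply a heuristic to pick the "best" level to return.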

  7. Search
     • Type
       • text, numeric, space, time, ...
     • Source
       • data itself
       • intrinsic metadata
       • added (usually descriptive) metadata
       • 3rd party
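As a rough illustration of combining the search types listed above, the sketch below filters records by text, space (bounding-box overlap), and time (interval overlap). The `Record` structure and the sample records are made up for this example, and the bounding-box test ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record."""
    title: str
    bbox: tuple    # (west, south, east, north) in degrees
    years: tuple   # (first_year, last_year)


def bbox_overlaps(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def years_overlap(a, b):
    """True if two closed year intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(records, text=None, bbox=None, years=None):
    """Return records matching every constraint that was supplied."""
    hits = []
    for r in records:
        if text is not None and text.lower() not in r.title.lower():
            continue
        if bbox is not None and not bbox_overlaps(r.bbox, bbox):
            continue
        if years is not None and not years_overlap(r.years, years):
            continue
        hits.append(r)
    return hits


records = [
    Record("Sea surface temperature, 1 km", (-125.0, 30.0, -115.0, 40.0), (2002, 2006)),
    Record("Land cover", (-10.0, 35.0, 5.0, 45.0), (2001, 2001)),
]
hits = search(records, text="temperature",
              bbox=(-121.0, 33.0, -119.0, 35.0), years=(2005, 2005))
print([r.title for r in hits])
```

Here the constraints are evaluated against added metadata only; searching the data itself or third-party sources would need different matching logic.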

  8. Distribution
     • Centralized system
       • e.g. Google, ECHO
       • SPOF; requires resources
     • Peer-to-peer
       • e.g. BRICKS, built on P-GRID
       • MPOF; requires commitment
     • ADL: incomplete peer-to-peer

  9. A “textbook” search process
     • Classic process (Lancaster 1979)
       • Information need
       • Stated request
       • Selection of database
       • Search strategy
       • Search in database
       • Screening of output
     • Web search - about the same 25 years later

  10. What’s the real process?
      • Irrational search (Pharo & Järvelin 2006)
        • Textbook search processes insufficient
      • Disjointed incrementalism theory
        • Many smaller steps
        • Learning during a search
        • Subjective & dynamic information needs over time
      • What’s the ideal for earth science data users?
        • How do you inform choices during search?
        • How do you formulate a search, and what’s the context?
        • When is enough enough?

  11. Integrating search with access
      • File menu
        • Open...
        • Search library...
        • Close
        • Quit
      • Query results returned as a THREDDS catalog?
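To make the last question concrete, here is a minimal sketch of wrapping search hits in a THREDDS-style catalog document so that a THREDDS-aware client could open them like any other catalog. The function name, the query string, and the dataset paths are hypothetical. The sketch also omits the `service` declarations that a real catalog would need for its datasets to be accessible, so treat it as an outline rather than a valid, complete catalog.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 namespace.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"


def results_to_catalog(query, results):
    """Serialize (name, url_path) search hits as a catalog-shaped XML string."""
    ET.register_namespace("", THREDDS_NS)
    catalog = ET.Element(f"{{{THREDDS_NS}}}catalog",
                         {"name": f"Search results: {query}"})
    for name, url_path in results:
        # One dataset element per hit; access services are left out (see lead-in).
        ET.SubElement(catalog, f"{{{THREDDS_NS}}}dataset",
                      {"name": name, "urlPath": url_path})
    return ET.tostring(catalog, encoding="unicode")


xml_text = results_to_catalog(
    "sea surface temperature",
    [("SST granule A", "sst/a.nc"), ("SST granule B", "sst/b.nc")],
)
print(xml_text)
```

Returning results in the same form the client already uses for browsing is what would let a "Search library..." menu item sit next to "Open..." without a separate result viewer.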

  12. We’re funded to do this!
