Faceted Search for Hydrologic Data Discovery
Alex Bedig, Alva Couch
Tufts University, Medford, MA
Overview of Relevant Architecture Source: http://www.cuahsi.org/his
“Ontology”
• A collection of terms along with a set of relationships between terms.
• In our case, the main relationship is hierarchical: “is a subconcept of”.
• Provides a mapping between user notions of data and data as it is found in HIS Central.
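The hierarchical "is a subconcept of" relationship can be sketched as a simple child-to-parent map. This is only an illustration: apart from the root name "Hydrosphere", the terms below are assumed for the example and are not taken from the actual CUAHSI ontology, and the helper names are hypothetical.

```python
from typing import Dict, List, Optional

# Illustrative fragment of a concept hierarchy (child -> parent).
# Only "Hydrosphere" comes from the slides; the other terms are assumptions.
PARENT: Dict[str, Optional[str]] = {
    "Hydrosphere": None,
    "Physical": "Hydrosphere",
    "Chemical": "Hydrosphere",
    "Temperature": "Physical",
    "Nutrient": "Chemical",
    "Nitrogen": "Nutrient",
}


def ancestors(term: str) -> List[str]:
    """Return the chain of broader concepts for a term, nearest first."""
    chain: List[str] = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def is_subconcept_of(term: str, broader: str) -> bool:
    """True if `term` sits anywhere below `broader` in the hierarchy."""
    return broader in ancestors(term)


def descendants(term: str) -> List[str]:
    """Return all narrower concepts under a term, sorted by name."""
    return sorted(t for t in PARENT if is_subconcept_of(t, term))
```

Selecting a broad concept such as "Chemical" can then be expanded with `descendants("Chemical")` into the narrower terms under which data series would actually be indexed.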
Discovery in HydroDesktop Source: HydroDesktop
Procedure of Discovery in HydroDesktop
• Specify spatial and temporal dimensions.
• Choose terms from the “Hydrosphere” variable name ontology.
• Click search, wait… for results… usually.
April 15, 2011 Usability Study CUAHSI Ontology Startree
Use Case 1: No Matching Series
ISSUE: The user’s selections return no series, and no feedback suggests which constraints could be relaxed.
SOLUTION: Search should occur in multiple steps, informing the user at each step of where data exists.
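A minimal sketch of the multi-step idea, assuming an in-memory catalog rather than HIS Central: constraints are applied one at a time, and the number of series remaining after each step is recorded so the user can see which constraint emptied the result. The catalog records, field names, and `stepwise_search` function are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Series = Dict[str, object]
Constraint = Tuple[str, Callable[[Series], bool]]

# Hypothetical catalog of series metadata; field names are assumptions.
CATALOG: List[Series] = [
    {"variable": "Temperature", "region": "MA", "year": 2009},
    {"variable": "Temperature", "region": "UT", "year": 2010},
    {"variable": "Nitrogen", "region": "MA", "year": 2010},
]

# Constraints are (label, predicate) pairs, applied in order.
CONSTRAINTS: List[Constraint] = [
    ("region = MA", lambda s: s["region"] == "MA"),
    ("variable = Temperature", lambda s: s["variable"] == "Temperature"),
    ("year = 2010", lambda s: s["year"] == 2010),
]


def stepwise_search(
    catalog: List[Series], constraints: List[Constraint]
) -> Tuple[List[Series], List[Tuple[str, int]]]:
    """Apply constraints in order and report how many series survive each one."""
    remaining = list(catalog)
    report: List[Tuple[str, int]] = []
    for label, predicate in constraints:
        remaining = [s for s in remaining if predicate(s)]
        report.append((label, len(remaining)))
        if not remaining:
            # Stop early: this is the constraint the user could relax.
            break
    return remaining, report
```

With the sample data, the report shows two series after the region step, one after the variable step, and none after the year step, which points to the year constraint as the one to relax.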
Use Case 2: No Familiar Terms
ISSUE: The user is unfamiliar with the terms provided in the variable-name ontology, leading to low confidence in search results.
SOLUTION: Search should allow for multiple representations of the same canonical names, eliminate options based upon known terms, and present only options for which data is available.
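One way to picture multiple representations of the same canonical names is a synonym table that resolves a user's wording to a canonical term and then hides terms with no data. The vocabulary, series counts, and function names below are assumptions made up for the sketch, not the contents of HIS Central.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: user wording (lowercase) -> canonical name.
SYNONYMS: Dict[str, str] = {
    "flow": "Discharge",
    "streamflow": "Discharge",
    "water temp": "Temperature",
    "rainfall": "Precipitation",
}

# Hypothetical number of series available per canonical name.
SERIES_COUNT: Dict[str, int] = {
    "Discharge": 12,
    "Temperature": 40,
    "Precipitation": 0,
}


def canonical(term: str) -> Optional[str]:
    """Resolve a user-supplied term to a canonical name, or None if unknown."""
    key = term.strip().lower()
    if key in SYNONYMS:
        return SYNONYMS[key]
    for name in SERIES_COUNT:
        if name.lower() == key:
            return name
    return None


def available_options(user_terms: List[str]) -> List[str]:
    """Canonical names for the given terms, keeping only those with data."""
    options: List[str] = []
    for term in user_terms:
        name = canonical(term)
        if name is not None and SERIES_COUNT.get(name, 0) > 0 and name not in options:
            options.append(name)
    return options
```

Here "flow" and "streamflow" collapse to a single option, while "rainfall" resolves to a canonical name that is dropped because no series are available for it.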
Use Case 3: Too Many Results
ISSUE: The user’s search returns a large number of results; filtering any further requires downloading the results for client-side manipulation.
SOLUTION: Exposing multiple dimensions of metadata in the search interface allows for more precise searches, reducing download time and the need for client-side selection.
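The effect of exposing several metadata dimensions can be sketched as a filter that returns, alongside the matching series, the counts of values still available in each unselected facet. The facet names and records are hypothetical, and the in-memory filter stands in for whatever query the server would actually run.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical metadata dimensions (facets) and series records.
FACETS = ["variable", "medium", "source"]
RECORDS: List[Record] = [
    {"variable": "Temperature", "medium": "Surface water", "source": "Agency A"},
    {"variable": "Temperature", "medium": "Groundwater", "source": "Agency A"},
    {"variable": "Discharge", "medium": "Surface water", "source": "Agency A"},
    {"variable": "Temperature", "medium": "Surface water", "source": "Agency B"},
]


def faceted_search(
    records: List[Record], selections: Dict[str, str]
) -> Tuple[List[Record], Dict[str, Dict[str, int]]]:
    """Filter by the selected facet values and count the remaining options."""
    matches = [
        r for r in records
        if all(r[facet] == value for facet, value in selections.items())
    ]
    remaining: Dict[str, Dict[str, int]] = {}
    for facet in FACETS:
        if facet not in selections:
            # Count how many matching series carry each value of this facet.
            remaining[facet] = dict(Counter(r[facet] for r in matches))
    return matches, remaining
```

Because the counts are computed before anything is downloaded, a user who has selected a variable can see how many series each medium or source would leave and narrow the search further on the server side.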
Demo!
• SOAP Endpoint: http://cuahsi.eecs.tufts.edu/FacetedSearch/MultiFacetedHISSvc.svc?wsdl
• Prototype Services Demonstrated:
  • GetAllOntologyElements
  • GetTypedOntologyElementsGivenConstraints
  • ConductFacetedSearch
Conclusions
• Faceted search of HIS Central improves the user experience by:
  • Eliminating “wasted” time in which a search returns no data.
  • Allowing multiple metadata dimensions to be specified.
  • Allowing multiple ontological representations of vocabulary.
  • Moving towards the use of multiple vocabularies.
• This increases the likelihood that a user finds relevant data.
Conclusions
• Faceted search requires some rethinking of HIS Central, including:
  • Services that return whether series exist for a query.
  • Support for multi-dimensional queries.
  • A need for speed that may justify supercomputing solutions.