Visual Search Interfaces for Online Digital Repositories

Dissertation Defense • Edward Clarkson • May 6, 2009 • Committee: Jim Foley, Georgia Tech; Gregory Abowd, Georgia Tech; Gary Marchionini, UNC-CH; Colin Potts, Georgia Tech; John Stasko, Georgia Tech

Introduction: Problem Space • Environment of interest: online digital repositories • Spectrum of directed to exploratory information-seeking tasks • Directed: specific end goal • Exploratory: unspecified goal (browsing)

Introduction: Current Approaches • Keyword search engines and search engine result pages (SERPs) • Faceted classification and navigation • Problems: contextualization; vocabulary visibility; detecting relationships within data

Introduction: My Approach: ResultMaps • Goal: provide context for search results • Approach: pair familiar text listings with an alternative visual representation • Leverage hierarchical metadata (expose more of its structure) • Make outliers, clusters, and relationships easier to detect

Introduction: My Approach: ResultMaps • Encode the total repository within a treemap according to hierarchical metadata [Johnson 1991] • Recursive division of 2-D area according to tree data • Visually accentuate matched results within the overall space • Interactively link the treemap with text result listings • Treemaps preserve context • Listings preserve familiarity
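
The "recursive division of 2-D area according to tree data" can be illustrated with a minimal slice-and-dice treemap sketch. This is not the ResultMaps implementation: the `Node` class, the `layout` function, and the `matched` flag are hypothetical names chosen for illustration, and slice-and-dice is only the simplest treemap layout, assumed here for brevity.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a size, internal nodes carry children."""
    name: str
    size: float = 0.0                      # leaf weight; ignored for internal nodes
    children: list[Node] = field(default_factory=list)
    matched: bool = False                  # True if the item matches the current query


def weight(node: Node) -> float:
    """Total weight of a subtree: the sum of its leaf sizes."""
    if not node.children:
        return node.size
    return sum(weight(child) for child in node.children)


def layout(node, x, y, w, h, depth=0, out=None):
    """Slice-and-dice layout: split the rectangle among children in proportion
    to their weights, alternating the split direction at each tree level.
    Returns a list of (name, x, y, w, h, matched) tuples, one per leaf."""
    if out is None:
        out = []
    if not node.children:
        out.append((node.name, x, y, w, h, node.matched))
        return out
    total = weight(node)
    offset = 0.0                           # fraction of the rectangle already used
    for child in node.children:
        frac = weight(child) / total if total else 0.0
        if depth % 2 == 0:                 # even depth: divide along the x axis
            layout(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:                              # odd depth: divide along the y axis
            layout(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out
```

A renderer could then draw every leaf rectangle and highlight those whose `matched` flag is set, which corresponds to accentuating matched results within the overall space.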

Introduction: My Approach: ResultMaps vs. • Advantages • Space-constrained display • Leaf node emphasis • Well-suited for grid-based web • Disadvantages • Actual hierarchical structure not evident • Learning effects for novice users [Lamping 1995], [Plaisant 2002]

Introduction: Overview and Contributions • 0. Introduction • Thesis and Contributions • SERP ResultMaps • Faceted ResultMaps • Modeling Faceted Metadata • Design Implications • System Design • Evaluation Design

Research Approach: Thesis “ResultMaps constitute a lightweight visualization mechanism for digital repository search systems. They provide a means for contextualizing repository content, providing several prospective benefits, while not impairing usage for uninterested users. Empirical studies of their usage, along with models of faceted environments, suggest a set of implications for design of future systems in this space and their evaluation.”

Research Approach: Questions • How does adding SERP RMs affect user performance on DL search tasks? • How does adding RMs affect subjective impressions (such as satisfaction and engagement) of DL interfaces? • Do RMs yield a greater level of knowledge about the overall content in a digital library as an ancillary effect of normal usage? • How do RMs affect query string characteristics over sequences of queries and other types of user behavior? • How do faceted ResultMaps affect subjective impressions of faceted DL interfaces? • How do faceted ResultMaps affect the incidence of data insights (e.g., identifying data relationships such as correlations between facets)?

Research Approach: Experimental Studies

SERP ResultMaps: Demo

SERP Evaluations: Study R2 Design • Summative • RQs: performance, subjective effects, ancillary effects, query construction • Split-plot design • RM vs. control (between); repository size (within) • Measures • Task time/accuracy, CSUQ, engagement, enjoyment [Lewis 1995], [Ghani 1991], [Capra 2007]

SERP Evaluations: Study R2 Procedure • Datasets • HCC EDL (~500 nodes) • Intute (~5000 nodes) • 6 tasks (1 practice) • Range of directed to open-ended • 36 volunteers from intro HCI, PSYC courses

SERP Evaluations: Study R2 Results • RM users faster, more accurate, better subjective results… • But only the enjoyment difference was significant • No differences in query characteristics between interfaces • Large repository had more, longer, more varied queries

SERP Evaluations: Study R2 Results • Positive RM ratings • Anecdotal comments • “Visual representation of materials…is useful” • No negative results, and many near-significant results

Introduction: Overview and Contributions • 0. Introduction • SERP ResultMaps • Faceted ResultMaps • Modeling Faceted Metadata • Design Implications • System Design • Evaluation Design

Faceted Navigation: Introduction • Classify items in multiple independent categorizations (facets) • Ex.: architectural works (focus) • High recent uptake in research, e-commerce • Amazon, eBay, etc. • Flamenco, Relation Browser, mSpace, etc. [Yee 02], [Capra 07], [Huynh 09], [Lee 09], [schraefel 05] • [Figure labels: Focus; Facets]

Faceted Navigation: Introduction

Faceted Navigation: Introduction • Facets contain relevant facet values • Represent available constraints • Selections trigger UI update • Matching focus items • Relevant facet values • Differences from SERP environment
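
The update cycle described above (a selection narrows the matching focus items, and the relevant facet values are recomputed from what remains) can be sketched as follows. This is a minimal illustration under assumptions: items are plain dictionaries with one value per facet, and the function names `select` and `facet_counts` and the sample data are hypothetical, not taken from Flamenco or any other tool named in the talk.

```python
from collections import Counter


def select(items, selections):
    """Return the focus items that match every (facet, value) selection."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(items, facets):
    """For each facet, count how many of the given items carry each value.
    Values with a nonzero count are the constraints still available."""
    return {
        facet: Counter(item[facet] for item in items if facet in item)
        for facet in facets
    }


# Hypothetical focus items: architectural works described by two facets.
works = [
    {"title": "Town hall", "country": "Italy", "type": "civic"},
    {"title": "Cathedral", "country": "Italy", "type": "church"},
    {"title": "Courthouse", "country": "USA", "type": "civic"},
]

matching = select(works, {"country": "Italy"})
remaining = facet_counts(matching, ["country", "type"])
```

After selecting `country = Italy`, `matching` holds the two Italian works and `remaining["type"]` counts one civic building and one church, so both values would still be offered as constraints.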

Faceted ResultMaps • Scalability factors • Aggregating nodes • Ordered layouts • Implementations • Flamenco-based • Swivel

Faceted ResultMaps: Flamenco Demo

Modeling Faceted Metadata

Modeling Faceted Metadata: Implications • Non-reductive selections could be detrimental to performance • For M facets, N selections: M+1 queries that are (N+1)-way joins • Asynchronous preview data • For M hierarchical facets, mean of B sub-values and N selections: tooltip previews require a factor of B+1 more queries • Only one extra query per ResultMap • ResultMaps require a less restrictive version of the query model
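
The query counts on this slide can be worked through with a small calculation. The sketch below encodes one reading of the slide, which is an assumption: the base page load costs M+1 queries, tooltip previews multiply that by B+1, and each ResultMap adds a single query to the base. The function names are hypothetical, and the value B=5 in the example is made up for illustration.

```python
def base_queries(m_facets: int) -> int:
    """Queries per page update without previews: M + 1 (from the slide)."""
    return m_facets + 1


def join_width(n_selections: int) -> int:
    """Each of those queries is an (N + 1)-way join (from the slide)."""
    return n_selections + 1


def queries_with_previews(m_facets: int, b_subvalues: int) -> int:
    """Assumed reading: previews scale the base count by a factor of B + 1."""
    return base_queries(m_facets) * (b_subvalues + 1)


def queries_with_resultmaps(m_facets: int, n_resultmaps: int) -> int:
    """Assumed reading: each ResultMap adds exactly one query to the base."""
    return base_queries(m_facets) + n_resultmaps


# Example using the Study F2 configuration (11 facets, 6 ResultMaps)
# and a hypothetical mean of B = 5 sub-values per facet value.
print(base_queries(11))                 # 12
print(queries_with_previews(11, 5))     # 72
print(queries_with_resultmaps(11, 6))   # 18
```

Under these assumptions the preview cost grows multiplicatively with B while the ResultMap cost grows additively, which is consistent with the slide's point that only asynchronous previews are likely to scale.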

Faceted Evaluations: Study F2 Design • Summative/Formative • Quasi-experimental between-subjects • 7 ARCH 2111 precepts • 1 pilot, 4 test, 2 aborted (n=23 control, n=15 RM) • Flamenco ResultMaps vs. control • Measures: tasks, CSUQ, engagement/enjoyment

Faceted Evaluations: Study F2 Procedure • Archivision data: 16K+ images, 11 facets, 6 RMs • Preview performance poor; disabled for test • In-class assignment during 50-minute class meeting • Ex.: Search for and list differences between American and Italian civic buildings • Task grading according to TA template/input • Execution problems • Network outage • Concurrent performance • Warm-up/practice constraint

Faceted Evaluations: Study F2 Results • Dominated by problems (performance, etc.) • RM users: explicit positive comments • Control users: explicit references to RM-like features • “[It is] difficult to draw relationships with other categories [and] difficult to cross-reference material outside the search” • “Side by side comparison of…categories would be useful; more graphics might be nicer.”

Faceted Evaluations: Study F2 Implications • Simplification • No key facet • No differentiation between on/off screen • Increase RM size • Improve performance

Faceted ResultMaps: Design Iteration • Swivel • Modern tech., guided by model work • RM version • No key facet • Limit depth • Stacked bar version

Faceted ResultMaps: Swivel Demo

Faceted Evaluations: Study F3 Design • Summative • Longitudinal, within-subjects, quasi-experimental • 4 ARCH 2112 precepts (N=66; 1/3 from F1 classes) • Measures: CSUQ, engagement/enjoyment

Faceted Evaluations: Study F3 Procedure • Longitudinal: 2 weeks x 3 conditions • Same Archivision data from F1 • Reduced facets: 8 facets, 4 RMs • Experimental variances/threats to validity: • Exam after period 1 • Cancelled classes

Faceted Evaluations: Study F3 Results • Analysis problems: response rate and ‘useful’ responses • 27% returned all 3 surveys (9% all useful) • 42 students returned at least one useful survey

Faceted Evaluations: Study F3 Results • Significant pairwise CSUQ correlations: • Bar/RM: r2=0.762; p<0.01; n=14 • RM/control: r2=-0.839; p<0.02; n=7 • Bar/control: r2=-0.513; p=0.09; n=12 • Behavioral logs (page views) • RM/Bar: longer sessions, more usage of facets than search queries

Faceted Evaluations: Study F3 Implications • Link to learning style? • “The Map was more of a visual aid and helped me organize my thoughts” • “Map is the most visual and most appealing of the three. This is good, especially when dealing with architects” • “I am a visual learner so the map version seemed to be my favorite.” • Power of defaults: • 33 direct usages (clicks) of top RM, 9 of next, none of others

Design Implications: System Design (from Evaluation) • Power of defaults (F2, F3) • Ordering, progressive disclosure (trade-off with discoverability, insight) • UI overload (R2, F2, F3) • Infovis additions less likely to overload on simpler base systems (mitigated by user experience, learning style)

Design Implications: System Design (from Model) • Extensions to faceted navigation tools • Probabilistic categorization • Indirect selections • Only asynchronous previews are scalable

Design Implications: System Evaluation • User interest (R1, R2, F1, F2, F3) • Direct vs. analytic data interest • Pedagogical evaluation link • Facilitate insight reporting (F2, F3) • Analytic user interest • Low-cost reporting mechanisms (annotation, bookmarking) • Evaluation incentives • Task comparison: measure of isomorphism (R1, R2, F1) • Similarity in local tree structures (tree alignment) • Similarity in target items

Conclusions and Future Work • Faceted navigation • Extending data, query capabilities of tools • Complexity analysis of interface features, data limits based on data/query models • Similarity of HCIR and infovis problems • Data integration/reformulation: (semi) automatic tools • Quantitative task isomorphism metrics

Acknowledgements • Personal • Professional • Jim and committee • Sham Navathe • Sabir Khan, Benjy Flowers, Myung Seok Hyun, Carina Antunez, Marietta Monaghan • Financial • Stephen Fleming Chair • DHS, NVAC, RVAC

Questions?

Backups

SERP Evaluations: Study R3 • HCC EDL log analysis, April-Oct. 2008 • Users randomly shown either the RM or control version of the SERP, based on IP address • 22,867 requests • 516 search requests (272 RM; 244 control) • …but no interesting results
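
One common way to show each visitor a consistent condition "based on IP address" is to hash the address and use the result to pick a condition. The slide does not say how Study R3 did this, so the sketch below is an assumed mechanism for illustration only; the function name `assign_condition` and the condition labels are hypothetical.

```python
import hashlib


def assign_condition(ip_address: str) -> str:
    """Deterministically map an IP address to 'RM' or 'control'.
    Hashing spreads addresses roughly evenly across the two conditions,
    and the same address always receives the same interface."""
    digest = hashlib.sha256(ip_address.encode("utf-8")).digest()
    return "RM" if digest[0] % 2 == 0 else "control"


# The documentation-only address below is a placeholder, not study data.
condition = assign_condition("192.0.2.17")
```

A deterministic mapping like this avoids storing per-user state, at the cost of treating everyone behind a shared address as a single user.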

SERP ResultMaps
