Visual Search Interfaces for Online Digital Repositories. Dissertation Defense. Edward Clarkson, May 6, 2009. Committee: Jim Foley, Georgia Tech; Gregory Abowd, Georgia Tech; Gary Marchionini, UNC – CH; Colin Potts, Georgia Tech; John Stasko, Georgia Tech
Introduction: Problem Space • Environment of interest: online digital repositories • Spectrum of information-seeking tasks, from directed to exploratory • Directed: specific end goal • Exploratory: unspecified goal (browsing)
Introduction: Current Approaches • Keyword search engines and search engine result pages (SERPs) • Faceted classification and navigation • Problems: • Contextualization • Vocabulary visibility • Detecting relationships within data
Introduction: My Approach: ResultMaps • Goal: provide context for search results • Approach: pair familiar text listings with an alternative visual representation • Leverage hierarchical metadata (expose more of its structure) • Make outlier, cluster, and relationship detection more apparent
Introduction: My Approach: ResultMaps • Encode total repository within a treemap according to hierarchical metadata • Recursive division of 2-D area according to tree data • Visually accentuate matched results within overall space • Interactively link treemap with text result listings • Treemaps preserve context • Listings preserve familiarity [Johnson 1991]
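The "recursive division of 2-D area according to tree data" can be sketched as a slice-and-dice treemap layout. This is a minimal illustration, not the ResultMaps implementation: the `Node` class, its `matched` flag, and the alternating split direction are assumptions made for the example, and the slides do not say which treemap layout algorithm ResultMaps use.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical repository node: a category or a leaf item."""
    name: str
    size: float = 0.0               # leaf weight; ignored for internal nodes
    matched: bool = False           # True if the item matches the current query
    children: List["Node"] = field(default_factory=list)


def weight(node: Node) -> float:
    """Total leaf weight under a node."""
    if not node.children:
        return node.size
    return sum(weight(child) for child in node.children)


Rect = Tuple[str, float, float, float, float, bool]


def layout(node: Node, x: float, y: float, w: float, h: float,
           depth: int = 0, out: Optional[List[Rect]] = None) -> List[Rect]:
    """Divide the rectangle (x, y, w, h) among children in proportion to weight.

    The split direction alternates with depth (slice-and-dice). Returns one
    (name, x, y, w, h, matched) tuple per leaf.
    """
    if out is None:
        out = []
    if not node.children:
        out.append((node.name, x, y, w, h, node.matched))
        return out
    total = weight(node)
    offset = 0.0
    for child in node.children:
        frac = weight(child) / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: children sit side by side along the x axis.
            layout(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            # Odd depth: children are stacked along the y axis.
            layout(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


# Hypothetical three-item repository; B2 is a search hit.
root = Node("root", children=[
    Node("A", size=1),
    Node("B", children=[Node("B1", size=1), Node("B2", size=2, matched=True)]),
])
rects = layout(root, 0, 0, 100, 100)
```

Each leaf keeps its position whether or not it matches, so a renderer can highlight the tuples whose last field is `True` while the unmatched cells continue to show the rest of the repository.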
Introduction: My Approach: ResultMaps vs. • Advantages • Space-constrained display • Leaf node emphasis • Well-suited for grid-based web • Disadvantages • Actual hierarchical structure not evident • Learning effects for novice users [Lamping 1995], [Plaisant 2002]
Introduction: Overview and Contributions 0. Introduction • Thesis and Contributions • SERP ResultMaps • Faceted ResultMaps • Modeling Faceted Metadata • Design Implications • System Design • Evaluation Design
Research Approach: Thesis “ResultMaps constitute a lightweight visualization mechanism for digital repository search systems. They provide a means for contextualizing repository content, providing several prospective benefits, while not impairing usage for uninterested users. Empirical studies of their usage, along with models of faceted environments, suggest a set of implications for design of future systems in this space and their evaluation.”
Research Approach: Questions • How does adding SERP RMs affect user performance on DL search tasks? • How does adding RMs affect subjective impressions (such as satisfaction and engagement) of DL interfaces? • Do RMs yield a greater level of knowledge about the overall content in a digital library as an ancillary effect of normal usage? • How do RMs affect query string characteristics over sequences of queries and other types of user behavior? • How do faceted ResultMaps affect subjective impressions of faceted DL interfaces? • How do faceted ResultMaps affect the incidence of data insights (e.g., identifying data relationships such as correlations between facets)?
SERP Evaluations: Study R2 Design • Summative • RQs: performance, subjective effects, ancillary effects, query construction • Split-plot design • RM vs. control (between); repository size (within) • Measures • Task time/accuracy, CSUQ, engagement, enjoyment [Lewis 1995], [Ghani 1991], [Capra 2007]
SERP Evaluations: Study R2 Procedure • Datasets • HCC EDL (~500 nodes) • Intute (~5000 nodes) • 6 tasks (1 practice) • Range of directed to open-ended • 36 volunteers from intro HCI, PSYC courses
SERP Evaluations: Study R2 Results • RM users faster, more accurate, better subjective results… • But only the enjoyment difference was significant • No differences in query characteristics between interfaces • Large repository had more, longer, more varied queries
SERP Evaluations: Study R2 Results • Positive RM ratings • Anecdotal comments • “Visual representation of materials…is useful” • No negative results, and many near-significant results
Introduction: Overview and Contributions 0. Introduction • SERP ResultMaps • Faceted ResultMaps • Modeling Faceted Metadata • Design Implications • System Design • Evaluation Design
Faceted Navigation: Introduction • Classify items in multiple independent categorizations (facets) • Ex.: architectural works (focus) • High recent uptake in research, e-commerce • Amazon, eBay, etc. • Flamenco, Relation Browser, mSpace, etc. [Yee 02], [Capra 07], [Huynh 09], [Lee 09], [schraefel 05]
Faceted Navigation: Introduction • Facets contain relevant facet values • Represent available constraints • Selections trigger UI update • Matching focus items • Relevant facet values • Differences from SERP environment
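The update cycle described on this slide (a selection narrows the matching focus items, and the facet values shown are recomputed from those matches) can be sketched as follows. The item records, facet names, and the `select` and `facet_counts` helpers are hypothetical; they are not taken from Flamenco or any other system named in the talk.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical focus items with two facets, "country" and "type".
ITEMS: List[Dict[str, str]] = [
    {"title": "Town hall", "country": "USA", "type": "civic"},
    {"title": "Courthouse", "country": "USA", "type": "civic"},
    {"title": "Palazzo", "country": "Italy", "type": "civic"},
    {"title": "Basilica", "country": "Italy", "type": "religious"},
]


def select(items: List[Dict[str, str]],
           selections: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the focus items that satisfy every selected facet value."""
    return [item for item in items
            if all(item.get(facet) == value
                   for facet, value in selections.items())]


def facet_counts(items: List[Dict[str, str]],
                 facets: List[str]) -> Dict[str, Counter]:
    """Count, per facet, how many of the given items carry each value.

    Values with a non-zero count are the constraints still available
    after the current selections.
    """
    return {facet: Counter(item[facet] for item in items if facet in item)
            for facet in facets}


# One selection triggers both parts of the UI update.
matching = select(ITEMS, {"country": "Italy"})
remaining = facet_counts(matching, ["country", "type"])
```

In this sketch, recomputing the counts for every facet after each selection is what separates the faceted environment from a plain result list, which only needs the matching items.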
Faceted ResultMaps • Scalability factors • Aggregating nodes • Ordered layouts • Implementations • Flamenco-based • Swivel
Modeling Faceted Metadata: Implications • Non-reductive selections could be detrimental to performance • For M facets, N selections: M+1 queries that are (N+1)-way joins • Asynchronous preview data • For M hierarchical facets, mean of B sub-values and N selections: tooltip previews require a factor of B+1 more queries • Only one extra query per ResultMap • ResultMaps require a less restrictive version of the query model
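The query counts on this slide can be turned into a small arithmetic sketch. It encodes one reading of the bullets, which is an assumption: M+1 queries per page update, each an (N+1)-way join; tooltip previews multiplying that count by B+1; and each ResultMap adding a single query. The function names are hypothetical, and the value B=5 in the example is invented for illustration.

```python
def base_queries(num_facets: int) -> int:
    """Queries per page update for M facets: M + 1 (per the slide)."""
    return num_facets + 1


def join_width(num_selections: int) -> int:
    """Each of those queries is an (N + 1)-way join for N selections."""
    return num_selections + 1


def with_tooltip_previews(num_facets: int, mean_subvalues: int) -> int:
    """Assumed reading: previews multiply the query count by B + 1."""
    return base_queries(num_facets) * (mean_subvalues + 1)


def with_resultmaps(num_facets: int, num_resultmaps: int) -> int:
    """Assumed reading: each ResultMap adds exactly one query."""
    return base_queries(num_facets) + num_resultmaps


# 11 facets and 6 ResultMaps are the Archivision figures from Study F2;
# B = 5 is a made-up mean number of sub-values.
plain = base_queries(11)
previews = with_tooltip_previews(11, 5)
resultmaps = with_resultmaps(11, 6)
```

Under these assumptions the preview cost grows multiplicatively with B while the ResultMap cost grows additively with the number of maps, which is consistent with the slide's contrast between the two features.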
Faceted Evaluations: Study F2 Design • Summative/Formative • Quasi-experimental between-subjects • 7 ARCH 2111 precepts • 1 pilot, 4 test, 2 aborted (n=23 control, n=15 RM) • Flamenco ResultMaps vs. control • Measures: tasks, CSUQ, engagement/enjoyment
Faceted Evaluations: Study F2 Procedure • Archivision data: 16K+ images, 11 facets, 6 RMs • Preview performance poor; disabled for test • In-class assignment during 50-minute class meeting • Ex.: Search for and list differences between American and Italian civic buildings • Task grading according to TA template/input • Execution problems • Network outage • Concurrent performance • Warm-up/practice constraint
Faceted Evaluations: Study F2 Results • Dominated by problems (performance, etc.) • RM users: explicit positive comments • Control users: explicit references to RM-like features • “[It is] difficult to draw relationships with other categories [and] difficult to cross-reference material outside the search” • “Side by side comparison of…categories would be useful; more graphics might be nicer.”
Faceted Evaluations: Study F2 Implications • Simplification • No key facet • No differentiation between on/off screen • Increase RM size • Improve performance
Faceted ResultMaps: Design Iteration • Swivel • Modern tech., guided by model work • RM version • No key facet • Limit depth • Stacked bar version
Faceted Evaluations: Study F3 Design • Summative • Longitudinal, within-subjects, quasi-experimental • 4 ARCH 2112 precepts (N=66; 1/3 from F1 classes) • Measures: CSUQ, engagement/enjoyment
Faceted Evaluations: Study F3 Procedure • Longitudinal: 2 weeks x 3 conditions • Same Archivision data from F1 • Reduced facets: 8 facets, 4 RMs • Experimental variances/threats to validity: • Exam after period 1 • Cancelled classes
Faceted Evaluations: Study F3 Results • Analysis problems: response rate and ‘useful’ responses • 27% returned all 3 surveys (9% all useful) • 42 students returned at least one useful survey
Faceted Evaluations: Study F3 Results • Sig. pairwise CSUQ corr.: • Bar/RM: r2=0.762; p<0.01; n=14 • RM/control: r2=-0.839; p<0.02; n=7 • Bar/control: r2=-0.513; p=0.09; n=12 • Behavioral logs (page views) • RM/Bar: longer sessions, more usage of facets than search queries
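For readers unfamiliar with pairwise correlations, the sketch below computes a Pearson correlation coefficient between two lists of paired scores. It rests on an assumption: the slide does not name the statistic behind the reported values, so Pearson's r is only a plausible guess. The scores in the example are made up and are not data from Study F3.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between paired samples xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and the two root sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cross / (dev_x * dev_y)


# Hypothetical questionnaire scores from the same five respondents
# under two interface conditions.
condition_a = [2.0, 3.5, 4.0, 5.5, 6.0]
condition_b = [2.5, 3.0, 4.5, 5.0, 6.5]
r = pearson_r(condition_a, condition_b)
```

Only respondents who rated both conditions can contribute a pair, which would be consistent with the small n reported for each comparison.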
Faceted Evaluations: Study F3 Implications • Link to learning style? • “The Map was more of a visual aid and helped me organize my thoughts” • “Map is the most visual and most appealing of the three. This is good, especially when dealing with architects” • “I am a visual learner so the map version seemed to be my favorite.” • Power of defaults: • 33 direct usages (clicks) of top RM, 9 of next, none of others
Design Implications: System Design (from Evaluation) • Power of defaults (F2, F3) • Ordering, progressive disclosure (trade-off with discoverability, insight) • UI overload (R2, F2, F3) • Infovis additions less likely to overload on simpler base systems (mitigated by user experience, learning style)
Design Implications: System Design (from Model) • Extensions to faceted navigation tools • Probabilistic categorization • Indirect selections • Only asynchronous previews are scalable
Design Implications: System Evaluation • User interest (R1, R2, F1, F2, F3) • Direct vs. analytic data interest • Pedagogical evaluation link • Facilitate insight reporting (F2, F3) • Analytic user interest • Low-cost reporting mechanisms (annotation, bookmarking) • Evaluation incentives • Task comparison: measure of isomorphism (R1, R2, F1) • Similarity in local tree structures (tree alignment) • Similarity in target items
Conclusions and Future Work • Faceted navigation • Extending data, query capabilities of tools • Complexity analysis of interface features, data limits based on data/query models • Similarity of HCIR and infovis problems • Data integration/reformulation: (semi) automatic tools • Quantitative task isomorphism metrics
Acknowledgements • Personal • Professional • Jim and committee • Sham Navathe • Sabir Khan, Benjy Flowers, Myung Seok Hyun, Carina Antunez, Marietta Monaghan • Financial • Stephen Fleming Chair • DHS, NVAC, RVAC
SERP Evaluations: Study R3 • HCC EDL log analysis, April-Oct. 2008 • Users randomly shown either the RM or control version of the SERP, based on IP address • 22,867 requests • 516 search requests (272 RM; 244 control) • …but no interesting results
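One common way to show each visitor a stable, effectively random condition based on IP address is to hash the address and use one bit of the digest. The slide does not describe how Study R3 mapped addresses to conditions, so the hashing scheme, the salt, and the `assign_condition` name below are all assumptions made for illustration.

```python
import hashlib


def assign_condition(ip_address: str, salt: str = "hypothetical-salt") -> str:
    """Map an IP address to "RM" or "control", always the same way.

    Hashing the salted address means repeat visits from one address get
    the same interface, while assignments across addresses are roughly
    balanced.
    """
    digest = hashlib.sha256((salt + ip_address).encode("utf-8")).digest()
    return "RM" if digest[0] % 2 == 0 else "control"


# Documentation-range addresses (RFC 5737), used purely as examples.
first = assign_condition("192.0.2.1")
again = assign_condition("192.0.2.1")
other = assign_condition("198.51.100.7")
```

A deterministic mapping like this also lets a later log analysis recover each request's condition from the logged address alone, provided the salt is kept.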