This tutorial discusses the importance of information architecture in improving search interfaces, based on studies and research findings. It provides insights into the problems faced by users and proposes a methodology to integrate search and navigation using faceted metadata. The approach aims to enhance usability and support different task types. The tutorial also highlights the advantages of this approach in terms of user control, context, consistency, and scalability.
Designing Information Architecture for Search Marti Hearst University of California, Berkeley www.sims.berkeley.edu/~hearst NSF CAREER Grant, NSF9984741 Tutorial: SIGIR 2001
Outline • Motivation • Search Interfaces: • Web search vs Site Search • Search UIs: What works; what doesn’t • Methodology • Information Architecture Defined • Faceted Metadata • Integrating Search into IA via Faceted Metadata • Results of Usability Studies • Tools • Conclusions
Contributors to the Research • Dr. Rashmi Sinha • Graduate Students • Ame Elliott • Jennifer English • Kirsten Swearingen • Ping Yee • Research funded by • NSF CAREER Grant, NSF9984741
Claims • Web Search is OK • Gets people to the right starting points • Web SITE search is NOT ok • The best way to improve site search is • NOT to make new fancy algorithms • Instead …
The best way to improve search: Improve the User Interface
Recent Study by Vividence Research • Spring 2001, 69 web sites • 70% eCommerce • 31% Service • 21% Content • 2% Community • The most common problems: • 53% had poorly organized search results • 32% had poor information architecture • 32% had slow performance • 27% had cluttered home pages • 25% had confusing labels • 15% invasive registration • 13% inconsistent navigation
Vividence findings: effects on users • Poorly organized search results • Frustration and wasted time • Poor information architecture • Confusion • Dead ends • "back and forthing" • Forced to search
Vividence findings: effects on users • Cluttered home pages • Creates disinterest • Wastes time • No contrast: everything has equal weight • Don’t know where to start • Failure to engage • No call to action • Failure to establish navigation • Layout reflects company organization chart • Investor centeredness
Vividence findings: characteristics • Inconsistent Navigation • Primary navigation bar is, in fact, really secondary • Un-scalable designs • Poor transitions between company divisions • "Junk Drawer" navigation bars • Random links • Shoe-horned functions • Heavy need to hit the "back-button"
Vividence Study • Breakdown of most common search problems • 41% - of searches encountered no problems • 20% - had search problems not named below • 14% - of searches were not “advanced” enough • 12% - did not organize results well • 10% - of searches yielded inaccurate/unrelated results • 9% - were too slow • 8% - of searches had insufficient instructions • 7% - engine was too difficult to locate • 7% - of searches produced too few results • 7% - of searches were too limiting • 3% - of searches produced an error message • 3% - were too difficult to use
Other Relevant Studies • Commercial studies (are not usually scientific, do not supply full details) • CreativeGood.com Holiday 2000 ecommerce report • UIE, and Jared Spool’s talks: http://world.std.com/~uieweb • Scientific studies (often less relevant to real web situations) • Many papers from the CHI proceedings http://www.acm.org/dl/ • Papers from Human Factors and the Web http://www.optavia.com/hfweb/ • See the extensive bibliography from my textbook chapter (in this package).
The Philosophy • Information architecture should be designed to integrate search throughout • Search results should reflect the information architecture. • This supports an interplay between navigation and search • This supports the most common human search strategies.
The Approach • Assign faceted metadata to content items • Allow users to navigate through the faceted metadata in a flexible manner • Organize search results according to the faceted metadata so navigation looks similar throughout • Give previews of next choices • Allow access to previous choices
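The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system described in the tutorial: the recipe catalog, the facet names, and the functions `filter_items`, `preview_counts`, and `keyword_search` are all hypothetical. It shows items tagged with faceted metadata, narrowing by any combination of facet values, counts that preview the next choices, and keyword search that runs inside the current selections so results stay organized by the same facets.

```python
from collections import Counter

# Hypothetical catalog: each item carries faceted metadata (facet -> set of values).
ITEMS = [
    {"title": "Chicken tikka",
     "facets": {"cuisine": {"Indian"}, "course": {"main"},
                "ingredient": {"chicken", "yogurt"}}},
    {"title": "Mango lassi",
     "facets": {"cuisine": {"Indian"}, "course": {"drink"},
                "ingredient": {"mango", "yogurt"}}},
    {"title": "Chicken soup",
     "facets": {"cuisine": {"American"}, "course": {"soup"},
                "ingredient": {"chicken"}}},
    {"title": "Apple pie",
     "facets": {"cuisine": {"American"}, "course": {"dessert"},
                "ingredient": {"apple"}}},
]


def matches(item, selections):
    """True if the item carries every selected (facet, value) pair."""
    return all(value in item["facets"].get(facet, set())
               for facet, value in selections)


def filter_items(items, selections):
    """Narrow the collection by the facet values chosen so far."""
    return [item for item in items if matches(item, selections)]


def preview_counts(items, selections):
    """Preview of next choices: for the current result set, count the
    items under each facet value that has not been selected yet."""
    counts = {}
    for item in filter_items(items, selections):
        for facet, values in item["facets"].items():
            for value in values:
                if (facet, value) not in selections:
                    counts.setdefault(facet, Counter())[value] += 1
    return counts


def keyword_search(items, query, selections=()):
    """Keyword search restricted to the current facet selections
    (a plain substring match on the title, for illustration only)."""
    q = query.lower()
    return [item for item in filter_items(items, selections)
            if q in item["title"].lower()]


# Usage: pick a facet value, look at the previews, then back up.
selections = [("cuisine", "Indian")]
indian = filter_items(ITEMS, selections)      # Chicken tikka, Mango lassi
previews = preview_counts(ITEMS, selections)  # counts per remaining facet value
narrowed = selections + [("ingredient", "chicken")]
previous = narrowed[:-1]                      # access to previous choices
```

Keeping the selections as a plain list of (facet, value) pairs is one simple way to make reversal cheap: dropping any pair restores the earlier state without recomputing anything else.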
Advantages of the Approach • Supports different task types • Highly constrained known-item searches use one interface • Open-ended, browsing tasks use another interface • Both types of interface use the same underlying structure • Can easily switch from one interface type to the other midstream
Advantages of the Approach • Honors many of the most important usability design goals • User control • Provides context for results • Reduces short term memory load • Allows easy reversal of actions • Provides consistent view
Advantages of the Approach • Allows different people to add content without breaking things • Can make use of standard technology
Web Search is Working! • Survey finds high user satisfaction • Study by NPD Group http://www.searchenginewatch.com/reports/npd.html
Why is Web Search Working? • Web Search is Successful at Finding Good Starting Points (home pages) • Evidence: • Search engines using • Link analysis • Page popularity • Interwoven categories • These all find dominant home pages
Organizing Search Results: What works, What Doesn’t • There is a lot of prior work on this • Cha-Cha (Chen et al. 1999) • Scatter-Gather clustering (Cutting et al. 93, Hearst et al. 1996) • Becoming more prevalent in web search too. • Teoma • Vivisimo • Northern Light
Web Search Results Grouping • Drill down one category • Cannot mix and match categories • Not clear if it is useful or not • Can help differentiate different meanings of the same word. • But …what about site search?
If Web search engines are providing source selection … … what happens when the user gets to the site? Follow Links … or … Search
Following Hyperlinks • Works great when it is clear where to go next • Frustrating when the desired directions are undetectable or unavailable • Site search is not getting good reviews
An Analogy • text search • hypertext
Analogy • Hypertext: • A fixed number of choices of where to go next; • A glance at the map tells you where you are; • But may not go where you want to go. • To get from Topeka to Santa Fe, may have to go through Frostbite Falls • Site Search: • Can go anywhere; • But may get stuck, disoriented, in a crevasse!
Goal: An All-Terrain Vehicle • The best of both techniques • A vehicle that magically lays down track to suggest choices of where you want to go next based on what you’ve done so far and what you are trying to do • The tracks follow the lay of the land and go everywhere, but cross over the crevasses • The tracks allow you to back up easily
What works, what doesn’t • There is negative evidence for • Clustering • Fancy visualizations • There is positive evidence for • Grouping into meaningful, consistent categories • Relevance feedback • Depends how you do it • Showing similar items
Study of Kohonen Feature Maps • H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) • Comparison: Kohonen Map and Yahoo • Task: • “Window shop” for interesting home page • Repeat with other interface • Results: • Starting with map could repeat in Yahoo (8/11) • Starting with Yahoo unable to repeat in map (2/14) UWMS Data Mining Workshop
Study (cont.) • Participants liked: • Correspondence of region size to # documents • Overview (but also wanted zoom) • Ease of jumping from one topic to another • Multiple routes to topics • Use of category and subcategory labels
Study (cont.) • Participants wanted: • hierarchical organization • other ordering of concepts (alphabetical) • integration of browsing and search • correspondence of color to meaning • more meaningful labels • labels at same level of abstraction • fit more labels in the given space • combined keyword and category search • multiple category assignment (sports+entertain)
Visualization of Clusters • Huge 2D maps may be inappropriate focus for information retrieval • Can’t see what documents are about • Documents forced into one position in semantic space • Space is difficult to use for IR purposes • Hard to view titles • Perhaps more suited for pattern discovery • problem: often only one view on the space
Summary: Clustering (Based on other studies as well) • Advantages: • Get an overview of main themes • Domain independent • Disadvantages: • Many of the ways documents could group together are not shown • Not always easy to understand what they mean • Different levels of granularity • Probably best for scientists only • Take heart – there is good evidence for organizing via categories!
The DynaCat System • Decide on important question types in advance • What are the adverse effects of drug D? • What is the prognosis for treatment T? • Make use of MeSH categories • Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M., and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99: Proceedings of the Sixteenth National Conference on Artificial Intelligence, Orlando, Florida, 1999.
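The idea of keeping only the category types that suit a query type can be illustrated with a rough Python sketch. It is not DynaCat's actual implementation: the query-type table, the category-type labels, the sample documents, and the `organize` function are all hypothetical, and the terms are placeholders rather than entries checked against MeSH.

```python
from collections import defaultdict

# Hypothetical table: query type -> kinds of category worth showing for it.
USEFUL_CATEGORY_TYPES = {
    "adverse_effects": {"symptom", "disease"},
    "prognosis": {"outcome"},
}

# Hypothetical retrieved documents, each tagged with (term, category type) pairs.
DOCS = [
    {"title": "Doc A", "terms": [("Nausea", "symptom"), ("Chemotherapy", "treatment")]},
    {"title": "Doc B", "terms": [("Nausea", "symptom"), ("Survival rate", "outcome")]},
    {"title": "Doc C", "terms": [("Fatigue", "symptom")]},
]


def organize(docs, query_type):
    """Group retrieved documents under the terms whose category type
    is considered useful for this query type; drop all other terms."""
    keep = USEFUL_CATEGORY_TYPES[query_type]
    groups = defaultdict(list)
    for doc in docs:
        for term, kind in doc["terms"]:
            if kind in keep:
                groups[term].append(doc["title"])
    return dict(groups)


adverse = organize(DOCS, "adverse_effects")
prognosis = organize(DOCS, "prognosis")
```

A document can appear under several categories, and the same retrieved set is organized differently depending on the question being asked.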
DynaCat Study • Design • Three queries • 24 cancer patients • Compared three interfaces • ranked list, clusters, categories • Results • Participants strongly preferred categories • Participants found more answers using categories • Participants took same amount of time with all three interfaces
Cha-Cha (intranet search) Cha-Cha: A System for Organizing Intranet Search Results, by Chen, Hearst, Hong, and Lin, Proceedings of 2nd USENIX Symposium on Internet Systems, Boulder, CO, Oct 1999. cha-cha.berkeley.edu
The Standard Model • Assumptions: • Maximizing precision and recall simultaneously • The information need remains static • The value is in the resulting document set
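For reference, precision and recall under the standard model are computed over a single, fixed result set. The sketch below uses the usual definitions; the function name `precision_recall` and the document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved documents that are relevant.
    Recall: fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant overall, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

Both measures assume that the set of relevant documents is fixed before the search begins, which is the static information need listed among the assumptions above.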
“Berry-Picking” as an Information Seeking Strategy (Bates 90) • Berry-picking model • Interesting information is scattered like berries among bushes • The user learns as they progress, thus • The query is continually shifting