
Information Search (Shneiderman and Plaisant, Ch. 13)



Presentation Transcript


  1. Information Search (Shneiderman and Plaisant, Ch. 13) from http://wps.aw.com/aw_shneider_dtui_13

  2. Overview • Introduction • “Information search should be a joyous experience” • Searching in Textual Documents • Multimedia Document Searches • Advanced Filtering and Search Interfaces • Information Foraging • A forest and some trees …

  3. Information Search • Critical need to access information, as part of any task • always has been, always will be (ahbawb) • Cultural change, if not evolution, due to amount of information accessible by individual • “Information overload” – ahbawb • What’s new is ubiquity due to massive e-access • Old school “information retrieval” and “end user searching” • Gurus and cost • Genuinely new … • Interest, due to market/user size • E.g., search engines can be profitable • tools, e.g., visualization, due to Moore’s law

  4. Information Search - Words • Old school • Information retrieval, database management • Bibliographic document systems, structured relational db – attributes • New school • Information gathering, seeking, filtering, sensemaking, visual analytics • CS focus • Data mining, data warehouses, data marts • Toward future ends such as • Knowledge networks, semantic webs, … • Range of search elements increases • Cf. Hearst, November 2011 CACM paper, “collaborative search” (on web site)

  5. Search Terminology • Shneiderman’s taxonomy • Task objects • E.g., movies for rent, are stored in structured relational databases, textual document libraries, or multimedia document libraries • Structured relational database • relations and a schema to describe the relations • Relations have items (usually called tuples or records), and each item has multiple attributes (often called fields), which each have attribute values • Textual document library • Set of collections • typically up to a few hundred collections per library • descriptive attributes or metadata about the library • E.g., name, location, owner

  6. Search Terminology, 2 • Task actions are decomposed into browsing or searching • Examples of task actions in information search: • Specific fact finding (known-item search) • Find the e-mail address of the President of the United States • Extended fact finding • What other books are by the author of “Jurassic Park”? • Exploration of availability • Is there new work on voice recognition in the ACM digital library? • Open-ended browsing and problem analysis • Is there new research on fibromyalgia that might help my patient?

  7. Search Terminology, 3 • Once users have clarified their information needs, the first step towards satisfying those needs is deciding where to search • Supplemental finding aids can help users to clarify and pursue their information needs, e.g. table of contents or indexes • Additional preview and overview surrogates for items and collections can be created to facilitate browsing

  8. Searching Textual Documents • As noted, recent dramatic changes • Historically, Boolean clause search and SQL • Other methods include: • Natural language queries • Form fill-in • Query by example (QBE) • Evidence shows that users perform better and have higher satisfaction when they can view and control the search
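A minimal sketch of the Boolean clause search mentioned in slide 8, assuming a toy in-memory inverted index. The names `build_index`, `boolean_and`, `boolean_or`, and the sample documents are hypothetical and only illustrate the idea; they are not taken from the slides or the textbook.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Ids of documents that contain every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(index, *terms):
    """Ids of documents that contain at least one term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.union(*sets) if sets else set()


# Hypothetical three-document collection.
DOCS = {
    1: "user interface design for search",
    2: "voice recognition research",
    3: "search interface for voice queries",
}
INDEX = build_index(DOCS)

print(sorted(boolean_and(INDEX, "search", "interface")))  # [1, 3]
print(sorted(boolean_or(INDEX, "voice", "design")))       # [1, 2, 3]
```

Note that the strict AND narrows the result set while OR widens it, which is the opposite of how "and" is often used in informal English.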

  9. Ex., Library of Congress • Aids to find bills, etc • “Multiple paths to information items” • (had a look, just for fun) • Not bad

  10. Ex., Library of Congress Aids to find bills, etc

  11. Ex., Library of Congress Aids to find bills, etc

  12. Ex., Library of Congress Aids to find bills, etc

  13. Searching in Textual Documents and Database Querying

  14. Searching in Textual Documents and Database Querying, 2 • A search for “user interface” powered by Endeca (http://www.lib.ncsu.edu) returns 144 results grouped into 10 pages. The menu at the upper right allows users to sort results by relevance or by date, while on the left a summary of the results organized by Subject, Genre, or Format provides an overview of the results and facilitates further refinement of the search.

  15. Framework for Textual Search • Recall, task delineation for interface design • Shneiderman suggests stages to consider in textual search • Overview below, detail, next slide: • Formulation: expressing the search • Initiation of action: launching the search • Review of results: reading messages and outcomes • Refinement: formulating the next step • Use: compiling or disseminating insight

  16. 5 Stages of Textual Search - Detail • Yet another “taxonomy and guidelines” from Shneiderman …

  17. Multimedia Document Searches • “Multimedia” (non-textual) search is hard • Quickly evolving area • Interface issues essentially undefined • “Hum that tune”, “what did he/she/it look like” • Types: • Image search • Map search • Design or diagram search • Sound search • Video search • Animation search

  18. Image Search • Finding photos with images such as the Statue of Liberty is a challenge • Query-by-Image-Content (QBIC) is difficult • Search by profile (shape of lady), distinctive features (torch), colors (green copper) • Simple drawing tools to build templates or profiles to search with • More success is attainable by searching restricted collections • Search a vase collection • Find a vase with a long neck by drawing a profile of it • Critical searches such as fingerprint matching require a minimum of 20 distinct features • For small collections, effective browsing and lightweight annotation are important

  19. Map Search • On-line maps are plentiful • Search by latitude/longitude is the structured-database solution • Today's maps allow use of structured aspects and multiple layers • City, state, and site searches • Flight information searches • Weather information searches • Mapquest, Google Maps, etc. • Mobile devices can allow “here” as a point of reference

  20. Other Multimedia Searches • Design/Diagram Searches • Some computer-assisted design packages support search of designs • Allows searches of diagrams, blueprints, newspapers, etc., e.g. search for a red circle in a blue square or a piston in an engine • Document-structure recognition for searching newspapers • Sound Search • Video Search • Provide an overview • Segmentation into scenes and frames • Support multiple search methods • Animation Search • Possible to search for specific animations like a spinning globe • Search for moving text on a black background

  21. Image Search • Sketch or image to start • Also, see Google

  22. Advanced Filtering & Search Interfaces • Wide range of interface strategies and styles • Filtering with complex Boolean queries • Automatic filtering • Dynamic queries • Faceted metadata search • Query by example • Implicit search • Collaborative filtering • Multilingual searches • Visual field specification

  23. Advanced Filtering and Search Interface Examples, 1 • Alternatives to form fill-in query interfaces: • Filtering with complex Boolean queries • Problem with informal English, e.g. use of ‘and’ and ‘or’ • Venn diagrams, decision tables, etc., have not worked for complex queries • Dynamic Queries • “Direct manipulation” queries • Use sliders and other related controls to adjust the query • Get immediate (less than 100 msec) feedback with data • Dynamic HomeFinder and Blue Nile and (sort of) Realtor.com • Hard to update fast with large databases
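As a rough illustration of a dynamic query, the sketch below re-runs a filter each time a (simulated) slider value changes. The `Home` record, the `dynamic_query` function, and the sample listings are hypothetical, loosely modeled on the HomeFinder idea; a real interface would also have to redraw results within the roughly 100 msec budget the slide mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    address: str
    price: int      # asking price in dollars
    bedrooms: int


def dynamic_query(homes, max_price, min_bedrooms):
    """Return the homes that match the current slider positions."""
    return [h for h in homes if h.price <= max_price and h.bedrooms >= min_bedrooms]


# Hypothetical listings used only for this example.
HOMES = [
    Home("12 Oak St", 250_000, 3),
    Home("4 Elm Ave", 410_000, 4),
    Home("9 Pine Rd", 180_000, 2),
]

# Each slider movement would re-run the filter and update the display.
for max_price in (200_000, 300_000, 450_000):
    hits = dynamic_query(HOMES, max_price=max_price, min_bedrooms=2)
    print(max_price, [h.address for h in hits])
```

A linear scan like this is fine for a few thousand records; the slide's caveat about large databases is exactly where it stops being fast enough.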

  24. Dynamic Queries • Diamond price, rating indicated using sliders, etc.

  25. Faceted Metadata • Facets include media, location, date, themes
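A small sketch of how faceted metadata can drive both the overview (counts per facet value) and the refinement step. The item records and the helper names `facet_counts` and `refine` are assumptions made for illustration; only the facet names media, location, and themes come from the slide.

```python
from collections import Counter


def facet_counts(items, facet):
    """Count how many items carry each value of the given facet."""
    return Counter(item[facet] for item in items if facet in item)


def refine(items, **selected):
    """Keep only items whose facets match every selected value."""
    return [it for it in items if all(it.get(k) == v for k, v in selected.items())]


# Hypothetical collection with three of the facets named on the slide.
ITEMS = [
    {"title": "Harbor at dawn", "media": "photo", "location": "Boston", "theme": "nature"},
    {"title": "City plan", "media": "map", "location": "Boston", "theme": "architecture"},
    {"title": "Forest trail", "media": "photo", "location": "Oslo", "theme": "nature"},
]

print(facet_counts(ITEMS, "media"))
print([it["title"] for it in refine(ITEMS, media="photo", location="Oslo")])
```

Showing the counts next to each facet value lets users see how large a refinement would be before they select it.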

  26. Advanced Filtering and Search Interface Examples, 2 • Collaborative Filtering • Groups of users combine evaluations to help in finding items in a large database • User "votes" and info used for rating the item of interest • E.g., a user who rates six restaurants highly is given a list of restaurants also rated highly by those who agree the six are good • Multilingual searches • Current systems provide rudimentary translation searches • Prototypes of systems with specific dictionaries and more sophisticated translation • Visual searches • Specialized visual representations of possible values, e.g. dates on a calendar or seats on a plane • On a map the location may be more important than the name • Implicit initiation and immediate feedback
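The restaurant example can be sketched as a very simple overlap-based recommender. This is one possible reading of collaborative filtering, not a method described in the slides: the `recommend` function, the scoring rule (sum of shared likes), and the sample data are all assumptions.

```python
from collections import Counter


def recommend(likes, user, top_n=3):
    """Suggest items liked by users who share at least one liked item with `user`.

    Each candidate item is scored by the total overlap between `user`
    and the users who liked it; ties are broken alphabetically.
    """
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared taste, so this user's votes are ignored
        for item in theirs - mine:
            scores[item] += overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical "rated highly" sets per user.
LIKES = {
    "ann": {"Thai Palace", "Luigi's", "Sushi Go"},
    "bob": {"Thai Palace", "Luigi's", "Taco Loco"},
    "cat": {"Sushi Go", "Taco Loco", "Le Bistro"},
    "dan": {"Burger Barn"},
}

print(recommend(LIKES, "ann"))  # ['Taco Loco', 'Le Bistro']
```

Production systems typically use weighted ratings and similarity measures rather than raw set overlap, but the principle of borrowing votes from like-minded users is the same.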

  27. Tree Map of Products (Shneiderman) • Using The Hive Group’s treemap (http://www.hivegroup.com/), users can review all waterproof binoculars in the catalog of Amazon.com products and browse the items in the list, grouped by manufacturer. Each box corresponds to a pair of binoculars, and the size of the box is proportional to its price. Green boxes are best-sellers. Users can filter the results using the dynamic query sliders on the right. Here all the binoculars with fewer than three user reviews have been filtered out, leaving only 61 binoculars to consider.

  28. Cost of Knowledge, Search, Cognition, and Computers • Information systems (computers) and “cost” of acquiring knowledge • A first principle of information system design • “Cognitive information ergonomics” • Efficiency/productivity gain/usability/… • “Economics of cognition and the cognitive cost of knowledge” • There is (and has always been) a cost to acquire information / knowledge • cost = user/worker time +, e.g., machine cost, db access charge, book • Many studies fail to document increased profit directly from implementation of (single) information system • However, no doubt that worker productivity in late 20th century dramatically increased • Productivity greatly enhanced by pervasive use of electronic information systems (computers)

  29. Informavores and Information Foraging • That human quest for information is innate and adaptive is well known • Humans are informavores • George Miller, 1983, “… magic number 7 ± 2” • Organisms that hunger for information about the world and themselves • “A wealth of information creates a poverty of attention and a need to allocate it efficiently” • Herb Simon, AI, Nobel prize, economics, cognition • Consider analogy of acquiring knowledge with animals seeking food • Pirolli, P. and S. Card (1995). Information Foraging in Information Access Environments, in CHI '95, p. 518 • Pirolli, P. (2004) in Carroll (ed.), on web site • Pirolli, P. (2007) ….. Book ….. • Countless secondary sources

  30. Information Foraging Theory (IFT) • Information Foraging Theory (IFT) • Pirolli and Card – Xerox PARC • “an approach to the analysis of human activities involving information access technologies” • Derives from optimal foraging theory in biology and anthropology • Analyzes adaptive value of food-foraging strategies • Analyzes trade-offs in value of information gained against the costs of performing activity in human-computer interaction tasks • And need models and analysis techniques to determine value added by information access, manipulation, and presentation techniques • Real information system design problem is not how to collect more information, but how to optimize user’s time • Increase relevant information gained per unit time expended • IFT provides a relatively “formal” (quantitative) account

  31. IFT – Time Scales • Considers “adaptiveness of human-system designs in the context of the information ecologies in which tasks are performed” • Ecology, as system, here, information • Time scales of information seeking and sense making activities: • Cognitive band (~100 ms – 10 s) • Rational band (minutes to hours) • Social band (days to months) • Have seen much of cognitive, now others

  32. Time Scales of Analysis • Figure columns: Time scale (s), Psychological domain, User Interface Domain • 10-1000 s: Problem solving, Decision making • 1-100 s: Visual search, Motor behavior • 0.100-1 s: Visual attention, Perceptual judgment • User interface example in figure: a web search result snippet for Pete Pirolli's home page

  33. IFT – An Ecological Perspective • Time scales of information seeking and sense making activities • Cognitive band (~100 ms – 10 s) • Rational band (minutes to hours) • Social band (days to months) • As time scale increases, less regard for how internal processing accomplishes linking of actions to goals • Assumes behavior governed by “rational principles and shaped by constraints and affordances of the task environment” • An ecological perspective, i.e., that behavior is “adaptive” in that it accomplishes some goal

  34. IFT – Metaphor and Quantitative • Information Foraging Theory • name both a metaphor and straightforward use of biological “optimal foraging theory” • Metaphor: • Animals adapt behavior and structure through evolution • (humans don’t have to wait that long!) • Animals adapt to increase their rate of energy intake, etc. • To do this they evolve different methods • E.g., wolf hunts prey, spiders build webs and wait • And there are analogies to this • E.g., hunting = active information seeking, waiting = information filtering • Humans (and others) hunt in groups - when variance of food is high • Accept lower expected mean to minimize probability of days without food • Also, on social time scale, sharing of information

  35. Optimal Foraging Theory - Biology • Developed in biology for understanding opportunities and forces of adaptation • P&C use elements of the theory to help in understanding existing human adaptations for gaining and making sense of information • Also, aid in task analysis for creating new interactive information system designs • Optimality models include: • Decision assumptions • Which of the problems faced by an agent are to be analyzed • E.g., whether to pursue a particular type of information (or prey) when encountered, how long to spend • Currency assumptions • How choices are to be evaluated, e.g., information value (food value) • Constraint assumptions • Limit and define relationships among decision and currency variables • E.g., from task structure, interface technology, user knowledge

  36. Information Foraging Theory • Information foraging usually a task embedded in context of some other task • Value and cost structure defined in relation to the embedding task • Value of external information may be in improvements to outcomes of embedding task • Usually, embedding task is some ill-structured problem • Additional knowledge is needed to better define goals, available actions, heuristics, etc. • E.g., choosing a graduate school, developing business strategy • Though use optimality model, not imply human behavior is classically rational • I.e., have perfect information and infinite computational resources • Rather, humans exhibit bounded rationality, or make choices based on satisficing

  37. IFT – Information Patch Model • A formal (mathematical) model – actually, pretty straightforward • Information patch model – from optimal foraging theory • Rate of currency intake, R = U / (Ts + Th) • U = net amount of currency (value, e.g., food, information) gained • Ts = time spent searching • Th = time spent exploiting • Net currency gain, U = Uf - Cf • Uf = overall currency intake (gross amount foraged) • Cf = currency expended in foraging • Average rate of currency intake u = Uf / (λTs) • If assume information workers/foragers/consumers encounter information as linear function of time (will revisit this) • Total n items encountered = λTs, where λ is rate of encounter with items • (will use next slide)

  38. IFT – Information Patch Model quickly … • Average cost of handling items (1st total/rate, the average) : • Let s = search cost per unit time, then total cost of search = sTs • Then, substituting in equation for R, rate of currency intake: • So, can express R in terms of • Average rate of currency intake, u • Search cost per unit time, s • Cost of handling items, h
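The equations for this slide did not survive in the transcript. The following is only a sketch of how the substitution might go, under two stated assumptions that are not in the slides: handling is accounted for purely as time, with h the average handling time per item (so Th = λ Ts h, where λ denotes the rate of encounter with items), and the only currency expended is the search cost (so Cf = s Ts). Under those assumptions the result takes the familiar form of Holling's disc equation with a search cost; the original slide may have used a different treatment of h.

```latex
\begin{align*}
U_f &= \lambda T_s\, u, \qquad T_h = \lambda T_s\, h, \qquad C_f = s\, T_s \\[4pt]
R &= \frac{U}{T_s + T_h}
   = \frac{U_f - C_f}{T_s + T_h}
   = \frac{\lambda T_s\, u - s\, T_s}{T_s + \lambda T_s\, h}
   = \frac{\lambda u - s}{1 + \lambda h}
\end{align*}
```

The search time Ts cancels, so R depends only on u, s, h, and the encounter rate λ, matching the three quantities the slide lists.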

  39. IFT – Information Patch Model And so forth …

  40. An Example: Scatter/Gather • Hierarchical clustering of documents • Users see “overview” of document clusters • Allows user to navigate through clusters and overviews

  41. Scatter/Gather Task Display • Figure panels: Titles Window, Scatter/Gather Window • Cluster labels shown: Law, Nat. Lang., World News, Robots, AI, Expert Sys, CS, Planning, Medicine, Bayes. Nets

  42. Optimal Foraging Time in a Patch • Cumulative gain functions, consider document relevancy • Figure axes: information gained vs. time • gi(t), cumulative gain function • Amt of information gained in time t • gA(t) = random order of encounter • Increase in information equal for all elements • Hence, constant slope • gB(t) and gC(t) = ordered by relevancy • “Relevant” items, those with higher information content, encountered earlier • Hence, highest rate of information increase earlier, and rate decreases • λp, rate of encounter with relevant items • x-axis, travel time between patches • RB and RC = rate of return • tC and tB, optimal foraging time • Foraging longer in the “patch” not optimal
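To make the optimal-time idea concrete, here is a small numerical sketch. It assumes a hypothetical diminishing-returns gain function g(t) = G(1 - exp(-t/tau)), standing in for a relevance-ordered curve like gB(t), and finds by grid search the time in the patch that maximizes the overall rate g(t) / (travel_time + t). The function names, the exponential form, and the parameter values are all assumptions for illustration; the underlying principle is the marginal value theorem from optimal foraging theory.

```python
import math


def gain(t, total=10.0, tau=5.0):
    """Hypothetical cumulative gain with diminishing returns (relevant items first)."""
    return total * (1.0 - math.exp(-t / tau))


def rate_of_return(t, travel_time):
    """Information gained per unit time, counting travel time between patches."""
    return gain(t) / (travel_time + t)


def optimal_patch_time(travel_time, t_max=60.0, step=0.01):
    """Grid-search the time in the patch that maximizes the rate of return."""
    best_t, best_rate = step, rate_of_return(step, travel_time)
    for i in range(1, int(t_max / step) + 1):
        t = i * step
        r = rate_of_return(t, travel_time)
        if r > best_rate:
            best_t, best_rate = t, r
    return best_t, best_rate


for travel in (1.0, 10.0):
    t_opt, r_opt = optimal_patch_time(travel)
    print(f"travel time {travel:>4}: leave patch after ~{t_opt:.2f}, rate {r_opt:.3f}")
```

With this gain curve, longer travel between patches pushes the optimal stay later, and staying past the optimum lowers the overall rate, which is the point of the slide's last bullet.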

  43. IFT - Cost of Knowledge • Foraging Efficiency • Animals minimize energy expenditure to get required gain in sustenance • Humans minimize effort to get necessary gain in information • Again, foraging for food has much in common with seeking information • Like edible plants in wild, useful information items often grouped together, but separated by long distances in an “information wasteland” • Also, information “scent” • Like scent of food, information in current environment that will assist in finding more information clusters • Activities analyzed according to value gained and the cost incurred • Resource costs • Expenditures of time and cognitive effort incurred • Opportunity costs • Benefits that could be gained in engaging in other activities • “Cost of lost opportunity” • E.g., if not gaining information about algorithms (or messing with registration system), could be gaining information about software design

  44. IFT - Conclusion • Information processing systems evolve so as to maximize the gain of valuable information per unit cost • Sensory systems (vision, hearing) • Information access (card catalogs, offices) • maximize (information value / cost of interaction)

  45. End .
