LIS 385T: Information Architecture and Design Search Results By Roger C. Wei 10/22/2002
History of e-Info Searching • 1960s Information searched electronically on computers. • 1971 End-user search system was introduced. • 1985 First commercial CD-ROM appeared. • 1991 Emergence of the World Wide Web. • 1992 Gopher was fully functional. • 1994 Emergence of the first full-text Web search engine (WebCrawler).
What Are Search Results • Search results are the presentation of content that matches the user’s search query (Rosenfeld & Morville, 2002). • They consist of individual hits.
Search Tools on the Web • About 85% of web users surveyed use search engines to find specific information. • Results of a web search depend on the selected search engine.
Searching and Ranking Process (1) match the search words (2) process the search syntax (3) construct a set of documents (4) assign a weight (5) use the assigned weights to rank (6) display the search results
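The six steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular engine works: the function name `search`, the sample documents, and the use of a raw term count as the weight are all assumptions made for the example.

```python
from collections import Counter

def search(query, documents):
    """Return (doc_id, weight) pairs ranked by a simple term-count weight."""
    # (1)-(2) Split the query into words; real engines also parse operators.
    terms = query.lower().split()
    ranked = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        matched = [t for t in terms if counts[t] > 0]
        # (3) Only documents matching at least one term join the result set.
        if not matched:
            continue
        # (4) Assumed weight: how many times the matched terms occur.
        weight = sum(counts[t] for t in matched)
        ranked.append((doc_id, weight))
    # (5) Rank by weight, highest first; ties are broken by document id.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked

# Hypothetical three-document collection.
docs = {
    "a": "web search engines rank web pages",
    "b": "library catalog search",
    "c": "cooking recipes",
}

# (6) Display the ranked results.
for doc_id, weight in search("web search", docs):
    print(doc_id, weight)
```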
Relevance • An abstract measure of how well a document satisfies the user’s information need. • A subjective notion that is difficult to quantify.
Factors of Ranking (1/2) • Popularity of the page • Frequency of terms • Number of query terms matched • Rarity of terms • Weighting by field • Proximity of terms • Word variants
Factors of Ranking (2/2) • Weighting according to the order where the searcher entered terms • Case-sensitivity • Analysis of documents in the database • Relevance feedback applied to retrieved records • Date
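Three of the factors listed above (frequency of terms, rarity of terms, and weighting by field) can be combined into one score, as in the sketch below. The formula, the `title_boost` value, and the sample corpus are assumptions chosen for illustration; actual engines keep their weighting schemes proprietary.

```python
import math

def score(query, doc, corpus, title_boost=2.0):
    """Score one document; documents are dicts with 'title' and 'body' keys."""
    total = 0.0
    n_docs = len(corpus)
    for term in query.lower().split():
        # Rarity of terms: a term found in fewer documents counts for more.
        containing = sum(
            1 for d in corpus
            if term in (d["title"] + " " + d["body"]).lower().split()
        )
        if containing == 0:
            continue
        rarity = math.log(1 + n_docs / containing)
        # Frequency of terms, weighted by field: title hits count extra.
        in_title = doc["title"].lower().split().count(term)
        in_body = doc["body"].lower().split().count(term)
        total += rarity * (title_boost * in_title + in_body)
    return total

# Hypothetical corpus of three documents.
corpus = [
    {"title": "search engines", "body": "how engines rank pages"},
    {"title": "cooking", "body": "search for recipes"},
    {"title": "gardening", "body": "plants and soil"},
]

for doc in corpus:
    print(doc["title"], round(score("search", doc, corpus), 3))
```

With these assumed weights, a query term in the title counts twice as much as the same term in the body, so the first document outranks the second.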
What Re-ranking Search Results Are [Figure: metasearch engine components: User, User Interface, Database Selector, Document Selector, Query Dispatcher, Result Merger, and three Search Engines]
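One way a result merger might re-rank hits returned by several search engines is a Borda-style count, sketched below. This technique is swapped in for illustration; the slide does not say which merging method is used, and the function name and sample lists are hypothetical.

```python
def merge_results(result_lists):
    """Merge ranked lists of URLs with a simple Borda-style count."""
    scores = {}
    for results in result_lists:
        for position, url in enumerate(results):
            # A hit ranked higher in its own list earns more points.
            scores[url] = scores.get(url, 0) + (len(results) - position)
    # Highest total first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical ranked lists from three engines.
engine_results = [
    ["a", "b", "c"],
    ["b", "a"],
    ["c", "b"],
]
print(merge_results(engine_results))
```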
Measure of IR Tool Effectiveness • Recall = (number of retrieved relevant documents) / (number of relevant documents) • Precision = (number of retrieved relevant documents) / (number of retrieved documents)
Recall vs. Precision [Figure: plot with axes labeled Recall and Precision]
Search Evaluation • Recall and precision must be examined as a pair. • Two issues: • Relevance is not easy to define. • It is difficult to know the number of relevant documents that have not been retrieved.
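The two ratios can be computed directly from sets of document identifiers. The sketch below assumes the full set of relevant documents is already known, which is rarely true in practice; the function name and document ids are made up for the example.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    # Retrieved relevant documents are those present in both sets.
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical search: 4 documents retrieved, 5 relevant overall, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]
print(recall_precision(retrieved, relevant))  # (0.4, 0.5)
```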
Survey of Existing Engines (1/2) • Major Search Engines: Google, Teoma, AllTheWeb.com (FAST), Yahoo, MSN Search, Lycos, Ask Jeeves, AOL Search, WiseNut, Inktomi, LookSmart, Open Directory, Overture, AltaVista, HotBot, Netscape Search. • News Search Engines: AltaVista News • Specialty Search Engines: Ask Jeeves • Kids Search Engines: AOL Kids Only
Survey of Existing Engines (2/2) • Metacrawlers: Dogpile, Metacrawler, Cnet Search, Search.com, ProFusion, Mamma, Ixquick. • Multimedia Search Engines: MP3.com. • Regional Search Engines: Mosaique. • Paid Listings Search Engines: Overture. • Search Utilities: Copernic.
Search Results & Info Architecture • IA organizes information to meet users’ information needs. • IA provides further considerations for how engines display results. • IA can indirectly increase users’ satisfaction of their information needs while IR has yet to make a major breakthrough.
Interface Design at the Search results page: Farkas (2002) • Display a number indicating the total number of results. • Provide a query form on the results page. • Offer search tips when there are no results. • Each hit must provide sufficient information.
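These guidelines can be illustrated with a small text-only results page. The sketch below is an assumption about one possible layout, not Farkas's design; the function name, wording of the tips, and sample hits are hypothetical.

```python
def render_results(query, hits, page_size=10):
    """Return the text of a results page for a list of (title, summary) hits."""
    # Repeat the query so the user can revise it on the results page.
    lines = [f"Search: [{query}]"]
    if not hits:
        # Offer search tips when nothing matched.
        lines.append("No results found. Tips: check spelling or try fewer words.")
        return "\n".join(lines)
    # Show the total number of results.
    lines.append(f"{len(hits)} results found")
    # Each hit carries a title and a summary so it can be judged at a glance.
    for title, summary in hits[:page_size]:
        lines.append(f"- {title}: {summary}")
    return "\n".join(lines)

sample_hits = [
    ("IA Glossary", "Definitions of common IA terms."),
    ("Search Engine Watch", "News and reviews of search engines."),
]
print(render_results("information architecture", sample_hits))
```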
Interface Design at the Search results page: Van Duyne (2002) • Provide relevant summaries • Offer clear organization • Provide good hyperlinked titles for each hit • Use log files to tailor results for the most common search terms • Compensate for common misspellings • Provide support for common search tasks
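Compensating for common misspellings can be approximated with Python's standard `difflib` module, as sketched below. The vocabulary, the `cutoff` value, and the function name `suggest` are assumptions for the example; a production site would more likely draw its vocabulary from its own index and query logs.

```python
import difflib

def suggest(term, vocabulary, cutoff=0.8):
    """Return the closest known term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical list of terms known to the site's index.
vocabulary = ["search", "engine", "results"]

print(suggest("serch", vocabulary))  # a likely "Did you mean" candidate
print(suggest("zzz", vocabulary))    # nothing close enough
```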
Interface Design at the Search results page: Rosenfeld (2002) -variables to consider: • The level of searching expertise users have. • The type of results users want. • The type of information being searched. • The amount of information being searched.
Search Results Interface • Google vs. Hotbot.
References • Baeza-Yates, R. & Ribeiro-Neto, B. (1999) Modern Information Retrieval. Addison-Wesley, Reading, MA, USA. • Chowdhury, A. & Soboroff, I. (2002) Automatic Evaluation of World Wide Web Search Services, Proceedings of the twenty-fifth annual international conference on research and development in information retrieval, August 11-15, 2002, 421-422, Tampere, Finland. • Chowdhury G. G. & Chowdhury S. (2001) information sources and searching on the world wide web. Library Association Publishing. London, UK. • Farkas, D. K., & Farkas J. B. (2002) Principles of Web Design. Pearson Education, Inc. • Glossbrenner, A. & Glossbrenner, E. (1999) Search engines for the world wide web, 2nd edn, Peachpit Press. • Google: How to Interpret your Search Results (2002). Retrieved October 18, 2002, from: http://www.google.com/help/interpret.html. • Hagedorn, K. (2000) Information Architecture Glossary. Retrieved October 18, 2002, from: http://argus-acia.com/ • Henninger, M. (1999) Don’t Just Surf: effective research strategies for the Net, 2nd Edition, University of New South Wales Press, Australia. • Hock, R. E. (2001) The extreme searcher’s guide to web search engines: a handbook for the serious searcher, 2nd Edition. CyberAge Books, Information Today, Inc. New Jersey, USA.
• Jansen, B. J. (2000) An investigation into the use of simple queries on Web IR systems. Information Research: An Electronic Journal. 6(1). • Kobayashi, M. & Takeda, K. (2000) Information Retrieval on the Web. ACM Computing Surveys, 32(2), June 2000, 144-174. • Meng, W., Yu, C. & Liu K. (2002) Building Efficient and Effective Metasearch Engines. ACM Computing Surveys, 34(1), March 2002, 48–89. • Morville, P., Rosenfeld, L. B. & Janes, J. (1999) The Internet searcher’s handbook : locating information, people, and software. 2nd Edition. Neal-Schuman Publishers, Inc. NY, USA. • Poulter, A., Tseng, G. & Sargent, G. (1999) The Library and Information Professional’s Guide to the World Wide Web, Library Association Publishing. • Rosenfeld, L. & Morville P. (2002). Information Architecture for the World Wide Web, 2nd Edition. O’Reilly, Sebastopol, CA, USA. • Search Terms Glossary (1998). Retrieved October 20, 2002, from: http://www.searchtools.com/info/glossary.html. • Sullivan, D. (2002) Search Engine Watch. Retrieved October 10, 2002, from: http://searchenginewatch.com/. • Van Duyne, D. K., Landay, J. A. & Hong, J. I. (2002) The Design of Sites: patterns, principles, and processes for crafting a customer-centered Web experience. Addison-Wesley. • Weiss, S. (1997) Glossary for Information Retrieval. Retrieved October 10, 2002, from: http://www.cs.jhu.edu/%7Eweiss/glossary.html. • Yang, R. (1998) Relationship between Precision and Recall ratios and other evaluation methods. Retrieved July 10, 1999, from: http://www.staff.uiuc.edu/~pare/rong.html.