Neverending Search: What you (and your students) really need to know about online searching and search tools!
Overview • MAR*TEC video
What searching looked like back in the day: http://fury.com/images/weblog/google_circa_1960.jpg
Search, don’t surf! It’s a trillion-page Web!
It’s a fact: We already know how to use the Web! Just because you live on the Web doesn’t mean you can’t learn how to use it more effectively and more powerfully!
Effective searching • Understanding strategy/syntax • Brainstorming/Questioning/Planning • Choosing the right type of search tool • Evaluating results! • Staying up to date
Four tips: FSRE (for sure?) • Focus: What is your mission or question? • Strategize: Which search tools will you use? Which keywords and search terms will you use, and how will you express them? • Refine: How might you improve your search results? • Evaluate: Which results will you visit? Which sites or documents are worthy enough to use? Did you do good work?
Good searchers also: • Use peripheral vision--they mine their results for additional search terms • Consult several search tools • Make use of advanced search screens • Search the free Web and subscription databases • Use appropriate syntax (the language specific to the search tool they are using) • Use search strategies • Modify or refine their searches (Searching is recursive!)
What does a good researcher look like? • Springfield students . . .(3:07) • Do you feel competent? • What skills?
True or false? • Web search engines can locate every page on the Web. • Search engines are the only search tools on the Web. • Webmasters can fool a search engine into ranking a page more highly in its search results. (Eileen Stec, Rutgers, ALA Preconference 1/03)
Pre-process your search terms Bernie Dodge Step Zero: Seven Steps to Better Searching http://edweb.sdsu.edu/webquest/searching/stepzero.html
Research Question: How effective are drug abuse prevention programs for young people? • Recognize the importance of brainstorming and strategy • Connect with “ANDs”
When do you really need OR? OR is generally used for synonyms or related words.
NOT as a refinement technique for problem words • eagles NOT Philadelphia • “Martin Luther” NOT King
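AND, OR, and NOT behave like set operations on the pages a search tool has indexed. A minimal sketch, assuming a made-up collection of four pages (the page texts and the `pages_with` helper are hypothetical; real engines use far larger indexes and their own syntax):

```python
# Hypothetical mini "index": page id -> page text. Not real search results.
PAGES = {
    1: "philadelphia eagles football schedule",
    2: "bald eagles nesting habitat",
    3: "drug abuse prevention programs for teens",
    4: "drug abuse prevention programs for adolescents",
}

def pages_with(word):
    """Return the set of page ids whose text contains the word."""
    return {pid for pid, text in PAGES.items() if word in text.split()}

# AND narrows: both terms must appear on the page.
and_hits = pages_with("drug") & pages_with("teens")
# OR broadens: either synonym may appear.
or_hits = pages_with("teens") | pages_with("adolescents")
# NOT removes pages that contain a problem word.
not_hits = pages_with("eagles") - pages_with("philadelphia")

print(and_hits, or_hits, not_hits)
```

AND shrinks the result set, OR grows it, and NOT subtracts from it, which is why OR is best saved for synonyms and NOT for problem words.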
Rockwell Schrock’s Boolean Machine http://kathyschrock.net/rbs3k/boolean/
Let’s play Boolean Aerobics! • Stand up if you have brown hair AND brown eyes • Remain standing if you have brown hair AND brown eyes AND are wearing glasses • Remain standing if you have brown hair AND brown eyes AND are wearing glasses AND are wearing something blue
“Phrase searching” • One of your best searching tools! • Use only for legitimate phrases, names, titles • “vitamin A” • “John Quincy Adams” • Titles: “An Officer and a Gentleman” • Phrase searching is sometimes overused. Remember: not every group of words is a phrase • Sometimes “ANDing” or “NEARing” is a better strategy
Advanced Search Screens • Google • All the Web • AltaVista • HotBot
A question is not a query: How many buffalo remain in the United States?
How to structure a good query • Brainstorm several key words and phrases: the ones you think would appear, and the ones that wouldn’t appear, in your dream document • Anticipate synonyms or be on the lookout for them as you search • Put the most important words and phrases first • Consider phrases: which words are likely to appear next to each other in exact order in good results? Use quotation marks to search phrases, as well as names like “Martin Luther King”, terms like “vitamin A”, or titles like “Raisin in the Sun” • Focus on nouns (verbs are often vague, and stop words such as the articles a, an, and the are ignored by most engines)
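One way to picture the move from question to query is a small helper that drops stop words and quotes known phrases. This is only a sketch: the stop-word list and the `build_query` function are assumptions for illustration, not how any particular engine works.

```python
# Small illustrative stop-word list; each real engine uses its own.
STOP_WORDS = {"a", "an", "the", "how", "many", "in", "of", "is", "are", "what"}

def build_query(question, phrases=()):
    """Keep the non-stop words, then add each known phrase in quotation marks."""
    text = question.lower().rstrip("?")
    quoted = []
    for phrase in phrases:
        if phrase.lower() in text:
            quoted.append(f'"{phrase}"')
            # Remove the phrase so its words are not repeated as loose keywords.
            text = text.replace(phrase.lower(), " ")
    keywords = [word for word in text.split() if word not in STOP_WORDS]
    return " ".join(keywords + quoted)

query = build_query(
    "How many buffalo remain in the United States?",
    phrases=["United States"],
)
print(query)
```

The helper cannot tell a vague verb from a useful noun, so a searcher would still prune words like "remain" and add synonyms (bison, for example) by hand.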
Query tips (continued) • Consider alternate forms of words (truncate when you can): adolesc* for adolescent, adolescents, adolescence • Check your spelling. Bad spelling usually turns up bad hits • Follow “more like this” or “similar to” leads for your best results • Mine result lists for important words, names, and phrases you didn’t think of originally • Use lowercase letters unless you are searching for proper names. Turn off the CAPS LOCK! • Be creative!
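Truncation can be pictured as wildcard matching against the words an engine has indexed. A minimal sketch using Python's standard `fnmatch` module; the `VOCABULARY` list and the `expand` helper are made up for illustration:

```python
from fnmatch import fnmatchcase

# Hypothetical list of indexed words.
VOCABULARY = ["adolescent", "adolescents", "adolescence", "adult", "adopt"]

def expand(pattern, words):
    """Return the words that match a truncated pattern such as adolesc*."""
    return [word for word in words if fnmatchcase(word, pattern)]

matches = expand("adolesc*", VOCABULARY)
print(matches)
```

One truncated term stands in for three separate searches, while "adult" and "adopt" are left out because they do not share the full stem.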
Tricks for advanced searchers seeking a needle in a haystack • Word stemming: wom*n, lesson* NEAR plan* • Search within: Google, AlltheWeb (also use “find” to search within a page full of text!) • Field searching: search for keywords in titles, subject tags, file formats rather than just words anywhere in the text • Search Engine Features Chart http://searchenginewatch.com/facts/ataglance.html
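A proximity search such as lesson* NEAR plan* asks that the two terms appear close together, not merely on the same page. The sketch below assumes a window of 10 words; the window size and the `near` helper are assumptions, and each engine that supports NEAR defines its own distance.

```python
from fnmatch import fnmatchcase

def near(text, pattern_a, pattern_b, window=10):
    """True if a word matching pattern_a lies within `window` words of one matching pattern_b."""
    # Simple whitespace split; punctuation is not stripped in this sketch.
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if fnmatchcase(w, pattern_a)]
    positions_b = [i for i, w in enumerate(words) if fnmatchcase(w, pattern_b)]
    return any(abs(i - j) <= window for i in positions_a for j in positions_b)

close = near("free lesson plans for teachers", "lesson*", "plan*")
far = near(
    "this lesson covers weather " + "filler " * 20 + "then we plan a trip",
    "lesson*",
    "plan*",
)
print(close, far)
```

A plain AND would accept both texts; NEAR accepts only the first, where the words sit side by side.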
Field searching is usually easier in the Advanced Search area • title: • Link check (Google, AltaVista): helps in evaluating sites! Example: link:mciu.org/~spjvweb • Media or filetype:pdf or ppt (Google): great for finding documents, papers, and presentations! • domain: Example: domain:jp +edu
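Field searching restricts a match to one part of a record instead of the whole text. A minimal sketch over three made-up records; the field names `title`, `text`, and `filetype` and the `search` helper are assumptions for illustration, not any engine's actual index layout:

```python
# Hypothetical indexed records.
DOCS = [
    {"title": "civil war timeline", "text": "battles and dates", "filetype": "html"},
    {"title": "reconstruction era", "text": "after the civil war ended", "filetype": "pdf"},
    {"title": "civil war maps", "text": "troop movements", "filetype": "pdf"},
]

def search(docs, term, field=None):
    """Match term in one named field, or in title or text when no field is given."""
    fields = [field] if field else ["title", "text"]
    return [doc["title"] for doc in docs if any(term in doc[f] for f in fields)]

anywhere = search(DOCS, "civil war")
title_only = search(DOCS, "civil war", field="title")
pdf_only = search(DOCS, "pdf", field="filetype")
print(anywhere, title_only, pdf_only)
```

Limiting the search to the title field drops the page that only mentions the topic in passing, which is the usual reason to reach for a field search.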
Just as I wouldn’t ask my contractor friend to prepare my will, I wouldn’t ask my lawyer friend to build my new kitchen. Search tools have specialties too.
Research points to the fact that students do not plan, they react. Response?
“People who are only good with hammers see every problem as a nail.” Abraham Maslow
Another paradigm shift? “We are moving from the Decade of the End User to the Decade of the Search Engine.” Pam Berger
We need to address two issues: • How do we make the user smarter? • How do we make the system smarter?
A field guide to the search tools • Search engines: Databases of billions of Web pages, gathered by automated "robots," allowing broad, often overwhelming searches. Search engines vary in the ways they collect sites and organize results. • Metasearch engines: Search across a variety of search tools and organize the collected results. Good for a broad, sweeping search. • Subject directories: Links to resources arranged in subject hierarchies, encouraging users to both browse through, and often search for, results. Subject directories are often annotated. They are selected, evaluated, and maintained by humans. • Specific subject guides or gateways: The work of subject specialists, subject gateways usually result in carefully selected and annotated links. • Specialized search engines: Search engines that focus their searching in a particular area of knowledge or interest. • Subscription databases: Pay services, often provided by states or libraries, offering premium content in the form of reference materials, journal and newspaper articles, broadcast transcripts, etc.
Subject directories: When to use them • When you are just starting out, or have a broad topic or one major keyword or phrase (example: “Civil War”) • When you want to get to the best sites on a topic quickly • When you value annotations and assigned subject headings which may help retrieve more relevant material • When you want to avoid viewing the many noise documents picked up by search engines
Two Essential Directories • Librarians’ Index to the Internet http://lii.org Well-organized, selective, and continually updated collection, also known as “the thinking person's Yahoo.” Maintained by a team of librarians at Berkeley Public Library • KidsClick http://kidsclick.org/ Great starting point for kids. Annotations are carefully written. Offers grade levels and describes how illustrated a site is.
Subject directories to count on • INFOMINE: Scholarly Internet Resource Collections http://infomine.ucr.edu/ A large collection of scholarly Internet resources • About.com http://www.about.com Offers a surprising number of guide pages, maintained by paid experts. Not scholarly but very handy for everyday, practical topics • Academic Info: Your Gateway to Quality Educational Resources http://www.academicinfo.net/ Great for high school and college research • BUBL Link http://bubl.ac.uk/link/ This UK project leads to carefully selected and annotated resources • WWW Virtual Library http://www.vlib.org/ The first subject directory on the Web. Features comprehensive, well-annotated subject collections maintained by experts around the world
Subject directories--Popular • Google Directory http://directory.google.com/ • Yahoo http://dir.yahoo.com/ Both Yahoo and Google offer popular directories. They are not very selective, but they offer some wonderful subject collections. Examples: Yahoo Full Coverage http://fullcoverage.yahoo.com/fc/ Google Social Issues http://directory.google.com/Top/Society/Issues/
Search Engines: When to use them • When you have a narrow topic or several keywords • When you are looking for a specific site • When you want to do a comprehensive search and retrieve a large number of documents on your topic • When you want to use an advanced search screen or search for particular types of documents, file types, source locations, languages, date last modified, etc. • When you want to take advantage of newer retrieval technologies, such as concept clustering, ranking by popularity, link ranking, etc.
Search engines are powerful but they have limitations! • They do not crawl the web in “real time” • If a site is not linked or submitted it may not be accessible • Not every page of a site is always searchable • Few search engines truly search the full text of Web pages • Special tools needed for the Invisible/Deep Web • Paid placement / sponsored results distract from real results
When using a search engine, your goal is to get the best stuff to appear on the first two or three pages.
Relevance rocks! Search engines determine relevance in different ways.
Second Gen Search Tools approach relevance in helpful ways: • Google ranks by link popularity • Teoma ranks by subject-specific popularity • Vivisimo, KillerInfo, Mooter: offer concept-clustered results • SurfWax: uses human-generated indexes (Focus Words and summaries) • Ixquick Metasearch: uses the ranking schemes (top ten lists) of other search tools • Laura Cohen, U. Albany http://library.albany.edu/internet/second.html
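Ranking by link popularity can be pictured, very roughly, as counting how many other pages link to each page. The sketch below is a plain inbound-link count over a made-up link graph. It is not Google's actual algorithm, and the `LINKS` data and `inbound_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

def inbound_counts(links):
    """Count, for each page, how many other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] += 1
    return counts

counts = inbound_counts(LINKS)
# Most-linked pages first; ties broken alphabetically.
ranking = sorted(counts, key=lambda page: (-counts[page], page))
print(ranking)
```

The page nobody links to lands at the bottom, which is the same reason new, unlinked material can get buried in a link-ranked engine.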
Some search tools present results horizontally, not in long lists! • Query Server (metasearch) • Vivisimo (metasearch) • KillerInfo
Chris Sherman, What’s the Best Search Engine, 6/3/03 “Search engines differ from one another more than most people think. Each has a unique index of pages, and differing relevance algorithms. Because of this, you often get very different results using the same query words on different engines. If you're not finding what you're looking for, stop banging away on your ‘favorite’ and try another engine!”
Why we love Google • One of three major engines to crawl Acrobat files (view in ASCII); Google also sees Word, Excel, PowerPoint, and Rich Text • Unified search: groups, images, directory • OR now in use but must be uppercase • New ~ (tilde) can pick up synonyms • Stop words can be searched with “+” • +to +be +or +not +to +be • Makes cache copy of page available, so it can retrieve dead pages • Huge reach • Special features: define: and calculator • But . . . • Only crawls and makes searchable the first 110K of a page. Content of long pages may be invisible! • New stuff may not be linked yet and can get buried
AlltheWeb • Indexes PDF, Word, and Flash • Does not limit content crawled on a Web page • Indexes every word including stop words! • Good for tracking plagiarism • Refreshes database frequently! • Unified search: web pages, files, media in one list • Large, fast, fresh • Changes with Yahoo! ownership?
Vivisimo • De-dupes • Clusters hierarchical categories on the fly • Preview from result list • Searches several news databases and government sources
KartOO (visual metasearch) • List of “top” sites • Use of rollovers • Illustrates relationships among topics and sites in “cartographic environment” • Ability to plus or minus terms from suggestion box on left, which makes Boolean obvious • Similar to MusicPlasma.com • Trend, mostly in beta: TouchGraph (online display of social networks), Anacubis (for Google/Amazon), Grokker • Mooter
SurfWax Metasearch • Quick view of results’ content • Search terms to broaden or narrow a subsequent search. • Author description • Key Points • Focus Words • Links to Find Articles • Sorting options
Ixquick--metasearch Searches multiple engines and directories of your choice, and returns only those documents that appear in the top 10 of those tools’ results lists. Taps the ranking systems of the other engines!
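The top-10 idea can be sketched as a vote: each engine "votes" for the pages in its own top results, and pages with the most votes rise to the top. The engine names, result lists, and `metasearch` helper below are made up, and the slide does not describe Ixquick's actual scoring, so treat this as an illustration of the general approach only.

```python
from collections import Counter

# Hypothetical (shortened) top-result lists from three made-up engines.
ENGINE_RESULTS = {
    "engine_a": ["lii.org", "kidsclick.org", "vlib.org"],
    "engine_b": ["vlib.org", "lii.org", "about.com"],
    "engine_c": ["lii.org", "infomine.ucr.edu"],
}

def metasearch(engine_results, top_n=10):
    """Rank pages by how many engines list them in their top_n results."""
    votes = Counter()
    for results in engine_results.values():
        # set() counts a page at most once per engine, which also de-dupes.
        votes.update(set(results[:top_n]))
    # Most votes first; ties broken alphabetically for a stable order.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))

ranked = metasearch(ENGINE_RESULTS)
print(ranked)
```

A page that three engines agree on outranks one that a single engine happens to like, which is how a metasearch tool borrows the ranking work of the tools it queries.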
Ithaki Metasearch • Keyword relevance • Language options • Dedupes • Search Sites: Yahoo websites, Open Directory (web sites & categories), Looksmart, Galaxy, Zeal & AskJeeves • Results that appear more frequently in search engines will be at the top • Groups the results by search engine
BrainBoost • Natural language • Question-answering • Meta-search