
Commercial Online Databases and the Internet

OSS ’99 Global Information Forum, May 24, 1999. Anne Caputo, Dow Jones Interactive Publishing.


Presentation Transcript


  1. Commercial Online Databases and the Internet • OSS ’99 Global Information Forum • May 24, 1999 • Anne Caputo • Dow Jones Interactive Publishing

  2. Traditional Search Services Challenge the Web • The Internet Searchoff • September 1997-February 1998 • Susan Feldman, DATASEARCH • sef2@cornell.edu • Goal • Compare searching traditional online services with searching the World Wide Web • Effectiveness in finding information • When to use which one • Strengths of each approach

  3. Searchoff Ground Rules • Be a trained, experienced searcher • Use a real question from a client • Search either Dialog or Dow Jones Interactive • Relevance rank the results • Rank the top 30 retrieved documents on a scale of 1 to 5
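The scoring rule above can be sketched in a few lines. This is a minimal illustration, assuming that "relevance points" are the plain sum of the 1-to-5 ratings over the top 30 documents; the slides do not define the formula, and the function name and sample ratings are hypothetical.

```python
def relevance_points(ratings, top_n=30):
    """Sum the 1-5 relevance ratings of the first top_n retrieved documents.

    Assumption: "relevance points" means a plain sum of ratings; the
    slides do not spell out the formula.
    """
    judged = ratings[:top_n]
    for rating in judged:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return sum(judged)


# Hypothetical ratings for one search, in retrieval order.
sample_ratings = [5, 4, 4, 1, 3, 1, 2]
print(relevance_points(sample_ratings))
```

Only the first 30 ratings count, so a search that returns thousands of hits scores no higher than one that returns 30 equally relevant ones.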

  4. Subjects Searched • Business: 38% • Technology: 18% • Medicine/Pharmaceuticals: 14% • Science: 10% • Humanities: 8% • Engineering: 6% • Other: 6%

  5. Web Search Engines Used • Alta Vista: 45% • Hotbot: 20% • Excite: 14% • Infoseek: 14% • Lycos: 5% • Webferret: 2%

  6. Internet Search-Off Results • [Bar chart comparing Web totals (W) and Dialog/Dow Jones totals (D) on two measures: relevance points and number of documents. Data labels on the chart: 1400, 1143, 515, 484]

  7. Searching time • Total minutes searching time: • DIALOG/DOW JONES: 594 minutes • WWW search engines: 1230 minutes • Plus formatting time

  8. Searching assumptions: traditional search engines • Information exists on the subject • The information is high quality • The information is current • The information is expensive • To find it, we need expertise and training to know how and where to search • It will be a surprise if we can’t find something

  9. Searching assumptions: World Wide Web • There MIGHT be information on the topic • Quality and timeliness are unpredictable • The information is free • There’s no telling how the search engine works • Searching requires no skill • Searching requires no training • It will be a surprise if we find something

  10. Retrieved Documents by Relevance • [Bar chart: number of documents retrieved from the Web (W) and from DIALOG/Dow Jones (D) at each relevance rank, from 1 (less relevant) to 5 (more relevant). Data labels on the chart: 306, 147, 117, 111, 108, 60, 52, 38, 34, 26]

  11. Conclusion DIALOG training has influenced an entire generation of searchers: we automatically shift into Boolean

  12. Digression: • Nested Boolean searches don’t take advantage of the strong points of Web search engines • Statistical search engines search a whole territory. Boolean engines search for a point in that territory
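The contrast between a "point" and a "territory" can be illustrated with a small sketch. It assumes a toy document set and uses simple term overlap as a stand-in for statistical ranking; real engines use weighted scores that the slides do not describe, and every name here is hypothetical.

```python
def boolean_and_match(query_terms, documents):
    """Return only the documents that contain every query term."""
    wanted = {t.lower() for t in query_terms}
    return [doc for doc in documents if wanted <= set(doc.lower().split())]


def overlap_ranked(query_terms, documents):
    """Rank documents by how many distinct query terms they contain.

    A deliberately simple stand-in for statistical ranking.
    """
    wanted = {t.lower() for t in query_terms}
    scored = [(len(wanted & set(doc.lower().split())), doc) for doc in documents]
    # Keep partial matches, drop documents with no query term at all.
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0]


docs = [
    "drug trial results for asthma",
    "asthma inhaler product review",
    "stock market report",
]
query = ["asthma", "drug", "trial"]
print(boolean_and_match(query, docs))  # only the document with all three terms
print(overlap_ranked(query, docs))     # full match first, then the partial match
```

The Boolean function returns an exact set and nothing else, while the overlap ranking also surfaces nearby documents, which is the behavior the slide attributes to statistical engines.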

  13. Web Strategies • Map the territory: • Use your searching skills to create lists of related terms • Omit Boolean operators • Let the search engine work without interference • Put the most important and rarest words first • Use MORE LIKE THIS to improve results

  14. Web Strategies • Use phrases when possible to eliminate irrelevant materials • Ignore the useless hits and pursue the good ones • Don’t worry about finding six million documents. • Just look at the top 30 • Rephrase the search • Move to another search engine if you don’t find anything
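Several of the strategies on the two slides above (omit Boolean operators, put the rarest words first, use phrases) can be combined in one small helper. This is a sketch under stated assumptions: the frequency estimates are invented, double quotes are assumed to be the engine's phrase syntax, and the function name is hypothetical.

```python
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def build_web_query(terms, doc_frequency):
    """Build an operator-free query with the rarest terms first.

    terms: words or multi-word phrases.
    doc_frequency: hypothetical estimate of how common each term is;
        unknown terms are treated as rare (frequency 0).
    """
    # Drop Boolean operators, whatever their letter case.
    kept = [t for t in terms if t.upper() not in BOOLEAN_OPERATORS]
    # Rarest first; sorted() is stable, so ties keep the caller's order.
    ordered = sorted(kept, key=lambda t: doc_frequency.get(t.lower(), 0))
    # Quote multi-word phrases so the engine treats them as a unit
    # (assumed syntax; check the engine's own help page).
    return " ".join(f'"{t}"' if " " in t else t for t in ordered)


frequency = {"asthma": 120_000, "inhaler": 15_000, "clinical trial": 40_000}
print(build_web_query(["asthma", "AND", "clinical trial", "OR", "inhaler"], frequency))
```

A searcher would normally judge rarity by experience rather than by a lookup table; the dictionary only makes that ordering explicit.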

  15. Conclusions: traditional search services • Predictable archives • Chemical Engineering • Electrical Engineering • Strengths • History and background on companies • History and historical figures • Market reports, industry reports

  16. Conclusions: traditional search services • Current drug studies (authoritative) • Industry newsletters and journals • Financial industry coverage • Scholarly journal articles • High quality information • Quick searches when you know the information is likely to be there

  17. Conclusions: The Web • Pictures and illustrations • Some conference coverage and papers • Product information that comes from the company itself • Small companies: products and background • Medical statistics (current) • Any topic, if you know where to find the information

  18. Conclusions: use both • To supplement each other for: • Standards • Articles on topics of general interest • Popular subjects • Organizations • Directory information • Reviews/evaluations/how-to information

  19. Conclusions: use both • Government regulations and other agency information • Competitive intelligence • Obscure topics • Clues for finding information on and offline

  20. Conclusions: general • Time is money. • Free information that takes too long to find and format is expensive information • The Web is a new tool. • We need to learn to use both online sources well • Vary strategies and approach to take advantage of each medium
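The "time is money" point can be made concrete with a rough cost comparison. The searching times are the Searchoff totals (594 and 1230 minutes); the hourly rate and the access fees are invented for illustration, so the result shows the shape of the argument rather than a measured outcome.

```python
def search_cost(minutes, hourly_rate, access_fees=0.0):
    """Total cost of a search: searcher time plus any database charges."""
    return minutes / 60 * hourly_rate + access_fees


# Searching times come from the Searchoff totals; the rate and fees are hypothetical.
HOURLY_RATE = 60.0        # assumed cost of one searcher hour
TRADITIONAL_FEES = 300.0  # assumed online charges across all searches

traditional = search_cost(594, HOURLY_RATE, TRADITIONAL_FEES)
web = search_cost(1230, HOURLY_RATE)
print(f"Dialog/Dow Jones: {traditional:.2f}")
print(f"Web search engines: {web:.2f}")
```

With these assumed figures the "free" Web searches cost more once searcher time is counted; a lower hourly rate or higher access fees would shift the balance, and formatting time would add to both totals.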
