Searching the Internet

CSCI-N 100 Department of Computer and Information Science

Searching the Internet
  • What is the Internet?
  • Does anyone own the Internet?
  • How is the Internet controlled?
The Internet…
  • It is not a centrally owned or organized institution.
  • It is not a single entity.
  • It is not a 'Den of Iniquity'.
  • It is not crawling with eight-year-old children controlling nuclear bombs.
  • The Internet is not a hive of viruses waiting to attack your computer.
  • The Internet is not just for pimple-faced teenagers with propeller beanies.
The Internet…
  • Is a vast repository of information.
  • Is relatively universal.
  • Is dynamic, changing minute by minute.
The Internet
  • InterNIC
    • Internet Network Information Center - an international coalition of Internet organizations that exercises what central control there is over the Internet
  • IAB
    • Internet Architecture Board - an organization that sets standards for the Internet
  • ICANN
    • Internet Corporation for Assigned Names and Numbers - an organization responsible for the global coordination of the Internet's system of unique identifiers
  • W3C
    • World Wide Web Consortium - develops interoperable technologies, specifications, guidelines, software, and tools for the Web
Search engines
  • A search engine
    • is an information retrieval system
    • lets you ask for content meeting specific criteria
    • returns a list of results, often sorted by some measure of relevance
    • uses regularly updated indexes to operate quickly and efficiently
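To make "sorted by some measure of relevance" concrete, here is a minimal Python sketch. It assumes the crudest possible measure, the number of times the query words occur in a document; real search engines use far more elaborate ranking. The documents and the names `tokenize` and `rank` are made up for illustration.

```python
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words on whitespace."""
    return text.lower().split()

def rank(query, documents):
    """Return (doc_id, score) pairs, highest score first.

    The score is how many times the query words occur in the document.
    Documents that score zero are left out of the results.
    """
    query_words = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[word] for word in query_words)
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties are broken alphabetically by doc_id.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical mini collection of pages.
DOCS = {
    "a.html": "peanut butter and more peanut butter",
    "b.html": "butter cookies",
    "c.html": "garden peas",
}

print(rank("peanut butter", DOCS))  # [('a.html', 4), ('b.html', 1)]
```

The page that mentions the query words most often comes first, and the page that mentions none of them does not appear at all.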
Search engines
  • First search engines
    • Archie - "archive" without the "v"
      • created in 1990 by a student at McGill University in Montreal
      • program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites
      • creating a searchable database of filenames
      • could not search by file contents
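The Archie idea can be sketched in a few lines of Python. This is only an illustration, not Archie's actual code: the site names, file paths, and function names are invented, and the directory listings are hard-coded instead of being downloaded from FTP servers.

```python
# Hypothetical directory listings, as if collected from anonymous FTP sites.
LISTINGS = {
    "ftp.example.edu": ["/pub/gnu/emacs.tar", "/pub/docs/README"],
    "ftp.example.org": ["/mirrors/emacs-manual.txt", "/games/rogue.tar"],
}

def build_filename_db(listings):
    """Flatten the per-site listings into (site, path) records."""
    return [(site, path)
            for site, paths in sorted(listings.items())
            for path in paths]

def search_filenames(db, term):
    """Return records whose file name (last path component) contains term.

    Only names are examined; file contents are never opened.
    """
    term = term.lower()
    return [(site, path) for site, path in db
            if term in path.rsplit("/", 1)[-1].lower()]

db = build_filename_db(LISTINGS)
print(search_filenames(db, "emacs"))
```

A search for "emacs" finds one file on each site, because the word is part of the file names. Nothing inside the files is ever read, which is the limitation noted in the last bullet.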
Search engines
  • Gopher
    • indexed plain text documents
    • created in 1991 at the University of Minnesota and named after the school's mascot
    • because Gopher content was plain text files, most Gopher sites became websites after the creation of the World Wide Web
Search engines
  • Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives)
    • provided a keyword search of most Gopher menu titles across the entire Gopher listings
  • Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display)
    • a tool for obtaining menu information from various Gopher servers
And the answer is …
  • People have trouble with
    • How to ask
    • What to ask
    • Where to ask
    • When to ask
How to ask
  • Search criteria
    • Build a query
      • Date
      • File name
      • Location
      • Keyword
      • Domain
      • Country
How to ask
  • Boolean operators and wildcards
    • And, + (plus)
      • Finds documents containing all of the specified words or phrases
      • Peanut AND butter finds documents with both the word peanut and the word butter.
    • Or
      • Finds documents containing at least one of the specified words or phrases
      • Peanut OR butter finds documents containing either peanut or butter. The found documents could contain both items, but not necessarily.
    • Not, - (minus)
      • Excludes documents containing the specified word or phrase
      • Peanut NOT butter finds documents with peanut but not containing butter
    • Wild card (*)
      • Finds documents matching the part of a word you supply; * fills in the rest
      • Pea* returns all pages with a word that starts with pea, such as peanut, peach, or peace (be careful!)
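The operators above map directly onto set operations. The Python sketch below assumes a tiny made-up collection of four documents and treats each search as a set of matching document numbers; the helper names `containing` and `starting_with` are invented for the example.

```python
# Hypothetical documents, numbered 1 to 4.
DOCS = {
    1: "peanut butter sandwich",
    2: "peanut brittle",
    3: "butter cookies",
    4: "peach pie",
}

def words(text):
    """Return the set of lowercase words in the text."""
    return set(text.lower().split())

def containing(word):
    """Numbers of the documents that contain the exact word."""
    return {doc_id for doc_id, text in DOCS.items()
            if word.lower() in words(text)}

def starting_with(prefix):
    """Numbers of the documents with a word starting with prefix (the * wildcard)."""
    prefix = prefix.lower()
    return {doc_id for doc_id, text in DOCS.items()
            if any(w.startswith(prefix) for w in words(text))}

peanut, butter = containing("peanut"), containing("butter")

print(sorted(peanut & butter))       # peanut AND butter -> [1]
print(sorted(peanut | butter))       # peanut OR butter  -> [1, 2, 3]
print(sorted(peanut - butter))       # peanut NOT butter -> [2]
print(sorted(starting_with("pea")))  # pea*              -> [1, 2, 4]
```

Note how pea* also pulls in the peach pie page, which is why the wildcard needs care.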
What to ask
  • All of these words
    • Documents must contain all of the words you list
  • This exact phrase
    • Documents must contain these exact words in the order you typed them
  • Any of these words
    • Documents must contain at least one of the words you list
  • None of these words
    • Documents that contain these words will be omitted from your results
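These four boxes can be read as four tests that a document has to pass. The following Python sketch is one possible reading, not how any particular search engine implements its advanced search form; the function `matches` and its parameter names are hypothetical.

```python
def matches(text, all_words=(), exact_phrase="", any_words=(), none_words=()):
    """Return True if the text passes every filled-in box of the form."""
    tokens = text.lower().split()
    token_set = set(tokens)

    # All of these words: every listed word must be present.
    if not all(w.lower() in token_set for w in all_words):
        return False

    # This exact phrase: the words must appear together, in order.
    if exact_phrase:
        phrase = exact_phrase.lower().split()
        n = len(phrase)
        if not any(tokens[i:i + n] == phrase
                   for i in range(len(tokens) - n + 1)):
            return False

    # Any of these words: at least one listed word must be present.
    if any_words and not any(w.lower() in token_set for w in any_words):
        return False

    # None of these words: any listed word rules the document out.
    if any(w.lower() in token_set for w in none_words):
        return False

    return True

print(matches("creamy peanut butter recipe", exact_phrase="peanut butter"))  # True
print(matches("butter and peanut", exact_phrase="peanut butter"))            # False
```

The second call fails because both words are present but not in the order typed.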
Where to ask
  • Search engines
    • Do not really search the World Wide Web directly
    • Search a database of the full text of web pages, selected from the billions of web pages residing on servers
    • Their databases are selected and built by computer robot programs called "spiders"
    • After spiders find pages, they pass them on to another computer program for "indexing."
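The spider-then-index pipeline can be sketched in Python as follows. Everything here is a simplification under stated assumptions: the "web" is a hard-coded dictionary with invented URLs instead of real pages fetched over the network, and the index is a plain word-to-URLs mapping.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "http://a.example/": ("peanut butter recipes",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("butter cookies", ["http://a.example/"]),
    "http://c.example/": ("garden peas", []),
    "http://d.example/": ("unlinked page about peanut farms", []),
}

def crawl(start_url):
    """A toy 'spider': follow links breadth-first and collect page text."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        pages[url] = text
        for link in links:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """The 'indexing' step: map each word to the URLs that contain it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

index = build_index(crawl("http://a.example/"))
print(sorted(index["butter"]))  # ['http://a.example/', 'http://b.example/']
print(sorted(index["peanut"]))  # ['http://a.example/']
```

The page at d.example mentions peanut but no other page links to it, so the spider never finds it and it is missing from the results. A search answers from the index, not from the live web.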
Types of Search Tools
  • Search engines
    • built by computer robot programs ("spiders") -- not by human selection
    • NOT organized by subject categories -- all pages are ranked by a computer algorithm
    • contain full-text (every word) of the web pages they link to -- you find pages by matching words in the pages you want
    • huge and often retrieve a lot of information -- for complex searches use ones that allow you to search within results
    • unevaluated -- contain the good, the bad, and the ugly -- YOU must evaluate everything you find
    • examples: Google, Yahoo, Ask.com
Types of Search Tools
  • Subject directories
    • built by human selection -- not by computers or robot programs
    • organized into subject categories, classification of pages by subjects -- subjects not standardized and vary according to the scope of each directory
    • NEVER contain full-text of the web pages they link to -- you can only search what you can see (titles, descriptions, subject categories, etc.) -- use broad or general terms
    • range from small and specialized to large, but are smaller than most search engines -- huge range in size
    • often carefully evaluated and annotated (but not always!!)
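A subject directory can be pictured as a hand-built table of categories and annotated entries. The Python sketch below uses invented categories and entries, and the function `search_directory` is hypothetical; it searches only what a visitor can see, never the full text of the linked pages.

```python
# Hypothetical directory, built "by hand": category -> annotated entries.
DIRECTORY = {
    "Cooking/Baking": [
        {"title": "Cookie Basics",
         "description": "Beginner baking techniques"},
    ],
    "Science/Biology": [
        {"title": "Insect Field Guide",
         "description": "Identification keys for common insects"},
    ],
}

def search_directory(term):
    """Match term against category, title, and description only."""
    term = term.lower()
    hits = []
    for category, entries in sorted(DIRECTORY.items()):
        for entry in entries:
            visible = " ".join(
                [category, entry["title"], entry["description"]]).lower()
            if term in visible:
                hits.append((category, entry["title"]))
    return hits

print(search_directory("baking"))  # [('Cooking/Baking', 'Cookie Basics')]
print(search_directory("butter"))  # []
```

A specific word such as "butter" finds nothing, even if the linked cookie page is full of it, because the page text is not in the directory. This is why broad or general terms work better here.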
Directories
  • Librarians' Internet Index
    • www.lii.org
  • Infomine
    • infomine.ucr.edu
  • AcademicInfo
    • www.academicinfo.us
  • About.com
    • www.about.com
  • Google Directory
    • directory.google.com
  • Yahoo!
    • dir.yahoo.com
Types of Search Tools
  • Searchable database contents or the "Invisible Web"
    • Invisible Web is estimated to offer two to three times as many pages as the visible web
    • Pages in non-HTML formats (PDF, Word, Excel, Corel suite, etc.) are "translated" into HTML
    • Script-based pages, whose links contain a ? or other script coding, no longer cause most search engines to exclude them
    • Pages generated dynamically by other types of database software (e.g., Active Server Pages, Cold Fusion) can be indexed if there is a stable URL somewhere that search engine spiders can find
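To see what a "script-based" link looks like, the Python sketch below splits a URL into its parts with the standard library's `urllib.parse` and checks for the part after the ?. The URLs and the function name `describe_url` are invented for the example.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Report the path of a URL and whether it carries a ?query part."""
    parts = urlsplit(url)
    return {
        "path": parts.path,
        "is_script_based": bool(parts.query),
        "parameters": parse_qs(parts.query),
    }

# A dynamically generated page: the part after ? tells the script what to show.
print(describe_url("http://www.example.com/catalog.asp?item=42&lang=en"))

# A static page: no query part at all.
print(describe_url("http://www.example.com/about.html"))
```

The first URL is stable, so a spider that finds it can fetch and index the page like any other. Content that only appears after a visitor types into a search form has no such URL for the spider to follow.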
Types of search engines
  • Meta-Search Engines
    • you submit keywords in the meta-search engine's search box
    • it transmits your search simultaneously to several individual search engines and their databases of web pages
    • meta-search engines do not own a database of web pages
    • Examples
      • Dogpile.com
      • Clusty.com
      • Surfwax.com
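A meta-search engine can be sketched in Python as a function that forwards one query to several engines at once and merges what comes back. The two "engines" below are stand-ins that return canned results, not real services, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def engine_one(query):
    """Hypothetical engine: returns canned results for illustration."""
    return ["http://a.example/", "http://b.example/"]

def engine_two(query):
    """Hypothetical engine with partly overlapping results."""
    return ["http://b.example/", "http://c.example/"]

def meta_search(query, engines):
    """Send the query to every engine at the same time, then merge.

    The meta-search engine keeps no database of its own; it only
    combines the result lists and drops duplicate URLs.
    """
    with ThreadPoolExecutor() as pool:
        # map() returns the result lists in the same order as `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(meta_search("peanut butter", [engine_one, engine_two]))
# ['http://a.example/', 'http://b.example/', 'http://c.example/']
```

The page found by both engines appears only once in the merged list.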