The Internet, Intranets, and Extranets Chapter 7
Ch. 7 The Internet, Intranets, and Extranets • The Internet • History of the Internet • How the Internet works • Internet services • The World Wide Web • Web publishing • Intranet and extranet • several issues associated with the use of networks
Internet • Also called the Net • A worldwide collection of networks • These networks are connected together using routers • Internet Protocol (IP) • No single point of registration and control for the network
History of the Internet • ARPANET • Networking project by Pentagon’s Advanced Research Projects Agency (ARPA) • Goal: • To allow scientists at different locations to share information and work together on military and scientific projects • To function even if part of the network were disabled or destroyed • Became functional in September 1969 • Four original nodes on ARPANET • University of California at LA, the Stanford Research Institute, the University of California at Santa Barbara, and the University of Utah • NSFnet: • The National Science Foundation’s network • Connected to ARPANET in 1986 • NSFnet + ARPANET = The INTERNET • Today more than 100 million host nodes
How the Internet Works • Hosts • Routers forward packets to other networks • Internet Protocol Stack (TCP/IP) • Internet Protocol (IP) • Transmission Control Protocol (TCP) • Backbones • Inner structure of the Internet • Communications lines that carry the heaviest amount of traffic
The Operation of the Internet • Packets of information flow between machines governed by common rules (protocols): • Internet Protocol (IP) • Transmission Control Protocol (TCP) • Internet is a packet-switching network • Messages are decomposed into packets, each containing part of the message, plus information on the sending and receiving machines and how the packet relates to the other packets • Packets travel independently and possibly on different routes through the Internet • Packets are reassembled into the message at the receiving machine.
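The packet-switching idea on this slide can be sketched in a few lines of Python. This is a toy illustration only, not how TCP/IP is actually implemented: the packet fields (`src`, `dst`, `seq`, `total`) and the host names are hypothetical names chosen for the example.

```python
import random

def to_packets(message, size, src, dst):
    """Split a message into packets that carry addressing and ordering info."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"src": src, "dst": dst, "seq": n, "total": len(chunks), "data": chunk}
        for n, chunk in enumerate(chunks)
    ]

def reassemble(packets):
    """Rebuild the message at the receiver, whatever order the packets arrived in."""
    if not packets:
        return ""
    ordered = sorted(packets, key=lambda p: p["seq"])
    if len(ordered) != ordered[0]["total"]:
        raise ValueError("some packets are missing")
    return "".join(p["data"] for p in ordered)

packets = to_packets("Packets travel independently.", 8, "host-a", "host-b")
random.shuffle(packets)      # simulate packets taking different routes
print(reassemble(packets))   # the original message, restored by sequence number
```

The sequence number is what lets the receiver restore the order even though each packet traveled on its own.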
How the Internet Works • What is an Internet Protocol (IP) address? • Number that uniquely identifies each computer or device connected to the Internet • Four groups of numbers, each separated by a period • Number in each group is between 0 and 255 • Example: 126.96.36.199 • First part identifies the network • Last part identifies the specific computer
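The rules above (four groups, each between 0 and 255) can be checked mechanically. The sketch below is one possible way to do it; `parse_ipv4` is a hypothetical helper, `192.0.2.17` is a reserved documentation address rather than one from the slides, and the `/24` split between network part and computer part is an assumption made for the example.

```python
import ipaddress

def parse_ipv4(text):
    """Return the four numeric groups of a dotted address, or raise ValueError."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four groups separated by periods")
    groups = []
    for part in parts:
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            raise ValueError(f"group {part!r} is not a number between 0 and 255")
        groups.append(int(part))
    return groups

print(parse_ipv4("192.0.2.17"))

# The standard library performs the same validation and can also split the
# address into a network part and a host part (the /24 prefix is assumed here).
iface = ipaddress.ip_interface("192.0.2.17/24")
print(iface.network)   # the part that identifies the network
print(iface.ip)        # the full address of the specific computer
```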
How the Internet Works • What is a domain name? • Text version of an IP address • Components are separated by periods • Each domain name represents one or more IP addresses • Example: domain name www.oswego.edu, IP address 188.8.131.52
How the Internet Works • Uniform Resource Locator (URL) • An assigned address for each resource on the Internet • Example: http://www.cs.oswego.edu/~ychoi/ISC110/isc110.htm • http: hypertext transfer protocol • www: World Wide Web • cs.oswego: host network name • edu: domain category • ~ychoi/ISC110: path • isc110.htm: resource
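As a rough illustration, the parts of the example URL can be pulled apart with Python's standard `urllib.parse` module. The variable names are choices made for this sketch.

```python
from urllib.parse import urlsplit

url = "http://www.cs.oswego.edu/~ychoi/ISC110/isc110.htm"
parts = urlsplit(url)

protocol = parts.scheme                       # the transfer protocol
host = parts.hostname                         # the full host name
domain_category = host.split(".")[-1]         # the last label of the host name
path, resource = parts.path.rsplit("/", 1)    # directory path and file name

print(protocol, host, domain_category, path, resource, sep="\n")
```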
Internet service provider (ISP) • A business that has a permanent Internet connection • Provides temporary connections to individuals and companies for free or for a fee • Regional ISP: provides access to the Internet through one or more telephone numbers local to a specific geographic location • National ISP: provides local telephone numbers in most major cities and towns nationwide; may also provide a toll-free number
How might data travel the Internet using a telephone line connection? 1. You initiate an action to request data from the Internet. 2. A modem converts the digital signals from the computer into analog signals, which are understood by telephone lines. 3. Data (request) travels through telephone lines to a local ISP. 4. Data may pass through one or more routers before reaching its final destination. 5. The regional ISP uses lines, leased from a telephone company, to send data to a national ISP. 6. The national ISP routes data across the country to another national ISP. 7. Data moves from a national ISP to a local ISP and then to a destination server. 8. The server retrieves the requested data and sends it back through the Internet backbone to your computer.
Internet connection methods • LAN whose server is an Internet host • SLIP or PPP via dial-up access • SLIP and PPP are communications protocols that enable packets to move across telephone lines • On-line service such as AOL, MSN, or Prodigy
Internet Services: Communications • E-mail – electronic messaging • USENET newsgroups – forums that collect groups of messages from users based on common themes • LISTSERV – distributes e-mail messages to all subscribers • Chatting – live, interactive, written conversations based on topic groups • Instant messaging – instant text messaging between Internet users • Telnet – user on one computer doing work on another computer • Internet telephony – conducting voice conversations over the Internet • Internet fax – real-time document transmittal • Streaming audio and video
Internet Services: Information Retrieval • File Transfer Protocol (FTP) – electronic transfer of files from one computer to another • Archie – tool that enables searching for files at FTP sites • Gophers – menu-driven information search tool • Veronica – text search through Gopher sites • Wide Area Information Servers (WAIS) – database search tool
How does an e-mail message travel? 1. Using e-mail software, you create and send a message. 2. Your software contacts software on your service provider’s mail server. 3. Software on the mail server determines the best route for the data and sends the message, which travels along Internet routers to the recipient’s mail server. 4. The mail server transfers the message to a POP3 server. 5. When the recipient uses e-mail software to check for e-mail messages, the message transfers from the POP3 server to the recipient’s computer.
What is a POP server? • Post Office Protocol server • When a message arrives at the recipient’s mail server, the message transfers to a POP server • POP server holds an e-mail message until the recipient retrieves it with his or her e-mail software • POP3 is the newest version
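The "hold until retrieved" behavior of a POP server can be modeled with a small in-memory class. This is a toy sketch, not the POP3 protocol itself: `ToyPopServer` and its method names are hypothetical, and there is no networking or authentication.

```python
from collections import defaultdict, deque

class ToyPopServer:
    """Holds each recipient's messages until that recipient retrieves them."""

    def __init__(self):
        self._mailboxes = defaultdict(deque)

    def deliver(self, recipient, message):
        # Called when a message arrives from the recipient's mail server.
        self._mailboxes[recipient].append(message)

    def retrieve(self, recipient):
        # Called when the recipient's e-mail software checks for mail:
        # hand over everything held so far and empty the mailbox.
        box = self._mailboxes[recipient]
        messages = list(box)
        box.clear()
        return messages

server = ToyPopServer()
server.deliver("ann", "Meeting at 3 pm")
server.deliver("ann", "Lab report attached")
print(server.retrieve("ann"))   # both messages, in arrival order
print(server.retrieve("ann"))   # nothing left to transfer
```

For talking to a real server, Python's standard library includes the `poplib` module.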
The World Wide Web • An application that uses the Internet transport functions • A system with universally accepted standards for storing, retrieving, formatting, and displaying information via a client/server architecture • Based on HTML, the standard hypertext language used on the Web • Handles text, hypermedia, graphics, and sound
Brief History of the Web • 1945 • Vannevar Bush, “The Memex” • A desk containing a microfilm reader and stores of film that would serve as the equivalent of an entire research library • It would allow different items in the microfilm collection to be linked together and annotated by the reader • Bush, Vannevar. "As We May Think." Atlantic Monthly (July, 1945) http://www.isg.sfu.ca/~duchier/misc/vbush • 1960s • Ted Nelson coins the word Hypertext • Doug Engelbart prototypes an “oNLine System” (NLS) which does hypertext browsing, editing, email, and so on. He invents the mouse for this purpose. • 1980 • Tim Berners-Lee writes a notebook program which allows links to be made between arbitrary nodes. • 1993 • The WWW was freely usable by anyone
Memex • Personal library with links • Links to information • Links to own thoughts • Operated by association (and buttons) • Trails • Associations of thoughts • Dynamic • Shareable • Intricacy of trails • Web of trails
Memex Picture from http://www.dynamicdiagrams.com/design/memex/model.htm#download
The World Wide Web • Home Page - a text and graphical screen display; first, introductory page in a web site • Web Site - all the pages of a company or individual • Hyperlinks - ways to link and navigate around the pages on a web site • Webmaster - the person in charge of a Web site • Browsers - graphical software that enables WWW users to request and view web documents • Uniform Resource Locator (URL) - points to the address of a specific resource on the Web • Hypertext Transfer Protocol (HTTP) - communications standard used to transfer pages across the WWW portion of the Internet
Web browser • Also called a browser • Software program that allows you to access and view Web pages • Two popular browsers for personal computers • Netscape • Internet Explorer
Web Site Navigation • Web site organization and navigation design is up to the developers • Navigation allows users to • Know where they are • Know where they can go • Approaches • TOC/Index model • Site Map • Navigation Rules • Consistency • Ease of use • Indicate current location • Always be able to go home
Uniform Resource Locator (URL) • Also called a Web address • Unique address for a Web page • Browser retrieves a Web page by using the URL • Format: http://www.cs.oswego.edu/~ychoi/ISC110.htm • http:// stands for hypertext transfer protocol, the communications standard used to transfer pages on the Web
Domain Name System • [Figure: top-level domains mil, net, gov, org, com, edu; your computer, http request, domain name server, Oswego web server, CS web server, directories, file]
Domain Name Server • www.oswego.edu: 184.108.40.206 • www.microsoft.edu: 932.562.85.9 • www.nasa.gov: 976.899.86.5 • ...
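A domain name server is, at its simplest, a lookup table from names to addresses. The sketch below models that table with a dictionary. The names and addresses in `NAME_TABLE` are hypothetical entries built from reserved documentation ranges, not real records, and `resolve` is a helper invented for this example.

```python
# Hypothetical name table; the entries use reserved example names and addresses.
NAME_TABLE = {
    "www.example.edu": "192.0.2.10",
    "www.example.gov": "198.51.100.7",
}

def resolve(domain_name, table=NAME_TABLE):
    """Return the IP address recorded for a domain name."""
    key = domain_name.strip().lower().rstrip(".")   # names are case-insensitive
    try:
        return table[key]
    except KeyError:
        raise LookupError(f"no address on record for {domain_name!r}") from None

print(resolve("WWW.Example.EDU"))
```

On a networked machine, the standard library call `socket.gethostbyname(name)` asks the real Domain Name System instead of a local table.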
Web Server • A computer that delivers (serves) Web pages you request • Computer running web server software • Contains WWW directories • [Figure: web server www.oswego.edu with directory path Smith / ISC110 / Homework / Ch3reviewquestion.htm]
Finding Information on the Web • Understanding Information Discovery Tools • Subject Guides • Search Engines • A software program you can use to find Web sites, Web pages and Internet files • A Web search tool that helps you find relevant web pages
Search Tools • Type a phrase into a search box on a “web form” • Possibly restrict search using • Search operators • Inclusion operators • Exclusion operators • Wildcards • Boolean • Submit the form • Get candidate web pages
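To make the operators concrete, here is a small sketch of how inclusion terms, exclusion terms, and a wildcard could be applied to page text. The function `matches` and its parameters are hypothetical, and real search engines each define their own operator syntax.

```python
import re
from fnmatch import fnmatchcase

def matches(text, include=(), exclude=(), wildcard=None):
    """Return True if the text satisfies every given search restriction."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if any(term.lower() not in words for term in include):
        return False                      # a required word is missing
    if any(term.lower() in words for term in exclude):
        return False                      # a forbidden word is present
    if wildcard is not None:
        pattern = wildcard.lower()        # e.g. "train*" matches "training"
        if not any(fnmatchcase(word, pattern) for word in words):
            return False
    return True

pages = ["Cats and dogs as pets", "Cat food recipes", "Dog training tips"]
print([p for p in pages if matches(p, include=["dogs"], exclude=["food"])])
print([p for p in pages if matches(p, wildcard="train*")])
```

Inclusion and exclusion correspond to Boolean AND and NOT; an OR search would accept a page if any one of its terms is present.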
Typical crawler-based search engine architecture • [Figure: users, interface, query engine, index, indexer, crawler, web servers]
Search Technology • [Figure: search form, search engine, information retrieval and indexing, index of words, phrases, metadata, and URLs (e.g. Cat http://djfidj, Dog http://dkjdjf), web servers, result list of URLs]
Types of Search Engines • Directories • Hierarchically organized indexes that allow you to browse through lists of web sites by category or subject • Search engines • Create a database of sites using robots or spiders • Meta-search engines • Query multiple search engines simultaneously and return a complete set of hits • Specialized search engines • Create a database of sites on a specific topic using robots or spiders
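The meta-search idea (query several engines, then combine the hits) can be sketched as below. The two "engines" are stand-in functions returning made-up result lists, and the merge rule (interleave by rank, drop duplicates) is an assumption; actual meta-search engines may combine results differently.

```python
from itertools import zip_longest

def meta_search(query, engines):
    """Send one query to several engines and merge their ranked hit lists."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # Walk the lists rank by rank so each engine's top hits come first.
    for same_rank in zip_longest(*result_lists):
        for url in same_rank:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical engines with fixed answers, for illustration only.
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/"]

def engine_b(query):
    return ["http://shared.example/", "http://b.example/1", "http://b.example/2"]

print(meta_search("cats", [engine_a, engine_b]))
```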
What is a directory? • An organized set of topics • Used by a search engine to aid in locating Web sites • Each major topic has related subtopics
Directories • Yahoo http://www.yahoo.com • Librarians’ Index to the Internet http://sunsite.berkeley.edu/InternetIndex
Search Engines • Hotbot http://www.hotbot.com • AltaVista http://www.altavista.digital.com • Northern Light http://northernlight.com • Google http://www.google.com/
Search Engine Components • Spider (also called a crawler or bot) • A program that reads pages on Web sites in order to find Web pages that contain the search text • The spider visits a web page, reads it, and then follows links to other pages within the site. • The spider returns to the site on a regular basis, such as every month or two, to look for changes. • Index/Database • Everything the spider finds goes into the index. The index, also called the catalog, is like a giant book containing a copy of every web page that the spider finds • Search engine software • The program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
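The three components can be imitated in miniature. In this sketch the "web site" is a hypothetical in-memory dictionary (`SITE`), the spider follows links breadth-first, the index maps each word to the pages containing it, and ranking by word count is a deliberately naive assumption; production engines use far more elaborate relevance measures.

```python
import re
from collections import defaultdict, deque

# Hypothetical site: page name -> (page text, links to other pages)
SITE = {
    "home": ("Welcome to the pet site", ["cats", "dogs"]),
    "cats": ("Cats are quiet pets", ["home"]),
    "dogs": ("Dogs are loyal pets and dogs like walks", ["home", "cats"]),
}

def crawl(start, site):
    """Spider: visit a page, read it, then follow its links to other pages."""
    seen, queue, pages = {start}, deque([start]), {}
    while queue:
        name = queue.popleft()
        text, links = site[name]
        pages[name] = text
        for link in links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Index: record, for every word, which pages contain it and how often."""
    index = defaultdict(dict)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word][name] = index[word].get(name, 0) + 1
    return index

def search(index, term):
    """Search software: find matching pages and rank them by word count."""
    hits = index.get(term.lower(), {})
    return sorted(hits, key=lambda name: (-hits[name], name))

index = build_index(crawl("home", SITE))
print(search(index, "pets"))
print(search(index, "dogs"))
```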
Meta-Search Engines • MetaCrawler www.metacrawler.com • ProFusion www.profusion.com • Mamma www.mamma.com
When to use what? • Subject Directories • Have a broad topic • Look for a collection of sites recommended by experts • Don’t want a list of low-content documents • Search Engines • Have a narrow or unique topic or idea to research • Look for a specific site • Search for particular types of documents, file types, source locations, languages, date last modified, etc.
Specialized Search Engines • Career Mosaic www.careermosaic.com • Diseases, Disorders and related topics www.mic.ki.se/Diseases/index.html • The Day in History www.historychannel.com/today • Shareware.com www.shareware.com
Search Engine Comparison • Search engine comparison chart http://newsite.kclibrary.org/resources/search/chart.cfm • Search features chart http://www.searchenginewatch.com/facts/ataglance.html
Examining a Search Engine • Before you use a search engine, you need to consciously examine the following aspects: • The size and content of the database • How the search engine indexes web pages: what elements of a web page are searched • Its retrieval algorithms: what exactly are the matching principles that the search engine operates on • Its ranking system: in what order are the retrieved documents arranged • Its display format: what are the display elements and how is the display arranged
Search Features • A search engine may support all or some of the following features: • Boolean search • Proximity • Truncation • Case Sensitive • Phrase Search • Field Searching • Limits • Sorting
Types of Searches • Typically, a search engine offers two forms of search: • 1. Basic or simple search: used for general searches, oriented to the general public, emphasizing high recall • 2. Advanced or power search: used for more precise searches; allows the user to specify a date range, conduct field searches, modify the display format, etc., focusing on high precision • Note: not all search engines support advanced search techniques. Search engines with subject directory sites normally do not offer advanced search techniques.
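Recall and precision have standard definitions in information retrieval: precision is the share of retrieved pages that are relevant, and recall is the share of relevant pages that were retrieved. The sketch below computes both for a made-up example; the page names and the two result sets are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {"a", "b", "c", "d"}                           # pages that truly answer the query
basic = {"a", "b", "c", "v", "w", "x", "y", "z"}          # broad search: many hits, much noise
advanced = {"a", "b"}                                     # narrow search: few hits, all useful

print(precision_recall(basic, relevant))      # lower precision, higher recall
print(precision_recall(advanced, relevant))   # higher precision, lower recall
```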
Your own home page • Your own site on the web • Must have a service provider • Must have some software with which to write it • Manual exercise • Editors and programming aids, such as Microsoft FrontPage
Making Web Pages • Create text files • Plain text • .htm or .html extensions • Content includes HTML and text • Place it on a web server
HTML • HyperText Markup Language • Hypertext: Documents distributed in files and connected by hyperlinks • The markup language used by the World Wide Web • HTML uses markup tags to tell the Web browser how to display the text. • Formatting commands are intermixed with text in a file • Interpreted from start to finish • It was invented by Tim Berners-Lee
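As a small illustration of the points above, the Python sketch below writes a plain-text file with an .htm extension containing HTML tags mixed with text, then reads the tags back in the order a browser would meet them. The page content, the file name index.htm, and the `TagLister` class are choices made for this example; the file is written to a temporary folder rather than to a real web server.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path

# Markup tags are intermixed with the text they format.
PAGE = """<html>
<head><title>My Home Page</title></head>
<body>
<h1>Welcome</h1>
<p>This is <b>bold</b> text with a <a href="http://www.oswego.edu/">link</a>.</p>
</body>
</html>
"""

class TagLister(HTMLParser):
    """Records opening tags in the order they appear, start to finish."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "index.htm"          # plain text, .htm extension
    path.write_text(PAGE, encoding="utf-8")
    lister = TagLister()
    lister.feed(path.read_text(encoding="utf-8"))

print(lister.tags)
```

Publishing the page would then mean copying the .htm file into a directory served by a web server.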