Basic internet and networking concepts
Download
1 / 66

Basic Internet and Networking Concepts - PowerPoint PPT Presentation

Basic Internet and Networking Concepts Representation and Management of Data on the Internet The Internet and the World-Wide Web TCP/IP and Web Browsers The Internet and the Web Internet means Inter-Network A world-wide network of many LANs (local-area networks)

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha

Download Presentation

Basic Internet and Networking Concepts

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Basic Internet and Networking Concepts

Representation and Management of Data on the Internet


The Internet and the World-Wide Web

TCP/IP and Web Browsers


The Internet and the Web

  • Internet means Inter-Network

    • A world-wide network of many LANs (local-area networks)

    • The LANs are of various types

  • Web means World-Wide Web

    • A large collection of information arranged as hypertext and stored in many computers that are part of the Internet

  • The two are related but not the same


A Bit of History

  • The Internet grew very rapidly throughout the 1980s and 90s

    • Less than 600 computers were connected to the Internet in 1983

    • Now there are tens (if not hundreds) of millions of computers

  • The Web started in 1989 and grew very rapidly during the 1990s

  • The current Web has billions of pages


Internet Applications

  • Email

  • Telnet

  • FTP

  • Newsgroups

  • World-Wide Web

  • Chat

  • ...


The Web


Web Browsers

  • Web browsers provide a very convenient interface for viewing the information stored on the Web

  • Mosaic – the first browser – was introduced in 1993 and sharply increased the popularity of the Web


TCP/IP

  • TCP/IP is the common language of the Internet

    • IP – Internet Protocol

    • TCP – Transmission Control Protocol

  • The IP protocol transmits packets of data from one host (i.e., computer) to another

  • The TCP protocol uses many packets to transmit a long stream of data


TCP vs. IP

  • IP routes each packet from the source host to the destination host

    • IP is oblivious to the fact that usually each packet is part of a data stream

  • TCP handles correctly a long data stream

    • Divides a long data stream into many packets, at the source

    • Reassembles the packets, in the right order, at the destination

    • Handles errors and lost data


Sockets

  • Sockets are a common interface that make TCP streams look like file streams

  • Modern programming languages support sockets

  • A read or a write operation to/from a socket may block

    • Until data arrives, or

    • Until data can be sent

  • Use multiple threads so that blocking will not cause the whole GUI to freeze


IP Addresses, Host Names and URLs


IP Addresses

  • A computer connected to the Internet is called a host

  • Every host has a unique IP address

  • An IP address consists of 32 bits that are written as four decimal numbers, separated by dots

    • Example: 135.17.98.240

      • The numbers denote the four bytes composing this address


Internet Addresses

  • In addition to an IP address, a host may also have a human-readable Internet address (or hostname)

  • Some examples of hostnames:

    • www.cs.huji.ac.il

    • www.cocacola.com

    • shum.cc.huji.ac.il

  • The first part is the name of a particular host (i.e., computer)

  • The rest is the domain name


The Hierarchical Structureof Hostnames

  • Example: www.cs.huji.ac.il

    • www is a name of a computer

    • That computer is in the CS Department

    • That dept. is at The Hebrew University of Jerusalem (huji)

    • That university is an Academic Campus (ac) in Israel (il)

  • The rightmost name, il, is the main domain

  • As we move left, the sub-domains are more specific


The First 7 Generic Domains

  • com - commercial organizations (www.cocacola.com)

  • edu - educational institutions (www.berkeley.edu)

  • gov - U.S. governmental organizations (www.cia.gov)

  • int - international organizations

  • mil - U.S. military

  • net - networks (InterNIC)

  • org - other organizations (www.w3.org)

  • More domains have been added in recent years


Country Domains

  • Generic domains usually refer to hosts inside the U.S.

  • Other countries use two-letter country domains:

    • il - Israel

    • uk - United Kingdom

    • jp - Japan

    • se - Sweden

  • These domains have sub-domains that correspond to the generic domains, for example:

    • co.il is the domain of all commercial organizations in Israel

    • ac.il is the domain of all academic institutions in Israel


URLs

  • Each information piece on the Web has a unique identifying address, called a URL (Uniform Resource Locator)

  • A URL takes the following form:

  • http://www.huji.ac.il/index.html

  • It has 3 parts: a protocol field, a hostname field and a file field

file

protocol

hostname


URL Fields

  • The protocol field (“http” in the previous example) specifies the way in which the information should be accessed

  • The hostname field specifies the host on which the information is found

  • The file field specifies the particular location in the host's file system where the file is found

  • More complex forms of URLs are possible


Using IP Addresses in URLs

  • How does the browser know the IP address of the Web server?

  • One possibility is that the user explicitly specifies the IP address of the server in the hostname field of the URL, for example:

    http://135.17.98.240/index.html

  • However, it is inconvenient for people to remember such addresses


From Hostnames to IP Addresses

  • When we address a host in the Internet, we usually use its hostname (e.g., using a hostname in a URL)

  • The browser needs to map that hostname to the corresponding IP address of the given host

  • There is no algorithm for computing the IP address from the hostname

  • A lookup table provides the IP address of each hostname


Where is the Translation Done?

  • The translation of IP addresses to hostnames requires a lookup table

  • Since there are millions of hosts on the Internet, it is not feasible for the browser to hold a table that maps all hostnames to their IP-addresses

  • Moreover, new hosts are added to the Internet every day and hosts change their names


DNS (Domain Name System)

  • The browser (and other Internet applications) use a DNS Server to map hostnames to IP addresses

  • DNS is a hierarchical scheme for naming hosts

  • The command nslookup gets an IP address and returns a hostname or vice-versa

  • It runs on clients and contacts a DNS server


LANS, IP Addresses and Routing

How IP Packets are transmitted Across the Internet


IP Addresses

  • An IP address consists of 4 bytes

    • Each byte is a number in the range 0 – 255

    • For example, 132.64.1.10

  • The first 1 to 3 bytes identify the network and the remaining 1 to 3 bytes identify hosts on the network

    • There are several classes of network addresses

  • Subnet masks effectively increase the number of networks

  • Network Information Center (NIC) assigns IP addresses to organizations and companies


Classes of IP Addresses

  • Class A: The first byte is 1 – 127

    • 1 byte for network and 3 for host

  • Class B: The first byte is 128 – 191

    • 2 bytes for network and 2 for host

  • Class C: The first byte is 192 – 223

    • 3 bytes for network and 1 for host

  • Classes D and E: 224 – 255

    • These classes have special functions, e.g., a multicast packet uses a class D address


Subnet Masking

  • The network part of an IP address identifies a LAN (Local-Area Network)

  • Hosts in a given LAN can be up to 100 meters from the LAN switch

  • HU has one class B network address, namely, 132.64 (CS is 132.65)

    • But HU needs many LANs !

  • Subnet masking solves this problem


Defining a Subnet Mask

  • The subnet mask is a four-byte sequence of 1’s followed by 0’s, e.g., 255.255.255.0

  • IP addresses are interpreted as follows:

    • Any bit that is 1 in the mask identifies the network and any bit that is 0 identifies the host

  • When the subnet mask 255.255.255.0 is applied to an IP address of Class B, e.g., 132.64.112.52, it means that

    • The first 2 bytes identify a network (HU)

    • The third byte identifies a subnet, i.e., a specific LAN

    • The fourth byte identifies a host


Local-Area Networks (LANs)

  • LANs are typically built by connecting hosts to a 100Mbit Ethernet LAN switch, using Category 5 cables

  • Maximal distance between switch and host is 100 meters

  • LAN switches transmit IP packets between hosts on the same LAN

  • LAN switches translate IP addresses to physical addresses (MAC addresses)


Routers

  • LAN switches are connected using fiber optics to routers

  • Routers route IP packets across LANs

  • A router is connected directly to two or more LANs and it can transmit IP packets between this LANs (local routing)

  • Some routers are connected to each other via WANs (Wide-Area Networks) and do backbone routing


Hop-by-Hop Routing

  • Suppose that an IP packet is sent from a LAN to another far-away LAN

  • The message gets to the router that is directly connected to the source LAN

  • The router sends it to the next hop, i.e.,

    • A router on the same LAN that is also connected to some other LANs, or

    • A router on the same WAN


Routing Tables

  • Each router has routing table with prefixes of IP address

    • Each prefix has a router address for the router that handles that prefix

  • Given an IP packet with some IP address, the next-hop router is determined by matching the longest prefix (of an IP address) from the routing table with the given IP address

  • There is a default entry for the largest routers in the backbone of the Internet


Updating the Routing Tables

  • The routing table includes local information provided by the local network administrator

  • Router periodically update their routing tables by exchanging information with their neighboring routers

  • Routing protocols: Distance Vector (Bellman-Ford), Open Shortest Path First (OSPF)


A Short Overview of How the Web Works

The HTTP Protocol, Web Proxies, Dynamic HTML Pages


The HTTP Protocol

  • Hypertext Transfer Protocol

  • Used between Web clients (e.g., browsers) and Web servers (and proxies)

  • Text based

  • Built on top of TCP

  • Stateless protocol (it doesn’t remember your previous requests)


Browsers Are Clients

  • We use a browser to display HTML pages

  • The browser is responsible for fetching the HTML pages and displaying their contents according to the HTML rules


Web Servers

  • HTML pages are stored in file systems

  • Some hosts, called Web servers, can access these HTML pages

  • Each Web server runs an HTTP-daemon in order to make its HTML pages available to other hosts

  • The term “Web server” refers to the software that implements the HTTP daemon, but sometimes it also refers to the host that runs that software


HTTP Daemons

  • An HTTP-daemon is an application that is constantly running on a Web server, waiting for requests from remote hosts

  • Technically, any host connected to the Internet can act as a Web server by running an HTTP-daemon application

  • A Web client (e.g., browser) connects to a Web server through the HTTP protocol and requests an HTML page


Browser-HTTPD Interaction

user requests

http:// www.cs.huji.ac.il /index.html

GET /index.html

host www.cs.huji.ac.il

Web server

sends the

content of

index.html

HTTP

daemon

Browser

Disk


Browser-HTTPD Interaction

  • The user requests http://www.cs.huji.ac.il/index.html

  • The browser contacts the HTTP-daemon running on the host www.cs.huji.ac.il and requests the HTML page /index.html

  • The HTTP-daemon translates the requested name to a specific file in its local file system

  • The HTTP-daemon reads the file index.html from the disk and sends the content of the file to the browser

  • The browser receives the HTML page, parses it according to the HTML rules and displays it


HTTP Transaction – Client

  • Client request:

    • The request

      GET /index.html HTTP/1.0

    • Optional header information

      User-Agent: browser name

      Accept:formats the browser understands

      ...

    • A blank line (\n)

    • The client can also send data (e.g., the data that the user entered into an HTML form)


HTTP Transaction – Server

  • Server response:

    • Status line

      HTTP/1.0 200 OK

    • Header information

      Content-type: text/html

      Content-length: 3022

      ...

  • A blank line (\n)

  • Document data


Proxy Servers

  • A proxy server acts as a delegate of browsers for accessing the Web

  • The browser transfers the request for a document to the Proxy

  • The Proxy contacts the Web server and fetches the document on behalf of the browser


Proxy Server

proxy asks the

document from

the HTTPD

user requests a document

browser requests the document

from the proxy

sends the

content of

index.html

Proxy server

Browser

Proxy

application

Cache


Advantages of Proxy Servers

  • Proxy servers have several advantages over direct access:

    • They can be combined with a firewall to enable restricted access to the Internet

    • They enable caching of popular documents

    • They can enlarge the functionality of the browser by translating from one protocol to another (for example, from FTP to HTTP and vice-versa)


Responding to Clients’ Inputs

  • HTML pages are static documents

  • Sometimes users supply input, for example, keywords submitted to a search engine

  • The Web server has to react to this input

    • The output is an HTML page that is not known in advance

  • In order to react to the input, the Web server may have to use some applications (e.g., database queries)


Server-Side Programming

  • Writing applications that react to clients’ inputs by creating HTML pages on the fly is known as server-side programming

  • A client request will include, in addition to the URL of the service provider, a list of parameters, for example:

    http://www.google.com/search?q=search-word

  • The response to the above request is a dynamic HTML page and generating it may involve interaction with other applications (e.g., database queries)


Generating Dynamic HTML Pages

user requests: http://www.excite.com/search?what=something

GET /search?what=something

host www.excite.com

sends the

content of

index.html

HTTPD

application

Browser

execution of a

search program


Server-Side and Client-Side Technologies

Servlets, JSP and Java Scripts


Server-Side Technologies

  • There are five common tools for server-side programming; each one works with some of the available Web servers

    • CGI (Common Gateway Interface) programming

    • Java Servlets

    • JSP – Java Server Pages, or

    • Microsoft ASP – ActiveServer Pages (similar to JSP)

    • PHP


CGI Programming

  • CGI is a scripting language

  • A cgi script works with an application that runs on the server and creates HTML pages

  • An early technology


Java Servlets

  • Servlets are java applications that some Web servers can run

  • A Servlet creates pages on the fly and these pages are returned to the requesting browser


JSP, ASP and PHP

  • JSP (Java Server Pages)

    • Create an HTML page that has Java-Servlet code inside HTML tags

      • This page is actually a template

      • The code, for example, could issue a database query and create an HTML table for the result

    • The Web server executes the code in the template and produces a pure HTML page that is returned to the client

  • Microsoft ASP (Active Server Pages)

    • The code is VB (Visual Basic) scripts

    • The Web server must be Microsoft IIS server

  • PHP is an HTML-embedded scripting language


Client-Side Technologies

  • Certain parts of a Web application can be executed locally, in the client

  • For example, some validity checks can be applied to the user’s input locally

  • The user request is sent to the server only if the input is valid

  • Java Script (not part of Java!) is an HTML-embedded scripting language for client-side programming


Java Script

  • Java Script is a scripting language for generating dynamic HTML pages in the browser

  • The script is written inside an HTML page and the browsers runs the script and displays an ordinary HTML page

  • There is some interaction of the script with the file system using cookies

  • Cookies are small files that store personal information in the file system of the client

    • For example, a cookie may store your user name and password for accessing a particular site


Separating Content from Style

XML and Style Sheets


Separating Content from Style

  • In HTML, the contents and the style of pages are inseparable

    • HTML tags actually refer only to the style

  • XML (eXtensible Markup Language) is a markup language for marking the semantics (meaning) of the data

  • XML tags describe the meaning of each portion of text in an XML document


XML Tags

  • XML tags are similar to attributes in a relation

  • However, the attributes are the same for all the records of the relation

  • In XML documents, each portion of text has its own tag

    • <course> databases </course>

    • <course> operating systems </course>

  • XML tags can be nested


Parsing XML Documents

  • XML facilitates easy parsing of documents according to their semantics

  • For example, the CS Department has many Web pages of courses

  • Can we write a program that reads all these pages and prints a list of the names of courses?

  • If XML tags are used, it is easy to do that


Data Exchange Using XML

  • XML is important in the context of data exchange between applications

  • It is possible to define a common set of tags that are suited for specific applications

  • For example, MathML is used for exchanging mathematical information


Showing XML Documentsin Browsers

  • XML documents contain data with semantic tags

  • For a graphical representation, information about the style must be added

    • For example, HTML tags provide information about the style


Style Sheets

  • Style is added to XML documents by means of style sheets

  • There are two style-sheet languages

    • CSS – Cascading Style Sheets

      • Describe how to graphically show the data

      • Can be used to give all HTML pages a common style

    • XSL – XML Style-sheet Language

      • Can also transform the data (e.g., XML to HTML transformations)


Putting it All Together

  • A common architecture for Web applications has three tiers

    • DBMS (database management system) for storing and processing information

    • A Web server + additional tools for interacting with the DBMS (and possibly some other applications) and producing dynamic HTML pages

    • A browser that supports

      • Java Script for locally generating dynamic HTML pages

      • CSS (and possibly XSL) for creating the desired visual output


How Should XML be Used?

  • How can we query easily and effectively XML documents?

  • How can we store efficiently XML documents?

  • What is the proper way to include other resources in XML documents (i.e., figures, sounds, etc.)?


The Challenge

  • We want to

    • represent information (internally) so that the semantics is well defined

    • use the style we like for displaying the information

  • Can we do that without making the process of creating HTML pages too cumbersome?


Topics Covered in the Course

  • Server-side programming

    • JDBC for connecting to the DBMS

    • Servlets

    • JSP

  • Client-side programming

    • Java Scripts

    • CSS

  • Data storage and processing on the Web

    • XML

    • XSL


Search Engines

  • What are search engines?

  • How do they work?

  • Shortcomings of search engines

  • Some popular search engines: Infoseek, HotBot, Altavista, Excite, Lycos, Yahoo!, Jeeves,...


ad
  • Login