Basic Internet and Networking Concepts

Representation and Management of Data on the Internet

The Internet and the World-Wide Web

TCP/IP and Web Browsers

The Internet and the Web

  • Internet means Inter-Network

    • A world-wide network of many LANs (local-area networks)

    • The LANs are of various types

  • Web means World-Wide Web

    • A large collection of information arranged as hypertext and stored in many computers that are part of the Internet

  • The two are related but not the same

A Bit of History

  • The Internet grew very rapidly throughout the 1980s and 90s

    • Fewer than 600 computers were connected to the Internet in 1983

    • Now there are tens (if not hundreds) of millions of computers

  • The Web started in 1989 and grew very rapidly during the 1990s

  • The current Web has billions of pages

Internet Applications

  • Email

  • Telnet

  • FTP

  • Newsgroups

  • World-Wide Web

  • Chat

  • ...

The Web

Web Browsers

  • Web browsers provide a very convenient interface for viewing the information stored on the Web

  • Mosaic, the first widely used graphical browser, was introduced in 1993 and sharply increased the popularity of the Web


  • TCP/IP is the common language of the Internet

    • IP – Internet Protocol

    • TCP – Transmission Control Protocol

  • The IP protocol transmits packets of data from one host (i.e., computer) to another

  • The TCP protocol uses many packets to transmit a long stream of data

TCP vs. IP

  • IP routes each packet from the source host to the destination host

    • IP is oblivious to the fact that usually each packet is part of a data stream

  • TCP correctly handles a long data stream

    • Divides a long data stream into many packets, at the source

    • Reassembles the packets, in the right order, at the destination

    • Handles errors and lost data
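
The division of labor described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how a real TCP stack is written: the function names and the fixed packet size are invented for illustration, and acknowledgements, retransmission and checksums are left out.

```python
import random


def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a byte stream into (sequence_number, payload) packets."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the stream by ordering packets on their sequence number."""
    return b"".join(payload for _, payload in sorted(packets))


stream = b"a long stream of data sent over TCP"
packets = split_into_packets(stream, size=8)
random.shuffle(packets)      # IP may deliver the packets in any order
restored = reassemble(packets)
```

The sequence number is what lets the destination restore the original order even though IP treats every packet independently.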


  • Sockets are a common interface that makes TCP streams look like file streams

  • Modern programming languages support sockets

  • A read from or a write to a socket may block

    • Until data arrives, or

    • Until data can be sent

  • Use multiple threads so that blocking will not cause the whole GUI to freeze
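
A minimal sketch of the threading advice, in Python. It assumes a pair of locally connected sockets from `socket.socketpair()` as a stand-in for a real TCP connection, and the `reader` helper is a hypothetical name; a real GUI program would hand the received data back to its event loop.

```python
import socket
import threading


def reader(sock: socket.socket, received: list) -> None:
    """Block in recv() until data arrives, then store it."""
    received.append(sock.recv(1024))   # blocks only this worker thread


# Two connected sockets standing in for the two ends of a TCP stream.
left, right = socket.socketpair()
received = []
worker = threading.Thread(target=reader, args=(right, received))
worker.start()               # the main thread is free to keep the GUI alive

left.sendall(b"hello")       # data arrives, so the blocked recv() returns
worker.join(timeout=5)
left.close()
right.close()
```

Only the worker thread waits inside `recv()`, so the thread that started it never freezes.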

IP Addresses, Host Names and URLs

IP Addresses

  • A computer connected to the Internet is called a host

  • Every host has a unique IP address

  • An IP address consists of 32 bits that are written as four decimal numbers, separated by dots

    • Example:

      • The numbers denote the four bytes composing this address
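
The relationship between the dotted form and the underlying 32 bits can be shown with a short Python sketch. The address 192.0.2.1 used in the test below is an assumption (it comes from a range reserved for documentation), not the example from the original slide.

```python
def ip_to_int(address: str) -> int:
    """Pack the four decimal bytes of a dotted address into one 32-bit number."""
    parts = [int(part) for part in address.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for byte in parts:
        value = (value << 8) | byte    # shift left one byte, append the next
    return value


def int_to_ip(value: int) -> str:
    """Unpack a 32-bit number into dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Each decimal number is one byte, so the two functions are exact inverses of each other.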

Internet Addresses

  • In addition to an IP address, a host may also have a human-readable Internet address (or hostname)

  • Some examples of hostnames:




  • The first part is the name of a particular host (i.e., computer)

  • The rest is the domain name

The Hierarchical Structure of Hostnames

  • Example: www.cs.huji.ac.il

    • www is a name of a computer

    • That computer is in the CS Department

    • That dept. is at The Hebrew University of Jerusalem (huji)

    • That university is an Academic Campus (ac) in Israel (il)

  • The rightmost name, il, is the main domain

  • As we move left, the sub-domains are more specific
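
The hierarchy can be made explicit with a small Python sketch. The function name is hypothetical, and the hostname www.cs.huji.ac.il is the one spelled out label by label in the bullets above.

```python
def domain_hierarchy(hostname: str) -> list[str]:
    """List the domains a hostname belongs to, from most general to most specific."""
    labels = hostname.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


levels = domain_hierarchy("www.cs.huji.ac.il")
```

Reading `levels` from first to last corresponds to moving from the rightmost name leftward.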

The First 7 Generic Domains

  • com - commercial organizations

  • edu - educational institutions

  • gov - U.S. governmental organizations

  • int - international organizations

  • mil - U.S. military

  • net - networks (InterNIC)

  • org - other organizations

  • More domains have been added in recent years

Country Domains

  • Generic domains usually refer to hosts inside the U.S.

  • Other countries use two-letter country domains:

    • il - Israel

    • uk - United Kingdom

    • jp - Japan

    • se - Sweden

  • These domains have sub-domains that correspond to the generic domains, for example:

    • co.il is the domain of all commercial organizations in Israel

    • ac.il is the domain of all academic institutions in Israel


  • Each information piece on the Web has a unique identifying address, called a URL (Uniform Resource Locator)

  • A URL takes the following form:


  • It has 3 parts: a protocol field, a hostname field and a file field




URL Fields

  • The protocol field (“http” in the previous example) specifies the way in which the information should be accessed

  • The hostname field specifies the host on which the information is found

  • The file field specifies the particular location in the host's file system where the file is found

  • More complex forms of URLs are possible
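
The three fields can be pulled apart with Python's standard `urllib.parse` module. This is only a sketch: the URL is a made-up example on a reserved example domain, and `url_fields` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def url_fields(url: str) -> tuple[str, str, str]:
    """Return the protocol, hostname and file fields of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname or "", parts.path


fields = url_fields("http://www.example.org/docs/index.html")
```

More complex URLs carry further components (port, query, fragment), which `urlsplit` also exposes.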

Using IP Addresses in URLs

  • How does the browser know the IP address of the Web server?

  • One possibility is that the user explicitly specifies the IP address of the server in the hostname field of the URL, for example:

  • However, it is inconvenient for people to remember such addresses

From Hostnames to IP Addresses

  • When we address a host in the Internet, we usually use its hostname (e.g., using a hostname in a URL)

  • The browser needs to map that hostname to the corresponding IP address of the given host

  • There is no algorithm for computing the IP address from the hostname

  • A lookup table provides the IP address of each hostname

Where is the Translation Done?

  • The translation of hostnames to IP addresses requires a lookup table

  • Since there are millions of hosts on the Internet, it is not feasible for the browser to hold a table that maps all hostnames to their IP-addresses

  • Moreover, new hosts are added to the Internet every day and hosts change their names

DNS (Domain Name System)

  • The browser (like other Internet applications) uses a DNS server to map hostnames to IP addresses

  • DNS is a hierarchical scheme for naming hosts

  • The command nslookup gets an IP address and returns a hostname or vice-versa

  • It runs on clients and contacts a DNS server
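
Applications can perform the same lookup programmatically. The sketch below uses Python's standard `socket.gethostbyname`, which asks the system resolver (and through it a DNS server). The `lookup` parameter is an assumption added here so the function can also be tried offline with a stand-in table; actual answers depend on the network the code runs on.

```python
import socket


def resolve_all(hostnames, lookup=socket.gethostbyname):
    """Map each hostname to an IPv4 address, or to None if it cannot be resolved."""
    result = {}
    for name in hostnames:
        try:
            result[name] = lookup(name)
        except OSError:          # socket.gaierror is a subclass of OSError
            result[name] = None
    return result
```

Calling `resolve_all(["www.example.org"])` with the default `lookup` contacts the resolver configured on the client, much as nslookup does.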

LANs, IP Addresses and Routing

How IP Packets Are Transmitted Across the Internet

IP Addresses

  • An IP address consists of 4 bytes

    • Each byte is a number in the range 0 – 255

    • For example,

  • The first 1 to 3 bytes identify the network and the remaining 1 to 3 bytes identify hosts on the network

    • There are several classes of network addresses

  • Subnet masks effectively increase the number of networks

  • Network Information Center (NIC) assigns IP addresses to organizations and companies

Classes of IP Addresses

  • Class A: The first byte is 1 – 127

    • 1 byte for network and 3 for host

  • Class B: The first byte is 128 – 191

    • 2 bytes for network and 2 for host

  • Class C: The first byte is 192 – 223

    • 3 bytes for network and 1 for host

  • Classes D and E: 224 – 255

    • These classes have special functions, e.g., a multicast packet uses a class D address
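
The class boundaries can be expressed as a short Python function. One detail goes beyond the slide and is stated here as an addition: within 224 – 255, class D covers first bytes 224 – 239 and class E covers 240 – 255. The function name and the sample addresses in the test are illustrative.

```python
def address_class(address: str) -> str:
    """Return the classful category (A to E) of a dotted IPv4 address."""
    first = int(address.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError(f"first byte out of range: {first}")
    if first <= 127:
        return "A"   # 1 byte for network, 3 for host
    if first <= 191:
        return "B"   # 2 bytes for network, 2 for host
    if first <= 223:
        return "C"   # 3 bytes for network, 1 for host
    if first <= 239:
        return "D"   # multicast
    return "E"
```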

Subnet Masking

  • The network part of an IP address identifies a LAN (Local-Area Network)

  • Hosts in a given LAN can be up to 100 meters from the LAN switch

  • HU has one class B network address, namely, 132.64 (CS is 132.65)

    • But HU needs many LANs !

  • Subnet masking solves this problem

Defining a Subnet Mask

  • The subnet mask is a four-byte value made of a run of 1 bits followed by 0 bits, e.g., 255.255.255.0

  • IP addresses are interpreted as follows:

    • Any bit that is 1 in the mask identifies the network and any bit that is 0 identifies the host

  • When the subnet mask 255.255.255.0 is applied to a Class B IP address, it means that

    • The first 2 bytes identify a network (HU)

    • The third byte identifies a subnet, i.e., a specific LAN

    • The fourth byte identifies a host
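
Applying a mask is a bitwise AND, as the following Python sketch shows. The address 132.64.10.20 is a hypothetical host inside the 132.64 network mentioned earlier, and the mask 255.255.255.0 is the one implied by the byte-by-byte interpretation above.

```python
import ipaddress


def split_address(address: str, mask: str) -> tuple[str, str]:
    """Split an address into its network (and subnet) part and its host part."""
    addr = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    network = addr & mask_bits                  # bits where the mask is 1
    host = addr & ~mask_bits & 0xFFFFFFFF       # bits where the mask is 0
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(host))


network_part, host_part = split_address("132.64.10.20", "255.255.255.0")
```

Here the first two bytes (132.64) name the network, the third byte (10) the subnet, and the fourth byte (20) the host.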

Local-Area Networks (LANs)

  • LANs are typically built by connecting hosts to a 100Mbit Ethernet LAN switch, using Category 5 cables

  • Maximal distance between switch and host is 100 meters

  • LAN switches forward the frames that carry IP packets between hosts on the same LAN

  • Within a LAN, IP addresses are translated to physical addresses (MAC addresses), and switches deliver frames by MAC address


  • LAN switches are connected using fiber optics to routers

  • Routers route IP packets across LANs

  • A router is connected directly to two or more LANs and it can transmit IP packets between these LANs (local routing)

  • Some routers are connected to each other via WANs (Wide-Area Networks) and do backbone routing

Hop-by-Hop Routing

  • Suppose that an IP packet is sent from a LAN to another far-away LAN

  • The message gets to the router that is directly connected to the source LAN

  • The router sends it to the next hop, i.e.,

    • A router on the same LAN that is also connected to some other LANs, or

    • A router on the same WAN

Routing Tables

  • Each router has a routing table with prefixes of IP addresses

    • Each prefix has a router address for the router that handles that prefix

  • Given an IP packet with some IP address, the next-hop router is determined by matching the longest prefix (of an IP address) from the routing table with the given IP address

  • A default entry, used when no prefix matches, points toward the largest routers in the backbone of the Internet
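
Longest-prefix matching can be sketched with Python's standard `ipaddress` module. The routing table and router names below are invented for illustration; real routers use specialized data structures rather than a linear scan.

```python
import ipaddress

# Hypothetical routing table: prefix -> next-hop router.
ROUTING_TABLE = {
    "0.0.0.0/0": "router-default",       # default entry, matches everything
    "132.64.0.0/16": "router-campus",
    "132.64.10.0/24": "router-lan10",
}


def next_hop(destination: str, table: dict[str, str]) -> str:
    """Pick the router whose prefix is the longest one containing the address."""
    dest = ipaddress.ip_address(destination)
    best_length, best_router = -1, None
    for prefix, router in table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and network.prefixlen > best_length:
            best_length, best_router = network.prefixlen, router
    if best_router is None:
        raise LookupError(f"no route to {destination}")
    return best_router
```

An address in 132.64.10.x matches all three entries, and the /24 entry wins because it is the most specific.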

Updating the Routing Tables

  • The routing table includes local information provided by the local network administrator

  • Routers periodically update their routing tables by exchanging information with their neighboring routers

  • Routing protocols: Distance Vector (Bellman-Ford), Open Shortest Path First (OSPF)
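
The core step of a distance-vector protocol is the Bellman-Ford relaxation: a router adopts a route through a neighbor whenever that route is cheaper than the one it already knows. The Python sketch below shows only that step, with hypothetical router names and costs; timers, route withdrawal and loop prevention are omitted.

```python
def update_distances(own: dict[str, tuple[int, str]], neighbor: str,
                     link_cost: int, advertised: dict[str, int]) -> bool:
    """Merge a neighbor's advertised distances into our table.

    `own` maps destination -> (cost, next hop). Returns True if anything changed.
    """
    changed = False
    for destination, cost in advertised.items():
        candidate = link_cost + cost          # cost of going via the neighbor
        if destination not in own or candidate < own[destination][0]:
            own[destination] = (candidate, neighbor)
            changed = True
    return changed


table = {"A": (0, "A"), "B": (1, "B")}        # router A, directly linked to B
first = update_distances(table, "B", 1, {"A": 1, "B": 0, "C": 2})
second = update_distances(table, "B", 1, {"A": 1, "B": 0, "C": 2})
```

Repeating the exchange until no table changes is what lets the routers converge on shortest paths.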

A Short Overview of How the Web Works

The HTTP Protocol, Web Proxies, Dynamic HTML Pages

The HTTP Protocol

  • Hypertext Transfer Protocol

  • Used between Web clients (e.g., browsers) and Web servers (and proxies)

  • Text based

  • Built on top of TCP

  • Stateless protocol (it doesn’t remember your previous requests)

Browsers Are Clients

  • We use a browser to display HTML pages

  • The browser is responsible for fetching the HTML pages and displaying their contents according to the HTML rules

Web Servers

  • HTML pages are stored in file systems

  • Some hosts, called Web servers, can access these HTML pages

  • Each Web server runs an HTTP-daemon in order to make its HTML pages available to other hosts

  • The term “Web server” refers to the software that implements the HTTP daemon, but sometimes it also refers to the host that runs that software

HTTP Daemons

  • An HTTP-daemon is an application that is constantly running on a Web server, waiting for requests from remote hosts

  • Technically, any host connected to the Internet can act as a Web server by running an HTTP-daemon application

  • A Web client (e.g., browser) connects to a Web server through the HTTP protocol and requests an HTML page

Browser-HTTPD Interaction

[Figure: the user requests a URL ending in /index.html; the browser sends GET /index.html to the Web server; the Web server sends the content of the file back to the browser]

Browser-HTTPD Interaction

  • The user requests

  • The browser contacts the HTTP-daemon running on the host and requests the HTML page /index.html

  • The HTTP-daemon translates the requested name to a specific file in its local file system

  • The HTTP-daemon reads the file index.html from the disk and sends the content of the file to the browser

  • The browser receives the HTML page, parses it according to the HTML rules and displays it

HTTP Transaction – Client

  • Client request:

    • The request

      GET /index.html HTTP/1.0

    • Optional header information

      User-Agent: browser name

      Accept: formats the browser understands

    • A blank line (\r\n)

    • The client can also send data (e.g., the data that the user entered into an HTML form)
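
Because the protocol is text based, a request can be assembled as a plain string. The Python sketch below builds the request described above; HTTP terminates each line with a carriage return and line feed (\r\n), and the header values are placeholders rather than real browser strings.

```python
def build_request(path: str, headers: dict[str, str]) -> bytes:
    """Assemble an HTTP/1.0 GET request: request line, headers, blank line."""
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("/index.html", {
    "User-Agent": "ExampleBrowser/1.0",   # hypothetical browser name
    "Accept": "text/html",
})
```

Writing these bytes to a TCP socket connected to a Web server is all a client needs to do to issue the request.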

HTTP Transaction – Server

  • Server response:

    • Status line

      HTTP/1.0 200 OK

    • Header information

      Content-type: text/html

      Content-length: 3022


    • A blank line (\r\n)

    • Document data
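
The response has the same line-oriented shape, so a client can take it apart with basic string operations. This Python sketch assumes the whole response is already in memory and is well formed; the sample response is invented and uses a short body instead of the 3022-byte document mentioned above.

```python
def parse_response(raw: bytes) -> tuple[int, str, dict[str, str], bytes]:
    """Split a response into status code, reason, headers and document data."""
    head, _, body = raw.partition(b"\r\n\r\n")      # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


sample = (b"HTTP/1.0 200 OK\r\n"
          b"Content-type: text/html\r\n"
          b"Content-length: 13\r\n"
          b"\r\n"
          b"<html></html>")
status, reason, headers, body = parse_response(sample)
```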

Proxy Servers

  • A proxy server acts as a delegate of browsers for accessing the Web

  • The browser transfers the request for a document to the Proxy

  • The Proxy contacts the Web server and fetches the document on behalf of the browser

Proxy Server

[Figure: the user requests a document; the browser requests the document from the proxy server; the proxy asks the Web server for the document; the Web server sends the content of the document back through the proxy]

Advantages of Proxy Servers

  • Proxy servers have several advantages over direct access:

    • They can be combined with a firewall to enable restricted access to the Internet

    • They enable caching of popular documents

    • They can enlarge the functionality of the browser by translating from one protocol to another (for example, from FTP to HTTP and vice-versa)
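
The caching advantage can be illustrated with a toy proxy in Python. This is a sketch under simplifying assumptions: the class and the `fetch` callback are hypothetical, the cache never expires entries, and no real network access takes place.

```python
class CachingProxy:
    """Fetch documents on behalf of browsers and remember the results."""

    def __init__(self, fetch):
        self._fetch = fetch          # function that contacts the Web server
        self._cache = {}
        self.origin_requests = 0     # how often the Web server was contacted

    def get(self, url: str) -> str:
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


proxy = CachingProxy(lambda url: f"<html>content of {url}</html>")
first = proxy.get("http://www.example.org/index.html")
second = proxy.get("http://www.example.org/index.html")
```

The second request for a popular document is answered from the cache, so the Web server is contacted only once.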

Responding to Clients’ Inputs

  • HTML pages are static documents

  • Sometimes users supply input, for example, keywords submitted to a search engine

  • The Web server has to react to this input

    • The output is an HTML page that is not known in advance

  • In order to react to the input, the Web server may have to use some applications (e.g., database queries)

Server-Side Programming

  • Writing applications that react to clients’ inputs by creating HTML pages on the fly is known as server-side programming

  • A client request will include, in addition to the URL of the service provider, a list of parameters, for example:

  • The response to the above request is a dynamic HTML page and generating it may involve interaction with other applications (e.g., database queries)
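
A server-side program of this kind can be sketched in Python. The request target /search?what=data mirrors the form of the search request used in this presentation; the `DOCUMENTS` list is a hypothetical stand-in for a database, and `handle_search` is an invented name rather than part of any Web server's API.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for a database of documents.
DOCUMENTS = ["Introduction to databases", "Operating systems", "Database tuning"]


def handle_search(request_target: str) -> str:
    """Build an HTML page on the fly from the parameters in the request."""
    params = parse_qs(urlsplit(request_target).query)
    what = params.get("what", [""])[0]
    hits = [doc for doc in DOCUMENTS if what.lower() in doc.lower()]
    items = "".join(f"<li>{escape(doc)}</li>" for doc in hits)
    return (f"<html><body><h1>Results for {escape(what)}</h1>"
            f"<ul>{items}</ul></body></html>")


page = handle_search("/search?what=data")
```

The user's input is escaped before it is placed in the page, so that text typed into the form cannot inject markup.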

Generating Dynamic HTML Pages

[Figure: the user's request is sent as GET /search?what=something; the Web server executes a search program and sends the content of the generated page back to the browser]

Server-Side and Client-Side Technologies

Servlets, JSP and JavaScript

Server-Side Technologies

  • There are five common tools for server-side programming; each one works with some of the available Web servers

    • CGI (Common Gateway Interface) programming

    • Java Servlets

    • JSP – Java Server Pages

    • Microsoft ASP – Active Server Pages (similar to JSP)

    • PHP

CGI Programming

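  • CGI is not a language but a standard interface through which a Web server runs external programs, which are often written in a scripting language

  • A CGI script is a program that runs on the server and creates HTML pages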

  • An early technology

Java Servlets

  • Servlets are Java applications that some Web servers can run

  • A Servlet creates pages on the fly and these pages are returned to the requesting browser


  • JSP (Java Server Pages)

    • Create an HTML page that has Java code embedded in special tags

      • This page is actually a template

      • The code, for example, could issue a database query and create an HTML table for the result

    • The Web server executes the code in the template and produces a pure HTML page that is returned to the client

  • Microsoft ASP (Active Server Pages)

    • The code is typically written in VBScript (Visual Basic Script)

    • The Web server must be Microsoft IIS

  • PHP is an HTML-embedded scripting language

Client-Side Technologies

  • Certain parts of a Web application can be executed locally, in the client

  • For example, some validity checks can be applied to the user’s input locally

  • The user request is sent to the server only if the input is valid

  • JavaScript (not part of Java!) is an HTML-embedded scripting language for client-side programming

JavaScript

  • JavaScript is a scripting language for generating dynamic HTML pages in the browser

  • The script is written inside an HTML page; the browser runs the script and displays an ordinary HTML page

  • There is some interaction of the script with the file system using cookies

  • Cookies are small files that store personal information in the file system of the client

    • For example, a cookie may store your user name and password for accessing a particular site

Separating Content from Style

XML and Style Sheets

Separating Content from Style

  • In HTML, the contents and the style of pages are inseparable

    • HTML tags actually refer only to the style

  • XML (eXtensible Markup Language) is a markup language for marking the semantics (meaning) of the data

  • XML tags describe the meaning of each portion of text in an XML document

XML Tags

  • XML tags are similar to attributes in a relation

  • However, the attributes are the same for all the records of the relation

  • In XML documents, each portion of text has its own tag

    • <course> databases </course>

    • <course> operating systems </course>

  • XML tags can be nested

Parsing XML Documents

  • XML facilitates easy parsing of documents according to their semantics

  • For example, the CS Department has many Web pages of courses

  • Can we write a program that reads all these pages and prints a list of the names of courses?

  • If XML tags are used, it is easy to do that
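
With semantic tags, the course-listing program is only a few lines. The Python sketch below uses the standard `xml.etree.ElementTree` module; the `<courses>` wrapper element is an assumption added so that the two `<course>` elements shown earlier form one well-formed document.

```python
import xml.etree.ElementTree as ET

# Hypothetical course page; only the <course> tags come from the slides.
PAGE = """<courses>
  <course> databases </course>
  <course> operating systems </course>
</courses>"""


def course_names(xml_text: str) -> list[str]:
    """Return the text of every <course> element, at any nesting depth."""
    root = ET.fromstring(xml_text)
    return [element.text.strip() for element in root.iter("course") if element.text]


names = course_names(PAGE)
```

The program relies only on the meaning of the `<course>` tag, not on how the page is styled.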

Data Exchange Using XML

  • XML is important in the context of data exchange between applications

  • It is possible to define a common set of tags that are suited for specific applications

  • For example, MathML is used for exchanging mathematical information

Showing XML Documents in Browsers

  • XML documents contain data with semantic tags

  • For a graphical representation, information about the style must be added

    • For example, HTML tags provide information about the style

Style Sheets

  • Style is added to XML documents by means of style sheets

  • There are two style-sheet languages

    • CSS – Cascading Style Sheets

      • Describe how to graphically show the data

      • Can be used to give all HTML pages a common style

    • XSL – eXtensible Stylesheet Language

      • Can also transform the data (e.g., XML to HTML transformations)

Putting it All Together

  • A common architecture for Web applications has three tiers

    • DBMS (database management system) for storing and processing information

    • A Web server + additional tools for interacting with the DBMS (and possibly some other applications) and producing dynamic HTML pages

    • A browser that supports

      • JavaScript for locally generating dynamic HTML pages

      • CSS (and possibly XSL) for creating the desired visual output

How Should XML be Used?

  • How can we query XML documents easily and effectively?

  • How can we store XML documents efficiently?

  • What is the proper way to include other resources in XML documents (e.g., figures and sounds)?

The Challenge

  • We want to

    • represent information (internally) so that the semantics is well defined

    • use the style we like for displaying the information

  • Can we do that without making the process of creating HTML pages too cumbersome?

Topics Covered in the Course

  • Server-side programming

    • JDBC for connecting to the DBMS

    • Servlets

    • JSP

  • Client-side programming

    • JavaScript

    • CSS

  • Data storage and processing on the Web

    • XML

    • XSL

Search Engines

  • What are search engines?

  • How do they work?

  • Shortcomings of search engines

  • Some popular search engines: Infoseek, HotBot, Altavista, Excite, Lycos, Yahoo!, Jeeves,...
