Internet and Web Programming Basics

Web Programming

Networking


  • Early computers were highly centralized.

    • Single point of failure

    • User has to access the computer.

  • Low-cost computers made it possible to overcome these two primary disadvantages by connecting machines in a network.

  • Network – “ ... communication system for connecting end-systems”


  • End-systems also known as “hosts”

    • PCs, workstations

    • dedicated computers

    • network components

  • Advantages of networking

    • Sharing of resources

    • Price/Performance

    • Centralized administration

    • Computers as communication tools


  • Mechanisms by which software running on two or more endpoints can exchange messages

  • Java is a network-centric programming language

  • Java abstracts details of network implementation behind a standard API

Networking - Traditional Uses

  • Communication (email)

  • Resource Sharing

    • File exchange, disk sharing

    • Sharing peripherals (printers, tape drives)

    • Remote execution

Networking - New(er) Uses

  • Information sharing

    • Peer-to-Peer computing

  • Entertainment, distributed games

  • E-Commerce

  • Collaborative computing

    • Forums

    • Chats

  • WWW

LAN - Local Area Network

  • Connects computers that are physically close together (< 1 mile).

    • high speed

  • Technologies:

    • Ethernet: 10 Mbps, 100 Mbps

    • Token Ring: 16 Mbps

WAN - Wide Area Network

  • Connects computers that are physically far apart (long-haul network).

    • typically slower than a LAN.

    • typically less reliable than a LAN.

  • Technologies:

    • telephone lines

    • satellite communications

Client/Server Architecture

  • Classical network architecture is a client/server (C/S) architecture

  • A server is a process (not a machine!) waiting for requests from a client.

  • A client is a process (not a machine!) sending requests to a server and waiting for a reply.

  • Both client and server are software entities

Client/Server Architecture

  • Server examples:

    • finds a document.

    • prints a file for client.

    • records a transaction.

  • Servers are generally more complex

  • Two basic types of servers:

    • Iterative - handles one client at a time.

    • Concurrent - handles many clients in parallel.

Iterative Server

  • Naïve server implementation is sequential.

    • handles one request at a time.

  • Consider a server that needs to read data from a disk

  • Reading a file from disk takes a long time

  • The server will be idle while it waits for the data to be read

  • Other clients will be waiting

Iterative Server

[Figure: timeline of an iterative server; stages: start request loading, awaiting disk availability, deliver the data across the network]

Concurrent Server

  • Threaded servers can process several requests at once.

    • Each request is handled by a separate thread.

  • Threading does not increase the amount of work the machine can do, but it reduces the time wasted while the server sits idle.

  • Threaded operation is worthwhile when threads are expected to block while waiting for I/O operations.
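As a rough illustration of the difference between the two server types, the sketch below simulates requests whose handling blocks on I/O. It is not a real network server: the class name ServerModels, the handle method, and the use of Thread.sleep as a stand-in for a slow disk read are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ServerModels {
    // Simulated request handler: blocks for delayMillis (the "disk read"), then replies.
    static String handle(int requestId, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "reply-" + requestId;
    }

    // Iterative model: each request waits until the previous one has finished.
    static List<String> iterative(int requests, long delayMillis) {
        List<String> replies = new ArrayList<>();
        for (int id = 0; id < requests; id++) {
            replies.add(handle(id, delayMillis));
        }
        return replies;
    }

    // Concurrent model: one thread per request, so the blocking waits overlap.
    static List<String> concurrent(int requests, long delayMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int id = 0; id < requests; id++) {
                final int requestId = id;
                pending.add(pool.submit(() -> handle(requestId, delayMillis)));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> reply : pending) {
                replies.add(reply.get()); // collect replies in request order
            }
            return replies;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        iterative(4, 100);
        long t1 = System.nanoTime();
        concurrent(4, 100);
        long t2 = System.nanoTime();
        System.out.println("iterative ms:  " + (t1 - t0) / 1_000_000);
        System.out.println("concurrent ms: " + (t2 - t1) / 1_000_000);
    }
}
```

Both methods produce the same replies; with four simulated requests the iterative version should take roughly four times the delay, while the concurrent version should take roughly one delay, because the threads spend their time blocked rather than computing.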

Concurrent Server

[Figure: timeline of a concurrent server; stages: start request loading, awaiting disk availability, deliver the data across the network]

Networking Models

  • Using a formal model allows us to deal with various aspects of networking abstractly.

  • We will look at a popular model: the OSI reference model.

    • An ISO proposal for the standardization of the various networking protocols (1984)

  • The OSI reference model is a layered model.


  • Divide a task into pieces and then solve each piece independently (or nearly so).

  • Establish a well-defined interface between layers.

  • Major advantages:

    • Independence

    • Extensibility

Layered System Example – Unix OS

[Figure: layers of the Unix OS, including the system-call interface]

OSI 7-Layer Model:

High level protocols

7 Application

6 Presentation

5 Session

4 Transport

3 Network

2 Data-Link

1 Physical

Low level protocols

Communication Protocols

  • Communication between two sides is defined through a protocol

  • Protocol – An agreed upon convention for communication

    • Both sides need to understand the protocol.

  • Examples: TCP/IP, UDP, others

  • Protocols must be formally defined and unambiguous

    • Tons of documentation

Interface vs. Peer-to-Peer Protocols

[Figure: interface protocols between layers; a Data Link layer on each side]

  • Interface protocols describe the communication between layers on the same side.

  • Peer-to-peer protocols describe the communication between the two sides at the same layer.


  • Physical Layer: transmission of raw bits over a communication channel

  • Data Link Layer: divides data into frames and provides an error-free communication link

  • Network Layer: selects path between the two sides, fragmentation & reassembly, connection between network types

The Transport Layer

  • Transport Layer: provides virtual end-to-end links between peer processes

  • TCP – Transmission Control Protocol

    • Connection oriented

    • Reliable, keeps order

  • UDP – User Datagram Protocol

    • Connectionless

    • Unreliable

    • Fast


  • Session Layer: establishes, manages, and terminates sessions between applications

  • Presentation Layer: responsible for data compression and encryption

  • Application Layer: anything above previous layers (specific applications)

The Internet

  • A worldwide network connecting millions of hosts

  • WAN interconnecting many LANs of various types

  • Applications

    • World-Wide Web

    • Email

    • FTP

    • … more and more

The Web

  • The term World-Wide Web (or simply Web) describes a collection of pieces of information that

    • are stored as files on particular hosts

    • can be reached by other connected hosts

  • These hosts are called Web servers

Web or Internet?

  • They are not the same thing.

  • The Internet is a collection of computers and networking devices connected together.

    • They communicate with each other.

  • The Web is a collection of documents that are interconnected by hyper-links.

    • These documents are provided by Web servers and accessed through Web browsers.

How does the Web Work?

  • Web information is stored in Web pages

    • In HTML format.

  • The Web pages are stored on hosts called Web servers

    • In the Web server's file system.

  • The computers reading the pages are called Web clients; they use a Web browser

    • Most commonly Internet Explorer or Netscape.

  • The Web server waits for requests from Web clients over the Internet

    • Common server software: Internet Information Server (IIS) or Apache.


  • Much of the information that is found on the Web is stored as HTML files.

  • HTML is a markup language for storing formatted text.

    • It allows other types of information (such as images) to be combined in the documents.

  • Allows interconnection (links) between the documents.


  • Web browsers are used to display HTML documents.

  • The browser is responsible for

    • fetching the documents

    • displaying them according to the HTML rules.

  • Browsing refers to the activity of viewing Web documents through following the links.


  • Each communication endpoint must have an address.

  • Consider 2 computers communicating over a network:

    • the communication protocol must be specified

    • the name of the host (end-system) must be specified

    • the specific process of the host must be specified.


  • Each Web document has a unique identifying address called a URL (Uniform Resource Locator).

  • A URL takes the following form:

  • URL structure: <scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>




URL fields

  • The protocol field (<scheme>) specifies the way in which the information should be accessed.

  • The host field (<host>) specifies the host on which the information is found.

  • The file field (<path>) specifies the location on the host's disk where the file is found, together with the name of the file.

  • URLs can take more complex forms, but we do not discuss them here.
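The fields of the URL structure shown above can be inspected with the standard java.net.URI class. The sketch below is only an illustration: the class name UrlFields and the example address (host www.example.com, user alice, and so on) are made up for this purpose. Note that java.net.URI has no separate accessor for the ;<params> part, so it stays attached to the path.

```java
import java.net.URI;

class UrlFields {
    // Builds a one-line summary of the components that java.net.URI exposes.
    static String describe(String url) {
        URI u = URI.create(url);
        return "scheme=" + u.getScheme()
                + " user=" + u.getUserInfo()
                + " host=" + u.getHost()
                + " port=" + u.getPort()      // -1 when the URL names no port
                + " path=" + u.getPath()      // includes any ;params segment
                + " query=" + u.getQuery()
                + " frag=" + u.getFragment();
    }

    public static void main(String[] args) {
        // Hypothetical URL that uses every field of the general form.
        System.out.println(describe(
                "http://alice:secret@www.example.com:8080/docs/index.html;v=1?q=web#top"));
    }
}
```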

IP Addresses

  • Hostnames (which appear in URLs) are used by people.

  • Network mechanisms use IP addresses instead.

  • Every host connected to the Internet has a unique IP address that identifies it.

  • IPv4 addresses are

    • 32-bit (4-byte) numbers

    • usually written as four decimal numbers separated by dots, where each number is the value of one of the 4 bytes.
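To make the relationship between the 32-bit number and the dotted notation concrete, here is a small sketch. The class name DottedQuad and the address 192.0.2.235 (taken from a range reserved for documentation) are assumptions for illustration only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class DottedQuad {
    // Splits a 32-bit address into its four bytes, most significant first.
    static String toDotted(int address) {
        return ((address >>> 24) & 0xFF) + "."
                + ((address >>> 16) & 0xFF) + "."
                + ((address >>> 8) & 0xFF) + "."
                + (address & 0xFF);
    }

    // Packs four bytes into one 32-bit number; & 0xFF undoes Java's signed bytes.
    static int toInt(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    public static void main(String[] args) throws UnknownHostException {
        byte[] raw = {(byte) 192, 0, 2, (byte) 235};
        System.out.println(toDotted(toInt(raw)));   // 192.0.2.235
        // The library performs the same conversion; no DNS lookup is involved here.
        System.out.println(InetAddress.getByAddress(raw).getHostAddress());
    }
}
```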


  • As data traverses the Web, each packet carries not only the address of the host but also the port on that host to which it is aimed.

  • 65,536 ports are available at each host.

  • A port does not represent anything physical like a serial or parallel port.

  • Hosts are responsible for reading the port number from the packets they receive to decide which program should process that data.


  • On Unix systems, ports between 1 and 1023 are reserved for privileged (system) processes.

  • Any process can listen for connections on ports 1024 to 65,535, as long as the port is not already occupied.

  • In Windows and Mac-OS, any process can listen on any port.
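The rule that a port can be used only while it is not already occupied can be observed with java.net.ServerSocket. This is a minimal sketch; the class name PortDemo and the helper canListenOn are invented for the example, and the exact failure reason reported by the operating system may vary.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortDemo {
    // Returns true if a listening socket could be opened on the given port.
    static boolean canListenOn(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false; // typically "address already in use" or "permission denied"
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system to pick any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();
            System.out.println("Listening on port " + port);
            // While the first socket is open, a second listener is normally refused.
            System.out.println("Second listener allowed: " + canListenOn(port));
        }
    }
}
```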

Well-Known Ports

  • Many services run on well-known ports.

    • Web HTTP servers listen for connections on port 80.

    • SMTP servers listen on port 25.

    • Echo servers listen on port 7.

    • FTP servers listen on port 21.

    • Telnet servers listen on port 23.

    • DayTime servers listen on port 13.

    • whois servers listen on port 43.

    • finger servers listen on port 79.

Client-Server Model

[Figure: a client application on the client machine connects to a server application on the server machine through port 5746]


  • However, it is inconvenient for people to remember IP addresses and ports.

  • In addition to an IP address, many hosts have a human-readable hostname.

  • Hostnames have a hierarchical structure.

    • For example, a hostname can refer to the host www in the computer science (cs) department of the Vienna University, which is an academic institution (ac) in Austria (at).

  • The rightmost part describes the main domain of the host. To its left is a sub-domain, and further left are more specific sub-domains.


  • There are generic domains

    • com commercial organizations

    • edu educational institutions

    • gov U.S. governmental organizations

  • Most countries use country domains:

    • il Israel

    • uk United Kingdom

    • jp Japan

DNS Servers

  • The mapping between hostnames and their corresponding IP addresses is done by DNS (the Domain Name System).

  • It is not feasible for the Web browser to hold a table mapping all the hostnames to their IP-addresses.

    • New hosts are added to the Web every day

    • Hosts change their names and IP addresses.

Web Protocols

  • A protocol is a special set of rules that endpoints (both clients and servers) in the Web use to handle communication.

    • Transmission Control Protocol (TCP) – to exchange messages with other endpoints at the information packet level.

    • Internet Protocol (IP) – to send and receive messages at the address level.

    • Hypertext Transfer Protocol (HTTP) – to deliver HTML, image, and sound files on the World Wide Web.

HTTP Protocol

  • HyperText Transfer Protocol

    • Used between Web clients (e.g., browsers) and Web servers

    • Text based

    • Built on top of TCP

  • Stateless protocol

    • No data about the communicating sides is stored between requests

HTTP Transaction - Request

  • Client sends a request that looks like

    • GET /index.html HTTP/1.0

      • GET is a keyword (the request method)

      • /index.html is the requested document

      • HTTP/1.0 is the protocol version that the client understands

    • The request always ends with an empty line, i.e. the sequence \r\n\r\n.

  • Client may send optional information

    • For example, a list of <keyword>: <value> headers

      • User-Agent: browser name

      • Accept: formats the browser understands

HTTP Transaction - Request

  • Request example:

    GET /index.html HTTP/1.0
    User-Agent: Lynx/2.4 libwww/2.1.4
    Accept: text/html
    Accept: text/plain

  • In addition to GET, clients can request

    • HEAD – Retrieve only the headers for the file

    • POST – Send data to the server

    • PUT – Upload a file to the server
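The text of such a request can be assembled as an ordinary string. The sketch below only builds the request and sends nothing; the class name HttpRequestText and its get method are illustrative assumptions. It uses the standard version token HTTP/1.0 and ends every line with \r\n, followed by one empty line.

```java
class HttpRequestText {
    // Builds the text of a GET request: request line, two headers, empty line.
    static String get(String path, String userAgent) {
        return "GET " + path + " HTTP/1.0\r\n"
                + "User-Agent: " + userAgent + "\r\n"
                + "Accept: text/html\r\n"
                + "\r\n"; // the empty line marks the end of the request
    }

    public static void main(String[] args) {
        String request = get("/index.html", "Lynx/2.4 libwww/2.1.4");
        // Make the line endings visible when printing.
        System.out.println(request.replace("\r\n", "\\r\\n\n"));
    }
}
```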

HTTP Transaction - Response

  • Server response

    • sends status line

      HTTP/1.0 200 OK

    • sends header information

      Content-type: text/html

      Content-length: 3022


    • sends a blank line (\r\n)

    • sends document contents (e.g., html file)
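A client can take a response apart along the same lines: status line, headers, blank line, body. The following is a simplified sketch that assumes a well-formed response held entirely in one string; the class name HttpResponseText and its fields are hypothetical and do not correspond to any library API.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class HttpResponseText {
    final int status;
    final String reason;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    HttpResponseText(String raw) {
        // The first blank line separates the status line and headers from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        String[] statusLine = lines[0].split(" ", 3); // version, code, reason phrase
        status = Integer.parseInt(statusLine[1]);
        reason = statusLine.length > 2 ? statusLine[2] : "";

        // Header names are case-insensitive, so they are stored in lower case.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT),
                        lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        HttpResponseText r = new HttpResponseText(
                "HTTP/1.0 200 OK\r\nContent-type: text/html\r\nContent-length: 3022\r\n\r\n<html>...</html>");
        System.out.println(r.status + " " + r.reason + " " + r.headers);
    }
}
```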

HTTP Transaction - Response

HTTP/1.1 200 OK

Date: Fri, 16 Apr 2004 18:48:13 GMT

Server: Apache/1.3.29 (Darwin)

Last-Modified: Fri, 16 Apr 2004 10:15:59 GMT

ETag: "58db37-89-407fb25f"

Accept-Ranges: bytes

Content-Length: 137

Connection: close

Content-Type: text/html

(blank line)

<img src="smiley.gif">


HTTP 1.0 response codes

  • 2xx Successful

    • response codes between 200 and 299 indicate that the request was understood and accepted

  • 200 OK – the most common response; indicates success

  • 201 Created – response to a successful POST request

  • 202 Accepted – the request has been accepted, but processing is not finished yet

  • 204 No Content – the server successfully processed the request, but has no content to send back

HTTP 1.0 response codes

  • 3xx Redirection

    • response codes between 300 and 399 indicate that the Web browser needs to go to a different page

  • 301 Moved Permanently – the page has moved to a new URL.

  • 302 Moved Temporarily – the page has moved temporarily to a new URL.

  • 304 Not Modified – returned for a GET request with an If-Modified-Since header when the requested file has not changed since the given date.

HTTP 1.0 response codes

  • 4xx Client Error

    • response codes between 400 and 499 indicate that the server received an erroneous request

  • 400 Bad Request – improper request

  • 401 Unauthorized – unauthorized request (need username & password)

  • 403 Forbidden – the server refuses to process the request

  • 404 Not Found – the server cannot find the requested page

HTTP 1.0 response codes

  • 5xx Server Error

    • response codes between 500 and 599 indicate that something has gone wrong on the server and the request cannot be fulfilled

  • 500 Internal Server Error – an unexpected error occurred at the server

  • 501 Not Implemented – the server does not have the feature that is needed to fulfill the request.

  • 502 Bad Gateway – applicable only to proxy servers

  • 503 Service Unavailable – the server is temporarily unable to handle the request (due to overload or maintenance)
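Because the first digit determines the class of a response, a client can classify any code with one integer division. A minimal sketch, with the class name StatusClass chosen only for this example:

```java
class StatusClass {
    // Maps a status code to its class by looking at the hundreds digit.
    static String describe(int code) {
        switch (code / 100) {
            case 2: return "Successful";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default: return "Other";
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 304, 404, 503};
        for (int code : samples) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```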

HTTP 1.1

  • HTTP 1.1 defines many more response codes.

  • HTTP 1.1 is an official standard (unlike 1.0).

  • Primary improvement of version 1.1

    • HTTP 1.0 opens a new connection for every request.

    • HTTP 1.1 allows a client to send many requests over a single connection that remains open until explicitly closed, which reduces overhead.

    • Requests and responses are asynchronous

      • Clients can send many requests without waiting for the response to one request before sending the next.

HTTP Daemons

  • How can servers recognize incoming requests?

    • Each server runs an HTTP daemon

      • Constantly running on the server

    • Clients request a service through the server's daemon

    • Technically, any host connected to the Web can act as a server by running an HTTP daemon

Client - HTTPD interaction

  • The user requests the document /index.html from a host.

  • The browser contacts the HTTP-daemon running on the host and requests the document /index.html

  • The HTTP-daemon translates the requested name to an access to a specific file in its local file-system.

  • The server reads the file index.html from its disk and sends its contents to the client.

  • The client receives the document, parses it and the browser displays it graphically.

Client - HTTPD interaction

[Figure: the user requests /index.html; the browser sends GET /index.html; the server sends the contents of the file]






Proxy Servers

  • Act as delegates of Web-browsers for accessing the Web.

  • The browser transfers the requests for a document to the Proxy Server

  • The Proxy Server contacts the relevant Web Server and fetches the document on behalf of the browser.

Proxy server

[Figure: the user requests a document; the browser requests the document from the proxy; the proxy asks the server for the document and sends the contents of the document back]





Proxy Servers

  • Proxy servers have several advantages over direct data access:

    • Security

      • Can be combined with a firewall to enable restricted access to the Web.

    • Communication

      • Enable caching of popular documents.

    • Portability

      • Perform 'mediation' between different network protocols

Dynamically Generated Documents

  • Many Web documents should be generated dynamically, upon requests from clients

    • News items

    • Web-based email

    • Personalized applications

  • The contents of these pages cannot be prepared "manually"

  • They are generated dynamically by Common Gateway Interface (CGI) programs

Dynamically Generated Documents

  • The HTTP request invokes a program on the server.

  • The program creates a new page on the fly and sends it to the client as a response.

  • This program may use details sent in the request in order to generate the page.

  • The CGI programs may be written in any language

    • Most popular are Perl and Java.

  • An HTTP server that receives a request for a CGI program usually runs that program in a new, independent process.
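The idea of building a page from details in the request can be sketched without a server. The example below is not a CGI program: it only shows the step of turning a query string such as what=something into an HTML page. The class name SearchPage, the parameter name what, and the page layout are assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class SearchPage {
    // Splits "a=1&b=2" into name/value pairs and URL-decodes each part.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Escapes the characters that would otherwise be read as HTML markup.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Generates the page "on the fly" from the request's query string.
    static String render(String query) {
        String what = parseQuery(query).getOrDefault("what", "");
        return "<html><body><h1>Results for: " + escapeHtml(what) + "</h1></body></html>";
    }

    public static void main(String[] args) {
        System.out.println(render("what=something"));
    }
}
```

Escaping the value before placing it in the page matters because the text comes from the client and should not be trusted as markup.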

Dynamically Generated Documents

[Figure: the user requests a search; the browser sends GET /search?what=something; the server executes a search program and sends the contents of the generated document]

The java.net Package

  • The java.net package contains classes that allow your programs to send and receive data across the Internet.

  • The java.net.InetAddress class represents an abstraction of network addresses.

    • Encapsulates an address

    • Contains methods to convert IP addresses to hostnames and vice versa.

Parsing InetAddresses

  • An InetAddress object can be represented by:

    • its host name as a string: public String getHostName()

    • its IP address as a string: public String getHostAddress()

    • its IP address as a byte array: public byte[] getAddress()

Creating InetAddress objects

// using hostname
try {
    InetAddress address = InetAddress.getByName("");
    System.out.println(address);
} catch (UnknownHostException e) {
    System.out.println("Could not find!");
}

// using IP address
try {
    InetAddress address = InetAddress.getByName("");
    System.out.println(address);
} catch (UnknownHostException e) {
    System.out.println("Could not find!");
}

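Given address, find a hostname

try {
    InetAddress ia = InetAddress.getByName("");   // the IP address, as a string
    System.out.println(ia.getHostName());         // looks up the matching hostname
} catch (Exception e) {
    System.err.println(e);
}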



Hosts with multiple addresses

try {
    InetAddress[] addresses = InetAddress.getAllByName("");
    for (int i = 0; i < addresses.length; i++) {
        System.out.println(addresses[i]);
    }
} catch (UnknownHostException e) {
    System.out.println("Could not find");
}


Local Host

try {
    InetAddress address = InetAddress.getLocalHost();
    System.out.println(address);
} catch (UnknownHostException e) {
    System.out.println("Could not find this computer's address.");
}



  • Returns an InetAddress object that contains the address of the computer the program is running on.

  • In addition, local host may be accessed through an IP address

InetAddress.equals()

try {
    InetAddress oreilly = InetAddress.getByName("");
    InetAddress helio = InetAddress.getByName("");
    // equals() is true when both objects hold the same IP address
    if (oreilly.equals(helio)) {
        System.out.println(" is the same as");
    } else {
        System.out.println(" is not the same as");
    }
} catch (UnknownHostException e) {
    System.out.println("Host lookup failed.");
}


Parsing InetAddresses Example

try {
    InetAddress me = InetAddress.getLocalHost();
    System.out.println("My name is " + me.getHostName());
    System.out.println("My address is " + me.getHostAddress());
    byte[] address = me.getAddress();
    for (int i = 0; i < address.length; i++) {
        // Java bytes are signed; mask with 0xFF to print values 0-255
        System.out.print((address[i] & 0xFF) + " ");
    }
    System.out.println();
} catch (UnknownHostException e) {
    System.err.println("Could not find local address");
}


The java.net.URL Class

  • The java.net.URL class represents a URL.

  • Accessing a document through a URL object hides protocol-dependent operations.

  • A protocol handler is responsible for communicating with the server:

    • handles any necessary negotiation with the server

    • returns the actual contents of the requested file.

The java.net.URL Class

  • When a URL object is constructed:

    • Java looks for the appropriate protocol handler (such as "http" or "mailto").

    • The protocol is presumed to be part of the URL.

    • If no such handler is found, the constructor throws a MalformedURLException.

  • JDK 1.1 supports 10 protocols:

    • file, ftp, gopher, http, mailto, appletresource, doc, netdoc, systemresource, verbatim

Constructing URL Objects

  • Java provides 4 constructors:

    • public URL(String u) throws MalformedURLException

    • public URL(String protocol, String host, String file) throws MalformedURLException

    • public URL(String protocol, String host, int port, String file) throws MalformedURLException

    • public URL(URL context, String u) throws MalformedURLException

Constructing URL Objects

URL u = null;
try {
    u = new URL(" webp/index.html#Info");
} catch (MalformedURLException e) {
    e.printStackTrace();
}

URL u = null;
try {
    u = new URL("http", "", "/courses/webp/index.html#Info");
} catch (MalformedURLException e) {
    e.printStackTrace();
}

Constructing URL Objects

URL u = null;
try {
    u = new URL("http", "", 80, "/courses/webp/index.html#Info");
} catch (MalformedURLException e) {
    e.printStackTrace();
}

URL u1 = null, u2 = null;
try {
    u1 = new URL("http", "", "/courses/webp/index.html#Info");
    u2 = new URL(u1, "hw1.doc");   // resolved relative to u1
} catch (MalformedURLException e) {
    e.printStackTrace();
}

Parsing URLs

  • The URL class has 5 methods to split a URL into its component parts.

    URL u = null;
    try {
        u = new URL(" webp/index.html#Info");
    } catch (MalformedURLException e) {
        e.printStackTrace();
    }
    System.out.println("Protocol is " + u.getProtocol());
    System.out.println("Host is " + u.getHost());
    System.out.println("Port is " + u.getPort());
    System.out.println("File is " + u.getFile());
    System.out.println("Anchor is " + u.getRef());

Parsing URLs

  • If a port is not explicitly specified in the URL, getPort() returns -1.

    • This does not mean that the connection is attempted on port -1 (which does not exist).

    • It means that the default port of the protocol (80 for HTTP) is used.

  • If the anchor does not exist, getRef() returns null, so watch out for NullPointerExceptions.
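Both pitfalls can be handled with a couple of small helpers. This is a sketch under stated assumptions: the class name UrlPorts, the helper names, and the host www.example.com are illustrative only. getDefaultPort() is a standard method of java.net.URL that reports the protocol's default port. The URL is created through URI.create(...).toURL() rather than a constructor, which performs the same parsing and works the same way offline.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlPorts {
    // Parses a URL, turning the checked exception into an unchecked one.
    static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // The port that will actually be contacted: explicit port, or the protocol default.
    static int effectivePort(URL u) {
        return u.getPort() != -1 ? u.getPort() : u.getDefaultPort();
    }

    // Null-safe access to the anchor part.
    static String anchorOrEmpty(URL u) {
        String ref = u.getRef();
        return ref != null ? ref : "";
    }

    public static void main(String[] args) {
        URL u = parse("http://www.example.com/index.html");
        System.out.println("getPort():       " + u.getPort());        // -1
        System.out.println("effective port:  " + effectivePort(u));   // 80
        System.out.println("anchor is empty: " + anchorOrEmpty(u).isEmpty());
    }
}
```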

Reading Data from a URL

public final InputStream openStream() throws IOException

  • The openStream() method opens a connection to the specified URL.

  • This makes it possible to download data from the URL.

  • Any headers that precede the actual data are stripped off before the stream is opened.

Reading Data from a URL

// Requires java.io.* and java.net.*
try {
    URL u = new URL(args[0]);
    InputStream in = u.openStream();
    in = new BufferedInputStream(in);
    Reader r = new InputStreamReader(in);
    int c;
    while ((c = r.read()) != -1) {
        System.out.print((char) c);
    }
} catch (MalformedURLException e) {
    System.err.println("unparseable URL");
} catch (IOException e) {
    System.err.println(e);
}




  • openConnection() opens a socket (to be defined later) to the server.

    • The socket allows direct communication with the server.

  • In particular, it gives access to everything sent by the server: the document, protocol headers, etc.

Reading Data from a URL

// Requires java.io.* and java.net.*
try {
    URL u = new URL(args[0]);
    URLConnection uc = u.openConnection();
    InputStream in = uc.getInputStream();
    Reader r = new InputStreamReader(in);
    int c;
    while ((c = r.read()) != -1) {
        System.out.print((char) c);
    }
} catch (MalformedURLException e) {
    System.err.println("unparseable URL");
} catch (IOException e) {
    System.err.println(e);
}



Using the Connection

try {
    URL u = new URL(args[0]);
    URLConnection uc = u.openConnection();
    System.out.println("Content-type: " + uc.getContentType());
    System.out.println("Content-encoding: " + uc.getContentEncoding());
    System.out.println("Content-length: " + uc.getContentLength());
    System.out.println("Date: " + new Date(uc.getDate()));
    System.out.println("Last modified: " + new Date(uc.getLastModified()));
    System.out.println("Expiration date: " + new Date(uc.getExpiration()));
} catch (Exception e) {
    e.printStackTrace();
}


  • getContent() returns the downloaded data as an object.

    • An HTML or text file usually becomes some sort of InputStream object.

    • An image such as a GIF or JPEG becomes some sort of java.awt.image.ImageProducer object.

    • The result can be cast to the appropriate type.

  • getContent() uses the content-type field in the header of the data received from the server.


try {
    URL u = new URL(args[0]);
    try {
        Object o = u.getContent();
        System.out.println("I got a " + o.getClass().getName());
    } catch (IOException e) {
        System.err.println(e);
    }
} catch (MalformedURLException e) {
    System.err.println("unparseable URL");
}


class URLEncoder

  • The URLEncoder class contains a utility method for converting a String into the "x-www-form-urlencoded" format.

  • To convert a String, each character is examined:

    • Characters 'a' to 'z', 'A' to 'Z', '0' to '9', and the characters '.', '-', '*', '_' remain the same.

    • The space character is converted into a plus sign '+'.

    • Other characters are converted into a 3-character string %xx, where xx is the two-digit hexadecimal representation of the character.

  • String encode(String s) translates a string into x-www-form-urlencoded format.

class URLDecoder

  • The URLDecoder class is the counterpart of URLEncoder.

  • String decode(String s) translates a string in x-www-form-urlencoded format back into plain text.
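A short round trip shows the conversion rules in action. The wrapper class FormEncoding and the sample text are assumptions for this sketch; it uses the two-argument forms encode(String, String) and decode(String, String) of the standard classes, which take the character encoding explicitly.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormEncoding {
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "java networking & I/O";
        String encoded = encode(original);
        System.out.println(encoded);           // java+networking+%26+I%2FO
        System.out.println(decode(encoded));   // java networking & I/O
    }
}
```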

Using Lycos Search Engine

import java.io.*;
import java.net.*;

public class LycosUser {
    public static void main(String[] args) {
        // Join the command-line arguments into one query string
        String querystring = "";
        for (int i = 0; i < args.length; i++) {
            querystring += args[i] + " ";
        }
        querystring = querystring.trim();
        querystring = "query=" + URLEncoder.encode(querystring);

Using Lycos Search Engine

        try {
            String thisLine;
            URL u = new URL(" bin/pursuit?" + querystring);
            // BufferedReader replaces the deprecated DataInputStream.readLine()
            BufferedReader retHTML = new BufferedReader(new InputStreamReader(u.openStream()));
            while ((thisLine = retHTML.readLine()) != null) {
                System.out.println(thisLine);
            }
        }

catch (Exception e){