
Web Technologies

Kevin McManus

WWW Architecture

  • Client
    • Platform: Windows, Mac, Unix, etc.
    • Browser: IE, Mozilla, Opera, etc.
  • Network: HTTP over TCP/IP
    • Request: http://www.gre.ac.uk/about/
    • Response: <html>…</html>
  • Server
    • Platform: Windows, Mac, Unix, etc.
    • Web Server: Apache, IIS, Xitami, etc.

the University of Greenwich

WWW Architecture
  • Client-Server Request-Response architecture
    • You request a web page
      • e.g. http://www.gre.ac.uk/about/index.html
      • HTTP request
    • The web server responds with data
      • HTTP response
      • usually in the form of a web page (HTML document)
        • could be any file format
      • web page is written using HyperText Markup Language (HTML)
    • Web pages are identified by a Uniform Resource Locator (URL)
      • protocol: e.g. http
      • web server: e.g. www.gre.ac.uk
        • [machine name].[domain name]
      • web page: e.g. about/index.html

Internet Standards
  • Internet Engineering Task Force (IETF)

http://www.ietf.org/

    • founded 1986
    • a large open international community of network designers, operators, vendors, and researchers concerned with the evolution of the Internet architecture and the smooth operation of the Internet
    • open to any interested individual or organisation
    • establishes standards through working groups, mailing lists and Request For Comments (RFC) documents

http://www.ietf.org/rfc.html

Web Standards
  • World Wide Web Consortium (W3C)

http://www.w3.org

    • founded 1994 by Tim Berners-Lee
      • currently hosted by MIT, ERCIM and Keio University
      • open to corporate membership
    • develops interoperable technologies to lead the Web to its full potential
      • produces specifications, standards, technical reports, guidelines, software, and tools
    • a forum for information, commerce, communication, and collective understanding
    • home of the Web Accessibility Initiative (WAI)
      • access for all

Internet / Web
  • The Internet (capital I) is a globally interconnected network of computers
    • employs a wide variety of communication technologies
      • wires, fibres, satellites and so on
    • supports a number of protocols
  • The Web (capital W) is a globally interconnected network of information
    • hypertext documents
    • using the Internet to connect information

Clients & Servers: Similarities
  • Client and server computers both usually have:
    • hardware
      • Central Processing Unit (CPU)
        • e.g. Intel Pentium, AMD Athlon, IBM PPC, Sun Sparc
      • memory
      • I/O
        • Visual Display Unit (VDU), storage (fixed, removable), network
      • bus to connect it all together
    • software
      • multi-tasking operating system
        • Unix, Linux, NT, XP
      • file system
      • applications

Clients & Servers: Differences
  • Clients
    • generally support a single user
    • optimized for responsiveness to user
    • have a user interface, graphics
    • have client applications
      • e.g. web browsers
  • Servers
    • support multiple users
    • optimized for throughput
    • more: CPUs (SMP), memory, disks (SANs), I/O
    • provide services
      • e.g. web, file, print, database, e-mail, telnet, directory
    • provide a high quality of service
      • RAID, UPS, redundant power supplies, hot swap devices

Web Browser
  • Client-side application
    • also known as a User Agent
  • Requests resources from web servers
    • knows how to parse and render HTML
    • may know how to handle images
    • may know how to handle client-side scripts
    • may use plug-ins to handle other media formats
  • Popular browsers
    • have a graphical interface
      • Mosaic - developed at NCSA circa 1992
      • Modern browsers include: Internet Explorer, Mozilla, Netscape, Opera
    • compatibility is an issue
  • Uncommon browsers
    • text only – Lynx
    • assistive technologies – Jaws, IBM Home Page Reader

Web Server
  • A program running on a server
  • Services HTTP requests from web clients
    • accepts the request
    • returns the requested resource (if it can)
      • HTTP response
      • usually an HTML web page
  • Originally developed as the HTTP daemon (HTTPd) at NCSA
    • circa 1993, following work by Tim Berners-Lee at CERN
  • Configurable
    • deny or grant requests
    • provide virtual hosts
  • Logs requests and responses
  • Popular servers
    • Apache (70%), Microsoft Internet Information Server (IIS), Sun One
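
The accept-and-respond cycle described above can be sketched with Python's standard library. This is a minimal illustration, not how Apache or IIS work internally: the handler name `PageHandler`, the page content and the helper `fetch_once` are all hypothetical, and the server only listens on the loopback address for a single request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content served for /index.html
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Return the requested resource if we can, otherwise an error status
        if self.path == "/index.html":
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # a real server would log each request; silenced here


def fetch_once(path):
    """Start a throwaway server, send one GET request, return the response."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    server.timeout = 5  # do not wait forever if no request arrives
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        connection.close()
        return result
    finally:
        worker.join()
        server.server_close()
```

Calling `fetch_once("/index.html")` returns the status code, the content type and the page body; any other path yields a 404 status.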

Proxy Servers & Firewalls
  • Proxy Server
    • a server that sits between a client and the Internet
    • improves performance by caching frequently accessed resources
      • essential to achieve scalability of the Web
    • can filter requests to prevent access to certain web sites
      • used to implement censorship
    • can alter the client's request or the server's response
      • useful but open to abuse
  • Firewall
    • a piece of hardware and/or software that prevents communications forbidden by a security policy
      • controls network traffic between different zones of trust
      • prevents unauthorized access to a network from the Internet

Networks
  • Network = an interconnected collection of independent computers
  • Why have networks?
    • resource sharing, information sharing, reliability, communication
  • Networked computers can offer so much more than isolated machines
  • Web technologies add:
    • global information sharing: search engines, wikis
    • applications that do not require a client-side installation
    • new business models: e-commerce
    • new education models: e-learning
    • new ways of organising society: e-government, social networks
    • entertainment: radio and television streaming and podcasts

Networks
  • Network scope
    • internet: a collection of connected networks
    • Internet: a specific world-wide network based on IP, used to connect companies, universities, governments, organizations and individuals.
      • grew from ARPANET, funded by the US DoD.
    • intranet: a network based on Internet technologies that is internal to a company or organization
    • extranet: a network based on Internet technologies that connects one company or organization to another

Network Protocol Stack

  • Application layer: HTTP
  • Transport layer: TCP
  • Internet layer: IP
  • Physical layer: Ethernet

Networks: Physical Layer
  • Defines the physical specifications for devices
    • electrical, optical, electromagnetic, dimensional
  • Establishes connections to a medium of communication
    • copper wire, fibre-optic, wireless
  • Ethernet is now the most common implementation
    • (actually the data link layer in the OSI model)
    • many variations on ethernet
    • the Address Resolution Protocol (ARP) maps IP addresses to Media Access Control (MAC) addresses on the local network
      • 48-bit address of an ethernet controller
      • must be unique on a subnet
      • usually permanently set at point of manufacture

Networks: Internet Layer
  • Internet Protocol (IP)
  • Responsible for communicating packets from source to destination
    • across multiple network hops
  • Not guaranteed to be reliable
  • IP address: 32 bit value usually written in dotted decimal notation as four 8-bit numbers (0 to 255) e.g. 130.51.12.4
    • globally unique
      • for computers connected to the Internet
    • limited number of addresses – only 4 billion!
    • Network Address Translation (NAT) used to increase capacity
  • IPv6 provides increased number of addresses
    • 128 bit addresses
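
The dotted decimal notation above is just a readable spelling of one 32-bit number. A small sketch using Python's standard `ipaddress` module shows the arithmetic; the function name `describe_ipv4` is an assumption made for this illustration.

```python
import ipaddress


def describe_ipv4(text):
    """Return the four octets, the 32-bit value, and the value rebuilt by hand."""
    address = ipaddress.IPv4Address(text)
    octets = [int(part) for part in text.split(".")]
    as_int = int(address)
    # Each octet occupies 8 bits, most significant octet first
    rebuilt = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    return octets, as_int, rebuilt


octets, as_int, rebuilt = describe_ipv4("130.51.12.4")
print(octets, as_int, rebuilt)
print(2 ** 32)   # size of the IPv4 address space, roughly 4 billion
print(2 ** 128)  # size of the IPv6 address space
```

For 130.51.12.4 both the library conversion and the hand-built shift give 2184383492.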

Networks: Transport Layer
  • Provides an efficient, reliable and cost-effective service
  • Uses the sockets programming model
  • Port numbers are used to identify the application
    • well-known ports identify standard services e.g. HTTP uses port 80, SMTP uses port 25
    • can use other port numbers – if they are free

e.g. http://fred.foo.net:8080/bar/myfile.html

  • Transmission Control Protocol (TCP)
    • connection-oriented byte stream
    • guaranteed reliability
  • User Datagram Protocol (UDP)
    • connectionless
    • no guarantee but lower overheads
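
As a rough illustration of the sockets model and port numbers, the sketch below opens a TCP connection over the loopback interface and sends a few bytes each way. The function names and the upper-casing "service" are made up for the example; the port is chosen by the operating system rather than being a well-known port.

```python
import socket
import threading


def echo_upper_once(server_socket):
    """Accept one connection, read some bytes, reply with them upper-cased."""
    connection, _address = server_socket.accept()
    with connection:
        data = connection.recv(1024)
        connection.sendall(data.upper())


def tcp_round_trip(message):
    """Send message to a throwaway local TCP service and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        server.listen(1)
        port = server.getsockname()[1]  # the port number identifies the service
        worker = threading.Thread(target=echo_upper_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply
```

`SOCK_STREAM` selects TCP, a connection-oriented byte stream. Swapping in `SOCK_DGRAM` would select UDP, which has no connection to accept and no delivery guarantee.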

Networks: Application Layer
  • Telnet - remote terminal
  • File Transfer Protocol (FTP)
  • Network News Transfer Protocol (NNTP)
  • Simple Network Management Protocol (SNMP)
  • Simple Mail Transfer Protocol (SMTP)
  • Post Office Protocol (POP)
  • Internet Message Access Protocol (IMAP)
  • Secure Shell (SSH) – secure terminal
  • Hypertext Transfer Protocol (HTTP) – the principal protocol of the World Wide Web

Hypertext Transfer Protocol HTTP
  • a.k.a. Hypertext Transport Protocol
  • HTTP is a simple stateless request-response protocol
  • A web client (user agent) requests a resource identified by a uniform resource locator (URL)
  • The web server identified in the URL responds with the file identified in the URL
    • the file may contain static data
      • HTML pages, GIFs, JPEGs, Microsoft Word documents, Adobe PDF documents, etc., etc.
    • the file may be a program that runs on the server to output data
      • ASP, PHP, Perl, JSP, etc., etc.
  • HTTP/1.0 highly successful
  • HTTP/1.1 introduced to address flaws in 1.0 and improve network performance
    • pipelining requests and responses

HTTP Methods
  • GET, POST, HEAD, PUT, DELETE, TRACE, OPTIONS, CONNECT
  • GET and POST are both used to...
    • request a resource from a server
    • send data with the request
      • as name value pairs
  • GET appends name value pairs to the URL
    • visible in the browser
    • can be bookmarked and cached
    • safe, idempotent
  • POST sends name value pairs after the HTTP header
    • not cached
    • can carry larger payload
  • Differences between GET and POST are subtle and significant
    • we will look closely at this later
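
One way to see where the name value pairs travel is to build both kinds of request without sending them. This sketch uses `urllib` from the Python standard library; the URL is a made-up example host and the helper names are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical script URL, used only to build request objects (nothing is sent)
BASE_URL = "http://w3.foo.net:8080/bar/index.php"


def build_get(params):
    """GET: the name value pairs are appended to the URL as a query string."""
    return Request(f"{BASE_URL}?{urlencode(params)}")


def build_post(params):
    """POST: the same pairs travel in the request body, after the headers."""
    body = urlencode(params).encode("ascii")
    return Request(BASE_URL, data=body)


pairs = {"fruit": "plum", "user": "joe"}
get_request = build_get(pairs)
post_request = build_post(pairs)
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

The GET URL carries `?fruit=plum&user=joe` and so can be bookmarked, while the POST URL stays bare and the pairs sit in `post_request.data`.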

HTTP Request

Request line (method, file name, HTTP version):

GET /~k.mcmanus/index.html HTTP/1.1

Headers:

Host: staffweb.cms.gre.ac.uk
Connection: close
Accept: text/xml,text/html,text/plain,image/png,*/*
Accept-Language: en-gb,en
User-Agent: Mozilla/4.0 (compatible;MSIE 6.0;Windows NT 5.0)
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*
If-Modified-Since: Mon, 18 Sep 2006 22:57:19 GMT
Referer: http://web-sniffer.net

Blank line

Data: none for GET
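
The layout of a request (request line, headers, blank line, optional data) is simple enough to take apart by hand. The sketch below is a simplified parser written for illustration only; `parse_request` is a hypothetical helper, it uses a shortened copy of the request, and it ignores details such as repeated headers that a real server must handle.

```python
# A shortened copy of the request; lines end with CRLF as on the wire
RAW_REQUEST = (
    "GET /~k.mcmanus/index.html HTTP/1.1\r\n"
    "Host: staffweb.cms.gre.ac.uk\r\n"
    "Connection: close\r\n"
    "Accept-Language: en-gb,en\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into method, target, version, headers, body."""
    # The first blank line separates the headers from the data
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


method, target, version, headers, body = parse_request(RAW_REQUEST)
print(method, target, version)
print(headers["host"])
```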

HTTP Response

Status line (HTTP version, status code, reason phrase):

HTTP/1.0 200 OK

Headers:

Date: Thu, 21 Sep 2006 22:06:05 GMT
Server: Apache/1.3.33 (Unix) PHP/4.3.10
Connection: close
Content-Type: text/html
ETag: "5d150-141c-450f244f"
Last-Modified: Mon, 18 Sep 2006 22:57:19 GMT
Content-Length: 5184

Data:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict
<html xmlns="http://www.w3.org/1999/xhtml">
...
</html>

HTTP Server Status Codes
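
Status codes are three-digit numbers whose first digit gives the class of response. As a small sketch, Python's standard `http.HTTPStatus` enumeration can look up the reason phrase for a code; the `CLASSES` table and the `describe` helper are assumptions made for this example.

```python
from http import HTTPStatus

# The first digit of a status code identifies its class
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code, its standard reason phrase and its class."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({CLASSES[code // 100]})"


for code in (200, 304, 404, 500):
    print(describe(code))
```

A 304 Not Modified response is what a server may send when the If-Modified-Since date in a request is not older than the resource.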

HTTP
  • HTTP is a stateless protocol
  • Each HTTP request is independent of previous and subsequent requests
    • HTTP/1.0 defaults to Connection: close
      • closes the channel of communication immediately after a response
    • Connection: keep-alive was introduced to enable persistent connections
      • no need to re-negotiate a connection for each request
      • a connection can be re-used for multiple requests
    • HTTP/1.1 defaults to keep-alive for efficiency
      • supports pipelining to allow multiple requests to be sent on one connection without waiting for each response
  • The stateless nature of HTTP has a big impact on how web applications are designed
    • we will look very closely at this

State Preservation
  • State preservation mechanisms come in three basic variations:
    • cookies
      • store a small amount of information on the client
      • sent to the server at each HTTP request
    • session variables
      • a unique identifier is used to associate information stored on the server with a particular client
    • passing data at each request-response cycle
      • store information in the web page
      • appending data to a URL
      • hidden fields in HTML forms
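
The first two mechanisms are often combined: the cookie holds only a unique identifier and the server keeps the data. The sketch below assumes an in-memory dictionary as the session store and made-up helper names; a real application would also need expiry and secure cookie attributes.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side store: session identifier -> data for that client (in memory only)
sessions = {}


def start_session(data):
    """Store data on the server and return a Set-Cookie header for the client."""
    session_id = secrets.token_hex(16)  # hard-to-guess unique identifier
    sessions[session_id] = dict(data)
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    return cookie.output(header="Set-Cookie:")


def load_session(cookie_header):
    """Look up the stored data from the Cookie header sent with a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return sessions.get(morsel.value)


set_cookie = start_session({"user": "joe"})
# The browser would echo the name=value part back in a Cookie header
cookie_value = set_cookie.split(": ", 1)[1]
print(load_session(cookie_value))
```

Each request still stands alone as far as HTTP is concerned; the state lives in the cookie and the server-side store.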

HTTPS
  • A secure version of HTTP
    • syntactically identical to HTTP
  • Allows client and server to exchange data with confidence that the data was neither modified nor read by a third party during transmission
    • essential when communicating sensitive information over the Web
  • Implements HTTP over Secure Sockets Layer (SSL)
    • SSL has been superseded by the Transport Layer Security protocol (TLS)
    • uses public key cryptography to authenticate the server and agree a session key, which then encrypts data during transmission

URIs, URLs and URNs
  • Uniform Resource Identifier (URI = URL or URN)
    • generic term for all resource names and addresses
  • Uniform Resource Locator (URL)
    • a set of URI schemes that have explicit instructions on how to access a resource over the Internet
    • globally unique
  • Uniform Resource Name (URN)
    • a URI that has an institutional commitment to availability and persistence
      • http://www.w3.org/Addressing
      • http://www.w3.org/Addressing/URL/5_BNF.html
  • http://w3.foo.net:8080/bar/index.php?fruit=plum&user=joe
  • [protocol]://[host]:[port]/[file path]?[arg]=[val]&[arg]=[val]
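
The bracketed pattern above maps directly onto what a URL parser returns. A short sketch with Python's standard `urllib.parse`; the dictionary keys simply reuse the labels from the pattern and `split_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def split_url(url):
    """Break a URL into the parts named in the pattern above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "file path": parts.path,
        "args": parse_qs(parts.query),  # each argument maps to a list of values
    }


print(split_url("http://w3.foo.net:8080/bar/index.php?fruit=plum&user=joe"))
```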

Multipurpose Internet Mail Extensions MIME
  • Originally designed for email, also used for HTTP
  • Tells the browser how to interpret the incoming data
  • Defines types of data/documents
    • text - text/plain, text/html, text/xml
    • image formats - image/gif, image/jpeg
    • audio formats - audio/x-aiff, audio/mpeg
    • binary data - application/octet-stream
  • Applied by the web server according to the filename extension

e.g. a file called daisy.png will be sent with a mime type image/png
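
The extension-to-type lookup can be tried with Python's standard `mimetypes` module, which holds a table similar in spirit to the one a web server consults. This is only an illustration; each web server has its own configuration, and `page1.html` is just an example filename.

```python
import mimetypes

# guess_type returns a (type, encoding) pair based on the filename extension
for filename in ("daisy.png", "page1.html"):
    mime_type, _encoding = mimetypes.guess_type(filename)
    print(filename, "->", mime_type)
```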

Domain Name System DNS
  • Human-friendly domain names, gre.ac.uk
  • Globally unique identification of computers

bukowski.gre.ac.uk

  • Hierarchical name space with a limited set of top-level domains
    • organisational: .com .net .gov .edu .org .mil
    • national: .uk .jp .de .fr .tv etc.
    • Internet Corporation For Assigned Names and Numbers (ICANN) assumes responsibility for global coordination of the namespace
    • ICANN assigns control of each namespace to a registration authority
      • e.g. VeriSign for .com, Nominet for .uk
    • the Joint Academic Network (JANET) acts as authority for .ac.uk
    • JANET devolves authority for .gre.ac.uk to the University of Greenwich

Domain Name System DNS
  • DNS servers map domain names to IP addresses
    • usually using the Berkeley Internet Name Domain (BIND) software
    • actually mapping fully qualified machine names to IP addresses
  • Web client contacts its local DNS server to translate the domain part of a URL into an IP address
    • If the local DNS server cannot resolve the address then the request is passed to DNS at the next level of controlling authority
    • resolved addresses are cached by the local DNS server
      • and by the browser
    • The browser can then send an HTTP request to the IP address

http://staffweb.cms.gre.ac.uk/~mk05/page1.html

Is the same as…

http://193.60.76.168/~mk05/page1.html
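
Name resolution and caching can be sketched in a few lines. The example below asks the operating system's resolver for an address and remembers the answer; `resolve` is a hypothetical helper, and unlike a real DNS cache this one never expires its entries. It resolves `localhost` so that it runs without a network connection.

```python
import socket
from functools import lru_cache


@lru_cache(maxsize=256)
def resolve(hostname):
    """Translate a machine name to an IPv4 address, caching the answer."""
    return socket.gethostbyname(hostname)


print(resolve("localhost"))  # first call goes to the resolver
print(resolve("localhost"))  # second call is answered from the cache
print(resolve.cache_info())
```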

Hypertext
  • Conventional text has a single linear narrative path
    • time line
    • beginning – middle - end
  • Hypertext
    • multiple paths
    • may be read in any order
      • possibly an inappropriate order
    • not a new concept
      • an indexed or referenced document
        • encyclopaedia, academic text
    • Computers are very good at traversing indexes
      • first computer hypertext system developed by IBM in 1968
        • required a mainframe computer
      • first popular system Apple HyperCard in 1987

HyperText Markup Language HTML
  • Originally defined by Tim Berners-Lee circa 1992
    • further developed by the IETF
      • simplified version of the Standard Generalized Markup Language (SGML)
    • an international standard (ISO) HTML4.01 1999
    • later specifications are maintained by the W3C
  • Tag based markup
    • tags define the structure of a page
      • metadata describing how to render the page
        • headings, paragraphs, lists, etc.
      • tags can have attributes
        • provide extra clues about page rendering

e.g. colour, font, size, decoration

      • anchor tags link to other (parts of) pages
        • hypertext

HTML

<HTML>
  <HEAD>
    <TITLE>page1.html</TITLE>
  </HEAD>
  <BODY BGCOLOR="#FFFFDD">
    <H1>Simple Example HTML page</H1>
    <P>
      This <I>paragraph</I> contains an anchor tag<BR>
      <A HREF="page2.html">click here for the next page</A>
    </P>
  </BODY>
</HTML>

Cascading Style Sheets CSS
  • Rules to control HTML web page rendering in the web browser
    • provides greater styling control than HTML
  • Author styles
    • external style sheets
      • one style sheet can be used with many web pages
      • one web page can use many style sheets
      • improves consistency of style across the pages of a web site
      • easier updating and maintenance of the web site
    • embedded styles
      • rules embedded in the head of an HTML page
    • inline styles
      • rules as attributes in individual HTML tags
  • User styles and user agent styles
    • applied by the user to cater for their individual needs
  • Style rules cascade
    • inherited from parent tag to child tag
    • from external to embedded to inline

Client Side Scripting
  • Executable script embedded into HTML pages
  • Parsed and executed by the web client
  • Usually JavaScript
    • native support in most web clients
  • Script may be included as:
    • an external file
    • embedded in the page head
    • inline with the page content
  • Can access and operate on page contents
    • using the Document Object Model (DOM)
  • Can respond to events in the browser
    • e.g. onClick, onMouseOver, onKeyPress
  • Used to enhance the user experience
    • e.g. image rollovers, form data validation

Extensible Markup Language XML
  • Simplified subset of SGML
  • A meta-language - extensible
    • a language for defining other languages

e.g. XHTML, MathML, SVG, RSS, RDF

  • Represents hierarchical data
    • tree structure
    • human and machine readable format
  • Useful for data exchange and transformation
  • Facilitates separation of content from presentation
  • Enabling technology for web services and the semantic web
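
To make the tree structure concrete, the sketch below parses a tiny XML document with Python's standard `xml.etree.ElementTree`. The `basket` vocabulary is invented for the example, which is exactly what a meta-language allows.

```python
import xml.etree.ElementTree as ET

# A made-up XML vocabulary: one root element containing child elements
DOCUMENT = """<basket owner="joe">
  <fruit colour="purple">plum</fruit>
  <fruit colour="yellow">banana</fruit>
</basket>"""


def fruit_by_colour(xml_text):
    """Walk the tree and map each fruit's colour attribute to its text."""
    root = ET.fromstring(xml_text)
    return {item.get("colour"): item.text for item in root.findall("fruit")}


print(fruit_by_colour(DOCUMENT))
```

The same document is readable by a person and by a program, which is what makes the format convenient for data exchange.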

Extensible Hypertext Markup Language XHTML
  • XHTML is HTML reformulated to conform to XML
    • XHTML 1.0 first published in 2000
      • three variants – transitional, frameset, strict
    • XHTML 1.1 became a W3C recommendation in 2001
      • strict and modular
  • Many HTML tags and attributes are deprecated
    • all styling is deprecated from strict XHTML
  • Strict syntax forces separation of content from presentation
    • XHTML tags describe only the page structure
      • greatly simplifies page markup
    • CSS is used to provide presentation
    • cleaner code improves legibility, maintenance and accessibility
    • recommended by WAI

Multipart HTML documents
  • It is usual for HTML documents to be composed from several component parts such as
    • CSS
    • JavaScript
    • images
    • media - audio, video, Shockwave (Flash movies)
    • applets – small Java applications
  • Each component part has to be downloaded from a web server
    • multiple HTTP requests are required to download a single web page
      • HTTP/1.1 can pipeline these requests
      • components are not necessarily from the same web server

Server Side Scripting
  • Application program running on the web server
    • output is returned to the web browser
      • usually HTML
  • Can access resources on the server
    • files, databases
  • Common Gateway Interface (CGI)
    • standard way to allow programs to run on the web server
    • often Perl scripts
      • may be written in any language the server supports
    • output from the program (STDOUT) is routed through the web server back to the client
  • Web server scripting environments
    • executable script embedded into HTML pages

e.g. Active Server Pages (ASP), PHP: Hypertext Preprocessor (PHP), JavaServer Pages (JSP), Server Side Includes (SSI)
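
A CGI program is an ordinary program that writes an HTTP header block, a blank line and then the page to standard output. The sketch below is in Python rather than Perl; `cgi_response` and the `fruit` parameter are assumptions for illustration. Under CGI the web server passes the query string in the `QUERY_STRING` environment variable.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(query_string):
    """Build the text a CGI program would write to STDOUT."""
    params = parse_qs(query_string)
    fruit = params.get("fruit", ["unknown"])[0]
    # Escape user input before placing it in the page
    body = f"<html><body><p>You chose {html.escape(fruit)}</p></body></html>"
    # Headers, then a blank line, then the data
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING before running the program
    sys.stdout.write(cgi_response(os.environ.get("QUERY_STRING", "")))
```

The web server relays whatever the program prints back to the browser as the HTTP response.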


Web Services: Evolution of the Web

  • Generation 1: Static HTML
  • Generation 2: Web Applications
  • Generation 3: Web Services

(diagram labels: HTML, XML)

Evolution of the Web
  • The Web was originally conceived to serve static HTML pages over HTTP
  • In a short period of time many technologies were introduced and developed to provide dynamic, interactive web pages and stateful web applications
  • In an even shorter period of time the Web has dramatically changed many of the ways in which we work, relax and function as individuals and as a society
  • Web technologies continue to advance to support service oriented architectures and the semantic web
  • From this soup of technologies Web 2.0 has evolved

Evolution of the Web

  • Generation 4: Web 2.0
    • user generated content
    • asynchronous partial page updates
    • service oriented architectures

(diagram labels: HTML, XML)

Questions

Further reading
  • The World Wide Web Consortium

http://www.w3.org/

http://www.w3.org/WAI/

http://www.w3.org/Addressing/

  • Wikipedia

http://www.wikipedia.org/

  • Mozilla

http://www.mozilla.org

  • Apache

http://www.apache.org

  • Web Resources collected by Kevin McManus

http://staffweb.cms.gre.ac.uk/~k.mcmanus/web

Questions
  • What is the sequence of events in a web browser such as Mozilla when you follow a link to the following URL?

http://staffweb.cms.gre.ac.uk/~k.mcmanus

  • What are the advantages of using a simple stateless protocol to implement the Web?
  • Why was HTTP/1.1 developed?
  • What MIME type will a web server respond with for a filename extension of .php?
  • What are stateful web applications?
