file access services directory services the world wide web www l.
Skip this Video
Loading SlideShow in 5 Seconds..
File Access Services, Directory Services, & The World Wide Web (WWW) PowerPoint Presentation
Download Presentation
File Access Services, Directory Services, & The World Wide Web (WWW)

Loading in 2 Seconds...

play fullscreen
1 / 69

File Access Services, Directory Services, & The World Wide Web (WWW) - PowerPoint PPT Presentation

  • Uploaded on

File Access Services, Directory Services, & The World Wide Web (WWW) 635.413.31 – Summer 2007 File Access Services Definition of File Access Services The ability to transparently access information stored in files on other systems across a network File Access vs. File Transfer

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'File Access Services, Directory Services, & The World Wide Web (WWW)' - issac

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
file access services
File Access Services

Definition of File Access Services

  • The ability to transparently access information stored in files on other systems across a network

File Access vs. File Transfer

  • File Transfer Characteristics
    • File transfer makes independent ‘copies’ of files on different hosts
    • Copies the entire file and access is generally not available until the copy is completely executed
    • File access methods & privileges are not the same for local and remote resources (so it’s not ‘transparent’)
    • Allows for storage of intermediate results, distributed management & control, and data protection
file access services3
File Access Services

File Access Characteristics

  • User access is only to a selected portion of a file; the complete ‘master’ copy of the file is stored on a single system
  • ‘Images’ of file are not independent; there is only one real copy stored on a single system
  • File access methods and permissions are typically the same for local and remote resources
  • Typically provides quick up-to-date access to files and prevents different ‘versions’ of a file from getting out of synchronization
  • Multi-user access to files must be carefully managed!!
examples of major file access services
Examples of Major File Access Services
  • The Networked File System (NFS)
    • The goal of NFS is to provide transparent file access for clients to files and filesystems on NFS servers
    • From both a user & a technical perspective uses a client-server architecture
    • Developed by Sun Microsystems to allow filesystems to be distributed across multiple UNIX/Linux systems
    • Current specification (version 4) is in RFC 3530. still a number of version 3 implementations (RFC 1813)
    • NFS clients available for Windows and Mac Operating Systems
    • Theoretically any application on the client that works with local file access should also work with NFS file access
    • Technical details of NFS will be examined later
examples of major file access services5
Examples of Major File Access Services
  • Microsoft File Services
    • File access and sharing has been around as a separate product for a decade (LANManager); support for networked file access now an integral part of Windows 95/98/NT and later versions
    • Originally proprietary but Microsoft has transitioned to TCP/IP (actually UDP) and published the interface specifications in RFC 1001 and 1002
    • Major components of Microsoft file access services:
      • The Server Message Block API
      • The NetBIOS API
      • Browser service
      • The Universal Naming Convention (UNC)
examples of major file access services6
Examples of Major File Access Services
  • Microsoft File Services (Continued)
    • The client typically maps a networked file or filesystem by binding the resource (via the UNC) to a local drive letter – at that point access is the same as a local resource
    • Also Resources can be mapped directly with the UNC
    • Clients available for Windows, OS\2, Mac, and many desktop UNIX/Linux systems (via SAMBA, etc.)
    • Microsoft has introduced a new component called DFS (Distributed File Services)
      • Provides a unified virtual namespace that makes distributed resources look like local resources
      • Provides unified access across multiple ‘real’ file services like SMB, Netware, and NFS
      • Still proprietary to Microsoft
examples of major file access services7
Examples of Major File Access Services
  • Apple (Appleshare and the Appletalk Filing Protocol)
    • Created as part of a proprietary network architecture for Macs
    • Created for Apple File sharing: client-server or peer-to-peer
    • Current versions run on TCP/IP
  • Novell Netware
    • Created to support Novell File Servers & Clients
    • Proprietary client-server architecture though Novell does do 3rd party licensing
    • Access to remote resources via ‘drive mapping’ where a remote resource is bound to a local drive letter
    • Current versions run on TCP/IP
    • Clients for DOS, Win3.1, Win95/98, NT, OS/2, and Mac
the remote procedure call rpc
The Remote Procedure Call (RPC)

The foundation of NFS – The Remote Procedure Call (RPC)

  • NFS is based on the Remote Procedure Call, a generic API that allows procedures called on one system to execute on another
  • If a client is RPC ‘aware’ nothing needs to be done by a user to access resources via NFS
  • There are two different RPC varieties; Sun RPC and the OSF Distributed Computing Environment (DCE)
  • Sun RPC specifications published in RFC 1057; version 2 ‘open’ specifications published in RFC 1831 (version 2)
  • Sun RPC, the foundation of most NFS implementations, uses the Sockets API to access TCP and UDP transport services
  • The RPC specifications consist of two main pieces:
    • The transfer protocol (message format & exchange)
    • Data representation & encoding
the remote procedure call rpc9
The Remote Procedure Call (RPC)

General Diagram of RPC Operation

the remote procedure call rpc10
The Remote Procedure Call (RPC)

RPC Message format

  • Two messages used – a Request and a Reply
  • Fields in an RPC Request
    • Transaction ID (XID) field [4 bytes]: a field initialized by the client w/ a unique number used to match Replies w/Requests
    • Call field [4 bytes]: set to zero for a Request & one for a Reply
    • RPC version field [4 bytes]: specifies what version of RPC is in use; typically either two (version 2) or three (for version 3)
    • Program numbers, version, and procedure fields [4 bytes each]: used by the server to determine what specific procedure the client wants to access remotely
the remote procedure call rpc11
The Remote Procedure Call (RPC)

RPC Message Format (Continued)

  • Credentials field [up to 408 bytes]:
    • Used to identify the client to the server
    • Use is optional; typically the User ID and Group ID of the client process is included
    • Can be used for the server to enforce access restrictions on the RPC
    • Actual length of field encoded at the beginning of the field value
  • Verifier field [up to 408 bytes]: used with secure RPC to pass encryption related information
  • Parameters field [variable length]: holds all parameter information necessary for the procedure call to be executed by the server
the remote procedure call rpc12
The Remote Procedure Call (RPC)

RPC Data Representation

  • RPC uses a special encoding system call External Data Representation (XDR) for encoding the data values in RPC fields
  • Allows heterogeneous systems to communicate using a common data representation
  • Field structure including bit and byte order specified for each field in the RPC messages
  • As an example, in XDR all integers are encoded as four byte values
  • XDR specifications published in RFC 4506
the remote procedure call rpc13
The Remote Procedure Call (RPC)

The RPC Port Mapper (‘411’ for RPC)

  • RPC server programs use ephemeral ports instead of well-known ports, so some kind of ‘registrar’ is necessary for client stub processes to determine how to communicate with server processes
  • The port mapper is an RPC server process all other RPC server processes on that particular server registers with when they initialize
  • The port mapper listens for client calls on well-known port 111 (both UDP and TCP)
the technical details of nfs
The Technical Details of NFS

General NFS Technical information

  • NFS uses a subset of defined RPCs to provide distributed file access (RPC can do much more than that)
  • Files and filesystems are referenced through the use of file ‘handles’ – objects or structures used to represent the files or filesystems
  • NFS servers are stateless (version 3 and earlier); servers do not keep track of which clients are accessing its resources or which resources are being accessed
    • Version 4 introduces file locking, which introduces stateful operation and requires more complex operational procedures
the technical details of nfs15
The Technical Details of NFS

An Example of NFS Operation

the technical details of nfs16
The Technical Details of NFS

NFS Procedures

  • NFS Version 4 has 39 procedures for file & directory manipulation and access as well as an extensible security framework
    • Built on RPCSEC_GSS (RPC 2203)
    • Provides integrity, privacy, and authentication
    • Allows for negotiation of security options
  • NFS Version 3 has 20 procedures for file/directory operations
  • Important NFS procedures (common to v3 and v4)
    • OPEN & CLOSE: compound commands for file manipulation
    • ACCESS – check file access permissions
    • READ & WRITE – reads and writes to/from files; options for either synchronous or asynchronous writes
    • GETATTR & SETATTR – get and set attributes on a file
    • MKDIR & RMDIR – make directory & delete directory
the technical details of nfs17
The Technical Details of NFS

Mounting Drives

  • In order for a client to access remote files via NFS it must use the NFS mount procedure
  • The mount procedure returns a ‘handle’ which the client uses to reference the filesystem
  • The client then uses the handle to integrate the remote filesystem into its local filesystem structure (analogous to mapping a networked drive)
  • As with other NFS procedures the port mapper must be called first to find the UDP or TCP port being used on the NFS server
    • Note: this procedure works differently ‘under the hood’ in v4, as no portmapper is used and the mount protocol has been incorporated into the core NFS specification
the technical details of nfs18
The Technical Details of NFS

NFS over TCP

  • Originally NFS was strictly a LAN technology rarely implemented over wide area connections, so UDP was used to provide efficient throughput
  • With the globalization of companies NFS is being used more often over wide geographic areas
  • Using TCP provides NFS with more robust implementations over WAN connections
  • The default transport for NFS version 4 is TCP
the technical details of nfs19
The Technical Details of NFS

An NFS Example – Reading the file testplan.txt from mountpoint/smiley/testplan.exe

directory services21
Directory Services
  • Definition
    • A means of searching a database of information easily & rapidly using keywords to match attributes stored in the database
    • With the explosion of networking and the Internet very important for finding people, information, & services
  • Directory services must be:
    • Easy to use
    • Contain correct data
    • Provide value
  • Directory services are categorized as White or Yellow Pages
    • White Pages: Directory Services oriented toward information associated with individuals
    • Yellow Pages: Directory Services oriented toward finding resources (networked printers, servers, etc) or services
directory service protocols
Directory Service Protocols

There are 4 Major Directory Service Protocols used in the Internet:

    • DNS,
    • NIS,
    • X.500, and
    • LDAP
    • And yes, NDS, WINS, and AD could be included too…
  • The Domain Name Service
    • Yes, this can be considered a very limited Directory Service!
      • Can be adapted for other uses such as spam blacklists
    • Provides only name resolution services but some information can be gleaned from the name structure
    • Has almost global accessibility
    • Already covered earlier in exhaustive detail
directory service protocols23
Directory Service Protocols
  • The Network Information Service (NIS)
    • NIS was developed by Sun Microsystems as a companion to NFS to ease the burden on administrators of distributed computing systems
    • Uses a client-server architecture for distribution of Yellow Pages information
    • NIS allows the use of a single administrative ‘repository’ for important network-wide information such as password files.
    • Enables the use of a single username & password across multiple systems
    • Also includes information search and retrieval commands that can be used in the development of non-administrative distributed directories and databases
    • Sun is trying to migrate from NIS to the LDAP-enabled Sun Directory Services (SDS); there’s a large installed base of users so this will likely take a while
directory service protocols24
Directory Service Protocols
  • Technical Details of NIS
    • Like NFS, NIS is built upon the Remote Procedure Call (RPC) and NIS procedures operate very much like NFS procedures
    • Built around map files, which are the data or directory information repositories and maps, which are unique views or indexes for the data files
    • Map files are stored in individual directories on the server (example: the password file is typically stored on a server at /var/yp/passwd
    • Servers come in two varieties: master and slave
      • The master server is the single administration point for the map files; analogous to a primary DNS server
      • There is only one master NIS server for a set of map files
      • Slaves read all their map file information from the Master
      • Clients can be serviced equally well by either a Master or a Slave
directory service protocols25
Directory Service Protocols
  • Technical Details of NIS (continued)
    • Organizations that want multiple access policies for directory or distributed database information can segregate NIS servers into Domains
      • A domain defined as a set of servers that access and distribute information from a unique set of maps
      • Each domain has a single Master server and as many slaves as necessary to adequately service client requests
      • Domains can overlap (an NIS server could be a Master for multiple domains) and clients can easily switch between domains
    • NIS Database Operations come in two flavors: distributed filesystems and explicit commands
      • With the distributed filesystem mode of operation NIS clients read data for administrative functions from the NIS server instead of a local file
      • This mode is used for such functions as password or license files – it allows a single file administered from the Master to be used across multiple NIS client systems
directory service protocols26
Directory Service Protocols
  • Technical Details of NIS (continued)
    • NIS clients can also be set up so they will check their local file first and if a match is not found do an NIS lookup
      • NIS services (maps) may also be accessed using explicit commands
      • This is very useful for building distributed database or directory applications that have nothing to do with system administration (i.e. – systemwide personnel directory, etc.)
      • The most important NIS user command is ypmatch; this allows the search of a map using a keyword
    • Three groups of internal NIS procedure calls perform the work for either mode of operation
      • Client-Lookups: key-driven procedures (match, get-first, get-next, get-all)
      • Maintenance-calls: checking server names & status (get-master, get-order)
      • Internal NIS calls: commands between servers such as a map transfer from Master to Slave
directory service protocols27
Directory Service Protocols
  • The X.500 Directory Service (aka DAP)
    • ITU standard for storing, accessing & distributing directory information
    • What X.500 provides:
      • A standards-based directory
      • A structured information framework
      • A single global namespace
      • Powerful search capabilities
      • Decentralized maintenance
    • Encompasses seven recommendations (X.501, X.509, X.511, X.518, X.519, X.520, and X.521) besides X.500 – the standards cover:
      • Directory Service Quality: access rules, authentication, filters, etc.
      • Directory Queries: read, compare, list, search operations, etc.
      • Directory Modification: add, rename, modify operations, etc.
      • Error Reporting: definition of of error conditions and responses
      • Referrals: relationships between Directory servers & other external objects
    • Originally designed for OSI protocols but later modified for TCP/IP
directory service protocols28
Directory Service Protocols
  • X.500 Technology
    • X.500 Directory Structure and Terminology
      • Information is held in a Directory Information Base (DIB)
      • A DIB consists of individual entries organized in a tree structure called the Directory Information Tree (DIT)
        • Can also have relational characteristics
      • Each entry is composed of attributes and has a Distinguished Name (DN) that uniquely identifies it (User friendly Distinguished Names are very important to X.500)
        • Each attribute composed of a type and one or more associated values
        • The type specifies a particular syntax and data type for the value (boolean, integer, etc.)
      • DIT entries have some object-oriented characteristics like inheritance
      • The DIT also has a schema; in X.500 the schema is a rule-set that ensures the DIT maintains its logical structure during modifications
directory service protocols29
Directory Service Protocols
  • An X.500 DIT example
directory service protocols30
Directory Service Protocols
  • X.500 Technology
    • The X.500 communication protocols
      • X.500 client (Directory User Agent or DUA) communicates with an X.500 server (Directory Server Agent or DSA) using the Directory Access Protocol (DAP)
      • The X.500 DAP uses the standard OSI presentation layer Remote Operations Service Element (ROSE) and Association Control Service Element (ASCE) for communications
      • RFC 1249 defines a standard for interface the X.500 service to the UDP and TCP transport layers for use over the Internet
      • OSI networking is ridiculously complex and X.500 over TCP/IP isn’t much better so that’s all we’re going to cover on this subject
directory service protocols31
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • Even with the adaptation of X.500 to TCP/IP very few organizations actually adopted it
      • Required too much computing and OSI engineering expertise
      • Most implementations were kludgey and complex
    • Engineers & researchers at the University of Michigan decided a completely new protocol for accessing X.500 services from Internet clients would hasten deployment of global directory services
      • The Lightweight Directory Access Protocol (LDAP) was originally designed to be a ‘gateway’ service; an X.500 ‘back-end’ was still necessary
      • Eventually the designers saw the value of LDAP as a completely independent directory services access protocol
    • LDAP version 3 specifications published as RFC 2252
      • Recent update in RFC 4510 – provide new features & protocol extensibility
directory service protocols32
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • What the current version of LDAP provides:
      • Centralized administrative tasks within an organization
      • Storage of sensitive information in a centrally secure repository
      • Authentication and security services
      • Centralized procedure for determining the status and role of an individual in an organization
      • Standardizes the search and retrieval of white & yellow page data
      • Multi-platform, multi-vendor open standard with published APIs for software development
      • URL format specifications for accessing LDAP data via browsers
      • Gateway services to other directory services like NIS and NDS
    • Vendors with current LDAP enabled products: Redhat, Innosoft, OpenLDAP, Sun, Microsoft, and Novell (list not exhaustive)
directory service protocols33
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • LDAP Technology
      • LDAP standard addresses 3 areas: API, data format, and access protocol
    • There are several APIs published and in use:
      • The base API developed by the U. of Michigan published as RFC 1823
      • Other C and Perl based APIs readily available
    • LDAP data format
      • LDAP has the same object format and definitions as X.500
      • Directory Information Base (DIB) & Directory Information Tree (DIT)
      • Entries and Attributes
      • LDAP DIB structure includes both hierarchal & relational structure
      • Four LDAP object types are currently defined: Person, Organizational Unit, Group, and Domain
directory service protocols34
Directory Service Protocols

Example LDAP

directory structure

directory service protocols35
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • LDAP access protocol
      • Conforms to a client-server architecture (clients performing operations against servers)
      • Does not define how data is stored in server; just how to access it in a standard manner
      • Uses TCP; client can issue multiple queries over a TCP connection
    • Basic LDAP Operations
      • Binding an LDAP client to a server
        • Required operation before a client can search information on a LDAP server
        • Authentication and access controls are parts of this operation
        • Unbinding and rebinding can occur over a single TCP connection allowing a client to change access privileges
directory service protocols36
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • Basic LDAP Operations
      • Server search
        • Single or multiple keyword
        • Wildcards are allowed
      • Compare entries
      • Add entry or entries
      • Modify existing entry or entries
      • Delete existing entry or entries
directory service protocols37
Directory Service Protocols
  • The Lightweight Directory Access Protocol (LDAP)
    • Other LDAP Operations
      • Referral – allows one server to pass searches to another server
      • Replication – allows multiple servers to handle requests for a DIT; enhances availability via load balancing and redundancy
      • Encryption & Security – supports Kerberos, SASL, and Secure Sockets Layer (SSL)
      • DIT discovery – allows a client to query a server for its structure and entity relationships
introduction to the world wide web39
Introduction to the World Wide Web


  • Developed to provide an easy way to distribute info to members of a geographically dispersed group
  • Especially well-suited for multimedia information
  • Meant to allow software, hardware, & system independent information sharing!
  • Built on the concept of hypertext and hypermedia – cross-linked groups of documents or objects
  • Plenty of good Web related information available at
introduction to the world wide web40
Introduction to the World Wide Web


  • The WWW concept was developed in 1989 by Tim Berners-Lee at CERN
  • He wanted a way to easy distribute high energy physics data to researchers located around the world
  • Concept was developed and culminated in the release of the first WWW browser in 1993 (Mosaic)
  • With the incorporation of Netscape commercial development of the WWW gained momentum
  • WWW is actually based on an earlier text-based system called gopher that was developed in the mid-80s
introduction to the world wide web41
Introduction to the World Wide Web

General Operation & Concepts

  • WWW has client-server architecture, with browsers (clients) pulling data from Web servers
  • Browsers request information (web pages) using a unique 'address' for each page called a Uniform Resource Locator (URL)
  • Servers deliver either an error indication or the requested web page information to browser
  • The WWW uses TCP for transport (i.e. one-to-one communications)
  • WWW is mostly a 'pull' technology – only client-requested info is sent
  • Web pages are linked together in a hierarchical 'mesh' by embedding URLs in web pages – hyperlinks
  • Caching is an integral part of the WWW & can take place in many locations
    • Server-side in web farms (load balancers & caching appliances)
    • In the network at firewalls and proxy servers
    • Client-side in web browsers
world wide web addressing
World Wide Web Addressing

There are three basic parts to the WWW: Addressing, Transport, & Presentation

Addressing (URLs)

  • One of most important aspects of the WWW is it’s global addressing scheme called the Uniform Resource Locator (URL)
  • URLs provide a unique ‘address’ allowing direct access to web pages
  • URLs consist of three parts: protocol, host name, and document name
    • The URL Protocol part
      • Web browsers can handle multiple protocols to provide backwards compatibility & flexibility
      • Depending on the browser these additional protocols can be handled internally in the browser or through the use of 'helper' applications
      • Additional protocols handled: telnet, ftp, gopher, news (NNTP), mail (SMTP), and local file access
world wide web addressing43
World Wide Web Addressing
  • The URL Host Name part
    • Second part is the DNS name or IP address of the server where the web page resides
    • Use of IP addresses is valid and works but is discouraged
    • If no TCP port number is specified with the name or address, port 80 is implied
    • If you wish to run web services on another port the port is specified after the DNS name (example:
  • The URL document name
    • Final part of URL specifies the location of the web page on the server
    • Usually consists of directory path followed by web page filename
    • Path is typically relative to a default directory on the web server (operating system dependent)
    • Path structure and filenames are operating system specific
transport the hypertext transfer protocol http
Transport: The Hypertext Transfer Protocol (HTTP)


  • The Hypertext Transfer Protocol is the application layer protocol responsible for delivery of WWW data between browsers & servers
  • Current HTTP release is version 1.1 (RFC 2616) though a significant number of systems run the first standard 'production' release (v1.0)
  • HTTP is a very simple text based request-response protocol like SMTP; MIME like header fields and encoding conventions are used
    • RFC-822/2822 format headers and MIME extensions are used to define the web page content, negotiate page parameters, & support conditional requests
  • HTTP uses TCP for reliable transport layer services
  • HTTP v1.0 works in a stop and wait fashion - only one HTTP command can be outstanding & requires a separate TCP connection for each object
  • For each object (file) on a page a separate TCP connection is established
    • HTTP v1.1 allows persistent connections and ‘pipelining’
transport the hypertext transfer protocol http45
Transport: The Hypertext Transfer Protocol (HTTP)
  • Important HTTP Characteristics
    • Application Level
    • Request/Response
    • Stateless
    • Bi-directional Transfer
    • Capability Negotiation
    • Support for Caching
    • Support for Intermediaries/Proxies
transport the hypertext transfer protocol http46
Transport: The Hypertext Transfer Protocol (HTTP)
  • Important HTTP commands
    • GET: transfers a file from server to browser
    • HEAD: transfers the HTML header information for a web page from server to browser
    • PUT: uploads a file from browser to server (used a lot with web-based mailers)
    • POST: append information to a URL (used with forms)
    • LINK: create a link between files on a web server
    • UNLINK: delete a link between files on a web server
    • DELETE: delete a file on the server
  • ALL of these commands are subject to access restrictions enforced by both the web server software and the server operating system!
transport the hypertext transfer protocol http48
Transport: The Hypertext Transfer Protocol (HTTP)
  • HTTP v1.1 Technical Details
    • Persistent TCP Connections
      • Allow multiple HTTP commands (and responses) to be sent over a single TCP connection; can greatly increase response times and interactivity
      • Reduces load opening & closing TCP connections
      • Fall back to v1.0 if necessary
    • Application-layer compression
      • Does not resend HTTP headers on sequential requests to the same server
    • HTTP Command Pipelining
      • Multiple HTTP commands can be issued without a response
      • The server must respond to each request in the order it was received
    • Expanded and improved directives for more intelligent Web caching
      • Enhanced support for Web content lifetimes and expiration dates
      • Validation algorithms & commands to allow a cache to ‘check’ on web page expiration
    • Better support for transfer of dynamically generated web pages
      • ‘Chunked’ transfers; files sent as a sequential stream of chunks with a defined size
      • HTTP 1.1 has explicit header fields for defining ‘chunk’ mode transfers and expressing the size of the chunks
presentation the hypertext markup language html
Presentation: The Hypertext Markup Language (HTML)


  • HTML is concerned with the structure and formatting of web pages
  • HTML was developed from an ISO standard called SGML; a ‘metalanguage’ developed in 1974 (you may ask why SGML isn’t used for the WWW instead of writing a new language – it’s because SGML is very general and terribly complex)
  • Current version is 4.01 (December 1999)
  • HTML is designed to be backwards compatible to previous versions
  • It was the original objective of HTML to allow the browser to control formatting & presentation of page
    • This was instituted to allow better data sharing in a heterogeneous environment
    • Content providers want more control over the presentation of their information so the original object of browser control over presentation is being gradually eroded in each new release of HTML
presentation the hypertext markup language html50
Presentation: The Hypertext Markup Language (HTML)

HTML page structure

  • An HTML page consists of ASCII text and is divided into two main parts: a header and a body
  • The header contains information about the document
  • The body contains the actual information to display
  • HTML depends on embedded commands in the web page called TAGS
  • Tags define both a command action and a scope
  • Tags are enclosed in start and end symbols (‘<’ and ‘>’) to distinguish them from normal text that is for client display
  • Tags, depending on their specific actions, can have optional attributes (size, align, action, etc.)
presentation the hypertext markup language html51
Presentation: The Hypertext Markup Language (HTML)

Basic tags

  • Page tags: <HTML> and </HTML> tags delimit the complete page
  • Header and Body tags: <HEADER>, </HEADER>, <BODY>, and </BODY>
  • Hyperlinks: <A HREF=”URL”> and </A>
  • Images: <IMG src=”URL”> and </IMG>
    • JPEG format (*.jpg)
    • GIF format (*.gif)
  • Formatting
    • Physical formatting
      • Specifies an exact display parameter for rendering web text
      • Examples are Bold (<B> and </B>) and Font Size
    • Logical formatting
      • Specifies special treatment of the enclosed text but the exact physical rendering is left up to the browser or defined in a style sheet
presentation the hypertext markup language html52
Presentation: The Hypertext Markup Language (HTML)

Style Sheets

  • Provide a way to uniformly define how logical styles are assigned physical display attributes (without changing the configuration of your browser)
  • Allows the web page author to apply typographic styles and spacing instructions for elements on a web page
  • The advantages of style sheets are that they remove presentation information from the HTML document (which should be concerned only with structure) and one style sheet could be ‘reused’ by linking it to multiple HTML documents
  • The disadvantage of style sheets is that they take presentation control away from the browser -- if the content creator is not careful their web pages may not display properly
  • Style sheets can be ‘embedded’ into a web page or linked externally through a link or import statement in the header
  • Generic Style Sheet entry syntax (example later): selector {property:value}
presentation the hypertext markup language html53
Presentation: The Hypertext Markup Language (HTML)

More Tags

  • Menus, lists, and tables
    • Menus
      • Creates a ‘compact’ list of choices with no numbering or bullets
      • Relevant tags: <MENU>, </MENU>, and <LI>
    • Lists
      • Ordered Lists
        • Ordered lists sequentially list items
        • Relevant tags: <OL>, </OL>, and <LI>
      • Unordered Lists
        • Items are listed with bullets
        • Relevant tags: <UL>, </UL>, and <LI>
presentation the hypertext markup language html54
Presentation: The Hypertext Markup Language (HTML)

More Tags

  • Tables
    • Creates a 2-D collection of data organized in rows and columns
    • The <TABLE> and </TABLE> tags bound the table data
    • The rows are specified by <TR> tags
    • Captions can be added using the <CAPTION> tag
    • Cell data delimited by the <TD> or <TH> tags
  • Headings
    • HTML Headings provide a set of logical ‘styles’ for use in documents
    • These are logical styles – actual display properties set by the browser or a style sheet!
    • Relevant tags: <H1> through <H6> with end tags </H1> through </H6>
  • Paragraphs, Line Breaks, & Horizontal Rules
    • These are the most commonly used tags that do not come in pairs
    • Relevant Tags: Paragraph <P>, Line Break <BR>, & Horizontal Rule <HR>
presentation the hypertext markup language html55
Presentation: The Hypertext Markup Language (HTML)

Example HTML Page:

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">


<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<meta name="Author" content="John Romano">

<meta name="GENERATOR" content="Mozilla/4.5 [en] (Win95; U) [Netscape]">

<title>Sample Web Page</title>


<h4>This is a sample Web Page for my Internet Class</h4>

<h5>I'll start off with a list of what toys I want for Christmas</h5>


<li>A trip to Tahiti</li>

<li>A fast expensive convertible</li>


<hr WIDTH="100%"><img SRC="bix2b.JPG" height=86 width=144>

<h5>Here's a table for comparison</h5>&nbsp;

<center><table BORDER COLS=3 WIDTH="100%" ><tr>

<td><b>Name</b></td><td><b>Description</b></td><td><b>Part Number</b></td></tr><tr>

<td>RAD Optimux</td><td>4xT-1 Fiber Modem</td><td>OPTM-4T1-SC-2</td></tr><tr>

<td>Lucent DDM-2000</td><td>OC-3 SONET mux</td><td>1284-3A17</td></tr>


<p>&nbsp;<a href="">Send Mail to Me!</a>



presentation the hypertext markup language html56
Presentation: The Hypertext Markup Language (HTML)

Advanced HTML

  • Forms
    • Basic HTML is one way information transfer between server and browser; interactivity is very important and was added in version 2.0
    • Forms allow data to be pushed back to the server for processing (via some other specification such as CGI)
    • Forms delimited using the <FORM> and </FORM> tags
      • The Method and Action attributes of the FORM tag used to specify how and where to upload the form data
      • Example: <FORM ACTION= METHOD=POST>
    • Individual fields in the form specified using the <INPUT> tag
    • The <INPUT> tag has many attributes to control labeling and data input
      • The Name attribute
      • The Size attribute
      • The Type attribute
presentation the hypertext markup language html57
Presentation: The Hypertext Markup Language (HTML)
  • Cascading Style Sheets
    • A new feature that imbues style sheets in object oriented functionality
    • Style sheets can be ‘linked’ together in a tree structure
    • Style sheets can inherent properties from parent to child
    • Cascading style sheets sometimes have different behaviors between browsers!
a look ahead addressing
A Look Ahead: Addressing

Addressing (URIs)

  • A Uniform Resource Identifier is a formatted string which identifies, via name, location, or any other unique property, a resource
  • URLs are a subset of URIs that specify a location – this restriction can be a problem
  • URIs will in some sense allow ‘anycasting’ of information across the WWW
  • Will allow requests for information without worrying about where it came from
  • Will help eliminate problems with network and server outages

URI composition

  • A URI is composed of four major parts: scheme, authority, path, and query
  • The URI syntax is: <scheme>://<authority><path>?<query>
a look ahead addressing59
A Look Ahead: Addressing

URI composition (Continued)

  • Scheme
    • Only the scheme part of the URI is mandatory – other parts are optional
    • Could identify a protocol or some other defined or standardized namespace
    • The remainder of the URI syntax depends on the scheme!
  • Authority
    • May be defined as a Internet-based service or by a Scheme-specific naming authority
    • Internet based-servers usually have the form: <userinfo>@<host>:<port>
    • Userinfo is optional but must be composed of legal URI characters
    • Host would typically be a valid DNS name or IPv4 address
    • Port would be a valid TCP port number
    • A scheme specific naming authority could be anything valid in the syntax of the scheme as long as it is composed of legal URI characters
a look ahead addressing60
A Look Ahead: Addressing

URI composition (Continued)

  • Path
    • The path component contains data, specific to the scheme and authority, that identifies the resource within the scope of the scheme and authority
    • Typical paths in an Internet server-based URI (like a URL) navigate the directory structure of the server
  • Query
    • The query component is a string of information to be interpreted by the resource
    • Typically used to provide keywords or parameters for formatting output
a look ahead presentation
A Look Ahead - Presentation

The Extensible Markup Language (XML)

  • A new way to structure and format data on the Internet
    • Goal: make information transmitted across the Internet ‘self-describing’
    • Consists of a ruleset to describe how to define your own markup language
    • Like HTML and other markup languages tags are used -- however they are there more to describe what the information is instead of how it should look
  • XML History
    • Development of XML began in 1996 and culminated in early 1998 with the adoption of version 1.0 standard by the World Wide Web Consortium (W3C)
    • Like HTML, XML is based on the Standardized General Markup Language (SGML) developed by the ISO in the mid-1970s
    • SGML is far too general and complex (the standard runs over 500 pages) for easy and widespread deployment; XML is a stripped down version suitable for use on a wide range of networked systems
    • HTML has been ‘re-written’ into XML (aka XHTML 1.1)
a look ahead xml
A Look Ahead - XML
  • HTML shortcomings
    • Even with the vast bandwidth in the Internet the WWW is often plagued by lack of interactivity and poor response times
    • Explosive growth of the WWW has made it difficult to find what you’re looking for (metatags & web crawlers don’t cut it anymore)
    • HTML is inflexible; it takes months to years for new standards with additional tags and functionality to be published
  • XML solutions to HTML problems
    • Speed – with tags specifying what the delivered content means and what should be done to it a significant amount of processing can be done on the client
      • Saves network bandwidth
      • Offloads overloaded servers
    • Searching – XML makes searches more intelligent by allowing the structure and meaning of data to be included as keywords or search parameters
    • Flexibility – XML allows a content creator to define tags in a standard way so new tags can be created on the fly without rewriting standards
a look ahead xml63
A Look Ahead - XML
  • XML Implementation
    • The XML standard is, by comparison to SGML or HTML, a very concise document
    • XML uses Unicode – a standard set of characters that encompasses all of the world’s major languages
  • Like HTML, XML is built on tags
    • Tags must be in start/end pairs that enclose the text to which they apply
    • Tags cannot overlap, but they can be nested
    • Nested tags create a tree structure for the XML document
  • Information typically required in a new XML document
    • What tags are allowed
    • How they can and cannot be nested
    • How tags should be processed
    • The first two of the preceding three items are usually covered in the Document Type Definition (DTD)
a look ahead xml64
A Look Ahead - XML

A Real World XML Document (A RealServer license key)


<!-- Warning: Do not Edit this file! Editing will invalidate your Real Server License Key. -->

<List Name="License">

<List Name="Definition">

<Var Name="Evaluation" Value="True"/>

<Var Name="Manufacturer" Value="RealNetworks"/>

<Var Name="LicenseID" Value="01-0102-0031-57074"/>

<Var Name="ProductID" Value="0"/>

<Var Name="MajorVersion" Value="6"/>

<Var Name="MinorVersion" Value="0"/>

<Var Name="StartDate" Value="01/01/1997"/>

<Var Name="EndDate" Value="01/01/2030"/>



<List Name="General">

<Var Name="ClientConnections" Value="10"/>

<Var Name="Live" Value="True"/>


<List Name="LicenseGenerator">

<Var Name="MajorVersion" Value="6"/>

<Var Name="MinorVersion" Value="0"/>

<Var Name="TimeStamp" Value="[28/03/1999:09:23:06 00:00]"/>


<Var Name="LicenseKey" Value="JhxSssanlhiY/Y/ETPgLRg=="/>


a look ahead xsl
A Look Ahead - XSL

Other important parts of XML

  • The Extensible Stylesheet Language (XSL)
    • Defines a set of rules for presentation of XML data allowing the development of a ‘write once, publish everywhere’ information infrastructure
    • XSL stylesheets can be defined for various display media; when an XML document is downloaded the appropriate XSL stylesheet is used to reformat the data to display data on the local system
    • XSL stylesheets can be defined for presentation of data in non-visual formats; using XSL an XML ‘web’ page could be listened to or converted to braille
    • XSL/XML document use:

XML Document (Structure)

XSL StyleSheet (Presention)

Users display at Browser

XML Document #2 (Structure)

XSL StyleSheet #2 (Presention)

a look ahead xml xsl
A Look Ahead – XML & XSL

XML & XSL Document Example:

<?xml version = “1.0” ?>

<?xml-stylesheet type=”text/xsl” href=”editors.xsl” ?>

<!-- This is a partial DTD - I haven’t defined everything necessary for this example! -->

<!ELEMENT street (#PCDATA)*>

<!ELEMENT city (#PCDATA)*>

<!ELEMENT state (#PCDATA)*>


<!ELEMENT address (street, city, state, zip)>





<title>Senior Engineer</title>

<organization>Johns Hopkins</organization>


<street>139 N Charles Street</street>








a look ahead xml xsl67
A Look Ahead – XML & XSL

XML & XSL Document Example:

<?xml version = “1.0” ?>

<!-- This stylesheet takes a simple XML doc and displays it in HTML -->

<xsl:stylesheet xmlns:xsl=>

<!- this defines the xsl namespace this stylesheet belongs to -!>

<xsl:template match=”/”>

<!- Apply template to everything in the XML document -!>



<H1>Editor Contacts</H1>

<xsl:for-each select=”editor_contacts/editor”>

<H2>Name: <xsl:value-of select=”first_name”/> <xsl:value-of select=”last_name”/> </H2>\

<P> Title: <xsl:value-of select=”title”/></P>

<P> Title: <xsl:value-of select=”organization”/></P>

<P> Title: <xsl:value-of select=”address/street”/></P>

<P> Title: <xsl:value-of select=”address/city”/></P>

<P> Title: <xsl:value-of select=”address/state”/></P>

<P> Title: <xsl:value-of select=”address/zip”/></P>

<P> Title: <xsl:value-of select=”e_mail”/></P>






a look ahead xlink
A Look Ahead - XLink
  • Improved Hyperlinks (XLink)
    • Another W3C standard to provide more intelligent hyperlinks
    • Version 1.0 approved June 2001
    • Unlike current hyperlinks that can only link to another single physical location XLinks will be able to do much more:
      • XLinks could allow a choice of multiple actions or destinations
      • Other XLinks will allow the information to be embedded directly in the page (maybe a floating dialog box) instead of forcing the viewer to leave that page
      • XLinks could be indirect links, allowing changes to links to be made at a single database record instead of where ever the link is referenced
reading homework
Reading & Homework
  • Reading
    • Chapter 25: 25.3, 25.13, and 25.14
    • Chapter 27: HTTP
    • There’s a lot more to the WWW & XML than presented here – see for more!
    • Next week: advanced web applications