
Web Standards for the Clumps Projects

Brian Kelly, UK Web Focus, UKOLN, University of Bath. Email: B.Kelly@ukoln.ac.uk. URL: http://www.ukoln.ac.uk/


Presentation Transcript


  1. Web Standards for the Clumps Projects • Brian Kelly, UK Web Focus, UKOLN, University of Bath • Email: B.Kelly@ukoln.ac.uk • URL: http://www.ukoln.ac.uk/ • UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC’s Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.

  2. Contents • Introduction • Web Standards Overview • Web Standards: • Data Formats • Transport • Addressing • Metadata • Distributed Searching • Collections • Authentication • Deployment Issues • Aims of Talk • To give brief overview of web architecture • To describe developments to web standards • To briefly address implementation models

  3. Standardisation • Proprietary • De facto standards • Often initially appealing (cf PowerPoint, PDF) • May emerge as standards • W3C • Produces W3C Recommendations on Web protocols • Managed approach to developments • Protocols initially developed by W3C members • Decisions made by W3C, influenced by member and public review • ISO • Produces ISO Standards • Can be slow moving and bureaucratic • Produces robust standards • IETF • Produces Internet Drafts on Internet protocols • Bottom-up approach to developments • Protocols developed by interested individuals • "Rough consensus and working code" • [Diagram labels: HTML extensions; PDF and Java?; PNG, HTML; Z39.50; Java?; HTTP, URN, whois++; PNG, HTML, HTTP]

  4. The Web Vision • Tim Berners-Lee's (and W3C's) vision for the Web: • Evolvability is critical • Automation of information management: If a decision can be made by machine, it should • All structured data formats should be based on XML • Migrate HTML to XML • All logical assertions to map onto RDF model • All metadata to use RDF • See keynote talk at WWW 7 conference at <URL:http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>

  5. HTML 4.0, CSS 2.0 and DOM • HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment • HTML 4.0 - W3C-Rec • Improved forms • Hooks for stylesheets • Hooks for scripting languages • Table enhancements • Better printing • CSS 2.0 - W3C-Rec • Support for all HTML formatting • Positioning of HTML elements • Multiple media support • DOM - W3C-Rec • Document Object Model • Hooks for scripting languages • Permits changes to HTML & CSS properties and content • Problems • Changes during CSS development • Netscape & IE incompatibilities • Continued use of browsers with known bugs

  6. HTML Limitations • HTML 4.0 / CSS 2.0 have limitations: • Difficulties in introducing new elements • Time-consuming standardisation process (<ABBREV>) • Dictated by browser vendor (<BLINK>, <MARQUEE>) • Area may be inappropriate for standardisation: • Covers specialist area (maths, music, ...) • Application-specific (<STUD-NUM>) • HTML is a display (output) format • HTML's lack of arbitrary structure limits functionality: • Find all memos copied to John Smith • How many unique tracks on Jackson Browne CDs

  7. XML • XML: • Extensible Markup Language • A lightweight SGML designed for network use • Addresses HTML's lack of evolvability • Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc) • Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998 • Support from industry (SGML vendors, Microsoft, etc.) • Support in Netscape 5 and IE 5
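
The structured queries that plain HTML cannot support, such as "find all memos copied to John Smith", become straightforward once a document carries its own element names. The following is a minimal sketch only: the `MEMOS` document, its element names (`MEMO`, `TO`, `CC`, `SUBJECT`) and the function `memos_copied_to` are invented for illustration and do not come from the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo collection; the element names are invented for this sketch.
MEMOS = """
<MEMOS>
  <MEMO><TO>Ann Jones</TO><CC>John Smith</CC><SUBJECT>Budget</SUBJECT></MEMO>
  <MEMO><TO>John Smith</TO><SUBJECT>Timetable</SUBJECT></MEMO>
  <MEMO><TO>Bob Lee</TO><CC>John Smith</CC><CC>Ann Jones</CC><SUBJECT>Servers</SUBJECT></MEMO>
</MEMOS>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of memos whose CC list includes `name`."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("SUBJECT")
        for memo in root.findall("MEMO")
        if any(cc.text == name for cc in memo.findall("CC"))
    ]


print(memos_copied_to(MEMOS, "John Smith"))  # ['Budget', 'Servers']
```

The memo addressed to John Smith directly is not returned, because the query distinguishes `TO` from `CC`. That distinction is lost when the same text is marked up only for display.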

  8. XML Concepts • Well-formed XML resources: • Make end-tags explicit: <LI>...</LI> • Make empty elements explicit: <IMG .../> • Quote attributes: <IMG SRC="logo" HEIGHT="20"/> • Use consistent upper/lower case • Valid XML resources: • Need DTD • XML Namespaces: • Mechanism for ensuring unique XML elements: <?xml:namespace ns="http://foo.org/1998-001" prefix="i"?> <P>Insert <i:PART>M-471</i:PART></P>
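
The well-formedness rules can be checked mechanically by handing markup to any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module. The helper `is_well_formed` and the two sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if `fragment` parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: implicit </LI>, unquoted attributes, unclosed <IMG>.
html_style = "<UL><LI>One<LI>Two<IMG SRC=logo HEIGHT=20></UL>"
# The same content with end-tags, an explicit empty element and quoted attributes.
xml_style = '<UL><LI>One</LI><LI>Two<IMG SRC="logo" HEIGHT="20"/></LI></UL>'

print(is_well_formed(html_style))    # False
print(is_well_formed(xml_style))     # True
print(is_well_formed("<P>text</p>"))  # False: element names are case-sensitive
```

This checks well-formedness only. Validity against a DTD needs a validating parser, which the standard-library module does not provide.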

  9. XML Deployment • Ariadne issue 15 has article on "What Is XML?" • Describes how XML support can be provided: • Natively by new browsers • Back-end conversion of XML to HTML • Client-side conversion of XML to HTML / CSS • Java rendering of XML • Examples of intermediaries • See http://www.ariadne.ac.uk/issue15/what-is/
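
Back-end conversion of XML to HTML can be as simple as a table that maps application elements onto display elements. The sketch below is one possible shape for such a converter, not the approach described in the Ariadne article. The `ELEMENT_MAP` table and the `CATALOGUE` / `PART` element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from application-specific elements to HTML elements.
ELEMENT_MAP = {"CATALOGUE": "ul", "PART": "li"}


def xml_to_html(xml_text):
    """Convert an XML document to HTML; unmapped elements become <span>."""

    def render(node):
        tag = ELEMENT_MAP.get(node.tag, "span")
        parts = [f"<{tag}>", escape(node.text or "")]
        for child in node:
            parts.append(render(child))
            parts.append(escape(child.tail or ""))  # text following the child
        parts.append(f"</{tag}>")
        return "".join(parts)

    return render(ET.fromstring(xml_text))


print(xml_to_html("<CATALOGUE><PART>M-471</PART><PART>M-472</PART></CATALOGUE>"))
# <ul><li>M-471</li><li>M-472</li></ul>
```

The same function could run on the origin server or in an intermediary, so the XML source need not be exposed to browsers that cannot handle it.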

  10. XLink, XPointer and XSL • XLink will provide sophisticated hyperlinking missing in HTML: • Links that lead user to multiple destinations • Bidirectional links • Links with special behaviors: • Expand-in-place / Replace / Create new window • Link on load / Link on user action • Link databases • XPointer will provide access to arbitrary portions of XML resource • XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents) • Example: <commentary xml:link="extended" inline="false"> <locator href="smith2.1" role="Essay"/> <locator href="jones1.4" role="Rebuttal"/> <locator href="robin3.2" role="Comparison"/> </commentary>
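
The extended link in the example leads to several destinations, each with a role. A rough sketch of how software might read those destinations follows. It uses the working-draft syntax shown on the slide, and the function name `link_targets` is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace that XML parsers bind to the reserved "xml:" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"

COMMENTARY = """
<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>
"""


def link_targets(xml_text):
    """Return (role, href) pairs for each locator of an extended link."""
    root = ET.fromstring(xml_text)
    if root.get(f"{{{XML_NS}}}link") != "extended":
        return []  # not an extended link in the draft syntax
    return [(loc.get("role"), loc.get("href")) for loc in root.findall("locator")]


for role, href in link_targets(COMMENTARY):
    print(f"{role}: {href}")
```

A browser could offer these three targets as a menu. That is the "multiple destinations" behaviour a plain HTML anchor cannot express.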

  11. XML Update • Data / Schemas • XML-Data: Submitted to W3C Jan 98 (Obsolete?) • Document Content Description: Submitted Aug 98 • XSchema: Independent effort • Programming Interface • DOM level 1: W3C Recommendation, May 98 • Style & Presentation • CSS level 2: W3C Recommendation, May 98 • Extensible Style Language: Working Draft, Aug 98 • Relationship to Other Resources • XLink, XPointer: Working Drafts, Mar 98 • XML Namespaces: Working Draft, Aug 98 • Query Languages • XML Query Language: Submitted to W3C Aug 98 • XQL: Independent effort

  12. Addressing • URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations: • Lack of long-term persistency • Organisation changes name • Department shut down or merged • Directory structure reorganised • Inability to support multiple versions of resources (mirroring) • URNs (Uniform Resource Names): • Proposed as solution • Difficult to implement (no W3C activity in this area)

  13. Addressing - Solutions • DOIs (Digital Object Identifiers): • Proposed by publishing industry as a solution • Aimed at supporting rights ownership • Business model needed • PURLs (Persistent URLs): • Provide single level of redirection • Pragmatic Solution: • URLs don't break - people break them • Design URLs to have long life-span • Further information: <URL: http://www.ukoln.ac.uk/metadata/resources/urn/> <URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
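
The "single level of redirection" behind PURLs amounts to a lookup table: the published name stays fixed and only the table entry changes when a resource moves. The sketch below illustrates that idea and is not the software behind any real PURL service. The class `PurlResolver` and all names and URLs in it are hypothetical.

```python
class PurlResolver:
    """Toy resolver mapping persistent names to current locations."""

    def __init__(self):
        self._targets = {}

    def register(self, purl, url):
        """Create or update the current location for a persistent name."""
        self._targets[purl] = url

    def resolve(self, purl):
        """Return (status, location), as an HTTP redirect response would."""
        url = self._targets.get(purl)
        return (302, url) if url else (404, None)


resolver = PurlResolver()
resolver.register("/net/music-dept", "http://www.old-name.example/depts/music/")
print(resolver.resolve("/net/music-dept"))

# The organisation changes its name: only the table entry is updated,
# so every published link to /net/music-dept keeps working.
resolver.register("/net/music-dept", "http://www.new-name.example/music/")
print(resolver.resolve("/net/music-dept"))
```

The table still has to be maintained by someone, so the approach relocates the persistence problem and does not remove it.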

  14. Transport • HTTP/0.9 and HTTP/1.0: • Design flaws and implementation problems • HTTP/1.1: • Addresses some of these problems • 60% server support • Performance benefits! (60% packet traffic reduction) • Is acting as fire-fighter • Not sufficiently flexible or extensible • HTTP/NG: • Radical redesign using object-oriented technologies • Undergoing trials • Gradual transition (using proxies) • Integration of application (distributed searching?)

  15. Metadata • Metadata - the missing architectural component from the initial implementation of the web • Metadata Needs: • Resource discovery • Content filtering • Authentication • Improved navigation • Multiple format support • Rights management • [Diagram: web architecture components. Addressing: URL, URNs, DOIs. Transport: HTTP, HTTP/1.1, HTTP/NG. Data format: HTML, HTML 4.0, CSS, XML. Metadata: RDF, PICS, TCN, MCF, DSig, DC, ...]

  16. Metadata Examples • DSig (Digital Signatures initiative): • Key component for providing trust on the web • DSig 2.0 will be based on RDF and will support signed assertions: • This page is from the University of Bath • This page is a legally-binding list of courses provided by the University • P3P (Platform for Privacy Preferences): • Developing methods for exchanging Privacy Practices of Web sites and users • Note that discussions about additional rights management metadata are currently taking place

  17. Sitemaps • Sitemaps provide navigational alternatives to browsing a site by following links • Configurable site maps will enable end users to define hierarchies • Example: http://www.elsop.com/linkscan/map.html

  18. RDF • RDF (Resource Description Framework): • Highlight of WWW 7 conference • Provides a metadata framework ("machine understandable metadata for the web") • Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF) • Applications include: • cataloging resources – resource discovery • electronic commerce – intelligent agents • digital signatures – content rating • intellectual property rights – privacy • See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>

  19. RDF Model • RDF: • Based on a formal data model (directed labelled graphs) • Syntax for interchange of data • Schema model • [Diagram: RDF data model. A Resource has a PropertyType with a Value, e.g. page.html has Cost £0.05. The property can itself be described as a node (InstanceOf Property, PropObj page.html, PropName Cost, Value £0.05) that carries a further property, ValidUntil 11-May-98.]
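
The data model can be pictured as a list of (resource, property, value) statements, where a statement may itself be turned into a node so that further properties can be attached to it. The sketch below is an interpretation of the slide's diagram and not the RDF specification. The node identifier `stmt1` and the helpers `reify` and `values` are invented, while the labels `InstanceOf`, `PropObj`, `PropName`, `Value` and `ValidUntil` are taken from the diagram.

```python
from collections import namedtuple

Statement = namedtuple("Statement", "resource property value")

# "page.html has a Cost of £0.05"
cost = Statement("page.html", "Cost", "£0.05")


def reify(statement, node_id):
    """Describe a statement as a node so it can carry properties of its own."""
    return [
        Statement(node_id, "InstanceOf", "Property"),
        Statement(node_id, "PropObj", statement.resource),
        Statement(node_id, "PropName", statement.property),
        Statement(node_id, "Value", statement.value),
    ]


# Attach "ValidUntil 11-May-98" to the cost statement, not to the page.
graph = [cost] + reify(cost, "stmt1") + [Statement("stmt1", "ValidUntil", "11-May-98")]


def values(graph, resource, prop):
    """Return every value of `prop` recorded for `resource`."""
    return [s.value for s in graph if s.resource == resource and s.property == prop]


print(values(graph, "page.html", "Cost"))    # ['£0.05']
print(values(graph, "stmt1", "ValidUntil"))  # ['11-May-98']
```

Keeping the expiry date on the statement rather than on the page lets the price change without implying that the page itself expires.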

  20. RDF Example • Example of Dublin Core metadata in RDF: <?xml:namespace ns="http://www.w3.org/TR/WD-rdf/" prefix="rdf"?> <?xml:namespace ns="http://purl.org/dublin_core/schema/" prefix="dc"?> <rdf:RDF> <rdf:Description rdf:HREF="page.html"> <dc:Creator>John Smith</dc:Creator> <dc:Title>John’s Home Page</dc:Title> </rdf:Description> </rdf:RDF>
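
Metadata in this form can be read by a program and not just by a person. The sketch below extracts Dublin Core properties with Python's standard XML parser. The slide uses 1998 working-draft syntax, which namespace-aware parsers do not accept, so the sample record is rewritten in the RDF/XML syntax that was standardised later (`xmlns` declarations, `rdf:about`, lower-case Dublin Core element names). The function name `dublin_core` is an assumption.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The slide's record, rewritten in the later standardised RDF/XML syntax.
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="page.html">
    <dc:creator>John Smith</dc:creator>
    <dc:title>John's Home Page</dc:title>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core(xml_text):
    """Return {resource: {dc_element: value}} for each rdf:Description."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{" + NS["dc"] + "}"
    records = {}
    for desc in root.findall("rdf:Description", NS):
        about = desc.get("{" + NS["rdf"] + "}about")
        records[about] = {
            child.tag[len(dc_prefix):]: child.text
            for child in desc
            if child.tag.startswith(dc_prefix)
        }
    return records


print(dublin_core(RECORD))
# {'page.html': {'creator': 'John Smith', 'title': "John's Home Page"}}
```

This handles only the simple one-value-per-property layout shown here. A full RDF parser would also handle nested descriptions, repeated properties and the abbreviated syntaxes.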

  21. Browser Support for RDF • Mozilla (Netscape's source code release) provides support for RDF • Mozilla supports site maps in RDF, as well as bookmarks and history lists • See Netscape's or HotWired home page for a link to the RDF file • [Diagram labels: Trusted 3rd Party Metadata; Embedded Metadata, e.g. sitemaps. Image from http://purl.oclc.org/net/eric/talks/www7/devday/]

  22. RDF Conclusion • RDF is a general-purpose framework • RDF provides structured, machine-understandable metadata for the Web • Metadata vocabularies can be developed without central coordination • Role for eLib projects in defining schemas? • RDF Schemas describe the meaning of each property name • Signed RDF is the basis for trust

  23. Distributed Searching • Distributed searching important for the DNER (Distributed National Electronic Resource) • AHDS prototype provides cross-searching using Z39.50: http://prospero.ahds.ac.uk:8080/ahds_live/ • ROADS prototype provides cross-searching using whois++
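
Whatever the protocol, cross-searching follows the same pattern: send one query to several targets in parallel and merge the answers, recording where each hit came from. The sketch below shows that pattern only. It does not speak Z39.50 or whois++, and the in-memory `TARGETS` table, its titles, and the functions `search_target` and `cross_search` are all hypothetical stand-ins for remote services.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory targets standing in for remote search services.
TARGETS = {
    "ahds": ["Medieval charters", "Jazz recordings archive"],
    "roads": ["Jazz history gateway", "Social science gateway"],
}


def search_target(records, term):
    """Return the titles in one target that contain `term` (case-insensitive)."""
    return [title for title in records if term.lower() in title.lower()]


def cross_search(term, targets):
    """Query every target in parallel and merge hits, tagged by source."""
    names = sorted(targets)
    with ThreadPoolExecutor() as pool:
        per_target = list(pool.map(lambda n: search_target(targets[n], term), names))
    return [(name, title) for name, hits in zip(names, per_target) for title in hits]


print(cross_search("jazz", TARGETS))
# [('ahds', 'Jazz recordings archive'), ('roads', 'Jazz history gateway')]
```

Tagging each hit with its source is what makes it possible to honour per-service conditions, such as displaying a service logo alongside its results.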

  24. How Metadata Could Be Used • Issues: • Loss of visibility • Performance, ... • Database Description • Music resources, including ... • Policy (Terms & Conditions / Resource and Service) • For licensing reasons, access is restricted to authorised HEIs • For performance reasons, access restricted between 9-17.00 • The service logo must be included in results set, unless results only come from service • Permission for cross-searching restricted to other eLib projects • You're only allowed to link to the main entry point • Individual • Give me HTML or PDF resources, not Word, … • I'm blind. Include ACSS in results and deliver a sitemap • Client Software • My browser doesn't support XML, so send me HTML
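
Preferences such as "give me HTML or PDF, not Word" or "my browser doesn't support XML" are themselves metadata that a service can act on. The sketch below shows one way to do so. The function `choose_format` and its parameters are assumptions, not part of any protocol named in the talk.

```python
def choose_format(available, accepted, refused=()):
    """Return the first accepted format the service can supply, or None.

    available: formats the service holds for the resource
    accepted:  formats the user or client wants, in order of preference
    refused:   formats the user or client has ruled out
    """
    for fmt in accepted:
        if fmt in available and fmt not in refused:
            return fmt
    return None


# "Give me HTML or PDF resources, not Word"
print(choose_format(["Word", "PDF", "XML"], accepted=["HTML", "PDF"], refused=["Word"]))  # PDF

# "My browser doesn't support XML, so send me HTML"
print(choose_format(["XML", "HTML"], accepted=["HTML"], refused=["XML"]))  # HTML
```

When nothing acceptable is held, the function returns `None`. An intermediary could then convert an available format, or report that the resource cannot be delivered.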

  25. Collection Description Work • Collection Description Group: • UKOLN involvement in producing list of attributes for collection level description (in the library, museum, archival sense), which includes databases of Internet resource descriptions such as SOSIG. • Work of interest to clumps and hybrid libraries. • WG membership: Dan Brickley (ROADS, ILRT), Andy Powell (ROADS), Verity Brack (RIDING), Matthew Dovey (Music Online, Malibu), Dennis Nicholson (BUBL/CAIRNS) and David Kay (FD). • See <URL: http://www.ukoln.ac.uk/metadata/cld/> • Collection Description eLib supporting study due out in Oct. Will define core attributes (cf Dublin Core).

  26. Technologies • A number of formats and protocols could be used to implement distributed searching. XML and RDF plus: • Z39.50: ISO standard. Well-known in library world, but heavy-weight • whois++: Lightweight IETF standard. Used in ANR gateways, but not widely deployed • LDAP: Lightweight version of X.500 directory service. • HTTP/NG?: Opportunity to develop new solution using OO technologies • IETF WebDAV: • Requirements for distributed authoring include author metadata and collection definitions. See <URL: http://www.ietf.org/html.charters/webdav-charter.html> and <URL: http://www.ietf.org/ids.by.wg/webdav.html>

  27. Authentication • Deployment of an open, scaleable, flexible authentication system is difficult & expensive • Current solutions include: • Server-based username and password schemes • IP-based schemes • Athens - Based on replicated Sybase application See <URL: http://www.athens.ac.uk/> • W3C DSig work - Digital Signatures Initiative. See <URL: http://www.w3.org/DSig/> • Other Public Key developments - e.g. reports of Post Office involvement, statements from Tony Blair, EU, .. "In May 1998 the Commission published its proposal for a "European Parliament and Council Directive on a Common Framework for Electronic Signatures" (COM(1998)297)."
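
Of the current solutions listed, the IP-based scheme is the simplest to illustrate: a request is accepted if it comes from a network belonging to a licensed institution. The sketch below uses Python's standard `ipaddress` module. The `LICENSED` list contains reserved documentation address ranges, not real institutions, and the function `is_authorised` is an assumption.

```python
import ipaddress

# Hypothetical licensed networks (reserved documentation ranges, not real sites).
LICENSED = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_authorised(client_ip, networks=LICENSED):
    """Return True if the client address falls inside a licensed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


print(is_authorised("192.0.2.17"))      # True
print(is_authorised("198.51.100.200"))  # False: outside the /25
print(is_authorised("203.0.113.5"))     # False
```

The scheme identifies a network and not a person. Users working from home or through a shared proxy are handled badly, which is one reason schemes based on usernames or certificates are also on the list.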

  28. Certificates • Should we be looking into using commercially-supported digital ids, such as Verisign's? • Can purchase server ID for $349 • End user certificates available • Browser Support: • Use certificates to positively identify yourself, certificate authorities and publishers • Need for a certification infrastructure

  29. Deployment Issues • More sophisticated deployment techniques can be adopted to overcome deficiencies in simple model • Original Model: • Web server simply sends file to client • File contains redundant information (for old browsers) plus client interrogation support • [Diagram: browser, Web server, HTML resource] • Sophisticated Model: • [Diagram: browser, client proxy, server proxy, intelligent Web server, HTML / XML / database resource] • Intermediaries can provide functionality not available at client: • DOI support • XML support / format conversion • Authentication • [Figure: Example of an intermediary]

  30. Conclusions • To conclude: • Standards are important, especially for national initiatives, such as eLib • Proprietary solutions are often tempting because: • They are available • They are often well-marketed and well-supported • They may become standardised • Solutions based on standards may not be properly supported by applications • Metadata is big growth area • Opportunity for eLib projects to shape developments • Intermediaries may have a role to play in deploying standards-based solutions • Intelligent servers likely to be important
