Web Site Creation: Good Practice Guidelines Designing For Migration, Preservation and Dissemination

Brian Kelly, UK Web Focus, UKOLN, University of Bath
Email: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/

Contents

  • We’ve Been Here Before
  • Web-Based Dissemination
  • Mirroring, Migration & Preservation
  • Conclusions
What Happens When The Funding Stops?
  • When the NOF project funding finishes what happens?
    • The project gracefully turns into a fully-fledged service, with new funding from NOF, the EU, your organisation, etc.
    • The project staff all leave and the Web site is shut down, is moved and can’t be found, or is broken and there is no-one with the interest, expertise or permissions to fix it
We’ve Been Here Before
  • The UK Higher Education sector has been here before:
  • CTI Projects
    • CBL applications locked into obsolete hardware
  • TLTP Projects
    • CBL developers using Toolbook on standalone PCs, which could not be deployed on the campus LAN
  • eLib Projects
    • Web sites disappear
  • EU Programmes
Survey of EU Web Sites
  • WebWatching Telematics For Libraries Project Web Sites (Fourth Framework)
    • Exploit Interactive article published in Oct 2000
    • Web site availability:
      • Yes: 65
      • Never: 16
      • Domain gone: 11
      • Page gone: 12
    • Server details:
      • Apache: 41
      • IIS: 10
      • NCSA: 3
      • Netscape: 3
      • Other: 6 (e.g. Mac, GN)
    • See <http://www.exploit-lib.org/issue7/webwatch/>
Survey of eLib Web Sites
  • WebWatching eLib Project Web Sites
    • Ariadne article published in Jan 2001
    • Of 71 Web sites, 3 domains are no longer available and 2 entry points have gone
    • LinkPopularity.com results shown:
      • SOSIG: 7,076
      • OMNI: 5,830
      • EEVL: 3,865
      • History: 2,605
      • Netskills: 2,363
      • Ariadne: 2,144
      • xxx: ~10
    • Survey also includes:
      • Analysis of entry points (links, HTML, accessibility)
      • Numbers of pages indexed by AltaVista (0 in some cases)
        • Due to robots.txt file
        • Due to frames interface or other robots barrier
    • See <http://www.ariadne.ac.uk/issue26/web-watch/>

Web Site Promotion
  • You want:
    • Your quality pages to be found in a timely fashion by users of search engines
    • To encourage others to link to you
  • To ensure this happens you should:
    • Have a domain and URL naming policy
    • Exploit the Robots Exclusion Protocol - see <http://www.robotstxt.org/wc/norobots.html>
    • Be aware of barriers to robots (which may also be barriers to humans)
    • Think about a linking policy and procedures
URL Naming Policy
  • Issues:
    • Having your own domain is a good idea (e.g. http://www.ariadne.ac.uk/)
    • Short URLs are good (more memorable; search engines tend not to index deeply)
    • Sub-domains may be a useful compromise (e.g. http://ariadne.bath.ac.uk/)
    • Keep URLs short by using directory defaults:
      • Instead of www.ariadne.ac.uk/issue5/metadata/intro.htm
      • use www.ariadne.ac.uk/issue5/metadata/
    • The directory form is shorter, less prone to typos and allows for format and language negotiation, new server management tools, etc.:
      • …/issue5/metadata/intro.fr.html
      • …/issue5/metadata/intro.pdf (.cfm, .asp, .jsp)
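
As a rough illustration of the directory-default idea, the Python sketch below rewrites a link that ends in a default document name into the shorter directory form. The set of default file names and the function name `to_directory_url` are assumptions made for this example; each Web server defines its own directory defaults.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical set of file names a server might serve as directory defaults.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "intro.htm", "default.asp"}


def to_directory_url(url: str) -> str:
    """Strip a trailing default document name so the URL ends at the directory."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        # Keep the trailing slash so the server resolves the directory default.
        parts = parts._replace(path=directory + "/")
    return urlunsplit(parts)


print(to_directory_url("http://www.ariadne.ac.uk/issue5/metadata/intro.htm"))
```

Publishing only the directory form means the file behind it can later change format or language (for example to `.pdf` or `.jsp`) without breaking existing links.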
Planning Search Engine Strategy
  • The problem: you search for your project name and find a former colleague's personal page containing informal information
  • To avoid this:
    • Distinguish between (a) initial information about the project, (b) information for project partners, funders, etc. and (c) information for end users
    • Use search engine techniques, as appropriate, to:
      • Ban search engines from indexing certain pages
      • Register key pages (e.g. list of new resources)

  • Make use of the Robots Exclusion Protocol (REP) to ban robots from indexing:
    • Non-public areas (e.g. area for partners)
    • Pre-release Web sites
    • Pages prior to an official launch
  • Remember to switch off the ban after launch!

Example /robots.txt file in the Web root:

User-agent: *
Disallow: /partners
Disallow: /draft

Note that use of directories to group related resources will have many benefits: controlling indexing robots, mirroring and auditing software, etc.
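
One way to confirm that a ban covers what you intend, and that it has been lifted after launch, is to test the rules before publishing them. The sketch below feeds the rules shown above to Python's standard `urllib.robotparser` module. The helper name `is_indexable` and the sample paths are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# The rules from the slide: keep all robots out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /partners
Disallow: /draft
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_indexable(path: str, user_agent: str = "*") -> bool:
    """Return True if the rules allow the given robot to fetch the path."""
    return parser.can_fetch(user_agent, path)


# Hypothetical paths, chosen only to exercise the rules.
for path in ("/", "/resources/list.html", "/partners/minutes.html", "/draft/"):
    print(path, "allowed" if is_indexable(path) else "banned")
```

Because the rules match on path prefixes, grouping related resources into directories keeps the file short: one `Disallow` line covers everything under `/partners`.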

Other Barriers To Indexing
  • Other barriers to indexing robots:
  • Frames
    • Most search engines can’t index framesets and rely on appropriate <NOFRAMES> tags
  • Flash (and other proprietary formats)
    • Most search engines can’t index proprietary formats
  • Poorly implemented JavaScript pages
    • Search engines may not have JavaScript interpreters and can’t index text generated by JavaScript
  • Poorly implemented user-agent negotiation (client- or server-side)
    • Most search engines don’t have a Netscape or IE user-agent string and so will index “Upgrade to Netscape”
  • Invalid HTML Pages
    • Search engines may not be as tolerant of HTML errors as Web browsers
  • Robots have similarities to the visually impaired
  • Good design for robots is likely to be good design for people with disabilities (and vice versa)
  • Make use of tools such as Bobby, WAVE, etc. to check accessibility – see <http://www.cast.org/bobby/>

You should formulate plans for making your Web site search-engine friendly and accessible
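
To get a feel for two of the barriers listed above (frames without <NOFRAMES> content and text generated by JavaScript), the following Python sketch approximates what a simple robot with no JavaScript interpreter might extract from a page. It is an illustration only, not a model of any real search engine; the class `RobotView`, the function `robot_report` and the sample page are invented for this example.

```python
from html.parser import HTMLParser


class RobotView(HTMLParser):
    """Collect the text a simple, script-less indexing robot might see."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.has_frameset = False
        self.has_noframes = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "frameset":
            self.has_frameset = True
        elif tag == "noframes":
            self.has_noframes = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script source is never executed, so text it would write is lost.
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def robot_report(html):
    view = RobotView()
    view.feed(html)
    view.close()
    return {
        "indexable_text": " ".join(view.text),
        "frames_without_noframes": view.has_frameset and not view.has_noframes,
    }


# Hypothetical page: the list of resources exists only inside JavaScript.
page = """<html><head><title>Project news</title></head><body>
<script>document.write("Latest resources");</script>
<p>Welcome</p></body></html>"""
print(robot_report(page))
```

In this sample the words written by `document.write` never reach the index, which is the same content a user without JavaScript would miss.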

Other Ways Of Dissemination
  • Users find your Web site by:
    • Search engines
    • Following a link
    • Entering a URL which they found on a mouse mat, pen, in an article, etc
  • Links to your Web site are valuable as they:
    • Drive traffic to your Web site
    • Improve ranking in citation-based search engines such as AltaVista
  • Possible problems with links:
    • “Link-spamming services” 
    • Being in the “Web sites that suck” portal
    • Resources needed to encourage linking
Encouraging Links
  • You can:
    • Submit to directories (e.g. Yahoo!)
    • Use directory (and search engine) submission services
    • Have clear entry points with static URLs for key menu pages
    • Think about who you want to link to you and why they would do so
    • Target them and think of motivation (e.g. attractive small icon)
    • Monitor trends in links to your Web site (e.g. try <http://www.linkpopularity.com/>)
News Feeds
  • Providing automated news feeds which can be included in third-party Web sites with no manual intervention is a good way to support dissemination
Extension to News Feeds
  • The RDN (Resource Discovery Network):
    • Wants to provide news feeds about developments by RDN hubs
    • It’s using the RSS standard for news feeds (an XML/RDF application)
    • A CGI-based RSS parser (and authoring tool) has been created
    • To allow potential users to try it out easily, a JavaScript parser has also been written
    • See <http://rssxpress.ukoln.ac.uk/>

Could this (slightly) heavyweight CGI solution, complemented by a lightweight JavaScript solution, be used within your NOF-digi project?
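
To show why a feed needs no manual intervention on the receiving side, the sketch below reads the items from a small RSS 1.0 document using Python's standard XML library. It is not the RSS-xpress parser mentioned above; the feed content, the example.org URLs and the function name `read_items` are all invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0, which is an RDF application.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# Hypothetical feed content, invented for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.org/news.rss">
    <title>Example project news</title>
    <link>http://www.example.org/</link>
    <description>News from a hypothetical project</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.example.org/news/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.org/news/1">
    <title>New resources released</title>
    <link>http://www.example.org/news/1</link>
  </item>
</rdf:RDF>
"""


def read_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 1.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        link = item.findtext("rss:link", default="", namespaces=NS)
        items.append((title, link))
    return items


for title, link in read_items(SAMPLE_FEED):
    print(title, link)
```

A third-party site could run something like this on a schedule and render the pairs as a list of links, so new items appear without anyone editing a page by hand.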

Mirroring and Preservation
  • Another way to maximise impact of your Web site is for it to be mirrored:
    • Use of Web mirroring software to install service at another location (e.g. overseas to overcome network bandwidth problems or behind a firewall)
    • Issues about whether you are mirroring output from a service or the service itself (affected by push vs pull mode of mirroring)
    • NOF, for example, may wish to mirror your service in order to preserve it (once funding runs out and everyone leaves)

Note that you may wish to mirror only the project deliverables Web site, and not the Web site for partners or the Web site about the project – another reason for having separate Web sites

Conclusions
  • To conclude:
    • Make plans for the architecture of your Web service (URL naming, mirrorability, dissemination, etc.) at the start
    • Ensure your Web site is friendly to robots
    • Think about use of neutral resources which can be processed automatically by software (avoid the human bottleneck)