Technical Developments Related to Quality Issues

Technical Developments Related to Quality Issues. Contents Application-based Developments Protocol Developments Conclusions. Brian Kelly UK Web Focus UKOLN University of Bath Bath, BA2 7AY B.Kelly@ukoln.ac.uk http:/www.ukoln.ac.uk/.


Presentation Transcript


  1. Technical Developments Related to Quality Issues
     • Contents
        • Application-based Developments
        • Protocol Developments
        • Conclusions
     Brian Kelly, UK Web Focus, UKOLN, University of Bath, Bath, BA2 7AY
     B.Kelly@ukoln.ac.uk
     http:/www.ukoln.ac.uk/
     UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC’s Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.

  2. Application-Based Solutions
     • Sophisticated search engines are being developed:
        • Google: large-scale search engine for the research community (now commercial)
        • Clever: IBM research project
        • Direct Hit!: records how users make use of search engines
        • Alexa: allows end users to vote on resources

  3. Google
     • Google uses a "PageRank" technique: important resources are those pointed to from many sites and from important sites (e.g. Yahoo).
     • See <URL: http://www.google.com/>
     [Screenshots: a search for "Digital Libraries"; following the link to the first hit]
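
The "pointed to from many sites and important sites" idea can be sketched as a small power iteration. This is an illustration only, not Google's implementation: the function name `pagerank`, the damping value 0.85, the iteration count and the toy link graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not values from the slides.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split equally among its links,
                # so a link from an important page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web; the page names are placeholders.
web = {
    "yahoo": ["dlib", "ukoln"],
    "ukoln": ["dlib"],
    "homepage": ["dlib", "yahoo"],
    "dlib": ["yahoo"],
}
ranks = pagerank(web)
```

In this toy graph, "homepage" has no incoming links and ends up with the lowest score, while "dlib", which is linked from three pages including "yahoo", scores well above "ukoln".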

  4. Clever
     • See <URL: http://www.almaden.ibm.com/cs/k53/clever.html>
     • Aims to find a small set of documents containing the most authoritative information on the requested subject.
     • Uses a standard search engine to gather a "root set" of pages matching the query. Next, it adds all pages pointing to or pointed to by the root set. Thereafter, it uses only the links between these pages to distill the best authorities and hubs.
     [Screenshots: distinct pages found using Clever. Clever finds the key baseball sites; AltaVista results include sites selling medical services.]
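
The three steps described above (root set, expansion, hub/authority scoring) can be sketched roughly as follows. This is a minimal hubs-and-authorities iteration under stated assumptions, not IBM's Clever code: the function names, the iteration count and the toy pages are hypothetical.

```python
import math


def expand_root_set(root, links):
    """Add every page that points to, or is pointed to by, the root set."""
    base = set(root)
    for source, targets in links.items():
        if source in root:
            base.update(targets)
        if any(target in root for target in targets):
            base.add(source)
    return base


def hubs_and_authorities(links, iterations=20):
    """Score pages using only the links between them."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # A good hub points to good authorities.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical link data; "team_a" and "league" stand in for the
# pages a standard search engine returned for the query.
web = {
    "fan_directory": ["team_a", "team_b", "league"],
    "sports_portal": ["team_a", "league"],
    "team_a": ["league"],
    "clinic_ad": ["clinic"],
}
root = {"team_a", "league"}
base = expand_root_set(root, web)
# Keep only the links that stay inside the expanded set.
sub_web = {s: [t for t in ts if t in base] for s, ts in web.items() if s in base}
hub, auth = hubs_and_authorities(sub_web)
best_authority = max(auth, key=auth.get)
```

Here the unrelated "clinic_ad" page never enters the expanded set, and "league" comes out as the top authority because both directory-style pages and "team_a" point to it.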

  5. Direct Hit
     • Direct Hit:
        • Integrated with search engines such as Yahoo
        • Ranks results based on the click profile of other users of the search service
        • See http://www.directhit.com/
     • Example: users searching for "Dublin Core" typically click on links related to metadata, so these links are put at the top of the search results.
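
The click-profile idea can be illustrated with a very small re-ranker. This is a sketch of the general approach only; the `ClickRanker` class, its method names and the result identifiers are hypothetical and say nothing about how Direct Hit is actually built.

```python
from collections import Counter


class ClickRanker:
    """Re-order search results using clicks recorded from earlier users."""

    def __init__(self):
        # Maps a normalised query string to a Counter of result -> clicks.
        self.clicks = {}

    def record_click(self, query, result):
        self.clicks.setdefault(query.lower(), Counter())[result] += 1

    def rerank(self, query, results):
        counts = self.clicks.get(query.lower(), Counter())
        # sorted() is stable, so results with equal click counts keep
        # the order the underlying search engine gave them.
        return sorted(results, key=lambda result: -counts[result])


ranker = ClickRanker()
# Hypothetical click history for the query "Dublin Core".
for _ in range(3):
    ranker.record_click("Dublin Core", "dc-metadata-spec")
ranker.record_click("Dublin Core", "dc-tutorial")

engine_results = ["dublin-tourism", "dc-tutorial", "dc-metadata-spec"]
reranked = ranker.rerank("dublin core", engine_results)
```

With this history, the two metadata-related results move ahead of the tourism page, which nobody clicked.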

  6. Alexa
     • Alexa:
        • Enables end users to "rate" a site when surfing
        • Includes access to related links
        • Based on a central archive of the web (see <URL: http://www.archive.org/>)
        • See also Netscape's What's Related facility
        • See http://www.alexa.com/
     • Possibilities:
        • Signed votes
        • Use the Alexa model with a UK database of resources

  7. Summary
     • Good News
        • A new generation of experimental search engines is being developed
        • Algorithms include:
           • Making use of link information
           • Making use of end-user input
           • Collaborative bookmarks (cf. FireFly: you like "Sex" and "Drugs"; so does he, and he also likes "Rock'n'Roll")
     • But such techniques make use of a "brute strength" approach
     • Is there a more elegant solution?

  8. We Need Metadata!
     • Web originally based on 3 architectural components:
        • Addressing: URL
        • Transport: HTTP
        • Data format: HTML
     • Metadata is the missing component:
        • Metadata / RDF: PICS, IPR, MCF, DSig, DC, ...
     • The W3C is developing a machine-understandable metadata framework which can automate a variety of tasks (resource discovery, content filtering, etc.)

  9. RDF
     • RDF (Resource Description Framework):
        • Provides a metadata framework ("machine understandable metadata for the web")
        • Based on ideas from content rating (PICS), resource discovery (Dublin Core), etc.
        • Based on a formal data model (directed labelled graphs)
     • Applications include:
        • cataloging resources
        • resource discovery
        • intellectual property rights
        • content rating
        • digital signatures
        • privacy
     [Figure: RDF Data Model. Labels: Resource, PropertyType, Property, Value]
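
The data model in the figure (a resource, a property type, and a value) can be pictured as labelled edges in a directed graph. The sketch below is a plain-Python illustration, not an RDF library or RDF syntax: the `MetadataGraph` class is hypothetical, the `dc:` property names are used only as Dublin Core flavoured labels, and the `urn:example:` identifiers and the `ex:role` property are made up.

```python
from collections import defaultdict


class MetadataGraph:
    """Directed labelled graph: resource --property type--> value."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, resource, property_type, value):
        """Record one (resource, property type, value) statement."""
        self._edges[resource].append((property_type, value))

    def values(self, resource, property_type):
        """Return every value attached to a resource by a property type."""
        return [v for p, v in self._edges.get(resource, []) if p == property_type]


graph = MetadataGraph()
# Hypothetical statements about a hypothetical document.
graph.add("urn:example:report", "dc:title", "Technical Developments Related to Quality Issues")
graph.add("urn:example:report", "dc:creator", "urn:example:author")
# A value can itself be a resource with its own properties.
graph.add("urn:example:author", "ex:role", "UK Web Focus")

creators = graph.values("urn:example:report", "dc:creator")
roles = [role for c in creators for role in graph.values(c, "ex:role")]
```

Because a value may be another resource, a resource discovery tool can follow edges, here from the document to its creator and on to the creator's role.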

  10. Certificates
     • Certificates can be provided for:
        • Services
        • Users
        • Code (Java, ActiveX)
     • Certificate Authorities (CAs) can distribute certificates:
        • Global CAs (Verisign, Thawte)
        • National CAs (Post Office, central University body, British Library, etc.)
     • Government legislation this session related to digital signatures

  11. Certificates Within An Organisation
     • Digital signatures will enable publishers (e.g. Universities) to give an authoritative stamp to digital resources
     • The CVCP could give certificates to Universities, who would then be authorised to distribute certificates within the university
     • Within the University, the Research Office and PR Office can allocate legally-binding signatures to authorised publications
     • Staff and students can be given a certificate which is used for authentication
     [Diagram labels: University, Research Office, Press Office, Admissions, PhD Thesis, MSc Prospectus]
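
The delegation described above (CVCP to University to Research Office) amounts to a chain of issuers that a reader can walk back to a body they already trust. The sketch below models only that chain: it performs no cryptography at all, and the `Certificate` class, `issuer_chain` and `is_trusted` are hypothetical names. A real deployment would verify a digital signature at every step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Certificate:
    subject: str
    # The certificate of whoever issued this one; None marks a root.
    issuer: Optional["Certificate"] = None


def issuer_chain(cert):
    """List subjects from this certificate up to its root."""
    names = []
    while cert is not None:
        names.append(cert.subject)
        cert = cert.issuer
    return names


def is_trusted(cert, trusted_roots):
    """Accept a certificate only if its chain ends at a trusted root.

    No signature checking happens here; this is structure only.
    """
    return issuer_chain(cert)[-1] in trusted_roots


cvcp = Certificate("CVCP")
university = Certificate("University", issuer=cvcp)
research_office = Certificate("Research Office", issuer=university)
unknown_press = Certificate("Unknown Press")

chain = issuer_chain(research_office)
thesis_ok = is_trusted(research_office, {"CVCP"})
unknown_ok = is_trusted(unknown_press, {"CVCP"})
```

A thesis stamped by the Research Office is accepted because its chain leads back to the CVCP, whereas a self-issued certificate from an unknown publisher is not.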

  12. Developments for Gateways
     • Quality information gateways:
        • Can make use of signed resources to help cataloguing
        • Can provide input to sophisticated search engines (similar to Google)
     • A central organisation could give certificates to approved information gateways
     • Signed gateway: "this gateway follows xx quality conventions"
     [Diagram labels: Unsigned Gateway, Signed Gateway, Advanced search engine, Signed PhD Thesis, Information Gateway, Quality Resources]

  13. Conclusions
     • Manual Indexing
        • Subject Gateway approach
        • Quality
        • Value-added services
        • Incomplete
        • Expensive
     • Automated Indexing
        • AltaVista approach
        • Comprehensive
        • Junk indexed
        • Too many hits
     • A Third Way
        • Combination of automated and manual approaches
        • Involvement from SBIG, author and end user
        • Exciting possibilities
        • Uncertainty of timescales and success
        • Coordination required: political issues (ownership of metadata, selling ads, etc.)
