
Interoperability, Z39.50 Profiles & Testing

Netspeed 2002 Conference, October 25, 2002 Calgary, Alberta. Interoperability, Z39.50 Profiles & Testing. William E. Moen <wemoen@unt.edu> School of Library and Information Sciences Texas Center for Digital Knowledge University of North Texas Denton, TX 76203. Overview. Interoperability


Presentation Transcript


  1. Netspeed 2002 Conference, October 25, 2002, Calgary, Alberta. Interoperability, Z39.50 Profiles & Testing. William E. Moen <wemoen@unt.edu>, School of Library and Information Sciences, Texas Center for Digital Knowledge, University of North Texas, Denton, TX 76203

  2. Overview • Interoperability • Profiles • The Bath Profile • The U.S. National Profile • Beyond profiles • Indexing and search functionality • Interoperability testing Netspeed 2002 -- Calgary, Alberta -- October 2002

  3. Interoperability Systems and organizations will interoperate! “One should actively be engaged in the ongoing process of ensuring that the systems, procedures and culture of an organisation are managed in such a way as to maximise opportunities for exchange and re-use of information, whether internally or externally.” (Paul Miller, 2000)

  4. Defining interoperability System-oriented definition • The ability of two or more systems or components to exchange information and use the exchanged information without special effort on either system User-oriented definition • User’s ability to successfully search and retrieve information in a meaningful way and have confidence in the results • The condition achieved when two or more technical systems can exchange information directly in a way that is satisfactory to users of the systems (AAP)

  5. Assessing interoperability • Binary • Interoperable • Not interoperable • Continuum • More or less interoperable • Acceptable levels of interoperability

  6. Factors affecting interoperability • Multiple and disparate systems • operating systems, information retrieval systems, etc. • Multiple protocols • Z39.50, HTTP, SOAP, etc. • Multiple data formats, syntax, metadata schemes • MARC 21, UNIMARC, XML / ISBD/AACR2-based, Dublin Core • Multiple vocabularies, ontologies, disciplines • LCSH, MeSH, AAT • Multiple languages, multiple character sets • Indexing, word normalization, and word extraction policies

  7. Mapping the landscape • Networked information retrieval occurs within and across communities • Information communities • Focal community (e.g., libraries) • Extended community (e.g., cultural heritage community) • Extra community • Knowledge Domains • Intra domain • Extra domain • Costs to achieve interoperability vary

  8. Information communities (diagram labels) • Extended Community (e.g., Cultural Heritage) • Focal Community (e.g., Libraries) • Focal Community (e.g., Archives) • Focal Community (e.g., Museum) • Extra Community • Focal Community (e.g., Geospatial) • Extended Community • Focal Community (e.g., Geospatial) • Focal Community (e.g., Natural History Museums)

  9. Focal community • Community agreements exist (e.g., standards, rules, etc.) • Interoperability factors reduced • Interoperability more easily achieved • Libraries as Focal Community • Relative homogeneity of data and systems • Z39.50 widely implemented • Standards-based MARC records • Content and structure prescribed by AACR • Commonly understood access points • Use of controlled vocabularies

  10. Threats to Z39.50 interoperability • Differences in implementation of the standard • Differences in local information retrieval systems • Search functionality • Indexing policies • These threats can be addressed by • Z39.50 specifications and configuration • Enhancing local information retrieval systems • Recommendations for local indexing decisions

  11. Virtual Catalog Application

  12. Z39.50 Model of Resource Discovery

  13. Profiles (diagram labels: Complete Z39.50 Specifications; Z39.50 Profile; Z39.50 specifications) Profiles are a solution path for improving interoperability • Represent community consensus on requirements • Identify Z39.50 specifications to support those requirements • Aid in purchasing decisions • Provide specifications for vendors

  14. Profiles • Define a subset of specifications from one or more standards • Goal of profiles is to improve interoperability • Profiles are useful for: • prescribing how Z39.50 should be used in a particular application environment • solving interoperability problems with existing Z39.50 implementations within a community or across two or more communities

  15. The Bath Profile The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery, Release 2 (Draft 3, Oct. 2002) • Enables effective use of Z39.50 in a range of library applications: • Search and retrieval from library catalogues • Search and retrieval of bibliographic holdings • Search and retrieval of authority records • Cross-domain searching For more information, visit the Bath Maintenance Agency website: http://www.nlc-bnc.ca/bath/

  16. Structure of the profile • Modular for extensibility • Related requirements and specifications grouped in Functional Areas • Release 2 defines four Functional Areas • Functional Area A: Basic Bibliographic Search and Retrieval, with Primary Focus on Library Catalogues • Functional Area B: Bibliographic Holdings Search and Retrieval • Functional Area C: Cross-Domain Search and Retrieval • Functional Area D: Authority Record Search and Retrieval in Online Library Catalogues • Defines Conformance Levels for each area

  17. Addressing interoperability • The Bath Profile: • Identifies searching requirements (tasks) • Defines the searches (semantics and behavior) • Specifies the Z39.50 query that represents each search • Standard combination of Z39.50 attribute types and values • Clients must send all attribute types and values specified for a search • Servers must be able to process all values • No default behavior by client or server • Requires support for specific formats for interchanging retrieval records
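The "no default behavior" rule on slide 17 can be pictured as a completeness check on a query's attribute list. The following is a minimal sketch, not part of the profile itself: it assumes the six Bib-1 attribute types that appear in the example queries on slides 21 to 23, and the function name and the `partial` query are hypothetical.

```python
# Bib-1 attribute types, keyed by the type number used in (type, value) pairs.
REQUIRED_ATTRIBUTE_TYPES = {
    1: "Use",
    2: "Relation",
    3: "Position",
    4: "Structure",
    5: "Truncation",
    6: "Completeness",
}


def missing_attribute_types(attributes):
    """Return names of attribute types absent from a list of (type, value) pairs."""
    present = {attr_type for attr_type, _value in attributes}
    return [
        name
        for attr_type, name in sorted(REQUIRED_ATTRIBUTE_TYPES.items())
        if attr_type not in present
    ]


# Title keyword search as written on slide 21: every type is stated explicitly.
complete = [(1, 4), (2, 3), (3, 3), (4, 2), (5, 100), (6, 1)]
# Hypothetical query from a client that relies on server defaults.
partial = [(1, 4), (4, 2)]

print(missing_attribute_types(complete))
print(missing_attribute_types(partial))
```

A client that fails this check leaves the server to guess the omitted attributes, which is the kind of implementation difference the profile is meant to remove.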

  18. Functional Area A, Level 0 • Conformance Level 0 • Version 2 required, Version 3 recommended • Basic Bibliographic Search (Z39.50 Search Service) • Author Search – Keyword • Title Search – Keyword • Subject Search – Keyword • Any Search – Keyword • Basic Bibliographic Retrieval (Z39.50 Present Service) • Z-clients to support MARC 21 and SUTRS • Z-servers to support MARC 21

  19. Functional Area A, Level 1 • Conformance Level 1 • Inherits search requirements from Level 0 • Requires 15 additional searches, including: • Exact Match (author, title, subject) • First Words & First Characters in Field (author, title, subject) • Keyword with Right Truncation (author, title, subject) • Standard ID, Date • Browse Indexes (Z39.50 Scan Service) • 3 scans defined • Retrieval • Z-clients to support MARC 21 and SUTRS • Z-servers to support MARC 21

  20. Functional Areas B, C, D • Area B -- Holdings Information • Addresses the challenge of search and retrieval of bibliographic holdings information • Locations Only • Locations, Summary Information and Count if available • Summary Copy Level Holdings • Use of XML as Record Syntax • Area C -- Cross Domain Search/Retrieval • Defines two conformance levels (13 searches) • Dublin Core DTD for XML record syntax • Area D -- Authority Record Search/Retrieval • Defines one conformance level • Defines 14 searches

  21. Level 0: title keyword search Uses: Searches for complete word in a title of a resource. Example: Title search for “woman” represented in Z query as: (1,4)(2,3)(3,3)(4,2)(5,100)(6,1) woman

  22. Level 1: title keyword right truncation Uses: Searches for complete word beginning with the specified character string in fields that contain a title of a resource. Example: Title search for woman truncated as “wom” represented in Z query as: (1,4)(2,3)(3,3)(4,2)(5,1)(6,1) wom

  23. Level 1: title first words in field Uses: Searches for complete word(s) in the order specified in fields that contain a title of a resource. The field must begin with the specified character string. This search is useful when the beginning words in a title are known to the user. Example: Title search for “Gone with the” represented in Z query as: (1,4)(2,3)(3,1)(4,1)(5,2)(6,1) gone with the
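The (type, value) pairs in the three example queries can be written out mechanically in a textual query notation. The sketch below renders them in Prefix Query Format (PQF), the notation used by the YAZ toolkit; PQF is not mentioned in the slides and is an assumption here, as are the function and dictionary names. The attribute values are copied from the slides as transcribed, without checking them against the profile text.

```python
def to_pqf(attributes, term):
    """Render (type, value) attribute pairs plus a search term as a PQF string."""
    parts = [f"@attr {attr_type}={value}" for attr_type, value in attributes]
    # Quote multi-word terms so they travel as a single term.
    quoted = f'"{term}"' if " " in term else term
    return " ".join(parts + [quoted])


# Attribute combinations and terms exactly as they appear on the slides.
SLIDE_EXAMPLES = {
    "title keyword": ([(1, 4), (2, 3), (3, 3), (4, 2), (5, 100), (6, 1)], "woman"),
    "title keyword right truncation": ([(1, 4), (2, 3), (3, 3), (4, 2), (5, 1), (6, 1)], "wom"),
    "title first words in field": ([(1, 4), (2, 3), (3, 1), (4, 1), (5, 2), (6, 1)], "gone with the"),
}

for name, (attributes, term) in SLIDE_EXAMPLES.items():
    print(f"{name}: {to_pqf(attributes, term)}")
```

Only the Position, Structure, and Truncation values differ between the three searches, which shows how small the attribute changes are that separate one profile-defined search from another.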

  24. Endorsements of Bath Profile • Atlantic Scholarly Information Network • CENL Working Group on Technical Standards • Czech and Slovak Library Information Network (CASLIN) • Committee on Institutional Cooperation (CIC) • International Coalition of Library Consortia (ICOLC) • Istituto Centrale per il Catalogo Unico delle Biblioteche Italiane e per le Informazioni Bibliografiche (ICCU) • M25 Consortium of Higher Education Libraries • National Library of Canada • OCLC • ONE-2 • SmartLibrary • Standing Conference of National and University Libraries (SCONUL) • Z Texas Project

  25. Bath as foundation profile • National, regional, and state profiles based on the Bath Profile • ONE-2 Profile • DanZIG Profile • U.S. National Z39.50 Profile • Z Texas Profile

  26. Library application profiles • The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery • U.S. National Z39.50 Profile for Library Applications • Z Texas Profile: A Z39.50 Profile for Library Systems Applications in Texas Relationship among profiles (diagram): Bath Profile core specifications for global interoperability

  27. U.S. National Profile • National Information Standards Organization (NISO) standards effort • National Profile: • Addresses cross-catalog searching and holdings information interchange • Bath Profile is foundation for U.S. National Profile • Responds to national requirements • Work initiated in November 2000 • Draft standard ready by end of 2002 For more information, visit the project website: http://www.unt.edu/zprofile

  28. U.S. Profile Functional Area A • Conformance Level 0 • Version 2 required, Version 3 recommended • Basic Bibliographic Search (Z39.50 Search Service) • Author Search – Keyword (NISO) • Title Search – Keyword (Bath) • Subject Search – Keyword (Bath) • Any Search – Keyword (Bath) • Basic Bibliographic Retrieval (Z39.50 Present Service) • MARC 21 supported by Z-clients and Z-servers

  29. U.S. Profile Functional Area A • Conformance Level 1 • Version 3 required • Inherits search requirements from Level 0 • Requires 20 additional searches, including: • Exact Match (author, title, subject) • First Words & First Characters in Field (author, title, subject) • Keyword with Right Truncation (author, title, subject) • ISBN, ISSN, Standard ID, Format/Type, Date, Language • Browse Indexes (Z39.50 Scan Service) • Retrieval • Z-clients support MARC 21 • Z-servers support MARC 21

  30. U.S. Profile Functional Area A • Conformance Level 2 • 38 additional searches, including: • Key Title, Series Title, Uniform Title • Unanchored phrase searches for Title, Subject, Name, Any • Personal Author, Corporate Author, Conference Meeting • Notes, other standard number (e.g., LCCN) • Pattern searches for one or more controlled vocabularies

  31. U.S. Profile Functional Area B • Bibliographic Holdings Information Retrieval • Use of XML as Record Syntax • Z39.50 Holdings XML Schema http://www.portia.dk/zholdings/ • Harmonized with Bath Profile

  32. Z39.50 profiles are not enough • Profiles can: • Identify searching requirements (tasks) • Define the searches (semantics and behavior) • Specify Z39.50 query to represent the search and formats of retrieval records • Also needed are: • Agreements on indexing • Common search functionality • Methods and testbed for interoperability testing • Conformance to profiles by vendors and libraries

  33. Indexing & search functionality • Indexing • Access points • Deciding which MARC fields/subfields populate each index • Moving toward community agreements on common indexing policies to support profile-defined searches • Indexing guidelines available for use http://www.unt.edu/zinterop/ • Related issues: word normalization, word extraction • Search functionality • Phrase searching • Truncation • Proximity searching, etc.

  34. Interoperability testbed project Realizing the Vision of Networked Access to Library Resources: An Applied Research and Demonstration Project to Establish and Operate a Z39.50 Interoperability Testbed • An Institute of Museum and Library Services National Leadership Grant • Goal: Improve Z39.50 semantic interoperability among libraries for information access and resource sharing For more information, visit the project website: http://www.unt.edu/zinterop/

  35. Z-Interop vision • Provide a technically and organizationally trusted environment for vendors and consumers to demonstrate and evaluate Z39.50 products • Develop rigorous methodologies, test scenarios & procedures to measure and assess the extent of interoperability • Demonstrate and operate a Z39.50 interoperability testbed

  36. Z-Interop partners • Institute of Museum and Library Services • UNT’s Texas Center for Digital Knowledge • University of North Texas School of Library and Information Sciences • OCLC Online Computer Library Center • Sirsi Corporation • Sea Change Corporation, Bookwhere 2000

  37. Components of the testbed • Test dataset • 400,000 MARC 21 records from OCLC’s WorldCat • Z39.50 reference implementations • Z-client, Z-server, information retrieval system • Test scenarios & searches • Searches with known result records from dataset • Benchmarks • Results of test searches against reference implementations

  38. Analysis of test dataset • Determine frequency of words in dataset • Systematically select words for use in test searches • Identify records that contain the selected word • Aggregate Record Group • Word appears in any field or subfield • Identify records that contain the selected word in specified fields/subfields • Candidate Record Group • For example, examine records for occurrence of word in title-related fields/subfields

  39. Decomposed MARC records 400,000 MARC 21 records = 33 million decomposed records
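The slide does not show what a decomposed record looks like. One plausible reading is one row per subfield occurrence, which would put the figures above at roughly 80 rows per record on average (33 million / 400,000 = 82.5). The sketch below illustrates that reading only; the row layout, the `decompose` function, the simplified input structure, and the sample data are all assumptions rather than the project's actual schema.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One decomposed row: a single subfield occurrence in a single record."""
    record_id: str
    tag: str
    subfield: str
    value: str


def decompose(record_id, fields):
    """Flatten a record given as [(tag, [(subfield_code, value), ...]), ...] into rows."""
    rows = []
    for tag, subfields in fields:
        for code, value in subfields:
            rows.append(Row(record_id, tag, code, value))
    return rows


# Invented sample record in a simplified structure (not real MARC encoding).
sample_fields = [
    ("100", [("a", "Example, Author,"), ("d", "1900-1980.")]),
    ("245", [("a", "A river journey /"), ("c", "Author Example.")]),
    ("650", [("a", "Rivers"), ("x", "Description and travel.")]),
]

rows = decompose("rec-0001", sample_fields)
for row in rows:
    print(row)
```

Keeping the tag and subfield code on every row is what makes it possible to ask later whether a word occurs anywhere in a record or only in selected fields.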

  40. Analysis logic (flow diagram) Test Dataset (decomposed records) 1. Examine for occurrence of word “river” 2. Yields Aggregate Record Group for word “river” 3. Examine Aggregate Record Group for occurrence of word “river” in selected fields/subfields 4. Yields Candidate Record Group for word “river” in selected fields/subfields
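The four steps of the analysis logic can be sketched as two set-building passes over decomposed rows. This is an illustration under stated assumptions: rows are (record_id, tag, subfield, value) tuples, the choice of 245 $a and $b as "title-related" fields is hypothetical, word extraction is a simple lowercase alphanumeric split, and the sample rows are invented.

```python
import re

# Hypothetical set of title-related (tag, subfield) pairs; the real choice
# depends on the indexing policy in use.
TITLE_FIELDS = {("245", "a"), ("245", "b")}


def words(value):
    """Extract lowercase alphanumeric words from a subfield value."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def aggregate_group(rows, word):
    """Steps 1-2: records in which the word occurs in any field or subfield."""
    return {rid for rid, _tag, _code, value in rows if word in words(value)}


def candidate_group(rows, word, fields):
    """Steps 3-4: records in which the word occurs in the selected fields/subfields."""
    return {
        rid
        for rid, tag, code, value in rows
        if (tag, code) in fields and word in words(value)
    }


# Invented decomposed rows.
rows = [
    ("r1", "245", "a", "The river road"),
    ("r2", "245", "a", "Travels in the South"),
    ("r2", "650", "a", "Mississippi River"),
    ("r3", "245", "a", "Mountain trails"),
]

print(sorted(aggregate_group(rows, "river")))
print(sorted(candidate_group(rows, "river", TITLE_FIELDS)))
```

Record r2 lands in the aggregate group but not in the title candidate group, because "river" appears only in its subject field. The candidate group is what a title keyword search for "river" would be expected to return.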

  41. Some critical questions • What is a “word”? • Self-help • Self help • Normalization • Elena • Éléna • What are the appropriate Author, Title, and Subject fields to look in for the word? • Decision related to indexing policies
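The two questions about words can be made concrete with a small normalization routine. The following sketch shows one possible policy, not the policy used by the testbed: diacritics are stripped via Unicode decomposition, text is lowercased, and a flag (hypothetical) decides whether a hyphen separates words or joins them.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Remove combining marks after canonical decomposition (É -> E)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text, hyphen_as_space=True):
    """Return the list of index words for a string under the chosen hyphen policy."""
    text = strip_diacritics(text).lower()
    text = text.replace("-", " " if hyphen_as_space else "")
    return re.findall(r"[a-z0-9]+", text)


print(normalize("Self-help"))                         # hyphen separates words
print(normalize("Self help"))
print(normalize("Self-help", hyphen_as_space=False))  # hyphen joins words
print(normalize("Elena"), normalize("Éléna"))
```

Under the first policy "Self-help" and "Self help" index identically; under the second they do not. Two systems holding the same records can therefore return different results for the same search, which is why word extraction and normalization policies matter for interoperability.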

  42. Reference implementations • Online Catalog Software • Z-Interop testbed uses SIRSI’s UNICORN system • Test dataset loaded on the system • Indexing policies based on guidelines • Z39.50 Server • SIRSI Z39.50 Module • Configured according to Bath/U.S. Profile • Z39.50 Client • Bookwhere 2000 • Configured according to Bath/U.S. Profile

  43. Establishing benchmarks (flow diagram) • Reference Z39.50 Client: configured to support Profile specifications • Reference Z39.50 Server: configured to support Profile specifications • Test Dataset: indexed per guidelines to support Profile searches • Test searches yield Retrieval Results • Retrieval Results compared to Candidate Record Group • Benchmarks for test search

  44. Interoperability testing • Z-Interop Interoperability Testing Policies and Procedures • Test dataset loaded on participant’s system • Configured to conform with Bath/U.S. Profiles • Indexed according to participant’s policies • Testing Z-servers • Z-Interop will send test searches from reference Z-client • Report results compared with benchmarks • Analyze results to assist implementor to improve interoperability • Testing Z-clients • Test searches sent to reference Z-server

  45. Testing & assessment (flow diagram) • Test Dataset: loaded by vendor or library, indexed by vendor according to vendor’s specifications • Reference Z39.50 Client: configured to support Profile specifications • Vendor Z39.50 Server: configured by vendor for conformance to Profile • Test searches yield Retrieval Results • Retrieval Results compared to Benchmarks for test search
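Comparing a participant's retrieval results with a benchmark amounts to comparing two sets of record identifiers. The sketch below is one way to report that comparison. The slides do not say which measures the project uses (slide 47 poses this as an open research question), so the precision- and recall-style ratios, the function name, and the sample identifiers are assumptions for illustration.

```python
def compare_to_benchmark(retrieved, benchmark):
    """Summarize how a set of retrieved record IDs differs from the benchmark set."""
    retrieved, benchmark = set(retrieved), set(benchmark)
    matched = retrieved & benchmark
    return {
        "matched": len(matched),
        "missing": sorted(benchmark - retrieved),     # expected but not returned
        "unexpected": sorted(retrieved - benchmark),  # returned but not expected
        "recall": len(matched) / len(benchmark) if benchmark else 1.0,
        "precision": len(matched) / len(retrieved) if retrieved else 1.0,
    }


# Invented record IDs: the benchmark for one test search and a vendor's results.
benchmark_ids = ["r1", "r2", "r3", "r4"]
vendor_ids = ["r1", "r2", "r5"]

report = compare_to_benchmark(vendor_ids, benchmark_ids)
for key, value in report.items():
    print(f"{key}: {value}")
```

The missing and unexpected lists are the part an implementor would act on: each one points to records whose indexing or query handling differs from the reference configuration.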

  46. Current testing • Validate testing methodologies, procedures, policies • Bath/U.S. National Profiles Levels 0 & 1 Search & Retrieval • Title Search – Keyword • Author Search – Keyword • Subject Search – Keyword • Any Search – Keyword • Title, Author, Subject Searches – Keyword Right Truncation • Simple Keyword Boolean searches (AND, OR, NOT) • Test participants • InQuirion • OCLC • Innovative Interfaces • TLC/CARL • epixtech • Fretwell-Downing • M25 (UK) • Others expressing interest

  47. Research questions • What are acceptable levels of interoperability? • What are appropriate measures of interoperability? • What does conformance to a Profile mean? • Conformance of vendor’s product • Conformance of your implementation of vendor’s product • To what extent are organizations willing to support common indexing practices to improve interoperability?

  48. Critical success factors • Openness and transparency of processes • Project documents available on website • Culture of nurturing improvement • Trustworthiness • Confidentiality of participants’ results

  49. An opportunity for Z39.50 • Z39.50 experience has shown the challenges of interoperability • Problems of interoperability are better understood within a focal community • Solution paths exist • Interoperability testing serves as platform for improvement • The pieces are finally falling into place!

  50. References • The Bath Profile Maintenance Agency: http://www.nlc-bnc.ca/bath/ • U.S. National Profile: http://www.unt.edu/zprofile/ • Z39.50 Interoperability Testbed: http://www.unt.edu/zinterop/
