
The YAZ Toolkit


Presentation Transcript


  1. The YAZ Toolkit • Sebastian Hammer • Index Data - Information Retrieval consultants

  2. Why Toolkits? • Z39.50 is a machine-to-machine protocol • Good for software • Not friendly for people • Toolkits give you a programming interface • They hide encoding, network layer, error handling • Leave you to concentrate on your application

  3. YAZ • Z39.50 toolkit for Unix, Windows, etc. • ANSI-C, Portable, compact, easy to use • Widely used - constant feedback • Unlimited use license • Optional commercial support available

  4. Implementing Z39.50 • Building a server • Testing using third party clients • Building your own client

  5. The Z39.50 Server • Low-level YAZ API. Full access to Z39.50 protocol • Generic front-end server • Z39.50 server application • Simple high-level API to your "back-end" • Standard daemon under Unix - multithreaded NT Service

  6. Generic Front-end Server API • #include <backend.h> • bend_initialize • bend_search • bend_fetch • bend_close • (bend_scan, etc.)

  7. bend_initialize • Verify authentication • Initialize instance of database back-end • Create session handle for private use • Return association accept/reject

  8. bend_close • Close database back-end instance (if relevant) • Release memory

  9. bend_search • Parameters • Database name • Result set name • Query • Result • Status • Number of hits

  10. bend_search • bend_searchresult *bend_search(void *handle, bend_searchrequest *r);

  11. bend_searchrequest char *setname; int replace_set; int num_bases; char **basenames; Z_Query *query; ODR stream;
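The search callback on slides 9 to 11 can be sketched as follows. This is only an illustration under stated assumptions: the `demo_` types are simplified stand-ins for the structures named on the slides (the real definitions come from YAZ's `backend.h`), the query is reduced to a single term instead of a `Z_Query`, and the database name "Default", the error value and the in-memory title list are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for bend_searchrequest: same fields as slide 11, except that
   the Z_Query and ODR members are replaced by one plain search term. */
typedef struct {
    char *setname;       /* result set name */
    int replace_set;     /* nonzero: an existing set may be overwritten */
    int num_bases;       /* number of database names */
    char **basenames;    /* database names */
    const char *term;    /* hypothetical simplification of Z_Query *query */
} demo_searchrequest;

/* Stand-in for bend_searchresult: status plus number of hits (slide 9). */
typedef struct {
    int hits;
    int errcode;         /* 0 = success; nonzero values are hypothetical */
} demo_searchresult;

/* Hypothetical back-end data: three titles held in memory. */
static const char *demo_titles[] = { "Hamlet", "Macbeth", "Hamlet (film)" };

/* Same shape as bend_search: opaque session handle in, result out.
   The result lives in static storage to keep the sketch short. */
demo_searchresult *demo_search(void *handle, demo_searchrequest *r)
{
    static demo_searchresult res;
    size_t i;

    (void)handle;        /* the sketch keeps no per-session state */
    assert(r != NULL);
    res.hits = 0;
    res.errcode = 0;

    /* Fail the search instead of guessing when the database is unknown. */
    if (r->num_bases != 1 || strcmp(r->basenames[0], "Default") != 0) {
        res.errcode = 1;
        return &res;
    }
    /* Count every title that contains the term. */
    for (i = 0; i < sizeof demo_titles / sizeof demo_titles[0]; i++)
        if (strstr(demo_titles[i], r->term) != NULL)
            res.hits++;
    return &res;
}

/* Usage helper: returns the hit count, or -1 if the search failed. */
int demo_count_hits(const char *db, const char *term)
{
    char setname[] = "default";
    char *bases[1];
    demo_searchrequest req;
    demo_searchresult *res;

    bases[0] = (char *)db;
    req.setname = setname;
    req.replace_set = 1;
    req.num_bases = 1;
    req.basenames = bases;
    req.term = term;
    res = demo_search(NULL, &req);
    return res->errcode ? -1 : res->hits;
}
```

Calling `demo_count_hits("Default", "Hamlet")` goes through the same steps a front-end server would: fill in the request, call the handler, then read status and hit count from the result.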

  12. Z_Query • Representation of Type-1 (RPN) query in C datatypes • Dangers: • Processing recursive data structure - "easy". • Interpreting the query honestly - HARD!

  13. RPN Processing Guidelines • Check for unsupported queries carefully • Attribute types/values • Operators • Operand types • etc. • Process attributes strictly • Provide good defaults for missing attributes

  14. Query Processing Thoughts • It is better to fail a query than to return incorrect results • Look at all attribute types/values. Reject unknown attributes • Set up clients so only good attributes are sent
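The advice on slides 12 to 14 (walk the tree recursively, reject anything unknown) might look like the sketch below. The tree layout is a hypothetical simplification, not the real `Z_Query` structure, and the set of supported attributes (bib-1 Use attribute, type 1, with values 4 for title and 1003 for author) is an arbitrary choice for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified RPN tree. */
enum node_kind { NODE_TERM, NODE_AND, NODE_OR };

struct attr {
    int type;    /* attribute type, e.g. 1 = Use in bib-1 */
    int value;   /* attribute value, e.g. 4 = title */
};

struct rpn_node {
    enum node_kind kind;
    const struct rpn_node *left, *right;  /* operands of AND / OR */
    const struct attr *attrs;             /* attributes of a term node */
    size_t num_attrs;
    const char *term;                     /* search term of a term node */
};

/* The only attributes this imaginary back-end understands. */
static int attr_supported(const struct attr *a)
{
    return a->type == 1 && (a->value == 4 || a->value == 1003);
}

/* Returns 1 if every part of the query is supported, 0 otherwise.
   One unknown attribute or operator fails the whole query. */
int rpn_check(const struct rpn_node *n)
{
    size_t i;

    if (n == NULL)
        return 0;
    switch (n->kind) {
    case NODE_TERM:
        if (n->term == NULL)
            return 0;
        for (i = 0; i < n->num_attrs; i++)
            if (!attr_supported(&n->attrs[i]))
                return 0;
        return 1;
    case NODE_AND:
    case NODE_OR:
        return rpn_check(n->left) && rpn_check(n->right);
    }
    return 0;    /* unknown node kind: reject */
}

/* Usage helper: checks (title = "hamlet") AND (<use_value> = "shakespeare"). */
int rpn_demo(int use_value)
{
    struct attr title = { 1, 4 };
    struct attr other;
    struct rpn_node t1 = { NODE_TERM, NULL, NULL, NULL, 1, "hamlet" };
    struct rpn_node t2 = { NODE_TERM, NULL, NULL, NULL, 1, "shakespeare" };
    struct rpn_node root = { NODE_AND, NULL, NULL, NULL, 0, NULL };

    other.type = 1;
    other.value = use_value;
    t1.attrs = &title;
    t2.attrs = &other;
    root.left = &t1;
    root.right = &t2;
    return rpn_check(&root);
}
```

The recursion itself is short; the real work is deciding what `attr_supported` should accept and what the back-end should do when attributes are missing.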

  15. Memory Management • Built-in memory pool manager • Allocation of protocol package elements • Temporary memory for request processing • Memory recycled automatically
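To illustrate the general idea of a memory pool (allocate freely while handling a request, release everything in one step afterwards), here is a minimal sketch. It is not YAZ's pool manager: the `demo_pool` type and its functions are hypothetical, and a real pool would normally carve allocations out of larger blocks rather than call `malloc` once per item.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: remembers every allocation so that all of them
   can be released together. */
typedef struct {
    void **ptrs;     /* allocations handed out so far */
    size_t count;    /* number of entries in use */
    size_t cap;      /* capacity of the ptrs array */
} demo_pool;

void demo_pool_init(demo_pool *p)
{
    p->ptrs = NULL;
    p->count = 0;
    p->cap = 0;
}

/* Allocate n bytes owned by the pool; returns NULL on failure. */
void *demo_pool_alloc(demo_pool *p, size_t n)
{
    void *mem;

    if (p->count == p->cap) {
        size_t ncap = p->cap ? p->cap * 2 : 8;
        void **np = realloc(p->ptrs, ncap * sizeof *np);
        if (np == NULL)
            return NULL;
        p->ptrs = np;
        p->cap = ncap;
    }
    mem = malloc(n ? n : 1);
    if (mem == NULL)
        return NULL;
    p->ptrs[p->count++] = mem;
    return mem;
}

/* Free everything at once; returns how many allocations were released. */
size_t demo_pool_release(demo_pool *p)
{
    size_t i, freed = p->count;

    for (i = 0; i < p->count; i++)
        free(p->ptrs[i]);
    free(p->ptrs);
    demo_pool_init(p);
    return freed;
}

/* Usage helper: simulate one request that makes n small allocations. */
size_t demo_pool_request(size_t n)
{
    demo_pool pool;
    size_t i;

    demo_pool_init(&pool);
    for (i = 0; i < n; i++)
        if (demo_pool_alloc(&pool, 16) == NULL)
            break;
    return demo_pool_release(&pool);
}
```

The request-handling code never frees individual items, which is the property the slide describes as memory being recycled automatically.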

  16. bend_fetch • Parameters • Result set name • Offset • Result • Status • Record

  17. Retrieval Records • What record syntaxes do you need to support? • MARC = ISO2709 • SUTRS - Simple text format • GRS-1 - structured records • SGML/XML?

  18. The Z39.50 Client • Challenge: Integration with existing user interface / event paradigm • YAZ: C representation of Z39.50 protocol packages • Comstack API for exchanging protocol packages

  19. Z39.50 ASN.1 • AttributesPlusTerm ::= SEQUENCE { attributeList SEQUENCE OF AttributeElement, term Term }

  20. Z39.50 ASN.1 in C • typedef struct Z_AttributesPlusTerm { int num_attributes; Z_AttributeElement **attributeList; Z_Term *term; } Z_AttributesPlusTerm;
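The mapping on slides 19 and 20 turns an ASN.1 `SEQUENCE OF` into a count plus an array of pointers. The sketch below fills in and reads such a structure. The `demo_` types are hypothetical stand-ins that copy the shape of the struct on slide 20; the real `Z_AttributeElement` and `Z_Term` carry more detail. The attribute numbers are bib-1 values (Use = type 1, title = 4; Structure = type 4, word = 2).

```c
#include <stddef.h>

/* Hypothetical stand-ins mirroring the shape of slide 20. */
typedef struct {
    int attributeType;
    int value;
} demo_AttributeElement;

typedef struct {
    const char *text;
} demo_Term;

typedef struct {
    int num_attributes;                     /* length of the SEQUENCE OF */
    demo_AttributeElement **attributeList;  /* array of element pointers */
    demo_Term *term;
} demo_AttributesPlusTerm;

/* Return the value of the first attribute of the given type, or -1. */
int demo_find_attribute(const demo_AttributesPlusTerm *apt, int type)
{
    int i;

    for (i = 0; i < apt->num_attributes; i++)
        if (apt->attributeList[i]->attributeType == type)
            return apt->attributeList[i]->value;
    return -1;
}

/* Usage helper: build "title word search for hamlet" and look up one type. */
int demo_lookup(int type)
{
    demo_AttributeElement use = { 1, 4 };        /* Use = title */
    demo_AttributeElement structure = { 4, 2 };  /* Structure = word */
    demo_AttributeElement *list[2];
    demo_Term term = { "hamlet" };
    demo_AttributesPlusTerm apt;

    list[0] = &use;
    list[1] = &structure;
    apt.num_attributes = 2;
    apt.attributeList = list;
    apt.term = &term;
    return demo_find_attribute(&apt, type);
}
```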

  21. Comstack • Abstraction over transport service layer • Simplifies exchange of BER-encoded packages • Allows portability over transport stacks • You supply event-handling ("select")
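The event handling that the application supplies can be as small as a `select` call on the connection's file descriptor. The sketch below uses only POSIX calls and a pipe in place of a network connection; it does not call the Comstack API, and the function names are hypothetical. It assumes the transport exposes a descriptor that `select` can watch.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd has data to read.
   Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Usage helper: a pipe stands in for the connection. With write_first
   set, one byte is written before polling, so the read end is ready. */
int demo_pipe_ready(int write_first)
{
    int fds[2];
    int ready;

    if (pipe(fds) != 0)
        return -1;
    if (write_first && write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ready = wait_readable(fds[0], 0);   /* zero timeout: poll only */
    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

A client with its own event loop would add the connection's descriptor to the set it already watches and read a protocol package only once the descriptor is reported readable.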

  22. ZAP - WWW/Z39.50 Gateway Environment • User interface - HTML forms • Results - based on templates containing bits of HTML • Quick prototyping • High performance under Apache webserver

  23. Other Options - scripting • IrTcl - Z39.50 package for Tcl/Tk • High-level scripting environment • Incremental prototyping • Build platform-independent GUI clients or WWW gateways

  24. News? • Protocol encoders now ASN.1 compiler-generated • Easy to add or switch to new ASN.1 (e.g. ILL) • Protocol package pretty-printing • Thread-safe. Solid Windows port • Optional commercial support

  25. Where is it? www.indexdata.dk
