Implementation of a faceted catalog search solution Kristin Antelman & Emily Lynema NCSU Libraries Feb. 7, 2006
Overview • Purchase decision • Implementation team • Technical overview • Features • Interface decisions • The future…
Purchase Decision • Lots of broad topical keyword searches • Authority infrastructure underutilized • No relevancy ranking of results • Opportunity to partner with Endeca
Implementation Team • Andrew Pace, Systems, Chair • Cindy Levine, Research and Information Services • Emily Lynema, Systems, ex officio (tech lead) • Erik Moore, Systems, ex officio (ILS librarian) • Charley Pennell, Cataloging • Shirley Rodgers, Systems • Tito Sierra, Digital Library Initiatives
Technical Overview • Endeca ProFind co-exists with SirsiDynix Unicorn ILS and Web2 online catalog. • Endeca indexes MARC records exported from Unicorn. • Index is refreshed nightly with records added/updated during previous day.
Endeca ProFind Overview • Endeca’s ProFind software is responsible for… • Ingesting and indexing reformatted NCSU data. • Creating a back-end service that responds to queries with result sets. • NCSU is responsible for… • Reformatting MARC records into something the Endeca application can parse. • Keeping these reformatted records up to date. • Building the web application that users see. • Sending queries to the Endeca back-end service and displaying results.
Data Extraction • First extract MARC data for import into Endeca.
MARC to ?? • Endeca doesn’t understand MARC records. • MARC is reformatted into flat text file(s) for ingest by Endeca. • Creates opportunity to manipulate data on the back-end.
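A minimal sketch of the reformatting step, assuming a pipe-delimited `name|value` layout. The slides do not specify the flat text format Endeca ingests, so the layout, the `FIELD_MAP` table, and the `flatten_record` helper are all hypothetical. The record is modeled as plain tuples so the example runs without a MARC library.

```python
# Hypothetical mapping from (MARC tag, subfield code) to a flat-file
# property name. The real mapping used at NCSU is not given in the slides.
FIELD_MAP = {
    ("100", "a"): "Author",
    ("245", "a"): "Title",
    ("650", "a"): "Subject: Topic",
}


def flatten_record(record_id, fields, field_map=FIELD_MAP):
    """Turn one MARC-like record into a list of 'name|value' lines.

    `fields` is a list of (tag, [(subfield_code, value), ...]) pairs.
    Subfields that are not in `field_map` are skipped, which is where
    back-end data manipulation could be added.
    """
    lines = [f"RecordId|{record_id}"]
    for tag, subfields in fields:
        for code, value in subfields:
            name = field_map.get((tag, code))
            if name:
                lines.append(f"{name}|{value.strip()}")
    return lines


record = [
    ("100", [("a", "Homer")]),
    ("245", [("a", "Iliad "), ("c", "translated by ...")]),
    ("650", [("a", "Trojan War")]),
]
flat = flatten_record("rec1", record)
```

Because the output is plain text under local control, fields can be cleaned, split, or combined here before Endeca ever sees them.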
Nightly Update • Each night a script updates the data indexed by Endeca: • Exports updated or new MARC records from Unicorn. • Reformats and merges these records with those already indexed. • Starts Endeca re-index – completely rebuilding index for the catalog. • Process requires about 7 hours.
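The merge step of the nightly update might look roughly like the sketch below. It assumes reformatted records are keyed by a record identifier; `merge_updates` and the dictionary layout are illustrative names, not part of Unicorn or Endeca.

```python
def merge_updates(indexed, updates):
    """Return the full record set to hand to the nightly re-index.

    `indexed` holds the records already indexed, keyed by record id.
    `updates` holds records added or changed during the previous day.
    An updated record replaces the older copy; a new record is added.
    """
    merged = dict(indexed)   # copy so the previous snapshot is untouched
    merged.update(updates)   # newer versions win on key collisions
    return merged


indexed = {"rec1": "Title|Iliad", "rec2": "Title|Odyssey"}
updates = {"rec2": "Title|The Odyssey", "rec3": "Title|Aeneid"}
full_set = merge_updates(indexed, updates)
```

The merged set is then re-indexed in full, which matches the slide's note that the index is completely rebuilt each night rather than patched in place.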
Quick Demo • http://catalog.lib.ncsu.edu
Interface Decisions • Search interface pages • Full view holdings display • Order of dimensions
Search Interface Pages • Problem: How to provide Endeca keyword searching and Web2 authority searching while keeping the search interface as close to the ‘one box’ approach as possible.
Pre-Endeca Catalog Search • 6 search tabs • 14 radio buttons • 1-4 drop down boxes
Endeca Catalog Search • 3 search tabs • No radio buttons • 2 search boxes • Keyword search default
Option 1: Links within Search tab Right-hand links to 3 pages under Search tab too confusing
Option 2: Single drop-down • Too many options in drop-down overwhelm user • Keyword and authority searching lead to completely different interfaces
Option 3: Separate tab for authority searches Anybody know what Begins With… tab does?
Full-View Holdings Display • Problem: Communicate whether a resource is available and where it is located in a usable fashion.
Pre-Endeca Results List • Too many boxes, lines, and shaded areas. • Elements for a single record not visually grouped.
First version of results page wireframe (~8 total iterations). Ideas drawn from Web2, RedLightGreen, Amazon, etc.
5th Revision: Attempt to aggregate holdings information by call number. • Brief view vs. Full view gives user choice about displaying holdings. • Particularly confusing for online resources.
8th (and Final) Revision: Aggregate holdings information by library. • Reduces complexity of continuing and online resources.
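As an illustration of aggregating holdings by library, the sketch below collapses item-level rows into one summary per library. The item fields (`library`, `status`) and the `"available"` status value are assumptions made for the example, not the catalog's actual data model.

```python
def holdings_by_library(items):
    """Summarize item-level holdings as one entry per library.

    Each item is a dict with hypothetical keys 'library' and 'status'.
    Returns {library: {"total": n, "available": m}} in first-seen order.
    """
    summary = {}
    for item in items:
        entry = summary.setdefault(item["library"], {"total": 0, "available": 0})
        entry["total"] += 1
        if item["status"] == "available":
            entry["available"] += 1
    return summary


items = [
    {"library": "D. H. Hill", "status": "available"},
    {"library": "D. H. Hill", "status": "checked out"},
    {"library": "Design", "status": "available"},
]
summary = holdings_by_library(items)
```

A display built on this summary shows one line per library regardless of how many volumes or copies sit behind it, which is the property that helps with continuing resources.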
Dimension Display • Problem: With 10 dimensions to display on the results page, where should they appear (and in what order)? • Goal: Give high visibility to dimensions that will be most valuable to users, but also highlight useful dimensions that may represent new concepts.
1. Subject: Topic 2. Subject: Genre 3. Format 4. Library 5. Subject: Region 6. Subject: Era 7. Language 8. Author 9. Availability 10. Library of Congress Classification
Future Plans • Ongoing tweaks: • Relevance ranking algorithms & spell correction thresholds • Display fixes/enhancements • Additional browsing options • Endeca 2.0 ideas • FRBR-ized display [more later] • Build detail page in Endeca with live item data from Oracle • Shopping cart functionality for email/export of records • Enrich records with supplemental content – more usable TOCs, book reviews, etc.
FRBR & Rollup • Explore Endeca’s built-in rollup functionality. • Need to create a single text key to ‘roll up’ individual records for different editions into a single work result. • Looking at using author/title keys as outlined in the Library of Congress FRBR display tool algorithm.
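A rough sketch of the single text key idea follows. It uses a simplified normalization (strip diacritics, lowercase, drop punctuation) as a stand-in; it is not the Library of Congress FRBR display tool algorithm, and `normalize`, `rollup_key`, and `roll_up` are hypothetical helper names.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_marks.lower())
    return " ".join(cleaned.split())


def rollup_key(author, title):
    """Build one text key so editions of the same work share a value."""
    return f"{normalize(author)}/{normalize(title)}"


def roll_up(records):
    """Group records (dicts with 'author' and 'title') by rollup key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rollup_key(rec["author"], rec["title"])].append(rec)
    return dict(groups)


editions = [
    {"author": "Homer.", "title": "Iliad /"},
    {"author": "HOMER", "title": "Iliad :"},
    {"author": "Homer", "title": "Odyssey"},
]
works = roll_up(editions)
```

Records that differ only in case or trailing punctuation end up under one key, so each key can be presented as a single work-level result with its editions behind it.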
User performs keyword search for ‘iliad’. Single aggregate record represents 73 actual records: different editions of the Iliad with Homer as author.
Click on ‘See all editions’ to view individual publication and holdings information for each aggregated result.