Endeca @ NCSU Libraries
Andrew Pace & Emily Lynema
NCSU Libraries
May 24, 2006
Technical Overview
• Endeca Information Access Platform co-exists with SirsiDynix Unicorn ILS and Web2 online catalog.
• Endeca indexes MARC records exported from Unicorn.
• Index is refreshed nightly with records added/updated during previous day.
Endeca IAP Overview
[Diagram, shown on three consecutive slides: Raw MARC data -> NCSU exports and reformats -> Flat text files -> Data Foundry (parse text files) -> Indices -> MDEX Engine -> HTTP -> NCSU Web Application -> HTTP -> Client browser. The second slide highlights the portion labeled "Offline - Nightly"; the third highlights the portion labeled "Always Online".]
Integrating Endeca
• Endeca doesn’t understand MARC data or MARC-8 character encoding, so records are translated to UTF-8 text files.
• Each night a script updates the data indexed by Endeca:
  • Exports updated or new MARC records from Unicorn.
  • Reformats and merges these records with those already indexed.
  • Starts the Endeca re-index, completely rebuilding the index for the catalog.
  • Process requires about 7 hours.
• Retain Web2 OPAC for some functionality:
  • Authority searching - known items and cross-references
  • Detailed record pages - how to make Endeca -> Web2 link?
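The "reformat and merge" step of the nightly script can be pictured with a small sketch. This is not the NCSU script: the record IDs, field names, and tab-separated layout below are hypothetical, chosen only to show updated or new records replacing or extending the set that is already indexed before a full re-index.

```python
# Illustrative sketch only; record IDs, field names, and the flat-file
# layout are assumptions, not the format NCSU actually used.

def merge_records(indexed, updates):
    """Lay updated or new records over the already-indexed ones.

    Both arguments map a record ID to a dict of field values.
    A record present in `updates` replaces the indexed copy entirely.
    """
    merged = dict(indexed)   # copy, so the previous night's data is untouched
    merged.update(updates)   # new IDs are added, existing IDs are replaced
    return merged


def to_flat_line(record_id, fields):
    """Serialize one record as a single tab-separated UTF-8 text line."""
    parts = [record_id] + [f"{name}={value}" for name, value in sorted(fields.items())]
    return "\t".join(parts)


if __name__ == "__main__":
    indexed = {
        "rec1": {"title": "Old title", "author": "Smith"},
        "rec2": {"title": "Unchanged", "author": "Jones"},
    }
    updates = {
        "rec1": {"title": "Corrected title", "author": "Smith"},  # updated record
        "rec3": {"title": "New arrival", "author": "Lee"},        # new record
    }
    for record_id, fields in sorted(merge_records(indexed, updates).items()):
        print(to_flat_line(record_id, fields))
```

Because the whole index is rebuilt each night, the merge only has to produce a complete, current set of flat records; it does not need to track which records changed.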
Integrating Endeca - Future
• MarcAdapter plugin for raw MARC data.
  • Create local field mappings and special handlers in Java.
  • Eliminate need for external MARC 21 translation and file merging.
• Partial Updates
  • Update circulation data multiple times throughout the day.
Quick Demo
• http://catalog.lib.ncsu.edu
Other interesting tidbits… (March 2006)
• Authority searching decreased 45%.
• Keyword searching increased 230%.
  • Caveat: default catalog search changed from title authority to keyword.
• ~6% of keyword searches offered spelling correction or suggestion
  • 3.6% - automatic spell correction
  • 2.6% - “Did you mean…” suggestion
Usability Testing
• 10 undergraduate students
  • 5 with Endeca catalog
  • 5 with old Web2 OPAC
• Endeca performed as well as OPAC for known-item searching in usability test
  • 89% Endeca tasks completed ‘easily’ (8/9)
  • 71% OPAC tasks completed ‘easily’ (15/21)
• Endeca performed better than OPAC for topical searching in usability test.
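The two percentages follow directly from the task counts in parentheses. A minimal check, assuming each figure is the share of tasks rated "easily" completed, rounded to a whole percent (the function name is hypothetical):

```python
def completion_rate(easy, total):
    """Percentage of tasks completed 'easily', rounded to a whole number."""
    return round(100 * easy / total)


if __name__ == "__main__":
    print(completion_rate(8, 9))    # Endeca tasks
    print(completion_rate(15, 21))  # Web2 OPAC tasks
```

With only 9 and 21 tasks behind them, these rates are indicative rather than statistically robust.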
Usability Testing Trends
• Relevance *most* important
  • “Once I scroll through a page, I get pretty discouraged about the results...” (Web2 OPAC participant looking for resources on cat health)
• ‘Keyword’ term less intuitive / trusted than ‘Subject’ and ‘Title’
  • “[I used] Keyword in Title because that’s what I want the book to be mainly referring to. But I also could’ve went Keyword in Subject. But if I’d have went Keyword Anywhere it would have had too big of a field to look through.” (Web2 OPAC participant looking for resources on gene therapy)
• When found, dimensions seem intuitive and useful
• ‘Did you mean’ seems intuitive
A study in relevance
• Are search results in Endeca more likely to be relevant to a user’s query than search results in Web2 OPAC?
• 100 topical user searches from 1 month in fall 2005
• How many of top 5 results relevant?
  • 40% relevant in Web2 OPAC
  • 68% relevant in Endeca catalog
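"How many of the top 5 results are relevant" is the measure usually called precision at 5. The slide does not say how the per-search judgments were aggregated; the sketch below assumes a simple average across searches and counts missing results as not relevant. The function names and sample judgments are made up for illustration and are not study data.

```python
def precision_at_k(judgments, k=5):
    """Fraction of the top k results judged relevant.

    `judgments` is a list of 0/1 relevance judgments in ranked order.
    If a search returned fewer than k results, the missing slots
    count as not relevant (an assumption of this sketch).
    """
    return sum(judgments[:k]) / k


def mean_precision_at_k(all_judgments, k=5):
    """Average precision at k over a collection of searches."""
    return sum(precision_at_k(j, k) for j in all_judgments) / len(all_judgments)


if __name__ == "__main__":
    # Made-up judgments for three searches.
    searches = [
        [1, 1, 0, 1, 0],
        [0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1],
    ]
    print(f"{mean_precision_at_k(searches):.0%}")
```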
Relevance defined
• Relevance ranking in Endeca - select from a variety of modules and order them based on importance.
• Relevance most important in Keyword Anywhere - searches all fields.
• At NCSU…
  • Original query term(s) (no thesaurus, stemming, spell correction)
  • Exact phrase match
  • Field ranking (Title higher than Author higher than Table of Contents)
  • Number of fields that contain term(s)
  • …
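Ordering modules by importance amounts to a lexicographic sort: a later module only breaks ties left by the earlier ones. The toy sketch below illustrates that idea for three of the modules listed above. It is not Endeca's API or configuration; the field names, record layout, and matching rules are assumptions, and the "original query term(s)" module is omitted because this toy has no thesaurus, stemming, or spell correction to distinguish from.

```python
# Toy illustration of ordered relevance modules; not Endeca code.
FIELD_PRIORITY = ["title", "author", "toc"]  # Title > Author > Table of Contents


def score(record, query):
    """Return a sort key; smaller tuples rank higher."""
    phrase = query.lower()
    terms = phrase.split()
    texts = {f: record.get(f, "").lower() for f in FIELD_PRIORITY}

    # Module 1: does any field contain the exact phrase?
    exact_phrase = any(phrase in text for text in texts.values())

    # Fields containing at least one query term, in priority order.
    matching = [f for f in FIELD_PRIORITY
                if any(term in texts[f].split() for term in terms)]

    # Module 2: highest-priority field with a match (lower index is better).
    best_field = FIELD_PRIORITY.index(matching[0]) if matching else len(FIELD_PRIORITY)

    # Module 3: number of fields containing a term (more is better, so negate).
    return (not exact_phrase, best_field, -len(matching))


def rank(records, query):
    """Sort records so earlier modules dominate and later ones break ties."""
    return sorted(records, key=lambda r: score(r, query))


if __name__ == "__main__":
    records = [
        {"id": "c", "title": "Biology", "author": "Gene Wilder", "toc": ""},
        {"id": "b", "title": "Therapy for the gene pool", "author": "Jones", "toc": ""},
        {"id": "a", "title": "Gene therapy methods", "author": "Smith",
         "toc": "gene therapy overview"},
    ]
    print([r["id"] for r in rank(records, "gene therapy")])
```

Reordering the elements of the tuple changes which module dominates, which is the kind of tuning the slide describes.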
Future Plans
• Ongoing tweaks:
  • Continued usability testing
  • Relevance ranking algorithms & spell correction thresholds
  • Additional browsing options
• Endeca 2.0 ideas
  • FRBR-ized display
  • Discussions with OCLC regarding FAST (Faceted Application of Subject Terminology) and FRBR
  • Patron-generated refinements (folksonomies?)
  • Enrich records with supplemental Web Services content - more usable TOCs, book reviews, etc.
  • The death of authority searching (?)
  • More integration with QuickSearch, other data repositories, and third-party discovery tools
Thanks
http://www.lib.ncsu.edu/endeca
Andrew Pace, Head, IT
email@example.com
Emily Lynema, Systems Librarian for Digital Projects
firstname.lastname@example.org