Learn about XTF 2.1, a powerful and highly customizable search and display technology developed by the California Digital Library. From community-inspired features to planned improvements, discover how XTF simplifies search with features such as faceted browsing, keyword search, bookbag, spelling correction, and more. Explore the upcoming enhancements and integrated support for varied file formats, along with the collaborative development philosophy guiding XTF's evolution.
XTF 2.1: Powerful Search and Display without the Headaches • Martin Haye, California Digital Library
Overview • What is XTF? • Community-inspired development • New features in 2.1 • Planned improvements
XTF in 2 minutes • eXtensible Text Framework • Search and display technology from CDL • Open-source Java framework • Powerful and highly configurable • XML + Full text search • Also indexes PDF, HTML, Word
XTF in 2 minutes • Search: Query power/speed of Lucene, plus: • keyword search, facets, spelling, lots more • View: Processing power of Saxon, plus: • large file optimizations, hit markup • Configure and customize exclusively in XSLT • Mature, tightly integrated, well documented • In use at CDL and many other places
How does XTF compare? [Chart: Greenstone*, Solr*, XTF 2.0 and XTF 2.1 placed along two axes, "Turn-key / easy" and "Customizable / powerful"] * disclaimer: based on my limited experience with Greenstone and Solr
Community-inspired Development • First, we asked the XTF community for features they wanted • Then everybody voted • People wanted many features they saw in XTF projects at CDL
Aligning Our Process • Our group was starting a new CDL project • We aligned our development • Result: Everybody benefits
New and improved features • Faceted browse • Search flexibility • Bookbag • Spelling correction • Similar items • Other stuff
Faceted browse • Previously, implementing faceted browse required lots of XSLT programming. • Hierarchical facets were even harder. • Getting there required us to deeply refactor the stylesheets, but now it’s simple to add new facets.
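To make the idea of hierarchical facets concrete, here is a minimal sketch of how facet counts roll up from leaf values to their ancestors. It is an illustration only, not XTF's implementation (XTF does this inside its index and stylesheets). The `facet_counts` function, the `subject` field, the sample documents, and the `::` level separator are all assumptions made for the example.

```python
from collections import Counter


def facet_counts(docs, field, sep="::"):
    """Count documents per facet value, including every ancestor level.

    A document tagged "Science::Physics" is counted under both
    "Science" and "Science::Physics". Each document is counted at most
    once per facet value, even if several of its values share an ancestor.
    """
    counts = Counter()
    for doc in docs:
        seen = set()
        for value in doc.get(field, []):
            parts = value.split(sep)
            # Add the value itself and each of its ancestors.
            for depth in range(1, len(parts) + 1):
                seen.add(sep.join(parts[:depth]))
        counts.update(seen)
    return counts


# Hypothetical sample data, for illustration only.
docs = [
    {"title": "A", "subject": ["Science::Physics"]},
    {"title": "B", "subject": ["Science::Biology"]},
    {"title": "C", "subject": ["History"]},
]
counts = facet_counts(docs, "subject")
for value in sorted(counts):
    print(value, counts[value])
```

The point of the sketch is that a top-level facet such as "Science" shows the total for everything beneath it, which is what lets a user drill down one level at a time.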
Search flexibility • Keyword search: single box (now default). Internally, searches multiple fields. • Advanced search: explicitly fill in constraints for various fields • Freeform search (new): text-based field specifiers, AND, OR, parentheses, etc.
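The single-box keyword search can be pictured as a query whose terms may each be satisfied by any of several fields. The sketch below shows that idea on plain dictionaries. It is not XTF's code, and the exact matching rules XTF applies are not stated in these slides: the `keyword_match` function, the field names, and the "every term must appear in some field" rule are assumptions for illustration.

```python
def keyword_match(doc, query, fields=("title", "creator", "subject", "text")):
    """Return True if every query term occurs in at least one field.

    Matching is case-insensitive on whitespace-separated tokens.
    An empty query matches nothing.
    """
    terms = query.lower().split()
    tokens = set()
    for name in fields:
        # Pool the tokens of all searched fields into one set.
        tokens.update(str(doc.get(name, "")).lower().split())
    return bool(terms) and all(term in tokens for term in terms)


# Hypothetical sample document, for illustration only.
doc = {
    "title": "Moby Dick",
    "creator": "Herman Melville",
    "text": "Call me Ishmael",
}
print(keyword_match(doc, "melville ishmael"))  # terms found in different fields
print(keyword_match(doc, "melville ahab"))     # one term found nowhere
```

The user types into one box and never names a field, yet a query can still succeed when its terms are spread across, say, the author and the full text.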
OAI-PMH • This fit nicely into XTF’s architecture • Simple but conforming implementation
Bookbag • Refactored the AJAX to use YUI (Yahoo User Interface widgets) • Still session based • Now supports emailing the bookbag
Spelling correction • Unicode bug fixes • On by default and fully integrated
Similar items • Allows user to see “more like this” • Improved AJAX integration • On by default - no configuration needed
Other changes in XTF 2.1 • Built-in NLM “Blue”, TEI P5, MS Word support (still support TEI P4, EAD, PDF, HTML, text) • Valid XHTML output • RawQuery servlet to provide a query back-end to a (e.g. Ruby) front-end or mash-up. • Bug fixes and minor changes (many reported/requested by users)
On the horizon • A page-turner for scanned texts and converted PDFs • Pop-up image/PDF page snippets • Background auto-warming, to speed response after incremental indexing • And of course, features suggested as users upgrade to or adopt XTF 2.1
Philosophy • Adaptation through programming • XTF is still about building what you want using a set of powerful tools But now: • Stylesheets are more modular • Build interfaces faster using honed widgets • Prettier UI to start with
Fin • Download: xtf.sourceforge.net • Documentation: xtf.wiki.sourceforge.net • Discussion: groups.google.com/group/xtf-user • Me: martin.haye@ucop.edu