VuFind Beyond MARC: Discovering Everything Else

Demian Katz, VuFind Developer, demian.katz@villanova.edu

Presentation Transcript


  1. VuFind Beyond MARC: Discovering Everything Else • Demian Katz, VuFind Developer • demian.katz@villanova.edu

  2. How VuFind Used to Work • MARC records were loaded into Solr. • Data parsed to fields for searching/faceting. • Full binary record stored in “fullrecord” field. • Solr was used for retrieving records. • VuFind’s PHP code made heavy use of “fullrecord” data for building displays.

  3. What’s wrong with that? • MARC must die. • Not all searchable documents are MARC. • Code for pulling data from MARC is ugly.

  4. Redesign Goals • Centralize MARC-specific code so it can be easily replaced. • Use stored Solr fields whenever possible. • Allow arbitrary metadata formats to coexist peacefully. • Make no assumptions about metadata content.

  5. The Solution: Record Drivers • A class interface for displaying a document retrieved from Solr. • A new Solr field tells VuFind which Record Driver to instantiate for each document. • A default Record Driver can be written to display a document based solely on stored Solr fields.
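The dispatch step described on this slide can be sketched as follows. This is a minimal illustration in Python, not VuFind's PHP code: the field name `record_driver`, the registry `DRIVERS`, and the class names `DefaultDriver` and `MarcDriver` are all hypothetical, chosen only to show how one stored field can select the class used to display a document.

```python
from typing import Any, Dict, Type


class DefaultDriver:
    """Hypothetical default driver: uses only stored Solr fields."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self.fields = fields

    def get_search_result(self) -> str:
        return str(self.fields.get("title", "[untitled]"))


class MarcDriver(DefaultDriver):
    """Hypothetical driver for documents that originated as MARC."""

    def get_search_result(self) -> str:
        return super().get_search_result() + " [MARC]"


# Hypothetical registry: value of the driver field -> driver class.
DRIVERS: Dict[str, Type[DefaultDriver]] = {"marc": MarcDriver}


def driver_for(doc: Dict[str, Any], field: str = "record_driver") -> DefaultDriver:
    """Instantiate the driver named in the document, else the default one.

    Assumes the field holds a single string value, not a list.
    """
    cls = DRIVERS.get(doc.get(field, ""), DefaultDriver)
    return cls(doc)


if __name__ == "__main__":
    docs = [
        {"record_driver": "marc", "title": "A catalogued book"},
        {"title": "A harvested article"},
    ]
    for doc in docs:
        print(driver_for(doc).get_search_result())
```

Falling back to the default driver for unknown or missing values keeps every indexed document displayable, which matches the goal of making no assumptions about metadata content.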

  6. One Key Design Decision • What should the Record Driver class contain? • Data-oriented methods (getTitle, getAuthor, etc.) • Screen-oriented methods (getSearchResult, getStaffView, etc.)

  7. The Answer: All of the Above

     interface RecordInterface
         public getSearchResult()
         public getStaffView()
         …

     class IndexRecord implements RecordInterface
         protected getAuthor()
         protected getTitle()
         …

     class MarcRecord extends IndexRecord
         protected getAuthor()
         protected getTitle()
         …
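The layering in this outline can be sketched in Python as below. The class and method names follow the slide (adapted to Python naming), but the bodies are assumptions made for illustration: in particular, `marc` is a hypothetical pre-parsed map of tag and subfield to value, standing in for a real MARC parser working on the stored record.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class RecordInterface(ABC):
    """Screen-oriented methods that every driver must provide."""

    @abstractmethod
    def get_search_result(self) -> str: ...

    @abstractmethod
    def get_staff_view(self) -> str: ...


class IndexRecord(RecordInterface):
    """Default driver: answers everything from stored index fields."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self.fields = fields

    # Data-oriented methods (protected by convention in Python).
    def _get_title(self) -> str:
        return str(self.fields.get("title", ""))

    def _get_author(self) -> str:
        return str(self.fields.get("author", ""))

    # Screen-oriented methods are built on the data-oriented ones.
    def get_search_result(self) -> str:
        return f"{self._get_title()} / {self._get_author()}"

    def get_staff_view(self) -> str:
        return "\n".join(f"{k}: {v}" for k, v in sorted(self.fields.items()))


class MarcRecord(IndexRecord):
    """Overrides only the data-oriented methods; screens are inherited."""

    def __init__(self, fields: Dict[str, Any], marc: Dict[str, str]) -> None:
        super().__init__(fields)
        self.marc = marc  # hypothetical parsed record, e.g. {"245a": "..."}

    def _get_title(self) -> str:
        # 245$a is the MARC title field; fall back to the index field.
        return self.marc.get("245a") or super()._get_title()

    def _get_author(self) -> str:
        # 100$a is the MARC main-entry personal name.
        return self.marc.get("100a") or super()._get_author()


if __name__ == "__main__":
    record = MarcRecord({"title": "indexed title"}, {"245a": "Emma", "100a": "Austen"})
    print(record.get_search_result())
```

Because the screen-oriented methods call the data-oriented ones, a format-specific subclass can change where a title comes from without touching any display logic, while a local customization can still override a whole screen when needed.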

  8. Record Driver Benefits • Large-scale changes are possible. • Small-scale changes are easy. • Allows object-specific behaviors. • Eases maintenance of local customizations.

  9. Next Problem… • Where’s the data? • MARC records traditionally come from an ILS export. • SolrMarc traditionally takes care of populating VuFind’s Solr index.

  10. Growing the Toolkit • The toolkit approach is important! • Problems to solve: obtain records from remote sources; process harvested files; index arbitrary XML.

  11. Tool #1: OAI-PMH Harvester • Purpose of tool: harvest metadata files from an OAI-PMH server into a directory. • Key feature: ID manipulation. • Key feature: delete support.
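The two key features named here, ID manipulation and delete support, can be sketched against an OAI-PMH `ListRecords` response. This Python example is an illustration under stated assumptions rather than VuFind's harvester: the function names, the prefix-stripping rule, and the `.delete` marker file convention are hypothetical, and fetching over HTTP (including resumption tokens) is left out so the example runs offline.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, Tuple

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def local_id(identifier: str, strip_prefix: str = "") -> str:
    """ID manipulation: drop a known prefix, then make the rest filename-safe."""
    identifier = identifier.strip()
    if strip_prefix and identifier.startswith(strip_prefix):
        identifier = identifier[len(strip_prefix):]
    return "".join(c if c.isalnum() or c in "-_." else "_" for c in identifier)


def save_records(
    response_xml: str, out_dir: Path, strip_prefix: str = ""
) -> Tuple[List[str], List[str]]:
    """Write one file per record; return (saved ids, deleted ids)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: List[str] = []
    deleted: List[str] = []
    root = ET.fromstring(response_xml)
    for record in root.iter(OAI_NS + "record"):
        header = record.find(OAI_NS + "header")
        if header is None:
            continue
        rec_id = local_id(header.findtext(OAI_NS + "identifier", default=""), strip_prefix)
        # OAI-PMH marks removed records with status="deleted" on the header.
        if header.get("status") == "deleted":
            (out_dir / f"{rec_id}.delete").write_text(rec_id, encoding="utf-8")
            deleted.append(rec_id)
            continue
        metadata = record.find(OAI_NS + "metadata")
        if metadata is None or len(metadata) == 0:
            continue
        # Keep only the metadata payload, not the OAI-PMH envelope.
        payload = ET.tostring(metadata[0], encoding="unicode")
        (out_dir / f"{rec_id}.xml").write_text(payload, encoding="utf-8")
        saved.append(rec_id)
    return saved, deleted
```

Writing deletions as marker files in the same directory lets a later import step treat additions and removals in a single pass over the harvested files.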

  12. Tool #2: Batch Import Scripts • Purpose of tool: process all metadata files in a directory. • Easily achieved with Windows batch or Unix shell scripting. • Several sample scripts ship with VuFind.
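The slide mentions batch and shell scripts; the same loop is shown here in Python only to keep one language across the examples. It is a sketch under assumptions: `import_directory`, the `import_one` callback, and the `processed` subdirectory are hypothetical names, not the scripts that ship with VuFind.

```python
from pathlib import Path
from typing import Callable, List


def import_directory(
    in_dir: Path,
    import_one: Callable[[Path], None],
    pattern: str = "*.xml",
) -> List[Path]:
    """Run import_one on each matching file and move successes aside."""
    processed = in_dir / "processed"
    processed.mkdir(exist_ok=True)
    done: List[Path] = []
    for path in sorted(in_dir.glob(pattern)):
        try:
            import_one(path)
        except Exception as exc:
            # Leave failed files in place so they can be retried later.
            print(f"skipped {path.name}: {exc}")
            continue
        target = processed / path.name
        path.replace(target)
        done.append(target)
    return done
```

Moving each successfully imported file out of the input directory makes the script safe to re-run: only files that have not yet been imported remain to be picked up.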

  13. Tool #3: XSLT Importer • Purpose of tool: use XSLT to map an XML document to a Solr document that follows VuFind’s schema. • Key feature: PHP integration. • Key feature: Aperture support. • Several sample XSLT documents ship with VuFind (DSpace, OJS, VuDL).
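The kind of mapping an import stylesheet performs can be illustrated as follows. Note the substitution: this is plain Python standing in for XSLT, because Python's standard library has no XSLT processor. The Dublin Core input, the `FIELD_MAP` table, and the target field names `id`, `title`, and `author` are assumptions for the example, not VuFind's actual schema or sample stylesheets; the output uses Solr's XML update format (`<add><doc><field name="…">`).

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical mapping: source element -> Solr field name.
FIELD_MAP = {
    DC + "identifier": "id",
    DC + "title": "title",
    DC + "creator": "author",
}


def to_solr_add(source_xml: str) -> str:
    """Map a Dublin Core record to a Solr <add><doc> update message."""
    source = ET.fromstring(source_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for element in source.iter():
        target = FIELD_MAP.get(element.tag)
        text = (element.text or "").strip()
        # Unmapped or empty elements are simply skipped.
        if target and text:
            ET.SubElement(doc, "field", name=target).text = text
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    record = (
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:identifier>rec1</dc:identifier>"
        "<dc:title>A Title</dc:title>"
        "<dc:creator>Someone</dc:creator>"
        "</oai_dc:dc>"
    )
    print(to_solr_add(record))
```

In an XSLT version, each entry in the mapping table would become a template or `xsl:for-each` that emits the corresponding `<field>` element.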

  14. Parting Thoughts • Understanding Record Drivers gives you a lot of control over VuFind. • VuFind should be able to index practically anything with a bit of effort. • Don’t be afraid to build your own tools!

  15. More Information • VuFind: http://vufind.org • Demian Katz: demian.katz@villanova.edu