
The Catalogue as Master file

APLA 2003. Lisa Goddard and Louise White.



  1. The Catalogue as Master file APLA 2003 Lisa Goddard and Louise White

  2. Catalogue and Website • Most libraries maintain both an on-line catalogue and a website • We find ourselves reproducing cataloguing information on the website to provide multiple points of access • Common duplication: bib records for databases and the database list

  3. Database List and Catalogue • At Memorial we maintained several lists of electronic indexes • “Database Lists” were maintained as static HTML pages, manually updated • Had two lists: one accessed from the homepage, comprehensive; one accessed from Unicorn, selected (catalogued) databases

  4. Static HTML lists – the Problems • Constant updates • Duplicating information under multiple subject headings • Inconsistent entry style • Catalogue and website maintained by different departments – communications challenge

  5. The Wish List • Wanted a single list – easily done but we wanted more • Wanted one list made easy • Wanted title and subject access • Wanted to reflect frequent recommendations • Wanted end product designed for novice user but expert friendly

  6. The Source • One authoritative entry point for data – the Catalogue • Did not have to invent entry point, it already existed • Could harvest data and create an easy custom interface for information retrieval

  7. Team Players • Formed an all-branch working group • Representatives from: • Cataloguing > MARC record expertise • Reference > controlled vocabulary for subjects and subject classification • Systems > data harvesting and display

  8. Consensus on a Schedule • Set a deadline of August 2002 (it was now April 2002) • Tight timeline kept us focused • Kept in close contact with Library Instruction coordinator who was designing a library research web tutorial

  9. The Final Product • Characteristics • All required elements are included in the MARC record • Title and subject access available • Subject Access tiered – Highly Recommended and Also Recommended • Live URLs for Databases and User Guides

  10. Elements in the Bib Record • Item Format: EIndex • Title: 245 • Local Note: 590 • Coverage: 856 |z • Summary: 520

  11. Elements in the Bib Record • Highly Recommended: 691 indicator 1 • Also Recommended: 691 indicator 2 • Database URL: 856 indicator 40 • User Guide URL: 856 indicator 42

  12. Subject Access • Decided on a master list of subjects • Controlled vocabulary: combination of Subjects and Disciplines • Some apparent duplication: English literature/Literature and Language • Restricted the number of databases which could be Highly Recommended, otherwise the notion would be diluted

  13. What’s in a name? • Point of much discussion • Recognized that “database” was a legitimate term for internal use • Needed a more descriptive term for external use • Wanted the word “article” in there as most likely to get the attention of users • Wanted to reinforce “index” in users’ vocabulary

  14. What’s in a name? • Settled on Article Indexes • Not unique • The majority of databases are purely indexes • Also lists full text databases, databases of statistics… • Thought about “Article Indexes Plus” but kept hearing “Plus what?” in our own heads

  15. Article Indexes and the Homepage

  16. The Catalogue as Master File Part II: Technical Overview – Creating the Interface APLA 2003 Louise White & Lisa Goddard Memorial University Libraries

  17. Technical Overview: Creating the Interface • Library Catalogue: Report runs daily to dump selected MARC records as text. This file is automatically FTP’d to the web server every day. • Perl Script: Removes extraneous data from MARC record. Reformats record for load into SQL database. • SQL Database: Report scheduled to reload table every day from Perl text file. Can be queried by web interface. • Web Server: Contains ASP interface which formats data into dynamic web pages.

  18. Retrieving the Bibliographic Data • MARC records are created, updated and managed in Unicorn. • Unicorn enforces cataloguing policies and standards to ensure record integrity. • Unicorn applies access rights for record editing and creation.

  19. Why not create the interface within the OPAC? • Database engine does not support SQL queries. • Unicorn uses proprietary, compiled code. • Complicated to create highly customized web interfaces.

  20. Getting the Eindex Records from the Catalogue • Uses the catalogue’s built-in reporting tools. • Report selects records with item format “eindex” and writes their MARC records to a text file. • Report runs daily so text file is always current. • File automatically FTP’d to web server every day. • Diagram: Catalogue > Daily Report > Eindex MARC Records > FTP > Web server

  21. The Catalogue Output • Text file created by Unicorn report contains well-formatted MARC records with tags and indicators:
      *** DOCUMENT BOUNDARY ***
      FORM=SERIAL
      .000. |aas a0c
      .001. |aocm38313827
      .003. |aOCoLC
      .005. |a20020605111040.0
      .006. |am e
      .007. |acr un---------
      .008. |a980204c19639999maumx p si 0 0eng d
      .035. |a(Sirsi) o38313827
      .040. |aBBH|cBBH|dOCL|dOCLCQ
      .050. 4|aZ7006|b.M64 INTERNET
      .130. 0 |aMLA international bibliography (Online : SilverPlatter International)
      .245. 00|aMLA international bibliography|h[electronic resource]
      .246. 3 |aModern Language Association international bibliography|b(Online : SilverPlatter International)
      .246. 1 |iTitle bar title :|aMLA bibliography|b(Online : SilverPlatter International)
      .260. |a[Norwood, Mass.] :|bSilverPlatter International,
      .310. |aUpdated 10 times a year
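A dump in this layout has to be split back into records before any field can be selected. The following is a minimal sketch in Python (the presenters used Perl; Python is used here only for illustration). It assumes that every record starts with the boundary line shown on the slide and that a line not beginning with a `.NNN.` tag continues the previous field. The names `split_records`, `BOUNDARY` and `TAG_LINE` are hypothetical.

```python
import re

BOUNDARY = "*** DOCUMENT BOUNDARY ***"
# A field line looks like ".245. 00|aTitle"; capture the tag and the rest.
TAG_LINE = re.compile(r"^\.(\d{3})\.\s?(.*)$")


def split_records(dump_text):
    """Return a list of records; each record is a list of (tag, rest) pairs."""
    records = []
    current = None
    for raw in dump_text.splitlines():
        line = raw.rstrip()
        if line.strip() == BOUNDARY:
            current = []              # start a new record
            records.append(current)
            continue
        if current is None or not line.strip():
            continue                  # skip anything before the first boundary
        match = TAG_LINE.match(line)
        if match:
            current.append((match.group(1), match.group(2)))
        elif current:
            # Assumption: an untagged line is a wrapped continuation.
            tag, rest = current[-1]
            current[-1] = (tag, rest + " " + line.strip())
    return records


sample = """*** DOCUMENT BOUNDARY ***
FORM=SERIAL
.001. |aocm38313827
.130. 0 |aMLA international bibliography (Online : SilverPlatter
International)
.245. 00|aMLA international bibliography|h[electronic resource]
"""

records = split_records(sample)
```

Header lines such as `FORM=SERIAL` are dropped in this sketch because they carry no tag and no field precedes them.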

  22. Reformatting the MARC Record • Extract only pertinent fields from the MARC record: • Title: .245. |a • Subject heading, highly recommended: .691. 1 |a • Subject heading, also recommended: .691. 2 |a • URL to resource: .856. 40 |u • Holdings: .856. 40 |a • Access note: .590. |a • Description: .520. |a • URL to user guide: .856. 42 |a • Remove all the MARC tags, subfield indicators and other clutter. • Save the results in a format that can be loaded into an SQL database.

  23. Reformatting with PERL • PERL (Practical Extraction and Report Language) • Perl is a language that is widely used for creating and manipulating text files. • Uses pattern matching to make decisions. • Perl is free, open source software. It can be downloaded from http://www.activestate.com/Products/ActivePerl/.

  24. PERL: Selecting the Fields
      foreach my $line (@lines) {
          if ($line =~ /^\.590\./) {    # choose line if it starts with .590.
              print $line;              # write it to a text file
          }
          if ($line =~ /^\.691\./) {    # choose line if it starts with .691.
              print $line;              # write it to a text file
          }
          if ($line =~ /^\.856\./) {    # choose line if it starts with .856.
              print $line;              # write it to a text file
          }
      }
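The same selection step can be written as a single prefix test. This is a sketch in Python rather than the presenters' Perl; `WANTED_TAGS` and `select_fields` are hypothetical names, and the tag list is taken from the fields named on the "Reformatting the MARC Record" slide.

```python
# Tags the deck says are extracted: title, description, access note,
# subject headings, and URLs/holdings.
WANTED_TAGS = ("245", "520", "590", "691", "856")


def select_fields(lines):
    """Keep only the lines whose MARC tag is in WANTED_TAGS."""
    prefixes = tuple(f".{tag}." for tag in WANTED_TAGS)
    # str.startswith accepts a tuple, so one call covers every tag.
    return [line for line in lines if line.startswith(prefixes)]


dump_lines = [
    ".001. |aocm38313827",
    ".245. 00|aMLA international bibliography|h[electronic resource]",
    ".310. |aUpdated 10 times a year",
    ".590. |aOnline access available to MUN users only.",
]
kept = select_fields(dump_lines)
```

Keeping the tags in one tuple means adding a field later is a one-line change instead of another `if` block.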

  25. PERL: Remove MARC Tags & Clutter • Example line: .691. 2|aBiology (“Also Recommended” for Biology)
      elsif ($record[$s] =~ /\.691\. 2/) {               # if line starts with ".691. 2"
          ($junk, $subjother) = split(/\|a/, $line);     # split the line on the "|a" indicator
          push(@others, $subjother);                     # keep everything after the "|a", discard the rest
          $oth++;
      }
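The split-on-`|a` idea can be generalized to any subfield code. This is a sketch in Python, not the presenters' code; `subfield` and `classify_subject` are hypothetical helpers, and the tier names "highly" and "also" are labels chosen here for the 691 indicators 1 and 2 described in the deck.

```python
def subfield(rest, code):
    """Return the text of the first subfield `code` in a field body, or None."""
    # Everything before the first "|" is indicator data, so skip chunk 0.
    for chunk in rest.split("|")[1:]:
        if chunk[:1] == code:
            return chunk[1:].strip()
    return None


def classify_subject(line):
    """Map a .691. line to (tier, heading); return None for any other line."""
    if not line.startswith(".691."):
        return None
    rest = line[len(".691."):].lstrip()
    heading = subfield(rest, "a")
    if rest.startswith("1"):
        return ("highly", heading)   # 691 indicator 1: Highly Recommended
    if rest.startswith("2"):
        return ("also", heading)     # 691 indicator 2: Also Recommended
    return None


result = classify_subject(".691. 2|aBiology")
```

Here `result` is `("also", "Biology")`, matching the slide's example line.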

  26. PERL: Data Cleanup • Fix the diacritics (for example, replace all occurrences of áe with è):
      while (<ORIG>) {
          $_ =~ s/áe/è/g;
          $_ =~ s/âe/é/g;
          $_ =~ s/ão/ô/g;
          $_ =~ s/ãe/ê/g;
          $_ =~ s/åa/a/g;
      }
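The same cleanup can be driven by a lookup table, which keeps the pairs in one place. The sketch below is in Python rather than Perl and uses only the five pairs listed on the slide; `REPLACEMENTS` and `fix_diacritics` are hypothetical names. The pairs look like MARC-8 combining diacritics (which are stored before their base letter) displayed as Latin-1 characters, but that reading is an assumption and not something the slide states.

```python
# Garbled two-character sequence -> intended character, as listed on the slide.
REPLACEMENTS = {
    "áe": "è",
    "âe": "é",
    "ão": "ô",
    "ãe": "ê",
    "åa": "a",
}


def fix_diacritics(text):
    """Replace each garbled sequence with its intended character."""
    for garbled, intended in REPLACEMENTS.items():
        text = text.replace(garbled, intended)
    return text


cleaned = fix_diacritics("Quâebec")
```

A table like this only fixes the combinations it lists; any other letter and diacritic pair in the catalogue would pass through unchanged.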

  27. PERL: The Final Output • The top line tells the database which fields to create in the table:
      title*descrip*access*hold*url*guide*urlguide*subjectrec*subjectoth
      • The rest of the file contains the data for each field (* is the delimiter):
      MLA international bibliography * This database contains references to literature, language, linguistics and folklore. It indexes journal articles, monographs, dissertations, working papers, proceedings, and bibliographies. It is updated 10 times per year.* Online access available to MUN users only.*1963- *http://204.187.104.2:8590/munf9?*User Guide available:*http://www.mun.ca/library/research_help/guides/MLA.html*English Folklore Language and Linguistics Literature Theatre*Aboriginal Studies
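A star-delimited file with a header row can be written and read back with a standard CSV library. This is a sketch in Python (not the presenters' Perl) using the field names from the slide; `write_load_file`, `read_load_file` and `SAMPLE_ROW` are hypothetical, and the sample description is shortened from the slide's example.

```python
import csv
import io

FIELDS = ["title", "descrip", "access", "hold", "url",
          "guide", "urlguide", "subjectrec", "subjectoth"]

SAMPLE_ROW = {
    "title": "MLA international bibliography",
    "descrip": "This database contains references to literature, "
               "language, linguistics and folklore.",
    "access": "Online access available to MUN users only.",
    "hold": "1963-",
    "url": "http://204.187.104.2:8590/munf9?",
    "guide": "User Guide available:",
    "urlguide": "http://www.mun.ca/library/research_help/guides/MLA.html",
    "subjectrec": "English Folklore Language and Linguistics Literature Theatre",
    "subjectoth": "Aboriginal Studies",
}


def write_load_file(rows, handle):
    """Write a header line plus one '*'-delimited line per record."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS,
                            delimiter="*", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def read_load_file(handle):
    """Read the file back into a list of dicts keyed by the header names."""
    return list(csv.DictReader(handle, delimiter="*"))


buffer = io.StringIO()
write_load_file([SAMPLE_ROW], buffer)
loaded = read_load_file(io.StringIO(buffer.getvalue()))
```

One caveat: if a field ever contains a literal `*`, the `csv` module wraps that field in double quotes, so whatever loads the file into the database would need to understand the same quoting.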

  28. Load File into SQL Compliant Database

  29. SQL: Structured Query Language • Now the data can be queried using SQL (Structured Query Language). • To get all the resources that are recommended for the subject Literature: SELECT * FROM ejournals WHERE subjectrec LIKE '%Literature%'; • To get all the titles that start with the letter A: SELECT * FROM ejournals WHERE title LIKE 'A%';

  30. The Web Interface: Dynamic HTML • Eindex search page must accept a search term from the user, and then create and display a custom set of results based on that query. • Dynamic HTML is written on the fly. The page does not exist until requested by the user. • There are several web programming languages that will produce dynamic HTML pages, including PHP, JSP, and ASP.

  31. Server Side Languages • ASP/PHP/JSP are server side languages, commonly used to display database results over the web. • Server side programs do all the processing on the server, in the background, and write out HTML files that are sent to your browser. • You do not need any special plug-ins to see these pages.
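Both queries can be tried against a throwaway database. The sketch below uses Python's built-in `sqlite3` module, which is an assumption made for illustration (the deck does not name the database product at this point). The table name `ejournals` comes from the slide; the subject assignments in the sample rows are made up for the demo.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory table shaped like the slide's ejournals table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ejournals (title TEXT, subjectrec TEXT, subjectoth TEXT)"
    )
    conn.executemany(
        "INSERT INTO ejournals (title, subjectrec, subjectoth) VALUES (?, ?, ?)",
        [
            # Illustrative rows; subject values are not from the catalogue.
            ("MLA international bibliography",
             "English Folklore Literature", "Aboriginal Studies"),
            ("Web of science citation databases", "Biology", "Literature"),
        ],
    )
    return conn


def recommended_for(conn, subject):
    """Titles whose recommended-subject field contains `subject`."""
    cur = conn.execute(
        "SELECT title FROM ejournals WHERE subjectrec LIKE ? ORDER BY title",
        (f"%{subject}%",),
    )
    return [row[0] for row in cur]


def titles_starting_with(conn, letter):
    """Titles that begin with `letter`."""
    cur = conn.execute(
        "SELECT title FROM ejournals WHERE title LIKE ? ORDER BY title",
        (f"{letter}%",),
    )
    return [row[0] for row in cur]


conn = open_demo_db()
```

The search term is passed as a bound parameter (`?`) instead of being pasted into the SQL string, so a term containing a quote character cannot change the query.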

  32. Structure of a Dynamic URL • http://www.library.mun.ca/eindex/DBSearchResults.asp?subhead=Biochemistry • www.library.mun.ca: server name and domain • /eindex/: directory on web server • DBSearchResults.asp: name of asp file • ?: indicates beginning of parameters • subhead: name of variable • Biochemistry: value of variable

  33. Web Interface Composed of 3 Files • Main Search Page (index.asp) • Option A: user chooses from title menu, sends variable “searchtext” > Alphabetical Search Results Page (AlphaSearchResults.asp) • Option B: user chooses from subject menu, sends variable “subhead” > Subject Search Results Page (DBSearchResults.asp)

  34. Main Search Page - Alphabetical Search • User selects “A”: <a href="alphaSearchResults.asp?SearchText=A">A</a> • <a href=: opening href tag • alphaSearchResults.asp: name of .asp file to render results page • SearchText: name of variable expected by .asp file • A: value of variable as selected by user

  35. Main Search Page - Subject Search • User selects “Biology”: <option value="./DBSearchResults.asp?subhead=Biology">Biology</option> • DBSearchResults.asp: name of .asp file to render results page • subhead: name of variable expected by .asp file • Biology: value of variable as selected by user
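The parts of the URL can be checked programmatically. This sketch uses Python's standard `urllib.parse` module on the URL from the slide; `describe_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Break a dynamic URL into the parts labelled on the slide."""
    parts = urlsplit(url)
    # Split the path into its directory and the final file name.
    directory, _, filename = parts.path.rpartition("/")
    # parse_qs returns a list per variable; keep the first value of each.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return {
        "server": parts.netloc,
        "directory": directory + "/",
        "file": filename,
        "params": params,
    }


info = describe_url(
    "http://www.library.mun.ca/eindex/DBSearchResults.asp?subhead=Biochemistry"
)
```

For this URL, `info["params"]` is `{"subhead": "Biochemistry"}`: the variable name and value the results page receives.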

  36. ASP: Subject Search Results • Receives variable “subhead” from index.asp: http://library.mun.ca/eindex/DBSearchResults.asp?subhead=Biology • Makes two SQL queries based on the value of variable “subhead”: • SELECT * FROM eindex WHERE subjectrec LIKE '%Biology%' returns all records that have Biology in the recommended subject field • SELECT * FROM eindex WHERE subjectoth LIKE '%Biology%' returns all records that have Biology in the other subject field
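The two-query pattern can be sketched as one function that returns both tiers. This is Python with `sqlite3`, not the presenters' ASP; `tiered_results` is a hypothetical name and the rows are invented placeholders. The example also shows a side effect of substring matching with `LIKE`: a search for Biology also matches a heading such as Microbiology.

```python
import sqlite3


def tiered_results(conn, subhead):
    """Run the two subject queries and return titles grouped by tier."""
    pattern = f"%{subhead}%"
    highly = [row[0] for row in conn.execute(
        "SELECT title FROM eindex WHERE subjectrec LIKE ? ORDER BY title",
        (pattern,),
    )]
    also = [row[0] for row in conn.execute(
        "SELECT title FROM eindex WHERE subjectoth LIKE ? ORDER BY title",
        (pattern,),
    )]
    return {"highly": highly, "also": also}


# Placeholder data: titles and subject assignments are invented for the demo.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE eindex (title TEXT, subjectrec TEXT, subjectoth TEXT)")
demo.executemany(
    "INSERT INTO eindex VALUES (?, ?, ?)",
    [
        ("Example index A", "Biology", "Chemistry"),
        ("Example index B", "Microbiology", "Biology"),
        ("Example index C", "History", "Folklore"),
    ],
)

results = tiered_results(demo, "Biology")
```

In this demo "Example index B" lands in the highly recommended tier only because Microbiology contains the search term. Whether that matters depends on the subject list; storing a delimiter around each heading would be one way to avoid it.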

  37. ASP: Subject Search Code • ASP writes out the variables with HTML tags around them:
      <%
      rs.MoveFirst
      Do While Not rs.EOF
      %>
      <tr><td colspan=2><b><a href="<%=rs("url")%>"><%=rs("title")%></a></b></td></tr>
      <tr><td><%=rs("access")%><br>Coverage:&nbsp;<%=rs("hold")%></td>
      <td valign="bottom"><a href="<%=rs("url2")%>">Database Guide</a></td></tr>
      <tr><td colspan=2><%=rs("descrip")%></td></tr>
      <tr><td colspan=2></td></tr>
      <tr><td colspan=2></td></tr>
      <%
      rs.MoveNext
      Loop
      %>

  38. ASP: Output • A line of ASP code that looks like this:
      <tr><td colspan=2><b><a href="<%=rs("url")%>"><%=rs("title")%></a></b></td></tr>
      • Produces a line of HTML that looks like this:
      <tr><td colspan=2><b><a href="http://isiknowledge.com/wos">Web of science citation databases </a></b></td></tr>
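The templating step amounts to looping over records and wrapping each field in markup. The sketch below shows the idea in Python rather than ASP. `render_record` and `render_results` are hypothetical names, the guide link is assumed to come from the `urlguide` field of the load file, and the HTML escaping is an addition that the slide's code does not perform.

```python
from html import escape


def render_record(rec):
    """Return the table rows for one database record."""
    return "\n".join([
        '<tr><td colspan="2"><b><a href="{}">{}</a></b></td></tr>'.format(
            escape(rec["url"], quote=True), escape(rec["title"])),
        '<tr><td>{}<br>Coverage:&nbsp;{}</td>'.format(
            escape(rec["access"]), escape(rec["hold"])),
        '<td valign="bottom"><a href="{}">Database Guide</a></td></tr>'.format(
            escape(rec["urlguide"], quote=True)),
        '<tr><td colspan="2">{}</td></tr>'.format(escape(rec["descrip"])),
    ])


def render_results(records):
    """Render every record in order, as the ASP loop does."""
    return "\n".join(render_record(rec) for rec in records)


# Title and URL are from the deck; the other values are placeholders.
page = render_results([{
    "title": "Web of science citation databases",
    "url": "http://isiknowledge.com/wos",
    "access": "Online access available to MUN users only.",
    "hold": "placeholder coverage",
    "urlguide": "http://www.mun.ca/library/research_help/guides/placeholder.html",
    "descrip": "Placeholder description with a < character.",
}])
```

Escaping matters because catalogue text can contain characters such as `<` or `&`, which would otherwise be read as markup by the browser.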

  39. HTML: Output to Web Browser

  40. Article Indexes End to End • Diagram: Library Catalogue > Daily Report > Bib. records text file FTP’d to web server > Perl script to reformat Bib. records > Auto-load into SQL Database > ASP queries SQL and creates HTML pages dynamically > Browser receives HTML

  41. The Future: Customizing Local Interfaces • Library catalogue with an SQL-compliant database eliminates need for data porting and reformatting. • Diagram: Library Catalogue w/ Oracle SQL Database > ASP queries SQL and creates result pages dynamically > Browser receives HTML

  42. The Future: Portals and Single Search • Search multiple resources with one single interface which provides linking directly to licensed full text content. • Record retrieval using Z39.50 and XML Gateways. • XML as standard language for describing & transforming content and metadata. e.g. XML tagged Dublin Core • OpenURL reference linking – builds HTTP searches based on citation information and links directly to licensed content. • Support for ILL and Circulation (NCIP) protocols.
