Filing and Word Breaking Procedures

Learn about the structure and procedures of filing and word breaking in Aleph, from the pre-14.x version to 14.1 onwards. Discover the ready-made components provided by Aleph and the tables that identify and define the procedures.

catherinei

Presentation Transcript


  1. Filing and Word Breaking Procedures

  2. Session Agenda • Pre-14.x • tab_word_breaking table • Structure • Procedures • Special remarks • tab_filing table • Structure • Procedures

  3. Pre-14.x • Various filing and word breaking procedures existed. Each procedure included many parts, but was a closed box. • Each procedure was assigned a code, such as B1, B5, C1, A3, AM, etc. • Each procedure was a separate program, requiring new program development to create new procedures. For example, there was no A3 + AM filing procedure.

  4. From 14.1 onwards • ALEPH provides ready-made components (programs) for creation of filing and word breaking procedures • /tab/tab_word_breaking - an ALEPH table which identifies word breaking procedures and defines their component parts • /tab/tab_filing - a table which identifies filing procedures and defines their component parts

  5. tab_word_breaking • /tab/tab_word_breaking is an ALEPH table which identifies word breaking procedures and defines their component parts. • Each word breaking procedure is made up of a group of one or more programs.

  6. tab_word_breaking • 1 2 3 4 • !!-!-!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!!!!!!!!!! • 03 L abbreviation • 03 L numbers • 03 L compress - • 03 L to_blank !@#$%^&*()_+={}[]:";'<>,.?/|\ • col.1: procedure identifier • col.2: alpha of the text • col.3: procedure name • col.4: procedure parameters
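The four-column layout above can be illustrated with a small parser. This is a minimal Python sketch, not ALEPH code: the names `ProcedureLine` and `parse_procedure_line` are hypothetical, and it assumes the columns are separated by runs of whitespace. The real table is fixed-width (see the `!` template line), but the exact column offsets cannot be recovered from the slide.

```python
from typing import NamedTuple


class ProcedureLine(NamedTuple):
    identifier: str  # col. 1: procedure identifier, e.g. "03"
    alpha: str       # col. 2: alpha of the text, e.g. "L"
    name: str        # col. 3: procedure name, e.g. "to_blank"
    parameters: str  # col. 4: procedure parameters (may be empty)


def parse_procedure_line(line: str) -> ProcedureLine:
    # Assumption: columns are whitespace-separated; everything after the
    # third column is treated as the parameter string.
    parts = line.split(None, 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 columns: {line!r}")
    parameters = parts[3].rstrip() if len(parts) == 4 else ""
    return ProcedureLine(parts[0], parts[1], parts[2], parameters)


rows = [
    parse_procedure_line(line)
    for line in (
        "03 L abbreviation",
        "03 L numbers",
        "03 L compress -",
        "03 L to_blank !@#$%^&*()_+={}[]:\";'<>,.?/|\\",
    )
]
```

Grouping the parsed rows by `identifier` would give the ordered list of programs that make up one word breaking procedure.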

  7. Procedures (1) • compress • Strips characters listed in col. 4 • delete_subfield • Changes sub-field sign (e.g., $$x) to blank • to_blank • Changes characters listed in col. 4 to blanks

  8. Procedures (2) • subf_to_sign • Changes second and subsequent sub-field signs to the single character listed in col. 4 • blank_to_carat • Changes blanks to a caret (^) • marc21_41 • For separating languages in MARC21 field 041

  9. Procedures (3) • abbreviation • Compresses a dot between single characters (e.g., I. B. M. changes to I B M; I.B.M. changes to IBM) • numbers • Compresses a comma and a dot between numbers (e.g., 2,153 changes to 2153)

  10. Procedures (4) • IMPORTANT NOTE • The procedures must be listed in logical order. For example, numbers must be listed before compress or to_blank if a comma or a dot is included in their character lists. • Otherwise, the comma or dot will already have been removed by the time the numbers procedure runs.
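The behavior described for compress, to_blank, abbreviation and numbers, and the reason their order matters, can be sketched in Python. These functions are illustrative approximations based only on the slide text; they are not the ALEPH programs, and the regular expressions are assumptions about what "between single characters" and "between numbers" mean.

```python
import re


def compress(text: str, chars: str) -> str:
    """Strip every character listed in chars."""
    return "".join(ch for ch in text if ch not in chars)


def to_blank(text: str, chars: str) -> str:
    """Replace every character listed in chars with a blank."""
    return "".join(" " if ch in chars else ch for ch in text)


def abbreviation(text: str) -> str:
    """Drop a dot that directly follows a single-letter word."""
    return re.sub(r"\b([^\W\d_])\.", r"\1", text)


def numbers(text: str) -> str:
    """Drop a comma or dot that sits between two digits."""
    return re.sub(r"(?<=\d)[.,](?=\d)", "", text)


# numbers first: the comma is still there to be recognized and removed.
right_order = to_blank(numbers("2,153 copies"), ",.")
# to_blank first: the comma has already become a blank, so numbers
# finds nothing to compress and the number stays split in two.
wrong_order = numbers(to_blank("2,153 copies", ",."))
```

With these definitions `right_order` is `"2153 copies"` while `wrong_order` is `"2 153 copies"`, which is the effect the note warns about.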

  11. Procedures (5) • Reminder • Word breaking procedures are used in tab11, section W. A line can be listed several times in tab11, in order to index it multiple times, with different word breaking each time. • For example, an apostrophe: • O’hara Ohara O hara • 11 W 100## abcdq 01 B WRD WAU • 11 W 100## abcdq 04 B WRD WAU
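The apostrophe example can be made concrete with a short Python sketch. It assumes three treatments of the same character (kept, compressed, changed to blank) and collects the words each one yields; the function name `word_variants` is hypothetical, and the sketch does not claim which tab11 procedure number performs which treatment.

```python
def word_variants(text: str, char: str = "'") -> set[str]:
    # Treatment 1: leave the character as it is.
    kept = text.split()
    # Treatment 2: compress (strip) the character.
    compressed = text.replace(char, "").split()
    # Treatment 3: change the character to a blank, which splits the word.
    blanked = text.replace(char, " ").split()
    return set(kept) | set(compressed) | set(blanked)


words = word_variants("O'hara")
```

Indexing the field once per treatment is what lets a search for any of the three spellings find the same record.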

  12. unicode_to_word_gen • Word indexing routines, as well as retrieval routines, use the table defined under instance WORD-FIX in ./alephe/unicode/tab_character_conversion_line. The table is traditionally called unicode_to_word_gen.

  13. unicode_to_word_gen • This table defines equivalencies for characters, for the purpose of creating words in the words file. • All characters naturally retain their unicode value, and are stored in the system in UTF encoding. In order to translate one character into another character (e.g. translating an accented "e" to "e"), you can set an equivalency. The equivalency can be up to 5 characters: • 00E6 0061 0065 #LATIN SMALL LETTER AE
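As an illustration of how such an equivalency line could be read and applied, here is a minimal Python sketch. It assumes the line format shown above (hexadecimal code points separated by blanks, followed by an optional `#` comment); the function names are hypothetical and this is not how ALEPH itself loads the table.

```python
def parse_equivalency(line: str) -> tuple[str, str]:
    # Drop the trailing "#" comment, then read hexadecimal code points.
    codes = line.split("#", 1)[0].split()
    source, *targets = (chr(int(code, 16)) for code in codes)
    if len(targets) > 5:
        raise ValueError("an equivalency can be up to 5 characters")
    return source, "".join(targets)


def apply_equivalencies(text: str, table: dict[str, str]) -> str:
    # Characters without an entry keep their natural value.
    return "".join(table.get(ch, ch) for ch in text)


table = dict([parse_equivalency("00E6 0061 0065 #LATIN SMALL LETTER AE")])
converted = apply_equivalencies("\u00e6ther", table)
```

Here U+00E6 is mapped to the two characters `a` and `e`, so `converted` is `"aether"`.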

  14. unicode_to_word_gen • The library's tab_word_breaking table can define different treatment for the same characters. In separate procedures, specific characters can be set to be compressed or to be changed to blank. Characters dealt with in this manner should be left in their natural value, and not translated in this table. • For example, you might want an apostrophe to be treated as a blank, as itself, and as if it were not there at all (e.g. o hara, o'hara, ohara). In order to be able to set the apostrophe in tab_word_breaking both as a character changed to blank and as a compressed character, it must retain its natural value, and NOT be translated in this table.

  15. Special Remarks • 2. When browsing a word index in the OPAC, special characters are always displayed in their converted state. • That is, if the unicode_to_word_gen table converts a u with umlaut to ue, the word will be displayed with ue, and not with an umlaut.

  16. tab_filing - Example • 01 L del_subfield • 01 L to_lower • 01 L abbreviation • 01 L suppress • 01 L compress ' • 01 L to_blank !@#$%^&*()_+- ={}[]:";<>?,./~` • 01 L mc_to_mac • 01 L pack_spaces • 01 L char_conv FILING-KEY-01 • 01 C chi

  17. tab_filing - Structure • 1 2 3 4 • !!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!> • 01 L compress ’ • 01 L char_conv FILING-KEY-01 • col.1: procedure identifier • col.2: alpha of the text • col.3: procedure name • col.4: procedure parameters

  18. tab_filing Procedures (1) • compress • Strips characters listed in col. 4 (e.g., ()[]:,) • delete_subfield • Changes subfield sign to blank (e.g., $$x) • to_blank • Changes characters listed in col. 4 to blanks

  19. tab_filing Procedures (2) • to_lower • Changes all characters to lower case • to_carat • Changes subfield sign to two caret (^^) signs in order to achieve hierarchical sorting of headings • suppress • Suppresses all text contained within <<…>>, as well as the signs themselves
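A rough Python sketch of to_lower and suppress, based only on the descriptions above. The regular expression is an assumption about how `<<…>>` spans are delimited (non-nested, shortest match); the functions are illustrative and not the ALEPH programs.

```python
import re


def to_lower(text: str) -> str:
    """Change all characters to lower case."""
    return text.lower()


def suppress(text: str) -> str:
    """Remove text enclosed in << >> together with the signs themselves."""
    return re.sub(r"<<.*?>>", "", text)


filing_text = to_lower(suppress("<<The>> Hobbit"))
```

The result is `" hobbit"`, with the blank that followed the suppressed article still in place.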

  20. tab_filing Procedures (3) • expand_num • For filing numbers numerically, pads numbers with leading zeros to a fixed length of 7 (e.g. 17 -> 0000017) • mc_to_mac • Changes initial “mc” to “mac” (for interfiling McKay and MacKay) • non_filing • Suppresses initial text according to non-filing indicator defined in tab11

  21. tab_filing Procedures (4) • compress_blank • Strips blanks (e.g. ISBN) • numbers • Compresses a comma and a dot between numbers (e.g., 2,153 changes to 2153) • non_numeric • Deletes all non-numeric characters (for ISBN, ISSN)

  22. tab_filing Procedures (5) • abbreviation • Compresses a dot between single characters (e.g., I. B. M. changes to I B M, I.B.M. changes to IBM) • build_filing_key_lc_call_no • Special procedure for correct sequencing of LC call numbers

  23. tab_filing Procedures (7) • char_conv • Translates one character into another (up to 5 characters), using the conversion listed in the matching line of tab_character_conversion_line in alephe/unicode • For example: • 01 L char_conv FILING-KEY-01 • refers to the line • FILING-KEY-01 ##### # line_utf2line_sb unicode_to_filing_01

  24. unicode_to_filing_nn_source • This table is used for character conversion for filing. The table must be processed using UTIL P/3 in order to create the unicode_to_filing_nn table. • This latter table is the one actually used by the system. It performs an additional translation in order to remove null characters.

  25. unicode_to_filing_01_source • Examples: • Latin capital letter AE: • 00C6 0041 0045 • Small letter sharp s: • 00DF 0053 005A

  26. IMPORTANT NOTE • The procedures must be listed in logical order. • For example: • numbers must be listed before compress or to_blank if a comma or a dot is included in their character lists. • Otherwise, the comma or dot will already have been removed by the time the numbers procedure runs.

  27. ./tab/tab_filing - usage • Filing procedures are used when building filing keys for headings (Z01), index entries (Z11) and sort keys (Z101)

  28. ./tab/tab_filing - usage • Note: if no procedure for creation of sort keys has been defined in tab01.lng, the system will use the default filing procedure 99. • Filing procedure 99 MUST be defined in tab_filing, because it provides the default sort order.
