INDEXES AND INDEXING
Ma. Theresa B. Villanueva
Head, Microforms and Digital Resource Center
Rizal Library, Ateneo de Manila University
April 15-16, 2013
James O’Brien Library, Ateneo de Naga University
DEFINITION OF TERMS
Index
• a systematic guide designed to indicate subjects, topics, or features of documents in order to facilitate their retrieval
• a tool that indicates to a user the information, or a source of the information, that the user needs
Indexing • the process of identifying and assigning index terms to a document, either to describe its physical characteristics, give facts about its creator or distribution, or describe its content
General Purposes of Indexes
• To construct representations of documents in a form that users can readily browse
• To maximize the searching success of the users
• To minimize the time and effort spent in finding information
Uses of Indexes
• facilitate reference to specific material or help locate wanted information
• serve as a filter to withhold irrelevant materials
• make the information storage and retrieval system useful to the individual
• disclose related information
• serve as a tool for current awareness services
Types of Indexes
By arrangement: Alphabetical, Classified, Concordance
By type of material indexed: Audiovisual, Book, Periodical/Newspaper
By physical form: Card index, Printed, Microform, Computerized
By Arrangement
a. Alphabetical Index - based on the order of the letters of the alphabet; used for the arrangement of subheadings and cross references as well as main headings
b. Classified Index – contents are arranged systematically by classes or subject headings
c. Concordance – an alphabetical index of all principal words appearing in a single text or in the multi-volume works of a single author, with a pointer to the precise point at which each word occurs
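As a rough illustration of the concordance described above, the following Python sketch records every position at which each principal word occurs in a short text. The function name `build_concordance`, the tiny stop-word list, and the use of word positions (rather than page or line numbers) as the pointer are assumptions made for this example only.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list: words NOT treated as "principal" words.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def build_concordance(text):
    """Map each principal word to the 1-based word positions where it occurs."""
    concordance = defaultdict(list)
    words = re.findall(r"[a-z']+", text.lower())
    for position, word in enumerate(words, start=1):
        if word not in STOP_WORDS:
            concordance[word].append(position)
    # Alphabetical arrangement, as in a printed concordance.
    return dict(sorted(concordance.items()))

sample = "The index is a guide to the document and the index is a tool"
for word, positions in build_concordance(sample).items():
    print(word, positions)
```

A real concordance would point to a page, line, or verse; the word position is used here only to keep the sketch short.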
By Physical Form
Card index – an index kept on 3” x 5” cards
Printed index – a tool, in printed form, for indexing or for researching and retrieving information
Microform index – an index to microforms such as microfiche and microfilm
Computerized index – an index constructed using computers
By Type of Material Indexed
a. Audiovisual Material Index
- textual labeling (index terms or descriptions) is needed along with image matching
- a search on words may retrieve a particular image related to the search term, which in turn can be used as input to find other related entries
b. Book index - a list of words or groups of words, arranged alphabetically at the back of the book, giving the page location of the subject or name associated with each word.
c. Periodical Index/Newspaper Index
- open-ended projects usually performed by a group of people
- consistency is a challenge, since each periodical issue may deal with unrelated topics by several authors, written in different styles and aimed at different users
Periodicals Indexes
Classified Index
Entry points are arranged in a hierarchy of related topics, starting with generic or broad topics and working down to the specific ones.
Examples:
- Index Medicus – classified index in the field of medicine and related disciplines
- Engineering Index – classified index in the field of engineering and related disciplines
Alphabetical Subject Index
An alphabetical subject index covers a number of different kinds of indexes. The arrangement is in alphabetical order and follows a familiar pattern.
Examples:
- Readers’ Guide to Periodical Literature (RGPL)
- Index to Philippine Periodicals (IPP)
Author Index
Entry points are names of persons, organizations, government agencies, institutions, etc.
Examples:
- Development Bank of the Philippines
- Philippine Chamber of Commerce and Industry
- Romulo, Carlos P.
INDEXING PRINCIPLES
Exhaustivity – refers to the extent to which a document is analyzed to identify its subject content
Specificity – refers to the extent to which a concept or topic in a document is identified by a precise term in the hierarchy of its genus-species relations
Consistency – refers to the extent to which agreement exists on the terms to be used to index the contents of documents
Principle of Exhaustivity
• Exhaustive indexing: use of various index terms to fully cover the major and minor themes of a document
• Selective indexing: use of a few terms to cover only the main or major theme of a document
• Exhaustivity results in high recall but low precision.
Principle of Specificity
Example:
Genus: Citrus Fruits
Species: ORANGES, LEMONS, LIMES, GRAPEFRUITS
• Specificity results in high precision but low recall
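The slides use recall and precision without defining them. Under the standard definitions (recall is the share of relevant documents that are retrieved; precision is the share of retrieved documents that are relevant), the trade-off between the two principles above can be sketched in Python. The document numbers and the two result sets are invented for illustration and do not come from the slides.

```python
def recall(retrieved, relevant):
    """Share of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Share of the retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved)

# Hypothetical document numbers, invented for illustration.
relevant = {1, 2, 3, 4}                     # documents the user actually needs
exhaustive_hits = {1, 2, 3, 4, 5, 6, 7, 8}  # search against exhaustively indexed documents
specific_hits = {1, 2}                      # search against very specifically indexed documents

print(recall(exhaustive_hits, relevant), precision(exhaustive_hits, relevant))  # 1.0 0.5
print(recall(specific_hits, relevant), precision(specific_hits, relevant))      # 0.5 1.0
```

In this made-up case the exhaustive index finds everything relevant but half of what it returns is noise, while the specific index returns only relevant items but misses half of them.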
Principle of Consistency
There are two types of consistency:
Intra-indexer consistency refers to the extent to which one indexer is consistent with himself/herself in assigning subject terms.
Inter-indexer consistency refers to the agreement between or among indexers in assigning subject terms to a particular article.
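The slides do not say how consistency is measured. One simple possibility, assumed here for illustration, is to divide the number of terms two indexers share by the number of distinct terms either of them used. The function name `indexer_consistency` and the sample term lists are hypothetical.

```python
def indexer_consistency(terms_a, terms_b):
    """Shared terms divided by all distinct terms used by either indexer."""
    a = {t.lower() for t in terms_a}
    b = {t.lower() for t in terms_b}
    if not a and not b:
        return 1.0  # two empty term sets agree trivially
    return len(a & b) / len(a | b)

# Hypothetical term assignments for the same article.
indexer_1 = ["Hybrid rice", "Agriculture", "Philippines"]
indexer_2 = ["Hybrid rice", "Food supply", "Philippines"]
print(indexer_consistency(indexer_1, indexer_2))  # 0.5
```

The same function can be applied to intra-indexer consistency by comparing the terms one indexer assigned to the same article on two different occasions.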
Indexing Methods
1. Derived or derivative indexing – a method by which words and phrases occurring in the title or text of a documentary unit are extracted by a human or computer to serve as indexing terms.
- also called extractive indexing.
2. Assigned indexing - a method by which terms, descriptors or subject headings are selected by a human or computer to represent the topics or features of a documentary unit
- assigned terms are often taken from a source other than the document itself.
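The difference between the two methods can be sketched in Python. Derived indexing keeps words found in the title itself, while assigned indexing replaces them with descriptors taken from an outside list. The stop-word list, the small controlled vocabulary, and both function names are hypothetical and serve only to make the contrast concrete.

```python
import re

# Hypothetical stop-word list and controlled vocabulary, for illustration only.
STOP_WORDS = {"a", "an", "and", "new", "of", "the", "towards"}
CONTROLLED_VOCABULARY = {
    "hybrid": "PLANT BREEDING",   # descriptor that does not appear in the title
    "rice": "RICE",
    "philippines": "PHILIPPINES",
}

def derived_terms(title):
    """Derived indexing: extract terms from the words of the title itself."""
    words = re.findall(r"[a-z]+", title.lower())
    return [w for w in words if w not in STOP_WORDS]

def assigned_terms(title):
    """Assigned indexing: map title words to descriptors from the vocabulary."""
    descriptors = []
    for word in derived_terms(title):
        descriptor = CONTROLLED_VOCABULARY.get(word)
        if descriptor and descriptor not in descriptors:
            descriptors.append(descriptor)
    return descriptors

title = "Hybrid rice: a new hope towards a bountiful Philippines"
print(derived_terms(title))   # ['hybrid', 'rice', 'hope', 'bountiful', 'philippines']
print(assigned_terms(title))  # ['PLANT BREEDING', 'RICE', 'PHILIPPINES']
```

A human indexer doing assigned indexing would read the whole document rather than look up title words; the lookup here is a simplification.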
Indexing Language
An indexing language is a language used by the indexer to represent the subject content of a document.
Purposes and Uses of Indexing Language:
• to represent the subject content of a document, either using the words of the author or assigning appropriate descriptors from a controlled vocabulary
• to help users discriminate between terms and reduce ambiguity in the language
Types of Indexing Language
1. Natural Language - uses index terms/words occurring in the printed text as index entries; it is sometimes called a derived-term system
Characteristics of using Natural Language:
• Improves recall because it provides more access points, but reduces precision
• Redundancy is greater
• Uses more current terms
• Tends to be favored by end-users
2. Controlled vocabulary - represents the general conceptual structure of one or more subject areas and presents a guide to the users of the index
- categorized as an assigned-term system
Controlled Vocabulary
Use: To show the three relationships of terms:
• equivalence
• hierarchical
• associative
This is achieved by providing cross references in the form of:
broader term (BT), narrower term (NT), related term (RT), use for (UF), see also (SA)
Relationships of Terms: a. Equivalence relationship - implies that there will be more than one term denoting the same concept
Equivalence relationship: Example 1
Use for (UF) or Use reference (see reference)
- refers from a non-usable term to a preferred descriptor
Example:
EMPLOYEES
UF: Personnel, Staff, Workers
Equivalence relationship: Example 2
BIRTH CONTROL
UF: Family Planning
- the reference deals primarily with synonymous or variant forms of the preferred descriptor
- it is also used to lead the indexer to more general terms
Examples that indicate Equivalence relationship:
• Synonyms (e.g. Reason; Cause)
• Quasi-synonyms (e.g. Law; Law Management)
• Preferred spelling (e.g. Catalog; Catalogue)
• Acronyms and abbreviations (e.g. ASEAN; Association of Southeast Asian Nations)
• Current and established terms (e.g. Cellular Radio; Cellular Phone)
• Translation (e.g. Coconut Coir; Bunot)
b. Hierarchical relationship – refers to the general and specific or broad and narrow type of relationship
Hierarchical relationship: Example 1
Broader term (BT)
Employees
BT: People
- shows the hierarchical relationship upward in the classification ranking
- it differs from the use for reference in that both the basic term and its broader term are descriptors and both can be used
Hierarchical relationship: Example 2
Cats
BT: ANIMALS
“ANIMALS” is a broader term to “CATS” because all cats are animals.
Reference: http://publish.uwo.ca/~craven/677/thesaur/main05.htm
Hierarchical relationship: Example 3
Narrower term (NT)
Employees
NT: HOTEL EMPLOYEES, RAILROAD EMPLOYEES
- the reference is similar to the broader term reference, except that it goes down in the classification ranking
Hierarchical relationship: Example 4
Head
NT: NOSE
“NOSE” might be a narrower term to “HEAD”, because noses are normally parts of heads.
Reference: http://publish.uwo.ca/~craven/677/thesaur/main05.htm
• Genus-species relationship (represents class inclusion)
Example: Animals – Domestic Animals – Cats
• Whole-part relationship
Example: Hand – Fingers
• Instance relationship
Example: Mountains – Mount Apo
c. Associative relationship - refers to a non-hierarchical relationship of terms
Associative relationship: Example 1
Related term (RT)
EMPLOYEE
RT: EMPLOYMENT
- the reference refers to a descriptor that can be used in addition to the basic term but is not in a hierarchical relationship with it
Associative relationship
Other Examples:
• Teachers – Students
• Tables – Chairs
• Education – Teaching
• Men – Women
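The three kinds of relationship can be held together in one small data structure. The Python sketch below stores the EMPLOYEES examples from the slides under their UF, BT, NT, and RT labels and follows a UF reference from a non-preferred term to its descriptor. The dictionary layout and the function name `preferred_term` are assumptions made for illustration, not a standard thesaurus format.

```python
# Thesaurus entry built from the EMPLOYEES examples in the slides.
THESAURUS = {
    "EMPLOYEES": {
        "UF": ["Personnel", "Staff", "Workers"],          # equivalence
        "BT": ["PEOPLE"],                                  # hierarchical, upward
        "NT": ["HOTEL EMPLOYEES", "RAILROAD EMPLOYEES"],  # hierarchical, downward
        "RT": ["EMPLOYMENT"],                              # associative
    },
}

def preferred_term(term):
    """Return the descriptor to use for a term, following UF references."""
    wanted = term.lower()
    for descriptor, entry in THESAURUS.items():
        if wanted == descriptor.lower():
            return descriptor
        if wanted in (t.lower() for t in entry.get("UF", [])):
            return descriptor
    return None  # term is not in this small sample thesaurus

print(preferred_term("Staff"))      # EMPLOYEES
print(preferred_term("employees"))  # EMPLOYEES
print(THESAURUS["EMPLOYEES"]["NT"])
```

An indexer or searcher who starts from "Staff" is led to EMPLOYEES, and from there can move up (BT), down (NT), or sideways (RT).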
Scope Note (SN) & Qualifier
- used to inform users about a descriptor’s usage restrictions or to clarify ambiguity; a scope note may also give additional instructions to indexers
Scope Note:
Examples:
INDEXING
(SN) Assigning of natural language terms to documents
HOSPITALIZATION
(SN) Assign also terms for the conditions for which patients were hospitalized, if applicable
Qualifier:
Example:
Security (Law)
Security (Psychology)
Reference: http://publish.uwo.ca/~craven/677/thesaur/main08.htm
Functions of Controlled Vocabulary:
• To control synonyms by choosing one form as the standard term
• To make distinctions among homographs
• To link or bring together those terms whose meanings are closely related
Example: Cereals and Wheat
• To control variant spellings
A controlled vocabulary may take the form of verbal expressions, as illustrated by Subject Headings Lists and Thesauri, or of coded/nonverbal expressions, as shown by Classification schemes.
Subject headings lists – lists of terms representing several subject fields; some focus on specific fields
Thesauri – other authority devices that cover more specific or narrower subject fields
Classification schemes – generally contain coded expressions or notations for the relevant topics in a particular class or subclass
INDEXING GUIDELINES & PROCEDURES Part 2
INDEXING PROCESS:
1. Recording of bibliographic data
- recording of the important information or the elements that identify a particular document
The International Organization for Standardization (ISO) has set a standard for bibliographic references: ISO 690:1975 (E), “Bibliographic References Essential and Supplementary Elements”
- When indexing the contents of a collection of documents, locators should give complete information about each document.
- For an article or contribution in a periodical, each entry normally consists of the following essential elements:
• Name(s) of Author(s) with forenames
• Title of the article
• Title of the periodical or Source
• Volume Number
• Issue Number
• Date of the issue
• Page number
Example:
Name(s) of Author(s): [Xian, Jie]
Title of the article: [Hybrid rice: a new hope towards a bountiful Philippines]
Title of the periodical or Source: [Impact]
Volume Number:
Issue Number:
Date of the issue: [September 2007]
Page number: [4-8]
ISO FORMAT:
Sample entry:
________________ (subject/topic)
Xian, Jie. Hybrid rice: a new hope towards a bountiful Philippines. Impact, Vol. 46, no. 9, S ‘12, p. 4-8.
Format comparison:
ISO FORMAT:
________________ (subject/topic)
Xian, Jie. Hybrid rice: a new hope towards a bountiful Philippines. Impact, Vol. 46, no. 9, S ‘12, p. 4-8.
ATENEO FORMAT:
_______________ (subject/topic)
Hybrid rice: a new hope towards a bountiful Philippines. Xian, Jie. Impact 46 (9) : 4-8. S ‘12.
OTHER FORMAT:
_______________ (subject/topic)
Xian, Jie. Hybrid rice: a new hope towards a bountiful Philippines. Impact 46 (9) : 4-8. S ‘12
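The three layouts compared above differ only in the order and punctuation of the same bibliographic elements, which a short Python sketch can make explicit. The `Article` class and the three function names are hypothetical; they simply mirror the sample entries on the slide and are not an implementation of ISO 690 or of any library's house rules.

```python
from dataclasses import dataclass

@dataclass
class Article:
    author: str
    title: str
    periodical: str
    volume: int
    issue: int
    date: str
    pages: str

def iso_format(a):
    """Author first, with labeled volume, issue, and pages."""
    return (f"{a.author}. {a.title}. {a.periodical}, "
            f"Vol. {a.volume}, no. {a.issue}, {a.date}, p. {a.pages}.")

def ateneo_format(a):
    """Title first, then author, with compact volume (issue) : pages."""
    return (f"{a.title}. {a.author}. {a.periodical} "
            f"{a.volume} ({a.issue}) : {a.pages}. {a.date}.")

def other_format(a):
    """Author first, with compact volume (issue) : pages."""
    return (f"{a.author}. {a.title}. {a.periodical} "
            f"{a.volume} ({a.issue}) : {a.pages}. {a.date}")

# Values taken from the sample entry on the slide.
article = Article("Xian, Jie",
                  "Hybrid rice: a new hope towards a bountiful Philippines",
                  "Impact", 46, 9, "S '12", "4-8")
for formatter in (iso_format, ateneo_format, other_format):
    print(formatter(article))
```

Keeping the elements in one record and formatting them on output makes it easier to apply a single style consistently across an index.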
2. Subject determination – the “aboutness” of the material and the formulation of a concept list
• Choose the most appropriate concepts; consider the users and the purpose of the index
• No arbitrary limit should be set on the number of terms or descriptors that can be assigned to a document.
- it should be determined fully by the amount of information contained in the document
- it should be related to the expected needs of the users of the index
• Modify the indexing guidelines and procedures if needed, but modification should not compromise the structure or logic of the indexing language.
• Concepts should be as specific as possible. More general concepts may be preferred in some circumstances, depending upon the following factors:
- over-specificity might adversely affect the performance of the indexing system
- if an idea is not fully developed, or is referred to only casually by the author, indexing at a more general level might be justified
3. Content/Conceptual analysis – identifying the topics discussed in a document and determining which aspects of it users will be interested in
Content Analysis
- Decide which topics in the item are relevant to the potential users of the document.
- Decide which topics truly capture the content of the document.
- Determine terms that come as close as possible to the terminology used in the document.
- Decide on index terms and the specificity of those terms.