  1. Today • Avoiding plagiarism and using style guides with guest Cynthia Crosser • Lecture: subject headings vs. keyword searching • Group activity: translating topics into LC subject headings • Assign homework & readings for next week

  2. Controlled Vocabularies Vs. Keyword Searching

  3. Controlled Vocabularies • Serve the purpose of collocating similar works, regardless of the words or language used in the actual works • Composed of “Uniform Headings” • Library of Congress Subject Headings (LCSH) are one example • Other systems/databases may call them “index terms,” “thesaurus terms,” “controlled terms,” or “subjects”

  4. Uniform Headings • Address the problem of synonyms, variant phrases, and different-language terms being used to express the same concept • Also remove ambiguity caused by variant meanings of a word or concept, which is particularly important for authors’ names (disambiguation) • Include cross-references that guide the user to the appropriate (“uniform”) heading used by the system: • BT = Broader Term • NT = Narrower Term • RT = Related Term • UF = Used For • USE x = use the indicated term x instead
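The cross-reference codes above can be pictured as a small lookup structure. The following is a minimal sketch, assuming a hypothetical miniature thesaurus; the entries and the function names `build_use_index` and `resolve` are illustrative and are not copied from LCSH or from any real catalog system.

```python
# Hypothetical miniature thesaurus. Each uniform heading lists its
# UF (Used For), BT (Broader Term), NT (Narrower Term), and RT (Related Term)
# cross-references. Entries are illustrative only.
THESAURUS = {
    "Automobiles": {
        "UF": ["Cars", "Motorcars"],
        "BT": ["Motor vehicles"],
        "NT": ["Sports cars"],
        "RT": ["Automobile industry"],
    },
    "Motor vehicles": {"UF": [], "BT": [], "NT": ["Automobiles"], "RT": []},
}


def build_use_index(thesaurus):
    """Map every heading and every UF variant to its uniform heading."""
    index = {}
    for heading, refs in thesaurus.items():
        index[heading.lower()] = heading
        for variant in refs.get("UF", []):
            # A "USE" reference is the inverse of a "UF" reference.
            index[variant.lower()] = heading
    return index


def resolve(term, index):
    """Return the uniform heading for a term, or None if it is unknown."""
    return index.get(term.strip().lower())


USE_INDEX = build_use_index(THESAURUS)
print(resolve("cars", USE_INDEX))
print(THESAURUS["Automobiles"]["BT"])
```

A searcher who types a variant term is redirected to the one heading under which the works are actually filed, which is how the synonym problem is handled.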

  5. Uniform Headings

  6. Collocation • The act of arranging similar items near each other, either physically (e.g. on library shelves) or virtually (e.g. in a list of search results) • In a Uniform Headings system, works are collocated by subject matter and/or by author • Electronic systems allow us to use both (see next slide)

  7. Collocation of Personal Names • Library of Congress uses the LC “Authority File” to unify and collocate personal names • “Poe, Edgar Allan, 1809-1849” for Edgar Allan Poe • “Twain, Mark, 1835-1910” for Samuel Clemens • “McCarthy, Cormac, 1933-” for Cormac McCarthy • Searching for Subject “McCarthy, Cormac, 1933-” will find works that are ABOUT Cormac McCarthy • Searching for Author “McCarthy, Cormac, 1933-” will find works that are WRITTEN BY Cormac McCarthy
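The difference between an author search and a subject search on the same authorized name can be sketched as follows. This is a toy model, not the LC Authority File itself: the dictionary `AUTHORITY_FILE`, the record layout in `CATALOG`, the biography record, and the functions `authorized` and `search` are all assumptions made for illustration.

```python
# Toy authority file: variant names map to one authorized heading.
AUTHORITY_FILE = {
    "mark twain": "Twain, Mark, 1835-1910",
    "samuel clemens": "Twain, Mark, 1835-1910",
}

# Toy catalog records. The second record is hypothetical.
CATALOG = [
    {
        "title": "Adventures of Huckleberry Finn",
        "author": "Twain, Mark, 1835-1910",
        "subjects": [],
    },
    {
        "title": "Hypothetical biography of Twain",
        "author": "Doe, Jane",
        "subjects": ["Twain, Mark, 1835-1910"],
    },
]


def authorized(name):
    """Look up the authorized heading for a name as typed by the user."""
    return AUTHORITY_FILE.get(name.strip().lower())


def search(field, name):
    """Return titles whose author or subject field holds the authorized heading."""
    heading = authorized(name)
    if heading is None:
        return []
    if field == "author":
        return [r["title"] for r in CATALOG if r["author"] == heading]
    if field == "subject":
        return [r["title"] for r in CATALOG if heading in r["subjects"]]
    raise ValueError("field must be 'author' or 'subject'")


print(search("author", "Samuel Clemens"))  # works BY the person
print(search("subject", "Mark Twain"))     # works ABOUT the person
```

Both variant names lead to the same heading, and the field chosen decides whether the results are works by or works about the person.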

  8. Categorization (Classification) • The act of applying a uniform heading (or other classification code) to a work • Done by a human being, not a computer • Who usually has some expertise in the subject matter (but not always!) • Categorization is limited by resources and prone to human error • Too many new works and too few staff, who must also meet quotas, lead to: • Mis-classifying works • Minimal cataloging • Copy cataloging

  9. Categorization (Classification) • Cataloger/Indexer: employs “Scope-Match Specificity” • 20% rule • Scope of the whole work, not individual parts or chapters • User/Searcher: employs “Specific Entry” • Works listed under the narrower heading are NOT listed under the more general heading! • Start with narrow concept and broaden as needed • As a rule, look in the direction of cross-references leading to most specific headings, and stop only at the level of specificity that provides the tightest fit for your topic
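The searcher's side of “Specific Entry” (start narrow, broaden only as needed) can be sketched as a walk up the Broader Term chain. The headings, titles, and the function `search_specific_first` below are hypothetical; a real catalog may give a heading several broader terms, while this sketch assumes one at most.

```python
# Hypothetical BT (Broader Term) links: narrower heading -> broader heading.
BROADER = {
    "Sports cars": "Automobiles",
    "Automobiles": "Motor vehicles",
}

# Hypothetical catalog: works are filed ONLY under their most specific heading.
WORKS_BY_HEADING = {
    "Automobiles": ["Hypothetical title A"],
    "Motor vehicles": ["Hypothetical title B"],
}


def search_specific_first(heading, broader, works_by_heading):
    """Try the narrowest heading first; broaden one step at a time if empty."""
    seen = set()
    current = heading
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the links
        works = works_by_heading.get(current, [])
        if works:
            return current, works
        current = broader.get(current)
    return None, []


print(search_specific_first("Sports cars", BROADER, WORKS_BY_HEADING))
```

Because works under the narrower heading are not repeated under the general one, stopping at the first heading with results returns only the tightest available fit.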

  10. Library of Congress Subject Headings • Employs pre-coordinated subject strings (one of the few controlled vocabularies that does this) • Allows for precise browsing and recognition of related subjects • Example: Yugoslavia--Antiquities • Yugoslavia--Antiquities--Bibliography • Yugoslavia--Antiquities--Maps • … • Yugoslavia--Yearbooks
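Browsing pre-coordinated strings amounts to sorting and filtering on the leading elements of each string. A minimal sketch, assuming the headings are written with `--` as the separator and using only the example strings from the slide; the function name `browse` is hypothetical.

```python
# Example strings from the slide, deliberately out of order.
HEADINGS = [
    "Yugoslavia--Yearbooks",
    "Yugoslavia--Antiquities--Maps",
    "Yugoslavia--Antiquities",
    "Yugoslavia--Antiquities--Bibliography",
]


def browse(headings, prefix):
    """List headings that start with the given elements, in browse order."""
    wanted = prefix.split("--")
    matches = [h for h in headings if h.split("--")[: len(wanted)] == wanted]
    # Sorting element by element keeps a heading directly above its subdivisions.
    return sorted(matches, key=lambda h: h.split("--"))


for heading in browse(HEADINGS, "Yugoslavia--Antiquities"):
    print(heading)
```

The searcher does not need to guess “Bibliography” or “Maps” in advance: the subdivisions appear in the browse list and can be recognized on sight.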

  11. Library of Congress Subject Headings • Employs Free-floating Subdivisions • May be appended to any existing uniform heading without need to create a new uniform heading in the system • “Power” Examples: --Bibliography --Case studies --Criticism & Interpretation --Dictionaries --Economic aspects --Encyclopedias --Handbooks, Manuals, etc. --Health aspects --History --Influence --Maps --Public opinion --Psychological aspects --Quotations --Social aspects --Statistics

  12. Collocation vs. Relevance Ranking • Relevance Ranking is dependent on the keywords supplied by the user • User can’t always provide every synonym, alternate spelling, variant phrase, foreign language term, etc. and can easily miss important works • Will return works where target keywords are present, regardless of their context • Does not allow you to recognize relevant works whose keywords you cannot think of beforehand

  13. Collocation vs. Relevance Ranking • Results are sorted by how often the keywords appear in each work (the most basic relevancy ranking) or by some other arbitrary ranking algorithm • Does not distinguish between works that are ABOUT a person and works that are BY a person • A search for “Cormac McCarthy” will have both in the search results • Most relevance-ranked search tools (including Google) do not allow for Boolean combinations, truncation, proximity, or field searching (we’ll cover these later)
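The “most basic relevancy ranking” described above, sorting by how often the keywords appear, can be sketched in a few lines. The document texts and the functions `tokenize` and `rank` are made up for illustration; real search engines use far more elaborate scoring.

```python
import re

# Hypothetical document texts keyed by an id.
DOCS = {
    "a": "Cars and more cars: a history of cars",
    "b": "The automobile in American life",
    "c": "Used cars buying guide",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Rank documents by the total number of times the query words occur."""
    terms = tokenize(query)
    scored = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        score = sum(tokens.count(term) for term in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest count first; ties are broken by id so the order is stable.
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


print(rank("cars", DOCS))
```

Document "b" is on the same topic but is never retrieved, because the searcher did not think of the word “automobile”. A uniform heading applied to all three works would have collocated them.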

  14. Collocation vs. Relevance Ranking • Each system has costs and benefits • Neither offers perfect retrieval (high precision AND high recall) • “Smart” systems use both collocation by uniform headings and relevancy ranking • A keyword search may rank works higher when the keywords appear in their uniform headings • Field searching across multiple fields, one of which is designated as the subject field
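Precision and recall, mentioned above, have standard definitions: precision is the share of retrieved works that are relevant, and recall is the share of relevant works that were retrieved. A small worked sketch with made-up work ids; the function name `precision_recall` is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant works retrieved / all works retrieved
    recall    = relevant works retrieved / all relevant works
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 works retrieved, 3 works actually relevant, 2 in common.
precision, recall = precision_recall(["w1", "w2", "w3", "w4"], ["w1", "w2", "w5"])
print(round(precision, 2), round(recall, 2))
```

Here half of what was retrieved is relevant (precision 0.5) and two of the three relevant works were found (recall about 0.67), so the search is imperfect on both measures.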

  15. Finding LC Subject Headings • Find LC Subject Headings that correspond to topics in your concept maps (from homework #3) • Use the four methods described in text, pp. 30-40 • Group 1 use method 1, group 2 use method 2, and so on. • Find as many as possible in the time allowed • Report the most interesting findings to the rest of the class