
Revisiting PRECIS: The Preserved Context Index System

Revisiting PRECIS: The Preserved Context Index System. Barbara H. Kwasnik School of Information Studies Syracuse University bkwasnik@syr.edu November 13, 2004 ASIST SIG/CR Workshop. Representation and Meaning.



  1. Revisiting PRECIS: The Preserved Context Index System Barbara H. Kwasnik School of Information Studies Syracuse University bkwasnik@syr.edu November 13, 2004 ASIST SIG/CR Workshop

  2. Representation and Meaning • An indexer analyzes a text and strives to ascertain meaning. Ideally this analysis anticipates a searcher at some future time, looking for text with the same meaning. • But, meaning is not fixed at either end of this process. • And even if the meaning is relatively unambiguous or stable, the terms used to represent it are not.

  3. The Dilemma • Thus, most indexing processes encounter a dilemma at two levels: • Interpreting meaning as intended by the author and as construed by the potential user; • Choosing the terms to represent that meaning and that will enable this communication to be clear and as true as it can be. (Bearing in mind that such fidelity is a relative thing to begin with)

  4. Interacting Layers of Meaning • Meaning is ascertained through several layers. These layers interact and inform each other: • The lexical and morphemic level – words and their forms • The semantic level – the meaning of the words • The syntactic level – the relationship of the words to each other, known as grammar • The discourse level – words interpreted in the context of text that is greater than the single sentence, and • The pragmatic level – words embedded in world knowledge, that is, the way they are used

  5. Meaning in Texts • The meanings created through texts are often complex – not readily reducible to a single concept. Representing them in too simple a way reduces the richness and fidelity of the representation. • But representing complexity is very difficult, especially if we want to build in some stability through standardization.

  6. What to Do? • Because words can be ambiguous, can have multiple senses and can change those senses over time, humans employ a range of strategies to work around this problem. • One of the most useful and “natural” is the inclusion of context to disambiguate potential

  7. The Role of Context • Indexers have employed many strategies to enhance the richness of representation. One of these techniques is to add contextual cues which may • help disambiguate the term’s possible multiple senses, and • reveal how the term is being used, that is, its role in the text.

  8. Back-of-the-Book Indexes
     • B.o.b.’s are replete with context. In fact, a good index can be “read” and will give a fairly good indication of the content and scope of the text.

         Librarians
           education of
           job satisfaction of
           poor pay for

     • The retention of natural order and prepositions helps make the meaning of individual terms clear (although not always).
     • But these indexes are usually unique to the text to which they point and are quite difficult to maintain on a large scale.

  9. Traditional Thesauri
     • A collection of subject terms structured as a hierarchy, with equivalence and associative relationships also noted.

         Community-college librarians
           UF Junior-college librarians
           BT Academic librarians
           RT University librarians

     • These types of structures offer a semantic context.
     • But, typically only one aspect of meaning is revealed at a time, and the representations only account for nouns.
     • Associations among terms can only imply syntactic relationships. E.g., “pasteurization” and “milk.”
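The thesaurus record above can be sketched as a small data structure. This is only an illustration: the class name `ThesaurusTerm`, its field names, and the helper `build_entry_index` are hypothetical and do not come from any particular thesaurus standard or software. It shows how the equivalence relationship (UF) lets a searcher's entry term lead to the preferred term.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThesaurusTerm:
    """One preferred term with its relationships (hypothetical layout)."""
    preferred: str
    used_for: List[str] = field(default_factory=list)  # UF: equivalent entry terms
    broader: List[str] = field(default_factory=list)   # BT: hierarchical parents
    related: List[str] = field(default_factory=list)   # RT: associative links


def build_entry_index(terms: List[ThesaurusTerm]) -> Dict[str, str]:
    """Map every preferred term and every UF variant to its preferred term."""
    index: Dict[str, str] = {}
    for term in terms:
        index[term.preferred.lower()] = term.preferred
        for variant in term.used_for:
            index[variant.lower()] = term.preferred
    return index


record = ThesaurusTerm(
    preferred="Community-college librarians",
    used_for=["Junior-college librarians"],
    broader=["Academic librarians"],
    related=["University librarians"],
)
index = build_entry_index([record])
```

Looking up `index["junior-college librarians"]` yields the preferred term. Note that the record stores only term-to-term relationships; nothing in it says how two terms relate syntactically within a given document, which is the limitation the slide points to.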

  10. Facet Analysis
      • Strives to remedy limitations of one-dimensionality by enabling representation from a number of perspectives.
      • Using Ranganathan’s classic dimensions we produce the string:

          Time: 12th Century
          Space: Celtic
          Energy: Embroidered
          Matter: Felt
          Personality: Slippers

      • These strings can be presented in permuted order for access by any of the facets.
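One simple way to give every facet an access point is to rotate the string so that each facet value leads in turn. The sketch below assumes plain cyclic rotation; the slide does not say which permutation scheme is meant, and the function name `rotated_entries` is hypothetical.

```python
from typing import List, Tuple

# The faceted string from the slide, in the order given there.
FACETS: List[Tuple[str, str]] = [
    ("Time", "12th Century"),
    ("Space", "Celtic"),
    ("Energy", "Embroidered"),
    ("Matter", "Felt"),
    ("Personality", "Slippers"),
]


def rotated_entries(facets: List[Tuple[str, str]]) -> List[List[str]]:
    """Return one entry per facet, with that facet's value in lead position.

    The remaining values follow in cyclic order, so every entry still
    carries the full string.
    """
    values = [value for _, value in facets]
    return [values[i:] + values[:i] for i in range(len(values))]


entries = rotated_entries(FACETS)
```

Each entry keeps all five values, but the rotation carries no information about how the facets relate to one another beyond their position.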

  11. PRECIS: Preserved Context Indexing System • Developed by Derek Austin in the early 1970s for subject indexing for the British National Bibliography • Subsequently developed by him, with the assistance of Mary Dykstra, into an adaptable method of linking both the semantics and syntax of indexing terms. • Goal was to represent meaning without “disturbing the user’s immediate understanding.”

  12. PRECIS Indexing Process (Incredibly Simplified) The indexer:
      • examines document, asking the following questions:
          • Did anything happen?
          • If yes, to whom or what did it happen?
          • Who or what did it?
          • Where did it happen?
        (from Dykstra, 1987, p.9)
      • mentally formulates a title-like phrase
          • E.g., “recruitment of teachers in American library schools”
      • analyzes terms syntactically

  13. PRECIS Indexing Process (Incredibly Simplified) • determines role of each term; • (e.g., agent, location) • selects appropriate role operator; • chooses lead terms. Term order is achieved by the operators and is based on context dependency. This means that each term in the string sets the next term into its obvious context. • (e.g., Teachers. Library schools.)

  14. Producing the following entry:

          United States
            Library schools. Teachers. Recruitment
          Library schools. United States
            Teachers. Recruitment
          Teachers. Library schools. United States
            Recruitment
          Recruitment. Teachers. Library schools. United States

      (from Austin, JDoc, 1974, p.49-51)
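The pattern visible in the entries above can be sketched in a few lines. This is a deliberately simplified illustration, not Austin's algorithm: it assumes the indexer has already put the terms in context-dependent order and marked every term as a lead, and it ignores role operators entirely. The function names are hypothetical. Each term takes the lead in turn; the terms that preceded it follow on the same line in reverse order (wider context), and the terms that came after it are displayed on the next line (narrower context).

```python
from typing import List, Tuple


def precis_entries(string: List[str]) -> List[Tuple[str, List[str], List[str]]]:
    """Return (lead, qualifier, display) for each term in the string."""
    entries = []
    for i, lead in enumerate(string):
        qualifier = list(reversed(string[:i]))  # wider context, nearest first
        display = string[i + 1:]                # narrower context, original order
        entries.append((lead, qualifier, display))
    return entries


def format_entry(lead: str, qualifier: List[str], display: List[str]) -> str:
    """Lay out one entry: heading line, then an indented display line if any."""
    heading = ". ".join([lead] + qualifier)
    if display:
        return heading + "\n  " + ". ".join(display)
    return heading


STRING = ["United States", "Library schools", "Teachers", "Recruitment"]
formatted = [format_entry(*entry) for entry in precis_entries(STRING)]

if __name__ == "__main__":
    print("\n".join(formatted))
```

Run on the four-term string from the slide, this reproduces the four entries shown. The full statement appears under every lead term, which is the sense in which context is preserved.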

  15. Aspects of PRECIS Indexing: • Context is preserved: The entire indexing statement appears at each lead term; • The permuted entries read naturally, which is achieved by the prescribed order of the role operators; • The terms are linked to a machine-held thesaurus (not described in this presentation) thereby providing possible see’s and see also’s; • According to Austin, PRECIS can be adapted to other languages, e.g., those with inflection. • The indexer determines meaning and codes the roles and lead terms, but the computer takes care of the permutations.

  16. Some Challenges • Indexing with PRECIS requires a good knowledge of grammar; • In my opinion, the bottleneck comes at the first step: articulating the title-like phrase. • It’s not clear how the terms provided by the indexer are harmonized with the thesaurus to produce “consensual meaning.”

  17. PRECIS as a Bridge • PRECIS can take advantage of the semantic richness of a thesaurus, AND the contextual richness of the natural-like permuted phrases of back-of-the-book indexes. • Could potentially add to the power of a faceted-string display by adding some explicit notion of operators among the facets. • And, could take advantage of NLP techniques, which at this point are able to parse most syntactic roles, as well as phrases and names, with about 80% accuracy without too much “work.” (personal communication, Liz Liddy)

  18. References • Austin, Derek. PRECIS: A Manual of Concept Analysis and Subject Indexing. 2nd ed. London: British Library Bibliographic Services Division, 1984. • Austin, Derek. The development of PRECIS: A theoretical and technical history. Journal of Documentation 30 (1) 1974: 47-102. • Dykstra, Mary. PRECIS: A Primer. Rev. reprint. Metuchen, NJ & London: Scarecrow Press, 1987.
