Document Ontologies in Library and Information Science: An Introduction and Critical Analysis

Document Ontologies in Library and Information Science: An Introduction and Critical Analysis. Allyson Carlyle iSchool, University of Washington, Seattle, WA, USA acarlyle@u.washington.edu http://purl.oclc.org/net/acarlyle Knowledge Technologies Conference 2002, Seattle, WA, USA.


  1. Document Ontologies in Library and Information Science: An Introduction and Critical Analysis Allyson Carlyle iSchool, University of Washington, Seattle, WA, USA acarlyle@u.washington.edu http://purl.oclc.org/net/acarlyle Knowledge Technologies Conference 2002, Seattle, WA, USA

  2. Overview • Where I’m coming from : knowledge and document organization tasks in LIS (Library and Information Science) • Factors affecting the organization of knowledge and documents • Ontologies & Ontological assumptions in LIS • IFLA ontology: Physical/Abstract status of documents • Hirons & Graham ontology: Temporal status of documents

  3. Where I’m coming from: organizing tasks in LIS • Creating document representations, e.g., cataloging records; • Arranging documents, e.g., in Dewey number order on a library bookshelf; • Creating organizational standards (Dewey) and techniques (alphabetical ordering) to use in representing and arranging documents; • Creating organizational standards and techniques to provide pathways (via titles, author names, taxonomies, classifications, etc.) that guide people to documents and organized knowledge.

  4. Factors affecting the organization of knowledge and documents • People: individuals and groups (e.g., social, cultural, occupational orientations); • Systems: retrieval, display / organization, interface; • Knowledge / documents: delivery mode/format, subject content, disciplinary aspect, artifactual importance; • Administration & environment: costs, other constraints.

  5. Ontology? • In philosophy: “the branch of metaphysics that deals with the nature of being” • In computer related communities: “a specification of a conceptualization” (Tom Gruber); or “a set of vocabulary definitions that expresses a community’s consensus knowledge about a domain. This knowledge is meant to be stable over time, and reused to solve multiple problems.” (Peter Weinstein)

  6. Ontological assumptions in LIS • Documents have a simultaneous existence as both physical and abstract entities; this is being referred to in the library cataloging community as “content vs. carrier”

  7. Content vs. Carrier • The physical/abstract dichotomy presents the following problem – if we are creating document representations, what should we represent? The carrier? The content? Both? • Whatever decision is made, the physical/abstract dichotomy may result in complications for people when they are searching, navigating, and trying to determine relevance in systems.

  8. IFLA ontology of the physical/abstract status of documents • International Federation of Library Associations (IFLA) charged a study group to identify “functional requirements for bibliographic records” (in other words, an explanation of an optimum model for creating document representations) • Functional Requirements for Bibliographic Records available at: http://www.ifla.org/VII/s13/frbr/frbr.pdf

  9. IFLA ontology of the physical/abstract status of documents • Proposes that documents are single physical entities representing multiple abstract entities each with its own distinct, and sometimes contradictory, attributes: • work (an intellectual or artistic creation) • expression (a realization of a work in alpha-numeric, musical, image, etc. form) • manifestation (a physical embodiment of an expression of a work) • item (a single exemplar of a manifestation)
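The four entities listed on this slide can be read as a nested data structure. The following is a minimal sketch under stated assumptions, not the IFLA model itself: the class names follow the slide, but every field name (`identifier`, `condition`, `publisher`, `publication_date`, `language`) is an illustrative choice, and the publisher, date, and copy identifiers are placeholder values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    identifier: str            # hypothetical field, e.g. a barcode
    condition: str = "unknown"


@dataclass
class Manifestation:
    """A physical embodiment of an expression of a work."""
    publisher: str
    publication_date: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of a work in text, music, image, or another form."""
    title: str
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """An intellectual or artistic creation."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Collect every item that ultimately embodies this work."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


work = Work(title="La Mare au Diable", creator="George Sand")
french = Expression(title="La Mare au Diable", language="fre")
english = Expression(title="The Devil's Pool", language="eng")
work.expressions.extend([french, english])

# Publisher, date, and copy identifiers below are placeholder values.
edition = Manifestation(publisher="Example Press", publication_date="n.d.")
edition.items.append(Item(identifier="copy-1", condition="good"))
edition.items.append(Item(identifier="copy-2", condition="worn"))
english.manifestations.append(edition)

print([item.identifier for item in work.all_items()])
```

Holding the title at both the work and the expression level mirrors the slide's point that each abstract entity carries its own attributes, which may differ from one level to the next.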

  10. Alternative definitions for IFLA entities • work: a set of items embodying a distinct intellectual or artistic content • expression: a set of items embodying a realization of a work • manifestation: a set of “identical” items; items sharing many intellectual and physical attributes • item: a single item
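These set-based definitions suggest that works, expressions, and manifestations can all be derived by grouping items. A small sketch, assuming each item record already carries keys saying which work, expression, and manifestation it belongs to (`ItemRecord`, its fields, and `group_items` are hypothetical names introduced for illustration):

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, NamedTuple, Sequence, Tuple


class ItemRecord(NamedTuple):
    item_id: str
    work_key: str
    expression_key: str
    manifestation_key: str


def group_items(
    items: Iterable[ItemRecord], key_fields: Sequence[str]
) -> Dict[Tuple[str, ...], FrozenSet[str]]:
    """Group item ids into sets that share the same values for key_fields."""
    groups = defaultdict(set)
    for item in items:
        key = tuple(getattr(item, name) for name in key_fields)
        groups[key].add(item.item_id)
    return {key: frozenset(ids) for key, ids in groups.items()}


items = [
    ItemRecord("i1", "w1", "e1", "m1"),
    ItemRecord("i2", "w1", "e1", "m1"),
    ItemRecord("i3", "w1", "e1", "m2"),
    ItemRecord("i4", "w1", "e2", "m3"),
]

# Each entity level is a set of items; finer levels use longer keys.
works = group_items(items, ("work_key",))
expressions = group_items(items, ("work_key", "expression_key"))
manifestations = group_items(
    items, ("work_key", "expression_key", "manifestation_key")
)

# Every manifestation set is contained in its expression set,
# which is in turn contained in its work set.
assert manifestations[("w1", "e1", "m1")] <= expressions[("w1", "e1")]
assert expressions[("w1", "e1")] <= works[("w1",)]
```

The sketch sidesteps the hard question raised in the later slides: deciding which key an item should receive, that is, when two items count as the same expression or the same work.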

  11. Items • Item attributes: condition, access restrictions on item, history (provenance), marks or inscriptions present

  12. Manifestations • Manifestation attributes: edition designation (3rd edition), publisher/distributor, date of publication/distribution, physical medium, access restrictions on manifestation, file characteristics (electronic document) • What is a manifestation in the web environment? What you see on the screen or what is stored in a file on a server? • If a manifestation is defined as what you see on a screen, how useful is it to describe web page “manifestations”?

  13. Expressions • Expression attributes: expression title (The Haunted Pool, The Devil’s Pool, La Mare au Diable), expression creator (e.g., translator), type of score (musical notation), projection or scale (cartographic expression), etc. • Do all expressions have unique attributes? • Dune vs. Dune – some interpretations would make manifestation attributes into expression attributes

  14. Works • Work attributes: creator, work title (La Mare au Diable), date of creation / date of publication or appearance, key (for a musical work), coordinates (for a cartographic work), etc. • Problem: What is a work? When does a version (expression) of a work become different enough to become its own “distinct intellectual or artistic creation”? • Charles Dickens’ A Christmas Carol vs. Scrooged – the same work? Different works that are related?

  15. Solutions to the Physical/Abstract Multiple Entity Problem • IFLA ontology is one approach; others, both simpler and more complex, are possible – see Indecs Framework, a variation of the IFLA ontology for “intellectual property” e-commerce (http://www.indecs.org/). • Standardized approaches or ontologies are possible that: • recognize multiple abstract entities embodied in a single physical item; • represent each entity using a particular set of attributes, clearly distinguished; • display relationships among items to users in an unambiguous and consistent manner.

  16. Hirons & Graham ontology: temporal status of documents • Some documents, such as magazines, annual reports, and websites, may be seen as distinct works that accumulate or change as time passes. Hirons and Graham identify these as “ongoing entities.” How do we best create representations for ongoing entities?

  17. Hirons & Graham ontology: temporal status of documents • With their ontology, Hirons and Graham clarify the nature of ongoing entities to improve library cataloging rules. • However, their ontology may also be used to improve identification of metadata in web documents.

  18. Hirons & Graham ontology

  19. Strengths of the Hirons & Graham ontology • Recognizes both similarities and differences between documents such as serials that are “successive with discrete parts” and those that are “integrating”, such as Websites • Recognizes the fundamental nature of “integrating” documents: they are not made up of parts, but are wholes that are updated or changed.
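The contrast between “successive” and “integrating” ongoing entities can be illustrated with two small classes. This is only a sketch of the distinction as the slide describes it, not a rendering of Hirons and Graham's full ontology; the class names, fields, and sample titles are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SuccessiveEntity:
    """An ongoing entity issued in discrete parts, such as a serial."""
    title: str
    parts: List[str] = field(default_factory=list)

    def publish(self, part: str) -> None:
        # Earlier parts remain; the entity accumulates over time.
        self.parts.append(part)


@dataclass
class IntegratingEntity:
    """An ongoing entity that stays a single whole, such as a website."""
    title: str
    content: str = ""
    revision: int = 0

    def update(self, new_content: str) -> None:
        # The whole is changed in place; there is no separate new part.
        self.content = new_content
        self.revision += 1


# Titles and contents are placeholder values.
magazine = SuccessiveEntity(title="Example Magazine")
magazine.publish("vol. 1, no. 1")
magazine.publish("vol. 1, no. 2")

website = IntegratingEntity(title="Example Website")
website.update("first version of the home page")
website.update("second version of the home page")

print(len(magazine.parts), website.revision, website.content)
```

Both classes share a title and change over time, which reflects the similarity the slide mentions; the difference is whether a change adds a part or replaces the whole.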

  20. Complications • How do we maintain attribute values for ongoing entities? See Carl Lagoze et al. for a possible solution, using “event aware” metadata: http://www.cs.cornell.edu/lagoze/papers/ev.pdf
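One way to picture the problem of maintaining attribute values is to store dated change events rather than a single current value. The sketch below is a loose illustration of that general idea only; it is not the model from the Lagoze et al. paper, and `ChangeEvent`, `OngoingEntityRecord`, and the sample titles and dates are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Records that an attribute took a new value on a given date."""
    when: date
    attribute: str
    new_value: str


@dataclass
class OngoingEntityRecord:
    """Metadata for an ongoing entity, kept as a history of events."""
    events: List[ChangeEvent] = field(default_factory=list)

    def record(self, when: date, attribute: str, new_value: str) -> None:
        self.events.append(ChangeEvent(when, attribute, new_value))

    def value_at(self, attribute: str, when: date) -> Optional[str]:
        """Return the attribute value in force on the given date, if any."""
        matching = [
            event
            for event in self.events
            if event.attribute == attribute and event.when <= when
        ]
        if not matching:
            return None
        return max(matching, key=lambda event: event.when).new_value


# Placeholder titles and dates for a serial that changed its name.
serial = OngoingEntityRecord()
serial.record(date(1990, 1, 1), "title", "Example Journal")
serial.record(date(2000, 1, 1), "title", "Example Journal of Information Science")

print(serial.value_at("title", date(1995, 6, 1)))
print(serial.value_at("title", date(2002, 3, 1)))
```

Keeping the history lets a catalog answer both “what is this called now?” and “what was it called when this issue appeared?”, at the cost of a more complex record than a flat set of attribute values.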

  21. Complications • Can the Hirons & Graham ontology and the IFLA ontology be successfully integrated? How can we talk about an integrating work or expression? What attributes are associated with them? • For example, are “serials”, such as magazines or e-journals, really “works”? If they are works (Time Magazine) composed of other works (Time articles), what are the implications for representation?

  22. References • IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications, New Series, vol. 19. München: K.G. Saur, 1998. http://www.ifla.org/VII/s13/frbr/frbr.pdf • The Indecs (INteroperability of Data in E-Commerce Systems) Framework. At: http://www.indecs.org/ [Used the IFLA ontology as an initial framework.] • Jean Hirons and Crystal Graham. “Issues Related to Seriality.” From: The Principles and Future of AACR. Jean Weihs, ed. Ottawa: Canadian Library Association, 1998. [Written for the library cataloging community, so parts may be difficult to understand.] • Carl Lagoze, Jane Hunter, and Dan Brickley. “An Event Aware Model for Metadata Interoperability.” At: http://www.cs.cornell.edu/lagoze/papers/ev.pdf

  23. “Ontology” References • Tom Gruber. (2001) “What is an Ontology?” At: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html. • Peter Weinstein. “Ontology-Based Metadata: Transforming the MARC Legacy”, from Digital Libraries 98, Pittsburgh, PA, USA: pp. 254-263.
