
Guidelines and Principles for Developing Search and Browse Vocabularies May 31, 2003 Rice University Houston, TX



Presentation Transcript


  1. Guidelines and Principles for Developing Search and Browse Vocabularies. May 31, 2003, Rice University, Houston, TX. Amy J. Warner, PhD, awarner@lexonomy.com

  2. Epicurious.com

  3. Navigation/Taxonomy
      • Vehicle Brands
          • Cars: MR2Spider, Celica, Matrix, Avalon, Camry Solarus, Camry, Prius, Corolla, ECHO
          • SUVs/Vans: Land Cruiser, Sequoia, 4 Runner, Sienna, Highlander, RAV4
          • Trucks: Tundra, Tacoma
      • Vehicle Parts: Vehicle Accessories, Carriers, Bicycle Carriers, Ski Carriers, Roof Racks, Splash Guards, Security Systems, Tires, Engines & Transmissions
      • Celica Brochure, Camry Brochure

  4. Synonym Rings: Cholesterol, Blood Cholesterol, Serum Cholesterol, Good Cholesterol, Bad Cholesterol, LDL . . .
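
A synonym ring treats every member as equivalent for retrieval, so a search on any one member can be expanded to all of them. The following is a minimal sketch of that idea, not a description of how any particular search engine implements it. The `expand_query` helper and the case-insensitive matching are assumptions made for illustration; the ring members come from the slide, and the slide's trailing ". . ." is left unfilled.

```python
# Hypothetical sketch: the ring from the slide, stored as a set of
# lowercased terms. Names here are illustrative, not from the presentation.
CHOLESTEROL_RING = frozenset({
    "cholesterol", "blood cholesterol", "serum cholesterol",
    "good cholesterol", "bad cholesterol", "ldl",
})
RINGS = [CHOLESTEROL_RING]


def expand_query(term, rings=RINGS):
    """Return every term in the ring containing `term`, or just the term."""
    key = term.strip().lower()
    for ring in rings:
        if key in ring:
            return set(ring)
    return {key}  # no ring matched: search on the term as entered
```

Unlike a USE/Used for relationship, a ring has no preferred term: every member expands to the same set.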

  5. Medline

  6. MeSH & UMLS

  7. Controlled Vocabulary Defined
      • A subset of natural language.
      • A list of preferred and (sometimes) variant terms.
      • With semantic relationships (hierarchical and associative) (sometimes) defined.
      • Used to tag document attributes (describe facets):
          • Topic / Subtopic
          • Audience
          • Language
          • Form
      • Or can be used to create a labeling scheme for navigation.

  8. Cornerstones of Vocabulary Control • Use unambiguous labels/search terms. • Make distinctions among labels/search terms clear. • Make choices about wording and specificity of labels/search terms based on user testing and on size of collection. • Use other semantic relationships (hierarchical, associative) if necessary to organize large lists of labels/search terms.

  9. Continuum of Vocabulary Control (Less to More)
      • Synonym Control
          • USE/Used for relationship: Vehicle crashes USE Vehicle collisions; Vehicle collisions UF Vehicle crashes
          • Synonym Rings: Vehicle collisions, Vehicle crashes, Crashes, Collisions
      • Hierarchical Relationships
          • Broader/Narrower Terms: Vehicle collisions NT Truck collisions; Truck collisions BT Vehicle collisions
          • Browse Categories: Vehicle safety, Truck safety, Truck collisions, Vehicle safety
          • Site Index
          • Taxonomies
      • Associative Relationships
          • Part/Whole
          • Cause/Effect
          • etc.
          • Example: Vehicle parts RT Vehicles; Vehicles RT Vehicle parts
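
The relationship types on this slide are reciprocal: every USE has a matching UF, every NT a matching BT, and RT holds in both directions. A minimal sketch of a data structure that keeps those pairs in step might look like the following. The `Thesaurus` class and its method names are hypothetical and chosen for illustration; only the example terms come from the slide.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus that stores each relationship with its reciprocal."""

    def __init__(self):
        self.use = {}                     # variant term -> preferred term (USE)
        self.used_for = defaultdict(set)  # preferred term -> variants (UF)
        self.broader = defaultdict(set)   # term -> broader terms (BT)
        self.narrower = defaultdict(set)  # term -> narrower terms (NT)
        self.related = defaultdict(set)   # term -> related terms (RT)

    def add_equivalence(self, variant, preferred):
        # Records "variant USE preferred" and "preferred UF variant".
        self.use[variant] = preferred
        self.used_for[preferred].add(variant)

    def add_hierarchy(self, broader, narrower):
        # Records "broader NT narrower" and "narrower BT broader".
        self.narrower[broader].add(narrower)
        self.broader[narrower].add(broader)

    def add_association(self, a, b):
        # RT is symmetric, so store it in both directions.
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term):
        # Map a variant to its preferred term; other terms pass through.
        return self.use.get(term, term)


t = Thesaurus()
t.add_equivalence("Vehicle crashes", "Vehicle collisions")
t.add_hierarchy("Vehicle collisions", "Truck collisions")
t.add_association("Vehicle parts", "Vehicles")
```

Writing both directions in a single method call is one way to avoid the orphaned reciprocals that can creep in when each direction is edited by hand.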

  10. Steps in Controlled Vocabulary Construction
      • Group terms by subject (facet analysis).
      • Link synonyms and variants (Synonym Rings: Vehicle collisions, Vehicle crashes, Crashes, Collisions).
      • Identify broader and narrower terms (Taxonomies / Hierarchies).
      • Identify related terms (Thesauri).

  11. Purposes of Standard • Base choices on ‘best practice’. • Base choices on known principles. • Foster interoperability.

  12. Current NISO Thesaurus Standard • Guidelines for the construction, format, and management of monolingual thesauri: Z39.19-1993. • Not a technical standard, but a set of guidelines. • Emphasizes search thesauri. • Emphasizes postcoordinate retrieval. • Used mainly for abstracting and indexing services. • Does not put the standard in context.

  13. Why Revise
      • Not revised since 1993.
      • Number of downloads high, reflecting interest.
      • Does not take the web environment into account.
      • Navigation schemes are controlled vocabularies too.
      • Is out of date in terms of computing technology in general:
          • Software for managing thesauri has advanced.
          • Software for leveraging thesauri through an interface has advanced.
      • Currently little attention paid to user testing.

  14. Term forms
      • Currently
          • Emphasizes rigid rules for grammatical form.
          • Emphasizes short phrases as terms.
      • Suggested revision
          • Loosen rules on grammatical form.
          • Allow for longer, more complex phrases.
      • Rationale
          • Software can perform automatic stemming.
          • Navigation schemes are more precoordinate.
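
The stemming rationale can be illustrated with a short sketch: if software reduces singular and plural forms to a common stem at match time, the vocabulary itself need not enforce one grammatical form. The code below is a deliberately naive suffix stripper written for this illustration. It is not the Porter algorithm or any production stemmer, and the function names `naive_stem` and `normalize` are assumptions.

```python
def naive_stem(word):
    """Strip a few common English plural endings (illustrative only)."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"              # taxonomies -> taxonomy
    if w.endswith(("shes", "ches", "sses", "xes", "zes")):
        return w[:-2]                    # crashes -> crash
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]                    # collisions -> collision
    return w


def normalize(phrase):
    """Stem each word so variant grammatical forms compare equal."""
    return " ".join(naive_stem(w) for w in phrase.split())
```

With this in place, `normalize("Vehicle collisions")` and `normalize("vehicle collision")` produce the same string, so either form of the term would match the other.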

  15. Semantic Relationships
      • Current standard
          • Only accounts for explicit equivalence relationships.
          • Hierarchical relationship only allowed for genus-species relationship, with a few exceptions.
          • Associative relationship only allowed across categories.
      • Proposed revision
          • Provide guidelines for choosing unambiguous labels.
          • Provide guidelines for loose, browse categories.
      • Rationale
          • Labeling schemes and pick lists often do not account for explicit synonymy relationships.
          • Hierarchical navigation schemes need to be less rigid.

  16. Browse Categories

  17. Usability Testing
      • Current standard
          • Discusses users but does not include guidelines for testing with users.
      • Proposed revision
          • Provide guidelines for open card sort testing of high level categories.
          • Provide guidelines for closed card sorting of term groups under high level categories.
      • Rationale
          • User testing is an important consideration in choosing terms and term relationships.
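
The slide does not say how card sort results would be analyzed. One common approach, shown here only as an assumption, is to count how often participants place two cards in the same group; pairs with high counts are candidates for the same category. The `co_occurrence` function is hypothetical, and the sample sorts are invented for illustration, reusing term names from the taxonomy slide.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count how many participants put each pair of cards in one group."""
    counts = Counter()
    for groups in sorts:                  # one participant's sort
        for group in groups:              # one pile of cards
            # Sort the pair so (a, b) and (b, a) share one key.
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] += 1
    return counts


# Invented sample data: two participants, each producing two groups.
sorts = [
    [{"Tundra", "Tacoma"}, {"Roof Racks", "Ski Carriers"}],
    [{"Tundra", "Tacoma", "Roof Racks"}, {"Ski Carriers"}],
]
pairs = co_occurrence(sorts)
```

In this made-up data both participants grouped Tundra with Tacoma, while only one grouped Roof Racks with Ski Carriers, which is the kind of signal a card sort is meant to surface.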

  18. Display
      • Current standard
          • Emphasizes print copies of thesauri.
          • Screen display section oriented toward display of print copy.
      • Proposed revision
          • Oriented more toward displays of vocabularies that only exist in digital format.
      • Rationale
          • Most web vocabularies do not have print counterparts.

  19. Interoperability
      • Current standard
          • Does not address issues associated with interoperability.
      • Proposed revision
          • Will address major issues and problems associated with interoperability, including multiple languages.
      • Rationale
          • Being able to share information within and among organizations.

  20. Construction and Maintenance
      • Current standard
          • Emphasizes maintenance problems in print vocabularies.
          • Discusses software that manages stand-alone vocabularies.
      • Proposed revision
          • Advance standards for changing, adding, deleting terms automatically.
          • Provide guidance for software that is connected to information retrieval systems.
      • Rationale
          • Software has advanced significantly.
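
As a rough illustration of what deleting a term automatically could involve, the sketch below removes a term and every relationship that points to it, so no dangling BT, NT, or RT references remain. The table layout and the `delete_term` function are assumptions made for this example; neither the current standard nor the proposed revision specifies them.

```python
def delete_term(relations, term):
    """Remove `term` and every reference to it from a relationship table.

    `relations` maps each term to {"BT": set, "NT": set, "RT": set}.
    The table is modified in place.
    """
    relations.pop(term, None)             # drop the term's own entry
    for links in relations.values():
        for targets in links.values():
            targets.discard(term)         # drop references from other terms


relations = {
    "Vehicle collisions": {"BT": set(), "NT": {"Truck collisions"}, "RT": set()},
    "Truck collisions": {"BT": {"Vehicle collisions"}, "NT": set(), "RT": set()},
}
delete_term(relations, "Truck collisions")
```

After the call, "Vehicle collisions" has no narrower terms left, which is the kind of cleanup that had to be done by hand in a print vocabulary.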

  21. Process for Revising Standard • Appoint editor. • Appoint advisory group. • Draft revision. • Discuss drafts with advisory group. • Vote on final draft by NISO board.

  22. Editor & Advisory Group • Amy Warner, lexonomy.com • Vivian Bliss, Microsoft • Carol Brent, ProQuest • John Dickert, U.S. DoD • Lynn El-Hoshy, Library of Congress • Emily Fayen, SDC liaison • Patricia Harpring, Getty • Stephen Hearn, American Library Association • Sabine Kuhn, American Chemical Society/Chemical Abstracts • Pat Kuhr, H.W. Wilson • Diane McKerlie, Design Strategy • Peter Morville, Semantic Studios • Stuart Nelson, National Library of Medicine • Diane Vizine-Goetz, OCLC • Marcia Lei Zeng, Special Libraries Association

  23. Progress to Date • Agreement on scope of revision. • Agreement that guidelines should be placed in context. • Agreement that guidelines should be educational as well as prescribing best practice. • Agreement that guidelines should be forward looking in terms of new technologies. • Agreement to write guidelines for elements and features that all vocabularies have in common, then consider their differences. • Survey conducted to determine use of standard, other standards, software.

  24. Other Players • Communication with editor of British Standard. • Communication and work with W3C to address issues of implementation of controlled vocabularies.

  25. Relationship with Semantic Web and OWL • Semantic Web is an ontological framework. • Both terms in the ontology and the relationships between them are standardized using OWL (Web Ontology Language). • Both the terms and the relationships are ‘deep’ semantically. • This is a structure into which ‘shallower’ terms provided by using Z39.19 could be inserted. • This would enhance interoperability because although we would not have complete agreement on vocabularies, we would have agreement on an effective structure for exchanging them.

  26. Contact Me: Amy J. Warner, awarner@lexonomy.com, www.lexonomy.com
