Metadata: Promise and Practice

Presentation Transcript

  1. Metadata: Promise and Practice. Jeffrey Beall. Nebraska Library Association, Technical Services Round Table, Spring Meeting, April 25, 2008

  2. Outline
     • Introduction
     • 8 theses of my talk
     • About me
     • Metadata and high-quality information retrieval; value of browse displays
     • Four types of searching in libraries
     • The weaknesses of full-text searching
     • The future of cataloging and the debate
     • Next-generation library interfaces

  3. Favorite funny subject headings
     • Golf and war
     • Electric donkeys
     • Infants--Congresses
     • World Wide Web--Early works to 1800
     • Automobile driving--Religious aspects
     • Dance--France
     • Women, Kukukuku (Changed to: Women, Hamtai)
     • Ugly contests
     • Host-fungus relationships

  4. Favorite funny subject headings
     • Weapons of mass destruction--Safety measures
     • Pomegranate seeds in literature
     • Infants--Books and reading
     • Eskimos--Hunting
     • Headache patients’ writings
     • Bird surveys
     • Violin--Methods (Fiddling)
     • Global warming--Fiction
     • Body, Human--Catalogs
     • Mentally ill parents
     • Appalachian Region--Intellectual life

  5. Favorite funny subject headings
     • Tax exemption--Taxation
     • Dinosaurs as pets
     • Labor disputes--Poetry
     • Crappie fishing
     • Reality--Fiction
     • Historic buildings--Design and construction
     • Public toilets in motion pictures
     • Domestic asses
     • Hurling managers
     • Uranus probes
     • 110 10 |a United States. |b Office of Solid Waste

  6. Theses
     • Libraries should provide high-quality information discovery and information retrieval.
     • The best way to achieve this is with systems that sufficiently exploit rich, standard, and comprehensive metadata.
     • Rich, standard, and comprehensive metadata requires controlled vocabularies for subject metadata, name disambiguation, granularity of description, and collocation.

  7. Theses (continued)
     • Full-text searching, while not devoid of value, is a low-quality IR/ID system for the type of searching done in libraries, especially serious research and scholarship.
     • At this time, computers, which do not understand the nuances of human language, are not able to create metadata of sufficient quality for use in library IR systems.

  8. Theses (continued)
     • Information discovery often requires mediation. IR systems don’t have to be dumbed down and made simple. Many things in the world are complicated, so it’s natural that the organization of information will reflect that. It’s okay to have to learn to use a library catalog or other IR system.

  9. Theses (continued)
     • Library IR systems should not abandon alphabetical browse displays in favor of relevance ranking.
     • The creation, maintenance, and sharing of metadata for intellectual resources should not be made so complicated that it reduces the amount or quality of metadata being created.

  10. About me Auraria Campus

  11. The value of metadata
     • Elements of metadata
     • The value of rich metadata
     • The library technology graveyard: analyses of low-quality, emerging library technologies
     • Defining quality in library IR systems

  12. Left-anchored subject browse display

  13. The value of left-anchored browse displays
     • Simplicity
     • Structure
     • Parsing advantage
     • References
     • Truncation
     • Concept consolidation
     • Collocation of inverted terms
     • Typographical errors
     • Classification display
     • Completeness
     • Skill transference
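
A left-anchored browse display can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular catalog implements it: the function names `normalize` and `browse` are hypothetical, the headings are borrowed from the "funny subject headings" slides, and real systems apply much richer filing rules than simple case folding.

```python
import bisect


def normalize(heading: str) -> str:
    """Fold case so that the browse order ignores capitalization (a simplification)."""
    return heading.casefold()


def browse(headings: list[str], query: str, window: int = 5) -> list[str]:
    """Return up to `window` headings, starting where `query` would file alphabetically."""
    ordered = sorted(headings, key=normalize)
    keys = [normalize(h) for h in ordered]
    # Left-anchored: the query is compared from the first character of each heading.
    start = bisect.bisect_left(keys, normalize(query))
    return ordered[start:start + window]


headings = [
    "Dinosaurs as pets",
    "Domestic asses",
    "Electric donkeys",
    "Golf and war",
    "Crappie fishing",
    "Bird surveys",
]

# A truncated query drops the searcher into the alphabetical list at the right spot.
print(browse(headings, "Din", window=3))
# A misspelled query still lands next to the intended heading.
print(browse(headings, "Dinasaurs", window=1))
```

Because the display opens at a position in an ordered list rather than returning a ranked result set, truncated and slightly misspelled queries still put the searcher near the intended heading, which is the point behind the "Truncation" and "Typographical errors" bullets.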

  14. The four categories of searching in libraries
     • Deterministic searching
     • Full-text searching
     • Metatext searching
     • Metadata-enhanced stochastic searching

  15. Deterministic searching
     • An author, title, subject, number search in an online library catalog
     • Only searches metadata; results sorted alphanumerically
     • Can use cross-references
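
The three properties of deterministic searching listed here can be illustrated with a toy subject search. This is a minimal sketch under assumptions: `CATALOG`, `SEE_REFERENCES`, `subject_search`, and the placeholder titles are all hypothetical. The one cross-reference comes from the heading change mentioned earlier in the talk (Women, Kukukuku changed to Women, Hamtai).

```python
# Hypothetical miniature catalog: authorized subject heading -> titles (placeholders).
CATALOG = {
    "Women, Hamtai": ["Title B", "Title A"],
    "Crappie fishing": ["Title C"],
}

# "See" references that lead from an unauthorized form to the authorized heading.
SEE_REFERENCES = {
    "Women, Kukukuku": "Women, Hamtai",
}


def subject_search(heading: str) -> tuple[str, list[str]]:
    """Look up a heading exactly, following a cross-reference if one exists.

    Only metadata (the heading index) is consulted, and the titles come back
    in alphanumeric order rather than ranked by relevance.
    """
    authorized = SEE_REFERENCES.get(heading, heading)
    titles = sorted(CATALOG.get(authorized, []))
    return authorized, titles


print(subject_search("Women, Kukukuku"))
```

The same query always yields the same, fully predictable result set, which is what separates this kind of search from the probabilistic matching described on the next slide.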

  16. Full-text searching
     • Matches words in a search with words in documents
     • Advantages: free, good for rare terms, good for casual information seeking
     • Also called stochastic searching, probabilistic searching

  17. Metatext searching
     • Is a full-text search but only of metadata
     • A keyword search in a library catalog is an example
     • Advantages: good for rare words; good for novice searchers
     • Disadvantages: may miss abbreviated terms; it is a full-text search, but not of the full text itself
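
A keyword search over catalog records, and the abbreviation pitfall noted above, might look like the following. It is a rough sketch only: the two records, the field names, and the helpers `tokens` and `metatext_search` are invented for illustration and do not describe any real catalog's indexing.

```python
import re

# Hypothetical catalog records: only metadata, no full text.
RECORDS = [
    {"title": "Annual report", "author": "United States. Dept. of Agriculture"},
    {"title": "Department stores of Nebraska", "author": "Example, Author"},
]


def tokens(text: str) -> set[str]:
    """Split text into lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def metatext_search(records: list[dict], query: str) -> list[dict]:
    """Return records whose metadata fields, taken together, contain every query word."""
    wanted = tokens(query)
    if not wanted:
        return []
    hits = []
    for record in records:
        words = set()
        for value in record.values():
            words |= tokens(value)
        if wanted <= words:
            hits.append(record)
    return hits


# "department" does not match the abbreviated "Dept." in the first record.
print([r["title"] for r in metatext_search(RECORDS, "department")])
print([r["title"] for r in metatext_search(RECORDS, "dept")])
```

The searcher who types the full word never sees the record that abbreviates it, even though both forms name the same kind of body.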

  18. Metadata-enhanced stochastic searching
     • Is a full-text search but also uses metadata to limit results
     • Google advanced search is an example
     • Google staff mode: how do they encode metadata? What's their metadata scheme?
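
The combination described on this slide, word matching over full text followed by limits drawn from metadata, can be sketched as below. Everything here is an assumption made for illustration: the `Document` fields, the sample documents, and the `search` function are hypothetical, relevance ranking is left out entirely, and nothing is implied about how Google or any other engine stores its metadata.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    year: int       # metadata
    language: str   # metadata
    text: str       # full text


def words(text: str) -> set[str]:
    """Split text into lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(docs, query, *, language=None, min_year=None):
    """Match query words against the full text, then limit the hits by metadata."""
    wanted = words(query)
    results = []
    for doc in docs:
        if not wanted or not wanted <= words(doc.text):
            continue  # full-text step: every query word must occur in the text
        if language is not None and doc.language != language:
            continue  # metadata limit: language
        if min_year is not None and doc.year < min_year:
            continue  # metadata limit: date
        results.append(doc)
    return results


DOCS = [
    Document("Doc A", 1998, "en", "Library catalogs rely on metadata."),
    Document("Doc B", 2007, "en", "Metadata improves retrieval in library systems."),
    Document("Doc C", 2007, "fr", "Metadata library notes."),
]

print([d.title for d in search(DOCS, "library metadata")])
print([d.title for d in search(DOCS, "library metadata", language="en", min_year=2000)])
```

The limits only work to the extent that the metadata fields exist and are reliable, which is why the slide asks how the metadata is encoded.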

  19. The weaknesses of full-text searching
     • The synonym problem
     • The homonym problem
     • Inability to search by facets
     • Spamming
     • The "aboutness" problem
     • Figurative language
     • Word lists
     • Abstract topics
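
The first two weaknesses, synonyms and homonyms, are easy to reproduce with a toy word matcher. The sketch below is illustrative only: the four documents, the subject assignments, and the function names are invented for the example, and the headings in `SUBJECTS` are meant to show the idea of a controlled vocabulary rather than to quote any catalog record.

```python
import re

# Hypothetical documents (full text).
DOCS = {
    "doc1": "A guide to automobile driving and road safety.",
    "doc2": "Caring for your car through the winter.",
    "doc3": "Crane migration along the Platte River.",
    "doc4": "Operating a tower crane on a construction site.",
}

# Hypothetical controlled subject headings assigned by a cataloger.
SUBJECTS = {
    "doc1": {"Automobiles"},
    "doc2": {"Automobiles"},
    "doc3": {"Cranes (Birds)"},
    "doc4": {"Cranes, derricks, etc."},
}


def full_text_search(word: str) -> list[str]:
    """Return ids of documents whose text contains `word` as a whole word."""
    target = word.lower()
    return sorted(
        doc_id for doc_id, text in DOCS.items()
        if target in re.findall(r"[a-z]+", text.lower())
    )


def subject_search(heading: str) -> list[str]:
    """Return ids of documents that were assigned `heading`."""
    return sorted(doc_id for doc_id, hs in SUBJECTS.items() if heading in hs)


print(full_text_search("car"))           # synonym problem: doc1 says "automobile"
print(full_text_search("crane"))         # homonym problem: birds and machines mixed
print(subject_search("Automobiles"))     # one heading collocates both documents
print(subject_search("Cranes (Birds)"))  # the heading disambiguates the homonym
```

Word matching both misses relevant material and retrieves irrelevant material, while a controlled heading brings the synonyms together and keeps the homonyms apart.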

  20. The weaknesses of full-text searching (continued)
     • The incognito problem
     • Difficult-to-search paired topics
     • Search engine variability
     • The opaque web

  21. Search fatigue

  22. Miscellaneous
     • What computers still cannot do
     • Gresham's Law
     • Still need metadata surrogates
     • The debate about the future of cataloging
     • My strategy
     • "Next-generation" library catalogs

  23. WorldCat.org: example of a next-generation, FRBRized search engine
     • Facets
     • Metatext search
     • Hope for catalogers
     • Can also be sorted by author, title, date

  24. jeffrey.beall@ucdenver.edu Discussion … Scarlet