The use of collaborative tagging in public library catalogues • Louise Spiteri • School of Information Management
Client-centred library catalogues • In recent years, significant developments have occurred in the creation of customizable user features in public library catalogues. • These features offer clients the opportunity to customize their own library web page and to store items of interest to them, such as book lists. • Client participation in these interfaces, however, is largely reactive; clients can select items from the catalogue, but they have little ability to organize and categorize these items in a way that reflects their own needs and language.
Desired features of library systems • Collaborative tagging, or folksonomies, allows anyone to freely attach keywords or tags to content. Dempsey (2003) and Ketchell (2000) recommend that clients be allowed to annotate resources of interest and to share these annotations with other clients with similar interests. • Folksonomies can thus make significant contributions to public library catalogues by enabling clients to create and organize their own personal information space in the catalogue. • Clients find items of interest (items in the library catalogue, citations from external databases, external web pages, etc.) and store, maintain, and organize them in the catalogue using their own tags.
Research Goals • To examine the structure and scope of folksonomies. • How are the tags that constitute the folksonomies structured? • To what extent does this structure reflect and differ from the norms used in the construction of subject headings such as LCSH? • What are the strengths and weaknesses of folksonomies (e.g., reflect user need, ambiguous headings, redundant headings, etc.)? • To examine the extent to which LCSH headings reflect user-derived folksonomies. • How much overlap exists between LCSH headings and popular tags? • How well do LCSH headings mirror user-derived tags for similar concepts?
Methodology • Acquired the daily tag logs from three folksonomy sites (Del.icio.us, Furl, and Technorati) over 30 days. • Assessed the structure of tags against criteria derived from Section 6 of the 2005 NISO Guidelines for the construction, format, and management of monolingual controlled vocabularies, specifically: • Term choice • Grammatical forms of terms • Nouns • Selection of preferred form
Methodology • The tags were mapped against the LCSH online name and subject authority files to look for incidences of exact, partial, or no matches.
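The mapping step can be illustrated with a minimal sketch. It is not the procedure used in the study: the `classify_tag` function, the three sample headings, and the rule that a partial match means the tag appears as a whole word or phrase inside a longer heading are all assumptions made for the example.

```python
def classify_tag(tag, headings):
    """Return 'exact', 'partial', or 'none' for a tag against a set of headings.

    Assumption: a partial match means the tag occurs as a whole word or
    phrase inside a longer heading (e.g., Flash v. Macromedia Flash).
    """
    needle = tag.strip().lower()
    normalized = {h.strip().lower() for h in headings}
    if needle in normalized:
        return "exact"
    for heading in normalized:
        # Pad with spaces so that only whole words or phrases match.
        if f" {needle} " in f" {heading} ":
            return "partial"
    return "none"


# Hypothetical sample of headings; the real authority files are far larger.
headings = {"Macromedia Flash", "Motion pictures", "Education"}

for tag in ["Education", "Flash", "podcast"]:
    print(tag, "->", classify_tag(tag, headings))
```

A rule this simple cannot detect synonym references or combinations of headings, so those cases would still need manual review.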
Results: Homographs • Homographs constitute 22% of Delicious tags, 12% of Furl tags, and 20% of Technorati tags. • Unique entities constitute a significant proportion of the homographs in all three sites, with 71% in Delicious, 43% in Furl, and 55% in Technorati. • The most frequently occurring homographs across the three sites consist predominantly of computer-related products, such as Ajax and CSS.
Results: Single v. Multi-term tags • NISO recommends the use of single terms wherever possible. • Single-term tags constitute 93% of Delicious tags, 76% of Furl tags, and 80% of Technorati tags. • The preponderance of single-term tags in Delicious may reflect the fact that it does not allow for the use of spaces between the different elements of the same tag, e.g., open source.
Results: Unique entities (Proper nouns) • Unique entities constitute 22% of Delicious tags, 14% of Furl tags, and 49% of Technorati tags. • Computer-related products constitute 100% of the unique entities in Delicious, 63% in Furl, and 38% in Technorati. The remainder of the unique entities in Furl and Technorati represent places, people, and corporate bodies. • The unique entities in Technorati are closely related to developments in current news events, an occurrence that is likely due to the site’s focus on blogs, rather than web sites.
Results: Count and non-count nouns • Of the count nouns, 36% of Delicious tags, 62% of Furl tags, and 34% of Technorati tags appeared in the correct plural form.
Results: Spelling • Only 4% of Delicious, 3% of Furl, and 2% of Technorati tags did not conform to standard spelling (Merriam-Webster). • Non-standard spelling was largely a result of the improper use of punctuation, or the inability to form compound tags in Delicious, e.g., • Opensource v. Open source • Superbowl v. Super Bowl • Web-2.0 v. Web2.0
Results: Abbreviations and acronyms • NISO recommends that the full form of terms should be used to minimize ambiguity, with cross references made between the full forms and their abbreviations (e.g., NFL USE National Football League). • Abbreviations and acronyms constitute 22% of Delicious tags, 16% of Furl tags, and 19% of Technorati tags. • The majority of these abbreviations and acronyms pertain to unique entities, such as product names (e.g., Flash, Mac, and NFL).
Results: Abbreviations and acronyms • Abbreviations and acronyms play a significant role in the ambiguity of the tags from the three sites • 71% of the abbreviated Delicious tags, 45% of the abbreviated Furl tags, and 73% of the abbreviated Technorati tags. • The Delicious tags are focused more heavily upon computer-related products, which may explain why there are so many more abbreviated tags, since many of these products are often referred to by these shorter terms, e.g., CSS, Flash, Apple, etc.
Results: Overlap between tags and LCSH • A total of 470 unique tags was derived from the three folksonomy sites and compared to the LCSH subject and name authority files. • Exact Matches: 27% of the tags • Partial Matches: 29% of the tags (e.g., Flash v. Macromedia Flash) • Indirect Matches: 25% of the tags • Referred synonyms, e.g., Films SEE Motion pictures • Combination of LCSH terms, e.g., Information technology + Education used to express the tag IT Education. • No Matches: 19% of the tags
Conclusions • The tags correspond closely to a number of the NISO guidelines pertaining to the structure of terms, namely in the types of concepts expressed by the tags, the predominance of single tags, the predominance of nouns, and the use of recognized spelling.
Conclusions: Singular and plural forms of tags • Problem areas in the structure of the tags pertain to: • The inconsistent use of the singular and plural form of count nouns • The difficulty with creating multi-term tags in Delicious • The incidence of ambiguous tags in the form of homographs and unqualified abbreviations or acronyms. • Since many search engines do not deploy default truncation, the use of the singular or plural form could affect retrieval; a search for the tag computer in Delicious, for example, retrieved 208,409 hits, while one for computers retrieved 91,205 hits.
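The effect of truncation on retrieval can be shown with a small sketch that uses the two Delicious hit counts quoted above. The `search` function and its `truncate` flag are hypothetical and do not describe how any of the three sites implements searching.

```python
def search(query, tag_counts, truncate=False):
    """Sum hit counts for tags that match the query.

    With truncate=False only the exact tag matches; with truncate=True
    any tag that starts with the query matches (right-hand truncation).
    """
    query = query.lower()
    total = 0
    for tag, hits in tag_counts.items():
        tag = tag.lower()
        if tag == query or (truncate and tag.startswith(query)):
            total += hits
    return total


# Hit counts reported above for the two forms of the tag in Delicious.
tag_counts = {"computer": 208_409, "computers": 91_205}

print(search("computer", tag_counts))                 # singular form only
print(search("computer", tag_counts, truncate=True))  # singular and plural together
```

Without truncation, the search for computer misses every item tagged only with computers, which is the retrieval cost described above.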
Conclusions: Multi-term tags • Furl and Technorati allow for the use of multi-term tags, but make no mention of this feature in their help screens, which means that such tags may be constructed inconsistently, for example, by the insertion of punctuation, where a simple space between the terms will suffice. • Delicious does not allow directly for the construction of multi-term tags; it suggests that a variety of punctuation devices may be used to conflate two or three separate tags, once again to the detriment of retrieval, as is shown below: • Opensource: 103,476 hits • Open_source: 91,205 hits • Open.source: 26,494 hits
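One way to see how much retrieval is split across these variants is to collapse them to a single form and add up the hits. The sketch below is an illustration only: `normalize_tag` and `merge_variants` are hypothetical helpers, and removing every space, underscore, period, and hyphen is a crude assumption that would also alter tags such as Web2.0.

```python
import re
from collections import defaultdict


def normalize_tag(tag):
    """Lowercase a tag and drop separators (space, underscore, period, hyphen)."""
    return re.sub(r"[\s_.\-]+", "", tag.lower())


def merge_variants(hit_counts):
    """Add up the hit counts of all tags that share one normalized form."""
    merged = defaultdict(int)
    for tag, hits in hit_counts.items():
        merged[normalize_tag(tag)] += hits
    return dict(merged)


# Hit counts for the three Delicious variants listed above.
hit_counts = {"Opensource": 103_476, "Open_source": 91_205, "Open.source": 26_494}

print(merge_variants(hit_counts))
```

A search on any single variant reaches only part of the combined total, which is why one standard way of constructing multi-term tags matters for retrieval.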
Conclusions: Ambiguity • The help screens of the three sites do not address the notion of ambiguity in the construction of tags. • The sites fail to address the fact that abbreviated forms (or any tag, for that matter) may be culturally-based, so that while the meaning of NFL may be obvious to North American users, this may not be the case for people who live in other geographic areas. • It may be useful for the folksonomy sites to add direct links to an online dictionary and to Wikipedia, and to encourage people to use these sites to determine whether their chosen tags may have more than one application or meaning.
Conclusions: Tags and LCSH • At first glance, it may appear that LCSH does well compared to the tags, as 81% of the tags could be expressed, to some degree, in the LC authority files. • The indirect matches category, however, may provide a false sense of security, since a number of these matches are obtuse, awkward, or too broad or narrow in scope to express accurately the tags, e.g.: • Motion pictures instead of Films • Press instead of News • Cookery instead of Cooking
Recommendations • If library catalogues decide to incorporate folksonomies, they should consider including the following guidelines: • The proper use of singular and plural nouns and their impact on retrieval; • One standard way to construct multi-term tags; • A link to a recognized online dictionary and to Wikipedia to help users to determine the meanings of terms, to disambiguate amongst homographs, and to determine if the full form would be preferable to the abbreviated form. An explanation of the impact of ambiguous tags and homographs upon retrieval would be useful; and • An acceptable-use policy that would cover areas of potential concern, such as the use of potentially offensive tags, overly graphic tags, and so forth. Although such terms were not the focus of this study, their presence was certainly evident in some cases, and would need to be considered in an environment that includes clients of all ages.
Future Research • Examine the tagging behaviour of people who use folksonomy sites, e.g.: • Why do people choose the tags they use; what motivates them to modify these tags; how often do they modify them? • How are folksonomies used communally? • How do folksonomies foster consensus in the use of tags? • How does the community affect which tags are used and how?
Future Research • An examination of the role of collaborative tagging in the creation of online communities that share their interests via the public library catalogue. • Clients with a shared interest in cult films, for example, could access each other’s relevant tags, and hence any resources that have been bookmarked under these tags. • Librarians could use the information found under the public tags to help them create reading lists. Collaborative tagging could thus help create more client-directed library portals.
Acknowledgements • This research was funded by the OCLC/ALISE Library & Information Science Research Grant Program (LISRGP).