SOCIAL TAGGING Ayesha Akbar Shafia Imtiaz Amal Faisal Omer Bin Asad
SOCIAL TAGGING (FOLKSONOMY) SOCIAL BOOKMARKING Social bookmarking is the practice of saving bookmarks to a public Web site and tagging them with keywords. It is a method for Internet users to organize, store, manage and search for bookmarks of online resources. BOOKMARKING Bookmarking, on the other hand, is the practice of saving the address of a Web site you wish to visit in the future on your own computer.
WHO IS DOING IT? • Furl • Simpy • Delicious • CiteULike It is particularly useful when collecting a set of resources that are to be shared with others. Social bookmarking is another component of social media. It enables users to express different perspectives on information through tagging.
HOW DOES IT WORK? • The creator of the bookmark assigns tags to each resource, resulting in a user-directed method of classifying the information. • Descriptions may be added to these bookmarks in the form of metadata, so users can understand the content of a resource without first needing to download it for themselves. • Because social bookmarking services indicate who created each bookmark and can provide access to that person’s other bookmarked resources, users can easily make social connections with other individuals interested in just about any topic.
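The mechanics described above can be sketched as a small data model. This is only an illustration under stated assumptions, not how any particular service is implemented: the `Bookmark` and `BookmarkStore` classes, their fields, the user name and the example.org URLs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    url: str
    creator: str                               # who saved the bookmark
    tags: set = field(default_factory=set)     # freely chosen keywords
    description: str = ""                      # optional descriptive metadata


class BookmarkStore:
    """Toy in-memory store (hypothetical; for illustration only)."""

    def __init__(self):
        self.by_tag = defaultdict(list)
        self.by_creator = defaultdict(list)

    def add(self, bookmark):
        # Index the bookmark under each of its tags and under its creator.
        for tag in bookmark.tags:
            self.by_tag[tag.lower()].append(bookmark)
        self.by_creator[bookmark.creator].append(bookmark)

    def with_tag(self, tag):
        # Combined view of everyone's bookmarks carrying this tag.
        return list(self.by_tag.get(tag.lower(), []))

    def others_by_same_creator(self, bookmark):
        # Follow the "who created this" link to that person's other bookmarks.
        return [b for b in self.by_creator[bookmark.creator] if b is not bookmark]


store = BookmarkStore()
first = Bookmark("http://example.org/folksonomy", "alice",
                 {"tagging", "folksonomy"}, "Introductory article on folksonomies")
second = Bookmark("http://example.org/metadata", "alice", {"metadata"})
store.add(first)
store.add(second)

found = store.with_tag("folksonomy")
related = store.others_by_same_creator(found[0])
print([b.url for b in related])
```

Indexing by both tag and creator is what makes the social step possible: a tag lookup leads to a bookmark, and the bookmark leads to its creator's other resources.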
WHY IS IT IMPORTANT? • It gives users the opportunity to express differing perspectives on information and resources through informal organizational structures. • It allows like-minded individuals to find one another and create new communities of users that continue to influence the ongoing evolution of folksonomies and common tags for resources. • It helps users find information related to the topic they are researching, even in areas that aren’t obviously connected to the primary topic.
Advantages • Tags can improve the search capabilities of search engines. • Tags provide metadata that is associated with resources. • Tag patterns emerge for documents tagged by multiple users, which helps describe their contents and characteristics. • This contextual information is created by content consumers (reader-to-author evaluation). • Hence, it may be more trustworthy than the metadata provided by content producers.
Advantages • No controlled vocabulary is used. • Tags emerge freely among users. • Thus, there is no rigid classification of the kind librarians apply.
Downsides • Done by amateurs • No oversight as to how the resources are organized and tagged • Result: Inconsistent or poor use of tags. • Reflects the usage of a particular community of users • Result: Skewed view of any particular topic. For example, if a user saves a bookmark for a site with information about greyhounds but only tags the site with the term “greyhound” and not also with “dogs” or perhaps “dog racing,” that resource might never be found by someone looking for information about breeds of dogs.
Tool: del.icio.us • In 2003 Joshua Schachter launched del.icio.us, the first social bookmarking website. • Delicious uses a non-hierarchical classification system in which users can tag each of their bookmarks with freely chosen index terms (generating a kind of folksonomy). A combined view of everyone's bookmarks with a given tag is available; for instance, the URL "http://delicious.com/tag/wiki" displays all of the most recent links tagged "wiki". (Courtesy: www.wikipedia.com ) • Its collective nature makes it possible to view bookmarks added by other users.
Social Tag Prediction • Between 30 and 50 percent of URLs posted to del.icio.us have been bookmarked only once or twice. • The average bookmark has about 2.5 tags. • Result: the chances that a query for a particular tag will return a bookmark posted only once or twice are low. • In other words, recall for single-tag queries is heavily limited by the high number of rare URLs with few tags.
Social Tag Prediction • For example, a user labeling a new software tool for Apple’s Mac OS X operating system might annotate it with “software,” “tool,” and “osx.” A second user looking for this content with the single tag query (or feed) “mac” would miss this content, even though a human might easily realize that “osx” implies “mac.”
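The missed query in this example, and the effect of adding an implied tag, can be sketched as follows. This is a minimal illustration, not the prediction method of any cited paper: the URL is made up, and the rule "osx implies mac" is hard-coded as an assumption, whereas a real system would have to learn or curate such rules.

```python
# Hypothetical bookmark: URL -> tags assigned by the first user.
bookmarks = {
    "http://example.org/new-mac-tool": {"software", "tool", "osx"},
}

# Assumed implication rules (tag -> tags it implies), hard-coded here.
implied_tags = {"osx": {"mac"}}


def search(tag, index):
    """Return the URLs whose tag set contains the query tag."""
    return sorted(url for url, tags in index.items() if tag in tags)


def expand(index, rules):
    """Return a copy of the index with implied tags added to each tag set."""
    expanded = {}
    for url, tags in index.items():
        extra = set()
        for tag in tags:
            extra |= rules.get(tag, set())
        expanded[url] = tags | extra
    return expanded


print(search("mac", bookmarks))                        # raw tags: nothing found
print(search("mac", expand(bookmarks, implied_tags)))  # found after expansion
```

Expanding the stored tag sets, rather than the query, keeps the single-tag query (or feed) unchanged for the second user.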
Market Basket Model • In the market-basket model, there is a large set of items and a large set of baskets, each of which contains a small set of items. • Goal: find correlations between sets of items in the baskets.
Market Basket Model • Market-basket data mining produces association rules of the form X → Y. • Association rules commonly have three values associated with them: • Support: the number of baskets containing both X and Y. • Confidence: P(Y|X). (How likely is Y given X?) • Interest: P(Y|X) − P(Y), or alternatively P(Y|X) / P(Y). (How much more common is X&Y than expected by chance?)
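The three values can be computed directly from a list of baskets. The sketch below treats each bookmark's tag set as a basket, which is an assumption made for illustration; the function name `rule_metrics` and the five toy baskets are hypothetical and not taken from any dataset.

```python
def rule_metrics(baskets, x, y):
    """Support, confidence and both forms of interest for the rule X -> Y.

    baskets: list of sets of items; x and y: sets of items.
    """
    n = len(baskets)
    count_x = sum(1 for b in baskets if x <= b)        # baskets containing X
    count_y = sum(1 for b in baskets if y <= b)        # baskets containing Y
    support = sum(1 for b in baskets if (x | y) <= b)  # baskets with X and Y
    if n == 0 or count_x == 0 or count_y == 0:
        raise ValueError("X and Y must each occur in at least one basket")
    confidence = support / count_x                     # P(Y|X)
    p_y = count_y / n                                  # P(Y)
    return {
        "support": support,
        "confidence": confidence,
        "interest_difference": confidence - p_y,       # P(Y|X) - P(Y)
        "interest_ratio": confidence / p_y,            # P(Y|X) / P(Y)
    }


# Toy baskets: each one is the tag set of a hypothetical bookmark.
baskets = [
    {"software", "tool", "osx", "mac"},
    {"osx", "mac"},
    {"osx", "software"},
    {"mac", "apple"},
    {"linux", "software"},
]
metrics = rule_metrics(baskets, {"osx"}, {"mac"})
print(metrics)
```

For these toy baskets the rule osx → mac has support 2, confidence 2/3 and P(mac) = 3/5, so the difference form of interest is positive and the ratio form is above 1, meaning "mac" is somewhat more likely than chance when "osx" is present.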
Researches Research 1
Research: Sentiment Analysis • Paper: Can Social Bookmarking Enhance Search in the Web? • Authors: Yusuke Yanbe, Adam Jatowt, Satoshi Nakamura, Katsumi Tanaka • Department of Social Informatics, Kyoto University • This paper investigates the usefulness of social bookmarking systems for enhancing Web search through a series of experiments on datasets obtained from social bookmarking systems.
Sentiment Analysis • Often tags contain sentiments expressed by users towards bookmarked resources. • This could allow for a sentiment-aware search that would exploit user feelings about Web pages.
Sentiment Analysis • To measure the number and kinds of sentiment tags used by bookmarkers, a dataset obtained from Hatena Bookmark is used. • Tags in this dataset were classified into two groups according to the tag taxonomy defined by Golder and Huberman: • a) Content tags: tags that identify what or who the resource is about • b) Sentiment tags: tags that identify qualities or characteristics of resources (scary, funny, stupid etc.)
The Research • The top 1,100 tags from the dataset were examined to detect content and sentiment tags. • The tags were then translated into English. • Content tags are on average more common than sentiment tags.
Observation • Among the top 30 tags, the ratio of content tags to sentiment tags is about 10:1. • The top 3 sentiment tags are very common, while the other sentiment tags are used much less. Figure: Frequency distribution of top 20 content and sentiment tags
Observation • Top 54 sentiment tags placed on the negative-positive scale including the information about their frequencies. • The tags appearing more than 3000 times are above the dashed line, while those with frequencies less than 100 times are below the horizontal axis.
Findings • The most popular sentiment tags are: useful, amazing and awful. • In general, there are more positive sentiment tags than negative ones, and the positive ones are also used more frequently. • Only one negative sentiment tag was used more than 100 times (“it’s awful”). • Conclusion: This suggests that social bookmarkers usually do not bookmark resources about which they have negative feelings.
Researches Research 2
Research and its Implications • For researchers to understand the benefits and limitations of using user-generated tags for indexing and retrieval, it is important to investigate to what extent community influences tagging behaviour, what characteristic effects this has on tag datasets, and whether this influence helps or hinders search and retrieval.
The Research Paper Used • This article reports on research presented on a panel at The American Society for Information Science & Technology (ASIS&T) 2007 annual conference which investigated the use of social tagging in communities and in context.
The Purpose • Panel participants described studies around the world that explore to what extent and in what manner users, consciously or unconsciously, take into account their communities of practice when assigning tags. • Each study examines how different communities use social tagging to disseminate information to other community members in the online environment.
Social Tagging in the Code4Lib Community • To what extent community members consider community while tagging. • Code4Lib is an organic community consisting of librarians and library software developers. One way Code4Lib shares information is by bookmarking items in del.icio.us, a popular social tagging Web site, with the tag ‘code4lib’.
Social Tagging in the Code4Lib Community • Once an item is tagged with ‘code4lib’, it is shared in three ways: • On a Web page created through the del.icio.us site • On the Planet Code4Lib blog aggregator • On the Code4Lib Internet Relay Chat (IRC) channel. • It is assumed that members of the Code4Lib community want to share an online resource with the community if the set of tags applied to the bookmark includes the tag ‘code4lib’. • Conversely, it is assumed that community members are bookmarking resources for their own personal use when they do not include the ‘code4lib’ tag in the set of social tags they assign.
Process of Experimentation • Tags of fifteen Code4Lib members who bookmarked at least five items with the tag ‘code4lib’ on del.icio.us were reviewed. • All users whose tags were reviewed are active community members and are aware that items tagged with ‘code4lib’ are shared with the community. • Ten recent bookmarks tagged with ‘code4lib’ (community) were analyzed for each member. • If a community member had tagged fewer than 10 items with ‘code4lib’, all bookmarks with the tag were reviewed. • Ten items bookmarked by these members that did not include the ‘code4lib’ tag (personal) were also examined.
Process of Experimentation • All tags (n=872) associated with bookmarked resources were analysed according to Golder and Huberman’s seven categories of tags. • Sets of tags were separated by user and placed into categories based on the inclusion or exclusion of the ‘code4lib’ tag within the set. • Both the overall number of tags and the number of tags in each category were analysed using the Wilcoxon signed-rank test to determine whether there was any statistical difference in the kinds or number of tags used.
Findings • While casual observation shows differences in how some individuals tagged for themselves (set of tags which did not include ‘code4lib’) versus for the community (set of tags which included ‘code4lib’), overall, there was no significant difference in types of tags used in each set. • There was a significant difference (α = 0.01) found in the number of tags applied for the two sets. The average number of tags used in a set when 'code4lib' was included as a tag was 3.70 compared to only 2.97 tags when 'code4lib' was not included. • A larger number of tags was applied when tagging for the community.
Findings • This may indicate that community members do indeed tag differently for a community than they do for themselves. • However, when the tag code4lib is excluded from the count of tags for these resources, the difference does not turn out to be statistically significant. This suggests that the only difference is the inclusion of the community tag.
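The comparison behind this finding can be sketched as a split of tag sets by the presence of the community tag, followed by a count with and without that tag. The tag sets below are hypothetical and are not data from the study; the helper name `mean_tag_count` is also an assumption, and the sketch only computes averages, it does not run the significance test.

```python
COMMUNITY_TAG = "code4lib"

# Hypothetical tag sets, one per bookmark; not data from the study.
tag_sets = [
    {"code4lib", "opac", "search", "solr"},
    {"code4lib", "marc", "python"},
    {"recipes", "dinner"},
    {"python", "tutorial", "reference"},
]


def mean_tag_count(sets, ignore=frozenset()):
    """Average number of tags per set, not counting tags in `ignore`."""
    return sum(len(s - ignore) for s in sets) / len(sets)


# Community sets include the community tag; personal sets do not.
community = [s for s in tag_sets if COMMUNITY_TAG in s]
personal = [s for s in tag_sets if COMMUNITY_TAG not in s]

with_tag = mean_tag_count(community)                      # community tag counted
without_tag = mean_tag_count(community, {COMMUNITY_TAG})  # community tag excluded
personal_mean = mean_tag_count(personal)
print(with_tag, without_tag, personal_mean)
```

In this toy data the community average drops to the personal average once the community tag is left out of the count, which mirrors the pattern the slide describes.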
Metadata Tagging in China: Three Models of Tag Use • There are three models of how tagging is used as metadata in China. In the first model, the user employs tagging to link information, through the tag, to the user. Here tagging indicates the relationship of the user to the information, showing how the user perceives ('reads') that information object.
Metadata Tagging in China: Three Models of Tag Use The second model portrays users that are connected together through their use of tags. This is where real social networking comes in, as users are tagging to relate their concept of information to another user's concept of some piece of information.
Metadata Tagging in China: Three Models of Tag Use In the final model, tags are used to link banks of data (or information) to other information. The tags act as metadata that lets search engines know which information is related to other information. This type of tagging is used extensively within ontologies.
References • http://net.educause.edu/ir/library/pdf/ELI7001.pdf • http://www.ariadne.ac.uk/issue54/tonkin-et-al/ • http://ilpubs.stanford.edu:8090/775/1/2006-10.pdf • http://www.dsc.ufcg.edu.br/~baptista/cursos/BDCopin/p195-heymann.pdf • http://ilpubs.stanford.edu:8090/834/1/2008-18.pdf