
Mining Tag Semantics for Social Tag Recommendation

Hsin-Chang Yang

Department of Information Management

National University of Kaohsiung

Outline
  • Introduction
  • Text Mining by SOM
  • Tag Recommendation Process
  • Experimental Results
  • Conclusions
Social Bookmarking: Why?
  • Social bookmarking services (aka folksonomy) are gaining popularity since they have the following benefits:
    • Alleviation of efforts in Web page annotation
    • Improvement of retrieval precision
    • Simplification of Web page classification
How a Folksonomy Works
  • Simple
    • A user (ui) annotates a Web page (oj) with a set of tags, called a post (Tij).
    • Generally represented as a set of tuples (ui, oj, Tij)
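
As a rough illustration of the tuple representation above, the sketch below stores each annotation as one (ui, oj, Tij) record and merges the posts for each page. The class name `Annotation`, the helper `tags_by_page`, and all sample users, URLs, and tags are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from typing import FrozenSet, NamedTuple


class Annotation(NamedTuple):
    """One folksonomy tuple (u_i, o_j, T_ij)."""
    user: str             # u_i: who tagged
    page: str             # o_j: what was tagged
    post: FrozenSet[str]  # T_ij: the tags assigned at one time


# Hypothetical sample data, for illustration only.
folksonomy = {
    Annotation("alice", "http://example.org/a", frozenset({"som", "clustering"})),
    Annotation("bob", "http://example.org/a", frozenset({"clustering", "tutorial"})),
    Annotation("alice", "http://example.org/b", frozenset({"bookmarking"})),
}


def tags_by_page(annotations):
    """Merge every user's post for a page into one tag set per page."""
    merged = defaultdict(set)
    for annotation in annotations:
        merged[annotation.page] |= annotation.post
    return dict(merged)


page_tags = tags_by_page(folksonomy)
print(sorted(page_tags["http://example.org/a"]))
```

Storing posts as frozen sets keeps each tuple hashable, so the folksonomy itself can be a plain set of tuples.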

[Figure: a user (ui) finds a Web page (oj) interesting and decides to add tags (Tij) to it. Labels shown: "GrC2011", "program", "Granular Computing".]

Characteristics of Folksonomy
  • Collaboration
  • Semantic relatedness
    • helps improve retrieval precision
  • Social tagging is not a trivial task
Tag Recommendation
  • The mechanism of suggesting suitable tags to users when they try to add tags to a Web page
    • saves users the effort of choosing tags from scratch
    • constrains how tags are formulated
  • An automatic tag recommendation process is thus beneficial for social bookmarking services as well as search engines.
Text Mining by SOM

[Figure: processing flow. Training Web pages and training posts → Preprocessing → Web page vectors and post vectors → SOM training → synaptic weight vectors → Labeling → page associations and tag associations → page clusters and tag clusters → Association discovery → page/tag associations.]

Preprocessing
  • Bag-of-words approach for describing pages and posts
    • post: the collection of tags assigned to a page at one time
  • Each Web page Pi is transformed into a binary vector Pi.
  • Ti, the post of Pi, is likewise transformed into a binary vector Ti.
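
A minimal sketch of the binary bag-of-words encoding described above, assuming pages and posts are already tokenized. The helper names `build_vocabulary` and `to_binary_vector` and the sample data are hypothetical; the slides do not say how the vocabulary is built.

```python
def build_vocabulary(documents):
    """Map every distinct term to a fixed vector position."""
    terms = sorted({term for doc in documents for term in doc})
    return {term: index for index, term in enumerate(terms)}


def to_binary_vector(document, vocabulary):
    """Return a 0/1 vector: 1 where the term occurs in the document."""
    vector = [0] * len(vocabulary)
    for term in document:
        if term in vocabulary:
            vector[vocabulary[term]] = 1
    return vector


# Hypothetical tokenized pages and their posts (same index = same item).
pages = [["self", "organizing", "map"], ["social", "bookmarking", "map"]]
posts = [["som", "clustering"], ["folksonomy", "som"]]

page_vocab = build_vocabulary(pages)   # separate vocabulary for page terms
tag_vocab = build_vocabulary(posts)    # separate vocabulary for tags
page_vectors = [to_binary_vector(p, page_vocab) for p in pages]
post_vectors = [to_binary_vector(t, tag_vocab) for t in posts]
```

Pages and posts get separate vocabularies here because the slides train them on separate maps.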
SOM Training
  • The page vectors Pi and the post vectors Ti were trained separately with the self-organizing map (SOM) algorithm.
  • Two maps MP and MT were obtained after the training.
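
For readers unfamiliar with SOM training, here is a small pure-Python sketch of the standard algorithm, run once for pages and once for posts. The grid size, epoch count, learning rate, linear decay, and Gaussian neighborhood are all assumptions chosen for illustration; the parameters actually used in the talk are not reproduced in this transcript.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Index of the neuron whose weight vector is closest to `vector`."""
    def squared_distance(w):
        return sum((wi - vi) ** 2 for wi, vi in zip(w, vector))
    return min(range(len(weights)), key=lambda n: squared_distance(weights[n]))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; returns one weight vector per neuron."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighborhood radius both shrink over time.
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.5)
        for vector in rng.sample(vectors, len(vectors)):
            winner = best_matching_unit(weights, vector)
            win_r, win_c = divmod(winner, cols)
            for n, w in enumerate(weights):
                r, c = divmod(n, cols)
                grid_dist_sq = (r - win_r) ** 2 + (c - win_c) ** 2
                # Gaussian neighborhood: neurons near the winner move more.
                influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * influence * (vector[d] - w[d])
    return weights


# Hypothetical binary page and post vectors.
page_vectors = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
post_vectors = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
map_p = train_som(page_vectors, rows=2, cols=2)  # plays the role of MP
map_t = train_som(post_vectors, rows=2, cols=2)  # plays the role of MT
```

The two calls are independent, which mirrors the point that pages and posts are trained separately and only linked afterwards.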
Labeling
  • We labeled each Web page on MP by finding its most similar neuron. A page cluster map (PCM) was obtained after all pages had been labeled.
  • The same approach was applied to all posts on MT to obtain a tag cluster map (TCM).
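
The labeling step can be sketched as a nearest-neuron lookup followed by grouping. The slides only say "most similar neuron", so the squared Euclidean distance used here is an assumption, and the weights and vectors are hypothetical.

```python
from collections import defaultdict


def nearest_neuron(weights, vector):
    """Index of the neuron with the smallest squared Euclidean distance."""
    return min(
        range(len(weights)),
        key=lambda n: sum((w - v) ** 2 for w, v in zip(weights[n], vector)),
    )


def label_map(weights, vectors):
    """Group item indices by their most similar neuron (one cluster each)."""
    clusters = defaultdict(list)
    for index, vector in enumerate(vectors):
        clusters[nearest_neuron(weights, vector)].append(index)
    return dict(clusters)


# Hypothetical trained weights for a 2-neuron map, plus three page vectors.
weights_p = [[0.9, 0.8, 0.1], [0.1, 0.2, 0.9]]
page_vectors = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]

pcm = label_map(weights_p, page_vectors)
print(pcm)  # {0: [0, 2], 1: [1]}
```

Calling `label_map` with the post vectors and the weights of MT would produce the TCM in the same way.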

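[Figure: example PCM and TCM. One PCM neuron is labeled with pages P1, P5, and P65; one TCM neuron is labeled with posts T1 and T8.]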

Association Discovery
  • Finding associations between page clusters and post clusters.
  • We used a voting scheme to find the associations.
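
One plausible reading of the voting scheme, sketched below: every annotated (page, post) pair casts one vote for the pair formed by the page's cluster and the post's cluster, and each page cluster keeps the post cluster with the most votes. The function name, the identifiers, and the first-seen tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def vote_associations(page_cluster_of, post_cluster_of, annotations):
    """Associate each page cluster with its most-voted post cluster.

    page_cluster_of: page id -> page cluster (neuron) id on the PCM
    post_cluster_of: post id -> post cluster (neuron) id on the TCM
    annotations:     iterable of (page id, post id) pairs
    """
    votes = Counter()
    for page_id, post_id in annotations:
        # Each annotated pair adds +1 for (its page cluster, its post cluster).
        votes[(page_cluster_of[page_id], post_cluster_of[post_id])] += 1

    best = {}
    for (page_cluster, post_cluster), count in votes.items():
        # Strictly greater: on a tie, the pair seen first is kept.
        if page_cluster not in best or count > best[page_cluster][1]:
            best[page_cluster] = (post_cluster, count)
    return {pc: tc for pc, (tc, _) in best.items()}


# Hypothetical labeling results and page/post pairs.
page_cluster_of = {"P1": 0, "P5": 0, "P65": 0, "P7": 1}
post_cluster_of = {"T1": 3, "T5": 3, "T65": 2, "T7": 2}
annotations = [("P1", "T1"), ("P5", "T5"), ("P65", "T65"), ("P7", "T7")]

associations = vote_associations(page_cluster_of, post_cluster_of, annotations)
print(associations)  # {0: 3, 1: 2}
```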

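[Figure: voting between the PCM and the TCM. Pages Pi and Pj in page cluster Px are annotated by posts Ti and Tj in post cluster Ty, so each pair adds one vote (+1) to the association between Px and Ty.]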

Association Discovery
  • Similarity between a page cluster Px and a post cluster Ty:
    • I: index set operator
    • Ck,l = 1 if Pk is annotated by Tl, and 0 otherwise

  • Px is associated with the post cluster Ty that has maximum similarity.
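
The similarity formula itself is not reproduced in the transcript. One form consistent with the definitions of I and Ck,l above, and with the +1 voting scheme, would be the following; it is a reconstruction under that assumption, and the original may differ (for example, by a normalizing factor).

```latex
S(P_x, T_y) = \sum_{k \in I(P_x)} \sum_{l \in I(T_y)} C_{k,l},
\qquad
C_{k,l} =
\begin{cases}
1 & \text{if } P_k \text{ is annotated by } T_l,\\
0 & \text{otherwise,}
\end{cases}
\qquad
y^{*} = \arg\max_{y} S(P_x, T_y).
```

Here I(Px) and I(Ty) are read as the index sets of the pages in cluster Px and of the posts in cluster Ty, and y* names the post cluster associated with Px.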
Architecture of Tag Recommendation

[Figure: recommendation flow. Incoming Web page → Preprocessing → incoming page vector → Labeling (using the PCM) → labeled page cluster → Tag Recommendation (using the page/tag associations) → recommended tags.]

Tag Recommendation
  • Px: the incoming Web page
  • Label Px to its most similar page cluster on the PCM.
  • Let Tx be the tag cluster most related to that page cluster; all tags in Tx are recommended.
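
The recommendation step can be sketched as two lookups: label the incoming page vector to a page cluster, then return the tags of the associated tag cluster. The function `recommend_tags`, its arguments, and the sample data are hypothetical. The alphabetical ordering only makes the output deterministic and is not the ranking used in the experiments, which the slides do not describe.

```python
def recommend_tags(page_vector, pcm_weights, associations, tag_clusters):
    """Recommend every tag in the tag cluster tied to the page's cluster."""
    # Label the incoming page: pick the closest neuron on the PCM.
    page_cluster = min(
        range(len(pcm_weights)),
        key=lambda n: sum(
            (w - v) ** 2 for w, v in zip(pcm_weights[n], page_vector)
        ),
    )
    tag_cluster = associations.get(page_cluster)
    if tag_cluster is None:
        return []  # this page cluster has no associated tag cluster
    return sorted(tag_clusters.get(tag_cluster, ()))


# Hypothetical trained artifacts.
pcm_weights = [[0.9, 0.8, 0.1], [0.1, 0.2, 0.9]]
associations = {0: 3, 1: 2}  # page cluster -> tag cluster
tag_clusters = {2: {"folksonomy", "tagging"}, 3: {"som", "clustering"}}

print(recommend_tags([1, 1, 0], pcm_weights, associations, tag_clusters))
# ['clustering', 'som']
```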

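[Figure: the incoming page Px is labeled to a page cluster on the PCM, and the tags of the associated tag cluster Tx on the TCM are recommended.]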

Experimental Results
  • Dataset
    • ECML/PKDD Discovery Challenge 2008 (RSDC 2008) tag recommendation dataset
    • over 132K tags posted by 468 users
    • 16235 bookmarked items, either Web pages or BibTeX entries
    • contains some noisy data
      • items with very little content
      • items without tags
Experimental Results
  • Preprocessing
    • Discard tags that contain non-English characters
    • Remove numeric tags
    • Remove tags that are stop words such as 'for' and 'the'
    • Transform all tags to lowercase
    • Ignore extremely short tags
    • Ignore extremely long tags
    • Stem the remaining tags
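
The tag-cleaning steps listed above might look like the sketch below. The stop-word list, the length cut-offs, and the suffix-stripping `naive_stem` are stand-ins chosen for illustration: the slides give no thresholds and do not name a stemmer. Lowercasing is done before the stop-word check so that the check is case-insensitive.

```python
STOP_WORDS = {"for", "the", "a", "an", "of", "and", "to", "in"}  # sample list
MIN_LEN, MAX_LEN = 2, 20  # hypothetical cut-offs; the slides give none


def naive_stem(tag):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if tag.endswith(suffix) and len(tag) - len(suffix) >= 3:
            return tag[: -len(suffix)]
    return tag


def clean_tags(tags):
    """Filter and normalize raw tags, roughly in the order listed above."""
    cleaned = []
    for tag in tags:
        if not tag.isascii():   # contains non-English characters
            continue
        if tag.isdigit():       # purely numeric tag
            continue
        tag = tag.lower()
        if tag in STOP_WORDS:
            continue
        if not MIN_LEN <= len(tag) <= MAX_LEN:
            continue            # extremely short or extremely long
        cleaned.append(naive_stem(tag))
    return cleaned


print(clean_tags(["Clustering", "2008", "the", "x", "機械学習", "maps", "SOM"]))
# ['cluster', 'map', 'som']
```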
Experimental Results
  • Parameters for SOM training
Experimental Results
  • Summary of PCM and TCM
Experimental Results
  • For each page, we recommended a ranked set of 10 tags.
  • These recommended tags were then compared with the original tags.
  • We used the F1-measure to compare our results with those reported in RSDC 2008.
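
For reference, the F1-measure for a single page can be computed as below. This is the textbook definition (harmonic mean of precision and recall). How scores are averaged across pages in the RSDC 2008 evaluation is not covered in the slides and is left out here. The tag lists are hypothetical.

```python
def f1_score(recommended, original):
    """F1 between a recommended tag list and the original tag set."""
    recommended, original = set(recommended), set(original)
    if not recommended or not original:
        return 0.0
    hits = len(recommended & original)
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)  # share of recommendations that are correct
    recall = hits / len(original)        # share of original tags that were found
    return 2 * precision * recall / (precision + recall)


recommended = ["som", "clustering", "tagging", "web"]  # hypothetical output
original = {"som", "clustering", "folksonomy"}         # hypothetical ground truth
print(round(f1_score(recommended, original), 3))  # 0.571
```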
Experimental Results
  • Evaluation result
Conclusions
  • A novel scheme for tag recommendation based on text mining.
  • Relatedness between Web pages and tags was discovered from the clustering results of the self-organizing maps.
  • The scheme uses only the content of Web pages, not user behavior.