
Outline - PowerPoint PPT Presentation





PowerPoint Slideshow about ' Outline ' - kirestin-tillman


Outline
Outline

  • Motivation

    • Information overload in a scientific congress scenario

  • Conference Participant Advisor Service

    • Profile-driven paper recommending

    • User Profiles as Bayesian Text Classifiers

    • User Profiles learned from documents semantically indexed through a WSD procedure [*]

  • Empirical Evaluation

  • Conclusions and Future Work

    [*] Combining Learning and Word Sense Disambiguation for Intelligent User Profiling - IJCAI 2007


Motivation
Motivation

  • Information overload in the scientific congress scenario



Web personalization
Web Personalization

  • Personalized systems adapt their behavior to individual users by learning user profiles

    • Structured model of the user interests

    • Exploitable for providing personalized content and services

  • Personalization usually done automatically based on the user profile and possibly the profiles of other users with similar interests (collaborative approach)

  • How can personalization be used in the scientific congress scenario?


Web personalization in the scientific congress scenario
Web Personalization in the scientific congress scenario

  • Learn research interests of participants from papers they rated

  • Store research interests in personal profiles

    • Used to build personalized programs delivered to participants


Learning user profiles as a text categorization problem
Learning User Profiles as a Text Categorization problem

OUR STRATEGY

content-based recommendations by learning from TEXT and USER FEEDBACK on items


Keyword based profiles problems

Keyword-based profiles: problems

  • doc1: AI is a branch of computer science

  • doc2: the 2007 International Joint Conference on Artificial Intelligence will be held in India

  • doc3: apple launches a new product…

USER PROFILE

  • artificial 0.02

  • intelligence 0.01

  • apple 0.13

  • AI 0.15

Problems highlighted on successive slides: MULTI-WORD CONCEPTS, SYNONYMY, POLYSEMY
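
The three problems can be made concrete with a minimal sketch that scores the slide's documents against the slide's keyword profile. The scoring rule (sum of the weights of the matching keywords) and the whitespace tokenizer are assumptions made for illustration; the slides do not say how a keyword profile is matched.

```python
# Keyword weights copied from the USER PROFILE on the slide (keys lowercased).
profile = {"artificial": 0.02, "intelligence": 0.01, "apple": 0.13, "ai": 0.15}

docs = {
    "doc1": "AI is a branch of computer science",
    "doc2": "the 2007 International Joint Conference on Artificial Intelligence "
            "will be held in India",
    "doc3": "apple launches a new product…",
}

def keyword_score(text, profile):
    """Assumed scoring rule: sum the weights of profile keywords found in the text."""
    tokens = set(text.lower().split())
    return sum(weight for word, weight in profile.items() if word in tokens)

scores = {name: keyword_score(text, profile) for name, text in docs.items()}
for name, score in scores.items():
    print(name, round(score, 2))
```

Under these assumptions doc1 and doc2 are about the same concept but match disjoint keywords (synonymy), "Artificial Intelligence" is split into two weakly weighted terms (multi-word concepts), and doc3 scores well on "apple" although the keyword alone cannot tell which sense the user is interested in (polysemy).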


Item recommender itr
ITem Recommender (ITR)

  • Advanced NLP techniques used to represent documents

  • Naïve Bayes text classification to assign a score (level of interest) to items according to the user preferences

  • Result: semantic user profile - as a binary text classifier (user-likes and user-dislikes) - containing the probabilistic model of user preferences



Word sense disambiguation wsd
Word Sense Disambiguation (WSD)

  • Process of deciding which sense of a word is used in a specific context

  • WordNet as sense inventory

    • nouns, verbs, adverbs and adjectives organized into SYNonym SETs (synsets), each one representing an underlying lexical concept

    • change of text representation from vectors (bag) of words (BOW) into vectors (bag) of synsets (BOS)
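
The change from BOW to BOS can be sketched as follows. The sense lexicon is a hypothetical stand-in for WordNet plus a WSD step: every surface form has exactly one sense here, and the synset identifiers are invented placeholders, not WordNet offsets.

```python
from collections import Counter

# Hypothetical sense lexicon: surface form -> synset identifier (invented ids).
SENSE_LEXICON = {
    "artificial intelligence": "syn:artificial_intelligence",
    "ai": "syn:artificial_intelligence",
    "branch": "syn:branch_division",
    "computer science": "syn:computer_science",
}

def bag_of_synsets(tokens, lexicon):
    """Greedy left-to-right mapping; bigrams are tried before single words.
    Tokens that are not in the lexicon are skipped."""
    bag = Counter()
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and bigram in lexicon:
            bag[lexicon[bigram]] += 1
            i += 2
        elif tokens[i] in lexicon:
            bag[lexicon[tokens[i]]] += 1
            i += 1
        else:
            i += 1
    return bag

doc1 = "ai is a branch of computer science".split()
doc2 = "conference on artificial intelligence".split()
bos1 = bag_of_synsets(doc1, SENSE_LEXICON)
bos2 = bag_of_synsets(doc2, SENSE_LEXICON)
shared = set(bos1) & set(bos2)
print(shared)
```

In this toy setting the two documents share a feature in the BOS representation even though they share no keyword, which is the effect the BOS representation is meant to achieve.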


Jigsaw wsd algorithm
JIGSAW WSD algorithm

  • Three different strategies to disambiguate nouns, verbs, adjectives and adverbs

    • Effectiveness of WSD strongly influenced by the POS tag of the target word

    • Input: d = {w1, w2, …, wh} document

    • Output: X = {s1, s2, …, sk} (k ≤ h)

      • Each si obtained by disambiguating wi based on the context of each word

      • Some words not recognized by WordNet

      • Groups of words recognized as a single concept


Jigsaw nouns the idea

JIGSAWnouns: The idea

  • Adaptation of the Resnik algorithm

  • Semantic similarity between synsets inversely proportional to their distance in the WordNet IS-A hierarchy

  • Path length similarity between synsets used to assign scores to the candidate synsets of a polysemous word


Synset semantic similarity

Synset Semantic Similarity

[Figure: path in the WordNet IS-A hierarchy linking Cat (feline mammal) to Mouse (rodent) through Feline/felid, Carnivore, Placental mammal and Rodent; path steps numbered 1 to 5]

SINSIM(cat, mouse) = -log(5/32) = 0.806 (Leacock-Chodorow similarity)
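
The value on the slide can be reproduced with the usual form of the Leacock-Chodorow measure, -log(N / 2D). The sketch below assumes a base-10 logarithm and a taxonomy depth D = 16; both are inferred from the figures 5/32 and 0.806 rather than stated on the slide.

```python
import math

def leacock_chodorow(path_length, max_depth):
    """-log10(N / (2 * D)): N = path length between two synsets, D = taxonomy depth."""
    return -math.log10(path_length / (2 * max_depth))

# N = 5 (path cat -> mouse on the slide), D = 16 (assumed, since 2 * 16 = 32).
sim_cat_mouse = leacock_chodorow(path_length=5, max_depth=16)
print(round(sim_cat_mouse, 3))
```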


Jigsaw nouns

JIGSAWnouns

“The white cat is hunting the mouse”

Words considered: white, cat, hunt, mouse

w = cat

C = {mouse}

Wcat = {02037721, 00847815}

  • 02037721: feline mammal…

  • 00847815: computerized axial tomography…

T = {02244530, 03651364}

  • 02244530: any of numerous small rodents…

  • 03651364: a hand-operated electronic device …


Jigsaw nouns1

JIGSAWnouns

“The white cat is hunting the mouse”

w = cat

C = {mouse}

Wcat = {02037721, 00847815}

T = {02244530, 03651364}

[Figure: similarity scores on the links between the candidate synsets of "cat" (02037721: feline mammal…; 00847815: computerized axial tomography…) and the synsets of "mouse" (02244530: any of numerous small rodents…; 03651364: a hand-operated electronic device …). Values shown: 0.806, 0.806, 0.0, 0.806, 0.0, 0.107]
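
A simplified sketch of the selection step: each candidate synset of the target word is scored by its best similarity with the synsets of the context words, and the top-scoring candidate wins. This is a reduced illustration, not the full JIGSAWnouns procedure. Only the 0.806 score between "feline mammal" and "rodent" is certain from the slides; which pairs the other values belong to is an assumption.

```python
# Candidate senses of the target word and senses of the context word (from the slide).
W_CAT = ["02037721", "00847815"]   # feline mammal, computerized axial tomography
T = ["02244530", "03651364"]       # rodent, hand-operated electronic device

# Pairwise similarities. Only the first entry is certain from the slides;
# the placement of the remaining values is assumed for illustration.
SIM = {
    ("02037721", "02244530"): 0.806,
    ("02037721", "03651364"): 0.0,
    ("00847815", "02244530"): 0.0,
    ("00847815", "03651364"): 0.107,
}

def best_sense(candidates, context_synsets, sim):
    """Score each candidate by its maximum similarity with any context synset."""
    support = {c: max(sim[(c, t)] for t in context_synsets) for c in candidates}
    return max(support, key=support.get), support

chosen, support = best_sense(W_CAT, T, SIM)
print(chosen, support)
```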


Jigsaw verbs synset description

JIGSAWverbs: synset description

  • Description of synset si = gloss + example phrases in WordNet for si

[Figure callouts: Glosses, Example phrases]


Jigsaw verbs the idea
JIGSAWverbs: The idea

  • It tries to establish a relation between verbs and nouns

    • Not directly linked in WordNet

  • Verb w disambiguated using:

    • nouns in the context of w

    • nouns in the description of each candidate synset for w


Jigsaw verbs example 1 4
JIGSAWverbs: Example (1/4)

w=play N={basketball, soccer}

I play basketball and soccer

  • (70) play -- (participate in games or sport; "We played hockey all afternoon"; "play cards"; "Pele played for the Brazilian teams in many important matches")

  • (29) play -- (play on an instrument; "The band played all night long")

nouns(play,1): game, sport, hockey, afternoon, card, team, match

nouns(play,2): instrument, band, night

nouns(play,35): …


Jigsaw verbs example 2 4
JIGSAWverbs: Example (2/4)

w=play N={basketball, soccer}

nouns(play,1): game, sport, hockey, afternoon, card, team, match

[Figure: the senses of "basketball" (basketball1 … basketballh) compared with the senses of each noun in nouns(play,1), e.g. game1 … gamek, sport1 … sportk]

MAXbasketball = MAXi SinSim(wi, basketball), wi ∈ nouns(play,1)


Jigsaw verbs example 3 4
JIGSAWverbs: Example (3/4)

w=play N={basketball, soccer}

nouns(play,1): game, sport, hockey, afternoon, card, team, match

[Figure: the senses of "soccer" (soccer1 … soccerh) compared with the senses of each noun in nouns(play,1), e.g. game1 … gamek, sport1 … sportk]

MAXsoccer = MAXi SinSim(wi, soccer), wi ∈ nouns(play,1)


Jigsaw verbs example 4 4
JIGSAWverbs: Example (4/4)

  • For each candidate synset i of “play”, the values MAXbasketball, MAXsoccer, ... are computed on nouns(play,i)

  • Φ(play,i) = weighted average of the MAX values, taking into account the position of each word in the context with respect to the verb

  • Synset assigned to “play” = argmax over i of Φ(play,i)
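
The last step can be sketched as a weighted average followed by an argmax. All numbers below are hypothetical: the slides give neither the MAX values nor the position-based weights, so the weights simply favor the noun closer to the verb.

```python
def phi(max_values, weights):
    """Weighted average of the per-noun MAX similarities for one candidate synset."""
    total = sum(weights[noun] for noun in max_values)
    return sum(weights[noun] * max_values[noun] for noun in max_values) / total

# Hypothetical position weights: "basketball" is closer to "play" than "soccer".
weights = {"basketball": 1.0, "soccer": 0.5}

# Hypothetical MAX values for two candidate synsets of "play".
max_by_sense = {
    1: {"basketball": 0.9, "soccer": 0.8},  # participate in games or sport
    2: {"basketball": 0.2, "soccer": 0.1},  # play on an instrument
}

scores = {sense: phi(values, weights) for sense, values in max_by_sense.items()}
chosen = max(scores, key=scores.get)  # argmax over candidate synsets
print(chosen, scores)
```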


Jigsaw others
JIGSAWothers

  • Based on the Lesk algorithm

  • Similarity between the glosses of each candidate sense of the target word and the glosses of words in the context


Jigsaw others example 1 5
JIGSAWothers:Example (1/5)

  • 1. {01703749} aged, elderly, older, senior -- (advanced in years; "aged members of the society"; "elderly residents could remember the construction of the first skyscraper"; "senior citizen")

  • 2. {01546830} aged, ripened - (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses")

w=aged N={bottle, wine}

I bought a bottle of aged wine

Candidate synsets for the target word


Jigsaw others example 2 5
JIGSAWothers:Example (2/5)

  • 1. {01703749} aged, elderly, older, senior --(advanced in years; "aged members of the society"; "elderly residents could remember the construction of the first skyscraper"; "senior citizen")

  • 2. {01546830} aged, ripened -(of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses")

w=aged N={bottle, wine}

I bought a bottle of aged wine

Keep glosses of candidate synsets


Jigsaw others example 2 51
JIGSAWothers: Example (2/5)

w=aged N={bottle, wine}

I bought a bottle of aged wine

Keep glosses of each word in the context

  • 1. {02848798} bottle -- (a glass or plastic vessel used for storing drinks or other liquids; typically cylindrical without handles and with a narrow neck that can be plugged or capped)

  • 2. {13584548} bottle, bottleful -- (the quantity contained in a bottle)

  • 1. {07784932} wine, vino -- (fermented juice (of grapes especially))

  • 2. {04907195} wine, wine-colored -- (a red as dark as red wine)


Jigsaw others example 3 5
JIGSAWothers:Example (3/5)

  • 1. {02848798} bottle --(a glass or plastic vessel used for storing drinks or other liquids; typically cylindrical without handles and with a narrow neck that can be plugged or capped)

  • 2. {13584548} bottle, bottleful -- (the quantity contained in a bottle)

w=aged N={bottle, wine}

I bought a bottle of aged wine

+

  • 1. {07784932} wine, vino -- (fermented juice (of grapes especially))

  • 2. {04907195} wine, wine-colored -- (a red as dark as red wine)

=

Gloss of the whole context

  • a glass or plastic vessel used for storing drinks or other liquids typically cylindrical without handles and with a narrow neck that can be plugged or capped; the quantity contained in a bottle; fermented juice (of grapes especially); a red as dark as red wine


Jigsaw others example 4 5

JIGSAWothers: Example (4/5)

w=aged N={bottle, wine}

I bought a bottle of aged wine

Overlap between Glosses

  • 1. {01703749} aged, elderly, older, senior -- (advanced in years; "aged members of the society"; "elderly residents could remember the construction of the first skyscraper"; "senior citizen"): No overlap

  • 2. {01546830} aged, ripened -- (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses"): Overlap

  • Gloss of the whole context: a glass or plastic vessel used for storing drinks or other liquids typically cylindrical without handles and with a narrow neck that can be plugged or capped; the quantity contained in a bottle; fermented juice (of grapes especially); a red as dark as red wine


Jigsaw others example 5 5

JIGSAWothers: Example (5/5)

w=aged N={bottle, wine}

I bought a bottle of aged wine

  • 1. {01703749} aged, elderly, older, senior -- (advanced in years; "aged members of the society"; "elderly residents could remember the construction of the first skyscraper"; "senior citizen")

  • 2. {01546830} aged, ripened -- (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses")

  • Gloss of the whole context: a glass or plastic vessel used for storing drinks or other liquids typically cylindrical without handles and with a narrow neck that can be plugged or capped; the quantity contained in a bottle; fermented juice (of grapes especially); a red as dark as red wine

Selected synset: 01546830
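
The gloss-overlap idea of the example can be sketched in a few lines. This is a simplified Lesk-style overlap, not the exact JIGSAWothers scoring: the stopword list and the crude plural stripping (so that "wines" matches "wine") are assumptions made for illustration. The glosses are the ones shown on the slides.

```python
import re

STOPWORDS = {"a", "an", "the", "of", "in", "or", "and", "for",
             "with", "that", "can", "be", "as"}

def normalize(text):
    """Lowercase, keep alphabetic tokens, strip a final plural 's', drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = {t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens}
    return stems - STOPWORDS

candidates = {
    "01703749": 'advanced in years; "aged members of the society"; "elderly residents '
                'could remember the construction of the first skyscraper"; "senior citizen"',
    "01546830": 'of wines, fruit, cheeses; having reached a desired or final condition; '
                '"mature well-aged cheeses"',
}
context_glosses = [
    "a glass or plastic vessel used for storing drinks or other liquids; typically "
    "cylindrical without handles and with a narrow neck that can be plugged or capped",
    "the quantity contained in a bottle",
    "fermented juice (of grapes especially)",
    "a red as dark as red wine",
]

context = normalize(" ".join(context_glosses))
overlap = {sid: normalize(gloss) & context for sid, gloss in candidates.items()}
best = max(overlap, key=lambda sid: len(overlap[sid]))
print(best, overlap)
```

With these assumptions the first sense shares no content word with the context gloss, while the second shares "wine", so synset 01546830 is selected, in line with the slide.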


Paper recommending
Paper Recommending

content-based recommendations by learning from TEXT and USER RATINGS (1-5) on papers

  • Instance (paper): Title, Authors, Abstract

  • Keyword-based representation (BOW): Tokenization + Stopword removal + Stemming

  • Sense-based representation (BOS): Tokenization + Stopword removal + POS tagging + disambiguation


An example of bos generated profile
An example of BOS-generated Profile


Conference participant advisor login
Conference Participant Advisor: Login

Conference Participant

Advisor service


Conference participant advisor selecting papers to train the system
Conference Participant Advisor: Selecting Papers to train the system



Conference participant advisor rating retrieved papers
Conference Participant Advisor: Rating Retrieved Papers


Conference participant advisor getting the personalized program
Conference Participant Advisor: Getting the Personalized Program


Personalized program delivered by mail

Personalized Program delivered by mail

  • 1 - personalized conference program

  • 2 - details about recommended papers


Conference participant advisor personalized program paper details
Conference Participant Advisor: Personalized Program + Paper details


Experimental evaluation
Experimental Evaluation

  • Experiments: BOW-generated profiles vs. BOS-generated profiles

  • ISWC dataset

    • 100 papers accepted at ISWC 02-03

    • 288 ratings collected by 11 users

  • 5-fold stratified cross-validation

  • Precision, Recall, F-measure, NDPM

    • Paper relevant if rating >3

    • Probability of class “likes” >0.5

  • Wilcoxon signed rank test

    • Classification for each user is a trial

    • Low number of independent trials

    • Significance level p < 0.05
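
The two thresholds above (a paper is relevant if its rating is > 3, and it is recommended if the probability of class “likes” is > 0.5) translate into precision, recall and F-measure as sketched below. The ratings and probabilities are made-up sample values, not data from the ISWC experiment.

```python
def evaluate(ratings, like_probs, rating_threshold=3, prob_threshold=0.5):
    """Precision, recall and F1 for one user, using the thresholds from the slide."""
    relevant = [r > rating_threshold for r in ratings]
    predicted = [p > prob_threshold for p in like_probs]
    tp = sum(rel and pred for rel, pred in zip(relevant, predicted))
    fp = sum(pred and not rel for rel, pred in zip(relevant, predicted))
    fn = sum(rel and not pred for rel, pred in zip(relevant, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical user: six rated papers and the classifier's P(likes) for each.
ratings = [5, 4, 2, 1, 4, 3]
like_probs = [0.9, 0.4, 0.7, 0.2, 0.8, 0.1]
precision, recall, f1 = evaluate(ratings, like_probs)
print(precision, recall, f1)
```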


Results of semantic profiles evaluation
Results of Semantic Profiles Evaluation

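[Results table: only the values +2%, =, +2%, +1% survive in the transcript; row and column labels are not recoverable]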


Conclusions future works
Conclusions & Future Work

  • Conference Participant Advisor

    • Intelligent service relying on concept-based profiles

    • WSD based on linguistic ontology

  • Future work: integration of

    • domain-specific ontologies in the process of semantic representation and indexing of documents

    • social networks of conference participants as additional source of information


Service details
Service details

  • Service deployed in VIKEF project at:

    http://193.204.187.223:8080/iswc_rebuild/



Bag of synsets
Bag of Synsets

  • Reduction of features

    • Recognition of bigrams

    • Synonyms represented by the same synsets

Bag of Words

Bag of Synsets


Classification phase
Classification Phase

  • Each document is represented as a vector of BOS, one for each slot

  • Each slot is independent from the others

S = {s_1, s_2, …, s_|S|} is the set of slots

b_im is the BOS in slot s_m of instance d_i

t_k is the kth token (occurring n_kim times in BOS b_im)
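
The formula these symbols belong to did not survive in the transcript. Assuming a standard multinomial naive Bayes model in which slots are independent, and writing c_j for a class (a symbol introduced here), the posterior would take roughly the following form. This is a reconstruction under that assumption, not a quotation of the slide.

```latex
P(c_j \mid d_i) \;=\; \frac{P(c_j)}{P(d_i)}
  \prod_{m=1}^{|S|} \prod_{k=1}^{|b_{im}|} P(t_k \mid c_j, s_m)^{\,n_{kim}}
```

Here the outer product runs over the slots of the instance and the inner product over the tokens of the BOS in each slot, with each token's probability raised to its number of occurrences.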


Training phase
Training Phase

C = {c+, c-}

  • c+ likes (ratings 4-5)

  • c- dislikes (ratings 1-2) (3 is neutral)

User ratings ri → Weighted Instances
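
The slide does not say how a rating becomes a weighted instance. One simple possibility, shown here purely as an assumption, is a linear mapping of the 1-5 rating onto a weight for the "likes" class, with the complement going to "dislikes", so that a rating of 3 contributes equally to both classes.

```python
MAX_RATING = 5  # ratings on the slides range from 1 to 5

def rating_to_weights(rating, max_rating=MAX_RATING):
    """Assumed linear mapping: returns (weight for c+, weight for c-)."""
    w_pos = (rating - 1) / (max_rating - 1)
    return w_pos, 1.0 - w_pos

weights = {r: rating_to_weights(r) for r in range(1, MAX_RATING + 1)}
for rating, (w_pos, w_neg) in weights.items():
    print(rating, w_pos, w_neg)
```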


Evaluation
Evaluation

  • JIGSAW evaluated on SENSEVAL-3 English Sample task: 37.6% Precision

  • JIGSAW evaluated on SENSEVAL-3 English All Words task: 52% Precision

SENSEVAL-3 English Sample task

