How Could We All Get Along on the Web 2.0? The Power of Structured Data on the Web

Sihem Amer-Yahia, Yahoo! Research

Presentation Transcript
Outline
  • Web search and web 2.0 search
  • Why should we all get along?
  • How could we all get along?
  • Related work
  • Conclusion
Web search
  • Access to “heterogeneous”, distributed information
    • Heterogeneous in creation
    • Heterogeneous in motives
  • Keyword search very effective in connecting people to information

[Figure: search over web pages]

Web search vs web 2.0 search?

[Diagram contrasting web search and web 2.0 search. Labels: content consumers, anonymous, subscribers, feeds, content aggregators, content creators]

Web 2.0: a generation of internet-based services that
  • let people form online communities
  • in order to collaborate
  • and share information

in previously unavailable ways

Online communities
  • Subscribers join communities where they
    • exchange content: emails, comments, tags
    • rate content from other subscribers
    • exhibit common behavior
  • About 500M unique Y! visitors per month, about 200M subscribers (login visitors) to more than 130 Y! services
Web 2.0 search
  • Connecting people to people
  • Web 2.0 services: Flickr, Y!Answers, YouTube, Y!Groups

Web 2.0 search examples
  • Mary is a professional photographer and is looking for aerial photos of the Hoggar desert
  • She is also an amateur Jazz dancer and wants to ask about dance schools w/flexible schedules in SF
  • She is also looking for the latest video on bird migration in Central Park, NY
  • She has heart problems but loves biking and is interested in finding email discussions on biking trails in northern California
Outline
  • Web search and web 2.0 search
  • Why should we all get along?
  • How could we all get along?
  • Related work
  • Conclusion
Improving users’ experience
  • Keyword search should be maintained: simple and intuitive
  • Keyword queries usually short
    • only express a small fraction of the user's true intent
  • Users' interactions within community-based systems can be used to infer a lot more about intent and return better answers
Why should we all get along?
  • Contributed content is structured
    • This is what DB community knows how to do best
  • Relevance to query keywords is key
    • This is what IR community knows how to do best
Searching online communities

[Diagram: data table; tags, ratings, reviews table; community relationship table]

Searching online communities
  • Search for most relevant data on some topic
    • Querying data: selection over data table
    • Querying annotations: selection over annotation table + join w/data table
    • Personalizing answers: join w/subscribers table
  • Relevance: use data relevance + annotation table
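
The three query shapes listed above can be sketched in SQL. This is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made for the example and do not describe the schema of any actual service.

```python
import sqlite3

# Hypothetical schema: a data table, an annotation table, a subscribers table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (item_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE annotations (item_id INTEGER, sub_id TEXT, tag TEXT, rating INTEGER);
    CREATE TABLE subscribers (sub_id TEXT PRIMARY KEY, interest TEXT);
    INSERT INTO data VALUES (1, 'biking trails in Marin'), (2, 'jazz dance schools');
    INSERT INTO annotations VALUES
        (1, 's1', 'biking', 5), (1, 's2', 'trails', 4), (2, 's3', 'dance', 3);
    INSERT INTO subscribers VALUES ('s1', 'biking'), ('s2', 'hiking'), ('s3', 'dance');
""")


def query_data(keyword):
    """Querying data: a selection over the data table."""
    rows = conn.execute(
        "SELECT item_id FROM data WHERE title LIKE ? ORDER BY item_id",
        (f"%{keyword}%",),
    )
    return [r[0] for r in rows]


def query_annotations(tag):
    """Querying annotations: selection over annotations, joined with data."""
    rows = conn.execute(
        """SELECT DISTINCT d.item_id
           FROM annotations a JOIN data d ON d.item_id = a.item_id
           WHERE a.tag = ? ORDER BY d.item_id""",
        (tag,),
    )
    return [r[0] for r in rows]


def query_personalized(tag, interest):
    """Personalizing answers: additionally join with the subscribers table."""
    rows = conn.execute(
        """SELECT DISTINCT d.item_id
           FROM annotations a
           JOIN data d ON d.item_id = a.item_id
           JOIN subscribers s ON s.sub_id = a.sub_id
           WHERE a.tag = ? AND s.interest = ? ORDER BY d.item_id""",
        (tag, interest),
    )
    return [r[0] for r in rows]
```

In this toy data, restricting the tag "trails" to annotators whose interest is "biking" returns nothing, because the only subscriber who used that tag lists a different interest; that is the kind of narrowing the personalizing join provides.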
Why should we all get along?
  • Query interpretation depends on subscriber’s interest at the time of querying
  • Data annotations are dynamic
    • Precompute all (sub,sub,trust) for each topic?
  • Need for dynamic query generation
DB and IR
  • Shared interactions help focus search
    • User-input, community-input, extraction
    • Personalizing answers with community information
  • Ranking as a combination of
    • Relevance
    • Relationship strengths between people in the same community
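
One simple way to picture ranking as a combination of relevance and relationship strength is a weighted sum. The slides do not say how the two are combined, so the linear blend, the weight `alpha`, and the sample scores below are all assumptions for illustration.

```python
def combined_score(relevance, relationship, alpha=0.5):
    """Blend content relevance with relationship strength (both assumed in [0, 1]).

    alpha = 1.0 ranks by relevance only; alpha = 0.0 by relationship only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * relevance + (1.0 - alpha) * relationship


def rank(candidates, alpha=0.5):
    """Sort (item, relevance, relationship) tuples by combined score, best first."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c[1], c[2], alpha),
        reverse=True,
    )


# Hypothetical postings: (id, relevance to the query, relationship to the asker)
postings = [("p1", 0.9, 0.1), ("p2", 0.6, 0.8), ("p3", 0.3, 0.2)]
ranked = [item for item, _, _ in rank(postings, alpha=0.5)]
relevance_only = [item for item, _, _ in rank(postings, alpha=1.0)]
```

With equal weights, the posting from a closely related subscriber (p2) moves ahead of the most relevant one (p1); with `alpha=1.0` the order falls back to pure relevance.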
Outline
  • Web search and web 2.0 search
  • Why should we all get along?
  • How could we all get along?
    • Applications
    • Technical challenges
  • Related work
  • Conclusion
Applications
  • Flickr enables sharing and tagging photos
  • Y! Answers enables asking and answering questions in natural language
  • YouTube enables sharing videos, rating videos, commenting on videos and subscribing to new videos from favorite users
  • Y! Groups enables creating groups, joining existing groups, posting in a group
Flickr
  • Acquired by Y! in 2005
  • Tag search
  • Photos grouped into categories.
  • Set privacy levels on each photo
The new inputs to Flickr search
  • Users tag and rate photos
  • Combine tag-based search with community knowledge
  • Combine photo rating with relationship strength
  • Users tagging same photos with similar tags form a community of interest
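
A rough sketch of how a community of interest could be derived from tagging behavior: compute a pairwise strength between users from the overlap of their (photo, tag) pairs. Jaccard similarity, the exact-match treatment of "similar" tags, and the sample data are assumptions chosen for illustration, not a method stated in the slides.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relationship_triples(taggings, threshold=0.0):
    """Return (si, sj, cij) triples for user pairs whose overlap exceeds threshold.

    taggings maps a user id to the set of (photo, tag) pairs that user created.
    """
    triples = []
    for si, sj in combinations(sorted(taggings), 2):
        cij = jaccard(taggings[si], taggings[sj])
        if cij > threshold:
            triples.append((si, sj, cij))
    return triples


# Hypothetical tagging activity
taggings = {
    "s1": {("photo1", "desert"), ("photo2", "aerial")},
    "s2": {("photo1", "desert"), ("photo3", "birds")},
    "s3": {("photo4", "jazz")},
}
triples = relationship_triples(taggings)
```

Here only s1 and s2 share a tagged photo, so they form the single relationship triple; s3 stays outside that community.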

Y! Answers
  • Launched in second half of 2005
  • Incentive system based on points and voting for best answers
  • Questions grouped by category
  • Some statistics:
    • over 60 million users
    • over 120 million answers, available in 18 countries and in 6 languages
The new inputs to Y!Answers search
  • Users provide questions/answers
  • Combine community information with answer rating
  • Voting information reflects communities of interest

YouTube
  • Founded in February 2005
  • Tag search
  • Videos grouped by category
  • Some statistics:
    • 100 million views/day
    • 65,000 new videos/day
The new inputs to YouTube search
  • Users provide videos, tags, ratings, comments
  • Combine community information with video rating
  • Similar tags on same videos imply communities of interest

Yahoo! Groups
  • Yahoo! acquired eGroups in 2000
  • Group moderators
  • Groups belong to categories
  • Public and private groups
  • Some statistics:
    • over 7M groups
    • over 190M subscribers
    • over 100K new subscribers/day
    • over 12M emails/day
Alternative query interpretations
  • Return all group postings relevant to a query.
  • Return only postings by subscribers sharing the same interests: women with heart disease interested in steep slopes
The new inputs to Group search
  • Users participate in many groups
  • Combine community information with postings relevance
  • Group membership and postings imply communities of interest

Outline
  • Web search and web 2.0 search
  • Why should we all get along?
  • How could we all get along?
    • Applications
    • Technical challenges
  • Related work
  • Conclusion
So, how can we all get along?
  • Augment keyword query with conditions on structure to focus and personalize search (DB)
    • Flickr: tags
    • Answers: points
    • YouTube: reviews and ratings
    • Groups: emails
  • Combine it with relevance (IR)
Search architecture

[Diagram: the subscriber's search terms go through query tightening (find relevant community of interest) to produce a structured query, followed by query evaluation and ranking (content relevance + relationship)]

Example
  • Search terms: “biking trails northern california”
  • Structured query after query tightening: message contains “…” and from = “s1” or “s2”
  • Message structure: From, To, Date, Subject, Content
  • Relationships among subscribers S1 to S7: ( si, sj, cij )
  • Many such relationships depending on subscriber’s interests
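
The tightening step in this example can be sketched as wrapping a keyword condition with an added condition on the sender. The `Message` fields, the substring-based keyword match, and the sample messages are assumptions for illustration; only the shape of the tightened query (keywords plus a restriction to s1 or s2) comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    subject: str
    content: str


def keyword_match(message, terms):
    """True when every search term occurs in the subject or content."""
    text = f"{message.subject} {message.content}".lower()
    return all(term.lower() in text for term in terms)


def tighten(terms, community):
    """Build a predicate: keyword condition AND sender in the community of interest."""
    def predicate(message):
        return keyword_match(message, terms) and message.sender in community
    return predicate


# Hypothetical group postings
messages = [
    Message("s1", "Biking trails", "Great trails in northern California"),
    Message("s5", "Biking trails", "Steep trails in northern California"),
    Message("s2", "Jazz", "Dance schools in SF"),
]
terms = ["biking", "trails", "northern", "california"]

loose = [m.sender for m in messages if keyword_match(m, terms)]
tight = tighten(terms, community={"s1", "s2"})
focused = [m.sender for m in messages if tight(m)]
```

The keyword-only query returns postings from s1 and s5, while the tightened query keeps only the posting from a member of the subscriber's community.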

Can we really all get along?
  • IR may think that user weights are enough to target communities of interest and personalize queries
  • DB thinks expressiveness of query languages cannot all be captured by ranking functions
Query rewriting
  • Content-only
  • Content in context
  • Loose interpretation of context

Query relaxation
  • Primitive operations for dropping query predicates
  • Answers to relaxed query contain answers to exact one
  • Score of a relaxed answer is no higher than the score of an exact one
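
The containment and scoring properties of relaxation can be illustrated with a small sketch in which a query is a named set of predicates and relaxing means dropping one. The predicate names, the sample postings, and the scoring rule (fraction of the original predicates satisfied) are assumptions for illustration.

```python
def evaluate(items, predicates):
    """Return the items that satisfy every predicate in the query."""
    return [x for x in items if all(p(x) for p in predicates.values())]


def relax(predicates, name):
    """Primitive relaxation: drop one named predicate from the query."""
    return {k: p for k, p in predicates.items() if k != name}


def score(item, predicates):
    """Assumed scoring: fraction of the original predicates the item satisfies."""
    return sum(p(item) for p in predicates.values()) / len(predicates)


# Hypothetical postings and query
postings = [
    {"id": 1, "text": "biking trails", "sender": "s1"},
    {"id": 2, "text": "biking gear", "sender": "s5"},
    {"id": 3, "text": "jazz", "sender": "s2"},
]
query = {
    "keyword": lambda m: "biking" in m["text"],
    "sender": lambda m: m["sender"] in {"s1", "s2"},
}

exact_ids = [m["id"] for m in evaluate(postings, query)]
relaxed_ids = [m["id"] for m in evaluate(postings, relax(query, "sender"))]
scores = {m["id"]: score(m, query) for m in postings}
```

Dropping the sender predicate admits posting 2 in addition to posting 1, and under this scoring the newly admitted answer scores below the exact one.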
Query tightening
  • Primitive operations for adding query predicates
  • Tighter answers are found but looser answers should be maintained
  • Scores of tighter answers are no lower than the scores of other answers
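
One way to keep looser answers while making sure tighter answers never score lower is to add a fixed offset to every answer that satisfies the added predicate. The sketch below assumes base scores in [0, 1] and an offset of 1.0, so any tighter answer lands in [1, 2]; this scheme and the sample data are assumptions, not something the slides prescribe.

```python
def tightened_score(base, is_tighter):
    """Lift answers that satisfy the added predicate above all looser answers.

    Assumes base is in [0, 1]; tighter answers then score in [1, 2].
    """
    if not 0.0 <= base <= 1.0:
        raise ValueError("base score must be in [0, 1]")
    return base + (1.0 if is_tighter else 0.0)


def rank_with_tightening(answers, added_predicate):
    """Score every (item, base) answer; none are dropped, tighter ones rank first."""
    scored = [
        (item, tightened_score(base, added_predicate(item)))
        for item, base in answers
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical answers with base relevance scores
answers = [("a", 0.9), ("b", 0.2), ("c", 0.5)]
in_community = {"b"}
ranked = rank_with_tightening(answers, lambda item: item in in_community)
order = [item for item, _ in ranked]
```

All three answers survive, but the one satisfying the added predicate is ranked first even though its base relevance is the lowest.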
More technical challenges
  • Query tightening primitives to focus search
  • Subscriber has a different profile/community of interest
  • Topk processing needs to enforce user profiles
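
To make the last point concrete, here is a naive top-k sketch in which the same postings yield different top answers for different subscriber profiles. It scores every posting and uses `heapq.nlargest`, so it ignores the efficiency questions that real top-k processing has to address; the profile representation and the additive score are assumptions.

```python
import heapq


def top_k(postings, profile, k):
    """Return the ids of the k best postings for one subscriber.

    postings: iterable of (posting_id, sender, relevance)
    profile:  dict mapping sender -> relationship strength for this subscriber
    """
    def score(posting):
        _, sender, relevance = posting
        return relevance + profile.get(sender, 0.0)

    return [p[0] for p in heapq.nlargest(k, postings, key=score)]


# Hypothetical postings and two subscriber profiles
postings = [("p1", "s1", 0.6), ("p2", "s5", 0.7), ("p3", "s2", 0.4)]
mary_profile = {"s1": 0.5, "s2": 0.5}
other_profile = {"s5": 0.9}

mary_top2 = top_k(postings, mary_profile, k=2)
other_top2 = top_k(postings, other_profile, k=2)
```

The two subscribers issue the same query over the same postings but get different top-2 lists, which is why the profile has to be part of top-k evaluation rather than a post-filter.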
Outline
  • Web search and web 2.0 search
  • Why should we all get along?
  • How could we all get along?
    • Applications
    • Technical challenges
  • Related work
  • Conclusion
Related Work
  • Language models: Ask Bruce Croft
  • Web search personalization
    • Search behavior
    • HARD track at TREC
  • Building relationship graphs:
    • Collaborative filtering
    • Clustering
    • Unsupervised learning
Tempting conclusion
  • A little information gathered on users could greatly improve new-generation search
  • IR and DB views both needed
More technical challenges
  • Subscriber belongs to different communities of interest
  • Should subscriber turn off personalization?
  • How is efficiency affected? (revisiting topk processing)
  • Back from community search to web search?
Beyond search in online communities
  • Are online communities a way to build more accurate user profiles or more?
    • display relevant groups when user is asking a question on Y! Answers: mashups?
Danger of online communities

Are we discouraging diversity?

Questions?

sihem@yahoo-inc.com

http://research.yahoo.com/~sihem

Thank you.