I Tube, You Tube, Everybody Tubes…

Pablo Rodriguez
Telefonica Research, Barcelona

Content Explosion

[Figure: number of TV channels over time. Broadcast: 3; analog cable: 40; digital cable: 100; Internet: infinite. Time axis: 1950, 1980, 1995, today.]

How to search content?

Aggregation and Recommendation

Infinite Choice = Overwhelming Confusion

Filters are required to connect users with content that appeals to their interests

Video and Social Networks
  • Trends in video services
    • Users generate new videos
    • Users help each other find videos
  • Need to understand users and content
    • Video characteristics in YouTube
    • User-behavior and potential for recommendations
Particularities of

“bite-size bits for high-speed munching”

[Wired mag. Mar 2007]

  • Plethora of YouTube clones
  • UGC (user-generated content) is very different

How different?

UGC vs. Non-UGC
  • Massive production scale
    • In 15 days, YouTube produces the equivalent of 120 years' worth of movies in IMDb!
  • Extreme publishers
    • 1,000 uploads over a few years vs. 100 movies over 50 years
  • Short video length
    • 30 sec–5 min vs. 100-min movies in LoveFilm

The rest: consumption patterns

User Participation/Finding Videos
  • Despite Web 2.0 features, user participation remains low
    • Only 0.16%–0.22% of viewers rate or comment on videos
  • 47% of videos have pointers from external sites
    • But requests from such sites account for less than 3% of total views
Goals and Data
Goals
  • Potential for recommendation systems?
  • Popularity evolution
  • Content duplication

Data
  • Crawled YouTube and other UGC systems
    • Metadata: video ID, length, views
    • 1.6M Entertainment, 250K Science videos

Part 1: Popularity Distribution

Static popularity characteristics

Underlying mechanism

Pareto Principle
  • The 10% most popular videos account for 80% of total views

Other online VoD systems show smaller skew!

[Figure: fraction of aggregate views vs. normalized video ranking]
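A minimal sketch of how the skew on this slide could be measured from a list of per-video view counts. The function name `top_share` and the sample numbers are illustrative assumptions, not taken from the crawled dataset.

```python
def top_share(views, top_fraction=0.10):
    """Fraction of all views captured by the top `top_fraction` of videos."""
    if not views:
        raise ValueError("views must be non-empty")
    ranked = sorted(views, reverse=True)
    # At least one video always counts as "top".
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical view counts for ten videos (not from the dataset).
sample_views = [800, 60, 40, 30, 20, 20, 10, 10, 5, 5]
print(f"Top 10% share: {top_share(sample_views):.0%}")
```

Sweeping `top_fraction` from 0 to 1 would trace the curve that the figure plots.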

Dominant Power-Law Behavior
  • Rich-get-richer principle
    • If a video has K views, users will watch it at a rate proportional to K
  • Also seen in word frequency, citations of papers, scale of earthquakes, and web hits

[Figure: power-law example, y = x^a; frequency (log) vs. city population (log)]
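The rich-get-richer idea can be sketched as a small simulation. This is one common formalization (a Yule-Simon style process) and not necessarily the model behind the slide: the arrival of new videos with probability `new_video_prob` is an added assumption, and all names and parameters are hypothetical.

```python
import random


def simulate_yule_simon(steps, new_video_prob=0.1, seed=0):
    """Simulate views where existing videos attract views in proportion to K."""
    rng = random.Random(seed)
    views = [1]   # video 0 starts with a single view
    tokens = [0]  # one entry per view; a uniform pick selects a video
                  # with probability proportional to its view count
    for _ in range(steps):
        if rng.random() < new_video_prob:
            # A new video is uploaded and receives its first view.
            views.append(1)
            tokens.append(len(views) - 1)
        else:
            # An existing video is watched at a rate proportional to its views.
            chosen = rng.choice(tokens)
            views[chosen] += 1
            tokens.append(chosen)
    return views


views = simulate_yule_simon(steps=20000)
ranked = sorted(views, reverse=True)
top_k = max(1, len(ranked) // 10)
share = sum(ranked[:top_k]) / sum(ranked)
print(f"{len(views)} videos; top 10% hold {share:.0%} of all views")
```

Plotting `ranked` against rank on log-log axes is the usual way to check for the straight line that signals a power law.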

UGC Video Distribution
  • Straight-line waist, truncated at both ends
Focusing on Popular Videos
  • Why do popular videos deviate from the power law?
  • Fetch-at-most-once [SOSP 2003]
    • Users fetch immutable objects only once (cf. visiting popular web sites many times)
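A rough sketch of why fetch-at-most-once flattens the head of the distribution. It assumes each user draws requests from a Zipf-like preference and that a repeated pick is simply dropped, which simplifies the cited behavior. Function and parameter names are hypothetical.

```python
import random


def simulate_requests(num_users, num_videos, requests_per_user,
                      fetch_at_most_once, zipf_exponent=1.0, seed=0):
    """Return per-video view counts, indexed by preference rank (0 = top)."""
    rng = random.Random(seed)
    # Zipf-like preference: the video at rank r is picked with weight 1 / r^a.
    weights = [1.0 / (rank ** zipf_exponent)
               for rank in range(1, num_videos + 1)]
    views = [0] * num_videos
    for _ in range(num_users):
        seen = set()
        picks = rng.choices(range(num_videos), weights=weights,
                            k=requests_per_user)
        for video in picks:
            if fetch_at_most_once and video in seen:
                continue  # already watched by this user; no second fetch
            seen.add(video)
            views[video] += 1
    return views


repeat = simulate_requests(500, 1000, 50, fetch_at_most_once=False)
once = simulate_requests(500, 1000, 50, fetch_at_most_once=True)
print("Top video views with repeats:       ", repeat[0])
print("Top video views, fetch-at-most-once:", once[0])
```

With fetch-at-most-once, no video can collect more views than there are users, so the most popular videos fall below the power-law line.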
Why the Unpopular Tail Falls Off
  • Natural shape is curved
  • Sampling bias or pre-filters
    • Publishers tend to upload interesting videos
  • Information filtering or post-filters
    • Search results or suggestions favor popular items
Impact of Post-Filters
  • Videos exposed longer to the filtering effect appear more truncated

[Figure: plot over video rank]

Is it Naturally Curved?
  • Matlab curve fitting for Science videos
  • Zipf is scale-free, while exponential is scaled: the underlying mechanism is Zipf, and truncation is due to bottlenecks

[Figure: Science videos with fitted curves: Zipf, Zipf + exp cutoff, exponential, log-normal]
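The slide's fits were done in Matlab, and the exact procedure is not given. As a hedged illustration only, the sketch below fits two of the candidate shapes (Zipf, and Zipf with an exponential cutoff) by ordinary least squares in log space. All function names and the synthetic data are assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a small system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)]
           for i in range(k)]
    xty = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_zipf(views):
    """Fit log(views) = c - a*log(rank); returns (c, a). Views must be > 0."""
    feats = [[1.0, -math.log(r)] for r in range(1, len(views) + 1)]
    c, a = least_squares(feats, [math.log(v) for v in views])
    return c, a


def fit_zipf_cutoff(views):
    """Fit log(views) = c - a*log(rank) - rank/k; returns (c, a, k)."""
    feats = [[1.0, -math.log(r), -float(r)] for r in range(1, len(views) + 1)]
    c, a, inv_k = least_squares(feats, [math.log(v) for v in views])
    return c, a, (1.0 / inv_k if inv_k else math.inf)


# Synthetic data (not from the dataset): exponent 0.8, cutoff at rank 50.
synthetic = [1000.0 * r ** -0.8 * math.exp(-r / 50.0) for r in range(1, 101)]
c, a, k = fit_zipf_cutoff(synthetic)
print(f"Zipf + cutoff: exponent={a:.3f}, cutoff rank={k:.1f}")
_, a_plain = fit_zipf(synthetic)
print(f"Plain Zipf:    exponent={a_plain:.3f}")
```

Fitting in log space weights every rank equally, which is a design choice of this sketch. A fit on raw view counts would be dominated by the most popular videos.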

Implication of Our Findings

"Latent demand for products that is suppressed by bottlenecks in the system"
[Chris Anderson, The Long Tail]

[Figure: views vs. rankings for Entertainment videos, annotated "40% additional views!"]

How? Personalized recommendation

Enriched metadata

Abundant videos

Part 2: Popularity Evolution

Relationship between popularity and age

Popularity Evolution
  • So far, we focused on static popularity
  • Now focus on popularity dynamics
  • How are requests on any given day distributed across video age?
  • 6-day daily trace of Science videos
    • Step 1: Group videos requested at least once by age
    • Step 2: Count request volume per age group
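The two steps above can be sketched as follows. The record layout `(video_id, age_in_days, request_count)` and the bucket width are assumptions made for illustration, since the trace format is not described here.

```python
from collections import defaultdict


def request_volume_by_age(requests, bucket_days=1):
    """Share of one day's requests that falls into each video-age bucket.

    `requests` holds (video_id, age_in_days, request_count) tuples.
    """
    volume = defaultdict(int)
    for _video_id, age_days, count in requests:
        if count <= 0:
            continue  # Step 1: keep only videos requested at least once
        volume[age_days // bucket_days] += count  # Step 2: sum per age group
    total = sum(volume.values())
    return {bucket: n / total for bucket, n in sorted(volume.items())}


# Hypothetical one-day trace (not from the dataset).
trace = [("a", 0, 10), ("b", 3, 5), ("c", 45, 60), ("d", 400, 25), ("e", 45, 0)]
shares = request_volume_by_age(trace, bucket_days=30)
for bucket, share in shares.items():
    print(f"age {bucket * 30}-{bucket * 30 + 29} days: {share:.0%} of requests")
```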
Request Volume Across Age

User preference is relatively insensitive to age

--> 80% of requests are for videos older than a month

The probability of a video being watched is 43%, 18%, 17%, and 14% for the first 24 hours, 6 days, 3 weeks, and 1 month, respectively

Part 4: Content Duplication

Level of duplication

Birth of duplicates

Content Duplication
  • Alias: identical or similar copies of the same content
  • Aliases dilute popularity of a single event
    • Views distributed across multiple copies
    • Difficulty for recommendation and ranking systems
  • Test with 51 volunteers
    • Find aliases using keyword search
    • Identified 1,224 aliases for 184 original videos
The Level of Popularity Dilution
  • Popularity diluted by up to a few orders of magnitude
  • Aliases often got more requests than the original
  • (e.g., an alias got >1,000 times more requests)
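A small sketch of how dilution could be quantified for one original video and its aliases. The function name, the returned fields, and the sample counts are hypothetical and not drawn from the 184-video test.

```python
def dilution_report(original_views, alias_views):
    """Summarize how views for one piece of content spread across its copies."""
    if original_views <= 0:
        raise ValueError("original_views must be positive")
    total = original_views + sum(alias_views)
    return {
        # Views the content would have if all copies were merged.
        "total_views": total,
        # Share of those views that the original actually received.
        "original_share": original_views / total,
        # How many times more requests the biggest alias got.
        "top_alias_ratio": max(alias_views, default=0) / original_views,
    }


# Hypothetical counts for one original and three aliases.
report = dilution_report(original_views=50, alias_views=[60000, 300, 40])
print(report)
```

Ranking by `total_views` instead of per-copy views is one way a recommendation system could undo the dilution, assuming aliases can be identified.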
How Late Do Aliases Appear?
  • Significant aliases appear within one week
  • Within the first day of posting the original video, sometimes you get more than 80 aliases
Conclusions
  • UGC is a new form of video social interaction
  • User interaction remains low
  • Lots of potential for social recommendations
Dataset available at http://an.kaist.ac.kr/traces/IMC2007.html

Questions?