analyzing patterns of user content generation in online social networks l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Analyzing Patterns of User Content Generation in Online Social Networks PowerPoint Presentation
Download Presentation
Analyzing Patterns of User Content Generation in Online Social Networks

Loading in 2 Seconds...

play fullscreen
1 / 24

Analyzing Patterns of User Content Generation in Online Social Networks - PowerPoint PPT Presentation


  • 133 Views
  • Uploaded on

Analyzing Patterns of User Content Generation in Online Social Networks. Lei Guo, Yahoo! Enhua Tan, Ohio State University Songqing Chen, George Mason University Xiaodong Zhang, Ohio State University Yihong (Eric) Zhao, Yahoo!.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Analyzing Patterns of User Content Generation in Online Social Networks' - noelani


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
analyzing patterns of user content generation in online social networks

Analyzing Patterns of User Content Generation in Online Social Networks

Lei Guo, Yahoo!

Enhua Tan, Ohio State University

Songqing Chen, George Mason University

Xiaodong Zhang, Ohio State University

Yihong (Eric) Zhao, Yahoo!

online social networks platforms for social connections and content sharing
Online social networks: platforms for social connections and content sharing
  • Networking oriented OSNs
    • Knowledge-sharing mainly among friends
  • Knowledge-sharing oriented OSNs
    • Content sharing is among all users

User network

social connections

Content network

common interest topics

ugc content in online social networks
UGC content in online social networks
  • User generated content (UGC)
    • Users are basic elements of OSNs
    • OSNs are driven by user contributions
  • Understanding UGC content generation patterns is important
    • Business success: attract new users and clients
    • Identify and distinguish active users from spamming users
    • Predict hot spots and the trends of topics in user communities
    • Perform efficient resource management in the underlying supporting system

Users create new contents

advertisement

Contents attract new users

existing studies about user contributions in online social networks

log y

y

slope: -a

heavy tail

log i

i

Existing studies about user contributions in online social networks
  • Wikipedia
    • Power law: core users contribute most articles (ISSI’05)
      • Number of articles a user edited
      • Number of co-authors of a Wiki article
      • Heavy tailed, scale free: highly skewed towards top users
    • User contribution shifts from “elite” users to common users (CHI’07)
      • Log analysis from 2001 to 2006

Power law or not: no conclusion

  • Delicious social bookmark (CHI’07)
    • Similar shifts for user contribution as in Wikipedia
    • Power law or not: no conclusion

i : contribution rank of a user

yi : contribution of the user

our study
Our study
  • UGC content in three large online social networks
    • Blog, social bookmark, question answer
  • User posting over time
  • Distribution of user contributions
  • Implications of UGC generation patterns
  • Concluding remarks
  • User posting over time
ugc creation traffic overview
UGC creation traffic overview

Bookmark, US

Answer, US

Blog article, Asia

  • Weekly patterns
    • Blog (article/photo): weekday and weekend posts are similar—daily web journaling
    • Bookmark/Answer: weekend posts are smaller than weekdays
  • Daily patterns
    • Peak times are all 11:00 PM local time
    • Bottom times are different for US and Asia: different cultures

Blog photo, Asia

dynamics of user joining and posting in osns
Dynamics of user joining and posting in OSNs

Blog

Bookmark

User join rate (new users per day)

  • increases with time
  • bursty in large time scales

User increase rate

  • decrease with time
  • bursty in large time scales

Post increase rate

  • decrease with time
  • less bursy than user increase rate

Implications

  • total user population and content do not increase exponentially
  • User join bursts: post inc rate < user inc rate
  • Bursts and dynamics need to be considered for data analysis
user activity over time
User activity over time

Author’s OSN age of posts

  • User’s posting frequency over time
    • The age of the user in OSN when an UGC object is posted
    • Bookmark: almost uniform distribution
    • Blog: a little skewed towards small ages
    • Answer: more skewed towards small ages
  • User’s lifetime (active duration) in OSNs
    • Assumed exponential distribution before
    • For user posting behavior
      • Long lifetime users
      • Short lifetime users
      • Other users: a wide range of lifetime

Author’s OSN lifetime

outline
Outline
  • User generated content
  • User posting over time
  • Distribution of user contributions
  • Implications of UGC generation patterns
  • Concluding remarks
original and non original ugc content
Three kinds of UGC objects

Original UGC objects

Cut-and-paste objects

Spam and advertisement

Spam: filtered out with ML model

Cut-and-paste objects in Blog

Posted by a small number of users

No clear posting peak time

Focused on recreation and social event categories

Spam users and cut-and-paste users are removed in our analysis

Original and non-original UGC content

Cut-and-paste posts

stretched exponential distribution

log y

yc

fat head

slope:-a

b

thin tail

log i

log i

Stretched exponential distribution
  • User contribution in a social network follows the stretched exponential distribution
  • Rank order distribution:
    • fat head and thin tail in log-log scale
    • straight line in logx-yc scale (SE scale)

i : rank of users (N users)

y : number of objects created by the user

c: stretch factor

ugc creation patterns of blog
UGC creation patterns of Blog

article

photo

log scale

fat head

powered scaleyc

thin tail

c = 0.418

R2 = 0.997

log scale in x axis

x: contribution rank of user y: number of original posts by the user

Parameters: maximum likelihood method

R2: coefficient of determination (1 means a perfect fit)

Y left: y^c scale

Y right: log scale

ugc creation patterns of bookmark
UGC creation patterns of Bookmark

Bookmark (imports)

Bookmark (all posts)

log scale

fat head

  • Bookmark imports: bookmarks imported from user’s Web browser when joining the system
  • Bookmark posts: bookmarks posted to the system by the bookmark plug-in of web browser

powered scaleyc

thin tail

log scale in x axis

x: contribution rank of user y: number of bookmark posts by the user

ugc creation patterns of answer
UGC creation patterns of Answer

Answer (all posts)

Answer (best)

  • Best answer: the asker can select a best answer from all received answers. Best answers are high quality UGC posts since they are judged by the askers themselves.

log scale

fat head

powered scaleyc

thin tail

log scale in x axis

x: contribution rank of user y: number of answer posts by the user

model validation
Chi-square test

Validation on users joined the system simultaneously

Users join rate increases with time

Some users may become inactive

Validation on different parts of workloads

follow SE distribution with the same c

parameter c is the shape factor, not change for different parts of a workload

Model validation

Chi-square test results ( = 0.05)

k: number of bins, Oi: total observed posts, Ei: expected number of posts

outline16
Outline
  • User generated content
  • User posting over time
  • Distribution of user contributions
  • Implications of UGC generation patterns
  • Conclusion and future work
the 80 20 rule
The “80-20” rule
  • 80-20 rule of power law distributions
    • Pareto principle: 20% people own 80% social wealth
    • Internet systems: 20% web pages account for 80% requests
  • In social networks
    • Blog: 20% users for 80% posts
    • Bookmark: 17% users for 83% posts
    • Answer: 13% users for 87% posts

Roughly follows the 80-20 rule

User contribution is stretched exponential

What is the difference between user contribution distribution in online social networks and user income distribution in a real society?

asymptotical properties of top users

log y

logy

Power law

Stretched exponential

log i

log i

Asymptotical properties of top users

Highly skewed towards top users

Contributions of top users

Less skewed towards top users

The cumulative contribution ratio of top-k users among all n users in an OSN

A small number of top users cannot dominate the content in an OSN

the core users in social networks
The “core” users in social networks
  • Looking for a threshold to identify most important users
  • Power law distribution: hard threshold
    • By number of or fraction of users
    • By a predefined user contribution threshold
  • Stretched exponential distribution: general threshold for all systems

decrease rate of user contribution along rank

X0= log k, Y0= yk :

increase rate of user contribution rank

Let

creation patterns of different types of ugc
Creation patterns of different types of UGC

Blog article

Blog photo

Bookmark

Answer

more effort than short blog

taking photo,

transferring,

editing,

uploading,

writing desc,

higher quality, smaller c (more effort to compose)

no difference in effort

  • more effort, smaller c
    • longer articles need more effort to compose, adding tags needs extra effort

user participating effort is even smaller

higher quality and more effort than best answer

would have much smaller c

Power law !

Our conjecture: larger c, flatter user contribution distribution

discussion ugc production vs ugc consumption
Discussion: UGC production vs. UGC consumption
  • Internet media access patterns (PODC’08)
    • Number of requests to an media object is stretched exponential for different kinds of media systems
  • Media request is content consumption
    • Stretch factor increases with file length (duration a user views)
  • UGC creation is content production
    • Stretch factor decreases with the effort to create a UGC object
  • UGC social networks rely on user contribution to attract traffic
    • Relationship between UGC creation and consumption
    • More general model for both UGC creation and consumption
      • Understand the driving force of a social network
      • Design effective participation mechanisms for social applications
      • Provide efficient data management for underlying supporting systems
outline22
Outline
  • User generated content
  • User posting over time
  • Distribution of user contributions
  • Implications of UGC generation patterns
  • Concluding remarks
conclusion
Conclusion
  • User activities and contributions are critical for knowledge-sharing social networks
  • We have analyzed three large OSNs, and found
    • User lifetime in OSNs does not follow exponential distribution
    • User contribution distribution is stretched exponential
    • Different types of UGC content generation patterns can be modeled with different parameters in SE
  • User contribution model: distribution of individual user behaviors
    • Building block to understand more complex social network phenomena
    • Foundation to guide design, modeling and simulation of OSNs