The Spread of Media Content through the Blogosphere

Flash Floods and Ripples: The Spread of Media Content through the Blogosphere. Meeyoung Cha, Juan A. Navarro, Hamed Haddadi. Max Planck Institute for Software Systems (MPI-SWS); TU Berlin Deutsche Telekom Lab. ICWSM Data Challenge 2009.


Presentation Transcript


  1. Flash Floods and Ripples: The Spread of Media Content through the Blogosphere • Meeyoung Cha, Juan A. Navarro, Hamed Haddadi • Max Planck Institute for Software Systems (MPI-SWS); TU Berlin Deutsche Telekom Lab • ICWSM Data Challenge 2009

  2. Motivation: How does content spread in blogs? What kinds of content are shared? • Blogs play a significant role in today’s Internet culture • Blogs are used to propagate information: discussing political issues, reviewing new products and online content, forming communities and special interest groups • Increasingly, media content is shared through blogs

  3. Our goal • Characterize how the structure of the blogosphere influences the patterns of content spreading • 1. Understand the structure of the blogosphere: Is the structure ideal for content dissemination? • 2. Understand the spreading patterns of content: What types of content spread? How quickly does content spread?

  4. Outline • Part 1. Measurement methodology • Part 2. Analysis of network properties • Part 3. Analysis of spreading patterns

  5. Spinn3r dataset • Extracted post URL, site, host, language, timestamps, etc. • Step 1: Focus on the top 15 blog domains • Step 2: Scrape content to find embedded HTML links • Code available at http://www.mpi-sws.org/~jnavarro/tools/ • Limitations: comments and blogrolls are missing; some blogs only post summaries; only the part of the dataset with numbered ‘tiers’ was used

  6. Step 1: Top 15 blog sites … Total

  7. Step 2: Extracting HTML links • Links to media content • Links to other blogs
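
The link-extraction step can be sketched roughly as follows. This is not the authors' tool (their code is at the URL given on slide 5); it is a minimal illustration using Python's standard library. The media host list, the set of known blog hosts, and the function names are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical list of media hosts, chosen only for illustration.
MEDIA_HOSTS = {"youtube.com", "www.youtube.com"}


class LinkExtractor(HTMLParser):
    """Collect href/src targets from anchor and embed-style tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "embed", "iframe"):
            return
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def classify_links(html, own_host, blog_hosts):
    """Split the links of one post into media links and blog-to-blog links.

    blog_hosts is an assumed set of hosts known to be blogs in the dataset.
    """
    parser = LinkExtractor()
    parser.feed(html)
    media, blogs = [], []
    for url in parser.links:
        host = urlparse(url).netloc.lower()
        if not host or host == own_host:
            continue  # skip relative links and links to the same blog
        if host in MEDIA_HOSTS:
            media.append(url)
        elif host in blog_hosts:
            blogs.append(url)
    return media, blogs
```

Links to hosts in neither set are ignored here; a real pipeline would need a fuller rule for deciding what counts as media content.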

  8. Outline • Part 1. Measurement methodology • Part 2. Analysis of network properties • Part 3. Analysis of spreading patterns

  9. Network of blogs • Directed network of 85,013 nodes and 129,079 edges [Figure: example link between blogs A and B]

  10. Network structure • 73% of blogs are in the largest connected component • Average node degree 1.5 • Power-law degree distribution • 6% of links are reciprocal • 35% of links cross blog domains • 7% of links cross language boundaries
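
As an illustration of one of these statistics, the share of reciprocal links can be computed from a directed edge list as sketched below. The definition used (a link counts as reciprocal when its reverse also exists) is an assumption, since the slides do not state the exact definition, and the toy edge list is invented.

```python
def reciprocity(edges):
    """Fraction of directed links (a, b) whose reverse link (b, a) also exists.

    Duplicate edges are collapsed and self-links are never counted as
    reciprocal. This definition is an assumption, not taken from the slides.
    """
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for a, b in edge_set if a != b and (b, a) in edge_set)
    return mutual / len(edge_set)


# Invented toy graph: A and B link to each other, the other links are one-way.
toy_edges = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")]
print(reciprocity(toy_edges))
```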

  11. Network structure (2) • The network is sparser than social networks • Density = ratio of observed links to all possible links
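
The density definition can be applied to the node and edge counts from slide 9 as a quick check. The sketch assumes a directed graph without self-links, so the number of possible links is n(n - 1), and it assumes average degree means edges per node; the slides do not spell out either convention.

```python
def average_degree(num_nodes, num_edges):
    """Edges per node (assumed meaning of 'average node degree')."""
    return num_edges / num_nodes


def directed_density(num_nodes, num_edges):
    """Observed links divided by all possible directed links, no self-links."""
    return num_edges / (num_nodes * (num_nodes - 1))


NODES, EDGES = 85_013, 129_079  # counts reported on slide 9

print(f"average degree: {average_degree(NODES, EDGES):.2f}")
print(f"density: {directed_density(NODES, EDGES):.2e}")
```

Under these assumptions the average degree comes out at about 1.5, in line with slide 10, and the density is on the order of 10^-5.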

  12. Insights for information propagation • Sparse structure and power-law degree distribution: bloggers show a clear preference for particular topics or sources; there are trend setters (high in-degree) and recommenders (high out-degree) • Potential factors that can limit spreading: blog domains had no visible effect on linking; language barriers inhibit the flow of information
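
A simple way to surface trend setters and recommenders is to rank blogs by in-degree and out-degree. The sketch below assumes an edge (src, dst) means that blog src links to blog dst; the function name and the toy data are hypothetical.

```python
from collections import Counter


def top_by_degree(edges, k=3):
    """Return the k blogs with highest in-degree and the k with highest out-degree.

    An edge (src, dst) is assumed to mean 'src links to dst'.
    """
    in_degree = Counter(dst for _, dst in edges)
    out_degree = Counter(src for src, _ in edges)
    return in_degree.most_common(k), out_degree.most_common(k)


# Invented toy graph: 'hub' is linked by many blogs, 'rec' links to many.
toy_edges = [
    ("a", "hub"), ("b", "hub"), ("c", "hub"),
    ("rec", "a"), ("rec", "b"), ("rec", "hub"),
]
trend_setters, recommenders = top_by_degree(toy_edges, k=1)
print(trend_setters, recommenders)
```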

  13. Outline • Part 1. Measurement methodology • Part 2. Analysis of network properties • Part 3. Analysis of spreading patterns

  14. Spreading of media content • What types of content are shared? • How quickly does information spread?

  15. Types of content shared • Popular sharing of user-generated content

  16. Popularity of YouTube videos • Video popularity follows a power-law distribution: • Very large diffusion processes exist • Preferential attachment may drive linking
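
One common way to check a power-law claim is to estimate the exponent by maximum likelihood. The sketch below uses the continuous-data estimator alpha = 1 + n / sum(ln(x_i / x_min)); the slides do not say how the authors fitted the distribution, so this is a swapped-in technique, and applying it to integer link counts is only an approximation. The synthetic sampler exists just to exercise the estimator.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only values >= x_min are used; x_min must be chosen by the analyst.
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum == 0:
        raise ValueError("need values above x_min to estimate the exponent")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic data only; not the video popularity data from the talk.
samples = sample_powerlaw(alpha=2.5, x_min=1.0, n=20_000)
print(round(powerlaw_alpha_mle(samples, x_min=1.0), 2))
```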

  17. Popular video categories • We downloaded metadata of the top 10,000 videos • [Figure annotations: Music most popular; Still spread!; Keen on politics]

  18. Time lag in the spread of videos • Flash floods • Ripples
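
The time lag can be thought of as the delay between a video's upload and each blog post that links to it. The sketch below labels a video by its middle lag against a caller-supplied threshold. The slides give no numeric cut-off between flash floods and ripples, so the threshold, the use of the middle lag, and the function names are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def link_lags(upload_time, link_times):
    """Sorted delays between the upload and each later blog link."""
    return sorted(t - upload_time for t in link_times if t >= upload_time)


def label_pattern(upload_time, link_times, threshold):
    """Label a video 'flash flood' or 'ripple' by its middle lag.

    threshold is a hypothetical cut-off chosen by the caller.
    Returns None when the video has no links after upload.
    """
    lags = link_lags(upload_time, link_times)
    if not lags:
        return None
    middle = lags[len(lags) // 2]  # upper median of the lags
    return "flash flood" if middle <= threshold else "ripple"


# Invented example: three links within days of upload.
upload = datetime(2008, 8, 1)
links = [upload + timedelta(days=d) for d in (1, 2, 3)]
print(label_pattern(upload, links, threshold=timedelta(days=7)))
```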

  19. Example spreading pattern • Blogs linking the same video are connected = diffusion through the blogosphere • Example: McCain’s political campaign, linked by 79 blogs [Figure legend: Other]
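
The idea that connected blogs linking the same video indicate diffusion can be sketched as a check on the blog graph. Here a video counts as having spread when at least two of the blogs that link it are themselves joined by a blog-to-blog link. This is a simplified reading of the slide: it ignores link direction and timing, and the function names and toy data are hypothetical.

```python
def has_blog_graph_spreading(linking_blogs, blog_edges):
    """True if any two blogs linking the same video are directly connected."""
    blogs = set(linking_blogs)
    return any(
        src in blogs and dst in blogs and src != dst
        for src, dst in blog_edges
    )


def fraction_with_spreading(video_to_blogs, blog_edges):
    """Share of videos whose linking blogs are connected in the blog graph."""
    if not video_to_blogs:
        return 0.0
    spread = sum(
        has_blog_graph_spreading(blogs, blog_edges)
        for blogs in video_to_blogs.values()
    )
    return spread / len(video_to_blogs)


# Invented toy data: two of the four videos are linked by connected blogs.
edges = [("A", "B"), ("C", "D")]
videos = {"v1": ["A", "B"], "v2": ["A", "C"], "v3": ["D"], "v4": ["C", "D", "E"]}
print(fraction_with_spreading(videos, edges))
```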

  20. Insights from spreading patterns • Videos in different genres spread with very different patterns • Flash floods: found quickly and spread rapidly • Ripples: took longer to spread, re-discovered years after upload • Diffusion through links in the blogosphere • 24% of videos had any spreading in the blog graph • Other spreading factors: featuring and search

  21. Outline • Part 1. Measurement methodology • Part 2. Analysis of network properties • Part 3. Analysis of spreading patterns

  22. Conclusion • Identified spreading patterns and factors that limit spreading • Blogs serve as a medium to filter and spread media content • Potential implication: Recommendation systems can take into account and exploit different spreading patterns • Future work: spreading patterns of other types of content
