Dynamics of Peer-to-Peer Networks or Who is Going to be The Next Pop Star?

Yuval Shavitt

School of Electrical Engineering

[email protected]


Credits

Talk is based on the papers:

  • Static and dynamic characterization of the Gnutella network [Shaked-Gish, S, Tankel, IPTPS 2007]

  • How to predict the next pop star? [Koenigstein, S, Tankel, KDD 2008]

What are Peer-to-Peer Networks?



  • The common computing paradigm is client-server

    • Server waits for requests (on a known port)

    • Client sends a request

    • Server serves the client

    • Examples: WWW, FTP, SMTP (e-mail), …

  • Peer-to-peer networks:

    • Each end-point is both client and server
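The contrast between the two paradigms can be sketched in a few lines of Python. This is only an illustration on the local machine, not Gnutella code: the `Peer` class, its method names, and the reply format are all hypothetical. Each peer listens on its own port (the server role) and can also open a connection to another peer (the client role).

```python
import socket
import threading

class Peer:
    """Hypothetical end-point that is both a server and a client."""

    def __init__(self, name):
        self.name = name
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        self.sock.listen()
        self.port = self.sock.getsockname()[1]
        # Server role: wait for requests in a background thread.
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            try:
                conn, _ = self.sock.accept()
            except OSError:
                return  # listening socket was closed
            with conn:
                request = conn.recv(1024).decode()
                conn.sendall(f"{self.name} serves {request}".encode())

    def request(self, other_port, what):
        # Client role: send a request to another peer and read its reply.
        with socket.create_connection(("127.0.0.1", other_port), timeout=5) as c:
            c.sendall(what.encode())
            return c.recv(1024).decode()

a, b = Peer("A"), Peer("B")
reply_from_b = a.request(b.port, "song.mp3")  # A acts as client, B as server
reply_from_a = b.request(a.port, "clip.avi")  # roles reversed
```

In a pure client-server setting only one side would run `_serve`; here both do, which is the defining property of a peer.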








The Gnutella Network

  • Gnutella: The most popular file-sharing network on the Internet

  • According to the Digital Music News Research Group: 40% market share in Q4 2007

  • LimeWire: The most popular file-sharing client in the world. Dominates the Gnutella network.

The Gnutella Protocol

  • Originally: a flat peer-to-peer distributed protocol.

    • Churn caused instability

  • Today: a 2-level tiered system

    • Stable nodes are promoted to become ultrapeers

    • Queries carry an out-of-band (OOB) address: the originator’s address or, in most cases when the client is firewalled, the ultrapeer’s address

Locating the Origin IP Address





IP resolution Process:

  • Detect the ultrapeer (U.P.) IP

  • Discard queries with more than 2 hops

  • Discard queries with 2 hops and same IP

  • Intercept queries with 2 hops and different IPs




Cancels the bias for rare queries

Introduces bias against firewalled clients
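The filtering rules of the IP resolution process can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Query` record and its field names are hypothetical, "same IP" is read here as the OOB address being equal to the forwarding ultrapeer's address, and queries with fewer than two hops (which the slide does not mention) are simply dropped.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Query:
    """Hypothetical record of an intercepted query."""
    text: str
    hops: int          # hop count carried by the query
    oob_ip: str        # out-of-band address carried by the query
    ultrapeer_ip: str  # detected IP of the ultrapeer that forwarded it

def origin_ip(q: Query) -> Optional[str]:
    """Return the originator's IP if the query passes the rules, else None."""
    if q.hops > 2:
        return None  # rule: discard queries with more than 2 hops
    if q.hops == 2 and q.oob_ip == q.ultrapeer_ip:
        return None  # rule: same IP, the OOB address is the ultrapeer's
    if q.hops == 2:
        return q.oob_ip  # rule: different IPs, the OOB address is the origin
    return None  # assumption: fewer than 2 hops is not covered, so drop

queries = [
    Query("party like a rockstar", 3, "192.0.2.10", "198.51.100.1"),
    Query("soulja boy", 2, "198.51.100.1", "198.51.100.1"),
    Query("yung berg", 2, "192.0.2.77", "198.51.100.1"),
]
resolved = [(q.text, origin_ip(q)) for q in queries if origin_ip(q)]
```

Only the third query survives, because its OOB address differs from the ultrapeer's. Dropping the second kind of query is what introduces the bias against firewalled clients noted above.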

Data Sets

  • First study:

    • Jul 2006 - Nov 2006

    • 665,000,000 world-wide geo-identified queries

  • Second study

    • Oct 2006 – Jul 2007, Sundays only

    • 310,000,000 USA geo-identified queries

  • A network crawl of 24 hours

    • 1.2M users

    • 533,000 different songs

      Largest studies ever performed in length and depth

How to Predict an Artist’s Success?

Noam Koenigstein, Y. Shavitt, and Tomer Tankel. Spotting Out Emerging Artists Using Geo-Aware Analysis of P2P Query Strings. The 2008 ACM SIGKDD Conference, August 2008, Las Vegas, NV, USA.

The Word of Mouth Effect

The divergence can be used to predict a new product’s success probability [Garber et al., Marketing Science 2004]

The Divergence

  • When measured against the uniform distribution, the maximum is achieved when P is a δ (delta) function.

    • True for both Kullback-Leibler and Jensen-Shannon

    • This is the case when emerging artists are considered

  • Non-uniform distribution of potential adopters:
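As a rough illustration of the two divergence measures named above, the following sketch computes the Kullback-Leibler and Jensen-Shannon divergences of a distribution P against a reference Q. It uses the standard textbook definitions, not a formula recovered from the talk. Reading P as the share of a query string's volume per region is an assumption, and the example numbers are made up. Q can be the uniform distribution or any non-uniform reference with positive entries.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in nats.

    Terms with p_i == 0 contribute 0; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized KL against the mixture M."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

n = 4                              # illustrative number of regions
uniform = [1 / n] * n
delta = [1.0, 0.0, 0.0, 0.0]       # all queries come from one region
spread = [0.7, 0.1, 0.1, 0.1]      # concentrated, but not a delta

kl_delta = kl_divergence(delta, uniform)    # equals log(n), the maximum
kl_spread = kl_divergence(spread, uniform)
js_delta = js_divergence(delta, uniform)
js_spread = js_divergence(spread, uniform)
```

For both measures the delta distribution scores higher than the spread one, which matches the claim that the maximum against the uniform distribution is reached when P is a δ function.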

Party Like a Rockstar in 2007

Week 6: The string “party like a rockstar” is detected by the algorithm

Week 8: Atlanta’s popularity chart in (Feb 18th)

Week 15: Atlanta-based Shop Boyz sign a contract with Universal Recordings

Week 18: The song first enters the Billboard Hot 100 (80th position)

Week 23: Reached 2nd position on Billboard Hot 100

Ranked only


on the global chart

Party Like a Rockstar

Shop Boyz related queries in February 2007

Shop Boyz Popularity and Divergence in 2007

Soulja Boy

  • Detected by our algorithm already in 2006.

  • The string “soulja boy” entered the “Atlanta queries top 100” already in October 2006

  • Entered the Bubbling Under R&B/Hip-Hop Singles chart on June 23, 2007

  • Later ranked first in the following Billboard charts: Hot 100, Hot Rap Tracks, Hot Videoclip, Hot RingMasters and Hot Ringtones

Yung Berg

  • Active in LA

  • Week 2: Entered LA top 100

  • Week 15: First appeared on the Billboard charts

  • Week 32: Reached 18 on the Billboard Top 100

The Detection Algorithm

  • Input: A list of geo-identified P2P query strings

  • Output: A list of locally popular query strings with a high probability of becoming globally popular

  • Build local and global popularity charts

  • Local popularity is detected using local and global popularity thresholds

  • Look for local popularity growth trends from week to week

  • Filtering: non-music-related content and already familiar artists are characterized by a uniform distribution
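The outline above can be sketched roughly as follows. This is an illustration under assumptions, not the published algorithm: the function names, the rank-based thresholds (`local_top`, `global_top`), the "rank improved since last week" growth test, and the `known` filter set are all hypothetical stand-ins for details the slides do not give.

```python
from collections import Counter, defaultdict

def weekly_charts(queries):
    """Build charts from (week, region, query_string) tuples."""
    local = defaultdict(Counter)  # (week, region) -> query counts
    world = defaultdict(Counter)  # week -> query counts over all regions
    for week, region, text in queries:
        local[(week, region)][text] += 1
        world[week][text] += 1
    return local, world

def ranks(counter):
    """Map each string to its 1-based rank, most frequent first."""
    return {s: i for i, (s, _) in enumerate(counter.most_common(), start=1)}

def detect_emerging(queries, region, week,
                    local_top=100, global_top=100, known=frozenset()):
    """Strings popular in `region` this week, rising, and not yet global."""
    local, world = weekly_charts(queries)
    now = ranks(local[(week, region)])
    before = ranks(local[(week - 1, region)])
    global_now = ranks(world[week])
    hits = []
    for text, rank in now.items():
        if rank > local_top:
            continue  # not locally popular (assumed local threshold)
        if global_now[text] <= global_top:
            continue  # already globally popular (assumed global threshold)
        if text in known:
            continue  # filtered: familiar artist or non-music content
        previous = before.get(text)
        if previous is not None and previous <= rank:
            continue  # no week-to-week growth in local rank
        hits.append(text)
    return hits

# Made-up toy data: "a" climbs in ATL between week 1 and week 2.
data = (
    [(1, "ATL", "a")] * 1 + [(1, "ATL", "b")] * 5 + [(1, "ATL", "c")] * 3
    + [(1, "NYC", "b")] * 5 + [(1, "NYC", "d")] * 4
    + [(2, "ATL", "a")] * 6 + [(2, "ATL", "b")] * 5 + [(2, "ATL", "c")] * 2
    + [(2, "NYC", "b")] * 6 + [(2, "NYC", "d")] * 4
)
emerging = detect_emerging(data, "ATL", 2, local_top=2, global_top=1, known={"b"})
```

With these toy thresholds only "a" is reported: it is in the local top 2, it is not in the global top 1, and its local rank improved from 3rd to 1st.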

Local Popularity

  • Not all queries are “products”, thus divergence is not effective (e.g., rare typos)

  • Detection is based on local popularity:

ATPL - All Times Popular List

  • Initialization: All the strings that reached global popularity in 2006

  • Weekly aggregation

  • Filters non-volatile strings:

    • adult related, e.g., “porn”

    • well established artists, e.g., “madonna”, “avril lavigne”

    • Movies, software, etc.
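One way to picture the ATPL is as a set of strings that only grows: it starts from the strings that were globally popular in 2006, and each week's globally popular strings are added to it. The sketch below makes that concrete under assumptions. The helper names, the `global_top` cutoff, and the toy counts are hypothetical, and the talk does not say how "reached global popularity" is defined.

```python
from collections import Counter

def update_atpl(atpl, weekly_global_counts, global_top=100):
    """Return the ATPL extended with this week's globally popular strings."""
    top = Counter(weekly_global_counts).most_common(global_top)
    return atpl | {text for text, _ in top}

def filter_candidates(candidates, atpl):
    """Drop candidates that are already on the all-times popular list."""
    return [text for text in candidates if text not in atpl]

atpl = {"madonna", "avril lavigne"}  # assumed initial list from 2006
week_counts = {"madonna": 50, "new hit": 40, "rare typo": 1}  # made-up counts
atpl = update_atpl(atpl, week_counts, global_top=2)
remaining = filter_candidates(["madonna", "local band", "new hit"], atpl)
```

After the weekly update "new hit" joins the list, so only "local band" remains a candidate for local-popularity detection.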

Correlation Measurements

  • Modified time series correlation

  • P2P correlation with the Billboard:

Prediction Results

  • Example: When a song enters the Billboard chart, will it reach the “top 20”?

  • Precision: 89%, Recall: 80%

  • On average, songs pass the threshold 2.83 weeks before reaching their top Billboard rank

  • More details: Koenigstein, Shavitt, and Zilberman, AdMIRe 2009
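For readers unfamiliar with the two metrics, the sketch below shows the standard definitions of precision and recall for a "will it reach the top 20?" prediction. The song labels and sets are invented for illustration and are not the study's data.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a predicted set against the true set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: songs predicted to reach the top 20 vs. those that did.
predicted_top20 = {"song A", "song B", "song C"}
reached_top20 = {"song A", "song B", "song D"}
precision, recall = precision_recall(predicted_top20, reached_top20)
```

Precision is the fraction of predicted hits that really reached the top 20. Recall is the fraction of actual top-20 songs that were predicted.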

Summary

  • Following activity on the Internet can help us detect trends before they are visible

    • P2P networks

    • Social networks

    • Blogs

    • Talk-backs

    • Searches

  • More at http://www.eng.tau.ac.il/~shavitt