Discovering Leaders from Community Actions

Amit Goyal¹, Francesco Bonchi², Laks V.S. Lakshmanan¹

Oct 27, 2008


Context & Motivations: Viral Marketing


Word of Mouth and Viral Marketing

We are more influenced by our friends than strangers

68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

http://cs.ubc.ca/~goyal/

Amit Goyal (University of British Columbia)


Viral Marketing

Also known as targeted advertising

Initiates a chain reaction through the word-of-mouth effect

Low investment, maximum gain


Viral Marketing as an Optimization Problem

Given: Network with influence probabilities

Problem: Select top-k leaders such that by targeting them, the spread of influence is maximized

Hao Ma et al 2008, Domingos et al 2001, Richardson et al 2002, Kempe et al 2003

How to calculate true influence probabilities?


A Pattern Mining Approach

  • We propose a completely different approach based on frequent pattern mining.

  • We focus on the actions performed by users:

    • Joining a community (as in a Flickr/Facebook community)

    • Rating a song or a movie (as in Y! Music, Y! Movies)

  • The time at which actions are performed is important

  • Assumption: Users can see their friends’ actions


    Our Contributions

    • Formally define the notion of leaders and its various flavors

    • Efficient algorithms for extracting these leaders

    • Demonstrate the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset

      • Yahoo! Messenger (social graph)

      • Yahoo! Movies ratings (actions log)


    Rest of the talk

    • Framework definition:

      • Influence propagation on the social network

      • Various notions of leaders

    • Algorithms

    • Experiments

    • Related Work

    • Conclusion




    Input Data (1)

    • A social network, i.e., an undirected graph G=(V,E) where nodes are users and edges represent social ties.

    • Users declare their friends, e.g., on Facebook or Yahoo! Messenger.


    Input Data (2)

    An actions log sorted in chronological order, i.e., a relation

    Actions(User, Action, Time)

    Example: Jack joined the Yoga community at time 5

    Assumption:

    Users can see their friends’ actions (feeds)
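As a rough illustration of the two inputs, the sketch below holds the social graph as a dictionary of neighbour sets and the actions log as a time-sorted list of tuples. The names `ActionTuple`, `build_social_graph` and `actions_log`, and the sample rows, are assumptions made for this example and do not come from the talk.

```python
from collections import namedtuple

# One tuple of the relation Actions(User, Action, Time).
ActionTuple = namedtuple("ActionTuple", ["user", "action", "time"])


def build_social_graph(edges):
    """Build an undirected graph G=(V,E) as a dict of neighbour sets."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


# Illustrative friendships (hypothetical sample data).
graph = build_social_graph([("Jack", "Jill"), ("Jack", "Mary")])

# The log is kept in chronological order, as the slide requires.
actions_log = sorted(
    [
        ActionTuple("Jill", "join Yoga community", 8),
        ActionTuple("Jack", "join Yoga community", 5),
    ],
    key=lambda row: row.time,
)
```

Storing neighbours as sets keeps the "are u and v friends?" test cheap, which is the lookup the later propagation steps rely on.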


    Action Propagation

    • Jack and Jill are friends

    • Jack and Mary are friends

    • Action is “Joining the Yoga community”

    [Figure: Jack joined the Yoga community at time 5. Jill joined at time 8, 3 time units after Jack. Mary joined at time 1000, 995 time units after Jack.]

    • Action propagated from Jack to Jill

    • Action propagated from Jack to Mary
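The propagation rule on this slide can be sketched as follows: an action propagates from u to v when the two are friends and u performed the action before v. The function name `propagation_edges` and the choice to ignore simultaneous actions are assumptions of this sketch.

```python
def propagation_edges(friendships, times):
    """Return (source, target, delay) for every friendship along which the action propagated.

    friendships: iterable of undirected (u, v) pairs
    times: dict mapping each user who performed the action to the time of the action
    """
    edges = []
    for u, v in friendships:
        if u not in times or v not in times:
            continue  # at least one friend never performed the action
        if times[u] < times[v]:
            edges.append((u, v, times[v] - times[u]))
        elif times[v] < times[u]:
            edges.append((v, u, times[u] - times[v]))
        # equal times: assumed here to mean no propagation in either direction
    return edges


friendships = [("Jack", "Jill"), ("Jack", "Mary")]
times = {"Jack": 5, "Jill": 8, "Mary": 1000}
edges = propagation_edges(friendships, times)
# edges == [("Jack", "Jill", 3), ("Jack", "Mary", 995)]
```

The delays 3 and 995 are the ones shown on the slide; the delay is kept on each edge because the later slides use it to decide whether propagation counts as influence.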


    Propagation Graph

    [Figure: propagation graph for joining the Yoga community. Jack joined at time 5, Jill at time 8, Joey at time 12, Ben at time 15 and Mary at time 1000.]

    Can we say Mary was influenced by Jack? No.


    User Influence Graph

    When an action propagates from user u to user v, we may think of v as being influenced by u

    Influence should decay in time

    Size of influence graph << Size of propagation graph (PG)

    [Figure: the propagation graph next to the user influence graph for Jack.]
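One way to read "influence should decay in time" is as a hard cut-off: a user counts as influenced by u only if reachable from u in the propagation graph and acting within π time units of u. The sketch below implements that reading. The edge set, the function name `influence_set` and the inclusive comparison are assumptions; the inclusive comparison is chosen because it agrees with the talk's example IM_10(Jack, “joining yoga community”) = 3.

```python
from collections import deque


def influence_set(propagation_graph, times, root, pi):
    """Users reachable from root whose action time is within pi of root's action."""
    influenced = set()
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in propagation_graph.get(u, []):
            # Times only increase along propagation edges, so a node outside
            # the window cannot lead to a node inside it: pruning is safe.
            if v not in influenced and times[v] - times[root] <= pi:
                influenced.add(v)
                queue.append(v)
    return influenced


times = {"Jack": 5, "Jill": 8, "Joey": 12, "Ben": 15, "Mary": 1000}
# Hypothetical propagation edges; the figure's exact edges are not in the transcript.
propagation_graph = {
    "Jack": ["Jill", "Joey", "Mary"],
    "Jill": ["Joey"],
    "Joey": ["Ben"],
}
jack_10 = influence_set(propagation_graph, times, "Jack", pi=10)
```

With these assumed edges Jack's influence set for π = 10 is Jill, Joey and Ben, while Mary (995 time units later) is excluded, which is why the influence graph is much smaller than the propagation graph.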


    Leaders – First Definition

    • Who should be a leader?

      • For an action, should influence a sufficiently large number of users ( >ψ )

      • For an action, should influence these users in a reasonable amount of time ( <π )

      • Should act as a leader in a sufficiently large number of actions ( >σ )

    If ψ = 2, π = 15, σ = 1

    then both Jack and Jill are leaders

    [Figure: animation over the propagation graph (Jack at time 5, Jill at 8, Joey at 12, Ben at 15, Mary at 1000), with edges labeled by propagation delay in time units.]
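Given per-action influence counts already computed for a fixed π, the three conditions reduce to two threshold tests. The sketch below assumes inclusive thresholds (at least ψ users, at least σ actions), because that reading makes the slide's example come out with both Jack and Jill as leaders. The counts in `im` are hypothetical values for the Yoga example, and `find_leaders` is a name invented here.

```python
def find_leaders(im, psi, sigma):
    """Return users who influenced at least psi users in at least sigma actions.

    im: dict mapping (user, action) to the number of users influenced within pi
    """
    actions_led = {}
    for (user, _action), influenced in im.items():
        if influenced >= psi:
            actions_led[user] = actions_led.get(user, 0) + 1
    return {user for user, count in actions_led.items() if count >= sigma}


# Hypothetical influence counts for pi = 15.
im = {
    ("Jack", "yoga"): 3,
    ("Jill", "yoga"): 2,
    ("Joey", "yoga"): 1,
    ("Ben", "yoga"): 0,
    ("Mary", "yoga"): 0,
}
leaders = find_leaders(im, psi=2, sigma=1)
```

Raising ψ to 3 in this example would leave only Jack, which shows how the thresholds trade coverage for selectivity.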


    Tribe Leader

    A leader may influence different users for different actions

    What if a leader leads a fixed set of users for different actions?

    We call these leaders tribe leaders

    Tribes can be considered small communities

    [Figure: Jack and his followers, with actions A1, A2 and A3.]

    A1, A2 and A3 are 3 different actions


    Additional Constraint: Genuineness

    It may happen that a user acts as a leader but in practice is always a follower of other leaders

    We want to avoid this kind of fake leader.

    [Figure: Jack, Tom and Jill, with links labeled by the actions A1, A2 and A3.]

    A1, A2 and A3 are 3 different actions

    gen(Jill) = 1/3

    Another constraint: confidence
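The slide gives only the value gen(Jill) = 1/3, not a formula. One plausible reading, assumed here, is the fraction of the actions a user leads in which the user was not influenced by another leader of the same action. The data below is a hypothetical configuration chosen to reproduce 1/3 for Jill; it is not read off the original figure.

```python
from fractions import Fraction


def genuineness(user, leader_actions, followers):
    """Fraction of the user's led actions in which no other leader influenced the user.

    leader_actions: dict mapping user to the set of actions that user leads
    followers: dict mapping (leader, action) to the set of users influenced
    """
    led = leader_actions.get(user, set())
    if not led:
        return None  # undefined for users who never lead
    genuine = 0
    for action in led:
        dominated = any(
            user in followers.get((other, action), set())
            for other, actions in leader_actions.items()
            if other != user and action in actions
        )
        if not dominated:
            genuine += 1
    return Fraction(genuine, len(led))


# Hypothetical data: Jill leads A1, A2, A3 but follows Jack or Tom in A1 and A2.
leader_actions = {"Jack": {"A1", "A2"}, "Tom": {"A1"}, "Jill": {"A1", "A2", "A3"}}
followers = {
    ("Jack", "A1"): {"Jill"},
    ("Jack", "A2"): {"Jill"},
    ("Tom", "A1"): {"Jill"},
}
gen_jill = genuineness("Jill", leader_actions, followers)
```

Under this reading Jill is genuine only for A3, giving 1/3, while Jack, who follows nobody, scores 1.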


    Algorithms

    But how will I discover the leaders?


    Algorithms: Overview

    • Assumptions:

      • Social graph is huge – millions of nodes

      • Actions log is huge – millions of tuples

      • For an action, size of user Influence Graph << size of Propagation Graph for all users

    • Our algorithms are able to extract the patterns (leaders and tribe leaders) in no more than one scan of the action log table.



    Algorithms: Overview

    • Scan the action log table backward in time by means of a window of size π, i.e., starting from the most recent timestamp (bottom of the table if we assume tuples to be ordered by time).

    • Efficiently compute the influence matrix, i.e., a matrix Users x Actions

      • IM_π(u, a) represents the number of users influenced by u w.r.t. action a within time π

    • Compute leaders from IM

    Example: IM_10(Jack, “joining yoga community”) = 3
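The backward scan can be sketched with plain Python sets: when a tuple is reached, every friend who acted later has already been processed, so the user's influence set is the union of those friends and their own sets, trimmed to the window. This is a simplified illustration, not the authors' implementation. It assumes each user performs an action at most once, uses an inclusive window, and the friendships are hypothetical.

```python
from collections import defaultdict


def influence_matrix(friendships, log, pi):
    """Compute IM_pi as a dict {(user, action): count} in one backward scan of the log."""
    neighbours = defaultdict(set)
    for u, v in friendships:
        neighbours[u].add(v)
        neighbours[v].add(u)

    scanned = defaultdict(dict)  # action -> {user: time} for tuples already scanned
    influenced = {}              # (user, action) -> set of influenced users
    # Start from the most recent tuple and move toward the oldest.
    for user, action, time in sorted(log, key=lambda row: row[2], reverse=True):
        later = scanned[action]
        reach = set()
        for friend in neighbours[user]:
            friend_time = later.get(friend)
            if friend_time is not None and time < friend_time <= time + pi:
                reach.add(friend)
                # Reuse the friend's set, keeping only users inside this user's window.
                reach |= {w for w in influenced[(friend, action)] if later[w] <= time + pi}
        influenced[(user, action)] = reach
        later[user] = time
    return {key: len(users) for key, users in influenced.items()}


friendships = [("Jack", "Jill"), ("Jack", "Joey"), ("Jill", "Joey"),
               ("Joey", "Ben"), ("Jack", "Mary")]
log = [("Jack", "yoga", 5), ("Jill", "yoga", 8), ("Joey", "yoga", 12),
       ("Ben", "yoga", 15), ("Mary", "yoga", 1000)]
im = influence_matrix(friendships, log, pi=10)
```

With these assumed friendships the sketch yields 3 for Jack, matching the slide's IM_10 example. Sets are used for clarity; the following slides describe the bit-vector representation that makes the same bookkeeping cheap.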


    Computing Influence Matrix (1)

    • We use a bit vector to track which users are influenced by a given user. It is updated incrementally.

    • Locking mechanism using another bit vector

      • 0 => free bit; 1 => occupied bit

    • Node to bit index mapping stored in a queue

    • Bits must be dynamically allocated.

    [Figure: time window on the propagation graph containing nodes R, S, T, W and V; the queue with its head marker; lock bit vector 01010111.]


    Computing Influence Matrix (2)

    • Slide up the current window – delete node V

    • Delete the entry from queue

    • Update the lock

    • Update influence vectors

    [Figure: the same time window (R, S, T, W, V) and queue; the lock bit vector changes from 01010111 to 01010011.]


    Computing Influence Matrix (3)

    • New node P added

    • Issue a lock, add entry to the queue

    • Compute its Influence Vector by propagation

    • Number of followers of P = 4

    • IM(P,a) = 4

    [Figure: time window now containing P, R, S, T and W; the queue with its head marker; the lock bit vector changes from 01010011 to 01010111.]
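The three steps above (allocate a bit under a lock, free it when a node leaves the window, build a new node's vector by propagation) can be sketched with Python integers as bit vectors. The class `WindowBits`, its method names and the edges among R, S, T, W and V are assumptions for illustration; the bit positions it assigns will not match the lock patterns shown in the figures.

```python
from collections import deque


class WindowBits:
    """Influence bit vectors for the nodes currently inside the time window."""

    def __init__(self, width):
        self.width = width    # number of bits available
        self.lock = 0         # lock bit vector: 1 = occupied, 0 = free
        self.queue = deque()  # (node, bit index) in order of insertion
        self.slot = {}        # node -> bit index, for quick lookup
        self.vector = {}      # node -> influence bit vector

    def add(self, node, influenced_friends):
        """Lock a free bit for node and build its vector by propagation."""
        free = [i for i in range(self.width) if not (self.lock >> i) & 1]
        if not free:
            raise RuntimeError("no free bit in the lock vector")
        bit = free[0]
        self.lock |= 1 << bit
        self.queue.append((node, bit))
        self.slot[node] = bit
        vec = 0
        for friend in influenced_friends:
            # node influences the friend and everyone the friend influences
            vec |= (1 << self.slot[friend]) | self.vector[friend]
        self.vector[node] = vec
        return bin(vec).count("1")  # number of followers inside the window

    def evict(self):
        """Slide the window: drop the earliest-inserted node and free its bit."""
        node, bit = self.queue.popleft()
        del self.slot[node]
        del self.vector[node]
        self.lock &= ~(1 << bit)
        for other in self.vector:
            self.vector[other] &= ~(1 << bit)
        return node


window = WindowBits(width=8)
window.add("V", [])
window.add("W", ["V"])
window.add("T", ["V"])
window.add("S", ["W"])
window.add("R", ["S", "T"])
window.evict()                           # V leaves the window
followers_of_p = window.add("P", ["R"])  # P reuses the bit that V released
```

In this run P ends up with four followers (R, S, T and W), in line with IM(P,a) = 4 on the slide. Reusing freed bits is what keeps the vectors as short as the window is wide rather than as long as the user base.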


    Mining Tribe Leaders

    • Influence Matrix not enough

    • We use influence cube: Users x Actions x Users

      • IC_π(u, a, v) = 1 when user v is influenced by user u for action a within time π

    • We do not explicitly compute the whole cube due to sparsity.

    • The problem is the same as discovering whether frequent itemsets larger than a given size threshold exist
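For a single candidate leader, one slice of the influence cube is a list of follower sets, one per action. The user is a tribe leader if some set of at least ψ followers appears in at least σ of those sets, which is a frequent-itemset existence test. The brute-force check below is a sketch under those assumed inclusive thresholds, with invented names and data; it does not include the optimizations the talk refers to.

```python
from itertools import combinations


def find_tribe(follower_sets, psi, sigma):
    """Return a set of psi users followed in at least sigma actions, or None.

    follower_sets: list of sets, one per action, of users influenced by one leader
    """
    support = {}
    for followers in follower_sets:
        for user in followers:
            support[user] = support.get(user, 0) + 1
    # Only users who follow in at least sigma actions can belong to a tribe.
    frequent = sorted(user for user, count in support.items() if count >= sigma)
    # A frequent itemset larger than psi contains a frequent one of size psi,
    # so checking size exactly psi decides existence.
    for tribe in combinations(frequent, psi):
        shared = sum(1 for followers in follower_sets if followers.issuperset(tribe))
        if shared >= sigma:
            return set(tribe)
    return None


# Hypothetical slice of the cube for Jack over actions A1, A2 and A3.
jack_followers = [
    {"Jill", "Joey", "Ben"},   # A1
    {"Jill", "Joey"},          # A2
    {"Jill", "Joey", "Mary"},  # A3
]
tribe = find_tribe(jack_followers, psi=2, sigma=3)
```

Only the non-empty follower sets are stored, which reflects the slide's point that the cube is too sparse to materialize in full.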


    Algorithms - Final Comments

    • The only truly mandatory threshold is π (time threshold)

    • Influence Matrix: O(T·A·n²) in bit level operations

      • T = total number of tuples in action log

      • A = total number of distinct actions

      • n = maximum number of nodes visible in any position of the time window

      • n << N, where N is the total number of users

    • Tribe Leaders:

      • Influence Cube: O(T·A·n²)

      • Finding existence of frequent itemsets: exponential in number of followers

        • But very fast due to optimizations (Bonchi 2003)


    Experiments

    Enough talking, show me the results, dude!


    Data Preparation

    • Data

      • Social graph: Yahoo! Instant Messenger

      • Actions log: Yahoo! Movies

        • Action = user u rated movie m at time t

      • Joined through common user identifiers

    • Started from the Yahoo! Instant Messenger subgraph of “most active” users (110M nodes) and 21M ratings from Yahoo! Movies.

    • Ended with 217.5K nodes, 221.4K edges and 1.8M ratings.


    Data Characteristics: Connected Components

    Total 46,650 connected components

    Giant component: 94K users (43.2% of connected users)


    Leaders vs. Tribe Leaders

    π – threshold on time

    σ – threshold on number of actions

    ψ – threshold on number of influenced users



    Number of Leaders Found

    π – threshold on time

    σ – threshold on number of actions

    ψ – threshold on number of influenced users



    Run-time

    π – threshold on time

    σ – threshold on number of actions

    ψ – threshold on number of influenced users



    Genuineness: an almost binary concept!


    Top-10 Tribe Leaders w.r.t. Tribe Size

    • Tribe leaders exhibit high confidence.

    • Tribe leaders with low genuineness were found to be dominated by other tribe leaders present in the tables.

    • We found many users who act as leaders in many actions without being tribe leaders.


    Related Work (1)

    • Identifying influential users

      • Domingos et al 2001, Richardson et al 2002, Kempe et al 2005

    • Identifying influential bloggers

      • Agarwal et al 2008

    • Identifying communities in Social Networks

      • Hopcroft et al 2003, Kumar et al 2006, Backstrom et al 2006, Tantipathananandh et al 2007, Huang et al 2008, Friedland et al 2007


    Related Work (2)

    • Influence and Correlation in Social Networks

      • Aris Anagnostopoulos et al 2008

    • Revenue maximization

      • Hartline et al 2008

    • Near optimal sensor placement for outbreak detection

      • Leskovec et al 2007

    • Heat Diffusion Model

      • Hao Ma et al 2008 (CIKM)



    Conclusions

    • Proposed a framework based on frequent pattern mining for discovering leaders in social networks

    • Formally defined the problem of extracting leaders from a social graph and an actions log.

      • Various notions of leader, tribe leader

      • Their confidence and genuineness variants

    • Efficient algorithms for extracting leaders of various flavors

      • Just one pass over the actions log table

    • Demonstrated the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset

      • Yahoo! Messenger (social graph)

      • Yahoo! Movies ratings (actions log)


    Ongoing/Future Work

    • Gurumine: Pattern Mining System for Discovering Leaders and Tribes (Demo paper to appear in ICDE 2009)

    • Leadership Cube: What kind of leaders attract what kind of followers for what kind of actions?

    • Viral Marketing

    • Stronger notions of influence?



    Thanks!


    Backup


    Number of Leaders Found

    π – threshold on time

    σ – threshold on number of actions

    ψ – threshold on number of influenced users



    Additional Constraint: Confidence

    • Similarly to association rules, we can have a confidence measure for leaders.

    • Leadership confidence =

      # actions in which the user is a leader / # actions performed by the user

    • Example: Suppose Jack performed 10 actions and acted as a leader in 7 of them (i.e., more than ψ users followed within a short time); then conf(Jack) = 7/10
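The confidence ratio is small enough to state directly in code. The sketch below uses exact fractions so that the slide's 7/10 is reproduced as written; the function name and the choice to return `None` for users with no actions are assumptions.

```python
from fractions import Fraction


def leadership_confidence(actions_led, actions_performed):
    """Share of a user's actions in which the user acted as a leader."""
    if actions_performed == 0:
        return None  # undefined for users who performed no action
    return Fraction(actions_led, actions_performed)


conf_jack = leadership_confidence(actions_led=7, actions_performed=10)
```

A threshold on this value can then filter out users who lead occasionally but mostly act without attracting followers.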

