
Social Knowledge Dynamics: A Case Study on Modeling Wikipedia


The 10th HKBU-CSD Postgraduate Research Symposium. Social Knowledge Dynamics: A Case Study on Modeling Wikipedia. Presenter: Benyun Shi. Supervisor: Prof. Jiming Liu, Department of Computer Science, Hong Kong Baptist University. September 2009.


The 10th HKBU-CSD Postgraduate Research Symposium

Social Knowledge Dynamics: A Case Study on Modeling Wikipedia

Presenter: Benyun Shi


Supervisor: Prof. Jiming Liu

Department of Computer Science

Hong Kong Baptist University

September, 2009

Outline

  • Wikipedia and Social Knowledge Dynamics

  • Previous Work on Wikipedia

    • Degree distribution

    • Reciprocity and feedback loops

    • Motifs

  • Modeling Wikipedia’s Growth

    • A model about reference

    • A model about degree distribution

  • AOC-based Models

  • Conclusion

Wikipedia

  • Anyone can create, edit, and delete articles;

  • Some properties:

    • Each article can be treated as the collective “knowledge” of a group of users;

    • Users can exchange “knowledge” through “talk” pages;

    • Users with similar “knowledge” may form communities;

    • The underlying structure of an article may in turn influence users’ “knowledge”;


Social knowledge dynamics

Culture dynamics

Social Dynamics

Language dynamics

Crowd behaviors

… …

Social Knowledge Dynamics

“Knowledge is embodied in people gathered in communities and networks. The road to knowledge is via people, conversations, connections and relationships. Knowledge surfaces through dialog, all knowledge is socially mediated and access to knowledge is by connecting to people that know or know who to contact.”

-- Denham Grey

  • Social dynamics:

  • How a society of individuals reacts to internal and/or external changes;

  • Global patterns can emerge from the interactions of even simple individuals;

    • phase transitions, catastrophe, etc.

Difficulties and Motivations

  • Two levels of difficulty in explaining global emergence with local dynamic models:

    • Defining sensible and realistic microscopic models (complete data is needed);

    • Inferring macroscopic phenomena from the microscopic dynamic models;

  • Motivations for studying Wikipedia

    • The formation of Wikipedia is a kind of social knowledge dynamics (if articles are treated as knowledge);

    • Complete data is available for download:

      • Articles, categories, images and multimedia, talk pages, redirect and broken links, and so on.

Related Analysis on Wikipedia

  • Treat Wikipedia as a complex network in which articles are the nodes and hyperlinks are the links.

Degree distribution

Reciprocity and feedback loops


Degree distribution

  • Degree: the number of articles that link into (in-degree) or out of (out-degree) a given article

  • Meanings of degree:

    • A link between two articles reflects some relation between their contents;

    • Articles with high degree are more likely to be common knowledge;
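The degree definitions above can be made concrete with a small sketch. It assumes the hyperlink network is available as a list of (source, target) article pairs; the function names and the toy link list are hypothetical and are not taken from the slides or from any Wikipedia dump format.

```python
from collections import Counter

def degree_counts(links):
    """Return (in_degree, out_degree) Counters for a list of (source, target) links."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in links:
        out_deg[src] += 1  # one more link leaving src
        in_deg[dst] += 1   # one more link arriving at dst
    return in_deg, out_deg

def degree_histogram(degree):
    """Map each degree value k to the number of articles having that degree."""
    return Counter(degree.values())

# Hypothetical toy link list, not real Wikipedia data.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]
in_deg, out_deg = degree_counts(links)
in_hist = degree_histogram(in_deg)
```

Plotting such a histogram on log-log axes is the usual first check for a heavy-tailed degree distribution. Note that articles with degree zero never appear in the Counters, so they are absent from the histogram.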

Observations: Scale-free

The out-degree distribution of the Japanese Wikipedia. (adapted from Fig. 3 in ref[1].)

The in-degree distribution of the Japanese Wikipedia. (adapted from Fig. 3 in ref[1].)


[1] V. Zlatic, M. Bozicevic, H. Stefancic, and M. Domazet, “Wikipedias: Collaborative Web-based Encyclopedias as Complex Networks”, Physical Review E 74, 016615, 2006.

Scale-free and Phase Transition

“The theory of phase transitions told us loud and clear that the road from disorder to order is maintained by the powerful forces of self-organization and is paved by power laws. It told us that power laws are the patent signatures of self-organization in complex systems….”

--Barabasi AL. 2002. Linked: The new science of networks. Cambridge: Perseus Publishing.

Similar results are observed for Wikipedias in other languages.

What is the fundamental principle behind this similar type of growth? Preferential attachment?

Reciprocity and Feedback Loops

  • Reciprocal links are links pointing from node i to node j for which there also exists a link pointing from node j to node i.

    Reciprocity quantifies the mutual “exchange” between two articles.

  • Feedback loops: a loop of directed links that starts from and ends at the same node.

The density of the links
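Both notions on this slide can be sketched in a few lines. This is a minimal sketch under stated assumptions: `reciprocity` computes the plain fraction of links that have a reverse link, which is simpler than a density-corrected reciprocity measure, and `on_feedback_loop` only tests whether a node lies on some directed loop. Both function names are hypothetical.

```python
def reciprocity(links):
    """Fraction of directed links (i, j) for which the reverse link (j, i) also exists."""
    link_set = {(i, j) for i, j in links if i != j}  # ignore self-links
    if not link_set:
        return 0.0
    reciprocal = sum(1 for (i, j) in link_set if (j, i) in link_set)
    return reciprocal / len(link_set)

def on_feedback_loop(links, node):
    """True if some directed path leaves `node` and returns to it."""
    successors = {}
    for i, j in links:
        successors.setdefault(i, set()).add(j)
    stack, seen = list(successors.get(node, ())), set()
    while stack:  # depth-first search from the node's successors
        current = stack.pop()
        if current == node:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(successors.get(current, ()))
    return False
```

For the toy links A→B, B→A, B→C, C→D, two of the four links are reciprocated, and only A and B lie on a feedback loop.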

Feedback Loops in Ecological System

The ecological study [2] observed that the number of feedback loops in the species network is correlated with the system's lifetime.

State before crash

Normal State


[2] R. Mehrotra, V. Soni, and S. Jain. Diversity sustains an evolving network. Journal of the Royal Society Interface, 6(38):793–799, 2009.

Motifs

Feedback loops

Triadic subgraphs


  • Motifs [3] are small subgraphs of networks, which are used to systematically study similarity in the local structure of networks.
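As one concrete instance of a triadic motif, the sketch below counts directed three-node feedback loops (i → j → k → i). It is a brute-force illustration for small toy networks only, and the function name is hypothetical. A full motif analysis as in [3] would also compare each count against randomized networks, which is not shown here.

```python
from itertools import permutations

def count_three_cycles(links):
    """Count directed 3-node feedback loops i -> j -> k -> i, each counted once."""
    link_set = set(links)
    nodes = sorted({n for link in link_set for n in link})
    count = 0
    for i, j, k in permutations(nodes, 3):
        # Requiring i to be the smallest label keeps one rotation per cycle.
        if i < j and i < k and (i, j) in link_set and (j, k) in link_set and (k, i) in link_set:
            count += 1
    return count
```

The triple loop is cubic in the number of nodes, so this only suits small examples. Real Wikipedia-scale networks need a dedicated subgraph-enumeration method.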


Do Wikipedias in different languages share the same functions?

Is the formation of social knowledge driven by the same fundamental function?


[3] R. Milo, S. Itzkovitz, N. Kashtan, R. Levitt, S. Shen-Orr, I. Ayzenshtat, M. Sheffer, and U. Alon. Superfamilies of evolved and designed networks. Science, 303(5663):1538–1542, 2004.

Modeling Reference Growth

At each time step t,

A number of entries and r_t references are added;

The references are distributed among all entries following a probability

Frequency distribution of the expected and actual number of references added each month to each article (adapted from Fig. 3b in [4]).

The expected number of references added to entry i at time t is
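One way to make the reference-distribution step concrete is sketched below. It assumes, as a labeled assumption rather than the formula from the slide or from [4], that each entry's share of the new references is proportional to the number of references it already has. The function name and the uniform fallback are hypothetical.

```python
def expected_new_references(current_refs, r_t):
    """Expected references added to each entry in one step.

    Assumption (not confirmed by the source): shares are proportional
    to each entry's current number of references.
    """
    total = sum(current_refs.values())
    if total == 0:
        # No entry has references yet: fall back to a uniform share (an assumption).
        return {entry: r_t / len(current_refs) for entry in current_refs}
    return {entry: r_t * k / total for entry, k in current_refs.items()}

# Hypothetical example: two entries with 3 and 1 existing references, 8 new references.
shares = expected_new_references({"a": 3, "b": 1}, 8)
```

Under this assumption, well-referenced entries attract more new references, which is the "rich get richer" behavior that growth models of this kind are meant to capture.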


[4] D. Spinellis and P. Louridas. The collaborative organizationof knowledge. Communications of the ACM, 51(8):68–73,


Modeling about Degree Distribution

  • The model consists of two steps:

    • A new node t attaches to the network with m outgoing links. The probability that a given link attaches to some node s is proportional to the in-degree ki(s) of node s.

    • For every new link, with probability r a reciprocal link from node s back to node t is also formed.
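The two steps can be sketched as a small simulation. This is a minimal sketch, not the implementation from [5]: the function name `grow_network`, the restriction to an integer `m`, the "+1" smoothing that lets nodes with zero in-degree be chosen, and the merging of repeated targets are all assumptions made here.

```python
import random

def grow_network(n_nodes, m, r, seed=0):
    """Grow a directed network by preferential attachment with reciprocal links."""
    rng = random.Random(seed)
    in_degree = {0: 0}  # start from a single node with no links
    links = set()
    for t in range(1, n_nodes):
        existing = list(in_degree)
        # "+1" smoothing (an assumption) so nodes with in-degree 0 can be chosen.
        weights = [in_degree[s] + 1 for s in existing]
        # Sampling is with replacement; duplicates are merged, so a new
        # node may end up with fewer than m distinct outgoing links.
        targets = set(rng.choices(existing, weights=weights, k=min(m, len(existing))))
        in_degree[t] = 0
        for s in targets:
            links.add((t, s))          # step 1: new node t links to s
            in_degree[s] += 1
            if rng.random() < r:       # step 2: reciprocal link with probability r
                links.add((s, t))
                in_degree[t] += 1
    return links, in_degree

links, in_degree = grow_network(n_nodes=200, m=3, r=0.2)
```

With `r = 0` every link points from a newer node to an older one, so the reciprocal step is what allows new articles to gain in-degree as soon as they are created.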

Comparison of in-degree distributions. Chosen parameters are t = 94094, m = 16.75, r = 0.18. (adapted from [5])


[5] Vinko et al. Model of Wikipedia growth based on information exchange via reciprocal arcs. Physics and Society, 2009.

Insufficiency (1)

  • The two models above seem to reflect preferential attachment as a principle behind scale-free phenomena

    • However, other researchers have shown that selective removal [6] can also produce a scale-free distribution.

  • The models for scale-free networks can be divided into two groups:

    • Scale-free as the result of an optimization or phase-transition process

    • Scale-free as the result of a growth model, such as preferential attachment.


[6] M. Salathé, R. M. May, and S. Bonhoeffer, “The Evolution of Network Topology by Selective Removal”, Journal of the Royal Society Interface, 2(5): 533–536, 2005.

Insufficiency (2)

  • The two models above are based on simple stochastic processes

    • However, the real Wikipedia is driven by social dynamics, including user-user, user-group, and group-group interactions, rather than by simple stochastic processes.

AOC-based Models

  • Components of Autonomy-Oriented Computing

    • Entities;

    • Interactions;

    • Behavioral rules;

    • Self-organization:

      • Collective regulations;

      • Aggregations;



Interact for a page;


Self-organized groups;




Used to solve large-scale, dynamically evolving, and/or highly distributed computational problems.



Questions

  • What are the fundamental behavioral rules (e.g., explicit/implicit optimization objectives) of entities that form the global patterns of Wikipedia?

  • How do entities self-organize during the evolution of Wikipedia?

  • Do these rules and this self-organization reflect the formation rules of social knowledge and social organization?

Three Possible Directions-1

  • Wikipedia as a system

    • As a collaborative system based solely on users’ spontaneous actions, what drives its birth, boom, and death?

  • Existing results on ecosystems:

    • Large randomly assembled ecosystems tend to be less stable as they increase in complexity,

      • the complexity is measured by the connectance and the average interaction strength between species.

    • The typical lifetime of the system increases with the diversity of its components.
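The first ecosystem result above is often stated as a stability criterion for random community matrices. As a hedged reminder (the slide does not name its source; the form below is the classic result attributed to Robert May, and the symbols S, C, and σ are notation introduced here), a large randomly assembled system with S species, connectance C, and interaction-strength standard deviation σ is almost surely stable only if:

```latex
\sigma \sqrt{S\,C} < 1
```

Increasing the number of species, the connectance, or the interaction strength therefore pushes such a system toward instability.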

Three Possible Directions-2

  • Topic evolution on Wikipedia

    • We can treat topic evolution on Wikipedia as a result of user-to-user interactions, or even of interactions among groups of users (as in cultural dynamics).

  • Existing work:

    • Static data mining; (Time windows for dynamic data mining)

    • Semantic/content analysis; (What is the driving force?)

Three Possible Directions-3

  • User community dynamics on Wikipedia

    • Each user may be associated with multiple articles;

    • For each article, there will be multiple users acting on it;

    • Communities may emerge from entities’ local interactions, which may change over time;

  • Existing work

    • Modularity

    • Linkage-based measurements cannot reflect multiple relationships

Three Levels of Consideration

  • Describing the structure

    • Such as food webs in ecosystems, neural networks in organisms, etc.

  • How the structure influences what happens in the system

    • For example, food-web structure affects the population dynamics of species;

  • How the structure changes over time

    • For example, species going extinct will alter the food-web structure

Conclusion

  • The relation of Wikipedia and social knowledge; (Motivations)

  • Current studies on Wikipedia and their insufficiencies;

  • The possibility of adopting AOC-based modeling;

  • Three research directions;