Analyzing the Evolution of Scientific Citations & Collaborations: A Multiplex Network Approach


By Soumajit Pramanik




Guide: Dr. Bivas Mitra

Citation Network
  • Important Author-based Metrics:
      • In-Citation Count
      • H-Index etc.
Existing Works
  • Previous works on Citation Network mainly focused on:
    • Analyzing the evolution of citation and collaboration networks using “Preferential Attachment” [Barabasi et al. 2002]
    • Understanding the importance of community structure in citation networks [Chin et al. 2006]
    • Studying the evolution of research topics [He et al. 2009]
  • Previous works on Collaboration Network mainly focused on:
    • Adopting social network measures of degree, closeness, betweenness and eigenvector centrality to explore individuals’ positions in a given co-authorship network [Liu et al. 2005].
    • Analyzing the importance of the geographical proximity (same university/city/country etc.) of the collaborators [Divakarmurthy et al. 2011].

1. Existing studies focused on dominant factors such as preferential attachment.

2. None of these factors is self-tunable.

3. Does there exist any self-tunable factor (suppressed by the dominant factors) for boosting one's own citations/collaborations?


Advantage of attending conferences: face-to-face interactions with fellow scientists

Studying the influence of such interactions on the evolution of Citation and Collaboration Networks

  • The authors, whose talks are scheduled in the same technical session of a conference, have high chances of interaction.
  • In general, the first or the last author (or sometimes both) of a paper attends the conference.
Real Dataset:
  • Citations & Collaborations:
    • DBLP dataset for the Computer Science domain (1960-2008)
    • Around 1 million papers along with information about author, year, venue and references
    • 501060 authors tagged with continents (using Microsoft Academic Search)
    • 6559415 author-wise citation links

  • Interactions:
    • Two domains:
      • 1. Networking & Distributed
      • 2. Artificial Intelligence
    • Selected 3 leading conferences from each domain:
      • 1. INFOCOM, ICDCS, IPDPS from the first domain (1982-2007)
      • 2. AAAI, ICRA, ICDE from the second domain (1980-2008)

    • Collected session information from DBLP and program schedule of the conferences
Synthetic Dataset:
  • To regulate some important parameters and manifest their effects on the citation network
  • Followed statistics regarding articles per field per year, distribution of the number of authors in a paper and citation information from the real dataset
  • Only tunable parameter used: successful interaction rate p (p = 0.1, 0.2, …, 1)

  • Multiplex Network Construction (for each year t):
    • Citation Layer: directed author-wise citation links created at t, pointing to papers published before t (or sometimes in t)
    • Interaction Layer: undirected interaction links between authors presenting in the same sessions of the selected conferences in t
    • Co-authorship Layer: undirected collaboration links between two authors if they co-author a paper published in the chosen conferences in t
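The three layers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts (`id`, `year`, `authors`, `refs` for papers; `year`, `presenters` for sessions) and the function name `build_multiplex` are hypothetical, self-citations are dropped, and undirected links are stored as alphabetically sorted pairs.

```python
from collections import defaultdict
from itertools import combinations


def build_multiplex(papers, sessions):
    """Build per-year citation, interaction and co-authorship layers.

    papers   : list of dicts with keys 'id', 'year', 'authors', 'refs'
               (hypothetical record layout, not the DBLP schema)
    sessions : list of dicts with keys 'year', 'presenters'
    Returns a dict: year -> {'citation', 'interaction', 'coauthor'} edge sets.
    """
    by_id = {p["id"]: p for p in papers}
    layers = defaultdict(
        lambda: {"citation": set(), "interaction": set(), "coauthor": set()}
    )

    for p in papers:
        t = p["year"]
        # Citation layer: directed (citing author, cited author) links created in t,
        # pointing to papers published in or before t.
        for ref in p["refs"]:
            cited = by_id.get(ref)
            if cited is None or cited["year"] > t:
                continue
            for a in p["authors"]:
                for b in cited["authors"]:
                    if a != b:  # assumption: self-citations are ignored
                        layers[t]["citation"].add((a, b))
        # Co-authorship layer: undirected links, stored as sorted pairs.
        for a, b in combinations(sorted(set(p["authors"])), 2):
            layers[t]["coauthor"].add((a, b))

    # Interaction layer: undirected links between presenters of the same session.
    for s in sessions:
        for a, b in combinations(sorted(set(s["presenters"])), 2):
            layers[s["year"]]["interaction"].add((a, b))

    return layers


papers = [
    {"id": "p1", "year": 2000, "authors": ["A", "B"], "refs": []},
    {"id": "p2", "year": 2002, "authors": ["C"], "refs": ["p1"]},
]
sessions = [{"year": 2002, "presenters": ["C", "A"]}]
layers = build_multiplex(papers, sessions)
```

Keeping one edge set per layer and year makes it easy to ask later whether an interaction link in year t is followed by a citation link between the same authors.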

Evaluation Metrics:
  • 1. Conversion Rate (CR) for a conference C over a time-span T:

CR = (No. of “successful” interactions in C during T) / (Total no. of interactions in C during T)

The definition of the overall conversion rate follows as a simple extension.
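As a small worked example of the ratio above, the sketch below computes CR from a pair of counts. The function names are hypothetical, and the pooled "overall" rate (sum of successes over sum of interactions) is an assumption, since the slides do not spell out the extension. The counts are the ones reported for the two real datasets on the Conversion Rates slide.

```python
def conversion_rate(successful: int, total: int) -> float:
    """CR = successful interactions / total interactions for one conference and time-span."""
    if total <= 0:
        raise ValueError("total number of interactions must be positive")
    if not 0 <= successful <= total:
        raise ValueError("successful must lie between 0 and total")
    return successful / total


def overall_conversion_rate(counts) -> float:
    """Pooled CR over several (successful, total) pairs.

    Assumption: the overall rate pools the counts; the slides only say the
    definition "can be simply extended".
    """
    counts = list(counts)
    return conversion_rate(sum(s for s, _ in counts), sum(t for _, t in counts))


networking = conversion_rate(381, 13240)   # Networking domain counts
ai = conversion_rate(1291, 61896)          # AI domain counts
pooled = overall_conversion_rate([(381, 13240), (1291, 61896)])
print(f"Networking: {networking:.4f}, AI: {ai:.4f}, pooled: {pooled:.4f}")
```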

  • 2. Induced Citation Link Repetition (LR):

LR measures the no. of times each “induced” citation link appears within the recorded time


  • 3. Lifespan of Induced Citation (LS):

The lifespan of an “induced” citation is measured as the difference between the first and the last year in which the “induced” citation link appears.

  • 4. Rate of Appearance (RA):

The rate of appearance of an “induced” citation link is the ratio of its repetition count to its lifespan.

Hence RA = LR / LS

  • 5. Influence of successful interaction (IG):

The influence of a “successful” interaction is measured as the latency between the “successful” interaction and the formation of the first induced citation.
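The per-link metrics defined above (LR, LS, RA and IG) can be sketched as follows. The function name `link_metrics` and its input format are hypothetical: one list entry per appearance of the induced citation link, plus the year of the successful interaction. One case is not covered by the slides: when a link appears in a single year, LS is 0 and RA = LR / LS is undefined, so this sketch returns `None` for RA as an assumption.

```python
def link_metrics(citation_years, interaction_year):
    """Compute LR, LS, RA and IG for one induced citation link.

    citation_years   : years in which the induced link appears, one entry
                       per appearance (hypothetical input format)
    interaction_year : year of the "successful" interaction
    """
    years = sorted(citation_years)
    if not years:
        raise ValueError("an induced link must appear at least once")

    lr = len(years)                     # repetition count
    ls = years[-1] - years[0]           # last year minus first year
    ra = lr / ls if ls > 0 else None    # assumption: RA left undefined when LS = 0
    ig = years[0] - interaction_year    # latency until the first induced citation
    return {"LR": lr, "LS": ls, "RA": ra, "IG": ig}


repeated = link_metrics([2003, 2001, 2005, 2005], interaction_year=2000)
single = link_metrics([2004], interaction_year=2002)
```

Here `repeated` describes a link that appears four times between 2001 and 2005 after an interaction in 2000, while `single` shows the degenerate one-year case.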


Conversion Rates
  • Real Datasets:
    • Networking Domain: 2.87% (381 out of 13240) for [0.9,0.1] interaction probabilities
    • AI Domain: 2.1% (1291 out of 61896) for [0.9,0.1] interaction probabilities
  • Synthetic Dataset:
    • Drop near the final years due to the “boundary effect”

Heat Maps
  • Networking Domain:
    • 1. Overall Value increasing
    • 2. Distributed Contribution
  • AI Domain:
    • 1. Overall Value slowly
    • 2. Dominated Contribution

Induced Citation Repetition (LR) & Lifespan (LS)

  • In both domains: power-law distribution
  • A significant no. of “induced” citations repeat a high no. of times



AI Domain

  • A significant no. of “induced” citations have high RA values
  • Reasons can be a) low LS and/or b) high LR



[Figure panels: AI Domain, Networking Domain]

AI Domain:

1. High RA ratio results mainly from low LS.

2. A large no. of “induced” citations are missing from the right side of the plot due to the boundary effect.

Networking Domain:

1. Aperiodicity of repetitions of “induced” citations increases almost linearly with their lifespan.

2. High LR does not necessarily imply high standard deviation.

  • Influence Gap (IG)

AI Domain: All the highly repeating “induced” citations have a low “influence” gap.

  • Influence of Continents

AI Domain: Dominance of North America-North America

[Figure panels: AI Domain, Networking Domain; axis label: Citations To]

Conversion Rates

    • 1. Considered only collaborations between established researchers (having at least 1 publication).
    • 2. In the Networking domain, 2495 out of 8920 co-author links (28%) exhibit a past history of mutual citations!
    • 3. In the AI domain, 3211 out of 10192 (31.5%) are such “induced” co-author links.
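A rough sketch of how such “induced” co-author links could be flagged is given below. The function name and tuple layouts are hypothetical. The slides say only "a past history of mutual citations", so the strictness is left as a parameter: by default a citation in either direction before the collaboration year is enough, and `require_both_directions=True` demands citations both ways. Both readings are assumptions.

```python
def induced_coauthor_links(coauthor_links, citation_links, require_both_directions=False):
    """Return the co-author links preceded by citations between the two authors.

    coauthor_links : iterable of (author_a, author_b, year)
    citation_links : iterable of (citing_author, cited_author, year)
    """
    # Earliest year in which each directed citation link appears.
    first_cite = {}
    for src, dst, year in citation_links:
        key = (src, dst)
        if key not in first_cite or year < first_cite[key]:
            first_cite[key] = year

    induced = []
    for a, b, year in coauthor_links:
        ab = first_cite.get((a, b))
        ba = first_cite.get((b, a))
        a_cited_b = ab is not None and ab < year   # strictly before the collaboration
        b_cited_a = ba is not None and ba < year
        if require_both_directions:
            is_induced = a_cited_b and b_cited_a
        else:
            is_induced = a_cited_b or b_cited_a
        if is_induced:
            induced.append((a, b, year))
    return induced


citations = [("A", "B", 2001), ("B", "A", 2003), ("C", "D", 2006)]
coauthors = [("A", "B", 2004), ("C", "D", 2005), ("A", "C", 2005)]
loose = induced_coauthor_links(coauthors, citations)
strict = induced_coauthor_links(coauthors, citations, require_both_directions=True)
share = len(loose) / len(coauthors)  # fraction of induced links in this toy example
```

In the toy data only the A-B collaboration is preceded by citations; the C-D citation comes after the collaboration and therefore does not count.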
  • Induced Collaboration Repetition Count and Influence Gap

Networking Domain

Here also, all highly repeating “induced” collaborations have a small “influence” gap

AI Domain

Component Evolution

Networking Domain:

1. Giant component size 8152, second largest component size 63

2. 28% (167) of induced collaboration links took part in the merging process

AI Domain:

1. Giant component size 16203, second largest component size 41

2. 36.6% (263) of induced collaboration links took part in the merging process
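One way to identify the links that take part in merging is a disjoint-set (union-find) pass over the co-author links in chronological order. This is a sketch of that technique, not necessarily the procedure used in the study. The names are hypothetical, and it assumes a link counts as "merging" only when it joins two distinct components whose endpoints have both been seen before.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Join the components of a and b; return True if they were distinct."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def merging_links(links, induced):
    """links: co-author pairs in chronological order; induced: set of frozenset pairs."""
    ds = DisjointSet()
    merged = []
    for a, b in links:
        both_seen = a in ds.parent and b in ds.parent  # assumption: ignore brand-new authors
        joined = ds.union(a, b)
        if joined and both_seen and frozenset((a, b)) in induced:
            merged.append((a, b))
    roots = {ds.find(x) for x in list(ds.parent)}
    sizes = sorted((ds.size[r] for r in roots), reverse=True)
    return merged, sizes


links = [("A", "B"), ("C", "D"), ("B", "C"), ("A", "D"), ("E", "F")]
induced = {frozenset(("B", "C")), frozenset(("A", "D"))}
merged, sizes = merging_links(links, induced)
```

The returned `sizes` list gives the component sizes in decreasing order, so its first two entries correspond to the giant and second largest components.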


Conclusion & Future Plans
  • Interactions during conferences can be used as a tool to boost one's own citation count.
  • This can indirectly help in creating effective future collaborations, and the cycle goes on.
  • Over time, researchers are becoming more and more aware of the benefits of interacting with fellow researchers during conferences.
  • Need to check:
    • 1. Influence of the specific fields of interacting authors on the creation of “induced” citations
    • 2. Effects of “induced” citations/collaborations on the citation/collaboration degree distribution
    • 3. Modeling the dynamics
References
  • 1. A. L. Barabasi, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, and T. Vicsek: “Evolution of the social network of scientific collaborations”. Physica A: Statistical Mechanics and its Applications, 311(3-4):590-614, 2002.
  • 2. A. Chin and M. Chignell: “A social hypertext model for finding community in blogs”. In HYPERTEXT '06: Proceedings of the Seventeenth Conference on Hypertext and Hypermedia, pages 11-22, New York, NY, USA, 2006. ACM Press.
  • 3. Q. He, B. Chen, J. Pei, B. Qiu, P. Mitra, and C. L. Giles: “Detecting topic evolution in scientific literature: how can citations help?” In CIKM, pages 957-966, 2009.
  • 4. X. Liu, J. Bollen, M. L. Nelson, and H. Van de Sompel: “Co-authorship networks in the digital library research community”. Information Processing & Management, 41(6):1462-1480, 2005.
  • 5. P. Divakarmurthy, P. Biswas, and R. Menezes: “A temporal analysis of geographical distances in computer science collaborations”. In SocialCom/PASSAT, pages 657-660. IEEE, 2011.