
Towards “Unbiased” Ranking of Scientific Literature



  1. Towards “Unbiased” Ranking of Scientific Literature Speaker: Hai Zhuge Authors: Xiaorui Jiang, Xiaoping Sun and Hai Zhuge Knowledge Grid Research Group, Institute of Computing Technology, Chinese Academy of Sciences, China ACM CIKM 2012, Hawaii, USA

  2. Outline • Introduction • Definition and source of “ranking bias” • Analysis of “ranking bias” • Method • Intra-network ranking • Inter-network ranking • Results • Dataset and Benchmarks • Recommendation intensity on papers and researchers • Recommendation sensitivity on papers • Venue ranking • Conclusion H. Zhuge, ICT, CAS

  3. Ranking Biases • Ranking can help find important papers and researchers • PageRank benefits old papers, while HITS benefits new papers [Diagrams: two small citation networks over papers p0–p6 illustrating the two opposite biases] H. Zhuge, ICT, CAS

  4. Example of Ranking Bias • Experiments on ACL Anthology • Paper J99-2002, a 1999 article in Computational Linguistics, is ranked 4th by PageRank [Tables: top-ranked papers under PageRank and under HITS] • PageRank: only 3 papers out of 19 are after 2000! • HITS: only 7 papers out of 27 are before 2000, and none from the 1980s! H. Zhuge, ICT, CAS

  5. Time Distribution [Chart: time distributions of top-ranked papers, with peaks above 40% and 45% in some periods, 0% in others, and two quite similar distributions] • Our method guarantees much “fairer” play between different time periods H. Zhuge, ICT, CAS

  6. Outline H. Zhuge, ICT, CAS • Introduction • Definition and source of “ranking bias” • Analysis of “ranking bias” • Method • Intra-network ranking • Inter-network ranking • Results • Dataset and Benchmarks • Recommendation intensity on papers and researchers • Recommendation sensitivity on papers • Venue ranking • Conclusion

  7. Data Model • Using metadata information only • 6 inter-network transition matrices • Paper-Researcher Network: PR, RP (= PR^T) • Paper-Venue Network: PV, VP (= PV^T) • Researcher-Venue Network: RV, VR (= RV^T) [Diagram: a toy metadata example with papers p1–p4, researchers r1–r5, and venues v1: p1; v2: p2, p3; v3: p4, from which the Paper Influence Network (PIN), Researcher Influence Network (RIN) and Venue Influence Network (VIN) are derived] H. Zhuge, ICT, CAS
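A minimal sketch of how the six transition matrices might be assembled from paper metadata (the `papers` schema with `authors` and `venue` fields, and the row-normalization, are illustrative assumptions, not from the slides):

```python
import numpy as np

def row_normalize(M):
    """Make each row sum to 1; all-zero rows stay zero."""
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M), where=s > 0)

def build_transition_matrices(papers, researcher_ids, venue_ids):
    """Assemble the 6 inter-network transition matrices from metadata.

    `papers` is assumed to be a list of dicts with 'id', 'authors'
    (researcher ids) and 'venue' (venue id) fields.
    """
    p_idx = {p['id']: i for i, p in enumerate(papers)}
    r_idx = {r: i for i, r in enumerate(researcher_ids)}
    v_idx = {v: i for i, v in enumerate(venue_ids)}

    A_pr = np.zeros((len(p_idx), len(r_idx)))  # paper-researcher links
    A_pv = np.zeros((len(p_idx), len(v_idx)))  # paper-venue links
    A_rv = np.zeros((len(r_idx), len(v_idx)))  # researcher-venue links
    for p in papers:
        i, v = p_idx[p['id']], v_idx[p['venue']]
        A_pv[i, v] = 1.0
        for a in p['authors']:
            A_pr[i, r_idx[a]] = 1.0
            A_rv[r_idx[a], v] += 1.0  # weight = #papers r published at v

    # Each adjacency and its transpose give one direction of transition,
    # e.g. PR (paper -> researcher) and RP = PR^T (researcher -> paper).
    PR, RP = row_normalize(A_pr), row_normalize(A_pr.T)
    PV, VP = row_normalize(A_pv), row_normalize(A_pv.T)
    RV, VR = row_normalize(A_rv), row_normalize(A_rv.T)
    return PR, RP, PV, VP, RV, VR
```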

  8. Intra-Network Ranking • paut: authority score vector of papers • psnd: hub score vector of papers H. Zhuge, ICT, CAS
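A plausible form of the intra-network update on the PIN, assuming a randomized-HITS-style scheme in the spirit of the RHITS baseline (Ng et al., 2001) listed on slide 11; the jump probability `eps` and the normalization are assumptions:

```python
import numpy as np

def intra_pin_step(A, paut, psnd, eps=0.15):
    """One randomized-HITS-style update on the paper citation graph.

    A[i, j] = 1 if paper i cites paper j (assumed encoding). A uniform
    jump with probability eps keeps the iteration well behaved, as in
    randomized HITS; the paper's exact scheme may differ.
    """
    n = A.shape[0]
    u = np.ones(n) / n
    # Authority flows in from citing papers; hub score from cited papers.
    aut = eps * u + (1 - eps) * (A.T @ psnd)
    snd = eps * u + (1 - eps) * (A @ paut)
    return aut / aut.sum(), snd / snd.sum()
```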

  9. Iterative Ranking [Diagram: mutual reinforcement among the score vectors paut and psnd (PIN), rimp (RIN) and vprs (VIN). At step t, each vector propagates within its own network with a damped weight of the form (1-·)x(t) and sends the remaining mass to the other two networks in equal halves, with weights of the form (1-·)(1-·)x(t)/2] H. Zhuge, ICT, CAS
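A minimal sketch of one coupled iteration, assuming each score vector keeps a lam-weighted share of intra-network flow and receives the remainder split equally between the other two networks, matching the (1-·)(1-·)x(t)/2 pattern in the diagram; the parameter names and the simplified RIN/VIN walks are assumptions:

```python
import numpy as np

def mutualrank_step(Pc, R, V, PR, RP, PV, VP, RV, VR,
                    paut, psnd, rimp, vprs, lam=0.85):
    """One coupled iteration over PIN, RIN and VIN (sketch only).

    Pc, R, V: row-stochastic intra-network matrices of PIN, RIN, VIN;
    PR..VR: the six inter-network transition matrices of slide 7;
    lam: assumed weight of intra-network vs. inter-network flow.
    """
    def norm(v):
        return v / v.sum()

    half = (1 - lam) / 2
    # Papers: intra-PIN HITS-style flow plus reinforcement from rimp, vprs.
    aut = norm(lam * (Pc.T @ psnd) + half * (RP.T @ rimp) + half * (VP.T @ vprs))
    snd = norm(lam * (Pc @ paut) + half * (RP.T @ rimp) + half * (VP.T @ vprs))
    # Researchers: intra-RIN walk plus flow from paper scores and vprs.
    imp = norm(lam * (R.T @ rimp) + half * (PR.T @ (paut + psnd) / 2)
               + half * (VR.T @ vprs))
    # Venues: intra-VIN walk plus flow from paper scores and rimp.
    prs = norm(lam * (V.T @ vprs) + half * (PV.T @ (paut + psnd) / 2)
               + half * (RV.T @ rimp))
    return aut, snd, imp, prs
```

Iterating these updates until the four vectors converge yields the joint ranking of papers, researchers and venues.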

  10. Outline H. Zhuge, ICT, CAS • Introduction • Definition and source of “ranking bias” • Analysis of “ranking bias” • Method • Intra-network ranking • Inter-network ranking • Results • Dataset and Benchmarks • Recommendation intensity on papers and researchers • Recommendation sensitivity on papers • Venue ranking • Conclusion

  11. Experiment Setup • ACL Anthology Network (AAN) up to March 2011 • 18,041 papers; 14,386 researchers; 273 venues • Benchmarks: • 227 papers collected from the reading lists of natural language processing or computational linguistics courses at 15 top universities – BenchP • The corresponding researchers (authors) – BenchR1 • The top-100 cited researchers from AAN – BenchR2 • Compared algorithms • PageRank • RHITS (randomized version of HITS by Ng et al., 2001) • CoRank: a generalized algorithm utilizing PIN and the Researcher Collaboration Network (RCN) • FutureRank and P-Rank are not compared: they do not use RCN/RIN • MutualRank: the method of this paper H. Zhuge, ICT, CAS

  12. Glance at MutualRank Results • 19 papers out of 36 (16 from the 1990s and 3 from the 1980s) are before 2000 – this better reflects reality [Table: top-100 MutualRank results] H. Zhuge, ICT, CAS

  13. CoRank is Similar to PageRank [Tables: top-15 relevant papers under CoRank and PageRank; the two lists are quite similar] H. Zhuge, ICT, CAS

  14. Recommendation Intensity on Papers: RI(P)@k • P – the set of top-k papers returned by an algorithm • For each paper p ∈ P, an intensity score is computed (see the sketch after the next slide) H. Zhuge, ICT, CAS

  15. Recommendation Intensity on Researchers: RI(R)@k • R – the set of top-k researchers returned by an algorithm; researcher r ∈ R H. Zhuge, ICT, CAS
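The RI definitions above stop short of a formula; a hypothetical stand-in that captures the intent of measuring how strongly a top-k list recommends benchmark items (plain benchmark coverage here; any per-hit weighting in the actual measure is not reproduced):

```python
def recommendation_intensity(topk, benchmark):
    """Hypothetical stand-in for RI(P)@k / RI(R)@k: the fraction of the
    top-k list that hits the benchmark set (the paper's exact per-item
    formula is not shown on the slides above)."""
    bench = set(benchmark)
    return sum(1 for item in topk if item in bench) / len(topk)

# RI(P)@k against BenchP; RI(R)@k against BenchR1 or BenchR2:
# ri_p = recommendation_intensity(paper_ranking[:k], bench_p)
# ri_r = recommendation_intensity(researcher_ranking[:k], bench_r1)
```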

  16. Recommendation Intensity cont. • Performance under different settings • (a) MutualRank (BiRank) is consistently better than CoRank • (b) There is no big difference between using RIN and RCN • BiRank: using only PIN and RIN • TriRank: using PIN, RIN/RCN and VIN H. Zhuge, ICT, CAS

  17. Recommendation Sensitivity: RS(P,Y)@k • Among the papers published during the year range Y, the recommendation sensitivity of the top-k papers is RS(P,Y)@k • RS(P,Y)@k is an entropy-like measure that reflects how uniformly the results are distributed across different time periods • RS(P,Y)@k is thus also a measure of the degree of bias • The flatter the recommendation sensitivity curve (RSC) is, the less sensitive the ranking algorithm is • At a given point, the smaller RS is, the less sensitive the ranking algorithm is H. Zhuge, ICT, CAS
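A minimal sketch consistent with this description: compute the share of the top-k papers falling in each year range (the recommendation sensitivity curve) and its entropy as the uniformity summary; the exact normalization of RS is an assumption:

```python
import math

def rs_curve(topk_years, period_bins):
    """Share of top-k papers in each year range Y (a sketch of the
    recommendation sensitivity curve; the exact RS(P,Y)@k formula is
    assumed, not taken from the slides)."""
    k = len(topk_years)
    return {(s, e): sum(1 for y in topk_years if s <= y <= e) / k
            for (s, e) in period_bins}

def rsc_entropy(curve):
    """Entropy of the curve: maximal when the top-k papers spread
    uniformly over periods (a flat curve, i.e. the least time bias)."""
    return -sum(p * math.log(p) for p in curve.values() if p > 0)

# Example: rs_curve([1987, 1995, 2003, 2008],
#                   [(1980, 1989), (1990, 1999), (2000, 2010)])
```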

  18. Venue Ranking • MutualRank also returns meaningful and reasonable results compared to online ranking systems and human judgments [Table: correlation analysis against external venue rankings] H. Zhuge, ICT, CAS
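A sketch of the kind of correlation analysis referred to, assuming Spearman's rank correlation between MutualRank's venue ordering and an external venue ranking (the choice of Spearman's rho and the score inputs are illustrative assumptions):

```python
from scipy.stats import spearmanr

def venue_rank_correlation(mutualrank_scores, external_scores, venues):
    """Spearman's rho between two venue rankings; the slide's table is
    assumed to report this kind of agreement with online ranking
    systems and human judgments."""
    xs = [mutualrank_scores[v] for v in venues]
    ys = [external_scores[v] for v in venues]
    rho, pvalue = spearmanr(xs, ys)
    return rho, pvalue
```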

  19. Outline H. Zhuge, ICT, CAS • Introduction • Definition and source of “ranking bias” • Analysis of “ranking bias” • Method • Intra-network ranking • Inter-network ranking • Results • Dataset and Benchmarks • Recommendation intensity on papers and researchers • Recommendation sensitivity on papers • Venue ranking • Conclusion

  20. Conclusions • Proposed a balanced ranking on the complex network consisting of the paper network, the author network and the publishing venue network • Feasibility depends on the semantics of links H. Zhuge, ICT, CAS

  21. Ranking and Semantic Links • H. Zhuge. The Knowledge Grid: Toward Cyber-Physical Society. World Scientific Publishing Co., 2012. (Chapter 2 covers ranking of semantic link networks.) • H. Zhuge. Semantic linking through spaces for cyber-physical-socio intelligence: A methodology. Artificial Intelligence, 175 (2011) 988-1019. • H. Zhuge. Interactive Semantics. Artificial Intelligence, 174 (2010) 190-204. • H. Zhuge and J. Zhang. Topological Centrality and Its Applications. Journal of the American Society for Information Science and Technology, 61(9) (2010) 1824-1841. • H. Zhuge. Communities and Emerging Semantics in Semantic Link Network: Discovery and Learning. IEEE Transactions on Knowledge and Data Engineering, 21(6) (2009) 785-799. H. Zhuge, ICT, CAS

  22. Thanks! H. Zhuge, ICT, CAS
