Mining Advisor-Advisee Relationships from Research Publication Networks

Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu. SIGKDD, 2010. Presented by Hung-Yi Cai, 2010/12/29.



  1. Mining Advisor-Advisee Relationships from Research Publication Networks Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu SIGKDD, 2010 Presented by Hung-Yi Cai 2010/12/29

  2. Outline • Motivation • Objectives • Previous study • Methodology • Problem Formulation • Assumption and Framework • Preprocessing • TPFG Model • Model Learning • Experiments • Conclusions • Comments

  3. Motivation • Information networks contain abundant knowledge about relationships among people or entities. • Discovering those relationships can benefit many applications, such as expert finding and research community analysis.

  4. Objectives To propose a time-constrained probabilistic factor graph model (TPFG) that takes a research publication network as input and models the advisor-advisee relationship mining problem with a joint likelihood objective function, and to design an efficient learning algorithm that optimizes this objective function.

  5. Previous study • This work differs from existing studies in Relation Mining and Relational Learning. • Relation Mining: these studies mainly apply text mining and language processing techniques to text data and structured data, including web pages, user profiles, and literature corpora. • Relational Learning: these studies address classification when objects or entities are presented in multiple relations.

  6. Methodology • Problem Formulation • Assumption and Framework • Preprocessing • TPFG Model • Model Learning

  7. Problem Formulation

  8. Assumption and Framework Assumption 1 is based on commonsense knowledge about advisor-advisee relationships. Assumption 2 determines that all authors in the network have a strict order defined by the possible advising relationships.

  9. Preprocessing The purpose of preprocessing is to generate the candidate graph H′ and reduce the search space.

  10. Preprocessing • Then we have the following rule. • Author aj is not considered to be ai’s advisor if one of the following conditions holds:

  11. TPFG Model By modeling the network as a whole, this step can incorporate both structural information and temporal constraints, and can better analyze the relationships among individual links.

  12. TPFG Model The graph is composed of two kinds of nodes: variable nodes and function nodes.

  13. Model Learning To maximize the objective function and compute the ranking score for each edge in the candidate graph H′, this step needs to infer the marginal maximal joint probability on TPFG, according to Eq. (10). Sum-product + junction tree: a general algorithm called sum-product computes marginal functions on a factor graph by message passing.
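The sum-product idea on slide 13 can be illustrated with a minimal sketch. It is not the TPFG inference procedure from the paper: the chain-shaped factor graph, the binary variables, and the potential tables `unary`, `pair01`, and `pair12` are hypothetical values chosen only to show that message passing recovers the same marginal as brute-force enumeration.

```python
from itertools import product

# Hypothetical toy factor graph (an assumption, not from the paper):
# a chain x0 - f01 - x1 - f12 - x2 over binary variables.
unary = [[0.6, 0.4], [0.5, 0.5], [0.2, 0.8]]  # unary[i][state of x_i]
pair01 = [[0.9, 0.1], [0.3, 0.7]]             # pair01[x0][x1]
pair12 = [[0.8, 0.2], [0.4, 0.6]]             # pair12[x1][x2]


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def marginal_x1_sum_product():
    """Marginal of x1 from the two messages arriving at x1."""
    # Message from factor f01 to x1: x0 is summed out.
    m_left = [sum(unary[0][a] * pair01[a][b] for a in range(2))
              for b in range(2)]
    # Message from factor f12 to x1: x2 is summed out.
    m_right = [sum(unary[2][c] * pair12[b][c] for c in range(2))
               for b in range(2)]
    # The marginal is the product of the local factor and incoming messages.
    return normalize([unary[1][b] * m_left[b] * m_right[b] for b in range(2)])


def marginal_x1_brute_force():
    """Marginal of x1 by enumerating all 2^3 joint assignments."""
    totals = [0.0, 0.0]
    for a, b, c in product(range(2), repeat=3):
        totals[b] += (unary[0][a] * unary[1][b] * unary[2][c]
                      * pair01[a][b] * pair12[b][c])
    return normalize(totals)
```

Calling `marginal_x1_sum_product()` and `marginal_x1_brute_force()` should give the same two probabilities up to floating-point error; the message-passing version only ever sums over one neighbor at a time.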

  14. Model Learning New TPFG Inference Algorithm. The original sum-product or max-sum algorithm runs into difficulty because each node must wait for all but one of its messages to arrive.

  15. Model Learning • After the two phases of message propagation, we can collect the two messages on any edge and obtain the marginal function. • The improved message propagation is still separated into two phases. • Phase 1: the messages sent_i, passed from each node to its ascendants, are generated in a similar order as before. • Phase 2: the messages recv_i, returned from the ascendants, are stored in each node.
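The two-phase schedule on slide 15 can be sketched on a small tree. This is generic tree sum-product, not the paper's TPFG algorithm: the tree in `parent`, the potentials `phi` and `psi`, and the binary state space are hypothetical, and the arrays `sent` and `recv` only borrow the slide's names for the upward and downward messages.

```python
from itertools import product

STATES = (0, 1)
# Hypothetical toy tree (an assumption): parent[i] is the parent of node i,
# node 0 is the root, and every parent has a smaller index than its children.
parent = [None, 0, 0, 1]
phi = [[0.7, 0.3], [0.4, 0.6], [0.5, 0.5], [0.1, 0.9]]  # phi[i][state]
psi = [[0.8, 0.2], [0.3, 0.7]]  # psi[child_state][parent_state]
n = len(parent)
children = [[c for c in range(n) if parent[c] == i] for i in range(n)]


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def two_phase_marginals():
    sent = [None] * n                       # sent[i]: message from i up to its parent
    recv = [[1.0, 1.0] for _ in range(n)]   # recv[i]: message from the parent down to i
    # Phase 1: children before parents, each node sends one message upward.
    for i in reversed(range(1, n)):
        below = [phi[i][s] for s in STATES]
        for c in children[i]:
            below = [below[s] * sent[c][s] for s in STATES]
        sent[i] = [sum(below[s] * psi[s][t] for s in STATES) for t in STATES]
    # Phase 2: parents before children, each node stores the returned message.
    for i in range(1, n):
        p = parent[i]
        outside = [phi[p][t] * recv[p][t] for t in STATES]
        for c in children[p]:
            if c != i:
                outside = [outside[t] * sent[c][t] for t in STATES]
        recv[i] = [sum(psi[s][t] * outside[t] for t in STATES) for s in STATES]
    # Combine the stored messages at every node to obtain its marginal.
    marginals = []
    for i in range(n):
        belief = [phi[i][s] * recv[i][s] for s in STATES]
        for c in children[i]:
            belief = [belief[s] * sent[c][s] for s in STATES]
        marginals.append(normalize(belief))
    return marginals


def brute_force_marginals():
    """Reference marginals by enumerating every joint assignment."""
    totals = [[0.0, 0.0] for _ in range(n)]
    for x in product(STATES, repeat=n):
        weight = 1.0
        for i in range(n):
            weight *= phi[i][x[i]]
            if parent[i] is not None:
                weight *= psi[x[i]][x[parent[i]]]
        for i in range(n):
            totals[i][x[i]] += weight
    return [normalize(t) for t in totals]
```

Because both phases follow a fixed node order, no node has to wait for messages at run time: after one upward and one downward sweep, every node holds what it needs to compute its own marginal.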

  16. Experiments Experiment Step

  17. Experiments • Accuracy: Effect of rules in TPFG • Using R3 as the filtering rule and YEAR2 as the graduation year estimation method.

  18. Experiments • Accuracy: Effect of network structure • Using DFS with a bounded maximal depth d from the given set of nodes, denoted as DFS=d, we can obtain closures with controlled depth for a given set of authors to test.
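The depth-bounded DFS closure on slide 18 can be sketched as follows. The adjacency-dictionary format, the function name `bounded_dfs_closure`, and the sample author names are assumptions for illustration; the slides do not specify how the network is stored or which edges are followed.

```python
def bounded_dfs_closure(graph, seeds, max_depth):
    """Return all nodes within max_depth hops of any seed node.

    graph: dict mapping a node to an iterable of neighboring nodes
           (hypothetical representation of the network).
    """
    best_depth = {}  # shallowest depth at which each node has been reached
    stack = [(seed, 0) for seed in seeds]
    while stack:
        node, depth = stack.pop()
        if depth > max_depth:
            continue
        # Skip nodes already reached at the same or a shallower depth;
        # re-expand a node if this path reaches it with a smaller depth.
        if best_depth.get(node, max_depth + 1) <= depth:
            continue
        best_depth[node] = depth
        for neighbor in graph.get(node, ()):
            stack.append((neighbor, depth + 1))
    return set(best_depth)


# Hypothetical coauthor chain: a - b - c - d.
coauthors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
closure = bounded_dfs_closure(coauthors, ["a"], 2)
```

Tracking the shallowest depth per node matters for a depth-limited DFS: without it, a node first reached along a long path could be marked visited and its neighbors within the bound would be missed.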

  19. Experiments • Accuracy: Effect of training data

  20. Experiments • Accuracy: Case study • TPFG can discover some interesting relations beyond the “ground truth” from a single source.

  21. Experiments Scalability Performance

  22. Experiments Application: Visualization of genealogy

  23. Experiments Application: Expert finding and Bole search

  24. Conclusions • This paper studied the mining of advisor-advisee relationships from a research publication network as an attempt to discover hidden semantic knowledge in information networks. • It proposed a Time-constrained Probabilistic Factor Graph (TPFG) model to integrate local intuitive features in the network; results on the DBLP data sets demonstrate the effectiveness of the proposed approach.

  25. Comments • Advantages • The TPFG model can mine relationships between advisors and advisees from the research publication network. • Applications • Relationship Mining
