
Heterogeneous Consensus Learning via Decision Propagation and Negotiation

Heterogeneous Consensus Learning via Decision Propagation and Negotiation. KDD’09, Paris, France. Jing Gao†, Wei Fan‡, Yizhou Sun†, Jiawei Han†. †University of Illinois at Urbana-Champaign; ‡IBM T. J. Watson Research Center. Information Explosion: not only at scale, but also at available sources!


Presentation Transcript


  1. Heterogeneous Consensus Learning via Decision Propagation and Negotiation. KDD’09, Paris, France. Jing Gao†, Wei Fan‡, Yizhou Sun†, Jiawei Han†. †University of Illinois at Urbana-Champaign; ‡IBM T. J. Watson Research Center

  2. Information Explosion: not only at scale, but also at available sources! [Figure labels: descriptions, videos, pictures, fan sites, reviews, blogs]

  3. Multiple Source Classification
     • Image categorization: images, descriptions, notes, comments, albums, tags, …
     • Like? Dislike?: movie genres, cast, director, plots, …; users’ viewing history, movie ratings, …
     • Research area: publication and co-authorship network, published papers, …

  4. Model Combination Helps! (supervised or unsupervised)
     • Some areas share similar keywords
     • People may publish in relevant but different areas
     • There may be cross-discipline co-operations
     [Figure labels: supervised, unsupervised]

  5. Motivation
     • Multiple sources provide complementary information
       • We may want to use all of them to derive a better classification solution
     • Concatenation of information sources is impossible
       • Information sources have different formats
       • May only have access to classification or clustering results due to privacy issues
     • Ensemble of supervised and unsupervised models
       • Combine their outputs on the same set of objects
       • Derive a consolidated solution
       • Reduce errors made by individual models
       • More robust and stable

  6. Consensus Learning

  7. Related Work
     • Ensemble of classification models
       • Bagging, boosting, …
       • Focus on how to construct and combine weak classifiers
     • Ensemble of clustering models
       • Derive a consolidated clustering solution
     • Semi-supervised (transductive) learning
     • Link-based classification
       • Use link or manifold structure to help classification
       • One unlabeled source
     • Multi-view learning
       • Construct a classifier from multiple sources

  8. Problem Formulation
     • Principles
       • Consensus: maximize agreement among supervised and unsupervised models
       • Constraints: label predictions should be close to the outputs of the supervised models
     • Objective function: a consensus term and a constraints term [equation not preserved in the transcript]
     • NP-hard!

  9. Methodology
     • Step 1: Group-level predictions
       • How to propagate and negotiate?
     • Step 2: Combine multiple models using local weights
       • How to compute local model weights?

  10. Group-level Predictions (1)
      • Groups
        • Similarity: percentage of common members
        • Initial labeling: category information from supervised models
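The group similarity on this slide can be illustrated with a small sketch. It assumes that "percentage of common members" means the Jaccard index (shared members divided by the union), which is one plausible reading and not confirmed by the transcript. The group names and member ids are hypothetical toy data.

```python
def group_similarity(a, b):
    """Share of common members between two groups.

    Assumption: implemented as the Jaccard index |a & b| / |a | b|.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy data: two models each split object ids 0..5 into groups.
groups = {
    "classifier:class0": {0, 1, 2},
    "classifier:class1": {3, 4, 5},
    "clustering:cluster0": {0, 1, 3},
    "clustering:cluster1": {2, 4, 5},
}

# Pairwise similarities between all distinct groups (the edges of the group graph).
names = sorted(groups)
similarity = {
    (g, h): group_similarity(groups[g], groups[h])
    for g in names
    for h in names
    if g != h
}
```

Groups produced by the same model never share members, so their similarity is zero; the informative edges connect groups from different models.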

  11. Group-level Predictions (2)
      • Principles
        • Conditional probability estimates should be smooth over the graph
        • Estimates should not deviate too much from the initial labeling
      [Figure: graph of labeled and unlabeled nodes, with example probability estimates [0.16 0.16 0.98] and [0.93 0.07 0]]
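The two principles above resemble standard label propagation on a graph, so a minimal sketch in that style may help. This is not the authors' exact update rule: the mixing weight `alpha`, the uniform start for unlabeled groups, and the fixed number of steps are all assumptions made for illustration.

```python
def propagate(similarity, initial, labeled, alpha=0.5, steps=50):
    """Smooth class-probability estimates over a group graph.

    similarity: s x s list of edge weights (zero diagonal).
    initial:    s x c list of starting probabilities per group.
    labeled:    indices of groups that carry an initial labeling.
    alpha:      assumed weight that pulls labeled groups back to their labels.
    """
    s, c = len(initial), len(initial[0])
    probs = [row[:] for row in initial]
    for _ in range(steps):
        updated = []
        for i in range(s):
            total = sum(similarity[i])
            if total > 0:
                # Smoothness: weighted average of the neighbors' estimates.
                est = [
                    sum(similarity[i][j] * probs[j][k] for j in range(s)) / total
                    for k in range(c)
                ]
            else:
                est = probs[i][:]
            if i in labeled:
                # Stay close to the initial labeling.
                est = [alpha * initial[i][k] + (1 - alpha) * est[k] for k in range(c)]
            updated.append(est)
        probs = updated
    return probs


# Hypothetical graph: groups 0 and 1 are labeled, groups 2 and 3 are not.
S = [
    [0.0, 0.0, 0.8, 0.1],
    [0.0, 0.0, 0.1, 0.8],
    [0.8, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.0, 0.0],
]
init = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.5, 0.5]]
result = propagate(S, init, labeled={0, 1})
```

In this toy graph, group 2 overlaps mostly with the group labeled as class 0, so its estimate leans toward class 0, and group 3 leans toward class 1.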

  12. Local Weighting Scheme (1)
      • Principles
        • If M makes a more accurate prediction on x, M’s weight on x should be higher
      • Difficulties
        • “Unsupervised” model combination: cannot use cross-validation

  13. Local Weighting Scheme (2)
      • Method
        • Consensus
          • To compute Mi’s weight on x, use M1, …, Mi-1, Mi+1, …, Mr as the true model, and compute the average accuracy
          • Use consistency in x’s neighbors’ label predictions between two models to approximate accuracy
        • Random
          • Assign equal weights to all the models
      [Figure labels: consensus, random]
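The consensus weighting can be sketched as follows. The transcript does not define "neighbors" or the consistency measure, so this sketch assumes that the neighbors of x under a model are the other members of x's group, and that consistency between two models is the Jaccard overlap of those neighborhoods. The function name and the toy assignments are hypothetical.

```python
def local_weights(assignments, x):
    """Per-model weights for object x, normalized to sum to 1.

    assignments: one dict per model, mapping object id -> group id.
    Assumption: consistency of two models at x is the Jaccard overlap of
    the groups that contain x in each model.
    """
    def neighborhood(model):
        group = model[x]
        return {obj for obj, g in model.items() if g == group}

    hoods = [neighborhood(m) for m in assignments]
    r = len(hoods)
    raw = []
    for i in range(r):
        # Treat every other model as the "true" model and average the agreement.
        scores = [
            len(hoods[i] & hoods[j]) / len(hoods[i] | hoods[j])
            for j in range(r)
            if j != i
        ]
        raw.append(sum(scores) / len(scores))
    total = sum(raw)  # positive, since x itself is in every neighborhood
    return [w / total for w in raw]


# Hypothetical toy data: models 1 and 2 agree, model 3 groups objects differently.
m1 = {0: "a", 1: "a", 2: "b", 3: "b"}
m2 = {0: "a", 1: "a", 2: "b", 3: "b"}
m3 = {0: "a", 1: "b", 2: "a", 3: "b"}
weights = local_weights([m1, m2, m3], x=0)
```

With these assignments, the two agreeing models each receive weight 0.4 on object 0 and the dissenting model receives 0.2; the "random" scheme on the slide would instead give each model 1/3.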

  14. Algorithm and Time Complexity
      • For each pair of groups: compute similarity and local consistency. O(s²)
      • Iterate f steps; for each group: compute probability estimates based on the weighted average of neighbors. O(fcs²)
      • For each example, for each model: compute local weights. O(rn)
      • Combine models’ predictions using local weights
      • Linear in the number of examples!
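The final combination step can be illustrated with a short sketch. It assumes the combination is a weighted average of each model's class-probability vector for a single example; the transcript only says "combine using local weights", so the exact form, the function name, and the numbers are illustrative.

```python
def combine(predictions, weights):
    """Weighted average of per-model class-probability vectors for one example.

    predictions: list of r probability vectors (one per model).
    weights:     list of r non-negative local weights for this example.
    """
    total = sum(weights)
    num_classes = len(predictions[0])
    return [
        sum(w * p[k] for w, p in zip(weights, predictions)) / total
        for k in range(num_classes)
    ]


# Hypothetical per-model predictions for one example over two classes.
preds = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]]

local = combine(preds, [0.4, 0.4, 0.2])   # models weighted by local agreement
equal = combine(preds, [1.0, 1.0, 1.0])   # equal weights for comparison
```

Each example costs one pass over the r models, which matches the O(rn) term on the slide for n examples.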

  15. Experiments-Data Sets
      • 20 Newsgroups
        • newsgroup message categorization
        • only text information available
      • Cora
        • research paper area categorization
        • paper abstracts and citation information available
      • DBLP
        • researchers’ area prediction
        • publication and co-authorship network, and publication content
        • conferences’ areas are known
      • Yahoo! Movies
        • user viewing interest analysis (favored movie types)
        • movie ratings and synopses
        • movie genres are known

  16. Experiments-Baseline Methods
      • Single models
        • 20 Newsgroups: logistic regression, SVM, K-means, min-cut
        • Cora: abstracts, citations (with or without a labeled set)
        • DBLP: publication titles, links (with or without labels from conferences)
        • Yahoo! Movies: movie ratings and synopses (with or without labels from movies)
      • Ensemble approaches
        • majority-voting classification ensemble
        • majority-voting clustering ensemble
        • clustering ensemble on all of the four models

  17. Experiments-Evaluation Measures
      • Classification accuracy
        • Clustering algorithms: map each cluster to the best possible class label (should get the best accuracy the algorithm can achieve)
      • Clustering quality
        • Normalized mutual information
        • Get a “true” model from the ground-truth labels
        • Compute the shared information between the “true” model and each algorithm
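Both measures can be sketched in a few lines. Two choices here are assumptions, since the transcript does not specify them: each cluster is mapped to its majority class label, and mutual information is normalized by the geometric mean of the two entropies.

```python
from collections import Counter
from math import log, sqrt


def best_mapping_accuracy(clusters, labels):
    """Accuracy after mapping every cluster to its most frequent true label."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(labels)


def entropy(values):
    """Shannon entropy (natural log) of a labeling."""
    n = len(values)
    return -sum((k / n) * log(k / n) for k in Counter(values).values())


def nmi(a, b):
    """Mutual information of two labelings divided by sqrt(H(a) * H(b))."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mutual = sum(
        (k / n) * log(k * n / (count_a[x] * count_b[y]))
        for (x, y), k in joint.items()
    )
    denom = sqrt(entropy(a) * entropy(b))
    return mutual / denom if denom > 0 else 0.0


# Hypothetical labelings for six objects.
truth = ["a", "a", "b", "b", "a", "b"]
found = [0, 0, 1, 1, 1, 0]
accuracy = best_mapping_accuracy(found, truth)
quality = nmi(found, truth)
```

NMI does not depend on how clusters are named, so a clustering that matches the ground truth up to relabeling scores 1, while a clustering that is statistically independent of the ground truth scores 0.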

  18. Empirical Results-Accuracy

  19. Empirical Results-NMI

  20. Empirical Results-DBLP data

  21. Empirical Results-Yahoo! Movies

  22. Empirical Results-Scalability

  23. Conclusions
      • Summary
        • We propose to integrate multiple information sources for better classification
        • We study the problem of consolidating outputs from multiple supervised and unsupervised models
        • The proposed two-step algorithm solves the problem by propagating and negotiating among multiple models
        • The algorithm runs in linear time
        • Results on various data sets show the improvements
      • Follow-up work
        • Algorithm and theory
        • Applications

  24. Thanks! • Any questions? http://www.ews.uiuc.edu/~jinggao3/kdd09clsu.htm jinggao3@illinois.edu Office: 2119B
