
Information Market based Decision Fusion in Multi-Classifier Combination

This research explores a method for designing a combiner that is effective, adaptable, and does not assume cooperation among base-classifiers. It utilizes information markets and parimutuel betting to achieve efficient decision fusion. The experiments and results demonstrate the effectiveness of the proposed method.


Presentation Transcript


  1. Information Market based Decision Fusion in Multi-Classifier Combination Johan Perols, Kaushal Chari and Manish Agrawal Information Systems/Decision Sciences, College of Business Administration, University of South Florida jperols@coba.usf.edu kchari@coba.usf.edu magrawal@coba.usf.edu

  2. Overview • Multi-Classifier Combination • Information Market based Fusion • Experiments and Results • Contributions and Future Research Opportunities

  3. Multi-Classifier Combination Related Research Our Method Experiment Results Conclusion

  4. Research Objectives To design a combiner method that: • is relatively effective; • can adapt to changes in ensemble composition and to changes in relative base-classifier accuracy; and • does not assume that base-classifiers are cooperative.

  5. Information Markets Definition • Information markets are markets designed specifically for the purpose of information aggregation. • Equilibrium prices provide information about a specific situation, future event, or object of interest.

  6. Information Markets - Parimutuel Betting • Empirical field research (Weitzman 1965; Ali 1977 and 1979; Asch, et al. 1982; and Thaler 1992) and experiments (Plott, et al. 2003) support the efficient market hypothesis in these betting markets. • Parimutuel betting originated in horse betting • odds for horse i = (total amount bet on all horses) / (total amount bet on horse i) • the market's likelihood assessment that horse i will win the race is given by 1 / (odds for horse i) • recursive relation between odds and amount bet • winners divide the total amount bet in proportion to the sums they have wagered individually on the winning horse
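The parimutuel relations on this slide can be sketched directly; the pool amounts below are made-up illustration values:

```python
def parimutuel_odds(bets):
    """Odds for each horse: odds_i = (total bet on all horses) / (bet on horse i)."""
    pool = sum(bets)
    return [pool / b for b in bets]

def implied_probabilities(odds):
    """The market's likelihood assessment for horse i is 1 / odds_i."""
    return [1.0 / o for o in odds]

# Hypothetical pool: 60, 30 and 10 units bet on three horses.
odds = parimutuel_odds([60.0, 30.0, 10.0])
probs = implied_probabilities(odds)
```

Note the recursion mentioned on the slide: each new bet changes the pool totals, which changes the odds, which in turn changes how agents want to bet. The implied probabilities always sum to one, since each is just a horse's share of the pool.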

  7. IMF Algorithm • Binary search iterations with odds setting and agent betting • Optimization assumption: only objects classified as positive are investigated • Agents maximize their individual utility according to the parimutuel betting mechanism

  8. IMF – Binary Search • Pl – lower probability boundary • Pu – upper probability boundary • Qtj – total bets on event j • Qt – total bets on all events • Otj – market odds for outcome j • ε – binary search stopping parameter
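Using the notation above, the binary-search loop can be sketched for a two-outcome market: the interval [Pl, Pu] brackets the market probability of the positive outcome, each iteration quotes odds from the midpoint and checks them against the bets they induce, and the loop stops when Pu − Pl < ε. The agent model here is a simplifying assumption (agents bet a fixed budget in proportion to their beliefs, as in the Lemma 1 case later in the talk), not the paper's exact procedure:

```python
def imf_binary_search(agent_probs, budgets, eps=1e-6):
    """Sketch of the IMF binary search over the market probability p.

    agent_probs -- each agent's probability estimate for outcome 1
    budgets     -- each agent's betting budget (w_it + m)
    """
    p_l, p_u = 0.0, 1.0
    while p_u - p_l > eps:
        p = (p_l + p_u) / 2          # candidate probability; odds O_t1 = 1/p
        # Q_t1: bets on outcome 1 under the assumed belief-proportional betting
        q1 = sum(b * pr for pr, b in zip(agent_probs, budgets))
        q = sum(budgets)             # Q_t: total bets on both outcomes
        if q1 / q > p:               # pool says outcome 1 is underpriced
            p_l = p                  # raise the quoted probability
        else:
            p_u = p
    return (p_l + p_u) / 2

p_star = imf_binary_search([0.8, 0.6, 0.7], [10.0, 10.0, 10.0])
```

Under this simplified betting rule the bets do not depend on the quoted odds, so the search converges to the budget-weighted average belief (0.7 in this toy run); with odds-sensitive agents the same loop would hunt for the fixed point of the odds/bets recursion.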

  9. IMF – Optimization • Pl – lower probability boundary • Pu – upper probability boundary • Qtj – total bets on event j • Qt – total bets on all events • Otj – market odds for outcome j

  10. IMF – Get Cutoff Independent cycle that, given the prior cutoff Cj, determines the future Cj: 1) Use Cj for the next n transactions. 2) Use Cj + k for transactions n to 2n. 3) Use Cj − k for transactions 2n to 3n. 4) Set Cj to the cutoff that generated the highest net benefit.
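One cycle of the cutoff search above can be sketched as follows; `net_benefit(cutoff, window)` is a hypothetical stand-in for the combiner's realized net benefit on a window of transactions:

```python
def update_cutoff(c, k, n, transactions, net_benefit):
    """One cycle of the dynamic-cutoff search: evaluate C, C+k and C-k on
    three consecutive windows of n transactions each, then keep whichever
    cutoff produced the highest net benefit.
    """
    candidates = [c, c + k, c - k]
    windows = [transactions[i * n:(i + 1) * n] for i in range(3)]
    benefits = [net_benefit(cand, win) for cand, win in zip(candidates, windows)]
    return candidates[benefits.index(max(benefits))]

# Toy check: a benefit curve peaked at cutoff 0.5 pulls the cutoff upward.
toy_benefit = lambda cutoff, window: -(cutoff - 0.5) ** 2
new_c = update_cutoff(0.3, 0.1, 10, list(range(30)), toy_benefit)  # keeps 0.4
```

Because each candidate is scored on a different window, the cycle adapts to drift in the data without needing labeled training data up front, at the cost of some noise when consecutive windows differ.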

  11. IMF – Classify Object Based on the final odds Oft1 and cutoff C1 determined previously, classify the object: if (1/Oft1 ≥ C1) then classify t as j=1 else classify t as j=2 end if
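The decision rule is small enough to state as code. The market's implied probability that t is positive is 1/Oft1, and it is compared against the cutoff C1; the ≥ direction is an assumption where the transcript lost the comparison operator:

```python
def classify(final_odds_1, cutoff_1):
    """Assign transaction t to class 1 (positive) when the market's implied
    probability 1/O_ft1 reaches the cutoff C1, otherwise to class 2.
    """
    return 1 if 1.0 / final_odds_1 >= cutoff_1 else 2

classify(1.25, 0.5)  # implied probability 0.8, at or above the cutoff
classify(4.0, 0.5)   # implied probability 0.25, below the cutoff
```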

  12. IMF – Take Agent Bets Agents determine their bets qit1 and qit2 by solving P1: Z1 = max qit1, qit2 of pit1 Ui(wit + m − qit1 − qit2 + qit1 Ot1) + pit2 Ui(wit + m − qit1 − qit2 + qit2 Ot2) s.t. qitj, sit ≥ 0 • Ui – agent i's utility function • Otj – current market odds • pitj – agent i's probability estimate • wit – agent i's wealth • m – periodic endowment • k – multiplier that determines the house-enforced maximum bet km

  13. IMF – Take Agent Bets Assuming a log utility function and forcing the agents to bet everything allowed (they can still hedge their bets), P1 is transformed to P2: Z2 = max qit1, qit2 of pit1 ln(wit + m − qit1 − qit2 + qit1 Ot1) + pit2 ln(wit + m − qit1 − qit2 + qit2 Ot2) s.t. qitj ≥ 0

  14. IMF – Take Agent Bets Lemma 1: If wit + m ≤ km then the optimal bets of agent i while classifying t are: q*itj = pitj(wit + m), j ∈ J. Lemma 2: If wit + m > km then the optimal bets of agent i while classifying t are: q*it1 = pit1 km + ait and q*it2 = pit2 km + ait
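Lemma 1's closed form can be checked numerically: with log utility and the full budget wit + m bet (the wit + m ≤ km case), expected log wealth is maximized by belief-proportional bets, independent of the odds. A quick grid check with toy numbers:

```python
from math import log

def expected_log_utility(q1, q2, w, m, p1, o1, o2):
    """Expected log wealth for bets (q1, q2) at odds (o1, o2), as in P2."""
    residual = w + m - q1 - q2          # zero when the whole budget is bet
    return (p1 * log(residual + q1 * o1)
            + (1 - p1) * log(residual + q2 * o2))

# Toy agent: wealth 8, endowment 2, belief p1 = 0.7, odds 1.4 and 3.5.
w, m, p1, o1, o2 = 8.0, 2.0, 0.7, 1.4, 3.5
q1_star, q2_star = p1 * (w + m), (1 - p1) * (w + m)  # Lemma 1: 7.0 and 3.0

# No other full-budget split q1 + q2 = w + m does better on a fine grid.
grid_best = max(expected_log_utility(q1, (w + m) - q1, w, m, p1, o1, o2)
                for q1 in [i * (w + m) / 1000 for i in range(1, 1000)])
```

This is the Kelly-style property of log utility that makes the mechanism attractive for fusion: each agent's bet fractions directly reveal its probability estimates, whatever odds the market quotes.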

  15. IMF – Distribute Payout On investigating transaction t′ that occurred v transactions before the now current transaction t, if t′ is found to be a member of the positive class j = 1, then agent i's wealth is updated as follows: wit = wit + Qt−v(qi,t−v,1/Qt−v,1) If t′ is found to be a member of the negative class j = 2, then agent i's wealth is updated as follows: wit = wit + Qt−v(qi,t−v,2/Qt−v,2)
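The payout update can be sketched as follows, with Qt−v the total pool for transaction t−v and the winner pool Qt−v,j summed from the agents' bets; the pool amounts are made-up illustration values:

```python
def distribute_payout(wealth, bets_on_winner, total_pool):
    """Parimutuel payout for one investigated transaction t-v: each agent's
    wealth grows by its share of the total pool Q_{t-v}, in proportion to
    its bet q_{i,t-v,j} on the winning outcome j.
    """
    winner_pool = sum(bets_on_winner)   # Q_{t-v,j}, total bets on the winner
    return [w + total_pool * (q / winner_pool)
            for w, q in zip(wealth, bets_on_winner)]

# Hypothetical pool of 100 units, of which 60 were bet on the winning class
# by two agents (40 and 20); the remaining 40 backed the losing class.
new_wealth = distribute_payout([10.0, 5.0], [40.0, 20.0], 100.0)
```

Because the whole pool is redistributed among the winners, accurate agents accumulate wealth over time and, through the budget wit + m, gain influence over future odds, which is how IMF adapts to changes in relative base-classifier accuracy.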

  16. Research Objectives & Design Solutions Research Objective Design Solution

  17. Experiment – Main Objectives • Does IMF outperform AVG, MAJ and WAVG? • Are these results sensitive to the: • number of agents in the ensemble; • cost-benefit ratio; • dataset size; • dataset positive ratio; or • dataset average agent accuracy?

  18. Experiment – Variables Variable Function Description

  19. Experiment – Datasets

Dataset | Instances | Attributes | Classes | Positive Rate | Average Agent Accuracy
Adult | 32,561 | 14 | 2 | 24.1% | 82.00%
Wisconsin Breast Cancer | 699 | 110 | 2 | 34.5% | 94.02%
Contraceptive Method Choice | 1,473 | 10 | 2 | 57.3% | 66.12%
Horse Colic | 368 | 22 | 2 | 37.0% | 81.68%
Covertype (class 1 & 2) | 10,000 | 11 | 2 | 72.9% | 86.60%
Covertype (class 3 & 4) | 10,395 | 11 | 2 | 6.8% | 95.17%
Covertype (class 5 & 6) | 10,009 | 11 | 2 | 66.9% | 96.17%
Australian Credit Approval | 690 | 15 | 2 | 55.5% | 83.65%
German Credit | 1,000 | 20 | 2 | 30.0% | 72.02%
Pima Indians Diabetes | 768 | 8 | 2 | 34.9% | 73.53%
Thyroid Disease | 3,772 | 5 | 2 | 7.7% | 92.66%
Labor | 57 | 16 | 2 | 64.9% | 80.06%
Mushrooms | 8,124 | 5 | 2 | 48.2% | 91.19%
Sick | 3,772 | 12 | 2 | 6.9% | 96.84%
Spambase | 4,601 | 58 | 2 | 39.4% | 87.55%
Splice-junction Gene Sequences | 3,190 | 20 | 2 | 51.9% | 62.31%
Waveform | 3,345 | 40 | 2 | 49.4% | 86.91%

  20. Experiment – Base-Classifiers • 22 base-classifiers from Weka generated decision output for each dataset • 10-fold cross validation • Standard parameter settings Base-Classifiers: ADTree, BayesNet, ConjunctiveRule, DecisionStump, DecisionTable, IBk, J48, JRip, KStar, LMT, LWL, MultilayerPerceptron, NaiveBayes, NBTree, NNge, OneR, PART, RandomForest, RBFNetwork, Ridor, SimpleLogistic, SMO

  21. Experiment – Implementation Facts • 2,000,900 base-classifier decisions (5,350 average dataset size × 17 datasets × 22 base-classifiers) generated in Weka. • Combiner methods created using Visual Basic and LINGO. • Each of the 17 datasets combined 100 times (4 combiner methods × 11 levels of number of agents + 4 combiner methods × 14 cost-benefit ratios) for a total of 1,700 observations. • A total of 9,095,000 (100 dataset combinations × 5,350 average dataset size × 17 datasets) aggregated decisions generated.

  22. Results – Main Experiment Does IMF outperform AVG, MAJ and WAVG? YES! Main effect p=0.001: IMF > AVG (p=0.0030), IMF > MAJ (p=0.0011), IMF > WAVG (p=0.0001). Are these results sensitive to any moderator? Number of agents p=0.739; cost-benefit ratio p=0.821; dataset size p=0.922; dataset average agent accuracy p=0.304; dataset positive ratio p<0.001. Only the dataset positive ratio moderates the Net Benefit comparison between IMF and AVG, WAVG, MAJ.

  23. Combiner Method * Positive Ratio Follow-Up Analysis [Figure: lnNetBenefit (3.25–4) vs. positive_ratio (10–70) for AVG, IMF, MAJ and WAVG]

  24. Combiner Method * Positive Ratio Follow-Up Analysis Net Benefit IMF vs. Net Benefit AVG, WAVG, MAJ by positive ratio: high p=0.2849; middle p=0.1843; low p=0.0010. [Figure: lnNetBenefit LS means (3.1–4) per method (AVG, IMF, MAJ, WAVG) for high, middle and low positive ratio]

  25. Results – Additional Analyses • Dynamic cutoff algorithm performance: Net Benefit MAJ-DYN vs. Net Benefit MAJ-OPT, MAJ-RAN, p=0.0006. • Impact of investigation time lags on IMF performance: IMF(lag time) p=0.8908. • Impact of binary search stopping parameter (min search space) on IMF performance: interactions p>0.12, main effect p=0.32. • Impact of house-enforced max bet on IMF performance: average-agent accuracy interaction p=0.03, other interactions p>0.05.

  26. Results Dynamic Cutoff – Follow-Up Analysis

  27. Contributions IMF is a novel combiner method that: • outperforms AVG, MAJ and WAVG; • can adapt to changes in ensemble composition and in relative base-classifier accuracy; and • does not assume that the base-classifiers are cooperative. Dynamic Cutoff algorithm that: • outperforms randomly selected cutoffs; and • does not perform significantly differently from optimal cutoffs.

  28. Future Research… • Implement and test IMF in different MCC architectures, e.g. bagging, boosting, etc. • Evaluate other agent behaviors (human biases, different utility functions, belief updating, etc.).

  29. Questions?

  30. Appendix

  31. Experiment – Additional Objectives • How do investigation time lags impact the performance of IMF? • What is the impact of selecting different IMF parameter values? • How good is the dynamic cut-off algorithm?

  32. Results – Additional Analyses • Dynamic cut-off algorithm performance: Net Benefit MAJ-DYN vs. Net Benefit MAJ-OPT, MAJ-RAN, p=0.0006. • Impact of investigation time lags on IMF performance: IMF(lag time) p=0.8908. • Result sensitivity to cost-based retraining: Training Type * Combiner Method p=0.13. • Impact of binary search stopping parameter (min search space) on IMF performance: interactions p>0.12, main effect p=0.32. • Impact of house-enforced max bet on IMF performance: average-agent accuracy interaction p=0.03, other interactions p>0.05.

  33. Results – Additional Analyses [Figure: z-value of log net benefit (y-axis) vs. treatment level of the k factor (x-axis); markers indicate average-agent accuracy (%)]

  34. Results – Cost Based Retraining

  35. Experiment - Overview • 2,000,900 base-classifier decisions (5,350 average dataset size × 17 datasets × 22 base-classifiers) generated in Weka. • Combiner methods created in Visual Basic and LINGO. • Each dataset combined 100 times (4 combiner methods × 11 levels of number of agents + 4 combiner methods × 14 cost-benefit ratios).

  36. Multi-Classifier Combination Combiner Method "Weaknesses": • performance is not "perfect" • training data requirement • stable base-classifier performance • static ensemble composition • assuming cooperative base-classifiers • no integration with coordination mechanism

  37. Information Market based Fusion IMF Algorithm – Key Ideas • Odds are optimized based on agent bets in an iterative fashion until optimal or near-optimal odds are found. • The odds are used to determine object class membership. • Winnings are distributed based on the parimutuel system in both implementations.

  38. Information Market based Fusion Behavioral Models (Plott, et al. 2003) • Decision Theory Private Information (DTPI) • Bets based on private information only. • No learning (private information not updated). • Competitive Equilibrium Private Information (CEPI) • Bets based on private information and market prices. • No learning.
