This paper presents DiVo, a distance-based voting method for one-class classification, aimed at cases where the number of training examples is limited. It introduces a boundary rule based on the distances between a candidate class member and the training samples, measured with either the Euclidean or the Mahalanobis metric. Because DiVo trains on positive samples only, it avoids generating artificial negative training data, whose fidelity is a concern. The method is evaluated by simulating the one-class classification problem on multi-class datasets, using three-fold cross-validation and the f-measure.
Paper study - 2012/12/22 DiVo: A Novel Distance based Voting Method for One Class Classification Merter Sualp and Tolga Can IEEE Transactions on Knowledge and Data Engineering
OUTLINE Introduction Method of DiVo Results Discussion
Introduction • When there are sufficiently many training examples, the estimation error of the model tends to decrease. • However, it may not be possible or feasible to collect sufficient training data, especially in application domains. • Negative training data is artificially generated. • fidelity • Methods that are specifically developed to work with one-class training datasets bypass the artificial data generation stage.
Method of DiVo - training • Boundary Rule: • The distance from a class member q to a training sample t is less than or equal to the farthest distance from t to any of the other training samples. • distance metric: Euclidean / Mahalanobis • Euclidean distance:
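The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the boundary rule and the two metrics can be written in their standard textbook forms. The symbols T (training set), n (number of attributes) and S (covariance matrix, assumed here to be estimated from the training samples) are notation introduced for illustration and may differ from the paper's.

```latex
% Boundary rule for a class member q and a training sample t in T
d(q, t) \le \max_{t' \in T,\; t' \ne t} d(t, t')

% Euclidean distance over n attributes
d_E(q, t) = \sqrt{\sum_{i=1}^{n} (q_i - t_i)^2}

% Mahalanobis distance with covariance matrix S
d_M(q, t) = \sqrt{(q - t)^{\top} S^{-1} (q - t)}
```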
Method of DiVo - training • A set T of k positive samples • A set B of k boundary distances
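A minimal sketch of how the set B could be derived from the set T, assuming the Euclidean metric and at least two training samples. The function name `boundary_distances` and the toy samples are hypothetical and are not taken from the paper.

```python
import math

def boundary_distances(samples):
    """For each training sample t, return the farthest Euclidean
    distance from t to any of the other training samples."""
    return [
        max(math.dist(t, other) for j, other in enumerate(samples) if j != i)
        for i, t in enumerate(samples)
    ]

# Hypothetical toy set T of k = 3 positive samples with two attributes each.
T = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]

# B holds one boundary distance per training sample, in the same order as T.
B = boundary_distances(T)
```

Training in this sketch stores only the k samples and their k boundary distances; no negative samples are involved.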
Method of DiVo - testing • threshold “ratio”: degree of overlap and density • ratio • The label y of sample x • 0: negative, 1: positive
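The exact voting formula is not preserved on this slide, so the following is only a sketch under a stated assumption: each training sample casts a vote for x when x satisfies that sample's boundary rule, and x is labeled positive when the fraction of votes reaches the threshold `ratio`. The function name `divo_vote`, the Euclidean metric, and the toy values are all hypothetical.

```python
import math

def divo_vote(x, samples, bounds, ratio):
    """Return 1 (positive) or 0 (negative) for sample x.

    Assumption: a training sample t votes for x when the distance from
    x to t does not exceed t's boundary distance; x is positive when
    the share of votes is at least `ratio`.
    """
    votes = sum(1 for t, b in zip(samples, bounds) if math.dist(x, t) <= b)
    return 1 if votes / len(samples) >= ratio else 0

# Hypothetical training samples and their precomputed boundary distances.
T = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
B = [5.0, 5.0, math.sqrt(18)]

y_near = divo_vote((1.0, 1.0), T, B, ratio=0.5)    # inside every boundary
y_far = divo_vote((10.0, 10.0), T, B, ratio=0.5)   # outside every boundary
```

Under this assumption, a higher `ratio` demands agreement from more training samples and so yields a tighter decision region.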
Results We simulate the one-class classification problem by selecting each class in turn as the target class, treating the remaining classes as non-targets, and using a subset of the target-class samples during the training phase.
Results • preprocessing • Normalize all attribute values to the range [0, 1]. • 3-fold cross-validation • 1 fold for training, 2 folds for testing • f-measure
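A short sketch of the preprocessing and scoring steps listed on this slide. It assumes per-attribute min-max scaling and the balanced f-measure (F1, the harmonic mean of precision and recall); the slide does not say which f-measure variant was used. The function names and sample values are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every attribute (column) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant column has no spread, so it is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def f_measure(y_true, y_pred):
    """Balanced f-measure for labels 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical raw attribute values and predictions.
scaled = min_max_normalize([[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]])
score = f_measure([1, 1, 0, 0], [1, 0, 1, 0])
```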
Discussion DiVo-M
Discussion DiVo-E
Discussion Biomed Data (blue)
Discussion Dermatology Data (yellow)