Discovering Outlier Filtering Rules from Unlabeled Data

Authors: Kenji Yamanishi & Jun-ichi Takeuchi. Advisor: Dr. Hsu. Graduate: Chia-Hsien Wu.


Presentation Transcript


  1. Discovering Outlier Filtering Rules from Unlabeled Data Authors: Kenji Yamanishi & Jun-ichi Takeuchi Advisor: Dr. Hsu Graduate: Chia-Hsien Wu

  2. Outline • Motivation • Objective • Introduction • Main Framework • Outlier Detector – SmartSifter • Rule Generator – DL-ESC/DL-SC • Experimentation – Network intrusion detection • Experimental Results • Conclusion • Opinion

  3. Motivation • The problem of the SmartSifter’s accuracy • The SmartSifter cannot find the general pattern of the identified outliers

  4. Objective • Improving the accuracy of SmartSifter • Discovering new patterns that outliers in a specific group may have in common

  5. Introduction • Developing SmartSifter: an on-line outlier detection algorithm • Improving the power of SmartSifter by combining it with a supervised learning method

  6. Main Framework [Figure: framework diagram; surviving label text: "A New Rule Classifier L"]

  7. Outlier Detector – SmartSifter (SS) • Using a probabilistic (Gaussian mixture) model: p(x, y) = p(x) p(y|x) • Employing on-line discounting learning algorithms (SDLE and SDEM) to update the model • Giving a score to each datum

  8. Outlier Detector – SmartSifter (SS) (cont.) • SDLE algorithm: an on-line discounting variant of the Laplace-law-based estimation algorithm • SDEM algorithm: an on-line discounting variant of the incremental EM (Expectation Maximization) algorithm
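
As a rough illustration of the discounting idea behind SDLE, the sketch below keeps exponentially decayed counts for one categorical attribute and applies Laplace smoothing. The class name, the `discount` and `beta` parameters, and their default values are assumptions made for illustration, not the authors' implementation; SDEM (the Gaussian mixture part) is not covered.

```python
from collections import defaultdict


class DiscountedLaplaceEstimator:
    """Online categorical estimator with exponential forgetting (SDLE-style sketch)."""

    def __init__(self, num_cells, discount=0.01, beta=0.5):
        self.num_cells = num_cells  # number of distinct categorical values (assumed known)
        self.discount = discount    # forgetting rate r, 0 < r < 1 (assumed value)
        self.beta = beta            # Laplace smoothing constant (assumed value)
        self.counts = defaultdict(float)
        self.total = 0.0            # discounted number of observations so far

    def update(self, cell):
        """Decay all past counts, then add the new observation."""
        decay = 1.0 - self.discount
        for key in self.counts:
            self.counts[key] *= decay
        self.counts[cell] += 1.0
        self.total = decay * self.total + 1.0

    def prob(self, cell):
        """Smoothed probability of a cell under the current discounted counts."""
        numerator = self.counts.get(cell, 0.0) + self.beta
        return numerator / (self.total + self.num_cells * self.beta)


# Hypothetical stream of "service" values.
estimator = DiscountedLaplaceEstimator(num_cells=3)
for _ in range(50):
    estimator.update("http")
estimator.update("smtp")
```

Because older counts shrink at every step, recent data dominates the estimate, which is what lets an on-line model follow a drifting source.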

  9. Outlier Detector – SmartSifter (SS) (cont.) • Outputting the dataset sorted by score • A high score indicates a high likelihood that the datum is an outlier
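
A minimal sketch of this output step, assuming each datum already carries a score. The function name, the record names, and the score values are made up for illustration.

```python
def rank_by_score(scored_data):
    """Return (datum, score) pairs sorted from highest to lowest score."""
    return sorted(scored_data, key=lambda pair: pair[1], reverse=True)


# Hypothetical (datum, score) pairs; real scores would come from the detector.
scored = [("conn_a", 1.2), ("conn_b", 7.9), ("conn_c", 0.4)]
ranked = rank_by_score(scored)
```

The data at the head of `ranked` are the strongest outlier candidates.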

  10. Rule Generator – DL-ESC/DL-SC • Using a stochastic decision list • Employing the principle of minimizing extended stochastic complexity or stochastic complexity

  11. Rule Generator – DL-ESC/DL-SC (cont.) • Form of a stochastic decision list: if ξ makes t1 true, then μ = v1 with probability p1; else if ξ makes t2 true, then μ = v2 with probability p2; ...; else μ = vs with probability ps
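
The if / else-if structure above can be sketched as a first-match lookup over an ordered list of rules. This only shows how such a list is evaluated, not how DL-ESC or DL-SC learns it; the function name, the conditions, the labels, and the probabilities are hypothetical.

```python
from typing import Callable, List, Tuple

# One rule: (condition t_i, value v_i, probability p_i).
Rule = Tuple[Callable[[dict], bool], str, float]


def apply_decision_list(
    rules: List[Rule], default: Tuple[str, float], record: dict
) -> Tuple[str, float]:
    """Return (value, probability) from the first rule whose condition holds."""
    for condition, value, prob in rules:
        if condition(record):
            return value, prob
    return default  # the final "else" branch of the list


# Hypothetical rules over connection records (not taken from the slides).
rules = [
    (lambda r: r["service"] == "http" and r["src_bytes"] > 10000, "outlier", 0.9),
    (lambda r: r["duration"] > 300, "outlier", 0.7),
]
default = ("normal", 0.95)

result = apply_decision_list(
    rules, default, {"service": "http", "src_bytes": 50000, "duration": 2}
)
```

Rule order matters: a record that satisfies several conditions is labeled by the earliest one only.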

  12. Experimentation – Network intrusion detection • The purpose of the experiment is to detect intrusions without using the intrusion labels

  13. Experimentation – Dataset (cont.) • Using the KDD Cup 1999 dataset, prepared for network intrusion detection • Using 13 attributes for DL-ESC • Using four attributes for SmartSifter (service, duration, src_bytes, dst_bytes) • Only "service" is categorical • Transforming each continuous value x as y = log(x + 0.1), using the natural logarithm (base e) • Generating five datasets S0, S1, S2, S3, S4
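
The transformation y = log(x + 0.1) can be written directly. In this sketch the function name and the sample values are assumptions; the 0.1 offset comes from the slide and keeps the result finite when an attribute value is zero.

```python
import math


def log_transform(x, offset=0.1):
    """Natural-log transform y = log(x + offset) for a non-negative value x."""
    return math.log(x + offset)


# Hypothetical src_bytes values, including a zero.
values = [0, 10, 1000]
transformed = [log_transform(v) for v in values]
```

The transform compresses large byte counts and durations, so a few very large values do not dominate the continuous part of the model.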

  14. Experimentation – Dataset (cont.)

  15. Experimentation – Illustration by an Example (cont.) [Figure panels: First Rule – S1; Update Rule – S1; Update Rule – S2]

  16. Experimental Results • SS: SmartSifter • R&S: Rule and SmartSifter (this framework) • S0 is used as the training set to construct a filtering rule; each of S1, S2, S3, and S4 is used for testing

  17. Experimental Results (cont.)

  18. Experimental Results (cont.)

  19. Conclusion • This new framework has two features • Improving the power of SmartSifter • Helping the user discover a general pattern

  20. Opinion • Making the detection process more effective and more understandable • This framework can be applied to other fields
