
A Hybrid Anomaly Detection Model using G-LDA

Bhavesh Kasliwal, Shraey Bhatia, Shubham Saini, I. Sumaiya Thaseen, Ch. Aswani Kumar. VIT University – Chennai.


Presentation Transcript


  1. A Hybrid Anomaly Detection Model using G-LDA. Bhavesh Kasliwal, Shraey Bhatia, Shubham Saini, I. Sumaiya Thaseen, Ch. Aswani Kumar. VIT University – Chennai

  2. Typical IDS: This work focuses mainly on intrusion identification.

  3. Architecture

  4. Attribute Selection
     “With more data, the simpler solution can be more accurate than the sophisticated solution.”
     • The selection process is based on the means and modes of the numeric attributes.
     • An attribute is kept when the mode values of the anomaly and normal patterns contrast, with the corresponding means inclined towards their modes.
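The selection rule on this slide can be sketched in a few lines. This is only one possible reading, not the authors' code: the function names `is_discriminative` and `select_attributes` are hypothetical, and the test for a mean being "inclined towards" its mode (closer to its own class mode than to the other class mode) is an assumption.

```python
from statistics import mean, multimode


def is_discriminative(anomaly_values, normal_values):
    """Assumed criterion: the class modes differ and each class mean
    lies closer to its own mode than to the other class's mode."""
    mode_a = multimode(anomaly_values)[0]  # first mode if there are ties
    mode_n = multimode(normal_values)[0]
    if mode_a == mode_n:
        return False  # no contrast between the modes
    mean_a = mean(anomaly_values)
    mean_n = mean(normal_values)
    return (abs(mean_a - mode_a) < abs(mean_a - mode_n)
            and abs(mean_n - mode_n) < abs(mean_n - mode_a))


def select_attributes(anomaly_columns, normal_columns):
    """Both arguments map an attribute name to its list of numeric values."""
    return [name for name in anomaly_columns
            if is_discriminative(anomaly_columns[name], normal_columns[name])]


# Made-up values for illustration only.
anomaly_columns = {"count": [511, 511, 511, 258, 135], "duration": [0, 0, 0, 0, 5]}
normal_columns = {"count": [1, 1, 2, 3, 1], "duration": [0, 0, 0, 12, 0]}
selected = select_attributes(anomaly_columns, normal_columns)
print(selected)
```

Here `count` is kept because its modes (511 versus 1) contrast and each mean stays near its own mode, while `duration` is discarded because both classes share the mode 0.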

  5. Selected Attributes: A strong contrast is visible between the trends of a selected attribute and a discarded attribute.

  6. Training Set Selection (using LDA)
     • Latent Dirichlet Allocation (LDA) is a generative model in which sets of observations are explained by unobserved groups, which account for why some parts of the data are similar.
     • LDA is applied separately to the anomaly and the normal packets to obtain 200 sets of 10 packets each. Each set is dominated by a particular packet type.

  7. Sample LDA Output
     Topic 0th:
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.25,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.25,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.25,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.25,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.25,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.26,0,0,0,0,anomaly
     0,icmp,eco_i,SF,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,1,1,0,1,0.26,0,0,0,0,anomaly
     0,tcp,telnet,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,125,13,1,1,0,0,0.1,0.06,0,255,0.03,0.07,0,0,1,1,0,0,anomaly
     0,tcp,uucp,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,135,9,1,1,0,0,0.07,0.06,0,255,0.04,0.07,0,0,1,1,0,0,anomaly
     0,tcp,vmnet,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,258,10,1,1,0,0,0.04,0.05,0,255,0.04,0.05,0,0,1,1,0,0,anomaly
     Topic 1th:
     0,tcp,finger,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,14,3,1,1,0,0,0.21,0.29,0,255,0.25,0.02,0.01,0,1,1,0,0,anomaly
     0,tcp,finger,S0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,246,20,1,1,0,0,0.08,0.06,0,255,0.08,0.07,0,0,1,1,0,0,anomaly
     0,icmp,ecr_i,SF,1032,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,511,511,0,0,0,0,1,0,0,255,0.55,0.01,0.55,0,0,0,0,0,anomaly
     0,icmp,ecr_i,SF,1032,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,511,511,0,0,0,0,1,0,0,255,0.56,0.02,0.56,0,0.01,0,0,0,anomaly
     0,icmp,ecr_i,SF,1032,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,511,511,0,0,0,0,1,0,0,255,0.6,0.01,0.6,0,0,0,0,0,anomaly
     0,icmp,ecr_i,SF,1032,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,511,511,0,0,0,0,1,0,0,255,0.64,0.02,0.64,0,0,0,0.02,0,anomaly
     0,icmp,ecr_i,SF,1032,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,511,511,0,0,0,0,1,0,0,255,0.64,0.02,0.64,0,0,0,0,0,anomaly
     ………………
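To check that each set is dominated by one packet type, output in this shape can be grouped by topic and summarized. The sketch below assumes one record per line under a `Topic Nth:` header, and that fields 1 and 2 of each record are the protocol and the service (the KDD Cup '99 column order). The helper names `parse_topics` and `dominant_type` are hypothetical, and the sample records are shortened for readability.

```python
import re
from collections import Counter

TOPIC_HEADER = re.compile(r"^Topic (\d+)th:$")


def parse_topics(text):
    """Group comma-separated records under their 'Topic Nth:' header."""
    topics = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = TOPIC_HEADER.match(line)
        if header:
            current = int(header.group(1))
            topics[current] = []
        elif current is not None and "," in line:
            topics[current].append(line.split(","))
    return topics


def dominant_type(records):
    """Return (protocol, service, share) for the most common packet type."""
    counts = Counter((r[1], r[2]) for r in records)
    (protocol, service), n = counts.most_common(1)[0]
    return protocol, service, n / len(records)


# Shortened, illustrative records; real records carry all attributes.
SAMPLE = """Topic 0th:
0,icmp,eco_i,SF,18,anomaly
0,icmp,eco_i,SF,18,anomaly
0,tcp,telnet,S0,0,anomaly
Topic 1th:
0,tcp,finger,S0,0,anomaly
0,icmp,ecr_i,SF,1032,anomaly
0,icmp,ecr_i,SF,1032,anomaly
"""

topics = parse_topics(SAMPLE)
for topic_id, records in topics.items():
    print(topic_id, dominant_type(records))
```

Lines without a comma, such as a trailing ellipsis, are skipped rather than treated as records.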

  8. Genetic Algorithm

  9. Genetic Algorithm
     • Applied to the normal and anomaly packets separately
     • A threshold value is used to assign a negative weight
     • Run for 3 generations
     • The top 3 values for anomaly and for normal packets are used
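The slide does not give the chromosome encoding or the fitness function, so the following is a loose sketch under stated assumptions rather than the authors' algorithm. It assumes an individual is a tuple of `k` candidate values for one attribute, that fitness rewards relative frequency in the class being modelled, and that a value whose frequency in the opposite class exceeds `threshold` receives a negative weight. Every name and default here (`evolve_values`, `penalty`, `pop_size`, `mutation_rate`, `seed`) is hypothetical.

```python
import random
from collections import Counter


def rel_freq(values):
    """Relative frequency of each distinct value."""
    total = len(values)
    return {v: c / total for v, c in Counter(values).items()}


def evolve_values(own_values, other_values, k=3, generations=3, pop_size=20,
                  threshold=0.05, penalty=1.0, mutation_rate=0.2, seed=0):
    """Search for up to k values typical of `own_values`; requires k >= 2."""
    rng = random.Random(seed)
    own, other = rel_freq(own_values), rel_freq(other_values)
    pool = list(own)  # candidate genes: distinct values seen in the own class

    def fitness(individual):
        score = 0.0
        for v in set(individual):  # a repeated value counts once
            score += own[v]
            if other.get(v, 0.0) > threshold:
                score -= penalty * other[v]  # assumed negative weight
        return score

    population = [tuple(rng.choice(pool) for _ in range(k))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]  # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < mutation_rate:  # mutate one gene
                child[rng.randrange(k)] = rng.choice(pool)
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=fitness)
    return sorted(set(best), key=pool.index)


# Made-up service values for illustration only.
anomaly_services = ["ecr_i"] * 50 + ["eco_i"] * 30 + ["finger"] * 15 + ["http"] * 5
normal_services = ["http"] * 90 + ["smtp"] * 10
top_values = evolve_values(anomaly_services, normal_services)
print(top_values)
```

In this toy data `http` is common among normal packets, so the penalty pushes it out of the anomaly value set. The same function would be run a second time with the classes swapped to obtain the normal value set.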

  10. Identifying nature of incoming packet
     • For each selected attribute value Fi in the incoming packet:
        • If Fi ∈ Vi, then Si = (A × frequency of Fi in anomaly) - (frequency of Fi in normal)
        • Else Si = 0
     • C = Σ Si
     • If C > 0, then the packet is an anomaly; else it is normal
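The scoring rule can be written directly as a short function. This is a minimal sketch: it assumes Vi is a per-attribute set of retained values, that "frequency" means relative frequency in the corresponding training set, and that packets are dictionaries keyed by attribute name. The function name `classify_packet` and all of the numbers are illustrative, not taken from the slides.

```python
def classify_packet(packet, value_sets, anomaly_freq, normal_freq, weight_a=1.0):
    """Return 'anomaly' if C = sum(Si) is positive, otherwise 'normal'."""
    total = 0.0
    for attr, allowed in value_sets.items():
        value = packet.get(attr)
        if value in allowed:  # Fi is in Vi
            fa = anomaly_freq.get(attr, {}).get(value, 0.0)
            fn = normal_freq.get(attr, {}).get(value, 0.0)
            total += weight_a * fa - fn  # Si
        # otherwise Si = 0, so nothing is added
    return "anomaly" if total > 0 else "normal"


# Made-up value sets and relative frequencies for illustration only.
value_sets = {"service": {"ecr_i", "http", "finger"}, "flag": {"SF", "S0"}}
anomaly_freq = {"service": {"ecr_i": 0.5, "http": 0.05, "finger": 0.1},
                "flag": {"SF": 0.3, "S0": 0.6}}
normal_freq = {"service": {"http": 0.7, "finger": 0.01},
               "flag": {"SF": 0.9, "S0": 0.02}}

ping_flood = {"service": "ecr_i", "flag": "SF"}
print(classify_packet(ping_flood, value_sets, anomaly_freq, normal_freq, weight_a=1.0))
print(classify_packet(ping_flood, value_sets, anomaly_freq, normal_freq, weight_a=2.0))
```

With A = 1 the common `SF` flag outweighs the anomalous service and the packet is scored as normal; with A = 2 the same packet is flagged, which is the effect the weight A is meant to have.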

  11. Additional Weight
     • The weight A is multiplied by the anomaly frequency.
     • Why? Generic anomalies have diverse values, unlike normal packets, whose values fall within a particular range.
     • The weight is chosen according to the required trade-off between accuracy and false positive rate.
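The trade-off can be explored by sweeping the weight and recording accuracy and false positive rate at each setting. The sketch below assumes each labelled packet has already been reduced to two sums over its matched attribute values, the anomaly frequencies and the normal frequencies, so that the score is A × anomaly_sum - normal_sum. The function name `evaluate` and the data are made up for illustration.

```python
def evaluate(packets, weight_a):
    """packets: iterable of (anomaly_sum, normal_sum, true_label).
    Returns (accuracy, false_positive_rate) for the given weight."""
    tp = fp = tn = fn = 0
    for anomaly_sum, normal_sum, label in packets:
        flagged = weight_a * anomaly_sum - normal_sum > 0
        if label == "anomaly":
            tp += flagged
            fn += not flagged
        else:
            fp += flagged  # normal packet wrongly flagged
            tn += not flagged
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Illustrative data only: (anomaly_sum, normal_sum, true_label).
packets = [
    (0.2, 0.3, "anomaly"),
    (0.5, 0.1, "anomaly"),
    (0.1, 0.4, "normal"),
    (0.3, 0.7, "normal"),
]

for weight_a in (1.0, 2.0, 3.0):
    accuracy, fpr = evaluate(packets, weight_a)
    print(f"A={weight_a}: accuracy={accuracy:.2f}, FPR={fpr:.2f}")
```

On this toy data a small weight misses an anomaly, a moderate weight classifies everything correctly, and a large weight starts flagging normal packets, so the false positive rate rises.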

  12. Additional Weight

  13. Results
     • Tested against 50,000 anomaly and 50,000 normal packets from the KDD Cup '99 dataset.
     • 88.5% accuracy with a 6% false positive rate (FPR)
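As a rough consistency check, the two reported figures can be unpacked into packet counts. This assumes the rounded percentages are exact, that the FPR is measured over the 50,000 normal packets, and that accuracy is measured over all 100,000 packets. The detection rate on the last line is derived from those assumptions and is not a figure reported on the slide.

```latex
\begin{align*}
\text{FP} &= 0.06 \times 50{,}000 = 3{,}000 \\
\text{TN} &= 50{,}000 - 3{,}000 = 47{,}000 \\
\text{TP} + \text{TN} &= 0.885 \times 100{,}000 = 88{,}500 \\
\text{TP} &= 88{,}500 - 47{,}000 = 41{,}500 \\
\text{detection rate} &= \frac{41{,}500}{50{,}000} = 0.83
\end{align*}
```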

  14. Future Work
     • Focus on specific anomaly types
     • Better attribute selection algorithm?
        • OneR
        • Entropy based
        • Chi-squared
        • Random Forest
     • Better classification technique?
        • Clustering: hierarchical, K-Means
        • Decision trees


  18. QUESTIONS?
