
Data and Applications Security Developments and Directions

Dr. Bhavani Thuraisingham, The University of Texas at Dallas. Lecture #20, Guest Lecture: Data Mining for Intrusion Detection, by Mamoun Awad. March 24, 2005.


Presentation Transcript


  1. Data and Applications Security Developments and Directions. Dr. Bhavani Thuraisingham, The University of Texas at Dallas. Lecture #20, Guest Lecture: Data Mining for Intrusion Detection, by Mamoun Awad. March 24, 2005

  2. Data Mining & Intrusion Detection Systems. Mamoun Awad, Dept. of Computer Science, University of Texas at Dallas

  3. Outline • Intrusion Detection • Data Mining • Approach • Data set & Results

  4. What is an intrusion? • An intrusion can be defined as “any set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource”.

  5. Intrusion Examples • Virus • Buffer overflows • 2000 Outlook Express vulnerability. • Denial of Service (DOS) • An explicit attempt by attackers to prevent legitimate users of a service from using that service. • Address spoofing • A malicious user uses a fake IP address to send malicious packets to a target. • Many others • R2L, U2R, Probe, …

  6. Intrusion Detection System (IDS) • An Intrusion Detection System (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system.

  7. Attack Types • Host-based attacks • Gain access to privileged services or resources on a machine. • Network-based attacks • Make it difficult for legitimate users to access various network services

  8. IDS Categories • Intrusion detection systems are split into two groups: • Anomaly detection systems • Identify malicious traffic based on deviations from established normal network behavior. • Misuse detection systems • Identify intrusions based on known patterns (signatures) of malicious activity.
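
The two categories on this slide can be contrasted with a minimal Python sketch. Everything in it is an assumption made for illustration: the signature byte strings, the "connections per minute" feature, and the three-standard-deviation threshold are hypothetical and do not come from the talk.

```python
from statistics import mean, stdev

# Hypothetical signatures: byte patterns assumed to appear in malicious payloads.
SIGNATURES = [b"/etc/passwd", b"\x90\x90\x90\x90"]

def misuse_detect(payload: bytes) -> bool:
    """Misuse detection: flag a payload that contains any known signature."""
    return any(sig in payload for sig in SIGNATURES)

def fit_baseline(normal_values):
    """Summarize normal behavior as a mean and a standard deviation."""
    return mean(normal_values), stdev(normal_values)

def anomaly_detect(value, baseline, k=3.0):
    """Anomaly detection: flag a value more than k standard deviations from normal."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma

# Hypothetical feature: connections per minute observed during normal operation.
baseline = fit_baseline([98, 102, 100, 97, 103, 101, 99])
```

The misuse detector can only catch attacks whose signature is already listed, while the anomaly detector can flag unseen attacks but also flags any unusual yet harmless traffic.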

  9. Problem Statement • Goal of Intrusion Detection Systems (IDS): • To detect an intrusion as it happens and be able to respond to it. • False positives: • A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion. • Too many false positives • Users will stop monitoring the IDS because of the noise. • False negatives: • A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it.
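
The false positive and false negative definitions above translate directly into rates over a labeled test set. The following is a small sketch; the `actual` and `predicted` lists are made-up data, with `True` meaning "intrusion".

```python
def error_rates(actual, predicted):
    """Return (false positive rate, false negative rate).

    actual and predicted are equal-length sequences of booleans,
    where True means "intrusion" and False means "normal".
    """
    pairs = list(zip(actual, predicted))
    false_pos = sum(1 for a, p in pairs if not a and p)   # alarm, no intrusion
    false_neg = sum(1 for a, p in pairs if a and not p)   # intrusion, no alarm
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    fp_rate = false_pos / negatives if negatives else 0.0
    fn_rate = false_neg / positives if positives else 0.0
    return fp_rate, fn_rate

# Made-up ground truth and IDS output for eight events.
actual    = [True, True,  False, False, False, True, False, False]
predicted = [True, False, True,  False, False, True, False, True]
fp_rate, fn_rate = error_rates(actual, predicted)
```

Here two of the five normal events raise an alarm and one of the three intrusions is missed.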

  10. Layered Security Mechanism

  11. Problem Statement • Misuse Detection

  12. Firewalls

  13. Firewall Rules • Table columns: Order, Protocol, Source IP, Source Port, Destination IP, Destination Port, Action
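
The rule columns on this slide (order, protocol, source IP and port, destination IP and port, action) suggest an ordered rule table. The sketch below assumes first-match-wins evaluation with a default of deny, which is a common convention but is not stated in the slides; the `Rule` class, the `*` wildcard, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass

ANY = "*"  # wildcard used by this sketch only

@dataclass(frozen=True)
class Rule:
    order: int
    protocol: str
    src_ip: str
    src_port: str
    dst_ip: str
    dst_port: str
    action: str  # "allow" or "deny"

    def matches(self, packet: dict) -> bool:
        """A rule matches when every field is a wildcard or equals the packet's value."""
        fields = ("protocol", "src_ip", "src_port", "dst_ip", "dst_port")
        return all(getattr(self, f) in (ANY, packet[f]) for f in fields)

def decide(rules, packet, default="deny"):
    """Evaluate rules in ascending order; the first matching rule wins (assumed)."""
    for rule in sorted(rules, key=lambda r: r.order):
        if rule.matches(packet):
            return rule.action
    return default

RULES = [
    Rule(1, "tcp", ANY, ANY, "10.0.0.5", "80", "allow"),
    Rule(2, "tcp", "192.0.2.7", ANY, ANY, ANY, "deny"),
    Rule(3, ANY, ANY, ANY, ANY, ANY, "deny"),
]

web = {"protocol": "tcp", "src_ip": "192.0.2.7", "src_port": "40000",
       "dst_ip": "10.0.0.5", "dst_port": "80"}
ssh = dict(web, dst_port="22")
```

Under these assumptions `decide(RULES, web)` is decided by rule 1 and `decide(RULES, ssh)` by rule 2, which shows why the Order column matters: swapping rules 1 and 2 would block the web request.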

  14. Hierarchical Distributed Firewall Setup

  15. Problem Statement • Anomaly Detection

  16. Our Approach • [Diagram labels: Training Data, SVM Training, Testing Data, SVM Testing, Class; annotated “Problem???”]

  17. Our Approach • [Diagram labels: Training Data, Hierarchical Clustering (DGSOT), SVM Training, Testing Data, SVM Testing, Class]

  18. Dynamically Growing Self-Organizing Tree Algorithm (DGSOT)

  19. DGSOT • Learning Process • Winner Node • Update the Tree • Stopping Criteria
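
The slide names the steps but gives no detail, so the following is only a generic self-organizing update written to illustrate the vocabulary (winner node, update, stopping criterion). It is not the DGSOT algorithm: tree growth is omitted entirely, and the flat list of nodes, the learning rate, and the tolerance-based stopping rule are assumptions of this sketch.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_winner(nodes, point):
    """Winner node: index of the reference vector closest to the data point."""
    return min(range(len(nodes)), key=lambda i: distance(nodes[i], point))

def update_winner(nodes, point, learning_rate):
    """Update step: move the winner's reference vector toward the data point."""
    w = find_winner(nodes, point)
    nodes[w] = [r + learning_rate * (x - r) for r, x in zip(nodes[w], point)]
    return w

def train(nodes, data, learning_rate=0.5, tolerance=1e-3, max_epochs=100):
    """Learning process: repeat epochs until the average distortion stops changing."""
    previous = float("inf")
    for epoch in range(1, max_epochs + 1):
        for point in data:
            update_winner(nodes, point, learning_rate)
        error = sum(distance(nodes[find_winner(nodes, p)], p) for p in data) / len(data)
        if abs(previous - error) < tolerance:  # stopping criterion (assumed form)
            return epoch
        previous = error
    return max_epochs

# Two hypothetical reference vectors and two well-separated groups of points.
nodes = [[1.0, 1.0], [9.0, 9.0]]
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
epochs_used = train(nodes, data)
```

After training, each reference vector sits inside one of the two groups and can stand in for the points assigned to it.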

  20. Support Vector Machine • Support Vector Machines (SVM) • One of the most powerful classification techniques • Finds a hyper-plane that separates the classes • Based on the idea of mapping data points to a high-dimensional feature space where a separating hyper-plane can be found
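
To make the separating hyper-plane concrete, here is a small sketch on made-up 2-D data. It does not train an SVM: the hyper-plane `W`, `B` was worked out by hand as the maximum-margin separator for this toy set, and the code only applies the decision rule and picks out the support vectors (the points lying exactly on the margin).

```python
import math

# Hypothetical toy data: label -1 = normal, +1 = attack.
POINTS = [(1, 1), (2, 1), (1, 2), (3, 4), (4, 3), (4, 4)]
LABELS = [-1, -1, -1, 1, 1, 1]

# Hyper-plane w.x + b = 0, derived by hand for the data above (not by a solver).
W = (0.5, 0.5)
B = -2.5

def score(x, w=W, b=B):
    """Signed score of x; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x):
    return 1 if score(x) >= 0 else -1

def support_vectors(points, labels, tol=1e-9):
    """Points on the margin, i.e. those with y * (w.x + b) equal to 1."""
    return [p for p, y in zip(points, labels) if abs(y * score(p) - 1) <= tol]

# Distance between the two margin boundaries.
margin_width = 2 / math.hypot(*W)
```

Only the four points returned by `support_vectors` pin down the hyper-plane; removing `(1, 1)` or `(4, 4)` would leave it unchanged, which is the property the clustering step in this talk relies on.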

  21. The value of support vectors and non-support vectors

  22. The effect of adding new data points on the margins

  23. Feature Mapping • Feature mapping from a two-dimensional input space to a two-dimensional feature space.
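
The figure for this slide did not survive, so the mapping it used is unknown. As an illustration only, the sketch below assumes the map (x1, x2) → (x1², x2²), which also goes from two dimensions to two dimensions: points inside and outside a circle cannot be split by a straight line in the input space, but in the feature space the circle becomes the straight line z1 + z2 = r². The data and the threshold are made up.

```python
def feature_map(x):
    """Assumed mapping from input space (x1, x2) to feature space (x1^2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, x2 * x2)

# Hypothetical data: INNER points lie near the origin, OUTER points surround them.
INNER = [(0.5, 0.0), (0.0, -0.5), (-0.5, 0.5), (0.3, -0.4)]   # class -1
OUTER = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5), (1.6, -1.2)]   # class +1

def classify(x, threshold=2.0):
    """Linear rule in feature space: z1 + z2 - threshold >= 0 means class +1."""
    z1, z2 = feature_map(x)
    return 1 if z1 + z2 - threshold >= 0 else -1
```

The decision rule is linear in (z1, z2) even though it draws a circle in the original (x1, x2) plane.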

  24. SVM Limitations • Long training time limits its use. • Clustering has a positive impact on the training of an SVM: each cluster is represented by only one reference point. • Reduces training time. • Degrades generalization, because fewer points are used.
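
The idea of representing each cluster by a single reference can be sketched as follows. This is not the method from the talk: a fixed grid stands in for DGSOT clustering, the centroid is assumed as the reference, and clustering each class separately is also an assumption of this sketch.

```python
from collections import defaultdict

def grid_cell(point, cell_size=1.0):
    """Stand-in clustering: points in the same grid cell share a cluster."""
    return tuple(int(c // cell_size) for c in point)

def cluster_references(points, labels, cell_size=1.0):
    """Return one (centroid, label) reference per cluster, per class."""
    groups = defaultdict(list)
    for p, y in zip(points, labels):
        groups[(y, grid_cell(p, cell_size))].append(p)
    refs = []
    for (y, _cell), members in sorted(groups.items()):
        centroid = tuple(sum(coords) / len(members) for coords in zip(*members))
        refs.append((centroid, y))
    return refs

# Seven hypothetical training points shrink to three references.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9),
          (5.1, 5.2), (5.3, 5.4), (5.9, 5.6), (7.5, 7.5)]
labels = [-1, -1, -1, 1, 1, 1, 1]
refs = cluster_references(points, labels)
```

An SVM trained on `refs` sees three points instead of seven, which is the source of both the speed-up and the loss of detail noted on the slide.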

  25. Hierarchical clustering with SVM flow chart

  26. Training set • 1998 DARPA data that originated from the MIT Lincoln Lab • http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html • Size: 1,012,477 data points

  27. Data set / Attack Types • DOS • Denial of service. • R2L • Unauthorized access from a remote machine, e.g., guessing a password. • U2R • Unauthorized access to local superuser (root) privileges, e.g., various “buffer overflow” attacks. • Probing • Surveillance and other probing, e.g., port scanning.
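
When working with the KDD Cup 99 files, each record carries a specific attack name that has to be folded into the four categories above. The sketch below uses the attack names from the KDD Cup 99 training data; the `categorize` helper and the handling of the trailing period on labels are conveniences of this sketch, not something shown in the talk.

```python
# Attack names from the KDD Cup 99 training set, grouped into the slide's categories.
ATTACK_CATEGORY = {
    "back": "DOS", "land": "DOS", "neptune": "DOS",
    "pod": "DOS", "smurf": "DOS", "teardrop": "DOS",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
    "ipsweep": "Probing", "nmap": "Probing", "portsweep": "Probing", "satan": "Probing",
}

def categorize(label: str) -> str:
    """Map a raw record label such as 'smurf.' to its attack category."""
    name = label.strip().rstrip(".")
    if name == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(name, "unknown")
```

For example, `categorize("guess_passwd.")` falls under R2L, matching the password-guessing example on the slide.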

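  28. Results

      | Method                 | Weighted Accuracy | Average Accuracy | Average Training Time | Average FP Rate | Average FN Rate |
      |------------------------|-------------------|------------------|-----------------------|-----------------|-----------------|
      | Random Selection       | 62.5%             | 62.61%           | 0.049 hours           | 22.40%          | 37.38%          |
      | Pure SVM               | 62.74%            | 62.75%           | 0.51 hours            | 30.75%          | 37.24%          |
      | SVM + Rocchio Bundling | 63.09%            | 63.11%           | 0.93 hours            | 30.98%          | 36.89%          |
      | SVM + DGSOT            | 63.34%            | 63.36%           | 0.26 hours            | 51.56%          | 36.64%          |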

  29. Relevant and Important Publications • Feng Luo, Latifur Khan, Farokh Bastani, I-Ling Yen, and J. Zhou, “A Dynamical Growing Self-Organizing Tree (DGSOT) for Hierarchical Clustering Gene Expression Profiles,” Bioinformatics, Oxford University Press, UK, 20(16), pp. 2605-2617, November 2004. • Lei Wang and Latifur Khan, “Automatic Image Annotation and Retrieval using Weighted Feature Selection,” to appear in a special issue of Multimedia Tools and Applications, Kluwer Publishers. • Latifur Khan and Feng Luo, “Hierarchical Clustering for Complex Data,” to appear in International Journal on Artificial Intelligence Tools, World Scientific Publishers. • Latifur Khan, Mamoun Awad, and Bhavani Thuraisingham, “A New Intrusion Detection System using Support Vector Machines and Hierarchical Clustering,” to appear in VLDB Journal: The International Journal on Very Large Databases, ACM/Springer-Verlag Publishing.

  30. Relevant and Important Publications • R. Lippmann, J. Haines, D. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, 34, pp. 579-595, 2000.
