
Intrusion Detection Using Data Mining

By Anshu Veda (04329022), Prajakata Kalekar (04329008), and Anirudha Bodhankar (04329003), under the guidance of Prof. Sunita Sarawagi.


Presentation Transcript


  1. Intrusion Detection Using Data Mining • By Anshu Veda (04329022), Prajakata Kalekar (04329008), Anirudha Bodhankar (04329003) • Under the guidance of Prof. Sunita Sarawagi

  2. Problem Definition • An Intrusion Detection System (IDS) is an important part of the security management system for computers and networks; it tries to detect break-ins and break-in attempts. • Approaches to a solution • Signature-based • Anomaly-based

  3. Types of Intrusion Detection • Classification I • Real Time • After-the-fact (offline) • Classification II • Network Based • Host Based

  4. Approaches to IDS

  5. Approaches for IDS

  6. Recommended Approach • No single approach provides a complete solution • A hybrid approach: host-based IDS (HIDS) on local machines together with a powerful network-based IDS (NIDS) on switches

  7. Attack Simulation • Types of attacks • NIDS: SYN-flood attack • HIDS: ssh daemon attack

  8. NIDS – Data Preprocessing • Input data • tcpdump trace • Huge • One data record per packet • Features extracted (using Perl scripts) • Content-based: group records and construct new features corresponding to a single connection • Time-based: add time-window-based information to the connection records (Param: Time-window) • Connection-based: add connection-window-based information (Param: Time-window)

  9. Preprocessing on tcpdump • From the tcpdump data we extracted the following fields • src_ip, dst_ip • src_port, dst_port • num_packets_src_dest / num_packets_dest_src • num_ack_src_dst / num_ack_dst_src • num_bytes_src_dst / num_bytes_dst_src • num_retransmit_src_dst / num_retransmit_dst_src • num_pushed_src_dst / num_pushed_dst_src • num_syn_src_dst / num_syn_dst_src • num_fin_src_dst / num_fin_dst_src • connection status
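The per-connection counters listed on this slide can be illustrated with a small sketch. The authors used Perl scripts that are not shown, so everything here is an assumption made for illustration: the packet tuple layout, the flag letters, the helper names `new_record` and `aggregate`, and the uniform `src_dst` / `dst_src` suffixes. Retransmission counts and connection status are left out because they need sequence numbers and state tracking.

```python
# Hypothetical per-packet records:
# (src_ip, src_port, dst_ip, dst_port, flags, payload_bytes).
# Flag letters are an assumption: S = SYN, A = ACK, F = FIN, P = PSH.
packets = [
    ("10.0.0.1", 40000, "10.0.0.2", 22, "S", 0),
    ("10.0.0.2", 22, "10.0.0.1", 40000, "SA", 0),
    ("10.0.0.1", 40000, "10.0.0.2", 22, "A", 0),
    ("10.0.0.1", 40000, "10.0.0.2", 22, "PA", 120),
    ("10.0.0.2", 22, "10.0.0.1", 40000, "A", 0),
]

FLAG_NAMES = (("S", "syn"), ("A", "ack"), ("F", "fin"), ("P", "pushed"))


def new_record():
    """Return a connection record with every counter set to zero."""
    rec = {}
    for name in ("packets", "bytes", "syn", "ack", "fin", "pushed"):
        for direction in ("src_dst", "dst_src"):
            rec[f"num_{name}_{direction}"] = 0
    return rec


def aggregate(packets):
    """Group packets into one record per connection, counted per direction."""
    connections = {}
    for src_ip, src_port, dst_ip, dst_port, flags, nbytes in packets:
        forward = (src_ip, src_port, dst_ip, dst_port)
        reverse = (dst_ip, dst_port, src_ip, src_port)
        # The endpoint that sent the first packet seen is treated as "src".
        if reverse in connections:
            key, direction = reverse, "dst_src"
        else:
            key, direction = forward, "src_dst"
        rec = connections.setdefault(key, new_record())
        rec[f"num_packets_{direction}"] += 1
        rec[f"num_bytes_{direction}"] += nbytes
        for letter, name in FLAG_NAMES:
            if letter in flags:
                rec[f"num_{name}_{direction}"] += 1
    return connections


conns = aggregate(packets)
record = conns[("10.0.0.1", 40000, "10.0.0.2", 22)]
print(record["num_packets_src_dst"], record["num_packets_dst_src"])
```

In a SYN flood, records of this kind would tend to show SYN packets in one direction with few or no ACKs coming back, which is what the later outlier detection step can pick up.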

  10. Preprocessing on tcpdump cont… • Time-Window Based Features • Count_src/count_dst • Count_serv_src/ count_serv_dest • Connection-Window Based • Count_src1 /count_dst1 • Count_serv_src1/ count_serv_dest1
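The slide does not define these counters, so the sketch below assumes one plausible reading: for each connection, count how many earlier connections share its source (or destination, or source plus destination port) within the last `time_window` seconds, and again within the last `conn_window` connections. The record layout, the function name `add_window_features`, and the exact definitions are assumptions, not the authors' implementation.

```python
# Hypothetical connection records, sorted by start time in seconds.
connections = [
    {"start": 0.0, "src_ip": "10.0.0.9", "dst_ip": "10.0.0.2", "dst_port": 80},
    {"start": 0.5, "src_ip": "10.0.0.9", "dst_ip": "10.0.0.2", "dst_port": 80},
    {"start": 1.0, "src_ip": "10.0.0.9", "dst_ip": "10.0.0.2", "dst_port": 80},
    {"start": 9.0, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.2", "dst_port": 22},
]


def add_window_features(connections, time_window, conn_window):
    """Attach time-window and connection-window counts to each record."""
    out = []
    for i, cur in enumerate(connections):
        earlier = connections[:i]
        # Earlier connections that started within the last time_window seconds.
        in_time = [c for c in earlier if cur["start"] - c["start"] <= time_window]
        # The conn_window connections immediately preceding this one.
        in_conn = connections[max(0, i - conn_window):i]

        feats = dict(cur)
        feats["count_src"] = sum(c["src_ip"] == cur["src_ip"] for c in in_time)
        feats["count_dst"] = sum(c["dst_ip"] == cur["dst_ip"] for c in in_time)
        feats["count_serv_src"] = sum(
            c["src_ip"] == cur["src_ip"] and c["dst_port"] == cur["dst_port"]
            for c in in_time
        )
        feats["count_src1"] = sum(c["src_ip"] == cur["src_ip"] for c in in_conn)
        feats["count_dst1"] = sum(c["dst_ip"] == cur["dst_ip"] for c in in_conn)
        out.append(feats)
    return out


rows = add_window_features(connections, time_window=2.0, conn_window=3)
for row in rows:
    print(row["start"], row["count_src"], row["count_dst"], row["count_dst1"])
```

The last record shows why both window types can be useful: its time-window counts are zero because the earlier traffic is older than two seconds, while the connection-window count for the same destination still sees all three earlier connections.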

  11. NIDS – Data Mining Technique • Outlier detection • Clustering-based approach (K-Means) • Outlier threshold • Preprocessed dataset • K-NN-based approach • Distance threshold • Preprocessed dataset • Results • Clustering did not give good results • Limited data • K-NN • Giving alarms
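A minimal sketch of distance-threshold outlier detection with K-NN follows. The slide only names the approach and the threshold, so the scoring rule used here (distance to the k-th nearest neighbour), the function name `knn_outliers`, the two-feature records, and the parameter values are all assumptions.

```python
import math


def knn_outliers(points, k, threshold):
    """Return indices of points whose k-th nearest neighbour is farther than threshold.

    Assumes len(points) > k.
    """
    flagged = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if dists[k - 1] > threshold:
            flagged.append(i)
    return flagged


# Hypothetical connection records reduced to two features,
# for example (num_syn_src_dst, num_ack_dst_src).
records = [(1, 1), (1, 2), (2, 1), (2, 2), (50, 0)]

alarms = knn_outliers(records, k=2, threshold=5.0)
print(alarms)
```

With real connection records the features would normally be scaled first, since a raw byte count would otherwise dominate the distance.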

  12. HIDS – Data Preprocessing • Input data • "strace" system call logs for a particular process (sshd) • One data record per system call • Sliding-window size for grouping • Features extracted (using Perl scripts) • Slide the window over the trace to generate the possible sequences of system calls

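  13. HIDS – Data Preprocessing cont… • Example trace (one letter per system call): a d f g a e d a e b s d e a • Sequences generated by a sliding window of size 4: (a d f g), (d f g a), (f g a e), (g a e d), (a e d a), (e d a e), (d a e b), (a e b s), (e b s d), (b s d e), (s d e a)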
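The sliding-window step can be sketched in a few lines. The trace is the one shown on this slide, and a window size of 4 matches the sequences listed there. The function name `sliding_windows` is an assumption.

```python
def sliding_windows(trace, size):
    """Return every run of `size` consecutive system calls in the trace."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]


# The example trace from the slide, one letter per system call.
trace = "a d f g a e d a e b s d e a".split()

windows = sliding_windows(trace, 4)
for w in windows:
    print(" ".join(w))
```

A trace of 14 calls yields 14 - 4 + 1 = 11 sequences, starting with `a d f g` and ending with `s d e a`.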

  14. Data Mining Technique Used • Learning to predict system calls • Predict the i-th system call for each test record <p1, p2, p3> • Done using classification (decision trees) • Anomaly detection • Use of the misclassification score to detect anomalies
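The idea of predicting a system call from its predecessors and scoring a trace by its misclassifications can be sketched as follows. To keep the example free of external libraries, a majority-vote lookup table stands in for the decision tree named on the slide. The score definition (fraction of mispredicted windows), the function names, and the "attack" trace are assumptions.

```python
from collections import Counter, defaultdict


def windows(trace, size=4):
    """Return every run of `size` consecutive system calls in the trace."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]


def train(traces, size=4):
    """Map each prefix <p1, p2, p3> to the call that most often followed it.

    This lookup table is a stand-in for a decision tree classifier.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for *prefix, target in windows(trace, size):
            counts[tuple(prefix)][target] += 1
    return {prefix: c.most_common(1)[0][0] for prefix, c in counts.items()}


def misclassification_score(model, trace, size=4):
    """Fraction of windows whose last call is mispredicted.

    A prefix never seen in training counts as a misprediction.
    Assumes the trace has at least `size` calls.
    """
    wins = windows(trace, size)
    wrong = sum(model.get(tuple(prefix)) != target for *prefix, target in wins)
    return wrong / len(wins)


normal = "a d f g a e d a e b s d e a".split()
suspect = "a d f g x x x x a e d a".split()  # hypothetical anomalous trace

model = train([normal])
print(misclassification_score(model, normal))
print(misclassification_score(model, suspect))
```

A trace would then be flagged when its score exceeds a chosen threshold. Here the training trace scores 0 and the hypothetical suspect trace mispredicts 7 of its 9 windows.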

  15. Literature Survey • Types of attacks (host-based and network-based) • Techniques • Association rules and frequent episode rules over host-based and network-based data • Outlier detection using clustering • Classification

  16. Future Work • NIDS • Make the threshold distance a configurable parameter of the K-Means algorithm used • HIDS • Try out meta-learning algorithms for classification • A small user interface for configuring parameters

