Parallel and Distributed Computing for Cyber Security

Presentation Transcript


    1. Parallel and Distributed Computing for Cyber Security

    2. Progress in HPC - past 6 decades

    3. Applications Drive the Technology
        “I think there is a world market for maybe 5 computers” - Thomas Watson Sr. (1943)

    4. Data Mining - A Driver for Parallel/Distributed Computing
        - Lots of data being collected in the commercial and scientific world
        - Strong competitive pressure to extract and use the information from the data
        - Scaling of data mining to large data requires HPC
        - Data and/or computational resources needed for analysis are often distributed
        - Sometimes the choice is distributed data mining or no data mining
            - Ownership, privacy, security issues

    5. Cyber Intrusion Detection - Motivation
        - The sophistication and severity of cyber attacks are increasing
            - Large-scale denial of service attacks
            - Identity theft / fraud
            - Espionage
        - DOD and other U.S. government agencies are major targets for sophisticated state-sponsored cyber attacks
        - Security mechanisms always have inevitable vulnerabilities
            - Firewalls are not sufficient to ensure security in computer networks
            - Insider attacks are difficult to detect

    7. What are Intrusions?
        - Intrusions are actions that attempt to bypass security mechanisms of computer systems. They are caused by:
            - Attackers accessing the system from the Internet
            - Insider attackers: authorized users attempting to gain and misuse non-authorized privileges
        - Typical intrusion scenario

    9. Intrusion Detection Systems

    10. Data Mining for Intrusion Detection
        - Increased interest in data mining based intrusion detection over the past decade
        - Misuse detection
            - Suitable for attacks for which it is difficult to build signatures
            - Builds predictive models from labeled data sets (instances are labeled as “normal” or “intrusive”) to identify known intrusions
            - Cannot detect unknown and emerging attacks
            - MADAM ID project, ADAM project, fuzzy association rules [Bridges00], decision trees [Sinclair99], neural networks [Lippmann00, Ghosh99], genetic algorithms [Bridges00, Sinclair99], cost sensitive modeling (AdaCost [Fan99], MetaCost [Domingos99, Ting00]), learning from rare class ([Kubat97, Fawcett97, Provost01, Japkowicz01, Joshi02, Lazarevic03])
        - Anomaly detection
            - Detects emerging/novel attacks as deviations from “normal” behavior
            - Potential high false alarm rate: previously unseen (yet legitimate) system behaviors may also be recognized as anomalies
            - PHAD, ALAD [Chan01, Cha02], ADAM [Barbara01], finite mixture model [Yamanishi00], χ² based [Ye01], temporal sequence learning [Lane98], neural networks [Ryan98], generating artificial anomalies [Fan01], clustering [Eskin02], unsupervised SVM [Eskin02, Lazarevic03], outlier detection schemes (MINDS), Bayesian net [Valdes00], Hidden Markov models [Ourston03]
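
The anomaly-detection idea on this slide can be illustrated with a minimal sketch. It is not the method of any system cited above: it simply scores a connection by its distance to the k-th nearest neighbor among connections assumed to be normal. The feature values, the choice of `k`, and the function name are all hypothetical.

```python
import math

def kth_neighbor_distance(point, normal_points, k=2):
    """Anomaly score: distance from `point` to its k-th nearest normal point."""
    distances = sorted(math.dist(point, p) for p in normal_points)
    return distances[k - 1]

# Hypothetical features per connection: (packets, kilobytes transferred).
normal = [(10, 5.0), (12, 6.0), (11, 5.5), (9, 4.5), (10, 5.2)]

typical = (11, 5.4)    # resembles the normal connections
unusual = (400, 0.3)   # many packets, almost no payload

score_typical = kth_neighbor_distance(typical, normal)
score_unusual = kth_neighbor_distance(unusual, normal)
print(score_typical, score_unusual)
```

A legitimate connection that merely looks different from the training data would also receive a high score, which is the false-alarm risk noted on the slide.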

    11. Data Mining for Intrusion Detection Misuse Detection – Building Predictive Models

    12. Data Mining for Intrusion Detection Misuse Detection – Building Predictive Models

    13. MINDS – Minnesota INtrusion Detection System

    14. Typical Anomaly Detection Output

    15. Summarization Using Association Patterns

    16. Typical MINDS Output

    17. Typical MINDS Output

    18. Typical Summarization Output

    19. Detecting Modes of Network Traffic Using Clustering
        - Used Shared Nearest Neighbor (SNN) clustering
            - Not distracted by “noise” in the data
            - CPU intensive: O(N^2)
            - Requires storing an N x K matrix
                - K (number of neighbors) is typically between 10 and 20
                - K should be about the size of the smallest expected mode
        - Clustered 850,000 connections collected over one hour at one US Army fort
            - Took 10 hours on a 16-CPU cluster
            - Found 3135 clusters
            - Largest clusters around 500 records, smallest cluster 10 records
        - Large clusters correspond to normal behavior
        - Many small clusters correspond to policy violations or other undesired behavior
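
As a rough sketch of the shared-nearest-neighbor idea named on this slide, the code below builds brute-force k-nearest-neighbor lists (the N x K structure, costing O(N^2) distance computations) and counts the neighbors two points share. It covers only the similarity step; the full SNN clustering algorithm has further steps that are omitted here. The toy 2-D points, `k=3`, and the function names are assumptions for illustration.

```python
import math

def knn_lists(points, k):
    """Brute-force k-nearest-neighbor sets: O(N^2) distance computations."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors

def snn_similarity(i, j, neighbors):
    """Number of shared neighbors, counted only for mutual neighbors."""
    if i in neighbors[j] and j in neighbors[i]:
        return len(neighbors[i] & neighbors[j])
    return 0

# Two tight groups plus one isolated "noise" point (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
nbrs = knn_lists(points, k=3)

same_group = snn_similarity(0, 1, nbrs)   # points in the same group
cross_group = snn_similarity(0, 4, nbrs)  # points in different groups
noise_link = snn_similarity(0, 8, nbrs)   # a group point and the noise point
print(same_group, cross_group, noise_link)
```

Because the isolated point shares no mutual neighbors with either group, its similarity to them is zero, which is one way to read the slide's remark that the method is not distracted by noise.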

    20. Detecting Modes of Network Traffic Using Clustering

    21. Detecting Modes of Network Traffic Using Clustering

    22. Detecting Modes of Network Traffic Using Clustering

    23. Detecting Modes of Network Traffic Using Clustering

    24. Need for HPC
        - Very large data size
            - Typical network traffic at the university level reaches around 500 million connections per day
        - Compute-intensive nature of the pattern-finding algorithms
            - Associative analysis
            - Clustering
            - Sequential pattern analysis
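
A back-of-the-envelope calculation shows why this volume calls for HPC. The sketch below takes the 500 million connections per day figure from the slide. It assumes, purely for illustration, that the algorithm compares every pair of connections in one day of traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60
connections_per_day = 500_000_000  # figure quoted on the slide

# Average arrival rate if traffic were spread evenly over the day.
rate = connections_per_day / SECONDS_PER_DAY

# Number of distinct pairs an all-pairs (quadratic) algorithm would examine.
pairs = connections_per_day * (connections_per_day - 1) // 2

print(f"{rate:,.0f} connections per second on average")
print(f"{pairs:.2e} pairwise comparisons for one day of traffic")
```

That works out to roughly 5,800 connections per second and on the order of 10^17 pairs.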

    25. Need for Distributed Intrusion Detection
        - Attacks on the network infrastructure may be launched from several different locations and may target multiple destinations
        - Stealthy coordinated attacks with low traffic volumes are difficult to detect by IDSs based at a single network site
        - Detection of such attacks at an early stage requires correlation of data at multiple network sites
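
The point about low-volume attacks can be made concrete with a small sketch. It assumes each site keeps a log of suspicious probes keyed by source address and flags a source once its count reaches a fixed threshold. The logs, addresses, threshold, and function name are hypothetical, and this is not the correlation method of any particular system.

```python
from collections import Counter

def flag_sources(site_logs, threshold):
    """Return sources whose total probe count across `site_logs` reaches `threshold`."""
    counts = Counter()
    for log in site_logs:
        counts.update(log)
    return {src for src, n in counts.items() if n >= threshold}

# Hypothetical per-site logs: one entry per suspicious probe, keyed by source IP.
site_a = ["203.0.113.7"] * 3 + ["198.51.100.2"] * 1
site_b = ["203.0.113.7"] * 4
site_c = ["203.0.113.7"] * 3 + ["192.0.2.9"] * 2

THRESHOLD = 8

# Each site analyzing only its own log sees nothing above the threshold.
per_site = [flag_sources([log], THRESHOLD) for log in (site_a, site_b, site_c)]

# Correlating the three logs reveals the source that stayed quiet at every site.
combined = flag_sources([site_a, site_b, site_c], THRESHOLD)
print(per_site, combined)
```

Here the source stays below the threshold at every individual site and becomes visible only when the counts are combined.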

    31. Centralizing data is not possible
        - Data needed for analysis is distributed
        - Cost of centralizing data is too high
        - Security and privacy issues
        - Computational resources needed for analysis are distributed
        - Need for Grid-based IDS

    32. Data Mining Middleware for Grids

    33. Grid-Based Data Mining: Distributed Network Intrusion Detection

    34. Publications
        - Managing Cyber Threats: Issues, Approaches and Challenges, edited by V. Kumar, J. Srivastava, and A. Lazarevic, Kluwer Academic Publishers (forthcoming).
        - MINDS - Minnesota Intrusion Detection System, L. Ertöz, E. Eilertson, A. Lazarevic, P. Tan, J. Srivastava, V. Kumar, P. Dokas, in Data Mining: Next Generation Challenges and Future Directions, editors: H. Kargupta, A. Joshi, K. Sivakumar, Y. Yesha, MIT/AAAI Press, 2004. AHPCRC Technical Report # 2003-121.
        - Detection of Novel Network Attacks Using Data Mining, L. Ertöz, E. Eilertson, A. Lazarevic, P. Tan, P. Dokas, V. Kumar, J. Srivastava, Workshop on Data Mining for Computer Security, IEEE International Conference on Data Mining, Melbourne, FL, November 19, 2003. AHPCRC Technical Report # 2003-108.
