Network-level Malware Detection

Network-level Malware Detection Mike McNett, Matthew Spear, Richard Barnes CS-851 – Malware 23 October 2004

Outline • Introduction: Design of a System for Real-Time Worm Detection • Example 1: Detecting Early Worm Propagation through Packet Matching (DEWP) • Example 2: Fast Detection of Scanning Worm Infections • Example Application: Therminator • Conclusions

Introduction Questions Being Considered: • Why network level detection? • What are the alternatives? • Are there reasonable solutions? • What are the limitations, advantages, disadvantages compared to the alternatives?

Introduction • Malware Detection Options? • Prevention vs. Treatment • Signature vs. Anomaly • Host-based containment • Network containment • Packet Header vs. Packet Payload • What are the advantages, disadvantages, and limitations of the above?

Network-level Detection

Design of a System for Real-Time Worm Detection • Hash • Count Vector • Character Filter • SRAM Analyzer • Alert Generator • Periodic Subtraction of Time Averages

Design of a System for Real-Time Worm Detection • Scalable to high throughput • Solution depends on specialized hardware • Low false positive rate • What are the problems? • What are the advantages? • Are there other, more simplistic signatures? • Can similar attacks be detected at the host level?

Detecting Early Worm Propagation through Packet Matching Xuan Chen and John Heidemann ISI-TR-2004-585 February 2004

DEWP • Router-based system that: • automatically detects and quarantines Internet worm propagation • matches destination port numbers between incoming and outgoing connections (automated signature creation) • detects and suppresses worms based on their unusual traffic patterns • detects worm propagation within about 4 seconds • protects more than 99% of hosts from random-scanning worms

DEWP Thesis • Matches destination port numbers between incoming and outgoing connections. Two observations on worm traffic: • Worms usually exploit vulnerabilities related to specific network port numbers • Infected hosts will probe other vulnerable hosts, exploiting the same vulnerability • So… high levels of bi-directional probing traffic with the same destination port number indicate a new worm • Scalable: matching destination port numbers requires little computation

DEWP • Two components of DEWP: worm detector and packet filter • Two step detection: destination port matching and destination address counting • Uses packet filtering to suppress worm spreading • Can deploy at different levels of network

Worm Containment • DEWP uses traffic filtering – routers drop packets with the automatically discovered destination port • Worm containment: protect internal hosts from internal and external threats; notify other networks about attacks

Design • Maintains one port-list for each direction (incoming and outgoing), recording the number of connections to different destination ports • Timer for each entry in the port-lists: • If a port has not been accessed for a certain time interval, reset the corresponding list entry • Monitor outgoing destination addresses for ports with non-zero entries in both port-lists • Every T seconds, check the number of unique addresses observed within the last time interval. Worm traffic is detected by a condition on the following quantities: • N is the number of unique addresses observed. • The long-term average of N • A sensitivity parameter sets the system's sensitivity to changes
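The two-step detection on this slide (port matching, then address counting) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the threshold rule `N > (1 + delta) * baseline`, the exponentially weighted long-term average, and the names `DewpDetector`, `delta`, `alpha`, and `min_baseline` are all hypothetical. The per-entry timers that reset idle ports are omitted for brevity.

```python
from collections import defaultdict


class DewpDetector:
    """Toy DEWP-style detector: port matching followed by address counting."""

    def __init__(self, delta=1.0, alpha=0.9, min_baseline=5.0):
        self.delta = delta                # assumed sensitivity to changes
        self.alpha = alpha                # assumed smoothing weight for the average
        self.min_baseline = min_baseline  # assumed floor so new ports can be judged
        # One port-list per direction: destination port -> connection count.
        self.port_counts = {"in": defaultdict(int), "out": defaultdict(int)}
        # Destination port -> unique outgoing addresses seen this interval.
        self.addresses = defaultdict(set)
        # Destination port -> long-term average of N.
        self.average = defaultdict(float)

    def observe(self, direction, dst_port, dst_addr):
        """Record one connection; direction is "in" or "out"."""
        self.port_counts[direction][dst_port] += 1
        if direction == "out":
            self.addresses[dst_port].add(dst_addr)

    def matched_ports(self):
        """Ports with non-zero entries in both port-lists."""
        outgoing = self.port_counts["out"]
        return {p for p, c in self.port_counts["in"].items()
                if c > 0 and outgoing.get(p, 0) > 0}

    def end_interval(self):
        """Call every T seconds; returns the ports flagged as worm traffic."""
        flagged = set()
        for port in self.matched_ports():
            n = len(self.addresses[port])
            baseline = max(self.average[port], self.min_baseline)
            if n > (1 + self.delta) * baseline:
                flagged.add(port)  # a packet filter would now drop this port
            else:
                # Only unflagged intervals feed the long-term average.
                self.average[port] = (self.alpha * self.average[port]
                                      + (1 - self.alpha) * n)
        self.addresses.clear()
        return flagged
```

A router would call `observe` per connection and `end_interval` once per detection interval, passing any flagged ports to the packet filter. Keeping flagged intervals out of the average is a design choice made here so that a sustained outbreak does not raise its own baseline.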

Effectiveness of Worm Detection and Quarantine • Random scanning worm: detects worm traffic in 4.8 seconds when fully deployed with a 1 second detection interval. • Always detects worm probing traffic in 4-5 seconds when deployed to different layers. • Number of infected hosts in the protected network – primarily determined by the number of probing packets received from outside • Can protect almost all hosts from infection when only deployed on the access router.

Local Scanning • Local scanning: Can detect worm probing traffic in 3.87 seconds. But almost all vulnerable hosts in the protected network are compromised • Deployment has little impact on either detection delay or infection percentage. • The infection percentage increases as the number of layers with DEWP deployed is reduced: when deployed only on the access router, all vulnerable hosts are compromised within 10 seconds • More frequent detection reduces vulnerability to local-scanning worms • DEWP quickly detects worm attacks regardless of probing technique. • With full deployment, about 9% of vulnerable hosts in the protected network are compromised • Because local-scanning worms are difficult to quarantine effectively, a very small detection interval and wide deployment are critical to protecting vulnerable hosts

Effect of Detection Intervals • Address-counting with an interval of T seconds. • Different detection intervals affect detection delay and infection percentage • Random-scanning worms: detection delay and the number of infected hosts increase with the detection interval. • Local-scanning worms: 1) No significant difference in detection delay; 2) Infection percentage increases dramatically at larger intervals. • So, an automatic system needs to react to worm traffic within small time intervals

False Detections • No false positives • Discovered ~10 suspicious destination ports including 21 (FTP), 53 (DNS), and 80 (Web) • Depends on address-counting to reduce false positives • Worm scan rate C affects false negatives: when a worm scans at a low rate, its probing traffic has less effect on overall traffic, and DEWP routers have more difficulty distinguishing it from normal traffic. • With C = 500, worm traffic stands out compared to regular traffic • DEWP cannot detect worms with a scanning rate lower than C = 25.

Conclusions • Detects and quarantines propagation of Internet worms • Uses port-matching and address-counting as the signature. • Detects worm attacks within 4-5 seconds • By automatically blocking worm traffic, it protects most vulnerable hosts from random-scanning worms. • Authors believe that an automatic worm detection and containment system should be widely deployed and have very small detection intervals • It is not realistic to deploy DEWP on all routers; for random-scanning worms, deployment on the access router is sufficient.

Worm Detection Fast Detection of Scanning Worm Infection

Detection Techniques • Reverse Sequential Hypothesis Testing (TH) • Detects worms based upon number of failed connection attempts • Uses probability to determine if a local host is scanning • Designed to be tied into a containment system • Signature Based Analysis (Early Bird System (EBS)) • Detects worms based upon Rabin signatures of content/port • Used in conjunction with a containment system

Definitions

Basic Algorithm • Maintain separate state information for each host (l) being monitored: the likelihood ratio, the set of hosts that have been previously contacted (PCH), and an FCC queue (FCCQ) of first-contact attempts that have been made but not yet recorded as observations. • When an outgoing packet from l to a destination d is observed, check whether d is in the PCH of l; if not, add d to the PCH and add the attempt to the FCCQ as PENDING. • When an incoming packet is sent to l and the source address exists in the FCCQ, update the record to SUCCESS in the FCCQ unless the packet is a TCP RST. • When the head entry of the FCCQ has status PENDING and has been in the queue for longer than a predefined time limit, set its status to FAILURE. • If the entry at the head of the FCCQ has a status other than PENDING, remove it, use it to update the likelihood ratio, and compare the ratio to the threshold η1
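The bookkeeping on this slide can be sketched as below. It is a simplified illustration, not the authors' code: the success probabilities `theta0` (benign host) and `theta1` (scanning host), the threshold `eta1`, the `timeout`, and all class and method names are assumed values chosen for the example. The "reverse" part of the test is modelled by never letting the likelihood ratio drop below 1.

```python
from collections import deque

PENDING, SUCCESS, FAILURE = "PENDING", "SUCCESS", "FAILURE"


class HostState:
    """Per-host state: likelihood ratio, PCH set, and FCC queue."""

    def __init__(self, theta0=0.7, theta1=0.1, eta1=100.0, timeout=3.0):
        self.theta0 = theta0    # assumed P(first contact succeeds | benign)
        self.theta1 = theta1    # assumed P(first contact succeeds | scanner)
        self.eta1 = eta1        # assumed upper threshold
        self.timeout = timeout  # assumed seconds before PENDING becomes FAILURE
        self.ratio = 1.0
        self.pch = set()        # previously contacted hosts
        self.fccq = deque()     # entries are [destination, time, status]
        self.flagged = False

    def outgoing(self, dest, now):
        """Host l sends a packet to dest."""
        if dest not in self.pch:
            self.pch.add(dest)
            self.fccq.append([dest, now, PENDING])
        self.advance(now)

    def incoming(self, src, now, is_rst=False):
        """A packet from src arrives at host l."""
        if not is_rst:
            for entry in self.fccq:
                if entry[0] == src and entry[2] == PENDING:
                    entry[2] = SUCCESS
        self.advance(now)

    def advance(self, now):
        """Resolve entries at the head of the queue and update the ratio."""
        while self.fccq:
            _, started, status = self.fccq[0]
            if status == PENDING:
                if now - started <= self.timeout:
                    break               # head is still waiting for a reply
                status = FAILURE        # timed out
            self.fccq.popleft()
            if status == SUCCESS:
                self.ratio *= self.theta1 / self.theta0
            else:
                self.ratio *= (1 - self.theta1) / (1 - self.theta0)
            self.ratio = max(1.0, self.ratio)  # reverse test: floor at 1
            if self.ratio >= self.eta1:
                self.flagged = True
```

With these illustrative parameters each failure roughly triples the ratio, so a handful of consecutive failed first contacts is enough to cross the threshold, while successes from a benign host keep the ratio pinned at 1.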

Basic Algorithm Credit Based Connection Rate Limiting (CBCRL) • Simple scheme to limit the number of connections a host l can make in a given time slot by allotting each l a number of credits (C_l) that is adjusted as events occur. • Used in conjunction with TH to limit the number of connections a host can make, giving TH time to determine whether the host is infected.
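A minimal sketch of the credit idea follows. The starting balance, the cost of one credit per first-contact attempt, and the reward per successful connection are assumptions made for illustration, as are the names `CreditLimiter`, `try_first_contact`, and `on_success`. Any periodic adjustment of balances that a full scheme might apply is left out.

```python
class CreditLimiter:
    """Toy credit-based connection rate limiter for a single host l."""

    def __init__(self, initial=10, reward=2):
        self.credits = initial  # C_l, assumed starting balance
        self.reward = reward    # assumed credits returned per success

    def try_first_contact(self):
        """Spend one credit; refuse the connection when none remain."""
        if self.credits <= 0:
            return False
        self.credits -= 1
        return True

    def on_success(self):
        """A first contact succeeded, so the host earns credits back."""
        self.credits += self.reward


limiter = CreditLimiter(initial=3)
attempts = [limiter.try_first_contact() for _ in range(5)]
print(attempts)
```

A host whose first contacts mostly succeed keeps earning credits and is rarely slowed down, whereas a scanner whose attempts fail runs out of credits and stalls, which buys the hypothesis test time to reach a decision.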

Experiment • Conducted two experiments in 2003 (isp-2003) and 2004 (isp-04). • Worms were identified by comparing traffic to known worm descriptions.

Results

Limitations, Future Work? Are there any serious flaws in this algorithm? Future work? • Warhol-type scanning • Network outages can cause TH to decide that a host is a worm • Worms could conceivably collaborate to defy detection • Worms could remember hosts that they can contact and defy detection through them • Spoofing attack to get an uninfected host blocked • Interleave scanning with benign activities (e.g., for every scan, visit a website that is known to be running) • Can be trivially modified to work with the containment strategies discussed earlier

THERMINATOR!!! Science comes to the aid of network-level anomaly detection

Network behavior is complicated • How do we use “microscopic” packet-level data to make “macro” network-level decisions? • Too broad, e.g. keeping track of global traffic patterns. • Too refined, e.g. looking at individual packets. • Hmm… who else tries to make sense of the overall behavior of millions of single objects? • Physicists and Chemists!

Idea • Given a computer network with >1000 nodes, • Want to detect anomalous traffic, without any foreknowledge. • Idea of THERMINATOR • Take advantage of lots of packet-level data. • Use physical techniques to distill information into relevant statistics: Temperature, entropy, etc.

Data Reduction • Take the set of hosts and group them into “buckets” or “conversation groups”. • Observe communication among buckets. • Calculate physical statistics based on these higher-level communications. • By virtue of the mathematics, these are guaranteed to be the same as if we’d just looked at hosts.
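The reduction step can be illustrated with a small sketch. It is not THERMINATOR's method: the rule for assigning hosts to buckets (address modulo the number of buckets) is an arbitrary assumption, and Shannon entropy over bucket-to-bucket conversations is used only as a stand-in for the physical statistics, whose definitions are not given on the slide. The function names are hypothetical.

```python
import ipaddress
import math
from collections import Counter


def bucket_of(addr, n_buckets):
    """Assumed grouping rule: integer value of the address modulo n_buckets."""
    return int(ipaddress.ip_address(addr)) % n_buckets


def reduce_flows(flows, n_buckets):
    """Collapse host-to-host flows into bucket-to-bucket conversation counts."""
    return Counter((bucket_of(src, n_buckets), bucket_of(dst, n_buckets))
                   for src, dst in flows)


def shannon_entropy(counts):
    """Entropy in bits of the conversation distribution (stand-in statistic)."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flows = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.5", "10.0.0.6"),
    ("10.0.0.3", "10.0.0.4"),
]
conversations = reduce_flows(flows, n_buckets=4)
print(conversations, shannon_entropy(conversations))
```

Tracking such a statistic per time window is one way a sudden change in who talks to whom could surface, even when no individual packet looks suspicious.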

Physical Network Visualization • Based on reduced data, we know pseudo-physical statistics: • Bucket size • Temperature • Entropy • Heating rate • Work rate • Visualizing these data shows network events. Image courtesy of DISA

Network Event Detection

THERMINATOR Implementation • Jointly developed by DISA, NSA, and Lancope Inc. • Uses Lancope’s data-collection hardware to provide data to THERMINATOR. • THERMINATOR reduces data, computes stats, and provides visualization. • “Research tests validated that THERMINATOR detected anomalies that the intrusion detection systems did not capture.” -- NSA

Conclusion • Combined approaches (host-based, network-based, visualization)? • Can signatures be automatically generated? • Can attacks be visualized? • Potential impacts of false positives (is the medicine worse than the sickness) and automated containment? • Need different solutions for local-scanning vs. non-local scanning worms? • Are there other scientific areas that malware research can leverage?
