CS-558. Heat-seeking Honeypots: Design and Experience.
John P. John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy, and Martín Abadi.
Smyrnaki Ourania.
Goal.
Attackers search for vulnerable servers.
Aim to understand the behavior of attackers: how they find vulnerable servers and how they compromise them.
How do attackers find Web servers?
How can we obtain these malicious queries?
Given the query used by the attacker, how do we create an appropriate honeypot?
Install vulnerable Web Software
When is an application compromised?
New files added or application files have been compromised
Manually identify and set up software
Set up web pages matching the query
Crawler fetches the Web pages at these URLs, along with the other elements required to render these pages (e.g., images, CSS).
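The fetching step above can be sketched as follows. This is only an illustration of how a crawler might find the extra elements a page needs for rendering, not the tool used in the original work. The class name `ResourceCollector`, the helper `collect_resources`, and the choice of tags (`img`, `script`, `link rel="stylesheet"`) are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of images, scripts, and stylesheets on a page.

    Hypothetical helper for illustration; the tag list is an assumption.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            # Resolve relative references against the page URL.
            self.resources.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "link" and attributes.get("href"):
            rel = (attributes.get("rel") or "").lower()
            if "stylesheet" in rel:
                self.resources.append(urljoin(self.base_url, attributes["href"]))


def collect_resources(html_text, base_url):
    """Return the render-time resources referenced by html_text, in page order."""
    parser = ResourceCollector(base_url)
    parser.feed(html_text)
    return parser.resources
```

A crawler built on this sketch would download the page itself and then each URL returned by `collect_resources`, so that the local copy renders like the original.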
Log all visits to our local heat-seeking honeypots.
Process log and automatically extract attack traffic.
Honeypots receive legitimate traffic and malicious traffic since our honeypots are publicly accessible.
2 kinds of legitimate traffic:
How can we identify crawler traffic?
Looking for known user agent strings
Disadvantage: Does not always work!
User agent strings are easily spoofed; attackers can use a well-known string to avoid detection.
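A minimal sketch of the user-agent check, and of why it is not enough on its own. The token list `KNOWN_CRAWLER_TOKENS` and the function name `looks_like_known_crawler` are assumptions for this example; a real deployment would keep its own list.

```python
# Illustrative substrings of well-known crawler user agents (assumed list).
KNOWN_CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")


def looks_like_known_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler token.

    This trusts a client-supplied header, so it can be fooled by spoofing.
    """
    normalized = (user_agent or "").lower()
    return any(token in normalized for token in KNOWN_CRAWLER_TOKENS)
```

Because the header is fully controlled by the client, an attacker who sends a string containing `Googlebot` passes this check just as the real crawler does, which is the disadvantage noted above.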
Crawlers visit static and dynamic links.
Dynamic links generated by the real software.
Static links refer to automatically generated honeypots.
More than 200 software honeypot pages that contain dynamic links have been crawled by 3 search engines.
Threshold = 75%: anyone visiting more than 75% of the pages is considered a crawler,
while others are considered legitimate users who reach the honeypot pages.
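The threshold rule can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: visits are assumed to be `(visitor, page)` pairs taken from the honeypot logs, a visitor is identified by a single string such as an IP address, and the names `classify_visitors` and `CRAWLER_THRESHOLD` are hypothetical.

```python
from collections import defaultdict

CRAWLER_THRESHOLD = 0.75  # fraction of honeypot pages, from the 75% rule


def classify_visitors(visits, honeypot_pages, threshold=CRAWLER_THRESHOLD):
    """Label each visitor as 'crawler' or 'user' by the share of pages visited.

    visits: iterable of (visitor, page) pairs (assumed log format).
    honeypot_pages: the full set of pages a crawler could reach.
    """
    pages = set(honeypot_pages)
    if not pages:
        raise ValueError("honeypot_pages must not be empty")

    # Track distinct honeypot pages per visitor; repeat visits count once.
    pages_seen = defaultdict(set)
    for visitor, page in visits:
        if page in pages:
            pages_seen[visitor].add(page)

    labels = {}
    for visitor, seen in pages_seen.items():
        fraction = len(seen) / len(pages)
        # Strictly more than the threshold marks a crawler.
        labels[visitor] = "crawler" if fraction > threshold else "user"
    return labels
```

Counting distinct pages rather than raw requests keeps a visitor who reloads one page many times from being mistaken for a crawler.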
The most popular page, with over 10,000 visits, was for a site running Joomla, a CMS.
Crawled and indexed by Web sites. (96 pages)
Password guesses, software installation attempts, SQL-injection attacks, remote file inclusion attacks, and cross-site scripting (XSS) attacks.