Finding Security Vulnerabilities in Java Applications with Static Analysis
USENIX Security 2005
Authors: V. Benjamin Livshits and Monica S. Lam
Presented in UIUC CS527 (Fall ’07) by Matt Stockton
Introduction / Motivation
How can we solve this dilemma?
Definition: the analysis of computer software that is performed without actually executing programs built from that software (Wikipedia)
Many static analysis tools exist for analyzing C/C++ code. These look for buffer overflows, format string vulnerabilities, etc.
Java language safety features prevent direct memory access so static analysis is not as necessary…or is it?
Even with automatic memory management, Java applications are still exploitable, although the attack vectors are quite different. The paper presents a technique for identifying these attack vectors using static analysis.
Parameter Tampering – Enter maliciously formed data into HTML forms
URL Tampering – Directly edit the URL string (usually modifying an HTTP GET request after form submissions)
Hidden Field Manipulation – Web sites sometimes use hidden form fields for persistence. An attacker can manually change their values.
HTTP Header Manipulation – Free tools allow you to intercept browser requests, and change HTTP headers.
Cookie Poisoning – Manually modify web site cookies stored on a computer
Non-Web Input Sources – Modify command-line parameters sent to web application management scripts.
SQL Injections – Use input to generate SQL queries that leak information from the database, or perform a malicious insert, update, or deletion. Example: Username: Matt ' OR 1 = 1; --
HTTP Response Splitting – User-controlled content ends up in the HTTP response header. The web application could be made to send multiple responses, which could corrupt a proxy cache.
Path Traversal – Crafted user input allows the user to read / write / update files that shouldn’t be accessible. Example: File To Delete:
Command Injection – Force the web application to execute a command it shouldn’t be executing
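To make the SQL injection example concrete, here is a minimal sketch of the vulnerable pattern: user input concatenated directly into the query text. No database is contacted; the class name, method name, and the `users` table are assumptions made for illustration only.

```java
// Illustrative sketch only: nothing here talks to a real database.
// SqlInjectionSketch, buildQuery, and the "users" table are hypothetical names.
final class SqlInjectionSketch {

    // Vulnerable pattern: the user-supplied name is pasted into the query text.
    static String buildQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    public static void main(String[] args) {
        // Benign input produces the query the developer intended.
        System.out.println(buildQuery("Matt"));

        // Crafted input closes the string literal and adds an always-true
        // condition; the trailing "--" comments out the leftover quote.
        System.out.println(buildQuery("Matt' OR 1 = 1; --"));
    }
}
```

With the crafted input, the WHERE clause no longer restricts the result to a single user, which is the kind of flow from input to query that the paper's analysis is meant to flag.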
Need a less costly, more automated process for this type of auditing. Primary motivation for this paper.
Analyze the code without actually running the application
Different algorithms can be used to analyze the code and find errors; they vary widely in complexity.
If source code is not available, byte code can be used to perform the static analysis (technique used in this tool)
Basic premise: give a static analysis tool a pattern to look for. If the tool finds a match, it notes the match.
Simple example: grep
Complex example: this tool
General Goals for static analysis – Soundness, Precision, Scalability
Proposes a methodology and tool to detect a diverse set of common web application vulnerabilities.
Improve precision of the tool by using fully context-sensitive pointer analysis (fewer false positives)
Deliver an actual implementation of the idea, built as an Eclipse Plug-in: http://suif.stanford.edu/~livshits/work/lapse/
Validate the methodology against real web applications. Found real errors, with few false positives.
Tainted Object Propagation ~ Modeling object flow through an application
Source Descriptor Example
<HttpServletRequest.getParameter(String), -1, ε>
Sink Descriptor Example
<Connection.executeQuery(String), 1, ε>
Derivation Descriptor Examples
<StringBuffer.append(String), 1, ε, -1, ε>
<StringBuffer.toString(), 0, ε, -1, ε>
Using the descriptors, you can theoretically find all sources and sinks in the code, and can understand when a sink uses a tainted source object that is still tainted after manipulation by derivation descriptor rules.
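The descriptor idea can be sketched in a few lines of Java. This is a simplified illustration, not the paper's implementation: it walks a straight-line list of calls instead of analyzing bytecode, it ignores access paths (the ε component), and it uses variable names as stand-ins for heap objects. The position numbering follows the examples above (-1 for the return value, 0 for the receiver, 1 for the first argument). All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: descriptors as lookup tables plus a trace walker.
final class DescriptorSketch {
    static final int RETURN = -1;
    static final int RECEIVER = 0;

    static final Map<String, Integer> SOURCES = new HashMap<>();
    static final Map<String, Integer> SINKS = new HashMap<>();
    static final Map<String, int[]> DERIVATIONS = new HashMap<>(); // {from, to}
    static {
        SOURCES.put("HttpServletRequest.getParameter(String)", RETURN);
        SINKS.put("Connection.executeQuery(String)", 1);
        DERIVATIONS.put("StringBuffer.append(String)", new int[] {1, RETURN});
        DERIVATIONS.put("StringBuffer.toString()", new int[] {RECEIVER, RETURN});
    }

    /** One call: the method plus the object name bound to each position. */
    static final class Call {
        final String method;
        final Map<Integer, String> objects = new HashMap<>();

        Call(String method, String ret, String receiver, String arg1) {
            this.method = method;
            objects.put(RETURN, ret);
            objects.put(RECEIVER, receiver);
            objects.put(1, arg1);
        }
    }

    /** Propagates taint along the trace and reports tainted sink arguments. */
    static List<String> findViolations(List<Call> trace) {
        Set<String> tainted = new HashSet<>();
        List<String> violations = new ArrayList<>();
        for (Call c : trace) {
            Integer src = SOURCES.get(c.method);
            if (src != null) {
                tainted.add(c.objects.get(src));
            }
            int[] d = DERIVATIONS.get(c.method);
            if (d != null && tainted.contains(c.objects.get(d[0]))) {
                tainted.add(c.objects.get(d[1]));
            }
            Integer sink = SINKS.get(c.method);
            if (sink != null && tainted.contains(c.objects.get(sink))) {
                violations.add(c.method + " <- " + c.objects.get(sink));
            }
        }
        return violations;
    }

    /** getParameter -> append -> toString -> executeQuery. */
    static List<Call> exampleTrace() {
        List<Call> t = new ArrayList<>();
        t.add(new Call("HttpServletRequest.getParameter(String)", "param", "req", "key"));
        t.add(new Call("StringBuffer.append(String)", "buf", "buf", "param"));
        t.add(new Call("StringBuffer.toString()", "query", "buf", null));
        t.add(new Call("Connection.executeQuery(String)", "rs", "con", "query"));
        return t;
    }

    public static void main(String[] args) {
        System.out.println(findViolations(exampleTrace()));
    }
}
```

In this sketch the `append` call returns the buffer itself, so the derivation from argument 1 to the return value taints `buf`, and the `toString` derivation then taints `query` before it reaches the sink.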
Generating the rules for sources, sinks, and derivations is a
Without providing 100% coverage for all sources, sinks, and derivations, the
model is incomplete and can miss vulnerabilities!
For this tool, J2EE APIs were evaluated to generate sources and
sinks, and Java String manipulation libraries were evaluated to generate
derivation descriptors.
What if something is missing? Definitely a possibility.
- This tool used some additional static analysis to pinpoint tainted sources that were never passed to a method listed in derivation descriptors. This found additional derivation descriptors.
Concern: What if the source is written to a File, and used later by the application? This ‘derivation’ cannot be covered through String manipulation
To have sound static analysis, your tool needs to track what
object references (program variables) point to tainted objects
(on the heap)
In a naïve implementation, to maintain soundness, you could
end up with a very large number of potentially tainted object
references if you do not perform good points-to analysis.
Example: Are buf1 and buf2 both tainted? Is this a violation?
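The code that accompanied this question did not survive in the text, so the following is a minimal sketch of the situation it describes, under assumptions: `getParameter()` is a stand-in for `HttpServletRequest.getParameter(String)`, the returned string stands for what would be passed to the sink, and the `aliased` flag exists only to show both outcomes at run time.

```java
// Hypothetical sketch of the buf1 / buf2 aliasing question.
final class AliasingSketch {

    // Stand-in for a source method: returns attacker-controlled text.
    static String getParameter() {
        return "Matt' OR 1 = 1; --";
    }

    // Returns the string that would be handed to the sink.
    static String queryFrom(boolean aliased) {
        String param = getParameter();
        StringBuffer buf1 = new StringBuffer();
        // buf2 either refers to the same heap object as buf1 or to a fresh one.
        StringBuffer buf2 = aliased ? buf1 : new StringBuffer();

        buf1.append(param);      // the object behind buf1 is now tainted
        return buf2.toString();  // tainted only if buf2 refers to that object
    }

    public static void main(String[] args) {
        System.out.println("aliased:     [" + queryFrom(true) + "]");
        System.out.println("not aliased: [" + queryFrom(false) + "]");
    }
}
```

A sound analysis that cannot tell whether `buf1` and `buf2` refer to the same object has to assume they might, and report the flow. More precise points-to information is what lets the tool drop the report in the non-aliased case.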
The paper gives few technical details on the BDD (binary decision diagram) method, but it essentially lets the tool perform context-sensitive static analysis, reducing the set of objects that could be tainted.
NOTE: Exact points-to analysis is an undecidable problem. Need a conservative estimate that is still sound (doesn’t miss any tainted objects)
Sound and precise context-based points-to analysis, reducing
the tainted object space
Further reduction of tainted object space by introducing a
clever way to handle Container references – can identify / name
underlying structure of the Container, resulting in a further
reduced tainted object space.
Object naming for String manipulation methods. Introduced
logic to name Strings produced from String manipulation
methods to further reduce tainted object space.
Source, Sink, and Derivation descriptions can be created using Program
Query Language (PQL)
PQL – Java-like language that can be used to describe a sequence of
dynamic events that involves variables referring to object instances
Two main PQL statements define the framework that is used to find security violations. User must then define source(), derived() and sink()
Fairly simple to understand the definitions for source, sink, and derived.
Tested against 8 large open-source web applications
Created a set of source, sink, and derivation descriptors (derivation focused
on the String, StringBuffer, and StringTokenizer classes)
Four combinations of testing (with/without context sensitivity, with/without
improved object naming)
Recorded a total of 41 potential security violations. 29 turned out to be
security errors, and 12 were false positives
More precise with both context sensitivity and improved naming enabled
(and actually faster execution time)
Found two errors in common library code (J2EE and Hibernate)
Almost all errors were confirmed by the application developers, resulting in
Parameter manipulation to perform HTTP splitting was the most prevalent
Browser re-direct attacks based on user-entered data (HTTP referrer field
SQL injection vector in Hibernate library code
All false positives were due to one object naming rule not being defined correctly: StringWriter.toString()
Once this was added to the naming rules, there were no false positives
Input validation / control flow is not handled – If application
does some parameter validation – this tool will not take that
Source / Sink / Derivation descriptions need to be manually
created and potentially updated – J2EE sources / sinks, and
String library descriptors cover a lot – are there more?
Need to manually tune the object naming rules so that you can
minimize false positives
Can you think of other paths not covered by the
Example - user input gets stored to a file, then read in later and used in a sink
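The file round trip mentioned above can be sketched as follows. This only illustrates the flow, not how the tool behaves on it; the class and method names are hypothetical, and a temporary file stands in for whatever storage the application uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: user input leaves the program through a file and
// comes back as a new String object.
final class FileRoundTripSketch {

    static String roundTrip(String userInput) {
        try {
            Path tmp = Files.createTempFile("taint-sketch", ".txt");
            try {
                Files.write(tmp, userInput.getBytes(StandardCharsets.UTF_8));
                // The value read back is a fresh String. No String,
                // StringBuffer, or StringTokenizer call links it to userInput.
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String reloaded = roundTrip("Matt' OR 1 = 1; --");
        // If 'reloaded' were now passed to a sink, the content would be the
        // same attacker-controlled text, reached through file I/O.
        System.out.println(reloaded);
    }
}
```

Covering a flow like this would presumably need additional descriptors for the file read and write calls, since descriptors limited to string manipulation do not connect the two strings.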
Penetration Testing – Black box and white box. Depending on
the effort, may only catch a small sample of security risks. Will
not identify parts of the system that remain untested
Runtime Monitoring – Pattern matching of HTTP requests at
runtime by a proxy. White list of good inputs and/or blacklist
of bad inputs. Protection against errors already manifested in
Protection at levels other than application (e.g. Oracle virtual
private databases to minimize amount of data available to
This paper proposes applying tainted object propagation
techniques to Java web applications, and presents a tool
implemented as an Eclipse plug-in
The proposed technique maintains static analysis soundness,
and increases scalability and precision with context-sensitive
pointer analysis and object naming.
Improved object naming by modifying naming for Containers
Seems like a good tool requiring minimal manual integration
work to use as an additional mechanism to measure your web
Shortcomings - Weak Analysis(?), Manually creating PQL descriptors
Can we use this with other languages (.NET, Ruby on Rails, *SPs)?
Do people actually use PQL? (http://pql.sourceforge.net/)