SATE 2010 Background

Vadim Okun, NIST

[email protected]

October 1, 2010

The SAMATE Project

Cautions on Using SATE Data
  • Our analysis procedure has limitations
  • In practice, users write special rules, suppress false positives, and write code in certain ways to minimize tool warnings.
  • There are many other factors that we did not consider: user interface, integration, etc.
  • So do NOT use our analysis to rate/choose tools

[Diagram: overlap of warnings from Tool A, Tool B, and Tool C with CVE entries and human findings, for tools that work on source code]
SATE 2010 timeline
  • Choose test programs (2 C, 1 C++, 2 Java). Provide them to tool makers (28 June)
  • Teams run their tools, return reports (30 July)
  • Analyze the tool reports (22 Sept)
  • Report at the workshop (1 Oct)
  • Teams submit a research paper (Dec)
  • Publish data (between Feb and May 2011)
Participating teams
  • Armorize CodeSecure
  • Concordia University Marfcat
  • Coverity Static Analysis for C/C++
  • Cppcheck
  • Grammatech CodeSonar
  • LDRA Testbed
  • Red Lizard Software Goanna
  • Seoul National University Sparrow
  • SofCheck Inspector
  • Veracode
    • a service company
Test cases
  • Dovecot: secure IMAP and POP3 server – C
  • Pebble: weblog server – Java
  • Wireshark – C
  • Google Chrome – C++
  • Apache Tomcat – Java
  • All are open source programs
  • All have aspects relevant to security
  • From 30k LoC (Pebble) to 4.7M LoC (Chrome)

  • Wireshark, Chrome, and Tomcat: CVE-based pairs (vulnerable and fixed versions)

Tool reports
  • Teams converted reports to SATE format
    • Several teams also provided original reports
  • Described environment in which they ran tool
  • Some teams tuned their tools
  • Several teams provided analysis of their tool warnings

[Diagram: original tool formats converted to the common SATE format]
Analysis procedure

[Diagram: ~60K tool warnings → 3 selection methods → analyze for correctness and associate warnings; relate to human findings and to CVEs; publish the data]
Method 1 – Warning selection (Dovecot and Pebble only)
  • We assigned severity if a tool did not
  • Mostly avoid warnings with severity 5 (lowest)
  • Statistically select from each warning class
  • Select more warnings from higher severities
  • Select 30 warnings from each of 10 tool reports
    • 1 report had only 6 warnings
    • Did not analyze Marfcat warnings
  • Total is 276
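A minimal sketch of severity-weighted selection along these lines, assuming a per-class share that grows with severity; the `select_warnings` helper, the share formula, and the demo records are hypothetical, not SATE's actual statistical procedure:

```python
import random
from collections import defaultdict

def select_warnings(warnings, quota=30, seed=0):
    """Select up to `quota` warnings per report, favoring higher severities.

    Severity 1 is the highest; severity-5 warnings are skipped entirely here
    (SATE "mostly" avoided them)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for w in warnings:
        if w["severity"] < 5:
            by_class[(w["name"], w["severity"])].append(w)
    selected = []
    # Visit warning classes from highest severity (1) to lowest kept (4).
    for key in sorted(by_class, key=lambda k: k[1]):
        if len(selected) >= quota:
            break
        share = 5 - key[1]  # severity 1 -> 4 picks, severity 4 -> 1 pick
        pool = by_class[key]
        selected.extend(rng.sample(pool, min(share, len(pool), quota - len(selected))))
    return selected

# Hypothetical warning records: only class name and severity matter here.
demo = [{"name": "buffer overflow", "severity": s} for s in (1, 1, 2, 3, 5, 5)]
picked = select_warnings(demo)
```

The per-class share is one way to "select more warnings from higher severities" while still sampling every retained class at least once.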
Method 2 – Human findings (Dovecot and Pebble only)
  • Security experts analyze test cases
  • A small number of findings
    • Root cause, with an example trace
  • Find related warnings from tools
  • Goal: focus our analysis on the weaknesses that security experts considered most important
Method 3 – CVEs (Wireshark, Chrome, and Tomcat)
  • Identify the CVEs
    • Locations in code
  • Find related warnings from tools
  • Goal: focus our analysis on real-life exploitable vulnerabilities
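One way to automate a first pass of this matching is a simple proximity check between warning locations and known CVE locations. In SATE the matching was done by analysts; `warnings_near_cve`, the two-line window, and the demo data below are assumptions for illustration only:

```python
def warnings_near_cve(warnings, cve_locations, window=2):
    """Return warnings whose location is within `window` lines of a known
    CVE location, given as (path, line) pairs."""
    hits = []
    for w in warnings:
        for path, line in w["locations"]:
            if any(path == cpath and abs(line - cline) <= window
                   for cpath, cline in cve_locations):
                hits.append(w)
                break
    return hits

# Hypothetical data: one warning lands next to a CVE location, one does not.
cves = [("epan/dissect.c", 120)]
warnings = [
    {"id": "7", "locations": [("epan/dissect.c", 121)]},
    {"id": "8", "locations": [("ui/main.c", 5)]},
]
related = warnings_near_cve(warnings, cves)
```

Candidates found this way would still need manual review, since a warning near a CVE's fix location is not necessarily about the same flaw.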
Correctness categories
  • True security weakness
  • True quality weakness
  • True but insignificant weakness
  • Weakness status unknown
  • Not a weakness
Differences from SATE 2009
  • Add CVE-selected test cases
  • Include a C++ test case
  • Larger test cases: Chrome - 4.7 MLOC
  • More correctness categories (true quality)
  • More detailed guidelines for analysis
  • Still, much can be improved…

Acknowledgments

  • Romain Gaucher, Ram Sugasi
  • Aurelien Delaitre, Sue Wang, Paul Black, Charline Cleraux, and other SAMATE team members
  • Most of all, the participating teams!

SATE overview

  • What weaknesses exist in real programs?
  • What do tools report for real programs?
  • Do tools find important weaknesses?
  • Focus on tools that work on source code
  • Defects that may affect security
  • Goal is NOT to choose the “best” tools
  • This is the 3rd SATE (1st in 2008)
SATE goals
  • To enable empirical research based on large test sets
  • To encourage improvement and adoption of tools
  • NOT to choose the “best” tools
SATE common tool output format


  <weakness id="23">
    <name cweid="89">SQL Injection</name>
    <location id="1" path="dir/f.c" line="71"/>   <!-- one or more locations -->
    <grade severity="2" probability="0.5"/>       <!-- severity 1 to 5, with 1 the highest; probability that it is true -->
    <output> Query is constructed with user supplied input … </output>
    <!-- …and other annotation -->
  </weakness>

Lessons learned
  • Guidelines for analysis were often ambiguous – they need further refinement
  • Our analysis has inconsistencies and lapses
  • Careful analysis takes longer than expected
    • We do not know the code well
  • Tool interface is important to understand a weakness
Analysis procedure
  • We cannot know all weaknesses in the test cases
  • Impractical to analyze all tool warnings

So analyze the following:

  • Method 1. A subset of warnings from each tool report
  • Method 2. Tool warnings related to manually identified weaknesses
  • Method 3. Tool warnings related to CVEs

SATE tool output format

  • Common format in XML
  • For each weakness
    • One or more traces: locations given as line number and pathname
    • Name of weakness and (optional) CWE id
    • Severity: 1 to 5 (ordinal scale), with 1 – the highest
    • Probability that the problem is a true positive
    • Original message from the tool
    • And other annotation
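As a rough illustration, a report in this format can be read with standard XML tooling. The report text, the `parse_weaknesses` helper, and the field names of the returned dicts are illustrative assumptions, not part of the SATE specification:

```python
import xml.etree.ElementTree as ET

# Hypothetical report containing one weakness in the SATE common format.
report = """
<report>
  <weakness id="23">
    <name cweid="89">SQL Injection</name>
    <location id="1" path="dir/f.c" line="71"/>
    <grade severity="2" probability="0.5"/>
    <output>Query is constructed with user supplied input ...</output>
  </weakness>
</report>
"""

def parse_weaknesses(xml_text):
    """Extract id, CWE, name, grade, and locations for each weakness."""
    root = ET.fromstring(xml_text)
    out = []
    for w in root.iter("weakness"):
        name = w.find("name")
        grade = w.find("grade")
        out.append({
            "id": w.get("id"),
            "cwe": name.get("cweid"),
            "name": name.text,
            "severity": int(grade.get("severity")),       # 1 (highest) to 5
            "probability": float(grade.get("probability")),
            "locations": [(l.get("path"), int(l.get("line")))
                          for l in w.findall("location")],
        })
    return out

weaknesses = parse_weaknesses(report)
```

A common machine-readable format is what makes the cross-tool selection and association steps in the analysis procedure feasible.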
Our analysis
  • Correctness of warning
  • Associate warnings that refer to the same (or similar) weakness
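A crude sketch of the association step, grouping warnings that report exactly the same file and line; SATE's actual association also weighed weakness names and nearby lines, and `associate` and the demo records below are hypothetical:

```python
from collections import defaultdict

def associate(warnings):
    """Group warning ids that point at the same (path, line) location."""
    groups = defaultdict(list)
    for w in warnings:
        for path, line in w["locations"]:
            groups[(path, line)].append(w["id"])
    # Keep only locations reported by more than one warning.
    return {loc: ids for loc, ids in groups.items() if len(ids) > 1}

# Hypothetical warnings: two tools flag the same line of f.c.
ws = [
    {"id": "1", "locations": [("f.c", 10)]},
    {"id": "2", "locations": [("f.c", 10)]},
    {"id": "3", "locations": [("g.c", 5)]},
]
same = associate(ws)
```

Exact-location grouping only catches the easy cases; similar weaknesses reported a few lines apart still require human judgment.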