Evaluating Static Analysis Tools

Dr. Paul E. Black

paul.black@nist.gov

http://samate.nist.gov/

Static and Dynamic Analysis Complement Each Other

Static Analysis
  • Examines the code itself
  • Handles unfinished code
  • Can find backdoors, e.g., full access for user name “JoshuaCaleb”
  • Potentially complete

Dynamic Analysis
  • Runs the code
  • Source code not needed, e.g., for embedded systems
  • Has few(er) assumptions
  • Covers end-to-end or system tests
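
As a concrete illustration of the backdoor bullet, here is a minimal C sketch; the function names and logic are hypothetical (only the “JoshuaCaleb” string comes from the slide), not code from any real system. A static analyzer or reviewer can spot the magic string even in unfinished code, while black-box dynamic testing would find it only by guessing the exact name.

    /* Hypothetical backdoor sketch, for illustration only. */
    #include <string.h>

    #define ACCESS_FULL 2

    int access_level(const char *user, int stored_level)
    {
        /* Backdoor: this literal grants full access regardless of the
         * stored permission.  A static tool can flag the comparison with
         * a hard-coded credential without ever running the program. */
        if (strcmp(user, "JoshuaCaleb") == 0)
            return ACCESS_FULL;
        return stored_level;
    }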

Different Static Analyzers Are Used For Different Purposes
  • To check for intellectual property violations
  • By developers, to decide if anything needs to be fixed (and to learn better practices)
  • By auditors or reviewers, to decide if the code is good enough for use

Dimensions of Static Analysis

[Diagram with three axes: Properties, from application-specific (explicit) to general (implicit); Code, covering source code, byte code, and binary; and Level of Rigor, from syntactic through heuristic and analytic to formal.]

  • Analysis can look for general or application-specific properties
  • Analysis can be on source code, byte code, or binary
  • The level of rigor can vary from syntactic to fully formal
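
To make the “level of rigor” axis concrete, here is a small hypothetical C example (not from the deck): a purely syntactic checker flags every call to strcpy, while a more analytic, flow-aware tool can separate the provably bounded copy from the one that can actually overflow.

    #include <string.h>

    void copy_fixed(void)
    {
        char buf[4];
        /* A syntactic rule fires on this strcpy too, but data-flow
         * analysis can see the source is a 3-character literal that
         * fits in buf, so it need not warn. */
        strcpy(buf, "abc");
    }

    void copy_user(char dst[16], const char *user_input)
    {
        /* Every level of rigor should warn here: user_input has unknown
         * length, so this strcpy can overflow the 16-byte buffer. */
        strcpy(dst, user_input);
    }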

SATE 2008 Overview
  • Static Analysis Tool Exposition (SATE) goals:
    • Enable empirical research based on large test sets
    • Encourage improvement of tools
    • Speed adoption of tools by objectively demonstrating their use on real software
  • NOT to choose the “best” tool
  • Co-funded by NIST and DHS National Cyber Security Division
  • Participants:
    • Aspect Security ASC
    • Checkmarx CxSuite
    • Flawfinder
    • Fortify SCA
    • Grammatech CodeSonar
    • HP DevInspect
    • SofCheck Inspector for Java
    • UMD FindBugs
    • Veracode SecurityReview

SATE 2008 Events
  • Teleconferences, etc., to settle on procedures and goals
  • We chose 6 C and Java programs with security implications and gave them to tool makers (15 Feb)
  • Tool makers ran their tools and returned reports (29 Feb)
  • We analyzed the reports and (tried to) find “ground truth” (15 Apr)
    • We expected a few thousand warnings; we got over 48,000
  • Critique and update rounds with some tool makers (13 May)
  • Everyone shared observations at a workshop (12 June)
  • We released our final report and all data on 30 June 2009

http://samate.nist.gov/index.php/SATE.html

SATE 2008: There’s No Such Thing as “One Weakness”
  • Only 1/8 to 1/3 of weaknesses are simple.
  • The notion breaks down when
    • weakness classes are related and
    • data or control flows are intermingled.
  • Even “location” is nebulous.

How Weakness Classes Relate

[Diagram relating weakness classes: Improper Input Validation (CWE-20), Command Injection (CWE-77), Cross-Site Scripting (CWE-79), Validate-Before-Canonicalize (CWE-180), Relative Path Traversal (CWE-23), Predictability (CWE-340), Container Errors (CWE-216), Symlink Following (CWE-61), Race Conditions (CWE-362), and Permissions (CWE-275).]

  • Hierarchy
  • Chains, e.g., lang = %2e./%2e./%2e/etc/passwd%00
  • Composites
  • from “Chains and Composites”, Steve Christey, MITRE, http://cwe.mitre.org/data/reports/chains_and_composites.html
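
A hedged C sketch of the chain behind the lang example above (hypothetical function and file names, not the code SATE analyzed): the program validates the parameter before canonicalizing it (CWE-180, validate-before-canonicalize), so the encoded dots slip past the check, and decoding afterwards re-creates a relative path traversal (CWE-23).

    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Minimal percent-decoder, just enough for this illustration. */
    static void url_decode(char *s)
    {
        char *out = s;
        while (*s) {
            if (s[0] == '%' && isxdigit((unsigned char)s[1])
                            && isxdigit((unsigned char)s[2])) {
                char hex[3] = { s[1], s[2], '\0' };
                *out++ = (char)strtol(hex, NULL, 16);
                s += 3;
            } else {
                *out++ = *s++;
            }
        }
        *out = '\0';
    }

    FILE *open_language_file(const char *lang)
    {
        char path[256];

        /* CWE-180: the "no dot-dot" check runs on the still-encoded input,
         * so lang=%2e./%2e./%2e/etc/passwd%00 looks harmless here. */
        if (strstr(lang, "..") != NULL)
            return NULL;

        snprintf(path, sizeof path, "messages/%s.txt", lang);

        /* Decoding after validation turns %2e back into '.', re-creating
         * "../" sequences (CWE-23) and a %00 that truncates ".txt". */
        url_decode(path);
        return fopen(path, "r");
    }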

Intermingled Flow: 2 sources, 2 sinks, 4 paths. How many weakness sites?

[Diagram: two “free” sites (lines 1503 and 2644) and two “use” sites (lines 808 and 819) in the analyzed program, connected by four possible flows.]
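
A minimal C sketch (not the actual SATE test program; the line numbers above refer to that program) of why counting weakness sites is ambiguous: two statements free the same pointer and two later statements use it, so a single dangling pointer gives four free-to-use paths, and a tool might reasonably report one, two, or four weaknesses.

    #include <stdio.h>
    #include <stdlib.h>

    static void release(char *p, int early)
    {
        if (early)
            free(p);               /* free site 1 */
        else
            free(p);               /* free site 2 */
    }

    static void report(const char *p, int verbose)
    {
        if (verbose)
            printf("%s\n", p);     /* use site 1: p may already be freed */
        else
            fputs(p, stderr);      /* use site 2: p may already be freed */
    }

    int main(void)
    {
        char *p = malloc(16);
        if (p == NULL)
            return 1;
        snprintf(p, 16, "hello");
        release(p, 1);             /* either free site can run ...        */
        report(p, 0);              /* ... before either use site: 4 paths */
        return 0;
    }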

Other Observations
  • Tools can’t catch everything: cleartext transmission, unimplemented features, improper access control, …
  • Tools catch real problems: XSS, buffer overflow, cross-site request forgery - 13 of SANS Top 25 (21 with related CWEs)
  • Tools reported some 200 different kinds of weaknesses
    • Buffer errors still very frequent in C
    • Many XSS errors in Java
  • “Raw” report rates vary by 3x depending on code
  • Tools are even more helpful when “tuned”
  • Coding without security in mind leaves MANY weaknesses

Current Source Code Security Analyzers Have Little Overlap

[Chart, from MITRE: a breakdown of reported hits.]
  • Non-overlap: hits reported by one tool and no others (84%)
  • Overlap: hits reported by more than one tool (16%), subdivided into hits found by 2, 3, 4, or all 5 tools

Precision & Recall Scoring

[Chart, from DoD: each tool is plotted on two 0 to 100 axes, one running from “Misses Everything” to “Reports Everything” and the other from “No True Positives” to “All True Positives”. The Perfect Tool, which finds all flaws and finds only flaws, sits at the corner of both axes; “finds more flaws” and “finds mostly flaws” point toward it, marked “Better”.]
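
The deck does not spell out formulas for these axes; assuming the usual definitions, with TP the true positives a tool reports, FP its false positives, and FN the flaws it misses:

  \[
    \text{precision} = \frac{TP}{TP + FP},
    \qquad
    \text{recall} = \frac{TP}{TP + FN}
  \]

On these charts, the “No True Positives” to “All True Positives” axis tracks precision and the “Misses Everything” to “Reports Everything” axis tracks recall; the perfect tool scores 100 on both.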

Tool A

[Chart, from DoD: the same precision/recall axes for Tool A, with one point per flaw type: use after free, TOCTOU, tainted data/unvalidated user input, memory leak, uninitialized variable use, null pointer dereference, buffer overflow, and improper return value use, plus an aggregate point for all flaw types.]

Tool B

[Chart, from DoD: the same precision/recall axes for Tool B, with points for command injection, tainted data/unvalidated user input, format string vulnerability, improper return value use, use after free, buffer overflow, TOCTOU, uninitialized variable use, memory leak, and null pointer dereference, plus an aggregate point for all flaw types.]

Best Tool

[Chart, from DoD: the same precision/recall axes for the best tool, with points for format string vulnerability, tainted data/unvalidated user input, command injection, improper return value use, buffer overflow, null pointer dereference, use after free, TOCTOU, memory leak, and uninitialized variable use.]

Tools Useful in Quality “Plains”
  • Tools alone are not enough to achieve the highest “peaks” of quality.
  • In the “plains” of typical quality, tools can help.
  • If code is adrift in a “sea” of chaos, train developers.

[Photo: Tararua mountains and the Horowhenua region, New Zealand. Swazi Apparel Limited, www.swazi.co.nz, used with permission.]

Tips on Tool Evaluation
  • Start with many examples covering code complexities and weaknesses
    • SAMATE Reference Dataset (SRD): http://samate.nist.gov/SRD
    • Many cases from MIT: Lippmann, Zitser, Leek, Kratkiewicz
  • Add some of your typical code.
  • Look for
    • Weakness types (CWEs) reported
    • Code complexities handled
    • Traces, explanations, and other analyst support
    • Integration and machine-readable reports
    • Ability to write rules and to ignore “known good” code
  • False alarm ratio (fp/tp) is a poor measure; report density (r/kLoC) is probably better (see the worked comparison below).
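
To make that last bullet concrete, here is a worked comparison with hypothetical numbers (not from the deck): suppose a tool emits r = 300 reports on a 50 kLoC program and 60 of those reports are true positives.

  \[
    \text{false alarm ratio} = \frac{fp}{tp} = \frac{300 - 60}{60} = 4,
    \qquad
    \text{report density} = \frac{r}{\text{kLoC}} = \frac{300}{50} = 6
  \]

The ratio swings with how flawed the particular code base happens to be, while the density tells an auditor roughly how much triage work to expect per thousand lines of code.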