Dependence Graphs for Information Assurance

Tim Teitelbaum, tt@grammatech.com (Cornell)
Paul Anderson, paul@grammatech.com
GrammaTech, Inc., Ithaca, NY
http://www.grammatech.com
OASIS --- Santa Fe

Situation
• Front door open
  • Eligible Receiver: 65% of attacks successful; 63% undetected [Tenet; Minihan]
• Back door (installed and) open
  • Back Orifice, etc. demonstrate the potential of automated intrusion maintenance
• Blinds are up
  • Open sources expose critical software and all its flaws
• Foundation is rotten
  • Buggy software is the norm [Weise; Engler]
  • A dozen new buffer-overrun attacks every week [Epstein]
• The domestic is foreign
  • 195,000 H-1B visas issued per year

DISCEX II Keynotes
• Characteristics
  • Planning
  • Long-term view
  • Strategic investment
  • Stealth
• Questions for OASIS
  • What is your model of a nation-state attack?
  • How do you address it?
• DoD Perspective
  • “Nation states are our greatest concern”
    LTG Edward G. Anderson III, Deputy Commander in Chief, United States Space Command

DISCEX II Keynotes
• Industry Perspective
  • “DARPA should focus on tools for building safer code”
  • “More emphasis needed on correct implementation, especially for security products!”
    Jeremy Epstein, Director, Product Security, webMethods, Inc.
• DoD Perspective
  • “Nation states are our greatest concern”
    LTG Edward G. Anderson III, Deputy Commander in Chief, United States Space Command

Questions
• Front door open
  • What have nation states been doing to us while we have been so exposed?
• Back door (installed and) open
  • What percentage of observed attacks are nation-state attacks?
• Blinds are up
  • Are nation states investing heavily in vulnerability analysis of open source code?
• Foundation is rotten
  • Are nation-state insider code-level attacks distinguishable from bugs?
• The domestic is foreign
  • How many foreign agents program routers for Cisco?
  • How does Cisco defend its products from its own malicious employees?
  • Do you consider firmware part of the TCB? On what basis?

DARPA HARD
President’s Critical Infrastructure Protection Plan Recommendations [page 61]
6.1 Critical Infrastructure Protection Research and Development
Intrusion Detection and Monitoring . . .
Development of Advanced [...] Software Tools for Trap Door Analysis and Malicious Code Detection
This program is designed to develop advanced software tools and techniques that can detect and eliminate trap doors and other malicious code in software. Although detecting subtle but intentional alterations to computer code is problematic, these tools will increase the integrity of software products, and thereby reduce the probability of future penetrations and compromises of computers and networks.

Role of Static Code Analysis in IA
• Assumptions
  • Implementations are critical
  • Tools for understanding code are strategic
• Attacks and Vulnerabilities
  • Trap doors and exploitable bugs
• Approach
  • Statically detect and eliminate
• Other applications of core technology
  • Policy enforcement by code rewriting
  • Model extraction from code
• Scope
  • Mission-critical and mass-market
  • Open-source and closed-source
  • Source and binary

Accomplishments [past 6 months]
• Core Static-Analysis Research
  • Context-Sensitive GMOD / GREF Analysis
  • Fine-grained discrimination by structure-fields
  • Variable-based queries and function-based queries
  • Non-structured control constructs, e.g., switch, break, continue, goto
    • better precision
  • Pointer Analysis
    • better performance [about 6x faster]
  • Interprocedurally-precise model checker (mu-calculus)
• Information Assurance Workbench
  • Buffer overrun vulnerability detection and analysis
  • Pattern matching on AST fragments

Analysis of Buffer Overrun Vulnerabilities
• Code Red attack
  • Begins by exploiting a buffer-overrun vulnerability
• Static analysis
  • Can detect potential buffer-overrun vulnerabilities

Analysis of Buffer Overrun Vulnerabilities, cont.
• Code Red attack
  • Exploits an unchecked byte-string to wide-character-string conversion
  • Assume the operation used was mbstowcs(char *dst, char *src, int length)
  • Can 2*length (two bytes per wide character) be bigger than the size of dst?
• Dependence queries
  • Reveal potential information flows, e.g.,
    • from data sources under user control (external strings)
    • to dangerous operations (unchecked length arguments)
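The question on this slide can be made concrete with a small sketch. The standard C library declares the conversion as size_t mbstowcs(wchar_t *dst, const char *src, size_t n), where n counts wide characters rather than bytes; the signature on the slide is a simplification. The wrapper below, checked_mbstowcs, is a hypothetical helper introduced only for illustration (it is not a library function and not the code attacked by Code Red). It refuses the conversion unless the destination is provably large enough.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert src into dst, where dst_bytes is the size
 * of dst in BYTES. Returns the number of wide characters written (not
 * counting the terminator), or (size_t)-1 if the input might not fit or
 * is not a valid multibyte string. */
size_t checked_mbstowcs(wchar_t *dst, size_t dst_bytes, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* Capacity in wide characters, not bytes: the unit confusion between
     * the two is exactly what the slide's question is about. */
    size_t capacity = dst_bytes / sizeof(wchar_t);

    /* Each wide character consumes at least one input byte, so strlen(src)
     * is a conservative upper bound on the number of wide characters. */
    size_t length = strlen(src);
    if (capacity == 0 || length >= capacity)
        return (size_t)-1;          /* bounds check failed: do not convert */

    size_t n = mbstowcs(dst, src, capacity - 1);
    if (n == (size_t)-1)
        return n;                   /* invalid multibyte sequence */

    dst[n] = L'\0';                 /* n <= capacity - 1, so this is in bounds */
    return n;
}
```

A dependence query of the kind described above would ask whether any user-controlled string can reach the length argument of a conversion without passing through a check like the one in this wrapper.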

Analysis of Buffer Overrun Vulnerabilities, cont.
Can users influence the length argument of a string-to-wide-character conversion?
[Figure: dependence diagram. Nodes: sources of external strings; sources of internal strings; strlength(buf); SIZE; bounds-check; r; and three calls mbstowcs(wide_buf, buf, ); labeled unchecked variable-length argument, unchecked fixed-length argument, and checked variable-length argument.]

Analysis of Buffer Overrun Vulnerabilities, cont.
Dependences from data sources to mbstowcs arguments
[Figure: dependence diagram. Nodes: sources of external strings; sources of internal strings; strlength(buf); SIZE; bounds-check; r; and three calls mbstowcs(wide_buf, buf, ); labeled unchecked variable-length argument, unchecked fixed-length argument, and checked variable-length argument.]

Analysis of Buffer Overrun Vulnerabilities, cont.
Chop from sources to targets shows all possible information flows
[Figure: dependence diagram with sources and targets marked. Nodes: sources of external strings; sources of internal strings; strlength(buf); SIZE; bounds-check; r; and three calls mbstowcs(wide_buf, buf, ); labeled unchecked variable-length argument, unchecked fixed-length argument, and checked variable-length argument.]

Analysis of Buffer Overrun Vulnerabilities, cont.
Good news: find all flows
Bad news: false positive (flow through bounds-check)
[Figure: dependence diagram. Nodes: sources of external strings; sources of internal strings; strlength(buf); SIZE; bounds-check; r; and three calls mbstowcs(wide_buf, buf, ); labeled unchecked variable-length argument, unchecked fixed-length argument, and checked variable-length argument.]
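A chop can be sketched as the intersection of what is reachable forward from the sources and what is reachable backward from the targets. The example below is a minimal illustration on a plain directed graph. The node names and edges are assumptions loosely modeled on the diagram, and the traversal is ordinary graph reachability, not the context-sensitive chop that a dependence-graph tool would compute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical nodes, loosely modeled on the slide's diagram. */
enum { EXT_SRC, STRLEN_BUF, BOUNDS_CHECK, SIZE_CONST,
       UNCHECKED_CALL, FIXED_CALL, CHECKED_CALL, N };

/* edge[u][v] is true when v depends on u (assumed edges). */
static const bool edge[N][N] = {
    [EXT_SRC]      = { [STRLEN_BUF] = true, [BOUNDS_CHECK] = true },
    [STRLEN_BUF]   = { [UNCHECKED_CALL] = true },
    [BOUNDS_CHECK] = { [CHECKED_CALL] = true },
    [SIZE_CONST]   = { [FIXED_CALL] = true },
};

/* Mark every node reachable from start, following edges forward or backward. */
static void reach(int start, bool forward, bool seen[N])
{
    if (seen[start])
        return;
    seen[start] = true;
    for (int v = 0; v < N; v++) {
        bool connected = forward ? edge[start][v] : edge[v][start];
        if (connected)
            reach(v, forward, seen);
    }
}

/* out[v] is set when v lies on some path from source to target. */
void chop(int source, int target, bool out[N])
{
    assert(source >= 0 && source < N && target >= 0 && target < N);
    bool fwd[N] = { false };
    bool bwd[N] = { false };
    reach(source, true, fwd);    /* forward slice from the source  */
    reach(target, false, bwd);   /* backward slice from the target */
    for (int v = 0; v < N; v++)
        out[v] = fwd[v] && bwd[v];
}
```

On this graph, chop(EXT_SRC, CHECKED_CALL, out) marks the flow through BOUNDS_CHECK, which is the kind of false positive the slide mentions: the chop reports the flow even though it is guarded.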

Analysis of Buffer Overrun Vulnerabilities, cont.
Code Red source-code mock-up, showing chop-sources, chop-targets, and query-results.
[Screenshot callouts: bounds-check, chop-sources, chop-targets]

Analysis of Buffer Overrun Vulnerabilities, cont.
• Model checking queries
  • Can assert and check properties about flow paths
  • Counter-examples: reveal possible vulnerabilities
• Sample (false) assertion
  • Every path from a user data source to the length argument of mbstowcs goes through a bounds-check
• Sample counter-example
  • Path from data source to unchecked length argument of mbstowcs
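The sample assertion can be checked on a small graph by searching for a path from the source to the target that never enters a bounds-check node; any such path is a counter-example. The sketch below uses a plain depth-first search over assumed node names and edges. It stands in for the idea only and is not the mu-calculus model checker mentioned in the talk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical nodes and edges, loosely modeled on the slide's diagram. */
enum { EXT_SRC, STRLEN_BUF, BOUNDS_CHECK, UNCHECKED_CALL, CHECKED_CALL, N };

static const bool edge[N][N] = {
    [EXT_SRC]      = { [STRLEN_BUF] = true, [BOUNDS_CHECK] = true },
    [STRLEN_BUF]   = { [UNCHECKED_CALL] = true },
    [BOUNDS_CHECK] = { [CHECKED_CALL] = true },
};

static const bool is_check[N] = { [BOUNDS_CHECK] = true };

/* Depth-first search that refuses to pass through check nodes.
 * On success, path[0..*len-1] holds the nodes from source to target. */
static bool dfs(int u, int target, bool seen[N], int path[N], int *len)
{
    if (seen[u] || is_check[u])
        return false;
    seen[u] = true;
    path[(*len)++] = u;
    if (u == target)
        return true;
    for (int v = 0; v < N; v++)
        if (edge[u][v] && dfs(v, target, seen, path, len))
            return true;
    (*len)--;                    /* dead end: drop u from the path */
    return false;
}

/* Returns the length of a check-avoiding path (a counter-example to
 * "every path goes through a bounds-check"), or 0 if the assertion holds. */
int counter_example(int source, int target, int path[N])
{
    assert(source >= 0 && source < N && target >= 0 && target < N);
    bool seen[N] = { false };
    int len = 0;
    return dfs(source, target, seen, path, &len) ? len : 0;
}
```

For the unchecked call the search returns the path EXT_SRC, STRLEN_BUF, UNCHECKED_CALL; for the checked call it returns 0, because every path there passes through BOUNDS_CHECK.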

Analysis of Buffer Overrun Vulnerabilities, cont.
Good news: counter-example avoids bounds-check
[Figure: dependence diagram. Nodes: sources of external strings; sources of internal strings; strlength(buf); SIZE; bounds-check; r; and three calls mbstowcs(wide_buf, buf, ); labeled unchecked variable-length argument, unchecked fixed-length argument, and checked variable-length argument.]

Analysis of Buffer Overrun Vulnerabilities, cont.
Counter-example in query-results; Chop result in _______ background
[Screenshot callout: No bounds-check]

Analysis of Buffer Overrun Vulnerabilities, cont.
• Constraint satisfaction [Wagner, et al.]
  • Assert required constraints between destination buffer sizes and corresponding copy length arguments
  • Report all cases where constraints are not satisfied
• Use of CodeSurfer [future work]
  • Implement in industrial-strength framework
  • Reduce false positives reported
    • Context-sensitive constraint satisfaction
    • Better pointer analysis
  • Interactive tool for analysis of false positives

Context-Sensitive GMOD / GREF Analysis
• Accurate dependence analysis for reference parameters
  • Previously, a major source of imprecision
  • Now, context-sensitive analysis of non-local variable usage
  • Substantial improvement
• Relevant for buffer-overrun analysis
• Example
  • Instead of mbstowcs(char *dst, char *src, int length), consider
    assign(char *dst, char *src) { *dst = *src; }

Context-Sensitive GMOD / GREF Analysis
• When procedure P has formal parameter F of type T *, the flow-insensitive, context-insensitive points-to set of F is the union of the points-to sets of all corresponding actual parameters in calls to P (plus any other pointers assigned to F in P)
    assign(&a1, &a2);
    assign(&b1, &b2);
    assign(char *dst, char *src) { *dst = *src; }
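The union described on this slide can be worked through for the two call sites shown. The sketch below models points-to sets as bitmasks; the type CallSite and the function names are hypothetical and exist only for this illustration, and the model ignores pointers assigned to a formal inside the procedure.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per variable whose address is passed to assign(). */
enum { A1 = 1u << 0, A2 = 1u << 1, B1 = 1u << 2, B2 = 1u << 3 };

/* Points-to sets of the actual parameters at one call site (hypothetical type). */
typedef struct {
    unsigned dst;   /* what the first actual points to  */
    unsigned src;   /* what the second actual points to */
} CallSite;

static const CallSite calls[] = {
    { A1, A2 },     /* assign(&a1, &a2); */
    { B1, B2 },     /* assign(&b1, &b2); */
};

/* Context-insensitive points-to set of a formal: the union over all calls. */
static unsigned formal_points_to(const CallSite *sites, size_t n, int want_dst)
{
    assert(sites != NULL || n == 0);
    unsigned set = 0;
    for (size_t i = 0; i < n; i++)
        set |= want_dst ? sites[i].dst : sites[i].src;
    return set;
}

unsigned points_to_dst(void)
{
    return formal_points_to(calls, sizeof calls / sizeof calls[0], 1);
}

unsigned points_to_src(void)
{
    return formal_points_to(calls, sizeof calls / sizeof calls[0], 0);
}
```

Here dst may point to either a1 or b1 and src to either a2 or b2, regardless of which call is being analyzed.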

Context-Sensitive GMOD / GREF Analysis, cont.
• Additional (hidden) actual and formal parameters are generated to represent the variables modified or referenced via formal F (as well as variables modified or referenced via global pointer variables)
[Figure: call sites assign(&a1, &a2); and assign(&b1, &b2); and the procedure assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• The generated actual parameters are wired to the accessible defs and uses of the variables accessible via F.
• In general, there is more than one def for each actual-in, and more than one use for each actual-out.
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• Previously, the wiring was based on the points-to sets of the corresponding formal parameters. Thus, the additional edges (in blue) were also wired.
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• A backward slice from a use of variable b1 shows what influences its value.
• It is computed by following dependence edges backward.
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• A backward slice from a use of variable b1 shows what influences its value.
• It is computed by following dependence edges backward.
• Only feasible paths are followed, i.e., edges shown in gold are not followed. Good!
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• But the path to variable a2 would also be followed. Bad! Variable a2 has no influence on variable b1.
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]
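The claim that a2 has no influence on b1 can be observed directly by running the example. In the sketch below, assign is the procedure from the slides (given a void return type here), while b1_after_calls is a hypothetical driver added for illustration: it performs both calls and returns the final value of b1.

```c
#include <assert.h>

/* The procedure from the slides: copy one char through two pointers. */
static void assign(char *dst, char *src)
{
    *dst = *src;
}

/* Hypothetical driver: run both call sites and report the final b1. */
char b1_after_calls(char a2, char b2)
{
    char a1 = 0;
    char b1 = 0;
    assign(&a1, &a2);    /* call site 1: a1 = a2 */
    assign(&b1, &b2);    /* call site 2: b1 = b2 */
    assert(a1 == a2);    /* a2 flows only into a1 */
    return b1;
}
```

Changing a2 while holding b2 fixed never changes the result, so a precise backward slice from the use of b1 should contain the definition of b2 and not the definition of a2.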

Context-Sensitive GMOD / GREF Analysis, cont.
• … and other spurious paths would also be followed. Very bad!
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• This bad behavior has now been eliminated.
• We now distinguish between variables accessible because of actual parameters and variables accessible because of globals.
[Figure: dependence graph for the call sites assign(&a1, &a2); and assign(&b1, &b2); with defs a2=… and b2=…, uses …=a1 and …=b1, and assign(char *dst, char *src) { *dst = *src; } annotated with a2,b2 and a1,b1]

Context-Sensitive GMOD / GREF Analysis, cont.
• Big win in precision
• Big win in time and space
• But a new time and space problem [example not shown]
  • Conjecture: Greater precision makes previously identical sets (with shared representations) different (and therefore unshared)

Discrimination by Structure Field
• Previously, all fields participated in every operation on any field
  • e.g., predecessors of p->f were defs of every field of the struct pointed to by p
• Now, there is an option to discriminate on structure fields
  • e.g., predecessors of p->f are only defs of field f of structs pointed to by p
• But casts and unions must be taken into account
  • For portable analysis, cannot use explicit offsets
  • Two fields f1 and f2 in different struct types T1 and T2 are treated as having the same offset if the field sequences leading up to f1 and f2 have pair-wise compatible types
  • Explicit offsets could be used in the future for precise platform-dependent analysis
• New problem to be solved
  • Unless calls to malloc are immediately cast to their intended types, field discrimination is lost
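The compatible-prefix rule and the malloc caveat can both be illustrated in a few lines. The struct types T1 and T2 and the two functions below are hypothetical, chosen only for this sketch. The offset comparison reflects how mainstream compilers lay out structs that share a common initial sequence; it is shown as an observation about such platforms, whereas the analysis described on the slide reasons about type compatibility and not about concrete offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Two struct types whose leading field sequences (int, char *) have
 * pair-wise compatible types; only the last field differs. */
struct T1 { int tag; char *name; double weight; };
struct T2 { int tag; char *name; long   count;  };

/* Returns 1 when name sits at the same offset in both types. */
int name_offsets_agree(void)
{
    return offsetof(struct T1, name) == offsetof(struct T2, name);
}

/* With field discrimination, the only predecessor of the read of p->tag
 * is the write to p->tag; the write to p->name is irrelevant to it. */
int read_tag_after_name_write(int tag)
{
    /* The immediate cast tells a static analysis which struct type
     * (and therefore which fields) this allocation holds. */
    struct T1 *p = (struct T1 *)malloc(sizeof *p);
    assert(p != NULL);
    p->tag = tag;
    p->name = NULL;
    int result = p->tag;
    free(p);
    return result;
}
```

If the result of malloc were stored in a void * and cast only later, an analysis of the kind described would have no type to attach to the allocation at its creation point, which is the loss of discrimination the last bullet refers to.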

Transitions (Spin-off SBIR Research)
• Current SBIR Phase I projects to transition research to products
  • Malicious Code Detection in Firmware (Air Force)
    • CodeSurfer for x86; use to detect malicious code
  • Model Checking of Hierarchical Graph Structures (DARPA / ITO)
    • CodeSurfer model checking plug-in for QA
  • Inlined Reference Monitors for Java Bytecode (NIST)
    • Use of dependence-graph technology for insertion of efficient IRMs
  • Model Checking of UML designs (Navy / Aegis)
    • Model-checking to assure properties in UML Rose/RealTime designs
  • Dependence Graphs for Dynamic Internet Technologies (NSF)
    • CodeSurfer for Java; decision support for test coverage
  • Static Analysis for AOP (DARPA / PCES)
    • Aspect C (separate take from Gregor’s)
  • New Technique for Efficient Compression of Information (BMDO) *
    • BDD variant, potential for double-exponential decision tree compression
* [unrelated to DARPA research]

Transitions, cont.
• Recent Product Release
  • CodeSurfer 1.5
  • Open APIs to C program representation and analysis operations
• Paper
  • Software Inspection using CodeSurfer, WISE’01 Workshop on Inspection in Software Engineering, July 23rd, 2001, Paris.

Workshop Topics
• Integration Opportunities
  • Projects exploring code rewriting or reorganization, or developing vulnerability scanners
    • Clients of our open APIs to program representation and analyses
  • Projects relying on a system model
    • Potential to extract the model automatically from the code
• Validation
  • Scalability
    • Performance on benchmarks
  • Vulnerability detection
    • False positive rate w.r.t. “truth”, e.g., known buffer overrun attacks

Future Work
• Core Technology
  • Pointer analysis
  • Dependence analysis
    • concurrency, asynchronous control, reused storage, types, array strides
  • Model checker (model reduction and abstraction relaxation)
  • Constraint satisfaction: sets and numeric ranges
  • Summary information for libraries
  • Rewriting support
  • Performance
• Information Assurance Workbench
  • Scanners for buffer overruns, race conditions, covert channels
