Towards static flow based declassification for legacy and untrusted programs
Download
1 / 19

Towards Static Flow-based Declassification for Legacy and Untrusted Programs - PowerPoint PPT Presentation


  • 92 Views
  • Uploaded on

Towards Static Flow-based Declassification for Legacy and Untrusted Programs. Bruno P. S. Rocha Sruthi Bandhakavi Jerry den Hartog William H. Winsborough Sandro Etalle To appear in IEEE Symposium on Security and Privacy 2010 (Oakland’10). Introduction.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Towards Static Flow-based Declassification for Legacy and Untrusted Programs' - snowy


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Towards static flow based declassification for legacy and untrusted programs

Towards Static Flow-based Declassification for Legacy and Untrusted Programs

Bruno P. S. Rocha

SruthiBandhakavi

Jerry den Hartog

William H. Winsborough

Sandro Etalle

To appear in IEEE Symposium on Security and Privacy 2010(Oakland’10)


Introduction
Introduction

  • Language-based information flow aims to analyze programs with respect to flow of information between channels of different security levels

  • Non-interference is a formal property for specifying valid flows (Goguen & Meseguer 1982)


Non interference
Non-interference

Private Input

Public Input

Program

Public Output

Change in the public input can change the output

Change in the private input should not change the output


Declassification
Declassification

  • Non-interference is not enough for most practical applications

  • In many occasions, it is necessary to downgrade the security level of specific data i.e., to declassify that data

  • Classic examples:

    • Average salary

    • Password verification

    • Encryption


Security typed languages
Security-typed languages

  • Common approach (Jif)

  • Variables have security types

  • Compilation-time analysis rejects programs which contain insecure flows

  • Declassification is usually associated with specific points in the code, and done explicitly via some declass command

    • Declassification policy is code-dependant

  • Problem: implies that programmer is trusted, and understands security labeling of variables


Example jif
Example: Jif

Static labels + declassification:

int{Alice:;Alice←*} b;int{Alice:Bob;Alice←*} y = 0;if (b) {// pc is at level {Alice:;Alice←*} at this point.declassify ({Alice:;Alice←*} to {y}) {// at this point, pc has been declassified to the label of the local // variable y (that is, {Alice:Bob;Alice←*} ) permitting the // assignment to y y = 1;

}}

Dynamic labels:

int{*lbl} m{*lbl}(label{*lbl} lbl, principal{*lbl} p, int{Alice:p}i)where {Alice:Bob} <= lbl, Bob actsfor p

{

// since Bob actsfor p, {Alice:p} <= {Alice:Bob},// and since {Alice:Bob} <= lbl, the label of the argument I// is <= {*lbl}. Therefore, we can return i+1.return i+1;

}


Non security typed approaches
Non security-typed approaches

  • Dataflow analysis are techniqueswhich do not rely on annotated code

    • Do not provide declassification

    • Have issues analyzing control-flow dependencies

  • Taint analysisalsoworks on unannotated code, but suffers from the same problems of dataflow analysis + is often too restrictive


Our contribution
Our contribution

  • We aim for a static flow-based analysis that provides 3 key points:

    • Analysis of programming code without any security-based annotations (therefore, untrusted code)

    • Support for user-defined declassification policies

    • Having code and policy separated and independent from each other

  • Individual solutions do exist

    • The true challenge lies in a combined solution

  • A first-step in a new direction for information flow analysis


The core analysis process
The core analysis process

  • Identify I/O operations

  • Extract expressions on inputs

  • Check if expressions are recognized by policy

Program

Policy

α

β

Match?

γ

δ

expression recognizer


Policy representation
Policy Representation

α

α

0

Sets of expressions are represented by graphs

length

Nodes labels with input channels, constant values or wildcard characters

ϕ2

Certain nodes are marked as final nodes

*1

*2

*1

*3

Loop expressions can also be expressed

Represents the expression set:

{0, (0 + α1), (0 + α1 + α2), …..}

ϕ1

A separated restriction (not discussed here) guarantees that indexes on α are unique

add2

add1


Code analysis step 1 preprocessing
Code analysis – Step 1: Preprocessing

Operators are converted to method calls, nested expressions are converted to a series of assignments to temp variables

sum := 0; i := 0;

len := length( α );

while (i <= len) do

val := α ;

sum := sum + val;

i := i + 1;

avg := sum / len ;

γ := avg;

sum := 0; i := 0;

len := length( α );

c : = leq(i,len);

while (c) do

val := α ;

sum := add(sum,val);

i := add(i,1);

c := leq(i,len);

avg := div(sum,len);

γ := avg;


Step 2 static single assignment ssa format
Step 2: Static Single Assignment (SSA) Format

sum1 := 0; i1 := 0;

len1 := length( α );

c1 := leq(i1, len1);

while ( c3 := ϕc3(c1, c2);

sum3 := ϕc3(sum1, sum2);

i3 := ϕc3(i1, i2);

c3) do

val1 := α ;

sum2 := add(sum3, val1);

i2 := add(i3, 1);

c2 := leq(i2, len1);

avg1 := div(sum3, len1);

γ := avg1;

Every variable is defined only once & Phi functions are declared to keep track of assignments in branching statements

sum := 0; i := 0;

len := length( α );

c := leq(i, len) ;

while (c) do

val := α ;

sum := add(sum, val);

i := (i, 1);

c := leq(I,len);

avg := div(sum, len) ;

γ := avg;


Step 3 program expression graph
Step 3: Program Expression Graph

γ

sum1 := 0; i1 := 0;

len1 := length( α );

c1 := leq(i1, len1);

while ( c3 := ϕc3(c1, c2);

sum3 := ϕc3(sum1, sum2);

i3 := ϕc3(i1, i2); c3) do

val1 := α ;

sum2 := add(sum3, val1);

i2 := add(i3, 1);

c2 := leq(i2, len1);

avg1 := div(sum3, len1);

γ := avg1;

avg1

α

len1

length

div2

ϕ2

div1

sum2

sum3

val1

c3

ctrl

add1

add2

ϕ1

0

sum1

Expression graph for γ


Step 4 policy matching
Step 4: Policy Matching

γ

α

α

α

0

length

length

div2

avg1

len1

ϕ2

ϕ2

div1

*3

val1

c3

*1

*2

*1

sum2

ctrl

ϕ1

add1

add2

Control dependency needs to be green

ϕ1

sum3

add2

add1

0

Expression graph for γ

(control dep. omitted)

sum1

Policy graphs


Revisiting the security property
Revisiting the security property

Public Input

Private Input

Program

Declassification policies split the domain into different parts each of which are indistinguishable

Public Output


Policy controlled release pcr
Policy Controlled Release (PCR)

Private Input

Revealed Knowledge(R) : Given an

environment, π, and a declassification policy, d,Ris the set of all environments for which the value of the

declassifiable expressions is the same .

R

Observed Knowledge(K) : Given an environment, π, and declassification

policy, d, K is the set of all environments which produce the same sequence of visible outputs as π. The observer cannot distinguish between environments in K.

Policy Controlled Release Theorem: If the program satisfies PCR, then the

knowledge obtained from observing the program (K) is bound by the

information released by the declassification policy (R). That is, .


Did i mention towards something
Did I mention “towards” something?

  • Assumptions that can be relaxed:

    • We use a simple toy language, that lacks:

      • More control-flow commands such as break, case, etc. (easy)

      • User-defined functions, object orientation (currently working on it)

      • Arrays and pointers (SSA solution for these helps a lot)

    • Matching mechanism is currently defined mathematically: algorithms are under way

    • Some operational issues still untreated:

      • Enforcing how many times a given loop runs

      • Algebraic equivalence of expressions denoted by the graphs i.e., add(x,y) = add(y,x)


Thank you
Thank you!

Questions?


The core idea
The core idea

  • Identify input and output operations in the code

  • Track all the possible expressions on inputsthat each variable can hold

  • Declassification policies work as expression recognizers, determining which expressions can be declassified

  • On outputs, analysis checks whether every possible expression sent to that channel is safe to be output (i.e., recognized by the policy)