
Shimin Chen LBA Reading Group

Merlin: Specification Inference for Explicit Information Flow Problems. Livshits, Nori, Rajamani (MSR), Banerjee (IMDEA), PLDI’09.


Presentation Transcript


  1. Merlin: Specification Inference for Explicit Information Flow Problems. Livshits, Nori, Rajamani (MSR), Banerjee (IMDEA), PLDI’09. Shimin Chen, LBA Reading Group

  2. Explicit Information Flow in a Program
     • Given a program, we can construct a propagation graph:
       • Node: method
       • Edge: explicit information flow between methods (through a method call parameter, a return value, or an indirect update through a pointer)
     • (Information flow) specification: node labels
       • Regular (default): propagates taint to its successors
       • Source: tainted initially
       • Sink: if tainted, an error must be reported
       • Sanitizer: cleanses/untaints/endorses information
     • Given a propagation graph and a specification, many existing tools can statically check whether the propagation graph violates the specification.
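The checking step described on this slide can be sketched as a reachability pass over the propagation graph. This is only an illustration, not the algorithm of any particular tool: the function name `find_violations`, the edge-list encoding, and the method names `Concat` and `HtmlEncode` are assumptions made for the example.

```python
from collections import deque


def find_violations(edges, spec):
    """Return the sinks that taint can reach without passing a sanitizer.

    edges: list of (caller, callee) pairs, one per information-flow edge.
    spec:  dict mapping a node to "source", "sink", or "sanitizer";
           nodes missing from the dict are treated as "regular".
    """
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)

    # Sources are tainted initially.
    queue = deque(node for node, label in spec.items() if label == "source")
    tainted = set(queue)

    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if spec.get(nxt, "regular") == "sanitizer":
                continue  # a sanitizer cleanses taint, so stop propagating here
            if nxt not in tainted:
                tainted.add(nxt)
                queue.append(nxt)

    return sorted(n for n in tainted if spec.get(n, "regular") == "sink")


# Hypothetical graph: one flow is sanitized, the other is not.
edges = [
    ("GetParameter", "Concat"),
    ("Concat", "WriteLine"),
    ("GetHeader", "HtmlEncode"),
    ("HtmlEncode", "WriteLine"),
]
spec = {
    "GetParameter": "source",
    "GetHeader": "source",
    "HtmlEncode": "sanitizer",
    "WriteLine": "sink",
}
violations = find_violations(edges, spec)
```

With this input, `WriteLine` is reported because the flow through `Concat` never meets a sanitizer. If `HtmlEncode` were missing from `spec`, the second flow would be reported as well.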

  3. Example
     • GetParameter and GetHeader are sources
     • WriteLine is a sink: it writes to the web page
     • S1 or S2 may contain malicious scripts that run in web browsers

  4. Problem & Solution
     • Problem: user-provided specifications are incomplete
       • False positives: incomplete information about sanitizers
       • False negatives: incomplete information about sources and sinks
     • Solution: Merlin
       • Automatically infers information flow specifications for programs
       • Intuition: most paths in a propagation graph are secure (a path from a source to a sink passes through a sanitizer)
     • Approach (idea):
       • One random variable per node: the node is a source, sink, or sanitizer w/ prob …
       • Compute probabilistic constraints for paths
       • Solve these constraints

  5. Merlin Architecture (figure: architecture diagram with two inputs and one output)

  6. Construct Propagation Graph
     • Inter-procedural data flow:
       • Limited by the accuracy of pointer analysis

  7. Assumptions
     • Most paths in the propagation graph are secure
     • The number of sanitizers is small relative to the number of regular nodes
     • Focus: string-related vulnerabilities

  8. Potential Sources, Sinks and Sanitizers
     • Potential sources: methods that produce strings as output
     • Potential sanitizers: methods that take a string as input and produce a string as output
     • Potential sinks: methods that take a string as input, but do not produce a string as output
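The three rules on this slide depend only on a method's signature, so they can be sketched as a small classifier. The function name `candidate_roles` and the representation of types as plain strings are assumptions for illustration; a real analysis would read signatures from the program being analyzed.

```python
def candidate_roles(param_types, return_type):
    """Return the set of roles a method may play, judged by its signature alone.

    param_types: list of parameter type names, e.g. ["str", "int"].
    return_type: the return type name, e.g. "str".
    """
    takes_string = "str" in param_types
    returns_string = return_type == "str"

    roles = set()
    if returns_string:
        roles.add("source")       # produces a string as output
    if takes_string and returns_string:
        roles.add("sanitizer")    # string in, string out
    if takes_string and not returns_string:
        roles.add("sink")         # string in, no string out
    return roles


getter = candidate_roles([], "str")
transformer = candidate_roles(["str"], "str")
writer = candidate_roles(["str"], "None")
```

Note that under these rules a string-to-string method is a candidate for both source and sanitizer; the probabilistic constraints are what later separate the two.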

  9. Constraints
     • Path safety: most paths from a source to a sink pass through at least one sanitizer. But the number of paths is exponential.
     • Triple safety: O(N^3), where N is the number of nodes.
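One way to read the triple-safety bound is that constraints are stated over (source, sanitizer, sink) node triples rather than over individual paths, which caps their number at O(N^3). The sketch below enumerates such triples using reachability. The slide does not spell out this procedure, so the reachability criterion and the names `reachable` and `triples` are assumptions.

```python
def reachable(successors, start):
    """Return every node reachable from start by one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def triples(edges, sources, sanitizers, sinks):
    """List (source, sanitizer, sink) triples where the middle node lies
    between the other two in the propagation graph."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    reach = {n: reachable(successors, n) for n in set(sources) | set(sanitizers)}

    found = []
    for s in sorted(sources):
        for m in sorted(sanitizers):
            if m == s or m not in reach[s]:
                continue
            for t in sorted(sinks):
                if t not in (s, m) and t in reach[m]:
                    found.append((s, m, t))
    return found


# Hypothetical graph: B sits between A and C; D is a dead end.
example = triples(
    edges=[("A", "B"), ("B", "C"), ("A", "D")],
    sources={"A"},
    sanitizers={"B", "D"},
    sinks={"C"},
)
```

Each loop ranges over at most N nodes, so the number of triples never exceeds N^3 regardless of how many paths the graph contains.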

  10. Constraints cont’d
     • Pairwise minimization: two sanitizers are unlikely to appear on the same path.
     • Sanitizer prioritization: favor nodes with higher s(m), where
       s(m) = (total source-to-sink paths passing m) / (total paths passing m)
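The ratio s(m) can be computed directly on a small acyclic graph by enumerating paths. The slide does not say which paths count toward the denominator, so this sketch assumes a "path" is a maximal path from a node with no predecessors to a node with no successors. Explicit enumeration is exponential in general and is used here only to make the definition concrete. The names `maximal_paths` and `sanitizer_score`, and all method names in the example, are hypothetical.

```python
def maximal_paths(edges):
    """Yield every root-to-leaf path of an acyclic graph as a tuple of nodes."""
    successors, nodes, has_pred = {}, set(), set()
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        nodes.update((src, dst))
        has_pred.add(dst)

    def walk(path):
        nexts = successors.get(path[-1], [])
        if not nexts:
            yield tuple(path)  # reached a leaf
        for nxt in nexts:
            yield from walk(path + [nxt])

    for root in sorted(nodes - has_pred):
        yield from walk([root])


def sanitizer_score(edges, sources, sinks, m):
    """s(m): fraction of the paths through m that run from a source to a sink."""
    through = [p for p in maximal_paths(edges) if m in p]
    if not through:
        return 0.0
    source_to_sink = [p for p in through if p[0] in sources and p[-1] in sinks]
    return len(source_to_sink) / len(through)


edges = [
    ("ReadInput", "Clean"),
    ("Config", "Clean"),
    ("Clean", "WriteLine"),
    ("Clean", "Length"),
]
score = sanitizer_score(edges, {"ReadInput"}, {"WriteLine"}, "Clean")
```

Here four maximal paths pass through `Clean` and only one of them runs from a source to a sink, so the score is 1/4.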

  11. Constraints cont’d
     • Source wrapper avoidance: two sources are unlikely to appear on the same path.
     • Sink wrapper avoidance: two sinks are unlikely to appear on the same path.

  12. Ideas for Solving the System
     • Each node is assigned a random variable with two possible values, {true, false}
     • For each constraint, generate a probability constraint
     • For example, if nodes A and B are both potential sources but lie on the same path, then:
       • Xa: true if A is a source, false otherwise
       • Xb: true if B is a source, false otherwise
       • Prob(Xa AND Xb = true) = low1
       • low1 is an input constant to the algorithm
     • Solve the set of constraints using a factor graph (a probabilistic graphical model)
     • More details are in the paper
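The Xa/Xb example can be made concrete with a tiny factor graph solved by brute force. This is a sketch under stated assumptions, not the solver used in the paper: it enumerates every assignment instead of running factor-graph inference, it encodes the constraint as a soft factor weighted `LOW1` versus `1 - LOW1`, and the value of `LOW1` and the extra evidence factor `a_is_likely_source` are invented for illustration.

```python
from itertools import product


def marginals(variables, factors):
    """Return P(variable = True) for each variable by exhaustive enumeration.

    factors: functions mapping an assignment dict to a non-negative weight.
    """
    totals = {v: 0.0 for v in variables}
    normalizer = 0.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = 1.0
        for factor in factors:
            weight *= factor(assignment)
        normalizer += weight
        for v in variables:
            if assignment[v]:
                totals[v] += weight
    return {v: totals[v] / normalizer for v in variables}


LOW1 = 0.1  # assumed value; the slide only says low1 is an input constant


def not_both_sources(a):
    """Soft constraint: A and B are unlikely to both be sources."""
    return LOW1 if (a["Xa"] and a["Xb"]) else 1.0 - LOW1


def a_is_likely_source(a):
    """Hypothetical evidence that A is a source (e.g. from a seed specification)."""
    return 0.8 if a["Xa"] else 0.2


result = marginals(["Xa", "Xb"], [not_both_sources, a_is_likely_source])
```

Because the two variables share a factor, evidence that A is a source pulls the marginal for B below one half. Enumeration costs 2^n for n variables, which is why larger systems need proper factor-graph inference.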
