
"Efficient Software-based Fault Isolation"

by R. Wahbe, S. Lucco, T. E. Anderson, and S. L. Graham.

Presenter: Tom Burkleaux


What is the problem?

  • With closely cooperating software modules, how do we protect against distrusted code?

  • What is distrusted code?

  • Code that might corrupt memory of other cooperating modules

  • Code that is not adequately tested

  • Perhaps written by third-party persons


Cooperating Modules

  • Complex systems are often broken down into separate (logical) components

  • This supports the following goals:

  • Independent development of individual components

  • Upgrade and enhance existing components

  • “Extensibility” – allow other programmers to add functionality


Examples: Cooperating Modules

  • Micro-kernel operating systems

    Elements of the OS are moved to user space.

  • Postgres database manager

    Extensible Type System

  • Microsoft’s Object Linking & Embedding (OLE)

    Extensibility code supported by OS

    Link together independently developed software modules


Structuring Cooperating Modules, 1

  • How can we structure cooperating modules?

  • There are two basic choices, each with a distinct advantage:

  • Share the same address space

    Low communication cost between modules

  • Keep modules in separate address spaces

    Separate protection domains


Structuring Cooperating Modules, 2

  • … and each has a disadvantage:

  • Share the same address space

    Distrusted code can cause hard-to-find bugs within a system

  • Keep modules in separate address spaces

    Cross-Protection Domain calls are expensive (RPC). Overall application performance suffers.


Structuring Cooperating Modules, 3

[Figure: the two structuring choices side by side. Labels: Shared Memory (shared address space), RPC (protection domains).]


Structuring Cooperating Modules, 4

[Figure: the same diagram with distrusted code added. Labels: Shared Memory, Yes, ?, Distrusted Code, RPC, Slow!, Protection Domains, Shared Address.]


Trends in structuring?

  • In operating systems, which method of structuring cooperating modules is more prevalent?

  • BSD System

  • Mach

  • Mac OS X

  • Linux

    (I think?) more a tendency to share address space for performance reasons


Proposed Solution

If we want fault isolation, the authors offer a tradeoff:

Fault Isolation with –

“substantially faster communication between fault domains, at a cost of slightly increased execution time for distrusted modules”


What Does Solution Look Like?

  • We want the speed of shared memory

  • But we want to prevent distrusted code from corrupting memory of other modules

  • “Fault Domains”

[Figure: distrusted code alongside shared memory. Labels: Shared Memory, NO, Yes, Distrusted Code.]


Overview of Techniques

  • The authors suggest two techniques for software-based fault isolation:

  • Segment Matching

  • Sandboxing

  • And two points at which they can be applied:

  • Compile Time

  • Object Linking


“Fault Domain” -- Definition

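  • A fault domain consists of a code segment and a data segment

  • All addresses within a segment share a unique pattern of upper bits, the “segment identifier”

[Figure: a fault domain made up of Segment (code) and Segment (data), with address ranges labeled 990---, 991---, 992--- and 993---.]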
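
As a rough illustration of the definition above, the segment identifier can be read off an address by shifting away the low-order bits. The sketch below is in Python rather than machine code. The 12-bit offset (`SEGMENT_SHIFT`), the helper names and the example addresses are assumptions chosen to mirror the 990---/991--- labels in the figure; none of them come from the paper.

```python
# Assumption: the low 12 bits (three hex digits) are the offset inside a segment,
# so an address such as 0x990ABC belongs to the segment with identifier 0x990.
SEGMENT_SHIFT = 12


def segment_id(address: int) -> int:
    """Return the upper bits of an address, i.e. its segment identifier."""
    return address >> SEGMENT_SHIFT


def same_segment(a: int, b: int) -> bool:
    """Two addresses lie in the same segment when their upper bits agree."""
    return segment_id(a) == segment_id(b)


# Hypothetical fault domain: one code segment and one data segment.
CODE_SEGMENT = 0x990
DATA_SEGMENT = 0x991

print(hex(segment_id(0x990ABC)))        # identifier of an address in the code segment
print(same_segment(0x991000, 0x991FFF))  # both addresses are in the data segment
```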


Examining Binary Code

  • For distrusted modules:

  • What about modifying the binary so we add a check on all loads and stores? “binary patching”

  • Assume this could be done at load-time

  • Addresses are used very frequently, and this method would add extra instructions for each address reference

  • Many tools are based on identifying compiler-specific idioms to distinguish between code and data


Software-Enforced Fault Isolation: Segment Matching

  • “Unsafe instruction” – an instruction that jumps to or stores to an address that can’t be statically verified to be within the correct segment

  • Jumps through a register are an example

  • The compiler can add code to check these instructions

  • On typical RISC architectures, this takes 4 instructions

  • Requires dedicated registers, to prevent the checks from being bypassed


Segment Matching, 2

Pseudo code example.

Without dedicated registers, code could jump straight to the last instruction of the sequence, bypassing the check
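
The check sequence might be sketched as follows. This is a Python stand-in for the handful of machine instructions a compiler would insert, not the authors' code. The names `dedicated_reg`, `scratch_reg`, `segment_reg` and `shift_reg` are illustrative, the 12-bit shift is an assumption, and a dictionary plays the role of memory.

```python
SEGMENT_SHIFT = 12  # assumption: low 12 bits are the offset within a segment


class SegmentFault(Exception):
    """Raised in place of the hardware trap when a check fails."""


def checked_store(memory: dict, target: int, value: int,
                  segment_reg: int, shift_reg: int = SEGMENT_SHIFT) -> None:
    """Store `value` at `target` only if `target` is in the permitted segment."""
    dedicated_reg = target                    # 1. move the target into a dedicated register
    scratch_reg = dedicated_reg >> shift_reg  # 2. extract its segment identifier
    if scratch_reg != segment_reg:            # 3. compare with the permitted identifier
        # 4. trap: the faulting address is known, so the fault can be pinpointed
        raise SegmentFault(f"store to {dedicated_reg:#x} is outside segment {segment_reg:#x}")
    memory[dedicated_reg] = value             # the store itself uses the dedicated register
```

Because the final store reads its address only from the dedicated register, and distrusted code never writes that register outside this sequence, jumping directly to the store cannot reach an unchecked address.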


Segment Matching, 3

With segment matching, we can pinpoint the source of the fault

How does this compare with hardware-based memory protection?

[Figure: distrusted code alongside shared memory, with an illegal access trapped. Labels: Shared Memory, Yes, Distrusted Code, Trap.]


Segment Matching, 4

What about the loss of registers?

  • We need four: one for addresses in the data segment, one for addresses in the code segment, one for the segment shift amount, and one for the segment identifier

  • The authors rely on most modern architectures having at least 32 registers


Software-Enforced Fault Isolation: Address Sandboxing

The idea is that we can reduce run-time overhead by giving up information about the source of the fault.

Sandboxing:

Before each unsafe instruction, we simply insert code that sets the upper bits of the target address to the correct segment identifier
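
A minimal sketch of the idea, again in Python rather than machine code: one AND clears the upper bits and one OR forces in the segment identifier. The 12-bit offset width and the function names are assumptions made for illustration.

```python
SEGMENT_SHIFT = 12                       # assumption: low 12 bits are the offset
OFFSET_MASK = (1 << SEGMENT_SHIFT) - 1   # keeps only the offset bits


def sandbox(target: int, segment_id: int) -> int:
    """Force `target` into the given segment without checking it first."""
    dedicated_reg = target & OFFSET_MASK            # 1. clear the upper (segment) bits
    dedicated_reg |= segment_id << SEGMENT_SHIFT    # 2. set them to the segment identifier
    return dedicated_reg


def sandboxed_store(memory: dict, target: int, value: int, segment_id: int) -> None:
    """Store through the sandboxed address; no trap is ever raised."""
    memory[sandbox(target, segment_id)] = value
```

An address already inside the segment passes through unchanged. An address outside it is silently redirected to the location with the same offset inside the segment, which is why no information about the original fault survives.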


Sandboxing, 2

  • There is no trap

  • Only two extra instructions

  • Recall that segment matching has four extra instructions


Sandboxing, 3

Any address access outside of the module’s segment is prevented

But a faulty access may instead hit, and potentially corrupt, an incorrect address within the module’s own segment

Execution continues, unaware of error!

[Figure: distrusted code confined to its sandbox. Labels: Shared Memory, ?, Yes, Distrusted Code, The sandbox.]


Optimizations

  • Guard zones are one example of an optimization that can be handled by a compiler.

  • Avoid address arithmetic

  • Register + offset instructions

  • We sandbox only the register and handle the offset by creating guard zones (unmapped pages on either side of the segment)
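
The reasoning behind guard zones can be sketched as below. If the register is already sandboxed and the offset is bounded by what the instruction can encode, then register + offset lands either inside the segment or in a guard zone, where the access faults. The segment size, the offset bound `MAX_OFFSET` and the function name are all assumptions picked for illustration.

```python
SEGMENT_SHIFT = 16                  # assumption: 64 KB segments
SEGMENT_SIZE = 1 << SEGMENT_SHIFT
MAX_OFFSET = 0x100                  # assumption: largest |offset| an instruction can encode
GUARD_ZONE = MAX_OFFSET             # unmapped bytes on each side of the segment


def classify_access(segment_id: int, sandboxed_reg: int, offset: int) -> str:
    """Say where `sandboxed_reg + offset` lands relative to the segment."""
    base = segment_id << SEGMENT_SHIFT
    end = base + SEGMENT_SIZE
    if not base <= sandboxed_reg < end:
        raise ValueError("register must already be sandboxed")
    if abs(offset) > MAX_OFFSET:
        raise ValueError("offset exceeds what the instruction can encode")
    effective = sandboxed_reg + offset
    if base <= effective < end:
        return "in segment"
    if base - GUARD_ZONE <= effective < end + GUARD_ZONE:
        return "guard zone"          # unmapped memory, so the access faults
    return "escaped"                 # unreachable as long as GUARD_ZONE >= MAX_OFFSET
```

The saving is that the offset addition no longer has to be done in a separate instruction before sandboxing; only the base register is sandboxed.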


Process Resources

  • We need to prevent distrusted modules from corrupting resources allocated on a per-address-space basis.

  • One idea: make OS aware of fault domains

  • Authors choose to require distrusted modules to access resources through cross-fault-domain RPC


Implementation

  • Authors identify two strategies for implementation

  • 1. Have a compiler emit encapsulation code and have a verifier confirm object code at load time.

  • 2. Modifying object code at load time. “binary patching”

    They went with option 1. The problem with modifying object code is making the modified code use only a subset of the registers.


Low Latency Cross Fault RPC

  • Because distrusted modules are isolated, we need something like LRPC.

  • Efficient software isolation was first part of solution

  • The second part of solution is fast communication across fault domains


Cross Fault RPC, 2

  • Constraints of running distrusted code:

  • Distrusted code can’t directly call a function outside its segment, or return from an outside call via an address on the stack

  • When code is running within a distrusted module it has its own execution context.


Cross Fault RPC, 3

  • Like LRPC, the solution is done through stubs

  • For each pair of fault domains, customized call and return stubs are created for each exported procedure

  • Stubs run unprotected, outside both domains

  • They are responsible for copying arguments

  • And managing machine state


Cross Fault RPC, 4

Args are copied through a shared buffer

Jump Table: each entry is a control transfer instruction to a legal entry point outside the domain


Cross Fault RPC, 5

The stubs and the jump table are added to the code.

The jump table is added to the code segment, so distrusted code can’t modify it.
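
The stub and jump-table arrangement can be loosely modelled as below. This is only an analogy in Python: a read-only mapping stands in for a table living in the write-protected code segment, a list stands in for the shared argument buffer, and saving and restoring machine state is omitted. `exported_add`, `make_call_stub` and `cross_domain_call` are hypothetical names.

```python
from types import MappingProxyType

shared_buffer = []  # stands in for the buffer shared between the two domains


def exported_add(a: int, b: int) -> int:
    """Hypothetical procedure exported by the callee domain."""
    return a + b


def make_call_stub(procedure):
    """Build a stub that copies arguments and transfers control to `procedure`."""
    def stub(*args):
        shared_buffer.clear()
        shared_buffer.extend(args)          # copy the arguments through the shared buffer
        return procedure(*shared_buffer)    # enter the callee at a legal entry point
    return stub


# Read-only table of legal entry points; callers can use it but not modify it.
JUMP_TABLE = MappingProxyType({0: make_call_stub(exported_add)})


def cross_domain_call(entry: int, *args):
    """The only way for the caller to leave its domain: via a jump-table entry."""
    return JUMP_TABLE[entry](*args)
```

A call such as `cross_domain_call(0, 2, 3)` goes through the stub, while an attempt to assign to `JUMP_TABLE[0]` raises `TypeError`, mirroring the point that distrusted code cannot redirect the table.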


Evaluation

  • For evaluation, the authors looked at three questions:

  • What is overhead for software encapsulation?

  • How fast is XFD-RPC?

  • What effect does this solution have on end-user applications?


Evaluation – Software Encapsulation, 1

The authors developed an analytical model to predict cost. Expected overhead is:

\[
\text{overhead} = \frac{(\text{s-instructions} - \text{interlocks}) \,/\, \text{cycles-per-second}}{\text{original-execution-time-seconds}}
\]

s-instructions = sandbox instructions

interlocks = saved floating point interlocks
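
To make the model concrete, here is a small worked example. All input numbers are invented for illustration and are not measurements from the paper. The model implicitly charges one cycle for each added instruction that is not hidden by a saved interlock.

```python
def predicted_overhead(s_instructions: float, interlocks: float,
                       cycles_per_second: float, original_seconds: float) -> float:
    """Predicted fractional slowdown from the added sandbox instructions."""
    added_seconds = (s_instructions - interlocks) / cycles_per_second
    return added_seconds / original_seconds


# Hypothetical run: 50 million sandbox instructions, 10 million of them hidden by
# floating point interlocks, on a 40 MHz machine, for a program that took 20 s.
overhead = predicted_overhead(50_000_000, 10_000_000, 40_000_000, 20.0)
print(f"{overhead:.1%}")  # 40M extra cycles is 1 s, and 1 s over 20 s is 5.0%
```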


Evaluation - Software Encapsulation, 2

The purpose of the model is to help identify 2nd order effects.

To get data, they ran various benchmarks, running the benchmark code as “untrusted” modules.

They found that their model predicted the “average” overhead very well. But individual benchmarks showed NEGATIVE overhead!



Evaluation - Software Encapsulation, 4

  • Explaining results

  • Anomalies – 2nd order effects? Conjecture: “instruction cache mapping conflicts”

  • Programs with more floating-point operations exhibited less overhead (2.5% vs 5.6%)

  • These were compute-heavy benchmarks

  • They expect I/O-bound programs to have less overhead

    Overall, overhead is not much!


Evaluation – Cross-Fault RPC, 1

  • How to measure XFD-RPC? Their mechanism spends most of its time saving and restoring registers.

  • Table 2 (following) – shows performance for NULL cross fault domain RPC. And this is compared to a C procedure call and Pipes.

  • Their call is about one order of magnitude slower than a C procedure call

  • Other optimized RPCs are, at best, two orders of magnitude slower than a C procedure call.



Evaluation – Cross-Fault RPC, 3

Table 3 measures how their system works when applied to Postgres, using the Sequoia 2000 benchmark.

Postgres has an extensible type system, which is a recognized safety problem.

They want to compare their system with Postgres’ built-in “untrusted function manager” and traditional hardware protection domains.


Analysis

  • The authors’ Postgres example shows savings over other methods.

  • The formula for savings in general is:

  • \( \text{Savings} = (1 - r)\,t_c - h\,t_d \)

  • \( t_c \): time spent crossing fault domains

  • \( t_d \): time spent in distrusted code

  • \( h \): overhead for encapsulation

  • \( r \): ratio of their crossing time to hardware RPC
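
A quick numeric reading of the formula, with made-up inputs, shows how the two terms trade off. Times are expressed as fractions of total execution time; the values of `r`, `h`, `t_c` and `t_d` below are assumptions, not figures from the paper.

```python
def savings(r: float, t_c: float, h: float, t_d: float) -> float:
    """Net time saved: cheaper crossings minus the encapsulation overhead."""
    return (1 - r) * t_c - h * t_d


# Hypothetical application: 30% of its time crossing domains, 70% in distrusted
# code, crossings half as expensive as hardware RPC, 5% encapsulation overhead.
print(round(savings(r=0.5, t_c=0.30, h=0.05, t_d=0.70), 3))  # 0.15 - 0.035
```

A positive result means software fault isolation wins; a negative one means the encapsulation overhead outweighs the cheaper crossings.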


Analysis, 2

  • The savings formula can be graphed to illustrate the breakeven curve.

  • In figures following:

    X-axis: percentage of time spent crossing domains

    Y-axis: relative cost of software enforced fault-domain crossing vs hardware method



Analysis, 4

  • The question is: do the savings from efficient XFD-RPC (over traditional RPC) make up for the encapsulation overhead?

  • The answer appears to be yes

  • The authors give an example: if an app spends 30% of its time crossing fault domains, their RPC mechanism needs to be only 10% better

  • Figure 5 was conservative and assumed everything was protected. Usually most of the app is trusted. Figure 6 assumes only 50% of the time is spent in distrusted code.
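
The breakeven point follows from setting the savings to zero, (1 - r) t_c = h t_d, and solving for r. The sketch below does that arithmetic. The overhead h = 4.3% is an assumed value, chosen because it reproduces roughly the 10% figure quoted above, and taking t_d = 1 - t_c is one reading of "everything was protected"; neither is taken directly from the slides.

```python
def break_even_ratio(h: float, t_c: float, t_d: float) -> float:
    """Largest crossing-cost ratio r at which the savings are still non-negative."""
    return 1 - h * t_d / t_c


H_ASSUMED = 0.043  # assumption: encapsulation overhead of about 4.3%

# Conservative case: 30% crossing, the remaining 70% all in distrusted code.
conservative = break_even_ratio(H_ASSUMED, t_c=0.30, t_d=0.70)

# Less conservative case: only 50% of the time is spent in distrusted code.
half_trusted = break_even_ratio(H_ASSUMED, t_c=0.30, t_d=0.50)

print(f"{1 - conservative:.1%} better needed")   # about 10%
print(f"{1 - half_trusted:.1%} better needed")   # smaller when less code is distrusted
```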



Some Additional Points

Why not object code? Because tools are there and you may lose compiler efficiencies

How many tools are written for this? In what compiler languages?

Any Questions?

