
Rewind, Repair, Replay: Three R’s to improve dependability



Aaron Brown and David Patterson, ROC Research Group, University of California at Berkeley. SIGOPS European Workshop, 23 September 2002.


Rewind, Repair, Replay: Three R’s to improve dependability

Aaron Brown and David Patterson

ROC Research Group, University of California at Berkeley

SIGOPS European Workshop, 23 September 2002



What if computer systems could travel in time?

  • We could have retroactive repair

    • travel back and fix problems before they had a chance to corrupt data

  • We could eliminate human operator error

    • make a mistake? Just travel back and try it again.

  • Our systems could be more robust

    • we could eliminate the dangers of upgrades

    • we could better tolerate buggy software

    • we might even be able to tolerate viruses and hackers

  • We could make more dependable systems



Sci-fi vs. computer time travel

  • Sci-fi time travel

    • our hero loses a loved one or lives through disaster

    • hero uses time machine to travel back in time

    • hero alters the past to avert the future disaster

    • hero returns to the present; past changes have been merged into the original timeline

  • Computer time travel

    • human error, software bug, or attack causes data loss

    • Rewind: roll system state backwards in time

    • Repair: make changes to avert foretold disaster

    • Replay: roll system state forward, merging the original timeline with the effects of repairs

  • Three R’s are the fundamental primitives of computer time travel



Key properties of the 3R’s

  • Recovery from problems at any system layer

    • rewind, repair, replay cover OS through application

  • Recovery from unanticipated problems

    • arbitrary repair

  • No assumptions about correct application behavior

    • physical rewind

  • Integrated interface

    • provide “undo for sysadmins”



What about existing approaches?



Designing a 3R system

  • Goals

    • application-neutrality

    • provide abstractions for reasoning about 3R behavior

  • Target domain: network services

    • accessed by remote users via well-defined interfaces

    • email, messaging, e-commerce, auctions, forums, web hosting, enterprise applications (J2EE, .NET), ...

  • Challenges, learned from first attempt

    • integrating history and repair during replay

    • managing inconsistency in externally-visible state


Basic architecture

[Figure: block diagram with components Control UI, App. Service (includes user state, application, and operating system), Undo Manager, Time-travel storage layer, and History Log; connections are labeled “3R API” and “control”.]

  • Application-independent undo manager

    • coordinates 3R cycle; manages external inconsistencies

    • linked via a set of APIs to application, time-travel storage, history log, and control UI



Abstracting the application service

  • To the undo manager, the application is:

    • a collection of state

    • a history of events affecting the state

      • an event is typically a user interaction with the service

    • a model of acceptable external consistency

  • These are encoded into application-defined verbs

    • high-level encodings of user interactions (events)

      • records of intent to alter state, not actual state changes

    • reference application state by opaque UIDs

    • provide policies that define external consistency



Verbs and the 3R cycle

  • Normal operation

    • undo manager logs application-provided verbs to disk

[Figure: the basic architecture diagram (Control UI, App. Service, Undo Manager, History Log, Time-travel storage layer), with added labels “User interaction” and “Verbs”.]



Verbs and the 3R cycle

  • Rewind

    • time-travel storage layer reverts system hard state to rewind point

    • all changes since rewind point are discarded

[Figure: the basic architecture diagram (Control UI, App. Service, Undo Manager, History Log, Time-travel storage layer).]



Verbs and the 3R cycle

  • Repair

    • operator edits logged history and/or makes arbitrary changes to system

[Figure: the basic architecture diagram (Control UI, App. Service, Undo Manager, History Log, Time-travel storage layer), with added labels “Repairs” and “Edits”.]



Verbs and the 3R cycle

  • Replay

    • undo manager feeds verbs back to application for re-execution in the context of repaired system

[Figure: the basic architecture diagram (Control UI, App. Service, Undo Manager, History Log, Time-travel storage layer), with added label “Verbs”.]
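The four steps of the cycle (normal operation, rewind, repair, replay) can be sketched end to end with a toy in-memory service. This is only an illustration under stated assumptions: `ToyService`, `UndoManager`, `Verb`, and the `buggy_filter` flag are hypothetical names, and a deep-copied dictionary stands in for the time-travel storage layer.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = Dict[str, Dict[str, str]]  # folder name -> {message id -> body}


@dataclass
class Verb:
    """A logged record of user intent, not of the resulting state change."""
    name: str
    args: Tuple[str, ...]


class ToyService:
    """Hypothetical mail service whose delivery path contains a bug."""

    def __init__(self) -> None:
        self.state: State = {"Inbox": {}}
        self.buggy_filter = True  # the fault the operator will later repair

    def execute(self, verb: Verb) -> None:
        if verb.name == "deliver":
            msg_id, body = verb.args
            if self.buggy_filter:
                body = body[::-1]  # simulated corruption of the message
            self.state["Inbox"][msg_id] = body
        elif verb.name == "move":
            msg_id, src, dst = verb.args
            self.state.setdefault(dst, {})[msg_id] = self.state[src].pop(msg_id)
        else:
            raise ValueError(f"unknown verb: {verb.name}")


class UndoManager:
    """Coordinates the cycle; a snapshot stands in for time-travel storage."""

    def __init__(self, service: ToyService) -> None:
        self.service = service
        self.history: List[Verb] = []
        self.snapshot: State = copy.deepcopy(service.state)  # rewind point

    def submit(self, verb: Verb) -> None:
        self.history.append(verb)   # normal operation: log the verb
        self.service.execute(verb)

    def rewind(self) -> None:
        # Discard every change made since the rewind point.
        self.service.state = copy.deepcopy(self.snapshot)

    def replay(self) -> None:
        # Re-execute intent in the context of the repaired system.
        for verb in self.history:
            self.service.execute(verb)


service = ToyService()
undo = UndoManager(service)
undo.submit(Verb("deliver", ("m", "Hello")))
undo.submit(Verb("move", ("m", "Inbox", "Folder1")))
corrupted = service.state["Folder1"]["m"]   # "olleH"

undo.rewind()
service.buggy_filter = False                # repair: arbitrary operator change
undo.replay()
repaired = service.state["Folder1"]["m"]    # "Hello"
```

Because the log holds verbs rather than state changes, the replayed move lands the repaired message in Folder1 without the operator having to re-create it by hand.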



The fundamental roles of verbs

  • Providing application-independence

    • verbs encapsulate application semantics, but remain semi-opaque to undo manager

  • Integration of repair into history

    • high-level specification of intent makes verbs relatively independent of system changes

    • verbs are re-executed, not restored, so they inherit effects of repairs

  • Scoping restored history

    • only changes logged as verbs will be preserved by 3Rs

      • effects of bugs, corruption, human error are discarded

    • can reason about what is preserved/lost in 3R cycle



Managing external inconsistency

  • External inconsistency == time paradox?

    • system is internally-consistent after a 3R cycle

    • but external observers see inexplicable state changes

    • external inconsistency is OK unless affected state was externalized (observed) before the 3R cycle

  • Coping with external inconsistency

    • cannot eliminate

    • must manage: ignore, explain, compensate, encompass

  • Verbs let us manage external inconsistency



Managing inconsistency with verbs

  • To detect inconsistencies:

    • verbs specify the state that they depend upon

    • undo manager tracks signatures of that state

    • if verb is altered or if signatures don’t match, there is an inconsistency

      • applications supporting relaxed consistency can replace signature-check with arbitrary consistency predicates

  • To detect state viewed externally:

    • verbs indicate what state they externalize

      • example: IMAP fetch verb externalizes email message

  • To handle externalized inconsistencies:

    • verb supplies compensation functions
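The detection rules above can be sketched as a small replay-time check. The names here (`LoggedVerb`, `check_on_replay`, the SHA-256 signature) are assumptions made for illustration, not the prototype's API, and the sketch does not model operator edits to the verb itself, which the slides also count as an inconsistency.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


def signature(content: str) -> str:
    """Assumed signature method: a SHA-256 hash of the state's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass
class LoggedVerb:
    name: str
    content_deps: List[str]            # UIDs whose content the verb depends on
    externalizes: List[str]            # UIDs whose state was shown to the outside
    logged_signatures: Dict[str, str]  # UID -> signature seen in the original timeline
    compensate: Optional[Callable[[str], str]] = None


def check_on_replay(verb: LoggedVerb, state: Dict[str, str]) -> List[str]:
    """Return the actions needed to manage inconsistencies for one verb."""
    actions: List[str] = []
    for uid in verb.content_deps:
        if signature(state[uid]) == verb.logged_signatures[uid]:
            continue  # same state as in the original timeline: consistent
        if uid in verb.externalizes and verb.compensate is not None:
            # The changed state was observed externally: compensate.
            actions.append(verb.compensate(uid))
        else:
            # Never observed, so the inconsistency can be ignored.
            actions.append(f"ignore: {uid} changed but was never externalized")
    return actions


fetch = LoggedVerb(
    name="FetchMsg",
    content_deps=["m"],
    externalizes=["m"],
    logged_signatures={"m": signature("olleH")},
    compensate=lambda uid: f"explain: message {uid} differs from what was fetched earlier",
)

unchanged_actions = check_on_replay(fetch, {"m": "olleH"})
replay_actions = check_on_replay(fetch, {"m": "Hello"})
```

An application with a relaxed consistency model would replace the equality test on signatures with its own predicate.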


Email example: original timeline

[Figure: timeline diagram with rows labeled System boundary, System state, Verbs, and History log, plotted against Time. A message m with input “Hello” passes through Deliver, Move, and Fetch operations involving Inbox and Folder1; the labels “olleH” and “!” also appear in the diagram.]

  • DeliverMsg

    • Externalizes: none

    • ContentDep: none

    • ExistsDep: Inbox

    • + input “Hello”

  • MoveMsg

    • Externalizes: none

    • ContentDep: none

    • ExistsDep: Inbox, Folder1

  • FetchMsg

    • Externalizes: m

    • ContentDep: m

    • ExistsDep: m, Folder1

    • + Signature(m)=“olleH”


Email example: replay timeline

[Figure: the same timeline diagram (System boundary, System state, Verbs, History log, Time) during replay, with each element of the original timeline shown twice. The message labels include both “Hello” and “olleH”; the diagram carries the annotation “mismatch! => inconsistency” and an “X” mark.]

  • DeliverMsg

    • Externalizes: none

    • ContentDep: none

    • ExistsDep: Inbox

    • + input “Hello”

  • MoveMsg

    • Externalizes: none

    • ContentDep: none

    • ExistsDep: Inbox, Folder1

  • FetchMsg

    • Externalizes: m

    • ContentDep: m

    • ExistsDep: m, Folder1

    • + Signature(m)=“olleH”

  • mismatch! => inconsistency



Recap: 3R architecture

  • Goal: application-neutral implementation of 3R’s

    • verb abstraction couples generic undo manager to app.

    • verbs provide tools to reason about 3R behavior

  • Challenges

    • integrating history and repair during replay

      • re-executing verbs restores intent of history

    • managing inconsistency in externally-visible state

      • verbs track externalization, state dependencies, and define compensations



Status

  • Prototype implementation of 3R primitives nearly complete

    • app-independent undo manager written in Java

    • all APIs defined as Java interfaces

    • Network Appliance filer as time-travel storage layer

    • BerkeleyDB as history log

  • First target app: web-based email service

    • 3R-enhanced JavaMail API provider classes

      • plus additional hooks to verb-ify operator maintenance tasks like account creation

    • JWebMail web front-end

    • RDBMS-based backend mail store (DB2 or MySQL)

    • implementation in progress



Open issues & future work

  • Resource impact of the 3R’s

    • what are the performance/space penalties for the 3R’s?

  • Verb definition

    • can we specify verbs & consistency policy declaratively?

  • Providing the 3R’s at multiple granularities

    • can we track & manage cross-granularity dependencies?

  • Measuring the dependability benefit of 3R’s

    • how do we build recovery/dependability benchmarks?

  • Other uses for verb-based characterizations

    • easy georeplication? online self-checking? automatic verification of upgrades?



Conclusions

  • We can build time travel for computers

    • using the 3R’s: Rewind, Repair, Replay

  • An architecture for the 3R primitives

    • generic undo manager coupled to application by verbs

  • Verbs are a useful abstraction for the 3R’s

    • can use to reason about effects of 3R’s on state

    • help address problem of external inconsistencies

  • Prototype 3R-enabled email system under construction

    • hope to demonstrate increased dependability and faster recovery from problems


Rewind, Repair, Replay: Three R’s to improve dependability

For more information:

http://roc.cs.berkeley.edu/

[email protected]



Backup slides



Verbs vs. transactions

  • Both encapsulate state-altering events

  • But, unlike transactions:

    • verbs are higher-level, recording end-user intent, not specific state changes

    • verbs do not depend on internal data models (but do depend on external protocols)

      • transactions are the reverse

    • verbs do not necessarily conform to ACID consistency

      • verbs inherit consistency model provided by application at the external-protocol level



Implementing verbs

  • Verbs are defined by a type hierarchy

    • base type defines interfaces for state dependencies, externalizations, predicates, compensations

    • applications subclass the base type for their verbs

      • additions to the type are opaque to the undo manager

  • Referencing state

    • all user-visible state named by time-invariant UIDs

    • undo manager requires signature method for all state

  • Consistency predicates and compensations are application-supplied functions

    • they encode the app’s external consistency model
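A minimal sketch of such a type hierarchy follows. The slides say the real interfaces are defined in Java; this Python version uses hypothetical method names (`state_dependencies`, `externalizes`, `is_consistent`, `compensate`) purely to show how a base type can expose what the undo manager needs while application-specific fields stay opaque.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Verb(ABC):
    """Base type: the only surface the undo manager is assumed to use."""

    @abstractmethod
    def state_dependencies(self) -> List[str]:
        """UIDs of the state this verb depends on."""

    @abstractmethod
    def externalizes(self) -> List[str]:
        """UIDs of the state this verb exposes to external observers."""

    def is_consistent(self, logged_sig: str, current_sig: str) -> bool:
        # Default predicate: plain signature comparison.
        return logged_sig == current_sig

    def compensate(self, uid: str) -> str:
        # Default: nothing to do; applications override this.
        return f"no compensation defined for {uid}"


class FetchMsg(Verb):
    """Application-defined verb; its extra fields are opaque to the undo manager."""

    def __init__(self, msg_uid: str, folder_uid: str) -> None:
        self.msg_uid = msg_uid
        self.folder_uid = folder_uid

    def state_dependencies(self) -> List[str]:
        return [self.msg_uid, self.folder_uid]

    def externalizes(self) -> List[str]:
        return [self.msg_uid]

    def compensate(self, uid: str) -> str:
        return f"insert explanatory text into message {uid}"


def undo_manager_view(verb: Verb) -> Dict[str, List[str]]:
    """What a generic undo manager can learn using only the base interface."""
    return {"deps": verb.state_dependencies(), "externalizes": verb.externalizes()}


view = undo_manager_view(FetchMsg("m", "Folder1"))
```

The undo manager never touches `msg_uid` or `folder_uid` directly, which is the sense in which subclass additions stay opaque to it.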



Defining verbs

  • Currently, verbs are defined procedurally

    • provide dependency information via lists of state IDs

    • provide functions for special consistency predicates

    • provide functions for compensation

  • Better: declarative specification

    • compile textual specification into verb code using libraries of predicates and compensation functions

    • reduces complexity of adding 3R’s to the application

    • increases confidence in undo system via easier testing
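One way to picture the declarative approach is a tiny compiler from a textual specification to a verb class. Everything here is hypothetical: the `key: value` syntax, the `PREDICATES` and `COMPENSATIONS` libraries, and `compile_verb` are invented for illustration, since the slides list declarative specification as future work and define no format.

```python
from typing import Callable, Dict, List

# Hypothetical libraries of reusable predicates and compensation functions.
PREDICATES: Dict[str, Callable[[str, str], bool]] = {
    "signature_equal": lambda logged, current: logged == current,
    "always_ok": lambda logged, current: True,
}
COMPENSATIONS: Dict[str, Callable[[str], str]] = {
    "explain": lambda uid: f"explain change to {uid}",
    "none": lambda uid: "",
}

# Hypothetical textual specification of one verb.
SPEC = """
verb: FetchMsg
content_dep: m
exists_dep: m, Folder1
externalizes: m
predicate: signature_equal
compensation: explain
"""


def parse_spec(text: str) -> Dict[str, str]:
    """Parse 'key: value' lines into a dictionary."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def split_ids(value: str) -> List[str]:
    """Split a comma-separated list of state IDs."""
    return [item.strip() for item in value.split(",") if item.strip()]


def compile_verb(text: str) -> type:
    """Build a verb class from the specification and the function libraries."""
    fields = parse_spec(text)
    return type(fields["verb"], (object,), {
        "content_deps": split_ids(fields.get("content_dep", "")),
        "exists_deps": split_ids(fields.get("exists_dep", "")),
        "externalizes": split_ids(fields.get("externalizes", "")),
        "is_consistent": staticmethod(PREDICATES[fields["predicate"]]),
        "compensate": staticmethod(COMPENSATIONS[fields["compensation"]]),
    })


FetchMsg = compile_verb(SPEC)
```

Since the specification is plain data, a test harness could enumerate specifications and check each generated verb against the same library functions, which is one reading of the "easier testing" claim.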



External consistency policies

  • Verbs capture external consistency policies

  • Example: email

    • message order in folder is irrelevant

      • AppendMessage verb does not express dependency on content of target folder, only its existence

    • content of messages is relevant, except for headers

      • ReadMsg verb depends on hash of target message body; if changed, compensate by inserting explanatory text

  • Example: e-commerce

    • order total depends on item prices, not descriptions

      • Checkout verb depends on prices of items in cart, not their hash-values; if sum of prices changed, compensate by emailing customer for approval



External consistency policies (2)

  • Example: auctions

    • new bid must be larger than prior bids

      • PlaceBid verb depends on content of all bids in bid set; if one is now larger than new bid, compensate by canceling new bid and informing bidder
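The e-commerce and auction policies can be written as small consistency predicates paired with compensations. This is a sketch with assumed function names and data shapes (prices as integer cents, bids as integers); treating a tie as invalid is also an assumption, since the slides only say a new bid must be larger than prior bids.

```python
from typing import Dict, List


def checkout_consistent(logged_prices: Dict[str, int],
                        current_prices: Dict[str, int],
                        cart: List[str]) -> bool:
    """Checkout depends only on the sum of prices of the items in the cart."""
    logged_total = sum(logged_prices[item] for item in cart)
    current_total = sum(current_prices[item] for item in cart)
    return logged_total == current_total


def place_bid_consistent(new_bid: int, prior_bids: List[int]) -> bool:
    """PlaceBid stays valid only if it still exceeds every prior bid."""
    return all(new_bid > bid for bid in prior_bids)


def replay_checkout(logged_prices: Dict[str, int],
                    current_prices: Dict[str, int],
                    cart: List[str]) -> str:
    if checkout_consistent(logged_prices, current_prices, cart):
        return "ok"
    return "compensate: email customer for approval"


def replay_place_bid(new_bid: int, prior_bids: List[int]) -> str:
    if place_bid_consistent(new_bid, prior_bids):
        return "ok"
    return "compensate: cancel bid and inform bidder"


# Individual prices changed but the total did not, so no compensation is needed.
same_total = replay_checkout({"a": 500, "b": 250}, {"a": 400, "b": 350}, ["a", "b"])
# After repair and replay, a prior bid of 130 now exceeds the new bid of 120.
outbid = replay_place_bid(120, [100, 130])
```

Note that the checkout predicate never looks at item descriptions or hashes, matching the policy that only the order total matters.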



Application implications

  • To support the 3R’s, an application must have:

    • a high-level, verb-structured interface/API for user, operator, and external actions

    • a state model where all user-visible state:

      • is nameable via the API

      • is tagged with GUIDs

      • supports a signature/hash method

    • a relaxed external consistency model that allows compensation for externalized inconsistent verbs
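The state-model requirements can be illustrated with a small registry in which every object is nameable, carries a GUID, and exposes a signature. `StateObject` and `StateRegistry` are hypothetical names, and SHA-256 is an assumed choice of hash; the slides require only that some signature/hash method exists.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StateObject:
    """One piece of user-visible state, tagged with a GUID."""
    content: str
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))

    def signature(self) -> str:
        # Hash of the content; changes whenever the content changes.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()


class StateRegistry:
    """Makes every state object nameable through one API."""

    def __init__(self) -> None:
        self._objects: Dict[str, StateObject] = {}

    def create(self, content: str) -> str:
        obj = StateObject(content)
        self._objects[obj.uid] = obj
        return obj.uid

    def lookup(self, uid: str) -> StateObject:
        return self._objects[uid]


registry = StateRegistry()
msg_uid = registry.create("Hello")
before = registry.lookup(msg_uid).signature()

registry.lookup(msg_uid).content = "olleH"   # content changes, name does not
after = registry.lookup(msg_uid).signature()
```

The GUID stays fixed while the signature tracks content, so a logged verb can keep naming the same object across a rewind and still detect that its content differs.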



Example: a 3R email store

  • State

    • mailstores, folders, messages, user properties, aliases

  • Verbs

    • transport: create/delete/alter mapping; deliver msg

    • directory: create/alter/delete user-entry; create/alter/delete filter-rule; add/remove maildrop

    • store: create/delete store; create/rename/delete folder; expunge folder; list folder; set folder flags; copy msg; append msg; fetch msg; set msg flags

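[Figure: block diagram of the email service with components WebUI, Transport, Store, Directory/Auth., and UndoMgr; connections are labeled HTTP, SMTP, “IMAP, internal”, “LDAP, internal”, “internal”, and “verbs”.]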

