
Recovery-Oriented Computing User Study



  1. Recovery-Oriented Computing User Study: Training Materials, October 2003

  2. Overview • Informed consent & Introduction • User study scenario & your role • Training (20 minutes) • Two study sessions (30 minutes each) • Wrap-up and questionnaire

  3. Informed Consent • Please read the overview of the study and the informed consent form • please feel free to ask any questions you have about the experiment, its goals, its procedures, etc. • If you agree to participate in the experiment, please sign the informed consent form

  4. Introduction • This study is evaluating new recovery tools • the tools are designed to help system administrators recover from problems affecting server systems • You will be playing the role of a system administrator • in each of two sessions, you will be trying to recover an e-mail server system from a pre-existing problem

  5. Introduction (2) • In each session, you may (or may not) be given an experimental recovery tool to use • We are trying to understand when the tool is useful for you and when it is not • so if you are given the tool, please think carefully about whether or not to use it when you are attempting to recover from a problem • at the end of the session, you will be asked to explain why you chose to use (or not use) the tool

  6. The Scenario

  7. User Study Scenario • You are one of several system administrators of an electronic mail (e-mail) service • the administrators work in shifts • the study starts when you arrive for your shift • You arrive to find users complaining that the e-mail service is not working • you will be provided with details of the complaint • the e-mail failure may be caused by: • failure of the e-mail software, or • an error made by the administrator on the previous shift

  8. User Study Scenario: Your Role • Your responsibilities and goals: • restore the e-mail service to normal operation as quickly as possible • minimize the amount of lost e-mail and user work • Note: • you should prioritize restoring service over preserving changes made by other administrators

  9. User Study Scenario: Resources • Resources you will have: • a log of all actions performed by administrators in previous shifts • a day-old backup of the server’s file systems • the Internet • a test e-mail account • a guru • during each session, you may make up to one request for help to the guru • Plus any experimental recovery tool that we provide (described later)

  10. Training: E-mail Server

  11. E-mail Overview • This study concerns e-mail store servers • e-mail stores receive and store e-mail for their users • users’ mailboxes live on the e-mail store • they do not handle sending or routing of outgoing mail • E-mail stores use two protocols • SMTP: used to deliver incoming e-mail to a mailbox • SMTP is spoken between a remote server that sends the message, and the local recipient e-mail store server • IMAP: used to retrieve & manipulate mail in a mailbox • IMAP is spoken between a user’s e-mail client and their local e-mail store server

  12. E-mail Server Configuration • Mailboxes are text files in /var/mail, e.g. /var/mail/user173 • sendmail: process that receives and delivers incoming e-mail • imapd: process that provides remote access to mailboxes • Mail store configuration files can be found in /etc/mail • [Diagram: incoming e-mail arrives from the Internet over SMTP at the SMTP server process (sendmail); users read e-mail over IMAP through the IMAP server process (imapd); both processes run on the Linux e-mail server undovmN.cs.berkeley.edu, N={1,2,3}, and access the mailboxes in /var/mail/userNNN]

  13. Simple Familiarization Task • Take some time to get familiar with the console and the e-mail system • by performing a basic task as described below • Goals: • ensure sendmail is running • reconfigure server to recognize mail sent to user@roc.cs.berkeley.edu • restart sendmail to activate reconfiguration • First step: • connect to undovm3.cs.berkeley.edu with ssh (continues...)

  14. Simple Familiarization Task (2) • Next, check if sendmail is running: • execute the command: ps ax | grep sendmail • Reconfigure server to accept new host name: • edit /etc/mail/local-host-names to add the line: roc.cs.berkeley.edu • Finally, restart sendmail: • run /etc/init.d/sendmail restart • Try this task now!
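
The reconfiguration step above can also be sketched as a small script. This is only an illustration, not part of the study materials: the function name `add_local_host_name` is hypothetical, the existing file contents in the demo are made up, and the demo writes to a scratch file rather than the real /etc/mail/local-host-names. It assumes the file holds one host name per line and that lines starting with `#` are comments.

```python
from pathlib import Path
import tempfile


def add_local_host_name(config_path, host_name):
    """Append host_name to a local-host-names style file unless already listed.

    Returns True if the file was changed, False if the name was already there.
    """
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # Compare against non-comment entries, ignoring surrounding whitespace.
    entries = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    if host_name in entries:
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + host_name + "\n")
    return True


if __name__ == "__main__":
    # Demonstrate on a scratch copy (hypothetical starting contents).
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "local-host-names"
        demo.write_text("undovm3.cs.berkeley.edu\n")
        print(add_local_host_name(demo, "roc.cs.berkeley.edu"))
        print(add_local_host_name(demo, "roc.cs.berkeley.edu"))
        print(demo.read_text(), end="")
```

Making the edit idempotent means running it twice leaves a single entry. As in the manual procedure, sendmail still has to be restarted afterwards for the change to take effect.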

  15. Training: Experimental Recovery Tool

  16. Recovery Tool: an Undo System • The undo system can undo administrative changes to the e-mail store, including: • changes to configuration files • software upgrades • deleted or altered files • It can be used to restore the e-mail server to a previously known-good state • by “rewinding” to a date when the system worked OK • The undo system preserves incoming e-mail and user mailbox changes

  17. When Can the Undo System Help? • The undo system is useful: • when you cannot tell what is causing a problem • but you know that the system was working at some point in the past • when a problem affects system state • typically, the same cases where restoring a backup would fix the problem • It does not help when the problem does not affect state • for example, if a server process (e.g., sendmail) has crashed cleanly without corrupting state

  18. Why Use the Undo System? • Unlike using a backup, the undo system also repairs the side effects of problems • example: if a problem caused e-mail to be lost, using undo to fix the problem will restore the lost e-mail • the undo system does this by recording incoming e-mail and users’ mailbox edits, then restoring them during recovery • Undo is also useful when you cannot diagnose a problem • simply undo the system to a point in time when it was known to be working

  19. Undo System Operation • An undo cycle has two stages: • rewind: the e-mail system’s state is reverted to the way it appeared at a past time (the “rewind point”) • all changes to the system made since the rewind point are undone, including: • changes made by administrators • changes due to software bugs • incoming e-mail delivery and user mailbox edits • commit: makes the rewind permanent but restores incoming e-mail & user mailbox edits to present time • Net effect: undo cycle undoes all changes except incoming e-mail and mailbox edits
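
The net effect of an undo cycle can be illustrated with a toy model. This is a sketch for intuition only and says nothing about how the real undo system is implemented: the `Event` class, the `rewind` and `commit` functions, and the choice to keep an event that falls exactly on the rewind point are all assumptions made for the example.

```python
from dataclasses import dataclass

USER = "user"    # incoming e-mail or a user's mailbox edit
ADMIN = "admin"  # change made by an administrator
BUG = "bug"      # change caused by a software bug


@dataclass(frozen=True)
class Event:
    time: int         # minutes since an arbitrary start
    kind: str         # USER, ADMIN, or BUG
    description: str


def rewind(events, rewind_point):
    """Split events into (kept, undone) around the rewind point.

    Assumption: an event exactly at the rewind point is kept.
    """
    kept = [e for e in events if e.time <= rewind_point]
    undone = [e for e in events if e.time > rewind_point]
    return kept, undone


def commit(kept, undone):
    """Make the rewind permanent, restoring only the undone user events."""
    restored = [e for e in undone if e.kind == USER]
    return sorted(kept + restored, key=lambda e: e.time)


if __name__ == "__main__":
    timeline = [
        Event(0, USER, "mail delivered to user1"),
        Event(5, ADMIN, "configuration file edited"),
        Event(7, USER, "user2 deletes a message"),
        Event(9, BUG, "software bug alters a file"),
    ]
    kept, undone = rewind(timeline, rewind_point=3)
    for event in commit(kept, undone):
        print(event.time, event.kind, event.description)
```

In this model, rewinding to minute 3 and committing leaves the two user events in place, while the administrator's change and the bug's change stay undone.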

  20. Illustration of Undo Cycle • [Timeline diagrams, each showing user events (incoming e-mail, mailbox edits) and admin changes along a time axis] • Before undo: user events and admin changes are interleaved over time • After rewind: all changes after the rewind point, both user events and admin changes, are undone • After commit: the undone user events are restored; note that admin changes remain undone

  21. Controls for the Undo System • Rewind: begins an undo cycle • defines a rewind point and undoes all later changes • may cause e-mail server to automatically reboot • takes 4 to 5 minutes to execute • Commit: completes the undo cycle • makes the rewind permanent • restores incoming e-mail & mailbox edits to present time • takes about 5 minutes to execute • Cancel: aborts the undo cycle • restores e-mail server to the state it was in before rewinding
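
The three controls can be read as a small state machine with a normal state and a rewound state. The sketch below is hypothetical: the class and method names are invented for illustration, and rejecting a second rewind while a cycle is in progress is an assumption, since the slides do not say how the real tool behaves in that case.

```python
from enum import Enum


class UndoState(Enum):
    NORMAL = "normal"
    REWOUND = "rewound"


class UndoControls:
    """Toy model of the Rewind, Commit, and Cancel controls."""

    def __init__(self):
        self.state = UndoState.NORMAL
        self.rewind_point = None
        self.history = []  # operations performed so far, oldest first

    def rewind(self, rewind_point):
        # Assumption: only one undo cycle may be in progress at a time.
        if self.state is not UndoState.NORMAL:
            raise RuntimeError("an undo cycle is already in progress")
        self.rewind_point = rewind_point
        self.state = UndoState.REWOUND
        self.history.append(("rewind", rewind_point))

    def _finish(self, operation):
        if self.state is not UndoState.REWOUND:
            raise RuntimeError(f"cannot {operation}: no undo cycle in progress")
        self.history.append((operation, self.rewind_point))
        self.rewind_point = None
        self.state = UndoState.NORMAL

    def commit(self):
        """Complete the cycle: the rewind becomes permanent."""
        self._finish("commit")

    def cancel(self):
        """Abort the cycle: the pre-rewind state is restored."""
        self._finish("cancel")
```

Both Commit and Cancel return the model to the normal state; they differ only in which version of the server state survives.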

  22. Undo System Interface • Main window: normal state • time is divided into 5-minute intervals • each interval contains user events like incoming mail • it’s fastest to rewind to a checkpoint • [Screenshot labels: intervals; intervals containing checkpoints; timeline (color indicates relative load); checkpoints; current time; current undo status]
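
The 5-minute intervals and the advice to rewind to a checkpoint can be made concrete with a little arithmetic. The helper names below are hypothetical, times are expressed as minutes since an arbitrary start, and picking the latest checkpoint at or before the time you want is an assumption about how one might choose a rewind point, not a documented rule of the tool.

```python
from bisect import bisect_right

INTERVAL_MINUTES = 5  # interval length shown in the main window


def interval_index(minute):
    """Return the index of the 5-minute interval containing the given minute."""
    return minute // INTERVAL_MINUTES


def latest_checkpoint_at_or_before(checkpoints, target_minute):
    """Return the latest checkpoint time not after target_minute, or None.

    checkpoints must be a sorted list of checkpoint times in minutes.
    """
    position = bisect_right(checkpoints, target_minute)
    return checkpoints[position - 1] if position else None


if __name__ == "__main__":
    checkpoints = [0, 30, 60]  # hypothetical checkpoint times
    last_known_good = 45       # hypothetical time the system last worked
    choice = latest_checkpoint_at_or_before(checkpoints, last_known_good)
    print(choice, interval_index(choice))
```

Choosing a checkpoint no later than the last known-good time keeps the rewind point before the problem appeared, at the cost of undoing a few more administrative changes.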

  23. Undo System Interface (2) • Main window: rewound state • [Screenshot labels: current time (in the past) indicates undo point; current undo status; history of undo operations; Commit and Cancel buttons]

  24. Undo System Interface (3) • Event window • used to initiate rewind • to view, double-click on an interval in main window • [Screenshot labels: click to invoke undo cycle; selected event (rewind point); current time; description of event (here, user170 is examining their mailbox); event sequence #]

  25. Familiarization, Part II • Try out the undo system interface • note: actually performing an undo cycle may take 10 or more minutes to complete • Familiarize yourself with the various resources available to you during the study • Outlook Express e-mail client • the test e-mail account: user250@undovmN.cs.berkeley.edu N={1,2,3} • the system backup: /backup • books, documentation, the Internet • guru advice: at most one question per session

  26. Resources for More Information • E-mail in general • About Internet email protocols: http://perl.about.com/library/weekly/aa020600a.htm • E-mail references: http://www.newt.com/email/references.html • Sendmail • O’Reilly Sendmail book (next to your workstation) • Sendmail home page: http://www.sendmail.org • SMTP RFC: http://www.isi.edu/in-notes/rfc2821.txt • IMAPd • IMAP general info: http://www.imap.org/ • UW-IMAP home page: http://www.washington.edu/imap/ • IMAP RFC: http://www.isi.edu/in-notes/rfc3501.txt
