
SureMail Notification Overlay for Email Reliability

Sharad Agarwal, Venkat Padmanabhan, Dilip A. Joseph. 8 March 2006.


Presentation Transcript


  1. SureMail Notification Overlay for Email Reliability Sharad Agarwal Venkat Padmanabhan Dilip A. Joseph 8 March 2006

  2. Outline • Email loss problem • Design philosophy • SureMail design • SureMail robustness to security attacks • SureMail implementation

  3. What is Email Loss? • Email loss : sent email not received • Silent email loss • Loss w/o notification (no bounceback / DSN) • Why? • Aggressive spam filters • 90% corp. emails thrown away (blacklist) • AOL’s strict whitelist rules (must send 100/day) • Bouncebacks contribute to spam • Complex mail architecture upgrades / failures • SMTP reliability is per hop, not end-to-end

  4. How Much Email Loss? • Even loss of 1 email / user / year is bad • If it’s an important email • To really measure loss • Monitor many users’ send & receive habits • Count how many sent emails not received • Count how many bouncebacks received • Difficult to find enough willing participants that email each other across multiple domains

  5. Prior Work • “The State of the Email Address” • Afergan & Beverly, ACM CCR 01.2005 • Rely on bouncebacks; similar to “dictionary” attack • 25% of tested domains send bouncebacks • 1 sender • 0.1% to 5% loss, across 1468 servers, 571 domains • “Email dependability” • Lang, UNSW B.E. thesis 11.2004 • 40 accounts, 16 domains receive emails from 1 sender • Empty body, sequence number as subject • 0.69% silent loss

  6. Our Email Loss Study • Methodology • Controller composes email, sends • Our code for SMTP sending • Outlook for receiving (both inbox & junk mail) • Parse sent and received emails into SQL DB • Match on {sender,receiver,subject,attachment} • Heuristics for parsing bouncebacks • Want • Many sending, receiving accounts • Real email content
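
The matching step on this slide can be sketched as follows. This is a minimal illustration only: the study stored parsed emails in a SQL database, whereas this sketch uses in-memory dictionaries, and the field names and the function `find_unmatched` are assumptions made for the example.

```python
from collections import Counter


def find_unmatched(sent, received):
    """Return sent records that have no matching received record.

    Records are matched on (sender, receiver, subject, attachment).
    A Counter is used so that two identical sent emails need two
    received copies to both count as delivered.
    """
    def key(email):
        return (email["sender"], email["receiver"],
                email["subject"], email["attachment"])

    remaining = Counter(key(e) for e in received)
    lost = []
    for email in sent:
        k = key(email)
        if remaining[k] > 0:
            remaining[k] -= 1      # consume one received copy
        else:
            lost.append(email)     # nothing left to match against
    return lost


sent = [
    {"sender": "a@x", "receiver": "b@y", "subject": "s1", "attachment": None},
    {"sender": "a@x", "receiver": "b@y", "subject": "s2", "attachment": "f.pdf"},
]
received = [
    {"sender": "a@x", "receiver": "b@y", "subject": "s1", "attachment": None},
]
lost = find_unmatched(sent, received)
```

Emails left in `lost` would then be checked against parsed bouncebacks to separate notified loss from silent loss.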

  7. Experiment Details • Email accounts • 36 send, 42 receive • Junk filters off if possible • Email subject & body • Enron corpus subset • 1266 emails w/o spam • Email attachment • 70% no attachment • jpg,gif,ppt,doc,pdf,zip,htm • marketing,technical,funny

  8. Email Loss Results

  9. Loss Rates by Account • Loss rate 1.82% to 0.82%

  10. Loss Rates by Attachment • Nothing stands out

  11. Loss Rates by Subject/Body • ~50-250 emails sent per subject • Without 35% case : loss rate 1.82% to 1.79%

  12. Summary of Findings • Email loss rates are high • 1.82% loss • 0.71% conservative silent loss ( 1 / 140 ) • Difficult to disambiguate cause of loss • Difference between domains (filters or servers?) • No difference between mailboxes • No difference between attachments • Only 1 body had abnormally high loss

  13. Outline • Email loss problem • Design philosophy • SureMail design • SureMail robustness to security attacks • SureMail implementation

  14. We Found Email Loss; Now What? • Can try to fix email architecture, but • Hard to know exactly what is problem • Spam filters continually evolve; not perfect • Some architectures are very complicated • How many email systems are out there? • The current system mostly works

  15. Fixing the Architecture • Improve email delivery infrastructure • more reliable servers • e.g., cluster-based (Porcupine [Saito ’00]) • server-less systems • e.g., DHT-based (POST [Mislove ’03]) • total switchover might be risky • “Smarter” spam filtering • moving target → mistakes inevitable • non-content-based filtering still needed to cope with spam load

  16. Email Notifications • DSN / bouncebacks • Most spam filters don’t generate DSN on drop • Bogus DSNs due to spam w/ bogus sender • Some MTAs block DSN for privacy • MTA crash may not generate DSN • No DSN for loss between MTA and MUA • MDN / read receipts • Expose private info (when read, when online) • Can help spammers

  17. Notification Design Requirements • Cause minimal MTA/MUA disruption • Cause minimal user disruption • Preserve asynchronous operation • Preserve user privacy • Preserve repudiability • Maintain spam and virus defenses • Minimize traffic overhead

  18. Outline • Email loss problem • Design philosophy • SureMail design • SureMail robustness to security attacks • SureMail implementation

  19. SureMail Design Requirements • Cause minimal MTA/MUA disruption • No MTA modification; no Outlook modification • Cause minimal user disruption • User notified only on loss • Preserve asynchronous operation • Preserve user privacy • Only receiver is notified of loss • Preserve repudiability • No PKI / authentication • Maintain spam and virus defenses • Emails not modified • Minimize traffic overhead • 85 byte notification per email

  20. Basic Operation • Sender S sends email to receiver R • S also posts notification to overlay • R periodically downloads new email • R also downloads notifications from overlay • Notification without matching email → loss • delay : median 26s, mean 276s, max 36.6 hrs
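
The receiver-side check on this slide can be sketched as below. It assumes SHA-256 with a prefix as a stand-in for H1, a notification record with `digest` and `posted_at` fields, and a `grace_seconds` waiting period before declaring loss. None of these details come from the slides; the grace period is suggested only by the delay figures quoted above.

```python
import hashlib


def h1(message: bytes) -> str:
    # Stand-in for the H1 hash named on the slides (an assumption).
    return hashlib.sha256(b"H1" + message).hexdigest()


def detect_losses(notifications, inbox, now, grace_seconds):
    """Return notifications that have no matching email in the inbox.

    A notification is only reported once it is older than
    grace_seconds, so that slow but successful deliveries are
    not flagged as lost.
    """
    received = {h1(m) for m in inbox}
    return [
        n for n in notifications
        if n["digest"] not in received
        and now - n["posted_at"] >= grace_seconds
    ]


inbox = [b"hello"]
notifications = [
    {"digest": h1(b"hello"), "posted_at": 0},     # delivered
    {"digest": h1(b"lost"), "posted_at": 0},      # never arrived
    {"digest": h1(b"recent"), "posted_at": 990},  # too recent to judge
]
losses = detect_losses(notifications, inbox, now=1000, grace_seconds=600)
```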

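  21. SureMail Overview (diagram) • Sender S posts notification to Dnot=H1(R) • Notification contents: H1(Mnew), H1(Mold), T, MAC([T,H1(Mnew)],H2(Mold)) • Recipient R: Register with Dreg=H2(R) • Recipient R: GetNotifications from Dnot; Dnot and Dreg: Verify • “You’ve Lost Mail!” • Recipient R: Request lost message from S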

  22. SureMail Overview • Emails, MTAs, MUAs unmodified • Parallel notification overlay system • Decentralized; limited collusion • Agnostic to actual implementation • end-host-based (e.g., always-on user desktops) • infrastructure-based (e.g., “NX servers”) • Prevent notification snooping & spam • Email based registration • Reply based shared secret

  23. Email-Based Registration • Goal: prevent hijacking of R’s notifications • Only R can receive emails sent to R • Limited collusion among notification nodes • One-time operation for initial registration • R sends registration request to H2(R), H3(R) • H2(R), H3(R) email registration secrets to R • To retrieve notifications at H1(R) • R uses registration secrets with H1(R); H1(R) verifies with H2(R), H3(R), sends back notifications • None of H1(R), H2(R), H3(R) can associate notifications with R, unless they collude
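
One way to read the H1(R), H2(R), H3(R) notation above is as three independent hashes of R's address, each selecting a different overlay node. The sketch below illustrates that reading only. The salted SHA-256 construction, the modulo mapping, and the names `node_for` and `nodes_for_recipient` are assumptions; a Chord-based overlay would map the hash to a successor node on its ring rather than take a modulus.

```python
import hashlib


def node_for(address: str, role: str, num_nodes: int) -> int:
    """Map an email address to a node index for a given role.

    The role string ("H1", "H2", "H3") acts as a salt so the three
    hashes are independent of one another.
    """
    data = f"{role}:{address.lower()}".encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def nodes_for_recipient(address: str, num_nodes: int) -> dict:
    return {
        "notification": node_for(address, "H1", num_nodes),
        "registration": [
            node_for(address, "H2", num_nodes),
            node_for(address, "H3", num_nodes),
        ],
    }


placement = nodes_for_recipient("r@example.org", num_nodes=64)
```

Because a node only ever sees the hash that selected it, it holds either notifications or registration state for R, which is the separation the slide relies on when it rules out association without collusion.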

  24. Reply-Based Shared Secret • Goal: prevent notification spoofing & spam • Only R & S know their email conversations • S rarely converses with spammers • Reply detection • S sends Mold to R, R replies with M’old • S uses H(Mold) to “prove” identity to R in future • Notification for Mnew from S to R • H1(Mnew),H1(Mold),T,MAC([T,H1(Mnew)],H2(Mold)) • Only R can identify S • Shared secret can be continually refreshed
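
The notification tuple above can be sketched with standard primitives. The slides do not name the hash or MAC algorithms, so prefixed SHA-256 for H1 and H2, HMAC-SHA256 for the MAC, and an 8-byte timestamp encoding are all assumptions here; the resulting record is larger than the 85-byte format quoted earlier.

```python
import hashlib
import hmac
import struct


def h1(message: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + message).digest()  # assumed H1


def h2(message: bytes) -> bytes:
    return hashlib.sha256(b"\x02" + message).digest()  # assumed H2


def _mac(m_old: bytes, timestamp: int, digest_new: bytes) -> bytes:
    # MAC([T, H1(Mnew)], H2(Mold)): keyed by H2 of the old message.
    payload = struct.pack(">Q", timestamp) + digest_new
    return hmac.new(h2(m_old), payload, hashlib.sha256).digest()


def make_notification(m_new: bytes, m_old: bytes, timestamp: int) -> dict:
    digest_new = h1(m_new)
    return {
        "h1_new": digest_new,
        "h1_old": h1(m_old),
        "t": timestamp,
        "mac": _mac(m_old, timestamp, digest_new),
    }


def verify_notification(n: dict, replied_messages) -> bool:
    """R looks up Mold by H1(Mold), then recomputes the MAC."""
    for m_old in replied_messages:
        if hmac.compare_digest(h1(m_old), n["h1_old"]):
            expected = _mac(m_old, n["t"], n["h1_new"])
            return hmac.compare_digest(expected, n["mac"])
    return False


old = b"earlier email from S that R replied to"
note = make_notification(b"new email", old, timestamp=1141776000)
forged = dict(note, mac=b"\x00" * 32)
```

An overlay node sees only hashes and a MAC, so it cannot produce a valid tag for a different message without knowing Mold itself.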

  25. Attacks Defeated by Design • X cannot retrieve H1(R) notifications • H1(R) cannot identify R • H2(R), H3(R) cannot see R’s notifications • If they don’t collude; can increase to 3 nodes • X, H(R) cannot identify S • X, H(R) cannot learn Mnew, Mold • X cannot annoy R with bogus notifications • X cannot masquerade post to H1(R) as S

  26. First Time Sender • What if FTS email is lost? • FTS & spammer generally indistinguishable • But perhaps FTS knows I who knows R • Email networks have small world properties • I makes shared secret SI with all known parties • FTS sends email to R • Posts multiple notifications • One for every SI it has learned
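
The first-time-sender idea above can be sketched as follows. The slides do not say how a shared secret SI is applied to a notification, so this example assumes it keys an HMAC over the message digest. The function names `fts_notifications` and `accepts` are hypothetical.

```python
import hashlib
import hmac


def fts_notifications(message: bytes, learned_secrets):
    """Build one notification per intermediary secret the sender holds."""
    digest = hashlib.sha256(message).digest()
    return [
        {"digest": digest,
         "mac": hmac.new(secret, digest, hashlib.sha256).digest()}
        for secret in learned_secrets
    ]


def accepts(notification: dict, receiver_secrets) -> bool:
    """R accepts if any secret it shares with an intermediary matches."""
    return any(
        hmac.compare_digest(
            hmac.new(secret, notification["digest"], hashlib.sha256).digest(),
            notification["mac"],
        )
        for secret in receiver_secrets
    )


# The sender has learned secrets from two intermediaries; R knows only one.
posted = fts_notifications(b"first contact", [b"secret-I1", b"secret-I2"])
accepted = [accepts(n, [b"secret-I2", b"secret-I9"]) for n in posted]
```

Posting one notification per secret costs extra traffic, but the sender does not need to know which intermediary, if any, the receiver shares.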

  27. Other Issues • Reply-detection: • “in-reply-to” header may not always help • indirect checks based on text similarity • Reducing overhead: • post notifications only for “important” emails • delay posting in hope of receiving implicit ACK (reply) or NACK (bounce-back) • Mobility: • reply-based shared secret can be regenerated • web-mail • Can support mailing lists
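
The "delay posting" optimization above can be sketched as a small pending queue: a notification is held back for a while and dropped if a reply or bounceback arrives first. The class name `DeferredPoster`, its methods, and the fixed delay are assumptions for illustration, not the prototype's design.

```python
class DeferredPoster:
    """Hold notifications back in case an implicit ACK or NACK arrives."""

    def __init__(self, delay_seconds: float):
        self.delay = delay_seconds
        self.pending = {}  # message id -> time the email was sent

    def on_send(self, msg_id: str, now: float) -> None:
        self.pending[msg_id] = now

    def on_reply_or_bounce(self, msg_id: str) -> None:
        # A reply (ACK) or bounceback (NACK) makes the notification
        # unnecessary, so it is dropped without being posted.
        self.pending.pop(msg_id, None)

    def due(self, now: float) -> list:
        """Return and remove message ids whose delay has expired."""
        ready = [m for m, sent_at in self.pending.items()
                 if now - sent_at >= self.delay]
        for m in ready:
            del self.pending[m]
        return ready


poster = DeferredPoster(delay_seconds=3600)
poster.on_send("m1", now=0)
poster.on_send("m2", now=0)
poster.on_send("m3", now=3000)
poster.on_reply_or_bounce("m1")    # reply seen, nothing to post
to_post = poster.due(now=3600)     # only m2 has waited long enough
```

A longer delay saves more posts but also postpones loss detection at the receiver by the same amount.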

  28. Outline • Email loss problem • Design philosophy • SureMail design • SureMail robustness to security attacks • SureMail implementation

  29. SureMail Implementation • Reply detection heuristic for shared secret • Notification service • Centralized server running • Chord based DHT running • Notification posting, retrieving • Grab in/out bound email via Outlook MAPI call • No modification to Outlook binaries • XML notification put/get commands • Simple Win32 GUI

  30. SureMail GUI (screenshot; labels: “Lost!”, “Not lost”) • Client UI will see emails, will post & retrieve notifications • E.g. running on two machines netprofa@microsoft.com and netprofa@gmail.com

  31. Notification Results

  32. Summary • Email does get lost! • ~40 accounts, 158000 emails, 0.71%-0.91% silent loss • SureMail • Client based – unmodified email, servers, clients; no PKI • User intervention only on lost email • Keeps repudiability, privacy, asynchronous, spam & virus defense • Separate notification overlay robust • Simple, small message format • No virus, malware, spam filters needed • Provides failure independence • Status • ACM Hotnets 05; ACM Sigcomm 06 submission • Prototype implementation
