Andrew System E-mail Architecture at Carnegie Mellon University

Rob Siemborski (rjs3@andrew.cmu.edu)
Walter Wong (wcw@cmu.edu)

Computing Services

Carnegie Mellon University

5000 Forbes Ave

Pittsburgh, PA 15213

Last Revision: 01/27/2004 (wcw)

Presentation Overview
  • History & Goals
  • The Big Picture
  • Mail Transfer Agents
  • Mail Processing (Spam & Virus Detection)
  • The Directory
  • The Cyrus IMAP Aggregator
  • Clients and Andrew Webmail
  • Current Andrew Hardware Configuration
  • Future Directions
The Early Years
  • Early 80s – The Andrew Project
    • Campus-wide computing
    • Joint IBM/CMU Venture
    • One of the first large-scale distributed systems, challenging the ‘mainframe’ mentality
    • The Andrew File System (AFS)
    • The Andrew Message System (AMS)
Goals of the Andrew Message System
  • Reliability
  • Machine and Location Independence
  • Integrated Message Database
    • Personal Mail and Bulletin Boards
  • Separation of Interface from Functionality
  • Support for Multi-Media
  • Scalability
  • Easy to Extend, Easy to Use
End of AMS
  • AMS was a nonstandard system
    • Avoid becoming a “technology island”
    • Desire not to maintain our own clients
  • AMS was showing scalability problems
  • Desire to decouple the file system from the mail system
Project Cyrus Goals
  • Scalable to tens of thousands of users
  • Support wide use of bulletin boards
  • Use widely accepted standards-based technologies
    • Comprehensive client support on all major platforms
  • Supports a disconnected mode of operation for the mobile user
Project Cyrus Goals (2)
  • Supports Kerberos authentication
  • Allows for easy sharing of private folders with select individuals
  • Separation of the mail store from a distributed file system
  • Can be independently installed, managed and set up for use in small departmental computing facilities
More CMU Mail System Goals
  • Allow users to have a single @cmu.edu address no matter where their actual mail store is located
    • “CMUName” Service
  • Ability to detect and act on incoming Spam and Virus Messages
  • Provide access to mail over the Web
  • Integration of messaging into the overall Computing Experience
The Big Picture

[Diagram: users and their mail clients exchange mail with the Internet through the Mail Transfer Agents (three pools); the MTAs consult the LDAP Directory servers and deliver to the Cyrus IMAP Aggregator, which the mail clients read from.]

Mail Transfer Agents
  • Andrew has 3 Pools of Mail Transfer Agent (MTA) Machines
    • Mail exchangers (MX Servers) receive and handle mail from the outside world for the ANDREW.CMU.EDU domain.
    • The “SMTP Servers” process user-submitted messages (SMTP.ANDREW.CMU.EDU)
    • Mail exchangers for the CMU.EDU domain (the CMU.EDU MXs)
  • All Andrew MTAs run Sendmail
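
As a rough sketch of how these pools are reached, a sending MTA looks up the destination domain's MX records in DNS. A minimal Python example using the dnspython library (the domains are from this slide; the records returned depend on live DNS):

```python
# Sketch: how a sending MTA locates the MX pool for a domain, using
# the dnspython library. Pool membership is published as several MX
# records; lower preference values are tried first.
import dns.resolver

for domain in ("andrew.cmu.edu", "cmu.edu"):
    answers = dns.resolver.resolve(domain, "MX")
    for rr in sorted(answers, key=lambda r: r.preference):
        print(domain, rr.preference, rr.exchange)
```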
Mail Transfer Agents (2)
  • Why 3 Pools?
    • MX Servers
      • Subject to the ebb and flow of the outside world
      • Significant CPU-intensive processing
      • Typically handle much larger queues (7,000+ messages each)
    • SMTP Servers
      • Speak directly to our clients
      • Need to be very responsive
      • Very small queues (200 messages each)
Mail Transfer Agents (3)
    • CMU.EDU MXs
      • Service separation from Andrew MX servers
      • Mostly just forwarding
      • No real need to duplicate processing done on Andrew MX servers
  • All Three Pools are Redundant
    • Minimize impact of a machine failure
Mail Transfer Agents (4)
  • Separate MTA pools give significant control over incoming email.
    • A message may touch multiple pools
  • Example:
    • User submits a message to foo@CMU.EDU via the SMTP servers
    • Message is processed by a CMU.EDU MX, bound for foo@ANDREW.CMU.EDU
    • Message is processed by an Andrew MX
    • Final delivery to the Cyrus Aggregator

Mail Processing
  • All mail through the system is “processed” to some degree.
    • Audit Logging
    • Cleaning badly-formed messages
    • Blocking restricted senders/recipients/relays
  • More substantial processing done by Andrew MX Servers
Mail Processing (2)
  • Spam Detection
    • Uses Heuristic Algorithms to identify Spam Messages (SpamAssassin)
    • Tags message with a header and score
    • User-initiated filters (SIEVE) can detect the header and act upon it (bounce the message or file it into an alternate folder)
    • Very computationally expensive on MX
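
For illustration, a minimal Python sketch of the kind of test a user's SIEVE rule performs on a tagged message. It assumes SpamAssassin's stock X-Spam-Level header (one '*' per point of score); the threshold and folder names are illustrative, not Andrew's actual configuration:

```python
# Sketch of the test a SIEVE filter applies to a tagged message.
# Assumes SpamAssassin's "X-Spam-Level" header, which encodes the
# score as one '*' per point; names here are illustrative only.
from email import message_from_string

RAW = """\
From: someone@example.com
To: foo@andrew.cmu.edu
Subject: hello
X-Spam-Level: ******

Buy now!
"""

def pick_folder(raw_message, threshold=5):
    msg = message_from_string(raw_message)
    level = msg.get("X-Spam-Level", "")
    # One star per spam point; file high-scoring mail elsewhere.
    if level.count("*") >= threshold:
        return "INBOX.spam"
    return "INBOX"

print(pick_folder(RAW))  # -> INBOX.spam
```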
Mail Processing (3)
  • Virus Detection
    • Uses signatures to match virus messages (ClamAV)
    • “Bounce” message immediately at the incoming RCPT
    • Debate between bounce vs. tag
The Directory
  • Mail delivery and routing are assisted by an LDAP-accessible database.
  • Every valid destination address has an LDAP entry
  • LDAP lookups can do “fuzzy matching”
  • LDAP queries done against replicated pool
The Directory (2)
  • Every account has a mailRoutingAddress: the “next hop” of the delivery process
    • mRA is not generally user-configurable
  • Some accounts have a user-configurable mailForwardingAddress (mFA)
    • mFA will override the mRA
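
A minimal sketch of the next-hop decision, using the ldap3 Python library; the host, base DN, and search filter are hypothetical, while the mailRoutingAddress/mailForwardingAddress attributes are the ones described above:

```python
# Sketch of next-hop resolution against the directory (ldap3 library).
# Host, base DN, and filter are hypothetical; the attribute names are
# from the slides. A user-set mFA overrides the mRA.
from ldap3 import Server, Connection

def next_hop(address):
    conn = Connection(Server("ldap.example.edu"), auto_bind=True)
    conn.search(
        "dc=example,dc=edu",
        "(mail=%s)" % address,
        attributes=["mailRoutingAddress", "mailForwardingAddress"],
    )
    if not conn.entries:
        return None  # no entry: not a valid destination address
    attrs = conn.entries[0].entry_attributes_as_dict
    # The forwarding address, if present, overrides the routing address.
    values = attrs.get("mailForwardingAddress") or attrs.get("mailRoutingAddress")
    return values[0] if values else None
```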
The Cyrus IMAP Aggregator

[Overview diagram repeated, highlighting the Cyrus IMAP Aggregator between the Mail Transfer Agents and the users’ mail clients.]

The IMAP Protocol
  • Standard Protocol developed by the IETF
  • Messages Remain on Server
  • MIME (Multipurpose Internet Mail Extensions) Aware
  • Support for Disconnected Operation
  • AMS-Like Features (ACLs, Quota, etc)
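
A minimal client-side sketch of the messages-remain-on-server model, using Python's standard imaplib (host and credentials are placeholders):

```python
# Sketch: IMAP keeps mail on the server; searches and flags live there
# too, which is what enables machine and location independence.
import imaplib

with imaplib.IMAP4_SSL("imap.example.edu") as conn:
    conn.login("user", "password")
    conn.select("INBOX", readonly=True)
    # The search runs server-side; only matching message numbers return.
    status, data = conn.search(None, "UNSEEN")
    print(status, data[0].split())
```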
The Cyrus IMAP Server
  • CMU Developed IMAP/POP Server
    • Released to public and maintained as active Open Source project under BSD-like License
    • No available server implemented all of the features needed to replace AMS.
  • Designed to be a “Black Box” server
  • Performance and scalability were key to the design
Initial Cyrus IMAP Deployment
  • Single monolithic server (1994-2002)
  • Originally deployed alongside AMS
  • Features were implemented incrementally
  • Users were transitioned incrementally
  • Local users provided a good testing pool
  • Scaled surprisingly well
Cyrus IMAP Aggregator Design
  • IMAP not well suited to clustering
    • No real concept of mailbox “location”
    • Clients expect consistent views of the server and its mailboxes
    • Significantly varying client implementation quality
  • The Aggregator was designed to make many machines look like one, so any user can share a folder with any other user
Cyrus IMAP Aggregator Design (2)
  • Three Participating Types of Servers
    • IMAP Frontends (“dataless” Proxies)
    • IMAP Backends (“Normal” IMAP Servers; your data here)
    • MUPDATE (Mailbox Database)

[Diagram: frontends proxy requests for clients; backends hold the traditional mailbox data; the MUPDATE server maintains the mailbox list.]
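
A toy sketch of the frontends' role in this design: consult a local replica of the mailbox list to find the owning backend, then proxy the client's request there (all names are illustrative):

```python
# Toy model of a "dataless" frontend: it holds only a mailbox-location
# replica, never message data. Names are illustrative.
MAILBOX_REPLICA = {
    "user.rjs3": "backend1.example.edu",
    "user.wcw": "backend2.example.edu",
}

def locate(mailbox):
    backend = MAILBOX_REPLICA.get(mailbox)
    if backend is None:
        raise KeyError("no such mailbox: " + mailbox)
    # A real frontend would now open (or reuse) an IMAP connection to
    # this backend and proxy the client's commands over it.
    return backend

print(locate("user.rjs3"))  # -> backend1.example.edu
```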

IMAP Frontends
  • Fully redundant
    • All are identical
  • Maintain a local replica of the mailbox list
  • Proxy most requests, querying backends as needed
  • May also send IMAP referrals to capable clients

[Diagram: frontends proxy requests for clients; backends hold the traditional mailbox data; the MUPDATE server propagates mailbox list changes to the frontends.]

IMAP Backends
  • Basically Normal IMAP Servers
  • Mailbox Operations are approved & recorded by the MUPDATE server
    • Create / Delete
    • Rename
    • ACL Changes

[Diagram: requests are proxied by the frontends; backends hold the traditional mailbox data; the MUPDATE server approves mailbox operations.]
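
From a client's point of view these are ordinary IMAP commands; a sketch using Python's standard imaplib (host, credentials, and mailbox names are placeholders):

```python
# The MUPDATE-mediated mailbox operations, issued as plain IMAP
# commands via Python's standard imaplib. All names are placeholders.
import imaplib

conn = imaplib.IMAP4_SSL("imap.example.edu")
conn.login("rjs3", "password")
conn.create("user/rjs3/projects")                       # CREATE
conn.rename("user/rjs3/projects", "user/rjs3/archive")  # RENAME
conn.setacl("user/rjs3/archive", "wcw", "lrs")          # ACL change (read-only share)
conn.delete("user/rjs3/archive")                        # DELETE
conn.logout()
```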

MUPDATE Server
  • Specialized Location Server (similar to the VLDB in AFS)
  • Provides guarantees about replica consistency
  • Simpler than maintaining database consistency between all the frontends

[Diagram: backends send mailbox list updates to the MUPDATE server, which approves and replicates them; frontends update their local mailbox list replicas.]
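
A toy model of the data flow just described (the real MUPDATE wire protocol is specified in RFC 3656): backends submit mailbox operations, the MUPDATE server approves them, and the change is pushed to every frontend's replica:

```python
# Toy model of MUPDATE's approve-then-replicate flow. This captures
# the data flow only, not the RFC 3656 wire protocol.
class MupdateServer:
    def __init__(self):
        self.mailboxes = {}   # authoritative map: mailbox -> backend
        self.frontends = []   # replica dicts kept in sync

    def create(self, mailbox, backend):
        if mailbox in self.mailboxes:          # approval step
            raise ValueError("mailbox already exists")
        self.mailboxes[mailbox] = backend
        for replica in self.frontends:         # replication step
            replica[mailbox] = backend

fe1, fe2 = {}, {}
mupdate = MupdateServer()
mupdate.frontends = [fe1, fe2]
mupdate.create("user.rjs3", "backend1")
assert fe1["user.rjs3"] == fe2["user.rjs3"] == "backend1"
```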

Cyrus Aggregator: Data Usage
  • User INBOXes and sub folders
  • Users can share their folders
  • Internet mailing lists as public folders
  • Netnews Newsgroups as public folders
  • Public folders for “workflow”, general discussion, etc.
  • Continued “bboard” paradigm: 30,000+ folders visible
Cyrus IMAP Aggregator: Advantages
  • Horizontal Scalability
    • Adding new capacity to the frontend and/or backend pools is easy and can be done with no user-visible downtime
  • Management possible through single IMAP client session
  • Wide client interoperability
  • Simple Client configuration
  • Ability to (mostly) transparently move users from one backend to another
  • Failures are partitioned
Cyrus IMAP Aggregator: Limitations
  • Backends are NOT redundant
  • MUPDATE is a single point of failure
    • Failure only results in error when trying to CREATE/DELETE/RENAME or change ACLs on mailboxes
Cyrus IMAP Aggregator: Backups
  • Disk partition backup via Kerberized Amanda (http://www.amanda.org)
  • Restores are manual
  • 21 day rotation – no archival
  • Backup to disk (no tapes)
Cyrus IMAP Aggregator: Other Protocol Support
  • POP3 support for completeness
    • Possibly creates more problems than it solves (where did my INBOX go?)
  • NNTP to populate bboards
  • NNTP access to mail store
  • LMTP w/AUTH for mail transport from MTA to backends
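
A minimal sketch of that last hop using Python's standard smtplib.LMTP class (host, port, and addresses are placeholders; the real MTAs speak LMTP from within Sendmail):

```python
# Sketch: final delivery from an MTA to a Cyrus backend over LMTP.
# Unlike SMTP, LMTP returns a separate status for each recipient.
import smtplib

MESSAGE = b"From: a@example.com\r\nTo: foo@andrew.cmu.edu\r\n\r\nhello\r\n"

with smtplib.LMTP("backend1.example.edu", 2003) as lmtp:
    lmtp.sendmail("a@example.com", ["foo@andrew.cmu.edu"], MESSAGE)
```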
Clients
  • IMAP has many publicly available clients
    • Varying quality
    • Varying feature sets
  • Central computing recommends Mulberry
    • Roaming Profiles via IMSP
    • Many IMAP extensions supported (e.g. ACL)
    • UI not as popular
Clients - Webmail
  • Use SquirrelMail as a Webmail Client
  • Local Modifications
    • Interaction with WebISO (pubcookie) Authentication
    • Kerberos Authentication to Cyrus
    • Local proxy (using imtest) to reduce connection load on server
  • Preferences and session information shared via AFS (simple, non-ideal)
Clients – Mailing Lists
  • +dist+ for “personal” mailing lists (e.g. +dist+~user/foo.dl@andrew.cmu.edu)
  • Majordomo for “Internet-style” mailing lists
  • Prototype web interface for accessing bboards
    • Authenticated (for protected bboards): http://bboard.andrew.cmu.edu/bb/org.acs.asg.coverage
    • Unauthenticated (for mailing list archives): http://asg.web.cmu.edu/bb/archive.info-cyrus
Andrew Mail Statistics
  • Approximately 30,000 Users
  • 12,000+ Peak Concurrent IMAP Sessions
  • 8+ IMAP Connections / Second
  • 650 Peak Concurrent Webmail Sessions
  • Approximately 1.5 Million Emails/week
  • See Also: http://graphs.andrew.cmu.edu
Andrew Hardware
  • 5 frontends
    • 3 Sun Ultra 80s (2x450 MHz UltraSPARC II; 2 GB memory; internal 10,000 RPM disk)
    • 2 SunFire 280Rs (2x1 GHz UltraSPARC III; 4 GB memory; internal 10,000 RPM disk)
  • 5 backends
    • 4 Sun 220Rs (450 MHz UltraSPARC II; 2 GB memory; JetStor II-LVD RAID5, 8x36 GB 15,000 RPM disks)
    • 1 SunFire 280R (2x1 GHz UltraSPARC III; 4 GB memory; JetStor III U160 RAID5, 8x73 GB 15,000 RPM disks)
  • 1 MUPDATE
    • Dell 2450 (Pentium III 733 MHz; 1 GB memory; PERC3 RAID5, 4x36 GB 10,000 RPM disks)
  • 3 ANDREW.CMU.EDU MX
    • Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1, 2x73 GB 15,000 RPM disks)
  • 3 SMTP.ANDREW.CMU.EDU
    • Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1, 2x73 GB 15,000 RPM disks)
  • 2 CMU.EDU MX
    • Dell 2650 (Pentium 4 3 GHz; 2 GB memory; PERC3 RAID1, 2x73 GB 15,000 RPM disks)
  • 1 mailing list
    • Dell 2650 (Pentium 4 2.8 GHz; 1 GB memory; PERC3 RAID1, 2x73 GB 15,000 RPM disks)
  • 3 webmail
    • Dell OptiPlex GX260 small form factor (Pentium 4 2.4 GHz; 1 GB memory; 80 GB ATA disk)
Current Issues
  • Lack of client support for ‘check new’ on IMAP folders (even when the client supports NNTP)
  • Large number of visible folders can be problematic for clients (e.g. PocketInbox)
Potential Future Work
  • Online/Self-Service Restores (e.g. AFS “OldFiles”, delayed EXPUNGE)
  • Virtual “Search” Folders
  • Fault tolerance
    • Replicate backends
    • Support multiple MUPDATE servers
  • Multi-Access Messaging Hub
    • One Mail Store, many APIs
    • IMAP, POP, NNTP, HTTP/DAV/RSS, XML/SOAP
    • Web Bulletin Boards / blog interface
    • Remove Shared Folder / Mailing List Distinction
Current Software
  • MTA: Sendmail 8.12.10
  • LDAP: OpenLDAP 2.0
  • Cyrus: 2.2.3
  • MIMEDefang: 2.28
  • SpamAssassin: 2.61
  • ClamAV: 0.63
  • SquirrelMail: 1.4.2 (w/ Local Modifications)