
Dovecot Mail Storage

Timo Sirainen



  1. Dovecot Mail Storage • Timo Sirainen

  2. Me: Timo Sirainen • Born 1979 in Finland • First C64 BASIC programs around 1988 • Open source coding since about 1998 • Irssi IRC client 1999-2004, still widely used • Worked as programmer since 1999 • Went to university in 2006 • Dovecot project started in 2002 • Working full time on it since about 2007 • 2009: Rackspace, USA • 2010: SAPO, Portugal

  3. Dovecot • Open source IMAP/POP3 server • Only mail retrieval to clients, no mail sending • First version released in 2002 • Mostly written by me • Except Sieve by Stephan Bosch • High performance is an important goal • Disk I/O is typical bottleneck -> everything optimized to reduce it

  4. Talk Overview • Traditional mailbox formats • Dovecot indexes • Dovecot mailbox formats • Full text search indexes • Future ideas

  5. mbox • One file per mailbox • Metadata in headers that are filtered out • X-UID, Status, X-Status, X-Keywords, etc. • Deleting requires moving data around • Fragile: corruption if crashes in the middle • Slow when deleting old messages • May become fragmented with constant appends • But non-fragmented file is fast to read

  6. Maildir • One file per message • Reading through all files can be slow • Message flags in filename (name:2,<flags>) • Lots of renaming • Finding the current filename can be difficult • Maildir is lockless? Not so much, Dovecot uses write/sync lock • Otherwise files can temporarily be lost during renames • Was the file really deleted or just renamed?
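The flags-in-filename convention can be illustrated with a minimal sketch. The helper names and the sample filename are hypothetical; the flag letters are the standard Maildir ones (D, F, P, R, S, T). It shows why every flag change implies a rename, which is the source of the "where did the file go?" problem above.

```python
# Standard Maildir flag letters; they are kept in ASCII order in the filename.
MAILDIR_FLAGS = {"D": "draft", "F": "flagged", "P": "passed",
                 "R": "replied", "S": "seen", "T": "trashed"}

def split_flags(filename):
    """Split 'name:2,<flags>' into (name, set of flag letters)."""
    base, _sep, flags = filename.partition(":2,")
    return base, set(flags)

def with_flag(filename, flag):
    """Return the filename the message must be renamed to once `flag` is set."""
    base, flags = split_flags(filename)
    flags.add(flag)
    return f"{base}:2,{''.join(sorted(flags))}"

old = "1300000000.M1P2.host:2,S"   # hypothetical message file
new = with_flag(old, "R")          # marking it replied changes the filename
```

A reader that listed the directory before the rename and opens the file after it will not find `old` and cannot tell a rename from a deletion without rescanning.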

  7. Dovecot Index Files • Main index • List of messages • Message flags • Offsets to cache records • Cache file • Message size, some headers, etc. • Keep only data that client actually uses • Different clients want different data for different amount of time

  8. Dovecot Main Index • In two files: • dovecot.index: Somewhat recent snapshot • dovecot.index.log: Recent changes • All changes go through the log • Readers read snapshot to memory and apply latest changes from log • Once opened, only need to read log updates • Very efficient with remote filesystems (NFS, cluster FSes)! • Snapshot is updated “once in a while” • Tries to minimize disk I/O • Writes are usually more expensive than reads • Log also useful for finding “what changed” events for IMAP clients
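The snapshot-plus-log idea can be sketched in memory as follows. This is only an illustration: the record shapes, the `log_pos` field and the class name are assumptions, and the real dovecot.index and dovecot.index.log files are binary formats not shown here.

```python
class IndexReader:
    """Hypothetical reader: load a snapshot once, then only replay new log records."""

    def __init__(self, snapshot, log):
        # snapshot = {"log_pos": records already folded in, "flags": {uid: flags}}
        self.flags = {uid: set(f) for uid, f in snapshot["flags"].items()}
        self.log_pos = snapshot["log_pos"]
        self.refresh(log)

    def refresh(self, log):
        """Apply records added since the last refresh; return the UIDs that changed."""
        changed = []
        for op, uid, flag in log[self.log_pos:]:
            if op == "append":
                self.flags[uid] = set()
            elif op == "expunge":
                self.flags.pop(uid, None)
            elif op == "add_flag":
                self.flags[uid].add(flag)
            changed.append(uid)
        self.log_pos = len(log)
        return changed

log = [("append", 1, None), ("add_flag", 1, "\\Seen"),
       ("append", 2, None), ("add_flag", 2, "\\Flagged")]
snapshot = {"log_pos": 2, "flags": {1: {"\\Seen"}}}   # stale by two records
reader = IndexReader(snapshot, log)
```

The list returned by `refresh()` is the "what changed" information that an IMAP server needs in order to notify its client.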

  9. Dovecot Cache • The main reason for Dovecot’s good performance • Different IMAP clients want different data • Caching data that client doesn’t use wastes disk space and disk I/O • Flexible format, allows adding any number of fields • Per-field caching decisions: “no”, “temporary”, “permanent” • Cached fields never change (IMAP guarantees) • Data is added without locking -> duplicate data is possible • Once in a while the file is recreated -> deleted and unwanted records are dropped

  10. Locking • Lock waits are bad • Higher user visible latency • Timeout failures during high load • Dovecot v0.99 used traditional read/write index locks • Locking timeout problems • Redesigned v1.0 to do lockless reads

  11. Lockless reads: rename() • For: • Small files • Rarely changing files • If a large part of the file changes • Writer • Lock • If file has changed, read+update internal state • Write the updated data to temp file • rename() over the original file • Unlock • Reader • Just read the file • [Figure labels: #1, temp file, rename(), #2]
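The writer and reader steps above can be sketched like this, assuming a POSIX system. The separate `.lock` file and the function names are choices made for this example, not Dovecot's actual locking scheme.

```python
import fcntl
import os
import tempfile

def update_file(path, update):
    """Writer: lock, read current state, write a temp file, rename() over the original."""
    lock_fd = os.open(path + ".lock", os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)        # only writers take the lock
        try:
            with open(path, "rb") as f:
                current = f.read()
        except FileNotFoundError:
            current = b""
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)  # same directory, same filesystem
        with os.fdopen(fd, "wb") as f:
            f.write(update(current))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)                      # atomic rename over the original
    finally:
        os.close(lock_fd)                          # closing the fd drops the flock

def read_file(path):
    """Reader: no lock; it sees either the old or the new file, never a mix."""
    with open(path, "rb") as f:
        return f.read()
```

Because the whole file is rewritten on every change, this only pays off for small or rarely changing files, as the slide notes.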

  12. Lockless reads: Appends • For append-only files with “size” header in each written record • Writer • Lock • Write data with size=0 • Write size with each byte’s highest bit set to 1 • Unlock • Reader • Read one record at a time • Stop when seeing a size that isn’t fully written • [Figure labels: Size, Data]
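A minimal sketch of the append protocol follows. The 4-byte size field carrying 7 payload bits per byte, and the choice that the size counts only the data bytes, are assumptions made for the example; the slide only says that every size byte has its highest bit set. The writer lock is left to the caller.

```python
import io
import os

SIZE_LEN = 4  # assumed fixed-width size field, 7 payload bits per byte

def encode_size(n):
    """Encode n so that every byte has its highest bit set."""
    assert 0 <= n < 1 << 28
    return bytes(0x80 | ((n >> shift) & 0x7F) for shift in (21, 14, 7, 0))

def decode_size(raw):
    """Return the size, or None if the field is not fully written yet."""
    if len(raw) < SIZE_LEN or any((b & 0x80) == 0 for b in raw):
        return None
    n = 0
    for b in raw:
        n = (n << 7) | (b & 0x7F)
    return n

def append_record(f, data):
    """Writer (caller holds the lock): data with size=0 first, real size last."""
    f.seek(0, os.SEEK_END)
    start = f.tell()
    f.write(bytes(SIZE_LEN) + data)   # zeroed size marks the record as incomplete
    f.flush()
    f.seek(start)
    f.write(encode_size(len(data)))   # the record becomes visible to readers here
    f.flush()

def read_records(buf):
    """Reader: no lock; stop at the first size field that is not fully written."""
    records, pos = [], 0
    while True:
        size = decode_size(buf[pos:pos + SIZE_LEN])
        if size is None:
            return records
        records.append(buf[pos + SIZE_LEN:pos + SIZE_LEN + size])
        pos += SIZE_LEN + size

f = io.BytesIO()                      # stands in for the append-only file
append_record(f, b"hello")
append_record(f, b"")                 # an empty record still has a valid size field
f.seek(0, os.SEEK_END)
f.write(bytes(SIZE_LEN) + b"half")    # simulate a writer that crashed before step two
```

Since a valid size byte is never zero, even an empty record is distinguishable from a placeholder, and a partially written size field is detected byte by byte.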

  13. Lockless writes in future? • open(path, O_APPEND) usually provides atomic writes • Except with NFS • write() may also return fewer bytes than intended? (signal, out of space) • read() during a write may see incomplete data?
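The open questions above can be made concrete with a small sketch, assuming a local POSIX filesystem. The function name is hypothetical. It issues a single write() on an O_APPEND descriptor and reports whether the write was short, since retrying the remainder would no longer be one atomic append.

```python
import os

def append_once(path, record):
    """Append `record` with one write(); return False if the write was short."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        written = os.write(fd, record)   # may be fewer bytes than len(record)
    finally:
        os.close(fd)
    return written == len(record)
```

A caller that gets `False` back has a partial record in the file, which is exactly the case a lockless write design would have to detect and repair.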

  14. Single-dbox • One file per message (u.<IMAP UID>) • Files have immutable metadata section • GUID, POP3 UIDL, received date, etc. • Advantages over Maildir: • Filenames don’t change • No IMAP UID <-> filename mapping required • Flags stored only in Dovecot index files • Automatically creates dovecot.index.backup once in a while • When fixing corruption, tries very hard to preserve flags based on (corrupted) index and backup files

  15. Multi-dbox • Multiple messages in a single file (m.<id>) • File format same as with single-dbox • Multiple files in a single mailbox • Files are about 2 MB (configurable) • Larger files -> less fragmentation, but deletion slower • Preallocation • Can be rotated every n days (for incremental backups) • Delayed (ioniced) nightly deletions (“doveadm purge”) • Crash or power loss can’t corrupt or lose data • Tries very hard to preserve as much data as possible in case of (filesystem) corruption. • Saves a backup of the original broken file

  16. Benchmarks • Realistic IMAP benchmarks are difficult to do • Depends on clients and user behavior

  17. Benchmarks • Reading 10k messages via IMAP

  18. Benchmarks: # NFS ops • Reading 10k messages via IMAP • Above: uncached, below: cached

  19. Benchmarks: # NFS ops • Random IMAP commands sent with: imaptest logout=5 msgs=1000 delete=10 expunge=10 secs=60 seed=1 • L+A+G = lookup + access + getattr

  20. New dbox-only Features

  21. Alternative Mail Storage • Users rarely access their old mails • Lower performance storage is cheaper -> Move old mails there • dbox supports “alternative path” setting: If u.* or m.* file isn’t found from primary path, it’s looked up from alternative path • Files could even be moved with /bin/mv • But easier/safer with “doveadm altmove” • This would be difficult with Maildir because its filenames change
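The lookup rule for the alternative path is simple enough to sketch. The function name and directory layout here are hypothetical; the point is that stable dbox filenames make a plain two-step lookup sufficient.

```python
from pathlib import Path

def find_mail_file(name, primary, alt):
    """Look for a u.* / m.* file in the primary path, then in the alternative path."""
    for root in (Path(primary), Path(alt)):
        candidate = root / name
        if candidate.exists():
            return candidate
    return None
```

With this rule, moving `u.123` from fast to slow storage needs no index update, which is why a plain `mv` can work at all.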

  22. Detached Mail Attachments • MIME parts can be saved to external files • Only if they’re large enough (default: 128 kB) • Also can be filtered based on Content-Type, etc. headers • Avoid extra disk seek for downloading attachments that clients automatically display inline • Supports saving base64 encoded MIME parts decoded (25% less disk space) • Only if re-encoding can be done to 100% original • dbox-only • Metadata contains pointers to external parts • Saving is done via simplified “filesystem API”
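The "only if re-encoding can be done to 100% original" rule can be sketched as a round-trip check. This is an illustration under assumptions: it infers the line length and newline style from the encoded text itself, and the function names are hypothetical.

```python
import base64
import binascii

def reencode(raw, line_len, newline, trailing):
    """Re-create the base64 text with the recorded line length and newline style."""
    flat = base64.b64encode(raw)
    lines = [flat[i:i + line_len] for i in range(0, len(flat), line_len)]
    return newline.join(lines) + (newline if trailing else b"")

def decode_if_reversible(encoded):
    """Return (raw, line_len, newline, trailing), or None if re-encoding would differ."""
    newline = b"\r\n" if b"\r\n" in encoded else b"\n"
    trailing = encoded.endswith(newline)
    body = encoded[:-len(newline)] if trailing else encoded
    lines = body.split(newline)
    line_len = len(lines[0])
    if line_len == 0:
        return None
    try:
        raw = base64.b64decode(b"".join(lines), validate=True)
    except binascii.Error:
        return None                       # stray characters or broken padding
    if reencode(raw, line_len, newline, trailing) != encoded:
        return None                       # irregular layout: keep the part as-is
    return raw, line_len, newline, trailing

payload = bytes(range(256)) * 3           # 768 bytes of sample attachment data
encoded = reencode(payload, 76, b"\r\n", True)
result = decode_if_reversible(encoded)
```

Base64 turns every 3 bytes into 4 characters, so the decoded form takes at most 75% of the encoded size before line breaks are counted, which is where the 25% saving comes from.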

  23. Single Instance Storage • Storage’s internal deduplication • Could be enabled only for attachment storage • Dovecot’s SIS • FS API backend • Based on file hashes and hard links • Hash is configurable (e.g. SHA256 + size) • Byte-by-byte verification after hash found • Never: trust hash uniqueness (not implemented) • Immediate comparison during saving • Delayed (nightly) comparison and deduplication

  24. Dovecot SIS • Attachments saved to “HA/SH/HASH-GUID” under global attachment dir (e.g. /var/attachments/) • GUID guarantees filename uniqueness • e.g. file with hash “123456” is saved to 12/34/123456-GUID • “HA” and “SH” may be symlinks to other mounts • SIS is done by hard linking HA/SH/hashes/HASH to HA/SH/HASH-GUID if it exists • Basically: “ln hashes/123456 123456-guid” • No attempts to create cross-mount hard links • Safe to move/backup/restore attachment files • But hashes/HASH is auto-deleted only when its link count drops from 2 to 1. External changes may leak it.
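The path layout and hard-link deduplication can be sketched as below, with immediate byte-by-byte comparison during saving. How hashes/HASH is first created is an assumption of this example, the function names are hypothetical, and races between concurrent savers are not handled.

```python
import hashlib
import os

def attachment_paths(root, digest, guid):
    """Return (HA/SH/hashes/HASH, HA/SH/HASH-GUID) under the attachment dir."""
    subdir = os.path.join(root, digest[:2], digest[2:4])
    return (os.path.join(subdir, "hashes", digest),
            os.path.join(subdir, f"{digest}-{guid}"))

def save_attachment(root, data, guid):
    """Save `data`, hard linking to an existing identical attachment if there is one."""
    digest = hashlib.sha256(data).hexdigest()
    hash_path, file_path = attachment_paths(root, digest, guid)
    os.makedirs(os.path.dirname(hash_path), exist_ok=True)
    if os.path.exists(hash_path):
        with open(hash_path, "rb") as f:
            if f.read() == data:              # verify, do not trust the hash alone
                os.link(hash_path, file_path) # "ln hashes/HASH HASH-GUID"
                return file_path
    with open(file_path, "wb") as f:          # first copy, or a hash collision
        f.write(data)
    if not os.path.exists(hash_path):
        os.link(file_path, hash_path)         # assumed way of seeding hashes/HASH
    return file_path
```

Both links live in the same HA/SH directory tree, so no cross-mount hard link is ever attempted even when "HA" or "SH" is a symlink to another mount.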

  25. Full Text Search Indexes • Dovecot has abstract FTS API • IMAP protocol says search is about “substring matching” (e.g. “ello” matches “hello”) • Almost no FTS engines support this • Few people seem to care about this anymore • Currently supported FTS backends: • Squat: Dovecot’s own indexer, supports substring matching. • Currently index updating is too inefficient • Apache Solr
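The difference between IMAP's substring semantics and the word-based matching typical of FTS engines can be shown with two toy matchers. Both functions are illustrations only and do not reflect how Squat or Solr are implemented.

```python
def substring_match(text, query):
    """IMAP-style: case-insensitive substring search."""
    return query.lower() in text.lower()

def token_match(text, query):
    """Typical FTS-style: the query must equal a whole word."""
    return query.lower() in text.lower().split()

subject = "Hello from Finland"
```

Here `substring_match(subject, "ello")` succeeds while `token_match(subject, "ello")` does not, which is the gap a word-based backend leaves relative to the IMAP specification.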

  26. FTS: Solr • Solr is a search engine server using Lucene • Dovecot talks to Solr via HTTP • Sharding via per-user fts_solr setting

  27. Future • FS API used for indexes and dbox • Support for key-value databases • Asynchronous disk I/O
