Backing up Gmail with offlineimap: folders and labels.
Here are a couple of details about backing up a Gmail account over the IMAP protocol using offlineimap.
Gmail uses IMAP folders, which we can think of as normal filesystem folders, and also labels, often called tags in other applications. The problem is that the IMAP protocol (or at least Maildir) does not know anything about labels. This is how Gmail deals with the discrepancy:
Gmail uses only 4 real IMAP folders*:
[Gmail]/All Mail
[Gmail]/Drafts
[Gmail]/Spam
[Gmail]/Trash
This means that every email in your account, apart from trashed ones, spam, or drafts, is stored in the IMAP folder [Gmail]/All Mail. If we synchronize our Gmail account (e.g. with offlineimap), the Gmail server will transform the labels into real IMAP folders. For example, syncing with offlineimap without any folderfilter, we get the following directory structure:
[Gmail]
    All Mail
    Drafts
    Important
    Sent Mail
    Spam
    Starred
    Trash
INBOX
Personal
Receipts
Travel
Work
Every labeled email will be duplicated (note that even [Gmail]/Sent Mail is a label), because every email is always stored in All Mail and is also stored in the folder(s) corresponding to its label(s).
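To get a feeling for how much duplication this produces, here is a rough sketch that scans a downloaded Maildir tree and reports which messages live in more than one folder. It assumes the nested layout obtained with `sep = /`, and it assumes that the `Message-ID` header is a good enough key to recognise two copies of the same email. The function names are made up for this example and are not part of offlineimap.

```python
import mailbox
import os
from collections import defaultdict


def find_maildirs(root):
    """Yield every directory under root that looks like a Maildir (has cur/ and new/)."""
    for dirpath, dirnames, _ in os.walk(root):
        if "cur" in dirnames and "new" in dirnames:
            yield dirpath
        # Do not descend into the message directories themselves.
        dirnames[:] = [d for d in dirnames if d not in ("cur", "new", "tmp")]


def folders_by_message_id(root):
    """Map each Message-ID to the set of folders (relative to root) holding a copy."""
    seen = defaultdict(set)
    for path in find_maildirs(root):
        box = mailbox.Maildir(path, factory=None, create=False)
        for message in box:
            message_id = message.get("Message-ID")
            if message_id:  # assumption: messages without the header are skipped
                seen[message_id.strip()].add(os.path.relpath(path, root))
    return seen


def duplicated(root):
    """Return only the Message-IDs that are stored in more than one folder."""
    return {
        message_id: folders
        for message_id, folders in folders_by_message_id(root).items()
        if len(folders) > 1
    }
```

Calling `duplicated(os.path.expanduser("~/.gmail"))` after a full sync should list, for every labeled email, both `[Gmail]/All Mail` and the label folders it appears in.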
If we have a crappy internet connection and only care about the backup, we can fetch only the [Gmail]/All Mail folder. This can be achieved by setting folderfilter in the .offlineimaprc file as:
folderfilter = lambda foldername: foldername in ['[Gmail]/All Mail']
Note, however, that in this way we lose the label structure.
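Since folderfilter is a plain Python expression, it can be tried out in an interpreter before it goes into the config file. The sketch below applies the same lambda to the folder names from the directory structure shown earlier; the `remote_folders` list is hard-coded here for illustration, whereas offlineimap would pass in the names reported by the server.

```python
# Same expression as in the .offlineimaprc file.
folderfilter = lambda foldername: foldername in ['[Gmail]/All Mail']

# Folder names as they appear in the example account (hard-coded assumption).
remote_folders = [
    '[Gmail]/All Mail', '[Gmail]/Drafts', '[Gmail]/Important',
    '[Gmail]/Sent Mail', '[Gmail]/Spam', '[Gmail]/Starred', '[Gmail]/Trash',
    'INBOX', 'Personal', 'Receipts', 'Travel', 'Work',
]

# A folder is synchronized only when the filter returns True for its name.
kept = [name for name in remote_folders if folderfilter(name)]
print(kept)  # ['[Gmail]/All Mail']
```

The filter receives the full folder name, prefix included, so a bare `'All Mail'` would not match anything.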
If instead we want to use the downloaded Maildir for bidirectional synchronization, i.e. we want to use a local mail client like mutt, then we have to fetch everything, thus downloading and storing duplicated emails. When synching back, the Gmail IMAP server will know how to handle IMAP folders as labels, and will not duplicate anything server-side.
A sample offlineimaprc file is:
## For a complete offlineimaprc example, see: https://github.com/OfflineIMAP/offlineimap/blob/master/offlineimap.conf
[general]
## Possible options: Quiet Basic TTYUI Blinkenlights
## NB: only TTYUI and Blinkenlights are capable of prompting for the password
ui = TTYUI
accounts = gmail
[Account gmail]
localrepository = gmail-local
remoterepository = gmail-remote
[Repository gmail-local]
type = Maildir
localfolders = ~/.gmail
## This will locally replicate the IMAP folder structure, instead of the default separator '.' that flattens it.
sep = /
[Repository gmail-remote]
type = Gmail
remoteuser = marco.rucci@gmail.com
# remotepass = <insert your password or you will be prompted>
## One-way synching. Perfect for backups.
readonly = True
ssl = yes
cert_fingerprint = f3043dd689a2e7dddfbef82703a6c65ea9b634c1
## Sync only 'All Mail': we lose the folder/labels structure, but this is the only way to not duplicate emails.
# folderfilter = lambda foldername: foldername in ['[Gmail]/All Mail']
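Before running a backup for the first time, it can be reassuring to check that the remote repository really is read-only. The following is a minimal sketch using Python's standard `configparser`, under the assumption that the file follows the section layout of the sample above; `check_backup_config` is a hypothetical helper, not something offlineimap provides.

```python
import configparser


def check_backup_config(text, account="gmail"):
    """Report the backup-relevant settings of the account's remote repository."""
    # interpolation=None keeps any '%' in the values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    remote_name = parser["Account " + account]["remoterepository"]
    remote = parser["Repository " + remote_name]
    return {
        "readonly": remote.getboolean("readonly", fallback=False),
        # None when the folderfilter line is missing or commented out.
        "folderfilter": remote.get("folderfilter"),
    }
```

For the sample file, with the folderfilter line still commented out, this should report `readonly` as true and no folderfilter, i.e. a one-way sync of every folder.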
More information at the following pages:
http://askubuntu.com/questions/23287/what-can-i-use-to-automate-backups-of-gmail
http://pbrisbin.com/posts/mutt_gmail_offlineimap
https://github.com/OfflineIMAP/offlineimap/
http://support.google.com/mail/bin/answer.py?hl=en&answer=82367&topic=1668961&ctx=topic
* I did not find this stated explicitly anywhere, but it is what I understood from reading the Google IMAP support page.
