For some time we’ve wanted to make our stored email more robust by keeping a checksum for every email delivered. We’ve now rolled this out for every email on every server, storing a reliable and secure 160-bit checksum for each message. As mentioned earlier, this was one of the features of Cyrus 2.3.10 (the IMAP server we use) that we helped contribute code for.
Most people don’t think corruption is an issue, but recent research by CERN has shown that with today’s large hard drives, this is a potentially serious problem, with an estimated corruption rate of 3 files in every TB of data. In most cases, corruption of data is a silent problem that people don’t realise has happened until they need the data.
To deal with this, we ensure that as soon as an email is delivered to a mailbox, a SHA-1 checksum of that email is generated and stored in the email index.
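As a rough illustration of the idea (not the actual Cyrus code), here is a minimal Python sketch of checksumming at delivery time. The `deliver` function, the dictionary used as the index, and the message ID are all hypothetical names made up for this example.

```python
import hashlib


def deliver(index: dict, message_id: str, raw_message: bytes) -> str:
    """Compute the SHA-1 of a delivered message and record it in the index.

    `index` is a plain dict standing in for the real mailbox index
    (an assumption for illustration only).
    """
    digest = hashlib.sha1(raw_message).hexdigest()
    index[message_id] = digest
    return digest


index = {}
message = b"Subject: hello\r\n\r\nHi there\r\n"
digest = deliver(index, "msg-1", message)

# A SHA-1 hex digest is 40 hex characters, i.e. 160 bits.
print(digest, len(digest) * 4)
```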
When the email is replicated, the email content and the checksum are sent separately. On the replica, we recompute the checksum from the received content and check that it matches the original checksum, which confirms that the email was replicated correctly.
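The replication check can be sketched along these lines. This is only an illustration under assumed names (`sha1_hex`, `receive_replica`); the real replication protocol is more involved.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hex digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def receive_replica(content: bytes, expected_checksum: str) -> bool:
    """Recompute the checksum on the receiving side and compare it
    with the checksum that was sent separately by the master."""
    return sha1_hex(content) == expected_checksum


original = b"Subject: hello\r\n\r\nHi there\r\n"
checksum = sha1_hex(original)

# Content that arrives intact passes the check.
intact_ok = receive_replica(original, checksum)

# Content that was altered in transit fails the check.
damaged = original.replace(b"Hi", b"Hj")
damaged_ok = receive_replica(damaged, checksum)

print(intact_ok, damaged_ok)
```

Sending the checksum separately from the content matters here: if the checksum were only ever derived from whatever bytes arrived, a corrupted transfer would simply produce a matching, but wrong, checksum.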
We also repeat this procedure when the email is backed up, ensuring that the backup of the email is correct.
We also run a regular check process that takes blocks of emails and recomputes their checksums to confirm they match what is in the index. If there are any issues, we’re alerted, can work out which of the master, replica or backup copies is correct, and can then fix the problem.
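A periodic check of this kind might look roughly like the following sketch. The `audit` function and the in-memory dictionaries standing in for the master, replica and backup stores are assumptions for illustration, not how our servers actually store mail.

```python
import hashlib


def audit(index: dict, copies: dict) -> dict:
    """Recompute checksums for every message in `index` across all copies.

    index:  message_id -> expected SHA-1 hex digest
    copies: copy name (e.g. "master") -> {message_id: raw bytes}
    Returns {message_id: [names of copies that are missing or mismatched]},
    containing only messages with at least one bad copy.
    """
    problems = {}
    for message_id, expected in index.items():
        bad = []
        for name, store in copies.items():
            data = store.get(message_id)
            if data is None or hashlib.sha1(data).hexdigest() != expected:
                bad.append(name)
        if bad:
            problems[message_id] = bad
    return problems


good1 = b"Subject: one\r\n\r\nFirst\r\n"
good2 = b"Subject: two\r\n\r\nSecond\r\n"
index = {
    "msg-1": hashlib.sha1(good1).hexdigest(),
    "msg-2": hashlib.sha1(good2).hexdigest(),
}
copies = {
    "master": {"msg-1": good1, "msg-2": good2},
    "replica": {"msg-1": good1, "msg-2": good2.replace(b"Second", b"Sec0nd")},
    "backup": {"msg-1": good1, "msg-2": good2},
}

problems = audit(index, copies)
print(problems)
```

Any copy not listed for a message still matches the index, so it can be used to repair the ones that are.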