eBay and PayPal emails now with Truedomain

Emails from eBay and PayPal are now being protected by Truedomain, which means you’ll see the green Truedomain protection bar when you view these emails in the web interface.

This is currently a trial, but we hope this becomes permanent for all eBay and PayPal emails since these are common targets for phishing attacks, which is exactly what Truedomain is designed to help protect against.

If you appreciate this protection, let us know by emailing truedomain@fastmail.fm with your thoughts, so we can help make it a permanent feature for eBay and PayPal emails.

HTML emails – from bad to worse

This is a technical post. Regular Fastmail users subscribed to receive email updates from the Fastmail blog can just ignore this post.

Originally, email was designed as a text-only medium (1982). Over time, various extensions were added to allow transporting attachments (1993) and different content formats, such as HTML (1997).

HTML has become a very popular way of delivering richer email content. HTML has many tools and a large infrastructure, and is easy to display in web browsers, because that’s what they’re actually designed to do.

The problem is that HTML is a markup language, but users generally edit their messages with WYSIWYG tools, which then output HTML of variable quality. To make things even worse, most email reading software has limited HTML and CSS display capabilities, so senders can’t rely on the full range of HTML or CSS being available. The result is that most HTML email still uses the same type of HTML we were using in 1999, with deeply nested tables and explicit attributes on each tag to lay out the email content.

As a webmail provider, we have to deal with all this variable HTML content and try to display it correctly. I’ve seen numerous odd examples over the years, from emails that use absolute positioning and fixed width and height on every single element to lay everything out in a neat grid (which breaks horribly if you change the font size at all), to the messy conditional comment HTML that Microsoft Word generates.

However, recently I’ve had a few extreme examples of badly generated HTML arrive, and in each case it’s been from Mac Mail (specific header “X-Mailer: Apple Mail (2.936)”). I’ve removed the content, added some newlines between tags, and put an example here. Rendered, that looks pretty ordinary, nothing funny. Now look at the HTML that generated it (I’ve put the HTML as text with appropriate indenting here to make it easier to follow). That’s 330k in size, and 5407 lines of almost entirely HTML tags. Just to reach the first piece of text content, you pass through 741 nested tags! Worse, a lot of that is inline and block tags nested alternately one inside another, which is technically invalid HTML, and really annoys our HTML tidy code that tries to fix it up.
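To get a feel for how deep the nesting in an email like this goes, the depth can be measured with a few lines of Python. This is only a rough sketch using the standard library's `html.parser`; it is not the HTML tidy code mentioned above, and the names `DepthMeter` and `max_nesting_depth` are made up for illustration. It counts explicit open and close tags only, so it does not model the implied closes a browser would apply to invalid markup.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not add depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DepthMeter(HTMLParser):
    """Track how deeply tags are nested while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Stray closing tags must not push the depth below zero.
        self.depth = max(0, self.depth - 1)


def max_nesting_depth(html_text):
    """Return the deepest level of tag nesting found in html_text."""
    meter = DepthMeter()
    meter.feed(html_text)
    meter.close()
    return meter.max_depth
```

Running `max_nesting_depth()` over the body of a suspect message gives a quick number to compare against ordinary mail, which could be used to flag messages that are likely to render slowly.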

I’ll be working to try and fix this, but at the moment, really bad emails like this can cause extremely slow display on some browsers when viewed via the webmail interface.

New extra quota pricing

We’ve recently changed the way users pay for additional mailbox and file storage quota.

Previously, we had a one-time payment scheme for extra storage, where you only had to make a single payment, but the prices were very high:

  • $3.95 – 10M
  • $6.95 – 20M
  • $12.95 – 50M
  • $24.95 – 100M
  • $44.95 – 200M
  • $64.95 – 300M
  • $84.95 – 400M
  • $99.95 – 500M
  • $149.95 – 1000M
  • $199.95 – 2000M

Under the new scheme you pay an annual fee, which is added on to your subscription. The fee is charged on a pro-rata basis to the end of your current subscription when you increase your storage. The amount of extra storage you get depends upon your service level.

  • AdFree: $4.95 = 100M for 1 year
  • Full: $4.95 = 500M for 1 year
  • Enhanced: $4.95 = 1000M for 1 year

Like subscriptions, you also get an appropriate pro-rata refund towards your new service level if you upgrade.
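As a rough illustration of the pro-rata charge, here is a small Python sketch. The $4.95 annual fee comes from the list above; the day-based proration over a 365-day year and the rounding to whole cents are assumptions made for the example, not a description of the actual billing code, and `pro_rata_fee` is a hypothetical name.

```python
from datetime import date

ANNUAL_FEE = 4.95  # annual extra-quota fee from the price list above


def pro_rata_fee(today, subscription_end, annual_fee=ANNUAL_FEE):
    """Fee for the days left until the subscription ends.

    Assumption: proration is per day over a 365-day year, rounded to cents.
    """
    days_left = max(0, (subscription_end - today).days)
    return round(annual_fee * days_left / 365, 2)
```

Under these assumptions, someone who adds storage with a full year left pays the whole $4.95, while someone whose subscription ends in about six months pays roughly half of that.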

Anyone who has already purchased extra storage under the old scheme may keep it indefinitely. If you want to buy more extra storage now though, you will have to switch to the new scheme. In the process, you will receive a full refund for any old extra storage you purchased (so it’s like you’ll have had it for free for the entire time!), and you will then be required to pay the new annual quota rates on all your extra storage.

Note 1: Currently this is only available for Personal accounts. We’re looking at extra quota options for Family/Business in the future.

Note 2: This new scheme doesn’t apply to legacy Member accounts. Unfortunately, legacy Member accounts can no longer purchase extra storage. For Member accounts, we recommend you consider upgrading to Ad Free. You’ll get a full refund of your $14.95 Member account towards your Ad Free account (so it’s like your Member account was free for all the time you used it), which immediately gives you 3 years of Ad Free level service, including 100M of storage, much more than the 16M of Member.

Security alert: Phishing attempt on Fastmail users

Over the weekend, we detected a phishing attempt against Fastmail users. Phishing is where someone sends you an email claiming to be from a Fastmail administrator, and asking you to reply with your username and password.

We will never send you an unsolicited email asking you for these details. You should never respond to these emails; just delete them.

When a phishing attempt like this occurs, we quickly take steps to try and block any more of the emails entering our system, and also block any attempts to reply to the emails. We also check our logs to see if any users did reply to the email, and contact those users to let them know that the email was a fraud, and if they sent their password, they should immediately change it.

Fastmail’s outgoing servers have a good sending reputation, and spammers and scammers would like to take advantage of that. We have many processes in place that block spammers and scammers from signing up, so sometimes they’ll try and steal account details from existing users, which is what these phishing emails are trying to do.

SCSI HBAs, RAID controllers and timeouts

This is a technical post. Regular Fastmail users subscribed to receive email updates from the Fastmail blog can just ignore this post.

For the last few years, most of the IMAP servers we’ve bought have followed the same hardware format: a 1U server with an LSI SCSI or SAS controller, connected to two external RAID storage units. The RAID storage units use an ARECA controller and present the internal SATA/SAS disks as SCSI/SAS volumes. This setup has worked really well and generally been very solid.

However, after recently upgrading the hard drives in one of our RAID storage boxes, we started experiencing some annoying kernel errors. Under high IO load as we synced new data to them, we’d end up seeing something like this in the kernel log:

[ 1378.310010] mptscsih: ioc1: attempting task abort! (sc=ffff88083cfa6000)
[ 1378.310091] sd 2:0:0:0: [sdj] CDB: Read(10): 28 00 0d 18 ad 2d 00 00 02 00
[ 1378.682660] mptscsih: ioc1: task abort: SUCCESS (sc=ffff88083cfa6000)

These would usually be repeated many times, and sometimes we’d see things like this after the above messages.

[ 1400.805969] Errataon LSI53C1030 occurred.sc->req_bufflen=0x1000,xfer_cnt=0x400
[ 1400.827927] mptbase: ioc1: LogInfo(0x11070000): F/W: DMA Error
[ 1401.090516] mptbase: ioc1: LogInfo(0x11070000): F/W: DMA Error

Simultaneously, the RAID controller would report in its log:

2010-08-16 08:24:50 Host Channel 0 SCSI Bus Reset

And there would often be some corruption of any data that was being written at the time.

We’d seen a problem like this before when we’d bought new hard drives, but it had gone away after we upgraded the firmware in the drives. Unfortunately, in this case the new hard drives already had the latest firmware, so that wasn’t going to help.

We tried a number of things: downgrading the SCSI bus speed to 80 MB/s; using the latest version of the LSI driver from their website (4.22) rather than the version that comes in the vanilla Linux kernel (3.04.14); reducing the SCSI queue depth on the LSI card from 64 to 16; and upgrading the RAID controller firmware to the very latest version. None of these things helped. In each case, with high IO load, we could cause the error to occur within 10 minutes.

My final thought was that maybe it’s timeout related. With SCSI, the HBA can queue a lot of requests to be completed out of order. So if you push a lot of IO requests at the RAID unit (so many that the write-back cache fills up), maybe the internal scheduler in the RAID controller interacts badly with the TCQ in the hard drives, and some of the requests end up taking a long time to complete. The HBA has a timeout, and if a request takes longer than that, it assumes something has gone wrong, tries to cancel everything that’s outstanding, and resets the bus.

In Linux, you can control the timeout for each SCSI target device (eg a RAID volumeset in our case) via a tunable in /sys/.

/sys/block/sd*/device/timeout

The default value for the timeout on these LSI cards is 30 seconds. I increased it to 300 seconds on all targets, and we started the IO storm again.

Normally we’d see problems within 10 minutes. We let this run for 24 hours and not a problem!
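For anyone wanting to try the same change, something like the following Python sketch could apply it to every `sd*` device at once. The sysfs path is the one given above; the function name `set_scsi_timeouts` and the `sys_block` parameter (there so the code can be pointed at a test directory) are assumptions for illustration. It needs to run as root, and values written to /sys do not survive a reboot, so it would have to be rerun at boot.

```python
import glob
import os


def set_scsi_timeouts(seconds, sys_block="/sys/block"):
    """Write `seconds` to every sd*/device/timeout file under sys_block.

    Returns the list of files that were updated.
    """
    pattern = os.path.join(sys_block, "sd*", "device", "timeout")
    changed = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(str(seconds))
        changed.append(path)
    return changed


if __name__ == "__main__":
    # 300 seconds is the value used in the test described above.
    for path in set_scsi_timeouts(300):
        print("updated", path)
```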

Not 100% conclusive proof, but it’s looking pretty likely that that’s the culprit. So my assumption is that the LSI card has a 30 second default timeout, and the RAID unit under heavy IO load can take longer than 30 seconds to respond to some queued requests. It would explain why the problem only occurs under heavy load and when the write-back cache gets filled up.

Hopefully this helps someone else if they encounter this problem one day.

Additional: Even with these changes, one of the things we noticed was that a high IO load to one RAID volume (eg. in our case, moving users around) can severely affect the performance of other RAID volumes. The issue is related to the way each SCSI HBA has a queue depth it can manage, but in the kernel, each mounted volume has its own outstanding request queue. When the number of volumes is large, the sum of requests in the volume queues can be much larger than the HBA queue, causing poor response times as lots of processes block on IO. On our systems with a large number of volumes, reducing the per-volume queue depth (/sys/block/sd*/device/queue_depth) from the default of 64 to 16 resulted in much more even performance. Other reading.
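The arithmetic behind this is simple enough to sketch in Python. The per-volume depths of 64 and 16 come from the text above; the volume count and the HBA queue depth in the example are hypothetical numbers chosen only to show the effect, and `oversubscription` is a made-up name.

```python
def oversubscription(num_volumes, per_volume_depth, hba_depth):
    """Worst-case queued requests across all volumes, relative to the HBA queue.

    A result well above 1.0 means the volume queues can together hold far
    more requests than the HBA can accept at once.
    """
    return (num_volumes * per_volume_depth) / hba_depth


# Hypothetical example: 20 volumes behind an HBA with a queue depth of 256.
before = oversubscription(20, 64, 256)  # default per-volume depth -> 5.0
after = oversubscription(20, 16, 256)   # reduced per-volume depth -> 1.25
```

With the reduced depth, a single busy volume can claim a much smaller share of the HBA queue, which fits the more even performance described above.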

Archive billing enabled for Business accounts

Our Business accounts have a powerful archiving/journaling feature. It lets you enable, per user, archiving of all incoming and outgoing email. This email goes into a separate account and folder structure that only the business administrator can access. The archive account has unlimited storage, and special ACLs on folders so users can’t delete emails (unless specially overridden). This means the archive is a true journal of all received and sent messages for a user, which is useful for compliance and tracking.

The plan was always to charge for this feature, and it’s been documented as such in the help. However, the code was never actually in place to do this.

The code for this is now done, and all affected users were sent an email, informing them that archive billing would be starting soon, and it has now been enabled.

New POP/IMAP server version

Over the last 24 hours, we’ve rolled out a new POP/IMAP server version for all users. This new server is the result of months of great work by Bron and includes many improvements and fixes. Not that many of the fixes are currently user visible changes, but they are significant internal improvements that help improve reliability, conformance and performance, and will allow us to build some future features we’re looking at.

  • Email replication improvements

    Email replication has been made much more efficient and reliable. The format includes CRC auto-integrity checking features, so that any unexpected mismatches between both ends are automatically detected and fixed. It can also recover automatically from unclean shutdowns or machine crashes where “split brain” has occurred, automatically fixing up mailboxes and messages. The format has also been made future extensible, allowing more features to be added without compatibility problems.

  • Performance and integrity improvements

    The internal mailbox format used to store emails has been significantly reworked. The new format has reliable locking semantics to remove all race conditions. It also stores and checks CRCs on all record data and cache data, and SHA1 checks on all message files. This ensures that any corruption in any data is detected early and can be dealt with. By moving around some of the data (such as the user seen state), and only lazily opening files as needed, the new format also improves performance in many common cases.

  • Strict MODSEQ, QRESYNC support and full IMAP test suite conformance

    Recent extensions to IMAP allow clients to more quickly synchronise data between the server and the client (eg. CONDSTORE/MODSEQ and QRESYNC). While the server has supported CONDSTORE/MODSEQ for a while, unfortunately it was a bit buggy in some situations, causing message seen state to get out of sync. The server now correctly and accurately supports CONDSTORE/MODSEQ, and also supports the current QRESYNC standard, which will allow clients that support it to sync even faster. We also now correctly pass detailed IMAP stress tests.

  • Major code cleanups

    All of these improvements have also been done with major internal code cleanups. This will allow us to continue building additional functionality and features more easily in the future, and to more easily fix and debug any other issues that are encountered.
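To illustrate the kind of integrity checking described under “Performance and integrity improvements”, here is a minimal Python sketch. It is not the Cyrus on-disk format: the record layout and the function names are invented for the example, and it only shows the general idea of storing a CRC next to each record and a SHA1 for each message file, then recomputing them on read.

```python
import hashlib
import zlib


def make_record(payload: bytes) -> dict:
    """Store a CRC32 next to the payload when a record is written."""
    return {"payload": payload, "crc32": zlib.crc32(payload)}


def record_is_intact(record: dict) -> bool:
    """Recompute the CRC32 on read and compare it with the stored value."""
    return zlib.crc32(record["payload"]) == record["crc32"]


def message_sha1(message: bytes) -> str:
    """SHA1 digest of a whole message file, as a hex string."""
    return hashlib.sha1(message).hexdigest()
```

If `record_is_intact()` returns False, or a message’s SHA1 no longer matches the one recorded when it was delivered, the data has been corrupted somewhere and can be rebuilt or fetched again from a replica.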

Unfortunately no good deed goes unpunished, and even though we’ve been testing this code ourselves and on a subset of users for weeks with continuous improvements, some bugs did get through when we finally rolled out to all users. Then, in the attempt to fix these issues as quickly as possible, we also introduced some other issues. The net result was that for about 12 hours, there was a sequence of small but potentially annoying bugs that would have affected different sets of users.

  • On first access, we upgrade a mailbox to the new format. During the upgrade, we found some existing caches had allowed invalid data to enter them, causing corruptions on upgrading which caused problems when accessing these mailboxes. These cases are now caught and new cache data is built from the underlying message files
  • While reconstructing the mailboxes that had been incorrectly upgraded by the above code, a quota error caused some people’s quota to temporarily be double their actual used amount. This has been fixed now. If this bug sent a user over quota temporarily, it shouldn’t be a problem: when a user is over quota, we return a temporary 4xx error, so no messages should have been lost. The other side should simply have re-delivered once the user was back under quota.
  • IMAP IDLE wasn’t returning new messages, only updating existing messages, causing pushing of new messages to most email clients to not work
  • Mail App has a bug with parsing IMAP IDLE unsolicited fetch responses that contain more than flags information. We’ve added a workaround for this Mail App bug
  • The IMAP COPYUID response was producing a non-conformant result, which caused some programs to report an error (Outlook 2010)
  • POP3 was using an optimised mode if a mailbox was empty. Unfortunately the code to mark a mailbox as “non empty” wasn’t working properly when messages were delivered, but was working for IMAP logins. This meant that messages delivered wouldn’t be downloaded by POP until you did an IMAP or web login
  • The POP3 TOP command wasn’t working, causing some programs (Outlook in POP mode) that download email headers to fail
  • The POP3 UIDL command with a message ID was producing a non-conformant result, which was parsed incorrectly by some programs. This caused some POP programs to download the same message more than once, or to delete a message off the server before they should have
  • Update: An update to UID sequence handling caused the mailbox status command to report unread messages as read and vice-versa, causing the unread count on folders to actually be the read count for a short while.
  • Update: The XLIST extension wasn’t working. This has been added back, so clients that support it will automatically pick the right Sent Items, Drafts, Trash, Junk Mail folders when setting up a new account
  • Update: NOOP on Mac Mail. Like the bug above with Mac Mail and IDLE, this was affecting the NOOP command as well
  • Update: Storing the \Seen flag + another flag on a message that already had the \Seen flag would cause \Seen to actually disappear. This mostly manifested as when deleting a message, it would cause it to become marked as “unread” again
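The last item in the list above is easiest to see with the intended behaviour written down. The following Python sketch models IMAP flag storing with plain sets; it is an illustration of the expected semantics, not the server code, and the function names are hypothetical.

```python
def store_add_flags(current, to_add):
    """'STORE +FLAGS' semantics: the result is the union of both sets."""
    return set(current) | set(to_add)


def store_replace_flags(current, new_flags):
    """'STORE FLAGS' semantics: the new list replaces the old one entirely."""
    return set(new_flags)


# A message that is already \Seen gets \Seen and \Deleted stored on it.
seen = {"\\Seen"}
result = store_add_flags(seen, {"\\Seen", "\\Deleted"})
```

Either way, `\Seen` is in the list being stored, so it must still be set afterwards. The bug described above dropped it, which is why deleted messages briefly showed up as unread.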

All these issues have now been fixed, and we’re closely monitoring all the server logs to see if there are any other issues, but at this stage we believe that the new server and code is working correctly for all cases we’re aware of and for all clients, IMAP and POP.

All this new code is part of the open source Cyrus project, and we’ll be pushing this code back to the main Cyrus code base, where it will eventually form the basis for a new Cyrus version 2.4. For those interested in technical details, Bron will post to the Cyrus mailing lists when he’s had a bit of time to compile all the documentation and technical details.

    Quota increase for all accounts

    Thanks to new equipment Opera are purchasing, we’ve now rolled out quota increases for all accounts. The changes are below:

    Personal accounts

    • Guest: 10M -> 25M
    • Ad free: 50M -> 100M
    • Full: 800M -> 1G (1000M)
    • Enhanced: 8G -> 10G

    Family accounts

    • Lite: 50M -> 200M
    • Everyday: 600M -> 800M
    • Superior: 6G -> 8G

    Business accounts

    • Basic: 80M -> 250M
    • Standard: 1G -> 1.5G
    • Professional: 10G -> 15G

    The website will be updated shortly with this new information.

    Note: Legacy Member accounts have not changed. The one-time charge that Member accounts paid means we can no longer add to or upgrade those accounts. Please consider upgrading to the Ad free account instead.

    Mailbox shows folder purge/spam learning properties

    To remind people which folders are automatically learning email as spam or non-spam, and which folders are having emails automatically purged, the mailbox screen now shows these folder properties when you view the folder.

    If you don’t want to see these, you can turn off the display of these properties on the Options –> Account Preferences screen with the “Show custom properties” option.

    New External SMTP Server option for personalities

    Normally when you send email, the email will be queued by the FastMail.FM server. From there, we find the appropriate server for each recipient of the email, and forward it on to that server directly.

    However in some cases, you actually want to send the email via another server using authenticated SMTP, such as your ISP or corporate email server. This is especially useful if you have a corporate email server that requires SMTP authentication to send to particular internal mailing lists, or you want to avoid SPF issues.

    Using the new External server/External username/External password options that can be set on the Options –> Personalities screen, you can send email via an external server with SMTP AUTH. If enabled, we will try and send using SMTP AUTH via port 587, the standard SMTP submission port. If the remote server advertises STARTTLS support, we will try and switch to TLS on the connection. If you want to use a different port, you can append :port-number to the server name. Unfortunately straight SSL over SMTP (port 465) is NOT currently supported.
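The behaviour described above can be sketched in Python with the standard `smtplib` module. This is only an illustration of the steps (default port 587, optional `:port-number` suffix, STARTTLS when the server advertises it, then SMTP AUTH); it is not the code Fastmail runs, the function names are hypothetical, and the simple `:port` parsing does not handle IPv6 literals.

```python
import smtplib

DEFAULT_SUBMISSION_PORT = 587  # standard SMTP submission port


def split_server(server):
    """Split 'host' or 'host:port' into (host, port), defaulting to 587."""
    host, sep, port = server.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return server, DEFAULT_SUBMISSION_PORT


def send_via_external(server, username, password, message):
    """Send an email.message.EmailMessage through an external SMTP server."""
    host, port = split_server(server)
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.ehlo()
        # Only switch to TLS if the remote server advertises STARTTLS.
        if smtp.has_extn("starttls"):
            smtp.starttls()
            smtp.ehlo()  # re-identify over the encrypted connection
        smtp.login(username, password)
        smtp.send_message(message)
```

For example, `split_server("mail.example.com")` falls back to port 587, while `split_server("mail.example.com:2525")` uses the port given after the colon.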

    This feature will also be used if you send email via Fastmail’s SMTP server if the From header email address matches a corresponding personality email address and the Use external SMTP server option is set.

    More information about this feature and its use is in the personality help pages.