
The Wonderful World of E-Mail

Setting up your own email server sounds like a good idea when you are into privacy or when your boss asks you to do so. What sounds like an easy afternoon project quickly turns into a mess, even when you only have very few (or only one) mail accounts to manage. After a long time of trial and error, I want to give an overview of all involved technologies and possible stumbling blocks. This is not intended to be a tutorial, for two reasons: a) your setup might be totally different and b) most server setup guides make admins copy around config files they do not really understand. This post is also written in a way that lets you pick and read only parts of it, in any order you like. Furthermore, I assume that you are familiar with basic technical terms. And finally, a small note on wording: I use Transport Layer Security (TLS) instead of the outdated Secure Sockets Layer (SSL) abbreviation, except when it is part of other names or configuration variables.

Be aware that messing up only parts of the configuration might mean "no emails in and/or out". That is not only "no spam / newsletters" but might also be "no tax reports, bills, messages from business partners + family + friends, important notifications...". So start slowly, test a lot (using external mail accounts, e.g., GMail, which also tells you about signature validation) and deploy changes on Sundays or at night.

Your own tiny world

We will start with the setup of your own server and how to receive mails from other providers, store them on your server and distribute them to clients. This marks the first part of the overall story and was basically sufficient until some years ago. Read the second part on why this might not be enough when you also want to send out mails to the outer world.

OpenSMTPD

The central part of the global email infrastructure is SMTP.[1] It is the protocol used to send mails (either from clients to servers or between servers). After using Postfix for a long time and hating it every time I had to (re-)configure it, I decided that I needed a simpler but bulletproof solution. Luckily, the group that is also responsible for OpenSSH, OpenBSD and LibreSSL started to develop OpenSMTPD, so you know that you get well-designed and safe software. A bonus is the excellent IPv6 support. Sadly, this amazing piece of software is not available as a Debian package, so I had to compile it myself (which is not a huge deal). The config is way shorter than for other servers:

pki crepererum.net certificate "/etc/ssl/my/crepererum.net.chain.pem" \
                 key "/etc/ssl/my/crepererum.net.key" \
                 dhe auto

ca crepererum.net certificate "/etc/ssl/my/ca-sha2.pem"

table aliases file:/etc/opensmtpd/aliases
table domains file:/etc/opensmtpd/domains
table secrets file:/etc/opensmtpd/secrets

listen on eth0 port 25 tls pki crepererum.net auth-optional <secrets>
listen on eth0 port 465 smtps pki crepererum.net auth <secrets> mask-source
listen on eth0 port 587 tls-require pki crepererum.net auth <secrets> mask-source
listen on lo mask-source
listen on lo port 10029 tag dkim mask-source

accept from any for domain <domains> alias <aliases> deliver to lmtp "/var/run/dovecot/lmtp"
accept for local alias <aliases> deliver to lmtp "/var/run/dovecot/lmtp"

accept tagged ! dkim for any relay via smtp://127.0.0.1:10028
accept tagged dkim for any relay pki crepererum.net

For humans, that means:

  • optional encryption on port 25 (server-to-server port)
  • mandatory encryption on port 465 and 587 (message submission port, used by clients)
  • clients are masked, so they will not appear in a Received header (there is no reason why people should know the private IP address that I used while sending the mail, since this information leaks approximate location and ISP)
  • internal submission (from loopback) does not need any authentication (this is an implicit rule)
  • external unauthenticated traffic (server-to-server) is only accepted for domains and addresses managed by this server
  • there is an alias table so I only need to check one single postbox for all my addresses (there are some special ones for the purpose of domain management)
  • authenticated and internal submissions can send to everywhere, but will first be sent through a proxy which adds a DKIM signature (see below)
  • emails that come back from the DKIM proxy are finally relayed to the outer world; during this process, we try to use a secure connection (with unencrypted fallback) and offer the option to also present our own certificate
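From a client's point of view, the submission listener on port 587 (TLS required, authentication required) can be exercised with a few lines of Python. This is only a minimal sketch using the standard library's smtplib; the host name, addresses and credentials are placeholders and not values from the configuration above.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text test mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg: EmailMessage, host: str, user: str, password: str, port: int = 587) -> None:
    """Submit a mail via the submission port, refusing to continue without TLS."""
    context = ssl.create_default_context()  # verifies the server certificate
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls(context=context)  # raises if the server offers no STARTTLS
        smtp.login(user, password)
        smtp.send_message(msg)


# Hypothetical values for illustration only. submit() is not called here
# because it needs a real server and real credentials.
test_mail = build_message(
    "me@example.org", "probe@example.com", "Delivery test", "Hello from my own server."
)
```

Sending such a test mail to an external account after every configuration change is a cheap way to notice a broken setup early.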

You may wonder where I get my TLS certificate from. It is a free, automatically renewed one from Let's Encrypt. They are mostly advertised to be used to add HTTPS to your website, but there is no limitation in using them for other services as well. I also use them for my blog and for IMAP (see below).

Dovecot

Since SMTP is only for mail transport but not for mail management (including folders, tags and search), you need another server. This time, it is an IMAP server. Please do yourself a favor and do not use the outdated POP3, which only supports downloading mails to clients without inter-client synchronization. I went for my old and reliable friend Dovecot. I will not post any configuration here since the developers and package maintainers decided to use the (kinda suboptimal) model of splitting the config into many files. Here is a short description of it:

  • IMAP only
  • encryption (see comment on Let's Encrypt in the OpenSMTPD section)
  • Sieve support for managing mail filter rules
  • communication between OpenSMTPD and Dovecot is handled via a Unix socket and LMTP, which is supposed to be more efficient than a full-blown SMTP connection
  • spam filter (see next section)
  • full text search support (clients can search mails that they did not synchronize yet)
  • push notification (clients can leave an open connection to the server and get push notifications from the server, works pretty well, even on smartphones using K-9 Mail)

Rspamd

More than 50% of all global mails are still spam, and no matter what you do, you are going to be a target of it. Spammers get your address from leaked user databases, your Git commits on GitHub, the Whois data of your domain, your website, ... Some years ago, I was running the most popular spam filter out there: SpamAssassin. Sadly, the system is overly complex and slow. Later, I used DSPAM, a machine-learning-based filter that grows with your decisions, since I think that the "spam" vs "no spam" decision should mostly rely on your personal preferences instead of hardcoded rules. The results were quite good, but it seems that the development of DSPAM has kinda stopped. Also, it does not detect the violation of many policies described later in this article[2]. A rather new member of the anti-spam world is Rspamd. It is lightweight and fast and combines machine learning, rule checks and public spam lists into one easily configurable package. I have integrated it into Dovecot using this tutorial (for learning) and a Sieve script (for filtering). The results are really good. Here is a short version of the current stats:

# scan stats
Messages scanned: 13041
Messages with action reject: 7538, 57.80%
Messages with action soft reject: 0, 0.00%
Messages with action rewrite subject: 0, 0.00%
Messages with action add header: 1385, 10.62%
Messages with action greylist: 484, 3.71%
Messages with action no action: 3634, 27.86%
Messages treated as spam: 8923, 68.42%
Messages treated as ham: 4118, 31.57%

...

# learn stats
Statfile: BAYES_SPAM type: sqlite3; length: 8.96M; free blocks: 0; total blocks: 128.41k; free: 0.00%; learned: 432; users: 1; languages: 1
Statfile: BAYES_HAM type: sqlite3; length: 13.24M; free blocks: 0; total blocks: 195.52k; free: 0.00%; learned: 255; users: 1; languages: 1
Total learns: 687

It only learns from messages whose classification I change manually (either by moving them from any folder to the SPAM folder or by moving them out of the SPAM folder to another location). This happened to \(432 + 255 = 687\) out of \(13041\) mails, which is about 5% and includes the initial learning period and configuration tuning. That is pretty satisfying. What is not so great is the fact that nearly 70% of all incoming mails are spam. "Broken" is an appropriate description for the global infrastructure, I guess.
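The percentages quoted here can be re-derived from the raw counters in the stats output. The following lines are just that arithmetic; the grouping of actions into "spam" and "ham" is my reading of the numbers (the sums match the reported totals), not something taken from the Rspamd documentation.

```python
scanned = 13041
reject, add_header, greylist, no_action = 7538, 1385, 484, 3634
learned_spam, learned_ham = 432, 255

spam = reject + add_header   # matches the "treated as spam" line (8923)
ham = greylist + no_action   # matches the "treated as ham" line (4118)
learned = learned_spam + learned_ham

corrected_share = learned / scanned  # share of mails I had to reclassify by hand
spam_share = spam / scanned

print(f"manually corrected: {learned} of {scanned} = {corrected_share:.2%}")
print(f"spam share: {spam_share:.2%}")
```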

The outer world

A quick intro on why mail protection and encryption are different from website encryption. For websites, you have a single domain (or a limited number of subdomains) with one or multiple servers and IP addresses responsible for it. The communication between services is rather simple: call them, get your data in and out and you are done. No traces of the technology behind the service. The huge difference for mails is that you can have a large number of servers (with different DNS names) responsible for one domain. Furthermore, the first server you send an email to is rarely the last one that handles the mail. It is sent through a chain of servers, each doing some job: altering content and headers, archiving mails, multiplying them (e.g. in case of mailing lists) and doing other funny stuff that you have never thought of. So it is not sufficient to only protect the communication between providers (which, by the way, only became popular recently). This part is all about the communication with other providers and what standards and tricks are involved here to protect your mails from the first to the last part of the huge chain of servers.

Sender Policy Framework (SPF)

When email was invented, it was primarily used by universities to send messages either to users on the same system or to other educated people. Nobody checked if the mail was sent by the right system, since it was "the good old world"®. SMTP does not even know a real difference between client-to-server and server-to-server communication. And this is a problem nowadays, since I can basically send emails in the name of every address in the world. This is pretty bad for spam and phishing protection. Providers and companies realized that and SPF was born. Whenever you get an email that seems to belong to a certain domain, you can check, via the domain's DNS record, if the server that sent you the mail was allowed to speak on behalf of this domain. Here is an example for my domain:

crepererum.net. TXT
v=spf1 mx a -all

That means:

  • it is an SPF version 1 record
  • servers that are (directly or indirectly) mentioned by an MX (mail / @ handlers), A (IPv4 website), or AAAA (IPv6 website) record are allowed to send mails
  • all others are strictly forbidden to send mails for this domain
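To make the structure of such a record more tangible, here is a minimal sketch that splits an SPF record into its mechanisms and the result each one produces on a match. It only handles the syntax (qualifier plus mechanism) and does not perform any DNS lookups; the function name and the decision to skip modifiers such as redirect= are my own simplifications.

```python
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record: str) -> list[tuple[str, str]]:
    """Split an SPF record into (mechanism, result) pairs in evaluation order."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if "=" in term:
            # Modifiers such as redirect= are skipped in this sketch.
            continue
        # A missing qualifier means "+" (pass).
        if term[0] in QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term
        parsed.append((mechanism, QUALIFIERS[qualifier]))
    return parsed


print(parse_spf("v=spf1 mx a -all"))
# [('mx', 'pass'), ('a', 'pass'), ('all', 'fail')]
```

The mechanisms are evaluated from left to right, so the final -all only applies to servers that matched neither mx nor a.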

DomainKeys Identified Mail (DKIM)

Now that we can verify that a server is allowed to send us mails, we face another problem: How can you ensure that the mail is still the same as originally sent out by the sender's provider, especially after it was processed by a chain of SMTP servers as described earlier? Sure, there is PGP and S/MIME (and I will come back to that later), but you cannot expect that a spam protection system is able to gather all required keys for all possible senders. So the solution is to introduce yet another, domain-bound, key, which is used to sign outgoing mails and which is also published via DNS. This is called DKIM and is the enhanced version of Yahoo!'s DomainKeys, which is now obsolete. It works similarly to SPF. Here is an example:

default._domainkey.crepererum.net. TXT
v=DKIM1; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCixgC9OeTZcIVdyii+wq4RbbAhKcUhrbeIguQYAShszRAVnZK71oVcMHnf1yxxnEkerSqqksVz4Ze4Fmq/wlvQvQBByqDDrHQz8VIrFHueefdNbPbTJjsOkqzdoSBWW6F18fCW8hvGEpv24q6oWVvakMb6J5BNiHFQhSidZytanwIDAQAB

In clear-text, that means:

  • version 1 DKIM record
  • 1024 bit RSA public key (not necessarily the same as used for SMTP/IMAP encryption)

Which parts of the message are protected by DKIM is part of the software configuration of the sender's server. In my case, these are (apart from the message body and the DKIM signature itself) the following headers:

  • message-id
  • subject
  • from
  • to
  • cc
  • date
  • content-type
  • mime-version

This would still allow the intermediate servers to add bcc or reply-to fields if this makes sense for their infrastructure.
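The effect of signing only selected headers can be illustrated with a strongly simplified model. The sketch below is not real DKIM (there is no RSA signature, and the canonicalization only roughly follows the "relaxed" scheme); it merely hashes the listed headers plus the body and shows which modifications change the digest. All names and sample values are made up for the illustration.

```python
import hashlib

SIGNED_HEADERS = [
    "message-id", "subject", "from", "to", "cc", "date", "content-type", "mime-version",
]


def canonicalize(name: str, value: str) -> str:
    """Roughly 'relaxed' canonicalization: lowercase name, collapsed whitespace."""
    return f"{name.lower()}:{' '.join(value.split())}\r\n"


def signed_digest(headers: dict[str, str], body: str) -> str:
    """Hash the protected headers (if present) and the body; other headers are ignored."""
    lowered = {name.lower(): value for name, value in headers.items()}
    digest = hashlib.sha256()
    for name in SIGNED_HEADERS:
        if name in lowered:
            digest.update(canonicalize(name, lowered[name]).encode())
    digest.update(hashlib.sha256(body.encode()).digest())
    return digest.hexdigest()


base = {
    "From": "a@example.org",
    "To": "b@example.com",
    "Subject": "Hello",
    "Date": "Mon, 12 Sep 2016 10:00:00 +0000",
}
original = signed_digest(base, "Hi there\r\n")
with_reply_to = signed_digest({**base, "Reply-To": "c@example.org"}, "Hi there\r\n")
tagged_subject = signed_digest({**base, "Subject": "[list] Hello"}, "Hi there\r\n")

print(original == with_reply_to)   # True: Reply-To is not part of the protected set
print(original == tagged_subject)  # False: the subject is protected
```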

You may wonder if SPF is required when DKIM is in action. The mails are (partly) authenticated and the important parts cannot be altered. Still, SPF is important. The reason is that DKIM neither requires you to include the message ID in the protected part, nor does any standard prevent you from resending the same mail multiple times (and checking this would require an enormous amount of technical infrastructure). So an attacker could resend a valid email, which is known as a replay attack. SPF protects you from this since only your servers are allowed to send mails on your behalf.

Do not confuse DKIM with DKIM Core, which aims to create a simpler standard. As far as I know, no provider supports that standard (currently).

DKIMproxy

So DKIM requires a software component to add digital signatures to outgoing mails. The most popular software in this field may be OpenDKIM. Sadly, the clean and scalable infrastructure of OpenSMTPD does not implement support for Milter plugins. It has its own filter system with some example filters in the extras pack. Due to design changes, the developers removed the experimental/incomplete parts, which also included the DKIM signer. Apart from that, the signer had a bug which led to invalid signatures if mails were sent to multiple people (i.e. multiple recipients within the to header). My alternative is DKIMproxy, which I already mentioned while describing the OpenSMTPD configuration. I am not entirely happy with this setup since the proxy is a slow Perl script and I would prefer the filter-based solution, but it works for now.

Domain-based Message Authentication, Reporting and Conformance (DMARC)

The major problem with SPF and DKIM is that they are only mechanisms, with no communication around them. That means you cannot tell other mail providers how they should handle mails with invalid signatures or illegal senders. Is this a known technical test or an attack? Also, there is no information exchange protocol, so you never know if these mechanisms work or if they fail. This is important to track technical failures but also to get notified about potential attacks. Luckily, there is yet another standard for this: DMARC. It also works via DNS records and tells others both how to handle failed validations and how to inform you about the results. The record that I am using:

_dmarc.crepererum.net. TXT
v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s; pct=100; fo=1; rua=mailto:dmarc@crepererum.net; ri=172800

This means:

  • it is a version 1 DMARC record (as for the two other records above: there is no version 2, at least not this year)
  • reject all non-valid mail for domain and subdomains (a good alternative would be quarantine)
  • do a strict validation for DKIM and SPF
  • reject 100% of the mails that fail one of the validation steps (use lower percentage for test purposes)
  • generate a failure report if at least one of the mechanisms (DKIM or SPF) failed
  • send reports to my special mail address[3]
  • send reports aggregated, every 172800 seconds = 48h
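DKIM and DMARC records share the same "tag=value; tag=value" layout, so a tiny parser is enough to read them programmatically. The following is a minimal sketch (the function name is mine) that parses the DMARC record shown above and converts the report interval into hours.

```python
def parse_tag_list(record: str) -> dict[str, str]:
    """Parse a 'tag=value; tag=value' DNS TXT record into a dictionary."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        # Only split at the first "=", values such as mailto: URIs stay intact.
        name, _, value = part.partition("=")
        tags[name.strip()] = value.strip()
    return tags


record = (
    "v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s; pct=100; fo=1; "
    "rua=mailto:dmarc@crepererum.net; ri=172800"
)
policy = parse_tag_list(record)
interval_hours = int(policy["ri"]) / 3600

print(policy["p"], policy["rua"], interval_hours)
# reject mailto:dmarc@crepererum.net 48.0
```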

I have mentioned that other providers send "reports". What do they look like? The authors of the protocol were smart enough to also standardize these. They are either submitted via HTTP(S) or via mail (which is what happens in my case) and are well-formed, compressed XML files. Here is an example that I got from Google's servers:

<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
  <report_metadata>
    <org_name>google.com</org_name>
    <email>noreply-dmarc-support@google.com</email>
    <extra_contact_info>https://support.google.com/a/answer/2466580</extra_contact_info>
    <report_id>15283316391971123284</report_id>
    <date_range>
      <begin>1473552000</begin>
      <end>1473638399</end>
    </date_range>
  </report_metadata>
  <policy_published>
    <domain>crepererum.net</domain>
    <adkim>s</adkim>
    <aspf>s</aspf>
    <p>reject</p>
    <sp>reject</sp>
    <pct>100</pct>
  </policy_published>
  <record>
    <row>
      <source_ip>2a01:4f8:d13:138b::2</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>crepererum.net</header_from>
    </identifiers>
    <auth_results>
      <dkim>
        <domain>crepererum.net</domain>
        <result>pass</result>
        <selector>default</selector>
      </dkim>
      <spf>
        <domain>crepererum.net</domain>
        <result>pass</result>
      </spf>
    </auth_results>
  </record>
  <record>
    <row>
      <source_ip>65.55.111.106</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>quarantine</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
        <reason>
          <type>forwarded</type>
          <comment>looks forwarded, downgrade to quarantine with phishing warning</comment>
        </reason>
      </policy_evaluated>
    </row>
    <identifiers>
      <header_from>crepererum.net</header_from>
    </identifiers>
    <auth_results>
      <dkim>
        <domain>crepererum.net</domain>
        <result>fail</result>
        <selector>default</selector>
      </dkim>
      <spf>
        <domain>crepererum.net</domain>
        <result>fail</result>
      </spf>
    </auth_results>
  </record>
</feedback>

And here is what we can learn from it:

  • They correctly parse my submitted policy.
  • Google supports IPv6 🎉
  • 65.55.111.106, which is the Outlook Server, seems to forward mails and is breaking both SPF and DKIM validation. 🙈 Do NOT forward mails without clearly declaring them as forwarded! You can do this by using Sender Rewriting Scheme (SRS). Also, do not alter the message body or existing headers since this breaks the DKIM signature.
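Reading these reports by hand gets boring quickly. A minimal sketch for extracting the interesting rows with Python's standard XML parser could look like this; the embedded sample is a shortened version of the report above, and the function name is my own choice.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<feedback>
  <record>
    <row>
      <source_ip>2a01:4f8:d13:138b::2</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>65.55.111.106</source_ip>
      <count>1</count>
      <policy_evaluated>
        <disposition>quarantine</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
  </record>
</feedback>"""


def summarize(report_xml: str) -> list[dict[str, str]]:
    """Return one dictionary per <row> with sender address and evaluation results."""
    root = ET.fromstring(report_xml)
    rows = []
    for row in root.iter("row"):
        evaluated = row.find("policy_evaluated")
        rows.append({
            "source_ip": row.findtext("source_ip"),
            "count": row.findtext("count"),
            "disposition": evaluated.findtext("disposition"),
            "dkim": evaluated.findtext("dkim"),
            "spf": evaluated.findtext("spf"),
        })
    return rows


# Only keep rows where at least one of the two checks failed.
failures = [r for r in summarize(SAMPLE) if "fail" in (r["dkim"], r["spf"])]
print(failures)
```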

Blacklists

Since spam, phishing, viruses and other bad stories got way too common, providers started to protect themselves by managing blacklists of IP addresses and domains. These blacklists are either specific to certain companies or a collaborative, global effort. The issue with these blacklists arises when you are listed by them without knowing about it. That happened to me with the infamous Microsoft Blacklist. Only a kinda unclear response from Outlook's SMTP servers gave a hint about this situation. Looking up the reason for this mess, I figured out that another server which owned my IP address years ago was infected by malware. Since blacklist managers normally do not recheck the placements, this state had been active for 8 years. A quick request to remove my server from the list was followed by a quick, positive response.

Lesson learned: check your blacklist status, e.g. by using MX Toolbox. Another thing that helps others to trust you is to set up Reverse DNS.
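Both checks boil down to DNS lookups on a name derived from your IP address. Many public blacklists are published as DNS zones that are queried with the reversed address in front of the zone name; a listed address resolves, an unlisted one does not. The sketch below builds these names with the standard library. The zone dnsbl.example.org is a made-up placeholder, and is_listed() is not called here since it needs network access and a real list.

```python
import ipaddress
import socket


def reverse_dns_name(ip: str) -> str:
    """Name of the PTR record that a Reverse DNS lookup asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Name to query in a DNS-based blacklist: reversed address plus list zone."""
    pointer = ipaddress.ip_address(ip).reverse_pointer
    # Strip the trailing "in-addr.arpa" / "ip6.arpa" suffix, then append the zone.
    reversed_part = pointer.rsplit(".", 2)[0]
    return f"{reversed_part}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """True if the blacklist zone has an entry for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False  # no such name: not listed (or the lookup itself failed)


print(reverse_dns_name("192.0.2.10"))                       # 10.2.0.192.in-addr.arpa
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))  # 10.2.0.192.dnsbl.example.org
```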

A final hint about blacklists: DO NOT PAY for some kind of mail trust program. Companies try to squeeze money out of small providers and individuals to give their systems a lower spam score (or a higher trust score). There is absolutely no reason why you should pay to make their shitty spam protection work. Delisting from trustworthy blacklists is free (but might require the creation of an account for their system).

End-to-end Encryption and Signatures

Now we have built up the trust and protection to send emails from one provider to another. But what if you do not trust your provider? There are two common ways to establish end-to-end encryption and mail signatures: S/MIME and PGP.

I want to start with S/MIME. It is very similar to what you know from TLS encryption. CAs issue certificates for individual users and users trust CAs to do the right thing, even though this assumption has been violated many, many times. Sadly, Let's Encrypt does not issue certificates that are usable for that purpose. StartSSL does this, for free. Whether you trust them or not is your decision.

An alternative is the "Web of Trust" that is created by PGP, or the de facto only usable implementation GPG. The problem here is that this system only works for people who invest a shitload of time in this idea. Keybase.io tries to work around this issue by providing other ways for people to prove that they own a specific key. There are also other techniques like OPENPGPKEY, which is similar to DANE (but for PGP instead of TLS keys), but it never really kicked off. Then there are the classic key servers, newer append-only logs like CONIKS or the somewhat promising OpenPGP Web Key Service. All in all, there are many ideas to fix an overcomplicated standard.

When using PGP, please use PGP/MIME. It avoids these ugly -----BEGIN PGP SIGNED MESSAGE----- lines, which may confuse people who are not familiar with this kind of technology. If the recipient's email client does not support PGP/MIME or S/MIME, the signature will just appear as an attachment, which rarely results in misunderstandings, especially since the attachments have the meaningful names signature.asc and smime.p7s. It is also possible to use both standards at the same time, e.g. with Evolution.

I personally use end-to-end signatures for basically every outgoing mail, but I kinda dislike encryption. The reason is that it failed too often in the past and it also disables some very essential features like server-driven search and spam protection. Also be aware that headers are not encrypted, so the list of recipients and the subject field are still visible to an attacker.

Not implemented stuff

There are some things that I did not implement for various reasons, but which I want to mention since others may find them useful.

Domain Name System Security Extensions (DNSSec) and DNS-based Authentication of Named Entities (DANE)

The first one is DNSSec. It basically protects the DNS data received from a DNS server with a signature. This can be important for mail servers since not only the SPF, DKIM and DMARC records are fetched this way but also the IP address of the mail server itself. There is some heavy discussion about this technology and the suboptimal key hierarchy involved with it, which I will not comment on (the interested reader should be able to look up this discussion). The reason why I do not use it is that my server provider, Hetzner, is not willing to implement it.

A technology that would be enabled by DNSSec is DANE. It stores the public TLS key or its fingerprint as a DNS record, so even when an evil CA issues wrong certificates, you have some kind of protection. For SMTP transport, DANE is more important than for HTTPS, since you cannot rely on things like HTTP Strict Transport Security (HSTS), which includes certificate preloading into browsers, and HTTP Public Key Pinning (HPKP). For Strict Transport Security there is a draft called SMTP Strict Transport Security (SMTP STS) that proposes a similar technology for SMTP, but currently it is in an early state. For Key Pinning there are some abandoned drafts like Trust Assertions for Certificate Keys (TACK), but as far as I know there is no active movement towards a proper solution. Be aware that DANE requires support from the software connecting to your server (other servers in our case; mail clients, browsers and co for other connection types) to be effective, and this support is kinda rare.
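To give an idea of what such a DANE record contains, here is a minimal sketch that derives the record name and the record data for a TLSA record. It uses the parameter combination "3 0 1" (certificate of the server itself, full certificate, SHA-256 fingerprint) because that variant only needs a hash function; pinning just the public key (selector 1) would additionally require extracting the key from the certificate, which this sketch does not do. The host name is a placeholder.

```python
import hashlib
import ssl


def tlsa_owner_name(host: str, port: int = 25) -> str:
    """DNS name under which the TLSA record for a TCP service is published."""
    return f"_{port}._tcp.{host}."


def tlsa_rdata_for_certificate(der_certificate: bytes) -> str:
    """TLSA record data '3 0 1 <hex>' for a DER-encoded certificate."""
    fingerprint = hashlib.sha256(der_certificate).hexdigest()
    return f"3 0 1 {fingerprint}"


def tlsa_rdata_from_pem(pem_certificate: str) -> str:
    """Same as above, but starting from the usual PEM text format."""
    return tlsa_rdata_for_certificate(ssl.PEM_cert_to_DER_cert(pem_certificate))


# Placeholder host name; the certificate bytes would come from your own certificate file.
print(tlsa_owner_name("mail.example.org"))  # _25._tcp.mail.example.org.
```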

For both technologies there is a client side though. I do use a DNSSec-enabled resolver. This can simply be done by setting up Unbound, even when the DNS resolver provided by your hoster does not offer DNSSec validation. On the other hand, my SMTP server does not validate DANE records, and also does not care about the DNSSec response status, so it is pretty useless.

More TLS

The encryption and authentication of connection between mail servers can be protected by TLS, as shown in the OpenSMTPD configuration. Currently, I do not enforce secure connections for server-to-server traffic. There are multiple options that I could change here and here are the reasons why that might be a bad idea:

  • enforcing encryption of incoming traffic: Some servers do not support it. While this may stop a good amount of spam, it might also stop you from getting flight or event tickets, newsletters you really want or mails that you do not even know were sent to you.

  • client certificate validation: This would require an encrypted incoming connection and is basically the opposite of the certificate validation normally done in TLS. The normal case is: the host server presents a valid certificate and the client server (the one that initiates the connection) validates it. Here, the client server also presents a certificate so both sides can ensure that they are talking to the right party. While this sounds great in theory, I have only seen one server that supports it and it was owned by a small company. I am not aware of any larger mail provider implementing it. So requiring it would basically stop all incoming traffic.

  • enforcing encryption of outgoing traffic: This seems to be the easiest thing to do. You can ensure that your outgoing mails are encrypted when being sent to another server[4] and if your server is unable to establish a connection, you get a notification. So you do not end up with silent failures. Sadly, some providers still do not set up TLS encryption and I do not want to worry about this fact every time I send an email to someone.

The Thing about Time

TLS, DNSSec, PGP, and anti-spam: they all rely on correct time information. Having an at least somewhat accurate system clock is crucial, and not that many people tend to think about it. Most systems get this time information via the Network Time Protocol (NTP). The bad news is: it is completely unprotected against man-in-the-middle attacks. The required UDP packets are sent out without any signature. There are drafts like Network Time Security (NTS), Autokey, or OpenTimestamps, but none of them is currently usable. Also, they rely on the fact that the single NTP server you are asking for time information is not lying. All in all, the situation is pretty bad. Roughtime tries to solve all of these problems but is in a very early stage. Since it is somewhat supported by Google, which earlier tried to add time information to TLS to work around screwed client clocks but is now failing with this approach due to TLS 1.3, there is hope that this approach (or some successor) will somehow succeed. For now, I am just hoping that NTP attacks will not take place.

Distributed Denial-of-Service (DDoS)

A problem I luckily have not had to care about that much (yet) is DDoS prevention. My server provider offers some protection though. I could decrease the risk by taking measures against SYN flooding. The Linux kernel offers some protection, but as GitHub points out, that might not be sufficient. Also take other services on the same host into account, which may increase the (general) attack surface. If you are administrating a larger provider, DDoS prevention is for sure something that you should take seriously, but at the end of the day it comes down to one question: What is bigger, the hammer or the nail?

Spam Protection by Creative Protocol Interpretation

An interesting fact about spam is the following: most of these mails are sent by bots or one-shot scripts, and both methods are rather lazy. You can exploit this laziness by using some "gray" parts of SMTP-related standards.

The first method is traps. The standard for MX records allows you to specify multiple servers that are responsible for your domain's SMTP transport. This is intended for load balancing and redundancy in case of failures. Additionally, every MX record has a preference assigned (lowest number = highest preference). Since spammers are lazy, they usually pick either the first or the last entry of the sorted list. You can now use the following example setup:

  5    x-spamtrap-no-smtp-running.domain.org.
 10    x-normal-server-1.domain.org.
 10    x-normal-server-2.domain.org.
100    x-spamtrap-always-defer.domain.org.

On the first server x-spamtrap-no-smtp-running there is no SMTP server running at all. Normal, properly designed SMTP senders would now disable the route to this server and would try the next entry in the list, which is either x-normal-server-1 or x-normal-server-2. The last server is x-spamtrap-always-defer and runs an SMTP server that always defers messages and never accepts them. The reason for having a server at all here is that it is the failover node when all "normal" servers are unreachable. Using a "defer" instead of an "unreachable" increases the probability that proper senders keep the message within their queues and try again later. Spammers normally do not care if the message was delivered and will move on to their next target. Keep in mind that you need at least two additional IPv4 (and IPv6) addresses and one additional SMTP server instance to implement this setup. This, and the fact that I am not sure about all the implications of this setup, is why I never used this kind of protection.
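The difference between a proper and a lazy sender in this trap setup can be modeled in a few lines. This is a toy simulation of the example above, not real SMTP code: the behaviour table and both sender functions are made up for illustration, and real senders typically pick randomly among hosts with equal preference.

```python
MX_RECORDS = [
    (5, "x-spamtrap-no-smtp-running.domain.org."),
    (10, "x-normal-server-1.domain.org."),
    (10, "x-normal-server-2.domain.org."),
    (100, "x-spamtrap-always-defer.domain.org."),
]

# What each host of the example does when a sender connects to it.
BEHAVIOUR = {
    "x-spamtrap-no-smtp-running.domain.org.": "unreachable",
    "x-normal-server-1.domain.org.": "accept",
    "x-normal-server-2.domain.org.": "accept",
    "x-spamtrap-always-defer.domain.org.": "defer",
}


def proper_sender(records):
    """Try hosts by ascending preference until one accepts the mail."""
    for _, host in sorted(records):
        if BEHAVIOUR[host] == "accept":
            return host
    return None  # nothing accepted: keep the mail queued and retry later


def lazy_sender(records, pick_last=False):
    """Try exactly one host (first or last of the sorted list), then give up."""
    ordered = sorted(records)
    _, host = ordered[-1] if pick_last else ordered[0]
    return host if BEHAVIOUR[host] == "accept" else None


print(proper_sender(MX_RECORDS))                # x-normal-server-1.domain.org.
print(lazy_sender(MX_RECORDS))                  # None
print(lazy_sender(MX_RECORDS, pick_last=True))  # None
```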

A method that also uses the "defer"-like response is Greylisting. It rejects new, unknown senders (by IP, mail address, host, ...) using an error which looks like a temporary problem. Proper mail servers will now queue the message and try again. This technique requires some kind of plugin within your SMTP server to be usable. A major drawback of this protection scheme is that even when your server advertises the downtime as "2 minutes" (for example), some senders will delay the message for a longer time. This is particularly annoying when you are awaiting a "mail address confirmation" message. Also, it may prevent you from getting some important mails altogether, since some web services do not use intermediate servers to send out mails but use one-shot scripts. So you may not get your ordered concert ticket.
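The core of greylisting is a small table that remembers when a sender was first seen. The following is a minimal in-memory sketch under my own assumptions (keying on the triplet of client IP, sender and recipient, and a two-minute delay as in the example above); real plugins also expire old entries and whitelist known senders.

```python
import time


class Greylist:
    """Defer unknown senders until they retry after a minimum delay."""

    def __init__(self, min_delay_seconds=120, clock=time.time):
        self.min_delay = min_delay_seconds
        self.clock = clock  # injectable, which makes the class easy to test
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply: '451 ...' (try again later) or '250 ok'."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return "451 temporarily deferred, please try again later"
        return "250 ok"


# Simulated clock so that the example does not have to wait for two minutes.
fake_now = [0.0]
greylist = Greylist(min_delay_seconds=120, clock=lambda: fake_now[0])

print(greylist.check("192.0.2.10", "shop@example.com", "me@example.org"))  # 451 ...
fake_now[0] = 180.0
print(greylist.check("192.0.2.10", "shop@example.com", "me@example.org"))  # 250 ok
```

A one-shot script that never retries only ever sees the first reply, which is exactly why the concert ticket might not arrive.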

Mailing Lists

Mailing lists are something that will for sure result in problems. Let's recap what a mailing list does at its core: It receives a mail and sends it to all subscribers. While doing so, it may do the following wrong things:

  • keeping from header intact, violating SPF: The list operator must use SRS (see DMARC section above).
  • altering subject fields, e.g. by prepending [name of the list]␣, breaking DKIM: This is discouraged in RFC 6377, which also explains how to build DKIM-friendly mailing lists. See the paragraph later in this section for a comment on mail clients and their role in that story.
  • adding additional footers to body, breaking DKIM: Same as above.
  • other wrong modifications of the message (e.g. replacing newline characters, changing MIME types, ...), breaking DKIM: Same as above.
  • not caring about end-to-end signatures: Follow the same rules as described for DKIM.
  • not caring about end-to-end encryption: Sadly this is a very complicated part. For S/MIME, you could use a list-specific key pair and do a decryption + re-encryption operation on the mailing list server; or provide all subscribers with the same private key. While the first one loses the end-to-end guarantees, the latter one is not scalable. For PGP, you have the same options and for the re-encryption concept there is already a great implementation named Schleuder. Additionally, the GPG team is working on a more sophisticated solution, but this one is a) not ready and b) requires special client support.

A major issue here is the incomplete implementation of mail clients (yes, this time it is about the software on your smartphone / PC). They should be able to show special mailing list information, which also includes information about list owners and how to unsubscribe from a list, see the List-* headers described in RFC 4021. So per standard, there is absolutely no reason for list managers to alter the actual content of the mail. In reality though, clients are badly designed and laws of some countries kinda force you to add list metadata, so you might not have a chance at all to provide a proper mailing list implementation.
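How a list could attach its metadata without touching subject or body can be sketched with Python's email package. The list name and addresses are placeholders, and whether this keeps an existing DKIM signature valid depends on the sender not having included these header names in its signature.

```python
from email.message import EmailMessage


def add_list_headers(msg: EmailMessage, list_id: str, unsubscribe_mailto: str) -> EmailMessage:
    """Attach list metadata as headers; subject and body stay untouched."""
    msg["List-Id"] = f"<{list_id}>"
    msg["List-Unsubscribe"] = f"<mailto:{unsubscribe_mailto}>"
    return msg


# Placeholder mail as it might arrive at the list server.
incoming = EmailMessage()
incoming["From"] = "alice@example.org"
incoming["To"] = "discuss@lists.example.net"
incoming["Subject"] = "Meeting notes"
incoming.set_content("See you on Friday.")

outgoing = add_list_headers(
    incoming, "discuss.lists.example.net", "discuss-leave@lists.example.net"
)
print(outgoing["Subject"])  # Meeting notes (no "[discuss]" prefix needed)
print(outgoing["List-Unsubscribe"])
```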

Conclusion

As you can see, the whole worldwide email infrastructure is a mess. This is especially true because of the following reasons:

  • New standards invalidate behavior that worked before, e.g. DKIM in case of mailing lists.
  • You have to implement new standards and protocol extensions, which may rely on each other when you want to avoid being marked as spam. It is hard to keep up-to-date since there is no common news channel for all of these (partly competing) technologies. Also, the increasing amount of complexity heavily increases the chance of failure.
  • Speaking of spam: there are large providers and service companies making a whole lot of money classifying IP address ranges or domains as spam and you have to beg them not to do so for your system. Big companies win, small providers lose that battle.
  • It is basically impossible to send files via email since you cannot know if there is one SMTP server in the chain which rejects your message because the attachment is larger than a certain threshold. And there is no consensus on how this limit should be chosen. Also, most mail clients fetch the entire mail from the server which is especially annoying when your smartphone starts to download a 100MB photo archive, which was sent by your friend, when it has limited battery and your monthly volume is already at its limit.
  • End-to-end security (signatures as well as encryption) totally sucks. Also, most metadata is leaked with all usable technologies. Dark Mail wanted to provide a solution for this but never really kicked off.
  • Setting up your system costs A LOT of time, know-how, effort and frustration. On the other hand, most providers spy on you, so if you want independence and privacy, you have to go down that rabbit hole.
  • Specifying the actual content of the mail is a mess as well. Plain text works very well, but even basic formatting requires HTML, which is a horror story on its own.
  • The protocols and standards lead to some confusing edge cases, e.g. that you might get mails without any to header (in case the sender only used the bcc field).

Anyway, I hope you have learned something. In case you found something unclear or even wrong, feel free to drop me a message (probably not via email 😉).


Image: "Sorting Mail in Front Royal" by USMC Archives, CC BY 2.0, 1951


  1. Fun fact: it is the "Simple Mail Transfer Protocol". I do not see where the protocol that led to this article is "simple". ↩︎

  2. That seems kinda contradictory to the machine learning approach, but it makes sense in a way that the policies are contracts between server operators and violations are an absolute no-go. On the other hand, the semantic content of the mails is a rather personal thing. ↩︎

  3. In case you are wondering: this mail address is protected by Rspamd as well. The same holds for other special mail addresses like the one for TLS certificates. ↩︎

  4. I am only talking about the first hop here. As Snowden taught us, spy agencies wiretap intranets so there might be a weak spot which you cannot control. In that case, only end-to-end encryption can protect you. ↩︎