In this article I'll explain what I learned while implementing a filtering system for incoming emails. The level of yak shaving I had to go through to reach the final goal has no precedent in all of human history.
I don't use webmail or the IMAP protocol; I stick to the classic POP3s (which means downloading emails from the server and keeping a local copy).
It all started with an innocent wish: I want to separate emails from mailing lists from the rest. I can split this story in three parts:
- Figure out how to handle incoming email classification (I did some research and used maildrop and related utilities because fetchmail or procmail are too hostile for me)
- Configure Emacs mu4e to handle maildir folders
- Figure out how to switch system emails from the default mbox format to the more modern Maildir format and have them delivered in a location I choose (and convince exim4 to accommodate my wishes)
By researching and manually implementing, with separate tools (each one with its own set of quirks), all the steps that modern software (such as Thunderbird) implements, I realized how much work mail clients actually do.
The architecture of a modern email exchange is the following (courtesy of Wikipedia):
MUA → MSA → MTA → (the internet) → MTA → MDA →→ MRA →→ MUA
If I read that list correctly, my current idea of "email client" should be close to the following (from left to right):
- mu4e: the thing I use to write and read emails (MUA)
- smtpmail: the SMTP client, which is part of Emacs (MSA)
- (The MTA is my email server)
- mpop: the POP3 client (MRA)
- maildrop: mail delivery filter/agent (MDA)
- mu4e again (MUA) to read the email from disk
In comparison, applications like Mozilla Thunderbird do all the above seamlessly.
§ Step 1: handle incoming email classification
After a bit of research and an attempt at configuring fetchmail, I decided I was not happy (the configuration looks convoluted and scattered across multiple files), so I looked further and finally settled on maildrop.
Luckily my POP3 retrieval tool (mpop) supports different delivery mechanisms. Instead of dropping the email in the mailbox, I can configure mpop to hand it to an MDA and let it do the rest.
- delivery maildir ~/.local/mail/city17.xyz/catchall/Inbox
+ delivery mda /usr/bin/maildrop ~/.config/maildrop/mailfilter
A maildrop filter file is relatively easy to grok (man 5 maildroprc and man 5 mailfilter). I got a general idea with some copypasta (here and here), then proceeded to create my filters with the help of the good documentation.
Filtering messages from mailing lists is also easy because each mailing list service has a specific header. Here are a few (src):
- Courier mlm:
/^List-Post: <mailto:vlug-public@vlugnet.org>/
- GNU mailman:
/^List-Id:.*<fsfe-de\.fsfeurope\.org>/
- ezmlm:
/^Mailing-List: contact getmail-help@discworld\.dnsalias\.org/
- majordomo:
/^Sender: owner-vlug@listserv\.uni-stuttgart\.de/
Here's an example for the Guix support mailing list:
# Relevant headers
# List-Id: <help-guix.gnu.org>
# List-Post: <mailto:help-guix@gnu.org>
if (/^List-Post: .*help-guix@gnu\.org/:h)
{
log "MATCHED!"
}
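Before writing a rule like the one above, it can help to check which list headers a saved message actually carries. The following is a minimal sketch and not part of maildrop: list_headers is a hypothetical helper name, and it assumes the message is a plain file with LF line endings and unfolded headers.

```shell
# list_headers FILE: print the mailing-list related headers of a message.
# Hypothetical helper for inspecting a message before writing a filter rule.
list_headers() {
    # sed prints up to the first empty line, which ends the header section,
    # so matching text in the body is ignored.
    sed '/^$/q' "$1" | grep -E -i '^(List-Id|List-Post|Mailing-List|Sender):'
}
```

Running it on a message from the list you want to filter shows which header is the most convenient one to match in the maildrop rule.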
Now where should maildrop save the message? After some more research I've learned about an ancillary tool, maildirmake (man 1 maildirmake), that is used to create Maildir-compliant folder structures without bothering to know the specs:
# Create a Maildir folder with two nested folders
$ maildirmake TestMaildir
$ maildirmake -f Drafts TestMaildir
$ maildirmake -f Drafts.Urgent TestMaildir
# Result:
$ tree -a TestMaildir
TestMaildir
├── cur
├── .Drafts
│ ├── cur
│ ├── maildirfolder
│ ├── new
│ └── tmp
├── .Drafts.Urgent
│ ├── cur
│ ├── maildirfolder
│ ├── new
│ └── tmp
├── new
└── tmp
12 directories, 2 files
The directory cur is for read messages, new is for unread messages and tmp stores the message being written before moving it to new. Please note the paths prepended with a dot (/.Drafts).
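The tmp-then-new handoff described above can be sketched in a few lines of shell. This is only an illustration of the idea, not how maildrop itself is implemented: deliver_to_maildir is a hypothetical name, and the file name is a simplified stand-in for the unique names real delivery agents generate (two deliveries from the same process within one second would collide here).

```shell
# deliver_to_maildir MAILDIR: read one message from stdin and store it in MAILDIR/new.
# Hypothetical sketch; assumes cur, new and tmp already exist in MAILDIR.
deliver_to_maildir() {
    maildir=$1
    # Simplified unique name: epoch seconds, shell PID, host name
    name="$(date +%s).P$$.$(uname -n)"
    # Write the whole message into tmp first...
    cat > "$maildir/tmp/$name" || return 1
    # ...then move it into new, so readers never see a half-written file
    mv "$maildir/tmp/$name" "$maildir/new/$name"
}
```

A usage example would be piping a saved message into it, for instance `deliver_to_maildir TestMaildir < some-message`, where some-message is any message file you have at hand.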
A visual way to represent the above could be:
Jane's emails (0)
├── Inbox
├── Drafts
│ ├── Urgent
As messages arrive and are marked as "read", this is what happens:
$ tree -a TestMaildir
TestMaildir
├── cur
│ ├── 1670876132.M796214P691272Q1.localdomain:2,RS
│ ├── 1671353042.M030514P2342530Q4.localdomain:2,S
├── .Drafts
│ ├── cur
│ ├── maildirfolder
│ ├── new
│ └── tmp
├── .Drafts.Urgent
│ ├── cur
│ ├── maildirfolder
│ ├── new
│ └── tmp
├── new
│ ├── 1670876132.M796214P691272Q1.localdomain:2,RS
│ ├── 1671353042.M030514P2342530Q4.localdomain:2,S
└── tmp
12 directories, 4 files
Visual representation:
Jane's emails (4)
├── Inbox (2 unread)
├── Drafts
│ ├── Urgent
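The counters in the visual representation can be derived from the directory layout alone. Here is a rough sketch, assuming that files in new are unread and files in cur have already been seen by a mail client; maildir_counts is a hypothetical helper name.

```shell
# maildir_counts MAILDIR: print how many messages sit in new and in cur.
# Hypothetical helper; it only counts files and does not parse message flags.
maildir_counts() {
    # tr strips the padding that some wc implementations add
    unread=$(find "$1/new" -type f | wc -l | tr -d ' ')
    seen=$(find "$1/cur" -type f | wc -l | tr -d ' ')
    echo "new: $unread cur: $seen"
}
```

It only looks at the top-level folder; nested folders such as .Drafts would need one call each.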
Please note that maildirmake does not handle creating maildirs with a dot in the name (example: "personal.com"), so the workaround is two lines of shell (bash, because of the brace expansion):
mkdir -p $MAILDIR/personal.com/{cur,tmp,new}
touch $MAILDIR/personal.com/maildirfolder
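If you need this for several folders, the same workaround can be wrapped in a small function. This is a sketch that avoids brace expansion so it also runs in a plain POSIX sh; make_maildir is a hypothetical name, and it assumes (as in the commands above) that MAILDIR points to the parent mail directory.

```shell
# make_maildir DIR: create DIR as a Maildir folder, dots in the name included.
# Hypothetical helper mirroring the two commands above.
make_maildir() {
    # The three standard subdirectories
    mkdir -p "$1/cur" "$1/new" "$1/tmp" || return 1
    # Marker file, as created by maildirmake -f for subfolders
    touch "$1/maildirfolder"
}

# Usage, assuming MAILDIR is set:
# make_maildir "$MAILDIR/personal.com"
```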
§ Step 2: configure Emacs mu4e to handle maildir folders
With this newly acquired knowledge, we can create folders and tell maildrop where to save the messages of each mailing list.
if (/^List-Post: .*help-guix@gnu\.org/:h)
{
log "MATCHED!"
# save the message in this mail folder, then terminate immediately
to "$MAILDIR/.../.lists.help-guix/."
}
The maildrop config file (curiously, the default is ~/.mailfilter) can be tested for typos with:
$ echo | maildrop <your-config-file> -V 9 2>/dev/null \
&& echo "OK" || echo "Error $?"
OK
Or with a real email, checking if it's saved in the correct place:
cat $MAILDIR/.../1673263026.M771328P839100Q2.localdomain | \
maildrop -V 1 <your_config_file>
Configuring mu4e to read from all these folders is a breeze, thanks to mu4e-maildir-shortcuts:
(setq mu4e-maildir-shortcuts
'((:maildir "/.../.lists.help-guix" :key ?i)
;; other folders...
))
And it's done! Now every time we download emails, they will be sorted into the correct folder.
§ Step 3: system emails from mbox to Maildir format
Wait, we're not finished yet, just one more little itch to scratch :)
How about those mysterious messages ("You have new mail") that from time to time appear on the console after updating the system? What are these emails? Where are they saved? Turns out they're somewhat interesting, but this is another can of worms for the next article, which will dive into how exim4 works.
Obligatory XKCD reference: xkcd.com/1728.
§ Conclusions
This was relatively hard stuff to figure out, mostly because there are not many examples around, but the documentation is usually great and with a bit of focus one can work it out.
I'm glad I've learned a bit about tools written more than 15 years ago that are still doing their job well. They are still being used because they work, are stable and are well-documented.