Communities • Chris's Wiki

Spammers do forge various noreply@<you> sender addresses

pubsub.slavino.sk / chris_spam · Saturday, 1 June - 02:47 · 2 minutes

It is probably not news to anyone reading this that some of the time, spammers sending you email will forge it as being from various addresses at your domain, in either or both of the SMTP 'MAIL FROM' envelope sender address and the From: header address. Spammers have been doing this to us for years. What I hadn't realized until now, when I looked at the actual addresses being forged, was that spammers were forging many variations on 'noreply@<us>', varying both the words and the case. Over the past ten days we've seen all of 'noreply@', 'Noreply@', 'Nonreply@', 'no_reply@', 'NOREPLY@', 'no-reply@', and 'NO-REPLY@'.

Of course, spammers also forge various plausible administrative addresses, such as 'Administrator@', 'Admin@', 'cpanel@', 'support@' (and 'Support@'), and one case of 'hr@', as well as the expected 'postmaster@'. These are almost all addresses that don't exist here and never have, so I'm pretty confident that spammers are just making them up instead of drawing them from a list of (past) legitimate email addresses of people here. I suspect that some or perhaps many of these forged addresses are being used on phish spams, and this is probably the case for the various 'noreply@' addresses.

(Spammers clearly use old email address lists to generate their envelope sender addresses, because we reject a lot of SMTP 'MAIL FROM' addresses that used to be real email addresses here but which have since been removed (we do eventually close some accounts). Interestingly, there is also a relatively frequently forged sender address that is a single-letter typo for a real person's email address.)

One of the lessons I draw from this little exercise in curiosity is that if we've created administrative-like email addresses in our system simply to reserve them, and we aren't using them, we should actively block their use as external sender addresses. If we want to create a dummy 'cpanel@' address, for example, we should definitely make it so that it's not accepted as a SMTP envelope sender.
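As a rough illustration of that lesson, here is a minimal Python sketch of a check that a mail system could apply to external SMTP envelope senders. The reserved list and the function names are hypothetical, and the normalization (ignoring case and the '-', '_' and '.' separators) is an assumption based on the variations seen above, not a description of our actual configuration.

```python
import re

# Hypothetical reserved local parts; the real list depends on your site.
RESERVED_LOCAL_PARTS = {
    "noreply", "nonreply", "admin", "administrator", "cpanel", "support", "hr",
}


def normalize_local_part(local_part: str) -> str:
    """Lowercase and drop '-', '_' and '.' so 'NO-REPLY' and 'no_reply' both become 'noreply'."""
    return re.sub(r"[-_.]", "", local_part.lower())


def is_forged_reserved_sender(envelope_sender: str, our_domain: str) -> bool:
    """True if an external MAIL FROM claims a reserved address at our domain."""
    local_part, sep, domain = envelope_sender.rpartition("@")
    if not sep or domain.lower() != our_domain.lower():
        return False  # null sender, or some other domain entirely
    return normalize_local_part(local_part) in RESERVED_LOCAL_PARTS
```

With 'example.org' standing in for the local domain, `is_forged_reserved_sender("NO-REPLY@example.org", "example.org")` is true while an ordinary user address is not matched. 'postmaster' is deliberately left out of the sketch, since that address is normally real.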

(Because of some features of our mail environment, people here can create valid email addresses without our involvement (this has various entirely legitimate uses, including expendable personal email addresses). Historically this has meant that we grabbed a number of addresses simply as precautions to reserve them, without ever intending them to be 'legitimate'.)

PS: We do have a local noreply-like address, for internal use. However, spammers don't seem to forge it on their messages, perhaps because it basically never appears on email we send to actual people and thus has never made it onto various spammer lists of email addresses here.

(All of the email that we send to people has real sender and reply addresses that are read by us, even if the mail is sent by automated systems.)

Tags: #Network

A DKIM signature on email by itself means very little

pubsub.slavino.sk / chris_spam · Sunday, 24 December, 2023 - 03:49 · 2 minutes

In yesterday's entry on what I think the SMTP Smuggling attack enables, I casually said that you were safe if you ignored SPF results and only paid attention to DKIM. As sometimes happens, this was me eliding some important qualifications that I just take as given when talking about DKIM, but that I should spell out. The most important qualification is that a (valid) DKIM signature by itself means almost nothing, which is a bit unlike how SPF works.

First off, anyone can DKIM sign a message, provided that they control a bit of DNS (you could probably even do it in a mail client). Quite a lot of people, including spammers, can even DKIM sign email that is 'aligned' with the 'From:' header, which means that the DKIM signature is from the From: domain, not just from some random domain. A valid DKIM signature does provide definite attribution, and if it's for the From: domain, it more or less identifies who authorized the mail. Also, in practice lack of a DKIM signature is itself a signal, because an increasing number of places more or less require a DKIM signature, sometimes one that is from the From: domain.

(However, some people only have SPF records and this can be deliberately used to create email that can't be easily forwarded.)

A valid DKIM signature for the From: domain is at least as strong a sign as an SPF pass result. However, this doesn't mean that the email is any good, any more than an SPF pass does; spammers can and do pass both checks. Similarly, lack of a valid DKIM signature for the From: domain doesn't mean that it's not from that domain. To have some idea of that you need to check the domain's DMARC policy. In effect, the equivalent of SPF is the combination of DKIM and DMARC (or something like it).

So when I casually wrote about (only) paying attention to DKIM, I was implicitly thinking of using DKIM along with something else to tell you when DKIM results matter. This might be specific knowledge of which of the important domains you deal with DKIM sign their email (including your own domain), or it might mean checking DMARC, or both. And of course you can ignore both SPF and DKIM signatures, apart perhaps from logging DKIM results.

(We don't explicitly use DKIM signatures and DMARC in our Exim configuration, but these days we use rspamd for spam scoring and I think it makes some use of DKIM and perhaps DMARC.)

Tags: #Network

What I think the 'SMTP Smuggling' attack enables

pubsub.slavino.sk / chris_spam · Saturday, 23 December, 2023 - 02:49 · 2 minutes

The very brief summary of SEC Consult's "SMTP Smuggling" attack is that under the right circumstances, it allows you (the attacker) to cause one mail server to 'submit' an email with contents and SMTP envelope information that you provide to a second mail server. To the second email server, this smuggled email will appear to have come from the first mail server (because it did), and can inherit some of the authentication the first mail server has.

(It's important to understand that the actual vulnerability is in the second mail server, not the first one; the first one can and often must be completely RFC compliant in its behavior.)

The obvious authentication that the smuggled email inherits is SPF, because that's based on the combination of the sending IP (the first mail server) and the SMTP envelope sender (and possibly message From:), which is under your control. So you can put in a SMTP envelope sender (and a From:) that claims to be 'from' the first mail server, and the second mail server will accept it as authentic.

(An almost as obvious thing is that the smuggled email gets to share in whatever good reputation the sending email server has with the receiver. This is most useful if you can get a big, high reputation mail system to be the first server, which is possible (or perhaps 'was' by the time you're reading this).)

If you forge email as being from something that has a DMARC policy that passes the policy if SPF passes, you can also get your forged email to pass DMARC checks. The same is true if the second email server happens to be something that imposes its own implicit DMARC-like policy that accepts email if SPF passes (and possibly requires that SPF is 'aligned' with the From: message address).
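To make the DMARC point concrete, here is a small Python sketch of the pass/fail logic: DMARC is satisfied by either an aligned SPF pass or an aligned valid DKIM signature. The function names and the 'bigmail.example' domain are made up for illustration, and the relaxed alignment test is a deliberate simplification (real DMARC compares organizational domains, which needs a public suffix list).

```python
def is_aligned(from_domain: str, auth_domain: str, strict: bool = False) -> bool:
    """Crude alignment test.

    Strict alignment needs an exact match. For relaxed alignment this sketch
    only checks that one domain is a subdomain of the other, which is a
    simplification of the organizational domain comparison DMARC specifies.
    """
    a = from_domain.lower().rstrip(".")
    b = auth_domain.lower().rstrip(".")
    if a == b:
        return True
    if strict:
        return False
    return a.endswith("." + b) or b.endswith("." + a)


def dmarc_passes(from_domain, spf_result, spf_domain, dkim_domains):
    """True if an aligned SPF pass or an aligned valid DKIM signature exists.

    dkim_domains is the list of d= domains whose signatures verified.
    """
    spf_ok = spf_result == "pass" and is_aligned(from_domain, spf_domain)
    dkim_ok = any(is_aligned(from_domain, d) for d in dkim_domains)
    return spf_ok or dkim_ok
```

A smuggled message that claims to be from 'bigmail.example' and arrives from that domain's server gets `dmarc_passes("bigmail.example", "pass", "bigmail.example", [])` as true with no DKIM signature at all. A receiver that ignored the SPF half of this would see nothing aligned and not treat the message as authenticated.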

What you can't fully do is inherit DKIM authentication. You can add your own valid DKIM headers to your smuggled email, but you can only do this for domains with DNS under your control (or domains where you've managed to obtain the DKIM signing keys). This probably doesn't include the first email server and its domain, and because the first email server doesn't recognize your smuggled email as an actual email message, it won't DKIM sign the email for you. The only way you can get the domain of the first email server to DKIM sign your second email for you is if the second email server is also an internal one belonging to the same domain and it will DKIM sign outgoing messages. This general configuration is reasonably common (incoming and outgoing email servers are often different), but usually they run the same mail software and so they won't have the different interpretations of the email message(s) that SMTP Smuggling needs.

The result of this is that if the second (receiving) email server doesn't check SPF results and only pays attention to DKIM (which is increasingly mandatory in practice), it's almost completely safe from SMTP Smuggling even if it accepts things other than 'CR LF . CR LF' as the email message terminator. Since SPF breaks things (also), this is what I feel you should already be doing.

Tags: #Network

The (historical) background of 'SMTP Smuggling'

pubsub.slavino.sk / chris_spam · Thursday, 21 December, 2023 - 03:55 · 5 minutes

The recent email news is SEC Consult's SMTP Smuggling - Spoofing E-Mails Worldwide (via), which I had a reaction to. I found the article's explanation of SMTP Smuggling a little hard to follow, so for reasons that don't fit within the scope of today's entry, I'm going to re-explain the central issue in my own way.

SMTP is a very old Internet protocol, and like a variety of old Internet protocols it has what is now an odd and unusual core model. Without extensions, everything in SMTP is line based, with the sender and receiver exchanging a series of 7-bit ASCII lines for commands, command responses, and the actual email messages (which are sent as a block of text in the 'DATA' phase, ie after the sender has sent a 'DATA' SMTP command and the receiver has accepted it). Since SMTP is line based, email messages are also considered to be a series of lines, although the contents of those lines are (mostly) not interpreted. SMTP needs to signal the end of the email text being transmitted, and as a line based protocol it does this by a special marker line; a '.' on a line by itself marks the end of the message.

(In theory there's a defined quoting and de-quoting process if an actual line of the message starts with a '.'; see RFC 821 section 4.5.2, which is still there basically intact in RFC 5321 section 4.5.2. In practice, actual mailer behavior has historically varied.)

When you have a line based protocol you must decide how the end of lines are marked (the line terminator). In SMTP, the official line terminator is the two byte (two octet) sequence 'CR LF', because this was the fashion at the time. This includes the lines that are part of the email message that is sent in the DATA phase, and so the last five octets sent at the end of a standard compliant SMTP message are 'CR LF . CR LF'. The first 'CR LF' is the end of the last line of the actual message, and then '. CR LF' makes up the '.' on a line by itself.

(This means that all lines of the message itself are supposed to be terminated with 'CR LF', regardless of whatever the native line terminator is for the systems involved. If you're doing SMTP properly, you can't just blast out or read in the raw bytes of the message, even apart from RFC 5321 section 4.5.2 concerns. There are various ESMTP extensions that can change this.)
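As a sketch of what 'doing SMTP properly' involves on the sending side, the following Python function turns a message stored with native line endings into what goes over the wire in the DATA phase. The function name is made up for illustration, and it is a minimal version: a stray CR in the middle of a line is passed through untouched, which a real mailer would need a policy for.

```python
def to_smtp_data(message: bytes) -> bytes:
    """Encode a message for the SMTP DATA phase.

    Lines are split on LF (with any trailing CR dropped), a leading '.' is
    doubled (RFC 5321 section 4.5.2), every line is terminated with CR LF,
    and the '.' end-of-data line is appended.
    """
    lines = message.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the message ended with a newline; no extra empty line
    out = []
    for line in lines:
        line = line.rstrip(b"\r")      # native LF or CR LF input both work
        if line.startswith(b"."):
            line = b"." + line         # dot-stuffing
        out.append(line + b"\r\n")
    return b"".join(out) + b".\r\n"
```

Whatever the input looked like, the output always ends with the five octets 'CR LF . CR LF' as long as the message has at least one line.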

Unfortunately, SMTP's definition makes life quite inconvenient for systems that don't use CR LF as their native line ending, such as Unix (which uses just LF, \n). Because SMTP considers the email message itself to be a sequence of lines (and there's a line length limit), a Unix SMTP mailer has to keep translating all of the lines in every email message it sends or receives back and forth between lines ending in \n (the native format) and \r\n (the SMTP wire format). Doing this translation raises various questions about what you should send if you encounter a \r (or a \r\n) in a message as you send it, or encounter a bare \n (or \r) in a message as you receive it. It also invites shortcuts, such as turning \r\n into \n as you read data and then dealing with everything as Unix lines.

Partly for this reason and partly because CR LF line endings make various people grumpy, there has been somewhat of a tradition of mailers accepting other things as line endings in SMTP, not just CR LF. Historically a variety of Unix mailers accepted just LF, and I believe that some mailers have accepted just CR. Even today, finding SMTP listeners that absolutely require 'CR LF' as the line ending on SMTP commands isn't entirely common (GMail's SMTP listener doesn't, for example, although possibly this will cause it to be unhappy with your email, and I haven't tested its behavior for message bodies). As a result, such mailers can accept things other than 'CR LF . CR LF' as the SMTP DATA phase message terminator. Exactly what a mailer accepts can vary depending on how it implemented things.

(For instance, a mailer might turn '\r\n' into '\n' and accept '\n' as a line terminator, but only after checking for a line that was an explicit '. CR LF'. Then you could end messages with 'LF . CR LF', without the initial 'CR'; the bare LF would be taken as the line terminator for the last data line, then you have the '. CR LF' of the official terminator sequence. But if you sent 'LF . LF', that wouldn't be recognized as the message terminator.)
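The hypothetical mailer in that parenthetical can be sketched in a few lines of Python. This is an illustration of the described behavior only (the function name is made up, and it skips dot-unstuffing), not the code of any real mailer.

```python
def read_data_lenient(stream: bytes):
    """Return (message_lines, rest) for a mailer that accepts LF line endings
    but only recognizes an explicit '. CR LF' line as the end of the message.

    Returns (None, b"") if no terminator is seen.
    """
    lines = []
    pos = 0
    while pos < len(stream):
        nl = stream.find(b"\n", pos)
        if nl == -1:
            break                       # incomplete final line
        raw = stream[pos:nl + 1]
        pos = nl + 1
        if raw == b".\r\n":             # checked before any CR LF -> LF rewriting
            return lines, stream[pos:]
        if raw.endswith(b"\r\n"):
            raw = raw[:-2] + b"\n"      # normalize to the native line ending
        lines.append(raw)
    return None, b""
```

Fed `b"last line\n.\r\n"` (that is, 'LF . CR LF'), this returns the one message line and an empty remainder. Fed `b"last line\n.\n"`, the '.' line is just treated as more message text and no end of message is found.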

This leads to the core of SMTP Smuggling, which is embedding an improper SMTP message termination in an email message (for example, 'LF . LF'), then after it adding SMTP commands and message data to submit another message (the smuggled message). To make this do anything useful we need to find a SMTP server that will accept our message with the embedded improper terminator, then send the whole thing to another mail server that will treat the improper terminator as a real terminator, splitting what was one message into two, sent one after the other. The second mail server will see the additional mail message as coming from the first mail server, although it really came from us, and this may allow us to forge message data that we couldn't otherwise.
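The disagreement between the two servers can be shown with two toy parsers in Python. Both functions and the 'first.example' and 'second.example' names are invented for this illustration, and the parsers are far simpler than real SMTP receivers; the point is only that the same bytes are one message to a strict reader and a message plus trailing SMTP commands to a lax one.

```python
def split_strict(data: bytes):
    """End of message is only 'CR LF . CR LF'."""
    body, sep, rest = data.partition(b"\r\n.\r\n")
    return (body, rest) if sep else (None, b"")


def split_lax(data: bytes):
    """Treat CR LF and bare LF alike, so 'LF . LF' also ends the message."""
    normalized = data.replace(b"\r\n", b"\n")
    body, sep, rest = normalized.partition(b"\n.\n")
    return (body, rest) if sep else (None, b"")


# What gets handed to the first (strict) server as a single message.
payload = (
    b"Subject: innocent\r\n\r\nhello\n.\n"
    b"MAIL FROM:<admin@first.example>\r\n"
    b"RCPT TO:<victim@second.example>\r\n"
    b"DATA\r\n"
    b"Subject: smuggled\r\n\r\nforged text\r\n.\r\n"
)

strict_body, strict_rest = split_strict(payload)   # one message, nothing left over
lax_body, lax_rest = split_lax(payload)            # message ends after 'hello'
```

Here `strict_rest` is empty and the 'MAIL FROM' text is simply part of the first message's body, while `lax_rest` begins with the 'MAIL FROM' command that the lax server would go on to process as a new SMTP transaction.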

(There are various requirements to make this work; for example, the second mail server has to accept being handed a whole block of SMTP commands all at once. These days this is a fairly common thing due to an ESMTP extension for 'pipelining', and also because SMTP receivers have to do extra work to detect and reject getting handed a block of stuff like this. See the original article for the gory details and an extended discussion.)

What you can do with SMTP Smuggling in practice has some limitations and qualifications, but that's for another entry.

Tags: #Network

The various meanings of DKIM signing message headers

pubsub.slavino.sk / chris_spam · Sunday, 5 November, 2023 - 02:34 · 2 minutes

When I talked about the issue of what headers to include in email DKIM signatures, I didn't really cover the specifics of how you DKIM sign email headers and what the various options mean. The specifics can matter, especially since they help you (me) understand and navigate through the options that mailers (such as Exim) offer here.

In email messages, DKIM signatures appear in a DKIM-Signature header, which lists a bunch of parameters:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed;
   d=list.zfsonlinux.org;
   h=from:to:subject:message-id:in-reply-to:references:date
   [....]

The 'h=' list (which isn't complete here) is a list of headers that have been signed. More specifically, it's a list of instances of headers. If there are multiple instances of a given header in a message, DKIM defines an order to them (RFC 6376 counts instances from the bottom of the header block upward) and the instances of the header are checked (or used) in that order. So if you include 'from' once in the DKIM header list, you are saying that your DKIM signature includes DKIM's first 'From:' header in the message. If a second 'From:' header is added to the message, it's not included in what's covered by your DKIM signature; it can have any value and the message will still pass DKIM validation.
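For poking at real messages, a minimal Python sketch for pulling the 'h=' list out of a DKIM-Signature header value might look like the following. The function names are made up, and stripping all whitespace from tag values is a simplification that is fine for 'h=' and 'd=' but not a complete RFC 6376 tag-list parser.

```python
def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            # Drop folding whitespace inside the value (a simplification).
            tags[name.strip()] = "".join(value.split())
    return tags


def signed_header_names(header_value: str) -> list:
    """Return the h= list in order, lowercased, keeping repeated names."""
    h = parse_dkim_tags(header_value).get("h", "")
    return [name.lower() for name in h.split(":") if name]
```

Applied to the example signature above (up to the point where it was cut off), `signed_header_names()` gives 'from', 'to', 'subject', 'message-id', 'in-reply-to', 'references' and 'date', in that order. Repeated names are kept because each repeat refers to a further instance of the header.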

As mentioned last time, including a header that doesn't exist in the DKIM signature signs its absence; if that header is then added to the message, the DKIM signature will become invalid. DKIM signing things that aren't there is sometimes called oversigning a header; you're not just signing what's present, you're also signing what's not. As a corollary of this, if you want to seal a message against having extra copies of some headers added, you can deliberately oversign existing headers. This is done by including their names an extra time in the h= list; the first time signs the existing header, and the second time signs that there's no second header. So if we wanted to make sure no one added a second 'From:' to a message, we'd sign 'h=from:from:[....]'.
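Why oversigning works can be seen from a sketch of how header instances get matched up with the names in 'h='. This Python function is an illustration with a made-up name, not a DKIM implementation; it follows RFC 6376 in consuming instances of a header from the bottom of the header block upward, and it pairs a name with None when no instance is left, which stands in for signing the header's absence.

```python
def select_signed_instances(headers, h_list):
    """Pair each name in h= with the header instance it covers.

    headers is a list of (name, value) tuples in message order, top to bottom.
    """
    remaining = {}
    for name, value in headers:
        remaining.setdefault(name.lower(), []).append(value)
    selected = []
    for name in h_list:
        values = remaining.get(name.lower(), [])
        # pop() takes the bottom-most instance that hasn't been used yet
        selected.append((name.lower(), values.pop() if values else None))
    return selected


original = [("From", "alice@example.org"), ("Subject", "hi")]
# Someone prepends an extra From: header after signing.
tampered = [("From", "evil@attacker.example")] + original
```

With `h_list = ["from", "subject"]` both `original` and `tampered` select exactly the same instances, so the added header is invisible to the signature. With `h_list = ["from", "from", "subject"]` the second 'from' is None for `original` but the attacker's value for `tampered`, so the signed data differs and verification would fail.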

One reason to oversign existing headers that should only appear once is that anyone who adds a second 'From:', 'Date:' or whatever to your message is probably up to no good. Another reason is that it's hard to predict which instance of the header a mail client will show to people reading the message, and there are probably some mail clients that will show the wrong instance of the header (the instance that isn't covered by your DKIM signature and so can be set to anything by an attacker).

This creates several options and decisions:

do you make it so that certain headers can't be added to the message later, like the List-* and Resent-* families, or allow them to be added later?
what headers do you sign if they're present? For example, should you sign Resent-* or List-* headers at all?
do you oversign some existing headers so that no additional copies can be added?

Based on a quick skim of email that I have handy, relatively few sources of mail seem to be oversigning existing headers. However, GMail does oversign at least some email for core headers like From: and Subject:. Since Google is one of the eight hundred pound gorillas of email, if they're doing it, people's DKIM signature validation is at least prepared to cope with this.
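A quick skim like this can be roughly automated. The following Python sketch (a hypothetical helper, not the tooling actually used here) reports which existing headers a message's first DKIM-Signature oversigns, by comparing how often each name appears in 'h=' with how often the header appears in the message. It does no signature verification at all.

```python
from collections import Counter
from email import message_from_string


def oversigned_headers(raw_message: str) -> dict:
    """Map header name -> number of extra instances signed beyond those present.

    Only looks at the first DKIM-Signature header, and only reports headers
    that exist in the message (signing a purely absent header is not counted).
    """
    msg = message_from_string(raw_message)
    sig = msg.get("DKIM-Signature")
    if sig is None:
        return {}
    h_value = ""
    for part in sig.split(";"):
        name, sep, value = part.partition("=")
        if sep and name.strip() == "h":
            h_value = "".join(value.split())   # drop folding whitespace
    signed = Counter(n.lower() for n in h_value.split(":") if n)
    present = Counter(k.lower() for k in msg.keys())
    return {n: c - present[n] for n, c in signed.items()
            if present[n] and c > present[n]}
```

For a message with one From: and one Subject: whose signature has 'h=from:from:subject:subject:to:date', this returns `{'from': 1, 'subject': 1}`.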

(I suspect that having two From:, Subject:, or so on headers trips enough spam detection systems that attackers don't normally do it.)

Tags: #Network

The issue of what headers to include in your DKIM signatures

pubsub.slavino.sk / chris_spam · Friday, 27 October, 2023 - 03:20 · 3 minutes

Increasingly, you have to sign your outgoing email messages with DKIM. When you use DKIM to sign things, in one sense you're signing an abstract 'email message', and in another, more concrete sense, you're signing the email body plus some of the email message headers. You might innocently think that the message headers to sign are standardized and obvious, but thanks to a recent discussion on the Exim mailing list, I've learned that neither is the case. Different mail systems may sign different sets of headers in ways that are more or less aggressive, and some of these ways have downstream effects.

(This is especially relevant to Exim, where the default configuration of what headers to sign is perhaps somewhat aggressive.)

A basic part of DKIM signing is that if a message doesn't have a particular header and you include it in the DKIM signature headers anyway, what you're doing is signing that there is no such header in the email; basically, the header is interpreted as having a null value. If someone adds the header later, it will have a non-null value and so fail the DKIM signature check. Signing nonexistent headers is important if you think that adding them would change the meaning of the message as people perceive it (or as they see it).

As far as what headers to include goes, RFC 6376 provides relatively little guidance in section 5.4 and then a big and somewhat questionable list in section 5.4.1. Some headers are in practice part of the meaning of the message as people reading it will perceive things; in this category I'd include From: (which is required anyway), Subject: and Date:, and probably To:, cc:, and Reply-To:, and in practice I'd roll in In-Reply-To and References and some others. Some headers will change the interpretation of the message body if modified so must be protected by the DKIM signature; this includes all MIME related headers.

But then you have headers that may or may not change what you see as the meaning of the message if they're added to it after your signature. In this category are both the Resent-* family of headers for resent messages and especially the List-* family of mailing list headers. In some environments, whether a message was sent directly to people or came through a (visible) mailing list matters, as does what mailing list; in those environments you probably want to include the List-* headers in your DKIM signatures. But in other environments, this is not critical and in fact your people may be sending messages to outside mailing lists and want this to not break the DKIM signatures of their messages so the post-mailing-list version of their email is still accepted by, for example, GMail.

(You can have a similar discussion about Resent-*. Maybe these headers should never be signed, maybe they should be signed only if they're present, and maybe they should always be signed so that if someone visibly resends a signed message, it no longer passes DKIM verification.)

Now that I'm aware of this issue, we're probably going to change away from the Exim default (which signs all of the section 5.4.1 headers, plus the MIME headers) to something where we definitely don't sign the List-* headers and probably don't sign the Resent-* headers.
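As a sketch of the kind of change involved, here is a small Python helper that derives a list of headers to sign by dropping the List-* and Resent-* families from a candidate list. The candidate list is illustrative only (an assumption for this example, not the exact Exim default or a verbatim copy of the RFC list), and the function name is made up; the real change would be made in the mailer's own DKIM signing configuration.

```python
# Illustrative candidate list (an assumption; check what your mailer actually signs).
CANDIDATE_HEADERS = [
    "From", "Reply-To", "Subject", "Date", "To", "Cc",
    "In-Reply-To", "References", "Message-ID",
    "MIME-Version", "Content-Type", "Content-Transfer-Encoding",
    "Resent-Date", "Resent-From", "Resent-To", "Resent-Cc",
    "List-Id", "List-Help", "List-Unsubscribe", "List-Subscribe",
    "List-Post", "List-Owner", "List-Archive",
]


def headers_to_sign(candidates, skip_prefixes=("list-", "resent-")):
    """Drop header families that later software may legitimately add."""
    return [h for h in candidates if not h.lower().startswith(skip_prefixes)]


# The value you would want to see in the h= tag of outgoing signatures.
h_tag = ":".join(headers_to_sign(CANDIDATE_HEADERS))
```

Passing `skip_prefixes=("list-",)` instead keeps the Resent-* headers signed, which corresponds to the 'definitely don't sign List-*, only probably don't sign Resent-*' position.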

PS: One of the reasons to not sign Resent-* and List-* headers is that in both cases, you can do resending and mailing lists without changing the headers at all. Breaking DKIM signatures if people actually do add headers thus only encourages them to not add the headers; since adding the headers is useful and nice, we shouldn't discourage people from doing so.

Tags: #Network

Having ClamAV reject email using the Malwarepatrol database seems unwise

pubsub.slavino.sk / chris_spam · Wednesday, 6 September, 2023 - 03:07 · 3 minutes

In practice, ClamAV is both a virus and malware recognition engine and a collection of malware signatures. ClamAV only comes with a limited set of signatures, so supplementing it with additional third party sources is popular (and perhaps almost essential). Often people use update tools and scripts to configure and fetch these additional signatures, such as Fangfrisch. One of the popular providers of third party signatures is Malware Patrol, who have a number of tiers of access, including a (free) tier for educational institutions. Since we are an educational institution, we signed up for this tier and added it to the configuration of the third party update script we were using at the time so that it would be part of our email anti-spam filtering (when we switched over to ClamAV from our prior solution). Well, we thought we'd added it; in fact we'd made a configuration mistake such that we were silently failing to fetch the Malware Patrol database. We only noticed and fixed this mistake when we switched to Fangfrisch for our third party updates.

Soon afterward, our logs started reporting rather a lot of Malware Patrol hits and some people here started complaining that email to them was being rejected. Investigation showed that the rejections were from Malware Patrol signatures and the ones we could decode had what I would call alarmingly broad text matches that they were looking for (Malware Patrol uses ClamAV's body-based signature content format, generally with just a string it's looking for).

(One reason we couldn't decode what some Malware Patrol signatures were matching was that the Malware Patrol data is updated frequently, with signatures regularly being removed.)

Malware Patrol is fairly open and unapologetic about these broad matches in an article called Whitelisting for Block Lists. They specifically say:

Malware Patrol’s #1 goal is to protect customers from malware and ransomware infections. These days, this can mean blocking mainstream domains. Consequently, our customers report potential false positives for sites like docs(.)google(.)com, drive(.)google(.)com, dropbox(.)com and github(.)com. Systems like Google Docs serve files from their root directories. This forces some block list formats to then block the entire domain, frustrating users.
[...]

Although Malware Patrol doesn't say this explicitly, it appears that the ClamAV database format is one such format that sometimes forces them to block entire domains like 'drive.google.com' (we observed this in one signature). They suggest filtering their database before using it, but this has a number of problems; the ClamAV format hex-encodes the ASCII bytes, for example, and on a larger scale it would mean we'd only be excluding things after people here had run into problems and reported them to us.
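To give an idea of what such filtering would involve, here is a Python sketch that decodes plain hex signatures and drops the ones that mention an allowed string. It assumes signature lines of the form 'Name:TargetType:Offset:HexSignature' and only handles signatures that are pure hex; the function names and the example signature names are invented, and this is not something we actually run.

```python
def decode_hex_signature(hex_sig: str):
    """Decode a plain hex body signature to bytes.

    Returns None if the signature contains wildcards or other non-hex syntax
    (such as '*', '??' or '{n}'), which this sketch does not handle.
    """
    try:
        return bytes.fromhex(hex_sig)
    except ValueError:
        return None


def filter_signatures(lines, allowed_substrings):
    """Yield signature lines whose decoded body does not mention an allowed string.

    Assumes 'Name:TargetType:Offset:HexSignature' lines; signatures that
    can't be decoded are kept unchanged.
    """
    for line in lines:
        fields = line.strip().split(":")
        decoded = decode_hex_signature(fields[3]) if len(fields) >= 4 else None
        if decoded is not None and any(s in decoded for s in allowed_substrings):
            continue
        yield line
```

Called with `allowed_substrings=[b"drive.google.com"]`, this would drop any plain signature whose matched text contains that host name. It does nothing about the larger problem, which is knowing what to put in the allow list before someone's email has already been rejected.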

I don't fault Malware Patrol for their choice. The balance between false positives and false negatives is not one with a clear single answer, and Malware Patrol seems to have come down on the side of not having false negatives, even at the cost of false positives. But it does mean that Malware Patrol's objectives and ours aren't in alignment, as we care more about avoiding (too many) false positives than we do about avoiding every last false negative.

Our resolution to this was to take Malware Patrol out of our third party ClamAV data sources. I'm sure there are situations where using their database as part of ClamAV screening makes sense, but my view is that if you're rejecting email based on ClamAV signature matches, you likely can't use Malware Patrol's data. It's too dangerous unless you have a quite high tolerance for false positives. Even in a system where a Malware Patrol signature match only contributed to a message's spam score, I think you could only really add a modest increase in the odds of the message being spam.

(As far as I know, ClamAV stops looking once it's found a signature and the order it checks signature databases isn't documented. This means there's no way to tell it to check signature databases you trust more before Malware Patrol.)

PS: I don't know how common it is to use ClamAV signature matches to reject email, but it is, for example, an obvious way to configure Exim, especially since Exim's malware scanning documentation does this in its example.

Tags: #Network

Email anti-spam (and really all anti-spam) is all heuristics now

pubsub.slavino.sk / chris_spam · Thursday, 31 August, 2023 - 01:08 · 2 minutes

On the Fediverse, I noted something:

This is my sad face when Spamhaus puts lists.ubuntu.com (185.125.189.65) in the SBL CSS. Something went wrong here. Well, several things, starting with Cantor & Siegel.

Back in the day, one of the things some people said about DNS blocklists in general and sometimes Spamhaus in particular was that they were opaque, capricious, and didn't actually validate what they were putting in their blocklists, so who knows what could wind up in there for who knows what reason. Those people would take this incident as a validation of their view.

(I was going to say that this was a long standing IP address used to send Ubuntu security announcements, but it looks like we only just started to get them from this IP, although the entire IP range is owned by Canonical.)

I have bad news for such people. This is what all email anti-spam systems are doing today. There are no effective anti-spam systems that are based only on sure positive signs of spam. Everything is an opaque black box full of heuristics and uncertainty, with hopefully occasional misfires that are hopefully not too spectacular. Sometimes people hand write rules and try to assess them, sometimes people take straightforward statistical approaches (eg, Bayesian scoring), and sometimes companies go for the complicated statistics that are generally known as 'Machine Learning' or these days 'AI' (in press releases, at least).

This is not an accident and it's not because people are lazy. It's because anti-spam isn't working against a blind natural phenomenon; instead, anti-spam is engaged in an iterated game against human driven spam. If there's a sure-fire signal of spam that can be used to reject or filter email, the humans driving spam are highly incentivized to get rid of it, and only the ones who are successful at that will survive.

This is simply one of the prices that spam exacts from us. We can no longer live in a world of certainty, where we can be confident that our anti-spam systems are right about things. And sometimes we'll see things that are so obvious (to us humans, on the spot, only having to look at this one incident) that they make us have sad faces.

(There's also the related issue that no one can afford to pay enough humans enough to constantly be evaluating and updating anti-spam rules and heuristics all of the time. All effective anti-spam systems have to operate partially automatically, and sometimes that will pass things that an alert human would not have.)

Tags: #Network

You should delete the 'User-Agent' header from outgoing email

pubsub.slavino.sk / chris_spam · Thursday, 29 June, 2023 - 02:41 · 2 minutes

We all know about the HTTP User-Agent header, which browsers and other web things send to web servers. The nominal purpose of this is covered in RFC 9110 section 10.1.5, and it's not terrible, but in practice websites have abused the header for years (if not decades) and the whole thing is a major mess (eg). A very long time ago, some mail clients decided that they'd advertise by adding an 'X-Mailer' header to email they sent, with their name in it. Somewhat more recently, various mail clients decided that they would do this using a 'User-Agent' header (sometimes in addition to an X-Mailer header); one common example is Thunderbird.

I have come to think that this is a bad idea and that you should configure your mail submission server to strip User-Agent (and probably also X-Mailer). First off, leaving this header in leaks information about your users to various people. With the way that the Internet has evolved, hiding this information is now the right answer, much like hiding user IPs turned out to be the right call. If you need to know client and device usage information for your own purposes, log the header value before you delete it (but understand that not all clients may add it in the first place).
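In a real mail system this is normally a line or two of mailer configuration, but the 'log it, then delete it' idea can be sketched in Python with the standard library's email package. The logger name and function name are arbitrary choices for this example. Note that it re-serializes the message, so it would have to run before any DKIM signing.

```python
import logging
from email import message_from_bytes
from email.policy import SMTP

log = logging.getLogger("submission")
STRIP_HEADERS = ("User-Agent", "X-Mailer")


def strip_client_headers(raw: bytes) -> bytes:
    """Log and remove client-identifying headers from a submitted message."""
    msg = message_from_bytes(raw, policy=SMTP)
    for name in STRIP_HEADERS:
        for value in msg.get_all(name, []):
            log.info("client header %s: %s", name, value)
        del msg[name]          # removes every instance; no error if absent
    return msg.as_bytes()
```

Header name matching in the email package is case-insensitive, so a client that sends 'user-agent:' in some other capitalization is handled as well.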

(This information leaks not just to the people who your users send email to, but also to the people who operate the receiving email servers. These days that often means Google and Microsoft.)

Second, with the way that the spam filtering landscape has evolved into an unpredictable mess based in large part on opaque signals, other people's mail servers may well decide that they don't like certain User-Agent values. If your people are using one of those mail clients (possibly authentically, unlike spam that forges such a User-Agent), their email will be less likely to get through. Since not everything provides a User-Agent field in the first place, I believe that stripping it out entirely is not likely to be harmful, especially by comparison.

(You might feel that using User-Agent in this way is morally wrong, but other mail servers don't care about your feelings and anyway they may not be explicitly looking at 'User-Agent' as such. They may well be just feeding everything in as barely classified text and letting some pile of math look for correlations, so any header and any header value or part of its value that has correlations will be used.)

In my view, giving other people's large and opaque mail systems fewer reasons to consider real email from your people to be spam is a good reason all by itself. The privacy benefits just tilt the situation even more toward removing any User-Agent header that mail clients may have added.

(As a corollary, it's long since past time that mail clients stopped adding this header. No one is paying attention to it and it's a little leak of private information.)

Tags: #Network