Update: @hotmail.com, @msn.com and @live.com addresses also seem to be affected.
We got called in by a customer last week. He had a mysterious problem where all mails that were sent to an @outlook.com address would be rejected, while mails to other domains would be delivered just fine, and he couldn’t figure out a way to solve this.
It’s clear that Microsoft changed something in the DNS entries of the outlook.com domain around the 1st of September, and that this caused the problem with Domino. But this change might also have surfaced a problem in the way Domino looks up where to send a mail, a problem that would then be present in all current Domino versions.
Diagnosing the problem
To diagnose what the problem was, we (my colleague and I) started off by registering an email address in the Outlook.com domain. This is free, so that made it easy. We then sent a mail from the domain of the customer to this mailbox. The message we saw in the Domino console was:
Client not authenticated to send mail when sending mail to @outlook
Of course, mail servers don’t authenticate to each other, so the problem had to be that we didn’t end up at the correct server. Domino tried to send the message to 220.127.116.11. When we checked with MX Toolbox what the correct IP address was, we saw a 104.47.x.x address. This differs from today’s entry, which can be seen below. That’s already an indication that Microsoft has been playing around with their DNS entries.
So where did the 18.104.22.168 come from? If we look at the A-records of outlook.com, we see this list:
Name: outlook.com Addresses: 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52
So there’s our IP address, but that’s not according to the protocol. Domino should have done a lookup on the MX record, received the outlook-com.olc.protection.outlook.com entry, and then it should have done an A-record lookup on that entry. That would have given a long list of IP addresses in the 52.101.x.x range. Apparently, the IPs that got returned on the A-record lookup of outlook.com do have a mail server listening on them, but those are meant for mail clients sending mail off to other domains.
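The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the standard behaviour from RFC 5321 (MX first, then the A-records of the MX host, with the domain's own A-record used only when no MX record exists at all), not Domino's actual code. The record tables, the preference value and the IP addresses are made up for the example; the addresses come from the documentation ranges and are not Microsoft's.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical DNS data standing in for real answers (assumption, not real records).
MX_RECORDS: Dict[str, List[Tuple[int, str]]] = {
    "outlook.com": [(5, "outlook-com.olc.protection.outlook.com")],
}
A_RECORDS: Dict[str, List[str]] = {
    # Addresses on the bare domain: in the article these turned out to be client-facing.
    "outlook.com": ["192.0.2.10", "192.0.2.11"],
    # Addresses of the MX host: this is where server-to-server mail belongs.
    "outlook-com.olc.protection.outlook.com": ["198.51.100.1", "198.51.100.2"],
    # A domain without any MX record.
    "example.org": ["203.0.113.5"],
}


class TemporaryDnsError(Exception):
    """Raised by a lookup function when the DNS query fails temporarily."""


def lookup_mx(domain: str) -> List[Tuple[int, str]]:
    return list(MX_RECORDS.get(domain, []))


def lookup_a(host: str) -> List[str]:
    return list(A_RECORDS.get(host, []))


def delivery_targets(
    domain: str,
    mx_lookup: Callable[[str], List[Tuple[int, str]]] = lookup_mx,
    a_lookup: Callable[[str], List[str]] = lookup_a,
) -> List[str]:
    """Return the IP addresses to try for a recipient domain, in order."""
    try:
        mx_hosts = mx_lookup(domain)
    except TemporaryDnsError:
        # A failed MX lookup means "queue and retry later",
        # not "use the domain's A-record instead".
        return []
    if not mx_hosts:
        # No MX record at all: the domain itself acts as an implicit MX.
        return a_lookup(domain)
    targets: List[str] = []
    for _preference, host in sorted(mx_hosts):
        targets.extend(a_lookup(host))
    return targets


if __name__ == "__main__":
    print(delivery_targets("outlook.com"))
    print(delivery_targets("example.org"))
```

With this sample data, outlook.com resolves to the addresses of the MX host and never to the addresses on the bare domain. The A-record of the domain is only used for example.org, which has no MX record.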
So, to summarize: Domino was contacting the wrong IP addresses to deliver mail for the outlook.com domain. It ended up at IP addresses that are meant for mail clients. Domino should have tried to deliver the mail to the IP addresses that are listed for the outlook-com.olc.protection.outlook.com DNS entry. It’s clear that Microsoft changed something that caused this behaviour, but I can’t exclude the possibility that Domino’s behaviour is also at fault here (check the 2nd update for an explanation of why it’s very unlikely that this is Domino’s problem).
Fixing the problem
“Fixing” in this regard meant coming up with a workaround. Microsoft and/or HCL are the only ones who could truly “fix” this problem. But luckily we could come up with a workaround by using good old Foreign SMTP documents. You will need 2 of these Foreign SMTP Domain documents. One for outlook.com and one for the rest.
The relevant section for the generic Foreign SMTP Domain would look something like this:
The section for the special outlook.com Foreign SMTP Domain would look like this.
Next to these, you also need two SMTP Connection documents to connect these Foreign SMTP Domains.
With this change, we force Domino to send any mail for the outlook.com address through outlook-com.olc.protection.outlook.com and that’s exactly where Domino should deliver the mails to.
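Conceptually, the workaround is a static routing override that takes precedence over the DNS lookup for one domain. The sketch below is a mental model only, not how Domino implements Foreign SMTP Domain or Connection documents; the `OVERRIDES` table and the `next_hop` function are hypothetical names invented for this illustration.

```python
from typing import Dict, Optional

# Hypothetical override table: recipient domain -> host to relay through.
OVERRIDES: Dict[str, str] = {
    "outlook.com": "outlook-com.olc.protection.outlook.com",
}


def next_hop(recipient: str, overrides: Dict[str, str] = OVERRIDES) -> Optional[str]:
    """Return the forced relay host for a recipient, or None for normal DNS routing."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    return overrides.get(domain)


if __name__ == "__main__":
    print(next_hop("someone@outlook.com"))
    print(next_hop("someone@example.org"))
```

The weakness of this model is also visible here: the relay host is hard-coded, so it silently goes stale if the real MX record ever changes.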
The workaround we created above is a temporary workaround, which should be removed as soon as possible. After all, we never know if Microsoft decides to suddenly change their MX record to a different address in which case the workaround would break.
As my ISP at home blocks port 25 for outgoing traffic and I can’t have a PTR record for my IP address, I can’t personally test whether the situation has changed and the problem has resolved itself. I would have to do this at customer sites, but for obvious reasons, I don’t want to test on their live production systems.
While we were working on this problem for this customer, I also created a post in the HCL Domino forum to see if more people experienced the same problem. They did, which is how I know that the problem occurs in all versions of Domino. An interesting solution that one person found was to use a different DNS server. They changed it to 184.108.40.206 and that solved their problem. This confirms our suspicion that Microsoft must have done something with the DNS records for the outlook.com domain. Nevertheless, it seems that Domino was the only mail system affected, so I think it would be wise for HCL to have a look at the router task’s code for sending mail to the internet, to see whether it adheres to all the current standards.
I found a website where you can see the DNS history of a site. This shows that the MX record for Outlook.com has been stable for 6 years already, pointing at outlook-com.olc.protection.outlook.com. The A-record for that address, however, is a different story. Its history shows no fewer than 7 changes since the 1st of September, as part of a very long list of ongoing changes. And these aren’t just changes that add or remove a single IP: the long list of IP addresses in the 52.101.x.x range, as it is now, is repeatedly replaced in its entirety by just 2 addresses in the 104.47.x.x range, which themselves change often.
This doesn’t solve the mystery of how Domino came up with a 52.96.x.x address to send the mail to, but it does show that Microsoft has been playing around with the outlook.com MX-record. Not with the record itself, but with the IP addresses where that record ends up eventually.
It looks like Domino isn’t the only sending mail server that runs into problems with the Outlook.com domain. This interesting article shows that the problem might be that there are too many IP addresses linked to outlook-com.olc.protection.outlook.com for some DNS resolvers to parse. It also gives an indication of why changing DNS servers might solve the problem: some DNS servers only return 2 IP addresses in the 104.47.x.x range instead of the 56 addresses that other DNS servers return. The explanation for Domino’s strange behaviour could be a fallback mechanism: if an error occurs in the lookup of the MX record, Domino might make one final attempt to deliver the mail to the addresses belonging to the A-record of the domain. This is pure speculation though.
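One plausible reason why 56 addresses could be a problem where 2 are not is the classic 512-byte limit on DNS responses over UDP when the resolver does not use EDNS0. That this limit is the actual cause is my assumption; it is not confirmed by the article or by HCL. The back-of-the-envelope calculation below estimates the size of an A-record response, assuming every answer uses a 2-byte compression pointer for the name and that the response has no additional sections.

```python
UDP_LIMIT = 512  # classic maximum DNS message size over UDP without EDNS0
HOST = "outlook-com.olc.protection.outlook.com"


def encoded_name_length(name: str) -> int:
    """Length of a domain name in DNS wire format (uncompressed)."""
    labels = name.rstrip(".").split(".")
    # Each label costs one length byte plus its characters; one zero byte ends the name.
    return sum(1 + len(label) for label in labels) + 1


def a_response_size(name: str, address_count: int) -> int:
    """Estimated size in bytes of a response carrying address_count A-records."""
    header = 12
    question = encoded_name_length(name) + 4  # name + QTYPE + QCLASS
    # Per answer: name pointer (2) + TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2) + IPv4 (4)
    per_answer = 2 + 2 + 2 + 4 + 2 + 4
    return header + question + address_count * per_answer


if __name__ == "__main__":
    for count in (2, 56):
        size = a_response_size(HOST, count)
        verdict = "fits in" if size <= UDP_LIMIT else "exceeds"
        print(f"{count} addresses: {size} bytes, {verdict} {UDP_LIMIT} bytes")
```

Under these assumptions a response with 2 addresses is 88 bytes, while one with 56 addresses is 952 bytes. A resolver that cannot handle a truncated response, or cannot retry over TCP, would then fail on the long list but not on the short one.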
What we can take from this is that Microsoft is currently really messing around with the DNS records of some of their consumer mail domains (@hotmail.com, @msn.com and @live.com seem to be having problems too). This also shows when we look at the MX health of the domain (thanks to Lars Berntrop-Bos for pointing that out).