
On Oct 5, 2021, at 5:34 AM, Jürgen Botz via Outages <outages@outages.org> wrote:
OK, aside from the Facebook outage, and no doubt because of it, we effectively had an outages outage... messages to the list were massively delayed, causing lots of duplicate reports and a general failure to have any meaningful confirmations, etc.
Why? What happened? Well, outages is hosted on a server (puck.nether.net) with a couple dozen other lists, and some of these lists probably have members with email addresses at Facebook domains, and the SMTP server kept trying to look those up and having to wait for timeouts? Something like that, except that since there were no routes to Facebook's nameservers there shouldn't have been any need to wait for timeouts. Did anyone here from nether.net actually take a look?
It's more than idle curiosity; I'd like to make my mail servers more resilient to this kind of situation, and the outages list, of all lists, probably should be, too.
Sure, I’ve gone and tweaked a few more things. There’s a balancing act here between having a single system send out lots of concurrent mail to a server, e.g.:

60756540E50*   22760 Tue Oct 5 01:28:00  outages-bounces@outages.org
    (host itchy.cerento.com[199.190.154.20] said: 451 Only one recipient at a time (in reply to RCPT TO command))
        REDACTED@cerento.com

Some systems are explicitly configured to not be optimized (see above, then imagine how many people in parallel @gmail might get a message).

And there are many people with old subscriptions or domains that don’t exist anymore, which gets to be exciting when everything is going on at once:

    (connect to mail.bestii.com[172.96.180.81]:25: Connection timed out)
        redacted@bestii.com
    (connect to canadawebhosting.com[74.201.58.138]:25: Connection timed out)
        redacted@canadawebhosting.com
    (connect to muhpanel.ironusmaidenus.com[51.15.246.204]:25: Connection refused)
        redacted@ironusmaidenus.com
    (connect to mail.mailhost4.com[198.50.245.163]:25: Connection refused)
        redacted@mailhost4.com
    (connect to mail.etsms.com[50.232.238.69]:25: Connection timed out)
        redacted@etsms.com
    (connect to keyedupmedia.com[204.11.56.48]:25: Connection timed out)
        redacted@keyedupmedia.com
    (connect to metacloud.com[72.52.10.14]:25: Connection timed out)
        redacted@metacloud.com
    (connect to smtp.naturalwireless.com[52.119.91.194]:25: Connection timed out)
        redacted@naturalwireless.com
        redacted@naturalwireless.com
    (host s1.mail.pciwest.net[2604:2400:a::425] said: 450 4.7.1 Bad Attachment .com (in reply to end of DATA command))
        redacted@presys.com
    (delivery temporarily suspended: connect to mail.talueee.com[159.69.230.243]:25: No route to host)
        redacted@talueee.com
    (connect to mail-1.meridian-enviro.com[198.58.69.47]:25: Connection timed out)
        redacted@meridian-enviro.com
    (reason unavailable)

(All of these are from the same message, btw.)

I was away for personal reasons yesterday so missed out on all the fun.
I’ll check how this message goes out and see if the tweaks helped.

- Jared
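
For anyone wanting a quick summary of a deferred queue like the one quoted in this message, here is a minimal Python sketch that tallies the "connect to" failures by reason. The function name `tally_connect_failures` and the regular expression are made up for illustration; it assumes the reasons look like the parenthesized lines above and ignores anything else (for example the "host ... said:" replies).

```python
import re
from collections import Counter

# Matches deferral reasons of the form
#   connect to mail.example.com[192.0.2.1]:25: Connection timed out)
# as they appear inside the parenthesized lines of the queue listing.
# The pattern is an assumption based on the lines quoted in this thread.
CONNECT_RE = re.compile(
    r"connect to (?P<host>[^\[\s]+)\[(?P<addr>[^\]]+)\]:(?P<port>\d+): "
    r"(?P<reason>[^)]+)\)"
)


def tally_connect_failures(queue_text):
    """Return a Counter mapping each connection failure reason to its count."""
    return Counter(m.group("reason") for m in CONNECT_RE.finditer(queue_text))


if __name__ == "__main__":
    # Hypothetical sample using documentation-reserved hosts and addresses.
    sample = (
        "(connect to mail.example.com[192.0.2.1]:25: Connection timed out) "
        "a@example.com "
        "(connect to mx.example.net[198.51.100.7]:25: Connection refused) "
        "b@example.net"
    )
    for reason, count in tally_connect_failures(sample).most_common():
        print(f"{count:4d}  {reason}")
```

Many "Connection timed out" entries point at destinations that tie up a delivery slot until the connect timeout expires, which is where the trade-off between concurrency and timeouts comes in.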