Deliverability problems with emails coming from microsoft [+fix/workaround in v0.0.13] #237
Comments
…incoming email from microsoft for issue #237
See https://www.xmox.nl/b/#22c8911bf3f768931d93f599b0eb03882d1c78e3 for a binary with the workaround.
Thanks to @mdavids for reporting.

Incoming deliveries were aborted by Microsoft's SMTP client during STARTTLS. I have an @outlook.com account and an Office 365 account, and testing showed that messages did not come through. I don't receive DSNs and don't have access to error logs, but mdavids got the error "451 4.4.0 Security status Renegotiate". An initial hunch that renegotiation was in play (which the Go TLS stack doesn't implement) didn't lead to a solution: the connections turned out to use TLS 1.3, which doesn't support renegotiation at all. After adding some TLS debugging and testing, the workaround is to disable session tickets in the Go TLS config for the server.

I started receiving TLS reports from Microsoft with "validation failure" errors (no further details) on Oct 24. They were about MTA-STS, which I checked without finding any issues. (One limitation of TLS reporting is that there's no way to classify an error as just "TLS error". It can only be classified as a TLSA (DANE) error or an MTA-STS error, which can put you on the wrong track when analyzing problems.)

Anyway, after looking at the TLS messages being exchanged, I noticed the "session ticket" message that the Go TLS library sends immediately after its "server finished" message. The TLS 1.3 RFC seems to say that such messages can only be sent after having read a "client finished" message, see https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.1. Maybe Microsoft recently updated its TLS stack to be stricter, rejecting connections when messages arrive at a moment they are not allowed. Probably something more is going on, since I would expect the impact to be larger and more widely known.
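The workaround described above can be sketched in a few lines of Go. This is a minimal illustration, not the actual mox change: the function name `smtpServerTLSConfig` is hypothetical, and only the `SessionTicketsDisabled` field of the standard library's `tls.Config` is taken as given. With that field set, the Go TLS server does not send session ticket messages.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// smtpServerTLSConfig returns a server-side TLS config with session tickets
// disabled. The function name is hypothetical; certs stands in for whatever
// certificates the SMTP listener is configured with.
func smtpServerTLSConfig(certs []tls.Certificate) *tls.Config {
	return &tls.Config{
		Certificates: certs,
		// Disables session ticket and PSK (resumption) support, so no
		// session ticket message is sent after the server's Finished.
		SessionTicketsDisabled: true,
	}
}

func main() {
	cfg := smtpServerTLSConfig(nil)
	fmt.Println("session tickets disabled:", cfg.SessionTicketsDisabled)
}
```

The trade-off is that clients can no longer resume TLS sessions, so every connection does a full handshake.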
I have TLS reports without failures from Microsoft on Oct 23, so that points to changes on the 24th.
For completeness: the TCP connection is aborted by the remote during the TLS handshake, after mox has written the "server finished" and "session ticket" messages. The remote doesn't write any bytes in response (no TLS alert or "client finished" message).
And TLS reports.
Errors on Oct 24:
And the logging of a failing session:
See golang/go#70232
The fix is in mox release v0.0.13, released just now.
For the record: https://list.mailop.org/private/mailop/2024-November/029764.html
… incoming deliveries and add an alerting rule if the failure rate becomes >10% (e.g. expired certificate). the prometheus metrics includes a reason, including potential tls alerts, if remote smtp clients would send those (openssl s_client -starttls does). inspired by issue #237, where incoming connections were aborted by remote. such errors would show up as "eof" in the metrics.
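The idea in this commit message (label each incoming TLS handshake result with a reason, then alert when more than 10% fail) can be sketched as follows. This is an illustration only: mox uses Prometheus metrics and an alerting rule, whereas the sketch uses a plain map as a stand-in, and the names `handshakeReason` and `failureRate` as well as the label values other than "eof" are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// handshakeReason maps a TLS handshake error to a short label, in the
// spirit of a metric's "reason" label. Hypothetical helper: a connection
// aborted by the remote without a TLS alert surfaces as "eof".
func handshakeReason(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, io.EOF), errors.Is(err, io.ErrUnexpectedEOF):
		return "eof"
	default:
		return "other"
	}
}

// failureRate returns the fraction of handshakes whose reason is not "ok".
func failureRate(counts map[string]int) float64 {
	total, failed := 0, 0
	for reason, n := range counts {
		total += n
		if reason != "ok" {
			failed += n
		}
	}
	if total == 0 {
		return 0
	}
	return float64(failed) / float64(total)
}

func main() {
	counts := map[string]int{}
	// Simulated handshake results: three successes, one remote abort.
	for _, err := range []error{nil, nil, io.EOF, nil} {
		counts[handshakeReason(err)]++
	}
	rate := failureRate(counts)
	fmt.Printf("failure rate %.2f, alert: %v\n", rate, rate > 0.10)
}
```

In a real deployment the threshold comparison would live in the monitoring system's alerting rule rather than in the server itself.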
…ssion tickets the field is optional. if absent, the default behaviour is currently to disable session tickets. users can set the option if they want to try if delivery from microsoft is working again. in a future version, we can switch the default to enabling session tickets. the previous fix was to disable session tickets for all tls connections, including https. that was a bit much. for issue #237
Incoming delivery attempts from microsoft are aborted (by them) during SMTP STARTTLS.
Current workaround is disabling session tickets in our TLS config. Commit upcoming, more details will follow.