On September 30th, 2021, some of our hosted Mender customers started reporting that they could no longer connect to the service.
When running the Mender client, a solution for managing software on IoT devices, errors similar to the following were printed on the command line:
Error code 12: server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
We used Let's Encrypt to issue and renew certificates for hosted Mender.
Let's Encrypt provides the infrastructure to issue trusted certificates, including managing the root certificates at the top of the chain. A root certificate is installed in the trust store of Internet-connected devices across the globe, so your Internet service can prove it is trusted because its certificate (most commonly, your web service certificate) has been signed, through a chain of certificates, by that root.
By default, Let's Encrypt certificates use a certificate chain that is cross-signed by the DST Root CA X3 root certificate, which just expired:
Validity
Not Before: Sep 30 21:12:19 2000 GMT
Not After : Sep 30 14:01:15 2021 GMT
OpenSSL 1.0.2 defaults to using the untrusted chain, that is, the chain supplied by the server, when validating certificates. This means that if the DST Root CA X3 root certificate exists in your system certificate store, all leaf certificates that chain up to it will fail verification.
At its core, the TLS client connecting to the service must update its root certificates. This would remove the offending root certificate and ensure a trusted root certificate used by Let's Encrypt is in place and certificate verification succeeds again.
How to do this depends on the operating system of the TLS client. For example:
Red Hat family: Update the ca-certificates package.
Ubuntu 16.04: Update OpenSSL to 1.0.2g-1ubuntu4.20
In a general Linux environment, you can blacklist the certificate with the id pkcs11:id=%c4%a7%b1%a4%7b%2c%71%fa%db%e1%4b%90%75%ff%c4%15%60%85%89%10
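The id in that pkcs11 URI is a sequence of percent-encoded bytes. To help confirm that you are blacklisting the right certificate, the sketch below converts it to the colon-separated hex form that tools such as `openssl x509 -text` print. It assumes the id corresponds to the certificate's Subject Key Identifier, which you should confirm on your own system; the function name `pkcs11_id_to_hex` is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def pkcs11_id_to_hex(uri):
    """Return the 'id' attribute of a pkcs11: URI as colon-separated hex."""
    scheme, _, rest = uri.partition(":")
    if scheme != "pkcs11":
        raise ValueError("not a pkcs11 URI")
    path = rest.split("?", 1)[0]  # ignore any query attributes
    for attr in path.split(";"):  # path attributes are ';'-separated
        key, _, value = attr.partition("=")
        if key == "id":
            return ":".join(f"{b:02X}" for b in unquote_to_bytes(value))
    raise ValueError("no id attribute in URI")
```

The resulting string can be compared by eye with the Subject Key Identifier line in the text dump of the DST Root CA X3 certificate found in the client's store.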
If you have just one TLS client to worry about, e.g. your own laptop, this solution would be fine.
But if you have customers across the globe, some of whom are experiencing this issue, how do you proceed? Even if you can reach all of them and tell them which steps to take, it may already be too late to remediate because they have already lost access to your service.
Issue a new TLS certificate for your service from a provider other than Let's Encrypt. This will enable all TLS clients to connect to your service again without any client-side modification. All you need to do is obtain and configure a new certificate in your service. There are many providers out there; simply do a web search for "purchase server certificate" to find one.
You can still continue to use Let's Encrypt down the line, but this will immediately restore service and buy you time to ensure all your customers' TLS clients are updated before you switch back to Let's Encrypt (as outlined above).
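After installing the replacement certificate, it can be useful to confirm from a client what the service is actually presenting. The sketch below is one way to do that with the Python standard library. The helper names are hypothetical, and `fetch_peer_cert` assumes the machine running it has an up-to-date trust store, since it performs a normally verified handshake.

```python
import socket
import ssl


def issuer_organization(cert):
    """Return the issuer's organizationName from a getpeercert() dict."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_peer_cert(host, port=443, timeout=10.0):
    # Hypothetical helper: connect, verify against the system trust store,
    # and return the server certificate as a dict. Raises
    # ssl.SSLCertVerificationError if verification fails.
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

For example, `issuer_organization(fetch_peer_cert("example.com"))`, with your own service's hostname in place of the placeholder, should no longer return "Let's Encrypt" once the new certificate is live.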
This is what we quickly did for all our hosted Mender users, to ensure all of our customers have a good experience and a recourse for fixing the issue with an over-the-air software update to their IoT devices.
This was a somewhat unfortunate event, caused by a combination of Let's Encrypt cross-signing its certificates and OpenSSL deciding to follow such chains to improve security (even though in this case it might not improve security). It was also not a surprise to Let's Encrypt, the OpenSSL community, or the OS vendors: various patches for it were released years ago.
The biggest lesson is therefore to always keep your system up to date with new software updates. Not only will this address vulnerabilities, but as you can see, it will also help you avoid downtime.
For IoT devices, we highly recommend using Mender to deploy system updates at least quarterly. Both Yocto Project and Debian family OSes are fully supported for robust system updates.