Latest issue with LetsEncrypt

Turbond · June 14, 2025, 10:10pm

NethServer Version: ns8
Module: mail/letsencrypt
Hi, I have an issue with the renewal of the mail domain on ns8. It expired this morning and wouldn’t renew. I tried deleting it to get a new certificate (production system for work and about 22 users), but now I get

2025-06-15T09:11:57+12:00 [1:traefik1:traefik] 2025-06-14T21:11:57Z ERR Unable to obtain ACME certificate for domains error="unable to generate a certificate for the domains [autoconfig.xxx.info]: error: one or more domains had a problem:\n[autoconfig.xxx.info] invalid authorization: acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for autoconfig.xx.info - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for autoconfig.xx.info - check that a DNS record exists for this domain\n" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=["autoconfig.xx.info"] providerName=acmeServer.acme routerName=webtop1-autoconfig-https@file rule=Host(`autoconfig.xxx.info`)
2025-06-15T09:32:58+12:00 [1:traefik1:traefik] 2025-06-14T21:32:58Z ERR Unable to obtain ACME certificate for domains error="unable to generate a certificate for the domains [autoconfig.xx.info]: error: one or more domains had a problem:\n[autoconfig.xx.info] invalid authorization: acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for autoconfig.xxx.info - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for autoconfig.xx.info - check that a DNS record exists for this domain\n" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=["autoconfig.xx.info"] providerName=acmeServer.acme routerName=webtop1-autoconfig-https@file rule=Host(`autoconfig.xxx.info`)
2025-06-15T09:33:22+12:00 [1:traefik1:traefik] 2025-06-14T21:33:22Z ERR Unable to obtain ACME certificate for domains error="unable to generate a certificate for the domains [autoconfig.xxx.info]: error: one or more domains had a problem:\n[autoconfig.xxx.info] invalid authorization: acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for autoconfig.xxx.info - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for autoconfig.xxx.info - check that a DNS record exists for this domain\n" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=["autoconfig.xxx.info"] providerName=acmeServer.acme routerName=webtop1-autoconfig-https@file rule=Host(`autoconfig.xxx.info`)
2025-06-15T09:33:24+12:00 [1:traefik1:traefik] 2025-06-14T21:33:24Z ERR Unable to obtain ACME certificate for domains error="unable to generate a certificate for the domains [autoconfig.xxx.info]: error: one or more domains had a problem:\n[autoconfig.xxx.info] invalid authorization: acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for autoconfig.xxx.info - check that a DNS record exists for this domain; DNS problem: NXDOMAIN looking up AAAA for autoconfig.xxx.info - check that a DNS record exists for this domain\n" ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=["autoconfig.xxx.info"] providerName=acmeServer.acme routerName=webtop1-autoconfig-https@file rule=Host(`autoconfig.xxx.info`)
2025-06-15T09:33:35+12:00 [1:traefik1:traefik] 2025-06-14T21:33:35Z ERR Unable to obtain ACME certificate for domains error="unable to generate

etc. I don’t have an autoconfig specified in my subdomains and never had. Is this a cryptic rate limit error from Let’s Encrypt, or an improvement with the latest updates?

How can I resolve this manually so the production server doesn’t cause expired certificate errors?

Thanks

Turbond

Turbond · June 14, 2025, 10:14pm

Found it… Webtop has added a few new subdomains… so I wish it had told me this when it did its update. Which explains the error above.

What threw me was the GUI saying everything was fine… certificate obtained with the big green circle and white tick. Obviously this should have shown not obtained and then I would have located the error faster, as it was confusing having mail report outdated certificate, but the GUI saying all was fine. Can this please be looked into, or better yet a small GUI enhancement that allows viewing of the obtained certificates.
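Since the GUI status can disagree with what Traefik is actually doing, one way to see which names are failing is to pull them straight out of the log. The following is only a sketch, assuming log lines shaped like the excerpt above; `failing_names` is a made-up helper name, not an NS8 command.

```shell
# Sketch: list the unique names that failed ACME validation with NXDOMAIN.
# Assumes log lines shaped like the Traefik excerpt in this thread;
# "failing_names" is a hypothetical helper, not part of NS8.
failing_names() {
    # Read log text on stdin and print each name from
    # "NXDOMAIN looking up A for <name>" once.
    grep -o 'NXDOMAIN looking up A for [^ ]*' | awk '{print $NF}' | sort -u
}

# Example usage on a live system (the journalctl options are an assumption):
#   journalctl --grep acme --since today | failing_names
```

Any name printed here that was never configured by hand is a hint that a module added it.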

danb35 · June 14, 2025, 10:34pm

Interesting–other than autoconfig., do you know what they are? I haven’t seen errors on my system yet, but it’d be good to get ahead of it.

Turbond · June 14, 2025, 11:03pm

autodiscover. domain
autoconfig. domain
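To get ahead of this, the two names can be checked for DNS resolution before Let’s Encrypt tries them. This is a rough sketch: `example.org` and both helper names are placeholders, and `getent` asks the local resolver (including /etc/hosts), which may differ from the public DNS view that Let’s Encrypt uses.

```shell
# Sketch: check whether the extra Webtop names resolve.
# "example.org" and the helper names are placeholders, not from the thread.
webtop_names() {
    # Print the two names listed above for the mail domain given as $1.
    for prefix in autoconfig autodiscover; do
        printf '%s.%s\n' "$prefix" "$1"
    done
}

check_names() {
    # For each name on stdin, report whether the local resolver finds an address.
    while read -r name; do
        if getent ahosts "$name" >/dev/null 2>&1; then
            printf '%s resolves\n' "$name"
        else
            printf '%s does NOT resolve\n' "$name"
        fi
    done
}

# Example usage:
#   webtop_names example.org | check_names
```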

Just a heads up. I am now getting timeouts.
So that is Let’s Encrypt not happy with the retries. Looks like I’ll be busy telling people to ignore certificate errors until further notice.

What I’d really like is a way to request individual certificates on all the HTTPS routes, and to have these listed on the TLS Certificate GUI page with a renew button (force renew) in case the system breaks again. If anyone can point me to the GitHub repository I’ll have a go at fixing/breaking it, to get these features.

Please note:

Last login: Sun Jun 15 12:24:00 2025 from 192.168.3.8
[root@kea ~]# api-cli run module/traefik1/delete-certificate --data '{"fqdn":"mail.deleted_domain.co.nz","type":"internal"}'
Warning: using user "cluster" credentials from the environment
<3>Timeout after about 30 seconds. Certificate not obtained for ['mail.current_doamin.info', 'kea.current_domain.info', 'mail.other_current_domain.co.nz'].
<3>
false

This is the issue.

davidep · June 16, 2025, 7:36am

Hi @Turbond, thanks for reporting your experience in detail — we understand how frustrating it can be, especially on a production system.

UI limitations and certificate handling
The current UI does not yet clearly represent the changes introduced by the recent rework of Traefik configuration. This update introduced the use of a single default certificate with multiple server names (SNI) for services like Mail, NethVoice Proxy, and Ejabberd. While we’ve already released some minor improvements to reduce confusion, a major UI rework is planned. It will improve visibility over TLS certificates, including their associated server names and other relevant attributes, to help users better understand the current certificate status.
For now, the TLS Certificates page displays:
- All server names included in the default Traefik-generated certificate.
- Only the main server name of manually uploaded certificates — additional names are not shown.
- Certificates generated automatically by Let’s Encrypt through HTTP routes configuration are no longer listed on this page.

Webtop adding autoconfig/autodiscover names
Starting from Webtop version 1.4.2, additional server names like autoconfig.example.org and autodiscover.example.org are automatically registered only when the Settings page is saved. Pre-existing systems remain unaffected until that button is pressed. These names are optional: if they are not valid DNS names (as in your case), they simply fail ACME validation. This failure is harmless, unless a service explicitly requires those certificates (which is not typical).

delete-certificate timeout and challenge failures
The delete-certificate action likely fails because one or more of the listed server names (mail.current_doamin.info, kea.current_domain.info, etc.) fail the ACME HTTP challenge. If Loki is up and responding, yet a timeout occurs, it’s possible that port 80 is blocked or redirected by an intermediate device (like a firewall, proxy, or ISP router). Please verify that port 80 is open and properly forwarded to the NS8 node. Alternatively, consider switching to the TLS-ALPN-01 challenge, which operates over port 443. More details are available in the release notes from 2025-04-04. You may also try to increase the timeout of the delete-certificate action by adding "sync_timeout": 60, though ACME challenges typically complete much sooner.

Turbond · June 17, 2025, 7:48am

Hi Davide,

Thanks for the follow-up. When I changed the FQDN of the mail server, I got a new certificate, so a blocked port isn’t the issue.

I’m going to back up and then manually delete the all-in-one Traefik certificate (after I’ve downloaded a copy, as I’m wondering if there is corruption in the certificate files), then request it again and see if it works.

davidep · June 17, 2025, 8:15am

You’re right, this says the HTTP challenge can work. In this case, ensure the other server names are correctly resolved by public DNS.

First of all, I’d try the delete-certificate.

Run

  journalctl -f --grep acme &

Then run delete-certificate with "sync_timeout": 60.
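As a rough illustration of the second step, the payload could be built as below. This is a sketch under assumptions: it assumes "sync_timeout" goes in the same JSON object as "fqdn" and "type" (the thread does not show the exact placement), `delete_cert_payload` is a made-up helper, and `mail.example.org` is a placeholder FQDN. The `"type":"internal"` value is copied from the command shown earlier in the thread.

```shell
# Sketch: build a delete-certificate payload with a longer timeout.
# Assumption: "sync_timeout" sits next to "fqdn" and "type" in the JSON payload.
# "delete_cert_payload" is a hypothetical helper; the FQDN is a placeholder.
delete_cert_payload() {
    # $1 = certificate FQDN, $2 = timeout in seconds
    printf '{"fqdn":"%s","type":"internal","sync_timeout":%d}\n' "$1" "$2"
}

# Example usage, with the journalctl follower already running in the background:
#   api-cli run module/traefik1/delete-certificate \
#       --data "$(delete_cert_payload mail.example.org 60)"
```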

davidep · June 26, 2025, 7:36am

Hi @Turbond, I’m sorry, but my earlier statement that this failure is harmless is not true. We found that the failing certificate request for the autoconfig and autodiscover server names is also blocking other legitimate Let’s Encrypt requests. We’ll revert the Webtop automatic HTTP routes creation.

If you want to manually delete the offending HTTP route entries from the Webtop node run:

rm -v /home/traefik*/.config/state/configs/webtop*-auto{config,discover}.yml
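Before deleting anything on a production node, it may help to list what the pattern matches first. This is a minimal sketch: `list_auto_routes` is a hypothetical helper, and its default base directory simply mirrors the path in the `rm` command above.

```shell
# Sketch: list the Webtop autoconfig/autodiscover route files before deleting them.
# "list_auto_routes" is a hypothetical helper; the path mirrors the rm command above.
list_auto_routes() {
    # $1 = directory holding the traefik* home directories (default: /home)
    base=${1:-/home}
    for f in "$base"/traefik*/.config/state/configs/webtop*-autoconfig.yml \
             "$base"/traefik*/.config/state/configs/webtop*-autodiscover.yml; do
        # An unmatched glob stays literal, so print only paths that exist.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example usage:
#   list_auto_routes
```

If the list shows only the expected Webtop route files, the `rm -v` command above removes the same set.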

I changed the topic category to Bug, thank you again for raising the issue.

Turbond · June 26, 2025, 8:34am

Thanks for that…

I worked around it by changing mail server FQDN and by ensuring the webtop routes existed in my DNS A records. So got it working but I was a little grumpy telling everyone the new Mail server address.

Yes, Webtop is on the same node.
As an aside, I also have an expired certificate that can’t be updated as the domain name no longer exists, and I think this is also blocking, so I will manually remove it.

davidep · June 26, 2025, 2:29pm

You can try this workaround: