NS8 knows what certificates it’s obtained, and it knows when they expire. At a minimum it should raise a warning in the /cluster-admin pages, and really should send out an email, if
Renewal is failing for some reason (and, of course, describe that reason), or
A cert is due to expire in under 30 days (which would likely result from the above).
I admit I’m assuming this, based on the statement in that thread that his cert expired a day ago, and this is the first time he’s raised the issue.
I think it’s a good idea to get notified when certs are not working BEFORE the users are affected.
Traefik renews the certs automatically so I don’t know if there’s a hook to catch it.
Here is a first draft of a script that checks whether the certs from Traefik are valid and whether they expire in under 30 days:
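#!/bin/bash
# Check each certificate Traefik has obtained: is it served for its domain, and when does it expire?
ACMEPATH=/home/traefik1/.config/state/acme/acme.json
# List every main domain once; "[]?" skips resolvers that hold no certificates.
jq -r '.[].Certificates[]?.domain.main' "${ACMEPATH}" | sort -u | while read -r i; do
  # Fetch the served certificate once (with SNI) and reuse it for both checks.
  cert=$(openssl s_client -servername "${i}" -connect "${i}:443" </dev/null 2>/dev/null | openssl x509 2>/dev/null)
  if [[ -z "${cert}" ]]; then
    echo "${i} is NOT reachable on port 443 or served no certificate."
    continue
  fi
  cert_end=$(echo "${cert}" | openssl x509 -noout -enddate | cut -d "=" -f 2)
  days_left=$(( ($(date -d "${cert_end}" +%s) - $(date +%s)) / 86400 ))
  # The certificate counts as valid for the domain if the domain name appears in its text.
  if echo "${cert}" | openssl x509 -noout -text | grep -qF -- "${i}"; then
    if [[ ${days_left} -lt 30 ]]; then
      echo "${i} is valid but renewal doesn't work. Days left: ${days_left}"
    else
      echo "${i} is valid. Days left: ${days_left}"
    fi
  else
    echo "${i} is NOT valid. Days left: ${days_left}"
  fi
done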
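The script above has to reach every domain on port 443, which won't work for hosts that are only reachable internally. As a rough alternative sketch, the expiry date can be read from acme.json itself. This assumes each entry stores its PEM chain base64-encoded in the `certificate` field; the function names `days_until` and `acme_expiry_report` are made up for illustration, and `date -d` requires GNU date.

```shell
#!/bin/bash
# Sketch: report certificate expiry straight from acme.json, without connecting to port 443.
# Assumption: each entry keeps its PEM chain base64-encoded in the "certificate" field.

# Whole days from now (or from an optional epoch given as $2) until the date string in $1.
days_until() {
  local end_epoch now_epoch
  end_epoch=$(date -d "$1" +%s) || return 1
  now_epoch=${2:-$(date +%s)}
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Print "OK|WARNING <domain> <days left>" for each stored certificate.
acme_expiry_report() {
  local acme_file=$1 threshold=${2:-30} domain b64 not_after days
  jq -r '.[].Certificates[]? | "\(.domain.main) \(.certificate)"' "$acme_file" |
  while read -r domain b64; do
    # openssl x509 reads the first certificate in the chain, which is the leaf.
    not_after=$(echo "$b64" | base64 -d | openssl x509 -noout -enddate | cut -d= -f2)
    days=$(days_until "$not_after") || continue
    if (( days < threshold )); then
      echo "WARNING ${domain} ${days}"
    else
      echo "OK ${domain} ${days}"
    fi
  done
}
```

Something like `acme_expiry_report /home/traefik1/.config/state/acme/acme.json 30` could then run from a timer, with any WARNING lines passed on to whatever does the notifying.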
As we saw in the other thread, it will at least log that it tried and failed, though it doesn’t look like it logs any detail about why. I assume (and hope) it logs that info somewhere, but it doesn’t seem to go into the main system log, which makes troubleshooting a challenge.
Searching the logs for entries like the following should provide the failure reason:
2025-06-30T20:45:39+02:00 2025-06-30T18:45:39Z ERR Error renewing certificate from LE: {wiki.domain.tld []} error="error: one or more domains had a problem:\n[wiki.domain.tld] invalid authorization: acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: 1.2.3.4: Invalid response from http://wiki.domain.tld/.well-known/acme-challenge/63nMJK_Q5oeZci_U1cq-cRx6JZNKsdfafasdfasdf: 404\n" acmeCA=https://acme-v02.api.letsencrypt.org/directory providerName=acmeServer.acme
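To turn a line like that into something short enough for a warning or an email, a filter along these lines might do. It is only a sketch that assumes the log format is exactly as in the sample above; `acme_failures` is a made-up name, and it reads from stdin so it doesn't matter where the log lines come from.

```shell
#!/bin/bash
# Sketch: reduce Traefik renewal errors to "<domain> => <reason>".
# Assumption: the lines look like the sample above ("... from LE: {domain []} error="..." acmeCA=...").
acme_failures() {
  grep -F 'Error renewing certificate from LE' |
  sed -E 's/.*from LE: \{([^ }]+)[^}]*\} error="(.*)" acmeCA=.*/\1 => \2/'
}
```

For the sample line this yields `wiki.domain.tld => error: one or more domains had a problem:...` followed by the ACME error, including the 403 and the 404 on the challenge URL.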
We could also check if the right port is open and if DNS is set up correctly.
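A minimal sketch of those two checks, assuming `getent` and bash's `/dev/tcp` are available; the function names are made up. Port 80 is included because the sample error above comes from an HTTP challenge. Note that running this on the server itself only shows the ports answer locally, not that they are reachable from the internet through the firewall or router.

```shell
#!/bin/bash
# Sketch of two pre-checks. Function names are hypothetical.

# Succeeds if the name resolves; prints the resolved addresses, one per line.
dns_resolves() {
  getent ahosts "$1" | awk '{print $1}' | sort -u | grep .
}

# Succeeds if a TCP connection to host $1, port $2 opens within 5 seconds.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Run both checks for one domain and print a one-line verdict.
precheck() {
  local host=$1 port
  dns_resolves "$host" >/dev/null || { echo "$host: DNS lookup failed"; return 1; }
  for port in 80 443; do
    port_open "$host" "$port" || { echo "$host: port $port not reachable"; return 1; }
  done
  echo "$host: DNS and ports OK"
}
```

Calling `precheck wiki.domain.tld` before or after a failed renewal would at least rule out the two most common causes.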