Resolved: After switching to a new master node, HTTP routes and TLS certificates no longer work

NethServer Version: 8
Module: TLS and HTTP-Routes

Hi Team,

I added a new node to my existing cluster.
Then I moved all apps to the new node and made it the master node.
Everything went fine so far.

But if I now shut down the worker node, I get these error messages when I open the HTTP Routes or TLS Certificates module. Email authentication also stops working.

What did I do wrong?

Many thanks for your help
Benjamin

Do the certificates of the worker node or the apps still exist?

Is there something in the logs?

Let’s try to list the certificates on the CLI:

api-cli run module/traefik1/list-certificates | jq

The certificates are listed only on traefik2.

Could it be that I need a secondary Samba service?
I noticed it is only running on the old node and, for whatever reason, is not shown as an installed app.

Update:
I have now installed Samba on the new master node, but nothing changed.

I found nothing useful in the log file:

2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/cluster/task/3fa10136-f46c-480c-b8cd-49e3663fc5e5/status HTTP/2.0” 200 1197 “-” “-” 49400 “cluster-admin-https@file” “http://127.0.0.1:9311” 2ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/cluster/task/3fa10136-f46c-480c-b8cd-49e3663fc5e5/status HTTP/2.0” 200 1197 “-” “-” 49401 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “POST /cluster-admin/api/module/traefik1/tasks HTTP/2.0” 404 109 “-” “-” 49402 “cluster-admin-https@file” “http://127.0.0.1:9311” 2ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “POST /cluster-admin/api/module/traefik2/tasks HTTP/2.0” 201 447 “-” “-” 49403 “cluster-admin-https@file” “http://127.0.0.1:9311” 3ms
2026-02-15T22:58:40+01:00 [2:traefik2:agent@traefik2] task/module/traefik2/6323528c-c297-4818-9b09-828c4c21cb04: list-routes/20readconfig is starting
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/context HTTP/2.0” 200 455 “-” “-” 49404 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/context HTTP/2.0” 200 455 “-” “-” 49405 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/context HTTP/2.0” 200 455 “-” “-” 49406 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/context HTTP/2.0” 200 455 “-” “-” 49407 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:agent@traefik2] task/module/traefik2/6323528c-c297-4818-9b09-828c4c21cb04: action “list-routes” status is “completed” (0) at step validate-output.json
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/status HTTP/2.0” 200 625 “-” “-” 49408 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
2026-02-15T22:58:40+01:00 [2:traefik2:traefik] 192.168.0.50 - - [15/Feb/2026:21:58:40 +0000] “GET /cluster-admin/api/module/traefik2/task/6323528c-c297-4818-9b09-828c4c21cb04/status HTTP/2.0” 200 625 “-” “-” 49409 “cluster-admin-https@file” “http://127.0.0.1:9311” 1ms
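
An access log like the one above is easier to read when it is reduced to the failed requests. The following is a minimal sketch, not an official NS8 tool: the function name `failed_requests` is made up for illustration, and it assumes the status code is the whitespace-separated field that follows the protocol token (`HTTP/2.0` in this excerpt).

```shell
# Print access-log lines whose HTTP status is 400 or higher.
# Assumption: the status code is the field right after the protocol
# token (HTTP/1.1, HTTP/2.0, ...), as in the excerpt above.
# Lines without such a token (for example agent messages) are skipped.
failed_requests() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i ~ /^HTTP\//) {
        if ($(i + 1) + 0 >= 400) print
        break
      }
    }
  }' "$@"
}
```

Called as `failed_requests traefik.log` (the file name is hypothetical) or with the log piped to standard input. Applied to the excerpt above, the only line it keeps is the `POST /cluster-admin/api/module/traefik1/tasks` request, which returned 404.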

Did you upload a custom cert or are you using LE certificates?

Is there an error when listing the certs of traefik1?

On the worker node, are there still traefik configurations?

runagent -m traefik1 ls configs

I’m only using LE certificates of type automatic. I think it is not related to the certificates themselves but rather to some configuration of these two modules (certificates and HTTP routes).

On the new master node:

[root@idefix home]# api-cli run module/traefik1/list-certificates | jq

{
  "certificates": []
}

[root@idefix cur]# runagent -m traefik1 ls configs
runagent: [FATAL] Cannot find module traefik1 in the local node

On the old node:

[root@asterix ~]# api-cli run module/traefik1/list-certificates | jq
AuthenticationError: invalid username-password pair or user is disabled.
[root@asterix ~]# runagent -m traefik1 ls configs
_api.yml _default_cert.yml _http2https.yml cluster-admin.yml samba1-amld.yml

Thanks, I could reproduce the issue.
I found that a bug has already been filed: “HTTP routes error if a node goes offline” (Issue #7842, NethServer/dev on GitHub).

In my test, after removing the worker node, the leader showed the certificates and routes again.

Ok, should I try it?

Please check first whether you can reproduce the issue by listing the routes of the worker node’s Traefik instance on the leader:

~]# api-cli run module/traefik1/list-routes
TaskSubmissionCheckFailed: Client "module/traefik1" was not found

…and it’s always good to have a backup.

On which node should I run that command, and how do I back up correctly?

On the leader node.

You could use the NS8 backup to back up the apps of the node, or back up the whole node at the virtualization layer, for example with PBS for Proxmox.

[root@idefix cur]# api-cli run module/traefik1/list-routes
["cluster-admin", "samba1-amld"]

Are both nodes running now? For the test the worker node should be down.

Yes, both were running. Now I get:

[root@idefix cur]# api-cli run module/traefik1/list-routes
TaskSubmissionCheckFailed: Client “module/traefik1” was not found

OK, thanks. That confirms it’s the same issue.

If Linux4all’s issue is the same one described in the bug, can we change the topic category to Bug?

Again, many thanks, @mrmarkuz. After removing the old node, everything worked as expected.
Your quick help is really appreciated.
