Hello,
I have two VirtualBox VMs, one leader and one worker, both running a clean Debian 12 install.
After joining, the worker node immediately goes offline in the admin panel. This is what I found in syslog:
2023-08-09T11:05:37.867897+02:00 NodeDebian agent@cluster[5337]: Traceback (most recent call last):
2023-08-09T11:05:37.868038+02:00 NodeDebian agent@cluster[5337]: File “/var/lib/nethserver/cluster/actions/join-node/30start_replication”, line 64, in
2023-08-09T11:05:37.868110+02:00 NodeDebian agent@cluster[5337]: cluster.vpn.initialize_wgconf(ip_address, listen_port, peer={
2023-08-09T11:05:37.868162+02:00 NodeDebian agent@cluster[5337]: File “/usr/local/agent/pypkg/cluster/vpn.py”, line 36, in initialize_wgconf
2023-08-09T11:05:37.868198+02:00 NodeDebian agent@cluster[5337]: peer_ep_address = socket.getaddrinfo(peer_hostname, peer_port, proto=socket.IPPROTO_UDP)[0][4][0]
2023-08-09T11:05:37.868242+02:00 NodeDebian agent@cluster[5337]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-08-09T11:05:37.868281+02:00 NodeDebian agent@cluster[5337]: File “/usr/lib/python3.11/socket.py”, line 962, in getaddrinfo
2023-08-09T11:05:37.869672+02:00 NodeDebian agent@cluster[5337]: for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
2023-08-09T11:05:37.869738+02:00 NodeDebian agent@cluster[5337]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-08-09T11:05:37.869780+02:00 NodeDebian agent@cluster[5337]: socket.gaierror: [Errno -2] Name or service not known
2023-08-09T11:05:37.936889+02:00 NodeDebian agent@cluster[5337]: task/cluster/51a774be-c186-4194-9592-794a3dba2a10: action “join-node” status is “aborted” (1) at step 30start_replication
Removing the “proto=socket.IPPROTO_UDP” argument from this line in /usr/local/agent/pypkg/cluster/vpn.py:
peer_ep_address = socket.getaddrinfo(peer_hostname, peer_port, proto=socket.IPPROTO_UDP)[0][4][0]
made the error go away.
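To check whether the failure comes from the proto hint or from name resolution itself, the getaddrinfo call from vpn.py can be tried by hand with and without the hint. This is only a minimal sketch: the helper name resolve_endpoint, the hostname leader.example.org and the port 55820 are placeholders chosen for illustration, not values taken from the logs.

```python
import socket


def resolve_endpoint(hostname, port, **hints):
    """Return the first resolved address, or the gaierror text on failure.

    Extra keyword arguments (proto=..., type=..., flags=...) are passed
    straight through to socket.getaddrinfo as resolver hints.
    """
    try:
        # Same indexing as in vpn.py: first result, sockaddr tuple, address.
        return socket.getaddrinfo(hostname, port, **hints)[0][4][0]
    except socket.gaierror as exc:
        return f"gaierror: {exc}"


if __name__ == "__main__":
    # Hypothetical values: replace with the leader's real hostname and port.
    host, port = "leader.example.org", 55820

    # Variant used by the unmodified vpn.py.
    print("with proto hint:   ", resolve_endpoint(host, port, proto=socket.IPPROTO_UDP))
    # Variant after removing the proto argument.
    print("without proto hint:", resolve_endpoint(host, port))
```

If both variants fail, the worker probably cannot resolve the leader's hostname at all, and the proto argument is unlikely to be the real cause.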
When I tried to join again, it unfortunately failed to connect.
Again, from syslog:
2023-08-09T12:42:39.892910+02:00 NodeDebian agent@cluster[488]: Leader response is successful: the new node ID is node/8!
…
2023-08-09T12:42:41.312262+02:00 NodeDebian firewalld[560]: ERROR: NAME_CONFLICT: new_service(): ‘ns-wireguard’
2023-08-09T12:42:41.318858+02:00 NodeDebian agent@cluster[488]: Error: NAME_CONFLICT: new_service(): ‘ns-wireguard’
2023-08-09T12:42:41.374372+02:00 NodeDebian agent@cluster[488]: task/cluster/193597fc-4035-4956-8d82-481a8cff4143: action “join-node” status is “aborted” (26) at step 20wgboot
I should mention that for the second try I uninstalled NS8 and installed it again. Strangely, I don’t see any VPN connection on the worker machine. Could WireGuard be the problem?
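The NAME_CONFLICT error above suggests that a firewalld service named ns-wireguard may already be defined, possibly left over from the first install. That is an assumption, not something the logs confirm. A rough sketch to check for it: the helper names are made up for illustration, while firewall-cmd --permanent --get-services is the standard firewalld command that prints a space-separated list of service names.

```python
import subprocess


def service_defined(name, services_output):
    """True if `name` appears in firewall-cmd's space-separated service list."""
    return name in services_output.split()


def list_permanent_services():
    """Return the output of `firewall-cmd --permanent --get-services`.

    Returns an empty string if firewall-cmd is missing or the call fails
    (for example when not run as root).
    """
    try:
        result = subprocess.run(
            ["firewall-cmd", "--permanent", "--get-services"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return ""
    return result.stdout


if __name__ == "__main__":
    if service_defined("ns-wireguard", list_permanent_services()):
        # A stale definition can be removed with:
        #   firewall-cmd --permanent --delete-service=ns-wireguard
        #   firewall-cmd --reload
        # Whether that is the right cleanup for NS8 is an assumption.
        print("ns-wireguard is already defined in the permanent firewalld config")
    else:
        print("ns-wireguard not found in the permanent firewalld config")
```

If the service is present before the join is started, the uninstall may not have cleaned up the firewalld configuration.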