NS8: fail to add node

Hi guys, I am trying to add a node to a NethServer 8 cluster.
The operating system is AlmaLinux 9 (updated), a fresh install on 2 virtual machines on Proxmox.
I created the first node without problems. When I add the second node, the procedure completes fine, but in the cluster console the second node is always offline.
The virtual machines have static IPs.
What can I check to find the mistake?
Any suggestions?
Thanks

Thanks for testing NS8!
Can you reach/ping the NS8 nodes from each other? Are the hostnames resolvable by DNS?
Did you check /var/log/messages for errors?
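
A minimal sketch of those three checks, to run on each node against the other one. The host name in the usage comments is hypothetical, and the error pattern is only a rough filter, not something NS8 prescribes:

```shell
#!/usr/bin/env bash

# check_node HOST: report whether HOST resolves and answers a single ping.
check_node() {
    local host="$1"
    if getent hosts "$host" >/dev/null; then
        echo "$host: resolves"
    else
        echo "$host: does NOT resolve"
        return 1
    fi
    if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
        echo "$host: reachable"
    else
        echo "$host: NOT reachable"
        return 1
    fi
}

# count_errors: read log lines on stdin and print how many mention
# a traceback or an error (rough pattern, adjust as needed).
count_errors() {
    grep -c -E 'Traceback|Error' || true
}

# Usage (node2.example.org is a placeholder for the other node's FQDN):
#   check_node node2.example.org
#   count_errors < /var/log/messages
```

If name resolution fails here but ping by IP works, DNS is the first thing to fix.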


Hi @mrmarkuz, ping is surely OK. I need to check DNS (it's a NethServer 7 domain controller), so I can update DNS without problems.
Then I will check the log and give feedback ASAP.


Hi, I checked ping and the hostname (FQDN) is correct. I can ping the nodes.
In /var/log/messages I can find this:

May 15 12:38:51 nethhost01 cluster[740]: Traceback (most recent call last):
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/local/agent/pyenv/lib64/python3.9/site-packages/aioredis/connection.py", line 803, in disconnect
May 15 12:38:51 nethhost01 cluster[740]: self._writer.close() # type: ignore[union-attr]
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/streams.py", line 353, in close
May 15 12:38:51 nethhost01 cluster[740]: return self._transport.close()
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/selector_events.py", line 698, in close
May 15 12:38:51 nethhost01 cluster[740]: self._loop.call_soon(self._call_connection_lost, None)
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/base_events.py", line 751, in call_soon
May 15 12:38:51 nethhost01 cluster[740]: self._check_closed()
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/base_events.py", line 515, in _check_closed
May 15 12:38:51 nethhost01 cluster[740]: raise RuntimeError('Event loop is closed')
May 15 12:38:51 nethhost01 cluster[740]: RuntimeError: Event loop is closed
May 15 12:38:51 nethhost01 cluster[740]: Error executing task get-info on node/3 Client "node/3" was not found
May 15 12:38:51 nethhost01 node[741]: task/node/1/900c28da-79a7-49bd-8e8b-3b01b85b3653: get-info/20read is starting
May 15 12:38:51 nethhost01 node[741]: task/node/1/900c28da-79a7-49bd-8e8b-3b01b85b3653: action "get-info" status is "completed" (0) at step validate-output.json

The status of the dashboard is this:

If I try to remove the node, the dashboard loops like this:

I tried disabling firewalld on the nodes and disabling IPv6, but without success.
Any other suggestions?
Thanks

On the second node (worker) I don't see a WireGuard interface. Is that normal?

I think you found the culprit.
It's an upstream SELinux bug: 2149452 – wireguard-tools-1.0.20210914-2 won't start with selinux enable after upgrading to 9.1
It should be fixed in RHEL 9.2.
In the meantime, try the workaround suggested in the issue:

semanage permissive -a wireguard_t
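
To confirm the workaround took effect and that a WireGuard interface then shows up, something like the sketch below may help. The helper names are made up for illustration, and it assumes `semanage permissive -l` prints one type per line:

```shell
#!/usr/bin/env bash

# is_permissive_domain DOMAIN: read `semanage permissive -l` output on stdin
# and succeed if DOMAIN appears on a line of its own.
is_permissive_domain() {
    grep -q -x -E "[[:space:]]*$1[[:space:]]*"
}

# list_wireguard_ifaces: print the names of existing WireGuard interfaces,
# one per line (prints nothing if there are none).
list_wireguard_ifaces() {
    ip -o link show type wireguard 2>/dev/null | awk -F': ' '{print $2}'
}

# Usage (as root, on the node that stays offline):
#   getenforce
#   semanage permissive -l | is_permissive_domain wireguard_t \
#       && echo "wireguard_t is permissive"
#   list_wireguard_ifaces
```

If the interface list is still empty after setting the domain to permissive, the VPN service probably needs a restart or the node needs a reboot.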

Maybe. I will try a fresh install from the 9.2 ISO and give feedback. :wink:

My friends, if you use AlmaLinux, install from the 9.2 version.
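
A quick way to check which release a node is actually running before adding it to the cluster. This is just a sketch: the function names are hypothetical, and 9.2 as the minimum comes from this thread, not from official requirements:

```shell
#!/usr/bin/env bash

# os_version_id [FILE]: print VERSION_ID from an os-release file
# (defaults to /etc/os-release). Sourced in a subshell to keep the
# current shell's variables untouched.
os_version_id() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$VERSION_ID" )
}

# version_at_least HAVE WANT: succeed if HAVE >= WANT (version sort).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   if version_at_least "$(os_version_id)" 9.2; then
#       echo "release is 9.2 or newer"
#   else
#       echo "release is older than 9.2, the WireGuard/SELinux bug may apply"
#   fi
```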


We have a solution, @giacomo.
Very good.
Thanks.


We are going to update the images as soon as Rocky releases 9.2.
