Hi guys, I’m trying to add a node to a NethServer 8 cluster.
The operating system is AlmaLinux 9 (updated), freshly installed on two virtual machines on Proxmox.
I created the first node without problems. When I add the second node, the procedure completes fine, but in the cluster console the second node is always offline.
The virtual machines have static IPs.
What can I check to find the mistake?
Any suggestions?
Thanks
Thanks for testing NS8!
Can you reach/ping the NS8 nodes from each other? Are the hostnames resolvable by DNS?
Did you check /var/log/messages for errors?
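As a rough sketch of those three checks (name resolution, ping, log scan): the function names `check_node` and `scan_log` are made up for illustration, and the host name you pass in is whatever FQDN your other node uses.

```shell
#!/bin/sh
# Hypothetical helpers, not part of NS8: basic reachability checks between nodes.

# Check that a peer host name resolves and answers ping.
check_node() {
    host="$1"
    if ! getent hosts "$host" >/dev/null 2>&1; then
        echo "$host: does not resolve"
        return 1
    fi
    if ! ping -c 2 -W 2 "$host" >/dev/null 2>&1; then
        echo "$host: resolves but does not answer ping"
        return 1
    fi
    echo "$host: ok"
}

# Show the last 20 log lines mentioning an error or a Python traceback.
# Defaults to /var/log/messages; pass another file as the first argument.
scan_log() {
    grep -iE 'error|traceback' "${1:-/var/log/messages}" | tail -n 20
}
```

Run `check_node <fqdn-of-the-other-node>` on each node, then `scan_log` as root on the leader.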
Hi @mrmarkuz, ping is OK for sure. I need to check DNS (it’s a NethServer 7 domain controller), so I can update DNS without problems.
Then I’ll check the logs and give feedback ASAP.
Hi, I checked: ping works and the hostname (FQDN) is correct. I can ping the nodes.
In /var/log/messages I found this:
May 15 12:38:51 nethhost01 cluster[740]: Traceback (most recent call last):
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/local/agent/pyenv/lib64/python3.9/site-packages/aioredis/connection.py", line 803, in disconnect
May 15 12:38:51 nethhost01 cluster[740]: self._writer.close() # type: ignore[union-attr]
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/streams.py", line 353, in close
May 15 12:38:51 nethhost01 cluster[740]: return self._transport.close()
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/selector_events.py", line 698, in close
May 15 12:38:51 nethhost01 cluster[740]: self._loop.call_soon(self._call_connection_lost, None)
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/base_events.py", line 751, in call_soon
May 15 12:38:51 nethhost01 cluster[740]: self._check_closed()
May 15 12:38:51 nethhost01 cluster[740]: File "/usr/lib64/python3.9/asyncio/base_events.py", line 515, in _check_closed
May 15 12:38:51 nethhost01 cluster[740]: raise RuntimeError('Event loop is closed')
May 15 12:38:51 nethhost01 cluster[740]: RuntimeError: Event loop is closed
May 15 12:38:51 nethhost01 cluster[740]: Error executing task get-info on node/3 Client "node/3" was not found
May 15 12:38:51 nethhost01 node[741]: task/node/1/900c28da-79a7-49bd-8e8b-3b01b85b3653: get-info/20read is starting
May 15 12:38:51 nethhost01 node[741]: task/node/1/900c28da-79a7-49bd-8e8b-3b01b85b3653: action "get-info" status is "completed" (0) at step validate-output.json
The dashboard status is this:
If I try to remove the node, the dashboard loops like this:
I tried disabling firewalld on the nodes and disabling IPv6, but with no luck.
Other suggestions?
Thanks
On the second node (worker) I don’t see a WireGuard interface. Is that normal?
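One way to check for a WireGuard interface on a node, as a sketch: it relies only on `ip -o link show type wireguard` from iproute2, and the helper names `list_wg` and `report_wg` are invented for this example.

```shell
#!/bin/sh
# Hypothetical helpers: list WireGuard interfaces known to the kernel.

# Print one interface name per line; prints nothing if none exist.
list_wg() {
    ip -o link show type wireguard 2>/dev/null | awk -F': ' '{print $2}'
}

# Print a one-line summary; returns 1 when no interface is present.
report_wg() {
    ifaces=$(list_wg | tr '\n' ' ' | sed 's/ $//')
    if [ -z "$ifaces" ]; then
        echo "no WireGuard interface found"
        return 1
    fi
    echo "WireGuard interface(s): $ifaces"
}
```

Calling `report_wg` on both nodes shows quickly whether one of them is missing the interface.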
I think you found the culprit.
It’s an upstream SELinux bug: 2149452 – wireguard-tools-1.0.20210914-2 won't start with selinux enable after upgrading to 9.1
It should be fixed in RHEL 9.2.
In the meantime, try the workaround suggested in the issue:
semanage permissive -a wireguard_t
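To see whether the workaround is actually in place, something like the sketch below may help. It only uses `getenforce` and `semanage permissive -l`; the function names are made up for illustration, and it needs to run as root.

```shell
#!/bin/sh
# Hypothetical helpers: inspect the SELinux side of the WireGuard problem.

# Succeeds if the given SELinux domain is listed as permissive.
is_permissive() {
    semanage permissive -l 2>/dev/null | grep -qw "$1"
}

# Print the SELinux mode and whether wireguard_t is permissive.
report_wireguard_selinux() {
    echo "SELinux mode: $(getenforce 2>/dev/null || echo unknown)"
    if is_permissive wireguard_t; then
        echo "wireguard_t is permissive (workaround active)"
    else
        echo "wireguard_t is not in the permissive list"
    fi
}

# Recent denials can be looked up with: ausearch -m avc -ts recent
# After upgrading to 9.2 the workaround can be reverted with:
#   semanage permissive -d wireguard_t
```

Run `report_wireguard_selinux` before and after applying the workaround to compare.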
Maybe I’ll try a fresh install from the 9.2 ISO and give feedback.
My friends, if you use AlmaLinux, install from the 9.2 version.
We have a solution, @giacomo.
Very well.
Thanks.
We are going to update the images as soon as Rocky releases 9.2.