Remove node manually NH8

NethServer Version: 8 Beta 2

Hi everyone, I have an issue with a cluster on NH8.
I’ve joined a node but the leader sees it as offline. When I try to remove the worker node the page stays stuck and I can’t remove the second node.

Is there a way to remove the second node from the command line? I didn’t find anything in the documentation…

Thanks in advance

Hi Filippo,

You didn’t find it in the docs because it is not the right way to do it. I suppose the UI has a bug that prevents you from removing the offline node.

Having said that, to remove a node from the command line run this command (assuming the offline node is 2):

api-cli run remove-node --data '{"node_id":2}'
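
If the offline node has a different ID, a minimal sketch like this builds the JSON payload from an argument instead of editing the string by hand. The `remove_node_payload` helper name is made up for illustration; only the `api-cli run remove-node --data …` call comes from the command above. The actual removal is left commented out so nothing is removed by accident.

```shell
# Hypothetical helper: print the JSON payload for remove-node.
# Rejects anything that is not a plain number.
remove_node_payload() {
    case "$1" in
        ''|*[!0-9]*)
            echo "node id must be a number" >&2
            return 1
            ;;
    esac
    printf '{"node_id":%s}\n' "$1"
}

# On the leader node (uncomment to really remove node 2):
# api-cli run remove-node --data "$(remove_node_payload 2)"
```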

I added a card to see if the issue is reproducible. See Trello.


@fffeal, which operating system did you use, please?

debian, rocky, …

How did you install the cluster/nodes?

I cannot reproduce it; this is what I did (using core 2.0.0):

install and initiate the cluster
install another and trigger the script then join to
the cluster gets two nodes
stop the
the node 2 is unreachable in the UI
I can remove the node with the button in the UI (right corner)

On another attempt I simply restarted node 2, and after 30 seconds the UI refreshed and showed it online.

What did I do wrong in trying to reproduce it?

I installed the cluster leader on Rocky 9; after that I installed another node, also on Rocky 9.
I tried to join the second node and it didn’t work (a problem with my DNS). Once I resolved the DNS problem the node joined successfully, but the leader node saw it as offline.

I thought my co-worker was the one who had removed the node and joined another, because when I saw the response here and checked NH8 everything was OK; but he now tells me he didn’t do anything.

It’s strange. I think that fixing the DNS problem while the second node was trying to join the leader made something happen.

Anyway, the problem was that while the second node was offline, the button to remove it was not clickable.
As for the rest, I think that after a while things fixed themselves.

Tell me if you need more information.

I suspect that something went wrong while joining the cluster, something did not finish or I do not know what, and maybe that explains the grayed-out button. Did you try to refresh the web page or clear the browser cache? (Yes, it is a web user interface; sometimes the browser is guilty too.)

Only inspecting the logs on each side could give us the answer. Please try it again and let us know whether it works or not.

I think it was the problem with DNS. When I fixed it the second node joined immediately.
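
For anyone hitting the same thing, a quick way to rule DNS out before joining is to check that each node can resolve the other's name. This is only a rough sketch: the `resolves` helper and the host name `leader.example.org` are placeholders, not names from this thread, and `getent hosts` goes through the system resolver, so it also honours `/etc/hosts`.

```shell
# Hypothetical helper: succeed only if the system resolver knows the name.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Run this on the worker before joining (replace the placeholder name):
if resolves leader.example.org; then
    echo "leader name resolves"
else
    echo "leader name does NOT resolve, fix DNS first"
fi
```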

I checked /var/log/messages, but I didn’t find anything useful; if there is another log that I should check, tell me.

Now the situation is fine. I’ll try to shut down the second node to check whether I can press the button while the node is offline, and I’ll let you know.

journalctl is the way to check the logs now:

journalctl -t dovecot # filter by a syslog tag, e.g. dovecot
journalctl -u sshd.service # filter by a systemd unit
journalctl _UID=$(id -u mail1) # everything logged by the mail1 user, including all its containers

-f follow
-e go to the end of entries
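
These filters can be combined with a time window, which helps when you know roughly when the problem happened. A small sketch, assuming you want the output unpaged so it can be piped to grep; the `recent_unit_logs` wrapper is a made-up name, while `-u`, `--since` and `--no-pager` are standard journalctl options.

```shell
# Hypothetical wrapper: print the entries of one unit since a given time.
# $1 = systemd unit, $2 = start of the window (defaults to "1 hour ago")
recent_unit_logs() {
    journalctl --no-pager -u "$1" --since "${2:-1 hour ago}"
}

# Examples:
# recent_unit_logs sshd.service
# recent_unit_logs sshd.service "yesterday" | grep -i fail
```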

Someone suggested to me that you could also use the Logs page to find issues. Honestly, coming from a backend guy, I would never have believed he would suggest that one day.

Thanks for the tips.
I’ve checked agent@cluster and this is the result:

I don’t know if it is useful.
I see that there are no logs before Sep 18 19:50, and I had the problem before that time. If I find something I’ll let you know.

Redis on 6379/tcp was down… this is the blue screen for us: you cannot save anything in the database.
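
A quick way to see whether that port answers is to try opening a TCP connection to it. This is just a sketch: the `port_open` helper is a made-up name, it relies on bash's `/dev/tcp` redirection and on `timeout` from coreutils, and the address 127.0.0.1 is an assumption about where Redis listens on the leader.

```shell
# Hypothetical helper: succeed only if a TCP connection can be opened.
# $1 = host, $2 = port; gives up after 3 seconds.
port_open() {
    timeout 3 bash -c ': </dev/tcp/"$0"/"$1"' "$1" "$2" 2>/dev/null
}

# Assumed address; adjust to where Redis really listens.
if port_open 127.0.0.1 6379; then
    echo "6379/tcp is reachable"
else
    echo "6379/tcp is down"
fi
```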