Hi everyone, I have an issue with a cluster on NH8.
I’ve joined a node, but the leader sees it as offline. When I try to remove the worker node, the page gets stuck and I can’t remove the second node.
Is there a way to remove the second node from the command line? I didn’t find anything in the documentation…
I cannot reproduce this; here is what I did (using core 2.0.0):
install rocky9-pve.org and initialize the cluster
install another node, rocky9-pve2.org, run the script, then join it to rocky9-pve.org
the cluster now has two nodes
stop rocky9-pve2.org
node 2 (rocky9-pve2.org) shows as unreachable in the UI
I can remove the node with the button in the UI (right corner)
on another attempt I simply restarted node 2 (rocky9-pve2.org) and after about 30 seconds the UI refreshed and showed it online
I installed the cluster leader on Rocky 9; after that I installed another node on Rocky 9.
I tried to join the second node and it didn’t work (a problem with my DNS). Once the DNS problem was resolved, the node joined successfully, but the leader still showed it as offline.
I thought my co-worker was the one who had removed the node and joined another one, because when I saw the reply here and checked NH8 everything was OK; but he now tells me he didn’t do anything.
It’s strange; I think that fixing the DNS problem while the second node was still trying to join the leader caused something odd to happen.
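If you retry the join, it may be worth confirming beforehand that both machines resolve the leader’s FQDN to the same address; a minimal sketch (leader.example.org is just a placeholder for your leader’s actual FQDN):
getent hosts leader.example.org # run on both nodes; should return the same expected IP
dig +short leader.example.org # same check directly against DNS (dig may require the bind-utils package)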
Anyway, the problem was that while the second node was offline, the button to remove it was unclickable.
As for the rest, I think things eventually fixed themselves after a while.
I suspect that something went wrong during the join to the cluster, something left unfinished, and maybe that explains the grayed-out button. Did you try refreshing the webpage or clearing the page cache? (Yes, it is a web user interface; sometimes the browser is the culprit too.)
Only inspecting the logs on each side can give us the answer; please try it again and let us know whether or not you succeed.
journalctl -t dovecot # filter by a syslog tag, e.g. dovecot
journalctl -u sshd.service # filter by a systemd service unit
journalctl _UID=$(id -u mail1) # everything logged under the mail1 user, i.e. the mail app with all of its containers
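You can also restrict the output to the time window in which the join failed; these are standard journalctl options (the timestamps below are placeholders, adjust them to your incident):
journalctl --since "2024-09-18 18:00" --until "2024-09-18 20:00" # only entries from that window
journalctl -b -p warning # current boot only, warnings and worse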
Someone suggested to me that you can also use the Logs page to track down issues; honestly, coming from a backend guy, I would never have expected that suggestion.
I don’t know if it is useful.
I see that there are no logs before Sep 18 19:50, and I had the problem before that time; if I find something I’ll let you know.
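Missing entries before a certain time can simply mean the journal is volatile (kept in RAM) and was cleared, for example at a reboot; a quick check, assuming a systemd-based host like Rocky 9:
journalctl --list-boots # how many boots the journal still covers
journalctl --disk-usage # how much space the stored journal uses
# to keep logs across reboots, set Storage=persistent in /etc/systemd/journald.conf and restart systemd-journald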