Got this in my mailbox this morning:
What are the conditions under which a node is considered offline?
I’m running NS8 in an ESXi VM which shows no evidence of being underpowered or resource constrained.
Do you run more than one node? It seems that only node 1 is affected.
The node agents communicate with the API server via the Redis database; if a node gives no answer after several attempts, it is considered offline.
To check the cluster status from CLI:
api-cli run cluster/get-cluster-status | jq
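If you only care about the node state, you can narrow the output with a `jq` filter. The field names below are an assumption for illustration; inspect the real JSON first and adjust accordingly:

```shell
# Hypothetical jq filter over the cluster status JSON: the "nodes",
# "id" and "online" field names are assumptions -- check the actual
# output with `api-cli run cluster/get-cluster-status | jq .` first.
api-cli run cluster/get-cluster-status \
  | jq '.nodes[] | {id: .id, online: .online}'
```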
Maybe it helps to restart the metrics or Loki services?
Metrics:
runagent -m metrics1 systemctl --user restart prometheus.service alertmanager.service alert-proxy.service
Loki:
runagent -m loki1 systemctl --user restart loki.service
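After the restart you can check whether Loki is actually serving again via its readiness endpoint. This assumes Loki’s HTTP port (3100 by default) is reachable from the shell you run it in; on NS8 you may need to run it from inside the loki1 module environment:

```shell
# Quick Loki health check: the /ready endpoint answers "ready" once
# all components are up. Assumes the default HTTP port 3100 is
# reachable from this shell.
curl -s http://127.0.0.1:3100/ready
```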
Please also check the logs for errors, maybe there’s more information about the cause.
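To scan the logs for errors from the command line, a journal query restricted to error priority is a reasonable starting point (widen the `--since` window to cover the time of the alert):

```shell
# List error-priority journal entries from the last hour across all
# units; adjust --since to match the window of the offline alert.
journalctl --priority err --since "-1 hour" --no-pager
```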
I only run a single node.
It happened again yesterday, so I took a look at the logs for Loki, as that was one of the apps reported, and found this at the time of the alert:
2025-06-15T17:19:25-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:19:12.349099965Z caller=ring_watcher.go:56 component=frontend-scheduler-worker msg="error getting addresses from ring" err="at least 1 healthy replica required, could only find 0 - unhealthy instances: 127.0.0.1:9095"
2025-06-15T17:19:25-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:19:12.609529389Z caller=ring_watcher.go:56 component=querier component=querier-scheduler-worker msg="error getting addresses from ring" err="at least 1 healthy replica required, could only find 0 - unhealthy instances: 127.0.0.1:9095"
2025-06-15T17:19:25-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:19:14.269250095Z caller=scheduler.go:635 msg="failed to query the ring to see if scheduler instance is in ReplicatonSet, will try again" err="at least 1 healthy replica required, could only find 0 - unhealthy instances: 127.0.0.1:9095"
2025-06-15T17:19:25-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:19:14.29581608Z caller=compactor.go:481 msg="error asking ring for who should run the compactor, will check again" err="at least 1 healthy replica required, could only find 0 - unhealthy instances: 127.0.0.1:9095"
2025-06-15T17:19:25-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:19:22.232885537Z caller=logging.go:102 trace_id_unsampled=0157009625588e56 orgID=fake msg="GET /metrics 741.029858ms, error: write tcp 127.0.0.1:3100->127.0.0.1:49654: write: broken pipe ws: false; Accept: application/openmetrics-text;version=1.0.0;escaping=allow-utf-8;q=0.6,application/openmetrics-text;version=0.0.1;escaping=allow-utf-8;q=0.5,text/plain;version=1.0.0;escaping=allow-utf-8;q=0.4,text/plain;version=0.0.4;escaping=allow-utf-8;q=0.3,*/*;q=0.2; Accept-Encoding: gzip; User-Agent: Prometheus/3.3.1; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Prometheus-Scrape-Timeout-Seconds: 10; X-Real-Ip: 10.0.2.100; "
2025-06-15T17:19:36-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:19:36.788179441Z caller=pool.go:250 component=distributor msg="removing ingester failing healthcheck" addr=127.0.0.1:9095 reason="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T17:19:38-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:19:38.032805078Z caller=logging.go:128 trace_id_unsampled=611b5ecd7a125941 orgID=fake msg="POST /loki/api/v1/push (500) 8.366873776s Response: \"rpc error: code = DeadlineExceeded desc = context deadline exceeded\\n\" ws: false; Accept-Encoding: gzip; Content-Length: 15825; Content-Type: application/x-protobuf; User-Agent: Alloy/v1.8.3 (linux; docker); X-Agent-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Alloy-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Real-Ip: 10.0.2.100; "
2025-06-15T17:22:22-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:22:22.194285167Z caller=logging.go:102 trace_id_unsampled=7ceec789965e279d orgID=fake msg="GET /metrics 9.938672295s, error: write tcp 127.0.0.1:3100->127.0.0.1:58950: write: broken pipe ws: false; Accept: application/openmetrics-text;version=1.0.0;escaping=allow-utf-8;q=0.6,application/openmetrics-text;version=0.0.1;escaping=allow-utf-8;q=0.5,text/plain;version=1.0.0;escaping=allow-utf-8;q=0.4,text/plain;version=0.0.4;escaping=allow-utf-8;q=0.3,*/*;q=0.2; Accept-Encoding: gzip; User-Agent: Prometheus/3.3.1; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Prometheus-Scrape-Timeout-Seconds: 10; X-Real-Ip: 10.0.2.100; "
2025-06-15T17:33:38-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:33:38.337473534Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="couldn't parse push request: read tcp 127.0.0.1:3100->127.0.0.1:45226: i/o timeout" org_id=fake
2025-06-15T17:34:42-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:34:42.629128925Z caller=manager.go:50 component=distributor path=write msg="write operation failed" details="couldn't parse push request: unexpected EOF" org_id=fake
2025-06-15T17:34:55-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:34:55.603599446Z caller=logging.go:128 trace_id_unsampled=331bae4e5fd973ff orgID=fake msg="POST /loki/api/v1/push (500) 9.991769321s Response: \"context canceled\\n\" ws: false; Accept-Encoding: gzip; Content-Length: 7769; Content-Type: application/x-protobuf; User-Agent: Alloy/v1.8.3 (linux; docker); X-Agent-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Alloy-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Real-Ip: 10.0.2.100; "
2025-06-15T17:35:01-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:35:01.907842265Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T17:36:02-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:36:02.759188683Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T17:38:27-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:38:27.81355896Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T17:52:56-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T00:52:55.981374704Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T17:55:42-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:55:41.993023962Z caller=logging.go:102 trace_id_unsampled=4a96c30a17988cde orgID=fake msg="GET /metrics 41.649354203s, error: write tcp 127.0.0.1:3100->127.0.0.1:46122: i/o timeout ws: false; Accept: application/openmetrics-text;version=1.0.0;escaping=allow-utf-8;q=0.6,application/openmetrics-text;version=0.0.1;escaping=allow-utf-8;q=0.5,text/plain;version=1.0.0;escaping=allow-utf-8;q=0.4,text/plain;version=0.0.4;escaping=allow-utf-8;q=0.3,*/*;q=0.2; Accept-Encoding: gzip; User-Agent: Prometheus/3.3.1; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Prometheus-Scrape-Timeout-Seconds: 10; X-Real-Ip: 10.0.2.100; "
2025-06-15T17:55:42-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T00:55:40.475634053Z caller=logging.go:128 trace_id_unsampled=03a1be097fc9de5a orgID=fake msg="POST /loki/api/v1/push (500) 12.740282013s Response: \"context canceled\\n\" ws: false; Accept-Encoding: gzip; Content-Length: 1668; Content-Type: application/x-protobuf; User-Agent: Alloy/v1.8.3 (linux; docker); X-Agent-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Alloy-Id: b7df12de-8ea0-48b0-b863-b70eb4f42b99; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Real-Ip: 10.0.2.100; "
2025-06-15T18:17:22-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:22.569913755Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:23.753413342Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:24.937510564Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:26.121499584Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:27.305971546Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:28.489093781Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:29.673424055Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:30.85732473Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:11-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:17:32.041747925Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:12-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:18:11.145487436Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:13-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:18:13.864701424Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
2025-06-15T18:18:14-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:18:14.365671543Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded"
2025-06-15T18:18:14-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:18:14.866726459Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded"
2025-06-15T18:18:15-07:00 [1:loki1:loki-server] level=error ts=2025-06-16T01:18:15.845297723Z caller=ratestore.go:303 msg="unable to get stream rates from ingester" ingester=127.0.0.1:9095 err="rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded"
2025-06-15T21:19:22-07:00 [1:loki1:loki-server] level=warn ts=2025-06-16T04:19:21.498911172Z caller=logging.go:102 trace_id_unsampled=55d4732fb1d15f3c orgID=fake msg="GET /metrics 23.044571413s, error: write tcp 127.0.0.1:3100->127.0.0.1:57380: write: broken pipe ws: false; Accept: application/openmetrics-text;version=1.0.0;escaping=allow-utf-8;q=0.6,application/openmetrics-text;version=0.0.1;escaping=allow-utf-8;q=0.5,text/plain;version=1.0.0;escaping=allow-utf-8;q=0.4,text/plain;version=0.0.4;escaping=allow-utf-8;q=0.3,*/*;q=0.2; Accept-Encoding: gzip; User-Agent: Prometheus/3.3.1; X-Forwarded-For: 10.0.2.100; X-Forwarded-Host: 10.5.4.1:20009; X-Forwarded-Port: 20009; X-Forwarded-Proto: http; X-Forwarded-Server: loki; X-Prometheus-Scrape-Timeout-Seconds: 10; X-Real-Ip: 10.0.2.100; "
Metrics reported this app offline at 17:19 and back online at 17:24.
I follow that it’s Loki that is offline when the alert names that app, but which app is it reporting here:
And what is supposed to be displayed when I hit “View in Alertmanager”? All I get is this:
Cheers.
On the nodes, node_exporter instances run to collect node information; see Metrics and alerts — NS8 documentation.
If a node_exporter isn’t reachable, the alert is fired.
Maybe it helps to restart the node_exporter?
systemctl restart node_exporter
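You can also verify that the exporter answers before and after the restart. Port 9100 is node_exporter’s stock default and is an assumption here; NS8 may bind it to a different address:

```shell
# Liveness check: node_exporter serves Prometheus metrics on :9100 by
# default (adjust the address if NS8 binds it elsewhere).
curl -s http://127.0.0.1:9100/metrics | head -n 5
```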
Usually the alert(s) should be shown:
Did you already try hard-refreshing the browser or clicking on “Alerts”?
All of the alerts I see come in pairs: offline, followed five minutes later by online. So nothing is staying offline for any length of time.
Same result.
I also copied the link from the mail and opened it in a different browser. Same result.
Cheers.
As the connection is local, we can exclude network issues.
I’d investigate the system and resource load again, on both the hypervisor side and the VM side, with the core Grafana instance or with the Netdata app.
You might also search the journal to see if Prometheus scrapes time out:
journalctl --grep timeout
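To keep the search focused on the incident window, `journalctl` also accepts `--since`/`--until`; the timestamps below match the example alert above and should be adjusted to your local alert time:

```shell
# Search only the incident window for timeout messages; adjust the
# timestamps (local time) to the window of your alert.
journalctl --grep timeout --since "2025-06-15 17:00" --until "2025-06-15 19:00"
```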