Thought NS8 was crashing after 2 core updates (was Proxmox)

NethServer Version: NS8
Module: Core

To start, NS8 has been pretty much rock solid for me as far as its Core goes. I have had some minor issues here and there, but those ended up being app issues, and most of the time they were self-inflicted (ME, ok… usually it was me :laughing:). I am unsure whether this has anything to do with core updates or NS8 updates at all.

The only reason I think it is a problem with NS8 Core is that Mesh Central, Website, or one of the other apps I run on NS8, which would normally be running, have recently been down when I go to them. At that point I try to open the NS8 dashboard and I cannot load it, either from the web or internally. The only choice I have then is to restart NS8. After a reboot of the underlying OS, I can reach the NS8 dashboard and the other apps again.

To be clear, I am unsure which update this started with, or whether it is due to an update at all, but updating is the only thing I did before I started seeing these crashes on my NS8. The crashes have been happening for the last 2 weeks.

I have looked in Loki, under System, but I am not seeing anything that I can tie to the crashes.

I am hoping to see if someone can help me verify this issue. I was also hoping someone could point me to where in the logs I should look.

I would like to get my NS8 back to being stable. It looks like I will need some extra eyes on this.

Thanks,

Usually the log of an affected app or at least the cluster log should show something about the issue.

You could enable the alert notifications.
This way you will get a mail in case of issues, for example when swap fills up, see Metrics and alerts — NS8 documentation
You can check current alerts on the nodes page, see Cluster management — NS8 documentation

I have Email notifications turned on. Thanks @mrmarkuz for that tidbit of good information. I just enabled Alert notifications in Metrics. I will look to see if I get Alert notifications from Metrics.

Definitely getting alert notifications now. I am getting a “summary = TLS certificate on Node expires in XXXd XXh XXm Xs”

I know that is for my certificates, so I will get to that soon.

So far I have not seen any errors that look like they have to do with the NS8 crashes. But NS8 did lock up this morning, at around 11:00 AM, and I had to reboot the system.

Did not see that incident in the alerts.

Is there anything interesting in the logs at about 11:00 AM?
You can send me the logs via PM so I can check them…
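
If it helps to narrow things down, here is a minimal Python sketch that builds a `journalctl` command for a window around the lock-up. The function name `journal_window_cmd`, the 30-minute window, and the date are placeholders of mine, not anything NS8-specific. It assumes the system journal is persistent, since `-b -1` (the boot before the reboot) only works if logs from earlier boots were kept.

```python
import shlex
from datetime import datetime, timedelta


def journal_window_cmd(incident, minutes=30, previous_boot=True):
    """Build a journalctl command covering a window around the incident.

    Hypothetical helper for illustration; it only assembles the command.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    since = (incident - timedelta(minutes=minutes)).strftime(fmt)
    until = (incident + timedelta(minutes=minutes)).strftime(fmt)

    cmd = ["journalctl", "--no-pager"]
    if previous_boot:
        # -b -1 selects the boot before the current one, i.e. the one
        # that ended with the lock-up (needs a persistent journal).
        cmd += ["-b", "-1"]
    cmd += ["--since", since, "--until", until]
    return cmd


if __name__ == "__main__":
    # Placeholder date: replace with the day of the lock-up.
    incident = datetime(2024, 1, 1, 11, 0)
    print(shlex.join(journal_window_cmd(incident)))
```

Run the printed command as root on the node. If the journal just stops some time before the reboot with no errors, that can be a hint that the whole machine froze rather than a single service failing.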

Yes. Thx

@mrmarkuz I don’t think this is NS8 or anything to do with NS8 core upgrades. I think I may have found the problem.

No wonder I was not able to find the issue in the logs and didn’t see the problem in metrics. It looks like it is not an NS8 problem after all. Right now it looks like it is something to do with Proxmox instead.

Thanks for the second set of eyes on this. I am reviewing it now. I think I have pinpointed the problem.
