NethSec suddenly doomed

After running flawlessly for two months, my NethSec instance suddenly went critical: CPU, memory, and disk I/O were completely maxed out.
Unfortunately, all logs were lost after the reboot, so I couldn’t investigate what happened.

  • Why are logs not persisted across reboots?

  • Any idea what could have caused such a resource spike?

Fortunately, Pulse (which sits behind that gateway) had time to send me a Telegram alert, @mrmarkuz!

Could it be linked to the config we did yesterday to get Pulse working behind the reverse proxy? That’s the only thing that changed :thinking:

I’m using the same config for Pulse on my NethSec and there were no issues.

Logs are stored in RAM. You could use a controller to save the data; see also the "Logs" page of the NethSecurity documentation.

Right, the controller. I tried once but didn’t succeed. I’ll try again.

Hi Matthieu!

Why are logs not persisted across reboots?

This is how OpenWrt works in the current configuration: everything other than configuration is lost on restart or update. This keeps the system clean and performant.

You can get persistent logs by enabling Persistent Storage if you have a free disk or partition. If you go with the controller, it will extend the firewall’s data retention even further.

Any idea what could have caused such a resource spike?

It could be anything. Do you have IPS enabled on the firewall? High CPU and memory usage usually comes from the security software that analyzes the traffic.

The graphs show that there wasn’t high network traffic during the period of high CPU and disk usage.

The disk I/O is strange, and I can’t think of a suspect. Persistent logs may have helped (or not) to discover the source of the problem.
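
If it happens again while the box is still reachable, one way to look for a suspect is to compare per-process I/O counters over a few seconds. The following is only a minimal sketch, not something from the NethSecurity docs: it assumes Python 3 is available on the firewall, that the kernel exposes `/proc/<pid>/io` (this requires task I/O accounting, which may not be enabled), and that it is run as root. The function names `parse_proc_io`, `snapshot` and `top_io` are made up for illustration.

```python
#!/usr/bin/env python3
"""Rank processes by disk I/O over a short sampling window (Linux /proc)."""
import os
import time


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def snapshot():
    """Return {pid: (name, read_bytes, write_bytes)} for every readable process."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{entry}/comm") as f:
                name = f.read().strip()
        except OSError:
            # Process exited, file missing, or access denied (run as root).
            continue
        result[int(entry)] = (name, io.get("read_bytes", 0), io.get("write_bytes", 0))
    return result


def top_io(before, after, limit=5):
    """Return (bytes, pid, name) tuples with the largest I/O delta between snapshots."""
    deltas = []
    for pid, (name, read_after, write_after) in after.items():
        # A process that started during the window is counted from zero.
        _, read_before, write_before = before.get(pid, (name, 0, 0))
        total = (read_after - read_before) + (write_after - write_before)
        deltas.append((total, pid, name))
    deltas.sort(reverse=True)
    return deltas[:limit]


if __name__ == "__main__":
    first = snapshot()
    time.sleep(5)  # sampling window in seconds, chosen arbitrarily
    for total, pid, name in top_io(first, snapshot()):
        print(f"{total:>12} bytes  pid={pid}  {name}")
```

Since everything outside the configuration lives in RAM, the output would need to be redirected to persistent storage or sent off the box to survive a reboot.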

Thanks @filippo_carletti and @Tbaile

IPS is not enabled, nor is any packet inspection module.

I configured the controller, so I will have access to the logs if it happens again; so far it hasn’t. The disk I/O is indeed very unusual in this case.