NS8 merged error

Hello everyone, I wanted to know whether restarting or shutting down the system can cause errors in the file system. I am asking because there does not seem to be a dedicated shutdown procedure as there was on NS7. Or am I wrong?
Let me add, in case it is useful, that I run two machines, Rocky Linux 8 and 9. I have done many rollbacks and deleted snapshots and never had a problem. With NS8, unfortunately, I am haunted by these errors: regardless of deleting or rolling back snapshots, they occur when I restart, or shut down and power the server back on. So can NS8, which has no shutdown procedure of its own, generate these problems? Deleting the "merged" directories puts everything back in place, but what is the technical reason? Corrupted files? And if so, can a reboot cause that?

I’m going to reproduce your setup this weekend to test Rocky on Proxmox using a qcow2 disk on a XigmaNAS NFS share.

NS8 uses the system shutdown procedure of the distro (as NS7 did) so I don’t think that’s the issue.

Maybe you can find a solution in this thread:

OK, but Rocky with NS8 installed on it, of course. As I wrote, I have two Rocky Linux machines, also connected to NS7, and I have taken snapshots … never had a problem. I rather believe it is the reboot or shutdown procedure that causes this. Nobody has answered me so far: when you restart or turn off Rocky, does NS8 flush all disk writes or anything like that? If not, it is equivalent to brutally powering off the server.

In addition, I have by now managed to get the apps I had on NS7 working, so in my personal opinion the NS7-to-NS8 migration only makes sense for domain users, that is, for transferring the entire NS7 domain to NS8; the rest has to be set up by hand. But given the structure of NS8 one cannot expect otherwise: they are two different products for different uses.

I tested heavily: took a snapshot, did a backup to PBS, stopped the machine while a service was restarting, but I couldn’t reproduce the error yet.

In my test environment I only use a gigabit LAN connection to the XigmaNAS, which slows things down a lot and could trigger issues.

Did you already try to set up a fresh NS8 on a local Proxmox node instead of the NAS? Just to see if the issue comes back…

Hi Mark, thank you for taking the time to run some tests. No, I haven’t tried yet. My NS8 currently works without problems, but only because I haven’t launched anything or restarted the server. The virtual disk is qcow2 on NFS (ZFS on an HP NAS). Later today I could try restarting the base system (Rocky 9) and see what happens after the reboot. I’ll let you know. Thank you for the time you’ve dedicated to this.


Anyway Mark, I haven’t rebooted or taken a snapshot in three days. If there really were a problem with running the server over NFS on the NAS, I should have hit the errors again during these three days. I’m trying Nextcloud, Roundcube, GLPI and WordPress, using them as if they were in production, and they work without problems, including Piwigo for photos. I’m fairly sure that if I restarted, i.e. ran “reboot” from Rocky Linux, I might get surprises. However, as I wrote before, I will try again and report back. But what do you think: when Rocky restarts, is the NS8 server shut down brutally, or is it a correct system shutdown?

It’s a correct shutdown; you can watch it in the Proxmox console.
The system waits until the container storage is unmounted.
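If you want to check this yourself, a minimal sketch (assuming the default NS8 rootless Podman layout, with each module’s store under its own user’s home) is to list the container overlay mounts before shutting down and confirm they are cleanly gone on the next boot:

# List container overlay mounts under the module users' homes; depending on the
# kernel and Podman version they may appear as "overlay" or "fuse.fuse-overlayfs"
findmnt -rn -o TARGET,FSTYPE | grep '/.local/share/containers/storage/overlay'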

Even just stopping the VM (without shutdown) wasn’t an issue.

Hi Francesco,
just to be sure: if you hit the error again, please report the full log message!


I got the “invalid argument” error again on my NS8 Rocky dev test machine…

Here are the logs:

Solvable by removing the merged dirs and restarting the services:

rmdir /home/zammad3/.local/share/containers/storage/overlay/*/merged
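The restart step is not shown above; as a sketch for this example, assuming the module user is zammad3 and its user service is also named zammad (if unsure, list the module’s user units first):

# List the module's user services, then restart the affected one (names assumed here)
runagent -m zammad3 systemctl --user list-units
runagent -m zammad3 systemctl --user restart zammad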


Finally! So Mark, is there an underlying problem that has nothing to do with snapshots, reboots or anything like that?

Hi Davide, Mark has already done it.

Yes, it is a Podman bug. Make sure you have the latest updates. Hopefully in Rocky Linux 9.4, Podman 4.9 will receive more fixes than the 4.6.1 branch.
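To check which Podman build a node is currently running (standard Rocky Linux commands, nothing NS8-specific):

# Installed package and the version reported by the binary itself
rpm -q podman
podman --version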

1 Like

I update frequently, but I think this is a really important bug, since after every restart (or similar situation) the server is unusable unless you delete the “merged” directories. Now I am more at ease, since the doubts about NFS, snapshots and the rest have been ruled out. Thank you.

Bear in mind that

The use case that does not work well is having the container image store reside on an NFS mount point. – https://www.redhat.com/sysadmin/rootless-podman-nfs
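If you want to check where a module’s container store actually lives, and whether that path sits on NFS, here is a minimal sketch assuming the default NS8 layout of one Unix user per module, with traefik1 as the example module:

# Print the graph root (container storage path) as seen by the module's Podman
runagent -m traefik1 podman info --format '{{.Store.GraphRoot}}'
# Show the filesystem type backing that path; "nfs"/"nfs4" means it is on NFS
df -hT /home/traefik1/.local/share/containers/storage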


So much the better! Given that, I presume, a cluster sits on a SAN in at least half of the installations out there. Anyway, thank you for your input, which has cleared up all my doubts, and not just mine.

I tried moving the qcow2 disk from NFS to local storage.



rmdir /home/traefik1/.local/share/containers/storage/overlay/*/merged
runagent -m traefik1 systemctl --user restart traefik

Afterwards it is OK.


As a fix for the bug, a boot script can apply the workaround automatically until we receive the patched upstream releases in Rocky Linux 9.4 and Debian 13 (Trixie).
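As an illustration only, not the actual NS8 boot script, a minimal sketch of such a cleanup could be a systemd unit that removes any stale “merged” directories before the module services come up; the unit name, ordering and path glob are assumptions based on the default layout of one Unix user per module:

# /etc/systemd/system/cleanup-merged.service  (hypothetical unit name)
[Unit]
Description=Remove stale Podman overlay "merged" directories left from the previous boot
DefaultDependencies=no
After=local-fs.target
# Run before user sessions, and therefore before the rootless module services, start
Before=systemd-user-sessions.service

[Service]
Type=oneshot
# Each NS8 module runs as its own Unix user with its store under ~/.local/share/containers
ExecStart=/bin/sh -c 'for d in /home/*/.local/share/containers/storage/overlay/*/merged; do rmdir "$d" 2>/dev/null || true; done'

[Install]
WantedBy=multi-user.target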


It would be important to know if somebody has hit the same bug on Debian 12!

Debian users, please share your experience.

No more problems on Rocky 9.3 after two reboots.

Thank you