NethServer Version: NS8 on AlmaLinux
Module: Loki
Hi all,
I’m trying to update Loki on two different servers running AlmaLinux, but I get this error:
<7>podman-pull-missing ghcr.io/nethserver/loki:1.3.1
<7>podman-pull-missing docker.io/traefik:v3.3.5 docker.io/grafana/loki:3.4.3
Trying to pull docker.io/library/traefik:v3.3.5...
Getting image source signatures
Copying blob sha256:3726c0c457c1ef0b6f1451755983ad25f602ecf58be4c89254ea1eddd17376a4
Copying blob sha256:dcc8cd112e3beb9d5fb4b1bcf11d884caeb7e5e00fefe1d07fab43ebc40386e9
Copying blob sha256:f18232174bc91741fdf3da96d85011092101a032a93a388b79e99e69c2d5c870
Copying blob sha256:da1600f8cecfd444899afd69b681a02879ce5607d0278fefa5f8747744c75cc6
Copying config sha256:66c037adf0b4eeeb4b1dbcbfc7520eae76ce967e73f02e1d569808930129ab3b
Writing manifest to image destination
66c037adf0b4eeeb4b1dbcbfc7520eae76ce967e73f02e1d569808930129ab3b
<7>extract-image ghcr.io/nethserver/loki:1.3.1
flock: getting lock took 0.000004 seconds
time="2025-05-12T11:50:37+02:00" level=error msg="While recovering from a failure (saving incomplete layer metadata), error deleting layer \"4070ba230d95721c40f7eacac8440354c8b8fa7622b9ef5e18f763abb97a4e4c\": open /run/user/1003/containers/overlay-layers/.tmp-mountpoints.json3170055619: no space left on device"
Error: creating container storage: open /run/user/1003/containers/overlay-layers/.tmp-mountpoints.json3877251958: no space left on device
Traceback (most recent call last):
File "/usr/local/agent/actions/update-module/05pullimages", line 81, in <module>
).check_returncode()
^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/subprocess.py", line 502, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '('extract-image', 'ghcr.io/nethserver/loki:1.3.1')' returned non-zero exit status 125.
The problem is supposedly missing space on the device, but space is available:
~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 3.8G 1.4M 3.8G 1% /dev/shm
tmpfs 1.6G 151M 1.4G 10% /run
/dev/mapper/vg-lv_root 79G 41G 35G 54% /
/dev/sda2 493M 268M 189M 59% /boot
tmpfs 769M 2.0M 767M 1% /run/user/1008
tmpfs 769M 1.7M 767M 1% /run/user/1001
tmpfs 769M 1.9M 767M 1% /run/user/1004
tmpfs 769M 386M 384M 51% /run/user/1003
tmpfs 769M 84K 769M 1% /run/user/1002
tmpfs 769M 1.9M 767M 1% /run/user/1007
tmpfs 769M 1.9M 767M 1% /run/user/1006
tmpfs 769M 2.2M 767M 1% /run/user/1009
tmpfs 769M 1.7M 767M 1% /run/user/1005
tmpfs 769M 1.9M 767M 1% /run/user/1012
tmpfs 769M 1.8M 767M 1% /run/user/1010
tmpfs 769M 232K 769M 1% /run/user/1016
tmpfs 769M 1.9M 767M 1% /run/user/1011
tmpfs 769M 1.9M 767M 1% /run/user/1015
tmpfs 769M 1.9M 767M 1% /run/user/1014
tmpfs 769M 1.9M 767M 1% /run/user/1013
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/0de3c14fa843af0eac62ea698506e986816404c3374a4b345b9e13707d7516c7/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/f9353c261b6449388d2e3fb2198bf527f1523d1a6e0226f02abf851aa4900b4a/merged
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/5ab3d124c0abcb983a50ce0e409b13eb06dcca195ea5225cfedf15abe17394d8/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/e0b719cda18bc6283c94cb555e428a7064aca5e746463394efe69eb6baae3b37/merged
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/eccb8dfc9695adc86826aa74ee760ccde6e73b421b71d60ec3de94d8aeaae172/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/61d8da72af99c37d0e8f125eae00a03a4c5c19b42ebc0b3e28f876dd8d695488/merged
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/7ab06652bd1b952bff91512f2661e22ff89e641f9e35ee485a08e4a55de562a4/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/6e8f623dfb557a2b529279b4f30e090882716f24872b3b57c5cd71b639113786/merged
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/83d6310f5455abea445cb49196cf7d048a102357a394002abe08d03a96ad0297/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/74919ab069cdf54cbc30898c768a038dda14868b106726b7689f91aeac712c4a/merged
shm 63M 0 63M 0% /var/lib/containers/storage/overlay-containers/2a26ff31f6d715c41331376de98ecfdafb6a6799fbe2ce739717013317bd2b13/userdata/shm
overlay 79G 41G 35G 54% /var/lib/containers/storage/overlay/64f188c71dc558761ea965a820ba0b12252ac5511b7bc0bcd0c399307d50be2d/merged
tmpfs 769M 0 769M 0% /run/user/0
Could someone help me?
Thank you in advance!
mrmarkuz (Markus Neuberger), May 12, 2025, 12:12pm
Let’s check podman info to see the filesystem in use and other parameters:
runagent -m loki1 podman info
Please also check Loki’s used space:
runagent -m loki1 podman system df
WARNING! Before changing things it’s always good to have a backup.
Maybe it helps to increase the tmpfs size by editing /etc/systemd/logind.conf
and setting the following (10% is the default):
RuntimeDirectorySize=20%
After a reboot, the tmpfs size should be doubled.
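For reference, a minimal sketch of that edit (the [Login] section header is logind.conf’s stock layout; the new size only applies to user sessions started after the reboot):
```
# /etc/systemd/logind.conf
# RuntimeDirectorySize caps the per-user tmpfs mounted at /run/user/<uid>.
# The default is 10% of physical RAM; raise it to 20%:
[Login]
RuntimeDirectorySize=20%
```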
Maybe related:
GitHub issue (opened 03:37PM, 24 Jan 23 UTC; closed 03:57PM the same day; labels: kind/bug, locked):
### Issue Description
When `/var/lib/containers` is located on an XFS filesystem, it is impossible to remove a container when no free space is left on that filesystem. Moreover, `podman` ends up in a bad state where the container is no longer visible in the summary but the container's storage is left behind.
This situation could, for example, be caused by a container that has exhausted its storage: it becomes impossible to remove such a container.
In my reproduction, `/var/lib/containers` resides on an `XFS` filesystem. I have not been able to reproduce this issue with `ext4`.
### Steps to reproduce the issue
1. I am reproducing this in a VM, so initialize the environment first:
```
mkdir reproduction
cd reproduction
vagrant init generic/centos9s
vagrant up
vagrant ssh
```
2. Inside the VM, install the required packages, mount `/var/lib/containers` on an XFS filesystem, and set the necessary SELinux attributes:
```
sudo yum install -y xfsprogs podman
sudo fallocate -l 300M /xfs.bin
sudo mkfs.xfs /xfs.bin
sudo mount -t xfs -o loop /xfs.bin /var/lib/containers
sudo chcon -u system_u -t container_var_lib_t /var/lib/containers
```
3. Start a container that fills its own storage and exits
```
sudo podman pull docker.io/library/alpine:3.17
sudo podman run --name test docker.io/library/alpine:3.17 sh -c 'dd if=/dev/zero of=/bigfile || exit 1'
```
4. Try to remove the container, see the error message
```
$ sudo podman rm test
Error: removing container a3dcb9bde158e64c40429476da0362a6b305d36b0b05b60a732305d2fc2ec08a root filesystem: 2 errors occurred:
* open /var/lib/containers/storage/overlay-layers/.tmp-layers.json16945529: no space left on device
* open /var/lib/containers/storage/overlay-containers/.tmp-containers.json2124646358: no space left on device
```
5. Verify that, despite the error above, the container is gone from the `podman container ls -a` list; however, the disk for `/var/lib/containers` is still full, which means the container's storage was left behind:
```
$ sudo podman container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
$ df -h /var/lib/containers
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 295M 295M 32K 100% /var/lib/containers
```
### Describe the results you received
Error message when deleting a container, container gone from the list of containers while container storage is left behind.
### Describe the results you expected
Container successfully removed.
### podman info output
```shell
host:
arch: amd64
buildahVersion: 1.28.0
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- rdma
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.5-1.el9.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.5, commit: 48adb81a22c26f0660f0f37d984baebe7b9ade98'
cpuUtilization:
idlePercent: 94.9
systemPercent: 1.74
userPercent: 3.37
cpus: 2
distribution:
distribution: '"centos"'
version: "9"
eventLogger: journald
hostname: centos9s.localdomain
idMappings:
gidmap: null
uidmap: null
kernel: 5.14.0-205.el9.x86_64
linkmode: dynamic
logDriver: journald
memFree: 221835264
memTotal: 1864462336
networkBackend: netavark
ociRuntime:
name: crun
package: crun-1.7.2-2.el9.x86_64
path: /usr/bin/crun
version: |-
crun version 1.7.2
commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
rundir: /run/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
remoteSocket:
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: /bin/slirp4netns
package: slirp4netns-1.2.0-2.el9.x86_64
version: |-
slirp4netns version 1.2.0
commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.2
swapFree: 2147479552
swapTotal: 2147479552
uptime: 0h 7m 18.00s
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
search:
- registry.access.redhat.com
- registry.redhat.io
- docker.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 0
paused: 0
running: 0
stopped: 0
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphRootAllocated: 308969472
graphRootUsed: 308936704
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "true"
imageCopyTmpDir: /var/tmp
imageStore:
number: 1
runRoot: /run/containers/storage
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 4.3.1
Built: 1669638068
BuiltTime: Mon Nov 28 12:21:08 2022
GitCommit: ""
GoVersion: go1.19.2
Os: linux
OsArch: linux/amd64
Version: 4.3.1
```
### Podman in a container
No
### Privileged Or Rootless
Privileged
### Upstream Latest Release
Yes
### Additional environment details
_No response_
### Additional information
I have been able to reproduce the issue with `XFS` filesystem, but not with `ext4` filesystem.
GitHub issue (opened 05:09AM, 22 Apr 22 UTC; label: kind/bug):
/kind bug
**Description**
Running `podman rm` (or `podman ps` or any other command) fails on a freshly booted system (runRoot empty) when graphRoot is full.
In my particular use case, we have a filesystem dedicated to the podman graphRoot, so when that hits maximum capacity our user could no longer delete stopped images to free space.
**Steps to reproduce the issue:**
I've reproduced this on my laptop as follows, as root:
```
# truncate -s 200M /tmp/btr
# mkfs.btrfs /tmp/btr
# mount /tmp/btr /mnt/t
# /src/podman/bin/podman --runroot /run/containers.test --root /mnt/t/containers ps
# (eventually at this point run something)
# dd if=/dev/urandom of=/mnt/t/filler bs=1M
<ENOSPC>
# for f in {1..100}; do dd if=/dev/urandom of=/mnt/t/filler.$f bs=4k count=4 status=none || break; done
<ENOSPC> (rationale is single big file isn't enough to fill 100% of the FS)
# rm -rf /run/containers.test # (simulate reboot)
# /src/podman/bin/podman --runroot /run/containers.test --root /mnt/t/container ps
ERRO[0000] [graphdriver] prior storage driver overlay failed: write /mnt/t/container/overlay/metacopy-check582242757/l1/.tmp-f3761660769: no space left on device
Error: write /mnt/t/container/overlay/metacopy-check582242757/l1/.tmp-f3761660769: no space left on device
# (same result with podman rm)
# touch '/run/containers.test/overlay/metacopy()-false' '/run/containers.test/overlay/native-diff()-true'
# /src/podman/bin/podman --runroot /run/containers.test --root /mnt/t/container ps
<works>
```
**Describe the results you received:**
ENOSPC error for something that shouldn't require space
**Describe the results you expected:**
actually listing files, or allowing some to be deleted.
**Additional information you deem important (e.g. issue happens only occasionally):**
There are various tests made (rightly so) on the overlay directory that are cached in /run.
I see various ways of working around this:
- move the cache to the storage we're testing. This is related to a specific graphRoot, so it'd make sense to cache it there instead; that'd make the cached result persistent so it wouldn't go away on reboot, and would allow this to work. That's probably for the best: what if someone changes their graphRoot without resetting their runRoot?
- disable these checks for commands that shouldn't care about these (ps, rm probably won't go about creating new overlays, so don't need to know)
- allow test failures and handle them as whatever result is safe for some commands (e.g. ps, rm); that's pretty hacky and probably not reliable
**Output of `podman version`:**
I've reproduced on today's main:
```
Client: Podman Engine
Version: 4.0.0-dev
API Version: 4.0.0-dev
Go Version: go1.17.8
Git Commit: 78ccd833906087d171f608d66a0384135dc80717
Built: Fri Apr 22 13:53:53 2022
OS/Arch: linux/amd64
```
**Output of `podman info --debug`:**
shouldn't be needed, ask if you really want it.
**Package info (e.g. output of `rpm -q podman` or `apt list podman`):**
built from sources.
**Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)**
Yes
Hi @mrmarkuz !
~]# runagent -m loki1 podman info
host:
arch: amd64
buildahVersion: 1.37.6
cgroupControllers:
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.12-1.el9.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.12, commit: eb379dceb7efebd9a9d6b3349a57424d83483065'
cpuUtilization:
idlePercent: 98.67
systemPercent: 0.6
userPercent: 0.73
cpus: 4
databaseBackend: sqlite
distribution:
distribution: almalinux
version: "9.5"
eventLogger: file
freeLocks: 2045
hostname: cloud-hosting
idMappings:
gidmap:
- container_id: 0
host_id: 1003
size: 1
- container_id: 1
host_id: 296608
size: 65536
uidmap:
- container_id: 0
host_id: 1003
size: 1
- container_id: 1
host_id: 296608
size: 65536
kernel: 5.14.0-503.23.2.el9_5.x86_64
linkmode: dynamic
logDriver: journald
memFree: 1112879104
memTotal: 8057352192
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.12.2-1.el9_5.x86_64
path: /usr/libexec/podman/aardvark-dns
version: aardvark-dns 1.12.2
package: netavark-1.12.2-1.el9.x86_64
path: /usr/libexec/podman/netavark
version: netavark 1.12.2
ociRuntime:
name: crun
package: crun-1.16.1-1.el9.x86_64
path: /usr/bin/crun
version: |-
crun version 1.16.1
commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
rundir: /run/user/1003/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: passt-0^20240806.gee36266-6.el9_5.x86_64
version: |
pasta 0^20240806.gee36266-6.el9_5.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: false
path: /run/user/1003/podman/podman.sock
rootlessNetworkCmd: pasta
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.3.1-1.el9.x86_64
version: |-
slirp4netns version 1.3.1
commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.2
swapFree: 7912058880
swapTotal: 8589930496
uptime: 1225h 5m 52.00s (Approximately 51.04 days)
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries:
docker.io:
Blocked: false
Insecure: false
Location: docker.io
MirrorByDigestOnly: false
Mirrors:
- Insecure: false
Location: ghcr.io/nethserver/docker.io
PullFromMirror: ""
Prefix: docker.io
PullFromMirror: ""
search:
- registry.access.redhat.com
- registry.redhat.io
- docker.io
store:
configFile: /home/loki1/.config/containers/storage.conf
containerStore:
number: 0
paused: 0
running: 0
stopped: 0
graphDriverName: overlay
graphOptions: {}
graphRoot: /home/loki1/.local/share/containers/storage
graphRootAllocated: 83880697856
graphRootUsed: 42968928256
graphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "true"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "false"
imageCopyTmpDir: /var/tmp
imageStore:
number: 6
runRoot: /run/user/1003/containers
transientStore: false
volumePath: /home/loki1/.local/share/containers/storage/volumes
version:
APIVersion: 5.2.2
Built: 1738640782
BuiltTime: Tue Feb 4 04:46:22 2025
GitCommit: ""
GoVersion: go1.22.9 (Red Hat 1.22.9-2.el9_5)
Os: linux
OsArch: linux/amd64
Version: 5.2.2
and
~]# runagent -m loki1 podman system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 6 3 454.6MB 106.5MB (23%)
Containers 0 0 0B 0B (0%)
Local Volumes 2 0 192.2MB 192.2MB (100%)
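(Side note, not the fix in this case: if byte usage were actually the problem, the reclaimable image space above could be freed with podman’s prune command. `--volumes` is deliberately left out because Loki keeps its data in those local volumes; they only show as 100% reclaimable because no container was running at the moment of the check.)
```
# Remove dangling images only; never add --volumes for a Loki module,
# its data lives in the local volumes listed above.
runagent -m loki1 podman image prune
```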
davidep (Davide Principi), May 12, 2025, 2:05pm
Probably the tmpfs mounted on /run/user/1003 (is 1003 Loki’s UID?) has exhausted its inodes.
In this case, either reboot the node or stop/start the Loki user session:
systemctl stop user@$(id -u loki1)
systemctl start user@$(id -u loki1)
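To verify the diagnosis, check the inode columns rather than the byte columns (1003 is Loki’s UID, as seen in the error messages above):
```
# df -h looked fine because bytes were free; IFree/IUse% tell the real story.
df -i /run/user/1003
# After the stop/start, IUse% should drop back to a few percent.
```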
Yeah! @davidep you nailed it! How can I avoid this happening again?
davidep (Davide Principi), May 12, 2025, 2:37pm
You hit a bug that has already been fixed. However, once Loki fills the tmpfs, a manual recovery is required. See Loki memberlist-kv startup error · Issue #7426 · NethServer/dev · GitHub
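Beyond updating to the fixed version, a small watchdog could catch this class of problem early. A minimal sketch, not something NS8 ships; the 90% threshold and the mount pattern are arbitrary choices:
```
#!/bin/sh
# Warn when any per-user tmpfs under /run/user has used more than
# THRESHOLD percent of its inodes (ipcent = the "IUse%" column of df -i).
THRESHOLD=90
df -i --output=ipcent,target | awk -v t="$THRESHOLD" '
  $2 ~ /^\/run\/user\// {
    sub(/%/, "", $1)
    if ($1 + 0 > t) printf "WARNING: %s%% inodes used on %s\n", $1, $2
  }'
```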