Samba error with cluster restore NS8

I currently have an NS8 server running behind an OPNsense firewall with HAproxy, all virtualized on ESXi, with no issues.
I needed new hypervisor hardware and decided to move away from ESXi to Proxmox.
In the current test setup the WAN port of Proxmox is connected to my local LAN, and I have an isolated Proxmox environment.
I made an NS8 cluster configuration backup on ESXi and did a cluster restore on a fresh NS8 qcow2 install on Proxmox.
After that I restored loki, traefik and samba (using IDrive S3); only the samba restore fails with the error “Job for samba-dc.service failed because a timeout was exceeded.” Restarting the samba-dc service runs into the same timeout.
/var/log/messages on Rocky Linux repeatedly shows:
Oct 4 11:40:27 node bash[10265]: /usr/bin/bash: connect: Connection refused
Oct 4 11:40:27 node bash[10265]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused

I understand that the cause will be somewhere in my own setup, but I’m completely puzzled about where to look next. Any hints pointing me in the right direction are much appreciated.

Last part of /var/log/messages:

Oct  4 11:40:08 node redis[1456]: 1:M 04 Oct 2025 11:40:08.057 * 1 changes in 5 seconds. Saving...
Oct  4 11:40:08 node redis[1456]: 1:M 04 Oct 2025 11:40:08.058 * Background saving started by pid 86
Oct  4 11:40:08 node redis[1456]: 86:C 04 Oct 2025 11:40:08.233 * DB saved on disk
Oct  4 11:40:08 node redis[1456]: 86:C 04 Oct 2025 11:40:08.233 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
Oct  4 11:40:08 node redis[1456]: 1:M 04 Oct 2025 11:40:08.258 * Background saving terminated with success
Oct  4 11:40:09 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:09 +0000] "GET /cluster-admin/api/cluster/task/c8afd5b2-3386-4166-90b9-482100e1992e/context HTTP/2.0" 200 454 "-" "-" 1533 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct  4 11:40:09 node bash[10092]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:09 node bash[10092]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:10 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:10 +0000] "GET /cluster-admin/api/module/samba1/task/ef9a1cb5-349d-4aef-84fc-c1443e60ae21/context HTTP/2.0" 200 844 "-" "-" 1534 "cluster-admin-https@file" "http://127.0.0.1:9311" 152ms
Oct  4 11:40:14 node redis[1456]: 1:M 04 Oct 2025 11:40:14.079 * 1 changes in 5 seconds. Saving...
Oct  4 11:40:14 node redis[1456]: 1:M 04 Oct 2025 11:40:14.079 * Background saving started by pid 87
Oct  4 11:40:14 node redis[1456]: 87:C 04 Oct 2025 11:40:14.187 * DB saved on disk
Oct  4 11:40:14 node redis[1456]: 87:C 04 Oct 2025 11:40:14.188 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
Oct  4 11:40:14 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:14 +0000] "GET /cluster-admin/api/cluster/task/c8afd5b2-3386-4166-90b9-482100e1992e/context HTTP/2.0" 200 454 "-" "-" 1535 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct  4 11:40:14 node redis[1456]: 1:M 04 Oct 2025 11:40:14.281 * Background saving terminated with success
Oct  4 11:40:14 node bash[10092]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:14 node bash[10092]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:15 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:15 +0000] "GET /cluster-admin/api/module/samba1/task/ef9a1cb5-349d-4aef-84fc-c1443e60ae21/context HTTP/2.0" 200 844 "-" "-" 1536 "cluster-admin-https@file" "http://127.0.0.1:9311" 177ms
Oct  4 11:40:19 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:19 +0000] "GET /cluster-admin/api/cluster/task/c8afd5b2-3386-4166-90b9-482100e1992e/context HTTP/2.0" 200 454 "-" "-" 1537 "cluster-admin-https@file" "http://127.0.0.1:9311" 195ms
Oct  4 11:40:19 node bash[10092]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:19 node bash[10092]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:19 node systemd[9391]: samba-dc.service: start-post operation timed out. Terminating.
Oct  4 11:40:19 node systemd[9391]: samba-dc.service: Control process exited, code=killed, status=15/TERM
Oct  4 11:40:19 node systemd[9391]: libpod-e90b9a11dee24a37aad8cebe5839964dcc4d1bd0abfd72e21a40b69660529668.scope: Consumed 1.493s CPU time.
Oct  4 11:40:20 node redis[1456]: 1:M 04 Oct 2025 11:40:20.001 * 1 changes in 5 seconds. Saving...
Oct  4 11:40:20 node redis[1456]: 1:M 04 Oct 2025 11:40:20.001 * Background saving started by pid 88
Oct  4 11:40:20 node redis[1456]: 88:C 04 Oct 2025 11:40:20.009 * DB saved on disk
Oct  4 11:40:20 node redis[1456]: 88:C 04 Oct 2025 11:40:20.009 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
Oct  4 11:40:20 node redis[1456]: 1:M 04 Oct 2025 11:40:20.101 * Background saving terminated with success
Oct  4 11:40:20 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:20 +0000] "GET /cluster-admin/api/module/samba1/task/ef9a1cb5-349d-4aef-84fc-c1443e60ae21/context HTTP/2.0" 200 844 "-" "-" 1538 "cluster-admin-https@file" "http://127.0.0.1:9311" 65ms
Oct  4 11:40:20 node podman[10226]: e90b9a11dee24a37aad8cebe5839964dcc4d1bd0abfd72e21a40b69660529668
Oct  4 11:40:20 node systemd[9391]: samba-dc.service: Failed with result 'timeout'.
Oct  4 11:40:20 node systemd[9391]: Failed to start Samba DC and File Server.
Oct  4 11:40:20 node agent@samba1[9412]: Job for samba-dc.service failed because a timeout was exceeded.
Oct  4 11:40:20 node agent@samba1[9412]: See "systemctl --user status samba-dc.service" and "journalctl --user -xeu samba-dc.service" for details.
Oct  4 11:40:20 node agent@samba1[9412]: task/module/samba1/ef9a1cb5-349d-4aef-84fc-c1443e60ae21: action "restore-module" status is "aborted" (1) at step 60resume_state
Oct  4 11:40:20 node agent@cluster[2466]: Assertion failed
Oct  4 11:40:20 node agent@cluster[2466]:  File "/var/lib/nethserver/cluster/actions/restore-module/50restore_module", line 103, in <module>
Oct  4 11:40:20 node agent@cluster[2466]:    agent.assert_exp(restore_task_result['exit_code'] == 0)
Oct  4 11:40:20 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:20 +0000] "GET /cluster-admin/api/module/samba1/task/ef9a1cb5-349d-4aef-84fc-c1443e60ae21/context HTTP/2.0" 200 844 "-" "-" 1539 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct  4 11:40:20 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:20 +0000] "GET /cluster-admin/api/module/samba1/task/ef9a1cb5-349d-4aef-84fc-c1443e60ae21/status HTTP/2.0" 200 3777 "-" "-" 1540 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct  4 11:40:20 node agent@cluster[2466]: task/cluster/c8afd5b2-3386-4166-90b9-482100e1992e: action "restore-module" status is "aborted" (2) at step 50restore_module
Oct  4 11:40:20 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:20 +0000] "GET /cluster-admin/api/cluster/task/c8afd5b2-3386-4166-90b9-482100e1992e/context HTTP/2.0" 200 454 "-" "-" 1541 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct  4 11:40:20 node traefik[4750]: 192.168.22.188 - - [04/Oct/2025:11:40:20 +0000] "GET /cluster-admin/api/cluster/task/c8afd5b2-3386-4166-90b9-482100e1992e/status HTTP/2.0" 200 324 "-" "-" 1542 "cluster-admin-https@file" "http://127.0.0.1:9311" 2ms
Oct  4 11:40:20 node systemd[9391]: samba-dc.service: Scheduled restart job, restart counter is at 1.
Oct  4 11:40:20 node systemd[9391]: Stopped Samba DC and File Server.
Oct  4 11:40:20 node systemd[9391]: Starting Samba DC and File Server...
Oct  4 11:40:22 node systemd[9391]: Started libcrun container.
Oct  4 11:40:22 node kernel: xfs filesystem being remounted at /home/samba1/.local/share/containers/storage/overlay/6719646a7b48f79780caa01b5812a3c16c225a1c3755df2536345b538051b2e0/merged/etc/samba supports timestamps until 2038 (0x7fffffff)
Oct  4 11:40:22 node kernel: xfs filesystem being remounted at /home/samba1/.local/share/containers/storage/overlay/6719646a7b48f79780caa01b5812a3c16c225a1c3755df2536345b538051b2e0/merged/srv/shares supports timestamps until 2038 (0x7fffffff)
Oct  4 11:40:22 node kernel: xfs filesystem being remounted at /home/samba1/.local/share/containers/storage/overlay/6719646a7b48f79780caa01b5812a3c16c225a1c3755df2536345b538051b2e0/merged/srv/homes supports timestamps until 2038 (0x7fffffff)
Oct  4 11:40:22 node kernel: xfs filesystem being remounted at /home/samba1/.local/share/containers/storage/overlay/6719646a7b48f79780caa01b5812a3c16c225a1c3755df2536345b538051b2e0/merged/var/lib/samba supports timestamps until 2038 (0x7fffffff)
Oct  4 11:40:22 node podman[10240]: 886cfb3ea9c4b47d00cf7c6475d43515c6bdb79fbde2c1bf5814f96ca8d8b9fa
Oct  4 11:40:22 node bash[10265]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:22 node bash[10265]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:22 node samba-dc[10250]: # Global parameters
Oct  4 11:40:22 node samba-dc[10250]: [global]
Oct  4 11:40:22 node samba-dc[10250]: #011bind interfaces only = Yes
Oct  4 11:40:22 node samba-dc[10250]: #011interfaces = 127.0.0.1 10.5.4.1
Oct  4 11:40:22 node samba-dc[10250]: #011obey pam restrictions = Yes
Oct  4 11:40:22 node samba-dc[10250]: #011passdb backend = samba_dsdb
Oct  4 11:40:22 node samba-dc[10250]: #011realm = AD.mydomain.NL
Oct  4 11:40:22 node samba-dc[10250]: #011registry shares = Yes
Oct  4 11:40:22 node samba-dc[10250]: #011server role = active directory domain controller
Oct  4 11:40:22 node samba-dc[10250]: #011template homedir = /srv/homes/%U
Oct  4 11:40:22 node samba-dc[10250]: #011workgroup = mydomain
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:tcpip = no
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_daemon:spoolssd = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:spoolss = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:winreg = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:ntsvcs = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:eventlog = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:srvsvc = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:svcctl = embedded
Oct  4 11:40:22 node samba-dc[10250]: #011rpc_server:default = external
Oct  4 11:40:22 node samba-dc[10250]: #011winbindd:use external pipes = true
Oct  4 11:40:22 node samba-dc[10250]: #011recycle:versions = yes
Oct  4 11:40:22 node samba-dc[10250]: #011recycle:keeptree = yes
Oct  4 11:40:22 node samba-dc[10250]: #011recycle:repository = 
Oct  4 11:40:22 node samba-dc[10250]: #011full_audit:failure = none
Oct  4 11:40:22 node samba-dc[10250]: #011full_audit:success = none
Oct  4 11:40:22 node samba-dc[10250]: #011full_audit:priority = INFO
Oct  4 11:40:22 node samba-dc[10250]: #011full_audit:facility = LOCAL7
Oct  4 11:40:22 node samba-dc[10250]: #011full_audit:prefix = %R|%I|%u|%S
Oct  4 11:40:22 node samba-dc[10250]: #011acl_xattr:ignore system acls = yes
Oct  4 11:40:22 node samba-dc[10250]: #011acl_xattr:security_acl_name = user.NTACL
Oct  4 11:40:22 node samba-dc[10250]: #011idmap config * : backend = tdb
Oct  4 11:40:22 node samba-dc[10250]: #011include = /etc/samba/include.conf
Oct  4 11:40:22 node samba-dc[10250]: #011inherit owner = windows and unix
Oct  4 11:40:22 node samba-dc[10250]: #011map archive = No
Oct  4 11:40:22 node samba-dc[10250]: #011vfs objects = dfs_samba4 acl_xattr recycle full_audit
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: [sysvol]
Oct  4 11:40:22 node samba-dc[10250]: #011inherit owner = no
Oct  4 11:40:22 node samba-dc[10250]: #011path = /var/lib/samba/sysvol
Oct  4 11:40:22 node samba-dc[10250]: #011read only = No
Oct  4 11:40:22 node samba-dc[10250]: #011acl_xattr:ignore system acls = no
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: [netlogon]
Oct  4 11:40:22 node samba-dc[10250]: #011path = /var/lib/samba/sysvol/ad.mydomain.nl/scripts
Oct  4 11:40:22 node samba-dc[10250]: #011read only = No
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: 
Oct  4 11:40:22 node samba-dc[10250]: [homes]
Oct  4 11:40:22 node samba-dc[10250]: #011browseable = No
Oct  4 11:40:22 node samba-dc[10250]: #011comment = %u home directory
Oct  4 11:40:22 node samba-dc[10250]: #011read only = No
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04T11:40:22Z chronyd version 4.5 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04T11:40:22Z Disabled control of system clock
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04T11:40:22Z Loaded 0 symmetric keys
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04T11:40:22Z MS-SNTP authentication enabled
Oct  4 11:40:22 node samba-dc[10250]: [2025-10-04T11:40:22.071359] smart-multi-line: error opening smart-multi-line.fsm file; filename='/usr/share/syslog-ng/smart-multi-line.fsm', error='No such file or directory (2)'
Oct  4 11:40:22 node samba-dc[10250]: [2025-10-04T11:40:22.071495] smart-multi-line: your smart-multi-line.fsm seems to be empty or non-existent, automatic multi-line log extraction will probably not work; filename='/usr/share/syslog-ng/smart-multi-line.fsm'
Oct  4 11:40:22 node samba-dc[10250]: samba version 4.19.5-Ubuntu started.
Oct  4 11:40:22 node samba-dc[10250]: Copyright Andrew Tridgell and the Samba Team 1992-2023
Oct  4 11:40:22 node samba-dc[10250]: daemon 'samba' : Starting process...
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04 11:40:22,181:wsdd ERROR(pid 17): error while sending packet on wg0: [Errno 126] Required key not available
Oct  4 11:40:22 node samba-dc[10250]: /usr/sbin/smbd: smbd version 4.19.5-Ubuntu started.
Oct  4 11:40:22 node samba-dc[10250]: /usr/sbin/smbd: Copyright Andrew Tridgell and the Samba Team 1992-2023
Oct  4 11:40:22 node samba-dc[10250]: /usr/sbin/smbd: INFO: Profiling turned OFF from pid 26
Oct  4 11:40:22 node samba-dc[10250]: /usr/sbin/winbindd: winbindd version 4.19.5-Ubuntu started.
Oct  4 11:40:22 node samba-dc[10250]: /usr/sbin/winbindd: Copyright Andrew Tridgell and the Samba Team 1992-2023
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04 11:40:22,344:wsdd ERROR(pid 17): error while sending packet on wg0: [Errno 126] Required key not available
Oct  4 11:40:22 node samba-dc[10250]: 2025-10-04 11:40:22,668:wsdd ERROR(pid 17): error while sending packet on wg0: [Errno 126] Required key not available
Oct  4 11:40:23 node samba-dc[10250]: 2025-10-04 11:40:23,169:wsdd ERROR(pid 17): error while sending packet on wg0: [Errno 126] Required key not available
Oct  4 11:40:26 node redis[1456]: 1:M 04 Oct 2025 11:40:26.021 * 1 changes in 5 seconds. Saving...
Oct  4 11:40:26 node redis[1456]: 1:M 04 Oct 2025 11:40:26.022 * Background saving started by pid 89
Oct  4 11:40:26 node redis[1456]: 89:C 04 Oct 2025 11:40:26.162 * DB saved on disk
Oct  4 11:40:26 node redis[1456]: 89:C 04 Oct 2025 11:40:26.163 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
Oct  4 11:40:26 node redis[1456]: 1:M 04 Oct 2025 11:40:26.223 * Background saving terminated with success
Oct  4 11:40:27 node bash[10265]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:27 node bash[10265]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:32 node bash[10265]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:32 node bash[10265]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct  4 11:40:37 node bash[10265]: /usr/bin/bash: connect: Connection refused
Oct  4 11:40:37 node bash[10265]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused

Error messages after the restore:

renamed 'restore/private' -> './private'
renamed 'restore/state/sysvol' -> './sysvol'
renamed 'restore/state/account_policy.tdb' -> './account_policy.tdb'
renamed 'restore/state/registry.tdb' -> './registry.tdb'
renamed 'restore/state/share_info.tdb' -> './share_info.tdb'
renamed 'restore/state/winbindd_cache.tdb' -> './winbindd_cache.tdb'
removed 'restore/backup.txt'
removed 'restore/state/syslog-ng/syslog-ng-disk-buffer.dirlock'
removed 'restore/state/syslog-ng/syslog-ng-00000.qf'
removed 'restore/state/syslog-ng/syslog-ng-00001.qf'
removed 'restore/state/syslog-ng/syslog-ng-00002.qf'
removed 'restore/state/syslog-ng/syslog-ng-00003.qf'
removed directory 'restore/state/syslog-ng'
removed directory 'restore/state'
removed 'restore/etc/gdbcommands'
removed 'restore/etc/include.conf'
removed 'restore/etc/smb.conf.distro'
removed 'restore/etc/smb.conf'
removed 'restore/etc/smb.conf.orig'
removed directory 'restore/etc'
removed 'restore/gencache.tdb'
removed directory 'restore'
removed 'backup/samba-backup.tar.bz2'
removed directory 'backup'
Created symlink /home/samba1/.config/systemd/user/default.target.wants/samba-dc.service → /home/samba1/.config/systemd/user/samba-dc.service.
Job for samba-dc.service failed because a timeout was exceeded.
See "systemctl --user status samba-dc.service" and "journalctl --user -xeu samba-dc.service" for details.

Samba is not starting. The unit checks port 53 on the Samba address, but Samba never listens, so you get “connection refused” and eventually a timeout.
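
You can reproduce that check by hand, a sketch that mirrors the /dev/tcp probe the unit runs (10.5.4.1 is taken from the log lines above):

# probe the AD ports the start-post check waits for; each should print "open" once samba-dc listens
for port in 53 88 389 3268; do
  timeout 2 bash -c "exec 3<>/dev/tcp/10.5.4.1/$port" && echo "port $port open" || echo "port $port closed/refused"
done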

Some blind shots:

ERROR(pid 17): error while sending packet on wg0: [Errno 126] Required key not available

Does WireGuard work?

ping 10.5.4.1
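
A quick way to check that, just a sketch (wg0 and 10.5.4.1 are taken from the errors and log lines above):

wg show wg0          # is the WireGuard interface up, with a private key and peers?
ip addr show wg0     # does it carry 10.5.4.1?
ping -c 3 10.5.4.1   # does the VPN address answer at all?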

Proxmox is installed on a VIPER VP4000 Mini 500GB SSD and the Proxmox VM on a Lexar LNQ790 4TB SSD; there is more than 3.5 TB of space left. NS8 is a clean qcow2 environment, so I assume there is enough disk space and the SSDs should deliver enough speed.

I can ping 10.5.4.1

The same cluster backup-restore process on a fresh NS8 vmdk on my ESXi system delivers the same Samba error.

You mean there should be a warning in the /var/log/messages log (or the NS8 GUI system logs)?

In the NS8 GUI system logs.

Did the IP change in the new environment? Maybe you need to set the new IP for Samba.

Both have the same fixed IP address.

On my current working NS8 system I restarted Samba (runagent -m samba1 systemctl --user restart samba-dc) and compared the logs with the non-working restored NS8 system.

The NS8-working logs also show the

error while sending packet on wg0: [Errno 126] Required key not available

errors, so those don’t seem to be the cause.

The NS8-working log shows

node1 systemd[1685]: Started Samba DC and File Server.

while the NS8-non-working log doesn’t show this message, but instead repeatedly shows

[1:samba1:bash] /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused

The NS8-working log only seems to show this last message during the restart of Samba (or other services).

I also compared the output of the following command on both systems:

 runagent -m samba1 systemctl --user status samba-dc.service

NS8-working shows:

samba-dc.service - Samba DC and File Server
     Loaded: loaded (/home/samba3/.config/systemd/user/samba-dc.service; enabled; preset: disabled)
     Active: active (running) since Mon 2025-10-06 14:41:43 UTC; 1h 9min ago
    Process: 3722 ExecCondition=bash -c [[ -n "$$IPADDRESS" ]] (code=exited, status=0/SUCCESS)
    Process: 3723 ExecStartPre=/bin/rm -f /run/user/1012/samba-dc.pid /run/user/1012/samba-dc.cid (code=exited, status=0/SUCCESS)
    Process: 3724 ExecStartPre=runagent bash -c (echo -n DNS_FORWARDER= ; print-nameservers) > dns_forwarder.env (code=exited, status=0/SUCCESS)
    Process: 3727 ExecStart=/usr/bin/podman run --dns=none --no-hosts --detach --conmon-pidfile /run/user/1012/samba-dc.pid --cidfile /run/user/1012/samba-dc.cid --cgroups=no-conmon --network=host --hostname=${HOSTNAME} --replace --name=samba-dc --env=REALM --env=IPADDRESS --env=PREFIXLEN --env=NBDOMAIN --env>
    Process: 3746 ExecStartPost=/usr/bin/bash -c [[ $$SERVER_ROLE == member ]] || while ! exec 3<>/dev/tcp/${IPADDRESS}/53; do sleep 5 ; done (code=exited, status=0/SUCCESS)
    Process: 3827 ExecStartPost=/usr/bin/bash -c [[ $$SERVER_ROLE == member ]] || while ! exec 3<>/dev/tcp/${IPADDRESS}/88; do sleep 5 ; done (code=exited, status=0/SUCCESS)
    Process: 3828 ExecStartPost=/usr/bin/bash -c [[ $$SERVER_ROLE == member ]] || while ! exec 3<>/dev/tcp/${IPADDRESS}/389; do sleep 5 ; done (code=exited, status=0/SUCCESS)
    Process: 3829 ExecStartPost=/usr/bin/bash -c [[ $$SERVER_ROLE == member ]] || while ! exec 3<>/dev/tcp/${IPADDRESS}/3268; do sleep 5 ; done (code=exited, status=0/SUCCESS)
   Main PID: 3736 (conmon)
      Tasks: 1 (limit: 22933)
     Memory: 688.0K
        CPU: 313ms
     CGroup: /user.slice/user-1012.slice/user@1012.service/app.slice/samba-dc.service
             └─3736 /usr/bin/conmon --api-version 1 -c 548ad02e8960e6823035f3ae9310e244d7b271cb5719b2636b0a78f34d61a638 -u 548ad02e8960e6823035f3ae9310e244d7b271cb5719b2636b0a78f34d61a638 -r /usr/bin/crun -b /home/samba3/.local/share/containers/storage/overlay-containers/548ad02e8960e6823035f3ae9310e244d7b271>

NS8-non-working shows:

● samba-dc.service - Samba DC and File Server
     Loaded: loaded (/home/samba3/.config/systemd/user/samba-dc.service; enabled; preset: disabled)
     Active: activating (start-post) since Mon 2025-10-06 15:49:53 UTC; 1s ago
    Process: 24718 ExecCondition=bash -c [[ -n "$$IPADDRESS" ]] (code=exited, status=0/SUCCESS)
    Process: 24719 ExecStartPre=/bin/rm -f /run/user/1007/samba-dc.pid /run/user/1007/samba-dc.cid (code=exited, status=0/SUCCESS)
    Process: 24720 ExecStartPre=runagent bash -c (echo -n DNS_FORWARDER= ; print-nameservers) > dns_forwarder.env (code=exited, status=0/SUCCESS)
    Process: 24723 ExecStart=/usr/bin/podman run --dns=none --no-hosts --detach --conmon-pidfile /run/user/1007/samba-dc.pid --cidfile /run/user/1007/samba-dc.cid --cgroups=no->
   Main PID: 24733 (conmon); Control PID: 24743 (bash)
      Tasks: 3 (limit: 22882)
     Memory: 1.1M
        CPU: 311ms
     CGroup: /user.slice/user-1007.slice/user@1007.service/app.slice/samba-dc.service
             ├─24733 /usr/bin/conmon --api-version 1 -c dd05a532b1fff570b01368cb9bb44f4e6ec19be54df3b6d023a519c9c6ba0dd3 -u dd05a532b1fff570b01368cb9bb44f4e6ec19be54df3b6d023a5>
             ├─24743 /usr/bin/bash -c "[[ \$SERVER_ROLE == member ]] || while ! exec 3<>/dev/tcp/10.5.4.1/53; do sleep 5 ; done"
             └─24749 sleep 5

Here two extra processes (24743 and 24749) are still running compared with NS8-working. Could that be part of the problem? (I didn’t try to kill them yet.)

I tried to reproduce this but without success; a restored Samba starts on a new NS8.
Was the file server enabled on the old installation?
Are there unconfigured domains in the UI on the Domains and users page of the newly installed NS8?

I don’t think so. Those are the processes that check whether Samba is up by probing the ports every 5 seconds.
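
Since the container runs with --network=host, the DC ports should show up on the host; a quick check could be (sketch, assuming ss is available on Rocky Linux):

ss -lntup | grep -E ':(53|88|389|3268)\b'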

Thanks Markuz for your effort. No, the file server is not enabled on the old installation.

I did a completely new install of NS8 on Proxmox using qcow2: created a cluster, created a new domain and added a backup destination. I did a backup of traefik and samba and saved the cluster backup. Then I stopped the NS8 server, created a new NS8 again and did a cluster restore from the previous one. The traefik restore went OK; the samba restore again ran into the same error.

I investigated the logs: compared with a normal Samba restart, which ends with “Started Samba DC and File Server”, the log in the Samba error situation repeatedly shows, every 5 seconds:

Oct 15 14:54:58 node bash[8673]: /usr/bin/bash: connect: Connection refused
Oct 15 14:54:58 node bash[8673]: /usr/bin/bash: line 1: /dev/tcp/10.5.4.1/53: Connection refused
Oct 15 14:54:58 node traefik[4823]: 192.168.22.188 - - [15/Oct/2025:14:54:58 +0000] "GET /cluster-admin/api/cluster/task/07214544-bbf7-42db-af7f-659475770eb7/context HTTP/2.0" 200 455 "-" "-" 642 "cluster-admin-https@file" "http://127.0.0.1:9311" 3ms
Oct 15 14:54:59 node traefik[4823]: 192.168.22.188 - - [15/Oct/2025:14:54:58 +0000] "GET /cluster-admin/api/module/samba1/task/f16799b3-da0c-4b3e-9db8-159869dd23ab/context HTTP/2.0" 200 764 "-" "-" 643 "cluster-admin-https@file" "http://127.0.0.1:9311" 110ms
Oct 15 14:55:01 node redis[1453]: 1:M 15 Oct 2025 14:55:01.038 * 1 changes in 5 seconds. Saving...
Oct 15 14:55:01 node redis[1453]: 1:M 15 Oct 2025 14:55:01.038 * Background saving started by pid 69
Oct 15 14:55:01 node redis[1453]: 69:C 15 Oct 2025 14:55:01.153 * DB saved on disk
Oct 15 14:55:01 node redis[1453]: 69:C 15 Oct 2025 14:55:01.153 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
Oct 15 14:55:01 node redis[1453]: 1:M 15 Oct 2025 14:55:01.239 * Background saving terminated with success

until the samba-dc.service timeout:

Oct 15 14:56:23 node systemd[7984]: samba-dc.service: start-post operation timed out. Terminating.
Oct 15 14:56:23 node systemd[7984]: samba-dc.service: Control process exited, code=killed, status=15/TERM
Oct 15 14:56:24 node agent@samba1[8004]: Job for samba-dc.service failed because a timeout was exceeded.
Oct 15 14:56:24 node agent@samba1[8004]: See "systemctl --user status samba-dc.service" and "journalctl --user -xeu samba-dc.service" for details.
Oct 15 14:56:24 node agent@samba1[8004]: task/module/samba1/f16799b3-da0c-4b3e-9db8-159869dd23ab: action "restore-module" status is "aborted" (1) at step 60resume_state
Oct 15 14:56:24 node agent@cluster[2649]: Assertion failed
Oct 15 14:56:24 node agent@cluster[2649]:  File "/var/lib/nethserver/cluster/actions/restore-module/50restore_module", line 103, in <module>
Oct 15 14:56:24 node agent@cluster[2649]:    agent.assert_exp(restore_task_result['exit_code'] == 0)

Then it stops and starts Samba and runs into a timeout, over and over again.

For whatever reason port 53 (and, I think, 88, 389 and 3268) is not opened on 10.5.4.1. I don’t know where that should happen: in 50restore_module, in or after 60resume_state, or somewhere else before that?
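
To see what Samba itself reports inside the container, a sketch (assuming the container name samba-dc from the unit’s --name option, and that runagent can wrap podman the same way it wraps systemctl above):

runagent -m samba1 podman logs --tail 100 samba-dc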

A Mail and Roundcubemail restore also works fine (on a clean NS8 install with a created domain and users added manually).

It remains strange that after creating a new domain with users and doing a mail restore I have a working system, but I can’t do a backup/restore with Samba to re-create the same situation.

Could it be that my domain name is not resolved well? (Putting it explicitly in the /etc/hosts file didn’t help.)
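
One way to check the name resolution part, just a sketch (the realm ad.mydomain.nl is taken from the smb.conf shown in the logs above):

getent hosts ad.mydomain.nl   # does the node resolve the realm at all?
grep -i mydomain /etc/hosts   # is the manual /etc/hosts entry still there?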

For now I’ll stop investigating the issue. No Samba restore possible for me :unamused_face:

Did you enter 127.0.0.1 as a nameserver? That could cause issues too, see also Samba DNS 100% CPU - #14 by mrmarkuz.
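
You could also double-check which forwarders the unit picks up, a sketch using the print-nameservers helper from its ExecStartPre (assuming it can be invoked via runagent like systemctl above):

runagent -m samba1 print-nameservers
cat /etc/resolv.conf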

Did you already try to follow the logs on the CLI using

journalctl -f

in one terminal and restarting samba-dc in another terminal:

runagent -m samba1 systemctl --user restart samba-dc

No, it’s not in there. Only my original gateway.

I didn’t know this one; I tried it, but in the end it gives me no extra insight. It is, of course, just a copy of the logs.

Let’s check the permissions of the config files (in my case the Samba instance is samba3; you’d need to adapt it to your instance):

[root@ns8rockytest ~]# ls -l /home/samba3/.local/share/containers/storage/volumes/config/_data/
total 24
-rw-r--r--. 1 samba3 samba3    8 Aug  7  2023 gdbcommands
-rw-r--r--. 1 samba3 samba3   46 Sep 29 15:39 include.conf
-rw-r--r--. 1 samba3 samba3 1277 Oct 16 08:13 smb.conf
-rw-r--r--. 1 samba3 samba3 8917 Nov 18  2024 smb.conf.distro
drwxr-xr-x. 2 samba3 samba3    6 Oct 10  2023 tls

Let’s check the smb.conf:

[root@ns8rockytest ~]# cat /home/samba3/.local/share/containers/storage/volumes/config/_data/smb.conf
# Generated by expand-config. Manual changes to this file are lost!
[global]
        
        bind interfaces only = Yes
        interfaces = 127.0.0.1 192.168.3.144
        netbios name = DC3
        netbios aliases = 
        realm = AD.NS8TEST.COM
        workgroup = NS8TEST
        log level = 1 auth_audit:3
        acl_xattr:security_acl_name = user.NTACL
        acl_xattr:ignore system acls = yes
        template homedir = /srv/homes/%U
        obey pam restrictions = yes
        registry shares = yes
        inherit owner = yes
        full_audit:prefix = %R|%I|%u|%S
        full_audit:facility = LOCAL7
        full_audit:priority = INFO
        full_audit:success = none
        full_audit:failure = none
        recycle:repository =
        recycle:keeptree = yes
        recycle:versions = yes

        vfs objects = dfs_samba4 acl_xattr recycle full_audit
        server role = active directory domain controller
        include = /etc/samba/include.conf

[sysvol]
        path = /var/lib/samba/sysvol
        read only = No
        acl_xattr:ignore system acls = no
        inherit owner = no
[netlogon]
        path = /var/lib/samba/sysvol/ad.ns8test.com/scripts
        read only = No

[homes]
comment = %u home directory
browseable = no
writeable = yes

My situation (I currently also have samba3):

[root@node ~]# ls -l /home/samba3/.local/share/containers/storage/volumes/config/_data
total 24
-rw-r--r--. 1 samba3 samba3    8 Apr 22 13:55 gdbcommands
-rw-r--r--. 1 samba3 samba3   94 Sep 15 15:43 include.conf
-rw-r--r--. 1 samba3 samba3 1276 Oct 18 09:17 smb.conf
-rw-r--r--. 1 samba3 samba3 8917 Jul 21 20:37 smb.conf.distro
drwxr-xr-x. 2 samba3 samba3    6 Jul 21 20:37 tls

and

[root@node ~]# cat /home/samba3/.local/share/containers/storage/volumes/config/_data/smb.conf
# Generated by expand-config. Manual changes to this file are lost!
[global]

        bind interfaces only = Yes
        interfaces = 127.0.0.1 10.5.4.1
        netbios name = DC1TESTR1
        netbios aliases =
        realm = AD.MYHOSTNAME.NL
        workgroup = MYHOSTNAME
        log level = 1 auth_audit:3
        acl_xattr:security_acl_name = user.NTACL
        acl_xattr:ignore system acls = yes
        template homedir = /srv/homes/%U
        obey pam restrictions = yes
        registry shares = yes
        inherit owner = yes
        full_audit:prefix = %R|%I|%u|%S
        full_audit:facility = LOCAL7
        full_audit:priority = INFO
        full_audit:success = none
        full_audit:failure = none
        recycle:repository =
        recycle:keeptree = yes
        recycle:versions = yes

        vfs objects = dfs_samba4 acl_xattr recycle full_audit
        server role = active directory domain controller
        include = /etc/samba/include.conf

[sysvol]
        path = /var/lib/samba/sysvol
        read only = No
        acl_xattr:ignore system acls = no
        inherit owner = no
[netlogon]
        path = /var/lib/samba/sysvol/ad.myhostname.nl/scripts
        read only = No

[homes]
comment = %u home directory
browseable = no
writeable = yes

Looks quite the same to me.
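
To rule out subtle differences, a sketch for diffing the two files directly (assuming SSH access from the non-working node to the working one; the hostname "workingnode" is hypothetical):

scp root@workingnode:/home/samba3/.local/share/containers/storage/volumes/config/_data/smb.conf /tmp/smb.conf.working
diff /tmp/smb.conf.working /home/samba3/.local/share/containers/storage/volumes/config/_data/smb.conf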

A generic remark: always keep in mind how my setup may differ from more generic setups. I’m using a virtualized OPNsense as the firewall towards the internet; it terminates all TLS sessions. HAproxy in OPNsense routes on hostnames and passes the connections on via port 80.
