After disaster recovery nsdc doesn't start automatically, only manually

NethServer Version: 7.6.1810 (final)
Module: nsdc

Hi folks,

the last 2 days were hard. My Proxmox host had some filesystem failures. After repairing the filesystem, Proxmox was o.k. again, but the NethServer VMs then showed XFS failures. So I decided to do a disaster recovery with my HotSync slave. But this wasn’t the best idea, because if you have a production machine with a subscription and a slave without one, some repos seem to get mixed up. Also the MySQL database wasn’t synced, which was bad, because we’re using the SOGo calendar for event management in my company. Fortunately I was able to export the old databases from the crashed machine and no data was lost. :sweat_smile:
Besides that, I had a backup from the night before, so disaster recovery from backup was also an option. Always good to have 2 options. :wink:

But after the recovery I now have a problem which I can’t solve.
Almost everything works fine again on the ex-slave, which now has the master role, except that the nsdc service doesn’t start automatically at boot. When I start it manually from the (old) Server Manager it works. :thinking:

The service is enabled in config, but provisiontype is “ns6upgrade”.
I looked at journalctl -M nsdc --since "xxxx" but found no hint.
Also in messages.log I can’t find a hint why.
I also did signal-event nethserver-sssd-save and signal-event nethserver-sssd-update.

So please @davidep can you give me some advice?

TIA Ralf

@flatspin

Hi
There are good reasons for having Proxmox do backups to some external system like a NAS - daily, for important VMs…
And it’s never a bad idea to have a double measure of backups for important VMs like Nethserver with AD.
NethServer does its own daily backups, but Proxmox also saves that machine on a daily basis.

If you have a proxmox backup of the Nethserver, you could try restoring that, and using rsync or whatever from your Hot-Sync copy to get data / mail back up to date…

My 2 cents
Andy

Hi Andy,

thanks for answering. The restore isn’t my real problem anymore.
It’s the nsdc container. The only thing is that the nsdc service doesn’t start automatically.
I’m sure that there is a simple solution for that, but I can’t find it ATM. :roll_eyes:

O.k. Problem solved. It was a misconfigured br0 device.


Glad you could figure out what was wrong. Can you elaborate what exactly was configured wrong? Always curious to learn what can go wrong and how to fix… :wink:

I think it was a duplicate MAC address. After resolving this, nsdc started correctly.
I had to give a specific gateway MAC address to some Windows clients (Windows network awareness).


The problem is back. :woozy_face:

The service nsdc is ignored at startup by systemd.
The service is enabled and the link @nsdc in /etc/systemd/system/machines.target.wants exists.
systemctl is-enabled nsdc gives enabled.
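
One way to double-check what is really linked into that wants directory is to list its symlinks together with their targets. This is only a minimal sketch: `list_wants` is a hypothetical helper name, not part of systemd or NethServer, and the path in the usage comment is the one mentioned above.

```shell
# list_wants DIR
# Hypothetical helper: prints "name -> target" for every symlink in DIR.
# Regular files and an empty directory produce no output.
list_wants() {
    dir="$1"
    for link in "$dir"/*; do
        [ -L "$link" ] && printf '%s -> %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
    return 0
}

# Usage on the affected server (not executed here):
#   list_wants /etc/systemd/system/machines.target.wants
```

A link in this directory only means the unit is wanted by machines.target, so the unit is started at boot only if machines.target itself is reached.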

Crazy: the workaround to put systemctl start nsdc into rc.local works.
It’s o.k. for me, but it’s not the way it’s meant to be, so I’d like to know why.

Any ideas @giacomo or @davidep

TIA Ralf

Search for any startup error in system journal

 journalctl -u nsdc

Hi Davide, thanks for answering.

No error in journal.
Only entry is Failed to create directory /var/lib/machines/nsdc//sys/fs/selinux: Read-only file system. But this is expected behaviour AFAIK.
And I think this comes from the service start through rc.local. It seems that systemd completely ignores the nsdc service during boot??

Could it be your (systemd) targets got a bit messed-up?
Does your system(d) boot to the right (= multi-user.target) default target ?

# systemctl get-default
multi-user.target

and is the machines.target loaded and active ?

# systemctl list-units --type target | grep machine
machines.target        loaded active active Containers

Full list of the setup over here:

# systemctl list-units --type target
UNIT                   LOAD   ACTIVE SUB    DESCRIPTION
basic.target           loaded active active Basic System
cryptsetup.target      loaded active active Local Encrypted Volumes
getty.target           loaded active active Login Prompts
local-fs-pre.target    loaded active active Local File Systems (Pre)
local-fs.target        loaded active active Local File Systems
machines.target        loaded active active Containers
multi-user.target      loaded active active Multi-User System
network-online.target  loaded active active Network is Online
network.target         loaded active active Network
nfs-client.target      loaded active active NFS client services
nss-lookup.target      loaded active active Host and Network Name Lookups
nss-user-lookup.target loaded active active User and Group Name Lookups
paths.target           loaded active active Paths
remote-fs-pre.target   loaded active active Remote File Systems (Pre)
remote-fs.target       loaded active active Remote File Systems
rpc_pipefs.target      loaded active active rpc_pipefs.target
rpcbind.target         loaded active active RPC Port Mapper
slices.target          loaded active active Slices
sockets.target         loaded active active Sockets
swap.target            loaded active active Swap
sysinit.target         loaded active active System Initialization
timers.target          loaded active active Timers
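
The comparison against this list can also be scripted. The following is a rough sketch, assuming the column layout shown above (UNIT, LOAD, ACTIVE, SUB); `target_is_active` is a hypothetical helper name, not a systemd command.

```shell
# target_is_active UNIT
# Hypothetical helper: reads `systemctl list-units --type target` output on
# stdin and succeeds only if UNIT is listed as "loaded" and "active".
target_is_active() {
    awk -v unit="$1" '
        $1 == unit && $2 == "loaded" && $3 == "active" { found = 1 }
        END { exit !found }
    '
}

# Usage on a live system (not executed here):
#   systemctl list-units --type target | target_is_active machines.target \
#       && echo "machines.target is active" \
#       || echo "machines.target is missing"
```

On a live system `systemctl is-active machines.target` answers the same question directly; the parsing variant is mainly useful when only a pasted listing is available.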

Hi Mark,

thanks for reply. Yes, you’re right machines.target is not loaded.
Default target is multi-user => o.k. Only target missing is machines.

How to get it back?

off the top of my head:

systemctl enable machines.target

EDIT:
reboot… or (again off the top of my head)
systemctl start machines.target

and check:
systemctl status machines.target
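
The three commands above can be chained so that each step runs only if the previous one succeeded. This is just a sketch: `ensure_target` is a hypothetical helper name, and the `SYSTEMCTL` variable is an assumption added here so that the sequence can be dry-run with `echo` instead of touching the system.

```shell
# ensure_target TARGET
# Hypothetical helper: enable, start, and then show the status of TARGET.
# SYSTEMCTL is an assumed override (default: systemctl) for dry runs.
ensure_target() {
    ctl="${SYSTEMCTL:-systemctl}"
    target="$1"
    "$ctl" enable "$target" &&
    "$ctl" start "$target" &&
    "$ctl" status "$target"
}

# Dry run, only prints the commands' arguments (not executed here):
#   SYSTEMCTL=echo ensure_target machines.target
# Real run, as root (not executed here):
#   ensure_target machines.target
```

Enabling is what makes the target come up at the next boot; starting it only affects the running system.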


Hey man, you’re great!
Thanks a lot and you have a really reliable “top of head”. :smile:

Have a nice weekend!
