Disaster recovery attempts keep failing

v7rc3

I’ve been messing with this for a few days without success, so finally I went back to the basics.

Created a VirtualBox VM, installed Samba DC and backup, set up some users. Yesterday I made sure the software was up to date and let the backup run last night; it completed successfully.
Created a second vm on a second host, joined to first as a member, added mail, blah, blah, member joined fine, gets users, etc.

Today
Trashed the Samba DC VM, installed from scratch with rc1, set hostname and local DNS, updated, rebooted, installed backup, pointed it to the backup location, and ran restore. It still failed; not sure why or what’s going on. Once it fails I can’t access the shell remotely with PuTTY and only have console access. Documenting here until I can mess with it more later.

Any suggestions?

Thank you @fasttech I’m happy to see you behind the wheel again! :wink:

@quality_team, did anyone succeed on this? We should try to reproduce the problem…

@davidep
So this time I tried a restore against the member.

I restored a functional vm snapshot of the above samba dc pdc and left it running over the weekend with a joined nethserver member.

This morning I installed backup on the member and fired off a backup. I then restored the member to a new rc1 install (rebooted, updated, rebooted, installed backup, restored)… This restore went significantly better in that it restored the machine with mail etc., but the restored machine fails to join the domain. I have included some logs and snapshots, including the correctly populated account provider page.

These are virtualbox vms each hosted on a different hardware machine.

pre restore dashboard

post restore dashboard

included for time stamps

2016-12-27 13:47:20 - START - Restore config started
2016-12-27 13:47:22 - STEP - pre-restore-config done
2016-12-27 13:47:22 - STEP - restore-config-execute done
2016-12-27 13:52:26 - ERROR - Event post-restore-config failed - 256

various log entries containing errors or failures

`
Dec 27 13:50:16 server7c2 esmith::event[4045]: expanding /var/lib/nethserver/sieve-scripts/before.sieve
Dec 27 13:50:16 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/actions/generic_template_expand SUCCESS [1.57084]
Dec 27 13:50:16 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/nethserver-mail-server-update/S20nethserver-mail-server-conf SUCCESS [0.139383]
Dec 27 13:50:17 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/nethserver-mail-server-update/S30nethserver-mail-postmap-update SUCCESS [0.388047]
Dec 27 13:50:18 server7c2 esmith::event[4045]: kinit: Cannot find KDC for realm “NETH.TEST.LOCAL” while getting initial credentials
Dec 27 13:50:21 server7c2 esmith::event[4045]: ads_connect: No logon servers
Dec 27 13:50:23 server7c2 esmith::event[4045]: ads_connect: No logon servers
Dec 27 13:50:23 server7c2 esmith::event[4045]: [ERROR] /usr/libexec/nethserver/smbads: failed to add service primaries to system keytab
Dec 27 13:50:23 server7c2 esmith::event[4045]: [ERROR] /usr/libexec/nethserver/smbads: failed to initialize keytabs
Dec 27 13:50:23 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/nethserver-mail-server-update/S50nethserver-sssd-initkeytabs FAILED: 5 [6.304973]
Dec 27 13:50:23 server7c2 systemd: Reloading.
Dec 27 13:50:23 server7c2 esmith::event[4045]: [INFO] service amavisd restart

Dec 27 13:50:27 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/actions/adjust-services SUCCESS [3.772836]
Dec 27 13:50:27 server7c2 esmith::event[4045]: [NOTICE] Initialize vmail Public IMAP namespace
Dec 27 13:50:27 server7c2 esmith::event[4045]: Action: /etc/e-smith/events/nethserver-mail-server-update/S98nethserver-mail-server-init-acl SUCCESS [0.321991]
Dec 27 13:50:27 server7c2 esmith::event[4045]: Event: nethserver-mail-server-update FAILED
Dec 27 13:50:27 server7c2 kernel: IPv4: martian source 192.168.124.210 from 192.168.124.196, on dev enp0s3

Dec 27 13:56:31 server7c2 kernel: IPv4: martian source 192.168.124.126 from 192.168.124.196, on dev enp0s3
Dec 27 13:56:31 server7c2 kernel: ll header: 00000000: ff ff ff ff ff ff 08 00 27 26 df 30 08 06 …’&.0…
Dec 27 13:56:34 server7c2 httpd: [ERROR] NethServer\Tool\GroupProvider: AccountProvider_Error_1
Dec 27 13:56:34 server7c2 httpd: [ERROR] Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012(1) 00002020: Operation unavailable without authentication
Dec 27 13:56:37 server7c2 admin-todos: Traceback (most recent call last):
Dec 27 13:56:37 server7c2 admin-todos: File “”, line 3, in
Dec 27 13:56:37 server7c2 admin-todos: KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’
Dec 27 13:56:37 server7c2 admin-todos: Traceback (most recent call last):
Dec 27 13:56:37 server7c2 admin-todos: File “”, line 3, in
Dec 27 13:56:37 server7c2 admin-todos: KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’
Dec 27 13:56:37 server7c2 admin-todos: (1) 00002020: Operation unavailable without authentication
Dec 27 13:58:44 server7c2 httpd: [ERROR] NethServer\Tool\UserProvider: AccountProvider_Error_1
Dec 27 13:58:44 server7c2 httpd: [ERROR] Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012(1) 00002020: Operation unavailable without authentication
Dec 27 13:58:50 server7c2 systemd: Starting Clean amavisd tmp folder…
Dec 27 13:58:50 server7c2 systemd: Starting Clean amavisd quarantine folder…
Dec 27 13:58:50 server7c2 systemd: Started Clean amavisd tmp folder.
Dec 27 13:58:50 server7c2 systemd: Started Clean amavisd quarantine folder.
Dec 27 13:58:56 server7c2 httpd: [ERROR] NethServer\Tool\GroupProvider: AccountProvider_Error_1
Dec 27 13:58:56 server7c2 httpd: [ERROR] Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012(1) 00002020: Operation unavailable without authentication
Dec 27 13:58:58 server7c2 admin-todos: Traceback (most recent call last):
Dec 27 13:58:58 server7c2 admin-todos: File “”, line 3, in
Dec 27 13:58:58 server7c2 admin-todos: KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’
Dec 27 13:58:58 server7c2 admin-todos: Traceback (most recent call last):
Dec 27 13:58:58 server7c2 admin-todos: File “”, line 3, in
Dec 27 13:58:58 server7c2 admin-todos: KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’
Dec 27 13:58:58 server7c2 admin-todos: (1) 00002020: Operation unavailable without authentication
Dec 27 14:01:01 server7c2 systemd: Created slice user-0.slice.
Dec 27 14:01:01 server7c2 systemd: Starting user-0.slice.
Dec 27 14:01:01 server7c2 systemd: Started Session 1 of user root.
Dec 27 14:01:01 server7c2 systemd: Starting Session 1 of user root.
Dec 27 14:03:40 server7c2 sshd[2171]: Accepted password for root from 192.168.124.126 port 65402 ssh2
Dec 27 14:03:40 server7c2 systemd-logind: New session 2 of user root.
Dec 27 14:03:40 server7c2 systemd: Started Session 2 of user root.
Dec 27 14:03:40 server7c2 systemd: Starting Session 2 of user root.
Dec 27 14:04:31 server7c2 clamd: SelfCheck: Database status OK.
Dec 27 14:08:51 server7c2 systemd: Starting Cleanup of Temporary Directories…
Dec 27 14:08:51 server7c2 systemd: Started Cleanup of Temporary Directories.
Dec 27 14:10:45 server7c2 httpd: [ERROR] NethServer\Tool\GroupProvider: AccountProvider_Error_1
Dec 27 14:10:45 server7c2 httpd: [ERROR] Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012(1) 00002020: Operation unavailable without authentication
Dec 27 14:14:31 server7c2 clamd: SelfCheck: Database status OK.
Dec 27 14:18:57 server7c2 httpd: [ERROR] NethServer\Tool\GroupProvider: AccountProvider_Error_1
Dec 27 14:18:57 server7c2 httpd: [ERROR] Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012Traceback (most recent call last):#012 File “”, line 3, in #012KeyError: ‘SECRETS/MACHINE_PASSWORD/SAMBA’#012(1) 00002020: Operation unavailable without authentication`
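
For anyone skimming a log like this, here is a minimal sketch that keeps only the esmith actions and events marked FAILED. The function name `failed_steps` is made up for illustration, the pattern is based only on the line format visible in the excerpt above, and the log path in the comment is an assumption.

```shell
# Read syslog lines on stdin and print only the esmith actions/events
# that ended in FAILED. The pattern follows the format seen in this thread:
#   ... esmith::event[PID]: Action: <path> FAILED: <code> [<seconds>]
#   ... esmith::event[PID]: Event: <name> FAILED
failed_steps() {
    grep -E 'esmith::event\[[0-9]+\]: (Action|Event): .* FAILED'
}

# Hypothetical usage (log path is an assumption, adjust to your system):
#   failed_steps < /var/log/messages
```

Run against the excerpt above, this should leave just the `S50nethserver-sssd-initkeytabs` action and the `nethserver-mail-server-update` event, which makes it easier to see which step broke first.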

Just for giggles…
[root@server7c2 ~]# samba-tool user enable administrator
-bash: samba-tool: command not found

So, I guess I’m the only one unable to do a restore.

I’ve reproduced the problem.
The root cause is that the nsdc machine is not initialized.
Dec 30 11:45:31 ns7-com.neth.net systemd-nspawn[2368]: Directory /var/lib/machines/nsdc doesn't look like an OS
I tried to fix it with signal-event nethserver-dc-save, nsdc runs, but name resolution still fails:

[root@ns7-com ~]# host -t SRV _ldap._tcp.`config get DomainName`
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

I fired up another vm, updated to rc3, rebooted, added nextcloud, created user, added files, rebooted, added ldap, created user, group, verified addition of ldap created user, group to nextcloud, rebooted, installed backup, ran backup via schedule.
Destroyed vm, created new, updated to rc3, rebooted, installed backup, restored config… success, ldap users and groups as before, nextcloud was installed.

But the Nextcloud data restore fails. Using restore data and selecting the Nextcloud directories and config, none of it comes back: not the Nextcloud users, not the files, not even the apps I installed (e.g. Nextcloud calendar).

To recap: unlike my previous posts, which were two different VMs with Samba DC installed, this test uses the LDAP provider with Nextcloud. The NS config restore appears successful, while the data restore of Nextcloud fails.

Added to the todo list: https://github.com/NethServer/dev/projects/2#card-1241014

Just opened an official issue, I’m working on it:

I just opened a couple of pull requests, maybe some fixes are needed.

But if you want to test it, you can use latest nethserver-dc and nethserver-backup-config packages from testing repository.

@giacomo @davidep what did I screw up? I can’t remember, do I have to add the repo? Trying to search for hints.

`[root@server7c3 ~]# yum --enablerepo=nethserver-testing install nethserver-dc-1.1.0-1.8.g673edfd.ns7.x86_64.rpm nethserver-backup-config-1.5.1-1.4.gbfc5b7f.ns7.noarch.rpm
Loaded plugins: changelog, fastestmirror, nethserver_events
Loading mirror speeds from cached hostfile`

I guess, now that we’re on github, we don’t “take ownership” of the issue if we’re testing? I don’t see any way to do so.

I must be blind, I don’t see how to ‘assign’ it to myself.

Try the yum command without the .rpm extension on the package names.
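
A small sketch of what that looks like, using the package names and the `nethserver-testing` repo id from the command posted above. As far as I understand, yum treats an argument ending in `.rpm` as a local file to install, so dropping the extension makes it look the package up in the enabled repositories instead. The helper name `testing_install_cmd` is made up for illustration, and it only prints the command so it can be checked before running it.

```shell
# Print a yum command that installs the given packages from the testing
# repo, dropping any trailing ".rpm" so yum resolves them from the
# repository instead of looking for local files.
testing_install_cmd() {
    cmd="yum --enablerepo=nethserver-testing install"
    for pkg in "$@"; do
        cmd="$cmd ${pkg%.rpm}"   # strip the .rpm suffix if present
    done
    printf '%s\n' "$cmd"
}

# Prints the command only; nothing is installed by this sketch.
testing_install_cmd \
    nethserver-dc-1.1.0-1.8.g673edfd.ns7.x86_64.rpm \
    nethserver-backup-config-1.5.1-1.4.gbfc5b7f.ns7.noarch.rpm
```

Plain package names (`nethserver-dc nethserver-backup-config`) should work as well if the exact build doesn't matter.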

dammit… thank you sir.

@giacomo my first attempt failed.

I… installed the two rpms on a current vm with samba dc, backup, nethserver some users/groups and files and *** php scl set to global 5.6 ***, rebooted, executed a backup.

Then started a fresh vm, updated to rc3… now the installation of the two rpms also installed their dependencies… this installed samba dc and backup config even before I installed backup… installed backup data, did not configure samba, configured backup, rebooted… fail after restore attempt, lost shell access, only console access.

@giacomo I suppose, I should not have installed the new nethserver dc rpm in the fresh install… or… hmmmmm… I cornfussssing myself now, how can I test this if restore is going to pull from the standard repo… hmmmmm…

The only way to test it on a clean machine is enabling the testing repo: you need to set enabled=1 inside the NethServer.repo file.

Your user account must be member of IssuesTeam on github. Now, you’ve been invited!

So, I don’t know where that is.
I’m guessing from my searching this might be it: /etc/yum.repos.d/NethServer.repo, idk.
I don’t know if I should set base and updates to 0.
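
On the repo file question, a minimal sketch of flipping `enabled` to 1 for a single section of a yum `.repo` file. It assumes the file is the `/etc/yum.repos.d/NethServer.repo` guessed above and that the section is named `[nethserver-testing]`, matching the repo id used with `--enablerepo` earlier in this thread; neither is confirmed here. The function name `enable_repo_section` is made up, and `sed -i` as used is GNU sed. Back the file up first.

```shell
# Set enabled=1 inside one [section] of a yum .repo file, leaving the
# other sections untouched.
#   $1 = path to the .repo file
#   $2 = section (repo id) to enable
enable_repo_section() {
    # The address range runs from the "[section]" header to the next line
    # starting with "[" (or end of file); only "enabled=" lines inside
    # that range are rewritten.
    sed -i "/^\[$2\]/,/^\[/ s/^enabled[[:space:]]*=.*/enabled=1/" "$1"
}

# Hypothetical usage (path and section name are assumptions):
#   cp /etc/yum.repos.d/NethServer.repo /root/NethServer.repo.bak
#   enable_repo_section /etc/yum.repos.d/NethServer.repo nethserver-testing
```

Because only the named section is edited, base and updates keep whatever setting they already have. As far as I know, yum simply picks the newest version it sees across all enabled repos, so enabling testing shouldn't by itself require setting the others to 0.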

@giacomo

I also wonder if I screwed up the backup by applying those rpms to the vm and executing a backup.