Hotsync doesn't work

flatspin · March 14, 2018, 2:22pm

NethServer Version: 7.4-1708
Module: hotsync 2.0.1

I wanted to test hotsync.

I installed hotsync according to http://docs.nethserver.org/en/v7/hotsync.html and did the configuration on 2 machines.

config show hotsync
hotsync=configuration
    MasterHost=
    SlaveHost=192.168.0.238
    SlavePort=273
    databases=enabled
    role=master
    status=enabled

But nothing happens. I see in cron.log every 15 minutes:
CROND[14345]: (root) CMD (/usr/sbin/hotsync > /dev/null )

The cron entry is in /etc/cron.d.

When I try to execute /usr/sbin/hotsync I get:

hotsync
hotsync error: rsync returned 10. rsync output:
rsync: failed to connect to 127.0.0.1 (127.0.0.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(122) [sender=3.0.9]

I commented out the last line in /usr/sbin/hotsync for debugging.

Same error messages as above every 15 minutes in /temp/hotsync.log

Did I do something wrong, or is there something missing in the docs?

Can you please help me, @Stll0?

Thanks in advance.

Ctek · March 14, 2018, 2:32pm

Did you try to specify the IP for the MasterHost in the config?
Is the slave reachable (ping, etc.)?

flatspin · March 14, 2018, 2:39pm

Hi Bogdan, thanks for answering.

Tried to set Master IP:

hotsync=configuration
    MasterHost=192.168.0.236
    SlaveHost=192.168.0.238
    SlavePort=273
    databases=enabled
    role=master
    status=enabled

The slave is reachable.

Same result.

Ctek · March 14, 2018, 2:48pm

Check that rsync is running on your master.
See whether the service is running.
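
A rough sketch of how that check could be done from a shell. `systemctl is-active rsyncd` reports the state of the daemon unit (the unit name `rsyncd` is the one mentioned in the follow-up post). The helper `port_open` is hypothetical and not part of hotsync; it only tests whether something accepts TCP connections on a given port. 873 is the standard rsync daemon port, while the config in this thread shows SlavePort=273, so adjust host and port to your setup.

```shell
#!/bin/bash
# Hypothetical helper, not part of hotsync: succeeds if a TCP connection
# to host:port can be opened within 3 seconds (uses bash's /dev/tcp).
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage (addresses taken from this thread, adjust as needed):
#   systemctl is-active rsyncd        # is the daemon unit running?
#   port_open 192.168.0.238 273 && echo "slave port reachable"
```

A refused or timed-out connection makes `port_open` return non-zero, which matches the "Connection refused (111)" symptom in the first post.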

flatspin · March 14, 2018, 3:02pm

OK, I started rsyncd and enabled it.

Now I get:
hotsync error: rsync returned 5. rsync output:
@ERROR: Unknown module 'hotsync'
rsync error: error starting client-server protocol (code 5) at main.c(1516) [sender=3.0.9]
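
The two failures so far differ in kind: code 10 means nothing answered on the socket, while code 5 means a daemon answered but rejected the request. As a small sketch, the function below (a hypothetical helper, not part of hotsync) maps a subset of rsync exit codes to the descriptions from the EXIT VALUES section of the rsync man page.

```shell
#!/bin/bash
# Translate an rsync exit status into the description given in the
# "EXIT VALUES" section of the rsync man page (only a subset is listed).
rsync_exit_meaning() {
    case "$1" in
        0)  echo "Success" ;;
        1)  echo "Syntax or usage error" ;;
        2)  echo "Protocol incompatibility" ;;
        5)  echo "Error starting client-server protocol" ;;
        10) echo "Error in socket I/O" ;;
        12) echo "Error in rsync protocol data stream" ;;
        23) echo "Partial transfer due to error" ;;
        30) echo "Timeout in data send/receive" ;;
        35) echo "Timeout waiting for daemon connection" ;;
        *)  echo "Unlisted exit code, see man rsync" ;;
    esac
}

rsync_exit_meaning 10   # the first error in this thread
rsync_exit_meaning 5    # the error after rsyncd was started
```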

Ctek · March 14, 2018, 3:19pm

And hotsync is installed on both servers, I presume?

What do you get if you run this:
rsync --list-only rsync://192.168.0.238
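
The listing printed by that command can also be checked automatically. The sketch below assumes the usual daemon listing format, one module per line with the module name as the first word; `has_module` is a hypothetical helper, not part of hotsync. If the daemon listens on a non-default port, append it to the URL (rsync://host:port). A message-of-the-day line that happens to start with the module name would be a false positive.

```shell
#!/bin/bash
# Hypothetical helper: reads an rsync daemon module listing on stdin
# (one module per line, name first, optional comment after whitespace)
# and succeeds if the named module is present.
has_module() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Example usage against the slave from this thread:
#   rsync --list-only rsync://192.168.0.238 | has_module hotsync \
#       && echo "module exported" || echo "module missing"
```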

Stll0 · March 14, 2018, 3:51pm

Sorry @flatspin, we published a new release without properly updating the documentation.
Please check the last instruction here.
In particular, you should use
signal-event nethserver-hotsync-save
instead of
signal-event nethserver-hotsync-update

Stll0 · March 14, 2018, 3:56pm

It is already fixed here: http://docs.nethserver.org/en/latest/hotsync.html

flatspin · March 14, 2018, 4:12pm

nethserver-hotsync-save helped. No more error.

Thanks a lot.
On my test installation the backup module is not enabled, because IMO it would be useless with hotsync. Right?
But rsync isn’t enabled without backup? So hotsync should check whether rsync is up or not. Or at least the docs should mention that it needs to be checked.

On the slave there are no modules installed. I read here:

That means in at least one hour, the slave should be a clone of the master, without installing a module by myself. Correct?

Thanks also to @ctek for helping.

Stll0 · March 14, 2018, 5:11pm

If you are in a testing environment, it’s OK not to have a backup, but in production hotsync isn’t a substitute for backup. For instance, if you remove a file, it will also be removed on the slave…

Almost. All RPMs are installed on the slave to speed up the restore, but events aren’t executed. That means configuration templates are not expanded into configuration files, services are stopped and firewall ports are closed. The slave becomes a clone when you restore it.

flatspin · March 14, 2018, 5:31pm

Good point, I didn’t think about that.

Thanks for the explanation. Now I got it!

Tomorrow I’ll test the restore. It’s a test environment in Proxmox, so everything is fine. I just wanted to test.

EDIT: looks funny (from master)

Stll0 · March 15, 2018, 7:51am

Let us know how it works!

flatspin · March 15, 2018, 3:31pm

Hi @Stll0, I want to report my experiences with the hotsync restore.

I shut down server1, ran hotsync-promote from the console and watched the output.
Two errors:

The php71 error comes from OnlyOffice, I think.
The mail server error? I don’t know why.

After that I ran signal-event post-restore-data and waited until it finished.

Login works on the 2nd machine.
Sogo login works, calendar with events is OK, mail doesn’t work.
Nextcloud: page is not reachable => could be caused by OnlyOffice / nginx, which is not officially supported.
All installed modules are present.
The FQDN was changed, but the login shows the old name.
AD with users and groups is present and working.

All in all it seems to work.

PS: during hotsync-promote, fail2ban occupied 100% CPU (1 core of 4) for a long time.

But I don’t think that this is reasonable in any question.

I will try again with a machine without OnlyOffice, with only officially supported modules and a cleanly installed slave.

Hope this helps.

Stll0 · March 15, 2018, 4:02pm

nethserver-rh-php71-php-fpm wasn’t installed because it is in the nethserver-testing repository, which is disabled. The hotsync-slave command, which installs packages on the slave, should have emailed you this error. You can fix it on the slave by running
yum install --noplugins --enablerepo=nethserver-testing nethserver-rh-php71-php-fpm
--noplugins ensures that no NethServer events are executed (but on a slave that is the default).

About mail, could you please give me more logs about the error? You should see something more in /var/log/messages

filippo_carletti · March 15, 2018, 4:25pm

fail2ban parses logfiles. You could have used the strace command to see what it was doing. I suspect it’s “normal” fail2ban behaviour.
@stephdl what do you think?

stephdl · March 15, 2018, 4:30pm

Anything in the logs, @flatspin?

tailf /var/log/fail2ban.log

flatspin · March 15, 2018, 4:33pm

Sorry, I wanted to start a second test with a clean machine and have already destroyed this one.

It was not a really clean machine. I did a rollback in Proxmox, so this could have caused something.
I don’t want you to rack your brains over something that was caused by my laziness.

I’ll give feedback after the second try. This time with a really clean installation.

alefattorini · March 16, 2018, 3:08pm

You’re becoming a HotSync expert, thanks for your tests!

flatspin · March 17, 2018, 12:18pm

So I did it again…
Completely fresh installation from the 7.4.1708 ISO + updates, with hotsync as slave.
I named the server “slave.jeckel.lan”.
Set up an NS7 AD “all-in-one box” with 1x red + 1x green, Proxy, FW, IPS, ufdb, OpenVPN, Samba, NC, Sogo, ntop, bandwidth. Installed hotsync and enabled rsync on both machines. Tested hotsync with the manual command: OK. Waited about 1.5 hours, checked the logs and saw hotsync-cron running.
Everything’s fine so far.
Sent a mail from admin to a user and back (Sogo), created an event in Sogo. Created a file in NC as a user. Created a Samba share and put a file in it. Created a whitelist and a blacklist domain in the web filter.
Created my own self-signed server certificate on the command line, with rootCA and serverCA, because of the HSTS problem with newer Opera, Chrome and Firefox versions, and confirmed that it worked.

Then switched off the master and ran hotsync-promote on the slave.
As if by magic, the machine started to install every module and set itself up as the new master.

Now checked the interfaces: configuration of green and red o.k.
Then signal-event post-restore-data

The share is there with its file, NC login works, the file is present, white and black lists are present, IPS rules are correct, FW rules are correct, ntop works. SSH port is set to 2222. VPN settings are correct.
So far so good. It seems that most things are working after my “virtual disaster”.

But a few little things are missing:
Sogo login works, event is synced, but no mail account is set.
But mails are present in vmail folder.
When I go to graphs it shows the old FQDN with unknown host:
graphs-slave
In Domain accounts the part under “join is o.k.” is missing:

The accounts provider page is correct. Users and groups are present.

If I find anything else I will report it.

Nevertheless, this is great work! And it’s still in beta, so clearly not everything works perfectly.

So long my friends.

flatspin · March 19, 2018, 4:04pm

Solved the Sogo problem. The owner of the mail folders was suricata after hotsync-promote. I changed the owner of the mail folders to vmail (chown -R vmail:vmail vmail) and then Sogo worked.

After reboot CGP works also.

But the nsdc service doesn’t start at boot. Enabling nsdc doesn’t help either, but a manual start works.
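
For the mail folder ownership problem described above, it can be useful to see which entries have the wrong owner before running a recursive chown. A minimal sketch, assuming it is run from the directory that contains the vmail folder (the full path is not given in this thread); `list_wrong_owner` is a hypothetical helper, not part of hotsync or NethServer.

```shell
#!/bin/bash
# Hypothetical helper: print every file or directory under DIR that is
# NOT owned by USER, so the damage can be reviewed before running chown.
list_wrong_owner() {
    local dir=$1 user=$2
    find "$dir" ! -user "$user" -print
}

# Example usage (run from the directory that contains the vmail folder,
# as in the chown command above):
#   list_wrong_owner vmail vmail | head
#   chown -R vmail:vmail vmail      # the fix reported in this thread
```

An empty result after the chown indicates that every entry under the folder now belongs to the expected user.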