NethServer Hotsync [Needs Testing]

good news!!!

The problem isn’t with open files, it’s about a big binary file that is written while it is being copied. That’s why MySQL is dumped and the dumps are synced.
BTW, you get an email when a file changes while it is being copied (other kinds of alerts are coming soon), so you’ll know if it doesn’t fit your scenario.

That’s it. In your scenario, you could use it by dumping the access db and copying the dump. Or, as @planet_jeroen suggested, by setting up real HA.
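The dump-then-copy idea could be sketched roughly as follows. This is only an illustration, not part of hotsync: the `dump_to_file` helper is hypothetical, and the dump command is whatever fits your database. The dump is written to a temporary file and renamed once it is complete, so a sync that runs in the meantime never finds a half-written file under the final name.

```shell
#!/bin/sh
# Hypothetical helper (not part of hotsync): run a dump command, write its
# output to a temporary file, and rename it into place only on success.
dump_to_file() {
    target="$1"; shift
    tmp="${target}.tmp.$$"
    # "$@" is the dump command and its arguments; its stdout is the dump.
    if "$@" > "$tmp"; then
        # A rename within one directory replaces the old dump in a single step.
        mv -f "$tmp" "$target"
    else
        # Dump failed: discard the partial file and keep the previous dump.
        rm -f "$tmp"
        return 1
    fi
}
```

For MySQL this might be called as `dump_to_file /path/to/dumps/mysql.sql mysqldump --all-databases --single-transaction`, where the path is a placeholder and credentials are left out.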

This sounds great. I hope to begin testing today.
For my situation, I don’t need 100% high availability. What I need is up-to-date disaster recovery, and I think this is going to accomplish that.

So, I have been testing for a couple of days now, and first impressions are very good.
A couple of things I have noticed so far that I would have to find a solution for:

  1. It is not a true sync, at least for me, as deleted files are not propagated. I am not sure whether this is just at my end: if I run ‘hotsync --dry-run’, I see the error ‘IO error encountered -- skipping file deletion’, but I am not sure whether that is because of --dry-run.
  2. Not deleting files, as above, obviously leaves duplicates almost everywhere: a file that is moved ends up in two places and a file that is deleted stays behind, in file shares, vmail, nextcloud, etc.
  3. There are a couple of issues with vmail, but they may be caused by the files not being deleted: I am getting multiple copies of the same email on the slave if the email is unread. I also don’t see the updated read/unread status of emails unless I reboot.
  4. In nextcloud, I don’t see the updated files unless I run ‘occ files:scan’, which is normal. I am just wondering whether the files:scan should be part of the sync, or whether it makes sense to take care of that in cron. It might be better if it is built in, as I don’t know what would happen if the files:scan was in the middle of a run when a hotsync happened. But on large file systems, the scan could take a long time. As you would want to run the scan when promoting the slave to master to make sure the files are all up to date, maybe it is best taken care of there; it just may increase the time to promote on large file systems.
  5. NethServer settings are not propagated, like file shares and the pop3 connector. This may be by design and may be done during the promotion to master; I just want to make sure it is, as I have not tested that yet. It is best if they are handled there: for the pop3 connector, you couldn’t have the slave pulling email off an external server, so you would otherwise have to do something like block internet access on the slave.
  6. Webtop calendar/contacts/tasks are not syncing. Again, this may take care of itself during the promotion, but for testing, I am accessing the machine and do not see this data synced. Maybe there are manual commands I can run to accomplish this.

Looks very, very promising, just thought I would discuss these possible issues before I test the promotion of slave to master.
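On point 4, the worry about ‘occ files:scan’ overlapping with a hotsync run: one possible approach, sketched here under assumptions, is to start both cron jobs through the same simple lock so that whichever comes second is skipped. The `run_exclusive` helper and the lock path are hypothetical. Hotsync is not known to take any such lock itself, so this only helps if both cron entries are wrapped.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if no other wrapped job holds the lock.
# mkdir is used as the lock because creating a directory either succeeds or
# fails in one step. A job that is killed leaves the directory behind, and it
# then has to be removed by hand.
run_exclusive() {
    lockdir="$1"; shift
    if mkdir "$lockdir" 2>/dev/null; then
        status=0
        "$@" || status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "busy: $lockdir exists, skipping" >&2
    return 75
}
```

A cron entry might then call `run_exclusive /var/run/hotsync-scan.lock occ files:scan --all`, assuming `occ` is on the PATH of the cron user, with the hotsync entry wrapped in the same way.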

@Stll0 will be happy to help with this :slight_smile:

That’s rsync’s default behaviour: if it encounters an IO error, it skips deletion to avoid deleting everything on the receiver :grin:

Try to find out why you got the IO error.

How do I go about finding that out? Are you writing a log file for the rsync? I see no errors in messages.
Also, I can successfully manually run ‘rsync --delete’ to the ‘clone’ and it works with no errors. I feel it is maybe a permission error on some file somewhere, but how do I find out?

If hotsync is executed by cron, you should receive errors by email.

You can also look at /usr/sbin/hotsync and comment out rm -f /tmp/hotsync.log to keep the log.

YES. This is actually going to be amazing, trying it this morning :heart_eyes: totally something appropriate for a production server!!

I have never received anything by email about hotsync errors.
And in the hotsync log, I get this every time:

rsync: link_stat "/etc/yum/vars/serverid" failed: No such file or directory (2)
rsync: link_stat "/etc/my.pwd" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

How do I go about debugging this?
I can successfully manually run ‘rsync --delete’ to the ‘clone’ and it works with no errors.
I can’t go on testing this; it is non-functional if the --delete does not work.

Edit /usr/sbin/hotsync and comment out the last line:

- rm -f /tmp/hotsync.log
+ #rm -f /tmp/hotsync.log

Now, after the sync, you should see all errors logged in /tmp/hotsync.log.

Let me know.

Hello there, we have developed an improvement to hotsync.
The network communication has changed: it no longer uses ssh, but rsyncd+stunnel.

The slave runs the rsyncd daemon, and both slave and master run the stunnel daemon for security.
Every n minutes the master synchronizes files by copying them to the rsyncd daemon on the slave.
All traffic is wrapped by stunnel, which adds encryption.
Here you can find the slave and master configuration.

The rest does not change: the “hotsync” command is executed every n minutes.
The old “hotsync-setup” script has been removed: it was used to exchange ssh keys.

Here the rpm: http://packages.nethserver.org/nethserver/7.4.1708/autobuild/x86_64/Packages/nethserver-hotsync-1.0.0-1.21.pr2.gb406d98.ns7.noarch.rpm

Hey @wbilger, they are working for you :slight_smile:

I have never received anything by email about hotsync errors.
And in the hotsync log, I get this every time; it never changes:

rsync: link_stat "/etc/yum/vars/serverid" failed: No such file or directory (2)
rsync: link_stat "/etc/my.pwd" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

You don’t receive emails because missing-file errors are deliberately ignored, so that’s expected.
What about the --delete error?
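Since the two link_stat "No such file or directory" lines are expected, it can help to filter them out of the kept log and look at what remains. A minimal sketch, assuming the log has been kept at /tmp/hotsync.log as described earlier in the thread; the `other_rsync_errors` helper is hypothetical and only matches the message formats quoted here.

```shell
#!/bin/sh
# Hypothetical helper: print rsync error lines from a log file, leaving out
# missing-file link_stat failures and the generic "code 23" summary line.
other_rsync_errors() {
    grep -E '^rsync( error)?:' "$1" \
        | grep -v 'link_stat .* failed: No such file or directory (2)' \
        | grep -v '^rsync error: some files/attrs were not transferred'
}
```

Running `other_rsync_errors /tmp/hotsync.log` after a sync prints nothing if the missing files were the only problem; any line it does print is a candidate for the IO error that makes rsync skip deletion.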

I don’t receive that error by email, so I am not sure it even happens in a real sync, but I assume it does, and that it is the reason the --delete doesn’t work.
If I run ‘hotsync --dryrun’, then I get the error ‘IO error encountered -- skipping file deletion’.

Hey, I really like this module. I will be happy to test it, but I want to confirm: is this safe enough to test in production? (By safe I mean whether there is a risk of breaking something else; I understand the functionality itself may be incomplete for now.)

Should be safe. I don’t know about the new update (the connection security improvement), but just syncing in one direction should do no harm to your production server. Don’t use a production server as the slave!

https://community.nethserver.org/t/nethserver-hotsync-a-new-ns-module/8391/14

Thanks! I will try with one of the secondary servers.
