Duplicity replacement --> rsync Time Machine-like backups

Tested rsync time machine backup to a CIFS share on a Windows Server 2016 and it works so far. It saves the data uncompressed and unencrypted, which, aside from the known disadvantages, makes restoring a file really fast. A problem may be that SMB shares don’t support symbolic links:

ln: failed to create symbolic link ‘//mnt/backup/testserver/latest’: Operation not supported

At first I found an error in /var/log/last-backup.log, but it told me what to do:

rsync_tmbackup: Safety check failed - the destination does not appear to be a backup folder or drive (marker file not found).
rsync_tmbackup: If it is indeed a backup folder, you may add the marker file by running the following command:
rsync_tmbackup:
rsync_tmbackup: mkdir -p -- "//mnt/backup/testserver" ; touch "//mnt/backup/testserver/backup.marker"

Tested without much data; duplicity was faster in this case, but I still have to run some real tests…

Test rsync time machine backup:

Duration: 1:36

Number of files: 3173
Number of files transferred: 12
Total file size: 298.97M bytes
Total transferred file size: 702.77K bytes
Literal data: 703.86K bytes
Matched data: 0 bytes
File list size: 196.27K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 389.80K
Total bytes received: 9.77K

sent 389.80K bytes  received 9.77K bytes  4.37K bytes/sec
total size is 298.97M  speedup is 748.22
rsync_tmbackup: Backup completed without errors.

Test duplicity:

Duration: 0:27

--------------[ Backup Statistics ]--------------
StartTime 1515485985.63 (Tue Jan  9 09:19:45 2018)
EndTime 1515485999.42 (Tue Jan  9 09:19:59 2018)
ElapsedTime 13.78 (13.78 seconds)
SourceFiles 3173
SourceFileSize 299244149 (285 MB)
NewFiles 3173
NewFileSize 299244149 (285 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 3173
RawDeltaSize 298967752 (285 MB)
TotalDestinationSizeChange 154899140 (148 MB)
Errors 0
-------------------------------------------------

These are the parameters that I use with rsync when I copy to an SMB share:

--modify-window=1 --no-links --iconv=ISO-8859-1,utf-8

These parameters should prevent some issues:

--modify-window=1 : resolves issues caused by FAT32 saving file timestamps with a 2-second resolution (I always use it if the target is an SMB share, even if the partition is NTFS)
--no-links : doesn’t copy/follow symbolic links
--iconv=ISO-8859-1,utf-8 : translates the character encoding of filenames. This prevents errors if a Linux-encoded filename contains “?” or strange characters. It works well as long as the filenames use the English alphabet; if the characters are from other languages, there are some issues restoring files (usually they have to be renamed manually)
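
Put together, a typical invocation might look like this (just a sketch; the source and destination paths are placeholders for your own):

# archive mode plus the SMB-friendly options described above
rsync -a --modify-window=1 --no-links --iconv=ISO-8859-1,utf-8 /home/data/ /mnt/backup/testserver/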


Don’t forget to take into account that rsync-time-backup needs only ONE full backup, which dramatically reduces the time the computer spends on backups.

The main reason why I switched to rsync is that every week our server was on its knees for 36 hours to back up 2 or 3 TB to an external USB disk.

For such cases the time machine backup is really a good choice. If we keep it uncompressed and unencrypted, we won’t need a GUI for restore, as restoring is just copying the files back to where they are missing.

Beware: rsync-time-backup relies heavily on hard links. While recent CIFS implementations seem to support them pretty well out of the box, older Windows/SMB servers seem to need “Linux extensions” to achieve this.
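
For reference, the hard links come from rsync’s --link-dest option: files that haven’t changed are hard-linked against the previous snapshot instead of being copied again, roughly like this (paths and the snapshot name are placeholders):

# unchanged files become hard links into the previous snapshot
rsync -a --link-dest=/mnt/backup/testserver/latest /home/data/ /mnt/backup/testserver/2018-01-10-120000/

Every snapshot then looks like a complete copy, but unchanged files cost no extra space, which is why only one real full backup is ever needed.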

Not entirely true. The NethServer restore process does much more than copy files: it restores ACLs, restores the databases of missing modules, takes care of everything.

That’s a case I didn’t think of at first. This is probably an rsync drawback, since not every filesystem supports what a Linux partition supports, as @saitobenkei pointed out.

Apple resolved that by making the HFS+ filesystem mandatory for local disks, or by creating a ‘sparsebundle’ (more or less a growing single-file image of a filesystem) on remote locations.

The closest equivalent in the Linux world seems to be a FUSE filesystem with encfs or eCryptfs, which could at the same time resolve the missing-encryption issue.

What do you think?

They should be in the config backup, I think.

OK, there’s some extra work to be done to make it really simple.

I think it’s enough if newer shares work.

I have worries about speed, but we may just try it…

The reason why I fell in love with NethServer is that it takes care of everything :smiling_face: Think about my use case: I’ve got a full Nextcloud, SOGo, an SMB file server, an AD controller, 4 physical networks and 1 VLAN, and I might add a FreePBX on top of that :smiley: Restoring all this manually is just not an option for me (it would take me at least a full day, since I’m not normally a sysadmin).

I didn’t look deeper into restore, but couldn’t we just reuse much of the backup-data restore code for the time machine restore?

Yes, absolutely! I’m actually working on it (slowly). I’ll try eCryptfs as well.


I copied restore-data-duplicity to restore-data-rsync_backup and changed it around line 112:

Use your backup dir here; a “latest” dir would be nice:

$ret = system("rsync -aP /mnt/backup/testserver/2018-01-09-213330 / > $logFile");
# $ret = system("$cmd &>$logFile");

Now you can run restore-data and it works; it perfectly restored my WebTop PostgreSQL DB.
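
Since the SMB share refuses the “latest” symlink, a hypothetical workaround is to resolve the newest snapshot by name from the shell (the date-stamped snapshot directories sort chronologically) and hard-code that instead:

# pick the newest snapshot directory by lexicographic (= chronological) order
LATEST=$(ls -1d /mnt/backup/testserver/20??-??-??-?????? | sort | tail -n 1)
rsync -aP "$LATEST"/ /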


Just did a backup of an encfs-encrypted view of a 2.5 GB documents folder, and the time needed was pretty much the same as backing up the non-encrypted files (Core i5).

Copying the encrypted files was actually a bit faster, but this is certainly due to some caching effect and also the fact that my VM runs on a Fusion Drive (SSD + HDD with dynamic allocation of the most used files).

encfs was remarkably easy to set up; it took me 10 minutes to understand and configure it. However, I was immediately confronted with the filename length limitation that appears when filenames are encrypted. There is currently no direct fix for this, only some workarounds, including encrypting only file contents, not filenames. eCryptfs suffers from the same limitation.
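
For anyone who wants to reproduce this, the setup was roughly the following (a sketch; the paths are mine, adapt them). encfs in --reverse mode presents an encrypted view of existing plaintext data, which rsync can then back up like any other folder:

# present an encrypted view of the plaintext documents folder
encfs --reverse /home/documents /mnt/encrypted-view

# back up the encrypted view as usual
rsync -a /mnt/encrypted-view/ /mnt/backup/testserver-encrypted/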


Great!! I was sure it was easy.

Yeah, good to know that it’s more or less easily possible to change the backup/restore procedure and use another backup approach than duplicity if needed.

I don’t know if this really is a disadvantage.

Hey guys, I have the same problem here: 22 hours copying the same data to the same hard drive is not a backup (the hard drive can be damaged by such a big amount of data).

Is it possible to disable the full backup?

Thanks @pagaille for sharing this knowledge; I’m going to try out your script this week.


Thanks @hector. I believe that due to the way duplicity makes backups, it is necessary to recreate a full backup on a regular basis in order to ensure the integrity of the data. This is the main reason why I don’t like duplicity: its way of storing files.

Without having done any research, just a question for my own clarification:
The full backup in duplicity is there to get a base; after that, incremental (delta) backups are done. What happens with the rsync ‘Time Machine style’ method when files change? Will they get fully backed up as soon as they are changed, so that a previous backup of a file is not necessary in order to restore it?

You know, this is a use case that cries out for ZFS snapshots and replication…
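
For the record, the ZFS way would look roughly like this (a sketch; the pool/dataset names and the backup host are made up):

# snapshots are instantaneous and space-efficient
zfs snapshot tank/data@2018-01-09

# replicate the snapshot to another machine
zfs send tank/data@2018-01-09 | ssh backuphost zfs receive backup/data

# later runs only send the delta between two snapshots
zfs send -i tank/data@2018-01-09 tank/data@2018-01-10 | ssh backuphost zfs receive backup/data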