Duplicity replacement --> rsync Time Machine-like backups

LOL.
We can meet in a reserved room at FOSDEM.
I put your rsync backup in production, I’m keeping an eye on it.
I would like to work on the restore.
Open issues (see the rsync sketch after the list):

  • one-filesystem option
  • compress
  • rsync over ssh (or sshfs)
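
For reference, a hedged sketch of how those three items map to stock rsync options (host and paths invented):

# plain rsync covering the three points: stay on one filesystem,
# compress in transit, and run over ssh
rsync -aH --one-file-system --compress -e ssh \
    /home root@backuphost:/mnt/backup/servername/latest/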

In that case: of course.

It’s been running for a month here. It started expiring backups:

rsync_tmbackup: Previous backup found - doing incremental backup from /mnt/backup/mattlabs/2018-01-21-230030
rsync_tmbackup: Creating destination /mnt/backup/mattlabs/2018-01-22-230040
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-125033
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-122849
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-121344
rsync_tmbackup: Starting backup... 

I like the beauty and simplicity of that script :slight_smile:

1 Like

Just a suggestion: expiration should be done after the current backup job has finished;
otherwise, if your retention policy is very “short”, you’d find yourself with no good backup.

my 2c

I feel you :slight_smile:

The script proceeds as described in the README:

The script automatically deletes old backups using the following logic:

  • Within the last 24 hours, all the backups are kept.
  • Within the last 31 days, the most recent backup of each day is kept.
  • After 31 days, only the most recent backup of each month is kept.

Additionally, if (and only if) the backup destination directory is full, the oldest backups are deleted until enough space is available.
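
To make the pruning concrete, here is a minimal bash sketch of that logic, assuming snapshot directories named YYYY-MM-DD-HHMMSS as in the log above (an illustration, not the actual rsync-time-backup code):

#!/bin/bash
# decide which snapshots to expire under $DEST (hypothetical path)
DEST=/mnt/backup/servername
now=$(date +%s)
declare -A newest                 # newest snapshot seen per day/month bucket
for dir in "$DEST"/????-??-??-??????; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")       # e.g. 2018-01-21-230030
    ts=$(date -d "${name:0:10} ${name:11:2}:${name:13:2}:${name:15:2}" +%s)
    age=$(( now - ts ))
    if   (( age < 24*3600 ));    then continue             # last 24 h: keep all
    elif (( age < 31*24*3600 )); then bucket=${name:0:10}  # keep one per day
    else                              bucket=${name:0:7}   # keep one per month
    fi
    # the glob sorts chronologically, so any snapshot already remembered
    # for this bucket is older and can be expired
    [ -n "${newest[$bucket]}" ] && echo "would expire ${newest[$bucket]}"
    newest[$bucket]=$dir
done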

Therefore I believe that we are on the safe side.

1 Like

I took note that the Retention policy setting is ignored.

Absolutely. I love the dumb approach: HDD not full? --> fill it. HDD full? --> delete the oldest backups until there is enough space. The user doesn’t have to worry and enjoys as many backups as the device can handle.
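
That fallback is simple enough to sketch, assuming GNU df and the same snapshot naming (again an illustration, not the script’s actual code):

DEST=/mnt/backup/servername
NEEDED_KB=$((5 * 1024 * 1024))        # hypothetical threshold: 5 GB free
# delete the oldest snapshots until enough space is available
while [ "$(df --output=avail "$DEST" | tail -n1)" -lt "$NEEDED_KB" ]; do
    oldest=$(ls -d "$DEST"/????-??-??-?????? 2>/dev/null | head -n1)
    [ -z "$oldest" ] && break         # nothing left to expire
    echo "expiring $oldest to free space"
    rm -rf -- "$oldest"
done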

Have a look at the Apple doc for Time Machine, I think we should mimic that: https://support.apple.com/en-us/HT201250

3 Likes

Fully agree.

Do you know how Time Machine performs a restore? I’ve read some user documentation, but I can’t quickly figure out how it works behind the scenes (I don’t know macOS).

We could scan the first level of the backup disk to quickly know the dates of all backups (find /mnt/backup/servername/ -maxdepth 1 -type d), but traversing the whole disk to find all files will be slow.
Does Time Machine ask the user to wait? Or does it keep a cache of the backups?
NethServer keeps a cache, using duc (the same tool it uses to measure disk usage).

Yes, I saw that, and I disabled it :wink: it took twice the time needed for the backup itself.

I would simply read and display the folders on the disk. The only real issue in doing so is searching for every copy (backup) of a given file. But in my experience, I’ve used that function maybe 10 times in 10 years.
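
That search can actually stay cheap, since unchanged copies are hard links sharing one inode. A hedged sketch (the file path is made up):

# list the distinct versions of one file across all snapshots,
# collapsing hard-linked (identical) copies by inode number
find /mnt/backup/servername/*/home/user/report.odt -printf '%i %p\n' 2>/dev/null |
    sort -u -k1,1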

OS X restores files or a full installation by simply copying the latest (or another) backup folder, then probably restores databases and does other tricks like we do.

ACLs and extended attributes are kept by making the use of an Apple filesystem mandatory, OR by creating a “sparsebundle” monolithic file containing an HFS (Apple) filesystem on a non-Apple filesystem. That’s a clever and controlled solution.

The indexing (for searching) is done by the general file indexer, Spotlight, which also indexes the contents on the fly.

Maybe we could use some Lucene indexer for such a use, but I believe it is really not a priority.

This is a nice idea, but will it put additional stress on the backup process? Also, you need to add a class to get the info and another one to read it.

Resource-wise, will this prove its utility if it is integrated into the backup?
Don’t get me wrong, I just ask as a way to find the best approach. :slight_smile:

I agree.

My general feeling is that we don’t really need to create any index of any sort regarding backups. Just browse the files & folders tree on the backup disk, done.

I believe the way Duplicity stored the files on the disk made the indexing step mandatory. But it is not needed here.

On the other hand, I guess we can use the log file (if you want to have the files parsed and indexed).
Rsync can output its activity or even generate the listing in a file in a specified format, if I’m not mistaken.

This way you have the index files generated from the log. Easy to parse and use.
Even tar (or most archivers) can generate a listing of the files (including the paths) if needed.
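
A hedged example of both ideas, using options that exist in stock rsync and tar (paths invented):

# have rsync write one parseable line per transferred file
rsync -aH --log-file=/var/log/rsync-index.log \
    --log-file-format='%o %i %n' \
    /home/ /mnt/backup/servername/latest/home/

# or let the archiver produce the listing
tar -tvf backup.tar > index.txt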

3 Likes

Brilliant!!

Hello boys, in the end I made it work. I have problems with the inodes; I have to allocate 800K files.

I made a script to mount the drives and execute the scripts (I’m not using pagaille’s script), and the tmbackup script works very well.

How do you configure the logs for the script?

Hi @hector!

Happy to see that more and more people find that script interesting.

Out of curiosity, why didn’t you use my script? It has the advantage of keeping the whole NethServer backup logic functional (module includes & excludes, database handling, and so on).

Regarding the logs, the doc is here: https://github.com/laurent22/rsync-time-backup

You have a --log-dir parameter to specify the log destination. Take care: you’ll be responsible for deleting the old logs (they are normally deleted automatically if the backup is successful).

If you need to further customise rsync’s parameters, you can use the --rsync-set-flags command-line parameter.
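
For example (a hedged sketch with invented paths; the upstream script also offers --rsync-append-flags, which adds to the default rsync flags instead of replacing them as --rsync-set-flags does):

rsync_tmbackup.sh --log-dir /var/log/rsync_tmbackup \
    --rsync-append-flags "--compress --one-file-system" \
    /home /mnt/backup/servername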

@hector, regarding your filesystem running out of inodes, it is not clear to me whether creating hard links (rsync-time-backup relies heavily on them) increases the number of inodes used.

Anyway, this page explains the problem and gives some leads in case you run out of inodes.

Did you format the drive following the doc, namely mke2fs -v -T largefile4 -j /dev/sdc1 -L backup? -T largefile4 allocates one inode for every 4 megabytes instead of the default of one inode per 16K of disk space. That makes sense when using Duplicity, since it creates archives; rsync doesn’t.

You may also format your drive as XFS; there is no predefined inode count with this filesystem.
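
A hedged sketch of the diagnosis and both fixes (the device name is an example). Note that a hard link to an existing file does not consume a new inode, but every directory recreated in each snapshot does:

df -i /mnt/backup                  # IUse% near 100% means inode exhaustion
mke2fs -v -j /dev/sdc1 -L backup   # default ratio: one inode per 16K
mkfs.xfs -f -L backup /dev/sdc1    # or XFS, which allocates inodes dynamically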

1 Like

I would love to format in XFS, but Ubuntu won’t recognize it (for my boss ext4 is a headache: locating info with bash is horrible, and he needs something familiar like Windows to check the consistency of the data). I won’t use your script because I use a spare disk that I have to rotate once a month (the disk is kept safe in a bank box), so I have to mount the drive manually: I declare the UUID of the disk in a bash script that does the mount and the copy.
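
Something like this hedged sketch, I suppose (UUID and paths are invented; read the real UUID from blkid):

#!/bin/bash
# mount this month's rotated disk by UUID, back up, unmount
UUID="0aaa1111-example-uuid"
mount UUID="$UUID" /mnt/backup || exit 1
rsync_tmbackup.sh /home /mnt/backup/servername
umount /mnt/backup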

The rsync-time-backup script allows me to keep versions for 30 days and doesn’t use much space; it also doesn’t recopy the same info to the same drive.

Thanks for the link and for clarifying the command. I’m learning more here than at SMEserver.

I’ve been using rsync for backup for about ten years; it’s worked nearly flawlessly for all of that time.

The only issue is inodes, lots and lots of inodes.

I’ve run out of them twice, I’ve learned that ‘df -i’ is my friend, and that XFS has more inodes than ext2.

One of the nice things about this form of backup is that you can mount the backup read/write but share it as read-only, allowing users to recover their own files; normal permissions persist, and they can only see what they should be able to in the first place.
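
One hedged way to get that split on Linux (paths are examples) is a read-only bind mount of the writable backup tree, which is then what gets shared to users:

mount /dev/sdc1 /mnt/backup               # rsync writes snapshots here
mount --bind /mnt/backup /srv/restore     # second view of the same tree
mount -o remount,ro,bind /srv/restore     # user-facing view is read-only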

1 Like

Thanks for this great script!

I’m having problems with files that have no write permissions. They get backed up, but when rsync_tmbackup tries to expire old files, it cannot remove them and gives a “Permission denied” error. Normally I would add something like “--chmod u+w” to an rsync command to avoid this problem, but how can I do that with the NethServer backup script?

Any help would be appreciated…

Hi @ppw_1104!

First of all: are you aware that the script has been included in the standard backup module in the latest NS versions? You must configure it using Cockpit.

Hi @pagaille
Thanks a lot for the response. No, I did not know that; I didn’t even know about Cockpit (sorry for my ignorance). I installed Cockpit, and it recognizes the rsync backup that I had already configured through the CLI (using db). In Cockpit, I don’t see anything that could change the rsync behaviour, though (e.g., adding “--chmod u+w”).
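
A hedged workaround while Cockpit exposes no such knob: call the upstream script directly and pass the permission fix through its own pass-through option (paths invented; --rsync-append-flags exists upstream, but whether the NethServer wrapper honours it is an assumption):

# grant the owner write permission on backed-up dirs and files
# so that later expiry can delete them
rsync_tmbackup.sh --rsync-append-flags "--chmod=Du+w,Fu+w" \
    /home /mnt/backup/servername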