Duplicity replacement --> rsync Time Machine-like backups

Great!! I was sure it was easy.

Yeah, good to know it’s fairly easy to change the backup/restore procedure and use a backup approach other than duplicity if needed.

I don’t know if this really is a disadvantage

Hey guys, I have the same problem here. 22 hours copying the same data to the same hard drive is not a backup (the hard drive can be damaged by such a big amount of data).

Is it possible to disable the full backup?

Thanks pagaille for sharing this knowledge, I’m going to try your script this week.


Thanks @hector. I believe that due to the way duplicity makes backups, it is necessary to recreate a full backup on a regular basis in order to ensure the integrity of the data. This is the main reason why I don’t like duplicity: its way of storing files.

Without having done any research, just a question for my clarification:
The full backup in duplicity is to get a base. After that, incremental (delta) backups are done. What happens with the rsync ‘Time Machine style’ method with changed files? Will they get fully backed up as soon as they are changed, so that a previous backup of that file is not necessary in order to restore it?

You know, this is a use case that cries out for ZFS snapshots and replication…

Right, but there will be a new full backup done each week.

From the doc: each backup is in its own folder named after the current timestamp. Files that haven’t changed from one backup to the next are hard-linked to the previous backup, so they take very little extra space.

Therefore, changed files are simply copied into the current backup folder.
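For illustration, here is a minimal sketch of that approach with plain rsync, assuming backup folders named YYYY-MM-DD-HHMMSS (as in the log output further down) and at least one previous backup already on the disk; it is not the actual rsync-time-backup code, only the idea behind it:

    # sketch only -- not the real rsync_tmbackup script
    SRC=/
    DEST=/mnt/backup/mattlabs                     # example destination
    NEW="$DEST/$(date +%Y-%m-%d-%H%M%S)"          # new folder named after the timestamp
    PREV=$(ls -1d "$DEST"/????-??-??-?????? | sort | tail -1)

    # --link-dest turns unchanged files into hard links to the previous backup;
    # changed files are copied in full into $NEW
    rsync -a --link-dest="$PREV" "$SRC" "$NEW"

Restoring a given point in time is then just a plain copy out of the corresponding folder.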

NB: older versions are kept (versioning)! The script automatically deletes old backups using the following logic:

  • Within the last 24 hours, all the backups are kept.
  • Within the last 31 days, the most recent backup of each day is kept.
  • After 31 days, only the most recent backup of each month is kept.

Additionally, if the backup destination directory is full, the oldest backups are deleted until enough space is available.
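To make the pruning rules concrete, here is a sketch of how they can be applied to folders named YYYY-MM-DD-HHMMSS (an assumption based on the log output below; the real script may implement this differently):

    # walk the backups newest first and decide what to keep (sketch only)
    declare -A day_kept month_kept
    now=$(date +%s)
    for dir in $(ls -1d /mnt/backup/mattlabs/????-??-??-?????? | sort -r); do
        name=$(basename "$dir")
        day=${name:0:10}                 # YYYY-MM-DD
        month=${name:0:7}                # YYYY-MM
        ts=$(date -d "$day ${name:11:2}:${name:13:2}:${name:15:2}" +%s)
        age=$(( now - ts ))
        if (( age < 86400 )); then
            continue                                          # younger than 24 h: keep
        elif (( age < 31 * 86400 )) && [ -z "${day_kept[$day]}" ]; then
            day_kept[$day]=1; continue                        # newest backup of that day
        elif (( age >= 31 * 86400 )) && [ -z "${month_kept[$month]}" ]; then
            month_kept[$month]=1; continue                    # newest backup of that month
        fi
        echo "would expire $dir"                              # the real script deletes here
    done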

Is it clearer?


I’m sure you’re right, but I know ZFS requires a lot of RAM… That’s not an all-round solution. My old 4 GB RAM server would not survive it :slight_smile:

The RAM isn’t as big a deal as the hacking that would be needed to get Neth installed on ZFS in a way that’s stable, repeatable, and would survive upgrades without any extra work. Other Linux flavors make this a bit easier, but not CentOS. I do like the idea (and, again, this use case would be perfect for snapshots and replication, though it would require another ZFS box to send them to), but it would take a lot more work before I think it’d be safe to use for production.

@pagaille I opened a pull request for you to create the marker automatically. I also cleaned up some unused variables.

While I really appreciate your efforts, I don’t think that this is ready for inclusion in the core backup-data of NethServer.
Don’t get me wrong, I quickly reviewed the script and I think that we should go ahead, but I need to involve more people (@giacomo, @Stll0).

The first thing we must check is the inclusion/exclusion syntax of duplicity vs rsync.
I fear that we may leave out some files in the backup and include too much when there are exclusions.
We may adopt a syntax for the include/exclude files in /etc/backup-data.d/ and convert them appropriately to the format needed by the tool we use.
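As an illustration of the concern (a sketch; the file names and the one-path-per-line format under /etc/backup-data.d/ are assumptions, not a verified description of the current layout), rsync could read such lists directly:

    # feed existing include/exclude lists to rsync (example paths)
    rsync -a \
          --files-from=/etc/backup-data.d/custom.include \
          --exclude-from=/etc/backup-data.d/custom.exclude \
          / /mnt/backup/mattlabs/2018-01-22-230040

Note that with --files-from the listed paths are read relative to the source directory, which is close to, but not identical to, duplicity’s globbing semantics; that is exactly the kind of difference that needs to be checked.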

Then we must work on the restore interface.
We may need a function (a button on the interface) to format the usb local disk as needed.
I see that rsync-time-backup has some checks for the destination, but I don’t know if those are enough.
I didn’t test backup over ssh, I think it would be a good option.

@pagaille would you like to coordinate our efforts?


Thanks so much for your interest @filippo_carletti! My first impression was that you weren’t convinced of the advantages of this script :slight_smile:

Sure. As stated above, this is a work in progress that needs a UI and a lot of testing.

It has been running here for some weeks now, however, and so far without any problem.

I already took care of this. I adapted the include and exclude file handling in the script so that rsync can use them. As far as I can tell, it works. The file format doesn’t have to be modified.

Yep. I’ve got an idea, I’ll try to write a draft.

Yes. It is currently missing.

Since nethserver-backup takes care of the mounting, I see no reason why something could go wrong.

Me too! It is a nice and easy alternative to WebDAV!

Sure! How should I begin?


We need a little time to test it on a couple of servers, then we can continue with the work!

You may go deeper into the matter at FOSDEM! What do you think? Looks like a great topic to dive into.

In public???! Naaah, I’m the man in the shadows :slight_smile:


LOL.
We can meet in a reserved room at FOSDEM.
I put your rsync backup in production, I’m keeping an eye on it.
I would like to work on the restore.
Open issues:

  • one-filesystem option
  • compress
  • rsync over ssh (or sshfs)
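For what it’s worth, rsync already has switches matching these points; a quick sketch (the host and remote path are invented for the example):

    # -x / --one-file-system : do not cross mount points
    # -z / --compress        : compress file data during the transfer
    # -e ssh                 : run the transfer over ssh instead of a mounted disk
    rsync -axz -e ssh / backupuser@backuphost:/srv/backups/neth/2018-01-22-230040

Whether these fit with the hard-link rotation (especially over ssh) is part of what still needs testing.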

In that case: of course.

It’s been running for a month here. It started expiring backups:

rsync_tmbackup: Previous backup found - doing incremental backup from /mnt/backup/mattlabs/2018-01-21-230030
rsync_tmbackup: Creating destination /mnt/backup/mattlabs/2018-01-22-230040
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-125033
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-122849
rsync_tmbackup: Expiring /mnt/backup/mattlabs/2018-01-21-121344
rsync_tmbackup: Starting backup... 

I like the beauty and simplicity of that script :slight_smile:


Just a suggestion: expiration should be done after the current backup job has finished.
Otherwise, if your retention policy is very “short”, you’d find yourself with no good backup.

my 2c

I feel you :slight_smile:

The script proceeds as described in the doc:

The script automatically deletes old backups using the following logic:

  • Within the last 24 hours, all the backups are kept.
  • Within the last 31 days, the most recent backup of each day is kept.
  • After 31 days, only the most recent backup of each month is kept.

Additionally, if (and only if) the backup destination directory is full, the oldest backups are deleted until enough space is available.

Therefore I believe that we are on the safe side.


I took note that the Retention policy setting is ignored.

Absolutely. I love the dumb approach: HDD not full? --> fill it. HDD full? --> delete the oldest backups until there is enough space. The user doesn’t have to worry and enjoys as many backups as the device can hold.
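That approach could look roughly like this (a sketch with an assumed 95% threshold and example paths; the actual script may detect “full” differently):

    DEST=/mnt/backup/mattlabs
    # while the destination filesystem is (almost) full, drop the oldest backup
    while [ "$(df --output=pcent "$DEST" | tail -1 | tr -dc '0-9')" -ge 95 ]; do
        oldest=$(ls -1d "$DEST"/????-??-??-?????? | sort | head -1)
        [ -n "$oldest" ] || break
        echo "deleting $oldest to free space"
        rm -rf -- "$oldest"
    done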

Have a look at the Apple doc for Time Machine, I think we should mimic that: https://support.apple.com/en-us/HT201250
