Backup failure with restic

Hi,

I am just trying out restic backups to Amazon S3. Last night I got an error message about a non-existent file. Can somebody help me understand the notification? I do not get it :)

Why does a single file have any impact? I would expect it to simply be skipped.

Also, when I re-check the backup configuration, it now reports an error while checking Amazon S3 (I did not change anything!).

TIA
Thorsten

Backup: test
Backup started at 2021-03-14 01:05:03
Pre backup scripts status: SKIPPED (concurrent backup is running)
using parent snapshot fd0d0471
error: lstat /var/lib/nethserver/ibay/backup/ecodms/dmsbackup_2021-03-14_01_00_00.part: no such file or directory

Files: 402 new, 256 changed, 149397 unmodified
Dirs: 14 new, 180 changed, 22608 unmodified
Added to the repo: 4.159 GiB

processed 150055 files, 445.101 GiB in 56:41
snapshot f9e6acab saved
Warning: failed to read all source data during backup
Backup failed
Action ‘backup-data-restic test’: FAIL
Backup status: FAIL

NethServer Version: 7.9.2009
Module: backup

@thorsten

Hi Thorsten

It looks like your backup was running at the same time ecoDMS was syncing its backup file (I use ecoDMS too!).

A .part file typically exists only DURING an rsync transfer…
Or, if an rsync transfer did not complete, it gets left behind as a “zombie”… (rsync aborts for whatever reason, typically an Internet outage on one side…).

Hope this helps troubleshoot (at least the single file part)…
Can’t help with S3, as I don’t use it (yet)…

My 2 cents
Andy

Hi Andy,

I am not sure:

  • The restic backup should start at 2:05 a.m., which is after the ecoDMS backup has finished (typically around 01:30).
  • Even if the ecoDMS backup were still running, I could accept that such a file is not backed up, but not that the complete job is cancelled.
  • I do not get why I cannot back up a running system. There will always be a file that is changed, deleted or added during the backup; for example, I cannot guarantee that no user is working at night. Otherwise I would have to take the server down during the backup.
  • The initial restic backup ran for 56 hours, from Monday morning to Wednesday afternoon. I should have seen the same errors then, as I was working on the NethServer while that backup was running.

More importantly, why can I not alter the backup settings anymore?

TIA
Thorsten
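
A note on the second bullet: the log in the first post says “snapshot f9e6acab saved”, so the job was not actually cancelled. As far as I know, recent restic versions let `restic backup` exit with status 3 when a snapshot was written but some source data could not be read, and with status 1 on a fatal error where no snapshot is created. My reading (an assumption, not confirmed in this thread) is that the NethServer action reports any non-zero status as FAIL. A minimal Python sketch of telling the cases apart; the function names are made up for illustration:

```python
import subprocess

# Exit statuses documented for `restic backup` in recent restic versions.
EXIT_OK = 0          # snapshot created, all source data read
EXIT_FATAL = 1       # fatal error, no snapshot created
EXIT_INCOMPLETE = 3  # snapshot created, but some source data could not be read


def classify_backup_exit(code):
    """Map a restic exit status to 'ok', 'incomplete' or 'failed'."""
    if code == EXIT_OK:
        return "ok"
    if code == EXIT_INCOMPLETE:
        return "incomplete"
    return "failed"


def run_backup(args):
    """Run a command line (list of strings) and classify its exit status."""
    proc = subprocess.run(args)
    return classify_backup_exit(proc.returncode)
```

With such a wrapper, a file that vanishes between directory listing and reading would show up as “incomplete” instead of a hard failure.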

@thorsten

Hi

I’d exclude .part files from the backup anyway. They’re almost always zombies and will not help later on if you need to restore something. As zombies, they never get cleaned up and remain as ballast… and some can be quite large…
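
To see whether any zombies are actually lying around before excluding them, something like the following Python sketch could list .part files that have not been modified for a while. The function name and the 24-hour threshold are my own choices; only the directory in the usage note comes from the error message in the first post. restic itself accepts `--exclude` patterns such as `*.part`, but how NethServer passes excludes to restic is not covered in this thread.

```python
import time
from pathlib import Path


def find_stale_part_files(root, min_age_hours=24.0, now=None):
    """Return *.part files under root whose mtime is older than min_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - min_age_hours * 3600
    stale = []
    for path in Path(root).rglob("*.part"):
        try:
            st = path.stat()
        except FileNotFoundError:
            # The file vanished between listing and stat,
            # the same kind of race the backup ran into.
            continue
        if path.is_file() and st.st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)
```

Usage would be along the lines of `find_stale_part_files("/var/lib/nethserver/ibay/backup/ecodms")`; a file that is still being written by a running transfer has a fresh mtime and is not reported.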

Does it help if you clean out the S3 Storage?

Maybe also the restic config, but I have no real idea where that lives; someone needs to give us more info… :) (Most likely there’s a “zombie” flag set, like “backup running”…).

My 2 cents
Andy

I really do not get it:

I did not change anything, but suddenly this error started to appear. I double-checked the keys, passwords, etc., and they are correct. I cannot even add a new backup on Amazon.

I am not sure, but I fear this could be a bug introduced by an update?

Or maybe it is a problem related to this:

Maybe @mrmarkuz has an idea?

You could try to test the connection with curl:

You may edit /usr/libexec/nethserver/api/system-backup/check-s3 and change curl -s to curl -v and run the script manually to see if it produces errors:

/usr/libexec/nethserver/api/system-backup/check-s3 <ACCESS_KEY> <SECRET_KEY> <BUCKET> <HOST>


@mrmarkuz

In the following output I replaced some values that I assume to be host IDs, passwords, keys, etc.

[root@ebb-s01 ~]# /usr/libexec/nethserver/api/system-backup/check-s3 mykey mysecret nethserver01 s3.amazonaws.com

<?xml version="1.0" encoding="UTF-8"?>
InvalidRequest
The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.
1VJK NY14568XJQBCP0kEJREXRGQz8UKwmAeD1CIVntzQztU5fzlx+URAdwiL6Wj86wp k4UGUA6Ft8yITcD3mFUjqedGO+o=

* About to connect() to nethserver01.s3.amazonaws.com port 443 (#0)
*   Trying 52.219.47.93...
* Connected to nethserver01.s3.amazonaws.com (52.219.47.93) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*   subject: CN=*.s3.amazonaws.com,O="Amazon.com, Inc.",L=Seattle,ST=Washington,C=US
*   start date: Jan 11 00:00:00 2021 GMT
*   expire date: Feb 11 23:59:59 2022 GMT
*   common name: *.s3.amazonaws.com
*   issuer: CN=DigiCert Baltimore CA-2 G2,OU=www.digicert.com,O=DigiCert Inc,C=US
> DELETE /tmp.vsWfnqJK3o HTTP/1.1
> User-Agent: curl/7.29.0
> Accept: */*
> Host: nethserver01.s3.amazonaws.com
> Date: Sun, 14 Mar 2021 12:28:09 +0000
> Authorization: AWS foo/bar=
>
* The requested URL returned error: 400 Bad Request
* Closing connection 0
curl: (22) The requested URL returned error: 400 Bad Request
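
The forum stripped the XML tags from the response body above. S3 REST errors normally arrive as an `<Error>` element with `<Code>`, `<Message>`, `<RequestId>` and `<HostId>` children, so the first text lines are the error code and message. If you save the raw body, a small Python sketch like this pulls the fields out; the function name and the sample values for RequestId and HostId are placeholders, not taken from the real response:

```python
import xml.etree.ElementTree as ET


def parse_s3_error(body):
    """Return the child elements of an S3 <Error> document as a dict."""
    if isinstance(body, str):
        # Parse bytes so the encoding declaration in the prolog is honoured.
        body = body.encode("utf-8")
    root = ET.fromstring(body)
    return {child.tag: (child.text or "") for child in root}


# Sample body; RequestId and HostId are hypothetical placeholder values.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<Error><Code>InvalidRequest</Code>"
    "<Message>The authorization mechanism you have provided is not supported. "
    "Please use AWS4-HMAC-SHA256.</Message>"
    "<RequestId>EXAMPLEREQUESTID</RequestId>"
    "<HostId>EXAMPLEHOSTID</HostId></Error>"
)

if __name__ == "__main__":
    err = parse_s3_error(SAMPLE)
    print(err["Code"], "-", err["Message"])
```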

Maybe you need to change Authorization: AWS to Authorization: AWS4-HMAC-SHA256 in /usr/libexec/nethserver/api/system-backup/check-s3, lines 46 and 53?

Source
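
One caveat on that suggestion: the header in the curl output (`Authorization: AWS key:signature`) is the older Signature Version 2 format, and Version 4 is not just a different label. It uses HMAC-SHA256 with a signing key derived from the date, region and service, and the header carries `Credential=`, `SignedHeaders=` and `Signature=` fields. So renaming the scheme alone will most likely still be rejected. A rough Python sketch of the difference, assuming placeholder keys, date and region; it does not build the canonical request or the real string to sign:

```python
import base64
import hashlib
import hmac


def sign_v2(secret_key, string_to_sign):
    """Signature Version 2: base64(HMAC-SHA1(secret, string_to_sign))."""
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def _hmac_sha256(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def signing_key_v4(secret_key, date_stamp, region, service="s3"):
    """Signature Version 4: derive the per-day, per-region signing key."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode(), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign_v4(secret_key, date_stamp, region, string_to_sign):
    """Hex HMAC-SHA256 of the string to sign, using the derived key."""
    key = signing_key_v4(secret_key, date_stamp, region)
    return hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print("v2:", sign_v2("mysecret", "placeholder-string-to-sign"))
    print("v4:", sign_v4("mysecret", "20210314", "us-east-1",
                         "placeholder-string-to-sign"))
```

As far as I know, Version 2 is only accepted in some older AWS regions, which would fit the guess that the behaviour depends on the bucket’s region.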

OK,

  1. I did not touch that file. The first run last week was perfect, and now it does not work anymore. Why is that change necessary? Was there an update on NethServer or Amazon S3?
  2. What do I need to change, and how?

TIA
Thorsten

I don’t use Amazon S3, so I don’t know, but maybe it depends on the region? There has been no update to the NethServer S3 check.

Maybe you can change the Authorization in the AWS settings?

Please try the following:

Sorry, I did not read closely enough. However, I tried changing it to AWS4-HMAC-SHA256 and got the same error.
And what really worries me: I did not touch or change the system, so what the hell changed that it does not work anymore???

You could test the S3 connection with an S3 client like Cyberduck to check if it’s an AWS problem.

This is not an AWS problem:

I can use Cyberduck to connect to S3, and it works.
I can use the NethServer backup to restore a file; the restore from last week’s backup works without problems…

OK, maybe the check has an issue. Does a recent backup work if you start it in the UI?

You may try to get the curl line working or send me a PM with credentials so I can try later today…

Weird: this morning the backup did not start at all. I have now started it manually and it is running; however, I still cannot change the settings.

You got a PM :) TIA
