SOGo restoration is broken

Today I uninstalled NS8 and then installed it again.
I loaded the latest cluster configuration backup.

Then I tried once again to restore the Samba app but ran into the same problem as in my previous post.
Afterward I tried to restore the mail app, which seems to have worked: no errors, the service responds on port 25 with the expected greeting, and I was able to connect a client via IMAP.

I went on to try a restore of SOGo, which unfortunately failed. Not sure if this is related to the topic.

Jun 08 15:34:42 nethserver agent@sogo1[16606]: Traceback (most recent call last):
Jun 08 15:34:42 nethserver agent@sogo1[16606]:   File "/home/sogo1/.config/actions/restore-module/06copyenv", line 54, in <module>
Jun 08 15:34:42 nethserver agent@sogo1[16606]:     agent.set_env(evar, original_environment[evar])
Jun 08 15:34:42 nethserver agent@sogo1[16606]:                         ~~~~~~~~~~~~~~~~~~~~^^^^^^
Jun 08 15:34:42 nethserver agent@sogo1[16606]: KeyError: 'TRAEFIK_HTTP2HTTPS'
Jun 08 15:34:42 nethserver agent@sogo1[16606]: task/module/sogo1/41770622-c66f-4595-a58a-0ca7768e2197: action "restore-module" status is "aborted" (1) at step 06copyenv
Jun 08 15:34:42 nethserver agent@cluster[401]: Assertion failed
Jun 08 15:34:42 nethserver agent@cluster[401]:   File "/var/lib/nethserver/cluster/actions/restore-module/50restore_module", line 93, in <module>
Jun 08 15:34:42 nethserver agent@cluster[401]:     agent.assert_exp(restore_task_result['exit_code'] == 0)
Jun 08 15:34:42 nethserver traefik[16026]: 192.168.10.x - - [08/Jun/2024:13:34:42 +0000] "GET /cluster-admin/api/module/sogo1/task/41770622-c66f-4595-a58a-0ca7768e2197/context HTTP/2.0" 200 947 "-" "-" 355 "ApiServer-https@file" "htt>
Jun 08 15:34:42 nethserver agent@cluster[401]: task/cluster/c8ee8010-8a35-4043-abba-eabc59eebb23: action "restore-module" status is "aborted" (2) at step 50restore_module

For SOGo this is a valid bug.

The key TRAEFIK_HTTP2HTTPS does not exist in the environment being restored.
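To illustrate the failure mode: the traceback shows a plain dictionary lookup, original_environment[evar], which raises KeyError when the backup was taken before that variable existed. The following is only a minimal sketch of a tolerant copy loop, not the actual fix applied to the module. The helper name copy_env, the sample values, and the use of a plain dict in place of agent.set_env are all assumptions made for illustration.

```python
def copy_env(names, original_environment, set_env):
    """Copy each named variable found in the backup; return the names that were missing."""
    missing = []
    for evar in names:
        if evar in original_environment:
            set_env(evar, original_environment[evar])
        else:
            # The backup predates this variable: skip it instead of raising KeyError.
            missing.append(evar)
    return missing


# Hypothetical backup environment created before TRAEFIK_HTTP2HTTPS was introduced.
backup_env = {"TRAEFIK_HOST": "sogo.example.org", "MAIL_DOMAIN": "example.org"}

# A plain dict stands in for the module environment (assumption, not agent.set_env).
restored = {}
missing = copy_env(
    ["TRAEFIK_HOST", "TRAEFIK_HTTP2HTTPS", "MAIL_DOMAIN"],
    backup_env,
    restored.__setitem__,
)
print("missing from backup:", missing)
```

Whether a missing variable should be skipped or given a default is a design decision for the module maintainer; the sketch only shows that the lookup can be guarded.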

Hello @Viking,

Could you please test the clone and the backup/restore of SOGo?

Thanks in advance.

Hi @stephdl,
Sorry, I am not exactly sure how I should do it.

What do you mean by clone?
As I am restoring from scratch, I do not have a clone of SOGo.

I just tried to restore as before, but since it fetches the older SOGo 1.0.9 from the repo, it obviously will not work.

So I am a little lost here… :wink:

Install the new version as asked, with add-module ghcr.io/nethserver/sogo:1.1.12-dev.1

Then do a restore.
After that you could try to clone: go to Software, then SOGo, and right-click at the top; you will get the option to clone the module.

Still unsure of the exact procedure. This is what I did:

  1. Started with no SOGo instances at all.

  2. Ran add-module ghcr.io/nethserver/sogo:1.1.12-dev.1

  3. Restored SOGo from backup; this failed with KeyError: 'TRAEFIK_HTTP2HTTPS'

  4. Cloned the failed instance; this failed with KeyError: 'TRAEFIK_LETS_ENCRYPT'

  5. Cloned the 1.1.12-dev.1 instance; this failed with KeyError: 'ADMIN_USERS'

Log excerpts:

jun 10 20:02:19 nethserver agent@sogo4[5017]: Traceback (most recent call last):
jun 10 20:02:19 nethserver agent@sogo4[5017]:   File "/home/sogo4/.config/actions/clone-module/50call-configure-module", line 18, in <module>
jun 10 20:02:19 nethserver agent@sogo4[5017]:     "lets_encrypt": os.environ["TRAEFIK_LETS_ENCRYPT"] == 'True',
jun 10 20:02:19 nethserver agent@sogo4[5017]:                     ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
jun 10 20:02:19 nethserver agent@sogo4[5017]:   File "<frozen os>", line 679, in __getitem__
jun 10 20:02:19 nethserver agent@sogo4[5017]: KeyError: 'TRAEFIK_LETS_ENCRYPT'
jun 10 20:02:19 nethserver traefik[1336]: 192.168.10.243 - - [10/Jun/2024:18:02:18 +0000] "GET /cluster-admin/api/module/sogo3/task/85a4914d-19a6-4cdb-9631-d7c19b6670c3/context HTTP/2.0" 200 242 "-" "-" 1211 "ApiServer-https@file" "http://127.0.0.1:9311" 936ms
jun 10 20:02:19 nethserver agent@sogo4[5017]: task/module/sogo4/51056bb7-162e-45c2-ad1e-27fe3ccb7c4e: action "clone-module" status is "aborted" (1) at step 50call-configure-module
jun 10 20:02:19 nethserver agent@cluster[511]: Task module/sogo4/clone-module run failed: {'output': '', 'error': '<7>podman-pull-missing ghcr.io/nethserver/rsync:2.8.2\nTrying to pull ghcr.io/nethserver/rsync:2.8.2...\nGetting image source signatures\nCopying blob sha256:759a408b2356b87060583e867bdc2455d466cfebc6bb909e9c7fbc3825fbe3f9\nCopying blob sha256:82e94dfc42a5139b349717d0ef16203b01c32a0a5c61b3b3c034e6a329868fbf\nCopying config sha256:0ae319bd32975897aedf29d724406cbb015daf19b620f8c6bdf7a44e870309a2\nWriting manifest to image destination\nStoring signatures\n0ae319bd32975897aedf29d724406cbb015daf19b620f8c6bdf7a44e870309a2\n<7>podman run --rm --privileged --network=host --workdir=/srv --env=RSYNCD_NETWORK=10.5.4.0/24 --env=RSYNCD_ADDRESS=cluster-localnode --env=RSYNCD_PORT=20016 --env=RSYNCD_USER=sogo3 --env=RSYNCD_PASSWORD=4778a77c8188-4cec-411d-90b1-de8d9d417c9c --env=RSYNCD_SYSLOG_TAG=sogo4 --volume=/dev/log:/dev/log --volume=/home/sogo4/.config/state:/srv/state ghcr.io/nethserver/rsync:2.8.2\nImporting ADMIN_USERS from source instance\nImporting AUXILIARYACCOUNT from source instance\nImporting LDAP_DOMAIN from source instance\nImporting MAIL_DOMAIN from source instance\nImporting MAIL_SERVER from source instance\nImporting TRAEFIK_HOST from source instance\nTraceback (most recent call last):\n  File "/home/sogo4/.config/actions/clone-module/50call-configure-module", line 18, in <module>\n    "lets_encrypt": os.environ["TRAEFIK_LETS_ENCRYPT"] == \'True\',\n                    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^\n  File "<frozen os>", line 679, in __getitem__\nKeyError: \'TRAEFIK_LETS_ENCRYPT\'\n', 'exit_code': 1}
jun 10 20:02:19 nethserver agent@cluster[511]: task/cluster/a77c8188-4cec-411d-90b1-de8d9d417c9c: action "clone-module" status is "aborted" (1) at step 50clone_module
jun 10 20:15:00 nethserver agent@sogo5[5822]: task/module/sogo5/0b40b2f1-3b3a-4d64-a8a1-9349d5a0e00e: clone-module/50call-configure-module is starting
jun 10 20:15:00 nethserver agent@sogo5[5822]: Traceback (most recent call last):
jun 10 20:15:00 nethserver agent@sogo5[5822]:   File "/home/sogo5/.config/actions/clone-module/50call-configure-module", line 14, in <module>
jun 10 20:15:00 nethserver agent@sogo5[5822]:     "admin_users": os.environ["ADMIN_USERS"],
jun 10 20:15:00 nethserver agent@sogo5[5822]:                    ~~~~~~~~~~^^^^^^^^^^^^^^^
jun 10 20:15:00 nethserver agent@sogo5[5822]:   File "<frozen os>", line 679, in __getitem__
jun 10 20:15:00 nethserver agent@sogo5[5822]: KeyError: 'ADMIN_USERS'
jun 10 20:15:00 nethserver traefik[1336]: 192.168.10.243 - - [10/Jun/2024:18:14:59 +0000] "GET /cluster-admin/api/module/sogo2/task/42f0b28b-cf36-4fd8-acd5-67c5c92c33d2/context HTTP/2.0" 200 242 "-" "-" 1544 "ApiServer-https@file" "http://127.0.0.1:9311" 956ms
jun 10 20:15:00 nethserver agent@sogo5[5822]: task/module/sogo5/0b40b2f1-3b3a-4d64-a8a1-9349d5a0e00e: action "clone-module" status is "aborted" (1) at step 50call-configure-module
jun 10 20:15:00 nethserver agent@cluster[511]: Task module/sogo5/clone-module run failed: {'output': '', 'error': '<7>podman-pull-missing ghcr.io/nethserver/rsync:2.8.2\nTrying to pull ghcr.io/nethserver/rsync:2.8.2...\nGetting image source signatures\nCopying blob sha256:759a408b2356b87060583e867bdc2455d466cfebc6bb909e9c7fbc3825fbe3f9\nCopying blob sha256:82e94dfc42a5139b349717d0ef16203b01c32a0a5c61b3b3c034e6a329868fbf\nCopying config sha256:0ae319bd32975897aedf29d724406cbb015daf19b620f8c6bdf7a44e870309a2\nWriting manifest to image destination\nStoring signatures\n0ae319bd32975897aedf29d724406cbb015daf19b620f8c6bdf7a44e870309a2\n<7>podman run --rm --privileged --network=host --workdir=/srv --env=RSYNCD_NETWORK=10.5.4.0/24 --env=RSYNCD_ADDRESS=cluster-localnode --env=RSYNCD_PORT=20018 --env=RSYNCD_USER=sogo2 --env=RSYNCD_PASSWORD=5588eef154f4-b1f3-40cb-ad9e-b46c54df5e78 --env=RSYNCD_SYSLOG_TAG=sogo5 --volume=/dev/log:/dev/log --volume=/home/sogo5/.config/state:/srv/state ghcr.io/nethserver/rsync:2.8.2\nTraceback (most recent call last):\n  File "/home/sogo5/.config/actions/clone-module/50call-configure-module", line 14, in <module>\n    "admin_users": os.environ["ADMIN_USERS"],\n                   ~~~~~~~~~~^^^^^^^^^^^^^^^\n  File "<frozen os>", line 679, in __getitem__\nKeyError: \'ADMIN_USERS\'\n', 'exit_code': 1}
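Long excerpts like these are easier to compare once reduced to "which action aborted at which step, and which variable was missing". The following is a rough sketch of such a summary, assuming the journal line layout seen in this thread. The function name summarize and the two regular expressions are my own and are not part of any NS8 tooling.

```python
import re

# Matches the final traceback line, e.g. "agent@sogo4[5017]: KeyError: 'NAME'".
KEY_ERROR = re.compile(r"agent@(?P<module>\w+)\[\d+\]: KeyError: '(?P<key>[^']+)'")
# Matches the agent line reporting that an action was aborted at a given step.
ABORTED = re.compile(
    r'agent@(?P<module>\w+)\[\d+\]: task/\S+: action "(?P<action>[^"]+)" '
    r'status is "aborted" \(\d+\) at step (?P<step>\S+)'
)


def summarize(lines):
    """Return (module, action, step, missing_key) for each aborted action."""
    last_key = {}   # most recent KeyError name seen per module agent
    failures = []
    for line in lines:
        m = KEY_ERROR.search(line)
        if m:
            last_key[m["module"]] = m["key"]
            continue
        m = ABORTED.search(line)
        if m:
            failures.append(
                (m["module"], m["action"], m["step"], last_key.get(m["module"]))
            )
    return failures


# Three lines taken from the excerpt above.
sample = [
    "jun 10 20:02:19 nethserver agent@sogo4[5017]: KeyError: 'TRAEFIK_LETS_ENCRYPT'",
    'jun 10 20:02:19 nethserver agent@sogo4[5017]: task/module/sogo4/51056bb7-162e-45c2-ad1e-27fe3ccb7c4e: action "clone-module" status is "aborted" (1) at step 50call-configure-module',
    'jun 10 20:02:19 nethserver agent@cluster[511]: task/cluster/a77c8188-4cec-411d-90b1-de8d9d417c9c: action "clone-module" status is "aborted" (1) at step 50clone_module',
]
report = summarize(sample)
for entry in report:
    print(entry)
```

The cluster-level entry carries no key of its own because the KeyError is raised inside the module agent; the cluster task only reports that the module task failed.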

Try it this way:

Install the version I asked for with add-module ghcr.io/nethserver/sogo:1.1.12-dev.1 (let's say the module will be sogo5).

Configure this sogo5 instance.
Remove all old SOGo backups from the backup destination.
Do a backup of this sogo5 instance.
Restore sogo5 by replacing the module.

This is supposed to work without issue; the module is now sogo6.

Then try to clone sogo6.

Now I see the source of confusion… we have different objectives.

My objective is to restore the data contained in my backup.
If I cannot restore my backup, all address books and calendars would be gone had this been a production environment (no worries, luckily this is a test system).
The only way I see that might work now is to restore the files from the backup, wipe the SOGo database, and restore it from sogo.sql (I think I will put that to the test too).

I will test disaster recovery on 1.1.12-dev and let you know how that works on my end.

I successfully restored sogo:1.1.12-dev.1; preferences, address book and calendar are all as expected.

I also tried to recover the original backup and manually replaced the database; this also worked as hoped.

Thanks @stephdl

Could you try to clone your SOGo version now?

  • Cloned sogo8 → 9
  • Removed old instance sogo8
  • Created backup sogo9
  • Restored by replacing the cloned instance sogo9 → 10

No errors and looks as expected. :slight_smile:
