NS7 to NS8 migration - sync data fails

danb35 · March 18, 2024, 11:43am

I’m attempting to migrate data from my NS7 server to NS8 using the nethserver-ns8-migration tool. The initial migration seems to have worked–hundreds of GB of data are now on the new server–but whenever I try to “Sync data” for either Email or Nextcloud (the only two modules for which that option is available), I get an error.

The result of the “copy command” is:

[root@neth ~]#  echo '{"app":"nethserver-mail","action":"sync"}' | /usr/bin/setsid /usr/bin/sudo /usr/libexec/nethserver/api/nethserver-ns8-migration/migration/update | jq
{
  "progress": "0.00",
  "time": "0.0",
  "exit": 0,
  "event": "migration-sync",
  "state": "running",
  "step": 0,
  "pid": 0,
  "action": ""
}
{
  "pid": 0,
  "status": "failed",
  "event": "migration-sync"
}
{
  "id": "1710761311",
  "type": "ApiFailed",
  "message": "sync nethserver-mail failed"
}

…which doesn’t seem very informative. There’s a delay of several seconds between the end of the first block (the one ending with "action": "") and the beginning of the second (the one that starts with "pid": 0).

The NS7 server is joined to the cluster and able to ping the leader node:

[root@neth ~]# ping 10.5.4.1
PING 10.5.4.1 (10.5.4.1) 56(84) bytes of data.
64 bytes from 10.5.4.1: icmp_seq=1 ttl=64 time=100 ms
64 bytes from 10.5.4.1: icmp_seq=2 ttl=64 time=107 ms
64 bytes from 10.5.4.1: icmp_seq=3 ttl=64 time=107 ms
64 bytes from 10.5.4.1: icmp_seq=4 ttl=64 time=100 ms
^C
--- 10.5.4.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 100.308/104.064/107.568/3.490 ms

The NS7 server has nethserver-ns8-migration-1.0.8-1.ns7.x86_64 and rsync-3.1.2-10.el7.x86_64–I haven’t been upgrading rsync because every time I do, it breaks hotsync. Where else should I be looking to track this down?

danb35 · March 20, 2024, 7:10pm

Bump.

dnutan · March 20, 2024, 7:48pm

Anything else that stands out from /var/log/ns8-migration.log?

danb35 · March 21, 2024, 10:34am

Looks like an rsync problem–I never knew rsync was so fragile. When I try to sync email, I get a bunch of listings which I assume correspond to individual emails, then:

.d..t...... wx_user@familybrown.org/Maildir/.Junk/tmp/
.d..t...... ./
<f.st...... dump.rdb
<f..T...... default.private
<f..T...... default.txt
removed ‘roundcubemail.sql’
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(125) [sender=3.1.2]

The only thing that appears in the log file when I try the Nextcloud migration is this:

----------- sync nethserver-nextcloud Thu, 21 Mar 2024 06:19:36 -0400

Let’s try updating rsync and seeing what happens. Nope, same error:

.d..t...... ./
<f.st...... dump.rdb
<f..T...... default.private
<f..T...... default.txt
removed ‘roundcubemail.sql’
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]

…and still nothing at all with Nextcloud.

Here’s what’s reported in ns8 for the attempted Nextcloud migration:
{"context":{"action":"import-module","data":{"credentials":["nextcloud1","171331998a0a3ec-bf0f-4f21-81ec-0d3bdd9f7a7b"],"port":20020,"volumes":["nextcloud-app-data"]},"extra":{"description":"ns8-action endpoint http://10.5.4.1:9311","isNotificationHidden":false,"title":"module/nextcloud1/import-module"},"id":"3dff11dc-87c2-4677-8e2b-b1cb8ebd94d9","parent":"","queue":"module/nextcloud1/tasks","timestamp":"2024-03-21T10:28:50.319716473Z","user":"admin"},"status":"aborted","progress":50,"subTasks":[],"validated":true,"result":{"error":"<7>podman-pull-missing ghcr.io/nethserver/rsync:2.5.5\n<7>podman run --rm --privileged --network=host --workdir=/srv --env=RSYNCD_NETWORK=10.5.4.0/24 --env=RSYNCD_ADDRESS=cluster-localnode --env=RSYNCD_PORT=20020 --env=RSYNCD_USER=nextcloud1 --env=RSYNCD_PASSWORD=(redacted) --env=RSYNCD_SYSLOG_TAG=nextcloud1 --volume=/dev/log:/dev/log --name=rsync-nextcloud1 --volume=/home/nextcloud1/.config/state:/srv/state --volume=nextcloud-app-data:/srv/volumes/nextcloud-app-data --volume=restic-cache:/srv/volumes/restic-cache ghcr.io/nethserver/rsync:2.5.5\nError: creating container storage: the container name \"rsync-nextcloud1\" is already in use by 9eee0061bee0a5b4aba7ff563581e9a7bba76d3b0bcd425fa4591e19d20e8c99. You have to remove that container to be able to reuse that name: that name is already in use\nTraceback (most recent call last):\n File \"/usr/local/agent/actions/import-module/10recvstate\", line 49, in <module>\n agent.run_helper(*podman_cmd, core_env['RSYNC_IMAGE']).check_returncode()\n File \"/usr/lib/python3.11/subprocess.py\", line 502, in check_returncode\n raise CalledProcessError(self.returncode, self.args, self.stdout,\nsubprocess.CalledProcessError: Command '('podman', 'run', '--rm', '--privileged', '--network=host', '--workdir=/srv', '--env=RSYNCD_NETWORK=10.5.4.0/24', '--env=RSYNCD_ADDRESS=cluster-localnode', '--env=RSYNCD_PORT=20020', '--env=RSYNCD_USER=nextcloud1', '--env=RSYNCD_PASSWORD=(redacted)', '--env=RSYNCD_SYSLOG_TAG=nextcloud1', '--volume=/dev/log:/dev/log', '--name=rsync-nextcloud1', '--volume=/home/nextcloud1/.config/state:/srv/state', '--volume=nextcloud-app-data:/srv/volumes/nextcloud-app-data', '--volume=restic-cache:/srv/volumes/restic-cache', 'ghcr.io/nethserver/rsync:2.5.5')' returned non-zero exit status 125.\n","exit_code":1,"file":"task/module/nextcloud1/3dff11dc-87c2-4677-8e2b-b1cb8ebd94d9","output":""}}

…and it thinks the mail sync is still ongoing, despite its having errored out some time ago.

dnutan · March 21, 2024, 11:37am

Shouldn’t rsync be able to continue from where it left off? That’s one of its main features.
(Not asking you, Dan.)

IDK, maybe you could check the I/O stats, or whether the corresponding rsync process is still running.

danb35 · March 21, 2024, 12:33pm

Good thought, and it looks like it is on the NS8 box:

root@ns8:~# ps aux | grep rsync
mail1    1179773  0.2  0.1 1639068 48772 ?       Sl   Mar18  11:07 podman run --rm --privileged --network=host --workdir=/srv --env=RSYNCD_NETWORK=10.5.4.0/24 --env=RSYNCD_ADDRESS=cluster-localnode --env=RSYNCD_PORT=20022 --env=RSYNCD_USER=mail1 --env=RSYNCD_PASSWORD=(redacted) --env=RSYNCD_SYSLOG_TAG=mail1 --volume=/dev/log:/dev/log --name=rsync-mail1 --volume=/home/mail1/.config/state:/srv/state --volume=dovecot-data:/srv/volumes/dovecot-data --volume=restic-cache:/srv/volumes/restic-cache --volume=rspamd-redis:/srv/volumes/rspamd-redis ghcr.io/nethserver/rsync:2.5.4
mail1    1179786  0.0  0.0   8848  2024 ?        Ss   Mar18   0:00 /usr/bin/conmon --api-version 1 -c 755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a -u 755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a -r /usr/bin/crun -b /home/mail1/.local/share/containers/storage/vfs-containers/755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a/userdata -p /run/user/1010/containers/vfs-containers/755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a/userdata/pidfile -n rsync-mail1 --exit-dir /run/user/1010/libpod/tmp/exits --full-attach -s -l journald --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1010/containers/vfs-containers/755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a/userdata/oci-log --conmon-pidfile /run/user/1010/containers/vfs-containers/755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/mail1/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1010/containers --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1010/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /home/mail1/.local/share/containers/storage/volumes --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg 755eb5549b60d9566e3d1d4f40882956e29aaf85d6be1e8dd393cfee54db0c4a
mail1    1179788  0.0  0.0   2316  1396 ?        Ss   Mar18   0:00 rsync --daemon --no-detach
root     1349867  0.0  0.0   6336  2076 pts/0    S+   08:26   0:00 grep rsync

It isn’t on the NS7 box, though:

[root@neth ~]# ps aux | grep rsync
root     18173  8.9  0.0 333160 22908 ?        S    08:21   0:51 /usr/bin/rsync -z -r -a -H -A --delete --files-from=/tmp/tmp.V5OsAj0LcT --exclude-from=/tmp/tmp.tyZp2IB204 / rsync://hotsyncuser@127.0.0.1/hotsync/
root     20472  0.0  0.0 112816   980 pts/0    S+   08:31   0:00 grep --color=auto rsync
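Since the rsync daemon is running on the NS8 side while the NS7 side reports "Connection refused", one way to narrow it down is to probe the daemon's TCP port from NS7. The following is only a sketch: `port_open` is a hypothetical helper name, it assumes `bash` and coreutils `timeout` are available, and the port number in the usage comment is taken from `RSYNCD_PORT=20022` in the `ps` output above, which may differ between runs.

```shell
# Return 0 if a TCP connection to host ($1) on port ($2) succeeds
# within 3 seconds, non-zero otherwise. Uses bash's /dev/tcp feature.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (on the NS7 box; address and port are taken from the thread):
#   port_open 10.5.4.1 20022 && echo "listening" || echo "refused or unreachable"
```

A refusal here while `rsync --daemon` shows up in `ps` on NS8 would suggest the daemon is listening on a different address or port than the one the NS7 client is trying.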

dnutan · March 21, 2024, 12:39pm

And under a different name, like migrat*?

danb35 · March 21, 2024, 12:42pm

Hmmm:

[root@neth ~]# ps aux | grep migrat
root         7  0.0  0.0      0     0 ?        S     2023   1:36 [migration/0]
root        13  0.0  0.0      0     0 ?        S     2023   1:36 [migration/1]
root        18  0.0  0.0      0     0 ?        S     2023   1:20 [migration/2]
root        23  0.0  0.0      0     0 ?        S     2023   1:19 [migration/3]
root        28  0.0  0.0      0     0 ?        S     2023   1:26 [migration/4]
root        33  0.0  0.0      0     0 ?        S     2023   1:43 [migration/5]
root        38  0.0  0.0      0     0 ?        S     2023   1:34 [migration/6]
root        43  0.0  0.0      0     0 ?        S     2023   1:53 [migration/7]
root     22309  0.0  0.0 112816   980 pts/0    S+   08:42   0:00 grep --color=auto migrat
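The `[migration/N]` entries above are kernel threads (one per CPU), not the NS8 migration tool. A minimal sketch for hiding them, assuming kernel threads are children of `kthreadd` (PID 2), as is normally the case on Linux; `userspace_matches` is a hypothetical helper name:

```shell
# Read `ps -eo pid,ppid,args` output on stdin and print only the lines
# that match the pattern in $1 and do not belong to kernel threads
# (PID 2 itself, or any process whose parent is PID 2).
userspace_matches() {
    PAT="$1" awk '$1 != 2 && $2 != 2 && $0 ~ ENVIRON["PAT"]'
}

# Usage:
#   ps -eo pid,ppid,args | userspace_matches migrat
```

The pattern is passed through the environment rather than on the command line so that the filter's own `awk` process does not match itself.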

Andy_Wismer · March 21, 2024, 1:57pm

Depending on the amount of data, the hardware, and the connection, this can take hours. Nothing moves in the GUI, which raises the suspicion that the box is frozen.

Using e.g. htop, you can see it’s doing something, and df should show disk usage growing.

But you may need plenty of patience…

I left it overnight “as is”, and by the next morning it was done…
It’s not always the raw size that matters, but the number of small files…

My 2 cents
Andy
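The "watch df grow" check suggested above can be sketched as a small helper that samples used space twice and prints the difference. This is only an illustration: `used_kib` and `growth_kib` are hypothetical names, and the path in the usage comment is a guess based on the `/home/mail1/...` volume path visible in the NS8 `ps` output.

```shell
# Print the used space, in KiB, of the filesystem holding the path in $1.
used_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Sample used space twice, $2 seconds apart, and print the growth in KiB.
growth_kib() {
    before=$(used_kib "$1")
    sleep "$2"
    after=$(used_kib "$1")
    echo $((after - before))
}

# Usage (on the NS8 box):
#   growth_kib /home/mail1 60
```

A result that stays at or near zero over several minutes would be consistent with nothing actually being transferred.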

danb35 · March 21, 2024, 3:19pm

Clearly the box isn’t frozen–the web UI is responsive (on both ends), I can ssh in and change things, etc. And I’m fine with the process taking a while, though I’d expect it to be faster than “several hours” for an update (“Sync data” as the NS8 migration module calls it). But if that’s what’s going on, the NS7 box shouldn’t have errored out, as it’s consistently doing.

Edit: but looks like there’s an update available to the ns8 migration module. Let’s see if it changes anything–because the NS7 box, at least as far as its web UI is concerned, doesn’t think it’s running any migration at all at this point.

So after updating to nethserver-ns8-migration-1.0.9-1, I get what seems to be the same result. Attempting to “sync data” with Nextcloud results only in a one-line entry in the ns8-migration.log (same as above). Taking the “copy command” from the NS7 box and running it at the shell results in:

[root@neth ~]#  echo '{"app":"nethserver-nextcloud","action":"sync"}' | /usr/bin/setsid /usr/bin/sudo /usr/libexec/nethserver/api/nethserver-ns8-migration/migration/update | jq
{
  "progress": "0.00",
  "time": "0.0",
  "exit": 0,
  "event": "migration-sync",
  "state": "running",
  "step": 0,
  "pid": 0,
  "action": ""
}
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
rsync: failed to connect to 10.5.4.1 (10.5.4.1): Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(126) [sender=3.1.2]
{
  "pid": 0,
  "status": "failed",
  "event": "migration-sync"
}
{
  "id": "1711034747",
  "type": "ApiFailed",
  "message": "sync nethserver-nextcloud failed"
}

The error shown on the NS8 box is:

{

    "context":{
        "action":"import-module",
        "data":{
            "credentials":[
                "nextcloud1",
                "171331998a0a3ec-bf0f-4f21-81ec-0d3bdd9f7a7b"
            ],
            "port":20020,
            "volumes":[
                "nextcloud-app-data"
            ]
        },
        "extra":{
            "description":"ns8-action endpoint http://10.5.4.1:9311",
            "isNotificationHidden":false,
            "title":"module/nextcloud1/import-module"
        },
        "id":"f889a6bc-2513-4e91-888f-2a501f6ea318",
        "parent":"",
        "queue":"module/nextcloud1/tasks",
        "timestamp":"2024-03-21T15:22:02.364374272Z",
        "user":"admin"
    },
    "status":"aborted",
    "progress":50,
    "subTasks":[
    ],
    "validated":true,
    "result":{
        "error":"<7>podman-pull-missing ghcr.io/nethserver/rsync:2.5.5\n<7>podman run --rm --privileged --network=host --workdir=/srv --env=RSYNCD_NETWORK=10.5.4.0/24 --env=RSYNCD_ADDRESS=cluster-localnode --env=RSYNCD_PORT=20020 --env=RSYNCD_USER=nextcloud1 --env=RSYNCD_PASSWORD=(redacted) --env=RSYNCD_SYSLOG_TAG=nextcloud1 --volume=/dev/log:/dev/log --name=rsync-nextcloud1 --volume=/home/nextcloud1/.config/state:/srv/state --volume=nextcloud-app-data:/srv/volumes/nextcloud-app-data --volume=restic-cache:/srv/volumes/restic-cache ghcr.io/nethserver/rsync:2.5.5\nError: creating container storage: the container name \"rsync-nextcloud1\" is already in use by 9eee0061bee0a5b4aba7ff563581e9a7bba76d3b0bcd425fa4591e19d20e8c99. You have to remove that container to be able to reuse that name: that name is already in use\nTraceback (most recent call last):\n File \"/usr/local/agent/actions/import-module/10recvstate\", line 49, in <module>\n agent.run_helper(*podman_cmd, core_env['RSYNC_IMAGE']).check_returncode()\n File \"/usr/lib/python3.11/subprocess.py\", line 502, in check_returncode\n raise CalledProcessError(self.returncode, self.args, self.stdout,\nsubprocess.CalledProcessError: Command '('podman', 'run', '--rm', '--privileged', '--network=host', '--workdir=/srv', '--env=RSYNCD_NETWORK=10.5.4.0/24', '--env=RSYNCD_ADDRESS=cluster-localnode', '--env=RSYNCD_PORT=20020', '--env=RSYNCD_USER=nextcloud1', '--env=RSYNCD_PASSWORD=(redacted)', '--env=RSYNCD_SYSLOG_TAG=nextcloud1', '--volume=/dev/log:/dev/log', '--name=rsync-nextcloud1', '--volume=/home/nextcloud1/.config/state:/srv/state', '--volume=nextcloud-app-data:/srv/volumes/nextcloud-app-data', '--volume=restic-cache:/srv/volumes/restic-cache', 'ghcr.io/nethserver/rsync:2.5.5')' returned non-zero exit status 125.\n",
        "exit_code":1,
        "file":"task/module/nextcloud1/f889a6bc-2513-4e91-888f-2a501f6ea318",
        "output":""
    }

}
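The useful part of that task result is the ID of the container that already holds the name `rsync-nextcloud1`. A small sketch for pulling it out of the error text, assuming the message keeps the wording "already in use by" followed by a 64-character hex ID as shown above; `conflicting_container_id` is a hypothetical helper name:

```shell
# Read error text on stdin and print the first 64-hex-digit container ID
# that follows the phrase "already in use by".
conflicting_container_id() {
    sed -n 's/.*already in use by \([0-9a-f]\{64\}\).*/\1/p' | head -n 1
}

# Usage (task.json is a hypothetical file holding the JSON above):
#   conflicting_container_id < task.json
```

This only reports the ID; it does not touch the container.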

Kind of wonder whether a reboot of both systems would help, though this isn’t Windows–but the NS7 box has been up over nine months.

danb35 · March 21, 2024, 4:26pm

Nope. Attempting “sync data” on Mail results in the same error in the migration log on the NS7 box, and the same error reported in its GUI. The NS8 box still shows (in the UI) that the sync is ongoing at 16% (where, if the past is any indication, it will stay forever). ps on the NS8 box looks pretty much the same as before.

dnutan · March 21, 2024, 5:40pm

If ALL the data still remains on NS7, then for nextcloud1 you might have to log in as the module user through runagent and then remove the container/storage 9eee0061bee0a5b4aba7ff563581e9a7bba76d3b0bcd425fa4591e19d20e8c99 using podman. Before that you could check for activity with lsof or other means, but it seems there’s none? Before doing anything, though, it would be wiser to hear from the developers (as I have no experience with podman).

do not run

runagent -m nextcloud1
# look for the ID reported by the log (9eee...0e8c99); if it doesn't exist, try with sudo
podman ps --all
# WARNING! Danger ahead! (another option is to rename it instead of removing it) ->>
podman rm -f 9eee0061bee0a5b4aba7ff563581e9a7bba76d3b0bcd425fa4591e19d20e8c99

I think there are some other commands to abort running tasks but…

Regarding Mail, we have no information/log on what’s cooking.

danb35 · March 21, 2024, 5:54pm

No data has disappeared from NS7 (as expected), but around 400 GB have previously been transferred to NS8. But that’s how the migration is supposed to work–or at least, how it’s documented to work: Start an initial migration, update it whenever and as many times as necessary or desired using the “Sync data” button, and then “finalize migration” whenever you’re done. It’s the “sync data” step that isn’t working.

Andy_Wismer · March 21, 2024, 8:36pm

Even though the “just reboot it” advice is usually associated with Windows, the truth is that it helps on ALL systems, Linux and Mac included.

The reason is simple and logical:

A reboot ensures that all systems / services / containers are started in the correct chronological order.
It will also free stale connections, memory, locks, ports, et al…

My 2 cents
Andy

danb35 · March 21, 2024, 8:45pm

…and, in the case of Linux, it also reloads the kernel–and I’m sure there have been a number of kernel updates in the 280 days the NS7 system was up. But it doesn’t seem to have done anything to address the migration problem.

Andy_Wismer · March 21, 2024, 8:46pm

The most likely culprit is probably still buried somewhere in the migration script…

It does this for ALL systems!

danb35 · March 23, 2024, 10:57am

It’s two days later, and this hasn’t changed. I think it’s safe to say that nothing is actually happening, but it’s unclear why NS8 thinks it is.

davidep · March 23, 2024, 11:02am

On ns7, try a

pgrep -a rsync

If you see hanging processes

pkill rsync

danb35 · March 23, 2024, 11:03am

[root@neth ~]# pgrep -a rsync
[root@neth ~]#

davidep · March 23, 2024, 11:33am

So on the ns8 side it seems there’s a container name conflict for Nextcloud, plus the Mail module reporting an rsync error.

You can try rebooting the ns8 node. The rsyncd server is restarted by the next sync run.