opened 03:00PM - 03 Jun 24 UTC
bug
If the ns8-join command of the migration tool fails, a duplicate Redis key is generated for each failed attempt. If many failed attempts were run, the WireGuard peer table is polluted by duplicates and the wg0 configuration breaks.
**Steps to reproduce**
- Create a cluster with a bad VPN host endpoint (expand the advanced section of the VPN form to find it). For example, set `myhost.dom.test`. As a consequence, the leader FQDN is not in DNS: a requirement that, despite the docs, is often forgotten.
- Join the NS8 cluster using the leader IP address, e.g. `ns8-join --no-tlsverify <LEADER_IP> admin Nethesis,1234`
- Leave the cluster, e.g. `ns8-leave`
- Repeat leave/join steps 5 times
**Expected behavior**
The join fails. Only the last join attempt, the one with the highest NODE_ID, is left in the Redis DB.
**Actual behavior**
After the last join attempt, on ns7:
```
[root@nscom2 ~]# config show wg-quick@ns8
wg-quick@ns8=service
Address=10.5.4.7
RemoteEndpoint=rl1.dom.test:55820
RemoteKey=XXXXXXXX
RemoteNetwork=10.5.4.0/24
status=enabled
```
Node keys from the first join attempt are still in place:
```
[root@rl1 ~]# redis-cli keys node/*/vpn
1) "node/7/vpn"
2) "node/5/vpn"
3) "node/4/vpn"
4) "node/6/vpn"
5) "node/3/vpn"
6) "node/2/vpn"
7) "node/1/vpn"
```
They overwrite the WireGuard "allowed ips" field of the peer, breaking the VPN configuration:
```
[root@rl1 ~]# wg
interface: wg0
public key: pfd5Bm8HnII6ZC18Ojuhrn02sBen1fvDX29KroKARxs=
private key: (hidden)
listening port: 55820
peer: RKUWF/SLwotQJq5OfDxUFSoHhSZ0D7kwGMAocwX9FSI=
allowed ips: 10.5.4.5/32
persistent keepalive: every 25 seconds
```
:warning: Note the IP 10.5.4.5: it comes from a stale Redis node key, whereas the last join attempt assigned 10.5.4.7.
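To illustrate how the stale keys could be spotted, here is a minimal Python sketch. It is not the migration tool's code. It assumes that each `node/<ID>/vpn` record carries a `public_key` field and that repeated join attempts from the same ns7 host reuse the same public key, which is consistent with the single peer shown by `wg` above. The function name `find_stale_vpn_keys` and the placeholder key values are hypothetical.

```python
import re
from collections import defaultdict

# Matches key names such as "node/7/vpn" and captures the node ID.
NODE_VPN_KEY = re.compile(r"^node/(\d+)/vpn$")


def find_stale_vpn_keys(vpn_records):
    """Return the node/<ID>/vpn keys superseded by a newer record.

    vpn_records maps a Redis key name to a dict of its fields.
    The "public_key" field name is an assumption of this sketch.
    For each public key, only the record with the highest node ID
    is considered current; the others are reported as stale.
    """
    by_pubkey = defaultdict(list)
    for key, fields in vpn_records.items():
        match = NODE_VPN_KEY.match(key)
        if match is None:
            continue  # not a node VPN key, ignore it
        by_pubkey[fields["public_key"]].append((int(match.group(1)), key))

    stale = []
    for entries in by_pubkey.values():
        entries.sort()  # ascending node ID, the last entry is the newest
        stale.extend(key for _node_id, key in entries[:-1])
    # Sort the result numerically by node ID for readable output.
    return sorted(stale, key=lambda k: int(NODE_VPN_KEY.match(k).group(1)))


# Illustrative data shaped like the key listing above.
# Placeholder public keys: node 1 is assumed to be the leader,
# nodes 2 to 7 are assumed to be the repeated ns7 join attempts.
sample = {"node/1/vpn": {"public_key": "LEADER_PUBKEY"}}
for node_id in range(2, 8):
    sample[f"node/{node_id}/vpn"] = {"public_key": "NS7_PUBKEY"}

print(find_stale_vpn_keys(sample))
```

With the sample data, keys `node/2/vpn` through `node/6/vpn` are reported as stale, while `node/1/vpn` and `node/7/vpn` are kept. Grouping by public key rather than simply keeping the highest node ID avoids flagging records that belong to other cluster members.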
**Components**
- core 2.8.1
- nethserver-ns8-migration-1.0.12-1.ns7.x86_64
**See also**
https://community.nethserver.org/t/migration-tool-duplicates-redis-keys-of-node/23789
Thanks to @mrmarkuz