NS8 installation on Rocky Linux fails

Installation on Rocky 9.1 fails:

Installing collected packages: typing-extensions, resolvelib, pyasn1, ptyprocess, lockfile, certifi, cchardet, urllib3, semver, redis, PyYAML, pyrsistent, pycparser, psutil, pexpect, packaging, multidict, MarkupSafe, ldap3, idna, hiredis, docutils, dnspython, chardet, attrs, async-timeout, yarl, requests, python-daemon, jsonschema, Jinja2, cffi, aioredis, pycares, cryptography, brotlipy, ansible-runner, aiohttp, ansible-core, aiodns
Successfully installed Jinja2-3.0.2 MarkupSafe-2.0.1 PyYAML-6.0 aiodns-3.0.0 aiohttp-3.7.4.post0 aioredis-2.0.1 ansible-core-2.14.1 ansible-runner-2.3.1 async-timeout-3.0.1 attrs-21.2.0 brotlipy-0.7.0 cchardet-2.1.7 certifi-2021.5.30 cffi-1.14.5 chardet-4.0.0 cryptography-39.0.0 dnspython-2.1.0 docutils-0.19 hiredis-2.0.0 idna-2.10 jsonschema-3.2.0 ldap3-2.9.1 lockfile-0.12.2 multidict-5.1.0 packaging-23.0 pexpect-4.8.0 psutil-5.8.0 ptyprocess-0.7.0 pyasn1-0.4.8 pycares-4.0.0 pycparser-2.20 pyrsistent-0.19.3 python-daemon-2.3.2 redis-3.5.3 requests-2.25.1 resolvelib-0.8.1 semver-2.13.0 typing-extensions-3.10.0.0 urllib3-1.26.6 yarl-1.6.3
Setup registry:
Add /etc/hosts entries:
Generate WireGuard VPN key pair:
Rnm8TDiW6SNLbSI5LUSwbbH6PDk5/rNNh+6iM8mND3A=
Add firewalld core rules:
Start Redis DB:
Created symlink /etc/systemd/system/default.target.wants/redis.service → /etc/systemd/system/redis.service.
Generating cluster password:
Write initial cluster environment state
Error: inspecting object: ghcr.io/nethserver/core:0.0.30: image not known
Generating api-server password:
Generating node password:
AUTH failed: WRONGPASS invalid username-password pair or user is disabled.
OK
OK
3
OK
OK
OK
OK
OK
OK
OK
OK
Start API server and core agents:
Created symlink /etc/systemd/system/multi-user.target.wants/api-server.service → /etc/systemd/system/api-server.service.
Created symlink /etc/systemd/system/default.target.wants/agent@cluster.service → /etc/systemd/system/agent@.service.
Created symlink /etc/systemd/system/default.target.wants/agent@node.service → /etc/systemd/system/agent@.service.
Grant initial permissions:
/var/lib/nethserver/node/install-core.sh: line 152: runagent: command not found

Even after rebooting, connection is refused when I try to reach the cluster-admin page. This is a fresh, clean install of Rocky 9.1, with only the hypervisor guest agents added.

Run uninstall.sh then try installation again.

[root@localhost ~]# echo $PATH
/root/.local/bin:/root/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin

…and that’s at least part of the problem; /usr/local/bin/ isn’t part of root’s path by default under Rocky (or, apparently, Oracle Linux). After adding it to $PATH, installation gets farther, but still fails:

Generating cluster password:
Write initial cluster environment state
Error: inspecting object: ghcr.io/nethserver/core:0.0.30: image not known
Generating api-server password:
Generating node password:
AUTH failed: WRONGPASS invalid username-password pair or user is disabled.
OK
OK
3
OK
OK
OK
OK
OK
OK
OK
OK
Start API server and core agents:
Created symlink /etc/systemd/system/multi-user.target.wants/api-server.service → /etc/systemd/system/api-server.service.
Created symlink /etc/systemd/system/default.target.wants/agent@cluster.service → /etc/systemd/system/agent@.service.
Created symlink /etc/systemd/system/default.target.wants/agent@node.service → /etc/systemd/system/agent@.service.
Grant initial permissions:
Install Traefik:
<7>podman-pull-missing ghcr.io/nethserver/traefik:0.0.6
Trying to pull ghcr.io/nethserver/traefik:0.0.6...
Getting image source signatures
Copying blob sha256:2a6e9e7ba98010c2bceee2a0b918972853452aef73cdfcf5d880323c2485051b
Copying config sha256:e59ad7a108fbfdbeeae12b8ef40a449ae5aa30442700b6a99159b1ff5900a97a
Writing manifest to image destination
Storing signatures
e59ad7a108fbfdbeeae12b8ef40a449ae5aa30442700b6a99159b1ff5900a97a
<7>extract-ui ghcr.io/nethserver/traefik:0.0.6
Extracting container filesystem ui to /var/lib/nethserver/cluster/ui/apps/traefik1
ui/index.html
39ab90c03c9f797bf9dfa5a0445028262f33b85626850b8456da740553c37463
Assertion failed
  File "/var/lib/nethserver/cluster/actions/add-module/50update", line 208, in <module>
    agent.assert_exp(create_module_result['exit_code'] == 0) # Ensure create-module is successful

OK, blew away that VM and installed a clean one (VMs are great for testing). Starting with a clean Rocky 9.1 installation, I installed updates, added /usr/local/bin to root’s path, and rebooted. Then I ran the install command, and it completed successfully. So the issue is that the installer expects root’s path to include /usr/local/bin, and it doesn’t by default in Rocky 9.1.

The same issue is present in Alma 9.1, and I suspect with Oracle Linux as well. Seems the installer needs to account for the actual default $PATH in these distros.
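For anyone hitting the same thing, here is a minimal sketch of the workaround described above: check whether /usr/local/bin is in the current shell's PATH and append it if not, before starting the installer. The helper names `path_contains` and `ensure_in_path` are made up for this example and are not part of NS8 or its installer.

```shell
# Hypothetical helpers, not part of the NS8 installer.

# Succeed if directory $1 is a component of the PATH-style string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print $2 with directory $1 appended, unless $1 is already present.
ensure_in_path() {
    if path_contains "$1" "$2"; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "${2:+$2:}$1"
    fi
}

# Make /usr/local/bin visible in this shell before running the installer.
PATH=$(ensure_in_path /usr/local/bin "$PATH")
export PATH
```

This only affects the current shell session. To make it persistent you would still have to add the directory in root's shell profile, as was done before the reboot above.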

Thank you for the report Dan!

I always run Digital Ocean’s customized Rocky Linux image for CI tests and have never hit this issue. There may be some differences from the official Rocky Linux image.

Did anybody experience the same problem? /cc @lucag @nrauso @stephdl

I confirm Rocky Linux 9.1 works fine with NS8, both on the Digital Ocean droplet and from the minimal install ISO. I have not tested Alma Linux.

Will try to test it again

Hmmm, I’d used the standard ISO, but selected “minimal install”. I wouldn’t think this makes a difference, but I guess it could.

Something else that might be relevant is that I’ve set up these installations with a non-root user, then used sudo -i to become root; root login is disabled, as seems to be the general recommendation for Linux these days. This ought to give a regular root environment, but I guess it could be a difference too.

Edit: well, isn’t that interesting, if not outright weird. I did a fresh installation of Rocky 9.1, but enabled root login this time. After logging in as root, /usr/local/bin is in the PATH. Strange. So I created a non-root user, and logged in as that user. /usr/local/bin is in that user’s path as well. sudo -i to become root, and now /usr/local/bin is not in the path (but /sbin is). Exited that shell (back to the regular user) and did sudo su -. Now /usr/local/bin is in the path, but /sbin is not. So the root user gets two different $PATHs, depending on how you become root. That can’t be right.
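To see exactly which directories differ between the two ways of becoming root, something like the following could help. It is only a sketch: `path_missing_from` is a name invented for this example, and the sudo lines are left commented out because they need sudo rights on the machine being checked.

```shell
# Hypothetical helper: print each directory of PATH-style string $1
# that does not appear in PATH-style string $2, one per line.
# The body runs in a subshell so the IFS and globbing changes do not leak.
path_missing_from() (
    IFS=:
    set -f
    for dir in $1; do
        case ":$2:" in
            *":$dir:"*) ;;
            *) printf '%s\n' "$dir" ;;
        esac
    done
)

# Usage sketch (run as the regular user):
#   path_i=$(sudo -i printenv PATH)
#   path_su=$(sudo su - -c 'printenv PATH')
#   path_missing_from "$path_su" "$path_i"   # only in the "sudo su -" PATH
#   path_missing_from "$path_i" "$path_su"   # only in the "sudo -i" PATH
```

Going by the behaviour described above, one would expect /usr/local/bin to show up in the first list and /sbin in the second, but I have not captured that output here.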

Do

ssh root@ipOrHostname

The same applies when you want to log in as the Linux user of one of your modules:

ssh mattermost1@localhost

…and have to manually configure sshd to allow root logins, because allowing them has been strongly discouraged by pretty much everyone for well over a decade. And there’s no real reason to allow them; sudo will certainly get the job done. The issue is, apparently, that sudo -i unexpectedly doesn’t give root’s login environment (even though the sudo man page says it should), while sudo su - does. Maybe the Rocky folks can explain that: `sudo -i` and `sudo su -` behave differently - General - Rocky Linux Forum

With NS8 I learnt that environment variables matter a lot.

So if the environment variables are not initialized properly, things fail.

I know that su mattermost1 !== ssh mattermost1@localhost

It is probably the same for root. With Digital Ocean you have no choice, because the first and only user on the system after the droplet is created is root.
