I ran into a frustrating issue while trying to install the goauthentik module on NethServer 8. The installation process froze at 67%, and I had no choice but to manually kill the add-module task. After that, the web interface became unstable — the Software Center wouldn’t load properly, and the system started logging persistent errors.
I tried removing the module manually, including cleaning up Redis keys and cached metadata, but I’m still getting repeated log entries like these:
It seems the agent is still trying to fetch attributes from two failed instances (goauthentik1 and goauthentik2), even though I’ve removed everything I could find — Redis keys, default_instance entries, roles, authorizations, and even scanned the entire filesystem for leftover references.
Has anyone faced something similar? Is there a definitive way to purge all traces of these failed module attempts so the Software Center stops trying to load them?
I’ve tried everything I could think of to clean up the failed installations, including:
Running cluster/list-modules and cluster/remove-module for both goauthentik1 and goauthentik2 — but they no longer appear in the module list.
Manually deleting Redis keys related to goauthentik1, goauthentik2, and goauthentik3 (sketched in the snippet after this list), including:
roles/module/*
cluster/authorizations/module/*
cluster/default_instance/*
node/*/default_instance/*
task/module/*
Removing cached metadata and catalog files under /var/lib/nethserver/cluster, /node, and /catalog.
Scanning the entire filesystem for any references to goauthentik1 or goauthentik2 and quarantining anything suspicious.
Restarting agent@cluster multiple times.
Even tried re-adding the module with add-module and removing it again with remove-module, hoping to reset the state — but the errors persist.
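For reference, the Redis part of that cleanup looked roughly like this (a sketch from my notes, not an authoritative recipe; review what each --scan pattern matches before deleting anything):

# Sketch of the manual Redis cleanup, run from the leader node.
# Assumes redis-cli is already authenticated against the cluster Redis.
# The patterns mirror the key families listed above; adjust as needed.
for pattern in \
    'roles/module/goauthentik[123]' \
    'cluster/authorizations/module/goauthentik[123]*' \
    'cluster/default_instance/*goauthentik*' \
    'node/*/default_instance/*goauthentik*' \
    'task/module/goauthentik[123]*'
do
    redis-cli --scan --pattern "$pattern" | xargs -r redis-cli del
done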
Despite all this, the agent still tries to fetch IMAGE_URL for goauthentik1 and goauthentik2, and fails. Something is clearly inconsistent or stuck in the system state, but I can’t find where.
Has anyone faced this kind of issue before? Is there a deeper cleanup mechanism or a way to reset the cluster state entirely for these modules?
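For anyone who wants to dig into the same state, here is roughly how I checked what was left (a sketch; that the Software Center reads IMAGE_URL from a module/<id>/environment hash is my assumption about where NS8 keeps module settings):

# List every Redis key that still mentions the dead instances.
redis-cli --scan --pattern '*goauthentik*'
# Assumption: IMAGE_URL lives in the module's environment hash;
# if this returns a value, the instance is not fully gone.
redis-cli hget module/goauthentik1/environment IMAGE_URL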
Are you blocked from making multiple installation attempts for images from Docker Hub? Maybe try signing into Docker Hub on the machine and retrying.
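For example (assuming the node pulls images with podman, as a stock NS8 install does):

# Assumption: authenticating to Docker Hub lifts anonymous pull rate
# limits; log in on the node, then retry the module installation.
podman login docker.io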
And just to confirm, this is the version you’re installing: Release 2.0.1 · geniusdynamics/ns8-goauthentik
Before I try to install it again, I need to remove whatever is causing the log messages. Do you know how I can manually remove the installation so that the unwanted goauthentik1 and goauthentik2 log entries disappear permanently? Keep in mind that running commands like remove-module won’t help; I already tried that and it failed with an error about the NODE_ID, so I had to manually delete the /home/goauthentik1 folder.
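For reference, a fuller manual teardown than just deleting the folder would look roughly like this (a sketch, assuming the half-installed instance was created as a dedicated Unix user with systemd lingering enabled, which is how NS8 runs rootless modules; check that the user actually exists first):

# Sketch only: tear down a half-created rootless module user.
loginctl disable-linger goauthentik1   # stop its user manager from respawning services
userdel -r goauthentik1                # remove the user together with /home/goauthentik1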
It’s easy to replicate the problem. Just run an add-module command like the one sketched below:
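Something along these lines, assuming the stock NS8 add-module task and the image from this thread (the node number is an example):

# Assumption: standard NS8 add-module call for this image on node 1.
api-cli run add-module --data '{"image":"ghcr.io/geniusdynamics/goauthentik:latest","node":1}'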
When the installation reaches 67%, kill the process with CTRL+C. I did this twice. What’s not easy is reverting the installation after manually deleting /home/goauthentik1 and /home/goauthentik2. I managed to remove goauthentik3 with the remove-module command, but many errors about installation attempts 1 and 2 still show up in the log.
I couldn’t reproduce this. After starting the installation of ghcr.io/geniusdynamics/goauthentik:latest, I interrupted the process at 67% with CTRL+C, but it still finished because the task was already running in the UI. After some time the Authentik login page was working; at least I could create an admin and use the admin page.
Maybe it depends on when one interrupts with CTRL+C…
Is it possible to reinstall, forcing the instance name goauthentik1, so that I can remove it with remove-module afterwards? Then I would do the same with goauthentik2.
api-cli run update-module --data '{"module_url":"ghcr.io/geniusdynamics/goauthentik:latest","instances":["goauthentik1"],"force":true}'
Warning: using user "cluster" credentials from the environment
Traceback (most recent call last):
File "/var/lib/nethserver/cluster/actions/update-module/50update", line 40, in <module>
ping_errors = agent.tasks.runp_brief([{"agent_id": f"module/{mid}", "action": "list-actions"} for mid in instances],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/agent/pypkg/agent/tasks/run.py", line 61, in runp_brief
results = asyncio.run(_runp(tasks, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/usr/local/agent/pypkg/agent/tasks/run.py", line 120, in _runp
return await asyncio.gather(*runners, return_exceptions=(len(tasks) > 1))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/agent/pypkg/agent/tasks/run.py", line 127, in _run_with_protocol
return await run_redisclient(taskrq, **pconn)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/agent/pypkg/agent/tasks/redisclient.py", line 77, in run_redisclient
await _task_submission_check_client_idle(rdb, taskrq, kwargs['check_idle_time'])
File "/usr/local/agent/pypkg/agent/tasks/redisclient.py", line 41, in _task_submission_check_client_idle
raise TaskSubmissionCheckFailed(f"Client \"{taskrq['agent_id']}\" was not found")
agent.tasks.exceptions.TaskSubmissionCheckFailed: Client "module/goauthentik1" was not found
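The check at the bottom of the traceback looks up a live Redis client registered under the module's agent ID, so update-module refuses to proceed because no agent for goauthentik1 is connected. A quick way to see which module agents are currently registered (a sketch; that agents advertise themselves via the Redis client name is an inference from the error message):

# List connected Redis clients and keep only their names;
# a missing "module/goauthentik1" entry matches the failure above.
redis-cli client list | grep -o 'name=[^ ]*' | grep 'module/'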