New app instances with every failed app installation

Dear developers,

I noticed that NETH8 creates a new instance directory for an app every time it merely tries(!) to install it, and this directory remains even if the installation fails. With every further installation attempt another instance directory is created, and so on. Currently I’m seeing this with ONLYOFFICE.

I understand (a little) why every installation checks whether a corresponding folder (and a database entry?) already exists and then uses a new location instead - after all, the old space could be reserved for a restore from a backup, a move or whatever (well, for what, actually?).

But NETH8 KNOWS when an app installation has just failed, so why doesn’t the installer then delete all the remnants (the instance folder) and the entries in the database (or wherever they are)?

If you didn’t create a snapshot before each step, the node would now look like it was older than Nethserver 7. That’s not nice.

Regards Yummiweb

PS.

I’m sorry, you put a lot of energy and effort into the project, which is really admirable - and here I am just complaining. But the more you delve into NETH8, the more such things become apparent.


It’s not a bug, it’s by design, see also Applications — NS8 documentation

This way there won’t ever be an instance name collision between apps that could live on different nodes.

It’s safer to keep the data and have the option to debug the error.
It’s always possible to remove apps, including their folders, from the UI and the CLI.
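For reference, removal from the CLI looks roughly like this. This is only a sketch from memory: the `remove-module` action name, its parameters and the instance ID `onlyoffice2` are assumptions on my part, so please check the Applications page of the documentation for the exact call.

```
# Remove a (failed) instance including its data; run on the leader node.
# "onlyoffice2" is a placeholder for the instance ID shown in Software Center.
api-cli run remove-module --data '{"module_id": "onlyoffice2", "preserve_data": false}'
```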

I think it’s just a kind of human numbering bias. Technically there’s no disadvantage when using a name like instance77.

You could define names for the instances, from the documentation:

It is possible to set a more meaningful name for the instance from the Software center page.

The app drawer then shows the nice name.


An addendum, to explain why I find the behavior so unpleasant:

By consistently introducing container technology in Nethserver 8, the developers have created the wonderful situation that an installation no longer scatters its files across the system, and an uninstallation (in principle) removes this data completely. A wonderful playground for trying things out, for example to see whether users accept them. And if not, the container is simply removed again. The system underneath always stays clean and tidy, and you never lose the overview. THAT is (in my opinion) the big advantage of the rather complex “Orchestrator” system. A dream for every admin.

And what does Nethserver 8 do? It destroys exactly this advantage. I understand why it is practical to have the container storage in the /home directory, but WHY the behavior is like this will probably remain a mystery to me forever.

"I think it’s just a kind of human numbering bias. "

It may be that some cheeky Aspi brain cells are fooling me here.

But don’t you think it’s a mistake if dysfunctional app instances aren’t removed by the installer straight away? At this point there CAN’T be any payload data in them that is worth keeping, because there is no “installing over” them.

And yes, you can remove everything manually - if it’s safe and you know about it. But the installer doesn’t even report that there are remnants left that could be deleted. The user only sees that the app installation failed. He doesn’t see whether there was anything left behind that he could or should delete. For example, a message could be displayed about the ID of the “dysfunctional” instance.

The user will then probably try again. And again. And maybe again. At some point he will have a functioning instance that he can see in the manager. He sees this instance and any previous instances. And if he ever needs to access the instance folders under /home, he no longer recognizes which one is functional and which is not. Especially if he has renamed the instances.

Nethserver 8, on the other hand, apparently “keeps a precise record” of its instances - of successful and failed installations, moved instances, restored instances, duplicated instances and deleted instances. Right? Otherwise it could not know which instance ID to continue with, i.e. which ones to skip even though their instance folders are no longer there. It already knows BEFORE the next SERVICE-XYZ instance is installed what number it will get. This could even be displayed during the installation, but it would probably just be the display of a variable.

This bookkeeping only seems to be a secret to the user, because he doesn’t see any of it. The information is there, but is it easy to see somewhere, or would you have to dig through tons of log files?

NS8 doesn’t know if an app is dysfunctional. It just knows that there was an error.

EDIT:

App data deletion should only happen when it is really wanted, like on uninstallation.
It’s dangerous because a single wrong automatic “dysfunctional” check would lead to data loss.

By dysfunctional I mean that the Nethserver itself does not consider the installation to be complete and therefore does not display it as an “instance” in the administration. Neither for deletion nor in any other way.

This means that the error codes in the installation were evaluated to such an extent that proper integration was aborted.

That is what I meant by “the Nethserver knows about it”. And you think it doesn’t know about it?

Did you refresh the browser? I think the app drawer doesn’t refresh when an app install fails.
Usually, when the app is installed and there is a configuration error, the app should still be shown in the Software Center.
Do you have an example of an app where the installation fails and it can’t be deleted?

In this case I unfortunately didn’t save any examples, because I quickly realized what the problem was (the firewall). The instance couldn’t be seen at all because the package download didn’t even work. Only the Nethserver had created the framework for the instance and “booked” it. That means the instance folder would have been practically empty. That’s why I’m surprised that an installation process that failed so early on leaves such residues behind.

@mrmarkuz

Is there any max number the Apps can use? (256? 4096? or higher?)…

AFAIK, there is such a ceiling when dealing with nodes.

The following is valid:
A node will NEVER reuse a node IP (from the cluster VPN network).
That means there are only 253 node numbers (IPs) in total that can be used.

During testing, with several nodes, “burning through” a few node IPs is fast…
What happens when no new nodes can be created?

Any restore will retain that number, so no new nodes can be created.
So how can any data be saved and restored?

Some questions I haven’t seen answered yet…

My 2 cents
Andy

And again we are at the point where you are no longer in control of the matter.

Instance numbering is one thing and here I am no longer concerned with freedom of choice (which is obviously overrated) but rather with transparency and traceability - better still: with absolute predictability.

And what I am concerned with is that FOR ONE AND THE SAME INSTANCE (even when it comes from a backup!) it must be possible(!) to retain the instance ID - to fix it, so to speak.

And in this case it is about the fact that the “burning” of an ID apparently already takes place at a stage where it is not even necessary.

And why should it not be possible to make adjustments to the “bookkeeping” yourself? Is this a side effect of the new Italian accounting regulations? (no offense intended)

But what you are describing is even more extreme, because the cluster address space really is limited. I do not understand such a “REservation” at all. Either the node is permanently gone or it is not. And when a node leaves, you could ask whether the removal is temporary or permanent.

And it would basically be no different with the apps…

The Apps as such are not limited by the Node VPN IP…

But I’m also not happy with the lack of information about this “caveat”…

It should be possible to purge or reset the current number somewhere.

My 2 cents
Andy

I can’t find a limit. From https://www.w3schools.com/python/python_numbers.asp:

Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.

The maximum allowed size of a key in Redis is 512 MB

I think it could still be done by reconfiguring the database and the WireGuard configs manually.
I think you never need to restore more than 250 nodes in a cluster… hopefully NS9 starts with 1 again :laughing:
BTW, if you set another netmask for the VPN IP, a lot more nodes would be possible.
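Just to put numbers on the netmask remark, a quick back-of-the-envelope calculation (the /24 default is an assumption on my part - check what your cluster actually uses, and NS8 may reserve a few of these addresses itself):

```
# Usable addresses per prefix length (minus network and broadcast address)
for prefix in 24 23 20 16; do
  echo "/${prefix}: $(( 2 ** (32 - prefix) - 2 )) usable addresses"
done
# /24: 254, /23: 510, /20: 4094, /16: 65534
```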


Yes, but no one informed most of us about this limit until it already applied. By then the cluster was active, and that means WITH the cluster VPN already set!

Think about the numbers, say with 4 nodes in a cluster. People playing about with apps and dedicated nodes will easily burn 64 within a year… in 2 years IPs will become scarce.
I’d give this scheme a maximum of 4 years, then the node numbers will run out.

This somehow reminds me of what Franz Josef Strauss (long-time Minister President of Bavaria in Germany) once said to a journalist…
During an interview, he asked the journalist a question:
“What would happen if one were to introduce socialism in the Sahara Desert?”
The journalist: “I don’t know…”
Franz Josef Strauss’s answer: “For about 10 years you would not notice any difference. Then the sand starts to run out!”

One of his best remarks!

My 2 cents
Andy

THAT was from Franz Josef? I didn’t know that. Well, you were closer and there was no “iron curtain”.

I think it could still be done by reconfiguring database and wireguard configs manually.

Is there a database that you can look into and edit? Where is it? How do I get to it? Where do I have to sign?

Burn 64 nodes a year? I don’t think so.

  1. One shouldn’t play with apps and nodes on a production system (as I do)
  2. You don’t burn the system anymore by installing bad apps in NS8, you can just remove them. That was an issue with NS7.
  3. When using virtualization, restoring a VM doesn’t need a new node.

Yes, the redis database, see also Database | NS8 dev manual
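If you just want to look (read-only), something like this works with standard Redis commands. The key patterns below are only guesses to illustrate the idea, and depending on the setup you may need to authenticate or run redis-cli from inside the Redis container, as described in the dev manual:

```
# Browse what the cluster keeps in Redis (KEYS, HGETALL are plain Redis commands)
redis-cli KEYS 'cluster/*' | sort | head -n 20
redis-cli KEYS 'module/*'  | sort | head -n 20
# Inspect one entry in detail (hypothetical key name):
redis-cli HGETALL module/onlyoffice2/environment
```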

Well, how about a Home Server “Cluster”… People play around, even if they have no idea…

Agreed. But noobies who have no clue?

AND:

How many of these noobies will be running on Cloud or native installs…
Oh, I need to reinstall this node…

The more nodes there are in the cluster (especially easy with VMs), the faster the numbers get burned!

I think you are already aware of the support questions now looming…

:slight_smile:

Even now, how many questions do we need to answer, just because people do not understand IPs or DNS…

My 2 cents
Andy

However, resorting to a VM backup or a snapshot is only a fallback and cannot be assumed as standard. Otherwise you could also save yourself the trouble of app backups.

Yes, it SHOULD be the case that the container host is no longer “contaminated” by installation residues etc.

In fact, however, there are residues in the “accounting” that the user can neither see nor clean up.

Unless there is a tool to look into this accounting, or even edit it. In the good old days of Linux, that was the only acceptable way: everything stays under the control of the admin.

Am I too old?

I didn’t say it should replace your backup, but I think many users here use virtualization, and in that case the node numbers don’t increase.

I really like the Proxmox Backup Server, it’s more than just a fallback. I’m also using the NS8 backup in case I need to restore a single app. I also use a VPS in my cluster which isn’t Proxmox-based…

You can edit the Redis database using redis-cli; there is an example here: VPN | NS8 dev manual
The Redis commands are in their docs: Commands | Docs
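As a pattern only - the key name and value below are pure placeholders, not real NS8 keys; take the actual key from the VPN page of the dev manual, authenticate as it describes, and back up the database before changing anything:

```
# Read the current value, then set a new one (standard Redis GET/SET);
# replace cluster/example_counter with the real key from the dev manual.
redis-cli GET cluster/example_counter
redis-cli SET cluster/example_counter 42
```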
