Questions and ideas about the NethServer8 upgrade - the show has to go on

Dear NethServer community, dear developers

In preparation for the NethServer 8 upgrade, I have spent months looking into the many aspects of the new system that could become obstacles. NethServer 8 has developed excellently - no, you have developed it excellently - so that little now stands in the way of an upgrade to NethServer 8.

Unfortunately, one hurdle remains for me: the reliable integration of purpose-optimized data storage. I would like to discuss this again in order to find a viable way forward for the migration or for continued operation.

I am asking for help in understanding various behaviors of NethServer 8 - to develop solutions for myself and possibly also as input for the further development of NethServer 8.

Why purpose-optimized data storage? This is not only about the sheer size of the respective storage, which could also be addressed by choosing a growable file system. Rather, it is about matching the properties of the storage media to their use: hard disks (or even tape drives) for archives or rarely used data, SSDs for emails and files, and special SSDs (or even RAM) for databases.

This - in my eyes not so exotic - requirement was previously met in NethServer 7. You could integrate your own data storage at almost any path, often already during installation, and even afterwards everything remained freely adjustable. Once set up, this worked reliably and transparently; at most you had to think about the type of mount, if the presence of “lost+found” irritated some services.

So why do I expect difficulties in NethServer 8?

Reliable mounts need reliable (predictable) paths: there is no mount without a target path, and no reliable mount without a reliable target path. And here the sleepless nights begin, because in NethServer 8 the user (container) structure is not determined by the admin but solely by NethServer 8 - for which there are obviously good reasons.

What at first looked to me like an accidentally created dead end in development turned out to be a restriction that probably lies in the user logic of container operation itself. Each container conveniently works in its own user context (including the associated permissions), which is why placing it in the perfectly ordinary /home structure seems only natural. So far, no problem: even within “/home” everything can be mounted as freely and as deep as you like. You only have to know reliably what will be created in “/home”, and under which names (user names, IDs).

However, NethServer 8 manages this itself - and very strictly - and its way of working is so far only predictable to me to a very limited extent, and unfortunately not controllable at all.

The reason for this strict management seems to be that over the entire cluster lifetime there must never be a situation in which a user name (or user ID) suddenly occurs more than once. This must be reliably ruled out, even when containers are moved between nodes or restored from a backup.

Accordingly, NethServer 8 itself decides which (new) user name/ID it assigns to a container. And apparently it keeps a database or list or similar in which it records already used or “burned” IDs, and it possibly also uses other methods (path checks?) by which it may skip to the next iteration of the ID. Have I understood this correctly so far?

The behavior I have observed seems to indicate that IDs, once assigned, remain assigned. This would at least mean that after the assignment (and the creation of the folders) you could still set up your mounts afterwards.

For this I have the following considerations/questions:

Normal operation:
How much can you rely on an ID (and thus its path), once assigned, staying the same?
Is it to be expected that an assigned ID (and thus its path) will change unexpectedly, without any changes to the configuration?

If the ID no longer changes in “normal operation”, this would at least give me a limited way forward.

But of course special cases would still have to be considered:
Under what circumstances is it even conceivable that an assigned ID (and thus its path) would ever change?
Under what conditions is an ID actually “skipped” (during creation, restore or move)?

Container updates:
What happens technically during a container update?
The ID probably remains, but do paths within the container change (possibly temporarily)?
Which (storage) paths remain reliably stable across updates, and which may be temporarily renamed or moved?
An unfortunately chosen mount path might otherwise break the update process (or vice versa).

Living without container updates would of course not be an option.

Restore:
What happens to the ID when a container is restored from a backup?
Is a new ID assigned, or does the existing ID continue to be used?
How should one picture the restore process?
Is the existing user path (my mount) reused directly, so that the mount keeps working during the restore?
Or does the restore first delete the corresponding user folder (and/or user) and then recreate it?
In that case the mount would be broken and all recovered files would end up on the wrong storage!

I could live without the integrated restore, but it would not be nice.

Moving within the cluster (for me this would be dispensable):
What happens to the ID when a container is moved within the cluster?

The special case “move” seems problematic to me with regard to mounts:
The container being moved does not yet exist on the new target node, so no mounts can have been prepared for it there. The moved container would initially land on the main drive, or at best on a drive mounted at “/home”.

Of course, the data could afterwards be moved again to suitable storage.
The “default storage” would then at least have to be large enough to hold the container data temporarily,
which, depending on the container size, it is not or may not be.

Upgrade from NethServer 7 to NethServer 8:
The above also creates another problem for me when upgrading.

During the migration process, NethServer 8 creates the new containers (IDs and paths) and then copies the data.
All migrated data would therefore initially land on the wrong (too small) drive, so the migration fails.
(I have not tried this yet because of other fundamental questions.)

One suggestion in the forum was to first remove parts of the data from the source path manually and copy them back manually later.
That would be enough for the pure upgrade.

But as described, doesn’t the problem exist not only when upgrading but also when moving containers later within the cluster?
So why not find a solution for this in principle?

My initial idea was simply to create the mounts and paths in advance, before an update, move or restore is started.
For that, however:

  1. The IDs to be assigned would have to be reliably predictable - which I cannot do.

  2. The self-prepared path must not cause the predicted ID to be skipped.

  3. Point 1 could be solved if necessary by displaying “Your next ID for service XYZ”.

  4. Point 2 would mean adjusting the checking procedure - which presumably exists for good reasons.

Instead, I would have the following idea, which may be easy to implement:

The processes “container creation” and “data migration” should be separated in such a way that they can (if necessary) be paused in between.
(Hopefully there is a suitable moment in the scripts.)

After “container creation”, and with it the ID/path, the admin would then have the opportunity to integrate their own drives.
Then the “data migration” process could continue.

The same could be done for a container move:
first the “container creation”, then a pause for integrating the drives, then the “data move”.

Should a restore also require deleting existing paths or creating new IDs/paths,
a corresponding “pause” in the restore process would likewise be a way to reach the goal.

I could imagine that such a “pause option” would be a sensible extension for NethServer 8 with regard to the scenarios I have described.

It would be fantastic to be able to create the mounts directly in the GUI (similar to RAID and NFS in NethServer 7),
but I don’t want to overdo it …

In the meantime, it would be helpful for me to get competent answers to the above questions,
or even a hint as to where I could insert a pause in the “migration script”.

I would very much like to move to NethServer 8 and would be happy to receive feedback.

Greetings Yummiweb


A new user ID and path are created for every app instance action: installation (successful or not), clone, restore, or move.

For a more detailed answer, I think the dev team is the one who can provide it.


Thank you, Yummiweb, for your time and efforts in preparing for the NS7 to NS8 migration. The way NS8 handles additional disks is still under development, and we plan to work on it in the first quarter of 2025 (see the announcement in NethServer project milestone 8.2).

One approach we’re considering is to allow admins to manage additional storage by following these steps:

  1. mount a disk, permanently, on some directory. E.g. /mnt/disk1
  2. tell NS8 to use it, e.g. by setting some node’s environment variable NODE_ADDITIONAL_DISK_MOUNT=/mnt/disk1, along with some additional metadata about disk type, speed, size, etc. The NS8 node will initialize the disk with proper subdirectories and permissions automatically (a rough sketch of steps 1 and 2 follows after this list).
  3. Let NS8 applications decide if they want to use the additional disk or not. App developers define the storage needs for their applications, while NS8 identifies available storage options. Together, they can match the best fit for storage requirements.
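
As an illustration only, here is a minimal sketch of what steps 1 and 2 could look like on a node. The device /dev/sdb1, the ext4 filesystem and the metadata line are pure assumptions, and NODE_ADDITIONAL_DISK_MOUNT is just the proposed name from the example above, not an existing feature:

    # Step 1: mount the disk permanently on /mnt/disk1 (example device and filesystem)
    mkdir -p /mnt/disk1
    echo '/dev/sdb1  /mnt/disk1  ext4  defaults  0 2' >> /etc/fstab
    mount /mnt/disk1

    # Step 2 (hypothetical, not implemented): declare the disk to NS8,
    # e.g. via a node environment variable plus some metadata about the disk
    # NODE_ADDITIONAL_DISK_MOUNT=/mnt/disk1
    # NODE_ADDITIONAL_DISK_TYPE=hdd    (metadata key is purely illustrative)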

This approach does not aim to enforce predictability of directory layouts, as similar issues with file ownership (e.g., unpredictable uid/gid numbers and subuid/subgid ranges) would still arise.

A similar issue with uid/gid numbers exists also in NS7, and that’s one reason why we abandoned the development of Hotsync in NS7. In NS8, the container-based architecture solves it with strong app isolation, but at the cost of greater complexity.

The mount path for a volume should be permanent; no node events should change it unexpectedly.

However we are still missing a common approach to configure it. For example, this applies to configurations like MinIO and local node backups. Defining a common behavior for the storage management is another goal of the future milestone.

Regarding container updates: the new app image simply replaces the old one. Volume paths are not changed, and neither is the module name (or “ID”, to use your term).

The restore procedure creates a new module instance, with a different ID than the one that ran the backup. Data is restored into the volumes of the new module.

Moving within the cluster is similar to a restore. A new module instance is created, and the data is copied into it. The copy occurs between rsync processes that run inside the containers, so filesystem ownership is mapped correctly.

With the approach described above, volume data would land on the “best disk” according to what is available on the node and the app requirements.

Note that mounting /home/appX entirely on a separate disk can lead to problems with Podman, as discussed in other topics. With the approach above, only bind mounts of some volumes are involved instead.

While available space checks are important, they are separate from the approach described above, though they can coexist effectively.

For experiments with the NS7 migration and volume path assignment, you can remove and re-create Podman volumes before each “Sync” call, with a bind-mount to an arbitrary host path. A working example of a podman volume create --opt=device=... invocation is documented here: Backup and restore — NS8 documentation.
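
As a rough sketch only (the volume name “homes” and the host path /mnt/disk1/samba1-homes are made-up examples, and the commands must run in the app’s rootless user context), such a re-creation could look like this:

    # make sure the target directory on the additional disk exists and is
    # writable by the app's unix user (example path)
    mkdir -p /mnt/disk1/samba1-homes

    # remove the existing volume and re-create it as a bind mount to the host path
    podman volume rm homes
    podman volume create \
      --opt type=none \
      --opt o=bind \
      --opt device=/mnt/disk1/samba1-homes \
      homes

After the next “Sync”, the rsync should then fill the bind-mounted host path instead of the default volume location.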

Once a volume is created with the correct options it is not modified by the application for the whole app instance lifetime.


In summary, an automatic volume management approach in NS8 would streamline system administration, reducing the need for manual configuration and extensive knowledge of application storage specifics. Configuring each volume storage path manually is difficult and error prone. It requires a deep knowledge of the application implementation. Those details should be known by the app developer, not necessarily by the system administrator. And I’m seeking a solution that simplifies the sysadmin’s life.


Thank you Davidep for your extensive answer and the explanations.

I would like to summarize, to check whether I have understood it correctly:

  1. When you talk about a “volume” or volume path, you probably mean the volumes in the container, for example “/home/samba1/.local/share/containers/storage/volumes/homes/”, right?

  2. As for the possible mount options:

2.a.
Would a direct mount at “/home/” (the usual Linux way) be no problem, or should this also be a bind mount?

2.b.
A direct mount at e.g. “/home/container1/” is not recommended,
but a bind mount at “/home/container1/” or “/home/container1/.local/share/containers/storage/volumes/” would be OK?

  3. Dealing with mounts for NEW installations:

3.a. - Mount to “/home/”
No problem; the mount can be set up as usual during system installation or before the NS8 installation.

3.b. - Mount to “/home/container1” (or deeper)
Here you would first create the respective “service” (container), stop it, move/copy the container (or volume) data accordingly, start the container again, and only then start using it.

  4. Dealing with mounts for the NethServer 7 > NS8 data migration:

4.a. - Mount to “/home/”
No problem; the mount can be set up as usual during system installation or before the NS8 installation.

4.b. - Mount to “/home/container1” (or deeper)
For this, a “pause” between container installation and data migration would be needed, in order to set up the special mount.

I am not sure I understood this part of your answer correctly:

“For experiments with the NS7 migration and volume path assignment, you can remove and re-create Podman volumes before each ‘Sync’ call, with a bind-mount to an arbitrary host path.”

Is there already a pause between setup and data transfer that could be used for one’s own mounts? I do not yet know the migration process.

I cannot find anything about this under your link:
https://docs.nethserver.org/projects/ns8/en/latest/backup.html#local-storage

Or where could I pause the procedure?

  5. Dealing with restores in the case of own mounts:

5.a. - Mount to “/home/”
A restore would be unproblematic because everything takes place within the “/home” base path anyway.

5.b. - Mount to “/home/container1” (or deeper)
In principle, a restore onto a desired drive is ruled out, because the new target path of the restore cannot be mounted in advance as a precaution?

It would be solvable if a pause between container creation and restore were available here as well.

  6. Moving to another node:

6.a. - Mount to “/home/”
Moving is unproblematic if “/home” on the target node has already been mounted appropriately.

6.b. - Mount to “/home/container1” (or deeper)
In principle, a move is not possible, because the new target path of the move cannot be mounted in advance as a precaution?

It would be solvable if a pause between container creation and the data move were available here as well.

Conclusion:
Basically, all the situations described (create, move, restore) need a separation (pause) between creating the container environment and any movement of the container’s data.

With such a “pause” in the respective container management, NS8 would already be fully usable with one’s own mounts.

Your approach:
Thank you for already thinking about the problem of different data storage and for having sketched solutions here:

  1. Integration of your own storage devices, with their respective qualifications (type, speed, size, …) recorded.

  2. NethServer itself decides (depending on the app specification) which drive is used.

The approach sounds very convenient for inexperienced users.
But are they even part of the target group that would (or must) use storage of different qualities?

For admins with special requirements like me, the approach sounds rather horrifying. On top of the existing areas where NS8 already acts on its own authority (ID management), even more is to be added? NS8 is to decide on the use of data storage by itself? What would be next?
Brrrr …

I think the idea of being able to register your own storage devices with appropriate qualifications is very good! However, the admin should ALWAYS be allowed to decide on their actual use (during setup, move and restore).

NS8 could make suggestions based on its guidelines, but the admin should ALWAYS be able to override them.
In terms of development effort, integrating this “administrative freedom” should make hardly any difference.

It would probably be much easier to retrofit a “pause” into the respective processes for setup, move and restore.

At least for me, this would be the minimum “desired option” that would enable optimal use of NS8.

Your approach sounds more complex, and not so bad at all, provided the admin always has the last word.

But my own requirements actually go further:
For example, I have (so far) used different storage devices for different file shares. Why should a file archive also sit on an SSD?

Certain IMAP accounts or mail folders are also located on separate storage devices, e.g. the general “MailCopy” or individual personal mail archives.

So far NS8 does not allow any of this, and your approach does not cover it either. For automatic processes and 99% of cases, it would be far too specific.

The (desired) pause I described would at least make all of this possible by hand.

Greetings Yummiweb

I’ll address some of your questions here, and I think we largely agree on the problem definition, even if we’re considering different solutions.

Yes, that’s likely the default Podman path for rootless container volumes. I recommend avoiding direct writes into these paths, as this can lead to file ownership issues.

That setup is fine, as long as you’re not using an unsupported filesystem like NFS. If you decide to mount /home on a separate disk, choose a local SSD, as recommended for the root filesystem.
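
For example, a conventional fstab entry could take care of this before the NS8 installation; the device name and filesystem below are placeholders to adjust to your system:

    # /etc/fstab - example only: a dedicated local SSD partition mounted on /home
    /dev/nvme0n1p3   /home   xfs   defaults   0 2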

Directories cannot be pre-created; the core creates them as needed during the node/add-module action. If an existing directory is found, this action may fail.

This approach may be difficult to manage and support. If you’re referring to mounting after module creation, the feasibility depends on the volume’s purpose and the filesystem used.

Yes, in the NS7 migration tool, the Sync button serves as a breakpoint. At this point, you could remove and re-create any app volume during migration, as pressing it triggers a full rsync of the volume contents. While I’m not 100% sure that this method works in every situation, it is worth testing.

Right, the previous link is not clear. This one should be more appropriate for rootless containers: NS8 Add storage path setting like in minio to other apps - #10 by davidep

In restore and move procedures, selecting the “right drive” is possible if it’s managed by the core, as mentioned above. Doing it manually sounds error-prone.

Keeping solution costs low is likely beneficial for all users. For example: large slow affordable disk for Samba shared folders, small fast expensive disk for the root filesystem.

Yes, I believe this aligns with our overall project goals. For advanced configurations, we can provide guidance for sysadmins to override the default behavior when necessary. But sysadmin freedom is not a goal in itself. We must consider that both app developers and support teams are “system users” as well.