NS8 Add storage path setting like in minio to other apps

jaywalker · October 6, 2023, 9:01pm

The minio app provides the advanced setting of a “storage path”, where any directory on the nethserver itself can be given to hold the minio data. This feature is great!

Can we please get this capability for other apps that store larger amounts of data as well, like

nextcloud (set the datadirectory)
Mail (set the location of the mail store)

and probably others?

Thank you!

pike · October 7, 2023, 7:47am

Well… may I be puzzled about this?
Using local hardware storage for backup is understandable. Having multiple devices for storing data outside the container also makes sense from a system-design point of view (bigger and cheaper storage for space-hungry applications), but on the other hand…

Can you afford the loss of that data if the storage fails?
Are you going to use a RAID solution for your MinIO endpoint?
When you move the container from your current host to the next one… what will the migration path be?

The paradigm shift from one server with services to a container orchestrator increases the need for system design and system-level tasks for the end user (sysadmin)… so why are multiple container locations not already part of your current setup?
One for speed intensive containers.
One for space intensive containers.

This might be a better feature, IMVHO.

jaywalker · October 7, 2023, 8:52am

That is exactly what I want to accomplish. As this storage can be mounted remotely from a storage server, I fail to understand your concerns regarding backup (raid storage server with snapshots, backup etc. can be used) or migration of the container to another host (just mount the storage on the other host). Why wouldn’t that work?

Still, I ask for this as an option, so no one would be forced to use it. But in certain scenarios (a storage server is already there, for example), I think the feature could be beneficial.

davidep · October 7, 2023, 12:53pm

The requested feature is appealing, but as Pike says, it’s also rather puzzling!

No, we can’t because the documented procedure is not compatible with the backup. The manual says:

Please note that the above path won’t be part of the backup.

This can be fine for a backup destination module like MinIO, but it is generally not acceptable.

Podman provides configuration options for storage paths: maybe they can be set to point to a network drive, as we already discussed here:

jaywalker · October 7, 2023, 6:57pm

Ok, what I want to achieve w.r.t the user data (e.g. many TBs of data in nextcloud) is the following backup concept:

one copy of the data on a machine A, including previous versions (e.g. snapshots). Might contain the live data.
another copy on a machine B, also including previous versions. This copy is a pure backup and should not contain any live data
an offsite backup of important files/directories

With my current setup, I do the following:

live data is on a NAS (machine A, e.g. Truenas), mounted via NFS to the server running my services such as nextcloud. It is automatically snapshotted on the NAS, so I have a version history.
the complete data is mirrored on another NAS, machine B, including the snapshots.
machine B does an offsite backup of crucial contents.

Storage requirements are thus 2x the user data (1x on machine A, 1x on machine B) plus the demands for the snapshots.

With NS8, I need to do the following:

Set up a nethserver machine/VM with enough storage to hold all user data in /home.
do a full backup of NS to another machine (machine A). This backup can hold previous versions of the data.
do a copy of this backup to a machine B.

As a consequence, I need to provide storage capacity for 3x the user data (1x for the nethserver, 1x for machine A, 1x for machine B), plus the demands for the snapshots.

Additionally, restore seems much more complex, since in my current setup, if e.g. a user accidentally deletes a file in nextcloud, I can directly restore it from the zfs snapshots on machine A. In case of the nethserver solution, a full restore of the S3 backup seems to be required in order to get access to the single file. In case of several TB of data, this is not an efficient process (and requires storage space for yet another full copy of the data…).

Summarized, I see that with NS8

storage demands increase (3x capacity for all user data needed, compared to 2x if live data is directly on e.g. Truenas)
usability gets worse

Maybe I just did not really understand your concept, but to me it looks like the backup/data-handling concept in NS8 comes with a lot of drawbacks and added complexity. Can you advise how this is meant to work in NS8?

Thank you!

PS: I am aware that I did not talk about config/state data of the services like e.g. the nextcloud db. I left it out, because a) this data is rather small and I do not care so much about additional copies, and b) I want to understand the concept of NS8 first before going into too much detail.

davidep · October 9, 2023, 10:52am

I think a copy to machine B is not strictly required, as you already have a local backup copy in machine A and an off-site backup copy. Restic repositories provide a configurable number of snapshots. It does not look so different from your current setup and does not need a remote filesystem.

I think comparing ZFS with backup software is not really fair.

Yes, we are missing a tool to do a selective restore with ease. We know the Restic engine can list backup contents and do a selective restore. However, in practice this kind of feature is difficult to handle: I must know what I want to restore and where it was stored at file system level. This requires knowing how the application organizes its data in the filesystem, so it cannot easily solve the general problem.
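As a rough illustration of what a manual selective restore with the plain Restic CLI involves, here is a minimal sketch. The repository location, password file and file path are hypothetical placeholders, and it assumes direct access to the repository; how the NS8 backup wraps Restic is not covered. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: repository, password file and restored path are hypothetical.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

selective_restore() {
    repo="$1"; passfile="$2"; snapshot="$3"; wanted="$4"; target="$5"
    # List the snapshots kept in the repository.
    run restic -r "$repo" --password-file "$passfile" snapshots
    # Browse one snapshot to find the file system path of the lost file.
    run restic -r "$repo" --password-file "$passfile" ls "$snapshot"
    # Restore only the matching path into a scratch directory.
    run restic -r "$repo" --password-file "$passfile" restore "$snapshot" \
        --target "$target" --include "$wanted"
}

selective_restore /srv/backup/repo /root/restic.pass latest \
    /data/pages/start.txt /tmp/restore
```

The second step is the hard part: the path passed to --include has to be found by hand, which requires knowing how the application lays out its data.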

As described in the Backup and restore — NS8 documentation page, the NS8 backup supports multiple repositories (locations). For each repository you can schedule one or more backups with a specific snapshot retention policy.

The full state of every application can be backed up and restored separately. It is not designed to work on single files. For this purpose, at least for the Mail module, Steph is working on a mail archiving prototype based on Piler.

jaywalker · October 15, 2023, 3:16pm

From my point of view, ZFS snapshots are part of the competition nethserver has to face. When it comes to backing up and restoring files for file sharing services (such as samba or nextcloud), ZFS does this really well.

Of course, the snapshots do not include the server config etc., which has to be backed up separately, so it is not a full backup solution. But the config data is small, compared to the volume of the files in the file sharing services, and changes less frequently. That’s why I would like to separate between config backups and data backup. And I would really like to use snapshot technology (which today basically any NAS provides) for the data part, because it is efficient.

pike · October 16, 2023, 7:53am

Snapshot is a tool. A really powerful one.
Whether it is made by the filesystem, the database, or the whole application, it can be used to create a polaroid of the system/file/software to revert to if needed. When the time comes, it needs to be correctly managed (consolidated), or it will eat up storage space if the snapshots are kept forever.
But like RAID, virtualization and containerization, it is not a backup, because it cannot be “extracted” from the environment it needs in order to work as intended.

If this strategy (snapshots) and its requirement (ZFS) serve you as you wish and design, that’s fine. But finding the lathe an effective tool for opening food cans does not mean that can openers are worthless because they cannot do all the things a lathe is useful for.

jaywalker · October 16, 2023, 8:21pm

All I am asking is that nethserver consider providing alternatives, because there are use cases where alternatives are valuable. It really confuses me that this leads to such long discussions.
What is wrong with offering such an optional feature, which NS8 already has for minio, for other apps where it could be useful too?

davidep · November 15, 2023, 4:18pm

Well, I dug into the Podman documentation and found a recipe that should work for any rootless module, more or less.

I assume the disk where we want to store the module data has already been formatted, configured in /etc/fstab and mounted on /mnt/disk00.

In general, after installation (creation) module instances are in a stopped state. They require an additional configuration step to start. In this state they still have not created the volumes where persistent data is stored.

In this case, it is possible to create the expected volume in advance, providing the configuration that bind-mounts an arbitrary path of the node.

Let’s make an example with Dokuwiki. When it is started for the first time it creates a dokuwiki-data volume. Let’s bind it to /mnt/disk00.

# module must have full access to the disk, like its home directory:
chown dokuwiki1:dokuwiki1 /mnt/disk00 
chmod 700 /mnt/disk00 
# create the named volume, with the name Dokuwiki wants
runagent -m dokuwiki1 podman volume create --opt=device=/mnt/disk00/ --opt=type=bind dokuwiki-data

Now complete the configuration of Dokuwiki from the UI as usual.
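A minimal sketch that wraps the recipe above with a mount check, so the volume is not created when nothing is mounted on the target path. The check with mountpoint is an addition for illustration and not part of the recipe; the module, path and volume names are the ones from the Dokuwiki example. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: adds a mount check around the recipe above.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

bind_volume() {
    module="$1"; disk="$2"; volume="$3"
    # Refuse to continue if nothing is mounted there: otherwise the data
    # would silently end up on the root file system.
    if [ "$DRY_RUN" != 1 ] && ! mountpoint -q "$disk"; then
        echo "error: $disk is not a mount point" >&2
        return 1
    fi
    # The module must have full access to the disk, like its home directory.
    run chown "$module:$module" "$disk"
    run chmod 700 "$disk"
    # Create the named volume the module expects, bound to the disk.
    run runagent -m "$module" podman volume create \
        --opt=device="$disk" --opt=type=bind "$volume"
}

bind_volume dokuwiki1 /mnt/disk00 dokuwiki-data
```

Run it with DRY_RUN=0 as root on the node, before the module is configured for the first time.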

It seems straightforward so far, but what happens if I have data in the disk and I want to attach it to the container? For instance, data coming from another Dokuwiki?

In this case there can be a misalignment of uid/gid numbers in the filesystem, and a full remap of file ownership is required. This is a common problem with containers because of uid/gid namespaces, and it is an open issue in this scenario.
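One approach often used with rootless Podman is to reset ownership in two steps: first as root on the node, then inside the module's user namespace with podman unshare. The sketch below is untested against NS8 modules and is not a confirmed fix for the issue above. The container-side UID/GID (1001 here) is a made-up value that must be checked against the actual image. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only. The module name, data directory and container-side IDs are
# assumptions; the UID/GID used inside the container depends on the image.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

remap_ownership() {
    module="$1"; data_dir="$2"; cuid="$3"; cgid="$4"
    # As root on the node: hand every file to the module user, which is
    # mapped to root (0:0) inside the rootless user namespace.
    run chown -R "$module:$module" "$data_dir"
    # Inside the module's user namespace: assign the container-side IDs.
    # Only needed when the application does not run as root in the container.
    if [ "$cuid:$cgid" != "0:0" ]; then
        run runagent -m "$module" podman unshare chown -R "$cuid:$cgid" "$data_dir"
    fi
}

remap_ownership dokuwiki1 /mnt/disk00 1001 1001
```

The first step is needed because files owned by IDs outside the module's namespace mapping cannot be changed from inside that namespace.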

jaywalker · November 25, 2023, 8:32pm

Thank you for the information! I tried it today and it seems to work.
Do you have an idea how the volumes of the app can be displayed when it is unconfigured, such that it becomes clear which volumes can be created as a bind to another location?

I tried runagent -m APPNAME podman volume ls, but it only works after the app has been configured and is running.
After the configuration, the result can be checked with runagent -m APPNAME podman volume inspect VOLUMENAME, which displays the bind volume. The nethserver UI displays the volume as usual.

oneitonitram · December 4, 2023, 6:24am

Hello y’all, I came across this, which allows you to mount S3 storage as a file system. I am not sure if it is something worth looking into.

kahing/goofys: a high-performance, POSIX-ish Amazon S3 file system written in Go (github.com)

LayLow · December 4, 2023, 6:29am

Jambo!

No developments since last June…?

yummiweb · February 4, 2024, 1:58pm

How about integrating the desired volume for the container data directly via a mount (/etc/fstab) at the path /home/<container name>?

This would not change the config files, DB entries or system behavior, because the path stays the same. It would be completely transparent to the system, just as the Unix/Linux developers intended, by the way. The only difference would be the required entry in /etc/fstab, but that file looks different for everyone anyway.

Such a move would then not be problematic:

Mount the future data volume at a temporary path (/mnt/<container name>).
Stop the container (I would generally want a start/stop function, also via terminal).
Copy the container data (/home/<container name>) to the volume, or better move it, so that no remnants are left in the original path.
Unmount the data volume from the temporary path and mount it at the final path (the original container path).
Adjust /etc/fstab.
(possibly a few refinements in the copy/move process)
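The steps above could be sketched roughly as follows. This is an untested illustration of the proposal, not a supported NS8 procedure: the device, module name and file system type are hypothetical, and stopping the module's services is module-specific and left as a comment. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: device, module name and file system type are hypothetical.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Print an /etc/fstab line: device, mount point, file system type.
fstab_line() {
    printf '%s %s %s defaults 0 2\n' "$1" "$2" "$3"
}

move_module_home() {
    device="$1"; module="$2"; fstype="$3"
    tmp="/mnt/$module"; home="/home/$module"
    # 1. Mount the future data volume at a temporary path.
    run mkdir -p "$tmp"
    run mount "$device" "$tmp"
    # 2. Stop the module's services here (module-specific, not shown).
    # 3. Copy the data, preserving owners, hard links, ACLs and xattrs.
    run rsync -aHAX "$home/" "$tmp/"
    # 4. Unmount from the temporary path and mount at the final path.
    run umount "$tmp"
    run mount "$device" "$home"
    # 5. Print the line to add to /etc/fstab so the mount survives a reboot.
    fstab_line "$device" "$home" "$fstype"
}

move_module_home /dev/sdb1 mail1 ext4
```

Note that rsync copies rather than moves: the old files stay hidden underneath the new mount until they are removed separately.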

In NethServer 7, I also mounted separate data volumes at the respective paths for the service data. This works absolutely smoothly because it is completely transparent.

However, the move to NS8 is now difficult for me. The migration tool does not care which disks were mounted at which path (complete transparency), but in order to move the whole NethServer 7 system to NS8, I would now need a very, very large volume to migrate everything. Only afterwards could I attach external disks and move data onto them. And I can’t imagine that I am the only one who has all server data on a single volume.

It would therefore make sense to include a step in the migration that allows separate disks to be mounted at the standard paths. For /home this could certainly be done before the migration, but not for individual container paths, because they do not exist yet (or you would first need to know what they will be called).

And that brings me to the current difficulty in NS8 that a container name cannot be reused. Once a name has been assigned and the container deleted (because the installation did not work, for example), the next container gets a new number. So far I have not been able to rename containers (e.g. from Mail2 to Mail1), because changing the user and path names is probably not enough.

And what happens if the developers decide, as part of certain container updates or migrations, to “move” data within the /home path or to rename containers? Then there would quickly be a space problem. Are there any plans or fixed requirements? Without this information, I cannot safely plan my data layout under NS8.

Andy_Wismer · February 4, 2024, 5:25pm

@yummiweb

Simple, use virtualization…

You’re still thinking like in the nineties of the last millennium!

My 2 cents
Andy

yummiweb · February 4, 2024, 7:25pm

Dear Andy, what’s up with you lately? I have increasingly been seeing answers from you (in various posts) that are much more superficial than before and also a little harsh.

You do not know where I stand mentally and conceptually, or which methods I work by. That is a shame, because you could know, since we have been in contact several times.

I have been using virtualization of various kinds for a very, very long time, for several years now mainly via Proxmox VE. What I am not yet really familiar with are container systems like Docker.

But maybe you also have something productive to contribute to my suggestion instead of just bullying (I can’t get used to this tone). For example, how can virtualization contribute meaningfully? There are good reasons to rely on separate disks (or images) in virtual environments as well. Think, for example, of recovery situations, or of wanting to enlarge storage space.

Yes, there are other methods such as LVM or cluster-wide distributed storage, but why use a complex solution (perhaps designed for a special case) when a simpler one will do? Complexity is the enemy of reliability and security.

But if it is already considered “old school” to mount drives or storage systems at standard paths as required (which would solve various problems described by the original poster), something is going wrong here. And somehow I also read other strange things here in the forum. Services (or containers) are supposedly meant to run continuously and cannot be switched off or paused, yet there are plenty of reasons to do so; just think of security holes that cannot be patched promptly, or even security incidents. Should the complete “orchestrator” be shut down then? Oh, I forgot, there are external firewalls, so you can block the port. But how does that work within the “orchestrator”? And since we are on the subject of firewalls: these capabilities are to be provided and managed externally in the future. From a developer perspective this is certainly easier, and the operator can then make his own choice. But rudimentary firewall management on the orchestrator itself would be very, very important, if not via GUI then at least via a terminal interface, or at least through helpful documentation about what NS8 is actually doing in the background.

I don’t want to go into more detail about your argument now, but I ask two simple questions:

Have you set a password on your workstation? Why? After all, you could lock the room door, and there is also the apartment door and maybe other doors between your workstation and the “bad boys”.
Would you consider a safe to be secure if its mechanical or electrical opening mechanism can be manipulated by an attacker?

Yes, in the past all services were on one system and rather badly separated from each other, but I thought things were moving forward?

Please excuse my rant; I would be glad if the conversation could return to a reasonable level.

Greetings Yummiweb

yummiweb · February 4, 2024, 8:07pm

I would like to emphasize and note the following:

The developers really do a hard job here, especially in view of the time available and their own schedule. And I can understand that concepts can no longer be debated at an advanced stage. And I also understand your (Andy’s) self-imposed approach of covering the developers’ backs. But unfortunately many things are not communicated very well. What will there be, and what not? What is still uncertain? How will things work?

From the point of view of a NethServer 7 operator, the problem is that the system must be replaced in the foreseeable future. Given the very good experience with NethServer 7 (particularly since 7.6, I find), many are excited and confident about NS8 and very happy to consider it, many certainly because of the built-in migration process.

But before you migrate or switch to anything, whatever it is, you have to evaluate the capabilities of the target system to see whether it is even feasible. With the first candidates you can of course try it out, be it by reading the documentation and/or.

With NS8, this is anything but clear, because of course so much is still in development. And I have to say that I find it really very positive that NS8 has been available for testing since a very early stage. So you could already try out a lot, assess the existing functionality for yourself, and also report bugs to the developers (if you can’t contribute any other way).

However, the difficulty remains (at least for me) that it is not clear what functionality is still coming, or how likely it is to come. And because of the currently quite rudimentary documentation (which is understandable), you hardly have a way to find (new) ways for yourself to implement things with NS8 that you have so far done with NethServer 7. In any case, this is a problem for me, because the concept and internal workings of NS8 are either not (yet) properly presented or I just can’t find them.

NethServer 7 used E-Smith as its special mechanism, and the handling of NS7 was well documented. How does NS8 work instead? What are the options for making your own adjustments within the containers? You could deal with the underlying container management directly, but what is NS8 doing there? Where are there special paths (or where will there be), and where not? All of this information will surely come, but until then, as an operator you can hardly plan how to proceed.

Andy_Wismer · February 5, 2024, 12:14am

Fast, live resizing of disk (upsizing only, downsizing will not work.)
Hardware and local backup independent disaster recovery. AMD or Intel CPU would not matter.

If the storage blocks, so does the mount. Autofs doesn’t work in all situations.
And just mounting /var/lib/nethserver in some remote or local volume (USB) is an invitation to disaster.
The system won’t boot, nor will it be able to find the libraries it needs to work.

Path mounting, if done carefully, is OK. This also works for virtual systems.

Yes, for the simple reason RDP won’t allow access without a Password…

I might, but self-destruct systems in the megaton class aren’t legal for private persons to use, operate or possess. And I’m not sure I would trust less…
I’d expect such a viable self-destruct system to only allow restricted access, VPN only and encrypted over an OOB (out-of-band) connection, rather like HPE’s iLO or Dell’s DRAC or the PAL system the US uses…

I don’t agree with a lot of the positions you refer to here (not yours, the ones you refer to!). Services should always have the option to be shut down or whatever. Migration and maintenance are typical reasons.

I do hope and assume our devs will solve these major issues before release, but I have the feeling they are being distracted by too many “exotic” environments and wishes.

A firewall is in place, but with CLI access only, no GUI. That would be OK for me as such. The big BUT is: where’s the “helpful documentation”? Part of it is there, but there are also a lot of gaps…

I am also aware what is possible for me will NOT work for a majority of users.

My understanding is generally, what is in NS7 (officially) will also be in NS8.
Contribs by third parties depend strongly on active maintainers or creators. I’m certain the two majors, Stephdl and MrMarkuz, won’t let us down, but there are others.
And a working migration is needed. At the moment, this is missing e.g. for Dolibarr and Dokuwiki (both Stephdl) and for Zabbix, MeshCentral and Guacamole (all MrMarkuz).

I’d like to participate in the NS8 forums in future, as I’ve been doing for the past few years with NS7. As such, I’m hoping for a fairly complete and documented release soon, and I’m giving the devs my thumbs up for the excellent work so far.

Questions like:

NS7 could do VPNs without being the primary network firewall, just being a server. Can NS8?

Another advantage of VMs: you can do “dry runs” for testing until you’re confident and satisfied with the result.

yummiweb:

However, the difficulty remains (at least for me) that it is not clear what functionality is still coming, or how likely it is to come. And because of the currently quite rudimentary documentation (which is understandable), you hardly have a way to find (new) ways for yourself to implement things with NS8 that you have so far done with NethServer 7. In any case, this is a problem for me, because the concept and internal workings of NS8 are either not (yet) properly presented or I just can’t find them.

NethServer 7 used E-Smith as its special mechanism, and the handling of NS7 was well documented. How does NS8 work instead? What are the options for making your own adjustments within the containers? You could deal with the underlying container management directly, but what is NS8 doing there? Where are there special paths (or where will there be), and where not? All of this information will surely come, but until then, as an operator you can hardly plan how to proceed.

I agree with the above, also for me there are a lot of open questions.

The E-Smith Template system is gone, is any replacement planned or in the works?
Or just a wild “anyone can edit any config file”?

As to server security:

I recall a discussion I had online in about 2001. The subject was server / NOC (Network Operating Center) security.

All kinds of cool stuff and strategies were discussed. Face and voice recognition for building and room access, for example. These strategies were labeled as impregnable, and no one can steal your data when so secured…

As advocatus diaboli (acting devil’s advocate) I asked the simple question of what they would do if someone at the door, dressed in some fantastic military-style uniform (think DPRK, China or Russia), asked nicely if you would kindly hand over your server, including loading it on a waiting truck.
Two mechanized divisions around the building, and a sub tipped with bunker cracker nukes just outside of international waters ought to be adequate arguments…

→ Silence, or typed “ummms”…

Physical security is paramount, before any logical security!
How far physical security goes, depends.

I hope these answer at least most of your questions to the best of my capabilities, and in the usual civilized tone we’re used to in this forum! (Not ranting!!! )

My 2 cents
Andy

yummiweb · February 5, 2024, 11:24am

Dear Andy, thank you very much for your quick answer.

I am pleased to read that our views are not completely opposed here and that we also think similarly about security, albeit with a different result.

The more I think about it, the more reasons come to mind why an (internal) firewall and a differentiation of data storage would be sensible (even essential), and how this relates to the question of “internal or external firewall”. I’m just not sure if this is the right thread for it. I’ll take the chance and share my thoughts.

To make it clear briefly: I use virtualization - for many reasons and also for the reasons you have set out. Only the VM hosts, the Raspberries (as well as other small devices) and the Mac clients are still “bare metal”.

Why I think the differentiation of data storage is important:
My VM host provides various physical disks for the virtual disks. The essential ones are always connected internally. System data and service data (such as emails) usually live on SSDs. Areas stressed by write activity (swap, logs) I like to put on non-critical disks, either fast HDDs or older SSDs. I also like to put large file data (very expensive on SSDs, and unnecessary for data accessed over online connections) on large HDDs instead of SSDs. Virtual disks for databases would likewise be placed on particularly suitable disks (or storage systems). And these are only the scenarios that do not even involve backup sizes or the ability to do partial backups (at VM host level).

All of this works wonderfully with Proxmox VE. I understand the scenario in which certain drives cannot be found and therefore cannot be mounted, so the server cannot start, but wouldn’t that be solvable by a software check at startup (either at system boot or only when the services start)?

But you mentioned the developers’ focus on the “target group”.
There will probably be the following basic scenarios:

“Bare Metal” - still used in small scenarios and also as part of “Colocation” or “Server Housing”. Therefore, this target group should still be important.
“Virtualization” - should cover the rest of the scenarios. Either self-hosted (locally or in the data center) or directly at a cloud provider (I have little experience with the latter).

As far as I understand it, the target group of NS8 users or administrators does not necessarily have profound knowledge of the subject, and (partly) for that reason things should be kept simple. Agreed.

But something that affects almost everyone who deals with data in some way (and what else does a server do?) is compliance with the legal basics, right? In the EU, this is specifically the GDPR (called DSGVO in Germany). And now let us look at the above-mentioned scenarios from the point of view of the GDPR:

Access protection (device level): Either organized locally or you have to trust the cloud provider (there is a contract).
Access protection (service level): Codes and passwords for the machines or services (containers) can be assigned locally and in the cloud.
But what about the disk level? According to the GDPR, disks should also be encrypted, not only the backups (ideally covering even scenarios of military force).

Encryption on “bare metal” has the problem that someone has to enter a password at the machine on every start or restart (or insert a decryption stick or similar). Or you have KVM hardware connected, in which case you can also do it remotely.

Encryption provided by the cloud provider is, strictly speaking, no encryption at all (if you are honest). Here, too, you have “only” a contract, the rest is hope.

If you operate your own virtualization host (such as Proxmox VE) locally or in the data center, the disk level can be handled via LUKS encryption, i.e. directly at the host level. This has the advantage that no encryption has to be set up at the guest level, and the computing resources for encryption and decryption do not have to be spent separately for each guest, since only the host holding the disks is responsible. The encrypted volumes also only have to be unlocked when the VM host starts. The disadvantage is that security (in the unlocked state) now has to be guaranteed by the VM host alone. However, the host should be operated as securely as possible anyway. A physical attacker on site (whom I cannot see directly) would be a problem, but carrying the machine away does not work (unless the attacker also carries the UPS away).
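For readers unfamiliar with LUKS, host-level disk encryption with cryptsetup might look like the following minimal sketch. The device, mapping name and mount point are hypothetical, and luksFormat destroys all data on the device. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: device, mapping name and mount point are hypothetical.
# luksFormat wipes the device. With DRY_RUN=1 (the default) the commands
# are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# One-time setup: create the LUKS container and a file system inside it.
luks_create() {
    device="$1"; name="$2"
    run cryptsetup luksFormat "$device"
    run cryptsetup open "$device" "$name"
    run mkfs.ext4 "/dev/mapper/$name"
    run cryptsetup close "$name"
}

# At every host start: unlock once (passphrase prompt), then mount.
luks_unlock() {
    device="$1"; name="$2"; target="$3"
    run cryptsetup open "$device" "$name"
    run mount "/dev/mapper/$name" "$target"
}

luks_create /dev/sdc vmdata
luks_unlock /dev/sdc vmdata /mnt/vmdata
```

Guests stored on /mnt/vmdata then need no encryption of their own, which matches the trade-off described above: the data is only protected while the volume is locked.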

OK, up to this point we more or less comply with the GDPR, provided that a user from the typical target group can implement all of this. NS8 does not yet provide mechanisms of its own that could help the target group here. If NS8 conceptually supported system data and service data on different disks (ideally at the same path), LUKS could be used here. (Please do not store the keys for the backups in the system path!)

And what about structural or conceptual security? (also one aspect of the GDPR)
And here I come back to the question “internal or external firewall”.

The data connection between the (possibly) different nodes is encrypted by NS8. Very exemplary!

But what about the data connection between NS8 and upstream firewall? This connection is either a physical (network cable) or a virtual.

The service data running over it is usually (hopefully) encrypted, but what about the packets themselves? Basically, anyone who gains physical or virtual access to this connection can alter the packets, i.e. (re)define where they are sent or where they (allegedly) come from. And then the firewall no longer matters. Oops …

Access to a network cable (locally or in the data center) is usually “only” a question of physical protection - or a question of what the provider contractually assures (and whether you trust this).

The same applies to access to a virtual connection between server VM and firewall VM. If you only rent the required VMs from a cloud provider, you have to trust the provider to restrict access to the virtual connection effectively. From experience with VM host systems, we know that “breakouts” from VMs are not completely impossible, and the more “strangers” share one and the same host instance, the greater the risk.

The scenarios “bare metal” or “VM instance in the cloud” probably correspond to the typical target group, with the problems (or risks) described. Only operators of their own VM hosts (locally or in the data center) can choose and secure their structures accordingly (at least it is their own responsibility). But that is probably not the target group of NS8.

If (as before) firewall and service were on the same device or within one VM, an attacker would have to attack the system bus or already have very deep access to the VM host to manipulate this connection. This would protect the typical NS7 or NS8 target group better, whether locally, in the data center or at a cloud provider.

It is roughly clear to me what the idea behind the structural decision to separate firewall and server (or orchestrator) may have been (and that it is hardly reversible).

However, it is not clear to me why the firewall is not intended as part of a container in NS8. Wouldn’t that have been an option? Maybe it still will be? (OpenWrt runs as a container?)

So much for my 2222 cents

Greetings Yummiweb

PS.
Andy, if you know where I can ask specific questions about problems with NS8 without annoying anyone, please let me know.

Andy_Wismer · February 5, 2024, 11:36am

Hi @yummiweb

I’ll reply to your longer post via PM, so as not to completely derail this post! This might take a while!

Don’t forget, while not living in the EU, I do live in Switzerland (We’re surrounded!) and we also adhere to the GDPR so I’m well aware of the issues.

My 2 cents
Andy