Proxmox HA chat

Well thought out!

Doing backups for all VMs is never a bad idea:
Better to have one backup too many than one too few!
I do mine automatically every night; a few are done twice daily, once at lunch and once after work.

Doing backups CAN take a bit longer; the 15 minutes was referring to creating the cluster and rebooting.

If you DO something, think it out first and do it well.
If you want to set up a cluster system, then do it in a way that gives you the potential of 5-10 years of “carefree” running, at least on the Proxmox level!
Then it’s worth your while, and you honestly earned yourself a beer or whatever your preference. :slight_smile:

Most of my newer Proxmox have only 2x 250 GB SSDs or 2x 500 GB SSDs, depending on availability. (300 GB SAS…)

A little bit of local space isn’t wrong, but too much is definitely a waste.

I’d also - depending on your hardware - install LM-Sensors, smartmontools (or equivalent), dmidecode (good for, say, HP-ROK MS licensing or similar)…

SNMP and Zabbix Agents are also a must in my environments…
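On a Debian-based Proxmox node that boils down to something like this sketch (standard Debian package names; your Zabbix agent may come from the Zabbix repository instead):

apt update
apt install lm-sensors smartmontools dmidecode snmpd zabbix-agent
sensors-detect    # interactive probe for the available hardware sensors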

I wouldn’t want to drive a car without a speedometer (or without an oil or fuel gauge); nowadays, chances are you won’t just get a fine, you might lose your driving license for speeding. So monitoring system vitals is a must!

Both Proxmox AND your NAS equipped with 10 GbE NICs! You lucky guy!
I’ve only had 2 clients with that kind of NICs…

My 2 cents
Andy


That’s another thing I really don’t know anything about. I have it running on a VM, but I know I’m not anywhere close to taking advantage of its capabilities (or, for that matter, really understanding what they are).

Zabbix - or any form of monitoring - is in reality a must.
When I walk around, I do keep my eyes open, and not on a mobile device… :slight_smile:

In a company environment, monitoring makes your job easier and more plannable. You’re notified that there are 12 updates for that Windows server, user XY is still logged on at a remote station, your printer is running out of toner, your NAS has an upgrade waiting online…

Even at home, or in a home lab environment, monitoring is a must. I like knowing what is going on, which server or hardware is reaching its limits, yadda yadda yadda…

I admit to being a music buff, and because I do a LOT of networking stuff, my Internet may get clogged. So I have ALL my music locally, so no interruptions. A few years back, I thought putting my music on a music server with mirrored disks was sufficient, plus a backup on USB. As always, we humans are lazy; the USB backup isn’t done as often as it should be…

Then there are such things as power outages - even in clockwork countries like Switzerland!
Lightning, or whatever. Even squirrels jumping into a window gap at the Zurich main power substation… Sh*t happens.

One disk died; it was a Friday. OK, I’ll order another one, no problem. The second outage came on Sunday, another heavy storm, and it took the second disk with it. Five years of music collecting lost!

Monitoring would have alerted me earlier to how precarious the situation was…

I can help you / walk you through Zabbix if you want, it’s not too hard.
SNMP is BTW the best joke. Simple Network Management Protocol…
It’s anything but simple sometimes, but once you get the hang of it, or use a good tool like Zabbix, a lot of the headaches are gone… :slight_smile:

If Zabbix isn’t actually doing much, it’s like having a Ferrari in the garage. Not only is the Ferrari wasting its purchase price just sitting there, it’s also taking away space, possible rental income for that space, … and so on.

Here are a few “teasers”:

[Zabbix dashboard screenshots]

The cameras show LIVE views…


Well, surprise surprise, things took longer than expected, but I finally got around to it this morning. The first problem I encountered was that the PVE 6.1 (-1 or -2) ISO simply wouldn’t boot on my Dell C6100 blades; it froze at “loading drivers”. The 5.4 ISO booted fine, though, so I installed that and upgraded to 6.1. Both systems are up, the cluster is created, Let’s Encrypt certs for both systems are created and installed, I can (as promised) see both systems through a single web GUI, and restoring the backups is going quite nicely.

Only problem I’m having (which I was having previously as well) is that the vmbr0 interface isn’t coming up automatically at boot, which means the system is down until I log into the console and bring it up. Here’s what I’m seeing in /etc/network/interfaces, which looks like it should bring it up:

root@pve1:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface enp3s0f4 inet manual

auto vmbr0
iface vmbr0 inet static
	address 192.168.1.3
	netmask 255.255.255.0
	gateway 192.168.1.1
	bridge_ports enp3s0f4
	bridge_stp off
	bridge_fd 0

iface eno1 inet manual

iface eno2 inet manual

iface enp3s0f4d1 inet manual

@danb35

Hi

Cool that you finally got your Proxmox environment up & running.

An example from Proxmox GUI:
This box is connected to two networks, the main LAN and another Network. This other network is only “bridged” by Proxmox, so a VM can connect to both networks.

The config file:

auto lo
iface lo inet loopback

auto enp6s0
iface enp6s0 inet manual

auto enp7s0
iface enp7s0 inet manual

iface enp8s0 inet manual

auto enp9s0
iface enp9s0 inet manual

auto vmbr0
iface vmbr0 inet static
	address 192.168.33.62
	netmask 255.255.255.0
	gateway 192.168.33.1
	bridge-ports enp6s0
	bridge-stp off
	bridge-fd 0

auto vmbr1
iface vmbr1 inet manual
	bridge-ports enp9s0
	bridge-stp off
	bridge-fd 0


As far as I can see, your config looks correct, maybe set auto in the GUI…
(On the NIC itself AND on the vmbr0 Bridge…)
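For illustration, the stanza from the post above with that “auto” suggestion applied would look like this (only the first line is new):

auto enp3s0f4
iface enp3s0f4 inet manual

auto vmbr0
iface vmbr0 inet static
	address 192.168.1.3
	netmask 255.255.255.0
	gateway 192.168.1.1
	bridge_ports enp3s0f4
	bridge_stp off
	bridge_fd 0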

Maybe the issue is the same that caused 6.1 to hang, NIC interfaces / drivers…

A solution may be a script started later in the boot chain, replicating your manual network start - as I understand it, it works afterwards…
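A rough sketch of such a workaround, assuming a small systemd oneshot unit is acceptable (the unit name is just an example):

# /etc/systemd/system/bring-up-vmbr0.service (example name)
[Unit]
Description=Bring up vmbr0 late in the boot chain
After=networking.service

[Service]
Type=oneshot
ExecStart=/sbin/ifup vmbr0

[Install]
WantedBy=multi-user.target

Enable it with: systemctl enable bring-up-vmbr0.service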

Tip: Don’t forget this:

nano /etc/vzdump.conf

tmpdir: /tmp

For Backups on Networks…

LM-Sensors and dmidecode (e.g. for ROK-licensed MS servers), besides the usual nano, mc, htop and screen plus the SNMP & Zabbix agents, all run well on Proxmox!

Andy


I am looking to set up 3 Proxmox servers using Ceph. All 3 will have a 10 Gb backend link, and I want to use the 10 TB drive in each server as the main drive, shared between hosts and VMs. I was thinking of perhaps splitting the 10 TB drive into two 5 TB volumes: one for data and one where I can move VMs between hosts.

They are going to have a decent amount of memory and I am just looking for suggestions and/or pretty decent documentation.

I did think about going with 2 hosts and a FreeNAS box for storage. FYI, HA may not really be necessary as long as I can minimize downtime and move VMs between hosts.

Any suggestions or recommendations will be much appreciated.

@sektor

Hi David

If you set up your Proxmox servers as a cluster with shared storage, you don’t need a drive to “move” VMs between hosts.
Proxmox has two forms of cluster: one for fast migration, the other a full HA cluster. Both are actually the same thing; HA just requires 3 running Proxmox nodes for fully automatic failover. A cluster can be 2 Proxmox nodes or more.
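For reference, building the cluster itself is only a couple of commands (cluster name and IP are placeholders):

# On the first node:
pvecm create my-cluster

# On every further node, pointing at the first node's IP:
pvecm add 192.168.1.10

# Check the result:
pvecm status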

I usually use a small SSD (250/500 GB) for Proxmox (OS). All VMs are on “shared storage”, usually a dedicated NAS with RAID10 (the fastest, according to Proxmox). None of my clients have 10 GbE networking (yet), but Proxmox and the NAS have BONDING of 2 NICs.

Live Migration takes about 90 seconds!!!

As the storage is shared, Proxmox only needs to transfer the RAM contents.

With live backups and fast migration, you get very high availability and low downtime without actually having a full HA cluster. A full HA cluster is also possible with Proxmox, and doesn’t cost extra! You just need enough Proxmox nodes.
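A live migration can be started from the GUI or the CLI; a sketch with a made-up VM ID and node name:

qm migrate 100 pve2 --online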

Hope that helps…
Andy


@Andy_Wismer It does, thank you, but unfortunately I don’t have the option of shared storage or a NAS, unless I set up one of the hosts as FreeNAS instead of Proxmox, so I would have 2 Proxmox hosts and a FreeNAS box for storage.

I was thinking of doing not only VM redundancy but some kind of data redundancy as well. I have been doing tons of research and was going to go that route, but then decided on Ceph.

The more I dig into Ceph, the less comfortable I am, because the documentation is a little bit involved, and I did try posting in the Proxmox forum but didn’t seem to get much help there.

I run proxmox at home with a nethserver vm and have been using it for a few years.

You guys here by far seem the most helpful.


@sektor

Hi

I fully agree with you; I personally think this NethServer forum is one of the best in all of open source. The people here make it so… @support_team & @dev_team, that means you too!
That’s also the reason I try to help as best as I can here…

Proxmox is great software, the best virtualizer in my opinion. But as it works so well, their forum is a bit “lacking”, and doesn’t come close to NethServer’s forum here!

I run about 20-30 Proxmox / NethServer combos for my clients, mostly SME companies.
All use NethServer for AD, mail, file, print, NextCloud and Zabbix monitoring.
At all clients, I use shared storage on a NAS (mostly Synology) via NFS on Proxmox.
On Proxmox, I use XFS due to its far better performance than ext4…
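Such an NFS share ends up in /etc/pve/storage.cfg; adding it on the command line looks roughly like this (storage name, server IP and export path are placeholders):

pvesm add nfs nas-vmstore --server 192.168.33.10 --export /volume1/proxmox --content images,rootdir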

I’d like to use Ceph, which is very reliable and solid. If it’s enough for CERN in Switzerland with their LHC (Large Hadron Collider) and the amounts of data they produce… Also, seeing who is behind Ceph… And Ceph is distributed…

But reading their HW suggestions… Start with 10 GbE networking…
I just do not have that kind of spare hardware lying around to test and gain experience with Ceph yet, but I will get there…

With your hardware, I’d set up 2 Proxmox nodes and a NAS. I do approve of FreeNAS; ZFS is great, if not one of the best filesystems of all!
Have your NAS back up to a USB disk. The Seagate Backup Plus Hub is a great buy for an 8 or 10 TB USB3 disk; I use those a lot. I’d also suggest using 1 or 2 1 GbE links for the cluster network. Note, these must NOT be bonded; use the Proxmox-suggested built-in redundancy protocol for this if using more than 1 link. I think 1 link is sufficient in your case.
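With corosync 3 (Proxmox 6.x), the redundant cluster links are declared when creating the cluster; a sketch with placeholder names and addresses:

pvecm create my-cluster --link0 10.10.10.1 --link1 10.20.20.1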

See these networks…
Some of my clients are also multi site.

All have local daily backups (both server backups, e.g. NethServer, and Proxmox live backups), but also off-site backups, usually to the same model of NAS at the home of the company’s boss…
Higher availability in case the office NAS fails: we take the home one to the office until replacements come… (See image 2, below. The other two have this too, but they are shown on other maps - the network is more complicated and larger.)

My 2 cents
Andy


@Andy_Wismer Thank you for the detail. Unfortunately, where the servers are, I can’t add any external storage like that, so I plan on backing up offsite from the FreeNAS.

I will formulate my plan and post it here, let me know what you think.


Just added a third node–took down my xcp-ng server to put Proxmox on it instead. Six sockets of X5650s and 192GB of RAM among three nodes–should be enough for a while…


@danb35

Sounds impressive! Are they now “clustered”?

And what are you running there, if I may ask?

Andy

Yes, it’s now a three-node cluster. The third node is really superfluous, as the first two would be enough to run what I’m using, but I have four nodes in a Dell C6100, so…

I’m running a variety of things, mostly “test” systems. My neth dev and test systems are there, a FreeNAS test system is on there, my bitwarden server, my PBX, my Zabbix server, TrueCommand, MeshCentral, a local Discourse instance, a CentOS 7 system with Asigra client (the folks at iXSystems were really pimping that plugin, so I was trying to figure out how to make it work), etc. If I hear of software that sounds interesting, I spin up a VM and play with it–and now that I’ve figured out how to use templates in Proxmox, that’s easier than ever.


@danb35

Proxmox with templates & Linux containers is really cool and fast. Spinning up a VM with vanilla Debian, CentOS or Ubuntu is a matter of 1-2 minutes or even less!
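Once a template is downloaded, a container can also be created from the shell in one go; a sketch with made-up ID, template name and storage:

pct create 101 local:vztmpl/debian-10-standard_10.7-1_amd64.tar.gz \
	--hostname testct --memory 1024 --storage local-lvm \
	--net0 name=eth0,bridge=vmbr0,ip=dhcp
pct start 101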

One of my Proxmox is dedicated to such testing…
Soon a second.

My 2 cents
Andy

PS:
In IT, “one too many” is usually preferable to “one too few” (think e.g. backups).
The only exception to this is “problems”; of those, everyone prefers fewer!
:slight_smile:

@Andy_Wismer So below is what I have to work with. As I mentioned, originally I was thinking of using Ceph not only for virtual machine storage but as the VM file system as well, until I started digging into the documentation.

Basically I have 3 servers right now, all running Proxmox, and my goal is to cluster them all. Unfortunately I do not have a NAS where these servers are hosted, and I cannot do USB backups since I don’t have physical access to them, so I was going to do offsite backups.

All 3 servers are Dell PowerEdge C6220
CPU: dual Intel Xeon 2.6 GHz octa-core for a total of 16 cores
Memory: 96 GB
Network: each server has a 10 Gig nic connected to a 10 Gig switch to be used for cluster traffic and 2 bonded Gigabit nics to the outside world.
Drives: 2x 480 GB SSDs in software RAID 1 for the OS, a 10 TB SATA drive, 2 more 480 GB SSDs (which were required per the Ceph documentation), and a 1 TB SATA drive

I was kind of thinking of altering the drive configs of the servers a bit, since I am not going the Ceph route, and I am looking to replicate to another hosting location for redundancy at some point.

The reason for software RAID and not a RAID controller was that, per the documentation, it is recommended not to have any RAID setup at all when using Ceph, and the SSDs were needed for the OSDs (Object Storage Daemons).

The recommended ratio, per what I read, is 1 OSD per physical disk. As you can see, this is where things start getting a little confusing, at least in my opinion, and obviously they were no help in the Proxmox forums.

Basically I am looking for the best way to cluster them, and obviously if I go the FreeNAS route and use that as the “NAS” and shared storage for everything, then it will be a 2-host cluster. Again, I really appreciate the input and the help from yourself and anyone else that assists.

@sektor

Morning David!

It’s 08:54 here in Switzerland, and for the last two to three weeks it has been very sunny! The local farmers are a bit concerned about a drought, like we had 2 years ago (2018 was VERY dry).

Going through your hardware specs: they’re quite impressive.

I don’t quite get your planning where the disks/storage are concerned (I take it that these are per hardware server…):

What was your original planning for the disks in your CEPH concept?

Did you see this:
hardware recommendations — Ceph Documentation ?

Also good to know: the disks for the OS itself CAN be in RAID (1, 5 or whatever); just the disks for Ceph usage shouldn’t go through a controller in RAID mode, as Ceph needs to be able to control the disks directly!
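On Proxmox the raw disks are handed to Ceph one by one; a sketch using the pveceph tooling with placeholder device names (double-check the exact options against the Proxmox docs):

pveceph osd create /dev/sdb
# or, with the OSD's DB/WAL on a separate SSD:
pveceph osd create /dev/sdb --db_dev /dev/sdc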

With that kind of hardware, I’d see two options:

  • CEPH, as planned, maybe tweaking the original plan a bit.
  • 2 Node Proxmox Cluster, 3rd Node as FreeNAS

The second option would make use of a 2-node Proxmox cluster, with the third server set up as FreeNAS. FreeNAS uses ZFS and would be the “shared storage” for the Proxmox cluster.
The nodes for the Proxmox cluster would only have 2 SSDs each for the OS. The other disks would go to the FreeNAS system (server no. 3).

This would give you a live-migration cluster, high-speed backups and snapshots, and a VERY performant Proxmox cluster. An additional bonus is you’d have a single point (the FreeNAS) to connect to external backups.

Note: This setup can - later on - be converted into CEPH, using the disks as initially planned…

→ This would entail converting the FreeNAS box back into a Proxmox node and redistributing the disks.
During this transition, the data would need to be copied over to USB Disks or something for the migration.

My suggestion:

I’d personally suggest going the Ceph road, seeing as you already have the hardware AND the 10 GbE connections plus switch. You also already have three nodes, which is the minimum required for a Ceph cluster.

But you could also take the 2nd road, get your know-how and self-confidence up to Proxmox levels, and maybe later, or much later, convert your cluster into a Proxmox Ceph cluster…
This entails learning by doing / using, and gives you the confidence that Proxmox is the best solution!

I can help you with both ways.

A small question regarding Internet connectivity / Firewall…
What do you have available / or planned?

Offsite backup in your original plan:
You’d still need a local backup, as backing up a live system offsite isn’t sensible and is very time consuming… Saving a local backup offsite, on the other hand, could be a simple rsync script. Remember, rsync lets you set the transfer speed used, so offsite backups do not block your Internet (at 2 sites!)…
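A sketch of such a script with a capped transfer rate (paths, host and limit are made up; --bwlimit is roughly in KB/s, so 20000 is about 20 MB/s):

rsync -a --delete --bwlimit=20000 /srv/backups/ backupuser@offsite.example.com:/volume1/offsite-backups/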

My 2 cents
Andy

@Andy_Wismer

Good morning to you as well. It is 08:09 here in the south-east of the US. Good to hear about the sunny weather, but I do pray you guys get the rain you need for your crops. Here we are fixing to get some pretty nasty weather this evening through Friday.

In regards to the planning of the storage, I know it is a little unusual, but each physical server can only have 6 drives maximum. So I did the following per my original plan, which was to go this route: https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster, which after reading the documentation again doesn’t seem too bad now.

I guess I must have been reading some other documentation, which could have possibly started confusing me. With that said, as per the documentation there is to be no RAID in the server, so I put the 2 OS disks in a RAID 1 zpool, and the remaining disks are to be as follows.

Since each server can only have a max of 6 physical disks, and I read in the documentation that it is recommended to have 1 OSD per physical disk, and I thought I remembered that SSDs are recommended for your OSDs: my 10 TB drive, which I planned on using for CephFS, was going to use one of the remaining SSDs for its OSD, and the 2 TB drive, which I was going to use as my main Ceph pool, was going to use the other SSD for its OSD.

@fausp suggested doing this same exact setup by configuring nested virtualization on one of the Proxmox hosts, which is really a good idea to validate the setup, because right now I have 2 hosts not doing anything. So what I was thinking of doing was setting up the 3rd host for nested virtualization and spinning up another Proxmox to complete the cluster, while leaving my main host alone until I figure it out.
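For reference, nested virtualization on an Intel host is enabled with the standard KVM module option (generic KVM procedure, nothing Proxmox-specific; the nested Proxmox VM should then use CPU type “host”):

echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf
# reload the module (or simply reboot), then verify:
modprobe -r kvm_intel && modprobe kvm_intel
cat /sys/module/kvm_intel/parameters/nested    # should print Y or 1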

For testing purposes my nodes would be as follows: the current physical node 2 would be the primary, physical node 3 would be the secondary, and virtual node 4 would be the tertiary. In regards to resources for the virtual Proxmox, I decided to give it at least half of the resources of the host it will be living on, but my question would be how to mimic the drive configuration.

From my understanding of the lines below, you can’t partition the OSDs; each one uses a whole disk, unless I am incorrect in my understanding. Also, what seems unclear is how it knows what physical disk goes with what OSD, because I do not see where you specify that.

If the disk was used before (eg. ZFS/RAID/OSD), to remove partition table, boot sector and any OSD leftover the following command should be sufficient.

ceph-volume lvm zap /dev/sd[X] --destroy

I hope I cleared things up a bit; sorry, I was up kind of late. About the backup thing, I agree with you, but unfortunately, where the servers live, this is kind of my only option. So what I planned was that the 10 TB drive was going to be not only data storage but backups as well, and from there I’d back up offsite somewhere.

Hi David

So you’re in Florida? Whereabouts? FL is more than 4 times the size of Switzerland (and has at least that many more sunny days too)…

This seems to be quite a bit different from the second one:
https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster

It says here:
https://docs.ceph.com/docs/master/rados/configuration/common/#osds :

Ceph production clusters typically deploy Ceph OSD Daemons where one node has one OSD daemon running a filestore on one storage drive. A typical deployment specifies a journal size.

See also this: (Scroll down to the examples:)
https://docs.ceph.com/docs/master/start/hardware-recommendations/

But, like I said earlier, with that kind of hardware I’d choose to go the CEPH road myself.

Andy

I think I’ll continue the discussion of containers here rather than on the FreePBX thread–I’ve already derailed that enough. @Andy_Wismer, you’d said to download/install container templates through the storage screen. But unless I’m missing something (or there’s a different storage screen), there isn’t a place to do that:

I tried checking the docs:
https://pve.proxmox.com/wiki/Linux_Container

…but they weren’t helpful at all in finding the way to do this through the GUI–the only thing it says there is:

@danb35

Try clicking on Local, then Content, then Templates… Choose and enjoy!

Don’t forget, yours isn’t quite the “default” storage used in Proxmox! :slight_smile:

Containers are ideal for testing a LOT of stuff (but not NethServer, because of the AD).
High CPU-power stuff like ZoneMinder with 10 cams online, 4 of those 10 in Full HD…

And firing up a vanilla container takes 1-2 minutes, mostly because the human needs to click through 4-5 menu screens…

If clicking on Template remains empty, your Proxmox is a fresh install and needs a manual apt update…
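On a fresh node, the container template index can also be refreshed and fetched from the shell with pveam (the template name below is just an example):

pveam update
pveam available | grep debian
pveam download local debian-10-standard_10.7-1_amd64.tar.gz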

Andy