NethServer HA configuration

Hi all, maybe we have HA experts here who know of best-practice resources.
Please share your thoughts on this.

We also need to discuss backup MX within the scope of HA, and a data directory mount point such as /var/lib/nethserver located on an external NAS device over NFS (or iSCSI).

Hi,

About high availability… You need to make things double, redundant.
One primary DNS and one secondary DNS; at another site is better.
On Windows domains, one WINS server on each site; if WINS goes down, it’s not too important for 24 hours.
One DHCP server on each site. You must also have an inactive DHCP server on each site, ready to be brought up if the main one is in trouble.
If you need other services to be available (internet access on each site, any service in general), think about OSPF: with its route “costs”, it can save the day if things go bad.
If you can’t offer the service, think about another, degraded mode.

Edit: another way to think about HA is to find out quickly when a service is unavailable and to restore it quickly.
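To illustrate the "find out quickly" part, here is a minimal sketch of a TCP reachability check in Python. It only assumes that "available" means "the port accepts a connection"; the service names, hosts and ports in the usage comment are hypothetical, and a real monitoring tool would do much more (alerting, protocol-level checks).

```python
import socket


def service_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def down_services(checks: dict, timeout: float = 2.0) -> list:
    """Return the names of all services whose port does not answer.

    `checks` maps a label to a (host, port) pair, for example
    {"smtp": ("mail.example.org", 25), "web": ("www.example.org", 443)}
    (hypothetical hosts, for illustration only).
    """
    return [
        name
        for name, (host, port) in checks.items()
        if not service_is_up(host, port, timeout)
    ]
```

Run from cron every few minutes and mail the result, something like this already shortens the time between a failure and someone noticing it.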

If you have a COMPLETELY updated environment (no Windows 9x, ME, or Windows NT 4.0 or earlier) and no DFSN configured to depend on WINS, you don’t need WINS anymore.
If you run Exchange 2007 in a multi-domain environment, you will also need WINS.

For a good overview, see

I was thinking about the ability to build an HA structure with NethServer, but I think it’s not possible at the moment.

For example, let’s imagine a scenario with an SME.
The SME has a NethServer instance and wants to make its services redundant.

The easy way could be to rent a VPS and install another NethServer instance on it.
But with the lack of OSPF, or another advanced routing technology, you can’t do a lot of things…

Dnsmasq is good as a local DNS and DHCP service on site, but it’s weak at managing DNS zones the way BIND or Unbound can. With such a DNS service you can imagine a primary DNS and a secondary DNS: if the primary is down, all clients go to the secondary.
DHCP only works on site, as a local service; the only way is to have a second DHCP server on site.
If the SME has an ownCloud instance on the local server, then with the lack of advanced routing, even if ownCloud is synchronized with the VPS, you can’t do anything when the main ownCloud is unavailable. It’s dead.
With an OSPF feature, where you can set a cost for each route, you can route all on-site traffic through the local NethServer instance and off-site traffic through the VPS, and synchronize the ownCloud instances every hour… If the local ownCloud instance is down, then with route costs you can reroute traffic through the VPS… With such a feature, you can calmly work on the failed ownCloud instance, without stress… But currently, NethServer can’t do this.
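The idea of route costs can be sketched in a few lines of Python. This is not OSPF (which computes shortest paths from a link-state database); it is only a toy model of the selection rule described above: prefer the cheapest route that is still up. The route names and cost values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str   # label of the next hop (hypothetical)
    cost: int   # lower cost means more preferred
    up: bool    # whether the next hop currently answers


def pick_route(routes):
    """Return the lowest-cost route that is up, or None if none is usable."""
    usable = [r for r in routes if r.up]
    return min(usable, key=lambda r: r.cost, default=None)


# Example: the local instance is preferred while it is up;
# when it goes down, traffic falls back to the VPS.
normal = [Route("local NethServer", 10, True), Route("VPS", 100, True)]
failed = [Route("local NethServer", 10, False), Route("VPS", 100, True)]
print(pick_route(normal).name)
print(pick_route(failed).name)
```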

I read about a planned feature to back up one NethServer instance onto another; it’s a good feature, but it’s not an HA feature yet.
An HA feature is when services run in parallel to offer resilience.

Edit: Do you remember when I talked about MX backup? Roadmap for Q1 2016
Sadly, NethServer isn’t going in this direction… :tears:
People here think it’s unnecessary… But with more demand, perhaps we can change minds here and make HA available :smiley:

So, best practices for thinking about high availability:

  • Make a good analysis of your IT structure: backbone, server(s), and services/applications.
  • Analyze what is critical, in terms of hardware and in terms of software/services.
  • Plan backup solutions and plan how to restore… Sometimes a backup is not sufficient; you must think about the time to restore.
  • Plan how long the service or application can be down before the situation becomes unacceptable.
  • Identify which services are local or global, easy or difficult to set up…
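The points about time to restore and acceptable downtime can be turned into a quick back-of-the-envelope check. The sketch below assumes restore time is dominated by copying the backup at a steady throughput, plus a fixed overhead for reinstalling and reconfiguring; the function names and all numbers are illustrative assumptions, not measurements.

```python
def restore_hours(backup_gb: float, throughput_mb_s: float,
                  overhead_hours: float = 0.0) -> float:
    """Estimate restore time: copy time at a steady rate plus fixed overhead."""
    copy_seconds = backup_gb * 1024 / throughput_mb_s
    return copy_seconds / 3600 + overhead_hours


def needs_standby(backup_gb: float, throughput_mb_s: float,
                  max_downtime_hours: float,
                  overhead_hours: float = 0.0) -> bool:
    """True if a plain restore would take longer than the tolerated downtime."""
    estimate = restore_hours(backup_gb, throughput_mb_s, overhead_hours)
    return estimate > max_downtime_hours


# Example: 450 GB of mail restored at 128 MB/s, with 1.5 h of setup work,
# for a service that may be down for at most 2 hours.
print(restore_hours(450, 128, overhead_hours=1.5))
print(needs_standby(450, 128, max_downtime_hours=2, overhead_hours=1.5))
```

If `needs_standby` comes out true, a backup alone does not meet the requirement and some form of duplicated service is worth considering.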

For example:
The backbone: if a simple switch is fried, you can simply replace it with another one… But what if it’s the router or the gateway? How do you continue to provide the services (DNS / DHCP / proxy / routing)?

I know the email service is not simple. When the server must be restored and it takes several hours… Your users are generally not very patient; they can’t wait long.
Sometimes it’s better to duplicate the service than to rely on a backup, or better still, to duplicate the service so that you have time to restore.
Putting all your eggs in one basket is dangerous.

I think having a backup MX is a very good start; without this, it’s better to stay with Gmail or Hotmail :devil:

As for backup MX it is very simple:

The backup MX itself is not difficult… The difficulty is the mailbox itself.
On one disk in a server? On a RAID in the server? On a cluster? How to back up? How to restore?
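As a reminder of why the backup MX part is the easy one: sending servers try MX hosts in order of preference value, lowest first, and move on to the next when one is unreachable. A minimal Python sketch of that ordering follows; the host names and preference values are hypothetical, and reachability is passed in as a plain set rather than tested over the network.

```python
def delivery_order(mx_records, reachable):
    """Return reachable MX hosts in the order a sender would try them.

    mx_records: iterable of (preference, hostname); lower preference wins.
    reachable:  set of hostnames that currently accept connections.
    """
    ordered = sorted(mx_records, key=lambda rec: rec[0])
    return [host for _, host in ordered if host in reachable]


# Hypothetical zone: primary MX at preference 10, backup MX at 20.
records = [(20, "backup.example.org"), (10, "mail.example.org")]

# Normal operation: the primary is tried first.
print(delivery_order(records, {"mail.example.org", "backup.example.org"}))
# Primary down: mail goes to the backup MX, which queues it.
print(delivery_order(records, {"backup.example.org"}))
```

The backup MX only holds the mail until the primary returns; it does not answer the harder question of where the mailboxes live.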

In general, “online” backup makes me laugh when I think about the time to restore…

For a home / SOHO / SME setup, I think a VPS can be a good solution to offer resilience and to buy time to restore the local instance when it is down.

I once had to deal with a Duplicity restore, and it was awful; that is why I created a topic about updating Duplicity, because it fails with SHA1 errors from time to time.

To correct one thing I wrote earlier:
dnsmasq has a primary and a secondary DNS for itself, and in DHCP you can set two DNS servers, primary and secondary, to hand out to clients. So this part is OK for HA :wink:
Let’s work on the rest :smile:

NethServer can offer:
DNS, NTP, MX (and backup MX with the link given before).

I don’t know whether software like SOGo can do some sort of load balancing by itself, or be installed on two different machines and act as a single instance.

In my view, NethServer lacks advanced routing, such as cost routes (dynamic routes) or even conditional routing; Quagga could be a really good implementation here, in such an HA plan with a VPS providing backup services.

Regarding HA, I just published a first review of documentation: http://docs.nethserver.org/en/latest/ha.html
The packages are not ready for the public yet; we will make a short announcement when everything is ready (I hope very soon).

I read the posts, but I won’t give my general opinion, since HA is a huge topic and it has many many aspects to keep in mind even before designing such a complex environment. :smile:

Edit: I would like to hear your thoughts on the new nethserver-ha package!

the master node (or primary node) runs all the service, meanwhile the slave node (or secondary node) takes over only if the master node fails. Both nodes share a DRBD storage in active-passive mode.
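To make the active-passive idea concrete, here is a toy Python model of the decision the secondary node has to make: stay passive while heartbeats from the master arrive, and take over after several consecutive misses. It is only a sketch of the logic under that assumption; the class name and threshold are invented, and a real cluster manager also handles fencing, storage promotion and service start-up, which this ignores.

```python
class PassiveNode:
    """Secondary node that takes over after too many missed heartbeats."""

    def __init__(self, max_missed: int = 3):
        self.max_missed = max_missed  # consecutive misses tolerated
        self.missed = 0               # current run of missed heartbeats
        self.active = False           # True once this node has taken over

    def tick(self, heartbeat_received: bool) -> bool:
        """Process one heartbeat interval and return whether we are active."""
        if heartbeat_received:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # Take over; in this sketch there is no automatic failback.
                self.active = True
        return self.active


node = PassiveNode(max_missed=3)
for beat in [True, False, False, False]:
    print(beat, node.tick(beat))
```

Requiring several misses in a row avoids a takeover on a single lost packet, at the price of a slightly longer outage before failover.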

Sounds very interesting to me. If I’m understanding correctly, only the mysql service is currently “clusterized/supported”, right? Or is it just an example? I’m just wondering whether we can try to clusterize other services with pcs, for example httpd or dnsmasq (as @jim and @sraellis already suggested).
By clusterizing httpd we could also support all web apps, such as ownCloud.
@Nas

It’s just an example; you can also put other services, such as httpd, in the cluster.

In theory yes, but it’s not a good choice. Since ownCloud already has its own replication method, you should go that way.

Just read this topic after being AWOL for a long time… still am, I think. :wink:

I think it would make a lot of sense enabling high availability, it would be a lot of help to those SMBs out there depending on NS. Microsoft has implemented this on DC, DFS, DNS (and DHCP to some extent since WinServer 2k12).

I’ve read somewhere before that it’s available on Linux, and moreover on RedHat/CentOS (I’m more into Ubuntu and have not put time into reading through it :slight_smile:), through Heartbeat / keepalived / linux-ha / pacemaker (oohhh… I’m just not so into these things, so pardon my ignorance if these are not the right modules / programs).

I’m not a developer, I have not tried it, and I do not know whether it’s possible. What I’m just saying is that it adds a layer of protection for businesses alongside backups. It will also help NS gain some ground against other UTMs (which do not offer HA).

Welcome back @vhinzsanchez
You’re right, we should do that, but I don’t know how doable this path is.
