Strange behavior with DNS server

I see some problems related to internal name resolution. My nethserver manages the whole network, including DNS. I have some internal servers that talk to each other using their respective FQDNs, which are configured in nethserver. For example: the server “server1.intdomain.corp” sends daily backups to “backupserver.intdomain.corp”.

Recently I installed DPI to prevent access to the BitTorrent network, and my servers started to fail. I think the problem is that, for some reason, nethserver treats the DNS queries of my internal servers as if they were destined for external DNS servers, which should be blocked by the firewall.

I draw this conclusion from what I see in the firewall logs, and because the “dig” and “nslookup” commands take too long to complete or fail.

firewall.log is filled with this:

Jul 31 11:12:23 gate kernel: Shorewall:em2_mac:DROP:IN=em2 OUT= MAC=00:26:b9:86:1f:e4:00:1e:67:52:54:4a:08:00 SRC=192.168.1.47 DST=192.168.1.1 LEN=68 TOS=0x00 PREC=0x00 TTL=64 ID=24869 DF PROTO=UDP SPT=39019 DPT=53 LEN=48 
Jul 31 11:12:23 gate kernel: Shorewall:em2_mac:DROP:IN=em2 OUT= MAC=00:26:b9:86:1f:e4:00:1e:67:52:54:4a:08:00 SRC=192.168.1.47 DST=192.168.1.1 LEN=68 TOS=0x00 PREC=0x00 TTL=64 ID=24870 DF PROTO=UDP SPT=39019 DPT=53 LEN=48 
Jul 31 11:12:28 gate kernel: Shorewall:em2_mac:DROP:IN=em2 OUT= MAC=00:26:b9:86:1f:e4:00:1e:67:52:54:4a:08:00 SRC=192.168.1.47 DST=192.168.1.1 LEN=68 TOS=0x00 PREC=0x00 TTL=64 ID=25622 DF PROTO=UDP SPT=39019 DPT=53 LEN=48 
Jul 31 11:12:28 gate kernel: Shorewall:em2_mac:DROP:IN=em2 OUT= MAC=00:26:b9:86:1f:e4:00:1e:67:52:54:4a:08:00 SRC=192.168.1.47 DST=192.168.1.1 LEN=68 TOS=0x00 PREC=0x00 TTL=64 ID=25623 DF PROTO=UDP SPT=39019 DPT=53 LEN=48
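
For anyone reading log lines like these: in netfilter log output the MAC= field is the Ethernet header, that is, the destination MAC (6 bytes), then the source MAC (6 bytes), then the EtherType (08:00 for IPv4). The following is a minimal sketch that pulls the chain name and the source MAC out of one of the lines above. The helper name parse_drop is hypothetical and not part of nethserver or Shorewall.

```python
import re

# Hypothetical helper for reading Shorewall drop lines; not part of any tool.
LOG_RE = re.compile(
    r"Shorewall:(?P<chain>[^:]+):(?P<action>[A-Z]+):"
    r"IN=(?P<iface>\S*) .*?MAC=(?P<mac>[0-9a-fA-F:]+) "
    r"SRC=(?P<src>\S+) DST=(?P<dst>\S+) .*?PROTO=(?P<proto>\S+) "
    r"SPT=(?P<spt>\d+) DPT=(?P<dpt>\d+)"
)


def parse_drop(line):
    """Return the interesting fields of a Shorewall log line, or None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    # MAC= is the Ethernet header: 6 bytes destination, 6 bytes source,
    # 2 bytes EtherType.
    octets = m.group("mac").lower().split(":")
    return {
        "chain": m.group("chain"),
        "action": m.group("action"),
        "iface": m.group("iface"),
        "dst_mac": ":".join(octets[0:6]),
        "src_mac": ":".join(octets[6:12]),
        "src": m.group("src"),
        "dst": m.group("dst"),
        "proto": m.group("proto"),
        "dpt": int(m.group("dpt")),
    }


# First log line from the post, split only for readability.
line = (
    "Jul 31 11:12:23 gate kernel: Shorewall:em2_mac:DROP:IN=em2 OUT= "
    "MAC=00:26:b9:86:1f:e4:00:1e:67:52:54:4a:08:00 SRC=192.168.1.47 "
    "DST=192.168.1.1 LEN=68 TOS=0x00 PREC=0x00 TTL=64 ID=24869 DF "
    "PROTO=UDP SPT=39019 DPT=53 LEN=48"
)
info = parse_drop(line)
print(info["chain"], info["src"], info["src_mac"], info["dpt"])
```

The source MAC extracted this way is the value worth comparing against whatever MAC is registered for 192.168.1.47.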

And the “dig” command sometimes fails:

root@nube:~# dig respaldo.fmorales.vfmsa

; <<>> DiG 9.9.5-3ubuntu0.15-Ubuntu <<>> respaldo.fmorales.vfmsa
;; global options: +cmd
;; connection timed out; no servers could be reached

All my internal devices (servers, phones, desktops, etc) are configured statically with the DNS pointing to 192.168.1.1 (my nethserver). I don’t see why the nethserver is blocking these DNS queries.

I think this problem is not DPI’s fault. I think it is a very old problem in my system, because before I enabled DPI I saw the same problem whenever the internet went down. But now that external DNS queries are effectively blocked, I have the issue all the time.

Can someone take a look at this and tell me if the problem is with nethserver or with something in my network?

Thank you.

What is mac? Did you define a custom zone with relevant firewall rules?

You could rule out ndpi by temporarily disabling the bittorrent rule.

I don’t think I have a custom zone.

I don’t know what “mac” means. I thought it was part of the interface name or something.

I know that if I disable DPI the problem will go away, or at least it will look like it. But if, for example, I disconnect the server from the ISP (turn off the internet), I will have the same issues, because (and this is my assumption) the internal servers won’t be able to make DNS queries when trying to reach other internal servers, which is not logical.

As an example, the backup script on one of my servers reports that it can’t resolve the backup server:

ssh: Could not resolve hostname respaldo.fmorales.vfmsa: Temporary failure in name resolution
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.0]

em2_mac… From Shorewall FAQ:

interface_mac or interface_rec
The packet is being logged under the maclist interface option.

This means that IP/MAC binding is enabled. I have never used it, but I think it could be the source of the problem. Could you please try disabling it? Meanwhile, I will try to reproduce the problem.

Yes, disabling IP/MAC binding seems to fix the problem.

I’m not able to test right now whether the problem occurs when I disconnect the internet. I will test it tonight.

What could be the problem with IP/MAC binding? It is a nice feature. I would like to be able to use it.

I don’t know, but if we could confirm that it is the source of the problems, we can investigate further.
We need to reproduce the problem.

I did some tests and I think I found something.

I noticed that the problem occurs only from lxc containers or KVM VMs which are behind a bond, in this case a bond configured as balance-alb.

I noticed that the desktops are not having the problem. At first I thought this was because firefox is using the nethserver as a proxy. But the nextcloud client also has no problems connecting to the internal nextcloud server (which is not on the nethserver).

I have a proxmox server running various VMs and containers. Most of them are behind a bond, but the ones that aren’t don’t have the issue.

So, the problem occurs when IP/MAC binding is enabled, and only for machines that are behind a bond, which in my case is balance-alb.

I would rather remove the bond than the binding to fix this. But I’d like to know whether this might be a bug.

AFAIK, with a bond in balance-alb mode the machine will switch MAC address based on traffic volume.
This would probably mean that we can’t use IP/MAC binding with it.
The feature is based on shorewall-maclist, where we can set more than one IP for each MAC.
You could try binding both MACs to the same IP.
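
To illustrate the reasoning above, here is a toy model of an IP/MAC binding check. It is not Shorewall's implementation and makes several assumptions: the function name is_allowed is hypothetical, the two MAC addresses are made-up placeholders for the bond's slave interfaces, and unlisted hosts are simply rejected. It only shows why a host that sends from two MACs fails when a single MAC is bound, and passes when both are.

```python
# Toy model of an IP/MAC binding check. NOT Shorewall's actual code.
def is_allowed(bindings, src_ip, src_mac):
    """Accept a packet only if its source MAC is bound to its source IP."""
    allowed_macs = bindings.get(src_ip, set())
    return src_mac.lower() in allowed_macs


# Made-up MACs standing in for the two slave interfaces of a bond.
MAC_SLAVE_A = "02:00:00:00:00:01"
MAC_SLAVE_B = "02:00:00:00:00:02"
HOST_IP = "192.168.1.47"

# Only one MAC bound: packets sent from the other slave are refused.
single = {HOST_IP: {MAC_SLAVE_A}}
# Both MACs bound to the same IP: packets from either slave are accepted.
both = {HOST_IP: {MAC_SLAVE_A, MAC_SLAVE_B}}

for name, table in (("single", single), ("both", both)):
    results = [is_allowed(table, HOST_IP, mac)
               for mac in (MAC_SLAVE_A, MAC_SLAVE_B)]
    print(name, results)
```

Whether the nethserver interface accepts two bindings for the same IP is not confirmed here, so treat this as a picture of the idea rather than a tested configuration.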

That’s what I thought.

I have removed the bond and distributed my VMs between the interfaces. The bond has caused me other problems in the past. I think it is because I’m using an unmanaged switch. I suppose this wouldn’t happen with more advanced network hardware (a managed switch).

Everything is working fine now. I have IP/MAC binding enabled. DPI is also enabled and blocking BitTorrent.

Thank you.
