I’m working on the Shorewall script that handles WAN up/down events.

Previously I made changes to LSM (link monitoring), but in tests at my clients’ sites they did not have the expected effect.

Reviewing it today, I found that NethServer’s Shorewall script is very different from the original script by Mika Ilmaranta and Tuomo Soini.

On Monday I will do on-site testing at my client’s by disconnecting the ISP and verifying that everything works properly.

Regards

Hi, so do you want to disable LSM and write something new, or improve LSM? What do you want to achieve: balancing or failover?

LSM works well, but I’m comparing the original script with NethServer’s version.

I’ll be doing some tests at clients with more than two ISPs to validate its operation.

The first tests gave me good results, which means that LSM is doing its job.

The script checks the state of each ethX interface and, depending on the value, runs the corresponding shorewall command.

Regards.

How about PPPoE links?

This week I will get a USB ISP device and try your settings :coffee:

Test results:

ISP disconnection log

> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: link aba21 down event
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: name = aba21, replied = 84, waiting = 16, timeout = 15, timeout max = 15, late reply = 0, cons rcvd = 0, cons wait = 2, cons miss = 2, cons miss max = 5, avg_rtt = 217.044, seq = 918, status = down
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: seq *
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: used 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: wait 1000110011100001110000000000000000000000000000000000000000000100000000000000001000000001000000001111
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: replied 0111001100011110001111111111111111111111111111111111111111111011111111111111110111111110111111110000
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: timeout 1000110011100001100000000000000000000000000000000000000000000100000000000000001000000001000000001111
> Jun 28 12:36:30 GOAFE-FIREWALL lsm[1417]: error 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> Jun 28 12:36:30 GOAFE-FIREWALL esmith::event[11992]: Event: wan-uplink-update down aba21 8.8.4.4 eth2 root 84 16 15 0 0 2 2 217044 192.168.1.60 up 1435511190
> Jun 28 12:36:30 GOAFE-FIREWALL esmith::event[11992]: Action: /etc/e-smith/events/wan-uplink-update/S50nethserver-shorewall-wan-update SUCCESS [0.335153]
> Jun 28 12:36:30 GOAFE-FIREWALL esmith::event[11992]: Event: wan-uplink-update SUCCESS

> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: link aba31 down event
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: name = aba31, replied = 92, waiting = 8, timeout = 7, timeout max = 7, late reply = 0, cons rcvd = 0, cons wait = 7, cons miss = 7, cons miss max = 7, avg_rtt = 66.039, seq = 219, status = down
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: seq *
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: used 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: wait 0000000000011111111000000000000000000000000000000000000000000000000000000000000000000000000000000000
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: replied 1111111111100000000111111111111111111111111111111111111111111111111111111111111111111111111111111111
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: timeout 0000000000011111110000000000000000000000000000000000000000000000000000000000000000000000000000000000
> Jun 28 12:24:40 GOAFE-FIREWALL lsm[1417]: error 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> Jun 28 12:24:40 GOAFE-FIREWALL esmith::event[3237]: Event: wan-uplink-update down aba31 8.8.8.8 eth3 root 92 8 7 0 0 7 7 66039 192.168.2.60 up 1435510480
> Jun 28 12:24:41 GOAFE-FIREWALL esmith::event[3237]: Action: /etc/e-smith/events/wan-uplink-update/S50nethserver-shorewall-wan-update SUCCESS [0.260251]
> Jun 28 12:24:41 GOAFE-FIREWALL esmith::event[3237]: Event: wan-uplink-update SUCCESS
> Jun 28 12:24:43 GOAFE-FIREWALL kernel: eth2: link down
> Jun 28 12:24:45 GOAFE-FIREWALL kernel: eth3: link down
> Jun 28 12:24:46 GOAFE-FIREWALL kernel: eth2: link up, 100Mbps, full-duplex, lpa 0xC5E1
> Jun 28 12:24:49 GOAFE-FIREWALL kernel: eth3: link up, 100Mbps, full-duplex, lpa 0xC5E1

I made my changes to the LSM configuration and to S50nethserver-shorewall-wan-update:

  • LSM File

# ================= DO NOT MODIFY THIS FILE =================

# 
# Manual changes will be lost when this file is regenerated.
#
# Please read the developer's guide, which is available
# at https://dev.nethesis.it/projects/nethserver/wiki/NethServer
# original work from http://www.contribs.org/development/
#
# Copyright (C) 2013 Nethesis S.r.l. 
# http://www.nethesis.it - support@nethesis.it
# 
#
# Debug level: 0 .. 8 are normal, 9 gives lots of stuff and 100 doesn't
# bother to detach
#
#debug=10
#debug=9
debug=8
#
# Defaults for the connection entries
#
defaults {
  name=defaults
  checkip=127.0.0.1
  eventscript=/usr/libexec/nethserver/lsm-wan-link-update
  notifyscript=
  max_packet_loss=15
  max_successive_pkts_lost=7
  min_packet_loss=5
  min_successive_pkts_rcvd=10
  interval_ms=1000
  timeout_ms=1000
  warn_email=root
  check_arp=0
  sourceip=
# if using ping probes for monitoring only then defaults should
# not define a default device for packets to autodiscover their path
# to destination
  device=eth0
# use system default ttl
# ttl=0
# assume initial up state at lsm startup (1 = up, 0 = down, 2 = unknown (default))
  status=1
}
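
Reading the two log entries above against these defaults, link aba21 went down with `timeout = 15` (equal to `max_packet_loss=15`) and link aba31 went down with `cons miss = 7` (equal to `max_successive_pkts_lost=7`). A minimal sketch of that decision follows, assuming lsm marks a link down when either threshold is reached. The function name `link_should_go_down` is hypothetical and this is only an interpretation of the logs, not lsm's actual code.

```shell
#!/bin/bash
# Hypothetical helper, NOT part of lsm: it only mirrors what the two
# log entries in this thread suggest about the "down" decision.
#   $1 = packets timed out in the current window ("timeout" in the log)
#   $2 = consecutive missed replies ("cons miss" in the log)
#   $3 = max_packet_loss threshold (defaults to 15, as in the config above)
#   $4 = max_successive_pkts_lost threshold (defaults to 7, as in the config above)
link_should_go_down() {
    local timeouts=$1 cons_miss=$2
    local max_packet_loss=${3:-15} max_successive_pkts_lost=${4:-7}
    [ "$timeouts" -ge "$max_packet_loss" ] ||
        [ "$cons_miss" -ge "$max_successive_pkts_lost" ]
}

# Values taken from the aba21 and aba31 log entries.
link_should_go_down 15 2 && echo "aba21: down (packet loss threshold)"
link_should_go_down 7 7 && echo "aba31: down (successive loss threshold)"
```

With `interval_ms=1000`, seven consecutive misses correspond to roughly seven seconds of silence before the second condition can trigger.
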
  • S50nethserver-shorewall-wan-update File
    #!/bin/bash
    #
    # (C) 2009,2013 Mika Ilmaranta <ilmis@nullnet.fi>
    # Copyright © 2009-2010 Tuomo Soini <tis@foobar.fi>
    #
    # License: GPLv2
    #
    #
    # event handling script for use with shorewall multi-isp setup
    # To be able to utilize this script you must have shorewall >= 4.4.23.3
    #
    # Drop the first argument (the event name, wan-uplink-update, as seen in
    # the log) so the remaining arguments line up with the lsm parameters.
    shift
    STATE=${1}
    NAME=${2}
    CHECKIP=${3}
    DEVICE=${4}
    WARN_EMAIL=${5}
    REPLIED=${6}
    WAITING=${7}
    TIMEOUT=${8}
    REPLY_LATE=${9}
    CONS_RCVD=${10}
    CONS_WAIT=${11}
    CONS_MISS=${12}
    AVG_RTT=${13}
    SRCIP=${14}
    PREVSTATE=${15}
    TIMESTAMP=${16}
    DATE=$(/bin/date --date="@${TIMESTAMP}")
    # Map the lsm state to the status file value and the firewall action.
    if [ "${STATE}" = up ]; then
        state=0
        action=enable
    else
        state=1
        action=disable
    fi
    VARDIR=$(/usr/sbin/shorewall show vardir)
    echo "${state}" > "${VARDIR:-/var/lib/shorewall}/${DEVICE}.status"
    # Enable or disable the provider; fall back to a full restart if that fails.
    bash "${VARDIR:-/var/lib/shorewall}/firewall" "${action}" "${DEVICE}" \
        >> /var/log/lsm 2>&1 \
        || bash "${VARDIR:-/var/lib/shorewall}/firewall" restart >> /var/log/lsm 2>&1
    /usr/sbin/shorewall show routing >> /var/log/lsm
    exit 0
    #EOF
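
To make the positional parameters easier to follow, here is a small sketch that feeds the arguments from the `Event: wan-uplink-update down aba31 ...` log line through the same `shift` and numbering the script uses. The function name `describe_event` is hypothetical and exists only for illustration.

```shell
#!/bin/bash
# Hypothetical illustration: maps the event arguments seen in the log
# to the variable names used by the handler script.
describe_event() {
    shift                      # drop the event name (wan-uplink-update)
    local STATE=$1 NAME=$2 CHECKIP=$3 DEVICE=$4
    local SRCIP=${14} PREVSTATE=${15}
    echo "${NAME} on ${DEVICE}: ${PREVSTATE} -> ${STATE} (checkip ${CHECKIP}, source ${SRCIP})"
}

# Arguments copied from the aba31 log entry in this thread.
describe_event wan-uplink-update down aba31 8.8.8.8 eth3 root \
    92 8 7 0 0 7 7 66039 192.168.2.60 up 1435510480
```

For that entry, the handler sees `STATE=down` and `DEVICE=eth3`, so it writes `1` to `eth3.status` and calls the compiled firewall script with `disable eth3`.
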

I’m using the original script, adapted to NethServer paths.

Try it if you have more than one ISP.

Regards :coffee:

Great work, but maybe a firewall refresh would be enough?

Thanks @Nas, but I just checked the scripts and how they operate. It was not much work :smile:

Really, the script does the refresh, or rather, it validates that the ISPs are up.

For companies that do not have an IT department it is essential that this works perfectly; otherwise they will be left with Internet connection problems.

Regards

@jgjimenezs, if I understand the modifications, you lowered the interval between ping checks to 1 second to make the link status detection “faster”, and you restart the firewall on link status change.

Concerning the timeout values, we could make them adjustable, but if the user sets arbitrary values lsm could become useless.

Regarding the firewall restart, AFAIK it’s not needed. You could verify this by saving a snapshot of the firewall with the standard script and another with your modified script, then comparing the two.
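
One possible way to take such snapshots is to dump the ruleset with `iptables-save` after a link down/up cycle under each script and diff the results. The sketch below assumes that is what is meant by a snapshot. The function name `normalize_ruleset` and the file names under `/tmp` are hypothetical. The function only strips the timestamp comments and packet counters that differ between any two dumps.

```shell
#!/bin/bash
# Hypothetical helper: reads iptables-save output on stdin and removes
# the parts that change on every run (comment lines with timestamps and
# the [packets:bytes] counters), so two snapshots can be diffed.
normalize_ruleset() {
    grep -v '^#' | sed -E 's/\[[0-9]+:[0-9]+\]/[0:0]/g'
}

# Possible usage as root (file names are only examples):
#   iptables-save | normalize_ruleset > /tmp/fw-standard.txt
#   ... repeat the link down/up test with the modified script ...
#   iptables-save | normalize_ruleset > /tmp/fw-modified.txt
#   diff -u /tmp/fw-standard.txt /tmp/fw-modified.txt
# The routing side can be captured the same way with
#   ip rule show; ip route show table all
```

If the two files are identical, the extra restart is not changing the resulting ruleset.
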

Could you please describe the problems you had with the standard script? Could you share your configuration?

Oops, yes. Those values were for testing, to find an appropriate setting.

With the NethServer script, when the ISP connection is lost the link goes DOWN and does not change back to UP.

I run shorewall restart to fix it.

I think that after one of the WAN connections is restored, traffic does not go through it, and after restarting the firewall everything is OK. That is my point of view!

Maybe it would be better to make a post-up script that restarts the firewall :slight_smile:

I’ve tested WAN up and down events a lot and I never saw your problem.
I think you have a configuration error. The most frequent error is forcing a checkip value that’s unreachable or using the same IP on both connections.
Forgive me if I’m starting from the basics, but please show us, if you can, your provider config (db networks show).

That’s right, I have it configured with Google’s DNS. Does that mean that if I set a different check IP for each WAN to verify the connection, there would be problems?

You could let the system auto-discover the checkips; that would be fine. Otherwise, set them differently for each provider, keeping in mind that when a link is down the checkip will not be reachable.
You could use 8.8.8.8 for provider1 and 8.8.4.4 for provider2.
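
As a quick sanity check for the "same IP on both connections" mistake, the following sketch lists any check IP that is assigned to more than one provider. The function name `find_duplicate_checkips` and the input format (one `provider checkip` pair per line) are assumptions made for illustration; it does not read the NethServer database.

```shell
#!/bin/bash
# Hypothetical helper: reads "provider checkip" pairs on stdin and
# prints every check IP that appears more than once.
find_duplicate_checkips() {
    awk '{ count[$2]++ } END { for (ip in count) if (count[ip] > 1) print ip }' | sort
}

# Example with a misconfiguration: two providers share 8.8.8.8.
printf '%s\n' 'provider1 8.8.8.8' 'provider2 8.8.8.8' 'provider3 8.8.4.4' |
    find_duplicate_checkips
```

An empty result means every provider has its own check IP.
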

This customer has 3 WANs: provider1 uses 8.8.4.4, provider2 uses 8.8.8.8, and provider3 uses another DNS server.

I will keep the original NethServer scripts and configure the check IPs as you say.

Thanks @filippo_carletti