Howto install Zabbix 3.4

testing
v7

(André Wismer) #61

@mrmarkuz, @syntaxerrormmm

I’m using MrMarkuz’s packaged Zabbix - so far working excellently.
Since one of the last updates, none of my systems reports the backup correctly. The backup itself, however, completes correctly.

Any ideas?

Most likely it has something to do with changed reporting in the update of the backup module…

Thanks!


(Markus Neuberger) #62

Yes, it seems like the backup-data log file that’s used by the script has changed. It’s now located in /var/log/backup/backup-backup-data-TIMESTAMP.log and has a slightly different format. The backup-script now takes the latest file to check if the backup time is ok and if it contains SUCCESS.

For now I adapted the script of @syntaxerrormmm with my weak Python skills. It should work, but please check and improve it where needed.

Change /usr/local/bin/nethbackup_check.py to this:

#!/usr/bin/env python
# vim:sts=4:sw=4
# encoding: utf-8

import datetime, re, sys
import glob
import os

list_of_files = glob.glob('/var/log/backup/backup-backup-data-*.log')
latest_file = max(list_of_files, key=os.path.getctime)


BACKUPTYPE = {
    'Data':     latest_file,
    'Config':   '/var/log/backup-config.log'
}

def backup_check(backuptype, validity):
    # get line with time
    f = open(BACKUPTYPE[backuptype])
    timeline = f.readlines()[-6]
    f.close()

    # get line with status, hopefully success
    f = open(BACKUPTYPE[backuptype])
    successline = f.readlines()[-7]
    f.close()

    # Splitting the lines once read
    timeline_arr = str.split(timeline)
    successline_arr = str.split(successline)

    # Extract the date
    check = datetime.datetime.strptime(timeline_arr[3], '%Y-%m-%d').date()
    end = datetime.date.today()
    start = end - datetime.timedelta(days = int(validity))

    # Verifies the status of the last backup
    if start <= check <= end and re.match(r'SUCCESS', successline_arr[2]):
        return 1

    return 0

if __name__ == '__main__':
    print(backup_check(sys.argv[1], sys.argv[2]))

Please test and adapt, if it works I’ll add it to the module.
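
Two spots in the script above are worth hardening while testing. `max()` raises `ValueError` if no data log exists yet, and the log file is opened twice to read two neighbouring lines. The following is only a sketch of how that could look, not part of the packaged module. The helper names `latest_log` and `tail_lines` are made up for illustration, and it uses the modification time (`os.path.getmtime`) instead of the `os.path.getctime` used in the script.

```python
import glob
import os


def latest_log(pattern):
    """Return the most recently modified file matching pattern, or None."""
    candidates = glob.glob(pattern)
    if not candidates:
        # max() on an empty list raises ValueError, so bail out early.
        return None
    return max(candidates, key=os.path.getmtime)


def tail_lines(path, count):
    """Read the file once and return its last `count` lines."""
    with open(path) as handle:
        return handle.readlines()[-count:]
```

In the script, `latest_log('/var/log/backup/backup-backup-data-*.log')` would replace the two module-level lines, and `backup_check` could return 0 straight away when it gets `None`. With `tail_lines(path, 7)` the status line is element 0 and the time line is element 1 of the result, which corresponds to the `[-7]` and `[-6]` indexes above.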


(André Wismer) #63

@mrmarkuz
It’s on my home server in TESTING !!!

Thx, will give feedback when the check has run (ca. 09:00) and both checks are successful…

Andy


(Markus Neuberger) #64

Ooops, it seems the backup-config logfile has changed too, so for now only the backup-data check is working…
Thanks for testing!

EDIT:

I adapted the script /usr/local/bin/nethbackup_check.py for testing, now with a working backup-config check. I am afraid I will have to rewrite it, because the original logic assumes log files with the same format, and the config backup is now checked via /var/log/messages.

#!/usr/bin/env python
# vim:sts=4:sw=4
# encoding: utf-8

import datetime, re, sys
import glob
import os
import subprocess

list_of_files = glob.glob('/var/log/backup/backup-backup-data-*.log')
latest_file = max(list_of_files, key=os.path.getctime)


BACKUPTYPE = {
    'Data':     latest_file,
    'Config':   '/var/log/messages'
}

def backup_check(backuptype, validity):
    if backuptype == 'Data':
        # get line with time
        f = open(BACKUPTYPE[backuptype])
        timeline = f.readlines()[-6]
        f.close()

        # get line with status, hopefully success
        f = open(BACKUPTYPE[backuptype])
        successline = f.readlines()[-7]
        f.close()

        # Splitting the lines once read
        timeline_arr = str.split(timeline)
        successline_arr = str.split(successline)

        # Extract the date
        check = datetime.datetime.strptime(timeline_arr[3], '%Y-%m-%d').date()
        end = datetime.date.today()
        start = end - datetime.timedelta(days = int(validity))

        # Verifies the status of the last backup
        if start <= check <= end and re.match(r'SUCCESS', successline_arr[2]):
            return 1

    if backuptype == 'Config':
        cmd = ["""grep 'post-backup-config SUCCESS' /var/log/messages | tail -1"""]
        output = subprocess.check_output(cmd,shell=True)
        # Splitting the lines once read
        line_arr = str.split(output)

        # From last line I will also extract the date
        check = datetime.datetime.strptime(line_arr[0] + " " + line_arr[1] + " " + str(datetime.datetime.now().year), '%b %d %Y').date()
        end = datetime.date.today()
        start = end - datetime.timedelta(days = int(validity))
        # Verifies the status of the last backup
        if start <= check <= end:
            return 1

    return 0

if __name__ == '__main__':
    print(backup_check(sys.argv[1], sys.argv[2]))
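
One thing to watch in the Config branch: /var/log/messages timestamps carry no year, so the script appends the current year. A backup logged on Dec 31 and checked on Jan 1 would then be parsed as a date in the future and fail the `start <= check <= end` test. A minimal sketch of one way around that is below. The function name `syslog_date` and its `today` parameter are hypothetical, `%b` depends on an English/C locale, and leap-day edge cases are not handled.

```python
import datetime


def syslog_date(month_abbr, day, today=None):
    """Turn a year-less syslog date such as ('Oct', '6') into a date."""
    today = today or datetime.date.today()
    text = '%s %s %d' % (month_abbr, day, today.year)
    parsed = datetime.datetime.strptime(text, '%b %d %Y').date()
    if parsed > today:
        # A date in the future must belong to the previous year.
        parsed = parsed.replace(year=parsed.year - 1)
    return parsed
```

In the script this would be called as `syslog_date(line_arr[0], line_arr[1])` in place of the inline `strptime` call.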

(André Wismer) #65

@mrmarkuz

Hi
Seems your Python capabilities are understated… :wink:
The check seems to work - at least for the Data Backup part, as stated.
The Config Backup still needs to be adapted.

Maybe this evening I’ll find time to go over the code…
Just had an emergency call from my client (a hotel): a conference room with over 50 people attending a session on “Challenges of Digitization” - and the Internet does not work! (Murphy does such things!)

In the end it was a half-year-old High Speed USB disk acting as the firewall HD. On ANY system, formatting would work - till ca. 50%. Then dead! Sandisk usually make good quality stuff, and the firewall wasn’t doing much writing on there…

Sh"t happens, as the saying goes… It’s working again, had to buy a new USB Stick…

Thanks!
Andy


(André Wismer) #66

@mrmarkuz
@syntaxerrormmm

Hi

I spoke too soon about the backup check… The returned data (Latest data) still indicates a negative check…

I wasn’t able to debug that script yet, but it’s still on my to-do list. Maybe you or syntaxerrormmm could have another look…

:wink:

Thx
Andy


(Emiliano Vavassori) #67

Guys, thanks for all the efforts trying to cope with updates.
I have a lot of NSs reporting failed backups in our monitoring system (which is clearly not the case), so I am affected by the update too.

I was working on a new version of the script last Friday, so stay tuned :wink: Just hope our customers don’t ask for the moon in the meantime :smile:


(André Wismer) #68

@syntaxerrormmm

This is the sort of thing when “Upstream” changes something in the middle of the game…
It’s ok and fine with me if such changes happen in Major Upgrades - not in minor updates…

I do remember when I was still using SME-Server - and RH decided to change the encoding for the Samba part from ISO8859-1 to UTF-8 or something like that a few years back…
OK, so what happens to all users who had valid passwords in the database (LDAP/AD)? No one can log in any more!
Great!

I think, on that particular day, someone “Upstream” didn’t turn on their brain in the morning!

Not the stuff to start a day in IT…
:wink:

Keep up the great work, syntaxerrormmm !!!

Andy


(Emiliano Vavassori) #69

I worked on a complete rewrite and:
1 - I am not completely satisfied with the result (much spaghetti code, a lot of repetition);
2 - It does not support multiple backup jobs yet (only ‘Config’ and ‘Data’ can be passed for the check);
3 - It should be less change-prone, since it checks /var/log/messages instead of the single backup file. Now it only depends on the syntax of the SUCCESS/FAILURE line;
4 - Because it needs access to /var/log/messages, it now has to be run with sudo in the userparameters (at least if you run your Zabbix system with a user other than root).

You can find the updated files in the latest commit of the previously reported repo on GitHub.
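
To illustrate point 3, here is a rough sketch of what "only depends on the SUCCESS/FAILURE line" can mean in practice. This is not the code from the repository. The function name `last_status` is made up, `post-backup-config` is taken from the grep shown earlier in this thread, and `post-backup-data` is only assumed by analogy and may differ on a real system.

```python
import re

# 'post-backup-config' appears in the grep used earlier in this thread;
# 'post-backup-data' is assumed by analogy and may differ on a real system.
EVENTS = {'Config': 'post-backup-config', 'Data': 'post-backup-data'}


def last_status(lines, backuptype):
    """Return 'SUCCESS', 'FAILURE' or None for the newest matching line."""
    pattern = re.compile(
        re.escape(EVENTS[backuptype]) + r'\s+(SUCCESS|FAILURE)\b')
    status = None
    for line in lines:
        match = pattern.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last match wins.
            status = match.group(1)
    return status
```

It accepts any iterable of lines, so an open file handle on /var/log/messages can be passed directly. Reading that file is what makes sudo necessary when the agent does not run as root.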


(Markus Neuberger) #70

Zabbix 4.0 LTS is here!

https://www.zabbix.com/life_cycle_and_release_policy
https://www.zabbix.com/release_notes

It’s working in first tests but I did not test an update. Don’t test in production.

Installation instructions:

https://wiki.nethserver.org/doku.php?id=zabbix#zabbix_repo

I am going to test and integrate the script of @syntaxerrormmm, if someone has already tested it please report…


#71

great! :raised_hands:
Just a quick test… I’ve updated an almost clean install of 3.4 (only a discovery rule was set). It worked and I didn’t see any errors in the logs.
tnx!


(Markus Neuberger) #72

That is great news! Thanks for testing!


(Alessio Fattorini) #73

Huge work here. Thanks for that.


(fpausp) #74

Updated to 4.0:

[root@nethmon01 ~]# tail -n 20 /var/log/zabbix/zabbix_agentd.log
   949:20181006:140012.176 **** Enabled features ****
   949:20181006:140012.176 IPv6 support:          YES
   949:20181006:140012.176 TLS support:           YES
   949:20181006:140012.176 **************************
   949:20181006:140012.176 using configuration file: /etc/zabbix/zabbix_agentd.conf
   949:20181006:140012.176 agent #0 started [main process]
   951:20181006:140012.180 agent #1 started [collector]
   952:20181006:140012.180 agent #2 started [listener #1]
   953:20181006:140012.182 agent #3 started [listener #2]
   954:20181006:140012.182 agent #4 started [listener #3]
   955:20181006:140012.191 agent #5 started [active checks #1]
   955:20181006:140012.195 active check configuration update from [127.0.0.1:10051] started to fail (cannot connect to [[127.0.0.1]:10051]: [111] Connection refused)
   955:20181006:140112.229 active check configuration update from [127.0.0.1:10051] is working again
   955:20181006:140112.229 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:140312.249 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:140512.267 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:140712.283 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:140912.300 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:141112.318 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   955:20181006:141312.336 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found

[root@nethmon01 ~]# tail -n 20 /var/log/zabbix/zabbix_agentd.log-20180923
   953:20180923:215059.714 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:215259.734 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:215459.753 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:215659.772 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:215859.791 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:220059.810 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:220259.829 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:220459.846 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:220659.861 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:220859.879 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:221059.895 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:221259.911 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:221459.929 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:221659.946 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:221859.962 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:222059.978 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:222259.994 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:222459.014 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:222659.029 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found
   953:20180923:222859.044 no active checks on server [127.0.0.1:10051]: host [Zabbix] not found

[root@nethmon01 ~]# tail -n 20 /var/log/zabbix/zabbix_server.log
  1562:20181006:140023.214 server #23 started [trapper #1]
  1563:20181006:140023.215 server #24 started [trapper #2]
  1564:20181006:140023.218 server #25 started [trapper #3]
  1568:20181006:140023.218 server #29 started [alert manager #1]
  1549:20181006:140023.219 server #13 started [escalator #1]
  1567:20181006:140023.222 server #28 started [icmp pinger #1]
  1550:20181006:140023.222 server #14 started [proxy poller #1]
  1569:20181006:140023.223 server #30 started [preprocessing manager #1]
  1571:20181006:140023.223 server #32 started [preprocessing worker #2]
  1566:20181006:140023.225 server #27 started [trapper #5]
  1570:20181006:140023.300 server #31 started [preprocessing worker #1]
  1572:20181006:140023.300 server #33 started [preprocessing worker #3]
  1565:20181006:140112.229 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1562:20181006:140312.249 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1564:20181006:140512.266 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1563:20181006:140712.283 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1565:20181006:140912.300 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1565:20181006:141112.318 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1564:20181006:141312.336 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1566:20181006:141512.353 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found

[root@nethmon01 ~]# tail -n 20 /var/log/zabbix/zabbix_server.log-20180923
  1290:20180923:215059.713 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1290:20180923:215259.734 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1291:20180923:215459.753 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1295:20180923:215659.772 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1290:20180923:215859.791 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1290:20180923:220059.810 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:220259.829 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1291:20180923:220459.845 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:220659.861 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:220859.879 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:221059.895 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:221259.910 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:221459.929 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1291:20180923:221659.946 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1292:20180923:221859.962 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:222059.978 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:222259.994 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:222459.014 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:222659.029 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found
  1293:20180923:222859.044 cannot send list of active checks to "127.0.0.1": host [Zabbix] not found

(Markus Neuberger) #75

Thanks for testing. Did it work in general, except for these errors?

I am going to do some more testing this weekend. If I encounter a similar error I’ll report…

EDIT:

I have similar errors. It seems the Zabbix host is now called “Zabbix server” instead of “Zabbix”.

In /etc/zabbix/zabbix_agentd.conf change the hostname:

Hostname=Zabbix server

and restart the services:

systemctl restart zabbix-agent zabbix-server

I’ll update the module, thanks
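
The "host [Zabbix] not found" messages appear because the agent's Hostname has to match the host name configured on the server. To check which name an agent config actually sets, a small sketch like the following could help. The function name `agent_hostname` is hypothetical, and it simply returns the first uncommented `Hostname=` line it finds.

```python
def agent_hostname(conf_text):
    """Return the value of the first uncommented Hostname= line, or None."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith('#') or '=' not in line:
            continue  # skip comments, blank lines and anything without a key
        key, value = line.split('=', 1)
        if key.strip() == 'Hostname':
            return value.strip()
    return None
```

Called as `agent_hostname(open('/etc/zabbix/zabbix_agentd.conf').read())`, it should return `Zabbix server` after the change above.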


(André Wismer) #76

@mrmarkuz

Hi

Did an update of Zabbix from 3.4.x to 4.0.4 LTS on a “productive” Home-Server.
No errors in the Terminal during update, however the Web-Interface of Zabbix still shows 3.4.xx…
Any ideas? (Done the upgrade on two Zabbix Servers, both no problems but still showing the older Versions…)

Found it - need to update via Software-Center…

PS: Great work again!

Regards
Andy, now in Konstanz am Bodensee…