Zabbix and mrmarkuz repo testing

Andy_Wismer · March 25, 2020, 9:44am

All worked:

[Screenshot 2020-03-25 at 10.42.11]

I think it was less a yum issue; maybe the ce-base mirror repo.uk.bigstepcloud.com wasn’t up to date, or it was a combination of both…

But it works now.

Maybe adding a yum clean all in between both sequences would help.

I’m testing other servers soon…
Need a personal Java Beans update (=more coffee!)

mrmarkuz · March 25, 2020, 10:16am

I added one before the install/update command. That should do it.

Andy_Wismer · March 25, 2020, 10:32am

@mrmarkuz

Database vacuuming:

In Zabbix, housekeeping is set to only 30d…
Database size 13.35 GB (before vacuuming)

After vacuuming: still the same size…

Housekeeping seems to have an issue, I’ll need to look up the issues I found on a forum. Even specific Zabbix/PG clean up commands (so far) didn’t help…

I’ll post them here shortly…
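
One possible reason the size did not change: a plain VACUUM only marks dead rows as reusable inside the table files and generally does not hand the space back to the operating system, whereas VACUUM FULL rewrites the table but takes an exclusive lock and needs free disk space for the copy. A minimal sketch for seeing which tables hold the space, assuming the database is named zabbix and psql is run as the postgres user (the function name is made up for illustration):

```shell
#!/bin/bash
# Sketch only: print the SQL that lists the ten largest tables
# (table data + indexes + TOAST) in the connected database.
largest_tables_sql() {
    cat <<'SQL'
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
SQL
}

# Assumed usage (database name "zabbix" is an assumption):
#   largest_tables_sql | sudo -i -u postgres psql zabbix
```

On a Zabbix database the history and trends tables would be expected near the top of that list.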

Andy_Wismer · March 25, 2020, 10:38am

@mrmarkuz

Here are a few:
(In a rough order of relevance…)

https://www.intelics.com.br/2018/10/24/cleaning-up-the-zabbix-database-on-postgresql-server/

https://www.try2answer.com/24/03/2018/delete-old-data-from-zabbix-database/

Tried a few things, but still no success…

Note: This CAN and will cause issues with a clean shutdown.
I have two servers which need >20 minutes to shut down because of the oversized DB.
Now, if the UPS gives the shutdown command, the remaining battery time may not be sufficient…

michelandre · March 25, 2020, 12:46pm

Hi André,

Would it help to add a delay of 25 minutes in the DB service config file, so that it waits before shutting itself down? That would give it time to finish all its pending jobs.

ExecStop=/usr/bin/sleep 1500

Michel-André
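
A possible alternative to a fixed sleep, sketched under assumptions: systemd's TimeoutStopSec= sets how long a unit may take to stop before it is killed, so the database can use up to that much time without being held back once it has finished. The unit name postgresql.service, the drop-in file name and the helper function are assumptions for illustration; the actual unit on a given install may be named differently.

```shell
#!/bin/bash
# Sketch only: write a systemd drop-in that raises the stop timeout.
# $1 = drop-in directory, $2 = timeout in seconds (default 1500 = 25 min).
write_stop_timeout_dropin() {
    local dir="$1" seconds="${2:-1500}"
    mkdir -p "$dir" || return 1
    printf '[Service]\nTimeoutStopSec=%s\n' "$seconds" > "$dir/timeout.conf"
}

# Assumed usage (unit name "postgresql.service" is an assumption):
#   write_stop_timeout_dropin /etc/systemd/system/postgresql.service.d 1500
#   systemctl daemon-reload
```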

Andy_Wismer · March 25, 2020, 12:51pm

Hi

Thanks for the Tip, but as this is running in virtualization, I need to be sure that all VMs are shut down.

Prolonging the wait time (it often needs 45-60 mins) can’t be the solution, and neither can buying a larger UPS or an in-house nuclear power station…

I have already stretched the time as much as possible in Proxmox before a shutdown is enforced.

The Databases on two clients are really the problem…

It’s like trying to read a 30 GB large Logfile - a simple Textfile - on Windows (Or Linux, for that matter). It’s simply too big!

The entries came from a period when the provider or the Internet connection had a lot of issues; the in-wall wiring also had to be replaced after 20 years. The errors in monitoring added up…

But I need to clean it up!

pike · March 25, 2020, 2:51pm

Why is this file so large?
Why is there no rotation for that file?

michelandre · March 25, 2020, 3:56pm

Hi André,

I just received the latest Newsletter from Zabbix: https://blog.zabbix.com/zabbix-integration-with-big-data-systems-in-large-scale-environment/8844/?utm_source=email+marketing+Mailigen&utm_campaign=Mailigen&utm_medium=email

It describes a system in which the total number of installed devices is forecast to exceed 65,000 in the future.

It might contain some ideas for your system,

Michel-André

Andy_Wismer · March 25, 2020, 5:56pm

@michelandre
@pike

Hi Guys

Sorry for the delay in replying, was on a lengthy support call…

In the two cases involved, the massive data accumulation was due to wiring issues.

In one case, the network wiring in the wall was done by an electrician around 2000 (20 years ago).
What a lot of people forget is that in-wall wiring is nothing but a network cable hidden from view. It could be damaged, corroded, moist, broken… You don’t see it!

In this case, the wiring was done to Cat6 specs, i.e. 1 Gbit/s…
The quality must have decayed over time, which wasn’t noticed until the provider announced a 100% Internet speed upgrade. They have fiber and 300 Mbit/s now.

They complained to the provider that no speed increase was noticeable, and the provider told them to hook up a PC/notebook directly to the fiber modem. The speed was there!

My client asked me if the new, 3-month-old firewall was crap… I told him to hook up his PC/notebook to the firewall. The speed was there.

After the fiber modem, the wiring went into our OPNsense firewall, and from there to a wall socket leading to the server room.

And in the server room: no more speed. The wall cabling had deteriorated down to about 120 Mbit/s… A lot of runts and crap on the Ethernet due to defective cabling.

An electrician put in new wiring, all worked…

What we didn’t think about at the time was how much data had accumulated in Zabbix due to those abundant errors!

In the other case, something similar: a 35-year-old building…

Sh*t happens…

But still needs to be cleaned up…

The Networks have only about 10-15 PCs, are fairly small…

My 2 cents
Andy

mrmarkuz · March 25, 2020, 7:37pm

This one followed by this one should work to make the db smaller. It means deleting old entries, exporting and reimporting them to a new database. Did you already try?

Autopartitioning also looks interesting as a possible way to avoid performance problems.
Here is an adapted version.

@syntaxerrormmm do you have an idea how to optimize Zabbix/postgres when DBs grow big?

Andy_Wismer · March 25, 2020, 7:39pm

I did try a few options a short while before the lockdown, but after running them for a while and vacuuming the DB, it was still the same size.
I actually checked: the number of entries in the large tables wasn’t reduced by my queries…

Will try a bit later with the two you enclosed… The Backup is running right now, that’ll take an hour or two…

I should never have let the db grow that big, but I wasn’t aware of the fact…

Andy

Andy_Wismer · March 25, 2020, 8:11pm

Results:

https://www.try2answer.com/24/03/2018/delete-old-data-from-zabbix-database/

This is actually the only one referring to PGSQL, all others have MySQL syntax…

But this doesn’t actually do anything at all…

Next try…

mrmarkuz · March 25, 2020, 8:20pm

You need the second step I think. Clean, export and reimport to new DB.

https://www.intelics.com.br/2018/10/24/cleaning-up-the-zabbix-database-on-postgresql-server/

Here is a script I included to change the DB from ASCII to UNICODE. It does more or less the same, but as the postgres user and without the DB cleaning. Just to give you an idea:

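#!/bin/bash
set -e  # abort on the first error, e.g. if the dump fails
# Stop Zabbix so nothing writes to the DB during the migration
systemctl stop zabbix-server
# Dump the current DB to a plain SQL file
sudo -i -u postgres pg_dump zabbix > /tmp/pre-encoding-fix-backup.sql
# Keep the old DB under a new name as a fallback
sudo -i -u postgres psql -c "alter database zabbix rename to zabbix_pre_encoding_fix_backup"
# Create a fresh, empty DB with UNICODE encoding
sudo -i -u postgres psql -c "create database zabbix with encoding 'UNICODE' template=template0"
# Reimport the dump, reading it as SQL_ASCII
sudo -i -u postgres PGCLIENTENCODING=SQL_ASCII psql zabbix -f /tmp/pre-encoding-fix-backup.sql
systemctl start zabbix-server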

Andy_Wismer · March 25, 2020, 8:26pm

It’s running the dump right now…

But a 30+ GB database takes its time.

I can usually handle all forms of databases, but this feels like handling a several-GB MS Access file…
It really takes its time…

But it’s running now, the scripts are prepared and there’s space…

Andy_Wismer · March 25, 2020, 8:56pm

Hi

Partial success…

The script from intelics actually contains correct PgSQL syntax and reduced e.g. trends_uint by about 15%…

But it basically only cleans up orphaned stuff, not old stuff the housekeeper didn’t get…
But merging both web pages, I can get a result.
A bit of coding, adapting both pages…

But first a backup of the whole box, using the time to do a Java-Server check for Java Beans…
(Cleartext: Get another coffee from the Nespresso machine…)
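
Removing old rows that the housekeeper missed could look roughly like the sketch below. It assumes the standard Zabbix schema, where the history and trends tables carry a clock column holding a Unix timestamp; the function name, the retention values and the database name are illustrative only. It prints SQL rather than running it, so the statements can be reviewed first.

```shell
#!/bin/bash
# Sketch only: print DELETE statements for rows older than N days.
# $1 = retention in days, $2 = "now" as Unix time, remaining args = table names.
old_rows_delete_sql() {
    local days="$1" now="$2"
    shift 2
    local cutoff=$(( now - days * 86400 ))
    local table
    for table in "$@"; do
        printf 'DELETE FROM %s WHERE clock < %d;\n' "$table" "$cutoff"
    done
}

# Assumed usage (database name "zabbix" is an assumption; take a backup first):
#   old_rows_delete_sql 30 "$(date +%s)" history history_uint history_str \
#       history_text history_log | sudo -i -u postgres psql zabbix
#   old_rows_delete_sql 365 "$(date +%s)" trends trends_uint \
#       | sudo -i -u postgres psql zabbix
```

Deleting rows alone does not shrink the files on disk; that is what the export and reimport step is for.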

mrmarkuz · March 25, 2020, 9:07pm

I think we are on the right track…

Andy_Wismer · March 25, 2020, 10:01pm

@mrmarkuz

It’s “eating” history:

for 40 mins now, still at it…

The Docs are also coming along:

We’re on the right track, but good things take their time…

syntaxerrormmm · March 25, 2020, 10:05pm

Well, I have a DB that is quite hard to manage and surely needs some cleanup (actually 300+GB).

Since we planned to deploy another (updated) instance with auto-partitioning, once I have the new instance and a (final) working backup of the old one, I can do some tests and give feedback on both choices.

FWIW, as far as I know, tables that are written to often (history, trends, etc.) are not subject to auto-vacuuming by PostgreSQL at all: I tried to fix the issue by marking the tables as “vacuumable”, but achieved only slower growth of the gross DB size (I would have expected the growth to stop completely, or similar).
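
Whether autovacuum has touched a given table can be checked in PostgreSQL's statistics views. A minimal sketch, assuming the database is named zabbix; the function name is made up, while the view pg_stat_user_tables and its columns n_dead_tup and last_autovacuum are standard PostgreSQL:

```shell
#!/bin/bash
# Sketch only: print the SQL that shows dead-row counts and the last
# (auto)vacuum time for the ten tables with the most dead rows.
autovacuum_status_sql() {
    cat <<'SQL'
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
SQL
}

# Assumed usage (database name "zabbix" is an assumption):
#   autovacuum_status_sql | sudo -i -u postgres psql zabbix
```

If last_autovacuum stays empty for the history tables, per-table storage parameters such as autovacuum_vacuum_scale_factor, set via ALTER TABLE ... SET (...), are one knob that might be worth testing.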

Andy_Wismer · March 25, 2020, 10:08pm

@syntaxerrormmm
@mrmarkuz

Autocleaning can’t cope with the speed at which Zabbix writes into the DB, especially history/trends…

That’s a conclusion on a great many forums…

History’s done, took 50 mins…

pike · March 25, 2020, 10:25pm

Time to shrink files…