Bandwidthd crashing after some hours

NS7 RC3

Hi,

I noticed that bandwidthd stopped logging any activity some time after I installed it. Having played a lot with NethServer, I thought I would reinstall the module, which succeeded.

However, 14 hours later, the module stopped collecting data again.

Here is the filtered log file:

Jan 3 09:04:14 mattlabs pkgaction[19313]: remove: @nethserver-bandwidthd, nethserver-bandwidthd
Jan 3 09:04:17 mattlabs yum[19313]: Erased: nethserver-bandwidthd-1.0.0-1.ns7.noarch
Jan 3 09:06:25 mattlabs systemd: Stopping Bandwidthd Network Traffic Monitor...
Jan 3 09:06:25 mattlabs systemd: Stopped Bandwidthd Network Traffic Monitor.
Jan 3 09:07:29 mattlabs pkgaction[25513]: install: @nethserver-bandwidthd
Jan 3 09:07:32 mattlabs yum[25513]: Installed: nethserver-bandwidthd-1.0.0-1.ns7.noarch
Jan 3 09:07:36 mattlabs esmith::event[25599]: Event: nethserver-bandwidthd-update
Jan 3 09:07:36 mattlabs esmith::event[25599]: Action: /etc/e-smith/events/nethserver-bandwidthd-update/S00initialize-default-databases SUCCESS [0.556]
Jan 3 09:07:36 mattlabs esmith::event[25599]: expanding /etc/bandwidthd.conf
Jan 3 09:07:36 mattlabs esmith::event[25599]: expanding /etc/httpd/admin-conf.d/bandwidthd.conf
Jan 3 09:07:36 mattlabs esmith::event[25599]: expanding /etc/httpd/conf.d/bandwidthd.conf
Jan 3 09:07:37 mattlabs esmith::event[25599]: Action: /etc/e-smith/events/nethserver-bandwidthd-update/S99nethserver-httpd-admin-asyncreload SUCCESS [0.010039]
Jan 3 09:07:37 mattlabs esmith::event[25599]: Event: nethserver-bandwidthd-update SUCCESS
Jan 3 09:07:38 mattlabs systemd: Started Bandwidthd Network Traffic Monitor.
Jan 3 09:07:38 mattlabs systemd: Starting Bandwidthd Network Traffic Monitor...
Jan 3 09:07:38 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:07:38 mattlabs bandwidthd: Opening any
Jan 3 09:07:38 mattlabs bandwidthd: Packet Encoding: Linux Cooked Socket
Jan 3 09:09:33 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:09:34 mattlabs bandwidthd: Opening any
Jan 3 09:09:34 mattlabs bandwidthd: Packet Encoding: Linux Cooked Socket
Jan 3 09:09:38 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:09:42 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:09:46 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:09:49 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 3 09:10:59 mattlabs bandwidthd: Initializing database info
Jan 3 09:11:00 mattlabs bandwidthd: Sensor ID: 1
Jan 3 09:12:55 mattlabs bandwidthd: Initializing database info
Jan 3 09:12:55 mattlabs bandwidthd: Sensor ID: 1
Jan 3 23:02:28 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:02:28 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:02:29 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 3 23:02:29 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:02:29 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:02:29 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 3 23:02:29 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:02:29 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:02:30 mattlabs bandwidthd: SQLite select failed
Jan 3 23:02:30 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:02:30 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 3 23:02:30 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:02:30 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:38:54 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:38:54 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:38:56 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:38:56 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:38:57 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:38:57 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:38:57 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 3 23:38:57 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:38:57 mattlabs bandwidthd: Could not update sensor status
Jan 3 23:38:57 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 3 23:38:57 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 3 23:38:57 mattlabs bandwidthd: Could not update sensor status
Jan 4 08:24:36 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 08:24:36 mattlabs bandwidthd: Could not update sensor status
Jan 4 08:24:37 mattlabs bandwidthd: SQLite select failed
Jan 4 08:24:37 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:04:23 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:04:23 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:04:23 mattlabs bandwidthd: Error commiting transaction
Jan 4 20:04:23 mattlabs bandwidthd: SQLite select failed
Jan 4 20:04:23 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:04:23 mattlabs bandwidthd: SQLite select failed
Jan 4 20:04:23 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:04:23 mattlabs bandwidthd: SQLite select failed
Jan 4 20:04:23 mattlabs bandwidthd: SQLite select failed
Jan 4 20:04:23 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:04:23 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:05:54 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:05:54 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:05:54 mattlabs bandwidthd: SQLite select failed
Jan 4 20:05:54 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:05:54 mattlabs bandwidthd: SQLite select failed
Jan 4 20:05:54 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:14 mattlabs bandwidthd: SQLite select failed
Jan 4 20:06:14 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:14 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 4 20:06:14 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:14 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:14 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 4 20:06:14 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:14 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:21 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:21 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:21 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 4 20:06:21 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:21 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:24 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:24 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:35 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:06:35 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:35 mattlabs bandwidthd: SQLite select failed
Jan 4 20:06:35 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:06:36 mattlabs bandwidthd: SQLite select failed
Jan 4 20:06:36 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:55:10 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 20:55:10 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:55:10 mattlabs bandwidthd: SQLite select failed
Jan 4 20:55:10 mattlabs bandwidthd: Could not update sensor status
Jan 4 20:55:10 mattlabs bandwidthd: SQLite select failed
Jan 4 20:55:10 mattlabs bandwidthd: Could not update sensor status
Jan 4 21:46:34 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 21:46:34 mattlabs bandwidthd: Could not update sensor status
Jan 4 21:46:34 mattlabs bandwidthd: Error compiling SQL Statement to create new sensor_id
Jan 4 21:46:34 mattlabs bandwidthd: Could not update sensor status
Jan 4 21:49:16 mattlabs control-service: bandwidthd restart
Jan 4 21:49:16 mattlabs systemd: Stopping Bandwidthd Network Traffic Monitor...
Jan 4 21:49:16 mattlabs systemd: Started Bandwidthd Network Traffic Monitor.
Jan 4 21:49:16 mattlabs systemd: Starting Bandwidthd Network Traffic Monitor...
Jan 4 21:49:16 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 4 21:49:16 mattlabs bandwidthd: Opening any
Jan 4 21:49:16 mattlabs bandwidthd: Packet Encoding: Linux Cooked Socket
Jan 4 21:49:19 mattlabs bandwidthd: Initializing database info
Jan 4 21:49:19 mattlabs bandwidthd: SQLite select failed
Jan 4 21:49:19 mattlabs bandwidthd: Could not update sensor status
Jan 4 21:49:19 mattlabs bandwidthd: Sensor ID: 1
Jan 4 21:49:19 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.
Jan 4 21:49:19 mattlabs bandwidthd: SQLite select failed
Jan 4 21:49:19 mattlabs bandwidthd: Could not update sensor status
Jan 4 21:49:19 mattlabs bandwidthd: SQLite select failed
Jan 4 21:49:19 mattlabs bandwidthd: Could not update sensor status
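
To see at a glance which errors dominate and on which days, the bandwidthd entries can be tallied. This is only a rough sketch: it assumes the log file holds one syslog entry per line in the format shown in this thread, and the function name `tally_messages` is made up for illustration.

```python
import re
from collections import Counter

# Matches syslog-style lines such as:
#   "Jan 3 23:02:28 mattlabs bandwidthd: SQLite select failed"
# Only entries whose program tag is exactly "bandwidthd:" are kept.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+)\s+\d{2}:\d{2}:\d{2}\s+\S+\s+bandwidthd:\s+(.*)$"
)

def tally_messages(lines):
    """Count bandwidthd messages per (day, message) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            # Normalise padded dates ("Jan  3") to a single space ("Jan 3").
            day = " ".join(match.group(1).split())
            counts[(day, match.group(2))] += 1
    return counts

# Example usage (the file name is a placeholder for your filtered log):
# with open("bandwidthd-filtered.log") as handle:
#     for (day, message), count in sorted(tally_messages(handle).items()):
#         print(day, count, message)
```

A sudden jump in "Logging child still active" entries on a given day would point to when the database writes started to stall.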

I didn’t find any information on those SQL errors.

Bug or not a bug? Does anybody feel like debugging this with me?

Thanks

Matthieu


I’m debugging a similar problem that appeared yesterday on my NethServer.
bandwidthd had been running since last September without problems, but it stopped logging new data after a reboot. I restarted the bandwidthd service, but yesterday it stopped again:

Jan  3 17:52:57 nethsecurity7 bandwidthd: SQLite select failed
Jan  3 17:52:57 nethsecurity7 bandwidthd: Could not update sensor status
Jan  3 17:52:57 nethsecurity7 bandwidthd: SQLite select failed
Jan  3 17:52:57 nethsecurity7 bandwidthd: Could not update sensor status
Jan  3 18:14:10 nethsecurity7 bandwidthd: SQLite select failed
Jan  3 18:14:10 nethsecurity7 bandwidthd: Could not update sensor status
Jan  4 10:11:34 nethsecurity7 bandwidthd: SQLite select failed
Jan  4 10:11:34 nethsecurity7 bandwidthd: Could not update sensor status

I tried to analyze the daemon behavior with
strace -f -e 'trace=!poll' -p $(cat /var/run/bandwidthd.pid)
but it seems to work correctly, writing data every few minutes.

Nothing major has changed on the system since September, apart from the 7.3 kernel (3.10.0-514.2.2.el7.x86_64).


On my side, I checked the SQLite database for corruption and found no problem.

[root@mattlabs ~]# sqlite3 /var/www/bandwidthd/stats.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> pragma integrity_check;
ok
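
The same check can be scripted so it is easy to repeat after each failure. This is a minimal sketch using Python's standard `sqlite3` module; the helper name `integrity_check` is made up for illustration, and the database path is the one from the session above.

```python
import sqlite3

def integrity_check(db_path):
    """Return the rows reported by PRAGMA integrity_check.

    A healthy database yields the single row "ok"; otherwise SQLite
    returns one row per problem found.
    Note: sqlite3.connect() creates the file if it does not exist,
    so double-check the path before running this.
    """
    conn = sqlite3.connect(db_path)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()

# Example usage on the server (path taken from the session above):
# print(integrity_check("/var/www/bandwidthd/stats.db"))
```

Since bandwidthd writes to the database while it runs, it may be safer to stop the service or work on a copy of the file before checking it.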

I deleted the database and restarted the service. So far so good. Will report back in a few days.

An interesting point to note: it looks like some data was actually recorded in the database, but only a few peaks and nothing more.

Hi @pagaille, is bandwidthd still running well?

As your issue appeared shortly after 7.3 was released, I pushed a newly built RPM to nethserver-testing. You can check it out with:

 yum --enablerepo=nethserver-testing update bandwidthd

That’s kind of a moving target. Sometimes it works, then it stops, then it works again, then it stops. I can’t tell for sure. Each time it stops, the error is always the same:

Jan 10 00:04:25 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.

The size of the DB doesn’t seem to be an issue.

I restarted again with the latest version, but I had already updated last night by chance, so I don’t expect it to make a difference. Will report back.

Update: So far so good! Bandwidthd has never worked for so long!


Update: bandwidthd has worked for 24h, which is a very good sign. However, I noticed that it didn’t survive a reboot. The infamous “killing child” error message appeared again.

Jan 11 23:14:36 mattlabs systemd: Started Bandwidthd Network Traffic Monitor.
Jan 11 23:14:36 mattlabs systemd: Starting Bandwidthd Network Traffic Monitor...
Jan 11 23:14:36 mattlabs bandwidthd: Monitoring subnet 10.0.1.0 with netmask 10.0.1.0
Jan 11 23:14:37 mattlabs bandwidthd: Opening any
Jan 11 23:14:37 mattlabs bandwidthd: Packet Encoding: Linux Cooked Socket
Jan 11 23:14:39 mattlabs bandwidthd: Initializing database info
Jan 11 23:14:40 mattlabs bandwidthd: Sensor ID: 1
Jan 11 23:15:05 mattlabs bandwidthd: Logging child still active: No response or slow database? Killing child.

A manual restart of the service solved that.