Rrdtool plugin fails at system-init on arm

I am going to open an issue at arm-dev, but I hope the community can chip in on what is happening here.
To begin with: it is not critical, it only holds the init process up for a while.

# journalctl | grep rrdtool
Jan 01 00:00:52 localhost.localdomain collectd[1227]: plugin_load: plugin "rrdtool" successfully loaded.
Jan 01 00:00:54 localhost.localdomain collectd[1227]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Jan 01 00:00:54 localhost.localdomain collectd[1279]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 09:19:20 localhost.localdomain collectd[1279]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 09:19:30 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-idle.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-idle.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:19:53 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-interrupt.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-interrupt.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:16 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:39 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-softirq.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-softirq.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:50 localhost.localdomain collectd[3539]: plugin_load: plugin "rrdtool" successfully loaded.

Does anyone have a clue what rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd does or is supposed to do?

It is the stats collector, Collectd, writing to its DB. It does not like time leaps and complains a lot when the time gets changed suddenly. Maybe starting it after chronyd at boot can help.
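
As a rough illustration of why rrdtool complains (a toy model only, not collectd's or librrd's actual code): an RRD file remembers the time of its last update and refuses any update whose timestamp is not at least one second newer. The class and method names below are made up for this example.

```python
class ToyRrd:
    """Toy model of one RRD file; NOT the real librrd implementation."""

    def __init__(self, last_update: int = 0) -> None:
        # Epoch seconds of the most recent accepted update.
        self.last_update = last_update

    def update(self, timestamp: int) -> None:
        # Reject timestamps that do not advance by at least one second,
        # mirroring the "minimum one second step" message in the log.
        if timestamp <= self.last_update:
            raise ValueError(
                f"illegal attempt to update using time {timestamp} "
                f"when last update time is {self.last_update} "
                "(minimum one second step)"
            )
        self.last_update = timestamp
```

In this model, calling `update(1608023923)` twice in a row succeeds the first time and raises on the second call, which is the same shape of failure as the journal lines above.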


Thanks, I never would have thought of this!

I will try to do so, but I am not optimistic that it will succeed.
I went down that rabbit hole before, in an attempt to start the whole system-init.service only after receiving a valid time from an NTP server.

This is because the default server certificate sometimes also gets a bogus date on system-init:

I will try to find the homework I did back then and see if something new comes to mind.

EDIT: this was included in the attempt back then; looking at it now, I’m not sure we have a time-sync.target with systemd 219…


I think I found the culprit:

  • Before nethserver-system-init I cannot ping an external server by name (e.g. ping www.google.com); the system cannot resolve the server name in DNS.
  • Chronyd cannot resolve the NTP pool server name, hence it does not sync the time. During nethserver-system-init dnsmasq takes over, and only then does chronyd sync for the first time.
  • If I drop nameserver 8.8.4.4 in /etc/resolv.conf, the time gets synced.
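
For anyone who wants to reproduce the first bullet without ping, here is a minimal sketch of a name-resolution check. The helper name `can_resolve` and its injectable `resolver` parameter are my own for illustration; only `socket.getaddrinfo` and `socket.gaierror` come from the Python standard library.

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if hostname resolves, False if the lookup fails."""
    try:
        # getaddrinfo raises socket.gaierror when the name cannot be resolved.
        resolver(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Running `can_resolve("www.google.com")` before and after nethserver-system-init should, if the diagnosis above is right, give False and then True.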

so what to do… :thinking:
Simply mitigate by dropping a kick-start resolv.conf in at image creation?

Not sure, the network might not be available for other reasons… I understand the board is missing an internal time source so the clock is not set at startup.

In general, services must not require the network (and clock) at startup to avoid issues like this. I’d say it is a collectd issue and I’d prefer to not change the default system configuration because of this.

Yes, it does relate to the NIC on an RPI4, which reports being ready before it actually is…

Most SBCs do not have an RTC…

It’s only at nethserver-system-init, not every time the system starts up. This chrony-wait.service would start at all times, though; it could also require first-boot to be present.

I agree; however, resolv.conf is templated and overwritten during system-init.

and it solves two problems:

# journalctl | grep rrdtool
Dec 15 13:15:10 localhost.localdomain collectd[1167]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 13:15:12 localhost.localdomain collectd[1167]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 13:15:12 localhost.localdomain collectd[1216]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 13:16:02 localhost.localdomain collectd[1216]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 13:16:03 localhost.localdomain collectd[3388]: plugin_load: plugin "rrdtool" successfully loaded.

And one of the fastest system-inits I have recorded on a 32-bit ARM system:
Startup finished in 1.755s (kernel) + 2min 15.051s (userspace) = 2min 16.807s

So I will sleep on it before making a decision.


It seems a nice hack :wink:

It’s fine for me :+1:


If there were no kernel bug for the RPI4-8GB concerning zram-swap, I would have pushed an RC image to arm-dev.
However, having slept on the issue discussed here, I started to doubt the solution and am now undecided: :woozy_face:

PRO:

  • no bogus date for the default server certificate
  • no collectd / rrdtool errors
  • waiting for the time implies the network / internet is up and running, which in its turn implies we have received an IP from a DHCP server, resulting in a valid network configuration after nethserver-system-init

CON:

  • nethserver-system-init won’t run without a working internet connection, which may result in support questions.
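
To make the trade-off concrete, here is a small sketch of "wait for a sane clock, but give up after a while". It is not what chrony-wait.service does internally and not what was implemented for the image; the function name, the threshold constant and the injectable `now`/`sleep` parameters are all assumptions for the example.

```python
import time
from typing import Callable

# Assumption for the example: any clock before this moment is "bogus".
# 1577836800 is 2020-01-01 00:00:00 UTC; a real image would more likely
# compare against its own build date.
MIN_PLAUSIBLE_EPOCH = 1577836800


def wait_for_plausible_clock(
    timeout: float,
    poll: float = 1.0,
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    min_epoch: float = MIN_PLAUSIBLE_EPOCH,
) -> bool:
    """Return True once the clock passes min_epoch, False on timeout."""
    waited = 0.0
    while now() < min_epoch:
        if waited >= timeout:
            # Give up instead of blocking forever without internet.
            return False
        sleep(poll)
        waited += poll
    return True
```

With a timeout, a board without internet would still finish its init (with the bogus date and the collectd noise), while a board that does get a time sync gets the benefits listed under PRO.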

Also see:

Feedback appreciated @dz00te @Andy_Wismer @mrmarkuz @royceb and all others who chip in :slight_smile:

@mark_nl

Hi Mark

I’d suggest: Go for it!

This needs to be documented, maybe also with the suggestion to test the RPI being used in case of problems.

The RPI can easily be tested by using an SD card imaged with standard Raspberry-OS. If that gets Internet, your NethServer will also get Internet, as even the MAC address is the same.
Almost anyone using Raspberries - or playing around with one - knows how to use BalenaEtcher, which makes imaging SDs fast and easy, even for Noobs… :slight_smile:

My 2 cents
Andy

I don’t know if software can make up for missing hardware: the RTC module.

@pike

There ARE people who add in one of the many RTC modules available on the market for RPI…

Admittedly, although I have about 6 RPIs, only 2 have a real RTC…

My 2 cents
Andy

And the software was designed and written considering the RTC as a component of the hardware.
So why not classify an RTC as mandatory for NethServer on ARM?


IMHO that is a bit overkill for a bogus date on a self-signed certificate and some non-critical log entries from collectd failures during nethserver-system-init.

Meanwhile I have one +1, so I implemented it here:
