I'm going to open an issue at arm-dev, but I hope the community can chip in on what is happening here.
To begin with: it is not critical; it only holds up the init process for a while.
# journalctl | grep rrdtool
Jan 01 00:00:52 localhost.localdomain collectd[1227]: plugin_load: plugin "rrdtool" successfully loaded.
Jan 01 00:00:54 localhost.localdomain collectd[1227]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Jan 01 00:00:54 localhost.localdomain collectd[1279]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 09:19:20 localhost.localdomain collectd[1279]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 09:19:30 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-idle.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-idle.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:19:53 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-interrupt.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-interrupt.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:16 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:39 localhost.localdomain collectd[1279]: rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-softirq.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-softirq.rrd: illegal attempt to update using time 1608023923 when last update time is 1608023923 (minimum one second step)
Dec 15 09:20:50 localhost.localdomain collectd[3539]: plugin_load: plugin "rrdtool" successfully loaded.
Does anyone have a clue what "rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd) failed: /var/lib/collectd/rrd/localhost/cpu-0/cpu-nice.rrd" does, or is supposed to do?
It is the stats collector, collectd, writing to its RRD database. It does not like time jumps and complains a lot when the clock is changed suddenly; here it refuses an update whose timestamp is not newer than the last one stored. Maybe starting it after chronyd at boot can help.
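To make the error message concrete, here is a minimal sketch of the rule it reports: an update is only accepted when its timestamp is strictly newer than the last one in the file. The function names are made up for illustration, the two values are copied from the log above, and the decode helper assumes GNU date.

```shell
#!/bin/sh
# Sketch of the rule behind "minimum one second step":
# an RRD update is accepted only if its timestamp is strictly
# newer than the last update already stored in the file.

# rrd_step_ok LAST NEW -> exit status 0 if NEW would be accepted
# (hypothetical helper, not part of rrdtool or collectd)
rrd_step_ok() {
    [ "$2" -gt "$1" ]
}

# Decode an epoch value from the log into UTC (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%dT%H:%M:%SZ'
}

# Values taken from the log lines above.
last=1608023923
new=1608023923

if rrd_step_ok "$last" "$new"; then
    echo "update at $new accepted"
else
    echo "update at $new rejected: not newer than $last"
fi
```

With the values from the log the update is rejected, because both timestamps are identical. `epoch_to_utc 1608023923` decodes to 2020-12-15T09:18:43Z.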
Will try to do so, but I'm not optimistic that it will succeed.
I went down that rabbit hole before, in an attempt to start the whole system-init.service only after receiving a valid time from an NTP server.
This is because the default server certificate sometimes also gets a bogus date during system-init:
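For reference, the ordering suggested above could be sketched as a systemd drop-in. This is only a sketch under assumptions: the helper name and the drop-in file name are made up, and it assumes the units are called collectd.service, chronyd.service and chrony-wait.service, as they are on CentOS-based systems with the chrony package. Note that After=chronyd.service alone only orders the start; it is chrony-wait.service that actually waits for a synchronised clock.

```shell
#!/bin/sh
# Write a systemd drop-in that orders collectd after time sync.
# write_collectd_dropin is a hypothetical helper; the optional
# argument is the unit directory (default: /etc/systemd/system).

write_collectd_dropin() {
    root="${1:-/etc/systemd/system}"
    dir="$root/collectd.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-after-timesync.conf" <<'EOF'
[Unit]
# Pull in chrony-wait and start collectd only after it has finished.
Wants=chrony-wait.service
After=chronyd.service chrony-wait.service
EOF
}
```

Usage would be `write_collectd_dropin && systemctl daemon-reload` as root. If the clock cannot be synced at all (no DNS, no network), collectd would be held back until chrony-wait gives up, so this does not fix the underlying resolver problem.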
Before nethserver-system-init I cannot ping an external server by name (e.g. ping www.google.com); the system cannot resolve the server name via DNS.
Chronyd cannot resolve the NTP pool server name, hence it does not sync the time. During nethserver-system-init dnsmasq takes over, and only then does chronyd sync for the first time.
If I drop nameserver 8.8.4.4 into /etc/resolv.conf, the time gets synced.
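A quick way to check this state before system-init runs, as a rough sketch: list the nameservers the resolver would use and test whether a name resolves. Both function names are made up for illustration; `getent hosts` goes through the normal system resolver.

```shell
#!/bin/sh
# list_nameservers [FILE]: print the nameserver addresses found in a
# resolv.conf-style file (default: /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# can_resolve NAME: succeed if the system resolver returns an address.
can_resolve() {
    getent hosts "$1" > /dev/null 2>&1
}
```

For example, `list_nameservers` printing nothing and `can_resolve www.google.com || echo "no DNS yet"` printing "no DNS yet" would match what is described above. After adding the nameserver line, `chronyc tracking` can be used to see whether chronyd has picked up a source.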
So what to do…
Simply mitigate by dropping in a kick-start resolv.conf at image creation?
Not sure, the network might not be available for other reasons… I understand the board is missing an internal time source so the clock is not set at startup.
In general, services must not require the network (and clock) at startup to avoid issues like this. I’d say it is a collectd issue and I’d prefer to not change the default system configuration because of this.
It’s only at nethserver-system-init … not every time the system starts up. Although this chrony-wait.service would start at all times. It could also require first-boot to be present.
I agree; however, resolv.conf is templated and overwritten during system-init.
And it solves two problems:
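Roughly what I mean, as a sketch only: one drop-in makes nethserver-system-init wait for chrony-wait.service, and a second one limits chrony-wait to the first boot with ConditionPathExists=. The helper name and drop-in file names are made up, and the path of the first-boot flag file is deliberately passed in as an argument rather than hard-coded.

```shell
#!/bin/sh
# write_system_init_dropins FLAG [ROOT]
#   FLAG: path of the first-boot flag file (caller-supplied assumption)
#   ROOT: unit directory, default /etc/systemd/system
# Hypothetical helper for illustration.

write_system_init_dropins() {
    flag="$1"
    root="${2:-/etc/systemd/system}"
    if [ -z "$flag" ]; then
        echo "usage: write_system_init_dropins FLAG [ROOT]" >&2
        return 2
    fi
    mkdir -p "$root/nethserver-system-init.service.d" \
             "$root/chrony-wait.service.d" || return 1

    # system-init starts only after chrony-wait has finished.
    cat > "$root/nethserver-system-init.service.d/10-wait-for-time.conf" <<'EOF'
[Unit]
Wants=chrony-wait.service
After=chrony-wait.service
EOF

    # chrony-wait is skipped unless the first-boot flag exists.
    cat > "$root/chrony-wait.service.d/10-first-boot-only.conf" <<EOF
[Unit]
ConditionPathExists=$flag
EOF
}
```

A unit whose condition fails is skipped without counting as failed, so on later boots chrony-wait would not delay anything.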
# journalctl | grep rrdtool
Dec 15 13:15:10 localhost.localdomain collectd[1167]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 13:15:12 localhost.localdomain collectd[1167]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 13:15:12 localhost.localdomain collectd[1216]: plugin_load: plugin "rrdtool" successfully loaded.
Dec 15 13:16:02 localhost.localdomain collectd[1216]: rrdtool plugin: Shutting down the queue thread. This may take a while.
Dec 15 13:16:03 localhost.localdomain collectd[3388]: plugin_load: plugin "rrdtool" successfully loaded.
If there were no kernel bug for the RPI4-8GB concerning zram-swap, I would have pushed an RC image to arm-dev.
However, having slept on the issue discussed here, I now doubt the solution and am undecided:
PRO:
no bogus date for default server certificate
no collectd / rrdtool errors
waiting for the time implies the network / internet is up and running, which in turn implies we have received an IP from a DHCP server, resulting in a valid network configuration after nethserver-system-init
CON:
nethserver-system-init won’t run without a working internet connection, which may result in support questions.
This needs to be documented, maybe also with the suggestion to test the RPI being used in case of problems.
The RPI can easily be tested with an SD card imaged with the standard Raspberry Pi OS. If that gets Internet, your NethServer will also get Internet, as even the MAC address is the same.
Almost anyone using Raspberries - or playing around with one - knows how to use BalenaEtcher, which makes imaging SDs fast and easy, even for Noobs…
IMHO that is a bit overkill for a bogus date on a self-signed certificate and some non-critical log entries from collectd failures during nethserver-system-init.