Issue with RSpamd

NethServer Version: 7.09.2009 with all updates
Module: RSpamd 3.0

Hello again,

I am seeing strange behavior in RSpamd: it keeps resetting the count of learned hams and spams required to activate the filter (400 spams and hams). The server has been running for more than a year and has learned plenty, but from time to time it seems to “forget” what it has learned.

Then again, if you go to the RSpamd web interface, it says that it has the spams and hams.

How can I overcome this issue and keep the filter active?

@stephdl
Do you have an idea?

It can happen, I have already seen it. What you can check is what the rspamd UI outputs to our UI:

echo '{"action":"stats"}' | /usr/bin/sudo /usr/libexec/nethserver/api/nethserver-mail/filter/read | jq

The numbers 1544 and 317 are shown as revisions by my UI; they do not reflect the number of learned messages.

"statfiles": [
  {
    "symbol": "BAYES_SPAM",
    "users": 1,
    "total": 0,
    "size": 0,
    "revision": 1344,
    "languages": 0,
    "used": 0,
    "type": "redis"
  },
  {
    "symbol": "BAYES_HAM",
    "users": 1,
    "total": 0,
    "size": 0,
    "revision": 648,
    "languages": 0,
    "used": 0,
    "type": "redis"
  }
],

but the total number of learned messages is

  "info": {
    "version": "3.0",
    "learned": 303,
    "clean": 11490,
    "scanned": 12662,
    "auth": "ok",
    "greylist": 0,
    "read_only": false,
    "uptime": 1139689,
    "probable": 548,
    "config_id": "qh3aageog5i7iyeb6w4xr8dtshofoxg9ow9t74crkhtbrdqys5mu7dhkkhwd6d7mn947uoek9owtymu784tu5ekgsfcod3faodwp89d",
    "reject": 623,
    "soft_reject": 1
  },

However, why the learned count has been set to zero again, I have no idea yet.

I was hoping that the issue would be resolved after the new learning process, but unfortunately, after reaching 98%, the learning process restarted once again from the beginning. Now the situation is like this

The only thing that might have triggered it (at least that is my assumption) was several unclean restarts a few days ago due to a power failure (the UPS obviously had a faulty battery and did not help to shut down the server gracefully).

Any idea how to reconfigure it to use the already learned spams and hams, or do I have to wait for another round of learning?

I see the same behavior on my own system all the time. I have given up trying to understand it. Even after a brand new installation, the phenomenon occurs.

Thank you Marko, at least I know that I’m not the only one with this experience.

The only thing we could suspect is a Redis database corruption. What is the output of

ls -la /var/lib/redis/rspamd/

# ls -la /var/lib/redis/rspamd/
total 10280
drwxr-xr-x 2 redis redis       22 Feb 18 12:53 .
drwxr-x--- 3 redis redis       20 Jan 15 18:55 ..
-rw-r--r-- 1 redis mail  10524677 Feb 18 12:53 dump.rdb

In my case

Screenshot 2022-02-18

Today the next reset happened, and I have a mailbox full of spam :frowning:

Ok, I trained my bayes filter manually again.

# rspamc stat

Messages learned: 1305
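
To catch the moment the counter resets, the `Messages learned:` line could be logged at regular intervals. This is only a minimal sketch, assuming `rspamc stat` prints that line as shown above; the function names and the log path are made up for illustration.

```shell
#!/bin/sh
# Extract the number from a "Messages learned: N" line read on stdin.
learned_count() {
    awk -F': *' '/^Messages learned:/ { print $2; exit }'
}

# Append a timestamped sample to a log file.
# The default log path is a hypothetical choice, not a NethServer convention.
log_learned() {
    logfile=${1:-/var/log/rspamd-learned.log}
    count=$(rspamc stat | learned_count)
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "${count:-unknown}" >> "$logfile"
}
```

Calling `log_learned` from an hourly cron job would give a history in which a sudden drop of the count stands out, which could then be compared with restarts or updates around the same time.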

Nice trick :). You made my day with this one, thank you Marko!

No laptop next to me, but I can confirm this: I have had a wave of spam for three days.

me too

I wonder whether the culprit is here.

I wonder whether we purge the database every 100 days.

That is the default. I think we could set it to -1; we already set a max memory of 300MB.

See the documentation

I am not sure the bayes has been set to null in my case

I set expire = -1 in /etc/e-smith/templates/etc/rspamd/local.d/classifier-bayes.conf/10Base, then ran signal-event nethserver-mail-filter-update

Obviously, the next rpm update will restore the former value.

Out of curiosity, @capote, could you give the size of your database /var/lib/redis/rspamd/dump.rdb?

# ll /var/lib/redis/rspamd/
total 19728
-rw-r--r-- 1 redis mail 20199725 Feb 25 20:02 dump.rdb

I have the same problem. Even though I have been using NS for over a year now, the spam filter never got above 20%.

I have now created a custom template with the above suggestions:

mkdir -p /etc/e-smith/templates-custom/etc/rspamd/local.d/classifier-bayes.conf/
sed 's/expire.*=.*/expire = -1/' </etc/e-smith/templates/etc/rspamd/local.d/classifier-bayes.conf/10Base >/etc/e-smith/templates-custom/etc/rspamd/local.d/classifier-bayes.conf/10Base
signal-event nethserver-mail-filter-update
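
After the event has run, it may be worth checking that the rendered file really contains the new value. A small sketch, assuming the template is expanded to `/etc/rspamd/local.d/classifier-bayes.conf` (inferred from the template path, not verified here); `show_expire` is just an illustrative helper name.

```shell
#!/bin/sh
# Print every "expire" setting found in a classifier config file.
# The default path is an assumption derived from the template location.
show_expire() {
    conf=${1:-/etc/rspamd/local.d/classifier-bayes.conf}
    grep -E '^[[:space:]]*expire[[:space:]]*=' "$conf"
}
```

If `show_expire` prints `expire = -1`, the custom template was picked up; running `rspamadm configtest` afterwards should tell whether rspamd still accepts the configuration.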

Is there any way to retrain the filter from the current inbox or sent folder?
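
One possible approach is to feed already sorted mail to `rspamc learn_spam` and `rspamc learn_ham`. The following is a sketch only: `learn_dir` is a hypothetical helper, and it assumes that `rspamc` can reach the controller without extra options (host, password), which may not hold on every installation.

```shell
#!/bin/sh
# Feed every regular file in a directory to rspamc as spam or ham.
# RSPAMC can be overridden (for example RSPAMC=echo) to do a dry run.
learn_dir() {
    class=$1   # "spam" or "ham"
    dir=$2     # for example a Maildir "cur" folder
    case $class in
        spam|ham) ;;
        *) echo "usage: learn_dir spam|ham DIR" >&2; return 2 ;;
    esac
    # Pass the files in batches to "rspamc learn_spam" or "rspamc learn_ham".
    find "$dir" -maxdepth 1 -type f -exec ${RSPAMC:-rspamc} "learn_$class" {} +
}
```

For example, `learn_dir spam /path/to/Maildir/.Junk/cur` and `learn_dir ham /path/to/Maildir/cur`, where the Maildir paths are placeholders that depend on the mailbox. Running it first with `RSPAMC=echo` shows which files would be submitted, and `rspamc stat` afterwards should show the learned count going up.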

I’m facing (with an external email service and on a completely different platform/set of tools) an increased wave of spam.

IMVHO current and recent events are also affecting the volume and “noise” of unwanted messages, with a decreased signal-to-noise ratio in worldwide communications (so much more spam than usual).

IMVHO I’d train the “spam” section instead of the “ham” section, after the manual classification. It’s tedious to do the selection by hand instead of letting amavisd/rSpamd do the evaluation.
But as usual, the fight against spam is about reaction more than prevention.