Error in SSSD service on mail server

Just 2 times this week. When I ran into these issues a few years ago, my concern was that the server was under load due to a misconfigured backup job.

If sssd is restarted by the watchdog, that is not an issue.

But this error appears constantly, several times a day. Employees also periodically get an error like this when sending an email; I think the two are related.
There are also updates available for the server. Should I update?
Update info:
Build system:
python@2.7.5-94.el7_9 from ce-updates

CentOS-minimal:
microcode_ctl@2.1-73.16.el7_9 from ce-updates
python@2.7.5-94.el7_9 from ce-updates
nss-sysinit@3.90.0-2.el7_9 from ce-updates
openssh-clients@7.4p1-23.el7_9 from ce-updates
nspr@4.35.0-1.el7_9 from ce-updates
kernel-tools@3.10.0-1160.102.1.el7 from ce-updates
nss@3.90.0-2.el7_9 from ce-updates
nss-softokn@3.90.0-6.el7_9 from ce-updates
openssh@7.4p1-23.el7_9 from ce-updates
libssh2@1.8.0-4.el7_9.1 from ce-updates
python-libs@2.7.5-94.el7_9 from ce-updates
kernel@3.10.0-1160.102.1.el7 from ce-updates
nss-util@3.90.0-1.el7_9 from ce-updates
kernel-tools-libs@3.10.0-1160.102.1.el7 from ce-updates
ca-certificates@2023.2.60_v7.0.306-72.el7_9 from ce-updates
openssh-server@7.4p1-23.el7_9 from ce-updates
nss-softokn-freebl@3.90.0-6.el7_9 from ce-updates
nss-tools@3.90.0-2.el7_9 from ce-updates

Other:
clamav-update@0.103.11-1.el7 from epel
clamd@0.103.11-1.el7 from epel
clamav-lib@0.103.11-1.el7 from epel
samba-client-libs@4.10.16-25.el7_9 from ce-updates
samba-common@4.10.16-25.el7_9 from ce-updates
bind-export-libs@9.11.4-26.P2.el7_9.15 from ce-updates
cups-libs@1.6.3-52.el7_9 from ce-updates
libwbclient@4.10.16-25.el7_9 from ce-updates
samba-libs@4.10.16-25.el7_9 from ce-updates
c-ares@1.10.0-3.el7_9.1 from ce-updates
clamav@0.103.11-1.el7 from epel
netdata@1.43.2-1.el7 from epel
netdata-data@1.43.2-1.el7 from epel
samba-common-libs@4.10.16-25.el7_9 from ce-updates
clamav-filesystem@0.103.11-1.el7 from epel
samba-common-tools@4.10.16-25.el7_9 from ce-updates
libsmbclient@4.10.16-25.el7_9 from ce-updates
python3-libs@3.6.8-21.el7_9 from ce-updates
netdata-conf@1.43.2-1.el7 from epel
python3@3.6.8-21.el7_9 from ce-updates
python-perf@3.10.0-1160.102.1.el7 from ce-updates

Please help. I have only been familiar with NethServer for two days, and I have not worked with Linux before.

Yes, you should update (only via the Software Center). If it is a production server, do it this weekend so you have Sunday to troubleshoot and your users can work on Monday.

You are doing just fine by asking for help.

Hi @Andy_Wismer, so you recommend waiting for the error, then deleting the AD account provider, then doing a config restore, and hopefully this issue will not occur anymore. I will test it another time, but even if this strange trick works, I would still consider this a bug. Or do you have an explanation for this?

@trentatre

I consider this a working “work-around”, not really a solution.

Make sure, before using this trick, that you have working backups!

I have suggested this trick several times in the forum, and it’s always worked for the respective user.

See eg here:

See the end, where the user confirms it works!

:slight_smile:

My 2 cents
Andy

I have performed all available updates, but the problem persists.
After the update the problem got worse. Also, when sending email through Roundcube, one message can be sent, but when you send another, the following error appears:


I also noticed that when access from the red zone is enabled for the SSHD service, the logs show the service refusing some unknown users. What could that be?


Strangely enough, after adding the red zone to the sssd service, the errors related to the sssd service disappeared.
At 16:30 I restored the system backup that was saved before my changes.

Bots / cyber-criminals trying to gain access to the server. The attack can be targeted, but often it is just random. Fail2Ban and SSH through a VPN with a restricted account (without exposing the SSH port to the Internet) are a few of the strategies to harden a server.

You mean sshd instead of sssd? When you change zone settings (green, red…) for a service, the firewall is “restarted”.

systemctl restart sssd

Did you try it?

Yes, I tried that. The service restarts, but then when I try to send a mail it takes a very long time to send, and an error is written to the logs: "was terminated by own WATCHDOG".
This has been going on since the morning; the SSSD service is behaving very badly.

дек 04 08:34:20 mail.mmz.com.ua sssd[nss][10029]: Starting up
дек 04 08:34:30 mail.mmz.com.ua sssd[sssd][24394]: Child [9111] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:34:30 mail.mmz.com.ua sssd[be[mmz.com.ua]][10039]: Starting up
дек 04 08:34:58 mail.mmz.com.ua sssd[sssd][24394]: Child [24397] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:34:58 mail.mmz.com.ua sssd[pam][10114]: Starting up
дек 04 08:35:06 mail.mmz.com.ua sssd[sssd][24394]: Child [10039] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:06 mail.mmz.com.ua sssd[be[mmz.com.ua]][11096]: Starting up
дек 04 08:35:11 mail.mmz.com.ua sssd[sssd][24394]: Child [10029] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:11 mail.mmz.com.ua sssd[nss][11102]: Starting up
дек 04 08:35:13 mail.mmz.com.ua sssd[nss][11115]: Starting up
дек 04 08:35:17 mail.mmz.com.ua sssd[nss][11161]: Starting up
дек 04 08:35:38 mail.mmz.com.ua sssd[sssd][24394]: Child [11096] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:38 mail.mmz.com.ua sssd[be[mmz.com.ua]][11283]: Starting up
дек 04 08:37:11 mail.mmz.com.ua sssd[sssd][24394]: Child [11283] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:37:15 mail.mmz.com.ua sssd[be[mmz.com.ua]][11596]: Starting up
дек 04 08:37:38 mail.mmz.com.ua sssd[sssd][24394]: Child [11161] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:37:45 mail.mmz.com.ua sssd[nss][11640]: Starting up
дек 04 08:38:46 mail.mmz.com.ua sssd[sssd][24394]: Child [11640] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:38:46 mail.mmz.com.ua sssd[nss][12666]: Starting up
дек 04 08:38:56 mail.mmz.com.ua sssd[sssd][24394]: Child [11596] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:38:56 mail.mmz.com.ua sssd[be[mmz.com.ua]][12676]: Starting up
дек 04 08:43:57 mail.mmz.com.ua sssd[sssd][24394]: Child [12676] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:43:57 mail.mmz.com.ua sssd[be[mmz.com.ua]][13067]: Starting up
дек 04 08:51:37 mail.mmz.com.ua sssd[sssd][24394]: Child [12666] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:51:37 mail.mmz.com.ua sssd[nss][13637]: Starting up
дек 04 08:52:30 mail.mmz.com.ua sssd[nss][13637]: Enumeration requested but not enabled
дек 04 08:56:28 mail.mmz.com.ua sssd[sssd][24394]: Child [13637] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:56:28 mail.mmz.com.ua sssd[nss][14021]: Starting up
дек 04 08:56:30 mail.mmz.com.ua sssd[sssd][24394]: Child [13067] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:56:30 mail.mmz.com.ua sssd[be[mmz.com.ua]][14023]: Starting up
дек 04 08:58:29 mail.mmz.com.ua sssd[nss][14021]: Enumeration requested but not enabled
дек 04 08:59:53 mail.mmz.com.ua sssd[sssd][24394]: Child [14023] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:59:53 mail.mmz.com.ua sssd[be[mmz.com.ua]][14323]: Starting up
дек 04 09:01:59 mail.mmz.com.ua sssd[sssd][24394]: Child [10114] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:02:23 mail.mmz.com.ua sssd[sssd][24394]: Child [14323] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:02:23 mail.mmz.com.ua sssd[be[mmz.com.ua]][14557]: Starting up
дек 04 09:02:24 mail.mmz.com.ua sssd[pam][14521]: Starting up
дек 04 09:02:26 mail.mmz.com.ua sssd[pam][14558]: Starting up
дек 04 09:02:30 mail.mmz.com.ua sssd[pam][14562]: Starting up
дек 04 09:02:30 mail.mmz.com.ua sssd[sssd][24394]: Exiting the SSSD. Could not restart critical service [pam].
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: sssd.service: main process exited, code=exited, status=1/FAILURE
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: Unit sssd.service entered failed state.
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: sssd.service failed.
дек 04 09:07:12 mail.mmz.com.ua systemd[1]: Starting System Security Services Daemon...
дек 04 09:07:41 mail.mmz.com.ua sssd[sssd][17155]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[be[mmz.com.ua]][17188]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[pam][17190]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[nss][17189]: Starting up
дек 04 09:07:41 mail.mmz.com.ua systemd[1]: Started System Security Services Daemon.
дек 04 09:07:44 mail.mmz.com.ua sssd[nss][17189]: Enumeration requested but not enabled
дек 04 09:10:02 mail.mmz.com.ua sssd[sssd][17155]: Child [17189] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:10:02 mail.mmz.com.ua sssd[nss][19521]: Starting up
дек 04 09:10:04 mail.mmz.com.ua sssd[sssd][17155]: Child [17188] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:10:04 mail.mmz.com.ua sssd[be[mmz.com.ua]][19525]: Starting up
дек 04 09:12:07 mail.mmz.com.ua sssd[sssd][17155]: Child [19525] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:12:07 mail.mmz.com.ua sssd[be[mmz.com.ua]][19702]: Starting up
дек 04 09:15:49 mail.mmz.com.ua sssd[nss][19521]: Enumeration requested but not enabled
дек 04 09:16:43 mail.mmz.com.ua sssd[sssd][17155]: Child [19521] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:16:43 mail.mmz.com.ua sssd[nss][20965]: Starting up
дек 04 09:17:14 mail.mmz.com.ua sssd[sssd][17155]: Child [20965] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:17:14 mail.mmz.com.ua sssd[nss][21082]: Starting up
дек 04 09:17:21 mail.mmz.com.ua sssd[sssd][17155]: Child [17190] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:17:21 mail.mmz.com.ua sssd[pam][21148]: Starting up
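
To get an overview of which SSSD child is being killed most often, the log lines above could be summarized with a small shell helper. This is only a sketch: the function name `count_watchdog_kills` is made up for illustration, and it assumes the message format shown in this log.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and count
# "terminated by own WATCHDOG" events per SSSD child name.
count_watchdog_kills() {
    grep 'was terminated by own WATCHDOG' \
        | sed -n "s/.*Child \[[0-9]*\] ('\([^']*\)'.*/\1/p" \
        | sort | uniq -c | sort -rn
}

# Possible use on the server:
#   journalctl -u sssd --since today | count_watchdog_kills
```

If the domain backend (here 'mmz.com.ua') dominates the counts, the per-domain SSSD log is probably the first place to look.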

Please explain what further steps I need to take to diagnose and eliminate the problem. I do not understand this, and I really need your support.

приветствую
я так понимаю ваши эксперименты ни к чему не привели?
имею аналогичную проблему на почтовом серваке и тоже пока без вариантов решения.

[mod break]
Please write in English on these forums so everybody can understand. For convenience I will translate this one:

Greetings
I take it your experiments didn’t lead to anything?
I have a similar problem on my mail server and also have no solutions yet.

Hello
That’s right, it got even worse, but it started gradually and got worse every day.

There seems to be some kind of solution here, but viewing it is only for paid accounts.

If anyone has a subscription account, please send me a solution to this problem

The server seems to be too weak to handle the load.
Could you please show the load average? (the output of the uptime command is ok).
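
A rough way to read the load average yourself is to compare it with the number of CPUs. The sketch below assumes the common rule of thumb that a 1-minute load above the CPU count means the machine is overloaded; the helper name `load_status` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "high" when the load average exceeds the CPU count.
load_status() {
    # $1 = load average (e.g. 3.42), $2 = number of CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (load + 0 > cpus + 0) print "high"; else print "ok"
    }'
}

# On a Linux server: take the 1-minute load from /proc/loadavg, CPUs from nproc.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _rest < /proc/loadavg
    echo "1-min load: $load1, CPUs: $(nproc), status: $(load_status "$load1" "$(nproc)")"
fi
```
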
You can try to temporarily work around the problem by increasing the SSSD watchdog timeout:

  • under your domain section of /etc/sssd/sssd.conf add
    • timeout = 20
  • restart sssd: systemctl restart sssd
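
The two steps above could be sketched in shell roughly as follows. Assumptions: the domain section is named `[domain/mmz.com.ua]` (inferred from the log, not confirmed), the helper name `add_domain_timeout` is made up, and a manual edit is acceptable; on NethServer the file may be regenerated by the configuration system, so a hand edit might not persist.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of an sssd.conf-style file with
# "timeout = <secs>" placed right after the given [domain/...] header.
# It only writes to stdout; the original file is not modified.
add_domain_timeout() {
    # $1 = config file, $2 = domain name, $3 = timeout in seconds
    awk -v header="[domain/$2]" -v secs="$3" '
        /^\[/ { in_domain = ($0 == header) }           # track the current section
        in_domain && /^[ \t]*timeout[ \t]*=/ { next }  # drop an older timeout value
        { print }
        $0 == header { print "timeout = " secs }
    ' "$1"
}

# Possible use on the server (review the result before installing it):
#   cp /etc/sssd/sssd.conf /root/sssd.conf.bak
#   add_domain_timeout /root/sssd.conf.bak mmz.com.ua 20 > /root/sssd.conf.new
#   # after copying the reviewed file into place:
#   systemctl restart sssd
```

Working on a copy keeps the original configuration available in case sssd refuses to start with the new file.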

The real fix is to lower the load average, for example by using more powerful hardware.

Yes, but before that everything worked perfectly and the hardware had enough power… Could this be related to some kind of system update?