Error in SSSD service on mail server

NethServer Version: 7.9.2009
The company’s NethServer has a mail server installed, but it does not work correctly: the following error is constantly recorded in the logs.
How do I fix these errors?

 18:40 Child [23919] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:30 Child [23760] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:21 Child [22991] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:19 Child [23027] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:17 Child [22825] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:17 Child [22828] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:15 Child [7356] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:15 Child [21942] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:15 Child [21916] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:13 Child [21689] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:13 Child [21669] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:10 Child [21593] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:10 Child [21591] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:09 Child [21428] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:09 Child [7552] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:07 Child [20888] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 18:00 Child [20529] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 17:55 Child [20354] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 17:53 Child [8638] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 16:28 Child [7369] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 16:21 Child [5227] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 16:20 Child [6443] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason. sssd[sssd]
 16:20 Child [7312] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
**I also attach the SSSD service status**
  • sssd.service - System Security Services Daemon
    Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; vendor preset: disabled)
    Active: active (running) since Fri 2023-12-01 15:11:18 EET; 5h 28min ago
    Main PID: 1606 (sssd)
    CGroup: /system.slice/sssd.service
    |- 1606 /usr/sbin/sssd -i --logger=files
    |-22844 /usr/libexec/sssd/sssd_pam --uid 0 --gid 0 --logger=files
    |-27177 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --logger=files
    `-28455 /usr/libexec/sssd/sssd_be --domain mmz.com.ua --uid 0 --gid 0 --logger=files

Hi and welcome!

I assume posting your issue twice was a small mistake, so let’s focus on this one only.

There is a clue…

Possibly caused by the server being under high load.
I think the changes described in this thread were implemented in an update:

Related (upstream discussion):

I need to check my logs, but for me it is not an issue. The main problem was when sssd got stuck in a failed state: the watchdog then cannot restart it, hence the trick I did in the systemd service. AFAIK the fix has never been ported, so it must be done manually.
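
For readers who land here: the exact change is not shown in this thread, but a systemd drop-in along these lines is one way to have systemd start sssd again after it exits with a failure. This is only a sketch. The helper name `write_sssd_restart_dropin`, the file name `restart.conf` and the 30-second delay are assumptions for illustration, not the original trick.

```shell
# Hypothetical helper: write a systemd drop-in that restarts sssd after a failure.
# $1 = target directory (defaults to the usual drop-in location for sssd.service)
write_sssd_restart_dropin() {
    dropin_dir="${1:-/etc/systemd/system/sssd.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Restart= and RestartSec= are standard systemd.service directives.
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=30
EOF
}

# Usage (as root), then reload systemd so the drop-in is picked up:
#   write_sssd_restart_dropin && systemctl daemon-reload && systemctl restart sssd
```

A drop-in leaves the packaged unit file untouched, so it should survive package updates. It only helps with the "stuck in a failed state" case; it does not address why the children are being killed.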

Hello. I have exactly the same problem with the same symptoms. Obviously this is not an error specific to the topic author's server. Developer help is needed.

I can confirm the same error here 1-2 months ago when trying a new installation on Proxmox. The first time, with low hardware resources for the VM, I suspected too heavy a load, as mentioned here. On the second try, with more cores and RAM for the VM, the result was the same and sssd kept stopping. The workaround from @stephdl didn't help. I checked on a very old bare-metal test machine and got no errors, but that's not a solution because I want virtualization. So I suspected something related to Proxmox as the virtualization host, but I stopped worrying and investigating further, as NS8 was about to be announced. I only want to confirm that this bug exists, in my case strangely only for new installations on Proxmox. Maybe it helps.

Hi @trentatre
I’d like to confirm the opposite: In the past few months I have installed a couple of NS7 for clients, and I do not do “native” bare-metal installs anymore - all in Proxmox, all use AD on NethServer. So far, none have any issues!

I have also run into the SSSD issue, but only very irregularly and rarely. In all those cases, a config restore after deleting the Account Provider helped! (This also implies that AD was working before…).

My 2 cents
Andy

Just 2 times in the week here. My concern when I got the issue a few years ago was that the server was under load due to a misconfigured backup job.

If sssd is restarted by the watchdog, it is not an issue.
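
A quick way to check whether load is a factor is to compare the load average with the number of CPU cores. The sketch below is only an illustration: `load_per_cpu` is a hypothetical helper, and a value that stays well above 1 merely suggests the machine is saturated (for example while a backup job runs).

```shell
# Hypothetical helper: divide a load average by the CPU count.
# $1 = load average (e.g. the first field of /proc/loadavg), $2 = CPU count
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (cpus <= 0) { print "cpu count must be positive" > "/dev/stderr"; exit 1 }
        printf "%.2f\n", load / cpus
    }'
}

# Usage on the server:
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

Running it a few times around the timestamps of the WATCHDOG messages gives a rough idea of whether the two line up.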

But this error appears quite constantly, several times a day. Employees also periodically get an error like this when sending an email; I think it is related.
[Image: screenshot of the error message, translated to English]
Also, there are updates available for the server. Should I update?
Info for the update:
Build system:
python@2.7.5-94.el7_9 from ce-updates

CentOS-minimal:
microcode_ctl@2.1-73.16.el7_9 from ce-updates
python@2.7.5-94.el7_9 from ce-updates
nss-sysinit@3.90.0-2.el7_9 from ce-updates
openssh-clients@7.4p1-23.el7_9 from ce-updates
nspr@4.35.0-1.el7_9 from ce-updates
kernel-tools@3.10.0-1160.102.1.el7 from ce-updates
nss@3.90.0-2.el7_9 from ce-updates
nss-softokn@3.90.0-6.el7_9 from ce-updates
openssh@7.4p1-23.el7_9 from ce-updates
libssh2@1.8.0-4.el7_9.1 from ce-updates
python-libs@2.7.5-94.el7_9 from ce-updates
kernel@3.10.0-1160.102.1.el7 from ce-updates
nss-util@3.90.0-1.el7_9 from ce-updates
kernel-tools-libs@3.10.0-1160.102.1.el7 from ce-updates
ca-certificates@2023.2.60_v7.0.306-72.el7_9 from ce-updates
openssh-server@7.4p1-23.el7_9 from ce-updates
nss-softokn-freebl@3.90.0-6.el7_9 from ce-updates
nss-tools@3.90.0-2.el7_9 from ce-updates

Other:
clamav-update@0.103.11-1.el7 from epel
clamd@0.103.11-1.el7 from epel
clamav-lib@0.103.11-1.el7 from epel
samba-client-libs@4.10.16-25.el7_9 from ce-updates
samba-common@4.10.16-25.el7_9 from ce-updates
bind-export-libs@9.11.4-26.P2.el7_9.15 from ce-updates
cups-libs@1.6.3-52.el7_9 from ce-updates
libwbclient@4.10.16-25.el7_9 from ce-updates
samba-libs@4.10.16-25.el7_9 from ce-updates
c-ares@1.10.0-3.el7_9.1 from ce-updates
clamav@0.103.11-1.el7 from epel
netdata@1.43.2-1.el7 from epel
netdata-data@1.43.2-1.el7 from epel
samba-common-libs@4.10.16-25.el7_9 from ce-updates
clamav-filesystem@0.103.11-1.el7 from epel
samba-common-tools@4.10.16-25.el7_9 from ce-updates
libsmbclient@4.10.16-25.el7_9 from ce-updates
python3-libs@3.6.8-21.el7_9 from ce-updates
netdata-conf@1.43.2-1.el7 from epel
python3@3.6.8-21.el7_9 from ce-updates
python-perf@3.10.0-1160.102.1.el7 from ce-updates

Please help. I have only been familiar with NethServer for two days, and I have not worked with Linux before.

Yes, you should update (only via software center). If it is a production server, do it this weekend so you have Sunday to troubleshoot and on Monday your users can work.

You are doing just fine by asking for help.

Hi @Andy_Wismer, so you recommend waiting for the error, then deleting the AD account provider, then doing a config restore, and hopefully this issue will not occur anymore. I will test another time, but even if this strange trick works, I would still consider this a bug. Or do you have an explanation for this?

@trentatre

I consider this a working “work-around”, not really a solution.

Make sure, before using this trick, that you have working backups!

I have suggested this trick several times in the forum, and it’s always worked for the respective user.

See e.g. here:

See the end, where the user confirms it works!

My 2 cents
Andy

I have performed all available updates, but the problem persists.
After the update the problem got worse. Also, when sending email through Roundcube, one message can be sent, but sending another produces the following error:


I also noticed that when access from the red zone is enabled for the SSHD service, the logs show the service refusing some unknown users. What could that be?


Strangely enough, after adding the red zone to the sssd service, the errors related to the sssd service disappeared.
At 16:30 I restored the system backup that was saved before my manipulations.

Bots / cyber-criminals trying to gain access to the server. The attack can be targeted, but often it is just random. Fail2Ban and SSH through a VPN with a restricted account (without exposing the SSH port to the Internet) are a few of the strategies to harden a server.
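
To get a feel for how much of this is going on, failed logins can be tallied per source address. A minimal sketch, assuming the usual OpenSSH "Failed password for ... from <address>" message format (on CentOS 7 these normally end up in /var/log/secure); `count_failed_ssh_by_ip` is a made-up name, and other rejection messages are not counted.

```shell
# Hypothetical helper: count sshd "Failed password" lines per source address.
# Reads log lines on stdin; prints "<count> <address>", highest count first.
count_failed_ssh_by_ip() {
    awk '/sshd\[[0-9]+\]: Failed password/ {
             # The address is the word right after the first "from".
             for (i = 1; i < NF; i++) if ($i == "from") { hits[$(i+1)]++; break }
         }
         END { for (ip in hits) print hits[ip], ip }' |
        sort -k1,1nr -k2,2
}

# Usage on the server (as root):
#   count_failed_ssh_by_ip < /var/log/secure
```

A long list of unrelated addresses with a handful of attempts each usually points to random scanning rather than a targeted attack.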

Do you mean sshd instead of sssd? When you change zone settings (green, red…) for a service, the firewall is “restarted”.

systemctl restart sssd

Did you try it?

Yes, I tried that. The service restarts, but then when I try to send an email it takes a very long time to send, and the "was terminated by own WATCHDOG" error is written to the logs.
It has continued since the morning; the SSSD service is behaving very badly.

дек 04 08:34:20 mail.mmz.com.ua sssd[nss][10029]: Starting up
дек 04 08:34:30 mail.mmz.com.ua sssd[sssd][24394]: Child [9111] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:34:30 mail.mmz.com.ua sssd[be[mmz.com.ua]][10039]: Starting up
дек 04 08:34:58 mail.mmz.com.ua sssd[sssd][24394]: Child [24397] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:34:58 mail.mmz.com.ua sssd[pam][10114]: Starting up
дек 04 08:35:06 mail.mmz.com.ua sssd[sssd][24394]: Child [10039] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:06 mail.mmz.com.ua sssd[be[mmz.com.ua]][11096]: Starting up
дек 04 08:35:11 mail.mmz.com.ua sssd[sssd][24394]: Child [10029] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:11 mail.mmz.com.ua sssd[nss][11102]: Starting up
дек 04 08:35:13 mail.mmz.com.ua sssd[nss][11115]: Starting up
дек 04 08:35:17 mail.mmz.com.ua sssd[nss][11161]: Starting up
дек 04 08:35:38 mail.mmz.com.ua sssd[sssd][24394]: Child [11096] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:35:38 mail.mmz.com.ua sssd[be[mmz.com.ua]][11283]: Starting up
дек 04 08:37:11 mail.mmz.com.ua sssd[sssd][24394]: Child [11283] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:37:15 mail.mmz.com.ua sssd[be[mmz.com.ua]][11596]: Starting up
дек 04 08:37:38 mail.mmz.com.ua sssd[sssd][24394]: Child [11161] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:37:45 mail.mmz.com.ua sssd[nss][11640]: Starting up
дек 04 08:38:46 mail.mmz.com.ua sssd[sssd][24394]: Child [11640] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:38:46 mail.mmz.com.ua sssd[nss][12666]: Starting up
дек 04 08:38:56 mail.mmz.com.ua sssd[sssd][24394]: Child [11596] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:38:56 mail.mmz.com.ua sssd[be[mmz.com.ua]][12676]: Starting up
дек 04 08:43:57 mail.mmz.com.ua sssd[sssd][24394]: Child [12676] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:43:57 mail.mmz.com.ua sssd[be[mmz.com.ua]][13067]: Starting up
дек 04 08:51:37 mail.mmz.com.ua sssd[sssd][24394]: Child [12666] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:51:37 mail.mmz.com.ua sssd[nss][13637]: Starting up
дек 04 08:52:30 mail.mmz.com.ua sssd[nss][13637]: Enumeration requested but not enabled
дек 04 08:56:28 mail.mmz.com.ua sssd[sssd][24394]: Child [13637] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:56:28 mail.mmz.com.ua sssd[nss][14021]: Starting up
дек 04 08:56:30 mail.mmz.com.ua sssd[sssd][24394]: Child [13067] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:56:30 mail.mmz.com.ua sssd[be[mmz.com.ua]][14023]: Starting up
дек 04 08:58:29 mail.mmz.com.ua sssd[nss][14021]: Enumeration requested but not enabled
дек 04 08:59:53 mail.mmz.com.ua sssd[sssd][24394]: Child [14023] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 08:59:53 mail.mmz.com.ua sssd[be[mmz.com.ua]][14323]: Starting up
дек 04 09:01:59 mail.mmz.com.ua sssd[sssd][24394]: Child [10114] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:02:23 mail.mmz.com.ua sssd[sssd][24394]: Child [14323] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:02:23 mail.mmz.com.ua sssd[be[mmz.com.ua]][14557]: Starting up
дек 04 09:02:24 mail.mmz.com.ua sssd[pam][14521]: Starting up
дек 04 09:02:26 mail.mmz.com.ua sssd[pam][14558]: Starting up
дек 04 09:02:30 mail.mmz.com.ua sssd[pam][14562]: Starting up
дек 04 09:02:30 mail.mmz.com.ua sssd[sssd][24394]: Exiting the SSSD. Could not restart critical service [pam].
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: sssd.service: main process exited, code=exited, status=1/FAILURE
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: Unit sssd.service entered failed state.
дек 04 09:03:35 mail.mmz.com.ua systemd[1]: sssd.service failed.
дек 04 09:07:12 mail.mmz.com.ua systemd[1]: Starting System Security Services Daemon...
дек 04 09:07:41 mail.mmz.com.ua sssd[sssd][17155]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[be[mmz.com.ua]][17188]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[pam][17190]: Starting up
дек 04 09:07:41 mail.mmz.com.ua sssd[nss][17189]: Starting up
дек 04 09:07:41 mail.mmz.com.ua systemd[1]: Started System Security Services Daemon.
дек 04 09:07:44 mail.mmz.com.ua sssd[nss][17189]: Enumeration requested but not enabled
дек 04 09:10:02 mail.mmz.com.ua sssd[sssd][17155]: Child [17189] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:10:02 mail.mmz.com.ua sssd[nss][19521]: Starting up
дек 04 09:10:04 mail.mmz.com.ua sssd[sssd][17155]: Child [17188] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:10:04 mail.mmz.com.ua sssd[be[mmz.com.ua]][19525]: Starting up
дек 04 09:12:07 mail.mmz.com.ua sssd[sssd][17155]: Child [19525] ('mmz.com.ua':'%BE_mmz.com.ua') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:12:07 mail.mmz.com.ua sssd[be[mmz.com.ua]][19702]: Starting up
дек 04 09:15:49 mail.mmz.com.ua sssd[nss][19521]: Enumeration requested but not enabled
дек 04 09:16:43 mail.mmz.com.ua sssd[sssd][17155]: Child [19521] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:16:43 mail.mmz.com.ua sssd[nss][20965]: Starting up
дек 04 09:17:14 mail.mmz.com.ua sssd[sssd][17155]: Child [20965] ('nss':'nss') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:17:14 mail.mmz.com.ua sssd[nss][21082]: Starting up
дек 04 09:17:21 mail.mmz.com.ua sssd[sssd][17155]: Child [17190] ('pam':'pam') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.
дек 04 09:17:21 mail.mmz.com.ua sssd[pam][21148]: Starting up
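
For anyone following along, the watchdog lines above can be tallied per child to see which one is hit most often. A rough sketch, assuming the message format shown in this log; `count_watchdog_kills` is a hypothetical helper name.

```shell
# Hypothetical helper: count "terminated by own WATCHDOG" lines per child.
# Reads log lines on stdin; prints "<count> <child>", highest count first.
count_watchdog_kills() {
    # With "'" as the field separator, field 2 is the first name in ('nss':'nss').
    awk -F"'" '/terminated by own WATCHDOG/ { count[$2]++ }
               END { for (name in count) print count[name], name }' |
        sort -k1,1nr -k2,2
}

# Usage on the server:
#   journalctl -u sssd --since today | count_watchdog_kills
```

The "corresponding logs" the message refers to are normally the per-child files under /var/log/sssd/ (for example sssd_nss.log), since the daemon here runs with --logger=files.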