SASL LOGIN authentication failed: Connection lost to authentication server

NethServer Version: 7.8.2003
Module: postfix

This message has been popping up in my logs:

hermod.durerocaribe.cu postfix/smtpd[20966]: warning: some_pc_name.durerocaribe.cu[###.###.###.###]: SASL LOGIN authentication failed: Connection lost to authentication server

Every time it happens, the user of that PC complains that Outlook asks for a password before sending an email.

Any thoughts? I appreciate the help.

Is it only one PC? Maybe Outlook has wrong saved credentials.

Check your email client's configuration carefully. Which version of Outlook are we talking about?
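
One way to answer the "only one PC?" question from the logs is to count the failures per client. This is a minimal sketch, assuming the warning lines have the format shown above and that Postfix logs to /var/log/maillog (the usual place on CentOS 7, but check your system). The function name `sasl_failures_by_client` is made up for illustration.

```shell
# Count "SASL <mechanism> authentication failed" warnings per client host.
# Reads log lines on stdin and prints "<count> <client>" with the busiest client first.
sasl_failures_by_client() {
    # Keep only the client name that precedes the bracketed IP address.
    sed -n 's/.*warning: \([^[]*\)\[[^]]*\]: SASL [^ ]* authentication failed.*/\1/p' |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage (assumed log path):
#   sasl_failures_by_client < /var/log/maillog
```

If the counts are spread over many clients, the problem is more likely on the server side than in a single Outlook profile.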

No, this happens with almost all PCs.

The Outlook version varies: 2007, 2010 and 2013.

I have also seen this error several times in my logs:

Aug 28 16:50:58 hermod.durerocaribe.cu httpd[12051]: [ERROR] NethServer\Tool\GroupProvider: Account provider generic error: SSSD exit code 1
Sep 03 12:41:51 hermod.durerocaribe.cu httpd[10701]: [ERROR] NethServer\Tool\GroupProvider: Account provider generic error: SSSD exit code 1
Sep 03 12:41:59 hermod.durerocaribe.cu httpd[17990]: [ERROR] NethServer\Tool\GroupProvider: Account provider generic error: SSSD exit code 1
Sep 04 09:19:27 hermod.durerocaribe.cu httpd[28814]: [ERROR] NethServer\Tool\GroupProvider: Account provider generic error: SSSD exit code 1

Maybe it is a problem related to AD authentication between postfix and SSSD?
How can I check this?

Does the following command work?

account-provider-test dump

Does restarting NSDC solve the problem?

systemctl restart nsdc

Maybe machines.target is not running/enabled?

A config restore may solve the problem too:

For NS AD:

account-provider-test dump
{
   "BindDN" : "ldapservice@LOCAL.DUREROCARIBE.CU",
   "LdapURI" : "ldaps://nsdc-odin.local.durerocaribe.cu",
   "DiscoverDcType" : "ldapuri",
   "StartTls" : "",
   "port" : 636,
   "host" : "nsdc-odin.local.durerocaribe.cu",
   "isAD" : "1",
   "isLdap" : "",
   "UserDN" : "dc=local,dc=durerocaribe,dc=cu",
   "GroupDN" : "dc=local,dc=durerocaribe,dc=cu",
   "BindPassword" : "SECRET",
   "BaseDN" : "dc=local,dc=durerocaribe,dc=cu",
   "LdapUriDn" : "ldap:///dc%3Dlocal%2Cdc%3Ddurerocaribe%2Cdc%3Dcu"
}

For NS Mail

{
   "BindDN" : "ldapservice@LOCAL.DUREROCARIBE.CU",
   "LdapURI" : "ldap://nsdc-odin.local.durerocaribe.cu",
   "DiscoverDcType" : "dns",
   "StartTls" : "1",
   "port" : 389,
   "host" : "nsdc-odin.local.durerocaribe.cu",
   "isAD" : "1",
   "isLdap" : "",
   "UserDN" : "DC=local,DC=durerocaribe,DC=cu",
   "GroupDN" : "DC=local,DC=durerocaribe,DC=cu",
   "BindPassword" : "SECRET",
   "BaseDN" : "DC=local,DC=durerocaribe,DC=cu",
   "LdapUriDn" : "ldap:///dc%3Dlocal%2Cdc%3Ddurerocaribe%2Cdc%3Dcu"
}

For NS Proxy

{
   "BindDN" : "ldapservice@LOCAL.DUREROCARIBE.CU",
   "LdapURI" : "ldap://nsdc-odin.local.durerocaribe.cu",
   "DiscoverDcType" : "dns",
   "StartTls" : "1",
   "port" : 389,
   "host" : "nsdc-odin.local.durerocaribe.cu",
   "isAD" : "1",
   "isLdap" : "",
   "UserDN" : "DC=local,DC=durerocaribe,DC=cu",
   "GroupDN" : "DC=local,DC=durerocaribe,DC=cu",
   "BindPassword" : "SECRET",
   "BaseDN" : "DC=local,DC=durerocaribe,DC=cu",
   "LdapUriDn" : "ldap:///dc%3Dlocal%2Cdc%3Ddurerocaribe%2Cdc%3Dcu"
}

For NS Mattermost

{
   "BindDN" : "ldapservice@LOCAL.DUREROCARIBE.CU",
   "LdapURI" : "ldap://nsdc-odin.local.durerocaribe.cu",
   "DiscoverDcType" : "dns",
   "StartTls" : "1",
   "port" : 389,
   "host" : "nsdc-odin.local.durerocaribe.cu",
   "isAD" : "1",
   "isLdap" : "",
   "UserDN" : "DC=local,DC=durerocaribe,DC=cu",
   "GroupDN" : "DC=local,DC=durerocaribe,DC=cu",
   "BindPassword" : "SECRET",
   "BaseDN" : "DC=local,DC=durerocaribe,DC=cu",
   "LdapUriDn" : "ldap:///dc%3Dlocal%2Cdc%3Ddurerocaribe%2Cdc%3Dcu"
}

OK, so AD seems to work…maybe try a lower TLS policy for Outlook?

Maybe Outlook logs show more information:

https://support.microsoft.com/en-us/help/2862843/how-to-enable-global-and-advanced-logging-for-microsoft-outlook

This is the output shown after NS AD starts:

systemctl status nsdc
● nsdc.service - NethServer Domain Controller container
   Loaded: loaded (/usr/lib/systemd/system/nsdc.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-09-04 13:09:49 CDT; 31min ago
     Docs: man:systemd-nspawn(1)
 Main PID: 963 (systemd-nspawn)
   Status: "Container running."
    Tasks: 32
   Memory: 231.1M
   CGroup: /machine.slice/nsdc.service
           ├─963 /usr/bin/systemd-nspawn --quiet --keep-unit --boot --network-bridge=br0 --machine=nsdc --capability=CAP_SYS_TIME
           ├─968 /usr/lib/systemd/systemd
           └─system.slice
             ├─samba.service
             │ ├─1976 /usr/sbin/samba -i --debug-stderr
             │ ├─1984 /usr/sbin/samba -i --debug-stderr
             │ ├─1985 /usr/sbin/samba -i --debug-stderr
             │ ├─1986 /usr/sbin/samba -i --debug-stderr
             │ ├─1987 /usr/sbin/samba -i --debug-stderr
             │ ├─1988 /usr/sbin/samba -i --debug-stderr
             │ ├─1989 /usr/sbin/samba -i --debug-stderr
             │ ├─1990 /usr/sbin/samba -i --debug-stderr
             │ ├─1991 /usr/sbin/samba -i --debug-stderr
             │ ├─1992 /usr/sbin/samba -i --debug-stderr
             │ ├─1993 /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
             │ ├─1994 /usr/sbin/samba -i --debug-stderr
             │ ├─1995 /usr/sbin/samba -i --debug-stderr
             │ ├─1996 /usr/sbin/samba -i --debug-stderr
             │ ├─1997 /usr/sbin/samba -i --debug-stderr
             │ ├─1998 /usr/sbin/samba -i --debug-stderr
             │ ├─1999 /usr/sbin/samba -i --debug-stderr
             │ ├─2004 /usr/sbin/winbindd -D --option=server role check:inhibit=yes --foreground
             │ ├─2022 /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
             │ ├─2023 /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
             │ ├─2024 /usr/sbin/smbd -D --option=server role check:inhibit=yes --foreground
             │ ├─2661 /usr/sbin/samba -i --debug-stderr
             │ ├─2850 /usr/sbin/samba -i --debug-stderr
             │ ├─2852 /usr/sbin/samba -i --debug-stderr
             │ └─2856 /usr/sbin/samba -i --debug-stderr
             ├─console-getty.service
             │ └─1975 /sbin/agetty --noclear --keep-baud console 115200,38400,9600 vt220
             ├─systemd-logind.service
             │ └─1974 /usr/lib/systemd/systemd-logind
             ├─dbus.service
             │ └─1971 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
             ├─ntpd.service
             │ └─1972 /usr/sbin/ntpd -u ntp:ntp -g
             └─systemd-journald.service
               └─1056 /usr/lib/systemd/systemd-journald

Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Started Network Service.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Reached target Network.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Started Samba domain controller daemon.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Started Login Service.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Reached target Multi-User System.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Reached target Graphical Interface.
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: Starting Update UTMP about System Runlevel Changes...
Sep 04 13:09:52 odin.durerocaribe.cu systemd-nspawn[963]: [  OK  ] Started Update UTMP about System Runlevel Changes.
Sep 04 13:09:53 odin.durerocaribe.cu systemd-nspawn[963]: CentOS Linux 7 (Core)
Sep 04 13:09:53 odin.durerocaribe.cu systemd-nspawn[963]: Kernel 3.10.0-1127.19.1.el7.x86_64 on an x86_64

When I restart NSDC, this happens:

systemctl restart nsdc
systemctl status nsdc
● nsdc.service - NethServer Domain Controller container
   Loaded: loaded (/usr/lib/systemd/system/nsdc.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2020-09-04 14:06:33 CDT; 4s ago
     Docs: man:systemd-nspawn(1)
  Process: 3762 ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --network-bridge=${BRIDGE} --machine=nsdc ${OPTIONS} (code=exited, status=0/SUCCESS)
 Main PID: 3762 (code=exited, status=0/SUCCESS)
   Status: "Terminating..."

Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: Stopping Load/Save Random Seed...
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: Stopping Update UTMP about System Boot/Shutdown...
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: [  OK  ] Stopped Load/Save Random Seed.
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: [  OK  ] Stopped Update UTMP about System Boot/Shutdown.
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: [  OK  ] Stopped Create Volatile Files and Directories.
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: [  OK  ] Reached target Shutdown.
Sep 04 14:06:30 odin.durerocaribe.cu systemd-nspawn[3762]: Sending SIGTERM to remaining processes...
Sep 04 14:06:33 odin.durerocaribe.cu systemd-nspawn[3762]: Sending SIGKILL to remaining processes...
Sep 04 14:06:33 odin.durerocaribe.cu systemd-nspawn[3762]: Halting system.
Sep 04 14:06:33 odin.durerocaribe.cu systemd[1]: Stopped NethServer Domain Controller container.
[root@odin ~]# systemctl restart nsdc
[root@odin ~]# systemctl status nsdc
● nsdc.service - NethServer Domain Controller container
   Loaded: loaded (/usr/lib/systemd/system/nsdc.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-09-04 14:06:48 CDT; 2s ago
     Docs: man:systemd-nspawn(1)
 Main PID: 3924 (systemd-nspawn)
   Status: "Container running."
    Tasks: 4
   Memory: 9.8M
   CGroup: /machine.slice/nsdc.service
           ├─3924 /usr/bin/systemd-nspawn --quiet --keep-unit --boot --network-bridge=br0 --machine=nsdc --capability=CAP_SYS_TIME
           ├─3925 /usr/lib/systemd/systemd
           └─system.slice
             ├─systemd-journal-flush.service
             │ └─3944 /usr/bin/journalctl --flush
             └─systemd-journald.service
               └─3941 /usr/lib/systemd/systemd-journald

Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: Mounting Huge Pages File System...
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Reached target Local File Systems (Pre).
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Reached target Local File Systems.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Reached target Slices.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Reached target Local Encrypted Volumes.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Started Load/Save Random Seed.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Mounted POSIX Message Queue File System.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Started Journal Service.
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: Starting Flush Journal to Persistent Storage...
Sep 04 14:06:48 odin.durerocaribe.cu systemd-nspawn[3924]: [  OK  ] Mounted Huge Pages File System.

After restarting, I tried to send one mail, but Outlook once again asked for a password,
and I got the same log message:

Sep 04 14:00:48 hermod.durerocaribe.cu postfix/smtpd[6568]: warning: inf-jfernandez-vm.durerocaribe.cu[192.168.9.170]: SASL LOGIN authentication failed: Connection lost to authentication server

Do you use fail2ban or IPS and it blocks somehow?
Did you try changing TLS policy?
Does it work when you configure Outlook to use another auth method?

I use fail2ban on NS Mail, but all my NS VMs are inside my LAN and on the same subnet (192.168.9.0/24), and I haven’t changed the default fail2ban configuration set by NS. So fail2ban won’t ban any IP inside the green zone (LAN).

Did you try changing TLS policy?
Could you please elaborate? I don’t know what you mean by this :sweat_smile:

This is the configuration I use on Outlook 2013 (IMAP)

Is there another one I could try?

In System/TLS policy you can set it:

Yes, you may try SMTP port 465 and SSL instead of TLS.

And 993 + SSL for IMAP.
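
To take Outlook out of the picture, the same SASL LOGIN exchange can be tried by hand over port 465. The following is only a sketch: the helper name `b64` and the sample credentials are placeholders, and it assumes `openssl` and `base64` are installed on the machine you test from.

```shell
# Base64-encode one credential for the SMTP "AUTH LOGIN" dialogue.
# printf is used so that no trailing newline ends up in the encoded value.
b64() { printf '%s' "$1" | base64; }

# Manual test, run interactively (host name taken from the log lines above):
#   openssl s_client -quiet -crlf -connect hermod.durerocaribe.cu:465
#   EHLO test
#   AUTH LOGIN
#   <paste the output of: b64 'user@example.org'>
#   <paste the output of: b64 'password'>
# A "235" reply means authentication succeeded; a "535" reply means it failed.
```

If the manual login also fails with the same warning in the mail log, the client configuration is probably not the cause.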


I finally solved the problem :relieved:. The HDDs seem to have been malfunctioning; thanks to iostat and netdata I was able to see high disk utilization on all my NS VMs. I decided to back up all my VMs and containers, replace all the Proxmox HDDs and reinstall Proxmox.
So far everything looks good. I will monitor disk utilization and report back.
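
For that kind of monitoring, a rough sketch of a filter over `iostat -x` output could look like this. The function name `busy_disks` and the 80% threshold are illustrative choices, not something NethServer or Proxmox ships, and the sketch assumes `iostat` comes from the sysstat package and prints a header line starting with "Device" that contains a `%util` column.

```shell
# Print "<device> <%util>" for every device whose utilization exceeds a limit.
# Reads `iostat -x` output on stdin; the limit defaults to 80 (percent).
busy_disks() {
    awk -v limit="${1:-80}" '
        # Header line: remember which column holds %util.
        /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        # Data line: report the device when its %util is above the limit.
        col && NF >= col && $col + 0 > limit + 0 { print $1, $col }
    '
}

# Usage on a live system: three extended device reports, five seconds apart.
#   iostat -dx 5 3 | busy_disks 80
```

The first report printed by iostat covers the time since boot, so the later reports are the ones that reflect the current load.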

@mrmarkuz, @pike thanks for the support.


It is a long way from a “lost connection” seen by a client of a guest service to diagnosing a hardware problem on the host…
But it makes sense: the chain of requests from the client to LDAP (and back) was too slow for the system to work.
Once more, though: “doing the homework” helps a lot to pinpoint the issue, both before posting a support request (analyze the status of the installation and its surroundings, take note of everything that is “not obvious” to the community, and write a comprehensive yet brief installation status report and issue description) and after (helping the community concentrate on understanding what is wrong instead of on “what the hell is this setup???”).