IMAP connection issues on NS8

NethServer Version: current NS8
Module: dovecot/auth

This problem appears in Roundcube and the Mail app on iOS. It has been a problem since I upgraded to NS8, and I can’t find any explanation in the logs. I haven’t noticed it as much with Thunderbird, except that sometimes when I write a mail it shows a popup saying that it couldn’t connect to the server.

On iOS the user gets an error message that says “Temporary authentication failure”. It’s not consistent, but it appears more and more often. When using Roundcube, users can log in, but it gets stuck when loading the messages. Yesterday I noticed six auth processes running that consumed almost all of the CPU, so I restarted the server, but it’s still unreliable.

My NS8 is a VM on PVE with Debian 13 as OS.

Hi Mahaq, please go to the System logs page and follow the log for a while. Then share the full logs, or any relevant excerpt that may help to understand the issue.

Did you finish the NS7 migration? Are you using Samba or OpenLDAP for users and groups? Is it internal or an external service?

If the system has few resources, a reboot may help in some cases.

Hi Davide

The migration is completed and the old server is shut down. Good question about user accounts, but it should be Samba: there is a container named samba-dc that is running, and it lists all users. The server is used externally.

I have assigned 2 vCPUs and 8 GB RAM to the VM, and that has been enough so far; the installation has only a few users.

Here is a short extract from the logs that points to a timeout issue with LDAP.

2026-06-24T20:46:42+02:00 [1::systemd] Starting prometheus-node-exporter-apt.service - Collect apt metrics for prometheus-node-exporter...
2026-06-24T20:46:42+02:00 [1::systemd] Starting prometheus-node-exporter-nvme.service - Collect NVMe metrics for prometheus-node-exporter...
2026-06-24T20:47:18+02:00 [1:mail1:dovecot] imap(84630): Error: auth-master: login: request [298844161]: Login auth request failed: Internal auth failure (auth connected 60025 msecs ago, request took 60025 msecs, client-pid=84626 client-id=1)
2026-06-24T20:47:18+02:00 [1:mail1:dovecot] imap-login: Disconnected: Internal login failure (pid=84626 id=1): user=, method=PLAIN, rip=95.193.70.43, lip=192.168.1.106, mpid=84630, TLS, session=
2026-06-24T20:47:18+02:00 [1:mail1:dovecot] imap(user1)<1469>: copy from INBOX: box=Trash, uid=41417, msgid=<2028694997.66199.1782326239011@lor1-app145158.prod.linkedin.com>, from=LinkedIn <messages-noreply@linkedin.com>, subject=*********, flags=(\Seen NonJunk)
2026-06-24T20:47:18+02:00 [1:mail1:rspamd] (controller) ; monitored; rspamd_monitored_dns_cb: DNS reply returned 'no error' for score.senderscore.com while 'no records with this name' was expected when querying for '1.0.0.127.score.senderscore.com'(likely DNS spoofing or BL internal issues)
2026-06-24T20:47:18+02:00 [1:mail1:dovecot] auth: Error: auth client 0 disconnected with 1 pending requests: EOF
2026-06-24T20:47:18+02:00 [1:crowdsec1:crowdsec1-firewall-bouncer] time="2026-06-24T18:46:45Z" level=info msg="56 decisions deleted"
2026-06-24T20:47:18+02:00 [1:mail1:rspamd] (rspamd_proxy) ; proxy; proxy_milter_finish_handler: finished milter connection
2026-06-24T20:47:25+02:00 [1:mail1:dovecot] auth-worker(84631): Error: ldap(/etc/dovecot/passdb.conf.ext): Initial binding to LDAP server timed out
2026-06-24T20:47:25+02:00 [1:ldapproxy1:ldapproxy] 2026/06/24 18:47:25 [info] 25#25: *33363 client disconnected, bytes from/to client:71/0, bytes from/to upstream:0/71
2026-06-24T20:47:25+02:00 [1:mail1:dovecot] auth-worker(84631): Error: ldap(/etc/dovecot/userdb.conf.ext): Initial binding to LDAP server timed out
2026-06-24T20:47:27+02:00 [1:mail1:dovecot] indexer-worker(user1)<84628><>: Error: auth-master: userdb lookup(user1): Auth USER lookup failed
2026-06-24T20:47:27+02:00 [1:mail1:dovecot] indexer-worker(84628): Error: conn unix:indexer-worker (pid=84627,uid=90): User user1 lookup failed: Internal error occurred. Refer to server log for more information.
2026-06-24T20:47:27+02:00 [1:ldapproxy1:ldapproxy] 2026/06/24 18:47:27 [info] 25#25: *33379 client 127.0.0.1:38580 connected to 127.0.0.1:20000
2026-06-24T20:47:29+02:00 [1::promtail] ts=2026-06-24T18:47:29.23426925Z level=info msg="reporting Alloy stats" date=2026-06-24T18:47:29.234Z
2026-06-24T20:47:29+02:00 [1::promtail] ts=2026-06-24T18:47:29.258462864Z level=info msg="failed to send usage report" retries=0 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate is valid for autoconfig.domain.tld, mail.domain.tld, mail.valhall.domain.tld, mist.valhall.domain.tld, smtp.domain.tld, smtp.valhall.domain.tld, webmail.domain.tld, webtop.domain.tld, not stats.grafana.org"
2026-06-24T20:47:31+02:00 [1::promtail] ts=2026-06-24T18:47:31.290979691Z level=info msg="failed to send usage report" retries=1 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate is valid for autoconfig.domain.tld, mail.domain.tld, mail.valhall.domain.tld, mist.valhall.domain.tld, smtp.domain.tld, smtp.valhall.domain.tld, webmail.domain.tld, webtop.domain.tld, not stats.grafana.org"
2026-06-24T20:47:34+02:00 [1::promtail] ts=2026-06-24T18:47:34.845261929Z level=info msg="failed to send usage report" retries=2 err="Post \"https://stats.grafana.org/alloy-usage-report\": tls: failed to verify certificate: x509: certificate is valid for autoconfig.domain.tld, mail.domain.tld, mail.valhall.domain.tld, mist.valhall.domain.tld, smtp.domain.tld, smtp.valhall.domain.tld, webmail.domain.tld, webtop.domain.tld, not stats.grafana.org"
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] auth: Error: auth-worker: Aborted USER request for admin: Lookup timed out
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] auth-worker(84634): Error: ldap(/etc/dovecot/passdb.conf.ext): Initial binding to LDAP server timed out
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] auth-worker(84634): Error: ldap(/etc/dovecot/userdb.conf.ext): Initial binding to LDAP server timed out
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] imap(84633): Error: auth-master: login: request [4194172929]: Login auth request failed: Internal auth failure (auth connected 60036 msecs ago, request took 60036 msecs, client-pid=84632 client-id=1)
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] auth-worker(84634): Warning: conn unix:auth-worker (pid=83359,uid=90): Auth master disconnected us while handling request for admin for 60 secs (result=FAIL)
2026-06-24T20:47:36+02:00 [1:mail1:dovecot] imap-login: Disconnected: Internal login failure (pid=84632 id=1): user=, method=PLAIN, rip=10.5.4.1, lip=10.5.4.1, mpid=84633, secured, session=
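
To see how often each of these errors occurs over a longer capture, the saved log can be tallied with a short script. This is only a minimal sketch, not an NS8 tool: it assumes the log was saved as text in the same layout as the extract, and the function name `count_dovecot_errors` and the pattern list are hypothetical and cover only the messages seen in this excerpt.

```python
import re
from collections import Counter

# Message fragments taken from the excerpt; extend the table as needed.
PATTERNS = {
    "ldap_bind_timeout": "Initial binding to LDAP server timed out",
    "login_failed": "Login auth request failed",
    "userdb_lookup_failed": "Auth USER lookup failed",
    "lookup_timed_out": "Lookup timed out",
}

# Matches the first "[node:module:unit]" tag on a line.
# Optional backslashes are tolerated in case the brackets were escaped on paste.
TAG_RE = re.compile(r"\\?\[(\d*):([^:\]\\]*):([^\]\\]*)\\?\]")


def count_dovecot_errors(lines):
    """Count the known error messages on lines tagged with the dovecot unit."""
    counts = Counter()
    for line in lines:
        tag = TAG_RE.search(line)
        if tag is None or tag.group(3) != "dovecot":
            continue  # skip systemd, rspamd, ldapproxy, promtail, ...
        for key, needle in PATTERNS.items():
            if needle in line:
                counts[key] += 1
    return counts


sample = [
    "2026-06-24T20:47:25+02:00 [1:mail1:dovecot] auth-worker(84631): Error: ldap(/etc/dovecot/passdb.conf.ext): Initial binding to LDAP server timed out",
    "2026-06-24T20:47:25+02:00 [1:ldapproxy1:ldapproxy] 2026/06/24 18:47:25 [info] 25#25: *33363 client disconnected, bytes from/to client:71/0, bytes from/to upstream:0/71",
    "2026-06-24T20:47:25+02:00 [1:mail1:dovecot] auth-worker(84631): Error: ldap(/etc/dovecot/userdb.conf.ext): Initial binding to LDAP server timed out",
    "2026-06-24T20:47:36+02:00 [1:mail1:dovecot] auth: Error: auth-worker: Aborted USER request for admin: Lookup timed out",
]
print(count_dovecot_errors(sample))
```

Grouping the same counts by hour would show whether the bind timeouts cluster around particular times or are spread evenly, which is hard to judge from a short extract.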

Hi,

The Dovecot errors suggest that authentication requests are timing out while trying to bind to LDAP:

ldap(...): Initial binding to LDAP server timed out

Since the migration has completed successfully and the samba-dc container is running with the expected users, this could indicate that Dovecot is temporarily using an outdated or incorrect configuration and is trying to contact the wrong LDAP endpoint.

As a first troubleshooting step, I’d suggest restarting the Mail application from the Applications page. If the issue persists, try rebooting the whole server to ensure all services reload their configuration and reconnect to the current LDAP service.

After the restart/reboot, could you check whether the authentication failures are still occurring and whether users can log in normally?

Thanks.

I’ve already done that many times and it doesn’t solve the problem. I have searched through a lot of files for references to the old server without finding any. The only thing I found was that it used the old nsdc-hostname, but I have updated the DNS server so that it points to this VM.

Let’s see the ldapproxy configuration. Please paste the output of:

runagent -m ldapproxy1 cat nginx/nginx.conf

If the command fails, check the ldapproxy ID with

runagent -l | grep ldapproxy

user nginx;
worker_processes auto;

error_log /var/log/nginx/error.log info;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

# L4 proxy to LDAP account providers

stream {
    # Domain internal.domain.tld
    server {
        proxy_pass internal_domain_tld;
        listen 127.0.0.1:20000;

        proxy_ssl on;
    }
    upstream internal.domain.tld {
        server 192.168.1.106:636; # origin internal.domain.tld
    }

}
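
One thing that can be checked mechanically in a stream configuration like this is whether every `proxy_pass` target names an `upstream` block that exists, and which backend address that block points to. The sketch below is a hypothetical helper, not part of NS8 or nginx. It uses plain regular expressions rather than a real nginx parser, so it only copes with the flat layout of this paste. Note that in the pasted output the `proxy_pass` name (`internal_domain_tld`) and the `upstream` name (`internal.domain.tld`) differ, which may simply be a side effect of anonymizing the domain before posting.

```python
import re


def check_stream_upstreams(conf_text):
    """Return (upstreams, missing) for an nginx stream configuration.

    upstreams maps each upstream name to its list of server addresses.
    missing lists proxy_pass targets that have no upstream block of the
    same name (a literal host:port target would also be listed here).
    """
    upstreams = {}
    for name, body in re.findall(r"upstream\s+(\S+)\s*\{([^}]*)\}", conf_text):
        upstreams[name] = re.findall(r"^\s*server\s+([^;\s]+)\s*;", body, re.M)
    targets = re.findall(r"^\s*proxy_pass\s+([^;\s]+)\s*;", conf_text, re.M)
    missing = [t for t in targets if t not in upstreams]
    return upstreams, missing


# The stream section as pasted in this thread.
CONF = """
stream {
    # Domain internal.domain.tld
    server {
        proxy_pass internal_domain_tld;
        listen 127.0.0.1:20000;

        proxy_ssl on;
    }
    upstream internal.domain.tld {
        server 192.168.1.106:636; # origin internal.domain.tld
    }
}
"""

upstreams, missing = check_stream_upstreams(CONF)
print("upstreams:", upstreams)
print("proxy_pass targets without an upstream block:", missing)
```

Running the same check against the unedited file on disk would show whether the name mismatch is real or only an artifact of the paste.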

Could the IP be getting banned by CrowdSec?

It’s not, and I have had the problem since the migration. I did the migration three months ago and set up CrowdSec three weeks ago. Something I’ve noticed is that one user has had the problem since the migration, another got it a month ago, and another has noticed it for the last week. All user accounts have their passwords set to not expire.