Traefik Suddenly Not Routing - No errors

Ted · May 31, 2024, 4:16pm

Hi there,

I woke up this morning and suddenly none of my internet facing services are working. Since they run on multiple servers, this was a surprise. I thought maybe something had gone wrong with the router, but alas, after spending half the day troubleshooting, the only remaining piece is Traefik.

I’ve run all the NS8 updates and am on the latest core as of today. Traefik shows no errors in the log. However, I can go to the LAN address of a service and it loads perfectly. When I try to connect to the same service using its DNS name, it fails to load. This happens even if I eliminate the router and put the DNS name in a hosts file pointed at the NS8 server.

Please help! Going crazy!

Here’s a snippet of the Traefik log:

Log

2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/cluster/task/c91dc3c6-e120-4bbe-95e7-af81ada3f6f9/status HTTP/2.0" 200 1281 "-" "-" 1354 "ApiServer-https@file" "http://127.0.0.1:9311" 6ms
2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/module/loki1/task/fae2b206-bbc4-4aa3-8113-daa2d8a8b1ba/status HTTP/2.0" 200 190 "-" "-" 1355 "ApiServer-https@file" "http://127.0.0.1:9311" 5ms
2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/cluster/task/9620ee9d-4f74-4fb8-8d25-f7664ab28c1e/context HTTP/2.0" 200 240 "-" "-" 1356 "ApiServer-https@file" "http://127.0.0.1:9311" 3ms
2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/cluster/task/9620ee9d-4f74-4fb8-8d25-f7664ab28c1e/context HTTP/2.0" 200 240 "-" "-" 1357 "ApiServer-https@file" "http://127.0.0.1:9311" 3ms
2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/cluster/task/9620ee9d-4f74-4fb8-8d25-f7664ab28c1e/status HTTP/2.0" 200 250 "-" "-" 1358 "ApiServer-https@file" "http://127.0.0.1:9311" 3ms
2024-05-31T17:17:37+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:37 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 1359 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms
2024-05-31T17:17:42+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:42 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 1360 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms
2024-05-31T17:17:47+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:47 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 1361 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms
2024-05-31T17:17:53+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:53 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 1362 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms
2024-05-31T17:17:58+01:00 [1:traefik1:traefik] 192.168.7.114 - - [31/May/2024:16:17:58 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 1363 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

Ted · June 1, 2024, 10:21am

Hi guys. Don’t wish to be a pain here, but all of my services are out. This happened without me adjusting/touching/doing ANYTHING. I’ve traced the problem to Traefik. I suppose I could go hunt around on Traefik forums, but I can’t stop thinking about…

Because NS8 chooses to do things its own way, me poking about in Traefik config files is as likely to do damage as help anything. Surely someone has some suggestions about where I can look to get this sorted?

Thank you!

Ted · June 1, 2024, 10:24am

Ok. Disregard. Suddenly it’s all working again… Would still love to know where to look next time this happens. If there is a doc on Traefik or a forum post I’ve missed that I need to read up on, I’d be grateful to whoever points me in the right direction.

Ted · June 1, 2024, 11:02am

And… it’s broken again. Here’s what I’ve learned:

Traefik appears to be working for a few minutes after a restart. Then it quickly fails in some way I’ve yet to determine.

Here’s some additional info in hopes that it will draw out some ideas and suggestions:

Traefik acts as a reverse proxy for six services:

email (running via the NS8 module)
calibre-web (a docker container on OMV6 - same proxmox as NS8)
immich (hosted on a separate RPi without any virtualization)
emby (its own VM on the same proxmox as NS8)
wordpress 1 (hosted on NS8 - wordpress module)
wordpress 2 (hosted on NS8 - wordpress module)

I can connect to calibre-web, immich and emby using their internal addresses. They respond instantly. However, if I enter the internet facing addresses Traefik is supposed to be routing, I get a timeout after a very long time.
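A long timeout and an immediate error point in different directions, so it can help to classify what the TCP connection itself does before looking at Traefik's routing. The following is a minimal sketch, assuming Python 3 is available on a LAN client; the function name `probe` and the address in the usage comment are hypothetical, not taken from this thread. An "open" result only shows that something accepted the connection on that port, not that Traefik routed the request correctly.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused' or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # something accepted the connection
    except ConnectionRefusedError:
        return "refused"           # host reachable, nothing listening on the port
    except (socket.timeout, TimeoutError):
        return "timeout"           # no answer at all within the time limit
    except OSError as exc:
        return f"error: {exc}"     # e.g. no route to host


# Usage (hypothetical address, replace with the NS8 node's LAN IP):
#   for port in (80, 443):
#       print(port, probe("192.0.2.10", port))
```

If ports 80 and 443 report "timeout" while the admin UI on the same node still answers, that would suggest looking at what sits in front of or beside Traefik (firewall rules, port forwarding) as well as at Traefik itself.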

I am unable to connect to email or either wordpress site. I suppose I could try to create an SSH tunnel and bypass Traefik, but I feel like I’ve already isolated the problem.

As you can see in the log snippet above, the Traefik logs are almost entirely heartbeats for a partially migrated nextcloud server. Since that migration won’t complete and I need to see what’s really happening, I’ve tried to remove that instance, but uninstall failed.

Here’s another log snippet:

Log:

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/settings-firewall~9b659218.1524a672.js HTTP/2.0" 200 1314 "-" "-" 90 "ApiServer-https@file" "http://127.0.0.1:9311" 179ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/app~748942c6.302e6417.js HTTP/2.0" 200 3532 "-" "-" 30 "ApiServer-https@file" "http://127.0.0.1:9311" 215ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~5bb1f863.3fccd3d4.js HTTP/2.0" 200 16847 "-" "-" 41 "ApiServer-https@file" "http://127.0.0.1:9311" 216ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/cluster-admins~21833f8f.bffaa3c6.js HTTP/2.0" 200 4668 "-" "-" 83 "ApiServer-https@file" "http://127.0.0.1:9311" 184ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~d2305125.6a99a4ff.js HTTP/2.0" 200 23978 "-" "-" 58 "ApiServer-https@file" "http://127.0.0.1:9311" 217ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/settings-tls-certificates~31ecd969.9376ed65.js HTTP/2.0" 200 7456 "-" "-" 98 "ApiServer-https@file" "http://127.0.0.1:9311" 185ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/subscription~31ecd969.c60a3a4d.css HTTP/2.0" 200 178 "-" "-" 75 "ApiServer-https@file" "http://127.0.0.1:9311" 200ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/lang-pt-translation-json~45d767f3.026b5334.js HTTP/2.0" 200 17577 "-" "-" 85 "ApiServer-https@file" "http://127.0.0.1:9311" 187ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/app~c64e6414.0e5c33f1.js HTTP/2.0" 200 13904 "-" "-" 32 "ApiServer-https@file" "http://127.0.0.1:9311" 225ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/settings-tls-certificates~31ecd969.8ccd1286.css HTTP/2.0" 200 138 "-" "-" 77 "ApiServer-https@file" "http://127.0.0.1:9311" 191ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/settings-system-logs~21833f8f.46ce804d.css HTTP/2.0" 200 228 "-" "-" 73 "ApiServer-https@file" "http://127.0.0.1:9311" 205ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~b4f4bbe2.d842050e.js HTTP/2.0" 200 6119 "-" "-" 45 "ApiServer-https@file" "http://127.0.0.1:9311" 225ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/software-center~31ecd969.5a10b3f1.css HTTP/2.0" 200 11873 "-" "-" 74 "ApiServer-https@file" "http://127.0.0.1:9311" 205ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/settings-acme-servers~31ecd969.1cc194b8.css HTTP/2.0" 200 83 "-" "-" 68 "ApiServer-https@file" "http://127.0.0.1:9311" 206ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/app~970f9218.74b46c81.js HTTP/2.0" 200 6896 "-" "-" 31 "ApiServer-https@file" "http://127.0.0.1:9311" 226ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/subscription~31ecd969.61dc061d.js HTTP/2.0" 200 3924 "-" "-" 101 "ApiServer-https@file" "http://127.0.0.1:9311" 190ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/lang-ar-translation-json~3bbe8b71.e78b00d2.js HTTP/2.0" 200 18037 "-" "-" 93 "ApiServer-https@file" "http://127.0.0.1:9311" 191ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/software-center~31ecd969.ecdcfea1.js HTTP/2.0" 200 16959 "-" "-" 97 "ApiServer-https@file" "http://127.0.0.1:9311" 191ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/lang-it-translation-json~e043826f.8426f7e5.js HTTP/2.0" 200 19246 "-" "-" 76 "ApiServer-https@file" "http://127.0.0.1:9311" 278ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~9c5b28f6.feb6b12e.js HTTP/2.0" 200 24712 "-" "-" 43 "ApiServer-https@file" "http://127.0.0.1:9311" 299ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/css/app~0118529b.f76e51dd.css HTTP/2.0" 200 53543 "-" "-" 20 "ApiServer-https@file" "http://127.0.0.1:9311" 305ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~bc21d4b3.68b10b46.js HTTP/2.0" 200 86254 "-" "-" 49 "ApiServer-https@file" "http://127.0.0.1:9311" 302ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/lang-de-translation-json~3c620948.3d7b2bb2.js HTTP/2.0" 200 19829 "-" "-" 91 "ApiServer-https@file" "http://127.0.0.1:9311" 267ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/app~f71cff67.6dc394c7.js HTTP/2.0" 200 15669 "-" "-" 33 "ApiServer-https@file" "http://127.0.0.1:9311" 303ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~0f485567.ed56f101.js HTTP/2.0" 200 18672 "-" "-" 35 "ApiServer-https@file" "http://127.0.0.1:9311" 304ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~41d44f25.1a205b8d.js HTTP/2.0" 200 24335 "-" "-" 37 "ApiServer-https@file" "http://127.0.0.1:9311" 307ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/app~33d48c78.b5aa10d7.js HTTP/2.0" 200 21960 "-" "-" 28 "ApiServer-https@file" "http://127.0.0.1:9311" 308ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/chunk-vendors~86f6b1bc.c31f4ac8.js HTTP/2.0" 200 61359 "-" "-" 46 "ApiServer-https@file" "http://127.0.0.1:9311" 310ms

2024-06-01T11:56:02+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:02 +0000] "GET /cluster-admin/js/lang-en-translation-json~9b60384d.dcf4d241.js HTTP/2.0" 200 17277 "-" "-" 103 "ApiServer-https@file" "http://127.0.0.1:9311" 2ms

2024-06-01T11:56:03+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:03 +0000] "GET /cluster-admin/favicon.ico HTTP/2.0" 200 879 "-" "-" 106 "ApiServer-https@file" "http://127.0.0.1:9311" 54ms

2024-06-01T11:56:04+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:03 +0000] "POST /cluster-admin/api/cluster/tasks HTTP/2.0" 201 271 "-" "-" 105 "ApiServer-https@file" "http://127.0.0.1:9311" 1131ms

2024-06-01T11:56:04+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:03 +0000] "GET /cluster-admin/api/cluster/task/85e954ff-3a26-44a6-9fae-b68444f90a7d/context HTTP/2.0" 200 282 "-" "-" 107 "ApiServer-https@file" "http://127.0.0.1:9311" 1230ms

2024-06-01T11:56:06+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:06 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 109 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:56:08+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:04 +0000] "GET /cluster-admin/api/cluster/task/85e954ff-3a26-44a6-9fae-b68444f90a7d/context HTTP/2.0" 200 282 "-" "-" 108 "ApiServer-https@file" "http://127.0.0.1:9311" 3453ms

2024-06-01T11:56:09+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:08 +0000] "GET /cluster-admin/api/cluster/task/85e954ff-3a26-44a6-9fae-b68444f90a7d/context HTTP/2.0" 200 282 "-" "-" 110 "ApiServer-https@file" "http://127.0.0.1:9311" 1494ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:11 +0000] "GET /cluster-admin/api/cluster/task/85e954ff-3a26-44a6-9fae-b68444f90a7d/status HTTP/2.0" 200 495 "-" "-" 112 "ApiServer-https@file" "http://127.0.0.1:9311" 661ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:14 +0000] "GET /cluster-admin/api/cluster/task/2366a50a-65fc-4f61-8c47-5ef4e9526b33/status HTTP/2.0" 200 1266 "-" "-" 127 "ApiServer-https@file" "http://127.0.0.1:9311" 6ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:12 +0000] "POST /cluster-admin/api/cluster/tasks HTTP/2.0" 201 246 "-" "-" 116 "ApiServer-https@file" "http://127.0.0.1:9311" 2859ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:15 +0000] "GET /cluster-admin/api/module/loki1/task/69029d17-3037-40bb-a779-8db7d16c8510/context HTTP/2.0" 200 192 "-" "-" 133 "ApiServer-https@file" "http://127.0.0.1:9311" 107ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:15 +0000] "GET /cluster-admin/api/module/loki1/task/69029d17-3037-40bb-a779-8db7d16c8510/context HTTP/2.0" 200 192 "-" "-" 134 "ApiServer-https@file" "http://127.0.0.1:9311" 11ms

2024-06-01T11:56:24+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:19 +0000] "GET /cluster-admin/api/cluster/task/515ed12e-6c3b-4b76-9867-a1502050ea2b/context HTTP/2.0" 200 247 "-" "-" 148 "ApiServer-https@file" "http://127.0.0.1:9311" 2009ms

2024-06-01T11:56:27+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:27 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 151 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:56:32+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:32 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 152 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:56:37+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:37 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 153 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:56:51+01:00 [1:traefik1:traefik] xxx.xxx.xxx.xxx - - [01/Jun/2024:10:56:51 +0000] "GET /emby/Users/6ae60cf693c548a390957aa72976840c/Items?Recursive=True&limit=20&IsPlayed=False&SortBy=Random&IncludeItemTypes=Movie&ImageTypeLimit=0&format=json HTTP/1.1" 200 726 "-" "-" 156 "emby-https@file" "http://192.168.xxx.xxx:8096" 15ms

2024-06-01T11:56:58+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:58 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 158 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:57:03+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:03 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 159 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:57:08+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:08 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 160 "ApiServer-https@file" "http://127.0.0.1:9311" 14ms

2024-06-01T11:57:13+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:13 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 161 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:57:18+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:18 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 162 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:57:23+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:23 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 163 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms

2024-06-01T11:57:28+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:57:28 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 164 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms
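Since the excerpt above is dominated by cluster-admin polling, filtering on the router name makes the few requests for other services easier to spot. This is a rough sketch, assuming the lines keep the layout shown above (request counter, then router name, then backend URL, then duration) and use straight double quotes; `ACCESS_RE`, `parse` and `without_router` are hypothetical names. The two sample lines are copied from the log above.

```python
import re

# Matches the access-log part of a line; the journal prefix before it is skipped
# because the pattern requires a quoted request right after the bracketed timestamp.
ACCESS_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"[^"]*" "[^"]*" (?P<count>\d+) '
    r'"(?P<router>[^"]*)" "(?P<backend>[^"]*)" (?P<ms>\d+)ms'
)


def parse(line):
    """Return the named fields of one access-log line, or None if it does not match."""
    match = ACCESS_RE.search(line)
    return match.groupdict() if match else None


def without_router(lines, ignored=("ApiServer-https@file",)):
    """Yield parsed entries whose router is not in `ignored`."""
    for line in lines:
        entry = parse(line)
        if entry and entry["router"] not in ignored:
            yield entry


SAMPLE = [
    '2024-06-01T11:56:51+01:00 [1:traefik1:traefik] xxx.xxx.xxx.xxx - - [01/Jun/2024:10:56:51 +0000] "GET /emby/Users/6ae60cf693c548a390957aa72976840c/Items?Recursive=True&limit=20&IsPlayed=False&SortBy=Random&IncludeItemTypes=Movie&ImageTypeLimit=0&format=json HTTP/1.1" 200 726 "-" "-" 156 "emby-https@file" "http://192.168.xxx.xxx:8096" 15ms',
    '2024-06-01T11:56:58+01:00 [1:traefik1:traefik] 192.168.xxx.xxx - - [01/Jun/2024:10:56:58 +0000] "GET /cluster-admin/api/module/nextcloud3/task/b0a4619c-783b-423b-b777-c8cf01e5f539/context HTTP/2.0" 400 99 "-" "-" 158 "ApiServer-https@file" "http://127.0.0.1:9311" 0ms',
]

for entry in without_router(SAMPLE):
    print(entry["time"], entry["status"], entry["router"], entry["ms"] + "ms")
```

Applied to the sample, only the `emby-https@file` request remains. Note that in this excerpt it was answered with status 200 in 15ms, so at that moment Traefik was still routing to Emby.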

pike · June 1, 2024, 12:55pm

I re-sorted the services reverse-proxied by Traefik.
Italic for NS8-hosted.
Then sorted by closeness:
-same host, different guest
-same host, container on a different guest
-another host

Please consider temporarily deleting one service at a time (excluding NS8 modules), starting from Immich, to diagnose whether any of them is causing the issue.
When you find the “breaking” host, leave it out and try with the others.

I bet on Immich and calibre-web as culprits.

Ted · June 1, 2024, 1:21pm

@pike you are amazing! I still don’t understand the cause. However, I removed all of the non-NS8 traefik entries as well as all NS8 modules that were not currently in use. The NS8 modules were immediately accessible again.

Then I slowly added back the services in Traefik. So far, everything is working. I will monitor for a few hours and report back.

Here’s my question – why would bad behavior at the far end of a reverse proxy prevent access to other proxied resources?

pike · June 1, 2024, 1:25pm

Personal opinion
I bet the Traefik configuration is well tested, polished and verified for every NS8 module, both for behaviour and application compatibility. The dev team is skilled and careful to avoid issues for adopters and customers.

Then… there’s the whole world of web applications out there that may (or may not) work with Traefik, and that may need some tuning and testing as a “foreign host” (from Traefik’s perspective).
Last but not least: foreign hosts, even on the LAN, might need specific settings, for example for timing/timeouts. And if a service doesn’t respond in a timely manner… maybe Traefik doesn’t like that much?
IDK.

Keep updating this thread, while your tests proceed.

Ted · June 1, 2024, 2:22pm

Hmm… Traefik dropped again. So, once more, I removed all of the non-NS8 entries. Unfortunately, this time I’m still unable to access the NS8 resources via Traefik.

Ted · June 1, 2024, 3:34pm

I thought perhaps my brief success was related to removing some unnecessary NS8 modules, so I reinstalled and removed Webserver – taking the time to create a fake virtual host entry and be sure it all populated into Traefik. No luck. Nothing I’ve done has restored access.

Ted · June 2, 2024, 3:48pm

Ok. More poking. More information. I’ve learned a bit more about Traefik in NS8. It’s actually what’s providing the cluster-admin interface, which means that Traefik is working for some things.

If someone understands the process and could help me trace the distinctions between the api and the app modules that would be helpful. I went ahead and started digging into the configs. They’re located (by default) in /home/traefik1/.config/state/configs

Cluster Admin appears to be setup in the _api_server.yml file:

http:
  middlewares:
    ApiServer-stripprefix:
      stripPrefix:
        forceSlash: 'false'
        prefixes:
        - /cluster-admin
    ApiServerMw2:
      redirectRegex:
        regex: ^.*/cluster-admin$
        replacement: /cluster-admin/
  routers:
    ApiServer-http:
      entrypoints:
      - http
      middlewares:
      - http2https-redirectscheme
      rule: Path(`/cluster-admin`) || PathPrefix(`/cluster-admin/`)
      service: ApiServer
      priority: '100000'
    ApiServer-https:
      entrypoints:
      - https
      middlewares:
      - ApiServerMw2
      - ApiServer-stripprefix
      priority: '100000'
      rule: Path(`/cluster-admin`) || PathPrefix(`/cluster-admin/`)
      service: ApiServer
      tls: {}
  services:
    ApiServer:
      loadBalancer:
        servers:
        - url: http://127.0.0.1:9311

Looks simple enough. However, there is another api file: _api.yml


http:
  middlewares:
    ApisEndpointMw0:
      ipWhiteList:
        sourceRange:
        - 127.0.0.1
    ApisEndpointMw1:
      stripPrefix:
        prefixes:
        - /ea7025c5-2247-435c-8db5-b962091286ca
  routers:
    ApisEndpointHttp:
      entrypoints:
      - http
      middlewares:
      - ApisEndpointMw1
      - ApisEndpointMw0
      priority: '100000'
      rule: PathPrefix(`/ea7025c5-2247-435c-8db5-b962091286ca/api`)
      service: api@internal

I’ve no idea what this one does. Note, that odd space on the first line is how the file is on my server. Is that correct?
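On the leading blank line: blank lines before the first key are valid YAML, so a single empty first line should not by itself change how the file is parsed. To check exactly what precedes the first real line (blank lines, carriage returns, or a byte-order mark), something like the following sketch can be run against the raw bytes. The function name `describe_start` is hypothetical, and the sketch only inspects the start of the file; it does not validate the YAML.

```python
def describe_start(data: bytes) -> dict:
    """Report what precedes the first non-blank line of a config file."""
    has_bom = data.startswith(b"\xef\xbb\xbf")   # UTF-8 byte-order mark
    body = data[3:] if has_bom else data
    blank_lines = 0
    has_cr = False
    for line in body.split(b"\n"):
        if line.strip():          # first line with real content: stop counting
            break
        blank_lines += 1
        has_cr = has_cr or b"\r" in line
    return {"bom": has_bom, "blank_lines": blank_lines, "carriage_return": has_cr}


# Usage, with the path mentioned in this post:
#   with open("/home/traefik1/.config/state/configs/_api.yml", "rb") as fh:
#       print(describe_start(fh.read()))
```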

I’m going to skip the certificate files for now and just compare an NS8 module to one of my other entries (again, I can remove or add these with no effect).

Here's a manually created entry for Emby: Emby.yml

http:
  services:
    Emby:
      loadBalancer:
        servers:
        - url: http://192.168.xxx.xxx:8096
  routers:
    Emby-http:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: Emby
    Emby-https:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: Emby
      tls:
        domains:
        - main: host.domain.com
        certresolver: acmeServer

Here's an NS8 Wordpress Module: wordpress6.yml

http:
  services:
    wordpress6:
      loadBalancer:
        servers:
        - url: http://127.0.0.1:20040
  routers:
    wordpress6-http:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: wordpress6
      middlewares:
      - http2https-redirectscheme
    wordpress6-https:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: wordpress6
      tls:
        domains:
        - main: host.domain.com
        certresolver: acmeServer

Other than the “redirect scheme” line they look identical. Of course, they’re both not working, so there’s that.
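Comparing files by eye is error-prone, so the same comparison can be done mechanically. This is a small sketch using Python's standard `difflib`; the helper `config_diff` is a hypothetical name, and the two strings are the first router of each file above, shortened for brevity. The module name is replaced by a placeholder first so that only real differences remain.

```python
import difflib

EMBY = """\
http:
  services:
    Emby:
      loadBalancer:
        servers:
        - url: http://192.168.xxx.xxx:8096
  routers:
    Emby-http:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: Emby
"""

WORDPRESS = """\
http:
  services:
    wordpress6:
      loadBalancer:
        servers:
        - url: http://127.0.0.1:20040
  routers:
    wordpress6-http:
      rule: Host(`host.domain.com`)
      priority: '2'
      entryPoints: http,https
      service: wordpress6
      middlewares:
      - http2https-redirectscheme
"""


def config_diff(a: str, b: str, a_name: str, b_name: str) -> list:
    """Return the changed lines between two configs, ignoring the module name."""
    a_lines = a.replace(a_name, "NAME").splitlines()
    b_lines = b.replace(b_name, "NAME").splitlines()
    diff = difflib.unified_diff(a_lines, b_lines, "a", "b", lineterm="", n=0)
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in config_diff(EMBY, WORDPRESS, "Emby", "wordpress6"):
    print(line)
```

On these excerpts the only differences reported are the backend URL and the two `middlewares` lines, which matches the comparison made by eye.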

Ted · June 5, 2024, 8:38am

Continuing in my efforts to reverse engineer how Traefik works in NS8…

I created a new node, added it to my cluster and moved one of my wordpress sites over. Of course it works flawlessly on the new node.

Comparing Traefik configs on the new node, _api_server.yml is line for line identical. _api.yml contains the same odd starting carriage return on the new node and is the same except that the prefix codes are different.

So far, this remains completely unhelpful as to why one works and the other does not. I did run the core update yesterday, which included a bumped version of Traefik. This made zero difference in my problem.

Someone that knows this software, please help! I’d love to say I’ve screwed up somewhere. I think I’ve done that on this forum many times. But, a running system which I didn’t touch any part of suddenly stopped working. My mail server and several internet facing websites have been broken for more than a week, and I’ve no idea how to solve it.

Surely, someone knows what part of Traefik can change on its own? That would be a starting point at least.

Ted · June 5, 2024, 11:11am

Next idea… I set up a simple Nginx reverse proxy on a separate machine to handle incoming http and https requests, thus allowing me to bypass Traefik for non-NS8 modules.

This works great for all of the non-NS8 services. However, Traefik won’t play nice with the reverse proxy… at least not as far as I can tell. Sites on NS8 simply dump to root on the nginx machine.

Is there any way to address NS8 modules outside of their 127.0.0.1 addressing?

Ted · June 5, 2024, 11:59am

Ok. Disregard this. Inadequate testing. Not sure if this is true or not. More variables to eliminate.

Ted · June 6, 2024, 8:36am

Alright, after further testing, I have all of my non-NS8 sites working using an external nginx proxy.

However, the nginx proxy gets a 404 error when it tries to contact sites still on NS8, so those sites remain trapped on my NS8 box with the broken Traefik. No longer trusting Traefik, I went ahead and migrated the sites I could away from NS8.

That said, my email server and one of my websites are still hostage, inaccessible on my NS8 box. I was unsuccessful migrating them to a new cluster node. I’ll try that again later today.
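One possible explanation for the 404 that is worth ruling out (an assumption, not something confirmed in this thread): the routers shown earlier match on `Host(...)`, and Traefik answers 404 when no router matches the request. If the upstream proxy forwards a different Host header than the public name (nginx by default sends the name from `proxy_pass` unless `proxy_set_header Host $host;` is set), no router will match. The sketch below sends a plain-HTTP request to an IP while presenting a chosen Host header; `status_for_host` is a hypothetical name, and HTTPS (where SNI also matters) is deliberately left out.

```python
import http.client


def status_for_host(ip: str, port: int, hostname: str, timeout: float = 5.0) -> int:
    """GET / from ip:port over plain HTTP, presenting `hostname` as the Host header."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Passing Host explicitly overrides the header http.client would derive from `ip`.
        conn.request("GET", "/", headers={"Host": hostname})
        return conn.getresponse().status
    finally:
        conn.close()


# Usage (hypothetical values, replace with the NS8 node's LAN IP and the site's public name):
#   print(status_for_host("192.0.2.10", 80, "host.domain.com"))
#   print(status_for_host("192.0.2.10", 80, "192.0.2.10"))
```

If the first call returns a redirect or 200 and the second returns 404, the routing itself works and the Host header sent by the proxy is the thing to look at.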

Ted · June 8, 2024, 7:09am

Moving them to a new cluster node does solve the problem.

But this isn’t an acceptable solution. If, at any time, without being touched, Traefik can break and the only solution is to turn up a new server and migrate all services…

The complete radio silence from everyone except @pike is also quite shocking.

BigSteve · August 17, 2024, 4:26pm

I’m worried I have the same problem. I can’t even hit my cluster admin page right now, and all my routing is broken.

Andy_Wismer · August 17, 2024, 8:58pm

Hi @Ted and Hi @BigSteve

I simply ignored this post, as it’s on the level of:

“The Internet is not working”, not (correctly) “MY Internet is not working”…

The first implies Google, Facebook and everything else is down, the second implies your router is screwed up…

→

Not one word about:

Environment
Virtualized or Native Hardware Install
What Router are you using? NethSecurity? OPNsense? pfSense? Provider Router?
CPU, RAM, Disk size (Used also!)
Base System (Debian? Rocky? Alma?)
What DNS is being used? (None?)
Was anything changed or modified before it stopped working?
Is the NS8 system AND the base system up to date?

and these would be only the basics…

→ If you really expect anyone to help, please provide info!
None of us here have mind-reading capabilities.

Yes, this is criticism, but constructive! Please provide info, instead of forcing us to ask every possible question…

My 2 cents
Andy

Andy_Wismer · August 17, 2024, 9:02pm

Hi @BigSteve

If ALL your routing is broken, HOW did you post here?

Please provide info as I suggested in the earlier post, and not such stupid, unconsidered statements.

If your Smartphone or Notebook can reach this forum ALL your routing can not be broken, that statement is very much like

The Internet is broken…

My 2 cents
Andy

BigSteve · August 18, 2024, 7:17pm

Apparently you didn’t read the scope of the problem above. As Ted so clearly pointed out, the issue is the traefik reverse proxy routing doesn’t seem to be working for him, and I’m worried I have the same problem.

So, very sorry for the confusion, I would assume that some context could have been gleaned from the statements above about the reverse proxy routing being broken. I also don’t have access to the cluster-admin page.

Andy_Wismer · August 18, 2024, 7:28pm

And you still do not provide ANY information about your environment!

NS8 / Traefik depends heavily on the firewall / router / dns you are using.
No information on what’s configured - or not.

Without any info, I and most others supporting here will not bother helping…

My 2 cents
Andy