[BUG] my healthcheck started failing after upgrading to v0.8.0 #601

Closed
ovizii opened this issue Mar 14, 2024 · 8 comments
Labels
bug Something isn't working

Comments

ovizii commented Mar 14, 2024

Describe the bug
This healthcheck had been working for a while but started failing after the upgrade to scrutiny v0.8.0:

    healthcheck:
      test: curl -ILfSs http://localhost:8080/api/health  || exit 1
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

If I manually enter the container and execute it, echo $? returns 0, so the command itself clearly works.
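
A manual run inside the container does not show what Docker itself recorded for each probe. As a minimal sketch (assuming the container is named scrutiny, which may differ in your setup; the function name is only illustrative), the stored health state and the output of recent probes can be printed with docker inspect:

```shell
# Print the health status and the log of recent probe runs that Docker
# recorded for the container whose name is passed as the first argument.
health_state() {
  docker inspect --format '{{json .State.Health}}' "$1"
}

# Example (the name "scrutiny" is an assumption):
#   health_state scrutiny
```

The Log entries in the result include each probe's exit code and captured output, which is usually more informative than the unhealthy status alone.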

Expected behavior
The healthcheck should keep passing after the upgrade, as it did before.

Log Files
tmp.log

Please also provide the output of docker info

Client: Docker Engine - Community
 Version:    25.0.4
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 51
  Running: 51
  Paused: 0
  Stopped: 0
 Images: 80
 Server Version: 25.0.4
 Storage Driver: zfs
  Zpool: rpool
  Zpool Health: ONLINE
  Parent Dataset: rpool/docker
  Space Used By Parent: 18224963584
  Space Available: 698153603072
  Parent Quota: no
  Compression: on
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
 Kernel Version: 6.5.11-8-pve
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 62.53GiB
 Name: nas
 ID: a9b276ef-51a5-47fe-b188-38ff9edb9e79
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: ovizii
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.16.0.0/12, Size: 28

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

@ovizii ovizii added the bug Something isn't working label Mar 14, 2024
@stanrc85

I'm having the same problem. Healthcheck fails, but manually running the command in the console works as expected.

ovizii (Author) commented Mar 14, 2024

Good to hear it's not just me. Btw. the restart you see in the logs is my autoheal container restarting scrutiny because the healthcheck failed.

For what it is worth, I'm also wondering about this line in the logs:

scrutiny | time="2024-03-14T09:41:22+01:00" level=warning msg="Could not get the most recent data points from the database. This is expected to happen only if this is the very first submission of data for the device." type=web

This is definitely not the first time for this device. Also, I have 2 different SDA devices in scrutiny because I swapped around a few drives.

I'm wary of deleting one from scrutiny, because scrutiny stopped working when I tried that.

@sherbibv

I'm having the same problem with this healthcheck configuration:

    healthcheck:
      test: curl --connect-timeout 15 --silent --show-error --fail http://localhost:8080/api/health | grep -q 'true'
      interval: 60s
      retries: 5
      timeout: 10s
      start_period: 20s

AnalogJ (Owner) commented Mar 16, 2024

@dropsignal was this something you experienced when developing on Debian 12 (bookworm)?

dropsignal (Contributor) commented Mar 17, 2024

> @dropsignal was this something you experienced when developing on Debian 12 (bookworm)?

Not this specifically, but it looks like it's related to the unified /usr issue. The healthcheck is trying to execute curl using /bin/sh, which doesn't exist anymore. Technically it does on a full installation of Debian 12, because /bin is just a symbolic link to /usr/bin. However, in Debian 12 slim, /bin/sh is gone. If the healthcheck were updated to use /usr/bin/sh, it would work again.

The other option is to put a symbolic link from /bin/sh to /usr/bin/sh in the docker image, but that defeats the purpose of trying to unify everything under /usr/bin if others don't update their path references like they're supposed to.
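
Whether /bin/sh is actually missing in the image can be checked directly. A minimal sketch, assuming a running container named scrutiny (a placeholder; substitute your own) and that ls is available in the image. Because docker exec starts ls directly, the check itself does not depend on a shell being present:

```shell
# List both candidate shell paths inside the container given as $1.
# ls prints an error for any path that does not exist.
show_sh_paths() {
  docker exec "$1" ls -l /bin/sh /usr/bin/sh
}

# Example (container name is an assumption):
#   show_sh_paths scrutiny
```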

AnalogJ (Owner) commented Mar 18, 2024

@ovizii @sherbibv @stanrc85

Can you try changing:

    healthcheck:
      test: curl -ILfSs http://localhost:8080/api/health  || exit 1
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

to

    healthcheck:
      test: /usr/bin/curl -ILfSs http://localhost:8080/api/health  || exit 1
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

or

    healthcheck:
      test:["CMD", "curl", "-ILfSs", "http://localhost:8080/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

and report back?

@sherbibv

@AnalogJ the second solution worked for me. Thank you!

ovizii (Author) commented Mar 18, 2024

The first one didn't work for me either, while the second one does, but you need to insert a space after test: or it will just error:

test:["CMD", => test: ["CMD",

Thanks for helping us out.
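
The difference between the two forms can be reproduced by hand. As a rough sketch, assuming a container named scrutiny: Docker runs a string-form test through the container's shell (/bin/sh -c on Linux), while the ["CMD", ...] form starts curl directly without a shell. The function names below are illustrative only, and docker exec merely approximates how the healthcheck is launched.

```shell
# Approximates the string (shell) form: the command line is handed to /bin/sh -c.
probe_shell_form() {
  docker exec "$1" /bin/sh -c 'curl -ILfSs http://localhost:8080/api/health || exit 1'
}

# Approximates the ["CMD", ...] (exec) form: curl is started without a shell.
probe_exec_form() {
  docker exec "$1" curl -ILfSs http://localhost:8080/api/health
}

# Example (container name is an assumption):
#   probe_shell_form scrutiny; echo "shell form exit code: $?"
#   probe_exec_form scrutiny;  echo "exec form exit code: $?"
```

If probe_shell_form fails while probe_exec_form succeeds, the shell invocation rather than curl or the endpoint is the likely culprit, which would be consistent with the results reported in this thread.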
