Graceful shutdown of ChirpStack

Hi all,

I’m wondering what the proper way is to do a graceful shutdown of ChirpStack. We are running ChirpStack in Kubernetes, which sends a SIGTERM to the containers in a pod to request a graceful shutdown (e.g. to migrate the pod to another node or to do rolling updates). As far as I can tell, ChirpStack ignores the SIGTERM, and after the termination grace period Kubernetes sends a SIGKILL, which shuts down ChirpStack uncleanly. Is there a way to tell ChirpStack to do a proper shutdown?

How long are you waiting? I’ve noticed when shutting down ChirpStack v4 containers it takes exactly 10.5 (or maybe it was 12.5?) seconds each and every time, even when nothing is actively happening and all dependencies are still available.

@brocaar I’ve been meaning to open a feature request about this. Is there a hard reason a graceful shutdown takes so long?

The default grace period in Kubernetes is 30 seconds, and in my observations it took 30 seconds for the container to exit.
I have been searching the code, but I couldn’t find any signal handling. I have zero experience with Rust though, so it could very well be user error.

It might be that the SIGTERM signal is not handled correctly. Could you create a GitHub issue if you think this is a bug? If it can be reproduced outside Kubernetes, that would be great :)

I see the behavior with Compose (at least my 10+ second version), so it’s definitely not a Kubernetes thing. I will look into a reproducible test.

I can reproduce it with the code from the chirpstack/chirpstack-docker repository on GitHub (Setup ChirpStack using Docker Compose).

Nothing happens when I execute:

docker kill --signal TERM chirpstack-docker-chirpstack-1

The container exits with exit code 137 when I execute:

docker kill --signal KILL chirpstack-docker-chirpstack-1

@bconway the 10 seconds delay you are seeing is consistent with:

The docker compose stop command attempts to stop a container by sending a SIGTERM. It then waits for a default timeout of 10 seconds. After the timeout, a SIGKILL is sent to the container to forcefully kill it. If you are waiting for this timeout, it means that your containers aren’t shutting down when they receive the SIGTERM signal.

From: Compose FAQs | Docker Docs.
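The stop sequence quoted above (SIGTERM, wait for a grace period, then SIGKILL) can be sketched in a few lines. This is only an illustration in Python, not how Docker or Kubernetes implement it; `stop_with_grace`, `demo` and `CHILD_CODE` are hypothetical names, and the child is a stand-in for a process that never reacts to SIGTERM. It assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a container process that never reacts to SIGTERM.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float) -> int:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX, cannot be ignored
        return proc.wait()


def demo(grace_seconds: float = 0.5) -> int:
    """Start the stubborn child, stop it, and return its exit status."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # wait until the child ignores SIGTERM
    status = stop_with_grace(proc, grace_seconds)
    proc.stdout.close()
    return status


if __name__ == "__main__":
    print(demo())
```

On POSIX, Python reports a child killed by SIGKILL as `-9`; Docker reports the same situation as exit code 137 (128 + 9). A stop that always takes the full timeout suggests the process only ended because of the SIGKILL.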

When I run chirpstack as a regular process, it does exit upon SIGTERM. It seems the signal is not properly propagated to the process when it is run in Docker. I’ll investigate the Dockerfile to see if I can spot the problem.

The chirpstack binary is run as the entrypoint of the Docker container, meaning it should receive the SIGTERM from Docker.

However, since chirpstack is the entrypoint of the container, it runs as PID 1, which affects how signals are handled:

The Linux kernel treats PID 1 as a special case, and applies different rules for how it handles signals. This special handling often breaks the assumptions that programs or engineers make.

First, some background. Any process can register its own handlers for TERM and use them to perform cleanup before exiting. If a process hasn’t registered a custom signal handler, the kernel will normally fall back to the default behavior for a TERM signal: killing the process.

For PID 1, though, the kernel won’t fall back to any default behavior when forwarding TERM. If your process hasn’t registered its own handlers (which most processes don’t), TERM will have no effect on the process.

Source: Introducing dumb-init, an init system for Docker containers
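The rule quoted above can be written down as a small check. This is a rough sketch in Python for illustration only (ChirpStack itself is written in Rust); `sigterm_dropped_as_init` and `describe_current_process` are hypothetical helper names, and the check only covers the case described in the quote: PID 1 with the default SIGTERM disposition.

```python
import os
import signal


def sigterm_dropped_as_init(pid: int, handler) -> bool:
    """True when, per the rule quoted above, SIGTERM would have no effect.

    That is the case when the process is PID 1 and still has the default
    disposition for SIGTERM, i.e. it never registered its own handler.
    """
    return pid == 1 and handler == signal.SIG_DFL


def describe_current_process() -> str:
    """Report whether the calling process falls under the PID 1 rule."""
    pid = os.getpid()
    handler = signal.getsignal(signal.SIGTERM)
    if sigterm_dropped_as_init(pid, handler):
        return f"PID {pid}: no SIGTERM handler, TERM will have no effect"
    return f"PID {pid}: SIGTERM terminates the process or runs its handler"


if __name__ == "__main__":
    print(describe_current_process())
```

Run as the entrypoint of a container without an init process, a script like this would be PID 1; run from a normal shell it would not.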

I managed to make it work by adding init: true for the chirpstack container in the compose file, or by manually running the chirpstack container with docker run --init. Both are consistent with the info above.

I see two solutions:

  1. chirpstack registers a signal handler for SIGTERM/SIGINT (SIGINT for handling Ctrl-C) and performs graceful shutdown upon receiving such a signal.
  2. Make sure chirpstack isn’t PID 1 anymore by adding an init system (e.g. tini or dumb-init) to the container.

I think option 1 is preferable, since that allows for running proper shutdown logic instead of the kernel just killing the process.

Good find!

Option 3 would be to set an entrypoint shim (or no entrypoint at all) and run ChirpStack by cmd. This is how almost all the images I use (including my own) work, and sidesteps the issue without significant code or added init packages. Mosquitto and Redis dependencies in chirpstack-docker do this as well:

@bconway I’m not sure that would help. Those shims do exec "$@" at the end, which replaces the shell process with the command that is passed as an argument. That would again make chirpstack PID 1. I guess that works for mosquitto and redis because they handle SIGTERM properly … ?

I misunderstood part of your post, thanks. It seems like SIGTERM handling would be a good idea…

Thank you, that’s good advice.