Health check stuck

ttulka · April 12, 2023, 12:48pm

Hi, we are running a ChirpStack v4.3.1 instance in a Kubernetes cluster (AKS in Azure).
The application starts and connects to MQTT successfully, but calls to the /health endpoint hang for no obvious reason:

The /metrics endpoint works fine on the same port, so there is no networking issue.
There are no errors or relevant warnings in the log.
The Postgres schemas are created, so there is probably no Postgres issue (we are using Azure Database for PostgreSQL).
Redis is empty, so Redis is probably the issue, but connecting to the Redis instance (we are using Azure Cache for Redis) from a pod inside the cluster works fine (TLS enabled, no cluster).

How can we debug this further or get more information to help us fix it? (logLevel trace is already enabled.)
Thanks!

Redis config:

    [redis]
      servers=[
        "redis://$REDIS_PASSWORD@$REDIS_HOST:$REDIS_PORT",
      ]
      tls_enabled=true
      cluster=false
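
One way to test the Redis side in isolation is to probe it from a pod with an explicit timeout, once with TLS and once without. The following is a minimal Python sketch using only the standard library; it is not how ChirpStack connects. The function names `encode_command` and `probe_redis` are hypothetical, and the `REDIS_*` environment variables are assumed to match the placeholders in the config above. A plaintext client talking to a TLS-only port may never receive a reply, so a socket timeout turns a silent hang into a visible error.

```python
import os
import socket
import ssl


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def probe_redis(host, port, password=None, use_tls=True, timeout=5.0):
    """Send PING and return the raw reply; raises on timeout or TLS failure."""
    raw = socket.create_connection((host, port), timeout=timeout)
    sock = raw
    try:
        if use_tls:
            # The wrapped socket keeps the timeout set on the raw socket.
            ctx = ssl.create_default_context()
            sock = ctx.wrap_socket(raw, server_hostname=host)
        if password:
            sock.sendall(encode_command("AUTH", password))
            reply = sock.recv(1024)
            if not reply.startswith(b"+OK"):
                raise RuntimeError(f"AUTH failed: {reply!r}")
        sock.sendall(encode_command("PING"))
        return sock.recv(1024).decode(errors="replace").strip()
    finally:
        sock.close()


if __name__ == "__main__":
    # Assumed environment variables, mirroring the config placeholders.
    host = os.environ["REDIS_HOST"]
    port = int(os.environ["REDIS_PORT"])
    password = os.environ.get("REDIS_PASSWORD")
    for use_tls in (True, False):
        try:
            print("tls" if use_tls else "plain", probe_redis(host, port, password, use_tls))
        except (OSError, RuntimeError) as exc:
            print("tls" if use_tls else "plain", "failed:", repr(exc))
```

If the TLS probe answers and the plaintext probe times out, the server only accepts TLS, and any client that silently falls back to plaintext would be expected to hang in the same way.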

Full log:

2023-04-13T11:26:10.840360Z  INFO chirpstack::cmd::root: Starting ChirpStack LoRaWAN Network Server version="4.3.1" docs="https://www.chirpstack.io/"
2023-04-13T11:26:10.840424Z  INFO chirpstack::storage: Setting up PostgreSQL connection pool
2023-04-13T11:26:10.923461Z  INFO chirpstack::storage: Applying schema migrations
2023-04-13T11:26:10.930026Z  INFO chirpstack::storage: Setting up Redis client
2023-04-13T11:26:11.394672Z  INFO chirpstack::region: Setting up regions
2023-04-13T11:26:11.394755Z  INFO setup{common_name=EU868 region_id=eu868}: chirpstack::region: Configuring region
2023-04-13T11:26:11.394872Z  INFO chirpstack::backend::joinserver: Setting up Join Server clients
2023-04-13T11:26:11.394878Z  INFO chirpstack::backend::roaming: Setting up roaming clients
2023-04-13T11:26:11.394883Z  INFO chirpstack::adr: Setting up adr algorithms
2023-04-13T11:26:11.394914Z  INFO chirpstack::integration: Setting up global integrations
2023-04-13T11:26:11.394920Z  INFO chirpstack::integration::redis: Initializing Redis integration
2023-04-13T11:26:11.394926Z  INFO chirpstack::gateway::backend: Setting up gateway backends for the different regions
2023-04-13T11:26:11.394931Z  INFO chirpstack::gateway::backend: Setting up gateway backend for region region_id=eu868 region_common_name=EU868
2023-04-13T11:26:11.395509Z  INFO chirpstack::gateway::backend::mqtt: Connecting to MQTT broker region_config_id=eu868 server_uri=tcp://emqx-listeners.infra:1883 clean_session=false client_id=6f1b23245d1f1b4b
2023-04-13T11:26:11.528331Z  INFO chirpstack::downlink: Setting up Class-B/C scheduler loop
2023-04-13T11:26:11.528510Z  INFO chirpstack::downlink: Setting up multicast scheduler loop
2023-04-13T11:26:11.527924Z  INFO chirpstack::gateway::backend::mqtt: Starting MQTT consumer loop
2023-04-13T11:26:11.528700Z  INFO chirpstack::api: Setting up API interface bind=0.0.0.0:8080
2023-04-13T11:26:11.529023Z  WARN chirpstack::api::backend: Backend interfaces API is disabled
2023-04-13T11:26:11.529106Z  INFO chirpstack::api::monitoring: Setting up monitoring endpoint bind=0.0.0.0:8090
2023-04-13T11:26:11.529242Z  INFO chirpstack::gateway::backend::mqtt: Connected to MQTT broker region_config_id=eu868
2023-04-13T11:26:11.529418Z  INFO chirpstack::gateway::backend::mqtt: Subscribing to gateway event topic region_config_id=eu868 event_topic=chirpstack/eu868/gateway/+/event/+

brocaar · April 24, 2023, 9:05am

I have tried to reproduce this issue locally, but it returns instantly with an OK response.

ttulka · April 24, 2023, 9:26am

It works locally with no problem. I guess the problem is only in Azure, especially with the managed Redis. Is it possible that the health check creates the connection differently than the application?

gbr · April 25, 2023, 1:41pm

@brocaar we found the issue: it is not working because of the missing Redis TLS support (see "missing redis tls support", Issue #170 in chirpstack/chirpstack on GitHub). In addition, it seems that a timeout parameter should be used; otherwise the health check call runs forever.
Also, the log line regarding the Redis connection pool is misleading: it made us think at first that everything was running fine.
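
The timeout point can be illustrated with a small sketch. This is Python for illustration only and is not ChirpStack's implementation (ChirpStack is written in Rust); `check_with_timeout`, `hanging_check`, and `healthy_check` are hypothetical names. The idea is that a health handler bounds each dependency check, so that an unreachable backend produces a failed check instead of a request that never returns.

```python
import asyncio


async def check_with_timeout(check, timeout_s: float) -> bool:
    """Run an async dependency check; report False if it fails or exceeds timeout_s."""
    try:
        await asyncio.wait_for(check(), timeout=timeout_s)
        return True
    except (asyncio.TimeoutError, OSError):
        # A timeout or connection error both count as "unhealthy".
        return False


async def hanging_check():
    """Stand-in for a backend call that never answers."""
    await asyncio.sleep(3600)


async def healthy_check():
    """Stand-in for a backend call that answers immediately."""
    return None


if __name__ == "__main__":
    print(asyncio.run(check_with_timeout(healthy_check, 1.0)))
    print(asyncio.run(check_with_timeout(hanging_check, 1.0)))
```

With a bound like this, a Kubernetes liveness or readiness probe would see a prompt failure response instead of waiting on an open connection.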
