Gateway health Prometheus metrics are only counts of connect/disconnect events, with no indication of which gateways have gone offline or come back online

We are currently evaluating migrating to ChirpStack from another platform. We run a private network of 40 gateways that is growing, and we need to be able to monitor the health of the network at each of our deployment sites. All gateways are running Semtech’s UDP forwarder for now.

According to what I’ve found in the docs and forum, the recommended way to continuously monitor the connection status of gateways appears to be the Prometheus endpoint provided by Gateway Bridge. However, these metrics only count events such as connects and disconnects. (The MQTT integration metrics are of no use in this regard.) This makes it very difficult for us to use Prometheus to generate alerts about specific gateway disconnections, or to group the gateways at a deployment site so we can monitor a customer’s deployment as a whole.

Am I correct in this conclusion? Is anyone operating a network and monitoring the connection status and health of gateways? What solutions have you implemented? Will we need to develop software to hit the API (https://github.com/brocaar/chirpstack-api/blob/master/protobuf/gw/gw.proto)?

Hi @cmenscher

Configure the gateway state message feature in the Gateway Bridge .toml configuration file:

  # State topic template.
  #
  # States are sent by the gateway as retained MQTT messages (by default)
  # so that the last message will be stored by the MQTT broker. When set to
  # a blank string, this feature will be disabled. This feature is only
  # supported when using the generic authentication type.
  state_topic_template="gateway/{{ .GatewayID }}/state/{{ .StateType }}"

This publishes a conn state message indicating whether a gateway is online or offline. State messages are published as retained, so an MQTT client receives the latest state of each gateway immediately after subscribing to the conn state topic. The default MQTT topic for state messages is gateway/ID/state/STATE, where ID is the ID of the gateway and STATE is the state type (in this case conn).
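
As a rough sketch of how the retained conn state messages could be turned into per-gateway metrics that Prometheus can alert on, the Python below parses a topic of the default form gateway/ID/state/conn and renders one gauge line per gateway in the Prometheus text format. Several things here are assumptions rather than ChirpStack behaviour I can vouch for: the payload is assumed to be JSON with a "state" field set to "ONLINE" or "OFFLINE" (this will differ with the protobuf marshaler), and the metric name gateway_online, the site label, and the site_of mapping are all made up for illustration.

```python
import json


def parse_conn_state(topic, payload):
    """Return (gateway_id, online) for a conn state message, else None."""
    parts = topic.split("/")
    # Expected layout, from the default template: gateway/<id>/state/conn
    if len(parts) != 4 or parts[0] != "gateway" or parts[2:] != ["state", "conn"]:
        return None
    # Assumption: JSON payload with a "state" field of "ONLINE" or "OFFLINE".
    body = json.loads(payload)
    return parts[1], body.get("state") == "ONLINE"


def record(states, topic, payload):
    """Update the gateway_id -> online dict; ignore topics that are not conn state."""
    parsed = parse_conn_state(topic, payload)
    if parsed is not None:
        gateway_id, online = parsed
        states[gateway_id] = online


def render_metrics(states, site_of):
    """Render one gauge line per gateway in the Prometheus text format.

    site_of is a hypothetical gateway_id -> site name mapping that you
    would maintain yourself; unknown gateways get site="unknown".
    """
    lines = []
    for gateway_id in sorted(states):
        site = site_of.get(gateway_id, "unknown")
        value = 1 if states[gateway_id] else 0
        lines.append(
            f'gateway_online{{gateway_id="{gateway_id}",site="{site}"}} {value}\n'
        )
    return "".join(lines)
```

In practice record() would be called from the message callback of whatever MQTT client you use, after subscribing to gateway/+/state/conn, and render_metrics() would back a small HTTP endpoint that Prometheus scrapes. An alert on gateway_online == 0 then identifies the specific gateway, and the site label lets you aggregate per deployment.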


We wrote this Prometheus exporter: wobcom / Chirpstack Devices Prometheus Exporter (GitLab)
It covers both devices and gateways by calling the API.


Interesting solutions! Thanks for the tips, we’ll look into them.