The device queue (downlink messages) is filling up with messages. We discovered this when the server stopped responding after several days. The downlink messages are sent using the REST API /api/devices/<dev_eui>/queue. The table device_queue in the chirpstack_ns database contains a large number of messages. Deleting these messages and restarting the application server and the network server brings the system back up again.
This happens on queues for offline devices. As we are in a test situation, many devices are turned off, but our software is still sending downlink messages to them. I am not sure whether this is a configuration issue, a bug in our software, or a bug in the ChirpStack Network Server logic.
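For reference, this is roughly how our software enqueues a downlink. It is only a minimal sketch: host, API token, DevEUI and payload are placeholders, and the JSON field names (deviceQueueItem, confirmed, fPort, data) and the Grpc-Metadata-Authorization header reflect our understanding of the v3 REST gateway.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"net/http"
)

// enqueueDownlink posts one unconfirmed downlink to the device queue via
// POST /api/devices/<dev_eui>/queue on the application server.
func enqueueDownlink(server, apiToken, devEUI string, fPort int, payload []byte) error {
	body := fmt.Sprintf(
		`{"deviceQueueItem": {"confirmed": false, "fPort": %d, "data": "%s"}}`,
		fPort, base64.StdEncoding.EncodeToString(payload),
	)

	req, err := http.NewRequest(http.MethodPost,
		fmt.Sprintf("%s/api/devices/%s/queue", server, devEUI),
		bytes.NewBufferString(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Grpc-Metadata-Authorization", "Bearer "+apiToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("enqueue for %s failed: %s", devEUI, resp.Status)
	}
	return nil
}

func main() {
	// Placeholder values; in our software this runs periodically per device.
	err := enqueueDownlink("http://localhost:8080", "<api-token>", "0102030405060708", 10, []byte{0x01, 0x02})
	if err != nil {
		fmt.Println("enqueue error:", err)
	}
}
```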
Versions
chirpstack-application-server/stable,now 3.17.6 amd64 [installed]
chirpstack-gateway-bridge/stable,now 3.13.3 amd64 [installed]
chirpstack-network-server/stable,now 3.16.1 amd64 [installed]
Characteristics:
- Devices are offline
- Class C devices (obviously)
- OTAA
- Messages are NOT confirmed
Questions
- Have we made a configuration error?
- Should our software stop sending downlink messages when devices are detected as offline?
- Is there a configuration setting to limit the maximum number of downlink messages in the queue?
Should our software stop sending downlink messages when devices are detected as offline?
For Class-A devices, it is expected that the queue will fill up in this case, as an uplink is required to trigger a downlink.
Is there a configuration setting to limit the maximum number of downlink messages in the queue?
There isn’t.
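Since there is no server-side limit, one way to work around it is to cap the queue in your own software before enqueueing. A minimal sketch, assuming the v3 REST gateway lists pending items via GET /api/devices/<dev_eui>/queue and returns them under a deviceQueueItems field (verify the endpoint and field name against the API docs for your version); maxQueueDepth is a self-chosen limit, not a ChirpStack setting:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// maxQueueDepth is a self-chosen application-side limit, not a ChirpStack setting.
const maxQueueDepth = 10

// queueDepth returns the number of pending downlinks for a device, read via
// GET /api/devices/<dev_eui>/queue on the application server.
func queueDepth(server, apiToken, devEUI string) (int, error) {
	req, err := http.NewRequest(http.MethodGet,
		fmt.Sprintf("%s/api/devices/%s/queue", server, devEUI), nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Grpc-Metadata-Authorization", "Bearer "+apiToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("listing queue for %s failed: %s", devEUI, resp.Status)
	}

	// Assumed response shape: {"deviceQueueItems": [...], ...}.
	var body struct {
		DeviceQueueItems []json.RawMessage `json:"deviceQueueItems"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return 0, err
	}
	return len(body.DeviceQueueItems), nil
}

func main() {
	depth, err := queueDepth("http://localhost:8080", "<api-token>", "0102030405060708")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if depth >= maxQueueDepth {
		fmt.Println("queue limit reached, skipping this downlink")
		return
	}
	// ... enqueue the downlink here, as in the earlier sketch ...
}
```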
@brocaar Thanks for the response.
This is a Class C device. I guess the queue needs to maintain the fCnt, but I don't understand why ChirpStack / the gateway isn't ACKing the downlink regardless of the actual device status… These are Class C, unconfirmed messages.
It might be the scheduler batch size if there are too many Class-C downlinks in the queue. The downlink fails because the device > gateway association has been invalidated in the database. If too many devices are failing because of this, then at some point a scheduler batch contains only failures, blocking the other items from being sent. Two things you can do (this will be improved in the next version):
- Increase the batch size (see the chirpstack-network-server.toml config)
- Stop enqueueing downlinks for inactive devices / flush these queues (a sketch of such a flush follows below)
Again, this will be improved in a future version, but the above will hopefully solve your issue until then.
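For the second option, here is a minimal sketch of flushing the queues of devices that are known to be powered off. It assumes the v3 REST gateway exposes DELETE /api/devices/<dev_eui>/queue for flushing; the host, API token and DevEUI list are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
)

// flushDeviceQueue clears all pending downlinks for one device via
// DELETE /api/devices/<dev_eui>/queue on the application server.
func flushDeviceQueue(server, apiToken, devEUI string) error {
	req, err := http.NewRequest(http.MethodDelete,
		fmt.Sprintf("%s/api/devices/%s/queue", server, devEUI), nil)
	if err != nil {
		return err
	}
	req.Header.Set("Grpc-Metadata-Authorization", "Bearer "+apiToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("flush for %s failed: %s", devEUI, resp.Status)
	}
	return nil
}

func main() {
	// Hypothetical list of DevEUIs known to be powered off.
	inactive := []string{"0102030405060708", "0807060504030201"}
	for _, devEUI := range inactive {
		if err := flushDeviceQueue("http://localhost:8080", "<api-token>", devEUI); err != nil {
			fmt.Println("flush error:", err)
		}
	}
}
```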
Hi @brocaar,
As per your suggestion, we can do the second option, but we cannot increase the batch size in the chirpstack-network-server.toml config because the scheduler batch size is hard-coded in the Go source code:
chirpstack-network-server/downlink.go at 3971570b77c79c1cfd184b6f06a4f1770b5a0db0 · brocaar/chirpstack-network-server (github.com)
```go
var (
	// Hard-coded batch size; not exposed through the TOML configuration.
	schedulerBatchSize = 100
	schedulerInterval  time.Duration
)
```
Thanks