ChirpStack flooding the gateway with downlink messages

Hi, I’ve seen a similar problem described here, but in the end I think I don’t understand everything correctly:

I have a setup of 10 Class-C end devices and a single Kerlink gateway.
Everything is fine when I send 1 or 2 downlink messages to these nodes.

In certain situations I want to send messages to all nodes (even 3 messages per node), and I would expect ChirpStack to schedule them properly for me.

From different logs I can see that ChirpStack is pushing downlinks to my gateway and flooding it with messages.
Within a single second it sends multiple messages to multiple nodes, and it retries every 5 s for every node, probably due to this setting:
class_c_lock_duration="5s"
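
For reference, this setting sits in the scheduler section of chirpstack.toml. The snippet below is how I understand the layout from the example v4 config (the section name and the interval default are my reading of it, so please verify against your own file):

    [network.scheduler]
      # how often the scheduler loop looks for schedulable Class-C items
      interval="1s"
      # lock between scheduling two Class-C downlinks for the same device
      class_c_lock_duration="5s"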

In the Kerlink MQTT packet forwarder logs I can see lots of PULL_RESP entries like:

2025-02-19T09:53:45: Sending PULL_RESP, random_token: 6538, remote: 127.0.0.1:59059
....
2025-02-19T09:56:22: TX_ACK received, random_token: 6538, remote: 127.0.0.1:59059, error: NONE
2025-02-19T09:56:22: Handle TX_ACK error: No cache item for token, random_token: 6538, remote: 127.0.0.1:59059

In my understanding ChirpStack is pushing these messages to the gateway too often. The gateway, due to its duty-cycle restrictions or whatever other reason, cannot transmit these messages immediately and sends them almost 3 minutes later (09:53 vs 09:56).
In the meantime the token is removed from the cache (probably after 1 minute), so the gateway fails to do anything with it.
Additionally, ChirpStack keeps pushing more and more messages to the gateway as it retries them, causing a snowball effect that gets worse and worse until everything is “stuck”.
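
If my reading is right, the failure is just a time-limited token cache expiring before the TX_ACK arrives. A hypothetical sketch of the mechanism (Python; the class, the 60 s TTL and all names are my assumptions for illustration, not the forwarder’s actual code):

    import time

    class TokenCache:
        """Toy TTL cache, illustrating the “No cache item for token” error."""

        def __init__(self, ttl_s: float = 60.0):  # assumed ~1 minute TTL
            self.ttl_s = ttl_s
            self.items = {}  # token -> creation timestamp

        def put(self, token: int) -> None:
            self.items[token] = time.monotonic()

        def resolve(self, token: int) -> bool:
            """Called when the TX_ACK for this token finally arrives."""
            created = self.items.pop(token, None)
            if created is None or time.monotonic() - created > self.ttl_s:
                return False  # -> “Handle TX_ACK error: No cache item for token”
            return True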

Am I getting something wrong?
Of course I can extend class_c_lock_duration even to 1 minute, do my own scheduling, or change my approach entirely.
I’m just a bit afraid that it’s only a matter of scale until the problem hits me again in production, with 4000 end nodes and multiple gateways.

I would expect ChirpStack to know the EU868 regulations; it knows how many gateways are available and how many messages are to be sent, and it can probably also calculate time on air based on the current SF.
Given all that, I would expect ChirpStack + Kerlink to do the scheduling job for me instead of getting stuck on wrong timings.

I have heard multiple times from different people that “LoRa is not too good at sending many downlink messages”, and I can understand that there are duty-cycle restrictions etc., but… from my perspective either something is wrong here, or my expectations are just too high :slight_smile:
I would understand that scheduling 1000 confirmed messages can be slow and take hours to accomplish, but I cannot see a reason for it to get “stuck” forever…

First I changed my implementation from confirmed to unconfirmed messages, but unfortunately the problem remained the same :confused:

I was hoping it was mostly a problem of waiting for device confirmations, but it looks like it’s a general problem of waiting for the TX_ACK vs. token expiry.
Unconfirmed messages also expect a TX_ACK, and the gateway is sending these TX_ACKs too slowly (because of the duty cycle).

I ended up with my own rate-limiter implementation on the application side, as a workaround for the problem.
I have only done initial testing, but it works as expected, without the “getting stuck” effect.
I keep my own queue of downlink messages that I want to enqueue on ChirpStack, and I push them to the ChirpStack gRPC API at a rate of 1 per 3 seconds.
Additionally, I analyze the event.ack and event.txack events coming from ChirpStack (depending on whether it’s a confirmed or unconfirmed message), track the “pending” messages, and never exceed some low number like 5. A sketch follows below.
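
A minimal sketch of the approach (Python, using the chirpstack-api gRPC package; the server address, the API token handling, the way I correlate acks via the queue-item id, and all helper names are my own assumptions, and error handling is omitted):

    import time
    import grpc
    from chirpstack_api import api

    AUTH = [("authorization", "Bearer <api-token>")]   # ChirpStack API key
    channel = grpc.insecure_channel("localhost:8080")  # assumed gRPC endpoint
    device_service = api.DeviceServiceStub(channel)

    MAX_PENDING = 5           # never more than 5 un-acked downlinks in flight
    ENQUEUE_INTERVAL_S = 3.0  # 1 message per 3 seconds

    pending = set()  # queue-item ids still waiting for event.txack / event.ack
    backlog = []     # (dev_eui, f_port, payload) tuples waiting to be enqueued

    def on_ack_event(queue_item_id: str) -> None:
        # Called from the integration handler when event.txack (unconfirmed)
        # or event.ack (confirmed) arrives from ChirpStack.
        pending.discard(queue_item_id)

    def pump() -> None:
        # Push at most 1 message per 3 s, and only while little is pending.
        while backlog:
            if len(pending) >= MAX_PENDING:
                time.sleep(0.5)
                continue
            dev_eui, f_port, payload = backlog.pop(0)
            req = api.EnqueueDeviceQueueItemRequest()
            req.queue_item.dev_eui = dev_eui
            req.queue_item.f_port = f_port
            req.queue_item.confirmed = False
            req.queue_item.data = payload
            resp = device_service.Enqueue(req, metadata=AUTH)
            pending.add(resp.id)  # resolved later by on_ack_event()
            time.sleep(ENQUEUE_INTERVAL_S)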

I still think a similar implementation should be introduced inside ChirpStack, as it has much more knowledge (the number of gateways, which gateways are used when sending a message, which SF is used, etc.).

Currently there are 3 different queues:

  1. on the gateway, making sure we stay within the permitted duty cycle
  2. in ChirpStack, holding all enqueued messages for every device
  3. in my application, making sure I’m not pushing too much, too fast, to ChirpStack.

This seems to be way too complex :wink:

Kerlink’s stock software doesn’t have duty-cycle restrictions, unless you installed something that does.
Since later versions of the UDP packet forwarder, Class-C downlink requests are queued and scheduled on the LoRa gateway by the internal Just-In-Time (JIT) scheduler.

Contemporary gateways based on a single SX1301 can only transmit one frame at a time, and may be half-duplex.

ChirpStack doesn’t automatically retry sending messages. The setting you quote is described as:

This defines the lock duration between scheduling two Class-C downlink payloads for the same device. The purpose of this lock is to avoid overlap between scheduling Class-C downlinks and / or spreading the downlink capacity load on the gateway.

If you’re sending so many messages that it’s not possible for the gateway to transmit them fast enough, are you breaking your local regulations? It sounds like you’re continuously transmitting.

What data rate are you using for RX2? If you’re using the lowest possible, why not consider a higher data rate? Using the lowest may lead to the gateway spending a lot of time in transmit mode, causing uplinks to be lost in this use case. Every time you go up by 1 level, you halve the time spent transmitting the message.

Is it necessary to send all 3 messages to every device at once? What’s the behaviour of your device like?
I would have expected the device to respond to every downlink, so you wouldn’t actually be able to send downlinks like that.

@sp193 I did some more investigation after your message…
You are right about the “continuous transmitting”.
For the Class-C RX2 downlink, SF12 is used with 125 kHz bandwidth.
For a 50-byte payload that gives ~2.8 s time on air according to the calculator.
It also uses only a single downlink channel, where a 10% duty cycle is allowed (360 out of 3600 seconds per hour).

I’m not sure yet about Kerlink’s duty-cycle restrictions (if they are in place, they should start limiting at ~120 messages / hour).
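
For anyone who wants to reproduce these numbers, here is a small sketch of the standard LoRa time-on-air formula (from Semtech AN1200.13) in Python. The ~13 bytes of LoRaWAN frame overhead on top of the 50-byte application payload is my assumption; with it, SF12/125 kHz lands at ~2.8 s, and the 10% duty-cycle budget allows on the order of 120-130 such messages per hour:

    import math

    def lora_airtime_s(phy_payload_len: int, sf: int, bw_hz: int = 125_000,
                       preamble_len: int = 8, coding_rate: int = 1,
                       explicit_header: bool = True, crc: bool = True) -> float:
        """Time on air per Semtech AN1200.13; coding_rate=1 means 4/5."""
        t_sym = (2 ** sf) / bw_hz
        de = 1 if (sf >= 11 and bw_hz == 125_000) else 0  # low-data-rate optimization
        h = 0 if explicit_header else 1
        crc_bits = 16 if crc else 0
        payload_symbols = 8 + max(
            math.ceil((8 * phy_payload_len - 4 * sf + 28 + crc_bits - 20 * h)
                      / (4 * (sf - 2 * de))) * (coding_rate + 4), 0)
        return (preamble_len + 4.25 + payload_symbols) * t_sym

    # 50-byte application payload + ~13 bytes MHDR/FHDR/FPort/MIC (assumed)
    toa = lora_airtime_s(63, sf=12)     # ~2.79 s
    per_hour = int(3600 * 0.10 / toa)   # 10% duty cycle -> ~128 messages/hour
    print(f"ToA: {toa:.2f} s, max ~{per_hour} messages/hour")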

When it comes to “ChirpStack doesn’t automatically retry sending messages”,
I can’t confirm that…
See the logs below from the gateway packet forwarder.
These are the logs after concurrently enqueuing 10 unconfirmed messages for 10 different devices in ChirpStack (1 message per device).

Between 15:26:05 and 15:26:07, ChirpStack sent eu868/gateway/EUI/command/down (causing the “Sending PULL_RESP” log entry) 10 times while getting 1 TX_ACK without an error.

Starting from 15:26:11.383 we get more and more PULL_RESPs (ChirpStack retries, because there was a TX_ACK for only one message).

As a result we get a TX_ACK every ~3 seconds, because this is how fast the radio can transmit (it matches the ~2.8 s time on air), and many more PULL_RESPs, because ChirpStack really wants the messages to be sent.

From what I observe, every PULL_RESP is queued and sent over the radio, causing more and more delay and the snowball effect I described earlier.

My general idea is to use RX1 instead of RX2, but that is also a bit painful, as I can’t enqueue the message earlier and tell ChirpStack to wait for the next uplink and use RX1. What I actually have to do is wait for the uplink and fit my message into the restrictive get_downlink_data_delay="100ms" window.

I can see at least 2 possible improvements in ChirpStack:
1. Do not flood the gateway with messages, as it is not possible to transmit them at that rate using RX2.
2. Prepare separate queues for RX1 and RX2 messages when Class C is used.

It would be nice if one of the ChirpStack developers could confirm my observations and thoughts :slight_smile:

2025-03-14T15:26:05.236: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:05.236: Received downlink command, downlink_id: 1082741182, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:05.236: Sending PULL_RESP, random_token: 20926, remote: 127.0.0.1:59059
2025-03-14T15:26:06.226: TX_ACK received, random_token: 20926, remote: 127.0.0.1:59059, error: NONE
2025-03-14T15:26:06.226: Sending ack event, downlink_id: 1082741182, topic: eu868/gateway/7076ff00560908fb/event/ack
2025-03-14T15:26:06.338: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.338: Received downlink command, downlink_id: 526071507, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.338: Sending PULL_RESP, random_token: 14035, remote: 127.0.0.1:59059
2025-03-14T15:26:06.339: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.339: Received downlink command, downlink_id: 623895658, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.339: Sending PULL_RESP, random_token: 58474, remote: 127.0.0.1:59059
2025-03-14T15:26:06.339: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.339: Received downlink command, downlink_id: 1675975495, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.340: Sending PULL_RESP, random_token: 23367, remote: 127.0.0.1:59059
2025-03-14T15:26:06.340: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.340: Received downlink command, downlink_id: 3607647443, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.345: Sending PULL_RESP, random_token: 21715, remote: 127.0.0.1:59059
2025-03-14T15:26:06.345: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.348: Received downlink command, downlink_id: 45049201, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.349: Sending PULL_RESP, random_token: 25969, remote: 127.0.0.1:59059
2025-03-14T15:26:06.352: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:06.353: Received downlink command, downlink_id: 1178251610, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:06.354: Sending PULL_RESP, random_token: 45402, remote: 127.0.0.1:59059
2025-03-14T15:26:07.319: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:07.319: Received downlink command, downlink_id: 3870859697, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:07.320: Sending PULL_RESP, random_token: 41393, remote: 127.0.0.1:59059
2025-03-14T15:26:07.320: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:07.320: Received downlink command, downlink_id: 87061580, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:07.320: Sending PULL_RESP, random_token: 29772, remote: 127.0.0.1:59059
2025-03-14T15:26:07.320: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:07.321: Received downlink command, downlink_id: 555077538, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:07.321: Sending PULL_RESP, random_token: 53154, remote: 127.0.0.1:59059
2025-03-14T15:26:07.327: TX_ACK received, random_token: 14035, remote: 127.0.0.1:59059, error: UNKNOWN

// 10 PULL_RESPs sent and only 2 TX_ACKs received (one with error: UNKNOWN)

2025-03-14T15:26:07.327: Sending ack event, downlink_id: 526071507, topic: eu868/gateway/7076ff00560908fb/event/ack
2025-03-14T15:26:09.102: PUSH_DATA received, random_token: 998, remote: 127.0.0.1:55712
2025-03-14T15:26:09.102: Sending PUSH_ACK, random_token: 998 remote: 127.0.0.1:55712
2025-03-14T15:26:09.103: Sending uplink event, uplink_id: 1933197396, topic: eu868/gateway/7076ff00560908fb/event/up
2025-03-14T15:26:10.021: TX_ACK received, random_token: 58474, remote: 127.0.0.1:59059, error: NONE
2025-03-14T15:26:10.021: Sending ack event, downlink_id: 623895658, topic: eu868/gateway/7076ff00560908fb/event/ack
2025-03-14T15:26:11.383: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:11.383: Received downlink command, downlink_id: 501095965, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:11.383: Sending PULL_RESP, random_token: 7709, remote: 127.0.0.1:59059
2025-03-14T15:26:11.383: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:11.383: Received downlink command, downlink_id: 2320264485, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:11.383: Sending PULL_RESP, random_token: 27941, remote: 127.0.0.1:59059
2025-03-14T15:26:11.384: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:11.384: Received downlink command, downlink_id: 90034669, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:11.384: Sending PULL_RESP, random_token: 53741, remote: 127.0.0.1:59059
2025-03-14T15:26:11.384: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:11.384: Received downlink command, downlink_id: 3182963888, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:11.388: Sending PULL_RESP, random_token: 11440, remote: 127.0.0.1:59059
2025-03-14T15:26:11.389: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:11.390: Received downlink command, downlink_id: 1920175262, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:11.390: Sending PULL_RESP, random_token: 35998, remote: 127.0.0.1:59059
2025-03-14T15:26:11.532: PULL_DATA received, random_token: 3232, remote: 127.0.0.1:59059
2025-03-14T15:26:11.532: Sending PULL_ACK, random_token: 3232, remote: 127.0.0.1:59059
2025-03-14T15:26:12.484: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:12.484: Received downlink command, downlink_id: 1856120015, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:12.484: Sending PULL_RESP, random_token: 9423, remote: 127.0.0.1:59059
2025-03-14T15:26:12.485: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:12.485: Received downlink command, downlink_id: 1641539137, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:12.485: Sending PULL_RESP, random_token: 58945, remote: 127.0.0.1:59059
2025-03-14T15:26:12.485: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:12.485: Received downlink command, downlink_id: 3173803910, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:12.485: Sending PULL_RESP, random_token: 26502, remote: 127.0.0.1:59059
2025-03-14T15:26:12.709: TX_ACK received, random_token: 23367, remote: 127.0.0.1:59059, error: NONE
2025-03-14T15:26:12.709: Sending ack event, downlink_id: 1675975495, topic: eu868/gateway/7076ff00560908fb/event/ack
2025-03-14T15:26:15.406: TX_ACK received, random_token: 21715, remote: 127.0.0.1:59059, error: NONE
2025-03-14T15:26:15.406: Sending ack event, downlink_id: 3607647443, topic: eu868/gateway/7076ff00560908fb/event/ack
2025-03-14T15:26:16.457: Received message, topic: eu868/gateway/7076ff00560908fb/command/down, qos: 0
2025-03-14T15:26:16.457: Received downlink command, downlink_id: 1107056361, topic: eu868/gateway/7076ff00560908fb/command/down
2025-03-14T15:26:16.457: Sending PULL_RESP, random_token: 22249, remote: 127.0.0.1:59059
...
// many more PULL_RESPs and TX_ACKs here before all 10 messages disappear from the ChirpStack queues

Hi.

You can adjust the RX2 data rate to something higher; it’s just DR0 by default. LoRaWAN’s defaults tend to be conservative, so that it works with as many use cases as possible. You are meant to adjust it to fit your deployment sites.

Personally, I am using DR3. Go as high as you comfortably can. Every time you go up by 1 data rate, you are approximately halving your air time.
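
The halving follows directly from the symbol duration, T_sym = 2^SF / BW: for the same payload, a frame takes a similar number of symbols at adjacent SFs, so each step up in DR (one step down in SF) roughly halves the time on air. Going from DR0 (SF12) to DR3 (SF9), for example, is roughly a 2^3 = 8x reduction.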

DR0 is very slow. If you use an airtime calculator, you can compute how much air time your messages are taking. It’s possible for 1 message to take a few seconds to transmit at this data rate. Please keep within your duty-cycle regulations.

Your gateway might also be unable to receive uplinks while a transmission takes place. Using low data rates can be a disadvantage if misused.

I know that ChirpStack doesn’t retransmit, because I’ve been inside its code before. There is retransmission in the sense that it will try both RX1 and RX2, since you’re using the UDP packet forwarder, but it will not automatically repeat unconfirmed downlinks for you. Only the packet forwarder knows which timing(s) are in use, so ChirpStack can only try them one by one.

My general idea is to use RX1 instead of RX2, but that is also a bit painful, as I can’t enqueue the message earlier and tell ChirpStack to wait for the next uplink and use RX1. What I actually have to do is wait for the uplink and fit my message into the restrictive get_downlink_data_delay="100ms" window.

You could adjust that too, if your server cannot respond fast enough. But 100 ms is quite a lot of time for computers. It needs to balance your server’s response time against the round-trip time (RTT) between the LoRa gateway and the LNS.
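
For reference, that knob lives in chirpstack.toml; the placement below is my reading of the example v4 config, so verify against your version:

    [network]
      # how long ChirpStack waits for the application to enqueue a downlink
      # after forwarding an uplink, before it reads the device queue
      get_downlink_data_delay="100ms"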

I would choose according to the use case. If you intend to send ad-hoc commands (which is presumably why you chose Class C), then you should just use Class C.
Anyway, the JIT queue in the UDP packet forwarder only applies to Class C.

I know that there is a use case in the EU where RX2 may be placed in a band with a higher duty cycle and/or transmission-power limit. In my region, such a thing doesn’t exist.

However, RX2’s characteristics are fixed and cannot be adjusted by ADR. So that is its weakness.

Prepare separate queues for RX1 and RX2 messages when Class C is used.

You can implement this in your application. After all, ChirpStack wouldn’t know whether you want a message sent now (Class C) or later (Class A-styled).
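
A rough sketch of what that application-side “Class A-styled” queue could look like (Python with paho-mqtt against the ChirpStack v4 MQTT integration; the topic format and event field names follow my understanding of v4, and the application id, endpoints and helper names are assumptions): hold the payload until the device’s next uplink event, then enqueue it.

    import json
    import grpc
    import paho.mqtt.client as mqtt
    from chirpstack_api import api

    APPLICATION_ID = "<application-id>"               # assumption
    AUTH = [("authorization", "Bearer <api-token>")]  # assumption

    channel = grpc.insecure_channel("localhost:8080")
    device_service = api.DeviceServiceStub(channel)

    # “Class A-styled” queue: payloads held back until the device talks to us.
    held_back = {}  # dev_eui -> list of (f_port, payload)

    def on_message(client, userdata, msg):
        # Uplink event for some device: flush its held-back downlinks now,
        # so the first of them has a chance to ride RX1 (if the enqueue fits
        # within the get_downlink_data_delay window); otherwise RX2 is used.
        event = json.loads(msg.payload)
        dev_eui = event["deviceInfo"]["devEui"]
        for f_port, payload in held_back.pop(dev_eui, []):
            req = api.EnqueueDeviceQueueItemRequest()
            req.queue_item.dev_eui = dev_eui
            req.queue_item.f_port = f_port
            req.queue_item.confirmed = False
            req.queue_item.data = payload
            device_service.Enqueue(req, metadata=AUTH)

    client = mqtt.Client()  # paho-mqtt 1.x style client
    client.on_message = on_message
    client.connect("localhost", 1883)
    client.subscribe(f"application/{APPLICATION_ID}/device/+/event/up")
    client.loop_forever()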