ChirpStack v4 ERROR DOWNLINK_PAYLOAD_SIZE exceeds

Hi guys,
For downlinks, based on the code, the max payload size is calculated from the RX2 DR:

    // get remaining payload size
    let max_pl_size = self.region_conf.get_max_payload_size(
        self.device_session.mac_version().from_proto(),
        self.device_profile.reg_params_revision,
        self.device_session.rx2_dr as u8,
    )?;

In chirpstack/data.rs at commit e78dac316acd3c2c33bfc4f0c48167d1c7458540 (chirpstack/chirpstack on GitHub).

So this is different from the DR indicated in the last uplink.

Hope this helps,

Thanks, I finally figured that out a couple of days ago as well. The RX2 window is pegged to DR8 in US915, which has a max payload size of 53 bytes, and to DR0 in EU868, which has a max payload size of 51 bytes (from the Regional Parameters spec v1.1rA).
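To spell that out, here is a minimal sketch (plain Rust, not ChirpStack code; it just encodes the two region/DR/size pairs mentioned above) of why the uplink DR never enters into the Class C limit:

    /// Max downlink payload when the frame is scheduled for the RX2 window.
    /// US915 pins RX2 to DR8 (53 bytes); EU868 defaults RX2 to DR0 (51 bytes).
    fn rx2_max_payload(region: &str) -> Option<usize> {
        match region {
            "US915" => Some(53), // RX2 fixed at DR8
            "EU868" => Some(51), // RX2 default of DR0
            _ => None,
        }
    }

    fn main() {
        // Whatever DR the last uplink used, the limit comes from the RX2 DR.
        assert_eq!(rx2_max_payload("US915"), Some(53));
        assert_eq!(rx2_max_payload("EU868"), Some(51));
    }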

What has been confusing me is that there is effectively no RX1 window for Class C devices. The code snippet posted by @Jerome73 is from set_tx_info_for_rx2, line 1235.

There is also a set_tx_info_for_rx1, which calculates the max size as a function of the uplink DR and the region config rx_1_dr_offset value. However, set_tx_info_for_rx1 is not called for Class C devices. From _handle_schedule_next_queue_item, line 194:

    if ctx._is_class_c() {
        ctx.get_class_c_device_lock().await?;
        ctx.set_immediately()?;
        ctx.set_tx_info_for_rx2()?;
    }
    if ctx._is_class_b() {
        ctx.set_tx_info_for_class_b_and_lock_device().await?;
    }
    if ctx._is_class_a() {
        return Err(anyhow!("Invalid device-class"));
    }

Note also the “Invalid device-class” error for Class A devices. This indicates that a downlink message for a Class A device must already be queued before the uplink message arrives. The uplink handler calls the downlink handle_response, which ultimately results in setting both RX1 and RX2 tx info before sending the response message. However, for Class C devices there is never anything to send at that point, since any downlink message for a Class C device is always sent immediately as an RX2 window message. As such, Class C devices can never receive a downlink message in the RX1 window.

I have traced through the v3 Go code, and the logic appears to be the same for downlink messages, although I could swear I was able to send downlink messages larger than 53 bytes with v3. There is a difference in how the device lock is set in the uplink handling, though; perhaps that accounts for it. The v4 uplink handler does not seem to take the same RX1 window configuration (rx1_delay) into consideration. I might very well be missing something and/or not fully understanding the logic flow yet.

All that said, the current implementation appears to work contrary to my understanding of Class C downlink messages. Granted, my understanding could very well be incorrect. While the Regional Parameters doc does not specifically mention device class in the receive window sections, this article in Semtech's technical documentation, An In-depth Look at LoRaWAN® Class C Devices, specifically states:

Class C end devices implement the same two receive windows as Class A devices, but they do not close the RX2 window until they send the next transmission back to the server. Therefore, they can receive a downlink in the RX2 window at almost any time. A short window at the RX2 frequency and data rate is also opened between the end of the transmission and the beginning of the RX1 receive window, as illustrated in Figure 1.

[Figure 1 from the Semtech article: Class C receive window timing]

While ChirpStack technically does implement the RX1 receive window for Class C devices, it seems impossible to insert a downlink message, as a response to an uplink message, into the system in such a way that it is treated as falling within the RX1 receive delay period. To do so would require the uplink handler to wait until the configured rx1_delay period has expired before initiating the RX2 downlink process, which seems non-trivial.

If my understanding of the Class C downlink handling is correct and this is a legitimate issue with ChirpStack, I am happy to write it up as an issue on GitHub. If my understanding is incomplete, I welcome more clarity.

For now, I will adjust my application to better handle downlink messages larger than the RX2 limit (53 bytes in US915, 51 in EU868) by breaking them into multiple downlink messages, as sketched below. This seems prudent even if the downlink RX1 window could be used, at least as a fallback. But then I'm not sure the application layer can know which DR is going to be used for a given downlink message. That is a discussion for another day.
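For reference, a minimal chunking sketch (plain Rust, independent of any ChirpStack API; the limits are the RX2 values discussed above, and a real implementation would also leave headroom for MAC commands and handle device-side reassembly):

    /// Split an application payload into fragments that fit the RX2 limit.
    /// `max_len` would be 53 for US915 or 51 for EU868.
    fn chunk_payload(payload: &[u8], max_len: usize) -> Vec<Vec<u8>> {
        payload.chunks(max_len).map(|c| c.to_vec()).collect()
    }

    fn main() {
        let payload = vec![0u8; 120];
        let fragments = chunk_payload(&payload, 53); // US915 RX2 limit
        assert_eq!(fragments.len(), 3); // 53 + 53 + 14 bytes
    }

Each fragment would then be enqueued as its own downlink queue item.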


I noticed this was missing in the v4 implementation (it will be included in the next release):

Maybe your issue is related to this?


Yes, I think this is the issue. Without a slight delay, there is nothing to send when the uplink handler builds the downlink response. My integration is handled quickly, but not as quickly as the downlink handle_response process completes.

In examining the network config defaults, it looks like the default delay is 100ms. If that needs to be increased, we simply add to the network config (the value is a duration string such as "100ms"):

    [network]
    ...
    get_downlink_data_delay=<x>
    ...

I did not have this set in the v3 config, which has the same 100ms default, so it seems like this v4 change is exactly what is needed.

Thanks so much for looking into this. I hope my analysis helped. I know it helped me to better understand how the downlink and receive windows work.


Side question (maybe this should be a new topic, but it is directly related to this scenario): how would an integration know whether it is able to use the RX1 window (send a larger response) or has to plan for the RX2 window (possibly send multiple smaller responses)?

It seems to me the only option is to track the downlink queue item ID and wait for the “ack” message containing that ID. If the “ack” never comes, try a smaller message.
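In code, the strategy would look roughly like this (a sketch only; enqueue_downlink and wait_for_ack are hypothetical wrappers around the ChirpStack gRPC device-queue API and the integration event stream, not real ChirpStack functions):

    use std::time::Duration;

    // Hypothetical wrappers -- enqueue_downlink() would call the device-queue
    // gRPC API and return the queue item ID; wait_for_ack() would watch the
    // integration event stream for an "ack" event carrying that ID.
    async fn enqueue_downlink(_payload: &[u8]) -> anyhow::Result<String> {
        unimplemented!()
    }
    async fn wait_for_ack(_id: &str, _timeout: Duration) -> anyhow::Result<()> {
        unimplemented!()
    }

    // Enqueue one large downlink; if no "ack" arrives for its queue item ID,
    // fall back to RX2-sized fragments (53 bytes in US915, 51 in EU868).
    async fn send_with_fallback(payload: &[u8]) -> anyhow::Result<()> {
        let id = enqueue_downlink(payload).await?;
        if wait_for_ack(&id, Duration::from_secs(10)).await.is_err() {
            for fragment in payload.chunks(53) {
                enqueue_downlink(fragment).await?;
            }
        }
        Ok(())
    }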

Thanks again.