Device with confirmedUp Uplink doesn't receive Downlink within Rx1 Receive Window

I am testing LoRa Devices with confirmedUp Uplinks.
Unfortunately sometimes the Downlink isn’t received by the Sensor.

A possible reason for this problem is that the Downlink is sometimes sent as late as 2 seconds after the Uplink was received.
Here is an example using the LoRa Field Tester:


Sometimes the Downlink is sent immediately, sometimes after 1 second and sometimes after 2 seconds.

The default Receive Window for Rx1 is 1 second after the Uplink is sent and Rx2 is 2 seconds after Uplink.

However the LoRa Specification defines a fixed frequency and data rate for the Rx2 window (EU868: 869.525 MHz / DR0 (SF12, 125 kHz)). Source: https://lora-alliance.org/wp-content/uploads/2021/05/RP002-1.0.3-FINAL-1.pdf Page: 29 Line: 519

According to the image above, the normal (Rx1) frequency and data rate are used even for Downlinks sent after 2 seconds, not the Rx2 frequency and data rate.

Is this expected behavior?
Could that result in the LoRa Device not receiving the Downlink?
Why would the Downlink be delayed for up to 2 seconds?

Using mosquitto_sub to watch gateway/+/event/ack for problems on the gateway side shows no errors: the status is always OK.

Also sometimes no Downlink is received by the Device and the Downlink doesn’t show up in the LoRaWAN Frames:


Using mosquitto_sub to look at all the communication between server and gateway shows that the Downlink was sent to the gateway, but the Field Tester didn’t receive the downlink (0 bars for up and downlink)

gateway/<redacted>002/event/up {"phyPayload":"QBF7pQGBMgACAcTjOpjs+kCFQnnRxQ==","txInfo":{"frequency":868100000,"modulation":"LORA","loRaModulationInfo":{"bandwidth":125,"spreadingFactor":7,"codeRate":"4/5","polarizationInversion":false}},"rxInfo":{"gatewayID":"<redacted>","time":"2021-11-09T13:28:51Z","timeSinceGPSEpoch":"1320499776.934151s","rssi":-115,"loRaSNR":-5,"channel":0,"rfChain":0,"board":0,"antenna":0,"location":null,"fineTimestampType":"NONE","context":"AAAAAAAAAAAACwABUHdBHA==","uplinkID":"49+unhowSSihcWWe5lLnww==","crcStatus":"CRC_OK"}}
gateway/<redacted>002/command/down {"phyPayload":"YBF7pQGDHQACAgEYNRU+","txInfo":{"gatewayID":"<redacted>","frequency":868100000,"power":14,"modulation":"LORA","loRaModulationInfo":{"bandwidth":125,"spreadingFactor":7,"codeRate":"4/5","polarizationInversion":true},"board":0,"antenna":0,"timing":"DELAY","delayTimingInfo":{"delay":"1s"},"context":"AAAAAAAAAAAACwABUHdBHA=="},"token":16046,"downlinkID":"Pq6xAz6zRFawVRSGzxQpZg==","items":[{"phyPayload":"YBF7pQGDHQACAgEYNRU+","txInfo":{"gatewayID":null,"frequency":868100000,"power":14,"modulation":"LORA","loRaModulationInfo":{"bandwidth":125,"spreadingFactor":7,"codeRate":"4/5","polarizationInversion":true},"board":0,"antenna":0,"timing":"DELAY","delayTimingInfo":{"delay":"1s"},"context":"AAAAAAAAAAAACwABUHdBHA=="}},{"phyPayload":"YBF7pQGDHQACAgEYNRU+","txInfo":{"gatewayID":null,"frequency":869525000,"power":27,"modulation":"LORA","loRaModulationInfo":{"bandwidth":125,"spreadingFactor":12,"codeRate":"4/5","polarizationInversion":true},"board":0,"antenna":0,"timing":"DELAY","delayTimingInfo":{"delay":"2s"},"context":"AAAAAAAAAAAACwABUHdBHA=="}}],"gatewayID":"<redacted>"}
gateway/<redacted>002/event/ack {"gatewayID":"<redacted>","token":16046,"error":"","downlinkID":"Pq6xAz6zRFawVRSGzxQpZg==","items":[{"status":"OK"}]}
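To make the capture above easier to read, here is a minimal Python sketch that lists the downlink opportunities contained in a command/down payload. The function name `list_opportunities` and the trimmed `sample` string are my own, for illustration only; the field names are taken from the JSON shown above.

```python
import json


def list_opportunities(down_json):
    """Return one summary dict per entry in the "items" list of a command/down payload."""
    command = json.loads(down_json)
    summaries = []
    for item in command.get("items", []):
        tx_info = item["txInfo"]
        modulation = tx_info["loRaModulationInfo"]
        summaries.append({
            "frequency_hz": tx_info["frequency"],
            "spreading_factor": modulation["spreadingFactor"],
            "bandwidth_khz": modulation["bandwidth"],
            "delay": tx_info["delayTimingInfo"]["delay"],
            "power_dbm": tx_info["power"],
        })
    return summaries


# Trimmed copy of the command/down payload above (only the fields used here).
sample = (
    '{"items":['
    '{"txInfo":{"frequency":868100000,"power":14,'
    '"loRaModulationInfo":{"bandwidth":125,"spreadingFactor":7},'
    '"timing":"DELAY","delayTimingInfo":{"delay":"1s"}}},'
    '{"txInfo":{"frequency":869525000,"power":27,'
    '"loRaModulationInfo":{"bandwidth":125,"spreadingFactor":12},'
    '"timing":"DELAY","delayTimingInfo":{"delay":"2s"}}}'
    ']}'
)

for opportunity in list_opportunities(sample):
    print(opportunity)
```

For this capture the first item is the Rx1 opportunity (868.1 MHz, SF7, 1s delay) and the second is the Rx2 opportunity (869.525 MHz, SF12, 2s delay).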

The only other thing I can think of is that the timing for the Receive Window on the LoRa Device and the sent Downlink is misaligned. However ping between the Chirpstack Servers and between the Gateway Bridge and the Gateway are sub 1 millisecond…
The problem must be with the end device, right? The Field Tester works but the other sensors don’t.
Are there any other debugging steps I can take on the server side or with the gateways?

Additional Info:
LoRa Region: EU868
Chirpstack Gateway Bridge Version: 3.13.1
Chirpstack Network Server Version: 3.15.1 (upgrading isn’t currently possible due to production environment)
Chirpstack Application Server Version: 3.17.2
Field Tester MAC Version: 1.0.2a
other LoRa Sensor MAC Version: 1.0.3a
Used Gateway: Cisco LoRaWAN Gateway with common packet forwarder (semtech)
Chirpstack Network Server settings are all unchanged from default (deduplication_delay="200ms", get_downlink_data_delay="100ms", rx_window=0, rx1_delay=1, …)

Check and observe logs of the transmitting gateway (if you have more than one)

Hello Toby
You are reporting that your problem nodes use MAC 1.0.3a and your tester uses 1.0.2a. Does your test device allow you to change the LoRaWAN MAC version? (e.g. on RHF 076 you can change it with AT+LW=VER,102, and other modems may have a similar command.) Check the device manual or the manual for the modem it uses. It would be interesting to see what happens when you set the MAC back to 1.0.2x.

Maybe this explanation helps to understand the timing:

The uplink frame contains a context object which holds the gateway's internal timestamp. ChirpStack returns this context object when sending the downlink in the case of a timing: DELAY downlink. The ChirpStack Gateway Bridge then knows how to decode the internal time from the context and increment it by the delay specified in the downlink.

ChirpStack could send multiple downlink opportunities (e.g. RX1, RX2). The ack event contains a status list from which you can read which item was accepted by the gateway.
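As a rough illustration of reading that status list, the Python sketch below matches an ack to its command/down message and returns the first accepted item. It assumes the ack's items list is in the same order as the command's items list and that the two messages share the same token, as in the capture earlier in this thread. The function name `accepted_item` is hypothetical, and the "TOO_LATE" status string in the usage example is only there to show a non-OK case.

```python
import json


def accepted_item(down_json, ack_json):
    """Return (index, txInfo) of the first item acked with status OK, else None.

    Assumption: ack["items"][i] refers to down["items"][i].
    """
    down = json.loads(down_json)
    ack = json.loads(ack_json)
    if down.get("token") != ack.get("token"):
        raise ValueError("ack token does not match the downlink command")
    for index, (item, result) in enumerate(zip(down["items"], ack["items"])):
        if result.get("status") == "OK":
            return index, item["txInfo"]
    return None


# Trimmed example messages modelled on the capture in this thread.
down_example = json.dumps({
    "token": 16046,
    "items": [
        {"txInfo": {"frequency": 868100000, "delayTimingInfo": {"delay": "1s"}}},
        {"txInfo": {"frequency": 869525000, "delayTimingInfo": {"delay": "2s"}}},
    ],
})
ack_example = json.dumps({"token": 16046, "items": [{"status": "OK"}]})

result = accepted_item(down_example, ack_example)
if result is None:
    print("no item was accepted by the gateway")
else:
    index, tx_info = result
    print("accepted item", index, "on", tx_info["frequency"], "Hz")
```

With the ack from the capture (a single status OK), this reports item 0, i.e. the Rx1 opportunity on 868.1 MHz.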

If ChirpStack were too late, the downlink would be rejected by the gateway, because the internal timestamp at which it should be transmitted would already have passed.
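The timing logic described above can be sketched roughly as follows. This is a simplified model, not the Gateway Bridge's actual code: it assumes the gateway's internal timestamp is a free-running 32-bit microsecond counter (as in the Semtech packet forwarder), and both function names are hypothetical. A real gateway also needs some scheduling margin before the transmit time, which this sketch ignores.

```python
COUNTER_MODULUS = 2 ** 32  # assumed: 32-bit microsecond counter that wraps around


def scheduled_tx_counter(uplink_counter_us, delay_s):
    """Counter value at which the downlink should be transmitted."""
    return (uplink_counter_us + int(delay_s * 1_000_000)) % COUNTER_MODULUS


def is_too_late(now_counter_us, tx_counter_us):
    """True if the scheduled transmit time has already passed.

    The difference is interpreted modulo the counter size so that a
    wraparound between the uplink and the downlink is handled correctly.
    """
    remaining = (tx_counter_us - now_counter_us) % COUNTER_MODULUS
    return remaining >= COUNTER_MODULUS // 2


uplink_counter = 5_000_000
rx1_counter = scheduled_tx_counter(uplink_counter, 1)  # Rx1: uplink + 1 s
rx2_counter = scheduled_tx_counter(uplink_counter, 2)  # Rx2: uplink + 2 s

# Downlink command reaches the gateway 300 ms after the uplink: Rx1 is still possible.
print(is_too_late(uplink_counter + 300_000, rx1_counter))
# Downlink command reaches the gateway 1.2 s after the uplink: Rx1 is missed, Rx2 is not.
print(is_too_late(uplink_counter + 1_200_000, rx1_counter))
print(is_too_late(uplink_counter + 1_200_000, rx2_counter))
```

Under this model a downlink never goes out "late" in Rx1: it is either transmitted at exactly the scheduled counter value or rejected, in which case the next item (Rx2) is the fallback.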