Gateway routing behavior

Curt_Black · May 5, 2021, 9:55pm

From the configuration:

  # Prefer gateways for downlink with given uplink (SNR) margin.
  #
  # When receiving an uplink (by multiple gateways), the Network Server will
  # prefer the gateways that have at least the configured margin for the uplink
  # SNR when sending a downlink. Margin:
  #   uplink SNR - required SNR for spreading factor
  #
  #  * In case multiple gateways match, the Network Server will select a random
  #    gateway from the match.
  #  * In case none of the gateways have the desired margin or the uplink
  #    modulation was not LoRa, then the gateway with the best SNR (or RSSI
  #    in case of FSK) will be selected when sending a downlink.
  gateway_prefer_min_margin=10
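
The selection rule in that comment can be sketched roughly as follows. This is only an illustration of the documented behavior, not ChirpStack's actual code: the function name, the `rx_info` dict keys, and the required-SNR table (the commonly cited LoRa demodulation floors per spreading factor) are assumptions made for the example.

```python
import random

# Commonly cited LoRa demodulation floor per spreading factor, in dB (assumed values).
REQUIRED_SNR_DB = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}


def pick_downlink_gateway(rx_info, spreading_factor, min_margin_db, rng=random):
    """Pick a gateway ID for the downlink from the uplink's receive metadata.

    rx_info: list of dicts with hypothetical keys "gateway_id", "snr", "rssi".
    spreading_factor: uplink SF, or None if the uplink was not LoRa (FSK).
    """
    if not rx_info:
        raise ValueError("uplink was not received by any gateway")

    if spreading_factor is None:
        # Non-LoRa uplink: fall back to the strongest RSSI.
        return max(rx_info, key=lambda rx: rx["rssi"])["gateway_id"]

    required = REQUIRED_SNR_DB[spreading_factor]
    # Margin = uplink SNR - required SNR for the spreading factor.
    preferred = [rx for rx in rx_info if rx["snr"] - required >= min_margin_db]
    if preferred:
        # Several gateways may meet the margin: choose one at random.
        return rng.choice(preferred)["gateway_id"]

    # No gateway meets the margin: fall back to the best SNR.
    return max(rx_info, key=lambda rx: rx["snr"])["gateway_id"]
```

Note that when only one gateway received the uplink, the fallback branch returns that gateway whether or not it meets the margin.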

This describes an ability to set a margin, but based on the last bullet, I think that if only ONE gateway receives the packet, the LNS will switch the routing info for that device to that gateway.

Consider the following diagram, where multiple gateways are present:

[Image: overlap]

Now the following scenario:

1. “Node” is on the edge of GW2 coverage.
2. The GW1 receiver is busy for some reason (e.g. sending a downlink to some device).
3. “Node” sends an uplink during this time, so GW2 is the only gateway to receive the message.
4. The LNS updates the “Node” gateway routing info to “GW2”, since that was the only gateway to receive the packet.
5. Routing is now sub-optimal (or potentially blocked).
6. If the downlink link budget is lower than the uplink link budget, downlinks to “Node” via GW2 can fail every time, until “Node” sends another uplink via GW1 and the LNS sets the route back to GW1.

So, is my assumption in (4) correct? If GW2 is the only gateway that received an uplink, does the LNS change the route even if the margin config setting is not met?

My first goal of this topic is just to determine if the above scenario could occur.

thanks,
Curt

brocaar · May 10, 2021, 8:20am

It depends on the device class. For Class-A, the downlink is always routed through one of the gateways which received the uplink. For Class-C, the last set of receiving gateways is stored in the device-session, so your assumption in 4) is correct.
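
To make the difference concrete, here is a minimal sketch of the two cases. `DeviceSession`, `handle_uplink` and `downlink_candidates` are hypothetical names for this illustration and do not mirror ChirpStack's internal types.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceSession:
    """Hypothetical stand-in for the per-device state kept by the server."""
    device_class: str  # "A" or "C"
    last_rx_gateways: list = field(default_factory=list)


def handle_uplink(session, receiving_gateways):
    """Record which gateways heard this uplink, replacing the previous set."""
    session.last_rx_gateways = list(receiving_gateways)


def downlink_candidates(session, current_uplink_gateways=None):
    """Return the gateways a downlink could be routed through."""
    if session.device_class == "A":
        # Class-A: the downlink answers an uplink, so only the gateways
        # that received that uplink are candidates.
        return list(current_uplink_gateways or [])
    # Class-C: no uplink triggers the downlink, so the stored set is used.
    return list(session.last_rx_gateways)
```

With a Class-C session, one uplink heard only by GW2 replaces a stored set of GW1 and GW2 with GW2 alone, which is the situation described in step 4.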

fmgst · August 12, 2021, 10:59pm

In case multiple gateways match, the Network Server will select a random gateway from the match.
In case none of the gateways have the desired margin or the uplink modulation was not LoRa, then the gateway with the best SNR (or RSSI in case of FSK) will be selected when sending a downlink

I’m using Class-A, and most of my sensors are received by at least four gateways, with significant differences in RSSI (from -115 to -40) depending on where the sensor is located relative to the GWs.

I configured multiple GWs to try to (later) optimize signal strength (and thus battery life), but it seems that LoRaWAN doesn’t work that way? That is, do I have any control over which GW a sensor is assigned to? If yes, how can I exercise this control?

Curt_Black · August 13, 2021, 8:53am

This idea comes down to something like ‘static routes’, as we would say in IP land. It's a good idea for an LNS, and I would propose such an option in ChirpStack. However, the default behavior is very hard to argue with, unless you are in a certain situation.

With multiple gateways that have “very good coverage” of your deployment, dynamic routing is actually a good thing. The issue arises in the situation above: the gateway that is supposed to serve your device (GW X) happens to be busy for a moment (most likely because it is sending a downlink), and the gateway with horrible range (GW Y) happens to pick up that uplink packet. Now GW Y, 78 miles away, has “somehow through the grace of God” received that one packet because GW X was busy, and the LNS updates the routing info for device A to say “oh yes, GW Y is now the downlink path”.

Please note: this is only applicable to Class-C devices (for the most part).

Since you said you are using Class-A, I am not sure what the issue is. Can you elaborate on “I configured multiple GWs to try to (later) optimize signal strength (and thus battery life), but it seems that LoRaWAN doesn’t work that way?” I think I know what you are asking for: “static routes”?

roanwifi · September 28, 2023, 8:26pm

Hi,
I have a setup of ~100 Class-C nodes and a gateway. Some of the nodes have very low signal and SNR, so I installed a second, better-located gateway (but I kept the original sub-optimal gateway).
If I stop the sub-optimal gateway and send a downlink from ChirpStack to any device, it stays in the downlink queue and is not sent to the node.
What do I have to keep in mind in this scenario?
Thanks
Antonio

brocaar · October 9, 2023, 2:52pm

Class-C devices need to periodically send an uplink to update the device-to-gateway association. ChirpStack is probably not aware that your device is now under the coverage of the new gateway, so it keeps sending data to the old gateway, which has now been turned off.
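
As a rough way to spot affected devices, one could compare each device's stored gateway set against the gateways that are currently online. The sketch below assumes you can export both as plain Python data; the function and its inputs are hypothetical and not a ChirpStack API.

```python
def devices_with_unreachable_route(last_rx_gateways_by_device, online_gateways):
    """Return sorted device IDs whose stored gateways are all offline.

    last_rx_gateways_by_device: dict of device ID -> list of gateway IDs
        that received the device's most recent uplink (assumed export).
    online_gateways: iterable of gateway IDs currently connected.
    Devices with an empty stored set are reported as well.
    """
    online = set(online_gateways)
    return sorted(
        dev
        for dev, gateways in last_rx_gateways_by_device.items()
        if not online.intersection(gateways)
    )
```

Devices reported here would need to send a new uplink that the remaining gateway receives before Class-C downlinks can reach them.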

datnus · October 10, 2023, 1:32pm

If the node sends an uplink via the optimal gateway, will a confirmed downlink be sent to the device via the optimal gateway?

I believe it should work.

Alberto_Alexandre · February 15, 2024, 8:03pm

Hello, I’m working with LoRaWAN Class-C devices and I’m seeing unexpected behavior. We have tens of thousands of devices on a network with about ten gateways.

However, the downlink failure rate seen by the end application is very high.

On closer investigation, many devices are left with long downlink queues that are never cleared.

Even after manually clearing the queue for a particular device and requesting new downlinks, they still get stuck in the queue.

Does anyone know what could be happening?

ChirpStack version 4.04

bconway · February 16, 2024, 1:41am

Is that a typo or correct? I cannot imagine you will have the downlink capacity to support that many devices on 10 gateways.
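
For a back-of-the-envelope check, downlink capacity can be estimated from packet airtime. The sketch below uses the standard LoRa time-on-air formula; the capacity function is a deliberately optimistic upper bound, and every input you feed it (payload size, spreading factor, duty-cycle limit) is an assumption that depends on your region and devices.

```python
import math


def lora_time_on_air_s(payload_bytes, sf, bw_hz=125_000, coding_rate=1,
                       preamble_symbols=8, explicit_header=True, crc=True):
    """LoRa packet time on air in seconds (standard Semtech formula).

    coding_rate is 1..4, meaning 4/5..4/8.
    """
    t_sym = (2 ** sf) / bw_hz
    # Low data rate optimisation is normally enabled for SF11/SF12 at 125 kHz.
    de = 1 if (bw_hz == 125_000 and sf >= 11) else 0
    h = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + (16 if crc else 0) - 20 * h
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (coding_rate + 4), 0
    )
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


def max_downlinks_per_day(gateways, airtime_s, duty_cycle):
    """Upper bound on downlinks per day, assuming each gateway sends one
    packet at a time and may be on air for `duty_cycle` of the day."""
    return int(gateways * duty_cycle * 86_400 / airtime_s)
```

The real figure will be lower, since this ignores retries, MAC commands, uplinks missed while a gateway is transmitting, and uneven distribution of devices across gateways.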

Alberto_Alexandre · February 27, 2024, 4:16pm

I am currently managing a LoRaWAN network deployment within a city, utilizing 10 RAK7289CV2 gateways with 16 channels each. The network supports approximately 60,000 Class C devices.

While downlink messages are successfully queued for these devices, the current configuration transmits downlinks to only a limited portion (approximately 10%) of the devices daily, with a frequency of one or two messages per device.

I am inquiring about the following:

Expected behavior: Is this downlink transmission strategy (limited percentage and frequency) typical for a network of this scale and device count, or might there be potential areas for optimization?
Troubleshooting guidance: Since the ChirpStack logs do not indicate any errors, could you advise on appropriate troubleshooting methods to investigate the cause of the static downlink queue and explore potential improvements in transmission efficiency?

I appreciate any insights or suggestions you may offer.

Alberto_Alexandre · March 1, 2024, 12:11pm

Hello, @brocaar

Could you or someone else assist in understanding what might be happening with the behavior of downlink packets remaining indefinitely stuck in the queue? Alternatively, could you point me to where I can seek assistance in the documentation?

Thank you for your attention to this matter.

brocaar · March 12, 2024, 1:48pm

You could look into the MQTT message exchange between ChirpStack and the GWs. The expected behavior is:

.../gateway/.../command/down → sending the downlink to the gateway
.../gateway.../event/ack → confirmation from the gateway that the downlink was enqueued (or error message)

You also might want to look into the gateway logs for errors.
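
If you capture that MQTT traffic, pairing each downlink command with its ack could look something like the sketch below. It assumes the messages are already decoded to dicts and that commands and acks share an ID field; the default name `downlinkId` is an assumption to verify against your own payloads.

```python
def unacknowledged_downlinks(messages, id_field="downlinkId"):
    """Return IDs of downlink commands that never got a matching ack.

    messages: iterable of (topic, payload_dict) in arrival order.
    id_field: name of the field linking a command to its ack (assumed).
    """
    pending = {}
    for topic, payload in messages:
        downlink_id = payload.get(id_field)
        if downlink_id is None:
            continue
        if topic.endswith("/command/down"):
            # Downlink handed to a gateway; wait for its ack.
            pending[downlink_id] = topic
        elif topic.endswith("/event/ack"):
            # Gateway answered (success or error); no longer pending.
            pending.pop(downlink_id, None)
    return list(pending)
```

If no command/down messages show up at all for a device with a non-empty queue, the downlink is not leaving ChirpStack, and the gateway side is probably not where the problem is.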