How do I get gateways/devices to automatically reconnect after an outage?

I have a ChirpStack instance that has been running for a while now, with multiple gateways and applications/devices connected. Everything was working fine. At some point, my ChirpStack instance went down. I brought it back up, but my gateways did not automatically reconnect. I had to power cycle my gateways before they reconnected.

Is there a configuration such that the gateways would automatically reconnect without intervention?

After the gateways reconnected, I had to force my devices to re-join. I see the LoRaWAN frames from the device being received by the gateway, but they don’t get delivered to the application.

Is there a configuration such that the devices would automatically reconnect without forcing a re-join?

I’m hoping there is some way to handle a ChirpStack outage such that the gateways and devices recover gracefully, without intervention. I was not paying attention to how long ChirpStack was down. If it had been brought back up more quickly, would everything have behaved better?

Thanks! Kevin

If the connection is lost, just restart the packet forwarder on the gateway side.
If the end nodes have already joined and been activated against the current ChirpStack installation, there is no need to rejoin them, because the join status is stored inside the end node.
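That restart can be automated with a small watchdog on the gateway. The sketch below is only an illustration of the idea: the server address, the init-script name (`lora_pkt_fwd`), and running it from cron are all assumptions that vary by gateway firmware (Dragino gateways, for example, are OpenWrt-based, so the forwarder is usually an init script).

```shell
#!/bin/sh
# Hypothetical gateway-side watchdog: if the ChirpStack host stops answering
# pings, restart the packet-forwarder service. Service name and address are
# assumptions -- adjust them for your gateway firmware.
SERVER="${SERVER:-192.0.2.10}"                          # ChirpStack host (example address)
FWD_RESTART="${FWD_RESTART:-/etc/init.d/lora_pkt_fwd restart}"

server_reachable() {
    # One ping, 5-second timeout; success means the server is back.
    ping -c 1 -W 5 "$SERVER" >/dev/null 2>&1
}

maybe_restart() {
    if server_reachable; then
        echo "server reachable, nothing to do"
    else
        echo "server unreachable, restarting packet forwarder"
        $FWD_RESTART
    fi
}
```

Run `maybe_restart` from cron every few minutes (e.g. `*/5 * * * * /root/fwd-watchdog.sh`) so the gateway recovers on its own after an outage.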

I have a Dragino DLOS8 and a Dragino LIG16 gateway. Is this a feature of these gateways that they don’t automatically reconnect? Does anyone know of a configuration setting on the gateways that allows them to reconnect automatically?

My devices were already joined to the network. I see the device packets being received at the gateway, but they don’t get forwarded to the application server. The devices should just work because they have already joined. Is there an easy way to debug this scenario? I would like to find out why the packets get dropped.
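One way to narrow this down is to watch the MQTT broker directly and compare what the gateways deliver with what actually reaches the applications. The topics below assume the ChirpStack v3 MQTT topic scheme and a broker on localhost; adjust host, credentials, and topics for your setup.

```shell
# Raw uplink frames forwarded by any gateway:
mosquitto_sub -h localhost -v -t 'gateway/+/event/up'

# Decoded uplinks that made it through the network server to an application:
mosquitto_sub -h localhost -v -t 'application/+/device/+/event/up'
```

If frames appear on the gateway topic but never on the application topic, the network server is dropping them; check its logs next, since a device session lost from Redis typically shows up there as session- or frame-counter-related errors.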

You need to set up your ChirpStack installation to persist the joined-node session details that are kept in Redis; I don’t recall the details, but it should be in the instructions somewhere.

Gateways should automatically reconnect, but first make sure the persistence issue is fixed in your setup.
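For the persistence part, Redis itself has to be configured to write its dataset to disk; some setups (the official Docker image in particular) run without persistence by default. A minimal sketch of the relevant redis.conf directives:

```conf
# redis.conf -- keep device/gateway session state across restarts.

# Option 1: periodic RDB snapshots (save <seconds> <changes>):
save 900 1
save 300 10

# Option 2 (more durable): append-only file, fsynced every second:
appendonly yes
appendfsync everysec
```

With Docker, the equivalent is starting Redis as `redis-server --appendonly yes` and mounting a volume over `/data` so the file survives container recreation.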

What protocol scheme are you using for connection between the gateways and chirpstack? Are you using the gateway bridge? If so where is that running?

Very very old versions of the gateway bridge from a few years back had problems with reconnecting to the MQTT broker but I believe that’s been fixed for quite a while.


I thought Redis was a caching solution for performance. Shouldn’t persistent data be stored in the database instead of Redis? It feels wrong to require Redis to persist data.

I’m using the gateway bridge that comes by default with the network server.

The way ChirpStack is architected, OTAA session details live in Redis, so you need to configure Redis for persistence.

There’s ultimately not all that much that a LoRaWAN stack keeps track of that’s freely “disposable” - about the only thing it wouldn’t be important to still know on a restart of the various processes/containers would be the downlink opportunities following from particular uplinks, as the time opportunity for those would probably have passed. But other details of the fact that there was an uplink such as the uplink frame count and various state tracking would have to be persisted. So a lot of what looks like “temporary” storage is actually not safe to forget.

I really appreciate the responses.

This seems like a shortcoming in the current architecture. Is there a reason why it isn’t persisted in the database? Couldn’t it be persisted in the database but read through the Redis cache? I.e., on Redis start-up, it would fetch the appropriate data from the database?

I’m a LoRaWAN newbie, and a Go newbie, but I could try looking at the source code to see how hard it might be to update this. Or would this require a lot of in-depth knowledge before attempting such an update? Thanks!

The point is that, with fewer exceptions than one might imagine, almost everything that looks temporary, like an incremented frame count or a small change in awareness of node state, actually has to be persisted if the stack is going to comply with the LoRaWAN spec.

Each change of state would have to be committed back too, so in essence you’d be trying to do persistence-mode Redis’s job better than the Redis authors did, which breaks the basic rule of “don’t try to be a better database than the database”: when configured for persistence, Redis already is a write-through memory cache backed by persistent storage on disk.
