Hi, I’m experiencing a strange problem with my application. I’m not sure this post is correctly classified.
I’m using Chirpstack v3 and normally everything works correctly. I have several Mikrotik LR8 gateways that connect via VPN to the server where the gateway-bridge, network-server, app-server and database (Postgres) are installed, so everything is in the same local network.
Sometimes a LoRa gateway has an outage (e.g. a power outage at its location). The devices (class C) can still uplink to the LoRa gateway, but that data never arrives at Chirpstack. Downlinks also get stuck in the app server downlink queue.
When the connection from the LoRa gateway to Chirpstack returns, some devices are unable to resume sending uplinks (and downlinks fail with an out-of-sync complaint). The only solution is to force a join locally on the device (or wait 24 hours until the device reboots and joins).
I’m trying to figure out where the problem is, and wondering whether it is related to Chirpstack internals, for example a timer related to the LoRa gateway or to the devices…
Any hint would be greatly appreciated, because I’m stuck and considering reconfiguring the devices to use ABP instead of OTAA.
Thanks
Antonio
Hi Antonio,
The issue you’re experiencing with your Chirpstack application could be related to several factors. Let’s explore some potential causes and solutions:
- Communication Loss During Outage: When there is a power outage or connectivity loss at the location of the LoRa gateway, the uplink data from devices may not reach Chirpstack. In such cases, the data is lost, and downlinks may get stuck in the app server’s queue. To mitigate this, you can consider implementing a local buffering mechanism on the gateway itself, such as using a small storage device or an MQTT broker. This way, when the connection is restored, the buffered data can be sent to Chirpstack.
- LoRa Gateway Sync Issues: After a connection is restored, you mentioned that some devices are unable to continue communicating uplinks, and downlinks complain about being out of sync. This could be due to synchronization issues between the LoRa gateway and Chirpstack. Make sure that the LoRa gateway’s clock is synchronized with an accurate time source, such as NTP (Network Time Protocol). Additionally, ensure that the system clocks of Chirpstack’s components (gateway-bridge, network-server, app-server) are synchronized as well.
- Device Joining and Reboot: It’s worth noting that after a power outage or an extended period of disconnection, devices may need to rejoin the network. This is normal behavior for LoRaWAN devices, especially if they are configured for over-the-air activation (OTAA). You mentioned that forcing a join locally or waiting for a device reboot resolves the issue. This indicates that the devices may need to perform a fresh join procedure to reestablish communication with Chirpstack.
- Review Chirpstack Logs: Check the logs of Chirpstack’s components (gateway-bridge, network-server, app-server) for any relevant error messages or warnings during the outage and recovery period. This can provide valuable insights into any internal issues or misconfigurations.
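As a rough illustration of the local buffering idea from the first point above, here is a minimal store-and-forward sketch in Python. It is only a sketch under stated assumptions: `StoreAndForwardBuffer` and the `send` callback are hypothetical names, not part of Chirpstack or of the gateway firmware, and whether your gateway can run something like this at all depends on the hardware.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardBuffer:
    """Hypothetical helper: keeps payloads in memory while the link is down."""

    def __init__(self, send: Callable[[bytes], bool], max_items: int = 1000) -> None:
        # `send` is an assumed callback that returns True when delivery succeeded.
        self._send = send
        # A bounded deque silently drops the oldest item once max_items is reached.
        self._queue: Deque[bytes] = deque(maxlen=max_items)

    def submit(self, payload: bytes) -> None:
        """Queue a payload and try to deliver everything queued so far."""
        self._queue.append(payload)
        self.flush()

    def flush(self) -> int:
        """Send queued payloads in order; stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # link still down, keep the remaining items for later
            self._queue.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of payloads still waiting for delivery."""
        return len(self._queue)
```

Calling `flush()` again once the VPN link is back would drain the queue in arrival order. The buffer lives in memory only, so a power loss on the gateway itself would still lose its contents.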
Reconfiguring the devices from OTAA to ABP is an option, but it’s important to evaluate the trade-offs. ABP eliminates the need for device joining but introduces other challenges, such as potential security vulnerabilities and limited scalability. If outages are infrequent or manageable, it may be worth optimizing the OTAA process and addressing the underlying issues instead of switching to ABP.
I hope these suggestions help you identify and resolve the issue. Good luck with your Chirpstack deployment, and feel free to ask if you have any further questions.
Hi Nicolas. Thank you for your hints; here are my comments re your points:
- Data loss during a communication outage (both downlink and uplink) is not an issue, as the data is not time-critical.
- Clocks are correctly synced via NTP on both the gateways and the server where Chirpstack is running.
- This point is not clear to me. If the device has joined, then the DevAddr, AppSKey and NwkSKey are present in both Chirpstack and the device itself (as it has not been disconnected or rebooted). Why, when communication returns (after a few hours), can’t it communicate with Chirpstack, and why can’t downlinks reach the device (out-of-sync error)? This seems to be something related to Chirpstack?
- I have checked the logs and I see that the Chirpstack gateway-bridge unsubscribes from the MQTT topic when the gateway is not reachable and subscribes again when it is reachable:
"2023-05-18T15:00:07.000833318+02:00" level=info msg="integration/mqtt: unsubscribing from topic" topic="gateway/3133303734001e00/command/#"
"2023-05-18T15:00:07.011579177+02:00" level=info msg="integration/mqtt: publishing state" gateway_id=3133303734001e00 qos=0 state=conn topic=gateway/3133303734001e00/state/conn
I have looked for relevant logs in the network server and application server but have not found any (or maybe I’m not looking for the correct ones).
Thanks for your help… Greatly appreciated. I’m still investigating.
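To make it easier to line up gateway-bridge events with network-server and application-server logs, something like the following Python sketch could pull out every log line that mentions one gateway. It assumes the `key=value` layout of the two lines quoted above (a bare leading timestamp, quoted values, and the gateway ID appearing either as `gateway_id=` or inside `topic=`); `parse_log_line` and `gateway_events` are names made up for this example, and other Chirpstack components may log in a different shape.

```python
import shlex
from typing import Dict, Iterable, List, Tuple


def parse_log_line(line: str) -> Dict[str, str]:
    """Split one key=value log line into a dict (assumed format, see above)."""
    # Forum software often turns straight quotes into curly ones; undo that.
    line = line.replace("\u201c", '"').replace("\u201d", '"')
    fields: Dict[str, str] = {}
    # shlex keeps quoted values together; it raises ValueError on unbalanced quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
        else:
            # A bare token with no "=" is treated as the timestamp.
            fields.setdefault("time", token)
    return fields


def gateway_events(lines: Iterable[str], gateway_id: str) -> List[Tuple[str, str]]:
    """Return (time, msg) for each line that refers to the given gateway."""
    events: List[Tuple[str, str]] = []
    for line in lines:
        fields = parse_log_line(line)
        in_topic = f"gateway/{gateway_id}/" in fields.get("topic", "")
        if fields.get("gateway_id") == gateway_id or in_topic:
            events.append((fields.get("time", ""), fields.get("msg", "")))
    return events
```

Feeding it the gateway-bridge log and the ID `3133303734001e00` would give a list of timestamps for the unsubscribe and state events, which can then be used as the time window to search in the other components’ logs.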
Regards
Antonio