High load of gateway-bridge

Hey Orne, we are using the latest gw-bridge in our blockchain LPWAN server. We now have about 6000 gateways online and the bridge is down quite often.

We suspect there might be some sort of race condition / deadlock in the code that is easily exposed when the load is high.

Any hint?

Interesting issue. How many instances/containers are you running?

Does it have something to do with instances? It is running NS + AS + gw-bridge + blockchain service containers on Kubernetes.

You can try our app, called MXC datadash, in the Apple / Google Play stores. We have 6k GWs online with two SX1302 (essentially two packet forwarders per gateway, so 12k connections).

We aren't able to reproduce the gateway-bridge problem on our test server: running lorhammer with 30000 gateways, plus two real gateways that are regularly updated, gateway-bridge is fine.

But in our blockchain supernode 6000 gateways crashed it. Same environment.

Could you define:

  • the bridge is down quite often
  • in our blockchain supernode 6000 gateways crashed it

Does "down" mean it becomes less responsive, or does it actually crash? If it crashes, what is the error?

Hi, it means that the SX1302 packet forwarder gets no PULL_ACK back and no LoRa packets can be delivered to the server.
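
For anyone who wants to check this from outside the packet forwarder, here is a minimal sketch of a probe that sends one Semtech UDP PULL_DATA and waits for the matching PULL_ACK. It assumes the bridge listens on UDP port 1700 (the usual default, adjust if yours differs), that it acknowledges PULL_DATA from an EUI it has not seen before, and it uses a made-up gateway EUI. The function names are hypothetical and not part of gw-bridge.

```python
import os
import socket

PROTOCOL_VERSION = 2   # Semtech UDP protocol version byte
PULL_DATA = 0x02       # identifier sent by the packet forwarder
PULL_ACK = 0x04        # identifier the server is expected to answer with


def build_pull_data(gateway_eui: bytes, token: bytes) -> bytes:
    """Build a 12-byte PULL_DATA: version, 2-byte token, identifier, 8-byte EUI."""
    if len(gateway_eui) != 8 or len(token) != 2:
        raise ValueError("gateway_eui must be 8 bytes and token 2 bytes")
    return bytes([PROTOCOL_VERSION]) + token + bytes([PULL_DATA]) + gateway_eui


def is_pull_ack(packet: bytes, token: bytes) -> bool:
    """Return True if packet is a PULL_ACK that echoes the given token."""
    return (
        len(packet) >= 4
        and packet[0] == PROTOCOL_VERSION
        and packet[1:3] == token
        and packet[3] == PULL_ACK
    )


def probe(host: str, port: int = 1700,
          gateway_eui: bytes = bytes.fromhex("0102030405060708"),
          timeout: float = 2.0) -> bool:
    """Send one PULL_DATA and report whether a matching PULL_ACK came back."""
    token = os.urandom(2)  # random token, echoed back in the ACK
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_pull_data(gateway_eui, token), (host, port))
            reply, _ = sock.recvfrom(1024)
        except OSError:
            # Timeout or network error: treat as "no PULL_ACK".
            return False
    return is_pull_ack(reply, token)
```

Calling `probe("your-bridge-host")` (hostname is a placeholder) in a loop while the problem is happening would at least show whether the bridge stops answering for every sender or only for some gateways.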

By "crashed it" I mean that our other supernodes, which have around 100 gateways, are totally fine.

Weirdly, there are no errors or anything else in our logs, which is why we suspect a race condition / deadlock.

Not a single restart so far since we switched to UTC and deployed the new version of gw-bridge. The reason is still unknown, though.