Missing mqtt publishes for txacks

Environment

  • NS 3.13
  • AS 3.15
  • Localhost mosquitto broker
  • Class-C scheduler: 500ms

Problem

Missing/not processed txacks. I’ve been running a basic test that queues up 10 downlink messages back to back, each going to a different device. I can consistently reproduce this issue with class-C scheduler set to 500ms. If I set it back to the default 1s, the problem seems to go away, although I have seen it miss an ack once in a while even on 1s, just very rare. I hope this timing difference is a clue to what might be going on.

Questions to help debug

  1. I recall a problem used to exist where you could not publish mqtt “tx” downlinks too quickly to the application-server, perhaps this is a similar problem in network-server?
  2. I can build from docker and insert some debug messages where applicable, since I have an easily reproduceable setup, if that is useful. I’d love to help debug if anyone has ideas where to zoom in.
  3. What if the ACK is received after a subsequent queue-item is sent down? I can’t see anyway this would happen, since the broker is localhost to the LNS, and I did have a localhost mosquitto_sub on the server see the missing ack published less than 70ms after the downlink publish, just like the successful ones.

Reference Logs

LNS

LNS log filtered down to command/down and ack lines. Notice line 3 is the down with missed ack (downlink id c4ffe084-a207-4dd1-b867-32dbab8e07b8)

Jul 01 22:34:06 ip-172-31-54-5 chirpstack-network-server[29069]: time="2021-07-01T22:34:06.894145273Z" level=info msg="gateway/mqtt: publishing gateway command" command=down downlink_id=22a91bbd-90d9-40b1-9067-c48589c53b4c gateway_id=00800000a0005a49 qos=1 topic=gateway/00800000a0005a49/command/down
Jul 01 22:34:06 ip-172-31-54-5 chirpstack-network-server[29069]: time="2021-07-01T22:34:06.9903686Z" level=info msg="backend/gateway: downlink tx acknowledgement received" downlink_id=22a91bbd-90d9-40b1-9067-c48589c53b4c gateway_id=00800000a0005a49
Jul 01 22:34:07 ip-172-31-54-5 chirpstack-network-server[29069]: time="2021-07-01T22:34:07.402420019Z" level=info msg="gateway/mqtt: publishing gateway command" command=down downlink_id=c4ffe084-a207-4dd1-b867-32dbab8e07b8 gateway_id=00800000a0005a49 qos=1 topic=gateway/00800000a0005a49/command/down
Jul 01 22:34:07 ip-172-31-54-5 chirpstack-network-server[29069]: time="2021-07-01T22:34:07.91561108Z" level=info msg="gateway/mqtt: publishing gateway command" command=down downlink_id=d485a986-e10b-4142-9966-80d653f69030 gateway_id=00800000a0005a49 qos=1 topic=gateway/00800000a0005a49/command/down
Jul 01 22:34:07 ip-172-31-54-5 chirpstack-network-server[29069]: time="2021-07-01T22:34:07.988006402Z" level=info msg="backend/gateway: downlink tx acknowledgement received" downlink_id=d485a986-e10b-4142-9966-80d653f69030 gateway_id=00800000a0005a49
...

For reference, making sure LNS does use qos1:

Jul 01 18:40:11 ip-172-31-54-5 chirpstack-network-server[9032]: time="2021-07-01T18:40:11.682685132Z" level=info msg="backend/gateway: connected to mqtt server"
Jul 01 18:40:11 ip-172-31-54-5 chirpstack-network-server[9032]: time="2021-07-01T18:40:11.682714733Z" level=info msg="gateway/mqtt: subscribing to gateway event topic" qos=1 topic=gateway/+/event/+

GW Bridge

gw-bridge sending ack of the missing downlink id c4ffe084-a207-4dd1-b867-32dbab8e07b8 with qos1:

2021-07-01T22:34:07.331757+00:00 mtcdt chirpstack-gateway-bridge[18039]: time="2021-07-01T22:34:07.331015502Z" level=info msg="integration/mqtt: downlink frame received" downlink_id=c4ffe084-a207-4dd1-b867-32dbab8e07b8 gateway_id=00800000a0005a49
2021-07-01T22:34:07.342743+00:00 mtcdt chirpstack-gateway-bridge[18039]: time="2021-07-01T22:34:07.341733302Z" level=info msg="integration/mqtt: publishing event" downlink_id=c4ffe084-a207-4dd1-b867-32dbab8e07b8 event=ack qos=1 topic=gateway/00800000a0005a49/event/ack

Mosquitto

and finally, a 3rd party mosquitto subscriber showing downlink id c4ffe084-a207-4dd1-b867-32dbab8e07b8 (xP/ghKIHTdG4ZzLbq44HuA==) did in fact get published by the gw-bridge:

Client mosqsub/19284-thedude received PUBLISH (d0, q0, r0, m0, 'gateway/00800000a0005a49/command/down', ... (442 bytes))
 {"phyPayload":null,"txInfo":null,"token":50431,"downlinkID":"xP/ghKIHTdG4ZzLbq44HuA==","items":[{"phyPayload":"YOLiZwEAAwDHBmiv7bSk","txInfo":{"gatewayID":null,"frequency":92330000                                                                                    0,"power":20,"modulation":"LORA","loRaModulationInfo":{"bandwidth":500,"spreadingFactor":10,"codeRate":"4/5","polarizationInversion":true},"board":0,"antenna":0,"timing":"IMMEDIATE                                                                                    LY","immediatelyTimingInfo":{},"context":"Golxow=="}}],"gatewayID":"AIAAAKAAWkk="}
Client mosqsub/19284-thedude received PUBLISH (d0, q0, r0, m0, 'gateway/00800000a0005a49/event/ack', ... (119 bytes)) 
 {"gatewayID":"AIAAAKAAWkk=","token":50431,"error":"","downlinkID":"xP/ghKIHTdG4ZzLbq44HuA==","items":[{"status":"OK"}]}