Subscriber for `/event/ack` disconnected

Recently we experienced some issues with some of our gateways. It seems that after some time the subscriber for the live/gateway/{GatewayID}/event/ack is disconneced and does not reconnect anymore. From the server side we can see, that we still publish to the live/gateway/{GatewayID}/event/ack topic as usual. Restarting the gateway “solves” the problem.

This are the /var/log/messages chirpstack-gateway-bridge logs around the time the subscription stopped/the subscriber disconnected:

time="2020-09-26T19:25:33Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=e989574a-1eb7-4c34-be0c-5c691aa0cb67
time="2020-09-26T19:25:36Z" level=info msg="integration/mqtt: downlink frame received" downlink_id=c0062ea5-42d5-404c-917f-e3857a94d4a1 gateway_id={GatewayID}
time="2020-09-26T19:25:38Z" level=info msg="integration/mqtt: publishing event" downlink_id=c0062ea5-42d5-404c-917f-e3857a94d4a1 event=ack qos=0 topic=live/gateway/{GatewayID}/event/ack
time="2020-09-26T19:29:51Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=062d077d-1808-4f88-a0e4-9401df0832ca
time="2020-09-26T19:29:54Z" level=info msg="integration/mqtt: downlink frame received" downlink_id=39744f92-caa3-4a23-a791-b1f32ec2837f gateway_id={GatewayID}
time="2020-09-26T19:29:56Z" level=info msg="integration/mqtt: publishing event" downlink_id=39744f92-caa3-4a23-a791-b1f32ec2837f event=ack qos=0 topic=live/gateway/{GatewayID}/event/ack
time="2020-09-26T19:32:54Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=76acd012-af2b-4860-8704-f491c1ab0fc4
time="2020-09-26T19:33:06Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=8c5fdff3-4a23-4b4a-be83-1237a1fbc14c
time="2020-09-26T19:34:52Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=964ec0c5-87f1-44a0-84e7-58744c996024
time="2020-09-26T19:35:13Z" level=error msg="mqtt: connection error" error="pingresp not received, disconnecting"
time="2020-09-26T19:37:54Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=204540c7-2719-4618-9e87-9988550b8ce9
time="2020-09-26T19:38:06Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=e66b64a1-d38a-4d7e-a8ef-ea9dd5abdef8
time="2020-09-26T19:39:32Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=462b3551-b129-4cf5-94de-510664da60be
time="2020-09-26T19:40:34Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=c15f2a19-764e-461e-9e51-0defee96f3c6
time="2020-09-26T19:42:28Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=272f0f1a-0852-45db-8cbf-6b1a8deffc91
time="2020-09-26T19:42:54Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=5567b8f7-8c28-47ae-8956-8af48fe18c45
time="2020-09-26T19:43:06Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=08591118-2175-4178-852b-671786159574

Am I interpreting the logs right, that the /event/ack subscriber couldn’t connect anymore?

I saw that there are [integration.mqtt] settings to change the max_reconnect_interval and terminate_on_connect_error. I didn’t change them and I guess the default configuration should be fine, shouldn’t it?

Any suggestions which steps can be done to ensure that the subscriber of /event/ack reconnects after a disconnect?

The ChirpStack Gateway Bridge does not subscribe to any .../event/... topics, it only publishes. It only subscribes to the .../command/... topics :slight_smile: It is the ChirpStack Network Server which subscribes to the .../event/... topics and publishes to to the .../command/... topics.

Seems like I got this with the .../event/... topic wrong. Thanks for the clarification.

I checked the logs from the Network Server. The typical order of the messages is:

time="2020-09-26T19:20:28Z" level=info msg="gateway/mqtt: uplink frame received" gateway_id={GatewayID} uplink_id=87e5eaf4-e644-46ee-89b7-38cef6f0bc2e
time="2020-09-26T19:20:30Z" level=info msg="gateway/mqtt: publishing gateway command" command=down downlink_id=a9f50816-20e5-4b92-bdea-0187ea08b0df gateway_id={GatewayID} qos=0 topic=live/gateway/{GatewayID}/command/down
time="2020-09-26T19:20:33Z" level=info msg="backend/gateway: downlink tx acknowledgement received" downlink_id=a9f50816-20e5-4b92-bdea-0187ea08b0df gateway_id={GatewayID}

For one Gateway all logs stopped. Here I assume, that the Gateway stopped consuming and publishing to MQTT. The Gateway logs from this case are in the post above.

From another Gateway the Network Server stops logging the backend/gateway: downlink tx acknowledgement received message and the other two messages are logged. I assume that the gateway stops consuming to the .../command/... topic and stops therefore publishing to the .../event/ack topic (since no .../command/down is received). The Gateway logs from this case are below.

time="2020-09-29T13:42:03Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=64f2d209-f844-45e8-9cb5-8e585af252f9
time="2020-09-29T13:42:27Z" level=info msg="integration/mqtt: downlink frame received" downlink_id=04b9fa26-cac6-4ade-a9e4-2a686f12eb98 gateway_id={GatewayID}
time="2020-09-29T13:42:27Z" level=info msg="integration/mqtt: publishing event" downlink_id=04b9fa26-cac6-4ade-a9e4-2a686f12eb98 event=ack qos=0 topic=live/gateway/{GatewayID}/event/ack
time="2020-09-29T13:44:20Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=a21a7214-e4b9-412e-91b0-2027f5acf0f3
time="2020-09-29T13:44:28Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=fb8f5827-919c-496f-af7b-826b17e58ece
time="2020-09-29T13:45:17Z" level=error msg="mqtt: connection error" error="pingresp not received, disconnecting"
time="2020-09-29T13:46:43Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=10fb749c-42f4-4d12-bc9a-d44af1e4113b
time="2020-09-29T13:46:45Z" level=info msg="integration/mqtt: connected to mqtt broker"
time="2020-09-29T13:46:45Z" level=info msg="integration/mqtt: subscribing to topic" qos=0 topic="live/gateway/{GatewayID}/command/#"
time="2020-09-29T13:47:30Z" level=error msg="mqtt: connection error" error="pingresp not received, disconnecting"
time="2020-09-29T13:49:21Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=e93704eb-202e-44e7-8fe3-f6e6e55d347d
time="2020-09-29T13:49:28Z" level=info msg="integration/mqtt: publishing event" event=up qos=0 topic=live/gateway/{GatewayID}/event/up uplink_id=232e0e71-f7ea-439b-9175-f3c1010f322e

In both cases the internet connection was shortly interrupted at the time the publisching/subscribing stopped. Is it possible that the Gateway has sometimes problems reconnecting the publisher/subscriber to MQTT?

We are seeing this same behavior as well.

What could we provide to help further debug this? I had started this discussion which is for the same issue Mqtt client loosing subscription and not re-subscribing

I also had opened an issue https://github.com/brocaar/chirpstack-gateway-bridge/issues/183. Happy to have the discussion wherever is best @brocaar

If this is a bug, then lets discuss it further on GitHub :slight_smile: