MQTT client losing subscription and not re-subscribing

Hi Orne,

I am seeing an issue with the ChirpStack Gateway Bridge, and I think it may specifically be an issue with the Paho MQTT client. The client's subscription gets reset by the peer (my EMQX broker) and the Gateway Bridge never re-subscribes to the topics.

I have MaxReconnectInterval set to 10 minutes, yet the client's subscription never comes back. It went 4.5 hours without one before I rebooted it, after which it was able to reconnect.

:point_up: The max reconnect interval may not be coming into play at all, because the client is losing a subscription while the MQTT connection itself stays up: it is still able to publish, but it fails to receive any messages on the subscription.
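
To make clear what behaviour I would expect, here is a minimal Python sketch of a client that remembers its topics and re-issues the SUBSCRIBE on every (re)connect, retrying when the broker does not accept it. This is not the Gateway Bridge's or Paho's actual code: `FakeBroker`, `ResubscribingClient`, and all of their methods are hypothetical names made up for illustration, and the broker is an in-memory stand-in.

```python
"""Sketch: re-subscribe after every (re)connect, retrying failed SUBSCRIBEs.

All names here are hypothetical; this does not use or mirror the Paho API.
"""


class FakeBroker:
    """In-memory stand-in broker that fails the first `fail_first` SUBSCRIBEs."""

    def __init__(self, fail_first=0):
        self.fail_first = fail_first
        self.subscriptions = set()

    def subscribe(self, topic):
        if self.fail_first > 0:
            self.fail_first -= 1
            return False  # models a subscribe that the broker did not complete
        self.subscriptions.add(topic)
        return True

    def drop_session(self):
        # Models a connection reset where the broker forgets the subscriptions.
        self.subscriptions.clear()


class ResubscribingClient:
    """Keeps the desired topic list and re-applies it on each connect."""

    def __init__(self, broker, topics, max_attempts=3):
        self.broker = broker
        self.topics = list(topics)
        self.max_attempts = max_attempts

    def on_connect(self):
        """Call after every successful (re)connect.

        Returns the topics that could not be subscribed within max_attempts.
        """
        missing = []
        for topic in self.topics:
            # any() stops at the first attempt that succeeds.
            ok = any(self.broker.subscribe(topic) for _ in range(self.max_attempts))
            if not ok:
                missing.append(topic)
        return missing


if __name__ == "__main__":
    broker = FakeBroker(fail_first=1)
    client = ResubscribingClient(broker, ["gateway/id/command/#"])
    print("missing after first connect:", client.on_connect())
    broker.drop_session()  # simulate the reset
    print("missing after reconnect:", client.on_connect())
```

The point of the sketch is only that the subscription state lives on the client side and is re-applied after a reset, rather than assuming the broker still holds it.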

The client logs this error repeatedly (I’ve seen at most 5) and fails to re-obtain its subscriptions:

time="2020-10-07T17:14:04Z" level=error msg="mqtt: connection error" error="read tcp IP:61384->IP:1883: read: connection reset by peer"
time="2020-10-07T17:19:49Z" level=error msg="mqtt: connection error" error="read tcp IP:34636->IP:1883: read: connection reset by peer"
time="2020-10-07T17:25:32Z" level=error msg="mqtt: connection error" error="read tcp IP:15750->IP:1883: read: connection reset by peer"
time="2020-10-07T17:36:06Z" level=error msg="mqtt: connection error" error="read tcp IP:28536->IP:1883: read: connection reset by peer"
time="2020-10-07T17:43:14Z" level=error msg="mqtt: connection error" error="read tcp IP:56268->IP:1883: read: connection reset by peer"
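
For reference, the gaps between those errors can be worked out with a short Python sketch. The timestamps are copied from the log lines above; `reset_intervals` is a hypothetical helper name, not part of any tool mentioned here.

```python
from datetime import datetime

# Timestamps of the "connection reset by peer" errors, copied from the log.
TIMESTAMPS = [
    "2020-10-07T17:14:04Z",
    "2020-10-07T17:19:49Z",
    "2020-10-07T17:25:32Z",
    "2020-10-07T17:36:06Z",
    "2020-10-07T17:43:14Z",
]


def reset_intervals(timestamps):
    """Return the whole seconds elapsed between consecutive timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


if __name__ == "__main__":
    for seconds in reset_intervals(TIMESTAMPS):
        print(f"{seconds // 60}m{seconds % 60:02d}s")
```

The resets are irregular, between roughly 5 and 11 minutes apart, so they do not line up with a single fixed timer.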

I am using EMQX 4.1.0. Here is the broker's error log for a subscription crash:

2020-10-07 IP [error] <<"CLIENT_ID">>@IP:48052   crasher:
    initial call: emqx_connection:init/4
    pid: <0.3164.5290>
    registered_name: []
    exception exit: {timeout,
                        {gen_server,call,
                            [<0.2221.0>,
                             {subscribe,
                                 <<"gateway/id/command/#">>}]}}
      in function  emqx_connection:terminate/2 (/emqx_rel/_build/emqx/lib/emqx/src/emqx_connection.erl, line 424)
    ancestors: [<0.2321.0>,<0.2320.0>,esockd_sup,<0.2085.0>]
    message_queue_len: 0
    messages: []
    links: [<0.2321.0>]
    dictionary: [{send_pkt,1},
                  {acl_keys_q,
                      {[{subscribe,<<"gateway/id/command/#">>}],
                       []}},
                  {'$logger_metadata$',
                      #{clientid =>
                            <<"CLIENT_ID">>,
                        peername => "IP:48052"}},
                  {recv_pkt,2},
                  {incoming_bytes,87},
                  {acl_cache_size,1},
                  {{subscribe,<<"gateway/id/command/#">>},
                   {allow,1602067233545}},
                  {outgoing_bytes,4},
                  {guid,{1602067233426642,157718347517020,1}}]
    trap_exit: false
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 3189
  neighbours: