Question about clustering the Gateway Bridge with multiple MQTT Brokers

I am building a cluster setup and plan to use a network load balancer that routes UDP and TCP to container stacks. Each container stack contains:

  • Gateway Bridge
  • Network Server
  • Application Server
  • MQTT Broker

The stateful components, Redis and the database, have their own clusters, so every container stack communicates with the same stateful components. Now I am wondering if I have to pull the MQTT broker out of the stack as well. The documentation states:

For performance and to make the ChirpStack Gateway Bridge highly available, you can run ChirpStack Gateway Bridge on multiple servers, each connecting to the same MQTT broker.

Is there any drawback in each container stack having its own MQTT broker? The only issue I can think of is that I might not see frames in the UI from a different container, but that would not be very problematic in my case. Any thoughts are welcome.

Best,
Thomas

The “trivial” drawback:

  • Operational complexity and, depending on the broker, licensing costs.

Depending on the broker you are using, you might be better off with another type of isolation.
VerneMQ supports “Mountpoints” (namespaces): with them you can have a single entry point and, depending on an attribute (client_id, a regex, etc.), route clients to a different mountpoint.
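As a rough illustration, per-listener mountpoints in vernemq.conf look something like the sketch below (listener names, ports, and mountpoint names are made up; check the VerneMQ docs linked later in this thread for the exact option names):

```
# vernemq.conf sketch: two TCP listeners, each mapping its clients
# onto a separate mountpoint (namespace)
listener.tcp.stack_a = 0.0.0.0:1884
listener.tcp.stack_a.mountpoint = stack-a

listener.tcp.stack_b = 0.0.0.0:1885
listener.tcp.stack_b.mountpoint = stack-b
```

Clients connecting on port 1884 then only see topics inside the stack-a namespace, even though a single broker (or cluster) serves both.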

Adding a diagram of your planned architecture might help clear up further questions.

From what I understand, the MQTT broker is only needed for the communication between the Gateway Bridge and the Network Server and not much else. That is why I thought having it live within each stack shouldn’t be a problem. See the following image for more details.

Any idea? (bump)
Is MQTT communication stateful, or must every Gateway Bridge / Network Server see all messages? Is deduplication perhaps affected?

If a device should be part of N container stacks, it will not work: a device can only be part of one network.
If you want to connect a gateway to N networks, you’ll need to forward its traffic to all the MQTT instances and register the gateway with each network. I don’t really get what you are trying to accomplish. Is it an HA setup, or is each container stack an independent network behind a common load balancer?
If you are aiming for HA, you should consider each component as a unit: you’d have an AS cluster, an NS cluster, and so on. If your container stacks represent failure zones, then you will still need to “cluster” the components.
De-dup is handled with a “shared” Redis.
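To see why the Redis has to be shared, here is a toy sketch in plain Python (not ChirpStack’s actual implementation; the key scheme of device address plus frame counter is an assumption for illustration). Two NS instances receiving the same uplink via two gateways consult the same store, so only the first copy is processed:

```python
# Toy model of LoRaWAN uplink deduplication against a shared store.
# "shared_store" stands in for the Redis cluster that all NS
# instances talk to; if each stack had its own store, every stack
# would process the frame as if it were new.

def handle_uplink(shared_store, dev_addr, f_cnt, payload):
    key = (dev_addr, f_cnt)
    if key in shared_store:
        shared_store[key].append(payload)   # duplicate via another gateway
        return "deduplicated"
    shared_store[key] = [payload]           # first copy wins
    return "processed"

# Two NS instances see the same frame through two gateways:
store = {}
first = handle_uplink(store, "26011F00", 42, b"gw-1 copy")
second = handle_uplink(store, "26011F00", 42, b"gw-2 copy")
print(first, second)  # processed deduplicated
```

With one store per stack, both calls would return "processed" and the device’s uplink would be handled twice.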


Thank you for your reply, but I don’t think you fully understand my approach. It is indeed an HA setup that works well for all the components involved (except MQTT, I think). You don’t need to build a separate cluster for each component, as they are stateless. As long as the Redis cluster and Postgres are always the same, it doesn’t matter where the NS and AS are located.

The only thing I’m not sure about is whether the MQTT broker is stateful or stateless as well.

Best,
Thomas

That might be. I think we are talking about the same thing, but I guess you are drawing your boxes according to your infrastructure, or the way you are deploying the components. In the end you need:

  • 1 Network-Server Service (with many NS-Containers)
  • 1 Application-Server Service (with many AS-Containers)
  • 1 Gateway Service (with many GB-Containers)

That’s what I mean by clustering. MQTT has some stateful properties, which is why I’d suggest picking an MQTT server that supports clustering and shared topics, such as HiveMQ or VerneMQ. Again, this does not mean all the containers “live” in one container stack, but in the case of MQTT, the brokers should be aware of each other. I don’t know whether Mosquitto supports clustering, but you could “fake it” by bridging every MQTT server with every other.
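If you go the Mosquitto route, the bridging looks roughly like the sketch below (the remote hostname is made up, and the topic patterns are assumptions based on ChirpStack’s default gateway/application topic layout; adjust to your prefixes):

```
# mosquitto.conf sketch on broker A: mirror ChirpStack topics to broker B
connection bridge-to-b
address broker-b.internal:1883
topic gateway/# both 0
topic application/# both 0
```

Each broker needs a matching bridge section per peer. Be careful with a full mesh of three or more brokers: bidirectional bridges can loop messages unless you restrict topic directions.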

How you distribute the ChirpStack components among your (virtual) machines is more or less irrelevant. I’m guessing you are deploying them with docker-compose on N machines, and that’s why they are grouped together?

If you are using Kubernetes for your deployment, maybe having a look at the Helm-Chart I maintain can help you with your setup. https://gitlab.com/wobcom/iot/chirpstack-helm/-/tree/master

We have 3 NS pods and 3 AS pods, and the GBs are all deployed at the gateways, which is why the chart does not really deploy the GB.
Some screenshots from the dashboard: https://grafana.com/grafana/dashboards/13303



Have you tried a bridge between mqtt brokers?

I actually have ChirpStack using an external broker (outside the stack); this is then used for multiple other MQTT-based applications as well…

No, I have not. We have VerneMQ deployed and use a combination of https://docs.vernemq.com/configuration/listeners#mountpoints and https://docs.vernemq.com/plugindevelopment/webhookplugins#auth_on_register for isolation.

OK, so I don’t know that platform. I use Mosquitto with the configuration outlined here: http://www.steves-internet-guide.com/mosquitto-bridge-configuration/

This may or may not meet your requirements, but I thought it might help… essentially it simply bridges the brokers via MQTT.

Otherwise, it could also work if you simply connect all your instances to the same broker?

I got my use case fully covered. I think @wheeler might use your advice 🙂

Thank you @chopmann & @jjanderson for your responses.

I am building the setup in AWS with an ECS cluster, which is similar to docker-compose. So yes, it is a setup with N machines, and they are grouped in the graphic like that because they “live” on the same machine.
You mentioned that MQTT has some stateful properties. Do you know more about this?
Also, thank you for the idea with the bridge. I am not sure it is possible with my setup, but I will definitely have a look at it, because it would make my setup much simpler.

https://www.hivemq.com/blog/mqtt-essentials-part-7-persistent-session-queuing-messages/

And the usual long-lived TCP concerns.
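The persistent-session state the linked article describes (subscriptions and queued QoS ≥ 1 messages surviving a disconnect) is exactly what makes a broker stateful. Here is a toy Python model of that behavior, not any real broker’s API:

```python
from collections import defaultdict, deque

class ToyBroker:
    """Toy model of MQTT persistent sessions (clean_session=False).

    The broker keeps per-client state: subscriptions survive a
    disconnect, and messages published in the meantime are queued
    for delivery on reconnect. This state lives on one broker node,
    which is what naive multi-broker setups lose."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # topic -> client ids
        self.online = set()
        self.pending = defaultdict(deque)      # client id -> queued messages

    def connect(self, client_id):
        self.online.add(client_id)
        queued = list(self.pending[client_id])  # deliver queued messages
        self.pending[client_id].clear()
        return queued

    def disconnect(self, client_id):
        self.online.discard(client_id)          # session state is kept

    def subscribe(self, client_id, topic):
        self.subscriptions[topic].add(client_id)

    def publish(self, topic, message):
        delivered = []
        for cid in self.subscriptions[topic]:
            if cid in self.online:
                delivered.append(cid)           # delivered immediately
            else:
                self.pending[cid].append(message)  # queued in the session
        return delivered
```

A Network Server that reconnects to a *different* broker in another stack would find none of this session state there, which is why the brokers either need to cluster or the clients must always hit the same node.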

By the way, load-balancing gRPC (HTTP/2) like plain HTTP won’t work as expected; gRPC also uses long-lived TCP connections. So you might end up in a situation where all NS instances connect to one AS, even if you have multiple instances. This depends on the boot order: how many AS instances are ready before the NS instances come up.

I don’t remember exactly how ECS works, or whether you can avoid that by having each NS connect to its “local” AS.
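The boot-order pitfall can be sketched abstractly (plain Python, not real gRPC: an L4 load balancer assigns each new long-lived connection to a backend that is ready at connect time, and the connection then sticks):

```python
import itertools

def l4_balancer(ready_backends):
    """Round-robins *new* connections over the backends that are
    ready at connect time; an established connection never moves."""
    rr = itertools.cycle(ready_backends)
    return lambda: next(rr)

# Boot order: three NS instances connect while only as-1 is ready.
connect = l4_balancer(["as-1"])
ns_to_as = {f"ns-{i}": connect() for i in range(1, 4)}

# All three NS instances are now pinned to as-1; as-2 and as-3
# coming up later does not rebalance the existing HTTP/2 connections.
print(ns_to_as)  # {'ns-1': 'as-1', 'ns-2': 'as-1', 'ns-3': 'as-1'}
```

Real deployments work around this with client-side load balancing, an L7 (HTTP/2-aware) proxy, or connection max-age settings; which of these ECS offers out of the box I can’t say.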

Thank you @chopmann for the response. In my setup the groups don’t know about and cannot communicate with each other, so each group has only internal communication. That means the NS always communicates with the AS in the same group, and likewise the GB will always communicate with the same MQTT broker and therefore the same NS.
I am just wondering whether the GB → MQTT → NS communication really has to be known to the other groups. Keep in mind that I have the GB in my stack, meaning that UDP is used to reach the stack from outside.