High availability setup with multiple Chirpstack installations

We want to make our Chirpstack setup a little more robust given an incident a while ago, where the server we were running Chirpstack on failed and our server provider had to roll the server back to an earlier snapshot. So we are trying to figure out how to create a High Availability solution for our Chirpstack setup.

I’ve read this post: https://forum.chirpstack.io/t/can-a-gateway-report-to-more-than-one-network-server/10888 and it seems a gateway can send data to multiple Chirpstack instances. That seems like what we want to achieve.

Edit: Here is a drawing of what I want to try:

The idea is to have two or more independent Chirpstack installations on separate servers. The gateway will send data to each server, so if one server goes down, data will still reach the others. We use Apache NiFi to fetch data from the chirpstack_as_events database at 1-minute intervals. If it can’t access the DB on server 1, it will just get the data from the DB on server 2, and so on.
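To make the fallback idea concrete, here is a minimal sketch in Python (not NiFi). The function `fetch_with_failover` and the two stand-in fetchers are hypothetical names for illustration only; in a real setup each fetcher would run a query against that server's chirpstack_as_events database.

```python
def fetch_with_failover(sources):
    """Try each (name, fetch) pair in order.

    Returns (name, rows) from the first source that answers.
    Raises RuntimeError if every source fails.
    """
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:  # any failure means: try the next server
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-ins for the per-server database queries.
def fetch_from_server_1():
    raise ConnectionError("server 1 unreachable")


def fetch_from_server_2():
    return [{"dev_eui": "0102030405060708", "f_cnt": 17}]


used, rows = fetch_with_failover([
    ("server-1", fetch_from_server_1),
    ("server-2", fetch_from_server_2),
])
print(used, rows)
```

This only covers reading the data. It says nothing about whether the two installations agree on device state, which is the open question.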

However, what about things like frame counters, device setup with AppKeys, and so on? Doesn’t there need to be a single “point of truth” that syncs the gateway with Chirpstack?

How would one go about setting up a resilient server setup that can handle one or more server failures?

I think you’re making a mistake by taking what is a distributed system and trying to replicate it on two distinct servers with a toggle between them.

Most of the services involved in a ChirpStack architecture are stateless. The ones that are not (Postgres, Redis) have fault tolerance patterns that are well-understood and should probably be the starting point of your architecture design.

I’ll see if I understand you correctly:

Should we keep our single-server setup with Gateway Bridge, Network Server and Application Server, and use Redis Sentinel and some sort of PostgreSQL HA system (maybe a Patroni/etcd setup)?

Should we spread out the Gateway Bridge, Network Server and Application Server across multiple servers, but have one instance each of Redis and PostgreSQL, in an HA setup?

What I’m really missing is an architectural diagram of what someone has done to achieve HA, but I haven’t been able to find anything.

Edit: I’ll admit that my idea might not be the most elegant solution, but the primary goal to achieve here is to get the data from the sensor to a database on a server, even if one or more Chirpstack servers go down.

Added a drawing to the OP of what I want to try:

So, is it possible to build this without running into problems with device configurations, frame counters, etc.?

I have another suggestion (see drawing below):

There are two servers, ServerA and ServerB, each with a Gateway Bridge and an MQTT broker. Both are connected to ServerC, which runs the Network and Application Servers. ServerA and ServerB are set up in a keep-alive setting, which I believe means that they share an external IP address, and they are set up to mirror each other. So if one goes down, the other simply takes over.

ServerA and ServerB each have some kind of MQTT subscriber that saves all incoming data to a local database. The data in Database A and B only has a limited lifespan, let’s say 1 week.

So, the idea is, that as long as either ServerA or ServerB is alive, the data from our servers will always land somewhere on our platform. It might have to be filtered and decoded later, but that’s ok, as long as we have it.
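The later filtering step could look roughly like the sketch below. It assumes each stored row keeps the raw LoRaWAN frame and a receive time; the field names `phy_payload` and `received_at` are hypothetical, as is the function `merge_unique`. Rows from Database A and B are merged, and a frame that was stored on both sides is kept only once.

```python
def merge_unique(rows_a, rows_b):
    """Merge rows from both databases, keeping the earliest copy of each frame.

    Two rows count as the same frame when their raw payload is identical.
    """
    seen = set()
    merged = []
    for row in sorted(rows_a + rows_b, key=lambda r: r["received_at"]):
        key = row["phy_payload"]
        if key in seen:
            continue  # already kept an earlier copy of this frame
        seen.add(key)
        merged.append(row)
    return merged


# Hypothetical sample rows: the frame "QAEC" was stored on both servers.
db_a = [
    {"phy_payload": "QAEC", "received_at": 10.0, "source": "A"},
    {"phy_payload": "QAED", "received_at": 70.0, "source": "A"},
]
db_b = [
    {"phy_payload": "QAEC", "received_at": 10.2, "source": "B"},
    {"phy_payload": "QAEE", "received_at": 130.0, "source": "B"},
]

unique_rows = merge_unique(db_a, db_b)
print([r["phy_payload"] for r in unique_rows])
```

Keying on the raw payload also collapses retransmissions of the same frame, which seems acceptable for an archive that is decoded later.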

I think you should look at clustering the parts individually. ChirpStack uses PostgreSQL and Redis as data stores. Redis already supports clustering, and PostgreSQL has its own clustering strategies. You can put the Gateway Bridges on the gateways themselves. That leaves two things to cluster: the NS and the AS. The AS doesn’t run any long-running algorithms or calculations, so I don’t think it needs clustering. All the calculations are in the NS, so you should think about the NS first. How do you cluster it? One answer is to set up an nginx load balancer to spread the traffic across multiple network server instances.
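To illustrate what such a load balancer does (this is not nginx configuration, just the idea in Python), here is a small round-robin picker that skips instances marked as down. The class `RoundRobin` and the backend addresses are hypothetical, and the sketch assumes all NS instances share the same Redis and PostgreSQL so that any of them can handle a request.

```python
import itertools


class RoundRobin:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


# Hypothetical network server instances.
lb = RoundRobin(["ns-1:8000", "ns-2:8000"])
first_three = [lb.next_backend() for _ in range(3)]
lb.mark_down("ns-1:8000")
after_failure = [lb.next_backend() for _ in range(2)]
print(first_three, after_failure)
```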

Unfortunately, we use Mikrotik KNOT as our gateway devices, which means we can’t install the Gateway Bridge on them.
