I had a v3 chirpstack production deployment for a couple of years, without any issues. Kudos to @brocaar for his awesome work. I have a huge amount of trust in his implementation.
My v3 deployment was a single ec2 instance in AWS, running the application server, network server and a gateway bridge in separate containers. The postgresql DB and redis were running on their own servers. The system was rock solid and I never had any issues with any part of it.
I recently migrated to v4, with the chirpstack server and a gateway bridge running in docker containers on a single ec2 instance. I had accidentally under-provisioned the ec2 instance (it was smaller than the one I had used for v3). The system worked at first, but then started getting overloaded, and it appears that the gateway bridge container was being starved because the chirpstack container was very busy.
This leads me to the conclusion that running the gateway bridge container on the same instance as the chirpstack container is not the best production setup. It is very simple to move them onto separate instances.
Adding more gateway bridge instances seems easy: just split the gateways into groups and have each group connect to its own gateway bridge. Otherwise, just increasing the ec2 instance size is a very easy way to increase performance.
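To make the segmenting idea concrete, here is a minimal sketch of one way to decide which gateway goes to which gateway bridge: hash the gateway ID and use the result to pick a bridge. The bridge hostnames, the gateway IDs and the `assign_bridge` helper are all made up for illustration; none of this is part of chirpstack, and each gateway would still need to be pointed at the chosen bridge in its own configuration.

```python
import hashlib

# Hypothetical bridge hostnames and gateway IDs, for illustration only.
BRIDGES = ["bridge-a.example.internal", "bridge-b.example.internal"]
GATEWAYS = ["0016c001f1500812", "0016c001f1500813", "a84041ffff1ec2d0"]


def assign_bridge(gateway_id: str, bridges: list[str]) -> str:
    """Pick a bridge for a gateway, deterministically, from its ID."""
    # Normalise case so "0016C0..." and "0016c0..." land on the same bridge.
    digest = hashlib.sha256(gateway_id.lower().encode("ascii")).digest()
    # Use the first 8 bytes of the hash as an integer and wrap it
    # around the number of available bridges.
    index = int.from_bytes(digest[:8], "big") % len(bridges)
    return bridges[index]


if __name__ == "__main__":
    for gw in GATEWAYS:
        print(gw, "->", assign_bridge(gw, BRIDGES))
```

The downside of a plain modulo like this is that changing the number of bridges reshuffles most gateways, so a static list per bridge may be just as practical for a small fleet.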
What are the possible ways to improve the throughput of the chirpstack server itself? Can you run multiple chirpstack servers behind a load balancer? Since chirpstack and the gateway bridge communicate over mqtt, what happens if multiple chirpstack instances receive the same mqtt message? Will every instance try to process it, defeating the parallelism?
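To illustrate what I am asking, here is a toy, in-memory model of the two delivery behaviours I am aware of in mqtt. With an ordinary subscription every subscriber gets every message. With an MQTT 5 shared subscription (a topic filter of the form `$share/<group>/<topic>`) the broker hands each message to only one subscriber in the group. The `ToyBroker` class, the instance names and the round-robin choice are my own assumptions for the sketch; this is not a real mqtt client, and I have not confirmed whether or how chirpstack v4 uses shared subscriptions.

```python
from collections import defaultdict
from itertools import count


class ToyBroker:
    """In-memory stand-in for an mqtt broker (illustration only)."""

    def __init__(self):
        self.plain = []                   # ordinary subscribers
        self.groups = defaultdict(list)   # share group -> subscribers
        self._next = defaultdict(count)   # per-group round-robin counter

    def subscribe(self, name, inbox, share_group=None):
        target = self.plain if share_group is None else self.groups[share_group]
        target.append((name, inbox))

    def publish(self, message):
        # Ordinary subscriptions: every subscriber receives a copy.
        for _, inbox in self.plain:
            inbox.append(message)
        # Shared subscriptions: exactly one member per group receives it.
        # Round-robin is an assumption; real brokers pick their own strategy.
        for group, members in self.groups.items():
            chosen = next(self._next[group]) % len(members)
            members[chosen][1].append(message)


def run(share_group):
    """Publish four uplinks to two hypothetical chirpstack instances."""
    broker = ToyBroker()
    inboxes = {"chirpstack-1": [], "chirpstack-2": []}
    for name, inbox in inboxes.items():
        broker.subscribe(name, inbox, share_group)
    for n in range(4):
        broker.publish(f"uplink-{n}")
    return inboxes


if __name__ == "__main__":
    print("plain :", run(None))
    print("shared:", run("chirpstack"))
```

In this toy model the plain case delivers all four uplinks to both instances, while the shared case gives each instance two of them, which is the behaviour that would make running several chirpstack instances worthwhile.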
Thanks to everyone in this forum. I really appreciate the support!