ChirpStack capacity

I’m trying to gauge the capacity of our LoRaWAN network. How would I be able to check how many nodes send packets to a single frequency channel on a single gateway?

Are you trying to test the limits of the radio layer or ChirpStack's capacity to handle the messages?

1 Like

In a sense, yes. I’m trying to figure out how many more sensors I can add to the gateway. I’m also trying to see the distribution of the packets.

Remember that the gateway does not “process” messages. It is a simple forwarder. So on the gateway side, the limit is… physics.

If you send messages at DR5, the time on air is 61.7 ms per message (assuming there are no repetitions). So you could theoretically receive around 58 000 messages per channel per hour.
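To make that arithmetic easy to reproduce, here is a small Python sketch of the standard LoRa time-on-air formula (Semtech AN1200.13). The 23-byte PHY payload is my assumption (roughly a 10-byte application payload plus LoRaWAN framing); it reproduces the 61.7 ms figure:

```python
import math

def lora_toa(phy_payload_len, sf=7, bw=125e3,
             cr=1,  # coding rate index: 1 -> 4/5
             preamble=8, explicit_header=True, crc=True, low_dr_opt=False):
    """Time on air in seconds, per the formula in Semtech AN1200.13."""
    t_sym = (2 ** sf) / bw
    de = 1 if low_dr_opt else 0
    ih = 0 if explicit_header else 1
    payload_symbols = 8 + max(
        math.ceil((8 * phy_payload_len - 4 * sf + 28 + 16 * int(crc) - 20 * ih)
                  / (4 * (sf - 2 * de))) * (cr + 4),
        0,
    )
    return (preamble + 4.25 + payload_symbols) * t_sym

# Assumption: ~10 bytes of application payload plus ~13 bytes of LoRaWAN framing,
# i.e. a 23-byte PHY payload, which gives ~61.7 ms at SF7 / 125 kHz (EU868 DR5).
toa = lora_toa(23)
print(f"time on air: {toa * 1000:.1f} ms")
print(f"back-to-back messages per channel per hour: {3600 / toa:.0f}")
```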

Of course, this will never happen, because the devices send messages on a random basis, not one after the other.

So you have to do a kind of “probabilistic calculation” depending on your own parameters (how many messages per hour for each device, spreading factor, number of repetitions…).
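As a very rough illustration of that probabilistic view, here is a pure-ALOHA sketch (my own simplification, not something ChirpStack computes for you); the device count, message rate and airtime are just example numbers:

```python
import math

def aloha_success_probability(devices, msgs_per_hour, toa_s, channels=8):
    """Pure-ALOHA approximation: a packet survives if no other packet on the
    same channel starts within one airtime before or after it."""
    load_per_channel = devices * msgs_per_hour * toa_s / 3600 / channels
    return math.exp(-2 * load_per_channel)

# Example: 1000 devices sending 4 uplinks/hour at ~61.7 ms airtime on 8 channels.
p = aloha_success_probability(1000, 4, 0.0617)
print(f"{p * 100:.1f}% of uplinks expected to be collision-free")
```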

Here is a statement from Semtech: “A single eight-channel gateway can support a few hundred thousand messages over the course of a 24-hour period. If each end device sends 10 messages a day, such a gateway can support about 10,000 devices” (Source)

On the ChirpStack side, you can run simulations with this tool: https://github.com/brocaar/chirpstack-simulator.

Hope this helps.

1 Like

Thanks for the explanation @Jerome73!

I’ve been reading this article from MokoLoRa (https://www.mokolora.com/calculate-the-network-capacity-of-lorawan-gateway/).
It says here that we would need more than one gateway to ensure that all data can be received.

For example, I have a gateway that supports 10 000 devices, but to ensure that all data is received, I should use 2 or 3 gateways. But in this case, since each gateway can accommodate up to 10 000 devices, how would the situation be if I had 30 000 devices in the network?

I’m pretty new to all of this, so apologies if my questions aren’t that smart. I’m still trying to figure things out.

Thanks a lot!

Having more than one gateway is a good practice to enhance network reliability and optimize energy consumption.

With many gateways, you can reach optimal network performance thanks to ADR.

But if you place 3 gateways at the same place, it will not increase network capacity.

For instance, we deployed a network in an airport with 9 gateways for 3000 sensors, because our customer required that 95% of the sensors transmit on SF7.

Capacity is a consequence of other technical choices.

2 Likes

How were you able to check that the 95% was achieved? Is it by checking the valid CRC percentage on every single gateway or did you use RSSI of packets?

CRC has nothing to do with that.

You can check the DR of all the devices in a tenant on the dashboard.
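If you want something finer-grained (for instance, the original question of how many nodes transmit on a given frequency through a given gateway), one option is to tally ChirpStack's MQTT uplink events. The topic layout and JSON field names below are what I would expect from the ChirpStack v4 MQTT integration with JSON encoding, so please check them against your own uplink events before relying on this sketch:

```python
import json
from collections import defaultdict

import paho.mqtt.client as mqtt

# Devices seen per (gateway, frequency) and per data rate. Assumes the
# ChirpStack v4 MQTT integration with JSON encoding; verify the topic and
# field names against your own uplink events.
devices_per_channel = defaultdict(set)
devices_per_dr = defaultdict(set)

def on_message(client, userdata, msg):
    event = json.loads(msg.payload)
    dev_eui = event["deviceInfo"]["devEui"]
    devices_per_dr[event.get("dr")].add(dev_eui)
    frequency = event["txInfo"]["frequency"]
    for rx in event.get("rxInfo", []):
        devices_per_channel[(rx["gatewayId"], frequency)].add(dev_eui)
    for key, devs in sorted(devices_per_channel.items()):
        print(key, len(devs), "devices")
    print({dr: len(devs) for dr, devs in devices_per_dr.items()}, "devices per DR")

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION1)  # paho-mqtt >= 2.0
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("application/+/device/+/event/up")
client.loop_forever()
```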

Be very careful with claims such as the one that says a Gateway can handle “10000 devices”. These claims originated early in the life of LoRa WAN and were made on the basis of a 64 channel Gateway. Further, they were based on the assumption that each node would send once per day and that the packet would be around 12 bytes. You need to de-rate the capacity according to your circumstances.

First off, if you have a typical 8 channel Gateway, divide 10000 by 8 and you get 1250 as the maximum number of devices. Then, in theory, you should look at your packet length and the range your devices will be from the Gateway. If you use ADR, the DR will change according to how far the nodes are from the Gateway, which means you will probably have to assume a mid-point DR of, say, 4 or 5. Then look at your packet length. Together they will determine the air time. But remember that messages are converted to base 64: you need 3 bytes on air to send 2 bytes of data. If your devices are sending more than once per day, you need to multiply the air time accordingly.
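Put as a formula, one way to express that de-rating (this is my own framing, and the 10% acceptable channel load is an arbitrary assumption) is:

```python
# My own framing of the de-rating above: give each channel an acceptable
# fraction of airtime per day, then divide by the daily airtime each device uses.
def max_devices(channels, toa_s, msgs_per_day, acceptable_load=0.10):
    airtime_budget_s = channels * 86400 * acceptable_load
    airtime_per_device_s = msgs_per_day * toa_s
    return int(airtime_budget_s / airtime_per_device_s)

# e.g. 8 channels, ~120 ms airtime at a mid DR, one uplink every 15 minutes:
print(max_devices(channels=8, toa_s=0.120, msgs_per_day=96))  # ~6000 devices at 10% load
```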
But what is also important to remember is that there is more to throughput than the uplink on-air time. After finishing an uplink, a device sits inactive during the RX1 & RX2 window delays, then watches for downlink messages from the Gateway. If you have increased your delays out to 5 sec to cope with network latency, then that dead time becomes 5 seconds, not 2. The downlink happens at the end of the window, so the delay is extended according to how long that message takes to send (at the higher DR of the downlink). It could be short (e.g. an ACK if your data is important enough for you to want to know that the packet was received) or long (config update, firmware update). The TX/RX channel pairing is locked during that whole window. So even though your packet may have only taken 100 ms to send, the send time is swamped by the rest of the delays. In calculating your effective on-air time, it would be simpler to just assume a window of 2 to 2.5 seconds for each message.
Then there is one more consideration: if you are using OTAA, what happens after a Gateway outage? All your nodes may need to rejoin the network, creating a storm of activity. Things take longer to sort out, as the join delay is 5 seconds, not 2. If your network is running at capacity, it will take a long time to settle again. Running multiple Gateways helps avoid this, but of course adds to the cost.
Interestingly, 25 years ago we used 450 MHz radios with a 2 to 5 km range to build radio networks in applications requiring 15-minute data; it used tight ASCII packets (40 to 60 bytes) with low overhead and didn’t have any encryption or security. The network would max out at about 100 Nodes - and that was on a single frequency. After 7 years of playing with LoRa WAN (in private networks) we find that, despite the promises and having 8 channels rather than 1, LoRa WAN systems in the same environment (sensor count has increased, so packet lengths are now 80 to 140 bytes) still max out at around the same capacity. But in practice, to keep the system reliable and to allow for recovery after events such as a Gateway outage, 30 to 40 devices is the safe limit. I accept that this is a pretty extreme LoRa WAN use case, but it is a reasonably common real world application and one that LoRa WAN has often been applied to.

Hi @petertoome :wave:!

Your feedback is interesting but it is very different from ours.

We have been deploying LoRaWAN private networks for 8 years now, and 30/40 devices per gateway for “more than one message per day” is simply not accurate and can confuse new users:

  1. OTAA does not mean that you have to re-join after a gateway outage. If that is the case, it is clearly a (very) poor design on the sensor side.
  2. If you are using ACKs on a regular basis (which is not a “standard” LoRaWAN design), then the downlink SF must be set up accordingly. For instance, TTN uses SF9 => approximately 250 ms for an 80-byte payload (far away from 2 seconds).
  3. I don’t see why RX delays should have an influence on the network capacity.
  4. LoRaWAN was designed to upload data from “small data” sensors, so the choice of sensors is important to avoid long payloads (80% of which come from poor firmware design).

And of course, deploying a multi-gateway network is one of the keys to providing more reliability.

1 Like

@bconway what is your opinion regarding point 3?

Some people are arguing extremes in this thread. :joy: Also, payloads are not sent as base64 over the air; that would negate the whole point of small byte payloads.

Gateway outages are always a weird topic. I always think/plan for them in terms of minutes or hours and mostly covered by a battery, but then people come to this forum with reports of unexpected behavior after their gateway was offline for 7 days (yikes). :man_shrugging:

I’m not sure on the RX/TX pair locking off the top of my head, I’d need to revisit it, sorry.

I think both the optimistic and conservative views have been well-presented, and the OP should do some testing for their use case.

1 Like

I should explain our use case here, which is monitoring in irrigated and broadacre agriculture. It is very far from the LoRa WAN norm, but is one in which LoRa WAN has been heavily promoted (two state governments in Australia have offered multi-million dollar grant schemes to get LoRa WAN systems established in ag districts).
We have had large scale projects collapse in Australia because the participants planned on the 20km range promised by LoRaWAN proponents. What they found in practice was that in real world conditions (limited antenna height, uneven terrain, long packet lengths etc) the achievable range was 3 to 5km. To cover a district in those conditions means a heck of a lot more Gateways (and a huge increase in cost) than anticipated.
The norm in irrigated ag is 15-minute logging with multi-parameter sensors. Although you see lots of claims by companies looking to push products in this market, few have actual experience. I see lots of single-level soil moisture sensors; whereas for over 20 years, the norm has been multi-level soil probes, which return 4, 8, 10, 12, or 16 values (typically with a sensor every 10 cm along their length). Most probes return soil moisture and soil temperature and some also return soil conductivity, so there are 2 or 3 different measurement sets to send every 15 minutes. For soil moisture, the range is 00.000 to 99.999, so at 6 characters plus a field separator, for a 100 cm, 10-sensor probe that is 70 characters. Because we need to know when the reading was taken (rather than when it was sent), everything is time-stamped, so more characters.
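For what it is worth, here is that payload arithmetic spelled out in a few lines of Python; the readings and the semicolon separator are purely illustrative:

```python
# The payload arithmetic above, spelled out: ten values in the 00.000-99.999
# ASCII format, each followed by a field separator, plus a Unix timestamp.
readings = [34.125, 35.010, 36.500, 38.250, 40.000,
            41.750, 43.125, 45.000, 47.500, 49.999]          # illustrative values
body = "".join(f"{value:06.3f};" for value in readings)      # 7 characters per value
packet = f"{1700000000};{body}"                               # timestamp goes in front
print(len(body), len(packet))                                 # -> 70 and 81 characters
```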
As another example, a weather station with air temp, RH, wind speed, wind direction, solar radiation and rain will still need a packet of 30 to 40 bytes, which once again is sent every 15 minutes. So lots of long packets sent very often. The inverse of the ideal use case.
Australia does not have the population density of Europe. Our agriculture districts are often a long way from cities, so infrastructure is lacking. If we put a Gateway out, it may end up on a cellular backhaul. The person who installed it may be based 4 hours away and an outage may take several days to rectify.
Fatalists love to cite “Sod’s Law”, which says “everything which can go wrong will”. Others like to cite “Murphy’s Law”, which states “If there are two or more ways to do something and one of them will create a catastrophe, sooner or later someone will do it that way” (Murphy was an aerospace engineer and in that field such events are disastrous). In remote areas, these two rules often collide. So when the Gateway goes out because the cellular modem locks up, it coincides with someone re-starting the LoRa WAN server. So when the Gateway comes back online, all the nodes re-join. Plus, because the Nodes are “smart” and buffer the stored data, they then start trying to send it to the server. Hence the “storm” at re-connection. It won’t happen every time, just when you have an unhappy customer breathing down your neck, because the irrigation system has been down, the weather is stinking hot and the crops are withering.
LoRa WAN’s designers paid a lot of attention to security, but didn’t really worry as much about data integrity. What is important in most monitoring applications is the ability to return a contiguous data set: making sure every reading was taken at the correct time and is transmitted successfully to the presentation software. Technologies that compete with LoRa WAN - mainly the cellular technologies - do this automatically; each node is a combination data logger and telemetry unit; they have enough memory for months to years’ worth of data; and they always keep track of the last data which was sent; if comms go out, they keep reading and then, when comms are restored, send the buffered data.
For LoRa WAN to succeed in this space, it must offer the same capabilities.
Simple LoRa WAN devices are just send and forget: the measurement is taken, the device sends the raw data and then goes back to sleep. Packets do not contain a time stamp and the time is taken from the Gateway receive time. But what happens if a packet is missed? In this scenario, nothing.
But if your data is precious, you need to add extra code to (1) buffer the data and (2) check that each packet has been received at the other end. And because the uplink time is now no longer related to the measurement time, each packet needs to carry an internal time stamp. The LoRa WAN Node will send a packet and wait for an ACK to know that it has been received. It will then go on to the next packet. If it doesn’t get an ACK, it will retry a number of times; if it still has no luck, it can go to sleep and wait; it will try again later, using a Link Check packet to test the link. If the link check succeeds, it can start sending again. Because all of this is beyond the LoRa WAN scope, the protocol has to be designed by the node designer.
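Sketched very roughly in Python, the node-side logic described above might look like the following; `send_confirmed` and `link_check_ok` are hypothetical stand-ins for whatever the node’s LoRaWAN stack provides, not a real library API:

```python
import time
from collections import deque

# --- Hypothetical radio primitives --------------------------------------
# Stand-ins for whatever the node's LoRaWAN stack provides; the names are
# illustrative only, not calls from any real library.
def send_confirmed(payload: bytes) -> bool:
    """Send a confirmed uplink; return True if an ACK came back."""
    return True  # replace with the real stack call

def link_check_ok() -> bool:
    """Issue a LinkCheckReq and return True if an answer was received."""
    return True  # replace with the real stack call
# -------------------------------------------------------------------------

MAX_RETRIES = 3
RETRY_BACKOFF_S = 15 * 60          # wait a full logging cycle before probing again
buffer: deque[bytes] = deque()     # timestamped readings awaiting delivery

def deliver_buffered() -> None:
    """Drain the buffer oldest-first, stopping as soon as the link looks dead."""
    while buffer:
        reading = buffer[0]        # each packet carries its own timestamp
        for _ in range(MAX_RETRIES):
            if send_confirmed(reading):
                buffer.popleft()   # delivered: move on to the next packet
                break
        else:
            # No ACK after several tries: sleep, then test the link with a
            # LinkCheck before resuming; otherwise keep buffering readings.
            time.sleep(RETRY_BACKOFF_S)
            if not link_check_ok():
                return
```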
This adds a lot of complexity to the node design and does put a lot of extra demand on the network. It makes the approach unsuitable for public networks and limits its effective use to small-scale private networks. And my previous comments were in the context of this background.
I am a big fan of LoRa WAN. But I do appreciate its limitations and think it is important for users to understand them. Even after 7 years of experimenting, testing, repairing etc, I still regularly learn new things about the technology.
And I also want to make one other thing clear: Chirpstack has been the best thing ever for LoRa WAN. Without it, the pace of development and adoption would have been much slower. It is a superb product and one that Orne can truly be proud of.

3 Likes