Hello!
We implemented ChirpStack (v4) in AWS some time ago and it has been working fine until a few days ago, when we started receiving incorrect data from some of our devices.
As can be seen in the picture above, the received data jumps from one range to another, which is not right: the value we should be receiving ramps up slowly.
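Since the expected signal is a slow ramp, a simple per-device plausibility check on the decoded value can flag these frames automatically. This is only a hypothetical sketch (the `max_step` threshold and the idea of a single scalar reading are assumptions, not taken from our actual codec):

```
# Flag decoded uplinks whose value jumps more than `max_step` away from the
# previous reading of the same device -- a slow ramp should never do this.
last_seen: dict[str, float] = {}  # dev_eui -> last accepted value

def is_plausible(dev_eui: str, value: float, max_step: float = 5.0) -> bool:
    prev = last_seen.get(dev_eui)
    last_seen[dev_eui] = value
    if prev is None:
        return True  # first reading: nothing to compare against
    return abs(value - prev) <= max_step
```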
The failing devices do not seem to follow any particular pattern, except that they are installed in places where they are regularly powered off (site maintenance, electricity cuts, etc.), so they re-join the LoRa server every time this happens. (All of them are OTAA devices.)
We are currently trying to figure out what and where the problem is, but nothing seems wrong at first glance. We have already checked the logs and there is nothing that indicates a problem: no warnings or error messages.
The first time this happened we solved it by restarting the affected device so that it re-joined (via OTAA), but we cannot do this every time it happens, since some of the devices are very far away.
This gave us a clue about what may be happening: perhaps an OTAA-related issue. But what exactly? We suspect an AppSKey problem, since the whole message is readable in ChirpStack, as the first pictures show, and everything is correct except for the payload, which is decrypted using the AppSKey. If the AppSKey used to encrypt the data is not the same one stored in the LoRa server, the payload would be decrypted incorrectly, which is exactly what we see (correct me if I am wrong). If this is the problem, how can we resolve it? Is it a code issue or a human error on our part? ( @brocaar )
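To illustrate why a stale AppSKey produces exactly this symptom: LoRaWAN 1.0.x encrypts the FRMPayload by XORing it with an AES-128 keystream derived from the AppSKey, DevAddr, frame counter and direction. The toy below replaces that keystream with the key bytes themselves purely for illustration (the keys and the 2-byte payload are made up), but the symptom is the same: decrypting with the wrong key yields a structurally valid payload carrying a wildly different value.

```
def crypt(payload: bytes, key: bytes) -> bytes:
    # Toy stand-in for LoRaWAN FRMPayload encryption: the real scheme XORs
    # the payload with an AES-128 keystream derived from the AppSKey; here
    # we XOR with the key bytes directly, which is enough to show the effect.
    ks = (key * (len(payload) // len(key) + 1))[:len(payload)]
    return bytes(a ^ b for a, b in zip(payload, ks))

key_after_rejoin = b"\xaa" * 16      # AppSKey the device derived on its last join
key_on_server    = bytes(range(16))  # hypothetical stale AppSKey stored server-side

plain = (1234).to_bytes(2, "big")    # the sensor value actually transmitted

ct    = crypt(plain, key_after_rejoin)  # device encrypts the uplink
right = crypt(ct, key_after_rejoin)     # decrypting with the correct key
wrong = crypt(ct, key_on_server)        # decrypting with the stale key

# XOR is its own inverse, so the correct key recovers 1234, while the stale
# key yields an unrelated value. The frame can still pass the MIC check
# (which uses the NwkSKey), so nothing looks wrong in the logs.
print(int.from_bytes(right, "big"))  # 1234
print(int.from_bytes(wrong, "big"))  # 44665
```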
That said, we wonder whether anyone has had the same problem and how they managed to solve it, or whether anyone has a clue about how to fix this.
This is an urgent problem for us, so we really appreciate any suggestions on how to resolve it.
Thanks in advance,
Best regards.
P.S.
Additionally, we include a brief summary of our highly available ChirpStack architecture on AWS and our ChirpStack configuration files (without the sensitive information such as endpoints, users, or passwords), since they may be relevant to this issue.
As previously mentioned, our ChirpStack architecture runs on AWS and is highly available. We use Elastic Beanstalk to deploy ChirpStack on multiple EC2 instances. For Postgres we use an RDS cluster, and for Redis an ElastiCache cluster. The RDS cluster exposes two endpoints and we use the primary one; the same applies to ElastiCache. AWS handles the clustering behind the scenes and exposes those endpoints.
The configuration files.

The main ChirpStack configuration:

```
[logging]
level="info"
[postgresql]
dsn="postgres://[postgres-user]:[postgres-password]@[postgres-endpoint]:[postgres-port]/chirpstack?sslmode=disable"
max_open_connections=10
min_idle_connections=0
automigrate=false
[redis]
servers=[
"redis://[redis-endpoint]:[redis-port]",
]
tls_enabled=false
cluster=false
[network]
net_id="000000"
enabled_regions=[
"au915_0"
]
[api]
bind="0.0.0.0:8080"
secret="[api-secret]"
[integration]
enabled=["mqtt"]
[integration.mqtt]
event_topic="application/{{application_id}}/device/{{dev_eui}}/event/{{event}}"
command_topic="application/{{application_id}}/device/{{dev_eui}}/command/{{command}}"
server="ssl://[iot-core-endpoint]:[iot-core-port]/"
json=true
username=""
password=""
qos=0
clean_session=false
client_id=""
ca_cert="[ca-cert-path]"
tls_cert="[tls-cert-path]"
tls_key="[tls-key-path]"
```

The region configuration:

```
[[regions]]
name="au915_0"
common_name="AU915"
[regions.gateway]
force_gws_private=false
[regions.gateway.backend]
enabled="mqtt"
[regions.gateway.backend.mqtt]
event_topic="gateway/+/event/+"
command_topic="gateway/{{ gateway_id }}/command/{{ command }}"
server="ssl://[iot-core-endpoint]:[iot-core-port]/"
username=""
password=""
qos=0
clean_session=false
client_id=""
ca_cert="[ca-cert-path]"
tls_cert="[tls-cert-path]"
tls_key="[tls-key-path]"
[[regions.gateway.channels]]
frequency=915200000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=915400000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=915600000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=915800000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=916000000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=916200000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=916400000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=916600000
bandwidth=125000
modulation="LORA"
spreading_factors=[7, 8, 9, 10, 11, 12]
[[regions.gateway.channels]]
frequency=915900000
bandwidth=500000
modulation="LORA"
spreading_factors=[8]
[regions.network]
installation_margin=10
rx_window=0
rx1_delay=1
rx1_dr_offset=0
rx2_dr=8
rx2_frequency=923300000
rx2_prefer_on_rx1_dr_lt=0
rx2_prefer_on_link_budget=false
downlink_tx_power=-1
adr_disabled=false
min_dr=0
max_dr=5
enabled_uplink_channels=[0, 1, 2, 3, 4, 5, 6, 7, 64]
[regions.network.rejoin_request]
enabled=false
max_count_n=0
max_time_n=0
[regions.network.class_b]
ping_slot_dr=8
ping_slot_frequency=0
```
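As a quick sanity check that the gateway channel list above matches `enabled_uplink_channels`, the AU915 uplink plan can be computed directly: the 125 kHz channels 0-63 start at 915.2 MHz in 200 kHz steps, and the 500 kHz channels 64-71 start at 915.9 MHz in 1.6 MHz steps. A small sketch of that cross-check (the frequencies are copied from the config above):

```
def au915_uplink_freq(ch: int) -> int:
    """Uplink centre frequency in Hz for AU915 channel index `ch`."""
    if 0 <= ch <= 63:   # 125 kHz channels
        return 915_200_000 + 200_000 * ch
    if 64 <= ch <= 71:  # 500 kHz channels
        return 915_900_000 + 1_600_000 * (ch - 64)
    raise ValueError(f"invalid AU915 channel {ch}")

enabled = [0, 1, 2, 3, 4, 5, 6, 7, 64]
gateway_channels = [915_200_000, 915_400_000, 915_600_000, 915_800_000,
                    916_000_000, 916_200_000, 916_400_000, 916_600_000,
                    915_900_000]

# Every enabled channel index maps to a channel the gateways actually listen on.
assert [au915_uplink_freq(c) for c in enabled] == gateway_channels
```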