Accessing device dashboard consumes all CPU and memory until crash

Hi,

I’ve set up a few devices with three metrics. I got it up and running, but after a few days I am unable to access the device dashboard. Checking htop, I can see that CPU usage goes through the roof and memory consumption keeps increasing until the system runs out of memory and crashes.

Chirpstack v4.0.2

There’s nothing in the log except the process being killed by the OOM killer:
Oct 31 13:46:12 chirpsrv systemd[1]: chirpstack.service: A process of this unit has been killed by the OOM killer.
Oct 31 13:46:12 chirpsrv systemd[1]: chirpstack.service: Main process exited, code=killed, status=9/KILL
Oct 31 13:46:12 chirpsrv systemd[1]: chirpstack.service: Failed with result 'oom-kill'.
Oct 31 13:46:12 chirpsrv systemd[1]: chirpstack.service: Consumed 30.280s CPU time.

Does anyone know the cause of this, or perhaps can help me debug it further?

Hi,

same issue here.
I created a new VM for testing purposes, but as soon as I access a gateway, one CPU maxes out and some leak eats up all memory until the OOM killer kills the ChirpStack process, which ends in an infinite loop.

Same here since yesterday, @brocaar.

Is there a chance that this is related to the summer to winter time switch that happened on October 30?


@tbre that could indeed be the explanation. After switching the server time to UTC, we have a temporary fix. @brocaar

It really seems to be a time/timezone related issue. After switching to UTC, everything works fine again.

Thanks a lot!


I ran into the same issue. Changing the system timezone of the server to UTC is a valid workaround for now. The issue appeared out of nowhere; everything worked a few days ago.

Maybe the cause is the daylight saving time switch here in Europe. We rolled our clocks back by one hour last weekend.

I think this is a bug and needs to be further investigated.

Log around the crash:

sudo journalctl -f -n 1000 -u chirpstack
...
Nov 01 19:57:00 gw1 chirpstack[973]: 2022-11-01T18:57:00.326392Z  WARN chirpstack::gateway::backend: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 19:57:00 gw1 chirpstack[973]: 2022-11-01T18:57:00.527000Z  WARN chirpstack::gateway::backend: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 19:57:00 gw1 chirpstack[973]: 2022-11-01T18:57:00.527056Z  WARN chirpstack::gateway::backend: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 19:57:00 gw1 chirpstack[973]: 2022-11-01T18:57:00.845844Z  WARN chirpstack::api::backend: Backend interfaces API is disabled
Nov 01 20:00:58 gw1 chirpstack[973]: memory allocation of 96 bytes failed
Nov 01 20:00:58 gw1 systemd[1]: chirpstack.service: Main process exited, code=killed, status=6/ABRT
Nov 01 20:00:58 gw1 systemd[1]: chirpstack.service: Failed with result 'signal'.
Nov 01 20:00:58 gw1 systemd[1]: chirpstack.service: Consumed 3min 40.293s CPU time.
Nov 01 20:00:58 gw1 systemd[1]: chirpstack.service: Scheduled restart job, restart counter is at 2.
Nov 01 20:00:58 gw1 systemd[1]: Stopped ChirpStack open-source LoRaWAN Network Server.
Nov 01 20:00:58 gw1 systemd[1]: chirpstack.service: Consumed 3min 40.293s CPU time.
Nov 01 20:00:58 gw1 systemd[1]: Started ChirpStack open-source LoRaWAN Network Server.
Nov 01 20:00:59 gw1 chirpstack[1030]: 2022-11-01T19:00:59.006614Z  WARN chirpstack::region: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 20:00:59 gw1 chirpstack[1030]: 2022-11-01T19:00:59.006699Z  WARN chirpstack::region: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 20:00:59 gw1 chirpstack[1030]: 2022-11-01T19:00:59.006717Z  WARN chirpstack::region: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'
Nov 01 20:00:59 gw1 chirpstack[1030]: 2022-11-01T19:00:59.006731Z  WARN chirpstack::region: Config exists, but region is not enabled. To enable it, add it to 'network.enabled_regions'

I’ve got the same: memory consumption and CPU usage hang my machine completely.
There are no logs from ChirpStack indicating that something is wrong.
I’ve checked htop, strace and lsof (they show no difference between correct operation and the hang after entering the Device instance page), but ltrace -f -p <chirpstack PID> showed millions of repeated mprotect calls happening at the speed of light :sleepy::

[pid 19527] <... clock_nanosleep resumed>NULL) = 0
[pid 19527] poll([{fd=10, events=POLLIN|POLLOUT|POLLNVAL}, {fd=22, events=POLLIN|POLLOUT|POLLNVAL}, {fd=23, events=POLLIN|POLLOUT|POLLNVAL}, {fd=24, events=POLLIN|POLLOUT|POLLNVAL}], 4, 1000) = 4 ([{fd=10, revents=POLLOUT}, {fd=22, revents=POLLOUT}, {fd=23, revents=POLLOUT}, {fd=24, revents=POLLOUT}])
[pid 19527] clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000},  <unfinished ...>
[pid 19524] mprotect(0x7f6eda2ad000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2ae000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2af000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b0000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b1000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b2000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b3000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b4000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b5000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b6000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b7000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b8000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2b9000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2ba000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2bb000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2bc000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2bd000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2be000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2bf000, 4096, PROT_READ|PROT_WRITE) = 0
[pid 19524] mprotect(0x7f6eda2c0000, 4096, PROT_READ|PROT_WRITE) = 0
...

I’ll try to find out how to switch the timezone to UTC, but I hope this will be solved by March, when summer time begins.

Regards!

I have had the same problem since the summer to winter switch. My installation worked just fine before. After the time change, CPU and RAM usage climb to 100% and the system crashes.
To solve the problem, I had to change the timezone by running
"sudo dpkg-reconfigure tzdata", then choosing "None of the above" and then "UTC".
Using the Ubuntu GUI instead did not solve the problem in my case.

Looking into this now (I can reproduce the issue).

Next time, please create a GitHub issue for bug reports: https://github.com/chirpstack/chirpstack/issues. The forum is for community support (e.g. helping each other use ChirpStack); bug reports are better filed on GitHub, as I will get an email notification and, if the issue is critical, I will prioritize it over other work.

I have found the issue. Will do a bit more testing and then will create a bugfix release shortly.

This has been fixed in commit 6b1cf4f, "Fix metrics per day interval calculation" (chirpstack/chirpstack on GitHub).
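
For anyone wondering how a clock change can lead to runaway CPU and memory use in a per-day calculation, here is a minimal Python sketch of the general failure mode. It is not the ChirpStack code and makes no claim about what the commit actually changed. The function names, the `limit` guard and the simplified timezone (a hard-coded Central European offset for autumn 2022) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a simplified Central European zone for autumn 2022.
# DST ended on 2022-10-30 at 01:00 UTC (03:00 CEST became 02:00 CET).
DST_END_UTC = datetime(2022, 10, 30, 1, 0, tzinfo=timezone.utc)


def utc_offset(instant):
    """Return the local UTC offset that applies at the given UTC instant."""
    return timedelta(hours=2) if instant < DST_END_UTC else timedelta(hours=1)


def to_local(instant):
    """Convert a UTC instant to naive local wall-clock time."""
    return (instant + utc_offset(instant)).replace(tzinfo=None)


def day_buckets_fixed_step(start_utc, end_local_midnight, limit=10):
    """Hypothetical naive approach: add 24 h until the end is hit exactly."""
    buckets, current = [], start_utc
    while to_local(current) != end_local_midnight:
        buckets.append(to_local(current))
        if len(buckets) >= limit:
            # Guard for this demo only. Without it the loop never ends
            # and the list keeps growing once the DST switch is crossed.
            break
        current += timedelta(hours=24)
    return buckets


def day_buckets_calendar(start_local_midnight, end_local_midnight):
    """Safer approach: step the local calendar date, not elapsed time."""
    buckets, current = [], start_local_midnight
    while current < end_local_midnight:
        buckets.append(current)
        current += timedelta(days=1)  # naive local datetime stays at 00:00
    return buckets


if __name__ == "__main__":
    # Local midnight on 2022-10-29 (CEST) is 22:00 UTC on 2022-10-28.
    start_utc = datetime(2022, 10, 28, 22, 0, tzinfo=timezone.utc)
    end_local = datetime(2022, 11, 1)
    for bucket in day_buckets_fixed_step(start_utc, end_local):
        print("fixed step:", bucket)
    for bucket in day_buckets_calendar(datetime(2022, 10, 29), end_local):
        print("calendar:  ", bucket)
```

With the fixed 24 h step, the third bucket lands on 2022-10-30 23:00 local time instead of midnight, so the exact-match end condition is never met and only the demo guard stops the loop. With the server in UTC there is no offset change, which would be consistent with the workaround reported above.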