GCP: NetworkServer stops receiving messages

Hi!
I’ve been using the GCP implementation as mentioned in the Quickstart guide… it works very nicely but somtimes, the NetworkServer (which is on a Compute Engine VM) stops recieving the PubSub messages without notice… then, after a systemctl restart, it recieves all the enqueued messages and starts working normally… ther’'s no error in the logs.

Do anyone have the same problem?
Here is a log from the VM:

Nov 28 10:15:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:15:48-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:15:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:15:48-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:16:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:16:18-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:16:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:16:18-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:16:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:16:48-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:16:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:16:48-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:17:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:17:18-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:17:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:17:18-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:17:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:17:48-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:17:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:17:48-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:18:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:18:18-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:18:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:18:18-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:18:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:18:48-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:18:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:18:48-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:19:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:19:18-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:19:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:19:18-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:19:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:19:48-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:19:48 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:19:48-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 10:20:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:20:18-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 10:20:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T10:20:18-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:10:16 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:16-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGatewayStats grpc.service=ns.Ne
tworkServerService grpc.start_time="2018-11-28T11:10:15-03:00" grpc.time_ms=132.394 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:18-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGateway grpc.service=ns.Network
ServerService grpc.start_time="2018-11-28T11:10:18-03:00" grpc.time_ms=4.548 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:18 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:18-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGatewayStats grpc.service=ns.Ne
tworkServerService grpc.start_time="2018-11-28T11:10:18-03:00" grpc.time_ms=7.028 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:55 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:55-03:00" level=info msg="finished streaming call with code OK" grpc.code=OK grpc.method=StreamFrameLogsForGateway grpc
.service=ns.NetworkServerService grpc.start_time="2018-11-28T11:10:43-03:00" grpc.time_ms=11657.88 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:55 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:55-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGatewayStats grpc.service=ns.Ne
tworkServerService grpc.start_time="2018-11-28T11:10:55-03:00" grpc.time_ms=4.628 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:59 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:59-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGateway grpc.service=ns.Network
ServerService grpc.start_time="2018-11-28T11:10:59-03:00" grpc.time_ms=3.48 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:10:59 camuzzi-outbound-service loraserver[947]: time="2018-11-28T11:10:59-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGatewayStats grpc.service=ns.Ne
tworkServerService grpc.start_time="2018-11-28T11:10:59-03:00" grpc.time_ms=4.83 peer.address="[::1]:45910" span.kind=server system=grpc
Nov 28 11:16:28 camuzzi-outbound-service systemd[1]: Stopping LoRa Server...
Nov 28 11:16:28 camuzzi-outbound-service systemd[1]: Stopped LoRa Server.
Nov 28 11:16:28 camuzzi-outbound-service systemd[1]: Started LoRa Server.
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="starting LoRa Server" band=AU_915_928 docs="https://docs.loraserver.io/" net_id=000000 vers
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="starting LoRa Server" band=AU_915_928 docs="https://docs.loraserver.io/" net_id=000000 vers
ion=2.3.0
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="disabling all channels"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="enabling channels" channels="[0 1 2 3 4 5 6 7]"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="setup redis connection pool" url="redis://127.0.0.1:6379"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="connecting to postgresql"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="gateway/gcp_pub_sub: setting up client"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="gateway/gcp_pub_sub: setup downlink topic" topic=au921-gateway-commands
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="gateway/gcp_pub_sub: setup uplink topic" topic=au921-gateway-events
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="gateway/gcp_pub_sub: check if uplink subscription exists" subscription=au921-gateway-events
-loraserver
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="no geolocation-server configured"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="configuring join-server client" ca_cert= server="http://localhost:8003" tls_cert= tls_key=
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="no network-controller configured"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="applying database migrations"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="migrations applied" count=0
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="starting api server" bind="0.0.0.0:8000" ca-cert= tls-cert= tls-key=
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="starting downlink device-queue scheduler"
Nov 28 11:16:28 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:28-03:00" level=info msg="starting multicast scheduler"
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 28 11:16:29 camuzzi-outbound-service loraserver[15714]: time="2018-11-28T11:16:29-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats

There’s some logs and the I had to restart the network server, you can see a bunch of logs in the same time.

Thanks in advance!

It’s me again, it happened again last night.
Here are some logs, but they don’t provide much:

Nov 29 00:12:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:12:41-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:13:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:13:11-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:13:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:13:11-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:13:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:13:41-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:13:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:13:41-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:14:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:14:11-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:14:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:14:11-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:14:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:14:41-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:14:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:14:41-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:15:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:15:11-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:15:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:15:11-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:15:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:15:41-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:15:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:15:41-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:16:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:16:11-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:16:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:16:11-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:16:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:16:41-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:16:41 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:16:41-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 00:17:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:17:11-03:00" level=info msg="gateway/gcp_pub_sub: message received" gateway_id=b827ebfffe01adba type=stats
Nov 29 00:17:11 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T00:17:11-03:00" level=info msg="gateway updated" gateway_id=b827ebfffe01adba
Nov 29 09:26:48 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:26:48-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetGatewayProfile grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:26:48-03:00" grpc.time_ms=28.565 peer.address="[::1]:53242" span.kind=server system=grpc
Nov 29 09:27:07 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:27:07-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetVersion grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:27:07-03:00" grpc.time_ms=0.04 peer.address="[::1]:53242" span.kind=server system=grpc
Nov 29 09:27:25 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:27:25-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetServiceProfile grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:27:25-03:00" grpc.time_ms=1.662 peer.address="[::1]:53242" span.kind=server system=grpc
Nov 29 09:28:02 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:28:02-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetServiceProfile grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:28:02-03:00" grpc.time_ms=1.75 peer.address="[::1]:53242" span.kind=server system=grpc
Nov 29 09:28:02 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:28:02-03:00" level=info msg="service-profile updated" id=c9fed8f5-d2f3-4374-8c61-dc825641034c
Nov 29 09:28:02 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:28:02-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=UpdateServiceProfile grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:28:02-03:00" grpc.time_ms=51.287 peer.address="[::1]:53242" span.kind=server system=grpc
Nov 29 09:28:07 camuzzi-outbound-service loraserver[15714]: time="2018-11-29T09:28:07-03:00" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetDeviceProfile grpc.service=ns.NetworkServerService grpc.start_time="2018-11-29T09:28:07-03:00" grpc.time_ms=1.515 peer.address="[::1]:53242" span.kind=server system=grpc

Maybe a Pub/Sub issue? It’s like a hard timeout.

I need to look into this.

I’ll be glad to help with any kind of info!

I can describe you my current implementation step by step.

BTW the forum is a bit weird today?

EDIT: nevermind, a CTRL + F5 fixed it

I have the same problem on using GCP Pub/Sub described here. I am using NetworkServer and AppServer on Google kubernetes.
I switch to mqtt configuration for NetworkServer. For AppServer Pub/Sub configuration works ok.

Hi!

Anyone got anything new on this?

Last week I stopped receiving data from my devices. I found nothing on the logs and loraserver and appserver were still working, so I thought maybe was something on the devices side.

After checking that the gateway and the devices were still sendingto Google Cloud, dug a bit more and discovered that it was loraserver which was not talking with Pub/Sub, even though it was still working. And as happened to you, after a restart it everything got back to normal.

I have been trying to figure out in the last days what could went wrong, but still found anything. And for the moment the situation has not happened again.

It’s a long time since this post was created. If any of you got a solution about this, comments would be appreciated.

Thanks!

I’ve never been able to reproduce this (yet), but it would be really helpful when you could also report your LoRa Server version. If possible, please test with the latest version available.

This fact has happened a couple of times more since last time. Activated debug logging in PubSub but nothing seems to appear there, as PubSub is storing the messages and pushes them once the network server has been restarted.

In my case I have Lora Server 2.3 (App Server 2.4 and GW Bridge 2.6, but those seem to have no problem).

I will create a separate environment with v3 of every component that you have just released and test it.
Will let know if with the latest version it still happens.

Any solution over this thread ?

I might be not having same issue but suddenly internal Chirpstack Network Server stop getting uplink form end-device!! Why?? I have no idea …

& Neither I am getting data from end-device nor even LNS is sending anything to gateway or getting from gateway …

Everything is like stucked just … When I restarted my node after 5 min got JOIN req and everything got back normal , but again after 10 min same issue occured , everything got stucked but I got uplink fCnt jumped to 83 from 21 . … ??

How i this possible and Why such issues are occuring ??

???