REST API will not allow me to add devices without WiFi

I have a custom gateway built with a Raspberry Pi CM3+ module and a RAK2245 concentrator. The gateway has a cellular modem and a WiFi module, both connected over USB. Everything runs well on WiFi. I have a script that connects to AWS to upload data and fetch commands for the gateway.

In the field, this gateway will connect to AWS over cellular, and my problems begin when I disconnect the WiFi. Once I do, I can no longer use the REST API to add anything. Here is the output of my script working over WiFi:

Logging in to Chirpstack API: http://localhost:8080/api/internal/login
{
"password": "admin",
"username": "admin"
}
<Response [200]>
Received message 1 from topic 'Network/Down/GW0000010103': {"14":"1003005,1,Test2,2659339aff8342cdbf5cba1f30392a9a"}
{'14': '1003005,1,Test2,2659339aff8342cdbf5cba1f30392a9a'}
Add new device command received. GUID: 1003005, Type: 1, Name: Test2, Key: 2659339aff8342cdbf5cba1f30392a9a
77828478000f4dfd
Checking if device exists: http://localhost:8080/api/devices/77828478000f4dfd
<Response [404]>
{'error': 'object does not exist', 'code': 5, 'message': 'object does not exist', 'details': }
Device does not exist. Adding device.
First getting application and device profile IDs.
Getting Application ID: http://localhost:8080/api/applications?limit=1000
Application name: PI-Data-Server id: 1
Getting Device Profile ID: http://localhost:8080/api/device-profiles?limit=1000
Device profile name: Position Indicator id: bb464183-94f4-49e3-a257-919f14cadb91
{
"device":{"applicationID": "1", "devEUI": "77828478000f4dfd", "deviceProfileID": "bb464183-94f4-49e3-a257-919f14cadb91", "name": "Test2"}
}
Creating device http://localhost:8080/api/devices
<Response [200]>
{}
Setting device keys http://localhost:8080/api/devices/77828478000f4dfd/keys
<Response [200]>
{}
Verifying Device Profile…http://localhost:8080/api/devices/77828478000f4dfd
<Response [200]>
{'device': {'devEUI': '77828478000f4dfd', 'name': 'Test2', 'applicationID': '1', 'description': '', 'deviceProfileID': 'bb464183-94f4-49e3-a257-919f14cadb91', 'skipFCntCheck': False, 'referenceAltitude': 0, 'variables': {}, 'tags': {}}, 'lastSeenAt': None, 'deviceStatusBattery': 256, 'deviceStatusMargin': 256, 'location': None}
{'devEUI': '77828478000f4dfd', 'name': 'Test2', 'applicationID': '1', 'description': '', 'deviceProfileID': 'bb464183-94f4-49e3-a257-919f14cadb91', 'skipFCntCheck': False, 'referenceAltitude': 0, 'variables': {}, 'tags': {}}
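
For reference, the command handling visible in this output could be sketched roughly as follows. This is not the actual script: `parse_add_device`, `device_body`, and `DEV_EUI_PREFIX` are hypothetical names, and the devEUI rule (a fixed prefix plus the GUID as eight hex digits) is only inferred from this log, where GUID 1003005 is 0x0f4dfd.

```python
import json

# Assumption: inferred from the devEUI printed in the log, not confirmed.
DEV_EUI_PREFIX = "77828478"


def parse_add_device(message):
    """Split the comma-separated value under key "14" into its four fields."""
    payload = json.loads(message)
    guid, dev_type, name, key = payload["14"].split(",")
    return {"guid": int(guid), "type": int(dev_type), "name": name, "key": key}


def device_body(cmd, application_id, device_profile_id):
    """Build the JSON body that the log shows being sent to /api/devices."""
    dev_eui = f"{DEV_EUI_PREFIX}{cmd['guid']:08x}"
    return {
        "device": {
            "applicationID": application_id,
            "devEUI": dev_eui,
            "deviceProfileID": device_profile_id,
            "name": cmd["name"],
        }
    }


message = '{"14":"1003005,1,Test2,2659339aff8342cdbf5cba1f30392a9a"}'
cmd = parse_add_device(message)
body = device_body(cmd, "1", "bb464183-94f4-49e3-a257-919f14cadb91")
print(json.dumps(body))
```

Nothing here touches the network, so the body can be checked against the log before any request is sent.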

When I unplug the WiFi, I can still access the API directly by running a Python script from the command line. Adding a device works the same as above. I can also bring up my cellular ppp0 interface and everything works fine.

When I reboot without the WiFi plugged in, to test conditions as they will be in the field, I cannot add a device with the same Python script. My output looks like this:

Logging in to Chirpstack API: http://localhost:8080/api/internal/login
{
"password": "admin",
"username": "admin"
}
<Response [200]>
77828478000fa0e7
Checking if device exists: http://localhost:8080/api/devices/77828478000fa0e7
<Response [404]>
{'error': 'object does not exist', 'code': 5, 'message': 'object does not exist', 'details': }
Device does not exist. Adding device.
First getting application and device profile IDs.
Getting Application ID: http://localhost:8080/api/applications?limit=1000
Application name: PI-Data-Server id: 1
Getting Device Profile ID: http://localhost:8080/api/device-profiles?limit=1000
Device profile name: Position Indicator id: bb464183-94f4-49e3-a257-919f14cadb91
{
"device":{"applicationID": "1", "devEUI": "77828478000fa0e7", "deviceProfileID": "bb464183-94f4-49e3-a257-919f14cadb91", "name": "afq"}
}
Creating device http://localhost:8080/api/devices
<Response [500]>
{'error': 'context deadline exceeded', 'code': 2, 'message': 'context deadline exceeded', 'details': }
Setting device keys http://localhost:8080/api/devices/77828478000fa0e7/keys
<Response [404]>
{'error': 'object does not exist', 'code': 5, 'message': 'object does not exist', 'details': }
Verifying Device Profile…http://localhost:8080/api/devices/77828478000fa0e7
<Response [404]>
{'error': 'object does not exist', 'code': 5, 'message': 'object does not exist', 'details': }

So I can log in and read data through the API, but I cannot create a device. The error response {'error': 'context deadline exceeded', 'code': 2, 'message': 'context deadline exceeded', 'details': } does not give me any clues. Bringing up the ppp0 interface does not help either.
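
Note that in the failing run the script carries on after the 500, so the two 404 responses that follow are just consequences of the device never being created. A minimal sketch of stopping at the first failed step is below. `run_steps` and the fake step functions are hypothetical and only replay the status codes from the output above.

```python
def run_steps(steps):
    """Run (label, call) pairs in order and stop at the first non-2xx status.

    Each call is a zero-argument function returning an HTTP status code.
    """
    results = []
    for label, call in steps:
        status = call()
        results.append((label, status))
        if not 200 <= status < 300:
            break  # later steps depend on this one, so do not attempt them
    return results


# Stand-ins that return the status codes seen in the failing run.
failing_run = [
    ("create device", lambda: 500),
    ("set device keys", lambda: 404),
    ("verify device", lambda: 404),
]

for label, status in run_steps(failing_run):
    print(label, status)
```

With this shape only the create step is reported, which keeps the real error from being buried under follow-on 404s.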

I’m lost. Any ideas?

Also, if I plug the WiFi back in, it does not work until I run: sudo ifconfig ppp0 down. Then I can bring the ppp0 interface back up, and it works. If I unplug the WiFi, it keeps working until the next reboot.

Why would it need the WiFi for the API to work?

Also relevant: this is the error from the journal for the application server:

Jun 05 03:26:46 raspberrypi chirpstack-application-server[447]: time="2020-06-05T03:26:46Z" level=error msg="finished unary call with code Unknown" ctx_id=1818b692-7891-44f6-b804-f6956e15bd1b error="rpc error: code = Unknown desc = context deadline exceeded" grpc.code=Unknown grpc.method=Get grpc.service=api.DeviceService grpc.start_time="2020-06-05T03:26:41Z" grpc.time_ms=5042.61 peer.address="127.0.0.1:56572" span.kind=server system=grpc
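
The log line is in key=value form, so the interesting fields can be pulled out with a few lines of Python. The sketch below uses an abridged copy of the line above (ctx_id, start time, and peer address dropped) with plain quotes. `parse_logfmt` is a hypothetical helper, not part of ChirpStack.

```python
import shlex


def parse_logfmt(line):
    """Return the key=value pairs of a log line as a dict.

    shlex keeps quoted values such as msg="..." together; tokens without
    an "=" (for example the journal prefix) are ignored.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


SAMPLE = (
    'time="2020-06-05T03:26:46Z" level=error '
    'msg="finished unary call with code Unknown" '
    'error="rpc error: code = Unknown desc = context deadline exceeded" '
    "grpc.code=Unknown grpc.method=Get grpc.service=api.DeviceService "
    "grpc.time_ms=5042.61"
)

fields = parse_logfmt(SAMPLE)
print(fields["grpc.service"], fields["grpc.method"], fields["grpc.time_ms"])
```

The call ran for just over five seconds before failing, which suggests a timeout of roughly that length was hit rather than an immediate rejection.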

I have a similar setup with a custom gateway running on an RPi CM3+, but I am running it all with Docker (a custom Docker build for armv7).
I have only been testing with the application server web interface, but I see the same results.
When I load one of the “broken” endpoints (127.0.0.1:8080/#/organizations/1/device-profiles/<device-profile-uuid>) with an internet connection and then disconnect, that endpoint and all other “broken” endpoints continue to work fine. However, if I don’t load one of these endpoints before losing the internet connection, they are “broken” and always return 'context deadline exceeded', 'code': 2.

This is the same using either localhost or 127.0.0.1, and the same with either the default ChirpStack Docker install on x86 Ubuntu or the custom Docker build for the RPi CM3+ (armv7).

@jamesod are you using Docker at all and did you manage to make any progress on this issue?

After some more digging, it seems to be a DNS issue.
The following tests were performed with NO internet connection from the moment the stack was started (docker-compose):

  1. Network server in Application server set to

    • chirpstack-network-server:8000
  2. Accessing http://localhost:8080/#/organizations/1/device-profiles/d170d1db-1e6d-473f-aef6-f74c83b7e784

chirpstack-application-server_1  | time="2020-08-18T00:27:29Z" level=warning msg="creating insecure network-server client" server="chirpstack-network-server:8000"
chirpstack-application-server_1  | time="2020-08-18T00:27:34Z" level=error msg="finished unary call with code Unknown" ctx_id=918fb316-d7a1-48dc-a32b-d04646d16a21 error="rpc error: code = Unknown desc = context deadline exceeded" grpc.code=Unknown grpc.method=Get grpc.service=api.DeviceProfileService grpc.start_time="2020-08-18T00:27:29Z" grpc.time_ms=5004.372 peer.address="127.0.0.1:37134" span.kind=server system=grpc
  3. Checking network server IP from Application server container:
$ docker exec -it dba3124c2a97 nslookup chirpstack-network-server
nslookup: can't resolve '(null)': Name does not resolve

Name:      chirpstack-network-server
Address 1: 172.18.0.6 chirpstack-docker_chirpstack-network-server_1.chirpstack-docker_default
  4. Setting Network server in Application server to

    • 172.18.0.6:8000
  5. Accessing http://localhost:8080/#/organizations/1/device-profiles/d170d1db-1e6d-473f-aef6-f74c83b7e784

chirpstack-network-server_1      | time="2020-08-18T00:59:08Z" level=info msg="finished unary call with code OK" ctx_id=b9b7a6c5-e3a9-40a3-adf9-4dca6db57ed5 grpc.code=OK grpc.method=GetDeviceProfile grpc.service=ns.NetworkServerService grpc.start_time="2020-08-18T00:59:08Z" grpc.time_ms=1.365 peer.address="172.18.0.4:35718" span.kind=server system=grpc
chirpstack-application-server_1  | time="2020-08-18T00:59:08Z" level=info msg="finished client unary call" ctx_id=db08d14d-7b4a-407a-9dba-6c8d92c13c4d grpc.code=OK grpc.ctx_id=b9b7a6c5-e3a9-40a3-adf9-4dca6db57ed5 grpc.duration=2.343795ms grpc.method=GetDeviceProfile grpc.service=ns.NetworkServerService span.kind=client system=grpc
chirpstack-application-server_1  | time="2020-08-18T00:59:08Z" level=info msg="finished unary call with code OK" ctx_id=db08d14d-7b4a-407a-9dba-6c8d92c13c4d grpc.code=OK grpc.method=Get grpc.service=api.DeviceProfileService grpc.start_time="2020-08-18T00:59:08Z" grpc.time_ms=5.045 peer.address="127.0.0.1:38284" span.kind=server system=grpc

This now works fine for all endpoints.
Obviously this isn’t ideal for a Docker container, as the Docker network IP can change on each startup. For now I am skirting the issue by running a script on startup of the stack that gets the IP and sets it in the App server.
@brocaar is this potentially a bug in the gRPC lookup, or can this be solved with some Docker/DNS-related fix?
EDIT: these tests were performed with the latest DockerHub images using chirpstack-docker, as well as a Docker build of the latest chirpstack-application-server from git.
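
For anyone wanting to try the same startup workaround, the lookup half could be sketched as below. It resolves the network server's hostname once and builds the ip:port string to enter as the network server address. `network_server_address` is a hypothetical helper. It would need to run somewhere the name resolves (for example inside the application server container), and writing the result back to the App server is left out because that part depends on the setup.

```python
import socket


def network_server_address(hostname, port=8000, resolve=socket.gethostbyname):
    """Resolve hostname to an IPv4 address and return it as "ip:port".

    The resolver is a parameter so the function can be exercised without
    a Docker network; by default it uses the system resolver.
    """
    ip = resolve(hostname)
    return f"{ip}:{port}"


# Stand-in resolver returning the address from the nslookup output above.
print(network_server_address(
    "chirpstack-network-server",
    resolve=lambda name: "172.18.0.6",
))
```

Because the container IP can change on each startup, the lookup has to be repeated every time the stack comes up.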