App Server Devices Not Visible Periodically

Hey,

We’ve been having an issue for a while, with both v2 and v3, where periodically the app server will stop displaying the list of devices. The rest of the functionality appears to work correctly. Restarting the app server fixes this.

I’ve tried turning on debug logging to identify where the issue occurs, but all I get is a large number of SQL queries printed to the log, and they look identical before and after the restart; the query results aren’t printed.

Does anyone have any suggestions as to how I might narrow down the cause of this? We’re running the server, gateway bridge, app server, mosquitto, postgres, and redis on the same Ubuntu server at the moment.

As the web interface uses the REST API, the best approach would be to see whether you can reproduce this when interacting directly with the REST API.

We have seen this, too. I reported it back in June, but because it is intermittent, it is difficult to diagnose. Best strategy would be to run repeated queries against the server until a failure occurs. At some point I had a hypothesis that it was an expired token issue.
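A minimal sketch of such a polling loop, in Python, might look like the following. It is not something we have run against this exact setup: the function names are made up for illustration, and the URL and auth header have to come from your own installation. The client-side timeout matters, because a hung request would otherwise block the loop forever.

```python
import time
import urllib.request


def http_probe(url, headers, timeout=10.0):
    """Issue one GET and return the HTTP status code.

    Raises on a timeout, a connection error, or a 4xx/5xx response.
    """
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def poll_until_failure(probe, interval=30.0, max_attempts=None, sleep=time.sleep):
    """Call probe() repeatedly until it raises.

    Returns (attempt_number, start_time, exception) for the first failure,
    or None if max_attempts is reached without one.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        started = time.time()
        try:
            probe()
        except Exception as exc:  # timeout, HTTP error, connection reset, ...
            return attempt, started, exc
        sleep(interval)
    return None
```

Usage would be something like `poll_until_failure(lambda: http_probe(BASE_URL + "/api/devices", HEADERS))`, where `BASE_URL` and `HEADERS` are placeholders for your server address and whatever header carries the token. The returned start time gives you a timestamp to line up against the app server log.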

Hey,

It’s taken a while for the issue to reoccur, but it has and I can’t find anything relevant in the logs.

I logged in via the API and I’m able to get a JWT and successfully query applications, gateways, organisations, users, etc., but any query to the /api/devices endpoint is met with an nginx 504 gateway time-out.

Querying the app server directly, bypassing nginx (i.e. on port 8080), yields no response at all: the request hangs and had still not timed out after several minutes.
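For anyone wanting to repeat this comparison, here is a rough Python sketch that probes several endpoints with a client-side timeout and reduces each outcome to a short label, so a hang shows up as "timeout" rather than blocking indefinitely. The function names are hypothetical, and the base URL, paths, and headers are whatever your installation uses.

```python
import socket
import urllib.error
import urllib.request


def classify(fetch):
    """Run fetch() and reduce the outcome to a short label."""
    try:
        status = fetch()
    except urllib.error.HTTPError as exc:   # server answered with 4xx/5xx (e.g. a 504)
        return f"http {exc.code}"
    except (socket.timeout, TimeoutError):  # no answer within the client timeout
        return "timeout"
    except urllib.error.URLError as exc:    # refused, DNS failure, or a wrapped timeout
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return f"error: {exc.reason}"
    return f"ok {status}"


def probe_all(base_url, paths, headers, timeout=15.0):
    """Probe each path under base_url and return {path: label}."""
    results = {}
    for path in paths:
        def fetch(path=path):
            req = urllib.request.Request(base_url + path, headers=headers)
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.status
        results[path] = classify(fetch)
    return results
```

Running `probe_all` once against the nginx address and once against port 8080, with paths such as `/api/applications` and `/api/devices`, puts the two behaviours side by side. Depending on the version, the list endpoints may need extra query parameters.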

In the app server log, whenever I query /api/applications I see several query entries containing my username (args="[username]"). This does not happen when I query /api/devices. I can’t see any entries in the log that obviously relate to my request (the log is quite spammy with debug enabled, however).
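One way to cut through the noise would be to note the log's size just before sending the request and then read only what was appended afterwards. Below is a rough Python sketch of that idea. It assumes the app server writes to a plain log file (if it logs to syslog or journald instead, this does not apply as written), and the helper names are hypothetical.

```python
import os


def read_appended(path, offset):
    """Return (lines written after byte `offset`, new offset)."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(data)


def mentioning(lines, needle):
    """Keep only the lines that contain `needle`."""
    return [line for line in lines if needle in line]
```

The idea is to take `offset = os.path.getsize(path)`, issue the request, wait a moment, and then call `read_appended(path, offset)`. Filtering the result with `mentioning(lines, 'args="[username]"')` shows whether the request got as far as the username queries at all. Log rotation between the two steps would invalidate the offset.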

So it looks a bit like something is failing prior to the first log entry regarding the request.

Upon restarting the app server the /api/devices endpoint works immediately. There is a log entry produced with my username when I request the endpoint.

Any thoughts on how to extract further debug information?

Hey,

This appears to be occurring a couple of times per day at the moment; does anyone have any suggestions for narrowing down the cause on our installation?