Unable to Delete Gateway due to a lock in PostgreSQL Database

Hi, I’m encountering an issue for quite some time where I’m unable to remove the Gateway device from the Chirpstack Portal.

When I tried to delete the gateway, it takes a long time and then this error showed from the Inspect element:

{"error": "storage transaction rollback error: driver: bad connection:, "code":2, "message":"storage: transaction rollback error: driver: bad connection", :details: []}

After this failed to remove the gateway, suddenly the client connected to the Postgres database keep on increasing. I checked the pg_stat_activity table and found these Locked queries:

The first one is the Delete Query:

delete from gateway where mac = $1  

and after that, these queries keep on piling up:

select g.* from gateway g inner join network_server ns on ns.id = g.network_server_id where ns.gateway_discovery_enabled = true and g.ping = true and (g.last_ping_sent_at is null or g.last_ping_sent_at <= (now() - (interval '24 hours' / ns.gateway_discovery_interval))) order by last_ping_sent_at limit 1 for update

May I ask how do I fix this? Since I’m unable to delete most of the gateways. I only able to delete around 3 and I have around 100+ gateway in total.

Thank you

Here is one example from pg_stat_activity:

datid  |    datname    |  pid  | usesysid |    usename    | application_name |  client_addr  | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type |     wait_event      | state  | backend_xid | backend_xmin |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        query                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |    backend_type     
--------+---------------+-------+----------+---------------+------------------+---------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------
613766 | loraserver_as | 17744 |    16385 | loraserver_as |                  | 10.105.130.30 |                 |       54844 | 2023-06-27 11:10:03.660701+08 | 2023-06-27 11:10:12.209167+08 | 2023-06-27 11:10:12.211396+08 | 2023-06-27 11:10:12.211396+08 | Lock            | tuple               | active |   774543070 |    774542835 | delete from gateway where mac = $1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | client backend
 613766 | loraserver_as | 16406 |    16385 | loraserver_as |                  | 10.105.130.30 |                 |       20454 | 2023-06-27 11:07:36.071354+08 | 2023-06-27 11:08:48.83109+08  | 2023-06-27 11:08:48.831648+08 | 2023-06-27 11:08:48.83165+08  | Lock            | tuple               | active |             |    774542835 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    +| client backend
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 select                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                         g.*                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 from gateway g                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 inner join network_server ns                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                         on ns.id = g.network_server_id                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 where                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                         ns.gateway_discovery_enabled = true                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                         and g.ping = true                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                         and (g.last_ping_sent_at is null or g.last_ping_sent_at <= (now() - (interval '24 hours' / ns.gateway_discovery_interval)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 order by last_ping_sent_at                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 limit 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            +| 
        |               |       |          |               |                  |               |                 |             |                               |                               |                               |                               |                 |                     |        |             |              |                 for update                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 

I’m not sure what is causing the issue, but this might give a hint:

transaction rollback error: driver: bad connection

If for some reason the delete query takes a very long time, then it is very likely to affect other queries as well. Especially the query you are mentioning because it tries to lock the selected gateway records, which is not possible until the delete has finished.

Hi @brocaar thank u for the answer. I have one question, is there anything from AppServer, NetworkServer, or GatewayBrige log that I can check to find out why is the DELETE query took so long?

Because I have checked the Postgres DB and the connection from Kubernetes cluster where the Chirpstack is deployed to the DB and I can’t find any problem.

I also tried to onboard a new gateway and delete it and there is no problem with that. It seems like the deletion error only occurs on some of my older gateways.

Best Regards,

WK

Hi all, I found out the issue after @brocaar help me point out in the DELETE query.

So the issue is caused because the DELETE query took a long time to finish, and I think the LNS App just decided to timed out because it took a long time.

Here is the result when I tried to EXPLAIN the DELETE query, this query took 8 Minutes to complete:

postgres=# explain (analyze,buffers,timing)
postgres-# DELETE FROM gateway WHERE mac = E'\\x<MAC-ADDRESS>';
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN | Delete on gateway  (cost=0.00..5.39 rows=0 width=0) (actual time=0.116..0.116 rows=0 loops=1)
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN |   Buffers: shared hit=3 read=3 dirtied=1
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN |   ->  Seq Scan on gateway  (cost=0.00..5.39 rows=1 width=6) (actual time=0.062..0.074 rows=1 loops=1)
-[ RECORD 4 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN |         Filter: (mac = '\x<MAC-ADDRESS>'::bytea)
-[ RECORD 5 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN |         Rows Removed by Filter: 110
-[ RECORD 6 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN |         Buffers: shared hit=1 read=3 dirtied=1
-[ RECORD 7 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN | Planning Time: 0.144 ms
-[ RECORD 8 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN | Trigger for constraint gateway_ping_gateway_mac_fkey on gateway: time=315775.791 calls=1
-[ RECORD 9 ]-----------------------------------------------------------------------------------------------------
QUERY PLAN | Trigger for constraint gateway_ping_rx_gateway_mac_fkey on gateway: time=493.303 calls=1
-[ RECORD 10 ]----------------------------------------------------------------------------------------------------
QUERY PLAN | Trigger for constraint gateway_last_ping_id_fkey on gateway_ping: time=126707.910 calls=9699908
-[ RECORD 11 ]----------------------------------------------------------------------------------------------------
QUERY PLAN | Trigger for constraint gateway_ping_rx_ping_id_fkey on gateway_ping: time=37203.330 calls=9699908
-[ RECORD 12 ]----------------------------------------------------------------------------------------------------
QUERY PLAN | Execution Time: 481179.673 ms

Does anyone know how to fix this slow DELETE query? Thanks

Have you rebooted postgres?

Hi @datnus , yes I have rebooted the postgres multiple times. The Explanation from DELETE query above was from my local Postgres. I dump the database and ran the DELETE query there.

Seems like the slow query was caused by gateway ping table where during the delete query it trying to refer some foreign key or something that I don’t understand.

Is there a way to fix this? Please do let me know. Thanks

May be you should go to posgres forum to ask?

Hi @datnus I found out that the reason why it slow is because it cascades deletes everything on the gateway_ping table when we delete a gateway from either query or the Chirpstack portal and that’s the reason why the Gateway deletion quite slow.

For example, on one gateway if I run this query to list all the gateway_ping for certain MAC Address, it has a lot of records:

postgres=# SELECT count(*) FROM gateway_ping WHERE gateway_mac = E'\\x<MAC-ADDRESS>';
-[ RECORD 1 ]--
count | 9532153

Sorry, so that’s why I’m asking this community as I’m trying to find out what’s this gateway_ping table for and why some of my gateways have a lot of record on gateway_ping table. Is there a way for me to clear or check on something related to the gateway ping or something? So the delete function on Chirpstack App Server can be working again. Thanks

Best Regards,

WK

1 Like

Cascade delete is correct in this case. I believe.

May be you may want to run a check if your posgres has corrupted data.

You can try to delete all data in gateway_ping table for affected gateways?

I guess the table is not very important.

well… @datnus since deleting the gateway from the gateway table also cascades deleting the gateway_ping. I think I can just truncate the whole gateway_ping table if it is not important right?

But how can I determine that this won’t be happening again in the future?

Best Regards,

WK

Are you using the gateway ping feature? If not, you could turn it off or reduce the interval in which the pings are being sent. Please note that in v4 this feature has been removed as it doesn’t work (reliable) with all gateways and packet-forwarders.

1 Like

I see… thanks for the reply @brocaar. Let me check with my team and see if this fan fix the issue. I will inform this forum after that. Thanks

Best Regards,

WK

Hi @brocaar, I’m still checking with my team regarding the gateway ping feature. By the way, is it okay for me to truncate the gateway_ping table? It wont affect anything right? I hope that this can fix the failure to delete the gateway from the UI. Thanks

Best Regards,

WK

I believe truncating the gateway_ping should be fine.

This topic was automatically closed after 90 days. New replies are no longer allowed.