Lora-app-server does not remove messages from downlink_queue?

stefanbrudny · November 30, 2017, 10:05pm

Hi,
I am using a LoPy and custom code for the sensor, so it MIGHT BE my mistake (but so far I have no clue what’s wrong).

What:
I send the command to MQTT server:
{"fPort":1,"data":"bGVkRW5hYmxlLHRydWU=","confirmed":true}
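
For reference, the `data` field above is the base64 encoding of the ASCII string `ledEnable,true`. A minimal sketch of how such a queue item could be built is shown below. The helper name `build_downlink` is hypothetical, and the `.../tx` topic is an assumption that mirrors the `application/1/node/70b3d54999134adf/rx` topic visible in the server log; check it against your LoRa App Server version before relying on it.

```python
import base64
import json


def build_downlink(command: str, f_port: int = 1, confirmed: bool = True) -> str:
    """Return the JSON document for one downlink queue item (hypothetical helper)."""
    # The "data" field carries the application payload as base64 text.
    data = base64.b64encode(command.encode("ascii")).decode("ascii")
    return json.dumps({"fPort": f_port, "data": data, "confirmed": confirmed})


# Assumed topic layout, derived from the .../rx topic seen in the log.
topic = "application/1/node/70b3d54999134adf/tx"
payload = build_downlink("ledEnable,true")
print(topic, payload)
```

The resulting string can be published with any MQTT client.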

This successfully reaches the device and the device acts accordingly: it sends an uplink message, which can be observed on the gateway:

JSON up: {"rxpk":[{"tmst":4043324075,"chan":1,"rfch":1,"freq":868.300000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":9.5,"rssi":-46,"size":29,"data":"QCzwvQKAuh0CTV78tVq0IKBjWj+TYKFKOhdYxKg="}]}
INFO: [up] PUSH_ACK received in 0 ms
INFO: [down] PULL_RESP received - token[0:0]

JSON down: {"txpk":{"imme":false,"tmst":4044324075,"freq":868.3,"rfch":0,"powe":14,"modu":"LORA","datr":"SF7BW125","codr":"4/5","ipol":true,"size":32,"data":"oCzwvQKFsgEDUgcAAQFj21hf/gNOZxEMFbwPovEyuK0="}}
INFO: tx_start_delay=1495 (1495.500000) - (1497, bw_delay=1.500000, notch_delay=0.000000)

as well as on the server:

Nov 30 22:56:59 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:56:59+01:00" level=info msg="gateway: received udp packet from gateway" addr=192.168.1.29:35358 protocol_version=2 type=PullData
Nov 30 22:56:59 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:56:59+01:00" level=info msg="gateway: sending udp packet to gateway" addr=192.168.1.29:35358 protocol_version=2 type=PullACK
Nov 30 22:57:00 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:00+01:00" level=info msg="gateway: received udp packet from gateway" addr=192.168.1.29:44355 protocol_version=2 type=PushData
Nov 30 22:57:00 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:00+01:00" level=info msg="gateway: rxpk packet received" addr=192.168.1.29:44355 data=QCzwvQKAsR0CELWWUUIX7I9HwXMaZ2fNuOjzh389 mac=a21c52fffe9dbbfa
Nov 30 22:57:00 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:00+01:00" level=info msg="backend: publishing packet" topic="gateway/a21c52fffe9dbbfa/rx"
Nov 30 22:57:00 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:00+01:00" level=info msg="gateway: sending udp packet to gateway" addr=192.168.1.29:44355 protocol_version=2 type=PushACK
Nov 30 22:57:00 LoRa loraserver[423]: time="2017-11-30T22:57:00+01:00" level=info msg="backend/gateway: rx packet received"
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="packet(s) collected" dev_eui=70b3d54999134adf gw_count=1 gw_macs=a21c52fffe9dbbfa mtype=UnconfirmedDataUp
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="rx info sent to network-controller" dev_eui=70b3d54999134adf
Nov 30 22:57:01 LoRa lora-app-server[577]: time="2017-11-30T22:57:01+01:00" level=info msg="handler/mqtt: publishing data-up payload" topic="application/1/node/70b3d54999134adf/rx"
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="mac-command block added to queue" cid=LinkADRReq dev_eui=70b3d54999134adf frmpayload=false
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="adr request added to mac-command queue" dev_eui=70b3d54999134adf dr=5 nb_trans=0 req_dr=5 req_nb_trans=1 req_tx_power_idx=2 tx_power=0
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="device-session saved" dev_addr=02bdf02c dev_eui=70b3d54999134adf
Nov 30 22:57:01 LoRa lora-app-server[577]: time="2017-11-30T22:57:01+01:00" level=info msg="device-queue item updated" id=1
Nov 30 22:57:01 LoRa lora-app-server[577]: time="2017-11-30T22:57:01+01:00" level=info msg="data-down item requested by network-server" confirmed=true dev_eui=70b3d54999134adf fcnt=425 id=1
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="received data down from application" confirmed=true data_base64="C/IRc/rx46A5BP6a25s=" dev_eui=70b3d54999134adf fcnt=425 more_data=false
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="pending mac-command block set" cid=LinkADRReq commands=1 dev_eui=70b3d54999134adf frm_payload=false
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="mac-command block removed from queue" cid=LinkADRReq dev_eui=70b3d54999134adf
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="backend/gateway: publishing tx packet" topic="gateway/a21c52fffe9dbbfa/tx"
Nov 30 22:57:01 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:01+01:00" level=info msg="backend: packet received" topic="gateway/a21c52fffe9dbbfa/tx"
Nov 30 22:57:01 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:01+01:00" level=info msg="gateway: sending udp packet to gateway" addr=192.168.1.29:35358 protocol_version=2 type=PullResp
Nov 30 22:57:01 LoRa loraserver[423]: time="2017-11-30T22:57:01+01:00" level=info msg="device-session saved" dev_addr=02bdf02c dev_eui=70b3d54999134adf
Nov 30 22:57:01 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:01+01:00" level=info msg="gateway: received udp packet from gateway" addr=192.168.1.29:35358 protocol_version=2 type=TXACK
Nov 30 22:57:01 LoRa lora-gateway-bridge[417]: time="2017-11-30T22:57:01+01:00" level=info msg="gateway: tx ack received" mac=a21c52fffe9dbbfa random_token=0

So the question is: why does lora-app-server keep scheduling the payload forever instead of removing the message from the queue?

I suspect that the LoPy, for some reason, does not send the confirmation in the first place, because I do not see it on the GW and, as a straightforward consequence, I cannot see it in the loraserver log either.
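
One way to check this from the gateway output is to decode the unencrypted frame header of a `data` field and look at the ACK bit in FCtrl. The sketch below assumes a LoRaWAN 1.0.x data frame (MHDR, 4-byte DevAddr, FCtrl, 2-byte FCnt, all little-endian); `parse_header` is a hypothetical helper name, and it does not handle join frames or verify the MIC.

```python
import base64

# Message types as encoded in the top three bits of MHDR (LoRaWAN 1.0.x).
MTYPES = [
    "JoinRequest", "JoinAccept", "UnconfirmedDataUp", "UnconfirmedDataDown",
    "ConfirmedDataUp", "ConfirmedDataDown", "RFU", "Proprietary",
]


def parse_header(phy_b64: str) -> dict:
    """Decode the plain-text header fields of a data frame (hypothetical helper)."""
    raw = base64.b64decode(phy_b64)
    mhdr, fctrl = raw[0], raw[5]
    return {
        "mtype": MTYPES[mhdr >> 5],
        "dev_addr": raw[1:5][::-1].hex(),      # DevAddr is sent little-endian
        "adr": bool(fctrl & 0x80),
        "ack": bool(fctrl & 0x20),             # set when acknowledging a confirmed frame
        "fopts_len": fctrl & 0x0F,
        "fcnt": int.from_bytes(raw[6:8], "little"),
    }


uplink = parse_header("QCzwvQKAuh0CTV78tVq0IKBjWj+TYKFKOhdYxKg=")
downlink = parse_header("oCzwvQKFsgEDUgcAAQFj21hf/gNOZxEMFbwPovEyuK0=")
print(uplink)
print(downlink)
```

For the two frames from the gateway output above, this reports an UnconfirmedDataUp with the ACK bit clear and a ConfirmedDataDown. The acknowledgement would have to show up as `ack` being true in the next uplink after the confirmed downlink.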

This has been biting me for some time. I had confirmations tested and working, and now, with the simplest Python code, I can’t get them back.

Best,
SB

stefanbrudny · November 30, 2017, 10:52pm

Getting closer. It seems I did not test well enough back then: the behaviour is exactly the same with a totally wrong setup (confirmations turned off…), so the LoPy must be skipping the confirmations on its own… The problem here is that even if it is possible to send a confirmation message from the LoPy, it is not possible to detect the ID of the original message. Yuck. Thankfully, given the low message rate expected on the LoPy, best-effort will work.

Anyway, this is the wrong forum, switching to the LoPy one.

brocaar · December 1, 2017, 11:35am

Please note that I’m currently refactoring the device-queue, also to be able to add future Class-B support. The behavior of the queue is going to change.

What will happen is that the device-queue will be moved to LoRa Server, and LoRa App Server will only store the mapping from the downlink FCnt to the reference you supplied. Instead of endlessly retrying, you will receive a nACK and the item will be popped from the queue.

stefanbrudny · December 1, 2017, 2:03pm

Huh, OK. However, is this newly planned behaviour standardized by the protocol? I guess not.

This should be configurable from the GUI, as this feature is expected to work by default without any user interaction.

brocaar · December 1, 2017, 3:19pm

The current implementation isn’t standardized by the protocol either. The new implementation, however, will be closer to the protocol. See for example the Class C and Class B timeouts mentioned in the LoRaWAN Backend Interfaces documentation (part of the device-profile).

The problem with the current implementation is that LoRa App Server holds the queue and LoRa Server pulls an item from the application-server at every receive window. This won’t be scalable for Class-B, as each downlink frame must be transmitted at a specific ping slot. It would not be logical to put this scheduling in the LoRa App Server domain, and it won’t be scalable to let LoRa Server pull from LoRa App Server for every (Class-B) device at every ping slot.

The reason to pop an item from the queue and send a nACK to LoRa App Server is that for a downlink re-transmission the same frame-counter can’t be used. As the encrypted payload will be stored in the LoRa Server device-queue, this means that it is automatically invalidated as the encryption is tied to a specific FCnt.
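
To illustrate why a stored ciphertext is tied to one frame counter: in LoRaWAN 1.0.x the FRMPayload is XORed with an AES keystream generated from 16-byte blocks A_i that contain the direction, the DevAddr and the FCnt. The sketch below only builds that input block (it performs no AES, and `keystream_block_input` is a hypothetical helper name); the DevAddr and FCnt values are taken from the log earlier in the thread.

```python
import struct


def keystream_block_input(dev_addr: int, fcnt: int, downlink: bool, i: int = 1) -> bytes:
    """Build block A_i of the LoRaWAN 1.0.x payload encryption (hypothetical helper).

    Layout: 0x01 | 4 x 0x00 | Dir | DevAddr (LE) | FCnt (LE, 32 bit) | 0x00 | i
    """
    direction = 1 if downlink else 0
    return struct.pack("<BIBIIBB", 0x01, 0, direction, dev_addr, fcnt, 0x00, i)


a_425 = keystream_block_input(0x02BDF02C, 425, downlink=True)
a_426 = keystream_block_input(0x02BDF02C, 426, downlink=True)
print(a_425.hex())
print(a_426.hex())
```

Because the FCnt is part of the block that is fed to AES, a payload encrypted for FCnt 425 decrypts to garbage if it is sent under FCnt 426, so a retransmission with a new counter needs a fresh encryption.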

stefanbrudny · December 1, 2017, 5:07pm

Which actually means that the item can’t even be stored in LoRa Server, as it would have to be re-encrypted with the new FCnt, if I get this right.

This, however, leaves us with a new task: deciding what to do with failed transmissions at the application level. And that’s OK, we shall reschedule.

BTW, Class B is, at least in my opinion, doomed to start much later, if ever, as deployment is far more complicated, which leaves much more room in the business space for going with GSM instead. But that’s another story, and the server needs to be 100% compliant with the specs, including 1.1, where Class B leaves experimental mode.

brocaar · December 1, 2017, 5:17pm

As I see it, the encrypted FRMPayload will be stored together with the uplink fCnt in LoRa Server, so that LoRa Server can decide when to schedule it. LoRa Server will then notify LoRa App Server in case of a failure*, nACK and ACK.

(*a failure could be that the fCnt is not valid anymore, the max-payload size exceeded as the data-rate changed, …)

LoRa App Server will forward these errors, nACK and ACKs over MQTT (or HTTP integration) and then the end-application is able to decide if it should be re-scheduled. If so, it re-enqueues this using MQTT, LoRa App Server then encrypts it using the next FCntUp value and sends this to LoRa Server for future transmission.
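
The application-side decision described above could look roughly like the following sketch. It is not the LoRa App Server API: the class name `DownlinkTracker`, the event fields `reference` and `acknowledged`, and the retry limit are all assumptions made for illustration, and the actual publishing over MQTT or HTTP is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DownlinkTracker:
    """Tracks confirmed downlinks by reference and decides on re-enqueueing."""
    max_attempts: int = 3
    pending: dict = field(default_factory=dict)  # reference -> [payload, attempts]

    def enqueue(self, reference: str, payload: dict) -> dict:
        """Remember a payload that is about to be published for the first time."""
        self.pending[reference] = [payload, 1]
        return payload

    def on_ack(self, event: dict):
        """Handle an ACK/nACK event; return the payload to re-enqueue, or None."""
        entry = self.pending.get(event["reference"])
        if entry is None:
            return None                      # unknown or already finished
        if event["acknowledged"]:
            del self.pending[event["reference"]]
            return None                      # delivered, nothing more to do
        payload, attempts = entry
        if attempts >= self.max_attempts:
            del self.pending[event["reference"]]
            return None                      # give up after max_attempts
        entry[1] = attempts + 1
        return payload                       # caller publishes this again


tracker = DownlinkTracker(max_attempts=2)
item = tracker.enqueue("led-1", {"fPort": 1, "data": "bGVkRW5hYmxlLHRydWU=", "confirmed": True})
retry = tracker.on_ack({"reference": "led-1", "acknowledged": False})
print(retry is item)
```

Keeping the retry policy in the end-application means each re-enqueued payload goes through encryption again, so it is never sent with a stale frame counter.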

as deployment is far more complicated

Could you elaborate on this? Once LoRa Server supports Class-B and the end-user has a gateway with GPS time sync and a Class-B capable device, what else would be complicated?

stefanbrudny · December 4, 2017, 10:08pm

Time sync?

I have quite some experience with large-scale networks, with time synchronization WITHOUT GPS (OK, GPS is used to insert the initial Time of Day (ToD)). Based on GPS (or not), an initial PPS signal is constructed and maintained across the whole, typically nationwide, network.

This means experience with technologies like SDH synchronization, the Nimbra Vision custom (?) sync protocol, SyncE and PTPv2.

Of course, GPS reduces the pain of distributing precise time over wires. Still, I am afraid of poor HW (the SPI interface on low-cost GWs introducing jitter), poor 3.xx kernels on old devices introducing extra routines, 3G networking introducing poor latency on input, etc.

By the way: the original problem seems to be worked around. The device was going into deep sleep too quickly, and I have yet to fix that properly.

brocaar · December 11, 2017, 10:45am

In the latest release it is up to your application to re-enqueue a confirmed message. LoRa App Server will either send an ACK or nACK over MQTT (or the HTTP integration). Please see the latest release-notes:

https://forum.loraserver.io/t/release-lora-app-server-0-15-lora-server-0-23/492

stefanbrudny · December 12, 2017, 11:01pm

Yes, I’ve read through both the release notes and the standard once more. Unfortunately, this is only my hobby for now, as I have failed to get my employer engaged. Still trying, though, and we are at a medium scale (several tens of millions of devices Manston).