-
Couldn't load subscription status.
- Fork 1.1k
umqtt hangs when connecting to a MQTT v3.1 implementation on SUBACK #205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...tion type returns to subscribe channel it should be taking into account the 9 part of 0x9x, not the entire byte, as 1001 represent acknowledgement, regardless. 0x92 for example, and 0x90 should return the same operation code back to subscribe function to avoid a deadlock. Thingsboard for example, returns a 0x92 code as the octect instead of 0x90 as mosquitto. This fixes deadlock issues whiel trying to subscribe to Java MQTT implementations.
Thanks, but please follow https://github.com/micropython/micropython-lib/blob/master/CONTRIBUTING.md to format the commit message. And detailed commit message is great, but the final touch would be a reference to the MQTT spec section which mandates what you describe.
OK. The problem really comes down to what version of the MQTT protocol umqtt is actually supporting. I couldn't find any indication in the source code of what version of MQQT the client supports. From the looks of the implementation it is MQTT v3.1.1. (msg = bytearray(b"\x04MQTT\x04\x020円0円")). The server, at least in the io.netty java implementation ignores the protocol level, and acts according to the 3.1.1 spec "If the protocol name is incorrect the Server MAY disconnect the Client, or it MAY continue processing the CONNECT packet in accordance with some other specification. In the latter case, the Server MUST NOT continue to process the CONNECT packet in line with this specification"
According to the specification for v3.1,
http://public.dhe.ibm.com/software/dw/webservices/ws-mqtt/mqtt-v3r1.html#suback
the header for a SUBACK, contains a byte which first quartet is 1001, that is 0x9, the second quartet contains one bit for the DUP flag, two bits for QoS level, and one bit for the Retain flag.
In the specification for MQTT v3.1.1,
http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718068
specifies the same exact first quartet of bits for SUBACK as 3.1.; nonetheless, the second quartet of a SUBACK is reserved and MUST BE exactly 0 0 0 0.
The specification for 3.1.1. regarding flags,
http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718022
states that "Where a flag bit is marked as "Reserved" in Table 2.2 - Flag Bits, it is reserved for future use and MUST be set to the value listed in that table [MQTT-2.2.2-1]. If invalid flags are received, the receiver MUST close the Network Connection". In other words, the correct behavior for the umqtt client, if it were an implementation of MQTT 3.1.1, would be to check that the second quartet of the SUBACK byte is precisely 0x00, if not, it should disconnect from the server. Currently, changing the code in that particular way it is moot, as the point is that without clear indication of the supported protocol version, or a framework to incorporate and distinguish v3.1 or v3.1.1 of MQTT, there is no point in fixing the problem one way or another (i.e. either by disconnecting or simply ignoring the second quartet of bits (which is what I proposed in the pull request)).
Perhaps the maintainer and main contributor of the umqtt module could comment on problem, and suggest a way we could either support v3.1, or simply specify that v3.1.1 is the only supported protocol, and hence change the behavior of the client on first connect to simply disconnected from a v3.1 MQTT implementation.
cprima
commented
Mar 30, 2018
This fix, which was not merged, helped me to subscribe to a Thingsboard MQTT implementation.
Without it the subscribe call just hangs indefinitely.
I changed my local copy of simple.py to that trivial addition -- and am now making it past the nested wait_msg() inside subscribe()
Correction to the mask on the return of the operation, when the operation type returns to subscribe channel it should be taking into account the 9 part of 0x9x, not the entire byte, as 1001 represent acknowledgement, regardless. 0x92 for example, and 0x90 should return the same operation code back to subscribe function to avoid a deadlock. Thingsboard implementation of mqtt for example, returns a 0x92 code as the ack for a subscribe instead of 0x90 as mosquitto. This fixes deadlock issues while trying to subscribe to Java MQTT implementations.