-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix issue in umqtt when MQTT CONNECT packet is greater than 127 bytes #163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This module is written to avoid useful allocations and especially reallocations (fragment memory, cause issue with low-heap systems), and your patch brings up bunch of that.
Hi Paul,
Understood. How else can we address this issue? Can we allocate and array of size 13 first, write the correct size into it, and then throw away the unused 0-3 bytes depending on the total message size ? Otherwise, is there a way for us to know the total size to write, and then allocate 10-13 depending on that total size ?
Cheers,
Damien
Note that this patch only works for messages up to 255 bytes length, if it's larger than this then msg[1] contains a truncated size to start with.
How else can we address this issue?
One way is to support only messages up to 255 bytes length and always use 2 bytes to store the remaining length. If the length is 127 or less then the first byte is just 0x80. This seems to be allowed by the spec.
255 bytes is enough for my use case, but would be nice to fix this once and for all.
Is it possible to do the following:
- Allocate 13 bytes for header.
- As you are adding up the remaining length field, do the % 128 operation each time it passes 128
- Write msg[1] then msg[2] as needed based on the size
- Write the remaining 8 bytes of the header into remaining indexes
- Write the body starting in position 11, 12, or 13 depending on the total size.
Sorry if I am missing something obvious, or misunderstanding the allocation strategies that micropython uses!
One way is to support only messages up to 255 bytes length and always use 2 bytes to store the remaining length. If the length is 127 or less then the first byte is just 0x80. This seems to be allowed by the spec.
This seems to be not disallowed by spec, but unfortunately, interoperability behavior may vary. See 4a5965b
@dmascord : Also please see patch above for an idea how to do it. General idea is that you should precalculate message size first and allocate needed bytearray once. The calculation can be conservatively higher (i.e. you may allocate buffer to be able to hold 2 bytes of msg length), but encoding should use minimal no. of bytes needed (and they you will use 2nd arg of .write() to send out exact no. of bytes needed).
What about the latest committed approach ?
umqtt.simple/umqtt/simple.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid these and above changes are unrelated to the topic of this PR and should be submitted separately.
umqtt.simple/umqtt/simple.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, this ladder doesn't look good. The idea to save memory, any memory. Here you try to avoid reallocations of heap, but by storing 4 times more than needed in code constants area (which is usually still in heap). Optimization is a subtle matter.
As I mentioned, you should allocate bytearray once, conservatively large enough, but fill it in in dynamic way. If there's a static trailer, you can fill it in with slice assignment syntax, as long as on the left and right side there's the same amount of data. Then it's essentially a memcpy(), not insertion or deletion. E.g.:
foo[3:4] = b"fo"
You replace 2 bytes with another 2 bytes, that's ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... so, is there a way for me to see the memory usage in the different areas? So I can write the code changes, and test to see if I am doing it memory efficiently ?
Ie, If I allocate bytearray(b"\x100円0円0円0円0円0円0円0円0円0円0円0円0円0円"), (ie, the largest amount), and then fill that dynamically (ie, don't create a new sz variable, or enc_byte), then once I know that I won't need the extra bytes, can I right trim() to remove the excess ?
I'm sorry if I am asking very basic questions, but I need a bit of guidance to get this patch acceptable for the environment that it is running on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reduce the bytearray to bytearray(b"x04MQTT\x04\x020円0円"), and then write the size to the socket as it is being calculated ? That way we can just have the allocation of the int, and not have to do any reallocations or writing into the bytearray ? How expensive is writing to the socket (ie, the function call) vs having an extra byte that will be thrown away ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... so, is there a way for me to see the memory usage in the different areas? So I can write the code changes, and test to see if I am doing it memory efficiently ?
There're some, but you don't need to go that deep. Speculative reasoning (well, reasoning at all) about the changes you make should be generally enough (assuming you know common-sense principles of memory allocation, etc.)
then once I know that I won't need the extra bytes, can I right trim() to remove the excess ?
You would need to do that if that was a large buffer. For something of dozen of bytes, no need to bother.
If you want a crash course into this stuff, http://sqlite.org/malloc.html may be a good one, especially section 4. But again, you don't need all that, except perhaps for these important conclusions:
In other words, unless all memory allocations are of exactly the same size (n=1) then the system needs access to more memory than it will ever use at one time. Furthermore, we see that the amount of surplus memory required grows rapidly as the ratio of largest to smallest allocations increases, and so there is strong incentive to keep all allocations as near to the same size as possible.
That shows the idea - instead of allocating something of 7, 8, 9, or 10 bytes at different times, it's better always allocate 10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reduce the bytearray to bytearray(b"x04MQTT\x04\x020円0円"), and then write the size to the socket as it is being calculated ?
This shows exactly the kind of thinking we (MicroPython maintainers) would like to provoke in contributors to the project. On a first glance, viable and good idea, I'll leave it to you to prove it.
How expensive is writing to the socket (ie, the function call) vs having an extra byte that will be thrown away ?
Function calls are expensive, but the most pressing resource in uPy systems in uPy systems in memory. The heuristic of that is that you should avoid writing output byte by byte, but writing a million integers one by one is better than storing them in the array and writing at once (the latter won't work as there simply won't that much memory on most of uPy systems).
OK, have reworked it again, but now I have a commit chain to clean up :)
@dmascord , thanks for updates, it may take me some time to look thru them, sorry.
No worries, take your time, I am running my patches in my environment anyway.
martyvis
commented
Apr 26, 2017
Just +1ing the need for this patch. Losant's broker uses a 24 byte client ID, 36 byte user ID and 64 byte password.
@martyvis : Do you +1 the need for this patch, or your having tested it?
martyvis
commented
Apr 26, 2017
What's the easiest way to test? Currently running esp8266 1.8.7 stable release (esp8266-20170108-v1.8.7.bin) Can I get a bin file with the patch?
I'm afraid only you may know what is the easiest way for you to test it. But unless people who experience issues get together and work on resolving them (like - testing each other's patches), we'll have very slow progress ;-(.
martyvis
commented
Apr 26, 2017
I am happy to test, but haven't explored building my own binary. Is it in the daily builds or can I get it from somewhere else?
Hi martyvis,
I have just built the latest micropython binary based on the current git (as of 868453d3d83ea8d3ed161bd20f2f6cb59a396562), with the umqtt/simple.py "frozen".
http://tusker.org/firmware-combined-umqtt-20170426.bin
For your comfort level, at the moment, git diff shows the following:
diff --git a/esp8266/esp8266.ld b/esp8266/esp8266.ld
index 960c751..e5292ce 100644
--- a/esp8266/esp8266.ld
+++ b/esp8266/esp8266.ld
@@ -5,7 +5,7 @@ MEMORY
dport0_0_seg : org = 0x3ff00000, len = 0x10
dram0_0_seg : org = 0x3ffe8000, len = 0x14000
iram1_0_seg : org = 0x40100000, len = 0x8000
- irom0_0_seg : org = 0x40209000, len = 0x87000
+ irom0_0_seg : org = 0x40209000, len = 0xa7000
}
/* define common sections and symbols */
diff --git a/esp8266/modesp.c b/esp8266/modesp.c
index 432fd5a..ce75c1c 100644
--- a/esp8266/modesp.c
+++ b/esp8266/modesp.c
@@ -163,7 +163,7 @@ STATIC mp_obj_t esp_flash_user_start(void) {
if (IS_OTA_FIRMWARE()) {
return MP_OBJ_NEW_SMALL_INT(0x3c000 + 0x90000);
} else {
- return MP_OBJ_NEW_SMALL_INT(0x90000);
+ return MP_OBJ_NEW_SMALL_INT(0xb0000);
}
}
STATIC MP_DEFINE_CONST_FUN_OBJ_0(esp_flash_user_start_obj, esp_flash_user_start);
martyvis
commented
Apr 27, 2017
Yes this works nicely, Damien! Losant authenticates the connection and accepts the published message. ( Changing last 8 digits of strings to 8s)
>>> device_id='5901b3ce0d10990088888888'
>>> access_key=b'077e3118-1154-4691-901b-38fd88888888'
>>> access_secret=b'1cbf42ccc2b44b9df601b10278a224bb1ce898e951da9b9674fbb9b988888888'
>>> topic="losant/{0}/state".format(device_id)
>>> topic
'losant/5901b3ce0d10990088888888/state'
>>> c=MQTTClient(device_id,server,user=access_key,password=access_secret)
>>> c.connect()
0
>>> data['count']=1.23456
>>> msg
{'data': {'count': 1.23456}}
>>> c.publish(topic,ujson.dumps(msg))
And in Losant
Reported By Device at 04/27/2017 7:27:57 PM { "count": 1.23456 }
At least for me it would be good to see it merged with the master branch. (I actually ended up working around this by using Mosquitto as a bridge). Nothing is in production - I'm just trying to get my head around Losant and hoping that micropython proves to be a good way of prototyping IoT edge devices
@martyvis : Ok, thanks for testing!
@dmascord : For future cases, while you're waiting for review, feel free to squash incremental changes together and check compliance with https://github.com/micropython/micropython-lib/wiki/ContributorGuidelines. (No need to change anything now, I'll handle that myself.)
Uh oh!
There was an error while loading. Please reload this page.
using logic from http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc385349213
Please see sample code which works (with this code fix) connecting to Azure IoT: