-
-
Notifications
You must be signed in to change notification settings - Fork 7k
Fix buffer retransmission and flush hang problems in hardware serial support #6855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note that I've not included one of the changes in #3865 - making tx_buffer_head == _tx_buffer_tail in write() atomic. This is because
- I couldn't see this change discussed anywhere (may have missed it)
- Does not compile for large buffers as defines local variable in scope of the for loop body generated
by the ATOMIC_BLOCK macro - Does not seem necessary to fix either of the problems raised in #3745.
There may of course be other potential problems that this does address but given the slight performance penalty I think it may need further discussion before implementation.
Also I've only tested that the specific issues are fixed, and on the Arduino Mega platform only.
Note that I've not included one of the changes in #3865 - making tx_buffer_head == _tx_buffer_tail in write() atomic.
This change is needed when the TX buffer is > 256 bytes. In that case, the head and tail are 2 bytes, so there is potential for reading half of the value before the ISR updates it and half of the value after. In fact, all (read and write) access of tail (which is written by the ISR) should be atomic, and all write access of head and tail (which are read by the ISR) should be atomic (using TX_BUFFER_ATOMIC). It might be even clearer to make all accesses to the head and tail atomic.
This is a tricky problem to reproduce, since it will only happen when:
- The ISR triggers exactly between the two read instructions
- The ISR modifies both bytes of
_tx_buffer_tail
I had a look around the code and it seems there's a lot of head and tail accesses other than the one in write()
that would have to be made atomic. If you want to add a (separate) commit for that, that would be great, but if not that's fine too (it's only for a corner case that requires changes to HardwareSerial.h and is documented there). I don't think this should block merging this PR.
I have two more minor remarks:
- The commit message of your first commit is a bit confusing, it talks about unconditional atomicity using
ATOMIC_BLOCK
but then does not apply that. The message should probably just focus on the introduction of theTX_BUFFER_ATOMIC
macro and explain why that is useful? - The last commit should probably provide a bit more detail about the race condition it protects against (in the commit message, or probably better in a comment in the code). Something like:
If TXC is cleared before writing UDR, and the previous byte completes before writing to UDR, TXC will be set, but a byte is still being transmitted (causing
flush()
to return too soon). So, UDR must be written before clearing TXC. If this is not done atomically, there is a chance interrupts delay the TXC clear so the byte written to UDR is transmitted (setting TXC) before clearing TXC (especially at high baudrates), so then TXC will be cleared but no bytes are left to transmit (causingflush()
to hang).
Other than that, this PR looks great to me!
I understood the possibility of the ISR changing the second byte of the buffer tail in between a read of the first and second byte, but wasn't sure why only this particular example was fixed since as you say there are several others. Anyway, to avoid making too many changes in the same pull request and causing possible delay I think I'll create a new PR for making buffer pointer accesses atomic where needed for large buffers.
Now to figure out how to change commit messages ..
Agreed! As for changing commit messages, check out interactive rebase (e.g. git rebase -i origin/master
) and then use the "reword" action (or "edit" combined with git commit --amend
to edit the contents of a previous commit).
New macro TX_BUFFER_ATOMIC makes the following code block atomic only if the transmit buffer is larger than 256 bytes. SREG is restored on completion. The macro is then used to simplify code for availableForWrite()
Moving the head buffer pointer and setting interrupt flag is now atomic in write(). Previously an intervening ISR could empty the buffer before the second ISR is triggered causing retransmission. Fixes: #3745 (original issue only)
Preserve values of configuration bits MPCMn and U2Xn. Avoid setting other read-only bits for datasheet conformance. See #3745
Changes requested by @matthijskooijman made locally by interactive rebase and forced pushed to the remote - hopefully that's an appropriate workflow?
Make write to UDR and clearing of TXC bit in flush() atomic to avoid race condition. Fixes #3745 (second different issue introduced later but discussed in the same issue)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's perfect. This PR looks good to merge to me. @facchinm, perhaps you have some more hardware to test this on?
@matthijskooijman looks good to me too; I'll test it later tomorrow with a handful of boards. It would probably be a good idea to apply it to https://github.com/arduino/arduinocore-avr so we can start using it proficiently
@facchinm how did the testing go? Anything I might need to do?
No problem on my side, all tests were just fine 😉
I'm trying to understand how we can handle the transition to https://github.com/arduino/ArduinoCore-avr and if it's better to merge it here (for the fix to appear in nightly) or there (so 1.6.21 will be based on it)
@matthijskooijman any idea?
@facchinm, I have no clue about the status of that external repository, so can't really comment on that (other than it appears that we have two places for the Arduino AVR core now, which seems like it could cause confusion. If we're going to migrate to there, we should probably remove the avr core here? Merging PR's can then probably be done using git format-patch
, a bit of sed
and git am
, or perhaps git even has something easier to translate paths?)
Hi @facchinm I see this was merged into ArduinoCore-avr a while ago. It isn't appearing in the current beta download yet though (Arduino-PR-beta1.9-BUILD27) even though the AVR core has been removed from the beta branch. Is that how it should be?
The beta channel currently tracks the latest official release of AVR core (1.6.20). We are working on the infrastructure but, at the moment, we are severely outnumbered compared to the number of projects we are involved in, so some efforts are slowing down "a bit". Sorry for that.
Uh oh!
There was an error while loading. Please reload this page.
Fixes #3745 fixes #6821
Based on #3865 by @NicoHood and suggestions from @matthijskooijman
Fixes two separate issues both discussed in #3745: serial buffer re-transmissions under heavy interrupt load and flush hangs at high baud rates. Tested with testcase for the buffer transmission problem and @NicoHood's testcase for the flush hang problem as posted in #3745 with both default buffer size and large buffer size (512 bytes) on Arduino Mega.
Also incorporates suggestions from @matthijskooijman:
setting read-only bits in the USCRnA register