This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2014-01-04 11:54 by fredstober, last changed 2022-04-11 14:57 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| quopri-newline.patch | martin.panter, 2015-01-18 06:32 | |
| quopri-newline.v2.patch | martin.panter, 2015-01-20 04:43 | Also fix line break issues |
| Messages (8) | |||
|---|---|---|---|
| msg207281 - (view) | Author: Fred Stober (fredstober) | Date: 2014-01-04 11:54 | |
While trying to encode some binary data, I encountered this behaviour of the quopri_codec:
>>> '\r\n\n'.encode('quopri_codec').decode('quopri_codec')
'\r\n\r\n'
>>> '\n\r\n'.encode('quopri_codec').decode('quopri_codec')
'\n\n'
If this behaviour is really intended, the documentation should mention that this codec is not bijective.
|
|||
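The round-trip check from the report can be repeated on Python 3 with a small helper. This is a minimal sketch: `roundtrips` is a hypothetical name, and it calls `codecs.encode` and `codecs.decode` on bytes because `str.encode` rejects bytes-to-bytes codecs in Python 3. The result for mixed newlines may depend on the Python version, so no output is shown for those inputs.

```python
import codecs


def roundtrips(data: bytes) -> bool:
    """Return True if data survives a quopri_codec encode/decode cycle."""
    encoded = codecs.encode(data, "quopri_codec")
    return codecs.decode(encoded, "quopri_codec") == data


if __name__ == "__main__":
    # The last two samples mix newline conventions, as in the report.
    for sample in (b"\n\n", b"\r\n\r\n", b"\r\n\n", b"\n\r\n"):
        print(sample, roundtrips(sample))
```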
| msg207413 - (view) | Author: Vajrasky Kok (vajrasky) * | Date: 2014-01-06 07:24 | |
The quopri_codec uses the binascii.b2a_qp function.
>>> import binascii
>>> binascii.b2a_qp('\r\n\n\n\n')
'\r\n\r\n\r\n\r\n'
When dealing with newlines, b2a_qp checks whether the first line ends with \r\n or \n.
If it ends with \r\n, the newlines of all remaining lines are converted to \r\n. If it ends with \n, they are all converted to \n.
The source code has a comment about this:
/* See if this string is using CRLF line ends */
/* XXX: this function has the side effect of converting all of
* the end of lines to be the same depending on this detection
* here */
I am not sure what the appropriate action is here, but a doc fix should be acceptable.
|
|||
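The detection described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in the message, not the C implementation: `first_line_uses_crlf` and `unify_newlines` are hypothetical names, escaping and soft line breaks are ignored, and lone \r bytes are left untouched.

```python
def first_line_uses_crlf(data: bytes) -> bool:
    """Return True if the first \\n in data is preceded by \\r."""
    index = data.find(b"\n")
    return index > 0 and data[index - 1:index] == b"\r"


def unify_newlines(data: bytes) -> bytes:
    """Rewrite every newline to the convention used by the first line."""
    unified = data.replace(b"\r\n", b"\n")
    if first_line_uses_crlf(data):
        unified = unified.replace(b"\n", b"\r\n")
    return unified


if __name__ == "__main__":
    print(unify_newlines(b"\r\n\n"))  # first line ends with CRLF
    print(unify_newlines(b"\n\r\n"))  # first line ends with LF
```

Under this model the two inputs from the original report map to the outputs shown there, which is why the codec cannot round-trip data with mixed newlines.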
| msg232812 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2014-12-17 11:52 | |
RFC 1521 says that a text newline should be encoded as CRLF, and that any combination of 0x0D and 0x0A bytes that do not represent newlines should be encoded like other control characters as =0D and =0A. Since in Python 3 the codec outputs bytes, I don’t think there is any excuse for it to be outputting plain CR or LF bytes. The question is, do they represent newlines to be encoded as CRLF, or just data bytes that need ordinary encoding. |
|||
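For the "data bytes" reading, `binascii.b2a_qp` accepts `istext=False`, which escapes CR and LF like other control characters. The following is a minimal sketch, assuming short input so that no soft line breaks are inserted; `encode_binary` is a hypothetical wrapper name.

```python
import binascii


def encode_binary(data: bytes) -> bytes:
    """Quoted-printable encode data, treating CR and LF as plain bytes."""
    return binascii.b2a_qp(data, istext=False)


# Mixed newline bytes plus some non-text bytes.
payload = b"\r\n\n\x00\xff"
encoded = encode_binary(payload)
decoded = binascii.a2b_qp(encoded)

if __name__ == "__main__":
    print(encoded)
    print(decoded == payload)
```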
| msg232814 - (view) | Author: Marc-Andre Lemburg (lemburg) * (Python committer) | Date: 2014-12-17 12:26 | |
I agree with Vajrasky: a patch for the documentation would probably be a good idea.
Note that mixing line end conventions in a single text is never a good idea. If you stick to one line end convention, there's no problem with the codec, AFAICT.
>>> import codecs
>>> codecs.encode(b'\r\n\r\n', 'quopri_codec')
b'\r\n\r\n'
>>> codecs.decode(_, 'quopri_codec')
b'\r\n\r\n'
>>> codecs.encode(b'\n\n', 'quopri_codec')
b'\n\n'
>>> codecs.decode(_, 'quopri_codec')
b'\n\n'
|
|||
| msg232826 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2014-12-17 20:11 | |
Okay so maybe the documentation should include these restrictions on encoding:
* The data being encoded should only include \r or \n bytes that are part of \n or \r\n newline sequences. Encoding arbitrary non-text data is not supported.
* The two kinds of newlines should not be mixed.
* If \n is used for newlines in the input, the encoder will output \n newlines, and they will need converting to CRLF in a later step to conform to the RFC.
|
|||
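The "later step" from the last restriction could look like the sketch below. It is an assumption-laden illustration, not a documented recipe: `encode_text_for_rfc` is a hypothetical name, the input is assumed to be text whose only \r and \n bytes belong to newline sequences, and lone \r bytes are not handled.

```python
import quopri


def encode_text_for_rfc(text: bytes) -> bytes:
    """Encode text as quoted-printable with CRLF line endings."""
    # Use a single newline convention before encoding.
    unix_text = text.replace(b"\r\n", b"\n")
    encoded = quopri.encodestring(unix_text)
    # Convert hard and soft line breaks to CRLF afterwards.
    return encoded.replace(b"\n", b"\r\n")


if __name__ == "__main__":
    print(encode_text_for_rfc(b"a\nb\n"))
```

Normalising to \n first keeps the final replace from doubling an existing \r.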
| msg232827 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014-12-17 20:50 | |
The pure Python implementation returns a different result.
>>> import quopri
>>> quopri.encodestring(b'\r\n')
b'\r\n'
>>> quopri.a2b_qp = quopri.b2a_qp = None
>>> quopri.encodestring(b'\r\n')
b'=0D\n'
See also issue18022.
|
|||
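The comparison above can be run without permanently altering the `quopri` module. This is a sketch that relies on the same module attributes used in the message (`quopri.b2a_qp` and `quopri.a2b_qp`); `encode_both_ways` is a hypothetical helper, and whether the two results differ depends on the Python version.

```python
import quopri
from unittest import mock


def encode_both_ways(data: bytes):
    """Return (accelerated, pure_python) encodings of data."""
    accelerated = quopri.encodestring(data)
    # Temporarily hide the binascii helpers so quopri falls back
    # to its pure Python code path; they are restored on exit.
    with mock.patch.object(quopri, "b2a_qp", None), \
            mock.patch.object(quopri, "a2b_qp", None):
        pure_python = quopri.encodestring(data)
    return accelerated, pure_python


if __name__ == "__main__":
    print(encode_both_ways(b"\r\n"))
```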
| msg234223 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-01-18 06:32 | |
Here is a patch that clarifies in the documentation and test suite how newlines work in the "quopri" and "binascii" modules. It also fixes the native Python implementation to support CRLFs.
* \n is used by default (e.g. for soft line breaks if the input has no hard line breaks)
* CRLF is used instead if found in input (even in non-text mode!)
* Typo errors in documentation
* quopri uses istext=True
* header flag does not affect newline encoding; only istext affects it
One corner case concerns me slightly: binascii.b2a_qp(istext=False) will use \n for soft line breaks by default, but will suddenly switch to CRLF if the input data happens to contain a CRLF sequence. This is despite the CRLFs from the data being encoded and therefore not appearing in the output themselves.
|
|||
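The corner case can be observed by inspecting which soft line break appears in the output. This is a minimal sketch; `soft_break_style` is a hypothetical helper, and the result for input containing CRLF is whatever the running Python version produces.

```python
import binascii


def soft_break_style(data: bytes) -> str:
    """Report the soft line break used by b2a_qp in non-text mode."""
    encoded = binascii.b2a_qp(data, istext=False)
    # In non-text mode, CR and LF from the data are escaped, so any
    # raw newline in the output belongs to a soft line break.
    if b"=\r\n" in encoded:
        return "CRLF"
    if b"=\n" in encoded:
        return "LF"
    return "none"


if __name__ == "__main__":
    long_run = b"a" * 100  # long enough to force a soft line break
    print(soft_break_style(long_run))
    print(soft_break_style(b"\r\n" + long_run))
```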
| msg234343 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-01-20 04:43 | |
Here is patch v2, which fixes some more bugs I uncovered in the quoted-printable encoders:
* The binascii version would unnecessarily break a 76-character line (maximum length) if it would end with an =XX escape code
* The native Python version would insert soft line breaks in the middle of =XX escape codes
|
|||
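Both bugs concern properties that can be checked on encoder output: no line longer than 76 characters, and no soft line break splitting an =XX escape. The checker below is a sketch under those two assumptions only; `check_qp_lines` is a hypothetical name and it does not validate anything else about the encoding.

```python
import re

# A line body is valid if every "=" starts a complete =XX escape.
_VALID_BODY = re.compile(rb"(?:[^=]|=[0-9A-Fa-f]{2})*")


def check_qp_lines(encoded: bytes, limit: int = 76):
    """Return a list of problems found in quoted-printable output."""
    problems = []
    for number, line in enumerate(encoded.splitlines(), start=1):
        if len(line) > limit:
            problems.append("line %d is longer than %d characters" % (number, limit))
        # A trailing "=" marks a soft line break; drop it before checking.
        body = line[:-1] if line.endswith(b"=") else line
        if not _VALID_BODY.fullmatch(body):
            problems.append("line %d contains a broken =XX escape" % number)
    return problems


if __name__ == "__main__":
    print(check_qp_lines(b"abc=3D=\ndef\n"))
    print(check_qp_lines(b"abc=4=\n1\n"))
```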
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:56 | admin | set | github: 64320 |
| 2015-07-23 01:54:38 | martin.panter | link | issue20132 dependencies |
| 2015-01-20 04:44:00 | martin.panter | set | files: + quopri-newline.v2.patch; type: behavior; messages: + msg234343 |
| 2015-01-18 06:32:33 | martin.panter | set | files: + quopri-newline.patch; assignee: docs@python; components: + Documentation; keywords: + patch; nosy: + docs@python; messages: + msg234223 |
| 2014-12-17 20:50:35 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg232827 |
| 2014-12-17 20:11:15 | martin.panter | set | messages: + msg232826 |
| 2014-12-17 12:26:48 | lemburg | set | nosy: + lemburg; messages: + msg232814 |
| 2014-12-17 11:52:23 | martin.panter | set | nosy: + martin.panter; messages: + msg232812; versions: + Python 3.4 |
| 2014-01-06 07:24:25 | vajrasky | set | nosy: + vajrasky; messages: + msg207413 |
| 2014-01-04 15:28:53 | r.david.murray | set | nosy: + r.david.murray |
| 2014-01-04 11:54:50 | fredstober | create | |