Message 327082 - Python tracker

Author	nascheme
Recipients	nascheme, xtreak
Date	2018-10-04 20:17:40
Message-id	<1538684260.3.0.545547206417.issue34801@psf.upfronthosting.co.za>

Content
Thank you for the research. The problem is indeed that \v is being treated as a line separator. That is an intentional design choice, see:
https://bugs.python.org/issue12855
It would seem to have some surprising implications for CSV parsing. E.g. if someone embeds a \v character in a quoted field, parsing the file using codecs.getreader() will cause the field to be split across two rows.
Someone else has run into the same issue:
https://www.enigma.com/blog/the-secret-world-of-newline-characters
I'm not sure anything should be done. Perhaps we should do something to reduce the chances that people trip over this issue. E.g. if I want to parse a file containing Unicode text with the CSV module, how do I do it while allowing \v characters (or other newline-like characters other than \n) within fields?
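
One possible answer to the question above, offered as a rough sketch and not as a recommendation from this issue: decode the file with io.TextIOWrapper (or open() in text mode) and pass newline="", as the csv documentation suggests. That layer only recognises \n, \r and \r\n as line endings, while str.splitlines() also splits on \v. The sample bytes and variable names below are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the quoted field contains a \v (0x0b) character.
raw = b'id,note\r\n1,"first\x0bsecond"\r\n'

# str.splitlines() treats \v as a line boundary.
parts = "first\x0bsecond".splitlines()
print(parts)

# io.TextIOWrapper only recognises \n, \r and \r\n as line endings.
# newline="" leaves those endings untranslated so csv can handle them itself.
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline="")
rows = list(csv.reader(text_stream))
print(rows)
```

For a file on disk, open(path, encoding="utf-8", newline="") goes through the same text layer, so the \v stays inside its field.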

History
Date	User	Action	Args
2018-10-04 20:17:40	nascheme	set	recipients: + nascheme, xtreak
2018-10-04 20:17:40	nascheme	set	messageid: <1538684260.3.0.545547206417.issue34801@psf.upfronthosting.co.za>
2018-10-04 20:17:40	nascheme	link	issue34801 messages
2018-10-04 20:17:40	nascheme	create
