This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add support for reading records with arbitrary separators to the standard IO stack
Type: enhancement
Stage: resolved
Components: IO, Library (Lib)
Versions: Python 3.4
process
Status: closed
Resolution: later
Dependencies:
Superseder:
Assigned To:
Nosy List: Douglas.Alan, abarnert, akira, amaury.forgeotdarc, benjamin.peterson, calestyo, eric.araujo, facundobatista, georg.brandl, jcon, maggyero, martin.panter, ncoghlan, nessus42, pconnell, pitrou, r.david.murray, ralph.corderoy, rhettinger, wolma, ysj.ray
Priority: normal
Keywords: patch

Created on 2005-02-26 07:24 by ncoghlan, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
pep-newline.txt | abarnert, 2014-07-21 00:32 | draft PEP for expanding the newline argument
pep-peek.txt | abarnert, 2014-07-21 00:33 | draft PEP for adding IOBase.peek, making this easier for end users to solve
io-newline-issue1152248.patch | akira, 2014-07-26 02:06 | Added support for alternative newlines in _pyio.TextIOWrapper. Updated documentation. Added more io tests. No C implementation. No implementation for binary files.
io-newline-issue1152248-2.patch | akira, 2014-07-26 17:02 | Reuploaded the patch so that it applies cleanly on the current tip
Messages (43)
msg61179 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2005-02-26 07:24
There is no canonical way to iterate through a file in
chunks *other* than whole lines without reading the
whole file into memory.
Allowing the separator to be specified as an argument
to file.readlines and file.xreadlines would greatly
simplify the task.
See here for an example interface of the useful options:
http://mail.python.org/pipermail/python-list/2005-February/268482.html 
msg61180 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-02-26 07:38
I don't know whether (x)readlines is the right place, since
you are _not_ operating on lines.
What about (x)readchunks?
msg61181 - (view) Author: Douglas Alan (nessus42) Date: 2005-02-28 18:57
In reply to birkenfeld, I'm not sure why you don't want to
call lines separated with an alternate line-separation
string "lines", but if you want to call them something else,
I should think they should be called "records" rather than
"chunks".
|>oug
msg61182 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-06-27 04:25
The OP's request is not a non-starter. There is a proven
precedent in AWK which allows programmer specifiable record 
separators.
msg61183 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2005-06-27 09:29
As Douglas Alan's sample implementation (and his second
attempt [1]) show, getting this right (and reasonably
efficient) is actually a non-trivial exercise. Leveraging
the existing readlines infrastructure is an idea worth
considering.
[1]
http://mail.python.org/pipermail/python-list/2005-February/268547.html 
msg61184 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2005-06-27 11:22
Seems the most likely place you'd want to use this is to select a non-
native line ending in a situation where you didn't want to use universal
newlines (select \r as a line ending on Unix, for example, and allow
\n to just be another character). In that case they'd clearly still be
lines, so embellishing the normal line reading machinery without
adding a new method would be most appropriate.
msg63060 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-02-27 02:41
Raymond disapproved it, Skip discouraged it, and Nick didn't push it any
more, all more than two years ago.
Nick, please, if you feel this is worthwhile, raise the discussion in
python-dev.
msg63067 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-02-27 08:48
For the record, I thought it was a reasonable request.
AWK has a similar feature. The AWK book shows a number of example 
uses. Google's codesearch shows that the feature does get used in the 
field: http://www.google.com/codesearch?q=lang%3Aawk+RS&hl=en
I think this request should probably be kept open.
msg63068 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-02-27 11:08
Sorry, I misunderstood you. I assign this to myself to give it a try.
msg63134 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-02-29 11:58
The mail.python.org link I posted previously is broken. Here's an
updated link to the relevant c.l.p. thread:
http://mail.python.org/pipermail/python-list/2005-February/310020.html
From my point of view, I still think it's an excellent idea and would be
happy to review a patch, but I'm unlikely to get around to implementing
it myself.
Also keep in mind that we now have the option of doing this only for the
new io module in Python 3.0 - it may be easier to do that and implement
something in pure Python rather than having to deal with the 2.x file
implementation.
(P.S. I found the double negative in Raymond's original comment a little
tricky to parse even as a native English speaker. I would also take
Skip's comment as merely discouraging adding a completely new method
rather than the original idea)
msg64084 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-03-19 18:52
I took a look at it...
It's not as not-complicated as I originally thought.
The way would be to adapt the Py_UniversalNewlineFread() function to
*not* process the normal separators (like \n or \r), but the passed one.
A critical point would be to handle more-than-1-byte characters... I
concur with Nick that this would be better suited for Py3k.
So, I'm stepping down from this, and flagging it for that version.
msg82188 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2009-02-15 23:58
Any further work on this should wait until the io-in-c branch has landed
(or at least be based on that branch).
msg87801 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-05-15 09:18
> cat temp
this is$#a weird$#file$#
> ./python
Python 3.1b1+ (py3k:72632:72633M, May 15 2009, 05:11:27)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('temp', newline='$#')
[50354 refs]
>>> f.readlines()
['this is$#', 'a weird$#', 'file$#', '\n']
All I did was comment out the 'newline' argument validity check in textio.c.
msg87802 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2009-05-15 10:17
While RDM's quick test is encouraging, I think one of the key things is
going to be developing tests for the various cases:
- binary mode, single byte line ending
- binary mode, multi-byte line ending
- text mode, single byte single char line ending*
- text mode, multi-byte single char line ending
- text mode, multiple char line ending
The text mode tests would need to cover a variety of encodings (e.g.
ASCII, latin-1, UTF-8, UTF-16, UTF-32 and maybe something like koi8-r
and/or some of the CJK codecs).
*if applicable to codec under test
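To illustrate why the text mode cases need to cover several encodings, here is a minimal sketch (not part of any patch on this issue). It shows that one separator character maps to different byte sequences depending on the codec, and that a naive byte-level search can match inside a different character. The sample character U+010A is just an assumption chosen to trigger the false match.

```python
# How one separator character is encoded under several codecs.
separator = "\n"
encodings = ["ascii", "latin-1", "utf-8", "utf-16-le", "utf-32-le"]
encoded = {name: separator.encode(name) for name in encodings}
for name, raw in encoded.items():
    print(f"{name:10} {raw!r}")

# A byte-level search for b"\n" in UTF-16 data can hit another character:
# U+010A is encoded as the two bytes 0x0A 0x01 in UTF-16-LE.
text = "\u010a"                      # contains no newline at all
raw = text.encode("utf-16-le")
false_hit = b"\n" in raw             # True, although text has no "\n"
```

So a text-mode implementation would presumably have to search the decoded text (or search on code-unit boundaries), rather than scan the raw bytes for the encoded separator.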
msg87803 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-15 11:13
-1 on this idea. readlines() exists precisely because line endings are
special when it comes to text IO (because of the various platform
differences).
If you want to split on any character, you can just use read() followed
by split(). No need to graft additional complexity on the file IO classes.
msg87805 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-15 11:24
And it's certainly not easy to do correctly :)
msg87806 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-15 11:25
Uh, trying again to remove the keyword :-(
msg87807 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-15 11:34
Ok, let me qualify my position a bit:
- -1 for abusing the newline parameter
- -1 for abusing readlines()
- +0 on an additional method ("readchunks" was suggested) which does the
splitting, either on a single character or on a string
Please bear in mind the latter should involve, for each of the C and
Python implementations:
- a generic unoptimized version for BufferedIOBase
- a generic unoptimized version for TextIOBase
- an optimized version for BufferedReader/BufferedRandom
- an optimized version for TextIOWrapper
However, it is certainly an interesting task for someone wanting to play
with C code, optimizations, etc.
msg87808 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2009-05-15 11:46
I agree with Antoine - given that the newlines parameter now deals with
Skip's alternate line separator use case, a new method "readrecords"
that takes a mandatory record separator makes more sense than using
readlines to read things that are not lines. (of course, taking the
alternate line ending use case away also reduces the total number of use
cases for the new method).
Note that the problem with the read()+split() approach is that you
either have to read the whole file into memory (which this RFE is trying
to avoid) or you have to do your own buffering and so forth to split
records as you go. Since the latter is both difficult to get right and
very similar to what the IO module already has to do for readlines(), it
makes sense to include the extra complexity there.
msg87817 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-15 13:07
> Note that the problem with the read()+split() approach is that you
> either have to read the whole file into memory (which this RFE is trying
> to avoid) or you have to do your own buffering and so forth to split
> records as you go. Since the latter is both difficult to get right and
> very similar to what the IO module already has to do for readlines(), it
> makes sense to include the extra complexity there.
I wonder how often this use case happens though. Usually you first split
on lines, and only then you split on another character or string (think
CSV files, HTTP headers, etc.).
When you don't split on lines, conversely, you probably have a binary
format, and binary formats have more efficient ways of chunking (for
example, a couple of bytes at the beginning indicating the length of the
chunk).
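For comparison, the length-prefixed chunking mentioned above can be sketched as follows. This is only an illustration: the 4-byte big-endian header and the helper names `write_chunk` and `read_chunks` are assumptions, not a format or API taken from this issue.

```python
import io
import struct

def write_chunk(stream, payload):
    """Write a 4-byte big-endian length, then the payload bytes."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def read_chunks(stream):
    """Yield payloads until the stream is exhausted.

    Assumes a buffered stream whose read(n) returns n bytes unless EOF is hit.
    """
    while True:
        header = stream.read(4)
        if not header:
            return
        if len(header) < 4:
            raise EOFError("truncated length header")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated payload")
        yield payload

buf = io.BytesIO()
for item in (b"alpha", b"", b"with\nnewline"):
    write_chunk(buf, item)
buf.seek(0)
chunks = list(read_chunks(buf))
```

No separator search is needed here, and the payload may contain any byte, which is the efficiency argument being made for binary formats.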
msg87823 - (view) Author: Douglas Alan (nessus42) Date: 2009-05-15 17:46
Antoine Pitrou <report@bugs.python.org> wrote:
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
> > Note that the problem with the read()+split() approach is that you
> > either have to read the whole file into memory (which this RFE is trying
> > to avoid) or you have to do your own buffering and so forth to split
> > records as you go. Since the latter is both difficult to get right and
> > very similar to what the IO module already has to do for readlines(), it
> > makes sense to include the extra complexity there.
> I wonder how often this use case happens though.
Every day for me. The reason that I originally brought up this request
some years back on comp.lang.python was that I wanted to be able to use
Python easily like I use the xargs program.
E.g.,
 find -type f -regex 'myFancyRegex' -print0 | stuff-to-do-on-each-file.py
With "-print0" the line separator is changed to null, so that you can
deal with filenames that have newlines in them.
("find" and "xargs" traditionally have used newline to separate files,
but that fails in the face of filenames that have newlines in them, so
the -print0 argument to find and the "-0" argument to xargs were
thankfully eventually added as a fix for this issue. Nulls are not
allowed in filenames. At least not on Unix.)
> When you don't split on lines, conversely, you probably have a binary
> format,
That's not true for the daily use case I just mentioned.
|>ouglas
P.S. I wrote my own version of readlines, of course, as the archives of
comp.lang.python will show. I just don't feel that everyone should be
required to do the same, when this is the sort of thing that sysadmins
and other Unix-savvy folks are wont to do on a daily basis.
P.P.S. Another use case is that I often end up with files that have
been transferred back and forth between Unix and Windows and
god-knows-what-else, and the newlines end up being some weird mixture of
carriage returns and line feeds (and sometimes some other stray
characters such as "=20" or somesuch) that many programs seem to have a
hard time recognizing as newlines.
msg109038 - (view) Author: Ralph Corderoy (ralph.corderoy) Date: 2010-07-01 10:05
Google has led me here because I'm trying to see how to process find(1)'s -print0 output with Python. Perl's -0 option and $/ variable makes this trivial.
 find -name '*.orig' -print0 | perl -n0e unlink
awk(1) has its RS, record separator, variable too. There's a clear need, and it should also be possible to modify or re-open sys.stdin to change the existing separator.
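As a rough sketch of what handling find(1)'s -print0 output looks like in Python today, the hypothetical helper `iter_nul_records` below splits a binary stream on NUL bytes without reading it all into memory. The helper name, the read size, and the stand-in input are assumptions made for illustration only.

```python
import io
import os

def iter_nul_records(stream, read_size=8192):
    """Yield NUL-terminated records from a binary stream, without the NUL."""
    tail = b""
    while True:
        chunk = stream.read(read_size)
        if not chunk:
            break
        # Everything before the last NUL is complete; keep the rest for later.
        *records, tail = (tail + chunk).split(b"\0")
        yield from records
    if tail:
        yield tail  # final record with no terminating NUL

# Stand-in for `find -print0` output; the second name contains a newline.
fake_stdin = io.BytesIO(b"a.orig\0dir/b\nc.orig\0")
names = [os.fsdecode(raw) for raw in iter_nul_records(fake_stdin, read_size=4)]
```

In a real script the stream would be `sys.stdin.buffer` rather than a `BytesIO`, and `os.fsdecode` converts each raw name with the filesystem encoding.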
msg109098 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-07-02 10:41
Ralph, core developers have not rejected this idea. It needs a patch now (even rough) to get the discussion further.
msg109117 - (view) Author: Douglas Alan (Douglas.Alan) Date: 2010-07-02 17:31
Until this feature gets built into Python, you can use a Python-coded generator such as this one to accomplish the same effect:
def fileLineIter(inputFile,
                 inputNewline="\n",
                 outputNewline=None,
                 readSize=8192):
    """Like the normal file iter but you can set what string indicates newline.

    The newline string can be arbitrarily long; it need not be restricted to a
    single character. You can also set the read size and control whether or not
    the newline string is left on the end of the iterated lines. Setting
    newline to '\0' is particularly good for use with an input file created with
    something like "os.popen('find -print0')".
    """
    if outputNewline is None: outputNewline = inputNewline
    partialLine = ''
    while True:
        charsJustRead = inputFile.read(readSize)
        if not charsJustRead: break
        partialLine += charsJustRead
        lines = partialLine.split(inputNewline)
        partialLine = lines.pop()
        for line in lines: yield line + outputNewline
    if partialLine: yield partialLine
msg111152 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-22 06:44
This fileLineIter function looks like a good recipe to me. Can we close the issue then?
msg111168 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-07-22 11:42
A recipe in the comments on a tracker item isn't enough reason to close the RFE, no.
An entry on the cookbook with a pointer from the docs might be sufficient, although I'm still not averse to the idea of an actual readrecords method (with appropriate tests).
msg111177 - (view) Author: ysj.ray (ysj.ray) Date: 2010-07-22 14:32
I think it's a good idea to add a keyword argument to specify the separator of readlines().
I believe most people can accept the universal meaning of "line", which has a similar meaning to "record", that is, a chunk of data, maybe from using line separators other than '\n' in perl, or awk, or the find command. Maybe doing this doesn't pollute the meaning of "readlines". Splitting the file contents with a special character is really a common usage. Besides, I feel using a line separator other than '\n' doesn't mean we're dealing with a binary format; in fact, I often deal with text formats with the record separator '\t'.
msg111189 - (view) Author: Douglas Alan (Douglas.Alan) Date: 2010-07-22 16:33
Personally, I think that this functionality should be built into Python's readlines. That's where a typical person would expect it to be, and this is something that is supported by most other scripting languages I've used. E.g., awk has the RS variable which lets you set the "input record separator", which defaults to newline. And as I previously pointed out, xargs and find provide the option to use null as their line separator.
msg111202 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-22 17:54
> Personally, I think that this functionality should be built into
> Python's readlines. That's where a typical person would expect it to
> be, and this is something that is supported by most other scripting
> language I've used.
Adding it to readline() and/or readlines() would modify the standard IO
Abstract Base Classes, and would therefore probably need discussion on
python-dev.
msg111220 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2010-07-22 22:15
On Fri, Jul 23, 2010 at 3:54 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Personally, I think that this functionality should be built into
>> Python's readlines. That's where a typical person would expect it to
>> be, and this is something that is supported by most other scripting
>> language I've used.
>
> Adding it to readline() and/or readlines() would modify the standard IO
> Abstract Base Classes, and would therefore probably need discussion on
> python-dev.
That's also the reason why I'm suggesting a separate readrecords()
method - the appropriate ABC should be able to implement it as a
concrete method based on something like the recipe above.
msg111453 - (view) Author: Ralph Corderoy (ralph.corderoy) Date: 2010-07-24 11:13
fileLineIter() is not a solution that allows this bug to be closed, no.
readline() needs modifying and if that means python-dev discussion then
that's what it needs. Things to consider include changing the record
separator as the file is read.
 $ printf 'a b c\nd e f ' |
 > awk '{print "<" $0 ">"} NR == 1 {RS = " "}'
 <a b c>
 <d>
 <e>
 <f>
 $
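The awk session above changes the record separator after the first record. A minimal Python sketch of the same behaviour follows; the `RecordReader` class and its `read_record` method are hypothetical names invented for this illustration, not an existing or proposed io API.

```python
import io

class RecordReader:
    """Hypothetical helper: read records whose separator may change per call."""

    def __init__(self, stream, read_size=8192):
        self._stream = stream
        self._read_size = read_size
        self._buffer = ""

    def read_record(self, separator):
        """Return the next record without its separator, or None at EOF."""
        while True:
            head, found, rest = self._buffer.partition(separator)
            if found:
                self._buffer = rest
                return head
            chunk = self._stream.read(self._read_size)
            if not chunk:
                # End of input: hand back whatever is left, once.
                self._buffer = ""
                return head or None
            self._buffer += chunk

reader = RecordReader(io.StringIO("a b c\nd e f "), read_size=3)
records = [reader.read_record("\n")]          # first record is a whole line
while (record := reader.read_record(" ")) is not None:
    records.append(record)                    # later records split on spaces
```

Because the separator is passed per call, it lives on the reading operation rather than on the file, which is exactly the design question debated in this issue. The sketch rescans its buffer on every loop, so it is not tuned for very long records.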
msg223490 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-07-20 00:07
http://thread.gmane.org/gmane.comp.python.ideas/28310 discusses the same idea.
Guido raised a serious problem with either adding an argument to readline and friends, or adding new methods readrecord and friends: It means the hundreds of existing file-like objects that exist today no longer meet the file API.
Putting the separator in the constructor call solves that problem. Construction is not part of the file API, and different file-like objects' constructors are already wildly different. It also seems to fit in better with what perl, awk, bash, etc. do (where you set something either globally or on the file, rather than on the line-reading mechanism). And it seems conceptually cleaner; a file shouldn't be changing line-endings in mid-stream, and if it does, that's similar to changing encodings.
Whether this should be done by reusing newline, or by adding another new parameter, I'm not sure. The biggest issue with reusing newline is that it has a meaning for write mode, not just for read mode (either terminal \n characters, or all \n characters, it's not entirely clear which, are replaced with newline), and I'm not sure that's appropriate here. (Or, worse, maybe it's appropriate for text files but not binary files?)
R. David Murray's patch doesn't handle binary files, or _pyio, and IIRC from testing the same thing there was one more problem to fix for text files as well... but it's not hard to complete. If I have enough free time tomorrow, I'll clean up what I have and post it.
msg223491 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-07-20 00:41
While we're at it, Douglas Alan's solution wouldn't be an ideal solution even if it were a builtin. A fileLineIter obviously doesn't support the stream API. It means you end up with two objects that share the same file, but have separate buffers and out-of-sync file pointers. And it's a lot slower.
That being said, I think it may be useful enough to put in the stdlib—even more so if you pull the resplit-an-iterator-of-strings code out:
def resplit(strings, separator):
    partialLine = None
    for s in strings:
        if partialLine:
            partialLine += s
        else:
            partialLine = s
        if not s:
            break
        lines = partialLine.split(separator)
        partialLine = lines.pop()
        yield from lines
    if partialLine:
        yield partialLine
Now, you can do this:
from functools import partial

with open('rdm-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\0')
    lines = (line + '\n' for line in lines)
# Or, if you're just going to strip off the newlines anyway:
with open('file-0-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\0')
# Or, if you have a binary file:
with open('binary-example', 'rb') as f:
    chunks = iter(partial(f.read, 8192), b'')
    lines = resplit(chunks, b'\0')
# Or, if I understand ysj.ray's example:
with open('ysj.ray-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\r\n')
    records = resplit(lines, '\t')
# Or, if you have something that isn't a file at all:
lines = resplit((packet.body for packet in packets), '\n')
msg223492 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-07-20 00:45
One last thing, a quick & dirty solution that works today, if you don't mind accessing private internals of stdlib classes, and don't mind giving up the performance of _io for _pyio, and don't need a solution for binary files:
import _pyio

class MyTextIOWrapper(_pyio.TextIOWrapper):
    def readrecord(self, sep):
        # Temporarily swap the private read-newline attribute.
        readnl, self._readnl = self._readnl, sep
        try:
            return self.readline()
        finally:
            self._readnl = readnl
Or, if you prefer:
class MyTextIOWrapper(_pyio.TextIOWrapper):
    def __init__(self, *args, separator, **kwargs):
        super().__init__(*args, **kwargs)
        self._readnl = separator
For binary files, there's no solution quite as simple; you need to write your own readline method by copying and pasting the one from _pyio.RawIOBase, and the modifications to use an arbitrary separator aren't quite as trivial as they look at first (at least if you want multi-byte separators).
msg224016 - (view) Author: Akira Li (akira) * Date: 2014-07-26 02:06
To make the discussion more specific, here's a patch that adds support
for alternative newlines in _pyio.TextIOWrapper. It also updates the
documentation and adds more io tests. It does not provide a C
implementation or the extended newline support for binary files.
As a side-effect it also fixes the bug in line_buffering=True
behavior, see issue22069O.
Note: The implementation does no newline translations unless in legacy
special cases i.e., newline='\0' behaves like newline='\n'. This is a
key distinction from the behavior described in
http://bugs.python.org/file36008/pep-newline.txt
The initial specification is from
https://mail.python.org/pipermail/python-ideas/2014-July/028381.html 
msg224077 - (view) Author: Akira Li (akira) * Date: 2014-07-26 17:02
> As a side-effect it also fixes the bug in line_buffering=True
> behavior, see issue22069O.
It should be issue22069 "TextIOWrapper(newline="\n", line_buffering=True) 
mistakenly treat \r as a newline"
Reuploaded the patch so that it applies cleanly on the current tip.
msg224149 - (view) Author: Andrew Barnert (abarnert) * Date: 2014-07-28 02:14
Akira, your patch does this:
- self._writetranslate = newline != ''
- self._writenl = newline or os.linesep
+ self._writetranslate = newline in (None, '\r', '\r\n')
+ self._writenl = newline if newline is not None else os.linesep
Any reason you made the second change? Why change the value assigned to _writenl for newline='\n' when you don't want to actually change the behavior for those cases? Just so you can double-check at write time that _writetranslate is never set unless _writenl is '\r', '\r\n', or os.linesep?
msg224155 - (view) Author: Akira Li (akira) * Date: 2014-07-28 08:01
> Akira, your patch does this:
>
> - self._writetranslate = newline != ''
> - self._writenl = newline or os.linesep
> + self._writetranslate = newline in (None, '\r', '\r\n')
> + self._writenl = newline if newline is not None else os.linesep
>
> Any reason you made the second change? Why change the value assigned
> to _writenl for newline='\n' when you don't want to actually change
> the behavior for those cases? Just so you can double-check at write
> time that _writetranslate is never set unless _writenl is '\r',
> '\r\n', or os.linesep?
If newline='\n' then writenl is '\n' with and without the patch.
If newline='\n' then write('\n\r') writes '\n\r' with and without the
patch.
If newline='\n' then writetranslate=False (with the patch). It does not
change the result for newline='\n' as it is documented now [1]:
 [newline] can be None, '', '\n', '\r', and '\r\n'.
 ...
 If newline is any of the other legal values [namely '\r', '\n',
 '\r\n'], any '\n' characters written are translated to the given
 string.
[...] are added by me for clarity.
[1] https://docs.python.org/3.4/library/io.html#io.TextIOWrapper
writetranslate=False so that if newline='\0' then write('\0\n') would
write '\0\n' i.e., embedded '\n' characters are not corrupted if newline='\0'. That is
why it is the "no translation" patch:
+ When writing output to the stream:
+
+ - if newline is None, any '\n' characters written are translated to
+ the system default line separator, os.linesep
+ - if newline is '\r' or '\r\n', any '\n' characters written are
+ translated to the given string
+ - no translation takes place for any other newline value [any string].
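For reference, the currently documented write-side behaviour (without the patch) can be checked with a small sketch like the one below. The helper name `written_bytes` is made up for this illustration; the '$#' value is borrowed from R. David Murray's earlier test.

```python
import io

def written_bytes(newline, text="a\nb\n"):
    """Return the bytes produced by writing `text` with the given newline mode."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep `raw` from being closed along with the wrapper
    return data

crlf = written_bytes("\r\n")   # '\n' translated to '\r\n'
cr = written_bytes("\r")       # '\n' translated to '\r'
lf = written_bytes("\n")       # output is unchanged
empty = written_bytes("")      # no translation

# Without a patch like the one discussed here, other values are rejected.
try:
    io.TextIOWrapper(io.BytesIO(), encoding="ascii", newline="$#")
    rejected = False
except ValueError:
    rejected = True
```

Under the proposed rules, an arbitrary string such as '$#' would be accepted for reading, and writes would pass '\n' through untouched.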
msg226397 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-09-05 03:09
Related:
* Issue 563491: 2002 proposal for parameter to readline, rejected at the time
* Issue 17083: Newline is hard coded for binary file readline
Fixing this issue for binary files would probably also satisfy Issue 17083.
msg387491 - (view) Author: Christoph Anton Mitterer (calestyo) Date: 2021-02-22 03:25
Just wondered whether this is still being considered?
Cheers,
Chris.
msg387512 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2021-02-22 10:51
I don't think so.
msg387515 - (view) Author: Christoph Anton Mitterer (calestyo) Date: 2021-02-22 14:35
Oh, what a pity,... 
Seemed like a pretty common use case, which is unnecessarily prone to buggy or inefficient (user-)implementations.
msg387552 - (view) Author: Christoph Anton Mitterer (calestyo) Date: 2021-02-23 06:04
btw, just something for the record:
I think the example given in msg109117 above is wrong:
Depending on the read size it will produce different results, given how split() works:
Imagine a byte sequence:
>>> b"\0foo\0barbaz\0\0abcd".split(b"\0")
[b'', b'foo', b'barbaz', b'', b'abcd']
Now the same sequence, however with a different read size (here a shorter one):
>>> b"\0foo\0barbaz\0".split(b"\0")
[b'', b'foo', b'barbaz', b'']
>>> b"\0abcd".split(b"\0")
[b'', b'abcd']
=> it's the same bytes, but in the 2nd case one gets an extra b''.
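One way to probe the read-size concern is to run a chunked splitter over the same bytes with every possible read size and compare against a single split(). The sketch below is not the original recipe from msg109117; it is a re-implementation that mirrors its approach of carrying the unfinished tail (`partialLine` there, `tail` here) into the next split, and the name `iter_records` is an assumption.

```python
import io

def iter_records(stream, separator, read_size):
    """Split a stream on `separator`, carrying the unfinished tail forward."""
    tail = separator[:0]  # empty str or bytes, matching the separator type
    while True:
        chunk = stream.read(read_size)
        if not chunk:
            break
        # Only the last piece may be incomplete, so it is held back.
        *records, tail = (tail + chunk).split(separator)
        yield from records
    if tail:
        yield tail

data = b"\0foo\0barbaz\0\0abcd"
expected = data.split(b"\0")
results = {
    size: list(iter_records(io.BytesIO(data), b"\0", size))
    for size in range(1, len(data) + 1)
}
all_match = all(result == expected for result in results.values())
```

In this sketch the records come out the same for every read size, including 12, which reproduces the chunk boundary shown above. The extra b'' appears only when each chunk is split independently and the held-back piece is not joined to the next chunk. Note that, like the recipe, this sketch drops an empty trailing piece when the data ends with a separator.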
History
Date | User | Action | Args
2022-04-11 14:56:09 | admin | set | github: 41622
2021-02-23 06:04:14 | calestyo | set | messages: + msg387552
2021-02-22 14:35:53 | calestyo | set | messages: + msg387515
2021-02-22 10:52:01 | pitrou | set | status: open -> closed; resolution: later; stage: needs patch -> resolved
2021-02-22 10:51:32 | pitrou | set | messages: + msg387512
2021-02-22 03:25:11 | calestyo | set | nosy: + calestyo; messages: + msg387491
2019-08-25 15:42:34 | maggyero | set | nosy: + maggyero
2014-09-05 03:09:20 | martin.panter | set | messages: + msg226397
2014-07-28 08:01:37 | akira | set | messages: + msg224155
2014-07-28 02:14:36 | abarnert | set | messages: + msg224149; versions: + Python 3.4, - Python 3.5
2014-07-26 17:02:19 | akira | set | files: + io-newline-issue1152248-2.patch; messages: + msg224077
2014-07-26 08:24:20 | pconnell | set | nosy: + pconnell
2014-07-26 05:48:57 | rhettinger | set | versions: + Python 3.5, - Python 3.4
2014-07-26 02:06:43 | akira | set | files: + io-newline-issue1152248.patch; nosy: + akira; messages: + msg224016; keywords: + patch
2014-07-21 00:33:29 | abarnert | set | files: + pep-peek.txt
2014-07-21 00:32:51 | abarnert | set | files: + pep-newline.txt
2014-07-20 17:40:41 | wolma | set | nosy: + wolma
2014-07-20 02:01:11 | martin.panter | set | nosy: + martin.panter
2014-07-20 00:45:25 | abarnert | set | messages: + msg223492
2014-07-20 00:41:35 | abarnert | set | messages: + msg223491
2014-07-20 00:07:05 | abarnert | set | nosy: + abarnert; messages: + msg223490
2012-08-20 05:46:03 | ncoghlan | set | title: Enhance file.readlines by making line separator selectable -> Add support for reading records with arbitrary separators to the standard IO stack; versions: + Python 3.4, - Python 3.2
2011-06-01 01:20:37 | jcon | set | nosy: + jcon
2010-07-24 11:13:27 | ralph.corderoy | set | messages: + msg111453
2010-07-22 22:15:10 | ncoghlan | set | messages: + msg111220
2010-07-22 17:54:13 | pitrou | set | messages: + msg111202
2010-07-22 16:33:43 | Douglas.Alan | set | messages: + msg111189
2010-07-22 14:32:43 | ysj.ray | set | nosy: + ysj.ray; messages: + msg111177
2010-07-22 11:42:53 | ncoghlan | set | status: pending -> open; resolution: works for me -> (no value); messages: + msg111168
2010-07-22 06:44:24 | amaury.forgeotdarc | set | status: open -> pending; nosy: + amaury.forgeotdarc; messages: + msg111152; resolution: works for me
2010-07-02 17:31:17 | Douglas.Alan | set | nosy: + Douglas.Alan; messages: + msg109117
2010-07-02 10:41:06 | eric.araujo | set | nosy: georg.brandl, rhettinger, facundobatista, ncoghlan, pitrou, benjamin.peterson, nessus42, eric.araujo, ralph.corderoy, r.david.murray; messages: + msg109098; components: + Library (Lib), - Interpreter Core
2010-07-01 10:05:04 | ralph.corderoy | set | nosy: + ralph.corderoy; messages: + msg109038
2010-04-13 19:59:57 | eric.araujo | set | nosy: + eric.araujo
2009-05-15 17:46:23 | nessus42 | set | messages: + msg87823
2009-05-15 13:07:55 | pitrou | set | messages: + msg87817
2009-05-15 11:47:10 | ncoghlan | set | messages: - msg87809
2009-05-15 11:46:53 | ncoghlan | set | messages: + msg87809
2009-05-15 11:46:28 | ncoghlan | set | messages: + msg87808
2009-05-15 11:34:04 | pitrou | set | messages: + msg87807
2009-05-15 11:25:13 | pitrou | set | keywords: - easy; messages: + msg87806
2009-05-15 11:24:26 | pitrou | set | messages: + msg87805
2009-05-15 11:13:00 | pitrou | set | messages: + msg87803
2009-05-15 10:17:51 | ncoghlan | set | messages: + msg87802
2009-05-15 09:18:37 | r.david.murray | set | keywords: + easy; nosy: + r.david.murray; messages: + msg87801
2009-05-15 02:53:53 | ajaksu2 | set | nosy: + benjamin.peterson, pitrou; components: + IO; versions: + Python 3.2, - Python 3.1
2009-02-16 06:15:00 | skip.montanaro | set | nosy: - montanaro.historic
2009-02-15 23:58:45 | ncoghlan | set | messages: + msg82188; stage: test needed -> needs patch
2009-02-15 23:49:49 | ajaksu2 | set | stage: test needed; versions: + Python 3.1, - Python 3.0
2008-03-19 18:52:17 | facundobatista | set | assignee: facundobatista ->; messages: + msg64084; versions: + Python 3.0
2008-02-29 11:58:52 | ncoghlan | set | messages: + msg63134
2008-02-27 11:08:10 | facundobatista | set | assignee: facundobatista; messages: + msg63068
2008-02-27 08:48:48 | rhettinger | set | status: closed -> open; resolution: rejected -> (no value); messages: + msg63067
2008-02-27 02:41:20 | facundobatista | set | status: open -> closed; nosy: + facundobatista; resolution: rejected; messages: + msg63060
2005-02-26 07:24:20 | ncoghlan | create |