Created on 2015-09-20 06:15 by martin.panter, last changed 2022-04-11 14:58 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| stringio-seek.patch | martin.panter, 2015-12-15 06:59 | |
| Messages (6) | |||
|---|---|---|---|
| msg251149 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-09-20 06:15 | |
This follows from Issue 12922. When no newline translation is being done, it would be useful to define the seek() offset as the code point offset into the underlying string, allowing stuff like:

    s = StringIO()
    print("line", file=s)  # Some inflexible API with an unwanted newline
    s.seek(-1, SEEK_CUR)  # Undo the trailing newline
    s.truncate()

In general, relative seeks are not allowed for text streams, and absolute offsets have arbitrary values. But when no encoding is actually going on, these restrictions are annoying.

I guess the biggest problem is what to do when newline translation is enabled. But I think this is a rarely-used feature of StringIO. I suggest to say that offsets in that case remain arbitrary, and let the code do whatever it happens to do (probably jumping to the wrong character, chopping CRLFs in half, etc, as long as it won’t crash). |
|||
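For reference, here is a minimal sketch of how the snippet in the message above behaves with the existing `io.StringIO`, plus one possible workaround. It assumes CPython's C implementation, where `tell()` on a `StringIO` created with the default `newline="\n"` happens to return a code point index; the documentation only promises an opaque number, so this illustrates the restriction rather than a guaranteed API. The helper names `relative_seek_is_rejected` and `drop_last_char` are hypothetical and not part of the proposed patch.

```python
import io


def relative_seek_is_rejected() -> bool:
    """Return True if a nonzero SEEK_CUR seek on StringIO raises OSError."""
    s = io.StringIO()
    print("line", file=s)  # writes "line\n"
    try:
        s.seek(-1, io.SEEK_CUR)  # the relative seek the issue asks for
    except OSError:
        return True
    return False


def drop_last_char(s: io.StringIO) -> None:
    """Discard the character just before the current position and
    everything after it.

    Assumption: tell() returns a code point index, which holds for
    CPython's C StringIO when no newline translation is enabled.
    """
    pos = s.tell()
    if pos > 0:
        s.seek(pos - 1)  # absolute seek, which is allowed
        s.truncate()     # cut the buffer at the new position
```

A caller would write to the buffer as usual and then call `drop_last_char(s)` in place of the `seek(-1, SEEK_CUR)` and `truncate()` pair.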
| msg251152 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-09-20 06:47 | |
I suspect it would not be easy to do for the Python implementation. |
|||
| msg251206 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-09-21 06:52 | |
I see the _pyio implementation wraps BytesIO with UTF-8 encoding. Perhaps it would be okay to change to UTF-32 encoding (a fixed-length Unicode encoding). That would use more memory, but the C implementation seems to use a Py_UCS4 buffer already. Then you could reimplement seek(), tell(), and truncate() by detaching and rebuilding the TextIOWrapper over the top. Not super efficient, but perhaps that does not matter for the _pyio implementation. The fact that it is so hard to do this (random write access to a large Unicode buffer) in native Python could be another argument to support this in the default StringIO implementation :) |
|||
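The fixed-width idea in the message above can be sketched as follows. This is not the code from the patch; it only illustrates why UTF-32 makes code point offsets trivial: every code point occupies exactly 4 bytes, so a character index maps to a byte offset by multiplication. The names `CODEC`, `WIDTH`, `overwrite_at`, `truncate_at`, and `as_text` are hypothetical, and `utf-32-le` is chosen here as an assumption so that no byte order mark is written.

```python
import io

CODEC = "utf-32-le"  # fixed width, no BOM: each code point is 4 bytes
WIDTH = 4


def overwrite_at(buf: io.BytesIO, index: int, text: str) -> None:
    """Overwrite characters starting at code point offset `index`."""
    buf.seek(index * WIDTH)
    buf.write(text.encode(CODEC))


def truncate_at(buf: io.BytesIO, index: int) -> None:
    """Keep only the first `index` code points."""
    buf.truncate(index * WIDTH)


def as_text(buf: io.BytesIO) -> str:
    """Decode the whole buffer back into a str."""
    return buf.getvalue().decode(CODEC)
```

With a variable-width encoding such as UTF-8, the same operations would need a scan from the start of the buffer to find the byte offset of a given character.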
| msg256292 | Author: Марк Коренберг (socketpair) * | Date: 2015-12-12 20:20 | |
#25849 ? |
|||
| msg256302 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-12-12 23:10 | |
Mark: This issue is about StringIO only. I am not proposing any change to TextIOBase or how on-disk text files are handled.
I intend to propose a patch to make StringIO more liberal, but haven’t got around to it yet. Do you think it would be worthwhile? IMO it would make StringIO a fairly efficient mutable text buffer. The alternatives [list(str), array("u")] are slower and/or use more than 4 bytes per character.
|
|||
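To make the `list(str)` alternative mentioned above concrete, here is a rough sketch of the same "undo the trailing newline" edit on a list of characters, along with a way to estimate the list's per-character container cost. The function names are hypothetical. Note that `sys.getsizeof` on a list counts only the list object and its pointer array, not the one-character `str` objects it refers to, and that the figure depends on platform and interpreter build.

```python
import sys


def strip_trailing_newline(chars: list) -> str:
    """Treat a list of 1-character strings as a mutable text buffer."""
    if chars and chars[-1] == "\n":
        del chars[-1]      # in-place edit, analogous to seek + truncate
    return "".join(chars)  # building the final str copies everything


def list_bytes_per_char(n: int = 1000) -> float:
    """Approximate container bytes per character for list(str).

    Only the list's own storage (one pointer per element plus a
    header) is measured here.
    """
    chars = list("x" * n)
    return sys.getsizeof(chars) / n
```

Calling `list_bytes_per_char()` on a given build gives a number to compare against the 4 bytes per character of a `Py_UCS4` buffer.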
| msg256441 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-12-15 06:59 | |
There were a few tricky bits doing this with _pyio.StringIO, but I think I was successful. Here is a patch with both implementations and some tests. If people think this should go ahead, I can add documentation. In the process I may have discovered a bug with the TextIOWrapper implementations. Is calling truncate() meant to truncate the internal read buffer? At the moment you can read back truncated data, although the underlying byte stream is actually truncated. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:21 | admin | set | github: 69377 |
| 2021-05-19 09:44:48 | Eli_B | set | nosy: + Eli_B |
| 2020-01-20 08:22:33 | serhiy.storchaka | link | issue39365 superseder |
| 2015-12-15 06:59:59 | martin.panter | set | files: + stringio-seek.patch; keywords: + patch; messages: + msg256441 |
| 2015-12-12 23:10:02 | martin.panter | set | messages: + msg256302 |
| 2015-12-12 20:20:10 | socketpair | set | nosy: + socketpair; messages: + msg256292 |
| 2015-09-21 06:52:13 | martin.panter | set | messages: + msg251206 |
| 2015-09-20 06:47:42 | serhiy.storchaka | set | nosy: + pitrou, benjamin.peterson, stutzbach, serhiy.storchaka; messages: + msg251152 |
| 2015-09-20 06:15:12 | martin.panter | create | |