

Author eryksun
Recipients Zero, benjamin.peterson, docs@python, eryksun, fornax, martin.panter, pitrou, serhiy.storchaka, socketpair, steve.dower, stutzbach
Date 2016-01-19.23:34:45
Message-id <1453246486.06.0.790685821146.issue26158@psf.upfronthosting.co.za>
In-reply-to
Content
FYI, you can parse the cookie using struct or ctypes. For example:
 import ctypes
 import sys

 class Cookie(ctypes.Structure):
     _fields_ = (('start_pos', ctypes.c_longlong),
                 ('dec_flags', ctypes.c_int),
                 ('bytes_to_feed', ctypes.c_int),
                 ('chars_to_skip', ctypes.c_int),
                 ('need_eof', ctypes.c_byte))
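For comparison, the struct route might look roughly like the following. This is only a sketch: the names parse_cookie, CookieFields and COOKIE_STRUCT are made up for illustration, and the format string assumes the same field layout as the ctypes structure (a 64-bit offset, three 32-bit ints and one byte, with start_pos in the low-order bits of the integer). That layout is a CPython implementation detail, not a documented API.

```python
import struct
from collections import namedtuple

# Hypothetical container for the decoded fields (names follow the ctypes example).
CookieFields = namedtuple(
    'CookieFields',
    'start_pos dec_flags bytes_to_feed chars_to_skip need_eof')

# Assumed layout: little-endian, no padding, 8 + 4 + 4 + 4 + 1 = 21 bytes.
COOKIE_STRUCT = struct.Struct('<qiiib')

def parse_cookie(cookie):
    """Split the integer returned by TextIOWrapper.tell() into its fields."""
    # Serialise the integer with the least significant byte first, so that
    # start_pos occupies the first eight bytes of the buffer.
    raw = cookie.to_bytes(COOKIE_STRUCT.size, 'little')
    return CookieFields(*COOKIE_STRUCT.unpack(raw))
```

Because '<' disables padding, the buffer here is 21 bytes rather than the 24 that ctypes.sizeof(Cookie) typically reports with native alignment.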
In the simple case only start_pos (the position in the underlying buffer) is non-zero, so the result of tell() is just the 64-bit file pointer. In Serhiy's UTF-7 example the cookie also has to convey the bytes_to_feed and chars_to_skip values:
 >>> f.tell()
 680564735109527527154978616360239628288
 >>> cookie_bytes = f.tell().to_bytes(ctypes.sizeof(Cookie), sys.byteorder)
 >>> state = Cookie.from_buffer_copy(cookie_bytes)
 >>> state.start_pos
 0
 >>> state.dec_flags
 0
 >>> state.bytes_to_feed
 16
 >>> state.chars_to_skip
 2
 >>> state.need_eof
 0
So a seek(0, SEEK_CUR) in this case has to seek the buffer to 0, read and decode 16 bytes, and skip 2 characters. 
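To make the layout concrete, here is a rough sketch of the inverse operation, packing the fields back into the integer. The function name build_cookie is hypothetical, and the bit offsets assume the field sizes from the structure above (64 bits for start_pos, 32 bits for each int) and non-negative field values.

```python
def build_cookie(start_pos, dec_flags=0, bytes_to_feed=0,
                 chars_to_skip=0, need_eof=0):
    """Pack the cookie fields into a single integer (assumed layout)."""
    # start_pos sits in the low 64 bits; each following field is shifted
    # past the fields before it. Assumes all values are non-negative.
    return (start_pos
            | (dec_flags << 64)
            | (bytes_to_feed << 96)
            | (chars_to_skip << 128)
            | (need_eof << 160))
```

With only start_pos set, the result is the plain file offset, which is why tell() usually looks like an ordinary byte position. With bytes_to_feed=16 and chars_to_skip=2, it gives the large value shown in the session above.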
Isn't this solvable at least for the case of truncating, Martin? It could do a tell(), seek to the start_pos, read and decode the bytes_to_feed, re-encode the chars_to_skip, seek back to the start_pos, write the encoded characters, and then truncate.
 >>> f = open('temp.txt', 'w+', encoding='utf-7')
 >>> f.write(b'+BDAEMQQyBDMENA-'.decode('utf-7'))
 5
 >>> _ = f.seek(0); f.read(2)
 'аб'
 >>> cookie_bytes = f.tell().to_bytes(ctypes.sizeof(Cookie), sys.byteorder)
 >>> state = Cookie.from_buffer_copy(cookie_bytes)
 >>> f.buffer.seek(state.start_pos)
 0
 >>> buf = f.buffer.read(state.bytes_to_feed)
 >>> s = buf.decode(f.encoding)[:state.chars_to_skip]
 >>> f.buffer.seek(state.start_pos)
 0
 >>> f.buffer.write(s.encode(f.encoding))
 8
 >>> f.buffer.truncate()
 8
 >>> f.close()
 >>> open('temp.txt', encoding='utf-7').read()
 'аб'
Rewriting the encoded bytes is necessary to properly terminate the UTF-7 sequence, which makes me doubt whether this simple approach will work for all codecs. But something like this is possible, no?
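Wrapped up as a function, the steps above might look something like this. It is a sketch, not a proposed patch: truncate_at_text_position is a made-up name, it assumes the C implementation's cookie layout (parsed here with struct so the block stands alone), it assumes f is a seekable TextIOWrapper opened for reading and writing, and it has only been thought through for the UTF-7 case shown here. The final seek to the end is my own addition, intended to make the wrapper drop the text it had already decoded past the new end of file.

```python
import os
import struct

# Assumed cookie layout (CPython implementation detail): 64-bit start_pos,
# three 32-bit ints, one byte, least significant byte first, no padding.
_COOKIE = struct.Struct('<qiiib')

def truncate_at_text_position(f):
    """Truncate text file f at its current text position (sketch only)."""
    raw_cookie = f.tell().to_bytes(_COOKIE.size, 'little')
    start_pos, _flags, bytes_to_feed, chars_to_skip, _eof = \
        _COOKIE.unpack(raw_cookie)

    # Re-read the bytes the decoder was fed after the last safe start point.
    f.buffer.seek(start_pos)
    chunk = f.buffer.read(bytes_to_feed)

    # Keep only the characters that precede the current text position.
    kept = chunk.decode(f.encoding)[:chars_to_skip]

    # Re-encode them so the sequence is properly terminated, then cut.
    f.buffer.seek(start_pos)
    f.buffer.write(kept.encode(f.encoding))
    size = f.buffer.truncate()

    # Resynchronise the text layer with the shortened file.
    f.seek(0, os.SEEK_END)
    return size
```

For the temp.txt example, calling this after f.read(2) should leave the same 8 bytes on disk as the manual session.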
