Issue403100
Created on 2001-01-04 17:50 by doerwalter, last changed 2022-04-10 16:03 by admin. This issue is now closed.
| Files |
| File name | Uploaded | Description |
| None | doerwalter, 2001-01-04 17:50 | None |
| Messages (9) |
|
msg53079 |
Author: Walter Dörwald (doerwalter) * (Python committer) |
Date: 2001-01-04 17:50 |
This patch modifies PyUnicode_TranslateCharmap in Objects/unicodeobject.c
so that the error
PyErr_SetString(PyExc_NotImplementedError,
"1-n mappings are currently not implemented");
no longer occurs. I.e.
u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"})
now works. The patch does this by growing the string buffer
exponentially whenever it runs out of space.
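
For illustration only, the grow-on-demand idea can be sketched in present-day Python 3. This is not the C patch itself: `translate_growing` is a hypothetical name, and a Python list stands in for the output buffer that the C code would reallocate.

```python
def translate_growing(text, mapping):
    """Translate text through mapping, allowing 1-n replacements.

    Hypothetical sketch: a list plays the role of the C output buffer.
    """
    capacity = max(len(text), 1)    # first guess: a pure 1-1 mapping
    buf = [""] * capacity
    used = 0
    for ch in text:
        repl = mapping.get(ord(ch), ch)
        if repl is None:            # None deletes the character
            continue
        if isinstance(repl, int):   # an ordinal maps to a single character
            repl = chr(repl)
        # Double the buffer until the replacement fits.
        while used + len(repl) > capacity:
            buf.extend([""] * capacity)
            capacity *= 2
        for r in repl:
            buf[used] = r
            used += 1
    return "".join(buf[:used])      # drop the unused tail


result = translate_growing("ab", {ord("a"): "bbb", ord("b"): "aaa"})
```

With the mapping from the example above, `result` is `"bbbaaa"`, which is also what the built-in `str.translate` returns in Python 3.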
|
|
msg53080 |
Author: Nobody/Anonymous (nobody) |
Date: 2001-01-04 18:33 |
I like the idea, but the implementation needs some reworking:
the common case is a 1-1 mapping, so this should be as fast
as possible; extra size checks slow things down too much.
You can take a different approach, though:
leave things as they are and only add a special case for 1-n
mappings that resizes depending on how many extra chars are inserted.
Then, as a final step, if resizing occurred, call _PyUnicode_Resize()
to cut the allocated buffer down to its true size.
-- Marc-Andre
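
A rough Python 3 sketch of the approach suggested here, under stated assumptions: `translate_special_case` is a hypothetical name, mapping values are assumed to be strings, and a list stands in for the buffer that `_PyUnicode_Resize()` would trim in C. The 1-1 branch performs no size check; only the other branch grows the buffer.

```python
def translate_special_case(text, mapping):
    """Hypothetical sketch: fast 1-1 path, resizing only for 1-n."""
    out = [""] * len(text)          # sized for the common 1-1 case
    pos = 0
    for ch in text:
        repl = mapping.get(ord(ch), ch)   # assumed to be a str
        n = len(repl)
        if n == 1:
            # Fast path: the buffer always has room for the remaining
            # source characters, so no size check is needed here.
            out[pos] = repl
            pos += 1
        else:
            # Special case: grow by exactly the extra characters.
            out.extend([""] * (n - 1))
            out[pos:pos + n] = repl
            pos += n
    del out[pos:]                   # final step: trim to the true size
    return "".join(out)


expanded = translate_special_case("ab", {ord("a"): "bbb", ord("b"): "aaa"})
```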
|
|
msg53081 |
Author: Nobody/Anonymous (nobody) |
Date: 2001-01-05 18:45 |
I'll check in a patch for this tomorrow which implements what I had
in mind. The patch doesn't change the performance of the charmap
codec.
Thanks,
-- Marc-Andre
|
|
msg53082 |
Author: Marc-Andre Lemburg (lemburg) * (Python committer) |
Date: 2001-01-06 15:03 |
Checked in a different patch providing the same functionality.
Please see the CVS checkin message for details.
|
|
msg53083 |
Author: Walter Dörwald (doerwalter) * (Python committer) |
Date: 2001-01-05 17:07 |
The problem is that you can't know beforehand how long
the result string will be, i.e. whether any 1-n replacements
will really happen.
It would be possible to do a loop through the replacement
strings and see if there are any that are longer than one character,
but even if there are, you don't know if they will really be used.
So you have three choices:
(1) Guess how much space you need and reallocate
when the space is not enough.
(2) Do a dry run of the algorithm once to count how much
space you need, then run the algorithm a second time and this
time use the strings.
(3) Keep the strings in a list and join the list into
one string at the end.
For the case of a 1-1 mapping the following will happen:
(1) The first allocation has exactly the right amount of space,
so there won't be any reallocations, but a size check will be done
for every character (which should be only a few assembler instructions).
The mapping will have to be accessed once for every character
in the source string.
(2) There will only be one allocation, but for every character in
the source string the mapping has to be accessed twice, which
involves calls to Python functions, exception handling etc.
(3) You have to make as many memory allocations as there are parts
of the final string that you create, including error handling etc.
I think (1) is clearly the fastest method.
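
To make the lookup counts concrete, here is a small Python 3 sketch of choices (2) and (3). `CountingMap`, `two_pass` and `list_join` are hypothetical helpers written for this illustration, with mapping values assumed to be strings; they are not part of any patch.

```python
class CountingMap(dict):
    """Hypothetical mapping that counts how often it is consulted."""
    lookups = 0

    def lookup(self, ch):
        self.lookups += 1
        return self.get(ord(ch), ch)


def two_pass(text, mapping):
    # Choice (2): a dry run to measure, then a second run to copy.
    size = sum(len(mapping.lookup(ch)) for ch in text)
    buf = [""] * size               # single allocation of the exact size
    pos = 0
    for ch in text:
        repl = mapping.lookup(ch)   # second access per character
        buf[pos:pos + len(repl)] = repl
        pos += len(repl)
    return "".join(buf)


def list_join(text, mapping):
    # Choice (3): collect one part per character, join at the end.
    parts = [mapping.lookup(ch) for ch in text]
    return "".join(parts)


m2 = CountingMap({ord("a"): "bbb"})
r2 = two_pass("abc", m2)
m3 = CountingMap({ord("a"): "bbb"})
r3 = list_join("abc", m3)
```

For the three-character input, `m2.lookups` ends at 6 and `m3.lookups` at 3, matching the "accessed twice" remark for choice (2).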
|
|
msg53084 |
Author: Walter Dörwald (doerwalter) * (Python committer) |
Date: 2001-06-07 10:09 |
The patch that was checked in changes
PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but
not PyUnicode_TranslateCharmap, where this functionality is
also useful (e.g. for
u"<foo>".translate({ord("<"): u"&lt;", ord(">"): u"&gt;"})
).
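
As a point of comparison, this kind of 1-n use (HTML-escaping angle brackets) works with the built-in `str.translate` in present-day Python 3. The variable names below are only illustrative.

```python
# Each angle bracket maps to a multi-character entity (a 1-n mapping).
escapes = {ord("<"): "&lt;", ord(">"): "&gt;"}
escaped = "<foo>".translate(escapes)

# str.maketrans builds the same ordinal-keyed table from
# single-character string keys.
table = str.maketrans({"<": "&lt;", ">": "&gt;"})
escaped_again = "<foo>".translate(table)
```

Both calls produce `"&lt;foo&gt;"`.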
|
|
msg53085 |
Author: Marc-Andre Lemburg (lemburg) * (Python committer) |
Date: 2001-06-07 12:32 |
Reopened. This should really be marked as a feature request,
but for some reason SF won't let me change the Data Type.
|
|
msg53086 |
Author: Tim Peters (tim.peters) * (Python committer) |
Date: 2001-08-09 21:02 |
Changed to Feature Requests, at MvL's request.
|
|
msg53087 |
Author: Walter Dörwald (doerwalter) * (Python committer) |
Date: 2002-09-04 20:37 |
This is implemented by the PEP 293 patch. Closing the
request.
|
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-10 16:03:35 | admin | set | github: 33662 |
| 2001-01-04 17:50:43 | doerwalter | create | |