203 – std.format.doFormat() pads width incorrectly on Unicode strings

Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: phobos
Version: D1 (retired)
Hardware: x86 All
Importance: P2 normal
Assignee: Walter Bright
Keywords: spec
Reported: 2006-06-17 12:13 UTC by Matti Niemenmaa
Modified: 2014-02-15 13:28 UTC
Description Matti Niemenmaa 2006-06-17 12:13:48 UTC
import std.string;
void main() {
	assert(format("%8s", "foo") == "     foo");
	assert(format("%8s", "foobar") == "  foobar");
	assert(format("%8s", "hello") == "   hello");
	assert(format("%8s", "h\u00e9ll\u00f4") == "   h\u00e9ll\u00f4");
	// this passes, though it shouldn't: assert(format("%8s", "h\u00e9ll\u00f4") == " h\u00e9ll\u00f4");
}
--
In the above, the last assertion fails.
One would expect the last two strings, having five characters each, to both be padded in the front by three spaces: however, it appears the byte count is being used for determining the length and not the actual character count, and so the last string is padded by only one space.
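The discrepancy between the two counts can be checked directly. The following is a minimal sketch, assuming a D2 toolchain with current Phobos (the report itself is against D1); `codePoints` is a hypothetical helper introduced here for illustration and is not part of `std.format`.

```d
import std.range : walkLength;

// Hypothetical helper (not a Phobos function): counts the code points
// in a UTF-8 string by walking it as a decoded range.
size_t codePoints(string s)
{
    return s.walkLength;
}

void main()
{
    string s = "h\u00e9ll\u00f4";

    // .length counts UTF-8 code units (bytes); \u00e9 and \u00f4 take two each.
    assert(s.length == 7);
    // Walking the string decodes it, giving five code points.
    assert(codePoints(s) == 5);

    // Padding to width 8 measured in bytes leaves one space,
    // measured in code points it leaves three.
    assert(8 - s.length == 1);
    assert(8 - codePoints(s) == 3);
}
```

This matches the observed behaviour: one space of padding corresponds to a width computed from the 7-byte length.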
Comment 1 Thomas Kühne 2007-04-29 02:09:33 UTC
> One would expect the last two strings, having five characters each,
> to both be padded in the front by three spaces: however, it appears
> the byte count is being used for determining the length and not the
> actual character count, and so the last string is padded by only one
> space.
The only relevant documentation I found is:
> Width
> Specifies the minimum field width. If the width is a *, the next
> argument, which must be of type int, is taken as the width. If
> the width is negative, it is as if the - was given as a Flags
> character.
"Field width" could be interpreted as either "byte length" or
"UTF code point count".
Comment 2 Walter Bright 2008-06-24 01:57:14 UTC
I suggest it's codepoint count, as field width is for display purposes.
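As an illustration of padding by code point count, here is a rough sketch, again assuming D2 Phobos. `padLeft` is a hypothetical helper written for this example; it is not the actual change made to `doFormat`, which is not shown in this report.

```d
import std.array : replicate;
import std.range : walkLength;

// Hypothetical helper: right-align s in a field at least `width`
// code points wide, padding on the left with spaces.
string padLeft(string s, size_t width)
{
    immutable n = s.walkLength; // code points, not bytes
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

void main()
{
    // Five code points, so three spaces of padding.
    assert(padLeft("h\u00e9ll\u00f4", 8) == "   h\u00e9ll\u00f4");
    assert(padLeft("hello", 8) == "   hello");
    // Strings already at or over the width are returned unchanged.
    assert(padLeft("toolongstring", 8) == "toolongstring");
}
```

Note that code point count is only an approximation of display width: combining marks and double-width characters would still misalign under this scheme.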
Comment 3 Walter Bright 2008-07-09 22:30:39 UTC
Fixed in DMD 1.032 and 2.016.

