|
|
|
bytes, strings: faster Fields
I'm unhappy about writing this CL, but Fields is much more common
than FieldsFunc and an inherently cheaper operation, because most
spaces are ASCII, so it does seem to justify special case code.
package bytes:
BenchmarkFields 50 23659299 ns/op 44.32 MB/s
BenchmarkFieldsFunc 50 42000576 ns/op 24.97 MB/s
package strings:
BenchmarkFields 100 22852634 ns/op 45.88 MB/s
BenchmarkFieldsFunc 50 35050756 ns/op 29.92 MB/s
Patch Set 1 #Patch Set 2 : diff -r a9fc9baa621b https://code.google.com/p/go/ #Patch Set 3 : diff -r a9fc9baa621b https://code.google.com/p/go/ #Patch Set 4 : diff -r a9fc9baa621b https://code.google.com/p/go/ #
Total comments: 9
Total messages: 8
|
rsc
Hello r (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/go/
|
13 years, 4 months ago (2012年09月11日 03:47:17 UTC) #1 | ||||||||||||||||||||||||||||||||||||||||
Hello r (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/go/
This also makes me sad. I remember smiling at the simplicity of all these functions when I first found them (ToUpper, TrimSpace, etc). Could you at least keep the old one-line implementation in a comment? It'd be awesome if a future sufficiently advanced compiler could approach this speed-up. On Mon, Sep 10, 2012 at 8:47 PM, <rsc@golang.org> wrote: > Reviewers: r, > > Message: > Hello r (cc: golang-dev@googlegroups.com), > > I'd like you to review this change to > https://code.google.com/p/go/ > > > Description: > bytes, strings: faster Fields > > I'm unhappy about writing this CL, but Fields is much more common > than FieldsFunc and an inherently cheaper operation, because most > spaces are ASCII, so it does seem to justify special case code. > > package bytes: > BenchmarkFields 50 23659299 ns/op 44.32 MB/s > BenchmarkFieldsFunc 50 42000576 ns/op 24.97 > MB/s > > package strings: > BenchmarkFields 100 22852634 ns/op 45.88 MB/s > BenchmarkFieldsFunc 50 35050756 ns/op 29.92 > MB/s > > Please review this at http://codereview.appspot.com/**6495114/<http://codereview.appspot.com/6495114/> > > Affected files: > M src/pkg/bytes/bytes.go > M src/pkg/bytes/bytes_test.go > M src/pkg/strings/strings.go > M src/pkg/strings/strings_test.**go > > >
On Mon, Sep 10, 2012 at 11:53 PM, Brad Fitzpatrick <bradfitz@golang.org> wrote: > This also makes me sad. I remember smiling at the simplicity of all these > functions when I first found them (ToUpper, TrimSpace, etc). > > Could you at least keep the old one-line implementation in a comment? It'd > be awesome if a future sufficiently advanced compiler could approach this > speed-up. I am happy to include a comment. I don't see that 'sufficiently advanced' thing happening. strings.FieldsFunc is a very general function and strings.Fields is a very specific function. Implementing a specific thing in terms of a general thing almost always carries an inherent cost. Russ
i'm unhappy too, but if you want to make me unhappy make the payoff big enough to compensate. http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go File src/pkg/strings/strings.go (right): http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:261: case '\t', '\n', '\v', '\f', '\r', ' ': you can do this in fewer comparisons since \t \n \v \f \r are sequential. http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:277: a := make([]string, n) you can do it in one scan by guessing that, N fields is enough, allocating a slice, and just using append. if len(s) < 80, there can be at most 40 fields and that is the common case you're trying to do. if it's a big string, two passes might be warranted but the strings are rarely big. it might be worth asking the question my point is that if you want to write fast code, write fast code. or to put it more proactively, i want to see better code and bigger speedups another way to save code is to have a helper that takes (data *byte, nbytes int) and sharing that between bytes and strings packages.
http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go File src/pkg/strings/strings.go (right): http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:256: for i := 0; i < len(s); i++ { i think you could take the i++ out of here, and put it in the ascii case below. then it's slightly fewer instructions in the unicode case (probably irrelevant) and the code is slightly more obvious. same applies to loop below. http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:258: c := uint32(s[i]) why the type conversion? http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:259: if c < 0x80 { utf8.RuneSelf? http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:277: a := make([]string, n) On 2012年09月11日 05:29:48, r wrote: > you can do it in one scan by guessing that, N fields is enough, allocating a > slice, and just using append. if len(s) < 80, there can be at most 40 fields and > that is the common case you're trying to do. if it's a big string, two passes > might be warranted but the strings are rarely big. it might be worth asking the > question > > my point is that if you want to write fast code, write fast code. or to put it > more proactively, i want to see better code and bigger speedups > > another way to save code is to have a helper that takes (data *byte, nbytes int) > and sharing that between bytes and strings packages. even if you don't do that, the code becomes simpler (and it's marginally faster) if you do make([]string, 0, n) and use append below.
On Tue, Sep 11, 2012 at 1:29 AM, <r@golang.org> wrote: > i'm unhappy too, but if you want to make me unhappy make the payoff big > enough to compensate. Fair enough. I played a bit more with this. I can change FieldsFunc to start with 10 fields (or len/2 if that's smaller) and then when it comes time to realloc, estimate the field count based on the current fields/length ratio. That cuts the second pass but still avoids a log n number of reallocs. Also, the Unicode is16 and is32 can use linear search instead of binary search for small tables. For uniform inputs linear search keeps up with binary until about n=20, because of the better memory access pattern, and for ASCII-skewed inputs it wins even more since it can stop early. I settled on a cutoff of 16 but will write a calibrator. Combined, those two give: benchmark old MB/s new MB/s speedup BenchmarkFields 29.73 111.57 3.75x BenchmarkFieldsFunc 27.99 56.30 2.01x Russ
do you have the updated version? http://codereview.appspot.com/6495114/diff/8001/src/pkg/bytes/bytes.go File src/pkg/bytes/bytes.go (right): http://codereview.appspot.com/6495114/diff/8001/src/pkg/bytes/bytes.go#newcod... src/pkg/bytes/bytes.go:303: if c < 0x80 { utf8.RuneSelf http://codereview.appspot.com/6495114/diff/8001/src/pkg/bytes/bytes.go#newcod... src/pkg/bytes/bytes.go:327: if c < 0x80 { utf8.RuneSelf http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go File src/pkg/strings/strings.go (right): http://codereview.appspot.com/6495114/diff/8001/src/pkg/strings/strings.go#ne... src/pkg/strings/strings.go:283: if c < 0x80 { utf8.RuneSelf
On Fri, Sep 21, 2012 at 5:03 AM, <r@golang.org> wrote: > do you have the updated version? No, not yet. I am still thinking about what the right resizing strategy is. I can make the code much faster on the benchmark by estimating the new array size based on fields/byte processed so far, cutting out almost all reallocations, but that gives it the chance to wildly overestimate and overallocate what it would otherwise need. The current code, while slower by being two passes, at least does not overallocate. I'm still not quite sure how to balance those. Probably I will give up that particular speedup and go back to ordinary append. Russ