|
|
|
sort: SortBy API
This is a fully implemented proposal.
Frequently, a data structure has to be sorted
exactly once and the sort criteria is simple.
SortBy permits sorting by providing two closures
(less and swap) and thus eliminates the need for
defining a helper data type just for sorting.
In many cases, the resulting code is more concise
and simpler.
The current implementation piggybacks on top of
the existing implementation and thus sorting
with SortBy likely is slower. However, this is
a temporary implementation detail - there's no
reason that SortBy cannot be as fast as Sort.
Patch Set 1 #Patch Set 2 : diff -r d620ce23ebe4 https://go.googlecode.com/hg/ #Patch Set 3 : diff -r d620ce23ebe4 https://go.googlecode.com/hg/ #
Total comments: 8
Patch Set 4 : diff -r 23a94e669d3a https://go.googlecode.com/hg/ #Patch Set 5 : diff -r 3395d16c46b3 https://go.googlecode.com/hg/ #
Total messages: 11
|
gri
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://go.googlecode.com/hg/
|
14 years ago (2012年01月05日 18:19:24 UTC) #1 | ||||||||||||||||||||||||||||||||
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://go.googlecode.com/hg/
http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go File src/pkg/sort/sort.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode198 src/pkg/sort/sort.go:198: // IsSorted returns true if data.Less(j, i) is false for any 0 <= i < j < data.Len(). I read this as "IsSorted returns true if there exists any indexes i, j where i < j and data.Less(j, i) is false." Can we change "for any" to "for all"? http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode273 src/pkg/sort/sort.go:273: // false. The swap function swaps elements i and j in the data structure. Unlike above, this reads fine. Perhaps just change the word order above to be like this. http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort_test.go File src/pkg/sort/sort_test.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort_test.go#new... src/pkg/sort/sort_test.go:52: SortBy( While concise, I'm not convinced this is more readable. In the cases where I've seen sort.Interface used, the type that implements that interface usually has a descriptive name, like "byIncreasingKey". This yields code like: sort.Sort(byIncreasingKey(data)) which reads better than: sort.SortBy( func(i, j int) bool { return data[i].key < data[j].key }, func(i, j int) { data[i], data[j] = data[j], data[i] }, len(data), ) This difference gets more pronounced when the sort has a secondary key, and so the less function contains some conditionals.
http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go File src/pkg/sort/sort.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode198 src/pkg/sort/sort.go:198: // IsSorted returns true if data.Less(j, i) is false for any 0 <= i < j < data.Len(). On 2012年01月05日 18:50:09, Sameer Ajmani wrote: > I read this as "IsSorted returns true if there exists any indexes i, j where i < > j and data.Less(j, i) is false." Can we change "for any" to "for all"? Done. http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode273 src/pkg/sort/sort.go:273: // false. The swap function swaps elements i and j in the data structure. On 2012年01月05日 18:50:09, Sameer Ajmani wrote: > Unlike above, this reads fine. Perhaps just change the word order above to be > like this. It's different above because I have to define IsSorted. Anyway, used your suggestion above. http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort_test.go File src/pkg/sort/sort_test.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort_test.go#new... src/pkg/sort/sort_test.go:52: SortBy( On 2012年01月05日 18:50:09, Sameer Ajmani wrote: > While concise, I'm not convinced this is more readable. In the cases where I've > seen sort.Interface used, the type that implements that interface usually has a > descriptive name, like "byIncreasingKey". This yields code like: > sort.Sort(byIncreasingKey(data)) > which reads better than: > sort.SortBy( > func(i, j int) bool { return data[i].key < data[j].key }, > func(i, j int) { data[i], data[j] = data[j], data[i] }, > len(data), > ) > This difference gets more pronounced when the sort has a secondary key, and so > the less function contains some conditionals. The more complex cases can still be handled the old way. Also, one can always write: smallerKey := func(i, j int) bool { return data[i].key < data[j].key } swap := func(i, j int) { data[i], data[j] = data[j], data[i] } sort.SortBy(smallerKey, swap, len(data)) which seems equally readable to me. - gri
Please include an ExampleSortBy test function demonstrating its use.
http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go File src/pkg/sort/sort.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode186 src/pkg/sort/sort.go:186: // Sort sorts data via its interface. After sorting, isSorted(data) is true. s/isSorted/IsSorted/?
http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go File src/pkg/sort/sort.go (right): http://codereview.appspot.com/5520048/diff/3003/src/pkg/sort/sort.go#newcode186 src/pkg/sort/sort.go:186: // Sort sorts data via its interface. After sorting, isSorted(data) is true. On 2012年01月05日 22:56:29, tux21b wrote: > s/isSorted/IsSorted/? Done.
On Thu, Jan 5, 2012 at 2:52 PM, <adg@golang.org> wrote: > Please include an ExampleSortBy test function demonstrating its use. > > http://codereview.appspot.com/5520048/
On 2012年01月05日 22:52:17, adg wrote: > Please include an ExampleSortBy test function demonstrating its use. done
After consideration, I vote for leaving Sort alone and dropping this approach, at least for now. It's merely the latest in a line of attempts to provide a generic sorting algorithm in a non-generic language. These new suggestions are worth discussing in the abstract but not as a check-in right before Go 1, whose APIs should be close to fixed by now. I realize this is an addition, not a replacement, but I am concerned about the precedent this work-around-the-interface-model approach sets. In some cases the approach may be more convenient but I am unconvinced it's superior enough to suggest it as a general approach, and it also raises the issue of why only Sort gets this treatment. If Sort needs a different API, let's take our leisurely time designing, playing, and adapting with it, but after Go 1 is out and perhaps in a wider context involving collation and the general topic of the future of containers and algorithms in Go. It's likely more things will arrive in the next few weeks that will trigger a similar reply from me (there have been others already), so let me avoid all the explanation in those cases by establishing a shorthand: Not now. -rob
Ok. I will postpone this.
Just for the record: I modified the implementation, and a sort
implementation based on a less and swap function instead of an
interface appears to be faster then the current interface approach.
Specifically:
Old code:
sort_test.BenchmarkSortInt1K 10000 285444 ns/op
sort_test.BenchmarkSortInt64K 100 28132250 ns/op
New code:
sort_test.BenchmarkSortNInt1K 10000 242571 ns/op
sort_test.BenchmarkSortNInt64K 100 23886690 ns/op
The difference is about 15%. More exactly, this is using SortN, a
further modification of SortBy, hinted at in my original mail:
func SortN(less Criteria, n int)
which requires only a single closure:
type Criteria func(i, j int, swap bool) bool
where Criteria is basically the same as Interface.Less in the existing
code, except that it also swaps if swap is set and the result is true
(i.e., a[i] < a[j]). The two functions less and swap as needed for
SortBy can be obtained as follows from Criteria:
func (less Criteria) Less(i, j) bool { return less(i, j, false) {
and (credits to lvd who pointed this out):
func (less Criteria) Swap(i, j int) {
if !less(i, j, true) {
less(j, i, true)
}
}
It turns out that in some cases, Swap doesn't need to be called
anymore in the implementation, which reduces the number of upcalls.
But the extra complexity for swapping when it is needed may destroy
some of the benefit again. Nevertheless, this approach is faster than
what we have now.
And Criteria can also be created from an existing sort.Interface,
which outlines the connection between the current sort.Sort, and
sort.SortN.
func MakeCriteria(data Interface) Criteria {
return func(i, j int, swap bool) bool {
if data.Less(i, j) {
if swap {
data.Swap(i, j)
}
return true
}
return false
}
}
To summarize:
1) The existing API can be improved, both for ease of use, and better
performance.
2) There is evidence that a closure-based approach is faster.
3) SortBy (current proposal) can be further simplified to an approach
where only a single function (Criteria) is needed; albeit at the cost
of a more complex such function.
- gri
On Thu, Jan 5, 2012 at 5:36 PM, Rob 'Commander' Pike <r@google.com> wrote:
> After consideration, I vote for leaving Sort alone and dropping this approach,
at least for now. It's merely the latest in a line of attempts to provide a
generic sorting algorithm in a non-generic language. These new suggestions are
worth discussing in the abstract but not as a check-in right before Go 1, whose
APIs should be close to fixed by now. I realize this is an addition, not a
replacement, but I am concerned about the precedent this
work-around-the-interface-model approach sets. In some cases the approach may be
more convenient but I am unconvinced it's superior enough to suggest it as a
general approach, and it also raises the issue of why only Sort gets this
treatment.
>
> If Sort needs a different API, let's take our leisurely time designing,
playing, and adapting with it, but after Go 1 is out and perhaps in a wider
context involving collation and the general topic of the future of containers
and algorithms in Go.
>
> It's likely more things will arrive in the next few weeks that will trigger a
similar reply from me (there have been others already), so let me avoid all the
explanation in those cases by establishing a shorthand:
>
> Not now.
>
> -rob
>