-
Notifications
You must be signed in to change notification settings - Fork 183
Add (Int, +) finger trees #766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a5c4a0b
to
1060462
Compare
@treeowl Could you say something about the motivation for this PR? Ideally create an issue, so we can discuss the "problem" separately from this implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this partial function, why not only export split
and let users deal with partial patterns if they want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only concern is whether that extra stuff will inline away. Certain functions are extremely sensitive to the performance of splitting tiny little sequences (e.g., 2–5 elements) where any little extra time/allocation can matter a lot. I'm not sure how hard it would be to rejigger Data.Sequence.Internal.splitTree
to make sure that works out okay.
1060462
to
5e746ca
Compare
@treeowl Could you say something about the motivation for this PR? Ideally create an issue, so we can discuss the "problem" separately from this implementation.
What really triggered it for me was wanting to play with incremental quicksort for sequences. Doing that (reasonably) efficiently requires use of FingerTree (Seq a)
(which is a lot like Seq (Seq a)
, but with different annotations and therefore different splitting behavior). And the simplest (not simple) approach requires something like the splitMap
function, which we already use to implement zipWith
and chunksOf
.
In principle, it would be nice to expand from (Int, +)
, to any Int
-based monoid. But I'm scared of the efficiency considerations of doing that, and I don't know of any reasonably nice interface that doesn't lean on the extremely non-portable Coercible
machinery. That is, one could imagine
data FingerTree s a class Sized s a | a -> s where size :: a -> s (<|) :: (Coercible s Int, Monoid s, Sized s a) => a -> FingerTree s a -> FingerTree s a
but then we're relying on an additional specialization along with all that non-portable stuff. Youch. So I figured I'd start with something conservative.
5e746ca
to
035e634
Compare
* Export a type for `(Int,+)` finger trees. * Export more `Data.Sequence` internals. * Offer a module of `Data.Sequence` internals intended for external use, that should obey the PVP.
I was about to mention https://hackage.haskell.org/package/fingertree but if the goal is to implement "incremental quicksort" for Seq
it does seem necessary to do this in containers and sufficient to keep it specialized to (Int, +)
.
I was about to mention https://hackage.haskell.org/package/fingertree but if the goal is to implement "incremental quicksort" for
Seq
it does seem necessary to do this in containers and sufficient to keep it specialized to(Int, +)
.
Well... not strictly necessary, I don't think, but the Int
s are unpacked here (but not there). That seems especially important for the very short sequences. Also, the fingertree
package has fallen behind in some optimizations.
So, do you intend to offer this incremental quicksort from a different package? Otherwise I don't understand why we need to enhance the public / stable API for this.
Or are you possibly talking about different sequence types than Seq
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good on first pass!
Unless i'm missing something (i'm not super familiar with this library's internals), there are a few improvements we can make to ergnomics here. For example, I cannot fromList [1..10]
. I must fromList $ Elem <$> [1..10]
, since Sized
has only two instances. The ability to debug splits would also be welcome.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a debugging function that shows the split lying around? Would be useful to have.
This looks good on first pass!
Unless i'm missing something (i'm not super familiar with this library's internals), there are a few improvements we can make to ergnomics here. For example, I cannot
fromList [1..10]
. I mustfromList $ Elem <$> [1..10]
, sinceSized
has only two instances. The ability to debug splits would also be welcome.Thoughts?
That fromList
"limitation" is kind of essential to this structure. But we can add more Sized
instances. I believe this PR adds one for Seq
, we can get one for Array
(and UArray
?), and I guess inefficient ones for lists and non-empty lists too. Then there are potential derived instances for various base
newtypes. Did you check my logic/arithmetic for splitting? And do you have a better name for split
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot comment on this (not sure why you asked me to review). finger trees are unfamiliar structures to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would Data.Sequence.Unsafe
be better name?
(Int,+)
finger trees.Data.Sequence
internals.Data.Sequence
internals intendedfor external use, that should obey the PVP.
Functor
andTraversable
instances forFingerTree
andNode
.Generic1
instance forFingerTree
.