I am using Scala Enumeration ValueSets in a fairly high-throughput setting - creating, testing, union'ing and intersecting about 10M sets/second/core. I didn't expect this to be a big deal, because I had read somewhere that they were backed by BitSets, but surprisingly ValueSet.isEmpty showed up as a hot spot in a profiling session with YourKit.
To verify, I decided to try and reimplement what I needed using the Java BitSet, while trying to retain some of the type-safety of using Scala Enumerations. (code review moved to https://codereview.stackexchange.com/questions/74795/scala-bitset-implemented-with-java-bitset-for-use-in-scala-enumerations-to-repl ) The good news is, changing just my ValueSets to these BitSets did indeed lop off 25% of my run-time, so I don't know what ValueSet is really doing under the hood but it could be improved...
EDIT: Reviewing the ValueSet source seems to indicate that isEmpty is definitely O(N), implemented using the general SetLike.isEmpty. Considering ValueSet is implemented with a BitSet, is this a bug?
EDIT: This was the backtrace from the profiler. This seems like a crazy way to implement isEmpty on a bitset.
Backtrace of hot spot in YourKit
-
"I don't know what ValueSet is really doing under the hood ". The source is available, if you wanted to take a look: github.com/scala/scala/blob/v2.11.4/src/library/scala/…The Archetypal Paul– The Archetypal Paul2014年12月24日 09:13:29 +00:00Commented Dec 24, 2014 at 9:13
-
It sounds like you have working code that you want feedback on. If that's the case, you should move your question to Code Review. Otherwise, please pare down your question and make the actual question more evident - it took me a while to even find what you were asking about.Chris Hayes– Chris Hayes2014年12月24日 18:53:12 +00:00Commented Dec 24, 2014 at 18:53
-
Thanks, looking now. So far, it seems to confirm what I found in the profiler (attached); the ValueSet.isEmpty, at least, is implemented entirely with generic algorithms, neglecting that for a BitSet, it should be no more difficult than (x == 0).experquisite– experquisite2014年12月24日 18:59:47 +00:00Commented Dec 24, 2014 at 18:59
-
2Chris- I will split it into two as you suggest, one to code review my replacement, and one to discuss why ValueSet.isEmpty is O(N) when it's implemented with a BitSet, which honestly looks like a bug to me.experquisite– experquisite2014年12月24日 19:01:23 +00:00Commented Dec 24, 2014 at 19:01
1 Answer 1
For the record, I'm all for looking under the hood, but this design asks too much of any mortal coder.
The immortals, of course, have infinite time at their disposal.
Enumeration.ValueSet
is backed by a BitSet
but is not one itself. Something about favoring composition.
[Did you hear about the heir to a fortune who gave it all up to pursue his love of music? He favored composition over inheritance. Did I just make that up or did I hear it at Java One?]
No doubt, ValueSet should delegate more methods to the BitSet, including isEmpty.
I was going to suggest trying values.iterator.isEmpty
, but that just tests hasNext
which loops through all possible values checking for contains.
https://github.com/scala/scala/blob/v2.11.4/src/library/scala/collection/BitSetLike.scala#L109
If I'm reading that correctly.
The best option is e.values.toBitMask forall (_ == 0)
, which is the moral equivalent of BitSet.isEmpty
.
Explore related questions
See similar questions with these tags.