6

I am using Scala Enumeration ValueSets in a fairly high-throughput setting - creating, testing, union'ing and intersecting about 10M sets/second/core. I didn't expect this to be a big deal, because I had read somewhere that they were backed by BitSets, but surprisingly ValueSet.isEmpty showed up as a hot spot in a profiling session with YourKit.

To verify, I decided to try and reimplement what I needed using the Java BitSet, while trying to retain some of the type-safety of using Scala Enumerations. (code review moved to https://codereview.stackexchange.com/questions/74795/scala-bitset-implemented-with-java-bitset-for-use-in-scala-enumerations-to-repl ) The good news is, changing just my ValueSets to these BitSets did indeed lop off 25% of my run-time, so I don't know what ValueSet is really doing under the hood but it could be improved...

EDIT: Reviewing the ValueSet source seems to indicate that isEmpty is definitely O(N), implemented using the general SetLike.isEmpty. Considering ValueSet is implemented with a BitSet, is this a bug?

EDIT: This was the backtrace from the profiler. This seems like a crazy way to implement isEmpty on a bitset.

Backtrace of hot spot in YourKit

asked Dec 24, 2014 at 6:30
4
  • "I don't know what ValueSet is really doing under the hood ". The source is available, if you wanted to take a look: github.com/scala/scala/blob/v2.11.4/src/library/scala/… Commented Dec 24, 2014 at 9:13
  • It sounds like you have working code that you want feedback on. If that's the case, you should move your question to Code Review. Otherwise, please pare down your question and make the actual question more evident - it took me a while to even find what you were asking about. Commented Dec 24, 2014 at 18:53
  • Thanks, looking now. So far, it seems to confirm what I found in the profiler (attached); the ValueSet.isEmpty, at least, is implemented entirely with generic algorithms, neglecting that for a BitSet, it should be no more difficult than (x == 0). Commented Dec 24, 2014 at 18:59
  • 2
    Chris- I will split it into two as you suggest, one to code review my replacement, and one to discuss why ValueSet.isEmpty is O(N) when it's implemented with a BitSet, which honestly looks like a bug to me. Commented Dec 24, 2014 at 19:01

1 Answer 1

2

For the record, I'm all for looking under the hood, but this design asks too much of any mortal coder.

The immortals, of course, have infinite time at their disposal.

Enumeration.ValueSet is backed by a BitSet but is not one itself. Something about favoring composition.

[Did you hear about the heir to a fortune who gave it all up to pursue his love of music? He favored composition over inheritance. Did I just make that up or did I hear it at Java One?]

No doubt, ValueSet should delegate more methods to the BitSet, including isEmpty.

I was going to suggest trying values.iterator.isEmpty, but that just tests hasNext which loops through all possible values checking for contains.

https://github.com/scala/scala/blob/v2.11.4/src/library/scala/collection/BitSetLike.scala#L109

If I'm reading that correctly.

The best option is e.values.toBitMask forall (_ == 0), which is the moral equivalent of BitSet.isEmpty.

answered Dec 25, 2014 at 7:18

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.