I often find myself abusing dictionary objects just for the exist method like this,
For Each x in Something
If Not dict.Exists(x) Then dict.Add x, False
Next x
Then just exploit the .Exists(x)
method. It happens often enough that I thought it merited it's own class.
Instead of composing Scripting.Dictionary
, I decided to just maintain a sorted list of unique items. Usage is simple; unique items are inserted into the collection sorted. Non-unique items are not added.
Attributes
Let's just get this out of the way. Attribute VB_PredeclaredId
must be set to True
and NewEnum
allows the class to be iterable. I still haven't found a way to replicate this using an array instead of a collection. With an array I could offload some of my code to my standard library, but then this wouldn't be standalone or portable.
VERSION 1.0 CLASS
BEGIN
MultiUse = -1
END
Attribute VB_Name = "UniqueList"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Option Explicit
Private list As Collection
Public Sub Class_Initialize()
Set list = New Collection
End Sub
Public sub Class_Terminate()
Set list = Nothing
End Sub
Public Property Get NewEnum() As IUnknown
Attribute NewEnum.VB_UserMemId = -4
Set NewEnum = list.[_NewEnum]
End Property
Data Manipulation
Public Sub Add(ByVal element As Variant)
If IsEmpty Then
list.Add element
Exit Sub
End If
Dim index As Long
index = LocateIndex(element)
If (list(index) > element) Then
list.Add element, Before:=index
ElseIf list(index) < element Then
list.Add element, After:=index
End If
End Sub
Public Sub Merge(ByVal list_ As Variant)
Dim element As Variant
For Each element In list_
Add element
Next element
End Sub
Public Sub Remove(ByVal element As Variant)
Dim index As Long
index = LocateIndex(element)
If (list(index) = element) Then list.Remove index
End Sub
Public Sub Clear()
Set list = Nothing
Set list = New Collection
End Sub
Introspection
Note I don't implement an Item(x)
method. I don't want users to be able to scramble the data.
Public Function Exists(ByVal element As Variant) As Boolean
Exists = (element = list(LocateIndex(element)))
End Function
Public Function Count() As Long
Count = list.Count
End Function
Public Function IsEmpty() As Boolean
IsEmpty = (Count = 0)
End Function
Searching
The only private method is a binary search that returns where the element is or where it should be.
Private Function LocateIndex(ByVal element As Variant) As Long
Dim upper As Long
upper = Count
Dim lower As Long
lower = 1
While lower < upper
Dim middle As Long
middle = (lower + upper) \ 2
If list(middle) >= element Then
upper = middle
Else
lower = middle + 1
End If
Wend
LocateIndex = upper
End Function
-
1\$\begingroup\$ You should really come to the 2nd Monitor sometime. I really enjoy seeing your questions. \$\endgroup\$RubberDuck– RubberDuck2014年08月22日 20:53:39 +00:00Commented Aug 22, 2014 at 20:53
-
\$\begingroup\$ I think at this point it might be worth building a trie \$\endgroup\$user28366– user283662014年08月28日 11:01:50 +00:00Commented Aug 28, 2014 at 11:01
2 Answers 2
Note I don't implement an Item(x) method. I don't want users to be able to scramble the data.
Ok. But then why expose a NewEnum
to enable For Each
loops? Without an Item
getter (with VB_UserMemId
set to 0
to identify the type's default property), it's not clear how a For Each
loop might work. An Item
property won't let the client code scramble the data if you don't implement a setter for it.
Searching
I like the idea, I like the implementation, ..but I would have called if IndexOf
. Just a couple of points:
If list(middle) >= element Then
If list
is empty, this method explodes. You may want to consider raising a custom error in this case, and gracefully handle an empty collection in the public methods.
Also I like that you're declaring variables close to their usage, but this:
While lower < upper Dim middle As Long middle = (lower + upper) \ 2
Is potentially confusing. Consider moving the declaration outside the loop. It's not like the While
block defined a scope anyway!
Dim middle As Long
While lower < upper
middle = (lower + upper) \ 2
The binary search and sorted nature of this UniqueList
are nice, but given the goal:
I often find myself abusing dictionary objects just for the exist method
You wrote that code to implement your own Exists
method, and you're using Variant
which tells me you're using it for more than just Integer
items.
Objects
It would have been nice to handle object items, since a Variant
can also be an Object
; your code is silent about how objects are dealt with.
Actually, if I read your code correctly, this UniqueList
will happily allow me to Add
a New ADODB.Connection
, and then will blow up when I try to add a 2nd item... or if I try to remove it with the Remove
method - the Clear
method will work.
Because we can't override operators (like >=
) in VBA, I'd handle objects with a \$O(n)\$ linear search that performs an equality check on each IsObject(item)
item until it finds the existing object reference (or not). You could mitigate this a little, by introducing an IComparable
interface that custom classes could implement, and then you could call comparable.CompareTo(element)
and have custom classes that could be handled with a binary search, if they specify how they compare to items of a given type.
There should also be a mechanism for ensuring all items are of the same type: as it stands I could add an Integer
, a Date
, a Short
and a Long
to an instance of a UniqueList
, and VBA would find a way to sort them, and that's implicit and not pretty.
I think your class is more like some kind of a SortedList<T>
(to borrow from c# notation), that doesn't allow duplicates: all items in a SortedList<T>
are of type T
, whatever that type is.
Just a couple of notes. The biggest of which is that I think you've misunderstood what VB_PredeclaredId
does. It creates a default instance of the class. Which allows you to do things like this
UniqueList.Add someItem
Without declaring and setting a new instance of the class. It's a cool feature, but it's not what allows the list to be iterable and you don't need it here. The [_NewEnum]
is what allows the For Each
syntax.
The Clear
method doesn't need to Set list = Nothing
. Drop that line and just set it to a new instance of Collection
.
Lastly, I find it really odd that you have a list with no way to get an element back out of the list. We can add and remove items, but not retrieve them. I would rethink implementing an Item
method.
Explore related questions
See similar questions with these tags.