Description
An immutable keyed set is a readonly collection of elements without duplicates or null values having each of the elements linked to exactly one key.
Example
This simple example of an Exact Cover solution has a constraint set \$\{ 1,2,3,4,5,6,7 \}\$ covered by candidates \$\{ B,D,F \}\$.
- \$B: \{ 1,4 \}\$
- \$D: \{ 3,5,6 \}\$
- \$F: \{ 2,7 \}\$
This could be presented as an immutable keyed set. What's imperative is that each of the constraints \$X\$ only has 1 matching candidate \$S\$.
Usage
[TestMethod]
public void Usage()
{
IReadOnlyKeyedSet<char, int> set = new ImmutableKeyedSet<char, int>(Data());
var sb = new StringBuilder();
foreach (var entry in set)
{
sb.AppendLine($"{entry.Key}: {string.Join(",", entry)}");
}
var rendered = sb.ToString();
// B: 1,4
// D: 3,5,6
// F: 2,7
}
private static IEnumerable<KeyValuePair<char, IEnumerable<int>>> Data()
{
yield return new KeyValuePair<char, IEnumerable<int>>('B', new[] { 1, 4 });
yield return new KeyValuePair<char, IEnumerable<int>>('D', new[] { 3, 5, 6 });
yield return new KeyValuePair<char, IEnumerable<int>>('F', new[] { 2, 7 });
}
The purpose is to guard against an invalid set.
private static IEnumerable<KeyValuePair<char, IEnumerable<int>>> Data()
{
yield return new KeyValuePair<char, IEnumerable<int>>('B', new[] { 1, 2, 4 });
yield return new KeyValuePair<char, IEnumerable<int>>('D', new[] { 3, 5, 6 });
yield return new KeyValuePair<char, IEnumerable<int>>('F', new[] { 2, 7 });
}
The above should throw:
"Duplicate value 2 on key F and B"
Code
Interface IReadOnlyKeyedSet<TKey, TElement>
public interface IReadOnlyKeyedSet<TKey, TElement> : ILookup<TKey, TElement>
{
IEqualityComparer<TKey> KeyComparer { get; }
IEqualityComparer<TElement> ElementComparer { get; }
bool ContainsValue(TElement value);
bool TryGetValues(TKey key, out IEnumerable<TElement> values);
bool TryGetKey(TElement value, out TKey key);
}
Implementation ImmutableKeyedSet<TKey, TElement>
public class ImmutableKeyedSet<TKey, TElement> : IReadOnlyKeyedSet<TKey, TElement>
{
private readonly Dictionary<TKey, Bucket> buckets;
private readonly Dictionary<TElement, TKey> elements;
public ImmutableKeyedSet(
IEnumerable<KeyValuePair<TKey, IEnumerable<TElement>>> keyedSet,
IEqualityComparer<TKey> keyComparer = null,
IEqualityComparer<TElement> elementComparer = null)
{
KeyComparer = keyComparer ?? EqualityComparer<TKey>.Default;
ElementComparer = elementComparer ?? EqualityComparer<TElement>.Default;
buckets = new Dictionary<TKey, Bucket>(KeyComparer);
elements = new Dictionary<TElement, TKey>(ElementComparer);
Set(keyedSet ?? throw new ArgumentNullException(nameof(keyedSet)));
}
public IEqualityComparer<TKey> KeyComparer { get; }
public IEqualityComparer<TElement> ElementComparer { get; }
public IEnumerable<TElement> this[TKey key] => buckets[key];
public int Count => buckets.Count;
public bool Contains(TKey key) => buckets.ContainsKey(key);
public bool ContainsValue(TElement value) => elements.ContainsKey(value);
public IEnumerator<IGrouping<TKey, TElement>> GetEnumerator()
=> buckets.Values.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public bool TryGetValues(TKey key, out IEnumerable<TElement> values)
{
var bucket = FindBucketByKey(key);
values = bucket == null ? default : bucket.Values;
return bucket != null;
}
public bool TryGetKey(TElement value, out TKey key)
{
var bucket = FindBucketByValue(value);
key = bucket == null ? default : bucket.Key;
return bucket != null;
}
private Bucket FindBucketByValue(TElement value)
{
if (elements.TryGetValue(value, out var key))
{
return buckets[key];
}
return null;
}
private Bucket FindBucketByKey(TKey key)
{
if (buckets.TryGetValue(key, out var bucket))
{
return bucket;
}
return null;
}
private void Set(IEnumerable<KeyValuePair<TKey, IEnumerable<TElement>>> keyedSet)
{
foreach (var entry in keyedSet)
{
var key = entry.Key;
if (key == null) throw new ArgumentException("Key must be set");
var bucket = FindBucketByKey(key);
if (bucket != null)
{
throw new InvalidOperationException($"Duplicate key {key}");
}
bucket = new Bucket(key);
if (entry.Value == null) throw new ArgumentException("Value must be set");
var values = new HashSet<TElement>(entry.Value);
foreach (var value in values)
{
if (value == null) throw new ArgumentException(
"Value must not contain null");
var valueBucket = FindBucketByValue(value);
if (valueBucket != null)
{
throw new InvalidOperationException(
$"Duplicate value {value} on key {key} and {valueBucket.Key}");
}
bucket.Values.Add(value);
}
buckets.Add(key, bucket);
foreach (var value in values)
{
elements.Add(value, key);
}
}
}
class Bucket : IGrouping<TKey, TElement>
{
public TKey Key { get; }
public IEnumerator<TElement> GetEnumerator() => Values.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
internal ICollection<TElement> Values { get; }
internal Bucket(TKey key)
{
Key = key;
Values = new List<TElement>();
}
}
}
Questions
- Is this a useful type of collection?
- Is this collection implementing immutability correctly?
1 Answer 1
Just a couple of thoughts
FindBucketByKey
andFindBucketByValue
should be implemented asTryGetSomething
because you already are doing this anyway internally and what is most important, it would save you from a ton ofnull
-checks all over the place. Nobody likes them.Set
should be calledInitialize
- This one is funny:
if (x == null)
+throw
means do not use{}
whereasif (x != null)
+throw
means use{}
;-)
- You should not initialize
values
with= new HashSet<TElement>(entry.Value);
as this could change their order. Groupings don't do this so this behavior would be unexpected. You are also not using any other features of aHashSet
as I find usingDistict
would be more appropriate and would maintain the order of values. Bucket
would be easier to use and implement if it was derived fromList<T>
.Use more linq. You know I like linq so I when I see how
Set
is implemented I feel the same way as @VisualMelon when he sees tuples :-P. This is how I imagine this method doing its main job:var result = from x in Data() from y in x.Value group x.Key by y into g select g;
This would give you groups for each element where you later could check whether any of the groups contains more than one element. With that, you could identify all duplicates and not only the first two and of course the implementation would become much simpler too. This would also be a lot more helpful as with multiple bugs you would need to run the code multiple time to discover every one of them.
-
\$\begingroup\$ I didn't even notice my subconscious (consistent :p) handling of parenthesis there o_O \$\endgroup\$dfhwze– dfhwze2019年09月08日 06:51:58 +00:00Commented Sep 8, 2019 at 6:51
-
\$\begingroup\$ Do you think this class could be useful as API or is it too specific for my use case? \$\endgroup\$dfhwze– dfhwze2019年09月08日 06:58:36 +00:00Commented Sep 8, 2019 at 6:58
-
1\$\begingroup\$ @dfhwze I think this is quite useful... for example it could be used as a command identifier and the values as its aliases or tags. Or in any other use case where there is a name and some aliases/tags. I've had a lot of them recently so I think I'll borrow form it. \$\endgroup\$t3chb0t– t3chb0t2019年09月08日 07:00:40 +00:00Commented Sep 8, 2019 at 7:00
Contains
method should be renamedContainsKey
since (1) it callsbuckets.ContainsKey
and (2) there is also aContainsValue
method so the rename adds clarity. \$\endgroup\$