A GapHelper static class to create unique (non-existing) size restricted names

Question 1

I am working with a 3th party library which has following interface defined for a processing task:

public interface IProcessor {
 void Execute( FileContent fileContent );
}

Such an IProcessor gives me the possibility to change the FileContent before it gets written to the database. For legacy reasons, and for integration reasons with an unnamed ERP, its unique identifiers are string identifiers which have a limitation in length, and they are padded on the left with padding char '0'.

Names will therefore always have the form of 00100 or 02301 or 10179 for a 5 digit identifier.

Sometimes, we need to add new content inside the fileContent.Components, but we are not allowed to change the existing identifiers. Theoretically any number of items could be added to the components, but for the question at hand, I simplified it to just adding 20 items.

The existing identifiers do not have any order to them, nor can I be sure that the highest number inside the string would be lower than the theoretical maximum number for this identifier.

To handle this addition, I thought I could use a IEnumerator<string> function (not class) as an iterator, that creates the next number starting from the theoretical minimum value to the theoretical maximum value, and throw when it exceeds the maximum value.

public static class GapHelper {
 public static IEnumerator<string> GetNextName( int maxSize, IEnumerable<string> items, Func<int, string> transform ) {
 var maxLength = (long)Math.Pow( maxSize, 10 );
 var hashSet = new HashSet<string>( items );
 for (var i = 0; i < maxLength; i++) {
 var text = transform(i);
 if (hashSet.Contains( text )) {
 continue;
 }
 yield return text;
 }
 throw new ArgumentOutOfRangeException( "Iteration exceeded maximum numbers of items at " + maxSize );
 }
 public static IEnumerator<string> GetNextName( int maxSize, IEnumerable<string> items) {
 return GetNextName( maxSize, items, i => i.ToString().PadLeft( maxSize, '0' ) );
 }
}

and this would be used inside such an IProcessor in the following way:

public class ProcessingTask : IProcessor {
 public void Execute( FileContent fileContent ) {
 using (var nameGenerator = GapHelper.GetNextName( 5, fileContent.Components.Select( c => c.Name ) ) ) {
 for (var i = 0; i < 20; i++) {
 nameGenerator.MoveNext();
 fileContent.Components.Add( new Component { 
 Name = nameGenerator.Current, Content = "Foo"
 } );
 }
 }
 }
}

What I cannot change:

How the processor is called, and with which parameters
How the file is processed afterwards, and might have been processed before
My bracket positioning, we have them at the end of each line
I am aware I didn't add documentation to the methods

A full simplified example can be found in this dotnetfiddle.

I am interested in how this code looks from maintainability perspective, and what other potential options I might have overlooked. I am also curious if having an IEnumerator<string> implemented as a method rather than a class makes a big difference, either for this implementation or from an iterator perspective.

Question 2

I vote to close this question because of this: Theoretically any number of items could be added to the components, but for the question at hand, I simplified it to just adding 20 items. - it makes it difficult to figure out whether the result must have n number of items or can have. In other words do you add fake items to the result as you are doing here or do you stop when there are no more items to add.

Question 3

@t3chb0t That's a good point, it was hard to know how much I could simply the question without oversimplifying the process. It is indeed a bit more complex as the number of items I would add is undefined at compilation time, I get it from another file, and there are 3 dependent counters on top of it, one for subitems of the component, and two for the component. I chose not to do it that complex for the questions, but maybe I should

Question 4

Do you have any tests for this code? It helps reviewers to be able to build and run the code, and experiment with it.

Question 5

@TobySpeight well I did add a link to a dotnetfiddle

Question 6

Possible Bug

Since Func<int, string> transform is not guarded against collisions (different int values may yield the same string result), the next line is wrong. It only checks for unique elements in the already available data items, not in the new results fetched from transform.

 if (hashSet.Contains( text )) {

This could be solved by trying to add each element to the set instead:

if (!hashSet.Add( text )) {

LINQ

Furthermore, I would use an IEnumerable to allow for some LINQ chaining.

public static IEnumerable<string> GetNextName(
 int maxSize, IEnumerable<string> items, Func<int, string> transform) {

So we could just call:

public class ProcessingTask : IProcessor {
 public void Execute( FileContent fileContent ) {
 GapHelper.GetNextName(5, fileContent.Components
 .Select(c => c.Name)).Take(20).Select(x 
 => new Component { Name = x, Content = "Foo" })
 .ToList().ForEach(fileContent.Components.Add);
 }
}

Which yields better readability. Also a fun fact about disposing an enumerator from the initial code:

 using (var nameGenerator = // ..

Question 7

Ha, good one on the bug, I had actually refactored that for the question, but didn't test it with anything else ^_^ before that the padding was done in the method, but it didn't feel right :) On the topic of chaining, that would be nice, but I am iterating an external file given to me (I know that the question didn't reflect that, so sorry), and I will have 3 different iterators there

Question 8

I only don't get the comment on disposing an enumerator, I honestly don't mind it, it makes sense to have it disposed since I will probably never really iterate the full list, so having it in a using feels like second nature to me, as it implements the IDisposable

Question 9

Eric Lippert suggests not to bother with disposing enumerators, since its only purpose is to for some interop. But it can't hurt ether if you keep using it.

Question 10

mhmm... have you seen this post of him? He says the opposite: Always dispose your enumerators. They implement IDisposable for a reason. which the linked article confirms.

Question 11

@t3chb0t He has a more recent post where he argued otherwise.

dfhwze dfhwze 14.1k3 gold badges40 silver badges101 bronze badges · Accepted Answer · 2019-10-01 06:20:08Z

Possible Bug

Since Func<int, string> transform is not guarded against collisions (different int values may yield the same string result), the next line is wrong. It only checks for unique elements in the already available data items, not in the new results fetched from transform.

 if (hashSet.Contains( text )) {

This could be solved by trying to add each element to the set instead:

if (!hashSet.Add( text )) {

LINQ

Furthermore, I would use an IEnumerable to allow for some LINQ chaining.

public static IEnumerable<string> GetNextName(
 int maxSize, IEnumerable<string> items, Func<int, string> transform) {

So we could just call:

public class ProcessingTask : IProcessor {
 public void Execute( FileContent fileContent ) {
 GapHelper.GetNextName(5, fileContent.Components
 .Select(c => c.Name)).Take(20).Select(x 
 => new Component { Name = x, Content = "Foo" })
 .ToList().ForEach(fileContent.Components.Add);
 }
}

Which yields better readability. Also a fun fact about disposing an enumerator from the initial code:

 using (var nameGenerator = // ..

Ha, good one on the bug, I had actually refactored that for the question, but didn't test it with anything else ^_^ before that the padding was done in the method, but it didn't feel right :) On the topic of chaining, that would be nice, but I am iterating an external file given to me (I know that the question didn't reflect that, so sorry), and I will have 3 different iterators there
I only don't get the comment on disposing an enumerator, I honestly don't mind it, it makes sense to have it disposed since I will probably never really iterate the full list, so having it in a using feels like second nature to me, as it implements the IDisposable
Eric Lippert suggests not to bother with disposing enumerators, since its only purpose is to for some interop. But it can't hurt ether if you keep using it.
mhmm... have you seen this post of him? He says the opposite: Always dispose your enumerators. They implement IDisposable for a reason. which the linked article confirms.
@t3chb0t He has a more recent post where he argued otherwise.

Stack Exchange Network

A GapHelper static class to create unique (non-existing) size restricted names

1 Answer 1

Possible Bug

LINQ

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

A GapHelper static class to create unique (non-existing) size restricted names

1 Answer 1

Possible Bug

LINQ

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions