I wrote my Split()
extension, it's main goal is to save delimiters and add them to splitted strings.
For example:
I have a string "-1-2+3+4-5-6"
, and separators '+'
, '-'
and I want to have following: ""
,"-1"
,"-2"
,"+3"
,"+4"
,"-5"
,"-6"
or "-"
,"1-"
,"2+"
,"3+"
,"4-"
,"5-"
,"6"
depends on input separator option.
My extension method consists of three parts:
- Check input values;
- Getting all indexes of separators, and separators in source string;
- Split source string and adding separators to left or right substring.
Code:
public enum SeparatorOptions
{
None = 0,
AddSeparatorToLeftSubstring = 1,
AddSeparatorToRightSubstring = 2
}
public static class Extensions
{
public static string[] Split(this string source, char[] separators, SeparatorOptions separatorOptions)
{
if (separatorOptions < SeparatorOptions.None || separatorOptions > SeparatorOptions.AddSeparatorToRightSubstring)
{
throw new ArgumentException("Arg_SeparatorEnumIllegalVal");
}
if (separators == null || separators.Length == 0)
{
return source.Split((char[])null);
}
if (separatorOptions == SeparatorOptions.None)
{
return source.Split(separators);
}
// Getting all indexes of separators, and separators in source string
int foundCount = 0;
int[] separatorIndexes = new int[source.Length];
char[] separatorValues = new char[source.Length];
for (int i = 0; i < source.Length; i++)
for (int j = 0; j < separators.Length; j++)
{
char separator = separators[j];
if (source[i] == separator)
{
separatorValues[foundCount] = source[i];
separatorIndexes[foundCount++] = i;
break;
}
}
string[] splitString = source.Split(separators);
// Adding separators to left or right substring
if (separatorOptions == SeparatorOptions.AddSeparatorToLeftSubstring)
return AddSeparatorToLeftSubstring(splitString, separatorValues, foundCount);
if (separatorOptions == SeparatorOptions.AddSeparatorToRightSubstring)
return AddSeparatorToRightSubstring(splitString, separatorValues, foundCount);
return splitString;
}
private static string[] AddSeparatorToLeftSubstring(string[] splitString, char[] separatorValues, int foundCount)
{
for (int i = 0; i < foundCount; i++)
{
splitString[i] += separatorValues[i];
}
return splitString;
}
private static string[] AddSeparatorToRightSubstring(string[] splitString, char[] separatorValues, int foundCount)
{
for (int i = 1, j = 0; i <= foundCount; i++, j++)
{
splitString[i] = separatorValues[j] + splitString[i];
}
return splitString;
}
}
class Program
{
static void Main(string[] args)
{
string str = "-1-2+3+4-5-6";
Console.WriteLine("Original split()");
var res2 = str.Split(new char[] { '+', '-' }, SeparatorOptions.None);
foreach (var el in res2) Console.WriteLine(el);
Console.WriteLine("Add to right substring");
var res = str.Split(new char[] { '+', '-' }, SeparatorOptions.AddSeparatorToRightSubstring);
foreach (var elem in res) Console.WriteLine(elem);
Console.WriteLine("Add to left substring");
var res3 = str.Split(new char[] { '+', '-' }, SeparatorOptions.AddSeparatorToLeftSubstring);
foreach (var el in res3) Console.WriteLine(el);
Console.ReadKey();
}
}
What I wrote seems to work, although I am not sure if it is the best way, or recommended way to do something like this, so I want to ask what is wrong with my code and how could it be done better.
-
1\$\begingroup\$ Indeed... A great common problem. :) \$\endgroup\$Igor Soloydenko– Igor Soloydenko2018年05月22日 09:38:24 +00:00Commented May 22, 2018 at 9:38
-
\$\begingroup\$ It might be a good idea to allow the user the option to remove empty entries when it is all said and done (just like String.Split). Splitting right when the first character is a separator causes an empty string as the first substring. Likewise, when splitting left and the last character is a separator. \$\endgroup\$Shelby115– Shelby1152018年05月22日 15:14:06 +00:00Commented May 22, 2018 at 15:14
2 Answers 2
Review
- You use very clean and strong names for all your variables so it's very easy to understand your code.
- You too often go without
{}
. Especially the doublefor
loops become very confusing. - You don't use good exception messages. In
Arg_SeparatorEnumIllegalVal
theArg
prefix is not necessary because the type of the exctpion is already telling me it's about an argument. You also formatted it like it was a variable with PascalCase but it's a message so it should be formatted like a normal sentence.
Alternative solution
There is a much shorter way to get the same results. You could use two different regexes:
var str = "-1-2+3+4-5-6";
Regex.Matches(str, "([-+]?[0-9]?)").Cast<Match>().Select(m => m.Value).Dump();
Regex.Matches(str, "([0-9]?[-+]?)").Cast<Match>().Select(m => m.Value).Dump();
which will respectively give you:
-1
-2
+3
+4
-5
-6
""
and
-
1-
2+
3+
4-
5-
6
""
Whether you want to keep the empty string or not is up to you.
-
1\$\begingroup\$ Where does Dump() come from? \$\endgroup\$paparazzo– paparazzo2018年05月22日 17:04:31 +00:00Commented May 22, 2018 at 17:04
-
\$\begingroup\$ @paparazzo LINQPad \$\endgroup\$t3chb0t– t3chb0t2018年05月22日 17:06:53 +00:00Commented May 22, 2018 at 17:06
-
2\$\begingroup\$ @paparazzo It’s a LINQPad operation. You wouldn’t use it in actual code. \$\endgroup\$Konrad Rudolph– Konrad Rudolph2018年05月22日 17:23:10 +00:00Commented May 22, 2018 at 17:23
Can only pass a valid enum value so don't see the value in this:
if (separatorOptions < SeparatorOptions.None || separatorOptions > SeparatorOptions.AddSeparatorToRightSubstring)
Why oversize?
int[] separatorIndexes = new int[source.Length];
char[] separatorValues = new char[source.Length];
You could split first to get the proper length or use List
Don't skip { }
for (int i = 0; i < source.Length; i++)
{
char c = source[i]
foreach(char separator in separators)
{
-
2
-
\$\begingroup\$ @Malivil Cast any number to an enum? That is reaching. \$\endgroup\$paparazzo– paparazzo2018年05月23日 01:43:41 +00:00Commented May 23, 2018 at 1:43
-
\$\begingroup\$ How is that reaching if I proved it in my example? In this specific case it probably wouldn't happen, but if you're casting input coming in through something like an API or a database field, then it's not so unfathomable. \$\endgroup\$Malivil– Malivil2018年05月23日 12:24:01 +00:00Commented May 23, 2018 at 12:24
-
\$\begingroup\$ @Malivil OK fathomable. \$\endgroup\$paparazzo– paparazzo2018年05月23日 12:27:07 +00:00Commented May 23, 2018 at 12:27
-
1\$\begingroup\$ @Malivil is correct about enums, and you are mistaken. Peeking at Microsoft's code for String.Split shows it using an StringInternal method, which one of the first things it does is check
if (options < StringSplitOptions.None || options > StringSplitOptions.RemoveEmptyEntries)
very similar to the OP. See StringInternal ReferenceSource link. \$\endgroup\$Rick Davin– Rick Davin2018年05月23日 13:36:05 +00:00Commented May 23, 2018 at 13:36