I have 2 functions used for parsing a house number and I think these can both be improved.
Any suggestions?
input:
45a - 47b
or
45a bis 47b
output:
45,a,47,b
public static string[] HausnummerAufteilen(string hausNummer)
{
string[] result = new string[4];
string[] resultVon = new string[2];
string[] resultBis = new string[2];
resultVon[0] = string.Empty;
resultVon[1] = string.Empty;
resultBis[0] = string.Empty;
resultBis[1] = string.Empty;
int pos1, pos2;
pos2 = 0;
pos1 = hausNummer.IndexOf("bis");
if (pos1 != -1)
{
pos2 = pos1 + 3;
}
else
{
pos1 = hausNummer.IndexOf("-");
if (pos1 != -1)
pos2 = pos1 + 1;
}
if (pos1 > 0)
{
resultVon = HausnummerBuchst(hausNummer.Substring(0, pos1).Trim());
resultBis = HausnummerBuchst(hausNummer.Substring(pos2, hausNummer.Length - pos2).Trim());
}
else
resultVon = HausnummerBuchst(hausNummer);
List<string> list = new List<string>();
list.AddRange(resultVon);
list.AddRange(resultBis);
result = list.ToArray();
return result;
}
public static string[] HausnummerBuchst(string hauseNummer)
{
string[] result = new string[2];
result[0] = string.Empty;
result[1] = string.Empty;
int iPos;
int testInt;
bool bFound = false;
for (iPos = 0; iPos < hauseNummer.Length; iPos++)
{
if (!int.TryParse(hauseNummer[iPos].ToString() ,out testInt))
{
bFound = true;
break;
}
}
if (bFound)
{
result[0] = hauseNummer.Substring(0, iPos).Trim();
result[1] = hauseNummer.Substring(iPos, hauseNummer.Length - iPos).Trim();
}
else
result[0] = hauseNummer;
return result;
}
1 Answer 1
Anytime I encounter a method that's only doing read-only parsing of strings and I see it's using String.IndexOf()
and positions, it leaps out at me like a gigantic code smell.
Sometimes upon examination it turns out to be a legitimate and wise usage. However very often the programmer is manually splitting strings and looping through parts of the string. In other words they are doing exactly what String.Split()
and foreach
loops would offer them, but doing it the longer and less readable way.
This is one of those cases.
This may be just my personal taste in code. In any case, I can show you how I would tackle this problem using foreach
and Split()
, which I believe would be a great improvement to your methods. Except for this difference I think we might have taken comparable approaches.
Variable names
A lot of your variables are named with very little descriptiveness. You may understand that variables are named so that someone maintaining or reading the code (like me) can understand what's going on, but you could do much better in this department.
result
: You use this word many times in your variable names. The problem is this name is completely useless to me. I can see it is the result of something (variously a function, a loop or some other process) but beyond that you might as well just call the variablefoo
orbar
.hauseNummerVon
andhauseNummerBin
are far more understandable thanresultVon
andresultBin
- I now know exactly what it is; it's the from house number and the to house number.- For either
result[]
array, I am left with nothing and I would have to read all of the code before I understood what they contained. This is a little backwards - I should know what the variable contains, then be able to read the code - with descriptive variable names giving me a greater understanding of what's going on.
- For either
bFound
: Don't shorten your variable names like this. I'm left wondering: What is ab
? What does it mean if it's found? Maybe this would be more recognisable to a German reader, but even so, expandingb
to a word to say exactly what's found would do wonders for readability.pos1
,pos2
:SeparatorPosition
andSeparatorEnd
would have been more descriptive. Better yet these variables are entirely unnecessary if you just useString.Split()
.
My approach
As I mentioned above, as oppose to editing your methods, I've written from scratch how I'd tackle this problem.
ParseHouseNumberRange
This is the entry method. You feed this method a string such as "20a bis 21c". It splits it up on the "bis" or the "-" (defined in the separators
array) in order to produce smaller chunks of the original string.
Each of these chunks is passed to then passes each of the results (in this case, "20a " and " 21c", spaces included) over to the non-public method ParseHouseNumber
below. ParseHouseNumber
will split these chunks up further and this method combines all the results into a single array which is returned.
Given "20a bis 21c" or similar, it returns an array: 20, a, 21, c
.
public string[] ParseHouseNumberRange(string houseNumberRange)
{
string[] separators = new string[] {"bis", "-"};
string[] houseNumbers = houseNumberRange.Split(separators,
StringSplitOptions.RemoveEmptyEntries);
List<string> parsedList = new List<string>();
foreach (string houseNumber in houseNumbers)
{
parsedList.AddRange(ParseHouseNumber(houseNumber));
}
return parsedList.ToArray();
}
ParseHouseNumber
This method takes a segment of an address from the method above, such as "20a ". After trimming out any spaces, it walks through the string character by character, plucking out digits and letters and sorting them into their own individual sections e.g. "1c4ad3" would be split into the sections 1, c, 4, ad, 3
. Any non-digit non-letter characters are ignored.
Once all sections have been sorted out from the string provided to it, it returns those sections in the form of an array.
public string[] ParseHouseNumber(string houseNumber)
{
bool firstRun = true;
bool alphabeticMode = false; // set on first run
houseNumber = houseNumber.Trim();
List<string> sections = new List<string>();
string currentSection = "";
foreach (char c in houseNumber)
{
bool isLetter = Char.IsLetter(c);
bool isDigit = Char.IsDigit(c);
if (!(isDigit || isLetter)) continue;
// If we've switched character type, then a section's finished.
if (firstRun)
{
alphabeticMode = isLetter;
firstRun = false;
}
else if ((isLetter && !alphabeticMode)
|| (isDigit && alphabeticMode))
{
sections.Add(currentSection);
currentSection = "";
alphabeticMode = !alphabeticMode;
}
currentSection += c;
}
if (currentSection != "") sections.Add(currentSection);
return sections.ToArray();
}