I would like to know if my function that checks if a string syntax is correct. It's currently very messy, but seems to do the job.
//Example Valid String
[12345678]
{
something
whatever
}
Function:
private static bool CheckIfSyntaxIsValid(string input)
{
int mode = 1;
char last = ' ';
foreach (char c in input)
{
if (last == '{' && c == '}')
{
last = ' ';
mode = 1;
continue;
}
if (last == '{' && c == '{')
return false;
if (last == '[' && c == ']')
{
last = ' ';
mode = 2;
continue;
}
if (last == '[' && !Char.IsDigit(c))
return false;
if (c == '[')
{
if (mode == 2)
return false;
last = c;
mode = 1;
}
if (c == '{')
{
if (mode == 1)
return false;
last = c;
mode = 2;
}
if (c == ']' || c == '}')
return false;
}
if (last != ' ')
return false;
return true;
}
The rule is basically:
[]
must contain only numbers/digits- It must start with
[]
before{}
, and they must come after each other
So, [] {} [] {}
etc. Anything else is simply invalid.
Okay, not everything else is invalid. Text outside of those are ignored, so the only thing that matters is that the bracket orders are correct, etc.
2 Answers 2
Let me post another answer because this is completely different from the first one.
You want to have it less messy so how about this solution?
I introduced a simple Token
class that stores the token char and its position.
private class Token
{
public Token(char value, int position)
{
Value = value;
Position = position;
}
public char Value { get; }
public int Position { get; }
}
The new SyntaxService
does the following now:
- it finds all tokens
- it checks if the number is even and divisible by 4 (
[]{}
.Length) - it checks if each batch of four tokens has the valid sequence
[]{}
- it checks if the attribute isn't empty
- it checks if the attribute contains all digits
- it checks if the
{}
block is not-empty
Example:
public class SyntaxService
{
private const string separators = "[]{}";
public bool ValidateSyntax(string value)
{
var tokens = FindTokens(value).ToList();
return ValidateTokens(value, tokens);
}
private IEnumerable<Token> FindTokens(string value)
{
for (int i = 0; i < value.Length; i++)
{
if (separators.Contains(value[i]))
{
yield return new Token(value[i], i);
}
}
}
private bool ValidateTokens(string value, IList<Token> tokens)
{
var isValidTokenCount = tokens.Count % separators.Length == 0;
if (!isValidTokenCount)
{
return false;
}
var substring = new Func<int, int, string>((start, end) => value.Substring(start + 1, end - start - 1));
for (var i = 0; i < tokens.Count; i += separators.Length)
{
var batch = tokens.SkipFast(i).Take(separators.Length).ToArray();
var isValidTokenSequence = batch.Select(x => x.Value).SequenceEqual(separators);
if (!isValidTokenSequence)
{
return false;
}
const int squareBracketLeft = 0;
const int squareBracketRight = 1;
const int curlyBracketLeft = 2;
const int curlyBracketRight = 3;
var attribute = substring(batch[squareBracketLeft].Position, batch[squareBracketRight].Position);
if (string.IsNullOrWhiteSpace(attribute))
{
return false;
}
if (!attribute.All(x => Char.IsDigit(x)))
{
return false;
}
var content = substring(batch[curlyBracketLeft].Position, batch[curlyBracketRight].Position);
if (string.IsNullOrWhiteSpace(content))
{
return false;
}
}
return true;
}
private class Token
{
public Token(char value, int position)
{
Value = value;
Position = position;
}
public char Value { get; }
public int Position { get; }
}
}
My helper extension for SkipFast
for lists:
static class Extensions
{
public static IEnumerable<T> SkipFast<T>(this IList<T> values, int index)
{
for (var i = index; i < values.Count; i++)
{
yield return values[i];
}
}
}
Usage:
var isValid = new SyntaxService().ValidateSyntax(text);
This seems to work even better then the original method. It correctly recognizes invalid syntax where the original one thinks it's valid.
-
\$\begingroup\$ I will mark this as an Answer cause it gives much insight and i can learn from it in this context and take useful things from it:)! \$\endgroup\$Zerowalker– Zerowalker2016年11月15日 13:07:04 +00:00Commented Nov 15, 2016 at 13:07
I suggest making the code more expressive by encapsulating the conditions as lambdas before the loop.
You should avoid magic numbers and for the mode
create constants or an enum.
In the following example the {}
stand for a block and []
for an attribute.
private static bool CheckIfSyntaxIsValid(string input)
{
var mode = 1;
var last = ' ';
var isEndOfEmptyBlock = new Func<char, bool>(c => last == '{' && c == '}');
var isBeginOfNestedBlock = new Func<char, bool>(c => last == '{' && c == '{');
var isEndOfEmptyAttribute = new Func<char, bool>(c => last == '[' && c == ']');
var isNotDigitInsideAttribute = new Func<char, bool>(c => last == '[' && !Char.IsDigit(c));
var isAttributeBegin = new Func<char, bool>(c => c == '[');
var isAttributeEnd = new Func<char, bool>(c => c == ']');
var isBlockBegin = new Func<char, bool>(c => c == '{');
var isBlockEnd = new Func<char, bool>(c => c == '}');
foreach (char c in input)
{
if (isEndOfEmptyBlock(c))
{
last = ' ';
mode = 1;
continue;
}
if (isBeginOfNestedBlock(c))
{
return false;
}
if (isEndOfEmptyAttribute(c))
{
last = ' ';
mode = 2;
continue;
}
if (isNotDigitInsideAttribute(c))
{
return false;
}
if (isAttributeBegin(c))
{
if (mode == 2)
{
return false;
}
last = c;
mode = 1;
}
if (isBlockBegin(c))
{
if (mode == 1)
{
return false;
}
last = c;
mode = 2;
}
if (isAttributeEnd(c) || isBlockEnd(c))
{
return false;
}
}
if (last != ' ')
{
return false;
}
return true;
}
-
\$\begingroup\$ This does simplify it in a sense, didn't think you could split it up in functions as variables like that, it's still very messy though (not your fault), i will give this a +1 for now at least:) \$\endgroup\$Zerowalker– Zerowalker2016年11月14日 14:29:43 +00:00Commented Nov 14, 2016 at 14:29
-
\$\begingroup\$ @Zerowalker maybe if you could explain the "mode" we could do more for you ;-) \$\endgroup\$t3chb0t– t3chb0t2016年11月14日 14:31:53 +00:00Commented Nov 14, 2016 at 14:31
-
\$\begingroup\$ Mode doesn't really mean anything, it's just a value so i can keep track with the "ifs". Cause if you look at my Example text, you can see that there is a order of things. \$\endgroup\$Zerowalker– Zerowalker2016年11月14日 19:58:47 +00:00Commented Nov 14, 2016 at 19:58
-
\$\begingroup\$ I noticed that using Func instead of manually writing the "bool if things" gives quite a huge performance hit (not that It really matters in my case). Though didn't think it was that huge (more than twice as slow in my tests) \$\endgroup\$Zerowalker– Zerowalker2016年11月15日 13:08:05 +00:00Commented Nov 15, 2016 at 13:08
-
\$\begingroup\$ @Zerowalker that's interesting, I didn't test it for performance, I wouldn't even expect it to be so bad. How big are your strings that you validate? I need to digg deeper and maybe learn something new too ;-) \$\endgroup\$t3chb0t– t3chb0t2016年11月15日 13:12:00 +00:00Commented Nov 15, 2016 at 13:12
[208208]{ something, {whatever}}
be valid, or invalid input? \$\endgroup\$