This parser is made for part of larger trivia game, and only allowed numbers are positive integers. So negative or decimal numbers should return -1 as sign of error.
I'm pretty new to programming and have never encountered something similar. I tried my best. I guess its not an optimal solution, but I would like to get some feedback on what is good, what to change, and overall quality of this code and its performance.
using System.Data;
using System.Text;
namespace Parser
{
class MathParser
{
public const string validSigns = "+-*/()";
public int Parse(string input)
{
var tokens = Tokenize(input);
int result = Eval(tokens, 0);
return result;
}
public int Eval(List<string> tokens, int pos)
{
while (pos < tokens.Count)
{
if (tokens[pos] == "(")
{
int openB = 0;
int closeB = 0;
for (int i = pos; i < tokens.Count; i++)
{
if (tokens[i] == "(")
{
openB++;
}
else if (tokens[i] == ")")
{
closeB++;
if (openB == closeB)
{
tokens.RemoveAt(i);
tokens.RemoveAt(pos);
var expressionTokens = tokens.Skip(pos).Take(i - pos - 1).ToList();
int result = Eval(expressionTokens, 0);
if (result < 0)
{
return -1;
}
tokens.RemoveRange(pos + 1, i - pos - 2);
tokens[pos] = result.ToString();
pos = 0;
break;
}
}
}
}
pos++;
}
for (int i = 0; i < tokens.Count; i++)
{
if (tokens[i] == "*" || tokens[i] == "/")
{
int left = Convert.ToInt32(tokens[i - 1]);
int right = Convert.ToInt32(tokens[i + 1]);
if (tokens[i] == "/" && left % right != 0)
{
return -1;
}
int result = tokens[i] == "*" ? left * right : left / right;
if (result < 0)
{
return -1;
}
tokens[i] = result.ToString();
tokens.RemoveAt(i + 1);
tokens.RemoveAt(i - 1);
i = 0;
}
}
for (int i = 0; i < tokens.Count; i++)
{
if (tokens[i] == "+" || tokens[i] == "-")
{
int left = Convert.ToInt32(tokens[i - 1]);
int right = Convert.ToInt32(tokens[i + 1]);
int result = tokens[i] == "+" ? left + right : left - right;
if (result < 0)
{
return -1;
}
tokens[i] = result.ToString();
tokens.RemoveAt(i + 1);
tokens.RemoveAt(i - 1);
i = 0;
}
}
return Convert.ToInt32(tokens[0]);
}
public List<string> Tokenize(string expression)
{
var tokens = new List<string>();
expression = RemoveWhiteSpace(expression);
var tempN = new StringBuilder();
for (int i = 0; i < expression.Length; i++)
{
if (isNumber(expression[i]))
{
tempN.Append(expression[i]);
}
else
{
if (tempN.Length > 0)
{
tokens.Add(tempN.ToString());
tempN.Clear();
}
if (isValidSign(expression[i]))
{
tokens.Add(expression[i].ToString());
}
}
}
if (tempN.Length > 0)
{
tokens.Add(tempN.ToString());
}
return tokens;
}
public void print(List<string> tokens)
{
foreach (var token in tokens)
Console.Write(" {0} ", token);
System.Console.WriteLine();
}
public string RemoveWhiteSpace(string input)
{
return new string(input
.Where(c => !Char.IsWhiteSpace(c))
.ToArray());
}
public bool isValidSign(char c)
{
for (int i = 0; i < validSigns.Length; i++)
{
if (c == validSigns[i])
{
return true;
}
}
return false;
}
public bool isNumber(char c)
{
return char.IsDigit(c);
}
public bool isNumber(string s)
{
return int.TryParse(s, out _);
}
}
}
At first I tried building it without slicing down list for each bracket (instead I would provide indexes for start and end of bracket, but deleting whole bracket was hell), but it would throw errors at nested brackets, and I just couldn't follow it up, so I deleted it all and tried to built it this way, which seems to have succeeded.
1 Answer 1
Well, it's a very good start, I can see that you've covered most cases. There are some notes that will give you the boost you need to learn more and more.
validSigns
should bechar
array instead of string.Parse
should have a guard againstNull
orWhiteSpace
andEmpty
strings.if(string.IsNullOrWhiteSpace(input))
Eval
method should be private (you only need to exposeParse
the rest should be hidden from the consumer).Eval
should be divided into multiple small methods one for each operation, such asEvaluateAddition
,EvaluateDivision
,EvaluateMultiplication
..etc. and in the main methodEval
you call them. This will make it easier to manage and flexible to add new signs or expressions identifiers.when you want to iterate over a
string
characters, you do not need to create a new collection for every find. In fact, you can iterate over that, and do all necessary checks, and you could store the indexes, or parse the parts you need into a new class. For instance, if you are trying to extract anint
, then you can extract that part and try to parse it as int, then store that int value for later use. (find, and parse then store once).Remove
from a any collection should not be used in your case, since theRemove
will add more unnecessary allocation and performance hit (because you add extra un needed work). So, if you see you are creating a collection and then removing items from it, then you are parsing stuff that you do not need, in which, you want to fix the root issue (the parser).instead of parsing from string to string, you can use a class model in which will store each parsed expression something like this
public enum MathExpressionSign { Addition, Division, Multiplication, OpenBrace, CloseBrace } private sealed class MathExpression { public int LeftOperand { get; set; } public int RightOperand { get; set; } public MathExpressionSign Sign { get; set; } }
the MathExpressionSign
can be used as a sign identifier in which you can identify the operation type later on, and `MathExpression' will hold the values and its operation type. This way you can parse the expression once to this model, and then use that to do the calculation (better maintainability and less memory allocation).
isNumber
,RemoveWhiteSpace
, andisValidSign
you do not need them, as you only need to do this only if the condition is complex or you need to add more readability or reusability to it. But, I did not see that in them. So, you can do this instead :for (int i = 0; i < expression.Length; i++) { var character = expression[i]; if(char.IsWhiteSpace(character)) continue; if (char.IsDigit(character) && int.TryParse($"{character}", out var digit)) { tempN.Append(character); } else { if (tempN.Length > 0) { tokens.Add(tempN.ToString()); tempN.Clear(); } if (validSigns.Contains(character)) { tokens.Add(character.ToString()); } } }
In any case, two things you should always focus on in any application (over all performance, and memory allocation). So, whenever you develop your skills, you will gain more knowledge and skills that make your coding top notch as they say ;).
In your code : ToList
or ToArray
and creating more string
would impact the memory allocation, so instead you can just iterate and read using some low memory allocation techniques such as Span<T>
and IEnumerable<T>
. This might not be a big deal since you are only experimenting your skills, however, in the long run, you will developed a bad habit if not handled from the start.
Lastly, in .NET
there is DataTable.Compute
class in which can evaluate mathematical expressions. You can use it instead, but if you want to only experiment and learn, you can do more expressions and compare your work with it to improve your work further.
Console.Write
next to aSystem.Console.WriteLine
- no need to duplicate the namespace you've already referenced. In some cases, you surround single statements with braces, others not - programmers are generally appreciative of braces everywhere to make maintenance clear. \$\endgroup\$signs
are actuallyoperators
. No biggie, but better to be clear about their purpose... (And the numbers areoperands
.) \$\endgroup\$