I've written this class to create RapportColumn
objects for each column that is written in the query. And I would like to get your feedback.
The query that is given in the GetRapportColumnsList()
method is always a valid query. That check has been done in a previous class.
I am specifically asking about the CreateColumns
class, but any other comments are greatly appreciated. Can I optimize this code any further?
class RapportService
{
static void Main(string[] args)
{
var service = new RapportService();
var query =
"SELECT columna, columnaa, columnb, columnc, DATE_FORMAT(columnd, '%Y-%m-%d'), DATE_FORMAT(columnd, '%T') , IF(columne = (SELECT count(*) FROM tablea WHERE columna = m.columnb AND columnc = 0 AND columnd = 1), 'Goed', 'Fout') as somestring FROM tableb m WHERE columna = 64 AND columnb >= '2018-09-04 00:00:00' AND columnd <= '2018-09-08 00:00:00' ORDER BY columnb ASC";
var x = service.GetRapportColumnsList(query, 5);
Console.Read();
}
private Queue<string> columnQueue = new Queue<string>();
private List<RapportColumn> columns = new List<RapportColumn>();
// param -> part of query everthing between select and from
// split on ',' and add it in a queue
private void AddColumnStringsToQueue(string query)
{
var columns = query.Split(',');
foreach (var column in columns)
{
columnQueue.Enqueue(column);
}
}
// param query and rapport_id
// get the part of the query between SELECT and FROM
// calls the method to put everythin in the queue
// calls the method to create the column objects
public List<RapportColumn> GetRapportColumnsList(string query, int rapport_id)
{
var i = query.LastIndexOf(SqlClauseEnum.FROM, StringComparison.CurrentCultureIgnoreCase);
var res = query.Substring(SqlClauseEnum.SELECT.Length, i - SqlClauseEnum.SELECT.Length);
AddColumnStringsToQueue(res);
CreateColumns(rapport_id);
return columns;
}
// loop trough queue and create RapportColumn objects
private void CreateColumns(int rapport_id)
{
var stack = new Stack<char>();
var count = 0;
while (columnQueue.Count > 0)
{
var builder = new StringBuilder();
count++;
var rapportColumn = new RapportColumn();
var item = columnQueue.Dequeue();
builder.Append(item);
var bracket = GetOpeningBracket(item);
if (bracket == BracketsEnum.ParenthesisOpen)
{
if (!IsSurrounded(item))
{
stack.Push(bracket);
while (columnQueue.Count > 0)
{
var newItem = columnQueue.Peek();
var newBracket = GetOpeningBracket(newItem);
if (newBracket == BracketsEnum.ParenthesisOpen)
{
if (!IsSurrounded(item))
{
stack.Push(newBracket);
}
}
builder.Append(", ");
builder.Append(newItem);
columnQueue.Dequeue();
if (GetClosingBracket(newItem) == BracketsEnum.ParenthesisClose)
{
if (!IsSurrounded(item))
{
stack.Pop();
if (stack.Count == 0)
{
break;
}
}
}
}
}
}
var query = builder.ToString();
if (query.IndexOf("AS", StringComparison.CurrentCultureIgnoreCase) > 0)
{
var parts = Regex.Split(query, "AS", RegexOptions.IgnoreCase);
rapportColumn.RC_ColumnQuery = parts[0];
rapportColumn.RC_ColumnLabel = parts[1];
}
else
{
rapportColumn.RC_ColumnQuery = query;
}
rapportColumn.Rapport_Id = rapport_id;
rapportColumn.RC_Order = count;
columns.Add(rapportColumn);
}
}
// checks if there is a opening parenthesis in the string
private char GetOpeningBracket(string item)
{
var arr = item.ToCharArray();
foreach (var charr in arr)
{
switch (charr)
{
case BracketsEnum.ParenthesisOpen:
return charr;
default:
continue;
}
}
return '0円';
}
// checks if there is a closing parenthesis in the string
private char GetClosingBracket(string item)
{
var arr = item.ToCharArray();
foreach (var charr in arr)
{
switch (charr)
{
case BracketsEnum.ParenthesisClose:
return charr;
default:
continue;
}
}
return '0円';
}
// checks if there are as much closing as opening parenthesis in the string
private bool IsSurrounded(string item)
{
var stack = new Stack<int>();
var isSurrounded = false;
for (var i = 0; i < item.Length; i++)
{
switch (item[i])
{
case BracketsEnum.ParenthesisOpen:
stack.Push(i);
break;
case BracketsEnum.ParenthesisClose:
var index = stack.Any() ? stack.Pop() : -1;
break;
default:
break;
}
}
return stack.Count <= 0;
}
}
class RapportColumn
{
public int RC_Id { get; set; }
public int Rapport_Id { get; set; }
public string RC_ColumnName { get; set; }
public string RC_ColumnLabel { get; set; }
public string RC_ColumnQuery { get; set; }
public int RC_Order { get; set; }
public bool RC_checked { get; set; }
}
class SqlClauseEnum
{
public const string FROM = "FROM";
public const string WHERE = "WHERE";
public const string GROUP_BY = "GROUP BY";
public const string HAVING = "HAVING";
public const string SELECT = "SELECT";
public const string ORDER_BY = "ORDER BY";
public const string LIMIT = "LIMIT";
}
class BracketsEnum
{
public const char ParenthesisOpen = '(';
public const char ParenthesisClose = ')';
}
1 Answer 1
Bugs
Performance doesn't matter when the results aren't correct. First make it work goed, then make it work snel! ;)
Your approach to parsing SQL queries is broken. You can't just split on commas or check for the presence of keywords or parenthesis like that. Column and table names can contain keywords, and when quoted, can also contain commas and parenthesis. The same goes for strings. Here are a few queries that produce incorrect results:
SELECT smell FROM fromage
- note thefrom
infromage
.SELECT passport FROM items
- note theas
inpassport
.SELECT `foo,bar` FROM table
- this should produce a single column, not two.SELECT IF(condition, '1) success', '2) failure') FROM table
- likewise, a single column, not two.
There's another bug: columns
isn't cleared, so calling GetRapportColumnsList
twice in a row produces incorrect results. Another problem is that, because columns
is returned directly, a caller that modifies its results affects the results of any other caller as well.
Other notes
- Instead of letting
AddColumnStringsToQueue
andCreateColumns
store their results in instance fields, I would let them return their results instead. I'd also modifyCreateColumns
to take the column parts queue as a parameter. This makes the dependency between them more obvious, and makes it more difficult to use them incorrectly (such as calling them in the wrong order). There doesn't seem to be any reason to keepcolumns
around afterGetRapportColumnsList
returns anyway. - With the above change,
RapportService
has become stateless, and so the parsing methods might as well be madestatic
. They're 'pure' functions whose outputs only depend on their inputs. - Why use a queue instead of passing an array or list, and iterating it with foreach?
GetOpeningBracket(input)
can be replaced byinput.Contains(BracketsEnum.ParenthesisOpen)
. The same goes forGetClosingBracket
.- The name
IsSurrounded
is somewhat misleading: that method actually checks if the parenthesis in the input are properly balanced - and an input without parenthesis is also balanced. - Personally I wouldn't add an
Enum
suffix to a class name, especially since they're not enums. I would also put those constants in a single place, not spread across two classes.