I am working on a fairly complex (at least it feels complex to me at the moment) search program and am looking to possibly increase the performance. Everything works exactly how I want it to, but I'm just wondering if there are any slight performance increases I could benefit from. In this code, even minor performance increases are essential. This is due to the extensive amount of operations being performed.
// Split search into multiple terms and check the {x} longest terms against the cache.
string[] word = searchTerm.Split(' ');
Array.Sort(word, (x, y) => y.Length.CompareTo(x.Length));
for (int i = 0; i < word.Length; i++)
{
string searchValue = word[i];
if (i <= MAX_WORD_ITERATIONS && (xmlsearchResults == BLANK_SEARCH_XML_SCHEMA || xmlsearchResults == string.Empty))
{
xmlsearchResults = GetCachedRecord(thisSearch, searchValue);
}
if (xmlsearchResults != BLANK_SEARCH_XML_SCHEMA && xmlsearchResults != string.Empty)
{
thisSearch.Value = searchValue;
ignoreCache = true;
break;
}
}
xmlString.Append(string.Format(@"<{0}>", xmlHeader));
int objcount = 0;
string prevResultName = string.Empty;
try
{
if (searchResults.hitCount > 0)
{
List<SearchItem> SearchResults = new List<SearchItem>();
foreach (Node ResultNode in Results.Nodes)
{
string code = ResultNode.code.ToString();
string result = ResultNode.id.ToString();
string name = ResultNode.name_l.ToString();
string image = string.Empty;
if (!code.ToLower().Contains("ccb") && code.Length <= MAX_CODE_LENGTH &&
name != prevResultName && objcount < maxResults)
{
Boolean addResult = true;
if (thisSearch.FilterDescription)
{
if (!name.ToLower().Contains(thisSearch.Value.ToLower()))
{
addResult = false;
}
}
if (addResult)
{
SearchItem oSearch = new SearchItem(name, code);
SearchResults.Add(oSearch);
prevResultName = name;
}
}
}
var SortedSearchResults = SearchResults.OrderByDescending(s => s.Downloads).ToList();
for (int i = 0; i < SortedSearchResults.Count(); i++)
{
if ((objcount < maxResults))
{
string te = string.Format("<ResultName>{0}</ResultName>", SortedSearchResults[i].Description);
if (!xmlString.ToString().Contains(te))
{
xmlString.Append(SortedSearchResults[i].GetXMLString());
objcount += 1;
}
}
}
}
}
xmlString.Append(string.Format(@"</{0}>", xmlHeader));
return objcount;
-
2\$\begingroup\$ Your code is not complete. Can you please at least tell us the types of all the identifiers you use? \$\endgroup\$Snowbody– Snowbody2015年03月12日 19:18:40 +00:00Commented Mar 12, 2015 at 19:18
3 Answers 3
Comparing strings case insensitively is done by using the proper overload:
string.Compare(str1, str2, StringComparison.OrdinalIgnoreCase)
By calling .ToLower()
for each string to compare (4 times for each iteration), you create 4 new strings each time. If you have a lot of (long) strings, this will create a lot of garbage for no reason.
In your case you use .Contains()
so you can mimic this by using .IndexOf()
:
str4.IndexOf("dyt2", StringComparison.OrdinalIgnoreCase) >= 0
It's prettier to use bool
instead of Boolean
, partly because it also keeps your code consistent.
Write your variable names in full: objectCount
, previousResultName
, etc.
Constants are PascalCase
, not WHATEVER_THIS_IS
as in Java.
The addResult
variable seems a little wonky. Instead of this:
Boolean addResult = true;
if (thisSearch.FilterDescription)
{
if (!name.ToLower().Contains(thisSearch.Value.ToLower()))
{
addResult = false;
}
}
if (addResult)
{
SearchItem oSearch = new SearchItem(name, code);
SearchResults.Add(oSearch);
prevResultName = name;
}
Why not this:
if (!(thisSearch.FilterDescription && !name.ToLower().Contains(thisSearch.Value.ToLower()))
{
SearchItem oSearch = new SearchItem(name, code);
SearchResults.Add(oSearch);
prevResultName = name;
}
You can clean this up a bit more but I'm not in a boolean algebra mood.
-
\$\begingroup\$ Okay thanks, so for the
.ToLower().Contains(...
it should just bestring.Compare(code, "ccd", StringComparison.OrdinalIgnoreCase)
instead? \$\endgroup\$Volearix– Volearix2015年03月12日 19:31:39 +00:00Commented Mar 12, 2015 at 19:31 -
1\$\begingroup\$ Oh, I didn't realize it was a .Contains call. In that case use
.IndexOf()
with the appropriate overload. I'll edit it. \$\endgroup\$Jeroen Vannevel– Jeroen Vannevel2015年03月12日 19:33:32 +00:00Commented Mar 12, 2015 at 19:33 -
\$\begingroup\$ Okay, so I would actually want
<
, right? Because it's aNot
operation. \$\endgroup\$Volearix– Volearix2015年03月12日 19:42:15 +00:00Commented Mar 12, 2015 at 19:42 -
\$\begingroup\$ @Volearix: correct. \$\endgroup\$Jeroen Vannevel– Jeroen Vannevel2015年03月12日 19:43:24 +00:00Commented Mar 12, 2015 at 19:43
-
1\$\begingroup\$ THIS_IS_SNAKE_CASE! *Kicks you into well* \$\endgroup\$Davor– Davor2015年03月13日 11:20:31 +00:00Commented Mar 13, 2015 at 11:20
One way to improve performance would be to not sort the whole list, since you're only concerned about the MAX_WORD_ITERATIONS
longest items. Just do a partial sort.
At the very least, limit your for
loop to only going MAX_WORD_ITERATIONS
times. If it's not found by then, it won't be found.
It looks like the first if
serves no purpose and can be eliminated.
Code segment 1
Array.PartialSort(word, (x, y) => y.Length.CompareTo(x.Length),MAX_WORD);
for (int i = 0; i <= MAX_WORD_ITERATIONS; i++)
{
string searchValue = word[i];
xmlsearchResults = GetCachedRecord(thisSearch, searchValue);
if (xmlsearchResults != BLANK_SEARCH_XML_SCHEMA && xmlsearchResults != string.Empty)
{
thisSearch.Value = searchValue;
ignoreCache = true;
break;
}
}
Code segment 2
xmlString.Append(string.Format(@"<{0}>", xmlHeader));
How about using AppendFormat()
? No speed increase but easier-to-read code.
if (!code.ToLower().Contains("ccb")
Instead of creating a temporary lowercase string, just use the case-insensitive compare:
code.IndexOf("ccb", StringComparison.OrdinalIgnoreCase) < 0
See https://stackoverflow.com/questions/444798/case-insensitive-containsstring
var SortedSearchResults = SearchResults.OrderByDescending(s => s.Downloads).ToList();
Why are you converting it to a List<>
? You're not doing anything that requires a List<>
. Leave it as an IEnumerable<>
and use a foreach
to iterate over it.
One thing which might improve performance:
I'm not sure what xmlString
is but given how it's used it's probably a StringBuilder
. What you do in this loop:
for (int i = 0; i < SortedSearchResults.Count(); i++) { if ((objcount < maxResults)) { string te = string.Format("<ResultName>{0}</ResultName>", SortedSearchResults[i].Description); if (!xmlString.ToString().Contains(te)) { xmlString.Append(SortedSearchResults[i].GetXMLString()); objcount += 1; } } }
is that you convert the content of the StringBuilder
into a string
then do a string search on it (Contains()
) and then append if it's no already present. So for every result you have to search through all previous results in the end making this an an O(n^2)
operation. Using a Distinct
filter should reduce the complexity to O(n)
. LINQ doesn't provide a Distinct
for a specific property but it's easy to write a little extension method (courtesy of SO):
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> seenKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (seenKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
HashSet<>
has O(1)
lookup hence this filter is O(n)
.
Then your loop becomes:
var objCount = 0;
foreach (var result in SortedSearchResults.DistinctBy(r => r.Description).Take(maxResults))
{
xmlString.Append(result.GetXMLString());
++objCount;
}
-
\$\begingroup\$ You shouldn't just say "An O(1) algorithm is better" unless you know the constant factors and the sample size. An O(log n) or O(n) algorithm with a smaller constant factor may be superior for this use case. \$\endgroup\$Snowbody– Snowbody2015年10月29日 17:39:08 +00:00Commented Oct 29, 2015 at 17:39
-
\$\begingroup\$ @Snowbody: Hm, reading the answer I can't see where I just said "An O(1) algorithm is better". My answer says "It might be better to turn an O(n^2) algorithm into an O(n) algorithm by the use of a different data structure". Since the OP stated even minor performance improvements are important I'd say it's certainly worth a shot. \$\endgroup\$ChrisWue– ChrisWue2015年10月29日 20:00:33 +00:00Commented Oct 29, 2015 at 20:00