I have the code below that gets hit several hundred times per second. I'm wondering what I can do to improve performance. It seems that there should be some way to build up a substring index. Is there some existing examples of that or an explanation on how to do this?
using System;
using System.Collections.Generic;
using System.Linq;
using Asi.Server.Interfaces.History;
namespace Asi.Server.History.AssetHistory
{
public class TelemetrySubscriptionFilter: ITelemetryLogFilter, ITelemetryLogAutosubscriber
{
private readonly List<string> _shouldNotLogs = new List<string>
{
"Local Position Service",
"Global Position Service",
"PositionGroup.",
"DKS/"
};
public bool ShouldNotLog(string telemetryName)
{
float period;
if (ShouldAutoSubscribe(telemetryName, out period))
return false;
// telemetryName is longer than our filter texts
return _shouldNotLogs.Any(s => telemetryName.IndexOf(s, StringComparison.Ordinal) >= 0);
}
private readonly List<string> _shouldAutoSubscribe = new List<string>
{
"Manual Mode",
"Autonomous Driven",
"Battery Voltage",
"RPM",
"Engine On",
"Parking Brake",
"Gear",
"Stopping Distance",
"Ready For Motion",
"GPS Correction Sent",
"Setpoint",
"Dead Reckon",
"VCU uC",
"Feedback",
"Off Path",
"RMS",
"Yaw Rate",
"Stop Enabled",
"Arbiter",
"Velocity Error",
"Processor"
};
public bool ShouldAutoSubscribe(string telemetryName, out float period)
{
period = 0.375f; // chosen somewhat arbitrarily
return _shouldAutoSubscribe.Any(s => telemetryName.IndexOf(s, StringComparison.Ordinal) >= 0);
}
}
}
2 Answers 2
Have you tried making a Regex
that matches any of the strings in the list mentioned? If you create the regex in advance, it will probably be faster than looping over the list of strings. Convert the list into the regex using
Regex regex = new Regex(String.Join('|',list.Select(x=>Regex.Escape(x))));
then check using regex.IsMatch(telemetryName)
.
References:
- http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex(v=vs.110).aspx
Regex
- http://msdn.microsoft.com/en-us/library/dd783876(v=vs.110).aspx
String.Join
- http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.escape(v=vs.110).aspx
Regex.Escape()
- http://msdn.microsoft.com/en-us/library/3y21t6y4(v=vs.110).aspx
Regex.IsMatch()
-
\$\begingroup\$ Compiled Regex seems to be 4x faster. Thanks for the idea. \$\endgroup\$Brannon– Brannon2014年06月30日 14:42:53 +00:00Commented Jun 30, 2014 at 14:42
-
\$\begingroup\$ and of course, that's supposed to be a
private static readonly
class member, not a local variable. \$\endgroup\$Snowbody– Snowbody2014年06月30日 14:52:26 +00:00Commented Jun 30, 2014 at 14:52 -
1\$\begingroup\$ Right. And I pass it the compiled flag (as a static). \$\endgroup\$Brannon– Brannon2014年06月30日 14:58:22 +00:00Commented Jun 30, 2014 at 14:58
-
\$\begingroup\$ Ah,
RegexOptions.Compiled
. I'll have to remember that needs to be provided explicitly. \$\endgroup\$Snowbody– Snowbody2014年06月30日 15:11:47 +00:00Commented Jun 30, 2014 at 15:11
As with every performance question: run your code through a profiler
That said, your ShouldNotLog
method can be simplified. There is not much reason to check your auto subscribe list prior to checking your suppress list:
- _shouldNotLogs is a shorter list
- presumably, none of the strings in _shouldNotLogs should exist in _shouldAutoSubscribe
As a result, checking _shouldAutoSubscribe only adds extra work. The only times you perform fewer operations is if the input name matches one of the first three elements in _shouldAutoSubscribe. It's a wash at four. Anything else is extra work.
This leaves us with the following:
public bool ShouldNotLog(string telemetryName)
{
return _shouldNotLogs.Any(s => telemetryName.IndexOf(s, StringComparison.Ordinal) >= 0);
}
Next, for readability, I would instead use string.Contains
. If you check the .NET source, string.Contains
calls string.IndexOf
internally with an Ordinal check, so it is effectively the same thing, but it declares your intent far better:
public bool ShouldNotLog(string telemetryName)
{
return _shouldNotLogs.Any(s => telemetryName.Contains(s));
}
If possible, though, an even better way of doing it would be to extract your telemetry names to some numerical ID and use enums or ints. Number comparison will be far quicker than string comparison. However, you state that your input names are superstrings of those appearing in your lists, so I don't know if that is feasible.
-
\$\begingroup\$ Thanks for the info on the Contains method. I hadn't expected it to be Ordinal. \$\endgroup\$Brannon– Brannon2014年06月30日 14:44:25 +00:00Commented Jun 30, 2014 at 14:44
-
1\$\begingroup\$ @Brannon yeah - I wasn't 100% sure about it, so I looked it up. Microsoft released most/all the framework code over the Build conference this year. You can reference it at the sourceof.net page. \$\endgroup\$Dan Lyons– Dan Lyons2014年06月30日 17:03:18 +00:00Commented Jun 30, 2014 at 17:03
-
\$\begingroup\$ The docs for
Contains()
indicate that it uses an ordinal compare, though I don't know how long that tidbit has been included. \$\endgroup\$Snowbody– Snowbody2014年06月30日 19:45:31 +00:00Commented Jun 30, 2014 at 19:45
static
, but that won't affect the speed. \$\endgroup\$String.Contains()
? It already performs an ordinal comparison. \$\endgroup\$telemetryName
. \$\endgroup\$