I have a use case where I want to keep a pre-compiled regex, cached, for performance reasons, but the regex pattern also needs to be updated infrequently. Instances of CachedRegexFilter class will be used by multiple threads. The actual pattern for the regex is supplied as a custom class, StringConfig, which is basically a global volatile string that can be updated by a different thread.
Example usage:
// Main thread:
StringConfig regex = new StringConfig() { Value = ".*"};
CachedRegexFilter filter = new CachedRegexFilter(regex);
// now the filter is passed on to multiple service classes
MyApi(apiFilter: filter);
MyCron(cronFilter: filter);
//The config is updated by a seperate service
ConfigUpdaterSerice(configToUpdate: regex);
I have come up with this solution but would appreciate a review to ensure it is truly thread safe!
// a thread safe class for filtering strings based on a dynamic regex represented by a StringConfig
public class CachedRegexFilter
{
private readonly StringConfig _overrideRegex;
private readonly object _regexReplaceLock = new object();
private volatile string _currentoverrideRegexPattern;
private volatile Regex _currentoverrideRegex;
public CachedRegexFilter(StringConfig overrideRegex)
{
_overrideRegex = Preconditions.IsNotNull(overrideRegex, nameof(overrideRegex)); ;
UpdateCachedPattern();
}
public bool Matches(string toMatch)
{
if (_currentoverrideRegexPattern != _overrideRegex.Value)
{
lock (_regexReplaceLock)
{
// now entering critical section. Check again
if (_currentoverrideRegexPattern != _overrideRegex.Value)
{
UpdateCachedPattern();
}
}
}
return _currentoverrideRegex?.IsMatch(toMatch) ?? false;
}
private void UpdateCachedPattern()
{
_currentoverrideRegexPattern = _overrideRegex.Value;
_currentoverrideRegex = string.IsNullOrEmpty(_currentoverrideRegexPattern)
? null
: new Regex(_currentoverrideRegexPattern, RegexOptions.Singleline | RegexOptions.Compiled);
}
}
2 Answers 2
It depends a bit on what you mean by thread-safe here. If you mean "won't throw an exception", then you're definitely safe. You do have some potentially bad behaviour depending on your use case.
Consider the case when the config has just been updated. Thread 1 calls Matches
and enters the critical region to update the regex and gets paused here:
private void UpdateCachedPattern()
{
_currentoverrideRegexPattern = _overrideRegex.Value;
// Thread 1 paused here "between" these statements
_currentoverrideRegex = string.IsNullOrEmpty(_currentoverrideRegexPattern)
? null
: new Regex(_currentoverrideRegexPattern, RegexOptions.Singleline | RegexOptions.Compiled);
}
Thread 2 now starts executing Matches
:
if (_currentoverrideRegexPattern != _overrideRegex.Value)
{
// ...
As thread 1 has already updated the _currentoverrideRegexPattern
field, the check is false (i.e. they are already equal). Thread 2 continues executing and uses the out of date Regex
.
You could solve this by swapping the order:
private void UpdateCachedPattern()
{
var newPattern = _overrideRegex.Value;
_currentoverrideRegex = string.IsNullOrEmpty(newPattern)
? null
: new Regex(newPattern, RegexOptions.Singleline | RegexOptions.Compiled);
_currentoverrideRegexPattern = newPattern;
}
Honestly, the pattern you're going for here seems like a recipe for disaster. What you want to do is be able to either pull the configuration each time or have a way of the configuration notifying interested parties that it has changed.
What you're currently doing is basically mutating global state which makes code harder to follow and easier to break.
-
\$\begingroup\$ Unfortunately I can't modify the way we do configs in this system. Ideally the library should have Regex directly instead of strings. \$\endgroup\$Dexter– Dexter2022年08月01日 15:13:29 +00:00Commented Aug 1, 2022 at 15:13
I've got 2 reviews, first is on topic and second off topic.
Why do you have multiple regex patterns? Why would this not suffice:
public class ThreadSafeRegex {
private Object _lock = new Object();
private Regex _value;
public ThreadSafeRegex(StringConfig regexString) {
updateRegex(regexString);
}
public void updateRegex(StringConfig newRegexString) {
lock(_lock) {
_value = string.IsNullOrEmpty(_currentoverrideRegexPattern)
? null
: new Regex(newRegexString, RegexOptions.Singleline | RegexOptions.Compiled);
}
}
public bool match(string toMatch) {
lock(_lock) {
return _value?.IsMatch(toMatch) ?? false;
}
}
}
And secondly, what are your reasons for naming the variables the way you did? I don't understand why the private variables all contain override
in their names, when there is no "original" value that is overridden. It also seems like you named your variables after their implementation and not their intent, which makes it hard to understand your code. For more information on how to name variables well, you can read my recent post about making code more readable.
Explore related questions
See similar questions with these tags.
CachedRegexFilter
, which will be used on multiple threads. What happens tofilter
instance afterConfigUpdaterSerice
is finished? \$\endgroup\$