One of the most important parts at the core of rubberduck is the IdentifierReferenceResolver
class, which is responsible for resolving each declaration as the parse tree is being walked and function, member and variable calls are being encountered.
Proper scoping is the single most important concept here: when we meet an assignment to foo
, we possibly have 20 foo
identifier declarations, and we need to find the single Declaration
object that's being referred to, create an IdentifierReference
object and add it to that declaration object's References
concurrent bag.
So that's what the IdentifierReferenceResolver
class is doing. For context, here's a few lines of code from the main ResolveInternal
overload:
callee = FindLocalScopeDeclaration(identifierName, localScope, parentContext, isAssignmentTarget) ?? FindModuleScopeProcedure(identifierName, localScope, accessorType, isAssignmentTarget) ?? FindModuleScopeDeclaration(identifierName, localScope) ?? FindProjectScopeDeclaration(identifierName, Equals(localScope, _currentScope) ? null : localScope, hasStringQualifier);
The resolver first tries to find the Declaration
object for identifierName
within the current (local) scope; if that doesn't work, it tries to find a procedure (member actually) in the same module that could match - otherwise FindModuleScopeDeclaration
would find 2 matches when a property has a getter and a setter procedures. if that returns null
, then it tries to find a match at module scope, and when all else fails, it tries to find a match at project scope - and this is where things get funky.
I had this working code:
private Declaration FindProjectScopeDeclaration(string identifierName, Declaration localScope = null, bool hasStringQualifier = false) { var matches = _declarations.Items.Where(item => !item.IsBuiltIn && item.IdentifierName == identifierName || item.IdentifierName == identifierName + (hasStringQualifier ? "$" : string.Empty)).ToList(); if (matches.Count == 1) { return matches.Single(); } if (localScope == null && _withBlockQualifiers.Any()) { localScope = _withBlockQualifiers.Peek(); } try { return matches.SingleOrDefault(item => !item.IsBuiltIn && !item.DeclarationType.HasFlag(DeclarationType.Member) && item.DeclarationType != DeclarationType.Event // events can't be called outside the class they're declared in && (item.Accessibility == Accessibility.Public || item.Accessibility == Accessibility.Global || _moduleTypes.Contains(item.DeclarationType))) // todo: refactor ?? matches.SingleOrDefault(item => item.IsBuiltIn && item.DeclarationType != DeclarationType.Event && ((localScope == null && item.Accessibility == Accessibility.Global) || (localScope != null && (item.Accessibility == Accessibility.Public || item.Accessibility == Accessibility.Global) && localScope.IdentifierName == item.ParentDeclaration.IdentifierName))); } catch (InvalidOperationException) { return null; } }
Notice that // todo: refactor
comment - it's been there for about as far as I can remember. And that horrible "one-liner" (for some values of "one-liner") return
statement has only kept growing hair and tentacles since it was first written, and it seemed to me that extracting methods would make the logic more self-descriptive.
The method receives a string
identifier, and uses an instance-level _declarations
object that contains every single identifier that was picked up when the ANTLR parse tree was first walked to locate identifier declarations. When localScope
is specified, it's the Declaration
for the type of foo
in foo.Bar
. hasStringQualifier
is only needed to pick up the built-in the string-typed versions of a number of functions, such as Chr$
, UCase$
, Environ$
and others - merely working around a little limitation of the grammar/parser.
So I refactored it to this:
private Declaration FindProjectScopeDeclaration(string identifierName, Declaration localScope = null, bool hasStringQualifier = false)
{
// the "$" in e.g. "UCase$" isn't picked up as part of the identifierName, so we need to add it manually:
var matches = _declarations.Items.Where(item => !item.IsBuiltIn && item.IdentifierName == identifierName
|| item.IdentifierName == identifierName + (hasStringQualifier ? "$" : string.Empty)).ToList();
if (matches.Count == 1)
{
return matches.Single();
}
if (localScope == null && _withBlockQualifiers.Any())
{
localScope = _withBlockQualifiers.Peek();
}
try
{
return matches.SingleOrDefault(IsUserDeclarationInProjectScope)
?? matches.SingleOrDefault(item => IsBuiltInDeclarationInScope(item, localScope));
}
catch (InvalidOperationException)
{
return null;
}
}
private static bool IsPublicOrGlobal(Declaration item)
{
return item.Accessibility == Accessibility.Global
|| item.Accessibility == Accessibility.Public;
}
private bool IsUserDeclarationInProjectScope(Declaration item)
{
var isNonMemberUserDeclaration = !item.IsBuiltIn
&& !item.DeclarationType.HasFlag(DeclarationType.Member)
// events can't be called outside the class they're declared in, exclude them as well:
&& item.DeclarationType != DeclarationType.Event;
// declaration is in-scope if it's public/global, or if it's a module/class:
return isNonMemberUserDeclaration && (IsPublicOrGlobal(item) || _moduleTypes.Contains(item.DeclarationType));
}
private static bool IsBuiltInDeclarationInScope(Declaration item, Declaration localScope)
{
var isBuiltInNonEvent = item.IsBuiltIn && item.DeclarationType != DeclarationType.Event;
// if localScope is null, we can only resolve to a global:
// note: built-in declarations are designed that way
var isBuiltInGlobal = localScope == null && item.Accessibility == Accessibility.Global;
// if localScope is not null, we can resolve to any public or global in that scope:
var isInLocalScope = localScope != null && IsPublicOrGlobal(item)
&& localScope.IdentifierName == item.ParentDeclaration.IdentifierName;
return isBuiltInNonEvent && (isBuiltInGlobal || isInLocalScope);
}
Is this as good as it gets? What else could be improved? Are the comments helpful, or noise? Is anything unclear or questionable?
For more context, the full IdentifierReferenceResolver
class can be found here (GitHub).
2 Answers 2
try { return matches.SingleOrDefault(IsUserDeclarationInProjectScope) ?? matches.SingleOrDefault(item => IsBuiltInDeclarationInScope(item, localScope)); } catch (InvalidOperationException) { return null; }
This doesn't look like an appropriate use of try-catch
, but one you've been forced into by the semantics of SingleOrDefault
. In fact, I'm not sure the behaviour is what you want -- if two or more elements of matches
satisfy IsUserDeclarationInProjectScope
then an InvalidOperationException
is thrown, and we never check if any elements satisfy IsBuiltInDeclarationInScope
.
I think what you want is to write a method
TSource SingleOrDefault<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate,
TSource defaultValue)
that returns
defaultValue
if there are 0 elements satisfyingpredicate
,defaultValue
if there are 2+ elements satisfyingpredicate
,- The only matching element, otherwise.
With this method, we can write instead
return matches.SingleOrDefault(IsUserDeclarationInProjectScope, null)
?? matches.SingleOrDefault(item => IsBuiltInDeclarationInScope(item, localScope), null);
-
2\$\begingroup\$ You've just fixed a bug! \$\endgroup\$Mathieu Guindon– Mathieu Guindon2015年10月20日 05:07:00 +00:00Commented Oct 20, 2015 at 5:07
FindProjectScopeDeclaration
var matches = _declarations.Items.Where(item => !item.IsBuiltIn && item.IdentifierName == identifierName || item.IdentifierName == identifierName + (hasStringQualifier ? "$" : string.Empty)).ToList();
Here, if hastStringQualifier
would be false this will be evaluated to
var matches = _declarations.Items.Where(item => !item.IsBuiltIn && item.IdentifierName == identifierName || item.IdentifierName == identifierName).ToList();
So basically you don't care at all wether item.IsBuiltIn
or not.
If you care but coded this a little bit lazy, a better way would be to split this into two queries like so
List<Declaration> matches;
if (hasStringQualifier)
{
identifierName = identifierName + "$";
matches = _declarations.Items.Where(item => item.IdentifierName == identifierName).ToList();
}
else
{
matches = _declarations.Items.Where(item => !item.IsBuiltIn && item.IdentifierName == identifierName).ToList();
}
if (matches.Count == 1) { return matches.Single(); }
You should access the List<T>
item by its index instead of using the Single()
extension method.
if (localScope == null && _withBlockQualifiers.Any()) { localScope = _withBlockQualifiers.Peek(); }
From the sources I could see that withBlockQualifiers
is a Stack<T>
which has a Count
property which you should use instead of Any()
.
Although Any()
would use this property too, some casts and nullchecks will be added hence it has some performance costs. In addition you just should prefer builtin methods and properties over extension methods.
private static bool IsPublicOrGlobal(Declaration item) { return item.Accessibility == Accessibility.Global || item.Accessibility == Accessibility.Public; }
here I would prefer to have an Accessibility
enum parameter method argument to make it more clear and easier to read.
private bool IsUserDeclarationInProjectScope(Declaration item) { var isNonMemberUserDeclaration = !item.IsBuiltIn && !item.DeclarationType.HasFlag(DeclarationType.Member) // events can't be called outside the class they're declared in, exclude them as well: && item.DeclarationType != DeclarationType.Event; // declaration is in-scope if it's public/global, or if it's a module/class: return isNonMemberUserDeclaration && (IsPublicOrGlobal(item) || _moduleTypes.Contains(item.DeclarationType)); }
using a comment between boolean operators is ugly and makes this almost unreadable.
private static bool IsBuiltInDeclarationInScope(Declaration item, Declaration localScope) { var isBuiltInNonEvent = item.IsBuiltIn && item.DeclarationType != DeclarationType.Event; // if localScope is null, we can only resolve to a global: // note: built-in declarations are designed that way var isBuiltInGlobal = localScope == null && item.Accessibility == Accessibility.Global; // if localScope is not null, we can resolve to any public or global in that scope: var isInLocalScope = localScope != null && IsPublicOrGlobal(item) && localScope.IdentifierName == item.ParentDeclaration.IdentifierName; return isBuiltInNonEvent && (isBuiltInGlobal || isInLocalScope); }
here you should return early if isBuiltInNonEvent == false
or if isBuiltInGlobal == true
in which case we can omit the check for localScope != null
. For readability I would leave the variables like they are although they could be omitted.
private static bool IsBuiltInDeclarationInScope(Declaration item, Declaration localScope)
{
var isBuiltInNonEvent = item.IsBuiltIn && item.DeclarationType != DeclarationType.Event;
if (!isBuiltInNonEvent) { return false; }
// if localScope is null, we can only resolve to a global:
// note: built-in declarations are designed that way
var isBuiltInGlobal = localScope == null && item.Accessibility == Accessibility.Global;
if (isBuiltInGlobal) { return true; }
// if localScope is not null, we can resolve to any public or global in that scope:
var isInLocalScope = IsPublicOrGlobal(item)
&& localScope.IdentifierName == item.ParentDeclaration.IdentifierName;
return isInLocalScope;
}
-
1\$\begingroup\$ So basically you don't care at all wether item.IsBuiltIn or not. oh wow, this has unforeseen implications elsewhere:
Function Foo$()
actually breaks the grammar, and is legal VBA! Yet another grammar bug solved! Every time I think I know all the issues this grammar has, I get a new surprise... \$\endgroup\$Mathieu Guindon– Mathieu Guindon2015年10月20日 06:07:37 +00:00Commented Oct 20, 2015 at 6:07