I've written a function that can be used at runtime to suggest how the provided incomplete Lua expression can be completed so that it would make sense (that is, it would evaluate to a non-nil value). For example, this can be used with an application console that can evaluate Lua expressions at runtime, so that it would allow you to autocomplete expressions such as "network.database[index].ro" with a Tab press to "network.database[index].rowCount" or "network.database[index].rows" (provided that network.database[index]
is a table that contains values at the rowCount
and rows
indices).
I am interested in all kinds of suggestions of how can this be improved. Did I handle all the special cases? Any C++ style suggestions?
extern "C"
{
#include "lua.h"
#include "lualib.h"
#include "lauxlib.h"
}
#include <set>
#include <string>
#include <vector>
#include <boost/algorithm/string/predicate.hpp>
// =================================================================================
// The 'buildAutoCompleteCandidates' returns a collection of options that suggest how
// the provided string could be completed so that it would evaluate to a valid value.
//
// For example, when there's a console that can be used to evaluate Lua expressions
// at the application's runtime, this can be used to provide the console with the autocomplete
// feature, similar to a <Tab> press in bash etc.
//
// For instance, the expression 'world.entities[1].p' can perhaps be autocompleted
// as 'world.entities[1].posX', 'world.entities[1].posY', or 'world.entities[1].pivot',
// and so on.
//
// NOTES:
// 1/ The expression to autocomplete is evaluated at runtime.
// 2/ The autocomplete search traverses metatables' __index tables as needed.
// 3/ The metatables' non-table __index values (such as __index functions) stop the search.
// 4/ Endless metatable loops are detected and stop the search.
// 5/ The Lua stack remains in its original state when the call is over.
// 6/ Global variable names can be autocompleted as well.
// =================================================================================
std::vector<std::string> buildAutoCompleteCandidates (lua_State* L, const std::string& str)
{
std::string strBase;
std::string strSep;
std::string strKeyStart;
// str: "world.entities[1].p"
// --> strBase: "world.entities[1]"
// --> strSep: "."
// --> strKeyStart: "p"
auto lastSepPos = str.find_last_of(".:");
if (lastSepPos == std::string::npos)
{
strKeyStart = str;
}
else
{
strBase = str.substr(0, lastSepPos);
strKeyStart = str.substr(lastSepPos+1);
strSep = str[lastSepPos];
}
std::vector<std::string> result;
if (strBase.empty())
{
// For Lua 5.1 replace the line with:
// lua_pushvalue(L, LUA_GLOBALSINDEX)
lua_pushglobaltable(L);
}
else
{
// Evaluate the expresssion contained in strBase
// and leave the result at the topmost stack position.
luaL_loadstring(L, ("return " + strBase).c_str());
lua_pcall(L, 0, 1, 0);
if (!lua_istable(L, -1))
{
lua_pop(L, 1);
return result;
}
}
// We now have the base table whose indices we need
// to iterate over at the topmost (-1) stack position.
std::set<const void*> visited;
while (true)
{
// Iterate over the table:
for (lua_pushnil(L); lua_next(L, -2); lua_pop(L, 1))
{
// For the ':' separator, it only makes sense
// to propose autocompletion with 'function' values:
if (strSep == ":" && !lua_isfunction(L, -1))
{
continue;
}
// Consider only the string indices:
if (lua_type(L, -2) == LUA_TSTRING)
{
std::string key = lua_tostring(L, -2);
if (boost::starts_with(key, strKeyStart))
{
result.push_back(strBase + strSep + key);
}
}
}
// Follow one step up in the metatable chain:
visited.insert(lua_topointer(L, -1));
if (lua_getmetatable(L, -1) != 0)
{
lua_remove(L, -2);
lua_getfield(L, -1, "__index");
lua_remove(L, -2);
if (!lua_istable(L, -1))
{
// Non-table '__index' values cannot
// provide us with a list of all indices.
lua_pop(L, 1);
break;
}
// Detect and break metatable cycles.
const void* index = lua_topointer(L, -1);
if (visited.find(index) != visited.end())
{
lua_pop(L, 1);
break;
}
}
else
{
lua_pop(L, 1);
break;
}
}
return result;
}
-
\$\begingroup\$ I don't know Lua, but aren't you scared of side effects when you evaluate your base string? Like deleteMyHardDrive().f<complete here> \$\endgroup\$kutschkem– kutschkem2014年04月02日 09:36:30 +00:00Commented Apr 2, 2014 at 9:36
-
\$\begingroup\$ @kutschkem That is a valid concern. At the moment, I am using this for an internal project only, so I don't really need it, but otherwise this would certainly had to be addressed. \$\endgroup\$mzi– mzi2014年04月14日 16:28:51 +00:00Commented Apr 14, 2014 at 16:28
1 Answer 1
Single Responsibility Principle
Your function does a lot! It has over 100 LoC and there clearly are seams where separation should take place. You should have an own function for each completion case:
- "global completion" (where no separator occurs in the string)
- completion after
.
- completion after
:
(doing the check for:
in the extraction loop is unnecessary work, it should be outside!)
At first this might look like much code duplication but there are more extractions that reduce it:
Traversing the __index
metatables (and detecting loops therein) is orthogonal to the actual extraction of indices. You could write a for_each_indices_metatable
helper function that calls the extraction method on each metatable.
Likewise the "local" traversal of the indices of a table is orthogonal to the extraction of (function) string completions. This would make for another for_each_index
method that gets passed the functor to find the indices whose prefix was entered.
Speaking of this extraction function: It should only extract the local indices and not the whole completed string (so "bar" instead of "foo.bar" for "foo.b"). This allows for more finegrained handling and you still can prepend "foo." later on.
Spaghetti code
Because everything is in one function there are many variables visible in those parts of the code that don't need them. The aforementioned function extractions would also lessen this effect.
Side-effects
As noted by kutschkem you could invoke side-effects during your function. I feel that you underestimate this problem.
Not only can these effects be very subtle (meaning you will observe them long after you introduced them), they are also not always that obvious (__index
could do some nasty things for you) and the programmers awareness is not always best.
Unfortunately, this problem is not easily solvable. For simple expressions you could try to parse them yourself instead of evaluating them. The modularization that I proposed should help you with this but the limits are reached very fast: You cannot do functions calls because each could have side-effects. You might try to white-list some functions but that would be very tedious and still no guarantee that there are no side-effects (you might overlook code-paths, the functions might be blackboxes, changed parameter values could introduce indirect side-effects).
So the completion will always be relatively limited. However, you could get some inspiration on code completion tools for other script languages that might give you an idea on how to approach this problem.
Use the right methods
Instead of std::set::find
you should use std::set::count
to determine if an element is already inside the set if you don't need its iterator. This makes the code shorter and expresses your intent more clearly.
Use standard documentation formats
You have written a whole lot of documentation comments for your function but they are in a non standard format. You should use a format like doxygen instead to make the comments parseable for automatic documentation generators.
Naming
Some of the names could be improved. For example:
L
, this name might be standard in the lua code but it looks like a macro name (and might (hopefully not) collide with an existing macro),luaState
would be a very straightforward and not too long namestr
->completionExpression
strBase
->baseExpression
strSep
->lastSeparator
(in accordance withlastSepPos
)strKeyStart
->completionKeyPrefix
visited
->visitedMetaTables
Usability
You might also want to support completion for table[
contexts.