Lua Type Checking


Many programming languages provide some form of static (compile-time) or dynamic (run-time) type checking, with each form having its own merits [1]. Lua performs run-time type checking on its built-in operations. For example, this code triggers a run-time error:

> x = 5 + "ok"
stdin:1: attempt to perform arithmetic on a string value

However, unlike in languages like C, there is no built-in mechanism in Lua for type checking the parameters and return values of function calls. The types are left unspecified:

function abs(x)
 return x >= 0 and x or -x
end

This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You are allowed to call this function on something other than a number, though the operations inside the function can will trigger a somewhat cryptic error at run-time (where it would be a compile-time error in C).

> print(abs("hello"))
stdin:2: attempt to compare number with string
stack traceback:
 stdin:2: in function 'abs'
 stdin:1: in main chunk
 [C]: ?

Solution: Assertions at Top of Function

For improved error reporting, one is typically told to do something like this:

function abs(x)
 assert(type(x) == "number", "abs expects a number")
 return x >= 0 and x or -x
end

> print(abs("hello"))
stdin:2: abs expects a number
stack traceback:
 [C]: in function 'assert'
 stdin:2: in function 'abs'
 stdin:1: in main chunk
 [C]: ?

That's a good suggestion, but one might complain that this imposes additional run-time overhead, it will only detect program errors for code that gets executed instead of all code that is compiled, the types of the function values are entirely in the function's implementation (not available by introspection), and it can involve a lot of repetitive code (especially when the named arguments are passed via a table).

Here is one ways to proactively check named parameters:

local Box = {}
local is_key = {x=true,y=true,width=true,height=true,color=true}
function create_box(t)
 local x = t.x or 0
 local y = t.y or 0
 local width = t.width or 0
 local height = t.height or 0
 local color = t.color
 assert(type(x) == "number", "x must be number or nil")
 assert(type(y) == "number", "y must be number or nil")
 assert(type(width) == "number", "width must be number be number or nil")
 assert(type(height) == "number", "height must be number or nil")
 assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")
 for k,v in pairs(t) do
 assert(is_key[k], tostring(k) .. " not valid key")
 end
 return setmetatable({x1=x,y1=y,x2=x+width,y2=y+width,color=color}, Box)
end

It is relatively a lot of code. In fact, we might want to use error rather than assert to provide an appropriate level parameter for the stack traceback.

Solution: Function Decorators

Another approach is to place the type check code outside of the original function, potentially with a "function decorator" (see DecoratorsAndDocstrings for background).

is_typecheck = true
function typecheck(...)
 return function(f)
 return function(...)
 assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")
 return f(...)
 end
 end
end
function notypecheck()
 return function(f) return f end
end
typecheck = is_typecheck and typecheck or notypecheck
sum = typecheck("number", "number", "->", "number")(
 function(a,b)
 return a + b
 end
)

The advantage is that the type information is outside of the function implementation. We can disable all the type checking by switching a single variable, and no added overhead would remain when the functions are executed (though there is some slight added overhead when the functions are built). The typecheck function could also store away the type info for later introspection.

This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).

One way to implemnt such type checking decorators is:

--- Odin Kroeger, 2022, released under the MIT license.
do
 local abbrs = {
 ['%*'] = 'boolean|function|number|string|table|thread|userdata',
 ['%?(.*)'] = '%1|nil'
 }
 local msg = 'expected %s, got %s.'
 --- Check whether a value is of a type.
 --
 -- Type declaration grammar:
 --
 -- Declare one or more Lua type names separated by '|' to require that
 -- the given value is of one of the given types (e.g., 'string|table'
 -- requires the value to be a string or a table). '*' is short for the
 -- list of all types but `nil`. '?T' is short for 'T|nil' (e.g.,
 -- '?table' is short for 'table|nil').
 --
 -- Extended Backus-Naur Form:
 --
 -- > Type = 'boolean' | 'function' | 'nil' | 'number' |
 -- > 'string' | 'table' | 'thread' | 'userdata'
 -- >
 -- > Type list = [ '?' ], type, { '|', type }
 -- >
 -- > Wildcard = [ '?' ], '*'
 -- >
 -- > Type declaration = type list | wildcard
 --
 -- Complex types:
 --
 -- You can check types of table or userdata fields by
 -- declarding a table that maps indices to declarations.
 --
 -- > type_check({1, '2'}, {'number', 'number'})
 -- nil index 2: expected number, got string.
 -- > type_check({foo = 'bar'}, {foo = '?table'})
 -- nil index foo: expected table or nil, got string.
 -- > type_check('foo', {foo = '?table'})
 -- nil expected table or userdata, got string.
 --
 -- Wrong type names (e.g., 'int') do *not* throw an error.
 --
 -- @param val A value.
 -- @tparam string|table decl A type declaration.
 -- @treturn[1] bool `true` if the value matches the declaration.
 -- @treturn[2] nil `nil` otherwise.
 -- @treturn[2] string An error message.
 function type_match (val, decl, _seen)
 local t = type(decl)
 if t == 'string' then
 local t = type(val)
 for p, r in pairs(abbrs) do decl = decl:gsub(p, r) end
 for e in decl:gmatch '[^|]+' do if t == e then return true end end
 return nil, msg:format(decl:gsub('|', ' or '), t)
 elseif t == 'table' then
 local ok, err = type_match(val, 'table|userdata')
 if not ok then return nil, err end
 if not _seen then _seen = {} end
 assert(not _seen[val], 'cycle in data tree.')
 _seen[val] = true
 for k, t in pairs(decl) do
 ok, err = type_match(val[k], t, _seen)
 if not ok then return nil, format('index %s: %s', k, err) end
 end
 return true
 end
 error(msg:format('string or table', t))
 end
end
--- Type-check function arguments.
--
-- Type declaration grammar:
--
-- The type declaration syntax is that of @{type_match}, save for
-- that you can use '...' to declare that the remaining arguments
-- are of the same type as the previous one.
--
-- Obscure Lua errors may indicate that forgot the quotes around '...'.
--
-- Caveats:
--
-- * Wrong type names (e.g., 'int') do *not* throw an error.
-- * Sometimes the strack trace is wrong.
--
-- @tparam string|table ... Type declarations.
-- @treturn func A function that adds type checks to a function.
--
-- @usage
-- store = type_check('?*', 'table', '?number', '...')(
-- function (val, tab, ...)
-- local indices = table.pack(...)
-- for i = 1, n do tab[indices[i]] = val end
-- end
-- )
--
-- @function type_check
function type_check (...)
 local decls = pack(...)
 return function (func)
 return function (...)
 local args = pack(...)
 local decl, prev
 local n = math.max(decls.n, args.n)
 for i = 1, n do
 if decls[i] == '...' then prev = true
 elseif decls[i] then prev = false
 decl = decls[i]
 elseif not prev then break
 end
 if args[i] == nil and prev and i >= decls.n then break end
 local ok, err = type_match(args[i], decl)
 if not ok then error(format('argument %d: %s', i, err), 2) end
 end
 return func(...)
 end
 end
end

Solution: the checks library

The solutions above present some limitations:

* they are rather verbose, and non-trivial verifications can be hard to read;

* the error messages aren't as clear as those returned by Lua's primitives. Moreover, they indicate an error in the called function, at the place where assert() fails, rather than in the calling function which passed invalid arguments.

The checks library offers a terse, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. Additional, arbitrary type-checking functions can also be registered in a dedicated checkers table. For instance, if one wants to check for an IP port number, which must be between 0 and 0xffff, one can define a port type as follows:

function checkers.port(x)
 return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==0 
end

To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, no need to repeat them; for instance, if function f(num, str) needs a number and a string, it can be implemented as follows:

function f(num, str)
 checks('number', 'string')
 --actual function body
end

Types can be combined:

* the vertical bar allows to accept several types, e.g. checks('string|number') accepts both strings and numbers as first parameter.

* a prefix "?" makes a type optional, i.e. also accepts nil. Functionally, it's equivalent to a "nil|" prefix, although it's more readable and faster to check at runtime.

* the question mark can be combined with union bars, e.g. checks('?string|number') accepts strings, numbers and nil.

* finally, a special "!" type accepts anything but nil.

A more detailed description of how the library works can be found in its source's header (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, accessible here: https://github.com/SierraWireless/luasched. For convinience, it's also available here as a standalone rock: https://github.com/fab13n/checks

Hack: Boxed Values + Possible Values

As mentioned, run-time type checking will not detect program errors that don't get executed. Extensive test suites are particularly essential for programs in dynamically typed languages so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) so that the run-time assertions are are sufficiently hit. You can't rely as much on the compiler to do some of this checking for you.

Perhaps we can mitigate this by carrying more complete type information with the values. Below is one approach, though is it more a novel proof-of-concept rather than anything for production use at this point.

-- ExTypeCheck.lua ("ExTypeCheck")
-- Type checking for Lua.
--
-- In this type model, types are associated with values at run-time.
-- A type consists of the set of values the value could have
-- at run-time. This set can be large or infinite, so we
-- store only a small representative subset of those values.
-- Typically one would want to include at least the boundary
-- values (e.g. max and min) in this set.
-- Type checking is performed by verifying that all values
-- in that set are accepted by a predicate function for that type.
-- This predicate function takes a values and returns true or false
-- whether the value is a member of that type.
--
-- As an alternative to representing types as a set of representative
-- values, we could represent types more formally, such as with
-- first-order logic, but then we get into theorem proving,
-- which is more involved.
--
-- DavidManura, 2007, licensed under the same terms as Lua itself.
local M = {}
-- Stored Expression design pattern
-- ( http://lua-users.org/wiki/StatementsInExpressions )
local StoredExpression
do
 local function call(self, ...)
 self.__index = {n = select('#', ...), ...}
 return ...
 end
 function StoredExpression()
 local self = {__call = call}
 return setmetatable(self, self)
 end
end
 
-- Whether to enable type checking (true/false). Default true.
local is_typecheck = true
-- TypeValue is an abstract type for values that are typed
-- This holds the both the actual value and a subset of possible
-- values the value could assume at runtime. That set should at least
-- include the min and max values (for bounds checking).
local TypedValue = {}
-- Check that value x satisfies type predicate function f.
function M.check_type(x, f)
 for _,v in ipairs(x) do
 assert(f(v))
 end
 return x.v
end
-- Type check function that decorates functions.
-- Example:
-- abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(
-- function(x) return x >= 0 and x or -x end
-- )
function M.typecheck(...)
 local types = {...}
 return function(f)
 local function check(i, ...)
 -- Check types of return values.
 if types[i] == "->" then i = i + 1 end
 local j = i
 while types[i] ~= nil do
 M.check_type(select(i - j + 1, ...), types[i])
 i = i + 1
 end
 return ...
 end
 return function(...)
 -- Check types of input parameters.
 local i = 1
 while types[i] ~= nil and types[i] ~= "->" do
 M.check_type(select(i, ...), types[i])
 i = i + 1
 end
 return check(i, f(...)) -- call function
 end
 end
end
function M.notypecheck() return function(f) return f end end
function M.nounbox(x) return x end
M.typecheck = is_typecheck and M.typecheck or M.notypecheck
M.unbox = is_typecheck and M.unbox or M.nounbox
-- Return a boxed version of a binary operation function.
-- For the returned function,
-- Zero, one, or two of the arguments may be boxed.
-- The result value is boxed.
-- Example:
-- __add = boxed_op(function(a,b) return a+b end)
function M.boxed_op(op)
 return function(a, b)
 if getmetatable(a) ~= TypedValue then a = M.box(a) end
 if getmetatable(b) ~= TypedValue then b = M.box(b) end
 local t = M.box(op(M.unbox(a), M.unbox(b)))
 local seen = {[t[1]] = true}
 for _,a2 in ipairs(a) do
 for _,b2 in ipairs(b) do
 local c2 = op(a2, b2)
 if not seen[c2] then
 t[#t + 1] = op(a2, b2)
 seen[c2] = true
 end
 end
 end
 return t
 end
end
-- Return a boxed version of a unary operation function.
-- For the returned function,
-- The argument may optionally be boxed.
-- The result value is boxed.
-- Example:
-- __unm = boxed_uop(function(a) return -a end)
function M.boxed_uop(op)
 return function(a)
 if getmetatable(a) ~= TypedValue then a = M.box(a) end
 local t = M.box(op(M.unbox(a)))
 local seen = {[t[1]] = true}
 for _,a2 in ipairs(a) do
 local c2 = op(a2)
 if not seen[c2] then
 t[#t + 1] = op(a2)
 seen[c2] = true
 end
 end
 return t
 end
end
TypedValue.__index = TypedValue
TypedValue.__add = M.boxed_op(function(a, b) return a + b end)
TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)
TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)
TypedValue.__div = M.boxed_op(function(a, b) return a / b end)
TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)
TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)
TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)
-- TypedValue.__le -- not going to work? (metafunction returns Boolean)
-- TypedValue.__lt -- not going to work? (metafunction returns Boolean)
-- TypedValue.__eq -- not going to work? (metafunction returns Boolean)
TypedValue.__tostring = function(self)
 local str = "[" .. tostring(self.v) .. " in "
 for i,v in ipairs(self) do
 if i ~= 1 then str = str .. ", " end
 str = str .. v
 end
 str = str .. "]"
 return str 
end
-- Convert a regular value into a TypedValue. We call this "boxing".
function M.box(v, ...)
 local t = setmetatable({v = v, ...}, TypedValue)
 if #t == 0 then t[1] = v end
 return t
end
-- Convert a TypedValue into a regular value. We call this "unboxing".
function M.unbox(x)
 assert(getmetatable(x) == TypedValue)
 return x.v
end
-- Returns a type predicate function for a given interval over the reals.
-- Example: ranged_real'[0,inf)'
-- Note: this function could be memoized.
function M.ranged_real(name, a, b)
 local ex = StoredExpression()
 if name == "(a,b)" then
 return function(x) return type(x) == "number" and x > a and x < b end
 elseif name == "(a,b]" then
 return function(x) return type(x) == "number" and x > a and x <= b end
 elseif name == "[a,b)" then
 return function(x) return type(x) == "number" and x >= a and x < b end
 elseif name == "[a,b]" then
 return function(x) return type(x) == "number" and x >= a and x <= b end
 elseif name == "(inf,inf)" then
 return function(x) return type(x) == "number" end
 elseif name == "[a,inf)" then
 return function(x) return type(x) == "number" and x >= a end
 elseif name == "(a,inf)" then
 return function(x) return type(x) == "number" and x > a end
 elseif name == "(-inf,a]" then
 return function(x) return type(x) == "number" and x <= a end
 elseif name == "(-inf,a)" then
 return function(x) return type(x) == "number" and x < a end
 elseif name == "[0,inf)" then
 return function(x) return type(x) == "number" and x >= 0 end
 elseif name == "(0,inf)" then
 return function(x) return type(x) == "number" and x > 0 end
 elseif name == "(-inf,0]" then
 return function(x) return type(x) == "number" and x <= 0 end
 elseif name == "(-inf,0)" then
 return function(x) return type(x) == "number" and x < 0 end
 elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then
 local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]
 if left == "(" and right == ")" then
 return function(x) return type(x) == "number" and x > a and x < b end
 elseif left == "(" and right == "]" then
 return function(x) return type(x) == "number" and x > a and x <= b end
 elseif left == "[" and right == ")" then
 return function(x) return type(x) == "number" and x >= a and x < b end
 elseif left == "[" and right == "]" then
 return function(x) return type(x) == "number" and x >= a and x <= b end
 else assert(false)
 end
 else
 error("invalid arg " .. name, 2)
 end
end
return M

Example usage:

-- type_example.lua
-- Test of ExTypeCheck.lua.
local TC = require "ExTypeCheck"
local typecheck = TC.typecheck
local ranged_real = TC.ranged_real
local boxed_uop = TC.boxed_uop
local box = TC.box
-- Checked sqrt function.
local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(
 function(x)
 return boxed_uop(math.sqrt)(x)
 end
)
-- Checked random function.
local random = typecheck('->', ranged_real'[0,1)')(
 function()
 return box(math.random(), 0, 0.999999)
 end
)
print(box("a", "a", "b") .. "z")
print(box(3, 3,4,5) % 4)
math.randomseed(os.time())
local x = 0 + random() * 10 - 1 + (random()+1) * 0
print(x + 1); print(sqrt(x + 1)) -- ok
print(x); print(sqrt(x)) -- always asserts! (since x might be negative)

Example output:

[az in az, bz]
[3 in 3, 0, 1]
[8.7835848325787 in 8.7835848325787, 0, 9.99999]
[2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]
[7.7835848325787 in 7.7835848325787, -1, 8.99999]
lua: ./ExTypeCheck.lua:50: assertion failed!
stack traceback:
 [C]: in function 'assert'
 ./ExTypeCheck.lua:50: in function 'check_type'
 ./ExTypeCheck.lua:78: in function 'sqrt'
 testt.lua:30: in main chunk
 [C]: ?

Note: this approach of values holding multiple values does have some similarities to Perl6 junctions (originally termed "quantum superpositions").

Solution: Metalua Runtime Type Checking

There is a Runtime type checking example in Metalua [2].

Solution: Dao

The Dao language, partly based on Lua, has built-in support for optional typing [3].


--DavidManura

Solution: Teal

The [Teal language] is typed dialect of Lua that compiles to Lua.

Solution: TypeScriptToLua?

[TypeScriptToLua] is a TypeScript? to Lua transpiler that allows us to write Lua code with TypeScript? syntax and type checking at compile time.

I'd argue that, except for very simple programs or those that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte

The comment above was anonymously deleted with "This isn't a forum for drive-by opinions". On the contrary, this is established wiki style. People can make comments on pages, and for points of contention that's considered courteous versus summarily changing the original text. The original author can decide to integrate such comments into the original text or not. --JohnBelmonte


RecentChanges · preferences
edit · history
Last edited July 12, 2022 2:38 am GMT (diff)

AltStyle によって変換されたページ (->オリジナル) /