Summary: This function generates a truth table for a boolean function of variable number of arguments. The name of the function passed and its arguments are deduced outside of the computable function scope. The returned value is a logical matrix with columns corresponding to function arguments and result.
library(R.utils)
library(stringr)
truthTable <- function(func, valuesOnly = F) {
numArguments <- length(formals(func))
if(valuesOnly) {
values <- vector(length = 2^numArguments)
for (i in 1:(2^numArguments)) {
arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))]
values[i] <- do.call(func, as.list(arguments))
}
return(values)
}
result <- matrix(nrow = 2^numArguments, ncol = numArguments + 1)
colnames(result) <- c(names(formals(func)), as.character(substitute(func)))
for (i in 1:(2^numArguments)) {
arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))]
result[i, ] <- c(arguments, do.call(func, as.list(arguments)))
}
return(result)
}
Example:
majority <- function(a, b, c) {
return ((a & b) | (b & c) | (a & c))
}
truthTable(majority)
Result:
a b c majority
[1,] FALSE FALSE FALSE FALSE
[2,] FALSE FALSE TRUE FALSE
[3,] FALSE TRUE FALSE FALSE
[4,] FALSE TRUE TRUE TRUE
[5,] TRUE FALSE FALSE FALSE
[6,] TRUE FALSE TRUE TRUE
[7,] TRUE TRUE FALSE TRUE
[8,] TRUE TRUE TRUE TRUE
Possible improvements:
- Faster conversion of numeric iterator to logical vector
- Usage of
apply
instead of iterative computation
What could be improved?
Any advice would be appreciated.
1 Answer 1
The main flaw that can be observed in your function is the presence of code duplication: expressions such as 2^numArguments
and arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))]
appear multiple times. Code duplication is generally bad, you could refactor so that each of them appears only one time.
Other little things:
R.utils
andstringr
are loaded but never used.- It's better to use
FALSE
instead ofF
.
Here is an alternative solution using expand.grid
:
truthTable2 <- function(func, valuesOnly = FALSE) {
args <- formals(func)
L <- setNames(rep(list(c(TRUE, FALSE)), length(args)), names(args))
df <- expand.grid(L)
result <- sapply(1:nrow(df), function(i) do.call(func, lapply(df, `[`, i)))
# or result <- do.call(func, df) if func is vectorized
if (valuesOnly) {
unname(result)
} else {
df[[substitute(func)]] <- result
as.matrix(df)
}
}
truthTable(majority)
# a b c majority
# [1,] TRUE TRUE TRUE TRUE
# [2,] FALSE TRUE TRUE TRUE
# [3,] TRUE FALSE TRUE TRUE
# [4,] FALSE FALSE TRUE FALSE
# [5,] TRUE TRUE FALSE TRUE
# [6,] FALSE TRUE FALSE FALSE
# [7,] TRUE FALSE FALSE FALSE
# [8,] FALSE FALSE FALSE FALSE
Benchmark:
bench::mark(
truthTable(majority),
truthTable2(majority),
check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time
# <bch:expr> <bch:t> <bch:t> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <lis>
# 1 truthTable(majority) 89.5us 97.4us 9634. 1.44KB 6.22 4650 3 483ms <lgl[~ <df[,~ <bch~
# 2 truthTable2(majority) 189.6us 204.7us 4292. 0B 4.07 2110 2 492ms <lgl[~ <df[,~ <bch~
# # ... with 1 more variable: gc <list>
bigf <- function(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q) TRUE
bench::mark(
truthTable(bigf),
truthTable2(bigf),
check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time gc
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <list> <list>
# 1 truthTable(bigf) 2.13s 2.13s 0.469 64MB 5.63 1 12 2.13s <lgl[,18] [131,072 ~ <df[,3] [262,351~ <bch:t~ <tibble [1 x~
# 2 truthTable2(bigf) 2.43s 2.43s 0.412 76.5MB 5.77 1 14 2.43s <lgl[,18] [131,072 ~ <df[,3] [262,218~ <bch:t~ <tibble [1 x~
# Warning message:
# Some expressions had a GC in every iteration; so filtering is disabled.
-
\$\begingroup\$ Indeed, code duplication should be avoided, and unused packager will be deleted (although may return as i find a faster way to convert the iterator from numeric to logical vector). Other than that, the suggested function (for illustrative purposed I will call it
truthTable2
is slower in my tests on a function with many (17, to be specific) arguments. It just returnsTRUE
all the time, so this is more of a test to see how fast the function handles the truth table generation. WhiletruthTable
takes 3 seconds to compute, thetruthtable2
takes 28. \$\endgroup\$F. Rosty– F. Rosty2019年10月13日 23:41:43 +00:00Commented Oct 13, 2019 at 23:41 -
\$\begingroup\$ But I appreciate the clarity of the code and will try to implement the best practices to improve the function performance. But R is not all about performance, and I might need to code this function in C++ to gain a significant boost, might do it later \$\endgroup\$F. Rosty– F. Rosty2019年10月13日 23:46:52 +00:00Commented Oct 13, 2019 at 23:46
-
2\$\begingroup\$ Assuming
func
is vectorized, like it is the case with yourmajority
, replacing thesapply
with a single callresult <- do.call(func, df)
should give it a good boost. \$\endgroup\$flodel– flodel2019年10月14日 02:06:28 +00:00Commented Oct 14, 2019 at 2:06 -
\$\begingroup\$ @F.Rosty You're right, it was really slow! The main bottleneck was the
split
on dataframe rows, I've replaced it and now it's much faster. And as flodel said you can also use the fact thatfunc
is vectorized (if it is). \$\endgroup\$Scarabee– Scarabee2019年10月16日 16:42:58 +00:00Commented Oct 16, 2019 at 16:42