Truth table generator for an arbitrary function

Question 1

Summary: This function generates a truth table for a boolean function of variable number of arguments. The name of the function passed and its arguments are deduced outside of the computable function scope. The returned value is a logical matrix with columns corresponding to function arguments and result.

library(R.utils)
library(stringr)
truthTable <- function(func, valuesOnly = F) {
 numArguments <- length(formals(func))
 if(valuesOnly) {
 values <- vector(length = 2^numArguments)
 for (i in 1:(2^numArguments)) {
 arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))]
 values[i] <- do.call(func, as.list(arguments))
 }
 return(values)
 }
 result <- matrix(nrow = 2^numArguments, ncol = numArguments + 1)
 colnames(result) <- c(names(formals(func)), as.character(substitute(func)))
 for (i in 1:(2^numArguments)) {
 arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))]
 result[i, ] <- c(arguments, do.call(func, as.list(arguments)))
 }
 return(result)
}

Example:

majority <- function(a, b, c) {
 return ((a & b) | (b & c) | (a & c))
}
truthTable(majority)

Result:

 a b c majority
[1,] FALSE FALSE FALSE FALSE
[2,] FALSE FALSE TRUE FALSE
[3,] FALSE TRUE FALSE FALSE
[4,] FALSE TRUE TRUE TRUE
[5,] TRUE FALSE FALSE FALSE
[6,] TRUE FALSE TRUE TRUE
[7,] TRUE TRUE FALSE TRUE
[8,] TRUE TRUE TRUE TRUE

Possible improvements:

Faster conversion of numeric iterator to logical vector
Usage of apply instead of iterative computation

What could be improved?

Any advice would be appreciated.

Question 2

The main flaw that can be observed in your function is the presence of code duplication: expressions such as 2^numArguments and arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))] appear multiple times. Code duplication is generally bad, you could refactor so that each of them appears only one time.

Other little things:

R.utils and stringr are loaded but never used.
It's better to use FALSE instead of F.

Here is an alternative solution using expand.grid:

truthTable2 <- function(func, valuesOnly = FALSE) {
 args <- formals(func)
 L <- setNames(rep(list(c(TRUE, FALSE)), length(args)), names(args))
 df <- expand.grid(L)
 result <- sapply(1:nrow(df), function(i) do.call(func, lapply(df, `[`, i)))
 # or result <- do.call(func, df) if func is vectorized
 if (valuesOnly) {
 unname(result)
 } else {
 df[[substitute(func)]] <- result
 as.matrix(df)
 }
}
truthTable(majority)
# a b c majority
# [1,] TRUE TRUE TRUE TRUE
# [2,] FALSE TRUE TRUE TRUE
# [3,] TRUE FALSE TRUE TRUE
# [4,] FALSE FALSE TRUE FALSE
# [5,] TRUE TRUE FALSE TRUE
# [6,] FALSE TRUE FALSE FALSE
# [7,] TRUE FALSE FALSE FALSE
# [8,] FALSE FALSE FALSE FALSE

Benchmark:

bench::mark(
 truthTable(majority),
 truthTable2(majority),
 check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time 
# <bch:expr> <bch:t> <bch:t> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <lis>
# 1 truthTable(majority) 89.5us 97.4us 9634. 1.44KB 6.22 4650 3 483ms <lgl[~ <df[,~ <bch~
# 2 truthTable2(majority) 189.6us 204.7us 4292. 0B 4.07 2110 2 492ms <lgl[~ <df[,~ <bch~
# # ... with 1 more variable: gc <list>
bigf <- function(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q) TRUE
bench::mark(
 truthTable(bigf),
 truthTable2(bigf),
 check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time gc 
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <list> <list> 
# 1 truthTable(bigf) 2.13s 2.13s 0.469 64MB 5.63 1 12 2.13s <lgl[,18] [131,072 ~ <df[,3] [262,351~ <bch:t~ <tibble [1 x~
# 2 truthTable2(bigf) 2.43s 2.43s 0.412 76.5MB 5.77 1 14 2.43s <lgl[,18] [131,072 ~ <df[,3] [262,218~ <bch:t~ <tibble [1 x~
# Warning message:
# Some expressions had a GC in every iteration; so filtering is disabled.

Question 3

Indeed, code duplication should be avoided, and unused packager will be deleted (although may return as i find a faster way to convert the iterator from numeric to logical vector). Other than that, the suggested function (for illustrative purposed I will call it truthTable2 is slower in my tests on a function with many (17, to be specific) arguments. It just returns TRUE all the time, so this is more of a test to see how fast the function handles the truth table generation. While truthTable takes 3 seconds to compute, the truthtable2 takes 28.

Question 4

But I appreciate the clarity of the code and will try to implement the best practices to improve the function performance. But R is not all about performance, and I might need to code this function in C++ to gain a significant boost, might do it later

Question 5

Assuming func is vectorized, like it is the case with your majority, replacing the sapply with a single call result <- do.call(func, df) should give it a good boost.

Question 6

@F.Rosty You're right, it was really slow! The main bottleneck was the split on dataframe rows, I've replaced it and now it's much faster. And as flodel said you can also use the fact that func is vectorized (if it is).

Scarabee Scarabee 3931 silver badge9 bronze badges · Answer 1 · 2019-10-13 22:54:30Z

The main flaw that can be observed in your function is the presence of code duplication: expressions such as 2^numArguments and arguments <- rev(as.logical(intToBits(i-1)))[-(1:(32-numArguments))] appear multiple times. Code duplication is generally bad, you could refactor so that each of them appears only one time.

Other little things:

R.utils and stringr are loaded but never used.
It's better to use FALSE instead of F.

Here is an alternative solution using expand.grid:

truthTable2 <- function(func, valuesOnly = FALSE) {
 args <- formals(func)
 L <- setNames(rep(list(c(TRUE, FALSE)), length(args)), names(args))
 df <- expand.grid(L)
 result <- sapply(1:nrow(df), function(i) do.call(func, lapply(df, `[`, i)))
 # or result <- do.call(func, df) if func is vectorized
 if (valuesOnly) {
 unname(result)
 } else {
 df[[substitute(func)]] <- result
 as.matrix(df)
 }
}
truthTable(majority)
# a b c majority
# [1,] TRUE TRUE TRUE TRUE
# [2,] FALSE TRUE TRUE TRUE
# [3,] TRUE FALSE TRUE TRUE
# [4,] FALSE FALSE TRUE FALSE
# [5,] TRUE TRUE FALSE TRUE
# [6,] FALSE TRUE FALSE FALSE
# [7,] TRUE FALSE FALSE FALSE
# [8,] FALSE FALSE FALSE FALSE

Benchmark:

bench::mark(
 truthTable(majority),
 truthTable2(majority),
 check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time 
# <bch:expr> <bch:t> <bch:t> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <lis>
# 1 truthTable(majority) 89.5us 97.4us 9634. 1.44KB 6.22 4650 3 483ms <lgl[~ <df[,~ <bch~
# 2 truthTable2(majority) 189.6us 204.7us 4292. 0B 4.07 2110 2 492ms <lgl[~ <df[,~ <bch~
# # ... with 1 more variable: gc <list>
bigf <- function(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q) TRUE
bench::mark(
 truthTable(bigf),
 truthTable2(bigf),
 check = FALSE
)
# # A tibble: 2 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time gc 
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <list> <list> 
# 1 truthTable(bigf) 2.13s 2.13s 0.469 64MB 5.63 1 12 2.13s <lgl[,18] [131,072 ~ <df[,3] [262,351~ <bch:t~ <tibble [1 x~
# 2 truthTable2(bigf) 2.43s 2.43s 0.412 76.5MB 5.77 1 14 2.43s <lgl[,18] [131,072 ~ <df[,3] [262,218~ <bch:t~ <tibble [1 x~
# Warning message:
# Some expressions had a GC in every iteration; so filtering is disabled.

Indeed, code duplication should be avoided, and unused packager will be deleted (although may return as i find a faster way to convert the iterator from numeric to logical vector). Other than that, the suggested function (for illustrative purposed I will call it truthTable2 is slower in my tests on a function with many (17, to be specific) arguments. It just returns TRUE all the time, so this is more of a test to see how fast the function handles the truth table generation. While truthTable takes 3 seconds to compute, the truthtable2 takes 28.
But I appreciate the clarity of the code and will try to implement the best practices to improve the function performance. But R is not all about performance, and I might need to code this function in C++ to gain a significant boost, might do it later
Assuming func is vectorized, like it is the case with your majority, replacing the sapply with a single call result <- do.call(func, df) should give it a good boost.
@F.Rosty You're right, it was really slow! The main bottleneck was the split on dataframe rows, I've replaced it and now it's much faster. And as flodel said you can also use the fact that func is vectorized (if it is).

Stack Exchange Network

Truth table generator for an arbitrary function

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Truth table generator for an arbitrary function

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions