Boolean formula that agrees with most truth assignments

Question 1

Let $X_1,\dots,X_n$ be $n$ boolean variables. I have an unknown predicate $P(X_1,\dots,X_n)$ on these boolean variables. Of course, I can view the predicate as a function $f_P : \{0,1\}^n \to \{0,1\}$ that maps a vector of $n$ boolean values to the truth value of this predicate on those inputs.

Now I have a truth table of pairs $(x_1,y_1), \dots, (x_m,y_m),ドル and I want to find a predicate $P$ that is consistent with these pairs and that is as "simple" as possible. In particular, I have two variants of the problem:

Problem 1. Given $(x_1,y_1), \dots, (x_m,y_m),ドル find a predicate $P$ such that (1) it agrees with the entire truth table (i.e., for all $i,ドル $f_P(x_i)=y_i$), and (2) out of all such predicates, the complexity of $P$ is minimized.

Problem 2. Given $(x_1,y_1), \dots, (x_m,y_m)$ and a threshold $t,ドル find a predicate $P$ such that (1) $P$ agrees with at least a $t/m$ fraction of the truth table (i.e., there are at least $t$ values of $i$ such that $f_P(x_i)=y_i$), and (2) out of all such predicates, the complexity of $P$ is minimized.

Are there any algorithms for solving either of these problems, in a way that is more efficient than enumerating all predicates?

Of course, to make the problem well-posed, we must agree on a definition of the complexity of a predicate. Here I can see any number of realistic complexity metrics. One metric might be that, when we express $P$ as a formula in boolean logic, the length of that formula. Another might be the number of operators in that formula, or the nesting depth of the formula. I am interested in any and all algorithms for any plausible notion of complexity.

This can be viewed as a kind of learning problem, where Occam's razor suggests that low-complexity predicates are a priori more likely than high-complexity predicates.

Question 2

Gut feeling: Both problems (their decision variants, to be precise) are NP-complete. Trivial observation: If 1. is NP-complete, 2. is as well, since it is a generalisation of 1.

Question 3

Agreed, @FrankW. Both problems feel NP-hard to me too. Nonetheless, I'm curious about pragmatic approaches that might sometimes work well in practice (e.g., in the same way that SAT solvers are often useful even though SAT is NP-complete). Approximation algorithms would be OK too.

Question 4

Have a look at Karnaugh Maps; they are used in electronics to optimize logical circuits. The idea to match just a fraction of the target function is absolutely new to me, though. I'm curious what it might be good for, so if you describe your usecase I'd be interested :) Anyway, I'm not sure that the Karnaugh Maps can be adjusted to fit this idea, but it might be possible.

Question 5

I assume you want all other truth assignments to not satisfy the formula? Otherwise, tautologies present simple answers.

Question 6

@Raphael, actually, no: I don't care what happens to other values. Not seeing what you mean about tautologies. Making $P$ a tautology would only be an answer to Problem 1 if all of the $y_i$'s were true. What am I missing?

Question 7

Problem 1 is the optimization version of Partial MCSP. When the complexity measure is circuit size (over the binary de Morgan basis), the corresponding decision version is:

Given the partial truth table and a parameter $S$, does there exist a predicate $P$ agreeing with the partial truth table whose circuit complexity is at most $S$.

Alekhnovich, Braverman, Feldman, Klivans and Pitassi showed in their paper The Complexity of Properly Learning Simple Concept Classes that this problem is NP-complete with respect to randomized reductions. See also Hancock, Jiang, Li and Tromp, Lower Bounds on Learning Decision Lists and Trees.

Rahul Ilango later showed in his paper Constant Depth Formula and Partial Function Versions of MCSP are Hard that the problem cannot be solved in time 2ドル^{o(n\log n)}$ assuming the Exponential Time Hypothesis.

Yuval Filmus 281k27 gold badges319 silver badges516 bronze badges · Answer 1 · 2024-08-03 14:32:48Z

Problem 1 is the optimization version of Partial MCSP. When the complexity measure is circuit size (over the binary de Morgan basis), the corresponding decision version is:

Given the partial truth table and a parameter $S$, does there exist a predicate $P$ agreeing with the partial truth table whose circuit complexity is at most $S$.

Alekhnovich, Braverman, Feldman, Klivans and Pitassi showed in their paper The Complexity of Properly Learning Simple Concept Classes that this problem is NP-complete with respect to randomized reductions. See also Hancock, Jiang, Li and Tromp, Lower Bounds on Learning Decision Lists and Trees.

Rahul Ilango later showed in his paper Constant Depth Formula and Partial Function Versions of MCSP are Hard that the problem cannot be solved in time 2ドル^{o(n\log n)}$ assuming the Exponential Time Hypothesis.

Stack Exchange Network

Boolean formula that agrees with most truth assignments

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Boolean formula that agrees with most truth assignments

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions