Rosette adds to Racket a collection of solver-aided facilities. These facilities enable programmers to conveniently access a constraint solver that can answer interesting questions about program behaviors. They are based on four key concepts: symbolic values, assertions, assumptions, and queries. We use assertions and assumptions to express desired program behaviors and symbolic values to formulate queries about these behaviors.
This chapter illustrates the basics of solver-aided programming. More advanced tutorials, featuring extended examples, can be found in Section 2 of [3, 4].1 Code examples in these references are written in earlier versions of Rosette. While Rosette 4 is not backward compatible with these versions, they share the same conceptual core.
The following chapters describe the subset of Racket that can be safely used with solver-aided facilities, including the supported datatypes (both built-in and programmer-defined), syntactic forms, and libraries.
The Rosette language includes two kinds of values: concrete and symbolic. Concrete values are plain Racket values (#t, #f, 0, 1, etc.), and Rosette programs that operate only on concrete values behave like Racket programs. Accessing the solver-aided features of Rosette—such as code synthesis or verification—requires the use of symbolic values.
> bb
You can think of a symbolic constant as a placeholder for a concrete constant of the same type. As we will see shortly, the solver, once called, determines which concrete value a given symbolic constant represents: it will tell us whether the constant b is #t or #f, depending on what question we ask about the behavior of a program (or a procedure) applied to b.
#t
#f
(vector b 1)
(! b)
#t
#t
(= y1γγ« y2γγ«)
Printed constant names, such as x or b, are just comments. Two constants created by evaluating two distinct define-symbolic (or, define-symbolic* ) forms are distinct, even if they have the same printed name. They may still represent the same concrete value, but that is determined by the solver:
(<=> x x)
When given a symbolic boolean value, however, a Rosette assertion has no immediate effect. Instead, the value is accumulated in the current verification condition (VC), and the assertion’s effect (whether it passes or fails) is eventually determined by the solver.
#f
#t
(! b)
Assertions express properties that a program must satisfy on all legal inputs. In Rosette, as in other solver-aided frameworks, we use assumptions to describe which inputs are legal. If a program violates an assertion on a legal input, we blame the program. But if it violates an assertion on an illegal input, we blame the caller. In other words, a program is considered incorrect only when it violates an assertion on a legal input.
Assumptions behave analogously to assertions on both concrete and symbolic values. In the concrete case, assuming #f aborts the execution with a runtime exception, and assuming a true value is equivalent to calling (void ). In the symbolic case, the assumed value is accumulated in the current VC.
#t
[assume] failed
#f
(< 0 j)
(|| (! (< 0 j)) (< (+ i (- j)) i))
(vc (< 0 j) (|| (! (< 0 j)) (< (+ i (- j)) i)))
The solver reasons about assumed and asserted properties only when we ask a question about them—for example, “Does my program have an execution that violates an assertion while satisfying all the assumptions?” We pose such solver-aided queries with the help of constructs explained in the remainder of this chapter.
We will illustrate the queries on the following toy example. Suppose that we want to implement a procedure bvmid that takes as input two non-negative 32-bit integers, lo ≤ hi, and returns the midpoint of the interval [lo, hi]. In C or Java, we would declare the inputs and output of bvmid to be of type “int”. In Rosette, we model finite precision (i.e., machine) integers as bitvectors of length 32.
;int32 takes as input an integer literal and returns;the corresponding 32-bit bitvector value.(bviint32?))
> (int32?1);1 is not a 32-bit integer#f
> (int32?(int321));but (int32 1) is.#t
> (int321)(bv #x00000001 32)
Bitvectors support the usual operations on machine integers, and we can use them to implement bvmid as follows:
;Returns the midpoint of the interval [lo, hi].
As the above implementation suggests, we intend the midpoint to be the mathematical integer mi = (lo + hi) / 2, where / stands for integer division. Assuming that 0 ≤ lo ≤ hi, the midpoint mi is fully characterized by two properties: (1) lo ≤ mi ≤ hi and (2) 0 ≤ (hi - mi) - (mi - lo) ≤ 1. We can use these properties to define a generic correctness specification for any implementation of bvmid as follows:
This is not the only way to specify the behavior of bvmid, and we will see an alternative specification later on. In general, there are many ways to describe what it means for a program to be correct, and often, these descriptions are partial: they constrain some aspects of the implementation (e.g., the output is positive) without fully defining its behavior. In our example, check-mid is a full functional correctness specification in that it admits exactly one output value for (impllohi), namely, (lo + hi) / 2.
Testing bvmid against its specification on a few concrete legal inputs, we find that it triggers no assertion failures, as expected:
> (check-midbvmid(int320)(int320))> (check-midbvmid(int320)(int321))> (check-midbvmid(int320)(int322))> (check-midbvmid(int3210)(int3210000))
But does it work correctly on all legal inputs? The answer, as we will see below, is “no”. In fact, bvmid reproduces a famous bug that lurked for years in widely used C and Java implementations of binary search.
How can we check if bvmid satisfies its specification on all legal inputs? One approach is to enumerate all pairs of 32-bit integers with 0≤lo≤hi and apply (check-midbvmidhilo) to each. This approach is sound (it is guaranteed to find a bug if one exists), but a quick calculation shows that it is impractical even for our toy example: bvmid has roughly 2.3 × 1018 legal inputs. A better approach is to delegate such checks to a constraint solver, which can search large input spaces much more effectively than naive enumeration. In Rosette, this is done with the help of the verify query:
> cex(model
[l (bv #x394f0402 32)]
[h (bv #x529e7c00 32)])
The (verify expr) form queries the solver for a binding from symbolic constants to concrete values that causes the evaluation of expr to violate an assertion, while satisfying all the assumptions, when the bound symbolic constants are replaced with the corresponding concrete values. If such a binding exists, as it does in our case, it is called a counterexample.
(list (bv #x394f0402 32) (bv #x529e7c00 32))
;We can convert these values to integer? constants for debugging:'(961479682 1386118144)
;Here is the computed midpoint:> m(bv #xc5f6c001 32)
> (bitvector->integerm)-973684735
;This is clearly wrong. We expect (il + ih) / 2 instead:1173798913
;Expressed as a 32-bit integer, the correct answer is:(bv #x45f6c001 32)
;So, check-mid fails on (bvmid cl ch):> (check-midbvmidclch)[assert] failed
In our example, evaluating l and h with respect to cex reveals that bvmid fails to return the correct midpoint value, thus causing the first assertion in the check-mid procedure to fail. The bug is due to overflow: the expression (bvadd lohi) in bvmid produces a negative value in the 32-bit representation when the sum of lo and hi exceeds 231-1.
(bv #x8bed8002 32)
-1947369470
2347597826
2147483647
A common solution to this problem is to calculate the midpoint as lo + ((hi - lo) / 2). It is easy to see that all intermediate values in this calculation are at most hi when lo and hi are both non-negative, so no overflow can happen. We can also verify this with Rosette:
(unsat)
The solution given in bvmid-no-overflow avoids the overflow problem in bvmid at the cost of performing an additional arithmetic operation. Both implementations also rely on signed division, which is slow and expensive compared to addition, subtraction, and bitwise operations. So our ideal implementation would be correct, small, and composed of only cheap arithmetic and bitwise operations. Does such an implementation exist? To find out, we turn to Rosette’s synthesize query.
The synthesis query uses the solver to search for a correct program in a space of candidate implementations defined by a syntactic sketch. A sketch is a program with holes, which the solver fills with expressions drawn from a specified set of options. For example, (?? int32?) stands for a hole that can be filled with any 32-bit integer constant, so the sketch (bvadd x(?? int32?)) represents all 232 programs that add a 32-bit constant to the variable x. Rosette also lets you define richer holes that can be filled with expressions from a given grammar. For example, here is a grammar of all int32? expressions that consist of cheap arithmetic and bitwise operations:
[expr((bop)(expr)(expr)); (<bop> <expr> <expr>) |((uop)(expr)))]; (<uop> <expr>)[bop[uop
Using this grammar, we can sketch a fast implementation of the midpoint calculation as follows:
(fast-int32lohi#:depth2))
The above sketch describes the space of all expressions from the fast-int32 grammar that have parse trees of depth at most 2. The depth argument is optional. If ommitted, Rosette will use the value of the (current-grammar-depth ) parameter to bound the depth of the expressions drawn from the grammar.
At this point, it is worth noting that holes and sketches are not fundamental concepts in Rosette. Instead, they are macros defined in terms of the core constructs we have already seen, symbolic constants and assertions. For example, (?? int32?) is syntactic sugar for (let ()(define-symbolic idint32?)id), where id is an internally generated name. Similarly, (choose bvneg bvnot ) expands to (if (?? boolean? )bvneg bvnot ). Finally, a grammar hole such as (fast-int32lohi#:depth2) inlines its productions #:depth times to create a nested expression of the form (choose lohi(?? int32?)((choose bvadd ... )(choose ... )(choose ... ))((choose bvneg bvnot )(choose ... ))). Assigning concrete values to the symbolic constants generated by this expression has the effect of selecting a parse tree (of depth 2 in our example) from the hole’s grammar. So, completing a sketch is a matter of finding a suitable binding for the symbolic constants generated by the holes.
With this in mind, we can query the solver for a completion of the bvmid-fast sketch (if any) that satisfies our correctness specification:
;Save the above definitions to a file before calling print-forms.> sol(model
[0$choose:bvmid:37:9$expr:bvmid:37:3$fast-int32:bvmid:45:3 #f]
[1$choose:bvmid:37:9$expr:bvmid:37:3$fast-int32:bvmid:45:3 #f]
[2$choose:bvmid:37:9$expr:bvmid:37:3$fast-int32:bvmid:45:3 #f]
[3$choose:bvmid:37:9$expr:bvmid:37:3$fast-int32:bvmid:45:3 #t]
...)
(define (bvmid-fast lo hi) (bvlshr (bvadd hi lo) (bv #x00000001 32)))
The synthesis query takes the form (synthesize #:forallinput#:guaranteeexpr), where input lists the symbolic constants that represent inputs to a sketched program, and expr gives the correctness specification for the sketch. The solver searches for a binding from the hole (i.e., non-input) constants to values such that expr satisfies its assertions on all legal inputs. Passing this binding to print-forms converts it to a syntactic representation of the completed sketch.2 print-forms works only on sketches that have been saved to disk. In our example, the synthesized program implements the midpoint calculation using the logical shift operation, i.e., the midpoint between lo and hi is calculated as (lo + hi) >>u 1.
Rosette supports one more solver-aided query, which we call angelic execution. This query is the dual of verification. Given a program with symbolic values, it instructs the solver to find a binding for them that will cause the program to execute normally—that is, without any assumption or assertion failures.
> sol(model
[l (bv #x3f761e94 32)]
[h (bv #x3f761e95 32)])
(bv #x3f761e94 32)
(bv #x3f761e94 32)
#f;<--- replace with your sketch )
We conclude this chapter with a quick overview of common patterns and anti-patterns for programming in Rosette. For more details, see Chapters 8–10.
Rosette implements solver-aided queries by translating them to the input language of an SMT solver. By default, this translation respects types: a symbolic constant of type integer? will be translated to an SMT constant of the same type, i.e., an infinite precision mathematical integer. These types determine which theories the solver will need to use to solve a query. As a rule of thumb, the theory of bitvectors tends to elicit fastest solving times, and mixing theories can lead to severe performance degradation. For that reason, it is best to use the types from the same theory throughout your program (e.g., bitvectors).
To illustrate the impact of mixing theories, consider the following mixed-theory specification for our midpoint example:
(bitvector->integer(impllohi));⌈impl(lo, hi)⌉ =(bitvector->integerhi));integer corresponding to the2))));32-bit integer e.
This new specification uses both bitvectors and integers. Compared to check-mid, which uses only bitvectors, check-mid-slow causes one of our verification queries to become an order of magnitude slower and the other to time out:
cpu time: 0 real time: 38 gc time: 0
(model
[l (bv #x394f0402 32)]
[h (bv #x529e7c00 32)])
cpu time: 1 real time: 172 gc time: 0
(model
[l (bv #x2faef9a1 32)]
[h (bv #x5eefb8dd 32)])
cpu time: 0 real time: 160 gc time: 0
(unsat)
call-with-deep-time-limit: out of time
While slower than bitvectors, integers are more convenient to use for demos, prototyping, and interfacing with Racket. To bridge this gap, Rosette provides the option of approximating symbolic integers (and reals) as bitvectors of length k, by setting the current-bitwidth parameter to k. With this setting, integers (and reals) are treated as infinite precision values during evaluation, but when solving queries, they are translated to bitvectors of length k for better performance.
;By default, current-bitwidth is set to #f, so Rosette translates;integer? values precisely, using the SMT theory of integers.#f
;After we set current-bitwidth to 64, integer? values in;check-mid-slow are translated to SMT bitvectors of length 64.cpu time: 0 real time: 23 gc time: 0
(model
[l (bv #x00000001 32)]
[h (bv #x7fffffff 32)])
cpu time: 0 real time: 159 gc time: 0
(unsat)
In this example, we have chosen current-bitwidth carefully to ensure that the resulting approximation is both performant and sound—i.e., the approximate query returns a counterexample exactly when one would be returned by the corresponding integer query. But choosing the right bitwidth is difficult to do in general. If we underapproximate the number of bits that are needed to represent every integer value in a query, we lose soundness, and if we overapproximate it, we lose performance.
;The bitwidth is too low, so we get ...;no counterexample to a buggy query, andcpu time: 0 real time: 0 gc time: 0
(unsat)
;a bogus counterexample to a correct query.cpu time: 1 real time: 152 gc time: 0
(model
[l (bv #x71979fa3 32)]
[h (bv #x76b91b88 32)])
;The bitwidth is too high, so we get a 3-10X slowdown.cpu time: 1 real time: 385 gc time: 0
(model
[l (bv #x70000006 32)]
[h (bv #x73fffffa 32)])
cpu time: 0 real time: 430 gc time: 0
(unsat)
cpu time: 3 real time: 33 gc time: 0
(model
[l (bv #x394f0402 32)]
[h (bv #x529e7c00 32)])
read-solution: unrecognized solver output: (error line 68
column 19: Invalid function definition: unknown sort 'Int')
The process by which Rosette turns a query into an SMT constraint is called symbolic evaluation. At a high level, Rosette’s symbolic evaluation works by executing all paths through a program, and collecting all the assumptions and assertions on these paths into the current verification condition (vc ). The resulting (vc ) is then translated to the SMT language and passed to the solver. This evaluation model has two practical implications on writing performant and terminating Rosette code.
First, if a program is slow or runs forever under standard (concrete) evaluation, it will perform at least as poorly under all-path (symbolic) evaluation. Second, if a program terminates quickly on all concrete inputs, it can still perform poorly or fail to terminate on symbolic inputs. So, extra care must be taken to ensure good performance and termination in the presence of symbolic values.
> (bvsqrt(int323))(bv #x00000001 32)
> (bvsqrt(int324))(bv #x00000002 32)
> (bvsqrt(int3215))(bv #x00000003 32)
> (bvsqrt(int3216))(bv #x00000004 32)
(bvsqrtl))call-with-deep-time-limit: out of time
> n0l
> n1(bvlshr l (bv #x00000002 32))
> n2(bvlshr (bvlshr l (bv #x00000002 32)) (bv #x00000002 32))
> n3(bvlshr
(bvlshr (bvlshr l (bv #x00000002 32)) (bv #x00000002 32))
(bv #x00000002 32))
We can force termination by placing a concrete bound k on the number of times bvsqrt can call itself recursively. This approach is called finitization, and it is the standard way to handle unbounded loops and recursion under symbolic evaluation. The following code shows how to implement a sound finitization policy. If a verify query returns (unsat ) under a sound policy, we know that (1) the unrolling bound k is sufficient to execute all possible inputs to bvsqrt, and (2) all of these executions satisfy the query. If we pick a bound that is too small, the query will generate a counterexample input that needs a larger bound to compute the result. In our example, the bound of 16 is sufficient to verify the correctness of bvsqrt on all inputs:
;Parameter that controls the number of unrollings (5 by default).;A simple macro for defining bounded procedures;that use (fuel) to limit recursion.;Computes bvsqrt taking at most (fuel) steps.;Correctness specification for bvsqrt:
;Verification fails due to insufficient fuel.cpu time: 4 real time: 1143 gc time: 0
[assert] Out of fuel.
;Verification succeeds with enough fuel.> (fuel16)cpu time: 4 real time: 67938 gc time: 0
(unsat)
Many other finitization policies can be defined in a similar way. For example, if we change define-bounded to use assume instead of assert , we obtain a finitization policy that ensures completeness. If a verify query returns a counterexample under a complete policy, we know that the program is buggy, and it violates a query assertion within k recursive calls. But if the query returns (unsat ), we know only that there are no bugs within k or fewer unrollings—we cannot conclude anything about longer executions. So a complete policy prevents false positives, while a sound one prevents false negatives. What kind of policy to use depends on the application, and Rosette leaves that choice to the programmer.
1Code examples in these references are written in earlier versions of Rosette. While Rosette 4 is not backward compatible with these versions, they share the same conceptual core.
2print-forms works only on sketches that have been saved to disk.