Tag, you’re it: Signal tagging in Circom

By Tjaden Hess

We at Trail of Bits perform security reviews for a seemingly endless stream of applications that use zero-knowledge (ZK) proofs. While fast new arithmetization and folding libraries like Halo2, Plonky2, and Boojum are rapidly gaining adoption, Circom remains a mainstay of ZK circuit design. We’ve written about Circom safety before in the context of Circomspect, our linter and static analyzer; in this post, we will look at another way to guard against bugs in your Circom circuits using a lesser-known language feature called signal tags. We present four simple rules for incorporating signal tags into your development process, which will help protect you from common bugs and facilitate auditing of your codebase.

This post assumes some familiarity with the Circom language. We will examine some simple Circom programs and demonstrate how signal tags can be used to detect and prevent common classes of bugs; we will also point out potential pitfalls and weaknesses of the signal tagging feature.

Warning: For the remainder of this post, we will be working with Circom 2.1.6. Details of tag propagation have changed since 2.1.0—we highly recommend using version 2.1.6 or higher, as earlier versions contain severe pitfalls not mentioned in this post.

What are signal tags?

Signal tagging is a feature introduced in Circom 2.1.0 that allows developers to specify and enforce—at compile time—ad hoc preconditions and postconditions on templates. Circom tags help developers ensure that inputs to templates always satisfy the requirements of the template, guarding against soundness bugs while reducing duplication of constraints.

Here is the CircomLib implementation of the boolean OR gate:

C/C++
template OR() {
    signal input {binary} a;
    signal input {binary} b;
    signal output {binary} out;
out <== a + b - a*b;

}

circomlib/circuits/gates.circom#37–43

Assume that we are writing a ZK circuit that requires proof of authentication whenever either of two values (e.g., outgoing value transfers) is nonzero. An engineer might write this template to enforce the authentication requirement.

C/C++
// Require `authSucceeded` to be `1` whenever outgoing value is nonzero
template EnforceAuth() {
    signal input valueA;
    signal input valueB;
    signal input authSucceeded;
signal authRequired <== OR()(valueA, valueB);
(1 - authSucceeded) * authRequired === 0;

}

When tested with random or typical values, this template will seem to behave correctly; nonzero values of valueA and valueB will be allowed only when authSucceeded is 1.

However, what about when valueA == valueB == 2? Notice that authRequired will be zero and thus the desired invariant of EnforceAuth will be violated.

So what went wrong? There was an implicit precondition on the OR template that a and b both be binary—that is, in the set {0,1}. Violating this condition leads to unexpected behavior.

One way to approach the issue is to add constraints to the OR gate requiring that the inputs be binary:

C/C++
template OR() {
    signal input a;
    signal input b;
    signal output out;
// Constrain a and b to be binary
a * (1 - a) === 0;
b * (1 - b) === 0;

out <== a + b - a*b;

}

The problem with this approach is that we have just tripled the number of constraints needed per OR gate. Often the inputs will have already been constrained earlier in the circuit, which makes these constraints purely redundant and needlessly increases the compilation and proving time.

In many languages, input constraints would be expressed as types. Circom, unlike more flexible API-driven frameworks like Halo2, does not support expressive types; all signals can carry any value between 0 and P. However, Circom 2.1.0 and higher does support signal tags, which can be used as a sort of ad-hoc type system.

Let’s see how the OR template would look using signal tags:

C/C++
template OR() {
    signal input {binary} a;
    signal input {binary} b;
    signal output {binary} out;
out <== a + b - a*b;

}

Notice that the logic is entirely unchanged from the original; tags do not affect the compiled constraint system at all. However, if we try compiling the EnforceAuth template now, we get a compiler error:

Unset
error[T3001]: Invalid assignment: missing tags required by input signal.
 Missing tag: binary
   ┌─ "example1.circom":18:26
   │
18 │     signal authRequired <== OR()(valueA, valueB); │ ^^^^^^^^^^^^^^^^^^^^ found here │ = call trace: ->EnforceAuth

previous errors were found

Input tags are preconditions: requirements that inputs to the template must satisfy. By attaching a signal tag to the input, a developer indicates that the corresponding property must already be enforced; the template itself may assume but not enforce the condition.

Pretty cool! Now how do we rewrite the program to properly use tags? Let’s define a new template that properly checks if each value is zero before computing the OR value.

C/C++
// `out` is 1 whenever `in` is nonzero, or 0 otherwise
template ToBinary() {
    signal input in;
    // POSTCONDITION: out is either 0 or 1
    // PROOF:
    // in != 0 => out == 1 (by constraint (2))
    // in == 0 => out == 0 (by constraint (1))
    signal output {binary} out;
signal inv &lt;-- in!=0 ? 1/in : 0;
out &lt;== in*inv;
in*(1 - out) === 0;

}

This is essentially the negation of CircomLib IsZero template, normalizing the input and adding binary tag to the output. Note that binary is just an arbitrary string – Circom does not know anything about the semantics that we intend binary to have and in particular does not check that out is in the set {0,1}. Circom simply attaches the opaque tag binary to the output wire of IsZero.

Output tags are postconditions: promises that the developer makes to downstream users of the signal.

Note that, as Circom does not check our postconditions for us, we must be very careful not to accidentally assign a label to a signal that could possibly carry a value outside the allowed values for the tag. In order to keep track of all the potential ways that a signal can be assigned a tag, we recommend including a comment just above any template output with tags, explaining the reason that the postcondition is satisfied.

Now we can plug this into our EnforceAuth circuit, and everything compiles!

C/C++
// Require `authSucceeded` to be `1` whenever outgoing value is nonzero
template EnforceAuth() {
    signal input valueA;
    signal input valueB;
    signal input authSucceeded;
signal spendsA &lt;== ToBinary()(valueA);
signal spendsB &lt;== ToBinary()(valueB);

signal authRequired &lt;== OR()(spendsA, spendsB);
(1 - authSucceeded) * authRequired === 0;

}

Under the hood, Circom is propagating the tag attached to the output signal of ToBinary, so that spendsA also has the tag. Then when OR checks that its input has the binary tag, it is satisfied.

Tag propagation

Tags are propagated through direct assignment, but not through arithmetic operations. In the following example, signal x acquires the binary tag from in.

C/C++
template Example {
    signal input {binary} in;
// x gets the `binary` tag
signal x &lt;== in;
// one_minus_x does not have the `binary` tag;
signal one_minus_x &lt;== 1 - x;

// Compiler Error
1 === OR()(x, one_minus_x);

// Assume NOT is defined to return a binary output, like OR.
signal not_x &lt;== NOT()(x);
// Then this is OK
1 === OR()(x, not_x);

}

Elements of a signal array have a tag if and only if all members of the array have that tag.

C/C++
template Example {
    signal input {binary} a;
    signal input {binary} b;
    signal input c;
   // xs does not have tag `binary` because `c` does not have the tag
signal xs[3] &lt;== [a, b, c];

// Error: missing tag
1 === OR()(xs[0], xs[1]);

}

Tags with value

A common source of soundness bugs in zero-knowledge circuits occurs when arithmetic operations unexpectedly overflow the finite field modulus. Signal tags in Circom can also carry values, which are compile time variables that are propagated along with the tag. Using tags with value, we can ensure at compile time that operations never overflow.

C/C++
template EnforceMaxBits(n) {
    assert(n < 254); // Number of bits in the finite field 
    signal input in;
// REASON: Num2Bits constrains in to be representable by `n` bits
signal output {maxbits} out;
out.maxbits = n;

Num2Bits(n)(in);
out &lt;== in;

}

// Add two numbers, ensuring that the resut does not overflow
template AddMaxBits(){
signal input {maxbits} a;
signal input {maxbits} b;

// REASON: log(a + b) &lt;= log(2*max(a, b)) = 1 + max(log(a), log(b))
signal output {maxbits} c;
c.maxbits = max(a.maxbits, b.maxbits) + 1
assert(c.maxbits &lt; 254);

c &lt;== a + b;

}

// Multiply two numbers, ensuring that the resut does not overflow
template MulMaxBits(){
signal input {maxbits} a;
signal input {maxbits} b;

// REASON: log(a * b) = log(a) + log(b)
signal output {maxbits} c;
c.maxbits = a.maxbits + b.maxbits;
assert(c.maxbits &lt; 254);

c &lt;== a * b;

}

Tag values must be assigned before the signal is assigned. If a tag value propagates via signal assignment to a signal that already has a different tag value, Circom will throw an error.

Avoiding incorrect tag assignment

While signal tags can help prevent programming errors, the language feature syntax easily allows for accidental or unwarranted addition of tags to signals. Incorrectly assigning a tag to a signal that is not constrained to abide by the rules of that tag undermines the guarantees of the tag system and can easily lead to severe security issues. In order to get the full benefit of signal tags, we recommend strictly adhering to these usage rules.

Rule #1: Output and internal tag annotations must be accompanied by an explanatory comment

We mentioned before that adding tag annotations to output signals is dangerous. Internal signals can also be declared with a tag annotation, which unconditionally adds the tag to the signal. For example, this unsafe modification of the original EnforceAuth program uses tagged internal signals:

C/C++
// Require `authSucceeded` to be `1` whenever outgoing value is nonzero
template EnforceAuth() {
    signal input valueA;
    signal input valueB;
    signal input authSucceeded;
// These signals acquire the `binary` tag

// without any checks that the values are in fact binary
// This is UNSAFE
signal {binary} spendsA <== valueA;
signal {binary} spendsB <== valueB;

signal authRequired &lt;== OR()(valueA, valueB);
(1 - authSucceeded) * authRequired === 0;

}

We strongly recommend that manually tagged internal and output signals be avoided when possible. Any output or internal signal tag annotations must be accompanied by a comment explaining why the tag requirements are satisfied.

Rule #2: Tags should be added to signals using dedicated library templates

In order to minimize the use of manual signal tag annotation in high-level code, we recommend providing a library of helper templates comprising a safe API for using the tag. The following code exemplifies a library for binary values that contains constructors and type-safe operators.

C/C++
// binary.circom
// Tags:
//     binary: signals must be either 0 or 1

// Create a binary value from a constant 0 or 1
template BinaryConstant(b){
// REASON: Only valuid values are allowed at compile time
signal output {binary} out;
assert(b == 0 || b == 1);
out <== b;
}

// Constrains a sinal to be binary and returns a tagged output
template EnforceBinary(){
signal input in;
// REASON: Only solutions to x*(x-1) = 0 are 0 and 1
signal output {binary} out;

in * (in - 1) === 0;
out &lt;== in; } // Empty template simply asserts that input is tagged // Useful for checking / documenting post conditions on output signals template AssertBinary() { signal input {binary} in; } // Returns 1 when input is "truthy" (nonzero), 0 when input is zero template ToBinary() { signal input in; // REASON: // in != 0 =&gt; out == 1 (by constraint (2))
// in == 0 =&gt; out == 0 (by constraint (1))
signal output {binary} out;

signal inv &lt;-- in!=0 ? 1/in : 0;
out &lt;== in*inv;
in*(1 - out) === 0;

}

template AND(){
signal input {binary} a;
signal input {binary} b;
// REASON: 11 = 1, 10 = 0, 01 = 0, 00 = 0
signal output {binary} out <== a*b;
}

template NOT(){
signal input {binary} in;
// REASON: 1 - 0 = 1, 1 - 1 = 0
signal output {binary} out <== 1 - in; } template OR() { signal input {binary} a; signal input {binary} b; // REASON: a = 0 => out = b - 0b = b, a = 1 => out = 1 + b - 1b = 1
signal output {binary} out;

out &lt;== a + b - a*b;

}

Once a sufficiently rich library of templates has been established, developers should rarely need to manually add a tag elsewhere. Reducing the number of manual tags makes auditing for correctness much easier.

Postconditions of higher-level templates can be documented using assertion templates like AssertBinary, without using output tag annotations:

C/C++
template IsZero() {
    signal input in;
    // POSTCONDITION: out has `binary` tag
    signal output out; // Avoid risky output tag annotation here
signal isNonZero &lt;== ToBinary()(in); // Avoid risky internal tag annotation here

out &lt;== Not()(isNonZero);

AssertBinary()(out); // Document and check postcondition with no runtime cost

}

Rule #3: Explicit tag value assignments should be scarce and documented

Most tag values should be assigned automatically by library functions, as in the maxbits example. Whenever a signal is assigned a tag value, an explanatory comment should be nearby.

Rule #4: Tag- with-value must always have a value

Every tag in the codebase must either always have an associated value or never have an associated value. Mixing the two can cause confusion, especially when dealing with signal arrays.

A real-world example

We will look at two issues from our review of Succinct Labs’ Telepathy and explain how Circom tags could have been used to prevent them.

Telepathy is an implementation of the Ethereum sync committee light client protocol, using zkSNARKs written in Circom to accelerate verification of aggregate BLS signatures. The exact details of ETH2.0 light clients and BLS aggregation are not required to understand the bugs, but a refresher on elliptic curves and some notes on big-integer arithmetic in Circom will be useful.

The ETH2.0 light client protocol uses aggregate BLS signatures over the BLS12-381 curve1. Public keys are points (X, Y) on the BLS12-381 curve, where Y2 = X3 + 4 mod Q where Q is a 381-bit prime. Notice that the coordinates of the BLS public keys are 381 bits, while Circom signals can represent at most 254 bits. In order to represent a single public key coordinate, circom-pairing uses seven Circom signals (called “limbs”), each holding a 55-bit value. In order to ensure that representations of big integers are unique and to prevent overflow during arithmetic operations, the developer must ensure that the value of each limb is less than 255.

Ethereum blocks contain commitments to the sync committee public keys in compressed form, meaning that the keys are stored as an X coordinate plus one extra bit to indicate the sign of Y.2 In order to perform arithmetic operations with the curve points, the Telepathy circuits require the prover to provide the Y coordinate corresponding to the public key X coordinate. This Y value is then validated by the SubgroupCheckG1WithValidX template, which in turn enforces that the curve equation holds.

C/C++
/* VERIFY THAT THE WITNESSED Y-COORDINATES MAKE THE PUBKEYS LAY ON THE CURVE */

component isValidPoint[SYNC_COMMITTEE_SIZE];
for (var i = 0; i < SYNC_COMMITTEE_SIZE; i++) {
isValidPoint[i] = SubgroupCheckG1WithValidX(N, K);
for (var j = 0; j < K; j++) {
isValidPoint[i].in[0][j] <== pubkeysBigIntX[i][j];
isValidPoint[i].in[1][j] <== pubkeysBigIntY[i][j];
}
}

telepathy-circuits/circuits/rotate.circom#L101-L109

C/C++
template SubgroupCheckG1WithValidX(n, k){
    signal input in[2][k];
    var p[50] = get_BLS12_381_prime(n, k);
    var x_abs = get_BLS12_381_parameter();
    var b = 4;
    component is_on_curve = PointOnCurve(n, k, 0, b, p);
    for(var i=0; i<2; i++)for(var idx=0; idx<k; idx++)
        is_on_curve.in[i][idx] <== in[i][idx];
}

telepathy-circuits/circuits/pairing/bls12_381_hash_to_G2.circom#L723-L731

However, PointOnCurve assumes that the inputs are properly formatted big integers—in particular that each of the k limbs of Y is less than 2n. This check is never enforced, however, leading to uncontrolled overflow in the intermediate computations. Using this vulnerability, a malicious prover can cause the protocol to become stuck in an irrecoverable state, freezing the light client and any bridge funds depending on continued operation.

Using signal tags could have prevented this bug (TOB-SUCCINCT-1) and two others (TOB-SUCCINCT-2, TOB-SUCCINCT-14) that we found during the review. Properly formed big integer values should have a maxbits tag with a value corresponding to the size of the limbs (in this case, 55). BLS12-381 coordinates should additionally have a fp tag indicating that they are reduced modulo the base field prime. Together these two tags, used to indicate preconditions for templates that expect big integers and reduced finite field elements, would have prevented three major missing constraints in the final circuit.

Conclusion

Circom tags are a powerful feature for preventing bugs due to type confusion, missing range checks, and other common missing constraints. In order to receive the full benefits of the feature and hold yourself accountable for good development practices, follow the four simple rules above.

Tags are not a full solution to ZK circuit security. There are many other types of logic, arithmetic, and integration bugs that can compromise the security of your system. Don’t hesitate to contact us with any questions, and reach out if you would like us to review, specify, or implement any ZK circuit or protocol.

1The “BLS” acronyms in BLS signatures (Boneh–Lynn–Shacham) and BLS curves (Barreto-Lynn-Scott) overlap only for Ben Lynn, whose thesis on pairings is an excellent resource.
2For any X there are at most two corresponding Y values, of the form sqrt(X3 + 4), -sqrt(X3 + 4).

Article Link: Tag, you’re it: Signal tagging in Circom | Trail of Bits Blog