Demystifying recursive zero-knowledge proofs

Recursive zero-knowledge proofs are new cryptographic primitives relevant to the Anoma blockchain use case. In this article, we investigate the possible alternatives for verifying blockchain circuits efficiently. Our main interest is in estimating the efficiency of pairing-based constructions.

  • Photo of Simon Masson
    Simon Masson
    · 20 min read

Introduction

Motivation

Blockchain consensus protocols allow different parties to verify the validity of the different blocks. Recently, different constructions have been introduced to achieve different properties (for example privacy and anonymity).

The Anoma blockchain computes shielded transactions thanks to the Multi-Asset Shielded Pool. Each block contains different information related to state transitions that can be verified by all users of the blockchain. A basic idea for verifying the current block state is to recompute all state transitions starting from the genesis block. The major disadvantage of this method is that the longer the blockchain, the more computationally expensive it becomes.

We present in this article how state transitions can be verified using a proof of knowledge. Introducing this concept with a simple example without any cryptography involved, this article also describes different alternatives for getting concrete secure proofs from a cryptographic point of view. Subsequently, we present efficient verifications of the whole blockchain using recursive proofs of knowledge. To sum things up, a comparison of the different constructions available is made from a practical point of view.

A simple example

Proofs of knowledge have become very famous in the last few years. We begin this section with a very simple example using a deck of cards.

This simple example shows that:

  • Alice does not reveal information on her secret card (neither the number, nor the suit, i.e. ❤️ or ♦️). We say that this proof is zero-knowledge.
  • Using this undeniable proof, Bob is always conviced by Alice. With this, we say that the proof has the completeness property.
  • A cheater that picked a red card would not be able to prove that he picked a black card. This is called the soundness of the proof.

In the following sections, we consider proofs of knowledge that satisfy zero-knowledge, completeness and soundness. As a public-key cryptography construction, proof schemes are based on a hard mathematical problem. The security of the proofs we investigate here relies on the Discrete Logarithm Problem (DLP) instantiated with two different cases: finite fields and elliptic curves. In the next section, we introduce the needed tools for building proofs of knowledge.

In this article, we assume that the reader is familiar with elliptic curve and pairing-based cryptography, as well as finite field algebra.

Proof of knowledge of an arithmetic circuit

Proofs of knowledge give evidence of the knowledge of the solution of a circuit. The proofs we consider in this article have a similar structure:

  1. Consider an arithmetic circuit,
  2. Translate the circuit in terms of polynomials (see the following section, 'Plonk arithmetization'),
  3. Commit to the polynomials using a commitment scheme.

In the following sections, we consider arithmetic circuits modulo a prime rr. A simple example would be the knowledge of a square root, i.e. given yFry\in \mathbb F_r, the knowledge of xx such that x2=ymodrx^2 = y \bmod r.

The Plonk arithmetization

In this section, we consider the PLONK arithmetization. Arithmetic circuits modulo rr are decomposed into gates of the form qLa+qRb+qOc+qMab+qC=0q_La + q_Rb + q_O c + q_Mab + q_C = 0, where aa, bb and cc are the values we do not want to reveal. The square-root circuit described above can be written as 0x+0x+(1)y+0xx+0=00 \cdot x + 0 \cdot x + (-1) \cdot y + 0 \cdot x\cdot x + 0 = 0, meaning that x2=yx^2 = y. Another example (also related to squares) is the knowledge of a Pythagorean triple, i.e. (x,y,z)(x,y,z) satisfying x2+y2=z2x^2+y^2 = z^2. In order to translate it as a PLONK circuit, we decompose it as

Mathematical claimCorresponding PLONK gate
xx and T1T_1 satisfy x2=T1x^2 = T_10x+0x+(1)T1+0xx+0=00 \cdot x + 0 \cdot x + (-1) \cdot T_1 + 0 \cdot x\cdot x + 0 = 0
yy and T2T_2 satisfy y2=T2y^2 = T_20y+0y+(1)T2+0yy+0=00 \cdot y + 0 \cdot y + (-1) \cdot T_2 + 0 \cdot y\cdot y + 0 = 0
zz and T3T_3 satisfy z2=T3z^2 = T_30z+0z+(1)T3+0zz+0=00 \cdot z + 0 \cdot z + (-1) \cdot T_3 + 0 \cdot z\cdot z + 0 = 0
T1T_1, T2T_2 and T3T_3 satisfy T1+T2=T3T_1+T_2 = T_31T1+1T2+(1)T3+0T1T2+0=01 \cdot T_1 + 1 \cdot T_2 + (-1) \cdot T_3 + 0 \cdot T_1\cdot T_2 + 0 = 0

This circuit can be rewritten using the following table:

GateqLq_LqRq_RqOq_OqMq_MqCq_Caabbcc
100001-10000xxxxx2x^2
200001-10000yyyyy2y^2
300001-10000zzzzz2z^2
411111-10000x2x^2y2y^2z2z^2

The circuit also requires extra information: the first cc-value (say c1c_1) corresponds to the last aa-value (which is x2x^2), etc. In this example, c1=a4c_1 = a_4, c2=b4c_2 = b_4 and c3=c4c_3 = c_4. This information is often called "copy constraints". Note that one of the proof schemes we consider below is defined using the "ultraPLONK" arithmetization, leading to optimizations of the circuits. For simplicity, we consider only the case of the PLONK circuits.

From this circuit, the PLONK arithmetization produces polynomials corresponding to the circuit and the proof constructions rely on polynomial commitments. We will define these in the next section.

Polynomial commitments

The PLONK arithmetization leads to several polynomials we aim to prove the knowledge of. Here, we briefly recall what is a polynomial commitment scheme.

A commitment scheme is a cryptographic primitive that allows one to commit to a chosen value (or chosen statement) while keeping it hidden to others, with the ability to reveal the committed value later. In this context, the value is a polynomial f(x)Fr[x]f(x) \in \mathbb F_r[x], and the commitment corresponds to an evaluation f(s)f(s) for a given sFrs\in\mathbb F_r. From this commitment, the prover is able to open it so that a verifier can be convinced of the knowledge of ff, without knowing its coefficients.

We now present two polynomial commitment schemes relying on different assumptions. It has been studied in this work in a more general context, including the generalization for recursive proofs. We will go back to these constructions at the end of this post.

The IPA commitment scheme

The first polynomial commitment scheme we introduce is developed by Zcash and currently called Halo 2 by its designers. The main idea is to prove the knowledge of a polynomial f(x)f(x) by committing a value f(s)f(s) for a challenge ss given by the verifier. Using elliptic curve cryptography, we are able to prove the knowledge of ff without revealing its coefficients. The scheme relies on the Inner Product Argument (IPA). We summarize here the different steps of the commitment scheme:

  • Setup:

    We consider an elliptic curve EE defined over Fp\mathbb F_p with a subgroup of (large prime) order rr. We also consider deg(f)+1\deg(f)+1 points of E(Fp)E(\mathbb F_p) of order rr. The security relies on the elliptic curve discrete logarithm problem. In practice, this scheme is often used using the Pasta curve. One main advantage of the Halo 2 construction is that there is no trusted setup.

  • Commitment:

    A commitment is a point of E(Fp)E(\mathbb F_p) computed using scalar multiplications by the coefficients of the polynomial ff. An additional random scalar is also required in order to become zero-knowledge.

  • Opening:

    Given a commitment, it computes scalars of Fr\mathbb F_r and points of E(Fp)E(\mathbb F_p) in an algorithm with a O(log(deg(f)))O(\log(\deg(f))) complexity.

  • Verification:

    After recomputing values of the opening step, it checks an equation on the curve. This time, the opening elements are computed with an algorithm whose complexity is linear in the degree of ff.

For a more precise description of this commitment scheme, we refer the reader to the Zcash Rust code and our SageMath code.

In order to reach the 128-bit security level, one needs to choose an elliptic curve with a subgroup of 256-bit prime order. The Zcash team designed an efficient construction with the pasta curves. With these curves, log2p,log2r=256\log_2p, \log_2r = 256.

The PB commitment scheme

This construction relies on different security assumptions as it computes pairings, and has been introduced in this paper. While we need a secure discrete logarithm problem over the curve, the scheme computes a bilinear map that outputs an element of a finite field Fpk\mathbb F_{p^k} for a given embedding degree kk. The security also relies on the discrete logarithm over Fpk\mathbb F_{p^k}^*. Consequently, for a 128-bit security level, this scheme requires a larger prime pp than the Halo 2 commitment scheme. A special security analysis is needed for each curve in order to estimate whether the security has been reached. The Pairing-Based (PB) commitment scheme splits as follows:

  • Trusted setup:

    For this, we consider a pairing-friendly elliptic curve EE defined over Fp\mathbb F_p, which means that for a large prime rr, all the rr-torsion points are efficiently commutable. From now on, we denote G1\mathbb G_1 and G2\mathbb G_2, two orthogonal subgroups of order rr. G1G_1 and G2G_2 denote generators of these cyclic groups. For an integer n>0n>0, a third party chooses a secret integer ss and computes ([si]G1)i=0n([s^i]G_1)_{i=0}^n and G2,[s]G2G_2, [s]G_2. The trusted setup can be computed once and several proofs can be computed using this setup.

  • Commitment:

    The commitment corresponds to evaluating the polynomial at the secret integer ss, and then computing a scalar multiplication. Using the trusted setup, the commitment can be done without even knowing ss.

  • Opening:

    Several extra computations correspond to the extra necessary elements for a verification. It computes modular integer arithmetic as well as polynomial arithmetic.

  • Verification:

    Using the bilinearity of the pairing, the verifier is able to check an equation over the finite field Fpk\mathbb F_{p^k}. These computations require arithmetic over the curve (more precisely, over G1\mathbb G_1).

If you want to learn more about the PB proof scheme, feel free to reference this designer's Rust code, or our post explaining how to compute a proof "by hand". We also provide a SageMath code for further investigations.

Towards recursive proof constructions

A naive solution of verifying all blockchain blocks would be to verify each block separately. As mentioned before, this approach becomes increasingly expensive as the blockchain keeps growing. In order to get an efficient verification, several constructions provide a way of plugging a proof into another one. This is what we call recursive proofs.

Proof of a proof

Suppose that we have two circuits C1\mathcal C_1 and C2\mathcal C_2 we aim to give proofs of knowledge of solutions (x1x_1 and x2x_2). In other words, C1(x1)\mathcal C_1(x_1) and C2(x2)\mathcal C_2(x_2) are true. The idea of "proof of proof" is as follows:

  1. Compute a proof π1\pi_1 of the the knowledge of x1x_1 for the circuit C1\mathcal C_1;
  2. Produce a circuit C1\mathcal C_1' corresponding to the verification of π1\pi_1;
  3. Compute a proof π2\pi_2 of the knowledge of π1\pi_1 for C1\mathcal C_1', together with x2x_2 corresponding to C2\mathcal C_2.

This way, if the verification of π2\pi_2 succeeds, it means that C2(x2)\mathcal C_2(x_2) is true, and that there exists π1\pi_1 verifying the first circuit. In other words, we obtain a proof of both x1x_1 and x2x_2 using only one proof.

The main drawback of this construction is that the circuit corresponding to a verification is often very complex. For instance, in a pairing-based proof scheme, the verification involves the computation of pairings, and these correspond to large circuits.

In general, we prefer computing proofs using an accumulator.

Proofs accumulator

On this topic, accumulators have been introduced for the purpose of "accumulating" proofs. While the Halo 2 construction provides this accumulation, the ideas have been formalized in this paper in a wider context. An accumulator satisfies several properties:

  • Given acc\text{acc} and π\pi, we are able to verify an accumulation, i.e. that a new accumulator acc\text{acc}' indeed accumulates π\pi into acc\text{acc}.
  • Given an accumulator acc\text{acc}, we can verify all the proofs accumulated using one verification algorithm.

Using this construction, one can compute proofs of accumulation leading to a recursive proof. The main advantage of the accumulator construction (in comparison to the "proof of proof") is that verifying an accumulation is much simpler than verifying a proof.

The two schemes we have introduced are adaptable and we can get an accumulator in both cases:

  • In the IPA scheme, proofs correspond to polynomial commitments, and we can accumulate them by computing linear combinations of both the commitments and the polynomials.
  • In the PB scheme, the accumulator is also obtained from linear combinations of proofs, thanks to the bilinearity of the pairing.

In both cases, accumulation verification corresponds to a simpler circuit than in the case of "proof of proof". In practice, it is common to use the accumulator construction, which we will spend the rest of the article focusing on. Going back to the Fr\mathbb F_r circuits C1\mathcal C_1 and C2\mathcal C_2, a construction based on an accumulator would be:

  1. Compute a proof π1\pi_1 of the the knowledge of x1x_1 for the circuit C1\mathcal C_1. The proof π1\pi_1 corresponds to elements defined with integers modulo pp, namely Fp\mathbb F_p elements, and points of a curve defined over Fp\mathbb F_p.
  2. Accumulate this proof π1\pi_1 in an accumulator (elements of Fp\mathbb F_p).
  3. Produce a circuit C1\mathcal C_1' corresponding to the accumulation verification of π1\pi_1. The accumulation verification is often simpler than in the "proof-of-proof" case, but corresponds to an arithmetic modulo pp circuit.
  4. Compute a proof π1\pi_1' of the knowledge of π1\pi_1 for C1\mathcal C_1'. This circuit is an Fp\mathbb F_p-circuit and so we need to use an elliptic curve with a different structure for generating a proof of this circuit. More precisely, we need an elliptic curve with a subgroup of order pp in order to match with the circuit. We will examine in the next sections how difficult it is to achieve such curves.
  5. Accumulate π1\pi_1' into an accumulator (defined over the second curve base field). If this new curve base field is Fr\mathbb F_r, we are able to produce a proof for C2\mathcal C_2 recursively using the first curve.

As we have seen, the choice of curves is very important in order to obtain a recursive proof. We need to be able to produce proofs on two different curves E1,E2E_1,E_2 closely related to each other:

In the next section, we investigate the details of the requirements for E1E_1 and E2E_2 in order to obtain recursive proofs in the context of the IPA and the PB recursive proofs.

Cycles of curves

From the two accumulator constructions, we obtain recursive proofs using the construction with circuits modulo rr and pp. It means that we need two elliptic curves:

  • E1E_1 defined over Fp\mathbb F_p, and rr divides #E1(Fp)\#E_1(\mathbb F_p),
  • E2E_2 defined over Fr\mathbb F_r, and pp divides #E2(Fr)\#E_2(\mathbb F_r).

This definition easily extends to a cycle of more than two curves, but we do not require the formal definition. Depending on the proof scheme, the curves we will use need other properties. In most of the schemes used in practice, a large power of 22 dividing p1p-1 and r1r-1 is required, so that the arithmetic over the finite fields is efficient using FFT.

Following this, we consider three types of cycles of curves:

  1. Non-pairing cycle:

    Both curves are non-pairing-friendly, and in order to get 128 bits of security, one needs to use log2p,log2r256\log_2p, \log_2r ≥ 256. The Halo 2 scheme use the Pasta curves, a non-pairing cycle. It is not very restrictive to get such curves, even with efficiency and security properties.

  2. Half-cycle:

    Several constructions are possible. For instance, this construction provides a cycle with a Barreto-Naehrig curve (which is pairing-friendly) and a non-pairing curve for a (conservative) 128-bit security level (log2p384\log_2p\geq 384). A half-pairing cycle can lead to one-layer recursive proofs, meaning that we can verify a batch of proofs in one time. It is not clear how we could obtain an efficient fully recursive proof using a half-pairing cycle, in particular with a mix of IPA and PB proofs.

  3. Pairing cycle:

    Both curves are pairing-friendly, and the sizes of pp and qq need to be larger in order to reach the 128-bit security level. This depends on the embedding degrees and the structure of the primes, and needs further cryptanalysis study. Currently, the main construction uses MNT-4 and MNT-6 curves, and requires primes pp and rr with at least 768 bits.

Recursive proofs in practice

We now investigate recursive proofs from a practical point of view, with the aim of comparing the different constructions based on IPA and PB proofs.

An IPA recursive proof

The Halo 2 proof scheme described above can be adapted to get an accumulator, and as a result, a recursive proof.

  • Choice of curves:

    The Zcash team uses a cycle of curves called Pallas and Vesta. The two curves are defined with arithmetic modulo 256-bit long primes, reaching high efficiency and security in practice.

  • Size and cost:

    Estimating the size of the proofs and the cost of a verification depends on the circuit considered. Asymptotically, Halo 2 reaches a certain logarithmic complexity, but not a fully succinct proof scheme (where the verification cost does not depend on the circuit size). For a non-optimized circuit with 1024 gates of maximal degree 1, the proof is stored in less than 768 bytes and the verification costs less than 78.25ms. For a given circuit, many optimizations can lead to improvements on both sides.

  • Other properties:

    This scheme does not rely on pairing-based assumptions, and does not require a trusted setup.

A PB recursive proof

An accumulator can also be obtained from the pairing-based scheme studied earlier in this article.

  • Choice of curves:

    Cycles of pairing curves are more complicated to generate. In order to obtain a recursive proof practically, one needs to use two MNT curves. More precisely, we are restricted to curves defined modulo larger primes than in the IPA case. Aurore Guillevic recommends here 992-bit long primes for a conservative security, and the arithmetic over these curves is not really efficient due to the larger moduli. At the moment, the curves obtained here do not fit with the efficient Lagrange polynomial multiplication, but curves with such properties could be found with a bit more research.

  • Size and cost:

    Pairing-based recursive proofs are succinct in the sense that the verification cost is constant and does not depend on the circuit size. Practically, verification costs approximately 20ms, while proofs correspond to 16log2p16\log_2p bits. For a 128-bit security level, proofs are stored in 1984 bytes using the MNT curves.

  • Other properties:

    One drawback of this construction is the required trusted setup, but the construction leads to fully succinct verification. Depending on the size of the circuit, this construction could be preferred compared to the Halo 2 construction.

Mixing IPA×\timesPB recursive proof.

Another alternative is to do a mix of IPA and PB proofs.

  • Choice of curves.

    Using a mix of IPA and PB proofs lets us find a half-pairing cycle with more efficient curves. Namely, it is possible to reach the 128-bit security with curves defined over 446-bit primes (one is pairing-friendly, the other is not) using this work. The pairing-friendly curve is a Barreto-Naehrig (BN) curve and its structure is very similar to the BLS12 curves.

  • Size and cost.

    Using this half-pairing cycle, we are able to build succinct recursive proofs by choosing E1E_1 to be a BN curve:

    1. Setting E1=EBNE_1= E_\text{BN}, the circuits Ci\mathcal C_i lead to succinct proofs πi\pi_i and an accumulator defined over Fp\mathbb F_p.
    2. Circuits Ci\mathcal C_i' correspond to the verification of the accumulation in the case of the PB scheme. They are defined modulo pp, and have all the same fixed size, coming from the succinctness of the πi\pi_i.
    3. Proofs of Ci\mathcal C_i' are obtained using the IPA proof scheme. As the size of the circuits Ci\mathcal C_i is fixed, we also obtain a bounded proof size.
    4. Finally, the verification of the whole recursive proof corresponds to one PB verification, and one IPA verification with fixed size circuits.
  • Other properties.

    Further investigation is needed in order to know if the circuits Ci\mathcal C_i' lead to efficient verification in the IPA curve. This construction relies on the PB proof and therefore requires a trusted setup. The main advantage compared to other construction is the succinctness together with practical efficiency (to be confirmed). This construction allows efficient scalar multiplications using the GLV method (curves have jj-invariant 00). The construction using a BN curve is secure against subgroup attacks, but the twist security check is more expensive: EBNt=334094181358137757019p380E_\text{BN}^t = 3^3 \cdot 409 \cdot 4181358137757019 \cdot p_{380} E2t=32137316349523571327391881161189879663262061651p301E_2^t = 3^2 \cdot 13 \cdot 73 \cdot 1634952357132739 \cdot 1881161189879663262061651 \cdot p_{301} (where pp_\ell denotes a prime of \ell bits, and EtE^t the quadratic twist of EE).

Conclusion

In this article, we presented three constructions of recursive proofs based on different security assumptions, with different trade-offs between sizes of proofs and verification cost. We summarize size and complexities in the following table, while concrete implementations would be needed for real comparisons.

Recursive proof based onIPAPBIPA×\timesPB
Security assumptionECDLPECDLP and pairingECDLP and pairing
Trusted setupnoyesyes
Primes size256-bit992-bit446-bit
(Proof + accumulator) sizelinear in the circuit\approx 4464 bytesneeds more work
Verification cost (in the circuit size)logarithmicconstantconstant*

{}^* constant in the size of the circuit but bound by size of an accumulation verification circuit (can be large).

Using our SageMath code, we are able to compare the timings of all these schemes but not in the most precise way possible, as SageMath is not sufficiently efficient. Though, we can compare the cost of the proof over the 446-bit curves and using the MNT curves, together with the current PB proof based on BLS12-381 curves. The following comparisons are based on our SageMath code, and hence are not precise.

  • The proof verification on a BN-446 curve is slower than over the (non cyclable) BLS12-381 curve, with a factor of 1.72.
  • Over the MNT4-768 curve, the verification costs 1.80 times the cost on the BLS12-381 curve.
  • The proof verification on a MNT6-768 curve costs 2.10 times a verification over the BLS12-381 curve.
  • The proof verification on a MNT6-992 curve is not possible at the moment because no curve (with p1p-1 a multiple of 2322^{32}) has been generated yet.

While it seems to be more expensive, the PB recursive proof seems to be practically possible. Though, other computations may be more expensive: when algorithms involve arithmetic modulo pp, computations become more expensive as long as pp becomes larger. Scalar multiplications, but also polynomial evaluations, would have a significant cost with larger MNT curves.

Finally, these estimations need further investigations in order to confirm our estimations. In particular, an implementation using Rust would be helpful.


Written by Simon Masson, zero-knowledge cryptography researcher & protocol developer at Heliax, the team building the Anoma Network. If you're interested in zero-knowledge cryptography and cutting-edge cryptographic protocols engineering positions in Rust, check out the open positions at Heliax.