Abstract 1 Introduction 2 First-Order Programming Language with Quantum Case 3 Compilation Strategy 4 Main Results 5 Conclusions and Future Work References Appendix A Omitted proofs Appendix B Examples of Table 1

Branch Sequentialization in Quantum Polytime

Emmanuel Hainry ORCID Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France Romain Péchoux ORCID Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France Mário Silva ORCID Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France
Abstract

Quantum algorithms leverage the use of quantumly-controlled data in order to achieve computational advantage. This implies that the programs use constructs depending on quantum data and not just classical data such as measurement outcomes. Current compilation strategies for quantum control flow involve compiling the branches of a quantum conditional, either in-depth or in-width, which in general leads to circuits of exponential size. This problem is coined as the branch sequentialization problem. We introduce and study a compilation technique for avoiding branch sequentialization on a language that is sound and complete for quantum polynomial time, thus, improving on existing polynomial-size-preserving compilation techniques.

Keywords and phrases:
Quantum Programs, Implicit Computational Complexity, Quantum Circuits
Copyright and License:
[Uncaptioned image] © Emmanuel Hainry, Romain Péchoux, and Mário Silva; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Quantum complexity theory
Acknowledgements:
The authors would like to thank Alexandre Guernut for his work on the compiler implementation and the anonymous reviewers for their constructive and useful comments.
Funding:
This work is supported by the the Plan France 2030 through the PEPR integrated project EPiQ ANR-22- PETQ-0007 and the HQI platform ANR-22-PNCQ-0002; and by the European projects NEASQC and HPCQS.
Editors:
Maribel Fernández

1 Introduction

Motivation

Quantum computing is an emerging paradigm of computation, where quantum physical phenomena such as entanglement and superposition are used to obtain an advantage over classical computation. A testament to the richness of the field is the variety of computational models: quantum Turing machines [6], quantum circuits [31, 26], measurement-based quantum computation [7, 12], linear optical circuits [21], among others.

In recent decades, a lot of effort has been put into developing high-level quantum programming languages that allow programmers to abstract away from the technicalities of these low-level models [27, 3]. Toward that end, several verification techniques such as type systems [15] or categorical approaches for reasoning about program semantics [4, 19] have been studied and developed to ensure the physical reality of compiled programs, for example, by ensuring that they satisfy the properties of quantum mechanics such as the no-cloning theorem [1] and unitarity [14].

An important line of research in this area involves checking polytime termination of quantum programs [11, 29, 17], by showing that any program can be simulated by a polytime quantum Turing machine (QTM). As a consequence of Yao’s Theorem [31], demonstrating that polynomial-time QTMs are computationally equivalent to uniform and poly-sized quantum circuit families, such programs can be instantiated by a uniform family of quantum circuits of polynomial size, i.e., with a polynomial number of qubits and gates. However, direct compilation techniques often fail to ensure this property, due to the use of quantum branching, where the flow in a conditional is determined by the state of a qubit.

As an illustrative example, consider a quantum case, of the shape

qcase q1,,qk of {iSi}i{0,1}k,

which executes the superposition of the unitary transformations Ui implemented by statements Si, depending on the state |i of the control qubits q1,,qk. This statement can be simulated by a QTM that, depending on k symbols in its tape, performs the instructions of different QTMs Mi in superposition. Consequently, its runtime is the maximum runtime among all Mi ([6, Branching Lemma]). Instantiating this qcase statement on quantum circuits can be done sequentially (in-depth) or in parallel (in-width). An in-depth encoding consists in applying each controlled-Ui gate sequentially, thus implementing the unitary transformation i{0,1}k|ii|Ui. The depth of the resulting circuit is the sum of the depths of all Ui, which is exponential in the number k of control qubits. An alternative strategy, inspired by QRAM, allows for the execution of each Ui in parallel in the circuit, using controlled swaps to perform routing of qubit addresses in linear depth on the number of control qubits. This strategy has been recently applied in compilation techniques to automatically parallelize the branches of qcase statements and allow for optimal parallel time (depth) complexity [35].

Since the unitaries are parallelized, the circuit depth is the maximum among the depths of Ui, but this strategy results in circuits where the width (number of qubits) scales as the sum of the complexities of Ui. Consequently, this parallelization requires a number of ancillas that grows exponentially on the number of control qubits. In summary, any implementation, that is agnostic about the structure of each Ui will incur such an exponential cost; this is the case of qcase statements used in the preparation of arbitrary quantum states [34, 16].

Automatic implementations of quantum conditionals therefore produce circuits with a size that scales with the sum (rather than the maximum) of the complexity of each branch. This problem, coined branch sequentialization in [32], leads, in many cases, to an exponential blow-up in circuit complexity, forcing the programmer to rewrite and optimize their code in order to make use of the program structure and improve the complexity bound. A challenging issue is, therefore, to develop quantum programming languages that avoid branch sequentialization by ensuring the correct circuit complexity for quantum control statements while providing full use of quantum control to the programmer [32, 33].

Contribution

This paper solves the above issue by introducing a programming language, called pbp for Polynomially Branching Program, with quantum case and first-order recursive procedures, on which compilation of quantum conditional does not lead to an exponential blow-up in circuit size. Our main contributions are as follows:

  • We introduce a compilation strategy compile (Figures 8 and 9) into quantum circuits, show its soundness (Theorem 8), and show that it avoids branch sequentialization on pbp (Theorem 9), a language with restrictions on recursive calls to procedures. Toward that end, we formally define the time complexity TimeP: of a program P (Definition 1) as a map from the number n of qubits to the maximal number of procedures called during a program execution on the Hilbert space 2n. The time complexity of a qcase statement is precisely the maximum of the time complexities of its branches and we show that branch sequentialization is avoided as the circuits generated by compile have size bounded asymptotically by the time complexity of the compiled program (Theorem 9).

  • A natural question concerns the expressive power of language pbp. We also solve this issue by showing that pbp is sound and complete for quantum polynomial time (Theorem 11). Consequently, any polytime quantum algorithm can be encoded by a pbp program. We illustrate the methodology through well-known examples such as the program QFT realizing the quantum Fourier transform (Example 6).

  • We also discuss asymptotic bounds for a strict extension of pbp, where the reduction in the input sorted set passed to procedure calls can be arbitrary. We show that this extension only increases the circuit complexity of the no-branch-sequentialization case by a linear factor (Theorem 13).

  • We also show that compile strictly improves on existing compilation strategies on first-order quantum programs with quantum case [17] by a polynomial speedup of arbitrarily large degree (Table 1).

A bird’s-eye view of our compilation strategy

We will now consider a simple illustrating example where branch sequentialization can exponentially worsen the size complexity of the compiled circuit, and show how our compilation strategy can preserve a polynomial bound.

The program PAIRS defined in Figure 1 consists in a simple call to procedure pairs (line 7) on a list q¯ of unique qubits. Let |q¯| be the number of qubits in q¯. By language design, the procedure pairs immediately terminates whenever |q¯|=0. The procedure enters a quantum case (line 2) when |q¯|2, otherwise it applies a NOT gate to the single qubit (line 6). On line 2, the program will branch depending on the state q¯[1,2] of the first two qubits in q¯. Out of all four cases (lines 2-5), pairs only performs an operation when the first two qubits are in state |00 or |11, in which case it performs a recursive call on q¯[1,2], the list obtained by removing the first and second qubits of q¯.

Figure 1: Program PAIRS.

Given an input state |xy, with x{0,1} and y{0,1}, pairs will apply a NOT gate to the last qubit y if and only if x is a string in the language (0011). Since pairs performs at most one call per branch, and consumes 2 qubits from its input while doing so, its time complexity is in O(|q¯|), when taking the qcase complexity as the maximum of each branch.

We now discuss the circuit compilation of pairs, when |q¯|2. In Figure 2, we give three strategies (a), (b), and (c), exemplifying different approaches. Strategies (a) and (b) are automatic compilation strategies, which ignore the inner structure of the program – the fact that the bodies of the two non-skip branches of the qcase are identical and therefore encode the same unitary transformation. Figure 2(a) represents an in-depth strategy, whereas Figure 2(b) implements an in-width strategy, making use of different registers r00 and r11, both initialized to |0|q¯|2 to perform each branch in parallel and then recombine the results in the same register. These two implementations require two recursive calls to pairs and, consequently, their size complexity is the sum of the complexities of the two branches. These strategies are performing branch sequentialization and, in both cases, the compiled circuit has size exponential in |q¯|. In contrast, Figure 2(c) avoids this exponential blow-up, making use of the fact that the two branches are identical. This strategy is able to merge the recursive calls into a single procedure call with the use of one ancilla. This is precisely the type of strategy used by our compilation algorithm compile (Section 3). Although this example is relatively simple due to similar recursive calls, compile allows us to deal with more involved situations, where branches are not identical.

Figure 2: Compilation strategies.

Related work

Resource optimization in low-level models of quantum computation is a well-studied subject. Given a specific quantum circuit, it is possible to reduce its number of gates (or at least its number of non-Clifford gates) via techniques such as gate substitution and graph rewriting [25, 23, 20, 13]. The study of resource optimization in the asymptotic scenario is a relatively young research area as it involves reasoning about families of circuits and program structure. Different implicit characterizations of quantum complexity classes have been developed using a lambda-calculus [11], function algebras [29, 30] and a first-order programming language [17]. The last of these provided a compilation strategy into quantum circuits with bounds on the circuit size based on the syntactic restrictions placed on the programs.

Compilation strategies for the quantum control statement (also called quantum switch case or quantum multiplexer) have been studied, for instance, in state preparation [22], appearing in quantum machine learning [18] and Hamiltonian simulation [8]. These techniques either focus on optimizing number of qubits (circuit width) [34] or circuit depth [28], but in all cases correspond to circuits of exponential size. In order to improve on these bounds, one must restrict the set of programs to those that admit an efficient implementation, which can be deduced from the program structure. Optimized compilation techniques in that scenario can then be judged on expressivity and completeness: how easily can one write a program while ensuring the syntactical restrictions? Do these restrictions encompass all efficient programs? The field of implicit computational complexity is particularly well-suited to answer these questions [9, 10].

2 First-Order Programming Language with Quantum Case

In this section, we introduce a programming language with quantum control and first-order recursive procedures. After introducing its syntax and semantics, we introduce a restriction pbp (for Polynomially Branching Programs) on which branch sequentialization can be avoided.

(Integers)i,j,k::=x|n|i±n||s|(Booleans)b::=ii||bb|(Qubits)q::=s[i](Sorted sets)s::=q¯|s[i](Statements)S::= skip;|q*=Uf(j);|SS|if b then S else S|qcase q of {0S,1S}|call proc(s);(Procedure declarations)D::=ε|decl proc(q¯){S},D(Programs)P(q¯)::=D::S

Figure 3: Syntax of programs.

2.1 Syntax of Programs

The language includes four basic datatypes for expressions, whose corresponding expressions are described in Figure 3.

  • Integers: Integer expressions are variables x, constant n, an addition by a constant i±n, or the size |s| of a sorted set s.

  • Booleans: Such expressions b are defined in a standard way.

  • Qubit: qubits expressions are of the shape s[i] which denotes the i-th qubit in sorted set s.

  • Sorted sets: lists of unique (i.e., non-duplicable) qubit pointers. Sorted-set expressions s are either variables q¯ or s[i], the sorted set s where the i-th element has been removed.

Let s[i1,,in] be a shorthand for s[i1],,s[in]. We also define syntactic sugar for pointing to the n-th last qubit in a sorted set, by defining for any n1, q¯[n]q¯[|q¯|n+1].

A program PD::S is defined in Figure 3 as a list of (possibly recursive) procedure declarations D, followed by a program statement S.

Let Procedures be an enumerable set of procedure names. We write procP to denote that proc appears in P. Each procedure of name procP is defined by a (unique) procedure declaration decl proc(q¯){S}D, which takes a sorted set q¯ as input parameter and executes the procedure statement S. We sometimes write Sproc to explicitly refer to the procedure statement S of proc.

Statements include the no-op instruction, unitary application, sequences, conditional, quantum case, and procedures calls. For the sake of universality [5], in a unitary application q*=Uf(j);, Uf(j) is a unitary transformation that can take an integer j and a polynomial-time approximable [2] total function f[0,2π) as optional arguments. The f and i can be omitted when they are not useful, as in a NOT gate.

The quantum conditional qcase s[i] of {0S0,1S1} allows branching by executing statements S0 and S1 in superposition according to the state of qubit s[i]. The procedure call call proc(s); runs procedure proc with sorted set expression s. The quantum conditional can be extended to multiple qubits in a standard way as used in Figure 1: qcase q1,q2 of {00S00,01S01,10S10,11S11} is a shorthand notation for qcase q1 of {0qcase q2 of {0S00,1S01},1qcase q2 of {0S10,1S11}}.

We also make use of some syntactic sugar to describe statements encoding constant-time quantum operations. For instance, the CNOT, SWAP, as well as a controlled-phase shift gate are defined by:

CNOT(q1,q2) qcase q1 of {0skip;, 1q2*=NOT;}
SWAP(q1,q2) CNOT(q1,q2)CNOT(q2,q1)CNOT(q1,q2)
CPHASE(q1,q2,i) qcase q1 of {0skip;,1q2*=Phλx.π/2x1(i);}

Given a program PD::S, the call relation PProcedures×Procedures is defined for any two procedures proc1, proc2P as proc1Pproc2 whenever proc2Sproc1. The relation P is then the transitive closure of P, and P denotes the equivalence relation where proc1Pproc2 if proc1Pproc2 and proc2Pproc1 both hold.

A procedure proc is recursive whenever procPproc holds. The strict order P is defined as proc1Pproc2 if proc1Pproc2 and proc1≁Pproc2 both hold.

2.2 Semantics of Programs

Let 𝔹 denote the set of Booleans and () denote the set of lists of natural numbers, [] being the empty list. We interpret each basic type as follows:

Integers Booleans 𝔹 Qubits Sorted Sets ()

Qubits are interpreted as integers (pointers) and sorted sets are interpreted as lists of pointers. For each basic type τ, the semantics of expressions, given in Figure 5, is described in a standard way as a function

ττ×()τ.

(e,l)τv holds when expression e of type τ evaluates to the value vτ under the context l(). l is simply the sorted set of qubit pointers considered when evaluating e. For instance, we have that (q¯[2],[1,4,5])4 (the second qubit has index 4), (q¯[4],[1,4,5])()[] ([] is used for errors on type ()), (q¯[4],[1,4,5])0 (index 0 encodes error on type ), and (q¯[3],[1,4,5])()[1,4] (third qubit removed).

Figure 4: Semantics of statements.
Figure 5: Semantics of expressions.

Let 2n denote the Hilbert space of n qubits 2n and 𝒫() denote the powerset of .

A configuration c over n qubits is of the shape (S,|ψ,A,l), for some statement SStatements{,}), and being two special symbols denoting termination and error, respectively, |ψ is a quantum state in 2n, where A𝒫() is the set of accessible qubit pointers, and where l() is the list of qubit pointers under consideration. Let Confn, be the set of configurations over n qubits. The initial configuration in Confn of a program D::S on input state |ψ2n is cinit(|ψ)(S,|ψ,{1,,n},[1,,n]). A final configuration can be defined in the same way as cfinal(|ψ)(,|ψ,{1,,n},[1,,n]).

Each unitary transformation U of a unitary application q¯[i]*=Uf(j);, comes with a function U assigning a unitary matrix U(f)(n)2×2 to each integer n and polynomial-time approximable total function f[0,2π). For example, the gates of the quantum Fourier transform can be defined by RnPh(λx.π/2x1)(n) with Ph(f)(n)(100eif(n)). The other basic unary gates are the NOT and the RY gate.

The big-step semantics is defined in Figure 4 as a relation in nConfn××Confn. Standard notations from quantum computation are used such as tensor product , ψ| for the conjugate transpose of |ψ, or given a dimension m, |knI2n1|kI2mn for k{0,1}. In Figure 4, the set A of accessible qubits is used to ensure that unitary operations on qubits can be physically implemented. For example, to ensure reversibility, in a quantum branch qcase s[i] of {0S0, 1S1}, statements S0 and S1 cannot access s[i].

Definition 1.

The time complexity TimeP: of a program PD::S is defined by

TimeP(n)max|ϕ2n{m|ϕ2n,cinit(|ϕ)mcfinal(|ϕ)}.

Intuitively, when cmc holds, the time complexity m is an integer corresponding to the maximum number of procedure calls performed over each (classical and quantum) branch during the evaluation of a configuration cConfn. We write P(|ψ)=|ψ, whenever cinit(|ψ)mcfinal(|ψ) holds for some m. If the program P terminates on any input (i.e., always reaches a final configuration) then P is a total function on quantum states.

Example 2.

For the program PAIRS of Figure 1, TimePAIRS(n)=n2+1 since each procedure call removes two qubits until it reaches a sorted set q¯ such that |q¯|1 (depending on whether n is odd or even) and for both sizes there are no more procedure calls.

2.3 Polynomial Branching Programs

Figure 6: Program QFT for the quantum Fourier transform.

We define three restrictions on the programming language to consider only a strict subset, called pbp for Polynomial Branching Programs, on which branch sequentialization can be avoided. First, we define a well-foundedness criterion to consider only terminating programs. A program P is well-founded if each recursive procedure call removes at least one qubit in its parameter. wf denotes the set of well-founded programs. Then, we define a criterion to exclude programs with exponential runtime. Toward that end, the notion of width of a procedure proc in a program P is introduced.

Definition 3.

Given a program P, the width of a procedure procP, denoted 0ptP(proc), is defined as 0ptP(proc)wPproc(Sproc), where wPproc(S) is defined inductively as follows:

wPproc(skip;) 0
wPproc(q *= Uf(i);) 0,
wPproc(S1S2) wPproc(S1)+wPproc(S2),
wPproc(if b then S0 else S1) max(wPproc(S0),wPproc(S1)),
wPproc(qcase q of {0S0, 1S1}) max(wPproc(S0),wPproc(S1)),
wPproc(call proc’(s);) {1if procPproc’,0otherwise.

Let width1 be the set of programs with procedures of width at most 1.

Programs of width 1 are inherently polynomial as they cannot perform an exponential number of procedure calls in sequence. However, the total number of calls in superposition (that is to say, across all branches of computation) may be exponential for such programs by the definition of the width for conditional and quantum case. For this reason, a compilation strategy whose circuits scale as the sum of the branches instead of their maximum can lead to circuits of exponential size.

Finally, for the purpose of our compilation process, we impose further syntactical conditions on programs. We restrict our attention to what we will call basic programs.

Definition 4.

A program is basic if there exists a sorted set s such that every procedure call is of the shape callproc(q¯) or callproc(s). Let basic denote the set of basic programs.

basic programs are programs whose procedure calls are restricted to the full qubit list or to a fixed qubit list expression. This restriction simplifies the treatment of recursive procedure calls during compilation.

Definition 5 (Polynomially Branching Programs).

The set pbp of polynomially branching programs is defined as pbpwfwidth1basic.

Notice that pbp is strictly included in the programming language of [17], that roughly corresponds to wfwidth1, where procedure calls can take an extra integer parameter. Since the language of [17] is polytime-sound, so is pbp (Theorem 11). The restriction to programs in pbp does not compromise the extensional completeness of the language as, by Theorem 11, pbp is complete for the class fbqp of functions in {0,1}{0,1}, i.e., functions that can be approximated with at least 2/3 probability by a quantum Turing machine running in polynomial time [6, 29].

Example 6.

The QFT program written in Figure 6 is in pbp. Indeed, the program QFT is in basic as calls are only performed on either q¯ or q¯[1]. It is also in wf as all recursive calls are performed on parameter q¯[1]. Finally, it is in width1 as 0ptQFT(qft)=0ptQFT(rot)=0ptQFT(shift)=1.

3 Compilation Strategy

We now present the compilation algorithm from wfwidth1 to quantum circuits based on the merging strategy of Figure 2(c). Compilation is restricted to programs in wfwidth1 for two reasons. The well-foundedness criterion wf ensures that the compilation always terminates. The restriction to width1 prevents exponential blow-up. Note however that, in Theorem 9, the extra basic restriction is required to avoid the branch sequentialization problem, that is, branch sequentialization is avoided on pbp.

3.1 Anchoring and Merging

Figure 7: Anchoring and merging.

A control structure cs is a partial map in {0,1}. Intuitively, cs represents a map from qubit pointers to their control values for a controlled gate and will also be used as a shorthand notation in circuits. For n and k{0,1}, let cs[n:=k] be the control structure obtained from cs by setting cs(n)=k. We denote by dom(cs) the domain of the control structure and by , the control structure such that dom()=.

During the compilation of the program statement, every recursive call is handled as follows: if it is the first call with this procedure name and input size, an ancilla is created and its procedure statement is compiled under a control on this ancilla (anchoring), otherwise this procedure call has already been compiled and the control structure of the new call is sent into the corresponding ancilla (merging). Anchoring and merging are represented in Figure 7 as rewrite rules on quantum circuits for a procedure call call proc(s), cs denoting the control structure corresponding to the (possibly nested) quantum cases, where the procedure call occurs.

akproc represents the ancilla introduced (in case of anchoring) or reused (in case of merging). The integer k refers to the number |s| of qubits in the procedure call. The grey box materializes the controlled statements left to compile.

3.2 The compile Algorithm

Figure 8: Rewrite rules of compile.

Let controlled statements be pairs of the shape (cs,S), for some control structure cs and some statement S. In the compilation process, controlled statements (cs,S) are used to represent a statement S that remains to be compiled into a quantum circuit C together with a control structure cs, representing the qubits controlling the compiled circuit C. The compilation algorithm compile is described by the rewrite rules of Figure 8. Generating the circuit corresponding to the program P=D::S will consist in running compile on the controlled statement (,S) for a fixed list of qubits pointers [1,,n] (hence, an input of fixed size n). We denote by compile(P,n) the circuit obtained for program P on size n. The algorithm standardly generates the quantum circuit corresponding to a wfwidth1 program inductively on the statement S. Indeed, in the rules of Figure 8 when compile is called on a given controlled statement (cs,S), an inductive call to compile is performed on controlled statements whose statements are sub-statements of S. The two base cases are the rules for the skip statement and the unitary application. In these cases, the compilation just outputs the identity circuit and a controlled gate computing the unitary, respectively. The rules for sequence and quantum case perform two inductive calls to compile on each branch of the statement. The rule for quantum case is the only rule that directly performs changes on the control structure. In the particular case of a call to a recursive procedure (i.e., when wPproc(Sproc)>0), compile calls the optimize subroutine to perform anchoring and merging. This call to optimize is highlighted in Figure 8 through the use of a shaded square [Uncaptioned image], which takes the procedure name proc as superscript. We call this process the optimization of procedure proc.






Figure 9: Rewrite rules of optimize.

The rewrite rules for the subroutine optimize on procedure proc are described in Figure 9. This subroutine takes two input parameters:

  • a first list lCst of controlled statements, the ones to be optimized,

  • a second list of controlled statements, called contextual list called lM and denoted by [Uncaptioned image], consisting in controlled statements that are orthogonal to those in lCst and do not contain recursive procedure calls.

As described by the last rule of Figure 8, each initial call to optimize is performed on the singleton list containing one controlled statement (cs,Sproc{s/q¯}) and a contextual circuit equal to the identity circuit.

In Figure 9, the rules are applied by considering the first controlled statement in the list lCst. We just specify the most interesting case below:

  • For a controlled statement (cs,S1S2), two distinct rules can be applied. In the first scenario, where wPproc(S1)=1, we have that S1 contains a recursive procedure call and S2 does not (this is a consequence of the width1 condition in pbp), or the converse. Depending on this, the rule select on which control statement (cs,S1) or (cs,S2) to perform the optimization and, append the compiled circuit of the other controlled statement to the left or to the right.

  • For a controlled statement with an if statement, we precompute the boolean value b (which only depends on classical values such as the integer input and the register size), and according to whether the corresponding branch contains a recursive call or not, we either add the controlled statement to lCst or the contextual list.

  • For a controlled statement with a qcase statement, one or two of the branches are added to the list lCst depending on whether they are both recursive or not. The branch that is not recursive is added to the contextual list.

  • For a controlled statement with a procedure call call proc’(s); we perform either anchoring or merging depending on whether the ancilla a|s|proc’ exists or not. In the case of merging, we include a controlled permutation of qubit addresses (as described in Lemma 12) in order to match the input qubits of the different procedures. Notice that, for basic programs, the input qubits will match and therefore this permutation is trivial, and merging is as described in Figure 7.

At the end of optimize, when lCst=[], we rearrange the contextual list in the following way. Let [(cs1,S1,l1),,(csk,Sk,lk)] be the state of the contextual list at the end of optimize. We may rewrite each controlled statement as a list of its atomic elements, with a transformation denoted seq. Let (q,l)n, then

seq(cs,skip;,l) [],
seq(cs,q*=Uf(j);,l) [(cs,q*=Uf(j);,l)],
seq(cs,S1S2,l) seq(cs,S1,l)@seq(cs,S1,l),
seq(cs,if b then Strue else Sfalse,l) seq(cs,Sb,l), if (b,l)𝔹b,
seq(cs,qcase q of {0S0, 1S1},l) seq(cs[n:=0],S0,l)@seq(cs[n:=1],S1,l),
seq(cs,call proc(s);,l) [(cs,call proc(s);,l)].

This separation of statements allows for a partitioning according to the type of procedure call appearing in the statement. Given a list of controlled statements , we define the list [0,1,,m] where, for procedures proc1, …, procm that are not mutually recursive.

0 {(cs,S,l):proc such that wPproc(Sproc)=1 and wPproc(S)=1}.
proci {(cs,S,l):wPproci(S)=1},i=1,,m.

Given our choice of procedures and the controlled statements obtained from seq, this partition is well-defined. Performing these two partitions (first in terms of sequential order of statements, and then according to procedure calls), we are able to rewrite:

The different instances of optimize contain calls to procedures that are mutually recursive, which will allow for further anchoring and merging.

3.3 Soundness of the algorithm

In this section, we discuss the validity of the compilation algorithm. One first observation should be that, given a pbp program, the compilation necessarily terminates. For instance, in compile (Figure 8), all rules besides the procedure call rewrite the controlled statement into either a circuit or into instances of compile of smaller statements. In the case of a procedure call, the rewriting of the procedure body produces a finite number of calls to procedures of lower recursive level.

In optimize, a recursive procedure will result in a finite number of calls to mutually recursive procedures – this is ensured by the well-foundedness condition wf, that requires that recursive procedure calls reduce the size of the input, therefore procedure calls either reduce the level of recursion or the number of available qubits.

The soundness of the compilation algorithm is ensured by an orthogonality invariant in optimize. Let cs,cs be two control structures. We say that cs and cs are orthogonal if there exists idom(cs)dom(cs) such that cs(i)=1cs(i). Two controlled statements are orthogonal if their control structures are orthogonal.

Lemma 7.

During the compilation of a program Pwfwidth1, for each optimization of a (recursive) procedure proc, all controlled statements in the union of list l and the contextual list lM are pairwise orthogonal.

Proof.

This can be checked by inspection of each rewrite rule in Figure 9, as either the same control structure cs is kept in the list, or it is removed, or we consider two control structures, cs[n=0] and cs[n=1], that preserve the orthogonality relations of cs and are also mutually orthogonal due to the control over qubit n.

This invariant ensures the validity of optimize, and the soundness of the compilation algorithm. It is also a consequence of the width1 restriction in pbp, as by the definition of width, at most one recursive call may appear per branch of a recursive procedure. This ensures that two recursive calls on a given procedure always occur in orthogonal branches and can be simply combined in the same ancilla. Given a circuit C, we define its semantics C naturally as the composition of the semantics of each gate.

Theorem 8 (Soundness of compilation).

Given a program Pwfwidth1 and a quantum state |ψ2n we have that compile(P,n)(|ψ)=P(|ψ).

3.4 Illustrating Example

Figure 10: Program REC.

We illustrate the compilation process with the program REC, of Figure 10. Notice that REC is in wfwidth1 but does not belong to pbp, as the procedure rec performs two recursive calls on different sorted sets q¯[1] and q¯[1,2]. This example thus illustrates that the compilation algorithm is not restricted to pbp.

Given that rec is a recursive procedure, its compilation is performed within optimize. We denote by the empty box [Uncaptioned image] the procedure statement Srec and by the dotted box [Uncaptioned image] the statement call rec(s);. Applying rewrite rules to rec we obtain the transitions:

The first step is obtained by applying the rule of if statements, where we assume that |q¯|>2. Afterwards, we apply twice the rule for the qcase statement, adding the two qubits q¯[1] and q¯[2] to the control structure. Finally, the compilation of the skip statement corresponds to the empty circuit. We denote by rec the application of this sequence of rewrite rules, which is used in Figure 11.

Figure 11: Example of anchoring and merging for the program REC in Figure 10.

One important thing to notice in Figure 11 is that the compilation strategy does not avoid branch sequentialization locally but rather asymptotically in the construction of the circuit. In other words, the qcase statement in rule rec does generate two instances of Srec in the circuit in sequence, one for each branch. However, the merging of calls to rec on inputs of the same size ensures that only one instance per input size needs to be compiled, and therefore this strategy achieves linear complexity in the number of gates.

4 Main Results

4.1 No Branch Sequentialization

First, we show that the problem of branch sequentialization is solved in pbp. For that purpose, we show that the size of quantum circuit obtained through compiling a pbp program is bounded asymptotically by the time complexity of the program. As the time complexity is the maximum of the complexity of the two branches of a quantum case statement, the branch sequentialization problem is avoided on pbp programs, as the size of the circuit is asymptotically equal to its time semantics. Given a circuit C, let #C denote its size, i.e., number of gates and wires.

Theorem 9 (No branch sequentialization).

Given a program Ppbp and input n, we have that #compile(P,n)=O(TimeP(n)) holds.

One may notice that, in the rules of compile (Figure 8), the compilation of a qcase statement generates two controlled statements in sequence. Similarly, the rule of optimize (Figure 9) for qcase statements appends two controlled statements to the list lCst in the case where they are both recursive. This does not pose a problem as, anchoring and merging will ensure that the asymptotic complexity of this statement will be given by the maximum complexity of the branches.

Example 10.

By Theorem 9 and Example 2, we have that #compile(PAIRS,n)=O(n).

4.2 Polynomial-Time Soundness and Completeness

Having shown in Theorem 9 that pbp programs can avoid branch sequentialization, we now turn to the question of whether they constitute an interesting fragment of quantum programs, given that it is restricted by the wf, width1 and basic conditions. We show that the set of pbp programs is sound and complete for quantum polynomial time via an implicit characterization of fbqp, i.e. the set of classical functions that can be approximated with bounded error by a polytime quantum Turing machine [30, 17].

Since the considered programming language does not allow for qubit creation, in order to define functions where the output is larger than the input, we make use of polytime padding. A polytime padding function is a function f:{0,1}{0,1} computable by a (classical) polytime Turing machine such that f(x)=xy, for some y depending only on the length of x (and not the value of x). Given a set of programs, denotes the set of functions given by Pf, where P, f is a polytime padding function, and is the standard function composition. For any p(12,1], we denote by p the set of functions in that approximate a classical function with probability at least p.

Theorem 11 (Soundness and Completeness).

pbp23=fbqp.

4.3 Extension to Programs in wfwidth𝟏

The basic restriction of pbp can be thought of as ensuring that qubits are consumed in a uniform way. As a consequence, any two procedure calls on inputs of size n will correspond to precisely the same n qubits. This guarantees that two compatible procedure calls can be merged simply by combining control structures in the same ancilla.

Without this constraint (i.e., on wfwidth1), we can use precisely the same compilation strategy but where merging is done by control-swapping registers into a single procedure call. For k instances of a procedure on input size n, this requires k1 controlled swaps, each requiring O(n) operations in the worst case. An example of merging two unitary gates U with controlled-swaps is given in Figure 12. If we allow for parallelization of gates in the circuit, this can be done in logarithmic depth, as shown in Lemma 12. An example of a controlled permutation is shown in Figure 13, where vertical dashed lines separate time slices where gates can be applied simultaneously.

Figure 12: Merging orthogonal controlled statements.
Figure 13: Implementation of a controlled permutation in log-depth.
Lemma 12.

Any controlled permutation on n qubits can be performed with a quantum circuit of size O(n) and depth O(logn).

An extension of compile with controlled-swapping as in Figure 12 used for merging procedures ensures the following bound for wfwidth1 programs.

Theorem 13.

For Pwfwidth1, compile(P,n) results in a quantum circuit of size O(nTimeP(n)) and depth O(log(n)TimeP(n)).

4.4 Comparison with Existing Work

The compile algorithm strictly improves upon the size bounds of other compilation algorithms [17], as illustrated by Table 1. These bounds can be empirically tested using the implementation of the compiler available at gitlab.inria.fr/mmachado/pfoq-compiler. In some cases, such as Examples 16 and 15 given in Appendix B, we find families of programs where the gap in complexity (i.e. the degree of the polynomial) can be made arbitrarily large.

In k-Chained Substrings (Example 16), we consider the problem of detecting whether an input string contains substrings y1,,yk appearing in that order. For a certain choice of substrings yi, we define a pbp program of linear time complexity for which the compilation strategy in [17] results in circuits of size Θ(n2k). The problem Sum(r), given in Example 17, consists of checking if an input x=x1xn satisfies i=1nxi=r.

Table 1: Circuit size complexity bounds.
Circuit complexity
Problem Example [17] compile
Full Adder Example 15 Θ(n) Θ(n)
Quantum Fourier Transform Example 6 Θ(n2) Θ(n2)
k-Chained Substrings Example 16 Θ(n2k) Θ(n)
Sum(r), r1 Example 17 Θ(nr) Θ(n)

5 Conclusions and Future Work

The quantum control statement, while being a central component of programming languages with quantum control flow, usually incurs a remarkable slowdown in the program complexity in any automatic implementation, which poses a problem to the quantum programmer. In this paper, we have demonstrated that such a slowdown can be avoided by restricting the program structure. This was achieved using techniques from quantum implicit computational complexity, which allow not only to selectively reason about efficient programs (polytime soundness) but also to ensure that the techniques are sufficient in principle for any program (polytime completeness).

pbp is then the first quantum programming language with a compilation strategy that ensures that the quantum control statement can be compiled onto a circuit whose size (i.e., number of gates and wires) is bounded asymptotically by the maximum of the time complexity of its branches. While this language is extensionally complete for fbqp and expressive enough to write quantum algorithms such as QFT or Full Adder in a natural way, it would be interesting to investigate in what ways its expressive power can be improved, and how different languages can also be shown to avoid branch sequentialization.

References

  • [1] Samson Abramsky. No-cloning in categorical quantum mechanics. Semantic Techniques in Quantum Computation, pages 1–28, 2009.
  • [2] Leonard M. Adleman, Jonathan DeMarrais, and Ming-Deh A. Huang. Quantum computability. SIAM Journal on Computing, 26(5):1524–1540, 1997. doi:10.1137/S0097539795293639.
  • [3] Thorsten Altenkirch and Jonathan Grattage. A functional quantum programming language. In 20th Annual IEEE Symposium on Logic in Computer Science (LICS’ 05), pages 249–258, 2005. doi:10.1109/LICS.2005.1.
  • [4] John Baez. Quantum quandaries: a category-theoretic perspective. The structural foundations of quantum gravity, pages 240–265, 2006.
  • [5] Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A. Smolin, and Harald Weinfurter. Elementary gates for quantum computation. Phys. Rev. A, 52:3457–3467, November 1995. doi:10.1103/PhysRevA.52.3457.
  • [6] Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. SIAM Journal on Computing, 26(5):1411–1473, 1997. doi:10.1137/S0097539796300921.
  • [7] Hans J. Briegel, David E. Browne, Wolfgang Dür, Robert Raussendorf, and Maarten Van den Nest. Measurement-based quantum computation. Nature Physics, 5(1):19–26, 2009. doi:10.1038/nphys1157.
  • [8] Andrew M. Childs and Nathan Wiebe. Hamiltonian simulation using linear combinations of unitary operations. Quantum Info. Comput., 12(11–12):901–924, November 2012. doi:10.26421/QIC12.11-12-1.
  • [9] Andrea Colledan and Ugo Dal Lago. Circuit width estimation via effect typing and linear dependency. In European Symposium on Programming, pages 3–30. Springer, 2024. doi:10.1007/978-3-031-57267-8_1.
  • [10] Andrea Colledan and Ugo Dal Lago. Flexible Type-Based Resource Estimation in Quantum Circuit Description Languages. Proc. ACM Program. Lang., 9(POPL), January 2025. doi:10.1145/3704883.
  • [11] Ugo Dal Lago, Andrea Masini, and Margherita Zorzi. Quantum implicit computational complexity. Theoretical Computer Science, 411(2):377–409, 2010. doi:10.1016/j.tcs.2009.07.045.
  • [12] Vincent Danos and Elham Kashefi. Determinism in the one-way model. Physical Review A, 74(5):052310, 2006. doi:10.1103/PhysRevA.74.052310.
  • [13] Niel de Beaudrap, Xiaoning Bian, and Quanlong Wang. Fast and Effective Techniques for T-Count Reduction via Spider Nest Identities. In 15th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2020), pages 11:1–11:23. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.TQC.2020.11.
  • [14] Alejandro Díaz-Caro, Mauricio Guillermo, Alexandre Miquel, and Benoît Valiron. Realizability in the unitary sphere. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), pages 1–13. IEEE, 2019. doi:10.1109/LICS.2019.8785834.
  • [15] Peng Fu, Kohei Kishida, and Peter Selinger. Linear dependent type theory for quantum programming languages: Extended abstract. In Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’20, pages 440–453, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3373718.3394765.
  • [16] David Gosset, Robin Kothari, and Kewen Wu. Quantum state preparation with optimal T-count. arXiv preprint arXiv:2411.04790, 2024. doi:10.48550/arXiv.2411.04790.
  • [17] Emmanuel Hainry, Romain Péchoux, and Mário Silva. A programming language characterizing quantum polynomial time. In Foundations of Software Science and Computation Structures, pages 156–175. Springer, 2023. doi:10.1007/978-3-031-30829-1_8.
  • [18] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum Algorithm for Linear Systems of Equations. Physical Review Letters, 103(15), October 2009. doi:10.1103/physrevlett.103.150502.
  • [19] Chris Heunen and Jamie Vicary. Categories for Quantum Theory: an introduction. Oxford University Press, 2019.
  • [20] Aleks Kissinger and John van de Wetering. Reducing the number of non-Clifford gates in quantum circuits. Phys. Rev. A, 102:022406, August 2020. doi:10.1103/PhysRevA.102.022406.
  • [21] Emanuel Knill, Raymond Laflamme, and Gerard J. Milburn. A scheme for efficient quantum computation with linear optics. Nature, 409(6816):46–52, January 2001. doi:10.1038/35051009.
  • [22] Gui-Lu Long and Yang Sun. Efficient scheme for initializing a quantum register with an arbitrary superposed state. Phys. Rev. A, 64:014303, June 2001. doi:10.1103/PhysRevA.64.014303.
  • [23] Dmitri Maslov, Gerhard W. Dueck, D. Michael Miller, and Camille Negrevergne. Quantum Circuit Simplification and Level Compaction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 27(3):436–444, March 2008. doi:10.1109/tcad.2007.911334.
  • [24] Cristopher Moore and Martin Nilsson. Parallel Quantum Computation and Quantum Codes. SIAM Journal on Computing, 31(3):799–815, 2001. doi:10.1137/S0097539799355053.
  • [25] Yunseong Nam, Neil J. Ross, Yuan Su, Andrew M. Childs, and Dmitri Maslov. Automated optimization of large quantum circuits with continuous parameters. npj Quantum Information, 4(1):23, May 2018. doi:10.1038/s41534-018-0072-4.
  • [26] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, 2011.
  • [27] Peter Selinger. Towards a quantum programming language. Mathematical Structures in Computer Science, 14(4):527–586, 2004. doi:10.1017/S0960129504004256.
  • [28] Xiaoming Sun, Guojing Tian, Shuai Yang, Pei Yuan, and Shengyu Zhang. Asymptotically optimal circuit depth for quantum state preparation and general unitary synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(10):3301–3314, 2023. doi:10.1109/TCAD.2023.3244885.
  • [29] Tomoyuki Yamakami. A Schematic Definition of Quantum Polynomial Time Computability. J. Symb. Log., 85(4):1546–1587, 2020. doi:10.1017/jsl.2020.45.
  • [30] Tomoyuki Yamakami. Expressing Power of Elementary Quantum Recursion Schemes for Quantum Logarithmic-Time Computability. In Agata Ciabattoni, Elaine Pimentel, and Ruy J. G. B. de Queiroz, editors, Logic, Language, Information, and Computation, pages 88–104, Cham, 2022. Springer International Publishing. doi:10.1007/978-3-031-15298-6_6.
  • [31] Andrew Chi-Chih Yao. Quantum circuit complexity. In Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science, pages 352–361, 1993. doi:10.1109/SFCS.1993.366852.
  • [32] Charles Yuan and Michael Carbin. Tower: Data Structures in Quantum Superposition. Proceedings of the ACM on Programming Languages, 6(OOPSLA2):259–288, October 2022. doi:10.1145/3563297.
  • [33] Charles Yuan, Agnes Villanyi, and Michael Carbin. Quantum control machine: The limits of control flow in quantum programming. Proc. ACM Program. Lang., 8(OOPSLA1), April 2024. doi:10.1145/3649811.
  • [34] Xiao-Ming Zhang, Tongyang Li, and Xiao Yuan. Quantum State Preparation with Optimal Circuit Depth: Implementations and Applications. Phys. Rev. Lett., 129:230504, November 2022. doi:10.1103/PhysRevLett.129.230504.
  • [35] Zhicheng Zhang and Mingsheng Ying. Quantum Register Machine: Efficient Implementation of Quantum Recursive Programs, 2024. doi:10.48550/arXiv.2408.10054.

Appendix A Omitted proofs

Theorem 8 (Soundness of compilation). [Restated, see original statement.]

Given a program Pwfwidth1 and a quantum state |ψ2n we have that compile(P,n)(|ψ)=P(|ψ).

Proof.

By induction on the structure of the program P=D::S in wfwidth1. One can check by inspection of each case that the compile rules for non-recursive statements corresponds to the straightforward circuit semantics of quantum programs.

Likewise, the rewriting rules of optimize, given in Figure 9, can be easily checked using the orthogonality invariant of Lemma 7. For instance, consider the case of the sequential statement S=S1S2. In the case where wPproc(S1)=1, we can derive the rule for the statement with the following steps. We omit the label lCst to simplify the reading.

where (a) corresponds to the definition of the sequential statement, in (b) we make use of the fact that (cs,S2) is orthogonal to all controlled statements in lCst and therefore can be commuted in the circuit, and in step (c) we likewise make use of orthogonality, in this case with the controlled statements in the contextual list. Other case can be inspected to follow a short sequence of steps, such as described for the sequential statement.

For instance, in the case of anchoring, we consider the following composition of rules

where, likewise, (a) corresponds to the typical circuit semantics of a procedure call (with an added anchoring ancilla), and (b) and (c) make use of the orthogonality of cs with the right side of the circuit. The validity of merging and also be checked:

Step (a) can be seen as the definition of the procedure call. Since we are in the merging scenario, a similar procedure call has already been performed and is anchored to its corresponding ancilla. Without loss of generality (since all controlled statements in lCst are orthogonal and therefore commute) we assume that this call appears at the end of lCst. Rule (b) simply indicates that since cs is orthogonal to other control structures in lCst, we may move the controlled statements so that they are adjacent, and where we apply step (c) to perform merging. Since cs is orthogonal to [a|s|proc’=1] the control is added to the ancilla as expected. Finally, orthogonality of cs with other control structures means that we may move the two controlled-NOTs to the edges of the circuit.

We now consider the validity of the rule for the contextual circuit. Consider a list of mutually-orthogonal controlled statements, where the sequential form of (csi,Si) is given by the lists of controlled statements l1(i),,lt(i), then we have that:

The final step comes from the fact that all lj(1) are orthogonal to cs2, …, csk since they include cs1. This can be done for all other (csi,Si), and in the end we obtain the arrangement l1(1)l1(k)l2(1)l2(k)lt(1)lt(k).

Given a certain 1jt, we have that all controlled statements in i=1klj(i) (that is, all controlled statements occurring in time j) are pairwise orthogonal. Therefore, we may rearrange their order according to their recursivity, and in doing so we may consider each time separately. For instance, in time 1, let i=1kl1(i), and let proc1,proc2,procm denote procedures belonging to different recursion families. Then, we perform the partition:

0 {(cs,S):proc such that wPproc(Sproc)=1 and wPproc(S)=1}.
i {(cs,S):wPproci(S)=1},i=1,,m.

By the definition of the sequential form of each controlled statement, we note that the partition is well-defined (e.g. there are no statements containing calls to more than one procedure). Therefore, we are able to rearrange and perform calls to optimize on each i1 and a call to compile on 0. After composing the obtained circuits, we get the rule given in Section 3. This concludes the proof.

We introduce the notion of rank of a procedure for use in the proof of Theorem 9.

Definition 14 (Rank).

Set max()0. Given a program P, the rank of a procedure procP, denoted rkP(proc), is defined as follows:

rkP(proc){0, if ¬(proc’,procPproc’),maxprocPproc’{rkP(proc’)}, if proc’,procPproc’¬(procPproc),1+maxprocPproc’{rkP(proc’)}, if procPproc.
Theorem 9 (No branch sequentialization). [Restated, see original statement.]

Given a program Ppbp and input n, we have that #compile(P,n)=O(TimeP(n)) holds.

Proof.

The theorem can be shown by structural induction on the program body. All cases are straightforward except the one of the quantum control case. We make use of the following two facts regarding pbp programs:

  1. (a)

    The size of the circuit is directly proportional to its total number of unique procedure calls (in the sense required for merging), and

  2. (b)

    a recursive procedure call results in O(n) unique calls to procedures of the same rank. This is because unique calls may only differ on procedure name (of which there is a constant amount) or input size (for which there is a linear number of possibilites).

Consider a quantum control statement S=qcase q of {0S0, 1S1} appearing in optimize in the context of a recursive procedure proc. By (a), the circuit size for S0 and S1 are proportional to the (total) number of procedure calls in each statement, separately. We check that the number of ancillas created for S is bounded by the maximum number of ancillas between S0 and S1.

We proceed by induction on the rank r of the procedure. The base case of r=1 is given by (b), and so we may consider r>1. For the inductive case, we consider two scenarios:

  • wprocP(S0)=wprocP(S1)=1. In this case, both S0 and S1 are of rank r, and all their recursive procedure calls may be merged. Therefore, the asymptotic number of such calls is bounded between the maximum between S0 and S1 (consider that, if there is no overlap between the ancillas needed, their number is still bounded linearly). Applying the IH on the procedures of rank r1 we obtain the desired result.

  • wprocP(S0)=0 and wprocP(S1)=1. In this case, S0 contains calls to procedures of rank r<r whereas S1 contains calls to procedures of rank r. According to optimize, statement S0 is kept in the contextual circuit until no more statements are recursive relative to proc. The statements of rank r which are present in S0 are then merged with the equivalent procedures that were derived from S1 and also added to lM. The number of procedures of rank r is bounded asymptotically by the maximum between those in S0 and S1, therefore we obtain our result.

Theorem 11 (Soundness and Completeness). [Restated, see original statement.]

pbp23=fbqp.

Proof.

Since pbpwfwidth1pfoq, with pfoq the language of [17], we have that pbpwfwidth1pfoq and, by pfoq soundness [17, Theorem 3], it also holds that pbp23fbqp. Completeness can be proven by showing that pbp can simulate the function algebra in [29], known to be complete for quantum polynomial time. The proof can be done using the same construction as in [17, Theorem 5].

See 12

Proof.

Any permutation can be written as the composition of two sets of disjoint transpositions, and therefore any permutation can be performed in constant time, using two time steps [24]. To perform a controlled permutation, it suffices to create O(n) ancillas with the correct controlled state, which can be done in O(logn) depth with O(n) gates.

Appendix B Examples of Table 1

Example 15.

Let us write the instruction qcase q1 of {0skip;,1CNOT(q2,q3)} as TOF(q1,q2,q3). We define the pbp program ADD as follows:

Given a carry-in bit cin, we have that ADD(|anbna1b10ncin)=|anbna1b1couts1sn, where cout encodes the carry-out bit and si encodes the i-th sum bit. Since fullAdder performs one recursive call to an input containing three fewer qubits, TimeADD(n)=n3+1.

Example 16 (k-Chained Substrings).

Consider a program for detecting a substring 001 occurring k times in an input. For the case k=1, we define a program with procedures

The program body is a call to procedure ai on input q¯, which results in a program that performs the transformation |x¯y|x¯(yb) where b{0,1} is 1 if and only if x¯ contains at least 1 instance of 001 as a substring.

For a general k, we consider the program Pk defined by k instances of ai, bi and ci, indexed from i=1 to i=k, and the procedure call to in each ci is replaced with a call to ai+1, except for the last ck. The final program then consists of a call to procedure a1 on input q¯. It is easy to check that Pkpbp, and that TimePk(n)=n, for any k.

Example 17.

Let Sum(r) be the decision problem of checking if an input bitstring x{0,1}n satisfies i=1nxi=r. For instance, if r=3, we may define a program SUM_3 as:

We have that TimeSUM_3(n)=n.