Coherence in logical quantum channels

Joseph K Iverson; John Preskill

doi:10.1088/1367-2630/ab8e5c

1. Introduction

Although there is no rigorous proof, much evidence supports the widely held belief that an ideal noiseless quantum computer would be able to solve problems that are intractable for classical digital computers. But in the real world, quantum computers are noisy. We therefore expect that quantum error correction will be needed to overcome the noise and reliably operate a large-scale quantum computer that can solve hard problems. Fortunately, the accuracy threshold theorem for quantum computation establishes that quantum computing is scalable, assuming that the noise is neither too strong nor too strongly correlated [1–5].

Until we try it on a real device, though, we will not know for sure whether realistic noise is sufficiently benign for quantum error correction to work effectively. A general noise channel acting on n qubits is extremely complex when n is large, so it will not be practical to fully characterize the noise in a complex quantum device using any feasible experimental protocol. A commonly used metric for the performance of single-qubit and two-qubit quantum gates is the 'average infidelity' r = 1 − F, where F is the fidelity of the output from the gate relative to the output of an ideal gate, averaged uniformly over all possible input states. This quantity r has the great virtue that it can be feasibly measured using randomized benchmarking [6, 7], but as a characterization of the noise strength it has shortcomings. Assuming an uncorrelated noise model, threshold theorems guarantee scalability if a different metric, the diamond distance D_◊, is less than a critical value. Here D_◊ denotes the deviation of the noisy gate from the ideal gate as measured by the diamond norm. For an incoherent noise channel like a Pauli channel, the diamond distance D_◊ is equal to the average infidelity r; in contrast, for a highly coherent channel, D_◊ scales like the square root of r. If we know only r, and have no information about the coherence of the noise, we cannot estimate D_◊ accurately, and therefore cannot easily make sound predictions about how effectively any error-correcting code will combat the noise [8–10]. The situation is even worse for correlated noise models.

Our purpose in this paper is to study further how well quantum error correction performs against coherent noise models. To make our analysis manageable, we will make some simplifying assumptions. For one, we will not actually consider quantum computation, but instead will focus on the easier task of operating a quantum memory. We envision encoding a quantum state in the memory using a quantum code; after the encoding step the memory is subjected to noise, and then the quantum state is decoded. As a further simplification, we will assume that the encoding and decoding are noiseless. Therefore, the performance of the code against the noise is captured by a logical channel, the result of composing the encoding channel, noise channel, and decoding channel.

We will be interested in what happens to a quantum state which is stored in the memory for a long time, and undergoes many rounds of error correction—that is, we want to characterize the effect of applying the logical channel many times in succession. For this purpose, we will need to understand the coherence properties of the logical channel. If the logical channel is incoherent, then the diamond distance of the decoded state from the ideal state grows linearly with the number of channel repetitions, while for a highly coherent logical channel, it can grow quadratically. Our main conclusion is that, even if the physical noise acting on the quantum memory is highly coherent, the coherence of the logical channel becomes strongly suppressed as the block length of the quantum error-correcting code increases, assuming that the noise is sufficiently weak and sufficiently weakly correlated.

Although we can analyze the logical channel only in a simplified setting, and only for particular code families, we believe that the lessons learned apply more broadly. We expect, for example, that randomized benchmarking applied to logical gates will accurately characterize logical noise even when the physical noise is highly coherent, at least for large code blocks. This also suggests that for concatenated coding schemes, in which the 'physical' qubits of a higher-level code are themselves the logical qubits of a lower-level code, the average infidelity of the lower-level code should be a good predictor for the performance of the higher-level code.

Our main conclusion is not unanticipated [4], as the suppression of coherence in the logical channel has an intuitive explanation. To decode, one measures the error syndrome, and then applies a recovery operation conditioned on the syndrome. For a large code, many different syndromes are possible, and only the errors which are projected onto the same syndrome value can interfere constructively, while errors projected onto different syndrome values add stochastically. The stochastic average over many syndrome sectors suppresses coherence, leaving only small residual coherent effects arising from summing coherently over errors which are projected onto a given syndrome sector. That said, carefully analyzing the residual coherence in the logical channel involves daunting combinatorics. It turns out that further cancellations occur, resulting in even stronger suppression of logical coherence than might be naively expected.

This discussion about averaging over all syndrome sectors highlights an important issue. We will consider the logical channel obtained by averaging over error syndromes, and then study the coherence of the resulting channel. One could make a case for an alternative procedure: define a metric that characterizes coherence, evaluate that metric for the logical channel conditioned on each syndrome, and then average the value of the metric over syndromes by weighting each syndrome with its probability. To argue in favor of this alternative procedure one might note that the experimentalist who executes the error correction protocol could know the syndrome she measures in each run of the protocol, and might be interested in the properties of the logical channel conditioned on that knowledge [11]. Our view is that properties of logical channels conditioned on the syndrome are potentially of interest for near-term experiments using relatively small codes, particularly because it might be feasible to postselect by retaining favorable syndromes and rejecting unfavorable ones. In future experiments using larger codes, though, syndrome histories will be quite complex, and it will be impractical to make useful inferences about the logical channel conditioned on syndrome information. For long computations using large codes, properties of the logical channel averaged over syndromes will most likely provide more usable guidance regarding the features of the protected quantum computation.

We should also note that methods have been proposed to suppress the coherence of physical noise. One such method is randomized compiling, which, under certain assumptions, can transform any single-qubit noise channel into an incoherent depolarizing channel [12]. The assumptions include a Markovian noise model and gate independence of the noise for the 'easy' gates in the scheme. These assumptions may hold to a good approximation for some realistic cases, but they will not hold exactly. We may then ask how the residual coherence is affected by error correction, an issue that can be addressed using the methods in this paper. Other schemes for mitigating coherent noise have been proposed in [13–16]. These papers focus on the strength of the logical noise, whereas we study the character of the logical noise channel, specifically its degree of coherence.

Here we investigate the coherence of the logical channel in the case where the physical noise is fully coherent unitary noise. This problem has been previously studied [17–19], and we discuss this related work in section 1.3 below. Our work improves on these past results in that we consider a family of codes with an accuracy threshold (toric codes without boundaries) and prove bounds on the logical coherence which apply in the limit of a large code block. By specializing to a particular code family, we also find better bounds on the logical coherence for finite code length. Other authors have obtained numerical results for sufficiently small codes in the case where all physical qubits are rotated about a fixed axis [20–22], including analyses of logical channels conditioned on particular error syndromes [11]. We focus instead on investigating asymptotic properties for large codes, using analytic methods. Some asymptotic statements about the performance of concatenated codes were proven in [23].

In our analysis we make extensive use of the chi-matrix formalism for describing quantum channels. The chi matrix arises when the action of a channel on an input density operator is expanded in terms of Pauli operators (tensor products of 2 × 2 Pauli matrices) acting on the density operator from the left and from the right. A channel can be expressed as the sum of an 'incoherent part' in which the Pauli operators on left and right are equal, and a 'coherent part' in which the Pauli operators on left and right are distinct. Our main task will be to infer, in the case of stabilizer codes, how the logical chi matrix which describes the logical channel after error correction is related to the physical chi matrix which describes the noise acting on physical qubits.

Specifically, we study the logical channel for the toric code on an L × L lattice where L is large, and where error correction is carried out using minimal-weight decoding. We estimate the coherent component of the logical chi matrix up to order L + 2ζ in the rotation angle θ, where ζ is any L-independent constant, and relate this coherent component to the incoherent component of the logical channel. Our main theorem states that the strength of the coherent part of the logical channel is bounded above by strength of the incoherent part times a factor of 1/θ. (Here θ is the rotation angle applied to each of the physical qubits—our result also holds for rotation angles and axes that vary somewhat from qubit to qubit.) From this statement, we may infer that when the logical channel is applied m times in succession, the average infidelity grows linearly with m. (There is a small contribution to the infidelity that grows quadratically with m, but this contribution is highly suppressed by a factor that scales as L^−L.) Stated differently, our result says that after m applications of the logical channel, the accumulated distance from the identity channel, as measured by the diamond norm, grows linearly with m, apart from a correction which is negligible for large L. We emphasize that to reach this conclusion we assumed that the rotation angle θ scales with the block size as 1/L. Therefore, unfortunately, we are not able to make a definitive statement about the coherence of the logical channel in the more physically relevant case where L becomes large with θ fixed; the combinatoric task required exceeded our ability.

A related conclusion holds for a broad class of correlated noise models. We provide a detailed analysis of correlated noise for the simpler case of the quantum repetition code, under the assumption that the noise Hamiltonian commutes with the Pauli operator X acting on each qubit, so that the repetition code provides effective protection against the noise model. In a model in which the rotations acting on pairs of qubits are strongly correlated, we find as expected that the correlations significantly enhance the probability of an uncorrectable logical error. However, the correlations enhance the coherent and incoherent parts of the logical chi matrix by comparable factors. Therefore, our conclusion that the coherence of the logical channel is heavily suppressed in the limit of large code length continues to apply despite the strong pairwise correlations in the noise.

1.1. Summary of the paper

The rest of this paper is organized as follows. In section 1.2 we present an overview of the proof of our main theorem, and in section 1.3 we compare our results to related work by previous authors. Section 2 is a self-contained review of quantum channels, emphasizing metrics for characterizing coherence and relations among them. In particular, we prove a relationship between the chi matrix and the Pauli transfer matrix which had not been previously discussed to our knowledge. In section 3 we compute the logical channel for the repetition code assuming independent unitary noise, finding that the coherence of the logical channel becomes strongly suppressed as the code length increases. Then in section 4 we analyze the repetition code again, this time using the chi-matrix formalism; we find that this analysis can be extended more easily to other stabilizer codes and other noise models. We consider the performance of the repetition code against two-body correlated noise in section 5, again concluding that the logical noise becomes incoherent in the limit of large code length.

The heart of the paper is section 6, where we build on lessons learned from the analysis of the repetition code to prove our main result, which asserts that, for an independent unitary noise model, the coherence of the logical channel is strongly suppressed by the toric code when the code block is large, assuming that the noise strength scales like 1/L. The proof mainly consists of a combinatoric analysis which allows us to estimate the coherent and incoherent components of the logical chi matrix. We have divided the proof into a series of lemmas; figure 1 indicates how these lemmas fit together to build our main theorem, and section 1.2 provides further guidance concerning the structure of the proof. Furthermore, our analysis of two-body correlated noise in the repetition code can be extended to the toric code assuming the noise is sufficiently weak for error correction to succeed with high probability; we therefore conclude that the coherence of the logical channel is highly suppressed even in the case of strongly correlated two-body noise.

**Figure 1.** How the different parts of this paper fit together to build our main result, theorem 3 in section 6.13. See section 1.2 for further details.
Download figure:
Standard image High-resolution image

Section 7 contains our conclusions. There we recount some of the obstacles that prevented us from extending our main theorem to the more physically relevant case where the noise strength is a constant independent of L.

1.2. Overview of the proof of theorem 3

Here we provide some additional guidance regarding how the different parts of this paper fit together to build our main result, theorem 3 in section 6.13. The structure of our argument is also summarized in figure 1.

As already noted, we study the logical channel acting on the code's protected qubits by deriving the chi matrix of this logical channel from the chi matrix of the noise channel acting on the physical qubits. To interpret the meaning of the logical chi matrix, we find it convenient to relate the chi matrix to another formalism for describing quantum channels—the Pauli transfer matrix. We explain some properties of the Pauli transfer matrix N of a channel $\mathcal{N}$ in section 2, relating N to the diamond distance ${D}_{\diamond }\left(\mathcal{N}\right)$ in equation (51) and to the average infidelity r_m of the m-times repeated channel ${\left(\mathcal{N}\right)}^{m}$ in equations (41) and (44). Using lemma 1, these expressions for the diamond distance and the average infidelity in terms of the Pauli transfer matrix can be restated in terms of the chi matrix.

In section 3 we study the performance of the quantum repetition code against coherent noise, and prove theorem 1 using explicit computation of the logical channel combined with results derived in section 2. This result shows that the logical channel is highly incoherent when the code block is large. An alternative proof of theorem 1, making essential use of the chi matrix, is presented in section 4, where we develop the key tools needed for the proof of theorem 3. We also prove lemma 2, which is used to show that, for independent unitary noise acting on the physical qubits, the coherence of the logical channel for the repetition code is maximized when all qubits are rotated by the same angle. A similar idea can be adapted for analyzing the coherence of the logical channel for the toric code.

In section 5, we extend the analysis of the repetition code to the case of two-body correlated coherent noise, culminating in the proof of theorem 2, showing that the coherence of the logical channel is heavily suppressed in this case as well. The proof is a computation of the logical channel for this case, achieved by a detailed combinatoric analysis. As expected, the noise correlations enhance the probability of a decoding error, but it turns out that both the coherent and incoherent parts of the logical channel are enhanced, so that the relationship between the two is not changed much compared to the case of uncorrelated coherent noise. The same reasoning used to prove theorem 2 can also be applied to the toric code to show that, in that case as well, two-body correlations in the noise do not enhance the coherence of the logical channel.

Our analysis of the performance of the toric code against coherent noise, culminating in the proof of theorem 3, is in section 6. To prove the theorem we compute first the coherent part of the logical channel, and then the incoherent part, after which we can make an inference about how the two are related. For this purpose, upper bounds on the logical noise strength would not suffice. Instead, we compute both the coherent and incoherent part of the logical channel up to an error which we show is small if the physical noise is sufficiently weak.

Our arguments in section 6 make use of observations, discussed in section 4, which apply to any stabilizer code. We may assign a 'standard error' E_s to each error syndrome s, and define a decoder which returns the damaged state to the code space by applying ${E}_{s}^{{\dagger}}$ when the syndrome is measured to be s. This E_s is a Pauli operator acting on the code block. Furthermore, each logical Pauli operator ${\tilde {L}}_{a}$ acting on the code may by convention be associated with a particular standard physical Pauli operator L_a—the choice of L_a is not unique, and therefore must be fixed by convention, because we have the freedom to multiply L_a by an element of the code's stabilizer group without changing its logical action. Once the standard error for each syndrome, and the physical Pauli operator corresponding to each logical Pauli operator, are determined, any physical Pauli operator acting on the code block has a unique decomposition of the form (up to a phase factor) σ(s, a, x) = E_sL_aG_x, where E_s is a standard error, L_a is a standard logical Pauli operator, and G_x is an element of the code stabilizer.

In the chi matrix formalism, the result $\mathcal{N}\left(\rho \right)$ of applying noisy channel $\mathcal{N}$ to density operator ρ is expanded as a sum of terms of the form σ(s, a, x) ρ σ(s',a',x')^†. As explained in section 4.2, if ρ is a logical density operator, then a term of this form is annihilated by the error recovery operation for s ≠ s', and for s = s' is mapped to ${L}_{a}\rho {L}_{{a}^{\prime }}^{{\dagger}}$ , up to a phase. (That phase is important, and we will need to keep track of it carefully.) Recovery is successful if L_a and L_a' are both logical identity operators. The terms in the logical channel with L_a = L_a' are said to be incoherent, and the terms with L_a ≠ L_a' are said to be coherent.

The key point is that we have a conceptually simple algorithm for computing the chi matrix for the logical channel, and for identifying its coherent and incoherent parts. To find the coefficient of ${L}_{a}\rho {L}_{{a}^{\prime }}^{{\dagger}}$ in the logical channel, we just need to sum up the coefficients of all terms in the physical chi matrix of the form σ(s, a, x) ρ σ(s,a',x')^†, being mindful of phase factors, for all possible values of s, x, x'. Unfortunately, in general this algorithm is too complex to carry out in practice, but under suitable conditions we can estimate logical chi matrix with sufficient accuracy for our purposes.

For the case of the toric code, we can begin by noting some helpful simplifications. We choose standard errors defined by minimal-weight decoding. Because of the code's CSS structure, we can analyze the logical X and logical Z errors separately, and in fact a single analysis applies to errors of both types. We do not need to worry about logical Y errors or about logical errors acting nontrivially on more than one of the code's logical qubits (lemma 14 in appendix H) because these are so highly suppressed; the same goes for coherent errors in which both L_a and L_a' are nontrivial (lemma 15 in appendix I). We can assume that the coherent noise rotates physical qubits about an axis in the X–Z plane (lemma 13 in appendix G); otherwise the logical noise would be even less coherent. We are left with the task of estimating two nontrivial elements of the logical chi matrix—the coherent term ${\tilde {Z}}_{1}\rho \tilde{I}$ term and the incoherent term ${\tilde {Z}}_{1}\rho {\tilde {Z}}_{1}$ , where ${\tilde {Z}}_{1}$ denotes the logical Z operator acting on one of the code's two encoded qubits. In the proof of theorem 3, we estimate both quantities using a series of approximations, and verify that these approximations are trustworthy when the physical noise is sufficiently weak.

First consider the coherent part of the logical chi matrix. We need to sum up all the terms in the physical chi matrix which contribute to ${\tilde {Z}}_{1}\rho \tilde{I}$ after the action of the decoding map. Each such term has the form ${E}_{s}{Z}_{1}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}$ , where E_s denotes a standard correctable Pauli error, G_x and G_y are Pauli operators in the code stabilizer, and Z₁ is the standard physical Pauli operator whose logical action matches ${\tilde {Z}}_{1}$ . For the purpose of our computation, we may assume that all the Pauli operators are of the Z type—that is, each applies Z to a subset of the qubits and applies I to the complementary set. For the purpose of enumerating all such contributions, it is convenient to note that the product ${G}_{y}^{{\dagger}}{Z}_{1}{G}_{x}$ of the Pauli operators acting on the density operator from the right and from the left is a logical operator, one commuting with the code stabilizer. This logical operator can be decomposed into a connected path that winds once around on the periodically identified square lattice—what we call a 'logical string'—and a collection of homologically trivial closed loops on the lattice—what we call the 'disconnected part' of the logical Pauli operator.

We can therefore enumerate all the contributions to ${\tilde {Z}}_{1}\rho \tilde{I}$ by this procedure: (1) consider all possible logical strings. (2) For each logical string, consider all possible 'partitions' of that string into an uncorrectable error acting from the left and a correctable error acting from the right. (3) For each logical string and partition, consider all possible choices for the disconnected part. We compute ${\tilde {Z}}_{1}\rho \tilde{I}$ by summing all these contributions. Though we cannot perform this sum exactly, we can approximate the sum and estimate the resulting errors.

It is for the purpose of approximating this sum that we need the assumption that the rotation angle θ scales like 1/L, where L is the linear system size. Under this assumption, we show that we make a small error by truncating the sum to include only relatively short logical strings (lemma 3 in section 6.5) which have a typical shape (lemmas 9 and 10 in appendix D). Summing over all partitions of a fixed logical string is similar to the computation we performed for the repetition code, but with a few new subtleties. Specifically, there are some 'exceptional' partitions such that the uncorrectable error acting from the left actually has lower weight than the correctable error acting from the right. Fortunately, we can show that we make a small error by ignoring this effect (lemma 4 in section 6.6), simplifying the sum over partitions.

For a fixed connected logical string and partition of that string, we need to sum over disconnected closed loops and partitions of those loops. Performing this sum is almost equivalent to adding up all possible error patterns weighted by their probabilities, which trivially sums to unity. The only complication is that, for some closed loops that closely approach the logical string, and for some special partitions, the additional loop can flip how the error is decoded. It turns out, though, that we make only a small error by ignoring this effect (lemma 11 in appendix E).

With all the above simplifications in hand, we can estimate the coherent part of the logical chi matrix. In particular, the sum over partitions for a fixed logical string can be evaluated much as in the proof of theorem 1 for the repetition code. It then remains to estimate the incoherent part and compare the two.

In the incoherent part, ${\tilde {Z}}_{1}$ acts from both the left and the right; therefore, there are two logical strings to keep track of, one on each side. These two logical strings have segments in common, determined by the intersection of the string with the standard error, but are free to fluctuate independently away from those segments (figure 10 of section 6.8). To approximate the sum over contributions from these logical strings to the incoherent part of the logical chi matrix, we may truncate the sum as in the computation of the coherent part, limiting our attention to relatively short strings with a typical shape (lemma 5 in section 6.8 and lemma 6 in section 6.9), and ignoring complications arising from the disconnected part of the error (lemma 12 in appendix F). Furthermore, we may also ignore contributions with 'mismatched weight', confining our attention to minimal-weight uncorrectable errors on the logical string acting from both the left and the right (lemma 7 in section 6.10). With these approximations, the incoherent part of the logical chi matrix may be expressed in a form which can be conveniently compared with the coherent part.

As for the repetition code, we can justify considering unitary noise such that all physical qubits are rotated by the same angle—rotating different qubits by different angles only makes the logical channel less coherent (lemma 8 in section 6.11). For such a coherent noise model with uniform rotation angles, we compare the coherent and incoherent parts of the logical chi matrix, proving theorem 3 (section 6.13). Using the findings from section 2, these results can be translated into statements about the diamond distance of the logical channel and about the average infidelity of the m-times repeated logical channel. We also observe (section 6.12), that our analysis of the performance of the repetition code against two-body correlated coherent noise (theorem 2) is applicable with few modifications to the toric code as well.

Our conclusion that the coherence of the logical channel is heavily suppressed applies in the limit of large code size L, and under the assumption that the physical qubits are rotated by an angle θ scaling like 1/L. In Section 7 we discuss the difficulties that have prevented us from extending the result to larger values of θ.

1.3. Related work

The performance of stabilizer codes against fully coherent unitary noise has been previously studied in [17–19]. Huang, Doherty, and Flammia [18] derived an inequality which relates the diamond distance D_◊ of the logical channel from the identity to the rotation angle θ for independent unitary noise, finding

$\begin{equation}{D}_{\diamond }{\leqslant}{c}_{n,k}\vert \mathrm{sin}\;\theta {\vert }^{d};\end{equation} \tag{ 1 }$

here d is the code distance, n is the code length, and k is the number of encoded qubits. Their result applies to any stabilizer code, but c_n,k grows exponentially with n (it is bounded above by 2^3n+k+1), so their result is not very informative for large codes. In contrast, we derive a bound relating the coherent and incoherent components of the logical channel which does not involve any exponentially large factors. We achieve this improved result by specializing to the toric code, and by assuming sin θ < 1/L. Furthermore, to obtain equation (1) the authors of [18] bounded a sum of contributions to the logical channel using the triangle inequality, hence obtaining a bound that would apply even if all the terms in the sum had a common phase. Instead, we sum the contributions with the appropriate phases; the resulting cancellations among terms yield a much smaller result than we would have obtained by merely invoking the triangle inequality. We are able to carry out this more detailed analysis because our assumption sin θ < 1/L allows us to restrict our attention to short logical strings, for which approximating the sum becomes a manageable task.

Beale, Wallman, Guttiérrez, Brown, and Laflamme [19] also studied the performance of stabilizer codes against independent unitary noise, and they concluded that the coherence of the logical channel is suppressed. For a fixed code length, they study the limit of small rotation angle θ. If the logical channel is expanded in powers of θ, then for sufficiently small θ the leading term in this expansion dominates, and they draw their conclusions by analyzing this leading term. In effect, they (like us) investigate the case in which the noise strength deceases as the code length increases, but their assumption about the noise strength is much stronger than ours. We (unlike them) include all corrections to the logical channel higher order in θ that are needed to accurately approximate the logical channels for sin θ < 1/L, albeit only for the special case of the toric code.

Bravyi, Engelbrecht, König, and Peard [21] have studied the performance of the toric code against independent unitary noise numerically, using a clever mapping from qubits to Majorana fermions, for code distance up to d = 37, and they found that the coherence of the logical channel becomes negligible as the code length increases, provided that the rotation angle θ is smaller than a nonzero constant threshold value θ₀. Their numerical method applies to a noise model in which all qubits are rotated about the Z axis, which according to our analysis is the worst case that maximizes the coherence of the logical channel. The numerical results support a value of θ₀ greater than 0.25 and less than 0.32, while for the largest code sizes they consider our analytic results apply only for θ less than about 0.027. They characterize the coherence of the logical channel by sampling from the distribution governing the logical rotation angle θ_logical conditioned on the measured error syndrome, finding that this distribution becomes strongly peaked around θ_logical = 0 for large code length when θ_physical is smaller than θ₀. They also consider, as we do, the logical channel averaged over syndromes, and show that the 'twirled' logical channel has an error probability close to the error probability of the untwirled logical channel for large code length, a further indication of suppressed logical coherence. Their numerical findings appear to be at least notionally consistent with our analytic results, though it is difficult to make a quantitative comparison because our formulas are accurate only for asymptotically large L and for L sin θ sufficiently small compared to 1.

2. Channel parameters

2.1. Pauli transfer matrix

We will use the Pauli transfer matrix representation to describe channels acting on n qubits. For this purpose we expand the density operator ρ in the Pauli operator basis {σⁱ}:

$\begin{equation}\rho =\sum _{i=0}^{{d}^{2}-1}{\rho }_{i}{\sigma }^{i},\end{equation} \tag{ 2 }$

where

$\begin{equation}\mathrm{Tr}\left({\sigma }^{i}{\sigma }^{j}\right)=\frac{1}{d}{\delta }^{ij},\end{equation} \tag{ 3 }$

and σ⁰ = (id)/d. Here d = 2ⁿ is the Hilbert-space dimension, and id denotes the d × d identity matrix. Note that Tr(ρ) = ρ₀. A linear map $\mathcal{N}$ acting on density operators defines a d² × d² matrix (the Pauli transfer matrix associated with $\mathcal{N}$ ) according to

$\begin{equation}\mathcal{N}\left(\rho \right)=\sum _{i,j}\left({N}_{ij}{\rho }_{j}\right){\sigma }^{i}.\end{equation} \tag{ 4 }$

This matrix is real if $\mathcal{N}$ maps Hermitian operators to Hermitian operators. If the map $\mathcal{N}$ is trace preserving, then ∑_iN_0iρ_i = ρ₀; hence N_0i = δ_0i. If the map $\mathcal{N}$ is unital (that is, $\mathcal{N}\left(id\right)=id$ ), then ∑_iN_ijδ_j0 = δ_i0; hence N_i0 = δ_i0. Thus the matrix representing the map $\mathcal{N}$ may be expressed as

$\begin{equation}N=\left(\begin{pmatrix}1\hfill & 0\hfill & 0\hfill & \hfill \cdots \hfill \\ \hfill {N}_{n}\hfill & \hfill \hfill & \hfill {N}_{u}\hfill \\ \hfill \hfill \end{pmatrix}\right).\end{equation} \tag{ 5 }$

We say that the (d² − 1) × (d² − 1) matrix N_u is the unital part of $\mathcal{N}$ and that the length-(d² − 1) vector N_n is its nonunital part. Altogether the trace-preserving map $\mathcal{N}$ is specified by d²(d² − 1) parameters.

For a unitary map $\mathcal{N}\left(\rho \right)=U\rho {U}^{{\dagger}}$ , we have N_n = 0 and (for i ≠ 0)

$\begin{equation}U{\sigma }^{i}{U}^{{\dagger}}=\sum _{j=1}^{{d}^{2}-1}{\left({N}_{u}\right)}_{ji}{\sigma }^{j},\end{equation} \tag{ 6 }$

where

$\begin{equation}d\;\mathrm{Tr}\left(U{\sigma }^{i}{U}^{{\dagger}}U{\sigma }^{k}{U}^{{\dagger}}\right)={\delta }^{ik}=d\sum _{j,l\ne 0}{\left({N}_{u}\right)}_{ji}{\left({N}_{u}\right)}_{lk}\mathrm{Tr}\left({\sigma }^{j}{\sigma }^{l}\right)=\sum _{j\ne 0}{\left({N}_{u}\right)}_{ji}{\left({N}_{u}\right)}_{jk};\end{equation} \tag{ 7 }$

hence N_u is an orthogonal matrix.

The matrix representing $\mathcal{N}$ is diagonal if and only if the map is a convex sum of Pauli operators

$\begin{equation}\mathcal{N}\left(\rho \right)=\sum _{i}{p}_{i}{\sigma }^{i}\rho {\sigma }^{i},\end{equation} \tag{ 8 }$

in which case the diagonal entries are

$\begin{equation}{N}_{jj}=\sum _{i}{p}_{i}{\xi }_{ij},\end{equation} \tag{ 9 }$

where σⁱσ^j = ξ_ijσ^jσⁱ; that is, ξ_ij is the sign ±1 determined by whether the Pauli operators σⁱ and σ^j commute or anticommute.

2.2. Average infidelity

The fidelity F of a channel $\mathcal{N}$ acting on a pure state |ψ⟩ is defined by

$\begin{equation}F=\langle \psi \vert \mathcal{N}\left(\rho \right)\vert \psi \rangle ,\end{equation} \tag{ 10 }$

and 1 − F is called the infidelity. The average infidelity r of $\mathcal{N}$ is

$\begin{equation}r=1-\int \mathrm{d}U\;\mathrm{Tr}\left[U\rho {U}^{{\dagger}}\mathcal{N}\left(U\rho {U}^{{\dagger}}\right)\right],\end{equation} \tag{ 11 }$

where the integral is with respect to the normalized invariant Haar measure on the unitary group, and ρ is any pure state. Equivalently, r is the infidelity of the averaged channel

$\begin{equation}\overline{\mathcal{N}}\left(\rho \right)=\int \mathrm{d}U\;{U}^{{\dagger}}\mathcal{N}\left(U\rho {U}^{{\dagger}}\right)U.\end{equation} \tag{ 12 }$

We may just as well define r as the infidelity of $\mathcal{N}$ averaged over a unitary two-design. Hence r can be measured in randomized benchmarking experiments, in which U is chosen by sampling uniformly from the Clifford group, which is a unitary two-design.

The d × d unitary matrix U defines an orthogonal (d² − 1) × (d² − 1) matrix N_u = O according to

$\begin{equation}U{\sigma }^{i}{U}^{{\dagger}}=\sum _{j=1}^{{d}^{2}-1}{O}_{ji}{\sigma }^{j},\quad {U}^{{\dagger}}{\sigma }^{i}U=\sum _{j=1}^{{d}^{2}-1}{O}_{ji}^{\text{T}}{\sigma }^{j},\end{equation} \tag{ 13 }$

where O^T denotes the transpose of O; therefore

$\begin{equation}{U}^{{\dagger}}\mathcal{N}\left(U{\sigma }^{i}{U}^{{\dagger}}\right)U=\sum _{j=1}^{{d}^{2}-1}{\left({O}^{\text{T}}{N}_{u}O\right)}_{ji}{\sigma }^{j};\quad {U}^{{\dagger}}\mathcal{N}\left(U{\sigma }^{0}{U}^{{\dagger}}\right)U={\sigma }^{0}+\sum _{i=1}^{{d}^{2}-1}{\left({O}^{\text{T}}{N}_{\text{n}}\right)}_{i}{\sigma }^{i}.\end{equation} \tag{ 14 }$

The uniform average of U over the unitary group becomes a uniform average of O over the orthogonal group. The nonunital part of $\mathcal{N}$ averages to zero, and the average of the unital part can be evaluated using

$\begin{equation}\int \mathrm{d}O\;{O}_{ij}^{\text{T}}{O}_{kl}=\frac{1}{{d}^{2}-1}{\delta }_{jk}{\delta }_{il},\end{equation} \tag{ 15 }$

which yields

$\begin{equation}{\left({\overline{N}}_{u}\right)}_{ij}=\frac{\mathrm{Tr}\left({N}_{u}\right)}{{d}^{2}-1}{\delta }_{ij}.\end{equation} \tag{ 16 }$

Hence, the averaged channel is a completely depolarizing Pauli channel of the form

$\begin{equation}\overline{\mathcal{N}}\left(\rho \right)=p\rho +\left(1-p\right)\left(\frac{id}{d}\right),\end{equation} \tag{ 17 }$

where

$\begin{equation}p=\frac{1}{{d}^{2}-1}\mathrm{Tr}\left({N}_{u}\right).\end{equation} \tag{ 18 }$

Note that if this averaged channel is applied m times in succession, we obtain

$\begin{equation}{\overline{\mathcal{N}}}^{m}\left(\rho \right)={p}^{m}\rho +\left(1-{p}^{m}\right)\left(\frac{id}{d}\right);\end{equation} \tag{ 19 }$

thus p is called the benchmarking parameter because it determines the rate of exponential decay of fidelity in benchmarking experiments. The average infidelity r is given by

$\begin{equation}r=1-\langle \psi \vert \overline{\mathcal{N}}\left(\vert \psi \rangle \langle \psi \vert \right)\vert \psi \rangle =1-\left(p+\frac{1-p}{d}\right)=\frac{d-1}{d}\left(1-p\right)=\frac{1}{d\left(d+1\right)}\mathrm{Tr}\left({I}_{{d}^{2}-1}-{N}_{u}\right)\end{equation} \tag{ 20 }$

for any pure state |ψ⟩. Here ${I}_{{d}^{2}-1}$ denotes the (d² − 1) × (d² − 1) identity matrix. Because N₀₀ = 1, we may also express the infidelity as

$\begin{equation}r=\frac{1}{d\left(d+1\right)}\mathrm{Tr}\left({I}_{{d}^{2}}-N\right),\end{equation} \tag{ 21 }$

where ${I}_{{d}^{2}}$ denotes the d² × d² identity.

2.3. Examples

2.3.1. Depolarizing channel

We have seen that if ${\mathcal{N}}_{p}$ is the depolarizing channel with benchmarking parameter p, then ${\left({\mathcal{N}}_{p}\right)}^{m}={\mathcal{N}}_{{p}^{m}}$ . Using the relation $r=\frac{d-1}{d}\left(1-p\right)$ , we can express the infidelity r_m of ${\left({\mathcal{N}}_{p}\right)}^{m}$ in terms of the infidelity r of ${\mathcal{N}}_{p}$ , finding

$\begin{equation}{r}_{m}=\frac{d-1}{d}\left(1-{p}^{m}\right)=mr-\frac{d}{2\left(d-1\right)}m\left(m-1\right){r}^{2}+\mathcal{O}\left({m}^{3}{r}^{3}\right).\end{equation} \tag{ 22 }$

If mr is small, the infidelity accumulates linearly with m, the number of times the channel is applied. A similar remark applies to more general Pauli channels.

We say that a channel with this property is incoherent. The interpretation is that (up to a constant factor), the infidelity r may be regarded as a probability of error. If the channel is applied m times, where mr is small, any one of the m instances of the channel could be faulty, so that the total probability of error is mr + higher-order terms.

2.3.2. Qubit rotation

In contrast, consider the case of a unitary rotation of a single qubit about the X-axis

$\begin{equation}{U}^{X}\left(\theta \right)=\mathrm{exp}\left(-i\frac{\theta }{2}{\sigma }^{X}\right)\end{equation} \tag{ 23 }$

which rotates the Bloch sphere by θ. For this channel the Pauli transfer matrix is

$\begin{equation}N\left(\theta \right)=\left(\begin{pmatrix}\hfill 1\hfill & \hfill 0\hfill & 0\hfill & 0\hfill \\ \hfill 0\hfill & \hfill 1\hfill & 0\hfill & 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill \mathrm{cos}\;\theta \hfill & \hfill \mathrm{sin}\;\theta \hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill -\mathrm{sin}\;\theta \hfill & \hfill \mathrm{cos}\;\theta \hfill \end{pmatrix}\right);\end{equation} \tag{ 24 }$

therefore, the infidelity is

$\begin{equation}r=\frac{1}{6}\;\mathrm{Tr}\left(I-{N}_{u}\right)=\frac{1}{3}\left(1-\mathrm{cos}\;\theta \right)=\frac{1}{6}\;{\theta }^{2}-\frac{1}{72}{\theta }^{4}+\mathcal{O}\left({\theta }^{6}\right).\end{equation} \tag{ 25 }$

Applying this channel m times, we obtain N(θ)^m = N(mθ), a rotation by an angle m times larger. Therefore,

$\begin{equation}{r}_{m}=\frac{1}{3}\left(1-\mathrm{cos}\;m\theta \right)={m}^{2}r-\frac{1}{2}{m}^{2}\left({m}^{2}-1\right){r}^{2}+\mathcal{O}\left({m}^{6}{r}^{3}\right).\end{equation} \tag{ 26 }$

Here, for m²r small, the infidelity accumulates quadratically with m; it is the rotation angle, rather than the error probability, that increases linearly. We say that a channel like this one, for which the infidelity increases faster than linearly with m, is coherent.

2.3.3. Rotation/dephasing channels

The distinction between a coherent and incoherent channel is not always clearcut, and we will need measures that quantify the degree of coherence. As an example, consider the case where a qubit either dephases in the X-basis (with probability q_D) or is rotated by angle θ about the X-axis (with probability q_R):

$\begin{equation}\mathcal{N}\left(\rho \right)=\left(1-{q}_{D}-{q}_{R}\right)\rho +{q}_{D}{\sigma }^{X}\rho {\sigma }^{X}+{q}_{R}{U}^{X}\left(\theta \right)\rho {U}^{X}{\left(\theta \right)}^{{\dagger}}.\end{equation} \tag{ 27 }$

The Pauli transfer matrix is

$\begin{equation}N=\left(\begin{pmatrix}\hfill I\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill M\hfill \end{pmatrix}\right),\end{equation} \tag{ 28 }$

where I is the 2 × 2 identity, and M is the 2 × 2 matrix

$\begin{equation}M=\left(\begin{pmatrix}\hfill 1-{\epsilon}\hfill & \hfill \delta \hfill \\ -\delta \hfill & \hfill 1-{\epsilon}\hfill \end{pmatrix}\right),\end{equation} \tag{ 29 }$

with

$\begin{equation}{\epsilon}=2{q}_{D}+{q}_{R}\left(1-\mathrm{cos}\;\theta \right),\quad \delta ={q}_{R}\;\mathrm{sin}\;\theta .\end{equation} \tag{ 30 }$

The infidelity is

$\begin{equation}r=\frac{1}{6}\;\mathrm{Tr}\left(I-M\right)=\frac{1}{3}\;{\epsilon}=\frac{2}{3}\;{q}_{D}+\frac{1}{3}\;{q}_{R}\left(1-\mathrm{cos}\;\theta \right).\end{equation} \tag{ 31 }$

The eigenvalues of M are

$\begin{equation}{\lambda }_{{\pm}}=1-{\epsilon}{\pm}i\delta ,\end{equation} \tag{ 32 }$

and therefore the infidelity of ${\mathcal{N}}^{m}$ is

$\begin{align}\hfill {r}_{m}=\frac{1}{6}\;\mathrm{Tr}\left(I-{M}^{m}\right)& =\frac{1}{6}\left[2-{\left(1-{\epsilon}+i\delta \right)}^{m}-{\left(1-{\epsilon}-i\delta \right)}^{m}\right]\hfill \\ \hfill & =\frac{1}{3}\;m{\epsilon}-\frac{1}{6}m\left(m-1\right)\left({{\epsilon}}^{2}-{\delta }^{2}\right)+\mathcal{O}\left({m}^{3}{{\epsilon}}^{3},{m}^{3}{\epsilon}{\delta }^{2},{m}^{4}{\delta }^{4}\right).\hfill \end{align} \tag{ 33 }$

Here the degree of coherence depends on the relative value of and δ. In the case of a unitary rotation, we have ${\epsilon}=\mathcal{O}\left({\delta }^{2}\right)$ , which means that the term growing quadratically with m can dominate. On the other hand, for ⩾ δ, there is no quadratically growing term at all.

A generalization of this channel will be useful in section 3. Instead of a single rotation by θ occurring with probability q_R, we may consider an ensemble of possible rotations, where a rotation by θ_a occurs with probability q_a. In that case r_m is still given by equation (33), but now

$\begin{equation}{\epsilon}=2{q}_{D}+\sum _{a}{q}_{a}\left(1-\mathrm{cos}\;{\theta }_{a}\right),\quad \delta =\sum _{a}{q}_{a}\;\mathrm{sin}\;{\theta }_{a}.\end{equation} \tag{ 34 }$

2.4. Unitarity and the coherence angle

We have seen that N_u is an orthogonal matrix if (and only if) the channel $\mathcal{N}$ is unitary. Hence a deviation from orthogonality of N_u indicates a deviation from unitarity of $\mathcal{N}$ . With that in mind, following [10] we define the unitarity $u\left(\mathcal{N}\right)$ of the channel $\mathcal{N}$ as

$\begin{equation}u\left(\mathcal{N}\right)=\frac{1}{{d}^{2}-1}\;\mathrm{Tr}\left({N}_{u}^{\text{T}}{N}_{u}\right),\end{equation} \tag{ 35 }$

which is 1 for unitary channels and strictly less than 1 for nonunitary channels. For a fixed value of the infidelity r, the unitarity achieves its minimum for the depolarizing channel [24], where

$\begin{equation}u\left(\mathcal{N}\right)=\frac{1}{{d}^{2}-1}\left(\mathrm{Tr}\;{N}_{u}^{2}\right)={p}^{2}={\left(1-\frac{dr}{d-1}\right)}^{2}.\end{equation} \tag{ 36 }$

The unitarity u and the benchmarking parameter p together provide a useful characterization of the coherence of a channel. We will be primarily interested in the case where the infidelity r is small, so that the diagonal elements {N_ii} of the Pauli transfer matrix are close to one, and it makes sense to expand in the small quantity 1 − N_ii. Writing

$\begin{equation}{\left({N}_{u}\right)}_{ii}^{2}={\left(1-\left(1-{\left({N}_{u}\right)}_{ii}\right)\right)}^{2}=1-2\left(1-{\left({N}_{u}\right)}_{ii}\right)+{\left(1-{\left({N}_{u}\right)}_{ii}\right)}^{2},\end{equation} \tag{ 37 }$

we see that

$\begin{equation}u\left(\mathcal{N}\right)=\frac{1}{{d}^{2}-1}\sum _{i,j}{\left({N}_{u}\right)}_{ij}^{2}=1-2\left(1-p\right)+\frac{1}{{d}^{2}-1}\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\frac{1}{{d}^{2}-1}\sum _{i}{\left(1-{\left({N}_{u}\right)}_{ii}\right)}^{2}.\end{equation} \tag{ 38 }$

Expanding the square root of u, we find

$\begin{equation}\sqrt{u\left(\mathcal{N}\right)}=p+\frac{1}{2\left({d}^{2}-1\right)}\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\cdots \;,\end{equation} \tag{ 39 }$

where the ellipsis indicates terms that are fourth order in the off-diagonal entries ${\left({N}_{u}\right)}_{ij}$ and terms that are quadratic order in $\left(1-{\left({N}_{u}\right)}_{ii}\right)$ .

The coherence angle Θ is defined as

$\begin{equation}{\Theta}=\mathrm{arccos}\left(p/\sqrt{u}\right),\end{equation} \tag{ 40 }$

which for p and u close to one, can be expressed as

$\begin{equation}{{\Theta}}^{2}=2\left(1-\frac{p}{\sqrt{u}}\right)+\cdots =\frac{1}{{d}^{2}-1}\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\cdots \;.\end{equation} \tag{ 41 }$

Apart from a normalization factor, and neglecting the higher-order terms, Θ² is the sum of squares of all off-diagonal terms in N_u. It quantifies the coherence in the channel.

For the qubit rotation channel in equation (24), the coherence angle is related to the rotation angle θ by

$\begin{equation}{{\Theta}}^{2}\approx \frac{2}{3}{\mathrm{sin}}^{2}\;\theta \approx \frac{2}{3}{\theta }^{2}.\end{equation} \tag{ 42 }$

For the dephasing/rotation qubit channel in equation (29), our truncated power series expansion used to derive equation (41) is justified if is negligible compared to δ, in which case we find

$\begin{equation}{{\Theta}}^{2}\approx \frac{2}{3}{q}_{R}^{2}{\theta }^{2}.\end{equation} \tag{ 43 }$

For the depolarizing channel, u = p² and hence Θ = 0.

In [25], Carignan-Dugas et al derived a bound on r_m, the infidelity when a unital channel $\mathcal{N}$ is applied m times in succession, in terms of the infidelity r and coherence angle Θ of $\mathcal{N}$ :

$\begin{equation}{r}_{m}{\leqslant}mr+\frac{d-1}{2d}m\left(m-1\right){{\Theta}}^{2}+\cdots \;,\end{equation} \tag{ 44 }$

where the ellipsis indicates terms higher order in r and Θ². In this sense (for unital channels), the coherence angle controls the quadratic growth of r_m as a function of m, when r and Θ² are small.

2.5. Diamond distance

In some versions of the quantum accuracy threshold theorem, the strength of Markovian noise is characterized by the deviation of a noisy gate from the corresponding ideal gate in the diamond norm [26]. This diamond norm deviation is useful for quantifying the damage inflicted when the noisy gate acts on qubits which are entangled with other qubits in a quantum computer. The diamond norm ${\Vert}\mathcal{E}{{\Vert}}_{\diamond }$ of a linear map $\mathcal{E}$ is defined as the L¹ norm of the extended map $\mathcal{E}\otimes \mathcal{I}$ :

$\begin{equation}{\Vert}\mathcal{E}{{\Vert}}_{\diamond }=\underset{\rho }{\mathrm{max}}{\Vert}\mathcal{E}\otimes \mathcal{I}\left(\rho \right){{\Vert}}_{1}.\end{equation} \tag{ 45 }$

If $\mathcal{E}$ acts on Hilbert space $\mathcal{H}$ with dimension d, then $\mathcal{I}$ denotes the identity acting on another Hilbert space ${\mathcal{H}}^{\prime }$ with dimension d; the maximum is over all density operators on $\mathcal{H}\otimes {\mathcal{H}}^{\prime }$ . A measure of noise strength for a noisy channel $\mathcal{N}$ is the diamond distance of $\mathcal{N}$ from the identity channel,

$\begin{equation}{D}_{\diamond }\left(\mathcal{N}\right){:=}{\Vert}\mathcal{N}-\mathcal{I}{{\Vert}}_{\diamond }.\end{equation} \tag{ 46 }$

If $\mathcal{N}$ is applied m times in succession, we have

$\begin{equation}{D}_{\diamond }\left({\mathcal{N}}^{m}\right){\leqslant}m{D}_{\diamond }\left(\mathcal{N}\right).\end{equation} \tag{ 47 }$

Upper and lower bounds on the diamond distance can be expressed in terms of the benchmarking parameter $p\left(\mathcal{N}\right)=1-r\left(\mathcal{N}\right)d/\left(d-1\right)$ and the unitarity $u\left(\mathcal{N}\right)$ [9]:

$\begin{equation}\frac{\sqrt{{d}^{2}-1}}{2d}f\left(p,u\right){\leqslant}{D}_{\diamond }{\leqslant}\frac{d\sqrt{{d}^{2}-1}}{2}f\left(p,u\right),\end{equation} \tag{ 48 }$

where

$\begin{equation}f\left(p,u\right)=\sqrt{\left(1-2p+u\right)}.\end{equation} \tag{ 49 }$

For the depolarizing channel, we have u = p² and f = 1 − p = rd/(d − 1); the diamond distance scales linearly with the infidelity r. But for a unitary channel, we have u = 1 and $f=\sqrt{2\left(1-p\right)}$ ; then the diamond distance scales like $\sqrt{r}$ .

From equation (38), we see that

$\begin{equation}f{\left(p,u\right)}^{2}=1-2p+u=\frac{1}{{d}^{2}-1}\left(\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\sum _{i}{\left(1-{\left({N}_{u}\right)}_{ii}\right)}^{2}\right),\end{equation} \tag{ 50 }$

which together with equation (48) provides upper and lower bounds on the diamond distance written in terms of Pauli transfer matrix elements:

$\begin{align}\hfill {D}_{\diamond }\left(\mathcal{N}\right)& {\geqslant}\frac{1}{d}{\left(\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\sum _{i}{\left(1-{\left({N}_{u}\right)}_{ii}\right)}^{2}\right)}^{1/2}\hfill \\ \hfill {D}_{\diamond }\left(\mathcal{N}\right)& {\leqslant}d{\left(\sum _{i,j\vert i\ne j}{\left({N}_{u}\right)}_{ij}^{2}+\sum _{i}{\left(1-{\left({N}_{u}\right)}_{ii}\right)}^{2}\right)}^{1/2}.\hfill \end{align} \tag{ 51 }$

We will be mostly interested in the upper bound on the diamond distance for a logical channel with a fixed number of encoded qubits; therefore, the unfavorable scaling of the upper bound with the dimension d need not cause us great concern.

2.6. Coherence in the chi-matrix representation

The Pauli transfer matrix representation is useful for proving the preceding relationships between channel components, the growth of average infidelity, and the dependence of the diamond distance from identity on the average infidelity. When we analyze error correction, we will make use of a different representation of the noise channel. Any channel $\mathcal{N}$ has an expansion in terms of Pauli operators. Consider a completely positive map $\mathcal{N}$ with Kraus operators {K_α} and expand each K_α as

$\begin{equation}{K}_{\alpha }=\sum _{i=0}^{{d}^{2}-1}{c}_{\alpha i}{\sigma }^{i},\end{equation} \tag{ 52 }$

where all Pauli operators {σⁱ} are chosen to be Hermitian, and the {c_αi} are complex numbers. Then

$\begin{equation}\mathcal{N}\left(\rho \right)=\sum _{\alpha }{K}_{\alpha }\rho {K}_{\alpha }^{{\dagger}}=\sum _{i,j=0}^{{d}^{2}-1}{\chi }_{ij}{\sigma }^{i}\rho {\sigma }^{j},\end{equation} \tag{ 53 }$

where

$\begin{equation}{\chi }_{ij}=\sum _{\alpha }{c}_{\alpha i}{c}_{\alpha j}^{{\ast}}={\chi }_{ji}^{{\ast}}.\end{equation} \tag{ 54 }$

This is called the chi-matrix representation of the channel. The map $\mathcal{N}$ is trace preserving if

$\begin{equation}\sum _{ij}{\chi }_{ij}{\sigma }^{j}{\sigma }^{i}=id,\end{equation} \tag{ 55 }$

and unital if

$\begin{equation}\sum _{i,j}{\chi }_{ij}{\sigma }^{i}{\sigma }^{j}=id.\end{equation} \tag{ 56 }$

Note that σⁱσ^kσ^j = ±σ^k if and only if i = j; therefore, in the Pauli transfer matrix language, the terms in equation (53) with i = j contribute to the diagonal entries in N_ab, while the terms with i ≠ j contribute to the off-diagonal entries.

To be more concrete, consider the single-qubit rotation about the X-axis ${U}^{X}\left(\theta \right)=\mathrm{exp}\left(\right.\left(-i\theta {\sigma }^{X}/2\right)$ , for which

$\begin{align}\hfill \rho \to {U}^{X}\left(\theta \right)\rho {U}^{X}{\left(\theta \right)}^{{\dagger}}={\mathrm{cos}}^{2}\left(\theta /2\right)\rho +i\mathrm{cos}\left(\theta /2\right)\mathrm{sin}\left(\theta /2\right)\rho {\sigma }^{X}-i\mathrm{cos}\left(\theta /2\right)\mathrm{sin}\left(\theta /2\right){\sigma }^{X}\rho \\ \hfill \quad +{\mathrm{sin}}^{2}\left(\theta /2\right){\sigma }^{X}\rho {\sigma }^{X};\end{align} \tag{ 57 }$

hence

$\begin{equation}\left(\begin{pmatrix}\hfill {\chi }_{II}\hfill & \hfill {\chi }_{IX}\hfill \\ \hfill {\chi }_{XI}\hfill & \hfill {\chi }_{XX}\hfill \end{pmatrix}\right)=\left(\begin{pmatrix}\hfill \frac{1}{2}\left(1+\mathrm{cos}\;\theta \right)\hfill & \hfill \frac{i}{2}\mathrm{sin}\;\theta \hfill \\ -\frac{i}{2}\mathrm{sin}\;\theta \hfill & \hfill \frac{1}{2}\left(1-\mathrm{cos}\;\theta \right)\hfill \end{pmatrix}\right).\end{equation} \tag{ 58 }$

More generally, for the channel with Pauli transfer matrix

$\begin{equation}N=\left(\begin{pmatrix}\hfill 1\hfill & \hfill 0\hfill & 0\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 1\hfill & 0\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill 1-{\epsilon}\hfill & \hfill \delta \hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill -\delta \hfill & \hfill 1-{\epsilon}\hfill \end{pmatrix}\right),\end{equation} \tag{ 59 }$

as in equation (29), we have

$\begin{equation}\left(\begin{pmatrix}\hfill {\chi }_{II}\hfill & \hfill {\chi }_{IX}\hfill \\ \hfill {\chi }_{XI}\hfill & \hfill {\chi }_{XX}\hfill \end{pmatrix}\right)=\left(\begin{pmatrix}\hfill 1-{\epsilon}/2\hfill & \hfill i\delta /2\hfill \\ \hfill -i\delta /2\hfill & \hfill {\epsilon}/2\hfill \end{pmatrix}\right).\end{equation} \tag{ 60 }$

There is a simple general relationship between the off-diagonal entries of the Pauli transfer matrix N_ab and the chi matrix χ_ij, namely

Lemma 1. The off-diagonal elements of the Pauli transfer matrix N_ab and the chi matrix χ_ij are related by

$\begin{equation}\sum _{a,b\vert a\ne b}{N}_{ab}^{2}={d}^{2}\sum _{i,j\vert i\ne j}\vert {\chi }_{ij}{\vert }^{2},\end{equation} \tag{ 61 }$

where d = 2ⁿ is the Hilbert space dimension.

Because of this identity, we may quantify the coherence of a channel using the off-diagonal entries in either N_ab or χ_ij. The case d = 2 is explained explicitly in appendix A.

Proof. To prove the claim, note that, for any Hermitian Pauli operators σⁱ, σ^j, σ^a, we have

$\begin{equation}{\sigma }^{i}{\sigma }^{a}{\sigma }^{j}={\eta }_{ij}^{ab}{\sigma }^{b}\end{equation} \tag{ 62 }$

for some Hermitian Pauli operator σ^b and some phase ${\eta }_{ij}^{ab}$ . By taking Hermitian adjoints of both sides, we also have

$\begin{equation}{\sigma }^{j}{\sigma }^{a}{\sigma }^{i}={\eta }_{ij}^{ab{\ast}}{\sigma }^{b}.\end{equation} \tag{ 63 }$

The phase is ${\eta }_{ij}^{ab}={\pm}1$ if σⁱσ^aσ^j is Hermitian, and it is ${\eta }_{ij}^{ab}={\pm}i$ if σⁱσ^aσ^j is anti-Hermitian. Furthermore, for each fixed i ≠ j, as σ^a ranges over the d² Hermitian Pauli operators, σⁱσ^aσ^j is Hermitian for d²/2 choices of σ^a, and anti-Hermitian for the remaining d²/2 choices. (If σⁱ and σ^j commute, then σⁱσ^aσ^j is Hermitian if and only if σ^a commutes with σ^jσⁱ. If σⁱ and σ^j anticommute, then σⁱσ^aσ^j is Hermitian if and only if σ^a anticommutes with σ^jσⁱ.) Note that b ≠ a if i ≠ j.

The entries in the Pauli transfer matrix are (for a ≠ b).

$\begin{equation}{N}_{ab}=\sum _{i,j\vert i\ne j}{\eta }_{ij}^{ab}{\chi }_{ij}=\sum _{i,j\vert i{< }j}\left({\eta }_{ij}^{ab}{\chi }_{ij}+{\eta }_{ij}^{ab{\ast}}{\chi }_{ji}\right),\end{equation} \tag{ 64 }$

where the sum is restricted to {i, j} such that σⁱσ^aσ^j ∝ σ^b. The summand is $\left({\pm}1\right)\left({\chi }_{ij}+{\chi }_{ji}\right)$ if σⁱσ^aσ^j is Hermitian, and it is $\left({\pm}i\right)\left({\chi }_{ij}-{\chi }_{ji}\right)$ if σⁱσ^aσ^j is anti-Hermitian. Suppose now that, for fixed i, j, we collect all the terms in ${\sum }_{a\ne b}{N}_{ab}^{2}$ which are quadratic in {χ_ij, χ_ji}. Because σⁱσ^aσ^j is Hermitian for half the choices of σ^a and anti-Hermitian for half the choices, we have

$\begin{equation}\frac{{d}^{2}}{2}{\left({\chi }_{ij}+{\chi }_{ji}\right)}^{2}-\frac{{d}^{2}}{2}{\left({\chi }_{ij}-{\chi }_{ji}\right)}^{2}=2{d}^{2}{\chi }_{ij}{\chi }_{ji}={d}^{2}\left(\vert {\chi }_{ij}{\vert }^{2}+\vert {\chi }_{ji}{\vert }^{2}\right),\end{equation} \tag{ 65 }$

where we have used ${\chi }_{ij}={\chi }_{ji}^{{\ast}}$ , which is required by complete positivity.

To complete the proof of the claim, we must verify that all the multilinear terms of the form χ_ijχ_kl (where {i, j} and {k, l} are disjoint) cancel in the sum ${\sum }_{a\ne b}{N}_{ab}^{2}$ . Such a cross term of the form

$\begin{equation}{\eta }_{ij}^{ab}{\eta }_{kl}^{ab}{\chi }_{ij}{\chi }_{kl}\end{equation} \tag{ 66 }$

arises in ${N}_{ab}^{2}$ when we have

$\begin{align}\hfill & {\sigma }^{i}{\sigma }^{a}{\sigma }^{j}={\eta }_{ij}^{ab}{\sigma }^{b},\hfill \\ \hfill & {\sigma }^{k}{\sigma }^{a}{\sigma }^{l}={\eta }_{kl}^{ab}{\sigma }^{b}.\hfill \end{align} \tag{ 67 }$

We will consider all such terms with i, j, k, l fixed, as we vary σ^a and σ^b over the possible Hermitian Pauli operators. Multiplying both sides on the left by Hermitian Pauli operator σ^c, we obtain

$\begin{align}\hfill \left({\sigma }^{c}{\sigma }^{i}{\sigma }^{c}\right)\left({\sigma }^{c}{\sigma }^{a}\right){\sigma }^{j}& ={\eta }_{ij}^{ab}\left({\sigma }^{c}{\sigma }^{b}\right),\hfill \\ \hfill \left({\sigma }^{c}{\sigma }^{k}{\sigma }^{c}\right)\left({\sigma }^{c}{\sigma }^{a}\right){\sigma }^{l}& ={\eta }_{kl}^{ab}\left({\sigma }^{c}{\sigma }^{b}\right).\hfill \end{align} \tag{ 68 }$

Given a standard sign choice for the d² Hermitian Pauli operators, we may write

$\begin{equation}{\sigma }^{c}{\sigma }^{a}={\phi }_{ca}^{{a}^{\prime }}{\sigma }^{{a}^{\prime }},\quad {\sigma }^{c}{\sigma }^{b}={\phi }_{cb}^{{b}^{\prime }}{\sigma }^{{b}^{\prime }};\end{equation} \tag{ 69 }$

here e.g. ${\phi }_{ca}^{{a}^{\prime }}$ is a phase, which is ±1 if σ^a and σ^c commute and ±i if σ^a and σ^c anticommute. We also have

$\begin{equation}{\sigma }^{c}{\sigma }^{i}{\sigma }^{c}={\xi }_{ic}{\sigma }^{i},\quad {\sigma }^{c}{\sigma }^{k}{\sigma }^{c}={\xi }_{kc}{\sigma }^{k};\end{equation} \tag{ 70 }$

here ξ_ic = ±1 is a sign indicating whether σ^c and σⁱ commute or anticommute. Therefore

$\begin{align}\hfill {\sigma }^{i}{\sigma }^{{a}^{\prime }}{\sigma }^{j}& =\left({\xi }_{ic}{\phi }_{ca}^{{a}^{\prime }{\ast}}{\phi }_{cb}^{{b}^{\prime }}{\eta }_{ij}^{ab}\right){\sigma }^{{b}^{\prime }},\hfill \\ \hfill {\sigma }^{k}{\sigma }^{{a}^{\prime }}{\sigma }^{l}& =\left({\xi }_{kc}{\phi }_{ca}^{{a}^{\prime }{\ast}}{\phi }_{cb}^{{b}^{\prime }}{\eta }_{kl}^{ab}\right){\sigma }^{{b}^{\prime }},\hfill \end{align} \tag{ 71 }$

and the corresponding cross term arising from ${N}_{{a}^{\prime }{b}^{\prime }}^{2}$ is

$\begin{equation}{\xi }_{ic}{\xi }_{kc}{\left({\phi }_{ca}^{{a}^{\prime }{\ast}}{\phi }_{cb}^{{b}^{\prime }}\right)}^{2}{\eta }_{ij}^{ab}{\eta }_{kl}^{ab}{\chi }_{ij}{\chi }_{kl}.\end{equation} \tag{ 72 }$

Now suppose that either σ^c commutes with both σ^a and σ^b or anticommutes with both; in either case ${\left({\phi }_{ca}^{{a}^{\prime }{\ast}}{\phi }_{cb}^{{b}^{\prime }}\right)}^{2}=1$ . As we vary σ^c over the d²/2 Pauli operators with this property, the sign ξ_icξ_kc has the value +1 for the d²/4 choices of σ^c such that σ^c commutes with both σⁱ and σ^k or anticommutes with both, while ξ_icξ_kc has the value −1 for the d²/4 choices of σ^c such that σ^c commutes with one of σⁱ and σ^k and anticommutes with the other. Therefore, as we vary a' and b' over these d²/2 possible choices for σ^c, with i, j, k, l fixed, the cross terms cancel.

Alternatively, suppose that σ^c commutes with one of σ^a and σ^b and anticommutes with the other; then ${\left({\phi }_{ca}^{{a}^{\prime }{\ast}}{\phi }_{cb}^{{b}^{\prime }}\right)}^{2}=-1$ . Again, as we vary a' and b' over the d²/2 possible choices for σ^c, with i, j, k, l fixed, ξ_icξ_kc = +1 for half of the terms and ξ_icξ_kc = −1 for the other half; therefore, the cross terms cancel. This completes the proof. □

3. Logical channel for the repetition code

From now on we will use the streamlined notation for single-qubit Pauli operators:

$\begin{equation}I=\left(\begin{pmatrix}\hfill 1\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 1\hfill \end{pmatrix}\right),\quad X=\left(\begin{pmatrix}\hfill 0\hfill & \hfill 1\hfill \\ \hfill 1\hfill & \hfill 0\hfill \end{pmatrix}\right),\quad Y=\left(\begin{pmatrix}\hfill 0\hfill & \hfill -i\hfill \\ \hfill i\hfill & \hfill 0\hfill \end{pmatrix}\right),\quad Z=\left(\begin{pmatrix}\hfill 1\hfill & 0\hfill \\ \hfill 0\hfill & \hfill -1\hfill \end{pmatrix}\right).\end{equation} \tag{ 73 }$

Consider the repetition code, which protects one logical qubit against bit flip (X) errors, but provides no protection against phase (Z) errors. Let us analyze how well this code protects against coherent errors, in which each physical qubit in the code block rotates about the X-axis. Similar calculations were carried out in [17, 18]. Understanding this example will prepare us for an analysis of more general stabilizer codes.

To be as concrete as possible, we will start with the simplest interesting case, the three-qubit repetition code spanned by |000⟩ and |111⟩. Our goal is to determine the logical channel that results when rotation errors applied to the physical qubits are followed by error correction. We will assume for now that the same rotation is applied to each of the three qubits; this will be generalized later.

Suppose that each physical qubit is subjected to the unitary rotation

$\begin{equation}{U}^{X}\left(\theta \right)=cI-isX,\quad c=\mathrm{cos}\left(\theta /2\right),\;s=\mathrm{sin}\left(\theta /2\right);\end{equation} \tag{ 74 }$

thus the product unitary map applied to the three physical qubits is

$\begin{equation}{U}^{X}{\left(\theta \right)}^{\otimes 3}={c}^{3}III-i{c}^{2}s\left(XII+IXI+IIX\right)-c{s}^{2}\left(XXI+XIX+IXX\right)+i{s}^{3}XXX.\end{equation} \tag{ 75 }$

To perform error correction we measure the operators ZZI and IZZ to obtain two syndrome bits. If the syndrome is trivial (both measurements yield +1), no further action is required. If the syndrome is nontrivial, X is applied to one of the three qubits, returning the state to the code space. Thus the terms in the expansion in equation (75) with weight 0 or 1 (where the weight is the number of X's) are error corrected to the logical operator Ī = III, while terms with weight 2 or 3 are error corrected to the logical operator $\overline{X}=XXX$ . We conclude that the logical channel ${\mathcal{N}}_{L}$ is a convex combination of two unitary transformations,

$\begin{equation}{\mathcal{N}}_{L}\left(\rho \right)={p}_{0}{U}^{X}\left({\theta }_{0}\right)\rho {U}^{X}{\left({\theta }_{0}\right)}^{{\dagger}}+{p}_{1}{U}^{X}\left({\theta }_{1}\right)\rho {U}^{X}{\left({\theta }_{1}\right)}^{{\dagger}},\end{equation} \tag{ 76 }$

where

$\begin{align}\hfill {p}_{0}& ={c}^{6}+{s}^{6},\quad {\theta }_{0}/2=\mathrm{arctan}\left(-{s}^{3}/{c}^{3}\right)\hfill \\ \hfill {p}_{1}& =3\left({c}^{4}{s}^{2}+{c}^{2}{s}^{4}\right),\quad {\theta }_{1}/2=\mathrm{arctan}\left(s/c\right)=\theta /2.\hfill \end{align} \tag{ 77 }$

A logical rotation by θ₀ is applied when the syndrome is trivial (weight 0), and a logical rotation by θ₁ is applied when the syndrome is nontrivial (weight 1).

The logical channel has the form specified in equation (29), where

$\begin{equation}{\epsilon}={p}_{0}\left(1-\mathrm{cos}\;{\theta }_{0}\right)+{p}_{1}\left(1-\mathrm{cos}\;{\theta }_{1}\right),\quad \delta ={p}_{0}\;\mathrm{sin}\;{\theta }_{0}+{p}_{1}\;\mathrm{sin}\;{\theta }_{1}.\end{equation} \tag{ 78 }$

These expressions for and δ can be simplified using trigonometric identities. In terms of s/c = t = tan θ/2, we have

$\begin{align}\hfill {p}_{0}={c}^{6}\left(1+{t}^{6}\right),\quad 1-\mathrm{cos}\;{\theta }_{0}=\frac{2{t}^{6}}{1+{t}^{6}},\quad \mathrm{sin}\;{\theta }_{0}=\frac{-2{t}^{3}}{1+{t}^{6}},\\ \hfill {p}_{1}=3{c}^{6}{t}^{2}\left(1+{t}^{2}\right),\quad 1-\mathrm{cos}\;{\theta }_{1}=\frac{2{t}^{2}}{1+{t}^{2}},\quad \mathrm{sin}\;{\theta }_{1}=\frac{2t}{1+{t}^{2}};\end{align} \tag{ 79 }$

therefore we find

$\begin{equation}{\epsilon}=2{s}^{6}+6{c}^{2}{s}^{4},\quad \delta =-2{s}^{3}{c}^{3}+6{s}^{3}{c}^{3}=4{s}^{3}{c}^{3}.\end{equation} \tag{ 80 }$

Expanding to leading order for small θ, we have

$\begin{equation}{\epsilon}\approx \frac{3}{8}{\theta }^{4},\quad \delta \approx \frac{1}{2}{\theta }^{3}.\end{equation} \tag{ 81 }$

Here, because is higher order in θ than δ, equation (41) applies, and therefore the coherence angle is

$\begin{equation}{{\Theta}}^{2}\approx 2{\delta }^{2}/3\approx {\theta }^{6}/6.\end{equation} \tag{ 82 }$

From equation (33), we see that if this logical channel ${\mathcal{N}}_{L}$ is applied m times, the infidelity becomes

$\begin{equation}{r}_{m}\approx \frac{1}{3}m{\epsilon}+\frac{1}{6}m\left(m-1\right){\delta }^{2}\approx \frac{1}{8}m{\theta }^{4}+\frac{1}{24}m\left(m-1\right){\theta }^{6}.\end{equation} \tag{ 83 }$

Note that the term quadratic in m actually matches the upper bound in equation (44). Equation (83) reveals that the coherence of the logical channel is somewhat suppressed, as it takes a number of repetitions $m=\mathcal{O}\left({\theta }^{-2}\right)$ for the quadratically growing contribution to r to 'catch up' with the dominant linear term.

Now let's do a similar analysis for the length-n repetition code (where n is odd), which corrects up to (n − 1)/2 bit-flip errors. In this case the logical channel is a convex combination of (n + 1)/2 unitary rotations,

$\begin{equation}{\mathcal{N}}_{L}\left(\rho \right)=\sum _{w=0}^{\left(n-1\right)/2}{p}_{w}{U}^{X}\left({\theta }_{w}\right)\rho {U}^{X}{\left({\theta }_{w}\right)}^{{\dagger}}\end{equation} \tag{ 84 }$

where w ranging from 0 to (n − 1)/2 indicates the weight of a correctable X error occurring in the expansion of ${\left(c-isX\right)}^{\otimes n}$ . When the (n−1)-bit syndrome is measured, syndromes pointing to a weight-w error occur with total probability

$\begin{equation}{p}_{w}=\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)\left[{c}^{2\left(n-w\right)}{s}^{2w}+{c}^{2w}{s}^{2\left(n-w\right)}\right]=\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2n}{t}^{2w}\left[1+{t}^{2\left(n-2w\right)}\right],\end{equation} \tag{ 85 }$

and the logical rotation angle conditioned on a weight-w syndrome is

$\begin{align}\hfill & {\theta }_{w}/2={\left(-1\right)}^{\left(n-1-2w\right)/2}\mathrm{arctan}\left[{\left(s/c\right)}^{n-2w}\right]\hfill \\ \hfill & \;\Longrightarrow\;1-\mathrm{cos}\;{\theta }_{w}=\frac{2{t}^{2\left(n-2w\right)}}{1+{t}^{2\left(n-2w\right)}},\quad \mathrm{sin}\;{\theta }_{w}={\left(-1\right)}^{\left(n-1\right)/2}{\left(-1\right)}^{w}\frac{2{t}^{n-2w}}{1+{t}^{2\left(n-2w\right)}}.\hfill \end{align} \tag{ 86 }$

Summing over the weight of the syndrome, we find

$\begin{align}\hfill {\epsilon}& =\sum _{w=0}^{\left(n-1\right)/2}{p}_{w}\left(1-\mathrm{cos}\;{\theta }_{w}\right)=\sum _{w=0}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){\left({c}^{2}\right)}^{w}{\left({s}^{2}\right)}^{n-w},\hfill \\ \hfill \delta & =\sum _{w=0}^{\left(n-1\right)/2}{p}_{w}\;\mathrm{sin}\;{\theta }_{w}={\left(-1\right)}^{\left(n-1\right)/2}{c}^{n}{s}^{n}\sum _{w=0}^{\left(n-1\right)/2}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)=2\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right){c}^{n}{s}^{n}.\hfill \end{align} \tag{ 87 }$

In appendix B we use Stirling's approximation to evaluate the sum in the expression for . Applying Stirling's approximation to our expression for δ as well, we have proven

Theorem 1. Consider the length-n repetition code which protects against bit flip (X) errors, subject to the independent unitary noise map $U={\left(\left(\mathrm{cos}\;\theta /2\right)I-i\left(\mathrm{sin}\;\theta /2\right)X\right)}^{\otimes n}$ , where sin² θ/2 < 1/2. Let ${\mathcal{N}}_{L}\left(\rho \right)=\mathcal{R}\left(U\rho {U}^{{\dagger}}\right)$ be the logical map, where ρ is a code state and $\mathcal{R}$ decodes using majority voting. Then ${\mathcal{N}}_{L}$ has Pauli transfer matrix N of the form given in equations (28) and (29), with and δ given by

$\begin{align}\hfill {\epsilon}& =\sqrt{\frac{2}{\pi n}}\left(\frac{{\mathrm{sin}}^{n+1}\;\theta }{\mathrm{cos}\;\theta }\right)\left(1+\mathcal{O}\left(\frac{1}{n}\right)\right),\hfill \\ \hfill \delta & =\sqrt{\frac{2}{\pi n}}{\mathrm{sin}}^{n}\;\theta \left(1+\mathcal{O}\left(\frac{1}{n}\right)\right)=\left(\frac{\mathrm{cos}\;\theta }{\mathrm{sin}\;\theta }\right){\epsilon}\left(1+\mathcal{O}\left(\frac{1}{n}\right)\right).\hfill \end{align} \tag{ 88 }$

Therefore, using equation (33) and approximations that are well justified (according to theorem 1) when n is large and sin² θ/2 < 1/2, we can estimate the infidelity when the logical channel is applied m times is succession, finding

$\begin{equation}{r}_{m}\approx \frac{1}{3}m{\epsilon}+\frac{1}{6}m\left(m-1\right){\delta }^{2}\approx \frac{1}{3}\sqrt{\frac{2}{\pi n}}\left[m\left(\frac{{\mathrm{sin}}^{n+1}\;\theta }{\mathrm{cos}\;\theta }\right)+\frac{1}{\sqrt{2\pi n}}m\left(m-1\right){\mathrm{sin}}^{2n}\;\theta \right].\end{equation} \tag{ 89 }$

The scaling of the infidelity $r=\mathcal{O}\left({\theta }^{n+1}\right)$ arises because a bit flip error must have weight at least w = (n + 1)/2 to cause a logical error. The scaling $\mathcal{O}\left({\theta }^{2n}\right)$ of the term quadratic in m indicates that the coherence of the logical channel is suppressed when θ is small. It takes $m\approx \sqrt{2\pi n}/{\theta }^{n-1}$ successive applications of the logical channel ${\mathcal{N}}_{L}$ for the quadratic term in r_m to become comparable to the linear term. This suppression arises because larger logical rotations occur with only smaller probability; for example a logical rotation by θ occurs with probability $\mathcal{O}\left({\theta }^{n-1}\right)$ .

Keeping only the leading-order terms in equation (87), we obtain

$\begin{equation}{\epsilon}\approx 2\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n-1}{2}}\right){\left(\frac{\theta }{2}\right)}^{n+1},\quad \delta \approx 2\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right){\left(\frac{\theta }{2}\right)}^{n}\;\Longrightarrow\;\delta \approx \frac{n+1}{n}{\theta }^{-1}{\epsilon},\end{equation} \tag{ 90 }$

generalizing equation (81). We derived the relationship

$\begin{equation}{\epsilon}\approx \frac{n}{n+1}\left(\theta \delta \right)\end{equation} \tag{ 91 }$

using the identity

$\begin{equation}\sum _{w=0}^{\left(n-1\right)/2}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)={\left(-1\right)}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right),\end{equation} \tag{ 92 }$

which can be proved by induction. For drawing the conclusion that θδ/ is bounded above by an n-independent constant, the oscillating minus sign in this expression is important—if not for the oscillating sign, the sum would be 2ⁿ⁻¹, hence larger than equation (92) by a factor which scales like $\sqrt{n}$ . This would mean that average infidelity r_m in equation (33) would have a large quadratic component relative to the linear component as the code length n becomes large. In other words, the logical noise channel would have significant coherence.

4. Repetition code revisited

In this section we will compute the logical channel for the repetition code using a different method than in section 3. This new method can be extended more easily to general stabilizer codes.

4.1. Stabilizer formalism

We now briefly review the structure of stabilizer codes, as this will be used in our analysis. Let {g_α, α = 1, 2, ..., n − k} denote the n − k stabilizer generators for an [[n, k, d]] stabilizer code. These generators are mutually commuting Hermitian Pauli operators such that ${g}_{\alpha }^{2}=I$ . The syndrome s(σⁱ) of Pauli operator σⁱ is a length-(n − k) binary vector such that $s{\left({\sigma }^{i}\right)}_{\alpha }={s}_{\alpha }^{i}$ where

$\begin{equation}{g}_{\alpha }{\sigma }^{i}={\left(-1\right)}^{{s}_{\alpha }^{i}}{\sigma }^{i}{g}_{\alpha }.\end{equation} \tag{ 93 }$

Note that the syndrome of a product of Pauli operators is additive: $s{\left({\sigma }^{i}{\sigma }^{j}\right)}_{\alpha }={\sigma }_{\alpha }^{i}+{\sigma }_{\alpha }^{j}$ , where the addition is modulo 2.

The code space is the simultaneous eigenstate with eigenvalue 1 of all the stabilizer generators. If $\vert \overline{\psi }\rangle$ is a pure state in the code space, then

$\begin{equation}{g}_{\alpha }\left({\sigma }^{i}\vert \overline{\psi }\rangle \right)={\left(-1\right)}^{{s}_{\alpha }^{i}}{\sigma }^{i}{g}_{\alpha }\vert \overline{\psi }\rangle ={\left(-1\right)}^{{s}_{\alpha }^{i}}{\sigma }^{i}\vert \overline{\psi }\rangle .\end{equation} \tag{ 94 }$

Therefore, the syndrome of σⁱ can be identified by measuring all of the stabilizer generators. Hence we may say that $s\left({\sigma }^{i}\right)$ is the syndrome of the state ${\sigma }^{i}\vert \overline{\psi }\rangle$ . A Pauli operator that commutes with the stabilizer generators preserves the code space and is said to be logical. We may define a complete set of orthogonal projectors {Π_s} on the n-qubit Hilbert space, where Π_s projects onto the subspace with syndrome s. Then

$\begin{equation}{{\Pi}}_{s}{{\Pi}}_{t}={\delta }_{st}{{\Pi}}_{s},\quad \sum _{s}{{\Pi}}_{s}=I.\end{equation} \tag{ 95 }$

An encoded density operator $\overline{\rho }$ (one supported on the code space) has the property

$\begin{equation}{{\Pi}}_{s}\overline{\rho }{{\Pi}}_{t}={\delta }_{s0},{\delta }_{t0}\overline{\rho },\end{equation} \tag{ 96 }$

where s = 0 denotes the trivial syndrome.

To construct the error recovery map $\mathcal{R}$ , we first perform an orthogonal measurement to identify the syndrome s. Then for each syndrome s, a particular Pauli operator ${E}_{s}^{{\dagger}}$ is applied, which returns the measured state to the code space; therefore,

$\begin{equation}\mathcal{R}\left(\rho \right)=\sum _{s}{E}_{s}^{{\dagger}}{{\Pi}}_{s}\rho {{\Pi}}_{s}{E}_{s}.\end{equation} \tag{ 97 }$

One says that E_s is the standard error associated with the syndrome s. In the case of minimal-weight decoding, E_s is chosen to be a minimal-weight Pauli operator with syndrome s. By the weight w(σ) of the n-qubit Pauli operator σ, we mean the number of qubits to which a nontrivial Pauli matrix X, Y, or Z is applied, while I is applied to the remaining n − w qubits.

By summing over all values of the syndromes s to construct the error recovery channel, we are averaging over all the possible outcomes of the syndrome measurement, with each syndrome weighted by its probability. We discussed in the introduction how to justify performing this average when computing the logical channel.

4.2. Recovery in the chi-matrix representation

For any such noise channel $\mathcal{N}$ acting on an encoded density operator $\overline{\rho }$ , we would like to find the error corrected map $\mathcal{R}{\circ}\mathcal{N}\left(\overline{\rho }\right)$ . Using the chi representation of the noise channel, it evidently suffices to compute

$\begin{equation}\mathcal{R}\left({\sigma }^{i}{\overline{\sigma }}^{k}{\sigma }^{j}\right)\end{equation} \tag{ 98 }$

for each pair of physical Pauli operators σⁱ, σ^j and each logical Pauli operator ${\overline{\sigma }}^{k}$ . Because the syndrome is additive, we have

$\begin{equation}{{\Pi}}_{s}{P}_{t}{{\Pi}}_{s}={\delta }_{t0}{P}_{0}{{\Pi}}_{s}\end{equation} \tag{ 99 }$

if P_t is any physical Pauli operator with syndrome t, and therefore

$\begin{align}\hfill & \mathcal{R}\left({\sigma }^{i}{\overline{\sigma }}^{k}{\sigma }^{j}\right)=\sum _{s}{E}_{s}^{{\dagger}}{{\Pi}}_{s}{\sigma }^{i}{\overline{\sigma }}^{k}{\sigma }^{j}{{\Pi}}_{s}{E}_{s}=0\hfill \\ \hfill & \quad \mathrm{u}\mathrm{n}\mathrm{l}\mathrm{e}\mathrm{s}\mathrm{s}\quad s\left({\sigma }^{i}{\overline{\sigma }}^{k}{\sigma }^{j}\right)=s\left({\sigma }^{i}{\sigma }^{j}\right)=0.\hfill \end{align} \tag{ 100 }$

That is, only the terms for which σⁱ and σ^j have the same syndrome survive when the error recovery map is applied. This property will be crucial in our analysis of the logical channel.

Now let's understand the action of $\mathcal{R}$ in more detail. An [[n, k, d]] stabilizer code has 4^k logical Pauli operators. The physical Pauli operator L representing a logical Pauli operator is not unique, because L and LG act in the same way on the code space, where G is any element of the stabilizer group. But let us by convention choose standard physical operators {L_a, a = 0, 1, 2, ..., 4^k − 1} representing each of the logical Pauli operators. Since we have also assigned a standard error operator E_s to each syndrome s, any Hermitian Pauli operator has a unique decomposition of the form

$\begin{equation}\sigma \left(s,a,x\right)={\eta }_{\mathrm{s}\mathrm{a}\mathrm{x}}{E}_{s}{L}_{a}{G}_{x},\quad {\eta }_{\mathrm{s}\mathrm{a}\mathrm{x}}\in \left\{{\pm}1,{\pm}i\right\},\end{equation} \tag{ 101 }$

where G_x is an element of the stabilizer group, and η_sax is a phase. Since there are 2^n−k stabilizer group elements (up to phases), 2^n−k distinct syndromes, and 2^2k logical Pauli operators, we see that this decomposition accounts for all 4ⁿ physical Pauli operators. We conclude that if $\overline{\rho }$ is an encoded density operator, then

$\begin{align}\hfill \mathcal{R}\left(\sigma \left(s,a,x\right)\overline{\rho }\sigma \left({s}^{\prime },{a}^{\prime },{x}^{\prime }\right)\right)& ={\delta }_{s{s}^{\prime }}{E}_{s}^{{\dagger}}\left({\eta }_{\mathrm{s}\mathrm{a}\mathrm{x}}{E}_{s}{L}_{a}{G}_{x}\right)\overline{\rho }\left({G}_{{x}^{\prime }}^{{\dagger}}{L}_{{a}^{\prime }}^{{\dagger}}{E}_{s}^{{\dagger}}{\eta }_{{s}^{\prime }{a}^{\prime }{x}^{\prime }}^{{\ast}}\right){E}_{s}\hfill \\ \hfill & ={\delta }_{s{s}^{\prime }}\;{\eta }_{\mathrm{s}\mathrm{a}\mathrm{x}}{\eta }_{{s}^{\prime }{a}^{\prime }{x}^{\prime }}^{{\ast}}\;{L}_{a}\overline{\rho }{L}_{{a}^{\prime }}^{{\dagger}},\hfill \end{align} \tag{ 102 }$

where we have used the property that σ(s', a', x') is Hermitian. In the logical channel, the terms with L_a = L_a' are incoherent—they contribute to the on-diagonal elements of the logical Pauli transfer matrix. The terms with L_a ≠ L_a' are coherent—they contribute to the off-diagonal elements.

When the noise channel $\mathcal{N}$ is weak, the dominant terms in the chi-matrix expansion in equation (53) are those such that σⁱσ^j has minimal weight, and we have also seen that the only terms that survive when the recovery map is applied are those such that σⁱσ^j is a logical operator (has trivial syndrome). Now let's suppose that the code distance is d and that minimal-weight decoding is performed. This means that we choose E_s such that L_a = I (up to multiplication by an element of the stabilizer) whenever σ(s, a, x) has weight no larger than (d − 1)/2, assuming d is odd.

To get a contribution to the incoherent part of the logical channel, we will need both σⁱ and σ^j to have weight at least (d + 1)/2, so that the total weight must be at least d + 1. In that case it is possible for both σⁱ and σ^j to be error corrected to a nontrivial logical operator. But there are also weight-d contributions to the coherent part of the logical channel, arising from the terms in which w(σⁱ) + w(σ^j) = d, where w(σ) denotes the weight of Pauli operator σ. In that case one of the two Pauli operators has weight less than or equal to (d − 1)/2, hence is error corrected to the identity, while the other has weight greater than or equal to (d + 1)/2, hence is corrected to a nontrivial logical operator L. The resulting term in the logical channel is either $L\overline{\rho }$ or $\overline{\rho }L$ (up to a phase), depending on whether σⁱ or σ^j has higher weight.

If we choose the standard errors {E_s} differently, then the action of the recovery operator may be modified. But it is evident from equation (102) that if we make the replacement E_s → E_s' = ϕ_sE_sG_y, where G_y is an element of the stabilizer and ϕ_s is a phase, then $\mathcal{R}\left(\sigma \overline{\rho }{\sigma }^{\prime }\right)$ is not changed. In particular, when we perform minimal-weight decoding, there may be more than one minimal-weight Pauli operator with syndrome s, so that the choice of E_s is ambiguous. However, as long as any two minimal-weight Pauli operators E_s and E_s' with syndrome s have the property that ${E}_{s}^{\prime {\dagger}}{E}_{s}$ is an element of the code stabilizer, then the logical channel will not depend on how the minimal-weight standard errors are chosen. This will certainly be the case if the code distance is d and the standard errors have weight not larger than (d − 1)/2, since then ${E}_{s}^{\prime {\dagger}}{E}_{s}$ has weight at most d − 1 and cannot be a nontrivial logical operator.

4.3. Analysis of repetition code using the chi-matrix formalism

To illustrate this method, we return to the length-3 repetition code, where the noise channel is as in equation (75). We write out the chi-matrix expansion of $\mathcal{N}\left(\rho \right)$ in equation (53), and then apply the recovery operator $\mathcal{R}$ to find the logical channel ${\mathcal{N}}_{L}=\mathcal{R}{\circ}\mathcal{N}$ . The task of applying $\mathcal{R}$ is simplified by the observation that, if the state ρ is supported on the code space, then $\mathcal{R}$ annihilates all terms in which σⁱσ^j is not logical; that is, as indicated in equation (100), σⁱσ^j must commute with the stabilizer for the term to survive. We may write

$\begin{equation}\mathcal{N}\left(\rho \right)={\mathcal{N}}_{\text{incoh}}\left(\rho \right)+{\mathcal{N}}_{\text{coh}}\left(\rho \right)+{\mathcal{N}}_{\text{null}}\left(\rho \right),\end{equation} \tag{ 103 }$

where ${\mathcal{N}}_{\text{null}}$ is the sum of terms such that σⁱσ^j is not logical (hence $\mathcal{R}{\circ}{\mathcal{N}}_{\text{null}}=0$ acting on encoded density operators), ${\mathcal{N}}_{\text{incoh}}$ is the sum of terms such that σⁱσ^j is the logical identity, and ${\mathcal{N}}_{\text{coh}}$ is the sum of terms such that σⁱσ^j is a nontrivial logical operator. Then $\mathcal{R}{\circ}{\mathcal{N}}_{\text{incoh}}$ is the incoherent part of ${\mathcal{N}}_{L}$ and $\mathcal{R}{\circ}{\mathcal{N}}_{\text{coh}}$ is its coherent part. Explicitly,

$\begin{align}\hfill {\mathcal{N}}_{\text{incoh}}\left(\rho \right)& ={c}^{6}III\rho III+{c}^{4}{s}^{2}\left(XII\rho XII+IXI\rho IXI+IIX\rho IIX\right)\hfill \\ \hfill & \quad +\;{c}^{2}{s}^{4}\left(XXI\rho XXI+XIX\rho XIX+IXX\rho IXX\right)+{s}^{6}XXX\rho XXX,\hfill \end{align} \tag{ 104 }$

and

$\begin{align}\hfill {\mathcal{N}}_{\text{coh}}\left(\rho \right)& =i{c}^{3}{s}^{3}\left(XXX\rho III-III\rho XXX\right)+i{c}^{3}{s}^{3}\left(XII\rho IXX+IXI\rho XIX+IIX\rho XXI\right.\hfill \\ \hfill & \quad \left.-\;IXX\rho XII-XIX\rho IXI-XXI\rho IIX\right).\hfill \end{align} \tag{ 105 }$

The code has two syndrome bits, given by the measured values of ZZI and IZZ, and for a minimal-weight decoder we choose the standard errors to be

$\begin{equation}{E}_{00}=III,\quad {E}_{01}=IIX,\quad {E}_{10}=XII,\quad {E}_{11}=IXI,\end{equation} \tag{ 106 }$

while the nontrivial logical operator is $\overline{X}=XXX$ . Each of the Pauli operators in equations (104) and (105) can be expressed as a product of a standard error and a logical operator which is either Ī = III or $\overline{X}$ , so the logical map becomes

$\begin{align}\hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)& =\mathcal{R}{\circ}{\mathcal{N}}_{\text{incoh}}\left(\rho \right)=\left({c}^{6}+3{c}^{4}{s}^{2}\right)\rho +\left(3{c}^{2}{s}^{4}+{s}^{6}\right)\overline{X}\rho \overline{X},\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)& =\mathcal{R}{\circ}{\mathcal{N}}_{\text{coh}}\left(\rho \right)=i{c}^{3}{s}^{3}\left(\left[\overline{X},\rho \right]-3\left[\overline{X},\rho \right]\right).\hfill \end{align} \tag{ 107 }$

To compare with our previous calculation of the logical channel, we note that

$\begin{align}\hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{I}\right)& ={\left({c}^{2}+{s}^{2}\right)}^{3}\bar{I}=\bar{I},\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\overline{X}\right)& ={\left({c}^{2}+{s}^{2}\right)}^{3}\overline{X}=\overline{X},\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{Y} \right)& =\left[{\left({c}^{2}+{s}^{2}\right)}^{3}-6{c}^{2}{s}^{4}-2{s}^{6}\right]\bar{Y} =\left(1-6{c}^{2}{s}^{4}-2{s}^{6}\right)\bar{Y} ,\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\overline{Z}\right)& =\left[{\left({c}^{2}+{s}^{2}\right)}^{3}-6{c}^{2}{s}^{4}-2{s}^{6}\right]\overline{Z}=\left(1-6{c}^{2}{s}^{4}-2{s}^{6}\right)\overline{Z},\hfill \end{align} \tag{ 108 }$

and

$\begin{align}\hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{I}\right)& ={\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\overline{X}\right)=0,\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{Y} \right)& =-2i{c}^{3}{s}^{3}\left[\overline{X},\bar{Y} \right]=4{c}^{3}{s}^{3}\overline{Z},\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\overline{Z}\right)& =-2i{c}^{3}{s}^{3}\left[\overline{X},\overline{Z}\right]=-4{c}^{3}{s}^{3}\bar{Y} .\hfill \end{align} \tag{ 109 }$

In the notation of equation (29), we have found that the logical channel is parametrized by

$\begin{equation}{\epsilon}=6{c}^{2}{s}^{4}+2{s}^{6},\quad \delta =4{c}^{3}{s}^{3},\end{equation} \tag{ 110 }$

in agreement with the result found in equation (80).

Now consider the length-n repetition code, for n odd, where the noise is the product unitary transformation U^X(θ)^⊗n. The incoherent part ${\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}$ of the logical channel arises from the diagonal terms {σⁱρσⁱ} in the chi-matrix expansion of $\mathcal{N}\left(\rho \right)$ . Here σⁱ can be any one of the 2ⁿ Pauli operators contained in ${\left\{I,{\sigma }^{X}\right\}}^{\otimes n}$ . The code can correct t = (n − 1)/2 σ^X errors, so σⁱ is error corrected to Ī if its weight w(σⁱ) is t or less, and is error corrected to $\overline{X}$ if its weight is t + 1 or more. Therefore, if ρ is an encoded density operator, then

$\begin{equation}{\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)=\left(\sum _{w=0}^{t}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2\left(n-w\right)}{s}^{2w}\right)\rho +\left(\sum _{w=t+1}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2\left(n-w\right)}{s}^{2w}\right)\overline{X}\rho \overline{X},\end{equation} \tag{ 111 }$

where the binomial coefficient $\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)$ counts the number of weight-w (or weight-(n − w)) operators. Using

$\begin{equation}\sum _{w=0}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2\left(n-w\right)}{s}^{2w}={\left({c}^{2}+{s}^{2}\right)}^{n}=1,\end{equation} \tag{ 112 }$

we see that ${\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{I}\right)=\bar{I}$ and ${\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\overline{X}\right)=\overline{X}$ , and furthermore

$\begin{equation}{\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{Y} \right)=\left(1-2\sum _{w=t+1}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2\left(n-w\right)}{s}^{2w}\right)\bar{Y} ;\end{equation} \tag{ 113 }$

hence

$\begin{equation}{\epsilon}=2\sum _{w=\left(n+1\right)/2}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{2\left(n-w\right)}{s}^{2w}\end{equation} \tag{ 114 }$

in agreement with equation (87). To leading order in s ≈ θ/2, this becomes

$\begin{equation}{\epsilon}\approx 2\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n+1}{2}}\right){\left(\frac{\theta }{2}\right)}^{n+1},\end{equation} \tag{ 115 }$

as in equation (90).

The coherent part ${\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}$ of the logical channel arises from the terms in the Pauli operator expansion of $\mathcal{N}\left(\rho \right)$ such that ${\sigma }^{i}{\sigma }^{j}=\overline{X}$ . There are 2ⁿ such terms—σⁱ can be any operator among {I,X}^⊗n, and σ^j is then the complementary operator with X and I interchanged. If σⁱ has weight ⩽t, and so is error corrected to Ī, then σ^j has weight ⩾(t + 1), and so is error corrected to $\overline{X}$ . We obtain

$\begin{align}\hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)& =\left(\sum _{w=0}^{t}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{w}{\left(-is\right)}^{n-w}{c}^{n-w}{\left(is\right)}^{w}\right)\overline{X}\rho +\left(\sum _{w=0}^{t}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){c}^{n-w}{\left(-is\right)}^{w}{c}^{w}{\left(is\right)}^{n-w}\right)\rho \overline{X}\hfill \\ \hfill & ={\left(-i\right)}^{n}{c}^{n}{s}^{n}\left(\sum _{w=0}^{t}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)\right)\left[\overline{X},\rho \right].\hfill \end{align} \tag{ 116 }$

Therefore,

$\begin{equation}{\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\bar{Y} \right)=2{\left(-i\right)}^{n-1}{\left(cs\right)}^{n}\left(\sum _{w=0}^{t}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)\right)\overline{Z};\end{equation} \tag{ 117 }$

hence

$\begin{equation}\delta =2{\left(-i\right)}^{n-1}\left(\sum _{w=0}^{\left(n-1\right)/2}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)\right){\left(cs\right)}^{n}\approx 2\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right){\left(\frac{\theta }{2}\right)}^{n},\end{equation} \tag{ 118 }$

in agreement with equation (90).

4.4. Inhomogeneous X-axis rotations

Now let's consider the logical channel obtained by decoding the length-n repetition code, in the case where the rotation angle varies from qubit to qubit. That is, the unitary noise channel is

$\begin{equation}{U}^{X}\left({\theta }_{1},{\theta }_{2},\dots ,\;{\theta }_{n}\right)=\underset{\alpha =1}{\overset{n}{\bigotimes}}\left({c}_{\alpha }-i{s}_{\alpha }{\sigma }^{X}\right),\end{equation} \tag{ 119 }$

where c_α = cos θ_α/2 and s_α = sin θ_α/2. As in our previous derivation for the case where all angles are equal, we can calculate the incoherent and coherent parts of the logical channel by expanding this tensor product and isolating the terms in $\mathcal{N}\left(\rho \right)$ of the form σⁱρσ^j where σⁱσ^j is either a trivial logical operator (for the incoherent part) or a nontrivial logical operator (for the coherent part). The only difference from the previous calculation is that, while previously all terms in the expansion of U^X with the same weight occurred with equal amplitudes, now operators of the same weight may have different amplitudes.

Still, the derivation goes through in much the same way as before. Let S denote a subset of the n qubits, let |S| denote the size of S, and let $\overline{S}$ denote the subset complementary to S. Extending our previous argument to the case of unequal angles yields

$\begin{align}\hfill {\epsilon}& =2\sum _{S,\vert S\vert {\geqslant}t+1}\prod _{\alpha \in S}\prod _{\overline{\alpha }\in \overline{S}}{c}_{\overline{\alpha }}^{2}{s}_{\alpha }^{2},\hfill \\ \hfill \delta & =\left(-2i\right)\sum _{S,\vert S\vert {\leqslant}t}\prod _{\alpha \in S}\prod _{\overline{\alpha }\in \overline{S}}{c}_{\overline{\alpha }}\left(-i{s}_{\alpha }\right){c}_{\alpha }\left(i{s}_{\overline{\alpha }}\right)\hfill \\ \hfill & =\left(-2i\right)\prod _{\alpha =1}^{n}\left(i{c}_{\alpha }{s}_{\alpha }\right)\sum _{S,\vert S\vert {\leqslant}t}{\left(-1\right)}^{\vert S\vert }.\hfill \end{align} \tag{ 120 }$

Note that the sum in the expression for δ does not depend on the angles. To leading order in the small {s_α}, we find

$\begin{align}\hfill {\epsilon}& =2\sum _{S,\vert S\vert =\left(n+1\right)/2}\prod _{\alpha \in S}{s}_{\alpha }^{2}+\cdots ,\hfill \\ \hfill \delta & =2\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right)\prod _{\alpha =1}^{n}{s}_{\alpha },\hfill \end{align} \tag{ 121 }$

where we have used the identity

$\begin{equation}\sum _{S,\vert S\vert {\leqslant}\frac{n-1}{2}}{\left(-1\right)}^{\vert S\vert }=\sum _{w=0}^{\left(n-1\right)/2}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)={\left(-1\right)}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right).\end{equation} \tag{ 122 }$

As before we find ${\epsilon}=\mathcal{O}\left({s}^{n+1}\right)$ and $\delta =\mathcal{O}\left({s}^{n}\right)$ . Furthermore, the expression for δ is very simple—the same as our previous formula, but with sⁿ replaced by ∏_αs_α.

The formula for depends in a more complicated way on the set of angles {θ_α}. But we can show that for fixed δ, is minimized when all the s_α are equal. Therefore, we have a lower bound on , namely

$\begin{equation}{\epsilon}{\geqslant}2\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n+1}{2}}\right){s}^{n+1}+\cdots \end{equation} \tag{ 123 }$

where the ellipsis indicates terms higher order in s, and we have defined

$\begin{equation}{s}^{n}=\prod _{\alpha =1}^{n}{s}_{\alpha }.\end{equation} \tag{ 124 }$

Correspondingly, using

$\begin{equation}\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n+1}{2}}\right)=\frac{2n}{n+1}\left(\genfrac{}{}{0.0pt}{}{n-1}{\frac{n-1}{2}}\right),\end{equation} \tag{ 125 }$

we have the upper bound on δ:

$\begin{equation}\delta {\leqslant}\frac{n+1}{2n}\left(\frac{{\epsilon}}{s}\right)+\cdots \;.\end{equation} \tag{ 126 }$

Therefore, for inhomogeneous as well as homogeneous rotations, we conclude that the coherent part of the logical channel is suppressed. In fact, the case where all rotation angles are equal is the worst case, where equation (126) is saturated.

Now let's prove that is minimized (for fixed δ), when all {s_α} are equal.

Lemma 2. Consider minimizing the function

$\begin{equation}{f}_{m}\left({x}_{1},{x}_{2},\dots ,\;{x}_{n}\right)=\sum _{S,\vert S\vert =m}\prod _{\alpha \in S}{x}_{\alpha }\end{equation} \tag{ 127 }$

subject to the constraint ${\prod }_{\alpha =1}^{n}{x}_{\alpha }=c{ >}0$ , where all x_α are nonnegative. Here S denotes a subset of the n variables, and |S| is the size of S. The minimum occurs for x₁ = x₂ = ⋯ = x_n = c^1/n.

Proof. Note that f_m is a symmetric function, invariant under permutations of its n arguments, and can be decomposed as

$\begin{equation}{f}_{m}\left({x}_{1},{x}_{2},\dots ,\;{x}_{n}\right)={f}_{m}\left({x}_{3},\dots ,\;{x}_{n}\right)+{x}_{1}{f}_{m-1}\left({x}_{3},\dots ,\;{x}_{n}\right)+{x}_{2}{f}_{m-1}\left({x}_{3},\dots ,\;{x}_{n}\right)+{x}_{1}{x}_{2}{f}_{m-2}\left({x}_{3},\dots ,\;{x}_{n}\right).\end{equation} \tag{ 128 }$

Using the constraint we write

$\begin{equation}{x}_{1}=\frac{c}{{x}_{2}{x}_{3}\dots {x}_{n}},\end{equation} \tag{ 129 }$

and regard f_m as a function of the n − 1 independent variables x₂, x₃, ..., x_n; then

$\begin{equation}\frac{\partial }{\partial {x}_{2}}\left({x}_{1}{x}_{2}\right)=0,\quad \frac{\partial }{\partial {x}_{2}}\left({x}_{1}\right)=\frac{-{x}_{1}}{{x}_{2}}.\end{equation} \tag{ 130 }$

Therefore, setting the gradient of f_m equal to zero, we find

$\begin{equation}\frac{\partial }{\partial {x}_{2}}{f}_{m}\left({x}_{1},{x}_{2},\dots ,\;{x}_{n}\right)=\left(1-\frac{{x}_{1}}{{x}_{2}}\right){f}_{m-1}\left({x}_{3},\dots ,\;{x}_{n}\right)=0.\end{equation} \tag{ 131 }$

The constraint requires that all x_α are positive; therefore f_m−1(x₃, ..., x_n) is positive and we find that x₁ = x₂. From the symmetry of f_m, we conclude that x₁ = x_α = c^1/n for α = 2, 3, ..., n, when f_m is stationary. This is the unique stationary point of f_m(x₁, x₂, ..., x_n) when all x_α are positive; furthermore f_m is smooth and bounded below. Therefore it must be the minimum of f_m. □

5. Correlated unitary noise

Now let's consider unitary noise acting on n qubits which does not factorize into a product of single-qubit unitaries. Since we still wish to consider noise that can be corrected by the repetition code, assume that the n-qubit unitary U has an expansion in terms of X-type Pauli operators:

$\begin{equation}U=\sum _{S}\psi \left(S\right)X\left(S\right),\end{equation} \tag{ 132 }$

where S denotes a subset of the n qubits and X(S) = ⊗_α∈SX_α is the X-type operator supported on S. (X_α means X acting on the αth qubit, and it is implicit that I acts on qubit α for α ∉ S.) Unitarity of U implies

$\begin{equation}\sum _{S}\vert \psi \left(S\right){\vert }^{2}=1,\end{equation} \tag{ 133 }$

and

$\begin{equation}\sum _{S}\psi {\left(S\right)}^{{\ast}}\psi \left(S+{S}^{\prime }\right)=0,\end{equation} \tag{ 134 }$

where S' is a nonempty set and S + S' = S ∪ S'\S ∩ S' is the disjoint union of S and S'. To make the analysis of the noise more tractable, let's also suppose the noise is invariant under permutations of the n qubits. In that case ψ(S) = ψ(|S|); that is, the amplitude ψ depends only on the weight w = |S| of the error operator X(S). A tensor product of n identical unitary X rotations, $U={\left(cI-isX\right)}^{\otimes n}$ , is the special case where

$\begin{equation}\psi \left(w\right)={c}^{n}{\left(\frac{-is}{c}\right)}^{w},\end{equation} \tag{ 135 }$

an exponential function of the weight w.

The symmetric unitary transformation may also be expressed as U = e^−iH where H is a symmetric n-qubit Hamiltonian of the form

$\begin{equation}H=\sum _{w=0}^{n}{h}_{w}\left(\sum _{S,\vert S\vert =w}X\left(S\right)\right).\end{equation} \tag{ 136 }$

We are assuming that there is no geometric locality constraint on the interactions among the qubits—the strength of a weight-w term in the Hamiltonian depends only on the weight, not on which set S of w qubits are interacting. Since h_w is the coefficient of a sum of $\left(\genfrac{}{}{0.0pt}{}{n}{w}\right)$ terms, it is implicit that h_w decays as a function of w. It is natural to assume that $\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){h}_{w}=\mathcal{O}\left(n\right)$ , as only in that case do we expect (for h_w sufficiently small) the probability of a logical error to drop rapidly as n gets large. For example, if ${h}_{2}=\mathcal{O}\left(1\right)$ , then each qubit has $\mathcal{O}\left(1\right)$ coupling strength with n − 1 other qubits, so the strength of the noise acting on each qubit grows linearly in n, and error correction fails for n sufficiently large. We will elaborate on this point in the discussion below of two-body correlated noise. In a more realistic noise model, the higher-weight terms in the Hamiltonian would have $\mathcal{O}\left(1\right)$ strength (independent of system size), but would decay sufficiently rapidly as the qubits separate that the effective single-qubit noise strength is also $\mathcal{O}\left(1\right)$ [27, 28].

The structure of the noise correlations is determined by how h_w falls off as the weight w increases. In particular, if ${n}^{w-1}{h}_{w}=\mathcal{O}\left({h}_{1}^{w}\right)$ , then ψ(w) in equation (132) is a sum of $\mathcal{O}\left({h}_{1}^{w}\right)$ terms; in that case the parameters of the logical channel will be ${\epsilon}=\mathcal{O}\left({h}_{1}^{n+1}\right)$ and $\delta =\mathcal{O}\left({h}_{1}^{n}\right)$ , so the coherent and incoherent parts of the logical channel qualitatively resemble what we found for uncorrelated noise. On the other hand, in the extreme case where h_n ≠ 0 and h_w = 0 for 0 ⩽ w ⩽ n − 1, the code provides no protection against logical errors and there is no suppression of coherence. Instead we find $\delta =\mathcal{O}\left({h}_{n}\right)$ and ${\epsilon}=\mathcal{O}\left({h}_{n}^{2}\right)$ so that ${\epsilon}=\mathcal{O}\left({\delta }^{2}\right)$ just as in equation (24).

To be concrete, consider the three-qubit repetition code and noise Hamiltonian

$\begin{equation}H={h}_{1}\left({X}_{1}+{X}_{2}+{X}_{3}\right)+{h}_{2}\left({X}_{1}{X}_{2}+{X}_{2}{X}_{3}+{X}_{3}{X}_{1}\right)+{h}_{3}\left({X}_{1}{X}_{2}{X}_{3}\right).\end{equation} \tag{ 137 }$

The unitary noise has the expansion

$\begin{align}\hfill U={\mathrm{e}}^{-\text{i}H}& =\left(1+\cdots \;\right)I+\left(-i{h}_{1}+\cdots \;\right)\left({X}_{1}+{X}_{2}+{X}_{3}\right)+\left(-i{h}_{2}-{h}_{1}^{2}+\cdots \;\right)\left({X}_{1}{X}_{2}+{X}_{2}{X}_{3}+{X}_{3}{X}_{1}\right)\hfill \\ \hfill & \quad +\;\left(i{h}_{1}^{3}-3{h}_{1}{h}_{2}-i{h}_{3}+\cdots \;\right){X}_{1}{X}_{2}{X}_{3},\hfill \end{align} \tag{ 138 }$

where only the leading terms are shown in the coefficient of each Pauli operator. Repeating the analysis of the logical channel as in section 4.3, but now using this modified unitary noise operator, we find

$\begin{align}\hfill {\mathcal{N}}_{L,\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)& =\rho +\left(3{h}_{1}^{4}+3{h}_{2}^{2}+{h}_{3}^{2}+\cdots \;\right)\overline{X}\rho \overline{X},\hfill \\ \hfill {\mathcal{N}}_{L,\mathrm{c}\mathrm{o}\mathrm{h}}\left(\rho \right)& =\left(i{h}_{1}^{3}-i{h}_{3}\right)\left[\overline{X},\rho \right]-3{h}_{1}{h}_{2}\left(\overline{X}\rho +\rho \overline{X}\right)-3i{h}_{1}^{3}\left[\overline{X},\rho \right]+3{h}_{1}{h}_{2}\left(\overline{X}\rho +\rho \overline{X}\right)+\cdots \hfill \\ \hfill & =-i\left(2{h}_{1}^{3}+{h}_{3}\right)\left[\overline{X},\rho \right],\hfill \end{align} \tag{ 139 }$

(showing only the leading terms), which yields

$\begin{equation}{\tilde {\chi }}_{XX}={\epsilon}/2=3{h}_{1}^{4}+3{h}_{2}^{2}+{h}_{3}^{2}+\cdots \;,\quad {\tilde {\chi }}_{XI}=-i\delta /2=-i\left(2{h}_{1}^{3}+{h}_{3}+\cdots \;\right),\end{equation} \tag{ 140 }$

where $\tilde {\chi }$ denotes the logical chi matrix after error correction. (We do not find any contribution to the coherent part of the logical channel depending only on h₂, because the h₂ term in the Hamiltonian has even X parity, while the logical operator $\overline{X}$ has odd parity.) Now whether coherence is suppressed hinges on the strength of the h₃ term in the Hamiltonian. In particular, if h₃ is large compared to ${h}_{1}^{2}$ and h₂, then highly correlated noise dominates, and coherence of the logical channel is unsuppressed.

As another instructive example, consider the length-n repetition code, where the Hamiltonian contains only single-qubit and two-qubit terms. We will compute the coherent and incoherent parts of the logical channel following the same reasoning as in section 4.3. Again, we will need to sum over all the possible values of the syndrome weight, which we will now denote by k. For each value of k, we will find a contribution to the chi matrix for the error-corrected logical channel, with logical operators acting on the encoded density operator ρ from the left and from the right. Each such operator can be obtained in many ways as a product of one-body and two-body terms in the Hamiltonian, and we will have to do some combinatorics to sum up those contributions. By computing the logical chi matrix, and comparing its coherent and incoherent parts, we can prove the following:

Theorem 2. Consider the bit flip code with n qubits, and let the noise model be given by the n-qubit unitary map

$\begin{equation}U=\mathrm{exp}\left(-iH\right),\quad \mathrm{w}\mathrm{h}\mathrm{e}\mathrm{r}\mathrm{e}\;H=\sum _{i}{h}_{1}{X}_{i}+\sum _{i{< }j}{h}_{2}{X}_{i}{X}_{j}.\end{equation} \tag{ 141 }$

After error correction, the logical noise channel satisfies the following bound relating the coherent and incoherent components:

$\begin{equation}{\tilde {\chi }}_{XX}{\geqslant}\frac{2n}{n+1}\left(\mathrm{tan}{h}_{1}\right)\vert {\tilde {\chi }}_{XI}\vert ,\end{equation} \tag{ 142 }$

where $\tilde {\chi }$ denotes the logical chi matrix. Equation (142) holds for any odd n, and for any h₁, but we have made the approximation nh₂ ≪ 1, neglecting a multiplicative $\left(1+\mathcal{O}\left(n{h}_{2}\right)\right)$ correction on the right-hand side.

Theorem 2 implies that, even for this correlated unitary noise model, the coherence of the logical noise channel is heavily suppressed for large n. In fact, the ratio of the coherent to incoherent components of the logical noise channel is similar to what we found for the uncorrelated case, where h₁ ≈ θ/2; compare to equation (91).

Proof. The proof of theorem 2 is contained in the next few subsections. We will compute first the coherent component of the logical channel, then the incoherent component, and finally we will compare the two to obtain equation (142).

The unitary operator U = e^−iH can be expressed as

$\begin{equation}U=\prod _{i}\left({c}_{1}-i{s}_{1}{X}_{i}\right)\prod _{i{< }j}\left({c}_{2}-i{s}_{2}{X}_{i}{X}_{j}\right)={c}_{1}^{n}{c}_{2}^{n\left(n-1\right)/2}\prod _{i}\left(1-i{t}_{1}{X}_{i}\right)\prod _{i{< }j}\left(1-i{t}_{2}{X}_{i}{X}_{j}\right),\end{equation} \tag{ 143 }$

where s₁ = sin h₁, c₁ = cos h₁, t₁ = tan h₁, and likewise for h₂. In our computations, we will suppress the prefactor ${c}_{1}^{n}{c}_{2}^{n\left(n-1\right)/2}$ , which is implicit in all formulas, and we will expand U in a collisionless approximation. That is, we will neglect terms in the expansion in which operators such as X_i and X_iX_j or X_kX_i and X_iX_j act on a qubit in common. The terms we are neglecting are systematically suppressed by powers of nh₂ compared to the terms we are keeping. More precisely, these corrections can be absorbed into a multiplicative renormalization of h₁ and h₂ by a factor $\left(1+\mathcal{O}\left(n{h}_{2}\right)\right)$ .

5.1. Coherent component

Let us look first at the coherent component ${\tilde {\chi }}_{XI}$ of the logical chi matrix. For each syndrome of weight k, the physical error contributing to this logical component consists of an uncorrectable X error of weight n − k on the left of ρ and a correctable X error of weight k on the right, where k ranges from 0 to (n − 1)/2. The operators on the left and right are supported on disjoint sets of qubits. When we write these operators as products of one-body and two-body terms we will need to count the number of ways of dividing a set of 2p X errors into distinct combinations of p two-body terms. We denote this number by κ_p where

$\begin{equation}{\kappa }_{p}=\frac{\left(2p\right)!}{{2}^{p}p!}.\end{equation} \tag{ 144 }$

Let us count the terms with k_L factors of t₂ on the left and k_R factors of t₂ on the right. In addition, there will be some number w of factors of t₁ on the right and n − 2k_L − 2k_R − w factors of t₁ on the left to fill out the coherent term. First we choose the 2k_L qubits on the left where the t₂ terms act; these qubits can be chosen in $\left(\genfrac{}{}{0.0pt}{}{n}{2{k}_{\text{L}}}\right)$ ways. Once these 2k_L qubits have been chosen, there are ${\kappa }_{{k}_{\text{L}}}$ ways to divide up the qubits into pairs where the two-body terms act. Next, we choose the 2k_R qubits on the right where the t₂ terms act. Because the operators on the left and right are supported on disjoint sets of qubits, these 2k_R qubits can be chosen in $\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}}{2{k}_{\text{R}}}\right)$ ways. Once these 2k_R qubits have been chosen, there are ${\kappa }_{{k}_{\text{R}}}$ ways to divide up the qubits into pairs where the two-body terms act. Of the remaining n − 2k_L − 2k_R qubits where no two-body terms act, we choose w qubits on the left where the one-body terms acts; these can be chosen in $\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}-2{k}_{\text{R}}}{w}\right)$ ways. As usual, this contribution to the logical channel has a phase, which is determined by including a factor of −i for each term in the Hamiltonian which acts from the left and a factor of i for each term in the Hamiltonian which acts from the right. By combining all these factors, we find a contribution to ${\tilde {\chi }}_{XI}$

$\begin{equation}{\left(i\right)}^{n-w-2{k}_{\text{R}}-{k}_{\text{L}}}{\left(-i\right)}^{w+{k}_{\text{R}}}{t}_{1}^{n-2{k}_{\text{L}}-2{k}_{\text{R}}}{t}_{2}^{{k}_{\text{L}}+{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{n}{2{k}_{\text{L}}}\right)\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}}{2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}-2{k}_{\text{R}}}{w}\right){\kappa }_{{k}_{\text{L}}}{\kappa }_{{k}_{\text{R}}}.\end{equation} \tag{ 145 }$

Next we sum over w, taking care to note the w-dependent phase in equation (145). Fortunately, this sum can be evaluated explicitly using an identity satisfied by binomial coefficients, just as we saw in section 3. The sum ranges from w = 0 to w = (n − 1)/2 − k_R, so we have

$\begin{equation}\sum _{w=0}^{\left(n-1\right)/2-2{k}_{\text{R}}}{\left(-1\right)}^{w}\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}-2{k}_{\text{R}}}{w}\right)={\left(-1\right)}^{\left(n-1\right)/2-2{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}-2{k}_{\text{R}}-1}{\left(n-1\right)/2-2{k}_{\text{R}}}\right).\end{equation} \tag{ 146 }$

To complete the evaluation of ${\tilde {\chi }}_{XI}$ , it remains to sum over k_L and k_R in

$\begin{equation}{\tilde {\chi }}_{XI}=\sum _{{k}_{\text{L}},{k}_{\text{R}}}{\Omega}\left({k}_{\text{L}},{k}_{\text{R}}\right){t}_{2}^{{k}_{\text{L}}+{k}_{\text{R}}}{t}_{1}^{n-2{k}_{\text{L}}-2{k}_{\text{R}}},\end{equation} \tag{ 147 }$

where from equations (145) and (146) we have

$\begin{equation}{\Omega}\left({k}_{\text{L}},{k}_{\text{R}}\right)={\left(i\right)}^{n-{k}_{\text{L}}-{k}_{\text{R}}}{\left(-1\right)}^{{k}_{\text{R}}}{\left(-1\right)}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n}{2{k}_{\text{L}}}\right)\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}}{2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{n-2{k}_{\text{L}}-2{k}_{\text{R}}-1}{\left(n-1\right)/2-2{k}_{\text{R}}}\right){\kappa }_{{k}_{\text{L}}}{\kappa }_{{k}_{\text{R}}}.\end{equation} \tag{ 148 }$

In the sum in equation (147), 2k_R can be any nonnegative integer less than or equal to (n − 1)/2, and 2(k_R + k_L) can be any nonnegative integer less than or equal to n − 1.

Our goal is to compare this coherent component with the incoherent component, which can also be expressed as a sum. Instead of performing an unrestricted sum over k_L and k_R, we will consider the sum over k_L where k_L + k_R = q is fixed. This collects all the terms in ${\tilde {\chi }}_{XI}$ of order ${t}_{2}^{q}$ . Then we will follow a similar path to compute the incoherent component ${\tilde {\chi }}_{XX}$ to order ${t}_{2}^{q}$ , so that we can compare the coherent and incoherent components in each order.

Let us isolate the parts of Ω(k_L, q − k_L) that depend on q only (not on k_R), and let us introduce the shorthand m = (n − 1)/2, finding

$\begin{equation}{\Omega}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)=\frac{{\left(i\right)}^{n-q}{\left(-1\right)}^{m}\left(m+1\right)}{\left(n-2q\right){2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right)\frac{{\left(m!\right)}^{2}}{\left(2m-2q\right)!q!}{\times}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m-2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right),\end{equation} \tag{ 149 }$

where we have used equation (144). Now we need to sum k_R from k_R = 0 to k_R = q, and then sum q from q = 0 to q = (n − 1)/2.

We observe that, due to the oscillating sign ${\left(-1\right)}^{{k}_{\text{R}}}$ , the sum over k_R vanishes when q is odd. This cancellation occurs because if we replace k_R by q − k_R, the summand remains the same except for a change in phase (−1)^q. What's happening is that for each term contributing to ${\tilde {\chi }}_{XI}$ with l factors of it₂ on the right and q − l factors of −it₂ on the left, there is a corresponding term with q − l factors of it₂ on the right and l factors of −it₂ on the left. These two terms have equal magnitude but opposite sign, if q is odd. Similar cancellations occur in the computation of the incoherent component ${\tilde {\chi }}_{XX}$ .

5.2. Incoherent component

Now we can use similar reasoning to compute the incoherent component ${\tilde {\chi }}_{XX}$ of the logical channel. In this case, though, we will not perform a sum over all syndromes; instead we will keep only the contribution of lowest order in t₁ and t₂, arising from the syndrome of highest weight. This will suffice for deriving the lower bound in equation (142), because the contributions to ${\tilde {\chi }}_{XX}$ higher order in t₁ and t₂ are nonnegative. Furthermore, keeping only the lowest-order term is a good approximation when t₁ and t₂ are sufficiently small.

For n odd, this leading-order contribution arises from terms with X acting (n + 1)/2 times from both the left and the right. In a term with k_L factors of t₂ on the left and k_R factors of t₂ on the right, there will also be (n + 1)/2 − 2k_L factors of t₁ on the left, and (n + 1)/2 − 2k_R factors of t₁ on the right. Summing over k_L and k_R, and arguing as in our discussion of the coherent contribution, we find

$\begin{equation}{\tilde {\chi }}_{XX}=\sum _{{k}_{\text{L}},{k}_{\text{R}}}{\Delta}\left({k}_{\text{L}},{k}_{\text{R}}\right){t}_{2}^{{k}_{\text{L}}+{k}_{\text{R}}}{t}_{1}^{n+1-2{k}_{\text{L}}-2{k}_{\text{R}}}+\cdots \;.\end{equation} \tag{ 150 }$

Here

$\begin{equation}{\Delta}\left({k}_{\text{L}},{k}_{\text{R}}\right)={\left(i\right)}^{m+1-{k}_{\text{L}}}{\left(-i\right)}^{m+1-{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right)\left(\genfrac{}{}{0.0pt}{}{m+1}{2{k}_{\text{L}}}\right)\left(\genfrac{}{}{0.0pt}{}{m+1}{2{k}_{\text{R}}}\right){\kappa }_{{k}_{\text{L}}}{\kappa }_{{k}_{\text{R}}},\end{equation} \tag{ 151 }$

we have defined m = (n − 1)/2, and the ellipsis indicates nonnegative higher-order corrections. We can again introduce q = k_L + k_R and isolate the portion of Δ(q − k_R, k_R) that depends only on q:

$\begin{equation}{\Delta}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)=\frac{{\left(i\right)}^{q}}{{2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right)\frac{{\left(\left(m+1\right)!\right)}^{2}}{\left(2m-2q+2\right)!q!}{\times}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{2m-2q+2}{m-2{k}_{\text{R}}+1}\right)\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right);\end{equation} \tag{ 152 }$

here k_R is to be summed from 0 to q, followed by a sum over q from 0 to (n + 1)/2. As for the coherent component, the sum over k_R with q fixed vanishes when q is odd, due to the oscillating minus sign ${\left(-1\right)}^{{k}_{\text{R}}}$ .

5.3. Comparing the coherent and incoherent components

Now we are ready to compare ${\tilde {\chi }}_{XI}$ and ${\tilde {\chi }}_{XX}$ . In both cases there is a sum over k_R to perform for each even value of q, and by inspecting (149) and (152) we see that the k_R-dependent factors in Ω(q, k_R) and Δ(q, k_R) are nearly the same; the factor in Δ is obtained from the factor in Ω if we replace m by m + 1. Because this factor grows rapidly with m, we see that the factor in Δ is larger than the factor in Ω for each value of q and k_R, but that by itself does not suffice for comparing ${\tilde {\chi }}_{XI}$ and ${\tilde {\chi }}_{XX}$ , due to the alternating sign ${\left(-1\right)}^{{k}_{\text{R}}}$ in the sum over k_R.

To compare the coherent and incoherent logical noise components properly, we must perform the sum over k_R. We will make use of the generalized hypergeometric function ₃F₂. This function is defined

$\begin{equation}{}_{3}{F}_{2}\left[\begin{matrix}\hfill a,\hfill & \hfill b,\hfill & \hfill c\hfill \\ \hfill \hfill & \hfill d,\hfill & \hfill e\hfill \end{matrix}\;;\;\;z\right]=\sum _{k=0}^{\infty }\frac{{\left(a\right)}_{k}{\left(b\right)}_{k}{\left(c\right)}_{k}{z}^{k}}{{\left(d\right)}_{k}{\left(e\right)}_{k}k!},\end{equation} \tag{ 153 }$

where (a)_k denotes the Pochhammer function or the rising factorial

$\begin{equation}{\left(a\right)}_{k}=a\left(a+1\right)\left(a+2\right)\left(a+3\right)\dots \left(a+k-1\right).\end{equation} \tag{ 154 }$

If a is a negative integer, then

$\begin{equation}\frac{{\left(a\right)}_{k}}{k!}={\left(-1\right)}^{k}\left(\genfrac{}{}{0.0pt}{}{-a}{k}\right),\end{equation} \tag{ 155 }$

and the sum over k in equation (153) terminates—instead of 0 to ∞, the sum runs from 0 to −a. The same is true if b or c is a negative integer.

Using this definition of ₃F₂, we can write the sum over k_R of Ω or Δ in terms of ₃F₂. We will have to distinguish the two cases 2q < m and 2q ⩾ m, although we will see at the end that the final expressions will coincide for the two cases. Take the second term in equation (149). Supposing that 2q < m, we can write

$\begin{align}\hfill \sum _{{k}_{\text{R}}}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m-2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)& =\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m}\right)\sum _{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)\frac{m\left(m-1\right)\dots \left(m-2{k}_{\text{R}}+1\right)}{\left(m-2q+2{k}_{\text{R}}\right)\dots \left(m-2q+1\right)}\hfill \\ \hfill & ={}_{3}{F}_{2}\left[\begin{matrix}\hfill -q,\hfill & \hfill \frac{1-m}{2},\hfill & \hfill \frac{-m}{2}\hfill \\ \hfill \hfill & \hfill \frac{m+1}{2}-q,\hfill & \hfill \frac{m}{2}-q+1\hfill \end{matrix}\;;\;\;1\right]\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m}\right).\hfill \end{align} \tag{ 156 }$

Then we can apply Dixon's identity for the hypergeometric function ₃F₂. This reads

$\begin{equation}3{F}_{2}\left[\begin{matrix}\hfill a,\hfill & \hfill b,\hfill & \hfill -c\hfill \\ \hfill \hfill & \hfill 1+a-b,\hfill & \hfill 1+a+c\hfill \end{matrix}\;;\;\;1\right]=\frac{{\Gamma}\left(1+\frac{a}{2}\right){\Gamma}\left(1+\frac{a}{2}-b-c\right){\Gamma}\left(1+a-b\right){\Gamma}\left(1+a-c\right)}{{\Gamma}\left(1+a\right){\Gamma}\left(1+a-b-c\right){\Gamma}\left(1+\frac{a}{2}-b\right){\Gamma}\left(1+\frac{a}{2}-c\right)};\end{equation} \tag{ 157 }$

cf equation (2.3.3.5) in [29]. Applying this formula to equation (156), we get

$\begin{equation}\sum _{{k}_{\text{R}}}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m-2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)=\frac{\left(-q/2\right)!}{\left(-q\right)!}\frac{{\Gamma}\left(m-q/2-1/2\right){\Gamma}\left(m/2-q-1/2\right){\Gamma}\left(m/2-q\right)}{{\Gamma}\left(m-q-1/2\right){\Gamma}\left(m/2-q/2-1/2\right){\Gamma}\left(m/2-q/2\right)}{\times}\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m}\right).\end{equation} \tag{ 158 }$

We need to do something about the first factor on the right-hand side (−q/2)!/(−q)! because the gamma function has poles at each negative integer. However, this ratio can still be defined:

$\begin{equation}\frac{\left(-q/2\right)!}{\left(-q\right)!}=\underset{q/2\to \mathrm{I}\mathrm{n}\mathrm{t}\mathrm{e}\mathrm{g}\mathrm{e}\mathrm{r}}{\mathrm{lim}}\frac{{\Gamma}\left(-q/2+1\right)}{{\Gamma}\left(-q+1\right)}={\left(-q\right)}_{q/2}={\left(-1\right)}^{q/2}\frac{q!}{\left(q/2\right)!}.\end{equation} \tag{ 159 }$

We can substitute this into equation (158) and we find that we can simplify the expression

$\begin{equation}\sum _{{k}_{\text{R}}}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{2m-2q}{m-2{k}_{\text{R}}}\right)\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)=\frac{{\left(-1\right)}^{q/2}\left(2m-q\right)!q!}{\left(m-q/2\right)!\left(q/2\right)!m!}.\end{equation} \tag{ 160 }$

Up until now we have assumed 2q < m. If we instead assume 2q ⩽ m, we find that the intermediate steps look different, but we arrive at the same final answer as in equation (160).

Now we can compute the sum of equation (149) as k_R goes from 0 to q using what we found in equation (160). We can also apply our result to perform the sum over k_R for equation (152). This gives:

$\begin{align}\hfill {\Omega}\left(q\right)\equiv \sum _{{k}_{\text{R}}}{\Omega}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)& =\frac{\left(2m-q\right)!\left(m+1\right)!}{\left(m-q/2\right)!\left(q/2\right)!\left(2m+1-2q\right)!{2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right),\hfill \\ \hfill {\Delta}\left(q\right)\equiv \sum _{{k}_{\text{R}}}{\Delta}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)& =\frac{\left(2m+2-q\right)!\left(m+1\right)!}{\left(m-q/2+1\right)!\left(q/2\right)!\left(2m+2-2q\right)!{2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right).\hfill \end{align} \tag{ 161 }$

The ratio of these quantities is

$\begin{equation}\frac{{\Omega}\left(q\right)}{{\Delta}\left(q\right)}=\frac{\left(2m+2-2q\right)\left(m-q/2+1\right)}{\left(2m-q+2\right)\left(2m-q+1\right)}=\frac{n+1-2q}{2n-2q}{\leqslant}\frac{n+1}{2n}.\end{equation} \tag{ 162 }$

Now we can sum over q; because all terms are nonnegative and the bound holds for every q, we conclude

$\begin{equation}{\tilde {\chi }}_{XX}{ >}\frac{2n}{n+1}{t}_{1}{\tilde {\chi }}_{XI},\end{equation} \tag{ 163 }$

thus proving theorem 2. □

5.4. Summary

By setting q = 0, we can check that the result in equation (161) matches what we found in section 3 for the uncorrelated case. It is also instructive to consider the expansion of ${\tilde {\chi }}_{XI}$ in powers of t₂, under the assumption q ≪ m. From equation (161) we see that

$\begin{equation}{\Omega}\left(q\right)=\left(\frac{{m}^{q/2}{\left(2m\right)}^{2q}}{{2}^{q}\left(q/2\right)!{\left(2m\right)}^{q}}+\cdots \;\right){\Omega}\left(0\right)=\left(\frac{{m}^{3q/2}}{\left(q/2\right)!}+\cdots \;\right){\Omega}\left(0\right),\end{equation} \tag{ 164 }$

where the ellipsis indicates $\mathcal{O}\left(q/m\right)$ corrections.

Restoring the factors of t₁ and t₂ from equation (147), we see that this expansion in t₂ generates a multiplicative correction to ${\tilde {\chi }}_{XI}$ which exponentiates:

$\begin{equation}\sum _{q\;\mathrm{e}\mathrm{v}\mathrm{e}\mathrm{n}}\frac{1}{\left(q/2\right)!}{\left(\frac{{m}^{3}{t}_{2}^{2}}{{t}_{1}^{4}}+\cdots \;\right)}^{q/2}\approx \mathrm{exp}\left({m}^{3}{t}_{2}^{2}/{t}_{1}^{4}\right).\end{equation} \tag{ 165 }$

Since the sum over q is dominated by terms with ${m}^{3}{t}_{2}^{2}/{t}_{1}^{4}\sim q$ , this exponential series should be a good approximation for ${m}^{3}{t}_{2}^{2}{t}_{1}^{4}\ll m$ , or $m{t}_{2}\ll {t}_{1}^{2}$ , since in that case neglecting the terms higher order in q/m can be justified. Under this condition, the two-body terms in the Hamiltonian in equation (162) make a small contribution to the total energy, suppressed by $\mathcal{O}\left({t}_{1}\right)$ compared to the one-body terms. Recall that we also needed mt₂ ≪ 1 to justify the collisionless approximation used in the proof of theorem 2; this condition is subsumed by $m{t}_{2}\ll {t}_{1}^{2}$ if ${t}_{1}=\mathcal{O}\left(1\right)$ .

We see that there is a regime

$\begin{equation}\frac{1}{m}\gg \frac{{t}_{2}}{{t}_{1}^{2}}\gg \frac{1}{{m}^{3/2}}\end{equation} \tag{ 166 }$

in which our approximations are reliable, yet the multiplicative corrections to ${\tilde {\chi }}_{XI}$ are large. That large corrections occur, even when the two-body terms make a small contribution to the total energy, is not a surprise; we have found as expected that the noise correlations can substantially enhance the probability of a logical error. The important point established by theorem 2 (at least for the simple noise model we have analyzed) is that even when the correlated noise produces large corrections to the logical channel, the corrections occur in both the coherent part and the incoherent part of the channel, so that our conclusion that the coherence is strongly suppressed for large n continues to apply.

It is not immediately obvious why the leading power of m in equation (164) should be m^3q/2, because higher powers of m occur in Ω(q − k_R, k_R) and Δ(q − k_R, k_R) for each fixed k_R and q. It turns out that these higher powers of m all cancel when we do the sum over k_R. In appendix C we explain why these cancellations occur, providing a useful check on our results.

6. The toric code against coherent noise

We now analyze the logical channel for the two-dimensional toric code on an L × L square lattice, where L is odd. We will consider uncorrelated unitary noise acting on the 2L² qubits, and suppose that error correction is performed using minimal-weight decoding. Our goal is to show that, when the noise is sufficiently weak, the coherence of the logical noise channel is highly suppressed for large L.

Our analysis will draw heavily on the tools we developed in our study of the repetition code. Before proceeding further, we will review some notation.

6.1. The toric code

We will consider the 2D toric code, which is defined on a square lattice with qubits placed on edges. We choose a square patch of lattice with side length L and identify opposite edges. (The toric code can be constructed on a lattice with boundaries, but for simplicity we choose periodic boundary conditions.) The stabilizer group for the toric code is generated by the X and Z generators shown in figure 2. The logical operators of the toric code are topologically non-trivial loops that wrap around the torus. Figure 3 shows two logical operators.

**Figure 2.** In blue is a Z-type stabilizer generator for the toric code. There are Z generators on every plaquette in the lattice. In red is an X-type stabilizer generator. There are X generators at every vertex of the lattice.
Download figure:
Standard image High-resolution image

**Figure 3.** This figure illustrates two of the logical Pauli operators in the toric code. X and Z logical operators are shown for one of the two encoded qubits. Each logical operator is a topologically non-trivial loop that wraps around the torus. The logical operators for the other encoded qubit are similar but rotated by 90 degrees.
Download figure:
Standard image High-resolution image

The toric code is parameterized by the linear dimensions of the lattice; when the side length is L, the code distance (the minimum weight of a nontrivial logical operator) is L. We will also sometimes refer to L as the code 'size'. The number of physical qubits in the code block is 2L², and there are two encoded logical qubits. To analyze the logical channel, we must choose a decoding procedure. Decoding the toric code is a well-studied problem and many good algorithms are known [30–32]. We will choose minimal-weight decoding, in which the applied recovery operation has the lowest possible weight consistent with the measured error syndrome. This recovery operation can be computed efficiently on a classical computer [33], and corrects the error with a success probability that is exponentially close to 1 when L is large and the noise is both sufficiently weak and sufficiently weakly correlated.

6.2. Notation

We will use the chi matrix to describe the physical noise channel $\mathcal{N}$ acting on the 2L² qubits in the code block:

$\begin{equation}\mathcal{N}\left(\rho \right)=\sum _{i,j}{\chi }_{ij}{\sigma }^{i}\rho \;{\sigma }^{j},\end{equation} \tag{ 167 }$

where {σⁱ} is a basis of Pauli operators.

Definition 1. When we speak of a 'noise term', we will mean a component of the chi matrix for the physical noise channel acting on the qubits in the code block. We will find it convenient to use the notation (σⁱρ σ^j) for the number χ_ij, the coefficient of σⁱρ σ^j in the chi-matrix expansion in equation (167).

We may choose the index that labels a Pauli operator to be (s, a, x ), where σ(s, a, x ) = E_sL_aG_x; here s denotes the error syndrome, E_s is the standard error associated with the syndrome s, L_a is a standard choice for the physical Pauli operator that acts as the logical Pauli operator ${\tilde {L}}_{a}$ , and G_x is an element of the code stabilizer. To compute the logical chi matrix, we sum over the syndrome s and the stabilizer elements, observing that the standard error E_s is removed by the recovery procedure. Hence we find that a term in the logical chi matrix can be expressed in our notation as

$\begin{equation}{\tilde {\chi }}_{ab}\equiv \left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}^{{\dagger}}\right)=\sum _{s,x,y}\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{L}_{b}^{{\dagger}}{E}_{s}^{{\dagger}}\right).\end{equation} \tag{ 168 }$

We say that the diagonal components of the logical chi matrix ${\tilde {\chi }}_{ab}$ with a = b are 'incoherent' noise terms. and that the off-diagonal terms with a ≠ b are 'coherent'.

6.3. Coherent and incoherent logical components

We are going to analyze the coherent and incoherent sums separately at first. Using path counting and assuming the noise is sufficiently weak, we will prove that in both cases the logical chi matrix is dominated by 'short logical strings' (logical Pauli operators of relatively low weight), those with length ⩽L + 2ζ for a constant ζ. Then by summing up the contributions due to these short logical strings, we will derive an inequality relating the coherent and incoherent components of the logical channel.

Our argument will use equation (168), where we have expressed the logical chi matrix as a sum of terms in the physical chi matrix. In the next several sections we will analyze the sums contributing to coherent and incoherent components of ${\tilde {\chi }}_{ab}$ . We will make a series of approximations to simplify the sums by neglecting certain terms. In the end we will demonstrate that the two sums are related by a constant factor.

6.4. The coherent sum

First, consider the coherent sum. The coherent components of the logical noise channel are sums of terms from the physical noise channel. We want to upper bound the magnitude of these coherent logical components. Before we go any further, we will make some simplifications. For one, we will neglect certain coherent logical noise components. We focus on the components of the logical noise ${\tilde {\chi }}_{ab}$ , where exactly one of the operators L_a and L_b is identity and the other is either an X or a Z error on one of the two encoded qubits. These components of the logical noise channel can be expressed as a sum over physical noise terms:

$\begin{equation}{\tilde {\chi }}_{aI}=\sum _{s,x,y}\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right),\end{equation} \tag{ 169 }$

where L_a is either an X or Z logical error on one of the two encoded qubits of the toric code. In appendix I we prove that we can neglect the coherent terms with non-trivial logical operators on both sides of ρ, and in appendix H we prove that we can neglect Y logical operators and operators that act non-trivially on both encoded qubits. The proof comes down to showing that terms with a non-trivial error on both sides of ρ, that act on both encoded qubits, or that apply a Y to one of the logical qubits, have high weight relative to the terms we keep. A further simplification concerns the structure of the noise model. Our result applies to a noise model in which the single-qubit unitary operator acting on each qubit has an axis of rotation and angle of rotation that varies somewhat from qubit to qubit. However, we will prove that the most coherent logical channel is one in which the same unitary operator is applied to each qubit, so we may confine our attention to that case for the purpose of deriving a bound on the relative strength of the coherent and incoherent parts of the logical channel.

We will make use of another way of writing the coherent sum. Each coherent term in the form of equation (169) can be unambiguously associated with a logical string. The product of the Pauli operators acting on the left-hand and right-hand sides is the logical operator L_aG_xG_y, which in general consists of a connected logical string wrapping around the code block, accompanied by some number of closed loops. To be concrete, if L_a is a logical X error, then the logical string contains only physical X errors, the closed loops are either loops of X errors which are disjoint from the logical string, or closed loops of Z errors which may or may not intersect with the logical string or with the closed loops of X errors (the intersections are the Y errors).

Definition 2. For a given noise term $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ , we can extract a connected logical string by removing the topologically trivial loops from L_aG_xG_y. Call this logical string $\mathcal{L}$ . We define the 'connected part' of the noise term as the restriction to the qubits in $\mathcal{L}$ . The connected part of $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ is a noise term given by

$\begin{equation}\left(\left({E}_{s}{L}_{a}{G}_{x}\right){\vert }_{\mathcal{L}}\;\rho \;\left({G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right){\vert }_{\mathcal{L}}\right),\end{equation} \tag{ 170 }$

where the symbol ${\vert }_{\mathcal{L}}$ denotes the restriction of an operator to the support of $\mathcal{L}$ .

Definition 3. For a noise term $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ the 'disconnected part' is the part of the noise term not in the connected part. Once again, we can define a continuous logical string $\mathcal{L}$ by removing all topologically trivial closed loops from L_aG_xG_y. The disconnected part of $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ is given by

$\begin{equation}\left(\left({E}_{s}{L}_{a}{G}_{x}\right){\vert }_{{\mathcal{L}}^{C}}\;\rho \;\left({G}_{y}^{{\dagger}}{E}_{s}^{{\dagger}}\right){\vert }_{{\mathcal{L}}^{C}}\right),\end{equation} \tag{ 171 }$

where the symbol ${\vert }_{{\mathcal{L}}^{C}}$ denotes the restriction of an operator to the qubits in the complement of the support of $\mathcal{L}$ .

Furthermore, we will be able to assume that all of the physical single-qubit errors in the connected part are X- or Z-type. For example, in the case of a logical X-type error, we may neglect terms in which a closed loop of Z errors intersects with the logical string. To justify this assumption, we show in appendix G that allowing Y errors along the logical string will only make the logical noise channel less coherent.

A coherent term contributing to the logical chi matrix element ${\tilde {\chi }}_{{Z}_{1}I}$ , which includes disconnected errors, is illustrated in figure 4. The disconnected part includes identity on the qubits without errors in addition to the disconnected errors. Z errors acting on the density operator from the left are shown in red, and Z errors acting from the right are shown in blue. Because the errors acting from the left and right have the same syndrome s, the product of the left and right logical operators is logical. The connected logical string crosses the code block near the bottom of the figure. Associated with the syndrome s is the corresponding standard error E_s, the Z error of minimal weight with that syndrome. (If the minimal-weight error is not unique, we arbitrarily choose E_s to be one of the errors of minimal weight by convention.) To evaluate the logical chi matrix element ${\tilde {\chi }}_{{Z}_{1}I}$ as in equation (168), we need to sum over the syndrome s and the stabilizer elements G_x and G_y. To facilitate estimating the sum, it will be helpful to organize it in an appropriate way.

To this end, we introduce the following definition:

Definition 4. For a logical string $\mathcal{L}$ with no topologically trivial closed loops, the word 'partition' denotes a connected noise term (O₁ρO₂) such that ${O}_{1}{O}_{2}=\mathcal{L}$ and O₁ and O₂ are disjoint. In other words, each partition is a way of dividing the single-qubit errors in $\mathcal{L}$ into two subsets, O₁ and O₂. By definition O₁ and O₂ share the same syndrome. Because the code size L is odd, exactly one of O₁ and O₂ will be corrected to a logical operator with the same action as $\mathcal{L}$ and the other will be corrected to the logical identity.

For each fixed logical string, the sum over all partitions of the logical string will produce the full set of connected terms derived from that logical string. The sum over partitions, for a fixed logical string, is directly analogous to the sum over syndromes we encountered in our analysis of the repetition code in section 4.3. In the case of the toric code, we compute the coherent part ${\tilde {\chi }}_{{Z}_{1}I}$ of the logical channel by summing over all possible logical strings, and for each choice of logical string we sum over all partitions of the logical string. In addition, for each chosen logical string, we sum over the possible disconnected pieces, the additional closed loops of Z errors which are disjoint from the logical string.

Schematically, the coherent component of the logical chi matrix is

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}I}=\sum _{\text{strings}}\sum _{\text{partitions}}\left(\text{Connected}\;\text{part}\right)\left(\text{disconnected}\;\text{sum}\right).\end{equation} \tag{ 172 }$

This form will allow us to approximate the coherent sum. Assuming that the noise is sufficiently weak, we will prove that we can truncate the sum over logical strings, including only short strings. Furthermore, most of the short logical strings have a particular shape. To complete the argument, we will show that the disconnected sum is approximately the same for each short logical string and for each partition of the logical string.

6.5. Counting of logical strings

We want to find an upper bound on the magnitude of the coherent component of the logical noise channel. We have already put the sum over physical noise terms into a convenient form by factoring out the disconnected piece of each term. Next, we will simplify the sum by restricting the set of connected pieces we need to consider; we will neglect the long logical strings in favor of those strings with length no larger than than L + 2ζ, where ζ is an L-independent constant. To justify this truncation we will require a strong assumption on how the physical noise strength scales with L; namely, the single-qubit rotation angles must scale as 1/L.

In equation (172) we wrote the contribution of a given logical string to the coherent logical noise as a product of a connected and disconnected part as described in definitions 2 and 3. The connected part summed over partitions as defined in definition 4. The sum over partitions contains 2^w−1 terms for a weight-w logical string (one containing w lattice edges). Suppose that the unitary rotation ${U}^{Z}\left(\theta \right)=\mathrm{exp}\left(-i\frac{\theta }{2}Z\right)$ is applied to each physical qubit in the toric code block. We can upper bound the sum using the number of terms times the magnitude of each term. Then the contribution of each logical string is upper bounded by 2^w−1(|sin(θ/2)|cos(θ/2))^w times the factor from the disconnected part. We will prove in section 6.7 and appendix E that the disconnected piece is 1 plus a higher weight correction that we can neglect for short logical strings.

There is a regime where we can upper bound the number of logical strings as a function of the string's length. Asymptotically, the number c_w of self-avoiding random walks with length w was proven in [34] to satisfy

$\begin{equation}{c}_{w}={\mu }^{w+o\left(w\right)},\end{equation} \tag{ 173 }$

where μ ≈ 2.64 for the 2D square lattice. We can start a walk from a fixed point along one edge of the toric code. Logical strings will be the self-avoiding walks that wrap around the torus and end at the starting point. We can use equation (173) to show that the contribution to the coherent logical noise from logical strings of length ℓ is exponentially decaying with ℓ as long as |θ| < arcsin 1/μ ≈ 0.39. This statement applies only for logical strings with length much greater than the minimum of L, the code distance. We do not have a precise estimate indicating at what length above L the number of logical strings begins to scale like equation (173). This means we do not know at what string length ℓ the contribution will begin to decay exponentially, and therefore we do not know where to truncate the sum if we wish to use equation (173) to bound the terms we are neglecting. In any case, in our subsequent analysis we will truncate the sum over the string length ℓ at L + 2ζ for some constant ζ. In this regime the asymptotic estimate in equation (173) is not helpful and we will not make use of it. Instead, we will assume that |θ| is sufficiently small that we can use the following lemma to bound the terms we neglect.

Lemma 3. Suppose that |sin θ| < 1/L. In equation (172) we wrote ${\tilde {\chi }}_{{Z}_{1}I}$ as a sum over logical strings. If we truncate the sum to include only logical strings of length w ⩽ L + 2ζ, then magnitude of the difference between the truncated sum and the complete sum is

$\begin{equation}{\leqslant}\alpha {L}^{2\zeta +1}\vert \mathrm{sin}\;\theta {\vert }^{L+2\zeta },\end{equation} \tag{ 174 }$

where α = (1 − L|sin θ|)⁻¹.

Proof. We begin by fixing a point along one edge of the code block, which can be chosen in L ways. We will count the number of logical strings that wrap around the torus and pass through that fixed point on the edge. Let ℓ denote the logical string length. At minimum length ℓ = L, there is only one logical string. At length ℓ = L + 2, if the logical string runs left to right across the code, then the string features one step up and one step down. There are L(L − 1) such logical strings. At longer string lengths, there are many steps up and down. We can upper bound the number of such logical strings by supposing we choose any of the L positions to place each of the steps up and steps down. We divide by ((ℓ − L)/2)! to capture the fact that the (ℓ − L)/2 steps up are all identical, and the same for the steps down. This encompasses all possible combinations of steps up and down including cases where the several steps up are placed at the same point creating a step up of more than one. It does not encompass strings that backtrack, but in lemma 9 we show that among strings of length L + 2ζ, those that feature backtracking are suppressed by $\mathcal{O}\left(1/L\right)$ . Also, the number of strings grows most quickly near minimum and eventually approaches the asymptotic value, where the number of strings grows like μ^L. In the asymptotic regime, the number of strings grows much slower than L^ℓ−L. We conclude that

$\begin{equation}\text{Number}\;\text{of}\;\text{logical}\;\text{strings}\;\text{of}\;\text{length}\;\ell \;{\leqslant}L{L}^{\ell -L}.\end{equation} \tag{ 175 }$

In equation (172) for each logical string in the sum the contribution to the logical noise is a sum over partitions of the connected part times a disconnected part. We will discuss the sum over partitions in detail in section 6.6, but for now it is enough that we know that the sum over partitions contains 2^ℓ−1 terms for each connected logical string of length ℓ. These terms have different phases and in general the sum can be complicated. We can obtain a simple bound by multiplying the number of terms by the magnitude of each term, in other words treating all the phases as if they are the same. For each weight w string,

$\begin{equation}\sum _{\text{partitions}}\left(\text{Connected}\;\text{part}\right){\leqslant}{2}^{\ell }{\left(\vert \mathrm{sin}\;\theta /2\vert \mathrm{cos}\;\theta /2\right)}^{\ell }.\end{equation} \tag{ 176 }$

We still have to handle the disconnected piece. In section 6.7 we will argue that the disconnected sum decreases as the length of the logical string increases. Furthermore, the disconnected part equals 1 up to corrections which are small for logical strings with length ⩽L + 2ζ for a constant ζ. This means that we can upper bound the coherent logical noise component ${\tilde {\chi }}_{{Z}_{1}I}$ by

$\begin{equation}\sum _{\ell =L}^{{\ell }_{\mathrm{max}}}{L}^{\ell -L+1}\vert \mathrm{sin}\;\theta {\vert }^{\ell },\end{equation} \tag{ 177 }$

where ℓ_max is the longest Z₁ logical string supported on the code. If |sin θ| < 1/L, the contribution from logical strings of length ℓ decreases exponentially with ℓ.

If we truncate the sum over logical strings to those with weight w ⩽ L + 2ζ, the error we make is equal to the total contribution of strings with weight w > L + 2ζ. The contribution at weight w is exponentially decreasing with w, so we can bound the sum over the long logical strings using

$\begin{equation}\sum _{\ell =c}^{\infty }{\beta }^{\ell }=\frac{{\beta }^{c}}{1-\beta }=\alpha {\beta }^{c}\quad 0{< }\beta {< }1,\end{equation} \tag{ 178 }$

where $\alpha =\frac{1}{1-\beta }$ . We conclude that the absolute error we make by truncating the series is

$\begin{equation}{\leqslant}\alpha {L}^{2\zeta +1}\vert \mathrm{sin}\;\theta {\vert }^{L+2\zeta }\end{equation} \tag{ 179 }$

where α = (1 − L|sin θ|)⁻¹. Therefore, the error due to truncation is exponentially small in both L and ζ. □

In lemma 3 we proved an upper bound on the absolute magnitude of the error due to truncation in the coherent sum. However, so far we have not described any lower bound on the terms that we have kept, arising from the logical strings with length ⩽L + 2ζ. Therefore, we have not yet justified that the error we have neglected is small relative to the coherent noise contributions that we kept. However, we will prove in section 6.10 that the incoherent logical noise component is at least $L\left(\genfrac{}{}{0.0pt}{}{L}{\frac{L+1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{L+1}$ ; compared to this incoherent component the contribution in equation (179) to the coherent component due to strings of length >L + 2ζ is suppressed by a factor (L sin θ)^2ζ. This means that the error we make in truncating the sum in lemma 3 is negligible compared to the incoherent component, an observation which will be helpful for showing that the coherence of the logical channel is suppressed. For now, we will restrict our attention to connected logical strings with length ⩽L + 2ζ for a constant ζ. We will refer to these as 'short logical strings'.

Definition 5. A 'short logical string' is a nontrivial logical Pauli operator with no topologically trivial closed loops and length ⩽L + 2ζ, where L is the code size and ζ is our chosen cutoff constant.

6.6. Sum over partitions

In the previous section we restricted our attention to short logical strings, which have length ⩽L + 2ζ where L is the code size and ζ is a constant. We can go further by characterizing the shape of a logical string, and arguing that logical strings with shape meeting certain criteria give a dominant contribution to the logical channel.

Definition 6. Among short logical strings, we will speak of those with 'typical shape.' This means two things. First, supposing that the logical string in question runs left to right across the code block, then the steps up and down along the string are by one lattice spacing at a time. Furthermore, the string contains no backtracking steps that move from right to left. Second, the individual steps up and down are separated from each other by at least $\gamma \sqrt{L}$ , where γ is a small constant we may choose. This constant γ will appear in the error term in many of our subsequent estimates.

In lemmas 9 and 10 in appendix D, we prove that most short strings have a typical shape. Among short strings with length ⩽L + 2ζ, the fraction of atypical strings relative to the total number of logical strings of the same length is

$\begin{equation}\frac{\mathrm{A}\mathrm{t}\mathrm{y}\mathrm{p}\mathrm{i}\mathrm{c}\mathrm{a}\mathrm{l}\mathrm{s}\mathrm{t}\mathrm{r}\mathrm{i}\mathrm{n}\mathrm{g}\mathrm{s}}{\mathrm{T}\mathrm{o}\mathrm{t}\mathrm{a}\mathrm{l}\mathrm{s}\mathrm{t}\mathrm{r}\mathrm{i}\mathrm{n}\mathrm{g}\mathrm{s}}=\frac{8\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right).\end{equation} \tag{ 180 }$

Figure 5 illustrates a string with typical shape for some small γ. Short logical strings with typical shape are simple, which makes our analysis easier, particularly when we discuss the sum over partitions.

**Figure 5.** Here we have one logical string $\mathcal{L}$ of length 15 in an L = 9 toric code. Imagine the code size growing while ζ remains fixed. The likely strings will be those where the steps up and down are widely separated.
Download figure:
Standard image High-resolution image

Let's revisit the sum over partitions for a fixed connected logical string. That is, for a given logical string contributing to ${\tilde {\chi }}_{{Z}_{1}I}$ , we wish to enumerate all the ways to divide the Z errors along the string into an uncorrectable error acting on the density operator from the left and a correctable error acting from the right. This sum over partitions of a fixed logical string is analogous to the sum we encountered when we summed over syndromes in our analysis of the repetition code. In the case of the repetition code of length n, there is just one length-n 'logical string' to consider, and summing over syndromes is equivalent to summing over all ways of choosing a (correctable) error acting on the right that has weight at most (n − 1)/2 (where n is odd).

In the toric code, although the sum over partitions is similar to the sum over syndromes in the repetition code, there is a complication.

Definition 7. An 'exceptional term' is a partition of a connected logical string $\mathcal{L}$ such that the uncorrectable error has lower weight than the correctable error.

In some cases, depending on the geometry of the logical string, we will have some number of exceptional terms. These exceptional terms complicate our analysis of the logical channel. Fortunately, because we need only consider contributions to the logical channel arising from short logical strings when the noise is weak enough, we will be able to fully characterize the exceptional terms and show they are negligible.

How exceptional terms can occur is illustrated in figure 6. Here, for the toric code with L = 9, we consider the logical string of length 15 shown in figure 5, and we have chosen a partition such that the uncorrectable error shown in red has weight 7, while the correctable error shown in blue has weight 8. Note that the minimal-weight standard error associated with the error syndrome on the logical string has weight 6—it follows nearly the same path as the correctable error, but achieves a lower weight than the correctable error by taking a 'shortcut' across the blue notch on the logical string. Another example of an exceptional term for this same logical string is shown in figure 7, where this time the weight of the uncorrectable error is 6, and the minimal-weight error has weight 5. Again, the minimal-weight error takes a shortcut, avoiding the excursions up and down followed by the correctable error.

**Figure 7.** Again, one possible partition of the logical string in figure 5 is illustrated. The uncorrectable error O_U is in red, and the correctable error O_C is in blue. The correctable error includes all the errors along both the cap in the top right and the bottommost cap of the logical string. For this reason, the correctable error has weight 9, while the uncorrectable error has weight 6. Therefore, we call this partition a weight-6 exceptional term.
Download figure:
Standard image High-resolution image

For all these examples, the correctable error contains the qubits along the logical string that make the furthest excursions up and down. This turns out to be a universal rule, at least among the typical short logical strings—for exceptional terms, the uncorrectable error has no support on the outermost steps along the string. In the next lemma we count the number of exceptional terms and find that relative to the total number of partitions of a typical short logical string, these exceptional terms are exponentially unlikely in L.

Lemma 4. Fix a logical string of length ℓ ⩽ L + 2ζ, where ζ is a specified L-independent constant, with a typical shape according to definition 7. This means that if the string runs left to right across the code block, it has steps up and down by one lattice spacing at a time and the steps are separated by at least $\gamma \sqrt{L}$ for some constant γ. To keep the fraction of atypical strings small in equation (180), we will choose γ to be a sufficiently small constant. Now consider all the ways of partitioning this typical logical string into a correctable error and an uncorrectable error. Then the fraction of exceptional partitions relative to all partitions of this string is bounded by

$\begin{equation}\frac{\mathrm{E}\mathrm{x}\mathrm{c}\mathrm{e}\mathrm{p}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}\mathrm{a}\mathrm{l}}{\mathrm{T}\mathrm{o}\mathrm{t}\mathrm{a}\mathrm{l}}{< }\left(\zeta +1\right)\mathrm{exp}\left({\gamma }^{2}\right){\left(2\right)}^{-\gamma \sqrt{L}}\left(1+\mathcal{O}\left(1/L\right)\right).\end{equation} \tag{ 181 }$

Exceptional terms are exponentially rare for typical short logical strings and large L.

Proof. Take a logical string of length ℓ ⩽ L + 2ζ with typical shape. Each step is separated from the others by at least $\gamma \sqrt{L}$ for some γ. Now, consider taking a subset of $\frac{\ell -1}{2}$ of the qubits in the logical string. We would expect such a subset to be correctable. If not, this partition is exceptional.

Choose a partition of a connected logical string and let O_U be the uncorrectable error and O_C be the correctable error. O_U and O_C share a syndrome by definition. Denote that syndrome by s. The decoding algorithm, which in our case is minimal-weight decoding, applies some correction to this syndrome to return it to the code space. Call this correction E_s. E_s is by definition a correctable error in the code, and therefore, because we are using minimal-weight decoding, E_s must have lower weight than O_U. The fact that we chose our code size L to be odd ruled out the case where the two might be equal. Now if the partition we are considering happens to be exceptional, this means by definition that O_C has higher weight than O_U, and we have

$\begin{equation}\text{Exceptional}\;\text{term}\;\left({O}_{\text{U}}\rho \;{O}_{\text{C}}\right)\;\Longrightarrow\;\;\vert {O}_{\text{C}}\vert { >}\vert {O}_{\text{U}}\vert { >}\vert {E}_{s}\vert .\end{equation} \tag{ 182 }$

We will use this condition to bound the number of exceptional terms for a given connected logical string.

What does it mean for O_C to have higher weight than E_s? For connected logical strings of typical shape as in definition 6, this happens only if on some subset or subsets of the logical string, the correctable error O_C contains errors on qubits arranged in a 'cap'. By this we mean a configuration of errors where the errors form three edges of a rectangle. The minimal-weight decoder will choose the fourth edge of the rectangle as part of the correction E_s. This is illustrated in figures 6 and 7. If the connected logical string has length greater than L, then it has steps up and down if it crosses the code block left to right. In every exceptional term, the correctable error O_C will contain the outermost qubits around some of the steps, forming a cap.

Now that we have a simple necessary condition for an exceptional term, we will bound the number of exceptional terms for each short logical string with a typical shape according to definition 6. Start with a logical string of length ℓ. Consider first the partitions into $\frac{\ell +1}{2}$ and $\frac{\ell -1}{2}$ . Of course, those partitions for which the weight- $\frac{\ell +1}{2}$ error is correctable will be exceptional. Every exceptional term like this will have the property that the correctable error contains some number of 'caps' where all of the qubits around three sides of a rectangle are part of the correctable error. To bound the number of exceptional terms, we will count the number of partitions with this property.

Each partition of a weight-ℓ connected logical string into weight- $\frac{\ell +1}{2}$ and $\frac{\ell -1}{2}$ errors is formed by choosing $\frac{\ell -1}{2}$ out of the ℓ errors in the logical string. This is what we mean by a partition. We want to count the number of ways of choosing these errors such that the correctable error (of weight $\frac{\ell +1}{2}$ because we are counting exceptional terms) contains all the errors along a 'cap'. This means that the subset of $\frac{\ell -1}{2}$ errors contains no errors along one or more of the 'caps'. A typical short logical string running left to right across the code consists of horizontal segments separated by single steps up and down. The outermost of these steps form 'caps'. The number of such 'caps' depends on the particular pattern of steps in the logical string. However, we can bound the number of exceptional terms by counting the number of ways of choosing no qubits along one of the horizontal segments of length $\gamma \sqrt{L}$ . This is because every 'cap' consists of an outermost horizontal segment combined with the up and down steps on either side. This counting gives

$\begin{equation}\text{Ways}\;\text{of}\;\text{choosing}\;\text{no}\;\text{qubits}\;\text{along}\;\text{a}\;\text{horizontal}\;\text{segment}\;\text{of}\;\text{length}\;\gamma \sqrt{L}=\left(\genfrac{}{}{0.0pt}{}{\left(\ell -\gamma \sqrt{L}\right)}{\frac{\ell -1}{2}}\right).\end{equation} \tag{ 183 }$

We want the number of ways of choosing no qubits along at least one of the horizontal segments. There are ⩽2ζ steps up and down along the logical string. Therefore, there are ⩽2ζ horizontal segments. We can use a union bound to write

$\begin{equation}\text{Number}\;\text{of}\;\text{weight}-\left(\frac{\ell +1}{2},\frac{\ell -1}{2}\right)\;\text{exceptional}\;\text{terms}{\leqslant}2\zeta \left(\genfrac{}{}{0.0pt}{}{\ell -\gamma \sqrt{L}}{\frac{\ell -1}{2}}\right).\end{equation} \tag{ 184 }$

This is relative to the total number of $\left(\frac{\ell +1}{2},\frac{\ell -1}{2}\right)$ partitions for our logical string of length ℓ, which is

$\begin{equation}\text{Total}\;\text{number}\;\text{of}\;\left(\frac{\ell +1}{2},\frac{\ell -1}{2}\right)\;\text{partitions}=\left(\genfrac{}{}{0.0pt}{}{\ell }{\frac{\ell -1}{2}}\right).\end{equation} \tag{ 185 }$

We can expand the ratio of exceptional terms to the total using Stirling's approximation. This gives

$\begin{align}\hfill 2\zeta \left(\genfrac{}{}{0.0pt}{}{\ell -\gamma \sqrt{L}}{\frac{\ell -1}{2}}\right)/\left(\genfrac{}{}{0.0pt}{}{\ell }{\frac{\ell -1}{2}}\right)& =\frac{2\zeta \left(\ell -\gamma \sqrt{L}\right)!\frac{\ell +1}{2}!}{\ell !\left(\frac{\ell +1}{2}-\gamma \sqrt{L}\right)!}\hfill \\ \hfill & \approx 2\zeta \sqrt{\frac{\left(\ell +1\right)\left(\ell -\gamma \sqrt{L}\right)}{2\ell \left(\frac{\ell +1}{2}-\gamma \sqrt{L}\right)}}\frac{{\left(\ell -\gamma \sqrt{L}\right)}^{\ell -\gamma \sqrt{L}}{\left(\frac{\ell +1}{2}\right)}^{\frac{\ell +1}{2}}}{{\ell }^{\ell }{\left(\frac{\ell +1}{2}-\gamma \sqrt{L}\right)}^{\frac{\ell +1}{2}-\gamma \sqrt{L}}}.\hfill \end{align} \tag{ 186 }$

This approximation holds up to corrections $\mathcal{O}\left(1/\ell \right)$ . We can rewrite this as

$\begin{equation}=2\zeta \sqrt{\frac{\left(\ell +1\right)\left(\ell -\gamma \sqrt{L}\right)}{\ell \left(\ell +1-2\gamma \sqrt{L}\right)}}{\left(1-\frac{\gamma \sqrt{L}}{\ell }\right)}^{\ell }{\left(1-\frac{2\gamma \sqrt{L}}{\ell +1}\right)}^{-\frac{\ell +1}{2}}{\left(\frac{\ell -\gamma \sqrt{L}}{\frac{\ell +1}{2}-\gamma \sqrt{L}}\right)}^{-\gamma \sqrt{L}}.\end{equation} \tag{ 187 }$

Next we square the $\left(1-\frac{\gamma \sqrt{L}}{\ell }\right)$ term in order to combine terms:

$\begin{equation}=2\zeta \sqrt{\frac{\left(\ell -\gamma \sqrt{L}\right)}{\ell }}{\left(\frac{1-\frac{2\gamma \sqrt{L}}{\ell }+{\left(\frac{\gamma \sqrt{L}}{\ell }\right)}^{2}}{1-\frac{2\gamma \sqrt{L}}{\ell +1}}\right)}^{\ell /2}{\left(\frac{\ell -\gamma \sqrt{L}}{\frac{\ell +1}{2}-\gamma \sqrt{L}}\right)}^{-\gamma \sqrt{L}}.\end{equation} \tag{ 188 }$

We upper bound the term inside the radical and also the term raised the power ℓ/2:

$\begin{equation}{< }2\zeta {\left(1+\frac{{\gamma }^{2}}{\ell -2\gamma \sqrt{\ell }}\right)}^{\ell /2}{\left(\frac{\ell -\gamma \sqrt{L}}{\frac{\ell +1}{2}-\gamma \sqrt{L}}\right)}^{-\gamma \sqrt{L}}.\end{equation} \tag{ 189 }$

The second of the three terms is exponentially decaying to exp(γ²/2). As long as ℓ ⩾ 4, we can bound it by

$\begin{equation}{\left(1+\frac{{\gamma }^{2}}{\ell -2\gamma \sqrt{\ell }}\right)}^{\ell /2}{< }\mathrm{exp}\;{\gamma }^{2}.\end{equation} \tag{ 190 }$

Now, we bound $\frac{\ell -\gamma L}{\left(\ell +1\right)/2-\gamma L}{ >}2$ and assemble one term raised to the power L and another to the power (ℓ − L)/2:

$\begin{equation}{< }2\zeta \;\mathrm{exp}\left({\gamma }^{2}\right){2}^{-\gamma \sqrt{L}}.\end{equation} \tag{ 191 }$

We chose some small value for γ in lemma 10, and then the number of exceptional terms with a weight- $\frac{\ell -1}{2}$ logical error on one side and a weight- $\frac{\ell +1}{2}$ correctable error on the other is exponentially small in L.

For the chosen connected logical string of weight ℓ, we have calculated the fraction of exceptional terms among the partitions into $\frac{\ell -1}{2}$ and $\frac{\ell +1}{2}$ . We will also have exceptional terms among the partitions into other weights, possibly all the way down to partitions into weight $\frac{L+1}{2}$ and $\ell -\frac{L+1}{2}$ . Above, we applied the condition in equation (182) that for every exceptional term the correctable error must have higher weight than the minimal-weight correction. If we apply this same method to bound the number of exceptional terms among partitions into $\frac{\ell -3}{2}$ and $\frac{\ell +3}{2}$ , we find that the correctable error must be at least 4 longer than the minimal-weight correction. This means we want to count the number of configurations where at least two of the 'caps' are contained in the correctable error. This is clearly far fewer than the number of configurations where one 'cap' is contained. Therefore, the ratio of exceptional terms to total partitions is bounded by the ratio we found for partitions into $\frac{\ell -1}{2}$ and $\frac{\ell +1}{2}$ .

In the end we see that number of weight- $\frac{\ell -1}{2}$ exceptional terms is exponentially small in L for fixed ζ and further that the weight- $\frac{\ell -3}{2}$ exceptional terms are exponentially small in L relative to the higher-weight exceptional terms, and so on. Then for large L, exceptional terms are negligible. □

Lemma 4 allows us to approximate the sum over partitions for a typical, short logical string $\mathcal{L}$ . Neglecting exceptional terms, the sum over partitions resembles the calculation of what we called δ in the repetition code in equations (87) and (118). Let $\mathcal{L}$ have length ℓ. Each partition contributes ${\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell }$ with a phase. The sum over partitions is given by

$\begin{align}\hfill \sum _{\text{partitions}}\left(\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{n}\mathrm{e}\mathrm{c}\mathrm{t}\mathrm{e}\mathrm{d}\,\mathrm{p}\mathrm{a}\mathrm{r}\mathrm{t}\right)& =\left(\sum _{j=0}^{\frac{\ell -1}{2}}{i}^{\ell }{\left(-1\right)}^{j}\left(\genfrac{}{}{0.0pt}{}{\ell }{j}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell }\right)\left(1+{\epsilon}\right)\hfill \\ \hfill & =\left(i\left(\genfrac{}{}{0.0pt}{}{\ell -1}{\frac{\ell -1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell }\right)\left(1+{\epsilon}\right),\hfill \end{align} \tag{ 192 }$

where is the error from exceptional terms, which is upper bounded

$\begin{equation}\vert {\epsilon}\vert {< }4\zeta \;\mathrm{exp}\left({\gamma }^{2}\right){2}^{-\gamma \sqrt{L}}.\end{equation} \tag{ 193 }$

This is two times the expression in equation (191), because each exceptional term contributes to the sum over partitions with the opposite sign relative to a non-exceptional term.

6.7. The disconnected part

In the preceding subsections, we analyzed the coherent component of the logical noise channel, expressed as a sum over many physical noise terms. So far we have only considered the connected logical string associated with each coherent term. In this subsection, we will analyze the disconnected errors in more detail, and describe more rigorously how they affect the evaluation of the coherent terms in the logical channel. In section 6.4 we described how to decompose a contribution to ${\tilde {\chi }}_{{Z}_{1}I}$ into a connected piece and some number of disconnected pieces. The left- and right-hand side of each coherent term can be expanded as the product of the errors contained in the connected logical string and the errors outside of it; schematically,

$\begin{equation}\left({\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{n}}_{\text{L}}{\mathrm{D}\mathrm{i}\mathrm{s}\mathrm{c}}_{\text{L}}\;\rho \;{\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{n}}_{\text{R}}{\mathrm{D}\mathrm{i}\mathrm{s}\mathrm{c}}_{\text{R}}\right)=\left({\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{n}}_{\text{L}}\;\rho \;{\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{n}}_{\text{R}}\right)\;\mathrm{d}\mathrm{i}\mathrm{s}\mathrm{c}\mathrm{o}\mathrm{n}\mathrm{n}\mathrm{e}\mathrm{c}\mathrm{t}\mathrm{e}\mathrm{d}.\end{equation} \tag{ 194 }$

The factor 'disconnected' means the contribution to the coherent term from disconnected components that appeared in equation (172). The product of the two (disjoint) factors Conn_L and Conn_R yields the connected logical string, with no additional disjoint loops included. The connected factor includes sin θ/2 cos θ/2 for each qubit along the connected logical string. The disconnected factor includes (cos θ/2)² on every qubit not in the connected logical string in addition to a sum over all possible disconnected errors.

Fix a partition (O_UρO_C) of a short, typical logical string, and consider dressing it with disconnected errors. We can distinguish two types of added errors: incoherent and coherent. If the disconnected error is D_L acting on the density operator from the left, and D_R acting from the right, then if a particular qubit is hit by the same error contained in both D_L and D_R, we say that the disconnected error acting on that qubit is incoherent. If a particular qubit is hit by distinct errors contained in D_L and D_R, then the error is coherent. The product D_LD_R of the errors added on right and left must be a non-identity stabilizer operator, i.e. a closed loop or a set of disjoint closed loops. (Here, because we are investigating the encoded Z errors in the logical channel, only the Z-type physical errors are considered.) The two types of added error—incoherent and coherent—are shown in figure 8, where (A) is an incoherent-type added error and (B)–(D) are coherent-type.

**Figure 8.** Here we have a partition (O_UρO_C) of a connected logical string adorned in four different ways by added errors. The errors in red are the uncorrectable part, O_U, of the partition, while the errors in blue form the correctable part, O_C. The four added errors are labelled (A)–(D). In (A), the same error has been added to both O_U and O_C. In (B) and (D), three errors are added to one side of the partition and one to the other. This produces a minus sign. In (C), two errors are added to each side.
Download figure:
Standard image High-resolution image

Let us first treat the case of incoherent-type added errors, where D_L = D_R ≡ D. These are the ones with the same disconnected error added to both operators in the partition, for example (A) from figure 8. These terms do not change the phase of the original partition, and they multiply the magnitude by (sin θ/2)^2m if m is the weight of the error added on each side. The disconnected part contains cos² θ/2 on each qubit corresponding to no disconnected errors plus many configurations of disconnected errors. The incoherent-type added errors on each qubit in the disconnected part supply the sin² θ/2 term to give 1 on the qubits not contained in the connected logical string. This reasoning applies to each incoherent-type added error that does not change how the operators O_U and O_C are decoded. In other words, if D is the disconnected error we add to O_U and O_C, we require that DO_U is an uncorrectable error.

We must be careful because in some cases the added incoherent-type errors can change how the correctable and uncorrectable errors in the partition are decoded. The added error can 'flip' the uncorrectable error to a correctable one. This means that the noise term that contributes to the logical ${\tilde {\chi }}_{{Z}_{1}I}$ component is not (DO_UρDO_C) as we would have expected but is instead (DO_CρDO_U). This term has the opposite sign relative to the expected term. This is only possible when the added error D is located very near the connected logical string and only for special partitions. We prove in lemma 11 in appendix E that the contribution from these disconnected terms is negligible.

What of the coherent-type added errors? Again, fix a partition of a connected logical string. Let O_U and O_C be the correctable and uncorrectable errors. Now consider choosing a stabilizer operator or a closed loop, ℓ that is disjoint from the connected logical string. Let the length of the loop be |ℓ|. Now choose a subset of p of the qubits in the loop, and let the disconnected error D_L act on these p qubits from the left, while the disconnected error D_R acts on the remaining |ℓ| − p qubits from the right. Suppose further that the qubits in the loop and the partition are such that the uncorrectable error O_U plus the additional error D_L remains uncorrectable. This need not always be true; we will consider the case where the O_UD_L is correctable in a moment.

Supposing that the disconnected error D_L does not change the decoding, we can perform a sum over all the ways of choosing the p errors in D_L from among the |ℓ| errors in the loop. The number of ways of choosing p errors is given by a binomial coefficient, and the magnitude of each term is suppressed by (sin θ/2)^|ℓ| relative to the original partition of the connected logical string without any additional disconnected errors added. The phase of each term depends, as always, on the relative weight of the errors on the right and the left. The disconnected part contributes a phase of (i)^p(−i)^|ℓ|−p, and ℓ is a closed loop so |ℓ| is even. The sum yields

$\begin{equation}\left(\text{Connected}\;\text{part}\right)\sum _{p=0}^{\vert \ell \vert }{\left(-1\right)}^{p}{\left(-1\right)}^{\vert \ell \vert /2}\left(\genfrac{}{}{0.0pt}{}{\vert \ell \vert }{p}\right){\left(\mathrm{sin}\;\theta /2\right)}^{\vert \ell \vert }=0.\end{equation} \tag{ 195 }$

When we sum over all ways of forming disconnected terms out of the original loop ℓ, the sum is 0. This holds for any loop such that the disconnected part does not change how the connected part is decoded.

In the examples we considered in figure 8, the additional disconnected errors did not change how the connected part was decoded. This is the same condition we encountered in the discussion of incoherent-type added errors. In certain cases the error D_L that we add to the O_U side of the partition can be such that D_LO_U is a correctable operator. This means the partition is 'flipped' by the disconnected error. We account for this case in lemma 11 and prove that the contribution to the logical noise from these special disconnected terms is negligible for short logical strings.

Using lemma 11, we can neglect the added errors that change how the partition is decoded. Then we can conclude that the net contribution from coherent-type added errors is 0 and the incoherent-type added errors contribute a sin² θ/2 factor on each qubit not in the connected logical string. This implies that the 'disconnected sum' term in equation (172) is equal to 1 plus a small correction. This implies that

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}I}=\left[\sum _{\mathcal{L}}\sum _{\text{partitions}}\left({O}_{\text{U}}\rho \;{O}_{\text{C}}\right)\right]\left(1+\mathcal{E}\right)+\text{high}\;\text{weight},\end{equation} \tag{ 196 }$

where $\mathcal{L}$ is a connected, short, typical logical string, partitions refers to the partitions of $\mathcal{L}$ denoted (O_UρO_C), and $\mathcal{E}$ is a noise term. The error term satisfies

$\begin{equation}\mathcal{E}{\leqslant}\frac{16\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(1/L\right).\end{equation} \tag{ 197 }$

This error term is from lemma 11 and comes from the added errors that change how the partition is decoded. The term 'high weight' in equation (196) is the error from lemma 3 corresponding to the contributions of logical strings with length >L + 2ζ. We have not yet justified that this error is small relative to the short strings. This is because we do not have a lower bound on the short strings. The justification comes from our subsequent discussion of the incoherent logical noise components.

6.8. Incoherent sum

Now that we have simplified the sum for the coherent components of the logical noise channel, factored out the disconnected pieces, and performed the sum over syndromes for the connected pieces, we turn our attention to the incoherent logical noise components. We start by making several of the same simplifications we made in the coherent sum. Of the incoherent logical components $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{a}^{{\dagger}}\right)$ , we neglect all the terms where L_a is a logical Y operator or acts non-trivially on both encoded qubits. We retain only the terms where L_a is a logical X or Z on one of the two encoded qubits. The reason is the same as for the coherent sum. The neglected terms are much higher weight, such that the path counting excludes them. Then we have the sum

$\begin{equation}\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{a}^{{\dagger}}\right)=\sum _{s,x,y}\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{s}^{{\dagger}}\right),\end{equation} \tag{ 198 }$

where L_a is an X or Z logical operator on one of the encoded qubits and identity on the other. Again, we suppose that all the angles are equal to some fixed θ for each single-qubit rotation. We will extend to general rotations in lemma 8.

Again, we will divide each term into connected and disconnected pieces. In this discussion of the incoherent logical noise components, definition 2 must be modified. The noise terms that enter into the incoherent logical noise contain an uncorrectable error on both sides of ρ. We will need to consider two logical strings in our definition.

Definition 8. The 'connected part' of a noise term $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ is a noise term defined in the following way: let ${\mathcal{L}}_{1}$ equal L_aG_x with all topologically trivial closed loops removed and ${\mathcal{L}}_{2}$ equal L_aG_y with all trivial closed loops removed. Then let A denote the set of qubits $\subset {\mathcal{L}}_{1}\cup {\mathcal{L}}_{2}$ where either E_sL_aG_x or E_sL_aG_y, or both, act non-trivially. The connected part of $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{S}^{{\dagger}}\right)$ is given by

$\begin{equation}\left(\left({E}_{s}{L}_{a}{G}_{x}\right){\vert }_{A}\;\rho \;\left({G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{s}^{{\dagger}}\right){\vert }_{A}\right).\end{equation} \tag{ 199 }$

|_A denotes the restriction of an operator to the set of qubits A.

If the incoherent term is (O_UρO_U'), then this definition captures the set of qubits in the support of O_U or O_U' that lie along the two logical strings formed by O_U and E_s and O_U' and E_s pruned of all trivial closed loops. Figure 9 illustrates the connected and disconnected part of a noise term that enters into the incoherent logical noise. The connected part of the noise term in the figure features factors of sin θ/2 cos θ/2 for the qubits that appear in exactly one of O_U or O_U' and sin² θ/2 for the qubits that appear in both O_U and O_U'. We can lower bound the connected part of each incoherent noise term by ${\left(\mathrm{sin}\;\theta /2\;\mathrm{cos}\;\theta /2\right)}^{\vert {O}_{\text{U}}\vert +\vert {O}_{\text{U}}^{\prime }\vert }$ . This will be useful later on when we sum over many possible choices for the operators O_U and O_U'.

**Figure 9.** A noise terms (O_UρO_U') is shown with O_U in red and O_U' in blue. The standard correction E_s chosen by the minimal-weight decoder is drawn as a dotted black line. The connected part of this noise term is signified by the orange qubits, while the disconnected part contains the black qubits.
Download figure:
Standard image High-resolution image

Definition 9. The "disconnected part" of a noise term $\left({E}_{s}{L}_{a}{G}_{x}\rho \;{G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{s}^{{\dagger}}\right)$ is the restriction of the noise term to the qubits not in the connected part. In definition 8 we constructed the set A, which contained the qubits in the connected part. The disconnected part is given by

$\begin{equation}\left(\left({E}_{s}{L}_{a}{G}_{x}\right){\vert }_{{A}^{C}}\;\rho \;\left({G}_{y}^{{\dagger}}{L}_{a}^{{\dagger}}{E}_{s}^{{\dagger}}\right){\vert }_{{A}^{C}}\right),\end{equation} \tag{ 200 }$

where ${\vert }_{{A}^{C}}$ denotes the restriction of an operator to the complement of the set A.

For the example in figure 9, the disconnected part features factors of sin θ/2 cos θ/2 for the six qubits along the trivial closed loop near the top of the figure and cos² θ/2 for the rest of the qubits. For a given connected part, we can imagine adding disconnected errors to form many different noise terms. The connected part contains factors of sin θ/2 cos θ/2 for each qubit that appears in one of the uncorrectable errors and sin² θ/2 for each qubit that appears in both errors. The disconnected term includes cos² θ/2 for each qubit not in the connected part plus a sum over all possible coherent and incoherent-type disconnected error. Just as in section 6.7, when the disconnected errors do not change how the connected term is decoded, the incoherent-type errors give cos² θ/2 + sin² θ/2 = 1 on qubits not in the connected part. The coherent-type disconnected errors, which form loops split between left and right, sum to zero because of the alternating signs.

Just as in the case of the coherent logical noise components, some disconnected errors will not be allowed because they change how the connected term is decoded. We will set the disconnected part equal to 1 plus an error term that comes from these disallowed disconnected errors. In lemma 12 we justify this by proving that the error term is small. This is analogous to lemma 11, where we prove that the disconnected part of the coherent logical noise components is equal to 1 up to small corrections.

We want to continue to follow a similar argument to the one for the coherent terms. The next step is restricting the set of connected terms we consider. We will break up each error into connected and disconnected pieces and restrict ourselves to noise terms with low-weight connected part, where the total weight of the connected part is bounded by L + 2ζ + 1; here ζ is the same L-independent cutoff as in the coherent sum. Just as for the analysis of the coherent logical noise in section 6.5, we will require θ to scale like 1/L to justify this truncation of the noise terms contributing to the connected part.

Lemma 5. Consider an incoherent logical noise component, say ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ . We write this logical noise component as a sum over physical noise terms (O_UρO_U'). Then if |sin θ| < 1/L, we can truncate the sum to include only those noise terms where |O_U| + |O_U'| ⩽ L + 2ζ + 1, where ζ is the same cutoff constant as in lemma 3. In other words,

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}{Z}_{1}}=\sum _{{O}_{\text{U}},{O}_{\text{U}}^{\prime }:\;\vert {O}_{\text{U}}\vert +\vert {O}_{\text{U}}^{\prime }\vert {\leqslant}L+2\zeta +1}\left({O}_{\text{U}}\rho \;{O}_{\text{U}}^{\prime }\right){\times}\mathrm{d}\mathrm{i}\mathrm{s}\mathrm{c}\mathrm{o}\mathrm{n}\mathrm{n}\mathrm{e}\mathrm{c}\mathrm{t}\mathrm{e}\mathrm{d}{\times}\left(1+\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right)\right).\end{equation} \tag{ 201 }$

Proof. We split each noise term into connected and disconnected parts. We show in lemma 12 that the disconnected part is decreasing as the weight of the connected part increases. Moreover, the disconnected part is approximately equal to 1 for connected terms with total weight ⩽L + 2ζ + 1. Therefore, we need only consider the connected part as we proceed to truncate the sum and upper bound the error.

Let us denote the connected part of a noise term that enters into the logical ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ component by (O_UρO_U'). All such noise terms have the shape drawn in figure 10. The connected part is supported on a series of loops. The loops are joined by the minimal-weight correction. Denote the minimal-weight correction of O_U and of O_U' by E_s. Let w be the weight of O_U and w' the weight of O_U'. Suppose w ⩽ w'. If not, then swap O_U and O_U' in what follows. Let ${E}_{s}{O}_{\text{U}}=\mathcal{L}$ . Then $\mathcal{L}$ is a connected Z₁ logical string with length at most 2w − 1. The number of such logical strings is upper bounded by L^2w−L using our bound on the number of logical strings in equation (175).

The number of ways of choosing a weight w operator O_U as a subset of $\mathcal{L}$ is upper bounded by 2^2w−2. Now that O_U is fixed, the number of ways of choosing the operator O_U' is upper bounded by $\mathcal{O}\left({L}^{{w}^{\prime }-w}\right)$ . This is because the lowest-weight operator with the same syndrome and logical action has weight ⩽w. Then O_U' consists of this lowest-weight operator combined with a number of additional deviations like we considered to derive equation (175). (Here we are neglecting a factor which is polynomial in w and w'; bounding the exponential dependence on w' − w will suffice for what follows.) All together we have the following upper bound on the number of noise terms with fixed w and w':

$\begin{equation}{\leqslant}{2}^{2w-2}{L}^{2w-L}{L}^{{w}^{\prime }-w}.\end{equation} \tag{ 202 }$

Each of these terms has magnitude at most (sin θ/2)^w+w', which is positive because w + w' is even. As in lemma 3 we will truncate the sum and keep only those connected noise terms with w + w' ⩽ L + 2ζ + 1. If we let w + w' = w_total, for each w_total there are several combinations of w and w' with the same total. Because w and w' must be >(L + 1)/2, there are less than w_total − L combinations. We perform a sum over w_total from L + 2ζ + 1 up to the maximum weight. Therefore, if we let denote the contribution from the higher weight connected terms to ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ , then is bounded by

$\begin{equation}{\epsilon}{\leqslant}\mathcal{O}\left({\left(2\;\mathrm{sin}\left(\theta /2\right)\right)}^{L+2\zeta +1}{L}^{2\zeta +2}\right).\end{equation} \tag{ 203 }$

Here we have estimated the sum over w_total using the same method as in the derivation of equation (179). We will compare this error to the contribution from the lowest weight noise terms. These terms have w = w' = (L + 1)/2, and contribute at least ξ, where

$\begin{equation}\xi =L{\left(\mathrm{sin}\;\theta \right)}^{L+1}.\end{equation} \tag{ 204 }$

Then the relative error associated with our truncation is given by

$\begin{equation}\frac{{\epsilon}}{\xi }{\leqslant}\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right).\end{equation} \tag{ 205 }$

We have neglected a polynomial factor in L in our counting of noise terms. Nevertheless, as long as L|sin θ| < 1, the relative error is exponentially small in ζ, and the higher-weight connected terms are negligible. □

**Figure 10.** This is a connected incoherent noise term, (O_UρO_U'). The operator O_U is drawn with the solid blue line. The operator O_U' is drawn with the dashed blue line. The standard correction E_s for the syndrome shared by O_U and O_U' is drawn with the solid red line. Each connected incoherent term is a configuration of loops where each loop is formed by two segments with shared endpoints, one segment from O_U and one from O_U'. The operator E_s links together the loops like beads on a string, so that O_UE_s and O_U'E_s are both continuous logical strings spanning the code.
Download figure:
Standard image High-resolution image

6.9. The incoherent sum over strings

The connected part of the incoherent components is not as simply expressed as a sum over strings as the coherent components because each uncorrectable error E_sL_aG_x can generally be completed to many different logical strings by multiplying by different correctable errors. Nevertheless, we can rewrite the sum in a similar way. This will form our primary tool for comparing coherent and incoherent logical noise components. We will write a sum over each logical string with logical action L_a. For each string, we will sum over different ways of choosing the uncorrectable subset O_U. We will restrict the subsets we consider for each logical string in order to control the over-counting factor that we will describe shortly. Then for each operator O_U, we will sum over all possible uncorrectable operators O_U' with the same syndrome and logical action. Fix a connected logical string $\mathcal{L}$ with length ℓ and choose an uncorrectable subset of the logical string O_U with weight w. We impose two constraints on O_U: first that w ⩾ (ℓ + 1)/2 and second that the complement of O_U has the same weight as the minimal-weight correction E_s. The complement of O_U is ${O}_{\text{U}}\mathcal{L}$ , which we will denote O_C. Note that the name O_C is chosen in analogy to the way the errors were labelled in the coherent noise terms, but O_C is not a part of the incoherent noise term, which is notated (O_UρO_U'). Now that O_U is fixed, choose a second uncorrectable error O_U' with the same syndrome and with weight w'. This will produce every incoherent connected term (O_UρO_U'). However, each uncorrectable error O_U will appear many times as a subset of many different logical strings. This is the over-counting we mentioned above.

Each operator O_U can be completed to a logical string in many ways. Because of the constraints we imposed on the subset O_U, the complement, which we called O_C, must have the same weight as the minimal-weight correction to O_U, denoted E_s. Let {O_C'} be the set of possible complements. Then each possible complement O_C' ∈ {O_C'} defines a logical string O_UO_C', which will appear in the sum over strings. For each operator O_U in the sum over strings, we need to divide by the number of complements O_C' with weight |E_s|. Each incoherent logical noise component can be written as a sum over connected logical strings $\mathcal{L}$ times a disconnected factor. This form of the sum will allow us to compare with equation (172). We sum over logical strings, and for each logical string we sum over possible choices of O_U and O_U'. We divide by the over-counting factor for each O_U. This gives

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}{Z}_{1}}=\sum _{\mathcal{L}}\sum _{{O}_{\text{U}}\subset \mathcal{L}}\frac{1}{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }\sum _{{O}_{\text{U}}^{\prime }}\left({O}_{\text{U}}\rho \;{O}_{\text{U}}^{\prime }\right){\times}\text{disconnected},\end{equation} \tag{ 206 }$

where

$\begin{equation}\vert \mathcal{L}\vert =\ell ,\quad \vert {O}_{\text{U}}\vert {\geqslant}\frac{\ell +1}{2},\quad \vert {O}_{\text{C}}^{\prime }\vert =\vert {E}_{s}\vert .\end{equation} \tag{ 207 }$

To reiterate, equation (206) expresses an incoherent logical noise component as a sum over connected logical strings. For each string $\mathcal{L}$ with weight ℓ, we sum over all uncorrectable subsets O_U of weight ⩾(ℓ + 1)/2 such that the complement O_C has weight equal to the minimal-weight correction of O_U, namely E_s. For each O_U we must divide by the number of complements O_C' with the same syndrome and weight as E_s in order to cancel the over-counting. {O_C'} is the set of such operators, and |{O_C'}| is its cardinality. Finally, we sum over all uncorrectable operators O_U' with the same syndrome to produce the complete set of incoherent terms. We will prove the following lemma, which provides a lower bound on the contribution of each logical string to the incoherent logical noise component. We will apply this lemma to lower bound the incoherent logical noise strength in terms of the coherent logical noise strength.

Lemma 6. As long as |sin θ| < 1/L, we can apply lemma 5. This means that in equation (206) we can restrict to the case where |O_U| + |O_U'| ⩽ L + 2ζ + 1. Let us also suppose that |O_U| = |O_U'|. This assumption will be justified by lemma 7. Then we can pick a connected logical string $\mathcal{L}$ with $\vert \mathcal{L}\vert =\ell$ such that ℓ ⩽ L + 2ζ. $\mathcal{L}$ is a Z₁ logical string if we are calculating the ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ logical noise component. O_U is subset of $\mathcal{L}$ such that O_U is corrected to a logical Z₁ operator and |O_U| = (ℓ + 1)/2. O_U' is an operator with the same weight, syndrome, and logical action as O_U. For each fixed $\mathcal{L}$ with length ℓ ⩽ L + 2ζ, the following holds:

$\begin{equation}\sum _{{O}_{\text{U}}}\frac{1}{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }\sum _{{O}_{\text{U}}^{\prime }}1{\geqslant}\sum _{{O}_{\text{U}}}1.\end{equation} \tag{ 208 }$

Proof. For each short logical string $\mathcal{L}$ with length ℓ, we partition it into an uncorrectable operator O_U of weight w = (ℓ + 1)/2 and a correctable operator O_C of length |O_C| = |E_s|. Then we consider the alternative uncorrectable and correctable paths, O_U' and O_C', with weight w and |E_s|, respectively. The logical string $\mathcal{L}$ is short, so we can use lemmas 9 and 10. Say the logical string runs right to left across the code. We observe by studying figure 12 that we have multiple possible strings of the same weight exactly when both a vertical error and some number of adjacent horizontal errors are contained in either the correctable or uncorrectable part.

Suppose that for some partition consisting of an uncorrectable operator ${O}_{\text{U}}^{\left(1\right)}$ and a correctable operator ${O}_{\text{C}}^{\left(1\right)}$ , denote the operators with the same weight, syndrome, and logical action by ${O}_{\text{U}}^{\prime \;\left(1\right)}$ and ${O}_{\text{C}}^{\prime \;\left(1\right)}$ . Suppose $\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert { >}\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert$ . We construct a new operator ${O}_{\text{U}}^{\left(2\right)}$ by exchanging all but one of the errors in ${O}_{\text{U}}^{\left(1\right)}$ with the errors in ${O}_{\text{C}}^{\left(1\right)}$ , so that ${O}_{\text{U}}^{\left(2\right)}$ is equal ${O}_{\text{C}}^{\left(1\right)}$ plus one additional error. This is shown in figure 11, where we have kept the error on the farthest left vertical segment fixed and flipped the rest relative to the term in figure 12. For every O_U there are w possible mappings, one for each of the w choices of the single-qubit error that remains fixed. In the same way, every O_U is mapped onto by w different mappings acting on w other operators with the same logical action as O_U. Then there exists a convention that selects exactly one partner for each O_U.

We assumed that for the original ${O}_{\text{U}}^{\left(1\right)}$

$\begin{equation}\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert { >}\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert .\end{equation} \tag{ 209 }$

The mapping we described constructs a partner ${O}_{\text{U}}^{\left(2\right)}$ such that

$\begin{equation}\vert \left\{{O}_{\text{U}}^{\prime \;\left(2\right)}\right\}\vert {\geqslant}\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert { >}\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert {\geqslant}\vert \left\{{O}_{\text{C}}^{\prime \;\left(2\right)}\right\}\vert .\end{equation} \tag{ 210 }$

Then for each pair ${O}_{\text{U}}^{\left(1\right)}$ and ${O}_{\text{U}}^{\left(2\right)}$ , we can lower bound the contribution to the incoherent logical noise using

$\begin{align}\hfill \frac{\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert }+\frac{\vert \left\{{O}_{\text{U}}^{\prime \;\left(2\right)}\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime \;\left(2\right)}\right\}\vert }& {\geqslant}\frac{\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert }+\frac{\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert }{\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert }\hfill \\ \hfill & =\frac{\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}{\vert }^{2}+\vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}{\vert }^{2}}{\vert \left\{{O}_{\text{U}}^{\prime \;\left(1\right)}\right\}\vert \vert \left\{{O}_{\text{C}}^{\prime \;\left(1\right)}\right\}\vert }\hfill \\ \hfill & {\geqslant}2.\hfill \end{align} \tag{ 211 }$

Finally, we apply the lower bound to the entire sum over O_U to conclude

$\begin{equation}\sum _{{O}_{\text{U}}}\frac{\vert \left\{{O}_{\text{U}}^{\prime }\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }{\geqslant}\sum _{{O}_{\text{U}}}1.\end{equation} \tag{ 212 }$

The number of terms in the sum over O_U is at most $\left(\genfrac{}{}{0.0pt}{}{\ell }{w}\right)$ , where ℓ is the length of the logical string $\mathcal{L}$ . For typical, short logical strings the binomial coefficient will be the number of terms in the sum over O_U up to a small correction. □

**Figure 11.** A partition of a length-13 logical string is shown in a toric code with L = 9. The operator O_U is shown in solid red. The operator O_C is shown in solid blue. The alternative operators with the same weight, syndrome, and logical action, which we denoted O_U' and O_C', are drawn with dotted lines. For this partition |{O_U'}| = 2 and |{O_C'}| = 4.
Download figure:
Standard image High-resolution image

**Figure 12.** This is a partner of the partition shown in figure 11. It is another partition of the same logical string, and the errors in O_U and O_C are interchanged except for one qubit. In this case that qubit is the one that lies on the farthest left vertical segment. The error on that qubit is part of O_U in both partitions. Once again, the operator O_U is in red and the operator O_C is in blue. The alternative operators with the same weight, syndrome, and logical action are given by the dashed lines. For this partition |{O_U'}| = 12 and |{O_C'}| = 2.
Download figure:
Standard image High-resolution image

6.10. Noise terms with mismatched weight

We have already shown that we can neglect the high-weight noise terms in the incoherent logical noise components, and we can also write the incoherent logical noise components as a sum over logical strings. Next, we will show that among the low-weight noise terms, we may neglect the terms with different weight errors on each side of ρ. This is crucial to our proof that the coherence of the logical noise is suppressed. We will construct a lower bound on the incoherent logical noise components and an upper bound on the coherent logical noise components. The noise terms with mismatched weight enter with a phase of −1 whenever the difference between the weights on left and right = 2 mod 4. A large contribution from noise terms with mismatched weight could spoil our lower bound on the incoherent logical noise. Fortunately, no such contribution occurs.

Lemma 7. If |sin θ| < 1/L, then the incoherent logical noise component ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ can be written

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}{Z}_{1}}{\geqslant}\sum _{\mathcal{L}}\sum _{{O}_{\text{U}}}\frac{1}{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }\sum _{{O}_{\text{U}}^{\prime }}\left({O}_{\text{U}}\rho \;{O}_{\text{U}}^{\prime }\right){\times}\mathrm{d}\mathrm{i}\mathrm{s}\mathrm{c}\mathrm{o}\mathrm{n}\mathrm{n}\mathrm{e}\mathrm{c}\mathrm{t}\mathrm{e}\mathrm{d}{\times}\left(1+\mathcal{O}\left(L\;\mathrm{sin}\;\theta {\left.\right)}^{2\zeta }\right)\right).\end{equation} \tag{ 213 }$

The sum over $\mathcal{L}$ includes all typical, short logical strings with length ℓ such that ℓ ⩽ L + 2ζ. The sum over O_U includes uncorrectable subsets of $\mathcal{L}$ with weight (ℓ + 1)/2 such that the complement O_C has minimal weight. The sum over O_U' has the same syndrome and the same weight as O_U. The error term comes from the high-weight terms we neglected in lemma 5.

Proof. Using lemma 5, we can truncate the sum over noise terms in the incoherent logical noise component ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ to include only those noise terms with total weight ⩽L + 2ζ + 1. In doing so we make an error that is exponentially small in the cutoff ζ, assuming that the single-qubit angle of rotation θ satisfies |sin θ| < 1/L. We will use equation (206) to express the incoherent logical noise components as a sum over strings. We will begin by reviewing how we construct that form of the sum.

We denote the weight of O_U by w and O_U' by w'. We upper bound the mismatched-weight terms where w ≠ w' by letting w > w' and multiplying by two. As in equation (206) we can generate the complete set of connected incoherent terms with fixed w and w' by summing over connected logical strings $\mathcal{L}$ . Denote the length of the logical string by $\vert \mathcal{L}\vert =\ell$ . To produce the incoherent terms with fixed w + w', it will suffice to sum over logical strings with ℓ < w + w'. We already restricted to low-weight terms, so w + w' ⩽ L + 2ζ + 1. For each logical string, we sum over the uncorrectable subsets O_U with weight w. We will also require that the complement of O_U, which we denoted O_C, has minimal weight. This is to control the over-counting of each incoherent term (O_UρO_U'). Then for each O_U, we sum over the operators O_U' with weight w' that have the same syndrome and logical action as O_U. As discussed in section 6.9, we must also divide by an over-counting factor 1/|{O_C}| that is a function of O_U and equals one over the number of times O_U appears in the sum over $\mathcal{L}$ . The contribution to the incoherent logical noise is lower bounded by

$\begin{equation}\text{Contribution}\;\text{from}\;\text{string}\;\mathcal{L}{\geqslant}\sum _{{O}_{\text{U}}}\frac{\vert \left\{{O}_{\text{U}}^{\prime }\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }{\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{w+{w}^{\prime }}.\end{equation} \tag{ 214 }$

The inequality comes from the cosine factors. If the operators O_U and O_U' act on the same set of qubits, then we have (sin θ/2)^2w with no cosine factors in the connected part. The lower bound corresponds to the case where O_U and O_U' act on disjoint sets of qubits, and we pick up a cosine factor on each qubit in the connected part. We also have an upper bound:

$\begin{equation}\text{Contribution}\;\text{from}\;\text{string}\;\mathcal{L}{\leqslant}\sum _{{O}_{\text{U}}}\frac{\vert \left\{{O}_{\text{U}}^{\prime }\right\}\vert }{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }{\left(\mathrm{sin}\left(\theta /2\right)\right)}^{w+{w}^{\prime }}.\end{equation} \tag{ 215 }$

In this bound, we have ${\left(\mathrm{sin}\left(\theta /2\right)\right)}^{w+{w}^{\prime }}$ . This corresponds to the case when O_U and O_U' act on the same qubits, yielding no cosine terms.

Consider first the terms with w = w'. These are generated from logical strings of length ℓ ⩽ 2w − 1. Some strings $\mathcal{L}$ with length ℓ such that L ⩽ ℓ ⩽ L + 2ζ have typical shape and some do not. We will prove first that the contribution from a given string of atypical shape is no greater than that of a string of typical shape, in fact it will be less. We will conclude that we can safely neglect the contribution from strings of atypical shape (because there are fewer such strings). This is the same simplification we made in our discussion of the coherent logical noise components in section 6.6.

We require that O_U is an uncorrectable error, so we cannot choose any subset of w qubits in $\mathcal{L}$ . We ignore the subsets that correspond to exceptional partitions like we discussed in section 6.6. Now, if we have two connected logical strings of the same length, one with a typical shape and one with an atypical shape, we want to compare the terms with w = w'. The first thing we notice is that exceptional terms are exponentially unlikely for the string with typical shape, while for the atypical string, exceptional terms may be a significant fraction of the total partitions. This tells us that in the sum over O_U there are many more terms for the typical string than for the atypical string. We have argued about the number of terms in the sum in equations (214) and (215), but we must also consider the magnitude of each term, which is given by the ratio of |{O_U'}| over |{O_C'}|.

We must argue that after summing the ratio of {O_U'} and {O_U'} over O_U, the result is less for an atypical string than for a typical string. {O_U'} here is the set of uncorrectable operators with the same weight and syndrome as O_U and {O_C'} is the set of correctable weight-ℓ − w operators with the same syndrome. Suppose the logical string runs left to right across the code. The set {O_U'} contains more than one element whenever O_U contains a set of contiguous qubits around one or more of the vertical steps in the logical string. This was discussed in detail in section 6.9. |{O_U'}| and |{O_C'}| are large when either O_U or O_C contain contiguous sets of qubits around the vertical steps. The typical logical string has at least $\gamma \sqrt{L}$ horizontal steps around each of the vertical steps. The atypical string does not. This means that the typical string has more possible sets of qubits around each vertical step that make |{O_U'}| or |{O_C'}| large. Therefore, |{O_U'}| and |{O_C'}| will tend to be larger for the typical string. The ratio of |{O_U'}| to |{O_C'}| is what determines the contribution to the incoherent logical noise. In lemma 6 we showed how we can match up terms such that for each O_U satisfying |{O_U'}|/|{O_C'}| = c, the partner has |{O_U'}|/|{O_C'}| ⩾ 1/c. If c is large, then $c+\frac{1}{c}\gg 2$ . It follows that because the typical string has more operators O_U in the sum and the |{O_U'}|/|{O_C'}| factors tend to be larger, the contribution to the incoherent logical noise is smaller for an atypical string than the contribution from a typical string. When we combine this fact with the fact that the atypical strings represent an small minority, this means we can neglect the atypical strings among the w = w' terms in the incoherent logical noise. The error is given by lemma 10.

Now, consider the mismatched-weight terms that are the subject of this lemma. Fix w + w' and suppose w > w'. For each $\mathcal{L}$ we can construct a number of incoherent terms with mismatched weight depending on the length and shape of $\mathcal{L}$ . Let $\vert \mathcal{L}\vert =\ell$ . Once again O_U is an uncorrectable subset of the logical string $\mathcal{L}$ with weight w. Then for each O_U we have the possibility that there may exist an operator O_U' with the same syndrome as O_U and lower weight. We sum over the set of O_U such that for each O_U there exists an O_U' with weight w'. If we sum over all logical strings of length <w + w', we produce every connected incoherent term with |O_U| = w and |O_U'| = w'. We will proceed by fixing a logical string and upper bounding the sum over w of the noise terms derived from this logical string with w + w' fixed. In this sum the terms will alternate sign as w increases. The terms with w = w' have a positive sign. As we seek to bound the contribution of these mismatched-weight terms to the incoherent logical noise, there are two things we need to bound. First, we must understand the combinatorics that govern the number of operators O_U that permit lower-weight O_U'. Second, we must bound the factor |{O_U'}|/|{O_C'}| for each such O_U.

Suppose that the logical string $\mathcal{L}$ has a typical shape. To be concrete, consider the set of operators O_U with weight w = (ℓ + 3)/2. If there exist lower-weight O_U', then O_U must contain all of the qubits around a 'cap', which is similar condition to the one we discussed in section 6.6. Each cap has width at least $\gamma \sqrt{L}$ because the string is typical. This means that such O_U are exponentially few relative to the total set of uncorrectable O_U with weight w. This is the same calculation as in lemma 4. We compare these terms to the terms with |O_U| = (ℓ + 1)/2 = |O_U'|. There are exponentially more of these terms where |O_U| = |O_U'|. This means that in equations (214) and (215) the sum over O_U contains exponentially more terms when w = w' for a typical logical string. The summand also tends to be less for the w > w' terms. The argument is similar to the one we used earlier when we were discussing the w = w' terms from typical and atypical strings. |{O_U'}| is large when O_U contains many qubits around several of the different steps up and down along the logical string. In this case the terms with w > w' feature operators O_U that contain at least one of the 'caps' along the logical string. This removes at least two of the vertical steps. These steps cannot contribute to |{O_U'}|. Then by the argument we used above, the ratio |{O_U'}|/|{O_C'}| tends to be less for the w > w' terms relative to the w = w'. We chose w = (ℓ + 3)/2, but we could have chosen any w > (ℓ + 1)/2 and any w' < w. We would find that there are ${2}^{\gamma \sqrt{L}\left(w-{w}^{\prime }\right)}$ fewer of the mismatched weight terms. We have a factor of ${2}^{\gamma \sqrt{L}}$ for each cap contained in O_U. We conclude that the mismatched-weight terms are negligible for strings of typical shape.

Finally, consider a logical string $\mathcal{L}$ with an atypical shape. Fix w + w'. We already neglected the contribution of atypical strings to the w = w' terms. We seek a bound on the contribution from the terms with w > w' for atypical strings. We will compare two sets of terms for fixed $\mathcal{L}$ with length ℓ. On the one hand, take the terms with w = w₁ for some w₁ > (ℓ − 1)/2 and w' = w₂ < w₁. On the other hand, take the terms with w = w₁ + 1 and w' = w₂ − 1. We will show that the latter set of terms contribute less than the former. This will tell us that the sum over mismatched-weight terms for the fixed string $\mathcal{L}$ is bounded by the contribution from terms with |O_U| = |O_U'|.

When O_U' has lower weight than O_U, O_U must contain all the qubits along a cap. If O_U − O_U' = 2j, then O_U must contain at caps with total height at least j. Because the logical string $\mathcal{L}$ has an atypical shape, these caps may have width one or height greater than one. It will not be exponentially unlikely that all qubits around a small cap are contained in O_U. For the terms with w = w₁ and w' = w₂, relative to O_U', O_U contains all the qubits around w₁ − w₂ of the caps. These terms will be a fraction of the $\left(\genfrac{}{}{0.0pt}{}{\ell }{{w}_{1}}\right)$ subsets of w₁ qubits in the logical string $\mathcal{L}$ . We compare these terms to the ones with w = w₁ + 1 and w' = w₂ − 1 keeping our logical string $\mathcal{L}$ fixed. These terms include all the qubits around an additional cap. On one of the caps, instead of containing at least one and less than all of the qubits, O_U contains all of the qubits around that cap. This stricter condition of O_U means that the fraction of the total $\left(\genfrac{}{}{0.0pt}{}{\ell }{{w}_{1}+1}\right)$ weight w₁ subsets of $\mathcal{L}$ that feature an O_U' operator with weight w₂ − 1 is smaller than the fraction of the total $\left(\genfrac{}{}{0.0pt}{}{\ell }{{w}_{1}}\right)$ weight w₁ subsets of $\mathcal{L}$ that feature an O_U' operator with weight w₂. This means that in equations (214) and (215) in the sum over O_U for our fixed logical string, the number of possible O_U at a given weight w > (ℓ + 1)/2 is given by a binomial coefficient times a function that decreases monotonically as w increases. As for the summand |{O_U'}|/|{O_U'}|, we apply the same reasoning as above. For each cap contained in O_U, there are fewer vertical steps to create many operators O_U'. This implies that the summand will tend to be smaller as w increases. The sum over the different values of w has the form

$\begin{equation}\sum _{w=c}^{\ell }{\left(-1\right)}^{w}f\left(w\right)\left(\genfrac{}{}{0.0pt}{}{\ell }{w}\right){< }\left(\genfrac{}{}{0.0pt}{}{\ell }{c}\right),\end{equation} \tag{ 216 }$

where c ⩾ (l + 1)/2 and f is a monotonically decreasing function. The inequality in equation (216) is proven by pairing the adjacent terms in the sum, positive and negative, to produce small positive contributions bounded by the contributions in the case where f(w) = 1 for all w. It follows that the sum over the mismatched-weight terms derived from the logical string $\mathcal{L}$ is positive, and moreover is bounded by the w = w' terms. We already argued that the w = w' terms from atypical strings are negligible relative to those terms from typical strings. Finally, we can lower bound the incoherent logical noise component by neglecting the atypical strings and for the typical strings, neglecting the mismatched-weight terms. This yields equation (213). □

We are left with only the incoherent terms that have the same weight of uncorrectable error on each side and the weight is ${\leqslant}\frac{L+2\zeta +1}{2}$ . These terms all have the same phase +1, so the incoherent terms with different weights will add constructively. This gives us lower bounds on the logical incoherent noise strength. Each logical string with length ℓ ⩽ L + 2ζ contributes at least $\left(\genfrac{}{}{0.0pt}{}{\ell }{\frac{\ell +1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell +1}$ . When ℓ is much larger than the minimum logical string length, L, the number of logical strings is given by equation (173).

In particular, the incoherent logical noise components must be larger than the lowest order term. This at last completes the argument begun in section 6.5 about neglecting the contribution to the coherent logical noise from connected logical strings with length >L + 2ζ for a cut-off constant ζ. In lemma 3 we proved that the contribution from long logical strings is upper bounded by αL^2ζ+1|sin θ|^L+2ζ, where α is (1 − L|sin θ|)⁻¹. This bound is exponentially small in ζ relative to the lowest order incoherent logical noise component, $L\left(\genfrac{}{}{0.0pt}{}{L}{\frac{L+1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{L+1}$ . Our aim is to compare the logical coherent and incoherent noise components, and we have shown that the contribution to the coherent logical noise from long strings is small relative to the incoherent logical noise components. Therefore, we can safely neglect the long connected logical strings. The same applies for the truncation error in lemma 5. The truncation error is negligible relative to the lowest order incoherent terms for large enough ζ.

6.11. More general rotation angles

In sections 6.4 and 6.8, we simplified the problem by assuming that all qubits are rotated by the same single-qubit unitary rotation. Now we want to extend our result to more general single-qubit rotations. We will allow the magnitude of the rotation angle to vary from qubit to qubit and will also allow different axes of rotation for different qubits. Here we will assume that each rotation axis is contained in the X–Z plane. Physical Y errors are treated in appendix G, where we prove that rotations partly along the Y-axis produce less coherent logical noise channels than those arising from rotations along axes in the X–Z plane.

The idea of the proof is the same as that of lemma 2. We will consider the coherent and incoherent logical noise components as functions of individual qubit rotation angles and prove that the coherent component is maximized relative to the incoherent component when all rotation angles are equal.

Lemma 8. Consider the toric code with qubits subject to single-qubit rotations, where each rotation axis lies in the X–Z plane, and both the rotation axis and angle of rotation may vary from qubit to qubit. The bound on the coherence of the logical noise channel proved in theorem 3 continues to apply if the rotations are sufficiently close to uniform; that is, provided that each rotation axis and angle deviates from a fixed constant value within a bounded region.

Proof. Suppose at first that all rotations are about the Z-axis and denote the rotation angle for the ith qubit by θ_i. Each logical coherent or incoherent component is a sum of physical noise terms, which are functions of all the angles θ_i. We will refer to the coherent or incoherent logical noise strength; by this we mean the sum of norms squared of the off-diagonal or diagonal components of the chi matrix for the logical noise channel. We are interested in the coherence of the logical noise channel, that is, the relative magnitude of the coherent and incoherent logical noise strength. Our approach will be to fix the coherent logical noise strength and calculate how the incoherent logical noise strength varies as we change rotation angles while remaining in the submanifold with constant coherent logical noise strength.

We begin at a point where all single-qubit rotation angles are equal. Suppose that this rotation angle is >0. The proof will be similar if the angle is <0. We will perturb away from this point, moving along the submanifold with fixed coherent logical noise strength. These perturbations can be built out of small elementary steps, in which two qubits, i and j, are selected. We require that θ_i ⩾ θ_j. Then the elementary step consists of increasing θ_i by some amount and decreasing θ_j such that we remain on the submanifold with constant coherent logical noise strength. We will prove that such elementary steps increase the incoherent logical noise strength. Therefore, we will conclude that the coherence of the logical noise is maximized when all single-qubit rotation angles are equal. Our calculation will be limited to configurations of angles not too far from the point where all angles are equal.

In lemmas 3 and 5, we proved that when all the rotation angles are equal and satisfy |sin θ| < 1/L, the logical noise is dominated by the contributions of the low-weight connected terms. We bounded the absolute magnitude of the sum over high-weight connected terms. These high-weight terms were negligible relative to the low-weight connected terms in the incoherent logical noise. If the rotation angles are allowed to differ, so long as all the angles θ_i satisfy |sin θ_i| < 1/L, our upper bound on the absolute magnitude of the error from the high-weight terms continues to hold. We require that this error is negligible relative to the low-weight terms we keep in the incoherent logical noise components. This was true when all angles were equal and will continue to be true for a wide range of configurations; only certain edge cases will violate this condition. For instance, one such edge case arises if all the rotation angles are 0 except for the qubits along a long logical string with a shape such that it contains no low-weight uncorrectable subsets.

We previously defined the connected and disconnected parts of a noise term (definitions 2, 3, 8, and 9). As we described in section 6.7 and lemmas 11 and 12, the disconnected part has a value of 1 up to corrections. These corrections are small for low-weight connected terms when all rotation angles are equal. If the angles are different, we can still apply our analysis, so long as the absolute error from the corrections is small relative to the low-weight connected terms in the incoherent logical noise components. This holds in a region around the point where all angles are equal. Hence, in this proof we will compare only the connected terms in the coherent and incoherent logical noise components with the understanding that the error terms we are neglecting are small relative to the low-weight connected terms we have kept.

We can build any general perturbation out of an elementary (non-infinitesimal) perturbation where we increase one rotation angle θ_i and decrease a second angle, θ_j, such that the connected contribution to the coherent logical noise strength is unchanged. The perturbation will look different depending on how the two qubits are positioned. If the qubits i and j are adjacent to each other and aligned in the correct direction, they will appear together in many short logical strings. Otherwise, i and j will not appear together in short logical strings. Throughout this section, we will approximate sin θ_i/2 ≈ θ_i/2 to simplify the equations. We will incur a relative error of ${\theta }_{i}^{2}/4$ , that will always be small, since we have assumed that |sin θ_i| < 1/L for every i. Then the contribution to the coherent logical noise strength from the low-weight connected terms as a function of θ_i and θ_j is

$\begin{equation}\text{Coherent}={\gamma }_{0}+{\gamma }_{1}\left({\theta }_{i}+{\theta }_{j}\right)+{\gamma }_{2}{\theta }_{i}{\theta }_{j}.\end{equation} \tag{ 217 }$

The coherent logical noise strength is a sum of norms squared and is therefore positive. This implies that γ₀ > 0. Moreover, the sum over partitions has the same phase for every short logical string as in equation (192). This means that each logical string makes a positive contribution to the noise strength. Therefore, γ₁ and γ₂ are both non-negative. The relative size of γ₁ and γ₂ depends on how close the two chosen qubits i and j are. When i and j are both along the same horizontal or vertical line, many low-weight logical strings will contain both qubits. These strings contribute to γ₂, so that the γ₂ term may be comparable to the γ₁ term. On the other hand, if qubits i and j are not along a horizontal or vertical line, then none of the minimal-weight logical strings contain both qubits. Also, for any fixed length ℓ ⩽ L + 2ζ, the number of logical strings of length ℓ that contain both qubits i and j is negligible relative to the number of length ℓ logical strings that contain qubit i and not qubit j. In this case the γ₂ term is negligible relative to the γ₁ term. In either case, we can write down the perturbation that leaves the coherent logical noise strength unchanged. Let θ_i = c_iθ and θ_j = c_jθ for some θ, and then we will solve for c_j such that the connected coherent sum is constant. This yields

$\begin{equation}{c}_{j}=\frac{\left(2-{c}_{i}\right){\gamma }_{1}+{\gamma }_{2}\theta }{{\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta },\end{equation} \tag{ 218 }$

so that when γ₂ = 0, we have c_j = 2 − c_i.

We can expand the incoherent logical noise strength in the same way. The noise terms that enter into the incoherent logical noise have the form (O_UρO_U'). As we expand in the angles θ_i and θ_j, we have cases where the qubits i and j are contained in neither, one of, or both O_U and O_U':

$\begin{equation}\text{Incoherent}={\delta }_{0}+{\delta }_{1}\left({\theta }_{i}^{2}+{\theta }_{j}^{2}\right)+{\delta }_{2}{\theta }_{i}^{2}{\theta }_{j}^{2}+{\delta }_{3}\left({\theta }_{i}+{\theta }_{j}\right)+{\delta }_{4}{\theta }_{i}{\theta }_{j}+{\delta }_{5}\left({\theta }_{i}^{2}{\theta }_{j}+{\theta }_{i}{\theta }_{j}^{2}\right).\end{equation} \tag{ 219 }$

By lemma 5, the contributions of high-weight logical strings to the logical incoherent noise components are negligible. The contributions for each short logical string are positive due to lemma 7. If we fix a short logical string that contains both qubit i and qubit j and require that O_U and O_U' both contain i and j or one of i and j, then the same proof as in lemma 7 implies that these contributions are positive. Therefore, the coefficients δ₁ and δ₂ are positive when all rotation angles are equal. We can now substitute the perturbation from equation (218) into each of the terms in equation (219). We compute the perturbed value of the incoherent logical noise strength and subtract the initial value when all the angles of rotation are equal. Let incoherent(a) denote the value of the incoherent term in equation (219) with c_i = a. The difference between the perturbed and initial values is

$\begin{align}\hfill \mathrm{I}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}\mathrm{e}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}\left({c}_{i}\right)-\mathrm{i}\mathrm{n}\mathrm{c}\mathrm{o}\mathrm{h}\mathrm{e}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}\left(1\right)& ={\delta }_{1}\frac{{\left({c}_{i}-1\right)}^{2}\left(2{\gamma }_{1}^{2}{\theta }^{2}+2\left(2+{c}_{i}\right){\gamma }_{1}{\gamma }_{2}{\theta }^{3}+\left({c}_{i}^{2}+2{c}_{i}+1\right){\gamma }_{2}^{2}{\theta }^{4}\right)}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}\hfill \\ \hfill & \quad -\;{\delta }_{2}\frac{{\left({c}_{i}-1\right)}^{2}\left({\gamma }_{1}^{2}{\theta }^{4}\left(2-{\left({c}_{i}-1\right)}^{2}\right)+2{c}_{i}{\gamma }_{1}{\gamma }_{2}{\theta }^{5}\right)}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}\hfill \\ \hfill & \quad +\;{\delta }_{3}\frac{{\left({c}_{i}-1\right)}^{2}\left({\gamma }_{1}{\gamma }_{2}{\theta }^{2}+{c}_{i}{\gamma }_{2}^{2}{\theta }^{3}\right)}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}-{\delta }_{4}\frac{{\left({c}_{i}-1\right)}^{2}\left({\gamma }_{1}^{2}{\theta }^{2}+{c}_{i}{\gamma }_{1}{\gamma }_{2}{\theta }^{3}\right)}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}\hfill \\ \hfill & \quad -\;{\delta }_{5}\frac{{\left({c}_{i}-1\right)}^{2}\left(2{\gamma }_{1}^{2}{\theta }^{3}+{c}_{i}^{2}{\gamma }_{1}{\gamma }_{2}{\theta }^{4}-{c}_{i}{\gamma }_{2}^{2}{\theta }^{5}\right)}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}.\hfill \end{align} \tag{ 220 }$

We see that the δ₁ term is positive for all c_i > 1. We will show that the positive terms are larger than the other terms for all c_i > 1. Each term has the same denominator and contains a factor of ${\left({c}_{i}-1\right)}^{2}$ in the numerator. This means immediately that the first derivative with respect to c_i vanishes at the point c_i = 1. We pull out the shared factors of $\frac{{\left({c}_{i}-1\right)}^{2}}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}$ and rearrange terms in equation (220):

$\begin{align}\hfill & =\frac{{\left({c}_{i}-1\right)}^{2}}{{\left({\gamma }_{1}+{c}_{i}{\gamma }_{2}\theta \right)}^{2}}\left({\gamma }_{1}^{2}\left(2{\theta }^{2}{\delta }_{1}-2{\theta }^{4}{\delta }_{2}-{\theta }^{2}{\delta }_{4}-2{\theta }^{3}{\delta }_{5}\right)+{\gamma }_{1}{\gamma }_{2}\left(4{\theta }^{3}{\delta }_{1}+{\theta }^{2}{\delta }_{3}\right)\right.\hfill \\ \hfill & \left.\quad +\;{\gamma }_{1}{\gamma }_{2}\left(2{c}_{i}{\theta }^{3}{\delta }_{1}-2{c}_{i}{\theta }^{5}{\delta }_{2}-{c}_{i}{\theta }^{3}{\delta }_{4}\right)+{\gamma }_{2}^{2}\left(2{c}_{i}{\theta }^{4}{\delta }_{1}+{c}_{i}{\theta }^{3}{\delta }_{3}+{c}_{i}{\theta }^{5}{\delta }_{5}\right)\right.\hfill \\ \hfill & \left.\quad +\;{\gamma }_{1}^{2}{\left({c}_{i}-1\right)}^{2}{\theta }^{4}{\delta }_{2}+{\gamma }_{2}^{2}\left({c}_{i}^{2}+1\right){\theta }^{4}{\delta }_{1}-{\gamma }_{1}{\gamma }_{2}{c}_{i}^{2}{\theta }^{4}{\delta }_{5}\right).\hfill \end{align} \tag{ 221 }$

If the conditions,

$\begin{align}\hfill 2{\theta }^{2}{\delta }_{1}& { >}2{\theta }^{4}{\delta }_{2}+{\theta }^{2}\vert {\delta }_{4}\vert +2{\theta }^{3}\vert {\delta }_{5}\vert ,\hfill \\ \hfill 4{\theta }^{3}{\delta }_{1}& { >}{\theta }^{2}\vert {\delta }_{3}\vert ,\hfill \\ \hfill {\theta }^{4}{\delta }_{1}& { >}{\theta }^{3}\vert {\delta }_{3}\vert +{\theta }^{5}\vert {\delta }_{5}\vert ,\hfill \\ \hfill \mathrm{a}\mathrm{n}\mathrm{d}\quad {\gamma }_{2}^{2}\left({c}_{i}^{2}+1\right){\theta }^{4}{\delta }_{1}+{\gamma }_{1}^{2}{\left({c}_{i}-1\right)}^{2}{\theta }^{4}{\delta }_{2}& { >}{\gamma }_{1}{\gamma }_{2}{c}_{i}^{2}{\theta }^{4}\vert {\delta }_{5}\vert ,\hfill \end{align} \tag{ 222 }$

are satisfied, then each line of equation (221) is greater than 0. Recall that γ₁, γ₂, δ₁, and δ₂ are non-negative. Each elementary perturbation increases the value of c_i. Therefore, if the conditions in equation (222) are satisfied, then each elementary step along the submanifold with constant coherent noise strength increases the incoherent logical noise strength. It remains for us to argue that these conditions are satisfied when the rotation angles are close to equal.

Consider two cases for the relative positions of qubits i and j. In the first case, suppose qubits i and j are positioned so that no short logical strings contain both. Then the strings that contribute to γ₂ as well as δ₂, δ₄, and δ₅ are all long. Such strings do not appear in the sum over low-weight connected terms, so γ₂ = 0 in equation (221). Therefore, the only condition is the first line of equation (222). In this inequality δ₂, δ₄ and δ₅ are 0, and the condition is satisfied.

In the other case, qubits i and j are in a horizontal or vertical line so that both qubits are contained in several short logical strings. In this case θγ₂ is comparable to γ₁. Now consider the incoherent contribution from the strings that contain both qubits i and j. For each weight-(2w − 1) logical string containing both qubits i and j, the errors O_U are weight-w subsets of the logical string. Nearly half will contain exactly one of qubits i and j and one quarter will contain both qubits i and j. This means that more terms contribute to θ²δ₁ and θδ₃ than to θ⁴δ₂, θ²δ₄, and θ³δ₅. In particular, θ²δ₁ > 2θ⁴δ₂. For each O_U, the set {O_U'} contains operators with the same syndrome, logical action and weight. O_U' differs from O_U only near certain transverse steps along the logical string as we described in section 6.9. For most logical strings, if O_U does not contain qubit i, then most of the operators O_U' also will not. If O_U contains qubit i, most operators O_U' will as well. This implies that

$\begin{align}\hfill {\theta }^{2}{\delta }_{1}& \gg \theta \vert {\delta }_{3}\vert ,\hfill \\ \hfill {\theta }^{4}{\delta }_{2}& \gg {\theta }^{2}\vert {\delta }_{4}\vert ,\hfill \\ \hfill {\theta }^{4}{\delta }_{2}& \gg {\theta }^{3}\vert {\delta }_{5}\vert ,\hfill \\ \hfill \mathrm{a}\mathrm{n}\mathrm{d}\quad {\theta }^{2}{\delta }_{1}& \gg {\theta }^{3}\vert {\delta }_{5}\vert .\hfill \end{align} \tag{ 223 }$

Together with our earlier statement that θ²δ₁ > 2θ⁴δ₂, this implies that even when i and j are near to each other, the conditions in equation (222) are satisfied.

We have proven that each of the elementary perturbations starting from uniform angles increases the incoherent logical noise strength. However, we must also consider elementary perturbations that are applied after a different elementary perturbation has taken us away from uniform angles. In that case in equations (217) and (219) we would no longer have exact symmetry between θ_i and θ_j. In other words, equation (217) would read

$\begin{equation}\mathrm{C}\mathrm{o}\mathrm{h}\mathrm{e}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}={\gamma }_{0}+{\gamma }_{1}^{\left(i\right)}{\theta }_{i}+{\gamma }_{1}^{\left(j\right)}{\theta }_{j}+{\gamma }_{2}{\theta }_{i}{\theta }_{j},\end{equation} \tag{ 224 }$

where ${\gamma }_{1}^{\left(i\right)}$ and ${\gamma }_{1}^{\left(j\right)}$ are different coefficients. However, as long as we are not too far from uniform angles, ${\gamma }_{1}^{\left(i\right)}\approx {\gamma }_{1}^{\left(j\right)}$ , and the change in the incoherent logical noise strength will be almost the same as in equation (220). We have argued that the conditions in equation (222) are satisfied by a large margin. Therefore, there exists a region around the point with uniform angles where every elementary perturbation on the submanifold with fixed coherent logical noise strength increases the incoherent logical noise strength. We conclude that in a region around the symmetric point where every single-qubit rotation is by the same angle θ, this symmetric point gives the largest coherent logical noise strength relative to the incoherent logical noise strength. This means that the connected contribution to the incoherent logical noise strength has a local minimum at the point with uniform rotations within the submanifold with constant connected contribution to the coherent logical noise strength. This implies that the upper bound on logical coherence we derive in theorem 3 for the case where all angles are equal also upper bounds the logical coherence in a region around the point where all angles are equal.

Now consider changing the axes of rotation while keeping the total rotation angle the same. Let the noise model be a rotation in the X–Z plane such that the rotation angles are θ_X and θ_Z for every qubit. We show that Y rotations on the qubits will produce less coherent logical noise in lemma 13. The X and Z rotations contribute to independent components of the logical noise channel. Then each noise term that enters into the X-type logical noise components depend on at least L powers of θ_X. Similarly, the Z-type logical noise depends on at least L powers of θ_Z. Therefore, if the total rotation angle for each qubit, $\sqrt{{\theta }_{X}^{2}+{\theta }_{Z}^{2}}$ , is fixed, the logical noise strength is greatest when either θ_X or θ_Z is 0. Also, because the X and Z-type errors contribute to different logical noise components, we can apply our analysis of coherent and incoherent logical noise components to the two types of errors separately. If both |sin θ_X| and |sin θ_Z| are <1/L, then lemmas 3 and 5 imply that the logical coherent and incoherent X and Z noise is dominated by the contributions of short logical strings. With this noise model, we will in general expect logical Y noise, but lemma 14 implies that logical Y-type errors are negligible relative to X- and Z-type errors. The θ_X rotations contribute to the X₁ and X₂ logical noise, while the θ_Z rotations contribute to the Z₁ and Z₂ logical noise. The bounds we proved for coherent and incoherent logical noise components apply equally well to the X- and Z-type noise separately in this noise model. □

Lemma 8 states that among noise models consisting of single-qubit rotations where each rotation is close to the same, the coherence of the logical channel is greatest for the noise model consisting of Z-axis rotations on every qubit by the same angle. The same does not necessarily hold for noise models consisting of single-qubit rotations with wildly different angles of rotation on each qubit. This is not surprising because if we allow for wildly different rotation angles, we encounter the case where all the rotation angles are 0 except for the qubits along some very long logical string. This kind of high-weight connected term is beyond the scope of the present work, cf lemmas 3 and 5.

6.12. Correlations

We can apply theorem 2 to study the toric code with minimal-weight decoding subject to correlated unitary noise. In the repetition code, we found that adding two-body correlations did not change the relation between coherent and incoherent components of the logical noise when the code size n is large. We can transfer that to the toric code using what we have already proven. Consider a single logical string. We can sum over its partitions. With correlated unitary noise, instead of sin θ/2 cos θ/2 for each qubit in the logical string, we have a sum over the one- and two-body couplings in the Hamiltonian, h₁ and h₂. The model of correlated noise that we considered in theorem 2 included two-body coupling terms between every pair of qubits in the code. Therefore, the magnitude of each multi-qubit error is a function only of its weight. We found that the coherent and incoherent logical noise in the toric code is dominated by the contributions from short logical strings with typical shape. In theorem 3 we will finish proving a relation between the coherent and incoherent terms that is based on the number of terms and their magnitudes, which we always assumed to be sin θ/2 raised to the weight of the terms. In the correlated case we alter the magnitude of each term, but theorem 2 tells us that string by string we can bound the coherent logical noise contribution in terms of the incoherent logical noise contribution.

6.13. Main theorem

Theorem 3. Consider the L × L toric code without boundaries subject to single-qubit unitary noise acting on each qubit. We chose minimal-weight decoding and assume that syndrome extraction is perfect. Suppose that each qubit is rotated by an angle θ about some axis and that |sin θ| < 1/L. Our conclusion will also hold for angles and axes that differ among the qubits, so long as the deviation is small as discussed in lemma 8. Let Ñ be the logical noise channel produced by encoding into the toric code, acting with single-qubit unitary noise, and then decoding. Denote by $\tilde {\chi }$ the chi matrix for the logical noise channel Ñ. Then the coherent and incoherent components of the logical channel are related by

$\begin{equation}\sum _{i,j\vert i\ne j}{\left\vert {\tilde {\chi }}_{i,j}\right\vert }^{2}{< }\frac{2}{{\left(\mathrm{sin}\;\theta \right)}^{2}}{\left(\sum _{l\vert l\ne 0}{\tilde {\chi }}_{l,l}\right)}^{2}\left(1+\mathcal{E}\right),\end{equation} \tag{ 225 }$

where

$\begin{equation}\vert \mathcal{E}\vert {\leqslant}\frac{24\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right)+\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right),\end{equation} \tag{ 226 }$

and ζ is an arbitrary L-independent constant. We denote the diamond norm distance of Ñ from the identity channel by D_◊(Ñ). It follows that

$\begin{equation}{D}_{{\diamondsuit}}\left(\tilde{N}\right){\leqslant}cr,\end{equation} \tag{ 227 }$

for a constant c given by

$\begin{equation}{c}^{2}=\frac{{\left({d}_{L}+1\right)}^{2}}{2}\left(1+\frac{1}{{\left(\mathrm{sin}\;\theta \right)}^{2}}\right)\left(1+\mathcal{E}\right),\end{equation} \tag{ 228 }$

where d_L is the dimension of the code space (d_L = 4 for the 2D toric code without boundaries). Here r is the average infidelity of the logical noise channel Ñ, and $\mathcal{E}$ is the error term bounded in equation (226). (If the logical noise channel Ñ were unitary, then D_◊(Ñ) would be proportional to $\sqrt{r}$ .) We can also consider the growth of the average infidelity as we apply the logical noise channel many times in succession. Let r_m be the average infidelity after m applications of Ñ; then using equation (225), we can write

$\begin{equation}{r}_{m}=mr\left(1+\frac{{d}_{L}}{\left({d}_{L}+1\right){\mathrm{sin}}^{2}\;\theta }\left(m-1\right)r\left(1+\mathcal{E}\right)\right),\end{equation} \tag{ 229 }$

where d_L = 4 for the 2D toric code without boundaries, $\mathcal{E}$ is the error term that is upper bounded in equation (226), and r is the average infidelity for a single application of the logical noise channel. As long as the physical noise strength is below the fault-tolerant threshold, r will be exponentially small in the code distance L. Therefore, equation (229) states that the term growing quadratically in m is exponentially small in L relative to the term growing linearly with m. In this sense, the coherence of the logical channel is heavily suppressed.

Proof. We start with a noise model consisting of Z rotations by angle θ on every qubit in the L × L block of toric code. We seek to approximate the coherent logical noise component ${\tilde {\chi }}_{{Z}_{1}I}$ and relate it to the incoherent logical noise component ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ . First, let us calculate the coherent component. We write ${\tilde {\chi }}_{{Z}_{1}I}$ as a sum over strings and partitions with a connected and disconnected part as in equation (172). Applying lemma 3 to the connected part, we neglect high-weight terms, leaving only the logical strings with length ⩽L + 2ζ for a fixed ζ, and making an error which is exponentially small in ζ. For this step, the magnitudes of the sines of the single-qubit rotation angles are required to be below a threshold value 1/L. We apply lemma 11 to argue that the disconnected part is equal to 1 up to a small correction. Lemmas 9 and 10 tell us that we can treat all short logical strings $\mathcal{L}$ as typical and make another small error. Now that we have only short typical logical strings, we apply lemma 4 to perform the sum over partitions. We conclude that

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}I}=\sum _{\mathcal{L}}{i}^{\ell }\left(\genfrac{}{}{0.0pt}{}{\ell -1}{\frac{\ell -1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell }\left(1+{\mathcal{E}}_{\mathrm{s}\mathrm{h}\mathrm{a}\mathrm{p}\mathrm{e}}\right)+{\mathcal{E}}_{\mathrm{l}\mathrm{o}\mathrm{n}\mathrm{g}},\end{equation} \tag{ 230 }$

where the sum is over all connected logical Z₁ strings $\mathcal{L}$ with length ℓ such that ℓ ⩽ L + 2ζ. The error from logical strings with length greater than L + 2ζ is bounded

$\begin{equation}\left\vert {\mathcal{E}}_{\mathrm{l}\mathrm{o}\mathrm{n}\mathrm{g}}\right\vert {\leqslant}\alpha {L}^{2\zeta +1}{\left(\mathrm{sin}\;\theta \right)}^{L+2\zeta },\end{equation} \tag{ 231 }$

where α = (1 − L sin θ)⁻¹. This error is from lemma 3. The other error term from lemmas 10 and 11 is bounded

$\begin{equation*}\vert {\mathcal{E}}_{\mathrm{s}\mathrm{h}\mathrm{a}\mathrm{p}\mathrm{e}}\vert {\leqslant}\frac{16\gamma {\zeta }^{2}}{\sqrt{L}}+O\left(1/L\right).\end{equation*}$

A further error due to neglecting exceptional partitions is subdominant according to lemma 4 and is not shown in equation (230).

Now, we will lower bound the incoherent logical noise component ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ . Lemma 5 implies that we can neglect the contributions of noise terms (O_UρO_U') such that |O_U| + |O_U'| > L + 2ζ + 1. The error we make by truncating the sum is exponentially small in ζ, so long as the rotation angle θ satisfies |θ| < 1/L. The incoherent logical noise component can be put in the form of a sum over logical strings as in equation (206). Using lemma 7, we restrict to the connected terms where the logical string $\mathcal{L}$ is short and has typical shape and |O_U| = |O_U'|. We can also keep only the terms with |O_U| = (ℓ + 1)/2, because just as in the repetition code, these higher-weight partitions are suppressed by factors of sin θ² and the binomial coefficients are decreasing as we consider higher-weight O_U. The disconnected part is equal to 1 up to small correction according to lemma 12. Finally, lemma 6 gives a lower bound on the contribution of each logical string to the incoherent logical noise. All together, we have the following lower bound on ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ :

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}{Z}_{1}}{\geqslant}\sum _{\mathcal{L}}\left(\genfrac{}{}{0.0pt}{}{\ell }{\frac{\ell +1}{2}}\right){\left(\frac{\mathrm{sin}\;\theta }{2}\right)}^{\ell +1}\left(1-\frac{8\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(1/L\right)+\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right)\right).\end{equation} \tag{ 232 }$

The error terms come from lemmas 10, 5, and 12. A subdominant error term from lemma 4 is suppressed in equation (232).

Putting together equations (230) and (232), we conclude the following about how the coherent and incoherent terms in the logical chi matrix are related:

$\begin{equation}\frac{\vert \left({\tilde {L}}_{a}\tilde {\rho }\right)\vert }{\vert \left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{a}\right)\vert }{\leqslant}\frac{1}{\vert \mathrm{sin}\;\theta \vert }\left(1+\mathcal{E}\right),\end{equation} \tag{ 233 }$

where

$\begin{equation}\vert \mathcal{E}\vert {\leqslant}\left(\frac{24\gamma {\zeta }^{2}}{\sqrt{L}}\right)+\mathcal{O}\left(\frac{1}{L}\right)+\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right).\end{equation} \tag{ 234 }$

We used

$\begin{equation}\left(\genfrac{}{}{0.0pt}{}{\ell }{\frac{\ell -1}{2}}\right)\approx 2\left(\genfrac{}{}{0.0pt}{}{\ell -1}{\frac{\ell -1}{2}}\right)\end{equation} \tag{ 235 }$

to arrive at equation (233).

We have restricted our attention to the coherent logical component $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ and the corresponding incoherent component $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{a}\right)$ , where ${\tilde {L}}_{a}$ was either X or Z on one of the two encoded qubits. We did this because these are the largest components in the logical noise. This is proven in appendices H and I. In lemma 14 we prove that we can neglect the logical noise components $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ where ${\tilde {L}}_{a}$ is a logical Y or a non-trivial operator on both encoded qubits. In lemma 15 we prove that we can neglect coherent terms of the form $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)$ with a ≠ b. Using these results, we can bound the sum of all coherent logical terms relative to the sum of all incoherent logical terms. There are two off-diagonal terms for each diagonal term, e.g. ${\tilde {\chi }}_{{Z}_{1}I}$ and ${\tilde {\chi }}_{I{Z}_{1}}$ are matched with ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ , so we have

$\begin{equation}\sum _{i,j\vert i\ne j}{\left\vert {\tilde {\chi }}_{i,j}\right\vert }^{2}{< }\frac{2}{{\left(\mathrm{sin}\;\theta \right)}^{2}}{\left(\sum _{k\vert k\ne 0}{\tilde {\chi }}_{k,k}\right)}^{2}\left(1+\mathcal{E}\right).\end{equation} \tag{ 236 }$

We have proven equation (225). The term on the right-hand side is proportional to the infidelity by equation (K.9). Going back to lemma 1 and equation (44), we can write

$\begin{equation}{r}_{m}{\leqslant}mr+m\left(m-1\right)\frac{{d}_{L}}{2\left({d}_{L}+1\right)}\sum _{i,j\vert i\ne j}{\tilde {\chi }}_{i,j}^{2},\end{equation} \tag{ 237 }$

where d_L is the dimension of the physical Hilbert space, which is 4 for the toric code without boundaries. We can combine this with equation (236).

$\begin{equation}{r}_{m}{\leqslant}mr\left(1+\frac{{d}_{L}}{\left({d}_{L}+1\right){\mathrm{sin}}^{2}\;\theta }\left(m-1\right)r\left(1+\mathcal{E}\right)\right).\end{equation} \tag{ 238 }$

Finally, we can use lemma 16 with equation (236) to derive equation (227).

So far we have considered a noise model consisting of the same Z rotation by angle θ on every qubit in the code block. We can use lemmas 8 and 13 to prove that this noise channel produces maximally coherent logical noise in a region around uniform rotations. The single-qubit rotation angles are allowed to differ so long as the deviation is not too great. Therefore, the relation we found between coherent and incoherent logical noise components for the Z rotation noise model bounds the coherence of the logical noise channel for small rotations about any axis, so long as the rotations are close to uniform across the qubits.□

There are some subtleties in the interpretation of theorem 3. We address these in the next subsection, but first we will make a remark about the error bound in equation (226). This error bound is satisfactory for finite code size L; however, we will need make a small modification before the bound is suitable for the L → ∞ limit. This is because the term $\mathcal{O}\left({\left(L\;\mathrm{sin}\;\theta \right)}^{2\zeta }\right)$ in equation (226) contains a factor polynomial in L. If the single qubit rotation angle θ satisfies |sin θ| ∝ 1/L, then this polynomial factor would make the truncation error large as L → ∞. The polynomial factor comes partly from the fact that the truncation error in equation (230) has a factor of 2^L relative to the factor of $\left(\genfrac{}{}{0.0pt}{}{L}{\frac{L+1}{2}}\right)$ that appears in the lowest-weight incoherent noise terms. The ratio is proportional to $\sqrt{L}$ . The other contribution to this polynomial factor comes from the path counting in lemma 5, where we neglected a factor polynomial in w + w' in equation (202). We can cancel the polynomial factor by slightly modifying our truncation procedure. Denote this polynomial factor p(L). Instead of neglecting noise terms with weight >L + 2ζ in the coherent logical noise components, we neglect noise terms with weight >L + 2⌈ζ' log(L)⌉ for a constant ζ'. We perform a similar truncation for the incoherent logical noise components. Then we can choose ζ' large enough that (L sin θ)^2ζ'p(L) is decreasing with L. The minimum value for ζ' such that this is decreasing depends on the degree of p(L) and the magnitude of L sin θ. If ζ' is greater than this minimum value, then (L sin θ)^2ζ'p(L) is bounded above by a|L sin θ|^λζ'log(L), where a is a constant that is determined by the coefficients in the polynomial p(L), and λ is a constant that depends on ζ', the degree of p(L), and the magnitude of L sin θ. Our new truncation rule slightly alters the other error terms in equation (234). The new error term is

$\begin{equation}\vert \mathcal{E}\vert {\leqslant}\frac{24\gamma {\zeta }^{\prime 2}{\left(\mathrm{log}\;L\right)}^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right)+a\vert L\;\mathrm{sin}\;\theta {\vert }^{\lambda {\zeta }^{\prime }\mathrm{log}\left(L\right)},\end{equation} \tag{ 239 }$

where a, ζ', and λ are constants. a and λ are determined as we described above. We are free to choose ζ', so long as it is greater than a minimum value. Now, if we take the limit L → ∞, we find that the error term in equation (239) remains small. Therefore, we may apply theorem 3 in the limit of L → ∞ with equation (239) replacing equation (226).

6.14. Interpreting bounds on coherence

We have proved a relation between the diagonal and off-diagonal components of the chi matrix of the logical noise channel. The interpretation is a bit subtle, so it is worth commenting on here. We upper bounded the off-diagonal components by 1/|sin θ| times the diagonal components, and we were forced to assume that |sin θ| < 1/L because our analysis only applies to logical strings with length ⩽L + 2ζ where ζ is a constant. With this assumption, the factor of 1/|sin θ| implies that the coherent component of the logical channel may be L times larger than the incoherent component. This might seem to indicate that the coherence of the logical channel is not suppressed for large L, but that is not the best way to think about the comparison.

In equation (229) the term quadratic in m has a coefficient proportional to $r/{\left(\mathrm{sin}\;\theta \right)}^{2}$ relative to the linear term. But the average infidelity r is exponentially small in L. Thus the coefficient of the quadratic term is really exponentially smaller in L relative to the coefficient of the linear term. In equation (227) the constant c² is not really a constant, since it scales like L² if θ scales like 1/L. The point is that if the logical noise channel were fully coherent, i.e. unitary, then c would scale like $1/\sqrt{r}$ , but we find that $1/\sqrt{r}$ scales like L^L/2, which is vastly greater than L². We conclude that, although the logical noise channel is not exactly incoherent, it is quite close to an incoherent channel as measured by our statements about the growth of average infidelity and the relation between diamond distance and average infidelity.

We could also consider writing our logical noise channel as a product of a unitary rotation and a Pauli channel. We can solve for the single parameter in each of these two channels. In the limit of low logical noise strength, the angle of rotation of the unitary channel approximately equals one of the off-diagonal chi matrix elements, and the probability of error in the Pauli channel is comparable to one of the diagonal components of the chi matrix. Theorem 3 implies that the logical channel can be written as a product of a unitary channel and a Pauli channel where the angle of rotation of the unitary is larger than the error probability of the Pauli channel by a factor which is approximately 1/|sin θ|, and therefore enhanced by a factor of L if θ scales like 1/L. Again, this might make it seem like the coherence is not suppressed; however, the coherent channel makes a contribution to the average infidelity proportional to the rotation angle squared. This is why we find that the growth of average infidelity becomes nearly linear in m as the code size L increases. As the code block becomes large, the diamond distance for the logical noise channel is much smaller than what one would expect for a coherent channel based on the value of the average infidelity r. This is another way of making the same point as in the previous paragraph.

One might wonder whether a tighter upper bound than theorem 3 can be derived on the strength of the coherent part of the logical channel relative to the incoherent part. In fact, a substantially tighter upper bound is not possible, if we want this bound to hold for arbitrary small rotation angles. For instance, we could choose to set every rotation angle equal to zero except for the qubits along a single length L logical string. For this case, the computation of the logical channel is similar to our computation for the repetition code, where we were able to compute the logical channel quite precisely. Alternatively, for a fixed code size we could choose sufficiently small uniform rotation angles θ such that the lowest-weight terms dominate in the logical noise. In this case the computation of the logical channel is again similar to that of the repetition code. Since the bound we proved for the toric code nearly matches what we found for the repetition code, we know that our result is optimal in this special case. Of course, for some other particular set of single-qubit rotations, the logical noise channel may be less coherent than our upper bound predicts.

7. Conclusions

We have studied characterizations of coherence in quantum channels. One useful method for diagnosing the coherence of a channel $\mathcal{N}$ is to consider applying $\mathcal{N}\;m$ times in succession, and to investigate how the average infidelity r of the composite channel ${\mathcal{N}}^{m}$ increases with m. For incoherent channels r is linear in m, while for highly coherent channels it can grow quadratically with m. Another useful diagnostic is provided by the relationship between r and ${D}_{{\diamondsuit}}\left(\mathcal{N}\right)$ , the distance between $\mathcal{N}$ and identity channel as measured by the diamond norm. For incoherent channels this distance scales linearly with r, while for highly coherent channels it scales like $\sqrt{r}$ .

Using these criteria we have investigated the coherence properties of logical channels. To define a logical channel, we choose a particular quantum error-correcting code and decoding method; then we consider encoding an initial input state, subjecting the physical qubits to a noise model, and finally applying the decoder to obtain the channel's output. Our main conclusion is that, for the code families we examined, even if the physical noise model is highly coherent, the coherence of the logical channel is heavily suppressed in the limit of a large code block.

For the case of the quantum repetition code, we can compute the logical channel precisely, and verify that the logical channel is highly incoherent for large block size. Most of this paper was devoted to the analysis of a more challenging case, the L × L two-dimensional toric code subject to independent unitary noise. Our main conclusion about this case is encompassed by theorem 3. Regrettably, for the case of the toric code we were able to prove that the coherence of the logical channel is suppressed only under an unrealistic assumption: that as the size L of the code block increases, the rotation angle θ applied to each qubit scales like 1/L.

Under this assumption, we can estimate the logical channel well enough for our purposes by expanding it to a constant (L-independent) order in θ, and argue that the higher-order terms we ignore make a contribution that can be safely neglected. A key step in our argument is the observation, backed up by lemmas 9 and 10, that, for L|sin θ| < 1, the logical channel is dominated by logical strings with an easily characterized typical shape. For the logical strings of this typical shape, lemmas 4, 7, 11, and 12 provide a sufficiently accurate estimate of the logical channel to prove theorem 3.

Our main conclusion, that the coherence of the logical coherence is heavily suppressed, applies to unitary physical noise such that each qubit is rotated independently, even if the rotation axis and rotation angle vary from qubit to qubit, as long as the rotations are close to the same and sufficiently small. It also applies for some highly correlated noise models. The result also extends to physical noise channels which are convex combinations of unitary channels, or convex combinations of unitary channels and depolarizing channels. (Depolarizing physical noise is mapped to an incoherent depolarizing logical channel under error correction.)

We emphasize that our result is an asymptotic statement in the limit of large code size L, albeit under the assumption that the noise strength scales like 1/L. For codes of fixed size our results may not be tight; the coherence of logical channels for finite code blocks has been studied elsewhere [11, 17, 21]. Our goal was to study a family of codes with an accuracy threshold instead. When the noise is below threshold, the logical channel approaches the identity as the code block increases in size. In addition, under conditions where theorem 3 applies, the coherent component of the logical channel vanishes much more rapidly than the incoherent component.

It is reasonable to expect that our conclusion—that the logical channel becomes increasingly incoherent as L grows—continues to hold even if we allow L to increase while the rotation angle θ has a fixed constant value. But proving this will be challenging. For one thing, if θ is a constant we cannot accurately estimate the logical channel by expanding to a constant order in θ. Instead, logical strings with length ⩽L(1 + β) need to be included for some constant β. These logical strings are not easy to count. A logical string can be regarded as a self-avoiding walk on the square lattice whose endpoints are a distance L apart, but previously derived upper bounds on the number of self-avoiding walks with specified length [34–36] do not treat the case where the distance between the endpoints differs from the length of the walk by an $\mathcal{O}\left(1\right)$ multiplicative factor. And even if we could count the logical strings accurately, we would still need to overcome some additional obstacles to prove that the coherence of the logical channel is suppressed.

First, to prove theorem 3, we disposed of the 'exceptional' terms (definition 7), those in which the uncorrectable error on a logical string has lower weight than the correctable error, by arguing that these terms are sufficiently rare as to make a negligible contribution to the coherent part of the logical channel. But for logical strings with length L(1 + β), exceptional terms will be far more common.

Second, when we calculated the contribution to the coherent or incoherent logical noise, we separated the computation into a sum over a connected part and a disconnected part, and argued in lemmas 11 and 12 that the disconnected part contributes a multiplicative factor close to 1. But the proofs of these lemmas required the logical strings to be short, of length L + 2ζ for constant ζ; these proofs do not apply for longer logical strings of length L(1 + β).

Third, and even more dauntingly, our proof of theorem 3 made use of a relationship between the physical noise terms that contribute to the coherent and incoherent logical noise. But as the logical string length increases, the contributions to the coherent and incoherent component of the logical channel become less and less alike. Each contribution to the coherent logical noise is associated with a logical string. In contrast, each contribution to the incoherent logical noise is associated with a pair of logical strings; these strings have segments in common, but they fluctuate relative to one another apart from those shared segments. For short logical strings, these fluctuations are relatively mild, and did not prevent us from relating the incoherent and coherent logical noise, as described in section 6.9. For longer logical strings, the combinatorics become much harder to handle.

Unable to overcome these obstacles ourselves, we settled for proving a weaker result that applies for L|sin θ| < 1 rather than constant θ. Perhaps a more ambitious combinatoric analysis can push the proof through even for constant θ. Or perhaps a completely different approach will be more successful. Conceivably, it is not true that the coherence of the logical channel becomes heavily suppressed for large L and sufficiently small constant θ, though we consider that possibility unlikely.

Further numerical studies of logical coherence may also prove to be instructive. The problem has already been studied numerically [11, 20–22]; however, our methods for organizing the estimate of the logical channel suggest different approaches to numerically simulating the logical channel. Numerics could help to resolve the issues that prevented us from extending theorem 3 to the case where θ is an L-independent constant.

In our analysis of the toric code subject to single-qubit unitary rotations, we used minimal-weight decoding because it can be systematically analyzed. However, we do not expect our conclusion about suppression of logical coherence to be very sensitive to the choice of decoding method. The suppression arises from averaging over many error syndromes, and therefore should occur for other families of stabilizer codes with good decoders. Many of the elements from which we built the proof of theorem 3 can be applied to more general stabilizer codes, including 'logical strings', partitions, exceptional terms, and the decomposition into connected and disconnected parts.

We analyzed the toric code because it has an accuracy threshold, and we aspired to study the coherence of the logical channel for a fixed nonzero value of θ as the linear size L of the code block gets large. That aspiration eluded us, so we settled for investigating the logical coherence in the regime L|sin θ| < 1. In that regime, asymptotic results similar to ours, derived using similar methods, may be applicable to other code families that do not have an accuracy threshold. For example, for the Bacon-Shor code family subjected to depolarizing noise with error probability p, the optimal logical failure probability, computed analytically in [37], is achieved by the code with distance $d=\mathcal{O}\left(1/p\right)$ . We anticipate that, for unitary noise, decoding the optimal Bacon-Shor yields a logical channel with strongly suppressed coherence, though we have not done a careful analysis.

It has long been suspected that error correction suppresses the coherence of noise. Such suppression had been observed numerically for the toric code [21], but no rigorous argument supporting this conclusion had been previously known for any code family with an accuracy threshold. Our goal in this project was to prove that, for the toric code subject to sufficiently weak independent or weakly correlated unitary noise, the logical channel after decoding is highly incoherent in the limit of a large code block. We fell short of this goal, settling for a proof that coherence is suppressed in the case where the noise strength decreases as the code block grows. Nevertheless, we hope and expect that the tools we have developed will prove to be useful in future studies of quantum error correction.

Acknowledgments

We would like to thank Michael Beverland, Robin Blume-Kohout, Benjamin Brown, Aaron Chew, Andrew Darmawan, Andrew Doherty, Steven Flammia, Daniel Gottesman, Tomas Jochym-O'Connor, Aleksander Kubica, Richard Kueng, David Poulin, and Leonid Pryadko for valuable discussions. We gratefully acknowledge support from ARO-LPS (W911NF-18-1-0103) and NSF (PHY-1733907). The Institute for Quantum Information and Matter is an NSF Physics Frontiers Center.

Appendix A.: Chi matrix and Pauli transfer matrix for qubits

Here we verify lemma 1 for qubits by expressing all non-diagonal terms in N_kl in terms of χ_ij explicitly:

$\begin{align}\hfill I& {\mapsto}\left({\chi }_{XI}+{\chi }_{IX}\right)X+\left({\chi }_{YI}+{\chi }_{IY}\right)Y+\left({\chi }_{ZI}+{\chi }_{IZ}\right)Z\hfill \\ \hfill & +\left({\chi }_{XY}-{\chi }_{YX}\right)\left(iZ\right)+\left({\chi }_{ZX}-{\chi }_{XZ}\right)\left(iY\right)+\left({\chi }_{YZ}-{\chi }_{ZY}\right)\left(iX\right),\hfill \\ \hfill X& {\mapsto}\left({\chi }_{XY}+{\chi }_{YX}\right)Y+\left({\chi }_{XZ}+{\chi }_{ZX}\right)Z+\left({\chi }_{IX}+{\chi }_{XI}\right)I\hfill \\ \hfill & +\left({\chi }_{ZY}-{\chi }_{YZ}\right)\left(iI\right)+\left({\chi }_{IY}-{\chi }_{YI}\right)\left(iZ\right)+\left({\chi }_{ZI}-{\chi }_{IZ}\right)\left(iY\right),\hfill \\ \hfill Y& {\mapsto}\left({\chi }_{XY}+{\chi }_{YX}\right)X+\left({\chi }_{ZY}+{\chi }_{YZ}\right)Z+\left({\chi }_{IY}+{\chi }_{YI}\right)I\hfill \\ \hfill & +\left({\chi }_{ZX}-{\chi }_{XY}\right)\left(iI\right)+\left({\chi }_{IZ}-{\chi }_{ZI}\right)\left(iX\right)+\left({\chi }_{XI}-{\chi }_{IX}\right)\left(iZ\right),\hfill \\ \hfill Z& {\mapsto}\left({\chi }_{ZY}+{\chi }_{YZ}\right)Y+\left({\chi }_{XZ}+{\chi }_{ZX}\right)X+\left({\chi }_{IZ}+{\chi }_{ZI}\right)I\hfill \\ \hfill & +\left({\chi }_{YX}-{\chi }_{XY}\right)\left(iI\right)+\left({\chi }_{IX}-{\chi }_{XI}\right)\left(iY\right)+\left({\chi }_{YI}-{\chi }_{IY}\right)\left(iX\right).\hfill \end{align} \tag{ A.1 }$

When we collect all the terms in ${\sum }_{a\ne b}{N}_{ab}^{2}$ which are quadratic in {χ_XY, χ_YX}, we obtain

$\begin{equation}2{\left({\chi }_{XY}+{\chi }_{YX}\right)}^{2}-2{\left({\chi }_{XY}-{\chi }_{YX}\right)}^{2}=8\vert {\chi }_{XY}{\vert }^{2}=4\left(\vert {\chi }_{XY}{\vert }^{2}+\vert {\chi }_{YX}{\vert }^{2}\right),\end{equation} \tag{ A.2 }$

using ${\chi }_{ij}={\chi }_{ji}^{{\ast}}$ , as required by complete positivity. The same applies to the terms involving χ_IX, χ_IY, χ_IZ, χ_ZX, χ_YZ, and their complex conjugates.

To prove the claim we must verify that the linear terms cancel. This can be shown using the general argument in lemma 1, but in the qubit case it may be easier to verify the cancellation explicitly. For example, the contributions to N_ab involving χ_IX, χ_XI, χ_YZ, χ_ZY are

$\begin{align}\hfill {N}_{IX}& =\left({\chi }_{XI}+{\chi }_{IX}\right)+i\left({\chi }_{YZ}-{\chi }_{ZY}\right)+\cdots \;,\hfill \\ \hfill {N}_{XI}& =\left({\chi }_{XI}+{\chi }_{IX}\right)-i\left({\chi }_{YZ}-{\chi }_{ZY}\right)+\cdots \;,\hfill \\ \hfill {N}_{YZ}& =\left({\chi }_{ZY}+{\chi }_{YZ}\right)+i\left({\chi }_{XI}-{\chi }_{IX}\right)+\cdots \;,\hfill \\ \hfill {N}_{ZY}& =\left({\chi }_{ZY}+{\chi }_{YZ}\right)-i\left({\chi }_{XI}-{\chi }_{IX}\right)+\cdots \;,\hfill \end{align} \tag{ A.3 }$

and we therefore see that the cross terms cancel in ${N}_{IX}^{2}+{N}_{XI}^{2}$ and in ${N}_{YZ}^{2}+{N}_{ZY}^{2}$ . Similar cancellations occur for all other cross terms.

Appendix B.: Approximating sums

We wish to evaluate the sum in equation (87):

$\begin{equation}{P}_{n}\left(p\right)=\sum _{w=0}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){p}^{n-w}{\left(1-p\right)}^{w}=\sum _{w=\left(n+1\right)/2}^{n}\left(\genfrac{}{}{0.0pt}{}{n}{w}\right){p}^{w}{\left(1-p\right)}^{n-w},\end{equation} \tag{ B.1 }$

where p = s² = sin² θ/2, and (1 − p) = c² = cos² θ/2. Note that P_n(p) is the probability of a decoding error for the n-bit repetition code subject to independent noise with bit-flip probability p. It is convenient to redefine the summation index obtaining

$\begin{equation}{P}_{n}\left(p\right)={p}^{\left(n+1\right)/2}{\left(1-p\right)}^{\left(n-1\right)/2}\sum _{r=0}^{\left(n-1\right)/2}\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n+1}{2}+r}\right){\left(\frac{p}{1-p}\right)}^{r}.\end{equation} \tag{ B.2 }$

From the Stirling approximation, we have

$\begin{equation}\left(\genfrac{}{}{0.0pt}{}{n}{\frac{n+1}{2}+r}\right)\approx \left(\sqrt{\frac{2}{\pi n}}\right){2}^{n}\;\mathrm{exp}\left(-\frac{2}{n}{\left(r+\frac{1}{2}\right)}^{2}\right),\end{equation} \tag{ B.3 }$

neglecting a multiplicative $\left(1+\mathcal{O}\left(1/n\right)\right)$ correction. Making another $\left(1+\mathcal{O}\left(1/n\right)\right)$ multiplicative error, we may replace the exponential inside the sum over r by 1, obtaining

$\begin{equation}{P}_{n}\left(x\right)\approx {p}^{\left(n+1\right)/2}{\left(1-p\right)}^{\left(n-1\right)/2}\left(\sqrt{\frac{2}{\pi n}}\right){2}^{n}\sum _{r=0}^{\left(n-1\right)/2}{\left(\frac{p}{1-p}\right)}^{r},\end{equation} \tag{ B.4 }$

and we also make a negligible error (assuming $p{< }\frac{1}{2}$ ) by extending the upper limit on the sum to infinity, finding

$\begin{equation}\sum _{r=0}^{\infty }{\left(\frac{p}{1-p}\right)}^{r}=\frac{1-p}{1-2p}.\end{equation} \tag{ B.5 }$

We conclude that

$\begin{equation}{P}_{n}\left(x\right)=\left(\sqrt{\frac{2}{\pi n}}\right)\left(\frac{\sqrt{p\left(1-p\right)}}{1-2p}\right){\left[4p\left(1-p\right)\right]}^{n/2}\left(1+\mathcal{O}\left(\frac{1}{n}\right)\right),\end{equation} \tag{ B.6 }$

assuming $p{< }\frac{1}{2}$ . Using

$\begin{equation}4p\left(1-p\right)={\left(2sc\right)}^{2}={\mathrm{sin}}^{2}\;\theta ,\quad 1-2p={c}^{2}-{s}^{2}=\mathrm{cos}\;\theta ,\end{equation} \tag{ B.7 }$

we find

$\begin{equation}{P}_{n}\left(x\right)=\frac{1}{\sqrt{2\pi n}}\left(\frac{{\mathrm{sin}}^{n+1}\;\theta }{\mathrm{cos}\;\theta }\right)\left(1+\mathcal{O}\left(\frac{1}{n}\right)\right),\quad \mathrm{f}\mathrm{o}\mathrm{r}\;{\mathrm{sin}}^{2}\;\theta /2{< }1/2.\end{equation} \tag{ B.8 }$

Appendix C.: Correlated noise: leading behavior for large n

Here we will describe an alternative way of understanding equation (164), where the coefficient of ${h}_{2}^{q}$ in the logical channel is $\mathcal{O}\left({m}^{3q/2}\right)$ . This leading behavior results from cancellations of terms higher order in m that occur when we perform the sum over k_R in equation (161). What is the explanation for these cancellations?

In section 5 we calculated the coherent and incoherent logical components for the bit-flip code of size n subject to correlated unitary rotations given by a Hamiltonian of the form:

$\begin{equation}H=\sum _{i}{h}_{1}{X}_{i}+\sum _{j,k\vert j\ne k}{h}_{2}{X}_{j}{X}_{k}.\end{equation} \tag{ C.1 }$

We expressed the logical coherent component ${\tilde {\chi }}_{XI}$ and the logical incoherent component ${\tilde {\chi }}_{XX}$ in terms of functions Ω and Δ such that

$\begin{align}\hfill {\tilde {\chi }}_{XI}\left(q\right)& =\sum _{{k}_{\text{R}}=0}^{q}{\Omega}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right){h}_{2}^{q}{h}_{1}^{n-2q},\hfill \\ \hfill {\tilde {\chi }}_{XX}\left(q\right)& =\sum _{{k}_{\text{R}}=0}^{q}{\Delta}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right){h}_{2}^{q}{h}_{1}^{n+1-2q}+\mathcal{O}\left({h}_{2}^{q}{h}_{1}^{n+3-2q}\right),\hfill \end{align} \tag{ C.2 }$

where

$\begin{equation}{\tilde {\chi }}_{XI}=\sum _{q=0}^{\left(n-1\right)/2}{\tilde {\chi }}_{XI}\left(q\right),\quad {\tilde {\chi }}_{XX}=\sum _{q=0}^{\left(n+1\right)/2}{\tilde {\chi }}_{XX}\left(q\right),\end{equation} \tag{ C.3 }$

and only even values of q contribute. Here k_R is the number of times the Hamiltonian term ih₂X X acts on the density operator from the right, and k_L = q − k_R is the number of times −ih₂X X acts from the left.

We were able to compute Ω and Δ by counting the ways of decomposing each physical noise term into combinations of one- and two-body Hamiltonian terms. Repeating equations (149) and (152), we found

$\begin{equation}{\Omega}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)=\frac{{\left(i\right)}^{n-q}{\left(-1\right)}^{m}\left(m+1\right)}{\left(n-2q\right){2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right){\left(-1\right)}^{{k}_{\text{R}}}\frac{{\left(m!\right)}^{2}}{{k}_{\text{R}}!\left(q-{k}_{\text{R}}\right)!\left(m-2{k}_{\text{R}}\right)!\left(m-2q+2{k}_{\text{R}}\right)!},\end{equation} \tag{ C.4 }$

and

$\begin{equation}{\Delta}\left(q-{k}_{\text{R}},{k}_{\text{R}}\right)=\frac{{\left(i\right)}^{q}}{{2}^{q}}\left(\genfrac{}{}{0.0pt}{}{n}{m}\right){\left(-1\right)}^{{k}_{\text{R}}}\frac{{\left(\left(m+1\right)!\right)}^{2}}{\left(m+1-2q+2{k}_{\text{R}}\right)!\left(m+1-2{k}_{\text{R}}\right)!{k}_{\text{R}}!\left(q-{k}_{\text{R}}\right)!}.\end{equation} \tag{ C.5 }$

Let's evaluate the sum over k_R to leading order in 1/m in both equations (C.4) and (C.5). We focus on the second factor in each equation, which contains all of the k_R dependence. In equation (C.4) this factor is

$\begin{equation}{\left(-1\right)}^{{k}_{\text{R}}}\frac{\left(m\right)\left(m-1\right)\dots \left(m-2{k}_{\text{R}}+1\right){\times}\left(m\right)\left(m-1\right)\dots \left(m-2q+2{k}_{\text{R}}+1\right)}{\left({k}_{\text{R}}\right)!\left(q-{k}_{\text{R}}\right)!}.\end{equation} \tag{ C.6 }$

The dominant term for m large is given by

$\begin{equation}{\left(-1\right)}^{{k}_{\text{R}}}\frac{{\left(m\right)}^{2q}}{\left({k}_{\text{R}}\right)!\left(q-{k}_{\text{R}}\right)!}={\left(-1\right)}^{{k}_{\text{R}}}\frac{{\left(m\right)}^{2q}}{q!}{\times}\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right).\end{equation} \tag{ C.7 }$

Then when we sum over k_R, we have

$\begin{equation}\frac{{\left(m\right)}^{2q}}{q!}\sum _{{k}_{\text{R}}=0}^{q}{\left(-1\right)}^{{k}_{\text{R}}}\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)=0,\end{equation} \tag{ C.8 }$

where we have made use of the identity

$\begin{equation}\sum _{b=0}^{a}{P}_{c}\left(b\right){\left(-1\right)}^{b}\left(\genfrac{}{}{0.0pt}{}{a}{b}\right)=0\quad \forall c{< }a.\end{equation} \tag{ C.9 }$

Here P_c(b) denotes any polynomial in b of degree c < a. The situation for Δ is the same except that m is replaced by m + 1. Therefore, the leading term for m large in equations (C.4) and (C.5) vanishes.

Similar cancellations occur for higher-order corrections which are suppressed by powers in 1/m. These corrections are computed by expanding the numerator, equation (C.6), as a 2qth order polynomial in m. For example, the coefficient of m^2q−1 is

$\begin{equation}{\left(-1\right)}^{{k}_{\text{R}}}\left(-1\right)\frac{2{q}^{2}-4q{k}_{\text{R}}+q+4{k}_{\text{R}}^{2}}{q!}\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right),\end{equation} \tag{ C.10 }$

in which the prefactor multiplying $\left(\genfrac{}{}{0.0pt}{}{q}{{k}_{\text{R}}}\right)$ is a second-degree polynomial in k_R, so that equation (C.9) implies that the sum over k_R vanishes if q > 2. Likewise, the coefficient of m^2q−r is a polynomial in k_R of degree of 2r, and the sum over k_R vanishes if 2r < q. Recalling that only even q contribute, we see that the leading term that survives the summation over k_R has r = q/2 and is therefore order m^2q−r = m^3q/2. We have now seen why terms higher order in m cancel. The term of order m^3q/2 can be evaluated using the identity

$\begin{equation}\sum _{b=0}^{a}{\left(-1\right)}^{b}{b}^{a}\left(\genfrac{}{}{0.0pt}{}{a}{b}\right)={\left(-1\right)}^{a}a!.\end{equation} \tag{ C.11 }$

The identities in equations (C.9) and (C.11) can be derived by performing the binomial expansion of (1 + x)^a, differentiating repeatedly, and then setting x = −1.

Appendix D.: The shape of the logical string

In this appendix, we prove that among short logical strings nearly all have typical shape as in definition 6.

Lemma 9. In a size L toric code, all but order 1/L of the logical strings running left to right across the code with length ⩽L + 2ζ consist of single steps up and down, so that no vertical segment is longer than one qubit.

Proof. If the size of the code is L and we consider all length L + 2ζ logical strings for fixed ζ, we will count the number of strings that satisfy the condition that each step up or down is only length one. First, we start with a horizontal logical string of length L and then pick ζ sites along it. We have ζ upward steps and ζ downward steps, and we need to fix an ordering. Alternatively, we could think of choosing ζ sites for the upward steps and another ζ sites for the downward steps. In total the number of strings of this type is

$\begin{equation}\text{Number}\;\text{of}\;\text{strings}\;\text{with}\;\text{steps}\;\text{of}\;\text{one}=\left(\genfrac{}{}{0.0pt}{}{L}{2\zeta }\right)\left(\genfrac{}{}{0.0pt}{}{2\zeta }{\zeta }\right)=\left(\genfrac{}{}{0.0pt}{}{L}{\zeta }\right)\left(\genfrac{}{}{0.0pt}{}{L-\zeta }{\zeta }\right).\end{equation} \tag{ D.1 }$

The L dependence in equation (D.1) is

$\begin{equation}\frac{L!}{\left(L-2\zeta \right)!}.\end{equation} \tag{ D.2 }$

Next, we will count the total number of strings that consist of no backwards steps, that is starting from the left of the code block the strings move only right, up, and down. These strings potentially contain upward and downward steps of more than one. In general, such a string involves q₁ distinct steps up with ζ total length and q₂ steps down also totaling ζ in length. The number of ways of writing ζ as a sum of q₁ terms, not ignoring order, is given by the number of compositions of the integer ζ into q₁ terms, which is $\left(\genfrac{}{}{0.0pt}{}{\zeta -1}{{q}_{1}-1}\right)$ . Each of the q₁ steps up and q₂ steps down can be placed independently. This gives us $\left(\genfrac{}{}{0.0pt}{}{L}{{q}_{1}}\right)\left(\genfrac{}{}{0.0pt}{}{L-{q}_{1}}{q2}\right)$ combinations of possible configurations. In total we have

$\begin{equation}\text{Number}\;\text{of}\;\text{strings}\;\text{with}\;{q}_{1}\;\text{steps}\;\text{up}\;\text{and}\;{q}_{2}\;\text{steps}\;\text{down}=\left(\genfrac{}{}{0.0pt}{}{\zeta -1}{{q}_{1}-1}\right)\left(\genfrac{}{}{0.0pt}{}{L}{{q}_{1}}\right)\left(\genfrac{}{}{0.0pt}{}{\zeta -1}{{q}_{2}-1}\right)\left(\genfrac{}{}{0.0pt}{}{L-{q}_{1}}{{q}_{2}}\right)\end{equation} \tag{ D.3 }$

such strings. When q₁ = q₂ = ζ, we recover the case where each step up or down is by one lattice site. Then we can isolate the L dependence in equation (D.3):

$\begin{equation}\frac{L!}{\left(L-{q}_{1}-{q}_{2}\right)!}.\end{equation} \tag{ D.4 }$

We can compare this to equation (D.2), and we see that there are fewer paths with steps larger than one. The ratio is proportional to

$\begin{equation}\frac{\text{Number}\;\text{with}\;{q}_{1}\;\text{steps}\;\text{up}\;\text{and}\;{q}_{2}\;\text{steps}\;\text{down}}{\text{Number}\;\text{with}\;\zeta \;\text{single}\;\text{steps}\;\text{up}\;\text{and}\;\text{down}}=\mathcal{O}\left({L}^{{q}_{1}+{q}_{2}-2\zeta }\right).\end{equation} \tag{ D.5 }$

Then if we count the paths with a single step of two and the other steps are all one, there are order 1/L of these relative to the number of paths with single steps up and down.

We must also count the number of logical strings where the string backtracks on itself. There are even fewer of these than the strings with jumps up or down by two. Each string with backtracking can be produced from a string with a jump up or down of at least two lattice spacings. We add some additional cap onto the vertical segment of at least length 2. The number of strings with one instance of backtracking like figure D1 will be proportional to the number of strings of length two shorter that also have at least one step up or down of more than one. For this reason strings like the one in figure D1 are an exponentially smaller minority than the strings with steps up and down of more than one. Then we conclude that nearly all short logical strings spanning the code left to right consist of steps up and down by only one qubit.□

**Figure D1.** The logical string $\mathcal{L}$ has backtracking. Among short logical strings, those with backtracking are unlikely relative to strings without.
Download figure:
Standard image High-resolution image

Lemma 10. For the class of length L + 2ζ strings described in lemma 9 (those with exactly ζ steps up and ζ steps down as they span the code block from left to right), for large L nearly all will have spacings between the steps growing proportional to $\sqrt{L}$ . We choose a small constant γ and define typical strings as those for which all vertical steps are separated by at least $\gamma \sqrt{L}$ . If we fix the length of logical strings and combine this lemma with lemma 9, we can make the following statement about the fraction of strings of that length that have atypical shape:

$\begin{equation}\frac{\text{Number}\;\text{of}\;\text{strings}\;\text{with}\;\text{atypical}\;\text{shape}}{\text{Total}\;\text{number}\;\text{of}\;\text{strings}}=\frac{8\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right).\end{equation} \tag{ D.6 }$

Proof. The total number of strings of the type in lemma 9 is $\left(\genfrac{}{}{0.0pt}{}{L}{2\zeta }\right)\left(\genfrac{}{}{0.0pt}{}{2\zeta }{\zeta }\right)$ . Now let us compute the number of strings such that each step up or down is separated from others by $\gamma \sqrt{L}$ for some constant γ. We can lower bound the number by starting with a length L string running left to right across the code and placing our steps up and down. We suppose that each step we place prohibits placing another step on a further $2\gamma \sqrt{L}$ of the sites. This is a lower bound because in the true answer these intervals will sometimes overlap. The lower bound is

$\begin{equation}\frac{L\left(L-\left(2\gamma \sqrt{L}+1\right)\right)\left(L-2\left(2\gamma \sqrt{L}+1\right)\right)\cdots \left(L-\left(2\zeta -1\right)\left(2\gamma \sqrt{L}+1\right)\right)}{\left(2\zeta \right)!}\left(\genfrac{}{}{0.0pt}{}{2\zeta }{\zeta }\right).\end{equation} \tag{ D.7 }$

Compared to the total, n − i has been replaced by n − i(2γ + 1) for each i, so that when γ = 0 we recover the total number of strings. In general, the ratio of this limited set to the total for fixed ζ and γ is given by

$\begin{equation}\frac{\text{Number}\;\text{of}\;\text{length}\;L+2\zeta \;\text{strings}\;\text{with}\;\text{widely}\;\text{separated}\;\text{steps}}{\text{Number}\;\text{of}\;\text{length}\;L+2\zeta \;\text{strings}}\approx \prod _{i=1}^{2\zeta -1}\left(1-\frac{2i\gamma }{\sqrt{L}}\right).\end{equation} \tag{ D.8 }$

We can lower bound this by

$\begin{equation}{ >}{\left(1-\frac{4\zeta \gamma }{\sqrt{L}}\right)}^{2\zeta }=1-\frac{8\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right).\end{equation} \tag{ D.9 }$

This approaches 1 as L increases, and we see that with high probability a short logical string will have the property that the steps up and down are separated by more than $\gamma \sqrt{L}$ , as L becomes large. □

Appendix E.: Disconnected errors

Fix a coherent logical noise component and consider the sum in equation (172). In section 6.7 we argued that the disconnected term is 1 for disconnected errors that do not change how a given connected term is decoded. This allows us to write the sum as

$\begin{equation}{\tilde {\chi }}_{{Z}_{1},I}=\left(\sum _{\mathcal{L}}\sum _{p\in \mathcal{P}\left(s\right)}\text{Connected}\;\text{part}\right)+\mathrm{e}\mathrm{r}\mathrm{r}\mathrm{o}\mathrm{r}.\end{equation} \tag{ E.1 }$

The sum over $\mathcal{L}$ includes all typical short connected logical strings. $\mathcal{P}\left(\mathcal{L}\right)$ is the set of likely partitions of connected logical string $\mathcal{L}$ . This excludes the partitions we called 'exceptional terms' in definition 7 and lemma 4. ' $\mathrm{E}\mathrm{r}\mathrm{r}\mathrm{o}\mathrm{r}$ ' contains all the terms we have neglected. This includes the contribution of long logical strings, short logical strings with atypical shape, and exceptional terms. It also includes the terms with disconnected pieces that we did not consider in section 6.7. These are all of the terms where the disconnected errors flip the way the partition is decoded, where we start with a partition and after adding disconnected errors to each side, the error that was originally uncorrectable becomes correctable and vice versa. These terms will not follow the analysis we did in section 6.7. We will describe these terms now and show that they are negligible in the following lemma.

Lemma 11. In equation (E.1) the error from the neglected terms $\mathcal{E}$ can be expressed

$\begin{equation}Error={\mathcal{E}}_{1}+{\mathcal{E}}_{2},\end{equation} \tag{ E.2 }$

where ${\mathcal{E}}_{1}$ contains the contributions that we have already proven are negligible—long connected logical strings, logical strings with an atypical shape, and exceptional partitions. ${\mathcal{E}}_{2}$ contains the contributions from terms where the disconnected errors have flipped the way the partition is decoded. These are the terms we neglected in section 6.7. The following is true:

$\begin{equation}\vert {\mathcal{E}}_{2}\vert {\leqslant}\vert {\mathcal{E}}_{1}\vert .\end{equation} \tag{ E.3 }$

Proof. We start with a typical short connected logical string and take a partition into a correctable operator and an uncorrectable operator, denoted (O_UρO_C). Now we add disconnected errors, D_L and D_R to the left and right side of the partition. In some cases the uncorrectable error may become correctable and vice versa. That is, O_UD_L will be correctable, while O_CD_R is uncorrectable. For example, the term that contributes to the ${\tilde {\chi }}_{{Z}_{1}I}$ component of the logical noise might be (O_CD_RρO_UD_L). Our treatment of the disconnected part in section 6.7 assumed that the added errors did not flip the correctable and uncorrectable sides of the original partition. Now we will justify this assumption by proving that such terms are negligible.

First, we must understand the conditions when an added error will turn the uncorrectable side of a partition into a correctable error. O_U is the uncorrectable side of the partition, so the minimal-weight correction to O_U is equal to O_C up to stabilizers. For O_UD_L to be correctable, the minimal-weight correction must equal O_UD_L up to stabilizers and not equal O_CD_L up to stabilizers. Note that we write D_L and not D_R because D_L and D_R have the same syndrome, so as far as the decoder is concerned they are equivalent. This implies

$\begin{equation}\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{U}}{D}_{\text{L}}\vert {< }\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{C}}{D}_{\text{L}}\vert ,\end{equation} \tag{ E.4 }$

where G_x is an element of $\mathcal{S}$ , which denotes the stabilizer group, and |⋅| denotes the weight of a Pauli operator. The weight of the minimal-weight operator equivalent up to stabilizers to O_CD_L is no greater than the sum of the weights of the minimal-weight operators equivalent up to stabilizers to O_C and D_L individually. We can continue:

$\begin{equation}\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{U}}{D}_{\text{L}}\vert {< }\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{C}}{D}_{\text{L}}\vert {\leqslant}\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{C}}\vert +\underset{{G}_{y}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{y}{D}_{\text{L}}\vert {< }\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{U}}\vert +\underset{{G}_{y}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{y}{D}_{\text{L}}\vert .\end{equation} \tag{ E.5 }$

We conclude that the added error must be such that the minimal-weight correction of O_UD_L is less than the minimal-weight correction of O_U plus the minimal-weight correction of D_L. This happens when the disconnected error D_L lies near O_U such that the minimal-weight decoder will tend to form a loop out of parts of D_L and O_U. This is possible only in cases like the one in figure E1.

The condition in equation (E.5) requires a special combination of disconnected error and original partition. This is possible for both coherent- and incoherent-type disconnected errors as defined in section 6.7. Let us consider incoherent-type disconnected errors first. This is what is illustrated in figure E1. The disconnected error causes the uncorrectable side of the partition to become correctable when D_L contains at least two errors in a row adjacent to O_U. Based on our condition, we observe that the number of added errors that flip the partition is greatest for the lowest-weight partitions. These terms require the fewest added errors to flip. We also see that the number of these added errors increases with the length of the logical string. A longer string has more adjacent qubits. This implies that the value of the disconnected part is decreasing with string length. This fact was used in lemma 3.

We seek to prove that terms like the one in figure E1 are negligible in the coherent logical noise components. We will do this by mapping each combination of a partition of a connected logical string and a set of disconnected errors such that uncorrectable and correctable errors in the partition are flipped to a partition of a longer connected logical string. There exists a unique stabilizer operator that will multiply the starting partition plus disconnected errors and produce a partition of a longer logical string. This is illustrated in figure E2. Our condition in equation (E.5) says that the minimum stabilizer-equivalent operator to O_U is lower weight than O_U. The stabilizer operator we need to map figures E1 to E2 is the product of O_UD_L and its minimal-weight correction. The resulting connected logical string is longer than the original connected logical string, but the total weight of the noise term (connected and disconnected) is smaller. This must be true because we have lowered the weight of the errors in blue (O_UD_L), and we have not changed the weight of the errors in red (O_CD_R).

In our previous analysis of the connected part of the coherent logical noise, we neglected logical strings with length >L + 2ζ for a cut-off constant ζ. Then we neglected short logical strings with atypical shape. Finally, we neglected the unlikely partitions of each string, which we called exceptional terms. In this proof we began with a likely partition of a short, typical connected logical string. We added disconnected errors, and in cases like the one in figure E1 where the added errors flipped the partition, we mapped these terms to partitions of a different connected logical string like in figure E2. The final part of our proof is to argue that we can neglect this class of terms, where disconnected errors changed how the connected partition is decoded.

First, we observed above that the weight of the new connected term produced by our mapping is less than the weight of the original term with disconnected errors. This means that the term in figure E1 is suppressed in powers of sin θ/2 relative to the term in figure E2. Second, we will argue that the new connected term is one we have already neglected. Recall how we constructed new connected terms like the one in figure E2. We took a likely partition of a typical, short, connected logical string and added disconnected errors to it in such a way that we flipped how the partition was decoded. The original uncorrectable side became correctable and vice versa. Then we multiplied the correctable error O_UD_L by a particular stabilizer operator to produce a new term that is a partition of a new connected logical string. We make the following observation. One of two things must be true. One is that the new logical string has an atypical shape, specifically if the logical string runs left to right across the code, the steps up and down in the lattice are separated by less than γL. γ was our chosen constant from lemma 10 that lower bounded the separation between the vertical steps in the logical string. The alternative is that the new connected logical string has a typical shape, but the partition we produce is unlikely. If the stabilizer we multiply by in figure E2 has a width of at least γL, the shape of the new connected logical string may be typical. However, in that case the partition we get for the longer connected logical string has a row of γL qubits all belonging to the blue error. We proved in lemma 4 that partitions with this feature are an exponentially small (in $\sqrt{L}$ ) fraction of the total partitions. We neglected these partitions in our earlier sum over connected terms. We find that the terms with disconnected errors that flip how the partition is decoded are in one-to-one correspondence with terms we have already neglected and moreover, the magnitudes of the terms with disconnected errors are smaller by a number of powers of sin θ/2. We conclude that such terms contribute less to the logical noise than the terms we have already neglected.

So far in this proof, we dealt with incoherent-type added errors. This was for simplicity, so that we had only one picture in mind. The argument for coherent-type added errors is the same. Coherent-type disconnected errors can also flip the correctable and uncorrectable sides of a partition of a connected logical string. We have already stated the condition when this occurs. The disconnected terms on the uncorrectable side D_L must contain a contiguous set of errors near a contiguous set of errors in the uncorrectable part of the partition O_U.

We bound the contribution of the disconnected coherent-type errors that flip the correctable and uncorrectable sides of the partition in the same way as we did the incoherent-type. We will use a mapping that takes such a term and produces a partition of a longer connected logical string. The mapping multiplies by a suitable stabilizer operator as depicted in figure E4. In this case the new connected term is negligible for the same reasons as in the incoherent-type added error case. The connected logical string produced from the original partition plus the disconnected coherent-type errors either has an atypical shape or the original partition was exponentially unlikely (in $\sqrt{L}$ ). We conclude that the contribution of terms with disconnected errors that flip the partition is negligible in the logical noise. □

**Figure E1.** This figure shows a partition of a connected logical string together with a disconnected error. The disconnected error is incoherent-type, so D_L = D_R. The uncorrectable error O_CD_R is in red, while the correctable error O_UD_L is in blue. The two form a length-5 connected logical string that runs left to right across the code. Without the disconnected errors, O_C would be correctable and O_U, uncorrectable. Therefore, the added disconnected errors have flipped the original partition.
Download figure:
Standard image High-resolution image

**Figure E2.** Here is the partition of a connected logical string corresponding to figure E1, with the new uncorrectable error O_U in red and the new correctable error O_C in blue. This term shares the same syndrome as the one in figure E1. We can always multiply right or left-hand sides by a stabilizer to produce different coherent terms. This term is produced from figure E1 by multiplying the correctable side in blue by the stabilizer operator in gray crosshatching. Notice that the connected logical string is longer, but the total weight of the term is smaller.
Download figure:
Standard image High-resolution image

**Figure E3.** Here we have a partition of a connected logical string together with a disconnected error. The partition is the same as the one in figures E1 and E2. The disconnected error is now a coherent-type error. The uncorrectable error O_CD_R is in red, while the correctable error O_UD_L is in blue. Without the disconnected errors, O_C would be correctable and O_U, uncorrectable. The added loop of disconnected errors has flipped the original partition.
Download figure:
Standard image High-resolution image

**Figure E4.** This is the connected string and partition that corresponds to figure E3, with the new uncorrectable error O_U in red and the new correctable error O_C in blue. We can always multiply right or left-hand sides by a stabilizer to produce different coherent terms. This term is produced from figure E3 by multiplying the correctable side in blue by the stabilizer operator in gray crosshatching. The connected logical string is longer than the one in figure E3, but the total weight of the term is less.
Download figure:
Standard image High-resolution image

Appendix F.: The disconnected part of the incoherent logical noise

In appendix E we proved that for the dominant noise terms in the coherent logical noise components, the disconnected part was equal to 1 up to small corrections. We will now prove the same statement for the dominant noise terms in the incoherent logical noise components.

Lemma 12. In equation (206) we wrote an incoherent logical noise component as a sum over the contributions from individual logical strings. This included a disconnected factor. Here we prove that we can set the disconnected factor equal to 1 and make only a small error. In other words suppose we write

$\begin{equation}{\tilde {\chi }}_{{Z}_{1}{Z}_{1}}{\geqslant}\sum _{\mathcal{L}}\sum _{{O}_{\text{U}}\subset \mathcal{L}}\frac{1}{\vert \left\{{O}_{\text{C}}^{\prime }\right\}\vert }\sum _{{O}_{\text{U}}^{\prime }}\left({O}_{\text{U}}\rho \;{O}_{\text{U}}^{\prime }\right)\left(1+{\mathcal{E}}_{1}+{\mathcal{E}}_{2}\right),\end{equation} \tag{ F.1 }$

where the sum over $\mathcal{L}$ includes all short typical logical strings, O_U, |{O_C'}|, and O_U' are all as described in section 6.9, and ${\mathcal{E}}_{1}$ is the error we make by neglecting various connected terms, including high-weight terms and terms with mismatched weight. In lemma 5 we proved that

$\begin{equation}\vert {\mathcal{E}}_{1}\vert {< }\mathcal{O}\left({\left(\mathrm{sin}\;\theta \right)}^{2\zeta }\right).\end{equation} \tag{ F.2 }$

${\mathcal{E}}_{2}$ represents the error we make when we set the disconnected factor equal to 1. Then

$\begin{equation}\vert {\mathcal{E}}_{2}\vert {< }\frac{8\gamma {\zeta }^{2}}{\sqrt{L}}+\mathcal{O}\left(\frac{1}{L}\right).\end{equation} \tag{ F.3 }$

Proof. We can follow the argument from section 6.7 and appendix E. The connected noise terms we considered in those sections had the form (O_UρO_C). Here we will consider noise terms with the form (O_UρO_U'). We will imagine arriving at these noise terms in the manner of section 6.9. Namely, we begin with a short logical string with a typical shape. We partition the logical string into O_U and O_C. Then we choose an operator O_U' with the same syndrome as O_U. We denoted the set of possible O_U' by {O_U'}.

Now that we have a connected noise term (O_UρO_U'), we can think of dressing it with disconnected errors in exactly the same way as we did in the coherent case. In section 6.7 we observed that the added errors that make up the disconnected part can be divided into coherent- and incoherent-type. The coherent-type added errors are when we add different errors to O_U and O_U'. In this case the errors we add to O_U and O_U' form a loop (with nonzero area). We saw that as long as the loop was positioned such that the added errors did not change how the connected noise term was decoded, the sum over the possible ways of dividing the errors in the loop between O_U and O_U' gave zero. We considered incoherent-type added errors where we added the same error to O_U and O_U'. In this case the contribution was nonzero. As long as the added error did not change how the connected noise term was decoded, the incoherent-type added errors contributed a sin² θ/2 term on each qubit. Together with the cos² θ/2 term corresponding to no error on each qubit, this gives 1 for the disconnected part. This applies to the added errors that do not change how the connected part was decoded. Therefore, our approach is to write the disconnected part as 1 plus a correction that comes from the configurations of added errors that change how the connected part is decoded.

In lemma 11 we considered added errors that change how O_U is decoded in the coherent logical noise components. For connected noise terms that enter into the incoherent logical noise, the correction to the disconnected part comes from the same source, certain added errors that change how the connected term is decoded. In that case the added errors flipped the correctable and uncorrectable sides of the partition, which gives a phase of −1. In the incoherent case, if O_U is made correctable by the added errors, then so is O_U', and the resulting term contributes to the identity part of the logical noise. In effect, there are disallowed added errors, which reduce the value of the disconnected part. The counting of such terms is identical to what we did in lemma 11. Recall that these added error terms were related to connected terms that either had an atypical shape or an unlikely partition. The contribution to ${\mathcal{E}}_{2}$ from these terms is proportional to the fraction of atypical logical strings from lemma 10. The contribution from the unlikely partitions is exponentially small in $\sqrt{L}$ as before.

There is another class of added errors that contribute to the correction to the disconnected part. Some errors are near the correction E_s so that once the errors have been added, they become part of a new connected term. An example is shown in figure F1. This class of terms contains only incoherent-type added errors. Coherent-type added error placed here will still give 0 after we sum over the ways of splitting the errors into the left and right disconnected errors. This is the same as for coherent-type added errors far from the connected logical string. For a likely partition of a short logical string with typical shape, E_s will have the same weight as O_C. The incoherent-type added errors that join the connected part may either lie near to E_s or they may be contained in E_s. We will first study the case where the added errors are not contained in E_s. These terms are closely related to the situation we just analyzed where the added errors sit next to O_U. The condition on the added error is analogous to equation (E.5). Let D denote the added incoherent-type error and $\mathcal{S}$ denote the stabilizer group. Then

$\begin{equation}\underset{{G}_{x}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{x}{O}_{\text{C}}D\vert {< }\underset{{G}_{y}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{y}{O}_{c}\vert +\underset{{G}_{z}\in \mathcal{S}}{\mathrm{min}}\vert {G}_{z}D\vert .\end{equation} \tag{ F.4 }$

If this condition is satisfied, then the added error becomes part of a new connected term (O_UDρO_U'D). Together with the new lowest-weight correction O_UD forms a new connected logical string. This logical string is similar to the old logical string, but it contains a detour where it veers off to include the error D. This logical string either has an unlikely shape because it includes two closely spaced vertical steps, or the error D has width >γL. In the latter case we also require that the original correction E_s included >γL consecutive qubits. This is exponentially unlikely according to the counting we did in lemma 4 and the bound we wrote in equation (191). We conclude that the contribution to the error term ₂ from the noise terms we arrive at by adding errors proximate to O_C is small.

This leaves the added errors that lie within E_s or one of the operators with the same syndrome and weight. In this case the new connected logical string that results from adding the errors is the same as the old logical string. We will compare the set of terms we arrive at by adding errors in this manner to the set of terms with the same O_U but a higher-weight O_U'. We will argue that there are more of the terms with higher-weight O_U'. We have already neglected such noise terms in lemma 7, so we conclude that the correction to the disconnected part is small.

We start with a connected noise term (O_UρO_U'), where |O_U| = |O_U'| = w. We form a higher-weight noise term by adding an incoherent-type error D within one of the operators O_C' to produce a new connected noise term (O_UDρO_U'D). This adds a total of two to the weight of the term. We can place the added error anywhere within one of the operators O_C'. The number of possibilities is $\mathcal{O}\left(w\right)$ . Now consider the possible connected noise terms with the same O_U, but instead of choosing O_U' with weight w, we set |O_U'| = w + 2. Suppose we start with an operator O_U' with weight w. We can construct one with w + 2 by adding an extra 'cap' consisting of three qubits around a single plaquette or star. This is illustrated in figure F2, where three possible choices of O_U' are drawn with the red dashed line. The number of possible choices is at least $\mathcal{O}\left(w\right)$ , because we can place a cap at each location along O_U. The full set of possibilities will generally be larger. If we consider a pair of added errors lying within O_C', then we compare to the connected noise terms with |O_U'| = |O_U| + 4. The noise terms where O_U and O_U' have different weights were discussed in section 6.10. We proved in lemma 7 that the contribution of these terms is negligible. Finally we conclude, just as in lemma 11, that each of the disconnected errors that contribute to the error ${\mathcal{E}}_{2}$ can be matched with a connected noise term that we have already neglected. In other words, the error ${\mathcal{E}}_{2}$ is less than ${\mathcal{E}}_{1}$ . Therefore, we can say that the disconnected part = 1 and make only a small error for low-weight connected terms. □

**Figure F1.** This figure illustrates a certain type of added error term. We start with a short, typical logical string and partition it into operators O_U and O_C. Then we would choose some operator O_U' (not pictured) with the same weight and syndrome as O_U to produce an incoherent term. The errors in the red circle are added to both O_U and O_U'. The minimal-weight correction is shown as a black dashed line. The added errors are not disconnected but form a new connected term.
Download figure:
Standard image High-resolution image

**Figure F2.** This figure illustrates the idea of the proof that a certain type of added error term contributes only a small error. Consider a connected logical string that we partition into O_U and O_C, the solid red and blue lines. The class of added errors we consider are those where the added error lies along O_C or one of the operators O_C' with the same weight and syndrome. The possible locations for such an added error are marked by the × symbols. We prove that such terms are negligible by comparing to the noise terms where the operator O_U' is chosen with weight 2 greater than O_U. Three of the possible choices for O_U' are drawn with the dashed red lines. In this example there are 12 possible O_U' operators and six possible added error terms.
Download figure:
Standard image High-resolution image

Appendix G.: Physical Y errors

In lemma 8 we considered rotations in the X–Z plane where the single-qubit rotation angles were allowed to differ. Here we prove that allowing for rotations partly around the Y axis on the physical qubits will decrease the coherence of the logical noise channel.

Lemma 13. Consider an L × L toric code and a noise channel that consists of single-qubit rotations by an angle θ about an arbitrary axis. Suppose that |sin θ| < 1/L as in lemmas 3 and 5. Then the connected contribution to the logical noise from low-weight terms is most coherent when the single-qubit rotations are about an axis in the X–Z plane. We proved elsewhere that the low-weight connected contribution dominates the logical noise components.

Proof. Let θ_X, θ_Y, θ_Z denote the rotation angles about the X, Y, and Z-axis, respectively, so that ${\theta }_{X}^{2}+{\theta }_{Y}^{2}+{\theta }_{Z}^{2}={\theta }^{2}$ . Of the coherent logical noise components, according to lemmas 14 and 15, the dominant components are the ones $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ , where ${\tilde {L}}_{a}$ is a logical X or Z operator on one of the encoded qubits. We apply several of our lemmas to restrict the noise terms we consider, just as in theorem 3. Among the noise terms that contribute to the coherent logical noise, we keep the terms with short, typical logical strings and non-exceptional partitions. Among the noise terms that contribute to the incoherent logical noise, we keep the terms where the logical string $\mathcal{L}$ is short and typical, $\vert {O}_{\text{U}}\vert =\left(\vert \mathcal{L}\vert +1\right)/2$ , |O_C| = |E_s|, and |O_U'| = |O_U|.

First suppose that θ_Z = 0. Then the logical $\left({X}_{1}\tilde {\rho }\right)$ noise component is generated from noise terms (O_UρO_C) where O_U and O_C together contain X acting on every qubit along an X₁ logical string. Meanwhile, the incoherent logical $\left({X}_{1}\tilde {\rho }{X}_{1}\right)$ noise component is also generated by X₁ logical strings. In theorem 3 we state a bound on the relative magnitude of these logical noise components. Here θ_X plays the role of θ in equation (233). Under our θ_X and θ_Y rotation noise model, we also have a non-zero logical $\left({Z}_{1}\tilde {\rho }\right)$ noise component. This is generated by connected noise terms, (O_UρO_C), where O_U and O_C together contain both X and Y acting on every qubit along a Z₁ logical string. The number of Z₁ logical strings with length ℓ is the same as the number of X₁ logical strings with length ℓ. However, the weight of the noise terms that contribute to $\left({Z}_{1}\tilde {\rho }\right)$ is ℓ². The contribution of each noise term is ${\left(\mathrm{sin}\;{\theta }_{X}/2\right)}^{\ell }{\left(\mathrm{sin}\;{\theta }_{Y}/2\right)}^{\ell }$ . In contrast, the noise terms that contribute to $\left({X}_{1}\tilde {\rho }\right)$ are all proportional to ${\left(\mathrm{sin}\;{\theta }_{X}/2\right)}^{\ell }$ . Therefore, $\left({Z}_{1}\tilde {\rho }\right)$ is exponentially smaller in L relative to $\left({X}_{1}\tilde {\rho }\right)$ for any choice of rotation axis in the X–Y plane. The $\left({Z}_{1}\tilde {\rho }\right)$ noise component has a negligible effect on the relative magnitudes of the coherent and incoherent logical noise components. We also have an incoherent $\left({Z}_{1}\tilde {\rho }\;{Z}_{1}\right)$ noise component. This is generated by noise terms, (O_UρO_U'), where O_U and O_U' contains Y errors along an uncorrectable subset of a Z₁ logical string. These noise terms have magnitude ${\left(\mathrm{sin}\;{\theta }_{Y}/2\right)}^{\ell +1}$ , which is exponentially large relative to the noise terms that contributed to $\left({Z}_{1}\tilde {\rho }\right)$ . It follows that the logical coherence is maximized when θ_Y = 0 and θ_X = θ. We began by supposing θ_Z = 0. Next, we will consider the case where θ_X, θ_Y, and θ_Z are all nonzero.

Suppose |θ_Z| ⩾ |θ_X|. If not, switch the role of X and Z in what follows. Fix a Z₁ logical string $\mathcal{L}$ . The contribution of the logical string $\mathcal{L}$ to ${\tilde {\chi }}_{{Z}_{1}I}$ is a sum over the partitions of $\mathcal{L}$ . For each partition (O_UρO_C), we can replace a Z error in O_U with a Y error if we add an X error on the same qubit to O_C. Similarly we can replace a Z error in O_C by a Y error if we add an X error on the same qubit to O_U. The Z syndrome is unchanged, but now we also have a non-trivial X syndrome corresponding to the X error on the chosen site. This does not change how any partitions are decoded, but it does change the weight. The contribution of each partition to ${\tilde {\chi }}_{{Z}_{1}I}$ is a sum over all combinations of either a Z error or a Y and an X error on every qubit in $\mathcal{L}$ . The terms with Y errors have higher weight. This means they contain extra factors of sin θ_Y/2, which is small since |sin θ| < 1/L. At the same time, the logical string $\mathcal{L}$ contributes to the ${\tilde {\chi }}_{{Z}_{1}{Z}_{1}}$ logical noise component. These noise terms include some that feature only Z errors and others with some number of Z errors replaced by Y errors. Unlike the contributions to ${\tilde {\chi }}_{{Z}_{1}I}$ , these terms with Y errors are not higher-weight. There are no extra factors of sin θ_Y/2, and we conclude that the incoherent logical noise components are made larger relative to the coherent logical noise components. Therefore, the logical coherence is maximized when θ_Y = 0.□

Appendix H.: Other logical maps

In the section 6.4, we restricted our attention to logical coherent terms of the form $\left(\tilde {{L}_{a}}\tilde {\rho }\right)$ , where L_a is an X or Z operator on one of the encoded qubits. Now we would like to consider the case where L_a acts nontrivially on both encoded qubits or as Y on one or both of the encoded qubits.

Lemma 14. Consider the toric code with minimal-weight decoding and a noise model that consists of uniform single-qubit unitary rotations about a fixed axis. Then the coherent logical noise components, $\left(\tilde {{L}_{a}}\tilde {\rho }\right)$ , where ${\tilde {L}}_{a}$ is a Y-type logical operator or ${\tilde {L}}_{a}$ is a non-trivial logical operator on both encoded qubits, are negligible relative to the components where ${\tilde {L}}_{a}$ is an X- or Z-type logical operator on one encoded qubit.

Proof. Suppose we have ${\tilde {L}}_{a}={X}_{1}{Z}_{2}$ . Logical strings of this type are the product of two operators of the type we have already considered. Each connected noise term that contributes to the logical noise component, $\left({X}_{1}{Z}_{2}\tilde {\rho }\right)$ , is a product of a connected noise term that contributes to $\left({X}_{1}\tilde {\rho }\right)$ and a connected noise term that contributes to $\left(\tilde {\rho }{Z}_{2}\right)$ . It follows that up to corrections that come from the disconnected part, $\left({X}_{1}{Z}_{2}\tilde {\rho }\right)\approx \left({X}_{1}\tilde {\rho }\right)\left({Z}_{2}\tilde {\rho }\right)$ . The logical components, $\left({X}_{1}\tilde {\rho }\right)$ and $\left({Z}_{2}\tilde {\rho }\right)$ , are both small if error correction is working, so the logical $\left({X}_{1}{Z}_{2}\tilde {\rho }\right)$ component will be negligible. If ${\tilde {L}}_{a}$ is Y-type operator on the first encoded qubit, the argument is the same, since Y-type logical strings are products of X and Z-type logical strings, Y₁ = X₁Z₁.

If we have ${\tilde {L}}_{a}={Z}_{1}{Z}_{2}$ , the logical component $\left({Z}_{1}{Z}_{2}\tilde {\rho }\right)$ is no longer a product of $\left({Z}_{1}\tilde {\rho }\right)$ and $\left({Z}_{2}\tilde {\rho }\right)$ . This is because Z₁ and Z₂-type logical strings can overlap, and this changes the counting of logical strings of a fixed weight. Figure H1 shows two examples of this kind of logical string. At length 2L, where L is the code distance, there are many connected logical strings because we can have a single connected string that wraps the torus along both directions. If we count the shortest paths between two points in the square lattice separated by distance l₁ in the horizontal and l₂ in the vertical, we get

$\begin{equation}\text{Number}\;\text{of}\;\text{shortest}\;\text{paths}\;\text{travelling}\;{l}_{1}\;\text{horizontal}\;\text{and}\;{l}_{2}\;\text{vertical}\;\text{spaces}=\left(\genfrac{}{}{0.0pt}{}{{l}_{1}+{l}_{2}}{{l}_{1}}\right).\end{equation} \tag{ H.1 }$

We can use this to bound the number of weight-2L logical Z₁Z₂ strings. Fix two sites in the code, qubit i along the vertical edge of the code and qubit j along the horizontal edge. Now count the number of shortest paths that connect these points. We have

$\begin{equation}\left(\genfrac{}{}{0.0pt}{}{i+j}{i}\right)\left(\genfrac{}{}{0.0pt}{}{2L-2-i-j}{L-1-i}\right)+\left(\genfrac{}{}{0.0pt}{}{L-1-i+j}{j}\right)\left(\genfrac{}{}{0.0pt}{}{i+L-1-j}{i}\right)\end{equation} \tag{ H.2 }$

for the two ways of linking the edge points. We simply apply the result from equation (H.1). In the end we find that

$\begin{equation}\text{Number}\;\text{of}\;\text{weight}\-2L\;{Z}_{1}{Z}_{2}\;\text{logical}\;\text{strings}\approx \frac{{4}^{L}}{\sqrt{\pi L}}.\end{equation} \tag{ H.3 }$

In section 6.5, we counted logical strings that act as X or Z on one of the encoded qubits starting from length L, and we found exponentially many logical strings at higher weights. If we consider weight-2L logical strings, we find order μ^2L logical strings, where μ ≈ 2.64 for the 2D square lattice. This is more than 4^L, so we have more of the high-weight logical strings that act on only one encoded qubit. Further, in our path counting in lemma 3, we neglected all logical strings of length >L + 2ζ for a constant ζ. The strings of length ⩾2L contribute negligibly for large L. Then we conclude that the logical noise components, $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ , where L_a is a Y-type logical operator or acts on both encoded qubits, are negligible relative to the noise components where L_a acts as X or Z on one of the encoded qubits. □

**Figure H1.** Here are two examples of lowest-weight Z₁Z₂ logical strings, ${\mathcal{L}}_{1}$ and ${\mathcal{L}}_{2}$ , that act as Z on both encoded qubits. Notice that red and green connect the edge points in different (but topologically equivalent) ways.
Download figure:
Standard image High-resolution image

**Figure H1.** Here are two examples of lowest-weight Z₁Z₂ logical strings, ${\mathcal{L}}_{1}$ and ${\mathcal{L}}_{2}$ , that act as Z on both encoded qubits. Notice that red and green connect the edge points in different (but topologically equivalent) ways.
Download figure:
Standard image High-resolution image

Appendix I.: More general coherent terms

We have considered coherent logical noise components $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ , where ${\tilde {L}}_{a}$ is a logical operator that acts as X or Z on exactly one of the encoded qubits. We must also consider logical noise components $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)$ , where ${\tilde {L}}_{a}$ and ${\tilde {L}}_{b}$ are different non-trivial operators on the encoded qubits.

Lemma 15. Consider the L × L toric code with noise that consists of single-qubit unitary rotations about a fixed axis by angle θ on every qubit, where |sin θ| is <1/L as in lemma 3. Each coherent logical noise component of the form $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)$ , where ${\tilde {L}}_{a}$ and ${\tilde {L}}_{b}$ are different nontrivial logical operators, is negligible relative to the coherent logical noise components with ${\tilde {L}}_{b}=\tilde {id}$ . Each of the more general coherent terms is given by

$\begin{equation}\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)\approx \left({\tilde {L}}_{a}\tilde {\rho }\right)\left(\tilde {\rho }{\tilde {L}}_{b}\right).\end{equation} \tag{ I.1 }$

$\left({\tilde {L}}_{a}\tilde {\rho }\right)$ and $\left(\tilde {\rho }{\tilde {L}}_{b}\right)$ are both small (because we are interested in the regime where error correction succeeds with high probability.) Therefore, we may safely neglect all logical noise components $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)$ , where ${\tilde {L}}_{a}$ and ${\tilde {L}}_{b}$ are different nontrivial logical operators.

Proof. Our approach here is to bound the coherent logical noise components $\left({\tilde {L}}_{a}\tilde {\rho }{\tilde {L}}_{b}\right)$ , where ${\tilde {L}}_{a}$ and ${\tilde {L}}_{b}$ are different nontrivial logical operators, by the coherent logical noise components we have already considered. This follows because the short connected logical strings with different logical action do not overlap much. Overlap here means that the strings contain the same error acting on the same qubits. One possible overlap is between Z₁ and Z₂ logical strings. Pick a Z₁ logical string, ${\mathcal{L}}_{1}$ , and a Z₂ logical string, ${\mathcal{L}}_{2}$ . One string runs left to right, and the other runs top to bottom. If the horizontal string is longer than L, the code distance, then it has vertical steps along it, and these steps may overlap with the vertical logical string. An example is given in figure I1. We assume ${\mathcal{L}}_{1}$ and ${\mathcal{L}}_{2}$ both have length ⩽L + 2ζ because of lemma 3. Then we use lemma 9 to restrict to the case where all the steps are one lattice spacing at a time. Any possible overlap is on at most two sites as shown in figure I1. Further, if we consider all possible pairs of a Z₁ logical string and a Z₂ logical string, only order 1/L strings have any overlap at all, so we can neglect possible overlap.

Because the two logical strings ${\mathcal{L}}_{1}$ and ${\mathcal{L}}_{2}$ are approximately disjoint, when we sum over partitions, each partition approximately factors into a partition of ${\mathcal{L}}_{1}$ times a partition of ${\mathcal{L}}_{2}$ . That is, each connected noise term in the sum for the logical ${\tilde {\chi }}_{{Z}_{1}{Z}_{2}}$ is a partition $\left({O}_{\text{U}}^{\left(1\right)}{O}_{\text{C}}^{\left(2\right)}\rho \;{O}_{\text{C}}^{\left(1\right)}{O}_{\text{U}}^{\left(2\right)}\right)$ which is approximately equal to $\left({O}_{\text{U}}^{\left(1\right)}\rho \;{O}_{\text{C}}^{\left(1\right)}\right)\left({O}_{\text{C}}^{\left(2\right)}\rho \;{O}_{\text{U}}^{\left(2\right)}\right)$ where ${O}_{\text{U}}^{\left(1\right)}{O}_{\text{C}}^{\left(1\right)}={\mathcal{L}}_{1}$ and ${O}_{\text{U}}^{\left(2\right)}{O}_{\text{C}}^{\left(2\right)}={\mathcal{L}}_{2}$ . Therefore, ${\tilde {\chi }}_{{Z}_{1}{Z}_{2}}\approx {\tilde {\chi }}_{{Z}_{1}I}{\tilde {\chi }}_{I{Z}_{2}}$ up to small corrections from the overlap between ${\mathcal{L}}_{1}$ and ${\mathcal{L}}_{2}$ and from the disconnected part. Each of the terms ${\tilde {\chi }}_{{Z}_{1}I}$ and ${\tilde {\chi }}_{I{Z}_{1}}$ will be ≪1 if we are in a regime where error correction succeeds. Therefore, the ${\tilde {\chi }}_{{Z}_{1}{Z}_{2}}$ logical noise component will be negligible relative to the ${\tilde {\chi }}_{{Z}_{1}I}$ logical noise component. The same holds for the other logical noise components with a nontrivial logical operator on each side of $\tilde {\rho }$ . Then we may safely neglect the more general coherent terms and consider only the $\left({\tilde {L}}_{a}\tilde {\rho }\right)$ components. □

**Figure I1.** Here we have a Z₁ and a Z₂ logical string. They have an overlap of two qubits, but if we fix one string and consider all possible paths for the other string, we see that only order 1/L have any overlap.
Download figure:
Standard image High-resolution image

Appendix J.: Growth of infidelity

The expression for the average infidelity after m applications of the noise channel from [25] is an upper bound.

$\begin{equation}{r}_{m}{\leqslant}rm+\frac{\left(d-1\right){{\Theta}}^{2}}{2d}m\left(m-1\right),\end{equation} \tag{ J.1 }$

where r_m is the average infidelity after m applications of a fixed noise channel, r is the average infidelity after one application of the channel, d is the dimension of the Hilbert space on which the channel acts, and Θ is the coherence angle. For anything save unitary or completely coherent channels, the upper bound has a linear component. We expect that this linear part is not only an upper bound, but that the average infidelity will grow linearly to lowest order.

Working in the Pauli transfer matrix representation, a unital noise channel is written as

$\begin{equation}\left(\begin{pmatrix}\hfill 1\hfill & 0\hfill & 0\hfill & \hfill \dots \hfill \\ \hfill 0\hfill & \hfill 1-{\lambda }_{2}\hfill & \hfill {\beta }_{2,3}\hfill & \hfill \dots \hfill \\ \hfill 0\hfill & \hfill {\beta }_{3,2}\hfill & \hfill 1-{\lambda }_{3}\hfill & \hfill \hfill \\ \hfill {\vdots}\hfill & \hfill {\vdots}\hfill & \hfill \hfill & \hfill \ddots \hfill \end{pmatrix}\right).\end{equation} \tag{ J.2 }$

When channels are composed, we multiply the Pauli transfer matrices. After applying the same noise channel twice, we have diagonal entries

$\begin{equation}{\left(1-{\lambda }_{j}\right)}^{2}+\sum _{l\vert l\ne j}{\beta }_{j,l}{\beta }_{l,j}.\end{equation} \tag{ J.3 }$

After m applications of the noise channel, the diagonal entries are

$\begin{equation}{\left(1-{\lambda }_{j}\right)}^{m}+\left(\genfrac{}{}{0.0pt}{}{m}{1}\right){\left(1-{\lambda }_{j}\right)}^{m-1}\sum _{l\vert l\ne j}{\beta }_{j,l}{\beta }_{l,j}+\cdots \;.\end{equation} \tag{ J.4 }$

Then the infidelity after composing the channel m times is proportional to

$\begin{align}\hfill \sum _{j=1}^{d}1-{\left(1-{\lambda }_{j}\right)}^{m}-\left(\genfrac{}{}{0.0pt}{}{m}{1}\right){\left(1-{\lambda }_{j}\right)}^{m-1}\sum _{i\vert i\ne j}{\beta }_{j,i}{\beta }_{i,j}-\;\cdots \\ \hfill =\sum _{j=1}^{d}m{\lambda }_{j}-{\lambda }_{j}^{2}\frac{m\left(m-1\right)}{2}+\;\cdots \;-m\sum _{l\vert l\ne j}{\beta }_{j,l}{\beta }_{l,j}\;\cdots \;.\end{align} \tag{ J.5 }$

To lowest order the infidelity grows proportional to r, the first term in the upper bound in equation (J.1).

Appendix K.: Diamond distance bound

The diamond distance from identity can be bounded in terms of the average infidelity, r, and the sum of squares of the off-diagonal (coherent) components of the chi matrix.

Lemma 16. In equation (48) we upper bounded the diamond distance from identity for a channel by a function f based on [9]. This function depended on the components of the Pauli transfer matrix for the channel. With a little algebra, we can show

$\begin{equation}{f}^{2}{\leqslant}{c}_{1}\left(\sum _{i,j\vert i\ne j}{\chi }_{i,j}^{2}\right)+{c}_{2}{r}^{2}.\end{equation} \tag{ K.1 }$

where the constants are given by ${c}_{1}={d}_{L}^{2}$ and ${c}_{2}=2{\left({d}_{L}+1\right)}^{2}$ and d_L is the dimension of the logical space.

Proof. We start with equation (48), and rewrite the Pauli transfer matrix in terms of chi matrix. We expand ${\left(1-{N}_{i,i}\right)}^{2}$ and compare to r². Equation (50) reads

$\begin{equation}{f}^{2}=\frac{1}{{d}_{L}^{2}-1}\left(\sum _{i,j\vert i\ne j}{N}_{i,j}^{2}+\sum _{l}{\left(1-{N}_{l,l}\right)}^{2}\right),\end{equation} \tag{ K.2 }$

where N is the Pauli transfer matrix representation of the noise channel. The diamond distance from identity is bounded by a constant times f. We can expand f in terms of the chi matrix elements. Recall that we already have lemma 1 concerning the off-diagonal elements. Also, the infidelity r is related to the trace of the Pauli transfer matrix or the (0,0) element of the chi matrix.

We can write the diagonal components of Pauli transfer matrix in terms of the diagonal components of the chi matrix in the following way:

$\begin{equation}{N}_{i,i}=\sum _{j\in {C}_{i}}{\chi }_{j,j}-\sum _{l\in {A}_{i}}{\chi }_{l,l},\end{equation} \tag{ K.3 }$

where the set C_i includes all the Pauli operators σ^j that commute with σⁱ and the set A_i is all Pauli operators σ^l that anticommute with σⁱ. For example, in the case of a single-qubit channel

$\begin{equation}{N}_{1,1}={\chi }_{0,0}+{\chi }_{1,1}-{\chi }_{2,2}-{\chi }_{3,3}.\end{equation} \tag{ K.4 }$

Then we can sum over all the diagonal components of N using the fact that the identity operator commutes with every operator.

$\begin{equation}\sum _{i=0}^{{d}_{L}^{2}}{N}_{i,i}={d}_{L}^{2}{\chi }_{0,0},\end{equation} \tag{ K.5 }$

where d_L is the dimension of the logical space. Next, we can expand the diagonal term from equation (K.2):

$\begin{align}\hfill \sum _{i=0}^{{d}_{L}^{2}}{\left(1-{N}_{i,i}\right)}^{2}& =\sum _{i=0}^{{d}_{L}^{2}}\left(1-2{N}_{i,i}+{N}_{i,i}^{2}\right)\hfill \\ \hfill & ={d}_{L}^{2}-2{d}_{L}^{2}{\chi }_{0,0}+{d}_{L}^{2}\sum _{j}{\chi }_{j,j}^{2}\hfill \\ \hfill & ={d}_{L}^{2}{\left(1-{\chi }_{0,0}\right)}^{2}+{d}_{L}^{2}\sum _{j\vert j\ne 0}{\chi }_{j,j}^{2}\hfill \\ \hfill & ={d}_{L}^{2}{\left(\sum _{l\vert l\ne 0}{\chi }_{l,l}\right)}^{2}+{d}_{L}^{2}\sum _{j\vert j\ne 0}{\chi }_{j,j}^{2},\hfill \end{align} \tag{ K.6 }$

where we have used the trace preservation condition ${\sum }_{i}$ χ_i,i = 1. Because the noise channel is unitary, the diagonal components of the chi matrix are real and greater than 0. Then we can bound

$\begin{equation}\sum _{i=0}^{{d}_{L}^{2}}{\left(1-{N}_{i,i}\right)}^{2}{\leqslant}2{d}_{L}^{2}{\left(\sum _{l\vert l\ne 0}{\chi }_{l,l}\right)}^{2}.\end{equation} \tag{ K.7 }$

When we substitute into equation (K.2) and use lemma 1 for the off-diagonal terms, we have the following bound on the diamond norm distance from identity:

$\begin{equation}{D}_{{\diamondsuit}}{\left(N\right)}^{2}{\leqslant}\frac{{d}_{L}^{2}}{4}\left(\sum _{i,j\vert i\ne j}{\chi }_{i,j}^{2}\right)+\frac{{d}_{L}^{2}}{2}{\left(\sum _{l\ne 0}{\chi }_{l,l}\right)}^{2}.\end{equation} \tag{ K.8 }$

Finally, the average infidelity r is given by

$\begin{equation}r=\frac{{d}_{L}}{{d}_{L}+1}\left(1-{\chi }_{0,0}\right)=\frac{{d}_{L}}{{d}_{L}+1}\sum _{l\ne 0}{\chi }_{l,l}\end{equation} \tag{ K.9 }$

in the chi matrix representation. Equation (K.1) follows.□

Coherence in logical quantum channels

Article metrics

Share this article

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

1.1. Summary of the paper

1.2. Overview of the proof of theorem 3

1.3. Related work

2. Channel parameters

2.1. Pauli transfer matrix

2.2. Average infidelity

2.3. Examples

2.3.1. Depolarizing channel

2.3.2. Qubit rotation

2.3.3. Rotation/dephasing channels

2.4. Unitarity and the coherence angle

2.5. Diamond distance

2.6. Coherence in the chi-matrix representation

3. Logical channel for the repetition code

4. Repetition code revisited

4.1. Stabilizer formalism

4.2. Recovery in the chi-matrix representation

4.3. Analysis of repetition code using the chi-matrix formalism

4.4. Inhomogeneous X-axis rotations

5. Correlated unitary noise

5.1. Coherent component

5.2. Incoherent component

5.3. Comparing the coherent and incoherent components

5.4. Summary

6. The toric code against coherent noise

6.1. The toric code

6.2. Notation

6.3. Coherent and incoherent logical components

6.4. The coherent sum

6.5. Counting of logical strings

6.6. Sum over partitions

6.7. The disconnected part

6.8. Incoherent sum

6.9. The incoherent sum over strings

6.10. Noise terms with mismatched weight

6.11. More general rotation angles

6.12. Correlations

6.13. Main theorem

6.14. Interpreting bounds on coherence

7. Conclusions

Acknowledgments

Appendix A.: Chi matrix and Pauli transfer matrix for qubits

Appendix B.: Approximating sums

Appendix C.: Correlated noise: leading behavior for large n

Appendix D.: The shape of the logical string

Appendix E.: Disconnected errors

Appendix F.: The disconnected part of the incoherent logical noise

Appendix G.: Physical Y errors

Appendix H.: Other logical maps

Appendix I.: More general coherent terms

Appendix J.: Growth of infidelity

Appendix K.: Diamond distance bound