Main

The complex nature of quantum many-body systems motivates the use of machine learning techniques to analyse them. Indeed, large-scale neural networks have successfully solved classically difficult problems such as image recognition or optimization of classical error correction1, and their architectures have been related to various physical concepts2,3. As such, a number of recent works have used neural networks to study properties of quantum many-body systems4,5,6,7,8,9,10. However, the direct application of these classical algorithms is challenging for intrinsically quantum problems, which take quantum states or processes as inputs. This is because the extremely large many-body Hilbert space hinders the efficient translation of such problems into a classical framework without performing exponentially difficult quantum state or process tomography11,12.

Recent experimental progress towards realizing quantum information processors13,14,15,16 has led to proposals for the use of quantum computers to enhance conventional machine learning tasks17,18,19,20. Motivated by such developments, we introduce and analyse a machine learning-inspired quantum circuit model—the quantum convolutional neural network (QCNN)—and demonstrate its ability to solve important classes of intrinsically quantum many-body problems. The first class of problems we consider is quantum phase recognition (QPR), which asks whether a given input quantum state ρin belongs to a particular quantum phase of matter. In contrast to many existing schemes based on tensor network descriptions21,22,23, we assume that ρin is prepared in a physical system without direct access to its classical description. The second class, quantum error correction (QEC) optimization, asks for an optimal QEC code for a given, a priori unknown error model such as dephasing or potentially correlated depolarization in realistic experimental settings. We provide both theoretical insight and numerical demonstrations for the successful application of a QCNN to these important problems, and show its feasibility for near-term experimental implementation.

QCNN circuit model

Convolutional neural networks provide a successful machine learning architecture for classification tasks such as image recognition1,24,25. A CNN generally consists of a sequence of different (interleaved) layers of image processing; in each layer, an intermediate two-dimensional array of pixels, called a feature map, is produced from the previous one (Fig. 1a). (More generally, CNN layers connect ‘volumes’ of multiple feature maps to subsequent volumes; for simplicity, we consider only a single feature map per volume and leave the generalization to future works.) The convolution layers compute new pixel values \(x_{i,j}^{(\ell )}\) from a linear combination of nearby ones in the preceding map, \(x_{i,j}^{(\ell )} = \mathop {\sum}\limits_{a,b = 1}^w {w_{a,b}} x_{i + a,j + b}^{(\ell - 1)}\), where the weights wa,b form a w × w matrix. Pooling layers reduce the feature map size, for example by taking the maximum value from a few contiguous pixels, and are often followed by the application of a nonlinear (activation) function. Once the feature map size becomes sufficiently small, the final output is computed from a function that depends on all remaining pixels (the fully connected layer). The weights and the fully connected function are optimized by training on large datasets. In contrast, variables such as the number of convolution and pooling layers and the size w of the weight matrices (known as hyperparameters) are fixed for a specific CNN1. The key properties of a CNN are thus translationally invariant convolution and pooling layers, each characterized by a constant number of parameters (independent of system size), and sequential data size reduction (that is, a hierarchical structure).
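The convolution and pooling operations described above can be sketched in a few lines of NumPy. This is an illustrative toy only (a single feature map, 'valid' convolution, max pooling followed by a ReLU activation), not an implementation from any specific CNN library:

```python
import numpy as np

def conv_layer(fmap, weights):
    """'Valid' 2D convolution: each new pixel is a linear combination of a
    w x w patch of the preceding feature map, weighted by the w x w matrix."""
    w = weights.shape[0]
    h, wd = fmap.shape
    out = np.empty((h - w + 1, wd - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(weights * fmap[i:i + w, j:j + w])
    return out

def max_pool(fmap, k=2):
    """Max pooling over k x k blocks of contiguous pixels, followed here
    by a ReLU nonlinearity (one common choice of activation)."""
    h, w = fmap.shape[0] // k * k, fmap.shape[1] // k * k
    blocks = fmap[:h, :w].reshape(h // k, k, w // k, k)
    return np.maximum(blocks.max(axis=(1, 3)), 0.0)
```

Alternating these two operations shrinks the feature map until a fully connected function of the remaining pixels produces the output.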

Fig. 1: The concept of QCNNs.
figure 1

a, Simplified illustration of classical CNNs. A sequence of image-processing layers transforms an input image into a series of feature maps (blue rectangles) and finally into an output probability distribution (purple bars). C, convolution; P, pooling; FC, fully connected. b, QCNNs inherit a similar layered structure. Boxes represent unitary gates or measurement with feed-forwarding. c, The QCNN and the MERA share the same circuit structure, but run in reverse directions. Image of cat from https://www.pexels.com/photo/grey-and-white-short-fur-cat-104827/.

Motivated by this architecture, we introduce a QCNN circuit model extending these key properties to the quantum domain (Fig. 1b). The circuit’s input is an unknown quantum state ρin. A convolution layer applies a single quasilocal unitary (Ui) in a translationally invariant manner for finite depth. For pooling, a fraction of qubits are measured, and their outcomes determine unitary rotations (Vj) applied to nearby qubits. Hence, nonlinearities in QCNNs arise from reducing the number of degrees of freedom. Convolution and pooling layers are performed until the system size is sufficiently small; then, a fully connected layer is applied as a unitary F on the remaining qubits. Finally, the outcome of the circuit is obtained by measuring a fixed number of output qubits. As in the classical case, QCNN hyperparameters such as the number of convolution and pooling layers are fixed, and the unitaries themselves are learned.

A QCNN to classify N-qubit input states is thus characterized by O(log(N)) parameters. This corresponds to a double exponential reduction compared with a generic quantum circuit-based classifier19 and allows for efficient learning and implementation. For example, given a set of M classified training vectors {(|ψα〉, yα): α = 1, …, M}, where |ψα〉 are input states and yα = 0 or 1 are corresponding binary classification outputs, one could compute the mean squared error

$${\mathrm{MSE}} = \frac{1}{{2M}}\mathop {\sum}\limits_{\alpha = 1}^M {\left( {y_\alpha - f_{\{ U_i,V_j,F\} }\left( {\left| {\psi _\alpha } \right\rangle } \right)} \right)^2}$$
(1)

Here, \(f_{\{ U_i,V_j,F\} }\left( {\left| {\psi _\alpha } \right\rangle } \right)\) denotes the expected QCNN output value for input |ψα〉. Learning then consists of initializing all unitaries and successively optimizing them until convergence, for example via gradient descent.
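This training loop can be sketched as follows. The toy code below minimizes the MSE of equation (1) by central finite-difference gradient descent; the parameterized function `f` is a generic stand-in for the QCNN output \(f_{\{ U_i,V_j,F\} }\) and is assumed here purely for illustration:

```python
import numpy as np

def mse(theta, f, states, labels):
    """Equation (1): mean squared error over M training pairs (states, labels)."""
    return sum((y - f(theta, s)) ** 2
               for s, y in zip(states, labels)) / (2 * len(states))

def train(theta, f, states, labels, lr=0.5, eps=1e-5, steps=200):
    """Gradient descent on the MSE, with gradients estimated by central
    finite differences in each parameter (a simple stand-in for the
    optimization of the QCNN unitaries)."""
    theta = np.array(theta, dtype=float)
    for _ in range(steps):
        grad = np.empty_like(theta)
        for k in range(len(theta)):
            d = np.zeros_like(theta)
            d[k] = eps
            grad[k] = (mse(theta + d, f, states, labels)
                       - mse(theta - d, f, states, labels)) / (2 * eps)
        theta -= lr * grad
    return theta
```

For the actual QCNN, each parameter enters through the generators of the unitaries, but the loop structure (evaluate, perturb, descend, repeat until convergence) is the same.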

To gain physical insight into the mechanism underlying QCNNs and motivate their application to the problems under consideration, we now relate our circuit model to two well-known concepts in quantum information theory—the multiscale entanglement renormalization ansatz26 (MERA) and QEC. The MERA framework provides an efficient tensor network representation of many classes of interesting many-body wavefunctions, including those associated with critical systems26,27,28. A MERA can be understood as a quantum state generated by a sequence of unitary and isometry layers applied to an input state (for example |00〉). While both types of layers apply quasilocal unitary gates, each isometry layer first introduces a set of new qubits in a predetermined state, such as |0〉 (Fig. 1c). This exponentially growing, hierarchical structure allows for the long-range correlations associated with critical systems. The QCNN circuit has similar structure but runs in the reverse direction. Hence, for any given state |ψ〉 with a MERA representation, there is always a QCNN that recognizes |ψ〉 with deterministic measurement outcomes; one such QCNN is simply the inverse of the MERA circuit.

For input states other than |ψ〉, however, such a QCNN does not generally produce deterministic measurement outcomes. These additional degrees of freedom distinguish a QCNN from a MERA. Specifically, we can identify the measurements as syndrome measurements in QEC29, which determine error correction unitaries Vj to apply to the remaining qubit(s). Thus, a QCNN circuit with multiple pooling layers can be viewed as a combination of a MERA (an important variational ansatz for many-body wavefunctions) and nested QEC (a mechanism to detect and correct local quantum errors without collapsing the wavefunction). This makes QCNNs a powerful architecture for classifying input quantum states or devising new QEC codes. In particular, for QPR, the QCNN can provide a MERA realization of a representative state |ψ0〉 in the target phase. Other input states within the same phase can be viewed as |ψ0〉 with local errors, which are repeatedly corrected by the QCNN in multiple layers. In this sense, the QCNN circuit can mimic renormalization-group flow, a methodology that successfully classifies many families of quantum phases30. For QEC optimization, the QCNN structure allows for simultaneous optimization of efficient encoding and decoding schemes with potentially rich entanglement structure.

Detecting a 1D symmetry-protected topological phase

We first demonstrate the potential of a QCNN by applying it to QPR in a class of 1D many-body systems. Specifically, we consider a \({\Bbb Z}_2 \times {\Bbb Z}_2\) symmetry-protected topological (SPT) phase \({\mathcal{P}}\), a phase containing the S = 1 Haldane chain31, and ground states {|ψG〉} of a family of Hamiltonians on a spin-1/2 chain with open boundary conditions:

$$H = - J\mathop {\sum}\limits_{i = 1}^{N - 2} Z_iX_{i + 1}Z_{i + 2} - h_1\mathop {\sum}\limits_{i = 1}^N X_i - h_2\mathop {\sum}\limits_{i = 1}^{N - 1} X_iX_{i + 1}$$
(2)

where Xi, Zi are Pauli operators for the spin at site i, and h1, h2 and J are parameters of the Hamiltonian. The \({\Bbb Z}_2 \times {\Bbb Z}_2\) symmetry is generated by \(X_{{\mathrm{even}}({\mathrm{odd}})}= \mathop {\prod}\limits_{i \in {\mathrm{even}}({\mathrm{odd}})} {X_i}\). Figure 2a shows the phase diagram as a function of (h1/J, h2/J). When h2 = 0, the Hamiltonian is exactly solvable via the Jordan–Wigner transformation30, confirming that \({\cal{P}}\) is characterized by non-local order parameters. When h1 = h2 = 0, all terms are mutually commuting, and a ground state is the 1D cluster state. Our goal is to identify whether a given, unknown ground state drawn from the phase diagram belongs to \({\cal{P}}\).
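For small chains, the Hamiltonian of equation (2) can be constructed explicitly and diagonalized exactly. The following NumPy sketch uses dense Kronecker products and is therefore feasible only for small N; it is meant as an illustration, not as the matrix product state machinery actually used in the simulations:

```python
import numpy as np
from functools import reduce

I = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

def op(paulis, sites, N):
    """Tensor product placing the given Pauli matrices at the given sites
    of an N-spin chain (identity elsewhere)."""
    mats = [I] * N
    for P, s in zip(paulis, sites):
        mats[s] = P
    return reduce(np.kron, mats)

def hamiltonian(N, J, h1, h2):
    """Equation (2): cluster terms ZXZ, transverse field X, and Ising XX,
    with open boundary conditions."""
    H = -J * sum(op([Z, X, Z], [i, i + 1, i + 2], N) for i in range(N - 2))
    H += -h1 * sum(op([X], [i], N) for i in range(N))
    H += -h2 * sum(op([X, X], [i, i + 1], N) for i in range(N - 1))
    return H
```

At h1 = h2 = 0 the Hamiltonian reduces to the mutually commuting cluster terms, and one can check directly that H commutes with the symmetry generators X_even and X_odd.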

Fig. 2: Application to quantum phase recognition.
figure 2

a, The phase diagram of the Hamiltonian in the main text. The phase boundary points (blue and red diamonds) are extracted from infinite-size DMRG numerical simulations, while the background shading (colour scale) represents the output from the exact QCNN circuit for input size N = 45 spins (see Methods). b, Exact QCNN circuit to recognize a \({\Bbb Z}_2 \times {\Bbb Z}_2\) SPT phase. Blue line segments represent controlled-phase gates, blue three-qubit gates are Toffoli gates with the control qubits in the X basis, and orange two-qubit gates flip the target qubit’s phase when the X measurement yields −1. The fully connected layer applies controlled-phase gates followed by an Xi projection, effectively measuring Zi−1XiZi+1. c, Exact QCNN output along h1 = 0.5J for N = 135 spins, depths d = 1, …, 4 (from light to dark blue). d, Sample complexity of QCNN at depths d = 1, …, 4 (from light to dark blue) versus SOPs of length N/2, N/3, N/5 and N/6 (from light to dark red) to detect the SPT/paramagnet phase transition along h1 = 0.5J for N = 135 spins. The critical point is identified as h2/J = 0.423 using infinite-size DMRG. In the shaded area, the correlation length exceeds the system size, and finite-size effects can considerably affect our results. Inset: the ratio of SOP sample complexity to QCNN sample complexity is plotted as a function of d on a logarithmic scale for h2/J = 0.3918. In the numerically accessible regime, this reduction of sample complexity scales exponentially as \(1.73\,{\mathrm{e}}^{0.28d}\) (trendline).

As an example, we first present an exact, analytical QCNN circuit that recognizes \({\cal{P}}\) (Fig. 2b). The convolution layers involve controlled-phase gates as well as Toffoli gates with controls in the X-basis, and pooling layers perform phase-flips on remaining qubits when one adjacent measurement yields X = −1. This convolution–pooling unit is repeated d times, where d is the QCNN depth. The fully connected layer measures Zi−1XiZi+1 on the remaining qubits. Figure 2c shows the QCNN output for a system of N = 135 spins and d = 1, …, 4 along h1 = 0.5J, obtained using matrix product state simulations. As d increases, the measurement outcomes show sharper changes around the critical point, and the output of a d = 2 circuit already reproduces the phase diagram with high accuracy (Fig. 2a). This QCNN can also be used for other Hamiltonian models belonging to the same phase, such as the S = 1 Haldane chain31 (see Methods).

Sample complexity

The performance of a QPR solver can be quantified by sample complexity11: what is the expected number of copies of the input state required to identify its quantum phase? We demonstrate that the sample complexity of our exact QCNN circuit is substantially better than that of conventional methods. In principle, \({\cal{P}}\) can be detected by measuring a non-zero expectation value of string order parameters (SOPs)32,33 \(S\) such as

$$S_{ab} = Z_aX_{a + 1}X_{a + 3}\,...\,X_{b - 3}X_{b - 1}Z_b$$
(3)

In practice, however, the expectation values of SOP vanish near the phase boundary due to diverging correlation length33; since quantum projection noise is maximal in this vicinity, many experimental repetitions are required to affirm a non-zero expectation value. In contrast, the QCNN output is much sharper near the phase transition, so fewer repetitions are required.

Quantitatively, given some |ψin〉 and SOP \(S\), a projective measurement of S can be modelled as a (generalized) Bernoulli random variable, where the outcome is 1 with probability p = (〈ψin|S|ψin〉 + 1) / 2 and −1 with probability 1 − p (since \({S}^2\) equals the identity operator); after M binary measurements, we estimate p, and an estimate p > p0 = 0.5 signifies \(\left| {\psi _{{\mathrm{in}}}} \right\rangle \in {\cal{P}}\). We define the sample complexity Mmin as the minimum M needed to test whether p > p0 with 95% confidence using an arcsine variance-stabilizing transformation34:

$$M_{{\mathrm{min}}} = \frac{{1.96^2}}{{({\mathrm{arcsin}} \sqrt p - {\mathrm{arcsin}} \sqrt {p_0} )^2}}$$
(4)

Similarly, the sample complexity for a QCNN can be determined by replacing 〈ψin|S|ψin〉 by the QCNN output expectation value in the expression for p.
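Equation (4) translates directly into code. The following minimal sketch takes a measured expectation value (of an SOP, or of the QCNN output) and returns the corresponding Mmin; the 95% confidence level enters through the z-value 1.96:

```python
import numpy as np

def sample_complexity(expval, p0=0.5, z=1.96):
    """Equation (4): minimum number of repetitions to certify p > p0 at
    95% confidence, using the arcsine variance-stabilizing transformation.
    expval is the measured expectation value of S (or of the QCNN output)."""
    p = (expval + 1.0) / 2.0  # Bernoulli success probability
    return z**2 / (np.arcsin(np.sqrt(p)) - np.arcsin(np.sqrt(p0)))**2
```

As the expectation value approaches zero (p approaching p0 near the phase boundary), the denominator vanishes and the required number of copies diverges, which is why a sharper classifier output directly reduces sample complexity.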

Figure 2d shows the sample complexity for the QCNN at various depths and for SOPs of different lengths. The QCNN clearly requires substantially fewer input copies throughout the parameter regime, especially near criticality. More importantly, while the SOP sample complexity is essentially independent of string length, the QCNN sample complexity consistently improves with increasing depth and is limited only by finite-size effects in our simulations. In particular, compared with SOPs, the QCNN reduces sample complexity by a factor that scales exponentially with the QCNN depth in numerically accessible regimes (inset). Such scaling arises from the iterative QEC performed at each depth and is not expected from measurements of any simple (potentially nonlocal) observables. We show in the Methods that our QCNN circuit measures a multiscale string order parameter—a sum of products of exponentially many different SOPs that remains sharp up to the phase boundary.

MERA and QEC

Further insights into the performance of the QCNN are revealed by interpreting it in terms of MERA and QEC. In particular, our QCNN is specifically designed to contain the MERA representation of the 1D cluster state (|ψ0〉) such that it becomes a stable fixed point. When |ψ0〉 is used as input, each convolution–pooling unit produces the same state |ψ0〉 with reduced system size in the unmeasured qubits, while yielding deterministic outcomes (X = 1) in the measured qubits. The fully connected layer measures the SOP for |ψ0〉. When an input wavefunction is perturbed away from |ψ0〉, our QCNN corrects such ‘errors’. For example, if a single X error occurs, the first pooling layer identifies its location, and controlled unitary operations correct the error propagated through the circuit (Fig. 3). Similarly, if an initial state has multiple, sufficiently separated errors (possibly in coherent superpositions), the error density after several iterations of convolution and pooling layers will be substantially smaller35. If the input state converges to the fixed point, our QCNN classifies it into the SPT phase with high fidelity. Clearly, this mechanism resembles the classification of quantum phases based on renormalization-group flow.

Fig. 3: MERA and QEC in the QCNN circuit.
figure 3

When the input state is the cluster state with a single-qubit X error (left), the convolution–pooling unit of our circuit identifies and corrects the error, while reducing the system size by a factor of 3 (right). This process resembles the combination of MERA and QEC.

Obtaining QCNN from training procedure

Having analytically illustrated the computational power of the QCNN circuit model, we now demonstrate how a QCNN for \({\cal{P}}\) can also be obtained using the learning procedure. Details of the hyperparameters of the QCNN can be found in the Methods and Supplementary Fig. 2. Initially, all unitaries are set to random values. As classically simulating our training procedure requires expensive computational resources, we focus on a relatively small system with N = 15 spins and QCNN depth d = 1; there are a total of 1,309 parameters to be learned (see Methods). Our training data consists of 40 evenly spaced points along the line h2 = 0, where the Hamiltonian is exactly solvable by the Jordan–Wigner transformation. Using gradient descent with the MSE function (1), we iteratively update the unitaries until convergence (see Methods). The classification output of the resulting QCNN for generic h2 is shown in Fig. 4. This QCNN accurately reproduces the two-dimensional phase diagram over the entire parameter regime, despite being trained only on samples from a set of solvable points that do not even cross the lower phase boundary.

Fig. 4: Output of a trained QCNN.
figure 4

We numerically optimize our QCNN for a system of N = 15 spins and d = 1 starting from random initial values. The training data points are 40 equally spaced points h1 ∈ [0, 2] along the line h2 = 0 where the Hamiltonian is solvable by the Jordan–Wigner transformation (grey dots). The blue and red diamonds are phase boundary points extracted from infinite-size DMRG numerical simulations, while the background shading (colour scale) represents the expectation value of QCNN output.

This example illustrates how the QCNN structure avoids overfitting to training data with its exponentially reduced number of parameters. While the training dataset for this particular QPR problem consists of solvable points, more generally, such a dataset can be obtained by using traditional methods (such as measuring SOPs) to classify representative states that can be efficiently generated either numerically or experimentally36,37.

Optimizing QEC

As seen in the previous example, the QCNN’s architecture enables one to perform effective QEC. We next leverage this feature to design a new QEC code that is itself optimized for a given error model. More specifically, any QCNN circuit (and its inverse) can be viewed as a decoding (encoding) quantum channel between the physical input qubits and the logical output qubit. The encoding scheme introduces sets of new qubits in a predetermined state, for example |0〉, while the decoding scheme performs measurements (Fig. 5a). Given an error channel \({\cal{N}}\), our aim is therefore to maximize the recovery fidelity

$$f = \mathop {\sum}\limits_{\left| {\psi _l} \right\rangle \in \{ \pm x,y,z\} } {\left\langle {\psi _l} \right|} {\cal{M}}^{ - 1}\left( {{\cal{N}}\left( {{\cal{M}}\left( {\left| {\psi _l} \right\rangle \left\langle {\psi _l} \right|} \right)} \right)} \right)\left| {\psi _l} \right\rangle$$
(5)

where \({\cal{M}}( {{\cal{M}}^{ - 1}})\) is the encoding (decoding) scheme generated by a QCNN circuit, and |±x, y, z〉 are the ±1 eigenstates of the Pauli matrices. Thus, our method simultaneously optimizes both encoding and decoding schemes, while ensuring their efficient implementation in realistic systems. The variational optimization can be carried out with an unknown \({\cal{N}}\), since f can be evaluated experimentally.
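As a minimal illustration of equation (5), the sketch below evaluates the recovery fidelity for a single qubit with trivial encoding and decoding (\({\cal{M}}\) taken to be the identity) under a bit-flip channel. The 9-qubit QCNN encoder itself is not implemented here; this only shows how f is assembled from the six Pauli eigenstates:

```python
import numpy as np

# The ±1 eigenstates of the Pauli matrices: |±x>, |±y>, |±z>
STATES = [np.array(v, dtype=complex) / np.linalg.norm(np.array(v, dtype=complex))
          for v in [[1, 1], [1, -1], [1, 1j], [1, -1j], [1, 0], [0, 1]]]

def bit_flip(rho, p):
    """Single-qubit bit-flip channel: N(rho) = (1 - p) rho + p X rho X."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    return (1 - p) * rho + p * (X @ rho @ X)

def recovery_fidelity(channel):
    """Equation (5) with M = identity: sum of recovery fidelities
    <psi| N(|psi><psi|) |psi> over the six Pauli eigenstates."""
    return sum((psi.conj() @ channel(np.outer(psi, psi.conj())) @ psi).real
               for psi in STATES)
```

With no noise the sum equals 6; a bit-flip channel leaves |±x〉 invariant and degrades the other four states, so f = 2 + 4(1 − p). In the full scheme, the channel is sandwiched between the learned encoder and decoder instead of acting on the bare qubit.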

Fig. 5: QCNN for optimizing quantum error correction.
figure 5

a, Schematic for using QCNNs to optimize QEC. The inverse QCNN encodes a single logical qubit |ψl〉 into nine physical qubits, which undergo noise \({\cal{N}}\). The QCNN then decodes these to obtain the logical state ρ. Our aim is to maximize 〈ψl|ρ|ψl〉. b, Logical error rate of Shor code (blue) versus a learned QEC code (orange) in a correlated error model. The input error rate is defined as the sum of all probabilities pμ and pxx. The performance of the Shor code is worse than performing no error correction at all (identity, grey line), while the optimized code can still substantially reduce the error rate.

To illustrate the potential of this procedure, we consider a two-layer QCNN with N = 9 physical qubits and 126 variational parameters (Fig. 5a and Methods). This particular architecture includes the nested (classical) repetition codes and the 9-qubit Shor code38; in the following, we compare our performance to the better of the two. We consider three different input error models: (1) independent single-qubit errors on all qubits with equal probabilities pμ for μ = X, Y and Z errors; (2) independent single-qubit errors with anisotropic probabilities px ≠ py = pz; and (3) independent single-qubit anisotropic errors with additional two-qubit correlated errors XiXi+1 occurring with probability pxx.

On initializing all QCNN parameters to random values and numerically optimizing them to maximize f, we find that our model produces the same logical error rate as known codes in case (1) but can reduce the error rate by a constant factor of up to 50% in case (2), depending on the specific input error probability ratios (see Methods and Supplementary Fig. 4). In case (3), the optimized QEC code performs substantially better than known codes (Fig. 5b). As the Shor code is only guaranteed to correct arbitrary single-qubit errors, it performs more poorly than using no error correction, while the optimized QEC code performs much better. This example demonstrates the power of using QCNNs to obtain and optimize new QEC codes for realistic, a priori unknown error models.
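The crossover in Fig. 5b, where a fixed code can perform worse than no encoding at all, already appears in the simplest setting. The sketch below gives the analytic logical error rate of the 3-qubit repetition code under independent bit flips; this is a toy illustration only, not the paper's optimized 9-qubit code or its correlated error model:

```python
def logical_error_rep3(p):
    """Logical error probability of the 3-qubit repetition code under
    independent bit flips with probability p: majority vote fails when
    two or three of the qubits flip."""
    return 3 * p**2 * (1 - p) + p**3
```

Below the code's threshold the logical rate is suppressed relative to the physical rate, while above it encoding actively hurts, mirroring the behaviour of the Shor code once correlated XX errors violate its design assumptions.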

Experimental realizations

Our QCNN architecture can be efficiently implemented on several state-of-the-art experimental platforms. The key ingredients for realizing QCNNs include the efficient preparation of quantum many-body input states, the application of two-qubit gates at various length scales and projective measurements. As in stabilizer-based QEC, the measurements of intermediate qubits and feed-forwarding can be replaced by controlled two-qubit unitary operations so that measurements are performed only at the end of an experimental sequence. These capabilities have already been demonstrated in multiple programmable quantum simulators consisting of N ≥ 50 qubits based on trapped neutral atoms and ions, or superconducting qubits39,40,41,42.

As an example, we present a feasible protocol for near-term implementation of our exact cluster model QCNN circuit via neutral Rydberg atoms39,43, where long-range dipolar interactions allow high-fidelity entangling gates44 among distant qubits in a variable geometric arrangement. The qubits can be encoded in the hyperfine ground states, where one of the states can be coupled to the Rydberg level to perform efficient entangling operations via the Rydberg-blockade mechanism44; an explicit implementation scheme for every gate in Fig. 2b is provided in the Methods. Our QCNN at depth d with N input qubits requires at most \(\frac{{7N}}{2}(1 - 3^{1 - d}) + N3^{1 - d}\) multi-qubit operations and 4d single-qubit rotations. For a realistic effective coupling strength Ω ≈ 2π × 10–100 MHz and single-qubit coherence time τ ≈ 200 μs limited by the Rydberg state lifetime, approximately Ωτ ≈ 2π × 103–104 multi-qubit operations can be performed, and a d = 4 QCNN on N ≈ 100 qubits is feasible. These estimates are reasonably conservative as we have not considered advanced control techniques such as pulse-shaping45 or potentially parallelizing independent multi-qubit operations.
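The resource estimate quoted above can be evaluated directly. The small helper below is a hypothetical convenience function, not part of any implementation; it simply encodes the gate-count formulas from the text:

```python
def gate_counts(N, d):
    """Resource estimate for the Rydberg implementation described in the
    text: at most (7N/2)(1 - 3^(1-d)) + N * 3^(1-d) multi-qubit operations
    and 4d single-qubit rotations for a depth-d QCNN on N input qubits."""
    multi = 3.5 * N * (1 - 3.0 ** (1 - d)) + N * 3.0 ** (1 - d)
    single = 4 * d
    return multi, single
```

For N = 100 and d = 4 this gives a few hundred multi-qubit operations, comfortably within the Ωτ ≈ 10³–10⁴ budget estimated from the coupling strength and coherence time.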

Outlook

These considerations indicate that QCNNs provide a promising quantum machine learning paradigm. Several interesting generalizations and future directions can be considered. For example, while we have only presented the QCNN circuit structure for recognizing 1D phases, it is straightforward to generalize the model to higher dimensions, where phases with intrinsic topological order such as the toric code are supported46,47,48. Such studies could potentially identify nonlocal order parameters with low sample complexity for lesser-understood phases such as quantum spin liquids49 or anyonic chains50. To recognize more exotic phases, we could also relax the translation-invariance constraints, resulting in \(O(N\log (N))\) parameters for system size N, or use ancilla qubits to implement parallel feature maps following traditional CNN architecture. Further extensions could incorporate optimizations for fault-tolerant operations on QEC code spaces. Finally, while we have used a finite-difference scheme to compute gradients in our learning demonstrations, the structural similarity of a QCNN to its classical counterpart suggests the possibility of adopting more efficient schemes such as backpropagation1.

Methods

Phase diagram and QCNN circuit simulations

The phase diagram in Fig. 2a was numerically obtained using the infinite-size density matrix renormalization group (DMRG) algorithm. We generally follow the method outlined in ref. 51 with 150 as the maximum bond dimension. To extract each data point in Fig. 2a, we numerically obtain the ground state energy density as a function of h2 for fixed h1 and compute its second-order derivative. The phase boundary points are identified from sharp peaks.

The simulation of our QCNN in Fig. 2b also uses matrix product state representations. We first obtain the input ground state wavefunction using finite-size DMRG51 with bond dimension D = 130 for a system of N = 135 qubits. We then perform the circuit operations by sequentially applying swap gates and two-qubit gates on the nearest neighboring qubits52. Each three-qubit gate is decomposed into two-qubit unitaries53. We find that increasing the bond dimension to D = 150 does not lead to any visible changes in our main figures, confirming a reasonable convergence of our method. The colour plot in Fig. 2a is similarly generated for a system of N = 45 qubits.

QCNN for the S = 1 Haldane chain

As discussed in the main text, the (spin-1/2) 1D cluster state belongs to an SPT phase protected by \({\Bbb Z}_2 \times {\Bbb Z}_2\) symmetry, a phase that also contains the celebrated S = 1 Haldane chain31. It is thus natural to ask whether this circuit can be used to detect the phase transition between the Haldane phase and an S = 1 paramagnetic phase, which we numerically demonstrate here.

The one-parameter family of Hamiltonians we consider for the Haldane phase is defined on a 1D chain of N spin-1 particles with open boundary conditions31:

$$H_{{\mathrm{Haldane}}} = J\mathop {\sum}\limits_{j = 1}^{N - 1} {{\mathbf{S}}_j} \cdot {\mathbf{S}}_{j + 1} + \omega \mathop {\sum}\limits_{j = 1}^N {(S_j^z)^2}$$
(6)

In this equation, Sj denotes the vector of S = 1 spin operators at site j, and J, ω are parameters of the Hamiltonian. The system is protected by a \({\Bbb{Z}}_2 \times {\Bbb{Z}}_2\) symmetry generated by global π-rotations of every spin around the X and Y axes: \(R_{x} = \mathop {\prod}\limits_{j} {e^{i{\pi}S_{j}^{x}}}\), \(R_y = \mathop {\prod}\limits_j {\mathrm{e}}^{i{\pi} S_{j}^{y}}\). When ω is zero or small compared to J, the ground state belongs to the SPT phase, but when ω/J is sufficiently large, the ground state becomes paramagnetic31.

To apply our QCNN circuit to this Haldane phase, we must first identify a quasilocal isometric map U between the two models, because their representations of the symmetry group are distinct. More specifically, since the cluster model has a \({\Bbb Z}_2 \times {\Bbb Z}_2\) symmetry generated by \(X_{{\mathrm{even}}({\mathrm{odd}})} = \mathop {\prod}\limits_{i \in {\mathrm{even}}({\mathrm{odd}})} {X_i}\), we require \(UR_xU^\dagger = X_{{\mathrm{odd}}}\) and \(UR_yU^\dagger = X_{{\mathrm{even}}}\). Such a map can be constructed following ref. 54. Intuitively, it extends the local Hilbert space of a spin-1 particle by introducing a spin singlet state |s〉 and mapping it to a pair of spin-1/2 particles: \(\left| x \right\rangle \mapsto \left| { + - } \right\rangle\), \(\left| y \right\rangle \mapsto - \left| { - + } \right\rangle\), \(\left| z \right\rangle \mapsto - i\left| { - - } \right\rangle\), \(\left| s \right\rangle \mapsto \left| { + + } \right\rangle\). Here, |±〉 denote the ±1 eigenstates of the (spin-1/2) Pauli matrix X, and |μ〉 denotes a spin-1 state defined by \(R_\nu \left| \mu \right\rangle = ( - 1)^{\delta _{\mu ,\nu } + 1}\left| \mu \right\rangle\) (μ, ν ∈ {x, y, z}). The QCNN circuit for the Haldane chain thus consists of applying U followed by the circuit presented in the main text.

Supplementary Fig. 1 shows the QCNN output for an input system of N = 54 spin-1 particles at d = 1, …, 4, obtained using matrix product state simulations with D = 160. For this system size, we numerically identified the critical point as ω/J = 1.035 ± 0.005 by using DMRG to obtain the second derivative of energy density as a function of ω and J. The QCNN provides accurate identification of the phase transition.

Multiscale string order parameters

We examine the final operator measured by our circuit that recognizes the SPT phase in the Heisenberg picture. Although a QCNN performs non-unitary measurements in the pooling layers, similar to QEC circuits29, one can postpone all measurements to the end and replace pooling layers by unitary-controlled gates acting on both measured and unmeasured qubits. In this way, the QCNN is equivalent to measuring a non-local observable

$${\cal{O}} = (U_{{\mathrm{CP}}}^{(d)}...U_{{\mathrm{CP}}}^{(1)})^\dagger Z_{i - 1}X_iZ_{i + 1}(U_{{\mathrm{CP}}}^{(d)}...U_{{\mathrm{CP}}}^{(1)})$$
(7)

where i is the index of the measured qubit in the final layer and \(U_{{\mathrm{CP}}}^{(l)}\) is the unitary corresponding to the convolution–pooling unit at depth l. A more explicit expression of \({\cal{O}}\) can be obtained by commuting UCP with the Pauli operators, which yields recursive relations:

$$U_{{\mathrm{CP}}}^\dagger X_iU_{{\mathrm{CP}}} = X_{\tilde i - 2}X_{\tilde i}X_{\tilde i + 2}$$
(8)
$$U_{{\mathrm{CP}}}^\dagger Z_iU_{{\mathrm{CP}}} = \frac{1}{2}(Z_{\tilde i} + Z_{\tilde i - 2}X_{\tilde i - 1} + X_{\tilde i + 1}Z_{\tilde i + 2} - Z_{\tilde i - 2}X_{\tilde i - 1}Z_{\tilde i}X_{\tilde i + 1}Z_{\tilde i + 2})$$
(9)

\(\tilde{i}\) enumerates every qubit at depth l − 1, including those measured in the pooling layer. It follows that an SOP of the form ZXXXZ at depth l transforms into a weighted linear combination of 16 products of SOPs at depth l − 1. Thus, instead of measuring a single SOP, our QCNN circuit measures a sum of products of exponentially many different SOPs:

$${\cal{O}} = \mathop {\sum}\limits_{ab} {C_{ab}^{(1)}} {S}_{ab} + \mathop {\sum}\limits_{a_1b_1a_2b_2} {C_{a_1b_1a_2b_2}^{(2)}} {S}_{a_1b_1}{S}_{a_2b_2} + \cdots$$
(10)

\({\cal{O}}\) can be viewed as a multiscale SOP with coefficients computed recursively in d using equations (8) and (9). This allows the QCNN to produce a sharp classification output even when the correlation length is as long as \(3^d\).

Demonstration of learning procedure for QPR

To perform our learning procedure in a QPR problem, we choose the hyperparameters for the QCNN as shown in Supplementary Fig. 2. This hyperparameter structure can be used for generic (1D) phases, and is characterized by a single integer n that determines the reduction of system size in each convolution–pooling layer, L → L/n. (Supplementary Fig. 2 shows the special case where n = 3.) The first convolution layer involves (n + 1)-qubit unitaries starting on every nth qubit. This is followed by n layers of n-qubit unitaries arranged as shown in Supplementary Fig. 2. The pooling layer measures n − 1 out of every contiguous block of n qubits; each of these is associated with a unitary Vj applied to the remaining qubit, depending on the measurement outcome. This set of convolution and pooling layers is repeated d times. Finally, the fully connected layer consists of an arbitrary unitary on the remaining \(N/n^d\) qubits, and the classification output is given by the measurement output of the middle qubit (or any fixed choice of one of them). For our example, we choose n = 3 because the Hamiltonian in equation (2) involves three-qubit terms.
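The size bookkeeping L → L/n per convolution–pooling layer can be sketched with a small hypothetical helper (`layer_sizes` is our own name, not from the paper):

```python
def layer_sizes(N, n, d):
    """Qubit counts entering each of the d convolution-pooling layers,
    followed by the count reaching the fully connected layer (L -> L/n)."""
    sizes = [N]
    for _ in range(d):
        assert sizes[-1] % n == 0, "size must be divisible by n at every layer"
        sizes.append(sizes[-1] // n)
    return sizes

# The example in the text: N = 15 spins, n = 3, d = 1 leaves 5 qubits
# for the fully connected layer.
print(layer_sizes(15, 3, 1))  # [15, 5]
```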

In our simulations, we consider only N = 15 spins and d = 1, because classically simulating larger quantum circuits requires exponentially more resources. We parameterize unitaries as exponentials of generalized a × a Gell-Mann matrices {Λi}, where \(a = 2^v\) and v is the number of qubits involved in the unitary55: \(U = {\mathrm{exp}} \left( { - i\mathop {\sum}\limits_j {c_j} {\it{\Lambda }}_j} \right)\).

This parameterization is used directly for the unitaries in the convolution layers C2–C4, the pooling layer and the fully connected layer. For the first convolution layer C1, we restrict the choice of U1 to a product of six two-qubit unitaries between each possible pair of qubits: U1 = U(23)U(24)U(13)U(14)U(12)U(34), where U(αβ) is a two-qubit unitary acting on the qubits indexed by α and β. Such a decomposition is useful when considering experimental implementation.
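A minimal sketch of this parameterization, constructing the generalized Gell-Mann matrices and exponentiating their Hermitian sum via an eigendecomposition (the function names are ours):

```python
import numpy as np

def gell_mann(a):
    """Generalized a x a Gell-Mann matrices: a**2 - 1 traceless Hermitian generators."""
    mats = []
    for j in range(a):                      # symmetric and antisymmetric pairs
        for k in range(j + 1, a):
            sym = np.zeros((a, a), dtype=complex)
            sym[j, k] = sym[k, j] = 1.0
            asym = np.zeros((a, a), dtype=complex)
            asym[j, k], asym[k, j] = -1j, 1j
            mats += [sym, asym]
    for l in range(1, a):                   # diagonal generators
        diag = np.zeros((a, a), dtype=complex)
        diag[np.arange(l), np.arange(l)] = 1.0
        diag[l, l] = -float(l)
        mats.append(np.sqrt(2.0 / (l * (l + 1))) * diag)
    return mats

def unitary_from_params(c, a):
    """U = exp(-i * sum_j c_j Lambda_j), via eigendecomposition of the Hermitian sum."""
    H = sum(cj * L for cj, L in zip(c, gell_mann(a)))
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w)) @ V.conj().T

rng = np.random.default_rng(0)
a = 4                                       # a two-qubit unitary: v = 2, a = 2**v
U = unitary_from_params(rng.uniform(0, 2 * np.pi, a * a - 1), a)
assert np.allclose(U @ U.conj().T, np.eye(a))  # unitary by construction
```

A v-qubit unitary is thus specified by a² − 1 real coefficients (15 for a two-qubit gate, 63 for a three-qubit gate, as used below for QEC).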

In the QCNN learning procedure, all parameters cμ are set to random values between 0 and 2π for the unitaries {Ui, Vj, F}. In every iteration of gradient descent, we compute the derivative of the mean-squared error function (equation (1)) to first order with respect to each cμ by using the finite-difference method:

$$\frac{{\partial {\mathrm{MSE}}}}{{\partial c_\mu }} = \frac{1}{{2\epsilon }}\left( {{\mathrm{MSE}}(c_\mu + \epsilon ) - {\mathrm{MSE}}(c_\mu - \epsilon )} \right) + O(\epsilon ^2)$$
(11)

Each coefficient is thus updated as \(c_\mu \mapsto c_\mu - \eta \frac{{\partial {\mathrm{MSE}}}}{{\partial c_\mu }}\), where η is the learning rate for that iteration. We compute the learning rate using the bold driver technique from machine learning, where η is increased by 5% if the error has decreased from the previous iteration and decreased by 50% otherwise56. We repeat the gradient descent procedure until the error function changes on the order of 10−5 between successive iterations. In our simulations, we use \(\epsilon = 10^{ - 4}\) for the gradient computation and begin with an initial learning rate of η0 = 10.
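Putting equation (11) and the bold-driver rule together, the update loop can be sketched as follows. The quadratic `cost` is a toy stand-in for the true MSE, which requires evaluating the QCNN circuit; the rejection-on-worsening variant of the bold driver used here is one common form of the technique:

```python
import numpy as np

def finite_diff_grad(f, c, eps=1e-4):
    """Central-difference gradient, equation (11), accurate to O(eps**2)."""
    g = np.zeros_like(c)
    for mu in range(len(c)):
        step = np.zeros_like(c)
        step[mu] = eps
        g[mu] = (f(c + step) - f(c - step)) / (2 * eps)
    return g

def bold_driver_descent(f, c0, eta0=10.0, tol=1e-5, max_iter=10_000):
    """Gradient descent with the bold-driver rule: eta grows by 5% after an
    improving step and is halved (with the step rejected) otherwise."""
    c = np.array(c0, dtype=float)
    eta, prev = eta0, f(c)
    for _ in range(max_iter):
        c_new = c - eta * finite_diff_grad(f, c)
        cur = f(c_new)
        if abs(prev - cur) < tol:   # stop when the error barely changes
            break
        if cur < prev:              # error decreased: accept and speed up
            c, prev, eta = c_new, cur, eta * 1.05
        else:                       # error increased: reject and slow down
            eta *= 0.5
    return c

# Toy quadratic stand-in for the MSE of equation (1); minimum at c = 1.
cost = lambda c: float(np.sum((c - 1.0) ** 2))
c_opt = bold_driver_descent(cost, np.zeros(4))
```

The large initial learning rate η0 = 10 is harmless under this rule: early overshooting steps are rejected and η is halved until progress resumes.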

Construction of the QCNN circuit

To construct the exact QCNN circuit in Fig. 2b, we follow the guidelines discussed in the main text. Specifically, we design the convolution and pooling layers to satisfy the following two important properties:

  1. Fixed-point criterion. If the input is a cluster state |ψ0〉 of L spins, the output of the convolution–pooling layers is a cluster state |ψ0〉 of L/3 spins, with all measurements deterministically yielding |0〉.

  2. QEC criterion. If the input is not |ψ0〉 but instead differs from |ψ0〉 at one site by an error that commutes with the global symmetry, the output should still be a cluster state of L/3 spins, but at least one of the measurements will yield |1〉.

These two properties are desirable for any quantum circuit implementation of RG flow for performing QPR.

In the specific case of our Hamiltonian, the ground state (1D cluster state) is a graph state, which can be efficiently obtained by applying a sequence of controlled-phase gates to a product state. This simplifies the construction of the MERA representation for the fixed-point criterion. To satisfy the QEC criterion, we treat the ground state manifold of the unperturbed Hamiltonian \(H = - J\mathop {\sum}\limits_i {Z_i} X_{i + 1}Z_{i + 2}\) as the code space of a stabilizer code with stabilizers {ZiXi+1Zi+2}. The remaining degrees of freedom in the QCNN convolution and pooling layers are then specified such that the circuit detects and corrects the error (that is, it measures at least one |1〉 and prevents propagation to the next layer) when a single-qubit X error is present.
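The stabilizer picture underlying the QEC criterion can be illustrated directly: the sketch below builds a small 1D cluster state, confirms that the bulk stabilizers ZXZ all yield +1, and shows that a single-qubit X error flips the stabilizers that overlap it in Z, exactly as a syndrome measurement would detect (a minimal check, not the full QCNN circuit):

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2)

def op_chain(ops_at, n):
    """Tensor product with the given single-qubit ops at the listed sites."""
    ops = [I2] * n
    for site, op in ops_at:
        ops[site] = op
    return reduce(np.kron, ops)

def cluster_state(n):
    """1D cluster (graph) state: CZ gates on nearest neighbours of |+>^n."""
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)
    for i in range(n - 1):
        cz = np.ones(2 ** n)
        for b in range(2 ** n):
            # qubit 0 is the most significant bit, matching np.kron ordering
            if (b >> (n - 1 - i)) & 1 and (b >> (n - 2 - i)) & 1:
                cz[b] = -1
        psi = cz * psi
    return psi

n = 5
psi = cluster_state(n)
# Bulk stabilizers Z_{i-1} X_i Z_{i+1} all yield +1 on the cluster state...
for i in range(1, n - 1):
    S = op_chain([(i - 1, Z), (i, X), (i + 1, Z)], n)
    assert np.allclose(S @ psi, psi)
# ...and a single-qubit X error flips a stabilizer that overlaps it in Z.
err = op_chain([(2, X)], n) @ psi
S1 = op_chain([(0, Z), (1, X), (2, Z)], n)
assert np.allclose(S1 @ err, -err)
```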

QCNN for general QPR problems

Our interpretation of QCNNs in terms of MERA and QEC motivates their application for recognizing more generic quantum phases. For any quantum phase \({\cal{P}}\) whose RG fixed-point wavefunction \(\left| {\psi _0({\cal{P}})} \right\rangle\) has a tensor network representation in isometric or G-isometric form57 (Supplementary Fig. 3a), one can systematically construct a corresponding QCNN circuit. This family of quantum phases includes all 1D SPT and 2D string-net phases47,57,58. In these cases, one can explicitly construct a commuting parent Hamiltonian for \(\left| {\psi _0({\cal{P}})} \right\rangle\) and a MERA structure in which \(\left| {\psi _0({\cal{P}})} \right\rangle\) is a fixed-point wavefunction (Supplementary Fig. 3a for 1D systems). The diagrammatic proof of this fixed-point property is given in Supplementary Fig. 3b. Furthermore, any ‘local error’ perturbing an input state away from \(\left| {\psi _0({\cal{P}})} \right\rangle\) can be identified by measuring a fraction of terms in the parent Hamiltonian, similar to syndrome measurements in stabilizer-based QEC29. Then, a QCNN for \({\cal{P}}\) simply consists of the MERA for \(\left| {\psi _0({\cal{P}})} \right\rangle\) and a nested QEC scheme in which an input state with error density below the QEC threshold59 ‘flows’ to the RG fixed point. Such a QCNN can be optimized via our learning procedure.

While our generic learning protocol begins with completely random unitaries, as in the classical case1, this initialization may not be the most efficient for gradient descent. Instead, motivated by deep learning techniques such as pre-training1, a better initial parameterization would consist of a MERA representation of \(\left| {\psi _0({\cal{P}})} \right\rangle\) and one choice of nested QEC. With such an initialization, the learning procedure serves to optimize the QEC scheme, expanding its threshold to the target phase boundary (Supplementary Fig. 3c).

Experimental resource analysis

To compute the gate depth of the cluster model QCNN circuit in a Rydberg atom implementation, we analyse each gate shown in Fig. 2b. By postponing pooling layer measurements to the end of the circuit, the multi-qubit gates required are

$${\mathrm{C}}_z{\mathrm{Z}}_{ij} = {\mathrm{e}}^{i\pi ( - {\mathbf{1}} + Z_i)( - {\mathbf{1}} + Z_j)/4}$$
(12)
$${\mathrm{C}}_x{\mathrm{Z}}_{ij} = {\mathrm{e}}^{i\pi ( - {\mathbf{1}} + X_i)( - {\mathbf{1}} + Z_j)/4}$$
(13)
$${\mathrm{C}}_x{\mathrm{C}}_x{\mathrm{X}}_{ijk} = {\mathrm{e}}^{i\pi ( - {\mathbf{1}} + X_i)( - {\mathbf{1}} + X_j)( - {\mathbf{1}} + X_k)/8}$$
(14)

By using Rydberg blockade-mediated controlled gates60, it is straightforward to implement CzZij and \({\mathrm{C}}_z{\mathrm{C}}_z{\mathrm{Z}}_{ijk} = e^{i\pi ( - {\mathbf{1}} + Z_i)( - {\mathbf{1}} + Z_j)( - {\mathbf{1}} + Z_k)/8}\). The desired CxZij and CxCxXijk gates can then be obtained by conjugating CzZij and CzCzZijk by single-qubit rotations. For an input size of N spins, the kth convolution–pooling unit thus applies 4N/3k−1 CzZij gates, N/3k−1 CxCxXijk gates and 2N/3k−1 layers of CxZij gates. The depth of single-qubit rotations required is 4d, as these rotations can be implemented in parallel on all N qubits. Finally, the fully connected layer consists of N31−d CzZij gates. Thus, the total number of multi-qubit operations required for a QCNN of depth d operating on N spins is \(\frac{{7N}}{2}(1 - 3^{1 - d}) + N3^{1 - d}\). Note that we do not need to use SWAP gates since the Rydberg interaction is long-range.
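As a consistency check on the closed form, \(\frac{{7N}}{2}(1 - 3^{1 - d})\) equals the geometric sum \(7N\mathop {\sum}\nolimits_{k = 1}^{d - 1} {3^{ - k}}\); the sketch below verifies this algebra (the per-layer grouping inside `geometric_sum_count` is our reading of the text, not a statement from the paper):

```python
def multiqubit_gate_count(N, d):
    """Closed form from the text: (7N/2)(1 - 3**(1-d)) + N * 3**(1-d)."""
    return 3.5 * N * (1 - 3.0 ** (1 - d)) + N * 3.0 ** (1 - d)

def geometric_sum_count(N, d):
    """The same total written as a geometric series, 7N * sum_{k=1}^{d-1} 3**(-k),
    plus the fully connected layer's N * 3**(1-d) gates."""
    return 7 * N * sum(3.0 ** (-k) for k in range(1, d)) + N * 3.0 ** (1 - d)

# The two forms agree for any depth; at d = 1 only the fully connected
# layer's N gates remain.
assert multiqubit_gate_count(81, 1) == 81.0
for d in range(1, 7):
    assert abs(multiqubit_gate_count(81, d) - geometric_sum_count(81, d)) < 1e-9
```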

Demonstration of learning procedure for QEC

To obtain the QEC code considered in the main text, we consider a QCNN with N = 9 input physical qubits and simulate the circuit evolution of its \(2^N \times 2^N\) density matrix exactly. Strictly speaking, our QCNN has three layers: a three-qubit convolution layer U1, a 3-to-1 pooling layer and a 3-to-1 fully connected layer U2. Without loss of generality, we may ignore the optimization over the pooling layer by absorbing its effect into the first convolution layer, leading to the effective two-layer structure shown in Fig. 5a. The generic three-qubit unitary operations U1 and U2 are parameterized using 63 Gell-Mann coefficients each.

As discussed in the main text, we consider three different error models: (1) independent single-qubit errors on all qubits with equal probabilities pμ for μ = X, Y and Z errors, (2) independent single-qubit errors on all qubits, with anisotropic probabilities px ≠ py = pz and (3) independent single-qubit anisotropic errors with additional two-qubit correlated errors XiXi+1 with probability pxx. More specifically, the first two error models are realized by applying a (generally anisotropic) depolarization quantum channel to each of the nine physical qubits:

$${\cal{N}}_{1,i}:\rho \mapsto \left (1 - \mathop {\sum}\limits_\mu {p_\mu } \right)\rho + \mathop {\sum}\limits_\mu {p_\mu } \sigma _i^\mu \rho \sigma _i^\mu$$
(15)

with Pauli matrices \({\sigma _{i}^{\mu}}\) for i ∈ {1, 2, …, 9} (the qubit indices are defined from bottom to top in Fig. 5a). For the anisotropic case, we trained the QCNN on various error models with the same total error probability px + py + pz = 0.001 but different relative ratios; the resulting ratio between the logical error probability of the Shor code and that of the QCNN code is plotted as a function of anisotropy in Supplementary Fig. 4. For strongly anisotropic models, the QCNN outperforms the Shor code, while for nearly isotropic models the Shor code is optimal and the QCNN achieves the same logical error rate.
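A one-qubit sketch of the (generally anisotropic) depolarization channel of equation (15), restricted to a single site for illustration:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def depolarize(rho, px, py, pz):
    """Anisotropic single-qubit depolarization, equation (15) on one site."""
    out = (1 - px - py - pz) * rho
    for p, sigma in ((px, X), (py, Y), (pz, Z)):
        out = out + p * (sigma @ rho @ sigma)
    return out

rho = np.array([[1, 0], [0, 0]], dtype=complex)       # |0><0|
rho_out = depolarize(rho, px=8e-4, py=1e-4, pz=1e-4)  # strongly X-biased noise
assert np.isclose(np.trace(rho_out).real, 1.0)        # trace preserving
assert np.isclose(rho_out[1, 1].real, 9e-4)           # X and Y errors flip |0> to |1>
```

In the simulation, one such channel acts on each of the nine physical qubits.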

For the correlated error model, we additionally apply a quantum channel:

$${\cal{N}}_{2,i}:\rho \mapsto (1 - p_{xx})\rho + p_{xx}X_iX_{i + 1}\rho X_iX_{i + 1}$$
(16)

for pairs of nearby qubits, that is, i ∈ {1, 2, 4, 5, 7, 8}. Such geometrically local correlations are motivated by experimental considerations. In this case, we train our QCNN circuit on a specific error model with parameter choices px = 5.8 × 10−3, py = pz = 2 × 10−3, pxx = 2 × 10−4 and evaluate the logical error probabilities for various physical error models with the same relative ratios but different total error per qubit px + py + pz + pxx. Our goal is to maximize the logical state overlap f defined in equation (5). In general, for an anisotropic logical error model with probabilities pμ for σμ logical errors, the overlap is \(f = (1 - 2\mathop {\sum}\limits_\mu {p_\mu } /3)\), since \(\left\langle { \pm \nu } \right|\sigma _\mu \left| { \pm \nu } \right\rangle = ( - 1)^{\delta _{\mu ,\nu } + 1}\); we therefore compute the total logical error probability from f as 1.5(1 − f). If we naively apply gradient descent on f directly to both U1 and U2, we find that the optimization is easily trapped in a local optimum. Instead, we optimize the two unitaries U1 and U2 sequentially, similar to the layer-by-layer optimization in backpropagation for a conventional CNN1.

A few remarks are in order. First, since U1 is optimized prior to U2, one needs to devise an efficient cost function C1 that is independent of U2. In particular, simply maximizing f under an assumption on U2, for example that it equals the identity, may not be ideal, since such a choice does not capture the potential interplay between U1 and U2. Second, because U1 captures arbitrary single-qubit rotations, the definition of C1 should be basis-independent. Finally, we note that the tree structure of our circuit allows one to view the first layer as an independent quantum channel:

$${\cal{M}}_{U_1}:\rho \mapsto {\mathrm{tr}}_a\left[ {U_1{\cal{N}}\left( {U_1^\dagger \left( {\left| 0 \right\rangle \left\langle 0 \right| \otimes \rho \otimes \left| 0 \right\rangle \left\langle 0 \right|} \right)U_1} \right)U_1^\dagger } \right]$$
(17)

where \({\mathrm{tr}}_a[\, \cdot \,]\) denotes tracing over the ancilla qubits that are measured in the intermediate step. From this perspective, \({\cal{M}}_{U_1}\) describes an effective error model to be corrected by the second layer.

With these considerations, we optimize U1 such that the effective error model \({\cal{M}}_{U_1}\) becomes as classical as possible, that is, dominated by a ‘flip’ error along a certain axis with strongly suppressed ‘phase’ errors; the remaining, simpler errors can then be corrected by the second layer. More specifically, one may represent \({\cal{M}}_{U_1}\) as an affine map \({\cal{M}}_{U_{1}}:{\mathbf{r}} \mapsto M{\mathbf{r}} + {\mathbf{c}}\), where \({\mathbf{r}} \in {\Bbb R}^3\) is the Bloch vector of a qubit state \(\rho \equiv \frac{1}{2}{\mathbf{1}} + {\mathbf{r}} \cdot {\it{\sigma }}\), 1 is the identity operator and σ = (X, Y, Z) is the vector of Pauli matrices53. The singular values of the real matrix M encode the probabilities p1 ≥ p2 ≥ p3 for the three different types of errors. We choose the cost function for the first layer as \(C_1 = p_1^2 + p_2 + p_3\), which is relatively more sensitive to p2 and p3 than to p1 and ensures that the resulting optimized channel \({\cal{M}}_{U_1}\) is dominated by a single type of error (with probability p1). We note that M can be efficiently evaluated from a quantum device without knowing \({\cal{N}}\), by performing quantum process tomography on a single logical qubit. Once U1 is optimized, we use gradient descent to find an optimal U2 that maximizes f. As with QPR, gradients are computed via the finite-difference method, and the learning rate is determined by the bold driver technique1.
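The Bloch-matrix construction and the cost C1 can be sketched as follows. For simplicity we assume a diagonal Pauli channel, for which the flip probabilities can be inverted from the diagonal of M (the general case uses the singular values, as in the text); `bloch_matrix`, `error_probs` and `layer_cost` are our own names:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
PAULIS = (X, Y, Z)

def bloch_matrix(channel):
    """Affine Bloch representation r -> M r + c of a single-qubit channel,
    recovered by probing it with the maximally mixed state and the three
    Pauli eigenstates (a linear-algebra analogue of process tomography)."""
    c = np.array([np.trace(s @ channel(I2 / 2)).real for s in PAULIS])
    M = np.zeros((3, 3))
    for j, sj in enumerate(PAULIS):
        out = channel((I2 + sj) / 2)   # input state with Bloch vector e_j
        M[:, j] = np.array([np.trace(s @ out).real for s in PAULIS]) - c
    return M, c

def error_probs(channel):
    """Flip probabilities (px, py, pz) of a diagonal Pauli channel, inverted
    from the Bloch-matrix diagonal: p_x = (1 + M_xx - M_yy - M_zz)/4, etc."""
    lam = np.diag(bloch_matrix(channel)[0])
    return np.array([(1 + lam[i] - lam[(i + 1) % 3] - lam[(i + 2) % 3]) / 4
                     for i in range(3)])

def layer_cost(channel):
    """C_1 = p1**2 + p2 + p3, with p1 >= p2 >= p3."""
    p = np.sort(error_probs(channel))[::-1]
    return p[0] ** 2 + p[1] + p[2]

# A pure X-flip channel: p2 = p3 = 0, so the cost reduces to the tolerated
# p1**2 term, as desired for a 'classical' effective error model.
flip = lambda rho: 0.9 * rho + 0.1 * (X @ rho @ X)
assert np.allclose(error_probs(flip), [0.1, 0.0, 0.0])
assert np.isclose(layer_cost(flip), 0.01)
```

Because C1 depends on the channel only through M, it is independent of U2 and of the single-qubit basis, as required.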