Quantifying the magic of quantum channels

Xin Wang; Mark M Wilde; Yuan Su

doi:10.1088/1367-2630/ab451d

1. Introduction

1.1. Background

One of the main obstacles to physical realizations of quantum computation is decoherence that occurs during the execution of quantum algorithms. Fault-tolerant quantum computation (FTQC) [1, 2] provides a framework to overcome this difficulty by encoding quantum information into quantum error-correcting codes, and it allows reliable quantum computation when the physical error rate is below a certain threshold value.

The fault-tolerant approach to quantum computation allows for a limited set of transversal, or manifestly fault-tolerant, operations, which are usually taken to be the stabilizer operations (SOs). However, the SOs alone do not enable universality because they can be simulated efficiently on a classical computer, a result known as the Gottesman–Knill theorem [3, 4]. The addition of non-stabilizer quantum resources, such as non-SOs, can lead to universal quantum computation [5]. With this perspective, it is natural to consider the resource-theoretic approach [6] to quantify and characterize non-stabilizer quantum resources, including both quantum states and channels.

One solution for the above scenario is to implement a non-SO via state injection [7] of so-called 'magic states,' which are costly to prepare via magic state distillation [5] (see also [8–14]). The usefulness of such magic states also motivates the resource theory of magic states [15–20], where the free operations are the SOs and the free states are the stabilizer states (abbreviated as 'Stab'). On the other hand, since a key step of fault-tolerant quantum computing is to implement non-SOs, a natural and fundamental problem is to quantify the non-stabilizerness or 'magic' of quantum operations. As we are at the stage of noisy intermediate-scale quantum (NISQ) technology, a resource theory of magic for noisy quantum operations is desirable both to exploit the power and to identify the limitations of NISQ devices in FTQC.

1.2. Overview of results

In this paper, we develop a framework for the resource theory of magic quantum channels, based on qudit systems with odd prime dimension d. Related work on this topic has appeared recently [21], but the set of free operations that we take in our resource theory is larger, given by the completely positive-Wigner-preserving (CPWP) operations as we detail below. We note here that d-level FTQC based on qudits with prime d is of considerable interest for both theoretical and practical purposes [22–26].

Our paper is structured as follows.

In section 2, we first review the stabilizer formalism [4] and the discrete Wigner function [27–29]. We further review various magic measures of quantum states and introduce various classes of free operations, including the SOs and beyond.
In section 3, we introduce and characterize the CPWP operations. We then introduce two efficiently computable magic measures for quantum channels. The first is the mana of quantum channels, whose state version was introduced in [19]. The second is the max-thauma of quantum channels, inspired by the magic state measure in [20]. We prove several desirable properties of these two measures, including reduction to states, faithfulness, additivity for tensor products of channels, subadditivity for serial composition of channels, an amortization inequality, and monotonicity under CPWP superchannels.
In section 4, we explore the ability of quantum channels to generate magic states. We first introduce the amortized magic of a quantum channel as the largest amount of magic that can be generated via a quantum channel. Furthermore, we introduce an information-theoretic notion of the distillable magic of a quantum channel. In particular, we show that both the amortized magic and distillable magic of a quantum channel can be bounded from above by its mana and max-thauma.
In section 5, we apply our magic measures for quantum channels in order to evaluate the magic cost of quantum channels, and we explore further applications in quantum gate synthesis. In particular, we show that at least four T gates are required to perfectly implement a controlled-controlled-NOT gate.
In section 6, we propose a classical algorithm, inspired by [30], for simulating quantum circuits, which is relevant for the broad class of noisy quantum circuits that are currently being run on NISQ devices. This algorithm has sample complexity that scales with respect to the mana of a quantum channel. We further show by concrete examples that the new algorithm can outperform a previous approach for simulating noisy quantum circuits, based on channel robustness [21].

2. Preliminaries

2.1. The stabilizer formalism

For most known fault-tolerant schemes, the restricted set of quantum operations is the SOs, consisting of preparation and measurement in the computational basis and a restricted set of unitary operations. Here we review the basic elements of the stabilizer states and operations for systems with a dimension that is a product of odd primes. Throughout this paper, a Hilbert space implicitly has an odd dimension, and if the dimension is not prime, it should be understood to be a tensor product of Hilbert spaces each having odd prime dimension.

Let ${{ \mathcal H }}_{d}$ denote a Hilbert space of dimension d, and let $\{| j\rangle \}{}_{j=0,\cdots ,d-1}$ denote the standard computational basis. For a prime number d, we define the unitary shift and boost operators $X,Z\in { \mathcal L }({{ \mathcal H }}_{d})$ in terms of their action on the computational basis:

$\begin{eqnarray}&&X| j\rangle =| j\oplus 1\rangle ,\,\end{eqnarray} \tag{ 1 }$

$\begin{eqnarray}&&Z| j\rangle ={\omega }^{j}| j\rangle ,\quad \omega ={{\rm{e}}}^{2\pi {\rm{i}}/d},\,\end{eqnarray} \tag{ 2 }$

where ⊕ denotes addition modulo d. We define the Heisenberg–Weyl operators as

$\begin{eqnarray}&&{T}_{{\bf{u}}}={\tau }^{-{a}_{1}{a}_{2}}{Z}^{{a}_{1}}{X}^{{a}_{2}},\end{eqnarray} \tag{ 3 }$

where $\tau ={{\rm{e}}}^{(d+1)\pi {\rm{i}}/d},{\bf{u}}=({a}_{1},{a}_{2})\in {{\mathbb{Z}}}_{d}\times {{\mathbb{Z}}}_{d}$ .
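As a numerical illustration of equations (1)–(3), the following NumPy sketch builds X, Z, and the Heisenberg–Weyl operators and checks a few identities that follow directly from the definitions. The dimension d = 3 and the helper name `heisenberg_weyl` are assumptions made only for this example.

```python
import numpy as np

d = 3  # assumed example dimension; any odd prime should work
omega = np.exp(2j * np.pi / d)
tau = np.exp(1j * np.pi * (d + 1) / d)

X = np.roll(np.eye(d), 1, axis=0)      # X|j> = |j+1 mod d>, eq. (1)
Z = np.diag(omega ** np.arange(d))     # Z|j> = omega^j |j>, eq. (2)

def heisenberg_weyl(a1, a2):
    """Hypothetical helper: T_u = tau^(-a1 a2) Z^a1 X^a2 for u = (a1, a2), eq. (3)."""
    mp = np.linalg.matrix_power
    return tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)

# Sanity checks that follow from the definitions above.
commutation_ok = np.allclose(Z @ X, omega * X @ Z)   # ZX = omega XZ
tau_ok = np.isclose(tau ** 2, omega)                 # tau^2 = omega
unitary_ok = all(
    np.allclose(heisenberg_weyl(a, b) @ heisenberg_weyl(a, b).conj().T, np.eye(d))
    for a in range(d) for b in range(d)
)
# The adjoint of T_u is T_{-u}, with -u taken modulo d.
adjoint_ok = all(
    np.allclose(heisenberg_weyl(a, b).conj().T, heisenberg_weyl(-a % d, -b % d))
    for a in range(d) for b in range(d)
)
print(commutation_ok, tau_ok, unitary_ok, adjoint_ok)
```

The phase $\tau^{-a_1 a_2}$ is what makes the adjoint of $T_{\bf{u}}$ equal to $T_{-{\bf{u}}}$ without an extra phase, which the last check confirms for d = 3.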

For a system with composite Hilbert space ${{ \mathcal H }}_{A}\otimes {{ \mathcal H }}_{B}$ , the Heisenberg–Weyl operators are the tensor product of the subsystem Heisenberg–Weyl operators:

$\begin{eqnarray}&&{T}_{{{\bf{u}}}_{A}\oplus {{\bf{u}}}_{B}}={T}_{{{\bf{u}}}_{A}}\otimes {T}_{{{\bf{u}}}_{B}},\end{eqnarray} \tag{ 4 }$

where ${{\bf{u}}}_{A}\oplus {{\bf{u}}}_{B}$ is an element of ${{\mathbb{Z}}}_{{d}_{A}}\times {{\mathbb{Z}}}_{{d}_{A}}\times {{\mathbb{Z}}}_{{d}_{B}}\times {{\mathbb{Z}}}_{{d}_{B}}$ .

The Clifford operators ${{ \mathcal C }}_{d}$ are defined to be the set of unitary operators that map Heisenberg–Weyl operators to Heisenberg–Weyl operators under unitary conjugation up to phases:

$\begin{eqnarray}&&U\in {{ \mathcal C }}_{d}\ \mathrm{iff}\ \forall {\bf{u}},\exists \theta ,{\bf{u}}^{\prime} ,\ {\rm{s}}.{\rm{t}}.\ {{UT}}_{{\bf{u}}}{U}^{\dagger }={{\rm{e}}}^{{\rm{i}}\theta }{T}_{{\bf{u}}^{\prime} }.\end{eqnarray} \tag{ 5 }$

These operators form the Clifford group.

The pure stabilizer states can be obtained by applying Clifford operators to the state $| 0\rangle$ :

$\begin{eqnarray}&&\{{S}_{j}\}=\{U| 0\rangle \langle 0| {U}^{\dagger }:U\in {{ \mathcal C }}_{d}\}.\end{eqnarray} \tag{ 6 }$

A state is defined to be a magic or non-stabilizer state if it cannot be written as a convex combination of pure stabilizer states.

2.2. Discrete Wigner function

The discrete Wigner function [27–29] was used to show the existence of bound magic states [18]. For an overview of discrete Wigner functions, we refer to [18, 19]. See also [31] for a review of quasi-probability representations in quantum theory, with applications to quantum information science.

For each point ${\bf{u}}\in {{\mathbb{Z}}}_{d}\times {{\mathbb{Z}}}_{d}$ in the discrete phase space, there is a corresponding operator ${A}_{{\bf{u}}}$ , and the value of the discrete Wigner representation of a state ρ at this point is given by

$\begin{eqnarray}&&{W}_{\rho }({\bf{u}}):= \displaystyle \frac{1}{d}\mathrm{Tr}\,[{A}_{{\bf{u}}}\rho ],\end{eqnarray} \tag{ 7 }$

where d is the dimension of the Hilbert space and ${\{{A}_{{\bf{u}}}\}}_{{\bf{u}}}$ are the phase-space point operators:

$\begin{eqnarray}&&{A}_{0}=\displaystyle \frac{1}{d}\displaystyle \sum _{{\bf{u}}}{T}_{{\bf{u}}},\qquad {A}_{{\bf{u}}}={T}_{{\bf{u}}}{A}_{0}{T}_{{\bf{u}}}^{\dagger }.\end{eqnarray} \tag{ 8 }$

The discrete Wigner function can be defined more generally for a Hermitian operator X acting on a space of dimension d via the same formula:

$\begin{eqnarray}&&{W}_{X}({\bf{u}})=\displaystyle \frac{1}{d}\mathrm{Tr}\,[{A}_{{\bf{u}}}X].\end{eqnarray} \tag{ 9 }$

For the particular case of a measurement operator E satisfying $0\leqslant E\leqslant {\mathbb{1}}$ , the discrete Wigner representation is defined as

$\begin{eqnarray}&&W(E| {\bf{u}}):= \mathrm{Tr}\,\left[{{EA}}_{{\bf{u}}}\right],\end{eqnarray} \tag{ 10 }$

i.e. without the prefactor $1/d$ . The reason for this will be clear in a moment and is related to the distinction between a frame and a dual frame [30, 32, 33].

Some nice properties of the set ${\{{A}_{{\bf{u}}}\}}_{{\bf{u}}}$ are listed as follows:

1. ${A}_{{\bf{u}}}$ is Hermitian;
2. ${\sum }_{{\bf{u}}}{A}_{{\bf{u}}}/d={\mathbb{1}};$
3. $\mathrm{Tr}\,[{A}_{{\bf{u}}}{A}_{{\bf{u}}^{\prime} }]=d\,\delta ({\bf{u}},{\bf{u}}^{\prime} );$
4. $\mathrm{Tr}\,[{A}_{{\bf{u}}}]=1;$
5. $\rho ={\sum }_{{\bf{u}}}{W}_{\rho }({\bf{u}}){A}_{{\bf{u}}};$
6. ${\{{A}_{{\bf{u}}}\}}_{{\bf{u}}}=\{{A}_{{\bf{u}}}^{T}\}{}_{{\bf{u}}}$.
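The six properties listed above can be checked numerically for a small dimension. The sketch below does so for d = 3 with NumPy; the helper name `phase_space_ops`, the dimension, and the test state used for property 5 are assumptions chosen for illustration.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: dict u -> A_u built from eqs. (3) and (8), d an odd prime."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)          # X|j> = |j+1 mod d>
    Z = np.diag(omega ** np.arange(d))         # Z|j> = omega^j |j>
    mp = np.linalg.matrix_power
    T = {(a1, a2): tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)}
    A0 = sum(T.values()) / d
    return {u: Tu @ A0 @ Tu.conj().T for u, Tu in T.items()}

d = 3
A = phase_space_ops(d)
ops = list(A.values())

prop1 = all(np.allclose(Au, Au.conj().T) for Au in ops)           # Hermitian
prop2 = np.allclose(sum(ops) / d, np.eye(d))                      # sum_u A_u / d = 1
prop3 = all(np.isclose(np.trace(Au @ Av), d * (u == v))           # Tr[A_u A_u'] = d delta
            for u, Au in A.items() for v, Av in A.items())
prop4 = all(np.isclose(np.trace(Au), 1) for Au in ops)            # Tr[A_u] = 1

# Property 5 on an assumed test state, the pure state (|0> + |1>)/sqrt(2).
psi = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
W = {u: np.trace(Au @ rho).real / d for u, Au in A.items()}       # eq. (7)
prop5 = np.allclose(sum(W[u] * A[u] for u in A), rho)

# Property 6: the transpose of each A_u is again one of the A_u'.
prop6 = all(any(np.allclose(Au.T, Av) for Av in ops) for Au in ops)

print(prop1, prop2, prop3, prop4, prop5, prop6)
```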

From the second property above and the definition in (7), we conclude the following equality for a quantum state ρ:

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{u}}}{W}_{\rho }({\bf{u}})=1.\end{eqnarray} \tag{ 11 }$

For this reason, the discrete Wigner function is known as a quasi-probability distribution. More generally, for a Hermitian operator X, we have that

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{u}}}{W}_{X}({\bf{u}})=\mathrm{Tr}\,[X],\end{eqnarray} \tag{ 12 }$

so that for a subnormalized state ω, satisfying $\omega \geqslant 0$ and $\mathrm{Tr}\,[\omega ]\leqslant 1$ , we have that ${\sum }_{{\bf{u}}}{W}_{\omega }({\bf{u}})\leqslant 1$ .

Following the convention in (10) for measurement operators, we find the following for a positive operator-valued measure (POVM) ${\{{E}^{x}\}}_{x}$ (satisfying ${E}^{x}\geqslant 0\ \forall x$ and ${\sum }_{x}{E}^{x}={\mathbb{1}}$ ):

$\begin{eqnarray}&&\displaystyle \sum _{x}W({E}^{x}| {\bf{u}})=1,\end{eqnarray} \tag{ 13 }$

so that the quasi-probability interpretation is retained for a POVM. That is, $W({E}^{x}| {\bf{u}})$ can be interpreted as the conditional quasi-probability of obtaining outcome x given input ${\bf{u}}$ .

We can quantify the amount of negativity in the discrete Wigner function of a state ρ via the sum negativity, which is equal to the absolute sum of the negative elements of the Wigner function [19]:

$\begin{eqnarray}&&\mathrm{sn}(\rho ):= \displaystyle \sum _{{\bf{u}}:{W}_{\rho }({\bf{u}})\lt 0}| {W}_{\rho }({\bf{u}})| =\displaystyle \frac{1}{2}\displaystyle \sum _{{\bf{u}}}\left(| {W}_{\rho }({\bf{u}})| -{W}_{\rho }({\bf{u}})\right)=\displaystyle \frac{1}{2}\left(\displaystyle \sum _{{\bf{u}}}| {W}_{\rho }({\bf{u}})| \right)-\displaystyle \frac{1}{2}.\end{eqnarray} \tag{ 14 }$

By definition, we find that $\mathrm{sn}(\rho )\geqslant 0$ . The mana of a state ρ is defined as [19]

$\begin{eqnarray}&&{ \mathcal M }(\rho ):= \mathrm{log}\left(\displaystyle \sum _{{\bf{u}}}| {W}_{\rho }({\bf{u}})| \right)=\mathrm{log}(2\cdot \mathrm{sn}(\rho )+1)\geqslant 0.\end{eqnarray} \tag{ 15 }$

We define the mana more generally, as in [20], for a positive semi-definite operator X via the formula

$\begin{eqnarray}&&{ \mathcal M }(X):= \mathrm{log}\left(\displaystyle \sum _{{\bf{u}}}| {W}_{X}({\bf{u}})| \right)=\mathrm{log}\left(2\left[\displaystyle \sum _{{\bf{u}}:{W}_{X}({\bf{u}})\lt 0}| {W}_{X}({\bf{u}})| \right]+\mathrm{Tr}\,[X]\right).\end{eqnarray} \tag{ 16 }$
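To make the definitions in (14)–(16) concrete, the following NumPy sketch evaluates the sum negativity and the mana for two qutrit states: the stabilizer state $| 0\rangle$ and the state $(| 1\rangle -| 2\rangle )/\sqrt{2}$, which is often called the Strange state in the magic-state literature. The helper names and the choice of a base-2 logarithm are assumptions for this example; the text writes "log" without fixing a base.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: list of the d^2 operators A_u from eqs. (3) and (8)."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)          # X|j> = |j+1 mod d>
    Z = np.diag(omega ** np.arange(d))         # Z|j> = omega^j |j>
    mp = np.linalg.matrix_power
    T = [tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)]
    A0 = sum(T) / d
    return [Tu @ A0 @ Tu.conj().T for Tu in T]

def wigner(op):
    """W_op(u) = Tr[A_u op] / d for a Hermitian operator op, eqs. (7) and (9)."""
    d = op.shape[0]
    return np.array([np.trace(Au @ op).real / d for Au in phase_space_ops(d)])

def sum_negativity(rho):
    W = wigner(rho)
    return -W[W < 0].sum()                     # eq. (14)

def mana(op):
    # Base-2 logarithm is an assumption; the text writes "log" without a base.
    return np.log2(np.abs(wigner(op)).sum())   # eqs. (15) and (16)

zero = np.diag([1.0, 0.0, 0.0])                # stabilizer state |0><0|
s = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)    # (|1> - |2>)/sqrt(2)
strange = np.outer(s, s)

print(sum_negativity(zero), mana(zero))        # both vanish up to rounding
print(sum_negativity(strange), mana(strange))  # 1/3 and log2(5/3), about 0.737
```

For the second state, the Wigner function equals −1/3 at the origin and 1/6 at the other eight phase-space points, so that $\sum_{\bf{u}}|W_\rho({\bf{u}})| = 5/3$.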

We denote the set of quantum states with a non-negative Wigner function by ${{ \mathcal W }}_{+}$ (Wigner polytope), i.e.

$\begin{eqnarray}&&{{ \mathcal W }}_{+}:= \{\rho :\forall {\bf{u}},{W}_{\rho }({\bf{u}})\geqslant 0,\rho \geqslant 0,\mathrm{Tr}\,[\rho ]=1\}.\end{eqnarray} \tag{ 17 }$

It is known that quantum states with non-negative Wigner function are classically simulable and thus are useless in magic state distillation [18], which can be seen as the analog of states with positive partial transpose (PPT) in entanglement distillation [34, 35].

Motivated by the Rains bound [36] and its variants [37–43] in entanglement theory, the set of sub-normalized states with non-positive mana was introduced as follows [20] to explore the resource theory of magic states:

$\begin{eqnarray}&&{ \mathcal W }:= \left\{\sigma :\displaystyle \sum _{{\bf{u}}}| {W}_{\sigma }({\bf{u}})| \leqslant 1,\sigma \geqslant 0\right\}=\left\{\sigma :{ \mathcal M }(\sigma )\leqslant 0,\sigma \geqslant 0\right\}.\end{eqnarray} \tag{ 18 }$

It follows from definitions and the triangle inequality that $\mathrm{Tr}\,[\sigma ]\leqslant 1$ if $\sigma \in { \mathcal W }$ (alternatively one can conclude this by inspecting the right-hand side of (16)).

Furthermore, we define ${\widehat{{ \mathcal W }}}_{+}$ to be the set of Hermitian operators with non-negative Wigner function:

$\begin{eqnarray}&&{\widehat{{ \mathcal W }}}_{+}:= \{V:\forall {\bf{u}},{W}_{V}({\bf{u}})\geqslant 0\}.\end{eqnarray} \tag{ 19 }$

The Wigner trace norm and Wigner spectral norm of an Hermitian operator V are defined as follows, respectively:

$\begin{eqnarray}&&\parallel V{\parallel }_{W,1}:= \displaystyle \sum _{{\bf{u}}}| {W}_{V}({\bf{u}})| =\displaystyle \sum _{{\bf{u}}}| \mathrm{Tr}\,[{A}_{{\bf{u}}}V]/d| ,\end{eqnarray} \tag{ 20 }$

$\begin{eqnarray}&&\parallel V{\parallel }_{W,\infty }:= d\mathop{\max }\limits_{{\bf{u}}}| {W}_{V}({\bf{u}})| =\mathop{\max }\limits_{{\bf{u}}}| \mathrm{Tr}\,[{A}_{{\bf{u}}}V]| .\end{eqnarray} \tag{ 21 }$

The Wigner trace and spectral norms are dual to each other in the following sense:

$\begin{eqnarray}&&\,\parallel V{\parallel }_{W,1}= \mathop{\max }\limits_{C}\{| \mathrm{Tr}\,[{VC}]| :\parallel C{\parallel }_{W,\infty }\leqslant 1\},\end{eqnarray} \tag{ 22 }$

$\begin{eqnarray}&&\parallel V{\parallel }_{W,\infty }= \mathop{\max }\limits_{C}\{| \mathrm{Tr}\,[{VC}]| :\parallel C{\parallel }_{W,1}\leqslant 1\},\end{eqnarray} \tag{ 23 }$

with C ranging over Hermitian operators within the same space.
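The duality in (22) can be illustrated numerically. A natural candidate maximizer is $C={\sum }_{{\bf{u}}}\mathrm{sgn}({W}_{V}({\bf{u}})){A}_{{\bf{u}}}/d$, which has unit Wigner spectral norm and attains $\mathrm{Tr}[VC]=\parallel V{\parallel }_{W,1}$. The NumPy sketch below checks this for a randomly drawn Hermitian V with d = 3; the helper names, the seed, and the random sampling are assumptions made for this example.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: list of the d^2 operators A_u from eqs. (3) and (8)."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    mp = np.linalg.matrix_power
    T = [tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)]
    A0 = sum(T) / d
    return [Tu @ A0 @ Tu.conj().T for Tu in T]

d = 3
A = phase_space_ops(d)
rng = np.random.default_rng(0)                 # fixed seed, arbitrary choice

def random_hermitian():
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G + G.conj().T

def wigner_trace_norm(V):
    return sum(abs(np.trace(Au @ V).real) / d for Au in A)      # eq. (20)

def wigner_spectral_norm(V):
    return max(abs(np.trace(Au @ V).real) for Au in A)          # eq. (21)

V = random_hermitian()
# Candidate maximizer for (22): C = (1/d) sum_u sign(W_V(u)) A_u.
C_opt = sum(np.sign(np.trace(Au @ V).real) * Au for Au in A) / d
achieved = np.trace(V @ C_opt).real

# No randomly drawn feasible C should exceed the Wigner trace norm of V.
best_random = 0.0
for _ in range(200):
    C = random_hermitian()
    C = C / wigner_spectral_norm(C)            # rescale so that ||C||_{W,inf} = 1
    best_random = max(best_random, abs(np.trace(V @ C)))

print(wigner_trace_norm(V), achieved, best_random)
```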

2.3. Stabilizer channels and beyond

An SO consists of the following types of quantum operations: Clifford operations, tensoring in stabilizer states, partial trace, measurements in the computational basis, and post-processing conditioned on these measurement results. Any quantum protocol composed of these quantum operations can be written in terms of the following Stinespring dilation representation: ${ \mathcal E }(\rho )={\mathrm{Tr}}_{E}[U(\rho \otimes {\rho }_{E}){U}^{\dagger }]$, where U is a Clifford unitary and the ancilla ${\rho }_{E}$ is a stabilizer state.

The authors of [44] generalized the set of SOs to stabilizer-preserving operations, which are those that transform stabilizer states to stabilizer states and which form the largest set of physical operations that can be considered free for the resource theory of non-stabilizerness. More recently, [21] introduced the completely stabilizer-preserving operations (CSPO); i.e. a quantum operation Π is called completely stabilizer-preserving if for any reference system R,

$\begin{eqnarray}&&\forall {\rho }_{{RA}}\in \mathrm{Stab},\quad ({\mathrm{id}}_{R}\otimes {{\rm{\Pi }}}_{A\to B})({\rho }_{{RA}})\in \mathrm{Stab}.\end{eqnarray} \tag{ 24 }$

2.4. Magic measures of quantum states

We review some of the magic measures of quantum states in table 1. In particular, the max-thauma of a quantum state ρ is defined as follows [20]:

$\begin{eqnarray}&&{\theta }_{\max }(\rho ):= \mathop{\min }\limits_{\sigma \in { \mathcal W }}{D}_{\max }(\rho \parallel \sigma ):= \mathop{\min }\limits_{\sigma \in { \mathcal W }}\left[\min \{\lambda :\rho \leqslant {2}^{\lambda }\sigma \}\right]\,\end{eqnarray} \tag{ 25 }$

$\begin{eqnarray}&&=\,{\mathrm{log}}_{2}\min \left\{\parallel V{\parallel }_{W,1}\,:\rho \leqslant V\right\},\end{eqnarray} \tag{ 26 }$

where the max-relative entropy ${D}_{\max }(\rho \parallel \sigma )$ was defined in [45].

Table 1. Partial zoo of magic measures.

Measure	Notation	Definition
Mana [19]	${ \mathcal M }(\rho )$	$\mathrm{log}{\sum }_{{\bf{u}}}| \mathrm{Tr}\,[{A}_{{\bf{u}}}\rho ]| /d$
Robustness of magic [16]	${ \mathcal R }(\rho )$	$\inf \{2r+1:\tfrac{\rho +r\sigma }{1+r}=\tau ,\ \sigma ,\tau \in \mathrm{Stab}\}$
Relative entropy of magic [19]	${R}_{{ \mathcal M }}(\rho )$	${\inf }_{\sigma \in \mathrm{Stab}}D(\rho \parallel \sigma )$
Regularized relative entropy of magic [19]	${R}_{{ \mathcal M }}^{\infty }(\rho )$	${\mathrm{lim}}_{n\to \infty }{R}_{{ \mathcal M }}({\rho }^{\otimes n})/n$
Max-thauma [20]	${\theta }_{\max }(\rho )$	${\inf }_{\sigma \in { \mathcal W }}{D}_{\max }(\rho \parallel \sigma )$
Thauma [20]	$\theta (\rho )$	${\inf }_{\sigma \in { \mathcal W }}D(\rho \parallel \sigma )$
Regularized thauma [20]	${\theta }^{\infty }(\rho )$	${\mathrm{lim}}_{n\to \infty }\theta ({\rho }^{\otimes n})/n$
Min-thauma [20]	${\theta }_{\min }(\rho )$	${\inf }_{\sigma \in { \mathcal W }}{D}_{0}(\rho \parallel \sigma )$

3. Quantifying the non-stabilizerness of a quantum channel

3.1. Completely positive-Wigner-preserving operations

A quantum circuit consisting of an initial quantum state, unitary evolutions, and measurements, each having non-negative Wigner functions, can be classically simulated [30]. It is thus natural to consider free operations to be those that completely preserve the positivity of the Wigner function. Indeed, we show in section 6 that any such quantum operation can be efficiently simulated by a classical algorithm, which makes these operations reasonable free operations for the resource theory of magic.

Definition 1 (Completely PWP operation). A Hermiticity-preserving linear map Π is called CPWP if, for any system R with odd dimension, the following holds:

$\begin{eqnarray}&&\forall {\rho }_{{RA}}\in {{ \mathcal W }}_{+},\quad ({\mathrm{id}}_{R}\otimes {{\rm{\Pi }}}_{A\to B})({\rho }_{{RA}})\in {{ \mathcal W }}_{+}.\end{eqnarray} \tag{ 27 }$

Figure 1 depicts the relationship between SOs, completely stabilizer-preserving operations, and completely PWP operations.

We now recall the definition of the discrete Wigner function of a quantum channel from [17], which is strongly related to the Wigner function of a quantum channel as defined in [46, equation (95)].

Definition 2 (Discrete Wigner function of a quantum channel). Given a quantum channel ${{ \mathcal N }}_{A\to B}$, its discrete Wigner function is defined as

$\begin{eqnarray}&&{{ \mathcal W }}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}}):= \displaystyle \frac{1}{{d}_{B}}\mathrm{Tr}\,[({\left({A}_{A}^{{\boldsymbol{u}}}\right)}^{T}\otimes {A}_{B}^{{\boldsymbol{v}}}){J}_{{AB}}^{{ \mathcal N }}]\end{eqnarray} \tag{ 28 }$

$\begin{eqnarray}&&=\,\displaystyle \frac{1}{{d}_{B}}\mathrm{Tr}\,[{A}_{B}^{{\boldsymbol{v}}}{ \mathcal N }({A}_{A}^{{\boldsymbol{u}}})].\,\end{eqnarray} \tag{ 29 }$

Here ${J}_{{AB}}^{{ \mathcal N }}={\sum }_{{ij}}| i\rangle \langle j{| }_{A}\otimes { \mathcal N }(| i\rangle \langle j{| }_{A^{\prime} })$ denotes the Choi–Jamiołkowski matrix [47, 48] of the channel ${ \mathcal N }$ , where ${\{| i{\rangle }_{A}\}}_{i}$ and ${\{| i{\rangle }_{A^{\prime} }\}}_{i}$ are orthonormal bases on isomorphic Hilbert spaces ${{ \mathcal H }}_{A}$ and ${{ \mathcal H }}_{A^{\prime} }$ , respectively. More generally, the discrete Wigner function of a Hermiticity-preserving linear map ${{ \mathcal P }}_{A\to B}$ can be defined using the same formula in (29), by substituting ${ \mathcal N }$ therein with ${ \mathcal P }$ .

From the definition above and the properties recalled in section 2.2, it follows for a quantum channel ${{ \mathcal N }}_{A\to B}$ that

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{v}}}{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})=1\quad \forall {\bf{u}},\end{eqnarray} \tag{ 30 }$

because

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{v}}}{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})=\displaystyle \sum _{{\bf{v}}}\mathrm{Tr}\,[{A}_{B}^{{\bf{v}}}{ \mathcal N }({A}_{A}^{{\bf{u}}})]{/d}_{B}=\mathrm{Tr}\,\left[\left(\displaystyle \sum _{{\bf{v}}}\displaystyle \frac{{A}_{B}^{{\bf{v}}}}{{d}_{B}}\right){ \mathcal N }({A}_{A}^{{\bf{u}}})\right]\end{eqnarray} \tag{ 31 }$

$\begin{eqnarray}&&=\,\mathrm{Tr}\,[{I}_{B}{ \mathcal N }({A}_{A}^{{\bf{u}}})]=\mathrm{Tr}\,[{ \mathcal N }({A}_{A}^{{\bf{u}}})]=\mathrm{Tr}\,[{A}_{A}^{{\bf{u}}}]=1,\,\end{eqnarray} \tag{ 32 }$

where the penultimate equality follows from the fact that ${ \mathcal N }$ is trace preserving (in fact here we did not require complete positivity or even linearity). Due to the normalization in (30), ${{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})$ can be interpreted as a conditional quasi-probability distribution.
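The formula in (29) and the normalization in (30) are easy to evaluate for a channel given in Kraus form. The sketch below is a minimal NumPy illustration for d = 3, assuming equal input and output dimensions; the helper names and the two example channels (the qutrit Fourier gate, which is a Clifford unitary, and a replacer channel that always outputs the state $(| 1\rangle -| 2\rangle )/\sqrt{2}$) are choices made only for this example.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: list of the d^2 operators A_u from eqs. (3) and (8)."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    mp = np.linalg.matrix_power
    T = [tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)]
    A0 = sum(T) / d
    return [Tu @ A0 @ Tu.conj().T for Tu in T]

def channel_wigner(kraus):
    """W_N(v|u) = Tr[A_v N(A_u)] / d as a matrix with rows v and columns u, eq. (29).

    Assumes equal input and output dimension d and N(X) = sum_k K X K^dagger.
    """
    d = kraus[0].shape[0]
    A = phase_space_ops(d)
    outputs = [sum(K @ Au @ K.conj().T for K in kraus) for Au in A]   # N(A_u)
    return np.array([[np.trace(Av @ Nu).real / d for Nu in outputs] for Av in A])

d = 3
omega = np.exp(2j * np.pi / d)
# Qutrit Fourier gate, a Clifford unitary.
F = np.array([[omega ** (j * k) for k in range(d)] for j in range(d)]) / np.sqrt(d)
# Replacer channel X -> Tr[X] |S><S| with |S> = (|1> - |2>)/sqrt(2); Kraus operators |S><i|.
s = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)
prep = [np.outer(s, e) for e in np.eye(d)]

W_F = channel_wigner([F])
W_prep = channel_wigner(prep)
print(W_F.sum(axis=0), W_F.min())        # columns sum to 1, no negative entries beyond rounding
print(W_prep.sum(axis=0), W_prep.min())  # columns sum to 1, minimum entry -1/3
```

For the Fourier gate, the table is a permutation matrix, that is, a deterministic classical channel on phase space. For the replacer channel, every column equals the Wigner function of the prepared state, so the table inherits its negative entry.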

Furthermore, the discrete Wigner function of a channel allows one to determine the output Wigner function from the input Wigner function by propagating the quasi-probability distributions, just as one does in the classical case. When there is no reference system, such a statement was proved in [17]. Here we slightly extend this result to the case with a reference system in the following lemma.

Lemma 1. For an input state ${\rho }_{{AR}}$ and a quantum channel ${{ \mathcal N }}_{A\to B}$ with respective Wigner functions ${W}_{{\rho }_{{AR}}}({\boldsymbol{u}},{\boldsymbol{y}})$ and ${W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})$ , the Wigner function ${W}_{{ \mathcal N }({\rho }_{{AR}})}({\boldsymbol{v}},{\boldsymbol{y}})$ of the output state ${{ \mathcal N }}_{A\to B}({\rho }_{{AR}})$ is given by

$\begin{eqnarray}&&{W}_{{{ \mathcal N }}_{A\to B}({\rho }_{{AR}})}({\boldsymbol{v}},{\boldsymbol{y}})=\displaystyle \sum _{{\boldsymbol{u}}}{W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})\ {W}_{{\rho }_{{AR}}}({\boldsymbol{u}},{\boldsymbol{y}}).\end{eqnarray} \tag{ 33 }$

Proof. The proof is straightforward:

$\begin{eqnarray}&&{W}_{{ \mathcal N }({\rho }_{{AR}})}({\bf{v}},{\bf{y}})=\displaystyle \frac{1}{{d}_{B}{d}_{R}}\mathrm{Tr}\,[({A}_{B}^{{\bf{v}}}\otimes {A}_{R}^{{\bf{y}}}){{ \mathcal N }}_{A\to B}({\rho }_{{AR}})]\,\,\,\end{eqnarray} \tag{ 34 }$

$\begin{eqnarray}&&\,\,=\,\displaystyle \frac{1}{{d}_{B}{d}_{R}}\displaystyle \sum _{{\bf{u}},{\bf{w}}}\mathrm{Tr}\,[({A}_{B}^{{\bf{v}}}\otimes {A}_{R}^{{\bf{y}}}){{ \mathcal N }}_{A\to B}({W}_{{\rho }_{{AR}}}({\bf{u}},{\bf{w}}){A}_{A}^{{\bf{u}}}\otimes {A}_{R}^{{\bf{w}}})]\end{eqnarray} \tag{ 35 }$

$\begin{eqnarray}&&\,=\,\displaystyle \frac{1}{{d}_{B}{d}_{R}}\displaystyle \sum _{{\bf{u}},{\bf{w}}}{W}_{{\rho }_{{AR}}}({\bf{u}},{\bf{w}})\mathrm{Tr}\,[{A}_{B}^{{\bf{v}}}{{ \mathcal N }}_{A\to B}({A}_{A}^{{\bf{u}}})\otimes {A}_{R}^{{\bf{y}}}{A}_{R}^{{\bf{w}}}]\end{eqnarray} \tag{ 36 }$

$\begin{eqnarray}&&\,=\,\displaystyle \frac{1}{{d}_{B}{d}_{R}}\displaystyle \sum _{{\bf{u}},{\bf{w}}}{W}_{{\rho }_{{AR}}}({\bf{u}},{\bf{w}})\mathrm{Tr}\,[{A}_{B}^{{\bf{v}}}{{ \mathcal N }}_{A\to B}({A}_{A}^{{\bf{u}}})]{d}_{R}\ \delta ({\bf{y}},{\bf{w}})\end{eqnarray} \tag{ 37 }$

$\begin{eqnarray}&&=\,\displaystyle \sum _{{\bf{u}}}{W}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})\ {W}_{{\rho }_{{AR}}}({\bf{u}},{\bf{y}}).\,\,\,\end{eqnarray} \tag{ 38 }$

All steps follow from definitions and the properties of the phase-space point operators recalled in section 2.2. In particular, we made use of the fact that ${\rho }_{{AR}}={\sum }_{{\bf{v}},{\bf{w}}}{W}_{{\rho }_{{AR}}}({\bf{v}},{\bf{w}}){A}_{A}^{{\bf{v}}}\otimes {A}_{R}^{{\bf{w}}}$ in the second equality. ■
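Lemma 1 can be checked numerically by comparing the Wigner function of the output state with the matrix product of the channel's Wigner table and the input Wigner function. The NumPy sketch below does this for two qutrits, with the channel acting on the first one; the helper names, the example channel (a mixture of the identity and the unitary diag(1, 1, −1)), the random input state, and the seed are assumptions made for this example.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: list of the d^2 operators A_u from eqs. (3) and (8)."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    mp = np.linalg.matrix_power
    T = [tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)]
    A0 = sum(T) / d
    return [Tu @ A0 @ Tu.conj().T for Tu in T]

d = 3
A = phase_space_ops(d)
I = np.eye(d)

# Assumed example channel on system A: apply U = diag(1, 1, -1) with probability 0.3.
U = np.diag([1.0, 1.0, -1.0])
kraus = [np.sqrt(0.7) * I, np.sqrt(0.3) * U]

def wigner_bipartite(rho):
    """W_rho(u, y) = Tr[(A_u x A_y) rho] / d^2; rows u (first system), columns y (system R)."""
    return np.array([[np.trace(np.kron(Au, Ay) @ rho).real / d**2 for Ay in A]
                     for Au in A])

# Channel Wigner function W_N(v|u), rows v and columns u, as in (29).
W_N = np.array([[np.trace(Av @ sum(K @ Au @ K.conj().T for K in kraus)).real / d
                 for Au in A] for Av in A])

# Random bipartite input state rho_AR (fixed seed, arbitrary choice).
rng = np.random.default_rng(1)
G = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))
rho_in = G @ G.conj().T
rho_in /= np.trace(rho_in).real

# Output state (N x id_R)(rho_AR).
rho_out = sum(np.kron(K, I) @ rho_in @ np.kron(K, I).conj().T for K in kraus)

lhs = wigner_bipartite(rho_out)        # W_{N(rho)}(v, y)
rhs = W_N @ wigner_bipartite(rho_in)   # sum_u W_N(v|u) W_rho(u, y), eq. (33)
print(np.allclose(lhs, rhs))
```

Storing $W_{\mathcal N}({\bf{v}}|{\bf{u}})$ with rows indexed by ${\bf{v}}$ and columns by ${\bf{u}}$ turns the propagation rule (33) into an ordinary matrix product, exactly as for a classical stochastic matrix.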

Theorem 2. The following statements about CPWP operations are equivalent:

1. The quantum channel ${ \mathcal N }$ is CPWP;
2. The discrete Wigner function of the Choi–Jamiołkowski matrix ${J}_{{ \mathcal N }}$ is non-negative;
3. ${W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})$ is non-negative for all ${\boldsymbol{u}}$ and ${\boldsymbol{v}}$ (i.e. ${W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})$ is a conditional probability distribution or classical channel).

Proof. $1\to 2$ : Let us first apply the (stabilizer) qudit controlled-NOT gate ${\mathrm{CNOT}}_{d}$ to the stabilizer state $| +\rangle \otimes | 0\rangle$ to prepare the maximally entangled state ${{\rm{\Phi }}}_{d}\in {{ \mathcal W }}_{+}$ . Since ${ \mathcal N }$ completely preserves the positivity of the Wigner function, it follows that

$\begin{eqnarray}&&{J}_{{ \mathcal N }}/d=({\mathrm{id}}_{R}\otimes { \mathcal N })({{\rm{\Phi }}}_{d})\in {{ \mathcal W }}_{+}.\end{eqnarray} \tag{ 39 }$

$2\to 3$ : We find that

$\begin{eqnarray}&&{W}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})=\mathrm{Tr}\,[{J}_{{ \mathcal N }}({\left({A}_{A}^{{\bf{u}}}\right)}^{T}\otimes {A}_{B}^{{\bf{v}}})]{/d}_{B}=\mathrm{Tr}\,[{J}_{{ \mathcal N }}({A}_{A}^{{\bf{u}}^{\prime} }\otimes {A}_{B}^{{\bf{v}}})]{/d}_{B}\geqslant 0.\end{eqnarray} \tag{ 40 }$

In the second equality, we used that ${A}_{A}^{{\bf{u}}^{\prime} }={\left({A}_{A}^{{\bf{u}}}\right)}^{T}$ for some ${\bf{u}}^{\prime}$, which always exists since $\{{A}_{A}^{{\bf{u}}}\}{}_{{\bf{u}}}=\{{({A}_{A}^{{\bf{u}}})}^{T}\}{}_{{\bf{u}}}$; the inequality then follows from the assumed non-negativity of the discrete Wigner function of ${J}_{{ \mathcal N }}$. The fact that ${W}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})$ is a conditional probability distribution follows from the inequality in (40) and the constraint in (30).

$3\to 1$ : If the channel ${ \mathcal N }$ has a non-negative Wigner function, then for an input state ${\rho }_{{AR}}$ such that ${\rho }_{{AR}}\in {{ \mathcal W }}_{+}$ , it follows from lemma 1 that

$\begin{eqnarray}&&{W}_{{ \mathcal N }({\rho }_{{AR}})}({\bf{v}},{\bf{y}})=\displaystyle \sum _{{\bf{u}}}{W}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})\ {W}_{{\rho }_{{AR}}}({\bf{u}},{\bf{y}})\geqslant 0,\end{eqnarray} \tag{ 41 }$

concluding the proof. ■

We remark here that the equivalence between 2 and 3 above was proved in [17], and our contribution is to show the equivalence between 2, 3, and the completely positive Wigner preserving property, which considers information processing in the presence of reference systems.
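The equivalence between statements 2 and 3 of theorem 2 can also be seen numerically: the Wigner function of the normalized Choi state ${J}_{{ \mathcal N }}/d$ and the table ${W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})$ contain the same values up to a relabeling of the input phase-space points and a factor $d^2$. The NumPy sketch below checks this for d = 3, assuming equal input and output dimensions; the helper names and the two example channels are assumptions made for this illustration.

```python
import numpy as np
from itertools import product

def phase_space_ops(d):
    """Hypothetical helper: list of the d^2 operators A_u from eqs. (3) and (8)."""
    omega = np.exp(2j * np.pi / d)
    tau = np.exp(1j * np.pi * (d + 1) / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    mp = np.linalg.matrix_power
    T = [tau ** (-a1 * a2) * mp(Z, a1) @ mp(X, a2)
         for a1, a2 in product(range(d), repeat=2)]
    A0 = sum(T) / d
    return [Tu @ A0 @ Tu.conj().T for Tu in T]

def choi(kraus):
    """J = sum_ij |i><j| x N(|i><j|) for N(X) = sum_k K X K^dagger."""
    d = kraus[0].shape[1]
    J = np.zeros((d * d, d * d), dtype=complex)
    for i, j in product(range(d), repeat=2):
        E = np.zeros((d, d))
        E[i, j] = 1.0
        J += np.kron(E, sum(K @ E @ K.conj().T for K in kraus))
    return J

def wigner_tables(kraus):
    d = kraus[0].shape[0]
    A, J = phase_space_ops(d), choi(kraus)
    # Statement 2: Wigner function of the normalized Choi state J/d on systems A and B.
    W_J = np.array([[np.trace(np.kron(Au, Av) @ J).real / d**3 for Av in A] for Au in A])
    # Statement 3: W_N(v|u) from (28), with the transpose on the input operator.
    W_N = np.array([[np.trace(np.kron(Au.T, Av) @ J).real / d for Au in A] for Av in A])
    return W_J, W_N

d = 3
omega = np.exp(2j * np.pi / d)
F = np.array([[omega ** (j * k) for k in range(d)] for j in range(d)]) / np.sqrt(d)
s = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)
prep = [np.outer(s, e) for e in np.eye(d)]     # replacer channel with output |S><S|

results = {}
for name, kraus in [("fourier", [F]), ("prep", prep)]:
    W_J, W_N = wigner_tables(kraus)
    same_values = np.allclose(np.sort(W_J.ravel()) * d**2, np.sort(W_N.ravel()))
    results[name] = (same_values, W_J.min(), W_N.min())
    print(name, same_values, W_J.min(), W_N.min())
```

Because transposition only permutes the phase-space point operators, the two tables are non-negative together or not at all, which is the content of the step from statement 2 to statement 3.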

3.2. Quantum (CPWP) superchannels

A superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}$ is a quantum-physical evolution of a quantum channel ${{ \mathcal N }}_{A\to B}$ [49, 50], which leads to an output channel ${{ \mathcal K }}_{C\to D}$ as

$\begin{eqnarray}&&{{ \mathcal K }}_{C\to D}={{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}({{ \mathcal N }}_{A\to B}).\end{eqnarray} \tag{ 42 }$

The output channel ${{ \mathcal K }}_{C\to D}$ taking system C to system D can be denoted by ${\rm{\Xi }}({ \mathcal N })$ for short. The key property of a quantum superchannel is that the output map

$\begin{eqnarray}&&\left({\mathrm{id}}_{R}\otimes {{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}\right)({{ \mathcal N }}_{{RA}\to {RB}})\end{eqnarray} \tag{ 43 }$

is a legitimate quantum channel for all input bipartite channels ${{ \mathcal N }}_{{RA}\to {RB}}$ , where the reference system R is arbitrary and ${\mathrm{id}}_{R}$ denotes the identity superchannel. A superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}$ has a physical realization in terms of a pre-processing channel ${{ \mathcal E }}_{C\to {AM}}$ and a post-processing channel ${{ \mathcal D }}_{{BM}\to D}$ [49, 50], so that

$\begin{eqnarray}&&{{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}({{ \mathcal N }}_{A\to B})={{ \mathcal D }}_{{BM}\to D}\,\circ \,{{ \mathcal N }}_{A\to B}\,\circ \,{{ \mathcal E }}_{C\to {AM}}.\end{eqnarray} \tag{ 44 }$

The superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}$ is in one-to-one correspondence with a bipartite channel ${{ \mathcal P }}_{{CB}\to {AD}}$ , defined as

$\begin{eqnarray}&&{{ \mathcal P }}_{{CB}\to {AD}}:= {{ \mathcal D }}_{{BM}\to D}\,\circ \,{{ \mathcal E }}_{C\to {AM}}.\end{eqnarray} \tag{ 45 }$

Related to this, an arbitrary bipartite channel ${{ \mathcal P }}_{{CB}\to {AD}}^{{\prime} }$ is in one-to-one correspondence with a superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}^{{\prime} }$ as long as it obeys the following non-signaling constraint [51, theorem 4]:

$\begin{eqnarray}&&{\mathrm{Tr}}_{D}\,\circ \,{{ \mathcal P }}_{{CB}\to {AD}}^{{\prime} }={\mathrm{Tr}}_{D}\,\circ \,{{ \mathcal P }}_{{CB}\to {AD}}^{{\prime} }\,\circ \,{{ \mathcal R }}_{B}^{\pi },\end{eqnarray} \tag{ 46 }$

where ${{ \mathcal R }}_{B}^{\pi }$ is a replacer channel, defined as ${{ \mathcal R }}_{B}^{\pi }({\omega }_{B})=\mathrm{Tr}[{\omega }_{B}]{\pi }_{B}$ with ${\pi }_{B}$ the maximally mixed state. That is, the non-signaling constraint implies that a trace out of system D has the effect of tracing and replacing system B, thus preventing B from signaling to A.

The Choi operator of a quantum superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}$ is given by the Choi operator of its corresponding bipartite channel ${{ \mathcal P }}_{{CB}\to {AD}}$ [49, 50]:

$\begin{eqnarray}&&{J}_{{CBAD}}^{{\rm{\Xi }}}:= \displaystyle \sum _{i,j,{i}^{{\prime} },{j}^{{\prime} }}| i\rangle \langle j{| }_{C}\otimes | {i}^{{\prime} }\rangle \langle {j}^{{\prime} }{| }_{B}\otimes {{ \mathcal P }}_{{C}^{{\prime} }{B}^{{\prime} }\to {AD}}(| i\rangle \langle j{| }_{{C}^{{\prime} }}\otimes | {i}^{{\prime} }\rangle \langle {j}^{{\prime} }{| }_{{B}^{{\prime} }}),\end{eqnarray} \tag{ 47 }$

where systems ${B}^{{\prime} }$ and ${C}^{{\prime} }$ are isomorphic to B and C, respectively. It obeys the following constraints:

$\begin{eqnarray}&&{J}_{{CBAD}}^{{\rm{\Xi }}}\geqslant 0,\,\,\end{eqnarray} \tag{ 48 }$

$\begin{eqnarray}&&{\mathrm{Tr}}_{{AD}}[{J}_{{CBAD}}^{{\rm{\Xi }}}]={I}_{{CB}},\,\,\end{eqnarray} \tag{ 49 }$

$\begin{eqnarray}&&{\mathrm{Tr}}_{D}[{J}_{{CBAD}}^{{\rm{\Xi }}}]={\mathrm{Tr}}_{{DB}}[{J}_{{CBAD}}^{{\rm{\Xi }}}]\otimes {\pi }_{B},\end{eqnarray} \tag{ 50 }$

which correspond respectively to complete positivity of the bipartite channel ${{ \mathcal P }}_{{CB}\to {AD}}$, trace preservation of ${{ \mathcal P }}_{{CB}\to {AD}}$, and the $B\not\to A$ non-signaling constraint. Conversely, any operator obeying the three constraints above is the Choi operator of a bipartite channel corresponding to a superchannel. One can employ the following propagation rule [49, 50] to determine the Choi operator of the output channel in (42):

$\begin{eqnarray}&&{J}_{{CD}}^{{ \mathcal K }}={\mathrm{Tr}}_{{AB}}\left[\left({\left({J}_{{AB}}^{{ \mathcal N }}\right)}^{T}\otimes {I}_{{CD}}\right){J}_{{CBAD}}^{{\rm{\Xi }}}\right],\end{eqnarray} \tag{ 51 }$

where the superscript T denotes the transpose operation.

By employing definition 2, we define the discrete Wigner function of a quantum superchannel ${{\rm{\Xi }}}_{(A\to B)\to (C\to D)}$ , and we do so by means of its corresponding bipartite channel ${{ \mathcal P }}_{{CB}\to {AD}}$ . That is, since ${{ \mathcal P }}_{{CB}\to {AD}}$ is a channel, it has a discrete Wigner function

$\begin{eqnarray}&&{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})=\displaystyle \frac{1}{{d}_{A}{d}_{D}}\mathrm{Tr}[\left({A}_{A}^{{{\boldsymbol{u}}}_{A}}\otimes {A}_{D}^{{{\boldsymbol{v}}}_{D}}\right){{ \mathcal P }}_{{CB}\to {AD}}\left({A}_{C}^{{{\boldsymbol{u}}}_{C}}\otimes {A}_{B}^{{{\boldsymbol{v}}}_{B}}\right)],\end{eqnarray} \tag{ 52 }$

where we use the subscript Ξ to indicate its association with the superchannel Ξ. The letters ${\boldsymbol{u}}$ and ${\boldsymbol{v}}$ are chosen in the above way because, in what follows, we link the discrete Wigner function ${W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})$ of a quantum channel ${{ \mathcal N }}_{A\to B}$ with ${W}_{{\rm{\Xi }}}$, and this notation is more convenient for doing so. In addition to obeying the following property

$\begin{eqnarray}&&\displaystyle \sum _{{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})=1,\end{eqnarray} \tag{ 53 }$

so that ${W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})$ is a conditional quasi-probability distribution, there is an extra constraint imposed on ${W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})$ related to the non-signaling constraint $B{/}\!\!\!\!\!\!\!{\to }A$ in (46). To see this, let

$\begin{eqnarray}&&{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}):= \displaystyle \sum _{{{\boldsymbol{v}}}_{D}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}).\end{eqnarray} \tag{ 54 }$

By employing the non-signaling constraint in (46) and properties of the phase-space point operators, it is straightforward to conclude that the non-signaling constraint $B{/}\!\!\!\!\!\!\!{\to }A$ is equivalent to the following condition on the discrete Wigner function ${W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})$ :

$\begin{eqnarray}&&{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})={W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}^{{\prime} })\qquad \forall {{\boldsymbol{v}}}_{B},{{\boldsymbol{v}}}_{B}^{{\prime} },\end{eqnarray} \tag{ 55 }$

so that we can write

$\begin{eqnarray}&&{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C}):= {W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}).\end{eqnarray} \tag{ 56 }$

This can be interpreted as indicating that the output phase-space point ${{\boldsymbol{u}}}_{A}$ is independent of ${{\boldsymbol{v}}}_{B}$ if system D is not available (i.e. has been marginalized). We note here that the conditions in (55) represent a direct generalization of non-signaling constraints for classical probability distributions to quasi-probability distributions. Furthermore, we also observe that the super-quasi-probability distribution in (52) represents a generalization of the classical superchannels discussed in [52, 53].

By employing the propagation rule in (51) and a sequence of steps similar to those given in the proof of lemma 1, we conclude that the discrete Wigner function of the output channel ${{ \mathcal K }}_{C\to D}={{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}({{ \mathcal N }}_{A\to B})$ is given by

$\begin{eqnarray}&&{W}_{{\rm{\Xi }}({ \mathcal N })}({{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C})=\sum _{{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}){W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A}),\end{eqnarray} \tag{ 57 }$

again generalizing the fully classical case from [52, 53].
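The contraction in (57) can be illustrated numerically. The following is a minimal sketch, not part of the original development: it assumes a product-form superchannel whose discrete Wigner function factorizes into a classical pre-processing channel `E` and a classical post-processing channel `F`, and it uses a random real matrix with unit column sums as a stand-in for ${W}_{{ \mathcal N }}$ (such a matrix need not come from a physical channel). All names (`E`, `F`, `W_Xi`, `W_N`, `W_out`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 9  # number of phase-space points of a qutrit (d**2 with d = 3)

def random_stochastic(n_out, n_in):
    """Column-stochastic matrix: entry [y, x] plays the role of p(y|x)."""
    m = rng.random((n_out, n_in))
    return m / m.sum(axis=0)

def random_quasi_stochastic(n_out, n_in):
    """Real matrix with unit column sums and, typically, some negative entries."""
    m = rng.normal(size=(n_out, n_in))
    return m + (1.0 - m.sum(axis=0)) / n_out

E = random_stochastic(n, n)          # pre-processing,  E[uA, uC]
F = random_stochastic(n, n)          # post-processing, F[vD, vB]
W_N = random_quasi_stochastic(n, n)  # stand-in for W_N[vB, uA]

# Assumed product form: W_Xi[uA, vD, uC, vB] = E[uA, uC] * F[vD, vB]
W_Xi = np.einsum('ac,db->adcb', E, F)

# Normalization (53) and the non-signaling condition (55)
normalized = np.allclose(W_Xi.sum(axis=(0, 1)), 1.0)
marginal = W_Xi.sum(axis=1)                       # indices [uA, uC, vB]
non_signaling = np.allclose(marginal, marginal[:, :, :1])

# Propagation rule (57): contract uA and vB against W_N
W_out = np.einsum('adcb,ba->dc', W_Xi, W_N)       # W_out[vD, uC]

# For this product form, (57) reduces to the matrix product F @ W_N @ E
print(normalized, non_signaling, np.allclose(W_out, F @ W_N @ E))
```

In this special case the output Wigner function is the matrix product of the post-processing, channel, and pre-processing matrices, which matches the classical picture of a superchannel as pre- and post-processing.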

We now define CPWP superchannels as free superchannels that extend the notion of CPWP channels:

(CPWP superchannel).

Definition 3 A superchannel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}$ is completely CPWP preserving (CPWP superchannel for short) if, for all CPWP channels ${{ \mathcal N }}_{{RA}\to {RB}}$ , the output channel ${{\rm{\Xi }}}_{\left(A\to B\right)\to \left(C\to D\right)}({{ \mathcal N }}_{{RA}\to {RB}})$ is CPWP, where R is an arbitrary reference system.

We then have the following theorem as a generalization of theorem 2 (its proof is very similar and so we omit it):

Theorem 3. The following statements about CPWP superchannels are equivalent:

1.
The quantum superchannel ${\rm{\Xi }}$ is CPWP;
2.
The discrete Wigner function of the Choi matrix ${J}_{{CBAD}}^{{\rm{\Xi }}}$ is non-negative;
3.
The discrete Wigner function ${W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})$ is non-negative for all ${{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}$ , ${{\boldsymbol{u}}}_{C}$ , and ${{\boldsymbol{v}}}_{B}$ (i.e. ${W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})$ is a conditional probability distribution or classical bipartite channel with a non-signaling constraint).

An interesting consequence of the third part of the above theorem is that every CPWP superchannel has a non-unique realization in terms of pre- and post-processing CPWP channels. This follows from the fact that every non-signaling classical bipartite channel can be realized in terms of pre- and post-processing classical channels (see the discussion surrounding [52, equation (7)]), and these pre- and post-processing classical channels can be identified as the discrete Wigner functions of pre- and post-processing CPWP channels.

3.3. Logarithmic negativity (mana) of a quantum channel

To quantify the magic of quantum channels, we introduce the mana (or logarithmic negativity) of a quantum channel ${{ \mathcal N }}_{A\to B}$ :

(Mana of a quantum channel).

Definition 4 The mana of a quantum channel ${{ \mathcal N }}_{A\to B}$ is defined as

$\begin{eqnarray}&&{ \mathcal M }({{ \mathcal N }}_{A\to B}):= \mathrm{log}\mathop{\max }\limits_{{\boldsymbol{u}}}\parallel {{ \mathcal N }}_{A\to B}({A}_{A}^{{\boldsymbol{u}}}){\parallel }_{W,1}\,\end{eqnarray} \tag{ 58 }$

$\begin{eqnarray}&&\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{\boldsymbol{u}}}\displaystyle \sum _{{\boldsymbol{v}}}\displaystyle \frac{1}{{d}_{B}}\left|\mathrm{Tr}\,[{A}_{B}^{{\boldsymbol{v}}}{{ \mathcal N }}_{A\to B}({A}_{A}^{{\boldsymbol{u}}})]\right|\end{eqnarray} \tag{ 59 }$

$\begin{eqnarray}&&\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{\boldsymbol{u}}}\displaystyle \sum _{{\boldsymbol{v}}}\displaystyle \frac{1}{{d}_{B}}\left|\mathrm{Tr}\,[({A}_{A}^{{\boldsymbol{u}}}\otimes {A}_{B}^{{\boldsymbol{v}}}){J}_{{AB}}^{{ \mathcal N }}]\right|\end{eqnarray} \tag{ 60 }$

$\begin{eqnarray}&&\,=\,\mathrm{log}\mathop{\max }\limits_{{\boldsymbol{u}}}\displaystyle \sum _{{\boldsymbol{v}}}| {W}_{{ \mathcal N }}({\boldsymbol{v}}| {\boldsymbol{u}})| .\end{eqnarray} \tag{ 61 }$

More generally, we define the mana of a Hermiticity-preserving linear map ${{ \mathcal P }}_{A\to B}$ via the same formula above, but substituting ${ \mathcal N }$ with ${ \mathcal P }$ .
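As an illustration of definition 4, the following sketch evaluates (61) for two qutrit channels given by Kraus operators. Several ingredients are assumptions made for this example and are not fixed by the text above: the Heisenberg-Weyl convention ${T}_{{\boldsymbol{u}}}={\tau }^{-{a}_{1}{a}_{2}}{Z}^{{a}_{1}}{X}^{{a}_{2}}$ with $\tau ={e}^{(d+1)\pi i/d}$ , the construction ${A}^{{\bf{0}}}=\tfrac{1}{d}{\sum }_{{\boldsymbol{u}}}{T}_{{\boldsymbol{u}}}$ , the choice of base-2 logarithm, and all function names (`weyl`, `channel_wigner`, `mana`).

```python
import numpy as np

d = 3                                    # qutrit; odd d is assumed throughout
omega = np.exp(2j * np.pi / d)
tau = np.exp(1j * np.pi * (d + 1) / d)   # satisfies tau**2 == omega

X = np.roll(np.eye(d), 1, axis=0)        # shift: X|j> = |j+1 mod d>
Z = np.diag(omega ** np.arange(d))       # clock: Z|j> = omega**j |j>

def weyl(a1, a2):
    """Heisenberg-Weyl operator T_u = tau^(-a1*a2) Z^a1 X^a2 (assumed convention)."""
    return tau ** (-a1 * a2) * np.linalg.matrix_power(Z, a1) @ np.linalg.matrix_power(X, a2)

points = [(a1, a2) for a1 in range(d) for a2 in range(d)]
A0 = sum(weyl(*u) for u in points) / d
A = [weyl(*u) @ A0 @ weyl(*u).conj().T for u in points]  # phase-space point operators

def channel_wigner(kraus):
    """W[v, u] = (1/d) Tr[A_v N(A_u)] for N(rho) = sum_K K rho K^dagger."""
    W = np.zeros((d * d, d * d))
    for iu, Au in enumerate(A):
        out = sum(K @ Au @ K.conj().T for K in kraus)
        for iv, Av in enumerate(A):
            W[iv, iu] = np.trace(Av @ out).real / d
    return W

def mana(W):
    """Eq. (61): log of the largest column-wise sum of |W(v|u)| (base 2 assumed)."""
    return np.log2(np.abs(W).sum(axis=0).max())

# Identity channel, and a replacer channel preparing |S> = (|1> - |2>)/sqrt(2)
identity = [np.eye(d)]
strange = np.array([0.0, 1.0, -1.0]) / np.sqrt(2)
replacer = [np.outer(strange, np.eye(d)[j]) for j in range(d)]  # K_j = |S><j|

mana_identity = mana(channel_wigner(identity))
mana_replacer = mana(channel_wigner(replacer))
print(mana_identity, mana_replacer)
```

Under these assumptions the identity channel has vanishing mana, while the replacer channel returns the mana of the prepared state, $\mathrm{log}(5/3)$ for this particular state, since its Wigner function equals $-1/3$ at one phase-space point and $1/6$ at the remaining eight.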

In the following, we are going to show that the mana of a quantum channel has many desirable properties, such as:

1.
Reduction to states: ${ \mathcal M }({ \mathcal N })={ \mathcal M }(\sigma )$ when the channel ${ \mathcal N }$ is a replacer channel, acting as ${ \mathcal N }(\rho )=\mathrm{Tr}\,[\rho ]\sigma$ for an arbitrary input state ρ, with σ a state.
2.
Additivity under tensor products (proposition 5): ${ \mathcal M }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})={ \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2})$ .
3.
Subadditivity under serial composition of channels (proposition 6): ${ \mathcal M }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})\leqslant { \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2})$ .
4.
Faithfulness (proposition 7): ${ \mathcal M }({ \mathcal N })\geqslant 0$ and ${ \mathcal M }({ \mathcal N })=0$ if and only if ${ \mathcal N }\in \mathrm{CPWP}$ .
5.
Amortization inequality (proposition 8): $\forall {\rho }_{{RA}}$ , ${ \mathcal M }(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})\leqslant { \mathcal M }({ \mathcal N })$ .
6.
Monotonicity under CPWP superchannels (theorem 9), which implies monotonicity under completely stabilizer-preserving superchannels.

(Reduction to states).

Proposition 4 Let ${ \mathcal N }$ be a replacer channel, acting as ${ \mathcal N }(\rho )=\mathrm{Tr}\,[\rho ]\sigma$ for an arbitrary input state $\rho$ , with $\sigma$ a state. Then

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })={ \mathcal M }(\sigma ).\end{eqnarray} \tag{ 62 }$

Proof. Applying definitions and the fact that $\mathrm{Tr}\,[{A}_{{\bf{u}}}]=1$ for a phase-space point operator ${A}_{{\bf{u}}}$ , we find that

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })=\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel { \mathcal N }({A}_{{\bf{u}}}){\parallel }_{W,1}=\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel \mathrm{Tr}\,[{A}_{{\bf{u}}}]\sigma {\parallel }_{W,1}=\mathrm{log}\parallel \sigma {\parallel }_{W,1}={ \mathcal M }(\sigma ),\end{eqnarray} \tag{ 63 }$

concluding the proof. ■

(Additivity).

Proposition 5 For quantum channels ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ , the following additivity identity holds

$\begin{eqnarray}&&{ \mathcal M }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})={ \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 64 }$

More generally, the same additivity identity holds if ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ are Hermiticity-preserving linear maps.

Proof. The proof relies on basic properties of the Wigner one-norm and composite phase-space point operators, i.e.

$\begin{eqnarray}&&{ \mathcal M }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})=\mathrm{log}\mathop{\max }\limits_{{{\bf{u}}}_{1},{{\bf{u}}}_{2}}\parallel ({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})({A}_{{{\bf{u}}}_{1}}\otimes {A}_{{{\bf{u}}}_{2}}){\parallel }_{W,1}\end{eqnarray} \tag{ 65 }$

$\begin{eqnarray}&&\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\bf{u}}}_{1},{{\bf{u}}}_{2}}\left[\parallel {{ \mathcal N }}_{1}({A}_{{{\bf{u}}}_{1}}){\parallel }_{W,1}\cdot \parallel {{ \mathcal N }}_{2}({A}_{{{\bf{u}}}_{2}}){\parallel }_{W,1}\right]\end{eqnarray} \tag{ 66 }$

$\begin{eqnarray}&&\,\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\bf{u}}}_{1}}\parallel {{ \mathcal N }}_{1}({A}_{{{\bf{u}}}_{1}}){\parallel }_{W,1}+\mathrm{log}\mathop{\max }\limits_{{{\bf{u}}}_{2}}\parallel {{ \mathcal N }}_{2}({A}_{{{\bf{u}}}_{2}}){\parallel }_{W,1}\end{eqnarray} \tag{ 67 }$

$\begin{eqnarray}&&=\,{ \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2}).\,\end{eqnarray} \tag{ 68 }$

This concludes the proof. ■
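The additivity argument can be checked numerically at the level of Wigner functions. The sketch below assumes that the discrete Wigner function of a tensor-product map is the Kronecker product of the individual Wigner functions (which is what the factorization of composite phase-space point operators gives), and it uses random real matrices with unit column sums as hypothetical stand-ins for ${W}_{{{ \mathcal N }}_{1}}$ and ${W}_{{{ \mathcal N }}_{2}}$ . The base-2 logarithm and the helper names are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

def quasi_stochastic(n):
    """Random real n x n matrix whose columns each sum to one."""
    m = rng.normal(size=(n, n))
    return m + (1.0 - m.sum(axis=0)) / n

def mana(W):
    """log2 of the largest column-wise sum of absolute values, as in (61)."""
    return np.log2(np.abs(W).sum(axis=0).max())

W1 = quasi_stochastic(9)   # stand-in for the Wigner function of N_1
W2 = quasi_stochastic(9)   # stand-in for the Wigner function of N_2
W12 = np.kron(W1, W2)      # assumed Wigner function of the tensor-product map

print(np.isclose(mana(W12), mana(W1) + mana(W2)))
```

The check succeeds because each column-wise absolute sum of a Kronecker product is a product of column-wise absolute sums of the factors, so the maximum factorizes exactly as in (66) and (67).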

(Subadditivity).

Proposition 6 For quantum channels ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ , the following subadditivity inequality holds

$\begin{eqnarray}&&{ \mathcal M }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})\leqslant { \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 69 }$

More generally, the same subadditivity inequality holds if ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ are Hermiticity-preserving linear maps.

Proof. Consider the following for an arbitrary phase-space point operator ${A}_{{\bf{u}}}$ :

$\begin{eqnarray}&&\mathrm{log}\parallel ({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})({A}_{{\bf{u}}}){\parallel }_{W,1}=\mathrm{log}\displaystyle \frac{\parallel ({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})({A}_{{\bf{u}}}){\parallel }_{W,1}}{\parallel {{ \mathcal N }}_{1}({A}_{{\bf{u}}}){\parallel }_{W,1}}+\mathrm{log}\parallel {{ \mathcal N }}_{1}({A}_{{\bf{u}}}){\parallel }_{W,1}\end{eqnarray} \tag{ 70 }$

$\begin{eqnarray}&&\,\,\,\,\,=\,\mathrm{log}\displaystyle \frac{{\parallel {{ \mathcal N }}_{2}\left({\sum }_{{\bf{u}}^{\prime} }{W}_{{{ \mathcal N }}_{1}({A}_{{\bf{u}}})}({\bf{u}}^{\prime} ){A}_{{\bf{u}}^{\prime} }\right)\parallel }_{W,1}}{{\sum }_{{\bf{u}}^{\prime} }| {W}_{{{ \mathcal N }}_{1}({A}_{{\bf{u}}})}({\bf{u}}^{\prime} )| }+\mathrm{log}\parallel {{ \mathcal N }}_{1}({A}_{{\bf{u}}}){\parallel }_{W,1}\end{eqnarray} \tag{ 71 }$

$\begin{eqnarray}&&\,\,\,\,\,\,\leqslant \,\mathrm{log}\displaystyle \sum _{{\bf{u}}^{\prime} }\displaystyle \frac{| {W}_{{{ \mathcal N }}_{1}({A}_{{\bf{u}}})}({\bf{u}}^{\prime} )| }{{\sum }_{{\bf{u}}^{\prime} }| {W}_{{{ \mathcal N }}_{1}({A}_{{\bf{u}}})}({\bf{u}}^{\prime} )| }{\parallel {{ \mathcal N }}_{2}\left({A}_{{\bf{u}}^{\prime} }\right)\parallel }_{W,1}+\mathrm{log}\parallel {{ \mathcal N }}_{1}({A}_{{\bf{u}}}){\parallel }_{W,1}\end{eqnarray} \tag{ 72 }$

$\begin{eqnarray}&&\,\,\,\,\leqslant \,\mathrm{log}\mathop{\max }\limits_{{\bf{u}}^{\prime} }{\parallel {{ \mathcal N }}_{2}\left({A}_{{\bf{u}}^{\prime} }\right)\parallel }_{W,1}+\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel {{ \mathcal N }}_{1}({A}_{{\bf{u}}}){\parallel }_{W,1}\end{eqnarray} \tag{ 73 }$

$\begin{eqnarray}&&=\,{ \mathcal M }({{ \mathcal N }}_{1})+{ \mathcal M }({{ \mathcal N }}_{2}).\,\end{eqnarray} \tag{ 74 }$

Since the chain of inequalities holds for an arbitrary phase-space point operator ${A}_{{\bf{u}}}$ , we conclude the statement of the proposition. ■
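A quick numerical sanity check of the subadditivity inequality is sketched here. It assumes that serial composition corresponds to multiplying the Wigner-function matrices, ${W}_{{{ \mathcal N }}_{2}\circ {{ \mathcal N }}_{1}}={W}_{{{ \mathcal N }}_{2}}{W}_{{{ \mathcal N }}_{1}}$ , which is the content of the expansion used in (71). The random matrices are hypothetical stand-ins rather than Wigner functions of specific channels, and the helper names are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(5)

def quasi_stochastic(n):
    """Random real n x n matrix whose columns each sum to one."""
    m = rng.normal(size=(n, n))
    return m + (1.0 - m.sum(axis=0)) / n

def wigner_one_norm(W):
    """max_u sum_v |W(v|u)|: the largest column-wise absolute sum."""
    return np.abs(W).sum(axis=0).max()

gaps = []
for _ in range(100):
    W1, W2 = quasi_stochastic(9), quasi_stochastic(9)
    lhs = np.log2(wigner_one_norm(W2 @ W1))   # mana of the composition
    rhs = np.log2(wigner_one_norm(W1)) + np.log2(wigner_one_norm(W2))
    gaps.append(rhs - lhs)

# Every gap should be non-negative, up to floating-point error
print(min(gaps) > -1e-9)
```

Unlike additivity under tensor products, the inequality here is typically strict for generic matrices, since negative entries can partially cancel in the product.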

(Faithfulness).

Proposition 7 Let ${{ \mathcal N }}_{A\to B}$ be a quantum channel. Then the mana of the channel ${ \mathcal N }$ satisfies ${ \mathcal M }({ \mathcal N })\geqslant 0$ , and ${ \mathcal M }({ \mathcal N })=0$ if and only if ${ \mathcal N }\in \mathrm{CPWP}$ .

Proof. To see the first claim, from the assumption that ${ \mathcal N }$ is a quantum channel and (30), we find that

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{v}}}| {{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})| \,=\,2\left[\displaystyle \sum _{{\bf{v}}:{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})\lt 0}| {{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})| \right]+1\geqslant 1\quad \forall {\bf{u}}.\end{eqnarray} \tag{ 75 }$

Taking a maximization over ${\bf{u}}$ and applying a logarithm leads to the conclusion that ${ \mathcal M }({ \mathcal N })\geqslant 0$ for all channels ${ \mathcal N }$ .

Now suppose that ${ \mathcal N }\in \mathrm{CPWP}$ . Then by theorem 2, it follows that ${{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})$ is a conditional probability distribution, so that ${\sum }_{{\bf{v}}}| {{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})| ={\sum }_{{\bf{v}}}{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})=1$ for all ${\bf{u}}$ . It then follows from the definition that ${ \mathcal M }({ \mathcal N })=0$ .

Finally, suppose that ${ \mathcal M }({ \mathcal N })=0$ . By definition, this implies that ${\max }_{{\bf{u}}}{\sum }_{{\bf{v}}}| {{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})| \,=\,1$ . However, consider that the rightmost inequality in (75) holds for all channels. So our assumption and this inequality imply that ${\sum }_{{\bf{v}}:{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})\lt 0}| {{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})| =0$ for all ${\bf{u}}$ , which means that ${{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})\geqslant 0$ for all ${\bf{u}},{\bf{v}}$ . By theorem 2, it follows that ${ \mathcal N }\in \mathrm{CPWP}$ .■
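For completeness, the equality in (75) can be unpacked as follows. This short derivation assumes that (30) is the normalization condition ${\sum }_{{\bf{v}}}{{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})=1$ for every ${\bf{u}}$ , and it abbreviates ${{ \mathcal W }}_{{ \mathcal N }}({\bf{v}}| {\bf{u}})$ as ${w}_{{\bf{v}}}$ for a fixed ${\bf{u}}$ (a shorthand used only here). Splitting the sum according to the sign of each term gives

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{v}}}| {w}_{{\bf{v}}}| =\displaystyle \sum _{{\bf{v}}:{w}_{{\bf{v}}}\geqslant 0}{w}_{{\bf{v}}}+\displaystyle \sum _{{\bf{v}}:{w}_{{\bf{v}}}\lt 0}| {w}_{{\bf{v}}}| =\left(1+\displaystyle \sum _{{\bf{v}}:{w}_{{\bf{v}}}\lt 0}| {w}_{{\bf{v}}}| \right)+\displaystyle \sum _{{\bf{v}}:{w}_{{\bf{v}}}\lt 0}| {w}_{{\bf{v}}}| =1+2\displaystyle \sum _{{\bf{v}}:{w}_{{\bf{v}}}\lt 0}| {w}_{{\bf{v}}}| ,\end{eqnarray}$

where the second equality rewrites the normalization condition as ${\sum }_{{\bf{v}}:{w}_{{\bf{v}}}\geqslant 0}{w}_{{\bf{v}}}=1+{\sum }_{{\bf{v}}:{w}_{{\bf{v}}}\lt 0}| {w}_{{\bf{v}}}| $ .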

(Amortization inequality).

Proposition 8 For any quantum channel ${{ \mathcal N }}_{A\to B}$ , the following inequality holds

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{A}}[{ \mathcal M }({ \mathcal N }({\rho }_{A}))-{ \mathcal M }({\rho }_{A})]\leqslant { \mathcal M }({ \mathcal N }).\end{eqnarray} \tag{ 76 }$

Furthermore, we have that

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{{RA}}}[{ \mathcal M }(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})]\leqslant { \mathcal M }({ \mathcal N }).\end{eqnarray} \tag{ 77 }$

Proof. The inequality in (76) is a direct consequence of reduction to states (proposition 4) and subadditivity of mana with respect to serial compositions (proposition 6). Indeed, letting ${{ \mathcal N }}^{{\prime} }$ be a replacer channel that prepares the state ${\rho }_{A}$ , we find that

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N }({\rho }_{A}))={ \mathcal M }({ \mathcal N }\,\circ \,{{ \mathcal N }}^{{\prime} })\leqslant { \mathcal M }({ \mathcal N })+{ \mathcal M }({{ \mathcal N }}^{{\prime} })={ \mathcal M }({ \mathcal N })+{ \mathcal M }({\rho }_{A})\end{eqnarray} \tag{ 78 }$

for all input states ${\rho }_{A}$ , from which we conclude (76).

By applying the inequality in (78) with the substitution ${ \mathcal N }\to \mathrm{id}\otimes { \mathcal N }$ , the additivity of the mana of a channel from proposition 5, and the fact that the identity channel is free (and thus has mana equal to zero), we finally conclude that

$\begin{eqnarray}&&{ \mathcal M }(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})\leqslant { \mathcal M }({\mathrm{id}}_{R}\otimes { \mathcal N })\end{eqnarray} \tag{ 79 }$

$\begin{eqnarray}&&\,\,\,\,\,\,=\,{ \mathcal M }({\mathrm{id}}_{R})+{ \mathcal M }({ \mathcal N })\end{eqnarray} \tag{ 80 }$

$\begin{eqnarray}&&\,\,\,\,=\,{ \mathcal M }({ \mathcal N }),\end{eqnarray} \tag{ 81 }$

from which we conclude (77).■

(Monotonicity).

Theorem 9 Let ${{ \mathcal N }}_{A\to B}$ be a quantum channel, and let ${{\rm{\Xi }}}^{\mathrm{CPWP}}$ be a CPWP superchannel as given in definition 3. Then ${ \mathcal M }({ \mathcal N })$ is a channel magic measure in the sense that

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })\geqslant { \mathcal M }({{\rm{\Xi }}}^{\mathrm{CPWP}}({ \mathcal N })).\end{eqnarray} \tag{ 82 }$

Proof. Recalling the definition of the channel mana in terms of the discrete Wigner function (see (61)) and abbreviating ${{\rm{\Xi }}}^{\mathrm{CPWP}}$ as Ξ, consider that

$\begin{eqnarray}&&{ \mathcal M }({{\rm{\Xi }}}^{\mathrm{CPWP}}({ \mathcal N }))=\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{v}}}_{D}}\left|{W}_{{\rm{\Xi }}({ \mathcal N })}({{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C})\right|\,\,\end{eqnarray} \tag{ 83 }$

$\begin{eqnarray}&&\,\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{v}}}_{D}}\left|\displaystyle \sum _{{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}){W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\end{eqnarray} \tag{ 84 }$

$\begin{eqnarray}&&\,\,\,\leqslant \,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{v}}}_{D},{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}\left|{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B}){W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\end{eqnarray} \tag{ 85 }$

$\begin{eqnarray}&&\,\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{v}}}_{D},{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{D}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\end{eqnarray} \tag{ 86 }$

$\begin{eqnarray}&&\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C},{{\boldsymbol{v}}}_{B})\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\end{eqnarray} \tag{ 87 }$

$\begin{eqnarray}&&\,\,=\,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{u}}}_{A},{{\boldsymbol{v}}}_{B}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C})\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|.\end{eqnarray} \tag{ 88 }$

The second equality follows from (57). The first inequality follows from the triangle inequality. The third equality follows from the assumption that the superchannel Ξ is CPWP, so that its discrete Wigner function is non-negative (see theorem 3). The fourth equality follows from marginalizing ${W}_{{\rm{\Xi }}}$ over ${{\boldsymbol{v}}}_{D}$ . The fifth equality follows from the non-signaling constraint in (55). Continuing, we find that

$\begin{eqnarray}&&\mathrm{Equation}(88)=\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{u}}}_{A}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C})\displaystyle \sum _{{{\boldsymbol{v}}}_{B}}\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\,\end{eqnarray} \tag{ 89 }$

$\begin{eqnarray}&&\,\,\leqslant \,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{C}}\displaystyle \sum _{{{\boldsymbol{u}}}_{A}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C})\left[\mathop{\max }\limits_{{{\boldsymbol{u}}}_{A}}\displaystyle \sum _{{{\boldsymbol{v}}}_{B}}\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\right]\end{eqnarray} \tag{ 90 }$

$\begin{eqnarray}&&=\,\mathrm{log}\mathop{\max }\limits_{{{\boldsymbol{u}}}_{A}}\displaystyle \sum _{{{\boldsymbol{v}}}_{B}}\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|\,\,\end{eqnarray} \tag{ 91 }$

$\begin{eqnarray}&&=\,{ \mathcal M }({ \mathcal N }).\,\,\,\,\end{eqnarray} \tag{ 92 }$

The first equality follows from rearranging sums. The inequality follows from bounding ${\sum }_{{{\boldsymbol{v}}}_{B}}\left|{W}_{{ \mathcal N }}({{\boldsymbol{v}}}_{B}| {{\boldsymbol{u}}}_{A})\right|$ in terms of its maximum value (so that it is no longer dependent on ${{\boldsymbol{u}}}_{A}$ ). The penultimate equality follows because ${\sum }_{{{\boldsymbol{u}}}_{A}}{W}_{{\rm{\Xi }}}({{\boldsymbol{u}}}_{A}| {{\boldsymbol{u}}}_{C})=1$ , and the final one follows by definition. ■
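The chain of inequalities in (83)-(92) can also be checked numerically. The sketch below is only illustrative and rests on stated assumptions: the CPWP superchannel is modeled by a non-negative ${W}_{{\rm{\Xi }}}$ built from a classical pre-processing channel `E` with a memory register and a classical post-processing channel `F` that reads this memory, and ${W}_{{ \mathcal N }}$ is replaced by a random real matrix with unit column sums. The memory size `n_mem` and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_mem = 9, 4   # phase-space points per system; size of the classical memory

def stochastic(n_out, n_in):
    """Column-stochastic matrix: entry [y, x] plays the role of p(y|x)."""
    m = rng.random((n_out, n_in))
    return m / m.sum(axis=0)

def mana(W):
    """log2 of the largest column-wise sum of absolute values, as in (61)."""
    return np.log2(np.abs(W).sum(axis=0).max())

# Pre-processing  E[uA, m, uC] = p(uA, m | uC)
# Post-processing F[vD, vB, m] = p(vD | vB, m)
E = stochastic(n * n_mem, n).reshape(n, n_mem, n)
F = stochastic(n, n * n_mem).reshape(n, n, n_mem)

# W_Xi[uA, vD, uC, vB] = sum_m E[uA, m, uC] * F[vD, vB, m]
W_Xi = np.einsum('amc,dbm->adcb', E, F)

# Stand-in for W_N[vB, uA]: unit column sums, with negative entries
W_N = rng.normal(size=(n, n))
W_N = W_N + (1.0 - W_N.sum(axis=0)) / n

# Output Wigner function via (57), then compare the two manas as in (82)
W_out = np.einsum('adcb,ba->dc', W_Xi, W_N)
print(mana(W_out) <= mana(W_N) + 1e-12)
```

By construction this ${W}_{{\rm{\Xi }}}$ is non-negative, normalized as in (53), and non-signaling as in (55), which are exactly the three properties used in the proof.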

Remark 1. We note here that the monotonicity inequality in (82) holds more generally if ${ \mathcal N }$ is a completely positive map that is not necessarily trace preserving.

3.4. Generalized thauma of a quantum channel

In this section, we define a rather general measure of magic for a quantum channel, called the generalized thauma, which extends to channels the definition from [20] for states. To define it, recall that a generalized divergence ${\bf{D}}(\rho \parallel \sigma )$ is any function of a quantum state ρ and a positive semi-definite operator σ that obeys data processing [54, 55], i.e. ${\bf{D}}(\rho \parallel \sigma )\geqslant {\bf{D}}({ \mathcal N }(\rho )\parallel { \mathcal N }(\sigma ))$ where ${ \mathcal N }$ is a quantum channel. Examples of generalized divergences, in addition to the trace distance and relative entropy, include the Petz–Rényi relative entropies [56], the sandwiched Rényi relative entropies [57, 58], the Hilbert α-divergences [59], and the ${\chi }^{2}$ divergences [60]. One can then define the generalized channel divergence [61], as a way of quantifying the distinguishability of two quantum channels ${{ \mathcal N }}_{A\to B}$ and ${{ \mathcal P }}_{A\to B}$ , as follows:

$\begin{eqnarray}&&{\bf{D}}({ \mathcal N }\parallel { \mathcal P }):= \mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal P }}_{A\to B}({\psi }_{{RA}})),\end{eqnarray} \tag{ 93 }$

where the optimization is with respect to all pure states ${\psi }_{{RA}}$ such that system R is isomorphic to the channel input system A (note that one does not achieve a higher value of ${\bf{D}}({ \mathcal N }\parallel { \mathcal P })$ by allowing for an optimization over mixed states ${\rho }_{{RA}}$ with an arbitrarily large reference system [61], as a consequence of purification, the Schmidt decomposition theorem, and data processing). More generally, ${{ \mathcal P }}_{A\to B}$ can be a completely positive map in the definition in (93). Interestingly, the generalized channel divergence is monotone under the action of a superchannel Ξ:

$\begin{eqnarray}&&{\bf{D}}({ \mathcal N }\parallel { \mathcal P })\geqslant {\bf{D}}({\rm{\Xi }}({ \mathcal N })\parallel {\rm{\Xi }}({ \mathcal P })),\end{eqnarray} \tag{ 94 }$

as shown in [50, section V-A].

We then define generalized thauma as follows:

(Generalized thauma of a quantum channel).

Definition 5 The generalized thauma of a quantum channel ${{ \mathcal N }}_{A\to B}$ is defined as

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N }):= \mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({ \mathcal N }\parallel { \mathcal E }),\end{eqnarray} \tag{ 95 }$

where the optimization is with respect to all completely positive maps ${ \mathcal E }$ having mana ${ \mathcal M }({ \mathcal E })\leqslant 0$ .

It is clear that the above definition extends the generalized thauma of a state [20], which we recall is given by

$\begin{eqnarray}&&{\boldsymbol{\theta }}(\rho ):= \mathop{\inf }\limits_{\sigma \geqslant 0:{ \mathcal M }(\sigma )\leqslant 0}{\bf{D}}(\rho \parallel \sigma ).\end{eqnarray} \tag{ 96 }$

We now prove that the generalized thauma of a quantum channel reduces to the state measure whenever the channel ${ \mathcal N }$ is a replacer channel:

(Reduction to states).

Proposition 10 Let ${ \mathcal N }$ be a replacer channel, acting as ${ \mathcal N }(\rho )=\mathrm{Tr}\,[\rho ]\sigma$ for an arbitrary input state $\rho$ , where $\sigma$ is a state. Then

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N })={\boldsymbol{\theta }}(\sigma ).\end{eqnarray} \tag{ 97 }$

Proof. First, denoting the maximally mixed state by π, consider that

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N })=\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}(({\mathrm{id}}_{R}\otimes { \mathcal N })({\psi }_{{RA}})\parallel ({\mathrm{id}}_{R}\otimes { \mathcal E })({\psi }_{{RA}}))\end{eqnarray} \tag{ 98 }$

$\begin{eqnarray}&&\geqslant \,\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({\pi }_{R}\otimes { \mathcal N }({\pi }_{A})\parallel {\pi }_{R}\otimes { \mathcal E }({\pi }_{A}))\,\end{eqnarray} \tag{ 99 }$

$\begin{eqnarray}&&=\,\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({\sigma }_{B}\parallel { \mathcal E }(\pi ))\,\,\,\end{eqnarray} \tag{ 100 }$

$\begin{eqnarray}&&=\,\mathop{\inf }\limits_{\omega :{ \mathcal M }(\omega )\leqslant 0}{\bf{D}}({\sigma }_{B}\parallel \omega )\,\,\,\,\end{eqnarray} \tag{ 101 }$

$\begin{eqnarray}&&=\,{\boldsymbol{\theta }}(\sigma ).\,\,\,\,\,\end{eqnarray} \tag{ 102 }$

The first equality follows from the definition. The inequality follows by choosing the input state suboptimally to be ${\pi }_{R}\otimes {\pi }_{A}$ . The second equality follows because the generalized divergence is invariant with respect to tensoring in the same state for both arguments. The third equality follows because π is a free state with non-negative Wigner function and ${ \mathcal E }$ is a completely positive map with ${ \mathcal M }({ \mathcal E })\leqslant 0$ . Since one can reach all and only the operators $\omega \in { \mathcal W }$ , the equality follows. Then the last equality follows from the definition.

To see the other inequality, consider that ${ \mathcal E }(\rho )=\mathrm{Tr}\,[\rho ]\omega$ , for $\omega \in { \mathcal W }$ , is a particular completely positive map satisfying ${ \mathcal M }({ \mathcal E })={ \mathcal M }(\omega )\leqslant 0$ , so that

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N })=\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}(({\mathrm{id}}_{R}\otimes { \mathcal N })({\psi }_{{RA}})\parallel ({\mathrm{id}}_{R}\otimes { \mathcal E })({\psi }_{{RA}}))\end{eqnarray} \tag{ 103 }$

$\begin{eqnarray}&&\leqslant \mathop{\inf }\limits_{\omega \,:{ \mathcal M }(\omega )\leqslant 0}{\bf{D}}({\psi }_{R}\otimes {\sigma }_{B}\parallel {\psi }_{R}\otimes {\omega }_{B})\,\,\end{eqnarray} \tag{ 104 }$

$\begin{eqnarray}&&=\,\mathop{\inf }\limits_{\omega :{ \mathcal M }(\omega )\leqslant 0}{\bf{D}}({\sigma }_{B}\parallel {\omega }_{B})\,\,\,\,\end{eqnarray} \tag{ 105 }$

$\begin{eqnarray}&&=\,{\boldsymbol{\theta }}(\sigma ).\,\,\,\,\,\,\end{eqnarray} \tag{ 106 }$

This concludes the proof.■

That the generalized thauma of channels proposed in (95) is a good measure of magic for quantum channels is a consequence of the following theorem:

(Monotonicity).

Theorem 11 Let ${{ \mathcal N }}_{A\to B}$ be a quantum channel, and let ${{\rm{\Xi }}}^{\mathrm{CPWP}}$ be a CPWP superchannel as given in definition 3. Then ${\boldsymbol{\theta }}({ \mathcal N })$ is a channel magic measure in the sense that

$\begin{eqnarray}&&{\boldsymbol{\theta }}({{ \mathcal N }}_{A\to B})\geqslant {\boldsymbol{\theta }}({{\rm{\Xi }}}^{\mathrm{CPWP}}({{ \mathcal N }}_{A\to B})).\end{eqnarray} \tag{ 107 }$

Proof. The idea is to utilize the generalized divergence and its basic property of data processing. In more detail, consider that

$\begin{eqnarray}&&{\boldsymbol{\theta }}({{ \mathcal N }}_{A\to B})=\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({{ \mathcal N }}_{A\to B}\parallel { \mathcal E })\,\,\end{eqnarray} \tag{ 108 }$

$\begin{eqnarray}&&\,\,\,\geqslant \,\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({{\rm{\Xi }}}^{\mathrm{CPWP}}({{ \mathcal N }}_{A\to B})\parallel {{\rm{\Xi }}}^{\mathrm{CPWP}}({ \mathcal E }))\end{eqnarray} \tag{ 109 }$

$\begin{eqnarray}&&\,\,\geqslant \,\mathop{\inf }\limits_{\widehat{{ \mathcal E }}:{ \mathcal M }(\widehat{{ \mathcal E }})\leqslant 0}{\bf{D}}({{\rm{\Xi }}}^{\mathrm{CPWP}}({{ \mathcal N }}_{A\to B})\parallel \widehat{{ \mathcal E }})\end{eqnarray} \tag{ 110 }$

$\begin{eqnarray}&&=\,{\boldsymbol{\theta }}({{\rm{\Xi }}}^{\mathrm{CPWP}}({{ \mathcal N }}_{A\to B})).\,\end{eqnarray} \tag{ 111 }$

The first inequality follows from the fact that the generalized divergence of channels is monotone under the action of a superchannel [50, section V-A]. The second inequality follows from the monotonicity of ${ \mathcal M }({ \mathcal N })$ given in theorem 9 (and which extends more generally to completely positive maps as stated in remark 1). This monotonicity implies that ${ \mathcal M }({ \mathcal E })\geqslant { \mathcal M }({{\rm{\Xi }}}^{\mathrm{CPWP}}({ \mathcal E }))$ and leads to the second inequality. ■

A generalized divergence is called strongly faithful [62] if for a state ${\rho }_{A}$ and a subnormalized state ${\sigma }_{A}$ , we have ${\bf{D}}({\rho }_{A}\parallel {\sigma }_{A})\geqslant 0$ in general and ${\bf{D}}({\rho }_{A}\parallel {\sigma }_{A})=0$ if and only if ${\rho }_{A}={\sigma }_{A}$ .

(Faithfulness).

Proposition 12 Let ${\bf{D}}$ be a strongly faithful generalized divergence. Then the generalized thauma ${\boldsymbol{\theta }}({ \mathcal N })$ of a channel ${ \mathcal N }$ defined through ${\bf{D}}$ is non-negative and it is equal to zero if ${ \mathcal N }\in \mathrm{CPWP}$ . If the generalized divergence is furthermore continuous and ${\boldsymbol{\theta }}({ \mathcal N })=0$ , then ${ \mathcal N }\in \mathrm{CPWP}$ .

Proof. From lemma 29 in appendix A, it follows that any completely positive map ${ \mathcal E }$ subject to the constraint ${ \mathcal M }({ \mathcal E })\leqslant 0$ is trace non-increasing on the set ${{ \mathcal W }}_{+}$ . It thus follows that ${{ \mathcal E }}_{A\to B}({\psi }_{{RA}})$ is subnormalized for any input state ${\psi }_{{RA}}\in {{ \mathcal W }}_{+}$ . By restricting the maximization to such input states in ${{ \mathcal W }}_{+}$ , applying the faithfulness assumption, and applying the definition of generalized thauma, we conclude that ${\boldsymbol{\theta }}({ \mathcal N })\geqslant 0$ .

Suppose that ${ \mathcal N }\in \mathrm{CPWP}$ . Then by proposition 7, ${ \mathcal M }({ \mathcal N })=0$ and so we can set ${ \mathcal E }={ \mathcal N }$ in the definition of generalized thauma and conclude from the faithfulness assumption that ${\boldsymbol{\theta }}({ \mathcal N })=0$ .

Finally, suppose that ${\boldsymbol{\theta }}({ \mathcal N })=0$ . By the assumption of continuity, this means that there exists a completely positive map ${ \mathcal E }$ satisfying ${\bf{D}}({ \mathcal N }\parallel { \mathcal E })=0$ . By lemma 29 in appendix A and the faithfulness assumption, this in turn means that ${{ \mathcal N }}_{A\to B}({{\rm{\Phi }}}_{{RA}})={{ \mathcal E }}_{A\to B}({{\rm{\Phi }}}_{{RA}})$ for the maximally entangled state ${{\rm{\Phi }}}_{{RA}}\in {{ \mathcal W }}_{+}$ , which implies that ${{ \mathcal N }}_{A\to B}={{ \mathcal E }}_{A\to B}$ . However, we have that ${ \mathcal M }({ \mathcal E })\leqslant 0$ , implying that ${ \mathcal M }({ \mathcal N })=0$ , since ${ \mathcal N }$ is a channel and ${ \mathcal M }({ \mathcal N })\geqslant 0$ for all channels. By proposition 7, we conclude that ${ \mathcal N }\in \mathrm{CPWP}$ .■

As discussed in [61, 63], a generalized divergence possesses the direct-sum property on classical-quantum states if the following equality holds:

$\begin{eqnarray}&&{\bf{D}}\left(\displaystyle \sum _{x}{p}_{X}(x)| x\rangle \langle x{| }_{X}\otimes {\rho }^{x}\parallel \displaystyle \sum _{x}{p}_{X}(x)| x\rangle \langle x{| }_{X}\otimes {\sigma }^{x}\right)=\displaystyle \sum _{x}{p}_{X}(x){\bf{D}}({\rho }^{x}\parallel {\sigma }^{x}),\end{eqnarray} \tag{ 112 }$

where ${p}_{X}$ is a probability distribution, ${\{| x\rangle \}}_{x}$ is an orthonormal basis, and ${\{{\rho }^{x}\}}_{x}$ and ${\{{\sigma }^{x}\}}_{x}$ are sets of states. We note that this property holds for trace distance, quantum relative entropy [64], and the Petz–Rényi [56] and sandwiched Rényi [57, 58] quasi-entropies $\mathrm{sgn}(\alpha -1)\mathrm{Tr}\,\left[{\rho }^{\alpha }{\sigma }^{1-\alpha }\right]$ and $\mathrm{sgn}(\alpha -1)\mathrm{Tr}\,\left[{\left({\sigma }^{\tfrac{1-\alpha }{2\alpha }}\rho {\sigma }^{\tfrac{1-\alpha }{2\alpha }}\right)}^{\alpha }\right]$ , respectively.
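As a small numerical illustration of the direct-sum property in (112), the following sketch checks it for the relative entropy and the trace distance in the commuting (classical) case, where every ${\rho }^{x}$ and ${\sigma }^{x}$ is diagonal and can be stored as a probability vector. The function names and the particular numbers are illustrative assumptions, not taken from the paper, and the logarithm is assumed to be base 2.

```python
import math

def relative_entropy(p, q):
    """Classical relative entropy D(p||q) in bits; inf if supp(p) is not in supp(q)."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log2(a / b)
    return total

def trace_distance(p, q):
    """Normalized trace distance between diagonal operators."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def cq_state(p_x, states):
    """Diagonal of sum_x p_X(x) |x><x| (x) state^x for diagonal state^x."""
    return [px * entry for px, state in zip(p_x, states) for entry in state]

# Hypothetical example data (assumption): two classical labels x, qubit-sized states.
p_x = [0.25, 0.75]
rhos = [[0.5, 0.5], [0.9, 0.1]]
sigmas = [[0.2, 0.8], [0.6, 0.4]]

# Left- and right-hand sides of (112) for the relative entropy.
lhs_rel = relative_entropy(cq_state(p_x, rhos), cq_state(p_x, sigmas))
rhs_rel = sum(px * relative_entropy(r, s) for px, r, s in zip(p_x, rhos, sigmas))

# Left- and right-hand sides of (112) for the trace distance.
lhs_td = trace_distance(cq_state(p_x, rhos), cq_state(p_x, sigmas))
rhs_td = sum(px * trace_distance(r, s) for px, r, s in zip(p_x, rhos, sigmas))
```

In this commuting setting both pairs agree up to floating-point rounding; the sketch does not cover non-commuting states.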

For such generalized divergences, which are additionally continuous, as well as convex in the second argument, we find that an exchange of the minimization and the maximization in the definition of the generalized thauma is possible:

(Minimax).

Proposition 13 Let ${\bf{D}}$ be a generalized divergence that is continuous, obeys the direct-sum property in (112), and is convex in the second argument. Then the following exchange of min and max is possible in the generalized thauma:

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N })=\mathop{\sup }\limits_{{\psi }_{{RA}}}\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}})).\end{eqnarray} \tag{ 113 }$

Proof. Let ${ \mathcal E }$ be a fixed completely positive map such that ${ \mathcal M }({ \mathcal E })\leqslant 0$ . Let ${\psi }_{{RA}}^{1}$ and ${\psi }_{{RA}}^{2}$ be input states to consider for the maximization. Due to the unitary freedom of purifications and invariance of generalized divergence with respect to unitaries, we can equivalently consider the maximization to be over the convex set of density operators acting on the channel input system A. Define

$\begin{eqnarray}&&{\rho }_{A}^{\lambda }=\lambda {\psi }_{A}^{1}+(1-\lambda ){\psi }_{A}^{2},\end{eqnarray} \tag{ 114 }$

for $\lambda \in [0,1]$ . Then the state

$\begin{eqnarray}&&| {\phi }^{\lambda }{\rangle }_{{R}^{{\prime} }{RA}}:= \sqrt{\lambda }| 0{\rangle }_{{R}^{{\prime} }}| {\psi }^{1}{\rangle }_{{RA}}+\sqrt{1-\lambda }| 1{\rangle }_{{R}^{{\prime} }}| {\psi }^{2}{\rangle }_{{RA}}\end{eqnarray} \tag{ 115 }$

purifies ${\rho }_{A}^{\lambda }$ and is related to a purification $| {\psi }^{\lambda }{\rangle }_{{RA}}$ of ${\rho }_{A}^{\lambda }$ by an isometry. It then follows that

$\begin{eqnarray}&&{\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}}^{\lambda })\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}^{\lambda }))\\ &&\quad ={\bf{D}}({{ \mathcal N }}_{A\to B}({\phi }_{{R}^{{\prime} }{RA}}^{\lambda })\parallel {{ \mathcal E }}_{A\to B}({\phi }_{{R}^{{\prime} }{RA}}^{\lambda }))\end{eqnarray} \tag{ 116 }$

$\begin{eqnarray}\,\begin{array}{rcl} & \geqslant & \lambda {\bf{D}}(| 0\rangle \langle 0{| }_{{R}^{{\prime} }}\otimes {{ \mathcal N }}_{A\to B}({\psi }_{{RA}}^{1})\parallel | 0\rangle \langle 0{| }_{{R}^{{\prime} }}\otimes {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}^{1}))\\ & & +(1-\lambda ){\bf{D}}(| 1\rangle \langle 1{| }_{{R}^{{\prime} }}\otimes {{ \mathcal N }}_{A\to B}({\psi }_{{RA}}^{2})\parallel | 1\rangle \langle 1{| }_{{R}^{{\prime} }}\otimes {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}^{2}))\end{array}\end{eqnarray} \tag{ 117 }$

$\begin{eqnarray}&&\,\,=\,\lambda {\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}}^{1})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}^{1}))+(1-\lambda ){\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}}^{2})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}^{2})).\end{eqnarray} \tag{ 118 }$

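The first equality follows from the invariance of the generalized divergence under isometries. The inequality follows from data processing, by applying a completely dephasing channel to the register $R^{\prime}$ , together with the direct-sum property in (112). The last equality again follows from data processing, since the fixed states $| 0\rangle \langle 0{| }_{{R}^{{\prime} }}$ and $| 1\rangle \langle 1{| }_{{R}^{{\prime} }}$ can be appended or discarded by channels. So the objective function is concave in the argument being maximized (again thinking of the maximization being performed over density operators on A rather than pure states on RA).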

By assumption, for a fixed input state ${\psi }_{{RA}}$ , the objective function is convex in the second argument, and the set of completely positive maps ${ \mathcal E }$ satisfying ${ \mathcal M }({ \mathcal E })\leqslant 0$ is convex.

Then the Sion minimax theorem [65] applies, and we conclude the statement of the proposition.■

Remark 2. Examples of generalized divergences to which proposition 13 applies include the quantum relative entropy [64], the sandwiched Rényi relative entropy [57, 58], and the Petz–Rényi relative entropy [56]. The proposition applies to the latter two by working with the corresponding quasi-entropies and then lifting the result to the actual relative entropies.

3.5. Max-thauma of a quantum channel

As a particular case of the generalized thauma of a quantum channel defined in (95), we consider the max-thauma of a quantum channel, which is the max-relative entropy divergence between the channel and the set of completely positive maps with non-positive mana. Specifically, for a given quantum channel ${{ \mathcal N }}_{A\to B}$ , the max-thauma of ${{ \mathcal N }}_{A\to B}$ is defined by

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N }):= \mathop{\min }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{D}_{\max }({ \mathcal N }\parallel { \mathcal E }),\end{eqnarray} \tag{ 119 }$

where the minimum is taken with respect to all completely positive maps ${ \mathcal E }$ satisfying ${ \mathcal M }({ \mathcal E })\leqslant 0$ and

$\begin{eqnarray}&&{D}_{\max }({ \mathcal N }\parallel { \mathcal E }):= \mathop{\sup }\limits_{{\psi }_{{RA}}}{D}_{\max }({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}))\end{eqnarray} \tag{ 120 }$

is the max-divergence of channels [66]. (More generally, ${ \mathcal N }$ and ${ \mathcal E }$ could be arbitrary completely positive maps in (120).) Note that it is known that [62, 67]

$\begin{eqnarray}&&{D}_{\max }({ \mathcal N }\parallel { \mathcal E })={D}_{\max }({{ \mathcal N }}_{A\to B}({{\rm{\Phi }}}_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({{\rm{\Phi }}}_{{RA}}))=\mathrm{log}\,\min \{t:{J}_{{AB}}^{{ \mathcal N }}\leqslant {{tJ}}_{{AB}}^{{ \mathcal E }}\},\end{eqnarray} \tag{ 121 }$

where ${{\rm{\Phi }}}_{{RA}}$ is the maximally entangled state and ${J}_{{AB}}^{{ \mathcal N }}$ is the Choi–Jamiołkowski matrix of the channel ${{ \mathcal N }}_{A\to B}$ and similarly for ${J}_{{AB}}^{{ \mathcal E }}$ .
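To make the last expression in (121) concrete, here is a minimal sketch restricted to classical channels, which is an assumption made purely for illustration. A classical channel is stored as a matrix `N[y][x]` of transition probabilities, its Choi matrix is diagonal with those same entries, and the operator inequality ${J}_{{AB}}^{{ \mathcal N }}\leqslant {{tJ}}_{{AB}}^{{ \mathcal E }}$ reduces to an entrywise one. The function name `dmax_channels` is hypothetical, and the logarithm is assumed to be base 2.

```python
import math

def dmax_channels(N, E):
    """log2 of the smallest t with N[y][x] <= t * E[y][x] for all x, y.

    N and E are entrywise non-negative matrices standing in for the diagonal
    Choi matrices of a classical channel and a classical completely positive map.
    Returns inf when no finite t exists.
    """
    t = 0.0
    for row_n, row_e in zip(N, E):
        for n, e in zip(row_n, row_e):
            if n > 0:
                if e == 0:
                    return math.inf
                t = max(t, n / e)
    return math.log2(t) if t > 0 else -math.inf

# Hypothetical example data (assumption): columns are inputs x and sum to one.
N = [[0.9, 0.2],
     [0.1, 0.8]]
E = [[0.5, 0.5],
     [0.5, 0.5]]

d = dmax_channels(N, E)            # the largest ratio is 0.9 / 0.5
d_same = dmax_channels(N, N)       # a channel compared with itself
d_inf = dmax_channels(N, [[1.0, 0.0], [0.0, 1.0]])  # support mismatch
```

For general quantum channels the entrywise comparison must be replaced by a positive semi-definite constraint, which is what makes (121) an SDP.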

Due to the properties of max-relative entropy, it follows that theorem 11 and propositions 10, 12, and 13 apply to the max-thauma of a channel. This implies that the max-thauma reduces to the max-thauma of a state for replacer channels, is monotone with respect to completely CPWP superchannels, is faithful, and obeys a minimax theorem, so that

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal R })={\theta }_{\max }(\sigma ),\ \mathrm{if}\ \ { \mathcal R }(\rho )=\mathrm{Tr}\,[\rho ]\sigma \,\,\end{eqnarray} \tag{ 122 }$

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })\geqslant {\theta }_{\max }({{\rm{\Xi }}}^{\mathrm{CPWP}}({ \mathcal N })),\,\,\,\,\end{eqnarray} \tag{ 123 }$

$\begin{eqnarray}&&\,\,\,{\theta }_{\max }({ \mathcal N })\geqslant 0\quad \ \ \mathrm{and}\ \ \quad {\theta }_{\max }({ \mathcal N })=0\quad \ \ \mathrm{if}\ \mathrm{and}\ \mathrm{only}\ \mathrm{if}\ \ \quad { \mathcal N }\in \mathrm{CPWP},\end{eqnarray} \tag{ 124 }$

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })=\mathop{\min }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}\mathop{\max }\limits_{{\psi }_{{RA}}}{D}_{\max }({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}))\end{eqnarray} \tag{ 125 }$

$\begin{eqnarray}&&\,\,=\,\mathop{\max }\limits_{{\psi }_{{RA}}}\mathop{\min }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{D}_{\max }({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}})),\end{eqnarray} \tag{ 126 }$

where ${ \mathcal N }$ is a quantum channel and ${{\rm{\Xi }}}^{\mathrm{CPWP}}$ is a CPWP superchannel.

We can alternatively express the max-thauma of a channel as the following SDP:

(SDP for max-thauma).

Proposition 14 For a given quantum channel ${{ \mathcal N }}_{A\to B}$ , its max-thauma ${\theta }_{\max }({ \mathcal N })$ can be written as the following SDP:

$\begin{eqnarray}&&\begin{array}{l}{\theta }_{\max }({ \mathcal N })=\mathrm{log}\,\min \ t\\ \quad {\rm{s}}.{\rm{t}}.\ {J}_{{AB}}^{{ \mathcal N }}\leqslant {Y}_{{AB}}\\ \quad \displaystyle \sum _{{\boldsymbol{v}}}| \mathrm{Tr}\,[({A}_{A}^{{\boldsymbol{u}}}\otimes {A}_{B}^{{\boldsymbol{v}}}){Y}_{{AB}}]| {/d}_{B}\leqslant t,\quad \forall {\boldsymbol{u}},\end{array}\end{eqnarray} \tag{ 127 }$

where ${J}_{{AB}}^{{ \mathcal N }}$ is the Choi–Jamiołkowski matrix of the channel ${{ \mathcal N }}_{A\to B}$ . Moreover, the dual SDP to the above is as follows:

$\begin{eqnarray}&&\begin{array}{l}{\theta }_{\max }({ \mathcal N })=\mathrm{log}\,\max \ \mathrm{Tr}\,[{J}_{{AB}}^{{ \mathcal N }}{V}_{{AB}}]\\ \quad {\rm{s}}.{\rm{t}}.\ \displaystyle \sum _{{\boldsymbol{v}}}{b}_{{\boldsymbol{v}}}\leqslant 1\\ \quad 0\leqslant {V}_{{AB}}\leqslant \displaystyle \sum _{{\boldsymbol{v}},{\boldsymbol{u}}}({c}_{{\boldsymbol{v}},{\boldsymbol{u}}}-{f}_{{\boldsymbol{v}},{\boldsymbol{u}}}){A}_{A}^{{\boldsymbol{v}}}\otimes {A}_{B}^{{\boldsymbol{u}}}/{d}_{B},\\ \quad {c}_{{\boldsymbol{v}},{\boldsymbol{u}}}+{f}_{{\boldsymbol{v}},{\boldsymbol{u}}}\leqslant {b}_{{\boldsymbol{v}}},\quad \forall {\boldsymbol{u}},{\boldsymbol{v}},\\ \quad {c}_{{\boldsymbol{v}},{\boldsymbol{u}}}\geqslant 0,{f}_{{\boldsymbol{v}},{\boldsymbol{u}}}\geqslant 0,\quad \forall {\boldsymbol{u}},{\boldsymbol{v}}.\end{array}\end{eqnarray} \tag{ 128 }$

Proof. Consider the following chain of equalities:

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })=\mathop{\min }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}{D}_{\max }({ \mathcal N }\parallel { \mathcal E })\,\,\,\,\,\end{eqnarray} \tag{ 129 }$

$\begin{eqnarray}&&=\,\mathrm{log}\,\min \left\{t\,:{J}_{{AB}}^{{ \mathcal N }}\leqslant {{tJ}}_{{AB}}^{{ \mathcal E }^{\prime} },{ \mathcal M }({ \mathcal E }^{\prime} )\leqslant 0\right\}\,\,\,\end{eqnarray} \tag{ 130 }$

$\begin{eqnarray}&&\,=\,\mathrm{log}\,\min \left\{t\,:{J}_{{AB}}^{{ \mathcal N }}\leqslant {{tJ}}_{{AB}}^{{ \mathcal E }^{\prime} },\displaystyle \sum _{{\bf{v}}}| \mathrm{Tr}\,[{ \mathcal E }^{\prime} ({A}_{A}^{{\bf{u}}}){A}_{B}^{{\bf{v}}}]| {/d}_{B}\leqslant 1,\forall {\bf{u}}\right\}\end{eqnarray} \tag{ 131 }$

$\begin{eqnarray}&&\,=\,\mathrm{log}\,\min \left\{t\,:{J}_{{AB}}^{{ \mathcal N }}\leqslant {J}_{{AB}}^{{ \mathcal E }},\displaystyle \sum _{{\bf{v}}}| \mathrm{Tr}\,[{ \mathcal E }({A}_{A}^{{\bf{u}}}){A}_{B}^{{\bf{v}}}]| {/d}_{B}\leqslant t,\forall {\bf{u}}\right\}\end{eqnarray} \tag{ 132 }$

$\begin{eqnarray}&&\,\,=\,\mathrm{log}\,\min \left\{t\,:{J}_{{AB}}^{{ \mathcal N }}\leqslant {Y}_{{AB}},\displaystyle \sum _{{\bf{v}}}| \mathrm{Tr}\,[({A}_{A}^{{\bf{u}}}\otimes {A}_{B}^{{\bf{v}}}){Y}_{{AB}}]| {/d}_{B}\leqslant t,\forall {\bf{u}}\right\},\end{eqnarray} \tag{ 133 }$

where the second equality follows from (121), the third from the definition of mana, the fourth from the substitution ${ \mathcal E }=t{ \mathcal E }^{\prime}$ , and the last from the fact that ${ \mathcal E }$ is completely positive and thus in one-to-one correspondence with positive semi-definite bipartite operators. We further rewrite the absolute-value constraint in (133) by introducing auxiliary real variables ${s}_{{\bf{u}},{\bf{v}}}$ and arrive at the following SDP:

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })=\mathrm{log}\,\min \left\{t\,:{J}_{{AB}}^{{ \mathcal N }}\leqslant {Y}_{{AB}},\ -{s}_{{\bf{u}},{\bf{v}}}\leqslant \mathrm{Tr}\,[({A}_{A}^{{\bf{u}}}\otimes {A}_{B}^{{\bf{v}}}){Y}_{{AB}}]{/d}_{B}\leqslant {s}_{{\bf{u}},{\bf{v}}},\ \displaystyle \sum _{{\bf{v}}}{s}_{{\bf{u}},{\bf{v}}}\leqslant t,\ \forall {\bf{u}},{\bf{v}}\right\}.\end{eqnarray} \tag{ 134 }$

Then we use the Lagrangian method to obtain the dual SDP:

$\begin{eqnarray}&&\begin{array}{l}{\theta }_{\max }({ \mathcal N })=\mathrm{log}\,\max \ \mathrm{Tr}\,[{J}_{{AB}}^{{ \mathcal N }}{V}_{{AB}}]\\ \quad {\rm{s}}.{\rm{t}}.\ \displaystyle \sum _{{\bf{v}}}{b}_{{\bf{v}}}\leqslant 1\\ \quad 0\leqslant {V}_{{AB}}\leqslant \displaystyle \sum _{{\bf{v}},{\bf{u}}}({c}_{{\bf{v}},{\bf{u}}}-{f}_{{\bf{v}},{\bf{u}}}){A}_{A}^{{\bf{v}}}\otimes {A}_{B}^{{\bf{u}}}/{d}_{B},\\ \quad {c}_{{\bf{v}},{\bf{u}}}+{f}_{{\bf{v}},{\bf{u}}}\leqslant {b}_{{\bf{v}}},\quad \forall {\bf{u}},{\bf{v}},\\ \quad {c}_{{\bf{v}},{\bf{u}}}\geqslant 0,{f}_{{\bf{v}},{\bf{u}}}\geqslant 0,\quad \forall {\bf{u}},{\bf{v}}.\end{array}\end{eqnarray} \tag{ 135 }$

This concludes the proof.■

(Max-thauma versus mana).

Corollary 15 For a quantum channel ${{ \mathcal N }}_{A\to B}$ , its max-thauma does not exceed its mana:

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })\leqslant { \mathcal M }({ \mathcal N }).\end{eqnarray} \tag{ 136 }$

Proof. The proof is a direct consequence of the primal formulation in (127). By setting ${Y}_{{AB}}={J}_{{AB}}^{{ \mathcal N }}$ , we find that

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N })\leqslant \mathrm{log}\,\min \{t:\mathop{\max }\limits_{{\bf{u}}}\displaystyle \sum _{{\bf{v}}}| \mathrm{Tr}\,[({A}_{A}^{{\bf{u}}}\otimes {A}_{B}^{{\bf{v}}}){J}_{{AB}}^{{ \mathcal N }}]| {/d}_{B}\leqslant t\}={ \mathcal M }({ \mathcal N }),\end{eqnarray} \tag{ 137 }$

where the last equality follows from (60). ■

(Additivity).

Proposition 16 For two given quantum channels ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ , the max-thauma is additive in the following sense:

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})={\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 138 }$

Proof. The idea of the proof is to utilize the primal and dual SDPs of ${\theta }_{\max }({ \mathcal N })$ from proposition 14. On the one hand, suppose that the optimal solutions to the dual SDPs (128) for ${\theta }_{\max }({{ \mathcal N }}_{1})$ and ${\theta }_{\max }({{ \mathcal N }}_{2})$ are $\{{R}_{1},{b}_{{{\bf{v}}}_{1}}\}$ and $\{{R}_{2},{b}_{{{\bf{v}}}_{2}}\}$ , respectively, where ${R}_{i}$ plays the role of ${V}_{{AB}}$ . It is then easy to verify that $\{{R}_{1}\otimes {R}_{2},{b}_{{{\bf{v}}}_{1}}{b}_{{{\bf{v}}}_{2}}\}$ is a feasible solution to the dual SDP of ${\theta }_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})$ . Thus

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})\geqslant \mathrm{log}\,\mathrm{Tr}\,[({J}_{{A}_{1}{B}_{1}}^{{{ \mathcal N }}_{1}}\otimes {J}_{{A}_{2}{B}_{2}}^{{{ \mathcal N }}_{2}})({R}_{1}\otimes {R}_{2})]={\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 139 }$

On the other hand, considering equation (119), suppose that the optimal solutions for ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ are ${{ \mathcal E }}_{1}$ and ${{ \mathcal E }}_{2}$ , respectively. Noting that ${ \mathcal M }({{ \mathcal E }}_{1}\otimes {{ \mathcal E }}_{2})={ \mathcal M }({{ \mathcal E }}_{1})+{ \mathcal M }({{ \mathcal E }}_{2})\leqslant 0$ , and employing (121) and the additivity of the max-relative entropy, we find that

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})\leqslant {D}_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2}\parallel {{ \mathcal E }}_{1}\otimes {{ \mathcal E }}_{2})\end{eqnarray} \tag{ 140 }$

$\begin{eqnarray}&&\,\,=\,{\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 141 }$

This concludes the proof.■
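The step from (140) to (141) uses the additivity of the max-relative entropy under tensor products. The sketch below checks that step numerically for classical channels, an assumption made for illustration: channels are matrices `N[y][x]`, the tensor product is a Kronecker product, and the max-divergence is the log of the largest entrywise ratio. The matrices `E1` and `E2` are arbitrary positive stand-ins; the mana constraint ${ \mathcal M }({ \mathcal E })\leqslant 0$ is not modelled here.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def dmax(N, E):
    """log2 of the largest ratio N[y][x] / E[y][x] over entries with N[y][x] > 0."""
    t = 0.0
    for row_n, row_e in zip(N, E):
        for n, e in zip(row_n, row_e):
            if n > 0:
                if e == 0:
                    return math.inf
                t = max(t, n / e)
    return math.log2(t) if t > 0 else -math.inf

# Hypothetical example data (assumption).
N1 = [[0.9, 0.2], [0.1, 0.8]]
E1 = [[0.5, 0.5], [0.5, 0.5]]
N2 = [[0.7, 0.3], [0.3, 0.7]]
E2 = [[0.6, 0.4], [0.4, 0.6]]

d1 = dmax(N1, E1)
d2 = dmax(N2, E2)
joint = dmax(kron(N1, N2), kron(E1, E2))  # D_max of the tensor-product pair
```

Here `joint` agrees with `d1 + d2` up to rounding, because the largest ratio of a product of independent non-negative factors is the product of the largest ratios.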

The following lemma is essential to establishing subadditivity of max-thauma of channels with respect to serial composition, as stated in proposition 18 below. We suspect that lemma 17 will find wide use in general resource theories beyond the magic resource theory considered in this paper. For example, it leads to an alternative proof of [62, proposition 17].

(Subadditivity of max-divergence of channels).

Lemma 17 Given completely positive maps ${{ \mathcal N }}_{A\to B}^{1},{{ \mathcal N }}_{B\to C}^{2},{{ \mathcal E }}_{A\to B}^{1}$ , and ${{ \mathcal E }}_{B\to C}^{2}$ , the following subadditivity inequality, with respect to serial compositions, holds for the max-channel divergence of (120), (121):

$\begin{eqnarray}&&{D}_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1})\leqslant {D}_{\max }({{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{1})+{D}_{\max }({{ \mathcal N }}_{2}\parallel {{ \mathcal E }}_{2}),\end{eqnarray} \tag{ 142 }$

where we have made the abbreviations ${{ \mathcal N }}_{1}\equiv {{ \mathcal N }}_{A\to B}^{1},{{ \mathcal N }}_{2}\equiv {{ \mathcal N }}_{B\to C}^{2},{{ \mathcal E }}_{1}\equiv {{ \mathcal E }}_{A\to B}^{1}$ , and ${{ \mathcal E }}_{2}\equiv {{ \mathcal E }}_{B\to C}^{2}$ .

Proof. Recall the 'data-processed triangle inequality' from [68]:

$\begin{eqnarray}&&{D}_{\max }({ \mathcal P }(\rho )\parallel \omega )\leqslant {D}_{\max }(\rho \parallel \sigma )+{D}_{\max }({ \mathcal P }(\sigma )\parallel \omega ),\end{eqnarray} \tag{ 143 }$

which holds for ${ \mathcal P }$ a positive map and $\rho ,\omega$ , and σ positive semi-definite operators. Note that one can in fact see this as a consequence of the submultiplicativity of the operator norm and the data-processing inequality of max-relative entropy for positive maps:

$\begin{eqnarray}&&{D}_{\max }({ \mathcal P }(\rho )\parallel \omega )=2\mathrm{log}{\parallel {\omega }^{-1/2}{[{ \mathcal P }(\rho )]}^{1/2}\parallel }_{\infty }\,\,\,\end{eqnarray} \tag{ 144 }$

$\begin{eqnarray}&&\,\,=\,2\mathrm{log}{\parallel {\omega }^{-1/2}{[{ \mathcal P }(\sigma )]}^{1/2}{[{ \mathcal P }(\sigma )]}^{-1/2}{[{ \mathcal P }(\rho )]}^{1/2}\parallel }_{\infty }\end{eqnarray} \tag{ 145 }$

$\begin{eqnarray}&&\,\,\,\leqslant \,2\mathrm{log}{\parallel {\omega }^{-1/2}{[{ \mathcal P }(\sigma )]}^{1/2}\parallel }_{\infty }\cdot {\parallel {[{ \mathcal P }(\sigma )]}^{-1/2}{[{ \mathcal P }(\rho )]}^{1/2}\parallel }_{\infty }\end{eqnarray} \tag{ 146 }$

$\begin{eqnarray}&&\,\,\,\,=\,2\mathrm{log}{\parallel {\omega }^{-1/2}{[{ \mathcal P }(\sigma )]}^{1/2}\parallel }_{\infty }+2\mathrm{log}{\parallel {[{ \mathcal P }(\sigma )]}^{-1/2}{[{ \mathcal P }(\rho )]}^{1/2}\parallel }_{\infty }\end{eqnarray} \tag{ 147 }$

$\begin{eqnarray}&&\,=\,{D}_{\max }({ \mathcal P }(\rho )\parallel { \mathcal P }(\sigma ))+{D}_{\max }({ \mathcal P }(\sigma )\parallel \omega )\end{eqnarray} \tag{ 148 }$

$\begin{eqnarray}&&\leqslant \,{D}_{\max }(\rho \parallel \sigma )+{D}_{\max }({ \mathcal P }(\sigma )\parallel \omega ).\quad \end{eqnarray} \tag{ 149 }$
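A quick numerical sanity check of the data-processed triangle inequality (143) can be carried out in the commuting case, which is an assumption made here to keep the example free of matrix libraries. Diagonal operators are stored as non-negative vectors, and the positive map ${ \mathcal P }$ is a matrix with non-negative entries. All names and numbers below are illustrative.

```python
import math

def dmax(p, q):
    """Max-relative entropy (base 2) of non-negative vectors: log2 max_i p_i / q_i."""
    t = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            t = max(t, a / b)
    return math.log2(t) if t > 0 else -math.inf

def apply_map(P, v):
    """Apply an entrywise non-negative matrix P (a positive map on diagonals) to v."""
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(P))]

# Hypothetical example data (assumption): P maps 3-outcome vectors to 2-outcome vectors.
P = [[0.5, 0.1, 0.3],
     [0.5, 0.9, 0.7]]
rho = [0.2, 0.5, 0.3]
sigma = [0.4, 0.4, 0.2]
omega = [0.3, 0.7]

lhs = dmax(apply_map(P, rho), omega)
rhs = dmax(rho, sigma) + dmax(apply_map(P, sigma), omega)
```

With these numbers `lhs` is roughly 0.12 and `rhs` roughly 0.58, consistent with (143); a single instance is of course no substitute for the proof.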

Let us pick

$\begin{eqnarray}&&{ \mathcal P }=\mathrm{id}\otimes {{ \mathcal N }}_{2},\quad \rho =(\mathrm{id}\otimes {{ \mathcal N }}_{1})({\rm{\Phi }}),\quad \sigma =(\mathrm{id}\otimes {{ \mathcal E }}_{1})({\rm{\Phi }}),\quad \omega =(\mathrm{id}\otimes {{ \mathcal E }}_{2})(\sigma ),\end{eqnarray} \tag{ 150 }$

where Φ denotes the maximally entangled state. We find that

$\begin{eqnarray}&&{D}_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1})\\ &&\quad ={D}_{\max }((\mathrm{id}\otimes ({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1}))({\rm{\Phi }})\parallel (\mathrm{id}\otimes ({{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1}))({\rm{\Phi }}))\,\end{eqnarray} \tag{ 151 }$

$\begin{eqnarray}&&\,\leqslant \,{D}_{\max }((\mathrm{id}\otimes {{ \mathcal N }}_{1})({\rm{\Phi }})\parallel (\mathrm{id}\otimes {{ \mathcal E }}_{1})({\rm{\Phi }}))+{D}_{\max }((\mathrm{id}\otimes ({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal E }}_{1}))({\rm{\Phi }})\parallel (\mathrm{id}\otimes ({{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1}))({\rm{\Phi }}))\end{eqnarray} \tag{ 152 }$

$\begin{eqnarray}&&\leqslant \,{D}_{\max }({{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{1})+{D}_{\max }({{ \mathcal N }}_{2}\parallel {{ \mathcal E }}_{2}).\,\,\,\,\,\,\,\,\end{eqnarray} \tag{ 153 }$

The first equality follows from (121). The first inequality follows from (143) with the choices in (150). The second inequality follows because ${D}_{\max }((\mathrm{id}\otimes {{ \mathcal N }}_{1})({\rm{\Phi }})\parallel (\mathrm{id}\otimes {{ \mathcal E }}_{1})({\rm{\Phi }}))={D}_{\max }({{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{1})$ , as a consequence of (121), and the channel divergence ${D}_{\max }({{ \mathcal N }}_{2}\parallel {{ \mathcal E }}_{2})$ involves an optimization over all bipartite input states, one of which is $(\mathrm{id}\otimes {{ \mathcal E }}_{1})({\rm{\Phi }})$ . ■

Remark 3. The proof above applies to any divergence that obeys the data-processed triangle inequality, which includes the Hilbert α-divergences of [59], as discussed in [62, appendix A].
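The subadditivity inequality (142) of lemma 17 can be illustrated with classical channels, again as a simplifying assumption: a channel is a matrix `N[y][x]`, serial composition is matrix multiplication, and the max-divergence of channels is the log of the largest entrywise ratio. The helper names and the example matrices are hypothetical.

```python
import math

def compose(second, first):
    """Matrix of the serial composition `second` after `first`."""
    rows, inner, cols = len(second), len(first), len(first[0])
    return [[sum(second[z][y] * first[y][x] for y in range(inner))
             for x in range(cols)] for z in range(rows)]

def dmax(N, E):
    """log2 of the largest ratio N[y][x] / E[y][x] over entries with N[y][x] > 0."""
    t = 0.0
    for row_n, row_e in zip(N, E):
        for n, e in zip(row_n, row_e):
            if n > 0:
                if e == 0:
                    return math.inf
                t = max(t, n / e)
    return math.log2(t) if t > 0 else -math.inf

# Hypothetical example data (assumption).
N1 = [[0.9, 0.2], [0.1, 0.8]]
E1 = [[0.5, 0.5], [0.5, 0.5]]
N2 = [[0.7, 0.3], [0.3, 0.7]]
E2 = [[0.6, 0.4], [0.4, 0.6]]

lhs = dmax(compose(N2, N1), compose(E2, E1))   # left-hand side of (142)
rhs = dmax(N1, E1) + dmax(N2, E2)              # right-hand side of (142)
```

In this example the composed pair has a strictly smaller divergence than the sum, which shows that (142) is in general an inequality rather than an identity.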

(Subadditivity).

Proposition 18 For two given quantum channels ${{ \mathcal N }}_{1}$ and ${{ \mathcal N }}_{2}$ , the max-thauma is subadditive in the following sense:

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})\leqslant {\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 154 }$

Proof. This is a direct consequence of lemma 17 above. Let ${{ \mathcal E }}_{i}$ be the completely positive map satisfying ${ \mathcal M }({{ \mathcal E }}_{i})\leqslant 0$ and that is optimal for ${{ \mathcal N }}_{i}$ with respect to the max-thauma ${\theta }_{\max }$ , for $i\in \{1,2\}$ . Then applying lemma 17, we find that

$\begin{eqnarray}&&{D}_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1})\leqslant {D}_{\max }({{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{1})+{D}_{\max }({{ \mathcal N }}_{2}\parallel {{ \mathcal E }}_{2})\end{eqnarray} \tag{ 155 }$

$\begin{eqnarray}&&=\,{\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}).\end{eqnarray} \tag{ 156 }$

The equality follows from the assumption that ${{ \mathcal E }}_{i}$ is the completely positive map satisfying ${ \mathcal M }({{ \mathcal E }}_{i})\leqslant 0$ , which is optimal for ${{ \mathcal N }}_{i}$ with respect to the max-thauma ${\theta }_{\max }$ , for $i\in \{1,2\}$ .

Given that, by assumption, ${ \mathcal M }({{ \mathcal E }}_{i})\leqslant 0$ for $i\in \{1,2\}$ , it follows from proposition 6 that ${ \mathcal M }\left({{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1}\right)\leqslant 0$ . Since the max-thauma involves an optimization over all completely positive maps ${ \mathcal E }$ satisfying ${ \mathcal M }({ \mathcal E })\leqslant 0$ , we conclude that

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})\leqslant {D}_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1}\parallel {{ \mathcal E }}_{2}\,\circ \,{{ \mathcal E }}_{1})\leqslant {\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2}),\end{eqnarray} \tag{ 157 }$

which is the statement of the proposition.■

(Amortization inequality).

Proposition 19 For any quantum channel ${{ \mathcal N }}_{A\to B}$ , the following inequality holds

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{A}}[{\theta }_{\max }({{ \mathcal N }}_{A\to B}({\rho }_{A}))-{\theta }_{\max }({\rho }_{A})]\leqslant {\theta }_{\max }({ \mathcal N }),\end{eqnarray} \tag{ 158 }$

with the optimization performed over input states ${\rho }_{A}$ . Moreover, the following inequality also holds

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{{RA}}}[{\theta }_{\max }(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-{\theta }_{\max }({\rho }_{{RA}})]\leqslant {\theta }_{\max }({{ \mathcal N }}_{A\to B}).\end{eqnarray} \tag{ 159 }$

Proof. The inequality in (158) is a direct consequence of reduction to states (proposition 10) and subadditivity of max-thauma with respect to serial compositions (proposition 18). Indeed, letting ${{ \mathcal N }}^{{\prime} }$ be a replacer channel that prepares the state ${\rho }_{A}$ , we find that

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal N }({\rho }_{A}))={\theta }_{\max }({ \mathcal N }\,\circ \,{{ \mathcal N }}^{{\prime} })\leqslant {\theta }_{\max }({ \mathcal N })+{\theta }_{\max }({{ \mathcal N }}^{{\prime} })={\theta }_{\max }({ \mathcal N })+{\theta }_{\max }({\rho }_{A})\end{eqnarray} \tag{ 160 }$

for all input states ${\rho }_{A}$ , from which we conclude (158).

To arrive at the inequality in (159), we make the substitution ${ \mathcal N }\to \mathrm{id}\otimes { \mathcal N }$ , apply the above reasoning, the additivity in proposition 16, and the fact that the identity channel is free (CPWP), to conclude that the following holds for all input states ${\rho }_{{RA}}$

$\begin{eqnarray}&&{\theta }_{\max }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{A\to B})({\rho }_{{RA}}))-{\theta }_{\max }({\rho }_{{RA}})\leqslant {\theta }_{\max }(\mathrm{id}\otimes { \mathcal N })={\theta }_{\max }(\mathrm{id})+{\theta }_{\max }({ \mathcal N })={\theta }_{\max }({ \mathcal N }),\end{eqnarray} \tag{ 161 }$

from which we conclude (159).■

To summarize, the properties of ${\theta }_{\max }({ \mathcal N })$ are as follows:

1. Reduction to states: ${\theta }_{\max }({ \mathcal N })={\theta }_{\max }(\sigma )$ when the channel ${ \mathcal N }$ is a replacer channel, acting as ${ \mathcal N }(\rho )=\mathrm{Tr}\,[\rho ]\sigma$ for an arbitrary input state ρ, where σ is a state.
2. Monotonicity of ${\theta }_{\max }({ \mathcal N })$ under CPWP superchannels (including completely stabilizer-preserving superchannels).
3. Additivity under tensor products of channels: ${\theta }_{\max }({{ \mathcal N }}_{1}\otimes {{ \mathcal N }}_{2})={\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2})$ .
4. Subadditivity under serial composition of channels: ${\theta }_{\max }({{ \mathcal N }}_{2}\,\circ \,{{ \mathcal N }}_{1})\leqslant {\theta }_{\max }({{ \mathcal N }}_{1})+{\theta }_{\max }({{ \mathcal N }}_{2})$ .
5. Faithfulness: ${ \mathcal N }\in \mathrm{CPWP}$ if and only if ${\theta }_{\max }({ \mathcal N })=0$ .
6. Amortization inequality: ${\sup }_{{\rho }_{{RA}}}[{\theta }_{\max }(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-{\theta }_{\max }({\rho }_{{RA}})]\leqslant {\theta }_{\max }({ \mathcal N })$ .

Remark 4. Due to the subadditivity inequality in proposition 18, the additivity identity in proposition 16, and faithfulness in (124), the following identities hold

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1})=\mathop{\sup }\limits_{{{ \mathcal N }}_{2}}{\theta }_{\max }([\mathrm{id}\otimes {{ \mathcal N }}_{1}]\,\circ \,{{ \mathcal N }}_{2})-{\theta }_{\max }({{ \mathcal N }}_{2}),\,\,\,\end{eqnarray} \tag{ 162 }$

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1})=\mathop{\sup }\limits_{{{ \mathcal N }}_{2}}{\theta }_{\max }({{ \mathcal N }}_{2}\,\circ \,[\mathrm{id}\otimes {{ \mathcal N }}_{1}])-{\theta }_{\max }({{ \mathcal N }}_{2}),\,\,\,\end{eqnarray} \tag{ 163 }$

$\begin{eqnarray}&&{\theta }_{\max }({{ \mathcal N }}_{1})=\mathop{\sup }\limits_{{{ \mathcal N }}_{2},{{ \mathcal N }}_{3}}{\theta }_{\max }({{ \mathcal N }}_{2}\,\circ \,[\mathrm{id}\otimes {{ \mathcal N }}_{1}]\,\circ \,{{ \mathcal N }}_{3})-{\theta }_{\max }({{ \mathcal N }}_{2})-{\theta }_{\max }({{ \mathcal N }}_{3}),\end{eqnarray} \tag{ 164 }$

which have the interpretation that amortization in terms of arbitrary pre- and post-processing does not increase the max-thauma of a quantum channel.

4. Distilling magic from quantum channels

4.1. Amortized magic

Since many physical tasks relate to quantum channels and time evolution rather than directly to quantum states, it is of interest to consider the non-stabilizer properties of quantum channels. Having established suitable measures to quantify the magic of quantum channels, it is natural to investigate the ability of a quantum channel to generate magic from input quantum states. Let us begin by defining the amortized magic of a quantum channel:

(Amortized magic).

Definition 6 The amortized magic of a quantum channel ${{ \mathcal N }}_{A\to B}$ is defined relative to a magic measure $m(\cdot )$ via the following formula:

$\begin{eqnarray}&&{m}^{{ \mathcal A }}({ \mathcal N })\,:=\,\mathop{\sup }\limits_{{\rho }_{{RA}}}m(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}}))-m({\rho }_{{RA}}).\end{eqnarray} \tag{ 165 }$

The strict amortized magic of a quantum channel is defined as

$\begin{eqnarray}&&{\widetilde{m}}^{{ \mathcal A }}({ \mathcal N })\,:=\mathop{\sup }\limits_{{\rho }_{{RA}}\in \mathrm{Stab}}m(({\mathrm{id}}_{R}\otimes { \mathcal N })({\rho }_{{RA}})).\end{eqnarray} \tag{ 166 }$

That is, the amortized magic is defined as the largest increase in magic that a quantum channel can realize after it acts on an arbitrary input quantum state. The strict amortized magic is defined by finding the largest amount of magic that a quantum channel can realize when a stabilizer state is given to it as an input. Such amortized measures of resourcefulness of quantum channels were previously studied in the resource theories of quantum coherence (e.g. [67, 69–71]) and quantum entanglement (e.g. [72–76]). They have been considered in the context of an arbitrary resource theory in [73, section 7].

Proposition 20. Given a quantum channel ${{ \mathcal N }}_{A\to B}$ , the following inequalities hold

$\begin{eqnarray}&&{{ \mathcal M }}^{{ \mathcal A }}({ \mathcal N }):= \mathop{\sup }\limits_{{\rho }_{{RA}}}{ \mathcal M }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{A\to B})({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})\leqslant { \mathcal M }({ \mathcal N }),\end{eqnarray} \tag{ 167 }$

$\begin{eqnarray}&&\,{\theta }_{\max }^{{ \mathcal A }}({ \mathcal N }):= \mathop{\sup }\limits_{{\rho }_{{RA}}}{\theta }_{\max }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{A\to B})({\rho }_{{RA}}))-{\theta }_{\max }({\rho }_{{RA}})\leqslant {\theta }_{\max }({ \mathcal N }).\end{eqnarray} \tag{ 168 }$

Proof. These statements are an immediate consequence of the amortization inequality for ${ \mathcal M }(\rho )$ and ${\theta }_{\max }(\rho )$ given in propositions 8 and 19, respectively. ■

4.2. Distillable magic of a quantum channel

The most general protocol for distilling some resource by means of a quantum channel ${ \mathcal N }$ employs n invocations of the channel ${ \mathcal N }$ interleaved by free channels [73, section 7]. In our case, the resource of interest is magic, and here we take the free channels to be the CPWP channels discussed in section 3.1. In such a protocol, the instances of the channel ${ \mathcal N }$ are invoked one at a time, and we can integrate all CPWP channels between one use of ${ \mathcal N }$ and the next into a single CPWP channel, since the CPWP channels are closed under composition. The goal of such a protocol is to distill magic states from the channel.

In more detail, the most general protocol for distilling magic from a quantum channel proceeds as follows: one starts by preparing the systems ${R}_{1}{A}_{1}$ in a state ${\rho }_{{R}_{1}{A}_{1}}^{(1)}$ with non-negative Wigner function, by employing a free CPWP channel ${{ \mathcal F }}_{\varnothing \to {R}_{1}{A}_{1}}^{(1)}$ , then applies the channel ${{ \mathcal N }}_{{A}_{1}\to {B}_{1}}$ , followed by a CPWP channel ${{ \mathcal F }}_{{R}_{1}{B}_{1}\to {R}_{2}{A}_{2}}^{(2)}$ , resulting in the state

$\begin{eqnarray}&&{\rho }_{{R}_{2}{A}_{2}}^{(2)}:= {{ \mathcal F }}_{{R}_{1}{B}_{1}\to {R}_{2}{A}_{2}}^{(2)}(({\mathrm{id}}_{{R}_{1}}\otimes {{ \mathcal N }}_{{A}_{1}\to {B}_{1}})({\rho }_{{R}_{1}{A}_{1}}^{(1)})).\end{eqnarray} \tag{ 169 }$

Continuing the above steps, given state ${\rho }_{{R}_{i}{A}_{i}}^{(i)}$ after the action of $i-1$ invocations of the channel ${{ \mathcal N }}_{A\to B}$ and interleaved CPWP channels, we apply the channel ${{ \mathcal N }}_{{A}_{i}\to {B}_{i}}$ and the CPWP channel ${{ \mathcal F }}_{{R}_{i}{B}_{i}\to {R}_{i+1}{A}_{i+1}}^{(i+1)}$ , obtaining the state

$\begin{eqnarray}&&{\rho }_{{R}_{i+1}{A}_{i+1}}^{(i+1)}:= {{ \mathcal F }}_{{R}_{i}{B}_{i}\to {R}_{i+1}{A}_{i+1}}^{(i+1)}(({\mathrm{id}}_{{R}_{i}}\otimes {{ \mathcal N }}_{{A}_{i}\to {B}_{i}})({\rho }_{{R}_{i}{A}_{i}}^{(i)})).\end{eqnarray} \tag{ 170 }$

After n invocations of the channel ${{ \mathcal N }}_{A\to B}$ have been made, the final free CPWP channel ${{ \mathcal F }}_{{R}_{n}{B}_{n}\to S}^{(n+1)}$ produces a state ${\omega }_{S}$ on system S, defined as

$\begin{eqnarray}&&{\omega }_{S}:= {{ \mathcal F }}_{{R}_{n}{B}_{n}\to S}^{(n+1)}(({\mathrm{id}}_{{R}_{n}}\otimes {{ \mathcal N }}_{{A}_{n}\to {B}_{n}})({\rho }_{{R}_{n}{A}_{n}}^{(n)})).\end{eqnarray} \tag{ 171 }$

Such a protocol is depicted in figure 2.

Fix $\varepsilon \in [0,1]$ and $k\in {\mathbb{N}}$ . The above procedure is an $(n,k,\varepsilon )$ ψ-magic distillation protocol with rate k/n and error $\varepsilon$ , if the state ${\omega }_{S}$ has a high fidelity with k copies of the target magic state ψ,

$\begin{eqnarray}&&\langle \psi {| }^{\otimes k}{\omega }_{S}| \psi {\rangle }^{\otimes k}\geqslant 1-\varepsilon .\end{eqnarray} \tag{ 172 }$

A rate R is achievable for ψ-magic state distillation from the channel ${ \mathcal N }$ , if for all $\varepsilon \in (0,1],\delta \gt 0$ , and sufficiently large n, there exists an $(n,n(R-\delta ),\varepsilon )$ ψ-magic state distillation protocol of the above form. The ψ-distillable magic of the channel ${ \mathcal N }$ is defined to be the supremum of all achievable rates and is denoted by ${C}_{\psi }({ \mathcal N })$ .

A common choice for a non-Clifford gate is the T gate. The qutrit T gate [77] is given by

$\begin{eqnarray}T=\left(\begin{array}{ccc}\xi & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & {\xi }^{-1}\end{array}\right),\end{eqnarray} \tag{ 173 }$

where $\xi ={{\rm{e}}}^{2\pi {\rm{i}}/9}$ is a primitive ninth root of unity. The T gate leads to the T magic state

$\begin{eqnarray}&&| T\rangle := T| +\rangle ,\end{eqnarray} \tag{ 174 }$

by inputting the stabilizer state $| +\rangle$ to the T gate. Furthermore, by the method of state injection [7, 78], one can generate a T gate by acting with SOs on the T state $| T\rangle$ .
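The definitions in (173) and (174) can be checked numerically. The following is a minimal sketch, assuming NumPy and the computational-basis ordering $| 0\rangle ,| 1\rangle ,| 2\rangle$; the helper name `fidelity_with_target` is hypothetical. It also runs the simplest instance of the distillation protocol above: one channel use, trivial reference system, and identity post-processing, which yields the T state with error $\varepsilon =0$.

```python
import numpy as np

d = 3
xi = np.exp(2j * np.pi / 9)            # primitive ninth root of unity
T = np.diag([xi, 1.0, 1.0 / xi])       # qutrit T gate, equation (173)

plus = np.ones(d) / np.sqrt(d)         # stabilizer state |+>
t_state = T @ plus                     # T magic state |T> = T|+>, equation (174)

def fidelity_with_target(omega, target):
    """Return <target| omega |target> for a density matrix omega (hypothetical helper)."""
    return float(np.real(np.vdot(target, omega @ target)))

# One channel use (n = 1, k = 1): prepare |+><+|, apply the T channel,
# and post-process with the identity channel.
rho_1 = np.outer(plus, plus.conj())
omega = T @ rho_1 @ T.conj().T
eps = 1.0 - fidelity_with_target(omega, t_state)
```

In this sketch the fidelity criterion (172) is met with equality, so a single use of the ideal T channel gives a $(1,1,0)$ protocol.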

In what follows, we use quantum hypothesis testing to establish an upper bound on the rate at which one can distill qutrit T states. The proof follows the general method in [72, theorem 1] and [69, theorem 1], which was later generalized to an arbitrary resource theory in [73, section 7].

Proposition 21. Given a quantum channel ${ \mathcal N }$ , the following upper bound holds for the rate $R=k/n$ of an $(n,k,\varepsilon )$ T-magic distillation protocol:

$\begin{eqnarray}&&R\leqslant \displaystyle \frac{1}{\mathrm{log}(1+2\sin (\pi /18))}\left({\theta }_{\max }({ \mathcal N })+\displaystyle \frac{\mathrm{log}(1/[1-\varepsilon ])}{n}\right).\end{eqnarray} \tag{ 175 }$

Consequently, the following upper bound holds for the T-distillable magic of a quantum channel ${ \mathcal N }$ :

$\begin{eqnarray}&&{C}_{T}({ \mathcal N })\leqslant \displaystyle \frac{{\theta }_{\max }({ \mathcal N })}{\mathrm{log}(1+2\sin (\pi /18))}.\end{eqnarray} \tag{ 176 }$

Proof. Consider an arbitrary $(n,k,\varepsilon )$ T-magic state distillation protocol of the form described previously. Such a protocol uses the channel n times, starting from the state ${\rho }_{{R}_{1}{A}_{1}}^{(1)}$ with non-negative Wigner function and generating ${\rho }_{{R}_{2}{A}_{2}}^{(2)},\ldots ,{\rho }_{{R}_{n}{A}_{n}}^{(n)},$ and ${\omega }_{S}$ step by step along the way, such that the final state ${\omega }_{S}$ has fidelity at least $1-\varepsilon$ with $| T{\rangle }^{\otimes k}$ , where $| T\rangle =T| +\rangle$ is the corresponding magic state of the T gate. By assumption, it follows that

$\begin{eqnarray}&&\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\omega }_{S}]\geqslant 1-\varepsilon ,\end{eqnarray} \tag{ 177 }$

while the result in [20] implies that

$\begin{eqnarray}&&\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\sigma }_{S}]\leqslant {\left(1+2\sin (\pi /18)\right)}^{-k}\end{eqnarray} \tag{ 178 }$

for all ${\sigma }_{S}\in { \mathcal W }$ with the same dimension as ${\omega }_{S}$ . Applying the data processing inequality for the max-relative entropy, with respect to the measurement channel

$\begin{eqnarray}&&(\cdot )\to \mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}(\cdot )]| 0\rangle \langle 0| +\mathrm{Tr}\,[({I}^{\otimes k}-| T\rangle \langle T{| }^{\otimes k})(\cdot )]| 1\rangle \langle 1| ,\end{eqnarray} \tag{ 179 }$

we find that

$\begin{eqnarray}&&{\theta }_{\max }({\omega }_{S})\geqslant \mathrm{log}[(1-\varepsilon ){\left(1+2\sin (\pi /18)\right)}^{k}]\end{eqnarray} \tag{ 180 }$

$\begin{eqnarray}&&\,\,\geqslant \,\mathrm{log}(1-\varepsilon )+k\mathrm{log}(1+2\sin (\pi /18)).\end{eqnarray} \tag{ 181 }$
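To spell out the data-processing step behind (180): assume, consistently with (203) and (204), that ${\theta }_{\max }({\omega }_{S})$ is the minimum of ${D}_{\max }({\omega }_{S}\parallel {\sigma }_{S})=\mathrm{log}\,\min \{\lambda :{\omega }_{S}\leqslant \lambda {\sigma }_{S}\}$ over ${\sigma }_{S}\in { \mathcal W }$ . Because the measurement channel in (179) is positive, any feasible $\lambda$ satisfies $\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\omega }_{S}]\leqslant \lambda \,\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\sigma }_{S}]$ . Together with (177) and (178), this gives

$\begin{eqnarray}&&\lambda \geqslant \displaystyle \frac{\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\omega }_{S}]}{\mathrm{Tr}\,[| T\rangle \langle T{| }^{\otimes k}{\sigma }_{S}]}\geqslant (1-\varepsilon ){\left(1+2\sin (\pi /18)\right)}^{k},\end{eqnarray}$

and taking the logarithm, followed by the minimum over $\lambda$ and ${\sigma }_{S}$ , recovers (180).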

Moreover, by labeling ${\omega }_{S}$ as ${\rho }^{(n+1)}$ , we find that

$\begin{eqnarray}&&{\theta }_{\max }({\rho }^{(n+1)})=\displaystyle \sum _{j=1}^{n}[{\theta }_{\max }({\rho }^{(j+1)})-{\theta }_{\max }({\rho }^{(j)})]\end{eqnarray} \tag{ 182 }$

$\begin{eqnarray}&&\,\,\,\,\,=\,\displaystyle \sum _{j=1}^{n}[{\theta }_{\max }(({{ \mathcal F }}^{(j+1)}\,\circ \,[\mathrm{id}\otimes { \mathcal N }])({\rho }^{(j)}))-{\theta }_{\max }({\rho }^{(j)})]\end{eqnarray} \tag{ 183 }$

$\begin{eqnarray}&&\,\,\,\leqslant \,\displaystyle \sum _{j=1}^{n}[{\theta }_{\max }([\mathrm{id}\otimes { \mathcal N }]({\rho }^{(j)}))-{\theta }_{\max }({\rho }^{(j)})]\end{eqnarray} \tag{ 184 }$

$\begin{eqnarray}&&\leqslant \,n{\theta }_{\max }({ \mathcal N }).\,\end{eqnarray} \tag{ 185 }$

The first equality follows because ${\theta }_{\max }({\rho }^{(1)})=0$ and by adding and subtracting terms. The second equality follows from the recursive definition of the states ${\rho }^{(j+1)}$ in the protocol. The first inequality follows because the max-thauma of a state does not increase under the action of a CPWP channel. The last inequality follows from applying proposition 19.

Hence

$\begin{eqnarray}&&n{\theta }_{\max }({ \mathcal N })\geqslant \mathrm{log}(1-\varepsilon )+k\mathrm{log}(1+2\sin (\pi /18)),\end{eqnarray} \tag{ 186 }$

which implies that

$\begin{eqnarray}&&R=k/n\leqslant \displaystyle \frac{1}{\mathrm{log}(1+2\sin (\pi /18))}\left({\theta }_{\max }({ \mathcal N })+\displaystyle \frac{\mathrm{log}(1/[1-\varepsilon ])}{n}\right).\end{eqnarray} \tag{ 187 }$

This concludes the proof.■
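The bound (175) is straightforward to evaluate once ${\theta }_{\max }({ \mathcal N })$ is known. The sketch below assumes base-2 logarithms, matching the convention ${{ \mathcal M }}_{{ \mathcal N }}={2}^{{ \mathcal M }({ \mathcal N })}$ used later in the paper; the function name `rate_upper_bound` and the constant name are hypothetical, and the value of ${\theta }_{\max }({ \mathcal N })$ must be supplied by the user (for example from an SDP solver).

```python
import math

# log2(1 + 2 sin(pi/18)): the normalization constant in (175) and (176).
T_STATE_EXPONENT = math.log2(1 + 2 * math.sin(math.pi / 18))

def rate_upper_bound(theta_max_channel, n, eps):
    """Evaluate the right-hand side of (175) for n channel uses and error eps."""
    if not 0 <= eps < 1:
        raise ValueError("eps must lie in [0, 1)")
    if n < 1:
        raise ValueError("n must be a positive integer")
    correction = math.log2(1 / (1 - eps)) / n
    return (theta_max_channel + correction) / T_STATE_EXPONENT

# Example: a channel whose max-thauma equals that of one T state.
bound_exact = rate_upper_bound(T_STATE_EXPONENT, n=10, eps=0.0)
bound_noisy = rate_upper_bound(T_STATE_EXPONENT, n=10, eps=0.1)
```

As $n\to \infty$ the correction term vanishes for any fixed $\varepsilon \lt 1$ , which is how (176) follows from (175).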

We note here that one could also use the subadditivity inequality in proposition 18 to establish the above result. We further note here that similar results in terms of max-relative entropies have been found in the context of other resource theories. Namely, a channel's max-relative entropy of entanglement is an upper bound on its distillable secret key when assisted by LOCC channels [68], the max-Rains information of a quantum channel is an upper bound on its distillable entanglement when assisted by completely PPT preserving channels [79], and the max-k-unextendibility of a quantum channel is an upper bound on its distillable entanglement when assisted by k-extendible channels [80].

4.3. Injectable quantum channel

Any resource theory of quantum channels tends to simplify for those channels that can be implemented by the action of a free channel on the tensor product of the channel input state and a resourceful state (see [73, section 7] and [81, section 6]). The situation is no different for the resource theory of magic channels. In fact, particular channels with the aforementioned structure have been considered for a long time in the context of magic states, via the method of state injection [7, 78]. Here we formally define an injectable channel as follows:

Definition 7 (Injectable channel). A quantum channel ${ \mathcal N }$ is called injectable with associated resource state ${\omega }_{C}$ if there exists a CPWP channel ${{\rm{\Lambda }}}_{{AC}\to B}$ such that the following equality holds for all input states ${\rho }_{A}$ :

$\begin{eqnarray}&&{{ \mathcal N }}_{A\to B}({\rho }_{A})={{\rm{\Lambda }}}_{{AC}\to B}({\rho }_{A}\otimes {\omega }_{C}).\end{eqnarray} \tag{ 188 }$

The notion of a resource-seizable channel was introduced in [62, 81], and here we consider the application of this notion in the context of magic resource theory:

Definition 8 (Resource-seizable channel). Let ${{ \mathcal N }}_{A\to B}$ be an injectable channel with associated resource state ${\omega }_{C}$ . The channel ${ \mathcal N }$ is resource-seizable if there exists a free state ${\kappa }_{{RA}}^{\mathrm{pre}}$ with non-negative Wigner function and a post-processing free CPWP channel ${{ \mathcal F }}_{{RB}\to C}^{\mathrm{post}}$ such that

$\begin{eqnarray}&&{{ \mathcal F }}_{{RB}\to C}^{\mathrm{post}}({{ \mathcal N }}_{A\to B}({\kappa }_{{RA}}^{\mathrm{pre}}))={\omega }_{C}.\end{eqnarray} \tag{ 189 }$

In the above sense, one seizes the resource state ${\omega }_{C}$ by employing free pre- and post-processing of the channel ${{ \mathcal N }}_{A\to B}$ .

An interesting and prominent example of an injectable channel that is also resource seizable is the channel ${ \mathcal T }$ corresponding to the $T$ gate. This channel ${ \mathcal T }$ has the following action ${ \mathcal T }(\rho ):= T\rho {T}^{\dagger }$ on an input state ρ. This channel is injectable with associated resource state ${\omega }_{C}=| T\rangle \langle T|$ , since one can use the method of circuit injection [7] to obtain the channel ${ \mathcal T }$ by acting on $| T\rangle \langle T|$ with SOs. It is resource seizable because one can act on the free state $| +\rangle \langle +|$ with the channel ${ \mathcal T }$ in order to seize the underlying resource state $| T\rangle \langle T| ={ \mathcal T }(| +\rangle \langle +| )$ .

As a generalization of the ${ \mathcal T }$ channel example above, consider the channel ${{\rm{\Delta }}}^{{\bf{p}}}\,\circ \,T$ , where ${{\rm{\Delta }}}^{{\bf{p}}}$ is a dephasing channel of the form

$\begin{eqnarray}&&{{\rm{\Delta }}}^{{\bf{p}}}(\rho )={p}_{0}\rho +{p}_{1}Z\rho {Z}^{\dagger }+{p}_{2}{Z}^{2}\rho {\left({Z}^{2}\right)}^{\dagger }\end{eqnarray} \tag{ 190 }$

where ${\bf{p}}=({p}_{0},{p}_{1},{p}_{2}),{p}_{0},{p}_{1},{p}_{2}\geqslant 0$ , and ${p}_{0}+{p}_{1}+{p}_{2}=1$ . The channel is injectable with resource state ${{\rm{\Delta }}}^{{\bf{p}}}(| T\rangle \langle T| )$ , because the same method of circuit injection leads to the channel ${{\rm{\Delta }}}^{{\bf{p}}}\,\circ \,T$ when acting on the resource state ${{\rm{\Delta }}}^{{\bf{p}}}(| T\rangle \langle T| )$ . Furthermore, the channel ${{\rm{\Delta }}}^{{\bf{p}}}\,\circ \,T$ is resource seizable because one recovers the resource state ${{\rm{\Delta }}}^{{\bf{p}}}(| T\rangle \langle T| )$ by acting with ${{\rm{\Delta }}}^{{\bf{p}}}\,\circ \,T$ on the free state $| +\rangle \langle +|$ .
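The resource-seizing step for the dephased T channel can be verified directly. This is a minimal sketch, assuming NumPy, the qutrit phase operator $Z=\mathrm{diag}(1,\omega ,{\omega }^{2})$ with $\omega ={{\rm{e}}}^{2\pi {\rm{i}}/3}$ , and an arbitrarily chosen probability vector; the function names are hypothetical. It also checks that the dephasing channel commutes with the T channel, since both are diagonal in the computational basis.

```python
import numpy as np

d = 3
omega_root = np.exp(2j * np.pi / 3)
Z = np.diag([1, omega_root, omega_root ** 2])   # qutrit phase operator (assumed convention)
xi = np.exp(2j * np.pi / 9)
T = np.diag([xi, 1, 1 / xi])                    # qutrit T gate, equation (173)

def dephase(rho, p):
    """Dephasing channel of equation (190) with p = (p0, p1, p2)."""
    p0, p1, p2 = p
    Z2 = Z @ Z
    return p0 * rho + p1 * Z @ rho @ Z.conj().T + p2 * Z2 @ rho @ Z2.conj().T

def dephased_t_channel(rho, p):
    """The channel Delta^p composed with the T channel."""
    return dephase(T @ rho @ T.conj().T, p)

plus = np.ones(d) / np.sqrt(d)
plus_proj = np.outer(plus, plus.conj())
t_proj = T @ plus_proj @ T.conj().T

p = (0.9, 0.06, 0.04)                           # illustrative noise parameters
seized = dephased_t_channel(plus_proj, p)       # channel acting on the free state |+><+|
resource = dephase(t_proj, p)                   # claimed resource state Delta^p(|T><T|)

# Commutation check on a random density matrix.
rng = np.random.default_rng(0)
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_rand = G @ G.conj().T
rho_rand /= np.trace(rho_rand)
commutes = np.allclose(dephase(T @ rho_rand @ T.conj().T, p),
                       T @ dephase(rho_rand, p) @ T.conj().T)
```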

For such injectable channels, the resource theory of magic channels simplifies in the following sense:

Proposition 22. Let ${ \mathcal N }$ be an injectable channel with associated resource state ${\omega }_{C}$ . Then the following inequalities hold

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })\leqslant { \mathcal M }({\omega }_{C}),\qquad {\boldsymbol{\theta }}({ \mathcal N })\leqslant {\boldsymbol{\theta }}({\omega }_{C}),\end{eqnarray} \tag{ 191 }$

where ${\boldsymbol{\theta }}$ denotes the generalized thauma measures from section 3.4. If ${ \mathcal N }$ is also resource seizable, then the following equalities hold

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })={ \mathcal M }({\omega }_{C}),\qquad {\boldsymbol{\theta }}({ \mathcal N })={\boldsymbol{\theta }}({\omega }_{C}).\end{eqnarray} \tag{ 192 }$

Proof. We first prove the first inequality in (191). Consider that

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })=\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel {{ \mathcal N }}_{A\to B}({A}_{A}^{{\bf{u}}}){\parallel }_{W,1}\,\,\end{eqnarray} \tag{ 193 }$

$\begin{eqnarray}&&\,=\,\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel {{\rm{\Lambda }}}_{{AC}\to B}({A}_{A}^{{\bf{u}}}\otimes {\omega }_{C}){\parallel }_{W,1}\end{eqnarray} \tag{ 194 }$

$\begin{eqnarray}&&\leqslant \,\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel {A}_{A}^{{\bf{u}}}\otimes {\omega }_{C}{\parallel }_{W,1}\,\,\end{eqnarray} \tag{ 195 }$

$\begin{eqnarray}&&\,=\,\mathrm{log}\mathop{\max }\limits_{{\bf{u}}}\parallel {A}_{A}^{{\bf{u}}}{\parallel }_{W,1}+\mathrm{log}\parallel {\omega }_{C}{\parallel }_{W,1}\end{eqnarray} \tag{ 196 }$

$\begin{eqnarray}&&=\,\mathrm{log}\parallel {\omega }_{C}{\parallel }_{W,1}\,\,\,\end{eqnarray} \tag{ 197 }$

$\begin{eqnarray}&&=\,{ \mathcal M }({\omega }_{C}).\,\,\,\,\end{eqnarray} \tag{ 198 }$

The first two equalities follow from definitions. The inequality follows from lemma 30 in the appendix. The third equality follows because the Wigner trace norm is multiplicative for tensor-product operators. The fourth equality follows because $\parallel {A}_{A}^{{\bf{u}}}{\parallel }_{W,1}=1$ for any phase-space point operator ${A}^{{\bf{u}}}$ .

We now prove the second inequality in (191):

$\begin{eqnarray}&&{\boldsymbol{\theta }}({ \mathcal N })=\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}({{ \mathcal N }}_{A\to B}({\psi }_{{RA}})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}))\,\,\end{eqnarray} \tag{ 199 }$

$\begin{eqnarray}&&=\,\mathop{\inf }\limits_{{ \mathcal E }:{ \mathcal M }({ \mathcal E })\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}({{\rm{\Lambda }}}_{{AC}\to B}({\psi }_{{RA}}\otimes {\omega }_{C})\parallel {{ \mathcal E }}_{A\to B}({\psi }_{{RA}}))\end{eqnarray} \tag{ 200 }$

$\begin{eqnarray}&&\,\,\leqslant \,\mathop{\inf }\limits_{{\sigma }_{C}\geqslant 0:{ \mathcal M }({\sigma }_{C})\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}({{\rm{\Lambda }}}_{{AC}\to B}({\psi }_{{RA}}\otimes {\omega }_{C})\parallel {{\rm{\Lambda }}}_{{AC}\to B}({\psi }_{{RA}}\otimes {\sigma }_{C}))\end{eqnarray} \tag{ 201 }$

$\begin{eqnarray}&&\leqslant \,\mathop{\inf }\limits_{{\sigma }_{C}\geqslant 0:{ \mathcal M }({\sigma }_{C})\leqslant 0}\mathop{\sup }\limits_{{\psi }_{{RA}}}{\bf{D}}({\psi }_{{RA}}\otimes {\omega }_{C}\parallel {\psi }_{{RA}}\otimes {\sigma }_{C})\,\end{eqnarray} \tag{ 202 }$

$\begin{eqnarray}&&=\,\mathop{\inf }\limits_{{\sigma }_{C}\geqslant 0:{ \mathcal M }({\sigma }_{C})\leqslant 0}{\bf{D}}({\omega }_{C}\parallel {\sigma }_{C})\,\,\,\,\end{eqnarray} \tag{ 203 }$

$\begin{eqnarray}&&=\,{\boldsymbol{\theta }}({\omega }_{C}).\,\,\,\,\,\,\end{eqnarray} \tag{ 204 }$

The first two equalities follow from definitions. The first inequality follows because the completely positive map ${ \mathcal E }={{\rm{\Lambda }}}_{{AC}\to B}(\cdot \otimes {\sigma }_{C})$ with ${\sigma }_{C}\in { \mathcal W }$ is a special kind of completely positive map such that ${ \mathcal M }({ \mathcal E })\leqslant 0$ , due to the first inequality in (191). The second inequality follows from data processing under the channel ${{\rm{\Lambda }}}_{{AC}\to B}$ . The third equality follows because the generalized divergence is invariant under tensoring its two arguments with the same state ${\psi }_{{RA}}$ (again a consequence of data processing [58]). The final equality follows from the definition in (96).

The equalities in (192) follow by combining (191) with the reverse inequalities ${ \mathcal M }({\omega }_{C})\leqslant { \mathcal M }({ \mathcal N })$ and ${\boldsymbol{\theta }}({\omega }_{C})\leqslant {\boldsymbol{\theta }}({ \mathcal N })$ . These reverse inequalities are a direct consequence of the definition of a resource-seizable channel, the fact that both the mana and the generalized thauma are monotone under the action of a CPWP superchannel (theorems 9 and 11, respectively), and with ${{ \mathcal F }}_{{RB}\to C}^{\mathrm{post}}({{ \mathcal N }}_{A\to B}({\kappa }_{{RA}}^{\mathrm{pre}}))$ understood as the output of a particular kind of superchannel that manipulates ${{ \mathcal N }}_{A\to B}$ to the state ${\omega }_{C}$ . Furthermore, it is the case that the channel measures reduce to the state measures when evaluated for preparation channels that take as input a trivial one-dimensional system, for which the only possible 'state' is the number one, and output a state on the output system (see proposition 4 and (122)).■

Applying proposition 22 to the channel ${ \mathcal T }$ and applying some of the results in [20], we find that

$\begin{eqnarray}&&{\theta }_{\max }({ \mathcal T })=\theta ({ \mathcal T })={\theta }_{\max }(| T\rangle \langle T| )=\theta (| T\rangle \langle T| )=\mathrm{log}(1+2\sin (\pi /18)).\end{eqnarray} \tag{ 205 }$

The notion of an injectable channel also improves the upper bounds on the distillable magic of a quantum channel:

Proposition 23. Given an injectable quantum channel ${ \mathcal N }$ with associated resource state ${\omega }_{C}$ , the following upper bound holds for the rate $R=k/n$ of an $(n,k,\varepsilon )$ T-magic distillation protocol:

$\begin{eqnarray}&&R\leqslant \displaystyle \frac{1}{\mathrm{log}(1+2\sin (\pi /18))(1-\varepsilon )}\left(\theta ({\omega }_{C})+\displaystyle \frac{{h}_{2}(\varepsilon )}{n}\right),\end{eqnarray} \tag{ 206 }$

where ${h}_{2}(\varepsilon ):= -\varepsilon {\mathrm{log}}_{2}\varepsilon -(1-\varepsilon ){\mathrm{log}}_{2}(1-\varepsilon )$ . Consequently, the following upper bound holds for the T-distillable magic of the injectable quantum channel ${ \mathcal N }$ :

$\begin{eqnarray}&&{C}_{T}({ \mathcal N })\leqslant \displaystyle \frac{\theta ({\omega }_{C})}{\mathrm{log}(1+2\sin (\pi /18))}.\end{eqnarray} \tag{ 207 }$

Proof. Consider an arbitrary $(n,k,\varepsilon )$ T-magic state distillation protocol of the form described previously. Due to the injection property, it follows that such a protocol is equivalent to a CPWP channel acting on the resource state ${\omega }_{C}^{\otimes n}$ (see figure 5 of [73]). So the channel distillation problem reduces to a state distillation problem. Applying proposition 4 of [20] and standard inequalities for the hypothesis testing relative entropy from [82], we conclude the bound in (206). Then taking limits, we arrive at (207). ■

5. Magic cost of a quantum channel

5.1. Magic cost of exact channel simulation

Beyond magic distillation via quantum channels, the magic measures of quantum channels can also help us investigate the magic cost in quantum gate synthesis. In the past two decades, tremendous progress has been made in the area of gate synthesis for qubits (e.g. [83–90]) and qudits (e.g. [91–95]). Elementary two-qudit gates include the controlled-increment gate [91] and the generalized controlled-X gate [93, 94]. More recently, the synthesis of single-qutrit gates was studied in [96, 97].

Of particular interest is to study exact gate synthesis of multi-qudit unitary gates from elements of the Clifford group supplemented by T gates. More generally, a fundamental question is to determine how many instances of a given quantum channel ${ \mathcal N }^{\prime}$ are required to simulate another quantum channel ${ \mathcal N }$ , when supplemented with CPWP channels. That is, such a channel synthesis protocol has the following form:

$\begin{eqnarray}&&{{ \mathcal N }}_{A\to B}={{ \mathcal F }}_{{R}_{n}{B}_{n}^{{\prime} }\to B}^{n+1}\,\circ \,{{ \mathcal N }}_{{A}_{n}^{{\prime} }\to {B}_{n}^{{\prime} }}^{{\prime} }\,\circ \,{{ \mathcal F }}_{{R}_{n-1}{B}_{n-1}^{{\prime} }\to {R}_{n}{A}_{n}^{\prime} }^{n}\,\circ \,\cdots \,\circ \,{{ \mathcal F }}_{{R}_{1}{B}_{1}^{{\prime} }\to {R}_{2}{A}_{2}^{{\prime} }}^{2}\,\circ \,{{ \mathcal N }}_{{A}_{1}^{{\prime} }\to {B}_{1}^{{\prime} }}^{{\prime} }\,\circ \,{{ \mathcal F }}_{A\to {R}_{1}{A}_{1}^{{\prime} }}^{1},\end{eqnarray} \tag{ 208 }$

as depicted in figure 3. Let ${S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })$ denote the smallest number of ${{ \mathcal N }}^{{\prime} }$ channels required to implement the quantum channel ${ \mathcal N }$ exactly. Note that it might not always be possible to have an exact simulation of the channel ${ \mathcal N }$ when starting from another channel ${{ \mathcal N }}^{{\prime} }$ . For example, if ${ \mathcal N }$ is a unitary channel and ${{ \mathcal N }}^{{\prime} }$ is a noisy depolarizing channel, then this is not possible. In this case, we define ${S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })=\infty$ .

**Figure 3.** The most general protocol for exact synthesis of a channel ${{ \mathcal N }}_{A\to B}$ starting from n uses of another quantum channel ${{ \mathcal N }}^{{\prime} }$ , along with free CPWP channels ${{ \mathcal F }}^{i}$ , for $i\in \{1,\ldots ,n+1\}$ .

In the following, we establish lower bounds on gate synthesis by employing the channel measures of magic introduced previously.

Proposition 24. For any qudit quantum channel ${ \mathcal N }$ , the number of channels ${{ \mathcal N }}^{{\prime} }$ required to implement it is bounded from below as follows:

$\begin{eqnarray}&&{S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })\geqslant \max \left\{\displaystyle \frac{{ \mathcal M }({ \mathcal N })}{{ \mathcal M }({{ \mathcal N }}^{{\prime} })},\displaystyle \frac{{\theta }_{\max }({ \mathcal N })}{{\theta }_{\max }({{ \mathcal N }}^{{\prime} })}\right\}.\end{eqnarray} \tag{ 209 }$

If the channel ${{ \mathcal N }}^{{\prime} }$ is injectable with associated resource state ${\omega }_{C}$ , then the following bound holds

$\begin{eqnarray}&&{S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })\geqslant \max \left\{\displaystyle \frac{{ \mathcal M }({ \mathcal N })}{{ \mathcal M }({\omega }_{C})},\displaystyle \frac{{\theta }_{\max }({ \mathcal N })}{{\theta }_{\max }({\omega }_{C})}\right\}.\end{eqnarray} \tag{ 210 }$

Proof. Suppose that the simulation of ${ \mathcal N }$ is realized as in (208). Applying proposition 6 iteratively, we find that

$\begin{eqnarray}&&{ \mathcal M }({ \mathcal N })\leqslant n{ \mathcal M }({{ \mathcal N }}^{{\prime} })+\displaystyle \sum _{i=1}^{n+1}{ \mathcal M }({{ \mathcal F }}^{i})=n{ \mathcal M }({{ \mathcal N }}^{{\prime} }),\end{eqnarray} \tag{ 211 }$

where the equality follows from proposition 7 and the assumption that each ${{ \mathcal F }}^{i}$ is a CPWP channel. Then $n\geqslant \tfrac{{ \mathcal M }({ \mathcal N })}{{ \mathcal M }({{ \mathcal N }}^{{\prime} })}$ . Since this inequality holds for an arbitrary channel synthesis protocol, we find that ${S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })\geqslant \tfrac{{ \mathcal M }({ \mathcal N })}{{ \mathcal M }({{ \mathcal N }}^{{\prime} })}$ .

Applying propositions 18 and 12 in a similar way, we conclude that ${S}_{{{ \mathcal N }}^{{\prime} }}({ \mathcal N })\geqslant \tfrac{{\theta }_{\max }({ \mathcal N })}{{\theta }_{\max }({{ \mathcal N }}^{{\prime} })}$ .

If the channel is injectable, then the upper bounds in (191) apply, from which we conclude (210).■

As a direct application, we investigate gate synthesis of elementary gates. In the following, we prove that four T gates are necessary to synthesize a controlled–controlled-X qutrit gate (CCX gate) exactly.

Proposition 25. To implement a CCX qutrit gate, at least four qutrit $T$ gates are required.

Proof. By direct numerical evaluation, we find that

$\begin{eqnarray}&&{S}_{T}({\rm{CCX}})\geqslant \displaystyle \frac{{ \mathcal M }({\rm{CCX}})}{{ \mathcal M }(| T\rangle \langle T| )}\geqslant \displaystyle \frac{2.1876}{0.6657}\geqslant 3.2861,\end{eqnarray} \tag{ 212 }$

which means that four qutrit T gates are necessary to implement a qutrit CCX gate. ■
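The denominator in (212) can be reproduced with a few lines of linear algebra. The following is a sketch under stated assumptions: it uses NumPy, base-2 logarithms, and the Heisenberg–Weyl convention ${T}_{{\bf{u}}}={\tau }^{-{a}_{1}{a}_{2}}{Z}^{{a}_{1}}{X}^{{a}_{2}}$ with $\tau ={{\rm{e}}}^{(d+1)\pi {\rm{i}}/d}$ , ${A}_{{\bf{0}}}=\tfrac{1}{d}{\sum }_{{\bf{u}}}{T}_{{\bf{u}}}$ and ${A}_{{\bf{u}}}={T}_{{\bf{u}}}{A}_{{\bf{0}}}{T}_{{\bf{u}}}^{\dagger }$ , which is assumed here to match the paper's earlier definitions. The mana of the CCX gate is not recomputed; the value 2.1876 is taken from (212) as an input, and all function names are hypothetical.

```python
import itertools
import math
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
tau = np.exp(1j * np.pi * (d + 1) / d)
X = np.roll(np.eye(d), 1, axis=0)       # shift operator: X|j> = |j+1 mod d>
Z = np.diag(w ** np.arange(d))          # phase operator: Z|j> = w^j |j>

def heisenberg_weyl(a1, a2):
    """Heisenberg-Weyl operator T_u for the phase-space point u = (a1, a2)."""
    return tau ** (-a1 * a2) * np.linalg.matrix_power(Z, a1) @ np.linalg.matrix_power(X, a2)

points = list(itertools.product(range(d), repeat=2))
A0 = sum(heisenberg_weyl(*u) for u in points) / d
# Phase-space point operators A_u = T_u A_0 T_u^dagger, stacked into shape (d^2, d, d).
A = np.array([heisenberg_weyl(*u) @ A0 @ heisenberg_weyl(*u).conj().T for u in points])

def wigner(rho):
    """Discrete Wigner function W_rho(u) = Tr[A_u rho] / d for every point u."""
    return np.real(np.einsum("uij,ji->u", A, rho)) / d

def mana(rho):
    """Mana of a state: log2 of the Wigner trace norm."""
    return math.log2(np.abs(wigner(rho)).sum())

xi = np.exp(2j * np.pi / 9)
T = np.diag([xi, 1.0, 1.0 / xi])
plus = np.ones(d) / np.sqrt(d)
plus_proj = np.outer(plus, plus.conj())
t_state = T @ plus
t_proj = np.outer(t_state, t_state.conj())

mana_T = mana(t_proj)
MANA_CCX_REPORTED = 2.1876              # numerical value quoted in (212), not recomputed here
min_T_gates = math.ceil(MANA_CCX_REPORTED / mana_T)
```

Under these assumptions the computed mana of $| T\rangle \langle T|$ agrees with the value used in (212) up to rounding, and rounding the ratio up to the next integer gives the count stated in proposition 25.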

For NISQ devices, it is natural to consider gate synthesis under realistic quantum noise. One common noise model in quantum information processing is the depolarizing channel:

$\begin{eqnarray}&&{{ \mathcal D }}_{p}(\rho )=(1-p)\rho +\displaystyle \frac{p}{{d}^{2}-1}\displaystyle \sum _{\displaystyle \genfrac{}{}{0em}{}{0\leqslant i,j\leqslant d-1}{(i,j)\ne (0,0)}}{X}^{i}{Z}^{j}\rho {\left({X}^{i}{Z}^{j}\right)}^{\dagger }.\end{eqnarray} \tag{ 213 }$

Suppose that a T gate is not available, but instead only a noisy version ${{ \mathcal D }}_{p}\,\circ \,T$ of it is. Then it is reasonable to consider the number of noisy T gates required to implement a low-noise CCX gate, and the resulting lower bound is depicted in figure 4. Considering the depolarizing noise (p = 0.01) and applying proposition 24, the lower bound is given by

$\begin{eqnarray}&&\displaystyle \frac{{ \mathcal M }({{ \mathcal D }}_{0.01}^{\otimes 3}\,\circ \,{\rm{CCX}})}{{ \mathcal M }({{ \mathcal D }}_{p}\,\circ \,T)}.\end{eqnarray} \tag{ 214 }$

**Figure 4.** Lower bound on the number of noisy T gates required to implement a low-noise CCX gate.
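The denominator ${ \mathcal M }({{ \mathcal D }}_{p}\,\circ \,T)$ in (214) involves only a single qutrit, so it can be evaluated by brute force from the definition ${ \mathcal M }({ \mathcal N })=\mathrm{log}\,{\max }_{{\bf{u}}}\parallel { \mathcal N }({A}_{{\bf{u}}}){\parallel }_{W,1}$ in (193). The sketch below assumes NumPy, base-2 logarithms, and the same Heisenberg–Weyl convention ${T}_{{\bf{u}}}={\tau }^{-{a}_{1}{a}_{2}}{Z}^{{a}_{1}}{X}^{{a}_{2}}$ as an assumption about the paper's earlier definitions; the function names are hypothetical. It does not attempt the three-qutrit numerator.

```python
import itertools
import math
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
tau = np.exp(1j * np.pi * (d + 1) / d)
X = np.roll(np.eye(d), 1, axis=0)       # X|j> = |j+1 mod d>
Z = np.diag(w ** np.arange(d))          # Z|j> = w^j |j>

def heisenberg_weyl(a1, a2):
    return tau ** (-a1 * a2) * np.linalg.matrix_power(Z, a1) @ np.linalg.matrix_power(X, a2)

points = list(itertools.product(range(d), repeat=2))
A0 = sum(heisenberg_weyl(*u) for u in points) / d
A = np.array([heisenberg_weyl(*u) @ A0 @ heisenberg_weyl(*u).conj().T for u in points])

def channel_mana(channel):
    """log2 of the largest Wigner trace norm of channel(A_u) over all points u."""
    worst = 0.0
    for A_u in A:
        out = channel(A_u)
        wigner_out = np.real(np.einsum("vij,ji->v", A, out)) / d
        worst = max(worst, np.abs(wigner_out).sum())
    return math.log2(worst)

def depolarize(rho, p):
    """Depolarizing channel of equation (213)."""
    out = (1 - p) * rho
    for i, j in points:
        if (i, j) != (0, 0):
            P = np.linalg.matrix_power(X, i) @ np.linalg.matrix_power(Z, j)
            out = out + p / (d ** 2 - 1) * P @ rho @ P.conj().T
    return out

xi = np.exp(2j * np.pi / 9)
T = np.diag([xi, 1.0, 1.0 / xi])

def ideal_t(rho):
    return T @ rho @ T.conj().T

def noisy_t(rho, p=0.01):
    return depolarize(ideal_t(rho), p)

mana_ideal = channel_mana(ideal_t)
mana_noisy = channel_mana(noisy_t)
```

In this sketch the noisy gate has strictly smaller mana than the ideal one, so more noisy T gates are needed to reach the same lower bound.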

5.2. Magic cost of approximate channel simulation

Here we consider the magic cost of approximate channel simulation, which allows for a small error in the simulation process. To be specific, we establish the following proposition:

Proposition 26. For any qudit channel ${ \mathcal N }$ (with odd dimensions), the following lower bound holds for the number of channels ${{ \mathcal N }}^{{\prime} }$ required to implement it with error tolerance $\varepsilon$ :

$\begin{eqnarray}&&{S}_{{{ \mathcal N }}^{{\prime} }}^{\varepsilon }({ \mathcal N })\geqslant \min \{k:k{ \mathcal M }({{ \mathcal N }}^{{\prime} })\geqslant { \mathcal M }(\widetilde{{ \mathcal N }}),\displaystyle \frac{1}{2}\parallel \widetilde{{ \mathcal N }}-{ \mathcal N }{\parallel }_{\diamond }\leqslant \varepsilon ,\widetilde{{ \mathcal N }}\in \mathrm{CPTP}\},\end{eqnarray} \tag{ 215 }$

where $\parallel \cdot {\parallel }_{\diamond }:= {\sup }_{k\in {\mathbb{N}}}{\sup }_{\parallel X{\parallel }_{1}\leqslant 1}\parallel (\cdot \otimes {\mathrm{id}}_{k})(X){\parallel }_{1}$ denotes the diamond norm.

To get this bound, we minimize the lower bound on the exact magic cost in proposition 24 over the quantum channels that are ε-close to the target channel in terms of the diamond norm. Using the SDP form of the diamond norm [98], the above lower bound can be computed via the following SDP:

$\begin{eqnarray}&&\begin{array}{l}\mathrm{log}\,\min \ t\\ \quad {\rm{s}}.{\rm{t}}.\ t{2}^{{ \mathcal M }({{ \mathcal N }}^{{\prime} })}\geqslant \parallel \widetilde{{ \mathcal N }}({A}_{{\bf{u}}}){\parallel }_{W,1},\forall {\bf{u}},\\ \quad \mathrm{Tr}{}_{B}{Y}_{{AB}}\leqslant \varepsilon {{\mathbb{1}}}_{A},Y\geqslant 0,Y\geqslant {J}_{{ \mathcal N }}-{J}_{\widetilde{{ \mathcal N }}},\\ \quad {J}_{\widetilde{{ \mathcal N }}}\geqslant 0,\mathrm{Tr}{}_{B}{J}_{\widetilde{{ \mathcal N }}}={{\mathbb{1}}}_{A}.\end{array}\end{eqnarray} \tag{ 216 }$

Moreover, one could also replace mana with thauma in the above resource estimation. Also, it is possible and interesting to exactly characterize the minimum error of channel simulation under CPWP bipartite channels. We leave these for future study.

6. Classical simulation of quantum channels

6.1. Classical algorithm for simulating noisy quantum circuits

An operational meaning associated with mana is that it quantifies the rate at which a quantum circuit can be simulated on a classical computer. Inspired by [30], we propose an algorithm for simulating quantum circuits in which the operations can potentially be noisy. We show that the complexity of this algorithm scales with the mana (the logarithmic negativity) of quantum channels, establishing mana as a useful measure of the cost of classical simulation of a (noisy) quantum circuit. For recent independent and related work, see [99].

Let ${{ \mathcal H }}_{d}^{\otimes n}$ be the Hilbert space of an n-qudit system. Consider an evolution that consists of the sequence ${\{{{ \mathcal N }}_{l}\}}_{l=1}^{L}$ of channels acting on an input state ρ. Then the probability of observing the POVM measurement outcome E, where $0\leqslant E\leqslant I$ , can be computed according to the Born rule as

$\begin{eqnarray}&&\mathrm{Tr}\,\left[E({{ \mathcal N }}_{L}\,\circ \,\cdots \,\circ \,{{ \mathcal N }}_{1})(\rho )\right]=\displaystyle \sum _{\vec{{\bf{u}}}}W(E| {{\bf{u}}}_{L})\displaystyle \prod _{l=1}^{L}{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1}){W}_{\rho }({{\bf{u}}}_{0}),\end{eqnarray} \tag{ 217 }$

where $\vec{{\bf{u}}}=({{\bf{u}}}_{0},\ldots ,{{\bf{u}}}_{L})$ is a sequence of points in the discrete phase space and $W(E| \cdot )$ is the discrete Wigner function of the measurement operator E (see equation (10)). For the base case L = 1, this follows from the properties of the discrete Wigner function:

$\begin{eqnarray}&&\displaystyle \sum _{{\bf{u}},{\bf{u}}^{\prime} }W(E| {\bf{u}}^{\prime} ){W}_{{ \mathcal N }}({\bf{u}}^{\prime} | {\bf{u}}){W}_{\rho }({\bf{u}})=\displaystyle \sum _{{\bf{u}},{\bf{u}}^{\prime} }\mathrm{Tr}\,\left[{{EA}}_{{\bf{u}}^{\prime} }\right]\displaystyle \frac{1}{d}\mathrm{Tr}\,\left[{A}_{{\bf{u}}^{\prime} }{ \mathcal N }({A}_{{\bf{u}}})\right]\displaystyle \frac{1}{d}\mathrm{Tr}\,\left[{A}_{{\bf{u}}}\rho \right]\end{eqnarray} \tag{ 218 }$

$\begin{eqnarray}&&\,\,\,=\displaystyle \sum _{{\bf{u}}^{\prime} }\mathrm{Tr}\,\left[{{EA}}_{{\bf{u}}^{\prime} }\right]\displaystyle \frac{1}{d}\mathrm{Tr}\,\left[{A}_{{\bf{u}}^{\prime} }{ \mathcal N }(\rho )\right]\end{eqnarray} \tag{ 219 }$

$\begin{eqnarray}&&=\mathrm{Tr}\,\left[E{ \mathcal N }(\rho )\right].\,\,\end{eqnarray} \tag{ 220 }$

The case of general L follows by induction.
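The base case (218)–(220) can be checked numerically for a single qutrit. This is a minimal sketch, assuming NumPy and the Heisenberg–Weyl convention ${T}_{{\bf{u}}}={\tau }^{-{a}_{1}{a}_{2}}{Z}^{{a}_{1}}{X}^{{a}_{2}}$ with $\tau ={{\rm{e}}}^{(d+1)\pi {\rm{i}}/d}$ (an assumption about the paper's earlier definitions). The choice of channel (the T gate), input state and effect (both $| +\rangle \langle +|$ ) is purely illustrative.

```python
import itertools
import numpy as np

d = 3
w = np.exp(2j * np.pi / d)
tau = np.exp(1j * np.pi * (d + 1) / d)
X = np.roll(np.eye(d), 1, axis=0)       # X|j> = |j+1 mod d>
Z = np.diag(w ** np.arange(d))          # Z|j> = w^j |j>

def heisenberg_weyl(a1, a2):
    return tau ** (-a1 * a2) * np.linalg.matrix_power(Z, a1) @ np.linalg.matrix_power(X, a2)

points = list(itertools.product(range(d), repeat=2))
A0 = sum(heisenberg_weyl(*u) for u in points) / d
A = np.array([heisenberg_weyl(*u) @ A0 @ heisenberg_weyl(*u).conj().T for u in points])

xi = np.exp(2j * np.pi / 9)
T = np.diag([xi, 1.0, 1.0 / xi])
plus = np.ones(d) / np.sqrt(d)
rho = np.outer(plus, plus.conj())       # input state |+><+|
E = rho.copy()                          # measurement effect |+><+|

# Wigner representations appearing in (218).
W_rho = np.real(np.einsum("uij,ji->u", A, rho)) / d            # W_rho(u)
W_E = np.real(np.einsum("uij,ji->u", A, E))                    # W(E|u') = Tr[E A_u']
N_of_A = T @ A @ T.conj().T                                    # N(A_u) for every u
W_N = np.real(np.einsum("vij,uji->vu", A, N_of_A)) / d         # W_N(u'|u)

phase_space_sum = W_E @ W_N @ W_rho                            # left-hand side of (218)
born_probability = float(np.real(np.trace(E @ T @ rho @ T.conj().T)))   # (220)
```

The two numbers agree because the operators ${A}_{{\bf{u}}}$ form an orthogonal operator basis with $\mathrm{Tr}\,[{A}_{{\bf{u}}}{A}_{{\bf{v}}}]=d\,{\delta }_{{\bf{u}},{\bf{v}}}$ , which the sketch also makes easy to confirm.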

Our goal is to estimate $\mathrm{Tr}\,\left[E({{ \mathcal N }}_{L}\,\circ \,\cdots \,\circ \,{{ \mathcal N }}_{1})(\rho )\right]$ with additive error. In what follows, we assume that the input state is $\rho =| {0}^{n}\rangle \langle {0}^{n}|$ and the desired outcome is $| 0\rangle \langle 0|$ . This assumption is without loss of generality, since we can reformulate both the state preparation and the measurement as quantum channels

$\begin{eqnarray}&&{{ \mathcal N }}_{1}(\sigma )=\mathrm{Tr}\,(\sigma )\rho ,\qquad {{ \mathcal N }}_{L}(\sigma )=\mathrm{Tr}\,(E\sigma )| 0\rangle \langle 0| +\mathrm{Tr}\,(({\mathbb{1}}-E)\sigma )| 1\rangle \langle 1| .\end{eqnarray} \tag{ 221 }$

Consequently, we have

$\begin{eqnarray}&&\mathrm{Tr}\,\left[| 0\rangle \langle 0| ({{ \mathcal N }}_{L}\,\circ \,\cdots \,\circ \,{{ \mathcal N }}_{1})(| {0}^{n}\rangle \langle {0}^{n}| )\right]=\displaystyle \sum _{\vec{{\bf{u}}}}W(| 0\rangle \langle 0| | {{\bf{u}}}_{L})\displaystyle \prod _{l=1}^{L}{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1}){W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0}).\end{eqnarray} \tag{ 222 }$

To describe the simulation algorithm, we define the negativity of quantum states and channels as

$\begin{eqnarray}\begin{array}{rcl}{{ \mathcal M }}_{\rho } & := & \parallel \rho {\parallel }_{W,1}=\displaystyle \sum _{{\bf{u}}}| {W}_{\rho }({\bf{u}})| ,\\ {{ \mathcal M }}_{{ \mathcal N }}({\bf{u}}) & := & \parallel { \mathcal N }({A}_{{\bf{u}}}){\parallel }_{W,1}=\displaystyle \sum _{{\bf{u}}^{\prime} }| {W}_{{ \mathcal N }}({\bf{u}}^{\prime} | {\bf{u}})| ,\\ {{ \mathcal M }}_{{ \mathcal N }} & := & {2}^{{ \mathcal M }({ \mathcal N })}=\mathop{\max }\limits_{{\bf{u}}}{{ \mathcal M }}_{{ \mathcal N }}({\bf{u}}).\end{array}\end{eqnarray} \tag{ 223 }$

Then a noisy circuit comprised of the channels ${\{{{ \mathcal N }}_{l}\}}_{l=1}^{L}$ can be simulated as follows. We sample the initial phase point ${{\bf{u}}}_{0}$ according to the distribution $| {W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})| /{{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }$ and, for each l=1,...,L, we sample a phase point ${{\bf{u}}}_{l}$ according to the conditional distribution $| {W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1})| /{{ \mathcal M }}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l-1})$ , after which we output the estimate

$\begin{eqnarray}&&{{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }\mathrm{Sign}\left[{W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})\right]\displaystyle \prod _{l=1}^{L}{{ \mathcal M }}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l-1})\mathrm{Sign}\left[{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1})\right]W(| 0\rangle \langle 0| | {{\bf{u}}}_{L}).\end{eqnarray} \tag{ 224 }$

This gives an unbiased estimate of the output probability since

$\begin{eqnarray}&&\begin{array}{l}{\mathbb{E}}\left[{{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }\mathrm{Sign}\left[{W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})\right]\displaystyle \prod _{l=1}^{L}{{ \mathcal M }}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l-1})\mathrm{Sign}\left[{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1})\right]W(| 0\rangle \langle 0| | {{\bf{u}}}_{L})\right]\\ \quad =\,\displaystyle \sum _{\vec{{\bf{u}}}}\displaystyle \frac{| {W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})| }{{{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }}\displaystyle \prod _{l=1}^{L}\displaystyle \frac{| {W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1})| }{{{ \mathcal M }}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l-1})}\\ \qquad \times \,{{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }\mathrm{Sign}\left[{W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})\right]\displaystyle \prod _{l=1}^{L}{{ \mathcal M }}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l-1})\mathrm{Sign}\left[{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1})\right]W(| 0\rangle \langle 0| | {{\bf{u}}}_{L})\\ \quad =\,\displaystyle \sum _{\vec{{\bf{u}}}}W(| 0\rangle \langle 0| | {{\bf{u}}}_{L})\displaystyle \prod _{l=1}^{L}{W}_{{{ \mathcal N }}_{l}}({{\bf{u}}}_{l}| {{\bf{u}}}_{l-1}){W}_{| {0}^{n}\rangle \langle {0}^{n}| }({{\bf{u}}}_{0})\\ \quad =\,\mathrm{Tr}\,\left[| 0\rangle \langle 0| ({{ \mathcal N }}_{L}\,\circ \,\cdots \,\circ \,{{ \mathcal N }}_{1})(| {0}^{n}\rangle \langle {0}^{n}| )\right].\end{array}\end{eqnarray} \tag{ 225 }$
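The sampling procedure and the unbiasedness argument above can be illustrated with a minimal sketch. It assumes that the discrete Wigner representations are already available as plain tables; the function names and the three-point toy tables below are hypothetical and do not correspond to any physical circuit. `single_estimate` draws one trajectory and returns the estimate (224), while `exact_value` evaluates the quasi-probability sum (222) directly for comparison.

```python
import random

def negativity(column):
    """Sum of absolute values of a quasi-probability vector."""
    return sum(abs(x) for x in column)

def sample_index(weights, rng):
    """Draw an index with probability proportional to |weights[i]|."""
    return rng.choices(range(len(weights)), weights=[abs(w) for w in weights])[0]

def sign(x):
    return 1.0 if x >= 0 else -1.0

def single_estimate(w_state, w_channels, w_effect, rng):
    """One sample of the estimator (224).

    w_state[u]          : W_rho(u)
    w_channels[l][u][v] : W_{N_l}(v | u), indexed by the input point u first
    w_effect[u]         : W(E | u)
    """
    u = sample_index(w_state, rng)
    value = negativity(w_state) * sign(w_state[u])
    for w in w_channels:
        column = w[u]                      # conditional quasi-distribution given u
        v = sample_index(column, rng)
        value *= negativity(column) * sign(column[v])
        u = v
    return value * w_effect[u]

def exact_value(w_state, w_channels, w_effect):
    """Direct evaluation of the quasi-probability sum (222)."""
    dist = list(w_state)
    for w in w_channels:
        dist = [sum(w[u][v] * dist[u] for u in range(len(dist)))
                for v in range(len(w[0]))]
    return sum(e * p for e, p in zip(w_effect, dist))

# Hypothetical toy tables (illustrative numbers only; each row sums to one).
TOY_STATE = [0.5, 0.5, 0.0]
TOY_LAYER = [[0.8, 0.4, -0.2],
             [0.0, 1.0, 0.0],
             [-0.1, 0.3, 0.8]]
TOY_CHANNELS = [TOY_LAYER, TOY_LAYER]
TOY_EFFECT = [1.0, 0.0, 0.5]
```

Averaging `single_estimate` over many independent draws should approach `exact_value` for the same tables, which is the content of (225). The cost of the sketch is dominated by the table lookups; for actual circuits the tables are never stored in full, and only the columns visited along a trajectory need to be computed.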

Note that ${{ \mathcal M }}_{| {0}^{n}\rangle \langle {0}^{n}| }=1$ since $| {0}^{n}\rangle$ is trivially a stabilizer state. Also for any stabilizer POVM $\{{E}_{k}\}$ , we have $\mathrm{Tr}\,\left[{E}_{k}{A}_{{\bf{u}}}\right]\geqslant 0$ and

$\begin{eqnarray}&&\displaystyle \sum _{k}\mathrm{Tr}\,\left[{E}_{k}{A}_{{\bf{u}}}\right]=\mathrm{Tr}\,\left[{A}_{{\bf{u}}}\right]=1,\end{eqnarray} \tag{ 226 }$

which implies that

$\begin{eqnarray}&&\mathop{\max }\limits_{{\bf{u}}}| W({E}_{k}| {\bf{u}})| \,=\,\mathop{\max }\limits_{{\bf{u}}}\left|\mathrm{Tr}\,\left[{E}_{k}{A}_{{\bf{u}}}\right]\right|\,=\,\mathop{\max }\limits_{{\bf{u}}}\mathrm{Tr}\,\left[{E}_{k}{A}_{{\bf{u}}}\right]\leqslant 1.\end{eqnarray} \tag{ 227 }$

Therefore, the estimate that we output has absolute value bounded from above by

$\begin{eqnarray}&&{{ \mathcal M }}_{\to }=\displaystyle \prod _{l=1}^{L}{{ \mathcal M }}_{{{ \mathcal N }}_{l}}.\end{eqnarray} \tag{ 228 }$

By the Hoeffding inequality, it suffices to take

$\begin{eqnarray}&&\displaystyle \frac{2}{{\epsilon }^{2}}{{ \mathcal M }}_{\to }^{2}\mathrm{log}\left(\displaystyle \frac{2}{\delta }\right)\end{eqnarray} \tag{ 229 }$

samples to estimate the probability of a fixed measurement outcome with accuracy $\epsilon$ and success probability $1-\delta$ .
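As a small worked instance of the bound (229), the following sketch computes the sample count for given accuracy and failure probability. The function name is hypothetical, and the logarithm is taken to be the natural logarithm, which is the form in which the Hoeffding inequality yields this bound; that choice is an assumption about the convention in (229).

```python
import math

def hoeffding_samples(total_negativity, epsilon, delta):
    """Number of samples from (229): (2 / eps^2) * M^2 * log(2 / delta).

    total_negativity : the product of the channel negativities in (228)
    epsilon          : additive accuracy
    delta            : allowed failure probability
    The natural logarithm is assumed here.
    """
    if epsilon <= 0 or not 0 < delta < 1:
        raise ValueError("need epsilon > 0 and 0 < delta < 1")
    bound = 2.0 * total_negativity ** 2 / epsilon ** 2 * math.log(2.0 / delta)
    return math.ceil(bound)
```

For a circuit of CPWP channels the total negativity equals one, so the count depends only on the accuracy and the failure probability. Each additional channel with negativity larger than one multiplies the count by the square of that negativity.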

In the description of the above algorithm, we have used the discrete Wigner representation of quantum states, channels, and measurement operators. However, the algorithm can be generalized using the frame and dual frame representation along the lines of the work [30]. Specifically, for any frame $\{F(\lambda ):\lambda \in {\rm{\Lambda }}\}$ and its dual frame $\{G(\lambda ):\lambda \in {\rm{\Lambda }}\}$ on a d-dimensional Hilbert space, we define the corresponding quasiprobability representation of a state, a channel, and a measurement operator respectively as

$\begin{eqnarray}\begin{array}{rcl}{W}_{\rho }(\lambda ) & = & \mathrm{Tr}\,[F(\lambda )\rho ],\\ {W}_{{ \mathcal N }}(\lambda ^{\prime} | \lambda ) & = & \mathrm{Tr}\,[F(\lambda ^{\prime} ){ \mathcal N }(G(\lambda ))],\\ W(E| \lambda ) & = & \mathrm{Tr}\,\left[{EG}(\lambda )\right].\end{array}\end{eqnarray} \tag{ 230 }$

Then, we have similar rules for computing the measurement probability

$\begin{eqnarray}&&\displaystyle \sum _{\lambda ,\lambda ^{\prime} }W(E| \lambda ^{\prime} ){W}_{{ \mathcal N }}(\lambda ^{\prime} | \lambda ){W}_{\rho }(\lambda )=\mathrm{Tr}\,\left[E{ \mathcal N }(\rho )\right]\end{eqnarray} \tag{ 231 }$

and the above discussion carries through without any essential change. For simplicity, we omit the details here. Note that for the discrete Wigner function representation that we used in our paper, the correspondence is $F(\lambda )={A}_{{\bf{u}}}/d$ and $G(\lambda )={A}_{{\bf{u}}}$ .

6.2. Comparison of classical simulation algorithms for noisy quantum circuits

Recently, the channel robustness and the magic capacity were introduced to quantify the magic of multi-qubit noisy circuits [21]. To be specific, given a quantum channel ${ \mathcal N }$ , the channel robustness is defined as

$\begin{eqnarray}&&{{ \mathcal R }}_{* }({ \mathcal N }):= \mathop{\min }\limits_{{{\rm{\Lambda }}}_{\pm }\in \mathrm{CSPO}}\left\{2p+1:(1+p){{\rm{\Lambda }}}_{+}-p{{\rm{\Lambda }}}_{-}={ \mathcal N },p\geqslant 0\right\},\end{eqnarray} \tag{ 232 }$

and the magic capacity is defined as

$\begin{eqnarray}&&C({ \mathcal N }):= \mathop{\max }\limits_{| \phi \rangle \in \mathrm{Stab}}{ \mathcal R }[(\mathrm{id}\otimes { \mathcal N })(| \phi \rangle \langle \phi | )].\end{eqnarray} \tag{ 233 }$

They are related by the inequality [21]

$\begin{eqnarray}&&{ \mathcal R }({{\rm{\Phi }}}_{{ \mathcal N }})\leqslant C({ \mathcal N })\leqslant {{ \mathcal R }}_{* }({ \mathcal N }),\end{eqnarray} \tag{ 234 }$

where ${ \mathcal R }(\cdot )$ is the robustness of magic (see section 2) and ${{\rm{\Phi }}}_{{ \mathcal N }}$ is the normalized Choi–Jamiołkowski operator of ${ \mathcal N }$ . The authors of [21] further developed two matching simulation algorithms that scale quadratically with these channel measures. Here, we compare their approach with the one described in section 6.1 for simulating noisy qudit circuits. Note that neither the proof of (234) nor the static Monte Carlo algorithm of [21] depends on the dimensionality of the underlying system, so those results can be generalized to any qudit system with odd prime dimension.

Thus, we consider an n-qudit system with the underlying Hilbert space ${{ \mathcal H }}_{d}^{\otimes n}$ , where d is an odd prime. Consider a noisy circuit consisting of the sequence ${\{{{ \mathcal N }}_{l}\}}_{l=1}^{L}$ of channels acting on the initial state $| {0}^{n}\rangle$ , after which a computational basis measurement is performed. To describe the simulation algorithm based on channel robustness [21], we assume each ${{ \mathcal N }}_{j}$ has the optimal decomposition with respect to the set of CSPOs

$\begin{eqnarray}&&{{ \mathcal N }}_{j}=(1+{p}_{j}){{ \mathcal N }}_{j,0}-{p}_{j}{{ \mathcal N }}_{j,1},\end{eqnarray} \tag{ 235 }$

where ${{ \mathcal R }}_{* }({{ \mathcal N }}_{j})=2{p}_{j}+1$ . For any $\vec{k}\in {{\mathbb{Z}}}_{2}^{L}$ , define

$\begin{eqnarray}&&{p}_{\vec{k}}=\displaystyle \prod _{{k}_{j}=0}(1+{p}_{j})\displaystyle \prod _{{k}_{j}=1}(-{p}_{j})\qquad \parallel p{\parallel }_{1}=\displaystyle \sum _{\vec{k}}| {p}_{\vec{k}}| =\displaystyle \prod _{j}{{ \mathcal R }}_{* }({{ \mathcal N }}_{j})={{ \mathcal R }}_{* }.\end{eqnarray} \tag{ 236 }$

We sample a $\vec{k}\in {{\mathbb{Z}}}_{2}^{L}$ from the distribution $| {p}_{\vec{k}}| /\parallel p{\parallel }_{1}$ and simulate the evolution ${{ \mathcal N }}_{L,{k}_{L}}\,\circ \,\cdots \,\circ \,{{ \mathcal N }}_{2,{k}_{2}}\,\circ \,{{ \mathcal N }}_{1,{k}_{1}}$ ⁶ . To achieve accuracy $\epsilon$ and success probability $1-\delta$ , it suffices to take

$\begin{eqnarray}&&\displaystyle \frac{2}{{\epsilon }^{2}}{{ \mathcal R }}_{* }^{2}\mathrm{log}\left(\displaystyle \frac{2}{\delta }\right)\end{eqnarray} \tag{ 237 }$

samples.
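The sampling step of the robustness-based algorithm can be sketched as follows. Because the weights in (236) are products over the layers, the distribution $| {p}_{\vec{k}}| /\parallel p{\parallel }_{1}$ factorizes, so each bit can be drawn independently, with ${k}_{j}=1$ occurring with probability ${p}_{j}/(2{p}_{j}+1)$ . The function names are hypothetical, and the simulation of the sampled CSPO sequence itself is not shown.

```python
import itertools
import random

def quasi_weight(p_values, k):
    """The signed weight p_k of (236) for a bit string k."""
    weight = 1.0
    for p, bit in zip(p_values, k):
        weight *= -p if bit else 1.0 + p
    return weight

def l1_norm(p_values):
    """Product of the channel robustness values 2 p_j + 1."""
    norm = 1.0
    for p in p_values:
        norm *= 2.0 * p + 1.0
    return norm

def exact_l1_norm(p_values):
    """Brute-force sum of |p_k| over all bit strings (small L only)."""
    return sum(abs(quasi_weight(p_values, k))
               for k in itertools.product((0, 1), repeat=len(p_values)))

def sample_branch(p_values, rng):
    """Sample k with probability |p_k| / ||p||_1.

    Returns the bit string and the signed factor ||p||_1 * sign(p_k) by
    which the outcome of the sampled stabilizer circuit is reweighted.
    """
    k = []
    factor = 1.0
    for p in p_values:
        robustness = 2.0 * p + 1.0
        bit = 1 if rng.random() < p / robustness else 0
        k.append(bit)
        factor *= -robustness if bit else robustness
    return k, factor
```

Every sampled branch carries a factor of magnitude $\parallel p{\parallel }_{1}$ , which is why the sample count scales with the square of the total channel robustness.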

To compare it with the mana-based simulation algorithm, we first prove that the exponentiated mana of a quantum channel is always smaller than or equal to the channel robustness. To establish the separation, we introduce the robustness of magic with respect to non-negative Wigner function as follows.

Definition 9. Given a quantum state ρ, the robustness of magic with respect to non-negative Wigner function is defined as

$\begin{eqnarray}&&{{ \mathcal R }}_{{{ \mathcal W }}_{+}}(\rho ):= \min \left\{2p+1\,:\rho =(1+p)\sigma -p\omega ,\omega ,\sigma \in {{ \mathcal W }}_{+}\right\}.\end{eqnarray} \tag{ 238 }$

Since $\mathrm{Stab}\subset {{ \mathcal W }}_{+}$ , we have

$\begin{eqnarray}&&{{ \mathcal R }}_{{{ \mathcal W }}_{+}}({{\rm{\Phi }}}_{{ \mathcal N }})\leqslant { \mathcal R }({{\rm{\Phi }}}_{{ \mathcal N }})\leqslant C({ \mathcal N })\leqslant {{ \mathcal R }}_{* }({ \mathcal N }).\end{eqnarray} \tag{ 239 }$

Proposition 27. Given a quantum channel ${ \mathcal N }$ , the following inequality holds

$\begin{eqnarray}&&{2}^{{ \mathcal M }({ \mathcal N })}={{ \mathcal M }}_{{ \mathcal N }}\leqslant {{ \mathcal R }}_{* }({ \mathcal N }),\end{eqnarray} \tag{ 240 }$

and the inequality can be strict.

Proof. Suppose $\{p,{{\rm{\Lambda }}}_{\pm }\}$ is an optimal solution of the minimization (232) defining ${{ \mathcal R }}_{* }({ \mathcal N })$ .

Then, we have

$\begin{eqnarray}&&{{ \mathcal M }}_{{ \mathcal N }}=\mathop{\max }\limits_{{\bf{u}}}\parallel { \mathcal N }({A}_{{\bf{u}}}){\parallel }_{W,1}\,\,\,\,\end{eqnarray} \tag{ 241 }$

$\begin{eqnarray}&&=\,\mathop{\max }\limits_{{\bf{u}}}\parallel (1+p){{\rm{\Lambda }}}_{+}({A}_{{\bf{u}}})-p{{\rm{\Lambda }}}_{-}({A}_{{\bf{u}}}){\parallel }_{W,1}\end{eqnarray} \tag{ 242 }$

$\begin{eqnarray}&&\leqslant \,\mathop{\max }\limits_{{\bf{u}}}\left[(1+p)\parallel {{\rm{\Lambda }}}_{+}({A}_{{\bf{u}}}){\parallel }_{W,1}+p\parallel {{\rm{\Lambda }}}_{-}({A}_{{\bf{u}}}){\parallel }_{W,1}\right]\end{eqnarray} \tag{ 243 }$

$\begin{eqnarray}&&=\,2p+1\,\,\,\,\,\,\end{eqnarray} \tag{ 244 }$

$\begin{eqnarray}&&=\,{{ \mathcal R }}_{* }({ \mathcal N }).\,\,\,\,\,\end{eqnarray} \tag{ 245 }$

The inequality in (243) follows from the triangle inequality. The equality in (244) follows since ${{\rm{\Lambda }}}_{\pm }\in \mathrm{CSPO}$ , which implies that $\parallel {{\rm{\Lambda }}}_{\pm }({A}_{{\bf{u}}}){\parallel }_{W,1}=1$ for any ${\bf{u}}$ .

Furthermore, we demonstrate the strict separation between ${2}^{{ \mathcal M }({ \mathcal N })}$ and ${{ \mathcal R }}_{* }({ \mathcal N })$ via the following example. Let us consider the diagonal unitary

$\begin{eqnarray}{U}_{\theta }=\left(\begin{array}{ccc}{{\rm{e}}}^{{\rm{i}}\theta /9} & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & {{\rm{e}}}^{-{\rm{i}}\theta /9}\end{array}\right).\end{eqnarray} \tag{ 246 }$

Note that the T gate is a special case, given by ${U}_{2\pi }$ . Due to equation (239), the separation between ${ \mathcal M }({U}_{\theta })$ and $\mathrm{log}{{ \mathcal R }}_{{{ \mathcal W }}_{+}}({{\rm{\Phi }}}_{{U}_{\theta }})$ in figure 5 indicates that the mana of a channel can be strictly smaller than the channel robustness and magic capacity, i.e.

$\begin{eqnarray}&&{{ \mathcal M }}_{{U}_{\theta }}\lt {{ \mathcal R }}_{{{ \mathcal W }}_{+}}({{\rm{\Phi }}}_{{U}_{\theta }})\leqslant C({U}_{\theta })\leqslant {{ \mathcal R }}_{* }({U}_{\theta }).\end{eqnarray} \tag{ 247 }$

This concludes the proof.■
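The quantity ${{ \mathcal M }}_{{U}_{\theta }}$ appearing in this example can be evaluated numerically with a short sketch. It assumes the qutrit phase-space point operators ${A}_{{\bf{u}}}={T}_{{\bf{u}}}{A}_{0}{T}_{{\bf{u}}}^{\dagger }$ with ${A}_{0}$ the parity operator $| j\rangle \mapsto | -j\,\mathrm{mod}\,3\rangle$ . For a diagonal unitary with diagonal entries ${t}_{j}$ , the channel Wigner function then vanishes unless the two phase points share their shift coordinate $a$ , and otherwise equals $\tfrac{1}{3}{\sum }_{j}{\omega }^{2(b-b^{\prime} )(a-j)}{t}_{2a-j}{\bar{t}}_{j}$ with $\omega ={{\rm{e}}}^{2\pi {\rm{i}}/3}$ . This closed form is a simplification worked out for the sketch and is not stated in the text; the function names are hypothetical.

```python
import cmath
import math

D = 3
OMEGA = cmath.exp(2j * math.pi / D)

def u_theta(theta):
    """Diagonal entries of the unitary U_theta in (246)."""
    return [cmath.exp(1j * theta / 9), 1 + 0j, cmath.exp(-1j * theta / 9)]

def diagonal_unitary_negativity(phases):
    """max_u sum_v |W(v | u)| for the channel rho -> U rho U^dagger,
    where U = diag(phases) acts on a single qutrit."""
    worst = 0.0
    for a in range(D):          # shift coordinate, preserved by a diagonal U
        for b in range(D):      # phase coordinate of the input point
            total = 0.0
            for b_out in range(D):
                s = sum(OMEGA ** (2 * (b - b_out) * (a - j))
                        * phases[(2 * a - j) % D]
                        * complex(phases[j]).conjugate()
                        for j in range(D))
                total += abs(s) / D
            worst = max(worst, total)
    return worst

def mana(phases):
    """Logarithmic negativity (base 2) of the diagonal unitary channel."""
    return math.log2(diagonal_unitary_negativity(phases))
```

Calling `diagonal_unitary_negativity(u_theta(theta))` over a grid of angles gives the exponentiated mana as a function of $\theta$ ; the identity and the qutrit $Z$ gate return the value one, as expected for stabilizer operations.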

**Figure 5.** Comparison between ${{ \mathcal M }}_{{U}_{\theta }}$ and ${{ \mathcal R }}_{{{ \mathcal W }}_{+}}({{\rm{\Phi }}}_{{U}_{\theta }})$ for $\pi \leqslant \theta \leqslant 2\pi$ . The gap indicates that ${{ \mathcal M }}_{{U}_{\theta }}$ is strictly smaller than $C({U}_{\theta })$ and ${{ \mathcal R }}_{* }({U}_{\theta })$ .

Applying proposition 27 to the channels ${\{{{ \mathcal N }}_{l}\}}_{l=1}^{L}$ , we find that

$\begin{eqnarray}&&{{ \mathcal M }}_{\to }=\displaystyle \prod _{j}{{ \mathcal M }}_{{{ \mathcal N }}_{j}}\leqslant \displaystyle \prod _{j}{{ \mathcal R }}_{* }({{ \mathcal N }}_{j})={{ \mathcal R }}_{* }.\end{eqnarray} \tag{ 248 }$

Thus for an n-qudit system with odd prime dimension, the sample complexity of the mana-based approach is never worse than the algorithm of [21] based on channel robustness. Furthermore, the separation demonstrated in (247) indicates that the mana-based algorithm can be strictly faster for certain quantum circuits.

The above example shows that the mana of a quantum channel can be smaller than its magic capacity [21], due to (247), but it is not clear whether this relation holds for general quantum channels.

7. Examples

7.1. Non-stabilizerness under depolarizing noise

For near-term quantum technologies, physical noise may occur during quantum information processing. One common quantum noise model is given by the depolarizing channel:

$\begin{eqnarray}&&{{ \mathcal D }}_{p}(\rho )=(1-p)\rho +\displaystyle \frac{p}{{d}^{2}-1}\displaystyle \sum _{\displaystyle \genfrac{}{}{0em}{}{0\leqslant i,j\leqslant d-1}{(i,j)\ne (0,0)}}{X}^{i}{Z}^{j}\rho {\left({X}^{i}{Z}^{j}\right)}^{\dagger },\end{eqnarray} \tag{ 249 }$

where $p\in [0,1]$ and $X,Z$ are the generalized Pauli operators.

Let us suppose that depolarizing noise occurs after the implementation of a T gate. From figure 6, we find that if the depolarizing noise parameter p is higher than or equal to 0.62, then the channel ${{ \mathcal D }}_{p}\,\circ \,{ \mathcal T }$ cannot generate any non-stabilizerness. That is, the channel ${{ \mathcal D }}_{p}\,\circ \,{ \mathcal T }$ becomes CPWP after this cutoff.

**Figure 6.** Non-stabilizerness of T gate after depolarizing noise. The solid red line quantifies the classical simulation cost of noisy circuits ${{ \mathcal D }}_{p}\,\circ \,{ \mathcal T }$ . The dashed blue line gives the upper bound of magic generating capacity of ${{ \mathcal D }}_{p}\,\circ \,{ \mathcal T }$ .

Another interesting case is the CCX gate. Let us suppose that depolarizing noise occurs in parallel after the implementation of the CCX gate. The mana of ${{ \mathcal D }}_{p}^{\otimes 3}\,\circ \,{\rm{CCX}}$ is plotted in figure 7, where we see that it decreases linearly and reaches zero at $p\approx 0.75$ .

**Figure 7.** Non-stabilizerness of CCX gate after depolarizing noise. The solid line also quantifies the classical simulation cost of the noisy circuit ${{ \mathcal D }}_{p}^{\otimes 3}\,\circ \,{\rm{CCX}}$ .

7.2. Werner–Holevo channel

An interesting qutrit channel is the qutrit Werner–Holevo channel [100]:

$\begin{eqnarray}&&{{ \mathcal N }}_{\mathrm{WH}}(V)=\displaystyle \frac{1}{2}[(\mathrm{Tr}\,V){\mathbb{1}}-{V}^{T}].\end{eqnarray} \tag{ 250 }$

In what follows, we find that the Werner–Holevo channel maps any quantum state to a free state in ${{ \mathcal W }}_{+}$ (state with non-negative Wigner function), while its amortized magic is given by our channel measure. This also indicates that the ancillary reference system is necessary to consider in the study of the resource theory of magic channels.

Proposition 28. For the qutrit Werner–Holevo channel ${{ \mathcal N }}_{\mathrm{WH}}$ ,

$\begin{eqnarray}&&{{ \mathcal N }}_{\mathrm{WH}}(\rho )\in {{ \mathcal W }}_{+},\end{eqnarray} \tag{ 251 }$

for any input state $\rho$ (which is restricted to be a state of the channel input system), while

$\begin{eqnarray}&&{{ \mathcal M }}^{{ \mathcal A }}({{ \mathcal N }}_{\mathrm{WH}})={ \mathcal M }({{ \mathcal N }}_{\mathrm{WH}})=\mathrm{log}\displaystyle \frac{5}{3},\end{eqnarray} \tag{ 252 }$

$\begin{eqnarray}&&\,{\theta }_{\max }^{{ \mathcal A }}({{ \mathcal N }}_{\mathrm{WH}})={\theta }_{\max }({{ \mathcal N }}_{\mathrm{WH}})=\mathrm{log}\displaystyle \frac{5}{3}.\end{eqnarray} \tag{ 253 }$

Proof. On the one hand, for any input state $\rho$ , we find that

$\begin{eqnarray}&&{W}_{{{ \mathcal N }}_{\mathrm{WH}}(\rho )}({\bf{u}})=\displaystyle \frac{1}{6}\mathrm{Tr}\,{A}_{{\bf{u}}}({\mathbb{1}}-{\rho }^{T})=\displaystyle \frac{1}{6}(1-\mathrm{Tr}\,{A}_{{\bf{u}}}{\rho }^{T})\geqslant \displaystyle \frac{1}{6}(1-\parallel {A}_{{\bf{u}}}{\parallel }_{\infty })\geqslant 0,\end{eqnarray} \tag{ 254 }$

where the first inequality follows because $\parallel {A}_{{\bf{u}}}{\parallel }_{\infty }=\parallel {A}_{0}{\parallel }_{\infty }=1$ , since ${A}_{{\bf{u}}}={T}_{{\bf{u}}}{A}_{0}{T}_{{\bf{u}}}^{\dagger }$ , the matrix ${T}_{{\bf{u}}}$ is unitary, and ${A}_{0}$ for a qutrit is explicitly given by the following unitary transformation:

$\begin{eqnarray}{A}_{0}=\left[\begin{array}{ccc}1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0\end{array}\right].\end{eqnarray} \tag{ 255 }$

On the other hand, we can set ${\rho }_{{RA}}$ to be the maximally entangled state, and we find from numerical calculations that

$\begin{eqnarray}&&{ \mathcal M }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{\mathrm{WH}})({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})=\mathrm{log}\displaystyle \frac{5}{3}.\end{eqnarray} \tag{ 256 }$

Meanwhile, we find from numerical calculations that ${{ \mathcal M }}_{{{ \mathcal N }}_{\mathrm{WH}}}={\max }_{{\bf{u}}}\parallel {{ \mathcal N }}_{\mathrm{WH}}({A}_{{\bf{u}}}){\parallel }_{W,1}=5/3$ , that is, ${ \mathcal M }({{ \mathcal N }}_{\mathrm{WH}})=\mathrm{log}(5/3)$ , which by proposition 8 means that

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{{RA}}}[{ \mathcal M }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{\mathrm{WH}})({\rho }_{{RA}}))-{ \mathcal M }({\rho }_{{RA}})]={ \mathcal M }({{ \mathcal N }}_{\mathrm{WH}})=\mathrm{log}\displaystyle \frac{5}{3}.\end{eqnarray} \tag{ 257 }$

Similarly, we find from numerical calculations that

$\begin{eqnarray}&&\mathop{\sup }\limits_{{\rho }_{{RA}}}[{\theta }_{\max }(({\mathrm{id}}_{R}\otimes {{ \mathcal N }}_{\mathrm{WH}})({\rho }_{{RA}}))-{\theta }_{\max }({\rho }_{{RA}})]={\theta }_{\max }({{ \mathcal N }}_{\mathrm{WH}})=\mathrm{log}\displaystyle \frac{5}{3}.\end{eqnarray} \tag{ 258 }$

This concludes the proof.■
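The two numerical facts used in this proof can be reproduced with a minimal, dependency-free sketch. It builds the nine qutrit phase-space point operators as ${A}_{{\bf{u}}}={T}_{{\bf{u}}}{A}_{0}{T}_{{\bf{u}}}^{\dagger }$ with ${T}_{{\bf{u}}}$ proportional to ${Z}^{{a}_{1}}{X}^{{a}_{2}}$ (the overall phase of ${T}_{{\bf{u}}}$ cancels), applies the Werner–Holevo map (250), and evaluates the Wigner trace norm. All function names are hypothetical, and the ordering of the phase points is an arbitrary choice of the sketch.

```python
import cmath
import math

D = 3
OMEGA = cmath.exp(2j * math.pi / D)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(D)] for i in range(D)]

def transpose(a):
    return [[a[j][i] for j in range(D)] for i in range(D)]

def trace(a):
    return sum(a[i][i] for i in range(D))

def phase_point_operators():
    """The nine operators A_u = T_u A_0 T_u^dagger with A_0 from (255)."""
    identity = [[1 if i == j else 0 for j in range(D)] for i in range(D)]
    a0 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
    shift = [[1 if i == (j + 1) % D else 0 for j in range(D)] for i in range(D)]
    clock = [[OMEGA ** i if i == j else 0 for j in range(D)] for i in range(D)]
    ops = []
    for a1 in range(D):
        for a2 in range(D):
            t = identity
            for _ in range(a1):
                t = matmul(t, clock)       # builds Z^{a1}
            for _ in range(a2):
                t = matmul(t, shift)       # then Z^{a1} X^{a2}
            ops.append(matmul(matmul(t, a0), dagger(t)))
    return ops

def werner_holevo(v):
    """The map (250): V -> (Tr(V) * identity - V^T) / 2."""
    tr = trace(v)
    vt = transpose(v)
    return [[0.5 * ((tr if i == j else 0) - vt[i][j]) for j in range(D)]
            for i in range(D)]

def wigner_function(op, ops):
    """W_op(u) = Tr[A_u op] / d for every phase point u."""
    return [(trace(matmul(a, op)) / D).real for a in ops]

def channel_negativity(channel, ops):
    """max_u || channel(A_u) ||_{W,1}, the exponentiated mana of (223)."""
    return max(sum(abs(w) for w in wigner_function(channel(a), ops))
               for a in ops)

def pure_state(amplitudes):
    """Density matrix of the normalized vector with the given amplitudes."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in amplitudes))
    psi = [x / norm for x in amplitudes]
    return [[psi[i] * complex(psi[j]).conjugate() for j in range(D)]
            for i in range(D)]

OPS = phase_point_operators()
WH_NEGATIVITY = channel_negativity(werner_holevo, OPS)
```

With these conventions, `WH_NEGATIVITY` evaluates to $5/3$ up to floating-point error. Applying `wigner_function` to `werner_holevo(pure_state(...))` gives a non-negative Wigner function even for an input such as $(| 1\rangle -| 2\rangle )/\sqrt{2}$ , whose own Wigner function takes a negative value; this illustrates why the reference system is needed to detect the magic of this channel.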

8. Conclusion

We have introduced two efficiently computable magic measures of quantum channels to quantify and characterize the non-stabilizer resource possessed by quantum channels. These two channel measures have applications in evaluating magic generating capability, gate synthesis, and classical simulation of noisy quantum circuits. More generally, our work establishes fundamental limitations on the processing of quantum magic using noisy quantum circuits, opening new perspectives for the investigation of the resource theory of quantum channels in FTQC.

One future direction is to explore tighter evaluations of the distillable magic of quantum channels. We think that it would also be interesting to explore other applications of our channel measures and generalize our approach to the multi-qubit case.

Acknowledgments

We are grateful to the anonymous referees for several comments that helped to improve our paper. XW acknowledges support from the Department of Defense. MMW acknowledges support from NSF under grant no. 1714215. YS acknowledges support from the Army Research Office (MURI award W911NF-16-1-0349), the Canadian Institute for Advanced Research, the National Science Foundation (grants 1526380 and 1813814), and the US Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Quantum algorithms Teams and Quantum Testbed Pathfinder programs.

Appendix A.: On completely positive maps with non-positive mana

Lemma 29. Let ${ \mathcal E }$ be a completely positive map. If ${ \mathcal M }({ \mathcal E })\leqslant 0$ , then ${ \mathcal E }$ is trace non-increasing on the set ${{ \mathcal W }}_{+}$ (quantum states with non-negative Wigner function).

Proof. The assumption ${ \mathcal M }({ \mathcal E })\leqslant 0$ is equivalent to ${\max }_{{\bf{u}}}{\sum }_{{\bf{v}}}| {W}_{{ \mathcal E }}({\bf{v}}| {\bf{u}})| \,\leqslant \,1$ . For $\rho \in {{ \mathcal W }}_{+}$ , consider that

$\begin{eqnarray}&&\mathrm{Tr}\,[{ \mathcal E }(\rho )]=\left|\mathrm{Tr}\,[{ \mathcal E }(\rho )]\right|=\left|\displaystyle \sum _{{\bf{v}},{\bf{u}}}{W}_{{ \mathcal E }}({\bf{v}}| {\bf{u}}){W}_{\rho }({\bf{u}})\right|\end{eqnarray} \tag{ A1 }$

$\begin{eqnarray}&&\,\,\,\,\leqslant \,\displaystyle \sum _{{\bf{v}},{\bf{u}}}\left|{W}_{{ \mathcal E }}({\bf{v}}| {\bf{u}})\right|\left|{W}_{\rho }({\bf{u}})\right|\,\leqslant \,\displaystyle \sum _{{\bf{u}}}\left|{W}_{\rho }({\bf{u}})\right|\,=\,\displaystyle \sum _{{\bf{u}}}{W}_{\rho }({\bf{u}})=1.\end{eqnarray} \tag{ A2 }$

The first equality follows from the assumption that ${ \mathcal E }$ is completely positive and ρ is positive semi-definite. The second equality follows from lemma 1. The first inequality follows from the triangle inequality and the second from the assumption ${ \mathcal M }({ \mathcal E })\leqslant 0$ . The final two equalities follow from the assumption $\rho \in {{ \mathcal W }}_{+}$ and (11). ■

Appendix B.: Data processing inequality for the Wigner trace norm

Lemma 30. For any operator $Q$ and CPWP channel ${\rm{\Pi }}$ , the following inequality holds

$\begin{eqnarray}&&\parallel {\rm{\Pi }}(Q){\parallel }_{W,1}\leqslant \parallel Q{\parallel }_{W,1}.\end{eqnarray} \tag{ B1 }$

Proof. Let us suppose that $Q={\sum }_{{\bf{v}}}{q}_{{\bf{v}}}{A}_{{\bf{v}}}$ . Then we have $\parallel Q{\parallel }_{W,1}={\sum }_{{\bf{v}}}| {q}_{{\bf{v}}}|$ . Furthermore

$\begin{eqnarray}&&\parallel {\rm{\Pi }}(Q){\parallel }_{W,1}={\parallel \displaystyle \sum _{{\bf{v}}}{q}_{{\bf{v}}}{\rm{\Pi }}({A}_{{\bf{v}}})\parallel }_{W,1}\end{eqnarray} \tag{ B2 }$

$\begin{eqnarray}&&\,\,\,=\,\displaystyle \sum _{{\bf{u}}}\left|\displaystyle \sum _{{\bf{v}}}{q}_{{\bf{v}}}\displaystyle \frac{1}{d}\mathrm{Tr}\,[{A}_{{\bf{u}}}{\rm{\Pi }}({A}_{{\bf{v}}})]\right|\end{eqnarray} \tag{ B3 }$

$\begin{eqnarray}&&\,\,=\,\displaystyle \sum _{{\bf{u}}}\left|\displaystyle \sum _{{\bf{v}}}{q}_{{\bf{v}}}{W}_{{\rm{\Pi }}}({\bf{u}}| {\bf{v}})\right|\end{eqnarray} \tag{ B4 }$

$\begin{eqnarray}&&\,\,\leqslant \,\displaystyle \sum _{{\bf{u}}}\displaystyle \sum _{{\bf{v}}}| {q}_{{\bf{v}}}| {W}_{{\rm{\Pi }}}({\bf{u}}| {\bf{v}})\end{eqnarray} \tag{ B5 }$

$\begin{eqnarray}&&\,=\,\displaystyle \sum _{{\bf{v}}}| {q}_{{\bf{v}}}| =\parallel Q{\parallel }_{W,1}.\end{eqnarray} \tag{ B6 }$

This concludes the proof. ■
