What is the theory behind PEC?#

Probabilistic error cancellation (PEC) [5, 10, 13] is a noise-aware error mitigation technique which is based on two main ideas:

  • The first idea is to express ideal gates as linear combinations of implementable noisy gates. These linear combinations are called quasi-probability representations [14];

  • The second idea is to probabilistically sample from the previous quasi-probability representations to approximate quantum expectation values via a Monte Carlo average.

Note: In this section we follow the same notation of [1].

Quasi-probability representations#

In PEC, each ideal gate \(\mathcal G_i\) of a circuit of interest \(\mathcal U = {\mathcal G}_t \circ \dots \circ {\mathcal G}_2 \circ {\mathcal G}_1 \) is represented as a linear combination of noisy implementable operations \({\mathcal O_{i, \alpha}}\) (i.e., operations that can be directly applied with a noisy backend):

\[ \mathcal G_i = \sum_\alpha \eta_{i, \alpha} \mathcal O_{i, \alpha}, \quad \eta_{i, \alpha} \in \mathbb R, \]

where the calligraphic symbols (\(\mathcal U\), \(\mathcal G_i\), \(\mathcal O_{i, \alpha}\)) stand for super-operators acting on the density matrix of the qubits as linear quantum channels.

The real coefficients \({\eta_{i,\alpha}}\) form a quasi-probability distribution [14] with respect to the index \(\alpha\). Their sum is normalized but, differently from standard probabilities, they can take negative values:

\[ \sum_\alpha \eta_{i,\alpha}=1, \qquad \gamma_i = \sum_\alpha |\eta_{i, \alpha}| \ge 1.\]

The constant \(\gamma_i\) quantifies the negativity of the quasi-probability distribution which is directly related to the error mitigation cost associated to the gate \(\mathcal G_i\).

Note: In principle, the gate index “\(i\)” in the noisy operations \(\mathcal O_{i, \alpha}\) could be dropped. However, we keep it to explicitly define gate-dependent basis of implementable operations, consistently with the structure of the OperationRepresentation class discussed in What additional options are available in PEC?.

Error cancellation#

The aim of PEC is estimating the ideal expectation value of some observable \(A=A^\dagger\) with respect to the quantum state prepared by an ideal circuit of interest \(\mathcal U\) acting on some initial reference state \(\rho_0\) (typically \(\rho_0= |0\dots 0 \rangle \langle 0 \dots 0 |\)).

Replacing each gate \(\mathcal G_i\) with its noisy representation, we can express the ideal expectation value as a linear combination of noisy expectation values:

\[ \langle A \rangle_{\rm ideal}= {\rm tr}[A \mathcal U (\rho_0)] = \sum_{\vec{\alpha}} \eta_{\vec{\alpha}} \langle A_{\vec{\alpha}}\rangle_{\rm noisy} \]

where we introduced the multi-index \(\vec{\alpha}=(\alpha_1, \alpha_2, \dots ,\alpha_t)\) and

\[ \eta_{\vec{\alpha}} := \prod_{i=1}^t \eta_{i, \alpha_i}, \quad \langle A_{\vec{\alpha}}\rangle_{\rm noisy} := {\rm tr}[A \Phi_{\vec{\alpha}}(\rho_0)], \quad \Phi_{\vec{\alpha}} := \mathcal O_{t, \alpha_t} \circ \dots \circ \mathcal O_{2, \alpha_2} \circ \mathcal O_{1, \alpha_1}. \]

The coefficients \(\{ \eta_{\vec{\alpha}} \}\) form a quasi-probability distribution for the global circuit over the noisy circuits. Indeed it is easy to check that, at the level of super-operators, we have:

\[ \mathcal U = \sum_{\vec{\alpha}} \eta_{\vec{\alpha}} \Phi_{\vec{\alpha}}. \]

The one-norm \(\gamma\) of the global quasi-probability distribution is the product of those of the gates:

\[ \sum_{\vec \alpha} \eta_{\vec{\alpha}}=1, \qquad \gamma = \sum_{\vec{\alpha}} |\eta_{\vec \alpha}| = \prod_{i=1}^{t} \gamma_i. \]

All the noisy expectation values \(\langle A_{\vec{\alpha}}\rangle_{\rm noisy}\) can be directly measured with a noisy backend since they only require circuits composed of implementable noisy operations. In principle, by combining all the noisy expectation values, one could compute the ideal result \(\langle A \rangle_{\rm ideal}\). Unfortunately this approach requires executing a number of circuits which grows exponentially with the circuit depth and which is typically unfeasible.

An important fact at the basis of PEC is that, for weak noise, only a small number of noisy expectation values actually contribute to the linear combination because many of the coefficients \(\eta_{\vec \alpha}\) are negligible. For this reason, it is more efficient to estimate \(\langle A \rangle_{\rm ideal}\) using an importance-sampling Monte Carlo approach as described in the next section.

Monte Carlo estimation#

To apply a Monte Carlo estimation, we need to replace quasi-probabilities with positive probabilities. This can be achieved as follows:

\[ \mathcal{G_i} = \sum_{\alpha} \eta_{i, \alpha} \mathcal{O}_{i, \alpha} = \gamma_i \sum_{\alpha} p_i(\alpha) \, {\rm sgn}(\eta_{i, \alpha})\, \mathcal{O}_{i, \alpha},\]

where \(p_{i}(\alpha)=|\eta_{i, \alpha}|/\gamma_i\) is a valid probability distribution with respect to \(\alpha\).

If for each gate \(\mathcal G_i\) of the circuit we sample a value of \(\alpha\) from \(p_{i}(\alpha)\) and we apply the corresponding noisy operation \(\mathcal O_{i, \alpha}\), we are effectively sampling a noisy circuit \(\Phi_{\vec{\alpha}}\) from the global probability distribution \(p(\vec{\alpha})= |\eta_{\vec{\alpha}}| / \gamma\).

Therefore, at the level of quantum channels, we have:

\[ \mathcal U = \gamma \mathbb E \left\{ {\rm sgn}(\eta_{i, \vec{\alpha}}) \Phi_{\vec{\alpha}} \right\},\]

where \(\mathbb E\) is the sample average over many repetitions of the previous probabilistic procedure and \({\rm sgn}(\eta_{\vec{ \alpha}}) = \prod_i {\rm sgn}(\eta_{i, \alpha})\). As a direct consequence, we can express the ideal expectation value as follows:

\[\langle A \rangle_{\text{ideal}} = \gamma\, \mathbb E \left\{ {\rm sgn}(\eta_{\vec{\alpha}}) \langle A_{\vec{\alpha}}\rangle_{\rm noisy} \right\}.\]

By averaging a finite number \(N\) of samples we obtain an unbiased estimate of \(\langle A \rangle_{\text{ideal}}\). Assuming a bounded observable \(|A|\le 1\), the number of samples \(N\) necessary to approximate \(\langle A\rangle_{\text{ideal}}\) within an absolute error \(\delta\), scales as [11]:

\[ N \propto \frac{\gamma^2}{\delta^2}. \]

The term \(\delta^2\) in the denominator is due to the stochastic nature of quantum measurements and is present even when directly estimating an expectation value without error mitigation. The \(\gamma^2\) factor instead represents the sampling overhead associated to PEC. For weak noise and short circuits, \(\gamma\) is typically small and PEC is applicable with a reasonable cost. On the contrary, if a circuit is too noisy or too deep, the value of \(\gamma\) can be so large that PEC becomes unfeasible.