Measurement setup
Table of Contents
The full experimental setup is shown in Extended Data Fig. 1. The device is measured in a Bluefors XLD400 dilution refrigerator. The device is mounted on the cold finger. Within T = 1 K, elevation from the base temperature is achieved by switching on and tuning the heater near the sample. Temperatures above 1 K are attained by reducing the amount of He mixture in the circulation and consequently the cooling power. Temperature control becomes nontrivial above 1.2 K and nonviable above 1.5 K.
An external d.c. magnetic field is supplied by an American Magnetics AMI430 magnet. The magnetic field points in the [110] direction of the Si lattice. The d.c. voltages are supplied with Basel Precision Instruments SP927 LNHR DACs through d.c. lines with a bandwidth of 0–20 Hz. Dynamic voltage pulses are generated with a Quantum Machines OPX and combined with d.c. voltages by custom voltage combiners at the 50 K stage in the refrigerator. The OPX has a sampling time of 4 ns. The dynamic pulse lines in the fridge have a bandwidth of 0–50 MHz, which translates into a minimum rise time of 20 ns. Microwave pulses are synthesized using a Keysight PSG8267D Vector Signal Generator with the baseband I/Q and pulse modulation signals from the OPX. The modulated signal spans from 250 kHz to 44 GHz but is bandlimited by the fridge line and the d.c. block.
The charge sensor comprises a singleisland SET connected to a tank circuit for reflectometry measurement. The return signal is amplified by a Cosmic Microwave Technology CITFL1 LNA at the 4 K stage and a Minicircuits ZX60P33ULN+ LNA followed by two Minicircuits ZFL1000LN+ LNAs at room temperature. The Quantum Machines OPX generates the tones for the RFSET and digitizes and demodulates the signals after the amplification.
Device tuneup
We first load the electrons according to the mapping of the doubledot charge configurations over a large range, using lockin charge sensing measurement^{61} with the RFSET. The measurement can be done in the physical gate basis by sweeping V_{P1} and V_{P2}, as shown in Extended Data Fig. 2a, or in the virtual gate basis by sweeping V_{P1} − V_{P2} and V_{J}, as shown in Fig. 1a. In the virtual gate basis, voltages of −0.32 V_{J} and −0.25 V_{J} are applied on P1 and P2 to compensate for the effect of pulsing J. During operation, each dot is loaded with an odd number of electrons, from which the unpaired electron carries the spin information. This is denoted as the (m + 1, n + 1) charge state in the charge maps, where m and n are even numbers.
The tuneup proceeds with locating the PSB region around the interdot charge transition, as indicated by the dashed square in Extended Data Fig. 2c,d. The initial PSB search involves loading a mixed spin state in (m + 1, n + 1), which has some probability of being evenparity (↓↓⟩ or ↑↑⟩), and subsequently pulsing to a location near the interdot charge transition point. Singleshot charge readout is performed before and after reaching the location and the final readout signal is provided by subtracting the two signals. Except at ultralow B_{0}, the readout mechanism is dominated by parity readout because of the relatively large dE_{Z} between the two qubits^{32}. An evenparity spin state appears as blockaded in the PSB region, which translates to a lower radiofrequency signal compared with that from an unblockaded state. The averaged radiofrequency signal, therefore, indicates the probability of having an evenparity state across multiple shots.
The twolevel behaviour in the PSB region is used to perform singleshot spin readout. The readout signal in each shot of the experiment is compared with a preset threshold that lies between the two levels, as we see in the readout histograms in Fig. 2b. We assign value 1 to a blockaded readout, and value 0 to an unblockaded readout. Finally, we average over all shots to obtain P_{blockade} for the statistics.
Extended Data Fig. 2f shows the ESR spectrum as a function of V_{J}, in which we identify two regimes. At V_{J} < 1.175 V, only two transitions pertaining to the driven rotation of the individual qubits are detected. Driven over time, these transitions correspond to the Rabi oscillations in Fig. 1e. At V_{J} > 1.175 V, in which the exchange energy is large, we see four transitions among the four twoqubit states corresponding to the controlled rotation operations^{38,50}. The layout of the transitions, together with the background signal, shows the composition of the initialized qubit state. The traces in Fig. 2a are taken from these measurements at high V_{J}. A more scalable twoqubit operation is the electrically pulsed controlled phase operation (CZ)^{35,36}. This is adopted in this work to construct the CZ gate (Extended Data Fig. 2g), or the DCZ gate in the main text.
Algorithmic initialization
When the qubit energy hf_{qubit} is greater than the thermal energy k_{B}T, electronspin qubit initialization may rely on intrinsic polarization mechanisms such as spindependent tunnelling from a reservoir^{62,63,64}, PSB^{17,18,49,65} or relaxation^{16,43,66}. Higherfidelity singlequbit state preparation can be achieved using initialization by measurement^{39,67} and conditional singlequbit pulses^{9,68}. These approaches either partially rely on intrinsic polarization or require readout with a reservoir, which are incompatible with operation at elevated temperatures. In this work, we design a generic twoqubit algorithmic initialization protocol that works in conditions for which hf_{qubit} is comparable to or less than k_{B}T. The method is applicable to a largescale qubit array, in which initialization and readout are performed pairwise^{16,49,65}.
The algorithmic initialization protocol, as shown in Extended Data Fig. 3a, proceeds as follows:

1.
Enter (m + 1, n + 1) to create two unpaired spins in the doubledot system.

2.
This results in one of the ↓↓⟩, ↓↑⟩, ↑↓⟩ and ↑↑⟩ states. The probability of creating the ground state ↓↓⟩ decreases as the temperature increases, as the thermal energy becomes comparable or greater than the qubit exchange coupling and the Zeeman energies.

3.
Ramp to the PSB region for parity readout and apply a filter that rejects oddparity states. The parity readout preserves the evenparity states as long as it is performed faster than the spin relaxation time^{32}.

(a)
If the state is unblockaded and thus determined as an oddparity (↓↑⟩, ↑↓⟩) or excited state, the initialization is restarted.

(b)
If the state is blockaded and thus determined as evenparity (↓↓⟩, ↑↑⟩), the initialization proceeds to the next stage.

(a)

4.
This results in either ↓↓⟩ or ↑↑⟩, with an increased probability of ↓↓⟩ from step 3. We calibrate the CZ gate at this stage, either from the exchangeinduced splitting of the ESR transitions (Extended Data Fig. 2f) or from the CZ oscillations (Extended Data Fig. 2g).

5.
A zeroCNOT (zCNOT) gate^{23} is performed to convert ↑↑⟩ into ↑↓⟩, leaving ↓↓⟩ unchanged. The construction of the zCNOT gate in this work is shown in Extended Data Fig. 2g.

6.
Ramp to the PSB region for parity readout, and apply a filter that rejects oddparity states.

(a)
If the state is unblockaded and thus determined as ↑↓⟩ or an excited state, the initialization is restarted.

(b)
If the state is blockaded and thus determined as ↓↓⟩, the initialization is determined to be completed.

(a)

7.
The resulting state is purely ↓↓⟩.
The if conditions above are implemented using realtime logic in the FPGA.
The protocol can also be adapted to prepare any other state on the parity basis. ↑↓⟩ and ↓↑⟩ can be prepared from ↓↓⟩ with a microwave π pulse on Q1 and Q2. ↑↑⟩ can be prepared by replacing the zCNOT with CNOT in the algorithm.
We test the algorithmic initialization in a wide range of B_{0} from 1 T down to 25 mT. The results at different stages of the protocol are seen in Fig. 2a and Extended Data Fig. 3b. Stage I, the outcome of a 100μs ramp into the operation point, has a mixture of ↓↓⟩, ↓↑⟩, ↑↓⟩ and ↑↑⟩ states and the measured ESR transitions are almost indistinguishable. After Stage II, the output is a mixture of ↓↓⟩ and ↑↑⟩ with the oddparity states or excited states filtered out through PSB, which can be identified from the associated ESR transitions. Stage III converts the remnant ↑↑⟩ into ↑↓⟩, which is then filtered out through PSB because of the odd parity.
It is also important to assess the time cost for the algorithmic initialization, as it involves multiple control and readout iterations. The table in Extended Data Fig. 3b breaks down the time spent on control and readout. We see that the readout integration time t_{integration} dominates the time consumption. At B_{0} = 85 mT and T = 1 K, the full algorithmic initialization takes an average of around three iterations, which totals around 150 μs. Evaluating this in the context of different B_{0} and temperatures, we obtain the dependence shown in Extended Data Fig. 3c,d. At ultralow B_{0}, for which a reduction in the control and readout fidelity is seen, N_{iteration} decreases, possibly because the system deviates from the parity basis. Higher B_{0} provides a larger qubit energy, increasing the likelihood of obtaining a ↓↓⟩ state after the load ramp and reducing N_{iteration}. Similarly, N_{iteration} also increases with higher temperatures. At B_{0} above 1 T, the onset of excited statelevel crossings enhances spin randomization after the load ramp, and thus more N_{iteration} is required. We expect that N_{iteration} may be reduced by incorporating corrective control based on measurement^{9,68} to accelerate the polarization towards the target state.
SPAM error analysis with repeated readout
A more comprehensive SPAM error analysis uses machine learning of the increased statistics from multiple measurements. The experimental sequence consists of initialization followed by repeated parity readout that results in a series of binary measurement outcomes m_{1}, m_{2}, …, m_{n}, where m_{i} ∈ {evenparity = 0, oddparity = 1}. This initialization(measurement)^{n} sequence is performed 1,000 shots.
A hidden Markov model (HMM) can describe this series of measurements formalism in which the true, but hidden, spin state s_{1}, s_{2}, …, s_{n} follows the Markov chain and the measurement outcomes, m_{i}, are probabilistically related to the underlying spin state. Three different tensors completely determine HMMs:

1.
A start probability vector, Π, encoding the initializing probabilities in each spin state.

2.
A transition probability matrix, A, encoding the probabilities of transiting between spin states during measurements.

3.
A measurement probability matrix, Θ, encoding the probability of the measurement outcomes conditioned on the current hidden spin state.
To find the likely HMM model for a given set of data, we perform expectation maximization in which we maximize the marginal likelihood, which is dependent on the marginalized hidden spin state, such that
$$\begin{array}{l}L({\boldsymbol{\Pi }},{A},{\Theta }\,;{\bf{m}})\,:=\,p({\bf{m}}{\boldsymbol{\Pi }},{A},{\Theta })\\ \,\,=\int \,p({\bf{s}},{\bf{m}}{\boldsymbol{\Pi }},{A},{\Theta }){\rm{d}}{\bf{s}}.\end{array}$$
(1)
For HMM models, there exists the Baum–Welch algorithm that can perform this expectation maximization by an iterative update rule, without the need for backpropagation of gradients^{69}.
We use the Cramer–Rao bound to quantify the level of uncertainty in these parameters when fitted by expectation maximization^{70}. The Cramer–Rao bound states that if \({{\rm{est}}}_{{\boldsymbol{\theta }}}({\bf{m}})\) is an unbiased estimate of the parameters \({\boldsymbol{\theta }}:=({\boldsymbol{\Pi }},{A},{\Theta })\) given the data m, such as that produced by expectation maximization, then
$${{\rm{cov}}}_{{\boldsymbol{\theta }}}\left({{\rm{est}}}_{{\boldsymbol{\theta }}}({\bf{m}})\right)\ge I{\left({\boldsymbol{\theta }};{\bf{m}}\right)}^{1},$$
(2)
where \(I{({\boldsymbol{\theta }};{\bf{y}})}_{ij}={\partial }^{2}\log L({\boldsymbol{\theta }}\,;{\bf{m}})/\partial {\theta }_{i}\partial {\theta }_{j}\), the Fisher information matrix. Therefore, we can obtain lower bounds on the uncertainty of each parameter from the diagonal elements of the inverse of the Fisher information matrix. We used the Forward–Backward algorithm to compute the marginal likelihood defined in equation (1) needed to compute the Fisher information matrix.
Finally, we use the Viterbi algorithm to compute the most likely set of true spin states that gave rise to the set of measurements given a set of model parameters^{69,71}.
Crosstalk correction
The relatively small ΔE_{Z} even at higher B_{0} requires cancellation of crosstalk between the two qubits, that is, the effect on the other qubit when one qubit is being driven. This can be addressed to the first order by considering the following aspects.
To cancel offresonance driving, we enforce
$$\sqrt{\Delta {E}_{{\rm{Z}}}^{2}+{f}_{{\rm{Rabi}}}^{2}}=N{f}_{{\rm{Rabi}}},$$
(3)
where f_{Rabi} is the Rabi frequency of the target qubit, and N = 4, 8, 12, …. Consequently, each π/2 microwave pulse on the target qubit incurs a full 2πN offresonance rotation on the ancilla qubit, as exemplified in Extended Data Fig. 7a. Failure to cancel the offresonance driving can result in large errors under parity readout, as shown in Extended Data Fig. 7b. With N = 4, this cancellation criterion dictates the fastest Rabi possible and is therefore expected to limit the singlequbit gate fidelities, especially at low B_{0} where ΔE_{Z} is small. The full set of f_{Rabi} used for singlequbit randomized benchmarking at different B_{0} is shown in Extended Data Fig. 7c. In this case, we can alternatively execute X(π/2) as a 3π/2 gate for faster driving at the cost of redundancy. We implemented this with the three and fiveelectron qubit at 0.1 T, 1.2 K in Fig. 3d.
In twoqubit sequence runs, it is also necessary to correct AC Stark shift by an amount of
$$\frac{{f}_{{\rm{Rabi}}}^{2}}{2\Delta {E}_{{\rm{Z}}}},$$
(4)
apart from cancelling the offresonance driving. Extended Data Fig. 7d measures the AC Stark shift on an ancilla qubit by preparing it on the equator, driving it offresonantly and projecting the phase. Before the correction, the AC Stark shift is seen as the linear fringes that correspond to the phase accumulation given by equation (4).
We note that the above cancellation of crosstalk does not prevent it from incurring errors. The perturbation on the ancilla qubit induces decoherence. At ultralow B_{0} at which ΔE_{Z} becomes diminishing, higherorder crosstalk terms cannot be neglected, and the control of individual qubits becomes unmanageable. However, these problems are circumvented in the SMART control scheme, which addresses all the qubits simultaneously.
Randomized benchmarking
Singlequbit randomized benchmarking sequences for Fig. 3d–e are constructed from elementary π/2 gates [X(π/2), Z(π/2), −X(π/2), −Z(π/2)], π gates [X(π), Z(π)] and an I gate. Each Clifford gate contains one physical elementary gate on average, excluding the virtual Z(π/2) and Z(π) gates.
To optimize the singlequbit gate fidelity, we study different B_{0} (Fig. 3d) and tightly confine the qubits with low barrier gate voltages to reduce noise coupling. In singlequbit randomized benchmarking, the coherent driving decouples the qubit from noise to a certain extent^{72}, and the random rotations of the qubit also have the effect of refocusing^{38,73}. Here we optimize the microwave power and thus f_{Rabi}, such that the spins are driven quickly without excessive microwaveinduced noise^{46}.
Twoqubit randomized benchmarking sequences for Fig. 4b are constructed from singlequbit elementary π/2 gates [X_{1}(π/2), Z_{1}(π/2), X_{2}(π/2), Z_{2}(π/2)] for Q1 and Q2, and a twoqubit elementary gate DCZ. Each Clifford gate contains 1.8 singlequbit elementary gates and 1.5 twoqubit elementary gates on average. All gates are sequentially executed, which means Q1 idles while X_{2}(π/2) or Z_{2}(π/2) takes place, and the same for Q2. The generated random sequences are used in both randomized benchmarking and FBT. In the case of IRB, we incorporate an interleaved DCZ gate between adjacent Clifford gates. The experimental implementation and the analysis protocol are shown in Extended Data Fig. 9a,b, and the IRB results are shown in Extended Data Fig. 9c.
We then fit the randomized benchmarking decay curve to the formula^{38,41}
$$a{{\rm{e}}}^{{(bx)}^{c}}+d,$$
(5)
from which 1 − 0.5b gives the Clifford fidelity in singlequbit randomized benchmarking, and 1 − 0.75b gives the Clifford fidelity in twoqubit randomized benchmarking. The term c represents the decay exponent and reflects the error Markovianity; a is subjected to the readout fidelity and d is close to 0.5.
From the twoqubit IRB decays, we first obtain an IRB fidelity^{50} of 99.8 ± 0.2% at T = 0.1 K and 97.7 ± 1.5% at T = 1 K for the DCZ gate. This fidelity reflects the combined effect of dephasing during t_{exchange} and echoing in the DCZ gate and the results can be understood from the stronger temperature dependence of \({T}_{2}^{{\rm{Hahn}}}\) compared with that of \({T}_{2}^{{\rm{* }}}\). We also note the numerical instabilities in IRB fidelities, which result in large error bars.
Fast Bayesian tomography
FBT^{53} is an agile gate set process tomography protocol that can selfconsistently reconstruct all gate set process matrices based on previous calibration. In principle, FBT learns and updates the model using the gate sequence information and its experimental outcome. In this work, we feed FBT with the variablelength twoqubit randomized benchmarking sequences and the corresponding experimental data. Clifford gates in the randomized benchmarking sequences are decomposed into their elementary gate implementation of X_{1}(π/2), Z_{1}(π/2), X_{2}(π/2), Z_{2}(π/2) and DCZ. The randomized benchmarking experiments at T = 0.1 K and T = 1 K run through 32,000 and 26,000 sequences, respectively, sufficient for FBT to reliably reconstruct the error channels. We feed the native parity readout results directly to FBT, without converting them to the standard twoqubit measurement basis.
To initiate the FBT analysis, we must bootstrap the model from educated guesses to help the analysis converge with a finite amount of experiments. Here, we do this by injecting guessed fidelity numbers as introduced in refs. ^{53,74}. FBT models each noisy gate \(\widetilde{G}\) as the product of the noise channel \(\mathop{G}\limits^{ \sim }=\varLambda G\) and the ideal gate G, in which the noise channel Λ is linearized about I by expressing it as Λ = I + ε. Each update of the FBT analysis is essentially on the statistics of the noise channel residuals ε. Extended Data Fig. 9d shows the reconstructed Pauli transfer matrices of the DCZ gate. Supplementary Information shows the reconstructed noise channel residuals of the three physical elementary gates DCZ, X_{1}(π/2), and X_{2}(π/2) at T = 0.1 K and T = 1 K.
As FBT does not guarantee that the reconstructed channels are physical or flag any gauge ambiguity, we perform CPTP projection and gauge optimization over the entire gate set at the output stage.
FBT extracts DCZ fidelities of 99.15 ± 0.13% at T = 0.1 K and 98.92 ± 0.67% at T = 1 K. Here, a singlequbit gate on one qubit always leaves the other qubit idling, which considerably limits the singlequbit process fidelities (Supplementary Information) and consequently the Clifford fidelity in twoqubit randomized benchmarking, even at T = 0.1 K (Fig. 4b). However, the reduction in the Clifford fidelity from 0.1 K to 1 K mainly originates from the degradation of the DCZ gate, exhibiting a similar factor.
Error taxonomy with pyGSTi
When examining the fidelity results, we are also interested in understanding the dominant error sources behind the DCZ gate infidelity and their variation at different temperatures. FBT is a flexible and efficient gate set process tomography that enables us to extract gate errors from randomized sequence runs^{53,74}. To categorize the gate errors, we perform postprocessing of the tomography results obtained by FBT using tools for decomposing errors implemented in the pyGSTi package^{59,60}.
Error taxonomy for FBT can be achieved by converting the noise channels Λ for each gate to their error generator \({\mathbb{L}}\) using the following relationship:
$$G=\varLambda {G}_{0}={{\rm{e}}}^{{\mathbb{L}}}{G}_{0},$$
(6)
where G is the estimated noisy gate, and G_{0} is the ideal gate.
Using the pyGSTi package^{59,60}, we project \({\mathbb{L}}\) into the subspace of Hamiltonian and stochastic errors, extracting the coefficients of each elementary error generator. We perform this analysis on each of the gates [DCZ, X_{1}(π/2) and X_{2}(π/2)] for both temperatures of 0.1 K and 1 K. The coefficients of the elementary error generators are represented in the Pauli basis and presented in the Supplementary Information. The five largest components of the Hamiltonian and stochastic errors for the DCZ gate are shown in Fig. 4c.
We also estimate the generator or entanglement infidelity \(1{{\mathcal{F}}}_{{\rm{ent}}}\) based on these error coefficients, given by^{60}
$$1{{\mathcal{F}}}_{{\rm{ent}}}\approx \sum _{P}{s}_{P}+\sum _{P}{h}_{P}^{2},$$
(7)
where the sum is performed over the extracted coefficients and P denotes nonidentity Pauli elements. The approximation is validated by the domination of Hamiltonian errors over stochastic errors in magnitude. To obtain the average gate fidelities (\({{\mathcal{F}}}_{{\rm{avg}}}\)), which are the quantities quoted based on IRB and FBT measurements, it can be connected to \({{\mathcal{F}}}_{{\rm{ent}}}\) in the following way^{75}:
$${{\mathcal{F}}}_{{\rm{avg}}}=\frac{d\cdot {{\mathcal{F}}}_{{\rm{ent}}}+1}{d+1},$$
(8)
where d is the dimension of the Hilbert space (4 for a twoqubit system). This means that generally stochastic errors contribute more to the gate infidelities, even in the case in which the magnitudes of the Hamiltonian errors are larger.