### Sample fabrication

Table of Contents

Graphene, h-BN, MoSe_{2} and WS_{2} layers were exfoliated from bulk crystals onto Si/SiO_{2} (285 nm) substrates and assembled in heterostructures using a standard dry-transfer technique with a poly(bisphenol A carbonate) film on a polydimethylsiloxane (PDMS) stamp. Both samples were encapsulated between two h-BN flakes of 30–40 nm thickness. Optical lithography and electron beam metal deposition were used to fabricate electrodes for the electrical contacts.

### Experimental set-up

Experiments on sample I were done in a dilution refrigerator (base temperature ≈ 20 mK) with free-space optical access, with windows at the still, 4 K and room-temperature stages. Sample II was in turn investigated in a liquid helium bath cryostat (also featuring free-space optical access) and in a second dilution refrigerator unit with a single-mode fibre-based optical access (see ref. ^{41} for a detailed description of this set-up). For free-beam set-ups, the samples were mounted on three piezoelectric nanopositioners. A fibre-coupled confocal microscope was used for optical measurements, with either a single aspheric lens or an objective (numerical aperture = 0.7 for both) focusing the light onto a diffraction-limited spot. A schematic of the optical set-up is shown in Extended Data Fig. 1.

Reflection spectra were measured using a supercontinuum laser with a variable filter as a light source and a spectrometer with a liquid-nitrogen- or Peltier-cooled CCD camera as the detector. For mK-temperature measurements of the magnetic circular dichroism (MCD), we used either a tunable continuous wave single-frequency titanium sapphire laser or a few-nm-wide white light with a central wavelength around the attractive polaron resonance. The reflected light was measured using Geiger-mode avalanche photodiodes (APDs) that enable the detection of low-power signals. To reduce the sensitivity of the MCD measurement to slow drifts in the experiments on sample I, the incident light-polarization was switched between *σ*^{+} and *σ*^{−} at kilohertz rates using an electro-optic modulator. By contrast, sample II was illuminated with linearly polarized light. On reflection from the device, the *σ*^{+}– and *σ*^{−}-polarized components were separated with a polarizing beamsplitter and detected in parallel using two separate APDs. All measurements were power-stabilized with feedback from a photodiode to an acousto-optic modulator or a fibre-coupled variable optical attenuator.

### Background subtraction

Differential reflectance presented in the plots is defined as Δ*R*/*R*_{0} = (*R* − *R*_{0})/*R*_{0}, where *R* is the measured reflection spectrum of the heterostructure and *R*_{0} is the background reflection spectrum on the h-BN flakes away from the TMD flakes. For the MCD measurements, the background reflectance \({R}_{0}^{{\sigma }^{\pm }}\) at the laser frequency is measured in both polarizations at charge neutrality or high electron density (*ν* > 2) for which there is no attractive polaron resonance. The degree of circular polarization is then given by

$${\rho }_{{\rm{AP}}}=\frac{({R}^{{\sigma }^{+}}-{R}_{0}^{{\sigma }^{+}})-({R}^{{\sigma }^{-}}-{R}_{0}^{{\sigma }^{-}})}{({R}^{{\sigma }^{+}}-{R}_{0}^{{\sigma }^{+}})+({R}^{{\sigma }^{-}}-{R}_{0}^{{\sigma }^{-}})}.$$

(3)

### Filling factor calibration

We convert the axis of applied gate voltage to the filling factor axis by finding the maxima of the optical resonances for different voltage ranges, which are equally spaced in voltage and peaked at the integer fillings of the moiré lattice (Extended Data Fig. 2). By extracting the positions of the maxima using linear fits (compare Fig. 1c), we estimate the absolute error in the filling factor to be ≤ 0.005.

### Power dependence of spin polarization

To access the true magnetic ground-state properties of the system, it is essential to ensure that the intensity of the probe light is sufficiently low so as not to perturb the system. Similar to monolayer MoSe_{2} (ref. ^{42}), light illumination leads to depolarization of the spin population in the moiré heterostructure. As the strength of the depolarizing effect depends on both temperature and charge density, it can give rise to misleading artefacts in the measured electronic magnetism. This is directly shown by our filling-factor-dependent measurements of the Curie–Weiss constant carried out at high temperatures *T* > 4 K in the bath cryostat on sample II. In these experiments, the sample was illuminated with a white light of tunable power. The magnetic susceptibility of the electron system was extracted based on the degree of circular polarization of the attractive polaron resonance that was, in turn, determined by fitting its spectral profile with a dispersive Lorentzian lineshape^{41}. On this basis, we were able to analyse the temperature dependence of the inverse magnetic susceptibility for various filling factors and excitation powers. As seen in Extended Data Fig. 3a (for *ν* = 0.75), although the powers used remain in the sub-μW range, they still markedly affect the magnetic response. More specifically, the Curie–Weiss constant is lower for larger excitation powers. This effect is most prominent for low filling factors and becomes indiscernible at *ν* ≳ 1 (Extended Data Fig. 3b).

This power dependence originates primarily from the changes in spin-valley relaxation dynamics of the electron system. As demonstrated in previous studies of TMD monolayers^{43}, the spin relaxation time becomes shorter for larger electron densities and higher temperatures. As a result, if a certain number of electrons undergo a spin flip because of the interaction with optically injected excitons, it takes longer for them to relax back to their ground state when *ν* and *T* are low. For this reason, the magnetic susceptibility determined on exciton injection into the system is lower compared with its unperturbed value. Moreover, the deviation between these two quantities becomes larger for higher excitation powers and lower *ν* and *T* (Extended Data Fig. 3c), which explains the striking power dependence of the Curie–Weiss constant at *ν* < 1 in Extended Data Fig. 3b. In particular, the data in this figure directly show that for the excitons to constitute a nondestructive probe of the electron spin system at *T* > 4 K and 0.5 ≲ *ν* ≲ 1.5, the excitation power needs to be around a few nW.

Owing to the aforementioned temperature dependence of the spin-valley relaxation time, accessing the true magnetic ground-state properties of the electron system at mK temperatures requires us to further reduce the excitation power. As shown in Extended Data Fig. 3d, nW excitation power is sufficient to significantly depolarize the spins at mK temperatures, even in a magnetic field of *B*_{z} = 9 T. By measuring magnetization curves at different levels of excitation power (Extended Data Fig. 4), we find that the requisite power for nondestructive probing, for which the power dependence disappears, is of the order of 10 pW. Taking this into account, we used a resonant laser with 11.7 pW incident power on the sample in our mK measurements. Note that this level of power is about six orders of magnitude below the level at which the laser measurably heats the cold finger in the cryostat. Because the line shape and energy of the attractive polaron do not vary appreciably with gate voltage and magnetic field (density-dependent g-factor *g*_{AP} < 10), measuring the reflectance at a single frequency is equivalent to measuring the area of the peak. At constant linewidth, the reflectance at a single frequency is proportional to the area of the whole peak. Line shifts smaller than the linewidth can be tolerated, because they affect only the absolute reflectance, but not the degree of polarization \({\rho }_{{\rm{AP}}}=({R}^{{\sigma }^{+}}-{R}^{{\sigma }^{-}})/({R}^{{\sigma }^{+}}+{R}^{{\sigma }^{-}})\), which is normalized by the total reflectance. We confirm the frequency independence by measurement with a broadband source filtered spectrally to cover the attractive polaron resonance in the whole doping range. The comparison to the single-frequency measurement is shown in Extended Data Fig. 5.

### Effect of optical spin pumping

Excitation with circularly polarized light can lead to an optical spin-pumping effect. Previous studies have shown that this effect is small for MoSe_{2} (ref. ^{44}). To exclude that a strong optical spin-pumping effect is present in our system, we measure the magnetic field dependence of the attractive polaron reflectance under the same experimental conditions as in the *ρ*_{AP} measurement, but with fixed circular polarization. We repeat the measurement for both *σ*^{+} and *σ*^{−} polarization. The curves are normalized to the range [−1, 1] and plotted together in Extended Data Fig. 6a. The effect of polarization-dependent optical spin pumping is a vertical displacement of the intersection of the two curves away from 0. We find the intersection is displaced to a negative value, indicating that the polarized laser slightly pumps the spins to the valley it probes, thereby reducing the strength of the probed attractive polaron resonance. In Extended Data Fig. 6b, the same data are plotted with the *σ*^{+} data mirrored on the horizontal axis to better visualize the displacement.

The spin-pumping effect is small, and the measurement of the degree of polarization *ρ*_{AP} is also insensitive to it. As shown in Extended Data Fig. 6, the spin pumping always leads to a slight reduction in the reflectance, regardless of which circular polarization the laser has. Therefore, the effect factors out in the definition of the degree of polarization, \({\rho }_{{\rm{AP}}}=({R}^{{\sigma }^{-}}-{R}^{{\sigma }^{+}})/({R}^{{\sigma }^{-}}+{R}^{{\sigma }^{+}})\).

### Detailed magnetization curves

Additional plots of *ρ*_{AP}(*B*_{z}) measured at fixed doping in a wider magnetic field range and with smaller step size are shown in Extended Data Fig. 7. The magnetization evolves smoothly with the applied external magnetic field and reaches its saturation value without any discontinuities. No further increase in *ρ*_{AP} is expected at higher magnetic fields, as the curves for all filling factors overlap with that at *ν* = 1.2, for which the ferromagnetic interactions ensure full spin polarization at low fields. The deviation of the saturation value from ±1 arises from difficulties in proper background subtraction for this particular measurement, for which the magnetic field was varied at a fixed filling factor. In the measurements presented in the main text, in which the filling factor was varied at a fixed magnetic field, this problem does not occur and *ρ*_{AP} reaches ±1 at saturation.

### Temperature calibration

Owing to the heat load on our sample through the electrical wiring and finite thermal conductivity at mK temperatures, the real electron temperature of the sample is expected to be higher than the value obtained from the built-in temperature read-out of the dilution refrigerator based on a resistance measurement, especially close to the base temperature. As the electron temperature of the sample is a crucial quantity for our Curie–Weiss analysis, we use the following model to calibrate it and estimate the associated systematic error: The heat transport responsible for cooling the sample is governed by the steady-state heat equation

$${\boldsymbol{\nabla }}\left(\kappa {\boldsymbol{\nabla }}T\right)={Q}_{{\rm{in}}},$$

(4)

where *κ* is the thermal conductivity and *T* is the temperature, and we assume a constant heat load *Q*_{in} on the sample. At mK temperatures, the heat is transported by electrons through the electrical contacts and wire bonds, so we consider the gradient along only one dimension (along the wire). The electrical conductivity is limited by impurity scattering and therefore independent of temperature, which results in a thermal conductivity proportional to the temperature according to the Wiedemann–Franz law. Setting *κ*(*T*) = *α**T*, the equation becomes

$$\alpha {\left({T}^{{\prime} }(x)\right)}^{2}+\alpha T(x){T}^{{\prime\prime} }(x)={Q}_{{\rm{in}}}.$$

(5)

By integrating twice, we arrive at the solution

$$T(x)=\sqrt{\frac{{Q}_{{\rm{in}}}}{\alpha }{x}^{2}+2T(0){T}^{{\prime} }(0)x+T{(0)}^{2}}.$$

(6)

Using the boundary conditions *T*(0) = *T*_{cryo} (cold-finger temperature according to thermometer read-out) and \({T}^{{\prime} }(0)=0\) (cold finger is well-thermalized), we find the relation

$${T}_{{\rm{sample}}}=\sqrt{{T}_{\min }^{2}+{T}_{{\rm{cryo}}}^{2}},$$

(7)

where *T*_{min} corresponds to the minimum achievable sample temperature and depends on the heat load and thermal conductivity. A previous independent measurement using a quantum dot in the same cryostat ^{45} found that the sample temperature saturated at *T* = 130 mK. Although the sample and electrical contacts are different, it is reasonable to assume a similar minimum temperature that can be reached in the current measurements. We therefore set *T*_{min} = 130 mK and use equation (7) to convert the temperature read-out to the sample temperature for the Curie–Weiss fit. In Extended Data Fig. 8, we plot the result of the Curie–Weiss fit when different values of *T*_{min} in a plausible range are used, showing the effect of a systematic error in the temperature calibration.

Given that at *θ*_{CW} = 0 at *ν* = 1 within our measurement accuracy, we assume a paramagnetic behaviour at *ν* = 1 to calibrate the sample temperature in the measurements on sample II at *T* > 4 K. For a paramagnet with *J* = 1/2, we have d*ρ*_{AP}(*T*)/d*B*_{z} = *g**μ*_{B}/(2*k*_{B}*T*), where *μ*_{B} is the Bohr magneton, *k*_{B} the Boltzmann constant and *g* the electronic *g*-factor. The assumption of paramagnetic behaviour at *ν* = 1 is further confirmed by measured magnetization curves that follow \({\rho }_{{\rm{AP}}}({B}_{z})=\tanh (g{\mu }_{{\rm{B}}}{B}_{z}/(2{k}_{{\rm{B}}}T))\). The value *g* = 4.5 of the *g*-factor can be fixed from this relation using the measured magnetization slope at a known temperature, for example, 4.2 K in a helium bath cryostat. We then use the same relation to extract the temperature from the measured slope at *ν* = 1 when heating the sample.

The temperature values obtained using this method were further verified by analysis of the temperature-induced redshift of the exciton resonance in a MoSe_{2} monolayer region of sample II. As shown in Extended Data Fig. 9, the measured energy *E*_{X}(*T*) of this resonance decreases quadratically with temperature, following the Varshni formula *E*_{X}(*T*) = *E*_{0} − *γ**T*^{2} (ref. ^{46}). The corresponding *γ* = 1.6 μeV K^{−2} agrees well with the values reported in previous studies of MoSe_{2} monolayers carried out in wider temperature ranges^{47}. This finding provides a strong confirmation of the validity of our temperature calibration procedure.

### Curie–Weiss fit

In our measurements, the slope d*ρ*_{AP}/d*B*_{z} can be measured with very high precision, whereas the sample temperature has a relatively large systematic uncertainty, as described above. To take this into account in the fit, we use the uncertainties in temperature rather than those in susceptibility as weights for the data points. This means that in the linear regression of \(a{({\rm{d}}{\rho }_{{\rm{AP}}}/{\rm{d}}{B}_{z})}^{-1}=T-\theta \), the temperature is treated as the dependent variable and the inverse susceptibility as the independent variable — that is, we fit \(T=a{({\rm{d}}{\rho }_{{\rm{AP}}}/{\rm{d}}{B}_{z})}^{-1}+\theta \) with 1/*σ*_{T} as weights. For the uncertainty *σ*_{T} of the temperature, we take \({\sigma }_{T}^{2}={\sigma }_{{\rm{readout}}}^{2}+{({T}_{{\rm{sample}}}-{T}_{{\rm{cryo}}})}^{2}\), where \({\sigma }_{{\rm{readout}}}^{2}\) is the variance of the temperature read-out and the second term quantifies the systematic uncertainty of the temperature as described in the section ‘Temperature calibration’.

### Reproducibility of the low-temperature results on a second device

The signatures of kinetic magnetism were also observed on two different spots for sample II. For each spot, we measured the degree of circular polarization *ρ*_{AP} of the attractive polaron transition as a function of both electron density and external magnetic field in a second dilution refrigerator unit featuring a base temperature of around 80 mK. On this basis, we determined the filling factor dependence of the slope d*ρ*_{AP}/d*B*_{z} around *B*_{z} = 0. Extended Data Figure 10a shows the results obtained for one of the investigated spots together with data from two spots on sample I. In all three cases, the slope is almost constant at *ν* < 1, starts to increase at *ν* > 1 and finally decreases around *ν* = 1.5. Both of these variations in sample II are markedly less abrupt compared with those seen in sample I. We attribute this difference to a larger disorder of the moiré lattice constant in sample II, caused by an unintentional twist angle of 1.3° in sample II, in contrast to the 0° alignment of sample I. Both twist angles were determined from a calibration of the electron density corresponding to *ν* = 1 based on the Landau-level spacing in monolayer MoSe_{2} regions at high magnetic fields^{48}. Owing to the finite twist angle, the moiré lattice constant in sample II is sensitive to variations in the local twist angle, whereas sample I is insensitive to first order. The inhomogeneity of the moiré lattice is responsible for fluctuations of the local filling factor within the examined laser spot. This, in turn, broadens the increase in d*ρ*_{AP}/d*B*_{z} at *ν* > 1, as the enhancement of magnetic susceptibility due to kinetic magnetism is sensitively dependent on *ν*. The presence of excessive filling factor disorder is independently confirmed by the lack of a robust decrease in d*ρ*_{AP}/d*B*_{z} around *ν* = 4/3 for sample II, which is due to the formation of a generalized Wigner crystal.

We want to underline that the slightly lower value of d*ρ*_{AP}/d*B*_{z} at *ν* < 1 in the case of sample II is related to the larger base temperature (≈80 mK) of the second dilution refrigerator unit used for the measurements of this device. This limits the lowest achievable electron temperature, which yields about 210 mK instead of 140 mK for the dilution refrigerator used in the measurements of sample I.

In Extended Data Fig. 10b, we show the same data from sample II together with a measurement at 4.2 K performed in a helium bath cryostat. Similar to Fig. 2b, each curve is multiplied by the temperature of the measurement, highlighting the enhancement of the spin susceptibility for 1 < *ν* < 1.5 at low temperatures with a larger increase than paramagnetic 1/*T* behaviour.

### Theoretical model

To explain the experimental results, we consider a single-band extended *t*–*J* model,

$$\begin{array}{cc}\hat{H}= & -t\,\hat{P}\sum _{\langle i,j\rangle ,\sigma }({\hat{c}}_{i,\sigma }^{\dagger }{\hat{c}}_{j,\sigma }+{\rm{h}}.{\rm{c}}.)\hat{P}\\ & +J\sum _{\langle i,j\rangle }\left({{\bf{S}}}_{i}{{\bf{S}}}_{j}-\frac{1}{4}{\hat{n}}_{i}{\hat{n}}_{j}\right)\\ & -\frac{A}{2}\,\hat{P}\sum _{\langle i,j\rangle ,\sigma }[{\hat{c}}_{i,\sigma }^{\dagger }({\hat{n}}_{i,\bar{\sigma }}+{\hat{n}}_{j,\bar{\sigma }}){\hat{c}}_{j,\sigma }+{\rm{h}}.{\rm{c}}.]\hat{P}\\ & +V\sum _{i < j}\frac{{\hat{n}}_{i}{\hat{n}}_{j}}{|i-j|}-h\sum _{i}{\hat{S}}_{i}^{z}+\sum _{i}{\Delta }_{i}{\hat{n}}_{i},\end{array}$$

(8)

where \({\hat{c}}_{i,\sigma }\) is the annihilation operator for an electron with spin *σ* on site *i*, \({\hat{n}}_{i}={\sum }_{\sigma }{\hat{n}}_{i,\sigma }={\sum }_{\sigma }{\hat{c}}_{i,\sigma }^{\dagger }{\hat{c}}_{i,\sigma }\) is the electron number operator on site *i*, and \({\vec{{\rm{S}}}}_{i}\) is the electron spin operator on site *i*. The subscripts \(\sigma \) and \(\bar{\sigma }\) denote opposite electron spins within a sum. The parameter *t* is the hopping strength, *J* is the spin–spin interaction, *A* is the assisted hopping, *V* is the strength of Coulomb interaction projected into the lowest Wannier orbital, *h* is the external magnetic field in units of *g**μ*_{B}, *Δ*_{i} is the on-site potential energy and \(\widehat{P}\) is a projector that projects out doublons in the holon-doped regime and holons in the doublon-doped regime. We consider a null spin–spin interaction *J* = 0 motivated by the experimental results pointing to a paramagnetic response at *ν* = 1. Moreover, to implement the long-range coupling proportional to *V*, we cut the range of the interaction at third neighbours. The on-site potential energy *Δ*_{i} takes into account spatial variations of the moiré potential. We consider a uniformly distributed disorder *Δ*_{i} ∈ [−*Δ*/2, *Δ*/2) of width *Δ* with a corresponding root-mean-square parameter \(\Delta /\sqrt{12}\).

### Model parameters

To estimate the relevant parameters of the Hamiltonian used in the tensor network simulations, we start from the finite discrete Fourier expansion of the moiré potential,

$$V({\bf{r}})=\mathop{\sum }\limits_{n=1}^{6}{V}_{n}{{\rm{e}}}^{{\rm{i}}{{\bf{G}}}_{n}\cdot {\bf{r}}},$$

(9)

where \({V}_{n}=-{V}_{0}\exp \left[{\rm{i}}{(-1)}^{n-1}\varphi \right]\), and we introduce the reciprocal lattice vectors

$${{\bf{G}}}_{n}=\frac{4\pi }{{a}_{{\rm{m}}}\sqrt{3}}(\begin{array}{c}\cos (\pi n/3)\\ \sin (\pi n/3)\end{array}),$$

(10)

where *a*_{m} is the moiré lattice constant. The parameters *V*_{0} = 6.3 meV and *φ* = 0 are obtained from first-principles calculations.

The single-electron problem is described by the low-energy Hamiltonian,

$$\widehat{H}=-\frac{{\hbar }^{2}}{2{m}^{* }}{{\boldsymbol{\nabla }}}^{2}+\widehat{V}({\bf{r}}),$$

(11)

where we introduce the effective mass *m** = 0.7*m*_{e} of the MoSe_{2} conduction band electrons, where *m*_{e} is the bare electron mass. As the moiré potential has a periodic structure, we can use Bloch’s theorem to write the wavefunctions as

$${\psi }_{{\bf{k}}}^{(n)}({\bf{r}})={u}_{{\bf{k}}}^{(n)}({\bf{r}}){{\rm{e}}}^{{\rm{i}}{\bf{k}}\cdot {\bf{r}}},$$

(12)

where *n* is the band index, **k** is restricted to the first moiré Brillouin zone (BZ) and \({u}_{{\bf{k}}}^{(n)}\) are the Bloch functions. As the Bloch functions have the same periodicity as the moiré potential, \({u}_{{\bf{k}}}^{(n)}({\bf{r}})={u}_{{\bf{k}}}^{(n)}({\bf{r}}+{{\bf{R}}}_{i})\), we can expand them by performing a discrete Fourier transform

$${u}_{{\bf{k}}}^{(n)}({\bf{r}})=\sum _{{\bf{G}}\in {\mathcal{G}}}{c}_{{\bf{k}},{\bf{G}}}^{(n)}{{\rm{e}}}^{{\rm{i}}{\bf{G}}\cdot {\bf{r}}},$$

(13)

where \({\mathcal{G}}\) is the set of all reciprocal lattice vectors. Therefore, the Hamiltonian can be written on the basis of reciprocal lattice vectors as

$${H}_{{\bf{G}},{{\bf{G}}}^{{\prime} }}({\bf{k}})=\frac{{\hbar }^{2}}{2{m}^{* }}{\left({\bf{k}}+{\bf{G}}\right)}^{2}{\delta }_{{\bf{G}},{{\bf{G}}}^{{\prime} }}+\mathop{\sum }\limits_{n=1}^{6}{V}_{n}{\delta }_{{\bf{G}}-{{\bf{G}}}^{{\prime} },{{\bf{G}}}_{n}},$$

(14)

with *δ*_{i,j} the Kronecker delta, which can be diagonalized for each quasi-momentum **k** by using a large set of reciprocal lattice vectors. The ground-state solution corresponds to the lowest band *n* = 0. The associated Wannier wavefunction *w*_{i}(**r**) localized at site **R**_{i} is obtained by performing the change of basis

$${w}_{i}({\bf{r}})=\frac{1}{\sqrt{{\mathcal{N}}}}\sum _{{\bf{k}}\in {\rm{BZ}}}{\psi }_{{\bf{k}}}({\bf{r}}){{\rm{e}}}^{{\rm{i}}{\bf{k}}\cdot {{\bf{R}}}_{i}},$$

(15)

where we drop the band index and introduce the normalization factor \({\mathcal{N}}\).

The interaction potential between charges in the TMDs is given by the Rytova–Keldysh potential^{49,50}

$${V}_{{\rm{RK}}}(r)=\frac{{e}^{2}}{8{\epsilon }_{0}{r}_{0}}\left({H}_{0}\left(\frac{{\epsilon }_{{\rm{r}}}r}{{r}_{0}}\right)-{Y}_{0}\left(\frac{{\epsilon }_{{\rm{r}}}r}{{r}_{0}}\right)\right),$$

(16)

where *H*_{0} is the Struve function, *Y*_{0} the Bessel function of the second kind, *r*_{0} = 3.5 nm the screening length for MoSe_{2}, *ϵ*_{r} = 4.5 the relative permittivity of h-BN as the surrounding medium^{51} and *ϵ*_{0} is the vacuum permittivity. The matrix elements

$$\begin{array}{lll}t & = & -\left\langle {w}_{i}\right|\widehat{H}\left|{w}_{j}\right\rangle =0.75\,{\rm{meV}}\\ U & = & \left\langle {w}_{i},{w}_{i}\right|{V}_{{\rm{RK}}}(| {{\bf{r}}}_{2}-{{\bf{r}}}_{1}| )\left|{w}_{i},{w}_{i}\right\rangle =157\,{\rm{meV}}\\ V & = & \left\langle {w}_{i},{w}_{j}\right|{V}_{{\rm{RK}}}(| {{\bf{r}}}_{2}-{{\bf{r}}}_{1}| )\left|{w}_{i},{w}_{j}\right\rangle =44.6\,{\rm{meV}}\\ J & = & -\left\langle {w}_{i},{w}_{j}\right|{V}_{{\rm{RK}}}(| {{\bf{r}}}_{2}-{{\bf{r}}}_{1}| )\left|{w}_{j},{w}_{i}\right\rangle =-0.61\,{\rm{meV}}\\ A & = & -\left\langle {w}_{i},{w}_{i}\right|{V}_{{\rm{RK}}}(| {{\bf{r}}}_{2}-{{\bf{r}}}_{1}| )\left|{w}_{i},{w}_{j}\right\rangle =6.1\,{\rm{meV}}\end{array}$$

(17)

are evaluated numerically, where \(\left|{w}_{i},{w}_{j}\right\rangle \) denotes a state in which two electrons occupy the neighbouring Wannier orbitals.

### Tensor network simulations

Our finite-temperature tensor network simulations are based on a purification scheme performed in the canonical ensemble. We implement the *U*(1) symmetry associated with the conservation of the total number of electrons, but we do not fix the net magnetization of the system. The finite-temperature density matrix is represented as a matrix product state (MPS) in a doubled Hilbert space. The MPS maximum bond dimension is set to *χ* = 768. The cooling process is performed as indicated in ref. ^{5}. We progressively apply an infinitesimal (*δ**β* = 0.1) Boltzmann factor e^{−δβ/2} by using the *W*_{II} technique^{52}. The finite-temperature calculations are performed in a triangular cylinder of size *L* = *L*_{x} × *L*_{y} = 15 × 3.

To obtain the ground state of the system, we use the density-matrix renormalization group algorithm. We perform simulations in a triangular cylinder of size *L* = *L*_{x} × *L*_{y} = 15 × 6, and we fix the maximum bond dimension of our MPS to *χ* = 1,024. To capture the effects of a disordered on-site potential, we have performed calculations in three different disorder realizations and taken the average. The tensor network calculations have been performed using TeNPy (ref. ^{53}).

### First-principles simulation with DFT

We study TMD heterobilayers with a small twist angle starting from R-stacking, in which every metal (M) or chalcogen (X) atom on the top layer is aligned with the same type of atom on the bottom layer. In a local region of a twisted bilayer, the atom configuration is identical to that of an untwisted bilayer, in which one layer is laterally shifted relative to the other layer by a corresponding displacement vector *d*_{0}. Therefore, the moiré band structures of twisted TMD heterobilayers can be well described by the continuum model.

In particular, \({{\bf{d}}}_{0}=0,-\left({{\bf{a}}}_{1}+{{\bf{a}}}_{2}\right)/3,\left({{\bf{a}}}_{1}+{{\bf{a}}}_{2}\right)/3\), where **a**_{1,2} is the primitive lattice vector of untwisted bilayers, corresponding to three high-symmetry stacking configurations of untwisted TMD bilayers, which we refer to as MM, XM and MX. In MM stacking, the M atom on the top layer is locally aligned with the M atom on the bottom layer, whereas in MX stacking, the M atom on the top layer is locally aligned with the X atom on the bottom layer, likewise for XM. The bilayer structure in these stacking configurations is invariant under three-fold rotation around the *z*-axis.

The density functional theory (DFT) calculations are performed using the generalized gradient approximation with SCAN density functional^{54} with dDsC dispersion correction method, as implemented in the Vienna Ab initio Simulation Package. We note that SCAN + dDsC captures the intermediate-range van der Waals interaction through its semi-local exchange term. Pseudo-potentials are used to describe the electron–ion interactions. We first construct the rigid structure of an R-stacked MoSe_{2}/WS_{2} heterobilayer with vacuum spacing larger than 20 Å to avoid artificial interaction between the periodic images along the *z*-direction. The lattice constants 3.32 Å and 3.19 Å are taken from bulk structures for MoSe_{2} and WS_{2}, respectively. The structure relaxation is performed with force on each atom less than 0.01 eV Å^{−1}. We use 1 × 1 × 1 for structure relaxation and self-consistent calculation because of the expensive computational cost.

For R-stacked MoSe_{2}/WS_{2}, we find that lattice relaxation is significant, which is the main source for the moiré potential. Our DFT calculations at *θ* = 2.7° with 1,545 atoms per unit cell show a significant variation of the layer distance *d* in different regions on the moiré superlattice (Extended Data Fig. 11a). The smallest distance *d* = 6.42 Å is in MX and XM stacking regions, in which a metal atom on the top layer is aligned with a chalcogen atom on the bottom layer and vice versa, whereas the largest distance *d* = 6.78 Å is in MM region, in which metal atoms of both layers are aligned. With the fully relaxed structure, the low-energy moiré conduction bands of R-stacked MoSe_{2}/WS_{2} are found to come from the ±K valley of MoSe_{2} after applying a gating field *E* = 0.5 V nm^{−1}, to be consistent with experimental observations.

From the fitting of moiré conduction bands, we obtain the continuum model parameters of the lowest bands as *V*_{0} = 6.3 meV, *φ* = 0°, with the bulk lattice constant *a*_{0} = 3.32 Å. Therefore, the effective model for the moiré conduction band is an ideal triangular lattice Hubbard model, with the MM region as the single potential minimum (Extended Data Fig. 11d) for the wavefunction plot. To demonstrate the accuracy of the continuum model method, we compare the conduction band structures computed by large-scale DFT directly at *θ* = 2.7° and by the continuum model (Extended Data Fig. 11c). We note that the DFT-calculated spin–orbit splitting of the conduction bands is 22 meV, whereas the bandwidth of the lowest moiré band extracted from continuum model is 10 meV for a twist angle of 0°. As for the moiré valence band, the continuum model parameters can be fitted as *V*_{0} = 1.9 meV and *φ* = 59^{∘}, with nearly identical moiré potential at the MM and MX regions.