### PDS 70 system

Table of Contents

PDS 70 (V1032 Cen) is a K7-type star in the Upper Centaurus-Lupus subgroup (*d* = 113.4 ± 0.5 pc (ref. ^{36})) in a late stage of accretion^{22} with an estimated age of 5.4 ± 1.0 Myr (ref. ^{37}). The disk around PDS 70 (refs. ^{38,39,40}) hosts two actively accreting protoplanets: PDS 70 b and PDS 70 c, which reside in an approximately 54 au annular gap between an inner and outer disk^{8,9}. The presence of an inner dusty disk in the PDS 70 system has been inferred from both near-infrared scattered light and ALMA images^{10,11,12}. The 855 μm dust continuum emission from the innermost disk regions is confined within the orbit of PDS 70 b (approximately 22 au; Extended Data Fig. 1), putting an upper limit to the inner disk radial extent of approximately 18 au (ref. ^{12}). A population of small dust grains may be responsible for the observed inner disk emission although the current low dust mass estimates could support the simultaneous presence of small and large dust grains^{41}.

### Observations and data reduction

The PDS 70 disk (CD-40-8434) was observed with MIRI^{13,14} on 1 August 2022 as part of the Guaranteed Time Observation (GTO) programme 1282 (PI: Th. Henning) with number 66. The disk was observed in FASTR1 readout mode with a four-point dither pattern in the negative direction for a total on-source exposure time of 4,132 s. The MRS^{15} mode was used, which has four IFUs. Each IFU (referred to as channel) covers a different wavelength range and splits the field of view into spatial slices. Calibration and processing of IFU observations produces three-dimensional spectral cubes. The latter are used to extract a final spectrum covering the MIRI 4.9–22.5 μm range and is a composite of the four IFUs: channel 1 (4.9–7.65 μm; *R* ≈ 3,400), channel 2 (7.51–11.71 μm; *R* ≈ 3,000), channel 3 (11.55–18.02 μm; *R* ≈ 2,400) and channel 4 (17.71–22.5 μm; *R* ≈ 1,600). Each channel is in turn composed of three sub-bands: SHORT (A), MEDIUM (B) and LONG (C) leading to a total of 12 wavelength bands.

We processed the PDS 70 data using a hybrid data reduction pipeline made from the combination of the JWST Science Calibration pipeline^{42} (v.1.8.4) stages 1 to 3, with dedicated routines based on the Vortex Image Processing (VIP) package^{43,44} for bad pixel correction, background subtraction and removal of spikes affecting the final spectrum. Specifically, data reduction proceeded as follows: (1) the class Detector1 of the JWST pipeline was used to process uncalibrated raw data files using Calibration Reference Data System (CRDS) context jwst_1019.pmap and default parameters; (2) apart from pixels flagged in the Data Quality (DQ) extension, we identified additional bad pixels with both an iterative sigma clipping algorithm and through a cross-shaped match filter, and corrected them using a two-dimensional Gaussian kernel; (3) Spec2 was then used with default parameters, but the background subtraction was skipped, and dedicated reference files^{45} for photometric and fringe flat calibrations were adopted; (4) as no dedicated background observation was taken, we leveraged the four-point dither pattern to obtain a first guess on the background map, then refined it using a median filter, which both smoothed the background estimate and removed residual star signals from it; (5) Spec3 was then run with default parameters, apart from the master_background and outlier_detection steps that were turned off, in the latter case to avoid spurious spectral features resulting from under-sampling of the point spread function; (6) we recentred the spectral cubes by applying the shifts maximizing the cross-correlation between cube frames, and found the location of the point spread function centroid with a two-dimensional Gaussian fit on the median image of each aligned cube; (7) spectra were then extracted with aperture photometry in 2.5-FWHM apertures centred on the centroid location (with the FWHM equal to approximately 1.22*λ*/*D* with the telescope diameter *D* equal to 6.5 m), corrected for both aperture size using correction factors^{18}, and spikes affecting individual spaxels included in the aperture; (8) spectra were finally corrected for residual fringes at the spectrum level and the bands were stitched together based on the level of the shorter wavelength bands (these rescaling factors were systematically within 3% of the photometric solution). Spurious data reduction artefacts were masked at 5.12, 5.90, 7.45 and 7.50 μm. The uncertainty associated with each photometric measurement considers both Poisson and background noise, combined in quadrature. The former is an output from the JWST pipeline, whereas for the latter we propagated our background estimate obtained in step (4) through Spec3, and considered the standard deviation of the fluxes inferred in independent 2.5-FWHM apertures as a proxy for the background noise uncertainty. The final relative uncertainties range from approximately 0.1% to approximately 1.1% with respect to the continuum at the shortest and longest wavelengths considered in this work (4.9 and 22.5 μm, respectively).

### Local continuum fit

Extended Data Fig. 2 shows the local baseline fit for the 7 μm region. The continuum level is determined by selecting line-free regions and adopting a cubic spline interpolation (scipy.interpolate.interp1d). This continuum is then subtracted from the original data to produce the spectrum shown in Fig. 3.

### Correction for the photospheric emission

The observed near-infrared colour index, *J* − *K*_{s} = 1.01 (2MASS), indicates a small colour excess, *E*(*J* − *K*_{s}) ≈ 0.16, which could be due to either interstellar extinction or a true excess in the *K*_{s} band, or a combination of the two. By assuming that the brightness in the *K*_{s} band is essentially due to photospheric emission we can, by using a model atmosphere provided by P. Hausschildt (personal communication, 2023), extrapolate the contribution into the mid-infrared spectral region. The parameters used for the model atmosphere of the PDS 70 K7-star are an effective temperature *T*_{eff} = 4,000 K, surface gravity log(*g*) = 4.5 and solar metallicity. At 5 μm the photospheric contribution amounts to 56 mJy, that is, 44% of the observed flux density. At longer wavelengths, in the 7 μm region in which the water emission is detected, the photosphere amounts to one-third of the observed flux density and the subtraction of the photospheric contribution marginally alters the continuum-subtracted spectrum (Extended Data Fig. 3).

### Slab models fits

The molecular lines are analysed using a slab approach that takes into account optical depth effects. The level populations are assumed to be in LTE and the line profile function to be Gaussian with a FWHM of Δ*V* = 4.7 km s^{−1} (*σ* = 2 km s^{−1})^{25}. The line emission is assumed to originate from a slab of gas with a temperature *T* and a line-of-sight column density of *N*. Under these assumptions and neglecting mutual line opacity overlap, the frequency-integrated intensity of a line is computed as follows:

$$I=\frac{\Delta V}{2\sqrt{{\rm{l}}{\rm{n}}2}{\lambda }_{0}}{B}_{{\nu }_{0}}(T){\int }_{-{\rm{\infty }}}^{+{\rm{\infty }}}(1-\exp (-{\tau }_{0}{{\rm{e}}}^{-{y}^{2}})){\rm{d}}y,$$

(1)

where \({B}_{{\nu }_{0}}(T)\) is the Planck function, *λ*_{0} is the rest wavelength of the line and *τ*_{0} is the optical depth at the line centre *ν*_{0}, with:

$${\tau }_{0}=\sqrt{\frac{{\rm{ln}}2}{{\rm{\pi }}}}\frac{{A}_{{\rm{ul}}}N{\lambda }_{0}^{3}}{4{\rm{\pi }}\Delta V}\left({x}_{{\rm{l}}}\frac{{g}_{{\rm{u}}}}{{g}_{{\rm{l}}}}-{x}_{{\rm{u}}}\right).$$

(2)

In this equation, *x*_{l} and *x*_{u} denote the level population of the lower and upper states, *g*_{l} and *g*_{u} their respective statistical weights and *A*_{ul} the spontaneous downward rate of the transition. The line intensity is then converted into integrated flux \({F}_{{\nu }_{0}}\) assuming an effective emitting area of π*R*^{2} and a distance to the source *d* as:

$${F}_{{\nu }_{0}}={\rm{\pi }}{\left(\frac{R}{d}\right)}^{2}I.$$

(3)

We note that neglecting mutual line overlap for H_{2}O when calculating the line intensity is a valid approximation for N(H_{2}O) ≲ 10^{20} cm^{−2} and significantly reduces the computational time^{46}. Finally, the spectrum is convolved and sampled in the same way as the observed spectrum^{47} and all lines are then summed to prepare a total synthetic spectrum. The molecular data, that is, line positions, Einstein *A* coefficients and statistical weights stem from a previous work^{48}.

### Fitting procedure for H_{2}O vapour lines

The LTE slab model described above is then used to fit the H_{2}O lines in the 6.78–7.36 μm region following a *χ*^{2} method. First, an extended grid of models is computed varying the total column density from 10^{15} to 10^{20} cm^{−2} in steps of 0.17 in log_{10} space and the temperature from 100–1,400 K in steps of 50 K. We further assume an ortho-to-para ratio of 3. For each set of free parameters (*N*, *T*, *R*), a synthetic spectrum is calculated at the spectral resolving power *R *= 2,000 and rebinned to the spectral sampling of the observed spectrum using the slabspec python code^{49}. The adopted spectral resolution is lower that the nominal MIRI MRS spectral resolution in channel 1 (ref. ^{15}); it was selected to account for the observed line broadening. This spectrum is further used to compute the *χ*^{2} value on a spectral channel basis. Specific spectral windows are chosen to avoid contamination by other gas features. This includes all spectral channels falling within 0.02 μm (1,000 km s^{−1}) of any hydrogen recombination line and a 0.01 μm wide spectral window at the position of the* S*(5) line of H_{2} at 6.91 μm. To mitigate the errors induced by the continuum subtraction procedure, we also include only spectral elements falling within 0.004 μm of a water line. For each value of (*N*, *T*), the *χ*^{2} is then minimized by varying the emitting size π*R*^{2}. The resulting *χ*^{2} map is shown in Extended Data Fig. 4 together with the best-fit emitting radius. The confidence intervals are estimated following a previous work^{50} and using a representative noise level of *σ* = 0.15 mJy. We note that, owing to the large number of lines in this crowded region, there is little space to determine the noise on the continuum. Therefore, we estimate the noise level between 7.72 μm and 7.73 μm to avoid contamination by H_{2}O and hydrogen lines.

### Origin of water in PDS 70

At the typical densities of inner disk regions (*n*_{H }≥ 10^{8} cm^{−3}) the chemistry can rapidly reach steady-state conditions and water vapour can form from a simple reaction sequence involving O, H_{2} and OH. Water and OH absorb efficiently in the ultraviolet (that is, water and OH shielding), ensuring the survival of water molecules even in regions of reduced dust opacity^{5}. This mechanism by itself is able to account for the column densities of water vapour detected in this work and is supported by the presence of CO_{2} emission. Small grains in the inner disk provide additional ultraviolet shielding. One question that naturally arises is whether the water vapour in PDS 70 originated before the formation of the giant protoplanets within the gap or whether there is a continuous supply of gas from the outer to the inner disk regions. ALMA high spatial resolution CO observations reveal the presence of gas inside the gap^{11,51}. Observations and models find the gap to be gas depleted^{11} (two to three orders of magnitude assuming an *r*^{−1} surface density profile) and dust depleted^{52}, but not empty. One possibility could be that a population of water-containing dust particles is able to filter through the orbits of PDS 70 b and PDS 70 c, enriching the inner disk reservoir^{12}. Experimental evidence indicates that water chemically bound to complex silicates can be preserved to temperatures up to 400–500 K (refs. ^{53,54}) and thus survive in the regions probed by our observations inside the water snowline. We note that some degree of dust filtering is expected with gas replenishment, as small dust particles can couple to the gas. Therefore, a replenishment of both gas and dust from the outer disk to sustain the water reservoir and hence the PDS 70 accretion rate is possible.

### Fitting procedure for the dust continuum

The 4.9–22.5 μm dust continuum is analysed using a two-layer disk model for the dust emission^{55}. This model was successfully applied to Spitzer-IRS spectra of planet-forming disks^{56}; we follow the same modelling approach here. We rebin the spectrum by averaging 15 spectral points and assign errors *σ* to the rebinned spectral points assuming a normal error distribution with equal weights for each individual spectral element. The stellar photospheric emission is represented by a stellar atmosphere model fitted to optical and near-infrared photometry. The disk model has three spectral components: (1) a hot inner disk *F*_{rim}, (2) an optically thick midplane disk layer *F*_{mp} and (3) an optically thin warm disk surface layer *F*_{sur}. The dust grains representing the disk components are assumed to have power-law temperature distributions, and each is characterized by a minimum and a maximum temperature *T*_{atm}. The disk surface layer is assumed to consist of a number of dust species *i* with different chemical compositions and with a fixed number of grain sizes *j*, all emitting at the same temperatures. The total disk flux can then be written as:

$${F}_{\nu }={F}_{\nu ,{\rm{rim}}}+{F}_{\nu ,{\rm{mp}}}+{F}_{\nu ,{\rm{sur}}}$$

(4)

where

$${F}_{\nu ,{\rm{sur}}}=\mathop{\sum }\limits_{i=1}^{n}\mathop{\sum }\limits_{j=1}^{m}{D}_{\mathrm{i,j}}\,{\kappa }_{\mathrm{i,j}}{\int }_{{T}_{{\rm{atm,max}}}}^{{T}_{{\rm{atm,min}}}}\frac{2{\rm{\pi }}}{{d}^{2}}\,{B}_{\nu }(T)\,{T}^{\frac{2-{q}_{{\rm{atm}}}}{{q}_{{\rm{atm}}}}}{\rm{d}}T$$

(5)

and *B*_{ν}(*T*) is the Planck function, *q*_{atm} is the power-law exponent for the temperature gradient in the disk surface layer, *κ*_{i,j} are the opacities in cm^{2} g^{−1} of dust species *i* with grain size *j*, *d* is the distance to the star and *D*_{i,j} are normalization factors^{55}. We use three grain compositions (with SiO_{2}, SiO_{3} and SiO_{4} stochiometry)^{57,58,59,60,61} and both amorphous and crystalline lattice structures to capture the rich spectral structure evident in the MIRI data. The choice of this set of compositions is based on previous analyses^{56}, which showed that this set of materials is able to capture most spectral variations in planet-forming disks observed with Spitzer-IRS. We use either two or three grain sizes (that is, 0.1, 2 and 5 μm) for each of the dust species. In total, the model has 23 fitting parameters. We use the MultiNest Bayesian fitting algorithm^{62} and the PyMultiNest package^{63} to find the best-fit parameters. The resulting fit and the separate spectral components (star, inner rim, midplane and surface layer) are shown in Fig. 2.

### WISE time-series observations

Extended Data Fig. 6 reports WISE time-series observations of PDS 70. Observations were executed on 2–3 February and 6 February 2010, and on 1–2 August 2010. We note that the source is highly variable and that WISE 4 (W4; 25 μm) is anticorrelated with WISE 1 (W1; 3.4 μm) and WISE 2 (W2; 4.6 μm). Such variability may be ‘seesaw’-like^{19}, for which changes in the scale height of the inner disk wall shadow the disk material located further out. However, a complete ‘seesaw’ profile is not observed, as at wavelengths shorter than 8 μm the MIRI spectrum lies above the IRS spectrum (Fig. 1). This is not surprising as the wavelength of the ‘pivot’ point (that is, the wavelength at which a shift in emission is observed) is dependent on the location of the occulting material with respect to the star, the stellar luminosity and the inclination of the system, with highly inclined systems showing a more complete ‘seesaw’ than more face-on systems such as PDS 70 (*i* = 51.7 ± 0. 1^{°} (ref. ^{11})). Interestingly, WISE 3 (W3; 12 μm) is not anticorrelated with WISE 1 and WISE 2 because of the dominant 10 μm silicate band, which indeed shows a minor offset compared with the longer wavelengths. This indicates that the material contributing to the 10 μm emission is not shadowed. This behaviour is seen if the emission arises from warmer dust closer to the star than the occulting material or further above the disk midplane.

We also note that the difference in aperture size of Spitzer-IRS and MIRI MRS cannot explain the observed variability. The Spitzer-IRS low-resolution spectrograph has a slit width of 3.6*″* for wavelengths shorter than 14 μm and 10.2*″* for wavelengths longer than 14 μm. Although the maximum aperture of Spitzer-IRS at longer wavelengths is larger than that of MIRI MRS, this is not the case for the shorter wavelengths, for which the slit widths are similar for both observatories (approximately 3.6*″* versus 4.0*″*). However, a flux offset is also observed in this spectral region. In addition, in the case when the long wavelength excess would arise from an extended component, a jump in flux level at 14 μm—where the aperture size changes—would be present in the IRS data, but it is absent.