CGR
Table of Contents
Annual global atmospheric CGR spanning from 1960 to 2018 is obtained from the Greenhouse Gas Marine Boundary Layer Reference of the National Oceanic and Atmospheric Administration (NOAA/ESRL)50. According to the guideline, the annual CGR in a given year is the difference in CO2 concentration between the end of December and the start of January of that year. In addition, we also use the estimated RLS from the latest Global Carbon Budget 20201 to verify the robustness of our main finding (Extended Data Fig. 1b). RLS is inferred as a residual between emissions, atmospheric CO2 accumulation and the ocean sink.
Climate data
Terrestrial water storage and lagged precipitation
The twin GRACE satellites provide the measurement of changes in terrestrial WS at monthly scale since March 200251. In particular, terrestrial WS is the sum of all above- and below-surface WS, including soil moisture, groundwater, snow, ice and water stored in vegetation and rivers and lakes. To complement the shorter record of observations provided by the GRACE satellites, we use a recently published statistical reconstruction of terrestrial WS that trained on the twin GRACE satellites18. The reconstructed terrestrial WS is based on two different GRACE solutions and three different meteorological forcing datasets. Here, we mainly use the ensemble mean of all members. The details of the statistical approach are documented in ref. 18. Validation against original terrestrial WS supports the reliability of reconstructions in reproducing historical signals at yearly time scale, including tropics (Supplementary Fig. 14 and Supplementary Table 1). The reconstructed terrestrial WS dataset has a spatial resolution of 0.5° × 0.5° and a temporal resolution of one day from 1901 to 2018.
In addition, direct observations of LagP over tropical land are found to capture IAV of aggregated tropical WS anomaly at yearly time scale (Extended Data Fig. 3a and Supplementary Table 1). For instance, the LagP in 2018 is the sum up of the precipitation from July in 2017 to June in 2018. Moreover, consistent with previous findings7,17, the IAV of tropical land lagged precipitation correlates with CGR well (Extended Data Fig. 3b). This helps to explain the rationality of this relationship from the process of WS memory. Therefore, LagP is identified as another efficient proxy for aggregated tropical WS anomaly IAV. Precipitation is obtained from station-based Climate Research Unit (CRU) TS4.03 (ref. 52) and Global Precipitation Climatology Centre (GPCC) Full Data Monthly v.2020. To further confirm that the data quality of station-based tropical precipitation is reliable, we compare station-based CRU precipitation with satellite-based Tropical Rainfall Measuring Mission (TRMM) precipitation (Extended Data Fig. 7a); we also compare CRU precipitation with GPCC precipitation which has much larger gauge stations than CRU (Extended Data Fig. 7b). In fact, compared to the 2000s to 2010s (in which satellite observations confirm the reliability of station-based tropical precipitation IAV), the number of gauge stations was much larger in the 1960s–1990s (Extended Data Fig. 7c). These validations suggest that station-based tropical precipitation IAV is reliable during 1960–2018. TRMM 3B43 precipitation dataset has a spatial resolution of 0.25° × 0.25° and a temporal resolution of 1 month from 1998 to 2019. Both CRU and GPCC have a spatial resolution of 0.5° × 0.5° and a temporal resolution of 1 month from 1891 to 2018.
Therefore, we use yearly reconstructed tropical WS and LagP to indicate the IAV of tropical terrestrial water availability anomaly.
Temperature
Temperature is obtained from CRU TS4.03 (ref. 52). The Berkeley Earth global surface temperature is also used for a robust test53. The two temperature datasets both have a spatial resolution of 0.5° × 0.5° and a temporal resolution of 1 month from 1901 to 2018.
Regional domain definition
Tropical lands are defined as the spatial average over all the vegetated land areas between 24° N and 24° S, as in ref. 7. Tropical semi-arid region domain consists of shrubland and (woody) savannah, which is identified according to the land-cover classification map from MODIS (MCD12C1, type3). The map was regridded using a majority filter to a spatial resolution of 0.5° × 0.5°.
ENSO indices
The ENSO is Earth’s most important source of interannual climate variability. The areal averages of sea surface temperature (SST) anomalies relative to a long-term average climatology are used to characterize ENSO. The time period of 1960–2018 is used as the climatology here. The SST anomalies over Nino3.4 region (5° N–5° S,120° W–170° W) are most commonly used and included the signals of both EP ENSO and CP ENSO54. Year is considered EP ENSO when the largest DJF (December to February) SST anomaly over the region of 2° S–2° N, 110° E–90° W lies in the EP (east of 150° W) and Nino3 index exceeds 1 s.d. Year is considered CP ENSO when the corresponding largest DJF SST anomaly lies in the CP (west of 150° W) and Nino4 index exceeds 1 s.d. We note that the identification of ENSO types could vary with the method used27. This study uses the Extended Reconstruction Sea Surface Temperature (ERASST v.5) from the National Oceanic and Atmospheric Administration (NOAA). This dataset has a temporal resolution of 1 month from 1855 to the present and a spatial resolution of 2° × 2° (ref. 55).
CMIP6 models
Coupled ESM
Nine coupled ESMs participating in the sixth phase of Coupled Model Intercomparison Project (CMIP6) are used: CESM2, CNRM-ESM2-1, IPSL-CM6A-LR, MPI-ESM1-2-LR, UKESM1-0-LL, ACCESS-ESM1-5, CanESM5, MIROC-ES2L and NorESM2-LM. Coupled ESMs allow for feedbacks between the physical climate and the biological and chemical processes in the ocean and on land. We adopted data output from the ‘historical’ scenario (1960–2014) with one ensemble member for each model. Climate and land carbon sinks are simulated with all-forcings, including both the natural causes (for example, volcanic eruptions and solar variability) and human factors (for example, CO2 concentration, aerosols and land use) over the period 1850–2014. In coupled ESMs, the carbon cycle is coupled to the climate system.
Offline LSM
Six offline LSMs from the Land Surface, Snow and Soil Moisture Model Intercomparison Project (LS3MIP) are used here56: CESM2, CNRM-ESM2-1, IPSL-CM6A-LR, MPI-ESM1-2-LR, UKESM1-0-LL and CMCC-ESM2. The offline LSMs account for land-use changes but do not include local land–atmosphere feedbacks. We adopted data output from the ‘Land-Hist’ scenario (1960–2014) with one ensemble member for each model, for which the atmospheric forcing, vegetation, soil, topography and land/sea mask data were prescribed following the protocol used for the CMIP6 DECK simulations. The atmospheric forcing comes from the Global Soil Wetness Project phase three (GSWP3), which is a dynamically downscaled and bias-corrected version of the Twentieth Century Reanalysis57. Spin-up of the land-only simulations follow the protocol of the project ‘Trends and drivers of the regional scale sources and sinks of carbon dioxide’ (TRENDY)58.
Following previous efforts11,59, to enable a fair comparison of the water–carbon relationship between observations and models, we use the sum of soil moisture in all layers and snow water equivalent as modelled terrestrial WS. In the tropics, snow water equivalent is negligible.
Partial correlation
Partial correlation is used here to directly check whether increasingly negative water–carbon coupling is influenced by confounding water–temperature coupling. However, using specific values of RW,CGR|T to conclude the sign and strength of total water impacts on CGR is not suggested. RW,CGR|T isolates water impacts on CGR from confounding water–temperature coupling by linearly removing all temperature-related covariations. However, given the well-documented soil moisture–atmosphere feedbacks21, temperature variability actually includes many feedbacks from soil moisture (for example, hot extremes at tropical semi-arid regions) and removing all of them would indirectly remove some water impacts on CGR because of their physical connection. In addition, models do not reproduce the intensified water–carbon coupling but roughly capture the sign and strength of long-term tropical water–carbon coupling during 1960–2018, thus providing insights into underpinning processes. Model factorial experiments show that removing soil moisture interannual variability suppresses land carbon uptake variability by about 90%, whereas tropical mean temperature remains unchanged (Extended Data Fig. 10 in ref. 20). Therefore, ref. 20 suggest that tropical mean temperature might not represent a mechanistic climatic driver for land carbon uptake variability. Hence, RW,CGR|T is an insufficient and less accurate measure to infer the sign and strength of independent water impacts on CGR and underestimate water impacts on CGR in phases in which temperature control is dominant. Nonetheless, their relative changes are useful in this study and support the finding that water–carbon coupling has become increasingly negative in the recent past (1989–2018) compared to previous climate conditions (1960–1989).
Empirical orthogonal function
The method of empirical orthogonal function (EOF) analysis can deconvolve the spatiotemporal variability of a signal into orthogonal modes, each indicated by a principal spatial pattern and the corresponding principal component time series. It is widely used to study spatial patterns of climate variability and how they change with time60,61. We perform the EOF analysis on simulated tropical soil moisture from CMIP6 models.
Ridge regression
In the presence of collinearity, using the OLS estimator can lead to regression coefficient estimates that have large sampling variability and even a wrong sign. Ridge regression is a common technique to be used to address the issues arising through collinearity62. In ridge regression, a penalty term is added to the loss function to shrink the regression coefficients63. The amount of shrinkage is defined by a regularization parameter that was chosen by us in a cross-validation approach. The data were randomly split 25 times into training and validation sets and, for each split, the performance for the validation test set (mean squared error) was assessed for 100 different regularization parameters spaced evenly on a log scale. The best-performing regularization parameters were selected for each split and the average between them was retained for the final model. To assess the uncertainty in the regression coefficient estimates, we relied on bootstrapping, meaning that we randomly sampled the data 5,000 times and estimated the regression coefficients for each sample.
Spatial coherence
To quantify the degree of spatial coherence of yearly tropical WS anomaly, following ref. 9, we calculate a large covariance matrix of all grid cells versus all grid cells for tropical WS anomaly. Each element in this covariance matrix is termed as ci,j as follows:
$${c}_{i,j}={\rm{cov}}({{\rm{WS}}}_{i},{{\rm{WS}}}_{j})$$
(1)
where i and j indicate the two grid cells that used to calculate covariance, WSi and WSj are the corresponding yearly WS anomaly time series in the specified time period. Then, we summed all, positive and negative covariance terms (termed as tcov, tcov+ and tcov−), respectively, as follows:
$$\begin{array}{c}{\rm{tcov}}=\sum _{i=1}\sum _{j\ne i}| {c}_{i,j}| \\ \,{{\rm{tcov}}}^{+}=\sum _{i=1}\sum _{j\ne i}{c}_{i,j}| {c}_{i,j} > 0\\ \,{{\rm{tcov}}}^{-}=\sum _{i=1}\sum _{j\ne i}{c}_{i,j}| {c}_{i,j} < 0\end{array}$$
(2)
The variances in the diagonal of the covariance matrix (where i = j) were excluded because they are always positive and do not contribute to the estimate of spatial coherence. Finally, the spatial coherence of WS anomaly was defined as the following equation:
$${\rm{Spatial}}\,{\rm{coherence}}=\frac{{{\rm{tcov}}}^{+}+{{\rm{tcov}}}^{-}}{{\rm{tcov}}}\times 100 \% $$
(3)
In theory, 100% indicates all grid cells covariate in the same sign, that is, highest spatial coherence. Lower values indicate that total positive covariances are counterbalanced by total negative covariances, that is, lower spatial coherence.
Vegetation optical depth and aboveground carbon
VOD retrieved from microwave satellite observations is linked to the water content of vegetation mass and offers possibilities for monitoring AGC dynamics41. We used recently published long-term VOD products from the VOD Climate Archive (VODCA), which combines VOD retrievals that have been derived from multiple sensors (SSM/I, TMI, AMSR-E, WindSat and AMSR2) using the Land Parameter Retrieval Model64. For time completeness, we used the longest available VOD estimated from Ku-band which covers the period 1988–2016. To estimate the tropical AGC, following the approach of ref. 65, we first fitted a four-parameter empirical function by calibrating tropical VOD against the tropical AGC benchmark map from ref. 66 in 2000 as follows:
$${\rm{AGC}}=a\times \frac{\arctan (b\times ({\rm{VOD}}-c))-\arctan (-b\times c)}{\arctan (b\times ({\rm{Inf}}-c))-\arctan (-b\times c)}+d$$
(4)
where a, b, c and d are four best-fit parameters and Inf is set to 1010. AGC density (MgC ha−1) was derived by multiplying the original aboveground biomass density values by a factor of 0.5 (ref. 67). For Ku-VOD and AGC, data are aggregated to the spatial resolution of 0.5° × 0.5°. The spatial scatter plot of VOD and AGC clearly demonstrates the good relationship between VOD and AGC (coefficient of determination R2 = 0.76, P < 0.01; Supplementary Fig. 15). It seems the performance of VODCA Ku-VOD is less comparable to L-VOD (R2 = 0.81) (ref. 41), which was considered to be more sensitive to AGC in high biomass regions. However, L-VOD has only been available since 2010 and thus is not used here. Finally, we apply this empirical function to convert VOD to AGC from 1988 to 2016.