The analysis was pre-specified at https://www.synapse.org/#!Synapse:syn11855121/wiki/513724.

### Study designs and inclusion criteria


We included all longitudinal observational studies and randomized trials available through the Ki project in April 2018 that met five inclusion criteria (Extended Data Fig. 1) as follows: studies that were conducted in LMICs (children in these countries have the largest burden of linear growth faltering and are the key target population for preventive interventions); studies that had a median year of birth in 1990 or later (this restriction resulted in a set of studies spanning the period from 1987 to 2017 and excluded older studies that are less applicable to current policy dialogues); studies that enrolled children between birth and age 24 months and measured their length and weight repeatedly over time (we were principally interested in growth faltering during the first 1,000 days (including gestation), thought to be the key window for linear growth faltering); studies that did not restrict enrolment to acutely ill children (our focus on descriptive analyses led us to target, to the extent possible, the general population; we thus excluded some studies that exclusively enrolled acutely ill children, such as children who presented to hospital with acute diarrhoea or who were severely malnourished); studies that collected anthropometry measurements at least every 3 months (to ensure that we adequately captured incident episodes and recovery).

Thirty-two longitudinal cohorts in 14 countries followed between 1987 and 2017 met inclusion criteria. All children from each eligible cohort were included in the study. There was no evidence of secular trends in LAZ (Supplementary Note 4). We calculated cohort measurement frequency as the median days between measurements. If randomized trials found effects on growth within the intervention arms, the analyses were limited to the control arm. We included all measurements under 24 months of age, assuming months were 30.4167 days. We excluded extreme measurements of LAZ > 6 or LAZ < –6 following WHO growth standard recommendations^{30}. In many studies, investigators measured length shortly after birth rather than at birth because deliveries were at home, but most of these measurements were within the first 7 days of life (Supplementary Note 5); for this reason, we grouped measurements in the first 7 days as birth measurements. Gestational age was measured in only five cohorts that measured birth length (three cohorts measured it by recall of last menstrual period; one measured it by newborn examination; one measured it by ultrasound); thus, we did not attempt to exclude preterm infants from the analyses.
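The measurement-level rules above can be sketched as follows. This is a minimal illustration rather than the analysis code: the record keys `agedays` and `laz` are hypothetical names, and treating "the first 7 days" as `agedays <= 7` is an assumption.

```python
DAYS_PER_MONTH = 30.4167  # month length stated in the text


def prepare_measurements(records):
    """Keep plausible measurements under 24 months and flag birth measurements.

    `records` is a list of dicts with hypothetical keys `agedays` (age in days)
    and `laz` (length-for-age z-score).
    """
    kept = []
    for rec in records:
        age_months = rec["agedays"] / DAYS_PER_MONTH
        if age_months >= 24:
            continue  # only measurements under 24 months of age are included
        if rec["laz"] > 6 or rec["laz"] < -6:
            continue  # extreme LAZ values are excluded
        kept.append(
            {
                **rec,
                "age_months": age_months,
                # Assumption: "first 7 days" is read as agedays <= 7.
                "is_birth": rec["agedays"] <= 7,
            }
        )
    return kept
```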

### Quality assurance

The Ki data team assessed the quality of individual cohort datasets by checking the range of each variable for outliers and values that were not consistent with expectation. *z*-scores were calculated using the median of replicate measurements and the 2006 WHO child growth standards^{30}. In a small number of cases, a child had two anthropometry records at the same age, in which case we used the mean of the records. Analysts reviewed bivariate scatter plots to check for expected correlations (for example, length by height; length, height or weight by age; length, height or weight by corresponding *z*-score). Once the individual cohort data were mapped to a single harmonized dataset, analysts conducted an internal peer review of published articles for completeness and accuracy. Analysts contacted contributing investigators to seek clarification about potentially erroneous values in the data and revised the data as needed.

### Outcome definitions

We used the following summary measures in the analysis.

#### Incident stunting episodes

Incident stunting episodes were defined as a change in LAZ from above –2 *z* in the previous measurement to below –2 *z* in the current measurement. Similarly, we defined severe stunting episodes using the cutoff of –3 *z*. Children were considered at risk of stunting at birth, so children born stunted were considered to have an incident episode of stunting at birth. Children were also assumed to be at risk of stunting at the first measurement in non-birth cohorts and trials. Children who were already stunted at a first measurement that occurred after birth were assumed to have experienced stunting onset at the age halfway between birth and the first measurement. Most children were less than 5 days of age at their first measurement (Supplementary Note 5).
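As an illustration of this definition, the sketch below scans one child's measurements and returns the ages at which new episodes begin. It is not the authors' implementation. It assumes that a value of exactly –2 counts as not stunted and that a first measurement within 7 days is a birth measurement; the function and constant names are hypothetical. Passing `cutoff=-3.0` gives the severe-stunting version.

```python
DAYS_PER_MONTH = 30.4167
BIRTH_WINDOW_MONTHS = 7 / DAYS_PER_MONTH  # assumption: first 7 days count as birth


def incident_stunting_ages(measurements, cutoff=-2.0):
    """Return the ages (in months) at which incident stunting episodes begin.

    `measurements` is a list of (age_months, laz) tuples sorted by age.
    """
    onsets = []
    prev_laz = None
    for age, laz in measurements:
        if prev_laz is None:
            # Children are considered at risk at their first measurement.
            if laz < cutoff:
                if age <= BIRTH_WINDOW_MONTHS:
                    onsets.append(0.0)  # born stunted: episode at birth
                else:
                    onsets.append(age / 2.0)  # halfway between birth and first measurement
        elif prev_laz >= cutoff and laz < cutoff:
            onsets.append(age)  # crossed below the cutoff since the previous measurement
        prev_laz = laz
    return onsets
```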

#### Incidence proportion

We calculated the incidence proportion of stunting during a defined age range (for example, 3–6 months) as the proportion of children at risk of becoming stunted who became stunted during the age range (the onset of new episodes).
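A small sketch of this calculation is given below. The text does not spell out how the at-risk set is built, so the rule used here is an assumption: a child is at risk in an interval if they have at least one measurement in it and were not stunted at their last measurement before it. The function name and data layout are hypothetical.

```python
def incidence_proportion(children, start, end, cutoff=-2.0):
    """Proportion of at-risk children who became stunted in [start, end) months.

    `children` is a list of per-child series, each a list of (age_months, laz)
    tuples sorted by age.
    """
    at_risk = 0
    cases = 0
    for series in children:
        before = [laz for age, laz in series if age < start]
        within = [laz for age, laz in series if start <= age < end]
        if not within:
            continue  # no measurement in the interval: child does not contribute
        if before and before[-1] < cutoff:
            continue  # already stunted when entering the interval (assumed not at risk)
        at_risk += 1
        if any(laz < cutoff for laz in within):
            cases += 1  # onset of a new episode during the interval
    return cases / at_risk if at_risk else float("nan")
```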

#### Changes in stunting status

Changes in stunting status were classified using the following categories—never stunted: children with LAZ ≥ –2 at previous ages and the current age; no longer stunted: children who previously reversed their stunting status with LAZ ≥ –2 at the current age; stunting reversal: children with LAZ < –2 at the previous age and LAZ ≥ –2 at the current age; newly stunted: children whose LAZ was previously always ≥ –2 and with LAZ < –2 at the current age; stunting relapse: children who were previously stunted with LAZ ≥ –2 at the previous age and LAZ < –2 at the current age; still stunted: children whose LAZ was <–2 at the previous and current age.
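The six categories depend only on whether the child is stunted now, was stunted at the previous age and was ever stunted before. The sketch below encodes that logic for one child's LAZ series. It is illustrative, with a hypothetical function name, and it assumes a first measurement below –2 is labelled newly stunted.

```python
def classify_stunting_status(laz_series, cutoff=-2.0):
    """Label each measurement in an age-ordered LAZ series with a status category."""
    statuses = []
    ever_stunted = False  # stunted at any earlier age
    prev_stunted = False  # stunted at the immediately preceding age
    for laz in laz_series:
        stunted = laz < cutoff
        if stunted:
            if prev_stunted:
                status = "still stunted"
            elif ever_stunted:
                status = "stunting relapse"
            else:
                status = "newly stunted"
        else:
            if prev_stunted:
                status = "stunting reversal"
            elif ever_stunted:
                status = "no longer stunted"
            else:
                status = "never stunted"
        statuses.append(status)
        ever_stunted = ever_stunted or stunted
        prev_stunted = stunted
    return statuses
```

Because every measurement receives exactly one label, the category proportions at a given age sum to 100% within a cohort.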

#### Growth velocity

Growth velocity was calculated as the change in length in centimetres between two time points divided by the number of months between the time points. We compared measurements of change in length in centimetres per month to the WHO child growth standards for linear growth velocity^{56}. We also estimated within-child rates of change in LAZ per month.
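For concreteness, the velocity calculation can be written as a single helper that works for either length in centimetres or LAZ. The function name is hypothetical, and the 30.4167-day month is the convention stated earlier in the Methods.

```python
DAYS_PER_MONTH = 30.4167


def rate_per_month(age_days_1, value_1, age_days_2, value_2):
    """Change in a measurement (length in cm, or LAZ) per month between two time points."""
    months = (age_days_2 - age_days_1) / DAYS_PER_MONTH
    if months <= 0:
        raise ValueError("the second time point must be later than the first")
    return (value_2 - value_1) / months
```

For example, a child measuring 50.0 cm at birth and 59.0 cm three months later has a velocity of about 3.0 cm per month.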

### Measurement frequency

Analyses of incidence and growth velocity (Figs. 3 and 5) included cohorts with at least quarterly measurements to include as many cohorts as possible. Analyses of stunting reversal (Fig. 4) were restricted to cohorts with at least monthly measurements to allow evaluation of changes in stunting status with higher resolution.

### Subgroups of interest

We stratified the above outcomes within the following subgroups: child age, grouped into one- or three-month intervals (depending on the analysis); the region of the world (Asia, sub-Saharan Africa, Latin America); sex of child; and the combinations of those categories. We obtained country-level data on the percentage of gross domestic product devoted to healthcare goods and spending from the United Nations Development Programme^{57} and the percentage of the country living on less than US$1.90 per day and under-5 mortality rates from the World Bank^{58}. In years without available data, we linearly interpolated values from the nearest years with available data and extrapolated values within 5 years of available data using linear regression models based on all available years of data. We also considered additional subgroups, including decade in which data were collected, gross domestic product^{58}, gender development index^{57}, gender inequality index^{57}, coefficient of human inequality^{58} and the Gini coefficient^{58}. However, for these variables, subgroup levels were strongly correlated with geographic region, making it impossible to separate the effects of each (Supplementary Table 3). Thus, we did not conduct subgroup analyses for these variables.
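The gap-filling rule for country-level indicators can be sketched as below. This is one reading of the description, not the authors' code: years between two observed years are linearly interpolated from the nearest observed years on either side, years up to 5 years outside the observed range are predicted from a least-squares line fitted to all observed years, and anything further out is left missing. The function name and the dict layout are hypothetical.

```python
def fill_indicator(observed, year, max_extrapolation_years=5):
    """Return an indicator value for `year`, or None if it cannot be filled.

    `observed` maps year -> value for the years with available data.
    """
    if year in observed:
        return observed[year]
    years = sorted(observed)
    if not years:
        return None
    if years[0] < year < years[-1]:
        # Linear interpolation between the nearest observed years on each side.
        lo = max(y for y in years if y < year)
        hi = min(y for y in years if y > year)
        frac = (year - lo) / (hi - lo)
        return observed[lo] + frac * (observed[hi] - observed[lo])
    gap = years[0] - year if year < years[0] else year - years[-1]
    if gap > max_extrapolation_years or len(years) < 2:
        return None
    # Ordinary least-squares line through all observed years.
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(observed[y] for y in years) / n
    sxx = sum((y - mean_x) ** 2 for y in years)
    sxy = sum((y - mean_x) * (observed[y] - mean_y) for y in years)
    slope = sxy / sxx
    return mean_y + slope * (year - mean_x)
```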

### Statistical analysis

All analyses were conducted in R version 3.4.2 (ref. ^{59}).

#### Estimation of mean LAZ by age in DHS and Ki cohorts

We downloaded standard DHS individual recode files for each country from the DHS Program website (https://dhsprogram.com/). We used the most recent standard DHS datasets for the individual women’s, household, and height and weight datasets from each country. We obtained variables for country code, sample weight, cluster number, primary sampling unit and design stratification from the women’s individual survey recode files. From the height and weight dataset, we used standard recode variables corresponding to the 2006 WHO growth standards for height-for-age.

After we excluded missing observations and restricted the data to measurements of children of 0–24 months of age and to *z*-scores within WHO-defined plausible values, the included surveys had been collected from 1996 to 2018 in countries that overlapped with Ki cohorts. The exception was Guinea-Bissau, where no DHS survey was conducted during the study period (Extended Data Table 1).

We classified countries into regions (South Asia, Latin America and Africa) using the WHO regional designations with the exception of the classification for Pakistan, which we included in South Asia to be consistent with previous linear growth studies using DHS^{20}. One included cohort was from Belarus, and we chose to exclude it from region-stratified analyses as it was the only European study.

We estimated the age-stratified mean from ages of 0 to 24 months within each DHS survey, accounting for the complex survey design and sampling weights. We then pooled estimates of mean LAZ for each age in months across countries using a fixed-effects estimator (details below). We compared DHS estimates with mean LAZ by age in the Ki study cohorts, which we estimated using penalized cubic splines with bandwidth chosen using generalized cross-validation^{60}. We used splines to estimate age-dependent mean LAZ in the Ki study cohorts to smooth any age-dependent variation in the mean caused by less frequently measured cohorts.

#### Distribution models

To investigate how the mean, standard deviation and skewness of LAZ distributions varied by age, we fitted linear models with skew-elliptical error terms using maximum-likelihood estimation. We fitted models separately by cohort.

#### Fixed- and random-effects models

Several analyses pooled results across study cohorts. We estimated each age-specific mean using a separate estimation and pooling step. We first estimated the mean in each cohort, and then pooled age-specific means across cohorts, while allowing for a cohort-level random effect. This approach enabled us to include the most information possible for each age-specific mean, while accommodating slightly different measurement schedules across the cohorts. Each cohort’s data contributed only to LAZ or stunting incidence estimates at the ages for which it contributed data.

The primary method of pooling was using random-effects models. This modelling approach assumes that studies are randomly drawn from a hypothetical population of longitudinal studies that could have been conducted on children’s linear growth in the past or future. We also fitted fixed-effects models as a sensitivity analysis (Supplementary Note 6); inferences about estimates from fixed-effects models are restricted to only the included studies^{61}.

Random-effects models assume that the true population outcomes *θ* are normally distributed (*θ* ~ *N*(*μ*, *τ*^{2})), in which *N* indicates a normal distribution and *θ* has mean *μ* and variance *τ*^{2}. To estimate outcomes in this study, the random-effects model is defined as follows for each study in the set of *i* = 1, …, *k* studies:

$$y_{i}=\mu +u_{i}+e_{i} \tag{1}$$

in which *y*_{i} is the observed outcome in study *i*, *μ* is the mean outcome in the hypothetical population of studies (that is, the estimated outcome pooling across study cohorts), *u*_{i} is the random effect for study *i* (so that *θ*_{i} = *μ* + *u*_{i} is the true outcome for study *i*), and *e*_{i} is the sampling error within study *i*. The model assumes that *u*_{i} ~ *N*(0, *τ*^{2}) and *e*_{i} ~ *N*(0, *v*_{i}), in which *v*_{i} is the study-specific sampling variance. We fitted random-effects models using the restricted maximum-likelihood estimator^{31,32}. If a model failed to converge, we attempted to fit models with a maximum-likelihood estimator. If random-effects models failed to converge owing to the number of stunting cases being zero, we used a fixed-effects estimator.
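To make the structure of the random-effects model concrete, the sketch below pools cohort-level estimates with the DerSimonian-Laird moment estimator of *τ*^{2}. This is a deliberate simplification: the analysis used restricted maximum likelihood, which requires iterative fitting, so the numbers from this sketch would generally differ somewhat from REML results. The function name is hypothetical.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool study estimates y_i with sampling variances v_i under a random-effects model.

    Uses the DerSimonian-Laird estimator of tau^2 (a stand-in for REML).
    Returns (mu, standard_error, tau2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effects mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Random-effects weights add the between-study variance to each v_i.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    mu = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return mu, se, tau2
```

When the cohort estimates are homogeneous, the estimated *τ*^{2} is zero and the result coincides with the inverse-variance-weighted fixed-effects estimate.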

We also fitted inverse-variance-weighted fixed-effects models defined as follows:

$$\bar{\theta}_{w}=\frac{\sum_{i=1}^{k}w_{i}\theta_{i}}{\sum_{i=1}^{k}w_{i}} \tag{2}$$

in which \({\bar{\theta }}_{w}\) is the weighted mean outcome in the set of *k* included studies, and *w*_{i} is a study-specific weight, defined as the inverse of the study-specific sampling variance *v*_{i}. *θ*_{i} is the estimate from study *i*.
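Equation (2) translates directly into code. The sketch below also returns the usual standard error of an inverse-variance-weighted mean, the square root of the reciprocal of the summed weights; that standard error is a standard result rather than something stated in the text, and the function name is hypothetical.

```python
import math


def pool_fixed_effects(estimates, variances):
    """Inverse-variance-weighted mean of study estimates (equation (2)).

    Returns (pooled_estimate, standard_error).
    """
    weights = [1.0 / v for v in variances]  # w_i = 1 / v_i
    sum_w = sum(weights)
    pooled = sum(w * theta for w, theta in zip(weights, estimates)) / sum_w
    return pooled, math.sqrt(1.0 / sum_w)
```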

For both types of model, we pooled binary outcomes on the logit scale and then back-transformed estimates after pooling to constrain confidence intervals between 0 and 1. Although the probit transformation more closely resembles common distributions for physiologic variables, in practice the logit transformation produces nearly identical estimates and is more convenient for estimation. For cohort-stratified analyses, which did not pool across studies, we estimated 95% confidence intervals using the normal approximation (Supplementary Note 7).
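The logit-scale workflow for a proportion can be sketched as follows. The variance formula 1/x + 1/(n − x) for a logit-transformed proportion with x cases among n children is the standard large-sample approximation and is an assumption here, as the text does not state which variance was used. It is undefined when x is 0 or n, which is consistent with the convergence problems noted above when the number of stunting cases is zero. Function names are hypothetical.

```python
import math


def logit_estimate(cases, n):
    """Return (logit of the proportion, approximate sampling variance)."""
    if cases <= 0 or cases >= n:
        raise ValueError("logit is undefined when cases is 0 or equal to n")
    p = cases / n
    return math.log(p / (1.0 - p)), 1.0 / cases + 1.0 / (n - cases)


def expit(x):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def back_transformed_ci(pooled_logit, se, z=1.959964):
    """Back-transform a pooled logit and its 95% CI to the proportion scale."""
    return (
        expit(pooled_logit),
        expit(pooled_logit - z * se),
        expit(pooled_logit + z * se),
    )
```

Because the interval is built on the logit scale and then mapped through the inverse logit, both bounds always fall strictly between 0 and 1.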

#### Estimation of incidence

We estimated incidence as defined above in 3-month age intervals within specific cohorts and pooled within region and across all studies (Fig. 3). Pooled analyses used random-effects models for the primary analysis and fixed-effects models for sensitivity analyses as described above.

#### Estimation of changes in stunting status

To assess fluctuations in stunting status over time, we conducted an analysis among cohorts with at least monthly measurements from birth to the age of 15 months to provide sufficient granularity to capture changes in stunting status. We estimated the proportion of children in each stunting category defined in the section ‘Changes in stunting status’ at each month from birth to the age of 15 months. To ensure that percentages summed to 100%, we present results that were not pooled using random effects. Analyses using random effects produced similar results (Supplementary Note 6.3).

To examine the distribution of LAZ among children with stunting reversal, we created subgroups of children who experienced stunting reversal at ages 3, 6, 9 and 12 months and then summarized the distribution of the children’s LAZ at ages 6, 9, 12 and 15 months. Within each age interval, we estimated the mean difference in LAZ at older ages compared to the age of stunting reversal and estimated 95% confidence intervals for the mean difference. Pooled analyses used random-effects models for the primary analysis and fixed-effects models for sensitivity analyses as described above.

#### Linear growth velocity

We estimated linear growth velocity within 3-month age intervals stratified by sex, pooling across study cohorts (Fig. 5) as well as stratified by geographic region (Extended Data Fig. 10) and study cohort (Supplementary Note 7.4). Analyses included cohorts that measured children at least quarterly. We included measurements within a 2-week window around each age in months to account for variation in the age of each length measurement. Pooled analyses used random-effects models for the primary analysis and fixed-effects models for sensitivity analyses as described above (Supplementary Note 6.4).

### Sensitivity analyses

We conducted three sensitivity analyses. First, to assess whether inclusion of PROBIT, the single European cohort, influenced our overall pooled inference, we repeated analyses excluding the PROBIT cohort. Results were very similar with and without the PROBIT cohort (Supplementary Note 8). Second, to explore the influence of differing numbers of cohorts contributing data at different ages, we conducted a sensitivity analysis in which we subset data to cohorts that measured anthropometry monthly from birth to the age of 24 months (*n* = 21 cohorts in 10 countries, 11,424 children; Supplementary Note 3). Third, we compared estimates pooled using random-effects models presented in the main text with estimates pooled using fixed-effects inverse-variance-weighted models. The random-effects approach was more conservative in the presence of study heterogeneity (Supplementary Note 6).

### Inclusion and ethics

This study analysed data collected in 14 LMICs and assembled by the Bill & Melinda Gates Foundation Ki initiative. Datasets are owned by the original investigators who collected the data. Members of the Ki Child Growth Consortium were nominated by each study’s leadership team to be representative of the country and study teams that originally collected the data. Consortium members reviewed their cohort’s data within the Ki database to ensure external and internal consistency of cohort-level estimates. Consortium members provided substantial input on the statistical analysis plan, interpretation of results and manuscript writing. At the request of consortium members, the manuscript includes cohort-level and regional results to maximize the utility of the study findings for local investigators and public health agencies. Analysis code has been published with the manuscript to promote transparency and extensions of our research by local and global investigators.

### Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.