For this study we collected, processed and analysed published data from the literature to get estimates of lifespan, reproductive lifespan, size, survival to maturity and age at maturity for as many toothed whale species as possible. We then used these estimates in analyses to understand how and why menopause evolves. All data management, analysis and plotting were performed in R with the tidyverse, rstan cmdstanr and ape packages^{64,65,66,67}.

### Data

Table of Contents

#### Lifespan data and modelling

We collected sex-specific age-structured toothed whale data from the literature and then applied a mortality model to these data to get species-sex lifespan estimates for toothed whale species. In most toothed whale species, the age of a whale can be inferred by counting tooth growth rings^{68}. There is a tradition in the study of toothed whales to collect age and sex data from deceased individuals, for example, after mass-stranding events. Here, we collate and use these published data to understand toothed whale mortality patterns. We collated the data by performing a systematic search of the literature through Web of Science (webofscience.com) using species name^{69}, alternative species name^{69}, common name^{69} and alternative common names^{69} with each of the search terms ‘life history’, ‘lifespan’ and ‘age structure’. We also repeated the search using relevant languages local to the distribution of the species in question in an attempt to reduce the English-language bias of our data^{70}. We repeated the search for each of the 75 species of toothed whale recognized by the International Whaling Commission^{69}. For each species, we also performed opportunistic searches by following references through the literature and identifying potential data sources using the relevant chapters in authoritative edited collections^{71,72,73,74}. From our search, we identified and extracted^{75} sex-specific age data into a database^{76}. In the database, where possible, each dataset represents data from a single population collected at a single time; however, this is not always possible and some datasets represent data collected from several populations or over a longer timescale. We also parameterize the mortality model with dataset-specific information on population growth, sampling biases and age-estimation error each of which are based on data provided in the original publication^{76}. A species-sex was included in the analysis if they had at least one dataset with more than ten samples above the age of maturity and a sampling rate (number of samples/maximum observed age) of greater than 0.5. Our complete database for analysis used 269 datasets from 118 publications allowing us to estimate lifespan for 32 species of female (Fig. 1 and Supplementary Table 1) and 33 species of male toothed whale. Fisheries by-catch (*n* = 113) and stranding (*n* = 107) are the most common sources of data in our database, although most species have data from several sources. Research to date has not suggested or found evidence of systematic age differences in propensity to be taken as by-catch or stranding and when bias is suspected in a given dataset we include this bias in our modelling framework.

We developed a Bayesian modelling framework to derive the parameters, *α* and *β*, of a Gompertz mortality model for each species-sex. The framework aims to derive a distribution of mortality parameters for each species-sex most parsimonious with the data and with the potential and unknown effects of population growth, sampling bias and age-estimation error. We consider the count *c* of whales in age category *i* (where AGE is greater than age at maturity) in dataset *d* to be drawn from a multinomial distribution, where *θ*_{i} is a product of the probability of surviving to that age, *L*_{i}, the effect of population change, *R*_{i}, and the effect of sampling bias, *S*_{i} (equation (1)). *L*_{i}, *R*_{i} and *S*_{i} are defined on the basis of a function of the age, AGE_{i}, of the whales in category *i*. We concurrently model that the count *c*_{d,i} is the number of whales with the true age *i* in dataset *d*, where the true age of whale *j, τ*_{j}, is drawn from a normal distribution around the observed age, *o*_{j}, with a standard deviation of *ε*_{j} (equation (1)). The model is fit over all available datasets of the same species. Datasets of the sex (SEX_{d}) share mortality parameters, datasets from the same population (POP_{d}) share a population growth parameter (*r*_{d}) and each dataset has an independent sampling bias parameter (*s*_{d}). Sampling bias error is applied where AGE_{d,i} is within a predefined bias window *W*_{d} derived from the data source. Simulations demonstrate that this model is capable of recovering reliable estimates of lifespan under a biologically relevant range of demographic and error scenarios^{76} (see also Supplementary Information 1, Equations):

$$\begin{array}{c}c\, \sim {\rm{M}}{\rm{u}}{\rm{l}}{\rm{t}}{\rm{i}}{\rm{n}}{\rm{o}}{\rm{m}}{\rm{i}}{\rm{a}}{\rm{l}}(\theta \,,N)\\ {\theta }_{d,i}=\frac{{L}_{d,i}{R}_{d,i}{S}_{d,i}}{\sum {L}_{d,i}{R}_{d,i}{S}_{d,i}}\\ {L}_{i}=\frac{{{\rm{e}}}^{-({\alpha }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}}/{\beta }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}})({{\rm{e}}}^{{\beta }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}}\times {{\rm{A}}{\rm{G}}{\rm{E}}}_{d,i}}-1)}}{{\sum }_{j=0}^{n}{{\rm{e}}}^{-({\alpha }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}}/{\beta }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}})({{\rm{e}}}^{{\beta }_{{{\rm{S}}{\rm{E}}{\rm{X}}}_{d}}\times {{\rm{A}}{\rm{G}}{\rm{E}}}_{d,j}}-1)}}\\ {R}_{d,i}={(1-{\rho }_{d})}^{{{\rm{A}}{\rm{G}}{\rm{E}}}_{d,i}}\\ {\rho }_{d}=\frac{{r}_{{{\rm{P}}{\rm{O}}{\rm{P}}}_{d}}}{1/2\times \,max({{\rm{o}}}_{{{\rm{D}}{\rm{S}}{\rm{E}}{\rm{T}}}_{{{\rm{S}}{\rm{E}}{\rm{X}}={\rm{S}}{\rm{E}}{\rm{X}}}_{d}}})}\\ {S}_{d,i} \sim \left\{\begin{array}{cc}{s}_{d}+1 & {{\rm{A}}{\rm{G}}{\rm{E}}}_{d,i}\in {W}_{d}\\ 1 & {{\rm{A}}{\rm{G}}{\rm{E}}}_{d,i}\notin {W}_{d}\end{array}\right.\\ {\tau }_{j} \sim {\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}({o}_{j},{{\epsilon }}_{j})\\ {{\epsilon }}_{j}=\frac{1}{20}({o}_{j}+B)\\ {c}_{d,i}=\mathop{\sum }\limits_{k=1}^{N}[\lfloor {\tau }_{k}\rceil ={{\rm{A}}{\rm{G}}{\rm{E}}}_{i},{\rm{D}}{\rm{S}}{\rm{E}}{\rm{T}}=d]\end{array}$$

(1)

We fit this using Hamiltonian Monte Carlo chains implemented in R and Stan^{64,66} (Supplementary Information 2, Mortality and Corpora). For each species-sex, we then use the posterior estimates of *α* and *β* to calculate ordinary maximum lifespan (age *Z*): the age at which 90% of adult life years have been lived^{77}. This results in a distribution of ordinary maximum lifespans for each species-sex derived from 40,000 draws from the posterior of the fitted mortality model. In the following analyses, we use the mean ± s.d. of the posterior distribution of ordinary maximum lifespans as our species-sex lifespan measures.

#### Reproductive lifespan and modelling

We use age-specific ovarian activity to assess toothed whale reproductive lifespans. In toothed whales, after ovulation, the corpora in which the ovum develops persists in the ovary^{78}. A sample of corpora counts from whales of known ages can therefore be used to calculate changes in the rate of corpora deposition with age as a measure of age-specific ovarian activity^{9}. In long-finned and short-finned pilot whales the rate of corpora deposition with age matches age-specific pregnancy rates^{9}. We use the age at which the rate of new corpora deposition reaches 0 as our measure of reproductive lifespan.

The counts of corpora are commonly reported after dissections of deceased whales. We build on the database of age-specific corpora counts used in previous studies^{9} by adding more datasets uncovered during our systematic and opportunistic search for toothed whale life history data (Lifespan data and modelling section). Each corpora dataset consists of a sample of known-age females and the number of ovarian corpora they were found to have. We only include corpora datasets in our analysis if they have a sample size greater than or equal to 20 and a sampling rate (number of samples/maximum observed age from lifespan data) of more than 0.6. Our final sample size for this analysis is 27 datasets from 18 species (Fig. 1, Supplementary Information 2, Mortality and Corpora and Supplementary Table 2).

We apply a Bayesian model to these corpora data to find the age at which ovarian activity reaches 0. Unlike in previous studies^{9}, here we directly model the process of corpora deposition. We model the count of corpora at a given adult age *C*_{i} (where *i* = 0 is the age at maturity) to be drawn from a Poisson distribution with mean *λ*_{i}, where *λ*_{i} is the sum of corpora (*k*) deposited at all previous ages, which in turn depends on the initial rate of corpora deposition *α* and a linear rate of decline in rate with age *β* (see also Supplementary Information 1, Equations):

$$\begin{array}{c}C \sim {\rm{P}}{\rm{o}}{\rm{i}}{\rm{s}}{\rm{s}}{\rm{o}}{\rm{n}}(\lambda )\\ {\lambda }_{i}=\mathop{\sum }\limits_{j=0}^{i}\Delta {k}_{j}\\ \Delta {k}_{j}=\alpha (1-\beta {{\rm{A}}{\rm{G}}{\rm{E}}}_{j})\end{array}$$

(2)

If several corpora datasets are available from the same species, they are modelled together with datasets allowed to differ in *α* but sharing an age-specific rate of decline *β*. For all species, reproductive lifespan is then calculated as 1/*β* + age at maturity. We calculate a distribution of reproductive lifespans from 40,000 draws from the posterior distribution of *β*.

#### Size and maturity

We use length rather than mass as our measure of species size because of the difficulty of accurately measuring mass in cetaceans^{79}. For each toothed whale species-sex, we consulted expert-written edited volumes^{71,72,73,74} to get all available metrics of sex-specific length. For each species-sex, we collated all available measures of mean, asymptotic, minimum and maximum length, as well as standard deviation around the mean or asymptote. If metrics were not available from a sex-specific sample (*n* = 20) we instead used any available combined sex samples; if this occurred, the species were not included in any within-species-sex size comparisons. We then processed these metrics to get a single estimate of mean size and standard deviation around that estimated size (Supplementary Information 3, Additional Data Explanation). This error was carried through all subsequent analyses. Available estimates of mass were strongly positively correlated with length (Supplementary Information 3, Additional Data Explanation).

We use the same framework to generate estimates of age at maturity for each species-sex as we did for size (Supplementary Information 3, Additional Data Explanation). We gathered estimates of mean age of maturity and distribution from expert consensus and then processed these measures to get a single measure of mean age of sexual maturity and standard deviation around the mean for each species-sex. For simplicity, for analyses not directly testing correlates of the age of maturity we use the mean measure as the age of sexual maturity.

In analyses regressing size against age or vice versa, models take the form of a Bayesian phylogenetically controlled linear model with the required parameter and menopause as predictors. In these models, both the true size and true age at maturity are considered to be drawn from a normal distribution around the observed mean given the observed standard deviation. The effect of phylogeny is implemented as the covariance between species by means of the Ornestain–Uhlenbeck process^{80}.

Our kinship demography analysis requires estimates of juvenile mortality. We generate estimates of juvenile mortality for the 23 datasets with good sampling (>100) of whales under the age of maturity by applying a Gompertz mortality model with a bathtub term^{81} (to describe juvenile mortality) to these datasets. In these samples, we found a negative correlation between age at maturity and the probability of surviving to maturity. We used this correlation to generate a posterior distribution of the estimated proportion of whales born surviving to maturity for each species-sex (Supplementary Information 3, Additional Data Explanation).

#### Kinship demography analysis

We use a matrix population modelling framework to understand how many relatives of different classes a female can expect to have given her age^{25} and hence her potential to provide intergenerational help and intergenerational harm. For each species, we build a Leslie matrix based on (1) mortality hazard at each adult age derived from the lifespan modelling, (2) mortality hazard at juvenile ages derived from the survival to maturity analysis and (3) age-specific fecundity derived from the corpora analysis *β* scaled by the baseline fecundity *f* needed to maintain a stable population (the age-distribution of a stable population is calculated by inferring the posterior distribution of mortality parameters when *r* = 1 for each species from our fitted mortality models). For each *β* draw from the corpora analysis, we calculated the baseline reproductive rate by systematically exploring the parameter space to find the value of *f* needed to maintain this known stable age-distribution (*λ* = 1) using established matrix population methods^{34}. All of these inputs are the posterior estimates from Bayesian analyses. Computational limitations mean that it was not possible to explore the entire combined posterior space, we therefore take 1,000 draws from each posterior distribution and use these to create a set of alternative Leslie matrices for each species. We apply kinship demography models^{25}—adapted to predict both sexes of offspring and grandoffspring—to each Leslie matrix to calculate a distribution of the number of offspring and grandoffspring a female can expect to have at a given age.

We quantify grandmother overlap at age *x* as the number of matrilineal grandoffspring below the age at maturity that a female can expect to have at that age multiplied by the probability of surviving to that age from maturity. We then sum this over all female ages. This measure gives a direct measure of the number of years a female can expect to be alive at the same time as her grandoffspring. Similarly, mother–offspring overlap is quantified as the number of years a female can expect to spend alive at the same time as her offspring and is calculated in the same way as grandmother overlap but replacing the number of grandoffspring with the number of offspring (of both sexes). We measure reproductive overlap as the cumulative proportion of a female’s reproductive life remaining when her daughter gives birth. To calculate this at each age we take the product of the expected number of grandoffspring of age 0 that a female will have and multiply this by the proportion of reproductive capacity remaining to a female of that age. This is summed over all ages to get the total reproductive overlap in the species. We use the same methodological pathway to calculate ‘relative offspring overlap’ between mothers and the offspring, determined separately for daughters and sons.

#### Demographic simulations

For each of the five species with menopause, we also calculated the expected grandmother years and reproductive overlap under two alternative demographic scenarios: (1) the ancestral case and (2) the slow life history case. The ancestral case is the predicted demographics of the immediate non-menopausal ancestors of the species. Under the live-long hypothesis (see the section ‘Live-long or stop-early?’), species without menopause evolve by extending their lifespan without extending their reproductive lifespan. For the ancestral case, we therefore simulate the demographic parameters of a population with the same lifespan as expected for the size of the species and the same reproductive lifespan as observed in the real data (see Supplementary Table 3 for the derivation of parameters). The slow life history case simulates a population in which, instead of evolving menopause, females continue to reproduce for their whole life. Under the slow life history case, lifespan is the same as observed for the real species but reproductive lifespan is extended to lifespan (Supplementary Table 3). We compare the ancestral case and slow life history case to the observed demographics (observed case). For this analysis, we measure reproductive lifespan as the age of the oldest known reproductive active female (Supplementary Table 4) to allow the inclusion of killer whales in the analysis, as no corpora data are available for killer whales. Unlike most of the other species in our dataset, the reproductive datasets for these five datasets are large enough to allow a relatively robust estimate of the age of the oldest reproductively active female. For each demographic scenario, we use the pipeline for the kinship demography analysis (above) to calculate distributions of values of baseline reproductive rate, relative grandmother years, relative mother years and expected reproductive overlap.

### Analysis

#### Live-long versus stop-early

To understand if species with menopause live or reproduce longer than expected given their size and phylogenetic position we use a model of the form:

$$\begin{array}{c}\tau Z \sim {\rm{M}}{\rm{u}}{\rm{l}}{\rm{t}}{\rm{i}}{\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu ,K)\\ \tau Z \sim {\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu Z,\sigma Z)\\ \tau S \sim {\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu S,\sigma S)\\ {\mu }_{i}=\alpha +{\beta }_{{\rm{S}}{\rm{I}}{\rm{Z}}{\rm{E}}}\tau {S}_{i}+{\beta }_{{\rm{P}}{\rm{R}}}{M}_{i}\\ {K}_{i,j}={\eta }^{2}{{\rm{e}}}^{-{\rho }^{2}{D}_{i,j}}\end{array}$$

(3)

where true lifespan *τZ* is drawn from a multinormal distribution with a mean described by a linear model with terms for size (*β*_{SIZE}) and the effect of menopause (*β*_{PR}). True ordinary maximum lifespan *τZ* and log true size *τS* are drawn from normal distributions described by the means (*μZ*, *μS*) and standard deviations (*σZ*, *σS*) from the lifespan modelling and log observed size, respectively. In species in which menopause is present *M* = 1, otherwise *M* = 0. The dispersion of the multinormal distribution, *K*, is a covariance matrix derived from the phylogenetic distance matrix by the Ornestain–Uhlenbeck process (parameters *η, ρ*)^{80}. All parameters have weakly informative priors. We use 1,000 time-calibrated bootstrapped phylogenetic trees derived from a recently published cetacean phylogeny^{35}. We run the model on each of the bootstrapped phylogenies and combine the posterior estimates to get a complete posterior, which is reported. We use the same modelling structure replacing ordinary maximum lifespan *Z* with the age at which ovarian activity reaches 0 to understand the relationships between reproductive lifespan, size, phylogeny and the presence or absence of menopause. We confirmed the absence of an effect of menopause on reproductive lifespan by comparing the predictive power of models with and without a menopause parameter (Supplementary Table 5). There is no evidence that a model with a menopause parameter has greater predictive power than a model without a menopause parameter (elpd difference without − with = −0.8 ± 1.2 (±s.e.); Supplementary Table 6).

As the presence of menopause in beluga whales and narwhals^{58} and in false killer whales^{9} has sometimes proved controversial, we repeat this and other key analyses presented in this study excluding these species. We also repeat the analyses excluding the largest and smallest whale species to confirm that they are not exerting undue influence over our interpretation. Lastly, we also repeated our analysis, coding menopause as a continuous value with error to capture potential differences between menopause species and uncertainties around menopause status. In all cases, these exclusions made no qualitative and little quantitative difference to any of our results (Supplementary Table 5).

#### Help and extended lifespans

We tested three predictions to investigate the role of intergenerational help in the evolution of extended lifespans (Table 1). We compared the grandmother overlap of species with and without menopause using a phylogenetically controlled Bayesian regression model (Table 1). In this model, relative grandmother overlap—calculated as grandmother overlap/age at maturity—is the response variable providing a more meaningful interspecies comparison. Posterior predictive checks demonstrate that this model captures the observed distribution of the data (Supplementary Figs. 5 and 6) and, although the sample size means the true distribution is uncertain, there is no strong evidence that the grandmother years metric is bimodal (Supplementary Fig. 6). If further data find strong evidence of bimodality it could indicate that there are two modes of life in toothed whales that might have implications for understanding the evolution of life history strategies, including menopause. We use the same analytical pathway to compare the mother overlap of species with and without menopause (Table 1). Last, we used the outputs of the demographic simulations to compare grandmother overlap in species with menopause to their non-menopause ancestor. Applying the kinship demography models to the demographic simulation ‘ancestral case’ allowed us to generate a distribution of potential relative grandmother overlap values for the proposed non-menopausal ancestor of toothed whale species with menopause.

#### Help and reproductive lifespans

We use two metrics to quantify the costs of offspring and to test the prediction that the offspring of species with menopause are more costly than the offspring of species without menopause (Table 1): age at maturity and size at maturity (adult size). Specifically, we test the predictions that (1) species with menopause have a later age at maturity than expected given the size of their mother, which would suggest extended maternal investment, and (2) species with menopause have larger adult size than expected given their age at maturity, which would suggest greater maternal investment during the juvenile phase. We tested these predictions separately for each sex, using a phylogenetically controlled Bayesian regression model. There are other potential costs of raising offspring not captured by these metrics, including the costs of caring for offspring beyond maturity^{28}, which could be an interesting focus of future research. In addition, we test the explicit mechanism that some models have proposed by which the costly intergenerational help of older females can benefit their younger kin. These models propose that, by ceasing reproduction, older females can invest in increasing the reproductive rate of their daughters^{30,33}. We test this by comparing the observed baseline fecundity in species with menopause, derived from the kinship demography models, compared to the predicted baseline reproductive rate of their non-menopausal rate. In a further analysis, we compare the baseline fecundity of toothed whale species with menopause and without menopause (Supplementary Fig. 7).

#### Harm and reproductive lifespans

We tested two predictions to establish the role of intergenerational harm in the evolution of reproductive lifespan (Table 1). First, we used a phylogenetically controlled Bayesian regression model to compare the predicted reproductive overlap of species with and without menopause, in which reproductive overlap was derived from the kinship demography models (Table 1). Posterior predictive checks demonstrate that this model captures the observed distribution of the data (Supplementary Fig. 5) and the metric does not show any clear evidence of bimodality (Supplementary Fig. 6). We confirm our result by comparing the predictive power of models with and without a menopause parameter (Supplementary Table 7). Similarly, neither running the model on the mean reproductive overlap estimates without uncertainty, nor replacing the true uncertainty around the reproductive overlap with the scaled uncertainty for that species from the grandmother years metric, qualitatively change our conclusions. Second, we compare observed reproductive overlap in species with menopause to the reproductive overlap in the slow life history case representing the non-menopausal analogue of the species with menopause (Table 1). The predicted reproductive overlap is derived by applying kinship demography models to the output of the demographic simulations.

#### Female lifespans and male longevity

We tested two predictions of the male-driven menopause hypothesis. First, we compare relative male and female lifespans in species with and without menopause (Table 1). In species for which estimates of ordinary maximum lifespan are available for both sexes (*n* = 30, including all five species with menopause) we compared the difference in female:male lifespan ratio (*τl*) in species with (*M*_{i} = 1) and without (*M*_{i} = 0) menopause using a model of the form:

$$\begin{array}{c}\delta l \sim {\rm{l}}{\rm{o}}{\rm{g}}{\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu ,\sigma )\\ {\mu }_{i}=\alpha +{\beta }_{{\rm{P}}{\rm{R}}}{M}_{i}\\ \sigma \sim {\rm{E}}{\rm{x}}{\rm{p}}(1)\\ \tau {l}_{{{\rm{f}}}_{i}} \sim {\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu {l}_{{{\rm{f}}}_{i}},\sigma {l}_{{{\rm{f}}}_{i}})\\ \tau {l}_{{{\rm{m}}}_{i}} \sim {\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(\mu {l}_{{{\rm{m}}}_{i}},\sigma {l}_{{{\rm{m}}}_{i}})\\ \delta {l}_{i}=\frac{\tau {l}_{{{\rm{f}}}_{i}}\,}{\tau {l}_{{{\rm{m}}}_{i}}}\end{array}$$

(4)

where *τl*_{f} and *τl*_{m} are, respectively, true female and male lifespans, drawn from a normal distribution parametrized by the distribution of age *Z* calculated from the posterior of mortality parameters (Lifespan data and modelling section).

Second, for each of the species with menopause we also calculated the probability of each sex reaching the age of female menopause (*l*_{M}) and the expected lifespan at the age of female menopause (*e*_{M}). We use the age of last known reproduction as age of menopause to allow all killer whales to be included in the analysis (Kinship demography analysis section) but results are qualitatively similar if the reproductive lifespan is used. We calculated *l*_{M} and *e*_{M} from the mortality parameters derived from the fitted Gompertz models^{82}. Morality parameters for both sexes are calculated in the same model so for each species female and male *l*_{M} and *e*_{M} can be directly compared from each posterior draw.

### Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.