In vitro parasites culture and phenotyping
Parasites (NF54) were cultured in O− or A− whole blood obtained from the Etablissement Français du sang (21PLER2020-004), in RPMI 1640 culture medium (Gibco) supplemented with 2 g l−1 glucose (Sigma), 2 g l−1 sodium bicarbonate (Sigma), 10 mg l−1 hypoxanthine (Sigma), 50 mg ml−1 gentamicin (Gibco), 0.5% Albumax-II (Gibco) and 5% AB+ human serum (21PLER2020-004), and were kept at 37 °C in a gaseous environment containing 5% O2, 5% CO2 and 90% N2. Sexual stages were obtained by using the same complete media and with standard gametocyte culturing methods25. Contaminating asexual forms were removed from the cultures by treatment with 20 U ml−1 heparin sodium salt (Sigma) added to the medium26. For the time courses used in the single-cell RNA-seq, an asynchronous parasite culture was grown to 4–7%, at which point only two thirds of the media was replaced with fresh media; one day after this stress, heparin was added to prevent any further invasion and the parasites were sampled at day 3, 5, 7 and 9 following the stress to cover a broad range of gametocyte development without excessive contaminating asexuals. For the maturity assays, heparin was also used and gametocytes were scored daily from day 4 after stress. The maturity of gametocytes was assessed and quantified on Giemsa-stained thin-blood smears. For the exflagellation assays, stage V gametocytes were placed in RPMI with 10% serum and 100 µM xanthurenic acid (Sigma), and the number of exflagellation centres was established on a hemocytometer from 15 to 30 min following activation. Exflagellation rates were calculated by using the red blood cell (RBC) density as well as the gametocytemia (stage V) of each culture and were normalized to WT. To track the viability of the parasites during sex determination, synchronous gametocyte cultures of WT, KO and Δ270–699 treated with heparin as above were followed every 24 h from day 2 to day 6 after stress. Gametocytes were stained with Vybrant Green (1:10,000 dilution in RPMI, Thermo Fisher) and 300 nM MitoTracker Deep Red (Thermo Fisher) in a RPMI culture for 30 min at 37 °C. An unstained control and stained controls with MitoTracker Deep Red and Vybrant Green were prepared with non-parasitized RBCs to establish the negative populations for both fluorochromes. After incubation, samples were washed twice, resuspended in 200 µl of phosphate-buffered saline (PBS) and analysed by flow cytometry on an LSRFortessa flow cytometer (BD Biosciences) with BD FACSDiva software (v.9.0.1). Vybrant Green was excited by a blue laser and detected by a 530/30 filter, whereas MitoTracker Deep Red was excited by a red laser and was detected by a 650/10 filter. 500,000 events were collected for each sample. Percentages of the MitoTracker Deep Red positive gametocytes amongst the Vybrant positive parasites were measured. The data collected were analysed with FlowJo v.10.7 (Tree Star). The gating strategy is shown in Supplementary Fig. 1. For the time course of md1 expression, synchronous cultures of Md1-2A-GFP treated with heparin as above were followed every 24 h from day 0 to day 8 after stress. Each day, parasites were stained with Hoechst 33342 (1:10,000 dilution in RPMI; Sigma) and 1,000 Hoechst-positive parasites were assessed for GFP expression under a ×40 oil immersion lens on a Zeiss Axiolmager Z2. The proportion of bright GFP-positive gametocyte stages was established on 100 Hoechst-positive cells.
P. falciparum (NF54) parasites were transfected by electroporating approximately 5% ring-stage-infected RBCs with a total of 50 μg circular plasmid as previously established27. After electroporation, transgenic parasites were grown under agitation (50 rpm) and were selected with the appropriate drug. The growth of resistant parasites was monitored by Giemsa-stained thin-blood smears under a light microscope. Mutant lines expressing the transgene episomally (KO complementation, overexpression FL and NT) were continuously selected with 2.5 µM blasticidin (Invitrogen). The mutant lines KO, Δ270–699, Md1-2A-GFP, Md1-3xHA, Md1-TurboID and Δint1 were selected with 2.5 nM of WR99210 (from Jacobus Pharmaceuticals) for 10 days and were subsequently grown without drug selection. Clones were derived from these mutants by limiting dilution in the absence of drug pressure. For each transfection, both mixed population and clones were screened by PCR for the correct integration and the absence of a WT locus.
Generation of plasmids
The genomic DNA used in all cloning reactions was extracted by using the NucleoSpin Blood kit (Macherey Nagel). Before extraction, erythrocytes of a highly parasitized culture of P. falciparum (NF54) were lysed in a solution of 0.1% saponin (Sigma) in PBS, for 5 min on ice, and parasites were harvested by centrifugation (3,600g, 4 °C, 5 min). The PCR reactions were performed by using KAPA HiFi polymerase (Roche) and the colony screening was done with GoTaq DNA Polymerase (Promega) following the manufacturer’s instructions. The annealed oligonucleotides used for the cloning of sgRNA sequences were first phosphorylated with T4 PNK for 30 min at 37 °C and were then cloned into the pDC2-Cas9-hDHFR-yFCU plasmid28 after BbsI digestion by using T4 DNA ligase (overnight at 16 °C). All restriction enzymes, T4 DNA ligase and T4 PNK were purchased from NEB. Fifteen different plasmids were prepared to study the function of md1 (PF3D7_1438800):
To completely excise the gene via a double crossover homologous recombination event, two donor sequences were designed to target the region upstream of the start codon (homology region 1; HR1) and downstream of the stop codon (homology region 2; HR2). Homology regions were amplified from genomic DNA with primer pairs 202/203 (522 bp) and 116/117 (554 bp), respectively, and sequentially inserted into the plasmid pLN-HAx3 by In-Fusion HD cloning (Clonetech) following digestion with AvrII and EcoRV (HR1) or AflII and BamHI (HR2). A sgRNA targeting the 5′ end of the gene was selected by using the tool available at https://zlab.bio/guide-design-resources29 and the sgRNA (annealed primers g33/g34) was cloned into the pDC2-Cas9-hDHFR-yFCU plasmid. The resulting parasite line was named KO in which the entirety of the md1 open reading frame (ORF) was removed, without integration of a resistance cassette.
Complementation of the KO was done in trans through the episomal expression of the complete md1 ORF preceded by 1,538 bp of its putative promoter region. The promoter was amplified from genomic DNA by using primers 209/210 (1,538 bp) and cloned into the pLN-HAx3 digested with ApaI and AvrII (HR1) by In-Fusion HD cloning. This intermediate plasmid was named the pLN-md1prom-HAx3-intermediate. Next the FL ORF was amplified by using as template stage IV/V gametocyte complementary DNA (cDNA) (see below) and primers 211/208 (2,100 bp). This was cloned into the pLN-md1prom-HAx3-intermediate digested with AvrII and NaeI by In-Fusion HD cloning.
A different gene-disruption strategy was used to generate the mutant line Δ270–699. In this case, two donor sequences were designed to target the end of the first exon by using primers 287/288 (HR1, 505 bp) and the fourth exon with primers 289/290 (HR2, 474 bp). These products were sequentially cloned into the plasmid pDC2-U6-Cas9-hDHFR30 with a Gibson assembly strategy following digestion with AatII and EcoRI (HR1) and ApaI (HR2). Insertion of each of the HRs on either side of the drug resistance cassette (hDHFR) present in the plasmid ensured the integration of the latter in the genome of the parasite upon transfection. A sgRNA sequence targeting the beginning of the fourth exon was cloned into the same plasmid by using the annealed primers g92/g93. Given the unexpected phenotype of the Δ270–699 mutant, we assessed the coding potential of the disrupted locus. After integration of the plasmid by double crossover homologous recombination, the first 808 bp (out of a total of 1,006 bp) of the exon 1 remained undisturbed. In the modified locus, this segment of the md1 gene was followed by 20 bp of plasmid sequence, a stop codon and the 3′ untranslated regions of the resistance gene cassette encoded in the reverse strand. In the remaining 476 bp at the 3′ end of the gene, the lack of a promoter, and a frameshift that generated six stop codons, eliminated its coding potential. As a result, in Δ270–699 mutants we expect the expression of a truncated form of Md1 with the first 269 aa of the first exon.
For the NT plasmid, to overexpress the N terminus of Md1, we amplified the exon 1 from stage IV/V gametocyte cDNA with primers 211/212 (1,039 bp). This fragment was cloned into the pLN-md1prom-HAx3-intermediate after digestion with AvrII and NaeI by In-Fusion HD cloning. The FL plasmid corresponds to that used for complementation as described above.
To introduce a triple haemagglutinin (HAx3) tag at the C terminus of Md1 via a double crossover homologous recombination event, two donor sequences were designed to target either the region immediately upstream (homology region 1; HR1) or downstream of the stop codon (homology region 2; HR2). The HR1 was amplified from genomic DNA by using primers 114/115 (630 bp), whereas the HR2 was amplified with primers 116/117 (554 bp). These fragments were sequentially inserted into the plasmid pLN-HAx3 by In-Fusion HD cloning (Clontech) following digestion with NaeI and EcoRV (HR1) or AflII and BamHI (HR2). Three shield synonymous mutations were included in the primer 115 to prevent Cas9 targeting after recombination. A sgRNA targeting the 3′ end of the gene was selected and the sgRNA sequence (annealed primers g31/g32) was cloned into the pDC2-Cas9-hDHFR-yFCU plasmid28. Both plasmids, one providing the donor recombination sequence and the other the Cas9, were transfected simultaneously to obtain a selection-marker-free edition of the locus with a scarless insertion of the tag. The resulting parasite line was named Md1-3xHA.
This plasmid was generated from the Md1:HAx3 plasmid, by replacing the 3xHA tag with a T2A:gfp cassette amplified from a template plasmid with primers 147/148. This was assembled with In-Fusion HD cloning following digestion of the Md1:HAx3 plasmid with NaeI and AflII. The same guide plasmid (g31/g32) was used to integrate the T2A:gfp cassette in frame with md1 at the C terminus. The resulting parasite line was named Md1-2A-GFP.
This plasmid was generated from the Md1:HAx3 plasmid, by replacing the 3xHA tag with a TurboID cassette. The latter was amplified from the template plasmid 3xHA-TurboID-NLS_pCDNA3 (Addgene; no. 107171) by using primers 316/317. This was assembled with In-Fusion HD cloning following digestion of the Md1:HAx3 plasmid with AflII and DraIII. The same guide plasmid (g31/g32) was used to integrate the TurboID cassette in frame with md1 at the C terminus. The resulting parasite line was named Md1-TurboID.
To remove the first intron of md1, we designed a 671 bp gene fragment g-block (IDT) that included the last 313 bp of the first exon followed by the beginning of the second exon (317 bp) excluding the intron. The g-block included at its 5′ and 3′ ends 20 bp of overlapping sequences for insertion into the pDC2-U6-Cas9-hDHFR30 plasmid through Gibson assembly, following digestion with EcoRI and AatII. In addition, the g-block included five synonymous shield mutations to prevent Cas9 targeting after modification of the locus. The sgRNA sequence produced by the annealed primers g94/g95 was cloned into the pDC2-Cas9-hDHFR-yFCU plasmid carrying the g-block. The resulting parasite line was named Δint1.
CRISPR interference plasmids
We identified a set of four sgRNAs in the md1 promoter (spread over the 600 bp preceding the start codon). Each of the sgRNA sequences produced by the annealed primers g52/g35, g54/g39, g55/g41 and g56/g43 was cloned into the sequence-optimized pL7 provided by J. Bryant31 at the btgZI site. The four pL7 plasmids (15 µg each; selected continuously with 2.5 nM of WR99210) were transfected with pUF_dCas9-3HA-glmS (30 µg; selected continuously with 1.5 µM of Dsm1) also provided by J. Bryant, to generate the parasite line Md1-dCas9-prom. A control parasite line was prepared, which carried the pUF_dCas9-3xHA-glmS alone (Md1-dCas9-ctrl).
A list of all primers that were used for genotyping, cloning and CRISPR–Cas9 guides is provided in Supplementary Table 1.
Cultured stage V gametocytes were spun down at 600g for 3 min. After discarding the supernatant, pellets were resuspended in 20× pellet volume of 0.15% Saponin (Sigma) for 5 min on ice, washed with Dulbecco’s phosphate-buffered saline (DPBS) once and the pellet was resuspended in 1 volume of DPBS (50 µl) plus 100 µl of TRIzol (Invitrogen) and stored at −80 °C. RNA was isolated according to the TRIzol manufacturer’s instructions. RNA was DNase-treated with Turbo DNA-free kit (Invitrogen). cDNA was generated with the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems) by using provided random hexamers and with the addition of oligodTs (Invitrogen), and was incubated for 1 h at 37 °C, followed by 10 min at 95 °C. The Power SyBR Green Master Mix (Applied Biosystems) was used for qRT–PCR with primers at a concentration of 400 nM in the final mix with the following cycling programme: a single incubation at 95 °C for 1 min followed by 44 cycles (95 °C for 10 s, 60 °C for 60 s). For the lncRNA qRT–PCR the cycling programme was modified to a single incubation at 95 °C for 1 min followed by 44 cycles (95 °C for 10 s, 58 °C for 10 s, 68 °C for 30 s). Fold changes were calculated by using the ΔΔCt method32. Sex ratios were measured by using PfMGET9 normalized with the housekeeping gene uce33. Expression measurements were made with intron-spanning primers and were normalized with uce33. A list of all primers is provided in Supplementary Table 1.
To measure the proportion of female gametocytes, stage V gametocytes were smeared on a glass slide, fixed with cold 100% methanol, blocked with blocking solution (1% fetal calf serum (Thermo Fisher) in PBS) for 1 h. Parasites were permeabilized for 10 min with 0.1% Triton X (Sigma) and quenched with 0.1 M glycine in PBS (Gibco) for 10 min. The slides were incubated with primary antibodies diluted in blocking solution at room temperature for 2 h; mouse monoclonal anti-α-tubulin (Sigma) and rabbit polyclonal anti-Pfg377 (provided by P. Alano) were used at 1:500. Slides were washed three times in PBS and were incubated with secondary antibodies (1:1,500, Thermo Fisher) in blocking solution for 1 h. Slides were mounted in ProLong Gold Antifade reagent (Thermo Fisher) and were observed under a ×40 oil immersion lens on a Zeiss Axiolmager Z2.
To assess the localization of the Md1-3xHA fusion protein, stage II to V gametocytes were fixed with 4% paraformaldehyde and were allowed to settle on poly-l-lysine coated slides (Thermo Scientific) overnight at 4 °C. Parasites were washed four times with 1× tris-buffered saline (TBS), permeabilized 5 min with 0.2% Triton X in PBS and blocked with 10% goat serum and 3% BSA in TBS for 1 h. Slides were then incubated with primary antibodies diluted in 1% BSA in TBS at 4 °C overnight; polyclonal rabbit anti-ERD2 (BEI, MRA-72) was used at 1:2,000 and monoclonal rat anti-HA (Roche, 3F10) was used at 1:500. Secondary antibodies Alexa-fluor 488/568 anti-rat or anti-rabbit (Thermo Fisher) were used at 1:1,500 dilution in TBS containing 1% BSA and were incubated for 1 h at room temperature. Slides were mounted in Fluoromount-G with DAPI (Thermo Fisher) and were imaged under a ×63 oil immersion lens on a Zeiss LSM 880 Airyscan confocal microscope with Zeiss ZEN microscopy software (v.2.3 SP1). Images were processed in Fiji34, the GDSC plugin was implemented on z-stacks by using the Stack Colocalisation Analyser to obtain the Pearson’s r coefficient between different channels.
120 ml cultures of WT, Δint1 or Δ270–699 parasites were supplemented with 20 U ml−1 heparin three days after stress. Cultures were harvested on day 9 after stress. The pellet was resuspended in ten times the pellet volume of PBS containing 0.15% saponin, incubated for 5 min on ice, washed three times with cold PBS and centrifuged at 4 °C at 3,200g for 5 min. Parasite pellets were then lysed by resuspension in 150 µl of lysis buffer (50 mM Tris pH 8, 150 mM NaCl, 1% NP40, 4 mM EDTA, 1× protease inhibitor cocktail (Sigma-Aldrich)) and incubated for 4 h at 4 °C on a rotating wheel. To improve homogeneity and fluidity, lysates were passed 10 times on a 27 G syringe, sonicated 3 times for 5 s separated by 25 s cooling on ice (Branson Sonifier) before centrifugation at 15,000g for 15 min at 4 °C. The supernatants were harvested and 50 µl of Laemmli 4× (Bio-Rad) and 10 µl of β-mercaptoethanol were added. Protein digestion was done on S-Trap micro spin columns (Protifi) according to the manufacturer’s protocol. In brief, protein extracts (300 µg) were resuspended in SDS buffer (5%), reduced with 20 mM of dithiothreitol (DTT) for 10 min at 95 °C and alkylated with 40 mM of iodoactamide for 30 min in the dark. Samples acidified by phosphoric acid (12% final) were diluted in a binding buffer (100 mM triethylammonium bicarbonate (TEABC) in 90% methanol) and proteins were trapped on the columns. Enzymatic digestion was performed by adding 1 µg of trypsin (Gold, Promega) for 2 h at 47 °C. Peptides were eluted with 50 mM of TEABC and formic acid. Peptide samples were fractionated by reverse phase with steps of acetonitrile in eight fractions (High pH Reverse-Phase peptide fractionation kit, Thermo Fisher Scientific). Samples were analysed by nano-flow HPLC-nano electrospray ionization using a Q-Exactive HF mass spectrometer (Thermo Fisher Scientific) coupled with an Ultimate 3,000 RSLC (Thermo Fisher Scientific). Desalting and pre-concentration of samples were performed online on a Pepmap precolumn (0.3 mm × 10 mm; Thermo Fisher Scientific). Peptides were introduced into the column (Pepmap 100 C18; 0.075 mm × 500 mm; Thermo Fisher Scientific) with buffer A (0.1% formic acid) and eluted with a gradient of 2–40% of buffer B (0.1% formic acid 80% CAN) at a flow rate of 300 nl min−1. Eluted peptides were electrosprayed into a Q-Exactive HF mass spectrometer. Spectra were acquired with the Xcalibur software (v.4.2 Thermo Fisher Scientific). MS/MS analysis were performed in a data-dependent mode. Full scans (375–1,500 m/z) were acquired in the Orbitrap mass analyser with a 60,000 resolution at 200 m/z. For the full scans, 3 × 106 ions were accumulated within a maximum injection time of 60 ms and detected in the Orbitrap analyser. The 12 most intense ions with charges states ≥2 were sequentially isolated to target value of 1 × 105 with a maximum injection time of 100 ms and were fragmented by higher energy collision dissociation in the collision cell (normalized energy of 28%) and detected in the Orbitrap analyser at 30,000 resolution. Raw spectra were processed by using the MaxQuant environment (v.220.127.116.11)35 and Andromeda for database search with label-free quantification (LFQ), match between runs and the iBAQ algorithm enabled36. The MS/MS spectra were matched against UniProt Reference proteome of Plasmodium falciparum (3D7, Proteome ID UP000001450; 2022_01), streptavidin and Δ270–699 Md1 sequence and 250 frequently observed contaminants as well as reversed sequences of all entries. Default search parameters were applied. Oxidation (Met) and acetylation (N-term) were used as variable modifications and carbamidomethylation (Cys) was used as a fixed modification. The FDR was set to 1% for the peptides and proteins. A representative ID in each group was automatically selected by using an in-house bioinformatics tool (Leading v.3.4). First, proteins with the most numerous identified peptides were isolated in a ‘match group’ (proteins from the ‘Protein IDs’ column with the maximum number of ‘peptides counts’). For the match groups in which more than one protein ID were present after filtering, the best annotated protein in UniProtKB was defined as the ‘leading’ protein (UniProtKB-GOA, made on 20220317). The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE37 partner repository with the dataset identifier PXD035547 and 10.6019/PXD035547. All processed data are included in Supplementary Table 2.
Cultures were recovered between day 6 and day 10 after stress. Culture suspensions were centrifuged at 600g for 4 min. The pellet was resuspended in ten times the pellet volume of PBS containing 0.15% saponin, incubated for 5 min on ice then washed three times with cold PBS and centrifuged at 3,200g for 5 min at 4 °C. The parasites were resuspended in lysis buffer (50 mM Tris pH 8, 150 mM NaCl, 1% NP40, 4 mM EDTA, 1× protease inhibitor cocktail (Sigma-Aldrich)) (100 µl of lysis buffer for 10 ml of culture) and incubated 4 h at 4 °C on a rotating wheel. Lysates were clarified by centrifugation at 15,000g for 15 min at 4 °C. Proteins were denatured by boiling for 5 min with 1× NuPAGE LDS sample buffer (Fisher) supplemented with 100 mM DTT and separated on 10% SDS–PAGE gels. Gels were transferred onto a nitrocellulose membrane (Amersham) and stained with Ponceau S (0.1% (w/v) in 1% acetic acid). The blots were blocked with 5% (w/v) milk (Bio-Rad) in PBS–0.1% Tween 20 (PBS-T) for 30 min at room temperature then washed three times in PBS-T. Blots were then incubated with a mouse anti-HA antibody (Fisher) in 5% milk in PBS-T overnight at 4 °C, washed three times with TBS-T for 5 min each then incubated with horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG antibody (Bio-Techne) in 1.5% BSA for 1 h at room temperature. Blots were washed three times with PBS-T for 5 min each before developing with the Immobilon Forte WB HRP substrate (Millipore) and were imaged on an Imager 680 (Amersham) or a Chemidoc MP (Bio-Rad). Raw blot images are shown in Supplementary Fig. 2.
Proximity labelling, proteomics and TurboID analysis
Three replicates of WT and three replicates of Md1-TurboID were prepared. Each replicate consisted of two 40 ml cultures committed two days apart. Two days after stress, cells were washed in biotin-free RPMI (Cell Culture technologies) and the culture medium was replaced by serum-free, biotin-free RPMI supplemented with 1% Albumax, 10 mg ml−1 hypoxanthine and 20 U ml−1 heparin. The medium was changed every day until recovery. Cultures were pooled at day 7 and day 9 post-stress, respectively. Before harvesting, cultures were treated with a 2 h pulse of 100 µM biotin (Thermo Scientific). Cells were processed with saponin and lysis as described above and were washed five times in cold PBS to eliminate excess biotin. Then 80 μl of streptavidin-coated magnetic beads (Fisher) were washed three times in lysis buffer and were supplemented with proteinase inhibitor cocktail then incubated with clarified lysates overnight at 4 °C on a rotating wheel. The beads were subsequently washed twice with lysis buffer, once with 1 ml of 1 M KCl, once with 1 ml of 0.1 M Na2CO3, once with 1 ml of 2 M urea in 10 mM Tris-HCl (pH 8.0) and twice with 1 ml lysis buffer38. Low-protein binding tubes were used and samples were transferred to new tubes twice during the bead washing process. Biotinylated proteins were eluted by boiling the beads 5 min in 80 μl of 4× Laemmli (Bio-Rad) supplemented with 100 mM DTT. Beads were removed with the magnets and by centrifugation for 5 min at 20,000g and 4 °C. Then 20 µl of lysate, 20 µl of flow through and 5 µl of eluted samples were run on a SDS–PAGE gel as described above. Blots were blocked in a casein buffer (Bio-Rad) and were incubated with HRP-conjugated streptavidin (Cytiva, 1:5,000) in casein buffer for 45 min to 1 h at room temperature, then washed three times with PBS-T for 5 min each before developing as described above. Raw blot images are shown in Supplementary Fig. 2. For MS preparation, 45 µl of eluted proteins were separated by SDS–PAGE gel using a short (2 cm) migration to avoid biotin contamination. Single pieces of gel were excised for each sample and protein were in-gel digested using Trypsin (Gold, Promega)39. MS processing and analysis was done as above. LFQ was used to identify proteins that were differentially enriched between conditions. Only hits found in all Md1-TurboID replicates were analysed. A two sample t-test was applied to the log2(LFQ) values of the WT and Md1-TurboID. In cases for which the WT had only been detected in one replicate, we applied a one-sample t-test by using the WT value as the mean against the triplicate values detected in the Md1-TurboID; P values were adjusted for multiple testing by using the FDR method. For the calculation of the Md1-TurboID/WT ratios, we imputed the lowest LFQ value detected in the dataset (16) to the missing values in the WT condition, and then calculated the log2(Md1-TurboID/WT) by using corrected LFQ values. All processed data and calculations are included in Supplementary Table 2. The list of highlighted hits in the volcano plots is described in Supplementary Table 2. The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository37 with the dataset identifier PXD035553 and 10.6019/PXD035553.
Two methods were used for single-cell transcriptomics, 10x Genomics droplet sequencing was used to phenotype mutants, because it allows the generation of several thousand cells per sample, providing excellent coverage of the sex determination and differentiation process. Smartseq2 was preferred to characterize the Md1-2A-GFP line because by using this method each transcriptome can be associated with information obtained during the cell sort, such as levels of protein expression, given by the GFP intensity signal. This is not yet possible for intracellular fluorescent proteins with droplet-based single cell RNA-seq.
10x sample preparation, library generation, sequencing and initial mapping
Four different gametocyte cultures for the WT, KO and Δ270–699 strains were seeded at 2-day intervals, with heparin (20 U ml−1) added after the initial invasion to remove any remaining asexual forms. Cultures were harvested on days 3, 5, 7 and 9 after stress. Gametocytemias were assessed on Giemsa-stained thin-blood smears, and cell density was established on a haemocytometer. The cells from the four time points were pooled at a 1:1:1:1 ratio for each genotype. These pools were counted again with the same method and were seeded at a concentration of 72,000 RBC µl−1 (NF54), 74,000 RBC µl−1 (Δ270–699) and 85,000 RBC µl−1 (KO). Each pool was processed with the 10x chromium platform as described10 with a target recovery of 8,000 parasite cells per run and by using v3 chemistry. Sequencing was multiplexed on half a lane of the NovaSeq 6,000 S1 flow cell with paired-end (PE) sequencing (26 and 98 bp) and yielded approximately 150 millions reads per run. The data were demultiplexed and mapped to the P. falciparum v3 genome sequence (GenedB, June 2021) by using Cell Ranger v.6.0.1.
Smartseq2 sample preparation, library generation, sequencing and mapping
Single-cell transcriptomics were conducted as reported previously10,20 with slight modifications. In brief, gametocyte cultures were spun at 500g for 3 min at 4 °C, washed once in DPBS (Thermo Fisher) and stained with 2.5 µM of MitoTracker Deep Red FM (Thermo Fisher) in DPBS for 15 min on ice. Cell sorting was conducted on an FACsAria II cell sorter (BD Biosciences) with a 100 µm nozzle by using BD FACSDiva software (v.9.0.1). Gametocytes were sorted by gating on single-cell events and on the APC and GFP channel. The gating strategy is shown in Supplementary Fig. 1. A stained uninfected RBC control was used for establishing the gating strategy. Single cells were sorted in 96-well plates containing 4 µl of lysis buffer (0.8% of RNAse-free Triton X (Fisher) in nuclease-free water (Ambion), treated with ultraviolet light for 5 min with a Stratalinker UV Crosslinker 2,400 at 200,000 µJ cm−2, 2.5 mM dNTPs (Life Technologies), 2.5 µM of oligo(dT) (5′-AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT-3′) and 2U of SuperRNAsin (Life Technologies)). Sorted plates were spun at 1,000g for 10 s and were immediately placed on dry ice. Plates were heated at 72 °C for 3 min. A reverse transcription mix containing 1 µM of LNA-oligonucleotide (5′-AGCAGTGGTATCAACGCAGAGTACATrGrG+G-3′; Qiagen), 6 µM MgCl2, 1 M betaine (VWR), 1× reverse transcription buffer, 50 µM DTT, 0.5 U SuperRNAsin and 0.5 µl of Smartscribe reverse transcriptase (Takara) was added to the plates. The total volume of the reaction was 10 µl. The following cycling conditions were used: a single incubation period at 42 °C for 90 min; followed by 10 cycles (42 °C for 2 min, 50 °C for 2 min); before a final incubation at 70 °C for 15 min. A further PCR mix was added to the plates, containing 1× KAPA Hotstart HiFi Readymix and 2.5 µM of the ISO SMART primer40 and incubated by using the following programme: a single incubation at 98 °C for 3 min followed by 25 cycles (98 °C for 20 s, 67 °C for 15 s, 72 °C for 6 min) and a final incubation at 72 °C for 5 min. Reactions were purified with 1× Agencourt Ampure beads (Beckman Coulter). Amplified cDNA was eluted with 10 µl of nuclease-free water (Ambion). The quality of a subset of cDNA samples was assessed with a high-sensitivity DNA chip (Agilent) with an Agilent 2100 Bioanalyser. Sequencing libraries were prepared by using the Nextera XT 96 kit (Illumina) according to manufacturer recommendations but by using quarter reactions. Dual indices set A and B were used (Illumina) for 192 different index combinations. Libraries were pooled and cleaned up with Agencourt Ampure beads (Beckman Coulter) used at a 4:5 ratio. The quality of the libraries was assessed with the high-sensitivity DNA chip (Agilent) ran on an Agilent 2100 Bioanalyzer. Md1-2A-GFP transcriptomes were multiplexed in two pools of 192 samples and each sequenced on a lane of HiSeq 2,500 with 50 The FASTQ files were obtained after base calling and demultiplexing with Illumina’s software. The Nextera adaptor sequences were trimmed with Trim Gallore (v.0.6.5) by using trim_galore -q 20 -a CTGTCTCTTATACACATCT –paired –stringency 3 –length 50 -e 0.1. HISAT2 (v.2.0.0) (ref. 41) indexes were produced for the P. falciparum v3 genome sequence (GenedB, November 2019) by using default parameters. Trimmed, paired reads were mapped to the P. falciparum genome sequence by using hisat2 –max-intronlen 5000 p 12. SAM files were converted to BAM by using samtools-1.2 view –b and sorted with samtools-1.2 sort. The read counts were obtained with HTSeq (v.0.12.4)42 by using htseq-count -f bam -r pos -s no –t CDS by using a custom GTF as described elsewhere20. The raw read counts from all single-cell runs are available online43. To analyse the splice junctions, sorted bams from the Smartseq2 dataset were processed with portcullis (v.1.2.2)44 by using –max_length 2000 –min_cov 3. The resulting bed files were sorted and merged with bedtools (v.2.29.1)45.
The 10x data were initially filtered by using Seurat (v.4.0.4)46. Cells with less than 1,000 genes and genes present in less than two cells in each dataset were removed from the analysis. The data were normalized by using NormalizeData with the -LogNormalize method, variable features were identified with FindVariableFeatures and the data were scaled with ScaleData. The KO and Δ270–699 datasets were each merged with the WT for a combined analysis. Doublets were identified and were filtered with scDblFinder (v.1.6.0)47. After filtration and doublet removal, we recovered 2,076, 2,535 and 2,951 high-quality cells for the WT, KO and Δ270–699 datasets, respectively. For the Smartseq2, the count data were read as single cell experiments (v.1.8.0)48 and was normalized in scater (v.1.20.1)4,9 and cells with fewer than 1,000 genes or 25,000 reads were removed from the analysis. Genes identified in more than two cells by at least five reads were kept in the analysis. Filtration resulted in 322 high-quality transcriptomes.
Classification and ordering of cells, differential expression analysis
The principal component analysis was conducted within scater (v.1.20.1)49, and the sexual identity of cells was formally established by using scmap (v.1.8.0)50 to map our datasets to an existing P. falciparum dataset that contained both asexual and gametocytes from the Malaria Cell Atlas10,11. We built a cell index of the reference data and we used the scmapCell function to assign each query cell (from our datasets) to the index cell with which it had the highest cosine similarity. We could then assign the metadata of that index cell to the query cell. We used an assignment cosine similarity threshold of 0.75. This approach oriented the root (progenitor) and two branches (male and female) in our WT-KO, WT-Δ270–699 and Md1-2A-GFP datasets. To order the cells in a trajectory, we implemented lineage and pseudotime analysis by using slingshot (v.1.4.0)51. The cells were first clustered by kmeans, before identifying the lineage structure with getLineages, followed by a simultaneous principal curves analysis with getCurves, which fits branching curves to the identified lineages and provides pseudotime estimates for each lineage. The cells were assigned to two developmental paths in all cases going from the progenitor to either the male or the female tip of the branch. A progenitor cell was defined as one that is present in both lineages, a male cell as one absent from the female developmental path and a female cell absent from the male developmental path. Representation of the composition of genotypes along each lineage was generated with a subset of the data with 50 cells selected at random per kmeans cluster to correct for uneven representation of the clusters in each dataset. Differential expression analysis was conducted by using the MAST package (v.1.3.4)52 in which a hurdle model is implemented to each gene with zlm. The genes with a log fold change superior to 1.25 and a FDR inferior to 1% were considered to be differentially expressed.
Direct RNA sequencing
109 NF54 gametocytes (stage II to V) were harvested by centrifugation (500g for 3 min at 4 °C), then lysed with 0.15% Saponin (Sigma) in DPBS on ice for 2 min and washed once with DPBS. The RNA was then extracted with TRIzol Reagent (Thermo Fisher) according to the manufacturer′s protocol. RNA quality was checked with the RNA 6,000 Nano Kit (Agilent) with an Agilent 2100 Bioanalyzer. Total RNA (75 μg) was treated with a Dynabead mRNA Purification kit (Thermo Fisher). 500 ng of poly-A RNA were processed with the Direct RNA Sequencing kit from Oxford Nanopore Technologies. In brief, the library is generated by ligation of a 3′ adaptor on the RNA molecules, followed by reverse transcription with Superscript III (Thermo Fisher), purification of the DNA–RNA hybrids with Agencourt RNA XP beads (Beckman Coulter), ligation of the 1D adaptor, purification with Agencourt RNA XP beads (Beckman Coulter) and quantification of DNA–RNA hybrids by using the DNA HS kit (Qubit). The sample was spiked with a control RNA (5%) and was loaded on a MinION flow cell (Oxford Nanopore). Sequencing was performed as recommended with the MinKNOW software (v.19.10.1). Base calling was performed in Guppy (v.3.4.3) and quality control was performed with PycoQC (v.18.104.22.168). After quality control, 639,440 reads with a mean Phred score of 10.63 and a median reads size of 698 bp were recovered. The data were aligned with minimap2 (v.2.17-r941)53 to the control RNA sequence and Samtools (v.1.9)54 stats was used to calculate the run error rate, which amounted to 6.72 %. The reads were aligned to the P. falciparum genome v3 (GenedB, November 2019) by using minimap2 (v.2.17-r941)53 with options -ax splice -k14. 79.8% of the reads mapped to the reference genome. Bedtools (v.2.29.1)45 was used to generate stranded bed files with the genomeCoverageBed command. The coverage analysis and read visualization were performed with the R package Sushi (v.1.24.1)55. The coding potential of the lncRNA was tested with CPAT19.
For all experiments, the number of biological replicates are provided in the figure legends. Statistical analyses were performed in the R environment (v.4.1.0) by using RStudio (v.1.4.1717). The IFA and qRT–PCR sex ratio measurements were first tested for equal variance by implementing an F-test with default parameters to compare the variances of the two samples. The samples were then compared by using a student t-test with default parameters (two-sided) and by specifying var.equal = TRUE/FALSE in accordance with results of the F-test. The mutual exclusivity of expression was tested with a Fisher’s Exact Test for count data. The significance codes in the figures: NS, non-significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.