Study participants

Individuals infected with HIV-1 were recruited as study participants at the Massachusetts General Hospital in Boston, MA, and at the NIH Clinical Center in Bethesda, MD. Peripheral blood mononuclear cell (PBMC) samples were obtained according to protocols approved by the respective Institutional Review Boards. All study participants gave written informed consent for blood collection or for lymph node (LN) biopsies. Clinical characteristics of study participants are summarized in Extended Data Figs. 2 and 9. Study participants were preselected on the basis of high frequencies of HIV-1-infected CD4+ T cells analysed in previous studies.

LN biopsies

Inguinal LNs were excised surgically with informed consent of study participants, according to protocols approved by the Massachusetts General Hospital Institutional Review Board. LN tissue was dissected and mechanically disaggregated through a 70-µm nylon cell strainer in RPMI medium supplemented with 10% fetal bovine serum.

Isolation of memory CD4+ T cells

PBMCs and LN mononuclear cells (LNMCs) were isolated using Ficoll–Paque density centrifugation. PBMCs and LNMCs were viably frozen in 90–95% fetal bovine serum and 5–10% dimethylsulfoxide. For analysis, cells were thawed and subjected to negative immunomagnetic isolation of memory CD4+ T cells, using a commercial product (Stemcell EasySep Human Memory CD4+ T cell Enrichment Kit, no. 18000) per the manufacturer’s protocol.

Surface labelling with monoclonal antibodies

Monoclonal antibodies tagged with distinct oligonucleotides were custom manufactured and supplied as lyophilized single-reaction vials by a commercial vendor (BioLegend). Antibodies to the following surface markers were used: PDL1 (clone 29E.2A3), CD276 (clone DCN.70), HVEM (clone 122), CD155 (clone SKII.4), CD154 (also known as CD40L) (clone 24-31), CCR4 (clone L291H4), PD1 (clone A17188B), TIGIT (clone A15153G), CD44 (clone BJ18), CXCR3 (clone G025H7), CCR5 (clone HEK/1/85a), CCR6 (clone G034E3), CXCR5 (clone J252D4), CCR7 (clone G043H7), KLRB1 (also known as CD161) (clone HP-3G10), CTLA4 (clone BNI3), LAG3 (clone 11C3C65), KLRG1 (clone 14C2A07), CD95 (clone DX2), OX40 (also known as CD134) (clone Ber-ACT35), CD57 (clone HNK-1), TIM3 (clone F38-2E2), BTLA (also known as CD272) (clone MIH26), CD244 (also known as 2B4) (clone 2-69), IL-2R (clone TU27), CD137 (also known as 4-1BB) (clone 4B4-1), GITR (also known as CD357) (clone 108-17), CD28 (clone CD28.2), CD127 (clone A019D5), IL-6R (clone UV4), HLA-E (clone 3D12), MICA or MICB (clone 6D4), IL-15R (also known as CD215) (clone JM7A4), IL-21R (clone 2G1-K12), TNFR2 (clone 3G7A02), CD160 (clone BY55), LIGHT (also known as CD258) (clone T5-39), IL-10R (also known as CD210) (clone 3F9), TGFβR (clone W17055E), IL-12R (clone S16020B), CD6 (clone BL-CD6), CD49d (clone 9F10), CD25 (clone BC96), CD30 (clone BY88), CD69 (clone FN50), CD45RA (clone HI100), CD38 (clone HIT2), HLA-DR (clone L243), CD4 (clone RPA-T4), CD2 (clone TS1/8), CD3 (clone UCHT1), CD62L (clone DREG-56) and CD45RO (clone UCHL1). Two mouse IgG control antibodies (BioLegend, no. 400299, no. 400383), conjugated with distinct TotalSeq-D oligonucleotides, were included as isotype controls. The lyophilized antibody cocktails containing the above-mentioned antibodies were reconstituted with 60 μl of cell staining buffer (BioLegend, no. 420201). Cells were blocked with Human TruStain FcX (BioLegend, no. 422301) for 15 min on ice and then incubated with 50 μl of reconstituted antibody cocktail for 30 min on ice. After three washes with pre-chilled cell staining buffer, cells were filtered with a 40-μm cell Flowmi strainer (Fisher Scientific, no. 14-100-150), counted with an automated cell counter, and then loaded into a microfluidic cartridge for single-cell multiplex PCR assays.

Single-cell multiplex PCR

Single-cell amplification of defined genomic DNA segments was carried out using the Tapestri platform (MissionBio) according to the manufacturer’s protocol18. Viable single cells were encapsulated into droplets with a lysis buffer (containing protease and a mild detergent) and incubated for 1 h at 50 °C, followed by 10 min at 80 °C for heat inactivation of enzymes. Droplets containing single-cell barcoding beads (tagged with oligonucleotides carrying the cellular barcodes and custom-designed primers) were fused with encapsulated cell lysates. A panel of primers designed to amplify n = 18 different genomic regions in HIV-1 and n = 27 specific HIV–host DNA junctions of previously defined large clonal HIV-1-infected T cell populations in our study participants was used (Fig. 1b and Supplementary Table 1), in addition to two primer sets amplifying control genomic regions in the RPP30 gene on chromosome 10. The droplets were placed under ultraviolet light to cleave PCR primers containing unique cell barcodes from beads. To amplify the selected genomic DNA segments and the antibody oligonucleotide tags, droplets were subjected to PCR for 24 cycles with temperature gradients recommended by the manufacturer.

Sequencing library construction

Amplification products were pooled, mixed with AMPure XP beads (Beckman Coulter, no. A63882) at a ratio of 0.7 and placed in a magnetic field for separating the DNA and the protein tag libraries. The DNA library bound to AMPure beads was washed with 80% ethanol, and the supernatant containing the protein tag library was aspirated and incubated with a biotinylated oligonucleotide complementary to the 5′ end of the antibody tags, followed by magnetic isolation using streptavidin beads. For library amplification, PCRs were carried out with Illumina index primers P5 and P7 on purified DNA and protein libraries, respectively, according to the manufacturer’s protocol; 12 cycles were carried out for the DNA library, and 18 cycles were run for the protein tag library.

Next-generation sequencing

The DNA and protein tag libraries were run on a High Sensitivity D1000 ScreenTape instrument (Agilent Technologies, 5067-5584) with the Agilent 4200 TapeStation System to evaluate DNA quality. Libraries were quantified by a fluorometer (Qubit 4.0, Invitrogen) and sequenced on Illumina next-generation sequencing platforms with a 20% spike-in of PhiX control DNA (Illumina, no. FC-110-3002). DNA and protein tag libraries were sequenced separately on a NextSeq 500 instrument (Illumina), using the NextSeq 500/550 High Output flow cell v2.5 (Illumina, no. 20022408) and the NextSeq 500/550 High Output Kit v2.5 (300 cycles; Illumina, no. 20024908) in 2 × 150-bp paired-end runs.

Bioinformatic analysis

The Tapestri pipeline (MissionBio, v2.0.1) with minor modifications was used to process the sequencing data. Briefly, for DNA library data, cutadapt (v2.5)54 was used to trim 5′ and 3′ adaptor sequences, and extract 18-bp cell barcode sequences from read 1. Cell barcodes that aligned to a unique barcode on a whitelist within a Hamming distance of 2 were used for downstream analysis. Using bwa (v0.7.12)55, sequences were aligned to custom reference genomes built from the human genome (GRCh38) and patient-specific autologous HIV-1 sequences identified in prior studies. Single-cell alignments were filtered according to criteria implemented in the Tapestri pipeline, and indexed using samtools (v1.9)56. Candidate HIV-1-infected cells were determined by the CellFinder algorithm built in the Tapestri pipeline. Bcftools (v1.9)57 was used to call variants and generate consensus sequences. To reduce spurious alignments, viral sequencing reads were considered valid only if they covered at least 80% of the length of the reference sequence for each given amplicon; host sequencing reads had to cover at least 50% of the length of respective amplicon. For antibody library data, cell barcodes were similarly extracted using cutadapt. For reads with valid cell barcodes, 15-bp antibody barcodes were extracted from read 2. Antibody barcode sequences within a Hamming distance of 1 from known antibody barcodes were accepted. Candidate cells appearing in both libraries were processed for downstream analysis. Read counts for each antibody were normalized to the total read count in each cell using centred log-ratio transformation58. Cutoffs for a given phenotypic marker to be considered positive were defined by marker-specific read counts higher than 1 mean absolute deviation of the normalized median read count corresponding to unspecific IgG control antibodies. 
To generate a more homogeneous cell population for analysis, centred log-ratio values of all candidate cells that were CD3−CD4− (non-CD4+ T cells) and CCR7+CD45RA+ (contaminating naive CD4+ T cells) were excluded.
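The barcode matching and normalization steps described above can be illustrated with a minimal Python sketch. It is not the Tapestri pipeline's code: the function names are hypothetical, the pseudocount of 1 is an assumption (the text does not say how zero counts are handled), and the centred log-ratio shown is the textbook form, which may differ in detail from the variant used in the study.

```python
import math

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def correct_barcode(read_barcode, whitelist, max_distance):
    """Return the whitelist barcode within max_distance of the read barcode,
    but only if exactly one whitelist entry qualifies; otherwise None."""
    hits = [wl for wl in whitelist if hamming(read_barcode, wl) <= max_distance]
    return hits[0] if len(hits) == 1 else None

def clr(counts, pseudocount=1.0):
    """Centred log-ratio for one cell: log count minus the mean log count.
    The pseudocount is an assumption added to avoid log(0)."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Toy 4-bp barcodes for illustration; the text uses 18-bp cell barcodes
# (distance 2) and 15-bp antibody barcodes (distance 1).
whitelist = ["AAAA", "CCCC"]
print(correct_barcode("AAAT", whitelist, 1))  # unique match
print(correct_barcode("AACC", whitelist, 2))  # ambiguous: two candidates
print(clr([10, 0, 5]))
```

Requiring a unique match means a read barcode equidistant from two whitelist entries is discarded rather than assigned arbitrarily.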

Dimension reduction and clustering

UMAP embeddings of the centred log-ratio values in two dimensions were computed using the Monocle 3 (ref. 59) and uwot60 packages, with the number of principal components set to 15 for all cells and to 10 for HIV-1-infected cells alone; the numbers of neighbours used during k-nearest neighbours graph construction were set to 9 and 5, respectively. All other settings were kept at their default values. The cells were clustered using the Leiden community detection algorithm through Monocle 3, with the number of k-nearest neighbours set to 500.

Phylogenetic analysis of HIV-1

HIV-1 sequencing reads corresponding to each of the 18 HIV-1 amplification products and of additional amplification products corresponding to predefined virus–host junctions were aligned to the reference HIV-1 genome HXB2, to autologous intact HIV-1 sequences and to the human reference genome GRCh38. The presence or absence of hypermutations associated with APOBEC3G or APOBEC3F was determined using the Los Alamos National Laboratory HIV Sequence Database Hypermut 2.0 program61. Sequence alignments were carried out using MUSCLE62. Phylogenetic distances between sequences were examined using maximum-likelihood trees in MEGA and MAFFT, and visualized using highlighter plots. Proviruses were classified in 4 different categories according to the following criteria. Category 1 (any provirus): at least 20 total valid viral sequencing reads with at least 4 reads in at least 2 different HIV-1 amplicons each; category 2 (enriched for intact proviruses): a total of at least 20 viral sequencing reads with at least 4 sequencing reads from amplicons 2 and 16 (corresponding to IPDA amplicons) each or at least 4 reads from at least 15 amplification products each or at least 4 reads from amplification products spanning known virus–host junctions of intact proviruses (all viral sequencing reads in category 2 had to be without evidence of statistically significant hypermutation and could not correspond to virus–host junctions from known defective proviruses); category 3 (clonal proviruses): proviruses with identical proviral integration sites (based on at least 4 viral sequencing reads from amplification products of known virus–host junctions) or completely identical proviral sequences; category 4 (hypermutated proviruses): proviral sequences from category 1 that exhibited statistically significant sequence hypermutations (FDR-adjusted P < 0.05). Cells with HIV-1 sequencing reads that did not meet any of the above-mentioned criteria were excluded from the analysis. Category 0 cells were defined by complete absence of sequencing reads corresponding to HIV-1.
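The read-count thresholds in these criteria can be sketched as a small Python function. This is a simplified illustration under stated assumptions, not the authors' code: the function name and input layout are hypothetical, the hypermutation call is passed in as a precomputed flag, and the rules that depend on virus–host junction amplicons (part of category 2 and all of category 3) are deliberately left out.

```python
def classify_provirus(reads_per_amplicon, hypermutated=False,
                      min_total=20, min_per_amplicon=4):
    """Return the set of categories a cell falls into.

    reads_per_amplicon: dict mapping HIV-1 amplicon number (1-18) to the
    number of valid viral reads in that cell (hypothetical input format).
    hypermutated: result of a separate hypermutation test (assumed given).
    An empty set means reads were present but no criterion was met,
    i.e. the cell would be excluded.
    """
    total = sum(reads_per_amplicon.values())
    if total == 0:
        return {0}  # category 0: no HIV-1 reads at all

    covered = {a for a, n in reads_per_amplicon.items() if n >= min_per_amplicon}
    categories = set()
    if total >= min_total and len(covered) >= 2:
        categories.add(1)  # any provirus
        if hypermutated:
            categories.add(4)  # hypermutated subset of category 1
        elif {2, 16} <= covered or len(covered) >= 15:
            categories.add(2)  # enriched for intact (junction rule omitted)
    return categories

print(classify_provirus({2: 10, 16: 12}))        # category 1 and category 2
print(classify_provirus({1: 10, 3: 12}))         # category 1 only
print(classify_provirus({1: 3, 2: 2}))           # excluded
```

Returning a set reflects that the categories overlap: category 4 is defined as a subset of category 1, and a cell meeting the amplicon-based category 2 rule also meets category 1.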

Cell sorting and flow cytometry

Memory CD4+ T cells isolated from PBMCs by negative immunomagnetic enrichment were incubated with defined fluorophore-labelled surface antibodies: CD3–PerCP–Cy5.5 (BioLegend, clone UCHT1, catalogue no. 300430), CD4–BUV395 (BD, clone RPA-T4, catalogue no. 564724), PD1–FITC (BioLegend, clone A17188B, catalogue no. 621612), TIGIT–BV421 (BioLegend, clone A15153G, catalogue no. 372710), PVR–PE (eBioscience, clone 2H7CD155, catalogue no. 12-1550-41), HVEM–APC (BioLegend, clone 122, catalogue no. 318808), BTLA–FITC (BioLegend, clone MIH26, catalogue no. 344524), KLRG1–APC (BioLegend, clone 14C2A07, catalogue no. 368606), HLA-E–BV421 (BioLegend, clone 3D12, catalogue no. 342612), PDL1–PE (BioLegend, clone MIH2, catalogue no. 393608). After 25 min of incubation, cells were washed, and indicated cell populations were sorted in a specifically designated biosafety cabinet (Baker Hood), using a FACSAria cell sorter (BD Biosciences) at 70 pounds per square inch. Cell sorting was carried out by the Ragon Institute Imaging Core Facility at Massachusetts General Hospital. Data were analysed using FlowJo software (Tree Star). Sorted cells were subjected to proviral sequencing analysis using the full-length individual proviral sequencing assay (FLIP-seq).

Full-length HIV-1 sequencing assay

A previously described protocol was used11,23. In brief, genomic DNA was extracted from sorted cell populations using a QIAGEN DNeasy Blood & Tissue kit. DNA diluted to single-genome levels based on Poisson distribution statistics was subjected to single-genome amplification using Invitrogen Platinum Taq and nested primers spanning near-full-length HIV-1. PCR products were visualized by agarose gel electrophoresis; amplification products were subjected to single-genome sequencing on the Illumina platform. Resulting short reads were de novo assembled and aligned to HXB2 to identify large deleterious deletions, out-of-frame indels, premature/lethal stop codons, internal inversions or packaging signal deletions, using an automated in-house pipeline. The presence or absence of hypermutations associated with APOBEC3G or APOBEC3F was determined using the Los Alamos HIV Sequence Database Hypermut 2.0 program. Viral sequences that lacked all lethal defects listed above were classified as genome-intact. Sequence alignments were carried out using MUSCLE. Phylogenetic distances between sequences were examined using Clustal X-generated neighbour-joining algorithms. Proviral species that were completely sequence-identical were considered as clonal.
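The Poisson reasoning behind dilution to single-genome levels can be made concrete with a short calculation. The sketch below assumes templates are distributed across wells according to a Poisson distribution; the function name and the example positive-well fractions are illustrative and are not values reported in the study.

```python
import math

def single_template_fraction(positive_fraction):
    """Expected fraction of PCR-positive wells seeded by exactly one template,
    given the observed fraction of positive wells (Poisson assumption)."""
    if not 0.0 < positive_fraction < 1.0:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    # P(0 templates) = exp(-lam) = 1 - positive_fraction
    lam = -math.log(1.0 - positive_fraction)
    p_exactly_one = lam * math.exp(-lam)
    # Condition on the well being positive (at least one template).
    return p_exactly_one / positive_fraction

for frac in (0.1, 0.3, 0.5):
    print(frac, round(single_template_fraction(frac), 3))
```

The lower the fraction of positive wells, the more likely each amplicon derives from a single proviral template, which is why the DNA is diluted before amplification.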

Ex-vivo HIV-1 reservoir cell killing assay

Memory CD4+ T cells isolated from PBMCs by negative immunomagnetic enrichment were incubated in R10 medium with or without epitopic peptides for 1 h and washed thoroughly. Afterwards, cells were co-incubated with previously isolated HIV-1-specific CD8+ T cell clones at selected effector-to-target (E/T) ratios for 16 h. After washing, cells were stained with blue viability dye (Invitrogen, catalogue no. L34962), CD3–FITC (BioLegend, clone UCHT1, catalogue no. 300406) and CD4–BV711 (BioLegend, clone RPA-T4, catalogue no. 300558) antibodies, followed by sorting of viable CD3+CD4+ events. Sorted cells were subjected to assessments of intact and total HIV-1 proviruses using the IPDA assay.


The IPDA uses digital droplet PCR to quantify proviruses lacking overt fatal defects, especially large deletions and hypermutations, and was carried out as previously described22.

Statistical analysis

The data are presented as pie charts, Venn diagrams, volcano plots, UMAP plots and heatmaps. Enrichment ratios between two cell populations for each marker were calculated as the ratio of the proportion of marker-positive cells in the first population divided by the proportion of marker-positive cells in the second population. Sensitivity values were calculated by dividing the number of marker-positive cells in each category of cells by the total number of cells in the respective category. A bootstrapped dataset was constructed by resampling equal numbers of cells from each cell category from each participant, while keeping the total number of cells from each cell category equal between the bootstrapped and raw datasets. Differences were tested for statistical significance using Fisher’s exact test, chi-squared test, t-test or Mann–Whitney U-test, as appropriate. P values of <0.05 were considered significant; correction for multiple comparisons was carried out using the Benjamini–Hochberg method63 (FDR) or the Bonferroni method. Analyses were carried out using Prism (GraphPad Software, Inc.), SPICE39, R (R Foundation for Statistical Computing64) and Python (Python Software Foundation). Figures were generated using Adobe Illustrator.
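The enrichment ratio, sensitivity and Benjamini–Hochberg adjustment described above can be written out in a few lines of Python. This is an illustrative sketch using the standard step-up form of the Benjamini–Hochberg procedure; the function names and the example counts are hypothetical and do not reproduce the study's analysis code or data.

```python
def sensitivity(n_marker_positive, n_total):
    """Proportion of marker-positive cells within one cell category."""
    return n_marker_positive / n_total

def enrichment_ratio(pos_a, total_a, pos_b, total_b):
    """Proportion of marker-positive cells in population A divided by the
    proportion in population B."""
    return sensitivity(pos_a, total_a) / sensitivity(pos_b, total_b)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: 30 of 100 infected cells vs 10 of 100 uninfected cells.
print(enrichment_ratio(30, 100, 10, 100))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Markers with an adjusted P value below 0.05 would then be reported as significant under the threshold stated in the text.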

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
