Strange India

Comparative genomics analyses of tail-development-related genes

Hominoid evolution represents an extended stage in primate evolution that involved many phenotypic changes and widespread genomic sequence changes. Therefore, querying for hominoid-specific mutations across the genome results in tens of millions of candidates, most of them located in non-coding regions. We used the following criteria to define a candidate variant that may contribute to tail-loss evolution in hominoids: (1) it has to be hominoid-specific, which means that the variant sequence or amino acid is unique to hominoid species and is not shared by any other species that have tails; (2) the function of the associated gene relates to tail development. Tail-development-related genes in vertebrates were collected from the MGI phenotype database and additional literature not covered by the MGI database. The initial analyses mainly covered genes extracted from the MGI term MP0003456 for the ‘absent tail’ phenotype, a total of 31 genes. Additional analyses included genes from MP0002632 for ‘vestigial tail’ and MP0003456 for ‘short tail’. Together, the final list of genes related to vertebrate tail development included 140 genes (as of MGI updates in February 2023), the mutations of which are reported to be related to tail-reduction phenotypes (Supplementary Data 1).

Gene structure annotations of the 140 genes were downloaded from BioMart of Ensembl 109. The longest transcript with the most exons was selected for each gene. Multiz30way alignments of genomic sequences across 27 primate species were downloaded from the UCSC Genome Browser. We selected all six hominoid species (hg38, gorGor5, panTro5, panPan2, ponAbe2 and nomLeu3) to calculate a hominoid-consensus sequence, and used two non-hominoid species (pig-tailed macaque, macNem1, and marmoset, calJac3) as the outgroups. The homologous regions of the 140 genes, together with 10,000 bp of both upstream and downstream sequence, in the 8 species were extracted from the Multiz30way alignment using bedtools (v.2.30.0)52. Hominoid-specific variants were identified using the following parameters: SNVs or substitutions shared by all six hominoid species but different in either of the two outgroup monkey species were identified as putative hominoid-specific SNVs (Supplementary Data 2); DNA sequences present in all six hominoid species but absent in either of the two outgroup monkey species were identified as hominoid-specific insertions (Supplementary Data 3); and DNA sequences absent from all six hominoid species but present in both of the two outgroup monkey species were identified as hominoid-specific deletions (Supplementary Data 4). Notably, our criteria for analysing hominoid-specific variants may include a small proportion of false-positive hits that are outgroup-specific variants.
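The three classification rules above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the per-position dictionary format, the function names and the decision to ignore gaps or ambiguous bases when calling SNVs are all assumptions made for the example.

```python
# Assembly names are taken from the text; everything else is illustrative.
HOMINOIDS = ["hg38", "gorGor5", "panTro5", "panPan2", "ponAbe2", "nomLeu3"]
OUTGROUPS = ["macNem1", "calJac3"]
BASES = set("ACGT")


def is_hominoid_specific_snv(column):
    """column maps assembly name -> aligned base at one alignment position.

    True when all six hominoids share one base and at least one outgroup
    carries a different base. Gaps and N are not counted as substitutions
    (an assumption of this sketch).
    """
    hominoid_bases = {column[s].upper() for s in HOMINOIDS}
    if len(hominoid_bases) != 1 or not hominoid_bases <= BASES:
        return False
    shared = next(iter(hominoid_bases))
    return any(
        column[s].upper() in BASES and column[s].upper() != shared
        for s in OUTGROUPS
    )


def classify_indel(present):
    """present maps assembly name -> True if the sequence block is present.

    Insertion: present in all six hominoids, absent in either outgroup.
    Deletion: absent from all six hominoids, present in both outgroups.
    """
    hominoid = [present[s] for s in HOMINOIDS]
    outgroup = [present[s] for s in OUTGROUPS]
    if all(hominoid) and not all(outgroup):
        return "insertion"
    if not any(hominoid) and all(outgroup):
        return "deletion"
    return None
```

Because the SNV and insertion rules only require a difference from one of the two outgroups, a change that arose on a single outgroup lineage also passes, which is the source of the false-positive hits noted above.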

We used the Ensembl variant effect predictor (integrated in Ensembl 109)53 to infer the potential functional impact of the detected hominoid-specific SNVs, insertions and deletions. Owing to the lack of an ancestral genome as the reference sequence, variant effect predictor predictions were performed inversely, using the human/hominoid genomic sequence as the reference allele and the outgroup sequence as the alternative allele. SNVs annotated as either ‘deleterious’ (<0.05) in the SIFT score or ‘damaging’ (>0.446) in the PolyPhen score (53 instances), and insertions (6 instances) or deletions (2 instances) that affect protein sequences, were collected for further manual inspection and comparison across species. This additional inspection was performed across the Cactus Alignment of the genomes across 241 species in the UCSC Genome Browser Comparative Genomics module51. This inspection found that most of the annotated variants that may affect host gene function fell into three categories: (1) outgroup-specific variants; (2) false-positive annotation of the variant function in a minor transcript; and (3) missense variants in hominoid species but sharing the same amino acid in at least one other tailed species. These variants were not considered as candidates that may have affected tail-loss evolution in hominoids. Excluding these variants, we identified nine variants as true hominoid-specific coding region mutations, including seven SNVs and two insertions and deletions (Supplementary Data 1). Following identification of top candidates, protein sequence alignments across representative vertebrate species were downloaded from the NCBI database and analysed using the MUSCLE algorithm with MEGA X software and default settings54.
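The score-based shortlist described above amounts to a simple either/or filter. The sketch below assumes the scores have already been parsed from the predictor output into plain numbers; the function name and the use of None for a missing score are hypothetical choices for illustration.

```python
SIFT_DELETERIOUS_BELOW = 0.05      # threshold stated in the text
POLYPHEN_DAMAGING_ABOVE = 0.446    # threshold stated in the text


def flag_for_manual_inspection(sift, polyphen):
    """Return True if an SNV should go forward to manual inspection.

    sift and polyphen are floats, or None when a score is unavailable
    (treating a missing score as 'not flagged' is an assumption here).
    """
    deleterious = sift is not None and sift < SIFT_DELETERIOUS_BELOW
    damaging = polyphen is not None and polyphen > POLYPHEN_DAMAGING_ABOVE
    return deleterious or damaging
```

A variant is kept when either predictor flags it, so the filter is deliberately permissive and leaves the final decision to the cross-species inspection that follows.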

RNA secondary structure prediction

RNA secondary structure prediction of the human TBXT exon 5–intron 5–exon 6–intron 6–exon 7 sequence was performed using RNAfold (v.2.6.0) through the ViennaRNA Web Server. The algorithm calculates the folding probability using a minimum free energy matrix with default parameters. In addition, the calculation included the partition function and base pairing probability matrix. Notably, human TBXT transcripts were annotated to have a 5′ untranslated region exon, making its exon numbers differ from those of most other species, including mouse. To simplify, we referred to the first coding exon of human TBXT as exon 1 and thus the alternatively spliced exon as exon 6, consistent with mouse Tbxt. RNA secondary structure prediction used the DNA sequence from exon 5 to exon 7 following this order.

Human ES cell culture and differentiation

Human ES cells (WA01, also called H1, from WiCell Research Institute) were authenticated by the distributor WiCell using short tandem repeat profiling. Human ES cells were cultured in feeder-free conditions on tissue-culture-grade plates coated with human ES cell-qualified Geltrex (Gibco, A1413302). Geltrex was diluted 1:100 in DMEM/F-12 (Gibco, 11320033) supplemented with 1× Glutamax (100X, Gibco, 35050061) and 1% penicillin–streptomycin (Gibco, 15070063). Before seeding human ES cells, the plate was treated with Geltrex working solution in a tissue culture incubator (37 °C and 5% CO2) for at least 1 h.

StemFlex medium (Gibco, A3349401) was used for human ES cell maintenance and culturing in a feeder-free condition according to the manufacturer’s protocol. In brief, StemFlex complete medium was made by combining StemFlex basal medium (450 ml) with 50 ml of StemFlex supplement (10×) plus 1% penicillin–streptomycin. Each Geltrex-coated well on a 6-well plate was seeded with 200,000 cells to obtain about 80% confluence in 3–4 days. Human ES cells were cryopreserved in PSC Cryomedium (Gibco, A2644601). The culture medium was supplemented with 1× RevitaCell (100×, Gibco, A2644501, which is also included in the PSC Cryomedium kit) when cells had gone through stressed conditions, such as freezing-and-thawing or nucleofection. RevitaCell supplemented medium was replaced with regular StemFlex complete medium on the second day. Human ES cells grown under the RevitaCell condition might become stretched but would recover after returning to the normal StemFlex complete medium. All human ES cell lines tested negative during our routine quantitative PCR-based mycoplasma tests.

The human ES cell differentiation assay to induce a gene expression pattern of the primitive streak state was adapted from a previously published method36. On day −1, freshly cultured human ES cell colonies were dissociated into clumps (2–5 cells) using Versene buffer (with EDTA, Gibco, 15040066). The dissociated cells were seeded on Geltrex-coated 6-well tissue culture plates at 25,000 cells per cm2 (0.25 M per well in the 6-well plates) in StemFlex complete medium. Differentiation to the primitive streak state was initiated on the next day (day 0) by switching StemFlex complete medium to basal differentiation medium. Basal differentiation medium (50 ml) was made using 48.5 ml DMEM/F-12, 1% Glutamax (500 µl), 1% ITS-G (500 µl, Gibco, 41400045) and 1% penicillin–streptomycin (500 µl), and supplemented with 3 µM GSK3 inhibitor CHIR99021 (10 µl of 15 mM stock solution in DMSO; Tocris, 4423). The cells were collected at differentiation day 1 to 3 for downstream experiments, which confirmed the expression fluctuations of mesoderm genes (TBXT and MIXL1) in a 3-day differentiation period36 (Extended Data Fig. 3).
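The CHIR99021 volume and the per-well seeding number quoted above follow from simple arithmetic, sketched here as a quick check. The helper names are hypothetical, and the 10 cm² growth area per well is an assumption inferred from the 25,000 cells per cm² and 0.25 M per well figures in the text.

```python
def stock_volume_ul(final_um, final_volume_ml, stock_mm):
    """Volume of stock (µl) needed so that C1 * V1 = C2 * V2.

    final_um: target concentration in µM
    final_volume_ml: final medium volume in ml
    stock_mm: stock concentration in mM
    """
    final_volume_ul = final_volume_ml * 1000.0   # ml -> µl
    stock_um = stock_mm * 1000.0                 # mM -> µM
    return final_um * final_volume_ul / stock_um


def cells_per_well(density_per_cm2, well_area_cm2):
    """Number of cells to seed for a given density and growth area."""
    return density_per_cm2 * well_area_cm2


# 3 µM CHIR99021 in 50 ml of medium from a 15 mM stock
chir_ul = stock_volume_ul(final_um=3, final_volume_ml=50, stock_mm=15)

# 25,000 cells per cm2 in a well assumed to be about 10 cm2
seeded = cells_per_well(25_000, 10)
```

Here `chir_ul` evaluates to 10 µl and `seeded` to 250,000 cells, matching the values given in the protocol.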

Mouse ES cell culture and differentiation

The mouse ES cell line (MK6) derived from the C57BL/6J mouse strain was obtained from the NYU Langone Health Rodent Genetic Engineering Laboratory. The wild-type MK6 mouse ES cell line was authenticated by its competence for contributing to embryos when cultured on feeder-cell-dependent conditions followed by blastomere injection. MK6 mouse ES cells used in this study were cultured in both feeder-dependent and feeder-free culture conditions depending on the purposes of the experiment. All mouse ES cell lines tested negative during our routine quantitative PCR-based mycoplasma tests. For feeder-dependent mouse ES cell culture conditions, mouse ES cells were plated on a pre-seeded monolayer of mouse embryonic fibroblast (MEF) cells (CellBiolabs, CBA-310). MEF-coated plates were prepared by seeding 50,000 cells per cm2 in tissue culture plates treated with 0.1% gelatin solution (EMD Millipore, ES-006-B). MEF culturing medium was made from DMEM (Gibco, 11965118) with 10% FBS (GeminiBio, 100–500), 0.1 mM MEM non-essential amino acids (Gibco, 11140050), 1× Glutamax (Gibco, 35050061) and 1% penicillin–streptomycin (Gibco, 15070063). Mouse ES cell medium was made from Knockout DMEM (Gibco, 10829018) containing 15% (v/v) FBS (Hyclone, SH30070.03), 0.1 mM β-mercaptoethanol (Gibco, 31350010), 1× MEM non-essential amino acids (Gibco, 11140050), 1× Glutamax (Gibco, 35050061), 1× nucleosides (Millipore, ES-008-D) and 1,000 units ml–1 LIF (EMD Millipore, ESG1107).

For feeder-free mouse ES cell culture conditions, cells were grown on tissue-culture-grade plates that were pre-coated with mouse ES cell-qualified 0.1% gelatin (EMD Millipore, ES-006-B) at room temperature for at least 30 min. Before seeding mouse ES cells, feeder-free mouse ES cell culturing medium was added to a gelatin-treated plate and warmed in a 37 °C and 5% CO2 incubator for at least 30 min. Feeder-free mouse ES cell culturing medium, also called ‘80/20’ medium, comprises 80% 2i medium and 20% of the above-mentioned mouse ES cell medium by volume. The 2i medium was made from a 1:1 mix of Advanced DMEM/F-12 (Gibco, 12634010) and Neurobasal-A (Gibco, 10888022), followed by adding 1× N2 supplement (Gibco, 17502048), 1× B-27 supplement (Gibco, 17504044), 1× Glutamax (Gibco, 35050061), 0.1 mM β-mercaptoethanol (Gibco, 31350010), 1,000 units ml–1 LIF (Millipore, ESG1107), 1 µM MEK1/2 inhibitor (Stemgent, PD0325901) and 3 µM GSK3 inhibitor CHIR99021 (Tocris, 4423).

The mouse ES cell differentiation protocol for inducing Tbxt gene expression was adapted from a previously described method in a feeder cell-free condition37. Cells were first plated in 80/20 medium for 24 h on a gelatin-coated 6-well plate, followed by switching to N2/B27 medium without LIF or 2i for another 2 days of culture. The N2/B27 medium (50 ml) was made with 18 ml Advanced DMEM/F-12, 18 ml Neurobasal-A, 9 ml Knockout DMEM, 2.5 ml Knockout Serum Replacement (Gibco, 10828028), 0.5 ml N2 supplement, 1 ml B27 supplement, 0.5 ml Glutamax (100×), 0.5 ml nucleosides (100×) and 0.1 mM β-mercaptoethanol. Then the N2/B27 medium was supplemented with 3 µM GSK3 inhibitor CHIR99021 to induce differentiation (day 0). The cells were collected at differentiation days 1–3 for downstream experiments, which showed consistent results of Tbxt gene expression fluctuations in a 3-day differentiation period.
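As a sanity check on the N2/B27 recipe, the listed component volumes can be summed and rescaled to another batch size. This is only a bookkeeping sketch: the dictionary and function names are hypothetical, and β-mercaptoethanol is left out because the text gives it as a final concentration, not a volume.

```python
# Component volumes (ml) for a 50 ml batch, as listed in the text.
N2B27_ML = {
    "Advanced DMEM/F-12": 18.0,
    "Neurobasal-A": 18.0,
    "Knockout DMEM": 9.0,
    "Knockout Serum Replacement": 2.5,
    "N2 supplement": 0.5,
    "B27 supplement": 1.0,
    "Glutamax (100x)": 0.5,
    "nucleosides (100x)": 0.5,
}


def total_volume_ml(recipe):
    """Sum of all component volumes in ml."""
    return sum(recipe.values())


def scale_recipe(recipe, target_ml):
    """Return the same recipe scaled proportionally to target_ml."""
    factor = target_ml / total_volume_ml(recipe)
    return {name: volume * factor for name, volume in recipe.items()}


batch_total = total_volume_ml(N2B27_ML)        # 50.0 ml
double_batch = scale_recipe(N2B27_ML, 100.0)   # every volume doubled
```

The volumes add up to exactly 50 ml, so the 100× supplements end up at 1× in the finished medium.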

CRISPR targeting

All guide RNAs of the CRISPR experiments were designed using the CRISPOR algorithm integrated in the UCSC Genome Browser55. Guide RNAs were cloned into the pX459V2.0-HypaCas9 plasmid (AddGene, plasmid 62988) or its custom derivative by replacing the puromycin-resistance gene with the blasticidin-resistance gene. Guide RNAs in this study were designed in pairs to delete the intervening sequences. Insertion sites for the AluSx1 and AluY pair in mouse Tbxt (TbxtinsASAY) were selected by the guide RNA quality and the relative distance compared to the human TBXT gene structure. The insertion site for the RCS element (TbxtinsRCS) was the same as for insertion of the AluY element. The CRISPR-targeting sites and guide RNA sequences are listed in Supplementary Table 2.

All oligonucleotides (plus Golden Gate assembly overhangs) were synthesized by Integrated DNA Technologies (IDT) and ligated into an empty pX459V2.0 vector following the standard Golden Gate Assembly protocol using BbsI restriction enzyme (NEB, R3539). The constructed plasmids were purified from 3 ml Escherichia coli cultures using a ZR Plasmid MiniPrep Purification kit (Zymo Research, D4015) for sequence verification. Plasmids for delivery into ES cells were purified from 250 ml E. coli cultures using a PureLink HiPure Plasmid Midiprep kit (Invitrogen, K210005). To facilitate DNA delivery to ES cells through nucleofection, the purified plasmids were dissolved in Tris-EDTA buffer (pH 7.5) to a concentration of at least 1 µg µl–1 in a sterile hood.

DNA delivery

DNA delivery into human or mouse ES cells for CRISPR–Cas9 targeting was performed using a Nucleofector 2b device (Lonza, BioAAB-1001). A Human Stem Cell Nucleofector kit 1 (VPH-5012) and a mouse ES cell Nucleofector kit (Lonza, VVPH-1001) were used for delivering DNA into human and mouse ES cells, respectively. ES cells were double-fed the day before the nucleofection experiment to maintain a superior condition.

Before performing nucleofection on human ES cells, 6-cm tissue culture plates were treated with 0.5 µg cm–2 rLaminin-521 in a 37 °C and 5% CO2 incubator for at least 2 h. rLaminin-521-treated plates provide the best viability when seeding human ES cells as single cells. Cultured human ES cells were washed with DPBS and dissociated into single cells using TrypLE Select Enzyme (no phenol red; Gibco, 12563011). One million human ES cell single cells were nucleofected using program A-023 according to the manufacturer’s instructions for the Nucleofector 2b device. Transfected cells were transferred onto the rLaminin-521-treated 6-cm plates with pre-warmed StemFlex complete medium supplemented with 1× RevitaCell but not penicillin–streptomycin. Antibiotic selection was performed 24 h after nucleofection with puromycin (final concentration of 0.8 μg ml–1; Gibco, A1113802).

Mouse ES cells were dissociated into single cells using StemPro Accutase (Gibco, A1110501), and 5 million cells were transfected using program A-023 according to the manufacturer’s instructions. Exon 6 deletion in mouse ES cells was performed using cells cultured in the feeder-free condition. Nucleofected cells were plated on 0.1% gelatin-treated 10-cm plates, followed by antibiotic selection 24 h after nucleofection with blasticidin (final concentration of 7.5 µg ml–1; Gibco, A1113903). The insertion of the AluSx1–AluY pair and insRCS engineering were performed using mouse ES cells cultured on a feeder-dependent condition. Mouse ES cells were plated on a monolayer of MEF cells seeded on 0.1% gelatin-treated 10-cm plates, followed by antibiotic selection 24 h after nucleofection.

Together with the pX459V2.0-HypaCas9-gRNA plasmids for nucleofection, single-stranded DNA (ssDNA) oligonucleotides were co-delivered for microhomology-induced deletion of the targeted sequences56. These ssDNA sequences were synthesized by IDT through its Ultramer DNA Oligo service, including three phosphorothioate bond modifications on each end. Detailed sequence information for these long ssDNA oligonucleotides is listed in Supplementary Table 3.

For TbxtinsASAY and TbxtinsRCS engineering, homology-based repair template plasmids carrying the selection marker gene puroΔTK (a puromycin-resistance gene for positive selection and ΔTK, a truncated version of herpes simplex virus type 1 thymidine kinase, for negative selection, as presented in Extended Data Fig. 7) were transfected together with the pX459V2.0-HypaCas9-gRNA plasmids. Following nucleofection and antibiotic selection (0.8 μg ml–1 puromycin for 3 days starting from the second day of nucleofection), single clones were picked, followed by PCR genotyping of CRISPR–Cas9-targeted loci (exon 6 deletion, inserting AluY, inserting AluSx1 or inserting RCS). The genotyping PCR primers are listed in Supplementary Table 4.

PCR genotyping-confirmed clones were further validated using Capture-seq (see below) to confirm the genotype and to exclude the possibility of any random integration of plasmid DNA. Subsequently, Cre recombinase was transiently introduced to remove the selection marker puroΔTK. Cells were treated with 250 nM ganciclovir for counter-selecting ΔTK-negative cells as the selection marker gene-depleted cells. Following isolation of single mouse ES cell clones of TbxtinsASAY and TbxtinsRCS mouse ES cells without the selection marker gene, these clones were used for downstream experiments, including in vitro differentiation assays and blastocyst injection for generating mouse models.

Capture-seq genotyping

Capture-seq, or targeted sequencing of the loci of interest, was performed as previously described39. Conceptually, Capture-seq uses custom biotinylated probes to pull-down the sequences at genomic loci of interest from the standard whole-genome sequencing libraries, thereby enabling sequencing of the specific genomic loci in a much higher depth while reducing the cost.

Genomic DNA was purified from mouse ES cells or from ear punches of mice of interest using a Zymo Quick-DNA Miniprep Plus kit (D4068) according to the manufacturer’s protocol. DNA sequencing libraries compatible with Illumina sequencers were prepared following the standard protocol. In brief, 1 µg of gDNA was sheared to 500–900 base pairs in a 96-well microplate using a Covaris LE220 (450 W, 10% duty factor, 200 cycles per burst and 90-s treatment time), followed by purification with a DNA Clean and Concentrate-5 kit (Zymo Research, D4013). Sheared and purified DNA was then treated with end repair enzyme mix (T4 DNA polymerase, Klenow DNA polymerase and T4 polynucleotide kinase, NEB, M0203, M0210 and M0201, respectively), and A-tailed using Klenow fragment (3′→5′ exo–) (NEB, M0212). Illumina sequencing library adapters were subsequently ligated to DNA ends, followed by PCR amplification with KAPA 2X Hi-Fi Hotstart Readymix (Roche, KR0370).

Custom biotinylated probes were prepared as bait through nick translation using BAC DNA and/or plasmids as the template. The probes were prepared to comprehensively cover the entire locus. We used BAC lines RP24-88H3 and RP23-159G7, purchased from BACPAC Genomics, to generate bait probes covering the mouse Tbxt locus and about 200 kb flanking sequences in both upstream and downstream regions. The pooled whole-genome sequencing libraries were hybridized with the biotinylated baits in solution and purified using streptavidin-coated magnetic beads. Following pull-down, DNA sequencing libraries were quantified using a Qubit 3.0 Fluorometer (Invitrogen, Q33216) with a dsDNA HS Assay kit (Invitrogen, Q32851). The sequencing libraries were subsequently sequenced on an Illumina NextSeq 500 sequencer in paired-end mode.

Sequencing results were demultiplexed using Illumina bcl2fastq (v.2.20), requiring a perfect match to indexing BC sequences. Low-quality reads or bases and Illumina adapter sequences were trimmed using Trimmomatic (v.0.39). Reads were then mapped to the mouse genome (mm10) using bwa (v.0.7.17). The coverage and mutations in and around the Tbxt locus were checked through visualization in a mirror version of the UCSC Genome Browser.

Mouse experiments and generating TbxtΔexon6/+ mice

All mouse experiments were performed following NYULH’s animal protocol guidelines and performed at the NYU Langone Health Rodent Genetic Engineering Laboratory. Mice were housed in the NYU Langone Health BSL1 barrier facility in a 12-h light to 12-h dark cycle, with ambient temperature and humidity conditions. All experimental procedures were approved by the Institutional Animal Care and Use Committee at NYU Langone Health. Wild-type C57BL/6J (strain 000664) mice were obtained from The Jackson Laboratory.

The TbxtΔexon6/+ heterozygous mouse model was generated through zygotic microinjection of CRISPR reagents into wild-type C57BL/6J zygotes (Jackson Laboratory strain 000664), adapting a previously published protocol57. In brief, Cas9 mRNA (MilliporeSigma, CAS9MRNA), synthetic guide RNAs and a single-stranded DNA oligonucleotide were co-injected into 1-cell stage zygotes following the described procedures57. Synthetic guide RNAs were ordered from Synthego as their custom CRISPRevolution sgRNA EZ kit, with the same targeting sites as used in the CRISPR deletion experiment of mouse ES cells (AUUUCGGUUCUGCAGACCGG and CAAGAUGCUGGUUGAACCAG). The co-injected single-stranded DNA oligonucleotide is the same as described above. Injected embryos were then cultured in vitro to the blastomeric stage, followed by embryo transfer to pseudopregnant foster mothers. Following zygotic microinjection and transfer, founder pups were screened based on their abnormal tail phenotypes. DNA samples were collected through ear punches at about day 21 for PCR genotyping and Capture-seq validation to exclude off-targeting at the Tbxt locus.

After confirming the genotype, TbxtΔexon6/+ founder mice were backcrossed with wild-type C57BL/6J mice to generate heterozygous F1 mice. Owing to the varied tail phenotypes, intercrossing between F1 heterozygotes was performed in two categories: type 1 intercrossing included at least one parent having no tail or a short tail, whereas type 2 intercrossing was between two long-tailed F1 heterozygotes (Table 1). As summarized in Table 1, both types of intercrossing produced heterogeneous tail phenotypes in F2 TbxtΔexon6/+ mice, thereby confirming the incomplete penetrance of tail phenotypes and the absence of homozygotes (TbxtΔexon6/Δexon6). Adult mice (>12 weeks) were anaesthetized for X-ray imaging of vertebrae using a Bruker In-Vivo Xtreme IVIS imaging system. To confirm the embryonic phenotypes in homozygotes, embryos were dissected at the E11.5 gestation stage from timed pregnant mice using a standard protocol.

Generating TbxtinsASAY and TbxtinsRCS2 mice

The engineered TbxtinsASAY and TbxtinsRCS mouse ES cells were injected either into C57BL/6J-albino (Charles River Laboratories, strain 493) blastocysts to produce chimeric F0 founder mice or into B6D2F1/J (an F1 hybrid between C57BL/6J female and DBA/2J male, Jackson Laboratory strain 100006) tetraploid blastocysts to produce homozygous F0 founder mice. The tetraploid complementation strategy aimed to generate homozygous mice with the proposed genotype in the F0 generation58. Through multiple trials of injection using both mouse ES cell lines, we obtained only one TbxtinsASAY/insASAY F0 founder mouse (male) and none for the TbxtinsRCS line. However, during genotype screening for TbxtΔexon6/+ founder mice, we serendipitously identified a male grey mouse that carried a heterozygous insertion in intron 5. Genotype analysis revealed that the insertion was a 220 bp DNA sequence from intron 6 of Tbxt (chromosome 17: 8439335–8439554, mm10), inserted in the reverse-complementary orientation into intron 5 at a designed CRISPR targeting site (chromosome 17: 8438386, mm10). The inserted sequence insRCS2 in intron 5 therefore forms a 220 bp inverted complementary sequence pair with its original sequence in intron 6 (chromosome 17: 8439335–8439554, mm10), resembling the designed TbxtinsRCS and TbxtinsASAY gene structures. This genotype was therefore called TbxtinsRCS2. Capture-seq genotyping of this TbxtinsRCS2/+ mouse confirmed that the TbxtinsRCS2 allele is in the C57BL/6 background, whereas the wild-type Tbxt locus of the TbxtinsRCS2/+ founder mouse is from the DBA/2J background. This TbxtinsRCS2/+ mouse was therefore backcrossed to C57BL/6 wild-type mice and further intercrossed between F1 heterozygotes to produce homozygotes (TbxtinsRCS2/insRCS2) in the F2 generation. Capture-seq analysis of TbxtinsRCS2/insRCS2 mice confirmed their C57BL/6 background at the Tbxt locus (Extended Data Fig. 8).
We also compared the tail phenotypes in age-matched C57BL/6 and DBA/2J mice and found no difference (data not shown), which suggested that any genetic background difference between the two strains does not affect tail length. The TbxtinsRCS2 mice (both heterozygotes and homozygotes) were therefore used for the analysis of tail phenotypes.
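The geometry of the insRCS2 allele can be checked with two small helpers: one for the length of the source interval and one for the reverse complement that defines an inverted pair. The function names and the toy sequence are illustrative only, and the coordinates are assumed to be 1-based and inclusive, which is consistent with the stated 220 bp length.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A<->T, C<->G, then reversed)."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_length(start, end):
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


def is_inverted_pair(upstream, downstream):
    """True if upstream is the reverse complement of downstream, so the two
    copies could base-pair with each other within a single transcript."""
    return upstream.upper() == reverse_complement(downstream).upper()


# Source interval of the inserted sequence in intron 6 (mm10, chromosome 17)
insert_len = span_length(8439335, 8439554)

# Toy sequence standing in for the real 220 bp element
toy_intron6 = "ACCTGA"
toy_insert = reverse_complement(toy_intron6)
```

With these inputs `insert_len` is 220 and `is_inverted_pair(toy_insert, toy_intron6)` is true, mirroring the relationship between the intron 5 insertion and its intron 6 source.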

The TbxtinsASAY and TbxtinsRCS2 founder mice, both male, were separately backcrossed to wild-type C57BL/6J mice to generate heterozygous F1 pups, followed by intercrossing between F1 heterozygotes to generate homozygotes in the F2 generation. With all genotypes available, mouse tail lengths were measured monthly across genotypes and sex groups. Additionally, two types of breeding pairs, TbxtinsRCS2/+ × TbxtΔexon6/+ and TbxtinsRCS2/insRCS2 × TbxtΔexon6/+, were set up across different founder lines of TbxtΔexon6/+ mice to analyse tail phenotypes in their offspring. These results are summarized and presented in Table 2.

To analyse the isoform expression patterns of mouse Tbxt in the embryonic tailbud region, wild-type, heterozygote and homozygote embryos from intercrossing experiments (TbxtinsRCS2/+ × TbxtinsRCS2/+, TbxtinsASAY/+ × TbxtinsASAY/+) were dissected at the E10.5 gestation stage. The tailbud of each embryo was collected for total RNA isolation, and additional embryonic tissue was collected for gDNA to be used for genotyping. These results are presented in Fig. 4e.

Splicing isoform detection

Total RNA was collected from undifferentiated and differentiated cells of both human and mouse ES cells, and from embryonic tailbud samples, using a standard column-based purification kit (Qiagen RNeasy Kit, 74004). DNase treatment was applied during RNA extraction to remove any potential DNA contamination. Following extraction, RNA quality was assessed through electrophoresis based on ribosomal RNA integrity. Reverse transcription was performed using 1 µg of high-quality total RNA for each sample and a High-Capacity RNA-to-cDNA kit (Applied Biosystems, 4387406). DNA oligonucleotides used for RT–PCR and/or quantitative RT–PCR are listed in Supplementary Table 1.

Transcriptomics analyses in differentiated mouse ES cells

Total RNA samples isolated from day-1 in vitro-differentiated mouse ES cell lines across wild-type, TbxtinsASAY/insASAY, TbxtΔexon6/+ and TbxtΔexon6/Δexon6 genotypes were used for bulk RNA sequencing analysis. RNA samples were prepared using a standard column-based purification kit (Qiagen RNeasy kit, 74004). Two biological replicates were prepared for each mouse ES cell genotype, with the two TbxtΔexon6/Δexon6 mouse ES cell samples coming from different clones. RNA sequencing libraries were prepared using a NEBNext Ultra II Directional RNA Library Prep kit (NEB, E7765L) through its polyA mRNA sequencing workflow by using the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB, E7490L).

Raw sequencing reads were mapped to the mouse genome (mm10) with STAR (v.2.7.2a) aligner59. The resultant strand-specific read counts of all samples were integrated into a matrix for downstream analysis. Differentially expressed genes were detected using DESeq2 (v.1.40.2)60, using its default two-sided Wald test with the cut-off of log2(fold expression change) > 0.5 and multiple test-adjusted P value < 0.05. The top 500 variable genes from DESeq2 across all samples were used to perform principal component analysis. The Tbxt target genes were obtained from a previous publication41, defined by significant Tbxt ChIP–seq peak signals detected in in vitro-differentiated mouse ES cells. The set of Tbxt target genes was intersected with the significant differentially expressed genes identified in each mutant sample compared with the wild-type controls, and these were aggregated to generate the overall set of differentially expressed Tbxt target genes across the analysed mouse ES cell lines. These differentially expressed Tbxt target genes were visualized using a heatmap, with the log10-transformed normalized transcript matrix followed by z score standardization across samples.
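The downstream steps after differential expression testing (thresholding, intersecting with Tbxt targets, and preparing heatmap values) can be sketched in plain Python. This is an illustration under stated assumptions and not the authors' code: the input format, the function names, the use of the absolute log2 fold change, the pseudocount of 1 before the log10 transform and the population standard deviation in the z score are all choices made for the example.

```python
import math
import statistics


def significant_genes(results, lfc_cutoff=0.5, padj_cutoff=0.05):
    """results maps gene -> (log2 fold change, adjusted P value).

    Keeps genes passing both cut-offs from the text. Applying the cut-off
    to |log2FC| (both directions) is an assumption of this sketch.
    """
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if padj is not None and abs(lfc) > lfc_cutoff and padj < padj_cutoff
    }


def de_target_genes(results_by_mutant, target_genes):
    """Union, over all mutant-vs-wild-type comparisons, of the significant
    genes that are also in the target gene set."""
    hits = set()
    for results in results_by_mutant.values():
        hits |= significant_genes(results) & set(target_genes)
    return hits


def row_zscores(normalized_counts, pseudocount=1.0):
    """log10-transform one gene's normalized counts, then standardize
    across samples. The pseudocount avoids log10(0) and is an assumption."""
    logged = [math.log10(c + pseudocount) for c in normalized_counts]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    if sd == 0:
        return [0.0] * len(logged)
    return [(x - mean) / sd for x in logged]
```

For example, `de_target_genes({"mutA": {"GeneA": (1.2, 0.01)}}, {"GeneA"})` returns `{"GeneA"}`, where the gene and sample names are placeholders. Standardizing each gene across samples puts all rows of the heatmap on a common scale, so the colours reflect relative changes between genotypes and not absolute expression level.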

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
