
Plasmid construction

Rixosome subunits (NOL9, WDR18, PELP1, TEX10), PRC2 subunits (EZH2, EED, SUZ12), PRC1 subunits (RING1B, PCGF1-4) and CBX1-8 cDNAs were amplified from a human ES cell cDNA library and inserted into pGAD-T7 (Takara, 630442) and pGBK-T7 (Takara, 630443) plasmids for Y2H assays. NOL9 siRNA-resistant cDNA was generated by PCR. The siRNA target sequence was mutated from 5′-AGACCTAAGTTCTGTCGAA-3′ to 5′-CGGCCGAAATTTTGCAGGA-3′ and integrated into the pCI (Promega, E1731) plasmid for ectopic protein expression. For bacterial protein expression, cDNA was integrated into pGEX-6P-1 (GE Healthcare, 28-9546-48).
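The siRNA-resistant design above can be checked with a short sketch. It assumes (this is not stated in the text) that the quoted 19-nt target begins in frame, so that the first 18 nucleotides form six complete codons. The helper names are illustrative only.

```python
# Standard genetic code, indexed by bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons starting at position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Sequences quoted in the text; the reading frame is an assumption.
SIRNA_TARGET = "AGACCTAAGTTCTGTCGAA"
RESISTANT = "CGGCCGAAATTTTGCAGGA"

if __name__ == "__main__":
    print(translate(SIRNA_TARGET), translate(RESISTANT))
    print(mismatches(SIRNA_TARGET, RESISTANT))
```

Under that frame assumption, both sequences translate to the same six residues (RPKFCR) while differing at 8 of 19 nucleotide positions, which is the property a silent, siRNA-resistant mutation needs.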

Y2H assays

Y2H budding yeast strain (Takara) was cultured with YEPD+adenine overnight at 30 °C. Yeast cells were collected at OD 0.5 by centrifugation at 3,000 rpm for 3 min. Cells were resuspended and washed 2 times with 0.1 M LiAc (in 1× TE buffer). The bait pGBKT7 (0.5 μg) expressing rixosome, Polycomb, and HP1 proteins and prey pGADT7 (0.5 μg) vectors were mixed with 10 μg carrier DNA, and further mixed with yeast cells collected from 10-ml cultures and resuspended in 50 μl 0.1 M LiAc (in 1× TE buffer). The DNA–yeast mixture was incubated with 130 μl 40% PEG 4000 for 30 min at 30 °C. For transformation, 21 μl DMSO was added and mixed well with the yeast–DNA mixture, followed by heat shock at 42 °C for 20 min. After incubation on ice for 3 min, the cells were pelleted by centrifugation for 3 min at 4 °C. The supernatant was then discarded and sterile water was added to resuspend the cells, which were plated on double-selective medium SC plates (Trp-, Leu-) for 3 days at 30 °C. Colonies were further transferred to quadruple-selective medium SC plates (Trp-, Leu-, His-, Ade-) for 3–4 days at 30 °C. For spotting assays, cells were incubated overnight in 4 ml double-selective SC medium (Trp-, Leu-). The cells were then diluted to an optical density at 600 nm of 1, one millilitre of which was pelleted, washed once with sterilized water, resuspended in 250 μl sterilized water, and transferred to 96-well plates. Three microlitres of cell suspension from each well was plated on double-selective medium SC plates (Trp-, Leu-) and quadruple-selective medium SC plates (Trp-, Leu-, His-, Ade-) for four days.

Cell culture

HeLa (ATCC, CCL-2) and HEK 293FT (ThermoFisher, R70007) cells were cultured in DMEM containing 10% fetal calf serum. Human embryonic stem cells were authenticated by the Initiative for Genome Editing and Neurodegeneration of Harvard Medical School and cultured as previously described54. In brief, cells were cultured on 0.08 mg ml−1 matrigel-coated plates with DMEM/F12 (containing 5 μg ml−1 insulin and 10 μg ml−1, 0.1 μg ml−1 FGF2, 1.7 ng ml−1 TGFβ1, 10 μg ml−1 transferrin). Cells were tested for mycoplasma contamination by the suppliers and were negative.


For siRNA-mediated knockdown, Lipofectamine RNAiMAX transfection reagent (Invitrogen) and siRNA (200 nM) were used to transfect the cells by following the manufacturer’s instructions. All the siRNAs were synthesized by Dharmacon and are listed in Supplementary Table 1.

CRISPR–Cas9-mediated human genome editing

Small guide RNA was synthesized by in vitro transcription using the MAXIscript T7 transcription kit (ThermoFisher, AM1312). CRISPR–Cas9 protein was purified by the Initiative for Genome Editing and Neurodegeneration Core in the Department of Cell Biology at Harvard Medical School. DNA oligonucleotide templates (synthesized by IDT, Supplementary Table 2), guide RNA, and CRISPR–Cas9 protein were delivered to cells by electroporation with the Neon transfection system (ThermoFisher). Clones were screened by PCR and MiSeq sequencing (Illumina).


Cells were placed on plates with cover slides. Cells were first washed with PBS, and fixed and permeabilized with methanol for 8 min at −20 °C. Cells were then incubated for 4–10 h at 4 °C with primary antibodies in PBS containing 4% bovine serum, which was followed by staining with secondary antibodies and 1 μg ml−1 DAPI. A confocal microscope (Nikon, Ti with perfect focus and spinning disk) equipped with a 60×/1.40 NA objective lens was used to image cells. NIS-Elements imaging software was used for imaging data collection. Images were post-processed with ImageJ (NIH) and Photoshop (Adobe) software. EZH2 and MDN1 fluorescence intensities were assessed using ImageJ. NPM1 foci were counted visually in ImageJ. For MDN1 foci, the signal was measured in the regions with NPM1 in control cells; the lowest NPM1 staining value among foci in the control cells was then used as a cutoff, and any foci measured by ImageJ with a higher value were counted as MDN1 foci. A list of antibodies and their sources is provided in Supplementary Table 3.

Immunoprecipitation and mass spectrometry analysis

To prepare chromatin-enriched fractions, cells were washed with PBS and then resuspended in ice-cold hypotonic buffer (10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.2 mM PMSF, 0.2 mM DTT) and incubated on ice for 10 min. Cell membranes were then disrupted by douncing 10 times. Nuclei were pelleted by centrifugation at 2,000g for 10 min, resuspended in cell lysis buffer (50 mM HEPES, pH 7.4, 150 mM NaCl, 1 mM MgCl2, 1 mM EGTA, and 0.5% Triton X-100) by pipetting for 3 min, and pelleted by centrifugation at 2,000g for 10 min to obtain a chromatin fraction. The chromatin pellet was resuspended in IP buffer (50 mM HEPES, pH 7.4, 250 mM NaCl, 10% glycerol, 1 mM MgCl2, 1 mM EGTA, and 1% Triton X-100) containing protease inhibitor cocktail (5056489001, Sigma) and 1 mM DNase I. Chromatin was digested for 2 h at 4 °C and centrifuged at 10,000g for 10 min. The supernatant was then incubated with specific antibodies (Supplementary Table 3) and immune complexes were collected using Dynabeads Protein A/G (ThermoFisher). For silver staining, samples were run on a 5%–20% Bis-Tris SDS–PAGE gel (BioRad) and stained with the SilverQuest Silver Staining kit (Invitrogen) according to the manufacturer’s instructions. For immunoblotting, beads were boiled for 5 min in SDS loading buffer. For immunoprecipitations in Fig. 1f, g, Benzonase (Sigma, E8263) treatment was performed by adding 500 U ml−1 Benzonase to cell lysates followed by incubation for 1 h at 4 °C before incubation with antibody-immobilized beads. For mass spectrometry analysis, proteins were eluted with 0.5 M NH4OH and dried to completion in a speed vac.

For Flag–NOL9 and Flag–WDR18 immunoprecipitation and mass spectrometry, dried protein samples were digested in 200 mM EPPS buffer pH 8.5 with trypsin (Promega V5111). Digests contained 2% acetonitrile (v/v) and were performed at 37 °C overnight. Digests were labelled directly with TMT10plex reagents (ThermoFisher Scientific, 90406). Labelling efficiency was checked by mass spectrometry. After hydroxylamine quenching (0.3% v/v) for 15 min, reactions were mixed and acidified, and solvent was evaporated to near completion by speed vac. Samples were then fractionated by alkaline reversed-phase chromatography (ThermoFisher 84868) into 12 fractions eluted with 10%, 12.5%, 15%, 17.5%, 20%, 25%, 30%, 35%, 40%, 50%, 65% and 80% acetonitrile. Fractions were pooled into 6 fractions (1+7, 2+8, 3+9, 4+10, 5+11, 6+12), dried down, stage-tipped and analysed by mass spectrometry on an Orbitrap Lumos instrument (Thermo Scientific). Relative quantification followed a multi-notch SPS-MS3 method. Prior to injection, peptides were separated by HPLC with an Easy-nLC 1200 liquid chromatography system using 100 μm inner diameter capillaries and a C18 matrix (2.6 μm Accucore C18 matrix, ThermoFisher Scientific). Peptides were separated with 4-hour acidic acetonitrile gradients. MS1 scans were measured by Orbitrap recording (resolution 120,000, mass range 400–1400 Th). After collision-induced dissociation (CID) (35%), MS2 spectra were collected by the ion trap mass analyser. After SPS (synchronous precursor selection), TMT reporter ions were generated by high-energy collision-induced dissociation (HCD) (55%) and quantified by Orbitrap MS3 scan (resolution 50,000 at 200 Th). Spectra were searched with in-house software based on Sequest (v.28, rev. 12) against a forward and reversed human proteome database (Uniprot 07/2014). Mass tolerance for searches was 50 ppm for precursors and 0.9 Da for fragment ions. Two missed tryptic cleavages per peptide were allowed and oxidized methionine (+15.9949 Da) was searched dynamically. For a peptide FDR (false discovery rate) of 1%, a decoy database strategy and linear discriminant analysis (LDA) were applied. The FDR for collapsed proteins was 1%. Proteins were quantified by summed peptide TMT s/n (signal/noise) with a sum s/n > 200 and an isolation specificity of >70%. Details of the TMT workflow and sample preparation procedures were described recently55.
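The decoy database strategy mentioned above can be illustrated with a simplified sketch. It is not the in-house Sequest pipeline: it assumes each peptide-spectrum match carries a single score (in the text, a linear discriminant combines several features) and estimates FDR as the ratio of accepted decoy hits to accepted target hits. The function name and input layout are hypothetical.

```python
def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    FDR is estimated as decoys / targets among matches at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranking that still meets the limit.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


if __name__ == "__main__":
    # Toy data: ten target hits, one decoy hit, then nine weaker target hits.
    toy = ([(s, False) for s in range(20, 10, -1)]
           + [(10, True)]
           + [(s, False) for s in range(9, 0, -1)])
    print(score_cutoff_for_fdr(toy, max_fdr=0.05))
```

In the toy data the single decoy keeps the estimated FDR above 5% for every cutoff below it, so only the ten top-scoring matches are retained.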

For Flag–PHC2 and Flag–CBX4 immunoprecipitation and mass spectrometry, we added 20 µl of 8 M urea, 100 mM EPPS pH 8.5 to the beads. We added 5 mM TCEP and incubated the mixture for 15 min at room temperature. We then added 10 mM iodoacetamide for 15 min at room temperature in the dark. We added 15 mM DTT to consume any unreacted iodoacetamide. We added 180 µl of 100 mM EPPS pH 8.5 to reduce the urea concentration to <1 M, followed by 1 µg of trypsin, and incubated at 37 °C for 6 h. The solution was acidified with 2% formic acid and the digested peptides were desalted via StageTip, dried via vacuum centrifugation, and reconstituted in 5% acetonitrile, 5% formic acid for LC-MS/MS processing. All label-free mass spectrometry data were collected using a Q Exactive mass spectrometer (Thermo Fisher Scientific) coupled with a Famos Autosampler (LC Packings) and an Accela600 liquid chromatography (LC) pump (Thermo Fisher Scientific). Peptides were separated on a 100 μm inner diameter microcapillary column packed with about 20 cm of Accucore C18 resin (2.6 μm, 150 Å, Thermo Fisher Scientific). For each analysis, we loaded about 2 μg onto the column. Peptides were separated using a 1 h method from 5 to 29% acetonitrile in 0.125% formic acid with a flow rate of about 300 nl min−1. The scan sequence began with an Orbitrap MS1 spectrum with the following parameters: resolution 70,000, scan range 300−1,500 Th, automatic gain control (AGC) target 1 × 10^5, maximum injection time 250 ms, and centroid spectrum data type. We selected the top twenty precursors for MS2 analysis, which consisted of higher-energy collisional dissociation (HCD) with the following parameters: resolution 17,500, AGC 1 × 10^5, maximum injection time 60 ms, isolation window 2 Th, normalized collision energy (NCE) 25, and centroid spectrum data type. The underfill ratio was set at 9%, which corresponds to a 1.5 × 10^5 intensity threshold. In addition, unassigned and singly charged species were excluded from MS2 analysis and dynamic exclusion was set to automatic.

Mass spectrometric data analysis

Mass spectra were processed using a Sequest-based in-house software pipeline. MS spectra were converted to mzXML using a modified version of ReAdW.exe. Database searching included all entries from the S. pombe UniProt database, which was concatenated with a reverse database composed of all protein sequences in reversed order. Searches were performed using a 50 ppm precursor ion tolerance. Product ion tolerance was set to 0.03 Th. Carbamidomethylation of cysteine residues (+57.0215 Da) was set as a static modification, while oxidation of methionine residues (+15.9949 Da) was set as a variable modification. Peptide spectral matches (PSMs) were adjusted to a 1% FDR. PSM filtering was performed using a linear discriminant analysis, as described previously, while considering the following parameters: XCorr, ΔCn, missed cleavages, peptide length, charge state, and precursor mass accuracy. Peptide-spectral matches were identified, quantified, and collapsed to a 1% FDR and then further collapsed to a final protein-level FDR of 1%. Furthermore, protein assembly was guided by principles of parsimony to produce the smallest set of proteins necessary to account for all observed peptides.

GST pulldown and immunoblotting

Proteins for GST pulldown assays were expressed in BL21 Codon Plus Escherichia coli (Agilent Technologies) with 200 μM IPTG induction at 16 °C overnight. Bacteria were then collected and washed with cold PBS, and sonicated (Branson sonicator) for 1 min with 20% amplitude at 4 °C. Sonicated samples were centrifuged at 20,000g for 10 min, and the supernatant was added to 0.5 ml Glutathione Sepharose 4B resin (GE Healthcare, 17075605), which was equilibrated with PBS. GST-tagged proteins were incubated with the resin for 2 h at 4 °C. The resin was then washed 6 times with PBS containing 1% Triton X-100. To remove the GST tag, bead-coupled proteins were incubated with PreScission Protease (GE Healthcare, 27-0843-01) in reaction buffer (50 mM Tris-HCl, pH 7.0, 150 mM NaCl, 1 mM EDTA, 1 mM DTT) for 2 h at 4 °C. The GST-tagged PreScission Protease was removed using Glutathione Sepharose 4B resin.

For GST pulldown assays, 10 μl of a 50% slurry of Glutathione Sepharose 4B was used for each sample. GST or GST-tagged proteins (0.1 μM) were incubated with untagged proteins (0.1 μM) in 1 ml PBS (137 mM NaCl, 2.7 mM KCl, 8 mM Na2HPO4, and 2 mM KH2PO4, pH 7.4) containing 0.5% Triton X-100 overnight at 4 °C. Beads were washed 3 times with PBS containing 0.5% Triton X-100, resuspended in SDS protein buffer, and boiled for 5 min. Input (2–5%) and bound proteins (10–50%) were run on a 4–20% gradient SDS–PAGE gel. SDS–PAGE was performed to separate proteins for 2 h at 80 V, and proteins were transferred to a PVDF membrane (Millipore). The membranes were blocked in 3% milk in PBS with 0.2% Tween-20, and sequentially incubated with primary antibodies and HRP-conjugated secondary antibodies, or directly incubated with HRP-conjugated primary antibodies for chemiluminescence detection. Sources of antibodies can be found in Supplementary Table 3.

Sucrose gradient centrifuge fractionation assay

Flag-tagged proteins were purified from the soluble chromatin fraction using magnetic beads (Sigma, M8823) and eluted with 3×Flag peptides (APExBIO, A6001) in elution buffer (20 mM HEPES-KOH, pH 7.5, 100 mM KOAc, 5 mM Mg(OAc)2, 1 mM EDTA, 10% glycerol). Sucrose gradients (10%–30%) were prepared using the Gradient Station (BIOCOMP). An Optima TLX Ultracentrifuge equipped with a TLS-55 rotor was used for ultracentrifugation for 16 h at 4 °C at 35,000 rpm. Gradients of 2.2 ml were fractionated into 22 fractions. One-hundred-microlitre fractions were pipetted from the top and protein in the fractions was captured using StrataClean resin (Agilent, 400714). Protein samples were boiled in SDS sample buffer (62.5 mM Tris-HCl, pH 6.8, 2% SDS, 10% glycerol, 0.01% bromophenol blue) for 3 min at 98 °C, and analysed by immunoblotting following gel electrophoresis (4%–15% precast protein gel with SDS from Biorad, 4561081).


Total RNA was extracted using the RNeasy Plus kit (74134, Qiagen) and reverse transcribed into cDNA using gene-specific primers and a reverse transcription kit (18090010, ThermoFisher). cDNA was analysed by PCR on a QuantStudio 7 Flex Real Time PCR System (Applied Biosystems). All reactions were performed using 10 ng RNA in a final volume of 10 μl. PCR parameters were 95 °C for 2 min and 40 cycles of 95 °C for 15 s, 60 °C for 15 s, and 72 °C for 15 s, followed by 72 °C for 1 min. All the quantitative PCR data presented are from at least three biological replicates. The forward and reverse primers used for RT–qPCR targeted the first exons of the genes. Primer sequences are presented in Supplementary Table 4.


Total RNA was isolated from human cells with an RNA purification kit (Qiagen, 74134) and genomic DNA was removed by the genomic DNA binding columns in the kit. Two micrograms of total RNA was used for RNA-seq library construction. Poly(A)-containing mRNA was isolated by poly(A) selection beads and further reverse transcribed to cDNA. The resulting cDNA was ligated with adapters, amplified by PCR, and further cleaned to obtain the final library. Libraries were sequenced on an Illumina HiSeq machine (Novogene) to obtain 150 bp paired-end reads.

RNA-seq reads were pseudoaligned using Kallisto 0.45.1. An index was generated using the Ensembl hg19 GTF and cDNA FASTA. Kallisto was run using default parameters with two exceptions: allowing searching for fusions (--fusion) and setting bootstrap to 100 (-b 100).
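As a minimal sketch of how the two non-default options fit into a Kallisto invocation, the helper below assembles the argument lists for indexing and for paired-end quantification. All file and directory names are hypothetical placeholders, and the helper is an illustration rather than the command line actually used in the study.

```python
import shlex


def kallisto_commands(cdna_fasta, index_path, out_dir, read1, read2, bootstraps=100):
    """Build argument lists for `kallisto index` and `kallisto quant`.

    Only the two deviations from defaults named in the text are set:
    --fusion and -b (number of bootstrap samples).
    """
    index_cmd = ["kallisto", "index", "-i", index_path, cdna_fasta]
    quant_cmd = [
        "kallisto", "quant",
        "-i", index_path,
        "-o", out_dir,
        "-b", str(bootstraps),
        "--fusion",
        read1, read2,
    ]
    return index_cmd, quant_cmd


if __name__ == "__main__":
    # Hypothetical paths; pass each list to subprocess.run(cmd, check=True) to execute.
    for cmd in kallisto_commands("transcripts.cdna.fa", "hg19.idx", "sample_out",
                                 "sample_R1.fastq.gz", "sample_R2.fastq.gz"):
        print(shlex.join(cmd))
```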

To visualize the mapped RNA-seq with IGV or UCSC genome browser, bam files were generated with Hisat 2.2.0, which was followed by making bigwig files with deeptools (v/3.0.2) (binsize 10). Reads were normalized to reads per genome coverage.

Read counts were calculated on a per-transcript basis using Kallisto and the pseudoalignment described above. The R package tximport 1.10.1 was used to select the dominant transcript per gene (txOut = FALSE), which was then used for DESeq2 analysis. To analyse only active genes, those with 0 read counts in all samples were removed from the DESeq2 output. As they are not transcribed by Pol II, 13 genes on chrM were also removed, resulting in a list of 24,043 active genes. Upregulated and downregulated genes were defined as those with Padj < 0.05 and fold change > 2 or < −2, respectively.
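The thresholds above can be written as a small classifier. This sketch assumes that "fold change < −2" means at least a twofold decrease, that is, a treatment/control ratio below 1/2, and it takes the ratio on a linear scale; the function name and the handling of missing adjusted P values are illustrative choices.

```python
import math


def classify_gene(fold_change, padj, padj_cutoff=0.05, fc_cutoff=2.0):
    """Label a gene 'up', 'down' or 'unchanged'.

    fold_change: linear treatment/control ratio (must be > 0).
    padj: adjusted P value, or None/NaN when the test was not applicable.
    """
    if padj is None or math.isnan(padj) or padj >= padj_cutoff:
        return "unchanged"
    log2_fc = math.log2(fold_change)
    limit = math.log2(fc_cutoff)
    if log2_fc > limit:
        return "up"
    if log2_fc < -limit:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    print(classify_gene(4.0, 0.01))   # more than twofold increase, significant
    print(classify_gene(0.2, 0.01))   # more than twofold decrease, significant
    print(classify_gene(4.0, 0.20))   # large change but not significant
```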

PRO-seq library construction

Aliquots of frozen (−80 °C) permeabilized cells were thawed on ice and pipetted gently to fully resuspend. Aliquots were removed and permeabilized cells were counted using a Luna II instrument (Logos Biosystems). For each sample, 1 million permeabilized cells were used for nuclear run-on, with 50,000 permeabilized Drosophila S2 cells added to each sample for normalization. Nuclear run-on assays and library preparation were performed essentially as described56 with the modifications noted: 2× nuclear run-on buffer consisted of 10 mM Tris (pH 8), 10 mM MgCl2, 1 mM DTT, 300 mM KCl, 40 μM each biotin-11-NTPs (Perkin Elmer), 0.8 U μl−1 SuperaseIN (Thermo), and 1% sarkosyl. Run-on reactions were performed at 37 °C. Adenylated 3′ adapter was prepared using the 5′ DNA adenylation kit (NEB) and ligated using T4 RNA ligase 2, truncated KQ (NEB, per manufacturer’s instructions with 15% PEG-8000 final) and incubated at 16 °C overnight. One-hundred-eighty microlitres of betaine blocking buffer (1.42 g of betaine brought to 10 ml with binding buffer supplemented to 0.6 μM blocking oligonucleotide (TCCGACGATCCCACGTTCCCGTGG/3InvdT/)) was mixed with ligations and incubated for 5 min at 65 °C and 2 min on ice prior to addition of streptavidin beads. After T4 polynucleotide kinase (NEB) treatment, beads were washed once each with high salt, low salt, and blocking oligonucleotide wash (0.25× T4 RNA ligase buffer (NEB), 0.3 μM blocking oligonucleotide) solutions and resuspended in 5′ adapter mix (10 pmol 5′ adapter, 30 pmol blocking oligonucleotide, water). 5′ adapter ligation was per Reimer but with 15% PEG-8000 final. Eluted cDNA was amplified with five cycles (NEBNext Ultra II Q5 master mix (NEB) with Illumina TruSeq PCR primers RP-1 and RPI-X) following the manufacturer’s suggested cycling protocol for library construction. A portion of preCR was serially diluted and used for test amplification to determine optimal amplification of final libraries. Pooled libraries were sequenced using the Illumina NovaSeq platform.

PRO-seq data analysis

All custom scripts described herein are available on the Adelman Lab GitHub. Using a custom script, FASTQ read pairs were trimmed to 41 bp per mate, and read pairs with a minimum average base quality score of 20 were retained. Read pairs were further trimmed using cutadapt 1.14 to remove adapter sequences and low-quality 3′ bases (--match-read-wildcards -m 20 -q 10). R1 reads, corresponding to RNA 3′ ends, were then aligned to the spiked-in Drosophila genome index (dm3) using Bowtie 1.2.2 (-v 2 -p 6 --best --un), with those reads not mapping to the spike-in genome serving as input to the primary genome alignment step (using Bowtie 1.2.2 options -v 2 --best). Reads mapping to the hg19 reference genome were then sorted via samtools 1.3.1 (-n), and subsequently converted to bedGraph format using a custom script. Because R1 in PRO-seq reveals the position of the RNA 3′ end, the ‘+’ and ‘−’ strands were swapped to generate bedGraphs representing 3′ end position at single-nucleotide resolution.
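The first trimming and filtering step can be sketched as follows. This is not the Adelman Lab script: it assumes Phred+33 quality encoding, computes the average quality on each mate after trimming, and requires both mates to pass. Those three choices, and the function names, are assumptions made for illustration.

```python
def mean_phred(quality, offset=33):
    """Average Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def trim_and_filter_pair(r1, r2, length=41, min_mean_quality=20):
    """Trim both mates to `length` bases and keep the pair only if each
    trimmed mate has an average base quality of at least `min_mean_quality`.

    r1, r2: (sequence, quality) tuples. Returns the trimmed pair or None.
    """
    trimmed = [(seq[:length], qual[:length]) for seq, qual in (r1, r2)]
    if any(len(seq) == 0 for seq, _ in trimmed):
        return None
    if all(mean_phred(qual) >= min_mean_quality for _, qual in trimmed):
        return tuple(trimmed)
    return None


if __name__ == "__main__":
    good = ("ACGT" * 15, "I" * 60)   # 'I' encodes Phred 40
    poor = ("ACGT" * 15, "#" * 60)   # '#' encodes Phred 2
    print(trim_and_filter_pair(good, good) is not None)
    print(trim_and_filter_pair(good, poor) is None)
```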

For NOL9 KD PRO-seq, we performed 2 sets of PRO-seq experiments, each with two biological replicates. In the first set of experiments, NOL9 depletion resulted in many more upregulated (228) than downregulated (30) genes, while in the second set of experiments, nearly the same number of genes were upregulated (162) and downregulated (160). Furthermore, unlike in the first set, in the second set the extent of overlap of siNOL9 upregulated and downregulated genes with those upregulated in EED-KO or RING1A/B-DKO was similar. Although the basis of this discrepancy is unclear, the correlation between the two biological replicates in Set2 was lower than in Set1, raising the possibility that poor growth or inefficient NOL9 depletion in Set2 siNOL9 cells may have resulted in a larger number of non-specifically downregulated genes. We therefore eliminated the Set2 siNOL9 data and used only the 2 biological replicates from the Set1 siNOL9 experiment.

Gene model refinement using PRO-seq and RNA-seq

To select gene-level features for differential expression analysis, as well as for pairing with PRO-seq data, we assigned a single, dominant TSS and transcription end site (TES) to each active gene. This was accomplished using a custom script, which uses RNA-seq read abundance and PRO-seq R2 reads (RNA 5′ ends) to identify dominant TSSs, and RNA-seq profiles to define the most commonly used TESs. RNA-seq and PRO-seq data from control and siNOL9 cells were used for this analysis, to capture gene activity under both conditions. Exon- and transcript-level features consistent with the resulting TSS-to-TES windows for 21,004 active genes in HEK 293T cells were selected from an hg19 reference GTF (GRCh38.99 from Ensembl). This filtered list of active genes was used for analyses shown in Figs. 2c–e, 4a–d, Extended Data Figs. 2c, d, 6b–k, as well as for defining differentially expressed genes in PRO-seq data. Differentially expressed genes between control (n = 2) and siNOL9 (n = 2) cells were determined using DESeq2 v1.26.0. Genes were called as differentially expressed using DESeq2’s DESeqDataSetFromMatrix mode at an adjusted P value threshold of <0.05 and fold change >1.5. This revealed 228 genes to be upregulated and 30 genes to be downregulated upon siNOL9.

ChIP–qPCR, ChIP–seq and data analysis of ChIP–seq

ChIP was performed as previously described with minor modifications57. Cells for ChIP were cultured in 15 cm plates. Cells were first washed with cold PBS, crosslinked at room temperature with 10 mM DMP (ThermoFisher Scientific) for 30 min, and then with 1% formaldehyde (ThermoFisher Scientific) for 15 min. Crosslinking reactions were quenched by addition of 125 mM glycine for 5 min. Crosslinked cells were separated by 3 min treatment with 0.05% trypsin (Gibco), and then washed with cold PBS 3 times. In every wash, cells were centrifuged for 3 min at 1,000g at 4 °C. Cells were then resuspended in sonication buffer (pH 7.9, 50 mM HEPES, 140 mM NaCl, 1 mM EDTA, 1% Triton, 0.1% sodium deoxycholate, and 0.5% SDS) and sonicated to shear chromatin into ~300 bp fragments using a Branson sonicator. Sonicated samples were diluted fivefold with ChIP dilution buffer (pH 7.9, 50 mM HEPES, 140 mM NaCl, 1 mM EDTA, 1% Triton, 0.1% sodium deoxycholate) to obtain a final concentration of 0.1% SDS. Diluted samples were centrifuged at 13,000 rpm for 10 min. The supernatant was pre-cleared with protein A/G or Dynabeads M-280 Streptavidin beads (ThermoFisher) and immunoprecipitated for 3–12 h using 3 μg antibodies and 40 μl protein A/G or Dynabeads M-280 Streptavidin beads. The beads were washed twice with high salt wash buffer A (pH 7.9, 50 mM HEPES, 500 mM NaCl, 1 mM EDTA, 1% Triton, 0.1% sodium deoxycholate, and 0.1% SDS), and once with wash buffer B (pH 7.9, 50 mM HEPES, 250 mM LiCl, 1 mM EDTA, 1% Triton, 0.1% sodium deoxycholate, 0.5% NP-40). The bound chromatin fragments were eluted with elution buffer (pH 8.0, 50 mM Tris, 10 mM EDTA, 1% SDS) for 5 min at 65 °C. Eluted DNA–protein complexes were treated with RNase A and crosslinks were reversed overnight at 65 °C. Proteinase K was then added to digest proteins for 1 h at 55 °C. DNA was further purified using a PCR Purification Kit (QIAGEN) and analysed by PCR on a QuantStudio 7 Flex Real Time PCR System (Applied Biosystems). PCR parameters were 95 °C for 2 min and 40 cycles of 95 °C for 15 s, 60 °C for 15 s, and 72 °C for 15 s, followed by 72 °C for 1 min. All the ChIP–qPCR data presented are from at least three biological replicates. Primer sequences are in Supplementary Table 4. Error bars represent standard deviation (three biological replicates).

For ChIP–seq, sequencing libraries were constructed using TruSeq DNA Sample Prep Kits (Illumina) and adapter dimers were removed by agarose gel electrophoresis. Size-selected and purified DNA libraries were sequenced on an Illumina HiSeq 2500 machine (Bauer core facility at Harvard University) to obtain 50 bp single-end reads. ChIP–seq reads were quality controlled with fastqc (v0.11.5) and mapped to the human genome reference (GRCh37/hg19) using bowtie2 (v2.2.9) with default parameters or bowtie (v1.2.2) with parameters -v 2 -k 1 --best. BAM files were generated with samtools 1.3.1, which was followed by making bigWig files with deeptools (v/3.0.2) (binsize 10). Reads were normalized to Reads Per Genome Coverage (RPGC) with the deeptools (v/3.0.2) bamCoverage function. To analyse read density at TSS regions, we made heatmaps and metaplots of ChIP–seq samples. The TSS was centred in the regions plotted and data were tabulated with the same distance relative to the TSS. Matrix files were generated using the computeMatrix function of deeptools (v/3.0.2). Based on the generated matrix file, heatmaps were generated by the plotHeatmap function, and profiles were generated by the plotProfile function or in Prism.
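RPGC normalization rescales coverage so that the genome-wide average equals 1×. The sketch below follows the definition in the deepTools documentation, where the 1× scaling is derived from (mapped reads × fragment length) / effective genome size. The numbers in the example are toy values, not parameters from this study, and the effective genome size must be supplied by the user.

```python
def rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Factor that converts raw per-bin coverage into 1x-normalized coverage.

    Mean coverage is (mapped reads * fragment length) / effective genome size;
    multiplying raw coverage by the reciprocal gives an average of 1x.
    """
    mean_coverage = mapped_reads * fragment_length / effective_genome_size
    return 1.0 / mean_coverage


def normalize_bins(raw_bin_coverage, scale_factor):
    """Apply the scale factor to a list of raw per-bin coverage values."""
    return [value * scale_factor for value in raw_bin_coverage]


if __name__ == "__main__":
    # Toy example: 1e6 reads of 50 bp over a 1e8 bp genome gives 0.5x mean coverage.
    factor = rpgc_scale_factor(1_000_000, 50, 100_000_000)
    print(factor)
    print(normalize_bins([0.0, 0.5, 1.0], factor))
```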

To analyse read density and correlation between different ChIP–seq samples, we performed Spearman correlation analysis. Read density was analysed at all hg19 annotated TSSs (n = 56,335) with the multiBigwigSummary function from deeptools (v/3.0.2) to obtain an npz matrix file. The Spearman correlation heatmap was generated by the plotCorrelation function of deeptools (v/3.0.2). The heatmaps generated in this study also included all annotated human genes (hg19). The gene list was obtained from […]. Promoter regions were defined as ±2 kb from TSSs. Peak overlaps were analysed by the bedtools (v/3.0.2) intersect function.
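For readers unfamiliar with the statistic, Spearman correlation is the Pearson correlation of rank-transformed values. The following self-contained sketch computes it for two vectors of per-TSS read densities; it illustrates the measure itself and is not the deepTools implementation. The input values are made up.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    sample_a = [1.0, 2.0, 3.0, 4.0]          # made-up read densities at four TSSs
    sample_b = [10.0, 100.0, 1000.0, 1e4]
    print(spearman(sample_a, sample_b))
```

Because only ranks matter, the two made-up samples above are perfectly correlated even though their values differ by orders of magnitude.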

For co-occupancy analysis in Extended Data Fig. 2, peak calling of TEX10, H2AK119ub1, and H3K9me3 was performed with MACS2, with the input ChIP–seq sample as control (-p 0.05 --broad --broad-cutoff 0.05, fold change > 2.5, length > 800 bp).

For defining TEX10-bound targets in Fig. 2, TEX10 peaks were called using HOMER (version 4.9) with the -style histone option and siTEX10 ChIP–seq as background. TEX10-bound genes were defined as those that had 50 or more TEX10 reads in the TSS ±1 kb region (n = 7,827); all others were considered unbound (n = 13,177).
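The read-count rule above amounts to a simple partition of genes. In this sketch the gene names and counts are hypothetical, and the input is assumed to be a mapping from gene to the number of TEX10 reads in its TSS ±1 kb window. In the text, the two groups (7,827 and 13,177) sum to the 21,004 active genes.

```python
def split_by_tss_reads(tss_read_counts, min_reads=50):
    """Partition genes into bound (>= min_reads in the TSS window) and unbound."""
    bound = {gene for gene, reads in tss_read_counts.items() if reads >= min_reads}
    unbound = set(tss_read_counts) - bound
    return bound, unbound


if __name__ == "__main__":
    # Hypothetical gene names and read counts.
    counts = {"GENE_A": 120, "GENE_B": 50, "GENE_C": 49, "GENE_D": 0}
    bound, unbound = split_by_tss_reads(counts)
    print(sorted(bound), sorted(unbound))
```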

For defining Polycomb target genes in Figs. 2, 3, H2AK119ub1 ChIP–seq data from HEK 293FT cells were used. Deeptools was used to count reads in TSS ±2 kb regions. K-means clustering was performed with k = 2. Cluster one was H2AK119ub1 enriched and counted as Polycomb target genes. Venn diagrams in Extended Data Fig. 8 were made based on the number of overlapping target genes. Deeptools was used to count reads in TSS ±2 kb regions. K-means clustering was performed with a fixed value of k = 3. Cluster one was counted as target genes.
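The clustering step can be illustrated with a deliberately simplified k-means. This sketch clusters a single number per gene (total reads in the TSS ±2 kb window), whereas deepTools clusters whole signal profiles, so it shows the idea rather than reproducing the analysis. The initialization (centroids spread evenly between the minimum and maximum) and the input values are assumptions.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's k-means on scalar values (k >= 2).

    Returns (labels, centroids); clusters are ordered from lowest to highest
    initial centroid, so the last cluster is the most enriched one.
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every value.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: mean of each cluster; empty clusters keep their centroid.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    # Made-up per-gene read counts: three low-signal and three high-signal genes.
    read_counts = [1, 2, 3, 100, 110, 120]
    labels, centroids = kmeans_1d(read_counts, k=2)
    print(labels, centroids)
```

With k = 2, genes assigned to the higher-centroid cluster would correspond to the enriched group that the text treats as Polycomb targets.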

The sources of ChIP–seq data used in this study are listed in Supplementary Table 5.

Statistical tests

For RNA-seq, PRO-seq and ChIP–seq, statistical significance for comparisons was assessed by Wilcoxon (unpaired) or Mann–Whitney (pairwise) tests. The test used and error bars are defined in each figure legend.

Significance for immunostaining foci was evaluated using an unpaired two-tailed Student’s t-test. All the RT–qPCR and ChIP–qPCR data are represented as mean ± s.d. using GraphPad Prism 8 software. Volcano plots of mass spectrometry results were made with Microsoft Excel.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
