Strange IndiaStrange India

All-atom MD simulations

All atomistic MD simulations were performed using a memory optimized version of NAMD2.13b60, the CHARMM36 parameter set61 for protein and DNA, the TIP3P model for water62 and a custom hexahydrate model for magnesium ions63 used along with the CUFIX corrections to ion–DNA interactions63. A full set of CUFIX corrections64 was used where specified. The use of permanent magnesium hexahydrate complexes to represent Mg2+ ions prevented direct binding of Mg2+ to DNA, in line with previous findings65. Multiple time stepping66 was used, whereby local interactions were computed every 2 fs whereas long-range interactions were computed every 6 fs. All short-range nonbonded interactions were cut-off starting at 0.8 nm and completely cut-off by 1.0 nm. Long-range electrostatic interactions were evaluated using the particle-mesh Ewald method67 computed over a 0.21-nm-spaced grid with an interpolation order of eight. SETTLE68 and RATTLE69 algorithms were applied to constrain covalent bonds to hydrogen in water and in non-water molecules, respectively. The temperature was maintained at 300 K using a Langevin thermostat with a damping constant of 0.5 ps1 unless specified otherwise. Constant pressure (NPT ensemble) simulations used a Nosé–Hoover Langevin piston70 with a period and decay of 1,400 and 700 fs, respectively. Implicit representation of the capsid was realized using the Grid-Steered MD71 feature of NAMD. Energy minimization was carried out using the conjugate gradients method72. Atomic coordinates were recorded every 9.6 ps unless specified otherwise. Visualization and analysis were performed using VMD73 and MDanalysis74, respectively. Unless specified otherwise, custom scripts for Tcl interpreter of VMD were used in analysis of trajectories. Supplementary Table 1 provides a summary of the all-atom MD runs performed.

Tiling electrolyte volume for solvation

A cubic volume of electrolyte about 3 nm on one side was first simulated for 30 ns in the NPT ensemble. The last frame of the trajectory was used to build a much larger volume of the electrolyte by copying and shifting the electrolyte coordinates by a predetermined number of unit cell vectors. Because Mg2+ is modelled as a hexahydrate complex, water molecules of the hexahydrates were wrapped across the periodic boundaries of the unit cell before tiling. A typical solvation process included merging the tiled electrolyte volume with the protein or DNA structure followed by removal of water and ions that overlapped with protein or DNA. Having a slight excess of ions in the tiled volume was instrumental to obtaining a target ion density after solvation.

Vacuum simulations

Vacuum simulations were performed to resolve steric clashes between atoms. Unless stated otherwise, such simulations were performed using a dielectric constant of 200, the long-range electrostatic calculations (PME) switched off and a Langevin thermostat with a damping coefficient of 50 ps1. To ensure the structural integrity of the molecules of interest, non-hydrogen atoms of the molecules were subject to a network of harmonic restraints and/or, in the case of DNA, external bonds enforcing base pairing (see respective sections for the specific values of the parameters). Under such restraints, the overall RMSD of the molecule was about 0.5 Å.

All-atom simulations of empty HK97 capsid

An initial all-atom model of the HK97 capsid was constructed from the structure of the asymmetric subunit40 obtained by X-ray crystallography and cryo-EM (PDB identifier: 2FT1). First, the amino terminus of each of the seven proteins in the asymmetric subunit was extended to include residues that were not resolved in the structure using a model of the immature asymmetric subunit (PDB identifier: 1OHG)75. Note that the N-terminal disordered tails of the capsid proteins (up to residue 103) are cleaved during the maturation process and were, accordingly, omitted from the model. The final protein model included residues 103–383 for all capsid proteins. With crystallographically resolved atoms held fixed, steric clashes were resolved by 1,200 steps of minimization. Subsequently, a model of the complete HK97 capsid was generated by tiling the asymmetric subunit about the symmetry axes of the capsid. The titratable residues located on the inner surface of the capsid, that is, Asp136, Glu149, Asp173, Glu269, His283, Asp290, Glu292, Asp330 and Glu348, were protonated41, which reduced the electrical charge of the protein capsid to zero. Finally, Lys169 and Asn356 residues of adjacent proteins were crosslinked using a custom patch (Supplementary Information). Parameters for the atoms affected by the crosslinking were obtained using the CGenFF server76.

The capsid was merged with a 64 × 64 × 64 nm3 tiled volume of solvent with an excess concentration of ions. Water within 1.24 Å and ions within 2.25 Å of non-hydrogen atoms of the capsid were removed from the system. Ions were then removed from inside and outside the capsid until the Na+, Mg2+ and Cl concentrations were, respectively, approximately 0.21, 0.005 and 0.3 mol kg–1 inside and approximately 0.22, 0.028 and 0.24 mol kg−1 outside the capsid. At the end of the solvation procedure, the system was electrically neutral and contained 27,484,866 atoms. Following 9,600 steps of energy minimization, the solvated system was equilibrated in multiple stages (Supplementary Table 2) of NPT simulations at 300 K. With non-hydrogen atoms of the capsid subject to position restraints (k = 0.1 kcal mol–1 Å–2), a gradual lowering of the Langevin damping coefficient from 50 ps–1 (for 0.5 ns) to 10 ps–1 (for 2.2 ns) and finally to 0.5 ps–1 (for 18.8 ns) allowed water and ions to equilibrate, bringing the volume of the system to equilibrium in about 10 ns. Next, position restraints on the protein were gradually decreased in stages. First, the system was equilibrated for another 17.14 ns with the position restraints applied to all non-hydrogen atoms of the capsid (k = 0.01 kcal mol–1Å–2). Then the restraints were lowered (k = 0.005 kcal mol–1 Å–2) and simulated for 14.27 ns, after which, only the Cα atoms of the protein were restrained (k = 0.005 kcal mol–1 Å–2) for 12.01 ns. As a final step, the position restraints were further lowered (k = 0.001 kcal mol–1Å–2) for 4.7 ns, after which, the system was run without any position restraints. The last 600 ns of the simulation was performed using CUFIX corrections applied to acetate–ion and amine–carboxyl interactions. A replica simulation of the empty capsid was built and equilibrated following the same procedures, differing by the initial concentration of ions, 0.2 mol kg–1 Na+, 0.005 mol kg–1 Mg2+ and 0.21 mol kg–1 Cl, both inside and outside the capsid. The equilibration schedule is further detailed in Supplementary Tables 2 and 3.

Multiresolution simulations of DNA packaging

DNA packaging simulations were performed using a previously described multiresolution model of DNA, mrDNA77. The mrDNA model was developed to reproduce experimentally determined properties of DNA such as its persistence length and twist persistence length, as well as the pressure in a DNA array. Packaging simulations were performed using an in-house-developed, GPU-accelerated code, atomic resolution Brownian dynamics (ARBD)78. All simulations used a Brownian dynamics integrator with simulation-dependent time step, a temperature of 291 K and a nonbonded cut-off value of 4.1 nm or 5 nm for the fast or slow packaging simulations, respectively. Although the smoothening of the free-energy landscape makes processes occur much faster in a CG simulation than in an equivalent all-atom simulation79, the durations of our CG simulations are reported throughout the article simply as the product of the time step and the number of elapsed steps.

Grid-based representation of the capsid enclosure and the packaging motor

For our DNA packaging simulations, the protein capsid containing an embedded packaging motor was represented using a 3D grid-based potential. Before generating such a potential, a small patch of atoms was cut out from the all-atom structure of the capsid (PDB identifier: 2FT1)40 at one of the icosahedral vertices of the capsid. An all-atom model of an HK97-family portal protein (PDB identifier: 3KDR) was manually docked into the vertex opening. A 2.06 Å-resolution steric potential representing the capsid and the embedded portal was generated from the all-atom model using the volmap tool of VMD73, with each non-hydrogen atom replaced by a spherical Gaussian density of width equal to the nominal atomic radius multiplied by a scaling factor of 2. To model the action of the packaging motor, a linear potential ramp was added to the cylindrical region (15 Å in radius and 63.7 Å in length) inside the portal, as illustrated by the inset in Fig. 2a. To avoid entrapment of DNA beads within local wells of the capsid potential, a spherical Gaussian kernel of 1 Å width was linearly convolved with the density of the capsid, blurring the density and ensuring that the derived potential acting on DNA beads is smooth and continuous. This density was multiplied by a scaling factor of 0.5 kcal mol–1 to obtain the implicit potential for the CG packaging simulations. To the best of our knowledge, there is no experimental evidence suggesting that specific residues of the HK97 capsid have a crucial role during the packaging process. Experiments performed using a similar virus, phage-λ, found that packaged DNA forms random contacts with the capsid52, which suggests that representing the capsid as a steric, chemically inert barrier is a reasonable approximation.

Initial configurations of DNA genomes

Before packaging, initial configurations of the 39,732 bp HK97 genome were prepared as follows. The DNA was initially arranged in the shape of a cylindrical 3D spiral that did not contain knots. For fast packaging simulations, the spiral was 108 nm in length and 85 nm in diameter, and the DNA was equilibrated for 5 μs (100 fs time step) in the absence of the capsid potential at 4 bp per 1 bead resolution. The resulting configuration was placed with one end 28 nm away from the capsid portal having the portal and spiral axes coinciding. The terminal DNA bead nearest to the capsid was then pulled into the portal by a harmonic potential, (k = 0.1 kcal mol−1 Å−2) during a simulation lasting approximately 0.5 μs. For slow packaging simulations, the DNA genome was initially placed in the shape of a spiral filling a 254-nm-long annular cylinder with 50 nm and 100 nm inner and outer radii, respectively. Four replicate simulations were then performed at 10 bp per bead resolution to relax the configuration, each lasting 2.5 ms (250 fs time step). Each DNA configuration was then geometrically transformed to orient the DNA fragment near one of its ends along the portal axis and place that end 11.5 nm away from the portal entrance. Those configurations were then used to construct a 4 bp per 2 bead model using the mrDNA protocol77. The conformations of the newly added orientation beads were relaxed during a 40 ns simulation (40 fs time step), whereas the DNA contour beads were each harmonically restrained to their initial coordinates (k = 10 kcal mol−1 Å−2). The DNA end nearest to the portal was then brought to the portal entrance during another 40 ns simulation using a harmonic restraint potential (k = 1 kcal mol−1 Å−2), one such configuration is shown in the inset in Fig. 2a. Four additional 4 bp per 2 bead relaxation simulations were performed starting from the mirror-image configurations of the states obtained at the end of the four 10 bp per bead relaxation simulations, which provided a total of eight initial conditions for the slow DNA packaging simulations.

DNA packaging simulations

The genome packaging simulations were performed at either 4 bp per 2 bead resolution with 2 magnitudes of the packaging force, 280 pN (fast) or 55 pN (slow). In some simulations, additional torque was applied to the beads located within the portal region to rotate the DNA right-handed with respect to the axis directed from the portal to the capsid centre by either –359° (fast with twist) or 14° (slow with twist) per turn of the packaged DNA.

The simulations of DNA packaging with twist were performed using a custom build of ARBD that placed a harmonic potential on the angle formed by the vector connecting each DNA bead in the portal region and its orientation bead, projected into the plane defined by the portal axis, and a reference vector. The reference vector was given by a rotation of a vector within a plane normal to the portal axis by an angle θ0 that depended on the identity of the DNA bead and on whether the genome was packaged fast or slow. For fast packaging simulations, the spring constant was 500 kcal mol−1 radian−2 and the rest angle was specified (in degrees) as θ0 = (ibp/10.4) − (d/34) × 14, where ibp is the index of the base pair corresponding to the DNA bead, and d is the position of the bead projected along the portal axis, given in Å. For slow packaging simulations, the spring constant was 10 kcal mol−1 radian−2 and the rest angle was specified (in degrees) as θ0 = ibp (360/10.4 + 14/10.0). The first term accounts for the natural twist of the DNA, whereas the second term effectively rotates the DNA by 14° every 10 bp (one turn of a DNA duplex). The 14°-per-turn rotation of DNA was previously reported for the ϕ29 portal44.

The fast packaging simulations were performed using a 100 fs time step and lasted around 100 μs, whereas the slow packaging simulations were performed with a smaller 40 fs time step and lasted about 1.1 ms. Completion of the slow packaging simulations required about 3 months of continuous runs using the GPU nodes of the TACC Frontera. After completion of the packaging simulations, the systems were equilibrated for another 200 μs, at the same resolution as packaging, that is, 4 bp per 2 beads. This additional equilibration was essential for the fast packaging simulations with high levels of twist, as it allowed the superhelical twist accumulated from the packaging step to equilibrate. In total, 16 slow packaging simulations were performed, 8 each with and without twist; and 4 fast packaging simulations, 2 each with and without twist.

All-atom model of packaged HK97

All-atom model of the genome

After packaging and equilibration of select capsids, the multiresolution protocol77 was used to map the coordinates of the DNA to a higher resolution, 2 beads per 1 bp, model. The higher-resolution model was equilibrated for 400 μs (slow) or 1 μs (fast), using a smaller time step of 20 fs. The fast model was simulated for another 1 μs at 1 bead per 1 bp resolution. Coordinates from the last part of the final CG trajectory were averaged for mapping to an atomistic model as specified in the mrDNA protocol77. First, the trajectory was aligned to minimize the RMSD with respect to the final configuration. Stepping backwards through the trajectory, we selected a continuous fragment of the trajectory with individual DNA configurations and an RMSD with respect to the final configuration lower than a 1 Å threshold. The coordinates from that point onwards were averaged, and splines were fit through these to facilitate mapping to an all-atom structure of the DNA. The nucleotide sequence of the resulting all-atom DNA model matched the experimentally determined sequence of the 39,732 bp HK97 genome80. For accurate mapping of atomic coordinates and to prevent any ring piercing clashes from the mapping procedure, the sugar base of the DNA backbone was shrunk in size by a scaling factor of 0.25.

Implicit capsid potential for all-atom simulations

For atomistic simulations, an implicit representation of the capsid was designed to confine DNA while permitting the exchange of water and ions. To obtain the optimal grid potential using the volmap tool of VMD, with the ‘-weight mass’ option turned on, the resolution of the grid and the width of the Gaussians approximating atomic densities were tuned. The resulting grid-based model of the capsid was tested in a vacuum simulation in which an all-atom model of a two-turn DNA fragment was subject to a 20 pN force pressing the DNA duplex onto a small patch of the capsid grid (Extended Data Fig. 10). The specific value of the force was chosen to match that produced by the internal pressure of 20 bar inside a bacteriophage virus. For reference, an identical simulation was performed using a fully atomistic representation of both the DNA and the capsid. The parameters of the grid representations of the capsid were varied until a match with all-atom representation was reached with respect to the equilibrium DNA–capsid distance. See the caption of Extended Data Fig. 10 for more detail. The resulting grid-based density of the capsid had a resolution of 1.4 Å, and the atomic densities were approximated using Gaussians of the nominal atomic radii. For subsequent grid-steered MD simulations, the grid-based density was converted to a grid-based potential using a scaling factor of 50 kcal mol1 Å−1. The instantaneous grid-based forces were computed using a cubic polynomial interpolation in the three dimensions.

Initial relaxation of the genome

A short 2,500-step minimization with backbone atoms fixed was performed in vacuum to bring each DNA nucleotide to its nominal size. Subsequently, another minimization of 4,500 steps in vacuum was performed without having any atoms fixed, but under an elastic network of restraints applied to DNA, as previously described81. At the same time, the grid-forces functionality of NAMD71 was used to apply external forces from the implicit representation of the capsid to non-hydrogen atoms of the DNA, which prevented the DNA from expanding beyond the boundaries of the capsid. Using the extrabonds feature of NAMD, additional harmonic forces (k = 10 kcal mol1 Å−2) were applied to pairs of atoms that formed Watson–Crick hydrogen bonds between all base pairs of the genome. The DNA was simulated in vacuum for 21 ns in the NVT ensemble, producing an initial all-atom configuration of the 39,732 bp genome consisting of 2,523,065 atoms. In total, six such all-atom structures were constructed, three each for packaging with and without twist.

Solvation of the genome

To solvate the genome, we prepared a 64 × 64 × 64 nm3 tiled volume of solvent containing 1 mol kg−1 NaCl and 1.2 mol kg−1 Mg2+ hexahydrate; the solvent carried a net positive charge. This ion composition was chosen to exceed the equilibrium concentration of ions in a tight DNA bundle of approximately the same average DNA–DNA spacing50. After addition of the solvent, water and ions overlapping with the atoms of DNA were removed, setting the target concentration for individual ion species by tuning the cut-off distance for the ion removal. For the slow model of packaged DNA, the desired ion composition of 0.75 mol kg−1 Na+ and 0.5 mol kg−1 Mg2+ (ref. 50) was obtained by removing water molecules with oxygen atoms within 2.3 Å of any non-hydrogen DNA atom, Na+ ions within 3.7 Å from the DNA, magnesium hexahydrate complexes with Mg2+ atoms within 6.855 Å of DNA and Cl ions within 20 Å of the DNA, which ensured that no Cl ions were initially present inside the volume occupied by packaged DNA. For other DNA models, the above cut-off parameters were slightly adjusted (by less than 5%) to obtain the desired ion concentrations. The ions located outside the volume occupied by the genome were randomly removed such that the final concentrations of ionic species outside the capsid were 0.2 mol kg−1 Na+, 0.005 mol kg−1 Mg2+ and 0.21 mol kg−1 Cl. The regions interior of and exterior to the capsid were identified using the ‘measure volinterior’ functionality of VMD82. The resulting solvated systems contained about 26 million atoms.

Equilibration of the solvated genome

The solvated genome structure was equilibrated in a series of simulations described in detail for all systems in Supplementary Tables 4–9. Below, we illustrate the procedure using the slow model as an example. After 2,500 steps of energy minimization, the system was simulated in an NPT ensemble for 1.64 ns with all non-hydrogen atoms of the DNA harmonically restrained (k = 0.1 kcal mol−1 Å−2) to their initial coordinates and coupling each non-hydrogen atom to the Langevin thermostat with a damping coefficient of 50 ps−1. During this and subsequent simulations, a grid-based potential derived from the atomic structure of the capsid was applied on all non-hydrogen atoms of the DNA to confine it within the inner volume of the capsid, whereas water and ions were allowed to move in and out of the volume occupied by the packaged DNA. Doing so greatly reduced the possibility of having an out-of-equilibrium Mg2+ configuration within the genome that could be caused by the low permeability of the protein capsid to Mg2+. After a brief 0.25 ns simulation with the Langevin damping coefficient decreased to 10 ps1, and another 8.37 ns with the damping coefficient decreased to 0.5 ps1, the system was simulated for 3.8 ns with the spring constant of the position restraints lowered to 0.01 kcal mol1 Å−2. The position restraints were then removed and the system was simulated for 29.13 ns in the NVT ensemble under the action of the confining potential. At all stages of the solvent equilibration procedure, the integrity of all Watson–Crick base pairs of the DNA was enforced by means of external harmonic potentials (k = 50 kcal mol1 Å−2).

Addition of the atomistic protein capsid

To replace the implicit representation of the capsid with a corresponding all-atom model, ions residing within the volume to be occupied by the capsid were removed during a 2 ns NVT simulation. During this simulation, all non-hydrogen atoms of the DNA were harmonically restrained (k = 0.1 kcal mol−1 Å−2) to their coordinates from the end of the previous implicit capsid run, and a grid-based potential was applied only to ions to expel them from the volume to be occupied by the capsid. The ion-expelling grid-based potential was constructed by first creating a 2 Å resolution atomic density of the protein capsid, using the volmap tool in VMD with a scaling factor of 2, and then rescaling the grid cell size to 1.97 Å. This led to the density corresponding to the protein shrinking radially inwards by about 4.5 Å. The resulting density grid was scaled by a factor of 100 kcal mol−1 Da−1 to produce an external potential. A very small number of ions that, despite the expelling potential, remained in the volume to be occupied by the explicit capsid were manually moved to the region outside the capsid. At the end of this ion-expelling procedure, the volume to be occupied by the capsid was devoid of ions, which necessitated only the removal of water molecules in that area before the addition of the atomistic capsid, which was done as described below.

Next, the atomic structure of the genome was extracted from the last frame of the simulation described in the previous section and merged with the all-atom model of the capsid. A small number of DNA–protein clashes was resolved in a 1 ns vacuum simulation in which all non-hydrogen atoms of the DNA located within 15 Å of the protein were harmonically restrained (k = 0.1 kcal mol−1 Å−2) to their initial coordinates, whereas the coordinates of all other non-hydrogen atoms of the DNA were constrained to their initial positions using the fix atoms protocol of NAMD. At the same time, the protein capsid was subject to an elastic network of restraints (k = 20 kcal mol−1 Å−2) applied to all atom pairs located within 6 Å of each other. This vacuum equilibration simulation was performed using a Langevin damping coefficient of 500 ps−1 applied to all non-hydrogen atoms. The simulation successfully removed all DNA–protein clashes, perturbing coordinates of the protein atoms by less than 0.7 Å, with the average protein RMSD with respect to the crystal structure coordinates being just 0.2 Å. The resulting DNA–protein system was merged with the solvent configuration obtained at the end of the ion-expelling simulation. Water molecules located within a 2.5 Å of non-hydrogen atoms of the protein capsid were removed. Steric clashes between the protein and the ions were identified, and the ion coordinates were adjusted to remove such clashes, making sure the ion concentration inside and outside the capsid was not disturbed. A small number of persistent clashes with Mg2+ hexahydrates were removed by shrinking the size of the hexahydrates by 0.75 followed by a 2,500-step minimization. A similar procedure was performed to build initial models of all packaged particles. Details are described in Supplementary Tables 4–9.

Equilibration of the complete all-atom models

The DNA–protein–solvent systems were next equilibrated in the NPT ensemble as detailed in Supplementary Tables 4–9. Below, we illustrate the equilibration process using the slow model of packaged DNA as an example. The initial equilibration was carried out with position restraints (k = 0.1 kcal mol1 Å−2) applied to all non-hydrogen atoms of the protein and DNA, extra bonds applied to preserve the DNA base pairing (k = 10 kcal mol1  Å−2) and coupling all non-hydrogen atoms of the system to a Langevin thermostat with a damping coefficient of 50 ps1 for the initial 3.17 ns, 10 ps1 for the next 0.4 ns and 0.5 ps1 for the remaining 21.75 ns. Following that, the position restraints on the DNA were removed and the system was simulated for 14.88 ns while keeping the restraints enforcing the DNA base pairing and the position restraints on the capsid. Then, the position restraints on the non-hydrogen atoms of the protein were lowered to 0.01 kcal mol1 Å−2 for 13.53 ns and then to 0.001 kcal mol1 Å−2 for another 12.69 ns. Next, position restraints (k = 0.001 kcal mol1 Å−2) were applied only to the backbone atoms of the protein for 8.19 ns and then only to the Cα atoms of the protein (k = 0.001 kcal mol1 Å−2) for 8.57 ns. The spring constants of the latter restraints were then lowered to 0.0001 kcal mol1 Å−2 for the next 11.8 ns along while the spring constant of the DNA base pairing restraints was reduced to 1 kcal mol1 Å−2. For the next 22.72 ns, the system was simulated without any position restraints applied to the protein, whereas the strength of the DNA base pairing restraints was reduced to 0.1 kcal mol1 Å−2. Following that, the system was simulated without applying any restraints to either protein or DNA for 886.27 ns. These steps are described in Supplementary Table 4.

Other all-atom models of packaged capsid were constructed and equilibrated using similar procedures (see Supplementary Tables 5–9 for details). Some of the fast models were equilibrated without applying the NBFIX corrections to acetate–cation and amine–carboxyl interactions83, whereas the NBFIX corrections for the interactions involving DNA phosphates63 were present in all simulations. The application of these additional NBFIX corrections slightly increased the gaps between protein interfaces and slightly increased the rate of water transport through the capsid (Fig. 3h).

Common procedures for structure analysis

Unless specified otherwise, the ‘measure fit’ command of VMD was used for alignment of structures before RMSD calculations, using coordinates of atoms resolved in the crystal structure. Analyses that required local alignment of smaller subunits, for example, per residue RMSD calculations (Fig. 3e), were performed by first aligning all 12 pentameric subunits (one is shown in Fig. 1h) to a chosen reference pentamer using VMD measure fit functionality. After the alignment, the RMSDs of individual residues were computed with respect to the crystallographic coordinates. The RMSDs were then averaged individually for each of the seven protein locations in the asymmetric subunit Fig. 3e, with each type appearing five times in a pentameric subunit.

Capsid deformation calculations (Figs. 1b and Fig. 3b) were performed on a supercomputer using a Python-based framework, whereby the calculations for every capsid symmetry transformation were distributed over multiple nodes and the computations per transformation were multi-threaded across multiple CPU cores on the respective node. MDAnalysis74 was used to first load the trajectories one frame at a time, whereafter a global alignment of the capsid was performed using the Newton–Raphson quaternion and an adjoint matrix method84. This workflow was validated against the results of the VMD measure fit command. The atomic density profiles were then computed by averaging over 4 × 4 nm2 centred sections connecting the ten pairs of opposite faces or seven pairs of opposite vertices.

To characterize steady-state RMSF of the capsid proteins (Figs. 1e and Fig. 3d), the average CoM coordinates of each of the 420 capsid proteins were calculated over the last 50 ns of the MD trajectory. For each protein, the RMSF was then measured over that time interval relative to the average CoM coordinates, sampling instantaneous configurations every 48 ps. The CoM coordinates of a protein were calculated using the coordinates of all Cα atoms resolved in the crystal structure. The RMSF values were then averaged over 60 protein copies for each protein location within the asymmetric subunit. Before displacement calculation (Fig. 3f), protein coordinates were averaged over the last 40 ns of the corresponding MD trajectories (slow and empty), sampled every 48 ps. The displacement was computed as per-residue RMSD of the two average structures.

Radial plots of average ion concentration (Figs. 1l and Fig. 3g) or DNA density (Fig. 4c) were obtained by averaging ten 4 × 4 nm2 centred sections connecting the ten pairs of the twenty opposite faces of the capsid (face symmetry axis) or six pairs of the twelve opposite vertices of the capsid (vertex symmetry axis) and over the last 48 ns of the respective MD trajectory.

All SAXS profiles were calculated using Crysol85. Base-pair-level characterization of the packaged genome structures was carried out using Curves+ (ref. 86) and the standard reference frame for the description of nucleic acid geometry87.

Cryo-EM-like and TEM-like images of the simulated structures

The last 500 ns of the slow all-atom MD trajectory was sampled every 50 ns, rotating the genome and the protein capsid according to each of the 60 icosahedral symmetries. The volmap plugin of VMD was used to produce a volumetric grid of the DNA and protein atomic density with a resolution of 2.5 Å, averaging over both simulation frames and the symmetry axes. The volmap plugin was similarly used to produce 2.5 Å resolution maps of the CG DNA mass density of each CG capsid genome, averaging over the equilibration portion of each trajectory with a stride of 40 ns. Maps were produced with and without averaging over the symmetry axes of the system. The 2D images in Fig. 4a,b were produced by averaging a 6-nm-thick slab of the density passing through the centre of the capsid, then linearly mapping the density from white to black (4–7 Da Å−3). For CG images, the repulsive regions of the capsid potential are depicted using a semi-transparent blue overlay.

Individual TEM-like images (Extended Data Fig. 6b) were obtained for CG trajectories from the maps without symmetry, averaging by selecting a normal vector, finding the coordinate of each 3D voxel in the plane corresponding to the normal vector and binning the voxels by those normal coordinates with 2.5 Å resolution and summing the density in each bin. The images were then produced from the projected density by linearly mapping the summed mass density between 200 and 1,600 Da Å−3, from white to black.

Calculation of diffusion constants

The local diffusion of the DNA genome (Fig. 4i) was characterized using the following distance histogram fitting procedure88. First, the CoM of each 10 bp DNA fragment was recorded for the last 288 ns of the equilibration trajectory, sampled every Δt = 0.48 ns. Next, displacement of each fragment over that time interval was computed, producing a distribution of instantaneous displacements. The latter was fit by the probability distribution describing diffusion:

$$p\left(r,\Delta t\right){d}^{3}r=\frac{1}{{\left(4{\rm{\pi }}D\Delta t\right)}^{3/2}}{e}^{\frac{-{r}^{2}}{4D\Delta t}}\left(4{\rm{\pi }}{r}^{2}dr\right),$$

where a particle starting from the origin finds itself in the element of volume defined by r and r + dr after time Δt with the probability p(r, Δt). The fit yielded an estimate for the diffusion coefficient, D. To evaluate local diffusion of DNA fragments along specific axes (Extended Data Fig. 5b), the 3D displacements were projected along the respective axes before constructing the corresponding displacement histograms.

The local diffusion constants of water and ions (Fig. 4j and Extended Data Fig. 5c) were computed using the Einstein’s relation. For water diffusion, coordinates of the water oxygen atoms were recorded over the last 0.96 ns of the equilibration trajectory, sampled every 9.6 ps. The mean squared displacement values were computed over half the total number of non-overlapping time lags. Each diffusion coefficient was calculated as the slope of a linear fit to the mean squared displacement versus time lag dependence, using the first 0.24 ns of the dependence for the fit. To obtain a radial distance dependence, the average radial coordinate was computed over the first 0.24 ns of each water trajectory. A radial bin size of about 13.5 Å was large enough to obtain a radial dependence and short enough to get good resolution and statistics. To characterize ion diffusion, ion coordinates were collected over the last 48 ns of the equilibration trajectory, sampled every 9.6 ps. The radial averaging was done using a bin size of 10.4 Å. All other procedures were the same as for the water diffusion calculations.

Volumetric analysis

The measure volinterior82 functionality of VMD was used to classify all regions of the system as interior, exterior or boundary with respect to the protein capsid. A closed (pore free) quicksurf representation of the protein capsid was generated using VMD by setting the radius scaling to 2.8, the isovalue to 0.5 and the grid spacing to 1.5. The raytracing algorithm used 32 rays for every coordinate frame. The boundary voxels were then eliminated by using the skeletonize algorithm89 of the scikit-image Python package, ensuring that all parts of the system were classified as interior or exterior.

Electrostatic model of a packaged particle

The model (Extended Data Fig. 2g) assumed that all charge inside the capsid is distributed at the inner surface of 29.5 nm radius sphere, which is enclosed by a 2.5 nm thick spherical shell of dielectric medium representing the protein. Outside the dielectric medium, a Debye–Hϋckel description of the electrostatics was adopted with a Debye length of 6.39 Å. In addition to the measured charge inside the capsid, an additional charge qextra was placed on the inner surface of the dielectric, with a compensating charge placed on the outer surface. Under this model the potential decays rapidly outside the capsid and the potential inside is uniform. Hence, the potential difference between inside and outside is taken as the potential inside.

Internal pressure calculations

Starting from a chosen frame of an all-atom equilibration simulation of a fully packaged capsid, we created a corresponding all-atom model of an empty capsid by removing the DNA and resolvating the capsid interior with bulk-like solvent (0.2 mol kg1 NaCl and 0.005 mol kg1 MgCl2) using the tiling method. After resolvation, small differences in the protein coordinates (order of a few angstroms) resulted in initial differences in the water density inside and outside the capsid, thereby leading to different values of the pressure at the beginning of these simulations. Following 6,000 steps of minimization, the system was equilibrated restraining all non-hydrogen atoms of the capsid to the coordinates taken from the packaged capsid equilibration (Supplementary Video 10). For each frame of the restrained equilibration, the pressure was calculated by first determining the harmonic force applied on each atom of the capsid from the atom’s displacement relative to its restrained coordinate (Fig. 5h). The radial projection of the force on each atom was then divided by the area of a sphere with radius equal to the radial coordinate of the atom. These values were then summed up over all restrained atoms of the capsid to give an instantaneous pressure. The instantaneous pressure values were averaged over the restrained equilibration trajectory. Similar equilibrium pressure values were obtained when only Cα atoms of the capsid were restrained. For each model of the packaged genome, the pressure calculation simulations were performed using two instantaneous configurations taken at the end and 50 ns before the end of the respective equilibration trajectory (Fig. 5j,k).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Source link


Leave a Reply

Your email address will not be published. Required fields are marked *