
Sample preparation

Horse heart myoglobin (hhMb) was purchased from Sigma Aldrich (M1882). After dissolving lyophilized hhMb powder (70 mg ml−1) in 0.1 M Tris HCl pH 8.0, the solution was degassed and then saturated with CO. Upon addition of sodium dithionite (12 mg ml−1) while constantly bubbling with CO gas, the colour of the solution turned raspberry red. Dithionite was removed by desalting the protein solution on a PD10 column equilibrated with CO-saturated 0.1 M Tris HCl pH 8.0. Subsequently, the MbCO solution was concentrated to approximately 6 mM using centrifugal filters before freezing in liquid nitrogen for storage.

hhMb crystals were grown in seeded batch by adding solid ammonium sulfate to a solution of 60 mg ml−1 hhMb in 100 mM Tris HCl pH 8.0 until the protein started to precipitate (about 3.1 M (NH4)2SO4). Seed stock solution was then added. Crystals appeared overnight and continued growing for about a week, yielding relatively large, often intergrown plate-shaped crystals (ref. 3). Using an HPLC pump, the crystalline slurry was fractured with tandem arrays of stainless steel 1/4 inch diameter filters (ref. 48). For beam time 1 (March experiment), the first tandem array contained 100 and 40 µm filters, followed by a second tandem array of 40, 20, 10 and 10 µm filters. For beam time 2 (May experiment), the crystals were further fractured using a tandem array of 10, 5, 2 and 2 µm stainless steel 1/4 inch diameter filters. On average, the largest dimensions of the crystallites were about 15 µm (Supplementary Fig. 12a) and about 9 µm (Supplementary Fig. 12b) for beam times 1 and 2, respectively.

Laser power titration

Time-resolved spectroscopic data for estimating the extent of photolysis as a function of laser power density were obtained using a 6 mM hhMbCO solution. The sample was placed in a rectangular borosilicate glass tube sealed with wax to keep the solution CO saturated. The optical path length was 50 μm and the thickness of the glass tube was 1 mm. The optical density at the pump laser wavelength (532 nm) was about 0.5. An identical tube filled with the buffer solution (0.1 M Tris HCl pH 8.0) was used as a blank.

The fs laser pulses were generated by a Ti-sapphire amplifier (Legend, Coherent) seeded by a Mira fs oscillator. The laser output was divided into two branches: the vast majority was used as input of an optical parametric amplifier (Topas, LightConversion) to generate the pump pulses at 532 nm, while the remaining fraction was sent onto a sapphire crystal to generate short white-light pulses. Correction for white-light temporal chirp (of less than 2 ps over the probed window) was not needed at the time delay of interest. Mechanical choppers were used to lower the original 1 kHz repetition rate of the pump and probe pulses to 1 Hz and 500 Hz, respectively. Pump and probe beams were spatially and temporally overlapped at the sample position and the relative time delay was set using a delay line. Pump pulses were focused to a full-width at half-maximum (FWHM) of about 0.1 mm, while the probing white-light beam had a diameter of about 0.02 mm (FWHM). Each time-resolved spectrum was obtained by averaging 60 consecutive pump–probe events. A Berek compensator was used to change the pump light polarization from linear to circular. The 80 fs pump pulses were stretched to about 230 fs and 430 fs by inserting 10 and 20 cm water columns, respectively, along the pump laser path (ref. 49). The difference spectra shown in Extended Data Fig. 1 were obtained using linearly polarized pump light; analogous results were found using circularly polarized light (data not shown).

Data collection at SwissFEL

The TR-SFX experiments were performed in March (beam time 1) and May (beam time 2) 2019 using the Alvra Prime instrument at SwissFEL (ref. 50; proposal no. 20181741). To follow the time-dependent light-induced dynamics, an optical pump, X-ray probe scheme was used. The repetition rate of the X-ray pulses was 50 Hz. Diffraction images were acquired at 50 Hz with a Jungfrau 16 M detector operating in 4 M mode. The outer panels were excluded to reduce the amount of data.

The X-ray pulses had a photon energy of 12 keV and a pulse energy of approximately 500 μJ. The X-ray spot size, focused by Kirkpatrick–Baez mirrors, was 4.9 × 6.4 μm² in March 2019 and 3.9 × 4.1 μm² in May 2019 (horizontal × vertical, FWHM). To reduce X-ray scattering, a beam stop was employed and the air in the sample chamber was pumped down to 100–200 mbar and substituted with helium. The protein crystals (10% (v/v) settled material, ref. 1) were introduced into the XFEL beam in a thin jet using a gas dynamic virtual nozzle (GDVN) injector (ref. 51). The position of the sample jet was continuously adjusted to maximize the hit rate. At the interaction point, the XFEL beam intersected with a circularly polarized optical pump beam originating from an optical parametric amplifier producing laser pulses with 60 ± 5 fs duration (FWHM) and 530 ± 9 nm (FWHM) wavelength, with focal spots of 120 × 130 μm² and 150 × 120 μm² (horizontal × vertical, FWHM) in March and May, respectively. The laser energy was 0.5 and 1 μJ in May and 1–18 μJ in March 2019, corresponding to laser fluences of about 2.5 to 101 mJ cm−2 and laser power densities of about 40 to 1,700 GW cm−2 (Supplementary Table 2). Using an absorption coefficient of 11,600 M−1 cm−1 for horse heart carboxymyoglobin at 530 nm, this results in nominally approximately 0.3 to 12 absorbed photons per haem at the front of a crystal facing the pump laser beam. Time zero was determined in the pumped-down chamber at the same low-pressure helium atmosphere used for data collection. Information from a THz timing tool was used for determining the actual time delay. A power titration was performed at a 10 ps time delay (March 2019). Full time series were collected for pump laser fluences of 5 (May), 23 and 101 mJ cm−2 (March).
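The conversion from pulse energy and spot size to fluence, power density and absorbed photons per haem can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes the quoted fluences are peak values of an elliptical Gaussian spot, that the power density is the fluence divided by the pulse FWHM, and that the photon count is the decadic extinction coefficient converted to a cross-section multiplied by the photon fluence. All function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23    # 1/mol
PLANCK = 6.62607015e-34     # J s
LIGHT_SPEED = 2.99792458e8  # m/s


def peak_fluence_mj_cm2(pulse_energy_uj, fwhm_x_um, fwhm_y_um):
    """Peak fluence (mJ cm^-2) of an elliptical Gaussian spot (assumed beam model)."""
    energy_mj = pulse_energy_uj * 1e-3
    fwhm_product_cm2 = (fwhm_x_um * 1e-4) * (fwhm_y_um * 1e-4)
    return 4.0 * math.log(2.0) / math.pi * energy_mj / fwhm_product_cm2


def power_density_gw_cm2(fluence_mj_cm2, pulse_fwhm_fs):
    """Fluence divided by the pulse duration (assumed definition), in GW cm^-2."""
    watts_per_cm2 = (fluence_mj_cm2 * 1e-3) / (pulse_fwhm_fs * 1e-15)
    return watts_per_cm2 * 1e-9


def absorbed_photons_per_haem(fluence_mj_cm2, epsilon_per_m_cm, wavelength_nm):
    """Nominal photons absorbed per haem at the illuminated crystal face."""
    # Decadic molar extinction coefficient -> absorption cross-section in cm^2.
    cross_section_cm2 = 1000.0 * math.log(10.0) * epsilon_per_m_cm / AVOGADRO
    photon_energy_j = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    photons_per_cm2 = (fluence_mj_cm2 * 1e-3) / photon_energy_j
    return cross_section_cm2 * photons_per_cm2


if __name__ == "__main__":
    march_max = peak_fluence_mj_cm2(18.0, 120.0, 130.0)
    print(march_max, power_density_gw_cm2(march_max, 60.0),
          absorbed_photons_per_haem(march_max, 11600.0, 530.0))
```

Under these assumptions, 18 μJ in the March spot and 1 μJ in the May spot give values close to the quoted 101 and 5 mJ cm−2, which suggests (but does not prove) that a peak Gaussian fluence was used.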
For the 5 mJ cm−2 time series, the time delay could be set with sufficient reproducibility that each time point could be collected as a single dataset, with nominal time delays of ∆t = 150, 225, 300, 375, 450, 525, 600, 750, 900 and 1,300 fs. Using the timing tool available at the beam line, the actual time delays of these datasets could then be determined to be 254, 327, 402, 471, 627, 702, 847, 1,001 and 1,401 fs, with widths of approximately 85 fs. The number of indexed lattices in each dataset ranged from about 10,000 to greater than 30,000, and greater than 60,000 in the dark dataset.

At the time the 23 and 101 mJ cm−2 time series were collected, the available timing reproducibility was lower, and datasets were collected at a series of preset nominal time delays ranging from 150 to 1,300 fs that were then merged into large sets of about 150,000 indexed lattices for both fluences. These were then sorted according to the actual time delay of each image as determined by the timing tool of the beam line. Then, the data were split into smaller datasets by moving a window of 20,000 indexed lattices over the data for each fluence in steps of 10,000 indexed lattices. Thus, each of these datasets contains 20,000 indexed lattices, with an overlap of 10,000 indexed lattices between two consecutive time points. The timing distributions of these partial datasets have standard deviations of between 40 and 70 fs. In combination with the accuracy of the timing tool, we estimate the true widths of these distributions to be approximately 100 fs. It should be noted that the overlap of the time delay distributions caused by this ‘binning’ of the 23 and 101 mJ cm−2 data will result in a ‘smearing out’ of time-dependent effects.
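The sort-and-window procedure described above can be illustrated with a short sketch. It assumes one measured delay per indexed lattice and uses a hypothetical function name; it is not the script used for the actual analysis.

```python
import statistics


def sliding_window_bins(delays_fs, window=20000, step=10000):
    """Sort per-lattice delays and cut them into overlapping windows.

    Returns a list of (mean_delay, std_delay, indices) tuples, where
    indices refer to positions in the input list.
    """
    order = sorted(range(len(delays_fs)), key=lambda i: delays_fs[i])
    bins = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(order) - window + 1, step):
        indices = order[start:start + window]
        values = [delays_fs[i] for i in indices]
        bins.append((statistics.fmean(values), statistics.pstdev(values), indices))
    return bins


if __name__ == "__main__":
    import random
    rng = random.Random(1)
    # Synthetic delays for illustration only.
    delays = [rng.uniform(150.0, 1300.0) for _ in range(1000)]
    for mean, std, idx in sliding_window_bins(delays, window=200, step=100):
        print(round(mean), round(std), len(idx))
```

Because the step is half the window, every lattice away from the two ends contributes to two consecutive time points, which is the source of the smearing mentioned above.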

The power titration data were collected during the same beam time as the 23 and 101 mJ cm−2 data series, with the time delay set to nominally 10 ps. At this long time delay, the timing reproducibility of the beam line is of no concern and most heating effects have decayed. For the power titration, as many images were collected as was practical during the beam time, and the number of indexed lattices in each dataset varies.

Thus, for the 23 and 101 mJ cm−2 fluence time delay data, each dataset contains the same number of indexed lattices, whereas for the 5 mJ cm−2 fluence time delay data and the power titration data, the number of indexed lattices differs between datasets. In serial crystallography, the precision of a dataset increases with the number of indexed lattices. However, this should not affect the magnitude of structural changes beyond measurement error levels (ref. 52), and indeed, we observed no correlation of structural changes with the number of indexed lattices for the 5 mJ cm−2 time delay data or the power titration data. Data statistics are given in Supplementary Table 1.

In each case, every 11th pulse of the pump laser was blocked, so that a series of ten light-activated diffraction patterns and one dark diffraction pattern was collected in sequence. High-quality dark datasets were generated by merging all laser-off patterns as well as separately collected, dedicated laser-off runs. The latter were also used to confirm that the interleaved dark data in the light runs were indeed dark and not illuminated accidentally.

Diffraction data analysis

Diffraction data were processed using CrystFEL 0.8.0 (ref. 53); Bragg peaks were identified using the peakfinder8 algorithm and indexing was performed using XGANDALF (ref. 54), DIRAX (ref. 55), XDS (ref. 56) and MOSFLM (refs. 55,57). Monte-Carlo integration (refs. 58,59) was used to obtain structure factor amplitudes. To calculate light-dark difference electron density maps, light data were scaled to the dark data using SCALEIT (ref. 60) from the CCP4 suite (ref. 61) using Wilson scaling. We investigated the use of different low- and high-resolution limits. Using a low-resolution limit of 30 Å worked for some datasets, but for others resulted in problems during light-dark scaling, likely due to differences in beam stop placement. However, we found that a low-resolution limit of 10.0 Å could be used for all datasets and this was therefore imposed for all calculations. Similarly, we found that a common high-resolution limit of 1.4 Å could be used for all photolysed structure determinations, which was implemented accordingly. The dark-state structures were refined against all available data.

Initially, occupancies of the photolysed state were determined by refining a model of the dark state without the CO ligand against the photolysed data and calculating mFo-DFc electron density maps using phases from a model. The heights of the peaks for the CO in the ground (dark) and photolysed CO* states were then used to calculate the occupancy \(f\) using:

$$f=\frac{{\rho }_{{\rm{CO}}* }}{{\rho }_{{\rm{dark}}}+{\rho }_{{\rm{CO}}* }}$$

where \({\rho }_{{\rm{CO}}* }\) and \({\rho }_{{\rm{dark}}}\) are the peak heights for the CO*- and dark-state CO peaks, respectively. These occupancies are shown as the red line in Fig. 1.

As is clear from the non-unity occupancies obtained, the structure factors originate from a mixture of the dark and photolysed states. To obtain refined structures of the photolysed states, we considered refinement against extrapolated structure factors (refs. 62,63) as well as multicopy refinement. In this latter method, a mixture of the dark and photolysed states is refined against the original structure factor amplitudes. As multicopy refinement performed better than structure factor extrapolation in simulations (Supplementary Note 1), we continued with the multicopy refinement method. The occupancies were determined using a multicopy refinement-based approach that results in values that are very similar to the ones obtained using mFo-DFc omit maps (Supplementary Note 1).

For each photolysed structure, a starting structure for the light state was constructed from the appropriate dark-state structure, by moving the carbon monoxide molecule away from the haem and into the photolysed-state CO binding pocket. This photolysed-state starting structure was then combined with the appropriate dark-state structure to construct a range of dark/photolysed state ‘mixture’ pdb files with varying occupancies of the photolysed state (the occupancy of the dark state was set to 1-[photolysed state occupancy]). Each of these pdb files was then refined against the original photolysed data using phenix.refine build 1.19.2_4158 (ref. 64), allowing only the coordinates and B-factors of the photolysed state part of the mixture to vary. For all refinements we used a haem geometry in which the planarity restraints were relaxed to allow the haem to respond to photolysis. The coordinates and B-factors of the dark state, as well as the occupancies of both states were kept at their starting values. After each refinement, the mFo-DFc difference electron density on the dark-state CO position was determined using phenix.map_value_at_point. At the correct occupancies of dark- and photolysed states, there should be no difference density at this position. The mFo-DFc densities at the dark-state CO position were then plotted against the occupancies of the respective mixtures. A line was fitted through these data points, and the occupancy at which this line crossed the x axis (namely, where the mFo-DFc density at the dark-state CO position was zero) was taken as the correct photolysed-state occupancy for that particular dataset. These are the occupancies shown as the black line in Fig. 1 as well as in Fig. 2. A new mixture was then constructed with that occupancy for the photolysed state and refined in the same way (namely, while keeping the dark-state coordinates and B-factors as well as all occupancies at their starting values) to obtain the final, refined structure. 
As discussed in the main text, for the short time delay pump–probe data, the crystallographic occupancy, which we determined from density peaks for the CO molecule, does not reflect the true yield of the photolysis reaction. Rather, the plateau value of the apparent crystallographic occupancy is the real photolysis yield. Accordingly, the time delay structures were all refined using the plateau value of the apparent crystallographic occupancy for the respective fluences as the correct occupancy. Model statistics are given in Supplementary Table 1.
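The occupancy search described above (refine mixtures at several trial occupancies, read off the mFo-DFc density at the dark-state CO position, fit a line and take its zero crossing) can be sketched as follows. The refinement and map sampling are not reproduced here; the sketch only covers the final line fit. The function names and the numbers in the usage example are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zero_crossing_occupancy(trial_occupancies, difference_densities):
    """Occupancy at which the fitted difference density equals zero."""
    slope, intercept = fit_line(trial_occupancies, difference_densities)
    if slope == 0.0:
        raise ValueError("difference density does not depend on occupancy")
    return -intercept / slope


if __name__ == "__main__":
    # Made-up values: photolysed-state occupancy vs. density at the dark CO site.
    occupancies = [0.05, 0.10, 0.15, 0.20, 0.25]
    densities = [-1.7, -0.6, 0.3, 1.5, 2.4]
    print(zero_crossing_occupancy(occupancies, densities))
```

A straight-line fit is taken from the text; whether the real data stay linear far from the zero crossing is not stated, so trial occupancies that bracket the crossing are the safer choice in a sketch like this.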

Structures were analysed using COOT (refs. 65,66), PYMOL (ref. 67) and custom-written Python scripts using NumPy (ref. 68) and SciPy (ref. 69). To obtain error estimates for structural parameters such as bond lengths and torsion angles, bootstrap resampling was performed as follows: for each dataset, about 100 resampled versions were created by sampling with replacement. These were used to refine about 100 versions of each structure, from which standard deviations were determined. The number of 100 resampled versions was chosen because it has been shown to result in sufficient sampling (refs. 46,47) while still being computationally tractable.
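The bootstrap error estimate can be illustrated with a minimal sketch. In the actual procedure, each resampled dataset is merged and used for a full structure refinement; here that step is replaced by a hypothetical `estimator` callable so that the resampling logic itself is runnable.

```python
import random
import statistics


def bootstrap_std(items, estimator, n_resamples=100, seed=0):
    """Standard deviation of an estimator over bootstrap resamples.

    items: the units being resampled (for example, diffraction images).
    estimator: stand-in for the merge-and-refine step; maps a resampled
               list to a single structural parameter.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Sampling with replacement, same size as the original dataset.
        resample = rng.choices(items, k=len(items))
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic per-image values standing in for real data.
    values = [rng.gauss(1.8, 0.1) for _ in range(500)]
    print(bootstrap_std(values, statistics.fmean))
```

The default of 100 resamples mirrors the number quoted in the text; the fixed seed is only there to make the sketch reproducible.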

Quantum chemistry

For the calculation of the absorption spectra and attachment–detachment density analysis, a reduced model in the gas phase was constructed that includes the Fe-porphyrin along with CO on one side of the porphyrin plane and an imidazole (part of the proximal histidine) on the other side. The geometry was optimized at the DFT/B3LYP/LANL2DZ level. The absorption spectra were computed at the optimized singlet ground state geometry at the XMS-CASPT2/CASSCF/ANO-RCC-VDZP level using OpenMolcas (refs. 70,71). An active space of 10 electrons in 9 orbitals was used (the five d orbitals of iron and four π orbitals). The stick spectra were convoluted with Gaussians of 0.1 eV FWHM to obtain the spectral envelope.
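The Gaussian broadening of a stick spectrum can be sketched as follows. The 0.1 eV FWHM is taken from the text; the use of unit-area Gaussians weighted by oscillator strength, and the function name, are assumptions made for illustration.

```python
import math


def broaden_stick_spectrum(energies_ev, strengths, grid_ev, fwhm_ev=0.1):
    """Sum of Gaussians centred on each transition, evaluated on grid_ev."""
    # Convert FWHM to the Gaussian standard deviation.
    sigma = fwhm_ev / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # unit-area Gaussian
    envelope = []
    for energy in grid_ev:
        total = 0.0
        for centre, strength in zip(energies_ev, strengths):
            total += strength * norm * math.exp(-0.5 * ((energy - centre) / sigma) ** 2)
        envelope.append(total)
    return envelope


if __name__ == "__main__":
    # Made-up transitions, for illustration only.
    grid = [1.5 + 0.01 * i for i in range(201)]
    spectrum = broaden_stick_spectrum([2.2, 2.35, 3.0], [0.02, 0.05, 1.0], grid)
    print(max(spectrum))
```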

For the relaxed scan along the Fe–C(O) dissociation coordinate, the geometries of the model system were optimized at fixed Fe–C(O) bond lengths on the lowest quintet state at the DFT level. XMS-CASPT2 calculations were performed at these geometries to obtain the PES cut, with 60 singlets included in the state-averaging to account for the dissociative state corresponding to the sequential two-photon absorption model.

The QM/MM model was constructed on the basis of the crystal structure of horse heart myoglobin (PDB code 1DWR; ref. 72). The protein was solvated in a cubic box of 70.073 Å side length containing 11,684 water molecules. First, a minimization of the whole system was performed, followed by NVT dynamics for 125 ps and a production run of 10 ns using Tinker v.8.2.1 (ref. 73). From the molecular dynamics (MD) trajectory, we extracted several snapshots to perform quantum mechanics/molecular mechanics (QM/MM) MD, using a development version of GAMESS-US/Tinker (ref. 74). The QM region includes the haem, CO and parts of the proximal and distal histidines and was described at the DFT level. The rest of the system was described at the MM level with the CHARMM36m (ref. 75) force field. A time step of 1 fs was used for the QM/MM molecular dynamics simulations.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
