- 1. Introduction
- 2. Tomography based on the intensity of specific features in the scattering data
- 3. Recovering spatially resolved X-ray scattering patterns based on component decomposition and segmentation of scattering tomogram by composition
- 4. MFA analysis
- 5. Experimental and data-analysis details
- 6. Concluding remarks
- Supporting information
- References
research papers
X-ray scattering based scanning tomography for imaging and structural characterization of cellulose in plants
National Synchrotron Light Source II, Brookhaven National Laboratory, Bldg 745, Upton, NY 11973, USA
*Correspondence e-mail: lyang@bnl.gov
X-ray and neutron scattering have long been used for structural characterization of cellulose in plants. Due to averaging over the illuminated sample volume, these measurements traditionally overlooked the compositional and morphological heterogeneity within the sample. Here, a scanning tomographic imaging method is described, using contrast derived from the X-ray scattering intensity, for virtually sectioning the sample to reveal its internal structure at a resolution of a few micrometres. This method provides a means for retrieving the local scattering signal that corresponds to any voxel within the virtual section, enabling characterization of the local structure using traditional data-analysis methods. This is accomplished through tomographic reconstruction of the spatial distribution of a handful of mathematical components identified by non-negative matrix factorization from the large dataset of X-ray scattering intensity. Joint analysis of multiple datasets, to find similarity between voxels by clustering of the decomposed data, could help elucidate systematic differences between samples, such as those expected from genetic modifications, chemical treatments or fungal decay. The spatial distribution of the microfibril angle can also be analyzed, based on the tomographically reconstructed scattering intensity as a function of the azimuthal angle.
Keywords: X-ray scattering; imaging; tomography; plant cellulose.
1. Introduction
X-ray and neutron scattering patterns from plant cell walls typically consist of diffraction peaks from crystalline cellulose and diffuse contributions from amorphous components like lignin. The scattering data can be used to characterize the abundance and degree of organization of cellulose structures, based on derived quantities such as the crystallinity index (CI) (Thygesen et al., 2005) and the degrees of correlation between cellulose fibrils. Scattering methods therefore have been used in studies of breakdown of biomass feedstock for bioenergy production (Dadi et al., 2006; Samayam et al., 2011; Inouye et al., 2014) and natural fungal decay of wood (Castaño et al., 2022; Floudas et al., 2022). Spatially resolved measurements using micro-focused X-ray beams can reveal the structural heterogeneity in the sample and therefore have the potential to provide a more precise description of how these processes take place. For instance, scanning mapping of thin sections has been used to quantify the local orientation of the cellulose microfibrils in secondary cell walls (Lichtenegger et al., 1999). However, preparing sections that are cut perpendicular to the growth direction and therefore reveal the cell-wall architecture is not always feasible, especially for samples that have been chemically treated or have already decayed naturally. Furthermore, due to the fiber geometry of the cellulose structures, the observed scattering intensity can be highly dependent on local orientation of the cellulose fibrils, making data analysis difficult. Here, we use scanning scattering tomography to circumvent both issues.
Tomographic imaging based on X-ray scattering contrast has been applied to polymer materials (Stribeck et al., 2006; Schroer et al., 2006) as well as biological tissues (Jensen et al., 2011; Liebi et al., 2015; Schaff et al., 2015; Georgiadis et al., 2021; De Falco et al., 2021). In general, the observed scattering signals may be dependent on the orientation of the underlying structure (e.g. collagen and minerals in bones, or myelin in nerve tissues). The tomographic reconstruction algorithm therefore needs to account for this projection-angle dependence. On the other hand, in cases where the basis for the reconstruction is invariant during sample rotation (see discussion by De Falco et al., 2021), the reconstruction can be much simplified. Cellulose in plant cell walls is the primary source of the observed X-ray scattering intensity. The scattering from an isolated cellulose fibril is rotationally invariant with respect to the fiber axis. Plant cells in tissues that are most abundant in cellulose are cylindrical structures that are elongated along the growth direction. The overall scattering from the enclosed cell walls resembles that from cellulose fibrils, with the growth direction being the average cellulose fiber direction. However, the presence of the microfibril angle (MFA), which is the angle between the fiber axis and the growth direction [see Fig. 1(B)], gives rise to a split peak along the azimuthal angular direction [e.g. Fig. 1(D) and the inset of Fig. 1(E)]. Nevertheless, the scattering intensity integrated over all azimuthal angles does not depend on sample rotation about the growth direction. Therefore, as long as the plant sample (e.g. a matchstick cut from wood, or the stem of a small plant) is positioned such that the growth direction coincides with the rotation axis of the projection angle and is perpendicular to the incident beam, the rotational invariance holds.
The reconstruction can then be performed using existing tools developed for tomographic imaging based on observables that are scalars, such as X-ray absorption or fluorescence emission.
Collecting the scattering data as described above can also be helpful in determining the MFA from the observed intensity. Typically, the cell-wall geometry must be separately characterized to define the cell-wall orientation as an input parameter when calculating the MFA distribution (Cave, 1997; Rüggeberg et al., 2013). In tomographic data collection, the sample is rotated for observation from different projection angles. In the scattering intensity averaged over all projection angles, all cell-wall orientations contribute equally. The overall MFA distribution can therefore be evaluated based on the scattering data alone. In addition, under the assumption of rotational invariance of the angular scattering intensity profile, we can further retrieve the local MFA and visualize its spatial distribution.
We demonstrate the workflow of scanning scattering tomography using, as examples, a bamboo sample cut from the internode and the intact stalk from several rice plants.
2. Tomography based on the intensity of specific features in the scattering data
Fig. 1 shows typical X-ray scattering patterns from a bamboo sample and a rice sample. As part of the uniform data-processing workflow summarized in Fig. 1(A), we first convert the X-ray scattering data into intensity maps with coordinates of the scattering vector q [defined as q = 4π sin θ/λ, where λ is the X-ray wavelength and 2θ is the scattering angle] and the azimuthal angle φ. The most prominent features in a typical scattering pattern are the well documented diffraction peaks from crystalline cellulose. There is also a diffuse peak at ∼0.1 Å−1 that is perpendicular to the fiber axis, which can be well explained by the correlation between cellulose microfibrils (Jakob et al., 1994; Kennedy et al., 2007; Fernandes et al., 2011; Penttilä et al., 2019). Finally, the streak at extremely low q has been attributed to the porous structure of plant tissue (Jakob et al., 1996; Nishiyama et al., 2014).
Crystalline cellulose and amorphous structural components (e.g. lignin) contribute to this intensity map differently. In the context of tomographic reconstruction, we consider the contribution from a volume element in the virtual section during sample rotation. The scattering from amorphous components is isotropic. In contrast, for crystalline cellulose, the scattering intensity is a function of the azimuthal angle, and dependent on the microfibril orientation in the cell walls (Cave, 1997; Barnett & Bonham, 2004). We have two different options for analyzing these data. An unsplit cellulose peak along the φ axis in the intensity map that is independent of the projection angle during tomographic data collection, as is the case for the bamboo sample [see the inset of Fig. 1(E)], suggests that most cellulose fibrils are highly oriented with a near-zero MFA. We could then focus on the intensity profile at φ = 0 (on the equator in the fiber-diffraction diagram): Ie(q) = I(q, φ = 0). This ensures that the intensity is free of contamination from the cellulose peaks outside of the zeroth layer line. More generally, however, the MFA is not a single value but a distribution. Multiple structural components may also have different orientational distributions [e.g. cellulose and starch contributions in Fig. 1(D), as described below]. It is therefore preferred to work with the intensity integrated over all azimuthal angles: Io(q) = ∫ I(q, φ) dφ, which is equivalent to the data from powder diffraction measurements (commonly referred to as XRD in the literature). This is because the integral over φ negates the MFA-induced intensity redistribution at the fibril level. Therefore, Io(q) is independent of the projection angle, even if the φ-dependent profile of the contribution from the individual cells could vary with the projection angle due to the cell-wall architecture.
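As an illustration of the two reductions described above, the following minimal Python sketch computes Io(q) and Ie(q) from a q–φ intensity map. The array layout (rows indexed by q, columns by φ), the uniform and gap-free φ grid, the synthetic test pattern and the function name `azimuthal_profiles` are all assumptions made for this example; they are not taken from lixtools.

```python
import numpy as np

def azimuthal_profiles(intensity_map, phi_deg):
    """Reduce a q-phi map to I_o(q) and I_e(q).

    intensity_map[i, j] is assumed to hold I(q_i, phi_j) on a uniform,
    gap-free phi grid given in degrees (hypothetical layout).
    """
    intensity_map = np.asarray(intensity_map, dtype=float)
    phi_deg = np.asarray(phi_deg, dtype=float)
    dphi = np.deg2rad(phi_deg[1] - phi_deg[0])
    # Riemann sum approximating the integral of I(q, phi) over phi.
    i_o = intensity_map.sum(axis=1) * dphi
    # Equatorial cut: the column closest to phi = 0.
    i_e = intensity_map[:, np.argmin(np.abs(phi_deg))]
    return i_o, i_e

# Synthetic peak split at +/-20 degrees, then shifted in phi by 5 bins
# to mimic a change in the apparent azimuthal profile.
phi = np.arange(-180.0, 180.0, 2.0)
q = np.linspace(1.4, 1.8, 41)
radial = np.exp(-0.5 * ((q - 1.6) / 0.03) ** 2)
angular = (np.exp(-0.5 * ((phi - 20) / 6) ** 2)
           + np.exp(-0.5 * ((phi + 20) / 6) ** 2))
pattern = np.outer(radial, angular)
shifted = np.roll(pattern, 5, axis=1)

i_o, i_e = azimuthal_profiles(pattern, phi)
i_o_shifted, i_e_shifted = azimuthal_profiles(shifted, phi)
```

In this synthetic case the integrated profile is unchanged by the shift in φ, whereas the equatorial cut is not, which is the property that makes Io(q) a convenient basis for reconstruction.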
With the rotational invariance of the scattering intensity established, we can use general-purpose software like tomopy (Gürsoy et al., 2014) for tomographic reconstruction. The contribution to the overall scattering intensity from cell walls that are not aligned with the growth direction is expected to be small and therefore neglected. As we will see in the following, this assumption does not seem to affect tomographic reconstruction and therefore likely holds true for the examples below.
As a proof of concept, we have previously demonstrated (Yang et al., 2022) tomographic sectioning of a poplar stem based on the intensity of the cellulose (020) peak in the scattering data. Fig. 2 shows similar results for the bamboo and rice samples. In principle, any parameter extracted from conventional analysis of the individual scattering patterns can be used for tomographic reconstruction, so long as it is additive and therefore follows the Radon transform. For instance, rice plants store starch in granules in the leaf sheath and culm during vegetative growth (Perez et al., 1971; Sato, 1984). Diffraction peaks that are characteristic of the A-type starch structure at q ≃ 1.07, 1.21 and 1.27 Å−1 (scattering angles of 2θ = 15.2, 17.2 and 18.0° at 8 keV, respectively; Kadan & Pepperman, 2002) are visible in the data shown in Fig. 1(C). Rice plants are also known to contain silica. However, no clear signature from silica is visible in the data, likely due to its amorphous form. We can use the intensity of the most prominent peaks that correspond to cellulose and starch in the scattering data (see details in Section 5.4) to estimate their spatial distributions based on tomographic reconstruction. This is demonstrated in Fig. 2(B). Notably, while the starch distribution appears uniform in the leaf sheath, in the culm there is a clear radial distribution, with higher concentration on the inner periphery. Starch also appears absent in tissues that show high values in the CI map, suggesting those specific cell types are not involved in starch storage.
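The conversion between the scattering angles and q values quoted above for the starch peaks can be checked with a short Python sketch. It uses the standard relation q = 4π sin θ/λ with λ = hc/E; the function name and the rounded value of hc are choices made for this example.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # h*c in keV * Angstrom

def q_from_two_theta(two_theta_deg, energy_kev):
    """Return q in 1/Angstrom for a scattering angle 2-theta in degrees."""
    wavelength = HC_KEV_ANGSTROM / energy_kev          # Angstrom
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength

# Starch peak positions quoted in the text for 8 keV X-rays.
starch_q = [q_from_two_theta(tt, 8.0) for tt in (15.2, 17.2, 18.0)]
```

For these three angles the computed values round to 1.07, 1.21 and 1.27 Å−1, in agreement with the positions given in the text.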
The intensity-based tomograms are direct measures of material abundance (cellulose and starch in this example). However, the numerical values are not calibrated. They may need to be scaled to account for the difference in the integrating q ranges used for preparing the sinograms when making comparisons between datasets (different samples or plant individuals). The values in the absorption tomogram correspond to the logarithm of absorption per voxel and are therefore comparable between tomograms. Indirect calculations of other quantities are possible. In particular, the cellulose crystallinity can be calculated based on the intensity tomograms for the cellulose (020) peak, and for the intensity minimum between the (020) and (110) peaks, as shown in the CI maps in Fig. 2. While the numerical value of the CI cannot be taken at face value, as has been discussed in the literature (e.g. French & Cintrón, 2013; Lindner et al., 2015), it is adequate for quantitative comparisons.
The small-angle X-ray scattering (SAXS) tomograms correspond to the scattering intensity in the q range where the cellulose-fibril correlation peak is expected to appear. Correlations between fibrils can affect both the magnitude and the position of this peak. Therefore, the numerical values in these maps cannot be simplistically interpreted as the abundance of a structural species. However, they could serve as a starting point to identify regions for further inspection, for instance by using the method for extracting local scattering intensity described below.
3. Recovering spatially resolved X-ray scattering patterns based on component decomposition and segmentation of scattering tomogram by composition
While the tomograms presented in the previous section provide intuitive visualization of material distribution, spatially resolved scattering data that correspond to each individual voxel in the tomogram would permit evaluation of the local structure based on traditional X-ray scattering analysis methods. In principle, tomographic reconstruction can be performed for each q value in the scattering data, to recover the local scattering intensity, as has been previously demonstrated (Jensen et al., 2011; Birkbak et al., 2015). Here, we take a different approach that is computationally less costly, by reducing the number of required reconstruction calculations from a few hundred (>400 q points in our data) to under ten, and potentially more reliable since limiting the number of components assures that the reconstructions are minimally affected by the statistical noise in the low-intensity portions (q ≃ 0.3 Å−1) of the scattering data (see Fig. S1 of the supporting information). Representing the data with a small number of components amounts to reducing the dimensionality of the dataset and facilitates the application of machine-learning methods like clustering analysis, as we will show below.
It is reasonable to assume that the sample being measured contains only a finite number of structural components. If we could identify the scattering profiles that correspond to these components, by decomposing the scattering data into this basis set, we would be able to quantify the contribution from each component in a spatially resolved manner. We could then use tomographic reconstruction to convert these distributions as functions of angle and position to the cross-sectional distributions of the same components. In practice, the number of components and their corresponding scattering profiles are typically unknown. However, we can decompose the scattering patterns based on a set of mathematically selected components, or basis vectors, and similarly recover the local scattering intensity as we would do for physical components. Fig. 3 shows this process being applied to the bamboo and rice samples, once we decompose the scattering profile into three and four components, respectively, using non-negative matrix factorization (NMF, see Section 5.6 for details). As examples, the reconstructed local scattering profiles are shown for three different locations in each sample.
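The reasoning above relies on the decomposition being linear, so that projecting and recomposing can be done in either order. The following NumPy sketch illustrates this with random placeholder data: it recomposes per-voxel profiles from component maps and basis vectors, and checks additivity for a single projection direction. The array names, shapes and random inputs are hypothetical and stand in for real basis vectors and reconstructed tomograms.

```python
import numpy as np

rng = np.random.default_rng(0)
n_comp, n_q, ny, nx = 3, 40, 6, 6

# Hypothetical inputs: basis[k, q] are component profiles from the
# decomposition, tomograms[k, y, x] are the reconstructed component maps.
basis = rng.random((n_comp, n_q))
tomograms = rng.random((n_comp, ny, nx))

def recompose(tomograms, basis):
    """Return local profiles I[y, x, q] = sum_k tomograms[k, y, x] * basis[k, q]."""
    return np.tensordot(tomograms, basis, axes=(0, 0))

local = recompose(tomograms, basis)

# Additivity check for one projection direction (line sums along y):
# projecting the full I(q) map equals recombining the projected component maps.
projected_profiles = local.sum(axis=0)        # shape (nx, n_q)
projected_weights = tomograms.sum(axis=1)     # shape (n_comp, nx)
recombined = np.tensordot(projected_weights, basis, axes=(0, 0))
```

Because the two routes agree, only one reconstruction per component is needed instead of one per q value.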
These results are in general consistent with the tomograms in Fig. 2. For instance, for the rice sample, the top two basis vectors in Fig. 3(C) exhibit the scattering characteristics from the cellulose fibril correlation and starch. Not surprisingly, the resulting tomograms are similar to the SAXS and starch maps in Fig. 2(B). On the other hand, the NMF analysis highlights some features in the scattering data that may be easily missed unless specifically looked for. For the bamboo sample, the third basis vector shows several sharp but low-intensity peaks. We are unable to identify the corresponding structural component. Since this component is abundant in the cortex (near the outer surface of the culm), it could be partially due to water-insoluble components that typically need to be removed when bamboo is used as a structural material (Nkeuwa et al., 2022). Bamboo is also known to store starch in the undifferentiated cells between vascular bundles (Wang et al., 2016). The extracted starch has been identified to have the A-type structure (Felisberto et al., 2020). However, since the presence of starch is seasonal, transient structural species that form during starch metabolism could have contributed to what we observe as well.
The components identified by NMF reflect prominent features observed in the scattering data, but in general they do not correspond to physical constituents. The NMF algorithm only assures that the basis vectors have positive values. Some vectors may contain features that are not realistic in scattering data, such as a dip in intensity that corresponds to the inverse of a peak from a different basis vector [see examples in Fig. 4(A)]. NMF results are not unique, but rather depend on the input parameters. NMF also does not account for any feature that varies slightly in position (e.g. the fibril correlation peak) in the dataset as a single component, but rather interprets that as a superposition of different peaks. These are well known issues and an active research area in the X-ray scattering and powder diffraction community (e.g. Maffettone et al., 2021). Nevertheless, the recomposed spatially resolved scattering data are expected to be independent of the basis set used for the decomposition, as long as the process is sufficiently accurate in describing the original experimental data. Even when the tomographic reconstruction is marginal [e.g. the second component for the bamboo sample, as shown in Fig. 3(C)], the salient features in the scattering pattern are still captured.
The recomposed scattering data can now be analyzed using methods developed for conventional scattering data. The large number of data points (∼10⁵ per sample in these examples) makes it very computationally costly to perform analysis (e.g. model fitting) on the individual scattering intensity profiles before tomographic reconstruction. This is also true for the recomposed data. In cases where features in the NMF components can all be accounted for by physical constituents, the individual components can be analyzed first. For instance, they can be fitted to find the intensity of known crystalline cellulose peaks. These results can then be used either to construct tomograms as discussed in Section 2 or to derive the results for the same analysis for the recomposed per-voxel data.
In samples with complex compositions, it may be more important to understand how the overall composition varies spatially, rather than to identify the distribution of each pure component. This is analogous to the CI that captures the relative abundance of crystalline cellulose and the amorphous component being more informative than the abundance of either the crystalline or amorphous component alone. For this purpose, decomposition by NMF can be seen as a process of reducing the dimensionality of the parameter space, to provide inputs to further composition analysis, for instance by clustering. This is shown in Fig. 4. Here, we have performed NMF to decompose eight sets of scattering data from samples of four genotypes into six common components, followed by a clustering analysis using k-means to 'segment' the virtual section into three clusters, shown as different colors. The scattering intensities representative of these clusters are shown in Fig. 4(C).
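A minimal sketch of this segmentation step is given below, using random placeholder data. It clusters the per-voxel component coefficients and then averages the recomposed intensity within each cluster. The k-means routine is a plain NumPy implementation of Lloyd's algorithm written for this example, not necessarily the implementation used for Fig. 4, and the array names and random inputs are hypothetical.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), centers

rng = np.random.default_rng(1)
n_voxels, n_comp, n_q = 300, 6, 40
basis = rng.random((n_comp, n_q))          # hypothetical common NMF basis
weights = rng.random((n_voxels, n_comp))   # hypothetical per-voxel coefficients

labels, centers = kmeans(weights, k=3)
profiles = weights @ basis                 # recomposed per-voxel intensity
cluster_profiles = {int(j): profiles[labels == j].mean(axis=0)
                    for j in np.unique(labels)}
```

Clustering is done in the low-dimensional coefficient space, while the cluster averages are formed from recomposed intensities and can therefore be read as scattering profiles.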
This analysis is helpful when the NMF components do not show features that can be clearly attributed to structural components and therefore cannot be interpreted as scattering intensity from physical structures. In contrast, the cluster averages do represent actual scattering intensity and are therefore interpretable. In this specific example, the main difference between the clusters appears to be starch abundance, with the gold component being the most cellulose rich and the cyan component being the most starch rich. The biological study (Dwivedi et al., 2024) that produced these rice plants was designed to elucidate the consequence of introducing genes that affect the production of lignin, which is unclear from this analysis due to the presence of starch. On the other hand, the distribution of these components clearly varies between plants and between different locations within the same plant. Interpretation of these data would be more informative with better defined biological context.
4. MFA analysis
As discussed in the Introduction, averaging the scattering data over all projection angles eliminates the contribution from cell-wall architecture in the observed azimuthal angle dependence in the scattering intensity. The MFA distribution can then be estimated based on a finite number of discrete MFA values, similar to the method described by Rüggeberg et al. (2013). To do so, the contribution from amorphous components is first subtracted, assuming that it can be represented by the scattering intensity just outside of the crystalline cellulose peak (q = 1.8–1.9 Å−1), and the contribution from higher-order cellulose peaks is ignored. These approximations will likely introduce some inaccuracies in the subsequent MFA analysis. We have therefore employed a simplified method (see Section 5) for extracting the approximate MFA distribution, instead of fitting the data using an assumed functional form for the distribution. Figs. 5(A) and 5(B) show this analysis being applied to the rice plants from Fig. 4.
Assuming that the azimuthal angle dependence of the intensity from each voxel is invariant with respect to the projection angle, we can perform tomographic reconstruction for each azimuthal angular position (φ) and retrieve the azimuthal angular intensity profile for each voxel, I(φ; x, y). This assumption would hold true if the distribution of cell-wall orientation within each voxel is close to isotropic, which is more likely when the voxel size is large (∼5 µm in this study) compared with the cell size. The sinograms for these angular positions do not exhibit any anomalies (e.g. intensity discontinuity or unusual symmetry) and the reconstructions show the same morphology as other tomograms from the same sample (see the supporting movie in the supporting information). Therefore, at a minimum, these tomograms can be considered as a reasonable approximation of the ground truth. However, due to the underlying assumption of invariance in the azimuthal intensity profile, they should be considered qualitative until more rigorous validation can be performed.
To visualize the MFA distribution, we plot [Figs. 5(C) and 5(F)] the nominal MFA value, defined as the intensity-weighted average of the absolute value of the azimuthal angular positions for each reconstructed scattering profile:
φa(x, y) = ∫ |φ| I(φ; x, y) dφ / ∫ I(φ; x, y) dφ.
This is not a direct measurement of the MFA, but small MFAs do lead to a low φa value. In the extreme case, φa = 0 if all microfibrils are aligned with the growth direction. There is clearly a variation of MFA within these plants. In particular, the sclerenchyma cells in the periphery of the leaf sheath and the exterior surface of the culm exhibit the lowest MFA values (see the supporting movie). This is qualitatively consistent with the azimuthal intensity profiles observed during data collection, e.g. the angular distribution is narrower when the beam illuminates the exterior of the sample compared with when the beam path traverses the center of the stalk (Fig. S8). With the reconstructed full azimuthal angular intensity profile, we can also perform the MFA analysis for each voxel in the virtual section, as shown in Figs. 5(D) and 5(E).
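The nominal MFA value φa, as defined in words above (the intensity-weighted average of |φ|), can be evaluated for a single reconstructed azimuthal profile as in the sketch below. It assumes a uniform φ grid in degrees measured from the equator, so that the grid spacing cancels in the ratio, and non-negative intensities. The function name and the test profiles are made up for this example.

```python
import numpy as np

def nominal_mfa(phi_deg, intensity):
    """Intensity-weighted mean of |phi| for one azimuthal profile (degrees)."""
    phi = np.asarray(phi_deg, dtype=float)
    inten = np.asarray(intensity, dtype=float)
    total = inten.sum()
    if total <= 0:
        return float("nan")  # no usable signal in this voxel
    return float(np.sum(np.abs(phi) * inten) / total)

phi = np.arange(-90.0, 91.0, 1.0)

# All intensity on the equator: fibrils aligned with the growth direction.
aligned = np.zeros_like(phi)
aligned[phi == 0] = 1.0

# Peak split symmetrically at +/-20 degrees.
split = np.zeros_like(phi)
split[np.abs(phi) == 20] = 1.0

phi_a_aligned = nominal_mfa(phi, aligned)
phi_a_split = nominal_mfa(phi, split)
```

For the idealized split profile the nominal value equals the splitting angle; for realistic, broadened profiles it is only a summary statistic of the angular spread.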
This analysis is also performed for the bamboo sample for comparison, with the nominal MFA shown in Fig. 5(F) and the local MFA analysis shown in Figs. 5(G) and 5(H). The MFA distributions in the fibers and parenchyma are consistent with the results reported in a previous study (Ahvenainen et al., 2017), where data carefully collected from each type of tissue were analyzed based on cell-wall architecture obtained from absorption-based X-ray microtomography.
5. Experimental and data-analysis details
The overall workflow for data processing and analysis is summarized in Fig. 1(A). Detailed descriptions of each step are given below. This workflow has been implemented in the lixtools Python package (https://github.com/NSLS-II-LIX/lixtools).
5.1. Samples
The bamboo sample, possibly Phyllostachys bissetii, was harvested from a local residential area. An ∼20 mm-long section was extracted from the internode of a plant that was ∼10 mm in outer diameter. The rice samples were provided by Dr Chang-Jun Liu's group (Dwivedi et al., 2024). Four different samples (see Figs. 4 and 5) were measured, including two MOMT variants in which lignin biosynthesis had been modified. All samples were air dried and measured without any further treatment.
5.2. Data collection
The X-ray scattering experiments were performed at the Life Science X-ray Scattering (LiX) beamline (Yang et al., 2022) at the NSLS-II synchrotron source of Brookhaven National Laboratory (Upton, New York, USA). The X-ray beam was focused to the sample position within a spot size of ∼5 µm. The X-ray energy was 15 keV. Scattering patterns were collected at a frame rate of 20 Hz, as the sample was scanned along the x axis (perpendicular to the beam and the growth direction) in fly scanning mode. That is, the motion controller (Newport XPS) produced pulses at a series of predetermined positions to trigger the detectors as the sample stage moved continuously. The incident and transmitted beam intensities were also recorded as previously described (Yang et al., 2020, 2022). The averaged sample position during the detector exposures followed 5 µm steps. A total of 121 projection angles (Ry at 1.5° intervals) were collected, divided into eight groups of evenly spaced angles, such that each group covered half a rotation, e.g. 0, 5, 10,…, 180°, followed by 1.5, 6.5, 11.5,…, 176.5°, etc. Between groups, the sample was shifted slightly (10 µm) along the rotation axis to limit radiation damage. This implicitly assumes that the structure does not change significantly along the growth direction. This is confirmed by the general lack of discontinuities in the sinograms that would otherwise arise from structural differences that are adjacent in space but measured far apart in projection angles (the beginning of an angular group and the end of the next one, e.g. 0 and 176.5°). Since these angular positions are also measured far apart in time, structural changes due to radiation damage would produce similar discontinuities. Sinograms, especially those based on mathematical components (see Fig. S9), can therefore be used effectively to monitor radiation damage.
5.3. Assembling data for analysis
For each data point (x and Ry), the X-ray scattering patterns from the two detectors were merged into a combined q–φ map, with central symmetry applied to increase the detector coverage (Yang et al., 2022). This intensity map was then further reduced to one-dimensional q and φ profiles. The background scattering, for which we took the average intensity from the scattering patterns of the lowest overall intensity when the sample is not illuminated by the beam, was subtracted based on the transmitted beam intensity. In this process, the data were also corrected for sample absorption, as measured by the incident and transmitted intensities. The scattering patterns sometimes contain sharp diffraction peaks that can be attributed to cuticular wax, which were removed using a simple rolling ball algorithm that excludes sharp features in the data.
5.4. Scattering intensity based sinograms
The sinograms, I(x, Ry), were first calculated based on various features in the scattering profile, then converted to the corresponding tomograms (see below). The SAXS sinograms are based on the scattering intensity integrated in the q range of 0.1–0.15 Å−1. The cellulose and amorphous tomograms are based on intensities integrated within the q ranges of 1.55–1.63 and 1.28–1.37 Å−1, respectively. The CI tomograms were calculated from the cellulose and amorphous tomograms, based on the Segal definition:
CI = (I020 − Iam)/I020,
where I020 and Iam are the values in the cellulose and amorphous tomograms, respectively, after normalization to account for the difference in the integrating ranges. Low-intensity voxels were excluded to avoid division-by-zero artefacts.
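The CI calculation described above can be sketched in Python as follows. The Segal index is taken in its usual form, (I020 − Iam)/I020. Normalizing each tomogram by the width of its q integration range, and masking voxels below a fraction of the maximum cellulose intensity, are assumptions made for this example; the threshold parameter `min_fraction` is hypothetical.

```python
import numpy as np

def segal_ci(cellulose, amorphous, dq_cellulose=1.63 - 1.55,
             dq_amorphous=1.37 - 1.28, min_fraction=0.05):
    """Segal crystallinity index per voxel, NaN where the signal is too low."""
    # Assumed normalization: divide by the width of each integration range.
    i_c = np.asarray(cellulose, dtype=float) / dq_cellulose
    i_a = np.asarray(amorphous, dtype=float) / dq_amorphous
    # Exclude low-intensity voxels to avoid division by (near) zero.
    mask = i_c > min_fraction * i_c.max()
    ci = np.full(i_c.shape, np.nan)
    ci[mask] = (i_c[mask] - i_a[mask]) / i_c[mask]
    return ci

# Two-voxel toy example: one voxel with signal, one empty voxel.
cellulose_map = np.array([[0.8, 0.0]])
amorphous_map = np.array([[0.45, 0.0]])
ci_map = segal_ci(cellulose_map, amorphous_map)
```

With these toy numbers the normalized intensities are about 10 and 5, giving a CI of about 0.5 in the first voxel, while the empty voxel is masked.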
The data from the rice sample were processed differently due to the presence of starch. The intensity of the peaks attributed to starch was estimated using a rolling ball algorithm then integrated within the q ranges of 1.10–1.18 and 1.25–1.32 Å−1, where pure cellulose scattering does not show significant intensity (see Fig. S2). Similarly, the intensity for crystalline cellulose and amorphous components was integrated in the q ranges of 1.45–1.55 and 1.30–1.35 Å−1, which are minimally affected by the subtraction of the assumed starch scattering. While it is possible to fit each scattering profile based on a model that includes all known diffraction peaks from starch and cellulose, as well as amorphous scattering from other components, doing so would be very time consuming, given that each dataset contains ∼10⁵ individual scattering patterns.
5.5. Pre-scaling
The scattering intensity spans several orders of magnitude over the full q range of the data [see Fig. 1(E)]. Since the loss function in the decomposition algorithm is based on intensity, low-intensity features are more likely to be neglected. In order not to miss any important features, before decomposition we first multiplied the data with a shape factor (Fig. S3) that significantly reduces the dynamic range of the data without introducing any new features. In Figs. 3 and 4, the basis vectors are based on the modified data, while the recomposed data are divided by the same shape factor to recover the original dynamic range.
5.6. NMF
Decomposition by NMF was performed using scikit-learn (Pedregosa et al., 2011), using the default Frobenius beta loss. The parameters were manually adjusted to obtain basis vectors that resemble realistic scattering profiles in Fig. 3. To estimate the number of components required to adequately represent the data, trial runs were performed with an increasing number of components, as shown in Fig. 3(A). The number of components is selected such that more components do not result in a significant decrease in the beta loss. For the analysis in Fig. 4, all default NMF parameters were kept intentionally, to produce a basis set that clearly does not represent physical components. In addition, due to the larger size of the dataset combined from eight samples, the randomized singular value decomposition (SVD) algorithm was used to estimate the required number of basis vectors, by placing the cut-off at ∼1% of the first SVD eigenvalue.
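The procedure for choosing the number of components can be sketched as below. This is a stand-in written in plain NumPy, using the classic multiplicative-update rule for the Frobenius loss rather than the scikit-learn solver named above. The synthetic three-peak dataset and the function names are assumptions made for this example.

```python
import numpy as np

def nmf(X, n_components, n_iter=500, seed=0, eps=1e-12):
    """Multiplicative-update NMF minimizing ||X - W H||_F with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], n_components))
    H = rng.random((n_components, X.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(X, W, H):
    """Frobenius norm of the residual, normalized to that of the data."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Synthetic data: 200 "patterns" mixing three non-negative peak profiles.
rng = np.random.default_rng(2)
q = np.linspace(0.0, 1.0, 60)
true_h = np.stack([np.exp(-0.5 * ((q - c) / 0.05) ** 2) for c in (0.2, 0.5, 0.8)])
X = rng.random((200, 3)) @ true_h

errors = []
for n in range(1, 6):
    W, H = nmf(X, n)
    errors.append(relative_error(X, W, H))
```

Plotting `errors` against the number of components gives a curve of the kind described for Fig. 3(A). The number of components is chosen where adding more no longer reduces the error appreciably.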
5.7. Tomographic reconstruction
The final virtual cross sections that correspond to each component were calculated from sinograms using standard tomography software, tomopy (Gürsoy et al., 2014). The algorithm pml-hybrid (Chang et al., 2004) was used with typically 100 iterations. The numerical values were normalized based on the values in the sinograms. Representative sinograms are shown in the supporting information. For consistency, all tomographic reconstructions for the same dataset were performed using the same rotation center value, which is determined from test-running reconstruction on the absorption data.
5.8. Accuracy of the tomograms
The accuracy of the reconstructed tomograms described above can be affected by several factors. First, the use of a finite number of components necessarily introduces some discrepancy between the actual dataset and the simplified set that is used in the subsequent data analysis. This can be evaluated by the relative error of NMF, calculated as the Frobenius beta loss normalized to the Frobenius norm of the original data and shown in Fig. 3(A). Some examples of the decomposition are also shown in Fig. S4. Second, the results produced by iterative reconstruction algorithms are not strictly the mathematical inverse of the input sinograms. Combining the two steps in future reconstruction algorithms may improve the accuracy of this analysis.
From the standpoint of scattering data collection, due to the finite sample size, different parts of the sample that contribute to the intensity in the same detector pixel in fact correspond to slightly different q values. In our measurements, the maximum lateral dimension of the sample is less than 5 mm, compared with the mean sample-to-detector distance of ∼350 mm. This corresponds to an uncertainty of ∼1.5% in q, or a smearing of ∼0.02 Å⁻¹ at the location of the cellulose main peak. This is considered negligible, compared with the q grid of 0.01 Å⁻¹ in the q–φ intensity map. Data collection can also benefit from better detector coverage, as can be seen from Figs. 1(C) and 1(D).
5.9. Voxel size in the tomograms
The voxel size is set by the step size in the data collection, which is 5 µm for the results reported here and was chosen to be close to the beam size. This is a good compromise for many plant samples. At this resolution, sufficient morphological details are preserved, allowing reasonable comparison with optical micrographs. With the current data-collection speed of 20 frames per second on the scattering detectors, which is limited by the speed of packaging the data into hdf5 files, data collection on a sample with a maximum lateral dimension of 3 mm takes 1 h. Given the same incident beam intensity, a smaller beam size would result in a higher rate of radiation deposited into the sample. This may require the data collection to run proportionally faster to limit radiation damage, resulting in lower scattering intensity. It would also take more data points to cover the same field of view. The compromise between sample morphology, data-collection speed, data quality and radiation damage ultimately determines the optimal voxel size in these measurements.
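The numbers quoted above can be combined into a rough frame budget. The sketch below is a back-of-envelope estimate that assumes every frame is one translation step and ignores any overhead from motor motion; the helper names are hypothetical, and the resulting number of projections is an inference from the quoted figures rather than a reported value.

```python
def n_positions(width_um, step_um):
    """Number of translation positions needed to span the sample width."""
    return int(round(width_um / step_um)) + 1

def frames_available(duration_s, frame_rate_hz):
    """Total number of frames that fit in the scan duration."""
    return int(duration_s * frame_rate_hz)

positions = n_positions(3000, 5)        # 3 mm sample, 5 um step
frames = frames_available(3600, 20)     # 1 h at 20 frames per second
projections = frames // positions       # full projections that fit in the budget

# Halving the step size roughly doubles the positions per projection.
positions_fine = n_positions(3000, 2.5)

print(positions, frames, projections, positions_fine)
```

Under these assumptions, a finer step leaves fewer projections within the same hour, or requires a proportionally longer scan, before any change in angular sampling is considered.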
5.10. Clustering
After the tomographic reconstruction, we now have a set of distribution maps of the NMF components. In the parameter space, each voxel in the virtual cross section is represented by a point with coordinates corresponding to the amplitude of each NMF component. These voxels were grouped into clusters using the k-means algorithm implemented in scikit-learn (Pedregosa et al., 2011), based on their locations in the parameter space. The number of clusters (Nc) was selected based on the change in inertia, which is the objective function minimized by the k-means algorithm and defined as the sum of squared distances of data points to the cluster center, as the number of clusters was increased [the inset of Fig. 4(C)]. There is not a clear best choice. Nc = 3 was chosen to keep the maps from becoming too difficult to read. The coordinates of the centroid of each cluster were used to calculate the representative scattering intensity for each cluster, as shown in Fig. 4(C). Clustering analyses depend on the evaluation of distances in parameter space. Since we were interested in the material composition, this analysis was based on the relative magnitude of different components, after the length of all vectors that represent the data in the parameter space was normalized to unity. However, since the basis sets do not correspond to physical components, this may not be the best representation of material composition.
5.11. MFA decomposition
The angular intensity profile, after removal of the estimated contribution from amorphous components, is decomposed into intensity distributions that correspond to a set of discrete MFA values, from 0 to 90° at an interval of 3° (Fig. S7), assuming an intrinsic peak width of 5°. The decomposition is performed using the non-negative least-squares (NNLS) algorithm implemented in scipy (Virtanen et al., 2020). The basis set of intensity distributions is stored in a matrix A. To decompose the observed intensity y, we need to solve the equation y = Ax, where x gives the MFA distribution, by minimizing |y − Ax|². To avoid over-fitting, two regularization terms are also added to simultaneously minimize the squared sum of the coefficients |Ix|², where I is the identity matrix, and the difference between the neighboring terms |Dx|², where the only non-zero elements in D are directly below the diagonal and have the value of −1. Effectively, the equation that the NNLS algorithm needs to solve becomes
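\[
\begin{pmatrix} y \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} A \\ \lambda_1 I \\ \lambda_2 D \end{pmatrix} x ,
\]
where the Lagrange multipliers λ1 and λ2 are adjusted manually to get the best results.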
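A regularized problem of this kind can be passed to an NNLS solver by appending the regularization terms as extra rows of the matrix, with zeros appended to the data vector. The sketch below illustrates this with scipy under several assumptions that are not taken from the analysis itself: the basis profiles are a toy model (a pair of Gaussian peaks at ±MFA, with 5° used as the standard deviation), the neighbor-difference operator is built as a first-difference matrix (1 on the diagonal, −1 directly below it), and the multiplier values are arbitrary.

```python
import numpy as np
from scipy.optimize import nnls

def regularized_nnls(A, y, lam1, lam2):
    """Solve min |y - Ax|^2 + |lam1 * x|^2 + |lam2 * diff @ x|^2 with x >= 0."""
    n = A.shape[1]
    identity = np.eye(n)
    # Assumed first-difference operator: (diff @ x)[i] = x[i] - x[i-1]
    diff = np.eye(n) - np.eye(n, k=-1)
    stacked_A = np.vstack([A, lam1 * identity, lam2 * diff])
    stacked_y = np.concatenate([y, np.zeros(2 * n)])
    x, _ = nnls(stacked_A, stacked_y)
    return x

# Toy basis set: one angular profile per discrete MFA value (0 to 90 deg, step 3).
phi = np.arange(-90.0, 91.0, 1.0)      # azimuthal angle grid, degrees
mfa_values = np.arange(0.0, 91.0, 3.0)
width = 5.0                            # assumed Gaussian sigma, degrees

def toy_profile(mfa):
    """Hypothetical profile: Gaussian peaks at +mfa and -mfa."""
    return (np.exp(-0.5 * ((phi - mfa) / width) ** 2)
            + np.exp(-0.5 * ((phi + mfa) / width) ** 2))

A = np.column_stack([toy_profile(m) for m in mfa_values])

# Synthetic observation generated from a single MFA of 30 degrees.
x_true = np.zeros(len(mfa_values))
x_true[10] = 1.0
y = A @ x_true

x_fit = regularized_nnls(A, y, lam1=0.1, lam2=0.1)
print(mfa_values[np.argmax(x_fit)])
```

Because neighboring basis profiles overlap strongly at a 3° spacing, the unregularized problem is poorly conditioned; the extra rows trade a small amount of broadening in the recovered distribution for stability.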
6. Concluding remarks
We have demonstrated scattering-based scanning tomography for plant samples, for which the rotational invariance of scattering intensity is generally satisfied when the growth direction is aligned to the rotation axis in the measurements. As an imaging method, scattering tomography provides direct visualization of the sample. At the same time, this method reveals information on the underlying structure that is only accessible through analysis of the scattering intensity. The data-analysis workflow described in this article necessarily introduces some systematic errors. The approach of representing data as mathematical components enables the calculation of spatially resolved scattering intensity and further analysis using machine-learning methods. However, the fidelity of this component representation to the ground truth depends on the components chosen and the subsequent tomographic reconstruction. This is especially true for the inter-fibril correlation peak, whose variation in position would require multiple components at different discrete positions to reproduce. Therefore, further work is needed to improve the accuracy of the data-analysis workflow. The rotational invariance is another issue, which is assumed but may not always be satisfied in the analysis of local MFAs. Nevertheless, even as a qualitative diagnostic tool, scattering tomography as described here is still valuable for helping the experimenter to identify interesting areas in intact samples for more detailed studies.
Scattering-based imaging complements other imaging modalities that are based on absorption and fluorescence and therefore not sensitive to material structures. They can be particularly useful when used in combination. For instance, the variation in crystallinity or cellulose fibril correlation inferred from the SAXS intensity could be correlated with the distribution of chemical agents that are expected to break down cell-wall structures, to evaluate the efficacy of chemical treatment in bioenergy research. We are currently implementing simultaneous fluorescence and scattering data collection at the LiX beamline. The sample itself will absorb fluorescence emission from the interior of the sample, which can be corrected following recently developed methods (Ge et al., 2022).
Radiation damage is an important consideration in synchrotron-based measurements on biological samples. In principle, the sample could be measured in a frozen-hydrated state, as is routinely done for protein crystallography and cryo-electron microscopy. However, performing flash freezing and maintaining the sample temperature during the measurement may not always be possible for larger samples. In the example presented here, all samples were measured dry, limiting the damage by radiation-induced free radicals. As a mitigation measure, we periodically translated the sample along the rotation axis during data collection to expose fresh parts of the sample to the X-rays and used the sinograms to monitor radiation damage as described earlier (an example of unmitigated radiation damage is shown in Fig. S9). Given that tomograms can be obtained even from scattering data of low intensities (Fig. S1), we should be able to further reduce radiation damage by shortening the exposure time for scattering-data collection. Future instrumentation developments to optimize data collection will be combined with improved tomographic reconstruction algorithms, to better account for the various assumptions we have made in the data-analysis workflow.
Supporting information
Supporting figures. DOI: https://doi.org/10.1107/S1600577524004387/vl5025sup1.pdf
Supporting video. DOI: https://doi.org/10.1107/S1600577524004387/vl5025sup2.mp4
Acknowledgements
The author thanks Dr Chang-Jun Liu and his group for their long-standing support of this research by providing test samples.
Funding information
This work was performed at the LiX beamline as part of the Center for BioMolecular Structure (CBMS), which is primarily supported by the National Institutes of Health (NIH), National Institute of General Medical Sciences (NIGMS) through a P30 Grant (P30GM133893), and by the US Department of Energy (DOE) Office of Biological and Environmental Research (KP1605010). LiX also received additional support from NIH Grant S10 OD012331. As part of NSLS-II, a national user facility at Brookhaven National Laboratory, work performed at the CBMS is supported in part by the US DOE, Office of Science, Office of Basic Energy Sciences Program under contract number DE-SC0012704.
References
Ahvenainen, P., Dixon, P. G., Kallonen, A., Suhonen, H., Gibson, L. J. & Svedström, K. (2017). Plant Methods, 13, 5.
Barnett, J. R. & Bonham, V. A. (2004). Biol. Rev. 79, 461–472.
Birkbak, M. E., Leemreize, H., Frølich, S., Stock, S. R. & Birkedal, H. (2015). Nanoscale, 7, 18402–18410.
Castaño, J. D., Muñoz-Muñoz, N., Kim, Y. M., Liu, J., Yang, L. & Schilling, J. S. (2022). mSphere, 7, e00545.
Cave, I. D. (1997). Wood Sci. Technol. 31, 225–234.
Chang, J., Anderson, J. M. M. & Votaw, J. R. (2004). IEEE Trans. Med. Imaging, 23, 1165–1175.
Dadi, A. P., Varanasi, S. & Schall, C. A. (2006). Biotech. Bioeng. 95, 904–910.
De Falco, P., Weinkamer, R., Wagermaier, W., Li, C., Snow, T., Terrill, N. J., Gupta, H. S., Goyal, P., Stoll, M., Benner, P. & Fratzl, P. (2021). J. Appl. Cryst. 54, 486–497.
Dwivedi, N., Yamamoto, S., Zhao, Y., Hou, G., Bowling, F., Tobimatsu, Y. & Liu, C. (2024). Plant Biotechnol. J. 22, 330–346.
Felisberto, M. H. F., Beraldo, A. L., Costa, M. S., Boas, F. V., Franco, C. M. L. & Clerici, M. T. P. S. (2020). Food. Res. Int. 132, 109102.
Fernandes, A. N., Thomas, L. H., Altaner, C. M., Callow, P., Forsyth, V. T., Apperley, D. C., Kennedy, C. J. & Jarvis, M. C. (2011). Proc. Natl Acad. Sci. USA, 108, E1195–E1203.
Floudas, D., Gentile, L., Andersson, E., Kanellopoulos, S. G., Tunlid, A., Persson, P. & Olsson, U. (2022). Appl. Environ. Microbiol. 88, e00995.
French, A. D. & Santiago Cintrón, M. (2013). Cellulose, 20, 583–588.
Ge, M., Huang, X., Yan, H., Gursoy, D., Meng, Y., Zhang, J., Ghose, S., Chiu, W. K. S., Brinkman, K. S. & Chu, Y. S. (2022). Commun. Mater. 3, 37.
Georgiadis, M., Schroeter, A., Gao, Z., Guizar-Sicairos, M., Liebi, M., Leuze, C., McNab, J. A., Balolia, A., Veraart, J., Ades-Aron, B., Kim, S., Shepherd, T., Lee, C. H., Walczak, P., Chodankar, S., DiGiacomo, P., David, G., Augath, M., Zerbi, V., Sommer, S., Rajkovic, I., Weiss, T., Bunk, O., Yang, L., Zhang, J., Novikov, D. S., Zeineh, M., Fieremans, E. & Rudin, M. (2021). Nat. Commun. 12, 2941.
Gürsoy, D., De Carlo, F., Xiao, X. & Jacobsen, C. (2014). J. Synchrotron Rad. 21, 1188–1193.
Inouye, H., Zhang, Y., Yang, L., Venugopalan, N., Fischetti, R. F., Gleber, S. C., Vogt, S., Fowle, W., Makowski, B., Tucker, M., Ciesielski, P., Donohoe, B., Matthews, J., Himmel, M. E. & Makowski, L. (2014). Sci. Rep. 4, 3756.
Jakob, H. F., Tschegg, S. E. & Fratzl, P. (1996). Macromolecules, 29, 8435–8440.
Jakob, H. F. P., Fratzl, P. & Tschegg, S. E. (1994). J. Struct. Biol. 113, 13–22.
Jensen, T. H., Bech, M., Bunk, O., Menzel, A., Bouchet, A., Le Duc, G., Feidenhans'l, R. & Pfeiffer, F. (2011). NeuroImage, 57, 124–129.
Kadan, R. S. & Pepperman, A. B. (2002). Cereal Chem. 79, 476–480.
Kennedy, C. J., Cameron, G. J., Šturcová, A., Apperley, D. C., Altaner, C., Wess, T. J. & Jarvis, M. C. (2007). Cellulose, 14, 235.
Lichtenegger, H., Müller, M., Paris, O., Riekel, C. & Fratzl, P. (1999). J. Appl. Cryst. 32, 1127–1133.
Liebi, M., Georgiadis, M., Menzel, A., Schneider, P., Kohlbrecher, J., Bunk, O. & Guizar-Sicairos, M. (2015). Nature, 527, 349–352.
Lindner, B., Petridis, L., Langan, P. & Smith, J. C. (2015). Biopolymers, 103, 67–73.
Maffettone, P. M., Daly, A. C. & Olds, D. (2021). Appl. Phys. Rev. 8, 041410.
Nishiyama, Y., Langan, P., O'Neill, H., Pingali, S. V. & Harton, S. (2014). Cellulose, 21, 1015–1024.
Nkeuwa, W. N., Zhang, J., Semple, K. E., Chen, M., Xia, Y. & Dai, C. (2022). Composites Part B, 235, 109776.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. & Duchesnay, E. (2011). J. Mach. Learn. Res. 12, 2825–2830.
Penttilä, P. A., Rautkari, L., Österberg, M. & Schweins, R. (2019). J. Appl. Cryst. 52, 369–377.
Perez, C. M., Palmiano, E. P., Baun, L. C. & Juliano, B. O. (1971). Plant Physiol. 47, 404–408.
Rüggeberg, M., Saxe, F., Metzger, T. H., Sundberg, B., Fratzl, P. & Burgert, I. (2013). J. Struct. Biol. 183, 419–428.
Samayam, I. P., Hanson, B. L., Langan, P. & Schall, C. A. (2011). Biomacromolecules, 12, 3091–3098.
Sato, K. (1984). Jarq-Jpn Agric. Res. Q, 18, 79–86.
Schaff, F., Bech, M., Zaslansky, P., Jud, C., Liebi, M., Guizar-Sicairos, M. & Pfeiffer, F. (2015). Nature, 527, 353–356.
Schroer, C. G. M., Kuhlmann, M., Roth, R., Gehrke, N., Stribeck, A., Almendarez-Camarillo, A. & Lengeler, B. (2006). Appl. Phys. Lett. 88, 164102.
Stribeck, N., Camarillo, A. A., Nöchel, U., Schroer, C., Kuhlmann, M., Roth, S. V., Gehrke, R. & Bayer, R. K. (2006). Macro Chem. & Phys. 207, 1139–1149.
Thygesen, A., Oddershede, J., Lilholt, H., Thomsen, A. B. & Ståhl, K. (2005). Cellulose, 12, 563–576.
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, I., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., Vijaykumar, A., Bardelli, A. P., Rothberg, A., Hilboll, A., Kloeckner, A., Scopatz, A., Lee, A., Rokem, A., Woods, C. N., Fulton, C., Masson, C., Häggström, C., Fitzgerald, C., Nicholson, D. A., Hagen, D. R., Pasechnik, D. V., Olivetti, E., Martin, E., Wieser, E., Silva, F., Lenders, F., Wilhelm, F., Young, G., Price, G. A., Ingold, G., Allen, G. E., Lee, G. R., Audren, H., Probst, I., Dietrich, J. P., Silterra, J., Webber, J. T., Slavič, J., Nothman, J., Buchner, J., Kulick, J., Schönberger, J. L., de Miranda Cardoso, J. V., Reimer, J., Harrington, J., Rodríguez, J. L. C., Nunez-Iglesias, J., Kuczynski, J., Tritz, K., Thoma, M., Newville, M., Kümmerer, M., Bolingbroke, M., Tartre, M., Pak, M., Smith, N. J., Nowaczyk, N., Shebanov, N., Pavlyk, O., Brodtkorb, P. A., Lee, P., McGibbon, R. T., Feldbauer, R., Lewis, S., Tygier, S., Sievert, S., Vigna, S., Peterson, S., More, S., Pudlik, T., Oshima, T., Pingel, T. J., Robitaille, T. P., Spura, T., Jones, T. R., Cera, T., Leslie, T., Zito, T., Krauss, T., Upadhyay, U., Halchenko, Y. O. & Vázquez-Baeza, Y. (2020). Nat. Methods, 17, 261–272.
Wang, S., Ding, Y., Lin, S., Ji, X. & Zhan, H. (2016). J. Wood Sci. 62, 1–11.
Yang, L., Antonelli, S., Chodankar, S., Byrnes, J., Lazo, E. & Qian, K. (2020). J. Synchrotron Rad. 27, 804–812.
Yang, L., Liu, J., Chodankar, S., Antonelli, S. & DiFabio, J. (2022). J. Synchrotron Rad. 29, 540–548.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.