research papers
Structure determination of organic compounds by a fit to the pair distribution function from scratch without prior indexing
^{a}Institut für Anorganische und Analytische Chemie, Goethe-Universität, Max-von-Laue-Strasse 7, Frankfurt am Main, 60437, Germany
^{*}Correspondence email: prill@chemie.uni-frankfurt.de
A method for the ab initio structure determination of organic compounds by a fit to the pair distribution function (PDF), without prior knowledge of lattice parameters and space group, has been developed. The method is called `PDFGlobalFit' and is implemented by extension of the program FIDEL (fit with deviating lattice parameters). The structure solution is based on a global optimization approach starting from random structural models in selected space groups. No prior indexing of the powder data is needed. The new method requires only the molecular geometry and a carefully determined PDF. The generated random structures are compared with the experimental PDF and ranked by a similarity measure based on cross-correlation functions. The most promising structure candidates are fitted to the experimental PDF data using a restricted simulated annealing structure solution approach within the program TOPAS, followed by a structure refinement against the PDF to identify the correct crystal structure. With the PDFGlobalFit it is possible to determine the local structure of crystalline and disordered organic materials, as well as the local structure of samples with unindexable powder patterns, such as nanocrystalline samples, by a fit to the PDF. The success of the method is demonstrated using barbituric acid as an example. The crystal structure of barbituric acid form IV solved and refined by the PDFGlobalFit is in excellent agreement with the published data.
Keywords: pair distribution function analysis; structure determination; total scattering technique; similarity measures; PDFGlobalFit.
1. Introduction: PDF on the rise
Crystal structure determination is an important step in the investigation of molecular solids due to the correlation of the molecular arrangement within the crystal and solid-state properties, such as physicochemical stability, solubility, bioavailability, and optical and magnetic properties. Knowledge of the crystal structure is crucial to explain or predict these physical and chemical properties (Hata et al., 2020), as well as to optimize them in terms of crystal engineering (Desiraju, 2003; Schmidt et al., 2007). The average crystal structure can be determined by single-crystal analysis or from powder diffraction data (SDPD) (David et al., 2002).
Recently, there has been growing interest in the knowledge of the local structure. The local structure may deviate from the average crystal structure (Aksel et al., 2013), especially for complex materials such as pharmaceuticals (Moore et al., 2009; Terban et al., 2020), metal–organic frameworks (Mazaj et al., 2016), organic pigments (Hunger & Schmidt, 2018; Schlesinger et al., 2020), catalysts or magnetic materials, such as semiconductors (Frandsen et al., 2016). Disorder, defects or surface effects result in a local structure which differs from the average structure found by classical methods (Proffen et al., 2003; Young & Goodwin, 2011). Disorder, for example, can strongly influence the solid-state properties [see e.g. Gorelik et al. (2016) and Lindahl Christiansen et al. (2020)]. Therefore, the determination of the local structure of crystalline materials is important for the investigation and development of new materials.
Moreover, the local structure becomes fundamental if no average crystal structure can be determined as, for instance, in poorly crystalline, nanocrystalline solids, as well as for glasses and liquids. In these cases, classical methods such as single-crystal analysis and SDPD fail (Fernandes et al., 2007; Dinnebier & Billinge, 2008; Schlesinger et al., 2019). Due to their low crystallinity and small domain sizes a reliable indexing of the powder data is not possible. Alternatively, a structure solution from scratch by the global optimization approach of the commercially available software FIDEL can be performed, where large sets of trial structures are fitted to the powder pattern without the need for prior indexing (Habermehl, Schlesinger & Schmidt, 2021). However, although it explores the limits of structure fitting to low-quality powder patterns, this approach requires a certain minimum of crystallinity and long-range order to be successful. This general limitation applies to any variable-cell direct-space method for SDPD that could be performed, e.g. the VARICELLA approach (Rapallo, 2009). If indexing fails, potential lattice parameters and possible space group(s) can be obtained, e.g. by a time-consuming crystal structure prediction (Bardwell et al., 2011; Neumann et al., 2015), although the comparison of the simulated powder patterns of the predicted crystal structures with the experimental powder pattern can only lead to the average crystal structure (Mörschel & Schmidt, 2015).
A reliable method to investigate the local structure, i.e. short-range ordering, is the pair distribution function (PDF), which can be seen as the probability G(r) of finding pairs of atoms separated by a distance r (Neder & Proffen, 2008; Young & Goodwin, 2011; Egami & Billinge, 2012). The PDF describes the deviation of the microscopic pair density ρ(r) from the average density ρ_{0} [equation (1)], summed over all atom–atom pairs and weighted with the scattering power of the atoms. PDF analysis is a total scattering technique, i.e. it uses not only the Bragg peaks but the total powder pattern including the diffuse scattering. G(r) is calculated from carefully measured and background-corrected diffraction data by Fourier transformation of the corrected and normalized coherent scattered intensity S(Q) of the sample [equation (1)], Q [equation (2)] being the magnitude of the scattering vector, with θ being half the scattering angle 2θ and λ the wavelength of the radiation used (Egami & Billinge, 2012):
$$G(r) = 4\pi r\,[\rho(r) - \rho_{0}] = \frac{2}{\pi}\int_{0}^{\infty} Q\,[S(Q) - 1]\,\sin(Qr)\,\mathrm{d}Q \qquad (1)$$
$$Q = \frac{4\pi\sin\theta}{\lambda} \qquad (2)$$
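The relation between S(Q) and G(r) in equations (1) and (2) can be illustrated numerically. The following is a minimal sketch, not the data-reduction procedure used in this work: the function names, the trapezoidal integration and the synthetic single-distance structure function are assumptions chosen for illustration, and the integral is truncated at a finite Q_{max} as in any real measurement.

```python
import math

def q_from_two_theta(two_theta_deg, wavelength):
    """Q = 4*pi*sin(theta)/lambda, with theta being half of the scattering angle 2theta."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength

def pdf_from_structure_function(q, s, r_values):
    """G(r) = (2/pi) * integral of Q[S(Q)-1]sin(Qr) dQ, evaluated by the trapezoidal rule."""
    g = []
    for r in r_values:
        f = [qi * (si - 1.0) * math.sin(qi * r) for qi, si in zip(q, s)]
        area = sum(0.5 * (f[k] + f[k + 1]) * (q[k + 1] - q[k])
                   for k in range(len(q) - 1))
        g.append(2.0 / math.pi * area)
    return g

# Synthetic example (hypothetical data): Q[S(Q)-1] = sin(Q*r0) corresponds to
# a single interatomic distance r0, so G(r) should peak at r = r0.
r0 = 1.4                                         # Angstrom
q = [0.01 * k for k in range(1, 2501)]           # 0.01 ... 25.0 A^-1
s = [1.0 + math.sin(qi * r0) / qi for qi in q]
r_values = [0.5 + 0.05 * k for k in range(91)]   # 0.5 ... 5.0 A
g = pdf_from_structure_function(q, s, r_values)
r_peak = r_values[g.index(max(g))]
```

The finite upper integration limit broadens the peak and produces termination ripples around it, which is why the choice of Q_{max} matters for experimental PDFs.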
The classical application of PDF analysis entails qualitative and quantitative phase analysis (Zea-Garcia et al., 2019), including the determination of the domain size of nanoparticles (Neder & Korsunskiy, 2005) or the amorphous content of the sample (Peterson et al., 2013). The PDF is frequently used to study the local structure of inorganic materials, liquids and glasses (Juhás et al., 2010; Young & Goodwin, 2011; Ojovan & Louzguine-Luzgin, 2020). While the PDF analysis of inorganic compounds has been steadily developed, the PDF analysis of organic compounds has lagged behind. The reasons for this are manifold and include the low scattering power of the mainly carbon and hydrogen atoms, as well as the different PDF peak widths caused by intermolecular versus intramolecular atom pairs (Rademacher et al., 2012; Prill et al., 2015). However, the number of organic materials investigated by PDF analysis is rapidly rising due to the growing interest in their local structure (Bates et al., 2006; Davis et al., 2013; Billinge, 2015; Terban et al., 2016, 2020; Rantanen et al., 2018). Several advances in local structure investigation by a fit to the PDF have been published. However, these methods, regardless of whether an organic or inorganic sample is investigated, require either a rather well matching structural model (Farrow et al., 2007; Neder & Proffen, 2008; Yang et al., 2020) or at least the knowledge of the lattice parameters and space group (Prill et al., 2016) in order to succeed in a reasonable fit. Remarkable work was recently published describing the determination of the space group from the PDF data (Liu et al., 2019). Nevertheless, the identification of the lattice parameters is challenging for nanocrystalline compounds and often ends without an outcome. Hence, a new method is required to determine the local structure without prior indexing. Such a new method, the PDFGlobalFit, is presented here.
Its aim is to solve the local structure of organic compounds from scratch by a fit to PDF data, without prior knowledge of lattice parameters and space group.
2. Method development: structure determination by a fit to the PDF
The general procedure of the PDFGlobalFit is shown in Fig. 1. Only two files are needed as input, i.e. a carefully determined experimental PDF and a molecular geometry. An initial molecule model can be taken from an already solved crystal structure of a known polymorph or similar compound, or alternatively derived by a geometry optimization using quantum-mechanical (QM) or force-field methods. Since the PDFGlobalFit is designed to solve the structure of nanocrystalline substances of hitherto-unknown crystal structures, the QM geometry-optimized molecular model has been used as a start for the development.
The structure solution is based on trial structures generated with the FIDEL software (Habermehl et al., 2014; Habermehl, Schlesinger & Schmidt, 2021). For this purpose, a reliable search-space setup is needed: a selection of investigated space groups and possibly special positions of the molecule, reasonable ranges for the lattice parameters and the cell volume, and if required the selection of internal degrees of freedom have to be defined in the preparation. According to the search-space setup the trial structures are generated with random values for the lattice parameters a, b, c, α, β, γ, the fractional molecular position m_{x}, m_{y}, m_{z} and the molecular orientation φ_{x}, φ_{y}, φ_{z}, as well as possibly random values for selected intramolecular degrees of freedom. All randomly created structural models that are outside the user-defined unit-cell-volume range are discarded. Moreover, only random structures that do not exhibit any kind of molecular overlap are considered.
The PDFGlobalFit consists of five steps. The generated trial structures (step 1) are subjected to two subsequent structure solution steps: a comparison of the simulated PDF of the structural model with the experimental PDF (step 2) is followed by a fit of the structural model to the experimental PDF (step 3).
In step 2 the simulated PDF is compared with the experimental one by calculation of the similarity measure introduced by Habermehl, Schlesinger & Prill (2021), which is based on the generalized similarity measure using cross-correlation functions according to de Gelder et al. (2001). The random structures are ranked by their similarity. All structures that do not reach a given minimum similarity (e.g. 0.8) are discarded.
In step 3, the remaining structure solution candidates are fitted to the experimental PDF curve using the program TOPAS Academic 6 (Coelho et al., 2015; Coelho, 2018), which is called by FIDEL. This structure fitting is a restricted simulated annealing (SA) structure solution approach provided by TOPAS (Coelho, 2000; Coelho et al., 2015).
At the end of step 3, the optimized structure candidates from the SA fit are ranked by their R_{wp} value and only those structural models with an R_{wp} value below a given maximum (e.g. 35%) are considered further. The complete structure solution process is automated by FIDEL.
In step 4, the remaining structural models are subjected to an automated structure refinement against the experimental PDF using TOPAS.
In step 5 a user-controlled refinement of the best structure candidate, or in case of ambiguity several promising candidates, to the PDF data with TOPAS is performed.
The TOPAS input files for structure solution and refinement were based on the technical references and examples provided with the TOPAS Academic 6 software (Coelho et al., 2015; Coelho, 2016).
2.1. Search-space setup and generation of the random structures (step 1)
The choice of investigated space groups is usually based on the statistics of space-group frequencies according to the molecular symmetry (Pidcock et al., 2003; Pidcock & Motherwell, 2004). This means that the most frequent combinations of space group and molecular symmetry are considered. Hence, molecules of C_{1} or any higher point-group symmetry have to be investigated in selected space groups with the molecule on a general position and Z′ = 1. If the molecule belongs to a higher-symmetry point group, in particular if it has an inversion centre, the selection based on frequency statistics will also include the investigation of certain space groups in combination with Z′ < 1 and the molecule on a special position. The space groups in which possibly isomorphic or chemically similar compounds crystallize should also be considered. The selection and number of space groups is the user's decision, considering the available computational resources. If the initial selection does not lead to satisfactory results, additional calculations should be performed in less frequent space groups and/or with Z′ > 1 (e.g. in P1 with Z′ = 2, which also covers space groups of higher symmetry). For each selected combination of the space group and the general or special position of the molecule, a large set of trial structures is generated, with random values for (at most) the following parameters: the lattice parameters a, b, c, α, β, γ, molecular position m_{x}, m_{y}, m_{z}, molecular orientation φ_{x}, φ_{y}, φ_{z} and selected intramolecular degrees of freedom.
The lattice parameter ranges are set according to the size of the molecule (Pidcock & Motherwell, 2004). The minimal unit-cell lengths were set to 3 Å, corresponding to the typical π–π stacking distance. The maximal unit-cell lengths were set on the basis of the longest intramolecular atom–atom distance in the molecular model, taking into account the van der Waals radii and an additional increment of 0.3 Å. The upper boundaries of the cell lengths were derived by multiplying the maximal value for one molecule by the number of molecules in each unit-cell direction according to the space-group symmetry. For molecules that exhibit many different conformations, which cannot be easily predicted, the largest possible intramolecular atom–atom distance should be taken as the longest intramolecular distance. The unit cell is then large enough for every conformation that could occur.
The cell volume is restricted to a certain range to avoid both intermolecular contacts which are too close and unreasonable voids. Sensible ranges for cell volumes are derived using increment systems, e.g. Hofmann's volume increments (Hofmann, 2002), and/or known crystal structures of similar substances, chemical derivatives, other polymorphic forms or solvates, e.g. extracted from a suitable database such as the Cambridge Structural Database (Allen & Motherwell, 2002).
The position and orientation of the molecules in the random structural models are basically unrestricted. However, these parameters are chosen from ranges according to the space-group symmetry (e.g. inside the asymmetric unit) in order to avoid an excess of redundant or impossible representations. Furthermore, no trial structure that exhibits unreasonable interatomic distances, i.e. molecular overlap, is considered.
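The generation of random trial structures with a cell-volume filter can be sketched as follows. This is a simplified illustration and not the FIDEL implementation: it assumes a monoclinic cell, uses hypothetical function names and illustrative parameter ranges, draws the position and orientation without symmetry-adapted ranges, and omits the molecular-overlap check.

```python
import math
import random

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in A^3 from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    radicand = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    return a * b * c * math.sqrt(radicand) if radicand > 0.0 else 0.0

def random_monoclinic_trials(n, length_ranges, beta_range, volume_range,
                             rng, max_tries=1_000_000):
    """Draw random monoclinic trial structures and keep those inside the volume range."""
    trials = []
    for _ in range(max_tries):
        if len(trials) == n:
            break
        a, b, c = (rng.uniform(lo, hi) for lo, hi in length_ranges)
        beta = rng.uniform(*beta_range)
        volume = cell_volume(a, b, c, 90.0, beta, 90.0)
        if not volume_range[0] <= volume <= volume_range[1]:
            continue  # outside the user-defined unit-cell-volume range: discard
        trials.append({
            "cell": (a, b, c, 90.0, beta, 90.0),
            "volume": volume,
            "position": tuple(rng.random() for _ in range(3)),               # m_x, m_y, m_z
            "orientation": tuple(rng.uniform(0.0, 360.0) for _ in range(3)),  # phi_x, phi_y, phi_z
        })
    return trials

# Illustrative ranges only (assumed for this sketch).
rng = random.Random(0)
trials = random_monoclinic_trials(
    n=5,
    length_ranges=[(3.0, 34.0)] * 3,
    beta_range=(90.0, 120.0),
    volume_range=(454.1, 561.0),
    rng=rng,
)
```

Rejection sampling is wasteful when the volume window is narrow compared with the ranges of the cell lengths, but it keeps the individual parameters uniformly distributed within their ranges.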
2.2. Simulation and comparison of PDF curves from structural models (step 2)
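A PDF curve of a given structural model is simulated on the basis of equation (3), including the interatomic distances r_{ij}, the scattering powers f_{i}, f_{j} of the atoms i, j, ⟨f⟩ as the average scattering power of the sample, the Dirac delta function δ, the number of atoms N and the average number density ρ_{0} (Egami & Billinge, 2012):
$$G(r) = \frac{1}{Nr}\sum_{i}\sum_{j\neq i}\left[\frac{f_{i}f_{j}}{\langle f\rangle^{2}}\,\delta(r - r_{ij})\right] - 4\pi r\rho_{0} \qquad (3)$$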
The simulation of the PDF can be performed either using TOPAS (Coelho, 2018) automatically invoked by FIDEL or using the libdiffpy library of DiffPy-CMI (Juhás et al., 2015) implemented as part of FIDEL. They both use constant scattering powers evaluated at the Q value of zero for f_{i}, f_{j} in equation (3), corresponding to the number of electrons for a neutral atom. Alternatively, the calculation of the PDF from a structural model could take into account the Q dependence of the atomic form factors (Neder & Proffen, 2020). We used TOPAS for PDF simulation, since it was used in the subsequent steps of the overall procedure as well. For the simulation, two different isotropic displacement parameters are used, one for intramolecular distances and one for intermolecular ones (Rademacher et al., 2012; Prill et al., 2015). The simulated PDF and the experimental PDF are compared and ranked according to their calculated similarity measure S_{12} [equation (4)] as implemented in FIDEL (Habermehl, Schlesinger & Prill, 2021). S_{12} is based on C_{12} [equation (5)], the integral of the weighted cross-correlation function c_{12}(s) [equation (6)] of the two curves, and normalized by the respective integrals of the weighted autocorrelation functions C_{11} and C_{22}:
$$S_{12} = \frac{C_{12}}{\sqrt{C_{11}\,C_{22}}} \qquad (4)$$
$$C_{12} = \int_{-l}^{l} w(s)\,c_{12}(s)\,\mathrm{d}s \qquad (5)$$
The cross-correlation function of two PDFs, G_{1}^{LT}(r) and G_{2}^{LT}(r), correlates each data point of one curve to the data points at the distance s in the other curve [equation (6)]. The acronym LT denotes that the PDF curves are subjected to a linear transformation which shifts G(r) to positive values while keeping a common baseline. By weighting the cross-correlation function with the triangular function w(s), the correlation of data points is restricted to a defined neighbouring range of l [equation (7)] before integration over all data-point distances within the given range yields C_{12} (Habermehl, Schlesinger & Prill, 2021):
$$c_{12}(s) = \int G_{1}^{\mathrm{LT}}(r)\,G_{2}^{\mathrm{LT}}(r + s)\,\mathrm{d}r \qquad (6)$$
$$w(s) = \begin{cases} 1 - |s|/l & \text{if } |s| < l \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$
From equation (4) an S_{12} value of 1 implies identity of the two PDF curves. The similarity measure is a powerful tool for the comparison of two roughly matching PDF curves, especially if their signal positions strongly deviate. A comparison based on pointwise differences would in many cases fail to indicate a considerable concordance of the two PDFs, whereas the similarity measure quantifies their congruence sufficiently well (Habermehl, Schlesinger & Prill, 2021).
The similarity measure is calculated for all structures. The structures are ranked according to their S_{12} values, and all structures that have a value below a threshold value are discarded. The threshold is a user-defined value, which is expected to vary slightly depending on the investigated problem, in particular with respect to the experimental data.
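A cross-correlation-based similarity measure of this kind, following the general scheme of de Gelder et al. (2001), can be sketched in a few lines. The code below is an illustration and not the FIDEL implementation: the function names are hypothetical, the curves are assumed to lie on a common equidistant r grid, and the linear transformation is reduced to a shift by the common minimum of both curves, which is an assumption of this sketch.

```python
import math

def triangle_weight(s, l_range):
    """Triangular weighting function: 1 - |s|/l inside the neighbourhood range, else 0."""
    return max(0.0, 1.0 - abs(s) / l_range)

def weighted_correlation(g1, g2, dr, l_range):
    """Integral over shifts s of w(s)*c12(s), with c12(s) the cross-correlation of g1 and g2."""
    n = len(g1)
    max_shift = int(l_range / dr)
    total = 0.0
    for k in range(-max_shift, max_shift + 1):
        w = triangle_weight(k * dr, l_range)
        if w == 0.0:
            continue
        # Sum over all grid points i for which i + k is still inside the curve.
        c = sum(g1[i] * g2[i + k] for i in range(max(0, -k), min(n, n - k)))
        total += w * c * dr * dr
    return total

def similarity(g1, g2, dr, l_range):
    """Normalized similarity of two curves; 1 means identical curves."""
    base = min(min(g1), min(g2))      # common baseline shift (assumed form of the LT)
    a = [x - base for x in g1]
    b = [x - base for x in g2]
    c12 = weighted_correlation(a, b, dr, l_range)
    c11 = weighted_correlation(a, a, dr, l_range)
    c22 = weighted_correlation(b, b, dr, l_range)
    return c12 / math.sqrt(c11 * c22)

# Synthetic curves (hypothetical data): two Gaussian signals, rigidly shifted.
dr = 0.01
r = [dr * k for k in range(501)]

def curve(shift):
    return [math.exp(-(x - 1.4 - shift) ** 2 / 0.005) +
            math.exp(-(x - 2.5 - shift) ** 2 / 0.005) for x in r]

s_same = similarity(curve(0.0), curve(0.0), dr, 0.53)
s_near = similarity(curve(0.0), curve(0.1), dr, 0.53)
s_far = similarity(curve(0.0), curve(0.4), dr, 0.53)
```

In this toy case the signals shifted by 0.4 Å no longer overlap pointwise with the reference, yet the measure still returns a value clearly above zero, and it decreases smoothly as the shift grows. This tolerance to displaced signals is what makes such a measure suitable for ranking trial structures with deviating lattice parameters.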
2.3. Fit to the experimental PDF by simulated annealing (step 3)
Trial structures that qualified as structure solution candidates by reaching at least a given similarity threshold value are subjected to a fit to the experimental PDF using the SA method of TOPAS (Coelho et al., 2015) controlled by FIDEL.
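The agreement of a structural model with the experimental PDF is commonly quantified by a weighted agreement factor R_{wp} [equation (8)] derived from the pointwise differences between the observed PDF G_{obs}(r_{i}) and the calculated PDF G_{calc}(r_{i}), with the corresponding weight w(r_{i}) = 1/σ^{2}(r_{i}) calculated from the error σ of G_{obs} at each data point i (Egami & Billinge, 2012):
$$R_{\mathrm{wp}} = \left\{\frac{\sum_{i} w(r_{i})\,[G_{\mathrm{obs}}(r_{i}) - G_{\mathrm{calc}}(r_{i})]^{2}}{\sum_{i} w(r_{i})\,G_{\mathrm{obs}}^{2}(r_{i})}\right\}^{1/2} \qquad (8)$$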
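The weighted agreement factor of equation (8) is straightforward to evaluate. The following minimal sketch uses a hypothetical function name and assumes unit weights when no error estimates are supplied; it is not the TOPAS implementation.

```python
import math

def weighted_r_value(g_obs, g_calc, sigma=None):
    """Weighted agreement factor between observed and calculated PDF (as a fraction)."""
    if sigma is None:
        sigma = [1.0] * len(g_obs)            # assumption: unit weights
    weights = [1.0 / (s * s) for s in sigma]  # w(r_i) = 1/sigma^2(r_i)
    numerator = sum(w * (o - c) ** 2 for w, o, c in zip(weights, g_obs, g_calc))
    denominator = sum(w * o * o for w, o in zip(weights, g_obs))
    return math.sqrt(numerator / denominator)

# Hypothetical data: a calculated curve that is uniformly 10% too small.
g_obs = [1.0, -2.0, 3.0, -0.5]
g_calc = [0.9 * x for x in g_obs]
r_wp = weighted_r_value(g_obs, g_calc)
```

Because the numerator is built from pointwise differences, a model whose signals are displaced relative to the observed ones is penalized heavily even if the curve shapes agree, which motivates the use of the similarity measure in the earlier ranking step.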
Approaches based on pointwise differences serve well for the comparison and fitting of structural models to the experimental PDF if the model is already close to the best match. However, a pointwise comparison tends to fail or become indecisive if the shifts in signal positions are too big, in particular in the case of lattice parameter deviations (Habermehl, Schlesinger & Prill, 2021). The SA method of TOPAS can very efficiently determine the molecular position and orientation if roughly correct lattice parameters are given. In our experience with organic substances the sum of lattice parameter deviations must not exceed 4–10%; otherwise the SA by a fit to the PDF fails. The robustness against deviating lattice parameters, on the other hand, is a strength of FIDEL's approach based on the similarity measure S_{12}. Hence, the hierarchical search strategy of the global optimization by FIDEL (Habermehl, Schlesinger & Schmidt, 2021) has been combined with the SA procedure of TOPAS in order to ally the strengths of the two approaches.
The SA fit is performed using basically the same representation of the structure candidates and fitted parameters as described for FIDEL, i.e. the lattice parameters and the position and orientation of the molecule. The molecular geometry is described by a z matrix, which may include distances, angles or dihedral angles corresponding to selected internal degrees of freedom. According to the SA method the molecular position m_{x}, m_{y}, m_{z} and the molecular orientation φ_{x}, φ_{y}, φ_{z} are randomized on the basis of the start structure. The initial candidate is a trial structure that had been compared with but not fitted to the experimental PDF before. Hence, during the SA the lattice parameters were allowed to vary within comparably narrow ranges, e.g. 5% of the lattice parameters of the initial structure.
The TOPAS SA fit is performed by a robust, automated four-step optimization approach. The of the PDF, as well as the scaling factor, are optimized in each step. At first the inter- and intramolecular displacement parameters, the envelope, the molecular position m_{x}, m_{y}, m_{z}, and the molecular orientation φ_{x}, φ_{y}, φ_{z} are fitted on the basis of the structure candidate. In the second SA step these optimized values are kept fixed during a subsequent fitting of the lattice parameters. In the third SA step a simultaneous fit of the lattice parameters, m_{x}, m_{y}, m_{z} and φ_{x}, φ_{y}, φ_{z} is performed. In the last SA step, all mentioned variables are fitted simultaneously to the experimental PDF data. The optimized structures are ranked according to their weighted-pattern R value (R_{wp}) calculated by TOPAS as a figure of merit of the fit of the structure candidate to the experimental PDF.
2.4. Structure refinements (steps 4 and 5)
In step 4 the fitted structure candidates from step 3 that yielded R_{wp} values below a predefined threshold value are refined against the experimental PDF using TOPAS Academic 6 (Coelho, 2018). The molecular geometry is described by internal coordinates using the z-matrix formalism, optionally including selected internal degrees of freedom. At first, the lattice parameters, scale factor, damping of the PDF curve, and one inter- and one intramolecular isotropic displacement parameter were refined simultaneously. Subsequently, the position and orientation of the molecule were refined. Alternatively, the molecular geometry can be refined using fractional atomic coordinates with restraints for bond angles, bond lengths and planar groups.
The results of the automated refinement are evaluated by the user with respect to the R_{wp} values, the difference curves of the calculated and observed PDF, the molecular packing or hydrogen-bond pattern, and other criteria. On the basis of this thorough evaluation, one or, in the case of ambiguities, several structures are selected for the final user-controlled refinement (step 5).
3. PDFGlobalFit: barbituric acid as an example
For the development and validation of the method a rigid organic compound with a known crystal structure was considered reasonable. Hence, barbituric acid (C_{4}H_{4}N_{2}O_{3}, Fig. 2) was chosen, which is a commercially available, very well known, rigid, organic molecule that contains a small number of atoms. Barbituric acid exhibits keto–enol tautomerism and forms different polymorphs with different tautomers. At ambient conditions, the thermodynamically stable form is polymorph IV, which contains the enol tautomer shown in Fig. 2. The crystal structure of this polymorph of barbituric acid was solved by X-ray and neutron powder diffraction (Schmidt et al., 2011), and later confirmed by X-ray single-crystal diffraction (Marshall et al., 2016). It crystallizes in P2_{1}/n with Z = 4 and unit-cell parameters of a = 11.87614 (6), b = 8.91533 (4), c = 4.83457 (3) Å and β = 95.0854 (4)° (Schmidt et al., 2011). For comparability this unit cell was transformed to the standard unit-cell setting of P2_{1}/c with a = 4.83457, b = 8.91533, c = 12.4192 Å and β = 107.729°. The crystal structures resulting from the structure determination by a fit to the PDF will be compared with this known crystal structure of barbituric acid in P2_{1}/c.
3.1. Experimental details
Barbituric acid was purchased from Sigma-Aldrich (99% purity) and used without further purification. The sample was milled in a mortar and subsequently placed in a polyimide capillary (1 mm in diameter) which was sealed with clay at both ends. The X-ray powder pattern of the sample was measured at 300 K at the X17A beamline of the National Synchrotron Light Source at Brookhaven National Laboratory. A monochromatic incident X-ray beam conditioned using an Si(311) monochromator to have an energy of 67.42 keV (λ = 0.1839 Å) was used. The 2D PerkinElmer amorphous silicon detector was mounted orthogonally to the beam path with a sample-to-detector distance of 204.2 mm, as calibrated with an LaB_{6} standard sample. Multiple scans were performed to achieve a total exposure time of 30 min. The 2D diffraction data were integrated and converted to intensity versus 2θ using the software FIT2D (Hammersley, 2016). The data were corrected and normalized and then truncated at a finite maximum value of the momentum transfer Q_{max}, which was optimized to avoid large termination effects whilst maximizing the signal-to-noise ratio, using the program PDFgetX3 (Juhás et al., 2013) to obtain the total scattering structure function, F(Q), and G(r). The value Q_{max} = 21.9 Å^{−1} was found to be optimal for barbituric acid.
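The quoted wavelength and the scattering angle that corresponds to Q_{max} follow from elementary relations. The sketch below uses hypothetical function names and the standard conversion λ [Å] = hc/E with hc ≈ 12.3984 keV Å; it is included only as a consistency check of the stated values.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # h*c in keV*Angstrom

def wavelength_from_energy(energy_kev):
    """Photon wavelength in Angstrom for a given energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev

def two_theta_for_q(q, wavelength):
    """Scattering angle 2theta (degrees) at which Q = 4*pi*sin(theta)/lambda is reached."""
    return 2.0 * math.degrees(math.asin(q * wavelength / (4.0 * math.pi)))

wavelength = wavelength_from_energy(67.42)          # about 0.1839 A
two_theta_max = two_theta_for_q(21.9, wavelength)   # about 37.4 degrees
```

At this short wavelength the full Q range up to 21.9 Å^{−1} is covered within a scattering angle of less than 40°, which is why a flat area detector at a short sample-to-detector distance is suitable for such measurements.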
The molecular geometry of barbituric acid was calculated by geometry optimization at the B3LYP/6-31G** level using GAUSSIAN (Frisch et al., 2009). Although a high-quality single-crystal structure is available for barbituric acid, the molecular geometry was derived from QM geometry optimization in order to represent a general example for the proof-of-concept evaluation of the PDFGlobalFit procedure.
All calculations (PDF simulation, structure solution and refinement) were performed on a standard desktop PC running a 64-bit Windows system and equipped with an Intel Core i7-3770 processor and 32 GB RAM. The generation of the random structures and the comparison of the simulated and experimental PDFs (steps 1 and 2) take approximately 4 days. The structure solution and refinement (steps 3 and 4), which take approximately 3 weeks, are the most time-consuming steps in the procedure. This is a rather long time; however, the process itself is still in development and calculation steps will be optimized.
3.2. Search-space setup for the PDFGlobalFit
For the preparation step the PDF data and the z matrix of a QM geometry-optimized barbituric acid molecule were provided as input files.
3.2.1. Parameters for the search-space setup (step 1)
Barbituric acid exhibits the point group C_{s}. The most likely space groups for barbituric acid were selected according to space-group statistics for organic compounds (Pidcock et al., 2003; Pidcock & Motherwell, 2004). To save computational time, three space groups were chosen, which already cover over 75% of all crystal structures with molecules having the molecular symmetry C_{s}: P1, P2_{1}/c and P\bar{1}, each with Z′ = 1. Moreover, the chosen space groups cover various supergroups with higher symmetries. For example, calculations in P1, Z = 1 can also result in structures in Pm, Z = 1 and Cm, Z = 2, calculations in P2_{1}/c, Z = 4 include structures in Pnma, Z = 4 and P2_{1}/m, Z = 2 etc. The search-space setup is given in Table 1, including the ranges of lattice parameters and cell volumes allowed. The minimal unit-cell lengths were set to 3 Å, corresponding to the typical π–π stacking distance. The maximum limits for the unit-cell parameters were derived from the longest intramolecular distance in the geometry-optimized barbituric acid, which is 5.535 Å. After adding the van der Waals radii plus 0.3 Å to avoid close contacts, the maximum space for one barbituric acid molecule in one direction of the unit cell is 8.5 Å. The number of possible molecules in each direction depends on the space group and symmetry operators, e.g. in P2_{1}/c a molecule can be situated four times in the c direction; therefore the maximum value of c is 4 × 8.5 Å = 34 Å.

The estimated molecular volume of barbituric acid is 133.57 Å^{3} using Hofmann's increment system (Hofmann, 2002). In P1 and P\bar{1} the range for the cell volume was set to ±15% of the corresponding cell volume. In P2_{1}/c the minimum cell volume was set to −15%. It is known that due to packing effects the cell volume is overestimated for aromatic planar compounds in higher-symmetry space groups. Hence the maximum cell volume was set to +5% of the Hofmann volume.
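The volume estimate and the resulting cell-volume windows can be reproduced with a short calculation. In this sketch the element increments are the values commonly quoted from Hofmann (2002) and should be treated as assumptions to be checked against that reference; the function names and the assignment Z = 1, 2 and 4 for P1, P\bar{1} and P2_{1}/c with Z′ = 1 are likewise part of the illustration.

```python
# Average atomic volume increments in A^3 (assumed values after Hofmann, 2002).
HOFMANN_INCREMENTS = {"C": 13.87, "H": 5.08, "N": 11.8, "O": 11.39}

def molecular_volume(composition):
    """Estimated molecular volume in A^3 from an element -> count mapping."""
    return sum(HOFMANN_INCREMENTS[el] * n for el, n in composition.items())

def cell_volume_range(v_molecule, z, lower=-0.15, upper=0.15):
    """Allowed cell-volume window for Z molecules per cell and relative tolerances."""
    v_cell = z * v_molecule
    return v_cell * (1.0 + lower), v_cell * (1.0 + upper)

v_mol = molecular_volume({"C": 4, "H": 4, "N": 2, "O": 3})   # barbituric acid, C4H4N2O3

range_p1 = cell_volume_range(v_mol, z=1)                     # P1, Z = 1, +/-15%
range_p1bar = cell_volume_range(v_mol, z=2)                  # P-1, Z = 2, +/-15%
range_p21c = cell_volume_range(v_mol, z=4, upper=0.05)       # P2_1/c, Z = 4, -15%/+5%
```

With these increments the molecular volume comes out at 133.57 Å^{3}, and the P2_{1}/c window of roughly 454–561 Å^{3} contains the volume of about 510 Å^{3} that follows from the published unit-cell parameters.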
3.2.2. Simulation of the PDF curves from structural models (step 2)
To ensure comparability, the simulations of the PDF curves from the structural models were all performed under the same fixed conditions with respect to the instrumental envelope and the intra- and intermolecular atomic displacement parameters using the program TOPAS. The instrumental envelope was determined using a reference substance, resulting in a value of 48.0 Å^{−1}. The intramolecular displacement parameter B_{intra} of 0.16 Å^{2} was determined using a simulated PDF curve of a single molecule of barbituric acid. For small planar organic compounds, a ratio of B_{intra} to B_{inter} of 1 to 3.75 was found (Prill et al., 2015), resulting in an intermolecular displacement parameter B_{inter} of 0.6 Å^{2}. The simulated PDF curves were calculated and compared with the experimental one in a range of 1–20 Å.
3.2.3. Threshold criteria for the selection of structure candidates (steps 2 and 3)
During the structure solution process of the PDFGlobalFit a large set of random structural models within the search-space setup outlined before is incrementally reduced to smaller sets of qualified structure candidates. At two points in the search for a correct local structure representative, promising structural models were selected according to the settings of threshold criteria: the first point was after the comparison step (step 2) and the second point after the SA fit (step 3).
According to the first criterion the structural models that do not reach a minimal value of the similarity measure S_{12}, resulting from the comparison of the calculated and the experimental PDF curve, were sorted out. To define the threshold value, preliminary tests were performed on modified crystal structures of barbituric acid and on randomly created structures. Preliminary tests on modified structures of barbituric acid [root-mean-square Cartesian displacement (RMSCD) (van de Streek & Neumann, 2010) values smaller than 0.25 Å] resulted in S_{12} values of ≥ 0.985 (Habermehl, Schlesinger & Prill, 2021). Further tests on randomly created crystal structures of barbituric acid showed that the value of 0.985 leads to a reasonable number of structural models in the next step of the structure solution. Therefore, the requested similarity value of 0.985, using the neighbourhood range parameter l = 0.53 Å, was found to be adequate for the example presented here. Only structure candidates with S_{12} values higher than the threshold criterion were subjected to the SA fit to the experimental PDF data using the TOPAS software as described earlier. The second selection step (step 3) was then imposed by discarding all fitted structure candidates that exceed a maximal R_{wp} value of 35%.
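The two-stage selection can be expressed as a simple filter-and-rank cascade. The sketch below uses a hypothetical `Candidate` record and invented toy values; only the threshold values 0.985 and 35% are taken from the text, and whether the similarity comparison is inclusive at the threshold is an assumption of this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    label: str
    s12: float                    # similarity from the comparison step (step 2)
    r_wp: Optional[float] = None  # R value in percent, known only after the SA fit (step 3)

def select_by_similarity(candidates: List[Candidate], s12_min: float = 0.985) -> List[Candidate]:
    """Keep candidates reaching the similarity threshold, best (highest) first."""
    kept = [c for c in candidates if c.s12 >= s12_min]
    return sorted(kept, key=lambda c: c.s12, reverse=True)

def select_by_r_value(candidates: List[Candidate], r_wp_max: float = 35.0) -> List[Candidate]:
    """Keep fitted candidates not exceeding the maximal R value, best (lowest) first."""
    kept = [c for c in candidates if c.r_wp is not None and c.r_wp <= r_wp_max]
    return sorted(kept, key=lambda c: c.r_wp)

# Toy candidates (invented numbers, not results from this work).
pool = [
    Candidate("A", s12=0.991, r_wp=27.0),
    Candidate("B", s12=0.987, r_wp=41.0),
    Candidate("C", s12=0.980),
    Candidate("D", s12=0.986, r_wp=33.0),
]
after_step2 = select_by_similarity(pool)
after_step3 = select_by_r_value(after_step2)
```

Applying the cheap similarity filter first means that the expensive SA fit only has to be run for the small fraction of trial structures that pass it.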
3.3. Results
A set of 100 000 random structures in each investigated space group was generated in step 1. The numbers of structure candidates qualifying in the subsequent steps 2–5 differ greatly depending on the space group (Table 2). In P2_{1}/c, 439 structure candidates reached a similarity value above 0.985 after comparison step 2, whereas no comparably promising structure candidates are observed in P1. The three best qualified structure candidates in P1 exhibit S_{12} values of about 0.98. Accordingly, no qualified structure candidate was further considered in P1. In P1 (Z = 1), only layered structures with parallel molecules are possible. Apparently, this packing motif is unfavourable for the enol tautomer of barbituric acid.

After the comparison step (step 2) the similarity measures of the four top-ranked candidates in P\bar{1} were slightly higher than the best one in P2_{1}/c. The lattice parameters showed an insignificant trend to a small a axis (range of 3.3–7.3 Å). By visual inspection of the best structural models it was noted that a criss-cross packing motif is more frequent than other packings, such as layered structures. The best ten structural models for each of the space groups P2_{1}/c and P\bar{1} according to the similarity measure from the comparison (step 2) of the simulated PDF curve with the experimental one are shown in Table S1 in the supporting information.
Table 3 presents the results of the SA fit of the structure candidates to the experimental PDF data, ranked by the R_{wp} value (step 3). Comparison of the R_{wp} values shows that those of the structure candidates in P2_{1}/c are smaller than those of the models in P\bar{1}, although one model in P\bar{1} exhibits an R_{wp} value as low as the structural models in P2_{1}/c. As expected, the spread of the lattice parameters is significantly smaller after structure fitting (step 3) than in the previous step (step 2) of the PDFGlobalFit, and crucial trends became obvious.

The smallest R_{wp} value of 26.6% is significantly lower than all the others. This structure candidate, number 54845 in P2_{1}/c, is already in good agreement with the published structure of barbituric acid form IV after step 3: the correct lattice parameters and the correct molecular position are found, and most atomic positions match well, although small discrepancies in the molecular orientation remain [Fig. 3(a)]. The other structural models in P2_{1}/c also consistently show the correct crisscross packing motif, although the majority of structure candidates exhibit intermolecular contacts that are too close [Fig. 3(b)].
As shown in Tables 2 and 3, only 11 structure candidates satisfied the threshold criterion after the SA fit (step 3). For the subsequent automated refinement of these 11 remaining structure candidates against the experimental PDF (step 4), the r range for the comparison of simulated and experimental PDF curves was increased to 1–30 Å. The automated refinement was followed by a user-controlled one (step 5). After these two steps three structures (structures 1, 2 and 3) in P2_{1}/c exhibit an R_{wp} value as low as approximately 20% (Table 4). The lattice parameters are in perfect agreement with the lattice parameters of the crystal structure published by Schmidt et al. (2011). All structure representatives are chemically sensible, meaning that the structures exhibit no voids within the packing and have a sensible three-dimensional hydrogen-bond network. The correct molecular position was found in all three instances (Fig. 4). The RMSCD values (van de Streek & Neumann, 2010) relative to the published structure were calculated for all non-H atoms for these three structures. The corresponding values are 0.049 Å for 1, 0.045 Å for 2 and 0.064 Å for 3. One of the three models (3, yellow model in Fig. 4) shows a minor deviation of the molecular orientation relative to the published structure: the position of one H atom of the H-O bond is not exact. This corresponds to a molecular orientation switch of 180°. The positions of all the other atoms (nitrogen, oxygen and carbon) are correct as determined by the PDFGlobalFit. This is a result of the low scattering power of one H atom (∼0.008%) when compared with the other atoms. Moreover, the determination of hydrogen positions from X-ray diffraction data is challenging, and hence it is conventional to calculate the associated hydrogen positions by a QM or force-field method. Nevertheless, the correct hydrogen-bond network is represented. The R_{wp} value is as low as for the other two structure representatives and the difference curve of the calculated and observed PDF curves is smooth (Fig. 5).
Thus, structure candidate 3 can also be considered as the correct structure found by the PDFGlobalFit. Additionally, it cannot be ruled out that the hydrogen position is slightly disordered in the local structure and structure 3 could be an alternative representative for the local structure of barbituric acid.
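To make the RMSCD figures quoted above more concrete, the following Python sketch computes a root-mean-square Cartesian displacement between two sets of matched atoms. It is a simplified illustration under stated assumptions: both structures are described in the same unit cell, the atoms are already paired in the same order, and H atoms have been removed by the caller. It does not reproduce every detail of the definition by van de Streek & Neumann (2010), and the function names are hypothetical.

```python
import numpy as np


def lattice_matrix(a, b, c, alpha, beta, gamma):
    """Return a 3x3 matrix whose rows are the lattice vectors in Cartesian
    coordinates (lengths in angstrom, angles in degrees)."""
    al, be, ga = np.radians([alpha, beta, gamma])
    cx = c * np.cos(be)
    cy = c * (np.cos(al) - np.cos(be) * np.cos(ga)) / np.sin(ga)
    cz = np.sqrt(c**2 - cx**2 - cy**2)
    return np.array([
        [a, 0.0, 0.0],
        [b * np.cos(ga), b * np.sin(ga), 0.0],
        [cx, cy, cz],
    ])


def rms_cartesian_displacement(frac_a, frac_b, lattice):
    """RMS distance between paired atoms given in fractional coordinates."""
    delta = np.asarray(frac_a, dtype=float) - np.asarray(frac_b, dtype=float)
    # Map each fractional difference into [-0.5, 0.5] so that atoms related by
    # a lattice translation are compared via their nearest images
    # (adequate for the small displacements considered here).
    delta -= np.round(delta)
    cart = delta @ lattice
    return float(np.sqrt(np.mean(np.sum(cart**2, axis=1))))
```

For two structures with slightly different lattice parameters, a choice has to be made about which cell is used for the conversion to Cartesian coordinates; the sketch leaves that choice to the caller.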

The evolution of the lattice parameters of the best structure candidate (structure 1) within each step of the PDFGlobalFit is illustrated in Table 5 and reflects the progressively improved fit of the structure candidate to the experimental PDF data.

Using barbituric acid as an example, the power of the PDFGlobalFit without prior indexing using FIDEL and TOPAS could be demonstrated. The correct crystal structure of barbituric acid was found three times, starting from a set of only 300 000 random structures in the three most frequent space groups P1, P2_{1}/c and P\bar{1}, by a fit to PDF data.
4. Discussion
Barbituric acid is a test case, which was used to demonstrate the feasibility and power of the PDFGlobalFit method. What about more complex structures? Prill et al. (2016) have shown that the structure of the organic compound allopurinol can be successfully solved even in P1 with four independent molecules, i.e. with 21 degrees of freedom, if the lattice parameters are known in advance. The high information content of PDF data has also been used to determine the local structure of disordered materials, including SF_{6} (Tucker et al., 2007) and monomethylquinacridone, C_{21}H_{14}N_{2}O_{2} (Schlesinger et al., 2020). These observations indicate that the PDF data should contain enough information to solve more complex structures than barbituric acid from scratch.
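One plausible way to arrive at a count of 21 for four rigid molecules in P1 is to assign three positional and three orientational parameters to each molecule and to subtract three parameters because the origin can be chosen freely along all three axes in P1. This bookkeeping is an assumption of the following Python sketch and is not taken from the cited work; the function and argument names are hypothetical.

```python
def structural_dof(n_molecules, floating_origin_axes=0, n_torsions=0,
                   n_lattice_parameters=0):
    """Count the parameters of a direct-space structure solution.

    n_molecules           independent rigid molecules (3 translations + 3 rotations each)
    floating_origin_axes  directions along which the origin is arbitrary
    n_torsions            intramolecular degrees of freedom
    n_lattice_parameters  lattice parameters that are not known in advance
    """
    return (6 * n_molecules + n_torsions + n_lattice_parameters
            - floating_origin_axes)


# Four rigid molecules in P1, lattice parameters known in advance.
print(structural_dof(4, floating_origin_axes=3))

# Hypothetical case: one rigid molecule on a general position in a monoclinic
# cell with a, b, c and beta unknown.
print(structural_dof(1, n_lattice_parameters=4))
```

The second call illustrates how unknown lattice parameters add to the total when no indexing is available.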
Classical methods for structure determination use the Bragg peaks only. This information is quite limited, especially if the powder pattern contains only a few broad peaks. In contrast, the PDF uses the information from the total scattering, including the diffuse scattering, even in the very high 2θ range, and the background, which is generally ignored by classical structure solution methods.
To estimate the complexity of the structures that in principle should be solvable by the PDFGlobalFit, a comparison with classical direct-space methods for SDPD might be helpful. Both approaches are based on the information content of the powder diffraction data. Experience shows that the success rate of the direct-space methods is not limited by the size of the molecules, but by the number of degrees of freedom (presupposing that the indexing is reliable). The structure solution by direct-space methods becomes challenging if the number of degrees of freedom (for molecular position, molecular orientation and intramolecular degrees of freedom) exceeds a limit of 20–25 (Florence et al., 2005; Kabova et al., 2017; Nilsson Lill et al., 2018). A similar trend can be expected for the PDFGlobalFit, given that the unknown lattice parameters increase the number of degrees of freedom.
The advantage of the PDFGlobalFit in comparison with classical direct-space methods is that no prior indexing is required. Note that the PDF provides the local structure, whereas classical SDPD gives the average long-range ordering in the crystal, which may deviate from the local structure. Therefore, the PDFGlobalFit can support the classical SDPD for an unindexable powder pattern, such as that of a nanocrystalline sample, but can also be combined with SDPD for crystalline compounds to determine the difference between local and average structure, for example in disordered materials.
The geometrical accuracy of the structures resulting from a fit to the PDF is excellent. The lattice parameters as well as the molecular position and orientation of the investigated compounds determined by a fit to the PDF are in perfect agreement with already published single-crystal data. This observation was made in prior work, where the lattice parameters were (approximately) known, and mainly the molecular position and orientations were determined by a PDF fit (Prill et al., 2016).
The structure determination of barbituric acid by the PDFGlobalFit was performed using the PDF only in the range of r = 1–20 Å for the structure solution and 1–30 Å for the structure refinement. Actually, the PDF contains signals up to much larger r values, because the ordering length (domain size) in the investigated sample is more than 300 Å. The fact that a range of 1–30 Å was fully sufficient for the structure determination reveals that the PDFGlobalFit should also work successfully for nanocrystalline compounds with small domain sizes (e.g. 30–100 Å). Hence, the PDFGlobalFit is a new method for the determination of crystal structures of nanocrystalline compounds from scratch, without the need to index the powder pattern. The PDFGlobalFit is built on the global optimization method of FIDEL, which has been developed and successfully applied for structure determination from unindexed powder patterns of very low, but still sufficient, quality for SDPD (Habermehl, Schlesinger & Schmidt, 2021). The basic concepts of the approach could be successfully adapted and applied to the structure determination by a fit to the PDF. Other methods to determine crystal structures of nanocrystalline organic compounds include electron diffraction or crystal structure prediction, in combination with X-ray powder diffraction to select the actual structure from the simulated ones. However, the characteristics of all these methods are different. The PDFGlobalFit is the only method that yields the local structure from the diffraction data, instead of the average structure. Furthermore, the PDFGlobalFit is the only method that can be applied if the powder pattern contains no Bragg peaks, but only broad humps. (A `crystal' consisting of 5 × 5 × 5 unit cells does not produce any useful Bragg peaks, but provides a reliable PDF.) Of course, a combination of different approaches is also useful.
A second reason why the PDF range used for structure solution was restricted to 1–20 Å instead of a broader range, e.g. 1–100 Å, is the required computational time. The most time-consuming task of the structure solution is the simulation of the PDFs from structural models, with the time required for the calculation of a single PDF growing roughly in proportion to r^{3}. This affects the screening of a huge number of trial structures by comparison with the experimental PDF and, even more, the fitting of structural models. Hence, the restriction of the r range is crucial for the feasibility of the structure solution. Although performed with a restricted r range, the structure fitting by SA (step 3) still required about 50% of the total computing time of the entire PDFGlobalFit. Because of the high computing effort in this step, it would be practically impossible to fit all random structures from step 1 to the PDF data (step 3). Hence an adequate and reliable preselection of promising structure candidates is unavoidable. The preselection is done by the similarity measure in step 2. This highlights the essential role of two major concepts of the global optimization approach of FIDEL for the success of the global fit to PDF or powder patterns: (i) The use of the similarity measure S_{12} and its adaptability by variation of the neighbourhood range parameter l provides the basis for a comparison of simulated and experimental data that enables the detection of a rough match, in particular with respect to strongly deviating lattice parameters. (ii) From the characteristics of the similarity measure, an effective incremental search strategy can be designed which makes a global fit starting from a huge number of random structures feasible by minimization of the computing time required.
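The consequence of the roughly cubic growth of the computing time with the upper limit of the r range can be estimated with a few lines of Python. The sketch assumes a pure power law with exponent 3, motivated by the fact that the number of atom pairs within a sphere of radius r grows with its volume; the function name and the reference range are illustrative choices, and real timings will deviate from this idealized estimate.

```python
def relative_pdf_cost(r_max, r_ref=20.0, exponent=3.0):
    """Estimated cost of simulating one PDF up to r_max (in angstrom),
    relative to a calculation up to r_ref, assuming cost ~ r_max**exponent."""
    return (r_max / r_ref) ** exponent


for r_max in (20.0, 30.0, 100.0):
    print(f"r_max = {r_max:5.1f} A  ->  relative cost {relative_pdf_cost(r_max):.3g}")
```

Under this assumption, extending the range from 20 to 30 Å costs a factor of about 3.4 per simulated PDF, whereas a range up to 100 Å would cost a factor of 125, which would apply to every trial structure screened and to every cycle of a fit.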
5. Conclusion
A novel method called the PDFGlobalFit is reported for solving organic crystal structures from scratch by a fit to the pair distribution function without prior indexing. Only the molecular geometry and experimental PDF data must be provided as input. The method contains an automated structure solution procedure, according to the Monte Carlo approach, in selected space groups, using the program FIDEL. The PDF calculation and the fitting of the structural models are performed using TOPAS. Subsequently, a user-controlled refinement of the most promising structure candidates against the PDF data results in the final structure. The suitability of the method was proven using barbituric acid as an example. This is the first time that an organic crystal structure has been solved from scratch by a fit to the PDF without lattice parameters and the space group as input. The implementation of the PDFGlobalFit in FIDEL is still under development and therefore not yet available in a commercial version of the software.
The next steps will be the examination and further development of the method, e.g. for crystal structures containing molecules with conformational degrees of freedom, nanocrystalline samples, or more complex systems such as hydrates, solvates, salts and cocrystals. Additionally, the procedure has to be further optimized to reduce the computational time in order to gain a higher throughput. Another perspective will be the combination of the fit to the PDF with the fit to the powder pattern under the common framework of the global optimization approach of FIDEL.
Nevertheless, the possibility to solve crystal structures from unindexable powder data by a fit to the PDF, or even to obtain the local structure of nanocrystalline organic materials, is within reach.
Supporting information
Supporting information, Table S1. DOI: https://doi.org//10.1107/S1600576721002569/vk5045sup1.pdf
Acknowledgements
The authors thank Professor Dr Martin U. Schmidt (Goethe University, Frankfurt am Main) for his support and helpful discussions, and Dirk Bender (Goethe University, Frankfurt am Main) for several test calculations using the PDFGlobalFit. Open access funding enabled and organized by Projekt DEAL.
Funding information
CS thanks the `Fond der chemischen Industrie' for a generous scholarship.
References
Aksel, E., Forrester, J. S., Nino, J. C., Page, K., Shoemaker, D. P. & Jones, J. L. (2013). Phys. Rev. B, 87, 104113.
Allen, F. H. & Motherwell, W. D. S. (2002). Acta Cryst. B58, 407–422.
Bardwell, D. A., Adjiman, C. S., Arnautova, Y. A., Bartashevich, E., Boerrigter, S. X. M., Braun, D. E., Cruz-Cabeza, A. J., Day, G. M., Della Valle, R. G., Desiraju, G. R., van Eijck, B. P., Facelli, J. C., Ferraro, M. B., Grillo, D., Habgood, M., Hofmann, D. W. M., Hofmann, F., Jose, K. V. J., Karamertzanis, P. G., Kazantsev, A. V., Kendrick, J., Kuleshova, L. N., Leusen, F. J. J., Maleev, A. V., Misquitta, A. J., Mohamed, S., Needs, R. J., Neumann, M. A., Nikylov, D., Orendt, A. M., Pal, R., Pantelides, C. C., Pickard, C. J., Price, L. S., Price, S. L., Scheraga, H. A., van de Streek, J., Thakur, T. S., Tiwari, S., Venuti, E. & Zhitkov, I. K. (2011). Acta Cryst. B67, 535–551.
Bates, S., Zografi, G., Engers, D., Morris, K., Crowley, K. & Newman, A. (2006). Pharm. Res. 23, 2333–2349.
Billinge, S. J. L. (2015). Nanomedicine, 10, 2473–2475.
Coelho, A. A. (2016). TOPAS Academic, Version 6. Technical Reference. Coelho Software, Brisbane, Australia.
Coelho, A. A. (2018). J. Appl. Cryst. 51, 210–218.
Coelho, A. A. (2000). J. Appl. Cryst. 33, 899–908.
Coelho, A. A., Chater, P. A. & Kern, A. (2015). J. Appl. Cryst. 48, 869–875.
David, W. I. F., Shankland, K., McCusker, L. B. & Baerlocher, C. (2002). Structure Determination from Powder Diffraction Data. Oxford University Press.
Davis, T., Johnson, M. & Billinge, S. J. L. (2013). Cryst. Growth Des. 13, 4239–4244.
Desiraju, G. R. (2003). J. Mol. Struct. 656, 5–15.
Dinnebier, R. E. & Billinge, S. J. L. (2008). Editors. Powder Diffraction: Theory and Practice. Cambridge: Royal Society of Chemistry.
Egami, T. & Billinge, S. (2012). Underneath the Bragg Peaks, 2nd ed. Amsterdam: Elsevier.
Farrow, C. L., Juhás, P., Liu, J. W., Bryndin, D., Božin, E. S., Bloch, J., Proffen, T. & Billinge, S. J. L. (2007). J. Phys. Condens. Matter, 19, 335219.
Fernandes, P., Shankland, K., Florence, A. J., Shankland, N. & Johnston, A. (2007). J. Pharm. Sci. 96, 1192–1202.
Florence, A. J., Shankland, N., Shankland, K., David, W. I. F., Pidcock, E., Xu, X., Johnston, A., Kennedy, A. R., Cox, P. J., Evans, J. S. O., Steele, G., Cosgrove, S. D. & Frampton, C. S. (2005). J. Appl. Cryst. 38, 249–259.
Frandsen, B. A., Gong, Z., Terban, M., Banerjee, S., Chen, B., Jin, C., Feygenson, M., Uemura, Y. J. & Billinge, S. J. L. (2016). Phys. Rev. B, 94, 094102.
Frisch, M. J. et al. (2009). GAUSSIAN09, Revision A.2. Gaussian Inc., Wallingford, CT, USA.
Gelder, R. de, Wehrens, R. & Hageman, J. A. (2001). J. Comput. Chem. 22, 273–289.
Gorelik, T. E., Czech, C., Hammer, S. M. & Schmidt, M. U. (2016). CrystEngComm, 18, 529–535.
Habermehl, S., Mörschel, P., Eisenbrandt, P., Hammer, S. M. & Schmidt, M. U. (2014). Acta Cryst. B70, 347–359.
Habermehl, S., Schlesinger, C. & Prill, D. (2021). J. Appl. Cryst. 54, 612–623.
Habermehl, S., Schlesinger, C. & Schmidt, M. U. (2021). In preparation.
Hammersley, A. P. (2016). J. Appl. Cryst. 49, 646–652.
Hata, N., Furuishi, T., Tamboli, M. I., Ishizaki, M., Umeda, D., Fukuzawa, K. & Yonemochi, E. (2020). Crystals, 10, 53.
Hofmann, D. W. M. (2002). Acta Cryst. B58, 489–493.
Hunger, K. & Schmidt, M. U. (2018). Industrial Organic Pigments. Production, Crystal Structures, Properties, Applications, 4th ed. Weinheim: Wiley-VCH.
Juhás, P., Davis, T., Farrow, C. L. & Billinge, S. J. L. (2013). J. Appl. Cryst. 46, 560–566.
Juhás, P., Farrow, C., Yang, X., Knox, K. & Billinge, S. (2015). Acta Cryst. A71, 562–568.
Juhás, P., Granlund, L., Gujarathi, S. R., Duxbury, P. M. & Billinge, S. J. L. (2010). J. Appl. Cryst. 43, 623–629.
Kabova, E. A., Cole, J. C., Korb, O., López-Ibáñez, M., Williams, A. C. & Shankland, K. (2017). J. Appl. Cryst. 50, 1411–1420.
Lindahl Christiansen, T., Kjær, E. T. S., Kovyakh, A., Röderen, M. L., Høj, M., Vosch, T. & Jensen, K. M. Ø. (2020). J. Appl. Cryst. 53, 148–158.
Liu, C.-H., Tao, Y., Hsu, D., Du, Q. & Billinge, S. J. L. (2019). Acta Cryst. A75, 633–643.
Marshall, M. G., Lopez-Diaz, V. & Hudson, B. S. (2016). Angew. Chem. Int. Ed. 55, 1309–1312.
Mazaj, M., Kaučič, V. & Zabukovec Logar, N. (2016). ACSi, pp. 440–458.
Moore, M. D., Steinbach, A. M., Buckner, I. S. & Wildfong, P. L. D. (2009). Pharm. Res. 26, 2429–2437.
Mörschel, P. & Schmidt, M. U. (2015). Acta Cryst. A71, 26–35.
Neder, R. B. & Korsunskiy, V. I. (2005). J. Phys. Condens. Matter, 17, S125–S134.
Neder, R. B. & Proffen, T. (2008). Diffuse Scattering and Defect Structure Simulations: a Cook Book Using the Program DISCUS. Oxford University Press.
Neder, R. B. & Proffen, Th. (2020). J. Appl. Cryst. 53, 710–721.
Neumann, M. A., van de Streek, J., Fabbiani, F. P. A., Hidber, P. & Grassmann, O. (2015). Nat. Commun. 6, 7793.
Nilsson Lill, S. O., Widdifield, C. M., Pettersen, A., Ankarberg, A. S., Lindkvist, M., Aldred, P., Gracin, S., Shankland, N., Shankland, K., Schantz, S. & Emsley, L. (2018). Mol. Pharm. 15, 1476–1487.
Ojovan, M. I. & Louzguine-Luzgin, D. V. (2020). J. Phys. Chem. B, 124, 3186–3194.
Peterson, J., TenCate, J., Proffen, Th., Darling, T., Nakotte, H. & Page, K. (2013). J. Appl. Cryst. 46, 332–336.
Pidcock, E. & Motherwell, W. D. S. (2004). Acta Cryst. B60, 725–733.
Pidcock, E., Motherwell, W. D. S. & Cole, J. C. (2003). Acta Cryst. B59, 634–640.
Prill, D., Juhás, P., Billinge, S. J. L. & Schmidt, M. U. (2016). Acta Cryst. A72, 62–72.
Prill, D., Juhás, P., Schmidt, M. U. & Billinge, S. J. L. (2015). J. Appl. Cryst. 48, 171–178.
Proffen, T., Billinge, S. J. L., Egami, T. & Louca, D. (2003). Z. Kristallogr. Cryst. Mater. 218, 132–143.
Rademacher, N., Daemen, L. L., Chronister, E. L. & Proffen, Th. (2012). J. Appl. Cryst. 45, 482–488.
Rantanen, J. K., Majda, D., Riikonen, J. & Lehto, V.-P. (2018). SSRN, https://doi.org/10.2139/ssrn.3275461.
Rapallo, A. (2009). J. Chem. Phys. 131, 044113.
Schlesinger, C., Bolte, M. & Schmidt, M. U. (2019). Z. Kristallogr. Cryst. Mater. 234, 257–268.
Schlesinger, C., Hammer, S. M., Gorelik, T. E. & Schmidt, M. U. (2020). Acta Cryst. B76, 353–365.
Schmidt, M. U., Brüning, J., Glinnemann, J., Hützler, M. W., Mörschel, P., Ivashevskaya, S. N., van de Streek, J., Braga, D., Maini, L., Chierotti, M. R. & Gobetto, R. (2011). Angew. Chem. Int. Ed. 50, 7924–7926.
Schmidt, M. U., Dinnebier, R. E. & Kalkhof, H. (2007). J. Phys. Chem. B, 111, 9722–9732.
Streek, J. van de & Neumann, M. A. (2010). Acta Cryst. B66, 544–558.
Terban, M. W., Cheung, E. Y., Krolikowski, P. & Billinge, S. J. L. (2016). Cryst. Growth Des. 16, 210–220.
Terban, M. W., Russo, L., Pham, T. N., Barich, D. H., Sun, Y. T., Burke, M. D., Brum, J. & Billinge, S. J. L. (2020). Mol. Pharm. 17, 2370–2389.
Tucker, M. G., Keen, D. A., Dove, M. T., Goodwin, A. L. & Hui, Q. (2007). J. Phys. Condens. Matter, 19, 335218.
Yang, L., Juhás, P., Terban, M. W., Tucker, M. G. & Billinge, S. J. L. (2020). Acta Cryst. A76, 395–409.
Young, C. A. & Goodwin, A. L. (2011). J. Mater. Chem. 21, 6464.
Zea-Garcia, J. D., de La Torre, A. G., Aranda, M. A. G. & Cuesta, A. (2019). Materials, 12, 1347.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.