Gold Standard for macromolecular crystallography diffraction data

A large portion of the research community concerned with high data-rate macromolecular crystallography has agreed to an updated specification of data and metadata for diffraction images to be produced at light sources to facilitate the processing of data sets and to enable data archiving according to FAIR principles. Here, the resulting standard is presented.


NXmx Full Layout
This is a snapshot of the HDRMX NXmx application definition as proposed to the NeXus International Advisory Committee (NIAC) for adoption. The adoption process and discussions in the community are likely to result in additions and changes to the Gold Standard. The latest version prior to formal adoption is available from http://github.com/HDRMX/definitions. The HDRMX version will be updated as needed to reflect changes during and after adoption.

Status:
application definition, extends NXobject

Description:
functional application definition for macromolecular crystallography

Symbols:
These symbols will be used below to coordinate datasets of the same shape (configuration of array dimensions). Most MX x-ray detectors will produce two-dimensional images. Some will produce three-dimensional images, using one of the indices to select a detector module.

This field should only be filled when the value is accurately observed. If the data collection aborts or otherwise prevents accurate recording of the end time, this field should be omitted.

This is required for any scan experiment. The reason it is optional is primarily to accommodate XFEL single shot exposures.

Use of the depends_on field and the NXtransformations group is strongly recommended. As noted above, this should be an absolute requirement for any scan experiment. The reason it is optional is mainly to accommodate XFEL single shot exposures.

Each detector element is represented as an NXdetector group with its own detector data array. Each detector data array may be further decomposed into array sections by use of NXdetector_module groups.
The names are given in the group_names field.

The groups are defined hierarchically, with names given in the group_names field, unique identifying indices given in the group_index field, and the level in the hierarchy given in the group_parent field. For example, if an x-ray detector, DET, consists of four elements in a rectangular array, we could have:

The 32-bit pixel mask for the detector. This can be either one mask for the whole dataset (i.e. an array with indices i, j) or each frame can have its own mask (in which case it would be an array with indices np, i, j). It contains a bit field for each pixel to signal dead, blind, high or otherwise unwanted or undesirable pixels. The bits have the following meaning:

If the full bit depth is not required, providing a mask with fewer bits is permissible.
If needed, additional pixel masks can be specified by including additional entries named pixel_mask_N, where N is an integer.
For example, a general bad pixel mask could be specified in pixel_mask to indicate noisy and dead pixels, and an additional pixel mask from experiment-specific shadowing could be specified in pixel_mask_2. The cumulative mask is the bitwise OR of pixel_mask and any pixel_mask_N entries.
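The cumulative mask described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the masks are represented as nested lists rather than arrays read from a file, the helper name cumulative_mask is hypothetical, and the bit values in the example are arbitrary and do not stand for any particular bit meaning.

```python
from functools import reduce


def cumulative_mask(masks):
    """Combine equally shaped 2D masks (nested lists of ints) with bitwise OR.

    `masks` stands in for pixel_mask followed by any pixel_mask_N entries.
    The helper name and the nested-list representation are assumptions
    made for this sketch only.
    """
    if not masks:
        raise ValueError("at least one pixel mask is required")

    def or_two(a, b):
        # All masks must describe the same (i, j) pixel grid.
        if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
            raise ValueError("pixel masks must have the same shape")
        return [[pa | pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

    return reduce(or_two, masks)


# Arbitrary example bit values, chosen only to show the OR operation.
pixel_mask = [[0, 1], [4, 0]]
pixel_mask_2 = [[2, 1], [0, 0]]
combined = cumulative_mask([pixel_mask, pixel_mask_2])
```

A pixel flagged in either input keeps its flag in the combined mask, and bits set in both inputs are not counted twice.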
If provided, it is recommended that it be compressed.

The value at which the detector goes into saturation. Data above this value is known to be invalid.
For example, given a saturation_value and an underload_value, the valid pixels are those less than or equal to the saturation_value and greater than or equal to the underload_value.

underload_value: (optional) NX_INT
The lowest value at which pixels for this detector would be reasonably measured.
For example, given a saturation_value and an underload_value, the valid pixels are those less than or equal to the saturation_value and greater than or equal to the underload_value.
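The validity rule stated above (underload_value <= pixel <= saturation_value, both bounds inclusive) can be sketched as follows. The function name valid_pixels, the nested-list frame, and the numeric limits are illustrative assumptions, not part of the definition.

```python
def valid_pixels(frame, underload_value, saturation_value):
    """Return a boolean grid marking pixels within the inclusive valid range.

    `frame` is a 2D nested list of pixel counts; the function name and
    data layout are assumptions made for this sketch.
    """
    return [
        [underload_value <= pixel <= saturation_value for pixel in row]
        for row in frame
    ]


# Hypothetical frame and limits, chosen only to exercise both bounds.
frame = [[-1, 0, 5], [100, 101, 50]]
valid = valid_pixels(frame, underload_value=0, saturation_value=100)
```

Pixels exactly equal to either limit count as valid, matching the "less than or equal to" and "greater than or equal to" wording.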
sensor_material: (required) NX_CHAR
At times, radiation is not directly sensed by the detector. Rather, the detector might sense the output from some converter like a scintillator. This is the name of this converter material.

sensor_thickness: (required) NX_FLOAT {units=NX_LENGTH}
At times, radiation is not directly sensed by the detector. Rather, the detector might sense the output from some converter like a scintillator. This is the thickness of this converter material.

This group specifies the hyperslab of data in the data array associated with the detector that contains the data for this module. If the module is associated with a full data array, rather than with a hyperslab within a larger array, then a single module should be defined, spanning the entire array.

The data_origin is 0-based.
The frame number dimension (np) is omitted. Thus the data_origin field for a 2-dimensional dataset with indices (np, i, j) will be an array with indices (i, j), and for a 3-dimensional dataset with indices (np, i, j, k) it will be an array with indices (i, j, k).
The order of indices (i, j) or (i, j, k) is slow to fast.

Several other use cases are permitted, depending on the presence or absence of other incident_wavelength_X fields.
In the case of a polychromatic beam this is an array of length m of wavelengths, with the relative weights given in incident_wavelength_weight.
In the case of a monochromatic beam that varies shot-to-shot, this is an array of wavelengths, one for each recorded shot. Here, incident_wavelength_weight and incident_wavelength_spread are not set.
In the case of a polychromatic beam that varies shot-to-shot, this is an array of length m with the relative weights specified in incident_wavelength_weight as a 2D array.
In the case of a polychromatic beam that varies shot-to-shot and where the channels also vary, this is a 2D array of dimensions np by m (slow to fast) with the relative weights specified in incident_wavelength_weight as a 2D array.

In the case of a polychromatic beam that varies shot-to-shot, this is a 2D array of dimensions np by m (slow to fast) of the relative weights of the corresponding wavelengths specified in incident_wavelength.

In the case of a beam that varies in total flux shot-to-shot, this is an array of values, one for each recorded shot.

The neutron or x-ray storage ring/facility. Note, the NXsource base class has many more fields available, but at present we only require the name.
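Returning to the incident wavelength use cases listed earlier, one way a reader of such a file might tell them apart is by the number of dimensions of the wavelength array and of the weight array, if present. The sketch below is an illustration only: the function names are hypothetical, values are plain Python numbers and nested lists rather than datasets read from a file, and the scalar case is assumed here to denote a plain monochromatic beam.

```python
def shape(value):
    """Return the shape of a scalar or a rectangular nested list."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(dims)


def describe_wavelength(wavelength, weight=None):
    """Classify the beam description by array dimensionality.

    `wavelength` and `weight` stand in for the incident wavelength and
    incident wavelength weight fields; the function name and the returned
    labels are assumptions made for this sketch.
    """
    w_dim = len(shape(wavelength))
    k_dim = None if weight is None else len(shape(weight))
    if w_dim == 0:
        # Assumed here: a scalar means a single monochromatic wavelength.
        return "monochromatic"
    if w_dim == 1 and k_dim is None:
        return "monochromatic, varying shot-to-shot"
    if w_dim == 1 and k_dim == 1:
        return "polychromatic"
    if w_dim == 1 and k_dim == 2:
        return "polychromatic, weights varying shot-to-shot"
    if w_dim == 2 and k_dim == 2:
        return "polychromatic, channels and weights varying shot-to-shot"
    raise ValueError("combination of shapes not covered by the listed cases")


# Hypothetical values: m = 3 channels, np = 2 shots.
channels = [0.97, 0.98, 0.99]
weights_per_shot = [[0.2, 0.6, 0.2], [0.3, 0.5, 0.2]]
label = describe_wavelength(channels, weights_per_shot)
```

Dimensionality alone cannot separate a 1D array of m channels from a 1D array of np shots, which is why the sketch relies on whether a weight array is present.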