data for structural and crystallization communications

This page gives a list of recommended items for inclusion in structural and crystallization communications in Acta Crystallographica Section F.

The recommendations are tabulated below alongside the data names from the PDB mmCIF exchange dictionary available from the Protein Data Bank. These data items are provided in the mmCIF data sets created by the Protein Data Bank when a structure is deposited.

If you intend to submit to Acta Crystallographica Section F, you are advised to use the PDB_EXTRACT utility, available from the Protein Data Bank, to extract as much of this information as possible from the result and log files of the most commonly used macromolecular structure packages. This creates an initial mmCIF suitable for direct upload to the Protein Data Bank during the deposition process. A copy of the deposited mmCIF can then be uploaded to the journal to create standard tables for inclusion in your article.

The list below also includes examples to show how particular data will be organised in the mmCIF and how they will be arranged in the journal article.
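As a rough illustration of the mapping between mmCIF data items and table rows, the sketch below reads single-valued items from a small mmCIF fragment and pairs them with description labels. The sample values, the helper names (read_simple_items, table_rows) and the choice of rows are hypothetical. The reader deliberately ignores loop_ blocks and multi-line values, so it is not a general mmCIF parser and does not represent the journal's own tooling.

```python
def read_simple_items(text):
    """Collect single-valued `_category.item value` pairs from mmCIF text.

    Simplification: only lines that start with a data name and carry a
    value on the same line are read; loop_ blocks and semicolon-delimited
    multi-line values are ignored.
    """
    items = {}
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            name, value = parts[0], parts[1].strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            items[name] = value
    return items


# Hypothetical fragment; the values are invented for illustration only.
SAMPLE = """\
data_example
_struct.title 'Hypothetical enzyme in complex with ligand'
_exptl_crystal_grow.method 'hanging-drop vapour diffusion'
_exptl_crystal_grow.temp 293
_symmetry.space_group_name_H-M 'P 21 21 21'
"""

# Description labels paired with the data names listed on this page.
ROWS = [
    ("Structure name", "_struct.title"),
    ("Crystallization method", "_exptl_crystal_grow.method"),
    ("Temperature (K)", "_exptl_crystal_grow.temp"),
    ("Space group", "_symmetry.space_group_name_H-M"),
]


def table_rows(items, rows=ROWS):
    """Return (label, value) pairs, using '?' for items that are absent."""
    return [(label, items.get(name, "?")) for label, name in rows]


ITEMS = read_simple_items(SAMPLE)
for label, value in table_rows(ITEMS):
    print(f"{label:<25} {value}")
```

For real deposited files, a dedicated mmCIF library would be a safer choice than this line-based reader.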




1. Sample information

_symmetry.cell_setting
_symmetry.Int_Tables_number
_symmetry.space_group_name_H-M
Description mmCIF items

1.1. Macromolecule and source information

  Example 1: complex of E. coli glutamate decarboxylase α with glutarate
  Example 2: a zinc-induced heterodimer of two isoforms of phospholipase A2
  Example 3: mutation
  Example 4: mutation and modification
Structure name _struct.title
Component molecules _entity.pdbx_description
Additional molecular identifiers ENTITY_NAME_SYS
ENTITY_NAME_COM
_entity.pdbx_ec
Biological functional unit (BFU) or macromolecular assembly, numbers and types of chains _struct_biol.details
Mass of BFU (Da) _struct_biol.pdbx_formula_weight
_struct_biol.pdbx_formula_weight_method
Macromolecule sequence and chemical configuration
Sequence database reference code _struct_ref.db_name
_struct_ref.db_code
Polymers (one-letter code sequence) _entity_poly.pdbx_seq_one_letter_code_can
_entity_poly.pdbx_seq_one_letter_code
or Polymer sequence as list of residues _entity_poly_seq.num
_entity_poly_seq.mon_id
Ligand, cofactor, ions, solvent _pdbx_entity_nonpoly.name

Small-molecule components will normally be identified as above by a pointer to an entry in the PDB ligand dictionary. Exceptionally, molecular details may be carried within the deposited mmCIF using data items from the appropriate categories:
_chem_comp.id
CHEM_COMP
CHEM_COMP_ATOM
CHEM_COMP_BOND
Mutations _entity.pdbx_mutation
Post-translational modifications _entity.pdbx_modification
Formula weight of entity (Da) _entity.formula_weight or
_entity.pdbx_formula_weight_exptl
_entity.pdbx_formula_weight_exptl_meth
Source organism
Scientific name _entity_src_nat.pdbx_organism_scientific
Strain _entity_src_nat.strain
Details _entity_src_nat.details
Source gene
Scientific name _entity_src_gen.pdbx_gene_src_scientific_name
Strain _entity_src_gen.gene_src_strain
Details _entity_src_gen.gene_src_details

1.2. Macromolecule production

The link below illustrates how to supply information on sample preparation.
  Example: requested data for sample preparation.
For each macromolecular entity
PCR protocol _pdbx_entity_prod_protocol.protocol
_pdbx_entity_prod_protocol.protocol_type
Cloning protocol _pdbx_entity_prod_protocol.protocol
_pdbx_entity_prod_protocol.protocol_type
Expression protocol _pdbx_entity_prod_protocol.protocol
_pdbx_entity_prod_protocol.protocol_type
Purification protocol _pdbx_entity_prod_protocol.protocol
_pdbx_entity_prod_protocol.protocol_type
Additional details _pdbx_entity_prod_protocol.protocol
_pdbx_entity_prod_protocol.protocol_type

1.3. Crystallization

The links below illustrate several scenarios and should be consulted alongside the following list of items.
  Example 1: hanging-drop vapour diffusion
  Examples 2 and 3: different reservoir components
  Example 4: microbatch or liquid-liquid diffusion or dialysis method
Crystallization method _exptl_crystal_grow.method
_exptl_crystal_grow.method_ref
Temperature (K) _exptl_crystal_grow.temp
_exptl_crystal_grow.temp_details
Additional details
Describe anything special about the method
_exptl_crystal_grow.details
Examples: screens used and hits, optimization details, special conditions such as microgravity, magnetic field
Apparatus _exptl_crystal_grow.apparatus
Atmosphere _exptl_crystal_grow.atmosphere
Pressure (kPa) _exptl_crystal_grow.pressure
Crystal growth time _exptl_crystal_grow.time
Seeding _exptl_crystal_grow.seeding
_exptl_crystal_grow.seeding_ref
Volumes and pHs of crystallization solutions _pdbx_exptl_crystal_grow_sol.volume
_pdbx_exptl_crystal_grow_sol.volume_units
_pdbx_exptl_crystal_grow_sol.pH
Compositions of crystallization solutions _pdbx_exptl_crystal_grow_comp.comp_name
_pdbx_exptl_crystal_grow_comp.conc
_pdbx_exptl_crystal_grow_comp.conc_range
_pdbx_exptl_crystal_grow_comp.conc_units
Cryo treatments
Final cryoprotection solution _pdbx_exptl_crystal_cryo_treatment.final_solution_details
Soaking _pdbx_exptl_crystal_cryo_treatment.soaking_details
Cooling _pdbx_exptl_crystal_cryo_treatment.cooling_details
Annealing _pdbx_exptl_crystal_cryo_treatment.annealing_details

1.4. Crystal data

Space group, crystal system _symmetry.space_group_name_H-M
_symmetry.Int_Tables_number
_symmetry.cell_setting
Unit-cell parameters (Å, °) (s.u. optional) _cell.length_a
_cell.length_a_esd
_cell.length_b
_cell.length_b_esd
_cell.length_c
_cell.length_c_esd
_cell.angle_alpha
_cell.angle_alpha_esd
_cell.angle_beta
_cell.angle_beta_esd
_cell.angle_gamma
_cell.angle_gamma_esd
Crystal dimensions or radius (mm) _exptl_crystal.size_max
_exptl_crystal.size_mid
_exptl_crystal.size_min or
_exptl_crystal.rad
Colour of crystal _exptl_crystal.colour or
_exptl_crystal.colour_primary
_exptl_crystal.colour_modifier
_exptl_crystal.colour_lustre
Crystal habit or shape _exptl_crystal.description
No. of molecules in unit cell (Z) _cell.formula_units_Z
Matthews coefficient VM (Å3 Da-1) _exptl_crystal.density_Matthews
Solvent content (%) _exptl_crystal.density_percent_sol
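The last two crystal data items are derived quantities. The sketch below shows one common way to obtain them from the unit-cell parameters, Z and the molecular mass: VM = V / (Z × M), and solvent fraction ≈ 1 − 1.23 / VM. The factor 1.23 is the usual approximation for proteins and is an assumption here, as are the function names and the input values, which are invented for illustration.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume (Å3) from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, z, mass_da):
    """VM (Å3 Da-1): cell volume divided by the total mass in the cell."""
    return volume / (z * mass_da)


def solvent_fraction(vm, protein_factor=1.23):
    """Approximate solvent fraction; 1.23 is the usual value for proteins."""
    return 1.0 - protein_factor / vm


# Hypothetical orthorhombic cell with four 25 kDa molecules.
VOLUME = cell_volume(50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
VM = matthews_coefficient(VOLUME, z=4, mass_da=25000.0)
SOLVENT = solvent_fraction(VM)
print(f"V = {VOLUME:.0f} Å3, VM = {VM:.2f} Å3 Da-1, solvent = {100 * SOLVENT:.1f}%")
```

With these invented inputs VM is about 2.1 Å3 Da-1 and the solvent content about 41%. For nucleic acids or complexes a different factor would be needed.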

2. Data collection and structure solution statistics

Description mmCIF items

2.1. Data collection, refinement data set

  Example 1: data collection for a trypsin inhibitor and complexes of trypsin with the wild-type inhibitor and mutant, collected at different synchrotrons and with rotating-anode equipment.
Data set identifier _database_2.database_id
_database_2.database_code
Crystal sample conditions _exptl_crystal.preparation
Examples: temperature, pressure, crystal mount, cryostat.
Diffraction protocol _diffrn_radiation.pdbx_diffrn_protocol
Sampling protocol _diffrn_measurement.device
_diffrn_measurement.device_details
_diffrn_measurement.method
Source of diffracting beam _diffrn_source.source
_diffrn_source.type
Focusing and collimation _diffrn_radiation.collimation
Monochromator _diffrn_radiation.monochromator
X-ray beam size _diffrn_source.size
Wavelength (Å) _diffrn_radiation.pdbx_wavelength
_diffrn_radiation.pdbx_wavelength_list
Detector type _diffrn_detector.detector
_diffrn_detector.type
Temperature (K) _diffrn.ambient_temp
_diffrn.ambient_temp_esd
_diffrn.ambient_temp_details
Total measuring time (s) _diffrn_detector.pdbx_collection_time_total
No. of images _diffrn_detector.pdbx_frames_total
Data-processing software _computing.data_reduction
This is the preferred data item for providing a succinct reference to the software package used for this purpose. Additional information about the package may also be provided using appropriate items in the category
SOFTWARE
Resolution range (Å) _reflns.d_resolution_low
_reflns.d_resolution_high
and resolution range outer shell (Å) _reflns_shell.d_res_low
_reflns_shell.d_res_high
No. of unique reflections _reflns.number_all
_reflns.details
_reflns_shell.number_unique_all
No. of observed reflections _reflns.number_obs
This item will only be used if a structure has not been refined, for example in reporting results of crystallization experiments. Otherwise, _refine.ls_number_reflns_obs will be reported under Section 3.
Criterion for observed reflections _reflns.observed_criterion_sigma_F or
_reflns.observed_criterion_sigma_I
As above, these items will only be used if a structure has not been refined. Otherwise, the corresponding items for reflections used in the refinement will be reported under Section 3.
Completeness (%) _reflns.percent_possible_obs
_reflns_shell.percent_possible_obs
Redundancy _reflns.pdbx_redundancy
_reflns_shell.pdbx_redundancy
< I/σ(I) > overall and by shell _reflns.pdbx_netI_over_sigmaI
_reflns_shell.pdbx_netI_over_sigmaI_all
_reflns_shell.pdbx_netI_over_sigmaI_obs
Rmerge overall and by shell _reflns.Rmerge_F_all
_reflns.Rmerge_F_obs
_reflns_shell.Rmerge_F_all
_reflns_shell.Rmerge_F_obs
_reflns.pdbx_Rmerge_I_all
_reflns.pdbx_Rmerge_I_obs
_reflns_shell.Rmerge_I_all
_reflns_shell.Rmerge_I_obs
Rr.i.m. _reflns.pdbx_Rrim_I_all
Rp.i.m. _reflns.pdbx_Rpim_I_all
d-spacing (Å) at which < I/σ(I) > = 2 (if this does not occur, leave blank) _reflns.pdbx_res_netI_over_sigmaI_2
dopt _reflns.pdbx_d_opt
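For readers unfamiliar with the merging statistics listed above, the following sketch computes redundancy, Rmerge, Rr.i.m. and Rp.i.m. on intensities using their usual definitions. It assumes that symmetry-equivalent reflections have already been mapped to a common index and that reflections measured only once are left out of the R sums; data-processing packages may differ on that convention. The function name and the observations are hypothetical.

```python
import math
from collections import defaultdict


def merging_statistics(observations):
    """Merging statistics from (hkl, intensity) pairs.

    Assumes symmetry equivalents already share the same hkl key.
    Reflections with a single observation count towards the redundancy
    but are excluded from the R sums.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(float(intensity))

    n_obs = sum(len(values) for values in groups.values())
    sum_dev = sum_rim = sum_pim = sum_int = 0.0
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue
        mean = sum(values) / n
        dev = sum(abs(v - mean) for v in values)
        sum_dev += dev                               # Rmerge numerator
        sum_rim += math.sqrt(n / (n - 1)) * dev      # redundancy-independent
        sum_pim += math.sqrt(1 / (n - 1)) * dev      # precision-indicating
        sum_int += sum(values)
    if sum_int == 0:
        raise ValueError("no multiply measured reflections with non-zero intensity")
    return {
        "n_unique": len(groups),
        "redundancy": n_obs / len(groups),
        "Rmerge": sum_dev / sum_int,
        "Rrim": sum_rim / sum_int,
        "Rpim": sum_pim / sum_int,
    }


# Invented observations: one reflection measured twice, one three times.
OBS = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 1, 0), 200.0), ((0, 1, 0), 190.0), ((0, 1, 0), 210.0),
]
STATS = merging_statistics(OBS)
for key, value in STATS.items():
    print(f"{key:<10} {value:.4f}")
```

Per-shell values would follow by applying the same function to the observations that fall within each resolution shell.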

2.2. Phasing

Phasing method _phasing.method

2.2.1. MAD/SAD data and structure solution statistics

The link below illustrates characterization of data sets in MAD or related phasing methodologies.
  Example: MAD phasing of a stretch of double-stranded DNA bound to a drug molecule
MAD/SAD phasing method used _phasing_MAD.method
Insertion of MAD/SAD scatterers _phasing_MAD.details
Method of locating scatterers _phasing_MAD.pdbx_anom_scat_method
No. of MAD/SAD sets used in phasing _phasing_MAD.pdbx_number_data_sets
Phasing resolution range (Å) _phasing_MAD.pdbx_d_res_low
_phasing_MAD.pdbx_d_res_high
Phasing power all data; centric, acentric _phasing_MAD.pdbx_power_centric
_phasing_MAD.pdbx_power_acentric
Figure of merit overall _phasing_MAD.pdbx_fom
_phasing_MAD.pdbx_fom_centric
_phasing_MAD.pdbx_fom_acentric
MAD/SAD solution software _computing.structure_solution
This is the preferred data item for providing a succinct reference to the software package used for this purpose. Additional information about the package may also be provided using appropriate items in the category
SOFTWARE
For each phasing set
Radiation source _phasing_set.radiation_source_specific
Radiation wavelength _phasing_set.radiation_wavelength
Temperature (K) _phasing_set.temp
_phasing_set.pdbx_temp_details
Resolution range in the phasing data set (Å) _pdbx_phasing_MAD_set.d_res_low
_pdbx_phasing_MAD_set.d_res_high
f' used in phasing _phasing_MAD_set.f_prime
f'' used in phasing _phasing_MAD_set.f_double_prime
Phasing power by set; centric, acentric _pdbx_phasing_MAD_set.power_centric
_pdbx_phasing_MAD_set.power_acentric
No. of sites _pdbx_phasing_MAD_set.number_of_sites
For each of the sites, the following: site no., atom symbol, occupancy, x, y, z and Biso _pdbx_phasing_MAD_set_site.id
_pdbx_phasing_MAD_set_site.atom_type_symbol
_pdbx_phasing_MAD_set_site.occupancy
_pdbx_phasing_MAD_set_site.fract_x
_pdbx_phasing_MAD_set_site.fract_y
_pdbx_phasing_MAD_set_site.fract_z
_pdbx_phasing_MAD_set_site.B_iso

2.2.2. MIR/MIRAS/SIR/SIRAS data and structure solution statistics

The link below illustrates characterization of data sets in MIR or related phasing methodologies, in the cases where the native data set is and is not used for refinement.
  Example: use of a samarium derivative and the SIRAS method
For the MIR application as a whole:
No. of derivatives _phasing_MIR.pdbx_number_derivatives
Description of the phasing strategy _phasing_MIR.details
Resolution range of phasing (Å) _phasing_MIR.d_res_low
_phasing_MIR.d_res_high
Phasing power all data; acentric, centric _phasing_MIR_der.power_acentric
_phasing_MIR_der.power_centric
Figure of merit all data _phasing_MIR.FOM
_phasing_MIR.FOM_centric
_phasing_MIR.FOM_acentric
MIR solution software _computing.structure_solution
This is the preferred data item for providing a succinct reference to the software package used for this purpose. Additional information about the package may also be provided using appropriate items in the category
SOFTWARE
For each phasing data set (if the native data set used for phasing is not the set used for refinement, it should be described as the first phasing set; additional data sets will correspond to each of the derivatives):
Radiation source _phasing_set.radiation_source_specific
Radiation wavelength _phasing_set.radiation_wavelength
Temperature (K) _phasing_set.temp
_phasing_set.pdbx_temp_details
Resolution range of phasing data set (Å) _phasing_set.pdbx_d_res_low
_phasing_set.pdbx_d_res_high
Then, for each derivative:
Derivative _phasing_MIR_der.id
Derivative preparation _phasing_MIR_der.details
Heavy-atom location method _phasing_MIR.method
Number of sites _phasing_MIR_der.number_of_sites
Figure of merit _phasing_MIR_der.pdbx_fom
_phasing_MIR_der.pdbx_fom_centric
_phasing_MIR_der.pdbx_fom_acentric
For each of the sites, the following: site no., atom symbol, occupancy, x, y, z and Biso _phasing_MIR_der_site.id
_phasing_MIR_der_site.atom_type_symbol
_phasing_MIR_der_site.occupancy
_phasing_MIR_der_site.fract_x
_phasing_MIR_der_site.fract_y
_phasing_MIR_der_site.fract_z
_phasing_MIR_der_site.B_iso

2.2.3. Molecular replacement data and structure solution statistics

The link below illustrates characterization of data sets in molecular replacement phasing methodologies, in the cases where the native data set is and is not used for refinement.
  Example: phasing of a DNA octamer using the molecular replacement method
PDB code for search model _pdbx_database_related.db_name
_pdbx_database_related.db_id
_pdbx_database_related.content_type
or Identification of search model _refine.pdbx_starting_model
If phasing data set is not the data set used for refinement:
Radiation source _phasing_set.radiation_source_specific
Radiation wavelength (Å) _phasing_set.radiation_wavelength
Temperature (K) _phasing_set.temp
_phasing_set.pdbx_temp_details
Resolution range (Å) _phasing_set.pdbx_d_res_low
_phasing_set.pdbx_d_res_high
Molecular replacement phasing details
Alterations to the search model _pdbx_phasing_MR.model_details
MR solution software _computing.structure_solution
This is the preferred data item for providing a succinct reference to the software package used for this purpose. Additional information about the package may also be provided using appropriate items in the category
SOFTWARE

3. Model generation and refinement

Description mmCIF items
  Example 1: a 2'-5' RNA ligase
  Example 2: M. tuberculosis pyR protein
Structure refinement software _computing.structure_refinement
This is the preferred data item for providing a succinct reference to the software package used for this purpose. Additional information about the package may also be provided using appropriate items in the category
SOFTWARE
Refinement on |F|, I, or F2 _refine.ls_structure_factor_coef
σ cutoff in data (one of)
_refine.pdbx_ls_sigma_F
_refine.pdbx_ls_sigma_I
_refine.pdbx_ls_sigma_Fsqd
Resolution range (Å) _refine.ls_d_res_low
_refine.ls_d_res_high
and resolution range outer shell (Å) _refine_ls_shell.d_res_low
_refine_ls_shell.d_res_high
No. of reflections used in refinement _refine.ls_number_reflns_R_work
_refine_ls_shell.number_reflns_R_work
No. of reflections above σ cutoff in final cycle _refine.ls_number_reflns_obs
_refine_ls_shell.number_reflns_obs
Final overall R factor _refine.ls_R_factor_obs
_refine_ls_shell.R_factor_obs
Atomic displacement model (iso, aniso, mixed) _refine_B_iso.treatment
_refine_B_iso.class
Overall average B factor excluding solvent (Å2) _refine.B_iso_mean
_refine.B_iso_min
_refine.B_iso_max
No. of macromolecule atoms refined _refine_hist.pdbx_number_atoms_protein
_refine_hist.pdbx_number_atoms_nucleic_acid
No. of ligand atoms _refine_hist.pdbx_number_atoms_ligand
No. of solvent atoms _refine_hist.number_atoms_solvent
Total No. of atoms _refine_hist.number_atoms_total
No. of refined parameters _refine.ls_number_parameters
Non-crystallographic symmetry restraints _refine_ls_restr_ncs.ncs_model_details
Bulk solvent model _refine.solvent_model_param_bsol
_refine.solvent_model_param_ksol
_refine.solvent_model_details

4. Model validation

Description mmCIF items
The examples are those used also in Section 3 above:
  Example 1: a 2'-5' RNA ligase
  Example 2: M. tuberculosis pyR protein
Final Rwork _refine.ls_R_factor_R_work
_refine_ls_shell.R_factor_R_work
No. of reflections in test set for Rfree _refine.ls_number_reflns_R_free
_refine_ls_shell.number_reflns_R_free
Final Rfree _refine.ls_R_factor_R_free
_refine_ls_shell.R_factor_R_free
Cruickshank DPI _refine.overall_SU_R_Cruickshank_DPI
R.m.s. deviations from target values for bond distances, bond angles and isotropic B factors (overall, main chain and side chain)
_refine_ls_restr.type
_refine_ls_restr.dev_ideal
_refine_ls_restr.dev_ideal_target
_refine_ls_restr.number
Ramachandran plot analysis
   most favoured regions (%)
   additionally allowed regions (%)
   generously allowed regions (%)
   disallowed regions (%)
PDBX_FEATURE_SEQUENCE_RANGE
The links below illustrate the presentation of Ramachandran plot summary metrics for a single complete protein molecule, and for different sequence ranges within a more complex structure.
  Example 1: Ramachandran plot statistics for a single complete protein molecule
  Example 2: Ramachandran plot statistics for the two chains of a heterodimer
Omitted residues PDBX_FEATURE_MONOMER
The link below illustrates how to account for missing and partial residues in the refined model.
  Example: missing and partial residues.
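As a reminder of how the Rwork and Rfree values reported in this section relate to the working and test sets, the sketch below applies the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs| to each set separately. It assumes that the calculated amplitudes are already on the same scale as the observed ones. The function names and the reflection values are hypothetical and do not reproduce any particular refinement program.

```python
def r_factor(pairs):
    """Conventional R factor for an iterable of (f_obs, f_calc) amplitudes."""
    pairs = list(pairs)
    numerator = sum(abs(abs(f_obs) - abs(f_calc)) for f_obs, f_calc in pairs)
    denominator = sum(abs(f_obs) for f_obs, _ in pairs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


def r_work_and_free(reflections):
    """Split (f_obs, f_calc, is_free) triples and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, is_free in reflections if not is_free]
    test = [(o, c) for o, c, is_free in reflections if is_free]
    return r_factor(work), r_factor(test)


# Invented amplitudes; the last two reflections belong to the test set.
REFLECTIONS = [
    (100.0, 90.0, False),
    (50.0, 55.0, False),
    (80.0, 70.0, True),
    (20.0, 25.0, True),
]
R_WORK, R_FREE = r_work_and_free(REFLECTIONS)
print(f"Rwork = {R_WORK:.3f}, Rfree = {R_FREE:.3f}")
```

The counts of reflections in each list correspond to the items reported as the numbers of reflections in the working and test sets.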

