Survey and analysis of crystal polymorphism in organic structures

A comprehensive list of organic crystalline polymorphs has been assembled using the Cambridge Structural Database (CSD) and the structures categorized by crystal type, uncovering significant variations in the polymorphism prevalence of each group. Phenomena such as the high prevalence of temperature-induced phase transitions in organic salts and the diminishing percentage of polymorphic crystal entries in the CSD over the last 20 years, with the exception of cocrystals, highlight areas of prospective study.


S1. Polymorphism in Crystal Growth & Design
Crystal Growth & Design (2017) has been published since 2001. To count polymorphism articles, the journal was searched for that term within the publication range of each year, restricted to print publication date (rather than web publication date). Only research articles and rapid communications were counted (reviews, editorials, and perspectives were excluded). The total number of articles each year was determined by counting the research articles and rapid communications published in each issue of that year. Note: The term very rarely appears in a sense other than polymorphic crystal structures (genetic polymorphism, for example), and such instances do not contribute substantially to the numbers for this journal.

S2. Methods for Polymorphism Search of the CSD using ConQuest
Searching for polymorphs in the CSD (Groom et al., 2016): When searching in version 1.19 of ConQuest for structures entered before the Nov 2016 update, the list shows structures deposited up until August of 2015, together with 192 entries updated after that date (11,907 entries). Therefore, it is necessary to note which version is used when searching for this data.
Each refcode family was analyzed by comparing the unit cell parameters and the simulated powder patterns exported to Mercury to confirm the existence of multiple polymorphic forms. The associated publications for each deposited structure were also consulted to determine the situations in which phase transitions were present due to temperature or pressure (Class B polymorphs).
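The unit cell comparison described above can be illustrated with a minimal sketch. It is not the procedure used for this survey: the tolerances, the function names, and the example family are all assumptions made for illustration, and the refcodes and cell values are invented. The comparison is deliberately naive and ignores alternative cell settings, so a mismatch would still need to be checked against the simulated powder patterns.

```python
# Naive unit cell comparison for entries in one refcode family.
# Cells are (a, b, c in angstroms, alpha, beta, gamma in degrees).
# Tolerances are illustrative assumptions, not values from the survey.

def cells_match(cell_a, cell_b, length_tol=0.02, angle_tol=1.0):
    """Return True when two cells agree within the given tolerances.

    Lengths are compared with a relative tolerance, angles with an
    absolute tolerance in degrees. Alternative settings of the same
    lattice are NOT recognized by this simple check.
    """
    lengths_ok = all(
        abs(x - y) <= length_tol * max(x, y)
        for x, y in zip(cell_a[:3], cell_b[:3])
    )
    angles_ok = all(
        abs(x - y) <= angle_tol
        for x, y in zip(cell_a[3:], cell_b[3:])
    )
    return lengths_ok and angles_ok


def distinct_forms(family):
    """Group the entries of one family into sets of matching cells."""
    forms = []
    for refcode, cell in family.items():
        for form in forms:
            # Compare against the first entry assigned to this form.
            if cells_match(cell, form[0][1]):
                form.append((refcode, cell))
                break
        else:
            forms.append([(refcode, cell)])
    return forms


# Invented example family (hypothetical refcodes and cell values).
example_family = {
    "ABCDEF": (7.10, 9.50, 12.30, 90.0, 101.5, 90.0),
    "ABCDEF01": (7.12, 9.48, 12.33, 90.0, 101.7, 90.0),
    "ABCDEF02": (5.80, 14.20, 16.90, 90.0, 90.0, 90.0),
}

forms = distinct_forms(example_family)
print(len(forms), "candidate forms")
```

A family that yields more than one group would be a candidate for multiple polymorphic forms, subject to confirmation from the powder patterns and the associated publications.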
There are 4,573 unique refcode families in the list of 11,909 polymorphic entries. However, three of these compounds show polymorphic forms of both deuterated and protonated versions of the compound, which means there are 4,576 unique chemical entities.
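The counting in this paragraph can be sketched as follows. The sketch relies on the CSD convention that a refcode consists of six letters, optionally followed by two digits, with the six letters identifying the family. The function names and the sample refcodes are hypothetical; only the totals come from the text.

```python
from collections import defaultdict


def family_of(refcode):
    """Return the refcode family: the six-letter stem of a CSD refcode."""
    return refcode[:6].upper()


def group_by_family(refcodes):
    """Map each refcode family to the list of its entries."""
    families = defaultdict(list)
    for refcode in refcodes:
        families[family_of(refcode)].append(refcode)
    return dict(families)


# Hypothetical refcodes, used only to show the grouping.
sample = ["ABCDEF", "ABCDEF01", "ABCDEF02", "GHIJKL", "GHIJKL10"]
grouped = group_by_family(sample)
print(len(grouped), "families from", len(sample), "entries")

# Totals reported in the text.
refcode_families = 4573
# Families holding polymorphs of both deuterated and protonated versions,
# each of which counts as two chemical entities.
families_with_both_versions = 3
unique_chemical_entities = refcode_families + families_with_both_versions
print(unique_chemical_entities)
```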

S5. Tables of Refcodes Not Included as Polymorphs
Entries in Table S5 are not included in the overall list of polymorphs. The 60 entries in blue in Table S3 that were determined to be polymorphic compounds are not included in Table S5, and were instead integrated into Table S4 and highlighted with an asterisk. Table S4 only includes compounds that have two structurally characterized entries in a refcode which contain 3D coordinates. Several entries in Table S5 have two forms listed as polymorphs, but do not have multiple forms with 3D coordinates known, and therefore are not included in Table S4. In this table and below, PT means phase transition.

Table S5
Description of compounds with only one entry in the polymorph list (shown as blue in Table S3) explaining why they are not included in

S6. Details of Polymorphism Tree Searching (Figure 2 in main text)
Searches of the CSD (Groom et al., 2016) detailed below were conducted with ConQuest version 1.18 with the restrictions of 3D coordinates known and organics only.
Single components: A search was conducted restricting to one chemical unit under Z/Density, and restricting entries to those not containing the name 'hydrate' or the name 'solvate'. This gives entries that should contain one neutral molecular unit (223,483 hits).

Multicomponent systems:
The search for salts involved analysis of any entry that contained two or more chemical units under
However, solvates can also be listed under the term clathrate, a term used to designate host-guest compounds in which the guest can be a solid or a liquid. For this purpose, only clathrates that contain liquids are included. A listing of entries with the term clathrate that are not already in the solvate list produces 5,444 hits. These were analyzed to remove solid guests, leaving 3,117 solvates. Added to those from the search for solvate alone, the total is 35,065.
To determine cocrystals, several searches were conducted. A search for 2 chemical units with no ions, no hydrates, and no solvates should return entries with two neutral components (11,314 hits). However, not every one of these entries is a cocrystal: some were clathrates or unlisted solvates, so this list needed to be individually sorted through to find the number of cocrystal entries in this group (7,080 hits). A search for 3 or more chemical units could contain cocrystals plus a solvent, or two solvents and one neutral molecule, as well as salts and/or ionic cocrystals. This search gave 25,667 hits, and these were individually analyzed to find entries containing at least two neutral components that are solids at room temperature (5,712 hits). Added together, this results in 12,792 cocrystal entries.
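The cocrystal tally above can be checked with a few lines of arithmetic. The hit counts are those quoted in the text; the variable names and the retained-fraction calculation are additions for illustration only.

```python
# Hit counts quoted in the text for the two cocrystal searches.
two_unit_hits = 11_314          # 2 chemical units, no ions, hydrates, or solvates
two_unit_cocrystals = 7_080     # retained after individual sorting
multi_unit_hits = 25_667        # 3 or more chemical units
multi_unit_cocrystals = 5_712   # at least two neutral components solid at room temperature


def retained_fraction(kept, hits):
    """Fraction of search hits kept as cocrystals after manual analysis."""
    return kept / hits


total_cocrystals = two_unit_cocrystals + multi_unit_cocrystals
print("total cocrystal entries:", total_cocrystals)
print("retained, 2 units: {:.1%}".format(
    retained_fraction(two_unit_cocrystals, two_unit_hits)))
print("retained, 3+ units: {:.1%}".format(
    retained_fraction(multi_unit_cocrystals, multi_unit_hits)))
```

The retained fractions give a rough sense of how much manual sorting each search required.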
Families in each category were determined by finding the number of unique refcodes in each list.
The number of polymorph entries was determined by adding a text search for "polymorph" to each of the crystal type searches outlined above.
The number of polymorph families was determined by finding the number of unique refcodes in each list of polymorph entries.
The numbers of polymorphic compounds are those determined from the search outlined in section S2, with the data presented in Table S4. Breakdowns for each multicomponent crystal type also come from the data in Table S4, and show the combination multicomponent systems (such as cocrystal salts or hydrate solvates) in each crystal type.

Figure S1
Percentage of polymorphs versus organics in the CSD for cocrystals with only 2 molecular units.

Figure S2
Percentage of polymorphs versus organics in the CSD for cocrystals with 3+ molecular units.