
Figure 2
(a) Dendrogram for cluster analysis of the 14 cryocooled thaumatin data sets introduced in §3.1. (b) R_{meas} for random combinations of the 14 data sets introduced in §3.1. Calculations for groups of two, three, four and so on, up to 14 data sets, are shown. The broken line runs through the medians for all groups, while the full lines bound the interquartile range, i.e. the dots falling below the lower line and the dots falling above the upper line together represent 50% of all values. Optimally selected groups of data sets could be considered as those having R_{meas} below the lower full line; these are among the 25% of best-performing groups. (c) The broken and full lines in this plot replicate those in (b). The empty circles correspond to values of R_{meas} for all merged data sets found in the dendrogram in (a). Ten out of 13 of them fall below the lower interquartile range line, a region that by construction contains only the best-performing 25% of random combinations. Thus, the selective power provided by cluster analysis is quite evident.
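
The quartile lines in panels (b) and (c), and the count of dendrogram nodes falling below the lower line, could be computed along the following lines. This is only a minimal sketch: the function names, the data layout and the numerical values are hypothetical placeholders, not the thaumatin results, and the quartile convention (the "inclusive" method) is an assumption, since the caption does not state which one was used.

```python
from statistics import quantiles


def quartile_lines(rmeas_by_size):
    """Map each group size to (lower quartile, median, upper quartile)
    of the R_meas values obtained for random groups of that size."""
    lines = {}
    for size, values in rmeas_by_size.items():
        # n=4 returns the three cut points that split the data into quarters.
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        lines[size] = (q1, median, q3)
    return lines


def count_below_lower_line(lines, merged):
    """Count merged data sets whose R_meas lies below the lower
    quartile line for the corresponding group size."""
    return sum(1 for size, rmeas in merged if rmeas < lines[size][0])


# Hypothetical R_meas values for random groups of two and three data sets.
random_groups = {
    2: [0.10, 0.12, 0.14, 0.16, 0.18],
    3: [0.11, 0.13, 0.15, 0.17, 0.19],
}

# Hypothetical dendrogram nodes as (number of data sets merged, R_meas).
cluster_nodes = [(2, 0.11), (3, 0.16)]

lines = quartile_lines(random_groups)
n_below = count_below_lower_line(lines, cluster_nodes)
print(f"{n_below} of {len(cluster_nodes)} nodes fall below the lower line")
```

Applied to real data, `random_groups` would hold the R_{meas} values plotted as dots in (b) and `cluster_nodes` the empty circles in (c).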