
JOURNAL OF
SYNCHROTRON
RADIATION
ISSN: 1600-5775

Nonlinear optimization for a low-emittance storage ring


aDepartment of Accelerator Science, Korea University, 2511 Sejong-ro, Sejong 30019, South Korea, bPohang Accelerator Laboratory, POSTECH, Pohang, Kyungbuk 37673, South Korea, and cMultipurpose Synchrotron Radiation Construction Project, Korea Basic Science Institute, 162 Yeongudanji-ro, Cheongwon-gu, Cheongju, Chungcheongbukdo 28119, South Korea
*Correspondence e-mail: tlssh@korea.ac.kr, picoma@postech.ac.kr

Edited by G. Kamel, SESAME, Jordan (Received 4 January 2024; accepted 15 May 2024; online 25 June 2024)

A multi-objective genetic algorithm (MOGA) is a powerful global optimization tool, but its results are considerably affected by the crossover parameter ηc. Finding an appropriate ηc demands substantial computing time because MOGA needs to be run several times. In this paper, a self-adaptive crossover parameter is introduced as a strategy that adopts a new ηc at every generation while MOGA is running. The new scheme has also been adopted for producing trial solutions in multi-generation Gaussian process optimization (MGGPO). Compared with the existing MGGPO and MOGA, the MGGPO and MOGA with the new strategy show better performance in nonlinear optimization for the design of low-emittance storage rings.

1. Introduction

In the design of an accelerator complex, nonlinear beam dynamics optimization is often repeatedly conducted to ensure the accelerator meets the required performance. A larger dynamic aperture (DA) secures a high injection efficiency as it helps to capture the injected beam of large amplitude. A longer Touschek lifetime implies that the loss of stored beam due to energy transfer between electrons in the beam is minimized. The DA area and Touschek lifetime can be increased by imposing appropriate sextupole strengths. Optimization algorithms can help to find suitable sextupole strengths.

Several stochastic global optimization schemes such as the multi-objective genetic algorithm (MOGA) (Deb et al., 2002; Yang et al., 2009), multi-objective particle swarm optimization (MOPSO) (Pang & Rybarcyk, 2014; Lin et al., 2015) and multi-generation Gaussian process optimization (MGGPO) (Song et al., 2020) have been successfully adopted in optimizing lattice designs.

MOGA exploits survival of the fittest by analogy with biological evolution. First, N candidate solutions are produced randomly. These parent solutions are combined to yield child solutions by crossover, and some variables of the child solutions are changed by mutation. The N best, i.e. fittest, solutions are retained and the others are abandoned, so a population of constant size is maintained. The retained solutions become the parents of the next generation. This process is carried out iteratively until the N solutions converge sufficiently to the Pareto front.
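The generational loop described above can be sketched in Python. This is a minimal illustration, not the code used in this work: `next_generation` ranks the pool by domination count, a deliberate simplification of NSGA-II's non-dominated sorting with crowding distance, and `evaluate`, `crossover` and `mutate` are placeholders for problem-specific routines (e.g. particle tracking, SBX and polynomial mutation).

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def next_generation(parents, evaluate, crossover, mutate, n):
    """One MOGA generation: crossover, mutation, then keep the n fittest.
    Selection here uses domination count only, a simplification of
    NSGA-II's non-dominated sorting with crowding distance."""
    children = []
    while len(children) < n:
        p1, p2 = random.sample(parents, 2)
        c1, c2 = crossover(p1, p2)
        children += [mutate(c1), mutate(c2)]
    pool = parents + children[:n]          # elitism: parents compete with children
    objs = [evaluate(s) for s in pool]
    order = sorted(range(len(pool)),
                   key=lambda i: sum(dominates(objs[j], objs[i])
                                     for j in range(len(pool))))
    return [pool[k] for k in order[:n]]    # least-dominated solutions survive
```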

In contrast, MGGPO generates more solutions (trial solutions) than the population by utilizing crossover and mutation of MOGA and random movement using the MOPSO mechanism which mimics animals' organized movement such as flocking behavior to find food. MGGPO uses Gaussian process (GP) regression, a kind of supervised machine learning, to predict objectives of trial solutions because this prediction requires negligible computing time. The N best solutions are retained and others are discarded. Objectives of selected solutions are evaluated to establish a GP model for GP regression in the subsequent generation.
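The surrogate step above can be illustrated with the posterior mean of a zero-mean Gaussian process under an RBF kernel; this is a generic sketch with assumed hyperparameters (`length_scale`, `noise`), not the MGGPO implementation, but it shows why predicting an objective for a trial solution costs only a small linear solve rather than particle tracking.

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel: a cheap
    surrogate for an expensive objective (generic sketch)."""
    def rbf(A, B):
        # squared Euclidean distances between all row pairs of A and B
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-0.5 * d2 / length_scale ** 2)
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf(X_test, X_train) @ np.linalg.solve(K, y_train)
```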

In MOGAs, crossover, i.e. simulated binary crossover (SBX), and mutation are quantified using the crossover parameter ηc and the mutation parameter ηm; these parameters affect the Pareto front and the convergence speed. In existing MOGAs, ηc is fixed for every generation. In this paper, we introduce the self-adaptive crossover parameter (SAXP) scheme, which uses a new ηc at each generation (Deb et al., 2007). The SAXP strategy was also applied to the MOGA crossover used to produce trial solutions in MGGPO. Section 2 introduces the SAXP strategy. Section 3 compares the results of MGGPO and MOGA, each using either a fixed crossover parameter (FXP) or the SAXP scheme, in optimizing the nonlinear dynamics of low-emittance storage rings. Section 4 presents conclusions.

2. Self-adaptive crossover parameter strategy

We used a real-coded genetic algorithm (GA) for nonlinear beam dynamics optimization and, therefore, for crossover we used an SBX formula,

\[ c_{1,i} = {\textstyle{1\over2}}\left[\left(1+\beta_i\right)p_{1,i} + \left(1-\beta_i\right)p_{2,i}\right], \eqno(1) \]

\[ c_{2,i} = {\textstyle{1\over2}}\left[\left(1-\beta_i\right)p_{1,i} + \left(1+\beta_i\right)p_{2,i}\right], \eqno(2) \]

where i is the decision-variable index, p1,i and p2,i are decision variables of the parent solutions, c1,i and c2,i are decision variables of child solutions, and

\[ \beta_i = \cases{ \left(2u\right)^{1/(\eta_{\rm c}+1)}, & if $0 \le u \le 0.5$, \cr \left\{1/\left[2\left(1-u\right)\right]\right\}^{1/(\eta_{\rm c}+1)}, & if $0.5 < u < 1$, } \eqno(3) \]

where 0 ≤ u ≤ 1 is a uniformly distributed random number. In MOGA, ηc is set by the user and generally remains constant during the whole MOGA process.

Equations (1) and (2) show that the two child solutions c1,i, c2,i lie on the line through the two parent solutions p1,i, p2,i in decision-variable space: the children are located inside the region bounded by the parents if 0 ≤ βi ≤ 1 (i.e. 0 ≤ u ≤ 0.5), and outside that region if βi > 1 (i.e. 0.5 < u < 1).
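Equations (1)–(3) translate directly into code. The following is a generic SBX sketch (not the authors' implementation); the optional argument `u` fixes the random draw, which matters for the SAXP scheme described below, where a second crossover must reuse the same u.

```python
import random

def sbx_crossover(p1, p2, eta_c, u=None):
    """Simulated binary crossover, equations (1)-(3).  p1 and p2 are lists
    of decision variables; a larger eta_c keeps the children closer to the
    parents.  Pass u to fix the random draw (e.g. for a second crossover
    that must reuse the same u)."""
    c1, c2 = [], []
    for x1, x2 in zip(p1, p2):
        ui = random.random() if u is None else u
        if ui <= 0.5:
            beta = (2.0 * ui) ** (1.0 / (eta_c + 1.0))                   # child inside parents
        else:
            beta = (1.0 / (2.0 * (1.0 - ui))) ** (1.0 / (eta_c + 1.0))   # child outside parents
        c1.append(0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2))
        c2.append(0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2))
    return c1, c2
```

Note that c1,i + c2,i = p1,i + p2,i, so each child pair is symmetric about the parents' midpoint; with p1 = 5, p2 = 10 and u = 0.1 this reproduces the worked example discussed below.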

ηc in equation (3) determines the probability distribution of offspring. As ηc increases, the distribution of offspring around the parents narrows in the variable space (Fig. 1); i.e. for a fixed random number u, the child solution moves closer to the parent as ηc increases. For example, when p1 = 5, p2 = 10 and u = 0.1, the first child c1 created with ηc = 15 is 5.24, whereas the second child c1′ created with ηc′ = 5 is 5.59 (Fig. 2). This characteristic can be exploited, following the principle of Nelder and Mead's simplex method (Press et al., 1992), to improve the child solutions. First, offspring are produced by crossover using a fixed ηc pre-defined by the user, and the objectives of each child are compared with those of its two parents. If a child solution is better than both parent solutions, a still better solution can probably be found farther from the nearest parent than this child; this follows the expansion step of the simplex method. To obtain this 'farther' solution, ηc is replaced with a new parameter ηc′ < ηc and a second crossover is conducted. If instead the child is inferior to both parents, the second crossover uses ηc′ > ηc, mirroring the contraction step of the simplex. Otherwise the second crossover uses ηc′ = ηc. The same u must be used for the first and second crossovers.
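The expand/contract decision can be sketched as follows. Pareto dominance is assumed here as the reading of "better than both parents" for multiple objectives; this is an illustrative interpretation, not the authors' code, and the returned value is the effective α to use in the ηc′ formulae derived below.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def effective_alpha(child_obj, p1_obj, p2_obj, alpha):
    """Simplex-style decision: expand for a child superior to both parents,
    contract for one inferior to both, otherwise keep eta_c unchanged."""
    if dominates(child_obj, p1_obj) and dominates(child_obj, p2_obj):
        return alpha            # expansion: eta_c' < eta_c, child pushed farther out
    if dominates(p1_obj, child_obj) and dominates(p2_obj, child_obj):
        return 1.0 / alpha      # contraction: eta_c' > eta_c, child pulled closer
    return 1.0                  # neither case: eta_c' = eta_c
```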

Figure 1
Distribution of two offspring according to the crossover parameter ηc.
Figure 2
Locations of c1 and c1′ with ηc = 15 and ηc′ = 5 and the same random number u = 0.1.

The new ηc′ is chosen using a protocol. The relationship between ηc and βi in equation (3) differs according to whether 0 ≤ u ≤ 0.5 (0 ≤ βi ≤ 1) or 0.5 < u < 1 (βi > 1). The formula for ηc′ also differs according to whether the child solution is superior or inferior to the parent solutions. Therefore, four ηc′ formulae are listed below.

When child c1 is superior to both parents p1 and p2, it must be that |c1′ − p1| > |c1 − p1|. If βi > 1, the child is located outside the two parents by external division [equation (1)]. From equation (1), βi can be written as

\[ \beta_i = 1 + {{2\left(c_{1,i} - p_{1,i}\right)} \over {p_{1,i} - p_{2,i}}}. \eqno(4) \]

When βi > 1, equation (3) can be rearranged to

\[ \eta_{\rm c} = -\left\{ 1 + {{\log\left[2\left(1-u\right)\right]} \over {\log\beta_i}} \right\}. \eqno(5) \]

We introduce parameter

\[ \alpha = {{c_{1,i}^{\,\prime} - p_{1,i}} \over {c_{1,i} - p_{1,i}}} \eqno(6) \]

to quantify how much changing the crossover parameter affects the difference between c′ and c.

Here |c1′ − p1| > |c1 − p1|, so α > 1. As α increases, c1,i′ moves away from p1. From equation (4), βi′ can be calculated by replacing c1,i with c1,i′,

\[ \beta_i^{\,\prime} = 1 + {{2\left(c_{1,i}^{\,\prime} - p_{1,i}\right)} \over {p_{1,i} - p_{2,i}}} \, {{c_{1,i} - p_{1,i}} \over {c_{1,i} - p_{1,i}}} = 1 + \alpha\left(\beta_i - 1\right). \eqno(7) \]

Using equation (5), the new ηc′ can be written as

\[ \eqalign{ \eta_{\rm c}^{\,\prime} & = -\left( 1 + {{\log\left[2\left(1-u\right)\right]} \over {\log\beta_i^{\,\prime}}} \, {{\log\beta_i} \over {\log\beta_i}} \right) \cr & = -\left[ 1 - \left(\eta_{\rm c}+1\right) {{\log\beta_i} \over {\log\beta_i^{\,\prime}}} \right] \cr & = -1 + {{\left(\eta_{\rm c}+1\right)\log\beta_i} \over {\log\left[1 + \alpha\left(\beta_i-1\right)\right]}}. } \eqno(8) \]

In equation (8), ηc′ < ηc if α > 1 and βi > 1. The right-hand side of equation (8) includes βi, which carries the decision-variable index i; hence ηc′ is determined separately for each variable of each solution and, strictly speaking, should be denoted ηc,i′. If child c1 is inferior to both parents p1 and p2, then α can be replaced with 1/α in equation (8) so that c1′ is created closer than c1 to p1. Therefore, ηc′ can be presented as

\[ \eta_{\rm c}^{\,\prime} = -1 + {{\left(\eta_{\rm c}+1\right)\log\beta_i} \over {\log\left\{1 + \left[\left(\beta_i-1\right)/\alpha\right]\right\}}}. \eqno(9) \]

If 0 ≤ βi < 1, the offspring lie between the two parents by internal division [equation (1)]. When βi < 1, from equation (3),

\[ \eta_{\rm c} = -1 + {{\log\left(2u\right)} \over {\log\beta_i}}. \eqno(10) \]

From equation (3), βi′ must not be negative. If βi′ is defined as in equation (7), it may become negative when βi < 1 because α can take any value > 1. We therefore need a new α relation to ensure that 0 ≤ βi′ ≤ 1 when 0 ≤ βi ≤ 1. When c1 is superior to p1 and p2, an increased α must move c1′ farther from p1 than c1. In this case we introduce α through the relationship

\[ \beta_i^{\,\prime} = \left(\beta_i\right)^{\alpha}. \eqno(11) \]

With equation (11), if 0 ≤ βi ≤ 1 then 0 ≤ βi′ ≤ 1. If α ≥ 1 then βi′ ≤ βi and, as α increases, βi′ decreases. From equation (4), when 0 ≤ βi′ ≤ 1, a decreased βi′ increases |2(c1,i′ − p1,i)/(p1,i − p2,i)| because βi′ − 1 = 2(c1,i′ − p1,i)/(p1,i − p2,i) ≤ 0; i.e. when α ≥ 1, c1,i′ is produced farther from p1,i than c1,i, and as α increases, c1,i′ moves away from p1,i. Using equations (10) and (11), ηc′ is calculated as

\[ \eqalign{ \eta_{\rm c}^{\,\prime} & = -1 + {{\log\left(2u\right)} \over {\log\beta_i^{\,\prime}}} \, {{\log\beta_i} \over {\log\beta_i}} \cr & = -1 + \left(\eta_{\rm c}+1\right) {{\log\beta_i} \over {\log\beta_i^{\,\prime}}} \cr & = -1 + {{\eta_{\rm c}+1} \over {\alpha}}. } \eqno(12) \]

In equation (12), α > 1, so ηc′ < ηc. If c1 is inferior to p1 and p2, then we replace α with 1/α, so ηc′ becomes

\[ \eta_{\rm c}^{\,\prime} = -1 + \alpha\left(\eta_{\rm c}+1\right). \eqno(13) \]

Otherwise, when a child solution is neither superior nor inferior to both parents, we set ηc′ = ηc. If the ηc′ calculated using equation (8), (9), (12) or (13) is negative, we set ηc′ = 0. If α = 1, the SAXP scheme reduces to the FXP scheme because ηc′ is then always equal to ηc. The ηc′ values determined by equations (8), (9), (12) and (13) are applied to each variable of each solution individually; i.e. all variables of the new child solutions are generated by the same crossover formulas, equations (1)–(3), but with a different crossover parameter ηc′ for each variable.
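Equations (8), (9), (12) and (13) can be collected into a single routine, since passing the effective α (α for a superior child, 1/α for an inferior one) covers all four cases. A sketch under that convention, including the clipping of negative ηc′ to 0 stated above:

```python
import math

def eta_c_prime(eta_c, beta, alpha):
    """New crossover parameter per equations (8), (9), (12) and (13).
    beta is the beta_i realized in the first crossover; alpha is the
    effective value (> 1 expands for a superior child, its reciprocal
    contracts for an inferior one).  Negative results are clipped to 0."""
    if beta > 1.0:
        # external division: equations (8)/(9) via beta' = 1 + alpha*(beta - 1)
        beta_prime = 1.0 + alpha * (beta - 1.0)
        eta = -1.0 + (eta_c + 1.0) * math.log(beta) / math.log(beta_prime)
    else:
        # internal division: equations (12)/(13) via beta' = beta**alpha,
        # where the logarithms cancel and only alpha survives
        eta = -1.0 + (eta_c + 1.0) / alpha
    return max(eta, 0.0)
```

As a check, α = 1 returns ηc unchanged (the FXP limit), and α > 1 with βi > 1 gives ηc′ < ηc, as stated after equation (8).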

3. Comparison of results of fixed and self-adaptive strategy

The SAXP scheme was first applied to MGGPO. MGGPO produces 10N trial solutions by random movement and a further 10N by crossover and mutation, where N is the population size. The crossover can use either the SAXP or the FXP scheme. To fulfil the SAXP strategy, the objectives of the child solutions must be evaluated twice, which would double the computing time of MGGPO relative to FXP. However, the first evaluation of the child objectives, used to determine ηc′, is performed not by particle tracking but by GP regression; i.e. expectation values rather than tracked values of the objectives are used to determine ηc′. GP regression demands insignificant computing time, so MGGPO takes similar times when using SAXP and FXP.

MGGPO with both SAXP and FXP was used to simultaneously increase the on- and off-momentum dynamic apertures (DAs) of the Korea-4GSR lattice. Korea-4GSR is a 28-cell low-emittance electron storage ring with hybrid seven-bend achromat optics (Oh et al., 2021). The ring has six sextupoles in each cell and is assumed to have a periodicity of 14 in terms of sextupole configuration (i.e. 12 power supplies for sextupoles in the ring). In the simulation, two of the 12 sextupole families were used to correct chromaticity and the remaining 10 served as decision variables of MGGPO. The on-momentum (δ = 0%) DA area and the off-momentum (δ = 4%) DA area were set as the two objectives. The population N was set at 300, the DA area was calculated using 100-turn tracking to reduce computing time, and ηm = 60 and ηc = 60. For SAXP, α = 3 and ηc′ was calculated using equations (8), (9), (12) and (13). MGGPO with 100 generations was performed first using the MOGA crossover in the FXP way [Fig. 3(a)] and then in the SAXP way [Fig. 3(b)], and the solutions were compared at the 100th generation in each case [Fig. 3(c)]. MGGPO with SAXP yields better solutions than MGGPO with FXP [Fig. 3(c)].

Figure 3
Comparison of objectives by MGGPO using FXP and SAXP with α = 3 at generation (a) 10, (b) 50, (c) 100 and (d) 150.

MOGA was also run with both the SAXP and the FXP method, using the same decision variables and objectives as for MGGPO. The other settings were N = 300, ηm = 25 and ηc = 10, and the algorithm was run for 100 generations. In MOGA with SAXP, the objectives of both child solutions c and c′ were evaluated by particle tracking, unlike in MGGPO, where they were estimated from the expectation values of GP regression. For this reason, MOGA using SAXP required twice as much computing time as MOGA using FXP.

First, to find an appropriate α value, MOGA with SAXP was conducted for a range of α [Figs. 4(a)–(f)]. The results indicated that α = 5 is a good choice for the SAXP scheme, providing better solutions than MOGA using FXP. With α = 5, the results of MOGA using SAXP at N generations were compared with those using FXP at 2N generations, with N from 20 to 100, to account for the doubled computing time of the SAXP strategy relative to FXP (Figs. 5 and 6). Compared with the FXP scheme, the SAXP scheme gave inferior solutions at N = 20 and 40, similar solutions at N = 60, and superior solutions at N = 80 and 100. The distribution of ηc′ for all variables of all solutions obtained by the SAXP scheme with α = 5 at generation 100 is shown in Fig. 7.

Figure 4
Comparison of objectives by MOGA using FXP and SAXP with (a) α = 2, (b) α = 3, (c) α = 4, (d) α = 5, (e) α = 6 and (f) α = 7.
Figure 5
Results of MOGA with the FXP strategy at 2N generations and the SAXP strategy at N generations, where N is (a) 20, (b) 40, (c) 60, (d) 80 and (e) 100.
Figure 6
Objectives of all solutions obtained from MOGA using (a) FXP at 2N generations and (b) SAXP with α = 5 at N generations, where N is 20 to 100.
Figure 7
ηc′ of all variables of all solutions obtained by the SAXP scheme with α = 5 at generation 100. S1–S10 are decision variables that represent the 10 sextupole strengths.

4. Conclusion

Previous studies applying MOGA have usually used a constant crossover parameter for SBX, i.e. the FXP strategy. We explored an alternative for SBX that changes the crossover parameter at each generation, i.e. the SAXP strategy. We compared the results of MOGA optimization with the two schemes when simultaneously optimizing the on- and off-momentum DA areas of the Korea-4GSR lattice, and verified that MOGA using SAXP presents a better Pareto front than MOGA using FXP when an appropriate α value is adopted.

Acknowledgements

We thank M. Yoon (POSTECH) for providing helpful information and many useful discussions. This work was supported by a Korea University Grant and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (RS-2022-00155836).

References

Deb, K. A., Pratap, S., Agarwal, S. & Meyarivan, T. (2002). IEEE Trans. Evol. Comput. 6, 182–197.
Deb, M., Karthik, S. & Okabe, T. (2007). Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO'07), 7–11 July 2007, London, UK, pp. 1187–1194.
Lin, Q. J., Li, Z., Du, J., Chen, J. & Ming, Z. (2015). Eur. J. Oper. Res. 247, 732–744.
Oh, B.-H. J., Ko, J., Lee, G., Jang, G. & Shin, S. (2021). Appl. Sci. 11, 11896.
Pang, X. & Rybarcyk, L. J. (2014). Nucl. Instrum. Methods Phys. Res. A, 741, 124–129.
Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. (1992). Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., ch. 10.4. Cambridge University Press.
Song, M. X., Huang, L., Spentzouris, L. & Zhang, Z. (2020). Nucl. Instrum. Methods Phys. Res. A, 976, 164273.
Yang, D., Robin, F., Sannibale, C., Steier, C. & Wan, W. (2009). Nucl. Instrum. Methods Phys. Res. A, 609, 50–57.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
