Machine learning-assisted lens-loaded cavity response optimization for improved direction-of-arrival estimation

Cavina T Morris

Lens-loaded dynamic aperture

The system block diagram is shown in Fig. 1: a single-input single-output lens-loaded cavity antenna is connected to a baseband processing unit via a single RF chain. The lens-loaded cavity antenna serves as a replacement for an antenna array aperture, so it is simply referred to as an aperture in this work. The processing unit comprises an estimator or a matched filter responsible for the DoA estimation. The lens-loaded cavity comprises an oversized chaotic cavity operating as a frequency-diverse compressive medium18. A constant-\(\varepsilon _r\) lens is placed in front of the chaotic cavity, and EM energy is transferred between the cavity and the lens structure through a curved surface with sub-wavelength holes. The structural configuration of the lens-loaded cavity is clearer in Fig. 2a, where the perspective view shows the surface containing the sub-wavelength holes. The lens-loaded cavity is placed in a Cartesian coordinate system where its FoV is along the \(+z\)-axis. When viewed from one side, i.e., xy-plane, a portion of the lens appears to be submerged in the chaotic cavity; however, it is important to note that the lens is not making any contact with the cavity. The gap between the lens and the surface containing the sub-wavelength holes is governed by the focal length of the lens at the frequency of operation, which is 28 GHz for this particular case. Note that the test frequency band in this work falls within the mmWave 5G bands defined by 3GPP New Radio (NR) FR2, which include band n257. Details of the synthesis approach to developing a constant-\(\varepsilon _r\) lens for a given frequency can be found in28 and the works discussed therein. The spherical constant-\(\varepsilon _r\) lens (see Fig. 2a) used in this study has a radius of 66.5 mm while the distance between the centre of the lens and the chaotic cavity is 70 mm. A careful adjustment of this gap is critical for the best radiation performance of the lens-loaded cavity.
As an example, the radiation performance of the cavity in terms of simulated peak realized gain is shown in Fig. 2b.

The chaotic cavity in Fig. 2a has physical dimensions of 170 mm \(\times\) 178 mm \(\times\) 180 mm in the (x \(\times\) y \(\times\) z) directions. The cavity structure is a metallic box with a simple geometric configuration. The constant-\(\varepsilon _r\) lens is placed in front of the cavity (i.e., facing the \(+z\)-axis) while the RF chain is connected to the back side (facing the \(-z\)-axis). For 28 GHz operation, a standard waveguide probe (WR28) is used to connect the cavity to the RF chain and the subsequent computational DoA estimation system. The most important component of the chaotic cavity relevant to this study is the metallic mode-mixing scatterer. The scatterer is a metallic strip of size 78 mm \(\times\) 45 mm, randomly oriented and asymmetrically placed within the chaotic cavity. The main purpose of the scatterer is to enhance the quasi-randomness of the cavity by randomly reflecting the EM energy within the metallic structure. This is similar to other mode-mixing structures in13,14 and the works discussed therein. However, the unique feature of a mode-mixing scatterer of this kind is that a slight rotation disturbs the wave-chaotic medium and results in a new set of radiation modes. This feature allows the chaotic cavity to be dynamically reconfigured simply by controlling the rotation of the mode-mixing scatterer through a stepper motor connected to it, as achieved in this study. Consequently, by controlling a single parameter of the aperture (i.e., the angle of rotation of the mode-mixing scatterer), it is possible to generate unique sets of radiation modes; hence, determining the best angle of rotation can be formulated as a 1-D optimization problem that can be solved in a short time.

Even though the problem definition above bears a semblance of a partial geometry modification problem, it is not a typification of partial geometry modification problems in which parts of the EM structure are removed or replaced for alteration and modification29,30. This is mainly because the geometry of the EM structure (and its associated elements) in this study, in and of itself, is not altered or modified. Rather, the EM structure (unaltered and unmodified in terms of physical geometry) is characterized, as the mode-mixing scatterer (also with a consistent geometry and connected to a stepper motor) is rotated for various angular states to establish the near-optimum state for DoA estimation. The connection settings are shown in Fig. 2a. It is worth mentioning that it does not really matter to which side of the chaotic cavity the mode-mixing scatterer is attached via the stepper motor shaft; however, it is recommended that the mode-mixing scatterer is placed close to a corner of the chaotic cavity to ensure that the physical symmetry of the structure along all three axes is broken. For this purpose, the scatterer in this study is placed at positions (87 mm, 42 mm, 65 mm) along (x, y, z) from the three walls of the chaotic cavity. Another point to remember here is that the scatterer needs to be firmly fixed to the stepper motor to ensure the chaotic cavity retains its physical state at any particular rotation angle. Also, note that a rotation mechanism with enhanced rotational resolution can lead to quasi-continuous control of the scatterer.

Figure 2. (a) Lens-loaded cavity structure with a mode-mixing scatterer connected to a stepper motor to include state-diversity. (b) Simulated peak realized gain representing high and low magnitude values on the radiation mask when a test signal of 28.1 GHz is excited at the WR28 input.

Dynamic aperture optimization methodology

Several local and global optimization methods are available in the literature for the targeted simulation-driven aperture design optimization problem, such as26,31,32,33. Local optimization techniques rely on good initial designs that the designer needs to specify as starting points31. However, in our case, it is difficult to find a good initial design. Global optimization-based EM device design techniques (e.g.33,34,35) do not require initial designs, but they often require a large (sometimes prohibitive) number of EM simulations to obtain optimal results26,33,34,35. For our targeted aperture, each EM simulation costs more than one hour. Hence, neither kind of method is suitable.

In recent years, the incorporation of ML techniques into the optimization kernel of standard EAs has been demonstrated to lower the computational cost of the optimization process when applied to EM device design36,37,38. This is mainly achieved through surrogate model-based optimization, in which many computationally expensive EM simulations in the optimization process are replaced with surrogate model-based predictions. These surrogate models, also called metamodels, are computationally cheap approximation models of expensive full-wave EM simulations. They are often constructed using ML techniques and are used to emulate the behavior of the EM simulation model as closely as possible. Even though many paradigms and methods are currently available for the ML-assisted optimization of EM designs, as reported in36,37,38,39, some of these approaches still have the drawbacks of standard optimization methods and are not general due to the ad-hoc processes required to ensure their efficiency.

The approaches in40,41,42 require good initial designs or starting points and may get trapped in local optima due to their use of a local search mechanism, trust-region gradient search. In43,44,45, the fidelity of the EM model is varied methodically in the optimization process to improve efficiency. This is implemented alongside ad-hoc processes such as verification and improvement of the generated designs using high-fidelity simulations and input space mapping in the local region, respectively, and the use of user-defined thresholds to control the variation of the fidelity of the EM model in terms of cells or lines per wavelength. These methods are not applicable to our case because: (1) a good initial design cannot be deduced for the lens-loaded aperture a priori, as mentioned earlier, and (2) the discretization of the lens-loaded aperture requires millions of mesh cells at the cost of a relatively long simulation time (even for a relatively low mesh density, see section “Lens-loaded dynamic aperture operation”) to guarantee model accuracy. So, having an accurate coarse (low-fidelity) model with a low cost in terms of simulation time is not feasible. SADEA-I21, adopted in this work, helps to overcome these drawbacks by providing a methodology that implements supervised learning and evolutionary computation in a unified optimization framework to efficiently synthesize the lens-loaded aperture for mmWave DoA estimation. The supervised learning and evolutionary computation techniques and their harmonized framework in SADEA-I are discussed as follows:

Supervised Learning

Like other methods in the SADEA series22,23,24,25, SADEA-I uses a Gaussian process (GP)46,47 for surrogate modelling. Given a set of EM design geometric and/or material properties (\(x=(x^{1},\ldots ,x^{n})\)), corresponding to a set of performances (\(y=(y^{1},\ldots ,y^{n})\)) from full-wave EM simulation results, GP predicts the targeted EM design performance (\(y=f(x)\)) for a candidate design x by modelling y(x) as a Gaussian distributed stochastic variable having a mean of \(\mu\) and a variance of \(\sigma _1^2\). If y(x) is continuous, as is the case for typical EM device design landscapes, the function values (\(y(x^i)\) and \(y(x^j)\)) of any two candidate designs \(x^{i}\) and \(x^{j}\) should be close to each other if the designs are highly correlated. A Gaussian correlation function is used to deduce this correlation between two candidate designs in SADEA-I:

$$\begin{aligned} \begin{array}{ll} Corr(x_{i},x_{j})=e^{H}; \quad H=-\sum \limits _{t=1}^{d}\zeta _{t}|x_{i}^{t}-x_{j}^{t}|^{\rho _{t}} \\ \text {for} \quad \zeta _{t}>0, \; 1\le \rho _{t}\le 2 \\ \end{array} \end{aligned}$$


where d is the dimension of x and \(\zeta _{t}\) is the correlation parameter that determines how rapidly the correlation diminishes as \(x_{i}\) moves in the t-th direction. The smoothness of the function with respect to \(x^{t}\) is related to \(\rho _{t}\). To deduce the parameters \(\zeta _{t}\) and \(\rho _{t}\), the likelihood function that \(y=y^{i}\) at \(x=x^{i} (i=1,\ldots ,n)\) is maximized. Hence, the Gaussian process regression or kriging-based prediction of the performance (\(y(x^{*})\)) of a candidate design (\(x^{*}\)) is carried out as follows:

$$\begin{aligned} {\hat{y}}(x^{*})={\hat{\mu }}+z^{T}Z^{-1}(y-I{\hat{\mu }}) \end{aligned}$$



$$\begin{aligned} Z_{i,j}= & {} Corr(x_{i},x_{j}), i,j=1,2,\ldots ,n \end{aligned}$$


$$\begin{aligned} z= & {} [Corr(x^{*},x_{1}),Corr(x^{*},x_{2}),\ldots ,Corr(x^{*},x_{n})] \end{aligned}$$


$$\begin{aligned} {\hat{\mu }}= & {} (I^{T}Z^{-1}I)^{-1}I^{T}Z^{-1}y \end{aligned}$$


The mean squared error of the prediction uncertainty is:

$$\begin{aligned} {\hat{s}}^2(x^{*}) = {\hat{\sigma }}_1^{2}[1-z^{T}Z^{-1}z+(1-I^{T}Z^{-1}z)^{2}(I^{T}Z^{-1}I)^{-1}] \end{aligned}$$



$$\begin{aligned} {\hat{\sigma }}_1^{2}=(y-I{\hat{\mu }})^{T}Z^{-1}(y-I{\hat{\mu }})n^{-1} \end{aligned}$$
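The prediction and uncertainty equations above can be illustrated with a minimal NumPy sketch. It is not the SADEA-I implementation: the function names, the small `nugget` added to the diagonal for numerical conditioning, the hand-picked values of \(\zeta _{t}\) and \(\rho _{t}\) (the maximum-likelihood fit is omitted), and the toy data are all assumptions made for illustration. The uncertainty is computed in the standard kriging form, with the term \((1-I^{T}Z^{-1}z)^{2}(I^{T}Z^{-1}I)^{-1}\), where I is a vector of ones.

```python
import numpy as np

def gaussian_corr(a, b, zeta, rho):
    """Gaussian correlation between rows of a (m x d) and b (n x d), Eq. (1)."""
    diff = np.abs(a[:, None, :] - b[None, :, :])        # shape (m, n, d)
    return np.exp(-np.sum(zeta * diff ** rho, axis=2))  # shape (m, n)

def kriging_fit(X, y, zeta, rho, nugget=1e-10):
    """Precompute Z^{-1}, mu_hat and sigma_hat^2 from the training data."""
    n = X.shape[0]
    # The nugget is an assumption (not in the text); it only aids conditioning.
    Z = gaussian_corr(X, X, zeta, rho) + nugget * np.eye(n)
    # The explicit inverse mirrors the equations; a Cholesky solve is
    # usually preferable numerically.
    Zinv = np.linalg.inv(Z)
    ones = np.ones(n)
    mu = (ones @ Zinv @ y) / (ones @ Zinv @ ones)
    resid = y - mu * ones
    sigma2 = (resid @ Zinv @ resid) / n
    return {"X": X, "Zinv": Zinv, "mu": mu, "sigma2": sigma2,
            "resid": resid, "zeta": zeta, "rho": rho}

def kriging_predict(model, x_star):
    """Predicted value and mean squared error at a single design x_star."""
    z = gaussian_corr(x_star[None, :], model["X"], model["zeta"], model["rho"])[0]
    Zinv = model["Zinv"]
    ones = np.ones(z.size)
    y_hat = model["mu"] + z @ Zinv @ model["resid"]
    u = 1.0 - ones @ Zinv @ z
    s2 = model["sigma2"] * (1.0 - z @ Zinv @ z + u ** 2 / (ones @ Zinv @ ones))
    return y_hat, max(s2, 0.0)  # clip tiny negative rounding errors

# Toy 1-D data: hypothetical rotation angles (degrees) and a made-up response.
X = np.linspace(0.0, 180.0, 7)[:, None]
y = np.sin(np.radians(X[:, 0]))
model = kriging_fit(X, y, zeta=np.array([1e-3]), rho=np.array([2.0]))
y_train, s2_train = kriging_predict(model, X[2])            # at a sampled design
y_mid, s2_mid = kriging_predict(model, np.array([45.0]))    # between samples
```

At a sampled design the prediction reproduces the stored response and the estimated uncertainty is close to zero, whereas between samples the uncertainty grows; this is the behaviour that the prescreening step relies on.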


A number of prescreening methods are available for the appraisal of the quality of a candidate design with respect to the predicted value in Eq. (2) and the prediction uncertainty in Eq. (6)48. In SADEA-I, the lower confidence bound (LCB) method49 is used. If the predictive distribution of y(x) is \(N({\hat{y}}(x), {\hat{s}}^2(x))\), then the LCB prescreening of y(x) can be estimated as follows:

$$\begin{aligned} \begin{array}{ll} {\hat{y}}(x)-L {\hat{s}}(x) \\ L \in [0,3] \\ \end{array} \end{aligned}$$


where L is a constant that is often set to 2 to have a good balance between exploration and exploitation48.
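As a small worked illustration of the LCB prescreening, the sketch below ranks four candidate designs from made-up surrogate outputs. The numbers are hypothetical, and the objective is assumed to be minimized, so the candidate with the smallest LCB value is preferred.

```python
import numpy as np

def lcb(y_hat, s2_hat, L=2.0):
    """Lower confidence bound y_hat - L * s_hat for a minimization problem."""
    return np.asarray(y_hat) - L * np.sqrt(np.asarray(s2_hat))

# Hypothetical surrogate outputs for four candidate designs.
y_hat = np.array([0.50, 0.42, 0.45, 0.60])           # predicted values
s2_hat = np.array([0.0001, 0.0004, 0.0100, 0.0025])  # prediction variances

scores = lcb(y_hat, s2_hat)     # 0.48, 0.38, 0.25, 0.50
best = int(np.argmin(scores))   # index of the candidate to simulate next
```

With L = 2, the third candidate is selected even though the second has the lower predicted value, because its larger uncertainty leaves more room for improvement. Setting L = 0 would reduce the rule to pure exploitation of the predicted values.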

Evolutionary computation

The EA driver in SADEA-I is differential evolution (DE)50. DE is a popular EA widely used in engineering optimization, and it has been reported to outperform many other EAs for continuous optimization problems50. Suppose that \(P_{designs}\) is a population of candidate designs in the aperture optimization process. Let \(x \in R\) be a candidate design (individual solution) in \(P_{designs}\). To generate a child solution C for x, mutation is first carried out to produce a donor vector:

$$\begin{aligned} v^{i}=x^{best}+F \cdot (x^{r_1}-x^{r_2}) \end{aligned}$$


where \(x^{best}\) is the best individual of the current population having a size of \(P_{designs}\) by 1, and \(x^{r_1}\) and \(x^{r_2}\) are two distinct solutions randomly selected from \(P_{designs}\); \(v^{i}\) is the \(i^{th}\) mutant vector in the population after mutation; \(F\in (0,2]\) is the scaling factor (a control parameter). The mutation strategy in Eq. (9) is called DE/best/1. After the mutation is completed, the crossover operator is applied to produce the child, C, as follows:

  1. Randomly select a variable index \(j_{rand} \in \{1, \ldots , P_{designs}\}\).

  2. For each \(j=1\) to \(P_{designs}\), generate a uniformly distributed random number rand from (0, 1) and set:

    $$\begin{aligned} C_j = \left\{ \begin{array}{ll} v_j, &{}\quad \text {if } (rand\le CR) \; | \; j=j_{rand} \\ x_j, &{}\quad \text {otherwise} \\ \end{array} \right. \end{aligned}$$

    where \(CR\in [0,1]\) is the crossover rate (a constant).

Note that since the EA process is 1-D, the DE mutation and crossover operations (Eqs. (9) and (10), respectively) are implemented using populations with a size of \(P_{designs}\) by 1, as detailed above. Additional details on how DE mutation and crossover operations are implemented, both generally and specifically, can be found in50.
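The mutation and crossover operations can be sketched for the 1-D case as follows. This is one reading of the description above, in which the population is a \(P_{designs}\) by 1 column and the crossover index j runs over its entries. The function name, the crossover rate CR = 0.8, the 0 to 360 degree bounds with clipping, and the toy fitness are assumptions made for illustration, and minimization is assumed.

```python
import numpy as np

def de_best_1_children(pop, fitness, F=0.8, CR=0.8, bounds=(0.0, 360.0), rng=None):
    """DE/best/1 mutation followed by binomial crossover on a 1-D population."""
    rng = np.random.default_rng() if rng is None else rng
    pop = np.asarray(pop, dtype=float)
    P = pop.size
    x_best = pop[np.argmin(fitness)]            # best design (minimization assumed)

    # Mutation: one donor per population member from two distinct random members.
    donors = np.empty(P)
    for i in range(P):
        r1, r2 = rng.choice(P, size=2, replace=False)
        donors[i] = x_best + F * (pop[r1] - pop[r2])
    donors = np.clip(donors, *bounds)           # bound handling is an assumption

    # Crossover: take the donor where rand <= CR, and always at index j_rand.
    j_rand = rng.integers(P)
    take_donor = rng.random(P) <= CR
    take_donor[j_rand] = True
    return np.where(take_donor, donors, pop)

rng = np.random.default_rng(0)
pop = rng.uniform(0.0, 360.0, size=20)          # hypothetical rotation angles
fitness = np.cos(np.radians(pop))               # made-up objective values
children = de_best_1_children(pop, fitness, rng=rng)
```

Forcing the donor at index \(j_{rand}\) guarantees that the child population differs from the parents in at least one entry, even when CR is set to 0.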

The SADEA-I method

The essential steps of SADEA-I for the lens-loaded aperture optimization are described as follows21:

  • Step 1: Using the Latin Hypercube sampling method51, a small number (\(\alpha\)) of designs are sampled from the design space of the lens-loaded aperture, and full-wave EM simulations are carried out to obtain their performances. The initial database is created using these designs and their simulation results.

  • Step 2: If a preset stopping criterion such as the maximum number of EM simulations is met, output the best design from the database; otherwise go to Step 3.

  • Step 3: Select the \(\gamma\) best designs from the database to form a population, \(P_{designs}\), having a size of \(P_{designs}\) \(\times\) 1, and update the best solution obtained so far.

  • Step 4: Apply DE mutation and crossover operations (Eqs. (9) and (10), respectively) on \(P_{designs}\) (the size is as described in Step 3) to generate child populations having \(\gamma\) child solutions each.

  • Step 5: For every candidate design in each population, build a GP surrogate model using the designs in the database that are nearest to it (based on Euclidean distance) and their simulation results as the training data points.

  • Step 6: Use the surrogate models in Step 5 to prescreen the child solutions in Step 4 according to Eq. (8), and select the best child solution based on the LCB values.

  • Step 7: Evaluate (simulate) the prescreened best child solution from Step 6. Add it and its simulation results to the database. Go back to Step 2.

In terms of algorithm parameters (see section “Example and discussion”), \(\alpha\) = 20, \(\gamma\) = 20 and F = 0.8 are used.
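The seven steps above can be put together in a compact sketch for a 1-D design variable. It is an illustration under stated assumptions rather than the SADEA-I code: `toy_simulation` is a cheap made-up stand-in for the full-wave EM simulation, the 0 to 180 degree search range, the crossover rate `CR`, the number of nearest designs `n_near`, the fixed GP parameters, the `nugget`, the duplicate tolerance and the budget `max_sims` are all hypothetical, and the objective is assumed to be minimized.

```python
import numpy as np

def toy_simulation(angle_deg):
    """Cheap stand-in for a full-wave EM simulation (made-up cost, minimized)."""
    a = np.radians(angle_deg)
    return np.sin(3 * a) + 0.5 * np.cos(5 * a)

def corr(a, b, zeta, rho):
    """Gaussian correlation between two sets of 1-D designs."""
    return np.exp(-zeta * np.abs(a[:, None] - b[None, :]) ** rho)

def gp_predict(X, y, x_new, zeta=1e-3, rho=2.0, nugget=1e-6):
    """Kriging prediction and uncertainty at x_new (fixed zeta, rho: assumption)."""
    n = X.size
    Z = corr(X, X, zeta, rho) + nugget * np.eye(n)
    one = np.ones(n)
    z = corr(np.array([x_new]), X, zeta, rho)[0]
    Zi_one, Zi_z = np.linalg.solve(Z, one), np.linalg.solve(Z, z)
    mu = (Zi_one @ y) / (Zi_one @ one)
    resid = y - mu
    sigma2 = resid @ np.linalg.solve(Z, resid) / n
    y_hat = mu + Zi_z @ resid
    s2 = sigma2 * (1 - z @ Zi_z + (1 - one @ Zi_z) ** 2 / (one @ Zi_one))
    return y_hat, max(s2, 0.0)

def sadea_1d(simulate, lo, hi, alpha=20, gamma=20, F=0.8, CR=0.8,
             L=2.0, n_near=10, max_sims=40, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: 1-D Latin hypercube sample, one design per equal-width stratum.
    X = lo + (hi - lo) * (rng.permutation(alpha) + rng.random(alpha)) / alpha
    y = np.array([simulate(x) for x in X])
    # Step 2: stop once the simulation budget is spent.
    while X.size < max_sims:
        # Step 3: the gamma best designs form the population (pop[0] is the best).
        pop = X[np.argsort(y)[:gamma]]
        # Step 4: DE/best/1 mutation and crossover.
        r = np.array([rng.choice(pop.size, 2, replace=False) for _ in range(pop.size)])
        donors = np.clip(pop[0] + F * (pop[r[:, 0]] - pop[r[:, 1]]), lo, hi)
        mask = rng.random(pop.size) <= CR
        mask[rng.integers(pop.size)] = True
        children = np.where(mask, donors, pop)
        # Guard (assumption): skip children that repeat an already simulated design.
        is_new = np.min(np.abs(children[:, None] - X[None, :]), axis=1) > 1e-3
        if not is_new.any():
            break
        # Steps 5-6: local GP model per child, then LCB prescreening.
        scores = np.full(children.size, np.inf)
        for k in np.flatnonzero(is_new):
            near = np.argsort(np.abs(X - children[k]))[:n_near]
            y_hat, s2 = gp_predict(X[near], y[near], children[k])
            scores[k] = y_hat - L * np.sqrt(s2)
        best_child = children[int(np.argmin(scores))]
        # Step 7: simulate the prescreened child and add it to the database.
        X = np.append(X, best_child)
        y = np.append(y, simulate(best_child))
    best = int(np.argmin(y))
    return X[best], y[best], X.size

x_best, y_best, n_sims = sadea_1d(toy_simulation, 0.0, 180.0)
```

Only one new simulation is run per iteration, which is what keeps the simulation count low when each evaluation takes more than an hour; all other child solutions are judged by the surrogate alone.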
