
  1. (College of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, China)



Keywords: InP HBT, small-signal model, GWO algorithm, GWO-SVR algorithm

I. INTRODUCTION

InP HBTs are widely used in high-frequency circuits owing to their outstanding performance. Superior power efficiency can be obtained even at low frequencies thanks to the high current gain and power density of InP HBTs [1]. Moreover, their good uniformity, small chip area, and high thermal conductivity make them widespread in IC design [2-11]. InP HBTs also exhibit low flicker noise, which makes them an ideal choice for Low Noise Amplifiers (LNAs) [12-14]. Additionally, they are widely employed in oscillators, frequency dividers, and frequency multipliers owing to their high maximum oscillation frequency, excellent stability, and outstanding phase and amplitude noise [15-19].

To use HBTs effectively in computer-aided design, a stable and accurate HBT small-signal model is imperative, given the growing application requirements. Previous research on InP HBT modeling has produced various techniques. An approach proposed in 2018 [20], based on an enhanced small-signal model that incorporates the distribution effect of the base and collector feedlines, allows the base parasitic capacitance to be extracted and the model to be further refined; with this method, the error between the predicted and measured S-parameters is within 3.5%. A study in 2020 introduced sensitivity analysis to derive optimal solutions for the intrinsic parameters [21]. A nonlinear model accounting for the dc/ac frequency dispersion effect followed in 2021 [22]. To enhance the accuracy of the S-parameters of the small-signal model, researchers have considered not only parasitic effects in the collector part [23] but also emitter-collector capacitance effects [24]; even so, the maximum errors of $S_{11}$ and $S_{22}$ reach 3.5% and 5%, respectively. Other studies have refined the small-signal model and noise model by analyzing the fabrication process and structure of InP HBTs [25-28].

To improve the efficiency of model building, machine learning (ML) has been employed and has proved valuable in creating small-signal models of HBTs [29]. Reference [30] describes the Genetic Algorithm-Extreme Learning Machine (GA-ELM) model, a neural network optimized by a genetic algorithm, which significantly improves the accuracy of a GaN P-HEMT model. Nevertheless, constructing high-precision neural networks demands substantial training data and time, and adding an optimization algorithm does not necessarily prevent overfitting or entrapment in local optima [31]. To avoid trapping the model in a local optimum, reference [32] proposes a Snake Optimizer-Back Propagation (SO-BP) neural network model. Although the SO-BP model performs excellently in small-signal modeling, it has high computational complexity and its parameters are difficult to tune. In this paper, the GWO-SVR algorithm is proposed to model the small-signal behavior of InP HBTs.

The paper is structured as follows. Section II selects appropriate parameters and methods for small-signal prediction by analyzing the model ports. Section III introduces the GWO-SVR algorithm and applies it to predict the InP HBT model. Section IV verifies the feasibility of applying the GWO-SVR algorithm to the InP HBT small-signal model. Finally, Section V summarizes the findings and contributions.

II. SMALL-SIGNAL MODEL OF INP HBT

The small-signal equivalent circuit of the InP HBT and its two-port network are shown in Fig. 1. The equivalent circuit comprises an intrinsic model and an extrinsic model. The intrinsic part includes the intrinsic base resistance ($R_{\rm bi}$), base-collector capacitance ($C_{\rm bc}$), base-emitter resistance ($R_{\rm be}$), base-emitter capacitance ($C_{\rm be}$), and base-collector resistance ($R_{\rm bc}$); here $\alpha$ represents the common-base current amplification factor. The extrinsic circuit model comprises the base, collector, and emitter pad parasitic inductances ($L_{\rm b}$, $L_{\rm c}$, $L_{\rm e}$) as well as the pad parasitic capacitances ($C_{\rm pbc}$, $C_{\rm pbe}$, $C_{\rm pce}$). The extrinsic base, collector, and emitter resistances ($R_{\rm b}$, $R_{\rm c}$, $R_{\rm e}$) are also included in the model, as is the extrinsic base-collector capacitance ($C_{\rm bcx}$). Finally, $a_{1}$ and $a_{2}$ represent the normalized incident waves at the input and output ports, respectively, while $b_{1}$ and $b_{2}$ represent the corresponding normalized reflected waves; $I_{1}$, $I_{2}$ are the port currents and $U_{1}$, $U_{2}$ the port voltages.

Support Vector Regression (SVR), a small-sample learning method, offers significant advantages for nonlinear problems [33]. It applies a nonlinear transformation to map an actual nonlinear problem into a high-dimensional feature space, where the problem can be solved by constructing a linear decision function [34]. SVR also minimizes structural risk and searches for the globally optimal solution [35].

Fig. 1. InP HBT small-signal equivalent circuit and its two-port network.


III. MATERIALS AND METHODS

1. Support Vector Regression

SVR is a statistical machine learning method developed from the Support Vector Machine (SVM) and used primarily for regression prediction [36]. The objective of SVR is to construct a hyperplane in a high-dimensional space for the input samples $x$, defined as shown in Eq. (1).

(1)
$ f(x)=w^{T} x+b , $

where $w$ is the normal vector that determines the direction of the hyperplane, and $b$ is the displacement term that determines the intercept between the hyperplane and the origin [37]. This construction ensures that the samples closest to the hyperplane lie as far from it as possible.

If all samples fell exactly on this hyperplane, the regression error would be zero, signifying a perfect prediction function. Since it is unlikely that all samples fit the defined linear function, a constant $\xi$ is introduced in SVR as the allowable tolerance on the approximation error for each input $x$. In regression prediction, $\xi$ defines a tolerable deviation band on both sides of the hyperplane: sample values falling between $w^{T}x+b-\xi$ and $w^{T}x+b+\xi$, as shown in Fig. 2, incur a loss approaching 0.

Fig. 2. Schematic diagram of support vector regression.


To ensure generalization ability, it is crucial to choose a value of $\xi$ that allows most samples to have a loss close to 0. If $\xi$ is too small, the model is forced to fit the training data very tightly, and the risk of overfitting increases significantly [38]. Therefore, slack variables $\zeta$ are introduced, representing the distances of samples lying outside the band from $w^{T}x+b-\xi$ to $w^{T}x+b+\xi$ with respect to the boundary. The SVR model can then be defined as

(2)
$ {\mathop{\min }\limits_{w,b,\zeta _{i} ,\zeta _{i}^{*}}} \frac{1}{2} \left\| w\right\| ^{2} +C\sum _{i=1}^{m}(\zeta _{i} + \zeta _{i}^{*} ), $

subject to the constraints

(3)
$ f(x_{i} )-y_{i} \le \xi +\zeta _{i} ,\\ $
(4)
$ y_{i} -f(x_{i} )\le \xi +\zeta _{i}^{*},\\ $
(5)
$ \zeta _{i} \ge 0, \zeta _{i}^{*} \ge 0, (i=1, 2, ..., m), $

where $C$ is a regularization constant that balances maximizing the margin against minimizing the training error, and $x$ represents the original sample data. A larger $C$ makes the model more inclined to reduce the training-set error, increasing model complexity and tightening the fit to the training data; however, excessive focus on the training set may overlook the overall patterns, leading to overfitting. Conversely, a smaller $C$ tolerates larger training-set errors and yields a smoother, simpler function, which may result in underfitting if the model fails to capture the data characteristics. The minimization problem can be transformed into its dual problem by introducing Lagrange multipliers in Eq. (2) and constructing the Lagrange function. The constraints are obtained by setting the partial derivatives with respect to $w$, $b$, $\zeta_{i}$, and $\zeta_{i}^{*}$ to zero. Combining these constraints with the Sequential Minimal Optimization (SMO) algorithm, the SVR model can be defined as [39]

(6)
$ f(x)=\sum _{i=1}^{m}(\alpha _{i}^{*} -\alpha _{i} )x_{i}^{T} x+b, $

where $\alpha_i$ and $\alpha^{*}_i$ are the Lagrange multipliers. The inner product $x_{i}^{T}x$ in this equation can be replaced by an SVM kernel function, whose definition is given by Eq. (7).

(7)
$ k(x_{i} ,x_{j} )=\phi (x_{i} )^{T} \phi (x_{j} ) . $

Fig. 3. The two-dimensional data is projected into the three-dimensional feature space by $\phi$.

where $\phi$ is the mapping function, which maps the input $x_{i}$ into a higher-dimensional feature space and increases the probability that the samples become linearly separable, as shown in Fig. 3; $k(x_{i}, x_{j})$ is a kernel function that computes the inner product of the feature vectors in the high-dimensional space, enabling accurate predictions even when the samples cannot be linearly separated. Thus, SVR can be modified as

(8)
$ f(x)=\sum _{i=1}^{m}(\alpha _{i}^{*} -\alpha _{i} )k(x,x_{i} )+b. $

The RBF kernel is one of several available kernels, alongside the linear, polynomial, and sigmoid kernels, and has been found to exhibit superior performance in various application areas, including small-signal model prediction [40,41]. The RBF kernel is defined by Eq. (9).

(9)
$ k(x_{i} ,x_{j} )=e^{-g\left\| x_{i} -x_{j} \right\| ^{2} } , $

where $g$ determines the influence of a single sample on the hyperplane. A larger $g$ enhances the model's ability to fit local data points, making the decision boundary more complex and sensitive. When $g$ is too large, the model overfits: it overemphasizes the local details of the training data and performs poorly on test data; even if the training error is minimal, the generalization capability declines significantly. Conversely, a smaller $g$ strengthens the model's global fitting ability, yielding a smoother and simpler decision boundary. When $g$ is too small, the model becomes overly smooth, failing to capture complex data patterns and producing large errors in both training and testing. The selection of $g$ has been analyzed extensively in the literature with the aim of achieving the best generalization ability [42]. As $g$ increases, individual samples are more likely to become support vectors, so their number grows. Consequently, adjusting $\xi$ and $\zeta$ is crucial when using the SVR model for prediction, to ensure globally optimal results and minimize risk.
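To make the role of $g$ concrete, the following minimal sketch (illustrative values only, not taken from the paper) evaluates the RBF kernel of Eq. (9) for one pair of points at several values of $g$:

```python
# A small sketch (toy values) of the RBF kernel in Eq. (9),
# k(xi, xj) = exp(-g * ||xi - xj||^2), showing how g controls locality.
import numpy as np

def rbf_kernel(xi: np.ndarray, xj: np.ndarray, g: float) -> float:
    """RBF kernel value for two sample vectors."""
    return float(np.exp(-g * np.sum((xi - xj) ** 2)))

xi, xj = np.array([0.0, 0.0]), np.array([1.0, 1.0])
for g in (0.01, 1.0, 100.0):
    # Large g: fast decay, each sample influences only its close neighbourhood
    # (complex, overfit-prone boundary). Small g: slow decay, smoother global fit.
    print(f"g={g:6.2f}  k(xi, xj) = {rbf_kernel(xi, xj, g):.6f}")
```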

The accuracy, generalization ability, and small-sample performance of the SVR model all depend on optimal parameter selection. For an RBF kernel, the parameters with the greatest impact on prediction performance are $C$ and $g$. Therefore, the Grey Wolf Optimization (GWO) algorithm is adopted to obtain the optimal values of $C$ and $g$, saving computation time while improving reliability.
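As an illustration of how these two hyperparameters enter the model, the sketch below fits an RBF-kernel SVR on synthetic data. The use of scikit-learn's `SVR` (whose `gamma` plays the role of $g$ and `epsilon` the role of $\xi$) and the toy data are assumptions for illustration, not the authors' implementation:

```python
# A minimal sketch, assuming scikit-learn and synthetic data, of how C and g
# enter an RBF-kernel SVR; gamma corresponds to g and epsilon to the tube xi.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, 80)).reshape(-1, 1)   # toy 1-D inputs
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)   # noisy targets

for C, g in [(0.1, 0.1), (10.0, 0.1), (10.0, 50.0)]:
    model = SVR(kernel="rbf", C=C, gamma=g, epsilon=0.05).fit(X, y)
    mse = np.mean((model.predict(X) - y) ** 2)
    # Larger C fits the training data more tightly; larger g makes the fit
    # more local and more prone to overfitting.
    print(f"C={C:5.1f}  g={g:5.1f}  support vectors={len(model.support_)}  "
          f"train MSE={mse:.4f}")
```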

2. Grey Wolf Optimization

GWO simulates the hunting behavior of grey wolves mathematically. The grey wolf population is divided into four distinct groups: alpha ($\alpha$), beta ($\beta$), delta ($\delta$), and omega ($\omega$). The alpha wolves act as leaders and are responsible for making decisions, while the beta wolves assist the alpha group in decision making. The delta wolves hold the third-highest position in the social hierarchy and submit to the dominance of the higher-ranking wolves. The remaining wolves, categorized as omega, hold the lowest rank. As illustrated in Fig. 4, the tracking, surrounding, and attacking behavior of the omega wolves is directed by the alpha, beta, and delta wolves [43].

Fig. 4. Principle of the Grey Wolf algorithm.


The mathematical model of the grey wolf algorithm consists of three steps: encircling, hunting, and attacking the prey. The encircling behavior can be mathematically described as

(10)
$ \overrightarrow{D}=\left|\overrightarrow{C}\times \overrightarrow{X_{p} }(t)-\overrightarrow{X}(t)\right|,\\ $
(11)
$ \overrightarrow{X}(t+1)=\overrightarrow{X_{p} }(t)-\overrightarrow{A}\times \overrightarrow{D} , $

where $\overrightarrow{A}$ and $\overrightarrow{D}$ are the coefficient vectors; $\overrightarrow{X_p}\left(t\right)$ and $\overrightarrow{X}(t)$ are the current position vectors of the prey and the grey wolf, respectively; $t$ is the current iteration.

The coefficient vectors are $\overrightarrow{A}=2\overrightarrow{a}\times\overrightarrow{r_1}-\overrightarrow{a}$ and $\overrightarrow{C}=2\times \overrightarrow{r_2}$, where the components of $\overrightarrow{a}$ decrease linearly from 2 to 0 over the course of the iterations, and $\overrightarrow{r_1}$, $\overrightarrow{r_2}$ are random vectors in $[0, 1]$. If $\left|\overrightarrow{A}\right|$ is less than 1, the wolves attack the prey, refining the search around the current solution; if $\left|\overrightarrow{A}\right|$ is greater than 1, the wolves move away from the prey and seek a more suitable target. Drawing $\overrightarrow{r_1}$ from $[0, 1]$ balances these two search modes and helps avoid being trapped in locally optimal solutions. The vector $\overrightarrow{C}$ contains search coefficients that assign random weights to the prey; drawing $\overrightarrow{r_2}$ randomly from $[0, 1]$ keeps the exploration stochastic from the first iteration through to the last. This promotes a global search of the decision space and thus prevents the optimization from getting trapped in a local optimum, particularly in the middle and late stages.

The hunting behavior is usually guided by the alpha, as shown in Eqs. (12)-(14). $\overrightarrow{D_\alpha}$, $\overrightarrow{D_\beta}$, $\overrightarrow{D_\delta}$ are the distances between $\alpha$, $\beta$, $\delta$ and other individuals. $\overrightarrow{C_1}$, $\overrightarrow{C_2}$, $\overrightarrow{C_3}$ are random vectors and $\overrightarrow{X}$ is the current position of the grey wolf. $\overrightarrow{X_1}$, $\overrightarrow{X_2}$, and $\overrightarrow{X_3}$ determine the step length and direction of each individual wolf in the pack towards $\alpha$, $\beta$, and $\delta$, respectively. The final position of the wolf is determined by $\overrightarrow{X}(t+1)$.

(12)
$ \overrightarrow{D_{\alpha}} =\left|\overrightarrow{C_{1}} \times \overrightarrow{X_{\alpha}} -\overrightarrow{X}\right|, \quad \overrightarrow{D_{\beta}} =\left|\overrightarrow{C_{2}} \times \overrightarrow{X_{\beta}} -\overrightarrow{X}\right|, \quad \overrightarrow{D_{\delta}} =\left|\overrightarrow{C_{3}} \times \overrightarrow{X_{\delta}} -\overrightarrow{X}\right| , $
(13)
$ \overrightarrow{X_{1}} =\overrightarrow{X_{\alpha}} -\overrightarrow{A_{1}} \times \overrightarrow{D_{\alpha}}, \quad \overrightarrow{X_{2}} =\overrightarrow{X_{\beta}} -\overrightarrow{A_{2}} \times \overrightarrow{D_{\beta}}, \quad \overrightarrow{X_{3}} =\overrightarrow{X_{\delta}} -\overrightarrow{A_{3}} \times \overrightarrow{D_{\delta}} , $
(14)
$ \overrightarrow{X}(t+1)=\frac{\overrightarrow{X_{1}} +\overrightarrow{X_{2}} +\overrightarrow{X_{3}} }{3} . $
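The update rules of Eqs. (10)-(14) condense into a few lines of code. The sketch below is a toy Python implementation under simple assumptions (a minimization problem, identical scalar bounds in every dimension); it is not the authors' code:

```python
# A compact sketch of the GWO update of Eqs. (10)-(14) for minimizing a fitness
# function over the box [lb, ub]^dim; a toy implementation, not the authors' code.
import numpy as np

def gwo(fitness, dim, n_wolves=10, n_iter=10, lb=0.001, ub=300.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_wolves, dim))            # initial wolf positions
    for t in range(n_iter):
        scores = np.array([fitness(x) for x in X])
        alpha, beta, delta = X[np.argsort(scores)[:3]]  # three leading wolves
        a = 2.0 * (1.0 - t / n_iter)                    # a decays linearly 2 -> 0
        for i in range(n_wolves):
            x_new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2.0 * a * r1 - a, 2.0 * r2       # coefficient vectors
                D = np.abs(C * leader - X[i])           # Eq. (12)
                x_new += leader - A * D                 # Eq. (13)
            X[i] = np.clip(x_new / 3.0, lb, ub)         # Eq. (14), kept in bounds
    scores = np.array([fitness(x) for x in X])
    return X[np.argmin(scores)]                         # best position found
```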

3. GWO-Based SVR

The SVR model, known for achieving high accuracy from a limited number of samples through theoretical analysis, shows good stability and generalization capacity on nonlinear problems. The selection of appropriate parameters, namely the penalty factor $C$ and the kernel parameter $g$, significantly impacts the predictive performance of the SVR model. Typically, the optimal $C$ and $g$ are selected or tuned manually for each specific problem, which entails considerable work and yields low reliability. To address this issue, the GWO is employed to automatically identify the optimal values of $C$ and $g$ within a designated range, offering both speed and a degree of generalization across diverse problems. Specifically, the GWO-SVR algorithm proposed in this study follows the steps detailed below, with the flowchart shown in Fig. 5.

Fig. 5. Flowchart of GWO-SVR algorithm.


1. First, the data is partitioned into a training set and a test set; the ratio between them significantly impacts the fitting accuracy in regression prediction problems. To evaluate the small-sample properties of the SVR algorithm, two variants are considered: GWO-SVRL, with a 30% training set and a 70% test set, and GWO-SVR, with a 70% training set and a 30% test set [44]. The contrasting ratios reveal the algorithm's behavior under varying sample sizes.

2. Next, the GWO-related parameters are initialized. The number of wolves affects the balance between search accuracy and computational efficiency, while a reasonable number of iterations avoids overfitting and local optima. Both the number of wolves and the number of iterations are set to 10; in pre-experimental tests this value balanced accuracy and time consumption, with only slight performance gains beyond it. The lower and upper bounds of the parameters $C$ and $g$ are fixed at 0.001 and 300, respectively, to constrain their values to a feasible range.

3. Then, the fitness of each wolf's current position is computed by applying the SVR model, and the wolves' positions are updated based on the calculated fitness values, in line with the algorithm's objective of predicting the complex S-parameters. The fitness function is the mean squared error (MSE) of Eq. (15), chosen for its effectiveness in quantifying the disparity between the predictions and the sample values.

(15)
$ MSE=\frac{1}{N} \sum _{j=1}^{N}\left(\hat{y}_{j} -y_{j} \right)^{2}. $

4. After the maximum number of iterations is reached, the parameters with the best fitness within the defined range are obtained. These optimized parameters are then used in the SVR model to produce the final regression predictions.
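Putting the four steps together, a sketch of the complete loop is given below. It reuses the gwo() sketch above, assumes scikit-learn's SVR (with `gamma` playing the role of $g$), and assumes arrays X_train, y_train, X_test, y_test holding the 70/30 split; evaluating the MSE fitness of Eq. (15) on a held-out slice of the training set is one reasonable reading of step 3, not necessarily the authors' exact choice:

```python
# A sketch of steps 1-4: GWO searches (C, g) in [0.001, 300] with 10 wolves and
# 10 iterations; X_train/y_train/X_test/y_test are assumed to hold the 70/30 split.
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)   # inner validation slice

def fitness(position):
    C, g = position                                    # wolf position = (C, g)
    model = SVR(kernel="rbf", C=C, gamma=g).fit(X_tr, y_tr)
    return mean_squared_error(y_val, model.predict(X_val))  # Eq. (15)

best_C, best_g = gwo(fitness, dim=2, n_wolves=10, n_iter=10, lb=0.001, ub=300.0)
final_model = SVR(kernel="rbf", C=best_C, gamma=best_g).fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, final_model.predict(X_test)))
```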

IV. RESULTS AND DISCUSSION

In this study, the device used to verify the accuracy of the small-signal model is an InP HBT fabricated by the Institute of Microelectronics, Chinese Academy of Sciences, with an emitter area of 1 $\mu$m $\times$ 15 $\mu$m. The S-parameters were measured with an Agilent 8510C vector network analyzer, and the direct-current (DC) bias was supplied to the device under test by an Agilent B1500A semiconductor device analyzer. The testing process was controlled by IC-CAP software, and the S-parameters were measured on-wafer over the frequency range 0.1-40 GHz. The collected dataset consists of S-parameter measurements at 25 different bias points over the 0.1-40 GHz range (in 0.1 GHz steps). Each set of data includes the real and imaginary parts of the four S-parameters ($S_{11}$, $S_{12}$, $S_{21}$, $S_{22}$), for a total of $25\times 8\times 400 = 80{,}000$ data points. The algorithm is executed on an i7-12700 processor. The models are first trained on the training set, and their accuracy is then assessed on the test set.
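The paper does not specify how the measurements are arranged for training; one plausible layout, sketched below, uses (frequency, $I_{\rm c}$, $V_{\rm ce}$) as features and the real or imaginary part of one S-parameter as the target, then splits 70/30. The array names and shapes are assumptions for illustration:

```python
# A sketch (the measurement layout is an assumption) of assembling the dataset:
# s_meas is a hypothetical complex array of shape (25, 400, 4) holding S11, S12,
# S21, S22 at 25 bias points and 400 frequencies; bias is (25, 2) with (Ic, Vce).
import numpy as np
from sklearn.model_selection import train_test_split

freqs = np.linspace(0.1, 40.0, 400)              # 0.1-40 GHz in 0.1 GHz steps

def build_dataset(s_meas, bias, s_index=0, part="real"):
    X, y = [], []
    for b in range(s_meas.shape[0]):
        for f, freq in enumerate(freqs):
            X.append([freq, bias[b, 0], bias[b, 1]])
            v = s_meas[b, f, s_index]
            y.append(v.real if part == "real" else v.imag)
    return np.asarray(X), np.asarray(y)

# X, y = build_dataset(s_meas, bias, s_index=0, part="real")
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
```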

1. The Predicted DC Results

Fig. 6 displays the measured and predicted DC characteristics $I_{\rm c}$-$V_{\rm ce}$. The GWO-SVR model uses a 70% training set and GWO-SVRL a 30% training set. For comparison, modeling results from an Extreme Learning Machine (ELM) and a plain SVR model, both trained on 70% of the data, are also presented. All four models fit the measured data effectively; however, as $V_{\rm ce}$ decreases, the GWO-SVR model outperforms the other three in predictive accuracy. The prediction error is defined as

(16)
$ Error_{D} =\frac{1}{N} \sum _{i=1}^{N}\frac{\left|A_{i} -B_{i} \right|}{\left|A_{i} \right|} , $

where $N$ is the total number of $I_{\rm c}$ (or $V_{\rm ce}$) points, $A_{i}$ is the measured value of $I_{\rm c}$, and $B_{i}$ is the predicted value of $I_{\rm c}$. Table 1 presents the DC prediction errors of the four models. The predictions are conducted at $I_{\rm b}=0.02$ mA, 0.06 mA, 0.14 mA, 0.22 mA, 0.30 mA, and 0.34 mA, and the average prediction error of $I_{\rm c}$, denoted Err$_{\rm D}$, is listed in the last row of Table 1.
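Eq. (16) maps directly to code; a short sketch, assuming the measured and predicted $I_{\rm c}$ values along one $I_{\rm b}$ curve are stored as plain arrays:

```python
# A direct transcription of Eq. (16): mean relative error between measured (A)
# and predicted (B) collector currents along one Ib curve.
import numpy as np

def error_d(A, B):
    """A, B: real arrays of measured and predicted Ic. Returns a fraction."""
    return float(np.mean(np.abs(A - B) / np.abs(A)))
```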

Fig. 6. The measured and predicted results of ${I}_{\rm c}$-$V_{\rm ce}$.


Table 1. Comparison of the DC prediction errors employing the four models.

Error$_{\rm D}$(%)       ELM    SVR    GWO-SVRL   GWO-SVR
$I_{\rm b}$=0.02 mA      3.4    3.1    3.1        0.83
$I_{\rm b}$=0.06 mA      4.6    5.1    2.2        0.21
$I_{\rm b}$=0.14 mA      5.5    5.2    3.6        0.28
$I_{\rm b}$=0.22 mA      4.1    3.2    2.0        0.12
$I_{\rm b}$=0.30 mA      3.4    1.2    2.8        0.31
$I_{\rm b}$=0.34 mA      2.2    1.7    1.9        0.12
Err$_{\rm D}$            3.8    3.3    2.3        0.24

The simulation results above demonstrate that the proposed GWO-SVR model can accurately model the DC behavior of the InP HBT. As shown in Table 1, the GWO-SVRL model achieves good prediction results with a training set of only 30%, and when the training set is increased to 70% in the GWO-SVR model, the prediction errors remain small under all bias conditions.

2. The Predicted Results of S-parameters

The experiments are validated on data measured at $I_{\rm c} = 21$ mA, $V_{\rm ce} = 3$ V and at $I_{\rm c} = 6$ mA, $V_{\rm ce} = 1.2$ V, and the four prediction methods are assessed under these two conditions. The modeled S-parameters, together with the measured data, are depicted in Figs. 7 and 8 on Smith charts over the frequency range 0.1-40 GHz.

Fig. 7. Measured and predicted curves of S-parameters at $I_{\rm c} = 21$ mA and $V_{\rm ce} = 3$ V.


Fig. 8. Measured and predicted curves of S-parameters at $I_{\rm c} = 6$ mA and $V_{\rm ce} = 1.2$ V.


Table 2 lists the prediction errors of the S-parameters for the four models. The error is defined as

(17)
$ Error=\frac{1}{N} \sum _{i=1}^{N}\frac{\left|\left|A_{i} \right|-\left|B_{i} \right|\right|}{\left|A_{i} \right|} , $

where $A_{i}$ represents the measured value, and $B_{i}$ represents the predicted value of the S-parameters.

Table 2. The prediction errors of the S-parameters using the different modeling methods.

Error(%)                               ELM    SVR    GWO-SVRL   GWO-SVR
$I_{\rm c}$=21 mA, $V_{\rm ce}$=3 V
  $S_{11}$                             1.01   0.72   0.41       0.30
  $S_{12}$                             1.61   1.34   0.93       0.74
  $S_{21}$                             0.40   0.38   0.28       0.16
  $S_{22}$                             2.18   2.01   0.91       0.70
$I_{\rm c}$=6 mA, $V_{\rm ce}$=1.2 V
  $S_{11}$                             1.01   0.53   0.35       0.27
  $S_{12}$                             1.61   0.94   0.82       0.71
  $S_{21}$                             0.40   0.43   0.37       0.26
  $S_{22}$                             2.18   1.17   0.49       0.42

It can be observed from the curves in Figs. 7 and 8, as well as the errors in Table 2, that the ELM and SVR models exhibit significant errors, particularly in predicting $S_{12}$ and $S_{22}$. Their curves on the Smith charts deviate noticeably at high frequencies, and this cannot be rectified by adjusting the equivalent small-signal circuit model. By comparison, the small-sample GWO-SVRL model predicts the S-parameters better than the SVR and ELM models, and the results improve further when the training set is increased to 70%, as in the GWO-SVR model. For a given model, the errors in predicting $S_{12}$ and $S_{22}$ are larger than those for $S_{11}$ and $S_{21}$; $S_{12}$ even shows noticeable jitter at high frequencies, deviating from the measured curves. The complex transmission mechanism of $S_{12}$, which represents the internal feedback of the transistor terminated in a matched load, reduces the prediction accuracy. Notably, the real and imaginary parts of $S_{12}$ are predicted separately, and when their absolute values are close to each other the prediction error is larger. Insufficiently dense sampling points at high frequencies also contribute to this result.

It can be clearly seen from Fig. 9 that the MSE of both the training and test sets decreases monotonically as the number of iterations increases, and the test-set MSE never rebounds, showing that the model does not overfit [45]. The figure also shows that setting the number of iterations to 10 balances prediction accuracy and time consumption while avoiding overfitting.

Assessing the small-signal prediction with $R^{2}$ and MSE alone does not capture the practical significance of predicting transistor small-signal models. Therefore, the following equation is employed as the global error to characterize the overall prediction performance:

(18)
$ Error=\frac{1}{4N} \sum _{i,j=1}^{2} \sum _{k=1}^{N}\frac{\left|\left|A_{k} (S_{ij} )\right|-\left|B_{k} (S_{ij} )\right|\right|}{\left|A_{k} (S_{ij} )\right|} , $

where $N$ is the total number of test points, $A_{k}(S_{ij})$ represents the measured data, and $B_{k}(S_{ij})$ the simulated data [46].
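For reference, Eq. (18), and its per-parameter form in Eq. (17), translates directly into code; the sketch below assumes the measured and predicted S-parameters are stored as complex arrays:

```python
# A direct transcription of Eq. (18): the mean relative deviation between the
# magnitudes of measured (A) and predicted (B) S-parameters, averaged over the
# four parameters; Eq. (17) is the same expression applied to a single row.
import numpy as np

def global_error(A, B):
    """A, B: complex arrays of shape (4, N) holding S11, S12, S21, S22 at N
    test points. Returns a fraction (multiply by 100 for percent)."""
    rel = np.abs(np.abs(A) - np.abs(B)) / np.abs(A)   # | |A| - |B| | / |A|
    return float(rel.mean())                          # equals the 1/(4N) double sum
```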

To illustrate the error distribution, 3D plots of the global errors under different bias conditions are shown for ELM, SVR, GWO-SVRL, and GWO-SVR in Figs. 10-13. In these figures, pseudo-colour images, the projections of the 3D surfaces onto the $I_{\rm c}$-$V_{\rm ce}$ plane, are also shown to further illustrate the error distribution. The average global errors of the different models are listed in Table 3.

Based on Figs. 10-13 and Table 3, both the ELM and SVR models exhibit larger prediction errors than the other two models. The ELM model displays noticeable randomness in the distribution of its prediction errors, attributable to the neural network getting trapped in local-optimum cycles within certain ranges, while the prediction errors of the SVR model tend to increase at higher $I_{\rm c}$ and $V_{\rm ce}$. The SVR model whose parameters are optimized by the Grey Wolf algorithm demonstrates superior performance: with a 30% training set the prediction error can be reduced to below 0.55%, and with a 70% training set the overall error can be minimized to just 0.35%. It is also worth noting that the GWO-SVR model's global error reaches its maximum and stabilizes at $V_{\rm ce} > 1.2$ V and $I_{\rm c} > 12$ mA. The global error tends to decrease as $I_{\rm c}$ is reduced at a fixed voltage, while increasing $V_{\rm ce}$ at a fixed collector current $I_{\rm c}$ raises the global error.

Fig. 9. The MSE of train and test data with respect to the number of epochs.


Fig. 10. Global errors of ELM model under different bias conditions.


Fig. 11. Global error of SVR model under different bias conditions.


Fig. 12. Global error of GWO-SVRL model under different bias conditions.


Fig. 13. Global error of GWO-SVR model under different bias conditions.


Table 3. Global errors.

S-parameter   ELM    SVR    GWO-SVRL   GWO-SVR
Error(%)      1.15   0.93   0.57       0.43

V. CONCLUSION

In this paper, a small-signal model of the InP HBT is established using Support Vector Regression (SVR) optimized by the Grey Wolf Optimizer (GWO). The primary objective is to determine the optimal penalty factor $C$ and kernel parameter $g$ so that the globally optimal solution is reached; integrating this solution into the SVR model yields the best prediction performance. Through a controlled comparison forecasting both the DC and small-signal characteristics of a 1 $\mu$m $\times$ 15 $\mu$m InP HBT against measured data, it is demonstrated that the S-parameter prediction error is just 0.28% across the 0.1-40 GHz range and the DC prediction error is only 0.24% when 70% of the data is used for training. These results verify the feasibility and accuracy of the proposed modeling scheme. Furthermore, the predictive accuracy remains robust even with a limited number of training samples, highlighting the small-sample property of the SVR approach.

ACKNOWLEDGMENTS

This work was supported by the Henan Province Young Backbone Teachers Training Program (Grant No. 2023GGJS045), and the Henan Provincial Science and Technology Research Project (Grant No. 242102211103).

References

[1] Y. K. Fukai, K. Kurishima, N. Kashio, M. Ida, S. Yamahata, and T. Enoki, ``Emitter-metal-related degradation in InP-based HBTs operating at high current density and its suppression by refractory metal,'' Microelectronics Reliability, vol. 49, no. 4, pp. 357-364, Apr. 2009.
[2] Y. K. Koh, Y. W. Kim, and M. Kim, ``Performance analysis of custom dual-finger 250 nm InP HBT devices for implementation of 255 GHz amplifiers,'' Electronics, vol. 11, no. 16, p. 2614, Aug. 2022.
[3] V. Radisic, D. W. Scott, A. Cavus, and C. Monier, ``220-GHz high-efficiency InP HBT power amplifiers,'' IEEE Transactions on Microwave Theory and Techniques, vol. 62, no. 12, pp. 3001-3005, Dec. 2014.
[4] V. Radisic, D. Scott, S. Wang, A. Cavus, A. Gutierrez-Aitken, and W. R. Deal, ``235 GHz amplifier using 150 nm InP HBT high-power-density transistor,'' IEEE Microwave and Wireless Components Letters, vol. 21, no. 6, pp. 335-337, Jun. 2011.
[5] S. Yamanaka, K. Sano, and K. Murata, ``A 20 Gs/s track-and-hold amplifier in InP HBT technology,'' IEEE Transactions on Microwave Theory and Techniques, vol. 58, no. 9, pp. 2334-2339, Sep. 2010.
[6] A. Alizadeh, P. V. Rowell, Z. Griffith, and M. J. W. Rodwell, ``A 78 mW 220 GHz power amplifier with peak 18.4% PAE in 250 nm InP HBT technology,'' IEEE Transactions on Microwave Theory and Techniques, vol. 72, no. 1, pp. 1-8, Jan. 2024.
[7] L. Zhang, V. Iyer, J. R. Sheth, and M. J. W. Rodwell, ``F-band distributed active transformer power amplifier achieving 12 Gb/s in InP 130 nm HBT,'' IEEE Transactions on Microwave Theory and Techniques, vol. 72, no. 3, pp. 1696-1705, Mar. 2023.
[8] S. Yoon, I. Lee, M. Urteaga, M. Kim, and S. Jeon, ``A fully integrated 40-222 GHz InP HBT distributed amplifier,'' IEEE Microwave and Wireless Components Letters, vol. 24, no. 7, pp. 460-462, Jul. 2014.
[9] J. Kim, S. Jeon, M. Kim, M. Urteaga, and J. Jeong, ``H-band power amplifier integrated circuits using 250 nm InP HBT technology,'' IEEE Transactions on Terahertz Science and Technology, vol. 5, no. 2, pp. 215-222, Mar. 2015.
[10] V. Iyer, J. Sheth, L. Zhang, and M. J. W. Rodwell, ``A 15.3 dBm, 18.3% PAE F-band power amplifier in 130 nm InP HBT with modulation measurements,'' IEEE Microwave and Wireless Technology Letters, vol. 33, no. 5, pp. 547-550, May 2023.
[11] A. Alizadeh, U. Soylu, N. Sharma, and M. J. W. Rodwell, ``D-band adaptively biased 16-way-combined half-watt power amplifier in 250 nm InP HBT,'' IEEE Transactions on Microwave Theory and Techniques, vol. 73, no. 1, pp. 1-8, Jan. 2025.
[12] P. Shirmohammadi and S. M. Bowers, ``A wideband 2.18-13.51 GHz ultra-low additive phase-noise power amplifier in InP 250 nm HBT,'' Proc. of the 2024 IEEE Radio and Wireless Symposium (RWS), pp. 16-18, Jan. 2024.
[13] V. Chauhan, N. Collaert, and P. Wambacq, ``A 120-140 GHz LNA in 250 nm InP HBT,'' IEEE Microwave and Wireless Components Letters, vol. 32, no. 11, pp. 1315-1318, Nov. 2022.
[14] P. Sakalas, M. Schroter, and H. Zirath, ``mm-wave noise modeling in advanced SiGe and InP HBTs,'' Journal of Computational Electronics, vol. 14, no. 1, pp. 62-71, Mar. 2015.
[15] H. Son, J. Yoo, D. Kim, and M. J. W. Rodwell, ``A 700 GHz integrated signal source based on 130 nm InP HBT technology,'' IEEE Transactions on Terahertz Science and Technology, vol. 13, no. 6, pp. 654-658, Nov. 2023.
[16] T. Jyo, M. Nagatani, H. Wakita, and K. Okada, ``DC-to-150 GHz bandwidth InP HBT mixer module with upper-sideband gain-enhancing function,'' IEEE Transactions on Microwave Theory and Techniques, vol. 72, no. 12, pp. 1-8, Dec. 2024.
[17] S. Veni, P. Andreani, M. Caruso, M. Tiebout, and A. Bevilacqua, ``Analysis and design of a 17 GHz all-npn push-pull class-C VCO,'' IEEE Journal of Solid-State Circuits, vol. 55, no. 9, pp. 2345-2355, Sep. 2020.
[18] J. Jeong, J. Choi, J. Kim, and W. Choe, ``H-band InP HBT frequency tripler using the triple-push technique,'' Electronics, vol. 9, no. 12, p. 2081, Dec. 2020.
[19] U. Soylu, A. Alizadeh, M. Seo, and M. J. W. Rodwell, ``280 GHz frequency multiplier chains in 250 nm InP HBT technology,'' IEEE Journal of Solid-State Circuits, vol. 58, no. 9, pp. 2421-2429, Sep. 2023.
[20] A. Zhang and J. Gao, ``A new method for determination of pad capacitances for GaAs HBTs based on scalable small-signal equivalent circuit model,'' Solid-State Electronics, vol. 150, pp. 45-50, Dec. 2018.
[21] K. Cao, A. Zhang, and J. Gao, ``Sensitivity analysis and uncertainty estimation in small-signal modeling for InP HBT (invited paper),'' International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, vol. 34, no. 5, Aug. 2021.
[22] A. Zhang and J. Gao, ``An improved nonlinear model for millimeter-wave InP HBT including DC/AC dispersion effects,'' IEEE Microwave and Wireless Components Letters, vol. 31, no. 5, pp. 465-468, May 2021.
[23] A. Zhang and J. Gao, ``An improved small-signal model of InP HBT for millimeter-wave applications,'' Microwave and Optical Technology Letters, vol. 63, no. 8, pp. 2160-2164, Aug. 2021.
[24] X. Su, S. Mao, Y. Wang, and J. Gao, ``A parameter extraction method for InP HBT small-signal model considering emitter-collector laminated capacitance effect,'' Microwave and Optical Technology Letters, vol. 66, no. 8, e34285, Aug. 2024.
[25] A. Kanitkar, R. Doerner, T. K. Johansen, and J. Gao, ``Influence of on-wafer parasitic effects on Mason's gain of down-scaled InP HBTs,'' Proc. of the 2024 54th European Microwave Conference (EuMC), pp. 252-255, Sep. 2024.
[26] Y. Wang, W. Ding, Y. Su, and J. Gao, ``An electromagnetic-simulation-assisted small-signal modeling method for InP double-heterojunction bipolar transistors,'' Chinese Physics B, vol. 31, no. 6, 068502, Jun. 2022.
[27] P. Sakalas, M. Schroter, and H. Zirath, ``mm-wave noise modeling in advanced SiGe and InP HBTs,'' Journal of Computational Electronics, vol. 14, no. 1, pp. 62-71, Mar. 2015.
[28] L. Cheng, H. Lu, M. Xia, and J. Gao, ``An augmented small-signal model of InP HBT with its analytical-based parameter extraction technique,'' Microelectronics Journal, vol. 121, 105366, Nov. 2022.
[29] A. Khusro, S. Husain, M. S. Hashmi, A. Q. Ansari, and S. Arzykulov, ``A generic and efficient globalized kernel mapping-based small-signal behavioral modeling for GaN HEMT,'' IEEE Access, vol. 8, pp. 195046-195061, 2020.
[30] S. Wang, J. Zhang, M. Liu, B. Liu, J. Wang, and S. Yang, ``Large-signal behavior modeling of GaN P-HEMT based on GA-ELM neural network,'' Circuits, Systems, and Signal Processing, vol. 41, no. 4, pp. 1834-1847, Apr. 2022.
[31] A. Jarndal, S. Husain, and M. Hashmi, ``On temperature-dependent small-signal modelling of GaN HEMTs using artificial neural networks and support vector regression,'' IET Microwaves, Antennas & Propagation, vol. 15, no. 8, pp. 937-953, Jul. 2021.
[32] J. Dong, Y. Su, B. Mei, and J. Gao, ``Small-signal behavioral-level modeling of InP HBT based on SO-BP neural network,'' Solid-State Electronics, vol. 209, 108784, Oct. 2023.
[33] M. Açıkkar and Y. Altunkol, ``A novel hybrid PSO- and GS-based hyperparameter optimization algorithm for support vector regression,'' Neural Computing and Applications, vol. 35, no. 27, pp. 19961-19977, Dec. 2023.
[34] C. Peng, Z. Che, T. W. Liao, and J. Gao, ``Prediction using multi-objective slime mould algorithm optimized support vector regression model,'' Applied Soft Computing, vol. 145, 110580, Nov. 2023.
[35] R. G. Brereton and G. R. Lloyd, ``Support vector machines for classification and regression,'' The Analyst, vol. 135, no. 2, pp. 230-267, Feb. 2010.
[36] C. Jung, H. Kim, S. Park, and J. Lee, ``Counter-rotating hoop stabilizer and SVR control for two-wheel vehicle applications,'' IEEE Access, vol. 11, pp. 14436-14447, Feb. 2023.
[37] G. Xu and X. Wang, ``Support vector regression optimized by black widow optimization algorithm combining with feature selection by MARS for mining blast vibration prediction,'' Measurement, vol. 218, 113106, Oct. 2023.
[38] Y. Shi, Q. Li, X. Meng, T. Zhang, and J. Shi, ``On time-series InSAR by SA-SVR algorithm: prediction and analysis of mining subsidence,'' Journal of Sensors, vol. 2020, pp. 1-17, Nov. 2020.
[39] Y. Meng, X. Zhang, and X. Zhang, ``Identification modeling of ship nonlinear motion based on nonlinear innovation,'' Ocean Engineering, vol. 268, 113471, Jan. 2023.
[40] M. Geng, J. Cai, C. Yu, J. Su, and J. Liu, ``Piecewise small-signal behavioral model for GaN HEMTs based on support vector regression,'' Proc. of the 2020 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Hangzhou, China, pp. 1-3, Dec. 2020.
[41] M. Geng, J. Cai, C. Yu, J. Su, and J. Liu, ``Modified small-signal behavioral model for GaN HEMTs based on support vector regression,'' International Journal of RF and Microwave Computer-Aided Engineering, vol. 31, no. 9, Sep. 2021.
[42] D. Liu, W. Zhang, Y. Tang, and H. Li, ``Evolving support vector regression based on improved grey wolf optimization for predicting settlement during construction of high-filled roadbed,'' Transportation Geotechnics, vol. 45, 101233, Feb. 2024.
[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, ``Grey wolf optimizer,'' Advances in Engineering Software, vol. 69, pp. 46-61, Mar. 2014.
[44] G. Jabbour, A. Nolin-Lapalme, O. Tastet, and M. Després, ``Prediction of incident atrial fibrillation using deep learning, clinical models, and polygenic scores,'' European Heart Journal, vol. 45, no. 46, pp. 4920-4934, Dec. 2024.
[45] M. Cho, M. Franot, O. J. Lee, and H. Kim, ``A neural compact model based on transfer learning for organic FETs with Gaussian disorder,'' Journal of Materials Chemistry C, vol. 12, no. 41, pp. 16691-16700, Nov. 2024.
[46] X. Du, M. Helaoui, A. Jarndal, and F. M. Ghannouchi, ``ANN-based large-signal model of AlGaN/GaN HEMTs with accurate buffer-related trapping effects characterization,'' IEEE Transactions on Microwave Theory and Techniques, vol. 68, no. 7, pp. 3090-3099, Jul. 2020.
Jinchan Wang

Jinchan Wang was born in Luoyang, China, in 1980. She received her Ph.D. degree from Southeast University, Nanjing, China, in June 2009. She is now an associate professor at Henan University of Science and Technology, Luoyang, China. Her research focuses on semiconductor materials and devices.

Wenshuai Liu

Wenshuai Liu was born in 2000. He received his B.E. degree from Weifang University, Weifang, where he studied from 2020 to 2024, and is currently pursuing an M.S. degree at Henan University of Science and Technology. His major field is the modeling and simulation of semiconductor devices.

Huanqing Peng

Huanqing Peng was born in Shangqiu, China, in 1999. He received his bachelor's and M.S. degrees from Henan University of Science and Technology, China, in 2021 and 2024, respectively. His research focuses on the modeling of HBTs and the design of very high-speed integrated circuits.

Jingyu Chang

Jingyu Chang was born in 2001. He received his B.Eng. degree from Henan University of Science and Technology, Luoyang, where he studied from 2020 to 2024. His research focuses on the modeling of GaN HEMTs and the design of very high-speed integrated circuits.

Jiahao Yao

Jiahao Yao was born in 2001. He received his B.E. degree from Zhengzhou University of Aeronautics, Zhengzhou, where he studied from 2020 to 2024, and is currently pursuing an M.S. degree at Henan University of Science and Technology. His major field is the modeling and simulation of semiconductor devices.

Kexin Wang

Kexin Wang was born in Kaifeng, China, in 2000. She received her bachelor's degree from Henan University of Science and Technology in 2024 and is currently pursuing a master's degree there. Her research focuses on the modeling of GaN HEMT devices.

Jincan Zhang

Jincan Zhang was born in Xingtai, China, in 1985. He received his M.S. degree from Xi'an University of Technology, Xi'an, China, in 2010, and his Ph.D. degree from Xidian University, Xi'an, China, in June 2014. He is now an associate professor at Henan University of Science and Technology, Luoyang, China. His research focuses on the modeling of HBTs and the design of very high-speed integrated circuits.