
  1. (West Virginia University - Engineering Technology Morgantown West Virginia 26506 United States)



FinFET, shallow learning, deep learning, genetic algorithm, work function

I. INTRODUCTION

The continuous scaling of technology nodes has shifted the focus of transistor optimization toward novel materials and advanced computational modeling. This section highlights key developments in transistor modeling, material discovery, and fabrication, emphasizing the integration of Density Functional Theory (DFT) and Machine Learning (ML) to accelerate innovation. The performance of modern transistors is increasingly influenced by quantum phenomena, making accurate modeling crucial before fabrication [1]. DFT is a key quantum mechanical method that calculates a material’s electronic structure and ground-state properties from first principles, without the need for experimental data [1].

While powerful, DFT is computationally expensive, especially for large-scale screening and complex systems. The integration of DFT and ML has emerged as a promising solution to overcome this limitation [5]. Researchers use DFT to generate high-quality datasets that then train ML models to predict material properties, significantly accelerating the discovery process. For example, probabilistic models have been used to predict 209 new stable compounds, which were then validated with DFT [7].

Despite the promise, challenges remain, including the high cost of generating large datasets and the need for robust data-management algorithms [5,9]. Additionally, traditional DFT functionals, such as the generalized gradient approximation (GGA), can have varying accuracy, and DFT struggles to accurately model long-range dispersion interactions crucial for van der Waals materials [6,9]. The search for next-generation transistors involves leveraging new materials to overcome the limitations of traditional silicon-based devices. A notable example is the AlPN/GaN high electron mobility transistor (HEMT), which doubles the drain current density of its predecessor, enabling higher power densities at high frequencies [10].

Technology Computer-Aided Design (TCAD) is used to implement compact physical models that accurately account for variable Schottky barrier heights and charge injection mechanisms in devices like silicon-germanium (SiGe) heterojunction bipolar transistors (HBTs) [11]. A single-transistor chaotic oscillator has been developed, simplifying circuits and making them more compact [12]. Negative Capacitance Field-Effect Transistors (NCFETs) use a ferroelectric material in the gate stack to amplify the gate voltage, potentially lowering the subthreshold swing (SS) below the conventional 60 mV/decade limit for more energy-efficient operation [13]. The BSIM-TFT model is an all-region framework designed to accurately simulate the behavior of thin-film transistors (TFTs) [16]. Back-propagation neural networks trained with the Levenberg-Marquardt (LM-BPNN) and Conjugate Gradient (CG-BPNN) algorithms are used to model Gallium Arsenide (GaAs) pseudomorphic HEMTs (pHEMTs) [17]. A charge deficit-based non-quasi-static (NQS) model provides a more accurate representation of dynamic behavior in transistors for high-frequency applications [19].

To address these issues, we develop a fine-tuned deep learning model built on a database that integrates DFT data. This work continues the research in [28] and makes the following contributions:

  • The model is implemented with deep learning algorithms.

  • The database is integrated with DFT data.

  • Both the deep learning and the shallow machine learning algorithms are fine-tuned.

  • All ML/DL algorithms are benchmarked against a Genetic Algorithm.

This paper is organized as follows: Section I provides an introduction and reviews related work. Section II details the methodology, including the discussion and implementation of the various ML and DL algorithms. Section III presents the results and a discussion of our findings. Section IV concludes with the paper’s key takeaways.

II. METHODOLOGY

1. Descriptor Construction

For constructing a comprehensive descriptor framework for materials informatics, we first establish a set of atom-atom interactions derived from elemental parameters. These interactions are then aggregated to generate molecule-level descriptors. In this study, we implemented eleven descriptors to characterize materials, encompassing both electronic-crystallographic properties and thermodynamic stability metrics. The electronic-crystallographic descriptors include crystal systems, space group symbols, volume, density, and crystallographic sites. Thermodynamic stability is quantified using two key descriptors: Energy Above Hull (EAH) and Formation Energy (Ef). EAH is a critical metric for assessing a material’s stability relative to its possible decomposition products. It represents the energy difference, measured in eV/atom, between the material and the most stable composition of its constituent elements on the convex hull. A positive EAH value indicates that the material is metastable and less stable than its most stable constituent phases, suggesting a potential for decomposition. Conversely, a value close to zero indicates high thermodynamic stability. Formation energy, also expressed in eV/atom, quantifies the energetic change associated with the formation of a compound from its constituent elements in their stable reference states. It is mathematically defined as

$$E_f = E_{\mathrm{compound}} - \sum_i n_i E_{\mathrm{Element}_i} \tag{1}$$

Here, $E_{\mathrm{compound}}$ is the total energy of the compound, $n_i$ is the number of atoms of element $i$, and $E_{\mathrm{Element}_i}$ is the total energy of a single atom of element $i$ in its stable reference state. A negative formation energy signifies that the compound is energetically stable with respect to its constituent elements, implying that its formation is a thermodynamically favorable process. Conversely, a positive formation energy indicates an energetically unfavorable compound that may spontaneously decompose. Crystal system, space group, band gap, magnetic ordering, and other quantitative and qualitative parameters are also considered. After careful consideration, irrelevant parameters are omitted from the datasets.
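As a worked illustration of Eq. (1), the following minimal sketch evaluates the formation energy for a hypothetical compound. The function name `formation_energy`, the `per_atom` option (which assumes that the eV/atom convention means dividing by the total atom count), and all energy values are assumptions made for illustration; they are not taken from the databases used in this study.

```python
def formation_energy(e_compound, atom_counts, e_elements, per_atom=False):
    """Formation energy E_f = E_compound - sum_i n_i * E_Element_i (Eq. 1).

    e_compound  : total energy of the compound (eV)
    atom_counts : {element: n_i}, number of atoms of each element
    e_elements  : {element: energy of one atom in its reference state (eV)}
    per_atom    : if True, divide by the total atom count (assumed convention)
    """
    reference = sum(n * e_elements[el] for el, n in atom_counts.items())
    e_f = e_compound - reference
    if per_atom:
        e_f /= sum(atom_counts.values())
    return e_f


# Hypothetical energies, chosen only to make the arithmetic easy to follow.
counts = {"Hf": 1, "O": 2}
energies = {"Hf": -10.0, "O": -5.0}

e_f = formation_energy(-30.0, counts, energies)            # -30 - (-20) = -10 eV
e_f_atom = formation_energy(-30.0, counts, energies, per_atom=True)
print(e_f, e_f_atom)
```

With these placeholder numbers the formation energy is negative, which corresponds to the energetically favorable case described above.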

2. Datasets

In this study, we use the Materials Project [29] and AFLOW [30] databases. Both are comprehensive open repositories for computational materials science and contain information on the properties of inorganic crystalline materials. Both frameworks use VASP to perform DFT calculations with the GGA-PBE functional for standard compounds and apply the GGA+U method for systems containing d- and f-block elements. This systematic approach, together with the size of the databases and their large feature space, makes the resulting dataset well suited to machine learning analysis.

We selected a subset of sixty materials with dielectric constants between 2.7 and 27 and then augmented the data to avoid overfitting. Data augmentation is performed with feature permutation, which breaks strong spurious correlations and introduces variation while preserving marginal distributions. The range is chosen to identify a suitable gate oxide material for the advanced-node FinFET: the lower bound is 2.7 for better subthreshold swing (SS), and the upper bound is limited to 27. By targeting this range, the ML/DL model can select an efficient gate oxide material. The dataset is divided into training, validation, and test sets for reliable evaluation: 10% is reserved for the test set, and the remaining 90% is split into an 80% training set and a 20% validation set. The split is designed to ensure a similar distribution of the dielectric constant and work function in every subset.
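The augmentation and splitting steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure used in this study: feature permutation is interpreted here as shuffling one feature column across samples, the split is a plain random split rather than one balanced on dielectric constant and work function, and the helper names `permute_feature` and `split_dataset` as well as the toy rows are hypothetical.

```python
import random


def permute_feature(rows, feature_index, rng):
    """Return copies of rows with one feature column shuffled across samples.

    The shuffled column keeps exactly the same set of values (its marginal
    distribution), but its pairing with the other columns is randomized.
    """
    column = [row[feature_index] for row in rows]
    rng.shuffle(column)
    return [row[:feature_index] + [value] + row[feature_index + 1:]
            for row, value in zip(rows, column)]


def split_dataset(rows, rng, test_frac=0.10, val_frac=0.20):
    """Hold out test_frac of the rows, then split the rest into train/validation."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_frac))
    test, rest = shuffled[:n_test], shuffled[n_test:]
    n_val = int(round(len(rest) * val_frac))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


rng = random.Random(0)
# Hypothetical rows: [feature_a, feature_b, target]
data = [[float(i), float(i * 2), float(i % 7)] for i in range(100)]
augmented = data + permute_feature(data, feature_index=1, rng=rng)
train, val, test = split_dataset(augmented, rng)
print(len(train), len(val), len(test))
```

A stratified variant would bin the rows by dielectric constant and work function first and apply the same split inside each bin.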

3. Model Selection

To select the best model, a comparative analysis of several machine learning (ML) algorithms, including linear regression, Elastic Net, Random Forest, and Extreme Gradient Boost (XGB), was performed on the dataset. After data cleaning and feature engineering, the data was split into a 1:4 test-to-train ratio. Hyperparameter tuning was conducted using GridSearchCV to find the optimal configuration. The best model was chosen based on the lowest loss and the highest R2 score. To prevent overfitting, k-fold cross-validation was applied during training, and model stability was confirmed by validating performance across multiple random data splits.
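The idea behind an exhaustive grid search with k-fold cross-validation can be sketched from scratch as follows. This is an illustration of the technique, not the GridSearchCV configuration used in this study: the one-dimensional ridge model, the `alpha` grid, the contiguous (unshuffled) folds, and all function names are assumptions chosen to keep the example short.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds; yield (train, val) index lists."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cv_mse(xs, ys, alpha, k):
    """Mean validation MSE of the ridge model over k folds."""
    scores = []
    for train, val in kfold_indices(len(xs), k):
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        scores.append(sum((ys[i] - w * xs[i]) ** 2 for i in val) / len(val))
    return sum(scores) / len(scores)


def grid_search(xs, ys, alphas, k=3):
    """Return (best_alpha, best_score): the grid value with the lowest mean CV loss."""
    results = {alpha: cv_mse(xs, ys, alpha, k) for alpha in alphas}
    best = min(results, key=results.get)
    return best, results[best]


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]  # noise-free toy data, so the unregularized fit is exact
best_alpha, best_score = grid_search(xs, ys, alphas=[0.0, 1.0, 10.0])
print(best_alpha, best_score)
```

Every candidate in the grid is scored on held-out folds only, so the selected hyperparameter reflects validation loss rather than training loss.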

4. Fine-tuning Model

This study analyzes both shallow machine learning (ML) and Deep Learning (DL) algorithms. The shallow ML models, ranging from Linear Regression to K-Nearest Neighbor (KNN), and the deep learning Neural Network Regression model were all fine-tuned using GridSearchCV. All models were analyzed with k-fold cross-validation.

The neural network hyperparameters (number of hidden layers, neurons per layer, learning rate, and batch size) were tuned via GridSearchCV. The Rectified Linear Unit (ReLU) activation function was used in the hidden layers, with a linear activation in the output layer. The model was optimized using the Adam optimizer to minimize the Mean Squared Error (MSE) loss function.
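The forward pass of such a regression network (ReLU hidden layer, linear output) and its MSE loss can be sketched in a few lines. The layer sizes, weights, inputs, and targets below are hypothetical and chosen so that the result can be checked by hand; the Adam update itself is omitted.

```python
def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j] holds the input weights of output unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def predict(x, params):
    """One ReLU hidden layer followed by a single linear output unit."""
    hidden = relu(dense(x, params["w1"], params["b1"]))
    return dense(hidden, params["w2"], params["b2"])[0]


def mse(y_true, y_pred):
    """Mean Squared Error, the loss minimized during training."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical parameters: 2 inputs -> 2 hidden units -> 1 output.
params = {
    "w1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, 0.0],
    "w2": [[1.0, 2.0]],
    "b2": [0.5],
}
inputs = [[2.0, 1.0], [1.0, 3.0]]
targets = [4.5, 5.5]

predictions = [predict(x, params) for x in inputs]
loss = mse(targets, predictions)
print(predictions, loss)
```

For the second input the first hidden unit receives a negative pre-activation, so ReLU sets it to zero and only the second unit contributes to the output.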

III. RESULTS AND DISCUSSION

1. Pipeline

The research methodology involves several key stages, from dataset creation to performance evaluation. The study began by creating a diverse dataset by combining data from the AFLOW and Materials Project databases, which contain comprehensive material properties. A meticulous feature selection process followed, identifying the most significant variables influencing the work function based on both domain knowledge and statistical analysis.

To combat bias and overfitting, data augmentation was used to expand the dataset artificially. This involved applying transformations and perturbations to the original data, creating a more diverse set for training machine learning (ML) and deep learning (DL) models. These models were trained using cross-validation to predict the work function of materials, a critical parameter for evaluating FinFET performance.

The predicted work function values from the trained models were implemented in TCAD Sentaurus simulations for a 14 nm FinFET architecture (Fig. 1). This allowed for realistic simulations of the device’s performance, bridging the gap between theoretical material properties and real-world device behavior. From the I-V profile, the Subthreshold Swing (SS), ON-Current (Ion), and OFF-Current (Ioff) were extracted to evaluate FinFET performance. This data, combined with the predicted work function, provided a comprehensive assessment of the device’s suitability for 14 nm technology node applications.
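The extraction of SS, Ion, and Ioff from a simulated transfer curve can be sketched as follows. The sweep values are hypothetical placeholders rather than simulation output, and the helper names `subthreshold_swing` and `current_at` are assumptions; SS is taken as the gate-voltage change per decade of drain current between two subthreshold points.

```python
import math


def subthreshold_swing(vg1, id1, vg2, id2):
    """SS in mV/decade from two subthreshold points: dVg / d(log10 Id)."""
    return 1000.0 * (vg2 - vg1) / (math.log10(id2) - math.log10(id1))


def current_at(vg, i_d, target_vg):
    """Drain current at the sampled gate voltage closest to target_vg."""
    nearest = min(range(len(vg)), key=lambda i: abs(vg[i] - target_vg))
    return i_d[nearest]


# Hypothetical Id-Vg sweep (V, A); not taken from the TCAD results.
vg = [0.0, 0.07, 0.14, 0.8]
i_d = [1e-10, 1e-9, 1e-8, 1e-6]

ss = subthreshold_swing(vg[0], i_d[0], vg[1], i_d[1])
i_off = current_at(vg, i_d, 0.0)   # current with the gate at 0 V
i_on = current_at(vg, i_d, 0.8)    # current at the assumed supply voltage
print(ss, i_on, i_off)
```

In this toy sweep the current rises by one decade for every 70 mV of gate voltage, so the extracted SS is about 70 mV/decade.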

Fig. 1. Workflow of the ML/DL implementation for the advanced 14 nm technology node. The dataset is constructed and augmented, then fed to the ML/DL models for work function inference. Finally, the inferred data are implemented in TCAD.


2. Model Performance

Before modeling, exploratory data analysis (EDA) was performed to understand the dataset’s properties. A Pearson correlation analysis showed no strong linear relationships between features, with all coefficients below 0.49. However, a Variance Inflation Factor (VIF) analysis revealed high scores for ‘sites’ and ‘volume,’ indicating multicollinearity that the initial correlation analysis missed. A pair plot analysis further identified non-linear relationships and cluster formations, specifically between ‘bandgap’ and ‘work function,’ and ‘Energy Above Hull’ (EAH) and ‘bandgap,’ suggesting underlying groupings in the data beyond simple linear associations.
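The difference between pairwise correlation and VIF can be made concrete. With only two features, VIF = 1/(1 − r²), so a correlation of 0.49 would give a VIF of only about 1.32; larger values arise when a feature is well predicted by several other features jointly. The sketch below computes both quantities from scratch on hypothetical data. The function names and the toy columns are assumptions, and the small Gaussian-elimination solver stands in for a library least-squares routine.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def vif(columns, j):
    """VIF of feature j: 1 / (1 - R^2) from regressing it on all other features."""
    y = columns[j]
    others = [c for k, c in enumerate(columns) if k != j]
    n = len(y)
    design = [[1.0] + [c[i] for c in others] for i in range(n)]  # intercept + features
    p = len(design[0])
    ata = [[sum(row[r] * row[c] for row in design) for c in range(p)] for r in range(p)]
    aty = [sum(design[i][r] * y[i] for i in range(n)) for r in range(p)]
    beta = solve_linear(ata, aty)
    pred = [sum(bk * xk for bk, xk in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 / (1.0 - (1.0 - ss_res / ss_tot))


# Hypothetical features: x1 and x2 are uncorrelated, x3 is almost x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
x3 = [2.01, 0.97, 2.03, 4.99]

print(pearson(x1, x2), vif([x1, x2], 0), vif([x1, x2, x3], 2))
```

Here x1 and x2 alone give a VIF of 1, while adding the nearly redundant x3 drives its VIF into the thousands.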

Table 1. Exploratory data analysis.

Parameter Value
Count 7500
Pearson Correlation (all parameters) < 0.49
Sites (VIF) 16.49
Energy above hull (VIF) 1.1
Volume (VIF) 13.11
Density (VIF) 3.95
Band gap (VIF) 1.59
Work function (VIF) 1.31
Dielectric constant (VIF) 1.23

The ML analysis starts with Linear Regression, Lasso, Ridge, and Gradient Boosting regression, and then explores tree-based algorithms such as Extreme Gradient Boost (XGB) Regression, Random Forest (RF) Regression, and Decision Tree Regression. For DL, Neural Network Regression is used. For benchmarking, the ML and DL algorithms are compared with a Genetic Algorithm. The performance of all algorithms is listed in Table 2.

Table 2. Model performance.

Model RMSE MAE R2
Linear Regression 0.88 0.64 0.13
Ridge Regression 0.88 0.64 0.13
Lasso Regression 0.94 0.69 0.13
Elastic Net Regression 0.88 0.65 0.13
Gradient Boost Regression 0.29 0.25 0.91
XGB Regression 0.31 0.27 0.88
Decision Tree Regression 0.31 0.26 0.89
Random Forest Regression 0.32 0.27 0.88
KNN Regression 0.32 0.26 0.89
Support Vector Regression 0.39 0.31 0.82
Neural Network Regression 0.14 0.14 0.84
Genetic Algorithm 1.01 0.71 −0.03

Table 2 shows that the comparatively simple regression algorithms (Linear, Ridge, Lasso, and Elastic Net regression) perform poorly. Their R2 scores are 0.13, with high Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), which is not acceptable, so these four algorithms are not suitable for this dataset. The likely reason is inherent to the algorithms: they approximate the target linearly, whereas the dataset is nonlinear, so their predictions carry large RMSE and MAE. For a given dataset, the R2 score falls as the squared error grows. Fig. 2 summarizes the performance of the models.
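The three metrics reported in Table 2 follow their standard definitions, which the minimal sketch below implements on hypothetical values. It also shows why the R2 score can be negative, as for the Genetic Algorithm: R2 is zero for a model that always predicts the mean and drops below zero for a model that does worse than that.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical work function values in eV.
y_true = [4.0, 5.0, 6.0]
good = [4.0, 5.0, 7.0]         # one prediction off by 1 eV
mean_only = [5.0, 5.0, 5.0]    # always predicts the mean
reversed_fit = [6.0, 5.0, 4.0] # worse than predicting the mean

print(rmse(y_true, good), mae(y_true, good), r2(y_true, good))
print(r2(y_true, mean_only), r2(y_true, reversed_fit))
```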

Fig. 2. Visualization of predicted and true values of work function with several ML algorithms. (a) Gradient Boost Regression. (b) XGBoost Regression. (c) Decision Tree Regression. (d) Random Forest Regression. (e) Neural Network Regression. (f) Support Vector Regression. The R2 score is also shown in each panel.


Boosting algorithms such as Gradient Boost and Extreme Gradient Boost (XGB) perform well in this context. Gradient Boost regression has lower RMSE and MAE, but its R2 score is high enough that the model may be overfitted. XGB, on the other hand, has RMSE and MAE values within a tolerable limit, and its R2 score of 0.88 suggests that the model is not biased or overfitted. XGB identifies ‘density’ as the most important feature (Fig. 3(a)).

Decision Tree and Random Forest regressors are tree-based algorithms; Random Forest is an ensemble of decision trees. Both have low RMSE and MAE, and the two models differ by 0.01 eV on each metric. Their R2 scores are in an excellent range of 0.88 to 0.89, which suggests that these two algorithms are not biased or overfitted. In terms of feature importance, both models identify ‘density’ as the most important feature for predicting the work function (Figs. 3(b) and 3(c)).

Fig. 3. Feature Importance from Several ML models. (a) XGB. (b) Decision Tree Regressor. (c) Random Forest Regressor.


The K-Nearest Neighbor (KNN) algorithm performs similarly to the tree-based algorithms: its R2 score of 0.89 equals that of Decision Tree and is 0.01 higher than that of Random Forest. The Support Vector Regressor with an RBF kernel has somewhat higher RMSE (0.39) and MAE (0.31), and its R2 score of 0.82 is slightly lower than that of KNN.

Neural Network Regression has the lowest RMSE and MAE. We attribute this to the network itself: because it is trained with the Adam optimizer and backpropagation, the model predicts the work function with the least error and a high R2 score. Fig. 4 depicts the performance of Neural Network Regression. Fig. 4(a) shows predicted versus true work function, with an R2 score of 0.84. Figs. 4(b) and 4(c) show loss versus epoch and mean squared logarithmic error versus epoch. In both cases the train and test curves align, which implies that the error decreases over the epochs and that learning progresses with the least error.

Fig. 4. Neural Network Regression performance. (a) Predicted vs true work function. (b) Loss vs epoch for the train and test sets; both curves align at the end. (c) Mean squared logarithmic error vs epoch for the train and test sets.


The genetic algorithm is initialized with four parents, and each iteration runs for six hundred generations with a 5% random mutation rate. The parent selection type is ‘sss’ (steady-state selection). Crossover is the process in which two parent chromosomes exchange genetic material to produce new, unique offspring chromosomes. This introduces new combinations of genes into the population, combining the best traits of both parents. In our case, the crossover type is “single point” (Fig. 5). The fitness versus generation curve is steep, as is the new solution rate versus generation curve (Figs. 6(a) and 6(b)). The genetic algorithm performs worst in this study: its RMSE and MAE are the highest and its R2 score is the lowest of all the models.
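The two genetic operators named above, single-point crossover and random mutation, can be sketched as follows. This is a generic illustration, not the configuration of the library used in this study: the function names, the gene bounds, and the toy chromosomes are assumptions, and parent selection is left out.

```python
import random


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two parent chromosomes after index `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def random_mutation(chromosome, rate, low, high, rng):
    """Replace each gene by a uniform random value in [low, high] with probability `rate`."""
    return [rng.uniform(low, high) if rng.random() < rate else gene
            for gene in chromosome]


rng = random.Random(0)
# Hypothetical chromosomes; each gene stands for one model coefficient.
parent_a = [1.0, 2.0, 3.0, 4.0]
parent_b = [5.0, 6.0, 7.0, 8.0]

child_a, child_b = single_point_crossover(parent_a, parent_b, point=2)
mutated = random_mutation(child_a, rate=0.05, low=-10.0, high=10.0, rng=rng)
print(child_a, child_b, mutated)
```

With a 5% rate, most genes of a child pass through mutation unchanged, so new solutions come mainly from crossover.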

Fig. 5. Some genetic evolution in the genetic algorithm.


Fig. 6. (a) Generation vs fitness in Genetic Algorithm. (b) Generation vs new solution rate.


In summary, the simple linear regression models can be eliminated because of their poor performance. The ensemble and tree-based algorithms, the boosting algorithms, and the neural network regressor all have high R2 scores without clear signs of bias or overfitting. The neural network regressor, however, has the lowest MAE and RMSE of all the models. Because low MAE and RMSE indicate a lower likelihood of bias and overfitting, and considering all the metrics, the neural network regressor is the best model for this research (Fig. 7).

Fig. 7. Work function (eV) of different models.


Neural Network Regression (NNR) is selected to predict the work function. The NNR model predicts monoclinic hafnium oxide (HfO2) with a work function of 4.66 eV. The NNR data are implemented in TCAD Sentaurus SProcess along with other process parameters: 10 nm fin thickness, a 6 nm thick TiCl gate, 4e18 cm⁻³ doping concentration, and a 1 nm thick HfO2 gate oxide. The 14 nm technology FinFET is simulated with Sentaurus Process (Fig. 8(a)), and its characteristics are analyzed with Sentaurus SDevice (Figs. 8(b) and 8(c)). With a work function of 4.6 eV (predicted by the NNR), the on current is 2.5e-6 A and the off current is 1e-10 A. The on current under different switching conditions with varying work function for our 14 nm technology FinFET is depicted in Fig. 8(d); for a 4.6 eV work function at logic 1, the on current is 1.5e-6 A. The SProcess and SDevice simulations show that the 14 nm FinFET works well with the NNR material data and that the research pipeline works as intended.
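The on and off currents quoted above can be combined into an on/off ratio, a common figure of merit for switching devices. The short calculation below uses only the two current values stated in the text; the variable names are illustrative.

```python
import math

i_on = 2.5e-6   # A, on current stated in the text for the 4.6 eV work function
i_off = 1e-10   # A, off current stated in the text

on_off_ratio = i_on / i_off
decades = math.log10(on_off_ratio)
print(f"Ion/Ioff = {on_off_ratio:.3g} ({decades:.2f} decades)")
```

The ratio is about 2.5 × 10^4, which corresponds to roughly 4.4 decades of current modulation between the off and on states.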

Fig. 8. FinFET TCAD simulation. (a) Process simulation of 14 nm FinFET. (b) I-V characteristics of 14 nm FinFET. (c) Logarithmic I-V. (d) Work function vs on current in different logic states.


IV. CONCLUSIONS

This work presents a robust methodology for selecting and implementing optimal materials in advanced FinFET fabrication. We employed a deep learning-based Neural Network Regression (NNR) model, achieving a high prediction accuracy with an R2 score of 0.84 and low Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) of 0.14 eV. This model successfully predicted a 4.66 eV work function for monoclinic HfO2, a critical parameter for device performance. Implementing this predicted work function into a 14 nm FinFET simulation within the TCAD Sentaurus framework demonstrated promising performance characteristics. This predictive and simulation-based pipeline offers a scalable and efficient approach for material selection and optimization, potentially accelerating the mass fabrication of high-performance FinFETs.

References

[1] Lin L., Jacobs R., Ma T., Chen D., Booske J., Morgan D., 2023, Work function: Fundamentals, measurement, calculation, engineering, and applications, Physical Review Applied, Vol. 19, No. 3.
[2] Bhuwalka K. K., Schulze J., Eisele I., 2005, Scaling the vertical tunnel FET with tunnel bandgap modulation and gate workfunction engineering, IEEE Transactions on Electron Devices, Vol. 52, No. 5, pp. 909-917.
[3] Jain A., Shin Y., Persson K. A., 2016, Computational predictions of energy materials using density functional theory, Nature Reviews Materials, Vol. 1, No. 1, pp. 1-13.
[4] Duan C., Liu F., Nandy A., Kulik H. J., 2021, Putting density functional theory to the test in machine-learning-accelerated materials discovery, The Journal of Physical Chemistry Letters, Vol. 12, No. 19, pp. 4628-4637.
[5] Tang Q., Zhou Z., Chen Z., 2015, Innovation and discovery of graphene-like materials via density-functional theory computations, Wiley Interdisciplinary Reviews: Computational Molecular Science, Vol. 5, No. 5, pp. 360-379.
[6] Hautier G., Fischer C. C., Jain A., Mueller T., Ceder G., 2011, High-throughput identification of missing ternary oxides via machine learning and DFT, MRS Bulletin, Vol. 36, No. 7, pp. 576-580.
[7] Jain A., Hautier G., Moore C. J., Ong S. P., Fischer C. C., Mueller T., Persson K. A., Ceder G., 2011, A high-throughput infrastructure for density functional theory calculations, Computational Materials Science, Vol. 50, No. 8, pp. 2295-2310.
[8] Padama A. A. B., Palmero M. A., Shimizu K., Chookajorn T., Watanabe S., 2025, Machine learning and density functional theory-based analysis of the surface reactivity of high entropy alloys: The case of H atom adsorption on CoCuFeMnNi, Computational Materials Science, Vol. 247.
[9] Hamza H., Jarndal A., 2025, Modeling and simulation of AlPN/GaN high electron mobility transistor, Advanced Theory and Simulations, Vol. 8, No. 4.
[10] Golec P., Bestelink E., Sporea R. A., Iñiguez B., 2025, Physical compact model for source-gated transistors for DC application, IEEE Transactions on Electron Devices, Vol. 72, No. 3, pp. 952-958.
[11] Fu L., Zhu W., Yu B., Zhang Y., Valdes-Sosa P. A., Li C., Ricci L., Frasca M., Minati L., 2025, Modeling and experimental circuit implementation of fractional single-transistor chaotic oscillators, Applied Mathematics and Computation, Vol. 500.
[12] Talukdar J., Choudhuri B., Mummaneni K., 2025, Analytical modeling and TCAD simulation of surface potential and drain current for pocket doped negative capacitance field-effect transistor, Physica Scripta, Vol. 100, No. 3.
[13] Eom S., Lee S., Yun H., Cho K., Kim S., Baek R., 2025, Machine learning-driven extraction of hybrid compact models integrating neural networks and Berkeley short-channel insulated-gate field-effect transistor model-common multigate for multidevice applications, Advanced Intelligent Systems, Vol. 7, No. 5.
[14] Bharti, Mittal P., 2025, Analytical modeling of oppositely doped core–shell junctionless nanowire transistor considering fringe capacitance and dual material gate, Advanced Theory and Simulations, Vol. 8, No. 6.
[15] Pahwa G., Salahuddin S., Hu C., 2024, An all-region BSIM thin-film transistor model for display and BEOL 3-D integration applications, IEEE Transactions on Electron Devices, Vol. 71, No. 8, pp. 4701-4709.
[16] Lin Q., Yang S., Yang R., Wu H., 2024, Transistor modeling based on LM-BPNN and CG-BPNN for the GaAs pHEMT, International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, Vol. 37, No. 4.
[17] Khorram H. G., Sheikhaei S., Touski S. B., Kokabi A., 2024, Field-effect transistor based on MoSi monolayer for digital logic applications, IEEE Transactions on Electron Devices, Vol. 71, No. 1, pp. 7131-7137.
[18] Tung C.-T., Salahuddin S., Hu C., 2024, Non-quasi-static modeling of neural network-based transistor compact model for fast transient, AC, and RF simulations, IEEE Electron Device Letters, Vol. 45, No. 7, pp. 1277-1280.
[19] Hossain S., 2015, Iridium Modified Silicon (001) Surface, M.S. Thesis, the University of North Dakota.
[20] Oguz I. C., Çakır D., Hossain S., Mohottige R., Gülseren O., Öncel N., 2016, On the structural and electronic properties of Ir-silicide nanowires on Si(001) surface, Journal of Applied Physics, Vol. 120, No. 9.
[21] Hossain S., Iqbal M. A., Rahman M., 2020, A new approach towards embedded logic in a single device, Proc. of 2020 IEEE 20th International Conference on Nanotechnology (IEEE-NANO), pp. 120-123.
[22] Hossain S., Iqbal M. A., Samant P., Siddiki M. K., Rahman M., 2023, More than a device: Function implementation in a multi-gate junctionless FET structure, Journal of Electronics and Electrical Engineering, pp. 1-11.
[23] Hossain S., 2023, Function Implementation in a Multi-Gate Junctionless FET Structure.
[24] Hossain S., Rabbi A. F., 2024, Work function tuning for junctionless transistor high-K gate material using machine learning descriptor engineering, Proc. of the 8th International Conference on Theoretical and Applied Nanoscience and Nanotechnology (TANN 2024).
[25] Mahshook M., Banerjee R., 2025, Beyond diamond: Interpretable machine learning discovery of coherent quantum defect hosts in semiconductors, arXiv preprint arXiv:2506.03844.
[26] Bifulco A., Malucelli G., 2025, AI/Machine learning and sol-gel derived hybrid materials: A winning coupling, Molecules, Vol. 30, No. 14, pp. 3043.
[27] Özdem S., Orak I. M., 2025, A novel method based on deep learning algorithms for material deformation rate detection, Journal of Intelligent Manufacturing, Vol. 36, No. 5, pp. 3249-3270.
[28] Hossain S., Rabbi A. F., 2024, Work function tuning for junctionless transistor high-K gate material using machine learning descriptor engineering, Proc. of the 8th International Conference on Theoretical and Applied Nanoscience and Nanotechnology (TANN 2024).
[29] Jain A., Ong S. P., Hautier G., Richards W. D., Dacek S., Cholia S., 2013, Commentary: The Materials Project: A materials genome approach to accelerating materials innovation, APL Materials, Vol. 1, No. 1.
[30] Calderon C. E., Plata J. J., Toher C., Oses C., Levy O., Fornari M., Natan A., 2015, The AFLOW standard for high-throughput materials science calculations, Computational Materials Science, Vol. 108, pp. 233-238.
[31] TCAD Sentaurus, https://www.synopsys.com/manufacturing/tcad.html
Sehtab Hossain

Sehtab Hossain received his doctorate degree in computer engineering from the University of Missouri Kansas City in 2023. His research interests include the development of Material Discovery with AI and compound power semiconductors such as silicon carbide (SiC) and gallium nitride (GaN). He is serving as a Teaching Assistant Professor at West Virginia University.