A Novel Self-aligned Processing for Doubling the Integration Density of 3D NAND Flash Memory

https://doi.org/10.5573/JSTS.2025.25.3.199

(Yun-Jae Oh) ; (Yunejae Suh) ; (So Won Son) ; (Seongjae Cho) ; (Daewoong Kang) ; (Il Hwan Cho)

In this paper, we propose a novel processing idea to double the number of cells in 3D NAND flash memory to overcome the stacking cell limit of conventional 3D NAND flash memory. We present a process integration flow primarily focused on selective etching, where the core innovation lies in using materials with high selectivity to create self-aligned split cells with precisely identical characteristics. Using technology computer-aided design (TCAD) process simulation, we define the device based on the proposed process integration flow and extract the characteristics of the split cells. We confirm the equivalence of the split cells and evaluate their reliability by discussing incremental step pulse programming (ISPP), incremental step pulse erasing (ISPE), cell current (Icell), and retention characteristics. In this study, we propose a method in 3D NAND flash memory process integration where a single circular cell can be exactly divided into semicircular cells with identical characteristics.

1.8 mW, 4-8 GHz Bandwidth Mixer with Bleeding Transistors for Superconducting Qubit Read-out

https://doi.org/10.5573/JSTS.2025.25.3.206

(Seunghyeon Baek) ; (Hyeonsik Ahn) ; (Muhammad Fakhri Mauludin) ; (Youngwoo Ji) ; (Jusung Kim)

This paper presents a down-conversion mixer with an ultra-low power consumption of 1.8 mW and a wide bandwidth covering 4-8 GHz, designed for superconducting qubit read-out systems. The proposed mixer utilizes bleeding transistors to enhance linearity and reduce noise, both critical factors for accurate qubit state measurement. By optimizing the RF and LO paths, the proposed design achieves improved conversion gain and reduced local oscillator leakage, thereby preserving the integrity of qubit information. Simulation results demonstrate that the presented mixer achieves a noise figure of 8.6 dB, a 1 dB gain compression point of -11.7 dBm, and a gain of 16.8 dB while consuming only 1.8 mW, making it suitable for integration into large-scale quantum computing architectures. The proposed mixer was designed using the TSMC 65 nm process with a supply voltage of 1.2 V, and shows competitive performance compared to state-of-the-art designs in the field.

Metal Oxide Resistive Memory Modeling with Physical Current Equation

https://doi.org/10.5573/JSTS.2025.25.3.213

(Gyunseok Ryu) ; (Jongwon Lee) ; (Myounggon Kang)

In this paper, a DC compact model of a resistive-switching random-access memory (ReRAM) is developed and characterized. ReRAM is a type of nonvolatile memory and a promising candidate for future use. It is being actively studied for fields such as neuromorphic and AI computing owing to advantages such as fast switching speed and low operating voltage. Because ReRAM in these fields is evaluated through large-scale array simulation, a compact model is required to confirm the operating characteristics. The compact model was calibrated against measurements of two fabricated ReRAM devices using HfOx and SiNx as switching layers. In addition, the compact model was written in Verilog-A so that it can be directly applied to SPICE simulation. We show that a highly accurate compact model can be obtained for ReRAM devices with different switching layers by adjusting the parameters in the current density equations and the fitting parameters.

Design of a Floating-bulk NMOS Triggered GGNMOS with Low Triggering Voltage and High Robustness Aimed at 3.3 V I/O ESD Protection

https://doi.org/10.5573/JSTS.2025.25.3.218

(Haotian Chen) ; (Yang Wang) ; (Hongjiao Yang) ; (Liqiang Ding) ; (Wei Liu) ; (Jun Deng) ; (Fengfeng Zhou) ; (Beibei Nie)

GGNMOS is widely utilized as an ESD protection device due to its simple structure and excellent process compatibility. However, the traditional multi-finger GGNMOS suffers from uneven current conduction. Moreover, in 3.3 V applications, the trigger voltage of GGNMOS is excessively high. This paper proposes a floating-bulk NMOS triggered GGNMOS (FBTGGNMOS). Unlike previous studies, the FBTGGNMOS, based on a standard 0.18 μm CMOS process, achieves a relatively low trigger voltage without additional detection circuits or control signals. FBTGGNMOS utilizes a floating-bulk NMOS as its triggering structure, and the factors contributing to the low BV of the floating-bulk NMOS are investigated. TCAD simulation demonstrates the working mechanism of the device: when an ESD event occurs, the added floating-bulk NMOS conducts first, providing triggering current for the GGNMOS and helping the device conduct evenly. TLP test results demonstrate that, compared with the traditional GGNMOS, the trigger voltage of FBTGGNMOS is reduced by 41%, to only 5.24 V. With its low trigger voltage, high holding voltage, and high robustness, FBTGGNMOS can meet the requirements of 3.3 V I/O protection applications.

An Entropy Model for GPU Register Compression

https://doi.org/10.5573/JSTS.2025.25.3.228

(Minsik Kim) ; (Yunho Oh) ; (Won Woo Ro)

There have been continuous efforts to improve GPU energy efficiency. Register compression is a recent idea to tackle the efficiency problem, as the GPU register file is nontrivial in size and power-hungry in order to support massively concurrent thread execution. This paper proposes a mathematical model to estimate the ideal compression efficiency for GPU registers based on entropy theory. An ideal compression algorithm is expected to have a high compression ratio; however, the ratio varies significantly with application characteristics and with the choice of algorithm. The proposed model provides a theoretical bound on the ratio, which is highly useful for evaluating register compression algorithms. Experimental results show that the ideal-case upper bound according to our model is 3.99×, while the existing algorithm based on BDI compression achieves only 2.33×. This provides a strong case for exploring better register compression algorithms.
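As a rough illustration of the entropy-bound idea (not the authors' model, whose details are not given in the abstract), the sketch below computes the zeroth-order Shannon entropy of a set of register values and the resulting ideal compression ratio, taken as register width divided by entropy. The function name, the 32-bit register width, and the sample values are all assumptions made for illustration.

```python
import math
from collections import Counter

def ideal_compression_ratio(values, width_bits=32):
    """Entropy-based bound on the lossless compression ratio.

    Assumes each register value is an independent symbol of
    `width_bits` bits (a simplifying assumption, not from the paper).
    """
    counts = Counter(values)
    total = len(values)
    # Shannon entropy in bits per value
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    if entropy == 0:
        # All values identical: no information per value
        return math.inf
    return width_bits / entropy

# Hypothetical register contents across threads
sample = [0, 0, 0, 0, 1, 1, 2, 3]
ratio = ideal_compression_ratio(sample)
```

Comparing such a bound against the ratio achieved by a concrete scheme indicates how much headroom remains, which is the kind of gap the abstract reports between 3.99× and 2.33×.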

A BJT-based CMOS Temperature Sensor With a ±0.94°C 3σ-inaccuracy From −40°C to +150°C

https://doi.org/10.5573/JSTS.2025.25.3.236

(Tae-June Park) ; (Jun-Ho Boo) ; (Jae-Geun Lim) ; (Hyoung-Jung Kim) ; (Jae-Hyuk Lee) ; (Seong-Bo Park) ; (Won-Jun Cho) ; (Gil-Cho Ahn)

This paper presents a BJT-based CMOS temperature sensor designed for an extended temperature range. In the sensing frontend, a β-compensation technique is employed to mitigate the effects of the finite current gain (β) of PNP transistors. Additionally, bitstream-controlled dynamic element matching (DEM) is applied to address mismatch errors in current sources and PNP transistors. The readout circuit, based on a 1-bit second-order incremental ΔΣ ADC, is configured with a minimum number of sampling switches to mitigate the impact of increasing switch leakage currents at high temperatures. Furthermore, system-level low-frequency chopping (CHL) is implemented digitally, removing the need for extra switches. Fabricated in a 0.18-μm CMOS process, the proposed sensor occupies an area of 0.63 mm². The sensor is accurate to within ±0.94°C (3σ) after one-point trimming from −40°C to +150°C. It achieves a resolution figure of merit (FoM) of 21.17 pJ·K² at 27°C, with a conversion time of 20 ms and a power consumption of 22.23 μW from a 1.8 V supply.
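The reported numbers can be cross-checked under the assumption that the resolution FoM follows the commonly used definition, energy per conversion multiplied by the squared resolution. The resolution value derived below is inferred from that assumed definition and is not stated in the abstract.

```latex
E_{\mathrm{conv}} = P \cdot t_{\mathrm{conv}} = 22.23\,\mu\mathrm{W} \times 20\,\mathrm{ms} = 444.6\,\mathrm{nJ}
\qquad
\mathrm{FoM} = E_{\mathrm{conv}} \cdot \mathrm{Res}^{2}
\;\Rightarrow\;
\mathrm{Res} = \sqrt{\frac{21.17\,\mathrm{pJ\cdot K^{2}}}{444.6\,\mathrm{nJ}}} \approx 6.9\,\mathrm{mK}
```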

XCNet: Enhancing Defect Detection in Sensor Boards Through Data Quality Analysis and Convolutional Neural Networks

https://doi.org/10.5573/JSTS.2025.25.3.245

(Sachin Ranjan) ; (Hoon Kim)

Sensor boards are vital components in modern technologies, but ensuring their quality remains a significant challenge. Increasing demand has driven manufacturers to integrate more components onto single boards, complicating quality control processes. Defects in these boards pose risks of financial losses and safety hazards. Traditional inspection methods, which rely on manual labor, are time-consuming, error-prone, and inefficient for handling complex products. Recent advancements in machine learning offer transformative solutions to these challenges. In this paper, we present XCNet, a convolutional neural network-based deep learning framework designed for automated defect detection in sensor boards. XCNet addresses the limitations of traditional methods by significantly enhancing inspection accuracy and efficiency while reducing human intervention. XCNet is tailored to handle highly imbalanced datasets caused by the rarity of defective products. Through comprehensive analyses, we investigate the impact of data quality on model performance, optimizing XCNet’s architecture and preprocessing techniques to achieve robust results. Extensive experiments on sensor board image data demonstrate XCNet’s remarkable accuracy of 99.54%, showcasing its potential as a reliable and scalable solution for automated quality control in manufacturing.

Design of IEEE 1500-compatible Test access mechanism for Tile-based AI semiconductor with Layout Mirroring

https://doi.org/10.5573/JSTS.2025.25.3.257

(Dongsup Song)

The IEEE 1500 standard provides robust test access for embedded cores but faces challenges in supporting tile-based designs with mirrored layouts, which are commonly used in AI semiconductors. This paper introduces a novel design methodology for an IEEE 1500-compatible test access mechanism specifically developed to meet the requirements of AI semiconductor architectures. The proposed methodology leverages pass-through paths with bi-directional signaling, facilitating efficient reuse of tile layouts and enabling seamless integration of tile elements into chip designs. This approach significantly reduces development effort while maintaining design flexibility. Simulation results validate the effectiveness of the proposed test access mechanism, and two formulas are presented to optimize the capture, update, and shift operating frequencies. Additionally, a pipelined pass-through path design is proposed to improve the speed of the IEEE 1500 shift operation.

A 232.2nW Segmented Curvature Compensation Sub-BGR with Bandgap Core Reusing

https://doi.org/10.5573/JSTS.2025.25.3.267

(Seung-Hun Park) ; (Jun-Ho Boo) ; (Jae-Geun Lim) ; (Hyoung-Jung Kim) ; (Jae-Hyuk Lee) ; (Seong-Bo Park) ; (Seong-U Choi) ; (Gil-Cho Ahn)

This paper presents a segmented curvature compensation sub-BGR that maintains a low TC with low power consumption over a wide temperature range. In this work, segmented curvature compensation is applied to achieve a lower TC by adding correction voltages to the uncompensated reference voltage. Furthermore, the bandgap core is reused to generate the voltages required for segmented curvature compensation, contributing to high power and area efficiency. The proposed sub-BGR is implemented in a 180 nm CMOS process and occupies an active area of 0.21 mm². Measurement results show a reference voltage of 1.191 V and a power consumption of 232.2 nW under a 1.8 V supply. It achieves an average TC of 17.69 ppm/°C across a temperature range from −40°C to 120°C.

Optimization of One-transistor (1T) DRAM Using Device Parameters-dependent Zero-temperature Coefficient Point

https://doi.org/10.5573/JSTS.2025.25.3.274

(Kyung Hee Kim) ; (Kyeong Min Kim) ; (Yeong Hwan Kim) ; (Jong Beom Im) ; (Gyu Ho Choi) ; (In Man Kang) ; (Young Jun Yoon)

In this study, we present a design technique that minimizes drain current fluctuations due to temperature changes and utilizes the concept of zero-temperature coefficient (ZTC) points to increase the stability of one-transistor (1T) DRAM operation. In particular, reliability against temperature changes was secured by maintaining a stable drain current in a high-temperature (300 K-400 K) environment through optimization of the ZTC operation voltage, and data retention time and stability were strengthened by applying an asymmetric dual-gate structure. In addition, by optimizing the size of the device and adjusting the main gate work function (WF1) and body doping concentration, stable data retention performance was confirmed even in a high-temperature environment. These designs minimize leakage current and maintain data retention times of up to 330 ms to ensure reliable memory operation under various environmental conditions.

Two-rank Decimation Technique for High-speed Time-interleaved Analog-to-digital Converters

https://doi.org/10.5573/JSTS.2025.25.3.284

(Sang-won Oh) ; (Dong-Ryeol Oh)

This paper proposes a two-rank decimation technique for high-speed time-interleaved (TI) analog-to-digital converters (ADCs) to reduce the multiplexer (MUX) speed burden and enable real-time measurements while minimizing area requirements. The proposed architecture alleviates the speed burden of the decimation and MUX circuits to a level comparable to that of a single-channel ADC by passing the digital outputs of each channel through a sequential decimation circuit before they are merged in the MUX. The design was validated using a 6-bit 20 GS/s TI ADC implemented in a 40 nm CMOS process. The active area of the proposed two-rank decimation circuit is about 0.007 mm², which is only 8% of the 0.09 mm² required by the memory-based approach. The proposed two-rank decimation circuit consumes 0.78 mW under a supply voltage of 0.9 V at a 20 GS/s conversion speed, and the measured signal-to-noise and distortion ratio (SNDR) and spurious-free dynamic range (SFDR) are 30.12 dB and 40.23 dB, respectively.

A 13-GHz Analog Fractional-N Sampling PLL With a Calibration-assisted Seamless Loop-switching Technique

https://doi.org/10.5573/JSTS.2025.25.3.292

(Seojin Kim) ; (Youngsik Kim) ; (Shinwoong Kim)

This work presents a 13-GHz low-jitter and high figure-of-merit (FoM) fractional-N phase-locked loop (PLL) using a digital-to-time converter (DTC)-based sampling PLL architecture. To achieve ultra-low jitter in fractional-N mode, a DTC gain calibration technique and a reconfigurable dual-core voltage-controlled oscillator (VCO) are applied, while a novel phase offset calibration technique is adopted to provide smooth loop-switching transitions. Post-layout simulation results show an integrated rms jitter of 138.5 fs from 10 kHz to 100 MHz. The PLL consumes 4.12 mW and achieves a FoM of -251 dB, operating at a 1.0-V supply. The PLL core is implemented in a 28-nm CMOS process and occupies 0.47 mm².
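The reported FoM is consistent with the widely used jitter-power figure of merit, assuming that is the definition applied here (the abstract does not state it explicitly):

```latex
\mathrm{FoM} = 10\log_{10}\!\left[\left(\frac{\sigma_{t}}{1\,\mathrm{s}}\right)^{2}\cdot\frac{P}{1\,\mathrm{mW}}\right]
= 20\log_{10}\!\left(138.5\times10^{-15}\right) + 10\log_{10}(4.12)
\approx -257.2 + 6.1 \approx -251\,\mathrm{dB}
```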

Reducing Communication Overheads in MD Simulations: A Novel Floating-point Data Compression Approach

https://doi.org/10.5573/JSTS.2025.25.3.301

(Seongmin Ki) ; (Sungju Ryu)

Molecular dynamics (MD) simulations are critical tools for modeling the physical behavior of materials at atomic and molecular scales. However, widely used MD simulators such as LAMMPS [1] suffer performance degradation when inter-chip communication increases, as happens in MPI-based parallel multi-chip computation on large simulation datasets. To address this challenge, we propose an efficient data compression method tailored for MD computations. Our technique compresses the sign and exponent parts of floating-point data (12 bits in the FP64 format) into a single bit, significantly reducing communication bandwidth and energy consumption compared to conventional methods and enhancing the overall energy efficiency of large-scale MD simulations.
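The abstract does not describe how the 12 bits are reduced to one, so the following is only a sketch of the FP64 field layout involved, not the authors' algorithm. It extracts the 1 sign bit and 11 exponent bits of an IEEE 754 double and checks whether two values share them, which is the kind of redundancy such a scheme could plausibly exploit. The helper names `split_fp64` and `shares_sign_exp` are hypothetical.

```python
import struct

def split_fp64(x):
    """Split a double into (sign+exponent field, mantissa field)."""
    # Reinterpret the 64-bit IEEE 754 pattern as an unsigned integer
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign_exp = bits >> 52               # top 12 bits: 1 sign + 11 exponent
    mantissa = bits & ((1 << 52) - 1)   # low 52 bits
    return sign_exp, mantissa

def shares_sign_exp(a, b):
    """True if a and b have identical sign and exponent fields."""
    return split_fp64(a)[0] == split_fp64(b)[0]

# Illustrative values: 1.0 and 1.5 differ only in the mantissa
same = shares_sign_exp(1.0, 1.5)
different = shares_sign_exp(1.0, 2.0)
```

When consecutive values share the 12-bit field, a single flag bit could in principle signal "same as before" in place of retransmitting it; whether the proposed method works this way is an assumption.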

Analysis of the Effects of Bonding Misalignment on Current Density and Resistance Variation in Semiconductor Packaging Processes

https://doi.org/10.5573/JSTS.2025.25.3.311

(Seung-Hwan Oh) ; (Seul-Ki Hong)

In this study, the impact of misalignment in semiconductor packaging processes on the current density distribution and resistance characteristics at the bonding interface was analyzed. Finite element method (FEM)-based simulations using Ansys were conducted, and the results were validated through bonding process experiments. Simulation results revealed that the central region of the bonding interface exhibited relatively low current density, whereas higher current density was observed at the edges along the direction opposite to the applied electrical signal. As misalignment increased, localized current density surged in specific regions, altering the current flow pattern. Notably, when the misalignment exceeded a critical threshold, the area of concentrated current density expanded, leading to a deterioration in electrical characteristics. Experimental validation further confirmed that resistance increased with greater misalignment, and a sharp rise in resistance was observed beyond a specific threshold. These findings suggest that misalignment not only reduces the effective bonding area but also distorts the current flow, thereby impacting electrical signal transmission. This study demonstrates that minimizing misalignment in semiconductor packaging is a key factor in optimizing electrical performance. However, given that misalignment is an inherent aspect of semiconductor fabrication, complete elimination is impractical. Instead, structural design modifications are necessary to mitigate its impact on electrical characteristics. Based on the results, we propose that optimizing the bonding interface layout can alleviate localized current density concentration and enhance signal transmission efficiency.

Fault-tolerant GEMM Accelerator Based on Microarchitectural Fault Analysis for Resource-constrained Devices

https://doi.org/10.5573/JSTS.2025.25.3.318

(Sunyoung Park) ; (Hannah Yang) ; (Hana Kim) ; (Hyunji Kim) ; (Ji-Hoon Kim)

As semiconductor technology advances to the nanoscale, the likelihood of hardware faults increases, posing significant challenges in safety-critical applications such as autonomous driving and medical devices that rely heavily on neural networks. To address this issue, we propose a fault-tolerant general matrix multiplication (GEMM) accelerator designed for resource-constrained edge devices. First, we introduce a high-low bit swapping mechanism (HL-Swap) to improve the fault resilience of registers in critical hardware components. Second, we quantify the impact of fault characteristics on accuracy degradation and propose a microarchitectural location-aware strategy that disables row-column operations (RC-Off). The proposed hardware is implemented in Samsung 28 nm FDSOI technology, operating at a 1.0 V supply voltage with a 250 MHz clock frequency. Through tests with 1000 random faults injected into the systolic array, we show that the proposed GEMM accelerator significantly mitigates accuracy degradation with hardware overheads of 2.4% and 8.9% for RC-Off and HL-Swap, respectively. In particular, compared to the conventional GEMM, a 63% improvement in performance was achieved in a scenario with a faulty PE rate (FPR) of 6%.

Sensing Characteristics and Transduction Mechanism of WO3-based Si FET-type Humidity Sensor Using Pulse Measurement

https://doi.org/10.5573/JSTS.2025.25.3.325

(Yoonki Hong) ; (Jonghyun Yun) ; (Dong Jin Han) ; (Sung-Tae Lee) ; (Sung Yun Woo)

As accurate humidity monitoring and control are critical in both home and factory environments, interest in Internet of Things-based smart humidity sensors has rapidly increased. However, conventional humidity sensors have the disadvantages of large size, output signal drift, and hysteresis. Thus, this study investigates the sensing characteristics of a Si field-effect transistor-type humidity sensor using a pulse measurement method. A tungsten trioxide (WO3) thin film, which is adopted as a sensing material to detect the relative humidity (RH) of the ambient air in the test chamber, is deposited via radio frequency magnetron sputtering. Water vapor is stably generated using a well-equipped humidity generation system, and N2 gas, which is used as a medium for carrying the water vapor, is controlled via mass flow controllers to adjust the RH. Subsequently, highly reliable humidity-sensing characteristics of the sensor are obtained at room temperature in the forms of transfer (ID-VCG) curves and transient drain currents (IDs), without any significant ID drifts owing to pulse measurement. The chemical reaction between water molecules and the WO3 sensing layer is explained, and the effect of the chemical reaction in terms of electrical changes in the sensor is analyzed using energy band diagrams. The results indicate that |ID| decreases by 46% as RH increases from 3.4% to 80.3%. Furthermore, the response and recovery times are 97 s and 190 s, respectively.