Mobile QR Code QR CODE

  1. (Division of Electrical, Electronic, and Control Engineering, Kongju Na- tional University, Cheonan 31080, South Korea)
  2. (Institute of IT Convergence Technology, Kongju National University, Cheonan 31080, South Korea)



Fault tolerance, interleave division multiple access (IDMA), interleaver, multiuser detection (MUD), nonorthogonal multiple access (NOMA)

I. INTRODUCTION

Interleave division multiple access (IDMA) [1] is one of nonorthogonal multiple access (NOMA) techniques. It has accommodated multiple users in various communication systems including the 5G and beyond, the Internet of Things (IoT), and unmanned aerial vehicle (UAV) networks [2,3].

Fig. 1 illustrates an IDMA system for ${U}$ users [1]. It consists of ${U}$ transmitters associated with ${U}$ respective users, the single receiver that accommodates all the ${U}$ users, and the multiple access channel that bridges the transmitters and the receiver. While all the transmitters employ the same spreader, each of them adopts a distinct interleaver with a unique interleaving pattern different from the others, which is the key principle of enabling the IDMA. The subscript in the name of a block, e.g., interleaver${}_{u}$, indicates that the component is solely for the ${u}$th user. The receiver or the multiuser detector is subdivided into the elementary signal estimator (ESE) and ${U}$ user-specific processing blocks (UPBs). The ESE first calculates the log-likelihood ratios (LLRs) for the chips sent from the users, and UPB${}_{u}$ refines the LLRs associated with the ${u}$th user and feeds them back to the ESE. The ESE then recalculates the LLRs for UPBs based on the statistics of all the users' LLRs from UPBs. In such a manner, the ESE and UPBs exchange their LLRs several times to improve the reliability.

As stated above, to enable simultaneous accesses, every user transmits its data by employing its own interleaver of a distinct interleaving pattern. Then, the iterative multiuser detector reconstructs the original data from ${U}$ users by employing ${U}$ interleaver-deinterleavers pairs, each of which corresponds to the distinct pattern. As emphasized with gray in Fig. 1, the interleavers are integral in implementation of the systems.

Fig. 1. IDMA system for U users.

../../Resources/ieie/JSTS.2025.25.2.191/fig1.png

There have been numerous interleaver architectures in the literature. In [4], efficient methods to generate the interleavers were presented. The group random permutation was used in [5] to constitute interleaving patterns. In [6,7], partially parallel interleavers were developed. Common subexpressions were eliminated in [8]. The algebraic interleaving formula and its multi-stage architecture were proposed in [9,10], and have been widely employed [3,7,10,11,12,13,14,15,20]. Recently, [16] replaced the multiple stages by a fixed-latency look-up table (LUT), and [17] eliminated the conversion to further mitigate the latency.

All the above architectures, however, may not be suitable for harsh environments [3,15], where they may experience a single-event upset that causes the bits in registers to flip [18]. For example, tactical surveillance UAVs are usually subjected to jamming and cosmic rays [3]. Since the multiuser detection is performed chip-by-chip according to an interleaving pattern, it is of paramount importance to keep track of the current index at all times.

Under such circumstances, the following contributions are showcased in this paper.

• For the first time in the literature, we explicitly quantify how problematic the errors in interleavers are, and propose a fault-tolerant interleaver to efficiently deal with them.

• More specifically, to protect the interleaving index in the register from soft errors, an encoder and a decoder associated with an error-correcting code (ECC) are added.

• Furthermore, to mitigate the latency overhead of such an architecture, the encoder is merged with an essential logic that precomputes the next index.

• As a result, the proposed interleaver can tolerate soft errors while suppressing the latency overhead.

The remainder of this paper is organized as follows. Section II reviews the existing interleavers. In Sections III and IV, the proposed fault-tolerant interleaver is presented and evaluated, respectively. Concluding remarks are made in Section V.

II. EXISTING ARCHITECTURES

The multi-stage algebraic interleaver [10], which have been widely used in the literature [3,7,10,11,12,13,14,15,20], is drawn in Fig. 2. The up-counter first puts a sequence of ${j} = 0$, $1$, ..., ${J}-1$ at each clock cycle, where ${J}$ is the interleaving length. The subsequent ${S}$-stage interleaver converts ${j}$ into the interleaved index $\pi_{uS}(j)$ for the ${u}$th user, where $\pi_{uS}(j)$ is the ${S}$-time composition of $f_{us}(j)$ in [9] for ${s} = 1$, $2$, ..., ${S}$. More precisely,

(1)
$ \pi _{us} \left(j\right)=\left(f_{us} \circ f_{us-1} \circ \cdots \circ f_{u1} \right)\left(j\right) $
(2)
$ f_{us} \left(j\right)=\left[\frac{K_{us} j\left(j+1\right)}{2} \right] \mod~{J} $

where $K_{us}$ is a randomly chosen odd number for the ${u}$th user in the ${s}$th stage, and ${J}$ is a power of two. Each stage realizes (2) with two multipliers and one adder.

The ${S}$-time composition of $f_{us}(j)$ in Fig. 2 may suffer from a high latency, as it calculates the intermediate values of $\pi_{uS}(j)$ for all $1 \le s < S$ on the fly. Fig. 3 depicts the fixed-latency architecture [16] in which the multiple stages are replaced by a single LUT that readily returns $\pi_{uS}(j)$ corresponding to $j$. While the overall latency of Fig. 2 increases with ${S}$, that of Fig. 3 remains the same.

Fig. 2. Multi-stage architecture [10].

../../Resources/ieie/JSTS.2025.25.2.191/fig2.png

Fig. 3. LUT-based architecture [16].

../../Resources/ieie/JSTS.2025.25.2.191/fig3.png

Both of the above architectures [10,16] first generate $\{j\}$, and then convert it into $\{ \pi_{uS}(j)\}$. Such an approach may not be favorable in low-latency IDMA systems, as several operations of the system cannot be started until $\pi_{uS}(j)$ is available [1,12]. Accordingly, the conversion-less structure [17] in Fig. 4 was developed. It consists of a register that holds the current index $\pi_{uS}(j)$ and the next-index (NI) logic that precomputes the next index $\pi_{uS}(j+1)$ for the following cycle. In such a configuration, the output of the register is immediately $\{\pi_{uS}(j)\}$ without any further manipulations.

Fig. 4. Conversion-less architecture [17].

../../Resources/ieie/JSTS.2025.25.2.191/fig4.png

In summary, the key features of the interleaver architectures are compared in Table 1. The dependence on ${S}$ of [10], which elongates the overall latency, is resolved in [16]. The sequential-to-interleaved domain conversion, which defers the availability of $\pi_{uS}(j)$, is removed in [17]. The vulnerability to errors, which may corrupt the entire multiuser detection procedure, will be remedied by the proposed fault-tolerant interleaver to be unveiled in Section III. Note that the proposed interleaver inherits all the advantages of the prior arts, and serves as the only one with the fault tolerance.

Table 1. Key features of interleaver architectures.

Architecture

S-independent

Conversion-less

Fault-tolerant

Fig. 2 [10]

No

No

No

Fig. 3 [16]

Yes

No

No

Fig. 4 [17]

Yes

Yes

No

Proposed

Yes

Yes

Yes

III. PROPOSED ARCHITECTURE

This section first derives the baseline fault-tolerant interleaver. Subsequently, the overhead of empowering the fault tolerance is alleviated by integrating the two components therein.

1. Baseline Fault-Tolerant Architecture

Although the existing interleavers may operate well in gentle environments, none of them is prepared for soft errors that may occur in harsh surroundings. Hence, we develop the fault-tolerant interleaver by renovating the state-of-the-art structure in Fig. 4 with the minimal interleaving latency [17]. Since the register in Fig. 4 is prone to soft errors, we protect it by employing an error-correcting code. In other words, the next index $\pi_{uS}(j+1)$ is encoded before being stored into the register, and decoded after being retrieved. The corresponding architecture is drawn in Fig. 5, where the encoder and the decoder are associated with a linear block code of generator matrix $\mathbf{G}$ and parity-check matrix $\mathbf{H}$. Let us trace the datapath starting from the current index $\pi_{uS}(j)$. $\mathbf{b}(j)$ is the binary representation of $\pi_{uS}(j)$. Given $\mathbf{b}(j) = \pi_{uS}(j)$, the next-index (NI) logic [17] puts $\mathbf{b}({j}+1) = \pi_{uS}({j}+1)$. The encoder is a cluster of XOR gates [19] that computes the codeword $\mathbf{c}(j+1) = \mathbf{b}({j}+1)\mathbf{G}$ for the register. While the register in Fig. 4 holds $\pi_{uS}(j+1)$ of $\log_{2} J$ bits, the register in Fig. 5 holds $\mathbf{c}(j+1)$ of $\log_{2} J + p$ bits, where $p$ is the number of parity bits. When the clock ticks, a new cycle starts. $\mathbf{r}({j})$ is a retrieved codeword that may have bit errors. The decoder is of a common structure [19] that first calculates the syndrome $\mathbf{s}(j) = \mathbf{r}(j)\mathbf{H}^{T}$, detects the error pattern $\mathbf{e}(j)$ corresponding to $\mathbf{s}(j)$, and restores the error-corrected codeword $\mathbf{c}(j) = \mathbf{r}(j) \oplus \mathbf{e}(j)$, where $\oplus$ denotes the Galois-field addition or the XOR operation. Due to the systematic property of the code, a $\log_{2} J$-bit subset of $\mathbf{c}(j)$ is equal to $\mathbf{b}(j) = \pi_{uS}(j)$.

Fig. 5. Fault-tolerant architecture with separate NI logic and encoder.

../../Resources/ieie/JSTS.2025.25.2.191/fig5.png

2. Integration of NI Logic and Encoder

While the inclusion of the encoder and decoder makes Fig. 5 tolerant of errors, the overall latency is inevitably increased. To alleviate such an overhead, we integrate the functionalities of the NI logic and the encoder, and the resulting architecture is depicted in Fig. 6. As its name implies, the encoded NI logic therein puts $\mathbf{c}(j+1)$ for given $\mathbf{b}(j)$. The construction of the logic is exemplified for ${J} = 8$, ${S} = 3$, $K_{u1} = 3$, $K_{u2} = 5$, $K_{u3} = 7$, and

(3)
$ \mathbf G=\left[\begin{array}{cccccc} {1} & {1} & {0} & {1} & {0} & {0} \\ {0} & {1} & {1} & {0} & {1} & {0} \\ {1} & {1} & {1} & {0} & {0} & {1} \end{array}\!\!\right] $

in Fig. 7, where $\pi_{uS}({j})$, $\mathbf{b}({j})$, and $\mathbf{c}({j})$ are enumerated for all ${j} = 0$, $1$, ..., ${J}-1$. Note that the systematic property of $\mathbf{G}$ ensures that the $\log_{2} {J} = 3$ rightmost bits of $\mathbf{c}({j})$ are the same as $\mathbf{b}({j})$. When the logic takes $\mathbf{b}(0) = 000$, for instance, it returns $\mathbf{c}(1) = \mathbf{b}(1)\mathbf{G} = 100011$, as paired by the red segment. For $\mathbf{b}(1) = 011$, it puts $\mathbf{c}(2) = \mathbf{b}(2)\mathbf{G} = 111001$, as bound by the blue segment. In the same manner, all the pairs of $\mathbf{b}({j})$ and $\mathbf{c}({j}+1)$ are tabulated as the input and the output of the combinational logic, respectively.

Fig. 6. Fault-tolerant architecture with the encoded NI logic.

../../Resources/ieie/JSTS.2025.25.2.191/fig6.png

Fig. 7. Construction of the encoded NI logic.

../../Resources/ieie/JSTS.2025.25.2.191/fig7.png

Note that all the interleavers in Figs. 2-6 provide the same output $\pi_{uS}(j)$ for $j = 0$, $1$, ..., $J-1$. This indicates that the error-correcting capability of the proposed architecture is completely transparent, making it possible to replace the error-prone interleavers in IDMA systems without any modification of the interface.

IV. EVALUATION

This section assesses how robust the proposed interleaver is in harsh environments in terms of bit-error rates (BERs), and how practical it is in terms of implementation results.

1. BER Performance

Fig. 8 exhibits the BERs of the iterative multiuser detector in Fig. 1, where $p_{e}$ denotes the bit-error probability of the register in an interleaver. The simulations were conducted for the most prevalent set of parameters in the literature [3,7,8,12,13,14,15,16,17,20,21], i.e., the interleaving length $J = 8192$, the spreading factor $P = 16$, the number of users $U = 16$, and the number of maximally allowed iterations $I = 6$. Fig. 8(a) corresponds to the case when no ECC is employed. As shown, $p_{e} \ge 10^{-4}$ severely deteriorates the BER, hindering the practical use. $p_{e} \le 10^{-5}$ can be regarded as near-optimal. Figs. 8(b) and 8(c) are the cases when at most one and two errors among $\log_{2} J$ bits in the register can be corrected, respectively. In Figs. 8(b) and 8(c), only $p_{e} \ge 10^{-2}$ remains problematic, while $p_{e} \le 10^{-3}$ is fully operational. By virtue of one additional error-correcting capability, $p_{e} = 10^{-2}$ in Fig. 8(c) demonstrates a significant improvement over the counterpart in Fig. 8(b), despite still not robust enough to be used in practice. Although the interleavers with higher error-correcting capabilities may tolerate the extreme environments of $p_{e}\ge 10^{-2}$, such topologies will draw too many resources in exchange for the marginal gains. Accordingly, in Section IV.2, we will consider a single-error-correcting code of the minimal overhead yet with a significant fault tolerance.

Fig. 8. Bit-error rates (BERs) when (a) zero, (b) one, and (c) two errors in the register of an interleaver can be corrected.

../../Resources/ieie/JSTS.2025.25.2.191/fig8.png

Fig. 9. Orders of processing when errors occur (a) zero, (b) one, and (c) two times during $J = 16$ cycles.

../../Resources/ieie/JSTS.2025.25.2.191/fig9.png

Before moving on to the next analysis, an interesting phenomenon that contradicts the expectation can be observed from Figs. 8(a) and 8(b), e.g., the BER of $p_{e} =10^{-1}$ is slightly lower than that of $p_{e} = 10^{-2}$. The underlying reason is exemplified for $J = 16$ in Fig. 9, where the hexadecimal numbers are the cycle counts. Fig. 9(a) is an error-free case that the cycle counts from the left to the right are perfectly in an order of the sequential indices ${j} = 0$, $1$, ..., $J- 1$. Fig. 9(b) exemplifies that an error occurs after the 4th cycle, and ${L} = 3$ indices are bypassed. As the index reaches ${J}- 1$, it starts back from 0 due to the overflow, and wastes ${L}$ cycles to redundantly process the ${E}$ indices at the front. After all ${J}$ cycles, the interleaver fails to secure the data of ${L}$ indices, degrading the BER. On the other hand, Fig. 9(c) corresponds to the case that errors occur twice. After the 4th cycle, the first error bypasses ${L} = 3$ indices in the same manner as Fig. 9(b). After the 11th cycle, the second error fortunately returns back to the middle of the bypassed indices, reducing the number of missing data to be less than ${L}$. Then, the BER deterioration can be slightly alleviated. As exemplified, while the case of only one error has no way to recover ${L}$ lost indices, a few more errors may grant us chances to make up for some previous mistakes.

In summary, as shown in Fig. 8, the BER of the multiuser detector is drastically deteriorated in harsh environments if the interleavers are not fault-tolerant. On the other hand, the error-correcting capability of the proposed architecture makes it possible to endure the harsh environments in practice, sustaining the BER to be near-optimal.

Table 2 summarizes the implementation results of the interleavers in a 65-nm CMOS process for ${S} = 3$ and ${J} = 2^{9}$, $2^{10}$, $2^{11}$, $2^{12}$, $2^{13}$. Table 3 compiles the results for ${J} = 2^{13}$ and $S = 3$, $4$, $5$, $6$, $7$. The parameters have been widely adopted in the literature [3,7,8,10,11,12,13,14,15,16,17,20,21]. Synopsys Design Compiler was used to synthesize the designs and to measure the following metrics. $t_{\pi1}$, $t_{\pi2}$, and $t_{\rm FP}$ are important latency metrics to be defined and scrutinized soon. Equivalent gate counts (EGCs) were calculated by regarding a two-input NAND gate as one. The switching activities were first extracted from the testbenches of practical operation scenarios in the switching activity interchange format (SAIF). Then, the SAIF file was back-annotated when measuring power consumptions in Synopsys Design Compiler. All the metrics were averaged over 16 different sets of $\{K_{us}\}$. As a matter of course, only the proposed ones in Figs. 5 and 6 are fault-tolerant as they are protected by the single-error-correcting Hamming code for $\log_{2} J$ bits.

Table 2. Implementation results for $S = 3$ and $J = 2^9$, $2^{10}$, $2^{11}$, $2^{12}$, $2^{13}$.

J

Architecture

Fault tolerance

tπ1 (ns)

tπ2 (ns)

tFP (ns)

EGC

Power (mW)

29

Fig. 2 [10]

No

1.94

-

-

3451

0.57

Fig. 3 [16]

No

0.70

-

-

1847

0.33

Fig. 4 [17]

No

0.15

-

0.72

2626

0.89

Fig. 5

Yes

0.15

0.52

1.35

3057

0.68

Fig. 6

Yes

0.15

0.56

1.17

4259

0.82

210

Fig. 2 [10]

No

2.09

-

-

4449

0.71

Fig. 3 [16]

No

0.78

-

-

2905

0.51

Fig. 4 [17]

No

0.17

-

0.78

5197

1.32

Fig. 5

Yes

0.16

0.55

1.44

6100

1.01

Fig. 6

Yes

0.15

0.55

1.25

7534

1.22

211

Fig. 2 [10]

No

2.15

-

-

05366

0.85

Fig. 3 [16]

No

0.84

-

-

05054

0.83

Fig. 4 [17]

No

0.17

-

0.84

10503

2.08

Fig. 5

Yes

0.17

0.54

1.51

11250

1.48

Fig. 6

Yes

0.16

0.55

1.32

14176

1.90

212

Fig. 2 [10]

No

2.28

-

-

06468

1.00

Fig. 3 [16]

No

0.90

-

-

09677

1.52

Fig. 4 [17]

No

0.16

-

0.90

20659

3.40

Fig. 5

Yes

0.15

0.59

1.64

20592

2.15

Fig. 6

Yes

0.16

0.59

1.45

28291

3.06

213

Fig. 2 [10]

No

2.35

-

-

07644

1.16

Fig. 3 [16]

No

0.97

-

-

15693

2.15

Fig. 4 [17]

No

0.17

-

0.97

38521

5.13

Fig. 5

Yes

0.14

0.59

1.73

37457

3.16

Fig. 6

Yes

0.17

0.60

1.53

49392

4.43

Table 3. Implementation results for $J = 2^{13}$ and $S = 3$, $4$, $5$, $6$, $7$.

S

Architecture

Fault tolerance

tπ1 (ns)

tπ2 (ns)

tFP (ns)

EGC

Power (mW)

3

Fig. 2 [10]

No

2.35

-

-

07644

1.16

Fig. 3 [16]

No

0.97

-

-

15693

2.15

Fig. 4 [17]

No

0.17

-

0.97

38521

5.13

Fig. 5

Yes

0.14

0.59

1.73

37457

3.16

Fig. 6

Yes

0.17

0.60

1.53

49392

4.43

4

Fig. 2 [10]

No

2.89

-

-

10022

1.27

Fig. 3 [16]

No

0.98

-

-

19363

2.57

Fig. 4 [17]

No

0.16

-

1.03

38889

5.09

Fig. 5

Yes

0.14

0.58

1.72

38698

3.24

Fig. 6

Yes

0.18

0.60

1.54

48489

4.34

5

Fig. 2 [10]

No

3.39

-

-

12205

1.35

Fig. 3 [16]

No

0.99

-

-

21891

2.84

Fig. 4 [17]

No

0.17

-

1.05

38403

4.88

Fig. 5

Yes

0.14

0.58

1.73

38700

3.20

Fig. 6

Yes

0.17

0.60

1.54

48995

4.28

6

Fig. 2 [10]

No

3.88

-

-

14519

1.43

Fig. 3 [16]

No

1.00

-

-

24877

3.14

Fig. 4 [17]

No

0.17

-

1.05

38574

4.97

Fig. 5

Yes

0.14

0.60

1.72

38979

3.22

Fig. 6

Yes

0.15

0.60

1.55

48807

4.28

7

Fig. 2 [10]

No

4.34

-

-

16643

1.49

Fig. 3 [16]

No

1.01

-

-

27600

3.41

Fig. 4 [17]

No

0.18

-

1.05

38957

4.96

Fig. 5

Yes

0.15

0.58

1.72

38850

3.21

Fig. 6

Yes

0.17

0.60

1.54

48900

4.28

2. Implementation Results

To analyze the latencies, let $t_{\rm C2Q}$ and $t_{\rm SU}$ denote the clock-to-Q delay and the setup time of the register, respectively. Let $t_{\rm ENC}$, $t_{\rm DEC}$, $t_{\rm NIL}$, and $t_{\rm ENIL}$ indicate the delays of the encoder, the decoder, the NI logic, and the encoded NI logic, respectively. $t_{\pi1}$ is the latency from the clock edge to the time that error-prone $\pi_{uS}(j)$ becomes ready. Since Figs. 4-6 are all conversion-less [17], $t_{\pi1}$ of them are equal to the momentary tC2Q, and are much lower than those of Figs. 2 and 3. Note that the output of the register in Figs. 5 and 6, $\mathbf{r}(j)$, consists of error-prone $\pi_{uS}(j)$ and the parity bits, as the code is systematic. $t_{\pi2}$ is the latency from the clock edge to the time that error-corrected $\pi_{uS}(j)$ becomes available. $t_{\pi2}$ can be measured only for the fault-tolerant ones, and $t_{\rm DEC} = t_{\pi2} - t_{\pi1}$. Since the decoders in Figs. 5 and 6 are not different from each other, $t_{\pi2}$ of them are almost the same. $t_{\rm FP}$ is the latency of the entire feedback path. $t_{\rm FP}$ of Figs. 4, 5, and 6 can be formulated as $t_{\rm C2Q} + t_{\rm NIL} + t_{\rm SU}$, $t_{\rm C2Q} + t_{\rm DEC} + t_{\rm NIL} + t_{\rm ENC} + t_{\rm SU}$, and $t_{\rm C2Q} + t_{\rm DEC} + t_{\rm ENIL} + t_{\rm SU}$, respectively. The inevitable increase of $t_{\rm FP}$ due to the adoption of the ECC is indeed suppressed by the encoded NI logic that substitutes the series of the NI logic and the encoder, as $t_{\rm FP}$ of Fig. 6 is shorter than that of Fig. 5. In exchange for such a benefit, Fig. 6 occupies a larger area and dissipates a higher power than Fig. 5. Compared with the overall complexity of the entire multiuser detector, however, the overhead is not significant, as the detector includes hundreds of kilo-gates and even those gates are overwhelmed by the vast area of internal memories [7,10,11,12,13,14,15]. Accordingly, the proposed scheme can be a promising and practical solution to effectively overcome the harsh environments.

V. CONCLUSIONS

The fault-tolerant interleaver combined with the error-correcting code has been presented for IDMA systems operating in harsh environments. The latency overhead has been alleviated by merging the encoder with the NI logic. While the proposed architecture has been exemplified for the Hamming codes only, any linear block code can be adopted to further enhance the fault tolerance.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation (NRF) of Korea under Grant 2022R1I1A3 064200 funded by the Ministry of Education, and Grant 2020R1G1A1102620 funded by the Ministry of Science and ICT. Electronic design automation (EDA) tools were provided by the IC Design Education Center (IDEC), South Korea.

References

1 
L. Ping, L. Liu, K. Wu, and W. K. Leung, ``Interleave-division multiple-access,'' IEEE Transactions on Wireless Communications, vol. 5, no. 4, pp. 938-947, Apr. 2006.DOI
2 
C. Zhang, Y.-H. Huang, F. Sheikh, and Z. Wang, ``Advanced baseband processing algorithms, circuits, and implementations for 5G communication,'' IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 7, no. 4, pp. 477-490, Dec. 2017.DOI
3 
B. Y. Kong, ``Bitwise early termination of multiuser detection for IDMA systems,'' IEEE Communications Letters, vol. 25, no. 9, pp. 2998-3002, Sep. 2021.DOI
4 
I. Pupeza, A. Kavčić, and L. Ping, ``Efficient generation of interleavers for IDMA,'' Proc. of IEEE International Conference on Communications (ICC), Istanbul, Turkey, pp. 1508-1513, Jun. 2006.DOI
5 
W. Belaoura, M. Djeddou, and K. Ghanem, ``GRP-based interleaver for IDMA systems over frequency selective channel,'' Electronics Letters, vol. 51, no. 18, pp. 1462-1464, Sep. 2015.DOI
6 
S. Wu, X. Chen, and S. Zhou, ``A parallel interleaver design for IDMA systems,'' Proc. of International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, pp. 1-5, Nov. 2009.DOI
7 
B. Y. Kong and I.-C. Park, ``Parallel IDMA architecture based on interleaving with replicated subpatterns,'' Proc. of IEEE International Conference on Communications (ICC), Shanghai, China, pp. 1-6, May 2019.DOI
8 
B. Y. Kong and I.-C. Park, ``Efficient implementation of multiple interleavers in IDMA for 5G,'' Proc. of International SoC Design Conference (ISOCC), Daegu, South Korea, pp. 119-120, Nov. 2018.DOI
9 
O. Y. Takeshita and D. J. Costello, ``New classes of algebraic interleavers for turbo-codes,'' Proc. of IEEE International Symposium on Information Theory (ISIT), Cambridge, MA, USA, p. 419, Aug. 1998.DOI
10 
S. Yoshizawa, M. Nozaki, and H. Tanimoto, ``VLSI implementation of an interference canceller using dual-frame processing for OFDM-IDMA systems,'' IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E98-A, no. 3, pp. 811-819, Mar. 2015.DOI
11 
T. T. T. Nguyen, L. Lanante, S. Yoshizawa, and H. Ochi, ``Low latency IDMA with interleaved domain architecture for 5G communications,'' IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 7, no. 4, pp. 582-593, Dec. 2017.DOI
12 
B. Y. Kong and I.-C. Park, ``A memory-efficient IDMA architecture based on on-the-fly dispreading,'' IEEE Journal of Solid-State Circuits, vol. 53, no. 11, pp. 3327-3337, Nov. 2018.DOI
13 
B. Y. Kong and I.-C. Park, ``A 120-mW 0.16-ms-latency connectivity-scalable multiuser detector for interleave division multiple access,'' IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 3, pp. 470-474, Mar. 2020.DOI
14 
B. Y. Kong and I.-C. Park, ``Improved parallel-IDMA architecture with low-complexity elementary signal estimators,'' Proc. of IEEE International Symposium on Circuits and Systems (ISCAS), Seville, Spain, pp. 1-4, Oct. 2020.DOI
15 
B. Y. Kong, ``A 97-mW bitwise-early-terminating multiuser detector for IDMA systems,'' IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 69, no. 8, pp. 3390-3394, Aug. 2022.DOI
16 
B. Y. Kong, ``Fixed-latency architecture for multi-stage algebraic interleavers in interleave division multiple access systems,'' Electronics Letters, vol. 58, no. 22, pp.DOI
17 
B. Y. Kong, ``Conversion-less algebraic interleaver architecture for low-latency IDMA systems,'' Electronics Letters, vol. 59, no. 13, pp. 1-3, Jul. 2023.DOI
18 
R. C. Baumann, ``Radiation-induced soft errors in advanced semiconductor technologies,'' IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, pp. 305-316, Sep. 2005.DOI
19 
S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Applications, 2nd ed., Pearson Prentice Hall, Upper Saddle River, NJ, USA, 2004.URL
20 
B. Y. Kong, ``Statistical analysis of bitwise early termination for iterative multiuser detection in IDMA systems,'' Proc. of International Conference on Electronics, Information and Communications (ICEIC), Taipei, Taiwan, pp. 1-3, Jan. 2024.DOI
21 
B. Y. Kong, ``Low-complexity address generation for multiuser detectors in IDMA systems,'' Electronics, vol. 9, no. 12, 2069, Dec. 2020.DOI
Byeong Yong Kong
../../Resources/ieie/JSTS.2025.25.2.191/author1.png

Byeong Yong Kong received his B.S., M.S., and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2011, 2013, and 2017, respectively. From 2017 to 2018, he was a Senior Researcher with the Agency for Defense Development, Daejeon, where he was involved in the research of guided missile systems. From 2018 to 2019, he was a Postdoctoral Researcher and a Research Assistant Professor with KAIST. Since 2019, he has been with the Division of Electrical, Electronic, and Control Engineering, Kongju National University, Cheonan, South Korea, where he is currently an Associate Professor. His research interests include digital circuit design and very large scale integration (VLSI) signal processing for communications. Dr. Kong was a recipient of the 1st Place Award at the Altera FPGA Design Contest in 2015.