
  1. (Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, USA)
  2. (Department of Electronic Engineering, Daegu University, Gyeongsan, Korea)



Null convention logic, gate diffusion input, HYBRID implementation, ripple carry adder

I. INTRODUCTION

The clocked synchronous paradigm currently dominates the semiconductor design industry (1). However, this synchronous approach has major drawbacks, including critical timing analysis and clock skew issues (2). Typically, a precise clock distribution network is used to address these limitations, and designing it is a tedious and complex task. Moreover, as feature sizes decrease, the power consumption of the clock distribution network increases rapidly, which is a major limiting factor for the emerging low-power semiconductor industry (3). Since asynchronous designs consume less power and produce less noise and electromagnetic interference (EMI) than their synchronous counterparts, there is renewed interest in this field (4).

Asynchronous circuits fall into two classes: bounded-delay and delay-insensitive (DI) models (3). Bounded-delay models assume that both gate and wire delays are bounded and therefore require extensive timing analysis to determine the delay (1). DI circuits, on the other hand, assume that the delays of both interconnects and logic elements are unbounded and that wire forks within components are isochronic (5). Wires connecting the components, however, do not need to adhere to this isochronic fork assumption, which ensures correct operation regardless of input availability. Hence, DI circuits require little timing analysis and yield average-case performance rather than the worst-case performance of bounded-delay and traditional synchronous paradigms (6).

The literature provides several DI paradigms, such as Seitz’s, DIMS, Anantharaman’s, Singh’s, David’s, Phased Logic, and NULL Convention Logic (NCL) (1). Most of these DI methods (Seitz’s, DIMS, Anantharaman’s, Singh’s, David’s, Phased Logic) depend on either C-elements or synchronous design to achieve delay insensitivity. A more elaborate description of these DI methodologies can be found in (7). Conversely, the NCL methodology uses a library of state-holding gates with hysteresis to attain delay insensitivity. These gates enable transistor-level optimization, which helps reduce the overall circuit area (8). Hence, NCL is the best alternative for integrating asynchronous digital design into the predominantly synchronous semiconductor design industry.

Fig. 1. Dual-rail representation of NCL AND function: Z = X • Y: initially X=DATA1 and Y=DATA0, so Z=DATA0; next X and Y both transition to NULL, so Z transitions to NULL; then X and Y both transition to DATA1, so Z transitions to DATA1 (7).


Fig. 2. NCL system framework (3).


The NCL paradigm consists of 27 state-holding logic gates with hysteresis (2). These gates are traditionally implemented using one of four CMOS techniques: static, semi-static, differential, or dynamic. Detailed information regarding these approaches can be found in (8). Among them, the static CMOS method is the most commonly used, as it results in less leakage and noise than the other CMOS techniques (9). However, the main drawback of the static CMOS implementation is its area overhead: the area occupied by a static CMOS NCL implementation is approximately 1.5 to 2 times that of the equivalent synchronous design (7). To address this drawback, this paper proposes a HYBRID technique for designing NCL circuits. The proposed approach integrates both CMOS and gate diffusion input techniques to realize NCL designs.

Gate diffusion input (GDI) methodology is a low-power design technique that uses only two transistors to implement a variety of functions (1). The input configurations required to implement the various functions can be found in (9-12). Hence, applying the GDI methodology to realize NCL gates reduces the total transistor count, which in turn reduces switching power. However, the biggest drawback of the GDI methodology is the voltage drop at the output, which results in performance degradation (12-15). Hence, this work proposes a new HYBRID methodology that addresses both limitations: the voltage drop of GDI and the area overhead of CMOS.

The aim of this work is to use both CMOS and GDI techniques to design NCL circuits. Using both CMOS-based and GDI-based NCL gates in the same circuit reduces the transistor count compared to the conventional static CMOS approach. To validate the proposed approach, a variety of NCL up-counter increment (NUI) circuits were realized and compared with their static CMOS implementations. The proposed approach shows a minimum reduction of 4% in transistor count compared with the static CMOS approach.

The rest of the paper is organized as follows: Section II discusses the preliminaries and reviews NCL and GDI. Section III presents the design of the HYBRID methodology. Section IV presents the simulation results, followed by the conclusion in Section V.

II. PRELIMINARIES AND REVIEW

1. Null Convention Logic

NCL is a clockless DI model that works correctly regardless of input availability. It is a self-timed logic model in which data and control are integrated into a single signal and communication is accomplished through local handshaking (1). To provide synchronization, DATA and NULL states are used, which are obtained using dual-rail or quad-rail logic. A dual-rail signal D uses two wires, D0 and D1, to represent the values DATA0, DATA1, and NULL (4), as shown in Fig. 1. The NULL state (D0 = 0, D1 = 0) indicates that the value of D is not yet available. The DATA0 state (D0 = 1, D1 = 0) and the DATA1 state (D0 = 0, D1 = 1) represent Boolean logic 0 and 1, respectively (8). The two rails are mutually exclusive and cannot be asserted at the same time; if both rails are high, the state is invalid (illegal).
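The dual-rail encoding described above can be sketched in a few lines of Python. This is only an illustrative model of the signal convention; the names `DualRail`, `encode`, and `decode` are hypothetical helpers introduced here and do not come from the paper.

```python
from enum import Enum


class DualRail(Enum):
    """Dual-rail states written as (D0, D1) rail values."""
    NULL = (0, 0)   # value not available
    DATA0 = (1, 0)  # Boolean 0
    DATA1 = (0, 1)  # Boolean 1


def encode(bit):
    """Return the (D0, D1) rails for a Boolean bit; None maps to NULL."""
    if bit is None:
        return DualRail.NULL.value
    return DualRail.DATA1.value if bit else DualRail.DATA0.value


def decode(d0, d1):
    """Map rail values to a DualRail state, rejecting the illegal (1, 1) case."""
    if d0 == 1 and d1 == 1:
        raise ValueError("illegal state: both rails asserted")
    return DualRail((d0, d1))
```

For instance, `decode(*encode(1))` yields `DualRail.DATA1`, while `decode(1, 1)` raises an error because the rails are mutually exclusive.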

The framework of an NCL system is shown in Fig. 2. As the figure shows, combinational logic (CL) is always sandwiched between two DI registers, and adjacent DI registers communicate through the request and acknowledge signals ki and ko. NCL uses a special set of logic elements known as threshold gates to realize the combinational logic, DI registers, and completion detection circuits (3).

Fig. 3. (a) THmn threshold gate, (b) TH34w2 threshold gate (3).


Fig. 4. Transistor level implementation of TH23 gate using Static CMOS methodology (7).


Fig. 5. Transistor level implementation of TH23 gate using semi-static methodology (7).


There are 27 threshold gates. The primary type of threshold gate, depicted in Fig. 3(a), is known as THmn, where 1 ${\leq}$ m ${\leq}$ n (2). Here, n is the number of inputs and m is the number of inputs that must be asserted for the output to be asserted. The second type, illustrated in Fig. 3(b), is the weighted threshold gate, denoted THmnWw1w2…wR. Its constraint is w1, w2, …, wR $>$ 1, where w1, w2, …, wR are the integer weights of input1, input2, …, inputR, respectively (5). For example, in the TH34w2 gate of Fig. 3(b), the first input has a weight of 2. These threshold gates have built-in hysteresis behavior to ensure delay insensitivity. Hysteresis in NCL ensures that two DATA wavefronts never overwrite each other and are always separated by a NULL wavefront.

The general algebraic expression of an NCL gate combines a set equation and a hold equation. The set equation defines the functionality of the gate, and the hold equation determines how long the output stays asserted once it has been asserted. The set equation of each NCL gate is given in (1), whereas the hold equation is the same for every gate: the OR of all the inputs. Therefore, the general equation of an NCL gate is Z = set + (Z- • hold), where Z- is the previous output value and Z is the current value. The prevailing methodologies for realizing NCL circuits are static and semi-static CMOS. Figs. 4 and 5 depict the static and semi-static CMOS implementations of the TH23 gate.
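The behavior Z = set + (Z- • hold) can be illustrated with a small behavioral model. The sketch below is an assumption-laden illustration rather than the authors' implementation: the class name `ThresholdGate` and its `step` method are hypothetical, the set condition is modeled as "weighted sum of asserted inputs reaches m", and the hold condition is the OR of all inputs as stated above.

```python
class ThresholdGate:
    """Behavioral model of an NCL threshold gate with hysteresis.

    m       -- threshold
    weights -- one integer weight per input (all 1 for a plain THmn gate)
    """

    def __init__(self, m, weights):
        self.m = m
        self.weights = list(weights)
        self.z = 0  # previous output value (Z-)

    def step(self, inputs):
        """Apply one input vector and return the new output Z."""
        if len(inputs) != len(self.weights):
            raise ValueError("wrong number of inputs")
        weighted_sum = sum(w * x for w, x in zip(self.weights, inputs))
        set_term = weighted_sum >= self.m   # gate functionality
        hold_term = any(inputs)             # OR of all inputs
        self.z = int(set_term or (self.z and hold_term))
        return self.z


# TH23: asserted once two of three inputs are high, held until all are low.
th23 = ThresholdGate(2, [1, 1, 1])
trace = [th23.step(v) for v in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 0)]]
```

In this trace the output stays high for the fourth vector even though only one input remains asserted, which is the hysteresis that keeps successive DATA wavefronts separated by NULL. A weighted gate such as TH34w2 would be modeled as `ThresholdGate(3, [2, 1, 1, 1])`.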

As depicted in Fig. 5, the semi-static implementation requires only the set and set’ expressions to realize the TH23 gate. To achieve hysteresis, it uses weak feedback inverters, which slow down gate operation and lead to a large latency overhead. This limitation is addressed by the static CMOS implementation, which uses pull-up (set) and pull-down (reset) networks, as shown in Fig. 4.

As observed from the figure, the additional circuitry is required to maintain the built-in hysteresis property of NCL gates. This leads to an area overhead where NCL designs are approximately 1.5 - 2 times larger than the equivalent synchronous designs (7).

Therefore, it is crucial to address this limitation so that NCL designs become a viable alternative to synchronous designs. This drawback can be addressed by using a low-power design methodology called GDI.

2. Gate Diffusion Input Method

GDI is a low-power design technique commonly used in synchronous design to reduce area and dynamic power consumption (1). The structure of the basic GDI cell is depicted in Fig. 6(a). It has three inputs: G (the common gate input of the nMOS and the pMOS), P (the input to the source/drain of the pMOS), and N (the input to the source/drain of the nMOS). The bulks of the nMOS and pMOS transistors are permanently connected to GND and V$_{\mathrm{DD}}$, respectively (6).

The logic functions of the GDI cell for different input configurations are listed in Fig. 6(b). Since the pull-up and pull-down networks of these functions are not always connected to the power supply (V$_{\mathrm{DD}}$) and ground (GND), a voltage drop is observed at the output. This is the biggest limitation of GDI-based implementations (9-15). Likewise, when NCL gates are realized using the GDI technique, degraded voltage swings appear in the circuit. Therefore, this work mainly focuses on addressing this issue so that the GDI technique can be used to realize NCL circuits.
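At the logic level, the basic GDI cell behaves like a two-way selector: the pMOS conducts when G = 0 and passes P, and the nMOS conducts when G = 1 and passes N. The sketch below models only this ideal logic behavior and ignores voltage drops. The input configurations are the ones commonly tabulated in the GDI literature (10-12) and are assumed here to correspond to Fig. 6(b); the function names are hypothetical.

```python
def gdi_cell(g, p, n):
    """Ideal logic model of a GDI cell: output follows P when G=0, N when G=1."""
    return p if g == 0 else n


def f1(a, b):
    """F1 = A'B  (G=A, P=B, N=0)."""
    return gdi_cell(g=a, p=b, n=0)


def f2(a, b):
    """F2 = A' + B  (G=A, P=1, N=B)."""
    return gdi_cell(g=a, p=1, n=b)


def gdi_or(a, b):
    """OR = A + B  (G=A, P=B, N=1)."""
    return gdi_cell(g=a, p=b, n=1)


def gdi_and(a, b):
    """AND = AB  (G=A, P=0, N=B)."""
    return gdi_cell(g=a, p=0, n=b)


def gdi_mux(a, b, c):
    """MUX = A'B + AC  (G=A, P=B, N=C)."""
    return gdi_cell(g=a, p=b, n=c)


def gdi_not(a):
    """NOT = A'  (G=A, P=1, N=0)."""
    return gdi_cell(g=a, p=1, n=0)
```

Each function uses a single two-transistor cell, which is why GDI-based gates need fewer transistors than their static CMOS equivalents.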

Fig. 6. (a) Basic GDI cell structure, (b) Different functions input configurations (10).


Fig. 7. FNCL implementation of TH22 gate.


The next section presents the mechanism for realizing NCL gates using the GDI methodology. The efficacy of the proposed approach is verified by realizing several NUI circuits and comparing them with their static CMOS implementations.

III. THE PROPOSED HYBRID METHODOLOGY

First, the mechanism for realizing NCL gates using the GDI methodology, referred to as the FNCL approach, is discussed. The approach is named FNCL because it uses the F1 and F2 functions unique to GDI, shown in Fig. 6(b). The voltage-drop limitation of the FNCL approach is then presented. Finally, the HYBRID methodology, which uses both CMOS and GDI techniques to address the area overhead of NCL designs, is described in detail.

1. Realization of NCL using FNCL Approach (based on F1 and F2 functions of GDI gate)

To realize NCL threshold gates using the FNCL approach, the Boolean expression of the THmn gate is first factorized. Then, based on the factorized expression, the GDI functions F1, F2, and MUX are used to implement the gate. As an example, the steps for realizing the TH22 gate are shown below:

Step 1: Factorized expression of TH22 gate is:

Z = AB + Z (A+B)

where A and B are the inputs and Z is the output.

Step 2: GDI functions are used to realize the AND (AB) and OR (A+B) expressions. Among all the GDI functions, only F1 and F2 are used, because they exhibit a voltage drop for only one input combination, unlike the other (AND, OR) functions. Hence, F1 and F2 are used to implement AB and A+B, as shown in Fig. 7.

Step 3: A GDI MUX cell determines the final output, i.e., whether to pass the set data or the hold data based on the previous result. The MUX is configured such that the output of the F1 cell (AB) and the O2 output of the F2 cell (A+B) are fed to the sources of the pMOS and nMOS, respectively, as shown in Fig. 7. This selects the set equation (AB) when Z = 0 and the hold data (A+B) when Z = 1.

Therefore, the proposed FNCL approach requires only eight transistors to implement the TH22 gate, a 20% reduction in transistor count compared to the static CMOS approach. However, the main drawback of this approach is the performance degradation caused by the substantial voltage drop at the final output. This limitation and the mechanism to address it are discussed in the following subsections.
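The MUX-based selection in Step 3 is logically equivalent to the factorized expression of Step 1, which can be checked exhaustively. The following is a logic-level sketch only (no voltage effects); the function names are hypothetical and `z_prev` stands for the previous output fed back to the MUX select input.

```python
from itertools import product


def th22_factored(a, b, z_prev):
    """Factorized TH22 expression from Step 1: Z = AB + Z(A+B)."""
    return (a & b) | (z_prev & (a | b))


def th22_mux(a, b, z_prev):
    """Step 3 view: the MUX passes set data (AB) when Z=0, hold data (A+B) when Z=1."""
    set_data = a & b    # produced by the F1-based cell
    hold_data = a | b   # produced by the F2-based cell
    return hold_data if z_prev else set_data


def forms_are_equivalent():
    """Compare both forms over all eight (A, B, Z) combinations."""
    return all(
        th22_factored(a, b, z) == th22_mux(a, b, z)
        for a, b, z in product((0, 1), repeat=3)
    )
```

The equivalence holds because AB + (A+B) simplifies to A+B when Z = 1, so selecting between the two terms loses no information.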

2. Performance Degradation of FNCL Approach

The performance degradation arises from the different input configurations of the nMOS and pMOS transistors. When their sources are not tied to V$_{\mathrm{DD}}$ and GND, respectively, the pMOS transistor passes a degraded low level of V$_{tp}$ (the pMOS threshold voltage) and the nMOS transistor passes a degraded high level of V$_{\mathrm{DD}}$-V$_{tn}$ (where V$_{tn}$ is the nMOS threshold voltage) (9).

Fig. 8. Simulation results demonstrating the voltage drop of the FNCL TH22 gate.

Fig. 9. Structure of FNCL FA.


To demonstrate this phenomenon, the simulation results of the proposed TH22 gate are shown in Fig. 8. When both inputs or any one of the inputs are low (e.g., A=0, B=0, Z=0), the output is V$_{tp}$ rather than a strong low ‘0’. This can be explained as follows: whenever A = 0, the voltage at the pMOS source of the MUX cell is 0. Since a pMOS passes a weak ‘0’, the result is V$_{tp}$. Conversely, when A and B are both high, the output is V$_{\mathrm{DD}}$ without any voltage drop, since a pMOS passes a strong ‘1’. Hence, three out of four input combinations result in a voltage drop.
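A first-order model of this effect treats each pass transistor as a clamp: a pMOS cannot pull its output below V$_{tp}$, and an nMOS cannot pull its output above V$_{\mathrm{DD}}$-V$_{tn}$. The sketch below is a rough illustration under that assumption, not a substitute for circuit simulation. `VDD` follows the 1 V supply used in Section IV, while the threshold values `VTN` and `VTP` are hypothetical placeholders, not values from the paper.

```python
VDD = 1.0  # supply voltage in volts (Section IV)
VTN = 0.3  # nMOS threshold voltage in volts (hypothetical placeholder)
VTP = 0.3  # magnitude of the pMOS threshold voltage in volts (hypothetical placeholder)


def pass_pmos(v_in):
    """Voltage passed by a conducting pMOS: strong '1', weak '0' clamped at VTP."""
    return max(v_in, VTP)


def pass_nmos(v_in):
    """Voltage passed by a conducting nMOS: strong '0', weak '1' clamped at VDD - VTN."""
    return min(v_in, VDD - VTN)


low_through_pmos = pass_pmos(0.0)    # degraded low level
high_through_pmos = pass_pmos(VDD)   # full-swing high level
high_through_nmos = pass_nmos(VDD)   # degraded high level
low_through_nmos = pass_nmos(0.0)    # full-swing low level
```

In this model a logic low routed through the pMOS of the MUX cell settles at the threshold voltage instead of 0 V, which matches the qualitative behavior described for Fig. 8.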

Fig. 10. Simulation results of the FA, validating that the voltage drop at the sum is greater than at the carryout.

Fig. 11. The proposed HYBRID framework.


This voltage swing issue escalates further when FNCL gates are cascaded. To validate this conclusion, an NCL full adder (FA) circuit is implemented using the FNCL approach. The structure of the FA and its simulation results are depicted in Fig. 9 and Fig. 10, respectively. Fig. 9 shows that the TH23 gates generate the carryout and that the TH34w2 gates use these results to generate the sum (S0, S1). When the FA circuit is simulated, the logic-low level at the carryout is ~0.1 V, whereas at the sum it is ~0.38 V. This increased degradation at the sum output occurs because the voltage drop at the TH23 gates is carried over to the TH34w2 gates. Therefore, realizing the whole circuit using FNCL gates is not a viable option. To address this limitation, a novel HYBRID approach is proposed in this paper.
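The dependency of the sum on the carryout can be seen in a logic-level model of the dual-rail full adder. The wiring below follows the commonly used NCL full adder arrangement (two TH23 gates for the carry rails, two TH34w2 gates for the sum rails) and is assumed here to match Fig. 9. Hysteresis is omitted because each evaluation is taken to start from the NULL state, and all names are hypothetical.

```python
def th23(a, b, c):
    """Set function of TH23: at least two of three inputs asserted."""
    return int(a + b + c >= 2)


def th34w2(a, b, c, d):
    """Set function of TH34w2: threshold 3, first input carries weight 2."""
    return int(2 * a + b + c + d >= 3)


def to_rails(bit):
    """Return (rail0, rail1) for a Boolean bit."""
    return (1 - bit, bit)


def ncl_full_adder(x, y, ci):
    """Evaluate a dual-rail full adder for Boolean inputs x, y, ci.

    Returns ((S0, S1), (Co0, Co1)).
    """
    x0, x1 = to_rails(x)
    y0, y1 = to_rails(y)
    ci0, ci1 = to_rails(ci)
    co0 = th23(x0, y0, ci0)
    co1 = th23(x1, y1, ci1)
    # Each sum rail is driven by the opposite carry rail, so any degradation
    # on the carry outputs feeds directly into the sum gates.
    s0 = th34w2(co1, x0, y0, ci0)
    s1 = th34w2(co0, x1, y1, ci1)
    return (s0, s1), (co0, co1)
```

Because the carry rails are inputs to the TH34w2 gates, the sum path passes through two FNCL gates in series, which is consistent with the larger logic-low level observed at the sum output.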

3. CMOS-GDI HYBRID Approach

The design of the HYBRID model is inspired by the observation that, in the NCL system framework, the DI combinational logic (CL) is always enclosed between DI registers. In other words, the inputs and outputs of the CL always pass through a DI register to ensure synchronization (i.e., that two DATA wavefronts do not overwrite each other).

The idea of the HYBRID methodology is to redesign this NCL structure using both static CMOS and FNCL techniques. Fig. 11 depicts the framework of an NCL system using the HYBRID methodology. The difference between the original and the new (HYBRID) structure is the method by which the NCL blocks (CL, DI register, and completion detection (CD)) are realized. The FNCL approach is used to realize the CL and CD blocks, while the static CMOS method is used to implement the DI registers (CMOS_DI_reg).

As discussed, the FNCL blocks (CL, CD) yield a voltage drop at their outputs. This limitation can be addressed by passing these outputs through the CMOS_DI_reg. The CMOS_DI_reg has strong pull-up and pull-down networks, which restore the signal strength and generate an output of either V$_{\mathrm{DD}}$ or ground. Thus, the HYBRID approach prevents the voltage drop of the FNCL blocks from propagating to the next stage. Fig. 12 shows the application of this idea to a one-bit full adder circuit; the simulation results show that the HYBRID approach exhibits no performance degradation.
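The restoring role of the CMOS register can be illustrated with an idealized model in which a static CMOS stage resolves any input to a full rail. This is a deliberate simplification: the switching point of V$_{\mathrm{DD}}$/2 is an assumption, the register's NCL handshaking is ignored, and the function name is hypothetical. The degraded levels used in the example are the ~0.1 V and ~0.38 V values reported for the FNCL full adder.

```python
VDD = 1.0  # supply voltage in volts (Section IV)


def cmos_restore(v_in, v_switch=VDD / 2):
    """Idealized static CMOS stage: resolve a degraded level to VDD or ground.

    v_switch is an assumed switching threshold, not a value from the paper.
    """
    return VDD if v_in > v_switch else 0.0


degraded_lows = {"carryout": 0.1, "sum": 0.38}  # FNCL logic-low levels in volts
restored = {name: cmos_restore(v) for name, v in degraded_lows.items()}
```

Under this model both degraded lows are restored to 0 V before reaching the next FNCL stage, so the drop cannot accumulate across pipeline stages.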

Fig. 12. Simulation results of a 1-bit full adder using HYBRID approach.


Fig. 13. Number of transistors utilized by CMOS and FNCL.


Table 1. Comparison of static CMOS and HYBRID methodologies

| Model Type | Static CMOS: TC of only CL | Static CMOS: total TC including DI registers | HYBRID: TC of only CL | HYBRID: total TC including DI registers |
|---|---|---|---|---|
| Incomplete AND | 216 | 492 | 180 | 456 |
| Reduced Dual-Rail | 460 | 764 | 340 | 616 |
| Factored Dual-Rail | 308 | 584 | 236 | 512 |
| Complex Dual-Rail | 212 | 488 | 192 | 468 |

In summary, the FNCL approach is proposed to address the area overhead of the static CMOS methodology. However, the voltage drop at the output prevents this approach from being used on its own to design NCL systems. Therefore, a HYBRID methodology, which uses both the FNCL and static CMOS techniques to address the drawbacks of both approaches, is introduced. To validate its effectiveness, the HYBRID methodology is applied to a case study of different NCL up-counter increment (NUI) designs. A comparative study of the NUI circuits implemented using the static CMOS and HYBRID methodologies is presented in the next section.

IV. SIMULATION RESULT

This section presents the comparative results of the NUI designs implemented using the static CMOS and HYBRID methodologies. All designs are realized in a 45 nm technology using the Cadence generic process design kit (GPDK), which provides the standard cell library and associated technology files for circuit realization. The schematics are simulated using the Spectre simulator with V$_{\mathrm{DD}}$ = 1 V and a temperature of 27 $^{\circ}$C.

Several alternative designs of the NCL up-counter increment circuit are realized to verify the viability of the proposed approach. The proposed HYBRID methodology achieves a reduction in transistor count of 4% to 19.3% compared to the conventional static CMOS NCL designs.

1. NCL Gates Utilized for Realizing NUI Circuits

The NCL gates used to implement the various NUI designs, along with their transistor counts for the CMOS and FNCL implementations, are depicted in Fig. 13.

As illustrated in Fig. 13, implementing these NCL gates using the FNCL methodology reduces the number of transistors by 6% on average.

2. Transistor Count for Various NUI Implementations

Table 1 presents the transistor count (TC) of the various NUI designs implemented using the static CMOS and HYBRID methodologies.

As observed from Table 1, the HYBRID methodology uses fewer transistors than the CMOS implementation. The percentage reduction in transistor count for each model is illustrated in Fig. 14. Compared to the conventional static CMOS methodology, the Incomplete AND NUI shows a 7% reduction in TC when implemented using the proposed approach. Similarly, the Reduced Dual-Rail, Factored Dual-Rail, and Complex Dual-Rail NUI circuits show 19.3%, 12.3%, and 4% reductions in TC, respectively, when realized using the HYBRID approach.
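These percentages can be reproduced from the total transistor counts (including DI registers) in Table 1. The short script below is a sketch of that arithmetic; the variable and function names are hypothetical, and the data are copied from the table.

```python
# (static CMOS total TC, HYBRID total TC), both including DI registers, from Table 1
TOTAL_TC = {
    "Incomplete AND": (492, 456),
    "Reduced Dual-Rail": (764, 616),
    "Factored Dual-Rail": (584, 512),
    "Complex Dual-Rail": (488, 468),
}


def percent_reduction(cmos_tc, hybrid_tc):
    """Percentage reduction of the HYBRID count relative to static CMOS."""
    return 100.0 * (cmos_tc - hybrid_tc) / cmos_tc


reductions = {name: percent_reduction(*tc) for name, tc in TOTAL_TC.items()}
average_reduction = sum(reductions.values()) / len(reductions)

for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
print(f"Average: {average_reduction:.1f}%")
```

The computed values agree with the reported figures up to rounding, and their mean falls between 10% and 11%, consistent with the average of about 10% quoted in the conclusions.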

Fig. 14. Percentage reduction in the transistor count.


V. CONCLUSIONS

In this paper, a novel CMOS-GDI HYBRID methodology is proposed and validated to address the area overhead of conventional NCL based on static CMOS implementation. It uses two design techniques, static CMOS and GDI, to realize NCL designs. The proposed approach demonstrated an average reduction of 10% in transistor count across several NUI designs. This improvement gives NCL scope to become an alternative to synchronous designs. The impact of the HYBRID method on power consumption and latency will be part of future work.

REFERENCES

1. Bandapati S. K., Smith S. C., 2007, Design and characterization of NULL convention arithmetic logic units, Microelectronic Engineering, Vol. 84, No. 2, pp. 280-287
2. Parsan F. A., Smith S. C., Oct 2012, CMOS implementation of static threshold gates with hysteresis: A new approach, in Proc. IEEE/IFIP 20th Int. VLSI and System-on-Chip (VLSI-SoC) Conf., pp. 41-45
3. Smith S. C., DeMara R. F., Yuan J. S., Ferguson D., 2004, Optimization of NULL convention self-timed circuits, INTEGRATION, the VLSI Journal, Vol. 37, No. 3, pp. 135-165
4. Bandapati S. K., Smith S. C., Choi M., 2003, Design and characterization of NULL convention self-timed multipliers, IEEE Design & Test of Computers, Vol. 20, No. 6, pp. 26-36
5. Bonam R., Chaudhary S., Yellambalase Y., Choi M., Aug 2007, Clock-free nanowire crossbar architecture based on null convention logic (NCL), in Proc. 7th IEEE Conf. Nanotechnology (IEEE NANO), pp. 85-89
6. Bailey A. D., Di J., Smith S. C., Mantooth H. A., 2008, Ultra-low power delay-insensitive circuit design, in Proc. 2008 51st IEEE Midwest Symposium on Circuits and Systems, pp. 503-506
7. Smith S. C., DeMara R. F., 2001, Gate and throughput optimizations for null convention self-timed digital circuits, Ph.D. Dissertation
8. Parsan F. A., Smith S. C., Aug 2012, CMOS implementation comparison of NCL gates, in Proc. IEEE 55th Int. Midwest Symp. Circuits and Systems (MWSCAS), pp. 394-397
9. Mader R., Friedman E. G., Litman A., Kourtev I. S., 2002, Large scale clock skew scheduling techniques for improved reliability of digital synchronous VLSI circuits, in Proc. 2002 IEEE International Symposium on Circuits and Systems, Vol. 1, pp. I-I
10. Morgenshtein A., Moreinis M., Ginosar R., 2004, Asynchronous gate-diffusion-input (GDI) circuits, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 12, No. 8, pp. 847-856
11. Morgenshtein A., Fish A., Wagner I. A., 2002, Gate-diffusion input (GDI): a power-efficient method for digital combinatorial circuits, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 10, No. 5, pp. 566-581
12. Morgenshtein A., Fish A., Wagner I. A., 2001, Gate-diffusion input (GDI) - a novel power efficient method for digital circuits: a design methodology, in Proc. 14th Annual IEEE Int. ASIC/SOC Conf., pp. 39-43
13. Morgenshtein A., Fish A., Wagner I. A., 2002, Gate-diffusion input (GDI) - a technique for low power design of digital circuits: analysis and characterization, in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), Vol. 1, pp. I-477-I-480
14. Morgenshtein A., Shwartz I., Fish A., Nov 2010, Gate diffusion input (GDI) logic in standard CMOS nanoscale process, in Proc. IEEE 26th Convention of Electrical and Electronics Engineers in Israel, pp. 776-780
15. Morgenshtein A., Yuzhaninov V., Kovshilovsky A., Fish A., 2014, Full-swing gate diffusion input logic: case study of low-power CLA adder design, INTEGRATION, the VLSI Journal, Vol. 47, No. 1, pp. 62-70

Author

Prashanthi Metku is from Hyderabad, India. She received her B.Tech degree in Electronic and Communication Engineering from Jawaharlal Nehru Technological University, Hyderabad, India, in 2011 and her M.Tech degree in Electronic Engineering from Pondicherry University, India, in 2014. She is currently pursuing her Ph.D. degree in Computer Engineering at Missouri University of Science and Technology, United States. Her interests include CMOS circuit design and error correction codes.

Kyung Ki Kim received his B.S. and M.S. degrees in Electronic Engineering from Yeungnam University, South Korea, in 1995 and 1997, respectively. He was a candidate for a Ph.D. in Computer Science at Sogang University, South Korea, from 1997 to 1999, and received his Ph.D. degree in Computer Engineering from Northeastern University, Boston, USA, in 2008. He was a member of technical staff with Sun Microsystems, Santa Clara, CA, in 2008 and a senior researcher with Illinois Institute of Technology, Chicago, USA, in 2009. Since March 2010, he has been with the School of Electronic and Electrical Engineering, Daegu University, Korea, where he is currently an Associate Professor. His current research focuses on neuromorphic architecture, high-speed low-power VLSI design, asynchronous design, electronic CAD, and nano-electronics.

Minsu Choi received his B.S., M.S., and Ph.D. degrees in Computer Science from Oklahoma State University in 1995, 1998, and 2002, respectively. He is currently an associate professor of Electrical and Computer Engineering at Missouri University of Science & Technology (Missouri S&T). His research mainly focuses on Computer Architecture & VLSI, Crypto-hardware design, Nanoelectronics, Embedded Systems, Fault Tolerance, Testing, Quality Assurance, Reliability Modeling and Analysis, Configurable Computing, Parallel & Distributed Systems, and Dependable Instrumentation & Measurement. He won two outstanding teaching awards at MST, in 2008 and 2009. He is a senior member of IEEE and a member of the Golden Key National Honor Society and Sigma Xi.