Prashanthi Metku (1), Kyung Ki Kim (2), Minsu Choi (1)
(1) Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, USA
(2) Department of Electronic Engineering, Daegu University, Gyeongsan, Korea
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Index Terms
Null convention logic, gate diffusion input, HYBRID implementation, ripple carry adder
I. INTRODUCTION
The clocked synchronous paradigm currently dominates the semiconductor design industry
(1). However, this synchronous approach has major drawbacks, including critical
timing analysis and clock skew issues (2). Typically, a precise clock distribution
network is used to address these limitations, and designing it is a tedious and complex
task. Moreover, as feature sizes decrease, the power consumption of the clock distribution
network increases rapidly, which is a major limiting factor for the emerging low-power
semiconductor industry (3). Since asynchronous designs consume less power and produce
less noise and electromagnetic interference (EMI) than their synchronous counterparts,
there is renewed interest in this field (4).
Asynchronous circuits are classified into two categories: bounded-delay and
delay-insensitive (DI) models (3). Bounded-delay models assume that both gate and wire
delays are bounded and therefore require extensive timing analysis to determine the
delay (1). On the other hand, DI circuits assume that the delays of both interconnects
and logic elements are unbounded and that wire forks within the components are isochronic
(5). However, wires connecting the components do not have to adhere to this isochronic
fork assumption, which ensures correct operation regardless of when the inputs become
available. Hence, DI circuits require little timing analysis and yield average-case
performance rather than the worst-case performance of bounded-delay and traditional
synchronous paradigms (6).
The literature provides several DI paradigms, such as Seitz’s, DIMS, Anantharaman’s,
Singh’s, David’s, Phased Logic and NULL Convention Logic (NCL) (1). Most of these DI
methods (Seitz’s, DIMS, Anantharaman’s, Singh’s, David’s, Phased Logic) depend on either
C-elements or synchronous design to achieve delay insensitivity. A more elaborate
description of these DI methodologies can be found in (7). Conversely, the NCL methodology
uses a library of state-holding gates with hysteresis to attain delay insensitivity.
These gates enable transistor-level optimization, which helps reduce the overall
circuit area (8). Hence, NCL is the best alternative for integrating asynchronous digital
design into the predominantly synchronous semiconductor design industry.
Fig. 1. Dual-rail representation of NCL AND function: Z = X • Y: initially X=DATA1
and Y=DATA0, so Z=DATA0; next X and Y both transition to NULL, so Z transitions to
NULL; then X and Y both transition to DATA1, so Z transitions to DATA1 (7).
Fig. 2. NCL system framework (3).
The NCL paradigm consists of 27 state-holding logic gates with hysteresis (2). These
gates are traditionally implemented using one of four CMOS techniques: static,
semi-static, differential or dynamic. Detailed information regarding these
approaches can be found in (8). Among these methods, static CMOS is the most commonly
used, as it results in less leakage and noise than the other CMOS techniques (9).
However, the main drawback of the static CMOS implementation is its area overhead:
reported results indicate that the area occupied by a static CMOS NCL implementation
is approximately 1.5 to 2 times that of the equivalent synchronous design (7). To address
this drawback, this paper proposes a HYBRID technique for designing NCL circuits. The
proposed approach integrates both CMOS and gate diffusion input (GDI) techniques to
realize NCL designs.
The gate diffusion input (GDI) methodology is a low-power design technique that utilizes
only two transistors to implement a variety of functions (1). The input configurations
required to implement the various functions can be found in (9-12). Hence, by applying
the GDI methodology to realize NCL gates, the total transistor count is reduced, which
in turn reduces switching power. However, the biggest drawback of the GDI methodology
is the voltage drop at the output, which results in performance degradation (12-15).
Hence, a new HYBRID methodology that addresses both of these limitations, the voltage
drop of GDI and the area overhead of CMOS, is proposed in this work.
The aim of this work is to utilize both CMOS and GDI techniques to design NCL circuits.
Using both CMOS-based and GDI-based NCL gates in the same design reduces the transistor
count compared to the conventional static CMOS approach. To validate the proposed
approach, a variety of NCL up-counter increment (NUI) circuits were realized and compared
with their static CMOS counterparts. The proposed approach shows a reduction of at least
4% in the transistor count compared with the static CMOS approach.
The rest of the paper is organized as follows: Section II discusses the preliminaries
and review of NCL and GDI. Section III presents the design description of the HYBRID
methodology. Section IV illustrates the simulation results, followed by conclusion
in section V.
II. PRELIMINARIES AND REVIEW
1. Null Convention Logic
NCL is a clockless DI model that works correctly regardless of when its inputs become
available. It is a self-timed logic model in which data and control are integrated into
a single signal and communication is accomplished through local handshaking (1). To
provide synchronization, DATA and NULL states are used, which are obtained using
dual-rail or quad-rail logic. A dual-rail signal D utilizes two wires, D0 and D1, to
represent the values DATA0, DATA1 and NULL (4), as shown in Fig. 1. The NULL state
(D0 = 0, D1 = 0) indicates that the value of D is not yet available. The DATA0 state
(D0 = 1, D1 = 0) and the DATA1 state (D0 = 0, D1 = 1) represent Boolean logic 0 and 1,
respectively (8). The two rails are mutually exclusive and cannot be asserted at the
same time; if both rails are high, the state is invalid (illegal).
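The dual-rail encoding described above can be written down as a small lookup. The following is a minimal sketch in Python; the names `DualRail`, `encode` and `decode` are illustrative and do not come from the paper.

```python
from enum import Enum


class DualRail(Enum):
    """Dual-rail states, stored as (D0, D1) rail values."""
    NULL = (0, 0)    # value of D not yet available
    DATA0 = (1, 0)   # Boolean 0: rail D0 asserted
    DATA1 = (0, 1)   # Boolean 1: rail D1 asserted


def encode(bit: int) -> DualRail:
    """Encode a Boolean value as DATA0 or DATA1."""
    return DualRail.DATA1 if bit else DualRail.DATA0


def decode(d0: int, d1: int) -> DualRail:
    """Map a pair of rail values to a state; (1, 1) is the illegal state."""
    if d0 == 1 and d1 == 1:
        raise ValueError("illegal state: both rails asserted")
    return DualRail((d0, d1))
```

For example, `decode(0, 0)` yields `DualRail.NULL`, while `decode(1, 1)` raises an error because the two rails are mutually exclusive.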
The framework of an NCL system is shown in Fig. 2. As observed from the figure, combinational
logic (CL) is always sandwiched between two DI registers, and adjacent DI registers
communicate through the request and acknowledge signals ki and ko. NCL utilizes a special
set of logic elements known as threshold gates to realize the combinational logic, DI
registers and completion detection circuits (3).
Fig. 3. (a) THmn threshold gate, (b) TH34w2 threshold gate (3).
Fig. 4. Transistor level implementation of TH23 gate using Static CMOS methodology
(7).
Fig. 5. Transistor level implementation of TH23 gate using semi-static methodology
(7).
There are 27 threshold gates. The primary type of threshold gate, depicted in Fig. 3(a),
is known as THmn, where 1 ${\leq}$ m ${\leq}$ n (2). Here, n represents the number of
inputs and m denotes the number of inputs that need to be asserted for the output to
be asserted. The secondary type of threshold gate, illustrated in Fig. 3(b), is referred
to as a weighted threshold gate, denoted as THmnWw1w2…wR. The constraint is
w1, w2, …, wR ${>}$ 1, where w1, w2, …, wR are the integer weights of input1, input2, …,
inputR, respectively (5). These threshold gates have built-in hysteresis behavior to
ensure delay insensitivity. Hysteresis in NCL ensures that one DATA wavefront does not
overwrite another and that two DATA wavefronts are always separated by a NULL wavefront.
The general algebraic expression of an NCL gate combines a set equation and a hold
equation. The set equation defines the functionality of the gate, and the hold equation
determines how long the output remains asserted once it has been asserted. The set
equation, i.e. the functionality of each NCL gate, is presented in (1), whereas the hold
equation is the same for every gate: the OR of all the inputs. Therefore, the general
equation for an NCL gate is Z = set + (Z- • hold), where Z- is the previous output value
and Z is the current value. The prevailing methodologies for realizing NCL circuits are
static and semi-static CMOS. Figs. 4 and 5 depict the static and semi-static CMOS
implementations of the TH23 gate, respectively.
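The set/hold behavior can be illustrated with a short behavioral model. This is a sketch under stated assumptions rather than the authors' implementation: the class name `ThresholdGate` is hypothetical, the output is assumed to start at 0, and weights are given as a full-length list (unweighted inputs have weight 1).

```python
class ThresholdGate:
    """Behavioral model of an NCL threshold gate with hysteresis."""

    def __init__(self, m, n, weights=None):
        self.m = m                                    # threshold
        self.weights = list(weights) if weights else [1] * n
        if len(self.weights) != n:
            raise ValueError("one weight per input is required")
        self.z = 0                                    # previous output Z- (assumed reset to 0)

    def step(self, inputs):
        """Apply Z = set + (Z- . hold) for one set of input values."""
        weighted_sum = sum(w for w, x in zip(self.weights, inputs) if x)
        set_term = weighted_sum >= self.m             # set: threshold reached
        hold_term = any(inputs)                       # hold: OR of all inputs
        self.z = int(set_term or (self.z and hold_term))
        return self.z


th23 = ThresholdGate(m=2, n=3)
trace = [th23.step(v) for v in ([1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 0, 0])]

# TH34w2: threshold 3, four inputs, first input has weight 2
th34w2 = ThresholdGate(m=3, n=4, weights=[2, 1, 1, 1])
```

In the `trace` above, the TH23 output stays low with one input asserted, rises when two are asserted, remains high while any input is still asserted, and returns low only when all inputs are deasserted.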
As depicted in Fig. 5, the semi-static implementation requires only the set and set’
expressions to realize the TH23 gate. To achieve hysteresis, the semi-static implementation
uses a weak feedback inverter, which slows down the gate operation and leads to a large
latency overhead. This limitation is addressed by the static CMOS implementation, which
utilizes pull-up (set) and pull-down (reset) networks as shown in Fig. 4.
As observed from the figure, additional circuitry is required to maintain the
built-in hysteresis property of NCL gates. This leads to an area overhead, whereby NCL
designs are approximately 1.5 to 2 times larger than the equivalent synchronous designs
(7).
Therefore, it is crucial to address this limitation so that NCL designs become a viable
alternative to synchronous designs. This drawback is addressed by utilizing a low-power
design methodology called GDI.
2. Gate Diffusion Input Method
GDI is a low-power design technique commonly used in synchronous design to reduce
area and dynamic power consumption (1). The structure of the basic GDI cell is depicted
in Fig. 6(a). It has three inputs: G (the common gate input of both the nMOS and the pMOS),
P (the input to the source/drain of the pMOS) and N (the input to the source/drain of
the nMOS). The bulks of the nMOS and pMOS transistors are permanently connected to GND
and V$_{\mathrm{DD}}$, respectively (6).
The logic functions of the GDI cell for different input configurations are illustrated
in Fig. 6(b). Since the pull-up and pull-down networks of these functions are not always
connected to the power supply (V$_{\mathrm{DD}}$) and ground (GND), a voltage drop is
observed at the output. This drawback is the biggest limitation of GDI-based
implementations (9-15). Likewise, when NCL gates are realized using the GDI technique,
degraded voltage swings appear in the circuit. Therefore, this work mainly focuses on
addressing this issue so that the GDI technique can be used for realizing NCL circuits.
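At the logic level, the GDI cell behaves like a two-input selector: the pMOS passes P when G is low and the nMOS passes N when G is high. The sketch below models this ideal behavior and ignores the voltage drop entirely. The input configurations used for F1, F2, OR, AND and MUX follow the commonly cited GDI function table from the GDI literature; they are an assumption here, since the contents of Fig. 6(b) are not reproduced in the text, and the function names are illustrative.

```python
from itertools import product


def gdi_cell(g, p, n):
    """Ideal GDI cell: output follows P when G = 0 and N when G = 1."""
    return p if g == 0 else n


# Assumed input configurations (G, P, N) for the standard GDI functions
def f1(a, b):
    return gdi_cell(g=a, p=b, n=0)      # F1 = A'B


def f2(a, b):
    return gdi_cell(g=a, p=1, n=b)      # F2 = A' + B


def gdi_or(a, b):
    return gdi_cell(g=a, p=b, n=1)      # A + B


def gdi_and(a, b):
    return gdi_cell(g=a, p=0, n=b)      # AB


def gdi_mux(a, b, c):
    return gdi_cell(g=a, p=b, n=c)      # A'B + AC


# Exhaustive check of each configuration against its Boolean expression
for a, b in product((0, 1), repeat=2):
    assert f1(a, b) == ((1 - a) & b)
    assert f2(a, b) == ((1 - a) | b)
    assert gdi_or(a, b) == (a | b)
    assert gdi_and(a, b) == (a & b)
```

Each function uses a single two-transistor cell, which is the source of the transistor savings discussed in this section.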
Fig. 6. (a) Basic GDI cell structure, (b) Input configurations for different functions
(10).
Fig. 7. FNCL implementation of TH22 gate.
The next section presents the mechanism for realizing NCL gates using the GDI methodology.
The efficacy of the proposed approach is verified by realizing several NUI circuits
and comparing them with their static CMOS counterparts.
III. THE PROPOSED HYBRID METHODOLOGY
First, the mechanism for realizing NCL gates using the GDI methodology, referred to as
the FNCL approach, is discussed. Since this approach utilizes both the F1 and F2 functions
unique to GDI, shown in Fig. 6(b), it is named FNCL. The voltage drop limitation of the
FNCL approach is also presented in this section. Finally, the HYBRID methodology, which
utilizes both CMOS and GDI techniques to address the area overhead limitation of NCL
designs, is described in detail.
1. Realization of NCL using FNCL Approach (based on F1 and F2 functions of GDI gate)
To realize NCL threshold gates using the FNCL approach, the Boolean expression of the
THmn gate is first factorized. Then, based on the factorized expression, the GDI functions
F1, F2 and MUX are utilized to implement the gate. As an example, the steps for realizing
the TH22 gate are shown below:
Step 1: The factorized expression of the TH22 gate is:
Z = AB + Z (A+B)
where A and B are the inputs and Z is the output (the Z on the right-hand side is the
previous output value).
Step 2: The GDI functions are utilized to realize the AND (AB) and OR (A+B) expressions.
Among all the GDI functions, only F1 and F2 are utilized, as they exhibit a voltage
drop for only one input combination, unlike the other functions (AND, OR). Hence,
F1 and F2 are used to implement AB and A+B as shown in Fig. 7.
Step 3: The GDI MUX cell is used to determine the final output, i.e. whether to pass the
set data or the hold data based on the previous output. The MUX is configured such that
the output of the F1 cell (AB) and the O2 output of the F2 cell (A+B) are fed to the
sources of the pMOS and nMOS transistors, respectively, as shown in Fig. 7. This selects
the set equation (AB) when Z = 0 and the hold equation (A+B) when Z = 1.
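The three steps can be checked at the logic level: selecting between the set term AB and the hold term A+B with the previous output Z gives the same result as the factorized expression. The following is a minimal sketch that models only Boolean behavior, not the transistor-level circuit of Fig. 7; the function names are illustrative.

```python
from itertools import product


def th22_factorized(a, b, z_prev):
    """TH22 from the factorized expression Z = AB + Z(A + B)."""
    return (a & b) | (z_prev & (a | b))


def th22_mux(a, b, z_prev):
    """TH22 as a MUX controlled by the previous output Z."""
    set_term = a & b        # AB, selected when Z = 0
    hold_term = a | b       # A + B, selected when Z = 1
    return hold_term if z_prev else set_term


# Exhaustive comparison over all values of A, B and the previous output
equivalent = all(
    th22_factorized(a, b, z) == th22_mux(a, b, z)
    for a, b, z in product((0, 1), repeat=3)
)
```

The two formulations agree for all eight combinations because AB implies A+B, so when Z = 1 the expression AB + (A+B) reduces to A+B.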
Therefore, the proposed FNCL approach requires only eight transistors to implement
the TH22 gate. Compared to the static CMOS approach, a 20% reduction in the transistor
count is observed. However, the main drawback of this approach is the performance
degradation due to the substantial voltage drop at the final output. This limitation
and the mechanism to address it are discussed in the following subsections.
2. Performance Degradation of FNCL Approach
The performance degradation is due to the different input configurations of the nMOS
and pMOS transistors. When the sources of the pMOS and nMOS transistors are not tied to
V$_{\mathrm{DD}}$ and GND, respectively, the pMOS passes a weak ‘0’ of V$_{tp}$ (the
threshold voltage of the pMOS) and the nMOS passes a weak ‘1’ of V$_{\mathrm{DD}}$-V$_{tn}$
(where V$_{tn}$ is the threshold voltage of the nMOS) (9).
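A first-order model makes the degraded levels concrete. The sketch below assumes fully driven gates and ignores body effect and leakage; the threshold values `VTN` and `VTP` are hypothetical placeholders, not values from the paper, while `VDD` = 1 V matches the simulation setup reported later.

```python
VDD = 1.0   # supply voltage (V)
VTN = 0.3   # assumed nMOS threshold voltage (V), hypothetical value
VTP = 0.3   # assumed magnitude of the pMOS threshold voltage (V), hypothetical value


def pmos_pass(v_in):
    """A pMOS pass transistor cannot pull its output below |Vtp|."""
    return max(v_in, VTP)


def nmos_pass(v_in):
    """An nMOS pass transistor cannot pull its output above VDD - Vtn."""
    return min(v_in, VDD - VTN)


weak_low = pmos_pass(0.0)      # weak '0': stays at |Vtp| instead of 0 V
strong_high = pmos_pass(VDD)   # strong '1': full VDD
weak_high = nmos_pass(VDD)     # weak '1': VDD - Vtn instead of VDD
strong_low = nmos_pass(0.0)    # strong '0': full 0 V
```

Under these assumptions, a logic low passed through a pMOS and a logic high passed through an nMOS are the two cases in which the output does not reach the rail.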
Fig. 8. Simulation results demonstrating voltage drop of FNCL TH22 gate.
Fig. 9. Structure of FNCL FA.
To demonstrate this phenomenon, the simulation results of the proposed TH22 gate are
illustrated in Fig. 8. As observed in Fig. 8, when both inputs or any one of the inputs
are low (e.g. A=0, B=0, Z=0), the output is V$_{tp}$ rather than a strong low ‘0’. This
can be explained as follows: whenever A = 0, the voltage at the pMOS source of the MUX
cell is 0. Since a pMOS passes a weak ‘0’, the result is V$_{tp}$. Conversely, when A
and B are both high, the output is V$_{\mathrm{DD}}$ without any voltage drop, since a
pMOS passes a strong ‘1’. Hence, three out of four input combinations result in a
voltage drop.
Fig. 10. Simulation results of FA validating that the voltage drop at sum is greater than
at carryout.
Fig. 11. The proposed HYBRID framework.
This voltage swing issue escalates further when FNCL gates are cascaded. To validate
this conclusion, an NCL full adder (FA) circuit is implemented using the FNCL approach.
The structure of the FA and its simulation results are depicted in Fig. 9 and Fig. 10,
respectively. From Fig. 9 it is observed that the TH23 gates generate the carry-out and
the TH34w2 gates use these results to generate the sum (S0, S1). When the FA circuit is
simulated, the voltage swing (for logic low) at the carry-out is ~0.1 V, whereas at the
sum it is ~0.38 V. This increased voltage swing at the sum output is due to the voltage
drop at the TH23 gates being carried over to the TH34w2 gates. Therefore, realizing the
whole circuit using FNCL gates is not a viable option. To address this limitation, a
novel HYBRID approach is also proposed in this paper.
3. CMOS-GDI HYBRID Approach
The design of the HYBRID model is inspired by the observation that, in the NCL system
framework, the DI combinational logic (CL) is always enclosed between DI registers. In
other words, the inputs and outputs of the CL always pass through a DI register to ensure
synchronization (so that two DATA wavefronts do not overwrite each other).
The idea of the HYBRID methodology is to redesign this NCL structure using both static
CMOS and FNCL techniques. Fig. 11 depicts the framework of an NCL system using the HYBRID
methodology. The difference between the original and the new (HYBRID) structure is the
method by which the NCL blocks (CL, DI register and completion detection (CD)) are
realized. The FNCL approach is utilized to realize the CL and CD blocks, while the static
CMOS method is used to implement the DI registers (CMOS_DI_reg).
As discussed, the FNCL blocks (CL, CD) yield a voltage drop at their outputs. This
limitation can be addressed by passing these outputs through the CMOS_DI_reg.
The CMOS_DI_reg has strong pull-up and pull-down networks, which restore the signal
strength and generate an output of either V$_{\mathrm{DD}}$ or ground. Thus, the HYBRID
approach prevents the voltage drop of the FNCL blocks from propagating to the next
stage. Fig. 12 shows the application of this idea to a one-bit full adder circuit, and
the simulation results show that the HYBRID approach exhibits no performance degradation.
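The level-restoring role of the static CMOS register can be illustrated with an idealized model. This is only a sketch: the function `cmos_restore` and its switching point at V$_{\mathrm{DD}}$/2 are assumptions made for illustration, while the degraded logic-low levels of ~0.1 V and ~0.38 V are the FNCL full adder values reported above.

```python
VDD = 1.0  # supply voltage (V), as in the simulation setup


def cmos_restore(v_in, v_switch=VDD / 2):
    """Idealized static CMOS stage: snaps a degraded level to a full rail.

    The switching point v_switch = VDD / 2 is an assumption, not a measured value.
    """
    return VDD if v_in > v_switch else 0.0


# Degraded logic-low levels reported for the FNCL full adder outputs
degraded_lows = {"carry-out": 0.1, "sum": 0.38}
restored = {name: cmos_restore(v) for name, v in degraded_lows.items()}
```

Both degraded lows fall below the assumed switching point, so the modeled register output returns to 0 V and the degradation does not accumulate across pipeline stages.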
Fig. 12. Simulation results of a 1-bit full adder using HYBRID approach.
Fig. 13. Number of transistors utilized by CMOS and FNCL.
Table 1. Comparison of static CMOS and HYBRID methodologies
(TC = transistor count; CL = combinational logic)
| Model type | Static CMOS: TC of CL only | Static CMOS: total TC including DI registers | HYBRID: TC of CL only | HYBRID: total TC including DI registers |
| Incomplete AND | 216 | 492 | 180 | 456 |
| Reduced Dual-Rail | 460 | 764 | 340 | 616 |
| Factored Dual-Rail | 308 | 584 | 236 | 512 |
| Complex Dual-Rail | 212 | 488 | 192 | 468 |
In summary, the FNCL approach is proposed to address the area overhead limitation
of the static CMOS methodology. However, the voltage drop at the output makes this
approach unsuitable on its own for designing NCL systems. Therefore, a HYBRID methodology,
which utilizes both the FNCL and static CMOS techniques to address the drawbacks of
both approaches, is introduced. To validate the effectiveness of the HYBRID methodology,
the proposed approach is applied to a case study of different NCL up-counter increment
(NUI) designs. A comparative study of the NUI circuits implemented using the static CMOS
and HYBRID methodologies is presented in the next section.
IV. SIMULATION RESULTS
This section presents the comparative results of NUI designs implemented using the
static CMOS and HYBRID methodologies. All the designs are realized in a 45 nm technology
using the Cadence generic process design kit (GPDK), which provides the standard cell
library and associated technology files for circuit realization. The schematics are
simulated using the Spectre simulator with V$_{\mathrm{DD}}$ = 1 V and temperature =
27$^{\circ}$C. Several alternative designs for NCL up-counter increment circuits are
realized to verify the viability of the proposed approach. The proposed HYBRID methodology
achieves a reduction in transistor count of 4% to 19.3% compared to the conventional
static CMOS NCL designs.
1. NCL Gates Utilized for Realizing NUI Circuits
The NCL gates used for implementing the various NUI designs, along with their transistor
counts for the CMOS and HYBRID methodologies, are depicted in Fig. 13.
As illustrated in Fig. 13, implementing these NCL gates using the FNCL methodology
reduces the number of transistors by an average of 6%.
2. Transistor Count for Various NUI Implementations
Table 1 presents the transistor count (TC) for various NUI designs implemented via the
static CMOS and HYBRID methodologies.
As observed from Table 1, the HYBRID methodology utilizes fewer transistors than the CMOS
implementation. The percentage reduction in transistor count for each model is illustrated
in Fig. 14. Compared to the conventional static CMOS methodology, the Incomplete AND NUI
shows a 7% reduction in total TC (including DI registers) when implemented using the
proposed approach. Similarly, the Reduced Dual-Rail, Factored Dual-Rail and Complex
Dual-Rail NUI circuits show 19.3%, 12.3% and 4% reductions in TC, respectively, when
realized using the HYBRID approach.
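The quoted percentages can be reproduced from the total transistor counts (including DI registers) listed in Table 1. The short sketch below performs that arithmetic; the dictionary layout and function name are illustrative.

```python
# Total TC including DI registers from Table 1: (static CMOS, HYBRID)
totals = {
    "Incomplete AND": (492, 456),
    "Reduced Dual-Rail": (764, 616),
    "Factored Dual-Rail": (584, 512),
    "Complex Dual-Rail": (488, 468),
}


def reduction_percent(cmos_tc, hybrid_tc):
    """Percentage reduction in transistor count relative to static CMOS."""
    return 100.0 * (cmos_tc - hybrid_tc) / cmos_tc


reductions = {name: reduction_percent(*tc) for name, tc in totals.items()}
average = sum(reductions.values()) / len(reductions)

for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
print(f"Average: {average:.1f}%")
```

The computed values agree with the reported figures to within rounding, and their mean is consistent with the roughly 10% average reduction.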
Fig. 14. Percentage reduction in the transistor count.
V. CONCLUSIONS
In this paper, a novel CMOS-GDI HYBRID methodology is proposed and validated to address
the area overhead of conventional NCL based on static CMOS implementation. It utilizes
two design techniques, static CMOS and GDI, to realize NCL designs. The proposed
approach demonstrated an average reduction of about 10% in the transistor count across
the several NUI designs realized. This improvement gives NCL scope to become an
alternative to synchronous designs. The impact of the HYBRID method on power consumption
and latency will be part of future work.
REFERENCES
(1) Bandapati S. K., Smith S. C., 2007, Design and characterization of NULL convention
arithmetic logic units, Microelectronic Engineering, Vol. 84, No. 2, pp. 280-287
(2) Parsan F. A., Smith S. C., Oct. 2012, CMOS implementation of static threshold gates
with hysteresis: A new approach, in Proc. IEEE/IFIP 20th Int. Conf. VLSI and System-on-Chip
(VLSI-SoC), pp. 41-45
(3) Smith S. C., DeMara R. F., Yuan J. S., Ferguson D., 2004, Optimization of NULL
convention self-timed circuits, Integration, the VLSI Journal, Vol. 37, No. 3, pp. 135-165
(4) Bandapati S. K., Smith S. C., Choi M., 2003, Design and characterization of NULL
convention self-timed multipliers, IEEE Design & Test of Computers, Vol. 20, No. 6,
pp. 26-36
(5) Bonam R., Chaudhary S., Yellambalase Y., Choi M., Aug. 2007, Clock-free nanowire
crossbar architecture based on null convention logic (NCL), in Proc. 7th IEEE Conf.
Nanotechnology (IEEE NANO), pp. 85-89
(6) Bailey A. D., Di J., Smith S. C., Mantooth H. A., 2008, Ultra-low power
delay-insensitive circuit design, in Proc. 51st IEEE Midwest Symp. Circuits and Systems,
pp. 503-506
(7) Smith S. C., DeMara R. F., 2001, Gate and throughput optimizations for null convention
self-timed digital circuits, Ph.D. Dissertation
(8) Parsan F. A., Smith S. C., Aug. 2012, CMOS implementation comparison of NCL gates,
in Proc. IEEE 55th Int. Midwest Symp. Circuits and Systems (MWSCAS), pp. 394-397
(9) Mader R., Friedman E. G., Litman A., Kourtev I. S., 2002, Large scale clock skew
scheduling techniques for improved reliability of digital synchronous VLSI circuits,
in Proc. 2002 IEEE Int. Symp. Circuits and Systems (ISCAS), Vol. 1, pp. I-I
(10) Morgenshtein A., Moreinis M., Ginosar R., 2004, Asynchronous gate-diffusion-input
(GDI) circuits, IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
Vol. 12, No. 8, pp. 847-856
(11) Morgenshtein A., Fish A., Wagner I. A., 2002, Gate-diffusion input (GDI): a
power-efficient method for digital combinatorial circuits, IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, Vol. 10, No. 5, pp. 566-581
(12) Morgenshtein A., Fish A., Wagner I. A., 2001, Gate-diffusion input (GDI) - a novel
power efficient method for digital circuits: a design methodology, in Proc. 14th Annual
IEEE Int. ASIC/SOC Conf., pp. 39-43
(13) Morgenshtein A., Fish A., Wagner I. A., 2002, Gate-diffusion input (GDI) - a technique
for low power design of digital circuits: analysis and characterization, in Proc.
IEEE Int. Symp. Circuits and Systems (ISCAS), Vol. 1, pp. I-477-I-480
(14) Morgenshtein A., Shwartz I., Fish A., Nov. 2010, Gate diffusion input (GDI) logic in
standard CMOS nanoscale process, in Proc. IEEE 26th Convention of Electrical and
Electronics Engineers in Israel, pp. 776-780
(15) Morgenshtein A., Yuzhaninov V., Kovshilovsky A., Fish A., 2014, Full-swing gate
diffusion input logic: case study of low-power CLA adder design, Integration, the VLSI
Journal, Vol. 47, No. 1, pp. 62-70
Authors
Prashanthi Metku is from Hyderabad, India.
She received her B.Tech degree in Electronic and Communication Engineering from Jawaharlal
Nehru Technological University, Hyderabad, India, in 2011 and her M.Tech degree in
Electronic Engineering from Pondicherry University, India, in 2014.
She is currently pursuing her Ph.D. degree in Computer Engineering at Missouri
University of Science and Technology, United States.
Her interests include CMOS circuit design and Error Correction Codes.
Kyung Ki Kim received his B.S. and M.S. degrees in Electronic Engineering from Yeungnam
University, South Korea, in 1995 and 1997, respectively.
He was a candidate for Ph.D. in Computer Science from Sogang University, South Korea
from 1997 to 1999, and received his Ph.D. Degree in Computer Engineering from Northeastern
University, Boston, USA in 2008.
He was a member of technical staff with Sun Microsystems, Santa Clara, CA in 2008
and a senior researcher with Illinois Institute of Technology, Chicago, USA in 2009.
Since March 2010, he has been with the school of Electronic and Electrical Engineering,
Daegu University, Korea, where he is currently an Associate Professor. His current
research focuses on neuromorphic architecture, high speed low power VLSI design, asynchronous
design, electronic CAD and nano-electronics.
Minsu Choi received his B.S., M.S. and Ph.D. degrees in Computer Science from Oklahoma
State University in 1995, 1998 and 2002, respectively.
He is currently an associate professor of Electrical and Computer Engineering at Missouri
University of Science & Technology (Missouri S&T).
His research mainly focuses on Computer Architecture & VLSI, Crypto-hardware design,
Nanoelectronics, Embedded Systems, Fault Tolerance, Testing, Quality Assurance, Reliability
Modeling and Analysis, Configurable Computing, Parallel & Distributed Systems and
Dependable Instrumentation & Measurement.
He has won two outstanding teaching awards at MST in 2008 and 2009.
He is a senior member of IEEE and a member of Golden Key National Honor Society and
Sigma Xi.