Keywords: Phase-change memory, PCM, neural network, feature extraction, neuron, synapse

I. INTRODUCTION

The emergence of non-volatile memory (NVM) devices used as synapses has promoted the development of hardware neural networks (HNNs) owing to their excellent scalability and synapse-like characteristics. Several types of NVM synapses have been implemented, including resistive random access memory (ReRAM), spin-transfer torque magnetic random-access memory (STT-MRAM), and phase-change memory (PCM) devices. We prefer to use PCM as the synapse for its excellent properties, such as multi-level resistance values, strong data retention, high endurance, promising reliability, fast programming, CMOS compatibility, and technological maturity [1-14]. However, owing to the programming mechanism of PCM, the asymmetry between its partial SET and partial RESET operations is severe [9-14]. As a result, the changes in synaptic weight for long-term potentiation (LTP) and long-term depression (LTD) operations are mismatched: the weight change for LTD ($\Delta W_{\mathrm{LTD}}$) may be several times greater than that for LTP ($\Delta W_{\mathrm{LTP}}$). In this paper, we compare the learning results of the fully-matched (symmetric, $\Delta W_{\mathrm{LTD}} = \Delta W_{\mathrm{LTP}}$) and mismatched (asymmetric, $\Delta W_{\mathrm{LTD}} > \Delta W_{\mathrm{LTP}}$) cases to show the influence of the asymmetric synapse property on the training process. We then propose an alternate pulse scheme (APS) to reduce the influence of this asymmetry.

In Section II, we explain the basic properties of the PCM device acting as a synapse. In Section III, we describe the operating principle of the HNN used for feature extraction and show the influence of the asymmetry between LTP and LTD. In Section IV, we describe the operating principle of the proposed APS and compare the learning results of a mismatched case with those of a compensated case to show the effects of the APS. Finally, we conclude the paper in Section V.

II. PCM DEVICE AS SYNAPSE

Fig. 1. (a) The structure of a PCM device, (b) The experimental relationship of PCM conductance vs. programming voltage for pulses of 50 ns width, (c) The pulses for LTP and LTD operations, (d) The trace of synaptic weight evolution in response to the pulses.


The PCM device is a resistive device with a simple structure in which a chalcogenide glass layer (typically Ge$_{2}$Sb$_{2}$Te$_{5}$) is sandwiched between a top electrode and a bottom electrode (as shown in Fig. 1(a)). Its conductance, which serves as the synaptic weight, depends on the molecular arrangement of the phase-change material and changes in response to the pulses applied across the device. The PCM device retains its conductance value when a sufficiently low voltage is applied. A RESET operation is performed by pulse-heating the device above its melting point followed by a fast falling edge, while a SET operation is performed by pulse-heating it above the crystallization temperature but below the melting point. In addition to the full SET and full RESET operations, the PCM device can be programmed into intermediate conductance states by partial SET and partial RESET operations under suitable programming conditions (as shown in Fig. 1(b)). The most important properties of a PCM device acting as a synapse are these intermediate conductance states and the gradual conductance change produced by suitable partial SET (LTP) or partial RESET (LTD) operations. To introduce the weight-updating behavior intuitively, we assume that the synaptic weight of the PCM device changes in response to pulses (as shown in Fig. 1(c)). If the pulse amplitude is below the crystallization voltage (V$_{\mathrm{C}}$), the PCM synapse retains its weight. If the pulse amplitude is above the melting voltage (V$_{\mathrm{m}}$), a partial RESET (LTD) operation is performed and the synaptic weight decreases by $\Delta W_{\mathrm{LTD}}$. If the pulse amplitude is between V$_{\mathrm{C}}$ and V$_{\mathrm{m}}$, a partial SET (LTP) operation is performed and the synaptic weight increases by $\Delta W_{\mathrm{LTP}}$. However, $\Delta W_{\mathrm{LTD}}$ is always much larger than $\Delta W_{\mathrm{LTP}}$ because of the asymmetric programming characteristic of the PCM device. In this paper, the typical value of the normalized $\Delta W_{\mathrm{LTP}}$ is set to 1/50; $\Delta W_{\mathrm{LTD}}$ equals $\Delta W_{\mathrm{LTP}}$ in the symmetric case and several times $\Delta W_{\mathrm{LTP}}$ in the asymmetric cases. Fig. 1(d) shows the weight trace over fifty suitable identical pulses.
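This pulse-response rule can be summarized in a short sketch. The following Python snippet is our own illustration, not the authors' code; the voltage constants are arbitrary placeholders, and weights are normalized to [0, 1]:

```python
# Illustrative model of the pulse-response rule in Fig. 1(c): below V_C the
# weight is retained, between V_C and V_m it is potentiated by DW_LTP, and
# above V_m it is depressed by DW_LTD.
V_C = 1.0            # crystallization voltage (placeholder value)
V_M = 2.0            # melting voltage (placeholder value)
DW_LTP = 1.0 / 50    # normalized LTP step, as assumed in the paper
DW_LTD = 4.0 / 50    # asymmetric LTD step (several times DW_LTP)

def apply_pulse(weight: float, amplitude: float) -> float:
    """Return the synaptic weight after one programming pulse."""
    if amplitude < V_C:                   # transmit-only pulse: no change
        return weight
    if amplitude < V_M:                   # partial SET -> LTP
        return min(weight + DW_LTP, 1.0)
    return max(weight - DW_LTD, 0.0)      # partial RESET -> LTD
```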

Fig. 2. (a) The leaky integrate-and-fire circuit, (b) The LTP and LTD pulse scheme for PCM synapse, (c) The simulation result of the leaky integrate-and-fire circuit.


III. NEURAL NETWORK

1. Synapse and Neuron

The basic units of a neural network are synapses and neurons. We use one PCM device as one synapse. A leaky integrate-and-fire (LIF) circuit (as shown in Fig. 2(a)) represents one neuron: it collects the weighted signals and aggregates them in the membrane potential V$_{\mathrm{mem}}$. As shown in Fig. 2(c), when V$_{\mathrm{mem}}$ exceeds a threshold (V$_{\mathrm{TH}}$), V$_{\mathrm{mem}}$ is reset to zero and the pulse generator releases a square pulse as a post-synaptic spike. A square pulse of low amplitude, below the crystallization voltage (V$_{\mathrm{C}}$), acts as a pre-synaptic spike (V$_{\mathrm{PRE}}$). Such pulses can be weighted and transmitted by the synapses without changing the synaptic weights. A square pulse of high amplitude, above the melting voltage (V$_{\mathrm{m}}$), acts as a post-synaptic spike (V$_{\mathrm{POST}}$). The overlap of V$_{\mathrm{PRE}}$ and V$_{\mathrm{POST}}$ sets the voltage across a PCM synapse between V$_{\mathrm{C}}$ and V$_{\mathrm{m}}$, so an LTP operation is performed (as shown in Fig. 2(b)). An LTD operation is performed when only V$_{\mathrm{POST}}$ is present, because the voltage across the PCM synapse then exceeds V$_{\mathrm{m}}$. This weight-updating rule can be understood as a simplified form of spike-timing-dependent plasticity (STDP) [11-17].
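As a behavioral illustration of the LIF neuron described above, consider the following minimal sketch. It models only the integrate, leak, fire, and reset behavior; the threshold and leak constants are our assumptions, not values of the circuit in Fig. 2(a):

```python
# Behavioral model of a leaky integrate-and-fire neuron: weighted inputs
# accumulate on V_mem with a leak; crossing V_TH resets V_mem to zero and
# emits a post-synaptic spike.
class LIFNeuron:
    def __init__(self, v_th: float = 1.0, leak: float = 0.95):
        self.v_mem = 0.0     # membrane potential V_mem
        self.v_th = v_th     # firing threshold V_TH (placeholder)
        self.leak = leak     # multiplicative leak per time step (placeholder)

    def step(self, weighted_input: float) -> bool:
        """Integrate one time step; return True if the neuron fires."""
        self.v_mem = self.v_mem * self.leak + weighted_input
        if self.v_mem >= self.v_th:
            self.v_mem = 0.0  # reset after firing
            return True
        return False
```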

Fig. 3. The input patterns (a) for original pattern rebuilding, (b) for common feature learning.


Fig. 4. (a) Conceptual design of the neuron circuit for the APS, (b) The pulse scheme for the APS.


2. Neural Network

In this paper, we built a two-layer fully-connected neural network to perform feature extraction. For simplicity, the output layer contains only one LIF neuron. The input layer contains 784 units, each connected to one pixel of the input pattern. The input pattern is a binary MNIST pattern whose white part corresponds to on-pixels and whose black part corresponds to off-pixels. An input neuron releases a pre-synaptic spike if its corresponding pixel is an on-pixel and remains silent if its corresponding pixel is an off-pixel. The learning mechanism is based on Hebbian theory: synapses that exhibit causal relationships are potentiated, whereas synapses that exhibit anti-causal relationships are depressed.
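A minimal sketch of one such Hebbian update, assuming the paper's normalized weight steps (the function and variable names are ours):

```python
import numpy as np

# One post-spike update for the 784-input, single-output network: synapses
# driven by on-pixels (causal) are potentiated, all others (anti-causal) are
# depressed, with weights clipped to the normalized range [0, 1].
def learning_step(weights: np.ndarray, pattern: np.ndarray,
                  dw_ltp: float = 1 / 50, dw_ltd: float = 4 / 50) -> np.ndarray:
    on = pattern.astype(bool)                              # on-pixel mask
    weights[on] = np.minimum(weights[on] + dw_ltp, 1.0)    # LTP
    weights[~on] = np.maximum(weights[~on] - dw_ltd, 0.0)  # LTD
    return weights

# Example: random initial weights and a random binary 28x28 pattern.
rng = np.random.default_rng(0)
w = rng.random(784)
p = (rng.random(784) < 0.2).astype(int)
w = learning_step(w, p)
```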

3. Feature Extraction Tasks

Feature extraction is one of the important capabilities of a neural network. We performed two feature-extraction tasks. The first task is rebuilding the original pattern from contaminated patterns: a pattern labeled “1” is contaminated by different random noises (as shown in Fig. 3(a)), where “contaminated” means that the noisy pixels are inverted. Our goal is to extract the original pattern from the contaminated patterns. The second task is common feature learning: different patterns with the same label “5” (as shown in Fig. 3(b)) are provided to the neural network, and our goal is to extract and learn their common features.
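For concreteness, pixel-inversion contamination could be generated as follows. This is our illustration; the noise fraction is an assumed parameter, not a value from the paper:

```python
import numpy as np

def contaminate(pattern: np.ndarray, noise_fraction: float,
                rng: np.random.Generator) -> np.ndarray:
    """Invert a random subset of pixels in a binary (0/1) pattern."""
    noisy = pattern.copy()
    flip = rng.random(noisy.shape) < noise_fraction  # pixels hit by noise
    noisy[flip] = 1 - noisy[flip]                    # pixel inversion
    return noisy
```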

4. Influence of Asymmetry

Theoretically, the two feature-extraction tasks can be performed perfectly if the weight updates caused by LTP and LTD are exactly matched. However, because of the mismatch between LTP and LTD, the result of feature extraction may deviate from the expected one.

For the original pattern rebuilding task, Fig. 6(a) shows the synaptic weights after the 100$^{\mathrm{th}}$ learning operation for the mismatched cases. As the mismatch increases, the image becomes fuzzy for $\Delta W_{\mathrm{LTD}} > 5/50$. In addition, the average weight difference (AWD) is shown in Fig. 6(c) and Table 1. The AWD is the normalized difference between the actual and expected weight values. The AWD after the 100$^{\mathrm{th}}$ learning operation is less than 1% for the cases with $\Delta W_{\mathrm{LTD}} < 5/50$. As the mismatch increases, the final learning result moves far from the target.
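Reading “normalized difference” as an average of the absolute weight errors over all N = 784 synapses, the AWD can be written as follows (our interpretation; the paper does not give an explicit formula):

$$\mathrm{AWD} = \frac{1}{N}\sum_{i=1}^{N}\left| w_{i}^{\mathrm{actual}} - w_{i}^{\mathrm{expected}} \right| \times 100\%$$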

The result of the common feature learning operation is shown in Fig. 7. The evolution of the synaptic weights for the fully-matched ($\Delta W_{\mathrm{LTD}} = 1/50$) and mismatched ($\Delta W_{\mathrm{LTD}} = 4/50$) cases is shown in Fig. 7(a) and (b), respectively. Because of the lower bound on the weight value, some information from the trained patterns is lost in the mismatched case once a majority of the synaptic weights decrease to the minimum value. Furthermore, the final weight values corresponding to the overlap of on-pixels for the mismatched case (ranging from 0 to 0.2) are much lower than those for the fully-matched case (ranging from 0 to 1). In Fig. 7(d) and (e), we selected four representative weights and show their evolution traces for the fully-matched and mismatched cases, respectively. For the fully-matched case, the weights corresponding to pixels (12,12) and (12,13), which receive more LTP operations, finally approach the maximum value, while the weights corresponding to pixels (23,15) and (23,16), which receive fewer LTP operations, remain relatively lower. For the mismatched case, however, all of these weights approach the minimum value.

Fig. 5. Flowchart of the previous scheme and proposed APS.


Table 1. The AWD after the 100$^{\mathrm{th}}$ learning operation


Fig. 6. Synaptic weights after the 100$^{\mathrm{th}}$ learning operation for (a) mismatch, (b) compensation cases. The average weight difference for (c) mismatch, (d) compensation cases.


IV. PROPOSED ALTERNATE PULSE SCHEME

1. Concept of APS

The asymmetry between the LTP and LTD operations is an intrinsic property of the PCM device. The average mismatch m, defined as the ratio of $\Delta W_{\mathrm{LTD}}$ to $\Delta W_{\mathrm{LTP}}$ (m = $\Delta W_{\mathrm{LTD}}/\Delta W_{\mathrm{LTP}}$), can be measured in advance. A counter is added to the neuron circuit (as shown in Fig. 4(a)) to record how many times the neuron has fired, and the pulse generator is designed to fire two kinds of pulses: pulse(1) performs only LTP operations without performing LTD operations, while pulse(2) is the normal pulse that performs both LTP and LTD operations. The working principles of the pulses are shown in Fig. 4(b). The overlap of pulse(1) can only drive the PCM synapses into the SET voltage range (LTP operations); the working principle of pulse(2) is explained in Section III. The flowcharts of the previous scheme and the proposed APS are shown in Fig. 5. The counter is initially set to zero and increments by one after each complete integrate-and-fire operation of the neuron. Pulse(1), which prohibits the LTD operation, is provided until the counter reaches m; once the counter reaches m, pulse(2) is fired instead of pulse(1). Under the APS, an LTD operation is therefore performed only once for every m LTP operations, which roughly compensates for the mismatch caused by the asymmetry.
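A minimal sketch of this counter logic, assuming m is an integer and using our own naming (the paper describes the scheme at the circuit level):

```python
# APS pulse selection: pulse(1) (LTP only) is issued until the neuron has
# fired m times; the m-th firing issues pulse(2) (LTP + LTD) and resets the
# counter, so one LTD occurs per m LTP operations.
class APSController:
    def __init__(self, m: int):
        self.m = m           # pre-measured mismatch m = dW_LTD / dW_LTP
        self.counter = 0     # increments on every integrate-and-fire event

    def pulse_type(self) -> int:
        """Return 1 for pulse(1) (LTP only) or 2 for pulse(2) (LTP and LTD)."""
        self.counter += 1
        if self.counter >= self.m:
            self.counter = 0  # reset after the compensating LTD
            return 2
        return 1
```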

2. Effects of APS on Original Pattern Rebuilding

To show the effects of the APS on original pattern rebuilding, we compare the learning results for the mismatched cases with those for the compensation cases (as shown in Fig. 6). The AWD values after the 100$^{\mathrm{th}}$ learning operation are shown in Table 1. For the cases with $\Delta W_{\mathrm{LTD}} > 8/50$, the learning results obtained with the APS approach the fully-matched case more closely than those of the mismatched cases under the same conditions. The AWDs for the compensation cases are much lower than those for the mismatched cases, and the patterns are much clearer. For the compensation cases with $\Delta W_{\mathrm{LTD}} < 15/50$, the AWD is reduced to less than 0.6% by the APS. For the compensation case with $\Delta W_{\mathrm{LTD}} = 20/50$, the AWD is as large as 1.658% because of background noise, but the pattern is still clear enough.

3. Effects of APS on Common Feature Learning

Fig. 7. Synaptic weights for (a) fully-matched, (b) mismatched, (c) compensation cases. The traces of specific synaptic weights for (d) fully-matched, (e) mismatched, (f) compensation cases.


Fig. 8. (a) The synaptic weights after the 5000$^{\mathrm{th}}$ learning operation for the fully-matched, mismatched, and compensation cases, (b) The difference ratio, (c) The summation of weights.


In Fig. 7, we compare the weight evolutions and the traces of specific synaptic weights for the fully-matched, mismatched, and compensation cases. For the mismatched and compensation cases, the weight changes for the LTP and LTD operations are set to $\Delta W_{\mathrm{LTP}} = 1/50$ and $\Delta W_{\mathrm{LTD}} = 4/50$. In contrast with the mismatched case, the evolution process for the compensation case more closely approaches that of the fully-matched case. The learning result for the compensation case captures the common properties of the trained patterns, whereas the mismatched case merely captures the properties of the last training pattern. The trace of the synaptic weight evolution for the compensation case is similar to that of the fully-matched case, while the trace for the mismatched case stays in the lower range.

The learning results after the 5000$^{\mathrm{th}}$ learning operation are shown in Fig. 8(a). All of the learning results for the compensation cases are much clearer than those for the mismatched cases. We statistically analyzed the difference ratio and the summation of the synaptic weights in Fig. 8(b) and (c), respectively. The difference ratio increases with the mismatch between LTP and LTD until it saturates at about 8%, and the summation of the normalized synaptic weights falls below 5 once the ratio of LTD to LTP exceeds 500%. This means that the learning result is merely the last training pattern and that the integrated weighted signal is very weak. Compared with the mismatched cases, the difference ratios for the corresponding compensation cases are reduced, and the summation of the normalized synaptic weights is similar to that of the fully-matched case. This means that the learning results capture the common properties of the trained patterns, just as in the fully-matched case, and that the integrated weighted signal is strong enough to drive the integrate-and-fire operation of the LIF neuron.

4. Effects of APS on Energy Efficiency

An additional effect of the APS is a reduction of the energy consumption in the learning process. Because the energy consumed by a unit RESET operation on a PCM device is several times larger than that consumed by a unit SET operation, the total energy consumption is reduced by reducing the number of RESET operations in the APS. The operation counts and energy consumption of the previous scheme and the APS ($\Delta W_{\mathrm{LTD}} = 4/50$) are compared in Table 2. Here, the energy consumption of the unit LTP and LTD operations is based on the data from [13]. For the previous scheme, the number of RESET operations (N$_{\mathrm{RESET}}$) equals the number of LTD operations (N$_{\mathrm{LTD}}$); for the APS, N$_{\mathrm{RESET}}$ equals a quarter of N$_{\mathrm{LTD}}$. As a result, the total energy consumption of the APS is only about 26.43% of that of the previous scheme.
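The comparison can be reproduced in outline as below. The per-operation energies and operation counts here are placeholders (the paper's values come from Table 2 and ref. [13]), so the printed ratio is illustrative rather than the reported 26.43%:

```python
# Back-of-the-envelope energy comparison between the previous scheme
# (one RESET per LTD) and the APS (one RESET per four LTDs, m = 4).
E_SET = 1.0        # assumed energy per unit SET (LTP) operation
E_RESET = 10.0     # assumed energy per unit RESET (LTD) operation
N_LTP = 10_000     # assumed number of LTP operations
N_LTD = 10_000     # assumed number of LTD-eligible events

e_prev = N_LTP * E_SET + N_LTD * E_RESET       # previous scheme
e_aps = N_LTP * E_SET + (N_LTD / 4) * E_RESET  # APS: N_RESET = N_LTD / 4

print(f"APS energy / previous-scheme energy = {e_aps / e_prev:.2%}")
```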

V. CONCLUSIONS AND DISCUSSION

Table 2. The number of operations and energy consumption for a specified synapse and the total synapse array


In this paper, we analyzed the influence of the asymmetry between LTP and LTD and demonstrated it with two feature-extraction tasks: (1) original pattern rebuilding and (2) common feature learning. We then proposed an APS to compensate for the influence of the asymmetry. As a result, the pattern rebuilt with the APS is more similar to the original pattern, and in common feature learning the APS reduces the difference ratio and increases the summation of the synaptic weights. The APS thus improves the learning results by reducing the influence of the asymmetry between LTP and LTD. However, we focused only on the asymmetry between the LTP and LTD operations and neglected other non-ideal characteristics of PCM devices, such as stochasticity and non-linearity. We will further study the effects of the APS with more non-ideal characteristics [18-20] of the synapse.

ACKNOWLEDGMENTS

This research was supported by Nano Material Technology Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2016M3A7B4910398).

REFERENCES

[1] Kuzum D., 2011, Nanoelectronic programmable synapses based on phase change materials for brain-inspired computing, Nano Letters, Vol. 12, No. 5, pp. 2179-2186.

[2] Breitwisch M. J., Cheek R. W., Lam C. H., Modha D. S., Rajendran B., US Patent 20100299297.

[3] Kim S., 2015, NVM neuromorphic core with 64k-cell (256-by-256) phase change memory synaptic array with on-chip neuron circuits for continuous in-situ learning, 2015 IEEE International Electron Devices Meeting (IEDM), pp. 17.1.1-17.1.4.

[4] Wang Z., 2015, A 2-transistor/1-resistor artificial synapse capable of communication and stochastic learning in neuromorphic systems, Frontiers in Neuroscience, Vol. 8, p. 438.

[5] Pantazi A., 2016, All-memristive neuromorphic computing with level-tuned neurons, Nanotechnology, Vol. 27, No. 35, p. 355205.

[6] Ambrogio S., 2016, Unsupervised Learning by Spike Timing Dependent Plasticity in Phase Change Memory (PCM) Synapses, Frontiers in Neuroscience, Vol. 10, No. 56.

[7] Tuma T., 2016, Detecting Correlations Using Phase-Change Neurons and Synapses, IEEE Electron Device Letters, Vol. 37, No. 9, pp. 1238-1241.

[8] Nandakumar S. R., 2017, Supervised Learning in Spiking Neural Networks with MLC PCM Synapses, 2017 75th Annual Device Research Conference (DRC), pp. 1-2.

[9] La Barbera S., 2018, Narrow Heater Bottom Electrode-Based Phase Change Memory as a Bidirectional Artificial Synapse, Advanced Electronic Materials, Vol. 4, No. 9, p. 1800223.

[10] Boybat I., 2018, Neuromorphic computing with multi-memristive synapses, Nature Communications, Vol. 9, No. 1, p. 2514.

[11] Suri M., 2011, Phase change memory as synapse for ultra-dense neuromorphic systems: Application to complex visual pattern extraction, 2011 International Electron Devices Meeting, pp. 4.4.1-4.4.4.

[12] Suri M., 2012, Physical aspects of low power synapses based on phase change memory devices, Journal of Applied Physics, Vol. 112, No. 5, p. 054904.

[13] Bichler O., 2012, Visual Pattern Extraction Using Energy-Efficient "2-PCM Synapse" Neuromorphic Architecture, IEEE Transactions on Electron Devices, Vol. 59, No. 8, pp. 2206-2214.

[14] Suri M., 2011, Phase change memory for synaptic plasticity application in neuromorphic systems, The 2011 International Joint Conference on Neural Networks, pp. 619-624.

[15] Querlioz D., 2011, Simulation of a memristor-based spiking neural network immune to device variations, The 2011 International Joint Conference on Neural Networks, pp. 1775-1781.

[16] Querlioz D., 2012, Bioinspired networks with nanoscale memristive devices that combine the unsupervised and supervised learning approaches, 2012 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), pp. 203-210.

[17] Querlioz D., 2013, Immunity to Device Variations in a Spiking Neural Network With Memristive Nanodevices, IEEE Transactions on Nanotechnology, Vol. 12, No. 3, pp. 288-295.

[18] Boybat I., 2017, Stochastic weight updates in phase-change memory-based synapses and their influence on artificial neural networks, 2017 13th Conference on Ph.D. Research in Microelectronics and Electronics, pp. 13-16.

[19] Narayanan P., 2017, Neuromorphic Technologies for Next-Generation Cognitive Computing, 2017 Electron Devices Technology and Manufacturing Conference (EDTM).

[20] Burr G. W., 2015, Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element, IEEE Transactions on Electron Devices, Vol. 62, No. 11, pp. 3498-3507.

Authors

Cheng Li

was born in Linfen city, Shanxi Province, China, in 1989.

He received the B.S. degree in electronics engineering from Hanyang University, Seoul, South Korea, in 2013.

He is currently working toward the unified M.S. and Ph.D. degree in electronic engineering. His current research interests include neuromorphic systems, phase-change material synapses, and neuron circuits.

Junseop An

received the B.S. degree in electronic engineering from Hanyang University, Seoul, Korea, in 2015.

Since 2015, he has been enrolled in the unified Master's and Doctoral course in electronics engineering.

His current research interests include the development of neuromorphic systems using phase-change memory, simulation studies using TCAD, and the fabrication and characterization of memory devices such as PRAM.

Jun Young Kweon

received the B.S. degree in electronics engineering from Hanyang University, Seoul, South Korea, in 2014.

He is currently a student in the unified Ph.D. course in the Division of Nanoscale Semiconductor Engineering at Hanyang University.

His research interests include phase-change material memory systems.

Yun-Heub Song

received his M.S. degree in electronic engineering from Hanyang University, Seoul, Korea, in 1992, and his Ph.D. degree in intelligent mechanical engineering from Tohoku University, Sendai, Japan, in 1999. He is currently a Professor in Electronic Engineering, Hanyang University, Seoul, Korea.

He has researched semiconductor devices and circuit design for more than 30 years at the Semiconductor R&D Center, Samsung Electronic Co. and Hanyang University, Korea.

When Prof. Song was working at Samsung, he was responsible for the device and product development of Flash memory as a vice-president, and he developed 256Mb and 512Mb NOR Flash memory in 2000-2003. After moving to Hanyang University, Korea, in 2008, he served as vice-dean of the College of Engineering, engaging in extensive international collaboration research and the planning of industrial projects from 2011 to 2013.

His research interests include device reliability modeling, device characterization, novel device structures and architecture for memory and logic applications, circuit design and algorithms for low power and high speed, and sensor systems based on semiconductor technology.