Published in: Journal of Computational Electronics 5/2021

Open Access 19-08-2021

Design and analysis of SHE-assisted STT MTJ/CMOS logic gates

Authors: Prashanth Barla, Vinod Kumar Joshi, Somashekara Bhat

Abstract

We have investigated the spin-Hall effect (SHE)-assisted spin transfer torque (STT) switching mechanism in a three-terminal MTJ device built from a perpendicular magnetic tunnel junction (p-MTJ) and a heavy metal of high atomic number, which possesses large spin–orbit interaction. Using the p-MTJ and complementary metal-oxide-semiconductor (CMOS) logic, we have designed three basic hybrid logic gates based on the logic-in-memory structure: NOR/OR, NAND/AND, and XNOR/XOR. The performance of these hybrid gates is then evaluated and compared with that of conventional CMOS-based gates in terms of power, delay, power delay product, and device count. From the analysis, it is concluded that SHE-assisted STT MTJ/CMOS logic gates are nonvolatile, consume less power, and occupy a smaller die area than conventional CMOS-only logic gates.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Spintronics is an emerging area that uses the spin of an electron in association with its charge [1, 2], which provides an extra degree of freedom for computation. The field has attracted much attention because considerable effort has already gone into spintronics-based applications, ranging from nonvolatile (NV) memory [3–8], NV-logic implementation [9–18], and magnetic sensors that detect fields in the pico-tesla range to emerging technologies such as neuromorphic and brain-inspired computing [19–21]. In conventional electronics, which relies on charge, a scalar quantity, the power (or heat) dissipation can be reduced only up to a certain extent. The absence or presence of charge may correspond to logic “0” and logic “1.” In contrast, spin is a pseudovector quantity whose magnitude is fixed at \(\hslash /2\) (where \(\hslash\) is the reduced Planck’s constant) and whose polarization is variable. If an electron is placed in a magnetic field, various logic states can be achieved by varying the applied magnetic field. However, hybrid circuits, which combine conventional charge-based CMOS logic with spin-based spintronic devices, need only two stable logic states, i.e., logic “1” and logic “0.” In spintronics, a spin-up and a spin-down electron are taken to represent logic “1” and logic “0,” respectively, or vice versa. Various spintronic devices have been reported in the literature, such as the spin-valve [22], magnetic tunnel junction (MTJ) [23], ferroelectric tunnel junction (FTJ) [24], domain wall [13, 25], skyrmion [26, 27], and all-spin logic (ASL) devices [17, 18, 28].
Among all the spintronic devices, we chose the MTJ owing to its advantages such as simplicity, nonvolatility, large endurance, high density, fast reading capability, logic computation ability, 3D fabrication, availability of models, and ease of manufacturing and of integration with the existing CMOS technology [1, 29, 30]. Additionally, it consumes zero static power and has an instant ON-OFF feature. Various switching/writing techniques for MTJ devices are reported in the literature, such as spin-transfer torque (STT) [31–33], spin–orbit torque (SOT) [34–36], field-induced magnetic switching (FIMS) [37], and voltage-controlled magnetic anisotropy (VCMA) [38–40]. From a commercial standpoint, the STT switching mechanism is the most attractive of these [41–45]. Also, because of its simplicity, the STT-MTJ can be easily incorporated into logic-in-memory (LIM) circuits. LIM is a paradigm in which computational capability is embedded into the memory, i.e., storage and processing of data are united in a single unit. In LIM, the memory involved in the processing of information is also nonvolatile. The LIM structure offers benefits over the conventional von Neumann architecture, viz. lower power dissipation (both static and dynamic), scaling compatibility below the sub-micron level (thereby increasing the device density), immunity to radiation effects, etc. [30, 46–48]. However, one of the key challenges for LIM circuits is to improve the write speed while lowering the energy dissipation. The STT-MTJ device used in LIM shows a large intrinsic incubation delay, which slows the write speed. Furthermore, incubation delay can contribute to circuit reliability issues, because ensuring correct writing requires a large write current through the STT-MTJ, and such a current causes dielectric breakdown in the STT-MTJ over a period of time.
This results in poor write endurance when only STT switching is used for MTJ devices [49]. In other words, it is challenging to keep the critical current density (\(J_C\)) low while maintaining high stability for the LIM structure: lowering \(J_C\) comes at the cost of lower stability, and a lower \(J_C\) also causes read disturbance in the STT-MTJ. Several methods to alleviate this trade-off are discussed in the literature [50–54]. However, all the issues mentioned above can be addressed by spin–orbit torque (SOT) [55–57]. SOT is observed in a three-terminal device as shown in Fig. 1a. For STT, a spin-polarized current is needed to switch the magnetization of the free layer, while for SOT, this spin-polarized current is created by either the Rashba effect [58] or the spin-Hall effect (SHE) [35, 59]. The SOT induced by the Rashba effect is still subject to debate [60]. Therefore, for simplicity and because an MTJ model is available, we consider SHE to be the driving mechanism for SOT. However, the need for an additional external field hinders the development of SHE-MTJ-based circuits that work purely on the SHE mechanism. As a solution, the SHE-assisted STT (SHE+STT) switching method was reported in the literature [61]. The SHE+STT switching technique has proved to be not only faster but also energy efficient. Using the SHE+STT switching mechanism, various applications such as memory [62, 63], flip-flops [64, 65], full adders [66, 67], and recently basic logic gates [68, 69] have already been developed. However, the logic gates developed in Ref. [68] utilize in-plane MTJ (i-MTJ) rather than perpendicular MTJ (p-MTJ). The i-MTJ suffers from disadvantages such as larger size, difficulty of scaling, and low thermal stability as compared to p-MTJ [2, 70, 71]. Further, only AND and OR operations can be performed using the circuit proposed in Ref. [68]. Though Ref.
[69] uses p-MTJ to develop the logic gate circuit, it can perform only NOT, AND/NAND, and OR/NOR operations. Further, to perform these operations, the circuit requires two passive capacitors to store the sensing voltage signals, and incorporating passive devices is not preferred in VLSI design. In this paper, we have developed all the basic logic gates, namely NOR/OR, NAND/AND, and XNOR/XOR, using SHE-MTJ devices based on the LIM structure, where the SHE-MTJ device is written using the SHE+STT switching mechanism. Simulations are carried out to study the performance of these circuits in terms of output response, power dissipation, power delay product (PDP), and the number of devices used, in comparison with their CMOS counterparts, the double pass-transistor logic-based clocked CMOS (DPTL-\(\text {C}^\text {2}\text {MOS}\)) logic gates. Further, we have also performed Monte Carlo (MC) simulations on these gates to study the power variations by incorporating process and mismatch variations in CMOS and in the extracted MTJ parameters, which would arise during the fabrication process.
The paper is organized as follows: Sect. 2 presents the structure and working of the SHE-MTJ device and the hybrid MTJ/CMOS LIM structure. Section 3 explains the design and working of all the hybrid logic gates. Section 4 presents the performance evaluation of all the hybrid gates in terms of key performance indicators (KPIs): power dissipation, delay, PDP, and device count. These KPIs are compared with those of their CMOS counterparts. To study the power variations of the gates under process and mismatch (PM) variations, we have also conducted MC simulations. Finally, the paper is concluded in Sect. 5. An appendix showing the structure of the DPTL-\(\text {C}^\text {2}\text {MOS}\)-based logic gates is included for convenience.

2 Background

SHE is considered another phenomenon, apart from the Rashba effect, that accounts for SOT [72]. Based on Mott scattering [73], Dyakonov and Perel predicted SHE in 1971 [34]; the idea was revived by Hirsch in 1999 [35] and later demonstrated in Pt at room temperature through the generation of a substantial spin current [36]. SHE was largely believed to arise from the skew scattering of s-electrons [35, 74], which is also known as the extrinsic mechanism. Heavy metals (HM), which have a high atomic number and large spin–orbit interaction (SOI), convert charge current into spin current and thus exhibit SHE, as shown in Fig. 1b. In the HM, a flow of unpolarized electrons with charge current density (\(\mathbf {J}_C\)) creates spin-polarized electrons with spin current density (\(\mathbf {J}_S\)). Note that \(\mathbf {J}_C\), \(\mathbf {J}_S\), and the electron spin polarization \(\mathbf {\sigma }\) are all perpendicular to each other, and the relationship between them is defined in Eq. 1 as
$$\begin{aligned} \mathbf {{J}_{S}}=\theta _{\mathrm {SH}}\left( \mathbf {\sigma }\times \mathbf {{J}_{C}}\right) . \end{aligned}$$
(1)
where \(\theta _{\mathrm {SH}}\) is the spin-Hall angle characterizing the strength of SHE in the HM. In the SHE-MTJ structure (Fig. 1a), the SHE generated in the HM influences the magnetization direction of the free layer (FL). However, accurate switching cannot be guaranteed, because the magnetization direction of the FL in a p-MTJ and the electron spin in the HM are not collinear. Accurate switching in a p-MTJ can be achieved by applying either an external magnetic field or an STT current. From the literature, it is clear that generating and maintaining an external magnetic field is undesirable from a practical point of view owing to design complexity, reduced sensitivity, and thermal stability concerns [2]. Hence, using an STT current in conjunction with SHE is a more suitable option for precise switching in a p-MTJ [61]. We refer to this as SHE+STT switching. Figure 2 shows the switching of a three-terminal p-MTJ device structure with SHE+STT currents.
Switching the p-MTJ from AP to P (Fig. 2a) proceeds as follows. When the SHE current \(J_{SHE}\) flows in the Y-direction (from T3 to T2), spin accumulation is created at the FL-HM interface. This exerts an SOT on the FL that tilts its magnetic orientation from the −Z-direction to the X-Y plane. At the same time, the STT current (\(J_{STT}\)) flowing in the −Z-direction (from T1 to T2) exerts STT and switches the orientation of the FL from the X-Y plane to the Z-direction. Conversely, to switch the p-MTJ from P to AP (Fig. 2b), the direction of \(J_{STT}\) is reversed (from −Z to Z).
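The orthogonality stated by Eq. 1 can be checked numerically. The following is a minimal sketch only: the spin-Hall angle, the magnitude of the charge current density, and the choice of spin polarization along +X are assumed values for illustration and are not taken from the paper or from the SHE-MTJ model.

```python
# Sketch of Eq. 1: J_S = theta_SH * (sigma x J_C).
# theta_sh, the |J_C| magnitude and the +X spin polarization are assumed
# values chosen only for illustration; they are not from the paper.

def cross(u, v):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def spin_current_density(theta_sh, sigma, j_c):
    """Evaluate Eq. 1 for a unit spin-polarization vector sigma."""
    return tuple(theta_sh * c for c in cross(sigma, j_c))

theta_sh = 0.3             # assumed spin-Hall angle (dimensionless)
sigma = (1.0, 0.0, 0.0)    # assumed spin polarization along +X
j_c = (0.0, 1.0e11, 0.0)   # assumed charge current density along +Y (A/m^2)

j_s = spin_current_density(theta_sh, sigma, j_c)
print("J_S =", j_s)
print("J_S . sigma =", dot(j_s, sigma), " J_S . J_C =", dot(j_s, j_c))
```

With a charge current along Y, as for \(J_{SHE}\) in Fig. 2, and the assumed polarization along X, the resulting spin current points along Z, i.e., towards the free layer, and both dot products vanish.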

2.1 Logic-in-memory structure

Kautz first put forth the idea of LIM in 1969 [75], and a year later, in 1970, Stone presented an approach for a LIM computer [76]. The idea did not attract much attention afterwards. Recently, however, owing to the slowing trend of transistor scaling, advancements in thin films, and the emergence of spintronic devices, the LIM concept is drawing the notice of both academia and industry. By positioning the NV memory section above the CMOS logic, LIM can offer solutions to the problems of increased leakage current due to scaling and considerable delay in long interconnects [46]. MTJs provide nonvolatility to the structure. As a result, the power supply to an idle block can be turned off without losing the information stored in the MTJ. When the power is restored, this information is readily available for processing without the need for a restore operation. Hence, the static power dissipation is almost zero in the standby mode, which makes this approach convenient for “instant-on” and “normally-off” systems [77]. In addition, the feasibility of 3D stacking of MTJs on top of CMOS not only reduces the latency but also increases the integration density in the LIM structure. A schematic of the LIM structure is illustrated in Fig. 3; its three major components are:
1. Sense amplifier/Read circuit: It produces Out and \(\overline{\text {Out}}\) based on the differential current (IL and IR) flowing through the logic network.
2. Write circuit: It writes the information into the pair of MTJs, i.e., changes the MTJ states. The write circuitry changes with the writing mechanism used.
3. Logic network: Various logical operations can be performed based on the design of the nMOS logic tree and the information stored in the MTJ pair.

3 Hybrid MTJ/CMOS LIM-based logic gates

3.1 Selection and working of MTJ reading circuit

We have adopted the reading circuit proposed in Ref. [78], which is a modified version of the conventional pre-charge sense amplifier (PCSA) discussed in Ref. [79]. We refer to the PCSA from Ref. [79] as PCSA1 and the one from Ref. [78] as PCSA2. PCSA2 is superior to PCSA1 in terms of speed and power/energy consumption; its schematic is shown in Fig. 4. The PCSA2 circuit works in two phases: the pre-charge phase and the evaluation phase. In the pre-charge phase, both outputs, OUT and \(\overline{\text {OUT}}\), are high. During the evaluation phase, the values stored in the MTJs are read and reflected at the output nodes. When bit “0” is stored, the MTJ pair (MTJ0-MTJ1) is in the AP-P configuration; when bit “1” is stored, the MTJ pair is configured as P-AP. The working of the reading circuit is explained below for the case in which bit “0” is stored in the MTJ pair. Since the outputs of the hybrid circuits are complementary, we initially consider OUT to be at logic “0” and \(\overline{\text {OUT}}\) at logic “1.”
1. Pre-charge phase: With the clock CLK = “0,” transistor P3 is ON and N3 is OFF. The OUT and \(\overline{\text {OUT}}\) node voltages are shared through P3, and both nodes are pre-charged to Vdd-Vth (in PCSA1 [79], they are pre-charged to Vdd).
2. Evaluation phase: With CLK = “1,” transistor N3 is turned ON and offers a discharging path to gnd for both the OUT and \(\overline{\text {OUT}}\) nodes. The currents I1 and I2 in the left branch (LB) and right branch (RB), respectively, start to flow towards gnd at different rates because of the resistance difference between MTJ0 and MTJ1. As MTJ0 and MTJ1 are at \(R_{AP}\) and \(R_P\), respectively, I2 discharges faster than I1. This pulls the OUT node down far enough to turn P1 ON, so the node \(\overline{\text {OUT}}\) is raised to logic “1,” while OUT keeps discharging to gnd and reaches logic “0.” Bit “0” is thus sensed from MTJ0-MTJ1. Reading bit “1,” when the pair is configured as P-AP, can be understood in the same way.
PCSA2 produces a quicker output response than PCSA1 because, in the pre-charge phase of PCSA2, both outputs, i.e., OUT and \(\overline{\text {OUT}}\), are pre-charged to Vdd-Vth, whereas in PCSA1 they are pre-charged to Vdd. In the evaluation phase, discharging one of the PCSA2 outputs from Vdd-Vth to gnd takes less time than discharging a PCSA1 output from Vdd to gnd. The lower pre-charge voltage (Vdd-Vth) also means that PCSA2 dissipates less power than PCSA1. Further, PCSA2 uses one PMOS transistor fewer than PCSA1, which helps to reduce the transistor count.
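The read decision described above can be summarized behaviorally: the branch with the lower resistance discharges faster and sets the output. The sketch below is an illustration under assumptions, not a circuit simulation. The parallel-state resistance `R_P` is a made-up value, the function names are hypothetical, and \(R_{AP}\) is derived from the zero-bias TMR of 200% listed in Table 3 using the standard definition TMR = (\(R_{AP}-R_P\))/\(R_P\).

```python
# Behavioral sketch of reading a complementary MTJ pair.
# R_P is an assumed value; TMR = 200 % is the zero-bias value of Table 3.

R_P = 5_000.0              # assumed parallel-state resistance (ohms)
TMR = 2.0                  # 200 %
R_AP = R_P * (1.0 + TMR)   # from TMR = (R_AP - R_P) / R_P

def mtj_pair(bit):
    """Return (R_MTJ0, R_MTJ1): bit 0 is stored as AP-P, bit 1 as P-AP."""
    return (R_AP, R_P) if bit == 0 else (R_P, R_AP)

def sense(r_left, r_right):
    """Return (OUT, OUT_bar) from the two branch resistances.

    If the left branch is the more resistive one, the right branch
    discharges faster and OUT ends at logic 0, as in the bit-0 example.
    """
    out = 1 if r_left < r_right else 0
    return out, 1 - out

for bit in (0, 1):
    print("stored bit", bit, "-> (OUT, OUT_bar) =", sense(*mtj_pair(bit)))
```

The sketch ignores the pre-charge level and all timing, so it captures only the final logic values, not the speed or power advantage of PCSA2 over PCSA1.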

3.2 SHE+STT MTJ writing circuit for LIM structure

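Table 1 Input, intermediate signals and the corresponding MTJ states during SHE+STT writing

Columns EnW, EnSHE, and Data are input signals; STTP, STTN, SHEP, and SHEN are intermediate signals; MTJ0 and MTJ1 give the MTJ status.

| EnW | EnSHE | Data | STTP | STTN | SHEP | SHEN | MTJ0 | MTJ1 |
|---|---|---|---|---|---|---|---|---|
| 0 | X | X | 1 | 0 | 1 | 0 | X | X |
| 1 | 1 | 1 | 1 | 1 | 0 | 1 | Metastable | Metastable |
| 1 | 0 | 1 | 1 | 1 | 0 | 0 | P | AP |
| 1 | 1 | 0 | 0 | 0 | 0 | 1 | Metastable | Metastable |
| 1 | 0 | 0 | 0 | 0 | 1 | 1 | AP | P |

X represents don’t care condition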
Figure 5 shows the writing circuit for SHE-MTJs with the SHE+STT switching mechanism. The writing driver is divided into two parts: the control circuitry and the writing core. It has three inputs, viz. Data, EnW, and EnSHE. Table 1 shows the different combinations of the input signals, the intermediate (control) signals, and the corresponding status of the MTJ pair. When EnW is at logic “0,” the writing core is disabled by the intermediate signals of the control circuit (STTP, STTN, SHEP, SHEN = “1010”), and no writing takes place. Information is written only when EnW = “1.” Consider writing bit “1” into the MTJ pair, for which EnW, EnSHE, and Data are set to “111,” respectively. The intermediate signals STTP, STTN, SHEP, SHEN are then at “1101,” respectively. This turns ON transistors MP1, MP2, MP3, MN0, MN1, and MN3. Two types of current flow in the circuit, i.e., the SHE and STT currents. The SHE current paths are Vdda-MP1-MTJ0-MN1-gnd and Vdda-MP3-MTJ1-MN3-gnd. The STT current paths are Vdda-MP2-MTJ1-MN3-gnd and Vdda-MP1-MTJ0-MN0-gnd. At this point, the MTJ pair is in a metastable state. After a brief period, EnSHE is set to “0,” making STTP, STTN, SHEP, SHEN = “1100.” This turns OFF transistors MP3 and MN1, which stops the flow of the SHE current in the circuit. The STT current, however, continues to flow and changes the MTJ0-MTJ1 configuration to P-AP. Similarly, Data “0” can be written into the MTJ pair by suitably changing the input signals as shown in Table 1.
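The two-phase write sequence described above (SHE+STT first, then STT only) can be traced with a lookup of the Table 1 entries. This is an illustrative sketch only: the dictionary and function names are hypothetical, and the code copies the tabulated signal values rather than modeling the gates of the control circuit.

```python
# Lookup of the control signals listed in Table 1.
# Keys: (EnW, EnSHE, Data); values: (STTP, STTN, SHEP, SHEN).
# Names are illustrative and do not come from the paper.

WRITE_ENABLED = {
    (1, 1, 1): (1, 1, 0, 1),  # SHE+STT phase while writing "1" (metastable)
    (1, 0, 1): (1, 1, 0, 0),  # STT-only phase, MTJ0-MTJ1 settles to P-AP
    (1, 1, 0): (0, 0, 0, 1),  # SHE+STT phase while writing "0" (metastable)
    (1, 0, 0): (0, 0, 1, 1),  # STT-only phase, MTJ0-MTJ1 settles to AP-P
}
WRITE_DISABLED = (1, 0, 1, 0)  # EnW = 0: writing core is off

FINAL_STATE = {1: ("P", "AP"), 0: ("AP", "P")}  # (MTJ0, MTJ1) after the write

def control_signals(en_w, en_she, data):
    """Return (STTP, STTN, SHEP, SHEN) for the given inputs."""
    if en_w == 0:
        return WRITE_DISABLED  # EnSHE and Data are don't-care
    return WRITE_ENABLED[(en_w, en_she, data)]

def write_bit(data):
    """Trace a write: EnSHE = 1 first, then EnSHE = 0 with EnW held at 1."""
    phases = [control_signals(1, 1, data), control_signals(1, 0, data)]
    return phases, FINAL_STATE[data]

print(write_bit(1))
print(write_bit(0))
```

Keeping the table as data rather than deriving Boolean expressions avoids asserting anything about the internal gate structure of the control circuit, which is only shown in Fig. 5.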

3.3 Hybrid MTJ/CMOS LIM-based logic gates

Figure 6a–c shows the circuits for the NOR/OR, NAND/AND, and XNOR/XOR gates, respectively, and Table 2 shows the corresponding truth table. These hybrid circuits operate in a pre-charge phase and an evaluation phase: the inputs are applied in the pre-charge phase, and the output and its complement are obtained in the evaluation phase. The NOR/OR gate works as follows in its evaluation phase. Whenever input A = “1,” N6 in the right branch (RB) is OFF and N5 in the left branch (LB) is ON. The total RB resistance for the discharge current (I2) is the sum of the OFF-transistor (N6) resistance (\(R_{OFF}\)) and the MTJ1 resistance (\(R_P\) or \(R_{AP}\)). Similarly, the total LB resistance for the discharge current (I1) is the sum of the ON-transistor (N5) resistance (\(R_{ON}\)) and the MTJ0 resistance (\(R_P\) or \(R_{AP}\)). Since transistor N6 is OFF, the RB resistance is always greater than the LB resistance, and the state of the MTJ does not affect the output, because \(R_{OFF}>R_{ON}+R_{AP}\) and \(R_{OFF}>R_{ON}+R_P\). Therefore the OR output node is raised to Vdd and the NOR node discharges to gnd, i.e., it is at logic “0.” When A = “0,” on the other hand, N4 and N6 in LB and RB are ON, so both the NOR and OR outputs have a discharging path to gnd. In this condition, the bit stored in B decides the value of the output nodes. If B = “0,” the MTJ0-MTJ1 pair is in the AP-P configuration, so the total resistances seen by the currents I1 and I2 are \(R_{ON}+R_{AP}\) and \(R_{ON}+R_{P}\), respectively. Thus the LB resistance is greater than the RB resistance. This results in the NOR node being raised to logic “1”; correspondingly, the OR node is discharged to gnd. When A = “0” and B = “1,” the MTJ pair is in the P-AP configuration, and the resistances of LB and RB are \(R_{ON}+R_{P}\) and \(R_{ON}+R_{AP}\) for the currents I1 and I2, respectively. Now the LB resistance is lower than the RB resistance. Hence, the OR output node rises to logic “1” and the NOR output node discharges to gnd, producing logic “0.” The working of the NAND/AND and XNOR/XOR gates can be understood similarly.
Table 2 Truth table for various logic gates along with the corresponding path resistance for the branch current in logic network

| Gate type | A | B | LB resistance for I1 (\(R_{LB}\)) | RB resistance for I2 (\(R_{RB}\)) | Resistance comparison | OUT | \(\overline{\text {OUT}}\) |
|---|---|---|---|---|---|---|---|
| NOR/OR | 0 | 0 | \(R_{ON}+R_{AP}\) | \(R_{ON}+R_P\) | \(R_{LB}>R_{RB}\) | 0 | 1 |
| NOR/OR | 0 | 1 | \(R_{ON}+R_P\) | \(R_{ON}+R_{AP}\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| NOR/OR | 1 | 0 | \(R_{ON}+R_{AP}\) | \(R_{OFF}+R_P\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| NOR/OR | 1 | 1 | \(R_{ON}+R_P\) | \(R_{OFF}+R_{AP}\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| NAND/AND | 0 | 0 | \(R_{OFF}+R_{AP}\) | \(R_{ON}+R_P\) | \(R_{LB}>R_{RB}\) | 0 | 1 |
| NAND/AND | 0 | 1 | \(R_{OFF}+R_P\) | \(R_{ON}+R_{AP}\) | \(R_{LB}>R_{RB}\) | 0 | 1 |
| NAND/AND | 1 | 0 | \(R_{ON}+R_{AP}\) | \(R_{ON}+R_P\) | \(R_{LB}>R_{RB}\) | 0 | 1 |
| NAND/AND | 1 | 1 | \(R_{ON}+R_P\) | \(R_{ON}+R_{AP}\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| XNOR/XOR | 0 | 0 | \(R_{ON}+R_{AP}\) | \(R_{ON}+R_P\) | \(R_{LB}>R_{RB}\) | 0 | 1 |
| XNOR/XOR | 0 | 1 | \(R_{ON}+R_P\) | \(R_{ON}+R_{AP}\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| XNOR/XOR | 1 | 0 | \(R_{ON}+R_{P}\) | \(R_{ON}+R_{AP}\) | \(R_{LB}<R_{RB}\) | 1 | 0 |
| XNOR/XOR | 1 | 1 | \(R_{ON}+R_{AP}\) | \(R_{ON}+R_{P}\) | \(R_{LB}>R_{RB}\) | 0 | 1 |

\(R_{ON}\): NMOS ON resistance. \(R_{OFF}\): NMOS OFF resistance.
\(R_{P}\): P state resistance of MTJ. \(R_{AP}\): AP state resistance of MTJ
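The branch-resistance argument of Table 2 can be reproduced with a small behavioral model. This is a sketch under assumptions: the numerical values of \(R_{ON}\), \(R_{OFF}\), and \(R_P\) are invented for illustration, the function names are hypothetical, and the model only encodes the resistance expressions of Table 2, not the transistor-level networks of Fig. 6.

```python
# Behavioral sketch of the three hybrid gates using the branch resistances
# of Table 2. All resistance values are assumptions for illustration.

R_ON, R_OFF = 1.0e3, 1.0e9   # assumed NMOS ON/OFF resistances (ohms)
R_P = 5.0e3                  # assumed P-state MTJ resistance (ohms)
R_AP = 3.0 * R_P             # follows from TMR = 200 % (Table 3)

def branches(gate, a, b):
    """Return (R_LB, R_RB) for input A and stored bit B, as in Table 2."""
    # Stored bit B: 0 means MTJ0-MTJ1 = AP-P, 1 means P-AP.
    mtj0, mtj1 = (R_AP, R_P) if b == 0 else (R_P, R_AP)
    if gate == "NOR/OR":      # A = 1 puts an OFF transistor in the right branch
        return R_ON + mtj0, (R_OFF if a else R_ON) + mtj1
    if gate == "NAND/AND":    # A = 0 puts an OFF transistor in the left branch
        return (R_ON if a else R_OFF) + mtj0, R_ON + mtj1
    if gate == "XNOR/XOR":    # Table 2: for A = 1 the two entries are exchanged
        if a:
            return R_ON + mtj1, R_ON + mtj0
        return R_ON + mtj0, R_ON + mtj1
    raise ValueError(f"unknown gate: {gate}")

def evaluate(gate, a, b):
    """Return (OUT, OUT_bar); OUT is 1 when R_LB < R_RB, as in Table 2."""
    r_lb, r_rb = branches(gate, a, b)
    out = 1 if r_lb < r_rb else 0
    return out, 1 - out

for gate in ("NOR/OR", "NAND/AND", "XNOR/XOR"):
    for a in (0, 1):
        for b in (0, 1):
            print(gate, a, b, evaluate(gate, a, b))
```

In this model OUT corresponds to the OR, AND, and XOR functions and \(\overline{\text {OUT}}\) to their complements, which matches the OUT and \(\overline{\text {OUT}}\) columns of Table 2 as long as \(R_{OFF}\) exceeds \(R_{ON}+R_{AP}\).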

4 Simulation results and discussion

We have performed electrical simulations in a 45 nm CMOS generic process design kit using the Cadence tool (IC 6.1.7-64b.500.19) with default transistor parameters, L = 45 nm and W = 120 nm. The SHE-MTJ electrical model reported in Ref. [60], which is written in the Verilog-A language, is used in our simulation work. Table 3 shows the MTJ parameters set during the simulation with the SHE-MTJ model; the other parameters are retained at their default values as given in Ref. [60]. We have set a supply voltage of Vdda = 1.2 V for MTJ writing, and Vdd = 1 V for MTJ reading operations and for the DPTL-\(\text {C}^\text {2}\text {MOS}\) logic. With Vdda, we ensure that the writing core transistors cause no area overhead while the MTJ write current remains larger than the critical current, so that switching of the MTJ is accomplished with 100% probability.

4.1 Model verification of SHE-assisted STT MTJ model

In this section, we verify the SHE-MTJ model and study its switching behavior. Figure 7a, b shows the waveforms of the SHE-MTJ model for AP to P and P to AP switching, respectively, in three cases: SHE+STT, STT only (SHE current is zero), and SHE only (STT current is zero) [60]. As seen in Fig. 7a, AP to P switching occurs at T1 with the SHE+STT mechanism and at T2 with the STT only mechanism, whereas with SHE only the MTJ does not switch. Similarly, as seen in Fig. 7b, P to AP switching occurs at T3 with SHE+STT and at T4 with STT only; again, with SHE only, switching does not take place. Compared to STT only switching, the SHE+STT mechanism therefore improves the switching time by T2-T1 for AP to P switching and by T4-T3 for P to AP switching. This is due to the elimination of incubation delay with the SHE+STT switching mechanism.
Table 3 SHE-MTJ parameters set during the simulation [60]

| Parameter | Description | Value |
|---|---|---|
| \(t_{sl}\) | Free layer thickness | 0.7 nm |
| \(t_{ox}\) | MgO barrier thickness | 0.85 nm |
| TMR | TMR ratio under zero bias voltage | 200\(\%\) |
| Shape | MTJ surface shape | Circle |
| \(\mathrm {a}\) | MTJ surface length | 32 nm |
| \(\mathrm {b}\) | MTJ surface width | 32 nm |
| \(\mathrm {r}\) | MTJ surface radius | 16 nm |
| \(\mathrm {w}\) | Heavy-metal width | 40 nm |
| \(\mathrm {d}\) | Heavy-metal thickness | 3 nm |
| \(\mathrm {l}\) | Heavy-metal length | 60 nm |
| \(\sigma _{TMR}\) | Standard deviation of TMR | 3% of TMR |
| \(\sigma _{t_{sl}}\) | Standard deviation of \(t_{sl}\) | 3% of \(t_{sl}\) |
| \(\sigma _{t_{ox}}\) | Standard deviation of \(t_{ox}\) | 3% of \(t_{ox}\) |
Figure 8 shows the effect of varying the SHE pulse width from 0 to 355 ps on the SHE+STT switching process. When the SHE pulse width is zero, STT only switching takes place. The switching time is minimum when the SHE pulse width is set to 100 ps. Figure 9 shows the simulated waveform of the writing circuit, which works on the SHE+STT switching mechanism. Initially, bit “1” is stored in the MTJ pair. At time T1, EnW and EnSHE enable the writing of bit “0.” At time T2, bit “1” is written. At time T3, bit “1” is rewritten into the MTJ pair, which is known as a redundant write. Table 4 compares the SHE+STT and STT only writing processes in terms of energy consumption, device count, and worst-case switching delay. The average energy/bit is 813.17 fJ for STT only switching and 822.28 fJ for SHE+STT switching, i.e., STT only is marginally lower, by 9.11 fJ. However, the worst-case delay is 386.82 ps for SHE+STT switching against 900 ps for STT only switching, a reduction of 57.02%. Moreover, owing to the stochasticity and incubation delay associated with STT only switching, 100% switching is not guaranteed in that case. Hence, SHE+STT switching is the better option.
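Table 4 Performance comparison between SHE+STT and STT only writing

| Block | Particulars | SHE+STT | STT only | No. of devices |
|---|---|---|---|---|
| Writing core, energy/bit | Write 0 | 819.7 fJ | 812.4 fJ | 8 MOS + 2 MTJ |
| Writing core, energy/bit | Write 1 | 819.3 fJ | 812.3 fJ | |
| Writing core, energy/bit | Redundant write | 819.3 fJ | 812.2 fJ | |
| Writing core, energy/bit | Average^x | 819.4 fJ | 812.3 fJ | |
| Control circuit, energy/bit | Write 0 | 3.23 fJ | 817.9 aJ | 38 MOS |
| Control circuit, energy/bit | Write 1 | 2.74 fJ | 986.9 aJ | |
| Control circuit, energy/bit | Redundant write | 2.67 fJ | 822.4 aJ | |
| Control circuit, energy/bit | Average^y | 2.88 fJ | 875.7 aJ | |
| Total^{x+y} | | 822.28 fJ | 813.17 fJ | 46 MOS + 2 MTJ |
| Worst-case delay (ps) | | 386.82 | 900 | |

^{x,y} Average energy/bit for the writing core and control circuit, respectively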
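The summary figures of Table 4 follow from its per-operation entries by simple averaging. The sketch below recomputes them as a plausibility check; the variable names are hypothetical, and small differences in the last digit relative to the table are expected because the table rounds the averages before adding them.

```python
# Re-deriving the summary rows of Table 4 from its per-operation entries.
# Energies are in femtojoules (1 aJ = 1e-3 fJ); order: write 0, write 1,
# redundant write. Values are copied from Table 4.

core_fj = {"SHE+STT": [819.7, 819.3, 819.3],
           "STT only": [812.4, 812.3, 812.2]}
ctrl_fj = {"SHE+STT": [3.23, 2.74, 2.67],
           "STT only": [0.8179, 0.9869, 0.8224]}  # 817.9, 986.9, 822.4 aJ
delay_ps = {"SHE+STT": 386.82, "STT only": 900.0}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Total average energy per bit = writing-core average + control-circuit average.
total_fj = {k: mean(core_fj[k]) + mean(ctrl_fj[k]) for k in core_fj}

extra_energy_fj = total_fj["SHE+STT"] - total_fj["STT only"]
delay_reduction = 100.0 * (delay_ps["STT only"] - delay_ps["SHE+STT"]) / delay_ps["STT only"]

for scheme, energy in total_fj.items():
    print(f"{scheme:9s} average energy/bit = {energy:.2f} fJ")
print(f"extra energy of SHE+STT = {extra_energy_fj:.2f} fJ")
print(f"worst-case delay reduction = {delay_reduction:.2f} %")
```

The check makes the trade-off explicit: SHE+STT costs on the order of 1% more energy per bit while cutting the worst-case delay by more than half.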

4.2 Performance analysis of hybrid MTJ/CMOS logic gates

Table 5 Comparison between various logic gates with different read circuits in terms of power dissipation and device count at an operating frequency of 500 MHz

| Gate type | Design | Read type | Static power (nW) | Dynamic power (nW) | Total power (nW) | Worst-case delay (ps) | PDP (aJ) | Device count |
|---|---|---|---|---|---|---|---|---|
| NOR/OR | HG1 | PCSA1 | 0^a (321.15^b) | 234.8 | 555.95^d | 91.55 | 50.89 | 12 MOS + 2 MTJ |
| NOR/OR | HG2 | PCSA2 | 0^a (312.35^b) | 113.4 | 425.75^d | 89.3 | 38.01 | 11 MOS + 2 MTJ |
| NOR/OR | DPTL-\(\text {C}^\text {2}\text {MOS}\) | DPTL-\(\text {C}^\text {2}\text {MOS}\) | 231.8^c | 334.7 | 566.5^d | 72.96 | 41.33 | 19 MOS |
| NAND/AND | HG1 | PCSA1 | 0^a (274.6^b) | 219.5 | 494.1^d | 70.35 | 34.75 | 12 MOS + 2 MTJ |
| NAND/AND | HG2 | PCSA2 | 0^a (285.35^b) | 98.53 | 383.88^d | 67.52 | 25.91 | 11 MOS + 2 MTJ |
| NAND/AND | DPTL-\(\text {C}^\text {2}\text {MOS}\) | DPTL-\(\text {C}^\text {2}\text {MOS}\) | 240.2^c | 291.8 | 532^d | 74.03 | 39.38 | 19 MOS |
| XNOR/XOR | HG1 | PCSA1 | 0^a (324.4^b) | 303.2 | 627.6^d | 75.48 | 47.37 | 13 MOS + 2 MTJ |
| XNOR/XOR | HG2 | PCSA2 | 0^a (311.1^b) | 141.8 | 452.9^d | 74.74 | 33.84 | 12 MOS + 2 MTJ |
| XNOR/XOR | DPTL-\(\text {C}^\text {2}\text {MOS}\) | DPTL-\(\text {C}^\text {2}\text {MOS}\) | 264.6^c | 373.5 | 638.1^d | 74.04 | 47.24 | 22 MOS |

The write circuit has been excluded from the tabulated values. Static, dynamic, and total power dissipation are the average values for all the input combinations.
^a HG1 and HG2 are nonvolatile in nature, and thus the supply can be turned off in standby mode
^{b,c} Static power dissipation in steady-state condition
^c Due to the volatile nature of DPTL-\(\text {C}^{2}\text {MOS}\) logic, the power is not allowed to switch off
^d Total power dissipation = static power + dynamic power in active mode
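The total power and PDP columns of Table 5 can be cross-checked from the other columns. The sketch below assumes, as footnote d and the tabulated numbers suggest, that the total uses the steady-state static value given in parentheses for the hybrid gates, and that PDP is the product of total power and worst-case delay. The row layout and function name are illustrative.

```python
# Consistency check on Table 5 (values copied from the table).
# Row layout: (gate, design, static_nW, dynamic_nW, delay_ps, tabulated_pdp_aJ).
# For HG1/HG2 the static entry is the steady-state value in parentheses.
ROWS = [
    ("NOR/OR",   "HG1",        321.15, 234.8, 91.55, 50.89),
    ("NOR/OR",   "HG2",        312.35, 113.4, 89.30, 38.01),
    ("NOR/OR",   "DPTL-C2MOS", 231.8,  334.7, 72.96, 41.33),
    ("NAND/AND", "HG1",        274.6,  219.5, 70.35, 34.75),
    ("NAND/AND", "HG2",        285.35, 98.53, 67.52, 25.91),
    ("NAND/AND", "DPTL-C2MOS", 240.2,  291.8, 74.03, 39.38),
    ("XNOR/XOR", "HG1",        324.4,  303.2, 75.48, 47.37),
    ("XNOR/XOR", "HG2",        311.1,  141.8, 74.74, 33.84),
    ("XNOR/XOR", "DPTL-C2MOS", 264.6,  373.5, 74.04, 47.24),
]

def pdp_aj(power_nw, delay_ps):
    """Power-delay product in attojoules: 1 nW x 1 ps = 1e-21 J = 1e-3 aJ."""
    return power_nw * delay_ps * 1e-3

for gate, design, static, dynamic, delay, tabulated in ROWS:
    total = static + dynamic
    print(f"{gate:9s} {design:11s} total = {total:7.2f} nW  "
          f"PDP = {pdp_aj(total, delay):6.2f} aJ  (table: {tabulated})")
```

Under these assumptions the recomputed PDP values agree with the tabulated ones to within about 0.01 aJ, which is consistent with rounding.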
We refer to the hybrid NOR/OR, NAND/AND, and XNOR/XOR logic gates with PCSA1 as HG1 and to those with PCSA2 as HG2, whereas their CMOS counterparts are developed using DPTL-\(\text {C}^{2}\text {MOS}\) logic and are referred to as DPTL-\(\text {C}^\text {2}\text {MOS}\) logic gates. The comparison between the hybrid and DPTL-\(\text {C}^{2}\text {MOS}\)-based NOR/OR, NAND/AND, and XNOR/XOR gates in terms of power, delay, PDP, and device count is presented in Table 5. The static power dissipation of the HG1 and HG2 gates is considered to be zero in comparison with their CMOS counterparts because, in the hybrid logic gates, input B is stored in the nonvolatile MTJ pair, so the power supply can be turned off in the standby mode without losing the information. In the active mode, the stored information is readily available for computation without the need for a write/restoration process. However, a steady-state power dissipation is observed for the hybrid logic gates, which is also shown in Table 5.
The dynamic power dissipation of HG2 is the lowest among HG1, HG2, and the DPTL-\(\text {C}^{2}\text {MOS}\) logic gates. For the NOR/OR, NAND/AND, and XNOR/XOR logic gates, the dynamic power dissipation of HG2 is lower than that of HG1 by 51.7%, 55.11%, and 53.23%, respectively, and lower than that of the DPTL-\(\text {C}^ {2}\text {MOS}\) logic gates by 66.11%, 66.23%, and 62.03%, respectively. Similarly, the total power dissipation of the NOR/OR, NAND/AND, and XNOR/XOR logic gates in HG2 is lower than in HG1 by 23.42%, 22.31%, and 27.84%, and lower than in the DPTL-\(\text {C}^ {2}\text {MOS}\) logic gates by 24.85%, 27.84%, and 29.02%, respectively. Figure 10a, b shows bar charts comparing the dynamic and total power dissipation, respectively, of the HG1, HG2, and DPTL-\(\text {C}^ {2}\text {MOS}\)-based NOR/OR, NAND/AND, and XNOR/XOR logic gates.
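The percentage savings quoted above follow directly from the Table 5 entries as (reference − HG2)/reference. A short sketch of this arithmetic, with hypothetical variable names and the shorthand key "DPTL" for the DPTL-\(\text {C}^ {2}\text {MOS}\) gates:

```python
# Percentage savings of HG2 relative to HG1 and DPTL-C2MOS, from Table 5.
# All powers are in nW; "DPTL" is shorthand for the DPTL-C2MOS gates.

DYNAMIC_NW = {
    "NOR/OR":   {"HG1": 234.8, "HG2": 113.4, "DPTL": 334.7},
    "NAND/AND": {"HG1": 219.5, "HG2": 98.53, "DPTL": 291.8},
    "XNOR/XOR": {"HG1": 303.2, "HG2": 141.8, "DPTL": 373.5},
}
TOTAL_NW = {
    "NOR/OR":   {"HG1": 555.95, "HG2": 425.75, "DPTL": 566.5},
    "NAND/AND": {"HG1": 494.1,  "HG2": 383.88, "DPTL": 532.0},
    "XNOR/XOR": {"HG1": 627.6,  "HG2": 452.9,  "DPTL": 638.1},
}

def saving_percent(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference

for label, table in (("dynamic", DYNAMIC_NW), ("total", TOTAL_NW)):
    for gate, power in table.items():
        vs_hg1 = saving_percent(power["HG1"], power["HG2"])
        vs_dptl = saving_percent(power["DPTL"], power["HG2"])
        print(f"{label:7s} {gate:9s} vs HG1: {vs_hg1:5.2f} %  vs DPTL: {vs_dptl:5.2f} %")
```

Defining the saving relative to the reference design (rather than relative to HG2) is what reproduces the quoted figures.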
The delay depends upon the quality of the sense amplifier used in the hybrid logic gates. In HG2 with PCSA2, both the output and its complement are at Vdd-Vth during the pre-charge phase, whereas in HG1 with PCSA1 they are at Vdd. Hence, during the evaluation phase, HG2 discharges one of its outputs to gnd more quickly than its HG1 counterpart, producing a faster output response. However, the DPTL-\(\text {C}^{2}\text {MOS}\) logic gates have a lower worst-case delay than the hybrid gates for NOR/OR and XNOR/XOR, while for NAND/AND both hybrid gates are faster (Table 5). This is because DPTL-\(\text {C}^ {2}\text {MOS}\) logic operates based on clocked CMOS pass-transistor logic, whereas the hybrid gates use a current-controlled sense amplifier. In the evaluation phase, the pull-down transistor N3 of the hybrid gates needs to provide a discharge path for the current (either I1 or I2), so the output delay depends upon the size of this transistor. Increasing the width of the N3 transistor reduces the delay of the hybrid gates. However, doing so increases the power dissipation in HG2, as shown in Table 6 and Fig. 11. Hence, this delay versus power dissipation trade-off needs to be addressed for the hybrid gates.
Table 6 Variation in total power dissipation and delay w.r.t. the width of pull-down transistor N3 for HG2

| Gate type | Quantity | N3 width 120 nm | 240 nm | 480 nm | 720 nm | 960 nm | 1200 nm |
|---|---|---|---|---|---|---|---|
| NOR/OR | Power (nW) | 425.75 | 436.85 | 457.2 | 475.65 | 492.6 | 508.45 |
| NOR/OR | Delay (ps) | 89.3 | 83.37 | 80.05 | 78.64 | 77.97 | 77.58 |
| NAND/AND | Power (nW) | 383.88 | 397.85 | 422.8 | 444.55 | 464.1 | 482 |
| NAND/AND | Delay (ps) | 67.52 | 65.03 | 64.2 | 64 | 63.95 | 63.89 |
| XNOR/XOR | Power (nW) | 452.9 | 463.3 | 483 | 501.1 | 517.9 | 533.7 |
| XNOR/XOR | Delay (ps) | 74.74 | 67.42 | 63.23 | 61.68 | 60.98 | 60.62 |
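One possible way to weigh the delay versus power trade-off of Table 6 is to compute the PDP at each N3 width and look for its minimum. This is a sketch of that idea only: the paper does not state that PDP was used to choose the width, and the function and variable names are hypothetical.

```python
# Table 6 data for HG2: total power (nW) and delay (ps) per N3 width (nm).
WIDTHS_NM = [120, 240, 480, 720, 960, 1200]
POWER_NW = {
    "NOR/OR":   [425.75, 436.85, 457.2, 475.65, 492.6, 508.45],
    "NAND/AND": [383.88, 397.85, 422.8, 444.55, 464.1, 482.0],
    "XNOR/XOR": [452.9, 463.3, 483.0, 501.1, 517.9, 533.7],
}
DELAY_PS = {
    "NOR/OR":   [89.3, 83.37, 80.05, 78.64, 77.97, 77.58],
    "NAND/AND": [67.52, 65.03, 64.2, 64.0, 63.95, 63.89],
    "XNOR/XOR": [74.74, 67.42, 63.23, 61.68, 60.98, 60.62],
}

def pdp_by_width(gate):
    """Map each N3 width to the PDP in aJ (1 nW x 1 ps = 1e-3 aJ)."""
    return {w: p * d * 1e-3
            for w, p, d in zip(WIDTHS_NM, POWER_NW[gate], DELAY_PS[gate])}

best_width = {}
for gate in POWER_NW:
    pdp = pdp_by_width(gate)
    best_width[gate] = min(pdp, key=pdp.get)   # width with the lowest PDP
    print(gate, {w: round(v, 2) for w, v in pdp.items()})

print("PDP-minimizing N3 width (nm):", best_width)
```

With the tabulated values, the PDP is lowest at 240 nm for the NOR/OR and NAND/AND gates and at 480 nm for the XNOR/XOR gate, which illustrates that widening N3 beyond a moderate size buys little delay for the added power. A different figure of merit would in general select a different width.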
In terms of device count, the HG2 NOR/OR and NAND/AND gates need only 11 MOS transistors, whereas HG1 and DPTL-\(\text {C}^\text {2}\text {MOS}\) require 12 and 19 MOS transistors, respectively. Similarly, for the XNOR/XOR gate, HG1 and DPTL-\(\text {C}^\text {2}\text {MOS}\) require 13 and 22 MOS transistors, whereas 12 MOS transistors are sufficient for HG2. Hence, in terms of the number of transistors, HG2 logic gates are superior to HG1 and DPTL-\(\text {C}^\text {2}\text {MOS}\) logic gates.
Process and mismatch variations during nano-scale fabrication of VLSI circuits affect their performance. To study this effect at the design stage, we have performed an MC simulation of 200 runs for each design. During the MC simulation, we have incorporated not only the CMOS variations but also 3% variations in the TMR, \({t_{sl}}\), and \({t_{ox}}\) of the MTJs (see Table 3), which follow a Gaussian distribution. Table 7 compares the total power dissipation of all the logic gates. It suggests that the HG2 logic gate design dissipates the least power compared to the rest of the designs.
Table 7
Total power dissipation comparison between HG1, HG2 and DPTL-\(\text {C}^{2}\text {MOS}\) logic gates for MC simulation

| Design type | Gate type | Min (nW) | Max (nW) | Mean (nW) | Median (nW) | SD (nW) |
|---|---|---|---|---|---|---|
| HG1 | NOR/OR | 515.25 | 596.55 | 554.35 | 554.65 | 13.63 |
| HG1 | NAND/AND | 456.6 | 529 | 493.05 | 493.8 | 12.13 |
| HG1 | XNOR/XOR | 580.8 | 670.3 | 625.75 | 626.7 | 16.45 |
| HG2 | NOR/OR | 399.35 | 458.6 | 426.9 | 427.35 | 9.72 |
| HG2 | NAND/AND | 361 | 407.7 | 384.55 | 384.8 | 7.52 |
| HG2 | XNOR/XOR | 422.3 | 496.9 | 455.3 | 454.8 | 13.33 |
| DPTL-\(\text {C}^{2}\text {MOS}\) | NOR/OR | 456.8 | 592.1 | 559.6 | 563.5 | 24.73 |
| DPTL-\(\text {C}^{2}\text {MOS}\) | NAND/AND | 501.7 | 562.9 | 532 | 532.7 | 10.6 |
| DPTL-\(\text {C}^{2}\text {MOS}\) | XNOR/XOR | 598.9 | 768.1 | 640.3 | 635.2 | 27.42 |
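The MC statistics in Table 7 can be condensed into two derived figures: the mean power saving of HG2 relative to each alternative, and the coefficient of variation (SD divided by mean) as a rough measure of sensitivity to process variation. The sketch below is illustrative and not part of the original analysis; the names `TABLE7`, `hg2_saving_percent` and `cv_percent` are assumptions.

```python
# (mean_nW, sd_nW) pairs copied from Table 7; layout is illustrative.
TABLE7 = {
    "HG1": {
        "NOR/OR": (554.35, 13.63),
        "NAND/AND": (493.05, 12.13),
        "XNOR/XOR": (625.75, 16.45),
    },
    "HG2": {
        "NOR/OR": (426.9, 9.72),
        "NAND/AND": (384.55, 7.52),
        "XNOR/XOR": (455.3, 13.33),
    },
    "DPTL-C2MOS": {
        "NOR/OR": (559.6, 24.73),
        "NAND/AND": (532.0, 10.6),
        "XNOR/XOR": (640.3, 27.42),
    },
}


def hg2_saving_percent(baseline_design, gate):
    """Mean power saved by HG2 relative to `baseline_design`, in percent."""
    base_mean = TABLE7[baseline_design][gate][0]
    hg2_mean = TABLE7["HG2"][gate][0]
    return 100.0 * (base_mean - hg2_mean) / base_mean


def cv_percent(design, gate):
    """Coefficient of variation (SD / mean) of the MC power, in percent."""
    mean, sd = TABLE7[design][gate]
    return 100.0 * sd / mean


if __name__ == "__main__":
    for gate in TABLE7["HG2"]:
        for design in ("HG1", "DPTL-C2MOS"):
            print(f"{gate:9s} HG2 saves {hg2_saving_percent(design, gate):5.2f}% vs {design}")
        for design in TABLE7:
            print(f"{gate:9s} {design:10s} CV = {cv_percent(design, gate):4.2f}%")
```

Using the tabulated means, the HG2 saving works out to roughly 22 to 29% across the three gate types, depending on the gate and the baseline design.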

5 Conclusion

The major contribution of this work is an in-depth analysis of basic hybrid circuits developed using CMOS logic and the p-MTJ device. In this paper, we have designed and presented a detailed analysis of the NOR/OR, NAND/AND, and XNOR/XOR logic gates based on the hybrid SHE+STT-MTJ/CMOS LIM structure. The simulation results substantiate that the hybrid logic gates are not only nonvolatile but also superior to their CMOS counterparts in terms of power dissipation and the number of transistors used in the design. Because the hybrid logic gates are nonvolatile, they can be completely turned off in standby mode to save a significant amount of power, without the backup and restore process required in conventional CMOS technology. Hence, these circuits can be considered for future low-power applications.

6 Appendix: DPTL-\(\text {C}^\text {2}\text {MOS}\)-based logic gates

See Fig. 12.

Acknowledgements

Prashanth Barla would like to acknowledge the Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, for providing the TMA Pai scholarship for his research work. Part of the work presented in the Introduction and Background sections has already been reported in refs. [1, 2, 9], which can be consulted for a detailed understanding.

Declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

Not applicable
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Literature
11.
39.
Chu, Y.-H., Martin, L.W., Holcomb, M.B., Gajek, M., Han, S.-J., He, Q., Balke, N., Yang, C.-H., Lee, D., Hu, W., Zhan, Q., Yang, P.-L., Fraile-Rodríguez, A., Scholl, A., Wang, S.X., Ramesh, R.: Electric-field control of local ferromagnetism using a magnetoelectric multiferroic. Nat. Mater. 7(6), 478–482 (2008). https://doi.org/10.1038/nmat2184
41.
Song, Y.J., Lee, J.H., Shin, H.C., Lee, K.H., Suh, K., Kang, J.R., Pyo, S.S., Jung, H.T., Hwang, S.H., Koh, G.H., Oh, S.C., Park, S.O., Kim, J.K., Park, J.C., Kim, J., Hwang, K.H., Jeong, G.T., Lee, K.P., Jung, E.S.: Highly functional and reliable 8Mb STT-MRAM embedded in 28nm logic. In: 2016 IEEE International Electron Devices Meeting (IEDM), pp. 27.2.1–27.2.4 (2016). https://doi.org/10.1109/IEDM.2016.7838491
42.
Chung, S.-W., Kishi, T., Park, J.W., Yoshikawa, M., Park, K.S., Nagase, T., Sunouchi, K., Kanaya, H., Kim, G.C., Noma, K., Lee, M.S., Yamamoto, A., Rho, K.M., Tsuchida, K., Chung, S.J., Yi, J.Y., Kim, H.S., Chun, Y.S., Oyamatsu, H., Hong, S.J.: 4Gbit density STT-MRAM using perpendicular MTJ realized with compact cell structure. In: 2016 IEEE International Electron Devices Meeting (IEDM), pp. 27.1.1–27.1.4 (2016). https://doi.org/10.1109/IEDM.2016.7838490
43.
Lu, Y., Zhong, T., Hsu, W., Kim, S., Lu, X., Kan, J.J., Park, C., Chen, W.C., Li, X., Zhu, X., Wang, P., Gottwald, M., Fatehi, J., Seward, L., Kim, J.P., Yu, N., Jan, G., Haq, J., Le, S., Wang, Y.J., Thomas, L., Zhu, J., Liu, H., Lee, Y.J., Tong, R.Y., Pi, K., Shen, D., He, R., Teng, Z., Lam, V., Annapragada, R., Torng, T., Wang, P.-K., Kang, S.H.: Fully functional perpendicular STT-MRAM macro embedded in 40 nm logic for energy-efficient IOT applications. In: 2015 IEEE International Electron Devices Meeting (IEDM), pp. 26.1.1–26.1.4 (2015). https://doi.org/10.1109/IEDM.2015.7409770
45.
Yoda, H., Fujita, S., Shimomura, N., Kitagawa, E., Abe, K., Nomura, K., Noguchi, H., Ito, J.: Progress of STT-MRAM technology and the effect on normally-off computing systems. In: 2012 International Electron Devices Meeting, pp. 11.3.1–11.3.4 (2012). https://doi.org/10.1109/IEDM.2012.6479023
51.
Kishi, T., Yoda, H., Kai, T., Nagase, T., Kitagawa, E., Yoshikawa, M., Nishiyama, K., Daibou, T., Nagamine, M., Amano, M., Takahashi, S., Nakayama, M., Shimomura, N., Aikawa, H., Ikegawa, S., Yuasa, S., Yakushiji, K., Kubota, H., Fukushima, A., Oogane, M., Miyazaki, T., Ando, K.: Lower-current and fast switching of a perpendicular TMR for high speed and high density spin-transfer-torque MRAM. https://doi.org/10.1109/IEDM.2008.4796680
52.
Yoshikawa, M., Kitagawa, E., Nagase, T., Daibou, T., Nagamine, M., Nishiyama, K., Kishi, T., Yoda, H.: Tunnel magnetoresistance over 100% in MgO-based magnetic tunnel junction films with perpendicular magnetic L1\(_{0}\)-FePt electrodes. IEEE Trans. Magn. 44(11), 2573–2576 (2008). https://doi.org/10.1109/TMAG.2008.2003059
55.
Mihai Miron, I., Gaudin, G., Auffret, S., Rodmacq, B., Schuhl, A., Pizzini, S., Vogel, J., Gambardella, P.: Current-driven spin torque induced by the Rashba effect in a ferromagnetic metal layer. Nat. Mater. 9(3), 230–234 (2010)
56.
Miron, I.M., Garello, K., Gaudin, G., Zermatten, P.-J., Costache, M.V., Auffret, S., Bandiera, S., Rodmacq, B., Schuhl, A., Gambardella, P.: Perpendicular switching of a single ferromagnetic layer induced by in-plane current injection. Nature 476, 189–193 (2011). https://doi.org/10.1038/nature10309
58.
Bychkov, Y.A., Rashba, É.I.: Properties of a 2d electron gas with lifted spectral degeneracy. JETP Lett. 39(2), 78 (1984)
72.
Bandyopadhyay, S., Cahay, M.: Introduction to Spintronics. CRC Press, Boca Raton (2015)
Metadata
Title
Design and analysis of SHE-assisted STT MTJ/CMOS logic gates
Authors
Prashanth Barla
Vinod Kumar Joshi
Somashekara Bhat
Publication date
19-08-2021
Publisher
Springer US
Published in
Journal of Computational Electronics / Issue 5/2021
Print ISSN: 1569-8025
Electronic ISSN: 1572-8137
DOI
https://doi.org/10.1007/s10825-021-01759-8
