top of page
Physics Tomorrow Letters icon.jpg
ORCID_iD.svg.png

Locked

applsciletetrsa>vol-01>issue-03>STeP-CiM: Strain-enabled Ternary Precision Computation-in-Memory based on Non-Volatile 2D Piezoelectric Transistors

ORCID_iD.svg.png
Copy of SCOPUS INDEXED (1).png
GIF featured item.gif
Editorial Choice.PNG

Research

STeP-CiM: Strain-enabled Ternary Precision Computation-in-Memory based on Non-Volatile 2D Piezoelectric Transistors

Niharika Thakuria
Reena Elangovan1
Applied Science Letters

Volume- 01, Issue 04

DOI- 0.1490/665877402.256applsci

We propose 2D Piezoelectric FET (PeFET) based compute-enabled non-volatile memory for ternary deep neural networks (DNNs). PeFETs hinge on ferroelectricity for bit storage and piezoelectricity for bit sensing, exhibiting inherently amenable features for computation-in-memory of dot products of weights and inputs in the signed ternary regime. PeFETs consist of a material with ferroelectric and piezoelectric properties coupled with Transition Metal Dichalcogenide channel. We utilize (a) ferroelectricity to store binary bits (0/1) in the form of polarization (-P/+P) and (b) polarization dependent piezoelectricity to read the stored state by means of strain-induced bandgap change in Transition Metal Dichalcogenide channel. The unique read mechanism of PeFETs enables us to expand the traditional association of +P (-P) with low (high) resistance states to their dual high (low) resistance depending on read voltage. Specifically, we demonstrate that +P (-P) stored in PeFETs can be dynamically configured in (a) a low (high) resistance state for positive read voltages and (b) their dual high (low) resistance states for negative read voltages, without afflicting a read disturb. Such a feature, which we name as Polarization Preserved Piezoelectric Effect Reversal with Dual Voltage Polarity (PiER), is unique to PeFETs and has not been shown in hitherto explored memories. We leverage PiER to propose a Strain-enabled Ternary Precision Computation-in-Memory (STeP-CiM) cell with capabilities of computing the scalar product of the stored weight and input, both of which are represented with signed ternary precision. Further, using multi word-line assertion of STeP-CiM cells, we achieve massively parallel computation of dot products of signed ternary inputs and weights. Our array level analysis shows 91% lower delay and improvements of 15% and 91% in energy for in-memory multiply-and-accumulate operations compared to near-memory design approaches based on 2D FET based SRAM and PeFET respectively. We also analyze the system-level implications of STeP-CiM by deploying it in a ternary DNN accelerator. STeP-CiM exhibits 6.11× - 8.91× average improvement in performance and 3.2× average improvement in energy over SRAM based near-memory design. We also compare STeP-CiM to near-memory design based on PeFETs showing 5.67× - 6.13× average performance improvement and 6.07× average energy savings.

Introduction


Deep Neural Networks (DNNs) have transformed the field of machine learning and are deployed in many real-world products and services (Lecun et al., 2015). However, enormous storage and compute 2 demands limits their application in energy-constrained edge-devices (Venkataramani et al., 2016). Precision reduction in DNNs has emerged as a popular approach for energy-efficient realization of hardware accelerators for these applications (Courbariauxécole and Bengio, 2015; Mishra et al., 2017; Choi et al., 2018; Colangelo et al., 2018; Wang et al., 2018). State-of-the-art DNN hardware for inference employs 8-bit precision, and recent algorithmic efforts have shown the pathway for aggressive scaling up to binary precision (Choi et al., 2018; Colangelo et al., 2018) . However, accuracy suffers significantly at binary precision. Interestingly, ternary precision networks offer a near-optimal design point in the low precision regime with significant accuracy boost compared to binary DNNs (Li et al., 2016; Zhu et al., 2016) and large energy savings with mild accuracy loss compared to higher precision DNNs (Mishra et al., 2017; Wang et al., 2018). Due to these features, ternary precision networks have garnered interest for their hardware realizations (Jain et al., 2020; Thirumala et al., 2020). Ternary DNNs can be implemented using classical accelerator architectures (e.g., Tensor Processing Unit and Graphical Processing Unit) by employing specialized processing elements and on-chip scratchpads to improve energy efficiency, but are nevertheless limited by memory bottleneck. In this regard, computing-in-memory (CiM) brings a new opportunity that can greatly enhance efficiency of DNN accelerators by reducing power-hungry data transfer between memory and processors.

Unlock Only

Read-only this publication

This option will drive you towards only the selected publication. If you want to save money then choose the full access plan from the right side.

Unlock all

Get access to entire database

This option will unlock the entire database of us to you without any limitations for a specific time period.
This offer is limited to 100000 clients if you make delay further, the offer slots will be booked soon. Afterwards, the prices will be 50% hiked.