

# **Complementary Resistive Switch based Neuromorphic Associative Capacitive Network**

L. Nielen<sup>1,2</sup>, A. Siemon<sup>1,2</sup>, S. Tappertzhofen<sup>1,2,3</sup>, R. Waser<sup>1,2,4</sup>, S. Menzel<sup>2,4</sup>, E. Linn<sup>1,2</sup>

<sup>1</sup> Institut für Werkstoffe der Elektrotechnik II, RWTH Aachen University, Germany

<sup>2</sup> JARA – Fundamentals of Future Information Technology

<sup>3</sup> Department of Engineering, University of Cambridge, United Kingdom

<sup>4</sup> Peter Grünberg Institut 7, Forschungszentrum Jülich GmbH, Jülich, Germany





21.01.2015

#### Outline

- Associative Capacitive Networks: a neuronal application for non-volatile memories
- Complementary Resistive Switches (CRS): a solution to the sneak-path problem in passive ReRAM arrays
- Capacitive read-out of CRS cells
- CRS-based Associative Capacitive Network (ACN)
- Experimental ACN circuitry
- Simulative evaluation
- Summary

## **Associative Capacitive Network - Motivation**

Artificial neuron model



- Information stored via capacitance value
- Simultaneous activation function on all lines
- Summed output

$\rightarrow$ Content Addressable Memory (CAM)

Problem with conventional ACNs: the capacitor charge is volatile

CRS-based solution: non-volatile memory

Application areas:

- Pattern recognition
- Fast routing (reprogrammable lookup-tables)

[Figure: ACN schematic with inputs $x_1$, $\overline{x_1}$, ..., $x_M$, synaptic currents, match-line voltage $V_{ML,i}$ and threshold voltage $V_{TH}$]

O. Kavehei, E. Linn et al., Nanoscale, 5, 5119 (2013)

- Rewritable memory with lookup-table functionality
- Pattern matching in a single cycle



- Conventional CAMs are implemented with SRAM core cells
- Drawbacks:
  - static currents $\rightarrow$ high energy consumption
  - large area demand

Classical SRAM based CAM cell




10-T NOR-type CAM



9-T NAND-type CAM

# **Complementary Resistive Switches (CRS)**



# **Capacitive Read-out**



#### Non-destructive read-out (NDRO)

CRS cells offer a capacitive voltage divider property


- Both elements A and B are equal in terms of resistive switching
- But: capacitances differ (e.g. different areas)
- Capacitive read-out of the stored state



#### **CRS-based Associative Capacitive Network**



O. Kavehei, E. Linn et al., *Nanoscale*, 5, 5119 (2013)


## **ACN – 2-bit Example**

Stored pattern: 0 1, negated pattern: 1 0

- HD=2: maximum output ($x_1$ = '1', $x_2$ = '0')
- HD=1: equal output for $x_1$ = '0', $x_2$ = '0' and $x_1$ = '1', $x_2$ = '1'
- HD=0: minimum output ($x_1$ = '0', $x_2$ = '1')





Output voltage reflects Hamming Distance (HD) → Pattern matching

O. Kavehei, E. Linn et al., Nanoscale, 5, 5119 (2013)
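
The 2-bit example can be reproduced with a minimal sketch. It assumes an idealized charge-sharing model that is not taken from the slides: for every bit position exactly one of the two cells (the template cell driven by $x$ or the negated-template cell driven by $\overline{x}$) receives the high input level, and that cell contributes a hypothetical capacitance `c_match` when input and stored bit agree and a larger `c_mismatch` when they differ. The function names, the capacitance values (arbitrary units) and the optional load `c_ml` are illustrative assumptions.

```python
from itertools import product


def hamming_distance(x, t):
    """Number of positions in which input x and stored template t differ."""
    return sum(xi != ti for xi, ti in zip(x, t))


def match_line_voltage(x, t, v_in=1.0, c_match=1.0, c_mismatch=1.5, c_ml=0.0):
    """Floating match-line voltage in an idealized capacitive-divider model.

    Assumption (not from the source): the driven cell of each bit pair has
    capacitance c_match for agreeing bits and c_mismatch for differing bits;
    the undriven cell of the pair is held at 0 V and has the other value.
    """
    m = len(t)
    hd = hamming_distance(x, t)
    driven = (m - hd) * c_match + hd * c_mismatch   # capacitance tied to v_in
    total = m * (c_match + c_mismatch) + c_ml       # all capacitance on the line
    return v_in * driven / total


stored = (0, 1)
for x in product((0, 1), repeat=2):
    hd = hamming_distance(x, stored)
    print(x, "HD =", hd, "V_ML =", round(match_line_voltage(x, stored), 3))
```

In this model the output rises monotonically with the Hamming distance, and the two HD=1 inputs give the same voltage, which is the qualitative behavior listed for the 2-bit example.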

# **Fabrication of ACN Cells**



#### **Associative Capacitive Network**

- N×2M array of CRS devices
- green: stored template (T)
- yellow: stored negation of $T$ ($\overline{T}$)
- input: vector $X$ and its negation $\overline{X}$





#### Fabrication of a µ-structure array

- Silicon wafer
- Platinum, Titanium and TiO<sub>2</sub> sputter deposition
- UV-lithography

Cell A: 24×10 µm², Cell B: 20×20 µm²

- Die mounting in a 28 pin carrier
- Contacting via wedge-wedge bonding with gold wires
- Evaluation with an experimental circuitry
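
As a rough check of the fabricated geometry, the capacitance ratio of the two cells can be estimated from their areas alone. The sketch below assumes ideal parallel-plate capacitors with the same oxide thickness and permittivity in both cells (so both cancel in the ratio) and ignores any resistive current through the elements; the function names are illustrative.

```python
def plate_area_um2(width_um, length_um):
    """Area of a rectangular cell in square micrometres."""
    return width_um * length_um


def series_divider(c_a, c_b, v_total=1.0):
    """Voltages across two ideal capacitors in series.

    The smaller capacitor takes the larger share of the applied voltage.
    """
    v_a = v_total * c_b / (c_a + c_b)
    v_b = v_total * c_a / (c_a + c_b)
    return v_a, v_b


area_a = plate_area_um2(24, 10)   # Cell A: 24 x 10 um^2
area_b = plate_area_um2(20, 20)   # Cell B: 20 x 20 um^2

# With equal thickness and permittivity, C_A / C_B equals the area ratio.
ratio = area_a / area_b
v_a, v_b = series_divider(ratio, 1.0)

print("C_A / C_B =", ratio)
print("V_A / V =", v_a, " V_B / V =", v_b)
```

Under these assumptions the two cells differ in capacitance by a factor of 0.6, which is the asymmetry that the capacitive read-out relies on.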

### **Experimental Setup for ACN Read-Out**



L. Nielen; S. Tappertzhofen et al., IEEE SNW, p. 61 (2014)



 $\rightarrow$  HD detection via switching event

#### **Experimental Results**




#### **Memristive ECM model**



$$I = I_{ion} (V, w) + I_{Tu} (V, w)$$
$$\dot{w} = C_1 \cdot I_{ion}$$

S. Menzel, U. Böttger et al., *J. Appl. Phys.*, 111, 014501 (2012)

E. Linn, S. Menzel et al., *Nanotechnology*, 24, 384008 (2013)
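
The two model equations describe a state-variable system: the total current is the sum of an ionic and a tunneling contribution, and the state $w$ changes at a rate proportional to the ionic current. A minimal sketch of an explicit Euler integration is given below. The expressions for $I_{ion}$ and $I_{Tu}$ are not given on the slide, so the code takes them as user-supplied functions; the toy functions in the usage lines are placeholders chosen only to make the example run and are not the ECM model.

```python
import math


def integrate_state(v_of_t, i_ion, i_tu, c1, w0, dt, steps):
    """Explicit Euler integration of dw/dt = c1 * I_ion(V, w).

    Returns a list of (t, V, w, I_total) samples and the final state w.
    i_ion and i_tu are callables f(V, w) supplied by the caller.
    """
    w = w0
    trace = []
    for k in range(steps):
        t = k * dt
        v = v_of_t(t)
        ion = i_ion(v, w)
        total = ion + i_tu(v, w)      # I = I_ion + I_Tu
        trace.append((t, v, w, total))
        w += c1 * ion * dt            # state update
    return trace, w


# Placeholder current laws (hypothetical, for illustration only).
toy_i_ion = lambda v, w: 1e-6 * v
toy_i_tu = lambda v, w: 1e-6 * v * math.exp(-w)

trace, w_end = integrate_state(
    v_of_t=lambda t: 1.0,             # constant 1 V stimulus
    i_ion=toy_i_ion,
    i_tu=toy_i_tu,
    c1=1e6,
    w0=0.0,
    dt=0.01,
    steps=100,
)
print("final state w =", w_end)
```

Passing the current laws as functions keeps the integrator independent of the device physics, so the published expressions can be substituted without changing the loop.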

### VerilogA core cell



L. Nielen, A. Siemon et al., accepted for publication in *IEEE JETCAS* (2015)

# **ACN Array Simulations**



 Memristive ECM model implementation with VerilogA (SPICE)


 Array simulations performed by Cadence Spectre

#### **Features**

- Complete write and search operation feasible
- Implementation of coupling capacitances
- Different device capacitance ratios

## **Size Effects & Power Consumption**

#### Minimum voltage margin $\Delta V$

- Increasing array size (i.e. pattern length) $\rightarrow$ the voltage interval corresponding to each HD decreases
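
The shrinking margin can be illustrated with a minimal sketch. It assumes an idealized floating match line, not taken from the slides, on which each of the `m_bits` stored bits contributes one cell of capacitance `c_lo` and one of capacitance `c_hi`, and a one-bit change in Hamming distance swaps which of the two is driven. All names and values (arbitrary units, ratio 1.5) are illustrative assumptions, and parasitic line capacitances are lumped into `c_ml`.

```python
def voltage_margin(m_bits, v_in=1.0, c_lo=1.0, c_hi=1.5, c_ml=0.0):
    """Match-line voltage step between adjacent Hamming distances.

    Idealized capacitive divider: the numerator changes by (c_hi - c_lo)
    per mismatching bit, while the total capacitance grows with m_bits.
    """
    total = m_bits * (c_lo + c_hi) + c_ml
    return v_in * (c_hi - c_lo) / total


for m in (2, 8, 32, 128):
    print(f"pattern length {m:4d}: delta V = {voltage_margin(m):.5f} * V_in")
```

In this model the margin falls roughly as the inverse of the pattern length, so longer patterns need a more sensitive threshold detector or a larger capacitance ratio.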

#### Search energy demand

Only caused by charging currents in:

- cells
- parasitic capacitances

$\rightarrow$ Power consumption scales with array size

|                  | **ACN** ($C_A/C_B = 1.5$) | SRAM-based              |
|------------------|---------------------------|-------------------------|
| Array size       | 32×64                     | 256–1024×144            |
| Cell area        | 8 F<sup>2</sup>           | 120–1500 F<sup>2</sup>  |
| Feature size F   | 40 nm                     | 32–65 nm                |
| Search energy    | 4.69 aJ/bit/search        | > 0.1 fJ/bit/search     |

P. T. Huang, W. Hwang, *IEEE J. Solid-State Circuits*, 46, p. 507 (2011)

A. T. Do, C. Yin et al., *IEEE J. Solid-State Circuits*, 49, p. 1487 (2014)





L. Nielen, A. Siemon et al., accepted for publication in *IEEE JETCAS* (2015)

## Summary

- A neuromorphic application for memristive random access memories was demonstrated
- An Associative Capacitive Network was fabricated
- An **Experimental Setup** for ACN evaluation was developed
- **Proof-of-Concept:** the ACN shows the predicted behavior
- Array simulations promise low power consumption
- Fully parallel search within the range of **Nanoseconds** was demonstrated using a simple measurement setup

Advantages of CRS based ACN concept:

- Hamming Distance detection (similarity)
- Fully passive 2-CRS cell implementation $\rightarrow$ small area demand
- Non-volatile $\rightarrow$ no refresh $\rightarrow$ low power consumption
- No reprogramming $\rightarrow$ fast read access
- No requirement of constant voltage supply

# THANK YOU FOR YOUR ATTENTION



K. Eshraghian; K. R. Cho et al., IEEE Trans. VLSI, 19, p. 1407 (2011)



K. Pagiamtzis; A. Sheikholeslami, IEEE J. Solid-State Circuits, 41, p. 712 (2006)

# TABLE I Performance Comparison of Prior Works

|                             | This work     | [25]         | [10]        | [3]         | [19]                            | [20]           |
|-----------------------------|---------------|--------------|-------------|-------------|---------------------------------|----------------|
| Technology / Supply         | 65 nm / 1.2 V | 130 nm / 1 V | 22 nm / 1 V | 65 nm / 1 V | 65 nm / 1 V                     | 130 nm / 1.2 V |
| Search delay (ns)           | 1.07          | 0.9          | 0.145       | 1.92        | 0.6 (72 bits)<br>2.2 (240 bits) | 3.5            |
| FOM (fJ/bit/search)         | 0.77          | 1.827        | 1.07        | 1.98        | 0.99                            | 1.3            |
| Normalized FOM*             | 1             | 1.09         | 2.48        | 2.37        | 1.2                             | 0.65           |
| Frequency                   | 500 MHz       | 250 MHz      | N.A.        | 250 MHz     | 450                             | N.A.           |
| Chip area (mm<sup>2</sup>)  | 0.125         | 1.4          | N.A.        | 99          | 0.078                           | N.A.           |
| Capacity                    | 128×128       | 128×32       | 128×128     | 18 Mb       | 64×72                           | 256×144        |

\* Normalized FOM = FOM × (65 nm/technology node)×(1.2 V/VDD).

|                                             | Hybrid [19]<br>(JSSC 2005) | PF-CDPD [15]<br>(JSSC 2006) | Range Match [16]<br>(ISSCC 2006) | Tree-style [17, 21]<br>(JSSC 2008) | Charge Recycling [22]<br>(ASSCC 2008) | This Work<br>(Simulation) | This Work<br>(Test Chip Measurement) |
|---------------------------------------------|----------------------------|-----------------------------|----------------------------------|------------------------------------|---------------------------------------|---------------------------|--------------------------------------|
| Configuration                               | 1024×144                   | 256×128                     | 512×144                          | 256×128                            | 1024×144                              | 256×144                   | 256×144                              |
| Technology                                  | 100 nm                     | 0.18 µm                     | 0.13 µm                          | 0.18 µm                            | 0.18 µm                               | 65 nm                     | 65 nm                                |
| Area (mm<sup>2</sup>)                       | 2.8×4.2 (chip)             | 1.21×0.56 (core)            | 1.5×1.7 (core)                   | 0.84×0.92 (core)                   | 3.67×0.98 (core)                      | 1.01×0.43 (core)          | 1.01×0.43 (core)                     |
| Supply voltage                              | 1.2 V                      | 1.8 V                       | 1.2 V                            | 1.8 V                              | 1.8 V                                 | 1.0 V                     | 1.0 V                                |
| Search time                                 | 2.20 ns                    | 2.10 ns                     | 4.80 ns                          | 1.56 ns                            | 100 MHz                               | 0.38 ns                   | 400 MHz                              |
| Energy metric (fJ/bit/search)               | 0.700                      | 2.330                       | 0.590                            | 1.420                              | 6.300                                 | 0.113                     | 0.165                                |
| Normalized search time T* (ns)              | 1.716                      | 1.365                       | 2.880                            | 1.014                              | N.A.                                  | 0.380                     | N.A.                                 |
| Normalized energy metric E* (fJ/bit/search) | 0.316                      | 0.260                       | 0.205                            | 0.158                              | 0.702                                 | 0.113                     | 0.165                                |

TABLE I FEATURES SUMMARY AND COMPARISONS



K. Eshraghian; K. R. Cho et al., IEEE Trans. VLSI, 19, p. 1407 (2011)

### VerilogA core cell



F = 40 nm

- $C_{\text{seg}} = 2.76 \text{ aF}$, with SiO<sub>2</sub> as interline material ($\varepsilon_{\text{r}} = 3.9$)
- $R_{\text{seg}} = 0.86\ \Omega$
- $D_A / D_B = 21 \text{ nm} / 14 \text{ nm} = 1.5$ ($C_A / C_B = 1/1.5$)
- $D_A / D_B = 24 \text{ nm} / 12 \text{ nm} = 2$ ($C_A / C_B = 1/2$)
- $D_A / D_B = 28 \text{ nm} / 7 \text{ nm} = 4$ ($C_A / C_B = 1/4$)
- $D_A / D_B = 30 \text{ nm} / 5 \text{ nm} = 6$ ($C_A / C_B = 1/6$)

# **Monte Carlo Simulations**



Probability function ($p$) of the seven top match lines with minimum HDs versus $V_{ML}$; $p$ can be interpreted as the detection probability. To detect HD=0 ($ML_1$) with p=95%, $V_{TH}$ has to be around 1.5 V. Under these conditions, detection succeeds over a range of outputs: for HD=0 the output is 1 with 95% probability, for HD=1 with 50% probability, for HD=2 with 10% probability, and for HD=3 with only 0.5% probability. For HD > 3 the probability of an output 1 is negligibly small. Thus, HD < 4 is detectable with high probability.

| Variable                        | Mean $(\mu)$        | Relative $3\sigma$ |
|---------------------------------|---------------------|--------------------|
| Supply and input voltages       | 3 V                 | 10%                |
| Series resistors on each ML     | 100 Ω               | 10%                |
| Device thickness                | 20 nm               | 10%                |
| Top electrode width             | 5 µm                | 10%                |
| Middle electrode width          | 10 µm               | 10%                |
| Bottom electrode width          | 15 µm               | 10%                |
| Load capacitor $(C_{ML})$       | 22 pF               | 10%                |
| RRAM ON resistance $(R_{ON})$   | $1 \text{ k}\Omega$ | 20%                |
| RRAM OFF resistance $(R_{OFF})$ | $1 M\Omega$         | 20%                |

Probability–voltage distribution for seven outputs with HD=0 to 6, corresponding to $ML_1$ to $ML_7$. The solid blue line indicates $V_{TH}$ in our test-bench circuit. Outputs with a voltage amplitude above the threshold voltage are treated as 'hit'; those below it are treated as 'miss'.
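
The way such a hit probability is estimated can be sketched with a small Monte Carlo loop. The sketch reads "relative $3\sigma$" as a normal distribution with $\sigma = \mu \cdot \text{rel}/3$ and uses the supply voltage and load capacitor values from the table. The match-line model `toy_v_ml` and the cell capacitance in it are hypothetical placeholders, not the circuit behind the published curves.

```python
import random


def sample_normal(rng, mean, rel_3sigma):
    """Draw from a normal distribution whose 3-sigma spread is rel_3sigma * mean."""
    return rng.gauss(mean, mean * rel_3sigma / 3.0)


def hit_probability(v_ml_model, v_th, n_trials=10000, seed=0):
    """Fraction of trials in which the sampled match-line voltage exceeds v_th."""
    rng = random.Random(seed)
    hits = sum(v_ml_model(rng) > v_th for _ in range(n_trials))
    return hits / n_trials


def toy_v_ml(rng):
    """Hypothetical capacitive divider between a driven cell and the load."""
    v_in = sample_normal(rng, 3.0, 0.10)       # supply voltage, from the table
    c_ml = sample_normal(rng, 22e-12, 0.10)    # load capacitor, from the table
    c_cell = 22e-12                            # assumed value, not from the source
    return v_in * c_cell / (c_cell + c_ml)


for v_th in (1.4, 1.5, 1.6):
    print(f"V_TH = {v_th} V: p(hit) = {hit_probability(toy_v_ml, v_th):.3f}")
```

Sweeping `v_th` in this way gives one probability curve per match line; repeating it with a model for each Hamming distance would give the kind of family of curves that the caption describes.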

O. Kavehei, E. Linn et al., Nanoscale, 5, 5119 (2013)