#### Developments of Reconfigurable Digital Logic in the ASIC using 130nm and 28nm CMOS

Larry Ruckman – <u>ruckman@slac.stanford.edu</u>, Julia Gonski, Aseem Gupta, Hyunjoon Kim, Lorenzo Rota

CPAD 2023 - Nov. 8, 2023





## Automated data processing in ASIC/FPGA - FABulous

- Goal: Move more data processing into the front-end ASICs
- Algorithms and data processing techniques often must evolve, which makes ASIC deployment problematic
  - Custom ASICs need to support updatable data processing pipelines
- Several popular FPGA architectures are becoming 20+ years old
  - Original patents have expired
  - Includes Spartan-3 and Virtex-II FPGAs from AMD/Xilinx
- In 2021, the University of Manchester started an open-source project called "FABulous"
  - An embedded FPGA (eFPGA) framework
- The idea is to embed a block of "reconfigurable logic" in your ASIC design
- SLAC is experimenting with this approach to determine its feasibility for front-end data processing, both classical and ML-based
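
To make "reconfigurable logic" concrete, here is a minimal sketch of a 4-input look-up table (LUT), the kind of primitive an FPGA fabric is built from. It is a generic software model, not FABulous code; the function name `lut4` and the two configuration constants are assumptions chosen for illustration. The point is that the same cell implements different logic functions depending only on the 16 configuration bits loaded into it.

```python
def lut4(config: int, a: int, b: int, c: int, d: int) -> int:
    """Evaluate a 4-input LUT: the inputs select one of 16 configuration bits."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (config >> index) & 1


# Hypothetical configurations: the same cell, "reprogrammed" by changing 16 bits.
AND4 = 0x8000  # only index 15 (all inputs high) holds a 1
XOR4 = 0x6996  # bit i holds the parity of the binary digits of i

print(lut4(AND4, 1, 1, 1, 1))  # 4-input AND with all inputs high
print(lut4(XOR4, 1, 0, 0, 0))  # 4-input XOR with a single input high
```

Updating a data processing pipeline after fabrication then amounts to loading a new set of configuration bits rather than respinning the ASIC.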




N. Dao, A. Attwood, B. Healy, D. Koch, "FlexBex: A RISC-V with a Reconfigurable Instruction Extension," doi: 10.1109/ICFPT51103.2020.00034.
D. Koch, N. Dao, B. Healy, J. Yu, A. Attwood, "FABulous: An Embedded FPGA Framework," doi: 10.1145/3431920.3439302.
J. Yu, A. Attwood, N. Dao, D. Koch, "The FABulous Open eFPGA Ecosystem in Action - From Specifications to Chips to Running Bitstreams," doi: 10.1109/FPL53798.2021.00090.

## FABulous in 130nm CMOS - FY2022 Effort

- Simple example eFPGA design to try out the framework
  - 384 logic cells
  - 128 registers
  - 4 DSP slices
  - Submitted to a TSMC 130nm MPW run in May 2022
  - ASIC wire bonded to an FMC carrier and eFPGA bitstream loaded from a Xilinx development board
  - Testing started Aug 2022
  - ASIC+eFPGA functionality demonstrated
  - **Goal**: Get more familiar with the open-source framework and tools before implementing an eFPGA in a readout ASIC
    - Also determine where SLAC can contribute to this effort




Fabulous v1 ASIC

Outputs from eFPGA (16b counter) probed with oscilloscope



Floorplan of eFPGA fabric after Place&Route

# FABulous in 130nm CMOS - New IP developed

- Developed a new serial protocol for ASIC register access that can be used over standard fiber optics
  - Presented at 2022 IEEE NSS/MIC
  - Adopted for the nEXO charged ASIC design
- Methodology for Digital ASIC, FPGA and Software Development and Verification
  - Also presented at 2022 IEEE NSS/MIC
  - Used to develop the firmware/software and test it against the ASIC's post-PnR design prior to tape-out

## Added support for Synopsys Design Compiler (DC) to our **RUCKUS** build system

- DC is used to synthesize the RTL code for the digital ASIC
- Refer to the RDC4 talk on "Rapid Firmware/Software Development with SLAC's Open-Source Tools: SURF, RUCKUS, and ROGUE" this Thursday






## **Technology choice**

#### First iteration – last year: 130nm CMOS

- Relatively low-cost technology to experiment with the open-source eFPGA framework
- First-time-right design
- Low logic density on older CMOS nodes makes the implementation less attractive

#### Second iteration - this year: 28nm CMOS

- Scaling improves logic density by a factor of ~20
- Process selected by the US & international community for future HEP ASIC developments (TSMC High Performance Computing+)
- Better radiation hardness & lower power consumption
- Synergy with other projects: build know-how and IP on 28nm
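
As a rough cross-check of the quoted ~20 density improvement, one can assume ideal scaling, where the area of a logic cell shrinks with the square of the feature size. This is an idealization introduced here, not a figure from the design: node names do not map exactly onto physical dimensions, and routing and the standard-cell library also affect the achieved density.

```latex
\frac{A_{130\,\mathrm{nm}}}{A_{28\,\mathrm{nm}}}
  \approx \left(\frac{130}{28}\right)^{2}
  \approx 4.64^{2}
  \approx 21.6
```

Under that assumption the same fabric would occupy roughly 1/20 of the area in 28nm, which is in line with the improvement stated above.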

Figure dimension label: 5 mm



## 28nm Design: Floorplan

Figure: floorplan showing the eFPGA + configuration/readout logic block (dimension labels: 700 µm and 450 µm)



eFPGA consists of:

- Array of 8x8 tiles (scalable & configurable)
- Switch matrix

### Scaling: 130nm vs 28nm area



## 130nm vs 28nm Tiles

#### 130nm fabric.csv config

- No customization from "TSMC example"
- Array of 8x8 tiles
- 384 logic cells, 128 registers, 4 DSP slices
- W\_IO is the "GPIO" Tile
- CPU\_IO tiles for register interface

#### 28nm fabric.csv config

- Custom WEST\_IO/EAST\_IO for register and streaming interfaces
- Array of 8x8 tiles
- 448 logic cells, 4 DSP slices
  - Found logic cells more valuable than register tiles
  - Logic cells include a register

#### 130nm fabric.csv config (tile grid)

|    | A           | B              | C              | D             | E             | F             | G             | H             | I             | J      |
|----|-------------|----------------|----------------|---------------|---------------|---------------|---------------|---------------|---------------|--------|
| 1  | FabricBegin |                |                |               |               |               |               |               |               |        |
| 2  | NULL        | N_term_single2 | N_term_single2 | N_term_single | N_term_single | N_term_single | N_term_single | N_term_single | N_term_single | NULL   |
| 3  | W_IO        | RegFile        | DSP_top        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 4  | W_IO        | RegFile        | DSP_bot        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 5  | W_IO        | RegFile        | DSP_top        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 6  | W_IO        | RegFile        | DSP_bot        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 7  | W_IO        | RegFile        | DSP_top        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 8  | W_IO        | RegFile        | DSP_bot        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 9  | W_IO        | RegFile        | DSP_top        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 10 | W_IO        | RegFile        | DSP_bot        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | CPU_IO |
| 11 | NULL        | S_term_single2 | S_term_single2 | S_term_single | S_term_single | S_term_single | S_term_single | S_term_single | S_term_single | NULL   |
| 12 | FabricEnd   |                |                |               |               |               |               |               |               |        |

#### 28nm fabric.csv config (tile grid)

|    | A           | B             | C             | D             | E             | F          | G             | H             | I             | J       |
|----|-------------|---------------|---------------|---------------|---------------|------------|---------------|---------------|---------------|---------|
| 1  | FabricBegin |               |               |               |               |            |               |               |               |         |
| 2  | NULL        | N_term_single | N_term_single | N_term_single | N_term_single | N_term_DSP | N_term_single | N_term_single | N_term_single | NULL    |
| 3  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_top    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 4  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_bot    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 5  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_top    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 6  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_bot    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 7  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_top    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 8  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_bot    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 9  | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_top    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 10 | WEST_IO     | LUT4AB        | LUT4AB        | LUT4AB        | LUT4AB        | DSP_bot    | LUT4AB        | LUT4AB        | LUT4AB        | EAST_IO |
| 11 | NULL        | S_term_single | S_term_single | S_term_single | S_term_single | S_term_DSP | S_term_single | S_term_single | S_term_single | NULL    |
| 12 | FabricEnd   |               |               |               |               |            |               |               |               |         |
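
The resource counts quoted for the two fabrics can be reproduced from the tile grids with a short script. The sketch below is not part of the FABulous tool flow: the helper names `fabric_csv` and `summarize` are hypothetical, the CSV text is rebuilt from the tables in this section rather than read from the real files, and it assumes 8 logic cells per `LUT4AB` tile (consistent with 384 cells over 48 tiles in 130nm) and one DSP slice per `DSP_top`/`DSP_bot` pair.

```python
import csv
import io
from collections import Counter

# Assumption: 8 logic cells per LUT4AB tile (384 cells / 48 tiles on 130nm).
LOGIC_CELLS_PER_LUT4AB = 8


def fabric_csv(west, columns, east, n_term, s_term, rows=8):
    """Build fabric.csv-style text; a 'DSP' column alternates DSP_top/DSP_bot."""
    lines = ["FabricBegin", ",".join(["NULL", *n_term, "NULL"])]
    for r in range(rows):
        dsp = "DSP_top" if r % 2 == 0 else "DSP_bot"
        tiles = [dsp if c == "DSP" else c for c in columns]
        lines.append(",".join([west, *tiles, east]))
    lines += [",".join(["NULL", *s_term, "NULL"]), "FabricEnd"]
    return "\n".join(lines)


FABRIC_130NM = fabric_csv(
    west="W_IO", columns=["RegFile", "DSP"] + ["LUT4AB"] * 6, east="CPU_IO",
    n_term=["N_term_single2"] * 2 + ["N_term_single"] * 6,
    s_term=["S_term_single2"] * 2 + ["S_term_single"] * 6,
)
FABRIC_28NM = fabric_csv(
    west="WEST_IO", columns=["LUT4AB"] * 4 + ["DSP"] + ["LUT4AB"] * 3, east="EAST_IO",
    n_term=["N_term_single"] * 4 + ["N_term_DSP"] + ["N_term_single"] * 3,
    s_term=["S_term_single"] * 4 + ["S_term_DSP"] + ["S_term_single"] * 3,
)


def summarize(fabric_text):
    """Count tiles between FabricBegin and FabricEnd and derive resource totals."""
    counts = Counter()
    inside = False
    for row in csv.reader(io.StringIO(fabric_text)):
        if not row:
            continue
        if row[0] == "FabricBegin":
            inside = True
        elif row[0] == "FabricEnd":
            break
        elif inside:
            counts.update(cell for cell in row if cell)
    return {
        "logic_cells": counts["LUT4AB"] * LOGIC_CELLS_PER_LUT4AB,
        "dsp_slices": min(counts["DSP_top"], counts["DSP_bot"]),
        "regfile_tiles": counts["RegFile"],
    }


print("130nm:", summarize(FABRIC_130NM))
print("28nm: ", summarize(FABRIC_28NM))
```

Dropping the `RegFile` column frees one column for `LUT4AB` tiles, which is how the 28nm grid goes from 6 to 7 logic columns (384 to 448 logic cells) within the same 8x8 array.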

## **28nm Design: Full chip Integration**



Figure dimension label: 983.32 µm

## **FABulous for Detector Readout**

- FABulous in 28nm can enable increased intelligence at the front-end of future detectors in HEP experiments
- AI/ML on-chip is a high priority R&D avenue for the collider community [2306.13567]
  - Data reduction for high channel density
  - Fast classification/inference for trigger information
- *Example*: 28nm ASIC for ML on low-level pixel charge data (right)
- 28nm CMOS is radiation hard; the eFPGA still needs triple modular redundancy (TMR) for configuration bits and registers
- Open source: potential to apply to a variety of subsystems and subfields of HEP (e.g., neutrino detectors, free electron lasers)
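
The TMR requirement above can be illustrated with a bitwise majority voter. This is a generic sketch of the technique, not the protection scheme used in the SLAC design; the function name `majority` and the example bit patterns are assumptions. Each protected value is stored three times, and the voted output stays correct as long as at most one copy of any given bit is upset.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote across three redundant copies."""
    return (a & b) | (a & c) | (b & c)


stored = 0b1011_0010               # hypothetical configuration word
copies = [stored, stored, stored]  # three redundant copies
copies[1] ^= 0b0000_1000           # single-event upset flips one bit in one copy

recovered = majority(*copies)
print(recovered == stored)         # the voter masks the single upset
```

In hardware the same expression is a handful of gates per protected bit, at the cost of roughly tripling the storage for configuration and user registers.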



#### **IEEE Real Time Conference: FABulous eFPGAs**

- 24th IEEE Real Time Conference
  - ICISE, Quy Nhon, Vietnam
  - April 22-26, 2024
  - <u>https://indico.cern.ch/event/940112/</u>
- Abstract submissions are now open.
  - Deadline: 2 Dec 2023, 11:59
- Pre-Conference Program: 4 workshops/tutorials will be held
  - Cocotb
  - SURF
  - FABulous eFPGAs
  - Open-source ASIC Design (Skywater, Caravel)



## **Summary**

- Contributed to development of the FABulous open-source eFPGA framework
- Established a flow to implement an eFPGA in 28nm and 130nm:
  - ASIC implementation
  - eFPGA configuration through FPGA
  - Co-simulation
- Submitted to a TSMC 28nm MPW run in July 2023
  - ETA: This week! (shipped out last Friday)
  - ASIC carrier is fabricated and ready for wire bonding
- All RTL code is released as open-source:
  - <u>https://github.com/FPGA-Research-Manchester/FABulous</u>
  - <u>https://github.com/slaclab/surf</u>

