Featured Project | FPGA/RTL | Present

CNN Accelerator RTL Implementation

High-Performance INT8 CNN Accelerator for Zynq-7020 FPGA with 31.36 GOP/s throughput

About This Project

Designed and implemented a high-performance CNN accelerator IP core for the Xilinx Zynq-7020, featuring a 14×14 INT8 processing-element array delivering 31.36 GOP/s at 80 MHz.
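The 31.36 GOP/s figure is consistent with a simple peak-throughput estimate. The sketch below assumes each processing element completes one multiply-accumulate per cycle and that a MAC counts as two operations; that counting convention is an assumption, not something stated in the project description.

```python
# Peak-throughput estimate for a 14x14 PE array clocked at 80 MHz.
ROWS, COLS = 14, 14
CLOCK_HZ = 80_000_000
OPS_PER_MAC = 2  # assumption: one multiply plus one add per PE per cycle


def peak_gops(rows, cols, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Return peak throughput in GOP/s, assuming every PE is busy every cycle."""
    return rows * cols * ops_per_mac * clock_hz / 1e9


if __name__ == "__main__":
    print(f"{peak_gops(ROWS, COLS, CLOCK_HZ):.2f} GOP/s")
```

This is an upper bound: sustained throughput depends on how well the memory system keeps all 196 PEs fed.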

Developed a custom 32-bit ISA, hierarchical memory system (PE register files + global buffers), and AXI4 DMA integration for seamless ARM–FPGA communication.
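The project description does not give the instruction format, so the following is only an illustration of how a fixed-width 32-bit instruction word can be packed and unpacked. The field names and widths (6-bit opcode, two 5-bit register indices, 16-bit immediate) are hypothetical and are not the accelerator's actual encoding.

```python
# Hypothetical 32-bit instruction layout (NOT the project's real ISA):
#   [31:26] opcode | [25:21] dst | [20:16] src | [15:0] imm
OPCODE_BITS, DST_BITS, SRC_BITS, IMM_BITS = 6, 5, 5, 16


def encode(opcode, dst, src, imm):
    """Pack four unsigned fields into one 32-bit instruction word."""
    fields = (("opcode", opcode, OPCODE_BITS), ("dst", dst, DST_BITS),
              ("src", src, SRC_BITS), ("imm", imm, IMM_BITS))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return (opcode << 26) | (dst << 21) | (src << 16) | imm


def decode(word):
    """Unpack a 32-bit word into (opcode, dst, src, imm)."""
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F,
            (word >> 16) & 0x1F, word & 0xFFFF)


if __name__ == "__main__":
    word = encode(opcode=3, dst=1, src=2, imm=196)
    print(f"0x{word:08X}", decode(word))
```

A fixed-width word like this keeps the hardware decoder to a handful of bit slices, which is one common reason to choose such a format for a small accelerator.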

Also includes a matrix-vector multiplier with UART communication and AXI4-Stream interfaces.
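As a software reference for what an INT8 matrix-vector multiplier computes, here is a minimal sketch. The 32-bit accumulator width and the function name `matvec_int8` are assumptions made for illustration; the hardware's actual accumulator width is not stated.

```python
INT8_MIN, INT8_MAX = -128, 127
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # assumed accumulator width


def matvec_int8(matrix, vector):
    """Multiply an INT8 matrix by an INT8 vector, one accumulated sum per row."""
    for value in vector:
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"vector element {value} is outside the INT8 range")
    result = []
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError("row length must match vector length")
        acc = 0
        for weight, x in zip(row, vector):
            if not INT8_MIN <= weight <= INT8_MAX:
                raise ValueError(f"matrix element {weight} is outside the INT8 range")
            acc += weight * x  # one multiply-accumulate step
        if not INT32_MIN <= acc <= INT32_MAX:
            raise OverflowError("sum exceeds the assumed 32-bit accumulator")
        result.append(acc)
    return result


if __name__ == "__main__":
    print(matvec_int8([[1, -2, 3], [127, 127, 127]], [4, 5, -6]))
```

A model like this is handy as a golden reference when checking RTL simulation output against expected results.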

Technologies Used

Xilinx Zynq-7020, INT8, SystemVerilog, AXI4 DMA, 32-bit ISA, Vivado, ARM-FPGA

Category: FPGA/RTL
Timeline: Present
Technologies: 7