Dynamic Neural Accelerator
EdgeCortix Dynamic Neural Accelerator (DNA) is a flexible IP core for deep learning inference, delivering high compute capability, ultra-low latency, and a scalable inference engine on BittWare cards featuring Agilex FPGAs.
Specially optimized for inference on streaming and high-resolution data (batch size 1), DNA is a patented reconfigurable IP core that, in combination with EdgeCortix’s MERA™ software framework, enables seamless acceleration of today’s increasingly complex and compute-intensive AI workloads while achieving over 90% array utilization.
Complemented by the MERA framework, which provides an integrated compilation library and runtime, this dedicated IP core enables software engineers to use the BittWare IA-840F and IA-420F FPGA cards as drop-in replacements for standard CPUs or GPUs, without leaving the comfort zone of standard frameworks like PyTorch and TensorFlow. DNA bitstreams for Agilex provide significantly lower inference latency on streaming data, with a 2X to 6X performance advantage over competing FPGAs and better power efficiency than other general-purpose processors.
- Up to 20 TOPS @ 400 MHz
- 99% of FP32 accuracy
- 50+ models tested with the MERA framework
The EdgeCortix deep learning compute engines, part of the DNA IP core, are optimized for the BittWare IA-840F and IA-420F cards and are shipped as ready-to-use bitstreams. The EdgeCortix solution suite comes with the MERA™ framework, which can be installed from a public pip repository, enabling seamless compilation and execution of standard or custom convolutional neural networks (CNNs) developed in industry-standard frameworks.
MERA consists of the compiler and software toolkit needed to enable deep neural network graph compilation and inference using the integrated DNA bitstreams. With built-in support for the open-source Apache TVM compiler framework, it provides the tools, APIs, code generator, and runtime needed to deploy a pre-trained deep neural network after a simple calibration and quantization step. MERA also supports models quantized directly in a deep learning framework such as PyTorch or TensorFlow Lite.
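As a rough illustration of the calibration-and-quantization step described above, here is a minimal sketch of post-training INT8 quantization done entirely in PyTorch's own eager-mode quantization API. The model, calibration loop, and tensor shapes are illustrative placeholders; MERA's own compiler and runtime calls are not shown here.

```python
import torch
import torch.nn as nn

# Placeholder model: a single conv + ReLU, bracketed by quant/dequant
# stubs as PyTorch eager-mode quantization requires.
class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = TinyCNN().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
prepared = torch.ao.quantization.prepare(model)

# Calibration: run a few representative inputs (random here, real data
# in practice) so the observers can pick quantization ranges.
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 3, 32, 32))

# Convert to an INT8 model; this quantized model (or its TFLite
# equivalent) is what would then be handed to the MERA toolchain.
quantized = torch.ao.quantization.convert(prepared)
out = quantized(torch.randn(1, 3, 32, 32))
```

A TensorFlow Lite model quantized with its built-in post-training quantizer would follow the same pattern: calibrate on representative data, convert to INT8, then compile with MERA.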
Ultra-low Latency AI Inference IP Core:
- Up to 24,576 MACs and a dedicated vector engine for non-convolution operations @ 400 MHz
- Data-flow array-based architecture optimized for INT8 parameters and activations
- Patented runtime-reconfigurable interconnect
Robust open-source MERA software framework:
- MERA compiler 1.0 exploits multiple forms of parallelism and maximizes compute utilization
- Native support for PyTorch and TensorFlow Lite models
- Built-in profiler in the MERA framework
- Integrated with the open-source Apache TVM
Data Sheet and Product Details
Detailed Feature List
Diverse Operator Support:
- Standard and depth-wise convolutions
- Stride and dilation
- Symmetric/asymmetric padding
- Max pooling, average pooling
- ReLU, ReLU6, LeakyReLU, and H-Swish
- Upsampling and downsampling
- Residual connections, split, etc.
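To make the operator list concrete, the following sketch builds a small PyTorch block that exercises several of the listed operator types: a depth-wise convolution, a ReLU6 activation, a residual connection, and max pooling. The layer sizes and channel counts are arbitrary choices for illustration, not DNA-specific values.

```python
import torch
import torch.nn as nn

# Illustrative block combining operator types from the supported list.
class DemoBlock(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        # Depth-wise convolution: groups == in_channels
        self.dw = nn.Conv2d(channels, channels, kernel_size=3,
                            padding=1, groups=channels)
        # Point-wise (1x1) convolution
        self.pw = nn.Conv2d(channels, channels, kernel_size=1)
        self.act = nn.ReLU6()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        y = self.act(self.pw(self.dw(x)))
        y = y + x          # residual connection
        return self.pool(y)

model = DemoBlock().eval()
# Batch size 1, matching DNA's streaming-data focus
out = model(torch.randn(1, 16, 64, 64))
```

Because these operators are all standard PyTorch modules, a model built this way needs no retraining before quantization and compilation.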
Drop-in Replacement for GPUs:
- Python and C++ interfaces
- PyTorch and TensorFlow Lite supported
- No need for retraining
- Supports high-resolution inputs
INT8 Quantization:
- Post-training quantization
- Support for deep learning framework built-in quantizers
- Maintains high accuracy
FPGA Card Ordering Options
| Part Number | Description |
|---|---|
| IA-420F-0006 | BittWare IA-420F card powered by EdgeCortix® Dynamic Neural Accelerator |
| IA-840F-0014 | BittWare IA-840F card powered by EdgeCortix® Dynamic Neural Accelerator |
About the Company
EdgeCortix is an edge-AI focused fabless semiconductor design company with a software-first approach, delivering class-leading efficiency and latency for AI inference.
Interested in Pricing or More Information?
Our technical sales team is ready to provide availability and configuration information, or answer your technical questions.