Powerful AI Inference

Built on BittWare

SC24 Video

Transcript

Hi, my name is Cameron McCaskill, and I run sales for a company called Positron AI. We’re actually here at the Supercomputing event with our partner BittWare, and what I’m going to do first is show you a demo of what we’ve built, and then I’ll explain how we’ve gotten from point A to point B.

What you’re going to see is a demo of two of these AI accelerator cards produced by BittWare, with chips inside that we’ve programmed specifically for AI inference acceleration. The demo is a comparison against the H100 card from NVIDIA.

On the left of the screen, you will see two of the H100 cards from NVIDIA running Llama 3.1 8B, a very popular model from Meta, and on the right, you’ll see them versus two of our cards from Positron. I’m going to run an industry-standard benchmark called MMLU Pro, and once I click that button, you’ll see the speed and the power consumption of each of the two solutions.

On the left, you’ll see NVIDIA running at about 140 tokens per second for this Llama 3.1 8B model, at about 3 watts per token. On the right, you see the two Positron cards running at almost double that speed and about a third of the power.
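To make that comparison concrete, here is a minimal sketch of the arithmetic behind those figures, assuming “watts per token” means power draw divided by token throughput (i.e. joules per token); the numbers are the approximate ones quoted above, and none of this is Positron or NVIDIA code:

```python
# Minimal sketch of the demo's efficiency math. Figures are the approximate
# ones quoted in the demo; "watts per token" is read here as
# power / throughput, i.e. joules per token (an assumption).

def watts_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy efficiency: power draw divided by token throughput."""
    return power_watts / tokens_per_second

# Two NVIDIA H100 cards running Llama 3.1 8B, as quoted in the demo:
h100_tps = 140.0                          # tokens per second
h100_w_per_tok = 3.0                      # quoted watts per token
h100_power = h100_tps * h100_w_per_tok    # implied draw: ~420 W

# Two Positron cards: almost double the speed at about a third of the power.
positron_tps = 2 * h100_tps               # ~280 tokens per second
positron_power = h100_power / 3           # ~140 W

print(watts_per_token(h100_power, h100_tps))          # 3.0
print(watts_per_token(positron_power, positron_tps))  # 0.5
```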

The value proposition of Positron is that we’ve made a much more efficient solution, and not just from a performance perspective: the cost of the server is much less than a DGX platform from NVIDIA, and the power consumption is about a third of what NVIDIA consumes to run AI model inference.

Let me tell you a little bit more about how we’ve done it. The product we’re shipping today is called Atlas. It is an eight-card system, so it has eight of these cards I mentioned before inside the server, and that is comparable to the eight-accelerator-card version of the DGX H100 box from NVIDIA.

This box is now shipping. We think the way people will ultimately measure the success of their hardware, or their hardware-software combination, for running an AI model will boil down to two things: performance per dollar and performance per watt.

Today, Positron is showing about a 5x improvement in performance per watt and about a 4x improvement in performance per dollar versus NVIDIA’s DGX H100.

The other thing we wanted to do is make it an appliance. Think of this as a box that you just plug in, and it just works. From a software perspective, we decided not to go down the path that many AI chip startups have taken of building their own compilers, where you then have to go and relearn how to use a new software package. We said, “What if we just leverage the Transformers library from Hugging Face?” Hugging Face has over a million transformer models uploaded to it today, making it the largest repository of open-source large language models. As long as you’re able to take the .pt or .safetensors file off of Hugging Face, it’s literally a drag-and-drop experience onto the Positron AI hardware. So again, it’s that appliance-like experience.
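As a rough illustration of the workflow the appliance leverages, here is what pulling a .safetensors model off Hugging Face and running it looks like with the standard Transformers library. This is generic library usage with a stock PyTorch backend; the model ID is just an example, and nothing here describes Positron’s internal software:

```python
# Generic Hugging Face Transformers flow: download a model's .safetensors
# weights and run inference. This is the standard library usage the
# appliance builds on, not Positron-specific code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # example; weights ship as .safetensors

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain AI inference in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```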

Another example of how we’ve made this a plug-and-play experience: our first customer went from opening the box to running large language models on that server in a grand total of 38 minutes. It is truly a plug-and-play experience for our customers.

Another advantage we like to talk about concerns all of the data centers out there, particularly ones built before 2020, that I call “AI orphaned”: as an example, they may only have 10 kW of power available to each rack.

Well, in that power envelope, you can literally only fit one DGX H100, because when the DGX H100 is running inference, it’s burning about 6,000 watts. So, if you only have a 10 kW rack, you can only fit one of those servers in that footprint.

On the other hand, our Atlas server burns only 2,000 watts running AI inference at full speed, so you can actually fit five of our servers in that 10 kW footprint. That has allowed us to go after data centers that feel orphaned by AI today: they’re not able to participate in this fast-growing business, and we show them a solution that can operate in their 10 kW racks very effectively, and very cost-effectively as well.
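The rack-fit math is simple division over the power budget; here is a minimal sketch using the figures quoted above:

```python
# Servers per rack = available rack power divided by per-server draw,
# rounded down. Figures are the approximate ones quoted in the talk.
rack_budget_w = 10_000   # a 10 kW "AI orphaned" rack

dgx_h100_w = 6_000       # DGX H100 draw while running inference
atlas_w = 2_000          # Positron Atlas draw at full inference speed

print(rack_budget_w // dgx_h100_w)  # 1 DGX H100 fits
print(rack_budget_w // atlas_w)     # 5 Atlas servers fit
```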

And so, in closing, a word on what we’ve built to date; again, the company has been around since April of 2023. We’ve got what we call Positron Test Flight. Really, all that is, is we’ve hosted some of these Atlas servers in our own engineering facility, and we enable customers to have a dedicated instance of that hardware that they can test for free. So: try before you buy.

The Atlas server you see here in the middle of the slide is the product we’re shipping today and have been shipping since August. As for the accelerator cards I’ve shown you a few times, some customers have asked, “Can I just buy cards from you and install them in my server?” The answer is yes, but for the most part today, we’re focused on selling the appliance. We will sell cards if the volume is right and the integration isn’t too complex in terms of the number of server platforms we’d have to deploy into.

That’s Positron AI in a nutshell. While we’re already more performant than NVIDIA for popular models like Llama 3.1 8B, we can still double today’s performance through software alone, on the exact same hardware platform. We’d be excited to talk with you more.

Interested in Pricing or More Information?

Our technical sales team is ready to provide availability and configuration information, or answer your technical questions.