
Speeding AI and HPC Workloads with Composable Memory and Hardware Compression / Decompression Acceleration
Sometimes I try to capture a column's intent in a short, snappy title using pithy prose, as it were. I might attempt an aphoristic or epigrammatic turn of phrase while trying not to appear gnomic (which - as we now see - is something I reserve for my opening sentence).
www.eejournal.com, Mar. 18, 2025 –
In this case, however, my "Speeding AI and HPC Workloads with Composable Memory and Hardware Compression / Decompression Acceleration" offering is as short and snappy as I could make it with respect to the topic about which I'm poised to pontificate.
My head is currently spinning like a top, because I was just chatting with Lou Ternullo, who is Senior Director of Product Management for Silicon IP at Rambus, and Nilesh Shah, who is VP of Business Development at ZeroPoint Technologies.
In a crunchy nutshell, the problem we are talking about here is the high cost of memory in the data centers used to perform artificial intelligence (AI) and high-performance computing (HPC). But turn that frown upside down, because the guys and gals at Rambus are collaborating with the chaps and chapesses at ZeroPoint to provide a super-exciting solution (said Max, super-excitedly).
The typical setup—the one that's still dominant in the market today—is for servers to have their XPUs (CPUs, GPUs, NPUs, TPUs, etc.), DDR memory, and SSD storage (in the form of NVMe drives) located within the same physical unit. This is the standard server architecture, especially in traditional data centers, because this setup is well understood, widely supported by operating systems and applications, and offers simplicity in terms of latency, consistency, and management. For example, consider a representative server as shown below:
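To get a feel for why the placement of memory and storage matters so much in this traditional "everything in the box" arrangement, here's a minimal sketch comparing order-of-magnitude access latencies for the tiers mentioned above. The numbers are rough, commonly cited ballpark figures I've plugged in for illustration, not measurements from Rambus or ZeroPoint:

```python
# Illustrative sketch: rough order-of-magnitude access latencies for the
# tiers in a traditional server (assumed ballpark values, not measurements).

MEMORY_TIERS = {
    "CPU cache (L1)": 1e-9,  # ~1 ns
    "DDR DRAM":       1e-7,  # ~100 ns
    "NVMe SSD":       1e-4,  # ~100 us
}

def relative_latency(tier: str, baseline: str = "DDR DRAM") -> float:
    """Return how many times slower (or faster) `tier` is than `baseline`."""
    return MEMORY_TIERS[tier] / MEMORY_TIERS[baseline]

for tier, seconds in MEMORY_TIERS.items():
    print(f"{tier:16s} ~{seconds:.0e} s  ({relative_latency(tier):,.3f}x DDR)")
```

The point of the exercise: DRAM sits roughly three orders of magnitude closer to the processor than NVMe storage, which is precisely why data centers end up buying so much expensive DRAM—and why technologies that stretch the effective capacity of that DRAM tier are so attractive.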