Astera Labs uses CXL to accelerate AI, expand memory

Astera Labs is sampling a new cable to expand GPU clustering for AI workloads, linking multiple racks together and spreading out heat output and energy usage.

The Aries PCIe and Compute Express Link (CXL) Smart Cable Modules (SCMs) use copper cabling to more than double the PCIe 5.0 signal reach from 3 meters to 7 meters. Astera Labs achieves this by adding to the cables its digital signal processor retimer, a protocol-aware device that compensates for transmission impairments.

The extended reach enables greater interconnectivity among GPUs, as well as with CPUs and disaggregated memory. That allows larger GPU clusters, including clusters that span racks, and can increase the number of GPUs in AI infrastructure even as each GPU requires more energy to operate.

Aries SCMs take CXL connectivity beyond a single rack, according to Baron Fung, an analyst at Dell’Oro Group. Customers can now connect multiple servers into clusters, creating more data interconnectivity at a time when AI models will only get bigger.

“You can have cache-coherent, phase-coherent communication between these AI servers and GPUs, beyond the rack,” he said, referring to the consistency between caches in a multiprocessor environment and signal consistency. Introducing such a product could breathe new life into CXL, where use cases are slowly emerging, and put other GPU makers on more equal footing with Nvidia, the GPU market leader.

Spreading out the power usage

Each new generation of GPU requires more power to operate, Astera Labs noted. There is only so much power that can be used on existing racks, meaning data centers would either need new racks or ways to spread newer GPUs farther apart.
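
The power constraint comes down to simple arithmetic. The sketch below is illustrative only: the function names, the rack budget, the overhead and the per-GPU wattages are all hypothetical values chosen for the example, not figures from Astera Labs or any GPU vendor.

```python
def gpus_per_rack(rack_budget_watts: int, gpu_watts: int, overhead_watts: int = 0) -> int:
    """Return how many GPUs fit in a rack's power budget after fixed overhead."""
    usable = rack_budget_watts - overhead_watts
    if usable <= 0 or gpu_watts <= 0:
        return 0
    return usable // gpu_watts


def racks_needed(total_gpus: int, per_rack: int) -> int:
    """Return the number of racks needed to host total_gpus (ceiling division)."""
    if per_rack <= 0:
        raise ValueError("per_rack must be positive")
    return -(-total_gpus // per_rack)


# Hypothetical numbers: a 20 kW rack with 2 kW of non-GPU overhead.
older = gpus_per_rack(20_000, 400, overhead_watts=2_000)   # assumed 400 W GPU
newer = gpus_per_rack(20_000, 700, overhead_watts=2_000)   # assumed 700 W GPU

print(older, racks_needed(64, older))
print(newer, racks_needed(64, newer))
```

With these assumed values, the same 64-GPU cluster needs an extra rack once per-GPU power rises, which is why links that span racks matter.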

The Aries SCMs could add scale-up capabilities to compute in AI workloads, Fung said.

“You can have scale-up architecture based on the CXL standard,” he said.

CXL, an open standard for high-speed interconnects between processors and attached devices, could also provide a market alternative to Nvidia’s proprietary NVLink, a high-bandwidth, high-speed interconnect for GPUs and CPUs, Fung said. NVLink is designed for Nvidia products only.

Customers considering alternatives to Nvidia GPUs, such as AMD’s MI series, might be interested in the Aries SCM, according to Nathan Brookwood, a research fellow at Insight 64. While the SCM doesn’t connect competing GPUs, it could help customers cluster AMD GPUs in a way similar to NVLink.

Other uses

Enterprises can also use Aries SCMs to add DRAM to a system, which can be useful to some applications, but isn’t an option with off-the-shelf servers, Brookwood said.

“Something like SAP HANA just loves memory and eats it up like a dog eats dog food,” he said.

The SCMs could also connect an array of CXL memory modules, Brookwood said, noting this as a potentially interesting application for CXL.

Storage has been disaggregated from compute for years because it doesn’t require a low-latency, phase-coherent fabric, Fung said. This has allowed data storage to hit higher density and utilization within a rack than memory. Before CXL, memory couldn’t be disaggregated like storage.

“The challenge of memory is that you need the aisles to be phase coherent between the CPUs within the array,” he said. “[The CXL cable] can allow that guarantee.”

With the cable more than doubling the PCIe signal reach, a group of servers can pool disaggregated memory, potentially improving memory utilization significantly, Fung said.
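
The utilization argument can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the per-server demands and the 512 GB capacity are hypothetical, and the pooled case is idealized, treating all memory as one shared pool and ignoring the added latency of remote access.

```python
def utilization_without_pooling(demands_gb, local_capacity_gb):
    """Fraction of installed memory in use when each server can use only its own DRAM."""
    used = sum(min(d, local_capacity_gb) for d in demands_gb)
    return used / (local_capacity_gb * len(demands_gb))


def unmet_demand_without_pooling(demands_gb, local_capacity_gb):
    """Total demand (GB) that does not fit in the local DRAM of the server requesting it."""
    return sum(max(0, d - local_capacity_gb) for d in demands_gb)


def utilization_with_pooling(demands_gb, local_capacity_gb):
    """Fraction of installed memory in use when all servers share one idealized pool."""
    total = local_capacity_gb * len(demands_gb)
    return min(sum(demands_gb), total) / total


# Hypothetical workload: four servers with 512 GB each and uneven demand.
demands = [100, 900, 200, 800]
print(utilization_without_pooling(demands, 512))
print(unmet_demand_without_pooling(demands, 512))
print(utilization_with_pooling(demands, 512))
```

In this made-up case, two servers leave memory idle while two others run short. Pooling lets the busy servers borrow the idle capacity, so the same installed memory serves all of the demand.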

[Image: Astera Labs’ Aries SCM enables larger GPU clustering for AI workloads.]

Longer length, latency and risk

While lengthening a cable could lead to higher latency, Astera Labs guarantees that its Aries SCMs will preserve the same signal integrity it provided in shorter modules, Fung said.

“It meets the same standards at 3 meters that it met at 7 meters,” he said.

Plus, GPU-based workloads tend to be more sensitive to bandwidth than latency, Brookwood said. CXL increases memory utilization, and if there is enough memory, there is a way to better optimize access to that memory.

“Performance is improved by more memory more than [it is] detracted by the latency of that memory,” he said.

The bigger risk for Astera Labs is how the standard evolves, Brookwood said. The final standard for external PCIe connectivity might differ from what Astera Labs is doing.

“But this is what happens in every area of technology,” he said. “People who come out first gain the advantage and time to market, but expose themselves to the ultimate standard evolving in a different direction.”

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.com.