This chip startup just raised $135M on a bet that AI's biggest bottleneck isn't compute: it's memory

Every time you ask ChatGPT a question, your request triggers a data relay race. Data leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back. That entire journey repeats for every single word the AI generates.

The bottleneck is structural: it means routing through some of the most expensive and power-intensive chips in the industry on every single request. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve. The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.

If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm across the nation. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.

XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia’s GPUs. “CPUs and GPUs have both gotten smarter over the decades. Memory never did. XCENA wants to change that,” Kim said in an interview with TechCrunch. “The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures,” he added. (This month, the three companies that dominate the global memory chip market, Samsung, SK Hynix, and Micron, each crossed a trillion-dollar valuation for the first time.)

XCENA is betting its business on the thesis that “inference isn’t just a compute problem; it’s increasingly a memory scaling problem,” said Kim.

XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around. The company claims that what used to require 10 servers could potentially run on just one.

“While GPUs excel at matrix multiplication (the heavy math behind AI model training), much of the surrounding data orchestration, including preprocessing, KV cache management (the system that stores prior conversation context so a model doesn’t have to reprocess it), and data caching, still runs on CPUs. Our chip handles these tasks directly within the memory module itself,” Kim said.
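To give a rough sense of why KV cache management is a memory problem, here is a minimal back-of-the-envelope sketch of how the cache grows with conversation length. It uses the commonly cited estimate of one key vector and one value vector per layer, per attention head, per token. The model dimensions below are hypothetical and chosen only for illustration; they do not describe any XCENA product or any specific model mentioned in this article.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Approximate KV cache size in bytes.

    The leading factor of 2 counts one key and one value vector
    per layer, per KV head, per token. bytes_per_value=2 assumes
    16-bit values (an assumption, not a figure from the article).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape, for illustration only.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)
print(f"{size / 2**30:.2f} GiB per conversation")  # prints "1.00 GiB per conversation"
```

Under these assumed dimensions, a single 8,192-token conversation holds about 1 GiB of cached context, and the figure scales linearly with context length and with the number of concurrent requests.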

Demand for memory solutions has surged since the second half of last year, and the company believes the timing is working in its favor.

Conversations with several global memory vendors are in early stages, though Kim declined to name them. The company’s ideal customers are hyperscalers spending tens of billions a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings.

The MX1 is still a prototype. Mass production chips are scheduled to roll off Samsung’s foundry lines by the end of 2026, with the company expecting to generate revenue starting in 2027.

While neural processing unit (NPU) makers are competing to challenge Nvidia for training workloads, XCENA is targeting the memory-intensive layer that sits beneath it all.

XCENA’s closest competitors include Astera Labs and Marvell, both Nasdaq-listed companies working on next-generation memory connectivity. Marvell is a large, established player already operating in the same space, Kim said, adding that the differentiator comes down to intellectual property. “We have thousands of cores,” Kim said. Based on public specifications, Marvell’s approach relies on a handful of general-purpose cores by comparison.

These cores are built on RISC-V, an open source chip design blueprint, and optimized specifically for data processing, with each core deliberately kept small and efficient. Beyond the cores themselves, XCENA designs its own internal memory hierarchy, interconnect bus, and DRAM controller, components that most chip companies, including larger rivals, typically outsource.

Seoul-based VC firms Altinum and IMM Investment co-led the Series B round, along with Corstone Asia and existing investors SBI Investment and Mirae Asset Capital. The company, which has more than 90 employees across offices in Pangyo, a tech hub outside Seoul, and Sunnyvale, is also in conversations with international investors about additional funding.
