China's Z.ai Releases GLM-5.2: A Model That Rivals Claude Opus, Using Zero Nvidia Chips

In brief

GLM-5.2 trails Claude Opus 4.8 by just 1% on FrontierSWE, a benchmark measuring multi-hour autonomous engineering projects, while beating GPT-5.5 on the same test. It ships under an MIT license with zero regional restrictions.
The model was built entirely on Huawei Ascend chips with no Nvidia hardware involved.
Unsloth AI has already released 2-bit GGUF quantizations that shrink the model from 1.51TB to 238GB. You'll still need 256GB of RAM or VRAM, but at that point, you can run it.

Z.ai released GLM-5.2 on June 16, promising top-tier performance that surpasses its already strong GLM-5.1.

The Beijing-based lab, which has been on the U.S. Entity List since January 2025, seems to be benefiting from rising concerns over America's approach to AI. Over the past week, the ban on Anthropic Fable and the release of this new model have helped drive Z.ai's stock up 90%, sending it to a new all-time high.

GLM-5.2 has the numbers to back up the hype.

FrontierSWE is a benchmark that evaluates whether an AI agent can complete open-ended technical projects measured in hours, covering systems optimization, large-scale code construction, and applied ML research, and it is scored by dominance rate. On it, GLM-5.2 hit 74.4 against Claude Opus 4.8's 75.1 and edged out GPT-5.5 at 72.6. On SWE-bench Pro, which tests autonomous resolution of real-world GitHub issues and is scored as a pass rate, GLM-5.2 scored 62.1 to GPT-5.5's 58.6, and it cleared its predecessor GLM-5.1's 58.4 by a wide margin.
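For readers who want to check the margins themselves, here is a minimal sketch that turns the scores quoted above into point gaps and relative gaps. The scores come from the article; the `gap` helper and the dictionary layout are illustrative assumptions, not part of any benchmark tooling.

```python
# Scores as quoted in the article.
SCORES = {
    "FrontierSWE": {"GLM-5.2": 74.4, "Claude Opus 4.8": 75.1, "GPT-5.5": 72.6},
    "SWE-bench Pro": {"GLM-5.2": 62.1, "GPT-5.5": 58.6, "GLM-5.1": 58.4},
}


def gap(benchmark, model, baseline):
    """Return (point gap, gap as a percentage of the baseline score)."""
    model_score = SCORES[benchmark][model]
    baseline_score = SCORES[benchmark][baseline]
    points = model_score - baseline_score
    return points, 100 * points / baseline_score


# Hypothetical comparison list, chosen to mirror the claims in the text.
comparisons = [
    ("FrontierSWE", "GLM-5.2", "Claude Opus 4.8"),
    ("FrontierSWE", "GLM-5.2", "GPT-5.5"),
    ("SWE-bench Pro", "GLM-5.2", "GPT-5.5"),
    ("SWE-bench Pro", "GLM-5.2", "GLM-5.1"),
]

for benchmark, model, baseline in comparisons:
    points, percent = gap(benchmark, model, baseline)
    print(f"{benchmark}: {model} vs {baseline}: {points:+.1f} pts ({percent:+.1f}%)")
```

On these figures the FrontierSWE deficit to Opus 4.8 is 0.7 points, a little under 1% of the Opus score, which is where the "just 1%" framing comes from.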

The quality jump makes it the highest-ranked open-source model to date in the Artificial Analysis Intelligence Index, which aggregates the results of nine different scores to assess the overall quality of an AI model. OpenRouter's benchmarks put it in the same class as the now-banned Claude Fable 5.

The hardware used to achieve this feat is another interesting part of the story. GLM-5.2 was trained on Huawei Ascend chips, with no Nvidia anywhere in the pipeline. Emad Mostaque, founder of Stability AI, estimated total training costs at around $25 million, 80% of that in post-training, which would make it extremely cheap compared with its peers.

As Decrypt reported earlier this year, Z.ai was already training image models on Huawei's Ascend Atlas servers without a single American chip. GLM-5.2 takes that infrastructure further: it is a 744-billion-parameter mixture-of-experts model with a true 1 million-token context window, five times the 200K limit on GLM-5.1, and an MIT license that means no government directive can flip the access switch.

Tokens are the chunks of text a model can read and generate, while parameters are the internal settings and values that determine how a model processes information and generates responses.

Who it's for and what it costs

For developers, the context window is the operational shift. Whole-repo navigation, multi-file refactors, and long agentic pipelines that previously required chunking become single-call workflows. API pricing runs $1.40 per million input tokens and $4.40 per million output tokens, against Claude Opus 4.8's $5 input and $25 output. The Coding Plan starts at around $18 a month and works directly inside Claude Code, Cline, Kilo Code, and most popular agentic environments.
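To make the price gap concrete, the sketch below costs out a single long-context call at the list prices quoted above. The workload (800,000 input tokens and 20,000 output tokens) is a hypothetical example, and the calculation assumes flat per-token pricing with no caching discounts or long-context surcharges, which the article does not address.

```python
# Per-million-token list prices quoted in the article (USD).
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one API call at flat list prices."""
    price = PRICES[model]
    input_cost = (input_tokens / 1_000_000) * price["input"]
    output_cost = (output_tokens / 1_000_000) * price["output"]
    return input_cost + output_cost


# Hypothetical workload: one whole-repo refactoring call.
INPUT_TOKENS = 800_000
OUTPUT_TOKENS = 20_000

costs = {model: request_cost(model, INPUT_TOKENS, OUTPUT_TOKENS) for model in PRICES}

for model, cost in costs.items():
    print(f"{model}: ${cost:.2f} per call")

ratio = costs["Claude Opus 4.8"] / costs["GLM-5.2"]
print(f"Opus 4.8 costs about {ratio:.1f}x as much for this call")
```

Under those assumptions the call comes to roughly $1.21 on GLM-5.2 and $4.50 on Opus 4.8. The ratio shifts with the input-to-output mix, since the output price gap is wider than the input price gap.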

Local deployment is also technically possible. Unsloth AI published 2-bit GGUF quantizations that compress the model from 1.51TB down to 238GB while retaining ~82% accuracy.

Don't get too excited, though. That still means it demands 256GB of unified memory or an equivalent RAM/VRAM combo: a maxed-out M4 Ultra Mac Studio, or a workstation with a mid-range GPU and 256GB of system RAM with mixture-of-experts offloading. It's still a lot of money, but at least it's something you can buy and run at home if you really want to.
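The size figures above can be sanity-checked with some back-of-the-envelope arithmetic. This sketch assumes decimal units (1TB = 10^12 bytes, 1GB = 10^9 bytes) and uses the 744-billion parameter count quoted earlier; the variable and function names are illustrative only.

```python
PARAMS = 744e9         # parameter count quoted in the article
FULL_BYTES = 1.51e12   # full-precision checkpoint, assuming decimal terabytes
QUANT_BYTES = 238e9    # 2-bit GGUF build, assuming decimal gigabytes
MEMORY_BYTES = 256e9   # memory budget named in the article


def bits_per_param(total_bytes, params=PARAMS):
    """Average storage per parameter, in bits."""
    return total_bytes * 8 / params


full_bits = bits_per_param(FULL_BYTES)
quant_bits = bits_per_param(QUANT_BYTES)
compression = FULL_BYTES / QUANT_BYTES
headroom_gb = (MEMORY_BYTES - QUANT_BYTES) / 1e9

print(f"Full checkpoint: {full_bits:.1f} bits per parameter")
print(f"Quantized build: {quant_bits:.2f} bits per parameter")
print(f"Compression:     {compression:.1f}x smaller")
print(f"Headroom:        {headroom_gb:.0f}GB left in a 256GB budget")
```

The full checkpoint works out to about 16 bits per parameter, and the "2-bit" build averages closer to 2.6 bits. That would be consistent with some tensors being stored at higher precision, though this is an inference from the numbers and not something the article states. The roughly 18GB of headroom also has to hold the operating system and the context cache, so long contexts are unlikely to fit comfortably on a 256GB machine.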

We ran a quick test, asking GLM-5.2 to build our standard game mixing typing mechanics with a shooter. The UI wasn't the prettiest (other models generated more polished-looking interfaces), but the experience was the most varied: different scenarios across waves, enemy types that shifted, bosses appearing later in the run.

It generated more diverse game states than anything else we tested for the same task in a zero-shot setup.

If you want to play it, it's live on our Itch.io profile.

That variance points toward where GLM-5.2 makes the most economic sense. For multi-shot generation workflows and agentic pipelines where output variety matters more than polish, the math at open-source pricing is hard to argue with. For the hardest sustained tasks, such as SWE-Marathon, where it scores 13.0 against Opus 4.8's 26.0, the gap to the closed frontier is still real, and 13 points wide.

Open-source weights are live on HuggingFace under the MIT license, and the quantized weights are available there as well. GLM Coding Plan subscribers can switch now with the model string GLM-5.2, and it's also available for free testing on Z.ai with some usage constraints.
