Key facts
- Z.AI's GLM-5.2 model closely rivals Claude Opus 4.8 on the FrontierSWE benchmark, scoring 74.4 compared to Opus's 75.1.
- GLM-5.2 outperforms GPT-5.5 on both FrontierSWE (74.4 vs 72.6) and SWE-bench Pro (62.1 vs 58.6).
- The model was trained exclusively on Huawei Ascend chips, with no NVIDIA hardware used.
- GLM-5.2 boasts a 1 million-token context window and is released under an MIT license.
- Quantized versions of GLM-5.2 are available, reducing its size to 238GB but still requiring 256GB of RAM or VRAM for local use.
Z.AI, a Beijing-based lab, has released its new large language model, GLM-5.2, which rivals top-tier models like Anthropic's Claude Opus 4.8 and surpasses GPT-5.5 on key benchmarks. The model scored 74.4 on the FrontierSWE benchmark, 0.7 points (about 1%) behind Claude Opus 4.8's 75.1 and ahead of GPT-5.5, which scored 72.6. On SWE-bench Pro, GLM-5.2 scored 62.1, exceeding GPT-5.5's 58.6.
GLM-5.2 was trained entirely on Huawei Ascend chips, with no NVIDIA hardware used at any stage. Z.AI has been on the U.S. Entity List since January 2025, so the result highlights a growing trend of frontier AI development outside the American technology stack. The company's stock has risen 90% following the release and the ban of Anthropic Fable.
GLM-5.2 is a 744-billion-parameter mixture-of-experts model with a 1 million-token context window, a fivefold increase over its predecessor, GLM-5.1. It is released under an MIT license, which imposes no regional restrictions on use. Its performance places it first among open-source models on the Artificial Analysis Intelligence Index.
For developers, the expanded context window enables more complex tasks such as whole-repo navigation and multi-file refactors in a single call. API pricing is $1.40 per million input tokens and $4.40 per million output tokens, well below Claude Opus 4.8's $5 per million input tokens and $25 per million output tokens. A GLM Coding Plan is available starting at $18 per month.
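To make the price gap concrete, here is a minimal Python sketch that applies the per-million-token rates quoted above to a single call. The workload (an 800,000-token whole-repo prompt with a 20,000-token reply) is a hypothetical example, and the sketch assumes flat rates with no caching discounts or long-context surcharges, which the figures above do not address.

```python
# USD per million tokens, as quoted in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at the quoted flat rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: whole-repo prompt plus a short reply.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 800_000, 20_000):.2f}")
```

Under these assumptions the call costs about $1.21 on GLM-5.2 and $4.50 on Claude Opus 4.8, a bit under a fourfold difference for an input-heavy request. The gap widens for output-heavy workloads, since the output rates differ by a larger factor than the input rates.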
Local deployment is technically feasible thanks to 2-bit GGUF quantizations released by Unsloth AI, which reduce the model's size from 1.51TB to 238GB while retaining approximately 82% accuracy. Even so, running it requires 256GB of memory (unified memory, RAM, or VRAM). In testing, GLM-5.2 generated diverse game scenarios, showing its strength in multi-shot generation workflows and agentic pipelines where output variety matters more than polish. On highly demanding sustained tasks such as SWE-Marathon, it scored 13.0 against Opus 4.8's 26.0, indicating that a gap with closed frontier models remains.
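The quantization figures above can be sanity-checked with simple arithmetic. The sketch below derives the compression ratio, the effective bits per parameter, and the memory headroom from the quoted numbers. It assumes decimal units (1TB = 1000GB) and treats file size as pure weight storage, ignoring metadata, so the results are rough estimates only.

```python
PARAMS_BILLION = 744   # total parameters, in billions (from the article)
FULL_SIZE_GB = 1510    # 1.51TB, assuming 1TB = 1000GB
QUANT_SIZE_GB = 238    # 2-bit GGUF quantization
MEMORY_GB = 256        # stated memory requirement


def bits_per_param(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter (GB * 8 bits / billions of params)."""
    return size_gb * 8 / params_billion


def compression_ratio(full_gb: float, quant_gb: float) -> float:
    """How many times smaller the quantized file is than the original."""
    return full_gb / quant_gb


def headroom_gb(memory_gb: float, model_gb: float) -> float:
    """Memory left for the KV cache and runtime once weights are loaded."""
    return memory_gb - model_gb


if __name__ == "__main__":
    print(f"compression: {compression_ratio(FULL_SIZE_GB, QUANT_SIZE_GB):.2f}x")
    print(f"full:  {bits_per_param(FULL_SIZE_GB, PARAMS_BILLION):.2f} bits/param")
    print(f"quant: {bits_per_param(QUANT_SIZE_GB, PARAMS_BILLION):.2f} bits/param")
    print(f"headroom: {headroom_gb(MEMORY_GB, QUANT_SIZE_GB)} GB")
```

On these assumptions the full checkpoint works out to roughly 16 bits per parameter and the quantized one to about 2.6, a reduction of around 6.3x. The average sits above 2 bits, which would fit a scheme that keeps some tensors at higher precision, though the article does not describe the scheme. Only about 18GB of a 256GB machine is left for the KV cache and runtime, which may limit how much of the 1 million-token context is usable locally.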
Open-source weights for GLM-5.2 are available on Hugging Face under the MIT license, and quantized weights are available as well. GLM Coding Plan subscribers can use the model string GLM-5.2, and free testing is available on Z.AI with usage constraints.
