GLM-5.2
GLM-5.2 is Z.ai's) flagship open-weight LLM), released as full open weights on June 16, 2026 (coding subscribers got access on June 13). It is the successor to GLM-5.1 and, at release, the leading open-weight model on the Artificial Analysis Intelligence Index. The headline change is a jump from a 2
Canonical version: GLM-5.2.
GLM-5.2 is Z.ai's flagship open-weight LLM, released as full open weights on June 16, 2026 (coding subscribers got access on June 13). It is the successor to GLM-5.1 and, at release, the leading open-weight model on the Artificial Analysis Intelligence Index. The headline change is a jump from a 200K to a 1M token Context Window with stable long-horizon performance.
Architecture
- AI Mixture of Experts (MoE) architecture: ~753 billion total parameters, ~40 billion active per token
- 1M token Context Window (up from 200K on GLM-5.1)
- Up to 131,072 token maximum output length
- IndexShare: reuses the same indexer across every four sparse-attention layers, cutting per-token FLOPs by 2.9× at 1M context
- Multiple thinking effort levels (e.g. High, Max) to trade capability against speed and cost
- FP8 KV-cache quantization support
- Text-only input (Z.ai's vision models are separate and not open-weight)
Note: Artificial Analysis reports 744B total / 40B active; Z.ai's own materials cite 753B. The active-parameter count (~40B) is consistent across sources.
Performance
- Artificial Analysis Intelligence Index v4.1: 51 (#1 open-weight at release; ahead of MiniMax-M3 and DeepSeek V4 Pro at 44 each)
- SWE-Bench Pro: 62.1 (up from GLM-5.1's 58.4; Claude Opus 4.8 ≈ 69.2)
- Terminal-Bench 2.1: 81.0 (up from 63.5; Claude Opus 4.8 ≈ 85.0, GPT-5.5 ≈ 84)
- FrontierSWE: 74.4 (up from 30.5; trails Opus 4.8 by ~1%, edges out GPT-5.5 by ~1%)
- GDPval-AA v2: 1524 (ahead of MiniMax-M3 and DeepSeek V4 Pro; in line with proprietary models)
- Reasoning: AIME 2026 99.2; GPQA-Diamond 91.2; HMMT Feb 2026 92.5; HLE 40% (+12); CritPt 21% (+16)
- Agentic: MCP-Atlas 76.8; Tool-Decathlon 48.2
- Ranked #2 on Code Arena WebDev, behind only Claude Fable 5
Note: benchmark figures are largely self-reported by Z.ai or sourced from Artificial Analysis as of June 2026.
Key Capabilities
- Stable 1M-context support for large-scale implementation, automated research, performance optimization, and complex debugging
- Strong long-horizon agentic coding, the biggest gains over GLM-5.1 are in sustained, multi-step tasks rather than one-shot generation
- Function calling and reasoning support for tool-augmented agents
- Built for Agentic Engineering across long-running development workflows
Token Efficiency
A notable drawback: GLM-5.2 burns ~43k output tokens per Intelligence Index task (up from ~26k on GLM-5.1), above MiniMax-M3 (~24k) and Kimi K2.6 (35k). It still lands on the Pareto frontier of intelligence vs cost per task ($0.46/task) thanks to cheap pricing.
Training Infrastructure
- Trained with Z.ai's slime framework using parallel OPD (online policy distillation) training that merges 10+ expert models into the final model; OPD took ~2 days
- Long-horizon RL via a critic-based PPO formulation learning from individual rollouts with trajectory compaction
- Anti-Reward Hacking safeguards: rule-based filter plus LLM-based judgment during RL
Availability
- MIT License: fully permissive, no regional limits
- Weights on HuggingFace and ModelScope
- Usable via Z.ai chat, ZCode desktop agent, Claude Code, and OpenCode
- On Cloudflare Workers AI as
@cf/zai-org/glm-5.2(launched at 262,144 context, expandable toward the 1M max) - GLM Coding Plan quota: 3× peak / 2× off-peak; promotional 1× off-peak through end of September
Pricing
- Z.ai first-party API: $1.40 input / $0.26 cache-hit / $4.40 output per 1M tokens
- OpenRouter: ~$1.20 input / $4.10 output per 1M tokens
- For comparison: GPT-5.5 ≈ $5/$30, Claude Opus ≈ $5/$25, GLM-5.2 is roughly 3–6× cheaper
Deployment Requirements
- Full BF16 weights: ~1.51 TB
- Q4_K_M (4-bit): ~476 GB, multi-GPU datacenter hardware (2× A100 80GB or 4× RTX 6000 Ada)
- 2-bit dynamic (Unsloth UD-IQ2_XXS): ~241 GB, runnable on a 256GB+ Mac Studio at 3–9 tokens/sec
- 1-bit dynamic: ~176 GB but quality degrades too far to be useful
- Supported frameworks: transformers, vLLM, SGLang, xLLM, ktransformers
- Practical reality: outside a ~$9,500 256GB+ Mac Studio (single-digit tokens/sec at 2-bit), this is a rent-or-API model, not a home-setup model
References
- https://huggingface.co/blog/zai-org/glm-52-blog
- https://huggingface.co/zai-org/GLM-5.2
- https://simonwillison.net/2026/Jun/17/glm-52/
- https://artificialanalysis.ai/models/glm-5-2
- https://artificialanalysis.ai/articles/glm-5-2-is-the-new-leading-open-weights-model-on-the-artificial-analysis-intelligence-index
- https://openrouter.ai/z-ai/glm-5.2
- https://developers.cloudflare.com/changelog/post/2026-06-16-glm-52-workers-ai/
- https://vettedconsumer.com/glm-5-2-the-most-powerful-open-weight-model-yet-and-the-brutal-reality-of-running-it-locally/
- https://www.latent.space/p/ainews-glm-52-the-top-frontend-coding
- https://www.youtube.com/watch?v=V1EPXfZV0Ew
- https://news.ycombinator.com/item?id=48567759
- https://news.ycombinator.com/item?id=48567004
Related
- GLM-5.1
- Zhipu AI (Z.ai)
- Artificial Intelligence (AI)
- Large Language Models (LLMs)
- AI Mixture of Experts (MoE)
- AI Open Weight Models
- Agentic Engineering
- Context Window
- MIT License
- Kimi K2.6
- Claude Opus 4.8
- Claude Fable 5
- GPT-5.5
About Sébastien
I'm Sébastien Dubois, and I'm on a mission to help knowledge workers escape information overload. After 20+ years in IT and seeing too many brilliant minds drowning in digital chaos, I've decided to help people build systems that actually work. Through the Knowii Community, my courses, products & services and my Website/Newsletter, I share practical and battle-tested systems.
I write about Knowledge Work, Personal Knowledge Management, Note-taking, Lifelong Learning, Personal Organization, Productivity, and more. I also craft lovely digital products and tools.
If you want to follow my work, then become a member and join our community.
Ready to get to the next level?
If you're tired of information overwhelm and ready to build a reliable knowledge system:
- 📚 KM for Beginners — 10+ hours of structured video lessons
- 🚀 Obsidian Starter Kit — Ready-made vault with 40+ templates
- 💼 Knowledge Worker Kit — Complete guides + lifetime community
- 🦉 1-on-1 Coaching — Personalized guidance
- 🎯 Join Knowii — Community + ALL courses & tools
Found this valuable? Share it with someone who needs it.