# Qwen 3.5: Alibaba's Open-Weight MoE Hits the Frontier
Qwen 3.5 (235B-A22B) landed in late 2025 as Alibaba's most capable open-weight release yet — a mixture-of-experts model that activates only ~22B parameters per token yet delivers frontier-competitive quality. The open-source tier just got dramatically better.
The case for open-weight models keeps getting stronger. With Qwen 3.5, Alibaba has shipped a model that isn't just "great for open-source" — it's competitive with models people pay $15–$30 per million tokens to access, and you can run it yourself.
## What Makes It Work
**Mixture of Experts architecture** — 235B total parameters, but only ~22B activate per token. That means frontier-level quality with inference costs closer to a mid-size dense model. Self-hostable on multi-GPU setups without needing hundreds of thousands of dollars in hardware.
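The "only ~22B of 235B parameters activate" behavior comes from top-k expert routing: a small router scores every expert for each token, and only the top-scoring few actually run. Here's a minimal sketch of that general mechanism (a generic illustration, not Qwen's actual implementation — the expert count, dimensions, and k=2 are made-up toy values):

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Toy top-k mixture-of-experts layer.

    x:       (d,) hidden state for one token
    gate_w:  (d, n_experts) router weights
    experts: list of (d, d) weight matrices, one per expert

    Only the top-k experts execute, so per-token compute scales
    with k, not with the total number of experts.
    """
    logits = x @ gate_w                     # router score for each expert
    topk = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                # softmax over the selected experts only
    # Weighted sum of just the selected experts' outputs
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

y = moe_layer(x, gate_w, experts, k=2)  # only 2 of 16 experts ran
```

Scaled up, this is why a 235B-parameter model can have the serving cost profile of a ~22B dense model: total parameters set the memory footprint, but activated parameters set the per-token FLOPs.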
**Quality at release:**
- Strong on coding: top-tier open-weight LiveCodeBench scores
- Competitive on math and science reasoning
- Context window sufficient for most production workloads
- MMLU-Pro: 86%+ — competitive with closed frontier models
**Instruction following** — Qwen 3.5 follows complex, multi-part instructions reliably. One of the persistent weaknesses of earlier open-weight models was instruction drift over long conversations. 3.5 handles this better.
## How It Compares at Launch
| Model | Type | LiveCodeBench | MMLU-Pro |
|---|---|---|---|
| Qwen 3.5 (235B) | Open MoE | strong | ~86% |
| DeepSeek V3.1 | Open | strong | ~85% |
| Llama 4 Maverick | Open MoE | solid | competitive |
| Claude 4 Sonnet | Closed | top tier | top tier |
| Gemini 3 Pro | Closed | top tier | top tier |
For open-weight: Qwen 3.5 and DeepSeek V3.1 are neck-and-neck at the top.
## Best For
- Self-hosted deployments where data can't leave your infrastructure
- Fine-tuning — Alibaba's Qwen series has a strong fine-tuning ecosystem
- Cost-sensitive applications that need frontier quality
- Multilingual workflows — Qwen has historically been among the best non-English models available
## Not For
- Teams that need vendor support SLAs
- Coding agents specifically — Claude Code + Opus is still ahead
- Zero-shot vision tasks — multimodal is available but not the primary strength
## Verdict
Qwen 3.5 is the second-best open-weight model available as of late November 2025 (behind DeepSeek's latest), and depending on the benchmark, it's the best for multilingual and instruction-following tasks. If you're self-hosting and need frontier quality, your shortlist is Qwen 3.5 and DeepSeek V3.1.
*Part of our Model Watch series.*
