# Qwen 3.5: Alibaba's Open-Weight MoE Hits the Frontier
Qwen 3.5 (235B-A22B) landed in late 2025 as Alibaba's most capable open-weight release yet — a mixture-of-experts model that activates only ~22B parameters per token yet delivers frontier-competitive quality. The open-source tier just got dramatically better.
The case for open-weight models keeps getting stronger. With Qwen 3.5, Alibaba has shipped a model that isn't just "great for open-source" — it's competitive with models people pay $15–$30 per million tokens to access, and you can run it yourself.
## What Makes It Work
**Mixture of Experts architecture** — 235B total parameters, but only ~22B activate per token. That means frontier-level quality with inference costs closer to a mid-size dense model. Self-hostable on multi-GPU setups without needing hundreds of thousands of dollars in hardware.
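The "only ~22B of 235B parameters activate" behavior comes from top-k expert routing: a small router scores every expert for each token, and only the top-scoring few actually run. Here's a minimal sketch of that general mechanism (a generic illustration, not Qwen's actual implementation — the expert count, dimensions, and k=2 are made-up toy values):

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Toy top-k mixture-of-experts layer.

    x:       (d,) hidden state for one token
    gate_w:  (d, n_experts) router weights
    experts: list of (d, d) weight matrices, one per expert

    Only the top-k experts execute, so per-token compute scales
    with k, not with the total number of experts.
    """
    logits = x @ gate_w                     # router score for each expert
    topk = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                # softmax over the selected experts only
    # Weighted sum of just the selected experts' outputs
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

y = moe_layer(x, gate_w, experts, k=2)  # only 2 of 16 experts ran
```

Scaled up, this is why a 235B-parameter model can have the serving cost profile of a ~22B dense model: total parameters set the memory footprint, but activated parameters set the per-token FLOPs.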
**Quality at release:**
- Strong on coding: top-tier open-weight LiveCodeBench scores
- Competitive on math and science reasoning
- Context window sufficient for most production workloads
- MMLU-Pro: 86%+ — competitive with closed frontier models
**Instruction following** — Qwen 3.5 follows complex, multi-part instructions reliably. One of the persistent weaknesses of earlier open-weight models was instruction drift over long conversations. 3.5 handles this better.
## How It Compares at Launch
| Model | Type | LiveCodeBench | MMLU-Pro |
|---|---|---|---|
| Qwen 3.5 (235B) | Open MoE | strong | ~86% |
| DeepSeek V3.1 | Open | strong | ~85% |
| Llama 4 Maverick | Open MoE | solid | competitive |
| Claude 4 Sonnet | Closed | top tier | top tier |
| Gemini 3 Pro | Closed | top tier | top tier |
For open-weight: Qwen 3.5 and DeepSeek V3.1 are neck-and-neck at the top.
## Best For
- Self-hosted deployments where data can't leave your infrastructure
- Fine-tuning — Alibaba's Qwen series has a strong fine-tuning ecosystem
- Cost-sensitive applications that need frontier quality
- Multilingual workflows — Qwen has historically been among the best non-English models available
## Not For
- Teams that need vendor support SLAs
- Coding agents specifically — Claude Code + Opus is still ahead
- Zero-shot vision tasks — multimodal is available but not the primary strength
## Verdict
Qwen 3.5 is the second-best open-weight model available as of late November 2025 (behind DeepSeek's latest), and depending on the benchmark, it's the best for multilingual and instruction-following tasks. If you're self-hosting and need frontier quality, your shortlist is Qwen 3.5 and DeepSeek V3.1.
*Part of our Model Watch series.*
