Greg Mousseau

Qwen 3.5: Alibaba's Open-Weight MoE Hits the Frontier

Qwen 3.5 (235B-A22B) landed in late 2025 as Alibaba's most capable open-weight release yet — a mixture-of-experts model that only activates 22B parameters per token but delivers frontier-competitive quality. The open-source tier just got dramatically better.

Model Review · Frontier Models · AI Strategy · Alibaba · Open Source

The case for open-weight models keeps getting stronger. With Qwen 3.5, Alibaba has shipped a model that isn't just "great for open-source" — it's competitive with models people pay $15–$30 per million tokens to access, and you can run it yourself.

What Makes It Work

Mixture of Experts architecture — 235B total parameters, but only ~22B activate per token. That means frontier-level quality with inference costs closer to a mid-size dense model. Self-hostable on multi-GPU setups without needing hundreds of thousands of dollars in hardware.
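The economics here are easy to sanity-check with back-of-the-envelope arithmetic. A quick sketch: the 235B-total / 22B-active figures come from the release, while the quantization bit-widths are illustrative assumptions.

```python
# Rough sizing for a 235B-total / 22B-active MoE model.
# Weight memory scales with TOTAL parameters (all experts must stay resident),
# while per-token compute scales with ACTIVE parameters only.

TOTAL_PARAMS_B = 235   # billions of parameters, from the release
ACTIVE_PARAMS_B = 22   # billions activated per token

def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Memory for the weights alone (ignores KV cache and activations)."""
    return params_billions * bits_per_param / 8  # 1B params at 8 bits ≈ 1 GB

fp16_gb = weight_memory_gb(TOTAL_PARAMS_B, 16)  # ~470 GB
int4_gb = weight_memory_gb(TOTAL_PARAMS_B, 4)   # ~117.5 GB

# Per-token compute relative to a dense model of the same total size:
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B  # ~9.4%

print(f"FP16 weights: {fp16_gb:.0f} GB | INT4 weights: {int4_gb:.1f} GB")
print(f"Active fraction per token: {active_fraction:.1%}")
```

In other words, you pay for the full 235B in memory but only ~9% of it in per-token compute, and a 4-bit quant brings the weight footprint down into multi-GPU (rather than data-center) territory.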

Quality at release:

  • Strong on coding: top-tier open-weight LiveCodeBench scores
  • Competitive on math and science reasoning
  • Context window sufficient for most production workloads
  • MMLU-Pro: 86%+ — competitive with closed frontier models

Instruction following — Qwen 3.5 follows complex, multi-part instructions reliably. One of the persistent weaknesses of earlier open-weight models was instruction drift over long conversations. 3.5 handles this better.

How It Compares at Launch

Model              Type       LiveCodeBench   MMLU-Pro
Qwen 3.5 (235B)    Open MoE   strong          ~86%
DeepSeek V3.1      Open       strong          ~85%
Llama 4 Maverick   Open MoE   solid           competitive
Claude 4 Sonnet    Closed     top tier        top tier
Gemini 3 Pro       Closed     top tier        top tier

Among open-weight models, Qwen 3.5 and DeepSeek V3.1 are neck-and-neck at the top.

Best For

  • Self-hosted deployments where data can't leave your infrastructure
  • Fine-tuning — Alibaba's Qwen series has a strong fine-tuning ecosystem
  • Cost-sensitive applications that need frontier quality
  • Multilingual workflows — Qwen has historically been among the best non-English models available
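For self-hosted deployments, the common pattern is to put the model behind an OpenAI-compatible server (vLLM, SGLang, and similar stacks all expose this shape), which keeps client code portable across backends. A minimal sketch of the request payload, assuming a hypothetical local endpoint and model id — both names are illustrative, not from the release:

```python
import json

def chat_request(model: str, user_message: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completions payload for a self-hosted server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }

# Hypothetical model id; use whatever name your inference server registers.
payload = chat_request("qwen3.5-235b-a22b",
                       "Summarize the MoE trade-off in one sentence.")
body = json.dumps(payload)
# POST `body` to e.g. http://localhost:8000/v1/chat/completions with any HTTP client.
```

Because the wire format matches the hosted APIs, swapping between a self-hosted Qwen and a closed model is usually a one-line change to the base URL and model name.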

Not For

  • Teams that need vendor support SLAs
  • Coding agents specifically — Claude Code + Opus is still ahead
  • Zero-shot vision tasks — multimodal is available but not the primary strength

Verdict

Qwen 3.5 is the second-best open-weight model available as of late November 2025 (behind DeepSeek's latest), and depending on the benchmark, it's the best for multilingual and instruction-following tasks. If you're self-hosting and need frontier quality, your shortlist is Qwen 3.5 and DeepSeek V3.1.

Part of our Model Watch series.