Gemma 3 vs Phi-4
Two of the best small open models, compared. Gemma 3 ships in 2B, 9B, and 27B sizes; Phi-4 is 14B-only but reasoning-heavy.
| | Gemma 3 | Phi-4 |
|---|---|---|
| Org | Google DeepMind | Microsoft |
| Released | 2025-11 | 2025-12 |
| Max params | 27B | 14B |
| Variants | 27B/9B/2B | 14B |
| Context | 128K | 16K |
| License | Gemma | MIT |
| Commercial use | yes | yes |
| MMLU | 78.5 | 84.8 |
| HumanEval | 71.2 | 82.6 |
| GSM8K | 86.5 | 95.2 |
| Languages | 35 | 5 |
| Min VRAM (smallest) | 6GB | 8GB |
| Vision | Yes | No |
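The "Min VRAM" row can be sanity-checked with a back-of-envelope estimate: weight memory is roughly parameter count times bytes per parameter, plus runtime overhead for the KV cache and activations. The helper below is an illustrative sketch, not an official sizing tool; the 1.2× overhead factor is an assumption, and real requirements vary with quantization scheme, context length, and inference runtime.

```python
def vram_estimate_gb(params_billion: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for loading model weights.

    params_billion: parameter count in billions
    bits: precision of the stored weights (4 = 4-bit quantized, 16 = fp16)
    overhead: multiplier for KV cache / activations (assumed, not measured)
    """
    weight_gb = params_billion * (bits / 8)  # bytes per param -> GB per billion params
    return weight_gb * overhead

# Phi-4 14B, 4-bit quantized: ~8.4 GB, in line with the 8GB table entry
print(round(vram_estimate_gb(14, bits=4), 1))

# Gemma 3 2B at fp16: ~4.8 GB, roughly matching the 6GB minimum above
print(round(vram_estimate_gb(2, bits=16), 1))
```

Longer contexts inflate the KV cache well beyond this estimate, so treat the table's minimums as floors for short prompts.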
Verdict
Gemma 3 27B and Phi-4 14B trade blows. Phi-4 wins on math and reasoning (GSM8K 95.2 vs 86.5; MMLU 84.8 vs 78.5). Gemma 3 wins on language coverage (35 vs 5) and context window (128K vs 16K), and offers a 2B variant for edge devices. Licensing also differs: Phi-4 is MIT (the cleanest option), while Gemma 3 ships under the more restrictive Gemma license. Pick Phi-4 for one task done well; pick Gemma 3 for breadth.
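The verdict's decision logic can be distilled into a few rules. This is an illustrative sketch based only on the comparison above; the function name and thresholds are assumptions, and the 16K/5-language cutoffs come straight from the table rows.

```python
def pick_model(needs_vision: bool = False,
               n_languages: int = 1,
               context_tokens: int = 0,
               math_heavy: bool = False) -> str:
    """Toy model selector distilled from the comparison table.

    Returns "Gemma 3" when a hard requirement rules Phi-4 out
    (no vision support, 5-language coverage, 16K context limit);
    otherwise defaults to Phi-4 for its stronger benchmark scores.
    """
    if needs_vision:               # Phi-4 has no vision support
        return "Gemma 3"
    if n_languages > 5:            # Phi-4 covers only 5 languages
        return "Gemma 3"
    if context_tokens > 16_000:    # Phi-4 caps out at 16K context
        return "Gemma 3"
    if math_heavy:                 # GSM8K 95.2 vs 86.5 favors Phi-4
        return "Phi-4"
    return "Phi-4"                 # default: higher MMLU/HumanEval at 14B

print(pick_model(math_heavy=True))        # Phi-4
print(pick_model(context_tokens=100_000)) # Gemma 3
```

Real deployments weigh latency, hardware, and licensing too, but the hard constraints (vision, languages, context) are the cleanest tiebreakers.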