Microsoft · Released 2024-12
Phi-4
Microsoft's 14B reasoning model. Hits 70B-level math and reasoning scores from a 14B base — best parameter efficiency on the leaderboard.
MIT · Commercial use OK · small · reasoning
Params (max): 14B
Variants: 14B
Context window: 16K tokens
MMLU: 84.8
HumanEval: 82.6
GSM8K: 95.2
Min VRAM (fp16, smallest variant): ~28GB
Smallest Q4 GGUF: ~9GB
Languages supported: 5
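The VRAM figures above follow from parameter count times bytes per weight: 14B params at 16 bits is roughly 28GB, and a ~4.5-bit Q4_K_M-style quant lands near 9GB. A rough sketch of that arithmetic follows; the 20% overhead factor for KV cache and activations is an assumption, not a measured figure.

```python
# Back-of-envelope memory estimate: params * bits_per_param / 8.
# The 1.2x overhead factor (KV cache, activations) is an assumption.
def model_memory_gb(params_b: float, bits_per_param: float, overhead: float = 1.2) -> float:
    weight_bytes = params_b * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

print(f"fp16: {model_memory_gb(14, 16):.0f} GB")   # ~34 GB total (~28 GB weights alone)
print(f"Q4:   {model_memory_gb(14, 4.5):.0f} GB")  # ~9 GB weights, fits a 24GB 4090 easily
```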
Pros
- ✓ Best params/perf ratio
- ✓ MIT license
- ✓ Runs on a single 4090 (quantized)
Cons
- × Short 16K context
- × English-heavy
Highlights
- ● MIT license: fully open
- ● GSM8K 95.2 from 14B params
- ● Trained heavily on synthetic data
Where to download
Hugging Face: microsoft/phi-4
Or via Ollama (ollama pull phi4) or LM Studio's in-app browser.
Homepage: https://huggingface.co/microsoft/phi-4
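If you would rather call the model from Python, a minimal sketch using the Hugging Face transformers API is below. The model id comes from the link above; the dtype, device settings, and prompt are assumptions to adapt to your hardware (bf16 needs roughly 28GB of VRAM, so on smaller GPUs use the Q4 GGUF via Ollama or llama.cpp instead).

```python
# Minimal sketch: run microsoft/phi-4 with Hugging Face transformers.
# Assumes transformers and accelerate are installed and ~28GB of GPU memory
# is available for bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # needs the accelerate package
)

# Phi-4 is a chat model, so format the prompt with its chat template.
messages = [{"role": "user", "content": "A train covers 120 km in 90 minutes. Average speed in km/h?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```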
Related reading
Best Open Source LLMs 2026: Honest Picks by Use Case
Which open-source LLM should you actually run in 2026? Honest picks by use case — frontier reasoning, coding, RAG, edge devices, multilingual.
Open Source LLM Licenses Explained: Llama vs Apache vs Gemma vs MIT
Can you use Llama in a commercial product? What does the Gemma license actually restrict? A plain-English breakdown of every major open LLM license.
Running an LLM on Your Laptop in 2026: M-Series, Quantization, and What Actually Works
Step-by-step: pick a quantization, install Ollama or LM Studio, run a 7B-14B model on a MacBook or 16GB GPU, and not lose your sanity.
Small LLMs on Edge Devices: What Runs on Phones, Pis, and Browsers in 2026
Gemma 2B runs on a Pi 5. Phi-4 runs in a browser via WebGPU. Phones run Llama 3B. A practical guide to LLMs on tiny hardware.