GGUF Format

The standard file format for quantized LLMs, used by llama.cpp, Ollama, LM Studio, and most other local inference tools. A .gguf file packs the model weights, tokenizer, and metadata into a single cross-platform binary, so one file is all a tool needs to load the model. It is the successor to the older GGML format. If you're running a local LLM in 2026, you're probably using GGUF.
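
To make the "one binary" idea concrete, here is a minimal sketch of reading the fixed-size header at the start of a .gguf file. It assumes a little-endian file at GGUF version 2 or later, where the header is the 4-byte magic "GGUF", a 32-bit version, and two 64-bit counts (tensors and metadata key/value pairs). The function names are illustrative and do not come from any library.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_FORMAT = "<4sIQQ"  # assumed layout: magic, version, tensor count, metadata KV count
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def parse_gguf_header(raw: bytes) -> dict:
    """Decode the fixed-size GGUF header from the first bytes of a file."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("not enough bytes for a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack(
        HEADER_FORMAT, raw[:HEADER_SIZE]
    )
    if magic != GGUF_MAGIC:
        raise ValueError("missing GGUF magic; this is not a GGUF file")
    if version < 2:
        # Assumption: version 1 stored its counts as 32-bit values,
        # so this layout would misread them.
        raise ValueError("GGUF version 1 is not handled by this sketch")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def read_gguf_header(path: str) -> dict:
    """Read and decode the header of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return parse_gguf_header(f.read(HEADER_SIZE))
```

Calling read_gguf_header on a model file returns the format version and the two counts. The metadata entries and tensor data that follow the header are not parsed here.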