RAG (Retrieval-Augmented Generation)

A pattern in which documents relevant to the query are retrieved from a database (via vector or keyword search) and added to the prompt before the LLM answers. This lets you supply fresh data, internal docs, or grounding without fine-tuning. Command R+ 2 is specifically tuned for RAG; long-context models like InternLM 3 are useful for stuffing many retrieved docs into one prompt.
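
The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not a production setup: the in-memory DOCS list, the term-count scoring, and the function names (tokenize, score, retrieve, build_prompt) are all assumptions made for the example. A real system would typically swap the scoring for a vector or keyword index and send the resulting prompt to an LLM.

```python
import re
from collections import Counter

# Hypothetical in-memory "database"; a real system would query a
# vector store or keyword index instead.
DOCS = [
    "The VPN client must be updated to version 5 before March.",
    "Expense reports are due on the last Friday of each month.",
    "The cafeteria serves vegetarian meals on Tuesdays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count how often the query's terms occur in doc (crude keyword search)."""
    query_terms = set(tokenize(query))
    doc_counts = Counter(tokenize(doc))
    return sum(doc_counts[term] for term in query_terms)


def retrieve(query, docs, k=2):
    """Return up to k docs with a nonzero score, best match first."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return [d for d in ranked[:k] if score(query, d) > 0]


def build_prompt(query, docs, k=2):
    """Place the retrieved docs in the prompt ahead of the question."""
    hits = retrieve(query, docs, k)
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(hits))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # The printed string is what would be sent to the LLM.
    print(build_prompt("When are expense reports due?", DOCS))
```

Numbering the retrieved passages ([1], [2], ...) is a design choice that makes it easier to ask the model to cite which passage supports its answer.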