What it means
An LLM is a large language model: a neural network trained on hundreds of billions to trillions of words of internet text and books. The training objective is simple: given some text, predict the next word (more precisely, the next token). At sufficient scale, this produces models that can answer questions, write code, summarise documents, and hold conversations.
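The objective itself can be illustrated without any neural network. The sketch below uses simple bigram counts over a toy corpus to show "given some text, predict what comes next"; the corpus and function names are invented for illustration, and a real LLM replaces the counting with a learned network over tokens.

```python
from collections import Counter, defaultdict

# Toy illustration of the training objective: given the preceding word,
# predict the next one. This counts bigrams in a tiny made-up corpus;
# real LLMs learn the same objective with a neural network over tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return bigram_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" twice, more than any other word
```

Scaling this idea up, with tokens instead of words and a deep network instead of a count table, is what makes the capabilities above emerge.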
Prominent production-grade LLMs include GPT-4 (OpenAI), Claude (Anthropic), Gemini (Google), and the open-weight Llama family (Meta). They differ in tone, accuracy, cost, and latency, but the basic architecture and behaviour are similar.
Why it matters
LLMs are the engine behind every modern AI agent. Choosing which LLM to use matters: it affects how natural the agent sounds, how often it hallucinates, how well it follows instructions, how fast it replies, and how much it costs to run.
For many production deployments, the choice comes down to Claude (strong at instruction-following and tone) or GPT-4 (broadest capability, largest ecosystem). The right answer depends on the use case.
Example
A clinic's AI agent runs on Claude because the conversations require a calm, careful tone and the cost-per-message is acceptable for their volume. A high-volume e-commerce brand runs on GPT-4o-mini for raw cost savings, accepting slightly less polished output. Both are LLMs built on the same basic architecture; only the trade-offs differ.
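The cost side of that trade-off is simple arithmetic once you know a model's per-token prices. The sketch below compares cost-per-message for two models; the prices and model names are hypothetical placeholders, not real published rates, so check each provider's current pricing before relying on the numbers.

```python
# Back-of-envelope cost-per-message comparison.
# Prices are HYPOTHETICAL placeholders (USD per million tokens, input/output),
# not real provider rates.
PRICE_PER_MTOK = {
    "large-model": (3.00, 15.00),
    "small-model": (0.15, 0.60),
}

def cost_per_message(model, input_tokens, output_tokens):
    """Dollar cost of one exchange, given token counts in each direction."""
    in_price, out_price = PRICE_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A typical support exchange: ~800 input tokens (system prompt plus history)
# and ~200 output tokens (the reply).
for model in PRICE_PER_MTOK:
    print(model, round(cost_per_message(model, 800, 200), 6))
```

Under these assumed prices the small model is roughly 20x cheaper per message, which is why high-volume deployments often accept its less polished output.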