LLM (large language model)

What is an LLM (large language model)?

Alex Grygoriev · Updated Jun 5, 2026 · 2 min read

Definition

A large language model (LLM) is a neural network trained on vast amounts of text to predict the next token in a sequence. By doing this at scale, it learns to generate, summarize, translate and reason over natural language — and it powers chatbots, AI agents and most modern AI features.

A large language model (LLM) is the technology inside ChatGPT, Claude and Gemini. It is a neural network, typically with billions of parameters, trained to predict text. That single skill carries over to a wide range of language tasks, such as answering questions, summarizing and translating.

How an LLM works

An LLM is trained on a huge corpus of text with one deceptively simple objective: predict the next token (a word or word-piece) given everything before it. Repeated across trillions of words, this teaches the model grammar, facts, styles and patterns of reasoning. At inference time it generates text one token at a time: each predicted token is appended to the input and used to predict the next one.
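
To make the predict-and-append loop concrete, here is a minimal sketch. It is not how a real LLM is built: it swaps the neural network for simple bigram counts (it looks only at the last token, where an LLM conditions on the whole context) and splits on whitespace instead of using a real tokenizer. The function names and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens follow it in the training text."""
    tokens = text.split()  # toy tokenizer: whitespace-separated words
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most frequent next token."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # this token was never followed by anything in training
        next_token = followers.most_common(1)[0][0]
        tokens.append(next_token)  # the prediction becomes part of the input
    return " ".join(tokens)

# Hypothetical toy corpus; a real model trains on vastly more text.
corpus = "the cat sat on the mat . the cat sat down ."
model = train_bigram(corpus)
print(generate(model, "the", max_new_tokens=2))  # prints "the cat sat"
```

The structure is the part that carries over: training turns text into statistics about what comes next, and generation is a loop in which every new token is fed back in as input.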

Tokens & context window

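LLMs don't read characters or words directly. They read tokens: chunks that average roughly 4 characters in English text, so a short word is often one token and a long or rare word is split into several. The context window is how many tokens the model can consider at once (its working memory), and it typically has to hold both the prompt and the generated answer. A bigger window lets it handle longer documents, but every token costs compute and money, which is why cost control matters in production.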
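
The arithmetic behind token budgets can be sketched in a few lines. Everything numeric here is an assumption for illustration: the 4-characters-per-token rule of thumb is only a rough estimate for English (a real tokenizer gives exact counts), and the context size and prices are hypothetical, not taken from any provider.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English; real tokenizers vary

def estimate_tokens(text):
    """Approximate token count from character length (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def fits_context(prompt_tokens, max_output_tokens, context_window):
    """Assumes the prompt and the generated output share one window."""
    return prompt_tokens + max_output_tokens <= context_window

def estimate_cost(prompt_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost in currency units, given hypothetical prices per 1,000 tokens."""
    return (prompt_tokens / 1000 * price_in_per_1k
            + output_tokens / 1000 * price_out_per_1k)

document = "word " * 2000                  # 10,000 characters of filler text
prompt_tokens = estimate_tokens(document)  # 2500 under the 4-char assumption

print(fits_context(prompt_tokens, 500, context_window=4096))
print(estimate_cost(prompt_tokens, 500, price_in_per_1k=0.5, price_out_per_1k=1.5))
```

Running an estimate like this before sending a request is one simple way to catch prompts that would overflow the window or cost more than expected.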

Limits & hallucinations

Because an LLM predicts plausible text rather than looking up facts, it can "hallucinate": produce confident but wrong answers. Its knowledge is also frozen at training time, so it knows nothing about events after its training cutoff. Both limits can be mitigated, though not eliminated, by retrieval-augmented generation (RAG), which retrieves real, current data and places it in the prompt to ground the model's answers.
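
The retrieve-then-ground idea can be sketched without calling any model. This is a toy under stated assumptions: relevance is scored by plain word overlap, whereas production RAG systems usually use embeddings and a vector database, and the documents, function names and prompt wording are hypothetical. The resulting prompt would then be sent to whichever LLM you use.

```python
def score(query, document):
    """Toy relevance score: number of distinct words shared with the query."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query, documents, top_k=1):
    """Return the top_k documents that overlap most with the query."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(query, documents, top_k=1):
    """Put the retrieved text in front of the question so the model can use it."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you don't know.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical company documents standing in for "your data".
docs = [
    "Our support desk is open Monday to Friday from 9:00 to 17:00.",
    "Refunds are processed within 14 days of the return request.",
]
prompt = build_grounded_prompt("When are refunds processed?", docs)
print(prompt)
```

The model still predicts text, but now the most plausible continuation is one supported by the supplied context, and the instruction to admit ignorance gives it an alternative to guessing.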

Where LLMs are used

An LLM on its own answers prompts. Wrapped with tools and a goal it becomes an AI agent; connected to your data it powers search, drafting, classification and support. The model is the engine — the value comes from how you wire it into real workflows.

Summary

An LLM is a next-token predictor trained at massive scale. Understanding tokens, the context window and hallucination is the foundation for using it responsibly — and for building production systems that stay accurate and affordable.

Frequently asked questions

What is the difference between an LLM and AI?

AI is the broad field; an LLM is one specific kind of AI specialized in language. Most of what people call "AI" today — chatbots, writing assistants, AI agents — is built on top of LLMs.

Why do LLMs make mistakes?

An LLM generates statistically likely text rather than retrieving verified facts, so it can produce confident but incorrect answers. Grounding it with retrieval (RAG) and validating its output reduces these errors, but does not eliminate them.
