Glossary / Models & Architecture

Context Engineering

The discipline replacing 'prompt engineering' in 2026. It's not about the wording of your question. It's about what you put in front of the model before it answers.

Models & Architecture

The Technical Definition

Context engineering is the practice of deciding what an LLM sees in its context window before it generates an output. That includes the system prompt, the user’s question, retrieved documents, prior conversation history, examples of good answers, available tools, and anything else the model can read. The model’s response is a function of all of it — not just the question the user typed.

The discipline used to be called “prompt engineering.” That label is misleading because it implies the wording of the prompt is the lever. In practice, the lever is the assembly of context around the prompt.

What This Actually Means for Your Business

For two years, vendors and consultants sold “prompt engineering” as a skill. Companies hired prompt engineers, ran workshops, and paid for prompt libraries. Most of that work didn’t change outcomes much. The reason: the prompt was rarely the bottleneck.

The bottleneck was — and is — what the model gets to see when it answers. If your customer service AI doesn’t know the customer’s last three orders, no amount of clever prompt wording fixes that. If your research assistant retrieves the wrong document, the answer is wrong no matter how the question is phrased. If you’re asking the model to follow eight rules and only the first three fit in the context window, the others get ignored.

Context engineering is the work of solving those problems. It’s deciding which documents to retrieve and in what order. It’s choosing whether to include three examples or thirty. It’s deciding what gets summarized, what gets dropped, and what gets passed forward when the conversation gets long. It’s structuring the system prompt so the most important rules sit where the model is most likely to follow them.

The shift matters because it changes who does the work. Prompt engineering was a writing skill. Context engineering is closer to information architecture and data engineering. The teams winning here have people who understand how the model uses what it sees, not just how to phrase a clever instruction.

The other reason it matters: context engineering is where the operational cost lives. Every token in the context window costs money and adds latency. A bloated context with too many documents is slow and expensive. A starved context with too few is wrong. Tuning that tradeoff is real engineering work, not a writing exercise.

Reality Check

What the vendor says: “Our platform handles all the prompt engineering for you. Your team just asks questions in natural language.”

What that means in practice: The vendor made a set of decisions about what context to assemble for every query — which documents, which examples, how much history, in what order. Those decisions determine the quality of the output. You don’t see them. You can’t tune them. When the answers are wrong, you can’t fix it without going back to the vendor.

What Operators Actually Do

The teams doing this well have stopped treating the prompt as the artifact and started treating the context as the artifact. They version it. They test it. They monitor what’s actually in the context window when the model gets a question wrong, and they adjust the assembly logic until the failures stop.

They also instrument it. When a user complains about a bad answer, the first question isn’t “what did the user ask?” It’s “what did the model see?” The answer is often: a document that shouldn’t have been retrieved, a stale piece of conversation history, or a system prompt instruction that contradicts the new business rule someone added last month.

The other pattern: ruthless pruning. The instinct is to add more context — more documents, more examples, more rules. The discipline is to remove. Models perform better with less context that’s more relevant than with more context that’s noisy. The teams doing this well treat every token in the context window as something that has to earn its place.

The Questions to Ask

  1. What’s actually in the context window for a typical query? Have your vendor walk you through one real example end to end. The system prompt, the retrieved documents, the conversation history, the tool descriptions. If they can’t show you, they can’t tune it.

  2. Who decides what context gets assembled, and how does it change? Is it a hardcoded pipeline? A configuration file your team can edit? A black box owned by the vendor? The answer determines how much control you actually have.

  3. How do you debug a wrong answer? The right answer involves looking at what the model saw. If their debugging process is “we’ll re-train” or “we’ll adjust the prompt,” they’re not doing context engineering — they’re guessing.

Get the next Brief

One operator. Every other Wednesday.

Plus the AI Glossary and the Failure Museum.
Real names. Real numbers. Honest analysis.