Neural Network

What vendors mean: the magic that makes AI smart. What it actually means: a stack of math that turns inputs into outputs by adjusting millions of small weights until the answers stop being wrong.

Models & Architecture

The Technical Definition

A neural network is a mathematical model loosely inspired by how neurons fire in a brain. It’s a stack of layers, each containing a set of numbers called weights. Data goes in at one end, gets multiplied and added through the layers, and a prediction comes out the other end. During training, the network compares its prediction to the correct answer and nudges the weights to make the next prediction less wrong. Repeat that a few billion times and you have a model that can recognize faces, translate languages, or write coherent text.
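For readers who want to see the loop rather than take it on faith, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any production system is built: the task (learning logical OR), the two-layer shape, the learning rate, and every function name are made up for this example.

```python
import math
import random

# Hypothetical toy task: learn logical OR. The trailing 1.0 is a constant bias input.
EXAMPLES = [
    ((0.0, 0.0, 1.0), 0.0),
    ((0.0, 1.0, 1.0), 1.0),
    ((1.0, 0.0, 1.0), 1.0),
    ((1.0, 1.0, 1.0), 1.0),
]

def init_weights(n_inputs, n_hidden, rng):
    """Start with small random weights; the network knows nothing yet."""
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w_hidden, w_out

def forward(x, w_hidden, w_out):
    """Data goes in, gets multiplied and added through two layers, a prediction comes out."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    prediction = sum(w * h for w, h in zip(w_out, hidden))
    return hidden, prediction

def mean_squared_error(examples, w_hidden, w_out):
    """Average squared gap between the prediction and the correct answer."""
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in examples) / len(examples)

def train(examples, w_hidden, w_out, rng, steps=2000, learning_rate=0.05):
    """Compare each prediction to the answer and nudge every weight to shrink the error."""
    for _ in range(steps):
        x, target = rng.choice(examples)
        hidden, prediction = forward(x, w_hidden, w_out)
        error = prediction - target
        for j, h in enumerate(hidden):
            # Both gradients are computed before either layer's weights change.
            grad_out = error * h
            grad_hidden = error * w_out[j] * (1.0 - h * h)
            w_out[j] -= learning_rate * grad_out
            for i, xi in enumerate(x):
                w_hidden[j][i] -= learning_rate * grad_hidden * xi

rng = random.Random(0)
w_hidden, w_out = init_weights(n_inputs=3, n_hidden=3, rng=rng)
before = mean_squared_error(EXAMPLES, w_hidden, w_out)
train(EXAMPLES, w_hidden, w_out, rng)
after = mean_squared_error(EXAMPLES, w_hidden, w_out)
print(f"error before training: {before:.3f}, after: {after:.3f}")
```

This toy has 12 weights and sees 2,000 examples. The systems vendors sell run the same loop with millions or billions of weights and far more data.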

Most modern AI systems you’ve been pitched are neural networks underneath. LLMs and image generators always are. Fraud detectors and demand forecasters often are, although some still run on older statistical or tree-based models. Among neural networks, the differences are in the architecture (how the layers are arranged), the training data, and the scale.

What This Actually Means for Your Business

Neural networks are the math under everything. When a vendor says “our AI” or “our deep learning platform” or “our proprietary algorithm,” they almost certainly mean a neural network. The interesting questions are never about whether they use neural networks. They are about what kind, trained on what data, run on what infrastructure.

Here’s what most CEOs don’t realize: the same architectural family powers your fraud detection model and ChatGPT. The difference is scale and specialization. A fraud model might have a few million parameters and be trained on your transaction history. A frontier LLM has hundreds of billions of parameters and was trained on a meaningful slice of the public internet. Both are neural networks. They are not the same product.

This matters operationally because the size and training of the network determine the cost, the latency, and the failure modes. A small purpose-trained network is cheap to run, fast to respond, and fails in narrow, predictable ways. A giant general-purpose network is expensive per query, slower, and fails in creative, unpredictable ways, including making things up that sound plausible.

Vendors will often blur this distinction on purpose. They’ll wave at “AI” without telling you whether they’re using a 50MB classifier they trained themselves or routing every request to GPT-4. The pricing implications, the data privacy implications, and the reliability implications are completely different. You need to know which one you’re buying.
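A rough way to sanity-check a size claim like "50MB classifier" is to convert between file size and parameter count. The sketch below assumes each weight is stored as a 32-bit float (4 bytes) and counts a megabyte as one million bytes. Real models may use smaller number formats, and the layer sizes in the example are hypothetical.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus one bias per unit for a fully connected network."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def approx_size_mb(parameter_count, bytes_per_parameter=4):
    """Storage estimate, assuming 4-byte (float32) weights unless told otherwise."""
    return parameter_count * bytes_per_parameter / 1_000_000

# Hypothetical small classifier: 30 input features, two hidden layers, one output.
small = dense_parameter_count([30, 64, 32, 1])
print(small, "parameters")

# Under the same assumption, 12.5 million parameters is about 50 MB,
# and 100 billion parameters is about 400,000 MB (400 GB).
print(approx_size_mb(12_500_000), "MB")
print(approx_size_mb(100_000_000_000), "MB")
```

The gap of several orders of magnitude is why a small classifier can run on a laptop while a frontier model needs a rack of specialized hardware.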

The other thing CEOs underestimate: neural networks are not deterministic in the way traditional software is. Generative models in particular pick each output by sampling from a set of probabilities, so the same input can produce slightly different outputs from one run to the next. Your finance team’s spreadsheet always returns the same number. A neural network won’t necessarily, and the gap between “almost always right” and “always right” is where most enterprise AI projects either succeed quietly or fail loudly.
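Where the variation comes from can be shown in a few lines. A network typically produces a score for each possible output, and the system then either takes the top score every time or draws from the scores as probabilities. The sketch below is a simplified illustration: the option labels, the scores, and the function names are invented, and real systems layer more machinery on top.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_greedy(options, scores):
    """Always take the highest-scoring option: same input, same output."""
    return options[scores.index(max(scores))]

def pick_sampled(options, scores, temperature, rng):
    """Draw an option at random, weighted by probability: output can vary."""
    return rng.choices(options, weights=softmax(scores, temperature), k=1)[0]

options = ["approve", "review", "decline"]  # hypothetical outputs
scores = [2.0, 1.5, 0.5]                    # made-up model scores

rng = random.Random()
print("greedy :", [pick_greedy(options, scores) for _ in range(5)])
print("sampled:", [pick_sampled(options, scores, 1.0, rng) for _ in range(5)])
```

If a vendor's product needs repeatable answers, ask whether it uses the greedy style of selection or samples, and whether you can control that setting.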

Reality Check

What the vendor says: “Our deep neural network learns from your data to deliver personalized predictions.”

What that means in practice: They have a model architecture (probably borrowed from a published paper), they will fine-tune it on a sample of your data, and the predictions will be statistical guesses with confidence scores. “Learns” is a verb that hides a lot of human labeling, data cleaning, and engineering work. Ask who does that work and where the data lives while it happens.

What Operators Actually Do

The companies getting durable value from neural networks treat them as components, not products. They identify a narrow task with clean inputs and a measurable output — flagging anomalous invoices, predicting which equipment will fail next quarter, classifying inbound support tickets — and they build or buy a network specifically for that task. They measure the baseline (what humans or rules-based systems get right) and they only ship the network if it beats the baseline by a margin that justifies the operational cost.
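The ship-or-don't-ship rule described above can be written down as a simple check. This is a sketch under assumptions: the ten labeled examples are invented, accuracy stands in for whatever metric fits the task, and the required margin is a business decision that no code can supply.

```python
def accuracy(predictions, labels):
    """Share of predictions that match the known correct answers."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def should_ship(model_accuracy, baseline_accuracy, required_margin):
    """Ship only if the network beats the baseline by at least the agreed margin."""
    return model_accuracy - baseline_accuracy >= required_margin

# Hypothetical evaluation set: 1 = anomalous invoice, 0 = normal.
labels      = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rules_based = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]  # existing rules engine
network     = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]  # candidate neural network

baseline_acc = accuracy(rules_based, labels)
network_acc = accuracy(network, labels)
print(baseline_acc, network_acc, should_ship(network_acc, baseline_acc, required_margin=0.05))
```

Ten examples is far too few for a real decision. The point is the order of operations: measure the baseline first, set the margin before you see the network's results, then compare.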

The companies that struggle are the ones that buy a “neural network platform” and go looking for problems. The technology is real. The use cases are specific. Generic platforms with no specific problem to solve generally produce generic results no one trusts.

The Questions to Ask

  1. What kind of neural network is this, and how big? A 100M-parameter classifier and a 100B-parameter LLM are both “neural networks” but they have nothing else in common operationally. Get specific about architecture, parameter count, and whether it’s hosted by the vendor or by a third party.

  2. What was it trained on, and what does it actually predict? Training data determines the network’s blind spots. If their fraud model was trained on retail transactions, it won’t perform the same on B2B invoices. Ask for the training data description in writing.

  3. How does this network fail, and how will I know? Every neural network is wrong some percentage of the time. The vendor should be able to tell you the error rate on a held-out test set, what categories of errors are most common, and what monitoring you’ll have in production to catch drift.
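For question 3, it helps to know what a minimal answer looks like. The sketch below computes an error rate on a held-out test set and ranks the most common kinds of mistake. The ticket categories and predictions are invented for illustration, and a real report would cover far more examples and track the numbers over time to catch drift.

```python
from collections import Counter

def error_report(predictions, labels):
    """Return the overall error rate and the (actual, predicted) mistakes, most common first."""
    mistakes = Counter(
        (actual, predicted)
        for predicted, actual in zip(predictions, labels)
        if predicted != actual
    )
    error_rate = sum(mistakes.values()) / len(labels)
    return error_rate, mistakes.most_common()

# Hypothetical held-out support tickets the network never saw during training.
labels      = ["billing", "billing", "outage", "outage", "other", "billing", "outage", "other"]
predictions = ["billing", "other", "outage", "billing", "other", "other", "outage", "other"]

rate, ranked = error_report(predictions, labels)
print(f"error rate: {rate:.1%}")
for (actual, predicted), count in ranked:
    print(f"{count}x: was {actual}, predicted {predicted}")
```

A vendor who cannot produce this kind of breakdown for their own model has probably not measured it.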
