Machine Learning

What vendors mean: AI. What it actually means: software that gets its rules from data instead of from a programmer — which sounds magical until you have to maintain it.


The Technical Definition

Machine learning is the branch of computer science where systems learn patterns from data rather than following hand-written rules. Instead of a programmer writing “if A then B,” the system is shown thousands of examples of A and B and figures out the relationship on its own. The output is a model — a set of learned parameters — that can then be applied to new inputs.

Machine learning is the umbrella. Inside it sit several approaches: supervised learning (learn from labeled examples), unsupervised learning (find structure in unlabeled data), and reinforcement learning (learn by trial and error from reward signals). Deep learning is a subset of machine learning built on multi-layer neural networks. Large language models are a subset of deep learning. Generative AI is mostly built on deep learning models, LLMs included. The terms aren’t interchangeable, even though vendors use them as if they were.

What This Actually Means for Your Business

Most of the AI you’ve been running for years — fraud detection, demand forecasting, recommendation engines, churn models, ad targeting — is machine learning, not generative AI. The current hype cycle is about generative models, but the boring, durable, profitable applications of AI in the enterprise have been running on classic machine learning for a decade.

This matters because vendors are now slapping “AI-powered” on products that have always been machine-learning-powered, and slapping “machine learning” on products that are actually generative AI. The categories blur in marketing. They don’t blur in operations.

The operational profile of a machine learning system is genuinely different from that of a generative AI system. ML models are trained once on a fixed dataset, deployed, and then they drift: the data they see in production stops looking like the data they were trained on. Drift is the quiet killer. Your fraud model was trained on 2024 data; it’s now 2026 and fraud patterns have moved. Your demand forecast was tuned to pre-pandemic seasonality; the seasonality has shifted. The model is still running. The model is still producing predictions. The predictions are no longer accurate. Nobody notices until a quarter goes sideways.
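
One common way to put a number on drift is the Population Stability Index (PSI), which compares how a feature was distributed at training time with how it is distributed in live traffic. The sketch below is a minimal illustration, not anything a particular vendor ships: the function name, the ten-bucket default, and the 0.1 / 0.25 thresholds in the comments are assumptions (the thresholds are a widely used rule of thumb, not a standard).

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a training-time sample (expected) and a live sample (actual)."""
    lo, hi = min(expected), max(expected)
    # Equal-width buckets over the training range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp live values outside the training range
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: transaction amounts at training time vs. today.
training_amounts = [float(v) for v in range(100)]
live_amounts = [v + 50.0 for v in training_amounts]

score = population_stability_index(training_amounts, live_amounts)
# Rule of thumb (an assumption, tune per feature):
# below 0.1 is stable, 0.1 to 0.25 is worth a look, above 0.25 is a retraining conversation.
print(f"PSI = {score:.2f}")
```

A check like this only tells you the inputs have moved. Whether the predictions got worse still has to be confirmed against actual outcomes once they arrive.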

The maintenance cost of a machine learning system is therefore not the cost of building it. It’s the cost of monitoring it, retraining it, and re-validating it on a schedule. A vendor who pitches you an ML solution without a clear answer on retraining cadence is selling you a system that will silently degrade. That’s not theoretical. It’s the most common failure mode in enterprise ML.

The other dimension: machine learning models are often more useful than generative AI for narrow, high-volume, structured tasks. Predicting which customers will churn next month. Routing tickets to the right team. Flagging anomalous transactions. The model is small, fast, cheap, and explainable enough to defend in an audit. The temptation to replace it with an LLM call is real and almost always wrong on cost. Use the right tool.

The data dimension is also worth naming. Most enterprise machine learning is supervised, and supervised learning eats labeled training data. If you don’t have that data (clean, plentiful, and representative), you don’t have a machine learning project. You have a data project that has to come first. Vendors will not say this in the proposal. They will say it in the change order three months later.

Reality Check

What the vendor says: “Our machine learning platform automatically improves over time as it processes more of your data.”

What that means in practice: It will only improve if someone is monitoring performance, flagging drift, retraining the model on new data, and validating the new version before promoting it. “Automatically” is doing a lot of work in that sentence. Ask who’s responsible for each of those steps.
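
The "validate before promoting" step can be as simple as a gate that a retrained model has to pass before it replaces the one in production. The sketch below is one possible shape for that gate under stated assumptions: the `Evaluation` record, the `should_promote` function, the version labels, and the one-point minimum gain are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    version: str
    accuracy: float    # measured on a recent, held-out labeled sample
    validated_by: str  # who signed off; empty string if nobody has


def should_promote(
    candidate: Evaluation, production: Evaluation, min_gain: float = 0.01
) -> bool:
    """Promote only if a named person validated the candidate and it beats production."""
    if not candidate.validated_by:
        return False  # "automatically" is not an owner
    return candidate.accuracy >= production.accuracy + min_gain


current = Evaluation(version="2024-q4", accuracy=0.78, validated_by="data-science")
retrained = Evaluation(version="2026-q1", accuracy=0.80, validated_by="")

# Better numbers, but nobody has signed off, so the gate stays closed.
print(should_promote(retrained, current))
```

The point of the `validated_by` field is the same as the question above: every step needs a name next to it.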

What Operators Actually Do

Companies running machine learning well treat the model as one piece of a longer system. Monitoring is not optional. Retraining is on a calendar. Performance is reviewed against business outcomes, not just technical metrics. The team owning the model — usually data science with a data engineering counterpart — has explicit accountability for the model’s quality in production.

Smart operators also pick the simplest model that solves the problem. A logistic regression that predicts churn at 78% accuracy, runs in milliseconds, and is fully explainable beats a deep neural network that hits 81% and that nobody can debug. The marginal three points of accuracy do not survive contact with the legal team’s audit request. Complexity has a cost beyond compute.
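
To make "fully explainable" concrete: a logistic regression score is a weighted sum pushed through one formula, so every prediction can be broken down feature by feature. The sketch below assumes a tiny churn model with made-up coefficients and feature names; none of these numbers come from a real system.

```python
import math
from typing import Dict, Tuple

# Hypothetical learned parameters for a churn model (illustrative values only).
INTERCEPT = -2.0
COEFFICIENTS = {
    "months_since_last_login": 0.40,
    "support_tickets_90d": 0.25,
    "is_annual_plan": -1.10,
}


def explain_churn_score(customer: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return the churn probability and each feature's contribution to the log-odds."""
    contributions = {
        name: weight * customer[name] for name, weight in COEFFICIENTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


customer = {"months_since_last_login": 3, "support_tickets_90d": 2, "is_annual_plan": 0}
probability, contributions = explain_churn_score(customer)

print(f"churn probability: {probability:.2f}")
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {name}: {value:+.2f} log-odds")
```

That per-feature breakdown is what you hand to an auditor. A deep network has no equivalent that fits on one page.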

The other pattern: machine learning projects that succeed start with a narrow, measurable business question. Reduce false-positive fraud alerts by 30%. Increase forecast accuracy enough to cut safety stock by 15%. Cut customer service handle time by two minutes per ticket. Vague projects produce vague results. Specific projects produce numbers you can defend in a board meeting.

The Questions to Ask

What’s the retraining cadence, and who owns it? A machine learning system that doesn’t get retrained doesn’t keep working. If the vendor can’t tell you how often the model gets refreshed and who’s responsible, the model will degrade silently until the damage shows up in your business numbers.
What data does the model need, and do we have it? Machine learning is downstream of data. If your data is messy, missing labels, or stuck in five different systems, the model will reflect that. The honest vendors will tell you. The dishonest ones will let you find out in production.
How will we know when the model is wrong? Drift detection, performance monitoring, and a clear escalation path matter more than the initial accuracy number. What’s the alert? Who gets it? What’s the rollback?
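
For the third question, the simplest possible answer to "what’s the alert?" is a rolling comparison between recent accuracy and the accuracy the model was signed off at. The sketch below is a minimal illustration under stated assumptions: the `AccuracyMonitor` class, the window size, the five-point tolerance, and the status strings are all hypothetical, and a real setup would also need to route the alert to a named owner.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks recent prediction outcomes and flags a drop below the sign-off baseline."""

    def __init__(self, baseline: float, tolerance: float = 0.05,
                 window: int = 500, min_samples: int = 100) -> None:
        self.baseline = baseline        # accuracy at validation time
        self.tolerance = tolerance      # how far below baseline we accept
        self.min_samples = min_samples  # don't alert on a handful of outcomes
        self.outcomes = deque(maxlen=window)  # oldest outcomes fall off automatically

    def record(self, predicted: int, actual: int) -> None:
        self.outcomes.append(predicted == actual)

    def status(self) -> str:
        if len(self.outcomes) < self.min_samples:
            return "insufficient_data"
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return "alert" if accuracy < self.baseline - self.tolerance else "ok"


monitor = AccuracyMonitor(baseline=0.78, window=20, min_samples=10)
for _ in range(10):
    monitor.record(predicted=1, actual=1)  # model was right
print(monitor.status())

for _ in range(20):
    monitor.record(predicted=1, actual=0)  # model keeps missing
print(monitor.status())
```

This only works if actual outcomes flow back to the monitor. For churn or fraud that can lag by weeks, which is why input-drift checks usually run alongside it.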
