Feature Store

The infrastructure tax that only large organizations can afford to pay

The Technical Definition

A feature store is a centralized data system that computes, stores, and serves machine learning features for both training and inference. Instead of each model engineering team building its own data pipelines to produce features from raw data, a feature store provides a single source of truth: you define a feature once (e.g., “customer_total_spend_last_30_days”), compute it once at scale, cache it, and serve it to any model that needs it.

Feature stores separate feature computation (batch-intensive, can happen offline) from feature serving (low-latency, happens at prediction time). They support both: compute features in batch overnight, store them in a fast cache, and serve them in milliseconds when a model needs a prediction.
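The split between batch computation and low-latency serving can be sketched in a few lines. This is a minimal illustration, not how any particular product works: the `TRANSACTIONS` data, the `compute_spend_last_30_days` function, and the `OnlineStore` class are all hypothetical names, and the in-memory dictionary stands in for whatever fast cache a real deployment would use.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical raw data: (customer_id, order_date, amount).
TRANSACTIONS = [
    ("c1", date(2024, 5, 5), 40.0),
    ("c1", date(2024, 5, 20), 60.0),
    ("c1", date(2024, 3, 1), 500.0),  # falls outside the 30-day window
    ("c2", date(2024, 5, 25), 15.0),
]


def compute_spend_last_30_days(transactions, as_of):
    """Batch step: aggregate raw rows into one feature value per customer."""
    window_start = as_of - timedelta(days=30)
    totals = defaultdict(float)
    for customer_id, order_date, amount in transactions:
        if window_start < order_date <= as_of:
            totals[customer_id] += amount
    return dict(totals)


class OnlineStore:
    """Serving step: a key-value lookup standing in for a low-latency cache."""

    def __init__(self):
        self._values = {}

    def load(self, feature_name, values):
        """Copy a batch of precomputed values into the lookup table."""
        for entity_id, value in values.items():
            self._values[(feature_name, entity_id)] = value

    def get(self, feature_name, entity_id, default=None):
        """Return the stored value, or `default` if nothing was computed."""
        return self._values.get((feature_name, entity_id), default)


# The nightly batch job computes once; any model can then read at request time.
store = OnlineStore()
store.load(
    "customer_total_spend_last_30_days",
    compute_spend_last_30_days(TRANSACTIONS, as_of=date(2024, 5, 31)),
)
spend_c1 = store.get("customer_total_spend_last_30_days", "c1")
```

The expensive aggregation runs once per batch, while the per-prediction cost is a single dictionary lookup. That asymmetry is the point of the design.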

What This Actually Means for Your Business

A feature store is a bet on organizational scale. It only makes sense if multiple teams are building multiple models that all need overlapping features. If you have one team building one model, a feature store is expensive overhead. If you have ten teams building fifty models, and those models share common features (which they do), a feature store saves time and ensures consistency.

The problem it solves is real: feature redundancy and training-serving skew. Without a feature store, each team writes its own code to compute features from raw data. Team A computes “customer_spend_last_30_days” slightly differently than Team B. One uses a 30-day rolling window from transaction date; the other uses calendar days. One includes cancelled orders; the other doesn’t. Models trained with Team A’s definition perform well offline but poorly in production where Team B’s features are served. Debugging this is a nightmare.
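A small sketch shows how far two "slightly different" definitions can drift apart. The order data and both functions are invented for illustration, and the window logic is only one possible reading of the disagreement described above: here Team A uses a rolling 30 days and keeps cancelled orders, while Team B uses the previous calendar month and drops them.

```python
from datetime import date, timedelta

# Hypothetical orders for one customer: (order_date, amount, status).
ORDERS = [
    (date(2024, 5, 2), 100.0, "completed"),
    (date(2024, 5, 15), 50.0, "cancelled"),
    (date(2024, 5, 28), 25.0, "completed"),
]
AS_OF = date(2024, 6, 10)


def spend_team_a(orders, as_of):
    """Rolling 30 days back from as_of, cancelled orders included."""
    start = as_of - timedelta(days=30)
    return sum(amount for day, amount, _ in orders if start < day <= as_of)


def spend_team_b(orders, as_of):
    """Previous calendar month, cancelled orders excluded."""
    first_of_month = as_of.replace(day=1)
    prev_month_end = first_of_month - timedelta(days=1)
    prev_month_start = prev_month_end.replace(day=1)
    return sum(
        amount
        for day, amount, status in orders
        if prev_month_start <= day <= prev_month_end and status != "cancelled"
    )


value_a = spend_team_a(ORDERS, AS_OF)
value_b = spend_team_b(ORDERS, AS_OF)
print(f"Team A: {value_a}, Team B: {value_b}")
```

Both functions carry the same feature name in their respective codebases, yet for this customer they disagree by a wide margin. A model trained on one and served the other sees inputs it never learned from.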

A feature store prevents that by enforcing a single definition. You define “customer_spend_last_30_days” once, document the exact logic, and every model uses that exact same feature. Consistency is enforced, not assumed.

The operational cost is substantial. You’re building and maintaining another data system. Someone has to monitor it. Someone has to update feature definitions when business logic changes. Someone has to handle feature freshness: if a feature is cached from yesterday and today’s data matters, how stale is too stale? Someone has to debug feature serving latency when a model prediction that used to take 50ms now takes 500ms.

Feature stores are most valuable in mature organizations with many models in production. Early-stage teams and organizations with only a few models should not build feature stores. The operational burden exceeds the benefit. Focus on getting one pipeline and one model working reliably first.

Reality Check

What the vendor says: “Our feature store eliminates feature redundancy, prevents training-serving skew, and scales to thousands of models.”

What that means in practice: Feature stores do reduce redundancy and enforce consistency. But they require discipline. Teams still have to agree on feature definitions, maintain SLAs for feature freshness, and monitor serving latency. And when a feature store goes down, all models that depend on it go down with it. You’ve traded complexity in individual teams for centralized operational risk.

What Operators Actually Do

Large enterprises (those with dozens or hundreds of models in production) build or adopt feature stores because the ROI is positive. They allocate a dedicated platform team to own it. That team owns feature compute infrastructure, caching layers, serving APIs, and monitoring. Model teams own feature definitions and update them when logic changes.

They establish clear SLAs for feature freshness and serving latency. “All features updated within 1 hour” or “99th percentile serving latency < 100ms”. When SLAs are breached, the platform team investigates.
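A check against those two example SLAs might look like the sketch below. The thresholds come from the quoted examples; everything else (the `check_slas` function, the nearest-rank percentile method, the sample timestamps and latencies) is an assumption made for illustration, not a description of any specific monitoring stack.

```python
import math
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(hours=1)  # "All features updated within 1 hour"
P99_LATENCY_SLA_MS = 100.0          # "99th percentile serving latency < 100ms"


def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(99 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def check_slas(last_updated, latencies_ms, now):
    """Return a list of human-readable SLA breaches (empty if none)."""
    breaches = []
    for feature, updated_at in last_updated.items():
        age = now - updated_at
        if age > FRESHNESS_SLA:
            breaches.append(f"{feature} is stale by {age - FRESHNESS_SLA}")
    observed = p99(latencies_ms)
    if observed >= P99_LATENCY_SLA_MS:
        breaches.append(
            f"p99 latency {observed:.0f}ms >= {P99_LATENCY_SLA_MS:.0f}ms"
        )
    return breaches


# Hypothetical monitoring snapshot.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
last_updated = {
    "customer_spend_last_30_days": now - timedelta(minutes=20),
    "active_customer": now - timedelta(hours=3),
}
latencies_ms = [12.0] * 98 + [80.0, 450.0]  # 100 samples, one slow outlier
breaches = check_slas(last_updated, latencies_ms, now)
```

In this snapshot only the freshness SLA is breached. The single 450ms request sits above the 99th percentile, which is exactly why a percentile target tolerates it while an average or a maximum would not.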

They version feature definitions explicitly. A model knows which version of each feature it was trained on. When a feature definition changes, the team decides: retrain the model with the new feature, or leave it on the old version? This requires governance, but it prevents silent model drift.
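One way to picture explicit versioning: each definition change publishes a new version, and each model records the versions it was trained on. The `FeatureRegistry` class, the `outdated_pins` helper, and the feature descriptions below are hypothetical, a sketch of the bookkeeping rather than any real product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureVersion:
    name: str
    version: int
    description: str


class FeatureRegistry:
    """Hypothetical registry: every definition change creates a new version."""

    def __init__(self):
        self._versions = {}  # name -> list of FeatureVersion, oldest first

    def publish(self, name, description):
        """Append a new version of `name` and return it."""
        history = self._versions.setdefault(name, [])
        feature_version = FeatureVersion(name, len(history) + 1, description)
        history.append(feature_version)
        return feature_version

    def latest(self, name):
        return self._versions[name][-1]


def outdated_pins(registry, pinned):
    """Map each lagging feature to (pinned_version, latest_version)."""
    return {
        name: (version, registry.latest(name).version)
        for name, version in pinned.items()
        if version < registry.latest(name).version
    }


registry = FeatureRegistry()
registry.publish("customer_spend_last_30_days", "rolling 30 days, includes cancelled")
registry.publish("active_customer", "at least one order in the last 90 days")

# Versions recorded alongside the model artifact at training time.
churn_model_pins = {"customer_spend_last_30_days": 1, "active_customer": 1}

# Business logic changes: a new version is published, the old one is kept.
registry.publish("customer_spend_last_30_days", "rolling 30 days, excludes cancelled")
stale = outdated_pins(registry, churn_model_pins)
```

The model keeps reading version 1 until someone makes the retrain-or-stay decision. The change is visible in `stale` instead of silently altering the model's inputs.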

They use feature stores to enforce business logic consistency. If “active customer” has a specific definition for the revenue model, that definition is encoded once in the feature store, used by every model, and updated once if business rules change. Without a feature store, that definition lives in fifty different codebases and is impossible to change correctly.

For teams not ready for a full feature store, an intermediate approach works: a shared feature pipeline that multiple model teams depend on. One team owns the pipeline, other teams consume features from it. It’s lighter weight than a feature store but eliminates redundancy and reduces skew.

The Questions to Ask

  1. How many models in production share the same features today? If the answer is fewer than three, you probably don’t need a feature store yet. If it’s more than five, you probably do. Feature stores have fixed overhead; the benefit only exceeds the cost at scale.

  2. Who owns feature definitions, and how do they update when business logic changes? Feature stores enforce consistency, but they require governance. Someone has to decide when “customer_spend_last_30_days” changes meaning. Someone has to retrain affected models. Without clear ownership, a feature store becomes technical debt rather than infrastructure.

  3. What happens to model serving when your feature store is unavailable? A feature store is a critical dependency. If it goes down, all models that depend on it stop making predictions. What’s your fallback? Cached features? Precomputed values? Models need to be resilient to feature store latency and unavailability, which adds complexity.
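The fallback options raised in the third question can be sketched as a thin wrapper around the feature lookup. Everything here is hypothetical: the `FeatureStoreUnavailable` exception, the `FallbackFeatureClient` class, and the one-hour cache limit are illustrative choices, and a real client would also need timeouts and metrics.

```python
import time


class FeatureStoreUnavailable(Exception):
    """Raised by the (hypothetical) fetch function when the store is down."""


class FallbackFeatureClient:
    """Wraps a fetch function with a last-known-good cache and static defaults."""

    def __init__(self, fetch, defaults, max_cache_age_s=3600.0, clock=time.monotonic):
        self._fetch = fetch
        self._defaults = defaults  # must contain every feature that is requested
        self._max_cache_age_s = max_cache_age_s
        self._clock = clock
        self._cache = {}  # (feature, entity_id) -> (value, fetched_at)

    def get(self, feature, entity_id):
        """Return (value, source); source is 'live', 'cache', or 'default'."""
        key = (feature, entity_id)
        try:
            value = self._fetch(feature, entity_id)
        except FeatureStoreUnavailable:
            cached = self._cache.get(key)
            if cached is not None and self._clock() - cached[1] <= self._max_cache_age_s:
                return cached[0], "cache"
            return self._defaults[feature], "default"
        self._cache[key] = (value, self._clock())
        return value, "live"


# Simulated store that can be switched off to exercise the fallback paths.
state = {"up": True}


def fetch(feature, entity_id):
    if not state["up"]:
        raise FeatureStoreUnavailable(feature)
    return 100.0


client = FallbackFeatureClient(fetch, defaults={"customer_spend_last_30_days": 0.0})
first = client.get("customer_spend_last_30_days", "c1")   # store is up
state["up"] = False
second = client.get("customer_spend_last_30_days", "c1")  # seen before the outage
third = client.get("customer_spend_last_30_days", "c2")   # never seen
```

Returning the source alongside the value lets the calling model log, or refuse to act on, predictions made from cached or default inputs. Whether a default is safe at all is a per-feature business decision.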
