
Decision Memory Isn't the Missing Piece

January 20, 2026

The idea sounds obvious

Over the last few weeks on X, I've noticed several people trying to solve agent context problems. The proposal usually sounds something like this: "what if we just store past decisions?" Sometimes it's framed as a decision knowledge graph, sometimes a historical context graph, sometimes a decision memory layer that agents can query. On the surface, it sounds reasonable, almost obvious. If agents fail because they don't understand why things were done, then surely the fix is to record those reasons and make them accessible.

But the more I think about it, especially for an enterprise engineering use case, the less I believe this works in practice.

Decisions aren't standalone events

The core issue is that most engineering decisions aren't clean, standalone events. They aren't the result of a single, well-formed tradeoff that can be neatly captured and replayed later. They're fragmented, shaped by time pressure, organizational dynamics, incomplete information, and gut instinct. A lot of the "decision" isn't even verbalized; it lives in Slack threads, PRs, or simply in what didn't get done because a deadline was looming. When you try to turn that into a clean, queryable record after the fact, you're not preserving reality, you're reshaping it into something it never was.

You can't reconstruct intent reliably

That leads to the next problem: missing data. If the tradeoffs weren't written down at the moment the decision was made, you can't reliably reconstruct them later. Any system that tries is forced to infer intent from incomplete signals. At that point, you're effectively hallucinating the rationale behind past actions, which is ironic, given that hallucination is precisely what we're trying to prevent in AI agents.

Decision memory becomes epistemic debt

Even when decisions are captured, they don't age well. This is true in startups and enterprises alike. Constraints change. Teams change. Infrastructure evolves. What was a correct decision six months ago may be actively harmful today. A "decision memory layer" that isn't constantly curated becomes another form of technical debt, except now it's epistemic debt. And realistically, no organization consistently maintains this kind of metadata over time.

Structure gets overweighted

There's also a more subtle failure that worries me: agents over-trusting structure. Humans already do this. If you give an agent a formal "decision record," it will overweight it simply because it's structured and authoritative-looking. That's dangerous when the surrounding context has shifted. The agent doesn't know what's stale; it just sees something that looks like truth.

We've always solved this socially

None of this is unique to AI. New engineers struggle with exactly the same problem today. They inherit systems without fully understanding the rationale behind them. Historically, we haven't solved this with better databases; we've solved it, imperfectly, with better writing, better mentorship, and better tooling to explore the code itself.

Where decision records actually help

That said, I don't think the idea is entirely useless. I can see decision records working narrowly for explicit, high-stakes decisions: things like ADRs, security constraints, or compliance requirements. These are decisions that are deliberately made, intentionally documented, and expected to persist. Even then, they only work if they're captured in real time, with clear ownership, and treated as guidance rather than immutable rules.
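To make that concrete, here's a minimal sketch of what such a record might look like. This is purely illustrative; the `DecisionRecord` class, its fields, and the 180-day review window are my assumptions, not an established schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative sketch only: a decision record that carries enough
# metadata to be treated as guidance, not ground truth.
@dataclass
class DecisionRecord:
    decision: str         # e.g. "all service-to-service auth uses mTLS"
    rationale: str        # captured at decision time, never reconstructed
    owner: str            # someone accountable for keeping it current
    recorded_at: datetime
    review_after: timedelta = timedelta(days=180)  # assumed review window

    def is_stale(self, now: datetime | None = None) -> bool:
        """Past its review window: surface it as a hint, not a rule."""
        now = now or datetime.now(timezone.utc)
        return now - self.recorded_at > self.review_after
```

The point of `owner` and `is_stale` is that the record announces its own decay: a consumer, human or agent, is told when to stop trusting it rather than left to guess.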

Context lives in the present

The deeper issue, though, isn't that agents lack memory. It's that context doesn't live in static artifacts. It lives in how code evolves over time, how systems behave in production, and how constraints shift as organizations grow and change. Any system that tries to freeze context into a static structure will inevitably drift away from reality.

Better access beats better memory

The more promising direction, in my view, is not better memory, but better access. Fresh contexts. Dynamic retrieval. Grounding agents in the current state of the code and the system, while treating past decisions as hints to be weighed rather than facts to be trusted.
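As a thought experiment, here's roughly what that might look like. Everything in this sketch is hypothetical: the `repo`, `prod_metrics`, and `decision_log` interfaces, the `build_agent_context` function, and the 180-day half-life are made up for illustration, not any real API.

```python
from datetime import datetime, timezone

# Hypothetical sketch of "better access": ground the agent in the
# current state of the code and the system, and attach past decisions
# only as down-weighted hints.
def build_agent_context(query: str, repo, prod_metrics, decision_log) -> dict:
    now = datetime.now(timezone.utc)
    context = {
        # Fresh, authoritative sources, queried live rather than cached.
        "code": repo.search(query),              # current state of the code
        "behavior": prod_metrics.recent(query),  # how the system behaves now
        "hints": [],
    }
    for record in decision_log.search(query):
        # Decay trust in a decision as it ages (assumed 180-day half-life).
        age_days = (now - record.recorded_at).days
        weight = 0.5 ** (age_days / 180)
        context["hints"].append({
            "decision": record.decision,
            "weight": round(weight, 2),  # a signal to weigh, not a rule to obey
        })
    return context
```

The asymmetry is the point: code and production behavior go in as facts, while decision records go in pre-discounted, so the agent never sees a stale rationale presented with the same authority as the current state.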

That's why I'm skeptical of this approach, especially at enterprise scale. But what do I know.