Rachel was a Senior PM when your agent first met her. Six months later she runs the whole product org. But the note your agent saved — "Rachel is a Senior PM" — never changed. So the next time the agent drafts an email, it addresses the VP of Product as if she still reports to one.
That is a small, human kind of wrong. Now scale it. An agent quotes a 30-day refund window three weeks after the policy moved to 14. An agent prices a deal off a rate card that was retired last quarter. An agent escalates to a manager who left in March.
None of these are hallucinations in the usual sense. The agent is recalling something it genuinely learned, and recalling it faithfully. The problem is subtler and, for a system that takes real actions, more dangerous: the fact was true once, and the memory has no idea it stopped being true.
Most memory systems treat every fact as eternal
When you store a fact in a typical memory or retrieval system, you store the text and maybe a timestamp of when it was written. That timestamp answers "when did we learn this?" It does not answer the question that actually matters at decision time: "is this still true right now?"
So the fact sits in the index forever, fully eligible for retrieval, ranked next to facts that are still accurate, with nothing to distinguish the live from the dead. The agent pulls it back, trusts it, and acts.
This is not a hypothetical failure mode. A 2025 academic benchmark called HoH, built specifically to measure the effect of stale information on retrieval-augmented systems, found that "outdated information significantly degrades RAG performance in two critical ways: (1) it substantially reduces response accuracy by distracting models from correct information, and (2) it can mislead models into generating potentially harmful outputs, even when current information is available" (Ouyang et al., 2025). Read that second clause again. Even when the correct, current fact is sitting right there in the knowledge base, the stale one can still win and steer the model into a wrong, sometimes harmful answer.
Practitioners building production retrieval systems describe the same thing as a slow credibility problem: as a knowledge base grows, an LLM that "confidently serves outdated information at scale" quietly erodes trust in the whole system (RAG About It, "The Knowledge Decay Problem"). The bug is rarely loud. It is a thousand small confidently-wrong answers.
The idea databases solved decades ago
Here is the reassuring part: this is not a new problem, and the data world already has a clean mental model for it.
It is called temporal data, and the key distinction is between two kinds of time. Valid time is "the time period during or event time at which a fact is true in the real world." Transaction time is "the time at which a fact was recorded in the database" (Wikipedia: Temporal database). They are not the same thing, and the gap between them is exactly where stale facts live.
The classic illustration: someone moves to a new town in August but does not officially register their address until December. The world changed in August; the record only caught up in December. If you only track when you recorded something, you can never answer "where did they actually live in October?" — and you can never tell whether a fact you are about to act on is still in force.
Agent memory has been making the database world's old mistake: storing only transaction time, and treating it as if it were truth.
A fact validity window is just a start and an end
The fix is small to describe. Every fact gets two extra fields:
valid_from— when this fact started being truevalid_until— when it stopped (ornull, meaning "still true as far as we know")
That pair is the validity window. It is the difference between "Rachel is a Senior PM" floating in the void and "Rachel was a Senior PM from 2025-01 until 2026-03." The second one carries its own expiration date. An agent recalling it on 2026-06 can see, without any extra reasoning, that the window has closed.
// A fact that has since expired
{
content: "The refund window is 30 days",
type: "fact",
validFrom: "2025-09-01",
validUntil: "2026-04-15" // policy changed here
}
// The fact that replaced it — still live
{
content: "The refund window is 14 days",
type: "fact",
validFrom: "2026-04-15",
validUntil: null // true until something supersedes it
}
Now recall can do something it never could before: filter by time. Ask "what is the refund policy as of today" and the expired 30-day fact simply does not come back. Ask "what was the policy in March" and it does. The stale fact is not deleted — it is just no longer mistaken for current truth.
Superseding: how a new fact retires the old one
A validity window answers "is this still true?" The other half of the problem is "what replaced it?" That is superseding.
When a newer fact contradicts or updates an older one, you do not silently overwrite the old record and lose the history. Instead, the new fact closes the old one's window (sets its valid_until) and records a link back to it: this supersedes that. String enough of those together and you get a supersede chain — a readable history of how a fact evolved.
Senior PM → Group PM → VP of Product
Each arrow is a recorded, queryable supersede relation, not a guess reconstructed after the fact. The current answer is obvious (follow the chain to its end), and the path that got there is fully auditable.
This is the agent-memory version of what temporal databases buy you by tracking both axes of time: you can reconstruct not just what occurred, but what the system believed at each point in time — temporal databases track these as two separate axes, valid time and transaction time (Wikipedia: Temporal database). For a human team that is a nice-to-have. For an autonomous agent that took an action you now have to explain, it is the difference between knowing exactly which fact it trusted and when that fact was correct — and a shrug.
Why this matters more every month
You could wave this away as a data-hygiene nicety. It used to be. The reason it is becoming load-bearing is that agents have stopped just answering questions and started doing things — sending the email, issuing the refund, quoting the price, booking the meeting.
When an agent only drafts text for a human to review, a stale fact is an annoyance the human catches. When the agent executes, a stale fact becomes a wrong action with real consequences: a refund honored against a policy that no longer exists, a contract sent to the wrong signatory, a customer told something untrue by a system speaking with your company's voice.
Anthropic frames the underlying constraint well in its guidance on building agents: "Context, therefore, must be treated as a finite resource with diminishing marginal returns," and context engineering is "the art and science of curating what will go into the limited context window from that constantly evolving universe of possible information" (Anthropic, "Effective context engineering for AI agents"). Curation is the operative word. You cannot curate what you cannot date. If a memory layer hands the model a pile of facts with no sense of which ones have expired, the model is left to guess — and, as the HoH benchmark showed, it often guesses wrong even when the right answer is available.
Fact validity windows are how you give memory the vocabulary to curate by time: keep the live facts, retire the expired ones, and keep the history for when someone asks why.
How AgentPrizm implements this
We built this into the memory layer rather than leaving it to every agent author to reinvent.
- Validity windows are first-class. Every memory can carry a
validFromandvalidUntil. Recall can filter to "true as of now," "true as of a past date," or — when you are debugging or auditing — include the expired ones on purpose. - Supersedes are recorded, not inferred. When a fact updates another, the relation is stored as a typed
supersedeslink, so the chain from old to current is queryable rather than reconstructed. - Superseded facts keep an audit trail. When a newer fact replaces an older one, the retired version is excluded from default recall but stays queryable for audit, so you can always answer "what did the agent know, and when." (A right-to-forget request is a separate operation — that removes the data outright.)
- Six memory types, not a sprawling taxonomy. Facts, lessons, directives, preferences, contacts, and bookmarks — that is the whole set. Validity windows apply where they earn their keep (facts and contacts change; a directive less so).
- Containers, not folders. Memories are scoped by container (a project, a customer, an agent), so time-aware recall stays fast and relevant instead of sweeping your entire history.
- Hybrid, validity-aware recall. Semantic and keyword search find the relevant facts; the temporal and structured filters then decide which of them are still allowed to influence the answer.
The honest pitch is narrow: stale facts are a real, measured failure mode for agents that act, and a validity window plus a supersede chain is the smallest thing that actually fixes it. If you want the mechanics, the docs walk through validity windows, supersedes, and audit. If you want to know what it costs, the pricing page is plain about it.
Rachel got promoted. Your agent should know that — and know exactly when it found out.