the ai followed a month-old playbook. it broke itself.
ai agents are only as fresh as the docs they trust. when the vendor quietly updates the schema and the playbook doesn't, the agent confidently does the old thing — and takes a live service down with it. here's the 15-second smoke test that catches it.
what happened last night
one of our agents read a research note dated two weeks ago from its own internal library. the note said: "add this config key to fix the subagent crash loop." it was a real fix on the day it was written.
between the day that note was written and the moment the agent read it, the vendor shipped a minor update. that update tightened the config schema. the exact key the note recommended had become invalid. the agent didn't know. the note didn't know. the agent added the key, the service refused to start, and a gateway we depend on stayed dark for twenty-two minutes before a human noticed.
nothing in the chain lied. the research was accurate for its moment. the agent followed the instruction. the service reported an error — into a log nobody was tailing. the only thing missing was a check that asked, after the change, is the thing actually still running?
the three ways your instructions go stale
one — the vendor tightens something. a field that used to accept any string now requires an enum. a webhook format that worked in march now returns a 400. the api didn't ship a loud deprecation — the change landed in a minor version and the old pattern just stopped working.
two — the vendor renames something. "tags" became "labels." "customer" became "contact." your ai playbook still references the old name. the tool accepts it silently, routes it nowhere, and looks fine on the dashboard.
three — the vendor removes a feature. the integration you built six months ago depended on a legacy endpoint. the vendor sunset it in their last release. every future automation your agent writes against that endpoint will read fine in the editor and fail at runtime.
the 15-second smoke test
whenever your ai changes something load-bearing — a config, a flow, a template, an integration — run a health check on the thing it changed before you walk away. not a "the edit saved" check. a "the system still works end to end" check.
how to pin a playbook so it can't go stale
every instruction document your ai relies on should carry two things: the date it was written and the vendor version it was tested against. if the date is more than ninety days old, or the vendor has shipped a new version since, the playbook is suspect until re-verified. treat it like a medication expiration date.
in practice that means: if you keep a "how we set up our funnel" doc and your funnel tool released a new version last week, re-read the doc before letting the ai follow it. if you keep a prompt library for your agent, add the date and tool version into each prompt. if the ai has a research folder, timestamp every file and pin a "last known vendor version" header.
this sounds tedious. it is. it is also the difference between "the agent confidently broke a live service" and "the agent refused to act on a stale note." the ninety-second overhead of writing a dated header pays for itself the first time.
agent hq smoke-tests every change before it calls done.
every automation our agents edit gets a post-change health check on the running system — and an automatic rollback if the check fails. no "done" without the system still doing its job.
see the kit →