
ai said "done." is it actually live? the 30-second ship check.

your agent reports a feature shipped. the button is there. the nav item is there. sometimes the thing behind the button isn't. here's how to catch it before a customer does.

DONE ≠ LIVE

the failure mode nobody warns you about

when you ask an ai agent to ship something — add a checkout button, add a booking page, add an offline mode — the agent will write the code, report "done," and move on. most of the time it works. but there's a specific failure pattern that bites small business owners more than anyone: the feature is registered, but the artifact behind it is missing or broken.

a button points to a page that doesn't exist. an offline mode is enabled but the file that powers it was never uploaded. a new form is wired up in your test folder but not in the live folder. none of it throws an error. it just… quietly doesn't work.

we saw this exact pattern four times yesterday in our own build logs. a service worker registered against a file that wasn't there. a nav item that appeared on the deployed site but not in the source. duplicate event handlers that cancelled each other out. and an audit that confirmed something worked without ever reading the code. every one of them passed "done" and every one of them was broken.
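
one way to hunt for this pattern is to list every url a page claims exists, so each one can be checked. what follows is a minimal sketch in python, assuming you have the page's html saved as a string. the names `referenced_artifacts` and `ArtifactCollector` are made up for illustration and aren't part of any tool mentioned here. it only reads `href`, `src`, and `action` attributes plus literal `serviceWorker.register("...")` calls in inline scripts, so anything assembled dynamically will be missed.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches serviceWorker.register("...") with a literal single- or double-quoted path.
SW_REGISTER = re.compile(r"serviceWorker\.register\(\s*['\"]([^'\"]+)['\"]")

# Link targets that never point at a fetchable file.
SKIP_PREFIXES = ("#", "mailto:", "tel:", "javascript:")


class ArtifactCollector(HTMLParser):
    """Collects URLs the page references: links, scripts, form targets, service workers."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = set()
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        attributes = dict(attrs)
        for name in ("href", "src", "action"):
            value = attributes.get(name)
            if value and not value.startswith(SKIP_PREFIXES):
                self.urls.add(urljoin(self.base_url, value))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Inline script text arrives here; look for service worker registrations.
        if self._in_script:
            for path in SW_REGISTER.findall(data):
                self.urls.add(urljoin(self.base_url, path))


def referenced_artifacts(html, base_url):
    """Return a sorted list of absolute URLs that the page says should exist."""
    collector = ArtifactCollector(base_url)
    collector.feed(html)
    collector.close()
    return sorted(collector.urls)
```

every url this returns is a promise the page makes. each one that doesn't load is a candidate for the "registered but missing" failure described above.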

why this hits small businesses harder
when a fortune 500 ships a broken feature, qa catches it in staging. when a solo founder ships a broken feature, the customer catches it — and most of them never tell you. they just don't come back. the agent moved on. the bug waits.

the 30-second ship check

run this every single time your agent tells you a change is live. it takes less than a minute and it catches 90% of silent ship failures.

01. click the new thing as a customer. open an incognito window. don't be logged in. navigate to the feature the way a real customer would. if there's a new button, click it. if there's a new page, load it. you're not testing the code; you're testing the path.
02. watch for the quiet 404. pull up the browser dev tools (right-click → inspect → network tab), then reload the page, because the network tab only records requests made while it's open. click around. any line in red? any request to a file that came back 404 or 500? that's the broken artifact. the page renders, the button clicks, but something behind it is missing.
03. verify the data landed. if the feature collects anything (a form submit, an order, a message), fill it in with a test value you'd recognize later. then go check the place it's supposed to land. your inbox, your database, your crm. if you can't find the test value, the pipeline is broken even though the page looked fine.
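
the "test value you'd recognize later" in step 03 works best when it's unique per check, so an old test entry can't be mistaken for a new one. here's a minimal python sketch, assuming you can export the destination (inbox, database, crm) as a list of dicts. `make_test_marker` and `landed` are hypothetical helper names, and the export step itself is left to you.

```python
import uuid


def make_test_marker(prefix="shipcheck"):
    """Return a short unique string to paste into the form's message or notes field."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def landed(marker, records):
    """Return True if any field of any exported record contains the marker.

    `records` is assumed to be a list of dicts, one per row or message.
    """
    return any(
        marker in str(value)
        for record in records
        for value in record.values()
    )
```

generate a marker, submit it through the live form in the incognito window, then run `landed` against a fresh export. a `False` result means the page looked fine but the pipeline behind it dropped the data.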
bonus check — ask the agent to prove it
when your agent says "done," reply with: "quote the actual code you wrote and show me the file path, then curl the deployed url and tell me the response code." most silent failures come from an agent asserting something is there without actually verifying the artifact exists. forcing it to show you the proof flushes out the hallucinated work.
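
you can also run the response-code part yourself rather than taking the agent's word for it. this is a rough python equivalent of curling a url and reading the status, using only the standard library. `fetch_status` and `ship_check` are names invented for this sketch, and the `ship-check/0.1` user agent string is arbitrary.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET to `url`, or None if it is unreachable."""
    request = urllib.request.Request(url, headers={"User-Agent": "ship-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx and 5xx; the code is still the answer we want.
        return error.code
    except OSError:
        # URLError and timeouts are OSError subclasses: DNS failure, refused connection, etc.
        return None


def ship_check(urls, fetch=fetch_status):
    """Return {url: status} for every URL that did not answer with a 2xx status."""
    failures = {}
    for url in urls:
        status = fetch(url)
        if status is None or not 200 <= status < 300:
            failures[url] = status
    return failures
```

an empty dict from `ship_check` means every url answered with a 2xx. anything else is a list of exactly which artifacts are missing. the `fetch` parameter is there so the logic can be exercised without touching the network.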

why the agent doesn't catch this itself

ai agents are great at writing code. they're less great at verifying the deployed version of that code. the gap is what we call source/deploy drift — the agent edited one copy, forgot the other copy, or edited both but only one got pushed live. it looks right in the place the agent was reading. it's broken in the place the customer is reading.
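
source/deploy drift can be made concrete by hashing both copies and comparing. a minimal python sketch follows, assuming you've already read the source files and downloaded the deployed ones into two `{path: bytes}` dicts. `find_drift` is a hypothetical helper, and how you collect the two sets of bytes depends on your host.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def find_drift(source_files, deployed_files):
    """Compare two {path: bytes} mappings and describe every mismatch."""
    drift = {}
    for path in sorted(set(source_files) | set(deployed_files)):
        if path not in deployed_files:
            drift[path] = "missing from deploy"
        elif path not in source_files:
            drift[path] = "missing from source"
        elif digest(source_files[path]) != digest(deployed_files[path]):
            drift[path] = "content differs"
    return drift
```

"missing from deploy" is the never-uploaded file, "missing from source" is the nav item that exists only on the live site, and "content differs" is the edit that reached one copy but not the other.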

the fix isn't a better agent. the fix is you running a 30-second check in the place the customer actually touches — before you close the task and move on.

what to build into your weekly rhythm

every monday morning, pick one feature you shipped last week and use it like a customer. not just "click once and confirm the page loads" — go all the way through. submit the form. complete the checkout. book the slot. you'll catch things that an automated monitor never would, because you know what the outcome is supposed to feel like.

this is the one habit that separates small businesses who trust their automation from the ones who get burned by it. ai makes shipping fast. you still have to confirm it landed.

bottom line
"done" is a claim. "live" is a thing your customer can actually use. the gap between the two is where silent failures live. a 30-second check closes the gap.

agent hq tells you when a ship actually landed.

every action your agents take shows up in the activity log with the artifact url and the response code. no "done" without proof.
