Two tools, both shipping. gateway-keeper is a supervisor with runtime patches for upstream issues #29827, #47931, #30183, and #51010. claw-medic is a one-shot CLI that runs 10 end-to-end health checks and applies fixes with a single flag. Every fix is backed by an upstream issue number and a commit that explains why it exists.
The reliability work behind Agent HQ, shipped openly.
Anthropic built Kairos for Claude Code behind closed doors. We're building ours for OpenClaw in public — MIT-licensed, community-driven, battle-tested on the same infrastructure that powers Agent HQ. When OpenClaw breaks, we diagnose, patch, and ship the fix back to the world.
Ten checks. Five fixes. Zero guesswork.
Each check reads one concrete piece of system state — a process, a port, a log, a config key — and reports OK, WARN, or FAIL with a one-line explanation. If there's a fix, it's one flag away.
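To make the OK / WARN / FAIL contract concrete, here is a minimal sketch of what one check result could look like. The names `Status`, `CheckResult`, and `fix_flag` are illustrative assumptions, not claw-medic's actual internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OK = "OK"
    WARN = "WARN"
    FAIL = "FAIL"


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical shape of one check's outcome."""
    name: str                       # what was inspected (process, port, log, config key)
    status: Status                  # OK, WARN, or FAIL
    detail: str                     # the one-line explanation
    fix_flag: Optional[str] = None  # flag that would apply a fix, if one exists

    def render(self) -> str:
        line = f"[{self.status.value}] {self.name}: {self.detail}"
        if self.fix_flag:
            line += f" (fix: {self.fix_flag})"
        return line


# Example values are made up for illustration.
result = CheckResult("gateway port", Status.FAIL, "configured port is not bound", "--fix")
print(result.render())
```

Keeping the result a plain immutable record means the same object can feed both the stdout line and a JSON report.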
Gateway process health
Verifies the OpenClaw gateway node process is alive, the configured port is actually bound, and the /healthz endpoint responds.
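Two of those three checks, the bound port and the /healthz response, can be sketched with the Python standard library alone. This is an illustration under assumptions, not claw-medic's code: the function names are hypothetical, and the host, port, and base URL are whatever your install is configured with.

```python
import http.client
import socket
import urllib.error
import urllib.request


def port_is_bound(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthz_responds(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/healthz answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or malformed reply.
        return False
```

Checking the port separately from the endpoint distinguishes "nothing is listening" from "something is listening but unhealthy", which call for different fixes.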
3 checks
Auto-detected startup
Finds however OpenClaw launches on your box — Scheduled Task, Startup-folder, systemd, launchd, gateway.cmd. No hardcoded assumptions.
cross-platform
--conservative mode
The --fix flag normally runs openclaw gateway install --force when needed. --conservative skips that so it doesn't cascade-kill child services you care about.
24-hour log triage
Reads the gateway log tail, ignores anything older than 24 hours, flags only what's happening NOW. Stale errors from weeks ago don't make noise.
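The 24-hour window is easy to picture as a filter. The sketch below assumes each log line starts with an ISO-8601 timestamp followed by a space and that errors carry the word ERROR. Both are assumptions made for illustration, since the real gateway log format is not described here, and `recent_errors` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def recent_errors(lines, now=None, window=timedelta(hours=24)):
    """Yield ERROR lines whose leading timestamp falls inside the window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    for line in lines:
        stamp, _, rest = line.partition(" ")
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # no parsable timestamp: skip rather than guess its age
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive stamps are UTC
        if ts >= cutoff and "ERROR" in rest:
            yield line.rstrip("\n")


# Made-up log lines for illustration.
now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
log = [
    "2025-01-10T11:00:00+00:00 ERROR gateway crashed",
    "2024-12-01T11:00:00+00:00 ERROR stale failure",
    "2025-01-10T11:30:00+00:00 INFO started",
]
print(list(recent_errors(log, now=now)))
```

Passing `now` explicitly keeps the filter deterministic and testable.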
signal, not spam
Community feedback loop
claw-medic --report generates a PII-scrubbed diagnostic JSON + pre-filled GitHub issue URL when a fix didn't work for you. Nothing auto-posts.
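What "PII-scrubbed" could mean in practice: walk the diagnostic data and mask anything that identifies a person before it is serialized. The sketch below masks email addresses and the user segment of home-directory paths. These two rules and the names `scrub` and `scrub_value` are illustrative assumptions; the rules claw-medic actually applies are not listed here.

```python
import json
import re

_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# User segment after /home/, /Users/, or <drive>:\Users\
_HOME = re.compile(r"(/home/|/Users/|[A-Za-z]:\\Users\\)[^/\\\s]+")


def scrub(text: str) -> str:
    """Mask emails and home-directory user names in a single string."""
    text = _EMAIL.sub("<email>", text)
    return _HOME.sub(r"\1<user>", text)


def scrub_value(value):
    """Recursively scrub every string inside dicts and lists."""
    if isinstance(value, str):
        return scrub(value)
    if isinstance(value, dict):
        return {key: scrub_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [scrub_value(item) for item in value]
    return value


# Made-up diagnostic data for illustration.
diagnostic = {"log": "/home/alice/logs/gateway.log", "contact": "alice@example.com"}
print(json.dumps(scrub_value(diagnostic), indent=2))
```

Scrubbing the values before serializing, rather than the JSON text afterwards, avoids fighting JSON's backslash escaping on Windows paths.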
Bootstrap budget watch
Tracks your workspace's bootstrap budget against OpenClaw's 150,000-character limit. Catches context bloat before it hits you in production.
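The arithmetic behind a budget watch is small enough to sketch. Only the 150,000 figure comes from the text above; the 80% warning threshold, the `*.md` file pattern, and the function names are hypothetical choices for illustration, not how claw-medic decides what counts toward the budget.

```python
from pathlib import Path

LIMIT_CHARS = 150_000  # limit quoted above
WARN_RATIO = 0.8       # hypothetical: warn once 80% of the budget is used


def budget_status(total_chars: int, limit: int = LIMIT_CHARS) -> str:
    """Map a character count to OK, WARN, or FAIL."""
    if total_chars > limit:
        return "FAIL"
    if total_chars >= limit * WARN_RATIO:
        return "WARN"
    return "OK"


def workspace_chars(workspace: Path, pattern: str = "*.md") -> int:
    """Total characters in matching files directly under the workspace."""
    return sum(
        len(path.read_text(encoding="utf-8", errors="replace"))
        for path in workspace.glob(pattern)
        if path.is_file()
    )
```

Usage would look like `budget_status(workspace_chars(Path("path/to/workspace")))`, with the path pointing at your own workspace.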
upstream #27807
Orphan scheduled tasks
Finds OpenClaw scheduled tasks that have never run (usually legacy clutter from older versions). --cleanup-orphans removes them and, if access is denied, tells you the elevated command to run instead.
Windows
Session-1 desktop check
Opt-in check (--require-session 1) that asserts the gateway is in an interactive user session — required for the desktop-control skill to actually work.
Zero telemetry
No daemon. No phone-home. No config file. No lock-in. Runs on-demand, writes to stdout or JSON, goes away when it's done.
your data · your machine
Same code we run. Same bugs you hit.
We're not running a different version of OpenClaw with secret patches. If your gateway crashes at 2 AM, ours probably crashed too. The kit is how we kept Agent HQ up while we figured it out — and open-sourcing it is how we make sure the next person doesn't have to start from zero.
The kit started as one Python script I wrote at midnight after the gateway went down for the third time in a week. It worked. I wrote two more. They worked too.
At some point it became obvious this wasn't Agent HQ code — it was OpenClaw reliability work that belonged in the OpenClaw commons. So we pulled it out, MIT-licensed it, and started shipping fixes in the open.
Every commit in the repo is a real bug we hit. Every version bump ships a fix someone asked for. Every issue filed against the kit becomes input for the next version. That's the deal.
Complementary work we lean on.
The kit doesn't try to solve everything. Other maintainers have done excellent work in adjacent corners of the OpenClaw ecosystem — if your pain is in one of those corners, start with them.
Try the kit. Or try Agent HQ.
The kit is free, MIT, and works on any OpenClaw install. Agent HQ is the product we built on top of it.