We build in public

The reliability work behind Agent HQ, shipped openly.

Anthropic built Kairos for Claude Code behind closed doors. We're building ours for OpenClaw in public — MIT-licensed, community-driven, battle-tested on the same infrastructure that powers Agent HQ. When OpenClaw breaks, we diagnose, patch, and ship the fix back to the world.

MIT licensed

Used in production at JahFeel

Community feedback loop live

🔧

openclaw-survival-kit v1.0

Diagnostic and repair tools for OpenClaw gateways that don't stay up.

Two tools, both shipping. gateway-keeper is a supervisor with runtime patches for upstream issues #29827, #47931, #30183, and #51010. claw-medic is a one-shot CLI that runs 10 end-to-end health checks and applies fixes with a single flag. Every fix is backed by an upstream issue number and a commit that explains why it exists.

Health checks

Upstream bugs patched

MIT

License

Telemetry / lock-in

View on GitHub → Read the docs

claw-medic — live diagnostic

$ python3 claw_medic.py

Gateway port (resolved): 18789

[ OK ] gateway_process — gateway node running (pid 49672)

[ OK ] port_bound — listening on 18789

[ OK ] http_health — /healthz responded 200 in 34ms

[ OK ] startup_mechanism — Startup-folder gateway.cmd detected

[WARN] watchdog_process — no watchdog process found

Suggested fix: start openclaw-watchdog.ps1

[ OK ] bootstrap_budget — 42,180 / 150,000 chars (healthy)

[ OK ] gateway_log — no errors in last 24h (217 entries)

Summary: 6 ok, 1 warn, 0 fail

# apply suggested fixes with --fix, or --fix --conservative to protect child services


What's inside v1.0

Ten checks. Five fixes. Zero guesswork.

Each check reads one concrete piece of system state — a process, a port, a log, a config key — and reports OK, WARN, or FAIL with a one-line explanation. If there's a fix, it's one flag away.
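To make the OK / WARN / FAIL contract concrete, here is a minimal sketch of how a check result could be modeled and printed. It is not the kit's actual code: `Status`, `CheckResult`, `format_result`, and `summarize` are hypothetical names, and the output format only loosely mirrors the terminal session above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    OK = "OK"
    WARN = "WARN"
    FAIL = "FAIL"


@dataclass
class CheckResult:
    name: str                            # e.g. "port_bound"
    status: Status
    detail: str                          # the one-line explanation
    suggested_fix: Optional[str] = None  # present only when a fix exists


def format_result(result: CheckResult) -> str:
    """Render one result as '[ OK ] name - detail', plus the fix if any."""
    line = f"[{result.status.value:^4}] {result.name} - {result.detail}"
    if result.suggested_fix:
        line += f"\n    Suggested fix: {result.suggested_fix}"
    return line


def summarize(results: List[CheckResult]) -> str:
    """Count results per status for the closing summary line."""
    counts = {status: 0 for status in Status}
    for result in results:
        counts[result.status] += 1
    return (f"Summary: {counts[Status.OK]} ok, "
            f"{counts[Status.WARN]} warn, {counts[Status.FAIL]} fail")
```

Keeping every check behind the same small result type is what lets a single summary line and a single fix flag cover all of them.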

⚡

Gateway process health

Verifies the OpenClaw gateway node process is alive, the configured port is actually bound, and the /healthz endpoint responds.

3 checks
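A rough sketch of the port and /healthz checks, using only the Python standard library. The function names, the 127.0.0.1 host, and the timeouts are assumptions for illustration; the kit's own implementation may differ.

```python
import socket
import time
import urllib.error
import urllib.request


def port_bound(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def http_health(port: int, host: str = "127.0.0.1", timeout: float = 2.0):
    """GET /healthz and return (ok, one-line detail)."""
    url = f"http://{host}:{port}/healthz"
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            elapsed_ms = int((time.monotonic() - start) * 1000)
            detail = f"/healthz responded {resp.status} in {elapsed_ms}ms"
            return resp.status == 200, detail
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"/healthz responded {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing answered: refused connection, timeout, and similar.
        return False, f"/healthz unreachable: {exc}"
```

Checking the port separately from the HTTP endpoint distinguishes "nothing is listening" from "something is listening but unhealthy".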

🔄

Auto-detected startup

Detects how OpenClaw launches on your machine: Scheduled Task, Startup-folder gateway.cmd, systemd, or launchd. No hardcoded assumptions.

cross-platform

🛡️

--conservative mode

The --fix flag normally runs openclaw gateway install --force when needed. --conservative skips that so it doesn't cascade-kill child services you care about.

child-service safe
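One way to picture how the two flags interact is the sketch below. Only the flag names and the `openclaw gateway install --force` command come from the description above; `parse_args`, `plan_fixes`, and the `needs_reinstall` input are hypothetical, and the function only plans commands without running anything.

```python
import argparse

# Command quoted in the description above.
REINSTALL_CMD = ["openclaw", "gateway", "install", "--force"]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="claw_medic.py")
    parser.add_argument("--fix", action="store_true",
                        help="apply suggested fixes")
    parser.add_argument("--conservative", action="store_true",
                        help="with --fix, skip the forced gateway reinstall")
    return parser.parse_args(argv)


def plan_fixes(args, needs_reinstall: bool):
    """Return the commands a run would execute; nothing is executed here."""
    if not args.fix:
        return []
    commands = []
    # The forced reinstall is the step that can take child services down,
    # so it is the one --conservative leaves out.
    if needs_reinstall and not args.conservative:
        commands.append(REINSTALL_CMD)
    return commands
```

For example, `plan_fixes(parse_args(["--fix", "--conservative"]), needs_reinstall=True)` returns an empty list in this sketch, while dropping `--conservative` returns the reinstall command.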

📊

24-hour log triage

Reads the gateway log tail, ignores anything older than 24 hours, flags only what's happening NOW. Stale errors from weeks ago don't make noise.

signal, not spam
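The 24-hour window can be sketched as a simple timestamp filter. The log format here is an assumption (each line starts with an ISO-8601 timestamp, then a level, then a message), as are the `recent_errors` name and treating naive timestamps as UTC; the real gateway log may look different.

```python
from datetime import datetime, timedelta, timezone


def recent_errors(lines, now=None, window=timedelta(hours=24)):
    """Return (entries_in_window, error_lines_in_window)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    total = 0
    errors = []
    for line in lines:
        stamp, _, rest = line.partition(" ")
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # no parseable timestamp: skip the line
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        if ts < cutoff:
            continue  # stale entry, outside the window
        total += 1
        if rest.startswith("ERROR"):
            errors.append(line)
    return total, errors
```

Counting all in-window entries alongside the errors is what allows a report like "no errors in last 24h (217 entries)".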

📮

Community feedback loop

claw-medic --report generates a PII-scrubbed diagnostic JSON + pre-filled GitHub issue URL when a fix didn't work for you. Nothing auto-posts.

v1.0 new
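As an illustration of the report flow, the sketch below scrubs usernames and home-directory paths from a results dictionary, serializes it to JSON, and builds a pre-filled issue link using GitHub's `title` and `body` query parameters. The repository URL, the scrub rules, and all function names are hypothetical; the kit's actual scrubbing is likely more thorough.

```python
import json
import re
from urllib.parse import urlencode

# Hypothetical repository URL; substitute the kit's real one.
REPO_ISSUES_URL = "https://github.com/example-org/openclaw-survival-kit/issues/new"

# Home-directory prefixes followed by a user name (Windows, Linux, macOS).
HOME_PATH = re.compile(r'(?i)([A-Z]:\\Users\\|/home/|/Users/)[^\\/\s"]+')


def scrub(text: str, username: str) -> str:
    """Replace home-directory user names and the bare username."""
    text = HOME_PATH.sub(r"\1<user>", text)
    if username:
        text = re.sub(re.escape(username), "<user>", text, flags=re.IGNORECASE)
    return text


def build_report(results: dict, username: str) -> str:
    """Scrub every value, then serialize to stable, readable JSON."""
    clean = {key: scrub(str(value), username) for key, value in results.items()}
    return json.dumps(clean, indent=2, sort_keys=True)


def issue_url(title: str, body: str) -> str:
    """Build a link that opens a pre-filled issue form; nothing is posted."""
    return f"{REPO_ISSUES_URL}?{urlencode({'title': title, 'body': body})}"
```

Because the link only opens a pre-filled form, the user still reviews and submits the report by hand.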

🔍

Bootstrap budget watch

Tracks your workspace's bootstrap size against OpenClaw's 150,000-character limit. Catches context bloat before it hits you in production.

upstream #27807
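A minimal sketch of the budget check, assuming the bootstrap content is the set of Markdown files at the top of a workspace directory. That file selection, the 80% warning threshold, and the `bootstrap_budget` name are all assumptions; only the 150,000-character limit comes from the text above.

```python
from pathlib import Path

BOOTSTRAP_LIMIT_CHARS = 150_000  # limit quoted in the text above
WARN_RATIO = 0.8                 # assumption: warn at 80% of the limit


def bootstrap_budget(workspace: Path, pattern: str = "*.md"):
    """Sum the characters in matching files and compare to the limit."""
    used = sum(
        len(path.read_text(encoding="utf-8", errors="replace"))
        for path in sorted(workspace.glob(pattern))
        if path.is_file()
    )
    if used > BOOTSTRAP_LIMIT_CHARS:
        status = "FAIL"
    elif used > BOOTSTRAP_LIMIT_CHARS * WARN_RATIO:
        status = "WARN"
    else:
        status = "OK"
    return status, f"{used:,} / {BOOTSTRAP_LIMIT_CHARS:,} chars"
```

Warning before the hard limit gives you time to trim context instead of finding out when the gateway refuses to load it.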

🧹

Orphan scheduled tasks

Finds OpenClaw scheduled tasks that have never run (usually legacy clutter from older versions). --cleanup-orphans removes them and, if access is denied, prints the elevated command to run instead.

Windows

🎯

Session-1 desktop check

Opt-in check (--require-session 1) that asserts the gateway is in an interactive user session — required for the desktop-control skill to actually work.

Windows · opt-in

📦

Zero telemetry

No daemon. No phone-home. No config file. No lock-in. Runs on-demand, writes to stdout or JSON, goes away when it's done.

your data · your machine

Why it's public

Same code we run. Same bugs you hit.

We're not running a different version of OpenClaw with secret patches. If your gateway crashes at 2 AM, ours probably crashed too. The kit is how we kept Agent HQ up while we figured it out — and open-sourcing it is how we make sure the next person doesn't have to start from zero.

The kit started as one Python script I wrote at midnight after the gateway went down for the third time in a week. It worked. I wrote two more. They worked too.

At some point it became obvious this wasn't Agent HQ code — it was OpenClaw reliability work that belonged in the OpenClaw commons. So we pulled it out, MIT-licensed it, and started shipping fixes in the open.

Every commit in the repo is a real bug we hit. Every version bump ships a fix someone asked for. Every issue filed against the kit becomes input for the next version. That's the deal.

Built for agents that need to stay up · maintained by JahFeel Automation

Also worth your time

Complementary work we lean on.

The kit doesn't try to solve everything. Other maintainers have done excellent work in adjacent corners of the OpenClaw ecosystem — if your pain is in one of those corners, start with them.

@wassupjay

OpenClaw-Token-Optimization

Deep work on shrinking the OpenClaw context window footprint — useful if you're blowing through bootstrap budget.

@aaronjmars

soul.md

Structured agent personality + operating-system format. If you want your agent to have a real voice, start here.

@DanAndBub

Driftwatch

Heartbeat drift monitor for OpenClaw. Our kit checks for drift; Driftwatch measures and graphs it over time.

Try the kit. Or try Agent HQ.

The kit is free, MIT, and works on any OpenClaw install. Agent HQ is the product we built on top of it.

Open the kit on GitHub → Explore Agent HQ