Stop paying for tokens twice.
With clawback you see every wasted token — cold caches, duplicate context, dead time between turns — then claw it back automatically. Free, open source, and provable on your own machine in 90 seconds.
npx @zapgun/clawback quickstart
Local-only · Open source · Transparent passthrough · Your keys and prompts never leave your machine
~$60–$160
80%+ of typical* usage is recoverable.
* An estimate — the real number prints on your machine: npx @zapgun/clawback bench
Actual measured Fable-5 pre-ban benchmark
- Cache hits: 94%
- Model: Fable-5
- Clawed back: +2,153,358 tokens
See the waste. Claw it back. Track it all.
See the waste
Token-level visibility into every request — cold caches, duplicate context, dead spend. Your invisible bill, finally itemized.
Claw it back
One toggle each. clawback keeps context warm, strips the cruft, and manages the cache tier — so you stop paying twice. The suggestions engine tells you which knob pays off.
Track it all
Every run logged and itemized — cache hits, waste, tokens clawed back. Watch the numbers move on your own machine over time.
Every clawback is one toggle.
Each optimization ships with sensible defaults, and a built-in suggestions engine tells you when it pays off.
keep-alive
Stop cold starts
Your cached context is evicted after a few idle minutes. Clawback keeps it warm so you never pay twice for the same context.
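To make the idea concrete, here is a minimal sketch of how a keep-alive scheduler could decide when to refresh a cache entry. It is an illustration only, not clawback's implementation: the names `KeepAliveConfig` and `nextRefreshAt` are hypothetical, and the 5-minute idle window is an assumed value standing in for "a few idle minutes".

```typescript
// Hypothetical sketch: these names and values are assumptions, not clawback's API.
interface KeepAliveConfig {
  ttlMs: number;          // assumed idle window before the provider evicts the cache
  safetyMarginMs: number; // refresh this long before the window closes
  maxIdleMs: number;      // stop refreshing once the user has been away this long
}

// Returns the time (ms since epoch) at which a refresh should be sent,
// or null when the user has been idle so long that the cache should lapse.
function nextRefreshAt(
  lastCacheTouchMs: number,
  lastUserActivityMs: number,
  cfg: KeepAliveConfig
): number | null {
  const due = lastCacheTouchMs + cfg.ttlMs - cfg.safetyMarginMs;
  if (due - lastUserActivityMs > cfg.maxIdleMs) {
    return null; // keeping the cache warm would cost more than it saves
  }
  return due;
}

// Example values: 5-minute window, 30-second margin, give up after 1 hour idle.
const exampleConfig: KeepAliveConfig = {
  ttlMs: 5 * 60 * 1000,
  safetyMarginMs: 30 * 1000,
  maxIdleMs: 60 * 60 * 1000,
};
console.log(nextRefreshAt(0, 0, exampleConfig));
```

The idle cap matters because every refresh has a cost of its own. Past some point, letting the cache expire is cheaper than keeping it warm.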
1h cache ttl
Pick up where you left off
Manage the premium cache tier automatically — after a standup, a review, or lunch, your context is still there.
strip-ephemeral
Survive midnight
A rolling timestamp in your prompt silently evicts you at midnight. Clawback normalizes prompts so late-night sessions keep moving.
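A small sketch shows why a date in the prompt hurts, assuming the cache matches on the exact prompt text: when the date rolls over, the text changes and the cached prefix no longer matches. The function `stripEphemeral` and the `<date>` placeholder below are hypothetical names for illustration, not clawback's actual ruleset.

```typescript
// Hypothetical sketch: the pattern and placeholder are assumptions for illustration.
// Matches ISO-style dates, optionally followed by a time and a UTC offset.
const ISO_TIMESTAMP =
  /\b\d{4}-\d{2}-\d{2}(?:[T ]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:?\d{2})?)?/g;

// Replaces every timestamp with a fixed placeholder so the prompt text
// stays byte-identical from one day to the next.
function stripEphemeral(prompt: string): string {
  return prompt.replace(ISO_TIMESTAMP, "<date>");
}

const beforeMidnight = stripEphemeral("Current date: 2024-05-01\nYou are a coding agent.");
const afterMidnight = stripEphemeral("Current date: 2024-05-02\nYou are a coding agent.");
console.log(beforeMidnight === afterMidnight); // true
```

The trade-off is that the model no longer sees the date. A real implementation would likely move the date to a later, uncached part of the request rather than drop it.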
auto-continue
No babysitting
Resume your jobs automatically when your quota refills.
more…
Continuously improving
Stay one step ahead with continuous ruleset updates as we publish new optimizations.
passthrough
The off switch
Flip one toggle and clawback becomes a transparent pipe — a clean baseline to measure against.
Don't trust us. Prove it yourself.
Every number here reproduces on your machine in 90 seconds. Same harness we use internally — run it on your own loop and generate your own report.
[Chart: useful work per token, higher is better. Output of npx @zapgun/clawback bench on a tight agent loop.]
spend
Per-project spend
See where AI budget creates value — attributed by project and workflow, not by person.
routing
Smart model routing
Send easy work to a smaller model automatically; save the frontier model for hard problems.
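As a rough illustration, a routing rule can be as simple as a threshold check. The sketch below is not clawback's router: the `RoutingInput` shape, the `pickTier` function, and the 2,000-token cutoff are all assumptions chosen for the example.

```typescript
// Hypothetical sketch: field names and the cutoff are assumptions, not clawback's API.
type Tier = "small" | "frontier";

interface RoutingInput {
  promptTokens: number; // estimated size of the request
  needsTools: boolean;  // whether the task needs multi-step tool use
}

// Routes short requests without tool use to the smaller model and sends
// everything else to the frontier model.
function pickTier(input: RoutingInput, smallMaxTokens: number = 2000): Tier {
  if (input.needsTools) {
    return "frontier";
  }
  return input.promptTokens <= smallMaxTokens ? "small" : "frontier";
}

console.log(pickTier({ promptTokens: 400, needsTools: false }));
```

A real router would probably weigh more signals than these two, but the shape stays the same: a cheap classification step runs before the expensive call.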
reserves
Reserves & alarms
Hold back tokens for tests and CI, and get alerted before a runaway job drains the day.
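One way to picture a reserve with an alarm is as a daily budget with a protected slice. The sketch below uses hypothetical names (`Budget`, `checkSpend`) and made-up numbers. It shows the general technique, not how clawback implements it.

```typescript
// Hypothetical sketch: names and numbers are assumptions for illustration.
interface Budget {
  dailyTokens: number;    // total tokens available per day
  reservedTokens: number; // slice held back for tests and CI
  alarmFraction: number;  // alert once this share of the day is spent
}

interface Decision {
  allowed: boolean; // may this request proceed?
  alarm: boolean;   // should an alert fire?
}

// Ordinary work may spend only the unreserved part of the budget.
// Reserved workloads (tests, CI) may spend up to the full daily total.
function checkSpend(
  budget: Budget,
  usedTokens: number,
  requestTokens: number,
  isReservedWorkload: boolean
): Decision {
  const ceiling = isReservedWorkload
    ? budget.dailyTokens
    : budget.dailyTokens - budget.reservedTokens;
  const after = usedTokens + requestTokens;
  return {
    allowed: after <= ceiling,
    alarm: after >= budget.alarmFraction * budget.dailyTokens,
  };
}

const exampleBudget: Budget = { dailyTokens: 1000, reservedTokens: 200, alarmFraction: 0.7 };
console.log(checkSpend(exampleBudget, 700, 150, false));
```

With these example numbers, an ordinary request that would take usage from 700 to 850 tokens is refused because it would dip into the reserve. The same request from a CI job is allowed.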
local
Readable & local
Runs on your machine, in code you can read. Nothing about your prompts leaves your control.
Start free. Bring the team when you're ready.
Solo
The full gateway and every optimization, running on your machine.
- Transparent local gateway
- All caching optimizations
- Token-level spend dashboard — see exactly what's recoverable
- Suggestions engine
- Readable, self-hostable code
Team
Everything in Solo, plus shared visibility and controls. You only pay when we recover spend. Pricing is based on results, not seats.
- Per-project spend attribution
- Smart model routing
- Token reserves & alarms
- Shared configuration
- Priority support
Claw it back.
One command. Sensible defaults. Code you can read. See your waste — and claw it back — in minutes.
Drop your email and we'll send the recovered-spend report when it's ready.