Forgeloop // v2 beta

Your agents stop thrashing. You keep building.

Coding agents are powerful until they hit a wall — then they burn tokens retrying the same failure forever. Forgeloop catches that loop, pauses cleanly, saves every artifact, and shows you exactly what went wrong and what to do next. One install. Works with Claude, Codex, or any LLM.

Stops the spin

When your agent hits the same failure twice, Forgeloop pauses the run and writes a clean handoff. No more burning tokens on retries.

Everything in the repo

State, questions, escalations, and runtime status live in plain files in your repo. No external database. No second source of truth.

Prove it first

Run ./forgeloop.sh evals before you trust it. One command, real scenarios, real proof.

What you get

From install to a run you can actually explain to your team.

01 — A live dashboard for agent runs

See runtime state, active blockers, and pending questions in a real-time HUD — no log spelunking required.

live HUD

02 — One-command proof it works

Run ./forgeloop.sh evals to verify pause behavior, failure handling, and state transitions before you trust it.

verified

03 — You stay in control

Pause, resume, replan, answer blockers. The system never pretends to be fully autonomous — you decide when to intervene.

human-first

Built for how agents actually fail

Tests keep failing

Forgeloop catches the retry loop, pauses, and shows you the exact failure chain.

Auth expires mid-run

Provider failover kicks in. If both providers are down, Forgeloop pauses the run instead of burning tokens.

Agent needs a decision

The question gets written to a file. The run pauses. You answer when you're ready.
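The "Auth expires mid-run" case above can be pictured with a small shell sketch. This is an illustration under stated assumptions, not Forgeloop's code: the function name `run_with_failover`, the pause-file format, and exit code 3 are all hypothetical, and each provider is modeled as a command that exits non-zero when it is unavailable.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not Forgeloop's implementation.
# Try a primary provider command, fall back to a secondary one,
# and pause (instead of retrying) when both are unavailable.

run_with_failover() {
  local primary="$1" secondary="$2" pause_file="$3"

  if "$primary"; then
    echo "primary"
    return 0
  fi
  if "$secondary"; then
    echo "secondary"
    return 0
  fi
  # Both providers failed: record why and stop instead of spending more tokens.
  printf 'reason: all providers unavailable\n' > "$pause_file"
  echo "paused"
  return 3
}
```

A caller would check the exit status and leave the pause file in the repo for a human to read. The point is that the fallback chain is finite and ends in a pause, not a retry.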

01 // The problem

Agent runs fail. The question is whether you lose the work.

Every team running coding agents hits the same wall: the tests fail, the agent retries, burns through your API budget, and leaves you with nothing useful. Forgeloop exists so that when an agent run goes sideways, you get a clean stop, a preserved trail, and a clear next step — not a $200 invoice and an empty diff.

Catches the loop

When an agent retries the same failure, Forgeloop stops the run, writes the reason to your repo, and preserves every artifact so nothing is lost.
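One way to picture "catching the loop" is to fingerprint each failure and stop when the same fingerprint shows up twice in a row. The sketch below assumes exactly that; Forgeloop's actual detection logic may differ. The function `run_with_loop_guard`, the pause-file fields, and the exit codes are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not Forgeloop's implementation.
# Run a check command up to max_attempts times, but stop as soon as
# the same failure output is seen twice in a row.

run_with_loop_guard() {
  local max_attempts="$1" pause_file="$2"
  shift 2
  local attempt=1 last_fp="" fp out

  while [ "$attempt" -le "$max_attempts" ]; do
    if out="$("$@" 2>&1)"; then
      return 0                       # check passed, nothing to pause
    fi
    # Fingerprint the failure output so identical failures compare equal.
    fp="$(printf '%s' "$out" | cksum)"
    if [ "$fp" = "$last_fp" ]; then
      # Same failure twice: write the reason to a plain file and stop.
      {
        printf 'reason: repeated failure\n'
        printf 'attempts: %s\n' "$attempt"
        printf 'last output:\n%s\n' "$out"
      } > "$pause_file"
      return 2
    fi
    last_fp="$fp"
    attempt=$((attempt + 1))
  done
  return 1                           # attempts exhausted with differing failures
}
```

For example, `run_with_loop_guard 5 pause.txt make test` would stop after the second identical failure and leave the reason and the last output in `pause.txt`.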

One source of truth

Runtime state, pending questions, escalations, and blockers all live in plain files in your repo. The CLI, HUD, and daemon all read the same files.

You can review everything

Every pause, every escalation, every decision point is a file you can read, diff, and discuss in a PR. No hidden state, no magic.
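To make "every decision point is a file" concrete, here is a minimal sketch of a question-and-answer handoff built on plain files. The directory layout and the `.question` / `.answer` suffixes are assumptions made for this example; they are not Forgeloop's real file names or formats.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: the file layout below is illustrative only.
# Questions and answers are plain files, so they can be read, diffed,
# and reviewed like any other change in the repo.

ask_question() {
  local dir="$1" id="$2" text="$3"
  mkdir -p "$dir"
  printf '%s\n' "$text" > "$dir/$id.question"
}

answer_question() {
  local dir="$1" id="$2" text="$3"
  printf '%s\n' "$text" > "$dir/$id.answer"
}

# Print the id of every question that has no answer file yet.
pending_questions() {
  local dir="$1" q id
  for q in "$dir"/*.question; do
    [ -e "$q" ] || continue          # glob matched nothing
    id="$(basename "$q" .question)"
    [ -e "$dir/$id.answer" ] || printf '%s\n' "$id"
  done
}
```

A run would stay paused while `pending_questions` prints anything, and every surface that reads the same directory sees the same list.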

02 // How you use it

CLI, dashboard, or plugin — same truth everywhere.

Every way you interact with Forgeloop reads and writes the same repo-local files. Pick whatever fits your workflow.

CLI

Terminal-first workflow

Install, plan, build, run proofs, and manage everything from ./forgeloop.sh. No GUI required.

HUD

Live dashboard

Real-time view of runtime state, blockers, questions, ownership, and slot activity. Pause, resume, replan, and inspect bounded parallel slot work without touching the terminal.

OpenClaw plugin

Monitor and steer Forgeloop runs from inside OpenClaw. Same files, same state, same control actions.

Proof

Self-host proof

One command spins up an isolated test environment, drives the real HUD, and verifies everything works end to end.

03 // Product screenshots

Real HUD shots, generated from a seeded demo repo.

These are real screenshots from the shipped loopback HUD, rendered against a canonical demo repo for Signalboard. Regenerate them with ./bin/capture-product-screenshots.sh.

Operator HUD

The control-room view for real work: runtime truth, ownership, queue pressure, workflows, questions, escalations, and bounded slot orchestration from the same canonical loopback state.

real service, seeded repo state, reproducible

[Screenshot: Forgeloop operator HUD rendered against the Signalboard demo repo]

Director Mode / broadcast frame

The spectator-facing scene: recap rail, now/next/queue/live-feed layout, proof shelf, and broadcast framing — still tied to the same canonical loopback truth.

director mode, broadcast frame, same truth

[Screenshot: Forgeloop Director Mode rendered against the Signalboard demo repo]

04 // Trust, verified

Prove it works before you use it

Run the eval suite in under a minute. It tests the behaviors that actually matter: does it pause? Does it escalate? Does it preserve state?

Stable proof

./forgeloop.sh evals

daemon pause behavior
repeated-failure escalation
blocker handling
runtime-state transitions
layout portability

05 // End-to-end proof

Test the full stack in one command

Spins up the real HUD and service in an isolated repo, drives actual UI interactions, and saves screenshots as evidence. Run it before you demo.

Release proof

./forgeloop.sh self-host-proof

uses the real loopback service and HUD
drives the UI with agent-browser
checks pause, clear-pause, replan, and one-off plan behavior
keeps screenshots and logs for review
beta proof cadence is encoded in .github/workflows/v2-beta-proof.yml

Need the exact ship/no-ship bar? Read the v2 release checklist.

06 // Release tracks

Pick the track that matches your risk appetite.

v1 is battle-tested. v2 beta adds the live dashboard (HUD), the OpenClaw plugin, and a richer developer experience. Both share the same repo-local contract.

v1.0.0 stable

Ship with confidence

The stable track gives you the proven runtime: checklist loops, fail-closed pauses, clean escalations, and a public eval suite you can run in CI.

best for teams using agents on real projects today
verifiable with ./forgeloop.sh evals
clean upgrade path to v2 when you're ready

Moving from stable to beta? Start with the v1 → v2 upgrade guide.

main // v2 beta

Try the full experience

The beta track adds the live dashboard, real-time event streams, workflow packs, the OpenClaw plugin, and the self-host proof harness.

best for evaluation, demos, and building on top of Forgeloop
end-to-end proof via ./forgeloop.sh self-host-proof
everything v1 has, plus a richer developer experience

The current beta launch story and visual system are documented in design.md.

07 // How it works

Four steps from a brief to a run you trust.

Describe what you want to build. Let the agent loop against real checks. See shared state across every surface. Prove it works before you ship.

Step 01 Describe

Start from a one-paragraph brief. Forgeloop generates the spec, plan, and checklist so the agent has real structure to work against.

Step 02 Build

The agent plans and builds against your repo's real checks. When it gets stuck, Forgeloop pauses the loop instead of retrying forever.

Step 03 See everything

Runtime state, blockers, questions, and event history — visible from the terminal, the dashboard, or the OpenClaw plugin. Same files everywhere.

Step 04 Verify

Run the eval suite or the full self-host proof. Real scenarios. Real results. Know it works before you hand it real work.

08 // Quickstart

Four commands to see it for yourself.

Install into any repo. Spin up the dashboard. Run the proof. You'll know in under five minutes whether this is worth your time.

./install.sh /path/to/target-repo --wrapper

./forgeloop.sh serve

./forgeloop.sh evals

./forgeloop.sh self-host-proof
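If you want the two proof commands to act as a gate (in CI, or before a demo), a small wrapper can run them in order and stop at the first failure. This sketch assumes that `./forgeloop.sh` exits non-zero when a proof fails, which is not confirmed here. The function name `prove_before_trust` and its return codes are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the two proof commands from the quickstart.
# Assumes the script exits non-zero when a proof fails.

prove_before_trust() {
  local forgeloop="${1:-./forgeloop.sh}"

  # Fast check first: the eval suite.
  if ! "$forgeloop" evals; then
    echo "evals failed" >&2
    return 1
  fi
  # Then the slower end-to-end proof.
  if ! "$forgeloop" self-host-proof; then
    echo "self-host-proof failed" >&2
    return 2
  fi
  echo "both proofs passed"
}
```

Calling `prove_before_trust` with no argument uses `./forgeloop.sh` in the current directory. Passing a path lets you point it at a wrapper installed in another repo.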