Forgeloop

Skills-Driven Development: kickoff → plan → skill preflight → build

Build in loops. Forge skills. Ship with backpressure.

Forgeloop is a portable workflow for AI-assisted development. It installs prompts + templates and gives you a repeatable loop: kickoff (optional) → plan (specs to tasks) → build (tasks to code with backpressure from tests). Then it adds Skills-Driven Development: before each task, forge reusable Skills (plain Markdown SOPs) and sync them into Codex / Claude Code.


One-command runner (GCP)

Full-auto works best when the VM is the security boundary. Provision a dedicated runner and keep your laptop out of the blast radius.

OPENAI_API_KEY=... ANTHROPIC_API_KEY=... \
  ops/gcp/provision.sh --name forgeloop-runner \
  --project <gcp-project> --zone us-central1-a

Inspiration & Sources

Standing on the shoulders of giants. These projects shaped Forgeloop.

ghuntley.com/forgeloop, how-to-ralph-wiggum, marge-simpson, compound-product

What it adds

The core loop can be as small as "feed a prompt to an agent, repeat." Forgeloop keeps that simplicity while adding structure: specs/templates, planning files, safer defaults, and a repeatable runtime.

plan-work, multi-model routing, GCP provisioner, kickoff prompt, .forgeloop runtime, tasks lane, report ingestion, log ingestion, shared libraries, skills-driven dev, sync-skills, knowledge persistence, domain experts, lite mode

Quick install

Install the kit into any repo (writes docs + prompts + a wrapper), then run planning/build loops. You'll get a consistent file layout + prompts — plus a small Skills library you can extend as you go.

REPO=/path/to/target-repo
./install.sh "$REPO" --wrapper --skills
cd "$REPO"
./forgeloop.sh sync-skills
./forgeloop.sh plan 1
./forgeloop.sh build 10
# Or use tasks lane: ./forgeloop.sh tasks 10
# Continuous mode: add --watch or --infinite

--batch (CI mode), --interactive, merge on conflict

How it works

Each session starts by loading persistent knowledge (decisions, patterns, preferences) and detecting relevant domain experts. Then the loop runs: kickoff (optional) → planning → building with backpressure. At session end, new knowledge is captured. Add Skills-Driven Development so your agents get better at your workflow over time.

flowchart LR
    subgraph SESSION["SESSION START"]
        direction TB
        SS[session-start.sh]
        KL[Load knowledge]
        EL[Detect experts]
        SS --> KL --> EL
    end

    subgraph KICKOFF["1. KICKOFF (optional)"]
        K1[Memory-backed agent]
        K2[docs/* + specs/*]
        K1 --> K2
    end

    subgraph PLAN["2. PLAN"]
        P1[Compare specs to code]
        P2[IMPLEMENTATION_PLAN.md<br/>or prd.json]
        P1 --> P2
    end

    subgraph BUILD["3. BUILD (with SDD preflight)"]
        S1{Should this become a Skill?}
        S2[Forge or update Skill: skillforge]
        S3[Sync skills: sync-skills]
        B1[Implement task]
        B2[Run tests/lint/types]
        B3{Pass?}
        B4[Next task]
        B5[Fix + retry]
        S1 -->|Yes| S2
        S2 --> S3
        S1 -->|No| B1
        S3 --> B1
        B1 --> B2
        B2 --> B3
        B3 -->|Yes| B4
        B3 -->|No| B5
        B5 -->|Backpressure| B1
        B4 --> S1
    end

    subgraph CAPTURE["SESSION END"]
        SE[session-end.sh]
        KC[Capture knowledge]
        SE --> KC
    end

    SESSION --> KICKOFF
    KICKOFF --> PLAN
    PLAN --> BUILD
    BUILD --> CAPTURE

Daemon mode: Optionally run ./forgeloop.sh daemon 300 to watch git + REQUESTS.md and run loops automatically. Control with [PAUSE], [REPLAN], [DEPLOY], [INGEST_LOGS], [KNOWLEDGE_SYNC] flags.
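
To make the control flags concrete, here is a minimal sketch of how one polling pass could read them. It assumes the flags appear as literal bracketed tokens in REQUESTS.md; `has_flag`, `next_action`, and the priority order are hypothetical and not taken from forgeloop-daemon.sh.

```shell
# Sketch only: has_flag and next_action are illustrative helpers.
# Assumption: control flags are literal bracketed tokens in REQUESTS.md.
has_flag() {
  # $1 = flag name without brackets, $2 = file to scan (default REQUESTS.md)
  grep -qF "[$1]" "${2:-REQUESTS.md}" 2>/dev/null
}

# Decide what one polling pass should do.
# The priority order (pause > replan > deploy > build) is an assumption.
next_action() {
  local file="${1:-REQUESTS.md}"
  if has_flag PAUSE "$file"; then
    echo "pause"
  elif has_flag REPLAN "$file"; then
    echo "replan"
  elif has_flag DEPLOY "$file"; then
    echo "deploy"
  else
    echo "build"
  fi
}
```

A daemon-style loop would call `next_action` once per interval and dispatch on the result.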

Kickoff (new projects)

Use a memory-backed agent (ChatGPT/Claude Projects, long-running agent, etc.) to draft docs/* + specs/*. Forgeloop then plans from those artifacts.

Plan (specs → checklist)

Planning mode compares specs to code, then writes a prioritized checklist to IMPLEMENTATION_PLAN.md with test expectations.

Build (tasks → code + backpressure)

Building mode executes the next task, runs tests, and updates plan/status/changelog. Failures trigger retries — that's backpressure steering the agent.
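
Backpressure is easier to see in code. The sketch below is not the kit's loop.sh: `run_with_backpressure` and its two callback arguments are hypothetical names, and the real loop also updates plan/status/changelog files, which is left out here.

```shell
# Sketch only: run_with_backpressure is an illustrative name, not part of the kit.
# Usage: run_with_backpressure <max_attempts> <implement_cmd> <check_cmd>
# Re-runs the implement step until the check step passes or attempts run out.
run_with_backpressure() {
  local max_attempts="$1" implement_cmd="$2" check_cmd="$3"
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Pass the attempt number so the step knows whether it is retrying.
    "$implement_cmd" "$attempt"
    if "$check_cmd"; then
      echo "passed on attempt $attempt"
      return 0
    fi
    echo "checks failed on attempt $attempt; retrying" >&2
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempts" >&2
  return 1
}
```

The check command is where tests, lint, and type checks would plug in; a failing check is the only thing that sends the loop back to the implement step.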

Full-auto ≠ reckless: in auto-permissions mode, the agent can run arbitrary commands. Run in a dedicated VM/container. Treat the environment as disposable. See docs/sandboxing.md.

Skills-Driven Development

Every loop gets a preflight: before you implement the next task, decide if the work should become a reusable Skill. Skills are plain Markdown SOPs (optionally with scripts) that you version with your repo and compose into pipelines.

Operational (ICs)

One job. Repeat forever. The “how we do X here” playbook that keeps agents consistent across repos and time.

single-purpose scripts/ references/

Meta (managers)

Plan, gate, and steer the loop. Keep standards consistent. Turn “good taste” into a repeatable protocol.

project-architect skillforge completion-director

Composed (middle management)

Chain skills into a delivery pipeline: brief → plan → forge → execute → validate. Build your internal “skill factory”.

builder-loop voltrons

Forge + sync

Project skills live at skills/ (repo root). The kit ships a base library under forgeloop/skills/. Sync them into .claude/skills (Claude Code) and .codex/skills (Codex) so agents can discover and reuse them. If a sandbox blocks Codex mirroring or a destination exists as a non-symlink (your custom skill), sync-skills will warn and skip.

./forgeloop.sh sync-skills
# Optional: also install to user skill dirs (Codex/Claude/Amp)
./forgeloop.sh sync-skills --all
# Force overwrite non-symlink collisions
./forgeloop.sh sync-skills --force-symlinks
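
The warn-and-skip rule can be sketched in a few lines. `mirror_skill` is a hypothetical helper written for illustration, not the kit's sync-skills.sh, and the message wording is made up.

```shell
# Sketch only: mirror_skill is an illustrative helper, not the kit's sync-skills.sh.
# Links one skill directory into a destination directory and skips anything
# at the destination that exists but is not a symlink (for example a custom skill).
mirror_skill() {
  local src="$1" dest_dir="$2"
  local dest="$dest_dir/$(basename "$src")"
  mkdir -p "$dest_dir"
  if [ -e "$dest" ] && [ ! -L "$dest" ]; then
    echo "warn: $dest exists and is not a symlink; skipping" >&2
    return 0
  fi
  # -n stops ln from descending into an existing symlinked directory; -f replaces it.
  ln -sfn "$src" "$dest"
}
```

Called once per skill, for example `mirror_skill "$PWD/skills/operational/my-skill" .claude/skills`, where `my-skill` is a placeholder name.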

Skill factory mindset: if you had to explain a step twice, it probably wants to be a Skill. Keep them small, name them clearly, and let them compound.

Knowledge & Experts

Integrated from marge-simpson: persistent memory across sessions and domain expert routing for specialized guidance.

Knowledge Persistence

Session-to-session memory stored in system/knowledge/. Tracks decisions, patterns, preferences, and codebase insights. Entries decay after 90+ days without access.

decisions.md patterns.md preferences.md insights.md

./forgeloop.sh session-start   # Load context
./forgeloop.sh session-end     # Capture knowledge

Domain Expert System

Specialized guidance loaded from system/experts/ based on task keywords. Experts provide guidance; Skills provide procedures. Use both together.

architecture security testing implementation devops
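
Keyword-based expert detection could look roughly like this. The keyword lists are invented for illustration and `detect_experts` is a hypothetical name; only the expert file paths come from this document.

```shell
# Sketch only: detect_experts and its keyword lists are assumptions.
# Prints one expert file path per matching domain, in a fixed order.
detect_experts() {
  local task
  # Lowercase the task text so matching is case-insensitive.
  task=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$task" in *auth*|*secret*|*token*) echo "system/experts/security.md" ;; esac
  case "$task" in *test*|*coverage*) echo "system/experts/testing.md" ;; esac
  case "$task" in *deploy*|*docker*|*pipeline*) echo "system/experts/devops.md" ;; esac
  return 0
}
```

A task can match several experts at once, which fits the "use both together" advice: the matched files would be loaded as guidance alongside whatever Skills apply.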

Lite Mode

Lite mode is for simple one-shot tasks that don't need the full planning overhead. Pass --lite to execute directly, without status tracking or iteration.

./forgeloop.sh build --lite 1   # One-shot, uses AGENTS-lite.md
./forgeloop.sh build --full 10  # Full mode (default)

Two Workflow Lanes

Forgeloop supports two approaches to task tracking. Pick based on your workflow: human-in-the-loop vs full automation.

Checklist Lane (default)

Uses IMPLEMENTATION_PLAN.md with markdown checkboxes. Best for human-in-the-loop workflows where you want to review and modify the plan.

./forgeloop.sh plan 1
./forgeloop.sh build 10

human-readable editable
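
Because the plan is plain Markdown, picking the next task is a one-liner. This sketch assumes GitHub-style `- [ ]` / `- [x]` checkboxes; `next_task` is a hypothetical helper, not something the kit ships.

```shell
# Sketch only: next_task is an illustrative helper.
# Assumption: tasks are written as "- [ ] text" (open) or "- [x] text" (done).
next_task() {
  # Print the first unchecked item with its checkbox prefix stripped.
  grep -m1 -E '^[[:space:]]*- \[ \] ' "${1:-IMPLEMENTATION_PLAN.md}" \
    | sed -E 's/^[[:space:]]*- \[ \] //'
}
```

Since the file stays human-editable, reordering lines by hand is enough to change what the loop picks up next.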

Tasks Lane (optional)

Uses prd.json with machine-readable passes: true/false flags. Best for full automation with structured task definitions.

./forgeloop.sh tasks 10

machine-readable progress.txt
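
A rough way to check progress in the tasks lane without extra tooling. The prd.json schema is not shown here, so this sketch assumes each task keeps its `"passes"` key on its own line; `count_pending` is a hypothetical helper, and a JSON-aware tool would be more robust.

```shell
# Sketch only: count_pending is an illustrative helper.
# Assumption: each task's "passes" key sits on its own line in prd.json.
count_pending() {
  # grep -c prints the number of matching lines; it exits 1 when that
  # number is 0, so "|| true" keeps the function's exit status clean.
  grep -cE '"passes"[[:space:]]*:[[:space:]]*false' "${1:-prd.json}" || true
}
```

A wrapper could keep looping while `count_pending` prints a non-zero number.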

Comparison

|  | Checklist Lane | Tasks Lane |
| --- | --- | --- |
| Task file | IMPLEMENTATION_PLAN.md | prd.json |
| Progress | Markdown checkboxes | passes: true/false |
| Run command | ./forgeloop.sh build N | ./forgeloop.sh tasks N |
| Best for | Human review/edits | Full automation |
| Tracking | STATUS.md | progress.txt |

What gets added to your repo

Forgeloop steers by signs: prompts, operational notes, patterns in your codebase, and backpressure from tests/typecheck/lint. This kit drops in the structure so the signs are consistent — plus a typed Skills library (forgeloop/skills) and room for repo-specific skills (skills/) so your workflow compounds over time.

File layout

Installer writes prompts + coordination files at repo root, and vendors the kit at ./forgeloop.

./
├─ AGENTS.md
├─ AGENTS-lite.md           # Lite mode (one-shot tasks)
├─ PROMPT_plan.md
├─ PROMPT_plan_work.md
├─ PROMPT_build.md
├─ PROMPT_tasks.md          # Tasks lane prompt
├─ IMPLEMENTATION_PLAN.md
├─ REQUESTS.md
├─ QUESTIONS.md
├─ STATUS.md
├─ CHANGELOG.md
├─ prd.json                 # (optional) Tasks lane task file
├─ progress.txt             # (optional) Tasks lane progress
├─ system/
│  ├─ knowledge/            # Session memory (from marge-simpson)
│  │  ├─ _index.md
│  │  ├─ decisions.md
│  │  ├─ patterns.md
│  │  ├─ preferences.md
│  │  ├─ insights.md
│  │  └─ archive.md
│  └─ experts/              # Domain guidance (from marge-simpson)
│     ├─ _index.md
│     ├─ architecture.md
│     ├─ security.md
│     ├─ testing.md
│     ├─ implementation.md
│     └─ devops.md
├─ .claude/
│  └─ skills/               # (generated) Claude Code skill mirror
├─ .codex/
│  └─ skills/               # (generated) Codex skill mirror
├─ skills/                  # (optional) your project skills
│  ├─ operational/
│  ├─ meta/
│  └─ composed/
├─ specs/
│  ├─ feature_template.md
│  └─ ...
├─ docs/
│  ├─ README.md
│  └─ ...
└─ forgeloop/
   ├─ bin/
   │  ├─ loop.sh            # Main build loop
   │  ├─ loop-tasks.sh      # Tasks lane loop
   │  ├─ forgeloop-daemon.sh    # Daemon mode
   │  ├─ ingest-report.sh   # Report ingestion
   │  ├─ ingest-logs.sh     # Log ingestion
   │  ├─ kickoff.sh         # Kickoff helper
   │  ├─ sync-skills.sh     # Skills discovery (Claude Code / Codex / Amp)
   │  ├─ session-start.sh   # Load knowledge context
   │  └─ session-end.sh     # Capture session knowledge
   ├─ lib/
   │  ├─ core.sh            # Logging, notifications, git helpers
   │  └─ llm.sh             # LLM routing with failover
   ├─ skills/
   │  ├─ operational/
   │  │  ├─ prd/SKILL.md                # PRD generation skill
   │  │  └─ tasks/SKILL.md              # PRD → prd.json conversion
   │  ├─ meta/
   │  │  ├─ skillforge/SKILL.md         # Scaffold new Skills
   │  │  ├─ project-architect/SKILL.md  # Plan + skill opportunities
   │  │  └─ completion-director/SKILL.md # Closed-loop execution
   │  └─ composed/
   │     └─ builder-loop/SKILL.md       # End-to-end orchestration
   ├─ config.sh
   └─ ...

Routing + knobs

Works with Codex and/or Claude. By default: Codex plans/reviews, Claude builds. Override via env.

AI_MODEL=claude|codex: force one model
FORGELOOP_AUTOPUSH: false by default
FORGELOOP_PLAN_AUTOPUSH: push on plan/plan-work iterations (no CI gate)
FORGELOOP_ALLOW_PRD_VERIFY_CMD: allow verify_cmd from prd.json (tasks lane)
./forgeloop.sh sync-skills: refresh skill discovery (Claude Code; Codex mirror when writable)
FORGELOOP_TEST_CMD: run after review auto-fixes
FORGELOOP_DEPLOY_CMD: used by daemon on [DEPLOY]
FORGELOOP_INGEST_LOGS_CMD / FORGELOOP_INGEST_LOGS_FILE: used by daemon on [INGEST_LOGS]
CODEX_PLANNING_CONFIG / CODEX_REVIEW_CONFIG: reasoning tuning
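
The default routing and the AI_MODEL override can be sketched as a small function. `pick_model` and its mode names are hypothetical; the kit's lib/llm.sh also handles failover, which this sketch leaves out.

```shell
# Sketch only: pick_model is an illustrative helper, not the kit's lib/llm.sh.
# $1 = mode (plan, review, or build). Prints the model name to use.
pick_model() {
  # An explicit AI_MODEL wins for every mode.
  if [ -n "${AI_MODEL:-}" ]; then
    echo "$AI_MODEL"
    return 0
  fi
  # Defaults from the text: Codex plans and reviews, Claude builds.
  case "$1" in
    plan|review) echo "codex" ;;
    build)       echo "claude" ;;
    *)           echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

For example, `AI_MODEL=claude ./forgeloop.sh plan 1` would force Claude for planning too, per the override described above.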

Tune it like a guitar: if Forgeloop is producing the wrong shape of code, don’t only tweak prompts — add better utilities/patterns and strengthen backpressure. The repo itself becomes the steering wheel.

How to use it

Two common paths: start a new repo from scratch (kickoff + plan + build), or augment an existing repo (specs + plan + build). Either way, the loop is the same.

New project (greenfield)

Install kit + wrapper:
```
./install.sh /path/to/your/new-repo --wrapper --skills
cd /path/to/your/new-repo
./forgeloop.sh sync-skills
```
Tip: in a TTY, the installer prompts for skip/overwrite/merge/diff whenever a file already exists. Use --batch for CI or --force to overwrite everything.
Generate a kickoff prompt (paste into your memory-backed agent):
```
./forgeloop.sh kickoff "<one paragraph project brief>"
```
Apply the patch your memory-backed agent returns (creates docs/*, specs/*, and a solid plan). Then run:
```
./forgeloop.sh plan 1
./forgeloop.sh build 10
# Add --watch or --infinite for continuous looping
```

Read the full kickoff workflow: docs/kickoff.md

Existing repo (augmentation)

Install kit + wrapper:
```
./install.sh /path/to/existing-repo --wrapper --skills
cd /path/to/existing-repo
./forgeloop.sh sync-skills
```
When files exist: skip (default), overwrite, merge (append template), or diff (view changes).
Write/curate specs for the system you want (3–8 files is a great start). Then run planning:
```
./forgeloop.sh plan 1
```

Build in loops with backpressure:

./forgeloop.sh build 10
# Add --watch or --infinite for continuous looping

Tip: use plan-work on a work branch when you want a scoped plan: ./forgeloop/bin/loop.sh plan-work "scope"

Provision a Forgeloop-equipped VM (GCP)

The whole point: run full-auto without giving an agent access to your personal machine. Provision a dedicated runner VM, then clone your repo and loop.

One command

Requires gcloud on your laptop. The script uploads the kit to the VM, installs Node/pnpm + agent CLIs (best-effort), and stores keys in /etc/forgeloop/keys.env.

OPENAI_API_KEY=... ANTHROPIC_API_KEY=... \
  ops/gcp/provision.sh --name forgeloop-runner \
  --project <gcp-project> --zone us-central1-a

Security: treat runners as disposable. Use least-privilege tokens. Never put personal SSH keys or browser cookies on the VM.

After it’s up

SSH in, clone your target repo, install the kit, and run loops in tmux.

gcloud compute ssh forgeloop-runner \
  --project <gcp-project> --zone us-central1-a

mkdir -p ~/work && cd ~/work
git clone <your-repo-url> repo
/opt/forgeloop/install.sh ~/work/repo --wrapper

cd ~/work/repo
./forgeloop.sh plan 1
./forgeloop.sh build 10

More details: ops/gcp/README.md and docs/sandboxing.md