SET

Ship Exactly This.

Give it a spec. Get merged features.

Autonomous multi-agent orchestration for Claude Code. We don't prompt — we specify. Every change is planned through OpenSpec, verified by quality gates, and merged automatically.

$ git clone https://git.setcode.dev/root/set-core.git && cd set-core && ./install.sh
// after install:
1. Open http://localhost:7400 — dashboard starts automatically
2. Open Claude Code in set-core dir, type: "run a micro-web E2E test"
3. Start sentinel from the dashboard — or tell Claude: /set:start
4. Watch the dashboard — agents decompose, implement, verify, merge
5. When done, tell Claude: "start the application that was just built"
6. Ready for your own project: set-project init --project-type web
// the pipeline

From spec to merged code — fully autonomous

Markdown specification · Figma design
input: markdown spec + figma design. use /set:write-spec for interactive spec generation, set-design-sync to extract Figma tokens.
Digest — domains, requirements, acceptance criteria
digest: spec parsed into structured requirements, domains, and dependency graph.
Triage — ambiguity resolution
triage: ambiguities flagged during digest get resolved, either automatically by the planner or interactively by you. risk classification assigns priority. nothing proceeds until every flagged ambiguity (AMB) has a decision.
Parallel phases · Token usage
orchestrate: independent changes run in parallel worktrees. sentinel monitors everything.
Gate results · Sentinel log
verify: 7 quality gates per change — test, build, e2e, lint, review, spec coverage, smoke. BDD traceability binds every REQ-ID to a test. exit codes, not judgment.
Orchestration complete · Running application
ship: verified code merges to main. result: running application built from your spec. zero intervention.
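
To make the digest and orchestrate steps concrete, here is a minimal sketch of how a dependency graph can be turned into parallel phases. It is not SET's planner: the function name `plan_phases` and the change names are hypothetical, and the only real API used is Python's standard `graphlib`.

```python
from graphlib import TopologicalSorter


def plan_phases(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group changes into phases; every change in a phase has all of its
    dependencies satisfied by earlier phases, so a phase can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        phases.append(ready)
        sorter.done(*ready)
    return phases


# Hypothetical changes: each key depends on the changes in its value set.
example = {
    "auth": set(),
    "catalog": set(),
    "cart": {"auth", "catalog"},
    "checkout": {"cart"},
}
phases = plan_phases(example)
print(phases)  # [['auth', 'catalog'], ['cart'], ['checkout']]
```

In this sketch each inner list is a set of independent changes that could each get its own worktree.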
// capabilities
parallel_worktrees

Multiple Claude agents in isolated git worktrees. Real branches, real merges.

quality_gates[7]

Test, build, E2E, lint, review, spec coverage, smoke. Per-change E2E gates. BDD traceability binds REQ-IDs to tests. Exit codes, not vibes.
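
"Exit codes, not vibes" can be illustrated with a short sketch. This is an assumption about the general shape of a gate runner, not SET's implementation: `run_gates` is a hypothetical name, and the two commands are placeholders that simply exit with 0 or 1.

```python
import subprocess
import sys


def run_gates(gates: list[tuple[str, list[str]]]) -> dict[str, bool]:
    """Run each gate command; a gate passes only if its exit code is 0."""
    results = {}
    for name, cmd in gates:
        proc = subprocess.run(cmd, capture_output=True)
        results[name] = proc.returncode == 0
    return results


# Placeholder commands standing in for a real test runner and linter.
gates = [
    ("test", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("lint", [sys.executable, "-c", "raise SystemExit(1)"]),
]
results = run_gates(gates)
mergeable = all(results.values())  # one failed gate blocks the merge
print(results, mergeable)
```

The point of the design is that the pass/fail decision comes from the process exit status, so no model is asked whether the output "looks fine".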

deterministic_output

3-layer templates + set-compare scoring. 87% structural overlap on micro-web, 83% on minishop. Convention compliance: 100%.

openspec_workflow

Structured artifacts: proposal → design → spec → tasks → code. No hallucination.

self_healing

Issue pipeline: detect → investigate → fix → verify. set-harvest scans E2E runs, classifies fixes, adopts them into planning rules. Permanent improvements.

plugin_system

Core stays abstract. Web, voice, fintech — pluggable project types.

cross_run_learnings

Gate failures become planning rules. set-harvest extracts fixes from 30+ runs across 3 projects. Each run is smarter than the last.

design_bridge

Figma Make → set-design-sync → per-change design.md with scope-matched tokens. Each agent gets only the colors, fonts, layouts for its pages.

web_dashboard

Real-time monitoring. Step progress (P→I→F→M→A), test artifact gallery, unified logs, gate results. Start orchestration from the browser.

persistent_memory

Hook-driven cross-session recall. Agents learn from each other. Shared across worktrees.

sentinel_supervisor

3-tier decision model. Auto-restarts crashed orchestrators, remembers spec path across restarts, enriches retries with git context. 30s stall detection.
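
How the sentinel defines a stall is not described here, so the following is only a sketch of one common approach: a heartbeat timer with a 30-second threshold. `StallDetector` and its methods are hypothetical names.

```python
import time

STALL_SECONDS = 30.0  # the 30s figure quoted above


class StallDetector:
    """Reports a stall when no progress was recorded within `timeout` seconds."""

    def __init__(self, timeout: float = STALL_SECONDS, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_progress = clock()

    def heartbeat(self) -> None:
        """Call whenever the orchestrator makes observable progress."""
        self.last_progress = self.clock()

    def is_stalled(self) -> bool:
        return self.clock() - self.last_progress >= self.timeout


# Usage with a fake clock, so the example runs instantly.
now = [0.0]
detector = StallDetector(clock=lambda: now[0])
now[0] = 29.0
print(detector.is_stalled())  # False
now[0] = 30.0
print(detector.is_stalled())  # True
```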

team_sync

Multi-agent messaging. Broadcast status, avoid file conflicts, coordinate dependencies.

// the spec is everything

Waterfall took 8 months. This takes 8 hours.

The principle hasn't changed: output quality depends on input quality. A detailed spec used to mean months of upfront planning. Now it means hours of orchestrated agents building exactly what you described.

You are the product owner. The agents are the dev team. The spec is the sprint backlog. The better the spec, the better the result.

your_spec

Business requirements, acceptance criteria (WHEN/THEN), technical constraints, dependency listing, seed data conventions.

our_templates

Framework boilerplate, build config, test setup, linting rules, conventions. You say what. Templates handle how.

// commands

40+ tools. One workflow.

Slash commands in Claude Code, CLI tools in your terminal. Everything composes.

use_set // your project
set-project init      deploy to project
/set:write-spec       interactive spec gen
/set:decompose        spec → execution plan
/set:start            start orchestration
set-design-sync       Figma → tokens
/set:audit            health check

extend_set // plugins
modules/web/          built-in web type
ProjectType           ABC base class
entry_points          pip plugin system
CoreProfile           inherit universal rules
planning_rules.txt    domain patterns
templates/            scaffold per stack

develop_set // contribute
/opsx:new             structured change
/opsx:apply           implement tasks
/opsx:verify          check before merge
set-harvest           adopt E2E fixes
set-compare           measure divergence
set-memory            persistent recall

Plus: set-new, set-work, set-merge, set-close (worktrees) · /set:status, /set:msg, /set:inbox (team sync) · /set:todo, /set:loop, /set:push (workflow)

1,400+ commits · 65K core LOC · 375 specs · 860+ change artifacts · 30+ E2E runs
// reproducible, not random

We treat determinism as an engineering problem

Run the same spec twice — we measure the structural overlap. 30+ runs, 4 project types, set-compare scores every pair.

challenge               approach                                    result
output divergence       3-layer template system + set-compare       87% micro-web · 83% minishop · 4 project types
convention compliance   route groups, colocation, naming rules      100% across all runs
quality roulette        7 programmatic gates (exit codes)           deterministic
hallucination           OpenSpec artifacts + acceptance criteria    spec-verified
spec drift              coverage tracking + auto-replan             100% coverage
failure recovery        issue pipeline (detect → diagnose → fix)    auto-recovery
agent amnesia           hook-driven memory (infrastructure)         100% capture
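
The scoring method behind set-compare is not spelled out on this page. As an illustration of what "structural overlap" between two runs could mean, the sketch below computes Jaccard similarity over the sets of file paths each run produced. The metric choice, the function name, and the file paths are all assumptions.

```python
def structural_overlap(run_a: set[str], run_b: set[str]) -> float:
    """Jaccard similarity of two runs' file paths: |A ∩ B| / |A ∪ B|."""
    if not run_a and not run_b:
        return 1.0  # two empty runs are trivially identical
    return len(run_a & run_b) / len(run_a | run_b)


# Hypothetical output trees from two runs of the same spec.
run_1 = {"app/page.tsx", "app/layout.tsx", "app/about/page.tsx", "lib/db.ts"}
run_2 = {"app/page.tsx", "app/layout.tsx", "app/about/page.tsx", "lib/prisma.ts"}

score = structural_overlap(run_1, run_2)
print(f"{score:.0%}")  # 60%: 3 shared paths out of 5 distinct paths
```

A path-level metric like this only captures layout; comparing route names, exports, or schema models would need a richer comparison.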
// self-healing pipeline

It doesn't retry. It investigates.

Environment broken? Fixed. Bug in SET itself? Fixed — and the fix is permanent. Every failure makes the system better.

The sentinel doesn't blindly retry failed gates. It reads logs, traces root causes, and dispatches targeted fixes. Environment misconfigured? It reconfigures. Dependency conflict? It resolves. Bug in SET's own code? It patches set-core and commits the fix — so the same failure never happens twice.

This is the difference between "retry 3 times and give up" and an actual engineering process. Detect → investigate → fix → verify → learn. Permanent fixes, not temporary workarounds.

sentinel — global issues (live)
Sentinel Issues Dashboard — real self-healing pipeline across 30+ orchestration runs

Real issue tracker from 30+ orchestration runs. Every resolved issue was fixed autonomously.

// ecosystem

Build your own project type

SET ships with a web project type battle-tested across 30+ runs. The real power is building your own.

modules/web/

Next.js, Playwright, Prisma. 30+ runs across micro-web, minishop, craftbrew. Per-change E2E, BDD traceability, convention enforcement.

E2E runners

Scaffold → init → register → sentinel start. One script per project. run-micro-web.sh, run-minishop.sh, run-craftbrew.sh. Repeatable validation.

your_project_type/

IDOR checks for fintech. HIPAA for healthcare. Your gates, your conventions, your templates. pip-installable plugin inheriting CoreProfile.
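
A rough sketch of what a custom project type might look like. The class names `ProjectType` and `CoreProfile` come from the command list, but every method and attribute below is a hypothetical stand-in: the real base-class interface and the entry-point group used for registration would have to be taken from the set-core source.

```python
from abc import ABC, abstractmethod


class ProjectType(ABC):
    """Hypothetical stand-in for SET's abstract base class."""

    name: str

    @abstractmethod
    def gates(self) -> list[str]:
        """Names of the quality gates this project type enforces."""

    def planning_rules(self) -> list[str]:
        return []


class CoreProfile(ProjectType):
    """Hypothetical universal rules that every project type inherits."""

    name = "core"

    def gates(self) -> list[str]:
        return ["test", "build", "lint"]


class FintechProfile(CoreProfile):
    """Example plugin: adds a domain-specific gate and planning rule."""

    name = "fintech"

    def gates(self) -> list[str]:
        return super().gates() + ["idor_check"]

    def planning_rules(self) -> list[str]:
        return super().planning_rules() + [
            "every account endpoint verifies resource ownership"
        ]


profile = FintechProfile()
print(profile.name, profile.gates())
```

In a pip-installable package, a class like this would typically be exposed through an entry point declared in the package metadata, which is how Python plugin discovery generally works.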

// see it run

A real agent session — spec to merged code

[recording: Claude agent session]
// why now

Single-agent was the start. Orchestration is the present. Enterprise is preparing.

Systems like SET can do the work of a full development team — given the right spec and properly developed project types. This is the present, not the future.

Don't blame the model. 90% of agent failures are underspecification on our side. SET exists to enforce structure, verify output, and close those gaps.

Enterprise is next. On-premise models, secure multi-tenant — the infrastructure is coming. Every organization should prepare now.

Model providers will build orchestration natively. We welcome that. But we're not waiting.

// work with us

Build With SET

Open-source and autonomous. Need something custom? We can help.

custom_project_type

We build a ProjectType plugin for your stack and domain. Your rules, your gates, your templates. Pip-installable, works with set-project init.

workshop

Hands-on spec-driven development training. Write specs that produce working apps. Run orchestration, understand gates, build memory. Remote or on-site.

managed_run

Send a spec, get a working app. We run the orchestration, you review the PRs. Quality gates guarantee the output. Ideal for MVPs and proof-of-concepts.

// one more thing

when orchestration gets intense, defend your changes.

Battle View

arrow keys + space. every change is a ship.