SET

Ship Exactly This.

Give it a spec. Get merged features.

Autonomous multi-agent orchestration for Claude Code. We don't prompt — we specify. Every change is planned through OpenSpec, verified by quality gates, and merged automatically.

$ git clone https://github.com/tatargabor/set-core && cd set-core && ./install.sh
// after install:
1. Open http://localhost:7400 — dashboard starts automatically
2. Open Claude Code in the set-core directory and type: "run a micro-web E2E test"
3. Tell Claude: "start the sentinel" — orchestration runs via the manager API
4. Watch the dashboard — agents decompose, implement, verify, merge
5. When done, tell Claude: "start the application that was just built"
6. Ready for your own project: set-project init --project-type web

// the pipeline

From spec to merged code — fully autonomous

[Figures: Markdown specification, Figma design]
input: markdown spec + figma design. that's all you provide.
[Figure: Digest (domains, requirements, acceptance criteria)]
digest: spec parsed into structured requirements, domains, and dependency graph.
[Figures: Parallel phases, Token usage]
orchestrate: independent changes run in parallel worktrees. sentinel monitors everything.
[Figures: Gate results, Sentinel log]
verify: 7 quality gates per change (test, build, e2e, lint, review, spec coverage, smoke). exit codes, not judgment.
[Figures: Orchestration complete, Running application]
ship: verified code merges to main. result: running application built from your spec. zero intervention.
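
The "exit codes, not judgment" idea in the verify step can be sketched in a few lines of Python. This is an illustrative sketch only, not SET's implementation: the `GateResult`, `run_gates`, and `ready_to_merge` names are hypothetical, and the gate commands are placeholders that simply exit with a fixed status.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    exit_code: int

    @property
    def passed(self) -> bool:
        # A gate passes only when its command exits with status 0.
        return self.exit_code == 0


def run_gates(gates: dict[str, list[str]]) -> list[GateResult]:
    """Run each gate command in order and stop at the first failure."""
    results = []
    for name, command in gates.items():
        completed = subprocess.run(command, capture_output=True)
        result = GateResult(name, completed.returncode)
        results.append(result)
        if not result.passed:
            break
    return results


def ready_to_merge(results: list[GateResult], expected: int) -> bool:
    """A change may merge only if every expected gate ran and passed."""
    return len(results) == expected and all(r.passed for r in results)


# Placeholder commands (assumption): real gates would invoke the project's
# test runner, build tool, linter, and so on.
PY = sys.executable
demo_gates = {
    "test": [PY, "-c", "raise SystemExit(0)"],
    "build": [PY, "-c", "raise SystemExit(0)"],
    "lint": [PY, "-c", "raise SystemExit(3)"],
    "smoke": [PY, "-c", "raise SystemExit(0)"],
}
results = run_gates(demo_gates)
```

In this sketch the decision to merge never depends on reading tool output, only on the process exit status, which is what makes the outcome repeatable for the same code.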

// capabilities
parallel_worktrees

Multiple Claude agents in isolated git worktrees. Real branches, real merges.

quality_gates[7]

Test, build, E2E, lint, review, spec coverage, smoke. Exit codes, not vibes.

deterministic_output

3-layer templates + set-compare scoring. 87% structural overlap on micro-web, 75% on minishop. Convention compliance: 100%.

openspec_workflow

Structured artifacts: proposal → design → spec → tasks → code. No hallucination.

self_healing

Issue pipeline: detect → investigate → fix → verify. Sentinel diagnoses before acting.

plugin_system

Core stays abstract. Web, voice, fintech — pluggable project types.

cross_run_learnings

Review findings become rules. Gate failures teach the next run. Gets better with use.

design_bridge

Figma MCP → design tokens → Tailwind classes injected into agent context.

web_dashboard

Real-time monitoring. Phases, tokens, agents, issues, learnings. Start from browser.

persistent_memory

Hook-driven cross-session recall. Agents learn from each other. Shared across worktrees.

sentinel_supervisor

3-tier decision model. Crash recovery, checkpoint handling, stall investigation. 30s detection.

team_sync

Multi-agent messaging. Broadcast status, avoid file conflicts, coordinate dependencies.
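
One way to picture how a dependency graph turns into parallel phases is a topological layering, sketched below with Python's standard `graphlib` module. The change names and the `plan_phases` helper are hypothetical, and this is an assumption about the general technique rather than a description of how SET schedules work.

```python
from graphlib import TopologicalSorter


def plan_phases(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group changes into phases; changes within a phase share no pending dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        phases.append(ready)
        sorter.done(*ready)  # mark the whole phase finished to unlock the next one
    return phases


# Hypothetical changes, each mapped to the changes it depends on.
changes = {
    "db-schema": set(),
    "auth": {"db-schema"},
    "product-catalog": {"db-schema"},
    "checkout": {"auth", "product-catalog"},
}
phases = plan_phases(changes)
```

Here `auth` and `product-catalog` land in the same phase, so each could be handed to its own agent in a separate worktree, while `checkout` waits until both have finished.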

// the spec is everything

Waterfall took 8 months. This takes 8 hours.

The principle hasn't changed: output quality depends on input quality. A detailed spec used to mean months of upfront planning. Now it means hours of orchestrated agents building exactly what you described.

You are the product owner. The agents are the dev team. The spec is the sprint backlog. The better the spec, the better the result.

your_spec

Business requirements, acceptance criteria (WHEN/THEN), technical constraints, dependency listing, seed data conventions.

our_templates

Framework boilerplate, build config, test setup, linting rules, conventions. You say what. Templates handle how.
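
To make the WHEN/THEN acceptance criteria concrete, here is a small sketch that pulls them out of a markdown spec so each one can later be checked. The one-line `WHEN ... THEN ...` layout, the `Criterion` type, and the `extract_criteria` helper are assumptions for illustration; the actual spec format that OpenSpec or SET expects may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    when: str
    then: str


# Assumed layout: an optional list bullet, then "WHEN <condition> THEN <outcome>".
PATTERN = re.compile(
    r"^\s*(?:[-*]\s*)?WHEN\s+(?P<when>.+?)\s+THEN\s+(?P<then>.+?)\s*$"
)


def extract_criteria(spec_text: str) -> list[Criterion]:
    """Return every WHEN/THEN line found in the spec, in document order."""
    criteria = []
    for line in spec_text.splitlines():
        match = PATTERN.match(line)
        if match:
            criteria.append(Criterion(match["when"], match["then"]))
    return criteria


spec = """\
## Cart
- WHEN a visitor adds an in-stock product THEN the cart count increases by one
- WHEN the cart is empty THEN the checkout button is disabled
The cart should feel fast.
"""
criteria = extract_criteria(spec)
```

The last line of the sample spec ("should feel fast") yields no criterion, which illustrates the point of the section: a requirement that is not phrased as a checkable condition gives the agents nothing to verify against.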

1,295 commits · 720+ hours runtime · 134K LOC · 363 specs

// reproducible, not random

We treat determinism as an engineering problem

Run the same spec twice — we measure the structural overlap. 14 runs, 3 project types, set-compare scores every pair.

challenge              | approach                                  | result
output divergence      | 3-layer template system + set-compare     | 87% micro-web · 75% minishop · 3 project types
convention compliance  | route groups, colocation, naming rules    | 100% across all runs
quality roulette       | 7 programmatic gates (exit codes)         | deterministic
hallucination          | OpenSpec artifacts + acceptance criteria  | spec-verified
spec drift             | coverage tracking + auto-replan           | 100% coverage
failure recovery       | issue pipeline (detect → diagnose → fix)  | auto-recovery
agent amnesia          | hook-driven memory (infrastructure)       | 100% capture
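
As a rough illustration of what "structural overlap" between two runs could mean, the sketch below scores every pair of runs by the Jaccard similarity of their file paths. How set-compare actually computes its score is not described here, so the metric, the function names, and the sample paths are all assumptions.

```python
from itertools import combinations


def structural_overlap(run_a: set[str], run_b: set[str]) -> float:
    """Jaccard similarity of two runs' file paths: shared paths / all paths."""
    if not run_a and not run_b:
        return 1.0  # two empty outputs are trivially identical
    return len(run_a & run_b) / len(run_a | run_b)


def score_all_pairs(runs: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Score every unordered pair of runs."""
    return {
        (a, b): structural_overlap(runs[a], runs[b])
        for a, b in combinations(sorted(runs), 2)
    }


# Hypothetical file listings from three runs of the same spec.
shared = {"src/app/page.tsx", "src/app/(shop)/cart/page.tsx", "prisma/schema.prisma"}
runs = {
    "run-1": shared | {"tests/e2e/cart.spec.ts"},
    "run-2": shared | {"tests/e2e/checkout.spec.ts"},
    "run-3": shared | {"tests/e2e/cart.spec.ts"},
}
scores = score_all_pairs(runs)
```

With these sample listings, run-1 and run-3 match exactly, while run-2 differs from each by one test file, so it shares 3 of 5 distinct paths with either of them.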

// ecosystem

Build your own project type

SET ships with a public web project type. The real power is building your own.

modules/web/

Next.js, Playwright, Prisma. 14+ orchestration runs across 3 projects. Divergence measured, conventions enforced.

your_project_type/

IDOR checks for fintech. HIPAA for healthcare. Your gates, your conventions. pip-installable plugin.

// see it run

A real agent session — spec to merged code

[Figure: Claude agent session]

// why now

Single-agent was the start. Orchestration is the present. Enterprise is preparing.

Systems like SET can do the work of a full development team — given the right spec and properly developed project types. This is the present, not the future.

Don't blame the model. 90% of agent failures are underspecification on our side. SET exists to enforce structure, verify output, and close those gaps.

Enterprise is next. On-premise models, secure multi-tenant — the infrastructure is coming. Every organization should prepare now.

Model providers will build orchestration natively. We welcome that. But we're not waiting.

// one more thing

when orchestration gets intense, defend your changes.

Battle View

arrow keys + space. every change is a ship.