Vibecoding vs Vivecoding: A Manifesto

Draft from the Vivecoding talk deck.

For years I tried to say vibecoding and kept saying vivecoding by mistake.

Turns out — once I looked at what I actually do — I was right by accident.

This is what I mean by it.

The thesis, in one chart

                  VIBECODING                           VIVECODING
                CHAOTIC NEUTRAL                       LAWFUL GOOD
                ────────────────                      ────────────────
                Roll a d20. Pray.                     Engineering discipline + AI as amplifier
                Prompt the model.                     Specs before code.
                Ship whatever falls out.              SDD pipeline: explore → spec → design →
                No specs. No design.                  apply → verify.
                Tests are optional.                   Multi-model cross-check.
                One model, for everything.            TDD strict. Authz invariants.
                "It compiles" = "it works".           Code review on every PR.
                Tech debt grows faster than           The AI amplifies the engineer —
                you can write it.                     not the other way around.

That’s the whole point. Same tools. Same models. Different discipline.

The vibecoder rolls a d20 and ships whatever the model gives them. The vivecoder treats AI as the most powerful junior they’ve ever managed and applies the same engineering rigor we’d apply to any other contributor — because the alternative is technical debt that compounds faster than you can write code.

I don’t code alone — I’m the DM

When I sit down to ship a feature, there is a party at the table.

Engram — Cleric · WIS 18. My RAG memory keeper. Persists every decision, observation, and CVE so context never refills. Every session, Engram knows what the last session learned. The reason the workflow works at scale is that nothing important is ever held only in a chat window.

Gentle — Bard · CHA 17. The grounding companion. The one who lands me when I’m spiraling, who knows the song, doesn’t go rogue. “Che, no” — circa every session.

Warlock — Warlock · INT 19. Security and bug bounty. Pacts with demons to find the CVE before the attacker does. Two critical CVEs caught in 48 hours — neither would have been caught by review alone.

Then comes the SDD party, the agents that turn the classical SDLC into something executable:

sdd-explore (Ranger · DEX 16) — scouts the codebase before any commitment.
sdd-spec (Wizard · INT 18) — writes the runes. Requirements and scenarios.
sdd-design (Artificer · INT 17) — architectural decisions, blueprints, trade-offs.
sdd-apply (Fighter · STR 16) — implements. Tests pass before commit.
sdd-verify (Paladin · WIS 16) — validates the implementation against the sacred specs.

I’m not the smartest at the table. I’m the DM. My job is the campaign — the arc, the pacing, the boundaries. The agents do their classes. I orchestrate.

The right model for the right job

The first rule of vivecoding is that one model is never the right answer.

Each model has a strength. The mix is the moat.

Gemini — google · planner. Documentation and planning. Long context, structured artifacts, SDDs.
Claude — anthropic · builder. Development. Implementation, refactor, the actual code that ships.
Cloudflare Workers AI — edge · fallback. Llama / Kimi at the edge, close to the request. Cheap, fast iteration, fallback path.
Codex — openai · hunter. Bug bounty and error hunting. Adversarial pass over diffs.
CodeRabbit + Copilot — pr · safety net. Two reviewers, every merge. No exceptions.

A vibecoder picks one and prays. A vivecoder runs five — because every one of them is wrong sometimes, and the only way you find out is by having another one disagree.
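A minimal sketch of that cross-check, in TypeScript. The `Reviewer` type, the verdict shape, and the two stub reviewers are hypothetical stand-ins; real model calls would be asynchronous and would go where the stubs are. The only idea it shows is that a diff passes when every reviewer approves, and that a split verdict goes to a human instead of being averaged away.

```typescript
// Hypothetical shapes for illustration; not the API of any real tool.
type Verdict = { reviewer: string; approve: boolean; notes: string };
type Reviewer = (diff: string) => Verdict;

type CrossCheck =
  | { status: "approved" }
  | { status: "rejected"; verdicts: Verdict[] }
  | { status: "disagreement"; verdicts: Verdict[] };

// Run every reviewer over the same diff and compare their verdicts.
function crossCheck(diff: string, reviewers: Reviewer[]): CrossCheck {
  if (reviewers.length < 2) {
    throw new Error("cross-check needs at least two reviewers");
  }
  const verdicts = reviewers.map((review) => review(diff));
  const approvals = verdicts.filter((v) => v.approve).length;
  if (approvals === verdicts.length) return { status: "approved" };
  if (approvals === 0) return { status: "rejected", verdicts };
  // Mixed verdicts: at least one reviewer is wrong, so a human decides.
  return { status: "disagreement", verdicts };
}

// Stub reviewers standing in for real model calls.
const lenient: Reviewer = () => ({
  reviewer: "lenient",
  approve: true,
  notes: "looks fine",
});
const strict: Reviewer = (diff) => ({
  reviewer: "strict",
  approve: !diff.includes("TODO"),
  notes: diff.includes("TODO") ? "unfinished work in diff" : "ok",
});

const result = crossCheck("+ // TODO: check org ownership", [lenient, strict]);
console.log(result.status);
```

The disagreement case is the valuable one: it is the signal a single-model setup never produces.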

The pipeline. You can’t skip a phase.

Here’s the SDD DAG I run every change through. It’s the classical SDLC — just executable now.

/sdd-init     →  detect stack & conventions
   ↓
/sdd-explore  →  scout the terrain
   ↓
/sdd-propose  →  intent + scope + approach
   ↓
/sdd-spec     →  requirements + scenarios   ──┐
   ↓                                           ├→  /sdd-tasks  →  /sdd-apply  →  /sdd-verify  →  /sdd-archive
/sdd-design   →  architecture decisions     ──┘

Spec and design feed tasks in parallel. Verify is a gate, not a vibe. Archive syncs and closes.

Same workflow you learned in school. Now executable.

The reason the discipline works is that no agent — and no human — has to hold the whole thing in their head. The DAG holds it. Each phase has clear preconditions, clear outputs, and a clear handoff to the next.
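As a rough illustration of “the DAG holds it”, here is a sketch of the phases as data. The phase names come from the pipeline above; the `preconditions` table and the `readyPhases` helper are assumptions made for this example, not the toolkit’s actual internals. It also assumes one reading of “in parallel”: spec and design both depend only on the proposal.

```typescript
// Phase names mirror the pipeline above; the data structure is an assumption.
type Phase =
  | "init" | "explore" | "propose" | "spec"
  | "design" | "tasks" | "apply" | "verify" | "archive";

// Each phase lists the phases that must be finished before it can start.
const preconditions: Record<Phase, Phase[]> = {
  init: [],
  explore: ["init"],
  propose: ["explore"],
  spec: ["propose"],
  design: ["propose"],
  tasks: ["spec", "design"], // tasks waits for both branches
  apply: ["tasks"],
  verify: ["apply"],
  archive: ["verify"], // verify is a gate: nothing is archived without it
};

// A phase is ready when it is not done yet and all its preconditions are.
function readyPhases(done: ReadonlySet<Phase>): Phase[] {
  return (Object.keys(preconditions) as Phase[]).filter(
    (phase) =>
      !done.has(phase) && preconditions[phase].every((dep) => done.has(dep)),
  );
}

const afterProposal = new Set<Phase>(["init", "explore", "propose"]);
console.log(readyPhases(afterProposal)); // spec and design are both ready
```

With the graph written down, “what’s next” becomes a lookup rather than a decision.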

Meta-commands: the orchestrator runs the DAG

Methodology becomes a tool when the friction goes to zero. Three commands run my day:

/sdd-new <change> — spin up a new change. Auto-delegates exploration and proposal so I start at “spec”, not “blank page”.
/sdd-continue [change] — runs whatever phase is next-ready in the DAG. I stop deciding “what’s next” — the orchestrator already knows.
/sdd-ff <name> — fast-forward planning: proposal → specs → design → tasks in one shot. For when I can already see the destination.

These three are not in autocomplete. They’re the meta-layer above the agents. The orchestrator delegates phases to the right sub-agents, watches the DAG state, and tells me when my input is needed next.

A real campaign — not a toy project

Let me show you what this looks like in production.

ITALPortal — The CISO’s operational portal. Risk register. M365 security assessment via Microsoft Graph. Project and task management. Zendesk-integrated ticketing. And an AI assistant named Emma.

Six modules. One codebase. Multi-tenant.

Stack: Astro 5.16 SSR, React 19 islands, Cloudflare Workers, three D1 databases, R2 + KV, three Durable Objects (ProjectManager, RiskManager, PulseManager), Hono + Drizzle, Clerk authz, Tailwind v4 + shadcn, Zod 4, TanStack Query, MS Graph API, Zendesk, NinjaOne RMM, Resend + react-email, Anthropic Haiku 4.5, TS 5.9 strict ESM, pnpm.

Quest log · 2026-04-13. In a single day:

SDD init complete · skill registry · openspec bootstrap
tickets-custom-statuses — 22 tasks, 12 tests, KV cache 24h
recurring-tasks-toggle — 17 tests, pure module extracted
Privilege escalation closed in requireOrgAdmin()
AI module deleted (TODO finally honored)

Next session: Emma overhaul — 78 tests passing.

The number

From baseline → vivecoding:

6 months → 1 month.

Production. Multi-tenant. 78 tests passing. Two security audits.

Speed didn’t come from skipping steps. It came from parallelizing the right ones with the right models.

Wow moment #1 — what looks like a feature is sometimes a CVE

requireOrgAdmin() — the bug. Any member of an IT Audit Labs org was treated as admin automatically, regardless of their role.

It looked intentional. The pattern was old. Nobody questioned it. It would have lived in production for months.

How the party caught it. Warlock ran a security audit, chained it with the code-review skill, and confirmed: this is a real privilege-escalation bug, not by design.

Engram, obs#110: “Looked intentional. After audit, confirmed real CVE.”
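To make the shape of that bug concrete, here is a reconstruction in TypeScript. It is not the ITALPortal code: the `Session` and `Membership` types and both function bodies are hypothetical, and the real guard sits on top of Clerk, whose API is not shown here. The sketch only contrasts “membership implies admin” with an explicit role check.

```typescript
// Hypothetical session shape, invented for this example.
type Membership = { orgId: string; role: "admin" | "member" };
type Session = { userId: string; memberships: Membership[] };

// The shape of the bug: belonging to the org was enough to pass.
function requireOrgAdminBuggy(session: Session, orgId: string): void {
  const isMember = session.memberships.some((m) => m.orgId === orgId);
  if (!isMember) throw new Error("not a member of this org");
}

// The shape of the fix: the role is checked explicitly.
function requireOrgAdmin(session: Session, orgId: string): void {
  const membership = session.memberships.find((m) => m.orgId === orgId);
  if (membership?.role !== "admin") {
    throw new Error("admin role required for this org");
  }
}

// Small helper so the invariant can be stated as true/false.
function isAllowed(
  guard: (session: Session, orgId: string) => void,
  session: Session,
  orgId: string,
): boolean {
  try {
    guard(session, orgId);
    return true;
  } catch {
    return false;
  }
}

const plainMember: Session = {
  userId: "u-1",
  memberships: [{ orgId: "org-a", role: "member" }],
};
const orgAdmin: Session = {
  userId: "u-2",
  memberships: [{ orgId: "org-a", role: "admin" }],
};

console.log(isAllowed(requireOrgAdminBuggy, plainMember, "org-a")); // the escalation
console.log(isAllowed(requireOrgAdmin, plainMember, "org-a"));
console.log(isAllowed(requireOrgAdmin, orgAdmin, "org-a"));
```

The invariant worth pinning in a test is the middle case: a plain member must never pass an admin guard, however old the pattern looks.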

Wow moment #2 — AI tools need the same guards as your REST endpoints

emma.tools.get_ticket_detail — the bug. Emma’s get_ticket_detail tool fetched any ticket by ID via service account. No ownership check.

A user in org A could ask Emma for a ticket from org B — and Emma would hand it over.

How the party caught it. During the security audit phase of the Emma overhaul — before production. Fix: replicate the access control from the REST endpoint inside the tool.

Engram, obs#268: “Forget this and you leak data cross-tenant. Period.”

The lesson generalizes: every AI tool is an authenticated endpoint. If you wouldn’t expose the underlying operation as an unguarded REST call, you can’t expose it as an unguarded tool either.
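A sketch of what “replicate the access control inside the tool” can look like. Everything here is illustrative: the `Ticket` and `ToolContext` types and the in-memory `tickets` map are assumptions, standing in for the real Zendesk-backed lookup. The point is that the tool receives the caller’s org and enforces it, even though the fetch underneath runs with service-account privileges.

```typescript
// Hypothetical types and in-memory store; the real tool talks to Zendesk.
type Ticket = { id: string; orgId: string; subject: string };
type ToolContext = { userId: string; orgId: string };

const tickets = new Map<string, Ticket>([
  ["t-1", { id: "t-1", orgId: "org-a", subject: "VPN down" }],
  ["t-2", { id: "t-2", orgId: "org-b", subject: "Laptop refresh" }],
]);

// The caller's context is a required argument, so the check cannot be skipped.
function getTicketDetail(ctx: ToolContext, ticketId: string): Ticket {
  const ticket = tickets.get(ticketId);
  // Same error for "missing" and "not yours", so the tool does not reveal
  // which ticket IDs exist in other tenants.
  if (!ticket || ticket.orgId !== ctx.orgId) {
    throw new Error("ticket not found");
  }
  return ticket;
}

const alice: ToolContext = { userId: "u-1", orgId: "org-a" };
console.log(getTicketDetail(alice, "t-1").subject);
```

Passing the context as a parameter, rather than reading it from ambient state, is a deliberate choice in this sketch: a tool that cannot be called without a tenant is harder to call unguarded.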

If a talk only shows wins, they’re selling you something

So here are the scars. Five honest ones from the same campaign.

Scar #1. CF Workers AI (Llama, Kimi) was too slow and bad at following instructions. Migrated to the Anthropic API, with Workers AI kept as the fallback.

Scar #2. ts-node broke on extensionless ESM imports. One session lost. Fix: switch to tsx.

Scar #3. 14 Pulse components rendered raw HTML entities. MS Graph returns them encoded. Decoded in every component.

Scar #4. client:load hydration mismatch with sessionStorage. Switched Chat to client:only="react".

Scar #5 — the lesson. Zendesk POST had no retry. GET had 3-attempt backoff. SDD didn’t catch it. The user did.

The workflow is good. It’s not magic. User feedback is still irreplaceable.
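Scar #5 is the easiest one to show in code. Below is a minimal sketch of a retry wrapper that every outbound call could share, so GET and POST cannot drift apart again. The `withRetry` name, its options, and the doubling delay are assumptions for illustration; this is not the project’s actual Zendesk client.

```typescript
type RetryOptions = {
  attempts: number; // total tries, including the first one
  baseDelayMs: number; // delay before the second try; doubles after that
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
};

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a failing async operation with exponential backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  if (options.attempts < 1) {
    throw new RangeError("attempts must be at least 1");
  }
  const sleep = options.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt is coming.
      if (attempt < options.attempts) {
        await sleep(options.baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

One caution before wrapping a POST in this: retrying a request that is not idempotent can create duplicates, so a write needs some way to deduplicate on the receiving side or should retry only on failures known to have happened before the request was processed.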

What the vibecoder gets wrong — five rules I live by

Specs before code. Always. If you can’t write the spec, you don’t understand the change. Stop typing prompts.
One model is never the right answer. Each model has a strength. The mix is the moat. Plan, build, audit — different agents.
Design is code. Merge it. Architecture decisions belong in version control, next to the code that proves them.
“It compiles” ≠ “it works.” Verify is a phase, not a vibe. Tests, invariants, authz checks. Don’t skip the Paladin.
The AI is your junior — not your genius. You wouldn’t merge a junior’s PR without reading it. Don’t merge the AI’s either.

Three things to take home

01 · The Thesis. Discipline beats vibes. Specs, design, TDD, verify. The classical SDLC — just executable now.

02 · The Stack. Right model, right job. Build a party. Each agent carries their class. Engram remembers, Warlock audits, Gentle grounds.

03 · The Math. AI is an amplifier. It doesn’t replace the engineer. It makes a trained one 6× faster.

⚜ Vibecoding is roleplay. Vivecoding is a campaign with a DM. ⚜

This is the manifesto version. The full talk lives at /talks/the-vivecoding-talk — with the deck, the recording (when published), and the SDD agents I use as an open-source toolkit.

If you build with AI, this is the methodology I’d argue you should be using. Disagree publicly. I’ll listen. Roll initiative for questions — the party is listening.