
The Delivery Squad

A small team of engineers and AI agents that owns one feature end-to-end — team shape, the human/agent split, and the daily routine.

Purpose: Define the smallest team that can take a feature from business intent to verified value with AI agents doing the heavy lifting.
Pairs with: The Delivery Model, The Delivery Workflow.
Key principle: Humans make the decisions. Agents do the work. The squad gives both a shared surface to work on.

What’s in a squad

A delivery squad is one Squad Lead, a few Squad Engineers who each run several AI agent sessions at the same time, and the agent sessions themselves. The squad is small enough to fit in a single planning conversation, and complete enough to ship a feature end-to-end without handing work to another team.

Squad Lead (1 ×, senior engineer)

Runs the squad. Reviews agent PRs for intent alignment, not just correctness. Records significant decisions in the decision log. Coaches the Squad Engineers.

Squad Engineers (a small group)

Each runs several concurrent agent sessions. They feed agents the right context, review drafts before PR, and write the integration tests agents struggle with.

Agent sessions (concurrent)

Draft code, tests, docs. Scaffold structure. Refactor. Generate candidate implementations the humans then shape and verify.

Shape, not headcount. The exact numbers (one lead, how many engineers, how many concurrent agents) depend on the feature, the domain complexity, and how mature the shared context is. Start with one senior plus two mid-level engineers, then increase agent concurrency for as long as the Squad Lead can still review the output within a day. The highest level at which the lead keeps up is your ceiling.
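The ceiling rule above comes down to simple arithmetic. The sketch below is illustrative only: the function name and all three inputs (the lead's daily review capacity, the PRs each session produces per day, the number of engineers) are assumptions a squad would have to measure for itself, not figures from this model.

```python
def concurrency_ceiling(lead_reviews_per_day: int,
                        prs_per_session_per_day: float,
                        engineers: int) -> int:
    """Return the most agent sessions each engineer can run while the
    Squad Lead can still review one day's output within a day.

    All inputs are hypothetical measurements, not values from the model.
    """
    if lead_reviews_per_day < 0:
        raise ValueError("lead_reviews_per_day must not be negative")
    if prs_per_session_per_day <= 0 or engineers <= 0:
        raise ValueError("prs_per_session_per_day and engineers must be positive")
    # Total sessions the lead can keep up with across the whole squad.
    total_sessions = int(lead_reviews_per_day // prs_per_session_per_day)
    # Split evenly; any remainder is left as slack rather than rounded up.
    return total_sessions // engineers


# Two Squad Engineers, a lead who can review 12 PRs a day,
# and sessions that each produce about 1.5 PRs a day.
print(concurrency_ceiling(12, 1.5, 2))  # 4
```

In practice the inputs drift as the shared context matures, so a squad would probably re-check the number rather than fix it once.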

What the human decides vs. what the agent does

The whole model rests on this split. Miss it, and you get either glacial progress (humans trying to do everything) or expensive rework (agents making judgment calls they shouldn’t).

Holder	Decision	Why
Human	What to build and for whom	Product intent lives in people and stakeholders, not in code or history.
Human	What “correct” means in context	Correctness is domain-specific. An agent cannot infer the unwritten rule that always applies in your world.
Human	Whether a change is safe to ship	Risk judgment integrates signals from history, relationships, and business context agents don’t see.
Human	When to stop an agent	Knowing a generation is drifting before it’s obvious is a senior-engineer skill. The Squad Lead and Engineering Practice Lead coach this explicitly.
Both	How to structure the work	Humans propose the shape of a feature. Agents draft the task breakdown. Humans approve and edit.
Both	Spec content	Humans own intent; agents draft, pressure-test, and surface gaps. Ceremony makes the split explicit.
Agent	Generating candidate code and tests	This is where agents are strongest: synthesizing patterns from context and producing reviewable drafts quickly.
Agent	Scaffolding repeatable structure	Folder layouts, boilerplate, migrations between patterns, repetitive refactors. Faster and more consistent than humans.
Agent	Running the regression suite	Mechanical verification. A good place to push agent autonomy.
Agent	Drafting documentation from code	Agents produce high-quality first drafts of API docs, changelogs, and runbooks. Humans edit for voice and omissions.
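One way a squad might make the table above machine-checkable is to encode it as a lookup that tooling can consult before letting an agent proceed unattended. This is a minimal sketch: the names `Holder`, `DECISION_HOLDERS`, and `needs_human_sign_off` are hypothetical, and defaulting unlisted decisions to a human is an assumption, not something the model prescribes.

```python
from enum import Enum


class Holder(Enum):
    HUMAN = "Human"
    BOTH = "Both"
    AGENT = "Agent"


# Rows copied from the table above; the dictionary form is illustrative.
DECISION_HOLDERS = {
    "What to build and for whom": Holder.HUMAN,
    'What "correct" means in context': Holder.HUMAN,
    "Whether a change is safe to ship": Holder.HUMAN,
    "When to stop an agent": Holder.HUMAN,
    "How to structure the work": Holder.BOTH,
    "Spec content": Holder.BOTH,
    "Generating candidate code and tests": Holder.AGENT,
    "Scaffolding repeatable structure": Holder.AGENT,
    "Running the regression suite": Holder.AGENT,
    "Drafting documentation from code": Holder.AGENT,
}


def needs_human_sign_off(decision: str) -> bool:
    """True when a human must be involved before the work proceeds."""
    # Assumption: anything not listed is treated as a human decision.
    holder = DECISION_HOLDERS.get(decision, Holder.HUMAN)
    return holder is not Holder.AGENT


print(needs_human_sign_off("Running the regression suite"))  # False
print(needs_human_sign_off("Spec content"))                  # True
```

Defaulting to the human side keeps a new, unclassified kind of decision from being handed to an agent by accident.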

How a squad’s day runs

A squad’s routine is built around a simple fact: agents can work overnight, while humans work during the day. The rituals match that cycle.

Morning: triage the overnight

The Squad Lead reviews agent drafts and overnight PRs first thing.
Anything clean goes into the review queue; anything confused is flagged for the Context Manager.
Priorities for the day are adjusted based on what landed.
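The morning triage can be pictured as a simple split of the overnight output. In this sketch, `OvernightItem` and `triage` are hypothetical names, and the `clean` flag stands in for the Squad Lead's first-pass judgment rather than any automated check.

```python
from dataclasses import dataclass


@dataclass
class OvernightItem:
    title: str
    clean: bool  # the Squad Lead's first-pass judgment, not an automated check


def triage(items):
    """Split overnight output into the review queue and the items to
    raise with the Context Manager."""
    review_queue = [item for item in items if item.clean]
    for_context_manager = [item for item in items if not item.clean]
    return review_queue, for_context_manager


overnight = [
    OvernightItem("Add config API endpoint", clean=True),
    OvernightItem("Scheduler retry logic", clean=False),
]
queue, flagged = triage(overnight)
print([item.title for item in queue])    # ['Add config API endpoint']
print([item.title for item in flagged])  # ['Scheduler retry logic']
```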

Midday: co-work

Squad Engineers pair with their agent sessions on in-flight tasks.
Context gaps surface immediately and get fed to the Context Manager.
The Squad Lead unblocks, reviews tricky PRs, and updates the decision log.

Late afternoon: set up the night

Agents are briefed on the next batch of tasks, with explicit scope and references to the spec.
The Squad Lead spot-checks the briefs; bad briefs mean bad overnight output.
Any work that needs human judgment stays with humans for tomorrow.
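A brief with explicit scope and spec references could be captured as a small record that the Squad Lead's spot-check runs against. This is a sketch under stated assumptions: `AgentBrief`, its fields, and the example spec reference are all hypothetical, and real briefs would likely carry more than this.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBrief:
    task: str
    scope: str                                     # what the agent may and may not touch
    spec_refs: list = field(default_factory=list)  # pointers into the spec
    needs_human_judgment: bool = False

    def problems(self):
        """Return the reasons this brief should not go out overnight."""
        issues = []
        if not self.scope.strip():
            issues.append("missing explicit scope")
        if not self.spec_refs:
            issues.append("no spec references")
        if self.needs_human_judgment:
            issues.append("needs human judgment; keep with humans")
        return issues


brief = AgentBrief(
    task="Draft validation for schedule config",
    scope="config API request models only",
    spec_refs=["spec section on schedule validation"],  # hypothetical reference
)
print(brief.problems())  # []
```

A brief that returns any problems stays out of the overnight batch until a human fixes it or takes the task on the next day.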

Weekly: value check-in

The squad demos what shipped this week against the feature’s success metrics.
Context retrospective: what confused the agents, and what change to the shared context would have prevented it.
The Squad Lead pairs with the Context Manager on the top one or two context updates.

What a squad owns, what it doesn’t

A squad owns a feature, not a layer. That means end-to-end delivery of a bounded slice of the product — data model, business logic, API, UI, tests — not “all the backend” or “all the frontend.” Squads that own layers have handoffs; handoffs are where intent dies.

Good squad boundaries. The squad owns “the feature that lets users configure and run a scheduled calculation.” That includes the config UI, the config API, the scheduler integration, and the calculation engine plumbing. One team. One spec. One merge.

Bad squad boundaries. The squad owns “the calculation service.” Because a service is a layer rather than a feature, every feature that crosses it needs multiple squads to ship. Coordination overhead multiplies. Context fragments across repositories. Agents lose the plot.

When to add another squad

Scaling the delivery model is a question of squad count, not squad size. Growing a squad past the point where the Squad Lead can review everyone’s work within a day breaks the model.

Add a squad when the roadmap has more feature areas than existing squads can own with clear boundaries.
Do not add a squad just to speed up an existing feature. Instead, increase agent concurrency inside the existing squad, or split the feature more cleanly.
All squads draw on the same shared context, Context Manager, Engineering Practice Lead, and architecture function. They do not share a backlog.

A common mistake. Growing a squad to a dozen people like a traditional team. The Squad Lead becomes a bottleneck or stops reviewing carefully; either way, agent output drifts. Hold the line on squad size; add squads instead.
