Idea to SaaS in 6 Weeks: Multi-Agent AI

David Packman, Founder, Agenticise | 28 April 2026 | 15 min read

The methodology: idea to production SaaS in 6 weeks with multi-agent AI-assisted development

Six months of traditional SaaS development compressed into six weeks. That is the headline number on the Semoria build. It is also the part that triggers the most reasonable scepticism, because the obvious read is that something must have been cut to fit the timeline, or that the build is fragile, or that an unrealistic definition of "done" is being applied.

None of those is true. Semoria has been in production with paying users on three live tiers since the build finished. The compression came from a methodology, not a shortcut. The most important piece of that methodology was running several AI coding agents in parallel against different parts of the same codebase, with a human in the lead seat. This post walks through what that looked like in practice, where it worked, where it did not, and what we deferred so the timeline held.

What did the planning phase actually cover?

The planning phase ran for under a week and produced three artefacts: a product requirements document, an architecture decision record, and a tier strategy. Everything downstream came from those three.

The product requirements document was deliberately small. It described the smallest version of the product that could be charged for: voice fingerprinting, research-grounded drafting, manual approval, and a single LinkedIn publishing channel. Mini series, scheduling, calendar view, free tools, and the second voice profile (Pro tier) were explicitly tagged as Phase 2 and deferred. That decision was the difference between a six-week build and a four-month one.

The architecture decision record named every layer of the stack with a one-sentence justification: Next.js on Vercel, Supabase Postgres with row-level security on every table, Stripe for billing, Resend for email, n8n for workflow automation, PostHog for analytics, Sentry for error monitoring. Nothing exotic. Each piece was chosen because it was production-grade, well-documented, and integrated easily with the others. The Stack Overflow Developer Survey shows much the same set of technologies at the top of its most-used lists year after year, and widely used tools tend to carry the largest documentation footprint, which is precisely why a fast build picks them. AI agents tend to write better code against tools that appear many thousands of times in their training data.

The tier strategy was three tiers (Voice at £29, Amplify at £59, Pro at £99) plus a 30-day free trial. Free tools as lead magnets came in Phase 2. That gave the build a single target shape for the billing system, the feature gating, the email flows, and the marketing pages.
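A single target shape for feature gating can be as small as one lookup table. The sketch below is illustrative only: the tier names and prices come from the paragraph above, while the feature keys, the function names, and the rule that higher tiers include everything in lower tiers are assumptions, not Semoria's actual code.

```typescript
// Hypothetical feature-gating sketch. Tier names and prices are from the post;
// feature keys and the "higher tiers include lower tiers" rule are assumptions.
type Tier = "voice" | "amplify" | "pro";
type Feature =
  | "voiceFingerprint"
  | "linkedinPublishing"
  | "scheduling"
  | "calendarView"
  | "secondVoiceProfile";

// Tiers in ascending order, so a tier's rank is its index in this array.
const TIER_ORDER: Tier[] = ["voice", "amplify", "pro"];

const MONTHLY_PRICE_GBP: Record<Tier, number> = { voice: 29, amplify: 59, pro: 99 };

// Lowest tier that unlocks each feature.
const MIN_TIER: Record<Feature, Tier> = {
  voiceFingerprint: "voice",
  linkedinPublishing: "voice",
  scheduling: "amplify",
  calendarView: "amplify",
  secondVoiceProfile: "pro",
};

// True when the given tier ranks at or above the feature's minimum tier.
function canUse(tier: Tier, feature: Feature): boolean {
  return TIER_ORDER.indexOf(tier) >= TIER_ORDER.indexOf(MIN_TIER[feature]);
}

// Cheapest tier that unlocks a feature, useful for an upgrade prompt.
function upgradeTarget(feature: Feature): { tier: Tier; priceGbp: number } {
  const tier = MIN_TIER[feature];
  return { tier: tier, priceGbp: MONTHLY_PRICE_GBP[tier] };
}
```

Keeping the mapping in one place means the billing system, the API guards, and the marketing pages can all read the same table rather than each encoding tier rules separately.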

How does multi-agent orchestration in Claude Code actually work?

Multi-agent orchestration means running several AI coding agents in parallel against different parts of the same codebase, with one lead agent coordinating their work. It is the part of the Semoria build that made the timeline possible.

The setup looks like this. A lead agent owns the plan, the scope, and the integration. Sub-agents get scoped tasks ("build the API route for voice profile creation", "refactor the trial-tracking helpers", "draft the weekly digest email template") and an isolated workspace, which in practice is a git worktree pointing at the same repository. The sub-agents work in parallel. The lead reviews each completed slice, integrates it back into the main worktree, and arbitrates any conflict when two slices have touched the same shared file. Tests run after every integration.
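As a rough illustration of the isolated-workspace step, the sketch below builds the git commands a lead might run to give each sub-agent its own worktree and branch. The `git worktree add -b <branch> <path> <commit-ish>` and `git worktree remove <path>` forms are standard git; the `agent/` branch prefix, the `../worktrees` directory, the slugs, and the `planWorktree` helper are hypothetical names chosen for the example.

```typescript
// Hypothetical helper: builds git worktree commands for scoped sub-agent tasks.
// It only produces command strings; running them is left to the caller.
interface SubAgentTask {
  slug: string;  // short identifier, reused for the branch and directory name
  brief: string; // the scoped task handed to the sub-agent
}

interface WorktreePlan {
  branch: string;
  path: string;
  addCommand: string;
  removeCommand: string;
}

function planWorktree(task: SubAgentTask, baseDir: string, baseRef: string): WorktreePlan {
  // Restrict slugs so they are safe to embed in a branch name and a shell command.
  // baseDir and baseRef are assumed to be trusted, fixed values.
  if (!/^[a-z0-9-]+$/.test(task.slug)) {
    throw new Error("slug must be lowercase letters, digits, and hyphens: " + task.slug);
  }
  const branch = "agent/" + task.slug;
  const worktreePath = baseDir + "/" + task.slug;
  return {
    branch: branch,
    path: worktreePath,
    // Creates a new branch from baseRef and checks it out in its own directory.
    addCommand: "git worktree add -b " + branch + " " + worktreePath + " " + baseRef,
    // Run after the lead has integrated the slice.
    removeCommand: "git worktree remove " + worktreePath,
  };
}

// The three example briefs from the paragraph above.
const tasks: SubAgentTask[] = [
  { slug: "voice-profile-route", brief: "build the API route for voice profile creation" },
  { slug: "trial-tracking-refactor", brief: "refactor the trial-tracking helpers" },
  { slug: "weekly-digest-template", brief: "draft the weekly digest email template" },
];

const plans = tasks.map((task) => planWorktree(task, "../worktrees", "main"));
```

Each worktree shares the repository's object store but has its own working directory, which is what lets several agents edit files at the same time without stepping on each other.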

Single-agent development compresses individual tasks. Multi-agent orchestration compresses the whole project, because the slices do not queue behind each other. A new API route, a refactor of the billing flow, a new email template, and a new marketing page can all be in flight at the same time, on the same evening. The discipline is in scoping the tasks so the surface areas do not collide, and in being honest about what is independent versus what is genuinely sequential.
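One way to make "the surface areas do not collide" checkable before dispatch is to have each brief declare the paths it may touch and compare them pairwise. This is a minimal sketch under that assumption; the `ScopedTask` shape, the task ids, and the file paths are all hypothetical.

```typescript
// Hypothetical pre-dispatch check: flags pairs of tasks whose declared
// path scopes overlap, so the lead can serialise them instead.
interface ScopedTask {
  id: string;
  paths: string[]; // files or directories the task is allowed to edit
}

// Two scopes overlap when they are equal or one is nested inside the other.
function prefixesOverlap(a: string, b: string): boolean {
  // Normalise with a trailing slash so "src/app" does not match "src/app-legacy".
  const withSlash = (p: string): string => (p.charAt(p.length - 1) === "/" ? p : p + "/");
  const x = withSlash(a);
  const y = withSlash(b);
  return x.indexOf(y) === 0 || y.indexOf(x) === 0;
}

// Returns every pair of task ids with at least one overlapping scope.
function findCollisions(candidates: ScopedTask[]): Array<[string, string]> {
  const collisions: Array<[string, string]> = [];
  candidates.forEach((left, i) => {
    candidates.slice(i + 1).forEach((right) => {
      const clash = left.paths.some((a) => right.paths.some((b) => prefixesOverlap(a, b)));
      if (clash) collisions.push([left.id, right.id]);
    });
  });
  return collisions;
}

const evening: ScopedTask[] = [
  { id: "voice-profile-route", paths: ["app/api/voice-profiles"] },
  { id: "trial-helpers-refactor", paths: ["lib/trial"] },
  { id: "weekly-digest-email", paths: ["emails/weekly-digest.tsx", "lib/trial/limits.ts"] },
];

// The digest task reaches into lib/trial, so it collides with the refactor.
const collisions = findCollisions(evening);
```

A declared-scope check like this only catches overlaps the briefs admit to. Shared types and configuration still need the lead's judgement, because agents can reach for files nobody listed.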

The lead and sub-agent model is documented in Anthropic's engineering writing on Claude Code and in the official Claude Code documentation, and Anthropic describes using a similar shape in its own work. Other AI development platforms expose similar primitives, so the methodology is not Claude-specific in principle. It is, however, currently easiest to operate with Claude Code because the agent tooling and the worktree isolation are first-class.

There are three concrete patterns we used repeatedly across the Semoria build.

Pattern one: parallel feature drafts. When three independent features were specified (for example, the achievements system, the notification bell, the email digest scheduler), three sub-agents drafted them in parallel in separate worktrees. The lead reviewed and integrated them sequentially over a single afternoon, instead of waiting on each one to finish before the next began.

Pattern two: draft-and-review pairs. One sub-agent writes the feature, a second sub-agent reviews it with a senior-engineer brief ("look for missed edge cases, security gaps, brittle assumptions, dead code"). The reviewer's output goes back to the lead, who decides what to action. This catches more than a single pass through the same agent does, because the review brief is different from the drafting brief and the reviewer is looking for what the drafter missed.

Pattern three: a batch refactor across the codebase. A single sub-agent gets a brief like "rename PlanTier to PlanStatus across all 73 API routes and 47 components, then run the typecheck and fix what breaks". The agent handles the mechanical work; the human checks the diff for anything subtle that the type system did not catch.
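To show why the human still reads the diff after a batch rename, here is a deliberately naive sketch of the mechanical part: a whole-word textual rename. It is an assumption about how such a pass could be approximated, not how the agents did it; a real refactor would more likely use compiler-aware tooling. The `renameIdentifier` helper and the sample source are hypothetical.

```typescript
// Naive whole-word rename. It matches identifiers by word boundary, so it
// leaves PlanTierBadge alone but will also rewrite string literals and
// comments, which the typechecker never inspects.
function renameIdentifier(
  source: string,
  from: string,
  to: string
): { text: string; count: number } {
  const ident = /^[A-Za-z_][A-Za-z0-9_]*$/;
  if (!ident.test(from) || !ident.test(to)) {
    throw new Error("expected plain identifiers");
  }
  let count = 0;
  const text = source.replace(new RegExp("\\b" + from + "\\b", "g"), () => {
    count += 1;
    return to;
  });
  return { text: text, count: count };
}

const sample = [
  'export type PlanTier = "voice" | "amplify" | "pro";',
  'const column = "PlanTier"; // string literal, invisible to the typechecker',
  "export function PlanTierBadge(tier: PlanTier) {}",
].join("\n");

// Three replacements: the type declaration, the string literal, and the
// parameter annotation. PlanTierBadge is a different identifier and is kept.
const renamed = renameIdentifier(sample, "PlanTier", "PlanStatus");
```

The string literal on the second line is the kind of change a passing typecheck says nothing about. If that string were a database column name, the rename would break at runtime, which is exactly what the human pass over the diff is for.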

How did the build cadence change across the six weeks?

The build cadence was not uniform. The shape of the work changed across the six weeks, and the methodology adjusted with it.

Week one was deep-work mode with a single agent. Architecture scaffolding, the initial database schema, the auth flow, the first set of API routes, the basic dashboard chrome. This kind of work has too much shared surface to parallelise safely: every later feature depends on these foundations, and a small mistake propagates. One agent, one worktree, one careful pass per night.

Weeks two and three were the heaviest parallel-work phase. The foundations were in place, and the feature set decomposed cleanly into independent slices: voice fingerprinting (its own API surface, its own tables), content generation pipeline (its own n8n workflows, its own state machine), Stripe billing (its own webhook handlers, its own RLS policies), email templates (each one independent of the others). Three to five sub-agents in flight at most points, the lead reviewing in fast cycles.

Week four was polish-and-harden mode. Single agent, careful work, but with a different brief: security review of every endpoint, row-level-security audit on every table, prompt-injection blocking on every AI call, rate limiting, observability wiring. This is the kind of work that benefits from one mind holding the whole picture rather than four minds in parallel.

Week five was parallel again, this time on the marketing engine. The programmatic blog, free tools, persona pages, social automation, and email funnels all decomposed cleanly because they share no runtime state with each other. Four sub-agents on marketing surfaces while the lead handled the final billing integration.

Week six was the dress rehearsal. End-to-end tests, real-user trial walkthroughs, payment edge cases, mobile compatibility on the email templates, the Stripe webhook race conditions that the unit tests had missed. Multi-agent again, but on bug fixes rather than features.
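The post does not say which webhook races surfaced, so the following is a generic sketch of one common class rather than the Semoria fix: Stripe can deliver the same event more than once and does not guarantee ordering, so a handler usually needs to be idempotent and to ignore stale updates. The flattened `BillingEvent` shape and the in-memory store are hypothetical simplifications; real Stripe events carry `id` and `created` at the top level, with the subscription nested under `data.object`.

```typescript
// Hypothetical, simplified view of a Stripe subscription event.
interface BillingEvent {
  id: string;             // unique per event, e.g. "evt_..."
  created: number;        // Unix seconds
  subscriptionId: string;
  status: string;         // e.g. "trialing", "active", "canceled"
}

// In-memory stand-in for what would normally be database tables.
interface BillingState {
  seenEventIds: Record<string, true>;
  latest: Record<string, { status: string; created: number }>;
}

type Outcome = "applied" | "duplicate" | "stale";

function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function applyEvent(state: BillingState, event: BillingEvent): Outcome {
  // Idempotency: a redelivered event must not be processed twice.
  if (has(state.seenEventIds, event.id)) return "duplicate";
  state.seenEventIds[event.id] = true;

  // Ordering: never let an older event overwrite newer subscription state.
  if (has(state.latest, event.subscriptionId)) {
    const current = state.latest[event.subscriptionId];
    if (current.created > event.created) return "stale";
  }
  state.latest[event.subscriptionId] = { status: event.status, created: event.created };
  return "applied";
}

const billing: BillingState = { seenEventIds: {}, latest: {} };
const activated: BillingEvent = { id: "evt_2", created: 200, subscriptionId: "sub_1", status: "active" };
const lateTrial: BillingEvent = { id: "evt_1", created: 100, subscriptionId: "sub_1", status: "trialing" };

// Delivery order: activation, a retry of the same activation, then a late trial event.
const outcomes: Outcome[] = [
  applyEvent(billing, activated),
  applyEvent(billing, activated),
  applyEvent(billing, lateTrial),
];
```

In a serverless deployment the seen-event check would need to live in the database, typically as a unique constraint on the event id, because concurrent function instances share no memory. Unit tests that call the handler once per event, in order, cannot exercise either branch.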

The honest takeaway: the build cadence is not "always parallelise". It is "parallelise when the work decomposes, and stop parallelising when it does not". The skill is in the judgement of which is which.
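The judgement of "which is which" can be partly written down as a dependency list: anything whose prerequisites are finished can run in the same wave, and everything else waits. The sketch below groups tasks into such waves. The task ids loosely follow the weeks described above, but the specific dependency edges and the `planWaves` helper are assumptions made for illustration.

```typescript
// Hypothetical planner: groups tasks into waves that can run in parallel.
interface PlannedTask {
  id: string;
  dependsOn: string[]; // ids that must be integrated before this task starts
}

function planWaves(tasks: PlannedTask[]): string[][] {
  const done: Record<string, boolean> = {};
  let remaining = tasks.slice();
  const waves: string[][] = [];

  while (remaining.length > 0) {
    // A task is ready once every dependency has landed in an earlier wave.
    const ready = remaining.filter((task) => task.dependsOn.every((dep) => done[dep] === true));
    if (ready.length === 0) {
      throw new Error("circular or missing dependency");
    }
    ready.forEach((task) => {
      done[task.id] = true;
    });
    waves.push(ready.map((task) => task.id));
    remaining = remaining.filter((task) => done[task.id] !== true);
  }
  return waves;
}

const semoriaLikeBuild: PlannedTask[] = [
  { id: "foundations", dependsOn: [] },
  { id: "voice-fingerprinting", dependsOn: ["foundations"] },
  { id: "content-pipeline", dependsOn: ["foundations"] },
  { id: "stripe-billing", dependsOn: ["foundations"] },
  { id: "email-templates", dependsOn: ["foundations"] },
  {
    id: "security-hardening",
    dependsOn: ["voice-fingerprinting", "content-pipeline", "stripe-billing", "email-templates"],
  },
];

// Wave 1 is a single task, wave 2 has four parallel slices, wave 3 is single again.
const waves = planWaves(semoriaLikeBuild);
```

A wave with one entry is a signal to use a single agent. A wave with four entries is where sub-agents pay off, provided the slices also pass a scope check for shared files.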

What does AI do well versus what do humans still own?

The split between AI and human work was around 70-30 by hours, but the 30 percent that stayed with the human is the part that mattered most for whether the output was a product or just code.

AI handled the bulk of drafting, refactoring, test writing, integration scaffolding, batch edits, code review against a fixed brief, documentation, and anything where the specification was clear and the definition of done was verifiable. The latest GitHub Octoverse report shows a step-change in AI-assisted code volume across the industry, and the Semoria build is consistent with that trend at the project scale: most of the lines of code that shipped were AI-drafted under human direction, then reviewed and adjusted by either another agent or the human directly.

Humans handled prioritisation, scope arbitration, taste calls on product UX, the business model and pricing, security review of anything irreversible, decisions about which work to parallelise versus serialise, and the recurring "is this actually a good idea" check. Research summarised in the Harvard Business Review's generative AI coverage consistently lands on the same conclusion: AI lifts productivity most where the human stays in the lead seat for judgement, and lifts it least (or not at all) where the human disengages.

In concrete terms, the human role on the Semoria build looked like this every day. Pick the next slice of work in priority order. Decide whether it is one task or three. If three, write the briefs and dispatch sub-agents in parallel. While they work, review the previous slice's output and either approve, integrate, or send back with notes. At the end of the day, run the test suite, look at the deploy preview, and make a call on what tomorrow's first slice will be. The AI did the writing. The human did the steering.

The mistake people make when they think AI does not work for them is dropping the human role entirely and expecting the AI to make all of the steering decisions too. That is not the methodology that compresses six months into six weeks. The methodology is to keep the human role hard and small, and to delegate the writing.

How does this compare to other ways of building a SaaS?

Here is the honest comparison we used internally when we made the build call:

Approach	Typical timeline	Cost ceiling	Ceiling on complexity	Best for
Traditional development team	4 to 9 months for an MVP at this complexity	£60,000 to £250,000 of engineering time	No real ceiling, but timeline grows with team size	Funded teams with runway and senior staff to spare
Single-agent AI development	3 to 5 months for a comparable scope	Mostly tool costs, plus one senior engineer's time	Tasks compress, projects do not, because work queues sequentially	Solo builders with one well-scoped feature at a time
Multi-agent orchestrated AI development	4 to 8 weeks for the same scope	Tool costs plus one senior engineer's time	Limited by how cleanly the project decomposes into independent slices	Founders and small teams who want platform-grade output without a platform-grade team
No-code platforms	Days to weeks for simple SaaS	Low upfront, expensive at scale	Hard ceiling on complexity, hard ceiling on customisation	Solo founders validating a model who can rebuild later

The middle two rows are where most independent builders sit today. The third row, multi-agent orchestration, is where we built Semoria and where most platform-grade builds with a small team will sit over the next two years.

What did we deliberately defer to keep the timeline?

The build held the six-week target because the scope was held tightly, not because the methodology bent the laws of software. The honest list of what we deferred is short, but the discipline of writing it down up front is what kept the team out of scope creep.

We deferred the second voice profile (Pro tier) to week five rather than week one. We deferred scheduling and calendar view (Amplify tier) to week four. We deferred the three free tools (Voice Analyser, Post Generator, Voice Score Card) to a series of one-off Phase 2 builds that landed in the weeks after launch. We deferred Stripe annual pricing to month two. We deferred the in-app notification bell to month two. We deferred mini series generation (2-to-5 post series) to month two.

Nothing in that list was cut. The items deferred within the build landed in weeks four and five, and the rest shipped within eight weeks of launch. The point is that the core paid product did not need any of them to charge money or to deliver value. Deferring the work bought the wall-clock time that the methodology then put to use.

What does this prove about how Agenticise builds for clients?

Two things that matter for any client weighing a build.

First, the methodology is reproducible. Multi-agent orchestration with parallel sub-agents in isolated worktrees, a senior engineer in the lead seat, and a deliberately small Phase 1 scope: that is the shape we now run for any platform-grade client engagement, as described on the Agenticise services page. Bespoke tone-of-voice agents for marketing teams, internal automation platforms, customer-facing portals: same approach. The timeline assumption for a Semoria-shaped project is now six to eight weeks from PRD to first paying user, not four to nine months. The /case-studies/semoria hub is the long-form home for the full story.

Second, the timeline is not a slogan. It is what the build inventory looks like at week six: 424 TypeScript files, 73 API routes, 64 database migrations, 47 blog posts at launch, 27 transactional email templates, 5 n8n workflows. Every figure is from the production codebase, not estimates.

The methodology that made this work for Semoria is the same methodology that runs in every Agenticise platform engagement now. If the platform you have been waiting to ship has been quoted in months, it is worth a conversation about what the same scope looks like under multi-agent orchestration. The next post in this series covers the architecture and stack choices that make a £40-a-month cost ceiling at 50 users realistic. Post 4 covers how the marketing engine shipped alongside the product rather than after launch. The origin post covers why Semoria exists at all.

Frequently asked questions

Can you really build a production SaaS in 6 weeks?

Yes, when the scope is tightly bounded, the methodology is multi-agent rather than single-agent, and an experienced engineer stays in the lead seat for judgement and integration. Six weeks covered the full Semoria build: multi-tenant auth, Stripe billing, voice fingerprinting, content generation, scheduling, the marketing engine, 64 database migrations, 73 API routes, 27 transactional email templates, and a programmatic blog with 47 posts at launch. The compression comes from parallelising independent slices of work, not from any single agent moving faster.

What is multi-agent orchestration in AI development?

Multi-agent orchestration is the practice of running several AI coding agents in parallel against different parts of the same codebase, with a lead agent coordinating their work. Each agent gets an isolated workspace, a scoped task, and a clear definition of done. The lead reviews, integrates, and arbitrates conflicts. The result is real wall-clock compression on multi-feature work without sacrificing review quality, because independent slices run concurrently rather than queueing behind each other.

How do you stop parallel AI agents from conflicting?

Two safeguards. First, every parallel agent works in an isolated git worktree so file edits cannot collide at the filesystem layer. Second, the lead agent scopes tasks so the surface areas do not overlap, which is the same discipline a senior engineer applies when splitting work across a team. When agents do touch shared files such as types or configuration, the lead serialises those changes and runs them through review before integration. The human role shifts from writing code to arbitrating scope and judging trade-offs.

How maintainable is AI-built code?

Maintainable when it is reviewed under the same standards human-written code would be reviewed under, and brittle when it is not. Multi-agent orchestration helps because review is built into the workflow: a separate review agent runs against each merged slice, a senior engineer arbitrates anything irreversible, and tests are written in the same pass as the feature. Semoria has been in production with paying users since the build finished, and the same agents continue to ship maintenance and new features against the same codebase every week.

What does AI handle versus what humans still own?

AI handles drafting, integration scaffolding, refactoring at scale, batch reviews, and any task with a clear specification and a verifiable definition of done. Humans still own judgement, taste, business model decisions, security review of anything irreversible, prioritisation, and arbitration when parallel work touches the same surface. The Semoria build was about 70 to 80 percent agent execution and 20 to 30 percent human direction, but the 20 to 30 percent was the part that determined whether the output was actually a product or just code.
