The problem
A Business Requirements Document is the single most-typed document inside enterprise project delivery. Every initiative has one; every team has its own template; every Business Analyst spends hours filling that template with text that is mostly structural and rarely the part of the work that actually requires their judgment.
The waste is not the writing — most of the writing is good. The waste is that writing the structure exhausts the BA before they get to the parts only they can write: the assumptions, the constraints, the security controls that depend on the organisation's risk appetite. By the time the BA reaches non-functional requirements, the document is six pages of dutiful prose and one paragraph of real thinking.
The approach
Structured input, structured output. Take a focused project brief as a typed form — objective, stakeholders, scope boundaries, target users, regulatory context — and produce a schema-conformant BRD. The schema is the BRD; the prose is a projection of the schema.
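As a rough illustration of what "a brief as a typed form" could look like, here is a minimal sketch. The field names follow the list above; the types, the split of scope into two lists, and the `missing_fields` helper are assumptions made for the example, not the product's actual intake model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProjectBrief:
    """Typed intake form. Field names follow the brief described above; types are assumed."""
    objective: str
    stakeholders: list[str]
    in_scope: list[str]
    out_of_scope: list[str]
    target_users: list[str]
    regulatory_context: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of required fields the BA left empty."""
        required = ("objective", "stakeholders", "in_scope", "target_users")
        return [name for name in required if not getattr(self, name)]


brief = ProjectBrief(
    objective="Replace manual access reviews with a quarterly automated workflow",
    stakeholders=["Head of IAM", "Internal Audit"],
    in_scope=["Joiner-mover-leaver reviews"],
    out_of_scope=["Privileged access tooling"],
    target_users=[],  # left empty on purpose to show the check
)
print(brief.missing_fields())
```

A form like this can refuse to start generation until the BA has supplied what only they know, rather than letting the model guess at it.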
Template is a rendering target, not a generation constraint. The model produces vendor-neutral content against a stable schema. The team's Word template is a renderer at the end of the pipeline — never a prompt input. This means swapping templates does not destabilise generation, and a single brief can be rendered into multiple house styles.
A critic, again. A drafter agent fills the schema. A critic agent, with its own prompt and context, reviews each section and surfaces structured per-section feedback. Same two-agent pattern as Forge — same reason: shallow self-critique is worse than no critique. The drafter is allowed to be confident; the critic is allowed to disagree.
What I built
- Structured brief intake: a focused form, not a chat. The BA enters what only they know; the model is not asked to invent organisational context.
- Schema-driven generation: 12 BRD sections, each a typed schema, each produced by an OpenAI structured-outputs call against that schema.
- Critic agent: a second model call per section. Returns severity-tagged, structured feedback. Only medium-and-above issues surface in the default UI.
- Word export: python-docx renders the schema into the default enterprise-flavoured template. Heading styles, numbering, and tables match the convention BAs already use.
- Default template only (v1): team template upload is the agreed v1.1 work. Cutting it from v1 is the most expensive correct decision in the project.
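The critic's severity-tagged feedback and the medium-and-above default could be modelled roughly as follows. This is a sketch under assumptions: the three severity levels, the `Feedback` fields, and the sample messages are invented for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Assumed levels; the write-up only states that feedback is severity-tagged.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Feedback:
    section: str
    severity: Severity
    message: str


def visible_feedback(items, threshold=Severity.MEDIUM):
    """Keep items at or above the threshold, most severe first."""
    kept = [item for item in items if item.severity >= threshold]
    return sorted(kept, key=lambda item: item.severity, reverse=True)


feedback = [
    Feedback("Scope", Severity.LOW, "Consider naming the excluded systems."),
    Feedback("Acceptance Criteria", Severity.HIGH, "FR-3 has no acceptance criterion."),
    Feedback("Security Controls", Severity.MEDIUM, "Retention period is not stated."),
]
for item in visible_feedback(feedback):
    print(f"[{item.severity.name}] {item.section}: {item.message}")
```

Keeping the threshold as a parameter means low-severity notes are still stored and can be shown on request, while the default view stays short.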
Each of the twelve sections is generated against its own typed schema.
Technical decisions worth calling out
Structured outputs over prompting. Twelve sections, twelve schemas, twelve constrained generations. The schema defines what a section must contain; the model fills the shape. Parsing prose into a structured BRD would have been a downstream nightmare; producing the structure directly is the simpler engineering move.
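To make "the schema defines what a section must contain" concrete, here is a sketch of one section schema and the `response_format` payload it would be wrapped in. The payload follows the JSON-schema form of OpenAI's structured outputs as I understand it; the Assumptions field names and both helper functions are hypothetical, and the model reply is a hard-coded stand-in so the example runs without an API key.

```python
import json

# Hypothetical schema for one section; the real field names are not given in this write-up.
ASSUMPTIONS_SCHEMA = {
    "type": "object",
    "properties": {
        "assumptions": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "statement": {"type": "string"},
                    "owner": {"type": "string"},
                },
                "required": ["statement", "owner"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["assumptions"],
    "additionalProperties": False,
}


def response_format_for(name: str, schema: dict) -> dict:
    """Build the response_format argument for a strict JSON-schema completion."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def parse_section(raw: str, schema: dict) -> dict:
    """Parse the model's JSON reply and check the top-level required keys."""
    data = json.loads(raw)
    missing = [key for key in schema["required"] if key not in data]
    if missing:
        raise ValueError(f"section is missing required keys: {missing}")
    return data


request_fragment = response_format_for("assumptions", ASSUMPTIONS_SCHEMA)
# Stand-in for a model reply.
sample_reply = '{"assumptions": [{"statement": "SSO is already in place", "owner": "IAM"}]}'
section = parse_section(sample_reply, ASSUMPTIONS_SCHEMA)
print(section["assumptions"][0]["statement"])
```

With twelve sections this becomes twelve schema constants and one shared call path, which is the "twelve constrained generations" shape described above.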
Separation of content and template. The model never sees the team's Word file. The schema is vendor-neutral; the renderer is the only component that knows about heading levels, table styles, and house numbering. This is the load-bearing decision of the entire project — it is what makes template upload (v1.1) tractable rather than a rewrite.
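The boundary can be sketched with two toy renderers over the same neutral data. Plain-text output stands in for the python-docx renderer here, and the `Section` type, the renderer classes, and the `export` function are all illustrative names rather than the project's code.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Section:
    """Template-neutral content: no heading levels, no numbering."""
    title: str
    paragraphs: list[str]


class Renderer(Protocol):
    def render(self, sections: list[Section]) -> str: ...


class NumberedRenderer:
    """House style A: numbered, upper-case headings."""

    def render(self, sections: list[Section]) -> str:
        blocks = []
        for number, section in enumerate(sections, start=1):
            body = "\n".join(section.paragraphs)
            blocks.append(f"{number}. {section.title.upper()}\n{body}")
        return "\n\n".join(blocks)


class UnderlinedRenderer:
    """House style B: unnumbered headings with an underline."""

    def render(self, sections: list[Section]) -> str:
        blocks = []
        for section in sections:
            body = "\n".join(section.paragraphs)
            blocks.append(f"{section.title}\n{'-' * len(section.title)}\n{body}")
        return "\n\n".join(blocks)


def export(sections: list[Section], renderer: Renderer) -> str:
    """The only place where content meets a house style."""
    return renderer.render(sections)


brd = [
    Section("Scope", ["Quarterly access reviews for all internal applications."]),
    Section("Assumptions", ["SSO is already in place."]),
]
for style in (NumberedRenderer(), UnderlinedRenderer()):
    print(export(brd, style), end="\n\n")
```

Nothing upstream of `export` changes when a new house style is added, which is the property that keeps template upload from turning into a rewrite.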
Drafter and critic as separate calls. Same reasoning as Forge. A model writing in one pass writes too confidently; a second model with its own context produces honest review. Costs an extra call per section; pays for itself the first time the critic catches a missing acceptance criterion.
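A minimal sketch of the two-call pattern, with both agents injected as plain callables so it runs offline. In a real pipeline each callable would wrap its own model call and prompt; the stub logic, the function names, and the draft fields below are assumptions for illustration only.

```python
from typing import Callable

Drafter = Callable[[str, dict], dict]        # (section name, brief) -> draft
Critic = Callable[[str, dict], list[str]]    # (section name, draft) -> issues


def draft_and_review(sections, brief, drafter: Drafter, critic: Critic):
    """Run the drafter, then hand only the finished draft to the critic."""
    results = {}
    for name in sections:
        draft = drafter(name, brief)
        # The critic sees the draft alone, not the drafter's prompt or context.
        issues = critic(name, draft)
        results[name] = {"draft": draft, "issues": issues}
    return results


def stub_drafter(name, brief):
    """Stand-in for the drafting model call."""
    return {"title": name, "text": f"{name} for: {brief['objective']}", "criteria": []}


def stub_critic(name, draft):
    """Stand-in for the critic model call."""
    if name == "Acceptance Criteria" and not draft["criteria"]:
        return ["No acceptance criteria listed."]
    return []


review = draft_and_review(
    ["Scope", "Acceptance Criteria"],
    {"objective": "Automate access reviews"},
    stub_drafter,
    stub_critic,
)
print(review["Acceptance Criteria"]["issues"])
```

Passing the agents in as arguments also makes the extra call per section easy to switch off or stub out in tests.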
Enterprise-flavoured sections by default. User Access Management, Security Controls, and Benefits Realization are surfaced as first-class sections rather than as appendices. This reflects how BRDs are actually read in regulated environments — and it is what makes the output land as senior work, not as a template fill.
python-docx, not a templating engine. python-docx gives precise control over Word styles, numbering, and tables — the things that make a generated document look like it belongs in the team's repo, not like it was produced by a tool.
Hard problem worth mentioning
Template mapping. Every team's BRD looks different: different heading hierarchy, different numbering, different conventions for what a Functional Requirement row looks like. The clean architectural answer is to keep the schema neutral and let the renderer carry the team-specific shape. The hard part is making that promise hold against the variety of templates real teams actually use.
v1 ships with one default template that covers the common enterprise shape. v1.1 will add template upload: the user provides a .docx with placeholder regions, the system maps schema fields to those placeholders, and the renderer fills them. The mapping itself is a small declarative file the user reviews, not free text, because inferring template structure from a Word document fails in exactly the way that produces a confidently wrong BRD. Cutting upload from v1 was the most expensive correct call in the project.
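Since v1.1 is not built yet, the following is only a guess at what a reviewable mapping and its sanity check might look like. The `{{...}}` placeholder syntax, the dotted field paths, and the `check_mapping` function are all hypothetical.

```python
import json

# Hypothetical mapping file contents: template placeholder -> schema field path.
MAPPING_JSON = """
{
  "{{EXEC_SUMMARY}}": "executive_summary.text",
  "{{SCOPE_IN}}": "scope.in_scope",
  "{{RISKS}}": "risks.items"
}
"""

# Assumed field paths exposed by the neutral schema.
SCHEMA_FIELDS = {"executive_summary.text", "scope.in_scope", "scope.out_of_scope"}


def check_mapping(mapping: dict, schema_fields: set) -> dict:
    """Report placeholders that point nowhere and schema fields left unrendered."""
    unknown = sorted(p for p, path in mapping.items() if path not in schema_fields)
    unmapped = sorted(schema_fields - set(mapping.values()))
    return {"unknown_targets": unknown, "unmapped_fields": unmapped}


report = check_mapping(json.loads(MAPPING_JSON), SCHEMA_FIELDS)
print(report)
```

A check like this turns a silent rendering gap into something the user sees while reviewing the mapping, before any document is produced.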
Outcome
Live as a personal product. Validated with practising Business Analysts inside regulated environments — the test was not "does it produce text" but "does it produce text the BA would have written, leaving them only the parts they should have been spending their time on in the first place." It passed that test in the sections where I expected it to (Executive Summary, Scope, Assumptions) and surfaced sharper questions than I expected in the ones where the critic does the most work (Acceptance Criteria, Security Controls).
What I'd do differently
Vendor-neutral schema with domain section packs as extensions. The current 12-section schema is enterprise-flavoured by default. That is the right choice for the BAs I built it for, but a BRD for a small product team has a different shape from a BRD for a regulated bank initiative, and forcing the small-product BRD through the enterprise schema produces sections the user has to delete by hand.
The next iteration treats the core schema as vendor-neutral and ships domain section packs as extensions — Enterprise, Product, Public Sector, Healthcare. Same drafter, same critic, different schema composition. The mistake was bundling the most common case into the core rather than packaging it as one of several first-class shapes.
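One way to picture "different schema composition" is a small core list plus named packs. This is a sketch of the idea, not a design: which sections count as core is an assumption, and only the three Enterprise section names come from this write-up.

```python
# Assumed core; the write-up does not say which sections would stay in it.
CORE_SECTIONS = ["Executive Summary", "Scope", "Assumptions", "Acceptance Criteria"]

# Pack contents are illustrative apart from the Enterprise names.
SECTION_PACKS = {
    "enterprise": ["User Access Management", "Security Controls", "Benefits Realization"],
    "product": ["Success Metrics"],
}


def compose_schema(packs):
    """Core sections first, then each requested pack in order, without duplicates."""
    composed = list(CORE_SECTIONS)
    for pack in packs:
        for section in SECTION_PACKS[pack]:
            if section not in composed:
                composed.append(section)
    return composed


print(compose_schema(["product"]))
print(compose_schema(["enterprise"]))
```

Under this shape a small product team never sees the enterprise sections at all, instead of deleting them after generation.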