RQ-107 deep research 生出力 (claude / 2026-06-18・$2.23 / 8 min)。突合・採択は Synthesis 参照。

Role Decomposition for Strategic / Architectural Decisions

A research report for the Decision Pipeline (ADR) Job-Executor design

Scope. This report maps how mature governance frameworks decompose the roles involved in proposing, reviewing, deciding, recording, executing, and re-using strategic decisions; how that decomposition flexes by organization size and audit regime; how multi-agent AI systems are formally being slotted into those roles; and which roles a single founder + multi-clone AI pipeline can safely collapse vs. must keep separated. It is intended to feed the Job Executor section of the Decision Pipeline JTBD.


Executive summary — five key findings

  1. Six roles are robust across frameworks (appear in ≥3 of DACI / RAPID / RACI / Bezos Type-1/2 / Holacracy / Sociocracy 3.0 / Spotify / J-SOX / OECD CG): Proposer/Recommender, Reviewer/Contributor/Input, Decider/Approver, Implementer/Performer, Recorder/Secretary, and Informed. Two roles are common but framework-specific: Agreer/Veto-holder (RAPID) and Compliance/Risk gate (RAPID's "A", J-SOX, OECD). One role is almost always implicit but rarely staffed: the Recaller / Retroactive Updater — a critical gap for ADR re-use.

  2. The "single Decider, separate from Recommender" rule is a hard constraint across all reviewed frameworks. DACI explicitly bars Driver=Approver; the Driver and Approver cannot be the same individual; RAPID treats "Single D" as the keystone; RACI requires only one Accountable per task; Holacracy elects the Facilitator/Secretary/Rep Link separately from the Lead Link. For a one-person operator, this rule cannot be obeyed mechanically and must be replaced by compensating controls (AI second-opinion, time-delay, retro-review).

  3. Audit regimes (SOX 404 / J-SOX / ISO-27001 A.5.3) make role separation legally required for financial-impact decisions, but explicitly permit "compensating controls" for small teams — typically logged break-glass accounts, independent third-party review, mandatory holiday/rotation, and tamper-evident logs. Small organisations use compensating controls like increased log monitoring, management reviews, automated alerts, and independent third-party verification where staff numbers are limited. For an accounting product specifically, the product-decision layer (ADRs) is not a SOX-controlled process; only the customers' financial transactions running through it are. This is a critical scope distinction.

  4. Multi-agent AI literature has converged on a canonical role split — Orchestrator/Lead, Worker/Specialist, Verifier/Critic, and (separately) Human-in-the-loop Approver — with strict isolation between Generator and Verifier contexts. Self-review is documented to fail: when a model reviews its own output in the same context window, it carries its original assumptions into the review… Independent review with no shared context eliminates this problem. The implication for a multi-clone Claude Code setup: drp / main / doc clones can already realize "different reviewer identities" if each clone receives only the requirements + artifact (not the originator's reasoning).

  5. For an n=1 + AI pipeline operating 5–15 ADRs/month on an accounting SaaS, the must-staff (cannot-collapse) roles are: (a) Decider on irreversible/Type-1 ADRs, (b) Independent Compliance/Policy gate on anything touching customer financial logic, and (c) Retroactive Updater / Recaller on any ADR older than 6 months. Everything else — Proposer, Reviewer-flavors, Recorder, Implementer, Informed broadcast, slug/numbering/body-gen — can be collapsed onto founder+AI with logged compensating controls. The biggest current gap in the user's pipeline is not the lack of a second human; it is the lack of an explicit Recaller role to surface old ADRs as context for new ones.


Q1. Standard role decomposition across mature frameworks

1.1 What each framework names

DACI (Intuit, 1980s). DACI, which stands for Driver, Approver, Contributor, and Informed, is a decision-making framework that assigns one of four roles to each stakeholder based on their level of involvement in a decision. It was developed as a variation of the RACI matrix to clarify who is responsible for driving a decision and who makes the final call. Each person should only hold one role. The Driver and Approver cannot be the same individual. Contributors can include multiple people, while the Informed group can be as many as needed. A DAI variant adds a Delegate: "Delegate," is responsible for executing the decision once it has been approved.

RAPID (Bain & Co, HBR 2006). The five letters in RAPID represent five critical roles in decision making: Recommend - Make a proposal, gather input and provide data and analysis. Agree - Vote "yes" or "no" to accept or reject the recommendation. Perform - Implement the decision once it has been made. Input - Consult the "right people" (usually those implementing the decision.) Decide - Make a formal and definitive decision. Key disciplines: Single D: Exactly one Decider per decision. Multiple D's = no decision. Veto is narrow: Keep Agree roles few and tightly defined. If someone wants a say but not a veto, they are Input. Time-boxed consultation: Set response SLAs (e.g., A responds in 48 hours; I within 3 business days) and "silence = no objection" rules where appropriate. The "Agree" role is the formal veto/compliance slot: Agree (A): A small number of roles with limited veto rights on clearly defined grounds (e.g., legal compliance, financial controls, brand/ethics, security).

RACI (project management lineage). A responsibility assignment matrix, RACI matrix (responsible, accountable, consulted and informed matrix)… A: Accountable (also approver or approving authority) — The one ultimately answerable for the correct completion of the deliverable or task, ensuring the prerequisites of the task are met, and delegating the work to those responsible. Accountable stakeholders sign off and approve work that responsible stakeholders provide… C: Consulted — Those whose opinions are sought, such as subject-matter experts, and with whom there is two-way communication. I: Informed — Those who are kept up-to-date on progress, often only on completion of the task or deliverable, and with whom there is just one-way communication. The RASCI variant adds Support: RASCI is another type of responsibility assignment matrix… It retains the four core roles of RACI — Responsible, Accountable, Consulted, and Informed — but adds a fifth: Supportive. The Supportive role in a RASCI chart is responsible for providing assistance to those in the Responsible role. The CAIRO variant adds an explicit "Out-of-the-loop" role: The CAIRO model expands RACI by adding the letter O (and jumbles up the letters). The letter O is for Out-of-the-loop and is for situations where you need to designate an individual or team as not involved in a task at all.

Bezos Type-1 / Type-2 (Amazon 1997 letter). Not a role taxonomy but a decision-class taxonomy that gates which role list applies: Some decisions are consequential and irreversible or nearly irreversible – one-way doors – and these decisions must be made methodically, carefully, slowly, with great deliberation and consultation… We can call these Type 1 decisions. But most decisions aren't like that – they are changeable, reversible – they're two-way doors… Type 2 decisions can and should be made quickly by high judgment individuals or small groups. Bezos warns of the role-bloat failure mode: As organizations get larger, there seems to be a tendency to use the heavy-weight Type 1 decision-making process on most decisions, including many Type 2 decisions. Implication: the role table below should be applied fully only to Type-1 ADRs; Type-2 ADRs should collapse aggressively.

Holacracy. Adds elected procedural roles distinct from the decision-content roles: There are also key roles to help organise the process and workflow of each circle including Facilitator, Secretary, Lead Link, and Rep Link. Crucially, governance decisions are made by Integrative Decision Making (consent-based), while any decision other than governance decisions can be made "autocratically"… anyone can step up and make any decision. The Lead Link assigns roles and priorities but is not the decision-maker on policy: A Lead Link has role responsibility: who has which role. The Lead Link cannot define or change roles, domains, and policies alone. That's governance – and governance belongs to the entire circle. The Secretary role is a formal recorder slot: Capturing and publishing the outputs of the Circle's required meetings, and maintaining a compiled view of the Circle's current Governance, checklist items, and metrics.

Sociocracy 3.0. Consent governs decision-making. A decision stands unless someone raises a reasoned, paramount objection. Circles are the basic governance unit. Each circle has a defined aim, domain, and authority to govern itself. Double-linking connects circles. Two people, a Leader and a Delegate, link each circle to its parent, ensuring bidirectional information flow. S3 distinguishes two kind of decisions; political and operational: Political Decisions: this is how we should act. Operational Decisions: this is how we act. — analogous to Bezos's Type-1/Type-2 distinction.

Spotify model. Less of a decision-role framework, more of an organizational topology. The typical decision-rights map laid over Spotify uses a DACI overlay: Publish a one‑page decision map (e.g., DACI): PO decides product priorities and scope; Chapter Lead owns hiring, performance, standards, and craft development; Tech Lead guides technical approach; Tribe Lead owns area outcomes and budget. Architectural review is treated as a cross-tribe gate: Establish lightweight architectural review (e.g., weekly architecture clinic) to address cross‑tribe decisions.

J-SOX (Japan's Financial Instruments and Exchange Act). J-SOX's Implementation Standards identify four components of internal control: Entity-level controls — Governance, risk management, and monitoring at the organizational level · Process-level controls — Controls over significant business processes affecting financial reporting · IT general controls — Controls over IT systems that support financial reporting processes. The framework imposes a separation of management assessor (designs/operates the control) from external auditor (attests) and from the operator of the controlled process — a classic three-line model. J-SOX tends to emphasize principles-based standards, providing companies more flexibility in designing and implementing internal control procedures. The relevance for an ADR pipeline is indirect: if an ADR changes a control that affects financial reporting, the change itself falls under J-SOX/SOX change-management scope.

OECD/G20 Principles of Corporate Governance. Defines the highest-level role split: Corporate governance involves a set of relationships between a company's management, board, shareholders and stakeholders. it is good practice for that person not to be involved in any decision involving the transaction or matter when a material interest is declared — the framework-level statement of "no self-review." OECD also assigns disclosure as a first-class duty: The corporate governance framework should ensure that timely and accurate disclosure is made on all material matters regarding the corporation, including the financial situation, performance, ownership, and governance of the company.

Architectural Decision Record practice (Nygard, AWS, Microsoft, MADR). Treats the lifecycle as state machine with explicit roles attached to state transitions. After the team identifies an architectural decision and its owner, the ADR owner provides the ADR in the Proposed state at the beginning of the process. ADRs in the Proposed state are ready for review. The ADR owner then initiates the review process for the ADR. The goal of the ADR review process is to decide whether the team accepts the ADR, determines that it needs rework, or rejects the ADR. The "immutability" rule introduces a new role implicitly: When the team accepts an ADR, it becomes immutable. If new insights require a different decision, the team proposes a new ADR. That replacement act is performed by what we will call the Retroactive Updater / Superseder. The standard ADR doc explicitly names the role list: Consider roles such as proposer, researcher, evaluator, reviewer, approver, maintainer, and the like. Consider responsibilities such as communication with stakeholders, ensuring expectations are met, sharing on the website or intranet, and reviewing the work periodically and especially when relevant.

1.2 Normalized cross-framework role table

Each row is a canonical role. The "appears in" column shows whether the role is named (✓), bundled into a larger role (~), or absent (—) in each framework. A role appearing in 3+ frameworks is treated as Robust; 1–2 as Framework-specific.

#Canonical roleResponsibilityDeliverableSeparation requirementTypical handoff → nextDACIRAPIDRACIBezosHolacracyS3SpotifyJ-SOXOECD CGADR practiceRobustness
1Proposer / Recommender / DriverFrame the problem, gather data, draft options, write the artifactDraft ADR (status: Proposed)Cannot also be Decider on same item→ Reviewer✓ Driver✓ Recommend✓ Responsible~ implicit~ role-holder raising tension✓ Proposer~ Tech Lead / PO~ process owner~ management✓ ADR ownerRobust (10/10)
2Input provider (advisory)Supply data, expertise, context; no vetoReviewer comments, citationsDistinct from Decider; time-boxed→ Recommender, → Reviewer✓ Contributor✓ Input✓ Consulted~~ Circle member~ Circle member~ Guild expert~ stakeholders~ researcherRobust (9/10)
3Reviewer / Critic (independent)Identify flaws, missing alternatives, hidden trade-offsStructured pass/fail + issue listShould not share context with Proposer (verifier pattern)→ Compliance gate, → Decider~ Contributor~ Input/Agree~ Consulted~ Type-1 deliberation✓ Objector (IDM)✓ Objector (CDM)~ Architecture clinic✓ test of controls✓ reviewerRobust (8/10)
4Compliance / Policy gate (veto)Narrow veto on legal, regulatory, security, financial-control groundsPass / block with rationaleIndependent of Proposer + Decider→ Decider (after veto resolved)✓ Agree~ Accountable for compliance domain~ Lead Link of risk circle✓ paramount objection~ guardrails✓ J-SOX A control owner✓ board/risk committee~ implicitRobust (7/10)
5Decider / Approver / AccountableMake the binding call, own the outcomeStatus: Accepted (or Rejected)Exactly one per decision (DACI, RAPID, RACI rules all converge)→ Implementer + Recorder✓ Approver✓ Decide✓ Accountable✓ Type-1: senior; Type-2: small group~ Lead Link (operational); Circle (governance)~ Circle by consent~ PO / Tribe Lead~ CEO/CFO attest~ board / executive✓ ownerRobust (10/10)
6Implementer / Performer / DelegateExecute the accepted decisionCode, configuration, control changesShould not also be sole Reviewer→ Recorder, → Informed~ (added in DAI: Delegate)✓ Perform✓ Responsible (execution)~ implicit~ role-holder~ role-holder✓ Squad✓ process operator~ management~ implementerRobust (9/10)
7Recorder / SecretaryCapture the decision, metadata, status; ensure findabilityNumbered, slugged, indexed ADR fileIndependent of decision content→ Informed, → Recaller✓ Secretary~ scribe~ documentation guild✓ documentation requirement✓ disclosure obligation✓ commit to docs/adr/Robust (5/10), but mandatory in audited contexts
8Informed (broadcast)Receive timely notice of outcome; one-way commsNotification, changelog entryNone(terminal)✓ Informed~ post-Decide✓ Informed~ Rep Link cascade~ transparency principle~ Tribe sync~ disclosure✓ disclosure~ release notesRobust (8/10)
9Out-of-the-loop (negative space)Explicit non-stakeholder declaration to prevent scope creepExclusion list~ CAIRO 'O'Framework-specific (1/10)
10Facilitator (process owner)Run the meeting, enforce process, stop discussion violationsDecision-meeting outputCannot be Proposer (Holacracy)→ Recorder✓ Facilitator✓ rotating facilitatorFramework-specific (2/10)
11Cross-link / Rep / LiaisonCarry context between circles / teams / decisionsSurfaced tensions, alignment✓ Rep Link✓ Delegate~ Chapter/GuildFramework-specific (2/10)
12Compliance attestor (external)Independent third party signs off on control design + operationAudit report, opinionMandatory independence(regulatory)✓ external auditor✓ external auditorFramework-specific (2/10)
13Retroactive Updater / SupersederWhen context changes, write a new ADR that explicitly supersedes an accepted oneNew ADR (status: Supersedes #N)Not same person as original Proposer (best practice, to avoid bias)→ full role cycle on new ADR~ via governance change~ Review Agreements pattern~ change management~ disclosure update✓ explicit Supersedes linkFramework-specific (3/10) but mandatory for any append-only log
14Recaller / Knowledge surfacerSurface relevant prior ADRs as context for a new decisionCitation set on new ADRsNone→ Proposer~ implicit Secretary~ implicit~ implicitAlmost never explicitly staffed — major gap
15Reader (consumer)Use ADRs to understand "why" during onboarding, debugging, future design(no deliverable; consumption)None~ Informed~ Informed~ readability requirement✓ named userFramework-specific (2/10)

Key observations on the table:

  • Roles 1, 2, 5, 6, 8 are universally robust. These are the spine of any decision lifecycle and must exist in any pipeline regardless of size.
  • Roles 3, 4, 7 are robust but vary in formality. They become hard requirements as audit exposure rises (Role 4 ⇄ SOX/J-SOX; Role 7 ⇄ J-SOX documentation; Role 3 ⇄ ISO-27001 A.5.3).
  • Roles 13–14 are systematically underspecified. ADR practice mentions "supersedes" but rarely names an actor who proactively re-opens stale ADRs. This is the user's observed "no explicit recaller actor" gap, and the literature confirms it is general, not just their pipeline.
  • The "no self-review" rule (Decider ≠ Proposer; Reviewer should not share context with Proposer) appears in every framework that explicitly addresses it (DACI, RAPID, OECD, S3, Holacracy, ISO-27001), and is the single most consistently asserted constraint.

Q2. Variation by organization size and industry

2.1 Size scaling — role count and specialization

Size bandHeadcountTypical role-count instantiated per decisionSpecialization modeWhat collapses, what hardens
Solo / micro1–53–4 (Proposer, Decider, Implementer, Informed broadcast collapsed onto 1–2 people)Generalist; one person wears 3+ hatsCompliance gate often informal; Recorder often skipped; audit exposure typically zero until first regulated customer or first audited contract
Startup<505–6 (Proposer, Reviewer = peer, Decider = founder/CEO, Implementer, Recorder = ad-hoc, Informed)Cross-functional; everyone is Consulted on somethingIn smaller organisations or start-ups where talent overlaps, it's expected that some team members must take on multiple responsibilities; the key is to enforce transparency, oversight, and regular review. Compensating controls become formal.
Mid-cap50–5007–9 (full DACI/RAPID; Compliance gate staffed; Architecture review board appears)Functional specialization; chapter/guild structures emergeRecorder formalized (ADR repo + ownership). Type-1 vs Type-2 lanes split. Compliance gate becomes mandatory pre-merge.
Large enterprise500–5,00010–14 (full table; Rep/Liaison roles staffed; Retroactive review board; legal/risk gates separated)Deep functional specializationSpotify-style structures or Holacracy/S3 attempts; For very large organisations (thousands of engineers), additional coordination layers beyond what the model specifies are typically needed.
Multinational5,000+14–18 (above + external auditor + multi-jurisdiction compliance + board sub-committees)Specialized by domain × regionOECD CG, J-SOX, SOX 404, sectoral regulators stack. Decision-rights are codified in matrices (RACI/RAPID maintained as living documents).

The size progression is a monotonic increase in role specialization, with two inflection points: (a) the first audited customer or first SOX-scoped contract (forces formalization of compliance gate + recorder + external auditor), and (b) crossing ~150 people (forces explicit liaison/rep roles because informal communication breaks).

2.2 Industry-driven hardening

Audit-driven separations are not a uniform overlay — they harden specific role pairs depending on the regime.

Audit / regulatory regimeHardens this separationMechanismCompensating controls allowed for small operators
SOX 404 (US public co. financial reporting)Proposer ≠ Approver on any control that touches financial reports; Custody ≠ Authorization ≠ Recording ≠ ReconciliationAt its core, SOX separation of duties is the practice of splitting responsibilities for initiating, authorizing, and reviewing critical transactions, ensuring no single user can both perpetrate and conceal errors or fraud.; Spell it out. Who drafts entries, who approves, who reconciles. Without clear lines, overlaps creep in.Dual authorization above threshold; Require dual authorization for payments above pre-set thresholds in both AP systems and banking portals, with approvals routed to the owner, CEO or a board member when there is only one finance staffer. Rotate responsibilities: Periodic rotation or temporary reassignment of duties and mandatory vacations can surface issues and reduce fraud risk. Seek external support when needed: If necessary, hire outside resources such as a part-time controller, a CPA firm, or outsourced accounts payable services.
J-SOX (Japan FIEA)Same as SOX but adds Response to IT as a 6th component; entity-level vs process-level controls separatedJ-SOX framework includes an element of "Response to IT" in addition to five COSO elements… Evaluate company-level internal controls. The list of elements is similar to COSO, with the addition of "Response to Information Technology."Documentation in Japanese; J-SOX tends to emphasize principles-based standards, providing companies more flexibility in designing and implementing internal control procedures.
ISO 27001 A.5.3 (info-sec)Requester ≠ Approver of access; Developer ≠ Production deployer; Admin ≠ Auditor of admin actionsPractical implementation of SoD involves separating the roles of "requesting," "authorising," and "executing" within a technical or financial workflow. Common examples include separating code development from production deployment, separating access requests from approval, and separating invoice creation from payment authorisation. When staff numbers are limited, segregation of duties is achieved by implementing "compensating controls" that provide oversight where role separation is not physically possible.Robust logging of all privileged activities, mandatory peer reviews for code or configuration changes, and regular retrospective reviews of sensitive actions by management or an external party.; approximately 95% of auditors accept documented "Second Pair of Eyes" verification as a valid technical mitigation.
EU AI Act Art. 14 (high-risk AI)Human oversight role must be distinct from AI deployer; for biometric ID, at least two humans must verify before actionFor high-risk AI systems referred to in point 1(a) of Annex III, the measures referred to in paragraph 3 of this Article shall be such as to ensure that, in addition, no action or decision is taken by the deployer on the basis of the identification resulting from the system unless that identification has been separately verified and confirmed by at least two natural persons with the necessary competence, training and authority.Proportionate to risk; Human oversight shall aim to prevent or minimise the risks to health, safety or fundamental rights that may emerge when a high-risk AI system is used in accordance with its intended purpose or under conditions of reasonably foreseeable misuse. Most B2B SaaS ADRs are not high-risk AI systems unless the SaaS itself classifies as such.
OECD CG / corporate governance codeBoard ≠ management; conflict-of-interest holder must recuseWhere a material interest has been declared, it is good practice for that person not to be involved in any decision involving the transaction or matter.Disclosure + recusal log

2.3 Industry overlay (where the same regime stacks differently)

IndustryDominant audit regime(s)Distinctive role(s)Tendency
SaaS (B2B/SMB)SOC 2 (voluntary), ISO 27001 if enterprise customers; GDPR / data-protection officerDPO / Security partner as Agree-role on data-touching ADRsLight by default; hardens at first enterprise deal
System Integrator (SI)Contractual; sometimes SOC 2 inheritedProject Architect ≠ Implementation LeadProject-bounded role tables
Finance / FinTech / Accounting SaaSSOX 404 if public; J-SOX if Japanese listing; PCI DSS if cards; banking regulatorsCompliance Officer with formal Agree veto; Internal AuditHeavy. SoD hardened; "Segregation of duties is straightforward for manual journal entries: the approver must differ from the preparer, typically someone at a higher level"
ManufacturingSOX (if public) + J-SOX (if JP); ISO 9001 quality; ISO 27001 for ITQuality engineer as separate Reviewer; Change Control Board for engineering changesHeavy on change-control; process-level controls extensive ("process-level controls related to sales, accounts receivable (AR), and inventory should receive special attention in the case of manufacturing companies")

2.4 Where a solo operator legally / safely can collapse roles

  • Internal architectural choices that do not touch customer financial data, do not change a control mapped to SOX/J-SOX scope, do not alter access to customer data, and are reversible (Type-2). Proposer = Reviewer = Decider = Implementer is acceptable here, if recorded.
  • Bezos Type-2 decisions in general: Type 2 decisions can and should be made quickly by high judgment individuals or small groups.

2.5 Where a solo operator legally / safely cannot collapse roles (even with AI)

  • Any change to a control that is mapped to financial reporting scope if the company falls under SOX, J-SOX, or a contractual audit requirement (e.g., SOC 2 in scope for the customer's audit). The Compliance attestor role must be external.
  • Access provisioning changes under ISO 27001 A.5.3 where the same identity would both request and approve elevated access.
  • In an accounting product specifically: ADRs that change how customer journal entries are validated, posted, or reconciled. Even though the vendor may be small, the control affects audited customer books. SOX compliance extends to any system that serves as an input to accounting records and financial statements.

Q3. Embedding AI agents as formal decision roles

3.1 The canonical multi-agent role split

Across Anthropic's research system, LangGraph, CrewAI, AutoGen, and OpenAI Swarm/Agents SDK, four agent role-archetypes recur:

  1. Orchestrator / Supervisor / Lead. A coordinating agent (often a more capable model) that plans, decomposes, and routes. Anthropic: Our Research system uses a multi-agent architecture with an orchestrator-worker pattern, where a lead agent coordinates the process while delegating to specialized subagents that operate in parallel. LangGraph supervisor: Supervisor node — an LLM node that reads the full conversation history and decides which specialist to call next, or whether the question has been fully answered. The supervisor never adds a message visible to the user; it only controls routing.

  2. Worker / Specialist / Subagent. Domain-scoped agents that perform a single class of task in their own isolated context. Each subagent gets a self-contained task description, an output format, and a fresh context window. It doesn't know the other subagents exist. It cannot coordinate with them mid-task.

  3. Verifier / Critic / Reviewer. A separately-instantiated agent whose only job is to check the artifact against requirements, with no shared context: The verifier pattern uses an independent agent to review generated output, with no shared context from the generation step. This prevents the reviewer from inheriting the generator's errors. In Anthropic's system the Citation Agent plays this verification role: Citation/Review Agent: Verifies the output before returning it to the user.

  4. Human-in-the-loop Approver. Frameworks now ship this as a first-class primitive. LangGraph: The Human-in-the-Loop (HITL) middleware lets you add human oversight to agent tool calls. When a model proposes an action that might require review—for example, writing to a file or executing SQL—the middleware can pause execution and wait for a decision… A human decision then determines what happens next: the action can be approved as-is (approve), modified before running (edit), rejected with feedback (reject), or responded to directly (respond). CrewAI / AutoGen: For example, in an automated data analysis pipeline, CrewAI could generate a summary and then pause for a human analyst to confirm the findings before sending the summary to stakeholders. From the developer's perspective, enabling human-in-the-loop is often as simple as marking a task or using a particular task type that requires human confirmation… AutoGen also supports human involvement, but it manifests as part of the agent conversation itself.

OpenAI Swarm formalizes the role-handoff as a primitive: A routine is a set of instructions that agents follow to complete specific actions, while handoffs allow for seamless transitions between agents, each specializing in particular functions. CrewAI is built around role specialization: In CrewAI, each agent is assigned a specific role, such as researcher, writer, or reviewer, along with a defined objective. Tasks are distributed based on these roles, ensuring agents focus only on what they are best suited for.

3.2 The "no self-review" rule for AI

The same separation-of-duties principle that prevents human Proposer = Decider applies — empirically, more harshly — to AI.

Failure mode of self-review. When a model reviews its own output in the same context window, it carries its original assumptions into the review. If it misunderstood the requirements during generation, it'll evaluate the output against that misunderstanding — and judge it correct. Independent review with no shared context eliminates this problem… This pattern is sometimes called the generator-verifier loop or the critic pattern, depending on the context. The underlying logic is the same: separation of concerns, independent judgment, no shared cognitive bias.

Failure mode of naive LLM-as-judge. It exhibits position bias — defaulting to approving whatever it sees first. It has no external knowledge to verify against. And because the team now has "a review layer," they trust the outputs more than before. The reviewer didn't add a safety net. It added a failure surface. This isn't a one-off. LLM reviewers suffer from well-documented failure modes: position bias (favoring the first option presented), verbosity bias (rating longer responses higher regardless of accuracy), and sycophancy (agreeing with the generator's framing).

Failure mode of AI-only review loops on AI-generated content. Recent research demonstrates that LLM reviewers can be systematically fooled by fabricated content: despite provably sound aggregation mathematics, integrity checking systematically fails. This study reveals concrete vulnerabilities in AI-only publication loops and underscores the urgent need for defense-in-depth safeguards—including provenance verification, integrity-weighted scoring, and mandatory human oversight. Translated to ADRs: an AI gate that only sees AI-generated bodies cannot be the sole check.

3.3 Controls that prevent AI proposer = AI reviewer collapse

ControlWhat it doesSource
Fresh context for verifierVerifier receives requirements + artifact only, never the proposer's reasoning chainIndependent review with no shared context eliminates this problem.
Different model for verifierAvoids same-model biasAnthropic's research system: Opus as lead, Sonnet as workers, separate citation pass
Different persona / prompt"You are an adversarial code reviewer. Assume code is vulnerable. Find EVERY issue. Only approve if genuinely production-ready."Actor-critic pattern in code review
External signal (tests, retrieval, tools)Verifier has ground truth the proposer doesn'tReflexion breaks through this limitation — but only when you have an external signal. A code agent that runs tests, reflects on why they failed, and stores the insight in episodic memory goes from 67% to 91% pass rate on HumanEval… The tests provide ground truth the agent can't argue with. Chain-of-Verification works for factual claims by generating verification questions and answering them independently — often with tool access.
Modular agent decomposition by issue categorySplitting review agents by issue category, as in RevAgent, prevents semantic drift and improves specializationMulti-agent code review
Symbolic / heuristic validatorsSymbolic validators in RepoAudit and approval heuristics in CodeCureAgent reduce spurious comments and false positivesRepoAudit, CodeCureAgent
Adversarial author / debateA second agent argues the opposite, forcing the reviewer to defendAn adversarial Author agent validates or rebuts each proposed weakness at a granular level… mechanisms to detect messaging loops or protocol violations.
Human-in-the-loop interrupt on consequential toolsHard stop before destructive actionLangGraph HITL middleware (Fully autonomous agents that take consequential actions without human review are not acceptable in regulated industries or for high-stakes decisions. LangGraph's interrupt() mechanism provides a clean, resumable pause point)

3.4 Audit-trail conventions for AI-vs-human contribution

RequirementWhat to logSource
Attribution to human authorityEvery agent action ties to an initiating humanIt ties every agent action to a human source of authority. The organization should be able to identify who initiated the workflow, on whose behalf the system acted, and what permissions applied at that moment.
Model + prompt provenancePinned model revision, system-prompt hash, temperature, tool registry versionDecision identifier and correlation key… Model and prompt provenance. The exact model revision — a pinned Claude or GPT-5 version — plus the system prompt hash, temperature, and tool registry version in effect at that moment.
Inputs snapshotted at decision timeContext, retrievals, source documents — not mutable referencesInputs as seen at decision time. The retrieved context, the documents pulled from your vector store, the Salesforce or Zendesk fields read, and the user input, snapshotted rather than referenced by a mutable pointer.
Per-agent role attributionWhich agent role contributed which partSession-level logging that tracks model assignments per agent role. An architect agent using one model and a developer agent using another should be clearly distinguishable in the logs.
Tamper-evident storageAppend-only, hash-chained, ideally with qualified timestampsA compliant AI agent audit trail is operation-level, attribution-complete, tamper-evident, and real-time. It records what specific regulated data was accessed, by which authenticated agent, under which human authorization, performing which operation, with what policy outcome, at what timestamp — for every interaction.; ISO/IEC 42001 standard… For engineering teams, this means your logging pipeline must support Non-Repudiation providing cryptographic proof that an agent's audit log has not been tampered with or altered.
EU AI Act Art. 12 loggingLifetime event logging for high-risk AIarticle 12 of the EU AI Act requires automatic event logging across the entire lifetime of the system, and traditional application logs do not meet that bar, because they are mutable by the operator and therefore not admissible.

3.5 AI role-pattern catalog (with anti-patterns)

Pattern A — Orchestrator + Workers + Independent Verifier (Anthropic-style). A lead plans, workers run in parallel in isolated contexts, a separate verifier checks. Claude's Research is an orchestrator-worker system: a lead agent plans, spins up 3-5 specialized subagents in parallel, and synthesizes their findings with a separate citation pass. Anti-pattern A1 (Single agent multi-hatting): one agent prompted to "first propose, then critique" — fails the no-self-review test.

Pattern B — Role-Crew (CrewAI-style). Pre-declared agent roles (Researcher, Writer, Reviewer, Manager) with assigned tasks. A research agent focuses on gathering information. A critique agent validates outputs. An execution agent handles API calls. Together, they can tackle problems that would break a single model. Anti-pattern B1 (Role-name without role-isolation): different names but shared memory/context = effectively one agent.

Pattern C — Conversational debate (AutoGen-style). Solver + Critic agents talk until they converge. For example, you might have a Solver agent and a Critic agent that talk to each other until they agree on an answer. Anti-pattern C1 (Convergence-by-fatigue): as shown in the LangGraph supervisor case where "Forty-seven iterations later, I got an alert that we'd burned $180 in API costs on a single user request" — debate loops need MAX_HOPS guards.

Pattern D — Graph-state with HITL interrupts (LangGraph-style). Explicit nodes/edges; interrupt() for human approval. human-in-the-loop interrupts via interrupt() (agents pause for approval at critical decision points)… The human-in-the-loop pattern is particularly critical for enterprise customers. Anti-pattern D1 (Interrupt at the wrong gate): asking the human only at the end — automation bias has already locked in the framing. The EU AI Act recital warns: to remain aware of the possible tendency of automatically relying or over-relying on the output produced by a high-risk AI system (automation bias), in particular for high-risk AI systems used to provide information or recommendations for decisions to be taken by natural persons.

Pattern E — Handoff routine (OpenAI Swarm / Agents SDK). Lightweight handoff between specialist agents via function returns. coordination happens through explicit handoffs when an agent finishes its task or reaches its limits. This clarity prevents the endless loops that plague implicit systems. Anti-pattern E1 (Implicit state): stateless handoffs lose decision context — Swarm is explicit about being not production-ready precisely because It has no session management, no guardrails, no observability, and is not actively maintained. Use the Agents SDK for production systems.

General anti-patterns observed across frameworks:

  • A-P-X1 Reviewer rubber-stamping AI output. Cited evidence: the reviewer approves hallucinated drug interactions 40% of the time.
  • A-P-X2 No human gate on consequential actions. Even where Article 14 doesn't strictly require pre-action human review, Article 14 effectively creates a three-tier oversight framework: Understand → Monitor → Intervene/Halt. A pipeline with no Halt is one fault from a runaway.
  • A-P-X3 Git commits hide AI authorship. Most git histories show the human who committed the code, not the AI that wrote it. When an agent generates code and a developer commits it, the provenance is lost… Commit metadata must include the model, session, and work item that produced the code.
  • A-P-X4 Logs without snapshotting inputs. Mutable references break reconstruction (Q3.4 above).

Q4. Compression strategy for n=1 + AI

4.1 Decision matrix — what can / cannot be collapsed

Using the 15-row role table from Q1 as the source. Three contexts are evaluated: (a) current state (founder + Claude Code clones, no audit exposure beyond informal SOC 2 prep), (b) +1 junior engineer joining 2026-10, (c) first SOX/J-SOX-scoped customer or contractual SOC 2 audit.

#RoleSolo founder + AI (no audit)+1 junior (2026-10)First SOX/J-SOX customer or SOC 2 audit
1Proposer / Recommender✅ Founder OR AI drp/main/doc clone (proposer-class). Log which.✅ Junior or founder, or AI clone.✅ Same; provenance-logged.
2Input provider✅ AI gates (research, citation).✅ Same + junior.✅ Same.
3Reviewer / Critic (independent)⚠️ Must use AI clone with fresh context + different prompt + different model tier (Pattern A). Cannot be the proposer-clone.✅ Junior becomes the second "human eye" — strongest single upgrade.✅ Junior + AI cross-validation gate (already in pipeline).
4Compliance / Policy gate🟥 Must staff explicitly as an AI gate that operates on policy-as-code, not as ambient founder judgment. Use the existing "policy alignment" gate but log it as a first-class veto, not just a score.🟥 Same; junior cannot serve as compliance attestor on day 1.🟥 External attestor required for in-scope controls (CPA firm / auditor).
5Decider / Approver🟥 Founder only. Cannot collapse onto AI. Article 14 logic + RAPID Single-D + DACI rule.🟥 Founder only on Type-1; can delegate Type-2 to junior under written threshold.🟥 Founder for product-decision ADRs; co-signature for any ADR that changes a SOX-scoped control.
6Implementer / Performer✅ Founder + AI main clone.✅ Junior + AI.✅ Same; but Implementer ≠ Approver on the same change (SOX).
7Recorder / Secretary✅ AI doc clone with deterministic templating (slug, numbering, body-gen gates already exist).✅ Same.✅ Same, but logs must be tamper-evident (immutable storage).
8Informed broadcast✅ Automated changelog / digest.✅ Same.✅ Disclosure obligation may add timing / format constraints.
9Out-of-the-loop declaration🟡 Optional. Useful if you want to suppress AI-clone broadcasting to irrelevant doc-domains.🟡 Optional.🟡 Optional.
10Facilitator➖ Not needed at n=1 (no meeting to facilitate).🟡 Useful for synchronous review of high-stakes ADRs (the Type-1 architecture-clinic pattern).🟡 Same.
11Cross-link / Liaison➖ Not needed at n=1.🟡 Junior may become liaison to external advisors.✅ Compliance liaison to external auditor needed.
12Compliance attestor (external)➖ Not applicable yet.➖ Still not applicable.🟥 Mandatory external party. Cannot be the founder.
13Retroactive Updater / Superseder🟥 Currently unstaffed in user's pipeline. Must add a quarterly recall cycle: AI doc clone surfaces ADRs >90 days old whose context/dependencies have changed; founder approves supersede vs. keep.✅ Junior owns the quarterly cycle.✅ Same; cycle becomes part of change-management evidence.
14Recaller / Knowledge surfacer🟥 Currently unstaffed. Must add: at every new ADR's Proposer step, the doc clone must produce a citation set of prior related ADRs before the body is generated. (This is a pre-gate enhancement.)✅ Junior cross-checks.✅ Same.
15Reader (consumer)✅ Founder, future junior, future auditor.✅ Same.✅ Auditor adds external consumer.

Legend. ✅ safe to collapse; ⚠️ collapse with compensating controls; 🟥 cannot collapse (must staff); 🟡 optional / context-dependent; ➖ not yet relevant.

4.2 Three-tier compression principle

  • Tier 1 — Cannot collapse (must-staff, even at n=1):

    • #5 Decider (Type-1 ADRs)
    • #4 Compliance/Policy gate as a first-class gate, not ambient judgment
    • #13 Retroactive Updater cycle and #14 Recaller pre-gate
  • Tier 2 — Collapse with compensating controls:

    • #3 Reviewer/Critic — collapse onto AI only if (a) fresh context, (b) different prompt persona, (c) different model where possible, (d) external signal (lint/test/policy check) included
    • #1 Proposer + #6 Implementer — same human/AI is acceptable, but #5 Decider must be distinct from both
    • #7 Recorder — AI is fine if tamper-evident logging
  • Tier 3 — Safe to collapse without controls (Type-2 ADRs only):

    • #2 Input, #8 Informed, #15 Reader can all collapse onto the same operator+AI for low-stakes reversible ADRs

4.3 The four-eyes problem at n=1

The classical four-eye principle assumes two humans. At n=1, it must be approximated by time-staggered or context-staggered AI second opinion + human final confirmation. Practical translation:

Classical 4-eye stepn=1 substitute
Person A draftsFounder or proposer-AI drafts
Person B reviews independentlyReviewer-AI on a fresh context, different persona, different model tier; plus mandatory "sleep on it" delay of ≥24h for Type-1 ADRs
Person A and B sign offFounder reviews the AI-reviewer's report on the AI-proposer's draft. The founder's job is not to redo either role; it is to verify the disagreement-vs-agreement pattern and stress-test edge cases.
Audit log shows two identitiesAudit log shows: (proposer-clone-ID, model-version, prompt-hash) ; (reviewer-clone-ID, model-version, prompt-hash, fresh-context-true) ; (founder-ID, decision, rationale) — three identities, one human.

This is consistent with the EU AI Act's structural intent: it requires that humans have the capability to monitor, understand, intervene in, and halt the AI — and that the system is designed to make this possible — the human is the halt-and-decide role, not the do-everything role.

4.4 When the junior engineer joins (2026-10)

The junior should be slotted first into the Reviewer/Critic (#3) and Retroactive Updater (#13) roles, not into Proposer or Implementer. Reasons:

  1. Independent review is the role the user's pipeline currently approximates worst (everything is the same human's brain at the moment of acceptance).
  2. In smaller organisations or start-ups where talent overlaps, it's expected that some team members must take on multiple responsibilities; the key is to enforce transparency, oversight, and regular review.
  3. Retroactive review is a great onboarding task — the junior learns the system by reading and re-evaluating old ADRs.

The 2026-10 timing matters: a human reviewer before any audit-exposure inflection point makes the pipeline auditable cheaper than retrofitting compensating controls later.

4.5 When (not if) audit exposure arrives

The first SOX/J-SOX-scoped customer triggers three irreversible role additions:

  1. External compliance attestor (CPA firm, partial outsourcing acceptable: If necessary, hire outside resources such as a part-time controller, a CPA firm, or outsourced accounts payable services.).
  2. Documented SoD matrix mapping ADR-change types to required separations. When perfect segregation isn't possible, implement a documented "Compensating Control"—like a mandatory secondary review by a senior executive—to mitigate risk and provide audit evidence.
  3. Tamper-evident audit trail for all ADRs that change controls in scope. (See Q3.4 logging conventions.)

Must-have (do before scaling, ideally before junior arrives)

  1. Add an explicit Recaller pre-gate (Role #14). Before the body generator runs on a new ADR, the doc clone must produce a structured citation set of prior related ADRs (semantic search + tag match). This addresses the "no explicit recaller actor" gap directly and prevents inadvertent contradiction of prior decisions.

  2. Promote the existing "policy alignment" gate to a first-class Compliance/Policy veto (Role #4). Today it scores; it should block on policy-as-code matches. Make the veto criterion auditable (rule ID + version, not just a score). Aligns with RAPID "Agree" discipline (A small number of roles with limited veto rights on clearly defined grounds (e.g., legal compliance, financial controls, brand/ethics, security).) and ISO 27001 A.5.3 separation of design vs. review (The person who designs and implements a control is rarely the right person to assess whether it works as intended; objectivity is hard to maintain when reviewing your own work.).

  3. Enforce Verifier-pattern fresh-context for the cross-validation gate (Role #3). Confirm that the reviewer-clone receives only requirements + artifact, never the proposer-clone's chain-of-thought. This is the single most important "AI-as-role" hygiene control (Q3.2 evidence).

  4. Tag every gate output with (clone-ID, model-version, prompt-hash, fresh-context bool). Without this, role separation cannot be evidenced. (Q3.4 audit conventions.)

  5. Codify a Type-1 / Type-2 ADR tag. Apply the full 15-role pipeline only to Type-1; allow aggressive collapse on Type-2 (Bezos's anti-bloat rule). The triage gate already exists — surface its T1/T2 verdict as a first-class artifact.

Should-have (do within 6 months)

  1. Add a quarterly Retroactive Updater cycle (Role #13). Doc clone produces a "stale ADR" report monthly; founder reviews quarterly. After 2026-10, junior owns the cycle.

  2. Add a sleep-on-it delay for Type-1 acceptance. Replace synchronous Accept/Reject with a 24–48h hold during which the reviewer-clone re-runs on a new fresh context. Approximates 4-eye separation temporally when it can't be done by headcount.

  3. Document a Compensating Controls Register. A short document listing every role that is collapsed onto the founder, the compensating control in place, and the review cadence. This is what an ISO-27001 / SOC 2 auditor expects to see for a small team. Document the conflicts. A short segregation-of-duties analysis listing the roles in which one person handles activities that should ideally be separated, the resulting risk, and how you manage it is genuinely useful evidence.

  4. Make the 13 LLM gates' identities explicit in the ADR header. Which gates ran, which version, which pass/fail — turns the existing gates into named role-fillers for audit and reconstructability.

Nice-to-have (when junior is settled or before scaling team beyond ~3)

  1. Add an explicit Facilitator role for Type-1 ADRs. A standing 30-minute synchronous "architecture clinic" before Accept, in the Spotify mold (Establish lightweight architectural review (e.g., weekly architecture clinic) to address cross‑tribe decisions.). For n=2 humans, this is the founder + junior; the AI doc clone is the Secretary.

  2. Introduce a Rep-Link-equivalent across drp / main / doc clone domains. Currently each clone is scoped; nothing routinely surfaces tensions between clone domains (e.g., a drp-domain ADR that has implications for the doc-domain). A scheduled cross-clone reconciliation pass would harden this.

  3. Pre-register Out-of-the-loop declarations (CAIRO 'O'). Useful once you have multiple clones to prevent any one of them from generating unsolicited input on the wrong domain.

  4. Externalize the Compliance attestor. Pre-engage a CPA / SOC 2 firm before the first regulated contract so that the role exists before it is needed under pressure. Lead-time on this is typically 6–12 months for first SOC 2 Type 2.


References

Decision-rights frameworks

Bezos Type-1/Type-2

Holacracy / Sociocracy 3.0 / Spotify

Audit / governance regimes

ADR practice

AI multi-agent role decomposition

AI verifier / critic patterns and anti-patterns

AI audit trails and provenance

EU AI Act human oversight