Audit Report — `pg-money-core-migration.js` Workflow

Target: /Users/manwe/CascadeProjects/escrow/.claude/workflows/pg-money-core-migration.js Status: Not yet run. 8 phases (Preflight → Infra → Schema → Repos → Backfill → Tests → VerifyGate → SelfAudit). Adversarial verification: Every high/blocker finding was challenged against the actual script. All of them were refuted. The remaining open items are medium/low enforcement-gap concerns that were not independently re-verified (n/a).

[!warning] Historical scope This is a workflow audit, not the current runtime status. The backend branch now contains the generated Postgres layer, but Mongo remains authoritative for normal traffic and service-layer wiring is still incomplete. Use Postgres Runtime Cutover Status for the current code/runtime boundary.

1. Verdict: GO (with recommended changes)

The workflow cannot itself cause fraud or fund/order dataloss, and that is the load-bearing question. Three independent, code-level facts make it safe to run:

It never touches production data. SAFETY rule 1 (lines 34–36, 67) binds every agent to code/SQL/test generation only; any DB the harness spins up is an ephemeral local throwaway. The backfill scripts are generated as artifacts for humans, explicitly never auto-run (line 160), and are gated on a non-prod MIGRATION_PG_URL allowlist.
It cannot deploy. It works on an isolated branch feat/pg-money-core-migration cut from origin/main (lines 54–55, 91), is additive-only under backend/src/db/** (rule 2), never bumps version (rule 3), and never pushes / opens a PR / flips a flag (rule 4). Per the CI audit (BRANCH_001, DEPLOY_001), creating or committing to this branch fires no CI; only a human push to main/master or a v* tag deploys — outside the workflow's reach.
The worst realistic failure is recoverable. VerifyGate commits only on green typecheck+tests (lines 216–217), and to a local branch. Even the post-commit SelfAudit ordering (SAFETY_001, refuted) is benign because the commit is local and reversible; the verdict still surfaces to the human before any push.

Every adversarially-tested blocker/high claim was refuted because the corresponding safety constraint is present in the prompts (SAFETY rules 1–7, plus per-phase reinforcement on schema indexes line 127, transactions line 145, dual-write line 146, decimals line 126, backfill guard line 160). The legitimate residual risk is that these are prompt-level instructions, not code-enforced gates — an agent could under-implement one and VerifyGate (which only runs typecheck+tests) would not catch it. That is a quality/trust gap, not a safety gap, and it is fully contained by the mandatory human review before cutover. Hence: go, but tighten the gates below first so the human reviewer is handed verifiable evidence instead of having to re-derive it.

Confirmed blockers: 0.

2. Confirmed Blockers (must fix before first run)

None. All blocker/high-severity findings were adversarially verified and refuted — the safety contract they claimed was missing is in fact specified in the workflow prompts. The workflow is safe to run as-is. The items in §3 are improvements that raise trust and catch agent under-implementation; they are not gating.

ID	Why it is NOT a blocker
IDEMPOTENCY_INDEX_PRESERVATION / FK-010	Line 127 explicitly mandates the partial `WHERE provider='request.network' AND direction='in' AND status='pending'` and the ledger sparse-unique. Not enforced in code, but specified + audited by SelfAudit (line 240).
DECIMAL_PRECISION_TRUNCATION_RISK / FK-009	Line 126 mandates `numeric(38,18)`, never float; SelfAudit hunts float money (line 239). Token-decimal scale flagged as a risk to surface (line 131).
MULTI_DOC_WRITE / FK-011	Line 145 requires `db.transaction(...)` for payment+ledger, referral reward, dispute hold/release; SAFETY rule 7 repeats it.
LOCKFILE_001 / BRANCH_002 / SAFETY_002	Workflow never pushes; lockfile/`npm audit` are CI-merge-time concerns the human owns. Refuted as in-workflow blockers.
MIXED_ID / immutability / prod-guard / version-bump	All specified in prompts (lines 124, 48/line 70, 160, rule 3).

3. Recommended Changes Before Running (high / medium), by phase

These convert prompt-level promises into verifiable evidence so the human reviewer and VerifyGate can confirm them, and they fix the two genuine structural risks (factory race, test-skip-as-pass).

Phase	ID	Change	Severity
Preflight	PREFLIGHT_BRANCH_ALREADY_EXISTS_REUSE_RISK	After `git switch ${BRANCH}` (line 91, reuse path), run `git status --porcelain` on the target branch and STOP if dirty — re-running must not accumulate prior uncommitted work.	medium
Repos	PARALLEL_AGENTS_RACE / CONFLICT_001	Structural fix. Four parallel domain agents all "update `factory.ts`" (lines 138, 147); last writer wins, silently dropping 3 domains' wiring. Split Repos into: (a) parallel — each agent writes only its own repo files, no `factory.ts`; (b) sequential aggregator agent that builds one unified `factory.ts`. Then have VerifyGate assert `factory.ts` imports/exports all 4 domains.	medium
Repos	REPOS_PHASE_MONGO_WRAPPER_NOT_VERIFIED	Require the MongoRepo to carry the original Mongoose call as an inline comment (or a method→source map) so the human can confirm "verbatim, no behavior change" (line 144).	medium
Repos	FK-012	For multi-doc money writes, the dual-write "log+alert, don't throw" (line 146) can leave Mongo funds released with no PG ledger row during the dual-write window. Add an explicit escalation/alert requirement and a reconcile sweep for these gaps (covered by reconcile.ts, line 170).	medium
Schema	IDEMPOTENCY_INDEX_REPRODUCTION_NOT_TESTED / SCHEMA_AGENTS_INDEX_REPRODUCTION_NO_VERIFY / FK-001/FK-006/FK-008	Have each schema agent emit the generated SQL (or index/CHECK list) in its return. Add a schema-audit step in VerifyGate that greps the generated migration for: (a) the partial `WHERE` on the RN idempotency index, (b) ledger `idempotency_key IS NOT NULL`, (c) each Mixed-id CHECK constraint, (d) presence of every Mongo index. Missing partial-`WHERE` or CHECK → blocker.	high (where) / medium
Backfill	NO_PROD_BACKFILL_RUNAWAY_GUARD / FK-003/FK-004/FK-005/FK-007	Make the non-prod guard a whitelist (`localhost`/`127.0.0.1`/named staging hosts), not a `!includes('prod')` blacklist, and require a guard unit-test that aborts on a mock prod DSN. Add per-FK parent-completion checks (TrezorAccount `addresses[].paymentId`, PointTransaction `order`/`referredUser`, Category `parentId` string audit) before child upsert.	high (where) / medium
Backfill (verify)	VERIFICATION_CHECKSUM_PRECISION / LEDGER_IMMUTABILITY	In `checksums.ts` (line 168) compare fund sums as `::numeric` cast to text, not float. In `reconcile.ts` (line 170) add an assertion that ledger rows are never UPDATE/DELETE (replicating the Mongo pre-save immutability hook) — via a PG trigger or a repo guard plus a test.	medium / high (where)
Tests	TEST_SKIP_WITHOUT_TEST_PG_SILENTLY_PASSES / TEST_001	Structural fix. Tests "skip if no test PG" (line 190) but VerifyGate treats skip as pass. Return `{testsPassed, testsSkipped, skipReason}`; VerifyGate must fail the commit if `testsSkipped` is true, so money-safety tests are never silently bypassed.	medium
Tests	PAYMENT_CONFIRM_LEDGER_ATOMICITY / MISSING_CONSTRAINT_SERIALIZATION / DUAL_WRITE idempotency	Expand the atomicity test (line 182) to a concrete template: inject a mid-transaction ledger failure → assert payment status rolled back → retry → assert success (idempotent). Add a 2-concurrent-confirm test and document the required isolation level / `FOR UPDATE` locking to prevent double-release.	high (where) / medium
VerifyGate	VERSION_BUMP_DETECTION / VERSION_001	Add a hard diff check: `git diff ${BASE}...HEAD -- backend/package.json	grep '"version"'` → if non-empty, BLOCKER and do not commit. Cheap insurance against an accidental deploy-triggering bump.
SelfAudit	SELF_AUDIT_VERDICT_NOT_GATED	After SelfAudit, gate the final return: if `verdict === 'unsafe'` or `mustFixBeforeBackfill` is non-empty, return an error/flag prominently (the commit is local and reversible, so this is advisory, but it must not be buried in the JSON).	medium
New phase	NO_MANUAL_BACKFILL_RUNBOOK_GENERATED	Add a Documentation phase that emits `backend/src/db/BACKFILL_RUNBOOK.md`: dependency order, per-script invocation + env, checksum/row-count verification, dual-write enablement, soak/monitor, rollback. This is the cutover team's source of truth.	medium

4. What the Workflow Does Well (trust these parts)

Airtight non-prod / non-deploy posture. SAFETY rules 1–4 are stated globally and re-stated in every phase prompt. Branch off origin/main, additive-only, no version bump, no push, no auto-backfill. CI audit independently confirms the branch cannot trigger a deploy.
Correct scope and FK ordering. Tier A (money/orders) + Tier B parents (User, Category) migrated first; COLLECTIONS ordered parents-first (line 59); backfill runner enforces User, Category → PurchaseRequest, SellerOffer → Payment, ... (line 159). Out-of-scope set (Chat, Notification, Address, Review) is sensible.
Money-safety modeling is explicitly specified. numeric(38,18) not float (line 126); the exact RN partial-unique WHERE, ledger sparse-unique, and derived-destination uniqueness (line 127); Mixed-id discriminator+typed-FK+external-ref+CHECK pattern (line 124); ledger append-only (rule 6); multi-doc writes in real PG transactions (rule 7, line 145).
Dual-write direction is correct. Reads authoritative from Mongo, writes Mongo-first then PG idempotent upsert, PG failure does not block Mongo (line 146) — the safe ordering for a not-yet-cut-over store.
Real verification harness + reconciliation. Row counts, fund-sum checksums, shadow-read diffing, and a money-specific reconcile (released/refunded payments have matching ledger entries, no double-release, monotonic escrow state) — lines 167–170.
Layered safety net. A BULK(sonnet)/BRAIN(opus) model split that reserves Opus for the commit gate and the adversarial fund-safety SelfAudit (lines 16–20, 247), which hunts exactly the right failure modes (float money, dropped idempotency, non-transactional writes, collapsed Mixed ids, prod backfill).

5. Refuted / Non-Issues (do not re-raise)

All of the following were challenged and confirmed already handled by the workflow prompts or by the no-push/no-prod design. They are not action items:

Prod backfill guard, version-bump, client.ts throw-on-unset-PG_URL, ledger immutability, Mixed-id CHECK, idempotency index WHERE, decimal-as-numeric, multi-doc transactions — all specified in SAFETY rules 1–7 and per-phase prompts (lines 105, 124, 126, 127, 145, 146, 160, rule 3/6/7). The "not code-enforced" critique is valid as a trust improvement (see §3) but does not make the workflow unsafe to run.
SAFETY_001 (self-audit after commit): refuted — commit is to a local, reversible, non-deploying branch; the verdict still reaches the human.
LOCKFILE_001 / BRANCH_002 / SAFETY_002 (lockfile, npm audit): refuted as in-workflow blockers — the workflow never installs, never pushes; these are CI-merge-time concerns owned by the human at integration. (Note: CI is Gitea Actions, not Woodpecker as the SAFETY comment says — a cosmetic doc fix only.)
BRANCH_001 / BRANCH_003 / DEPLOY_001 / VERSION_001 / MERGE_001 / PARALLEL_001: refuted — branch isolation, additive package.json edits, and "never push" are operationally sound; the residual is human discipline at push time, covered by §6.

6. Pre-Run Checklist & Human-Only Gates

Before invoking the workflow

Apply the two structural §3 fixes first — they are cheap and prevent silent breakage:
- Repos factory.ts race: split into parallel-repo-files + sequential-aggregator.
- Tests skip-as-pass: return testsSkipped and have VerifyGate fail on skip.
Confirm a clean working tree on the repo (Preflight will STOP otherwise — line 90) and that origin is fetched.
Confirm branch state: if feat/pg-money-core-migration already exists, verify it has no stray uncommitted work (PREFLIGHT_BRANCH_ALREADY_EXISTS).
Provide an ephemeral local test PG (container/throwaway) so money-safety tests actually run instead of skipping. If you cannot, accept that VerifyGate (post-fix) will refuse to commit.
Verify parallel-agent scope is disjoint — per repo memory, the moojttaba agent pushes to the same branches. Ensure it is NOT touching user/payment/points/marketplace domains or backend/src/db/repositories/factory.ts.
Set no production env vars in the harness shell. Ensure PG_URL (if set) and MIGRATION_PG_URL point only at local/staging.

Gates enforced during the run

VerifyGate commits only on green typecheck + tests; otherwise leaves work uncommitted with verbatim blockers (lines 216–217). Trust this — but read the blockers array.
(After §3) VerifyGate also fails on version-bump diff, skipped tests, and missing partial-WHERE/CHECK in generated SQL.

After the run — human reviews REQUIRED before anything leaves the branch

Read selfAudit.verdict and selfAudit.mustFixBeforeBackfill (lines 222–248). Do not proceed if unsafe or non-empty must-fix.
Code-review the generated schema for: partial-unique WHERE clauses, ledger immutability enforcement, Mixed-id CHECK constraints, numeric (no float), and db.transaction(...) around payment+ledger / referral / dispute writes.
Confirm package.json version is unchanged and the lockfile situation is understood (update lockfiles + run npm/yarn audit at integration time, not in this workflow).

STAYS HUMAN-ONLY (workflow must never do these — and does not)

Running any backfill against real/staging data. Scripts are artifacts; a human runs them, in dependency order, against a whitelisted non-prod DSN, with --dry-run first and checksum verification after each step (use the §3 runbook).
Cutover / flipping any mongo|dual|pg rollout flag.
Pushing the branch, opening/merging a PR, tagging v* — any of which can trigger CI/deploy. Cherry-pick reviewed changes into a proper feature branch if needed; never push feat/pg-money-core-migration to main.

15 KiB Raw Blame History Unescape Escape

Audit Report — pg-money-core-migration.js Workflow