# Database Strategy — Mongo vs Postgres Assessment **Status:** RESOLVED — Full PostgreSQL migration complete as of 2026-06-06, backend v2.9.12. Document retained as historical reference. **Owner:** nick + claude **Decision:** Proceed with a staged hybrid migration, not an immediate full cutover. > [!success] Migration Complete — 2026-06-06 > The migration to PostgreSQL is **complete** as of backend v2.9.12. MongoDB and Mongoose have been fully removed from the runtime codebase. This document is retained as historical context for the assessment and decision-making process. --- ## TL;DR Amanat still runs on MongoDB (primary store) + Redis (cache/sessions/rate limits). Backend `2.6.79` adds Postgres 18 support, Drizzle schemas/migrations, repository implementations, backfill/verify tooling, and conditional `payment_quotes` persistence, but this is **not** a full runtime cutover. **Current recommendation:** continue the staged hybrid migration. Keep Mongo authoritative for live traffic until each domain is wired through the repository layer, backfilled, dual-written, shadow-read, and explicitly flipped. See [[Postgres Runtime Cutover Status]] for the current line between code that can use Postgres and code that still uses Mongo. --- ## What we run today | Store | Use | Notes | |---|---|---| | MongoDB (Mongoose 8.x) | Primary runtime store — normal domain traffic | 22 models, ~454 query call sites across 171 backend TS files | | PostgreSQL 18 + Drizzle | Migration target and conditional oracle quote store | Schemas/migrations through `0008`, repo implementations, backfill/verify tooling; broad service wiring still pending | | Redis | Sessions, cache, rate limits (paymentLimiter etc.) | Not in scope for any migration. Keep as-is either way. | ### Current Postgres implementation state (2026-05-31) | Implemented | Not yet cut over | |---|---| | `src/db/client.ts` fail-fast PG client, Drizzle schema/index barrel, migrations through `0008`, `id_map`, `pg_dualwrite_gaps`, `payment_quotes` | Service layer still imports Mongoose models directly; no broad runtime use of `createRepositories()` / `get*Repo()` factory | | Drizzle/Mongo/Dual repository classes for user, payment, points, marketplace | Auth, marketplace, payment, wallet, points, chat, notification, dispute, and admin paths still use Mongoose directly | | Backfill and verification scripts guarded by `MIGRATION_PG_URL` | Backfills are not auto-run and no domain is verified as PG-authoritative | | Oracle quote persistence can write PG `payment_quotes` when `ORACLE_QUOTING_ENABLED=true` | Payment records themselves are still created/updated in Mongo; PG quote insert depends on a resolvable PG parent row | ### Mongoose models (22) Ranked by how naturally they map to a relational schema: | Tier | Models | Relational fit | |---|---|---| | **Core financial** | `Payment`, `FundsLedgerEntry`, `PurchaseRequest`, `DerivedDestination`, `Dispute` | Strong. These are where FK constraints + ACID earn their keep. The orphan-payment deletion bug we hit on 2026-05-28 (`provider:` filter missing) lives here — an FK would have prevented it structurally. | | **Marketplace** | `SellerOffer`, `RequestTemplate`, `Category`, `Address`, `Review` | Strong. Already relational in shape. | | **Identity** | `User`, `TelegramLink`, `TelegramSession`, `TempVerification`, `TrezorAccount` | Strong. Clean 1-to-many. | | **Document-shaped** | `Chat`, `Notification`, `BlogPost`, `PointTransaction`, `LevelConfig`, `ShopSettings` | Weak. Chat especially — message arrays prefer either Mongo or Postgres JSONB. | ### Mongo-specific patterns we lean on These are the patterns that get expensive to migrate: - **Atomic upsert counters** — `Counter.findByIdAndUpdate({_id:'derived_destination_index'}, {$inc:{seq:1}}, {new:true, upsert:true})` in `derivedDestinations.ts`. Postgres equivalent is a `SERIAL` column or `nextval('seq')`, trivial — but every existing call site has to change. - **Embedded `metadata` blobs** — `Payment.metadata.requestNetworkData`, `.derivedDestination`, `.transactionSafety`. Used heavily for RN raw payloads and per-payment overrides. Two migration paths in Postgres: JSONB column (cheap, loses indexed query-ability) or normalized side tables (lots of work, lots of joins). - **Single-document atomicity assumption** — `grep -rE 'startSession|withTransaction'` finds **1 file** in the codebase using Mongo transactions. The remaining ~454 query sites implicitly rely on single-document atomicity. Going relational forces explicit transaction demarcation everywhere money moves; this is where post-migration bugs hide. - **Aggregation pipelines** — 11 files use `.aggregate()`. Each is a custom rewrite to SQL. --- ## Cost of a full migration One-engineer-equivalent, full-time, not parallel with feature work: | Phase | Scope | Estimate | |---|---|---| | Schema design + ERD | 22 models → relational schema, decide JSONB vs normalized for each `metadata` field | 1–2 weeks | | ORM swap (Prisma/Drizzle/TypeORM) | Rewrite 22 models, 454 query sites. ~80% mechanical, ~20% (aggregations, atomic upserts) need genuine rethinking | 6–10 weeks | | Data backfill scripts | Mongo → Postgres ETL per collection. ObjectId → uuid/int FK resolution, embedded subdoc unrolling | 2–3 weeks | | Cutover infra | Dual-write window, shadow reads, rollback plan, point-in-time backups | 1–2 weeks | | Test fix-up | 36 backend test files mock/seed Mongo; rewrite harness, fixtures, in-memory DB | 2–3 weeks | | Stabilization | Production incidents you didn't predict; the long tail | 2–4 weeks | | **Total** | | **14–24 weeks (3.5–6 months)** | ### Multipliers specific to this codebase - Only 1 file uses Mongo transactions today → most boundaries are implicit. Going relational means *finding* and explicitly wrapping every multi-row money operation. High bug yield. - Heavy `metadata` blob usage → either lose query-ability (JSONB) or pay normalization cost (side tables + joins everywhere). - Multiple agents (nick + claude + kimi + moojttaba) commit weekly. A 4-month migration branch will rot constantly; rebasing it against a fast-moving main is a tax on every other feature. - 36 test files all assume Mongo. Either keep both DBs in CI during transition, or rewrite the whole test harness up front. --- ## What we'd actually gain Honest accounting: | Win | Real value | |---|---| | FK constraints | Would have caught the 2026-05-28 orphan-payment bug (Payment cleanup with missing `provider:` filter). Will catch similar bugs in the future. | | Multi-row ACID | Real value for escrow release + dispute resolution + payment-to-request creation. Today these rely on app-level invariants. | | Audit / financial reporting | SQL is much friendlier for accountants, auditors, and ad-hoc analytical queries. | | Mature tooling | pg_dump, point-in-time recovery, logical replication, Metabase/Superset integration. | | Hiring | More backend engineers know SQL well than Mongo well. | | Non-win (claimed but not real) | Why it doesn't materialize | |---|---| | "Better performance" | Mongo handles this app's load fine; we're nowhere near needing it to scale further. | | "Better schemas" | Mongoose already enforces schemas at the app layer. The structural integrity gain is FKs, not types. | | "Fewer bugs" | Most bugs we've hit (`rn_webhook_event_field`, `backend_rate_limits`, `woodpecker_silent_build_fail`, telegram parse_mode) are application logic, not DB choice. Postgres wouldn't have caught any of them. | --- ## The structurally better path: targeted hardening (~2 weeks) Get most of the relational wins without the migration: 1. **Append-only ledger as source of truth.** Promote `FundsLedgerEntry` (or a new collection) to the authoritative record of every money movement. Strict invariants enforced in a single service. Becomes the audit log accountants and disputes consume. 2. **Explicit transaction boundaries.** Identify the ~5 places where multi-collection atomicity actually matters: Payment + PurchaseRequest creation, escrow release, dispute resolution, sweep + DerivedDestination update, refund. Wrap each in `mongoose.startSession() + session.withTransaction(...)`. This requires Mongo to be a replica set in prod (which it already is for our deployment). 3. **App-layer FK enforcement.** Mongoose `pre('save')` and `pre('deleteOne')` hooks that verify referenced documents exist before mutating. Catches the orphan-deletion class of bug. Cheap. 4. **Cleanup-query lint.** Codify the [[feedback-payment-cleanup-provider-filter]] rule: any `Payment.find()/.deleteMany()/.updateMany()` over the payments collection without a `provider:` filter is a bug. Custom ESLint rule or just a grep in CI. Estimated cost: ~2 weeks. Catches the bugs that actually hurt. Leaves the migration option open. --- ## Partial-migration option: dual-DB for financial models only A narrower question worth its own analysis: *what if we keep Mongo for the bulk of the app but move the financial/ledger operations to Postgres just to get ACID where money is involved?* ### Reference-surface in the current backend | Model | Files referencing it | |---|---| | `Payment` | 33 | | `PurchaseRequest` | 25 | | `FundsLedgerEntry` | 4 | | `DerivedDestination` | 4 | | `Dispute` | 2 | That gives three natural scoping tiers, each with very different cost. ### Option 1 — Ledger only (~3–4 weeks) — **recommended dual-DB shape** Move just `FundsLedgerEntry` to Postgres. Keep everything else on Mongo. The ledger becomes the append-only authoritative record of every money movement, written through a single `LedgerService`. | Phase | Work | Estimate | |---|---|---| | Postgres infra | docker-compose, dev seed, prod provisioning, backups, PITR | 3–4 days | | Schema + Drizzle setup | One table + indexes, migrations | 2 days | | Service boundary | `LedgerService` is the only writer; everywhere else reads | 3–4 days | | Rewrite the 4 call sites | Mechanical | 2 days | | Outbox pattern | Mongo write → outbox row → worker drains into Postgres. Survives crashes between the two writes. | 4–5 days | | Reconciliation job | Nightly diff between ledger sum and Mongo-derived balances; alerts on drift | 2–3 days | | Tests | Harness for both stores, ~10 new tests | 4–5 days | | **Total** | | **3–4 weeks** | **What you get:** Audit-grade money trail, ACID guarantee on the ledger itself, SQL-driven reporting for finance/regulators. No FK constraints across stores (does NOT solve the FK-shaped bug class — Mongo entities still can't reference Postgres rows with integrity), but the *financial record* is bulletproof. **Risk:** The outbox is the load-bearing piece. If Mongo writes succeed and the worker crashes before the outbox drains, the ledger is briefly behind. Reconciliation closes the gap within 24h. Acceptable for typical regulatory regimes; not for high-frequency real-time settlement. **Reusable foundation:** The outbox + reconciliation pattern built here is the template if you later expand to Option 2. None of the work is wasted. ### Option 2 — Ledger + Payment + Dispute (~10–14 weeks) Move `FundsLedgerEntry` + `Payment` + `Dispute` to Postgres. Keep `PurchaseRequest`, `User`, marketplace data in Mongo. The hard part is not the 33 Payment refs — it's that **Payment refers to User, SellerOffer, PurchaseRequest, all of which live in Mongo**. Every cross-store join becomes an app-layer lookup. Queries like "find all Payments for users created last week" need a two-stage fetch. | Phase | Work | Estimate | |---|---|---| | Everything from Option 1 | | 3 weeks | | Payment + Dispute schema design | Including JSONB-vs-normalized for `Payment.metadata.requestNetworkData`, `.derivedDestination`, `.transactionSafety` | 1–2 weeks | | Rewrite 33 + 2 = 35 call sites | Mix of mechanical + `populate('userId')` → manual lookup conversions | 3–4 weeks | | Cross-store query helpers | Layer that fetches Payment from PG and enriches with User from Mongo. Pagination becomes painful. | 1–2 weeks | | Dual-store transactional discipline | Payment update + PurchaseRequest update needs outbox + saga | 2 weeks | | Tests rewrite | 36 test files, most touch Payment | 2 weeks | | Stabilization | Cross-store bugs you didn't predict | 1–2 weeks | | **Total** | | **10–14 weeks** | **What you get:** ACID across the entire payment lifecycle. But you've introduced a permanent cross-store consistency problem and queries got more complex everywhere. ### Option 3 — All five financial models (~16–20 weeks) Move all of `FundsLedgerEntry` + `Payment` + `PurchaseRequest` + `Dispute` + `DerivedDestination`. At this point you're approaching the full-migration cost (14–24 weeks) without the full-migration cleanliness — you still own a cross-store boundary, just relocated to the User/marketplace edge. **Skip this option.** If you're going this far, commit to the full migration plan in the section above instead of leaving an awkward two-store seam through the middle of the domain. ### Recommendation among dual-DB options **Option 1 (ledger only, 3–4 weeks).** Smallest blast radius, cleanest service boundary, 80% of the auditor/regulator/finance-team value. Postgres becomes the source of truth for "did money move," not for "what's the order status." Revisit Option 2 only if a specific compliance ask or repeated cross-Payment consistency bugs force it. **Avoid Option 2** unless there's a concrete forcing function. The permanent cross-store query pain is real and rarely worth it for the marginal ACID gain over Option 1 + good service discipline. ### How dual-DB Option 1 differs from "stay on Mongo + targeted hardening" The 2-week in-place hardening above (append-only ledger collection, `withTransaction` on the 5 money-paths, `pre('save')` FK hooks, cleanup-query lint) gets you a *Mongo-native* version of most of Option 1's wins. The reasons to do Option 1 anyway: - **Regulator/auditor specifically wants SQL** for ledger queries. - **Finance team wants Metabase/Superset/BigQuery sync** with relational primitives, not Mongo aggregations. - **A future financial product** (settlement netting, on-chain accounting export, multi-currency reconciliation) is on the roadmap and would be substantially easier in Postgres. If none of those apply yet, the 2-week targeted hardening is still the right first step. Option 1 builds on top of it cleanly. --- ## When to revisit (trigger conditions) Pull this doc out and re-evaluate when **any** of these fires: 1. **Compliance / audit requirement** — a regulator, payment partner, or auditor demands a relational ledger we can't easily produce from Mongo. 2. **Schema-flexibility cost has gone to zero** — feature velocity is no longer dominated by changing the shape of `Payment.metadata`, `RequestTemplate`, `PurchaseRequest`. If the schema has stabilized, the migration's main friction (rewriting too many evolving entities) is gone. 3. **The bug pattern has repeated** — we hit ≥3 incidents shaped like "missing referential integrity" or "no cross-collection transaction" within 6 months. Then the targeted hardening above wasn't enough and migration starts paying for itself. 4. **A green-field rewrite is happening anyway** — e.g. a major v2 architecture refactor, microservice split, or rewrite of the payments subsystem. Combine the migration with that work; don't do it standalone. 5. **Reporting needs blow up** — finance/ops team wants live SQL-driven dashboards and our Mongo aggregation pipelines + Metabase plugins can't keep up. If none of the above fires, **stay on Mongo**. --- ## If we ever do migrate — order of operations For when the trigger condition fires. Don't do it standalone — pair it with another large refactor. 1. Start with the **financial-tier models only** (Payment, FundsLedgerEntry, PurchaseRequest, DerivedDestination, Dispute). These are 5 of 22 models. Dual-store: Postgres for these, Mongo for the rest, with a sync layer or service-per-store boundary. 2. Validate for 3+ months on dev + prod-shadow before any cutover. 3. Migrate the marketplace + identity tiers next (10 more models). Document-shaped models (Chat, Notification, etc.) probably never need to migrate — they're happier in Mongo or as Postgres JSONB. 4. Use Drizzle or Prisma. Prefer Drizzle if you want migrations-as-code and don't want a heavy runtime; Prisma if the team prefers a higher-level abstraction. 5. **Don't** dual-write the same record. Pick one source of truth per model and don't compromise on it. --- ## Related - [[feedback-payment-cleanup-provider-filter]] — the bug that prompted this discussion (Payment cleanup missing `provider:` filter destroyed multi-seller cart records). - `PRD - Wallet, Multichain, Confirmations, AML, Trezor.md` — Task #7 (derived destinations) is the most Mongo-shaped feature we've shipped recently; reference for how atomic upserts and embedded metadata are used. - `01 - Architecture/Request Network In-House Checkout.md` — RN integration relies heavily on `Payment.metadata.requestNetworkData` blob storage.