docs: sync from backend cab0719 - align request budget validation
@@ -0,0 +1,437 @@
---
title: MongoDB → PostgreSQL Migration Plan (Drizzle)
tags: [data-model, migration, postgres, drizzle, plan, runbook]
aliases: [Drizzle Migration Plan, PG Migration Plan]
created: 2026-05-31
companion: "[[MongoDB to PostgreSQL Migration Guide]]"
updated: 2026-05-31 for backend integrate-main-into-development@cab0719
---
|
||||
# MongoDB → PostgreSQL Migration Plan (Drizzle)
|
||||
|
||||
> [!abstract] What this is
|
||||
> The **execution plan** for the recommendation in [[MongoDB to PostgreSQL Migration Guide]]: a **hybrid target** (Postgres for the money/relational core, Mongo retained for Chat/Notification/TTL-session collections) reached via the **strangler pattern with dual-write**, using **Drizzle ORM** + **drizzle-kit** migrations.
|
||||
>
|
||||
> It is opinionated and concrete: a repository seam, an `id_map` bridge, Drizzle schema sketches for the hard cases (Mixed ids, embedded arrays, partial-unique idempotency, TTL), per-phase backfill/verify/cutover mechanics, and a rollback runbook. Where it references fields it uses the **real schema** from `backend/src/models/`.
|
||||
>
|
||||
> **Scope reminder:** partial migration (Phases 0–5) is the recommended stopping point — ≈16–28 engineer-weeks. Full migration of Chat/Notification/sessions is explicitly deferred.
|
||||
|
||||
> [!warning] Current implementation status
|
||||
> Backend `2.6.80` has completed the first implementation slice of this plan: Postgres/Drizzle infra, schemas/migrations through `0008`, `id_map`, `pg_dualwrite_gaps`, Drizzle/Mongo/Dual repo implementations, backfill/verify tooling, conditional oracle `payment_quotes` persistence, and the `PurchaseRequest`/`RequestTemplate` budget enum alignment with PG `budget_currency`. It has **not** completed service-layer wiring or runtime cutover. Mongo remains authoritative for normal traffic. See [[Postgres Runtime Cutover Status]].
|
||||
|
||||
---
|
||||
|
||||
## 0. Guiding principles
|
||||
|
||||
1. **Never cut over without a soak.** Every collection goes through backfill → dual-write → shadow-read verify → flip reads → soak → decommission. Rollback at any point before decommission = flip reads back to Mongo.
2. **The repository layer is the only thing that knows where data lives.** Services must stop calling Mongoose directly. This seam is what makes the swap invisible and per-collection reversible.
3. **Parents before children.** FK remapping flows through `id_map`; you cannot migrate `Payment` before `User` exists in PG with stable uuids.
4. **Money correctness is the point.** The migration's payoff is real ACID transactions around payment + ledger + dispute flows that today lean on Mongo per-document atomicity. Treat every money write as transactional from day one in PG.
5. **No feature work during migration.** No new fields, no behavior changes. A migration that also ships features cannot be verified by row-count + checksum equality.
6. **Mongo stays authoritative until cutover.** Dual-write writes to both stores; reads come from Mongo until a collection's shadow-read window is clean.
|
||||
|
||||
---
|
||||
|
||||
## 1. Target architecture
|
||||
|
||||
```
    ┌─────────────────────────────────────────────┐
    │                Service layer                │
    │ (marketplace, payment, dispute, points, …)  │
    └───────────────────────┬─────────────────────┘
                            │ calls interfaces only
    ┌───────────────────────▼─────────────────────┐
    │              Repository layer               │
    │  IUserRepo, IPaymentRepo, IPurchaseRepo, …  │
    │    ── feature-flagged per collection ──     │
    └───────┬───────────────────────────┬─────────┘
      reads/writes                reads/writes
            │                           │
┌───────────▼─────────┐     ┌───────────▼─────────┐
│  MongoRepo (today)  │     │  DrizzleRepo (new)  │
│  Mongoose models    │     │  Postgres + Drizzle │
└─────────────────────┘     └─────────────────────┘
            │                           │
      ┌─────▼─────┐               ┌─────▼─────┐
      │  MongoDB  │◄── id_map ───►│ Postgres  │
      └───────────┘   (bridge)    └───────────┘

    Permanent on Mongo: Chat, Notification, TelegramSession,
    TempVerification, TelegramLink-state. Redis untouched.
```
|
||||
|
||||
Each domain gets an interface (`IPaymentRepo`), a `MongoPaymentRepo` (wraps today's Mongoose calls verbatim), a `DrizzlePaymentRepo` (new), and a `DualWritePaymentRepo` (delegates reads to one, writes to both, behind a flag). A factory picks the implementation per collection from config:
|
||||
|
||||
```ts
// repositories/factory.ts
type Mode = 'mongo' | 'dual' | 'pg';

// env.* values are assumed to be validated as Mode | undefined upstream;
// a raw process.env string would not type-check against Mode.
const MODE: Record<string, Mode> = {
  user:    env.REPO_USER    ?? 'mongo',
  payment: env.REPO_PAYMENT ?? 'mongo',
  // …per collection
};

export const paymentRepo: IPaymentRepo =
  MODE.payment === 'pg'     ? new DrizzlePaymentRepo()
  : MODE.payment === 'dual' ? new DualWritePaymentRepo(new MongoPaymentRepo(), new DrizzlePaymentRepo())
  : new MongoPaymentRepo();
```
|
||||
|
||||
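A collection's migration is then two flag flips, `mongo → dual → pg`, each reversible by flipping back. The runbook in §8 also soaks with reads on PG while dual-write stays on, so the read source has to be switchable independently of this three-value mode.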
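
Because the mode comes from environment variables, a typo such as `REPO_PAYMENT=postgres` should fail at boot instead of silently selecting a store. The following is a minimal sketch of such a guard; `parseMode` and `isSafeFlip` are hypothetical helper names, and the "one step at a time" rule is an assumption drawn from the `mongo → dual → pg` sequence, not something the backend is known to enforce.

```typescript
type Mode = 'mongo' | 'dual' | 'pg';

// Ordered as the migration proceeds: mongo -> dual -> pg.
const MODES: readonly Mode[] = ['mongo', 'dual', 'pg'];

// Hypothetical helper: turn a raw env value into a Mode, defaulting to 'mongo'.
// Unknown values throw at boot rather than silently falling back.
function parseMode(name: string, raw: string | undefined): Mode {
  if (raw === undefined || raw === '') return 'mongo';
  const value = raw.trim().toLowerCase();
  if ((MODES as readonly string[]).includes(value)) return value as Mode;
  throw new Error(`${name} must be one of ${MODES.join(', ')}; got "${raw}"`);
}

// Assumed policy: a flag may move one step forward, or one step back for rollback.
function isSafeFlip(from: Mode, to: Mode): boolean {
  return Math.abs(MODES.indexOf(from) - MODES.indexOf(to)) === 1;
}
```

With a guard like this, the factory's `MODE` record can be built from `parseMode('REPO_PAYMENT', process.env.REPO_PAYMENT)` and stays correctly typed as `Mode`.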
|
||||
|
||||
---
|
||||
|
||||
## 2. Drizzle & infra setup (Phase 0)
|
||||
|
||||
### Packages
|
||||
```
|
||||
pnpm add drizzle-orm pg
|
||||
pnpm add -D drizzle-kit @types/pg
|
||||
```
|
||||
|
||||
### Layout
|
||||
```
|
||||
backend/src/db/
|
||||
schema/ # one file per table group
|
||||
users.ts
|
||||
payments.ts
|
||||
purchaseRequests.ts
|
||||
...
|
||||
idMap.ts
|
||||
index.ts # re-exports all tables + relations
|
||||
client.ts # drizzle(pg.Pool) singleton
|
||||
migrations/ # drizzle-kit generated SQL
|
||||
repositories/
|
||||
interfaces/ # IUserRepo, IPaymentRepo, …
|
||||
mongo/ # MongoUserRepo (wraps existing Mongoose)
|
||||
drizzle/ # DrizzleUserRepo
|
||||
dual/ # DualWriteUserRepo
|
||||
factory.ts
|
||||
backfill/ # per-collection batch copiers
|
||||
verify/ # row-count + checksum + shadow-read harness
|
||||
drizzle.config.ts
|
||||
```
|
||||
|
||||
### `drizzle.config.ts`
|
||||
```ts
|
||||
import { defineConfig } from 'drizzle-kit';
|
||||
export default defineConfig({
|
||||
schema: './src/db/schema/index.ts',
|
||||
out: './src/db/migrations',
|
||||
dialect: 'postgresql',
|
||||
dbCredentials: { url: process.env.PG_URL! },
|
||||
strict: true,
|
||||
verbose: true,
|
||||
});
|
||||
```
|
||||
|
||||
### Client
|
||||
```ts
|
||||
// src/db/client.ts
|
||||
import { drizzle } from 'drizzle-orm/node-postgres';
|
||||
import { Pool } from 'pg';
|
||||
import * as schema from './schema';
|
||||
export const pool = new Pool({ connectionString: process.env.PG_URL, max: 10 });
|
||||
export const db = drizzle(pool, { schema });
|
||||
```
|
||||
|
||||
> Mirror the current Mongo pool size (`maxPoolSize: 10` in `connection.ts`). Keep `mongoose.connect` alive in parallel — both drivers run for the whole migration.
|
||||
|
||||
### Migration workflow
|
||||
- Author tables in `schema/*.ts` → `pnpm drizzle-kit generate` → review the SQL in `migrations/` → `pnpm drizzle-kit migrate` in CI per environment.
|
||||
- **Migrations are versioned, reviewed, and reversible.** This is brand-new discipline — there is no migration framework today.
|
||||
|
||||
---
|
||||
|
||||
## 3. The `id_map` bridge
|
||||
|
||||
ObjectIds become uuids. Every legacy id is recorded so FKs can be remapped and dual-writes stay idempotent.
|
||||
|
||||
```ts
|
||||
// src/db/schema/idMap.ts
|
||||
import { pgTable, uuid, text, timestamp, uniqueIndex } from 'drizzle-orm/pg-core';
|
||||
export const idMap = pgTable('id_map', {
|
||||
collection: text('collection').notNull(), // 'users', 'payments', …
|
||||
legacyId: text('legacy_object_id').notNull(), // 24-char hex
|
||||
newId: uuid('new_id').notNull().defaultRandom(),
|
||||
createdAt: timestamp('created_at', { withTimezone: true }).defaultNow(),
|
||||
}, (t) => ({
|
||||
uq: uniqueIndex('id_map_collection_legacy_uq').on(t.collection, t.legacyId),
|
||||
}));
|
||||
```
|
||||
|
||||
Rules:
|
||||
- Backfill allocates `new_id` once per `(collection, legacyId)` and upserts here. Re-running backfill is safe.
|
||||
- Resolving a foreign reference = look up the parent's `legacyId` in `id_map` to get its `new_id`. **A child cannot backfill until its parents are mapped** (enforces parents-before-children).
|
||||
- Keep `legacy_object_id` as a real column on each migrated table too, for traceability and for the dual-write path to match Mongo docs.
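
The allocate-once and parents-before-children rules can be illustrated without a database. This is a sketch only: `InMemoryIdMap` is a hypothetical stand-in for the `id_map` table, and the id generator is injected so the example stays deterministic. A real implementation would upsert against the `(collection, legacy_object_id)` unique index.

```typescript
// In-memory stand-in for the id_map table (hypothetical, for illustration).
class InMemoryIdMap {
  private rows = new Map<string, string>();
  private newId: () => string;

  constructor(newId: () => string) {
    this.newId = newId;
  }

  // Allocate once per (collection, legacyId); repeated calls return the same id,
  // which is what makes re-running a backfill safe.
  allocate(collection: string, legacyId: string): string {
    const key = `${collection}:${legacyId}`;
    const existing = this.rows.get(key);
    if (existing !== undefined) return existing;
    const id = this.newId();
    this.rows.set(key, id);
    return id;
  }

  // Resolve a foreign reference; throws when the parent has not been mapped yet,
  // which enforces parents-before-children.
  resolve(collection: string, legacyId: string): string {
    const id = this.rows.get(`${collection}:${legacyId}`);
    if (id === undefined) {
      throw new Error(`parent ${collection}/${legacyId} is not in id_map yet`);
    }
    return id;
  }
}
```

A child backfill would call `resolve('users', doc.buyerId)` for each FK and stop the batch on the first unmapped parent, so a missing parent surfaces as an error instead of a dangling row.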
|
||||
|
||||
---
|
||||
|
||||
## 4. Resolving the hard data-modeling cases in Drizzle
|
||||
|
||||
These are the patterns from §3 of the guide, made concrete. Get these right once; they recur.
|
||||
|
||||
### 4.1 Mixed / polymorphic ids — `Payment`, `FundsLedgerEntry`, `DerivedDestination`
|
||||
|
||||
Today `Payment.purchaseRequestId`, `sellerOfferId`, `sellerId` are `Schema.Types.Mixed` — an ObjectId for normal flows, a **string** for template checkout. **Never** store "uuid-or-string" in one PG column. Split into a typed FK + a nullable free-text ref + a discriminator.
|
||||
|
||||
```ts
// src/db/schema/payments.ts
import { sql } from 'drizzle-orm';
import { pgTable, uuid, text, numeric, boolean, timestamp, jsonb, pgEnum, index, uniqueIndex } from 'drizzle-orm/pg-core';
// Sibling schema modules; the sellerOffers file name is assumed.
import { users } from './users';
import { purchaseRequests } from './purchaseRequests';
import { sellerOffers } from './sellerOffers';

export const paymentProvider = pgEnum('payment_provider', ['request.network','amn.scanner','shkeeper','other']);
export const paymentDirection = pgEnum('payment_direction', ['in','out','refund']);
export const paymentStatus = pgEnum('payment_status', ['pending','processing','completed','failed','cancelled','refunded']); // confirm full enum from model
export const escrowState = pgEnum('escrow_state', ['funded','releasable','released','refunded','releasing','failed','cancelled','partial']);
export const refKind = pgEnum('ref_kind', ['entity','template']); // discriminator

export const payments = pgTable('payments', {
  id:             uuid('id').primaryKey().defaultRandom(),
  legacyObjectId: text('legacy_object_id'),

  // purchaseRequestId (Mixed) → typed FK OR free string
  purchaseRequestRefKind:     refKind('purchase_request_ref_kind').notNull(),
  purchaseRequestId:          uuid('purchase_request_id').references(() => purchaseRequests.id), // null when template
  purchaseRequestExternalRef: text('purchase_request_external_ref'),                             // set when template

  // sellerOfferId (Mixed) → same shape
  sellerOfferRefKind:     refKind('seller_offer_ref_kind').notNull(),
  sellerOfferId:          uuid('seller_offer_id').references(() => sellerOffers.id),
  sellerOfferExternalRef: text('seller_offer_external_ref'),

  buyerId: uuid('buyer_id').notNull().references(() => users.id),

  // sellerId (Mixed)
  sellerRefKind:     refKind('seller_ref_kind').notNull(),
  sellerId:          uuid('seller_id').references(() => users.id),
  sellerExternalRef: text('seller_external_ref'),

  // amount subdoc → inline columns
  amount:   numeric('amount', { precision: 38, scale: 18 }).notNull(),
  currency: text('currency').notNull().default('USDT'),

  provider:    paymentProvider('provider').notNull().default('request.network'),
  direction:   paymentDirection('direction').notNull().default('in'),
  status:      paymentStatus('status').notNull().default('pending'),
  escrowState: escrowState('escrow_state'),

  providerPaymentId: text('provider_payment_id'),
  blockchain:        jsonb('blockchain'), // transactionHash etc. — read-as-blob, GIN if filtered
  metadata:          jsonb('metadata'),   // provider-specific, schema-varying

  isRefunded:  boolean('is_refunded').notNull().default(false),
  completedAt: timestamp('completed_at', { withTimezone: true }),
  createdAt:   timestamp('created_at', { withTimezone: true }).defaultNow(),
  updatedAt:   timestamp('updated_at', { withTimezone: true }).defaultNow(),
}, (t) => ({
  byStatusCreated: index('payments_status_created_idx').on(t.status, t.createdAt),
  byBuyerStatus:   index('payments_buyer_status_idx').on(t.buyerId, t.status),
  bySellerStatus:  index('payments_seller_status_idx').on(t.sellerId, t.status),
  // NOTE: despite its name this indexes provider_payment_id; the transaction hash lives
  // inside the blockchain JSONB. Rename or re-target after checking the Mongo index.
  txHash:          index('payments_tx_hash_idx').on(t.providerPaymentId),
  // Partial-unique idempotency — the real Mongo index 'uniq_pending_request_network_by_buyer_session_offer'.
  // PG treats NULLs as distinct in a unique index, so template rows (NULL purchase_request_id /
  // seller_offer_id) are not deduplicated here; confirm whether they need their own index.
  pendingRnUq: uniqueIndex('uniq_pending_rn_by_buyer_offer')
    .on(t.buyerId, t.purchaseRequestId, t.sellerOfferId, t.provider, t.direction)
    .where(sql`provider = 'request.network' AND direction = 'in' AND status = 'pending'`),
}));
```
|
||||
|
||||
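Add a CHECK per Mixed field so each discriminator always agrees with which column is populated. The constraint for `purchaseRequestId` is shown; `sellerOfferId` and `sellerId` need the same shape: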
|
||||
```sql
ALTER TABLE payments ADD CONSTRAINT payments_pr_ref_ck CHECK (
  (purchase_request_ref_kind = 'entity'   AND purchase_request_id IS NOT NULL AND purchase_request_external_ref IS NULL) OR
  (purchase_request_ref_kind = 'template' AND purchase_request_id IS NULL     AND purchase_request_external_ref IS NOT NULL)
);
```
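
On the write path, the backfill and the Drizzle repo both have to turn one Mixed value into the three-column shape. A minimal sketch of that split follows. It assumes the Mongo value arrives either as an ObjectId (modelled here as `{ objectIdHex }`) or as a plain string; `splitMixedRef` and `satisfiesRefCheck` are hypothetical names. A string that happens to contain 24 hex characters is still treated as a template ref, which is one reason the pre-migration audit of Mixed rows matters.

```typescript
type RefKind = 'entity' | 'template';

// Assumed input shape for this sketch: an ObjectId (by hex) or a free-form string.
type MixedId = { objectIdHex: string } | string;

interface SplitRef {
  refKind: RefKind;
  id: string | null;          // uuid FK, set only for 'entity'
  externalRef: string | null; // free text, set only for 'template'
}

// resolveUuid looks the legacy ObjectId up in id_map and returns the new uuid.
function splitMixedRef(value: MixedId, resolveUuid: (legacyHex: string) => string): SplitRef {
  if (typeof value === 'string') {
    return { refKind: 'template', id: null, externalRef: value };
  }
  return { refKind: 'entity', id: resolveUuid(value.objectIdHex), externalRef: null };
}

// Mirrors the CHECK constraint: exactly one of id / externalRef, matching refKind.
function satisfiesRefCheck(r: SplitRef): boolean {
  return r.refKind === 'entity'
    ? r.id !== null && r.externalRef === null
    : r.id === null && r.externalRef !== null;
}
```

Running `satisfiesRefCheck` in the verify harness before insert gives a readable error per row instead of a constraint violation that aborts a whole batch.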
|
||||
|
||||
`FundsLedgerEntry` has the same Mixed `purchaseRequestId`/`paymentId` plus a **`idempotencyKey` sparse-unique** → partial unique index `WHERE idempotency_key IS NOT NULL`.
|
||||
|
||||
### 4.2 Embedded arrays → child tables
|
||||
|
||||
| Source (embedded) | PG | Notes |
|---|---|---|
| `PurchaseRequest.offers[]` (array of SellerOffer ids) | junction `purchase_request_offers(pr_id, offer_id)` | FK integrity; also drop the denormalized array. |
| `PurchaseRequest.preferredSellerIds[]` | junction `pr_preferred_sellers(pr_id, user_id)` | — |
| `PurchaseRequest.deliveryInfo / serviceInfo` (nested subdocs) | child tables `pr_delivery_info`, `pr_service_info` (1:1) | queried logistics; not blobbed. |
| `Dispute.evidence[]`, `Dispute.timeline[]` | `dispute_evidence`, `dispute_timeline` | timeline pre-save append → explicit INSERT. |
| `User.passkeys[]`, `User.refreshTokens[]` | `user_passkeys`, `user_refresh_tokens` | append/revoke + lookup semantics. |
| `DerivedDestination` sweep history, `TrezorAccount.addresses[]` | child tables | per-address rows referenced by payments. |
| `Payment.blockchain`, `Payment.metadata`, `Notification.metadata`, `PointTransaction.metadata` | **JSONB** | read-as-blob, never filtered/joined. |
|
||||
|
||||
Rule: **child table when you query/index/FK/aggregate it; JSONB when you read it whole and never filter on it.**
|
||||
|
||||
### 4.3 Self-referential FK — `Category`
|
||||
```ts
import { pgTable, uuid, text, boolean, index, type AnyPgColumn } from 'drizzle-orm/pg-core';

export const categories = pgTable('categories', {
  id:             uuid('id').primaryKey().defaultRandom(),
  legacyObjectId: text('legacy_object_id'),
  name:           text('name').notNull(),
  nameEn:         text('name_en'),
  // Self-FK: the explicit AnyPgColumn return type is required for a self-reference.
  parentId:       uuid('parent_id').references((): AnyPgColumn => categories.id, { onDelete: 'set null' }),
  isActive:       boolean('is_active').notNull().default(true),
}, (t) => ({
  parentIdx: index('categories_parent_idx').on(t.parentId),
  activeIdx: index('categories_active_idx').on(t.isActive),
}));
// relations() only describes the parent/children link for relational queries;
// the FK and ON DELETE SET NULL come from .references() above.
```
|
||||
`Category.parentId` is itself Mixed (ObjectId | string) in the model — verify all rows are ObjectIds during the pre-migration audit; treat stray strings as data errors to clean.
|
||||
|
||||
### 4.4 Sparse-unique → partial unique index — `User.email`, `User.referralCode`
|
||||
The runtime code in `connection.ts` rebuilds `users.email` as unique+sparse. In PG:
|
||||
```ts
|
||||
emailUq: uniqueIndex('users_email_uq').on(t.email).where(sql`email IS NOT NULL`),
|
||||
referralUq: uniqueIndex('users_referral_uq').on(t.referralCode).where(sql`referral_code IS NOT NULL`),
|
||||
```
|
||||
Reimplement `toJSON()` password/token stripping in the repository's read mapper (it deletes `refreshTokens`, `emailVerification*` before returning).
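
One way to keep that stripping in a single place is a pure read mapper that every `DrizzleUserRepo` read passes through. The sketch below is illustrative: `toPublicUser` is a hypothetical name, `password` is an assumed field name, and the real list should be copied from the model's `toJSON()`.

```typescript
type UserRow = Record<string, unknown>;

// Fields removed before a user leaves the repository.
// `refreshTokens` and the `emailVerification*` prefix come from the text above;
// `password` is an assumed field name.
const STRIPPED_EXACT = new Set<string>(['password', 'refreshTokens']);
const STRIPPED_PREFIXES: readonly string[] = ['emailVerification'];

// Returns a copy without sensitive fields; the input row is left untouched.
function toPublicUser(row: UserRow): UserRow {
  const out: UserRow = {};
  for (const key of Object.keys(row)) {
    if (STRIPPED_EXACT.has(key)) continue;
    if (STRIPPED_PREFIXES.some((prefix) => key.startsWith(prefix))) continue;
    out[key] = row[key];
  }
  return out;
}
```

Because both the Mongo and Drizzle repos can share this mapper, shadow-read diffs compare the same public shape on both sides.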
|
||||
|
||||
### 4.5 Atomic counter — `DerivedDestination.derivationIndex`
|
||||
Today allocation relies on Mongo atomicity. In PG use a real transaction with `SELECT … FOR UPDATE` on a per-(buyer,chain) counter row, or a dedicated sequence per chain. The `uniq_destination_by_buyer_seller_chain` unique index ports directly. `status` enum `('active','swept','sweeping','quarantined')` → `pgEnum`.
|
||||
|
||||
### 4.6 TTL → `pg_cron`
|
||||
`TempVerification` and `TelegramSession` stay on Mongo (ephemeral, recommended). If `Notification` (90-day TTL) ever moves: monthly range-partition + drop, or
|
||||
```sql
|
||||
SELECT cron.schedule('notifications_ttl', '0 3 * * *',
|
||||
$$DELETE FROM notifications WHERE created_at < now() - interval '90 days'$$);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. The dual-write seam (the mechanic that makes it safe)
|
||||
|
||||
```ts
// repositories/dual/DualWritePaymentRepo.ts
export class DualWritePaymentRepo implements IPaymentRepo {
  // pg is typed as the concrete DrizzlePaymentRepo because upsertFromMongo is a
  // PG-side method (assumed not to be part of IPaymentRepo).
  constructor(private mongo: IPaymentRepo, private pg: DrizzlePaymentRepo) {}

  // READS: source of truth = Mongo until cutover
  findById(id) { return this.mongo.findById(id); }

  // WRITES: both, idempotently. Mongo first (authoritative); PG must not break the request.
  async create(input) {
    const m = await this.mongo.create(input);           // returns doc incl. _id
    try {
      await this.pg.upsertFromMongo(m);                 // keyed by legacyObjectId / idempotencyKey
    } catch (e) {
      metrics.dualWriteError('payments', 'create', e);  // alert, do NOT throw
    }
    return m;
  }

  async update(id, patch) {
    const m = await this.mongo.update(id, patch);
    try {
      await this.pg.upsertFromMongo(m);
    } catch (e) {
      metrics.dualWriteError('payments', 'update', e);  // alert, do NOT throw
    }
    return m;
  }
}
```
|
||||
|
||||
- **Mongo write is authoritative and must succeed**; PG write failures are logged + alerted, never surfaced to the user, during `dual` mode. (Once in `pg` mode, PG is authoritative and wrapped in real transactions.)
|
||||
- All PG writes are **idempotent upserts** keyed on `legacyObjectId` (or natural idempotency keys: `Payment` partial-unique set, `FundsLedgerEntry.idempotencyKey`). This lets backfill and live dual-write overlap without double-insert.
|
||||
- `$inc`/`$push` translate inside the repo: `$inc points` → `UPDATE … SET points = points + $1` in a transaction; `$push offers` → `INSERT INTO purchase_request_offers …`.
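
The operator translation in the last bullet can be sketched as pure functions that return parameterized statements, which keeps them easy to unit-test before they are wired to a driver. Everything here is an assumption for illustration: the `users.points` target, the `id` key column, and the `ON CONFLICT DO NOTHING` clause (which only deduplicates if the junction table has a unique key on `(pr_id, offer_id)`).

```typescript
interface SqlStatement {
  text: string;
  values: unknown[];
}

// Only identifiers from this allow-list are ever interpolated into SQL text;
// everything user-supplied travels as a bound parameter.
const INC_COLUMNS: Record<string, { table: string; column: string } | undefined> = {
  points: { table: 'users', column: 'points' }, // assumed target table/column
};

// Mongo `$inc: { field: delta }` -> atomic in-place UPDATE.
function translateInc(field: string, rowId: string, delta: number): SqlStatement {
  const target = INC_COLUMNS[field];
  if (!target) throw new Error(`no PG mapping for $inc on "${field}"`);
  return {
    text: `UPDATE ${target.table} SET ${target.column} = ${target.column} + $1 WHERE id = $2`,
    values: [delta, rowId],
  };
}

// Mongo `$push: { offers: offerId }` -> INSERT into the junction table.
function translatePushOffer(purchaseRequestId: string, offerId: string): SqlStatement {
  return {
    text: 'INSERT INTO purchase_request_offers (pr_id, offer_id) VALUES ($1, $2) ON CONFLICT DO NOTHING',
    values: [purchaseRequestId, offerId],
  };
}
```

The `UPDATE … SET points = points + $1` form is atomic for a single row on its own; the surrounding transaction matters when the increment must commit together with other writes, such as a `PointTransaction` row.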
|
||||
|
||||
---
|
||||
|
||||
## 6. Phased execution
|
||||
|
||||
Same phases as the guide §2, here with Drizzle-concrete entry/exit gates. Each phase ends with a collection in `pg` mode and dual-write removed only after the soak.
|
||||
|
||||
### Phase 0 — Foundations (2–5 wk) — *no data moves*
|
||||
- Stand up Postgres (per env), Drizzle, drizzle-kit, CI migrations. **Status 2026-05-31:** implemented in code and dev stack, but migrations must still be applied per target DB.
|
||||
- Build repository interfaces + `MongoRepo` wrappers for the relational-core domains (refactor services to call repos, not Mongoose directly). **Status 2026-05-31:** repo interfaces/implementations exist; service-layer wiring remains the bulk of the cutover risk.
|
||||
- Create `id_map`, the verification harness (§7), and the backfill batch runner skeleton.
|
||||
- **Exit:** all relational-core services call repositories; PG reachable everywhere; `id_map` + verify harness exist; CI runs migrations.
|
||||
|
||||
### Phase 1 — Address pilot (1–2 wk)
|
||||
- Smallest real domain; proves backfill → dual-write → verify → cutover end-to-end.
|
||||
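- Reimplement the **one-primary-per-user** pre-save invariant as either a partial unique index (`CREATE UNIQUE INDEX … ON addresses (user_id) WHERE is_primary`) or a trigger. Name the flag column `is_primary` or similar: `PRIMARY` is a reserved word in PostgreSQL and would need quoting everywhere.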
|
||||
- **Exit:** `addresses` in `pg` mode in prod, invariant proven under concurrent writes, verify green, dual-write removed.
|
||||
|
||||
### Phase 2 — Reference/config (2–3 wk)
|
||||
- `Category` (self-FK, soft-delete), `LevelConfig`, `ConfigSetting`, `ConfigSettingHistory`, `ShopSettings`, `Review`.
|
||||
- Port seeds to run in dependency order. Enforce `ShopSettings.sellerId` unique, Category `parentId` ON DELETE SET NULL.
|
||||
- **Exit:** these read from PG; seeds run in PG.
|
||||
|
||||
### Phase 3 — User + auth core (3–5 wk)
|
||||
- `User` is the FK hub — **must precede the money core** so `id_map` for users is authoritative.
|
||||
- Normalize `profile`/`preferences`/`points`/`referralStats` into columns; extract `passkeys[]`, `refreshTokens[]` to child tables; partial-unique `email`/`referralCode`; reimplement `toJSON()` stripping; passkey `default: Date.now()` in app code.
|
||||
- Redis session/rate-limit + in-memory passkey challenge store stay as-is.
|
||||
- **Exit:** `users` in `pg` mode; referral self-FK intact; all auth flows pass; user uuids authoritative in `id_map`.
|
||||
|
||||
### Phase 4 — Money core (6–10 wk) — *the point of the project*
|
||||
- `PurchaseRequest`, `SellerOffer`, `Payment`, `FundsLedgerEntry`, `DerivedDestination`, `TrezorAccount`, `PointTransaction`.
|
||||
- Apply §4.1 (Mixed→discriminator+FK), §4.2 (offers/preferredSellers junctions, deliveryInfo/serviceInfo child tables), §4.5 (derivation counter).
|
||||
- **Wrap in real PG transactions the multi-doc writes that today have none:** `raiseDispute` (PurchaseRequest + Payment), payment confirm + `FundsLedgerEntry` AML-fee insert, referral reward (points + referralStats), PointsService flows (migrate its 2 `withTransaction` sites to PG `BEGIN/COMMIT`).
|
||||
- Preserve the `Payment` partial-unique idempotency index and `FundsLedgerEntry.idempotencyKey` uniqueness.
|
||||
- **Exit:** money core in `pg` mode; checksum equality on `funds_ledger_entries` sums & `payments` amounts across a full soak; idempotency + escrow-hold invariants pass concurrency tests.
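
For the checksum-equality exit gate, financial sums should be compared exactly, not as floating-point numbers. The sketch below assumes amounts are available as plain decimal strings on both sides (for example a stringified Mongo value and PG `numeric` text) and scales them to integers at 18 fractional digits, matching the `numeric(38, 18)` column sketch. `toScaled`, `sumByGroup`, and `sumsMatch` are hypothetical helper names.

```typescript
const SCALE = 18; // fractional digits, matching numeric(38, 18)

// Parse a plain decimal string into an integer count of 10^-18 units.
function toScaled(amount: string): bigint {
  const m = /^(-?)(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!m) throw new Error(`not a plain decimal: "${amount}"`);
  const frac = m[3] === undefined ? '' : m[3];
  if (frac.length > SCALE) throw new Error(`more than ${SCALE} fractional digits: "${amount}"`);
  const units = BigInt(m[2] + frac.padEnd(SCALE, '0'));
  return m[1] === '-' ? -units : units;
}

// Exact per-group totals, e.g. grouped by "status/provider".
function sumByGroup(rows: { group: string; amount: string }[]): Map<string, bigint> {
  const sums = new Map<string, bigint>();
  for (const row of rows) {
    const current = sums.get(row.group);
    sums.set(row.group, (current === undefined ? BigInt(0) : current) + toScaled(row.amount));
  }
  return sums;
}

// True when both stores report the same groups with identical totals.
function sumsMatch(a: Map<string, bigint>, b: Map<string, bigint>): boolean {
  if (a.size !== b.size) return false;
  let equal = true;
  a.forEach((total, group) => {
    if (b.get(group) !== total) equal = false;
  });
  return equal;
}
```

Integer arithmetic makes the comparison exact, so a single mismatched unit in the ledger fails the gate instead of disappearing into rounding.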
|
||||
|
||||
### Phase 5 — Dispute + delivery (2–4 wk)
|
||||
- `Dispute.evidence[]`/`timeline[]` → child tables; pre-save timeline-append → explicit INSERT; delivery `$set/$push` nested updates → SQL.
|
||||
- `Dispute ↔ Chat` becomes a **cross-store call** (Chat stays on Mongo) — define the boundary API.
|
||||
- **Exit:** dispute lifecycle in `pg` mode; release-hold sync transactional.
|
||||
|
||||
### Phase 6 (deferred / optional) — `RequestTemplate`, `BlogPost`
|
||||
- Behind a search abstraction; `$regex` → PG trigram/FTS only if migrated. Otherwise leave on Mongo.
|
||||
|
||||
### Permanent on Mongo
|
||||
`Chat`, `Notification`, `TelegramSession`, `TempVerification`, `TelegramLink` link-state. Revisit only if dual-stack ops cost exceeds migration cost.
|
||||
|
||||
---
|
||||
|
||||
## 7. Verification (gate for every cutover)
|
||||
|
||||
Three layers, **all green before any read flip**:
|
||||
|
||||
1. **Row counts** — per collection and per FK relationship, Mongo vs PG. Catches dropped/dangling rows. Run continuously during dual-write.
|
||||
2. **Checksums** — column-level hashes; special attention to financial sums (`SUM(funds_ledger_entries.amount)`, `SUM(payments.amount)` grouped by status/provider) and the partial-unique idempotency set.
|
||||
3. **Shadow reads** — in prod, serve from Mongo, asynchronously read PG for the same key, diff, alert on mismatch. **A clean shadow-read window (e.g. 7 days, zero diffs on hot paths) is the exit criterion for cutover.**
|
||||
|
||||
```ts
// verify/shadow.ts — wrap a repo read in dual mode
async function shadowRead(key, mongoFn, pgFn) {
  const m = await mongoFn(key);
  pgFn(key)
    .then(p => { if (!deepEqualNormalized(m, p)) metrics.shadowMismatch(key, diff(m, p)); })
    .catch(e => metrics.shadowError(key, e));
  return m;   // user always gets Mongo result
}
```
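
`deepEqualNormalized` is not defined in this plan. One possible shape is sketched below under stated assumptions: dates compare by ISO string, `undefined` and `null` (and absent keys) are equivalent, object key order is irrelevant, and store-specific keys such as `_id` / `id` are skipped via an ignore set. Id-bearing FK fields would still need remapping through `id_map` before comparison, which this sketch does not do.

```typescript
// Reduce a Mongo document or PG row to a comparable plain value.
function normalize(value: unknown, ignore: ReadonlySet<string>): unknown {
  if (value === undefined || value === null) return null;
  if (value instanceof Date) return value.toISOString();
  if (Array.isArray(value)) return value.map((item) => normalize(item, ignore));
  if (typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    // Sorted keys make the serialized form independent of insertion order.
    for (const key of Object.keys(source).sort()) {
      if (ignore.has(key)) continue;
      const normalized = normalize(source[key], ignore);
      if (normalized !== null) out[key] = normalized; // absent == null
    }
    return out;
  }
  return value;
}

// Hypothetical implementation of the helper used by shadowRead.
function deepEqualNormalized(
  a: unknown,
  b: unknown,
  ignore: ReadonlySet<string> = new Set<string>(),
): boolean {
  return JSON.stringify(normalize(a, ignore)) === JSON.stringify(normalize(b, ignore));
}
```

Keeping the normalization rules explicit matters because every rule is a class of difference the shadow read will no longer report; each one should be a deliberate decision.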
|
||||
|
||||
---
|
||||
|
||||
## 8. Cutover & rollback runbook (per collection)
|
||||
|
||||
1. **Backfill** in batches with checkpointing; allocate uuids → `id_map`; remap FKs from already-migrated parents. Re-runnable (idempotent upserts).
|
||||
2. **Enable `dual`** (flag) — writes go to both; shadow-read diffing on. Backfill the delta accumulated during step 1.
|
||||
3. **Soak** until row-count + checksum + shadow-read are clean for the agreed window.
|
||||
4. **Flip reads to `pg`** (flag). Keep dual-write on.
|
||||
5. **Soak again** (shorter). Rollback = flip reads back to `mongo`; data still mirrored, so rollback is instant.
|
||||
6. **Decommission**: stop writing Mongo for that collection; archive the collection.
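
Step 1's "batches with checkpointing" can be sketched as a function that processes one batch and returns the new checkpoint, so the runner can persist it between batches and resume after a crash. This is an in-memory illustration: `backfillBatch` is a hypothetical name, the array stands in for a Mongo cursor sorted by `_id`, and ids are compared as fixed-length hex strings.

```typescript
interface LegacyDoc {
  _id: string;
}

interface BackfillResult {
  processed: number;
  checkpoint: string | null; // last _id written, to be persisted by the caller
}

// Copy up to batchSize documents in _id order, starting strictly after `checkpoint`.
// `upsert` must be idempotent so that re-running a batch is harmless.
function backfillBatch<T extends LegacyDoc>(
  source: T[],
  checkpoint: string | null,
  batchSize: number,
  upsert: (doc: T) => void,
): BackfillResult {
  const batch = source
    .filter((doc) => checkpoint === null || doc._id > checkpoint)
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0))
    .slice(0, batchSize);
  let last = checkpoint;
  for (const doc of batch) {
    upsert(doc);
    last = doc._id; // advance only after the write succeeded
  }
  return { processed: batch.length, checkpoint: last };
}
```

The caller loops until `processed` is `0`. Because the checkpoint only advances after a successful upsert, a crash mid-batch replays at most that batch, and the idempotent upsert absorbs the repeats.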
|
||||
|
||||
> Near-zero downtime: there is no global write freeze except, optionally, a brief one during final ledger reconciliation for the money core.
|
||||
|
||||
---
|
||||
|
||||
## 9. First two weeks — concrete starter checklist
|
||||
|
||||
- [ ] Add `drizzle-orm`, `pg`, `drizzle-kit`; create `src/db/{schema,client.ts,migrations}` + `drizzle.config.ts`.
|
||||
- [x] Provision Postgres in dev (compose) + define `PG_URL`; keep Mongo running alongside. Use Postgres 18 volume mount `/var/lib/postgresql`, not `/var/lib/postgresql/data`.
|
||||
- [ ] Write `id_map` schema; generate + run the first migration in CI.
|
||||
- [ ] Define `IAddressRepo`; implement `MongoAddressRepo` by moving the existing Mongoose calls behind it; refactor address service to use the repo. **No behavior change** — prove the seam is invisible (existing tests pass).
|
||||
- [ ] Build the verification harness (row count + checksum) against `addresses`.
|
||||
- [ ] Author `addresses` Drizzle schema (incl. one-primary partial unique index) + `DrizzleAddressRepo` + `DualWriteAddressRepo`.
|
||||
- [ ] Write the batch backfill for `addresses`; run dev backfill; confirm verify is green.
|
||||
- [ ] Flip dev to `dual`, then `pg`; document the flag flips. This is the template for all later phases.
|
||||
|
||||
---
|
||||
|
||||
## 10. Effort recap (from the guide)
|
||||
|
||||
| Scope | Eng-weeks | Notes |
|---|---|---|
| **Partial — money/relational core (Phases 0–5 + cross-cutting)** | **~16–28** | Recommended stopping point; captures ~90% of value (ACID money + relational integrity). |
| Full — all 23 collections | ~23–40 | Extra 7–12+ wks mostly buys Chat/Notification normalization the access patterns don't reward. |
|
||||
|
||||
Add ~20% contingency for data-audit surprises in the Mixed-id fields. One focused engineer assumed; parallelize to compress wall-clock, not effort.
|
||||
|
||||
---
|
||||
|
||||
> [!warning] Before trusting the code sketches
|
||||
> Drizzle schemas above use the real field names from `backend/src/models/` but are **first-pass sketches** — confirm the full `Payment.status` enum, the exact `amount` precision/scale your tokens need (USDT/USDC decimals), and audit which `Mixed` rows are actually strings vs ObjectIds **before** writing the money-core migration. See [[MongoDB to PostgreSQL Migration Guide]] §3/§5 for the authoritative per-field detail.
|
||||