---
title: MongoDB → PostgreSQL Migration Plan (Drizzle)
tags: [data-model, migration, postgres, drizzle, plan, runbook]
aliases: [Drizzle Migration Plan, PG Migration Plan]
created: 2026-05-31
companion: "[[MongoDB to PostgreSQL Migration Guide]]"
updated: 2026-06-01 for backend integrate-main-into-development@2c5c3c7 backend 2.8.20 + deployment main@38cb75b
---

# MongoDB → PostgreSQL Migration Plan (Drizzle)

> [!abstract] What this is
> The **execution plan** for the recommendation in [[MongoDB to PostgreSQL Migration Guide]]: a **hybrid target** (Postgres for the money/relational core, Mongo retained for Chat/Notification/TTL-session collections) reached via the **strangler pattern with dual-write**, using **Drizzle ORM** + **drizzle-kit** migrations.
>
> It is opinionated and concrete: a repository seam, an `id_map` bridge, Drizzle schema sketches for the hard cases (Mixed ids, embedded arrays, partial-unique idempotency, TTL), per-phase backfill/verify/cutover mechanics, and a rollback runbook. Where it references fields it uses the **real schema** from `backend/src/models/`.
>
> **Scope reminder:** partial migration (Phases 0–5) is the recommended stopping point — ≈16–28 engineer-weeks. Full migration of Chat/Notification/sessions is explicitly deferred.

> [!warning] Current implementation status
> Backend `2.8.20` has started the runtime cutover with store-specific raw Postgres facades: auth-owned users/Telegram auth records behind `AUTH_STORE=postgres`, confirmation-threshold config/history behind `CONFIG_STORE=postgres`, user address CRUD behind `ADDRESS_STORE=postgres`, and the first marketplace/reference domains behind `CATEGORY_STORE=postgres`, `LEVEL_CONFIG_STORE=postgres`, `SHOP_SETTINGS_STORE=postgres`, and `REVIEW_STORE=postgres`. Category PG mode now deactivates duplicate active names and enforces an active normalized-name unique index. It also contains the broader `src/db/` Drizzle schemas through `0010`, repository implementations/factory, id-map bridge, and backfill runner described below. RequestTemplate now has a PG table/backfill. Funds ledger appends and balance reads now route through `getPaymentRepo()` and can be controlled by `REPO_PAYMENT`, but broad marketplace/payment/points services are still mostly not wired through their factory repos. Code defaults remain Mongo unless a per-store flag is explicitly flipped; dev deployment `38cb75b` now flips the seven PG-capable store flags to Postgres by default. See [[Postgres Runtime Cutover Status]].

---

## 0. Guiding principles

1. **Never cut over without a soak.** Every collection goes through backfill → dual-write → shadow-read verify → flip reads → soak → decommission. Rollback at any point = flip reads back to Mongo.
2. **The repository layer is the only thing that knows where data lives.** Services must stop calling Mongoose directly. This seam is what makes the swap invisible and per-collection reversible.
3. **Parents before children.** FK remapping flows through `id_map`; you cannot migrate `Payment` before `User` exists in PG with stable uuids.
4. **Money correctness is the point.** The migration's payoff is real ACID transactions around payment + ledger + dispute flows that today lean on Mongo per-document atomicity. Treat every money write as transactional from day one in PG.
5. **No feature work during migration.** No new fields, no behavior changes. A migration that also ships features cannot be verified by row-count + checksum equality.
6. **Mongo stays authoritative until cutover.** Dual-write writes both; reads come from Mongo until a collection's shadow-read window is clean.

---

## 1. Target architecture

```
    ┌─────────────────────────────────────────────┐
    │                Service layer                │
    │ (marketplace, payment, dispute, points, …)  │
    └───────────────────────┬─────────────────────┘
                            │ calls interfaces only
    ┌───────────────────────▼─────────────────────┐
    │              Repository layer               │
    │  IUserRepo, IPaymentRepo, IPurchaseRepo, …  │
    │    ── feature-flagged per collection ──     │
    └───────┬───────────────────────────┬─────────┘
      reads/writes                reads/writes
            │                           │
┌───────────▼─────────┐     ┌───────────▼─────────┐
│  MongoRepo (today)  │     │  DrizzleRepo (new)  │
│  Mongoose models    │     │  Postgres + Drizzle │
└─────────────────────┘     └─────────────────────┘
            │                           │
      ┌─────▼────┐                ┌─────▼────┐
      │ MongoDB  │◄── id_map ────►│ Postgres │
      └──────────┘   (bridge)     └──────────┘

    Permanent on Mongo: Chat, Notification, TelegramSession,
    TempVerification, TelegramLink-state. Redis untouched.
```

Each domain gets an interface (`IPaymentRepo`), a `MongoPaymentRepo` (wraps today's Mongoose calls verbatim), a `DrizzlePaymentRepo` (new), and a `DualWritePaymentRepo` (delegates reads to one, writes to both, behind a flag). A factory picks the implementation per collection from config:

```ts
// repos/factory.ts
type Mode = 'mongo' | 'dual' | 'pg';

// Env values are plain strings; anything unrecognized falls back to 'mongo'.
const toMode = (v: string | undefined): Mode =>
  v === 'dual' || v === 'pg' ? v : 'mongo';

const MODE: Record<string, Mode> = {
  user: toMode(env.REPO_USER),
  payment: toMode(env.REPO_PAYMENT),
  // …per collection
};

export const paymentRepo: IPaymentRepo =
  MODE.payment === 'pg' ? new DrizzlePaymentRepo()
  : MODE.payment === 'dual' ? new DualWritePaymentRepo(new MongoPaymentRepo(), new DrizzlePaymentRepo())
  : new MongoPaymentRepo();
```

A collection's migration is then just two flag flips across three modes: `mongo → dual → pg`.

---

## 2. Drizzle & infra setup (Phase 0)

### Packages

```
pnpm add drizzle-orm pg
pnpm add -D drizzle-kit @types/pg
```

### Layout

```
backend/src/db/
  schema/              # one file per table group
    users.ts
    payments.ts
    purchaseRequests.ts
    ...
    idMap.ts
    index.ts           # re-exports all tables + relations
  client.ts            # drizzle(pg.Pool) singleton
  migrations/          # drizzle-kit generated SQL
  repositories/
    interfaces/        # IUserRepo, IPaymentRepo, …
    mongo/             # MongoUserRepo (wraps existing Mongoose)
    drizzle/           # DrizzleUserRepo
    dual/              # DualWriteUserRepo
    factory.ts
  backfill/            # per-collection batch copiers
  verify/              # row-count + checksum + shadow-read harness
drizzle.config.ts
```

### `drizzle.config.ts`

```ts
import { defineConfig } from 'drizzle-kit';

export default defineConfig({
  schema: './src/db/schema/index.ts',
  out: './src/db/migrations',
  dialect: 'postgresql',
  dbCredentials: { url: process.env.PG_URL! },
  strict: true,
  verbose: true,
});
```

### Client

```ts
// src/db/client.ts
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import * as schema from './schema';

export const pool = new Pool({ connectionString: process.env.PG_URL, max: 10 });
export const db = drizzle(pool, { schema });
```

> Mirror the current Mongo pool size (`maxPoolSize: 10` in `connection.ts`). Keep `mongoose.connect` alive in parallel — both drivers run for the whole migration.

### Migration workflow

- Author tables in `schema/*.ts` → `pnpm drizzle-kit generate` → review the SQL in `migrations/` → `pnpm drizzle-kit migrate` in CI per environment.
- **Migrations are versioned and reviewed.** drizzle-kit generates forward SQL only, so treat "reversible" as a requirement on the author: every risky migration ships with a hand-written follow-up that undoes it. This is brand-new discipline; there is no migration framework today.

---

## 3. The `id_map` bridge

ObjectIds become uuids. Every legacy id is recorded so FKs can be remapped and dual-writes stay idempotent.

```ts
// src/db/schema/idMap.ts
import { pgTable, uuid, text, timestamp, uniqueIndex } from 'drizzle-orm/pg-core';

export const idMap = pgTable('id_map', {
  collection: text('collection').notNull(),       // 'users', 'payments', …
  legacyId: text('legacy_object_id').notNull(),   // 24-char hex
  newId: uuid('new_id').notNull().defaultRandom(),
  createdAt: timestamp('created_at', { withTimezone: true }).defaultNow(),
}, (t) => ({
  uq: uniqueIndex('id_map_collection_legacy_uq').on(t.collection, t.legacyId),
}));
```

Rules:

- Backfill allocates `new_id` once per `(collection, legacyId)` and upserts here. Re-running backfill is safe.
- Resolving a foreign reference = look up the parent's `legacyId` in `id_map` to get its `new_id`. **A child cannot backfill until its parents are mapped** (enforces parents-before-children).
- Keep `legacy_object_id` as a real column on each migrated table too, for traceability and for the dual-write path to match Mongo docs.

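
The rules above can be sketched with an in-memory stand-in for `id_map`. This is a minimal illustration, not the repository's implementation: `IdMapSketch`, `allocate`, and `resolve` are hypothetical names, and the uuid generator is injected so the sketch has no dependencies. A real version would presumably express `allocate` as an `INSERT … ON CONFLICT DO NOTHING` followed by a read of the stored `new_id`. The point it shows is that a repeated allocation returns the same uuid, and that resolving an unmapped parent fails loudly instead of writing a dangling FK.

```typescript
// In-memory stand-in for the id_map table (hypothetical helper, for illustration only).
class IdMapSketch {
  private rows = new Map<string, string>();
  private newUuid: () => string;

  constructor(newUuid: () => string) {
    this.newUuid = newUuid;
  }

  // Allocate once per (collection, legacyId); repeat calls return the stored uuid.
  allocate(collection: string, legacyId: string): string {
    const key = `${collection}:${legacyId}`;
    const existing = this.rows.get(key);
    if (existing !== undefined) return existing;
    const id = this.newUuid();
    this.rows.set(key, id);
    return id;
  }

  // Resolve a foreign reference; an unmapped parent is an error, which is
  // what enforces parents-before-children during backfill.
  resolve(collection: string, legacyId: string): string {
    const id = this.rows.get(`${collection}:${legacyId}`);
    if (id === undefined) {
      throw new Error(`parent not mapped yet: ${collection}/${legacyId}`);
    }
    return id;
  }
}
```
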
---

## 4. Resolving the hard data-modeling cases in Drizzle

These are the patterns from §3 of the guide, made concrete. Get these right once; they recur.

### 4.1 Mixed / polymorphic ids — `Payment`, `FundsLedgerEntry`, `DerivedDestination`

Today `Payment.purchaseRequestId`, `sellerOfferId`, `sellerId` are `Schema.Types.Mixed` — an ObjectId for normal flows, a **string** for template checkout. **Never** store "uuid-or-string" in one PG column. Split into a typed FK + a nullable free-text ref + a discriminator.

```ts
// src/db/schema/payments.ts
import { sql } from 'drizzle-orm';
import { pgTable, uuid, text, numeric, boolean, timestamp, jsonb, pgEnum, index, uniqueIndex } from 'drizzle-orm/pg-core';
import { users } from './users';
import { purchaseRequests } from './purchaseRequests';
import { sellerOffers } from './sellerOffers'; // file name assumed; not listed in the layout above

export const paymentProvider = pgEnum('payment_provider', ['request.network','amn.scanner','shkeeper','other']);
export const paymentDirection = pgEnum('payment_direction', ['in','out','refund']);
export const paymentStatus = pgEnum('payment_status', ['pending','processing','completed','failed','cancelled','refunded']); // confirm full enum from model
export const escrowState = pgEnum('escrow_state', ['funded','releasable','released','refunded','releasing','failed','cancelled','partial']);
export const refKind = pgEnum('ref_kind', ['entity','template']); // discriminator

export const payments = pgTable('payments', {
  id: uuid('id').primaryKey().defaultRandom(),
  legacyObjectId: text('legacy_object_id'),

  // purchaseRequestId (Mixed) → typed FK OR free string
  purchaseRequestRefKind: refKind('purchase_request_ref_kind').notNull(),
  purchaseRequestId: uuid('purchase_request_id').references(() => purchaseRequests.id), // null when template
  purchaseRequestExternalRef: text('purchase_request_external_ref'),                    // set when template

  // sellerOfferId (Mixed) → same shape
  sellerOfferRefKind: refKind('seller_offer_ref_kind').notNull(),
  sellerOfferId: uuid('seller_offer_id').references(() => sellerOffers.id),
  sellerOfferExternalRef: text('seller_offer_external_ref'),

  buyerId: uuid('buyer_id').notNull().references(() => users.id),

  // sellerId (Mixed)
  sellerRefKind: refKind('seller_ref_kind').notNull(),
  sellerId: uuid('seller_id').references(() => users.id),
  sellerExternalRef: text('seller_external_ref'),

  // amount subdoc → inline columns
  amount: numeric('amount', { precision: 38, scale: 18 }).notNull(),
  currency: text('currency').notNull().default('USDT'),

  provider: paymentProvider('provider').notNull().default('request.network'),
  direction: paymentDirection('direction').notNull().default('in'),
  status: paymentStatus('status').notNull().default('pending'),
  escrowState: escrowState('escrow_state'),

  providerPaymentId: text('provider_payment_id'),
  blockchain: jsonb('blockchain'), // transactionHash etc. — read-as-blob, GIN if filtered
  metadata: jsonb('metadata'),     // provider-specific, schema-varying

  isRefunded: boolean('is_refunded').notNull().default(false),
  completedAt: timestamp('completed_at', { withTimezone: true }),
  createdAt: timestamp('created_at', { withTimezone: true }).defaultNow(),
  updatedAt: timestamp('updated_at', { withTimezone: true }).defaultNow(),
}, (t) => ({
  byStatusCreated: index('payments_status_created_idx').on(t.status, t.createdAt),
  byBuyerStatus: index('payments_buyer_status_idx').on(t.buyerId, t.status),
  bySellerStatus: index('payments_seller_status_idx').on(t.sellerId, t.status),
  // Named tx_hash but indexes provider_payment_id; confirm the intended column against the model.
  txHash: index('payments_tx_hash_idx').on(t.providerPaymentId),
  // Partial-unique idempotency — the real Mongo index 'uniq_pending_request_network_by_buyer_session_offer'
  // Caveat: PG treats NULLs as distinct in a unique index, so rows whose purchase_request_id or
  // seller_offer_id is NULL (template refs) are not deduplicated by this index as written.
  pendingRnUq: uniqueIndex('uniq_pending_rn_by_buyer_offer')
    .on(t.buyerId, t.purchaseRequestId, t.sellerOfferId, t.provider, t.direction)
    .where(sql`provider = 'request.network' AND direction = 'in' AND status = 'pending'`),
}));
```

Add a CHECK so a discriminator always agrees with which column is populated:

```sql
ALTER TABLE payments ADD CONSTRAINT payments_pr_ref_ck CHECK (
  (purchase_request_ref_kind = 'entity'   AND purchase_request_id IS NOT NULL AND purchase_request_external_ref IS NULL) OR
  (purchase_request_ref_kind = 'template' AND purchase_request_id IS NULL     AND purchase_request_external_ref IS NOT NULL)
);
```

`FundsLedgerEntry` has the same Mixed `purchaseRequestId`/`paymentId` plus an **`idempotencyKey` sparse-unique** → partial unique index `WHERE idempotency_key IS NOT NULL`.

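
On the backfill side, the Mixed → discriminator split might look like the sketch below. `splitMixedRef` and `SplitRef` are hypothetical names, and the classification rule (anything that stringifies to 24 hex characters is an entity id, any other non-empty value is a template ref) is an assumption the pre-migration audit has to confirm, since a string that merely looks like an ObjectId would be classified as an entity. The output shape lines up with the CHECK constraint: exactly one of the two value fields is set.

```typescript
type RefKind = 'entity' | 'template';

interface SplitRef {
  kind: RefKind;
  legacyObjectId: string | null; // still a Mongo id; map it to a uuid through id_map before insert
  externalRef: string | null;    // free-text ref used by template checkout
}

const OBJECT_ID_HEX = /^[0-9a-f]{24}$/i;

function splitMixedRef(value: unknown): SplitRef {
  if (value === null || value === undefined || value === '') {
    throw new Error('Mixed ref is empty; decide how to handle it during the audit');
  }
  // An ObjectId stringifies to its 24-char hex form, so String() covers
  // both ObjectId instances and plain strings.
  const text = String(value);
  if (OBJECT_ID_HEX.test(text)) {
    return { kind: 'entity', legacyObjectId: text.toLowerCase(), externalRef: null };
  }
  return { kind: 'template', legacyObjectId: null, externalRef: text };
}
```

Rows that throw here are the "stray" values worth listing in the audit report before any money-core backfill runs.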
### 4.2 Embedded arrays → child tables

| Source (embedded) | PG | Notes |
|---|---|---|
| `PurchaseRequest.offers[]` (array of SellerOffer ids) | junction `purchase_request_offers(pr_id, offer_id)` | FK integrity; also drop the denormalized array. |
| `PurchaseRequest.preferredSellerIds[]` | junction `pr_preferred_sellers(pr_id, user_id)` | — |
| `PurchaseRequest.deliveryInfo / serviceInfo` (nested subdocs) | child tables `pr_delivery_info`, `pr_service_info` (1:1) | queried logistics; not blobbed. |
| `Dispute.evidence[]`, `Dispute.timeline[]` | `dispute_evidence`, `dispute_timeline` | timeline pre-save append → explicit INSERT. |
| `User.passkeys[]`, `User.refreshTokens[]` | `user_passkeys`, `user_refresh_tokens` | append/revoke + lookup semantics. |
| `DerivedDestination` sweep history, `TrezorAccount.addresses[]` | child tables | per-address rows referenced by payments. |
| `Payment.blockchain`, `Payment.metadata`, `Notification.metadata`, `PointTransaction.metadata` | **JSONB** | read-as-blob, never filtered/joined. |

Rule: **child table when you query/index/FK/aggregate it; JSONB when you read it whole and never filter on it.**

### 4.3 Self-referential FK — `Category`

```ts
import { sql } from 'drizzle-orm';
import { pgTable, uuid, text, boolean, index, uniqueIndex, type AnyPgColumn } from 'drizzle-orm/pg-core';

export const categories = pgTable('categories', {
  id: uuid('id').primaryKey().defaultRandom(),
  legacyObjectId: text('legacy_object_id'),
  name: text('name').notNull(),
  nameEn: text('name_en'),
  // Self-FK; the explicit AnyPgColumn return type breaks the circular type inference.
  parentId: uuid('parent_id').references((): AnyPgColumn => categories.id, { onDelete: 'set null' }),
  isActive: boolean('is_active').notNull().default(true),
}, (t) => ({
  parentIdx: index('categories_parent_idx').on(t.parentId),
  activeIdx: index('categories_active_idx').on(t.isActive),
  activeNameNormUq: uniqueIndex('categories_active_name_norm_uq')
    .on(sql`lower(btrim(${t.name}))`)
    .where(sql`${t.isActive} = true`),
}));
// relations() only describes parentId → categories.id for the query builder;
// the FK constraint and ON DELETE SET NULL come from .references() above.
```

`Category.parentId` is itself Mixed (ObjectId | string) in the model — verify all rows are ObjectIds during the pre-migration audit; treat stray strings as data errors to clean.

Active categories must also be unique by normalized visible name; migration `0009_unique_active_categories.sql` deactivates duplicate active rows and repoints category references before adding the unique index.

### 4.4 Sparse-unique → partial unique index — `User.email`, `User.referralCode`

The runtime code in `connection.ts` rebuilds `users.email` as unique+sparse. In PG:

```ts
emailUq: uniqueIndex('users_email_uq').on(t.email).where(sql`email IS NOT NULL`),
referralUq: uniqueIndex('users_referral_uq').on(t.referralCode).where(sql`referral_code IS NOT NULL`),
```

Reimplement `toJSON()` password/token stripping in the repository's read mapper (it deletes `refreshTokens`, `emailVerification*` before returning).

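
A read mapper that does this stripping could be as small as the sketch below. `toPublicUser` is a hypothetical name, and the `password` key is an assumption based on the "password/token stripping" wording; the authoritative list of removed fields is whatever the current `toJSON()` in the model deletes.

```typescript
type UserRow = Record<string, unknown>;

// Returns a copy without sensitive fields; the input row is left untouched.
function toPublicUser(row: UserRow): UserRow {
  const out: UserRow = {};
  for (const [key, value] of Object.entries(row)) {
    const sensitive =
      key === 'password' ||               // assumed field name
      key === 'refreshTokens' ||
      key.startsWith('emailVerification'); // covers every emailVerification* field
    if (!sensitive) out[key] = value;
  }
  return out;
}
```

Putting this in the repository mapper rather than in each service keeps the Mongo and Drizzle implementations returning the same shape, which shadow-read diffing depends on.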
### 4.5 Atomic counter — `DerivedDestination.derivationIndex`

Today allocation relies on Mongo atomicity. In PG use a real transaction with `SELECT … FOR UPDATE` on a per-(buyer,chain) counter row, or a dedicated sequence per chain. The `uniq_destination_by_buyer_seller_chain` unique index ports directly. `status` enum `('active','swept','sweeping','quarantined')` → `pgEnum`.

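
The `SELECT … FOR UPDATE` variant might look like the following sketch. Everything named here is an assumption: the `derivation_counters` table, its `(buyer_id, chain)` unique key, the `next_index` column, and the `Queryable` interface, which is only modeled on the `query(text, values)` call of a `pg` client (whether a real `PoolClient` type-checks against it unchanged is not verified here). The caller is expected to pass a single dedicated connection, since the transaction spans several statements.

```typescript
// Minimal client shape for this sketch, modeled on pg's query(text, values).
interface Queryable {
  query(text: string, params?: unknown[]): Promise<{ rows: Array<Record<string, unknown>> }>;
}

async function allocateDerivationIndex(
  client: Queryable,
  buyerId: string,
  chain: string,
): Promise<number> {
  await client.query('BEGIN');
  try {
    // Make sure the counter row exists so FOR UPDATE has a row to lock.
    await client.query(
      `INSERT INTO derivation_counters (buyer_id, chain, next_index)
       VALUES ($1, $2, 0) ON CONFLICT (buyer_id, chain) DO NOTHING`,
      [buyerId, chain],
    );
    // Row lock: concurrent allocators for the same (buyer, chain) wait here.
    const locked = await client.query(
      `SELECT next_index FROM derivation_counters
       WHERE buyer_id = $1 AND chain = $2 FOR UPDATE`,
      [buyerId, chain],
    );
    const row = locked.rows[0];
    if (row === undefined) throw new Error('counter row missing after insert');
    const index = Number(row['next_index']);
    await client.query(
      `UPDATE derivation_counters SET next_index = next_index + 1
       WHERE buyer_id = $1 AND chain = $2`,
      [buyerId, chain],
    );
    await client.query('COMMIT');
    return index;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

Unlike a sequence, this counter rolls back with the surrounding transaction, so a failed allocation does not leave a gap in the derivation indexes.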
### 4.6 TTL → `pg_cron`

`TempVerification` and `TelegramSession` stay on Mongo (ephemeral, recommended). If `Notification` (90-day TTL) ever moves, either range-partition it by month and drop expired partitions, or schedule a `pg_cron` delete:

```sql
SELECT cron.schedule('notifications_ttl', '0 3 * * *',
  $$DELETE FROM notifications WHERE created_at < now() - interval '90 days'$$);
```

---

## 5. The dual-write seam (the mechanic that makes it safe)

```ts
// repositories/dual/DualWritePaymentRepo.ts
export class DualWritePaymentRepo implements IPaymentRepo {
  constructor(private mongo: IPaymentRepo, private pg: IPaymentRepo) {}

  // READS: source of truth = Mongo until cutover
  findById(id) { return this.mongo.findById(id); }

  // WRITES: both, idempotently. Mongo first (authoritative); PG must not break the request.
  async create(input) {
    const m = await this.mongo.create(input);   // returns doc incl. _id
    try {
      await this.pg.upsertFromMongo(m);         // keyed by legacyObjectId / idempotencyKey
    } catch (e) {
      metrics.dualWriteError('payments', 'create', e);  // alert, do NOT throw
    }
    return m;
  }

  async update(id, patch) {
    const m = await this.mongo.update(id, patch);
    try { await this.pg.upsertFromMongo(m); } catch (e) { metrics.dualWriteError('payments','update',e); }
    return m;
  }
}
```

- **Mongo write is authoritative and must succeed**; PG write failures are logged + alerted, never surfaced to the user, during `dual` mode. (Once in `pg` mode, PG is authoritative and wrapped in real transactions.)
- All PG writes are **idempotent upserts** keyed on `legacyObjectId` (or natural idempotency keys: `Payment` partial-unique set, `FundsLedgerEntry.idempotencyKey`). This lets backfill and live dual-write overlap without double-insert.
- `$inc`/`$push` translate inside the repo: `$inc points` → `UPDATE … SET points = points + $1` in a transaction; `$push offers` → `INSERT INTO purchase_request_offers …`.

---

## 6. Phased execution

Same phases as the guide §2, here with Drizzle-concrete entry/exit gates. Each phase ends with a collection in `pg` mode and dual-write removed only after the soak.

### Phase 0 — Foundations (2–5 wk) — *no data moves*

- Stand up Postgres (per env), Drizzle, drizzle-kit, CI migrations. **Status 2026-05-31:** implemented in code and dev stack, but migrations must still be applied per target DB.
- Build repository interfaces + `MongoRepo` wrappers for the relational-core domains (refactor services to call repos, not Mongoose directly). **Status 2026-05-31:** repo interfaces/implementations exist; service-layer wiring remains the bulk of the cutover risk.
- Create `id_map`, the verification harness (§7), and the backfill batch runner skeleton.
- **Exit:** all relational-core services call repositories; PG reachable everywhere; `id_map` + verify harness exist; CI runs migrations.

### Phase 1 — Address pilot (1–2 wk)

- Smallest real domain; proves backfill → dual-write → verify → cutover end-to-end.
- **Status 2026-06-01:** `/api/addresses` has an opt-in PG runtime path through `ADDRESS_STORE=postgres`; PG writes/deletes mirror to Mongo for rollback.
- Reimplement the **one-primary-per-user** pre-save invariant as either a partial unique index (`UNIQUE (user_id) WHERE is_primary`) or a trigger. `primary` is a reserved word in PostgreSQL, so name the column something like `is_primary` or quote it everywhere.
- **Exit:** `addresses` in `pg` mode in prod, invariant proven under concurrent writes, verify green, dual-write removed.

### Phase 2 — Reference/config (2–3 wk)

- `Category` (self-FK, soft-delete), `LevelConfig`, `ConfigSetting`, `ConfigSettingHistory`, `ShopSettings`, `Review`.
- **Status 2026-06-01:** confirmation-threshold `ConfigSetting` / `ConfigSettingHistory`, categories, level config, shop settings, and reviews have opt-in PG runtime paths through their per-store flags; writes mirror back to Mongo where still-Mongo consumers need compatibility. Categories now enforce one active row per normalized visible name in PG mode.
- Port seeds to run in dependency order. Enforce `ShopSettings.sellerId` unique, Category `parentId` ON DELETE SET NULL, and Category active normalized-name uniqueness.
- **Exit:** these read from PG; seeds run in PG.

### Phase 3 — User + auth core (3–5 wk)

- `User` is the FK hub — **must precede the money core** so `id_map` for users is authoritative.
- **Status 2026-06-01:** auth-owned user data is opt-in PG-backed through `AUTH_STORE=postgres`, with a Mongo legacy mirror for still-Mongo consumers. Broader user consumers are not fully cut over.
- Normalize `profile`/`preferences`/`points`/`referralStats` into columns; extract `passkeys[]`, `refreshTokens[]` to child tables; partial-unique `email`/`referralCode`; reimplement `toJSON()` stripping; passkey `default: Date.now()` in app code.
- Redis session/rate-limit + in-memory passkey challenge store stay as-is.
- **Exit:** `users` in `pg` mode; referral self-FK intact; all auth flows pass; user uuids authoritative in `id_map`.

### Phase 4 — Money core (6–10 wk) — *the point of the project*

- `PurchaseRequest`, `SellerOffer`, `RequestTemplate`, `Payment`, `FundsLedgerEntry`, `DerivedDestination`, `TrezorAccount`, `PointTransaction`.
- **Status 2026-06-01:** Drizzle schemas and backfill scripts exist for PurchaseRequest/SellerOffer/RequestTemplate. Backend `2.8.19` hardens the marketplace-core backfill path with `npm run backfill:marketplace-core:postgres`, fixed PurchaseRequest timestamp/preferred-seller writes, a RequestTemplate backfill step, a post-SellerOffer selected-offer remap step, and category duplicate cleanup/unique active-name enforcement. Backend `2.8.20` wires the funds-ledger service through `getPaymentRepo()`, fixes Mongo/Drizzle payment-stat parity for future service wiring, and makes the repo factory lazy-load PG/dual implementations so Mongo mode does not require `PG_URL`. Runtime marketplace services still call Mongoose directly and must not be flipped with `REPO_MARKETPLACE` until service wiring plus shadow-read checks land.
- Apply §4.1 (Mixed→discriminator+FK), §4.2 (offers/preferredSellers junctions, deliveryInfo/serviceInfo child tables), §4.5 (derivation counter).
- **Wrap in real PG transactions the multi-doc writes that today have none:** `raiseDispute` (PurchaseRequest + Payment), payment confirm + `FundsLedgerEntry` AML-fee insert, referral reward (points + referralStats), PointsService flows (migrate its 2 `withTransaction` sites to PG `BEGIN/COMMIT`).
- Preserve the `Payment` partial-unique idempotency index and `FundsLedgerEntry.idempotencyKey` uniqueness.
- **Exit:** money core in `pg` mode; checksum equality on `funds_ledger_entries` sums & `payments` amounts across a full soak; idempotency + escrow-hold invariants pass concurrency tests.

### Phase 5 — Dispute + delivery (2–4 wk)

- `Dispute.evidence[]`/`timeline[]` → child tables; pre-save timeline-append → explicit INSERT; delivery `$set/$push` nested updates → SQL.
- `Dispute ↔ Chat` becomes a **cross-store call** (Chat stays on Mongo) — define the boundary API.
- **Exit:** dispute lifecycle in `pg` mode; release-hold sync transactional.

### Phase 6 (deferred / optional) — `BlogPost`

- Behind a search abstraction; `$regex` → PG trigram/FTS only if migrated. Otherwise leave on Mongo. RequestTemplate schema/backfill moved into Phase 4 because template checkout creates PurchaseRequest/SellerOffer rows.

### Permanent on Mongo

`Chat`, `Notification`, `TelegramSession`, `TempVerification`, `TelegramLink` link-state. Revisit only if dual-stack ops cost exceeds migration cost.

---

## 7. Verification (gate for every cutover)

Three layers, **all green before any read flip**:

1. **Row counts** — per collection and per FK relationship, Mongo vs PG. Catches dropped/dangling rows. Run continuously during dual-write.
2. **Checksums** — column-level hashes; special attention to financial sums (`SUM(funds_ledger_entries.amount)`, `SUM(payments.amount)` grouped by status/provider) and the partial-unique idempotency set.
3. **Shadow reads** — in prod, serve from Mongo, asynchronously read PG for the same key, diff, alert on mismatch. **A clean shadow-read window (e.g. 7 days, zero diffs on hot paths) is the exit criterion for cutover.**

```ts
// verify/shadow.ts — wrap a repo read in dual mode
async function shadowRead(key, mongoFn, pgFn) {
  const m = await mongoFn(key);
  pgFn(key).then(p => { if (!deepEqualNormalized(m, p)) metrics.shadowMismatch(key, diff(m, p)); })
           .catch(e => metrics.shadowError(key, e));
  return m; // user always gets Mongo result
}
```

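
Layers 1 and 2 can be sketched the same way. The block below is an illustration under stated assumptions, not the harness in `verify/`: `fnv1a`, `canonical`, and `digest` are hypothetical helpers, rows are assumed to be already normalized (ids mapped, amounts rendered as fixed-scale strings on both sides), and FNV-1a stands in for whatever hash the real harness uses. Per-row hashes are combined with XOR so the Mongo scan and the PG scan do not need a shared sort order; the trade-off is that a pair of identical duplicate rows cancels out, which is why the row count travels with the checksum.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units: small and dependency-free.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

type VerifyRow = Record<string, string | number | boolean | null>;

// Canonical form with a fixed column order, so object key order is irrelevant.
function canonical(row: VerifyRow, columns: string[]): string {
  return columns.map((c) => `${c}=${String(row[c] ?? null)}`).join('|');
}

interface TableDigest {
  count: number;
  checksum: number;
}

// Order-independent digest: row count plus XOR of per-row hashes.
function digest(rows: VerifyRow[], columns: string[]): TableDigest {
  let checksum = 0;
  for (const row of rows) {
    checksum = (checksum ^ fnv1a(canonical(row, columns))) >>> 0;
  }
  return { count: rows.length, checksum };
}
```
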
---

## 8. Cutover & rollback runbook (per collection)

1. **Backfill** in batches with checkpointing; allocate uuids → `id_map`; remap FKs from already-migrated parents. Re-runnable (idempotent upserts).
   - Marketplace-core operator path: `MIGRATION_MONGO_URL=... MIGRATION_PG_URL=... npm run backfill:marketplace-core:postgres:dry-run`, then `npm run backfill:marketplace-core:postgres` in non-prod. The group now includes RequestTemplate before PurchaseRequest/SellerOffer. Run `scripts/smoke/marketplace-core-postgres-backfill.sh` with the same DSNs to exercise the static backfill invariants and dry-run.
2. **Enable `dual`** (flag) — writes go to both; shadow-read diffing on. Backfill the delta accumulated during step 1.
3. **Soak** until row-count + checksum + shadow-read are clean for the agreed window.
4. **Flip reads to `pg`** (flag). Keep dual-write on.
5. **Soak again** (shorter). Rollback = flip reads back to `mongo`; data still mirrored, so rollback is instant.
6. **Decommission**: stop writing Mongo for that collection; archive the collection.

> Near-zero downtime: there is no global write freeze except, optionally, a brief one during final ledger reconciliation for the money core.

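
Step 1's "batches with checkpointing" could be structured like the sketch below. It is not the existing backfill runner: `BackfillDeps`, `backfill`, and the four callbacks are hypothetical, and the sketch assumes the source can be paged by ascending `_id` and that `upsert` is idempotent (keyed on `legacy_object_id`), which is what makes a re-run after a crash safe. The checkpoint is saved only after a whole batch has been written, so a crash replays at most one batch.

```typescript
interface LegacyDoc {
  _id: string; // hex ObjectId; assumed to sort in insertion-friendly order
}

interface BackfillDeps<T extends LegacyDoc> {
  // Up to `limit` docs with _id greater than `afterId`, ascending by _id.
  fetchBatch(afterId: string | null, limit: number): Promise<T[]>;
  // Idempotent write into PG, keyed on the legacy id.
  upsert(doc: T): Promise<void>;
  loadCheckpoint(): Promise<string | null>;
  saveCheckpoint(lastId: string): Promise<void>;
}

// Copies everything after the stored checkpoint; returns how many docs were copied.
async function backfill<T extends LegacyDoc>(
  deps: BackfillDeps<T>,
  batchSize: number,
): Promise<number> {
  let cursor = await deps.loadCheckpoint();
  let copied = 0;
  for (;;) {
    const batch = await deps.fetchBatch(cursor, batchSize);
    const last = batch[batch.length - 1];
    if (last === undefined) return copied; // empty batch: source exhausted
    for (const doc of batch) {
      await deps.upsert(doc);
    }
    // Persist progress only after the full batch landed.
    await deps.saveCheckpoint(last._id);
    cursor = last._id;
    copied += batch.length;
  }
}
```
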
---

## 9. First two weeks — concrete starter checklist

- [ ] Add `drizzle-orm`, `pg`, `drizzle-kit`; create `src/db/{schema,client.ts,migrations}` + `drizzle.config.ts`.
- [x] Provision Postgres in dev (compose) + define `PG_URL`; keep Mongo running alongside. Use Postgres 18 volume mount `/var/lib/postgresql`, not `/var/lib/postgresql/data`.
- [ ] Write `id_map` schema; generate + run the first migration in CI.
- [ ] Define `IAddressRepo`; implement `MongoAddressRepo` by moving the existing Mongoose calls behind it; refactor address service to use the repo. **No behavior change** — prove the seam is invisible (existing tests pass).
- [ ] Build the verification harness (row count + checksum) against `addresses`.
- [ ] Author `addresses` Drizzle schema (incl. one-primary partial unique index) + `DrizzleAddressRepo` + `DualWriteAddressRepo`.
- [ ] Write the batch backfill for `addresses`; run dev backfill; confirm verify is green.
- [ ] Flip dev to `dual`, then `pg`; document the flag flips. This is the template for all later phases.

---

## 10. Effort recap (from the guide)

| Scope | Eng-weeks | Notes |
|---|---|---|
| **Partial — money/relational core (Phases 0–5 + cross-cutting)** | **~16–28** | Recommended stopping point; captures ~90% of value (ACID money + relational integrity). |
| Full — all 23 collections | ~23–40 | Extra 7–12+ wks mostly buys Chat/Notification normalization the access patterns don't reward. |

Add ~20% contingency for data-audit surprises in the Mixed-id fields. One focused engineer assumed; parallelize to compress wall-clock, not effort.

---

> [!warning] Before trusting the code sketches
> Drizzle schemas above use the real field names from `backend/src/models/` but are **first-pass sketches** — confirm the full `Payment.status` enum, the exact `amount` precision/scale your tokens need (USDT/USDC decimals), and audit which `Mixed` rows are actually strings vs ObjectIds **before** writing the money-core migration. See [[MongoDB to PostgreSQL Migration Guide]] §3/§5 for the authoritative per-field detail.