docs: sync from backend 8835068 — C2 chat query bounds closeout

This commit is contained in:
Siavash Sameni
2026-06-07 08:15:46 +04:00
parent 3362e2e1b8
commit cc6395f75b
3 changed files with 45 additions and 4 deletions

View File

@@ -12,6 +12,16 @@ entries on top. Maintained by agents per the rule in `../AGENTS.md`.
---
### 2026-06-07 — backend@8835068, frontend@73d1407 — DB audit C2 chat query bounds closeout
**Commits:** `8835068` `73d1407`
**Touched:** backend `src/db/repositories/drizzle/DrizzleChatRepo.ts`, `src/db/schema/chat.ts`, `src/db/migrations/0026_chat_settings_archived_idx.sql`, `__tests__/drizzle-chat-repo.test.ts`, `__tests__/db-audit-high-indexes.test.ts`, `package.json`, `package-lock.json`; frontend `Dockerfile`, `package.json`; docs `09 - Audits/DB Query & Schema Audit - 2026-06-06.md`, `09 - Audits/C2-DrizzleChatRepo-Partial-Fix-Report.md`, `09 - Audits/Activity Log.md`
**Why:** Finish C2 from the DB Query & Schema Audit and the partial-fix report. Chat reads now have a bounded query builder, SQL pagination for SQL-pushable `findForUser` predicates, `findOne` id/limit fast paths, chat `type` predicate pushdown, bounded fallback/search scans, and an index for archived-chat filtering.
**Verification:** backend `npm run typecheck`; `npm test -- --runTestsByPath __tests__/drizzle-chat-repo.test.ts __tests__/db-audit-high-indexes.test.ts --runInBand` (2 suites / 9 tests); `scripts/smoke/db-audit-service-regressions.sh` (18 suites / 73 tests); backend/frontend scoped `git diff --check`; frontend version metadata confirmed at v2.9.35. Pushed to Forgejo.
**Linked docs updated:** [[09 - Audits/DB Query & Schema Audit - 2026-06-06]], [[09 - Audits/C2-DrizzleChatRepo-Partial-Fix-Report]]
---
### 2026-06-07 — backend@c3ad979, frontend@a8791b1 — DB audit medium transaction closeout M13/M14/M17
**Commits:** `c3ad979` `a8791b1`

View File

@@ -0,0 +1,30 @@
# C2: DrizzleChatRepo.findRows - Closeout Report
**File:** `src/db/repositories/drizzle/DrizzleChatRepo.ts`
**Method:** `findRows()`
**Status:** Fixed in `backend@8835068` / v2.9.35.
---
## What Was Finished
The earlier C2 fix moved the hot participant/archive/unread predicates into SQL, but this report correctly identified that edge paths could still fetch too much data. Commit `8835068` closes those remaining gaps:
| Gap from the partial report | Final behavior |
|---|---|
| Empty `findRows({})` fetched the whole chat table | All chat row reads now go through a bounded query builder with a 1000-row max cap |
| `findOne({})` could fetch many rows before returning one | `findOne()` now uses `LIMIT 1`; id-only queries use the existing `findById()` fast path |
| `findForUser()` applied `skip`/`limit` after fetching all matching rows | SQL-pushable predicates now use DB-side `offset`/`limit` |
| Search/fallback predicates still used in-memory matching | Fallback matching remains for unsupported predicates, but only after a bounded 1000-row scan |
| Chat `type` was left as a fallback predicate | `type` is now pushed to SQL for `direct`, `group`, and `support` |
| `settings.isArchived` had no matching B-tree index in the current schema/migrations | Added schema index and migration `0026_chat_settings_archived_idx.sql` for `chats_settings_is_archived_idx` |
## Verification
- `npm run typecheck` - passed.
- `npm test -- --runTestsByPath __tests__/drizzle-chat-repo.test.ts __tests__/db-audit-high-indexes.test.ts --runInBand` - passed, 2 suites / 9 tests.
- `scripts/smoke/db-audit-service-regressions.sh` - passed, 18 suites / 73 tests.
## Remaining Long-Term Schema Work
The C2 unbounded-fetch and pagination issue is closed. The larger chat storage design remains a separate schema project: moving `messages`, `participants`, and `unreadCounts` out of JSONB arrays into relational tables would enable exact indexed deep search, precise fallback pagination, and targeted message updates without rewriting large JSON blobs.

View File

@@ -84,6 +84,7 @@ updated: 2026-06-07
| M13: single template conversion request+offer writes split across service calls → dedicated serializable repo transaction for vital request/offer writes | `c3ad979` v2.9.34 |
| M14: PG payment completion follow-up assumed DB idempotency → explicit `notifyOnly` path skips all DB writes | `c3ad979` v2.9.34 |
| M17: profile email verification pending-email race → single conditional SQL `UPDATE` with conflict outcome handling | `c3ad979` v2.9.34 |
| C2/M26: `DrizzleChatRepo.findRows` unbounded chat fetch + JS pagination → bounded row scans, SQL pagination for SQL-pushable predicates, `findOne` `LIMIT 1`/id fast path, type pushdown, and archived-chat index | `8835068` v2.9.35 |
---
@@ -111,13 +112,13 @@ Every call to `rowToUser` issues 3 additional Postgres round-trips: one for refr
---
### 2. DrizzleChatRepo.findRows fetches entire chats table into memory before filtering
### 2. DrizzleChatRepo.findRows fetches entire chats table into memory before filtering | **FIXED** `8835068` v2.9.35
> **Category:** Unbounded Fetch | **File:** `src/db/repositories/drizzle/DrizzleChatRepo.ts:544-558`
`findRows()` runs `SELECT * FROM chats ORDER BY last_activity` with no WHERE clause, pulling every chat row (including the full JSONB `messages` array — potentially hundreds of messages per row) into Node.js memory, then applies a JavaScript-level `matchesQuery()` predicate. Every public read path — `findOne()`, `findForUser()`, `count()`, and `getUnreadMessageCountByUser()` — goes through this method. Memory grows O(total_chats × avg_messages), causing OOM or multi-second pauses on each user request.
**Fix:** Translate the common query predicates (participant userId, isActive, settings_is_archived) into SQL WHERE clauses. The participants JSONB column supports GIN-indexed containment: `WHERE participants @> '[{"userId": "<id>", "isActive": true}]'::jsonb`. Add a GIN index on `participants` and push filtering to the database. The in-memory engine can remain as a fallback for uncommon predicates only.
**Fix:** `8835068` completes the partial C2 fix. `findRows()` now always goes through a bounded query builder with a 1000-row max cap, including empty queries. SQL-pushable predicates now include participant filters, active-state filters, archive status, unread-count checks, and chat `type`. `findOne()` uses `findById()` for id-only lookups and otherwise calls `findRows(..., { limit: 1 })`. `findForUser()` pushes `offset`/`limit` into SQL when predicates are fully pushable; fallback/search predicates still use the in-memory matcher, but only after a bounded 1000-row scan. The backend also adds `chats_settings_is_archived_idx` in schema and migration `0026_chat_settings_archived_idx.sql`.
---
@@ -855,13 +856,13 @@ The DB COUNT(*) query (lines 893-905) counts rows before the seller-visibility i
---
### 26. findForUser applies skip/limit in JavaScript after a full DB fetch — true DB pagination not used
### 26. findForUser applies skip/limit in JavaScript after a full DB fetch — true DB pagination not used | **FIXED** `8835068` v2.9.35
> **Category:** Bad Pagination | **File:** `src/db/repositories/drizzle/DrizzleChatRepo.ts:596-605`
`findForUser` calls `this.findRows(query)` (full table scan), then does `rows.slice(skip, skip + limit)` in JS. For page 10 of 20 results, 200+ rows are fetched and discarded from the DB transfer before slicing.
**Fix:** Push pagination to SQL once the JSONB participant filter is moved to SQL: add `.offset(skip).limit(limit)` to the Drizzle query inside `findRows`, or restructure to accept pagination arguments.
**Fix:** `8835068` adds `offset`/`limit` options to the Drizzle chat row query builder. `findForUser()` now applies DB-side pagination for SQL-pushable filters and only slices in JavaScript for fallback/search predicates after the repository safety cap has already limited the database result set.
---